As discussed in the previous post, ∆E94 was designed for simplicity rather than accuracy, compared to ∆ECMC. It was therefore no surprise that its performance in the field showed no clear advantage over its predecessor. The CIE itself was aware of this and, at the same time, started developing a new formula.
The critical requirement for an empirical formula such as ∆E is to fit the available data reasonably well. This means, in turn, that such a formula can be no better than the data it was derived from. Because of this, developing a new formula first required building data sets of real color differences of sufficient quality to serve as a starting point.
Training the champion
The CIE created a new committee to analyze the performance of ∆E94 and, eventually, the need to improve it, and its own members were commissioned to gather the valuable data needed for the task. Several data sets were obtained, such as the following:
- BFD-P: Set of 2,776 color pairs within the perceptibility limit over several materials, with an average ∆Eab of 3.0;
- RIT-DuPont: Set of 156 color pairs of painted samples around 19 reference color centers;
- Leeds: Research results of 307 color pairs on painted samples, average ∆Eab 1.6;
- Witt: 418 color pairs on paint around 5 central colors, average ∆Eab 1.9.
After analyzing the 3,657 color pairs available from all these data sets and comparing them to the behavior predicted by ∆E94, several effects were observed:
- As anticipated, the dependency of perceived differences on L* had to be reviewed. It was established that maximum sensitivity occurs at L* = 50 and slowly decreases towards L* = 0 and L* = 100. This is not an absolute effect; it is closely tied to the conditions under which the data were collected, as we will see later. In this graph we can see the correction adopted by means of the parameter SL (same meaning as in previous equations):
- Even close to the gray axis, tolerances are still elliptical, with their major axes in the b* direction. This suggests (just for the purpose of computing color differences) converting the a* coordinate into a larger value a', mainly in the near-achromatic zone, in order to "round out" those ellipses. So it was proposed to define a' as a' = (1 + G) a*,
where G is a factor that depends on the average chroma C* of the compared samples. The expression (1 + G) has a maximum value of about 1.5 near the gray axis and progressively reduces to one (i.e. no correction) for C* greater than 50. With this modification, the same correction SC is applied, except that it is calculated over a new C', obtained in turn from a', b* instead of a*, b*;
- Regarding SH, it was necessary to borrow the ideas developed in ∆ECMC and ∆EBFD, where this factor has a complex dependency on both h* and C*. The following 3D graph gives an idea of the SC dependency (in yellow) and the SH dependency (in white) at every location of the a*b* / C*h* plane. Recall that in this graph, the higher a point on the surface, the more tolerance we get to differences. You may also note that the SH surface features soft ripples, compared to ∆ECMC:
- Last, it was noted (again) that the blue region demanded special treatment, as proposed by ∆EBFD. In fact, the new algorithm includes a rotational correction term which is formally identical.
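The two numeric claims in the list above (SL is smallest at L* = 50, and (1 + G) goes from about 1.5 near the gray axis down to about 1 for C* above 50) can be checked with a small Python sketch. The constants used here (0.015, 20 and 25^7) come from the published CIEDE2000 definition, not from this post, and the function names `s_l` and `a_scale` are made up for illustration.

```python
import math

def s_l(mean_L):
    """Lightness weighting SL as a function of the pair's mean L*."""
    d2 = (mean_L - 50.0) ** 2
    return 1.0 + 0.015 * d2 / math.sqrt(20.0 + d2)

def a_scale(mean_C):
    """Factor (1 + G) applied to a*, from the pair's mean CIELAB chroma C*."""
    c7 = mean_C ** 7
    G = 0.5 * (1.0 - math.sqrt(c7 / (c7 + 25.0 ** 7)))
    return 1.0 + G

# SL is exactly 1 at L* = 50 and grows towards both ends of the scale.
for L in (0, 25, 50, 75, 100):
    print(f"mean L* = {L:3d}   SL = {s_l(L):.3f}")

# (1 + G) starts at 1.5 on the gray axis and approaches 1 as chroma grows.
for C in (0, 10, 25, 50, 80):
    print(f"mean C* = {C:3d}   1 + G = {a_scale(C):.3f}")
```

Since each term of the formula is divided by its S factor, a larger SL means more tolerance, that is, less sensitivity to lightness differences away from L* = 50.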
The new formula assumes the following expression, where ∆L' = ∆L*, ∆C' is the difference between computed C' from modified a' values, and ∆H' is the hue difference obtained from the new C':
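For reference, the commonly published form of CIEDE2000 can be written in LaTeX as follows. Here R_T denotes the rotational term for the blue region and k_L, k_C, k_H are the parametric factors discussed further below. The notation follows the usual statement of the standard, which may differ slightly from the one used elsewhere in this series.

```latex
\Delta E_{00} = \sqrt{
  \left(\frac{\Delta L'}{k_L S_L}\right)^{2} +
  \left(\frac{\Delta C'}{k_C S_C}\right)^{2} +
  \left(\frac{\Delta H'}{k_H S_H}\right)^{2} +
  R_T \left(\frac{\Delta C'}{k_C S_C}\right)
      \left(\frac{\Delta H'}{k_H S_H}\right)
}
```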
Compared to ∆EBFD, note that ∆L' = ∆L* is computed from the samples' original L* values instead of from the XYZ values they were derived from. The other terms in the equation share the same form.
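For readers who prefer code to equations, here is a minimal Python sketch of the whole calculation. It follows the step-by-step procedure commonly published for CIEDE2000 (all numeric constants come from that definition, not from this post), and the function name `ciede2000` and its argument layout are my own choices. Treat it as an illustration rather than a reference implementation.

```python
import math

def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIEDE2000 difference between two (L*, a*, b*) triples."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # a* -> a' = (1 + G) a*, with G driven by the mean CIELAB chroma.
    mean_C = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    G = 0.5 * (1.0 - math.sqrt(mean_C**7 / (mean_C**7 + 25.0**7)))
    a1p, a2p = (1.0 + G) * a1, (1.0 + G) * a2

    # Modified chroma C' and hue angle h' (degrees, 0 to 360).
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0

    # Differences in lightness, chroma and hue.
    dLp = L2 - L1
    dCp = C2p - C1p
    if C1p * C2p == 0.0:
        dhp = 0.0  # hue is undefined for a neutral sample
    else:
        dhp = h2p - h1p
        if dhp > 180.0:
            dhp -= 360.0
        elif dhp < -180.0:
            dhp += 360.0
    dHp = 2.0 * math.sqrt(C1p * C2p) * math.sin(math.radians(dhp / 2.0))

    # Means used by the weighting functions.
    mean_Lp = (L1 + L2) / 2.0
    mean_Cp = (C1p + C2p) / 2.0
    if C1p * C2p == 0.0:
        mean_hp = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        mean_hp = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        mean_hp = (h1p + h2p + 360.0) / 2.0
    else:
        mean_hp = (h1p + h2p - 360.0) / 2.0

    # Weighting functions SL, SC, SH (T carries the hue "ripples").
    T = (1.0
         - 0.17 * math.cos(math.radians(mean_hp - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * mean_hp))
         + 0.32 * math.cos(math.radians(3.0 * mean_hp + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * mean_hp - 63.0)))
    dL2 = (mean_Lp - 50.0) ** 2
    SL = 1.0 + 0.015 * dL2 / math.sqrt(20.0 + dL2)
    SC = 1.0 + 0.045 * mean_Cp
    SH = 1.0 + 0.015 * mean_Cp * T

    # Rotation term for the blue region (centered near h' = 275 degrees).
    d_theta = 30.0 * math.exp(-(((mean_hp - 275.0) / 25.0) ** 2))
    RC = 2.0 * math.sqrt(mean_Cp**7 / (mean_Cp**7 + 25.0**7))
    RT = -math.sin(math.radians(2.0 * d_theta)) * RC

    tL = dLp / (kL * SL)
    tC = dCp / (kC * SC)
    tH = dHp / (kH * SH)
    return math.sqrt(tL * tL + tC * tC + tH * tH + RT * tC * tH)

# First pair of the widely circulated CIEDE2000 test data (blue region).
print(round(ciede2000((50.0, 2.6772, -79.7751), (50.0, 0.0, -82.7485)), 4))
# ≈ 2.0425
```

The example pair sits in the blue region, so the rotation term contributes noticeably to the result. The function is symmetric in its two arguments, as a distance should be.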
We may get a glimpse of how well ∆E00 can be expected to match reality by using the same ellipses we showed before (they in fact represent the BFD-P and RIT-DuPont data sets just described). In the next graph, ellipses in red are real differences, while those in white are the predictions of the new formula:
Although the match is not perfect, in general the ellipses' size and orientation are very close to the real ones over almost all of the a*b* plane. We may therefore expect this formula to be better than its predecessors; it is indeed, but before declaring it the winner, we had better read the small print...
The reference conditions
One of the many research groups working on ∆E00 drew attention to an important fact: the original data sets, obtained from evaluations by human observers, depend considerably on viewing conditions. It became clear that developing a more accurate formula was not enough; the viewing conditions for its application had to be quantified and established as well.
Some of these conditions may seem trivial, but others, for those working in color assessment and comparison on a daily basis, may seem almost draconian:
Reference conditions for ∆E00

| # | Condition | Reference value |
|---|---|---|
| 1 | Illumination: | Source simulating D65 |
| 2 | Illuminance: | 1000 lx |
| 3 | Observer: | Normal color vision (pretty obvious, right?) |
| 4 | Sample size: | Subtended visual angle greater than 4º |
| 5 | Background field: | Uniform, neutral gray with L* = 50 |
| 6 | Sample separation: | Minimum, sample pairs with direct edge contact |
| 7 | Sample structure: | Homogeneous (no apparent pattern or non-uniformity) |
| 8 | Magnitude of ∆E: | 0–5 CIELAB units (i.e. ∆Eab ≤ 5) |
Some conditions can easily be met: condition #2 is easily achieved in well-illuminated rooms; #3 is obvious; #6 and #7 can be expected, etc. But the remaining ones deserve our attention:
- Illumination (condition #1): For many years colorimetry has moved between measuring with D50 and with D65. Many people are convinced that D50 is no longer the right choice for industrial use, because almost all white references (paper, screens) are far from that illuminant. The data sets used to develop ∆E00 are all expressed in XYZ and were explicitly converted to L*a*b* using D65 as the white reference, hence this condition. Since L*a*b* is usually defined with D50, we might think we have a problem here. For the purpose of comparing colors, though, maybe we don't (we would if we wanted to simulate or reproduce the colors so measured). It is like having a ruler calibrated in units that differ a little from an inch, without knowing by how much: it would still be useful for comparing measurements, like calculating the length-to-width ratio of a table, but if we wanted to build a new table with that ruler, we might be surprised...
- Sample size (condition #4): A viewing angle greater than 4º means we should be using the 1964 Standard Observer XYZ curves for 10º instead of the typical 1931 ones for 2º. If we have spectral measurements we can (at least in theory) convert them to XYZ and then to L*a*b* using the appropriate standard observer, but this cannot be done if the measurements were already taken in L*a*b* (like the ones we get from a colorimeter, for instance).
- Background field (condition #5): This one is perhaps the most important, because this assumption is mathematically embedded in the formula. The dependency of SL on L*, with its minimum at L* = 50, is a direct consequence of this condition. If the background against which comparisons are made changes, the minimum of SL would shift accordingly. In the graphic industry, comparisons are usually made using the substrate (paper) as background, so this condition seems to be seldom met. On the other hand, we may justify this choice as the least bad: it is the one that minimizes the maximum difference we can get from an arbitrary background between black and white. By the way, the behavior simulated by this parameter is known as the crispening effect, shown below:
In other words, the perception of differences is boosted when the background field has a luminance close to that of the compared samples. This effect explains the shape of the SL graph and its minimum at L* = 50, which coincides with the imposed viewing conditions.
- Magnitude of ∆E (condition #8): The analysis is limited to small differences, simply because the reference data sets were generated that way. The imposed limit of 5 seems an appropriate (but otherwise arbitrary) choice for the formula to reach the industry widely. In practice, it means we cannot use ∆E00 to compare the contrast between two pairs of very different colors, for instance to decide whether some blue has the same perceptual difference from some yellow as a certain red has from some green.
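To make the ruler analogy from condition #1 a bit more tangible, the following Python sketch converts the same two XYZ measurements to L*a*b* against a D65 white and against a D50 white, and then compares the plain ∆Eab of the pair in each case. The white points are the standard 2º observer values. The two samples are made-up numbers chosen only for illustration. Simply swapping the white in the L*a*b* formula is an assumption of this sketch: a real workflow would measure under the intended illuminant or apply a chromatic adaptation transform.

```python
import math

# Reference whites for the 2-degree observer (XYZ with Y = 100).
WHITE_D65 = (95.047, 100.0, 108.883)
WHITE_D50 = (96.422, 100.0, 82.521)

def xyz_to_lab(xyz, white):
    """CIE 1976 L*a*b* of an XYZ triple relative to a reference white."""
    def f(t):
        d = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > d ** 3 else t / (3.0 * d * d) + 4.0 / 29.0
    fx, fy, fz = (f(v / w) for v, w in zip(xyz, white))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))

# Two hypothetical, similar samples (made-up XYZ values).
sample_1 = (30.0, 35.0, 20.0)
sample_2 = (30.5, 35.2, 19.5)

for name, white in (("D65", WHITE_D65), ("D50", WHITE_D50)):
    lab_1 = xyz_to_lab(sample_1, white)
    lab_2 = xyz_to_lab(sample_2, white)
    de_ab = math.dist(lab_1, lab_2)  # plain Euclidean difference (∆Eab)
    print(name, [round(v, 2) for v in lab_1], f"dEab = {de_ab:.2f}")
```

With these made-up values, the coordinates themselves (b* in particular) move noticeably between the two whites, while the difference between the two samples changes much less. That is the "ratio still works, absolute length does not" situation described above.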
The biggest problems we may face if we deviate from these conditions are: changing the lightness of the background field (especially when there is excessive contrast between samples and background), samples placed too far apart, changing the illuminant type (for example, moving from D65 to A), and comparing textured instead of uniform samples. All these deviations tend to result in underestimated color differences, mainly in lightness. If these conditions cannot be met, the deviations should be quantified. For this purpose, the formula has the three parameters kL, kC and kH, which are set to unity under the reference viewing conditions and can be fine-tuned for other conditions. When this needs to be made explicit, it is customary to write the formula as ∆E00(1,1,1), showing the parameter values.
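The role of kL is easiest to see on neutral grays. When both samples have a* = b* = 0, the chroma and hue terms vanish and the published formula reduces to ∆E00 = |∆L*| / (kL · SL). The sketch below relies on that simplification only. The helper name `delta_e00_neutral` and the value kL = 2 are hypothetical, chosen for illustration, and the SL constants come from the published definition rather than from this post.

```python
import math

def delta_e00_neutral(L1, L2, kL=1.0):
    """∆E00 for two neutral samples (a* = b* = 0): only the lightness term remains."""
    d2 = ((L1 + L2) / 2.0 - 50.0) ** 2
    SL = 1.0 + 0.015 * d2 / math.sqrt(20.0 + d2)
    return abs(L2 - L1) / (kL * SL)

# The same one-unit step in L*, placed at different lightness levels.
for center in (10.0, 50.0, 90.0):
    pair = (center - 0.5, center + 0.5)
    print(f"L* around {center:4.1f}:",
          f"dE00(1,1,1) = {delta_e00_neutral(*pair):.3f},",
          f"dE00(2,1,1) = {delta_e00_neutral(*pair, kL=2.0):.3f}")
```

A one-unit lightness step counts as exactly 1 around L* = 50 and as less than 1 towards the dark and light ends. Doubling kL halves the weight given to lightness differences everywhere, which is how a deviation from the reference conditions can be expressed in the formula.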
In general, all tests and comparative evaluations between this and the other metrics already analyzed clearly make ∆E00 the winner. But we must keep in mind that this superiority is statistical: there will always be pairs of samples and/or special viewing conditions for which the other formulas fit the perceived differences better.
In any case, the choice of a metric should be based on its performance in the entire L*a*b* space, and this is the reason why the CIE recommends and encourages its adoption.
Considering that ∆E00 is more than 20 years old, we may ask: Does CIE consider this formula to be its best effort? Are they thinking of proposing new metrics? Is anyone considering the development of the successor to L*a*b*? Are the CIE XYZ tristimulus values (on which L*a*b* and essentially all current colorimetry are based) good starting points for a color difference formula?
In a 2013 seminar at the University of Leeds, UK, one of the world's leading color researchers commented as follows:
“The CIEDE2000 formula may not be the final word with respect to a colour difference formula for small colour differences for industry... The experimental data on which the formula is based are far from perfect... However, at the present time the formula represents the best that can be achieved... In our view, CIEDE2000 is timely because there are two different formulae (CMC and CIE94) being widely used at present. This is clearly unsatisfactory. The new formula offers significant improvements over both... Progress on unresolved questions requires different viewpoints being put forward to stimulate new ideas.”
In other words, the door to new metrics remains open, although today there seems to be more activity in developing new color systems. I will probably write another post in the near future about the new systems trying to replace L*a*b*.
In the next (and last, finally!) post of this series I will share the results of my personal analysis of these four metrics, some comparison tools, and a test to "see" the different ∆E formulas in action.