In the last Color Management Forum, organized as usual by Fundación Gutenberg, a light controversy arose about the (relatively) recent recommendation to use ∆E 2000 (aka ∆E00) as the parameter for several color tolerances specified in ISO 12647. We should note that recommending that metric doesn't make it mandatory (it is just an informative parameter in ISO parlance), while ∆Eab remains a normative parameter, i.e. mandatory for those embracing the standard.
Several articles have been published about the superiority of ∆E00 over other known metrics such as ∆Eab, ∆ECMC and ∆E94. Later in this series of posts I'll show a way to compare these different metrics, to visualize what can be expected from each depending on the zone of the gamut it is applied to. Meanwhile, let's recall that a metric of this kind should be considered better than another when the color differences it predicts correlate better with a human observer's perception. After all, that is the purpose of a good metric; otherwise it would be pretty useless, and we would end up approving or rejecting print jobs "by eye", not to mention losing a valuable objective tool for process control.
The existence of these different metrics (and we can fairly expect them to give different results under the same circumstances) leads some people to suspect their true intentions. If we started out using ∆Eab and things seemed to go well, why bring other metrics to the table? Conspiracy-theory enthusiasts may imagine this one: it's just a plot set up by instrument and process-system manufacturers, along with the organizations that establish the rules, to force us to start over and over again and make expensive investments...
Besides, the industry has so much confidence in L*a*b* and ∆Eab that, even when we manage to show the superiority of ∆E00, people seem reluctant to accept it. A possible reason: quite often, a difference between two color samples of, say, ∆Eab = 5 turns out to be just ∆E00 = 2. Without further analysis, we conclude that ∆E00 is more "permissive" than ∆Eab, and we stick to "the devil we know"...
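To make that discrepancy concrete, here is a minimal, self-contained sketch of both metrics. The ∆Eab function is just Euclidean distance in L*a*b*; the ∆E00 function follows the published CIEDE2000 steps (in the formulation popularized by Sharma, Wu and Dalal, whose well-known test data also supplies the sample pair below). This is an illustrative sketch, not a certified implementation:

```python
import math

def delta_e_ab(lab1, lab2):
    """Classic CIE76 difference: plain Euclidean distance in L*a*b*."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))

def delta_e_2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIEDE2000 difference (Sharma et al. formulation)."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    C1, C2 = math.hypot(a1, b1), math.hypot(a2, b2)
    Cbar = (C1 + C2) / 2.0
    # Chroma-dependent rescaling of the a* axis
    G = 0.5 * (1 - math.sqrt(Cbar ** 7 / (Cbar ** 7 + 25.0 ** 7)))
    a1p, a2p = (1 + G) * a1, (1 + G) * a2
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360
    h2p = math.degrees(math.atan2(b2, a2p)) % 360
    dLp, dCp = L2 - L1, C2p - C1p
    # Hue difference, wrapped into (-180, 180]
    if C1p * C2p == 0:
        dhp = 0.0
    else:
        dhp = h2p - h1p
        if dhp > 180:
            dhp -= 360
        elif dhp < -180:
            dhp += 360
    dHp = 2 * math.sqrt(C1p * C2p) * math.sin(math.radians(dhp) / 2)
    Lbp, Cbp = (L1 + L2) / 2, (C1p + C2p) / 2
    # Mean hue, handling the wrap-around at 360 degrees
    if C1p * C2p == 0:
        hbp = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        hbp = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        hbp = (h1p + h2p + 360) / 2
    else:
        hbp = (h1p + h2p - 360) / 2
    T = (1 - 0.17 * math.cos(math.radians(hbp - 30))
           + 0.24 * math.cos(math.radians(2 * hbp))
           + 0.32 * math.cos(math.radians(3 * hbp + 6))
           - 0.20 * math.cos(math.radians(4 * hbp - 63)))
    d_theta = 30 * math.exp(-(((hbp - 275) / 25) ** 2))
    RC = 2 * math.sqrt(Cbp ** 7 / (Cbp ** 7 + 25.0 ** 7))
    # Weighting functions that compensate for L*a*b*'s non-uniformity
    SL = 1 + 0.015 * (Lbp - 50) ** 2 / math.sqrt(20 + (Lbp - 50) ** 2)
    SC = 1 + 0.045 * Cbp
    SH = 1 + 0.015 * Cbp * T
    RT = -math.sin(math.radians(2 * d_theta)) * RC
    tL, tC, tH = dLp / (kL * SL), dCp / (kC * SC), dHp / (kH * SH)
    return math.sqrt(tL * tL + tC * tC + tH * tH + RT * tC * tH)

# A blue pair from the Sharma test data set:
blue_pair = ((50.0, 2.6772, -79.7751), (50.0, 0.0, -82.7485))
print(round(delta_e_ab(*blue_pair), 2), round(delta_e_2000(*blue_pair), 2))
```

For this pair, ∆Eab comes out around 4.0 while ∆E00 comes out around 2.0: exactly the kind of gap that makes ∆E00 look "permissive" at first glance, when it is really correcting for how L*a*b* exaggerates distances in the blue region.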
First of all, let's talk about how we got this far. Let me picture it as a battle, because there is something at stake: a valuable treasure everyone wants to grab. And when it comes to ∆E, that treasure is nothing less than the Holy Grail of industrial colorimetry: perceptual uniformity.
Perceptual uniformity is a desirable feature of a color space: the distance between two colors, measured simply as the geometric separation between them, reflects the difference in color sensation perceived by a human observer, regardless of the colors chosen.
A consequence that is particularly important to us: if two colors a certain distance apart (as measured in a given perceptually uniform color space) are perceived to be the same, then any other pair of colors at that same distance should also be perceived as indistinguishable, whatever colors are chosen.
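That equal-distance property is easy to state in code, because ∆Eab is nothing more than Euclidean distance in L*a*b*. The two sample pairs below use hypothetical coordinates chosen only for illustration; in a truly uniform space, equal distances like these would guarantee equally perceptible differences:

```python
import math

def delta_e_ab(lab1, lab2):
    # Plain Euclidean distance between two L*a*b* points
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))

# Two pairs at exactly the same geometric distance (5 units along a*),
# one in the blue region, one in the green region (hypothetical values):
blue_pair  = ((32.0, 79.0, -108.0), (32.0, 74.0, -108.0))
green_pair = ((87.0, -86.0, 83.0), (87.0, -81.0, 83.0))
print(delta_e_ab(*blue_pair), delta_e_ab(*green_pair))  # 5.0 5.0
```

If L*a*b* were perceptually uniform, an observer would judge both pairs equally different. As the rest of this post argues, that is precisely the promise L*a*b* fails to keep.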
Colorimetry as we know it was born in 1931 with the CIE XYZ system, but in that system it's virtually impossible to compare colors numerically, because the difference between two barely distinguishable colors may be represented by a small distance (as in the blue region, for instance) or a much larger one (as in the green region). This fact is evidenced by the MacAdam ellipses, which show empirically how far we have to move away from any color before we perceive a "difference". We had to wait until the CIE, after much analysis and several attempts, came up with UCS (1960), then U*V*W* (1964) and finally L*u*v* and L*a*b* (1976). The latter became the winner for several reasons: it was the best at the time, it was based on the well-established opponent-process theory of color vision, and it resembles the Munsell system, still in use today.
L*a*b* was designed with perceptual uniformity in mind, and if that goal had been fully achieved we would have just one ∆E today: the original one, now renamed ∆Eab (and also ∆E76) to tell it apart from its successors. Time eventually showed that our chosen one had its shortcomings, and other metrics arose to claim the crown. But the problems detected are not in fact related to ∆E itself; they merely revealed the underlying failure:
L*a*b* is not a perceptually uniform color space.
Therefore, the classic ∆E fails to measure the "perceived difference" between colors. At this point, common sense would suggest we fix or replace L*a*b* with another, more uniform system, such that ∆E in that space had the properties we need. Which properties are those? You guessed it: perceptual uniformity and ease of computation.
The start of hostilities
When this drawback in L*a*b* was discovered, the system was already so widely adopted in the industry that it was not easy to contemplate a new, substantially more uniform space. If, after so much effort and research (it took 34 years to go from MacAdam's seminal work to the creation of L*a*b*), it turned out we hadn't achieved the intended goal, how could we even conceive of developing another space fast enough (and good enough) to justify changing systems?
As in other circumstances of life, an innocent paid the price: there was no option left but to hack the ∆E formula itself, making it responsible for not fairly measuring the difference between colors, when in fact L*a*b* is to blame for not putting colors in their right place. It's like discovering that our old, reliable ruler measures a table differently than a chair, and having to replace it with a more complex apparatus which, before taking the actual measurement, takes into account whether we are measuring a table or a chair...
In the next post of this series we will take a look at the contenders entering this fight to demote our poor, weak (but innocent) ∆E.