“And yet,” I wrote in my first book 25 years ago, “most color correction could be handled by monkeys…a numerical, curve-based approach calling for little artistic judgment…all the advanced techniques are inevitably based on these surpassingly simple ones. The by-the-numbers rules can be stated in a single sentence: Use the full range of available tones every time, and don’t give the viewers any colors that they will know better than to believe.”

Today’s “advanced techniques”, such as PPW, could not have been dreamed of back then. The lesson remains. Without those basics, neither PPW nor any other system works. The failure of some of the MIT retouchers to keep this in mind is a warning to the rest of us. This post will discuss the most basic of the basic things: the importance of setting proper endpoints, a/k/a setting highlight and shadow.

Like any other system, PPW offers advantages on certain types of image, but few or none on others. The temptation is always to call attention to the ones on which it does well. And it is easy to show what seem to be impressive results on certain images, but there is usually no way to prove that people couldn’t have done just as well another way.

The MIT study solves both these problems. It samples a wide variety of images, not chosen to favor any system over another. And for each one, the researchers commissioned five different corrections, so that we know what others find possible.

As described previously, I created a sixth “par” version of each of 100 randomly chosen images. I also did a PPW correction myself, limited to methods that could be automated. I then compared the PPW versions to the others. Because the par version was often better than any of its parents, the key comparison was between it and mine. I rated these, according to rules laid out here, as either a decisive win, a win, a tie, a loss, or a decisive loss.

#1828: Although the cast seems ominous, there is little difficulty in correcting this because the uniform’s color is known.

#2082: The original is extremely flat, but the clouds have more than enough detail for a retoucher to exploit.

#0091: This Hawaii scene should be ideal for PPW, which can add variation in the water.


If the PPW version scores a decisive win against par, it may indicate a superior method. It may also indicate that the retouching group found the exercise too difficult—or that it was not too difficult, but they made silly mistakes, such as failure to set a proper highlight and shadow.

On the other hand, I occasionally made silly mistakes that prevented PPW from winning more decisively. The big difference, though, between someone with a mountain of experience, and people like the five retouchers, is that experienced people make far fewer mistakes.

When we see that the PPW version is clearly superior to the par, we must ask (to use an old sports analogy): did I win, or did they lose? We will now look at three images where they did worse than they should have. Here are the three original files as given to the five retouchers. How do you think their work will compare with mine, if they do a competent job?

The first two originals seem at first glance to be much worse than the third. I have surely had to correct more such horrors in my career than the five retouchers have. Someone inexperienced might infer that I would therefore be likely to do much better than they would on these two. Indeed, that would be a good argument in many similar-looking situations. But in these two, the apparent difficulty is illusory. The chef’s uniform in #1828 obviously should be white. His hair, judging by his ethnicity, must be black. It is easy, even for beginners and even when the color of the original is this far off, to make the necessary corrections. PPW’s hammers may add texture to the uniform and the skin, which may cause the viewer to prefer it, but it’s hard to visualize a decisive win.

The clouds image, #2082, is only scary because it’s so flat. Clouds don’t need to be as impeccably white as the chef’s uniform but white does have to be their basic color. The hammer actions aren’t needed to build highlight detail because there’s more than enough of that already. There seems to be sunlight coming in from the right and patches of blue where the sky peeks through the clouds. PPW may be able to exploit those, in which case it becomes a slight favorite. Otherwise, I predict a tie, maybe even a loss if somebody finds a creative solution.

The Waikiki image of #0091, now that ought to be a decisive win for PPW, which can put attractive color variation in the water in a way that Lightroom can’t. And it should be able to present the high-rises as a more attractive rosy-orange.

How did these predictions pan out? The chef first.

1828-par & PPW: The par version looks flat because it has no highlight (white point).

To say that PPW wins this comparison is misleading. It would be more accurate to say that the par version loses. It loses because its detailing is so poor. The detailing is poor because the version ignores the advice at the top of this post that it should always use a full tonal range. And it does not do that because, unlike the PPW version, nothing is represented as a bright white. No proper highlight has ever been set.

Believe it or not, the par version, which as usual has decent color, beats all five of its parents. Let’s have a look at three of them, plus a modest suggestion.

1828-C & D: One version can’t handle the color cast; the other is even flatter than par.

1828-E & Auto Tone: another failed retoucher version, plus a suggestion: merely establishing a white and a black point, which the Auto Tone command does with one click.

No matter what method of color correction you choose, the following steps are absolutely mandatory. No method will succeed if they are ignored.
1) Locate the lightest significant part of the image.
2) Decide whether it should be made white.
3) Locate the darkest significant part of the image.
4) Decide whether it will be acceptable if made black.
5) Force the lightest and darkest significant parts as far apart as reasonable, while honoring the decision you made about whiteness/blackness.

These items are not always obvious. What if the lightest significant point is not the lightest literal point? And should it be forced to a true white, or some type of off-white? The same considerations apply in reverse to the choice of dark point.

This chef image has no such issue. The lightest literal point is in the uniform; it is also the lightest significant point, and it is emphatically supposed to be white. The hair, similarly, is both the darkest literal point and the darkest significant one, and it is emphatically supposed to be black.

The simplest of all Photoshop commands, Image: Adjustments>Auto Tone, establishes correct white and black points, on the assumption that literal=significant and that the endpoints are supposed to be neutral, both of which happen to be true in this case. Compare the Auto Tone version, prepared from the default with one click, to any of the three retouched versions, or even the par. The color is acceptable and the detail much better.
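For readers who want to see what a one-click endpoint setting amounts to numerically, here is a minimal NumPy sketch of a per-channel stretch. It is not Photoshop's actual Auto Tone code. The function name `auto_tone_sketch` and the default `clip` fraction are assumptions made for illustration. Each channel is stretched independently so that its darkest values land at 0 and its lightest at 1, which sets both endpoints and makes them neutral at the same time.

```python
import numpy as np

def auto_tone_sketch(img, clip=0.001):
    """Stretch each channel so its darkest pixels map to 0 and its lightest to 1.

    img  : float array of shape (H, W, 3) with values in [0, 1]
    clip : fraction of pixels allowed to clip at each end (hypothetical default)
    """
    img = np.asarray(img, dtype=np.float64)
    out = np.empty_like(img)
    for c in range(img.shape[2]):
        chan = img[..., c]
        lo = np.quantile(chan, clip)        # dark endpoint for this channel
        hi = np.quantile(chan, 1.0 - clip)  # light endpoint for this channel
        if hi <= lo:
            # A channel with no tonal range cannot be stretched; leave it alone.
            out[..., c] = chan
            continue
        out[..., c] = np.clip((chan - lo) / (hi - lo), 0.0, 1.0)
    return out
```

Because the channels are stretched separately, this only behaves well under the two assumptions named above: that the literal endpoints are the significant ones, and that both are supposed to be neutral.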

My suspicion is that Auto Tone is thought of as an amateur method, to be avoided by those in the know. But why not use it here, at least as a starting point for something better?

2082 par & PPW: the failure to establish a white point in the clouds at right dooms the par version.

Upwards of 99 percent of images have something that can be used as a dark point. This cloud shot is the exception. Nothing can be represented as black. But the white point issue is the same as with the chef. The lightest literal point, in the clouds at top right, is also the lightest significant point. And clouds can surely be represented as white. Establishing the highlight is the biggest reason that the PPW version works and the par does not.

Despite this flaw, the par version, with its predictably reasonable color, once again beats all five individual efforts. Here are three of the competitors, plus a ringer.

2082-A & C: These two efforts have reasonable detail but major color problems.

This image might benefit from slight moves toward a warmer or even a cooler color, but we must keep in mind that the basic color of clouds is white. Both 2082-A and 2082-C go way too far; there is no longer any sense of neutrality in the lightest clouds. Jumping out of the frying pan and into the fire is D:

2082-D & Auto Tone: Left, the clouds are white, at a huge cost. Right, Auto Tone applied to the default forces a white and a black point.

After looking at 2082-D we can understand why the par version wins. We have looked at two with terrible color but acceptable contrast, and one where the clouds are correctly white but grossly too flat.

Why do we have to choose one or the other, when one click with Auto Tone puts us on the right track? Sure, we can’t accept black clouds, but already the Auto Tone version is arguably the best other than the PPW one, and it is easy enough to make it much better:

2082-AT curve & AT smart blend: Left, a two-second master-curve correction of the Auto Tone version. Right, instead, an intelligent blend of Auto Tone with 2082-D to soften the shadows.

Even the sort of person who overuses Auto Tone can apply the kind of master curve shown above to counter the excessive darkness of the straight Auto Tone version. Someone wishing a more conservative effect might prefer the version at right, which is an example of an “intelligent” blend, as opposed to the “stupid” blending that produced the par versions. I lightly blended 2082-D into 2082-Auto Tone, using a mask that emphasized changes to darker areas.
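A masked blend of this kind can be sketched in a few lines of NumPy. This is one plausible way to build such a mask, not a record of the exact Photoshop steps: the mask is the inverted luminosity of the base image scaled by an overall opacity, so the second version comes through most strongly where the base is darkest. The names `shadow_weighted_blend`, `base`, and `softer`, the Rec. 709 luma weights, and the 0.5 opacity are all assumptions for illustration.

```python
import numpy as np

# Rec. 709 luma weights, used here as an assumed stand-in for "luminosity".
LUMA = np.array([0.2126, 0.7152, 0.0722])

def shadow_weighted_blend(base, softer, opacity=0.5):
    """Blend `softer` into `base`, more strongly where `base` is dark.

    base, softer : float arrays of shape (H, W, 3), values in [0, 1]
    opacity      : maximum blend strength, reached where base is pure black
    """
    base = np.asarray(base, dtype=np.float64)
    softer = np.asarray(softer, dtype=np.float64)
    luminosity = base @ LUMA                 # shape (H, W)
    mask = opacity * (1.0 - luminosity)      # 0 in white areas, `opacity` in black
    mask = mask[..., np.newaxis]             # broadcast over the three channels
    return base * (1.0 - mask) + softer * mask
```

With the Auto Tone version as `base` and a flatter version as `softer`, the highlights stay where Auto Tone put them while the overly heavy shadows are lifted.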

How do the two images we’ve looked at differ? Well, you can’t ask for a more perfect target for Auto Tone than that chef shot of #1828. PPW methods generally work better with flatter originals, but Auto Tone moves so obviously in the correct direction that I used it myself, fading it back to 80% opacity to allow easier later adjustments.

This cloud image is another story. Auto Tone sets an excellent highlight, and knocks out most of the cast. We have other ways of doing these things, so Auto Tone would not be on my agenda here. But not everybody knows these other ways. Apparently the student retouchers did not. Auto Tone is much better than nothing.

Traditionally, retouchers set highlight and shadow points immediately. PPW does not require this, but it and every other rational system does require that they be set eventually. The following example shows that Auto Tone can come in handy even late in the process.

0091-E: The best of the retoucher corrections of this original.

0091-PPW: The Modern Man From Mars action creates desirable variation in the water, sky, and highrises.

The results here are as predicted. PPW has a routine that creates color variation without necessarily making the image more vivid. The scene is more realistic.

You may ask why the comparison is to one of the retoucher versions and not to the par, which is not shown. It turns out that three others came up with renditions similar to that of Retoucher E. As for the fifth retoucher, well, this exercise is the only one out of 150 I’ve looked at so far in which one version was atrocious enough to bring down the average so much that four of the five retouchers beat par. How bad can that be?

0091-A: The worst of the retoucher versions, which brought down the average result so far that the other four retouchers beat par.


My guess is that this retoucher got tired of spinning his wheels and eventually threw up his hands in despair and went on to the next image. If so, he acted too soon. The lightest literal and significant point in this shot is in what we call the whitewater. The name doesn’t prove that it should actually be set to white; a slight green or cyan might be better. The darkest literal and significant point is in the lava, which should probably be a dull dark brown rather than black. So Auto Tone is not an ideal solution, but if the alternative is 0091-A, the choice is easy.
0091-A-Auto Tone: The Auto Tone command is applied to 0091-A, forcing a white and black point.


Better even than 0091-E, no? And 0091-E was the best of the other four.

It is correct that Lightroom, which the five retouchers were using, doesn’t have an explicit Auto Tone command, but it has plenty of ways to achieve the same thing quickly. This post, I hope, will be an eyeopener to those who don’t accept that proper highlight and shadow selection is critical.

Color correction takes practice. Nobody should be ashamed of a poor result when the image was too difficult for them. When the poor result comes because the image was too easy, that’s harder to swallow.

The previous entry described giving each of five independently corrected versions 20% weight to create a new, “par” version. This can be called a “stupid” blend, in that no notice is taken of the merits of any of the five. Nevertheless, it appears that this average is better than all five of its parents in a surprisingly high minority of cases. The apparent illogic of this finding gave rise to discussion on the colortheory list and prompts this supplemental post.

#1859: a large tattoo on a person’s back.


Recapping: in a well-funded study, scientists from MIT and Adobe gathered 5,000 images and hired five knowledgeable students to correct each one of them, using Lightroom. All the originals, in DNG format, and the corrected versions have been made freely available. I chose 100 of these images at random and corrected them myself using PPW principles and restricting myself to procedures that could be automated. Also, for each of these 100 images I created a “par” version, the “stupid” average of the five student corrections.
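The “stupid” average is simple enough to state in code. The sketch below is an illustration under assumptions, not the procedure actually used to build the par files: it averages pixel values directly in whatever space the arrays are in, and the function name `par_version` is hypothetical.

```python
import numpy as np

def par_version(versions, weights=None):
    """Average several same-sized corrections of one image.

    versions : sequence of float arrays, each of shape (H, W, 3)
    weights  : optional per-version weights; None gives every version
               equal weight (20% each when there are five versions)
    """
    stack = np.stack([np.asarray(v, dtype=np.float64) for v in versions])
    return np.average(stack, axis=0, weights=weights)
```

No notice is taken of the merits of any input. A version that is too dark and another that is too light simply pull the result toward the middle.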

This post will examine three very different images, starting with the originals as given to the retouching group.

#4976: This original would challenge even very experienced image technicians.


#0002: Unlike the other two originals, this one starts out in reasonable shape.

Half the value of these exercises is in predicting what is about to happen, and then discovering whether the prediction is correct. Usually it is, but as we will see in this and other posts, sometimes surprises pop up. Here’s how I felt about these three before beginning.

*The tattoo image, #1859, is not as hard as the desert exercise shown in the previous post. Viewers, however, are notoriously finicky about fleshtones. The retouchers are likely to concentrate more on making the tattoo stand out and may have varied ideas of how to present the skin. The par version will arrive at a consensus. I think that this may be one of those where the par version is better than any of the five. I expect that my own version will win decisively, because PPW’s MMM action can introduce color variation in skintones in a way that does not exist in Lightroom.

*I am also planning to win decisively in the aquarium nightmare of #4976, but for a different reason. I believe that PPW has somewhat effective tools to attack this mess. Whether Lightroom has anything like them is irrelevant, because giving this original to a nonprofessional group is a form of sadism. Chances are that the five corrected versions will go in every possible direction and none of them will be acceptable. I therefore think it’s a sound bet that the par version will be better than all five parents.

*The girl in the pink sweater, #0002, is a nearly total opposite. For a change, the original version is not bad; I wonder how many of these five retouchers, in an attempt to justify their own existence, will make it worse. It doesn’t have features that make it particularly appropriate for PPW, although I hope to get by on my good looks and personality. There probably won’t be too much variation among the work of the five retouchers so it is questionable how the par version can be much better.

We commence hostilities with the tattoo image.

1859-A&C: two retouchers with extremely different opinions of how dark the skin should be.


Those who read the previous post will recognize the similarity. The first two versions shown have drastically different ideas about image weight. I have no clue what retoucher A was thinking; his work is worse than the original. What retoucher C did makes more sense, because lightening the skin makes the tattoo stand out more, but he went quite a ways too far in my opinion.

Naturally, when averaged, these two errors will to some extent cancel each other.

1859-E&D: one version lightens the skin but the tattoo as well. The other darkens the tattoo for emphasis.


Again as in the previous post, these are two better versions that still tend to even each other out. Like retoucher C, retoucher E lightened the skin but in doing so weakened the tattoo. Retoucher D took a different tack. He somehow darkened the tattoo itself. In doing so, he lost a certain amount of color. Not a bad concept, but I prefer the PPW approach, which would be to move all colors in the tattoo away from orange, to better differentiate them from the skin. So, instead of becoming darker as in 1859-D, the red parts should become rosier.

1859-B&par: The final retoucher version is closer to the consensus of the other four, but the actual average, right, is strengthened by its exposure to them.


1859-PPW: The MMM script causes great variation in the skintone–but is that really what is wanted here?

Again as in the previous post, one of the five retouchers comes up with something close to the consensus, and once again it isn’t as good as the “stupid” weighted average. How can this be? The par version is picking up some of the strengths of each, and largely discarding their weaknesses.

On, now, to my boneheaded prediction that my version would be much better than anyone else’s. This kind of blunder is what happens when one sees only one part of an image and forgets the objective. When I looked at the original I saw a lot of skin. My experience is that whenever there is so much skin PPW always does better than any alternative approach, because it adds attractive and believable variation. As I will show in future posts, PPW has a massive advantage in portraits. And in fact, if this person had no tattoo, I believe my version of the back would be considered decisively better than the others, which would seem totally boring by comparison.

Unfortunately, the image does feature a tattoo, which is going to be the focus of the viewer’s attention. Putting this much action in the skin does make it look more natural but it is also a distraction from the whole point of the image.

A previous post outlines how I report the results of comparisons of my versions against the par. Here, it wouldn’t matter how a vote went. The question is, would everyone agree that a straight 50-50 blend of the two would be better than either parent? In this case, I think that the answer is clearly yes. This competition is therefore a tie. We move on to the group’s efforts to grapple with a greased pig.

4976-A: The contrast in the face makes the man look like he came out of a horror movie.

The above version is unlikely to find favor with the subject.

4976-B: The man is much too dark.

4976-C: Ditto.

Essentially the above two are a concession of defeat, a decision that it’s time to move on to the next opponent.

4976-D: A step in the right direction for weight, but the fleshtone is quite orange.

4976-E: A reasonable treatment of the foreground subject, on the assumption that the aquarium in the background is totally irrelevant.

Progress is being made, but at too high a price. Both of the above have overly orange skintones. Retoucher E got excellent detail in the face at the cost of obliterating the aquarium.

4976-par: Averaging the five versions seen above creates the best one seen yet.

4976-PPW: PPW blending principles create a superior version.


Little discussion is needed. The exercise was too difficult for the retouching group. Nevertheless, the par version, the average of their terrible work, is much better than any of the five parents. The PPW version, though, rates as a “decisive win” over the par version. By definition this means that it would be almost universally preferred at a glance.

Averaging minimizes the effect of mistakes. Be aware, though, that it also minimizes the effect of really good work.

0002-PPW-A: Left, the PPW version. Right, Retoucher A did better.

0002-par&default: Left, the par version. Right, repeated for convenience, is the start point, the default version.

Retoucher A did a better job with this image than I did, a relatively rare occurrence. When it happens, it indicates outstanding work on the individual’s part, probably much better than that of his four colleagues. It strongly suggests that it is therefore better than the par version.

So be it here. The par version is shown beneath the two, along with a repeat of the default version for reference. I can scarcely tell the difference between the par and the default; the par made the girl’s hair slightly lighter, a good thing. But clearly the other four retouchers, whose work we haven’t seen, dragged the average down. In fact, some or all of them must have made the original worse, a bad idea in color correction.

How do we score this? Well, 0002-par and 0002-A obviously have the best shape. The standard when comparing my work to that of an individual is whether mine is “significantly better,” which it is not. There is no need to inquire whether it is significantly worse.

The more careful comparison is between mine and each par version. I prefer mine here, because I don’t object to very pink skin in children. I also prefer mine to a 50-50 blend of the two. However, the rules also provide that when 100% of the color of one version is married to 100% of the luminosity of the other and the result is better than either parent, then the contest is a tie. And that is the case here.

Summing up the lessons of these two posts:

*We have not considered “intelligent” blending where the goal is to combine the best parts of two or more versions rather than a mathematical average that takes no account of their good and bad features. Such intelligent blending is very powerful, but somewhat difficult. I have provided examples of it here and here.

*Nevertheless, “stupid” blending is unexpectedly effective. It tends to even out color issues; its handling of detail is not so impressive. But in this study of 100 images it was really striking how frequently the par version had better color than my own even when the color of each of its five parents was bad.

*If you’re baffled as to how to proceed with a difficult image, try several careful versions. None of them may be any good by themselves, but to the extent that they agree they will create a better final result. The aquarium shot above is an example. Nobody got a good result, yet the averaged version is acceptable.

*An averaged version is conservative by definition. This makes it valuable to PPW practitioners, who often create files that are too loud. In the desert image of the previous post, in the tattoo at the top of this one, and in the little girl we just saw: in all three of these, if I had had access to the par versions when I finalized my own, I would have made use of them, by blending in some of their color (not luminosity).

*For all these reasons, I reiterate my recommendation that people should routinely make extra versions, even if it means spending less time on the primary version. These extra versions should be done as quickly as possible. We have seen in these two posts that the versions don’t have to be very good in order to be useful. You should be able to rattle them off in a minute or so. The potential loss is one minute in case the version is a complete bust. The potential gain is considerable.

Those interested in quality have always been willing to spend time to get what they considered the best possible results. For some years now I have been suggesting that this is not the best approach in our field. Instead, I have been preaching that it is a better use of time to do the initial correction more quickly and then do an even quicker second version that can be used for blending. I claim that this gives better results in the same time. After working with this dataset, I have a better idea as to why.

Proving such a concept is difficult, particularly since the way I have presented it requires a flexible approach. I say that you should evaluate your initial version and if you suspect certain weaknesses, engineer the second version not to have them. I showed an example of that approach here.

Of course, it’s easy to say that I picked an unusual image for that post. Or that the many blending modes and masks that are available are too confusing. As against that, what if there’s something magical about blending two versions? What if it’s somehow more likely that a blended version is unexpectedly going to be better than its parents?

This is the first of several posts analyzing the lessons that this extensive study offers for our workflow. With 5,000 images each corrected by five different retouchers, we can’t be accused of cherrypicking ones that prove a point. We don’t need 5,000, but the subset has to be chosen at random. I chose 100 images for the testing and selection process, which is described here.

Subsequent posts will discuss more directly how PPW compares to the work of those who don’t have access to it. Now, however, we are just going to look at how the five corrected versions compare to each other.

Suppose that we expanded the competition to include you on these hundred images as a sixth retoucher. Your work would then be compared with each of the five others. Suppose, also, that you are about as skillful as they are.

Of the 500 head-to-head comparisons, how often would your version be decidedly better? I’d say maybe 200. In 200 others you would lose, and in 100 there would be no preference. Or try this one: assuming again that everyone is equally skillful, how often would you expect your version to be the best of the six? That’s harder to quantify: your chances of beating the first opponent are still 40 percent or so, but if you do it your chances of beating the next one increase, and if your version has already beaten four, the chances are very high that it will also beat the fifth. So, my guess is that you could expect to score a clean sweep on perhaps five of the 100 image sets.

The sixth contestant, however, is not you. It is an average of the other five. It is not an “intelligent” blend, either, such as the one I described in my other post. Instead, it weights each version 20 percent, regardless of how good or bad that version is. It also varies from the blend I described, where I deliberately made a second version that would compensate for what I saw as weaknesses in the first. Here, the five retouchers were all trying to accomplish the same thing and did not know what the others were doing.

How well did this “stupid” blend work? Instead of 200 wins over 500 comparisons, it won 382. Instead of five clean sweeps over 100 competitions, it had 26. We’re about to see one, to help understand both why blending and averaging has such an advantage, and under what circumstances it does not. Here’s what was handed to the five retouchers in DNG format.
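The arithmetic behind that comparison is easy to check. In the sketch below, the 40 percent win rate and the five expected clean sweeps are the guesses stated above, not measurements. Only the 382 wins and 26 sweeps come from the actual comparison. The helper name `rate` is hypothetical.

```python
def rate(count, total):
    """Return count/total as a fraction."""
    return count / total

comparisons = 100 * 5                    # 100 image sets, five opponents in each
guessed_wins = comparisons * 40 // 100   # the 40 percent guess: 200 wins
observed_wins = 382
guessed_sweeps, observed_sweeps, image_sets = 5, 26, 100

print(f"win rate: guessed {rate(guessed_wins, comparisons):.1%}, "
      f"observed {rate(observed_wins, comparisons):.1%}")
print(f"clean sweeps: guessed {rate(guessed_sweeps, image_sets):.0%}, "
      f"observed {rate(observed_sweeps, image_sets):.0%}")
```

The observed win rate works out to 76.4 percent, nearly double the guess for an equally skilled human, and clean sweeps occurred about five times as often as guessed.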

#4177, as received by the five retouchers, seems too cold and also has contrast issues.

The original is technically challenging, in that it starts with a cold cast, but also lacks depth. As usual, each of the five student retouchers made a distinct improvement, but as usual, the results were inconsistent. We’ll start with the worst first. NOTE: I am using the study’s own naming system here. It gave each image a number and then identified each of the five corrections by a letter.

4177-A: The first retoucher’s work is quite dark.

4177-E: This version seems too weak overall.

Retouchers A and E both got reasonable color, but not weight. In 4177-E the overall image looks washed out because the shadow areas are too light; 4177-A has the opposite problem of making the sunny areas too dark.

Already, you may (and should) be thinking: a 50-50 blend of these two would obviously be much better than either parent. We’ll find out in a bit, but first a variation on this theme with two much better efforts.

4177-B: Warming up the image is the right approach, but the clouds should not have turned orange.

4177-C: The best version so far, but the clouds transition to blue too quickly.

Retouchers B and C’s work illustrates the difficulty of cast reduction. 4177-B went too far in warming things up, leaving the clouds yellow-orange. 4177-C, the best of the four seen so far, didn’t quite go far enough. The clouds are too blue. Looking at the other versions suggests that the rock formation is as well.

A blend of the two might be better than either parent, because the two mini-casts might cancel each other, just as the two incorrect weights did when we compared 4177-A to 4177-E. My guess would be that the blend should favor the bluish 4177-C, which seems to me the better of the two. In real life, where intelligent blending is allowed, I’d guess that blending 25-35% of 4177-B’s color into 4177-C, while leaving 4177-C’s luminosity unchanged, should work well. This posting, however, is about “stupid” blending, so we should limit it to a straight 50-50 blend, which I’ll now show along with the one discussed earlier.
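For readers who like to see the arithmetic, here is a minimal Python sketch of the two kinds of blend just described. It is an illustration under stated assumptions, not the procedure used in the study: pixels are plain RGB tuples, luminosity is approximated with Rec. 601 luma weights, the function names are made up for the example, and the two sample pixels are invented rather than measured from 4177-B or 4177-C. Photoshop's own Color and Luminosity modes use different math and will give somewhat different numbers.

```python
# Sketch only. Pixels are (R, G, B) tuples with channels in 0..255.
# Luminosity is approximated by Rec. 601 luma; this is an assumption,
# not the formula Photoshop uses for its Color/Luminosity modes.

def luma(p):
    """Approximate luminosity of one RGB pixel."""
    r, g, b = p
    return 0.299 * r + 0.587 * g + 0.114 * b


def straight_blend(p, q, weight_q=0.5):
    """Plain per-channel mix: weight_q of q, the remainder from p."""
    return tuple((1 - weight_q) * a + weight_q * b for a, b in zip(p, q))


def color_only_blend(base, donor, opacity=0.3):
    """Mix `opacity` of donor into base, then restore base's luma.

    Adding the same offset to all three channels changes luma by exactly
    that offset (the weights sum to 1), so the result keeps base's
    luminosity unless a channel has to be clamped to 0..255.
    """
    mixed = straight_blend(base, donor, opacity)
    shift = luma(base) - luma(mixed)
    return tuple(min(255.0, max(0.0, c + shift)) for c in mixed)


# Hypothetical cloud pixels: one slightly blue, one slightly orange.
bluish_cloud = (200, 210, 230)   # stands in for a pixel of 4177-C
orange_cloud = (235, 215, 190)   # stands in for a pixel of 4177-B

stupid = straight_blend(bluish_cloud, orange_cloud)          # 50-50
smart = color_only_blend(bluish_cloud, orange_cloud, 0.3)    # 30% color

print("50-50 blend:     ", stupid)
print("30% color blend: ", smart, "luma", luma(smart))
```

In the color-only result the blue excess of the first pixel shrinks, while its luminosity stays where it was. The straight 50-50 blend changes both at once.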

4177-A&E, straight 50-50 blend: With one parent too light and the other too dark, the child comes out just right.

4177-B&C, straight 50-50 blend: One parent has an orange cast, while the other is slightly blue. The child version is more neutral than either.

Well, I’d have to say that the B-C blend is indeed better than its parents. It’s lost some of the desirable color variation in the rocks that C offered, but the added realism in the sky more than makes up for it. The surprise is, the A-E blend, which is a combination of two images worse than B and C, is competitive with B-C. And both are better than any of the four parents.

The only retoucher version we haven’t yet seen is D. It’s the least objectionable of the five, in my view, although I’m not sure I would rate it better than C. Yet it isn’t quite as good as the par version shown below it, which is a “stupid” blend of all five retoucher versions, each one given 20% weight.

4177-D: The final retoucher has avoided the problems of the other four.

4177-par: A "stupid" blend of the five previous versions, with each weighted 20%.

Here’s how I rate what we’ve seen so far.

*Each of the five retouchers improved the original.
*The two 50-50 blends (4177-A&E and 4177-B&C) are better than any of their four parents. Whether they are also better than 4177-par is of no concern to me.
*4177-par rates as significantly better than the version above it (4177-D) in that I believe a jury that was able to toggle back and forth between the two would give it at least a two-thirds vote.
*Similar toggling might be needed to establish that 4177-par is better than 4177-C. The other three retoucher versions don’t need a moment’s consideration.

What accounts for this phenomenal rate of success for a “stupid” blend? And when might it not be appropriate? And what if the versions were prepared not by students but by top professional retouchers?

With multiple versions, errors can cancel one another out. In 4177-A&E, an overly light and an overly dark version did so. In 4177-B&C two contrary casts did the same thing. Even if it had no direct opponent, however, a poor version like 4177-A could still be usable in a blend with four more reasonable efforts.
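The cancellation is easy to check with numbers. The following Python sketch assumes each version is simply a list of RGB tuples of the same length. The function name and all pixel values are invented for illustration and are not taken from the dataset.

```python
def average_versions(versions):
    """Equal-weight average of same-sized images.

    `versions` is a list of images; each image is a list of (R, G, B)
    tuples. With five versions, each one contributes 20%.
    """
    n = len(versions)
    return [
        tuple(sum(channel) / n for channel in zip(*pixels))
        for pixels in zip(*versions)
    ]


# Hypothetical one-pixel "images" of a midtone that should be about 128.
too_dark = [(98, 98, 98)]     # 30 levels too dark
too_light = [(158, 158, 158)] # 30 levels too light

# Opposite errors cancel in a 50-50 blend.
pair = average_versions([too_dark, too_light])

# A single poor version with no opponent is diluted, not cancelled:
# four versions at 128 plus one at 88 leaves one fifth of the error.
good = [(128, 128, 128)]
poor = [(88, 88, 88)]
five = average_versions([good, good, good, good, poor])

print(pair)  # the two 30-level errors offset each other
print(five)  # the 40-level error shrinks to 8 levels
```

The second case is why one bad version can survive in a five-way blend: its error is divided by five even when nothing opposes it.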

In addition to minimizing the impact of errors, averaging unfortunately also minimizes cleverness. Only Retoucher C, for example, got attractive color variation in the rocks. That variation is wiped out in the averaged version. Is 4177-par a better image than 4177-C? Yes, but it isn’t a free lunch, and if Retoucher C had done slightly better at knocking out the cold cast then he would have beaten par.

A subtler and more universal factor turns out to be the bigger gain in blending multiple versions. All five of the retouchers presumably agreed that the rocks are more important than the background sky. And they certainly all realized that the original had an undesirable cast.

Understandably, though, they had different ideas about how to remove that cast. One might add orange, another subtract green, still another add cyan and then boost the color. Whatever the approach, the goal is to bring the rocks to a desirable warmth, a point on which they all probably agree fairly closely. The background is a secondary consideration, but it probably offers hints on the method used to warm the rocks.

Whenever an original can be seen as having both important and unimportant areas, multiple attempts to correct will always be more similar in the important areas than elsewhere. The five retouchers here agreed far more closely about the rock color than about the sky.

The critical lesson for why this type of blending is so successful: if you decide to blend a second version with your own, the biggest change will be found in unimportant areas. The effect of the change will be that the important areas stand out more clearly. Why? Because in each version the unimportant areas are a byproduct of the method chosen to correct the initial cast. If somebody else corrected the cast just as effectively, but with a different method, the blend will shift the unimportant areas of your version away from the important object(s), which themselves will not change very much.
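One way to test this claim on your own files is to measure how far a blend moves each region of your version. The Python sketch below assumes you have labeled each pixel as belonging to an important or an unimportant region. The labels, the function name, and the pixel values are all hypothetical and chosen only to show the pattern.

```python
def mean_change(mine, other, labels, region, weight_other=0.5):
    """Average per-channel change a blend makes to `mine` inside `region`.

    `mine` and `other` are lists of (R, G, B) tuples; `labels` names the
    region each pixel belongs to. Returns 0.0 if the region is empty.
    """
    total, count = 0.0, 0
    for p, q, label in zip(mine, other, labels):
        if label != region:
            continue
        for a, b in zip(p, q):
            blended = (1 - weight_other) * a + weight_other * b
            total += abs(blended - a)
            count += 1
    return total / count if count else 0.0


# Invented example: two retouchers nearly agree on the rock color but
# reached it by different routes, so their skies differ much more.
labels = ["rock", "sky"]
mine = [(180, 120, 80), (120, 160, 220)]
other = [(176, 124, 84), (150, 170, 200)]

rock_shift = mean_change(mine, other, labels, "rock")
sky_shift = mean_change(mine, other, labels, "sky")

print("rock shift:", rock_shift)
print("sky shift: ", sky_shift)
```

If the sky shift comes out several times larger than the rock shift, the blend is doing most of its work in the unimportant area, which is the behavior described above.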

Two corollaries of this striking rule:
*The main advantage of the blend is in color, not detailing. Each retoucher tries for the best detail in the important zones, paying less attention elsewhere. There may well be a change in the unimportant zones due to the blend, and it will improve your version, but it will probably not be as big a deal as a color change.
*The big gain comes when the original file has a color issue. If, instead, its color is basically right then no corrected version will be much different from another.

Would we still see such overall favorable results if the five retouchers were more experienced? That’s hard to say. On the one hand, a more experienced team would be less prone to big mistakes, so we wouldn’t get huge swings like the difference between 4177-A&E and its two parents, and the gain from blending might be smaller. On the other, a professional team would be unlikely to create horrors that might make the par version unusable. That actually happened with several of the dataset’s images.

PPW users can exploit another advantage. Averaged versions are by their nature conservative. PPW efforts are by their nature extravagant, extroverted. As part of this study, for each of the hundred images I prepared my own version. The purpose was to examine how often PPW is actually useful in real life. On certain images, of which this is one, it has a tremendous advantage. On others it’s no big deal.

4177-PPW: The retouching group had no access to the MMM action, which is responsible for the color variation in the rocks here.

Obviously we want a certain amount of variation in the rocks, but there’s no unanimity as to what that amount is. Also, people fall in love with their own ideas. Work long enough with a file of this nature and the eye can become desensitized, whereupon we can save a version today that will seem too loud tomorrow. Or, even if you don’t think it’s gone too far, perhaps someone else will.

If you find yourself in that position, an averaged version can be just what’s needed. 4177-par may be boring, but it has no obvious defect. It can soften 4177-PPW without any problem. The same cannot be said of any of the alternate versions in this post.

Going through this exercise over a hundred images reinforces much of what was already known. That PPW was going to do well on this particular one was a foregone conclusion, because it has an action that drives colors apart.

That blending competitive images would give better results in a very large number of samples was also a foregone conclusion, because I’ve seen how often it has worked in the past. This time, however, although I predicted the result, I did not understand why it would happen. I hope I’ve now been able to explain the reason in this post.

In short, I expected a good result for PPW in this image, and I knew why; I expected good results for blending generally, but I had a poor grasp of why. There’s also a third category, where my preconceptions about what would happen proved false. I’ll show examples of all three categories in subsequent posts.

The MIT 5k Dataset 2: The Ground Rules

by Dan Margulis November 13, 2017

The following details the procedures used in evaluating the images in this study. It is posted separately so that I do not have to repeat it every time I discuss results in the future. I went through the set of 5,000 original images and deleted those I thought were of limited interest. I used the […]

The MIT 5k Dataset 1: Introduction

by Dan Margulis November 11, 2017

This is the introduction to a series of posts I will make based on my work with files that are part of a remarkable archive. Researchers at MIT and Adobe have recently made available the data from a massive project they have undertaken to study what people look for when they correct color. The researchers’ […]

Applied Color Theory classes in 2017

by Dan Margulis January 4, 2017

For those wishing to take color skills to the ultimate level, here are the two dates for Applied Color Theory classes in 2017. • ATLANTA, Wednesday, March 22 through Saturday, March 25, 2017. • SAN DIEGO, Wednesday, August 9, through Saturday, August 12, 2017. These classes—four long days, limited to eight persons—have changed the lives […]

The Presentation of Data: When Red and Blue Are Opponent Colors

by Dan Margulis June 3, 2016

The U.S. presidential campaign offers an interesting insight on opponent colors, and on how best to present data. The conventional way of doing it leaves much to be desired. Residents of other countries have difficulty understanding the American system, where in effect the election is always decided by voters in a small minority of states. […]

Applied Color Theory Classes in 2016

by Dan Margulis January 30, 2016

In September 1994, I took three days off from my day job in New York, and spent them teaching color correction to six people in Atlanta. We were using Photoshop 3, with no adjustment layers, no multiple undo, no actions, and computers with 16 mb RAM. Sterling Ledet gave the name “Applied Color Theory” to […]

Photoshop 修色圣典

by Dan Margulis December 25, 2015

Chinese readers will be interested to know that the translation of Modern Photoshop Color Workflow is on press now, and should be available within a few days, from the same publisher who offers Photoshop LAB Color (and is working on the second edition) and of Professional Photoshop. The Chinese market apparently has a lot of […]

Averaging and the Complementary Background

by Dan Margulis November 26, 2015

A recent post on the appliedcolortheory list, discussing my suggested procedure for making gross changes in the color of a product, noted that it used Filter: Blur>Average, which the poster said he never used under any other circumstances. I will now show a related use of it that can really help out certain product shots. […]
