Most of the photographs captured these days end up being viewed on a display of some sort, at best 4K (4096×2160) but often no better than HD resolution (1920×1080). Since the cameras that capture them typically have several times that number of pixels, with 6000×4000 fairly normal today, most images need to be substantially downsized for viewing, even allowing for some cropping. Resizing algorithms built into browsers or generic image viewers tend to favor expediency over quality, so it behooves the IQ-conscious photographer to manage the process, choosing the best image size and downsampling algorithm for the intended file and display medium.
When downsizing, the objective is to retain as much of the original spatial resolution as possible while minimizing the possibility of aliasing and moiré. In this article we will take a closer look at some common downsizing algorithms and their effect on spatial resolution information in the frequency domain.
Measuring Resized Spatial Frequency Response
One way to see what happens to linear spatial resolution in the frequency domain is to measure the MTF performance of a downsized ISO 100 D800e hueless slanted edge*, as discussed in an earlier article. The original image is a 300×400 pixel crop taken directly off DSC_6483.NEF, downloaded from dpreview.com and saved as a 16-bit TIFF with absolutely zero processing other than black subtraction and CFA normalization. You can get the TIFF here if you would like to do some testing of your own.
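For the curious, this is roughly what that minimal preparation amounts to in code. The sketch below assumes rawpy for raw access; the file name is from the article, but the normalization gains are a simplification, since the exact values used are not stated.

```python
import numpy as np
import rawpy  # assumed here purely for raw file access

raw = rawpy.imread("DSC_6483.NEF")
cfa = raw.raw_image_visible.astype(np.float64)

# Black subtraction: remove each CFA channel's black level offset.
black = np.array(raw.black_level_per_channel, dtype=np.float64)
channel = raw.raw_colors_visible  # CFA channel index (0-3) at each pixel
cfa -= black[channel]

# CFA normalization: equalize the raw channel gains so the gray edge
# comes out hueless. Scaling each channel to the crop's overall mean is
# one plausible reading of this step, not the exact gains used.
target = cfa.mean()
for c in range(4):
    mask = channel == c
    cfa[mask] *= target / cfa[mask].mean()
```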
That beautiful grayscale edge was first measured by MTF Mapper – then downsized 4:1 using Photoshop CS5’s standard methods with these settings (nearest neighbor shown):
and re-measured by MTF Mapper. The resulting image is 75×100 pixels, 25% of the original linear size. Below are the resulting Spatial Frequency Response (MTF) curves of the original image, indicated by a dashed line, and of its resized version downsampled via nearest neighbor, the solid blue line. Spatial resolution units are lp/mm.
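If you would like to reproduce the resizes without Photoshop, something along these lines will do. Pillow's kernels are stand-ins only: they are not guaranteed to match CS5's implementations (CS5 bicubic in particular appears to use different cubic coefficients, as we will see further down).

```python
from PIL import Image

im = Image.open("edge_crop.tif")          # the 300x400 crop (name assumed)
size = (im.width // 4, im.height // 4)    # 25% linear, i.e. 75x100

nearest  = im.resize(size, Image.Resampling.NEAREST)
bilinear = im.resize(size, Image.Resampling.BILINEAR)
bicubic  = im.resize(size, Image.Resampling.BICUBIC)

nearest.save("edge_nn_25pct.tif")         # then feed to MTF Mapper
```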
Nearest neighbor tracks the original curve for only the first quarter of its length – because the curve refers to the resized image, which is a quarter of the original's linear size:
We know from experience that nearest neighbor downsizing produces heavy aliasing, the reason becoming clearer when the same curves are presented in units of cycles per downsized pixel, as shown below. The other two standard Photoshop CS5 resizing algorithms, bilinear and bicubic, are also displayed with the same scale:
The original curve soars at those lofty heights because each downsized pixel corresponds to four of the original's, so on this scale the original's data extends to four times the downsized Nyquist. Nearest neighbor tracks it well, which is good at the lower frequencies but not so good beyond monochrome downsized Nyquist – because the ideal MTF curve for a perfect downsizing would track the original's up to Nyquist but show little or no energy above 0.5 cycles/pixel, the point at which aliasing and moiré start rearing their ugly heads.
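In numbers, for the 4:1 linear downsize discussed here:

```python
factor = 4                         # original pixels per downsized pixel
nyquist_downsized = 0.5            # cycles per downsized pixel
# On the original image's frequency axis the same cutoff sits at:
print(nyquist_downsized / factor)  # 0.125 cycles per original pixel
```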
Bilinear and bicubic show different amounts of signal attenuation, with bicubic looking like the best compromise of the three. The blue curve below shows the upper boundaries of what I think an ideal MTF curve should look like for this image after perfect downsizing only (no extra sharpening), with bicubic for reference as the dotted line.
Ideally, all of the spatial resolution information above monochrome final Nyquist would be removed before downsampling. In practice it's often not possible because some of the more advanced algorithms do not perform downsampling and filtering as separate processes. Nor can the filtering be discontinuous, as shown by the hard edges in the ideal blue line above: a brick-wall cutoff in the frequency domain corresponds to a sinc in the spatial domain, whose oscillations would introduce objectionable halos and ringing in the final image.
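As a simple illustration of the two-step approach (not one of the algorithms measured here), a smooth low-pass followed by plain decimation might look like the sketch below; sigma is a rough assumption, not a tuned value.

```python
import numpy as np
from scipy import ndimage

def gaussian_then_decimate(img: np.ndarray, factor: int = 4) -> np.ndarray:
    # A Gaussian has no hard cutoff, so it trades some attenuation below
    # the final Nyquist for freedom from brick-wall ringing and halos.
    blurred = ndimage.gaussian_filter(img.astype(np.float64), sigma=factor / 2)
    return blurred[::factor, ::factor]  # keep every 4th pixel in each axis
```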
So finding the best algorithm for downsizing means finding the best compromise between tracking the original curve at the lower frequencies and minimizing energy at the higher ones.
The benchmark in the photographic community has so far been Photoshop bicubic complemented with additional filtering and/or sharpening. It is what I currently use, in its unsharpened version; but there have been attempts at finding solutions with better performance, as discussed in this fascinating thread on LuLa by resampling experts Nicolas Robidoux and Bart van der Wolf. They use an Elliptical Weighted Averaging (EWA) approach to provide more 'organic' results than are possible with separable (tensor) approaches.
So here are MTF curves from three downsizing algorithms discussed in the thread, powered by ImageMagick. None were sharpened after downsizing, as before.
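For reference, an EWA resize of this kind can be invoked along the lines below; `-distort Resize` selects ImageMagick's EWA path and `-filter` picks the kernel. The exact settings used in the thread (including the Quadratic and 'Downsample 1%' variants) may well differ.

```python
import subprocess

subprocess.run([
    "convert", "edge_crop.tif",           # input file name assumed;
    "-filter", "RobidouxSharp",           # IM 7 users: "magick" not "convert"
    "-distort", "Resize", "25%",          # EWA resampling, 4:1 linear
    "edge_robidouxsharp_25pct.tif",
], check=True)
```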
Quadratic and Downsample 1% are much more attenuated than CS5 bicubic, while RobidouxSharp does a good job against it at the cost of a little extra energy beyond Nyquist. RobidouxSharp is referred to in the thread as the “sharpest reasonable” cubic kernel when used with EWA.
And here are a few more including Matlab bicubic. The chart is higher resolution in case you wish to click on it to take a closer look at full screen:
The blue lines are the Robidouxs and red is Mitchell. CS5's implementation of bicubic is 'sharper' than Matlab's, which I understand uses the original coefficients from Keys' 1981 paper and is shown as red dots. Interestingly, Matlab bicubic tracks Mitchell almost all the way. This site suggests CS5 bicubic is implemented through the Keys cubic filter family with B=0 and C=0.75.
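The whole cubic family can be written down compactly in the Mitchell-Netravali (B, C) formulation, of which Keys' 1981 kernels are the B=0 members (his parameter a equals -C, so Matlab bicubic would be B=0, C=0.5 and, per the link above, CS5 bicubic B=0, C=0.75):

```python
def bc_cubic(x: float, B: float, C: float) -> float:
    """Mitchell-Netravali BC-family cubic kernel, support |x| < 2.
    (B, C) = (1/3, 1/3) is Mitchell, (0, 0.5) is Catmull-Rom/Keys a=-0.5,
    and roughly (0.2620, 0.3690) is RobidouxSharp."""
    x = abs(x)
    if x < 1:
        return ((12 - 9*B - 6*C) * x**3
                + (-18 + 12*B + 6*C) * x**2
                + (6 - 2*B)) / 6
    if x < 2:
        return ((-B - 6*C) * x**3
                + (6*B + 30*C) * x**2
                + (-12*B - 48*C) * x
                + (8*B + 24*C)) / 6
    return 0.0
```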
In the thread you will also find some real-life image comparisons. I am quite interested in this subject so I will keep a close eye on it. For now it seems to me that the alternatives do not deliver enough visible IQ benefit to outweigh CS5 bicubic's ease of use, so I will stick with it in normal circumstances.
Incidentally, it wasn't until I started to play with these charts that I realized that the original MTF curve is plotted on a frequency scale multiplied by the inverse of the magnification factor (4x for the 25% downsampling discussed here) compared to the downsized MTFs. Therefore, when downsizing, what really matters as far as final image spatial resolution IQ is concerned is the portion of the original MTF curve closest to the origin. For the examples discussed above this means the first 1/8th of the original MTF plot, up to about 25 lp/mm or 0.125 cycles per pixel. That portion of the curve is more easily associated with what photographers would call 'contrast', rather than 'sharpness', in the original image.
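A quick back-of-envelope check, taking the D800e's pixel pitch as approximately 4.88 microns:

```python
pitch_mm = 4.88e-3                 # approximate D800e pixel pitch in mm
f = 0.125                          # cycles per original pixel (see above)
print(f / pitch_mm)                # ~25.6 lp/mm, about 1/8th of a plot
                                   # that runs out to 1 cycle per pixel
```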
I had explored this subject here in a different context in the past but hadn’t made the connection until now. Magnification is magnification. I really need to start paying more attention to MTFs between 80 and 100% when buying lenses.
* Please note that this is highly mathematical stuff and that, as Nicolas Robidoux mentions in the thread, 'the "slanted wedge" is not isotropic enough to tell "everything". It's too close to a separable situation, so issues that arise from the use of a separable (= tensor = orthogonal) filter are pretty much swept under the carpet.' Got it, thanks Nicolas and Bart 🙂
Hi Jack!
You are touching on some dangerous ground here 🙂 Discussions on the “perfect” downsizing method are just as contentious (and vigorously argued) as those around the “perfect” demosaicing algorithm.
My 2c: Keep in mind that slanted edge MTF measurements are averaged over the entire edge. The potential danger here is that a particular downsizing algorithm might improve this averaged sharpness score at the cost of destroying finer details (which may or may not be noise, depending on your image). This can lead to large differences between the subjective perception of downsizing quality versus MTF curves.
My personal favourite has always been the sinc-filter family (Lanczos in practice), though not so much in a photographic/aesthetic context as in a computer-vision, must-preserve-more-details context.
Thank you Frans, good point about edge optimization; I also don't mind Lanczos3. My point of reference is the highest SQF with minimum halos when viewing an unsharpened image at 100% at typical distance on a typical 100ppi monitor – always a compromise. Bart van der Wolf's page provides some visuals, although not on natural images. Unfortunately this stuff gets perceptual/subjective very quickly, so this post is just a clumsy attempt at measuring 'something'.
Well, I would not say it is clumsy at all. It goes a long way towards raising awareness of the distortion of the MTF curve that may result from downsizing.
I also like to look at the ESF after downsizing. That often shows the overshoot/ringing quite clearly, especially on a poorly tuned cubic filter.