This article is about specifying the units of the Discrete Fourier Transform of an image and the various ways that they can be expressed. This apparently simple task can be fiendishly unintuitive.
The image we will use as an example is the familiar Airy Pattern from the last few posts, at f/16 with light of mean 530nm wavelength. It is shown zoomed in on the left of Figure 1, and as it looks in its 1024×1024 sample image on the right:
Feeding the image in Figure 1b to Matlab’s Fast Fourier Transform routine produces its two dimensional Discrete Fourier Transform. The DFT is the same size as the original image, so also 1024×1024 values. It is shown below in Figure 2a after taking its magnitude and normalizing it to one at the origin, the top left hand corner. The result of these operations is referred to as the Modulation Transfer Function. Figure 2b is a plot of just the top row, all 1024 samples of it. It represents the spatial frequency response of the imaging system in the horizontal direction.
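As a minimal Matlab sketch of those steps (variable names like psf and mtf_row are mine, assuming the 1024×1024 intensity image has already been loaded into psf):

psf = double(psf);                % 1024x1024 sampled intensity Point Spread Function
MTF = abs(fft2(psf));             % magnitude of the two dimensional DFT
MTF = MTF / MTF(1,1);             % normalize to one at the origin (top left hand corner)
mtf_row = MTF(1,:);               % top row: horizontal spatial frequency response
plot(0:1023, mtf_row);            % frequency axis in cycles/set, as explained below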
Let’s concentrate on the simple one dimensional case first: what are the spatial resolution units of the plot, horizontal linear frequency in the x direction? They would be the units of the DFT image’s x-axis. In fact, because in this case the sampling points on the image plane are evenly spaced in the x and y directions, they would also be the units of the y-axis.
Cycles/set and Cycles/sample
The DFT only knows that it was fed a set of equally spaced samples. Its output is infinitely periodic but by convention it produces just the first period, which contains the same number of entries as the initial set, 1024 in this case.
By definition, for every integer index position k of the input set between zero and N-1, the DFT computes an output value F(k) by summing the results of the following function for every value in the input set, with dummy variable n cycling through the entire input set at each k:

(1)   F(k) = \sum_{n=0}^{N-1} f(n)\, e^{-i 2\pi k n / N}

with N the number of samples in the set, and n, k indicating the non-negative integers 0, 1, 2, … , N-1. The results of the DFT, F(k), are by convention laid out in the same sequence, from k = 0 to N-1.
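For the curious, equation (1) can be coded up directly and checked against Matlab’s built-in fft routine; this is just an illustrative sketch with arbitrary input, not taken from the article:

f = rand(1,1024);                            % any set of N equally spaced samples
N = numel(f);
F = zeros(1,N);
for k = 0:N-1                                % one output value per integer position k
    n = 0:N-1;                               % dummy variable cycling through the input set
    F(k+1) = sum( f .* exp(-1i*2*pi*k*n/N) );
end
max(abs(F - fft(f)))                         % difference should be negligibly small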
Every value in the DFT’s output therefore represents an increasing integer spatial frequency corresponding to the relative index position k: 0 cycles per set, 1 cycle per set, 2 cycles per set… all the way to the number of input samples (N) minus one, in this case 1023. Left to right, for every integer position k the units of the 1D DFT output are therefore

0, 1, 2, 3, … , N-1 cycles/set,
So cycles per input set are the natural units of the DFT. In our example the continuous Airy pattern on the sensing plane was sampled by a square grid, 1024 samples on the side. The output of the DFT in the horizontal direction would therefore correspond to linear spatial frequencies from 0 to 1023 cycles/set in 1 cycle/set increments, and those would be the units along the axes of the plot in Figure 2b.
We can convert these units to cycles/sample by dividing the frequencies in cycles/set by the number of evenly spaced samples per set (N samples/set) – watch how the units cancel out – to obtain the frequency range from 0 to almost 1 cycle/sample in increments of 1/N cycles/sample (1/1024th in our example), as shown in Figure 3a:

0, 1/N, 2/N, 3/N, … , (N-1)/N cycles/sample,
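In Matlab the two frequency axes could be written as follows; a sketch continuing the earlier snippet, so mtf_row is assumed from there:

N = 1024;
f_set    = 0:N-1;                 % frequency axis in cycles/set
f_sample = f_set / N;             % cycles/set divided by samples/set = cycles/sample
plot(f_sample, mtf_row);          % Figure 3a-style plot, 0 to almost 1 cycle/sample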
So with typical setups the DFT’s output can always be expressed in units of cycles/set and cycles/sample. In two or higher dimensions the units of each axis are computed individually following the same process, because DFTs are separable. In the absence of additional information, however, these units cannot be directly related to linear space or to other physical units.
Cycles per Pixel, lp/mm, lp/ph
Fortunately we normally do have information that relates the set of samples under review to physical units. In photography we know the size of the sensor that captured the image and the layout of the pixels it contains. The layout is typically a rectangular grid with evenly spaced pixels, the distance between their centers horizontally and vertically referred to as pixel pitch.
Assuming that the data set fed to the DFT was the result of one sample per pixel, we can replace the label of the x-axis of Figure 3a below with cycles/pixel pitch (c/p) – and if one knows the physical size of the sensor and/or the pixels it’s then easy to convert to more practical units like lp/mm, lp/ph etc.
For example if we know that the physical extent of the input set on the sensor corresponds to 2mm all we have to do is divide the result of the DFT in cycles/set by 2mm/set to obtain units of lp/mm (watch how the units cancel out). Similarly, if we know that pixel pitch is 4.33 microns and we have the result of the DFT in cycles/pixel we simply divide that by 0.00433mm/pixel to obtain units of lp/mm. Such conversions are explained in detail in this article from a photographer’s perspective.
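Both conversions might look like this in Matlab, reusing f_set and f_sample from the sketch above (the 2mm extent and 4.33 micron pitch are just the example figures quoted here):

extent_mm = 2;                    % physical extent of the input set on the sensor
f_lpmm_a  = f_set / extent_mm;    % cycles/set divided by mm/set = lp/mm

pitch_mm  = 0.00433;              % 4.33 micron pixel pitch
f_cp      = f_sample;             % cycles/pixel, assuming one sample per pixel
f_lpmm_b  = f_cp / pitch_mm;      % cycles/pixel divided by mm/pixel = lp/mm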
So that’s where units for the output of a Discrete Fourier Transform come from. If you’ve read this far you may also be interested in this practical 2D take on the same subject. The rest of this post explains how some of the units in previous articles of this series were arrived at.
One MTF, Twice the Data
By convention the DFT produces an image in the frequency domain of the same size as the original in the spatial domain (1024 values on the side). When the input is an intensity Point Spread Function, as is often the case in these pages, the second half is the mirror image of the first half because the PSF’s intensity is real – and when a Fourier Transform acts on a fully real function, the result displays Hermitian symmetry.
Since the function shown is in fact infinitely periodic, it is sometimes more meaningful to show the resulting curves centered around zero frequency. The plot would then appear as in Figure 3b, with the x-axis representing frequencies from -0.5 to almost 0.5 cycles/sample in 1/1024th cycles/sample increments:
-0.5, -0.5+1/N, -0.5+2/N, -0.5+3/N, … , 0.5-1/N cycles/sample.
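In Matlab the centering is typically done with fftshift; a minimal sketch, again assuming the N and mtf_row of the earlier snippets:

f_centered   = (-N/2 : N/2-1) / N;   % -0.5 to 0.5-1/N cycles/sample
mtf_centered = fftshift(mtf_row);    % move the zero frequency term to the middle
plot(f_centered, mtf_centered);      % Figure 3b-style plot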
In two dimensions the resulting MTF looks like a little tent, per the example shown in Figures 2 and 3 of the previous article. It then becomes more intuitive why the top line of the MTF in question corresponds to horizontal spatial frequencies.
SuperSampling
If the available image data referred to super-sampling at a rate of 5 samples/pixel, as in the examples of the last few articles, we would have five times the samples and would therefore need to multiply the shifted (or unshifted) frequency axis above by this figure. Watch how the units in Figure 3b cancel out, yielding a range of -2.5 to almost 2.5 cycles/pixel at 5/1024th c/p increments (about 1/204th c/p)*.
If a pixel is 5 micrometers on the side, we could work that into the units: -2.5 to almost 2.5 c/p in 5/1024th cycles/pixel increments divided by 0.005 mm/pixel yields a range of -500 to almost 500 lp/mm in increments of 1000/1024th lp/mm.
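As a sketch of those conversions, under the same assumptions of 5 samples/pixel and a 5 micron pitch (f_centered from the snippet above):

s_per_p  = 5;                        % samples per pixel (super-sampling rate)
pitch_mm = 0.005;                    % 5 micron pixel pitch
fs_cp    = f_centered * s_per_p;     % cycles/sample times samples/pixel = cycles/pixel, -2.5 to almost 2.5
fs_lpmm  = fs_cp / pitch_mm;         % cycles/pixel divided by mm/pixel = lp/mm, -500 to almost 500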
Both of these cases are shown below, zoomed into the frequencies of interest from zero to 1 c/p, which in this case correspond to zero to almost 200 lp/mm.
If one knows the vertical size of the photographic sensor, say typically 24mm for a Full Frame camera, one can multiply units in lp/mm by 24mm/picture height to obtain spatial resolution in lp/ph. Alternatively one can obtain lp/ph by multiplying spatial resolution in cycles/pixel by the number of pixels along the sensor’s side.
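A sketch of the two equivalent routes to lp/ph, assuming the 24mm sensor height above together with the earlier 5 micron pitch (which imply 4800 pixels on the vertical side):

height_mm = 24;                      % vertical size of a Full Frame sensor
height_px = height_mm / pitch_mm;    % 4800 pixels on the vertical side at a 5 micron pitch
f_lpph_1  = fs_lpmm * height_mm;     % lp/mm times mm/ph = lp/ph
f_lpph_2  = fs_cp * height_px;       % cycles/pixel times pixels/ph = lp/ph, same result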
Spatial frequency is often expressed in lp/mm or lp/ph when evaluating the performance of an imaging system for photographic applications.
Slicing and Dicing
So those are the units corresponding to the horizontal and vertical axes of Figure 2. They are the units of spatial frequency in the vertical and horizontal directions respectively. But what happens if we take a radial slice in a different direction, say at 45 degrees as shown in Figure 5a below (yellow line)? This would represent the imaging system’s frequency response in the diagonal direction. The number of samples in the set would still be the same, 1024, but the samples would physically be further apart as a result of the square sampling grid.
Remember that a sample or a set does not have physical dimensions. So if the result of the yellow diagonal FFT slice in cycles/set is plotted together with the earlier horizontal FFT in Figure 2 we would get this misleading graph:
We know that the two dimensional Airy disc and its MTF are radially symmetric, so the linear result in physical units should be the same no matter the direction of the slice of the 2D MTF. To correct the situation all we need to do is convert them both to comparable units. If the (horizontal) pixel pitch is 5 um, samples are 5√2 um (about 7.07 um) apart diagonally, because the sampling grid is square; correspondingly, the frequency samples along the diagonal of the DFT are √2 times further apart than those along a row. Therefore we need to multiply the diagonal spatial frequency units associated with the relative DFT output by the ratio of the sample spacings, √2. Here is the result, zoomed into the zero to 1 c/p range. Misleadingly, the ‘pixel’ in the spatial frequency units refers to pixel pitch, which is specified horizontally (and/or vertically) only by convention:
Slices in other directions would result in different linear sampling intervals which would need to be expressed in compatible physical units for comparison to others.
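A hedged sketch of the diagonal comparison just described, assuming the normalized MTF matrix of the first snippet, one sample per pixel and the √2 scaling above (diag simply extracts the main diagonal of the 2D MTF):

mtf_diag  = diag(MTF)';              % slice of the 2D MTF along its 45 degree diagonal
f_h_cp    = (0:N-1)/N;               % horizontal frequency axis in c/p, one sample per pixel
f_diag_cp = f_h_cp * sqrt(2);        % diagonal frequency samples are sqrt(2) further apart
plot(f_h_cp, MTF(1,:), f_diag_cp, mtf_diag);
xlim([0 1]);                         % zoom into the zero to 1 c/p range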
The ‘down’ Sampled Image
As a final example let’s revisit the sampled image of the previous article. In that case we pretended that Figure 7a below represented a continuous image. Here we will assume that it is the image in the raw data of the actual capture of an Airy pattern that we wish to downsize 5:1. The image in Figure 7b represents the nearest-neighbor downsampled result (nearest-neighbor implemented as averaging of the pixels in the yellow squares of Figure 7a).
The original capture to the left is the same Airy intensity pattern of Figure 1b above, captured in a 1024×1024 sample raw file. Every new pixel to the right corresponds to 5×5 of those before downsampling. That means that the downsized image will be made up of 204×204 pixels. The DFT of the downsampled image will therefore refer to vertical and horizontal spatial resolutions from 0 to 203 cycles/set. Since there are 204 pixels/set we divide the two figures to obtain a range of 0 to almost 1 cycles/new pixel in increments of 1/204th cycles/pixel. Using these spatial frequencies paired with the result of the DFT of the downsampled image produced the plots that compare the performance of the downsampled vs the original image, as shown in the previous article. Pixel in the units refers to the new downsampled pixel 5 um on the side.
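As a rough sketch of the 5:1 averaging and of the units of the downsampled DFT, again assuming the 1024×1024 psf matrix of the first snippet (the loop mirrors the 5×5 yellow squares described above):

M   = 204;                                   % 1024/5, truncated to whole new pixels
img = psf(1:5*M, 1:5*M);                     % keep the portion covered by whole 5x5 blocks
dwn = zeros(M);                              % downsampled image, one value per 5x5 block
for r = 1:M
    for c = 1:M
        block    = img(5*r-4:5*r, 5*c-4:5*c);
        dwn(r,c) = mean(block(:));           % average of the 5x5 block
    end
end
mtf_dwn = abs(fft2(dwn)); mtf_dwn = mtf_dwn / mtf_dwn(1,1);
f_newp  = (0:M-1) / M;                       % 0 to almost 1 cycles/new pixel, in 1/204th steps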
From here it’s easy to convert to typical units like lp/mm or lp/ph as discussed earlier.
*As an aside, MTF Mapper super-samples the Line Spread Function at 8 samples/pixel.
Hi,
How do I do supersampling? My camera’s pixel pitch is 7.4 microns, so its max spatial frequency is about 135 lp/mm. I want to know how to increase this.
Thank you!
Hello Aubrey,
Some cameras use the IBIS function to incrementally shift the sensor horizontally and vertically up to one pixel while taking a capture at each position. The 4, 8 or 16 raw captures may then be combined into a single supersampled file. Alternatively, software like picure or photoacute takes advantage of the natural shakiness of the photographer’s hands to take a number of shots back to back, then aligns and combines them into a single supersampled file.
If done properly the result is a ‘raw’ image with a finer sampling pitch (a higher sampling rate) but with the same pixel aperture. A smaller sampling interval means less aliasing. See for instance here.
Cheers!
Jack
Hi, Jack
Thanks very much for your reply!
The pixel size of my CCD device is 7.4 microns, but the technical support told me that they could test up to 600 lp/mm, so I wonder if there is any algorithm that can improve the sampling frequency other than changing the hardware.
Another problem is that when I perform an FFT on LSF data in MATLAB, the MTF reaches zero twice before the Nyquist frequency and then increases a little. I don’t know what went wrong.
Looking forward to your reply. Thanks a lot!
Hi Aubrey,
What are you trying to do and what data are you taking the FFT of? If you are trying to measure the MTF of your imaging system, the slanted edge method supersamples the edge so the MTF is virtually independent of pixel pitch (though still dependent on pixel aperture). If the MTF hits a zero before Nyquist your optics are probably poorly focused or aberrated. See for instance https://www.strollswithmydog.com/a-simple-model-for-sharpness-in-digital-cameras-defocus/ and https://www.strollswithmydog.com/the-slanted-edge-method/
Hello Jack,
I use the slit method to get the MTF of a small lens, so I can get the LSF data directly from the gray-level distribution, but when I try to get the MTF from the LSF, the MTF looks like a wave except at low frequencies (the MTF of the low-frequency part decreases rapidly from 1 to 0).
Thanks again for your reply
Hi Jack,
I don’t know if I understand the MTF units correctly. If I have a 1024-sample LSF obtained at 4 bins per pixel, would the resulting MTF have frequency values from 0 to 4 cycles/pixel? And the Nyquist frequency at 2?
Hello Anne, correct on both counts.
If you have a set of 1024 samples, the DFT of that set would span from 0 to 1-1/1024 cycles per sample (c/s) in 1/1024 c/s increments. In Matlab we would write the frequency axis as
0:1/1024:1-1/1024 (c/s).
If physically there are 4 s/p (p=pixel pitch) we would have
(0:1/1024:1-1/1024) c/s * 4 s/p
= 0:1/256:4-1/256 c/p (see how the units play out).
So the range is from 0 to almost 4 c/p and monochrome Nyquist is at half that.
Jack
Hi Jack,
Thank you for the answer. One more thing: in some articles I have seen the MTF frequency axis expressed as normalised frequency. What would that mean?
Say for instance that in imaging you were interested in expressing the frequency axis as a percentage of diffraction extinction, with 1/(λN) equal to 1, or as a number of Airy disc radii (1.22λN), or…
The conversions are easy, see the relative article.
Jack