A Simple Model for Sharpness in Digital Cameras – Sampling & Aliasing

Having shown that our simple two dimensional MTF model is able to predict the performance of the combination of a perfect lens and a square monochrome pixel with 100% Fill Factor, we now turn to the effect of the sampling interval on spatial resolution, according to the guiding formula:

(1) $\begin{equation*} MTF_{Sys2D} = \left|(\widehat{ PSF_{lens} }\cdot \widehat{PIX_{ap} })\right|_{pu}\ast\ast\: \delta\widehat{\delta_{pitch}} \end{equation*}$

The hats in this case mean the Fourier Transform of the relative component normalized to 1 at the origin ( $_{pu}$ ), that is the individual MTFs of the perfect lens PSF, the perfect square pixel and the delta grid; $**$ represents two dimensional convolution.

Sampling in the Spatial Domain

While exposed, a pixel sees the scene through its aperture and accumulates energy as photons arrive. Below left is the representation of, say, the intensity that a star projects on the sensing plane, in this case resulting in an Airy pattern since we said that the lens is perfect. During exposure each pixel integrates (counts) the arriving photons, an operation that mathematically can be expressed as the convolution of the shown Airy pattern with a square the size of the effective pixel aperture, here assumed to have 100% Fill Factor. It is the convolution in the continuous spatial domain of lens PSF with pixel aperture PSF shown in Equation (2) of the first article in the series.

Sampling is then the product of an infinitesimally small Dirac delta function at the center of each pixel, the red dots below left, by the result of the convolution, producing the sampled image below right.
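The two steps above (integration over the pixel aperture, then multiplication by a delta at each pixel center) can be sketched in one dimension. This is only an illustration under assumptions that are not in the article: the intensity values are made up, the grid is one sample per linear unit, and the function names are hypothetical. The pixel width of 5 units follows the example in Figure 1.

```python
def box_convolve(signal, width):
    """Centered moving sum of odd `width`; samples outside the signal count as zero."""
    half = width // 2
    n = len(signal)
    return [sum(signal[max(0, k - half):min(n, k + half + 1)]) for k in range(n)]


def sample_at_centers(signal, pitch):
    """Multiply by a delta comb: keep only the value at the center of each pixel."""
    return [signal[k] for k in range(pitch // 2, len(signal), pitch)]


def integrate_per_pixel(signal, pitch):
    """Directly sum the intensity falling on each pixel."""
    return [sum(signal[i:i + pitch]) for i in range(0, len(signal) - pitch + 1, pitch)]


# Hypothetical finely gridded intensity covering 4 pixels of 5 units each
intensity = [(7 * k) % 11 for k in range(20)]
pitch = 5

sampled = sample_at_centers(box_convolve(intensity, pitch), pitch)
direct = integrate_per_pixel(intensity, pitch)
print(sampled == direct)
```

Convolving with the pixel-sized box and then keeping only the values at the pixel centers gives the same numbers as summing the intensity that lands on each pixel, which is the sense in which integration plus sampling is "convolution followed by multiplication with a comb".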

Figure 1. Left, 1a: A highly zoomed (3200%) image of the lens PSF, an Airy pattern, projected onto the imaging plane where the sensor sits. Pixels shown outlined in yellow. A red dot marks the sampling coordinates. Right, 1b: The sampled image zoomed at 16000%, 5x as much, because in this example each pixel’s width is 5 linear units on the side.

In typical photographic sensors pixels are square and laid out in a square or rectangular grid as shown above. Therefore the Dirac delta functions at the center of each pixel follow the same grid pattern, also known as a lattice or a two dimensional comb (aka brush) function. Assuming a very large number of pixels, the two dimensional Dirac delta comb can be assumed to be infinitely long.

Sampling in the Frequency Domain

The 2D delta sampling grid can be expressed as follows in the spatial (left) and frequency domains (right):

(2) $\begin{equation*} comb(\frac{x}{a})\cdot comb(\frac{y}{b}) \Leftrightarrow |ab| comb(a f_x)\cdot comb(b f_y) \end{equation*}$

The Fourier transform of a 2D comb is a 2D comb. In our case $a$ and $b$ represent pixel pitch ( $p$ ) and are the same so the deltas are in a square grid at cycle/pixel-pitch spacing in the frequency domain (c/p or cycle per pixel in short).
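A quick numerical check of this property in one dimension, as a sketch only: it uses a finite, periodic comb and a plain discrete Fourier transform written out by hand, so the amplitudes differ from the continuous $|ab|$ scaling of Equation (2). The length `N` and the `pitch` of 5 samples are arbitrary choices for illustration.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform, fine for short signals."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


N, pitch = 40, 5  # 40 fine samples, one delta every 5 samples
comb = [1.0 if k % pitch == 0 else 0.0 for k in range(N)]

spectrum = [abs(v) for v in dft(comb)]
peaks = [m for m, v in enumerate(spectrum) if v > 1e-9]

# Peak positions in cycles per pixel pitch: bin m is m/N cycles per sample
print([m / N * pitch for m in peaks])
```

The only non-zero bins are spaced N/pitch apart, that is at integer multiples of one cycle per pixel pitch, so the transform of the comb is again a comb.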

Multiplications in the frequency domain become convolutions in the spatial domain and vice versa. So applying the spatial domain operations above in the frequency domain we obtain Equation (1). The spatial-domain convolution of the perfect lens PSF with the square pixel aperture discussed in the previous article becomes the product of their individual MTFs:

Sampled Diff+Square with Slice — Figure 2. The combined two dimensional MTF of perfect lens at f/16 and square pixel area with 100% Fill Factor (2a). A horizontal slice of the two dimensional MTF (2b).

Then sampling is accomplished by convolving in two dimensions the rectangular lattice of deltas one cycle per pixel pitch apart with the tent-looking combined MTF in Figure 2a. The result is many tent-like MTFs, each centered on a delta function in the grid:

Figure 3. The two dimensional Lens and Pixel Aperture MTF of Figure 2, convolved with the sampling Dirac delta function lattice of equation (2) in the frequency domain.

We are now in the discrete domain. Sampling the continuous function may have introduced aliasing.

Modeling Aliasing

It is clear that if the reference lens+pixel 2D MTF has some energy beyond the halfway point between two tents (the Nyquist-Shannon frequency), it will start overlapping with its neighbors once convolved with the delta sampling grid. This interference is called aliasing and it becomes more obvious when viewing Figure 3 in profile, aligned with the axis and projected against the X-Z plane:

Figure 4. Horizontal profile of the 2D MTFs in Figure 3.

We are normally used to seeing this information in the 0-1 c/p range only, as shown in Figure 2b above. Note how energy at spatial frequencies above 0.5 c/p intermingles with that of its neighbors. The result is that those frequencies are able to sneak back below Nyquist under an ‘alias’, masquerading as lower frequencies. Since image intensity is real, the convolved MTFs are mirror images of each other, so frequencies higher than Nyquist can be thought of as ‘folding’ back around 0.5 c/p. Once this happens it is impossible to tell the real low frequencies from the folded aliased ones, which are then free to produce undesirable artifacts like stair stepping, false color and moiré in the final photograph.

Aliasing is the reason why we see that uptick near 0.5 c/p in the system MTF horizontal radial slice shown in Figure 2b. It can be modeled by folding aliased frequencies above Nyquist back towards the origin and adding them to the unaliased model, as shown in Figure 5.

Modeling Alias — Figure 5. Ignoring phase, Aliasing can be modeled by folding frequencies above Nyquist back towards the origin and adding them to the unaliased model there. ‘Measured’ is a horizontal radial slice off the actual 2D Discrete Fourier Transform of the sampled image, as also seen in Figure 2.
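The folding model can be sketched along one axis as follows. The pixel pitch (6 μm) and wavelength (0.55 μm) are assumptions chosen only for illustration, since the article does not state the values behind its f/16 figures; `system_mtf` and `folded_mtf` are hypothetical names. Phase is ignored, as in Figure 5, so the replicas are simply added.

```python
import math


def system_mtf(f_cpp, pitch_um=6.0, wavelength_um=0.55, f_number=16.0):
    """Unaliased lens x pixel MTF along an axis; f_cpp in cycles per pixel pitch."""
    f = abs(f_cpp)
    s = f / pitch_um * wavelength_um * f_number  # frequency normalized to extinction
    if s >= 1.0:
        return 0.0  # beyond diffraction extinction
    lens = 2.0 / math.pi * (math.acos(s) - s * math.sqrt(1.0 - s * s))
    x = math.pi * f  # 100% fill factor: aperture width equals pitch
    pixel = 1.0 if x == 0.0 else abs(math.sin(x) / x)
    return lens * pixel


def folded_mtf(f_cpp, replicas=2):
    """Add the neighboring replicas centered at integer c/p (phase ignored)."""
    return sum(system_mtf(f_cpp - k) for k in range(-replicas, replicas + 1))


for f in (0.0, 0.25, 0.4, 0.5):
    print(f, round(system_mtf(f), 3), round(folded_mtf(f), 3))
```

With these assumed values extinction falls at about 0.68 c/p, so the folded curve matches the unaliased one at low frequencies and rises above it as 0.5 c/p is approached, where the copy centered at 1 c/p contributes as much as the baseband one.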

It then becomes clear why in order to be able to recover a signal perfectly it is necessary for contiguous MTF solids not to overlap, which means filtering away frequencies above the maximum desirable spatial frequency and sampling at least at twice that rate. This is known as the Whittaker-Shannon sampling theorem.

Directional Aliasing

We know that the two dimensional system MTF of the sampled image is not circularly symmetric because in photographic sensors the sampling grid typically sports a square layout and pixel aperture.

Neighbors above and below or to the left and right are normally one cycle per pixel pitch apart. Those diagonally across are however further away, $\sqrt{2}$ c/p apart. This suggests that in a typical photographic sensor there is less chance of aliasing when the detail being evaluated is in the 45 degree direction with respect to its origin, as can be gleaned from this central cutout of the 2D MTF in Figure 3:

Convolution PSF and Dirac grid Diagonal — Figure 6. Detail of the two dimensional system MTF of the monochrome photographic sensor in Figure 3 showing less potential aliasing energy overlap diagonally.

In our f/16 example it can be seen that there is aliasing overlap in the horizontal or vertical directions, say going from (0,0) to (1,0) but that there is no overlap diagonally, say from (0,0) to (1,1). This is quite obvious when plotting the two respective directional MTF radial slices on the same graph, showing potential for aliasing in the vertical and horizontal direction but not diagonally:

MTF of Sampled Perfect Lens and Pixel — Figure 7. Linear MTF in the horizontal and 45 degree direction of the perfect monochrome sensor with 100% FF square pixels and an unaberrated lens with circular aperture at f/16. The horizontal axis is in cycles per horizontal-pixel-pitch units.

This means that in most images there is likely to be less aliasing when the detail is not aligned with the axes and it explains why moiré patterns and aliasing artifacts are often more pronounced along them. The point is driven home by this test chart with sets of lines at zero, 22.5 and 45 degrees from vertical as captured in a Jpeg file by DPReview.com forum member OpticsEngineer:

Directional Aliasing: test target with lines at different inclinations shows increasing aliasing when the angle is more aligned with the sensor axis — Figure 8. Test chart captured by a Sony A7SII with Sony 20mm/1.8 lens at f/4. Only the green channel is displayed from the Out of Camera Jpeg file after brightening and 2x magnification via nearest neighbor upsampling. This example is less than optimal but still shows more aliasing as the lines become more aligned with the sensor axis. Image courtesy of OpticsEngineer.

Even though this Out of Camera image may be affected by demosaicing, sharpening and compression artifacts, the diagonal lines are clearly least aliased and therefore the easiest to count. The other two sets, on the other hand, show aliasing in a number of different ways.

Since aliasing takes many forms when displayed, how objectionable it is to viewers tends to be subjective. I personally find that once one learns to spot it one tends to notice it a lot more and it becomes bothersome. Some ‘sharp’ cameras are especially prone to it. Less aliasing is my main reason for preferring larger sensors with smaller pixels.

Nyquist in Two Dimensions

It should by now be clear that in imaging the spatial frequency beyond which aliasing may begin is direction dependent. As mentioned, this is because with pixels on a square grid the distance between neighboring MTF ‘tents’ is 1 c/p along the axes but further away at $\sqrt{2}$ c/p diagonally. The Nyquist frequency is colloquially defined as half that distance so it also changes with the direction of the detail being measured, producing aliasing when energy is present beyond $\frac{1}{2}$ c/p along the axes but only after $\frac{\sqrt{2}}{2 }$ c/p diagonally.

Therefore, with a square pixel grid and a frequency vector at angle $\theta$ relative to one of the axes, the Nyquist spatial frequency in two dimensions is actually

(3) $\begin{equation*} f_{Nyquist,\theta} = \frac{\cos(\theta)+\sin(\theta)}{2} \; \text{c/p} \end{equation*}$
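As a quick check, Equation (3) as written can be evaluated at the three line orientations used in the test chart of Figure 8. This sketch assumes $\theta$ is meant to lie between 0 and 90 degrees; the function name is hypothetical.

```python
import math


def nyquist_2d(theta_deg):
    """Equation (3): directional Nyquist frequency in c/p, for 0 <= theta <= 90 degrees."""
    theta = math.radians(theta_deg)
    return (math.cos(theta) + math.sin(theta)) / 2.0


for angle in (0.0, 22.5, 45.0):
    print(angle, round(nyquist_2d(angle), 3))
```

It returns 0.5 c/p along an axis and $\frac{\sqrt{2}}{2}$ c/p at 45 degrees, matching the two limits quoted in the text.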

There is a slightly different take on this subject in the article on the effect of a Bayer CFA on sharpness.

Anti Aliasing

The negative effect of aliasing can be controlled by filtering the original signal before sampling to limit energy captured above the Nyquist frequency. This is the job of an Anti Aliasing filter, discussed in a dedicated article. There is always a trade-off because no filter is perfect, so reducing the impact of frequencies above 0.5 c/p means necessarily also lowering some good frequencies below that – and perceived ‘sharpness’ with them.

Another way to reduce aliasing, all other things being equal, is to increase the sampling rate. One of the properties of spatial-frequency duality is that narrower features in one domain become wider ones in the other (the $a$ and $b$ factors of equation 2 are in the numerator in one and in the denominator in the other). Sampling at a smaller pitch spreads apart the 2D Dirac deltas in the frequency domain grid, therefore pushing the convolved MTF ‘tents’ further apart, reducing the chance of overlap.
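To put rough numbers on this, the sketch below compares the on-axis Nyquist frequency of two pixel pitches with the diffraction extinction frequency $1/(\lambda N)$ of a perfect lens. The pitches (6 μm and 3.75 μm), the wavelength (0.55 μm) and the function names are assumptions for illustration only.

```python
def nyquist_cy_per_mm(pitch_um):
    """On-axis Nyquist frequency, half of one cycle per pixel pitch, in cycles/mm."""
    return 1000.0 / (2.0 * pitch_um)


def extinction_cy_per_mm(wavelength_um, f_number):
    """Diffraction extinction frequency 1 / (lambda * N) in cycles/mm."""
    return 1000.0 / (wavelength_um * f_number)


cutoff = extinction_cy_per_mm(0.55, 16.0)
for pitch in (6.0, 3.75):
    nyq = nyquist_cy_per_mm(pitch)
    print(pitch, round(nyq, 1), round(cutoff, 1), "aliasing possible" if cutoff > nyq else "no overlap")
```

Under these assumptions the coarser pitch leaves lens energy above its Nyquist frequency at f/16, while the finer pitch moves Nyquist beyond extinction, so the neighboring 'tents' no longer touch along the axes.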

The Simplified Perfect Model

This concludes the explanation of our simple 2D MTF model. It works as advertised in its current form, based on a perfect lens and a monochrome sensor with perfect square pixels of 100% fill factor:

(4) $\begin{equation*} \begin{aligned} MTF_{Sys2D} &= \Big(\frac{2}{\pi}\left[\arccos(s)-s\sqrt{1-s^2}\right]\\ &\times \left|\frac{\sin(\pi f_{x} w)}{\pi f_{x} w}\right|\left|\frac{\sin(\pi f_{y} w)}{\pi f_{y} w}\right|\Big)\\ &\ast\ast\: comb(p f_x)\,comb(p f_y) \end{aligned} \end{equation*}$

with $s$ the linear spatial frequency $f$ normalized for extinction: $s =f\cdot\lambda N$ ; $f_{x}$ and $f_{y}$ the horizontal and vertical spatial frequency components so that $f = \sqrt{f_x^2+f_y^2}$ ; $w$ the effective linear size of a perfect square pixel on the side, directly related to linear fill factor; $p$ pixel pitch, the horizontal and vertical spacing of the centers of pixels as laid out on a rectangular grid in the sensor; $\times$ element-wise multiplication and $\ast\ast$ two dimensional convolution.
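Equation (4) can be sketched numerically as below, with frequencies expressed in cycles per pixel pitch. The convolution with the comb is written as a sum of shifted copies of the lens-and-pixel term, truncated to a few neighbors, and the comb's constant scale factor is dropped. The 6 μm pitch, 0.55 μm wavelength, f/16 aperture and all function names are assumptions for illustration, not values taken from the article.

```python
import math


def baseband_mtf(fx, fy, pitch_um=6.0, wavelength_um=0.55, f_number=16.0, fill=1.0):
    """Lens x pixel-aperture term of Equation (4); fx, fy in cycles per pixel pitch."""
    f = math.hypot(fx, fy)
    s = f / pitch_um * wavelength_um * f_number  # s = f * lambda * N
    if s >= 1.0:
        return 0.0  # beyond diffraction extinction
    lens = 2.0 / math.pi * (math.acos(s) - s * math.sqrt(1.0 - s * s))

    def sinc_abs(u):  # |sin(pi u) / (pi u)| with u = f * w in cycles
        return 1.0 if u == 0.0 else abs(math.sin(math.pi * u) / (math.pi * u))

    # w = fill * pitch, so f_x * w equals fx * fill when fx is in c/p
    return lens * sinc_abs(fx * fill) * sinc_abs(fy * fill)


def sampled_mtf(fx, fy, replicas=2, **kwargs):
    """Convolution with the comb: sum of copies centered on the integer c/p lattice."""
    r = range(-replicas, replicas + 1)
    return sum(baseband_mtf(fx - j, fy - k, **kwargs) for j in r for k in r)


print(sampled_mtf(0.5, 0.0), baseband_mtf(0.5, 0.0))  # half way to the horizontal neighbor
print(sampled_mtf(0.5, 0.5), baseband_mtf(0.5, 0.5))  # half way to the diagonal neighbor
```

With these assumed values the sampled MTF half way to the horizontal neighbor is twice the baseband value, because two copies overlap there, while half way to the diagonal neighbor it is zero, in line with the directional behavior described for the f/16 example.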

We can use this simplified model to start answering questions about the effects of diffraction and pixel size on the spatial resolution performance of our photographic equipment. Next we will add antialiasing filters and simple aberrations to the model, then test it against real captures.

*The units of spatial frequency $f$ are described in detail in this article.

8 thoughts on “A Simple Model for Sharpness in Digital Cameras – Sampling & Aliasing”

Francois Berthiaume says:

January 8, 2022 at 10:52 pm

Hi Jack, thanks for sharing your knowledge throughout these posts, it is very appreciated. This post helped me understand the SFR of a pixel at 45 deg (triangle and sinc^2). I tried looking for references on this subject but all I got was this post and another comment you made in another blog. I was wondering if you have a reference I could cite where this relationship is discussed. Thanks!

Jack says:

January 9, 2022 at 1:15 am

Hello Francois,

That is indeed valid for a square pixel and microlens combination that accept light perfectly from all directions. In practice effective pixel aperture is never quite a perfect square, sometimes by design sometimes by chance. A pillow shape is more typical.

Anyways, as far as your question goes, the ideal diagonal SFR is sinc^2 because when looking at frequency as a vector in 2D (f, with components fx in the x direction, fy in the y direction) the response is sinc(fx)*sinc(fy), and at 45 degrees fx = fy by definition, so you can also write sinc(f)^2, with f = fx = fy. Any 2D Fourier Transform reference will do as a source, I like chapter two of Goodman’s Fourier Optics.

Jack

1. Francois Berthiaume says:
  
  January 9, 2022 at 3:05 am
  
  Thanks!
  
Bruce says:

June 16, 2024 at 7:55 pm

Hi Jack! Can you give me a simple equation to calculate the MTF from a specific frequency and pixel size? Like the MTF at 40 lp/mm for a 3.76 μm Bayer CMOS sensor. Thank you very much!

1. Jack says:
  
  June 17, 2024 at 8:01 am
  
  Hello Bruce, with or without lens?
  
  If you are interested in the effect on MTF of effective pixel aperture for a complete imaging system (lens and all) at a given gaussian spot in the FOV then you need to specify f-number, aberrations, direction etc. You need a plot like the one in Figure 10 in the previous article for the specific setup so that you can read MTF off it at the desired spatial frequency.
  
  Jack
  
  1. Bruce says:
    
    June 17, 2024 at 10:29 am
    
    Hi Jack! What I want to learn is, under the Nyquist frequency of a low pixel sensor, the difference in MTF between the low pixel sensor and the high pixel sensor (in the same frame) at a specific frequency. So I assumed an ideal lens to explore my question. Thank you!
    
    1. Jack says:
      
      June 17, 2024 at 3:45 pm
      
      For an ideal square pixel and an ideal lens with a circular aperture you would need to generate a theoretical system MTF curve by multiplying element-by-element the MTF curve of diffraction at a given effective f-number (Equation 2 in the previous article) by the MTF curve of the effective pixel aperture (Equation 3 there). Then read the MTF of the resulting system curve at the desired spatial frequency. Repeat for other pixel apertures or f-numbers.
      
      For example, comparing an ideal Full Frame sensor in the green channel at 60MP vs 24MP horizontally (say square pixel aperture of 3.75μm vs 6μm on the side) mounting a diffraction limited lens at f/5.6, this is what you would get:
      
      Nyquist is half way to each curve’s null. Keep in mind that in practice the 24MP sensor would want some sort of an antialiasing filter and, alas, unaberrated lenses do not exist in photography.
      
      Jack
      
      1. Bruce says:
        
        June 17, 2024 at 6:17 pm
        
        Thank you very much!

Strolls with my Dog