Minimalist ESF, LSF, MTF by Monotonic Regression

Because the Slanted Edge Method (for estimating the Spatial Frequency Response of a camera and lens) is one of the more popular articles on this site, I have fielded variations on the following question many times over the past ten years:

How do you go from the intensity cloud  produced by the projection of a slanted edge captured in a raw file to a good estimate of the relevant Line Spread Function – while touching the data as little as possible so as to be confident in the absolute value of the resulting SFR?

Figure 1.  Slanted edge captured in the raw data and projected to the edge normal.  The data are noisy because of shot noise and PRNU.  How to estimate the underlying edge profile (orange line, the Edge Spread Function)?

So I decided to write down the answer that I have settled on.  It relies on monotone spline regression to obtain an Edge Spread Function (ESF) and then reuses the parameters of the regression to infer the Line Spread Function (LSF) analytically.  This front-loads all uncertainty to just the estimation of the ESF from raw data since the other steps on the way to the SFR (derivative and Fast Fourier Transform) become purely mechanical.

This minimalist, what-you-see-is-what-you-get approach gets around the usual need for signal conditioning such as binning and finite difference calculations, with their drawbacks and compensations.  The approach has the potential to be further refined so consider it a hot-rod proof of concept.  Even so it is an intuitively direct implementation of the method and it provides strong validation of Frans van den Bergh’s MTF Mapper, the undisputed champ in this space,[1] as it produces very similar results with slanted edges captured with good technique in a raw file.

I expect this article to only be of interest to keen DIY imaging nerds so I will dive right in, presuming familiarity with the preliminaries, which were partly covered in the article up top.   If you come up with some refinement or spot some weaknesses let me know, I am interested as always.

The Slanted Edge Method in a Nutshell

The slanted edge method aims to estimate the Modulation Transfer Function (a.k.a SFR) of an imaging system by capturing slanted edges in a raw file.  The procedure is simple in principle:  capture with good technique a 100+ pixel slanted edge in the raw data with a monochrome or Bayer image sensor, subtract the black-level and  estimate its angle \theta accurately so that the two-dimensional pixel grid can be projected to a one-dimensional edge normal.

Figure.  Pixel grid projected to  edge normal with relative pixel intensities, the Edge Spread Function.  Differentiation results in a Line Spread Function, which after regularization and Fast Fourier Transformation produces the Modulation Transfer Function of the imaging system in the center of the edge in the direction perpendicular to it.  Image courtesy of Frans van den Bergh, modified to reflect article variables.

The result is data pairs (x_i,y_i), with x_i representing the exact position on the normal and y_i the noisy intensity of a projected pixel.  Once sorted on x and plotted, the data pairs typically appear at non-uniform intervals on the x axis and look something like Figure 1.
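As a sketch of this projection step (Python with numpy; the function and variable names are my own for illustration, not from the attached Matlab code):

```python
import numpy as np

def project_to_edge_normal(img, theta):
    """Project a slanted-edge crop onto the edge normal (illustrative sketch).

    img:   2-D array of black-level-subtracted raw intensities
    theta: estimated edge angle in radians
    Returns (x, y): signed distances along the edge normal in pixels and
    the matching intensities, sorted by x.
    """
    rows, cols = np.indices(img.shape)
    # Signed distance of each pixel center from a line at angle theta
    # through the origin, measured perpendicular to the edge.
    x = cols * np.cos(theta) - rows * np.sin(theta)
    order = np.argsort(x, axis=None)
    return x.ravel()[order], img.ravel()[order]
```

Each pixel contributes one (x_i, y_i) pair, which is why the projected data are super-sampled at non-uniform intervals.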

The projected (x_i,y_i) data set represents a super-sampled, noisy, intensity profile of the edge.  The underlying edge profile curve is 1) estimated as the so-called Edge Spread Function;  it is then 2) regularized and 3) differentiated (not necessarily in that order) to obtain the relative Line Spread Function.  This article is mainly about these three critical steps.

Finally the LSF is put through a Fast Fourier Transform routine to produce its Spatial Frequency Response, which in this context represents the Modulation Transfer Function (MTF) of the imaging system at the center of the edge in the direction perpendicular to it.
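The mechanics of that last step can be sketched in a few lines (Python; names are illustrative, not from any particular package):

```python
import numpy as np

def mtf_from_lsf(lsf, samples_per_pixel=4):
    """Normalized |FFT| of a regularly sampled LSF.

    samples_per_pixel is the supersampling rate of the LSF, so the
    returned frequency axis is in cycles/pixel.
    """
    mtf = np.abs(np.fft.rfft(lsf))
    mtf = mtf / mtf[0]          # normalize so MTF(0) = 1
    freq = np.fft.rfftfreq(len(lsf), d=1.0 / samples_per_pixel)
    return freq, mtf
```

With a quarter-pixel sampling step the frequency axis extends to 2 cycles/pixel, well past the Nyquist frequency of 0.5 c/p.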

Why Monotonic?

The short answer is that ideally the LSF is just a projection to one dimension of the normalized impulse response of the system, the two dimensional Point Spread Function, which is by definition made up of only non-negative intensity.  Therefore the Line Spread Function is also non-negative.    The ESF is the running integral (the cumulative sum) of the LSF.

Point Line and Edge Spread Functions of an imaging system with Antialiasing Filter and other characteristics similar to a Nikon Z7 with 24-70mm at f/4.
Figure 2.  Left: PSF of a typical sensor with an antialiasing filter and a lens at f/4. Center: LSF, same intensities as PSF but projected on the edge normal.  Right: ESF, cumulative sum of LSF left to right.

As the running total moves along the LSF it only finds non-negative numbers to add up so the resulting ESF curve can only increase or stay the same at each subsequent step.  Formally, the physical profile of the edge from a linear raw file is always monotonic (a.k.a. isotonic) as clearly seen in the trend in Figure 1,[*] a fact we should be able to rely on and exploit to our advantage.

Processing such as sharpening and some demosaicing can induce over and undershoots in the ESF so monotonicity is only guaranteed in the case of linear raw data, which is solely what this article and these pages are concerned with.

Noisy, Noisy

However, still referring to Figure 1, captured raw intensities in DN are not monotonic when projected onto the edge normal because they are noisy.  In testing conditions sources of noise are mainly shot noise and (often to a lesser extent) Pixel Response Non Uniformities, both of which increase with intensity, the reason why the dark portion of the projected edge looks less noisy than the bright one.

This is a problem because in order to obtain the MTF curve we need to differentiate and regularize the projected raw data to obtain a LSF to feed to a Fast Fourier Transform routine.  Differentiation amplifies noise.  It is usually implemented as finite differences of the (x_i,y_i) raw data pairs, making an unconditioned ESF unusable in virtually all circumstances:
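A quick numerical illustration of the amplification (Python, with synthetic numbers rather than a real capture): first differences of independent noise have sqrt(2) times its standard deviation, while the edge signal they carry per sample stays small.

```python
import numpy as np

rng = np.random.default_rng(1)
noise = rng.normal(0.0, 1.0, size=100_000)  # i.i.d. noise on the ESF samples
diff = np.diff(noise)                       # finite difference, as for an LSF
# Var(a - b) = Var(a) + Var(b) for independent samples, so the
# standard deviation of the differences grows by sqrt(2):
ratio = diff.std() / noise.std()
```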

LSF by finite difference of noisy ESF
Figure 3.  What happens if you differentiate the data in Figure 1 by finite difference.

So how best to tease out the monotonic edge profile curve hiding in the noise, nuances and all?   Some folks early in the journey assume that the ESF or the LSF are only made up of well behaved shapes like sigmoids or gaussians and assume that’s what’s behind the noise.  However, differences from simple shapes are where high frequency detail resides, so by doing so they give up some accuracy and throw out a meaningful portion of the spectrum unnecessarily.

In photography we are often interested in those high frequencies because they can give a measure of the physical characteristics of the imaging system, such as lens working f-number, the size of effective pixel aperture, the presence of antialiasing or other filtering.  Horses for courses, I guess.

The Naive Approach

One solution often proposed in the literature is to average the projected raw intensities, say every quarter or eighth of a pixel.  This does regularize the data while providing a measure of noise reduction, though it is equivalent to convolving the raw  capture of the edge with a weirdly shaped filter, changing the underlying data and affecting the Line Spread Function, hence the frequency response of the captured image.

Differentiation to obtain the LSF is then usually computed by finite difference, another operation which effectively filters the data, amplifying noise.  Since the two filtering functions above are known, their effect can be partly reversed in the frequency domain, and that’s what the naive approach does, see for instance this Peter Burns paper.[2]  However, that leaves the higher frequencies with little energy, making them noisy and often unusable.
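For concreteness, the binning step of the naive approach might look like this in Python (a sketch with made-up names, not sfrmat5’s actual code):

```python
import numpy as np

def bin_esf(x, y, bins_per_pixel=4):
    """The 'naive' conditioning step: average projected intensities into
    uniform quarter-pixel (or similar) bins.  Empty bins are dropped.
    Equivalent to convolving the projected data with a box-like filter."""
    w = 1.0 / bins_per_pixel
    edges = np.arange(np.floor(x.min()), np.ceil(x.max()) + w, w)
    idx = np.digitize(x, edges) - 1
    centers, means = [], []
    for i in range(len(edges) - 1):
        sel = idx == i
        if np.any(sel):
            centers.append(edges[i] + w / 2)
            means.append(y[sel].mean())
    return np.array(centers), np.array(means)
```

The averaging regularizes and denoises, but it is exactly the filtering operation whose effect then has to be divided back out in the frequency domain.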

Too bad, because there is often neat information to be gleaned there, for instance effective pixel aperture size or antialiasing filter strength.  Below left, the naive approach as represented by sfrmat5 at default settings understates the MTF and misses the AA and pixel aperture nulls at 2/3 c/p and 1.143 c/p respectively.[3]  The minimalist approach to the right instead follows the theoretical MTF curve almost perfectly – as does our reference, MTF Mapper.

Comparison of MTF curves produced by sfrmat5 and Jack's minimalist method for generating ESF and LSF
Figure 4.  MTF from theoretically accurate, noisy slanted edge captured by a monochrome sensor with the noise characteristics of a FF MILC camera, known pixel aperture null at 1.143 c/p and antialiasing filter null at 2/3 c/p, with a 24mm lens at f/4.  The ideal PSF, LSF and ESF are those shown in Figure 2.

The naive approach does not result in a physically correct, monotonic ESF, thus losing sensitivity and detail.  It is also forced to compensate for the data conditioning that it performs, adding biases and losing accuracy along the way, as reader Larry pointed out a while back.

Why Regression

Regression fits the bill because it can answer questions like this: given the entire set of data in Figure 1, what is the one monotonic curve (ESF) most likely to minimize squared deviations from it?

The uncertainty here is only in the intensities (y_i) while the other variable (x_i) can be considered to be known exactly so the least squares minimization criterion is relevant, also because the dominant source of noise, shot noise, induces intensity swings proportional to the square root of the signal.

My first attempt at generating an ESF by monotonic regression worked but was convoluted and took a quarter of an hour to compute for a typical edge.  I mentioned this to Frans who quickly found a lightning fast C implementation of it by Dr. Jan de Leeuw, based on J.B. Kruskal’s Pool Adjacent Violators algorithm (PAVA).[4]

The Pool-Adjacent-Violators-Algorithm

Dr. de Leeuw is Distinguished Professor Emeritus of Statistics at UCLA and former chairperson of the Department of Data Theory at Leiden University.  As it turns out Matlab has its own public domain version of it, lsqisotonic,[5]  so suddenly I could generate monotonic ESFs from edge raw intensity data pairs (x_i,y_i)  in microseconds.
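For reference, the algorithm itself fits in a few lines of plain Python (a sketch, not de Leeuw’s C or Matlab’s lsqisotonic):

```python
def pava(y):
    """Pool Adjacent Violators: least-squares non-decreasing fit to y,
    which must already be sorted by its x coordinate."""
    levels, weights, counts = [], [], []
    for value in y:
        levels.append(float(value)); weights.append(1.0); counts.append(1)
        # Pool neighboring blocks while they violate monotonicity.
        while len(levels) > 1 and levels[-2] > levels[-1]:
            w = weights[-2] + weights[-1]
            pooled = (weights[-2] * levels[-2] + weights[-1] * levels[-1]) / w
            levels[-2:] = [pooled]
            weights[-2:] = [w]
            counts[-2:] = [counts[-2] + counts[-1]]
    out = []
    for lv, c in zip(levels, counts):
        out.extend([lv] * c)
    return out
```

Each “pool” replaces a run of violating points with their mean, which is exactly what produces the stair-stepping visible in Figure 5.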

Monotonic PAVA regression from raw projected slanted edge data to Edge Spread Function
Figure 5. Kruskal’s Pool-Adjacent-Violators Algorithm regression provides a first approximation to the ESF.  But the stair-stepping creates harmonics which result in high frequency noise.

As you can see in the Figure above results are pretty good when seen at scale but the algorithm produces jumps from one monotone level to the next introducing stair-stepping in the curve.   This generates harmonics resulting in a noisier response at higher frequencies.

An additional issue is that the resulting PAVA ESF produces a regressed \hat y_i data point at each original x_i so the data still needs to be differentiated and regularized in order to be fed to the Fast Fourier Transform routine.  This is easiest done by – you guessed it – binning, which partly defeats the purpose because it results in another version of the sharpness-reduced and noise-increased LSF of the naive approach.

Nevertheless the PAVA regressed ESF is indeed monotonic and better behaved than most so if you bin it, say, every eighth or quarter of a pixel and compensate for it and the finite difference subtraction in the frequency domain, the resulting MTF turns out to be better than what the vast majority of algorithms out there are able to produce from raw data:

Figure 6.  MTF from Slanted edge captured in the center of the raw file of a Nikon Z7 with 24-70/4 S lens at 24mm, f/5.6, white balanced and fed as a grayscale image to both sfrmat5 (left) and the described PAVA algorithm (right).

So how to avoid the stepping, binning and differentiating problems?

Monotone Spline Regression to the Rescue

With some time on my hands during Covid I contacted Dr. de Leeuw for a solution to the stair-stepping issue.  He suggested I take a look at his work on monotonic spline regression.  I did – and never looked back.

The idea behind splines in our context is to break down the pixel axis in Figure 1 into arbitrary sequential segments and fit the projected raw intensity data (x_i,y_i) within each segment with a polynomial of a chosen degree in such a way that the curve generated in each segment is continuous with its neighbors.

I am taking a few liberties with the narrative for simplicity here; refer to Dr. de Leeuw’s ‘Computing and Fitting Monotone Splines’ working paper for the correct level of detail in this explanation.[6]  I also found this a helpful visualization, courtesy of Mathworks (not an ESF):

Figure 7. Top: cubic splines; bottom: the relative spline function, courtesy of Mathworks.

As a reminder a cubic polynomial, for instance, has this form

(1)   \begin{equation*} \begin{aligned} f(x) &= ax^3+bx^2+cx+d \\ \\ &= \begin{bmatrix} x^3 & x^2 & x & 1 \end{bmatrix} \left[\begin{array}{c} a \\ b \\ c \\ d \end{array}\right] \\ \\ &= h*u \end{aligned} \end{equation*}

with a not zero and x the horizontal axis.  The formula in the middle is the same as the one at the top but in matrix notation, which makes it easier to read and to follow related code in the notes.  The first term on the right hand side collects the basis functions, call them h, while the second collects the coefficients, call them u.  Matrix multiplication is indicated by the * character here.

The highest exponent determines the so-called degree of the polynomial, in this example 3.  The idea is to use regression within each segment to find the [a,b,c,...] coefficients that best match the (x_i,y_i) data by least squares under a monotonicity constraint.

The resulting curve spanning the entire data set is called a spline function and it is equal in each segment to the relative f(x) so derived.  It is fully defined by the segmentation, degree and coefficients of the functions.  And it works well on slanted edges captured in the raw data as shown below and in the appendix:

Figure 8.  Comparing PAVA and B-Spline regression. Results are similar but the latter is smoother and therefore results in cleaner upper frequencies.

Spline regression results in a smooth ESF curve that closely follows the earlier PAVA result, though without its edginess and related high frequency noise.   The open circles in the Figure to the left indicate the start and end of the individual segments.  They are known as knots in spline theory, the space between two of them as the relative knot span.

A Knotty Problem

Within the span of any two knots we have a polynomial spline similar to that in Equation (1), with its own set of coefficients.  The full sequence of them defines the spline function.  Such an ESF is then fully defined by those coefficients, the knots and the chosen spline degree

(2)   \begin{equation*} ESF(x) = f^{[)}_1(x) + f^{[)}_2(x) + f^{[)}_3(x) +....+ f^{[)}_n(x) \end{equation*}

for as many knot spans as there are, each f_i(x) valid only in the given knot span (indicated by ^{[)}).  For n knot spans this can be written as

    \begin{equation*} ESF(x) = \sum_{i=1}^{n} f^{[)}_i(x) \end{equation*}

or equivalently, using Equation (1) in matrix notation

(3)   \begin{equation*} \begin{aligned} ESF(x) &= \left[ \begin{array}{c} \text{polynomial} \\ \text{spline} \\ \text{basis} \\ \text{functions} \\ \text{size } (x, degree, knots) \end{array} \right] \left[\begin{array}{c} a_1 \\ b_1 \\ c_1 \\ \dots \\ d_n \\ \dots \end{array}\right] \\ \\ &= h(x,d,k) * u(y,h) \end{aligned} \end{equation*}

with h representing the basis function matrix and u the coefficient vector.  Refer to Dr. de Leeuw’s working paper for details on how h can be constructed given x_i, the chosen spline degree d and a number of (not necessarily uniformly distributed) knots k.[6]
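As an illustration only – de Leeuw’s paper and the attached Matlab code are the references – one way to build such an h is the classic Cox–de Boor recursion:

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """B-spline basis matrix h of shape (len(x), len(knots)-degree-1),
    built with the Cox-de Boor recursion.  knots must be non-decreasing,
    with degree+1 repeated boundary knots at each end."""
    t = np.asarray(knots, float)
    x = np.asarray(x, float)
    # Degree 0: indicator function of each half-open knot span.
    B = np.zeros((x.size, t.size - 1))
    for i in range(t.size - 1):
        B[:, i] = (x >= t[i]) & (x < t[i + 1])
    last = np.nonzero(np.diff(t) > 0)[0].max()
    B[x == t[-1], last] = 1.0           # close the final span at x = t[-1]
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        Bn = np.zeros((x.size, t.size - d - 1))
        for i in range(t.size - d - 1):
            left = t[i + d] - t[i]
            right = t[i + d + 1] - t[i + 1]
            if left > 0:
                Bn[:, i] += (x - t[i]) / left * B[:, i]
            if right > 0:
                Bn[:, i] += (t[i + d + 1] - x) / right * B[:, i + 1]
        B = Bn
    return B
```

A useful sanity check is the partition of unity: within the knot range the bases in each row sum to one.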

With h in hand we solve for coefficients u  by regression, minimizing the square difference between the target spline function and projected raw edge intensity y_i at every x_i .   In other words

(4)   \begin{equation*} minimize || h(x,d,k) * u(y,h) - y||^2 \end{equation*}

by varying u.  The problem in our case is always overdetermined and if we didn’t need a monotonic ESF we could simply solve the normal equation to obtain a non-monotonic spline regressed curve.  We do want a monotonic curve out of the regression, however.

Monotonic Basic Splines

Splines come in different flavors depending on how they are normalized; de Leeuw uses basic splines (B-Splines) because at every point they sum to one (a partition of unity), thus simplifying the math.

As clearly explained in his paper there are a number of options to obtain monotonicity, all variations on the same theme.  I chose the one that requires that the coefficients u be monotonic themselves, the result of which is equivalent to an I-Spline fit, see section 5.3.  This requirement is easily enforced by solvers available in Python, R or any number of other packages, as shown a couple of different ways in the attached Matlab code (e.g. lsqlin).[8]
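One self-contained way to impose the constraint (a numpy-only sketch of the idea, not the lsqlin call in the attached code): write u as a free starting level plus non-negative increments, which turns the monotone fit into a non-negative least squares problem, solved here with simple coordinate descent.

```python
import numpy as np

def nnls_cd(A, b, iters=2000):
    """Non-negative least squares, min ||A v - b||^2 with v >= 0,
    by cyclic coordinate descent (for illustration; convex, so it
    converges to the global optimum)."""
    AtA, Atb = A.T @ A, A.T @ b
    v = np.zeros(A.shape[1])
    for _ in range(iters):
        for j in range(v.size):
            if AtA[j, j] > 0:
                r = Atb[j] - AtA[j] @ v + AtA[j, j] * v[j]
                v[j] = max(0.0, r / AtA[j, j])
    return v

def monotone_fit(h, y):
    """Fit y ~ h @ u with non-decreasing coefficients u.
    u = C v with C lower-triangular ones: v[0] is the starting level
    (freed in sign by an extra mirrored column) and v[1:] are the
    non-negative steps between successive coefficients."""
    n = h.shape[1]
    C = np.tril(np.ones((n, n)))
    hC = h @ C
    A = np.hstack([hC, -hC[:, :1]])   # last column: negative part of v[0]
    v = nnls_cd(A, np.asarray(y, float))
    steps = v[:n].copy()
    steps[0] -= v[n]                  # recombine the signed starting level
    return C @ steps
```

With h equal to the identity this reduces to plain isotonic regression, which makes it easy to test.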

So with h and the result of the regression in hand (u), we now have a monotonic edge curve as a spline function:

(5)   \begin{equation*} \hat{y}(x) = h(x,d,k) * u(y,h) = ESF(x). \end{equation*}

Note that once we have obtained coefficients u for the given (x_i,y_i) projected raw data set, we can easily reconfigure h(x,d,k) to infer ESF(x) for any x, not just the original series of x_i.

The Beauty of a Spline ESF: a Free Regularized LSF

It should be coming into focus that the ability to use analytical formulas for the ESF allows us to bypass the two problematic steps discussed earlier, eliminating the need for binning and having to take finite differences.

We all know from high school that to differentiate polynomials we rearrange the coefficients mechanically.  For instance the derivative of Equation (1) is

(6)   \begin{equation*} f'(x) = 3ax^2+2bx+c \end{equation*}

It works similarly with splines, as shown by Dr. Ching-Kuang Shene in his course at Michigan Technological University:[7]  the derivative of a B-spline curve is another B-spline curve of one less degree, on the original knot vector, with a new set of coefficients (u') algebraically calculated from the original parameters (the degree d of the spline function, knots k, and u).  From Dr. Shene’s spline derivative page, the new coefficients are[*]

    \begin{equation*} u'_i = \frac{d}{k_{i+d+1} - k_{i+1}} \left( u_{i+1} - u_i \right). \end{equation*}

In other words drop a degree on h to generate h', calculate u' as above and infer the derivative as h'*u':

(7)   \begin{equation*} ESF'(x) = h'(x,d\text{-}1,k) * u'(u,d,k) = LSF(x) \end{equation*}

So if we know the ESF(x) spline function – and after regression we do – we automatically also know its derivative LSF(x).

In addition, as mentioned in the previous section both are functions of x – any x – since they are made up of analytical formula snippets, which means that we can choose how often and exactly where to evaluate the spline functions.  We of course choose to sample the LSF in a regular fashion and as often and as far as needed so that the FFT routine will produce an MTF curve with the desired resolution and range.[8]
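The coefficient shuffle of Equation (7) is essentially a one-liner (Python sketch; the function name is mine):

```python
import numpy as np

def bspline_derivative_coeffs(u, knots, degree):
    """Standard B-spline derivative: returns the coefficients u' of the
    degree-1 derivative spline and its knot vector (one knot dropped at
    each end), per the formula u'_i = d*(u_{i+1}-u_i)/(k_{i+d+1}-k_{i+1})."""
    t = np.asarray(knots, float)
    u = np.asarray(u, float)
    num = degree * np.diff(u)
    den = t[degree + 1 : degree + len(u)] - t[1 : len(u)]
    # Guard against zero-length spans from repeated knots.
    up = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return up, t[1:-1]
```

A quick check with a degree-1 (piecewise linear) spline through (0,0), (1,1), (2,3): the derivative coefficients come out as the segment slopes, 1 and 2.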

Monotone spline regression to obtain ESF, LSF and MTF
Figure 9.  Monotonic B-Spline regression to obtain the ESF (left), infer the relative LSF (center) and calculate the MTF (right).   Slanted edge captured in the center of the raw file of a Nikon Z7 with 24-70/4 S lens at 24mm, f/5.6.

So fitting raw slanted edge data with B-Spline functions we get the ESF and LSF regularized and at any desired resolution for free, avoiding the binning and finite difference steps and heading straight for the FFT routine.[*]

Hann or other windowing can also typically be avoided because the monotonic spline fit results in well behaved LSF curves with slopes close to zero at either end in typical photographic testing conditions, assuming a dozen or more pixels on either side of the edge.  This is demonstrated by how closely such results follow accuracy champ MTF Mapper, which I believe employs a Tukey window.

Therefore, using B-Spline regression to fit projected raw edge data to a monotonic curve, the process reduces to just three easy to monitor steps, only the first of which is not rote (fit the monotonic spline function, calculate the regularized derivative of the spline function, FFT), hopefully providing a satisfying answer to the question posed in the introduction while justifying the minimalist moniker in the title.

 

Appendix: Parameters and Limitations

Of course there is no such thing as a free lunch. The heavy lifting is provided by the initial  monotonic B-Spline fit to projected raw edge data.  To define the B-Spline basis functions their degree and knot spans need to be specified, both of which I came up with empirically for my own photographic purposes: utility knives captured in the raw data by enthusiast-level digital cameras with pixels around 3-6\mu m and decent lenses at typical working f-numbers near the center of the image.

Figure 10. Fuji GFX50s 110mm/2.8 at f/2.8, raw edge captured by Jim Kasson.

Spline degree and knot spans set an upper bound on the slope of the LSF and how quickly it can change.  As long as they are not out in left field for the given application the resulting curves are surprisingly insensitive to their choice; I have been using the same ones from the beginning for edges captured by a variety of digital cameras and have not felt the need to change them (though this would be a fun line of further inquiry).

Figure 11. Nikon D610 200mm/4 at f/5.6, Anti-Aliasing filter horizontally.

In the spline regressions above the knots were evenly spaced every quarter of a pixel (pitch) within about 1 pixel of the edge, widening to two per pixel from then on, since this works well with my raw edge captures.  In fact there are normally only shallow waves after about 3-4 pixels from the edge so the knot distance could easily be doubled or more from then on.
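In Python the knot layout just described might be generated like this (a hypothetical helper; only the spacing values mirror the text):

```python
import numpy as np

def edge_knots(x_min, x_max, edge=0.0, fine=0.25, coarse=0.5,
               near=1.0, degree=4):
    """Knot vector denser near the edge: one knot every `fine` pixels
    within `near` pixels of the edge location, every `coarse` pixels
    beyond, plus the usual degree+1 repeated boundary knots."""
    left = np.arange(edge - near - coarse, x_min, -coarse)[::-1]
    center = np.arange(edge - near, edge + near + fine / 2, fine)
    right = np.arange(edge + near + coarse, x_max, coarse)
    interior = np.concatenate([left, center, right])
    return np.concatenate([[x_min] * (degree + 1), interior,
                           [x_max] * (degree + 1)])
```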

Some edge angles result in projected x_i that bunch up, leaving large empty areas where the monotonic regressed ESF is forced to slow down, creating an unsightly LSF.  The issue does not present itself with knots every quarter of a pixel or so with my setups.

Figure 12. Sony A7riii 85mm/1.8 at f/5.6.

I’ve mentioned pixels so far but we should also be speaking in terms of edge rise percentage and the minimum physically possible spatial period, \lambda N for lenses with a circular aperture.

It makes sense that for photographic applications in typical working conditions optimal knot spacing should be related to pixel aperture and \lambda N, with \lambda the mean wavelength of light and N the effective f-number of the lens.  Even the PSF of a diffraction limited lens, though, is convolved with a squarish pixel aperture and possibly an antialiasing filter during capture, therefore requiring a certain number of knots, say 8, for the 10-90% edge transition.

Figure 13.  Nikon D90 50mm/1.8 at f/5.6, with anti-aliasing filter

I have not given it much thought but since the highest spatial frequency possible with typical photographic lenses is 1/\lambda N, there should be a way to measure and limit the frequency response of the spline bases relative to different knot spacings, ensuring that the physical limit is not exceeded by much.  Only ‘by much’ because one would be surprised at how inaccurate some camera-reported f-numbers can be in practice, especially higher ones.  Any thoughts from math photogs?

Figure 14. Simulated edge at 24mm f/4, with antialiasing filter at 2/3 c/p and pixel aperture at 1.143 c/p.

Clearly the degree and knot spacing interact: if we spread the knots out we may need more degrees of freedom to follow the underlying curve.  Conversely if the knots are close together, too many degrees of freedom could result in an over-fit curve following the noise.

A degree of zero means that the function within each knot span is a constant, hence the regression creates steps;  a degree of one connects each knot to the next in a straight line;  two has quadratic degrees of freedom;  three is cubic, and that is where the function no longer looks like it is straining to follow the physical line spread profile.  Since the LSF is the derivative of the ESF, I chose a degree of 4, quartic, for the ESF spline bases, so that the LSF would be made up of cubic splines.  One can easily go much higher (30th works) but it’s surprising how little difference it makes in the end.

Figure 15.  Fuji GFX50s 110mm/2.8 at f/4, raw capture by Jim Kasson.

The monotonicity criterion can also affect the ESF, though mildly.  As mentioned I chose to ensure that B-Spline regression results in increasing coefficients, which de Leeuw suggests is equivalent to fitting I-Splines.  One could also use increasing values in the resulting curve as a criterion (section 5.4 of his paper) but this approach lends itself to slightly more overfitting while being more expensive in terms of CPU cycles.

With the setup described above I sometimes see some overfitting around the upper shoulder, as clearly seen in the plots above.  Here for instance is a Z7 at 28mm and f/11, which should otherwise be relatively smooth:

Figure 16.  Nikon Z7 24-70mm/4 at f/11

f/11 with 0.535 \mu m mean light should result in an MTF null around 3/4 c/p with 4.35 \mu m pixels, which is approximately what we see.  The overfitting in the LSF above about 1px is partly responsible for the energy not staying at zero after theoretical extinction.
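The arithmetic behind that null, as a trivially checkable helper (Python):

```python
def diffraction_cutoff_cpp(wavelength_um, f_number, pitch_um):
    """Diffraction extinction frequency of an ideal circular aperture,
    1/(lambda*N), converted to cycles/pixel via the pixel pitch."""
    return pitch_um / (wavelength_um * f_number)

# Z7 example from the text: f/11, 0.535 um mean light, 4.35 um pixels
# gives about 0.74 cycles/pixel, i.e. roughly 3/4 c/p.
```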

So the minimalist monotonic approach works well with raw edges, though there is a lot that could be done to improve it.  In the end, unless one needs to understand every step along the way or roll one’s own, open source MTF Mapper is the better rounded and more robust tool to use – the best and most professional out there.   Here we simply provided some hopefully easy to understand and intuitive validation of its results, front-loading all uncertainty to the estimation of the ESF.

Notes and References

1. Frans van den Bergh’s excellent open source MTF Mapper determines the ESF, LSF, SFR/MTF of your camera and lens by capturing slanted edges at home.  It is robust and comes with an easy to use graphical user interface.  I would like to thank Frans for his help and guidance in this journey of discovery.
2. Slanted-Edge MTF for Digital Camera and Scanner Analysis, Peter D. Burns, Eastman Kodak Company, 2000.
3. sfrmat5 is used in Version 4 of ISO standard 12233 for edge-based camera resolution evaluation, which was published on Feb 17, 2023. It can be downloaded from here.
4. Exceedingly Simple Monotone Regression, Jan de Leeuw, 2017.
5. lsqisotonic is a fast implementation of the Pool Adjacent Violators algorithm in Matlab as modified by Yun Wang, 2017. I believe this version is in the public domain and does not require a toolbox.
6.  Computing and Fitting Monotone Splines, Jan de Leeuw, 2017.
7.  Derivative of a B-Spline Curve, Dr. Ching-Kuang Shene, 2014.
8. The Matlab function used to perform the monotonic PAVA and Spline regressions in this page with some examples can be downloaded by clicking here.
