An Introduction to Pupil Aberrations

As discussed in the previous article, so far we have assumed ideal optics, with spherical wavefronts propagating into and out of the lens’ Entrance and Exit pupils respectively.  That would only be true if there were no aberrations. In that case the photon distribution within the pupils would be uniform and such an optical system would be said to be diffraction limited.

Figure 1.   Optics as a black box, fully described for our purposes by its terminal properties at the Entrance and Exit pupils.  A horrible attempt at perspective by your correspondent: the Object, Pupils and Image planes should all be parallel and share the optical axis z.

On the other hand, if lens imperfections (aka aberrations) were present, the photon distribution in the Exit Pupil would be distorted and the wavefront leaving it would no longer be perfectly spherical, with consequences for the intensity distribution of photons reaching the image.

Either pupil can be used to fully describe the light collection and concentration characteristics of a lens.  In imaging we are typically interested in what happens after the lens so we will choose to associate the performance of the optics with the Exit Pupil.

An Aberrated Wavefront

For example a point on axis at the object could produce an aberrated wavefront converging off axis and/or in front/behind the xy image plane.  Here for instance in red is a rough estimate of the aberrated wavefront reverse engineered by the Optical Bench for the state-of-the-art 50mm prime lens at f/1.2 from the last article, an admittedly difficult lift, on-axis:

Figure 2. Zooming into the Exit Pupil area of the 50mm prime shown in Figure 2 of the previous article. Here you can see the Exit Pupil position and its size, as well as the aberrated wavefront in red as calculated by the Optical Bench at photonstophotos.net. All other annotations and black lines are mine.

The direction of ray/photon travel is perpendicular to the curve.  In this case the aberrated wavefront has different curvature than the reference sphere, which is centered on the Gaussian image point a radial distance z_i away.  This suggests that it will concentrate photons ahead of the image plane.  Also, the rate of curvature deviation from reference varies with changing powers of the distance to the optical axis.  This means that the cone will not converge to a single point –  its apex will be spread out axially – and if the iris is stopped down, the point of best ‘focus’ will move closer to its ideal location, an indication of focus shift.  These non-idealities are aberrations with names like defocus, tilt, spherical etc.

Optical Path Differences

In fact all aberrations in well designed lenses can be characterized as optical path differences (OPD) from the reference sphere of the actual wavefront as it leaves the Exit Pupil.  If we took a snapshot of photons at the Exit pupil of an aberrated lens, we would notice that some photons would have fallen behind or be ahead compared to the ideal wavefront because of imperfections in the optics.

So at the exit of the black box some photons would not be properly aligned like the well behaved ones shown in blue in Figure 1 of the previous article: they would be out of phase with their peers and therefore would not produce a spherical wavefront out of the Exit Pupil.  These out of phase photons are said to have accumulated a path length or phase error with respect to the ideal.  In photography the error is usually expressed in micrometers (\mu m) or equivalently in waves, that is as a multiple of the wavelength \lambda of the light, in the visible range in air.
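The relationship between the two units can be sketched in a few lines of Python. This is only an illustration: the function names are mine, and the 0.55 \mu m default wavelength (green light, roughly mid-visible) is an assumption, not a value taken from the measurements discussed in these pages.

```python
import math

# Assumed mid-visible wavelength in micrometers (hypothetical default)
DEFAULT_WAVELENGTH_UM = 0.55

def opd_to_waves(opd_um, wavelength_um=DEFAULT_WAVELENGTH_UM):
    """Convert a path length error in micrometers to waves."""
    return opd_um / wavelength_um

def opd_to_phase(opd_um, wavelength_um=DEFAULT_WAVELENGTH_UM):
    """Convert a path length error in micrometers to a phase error in radians."""
    return 2.0 * math.pi * opd_um / wavelength_um

# A 0.055 um error is about a tenth of a wave at the assumed wavelength
print(opd_to_waves(0.055), opd_to_phase(0.055))
```

One wave of path length error corresponds to a phase error of 2\pi radians, which is why the same quantity can be called either a path length or a phase error.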

Figure 3.  Schematic representation of the converging wavefront from the Exit Pupil to the image plane on axis, ideally in focus (black reference sphere) but in practice not (red aberrated defocus wavefront).  Path length error function ΔW is calculated everywhere throughout the pupil and can be used to characterize lens aberrations.

Figure 3 shows another simplified 2D projection of a 3D aberrated wavefront in red above.  The relative error is expressed as a function of pupil plane coordinates (u,v).  It is the distance between the reference sphere and the aberrated wavefront, measured along the ray that joins coordinate (u,v) in the Exit Pupil plane to the expected geometric point on the image plane.

Doing this everywhere within the pupil results in a path length/ phase error function denoted \Delta W(u,v) for the Gaussian point predicted by geometrical optics on the image plane.  It is also referred to as the wavefront aberration function.  When positive, the wavefront leads the reference sphere; when it is negative the opposite.[1]  By convention, the function is shifted along z so that it is zero where it meets the optical axis in the pupil plane – since a uniform phase shift throughout the pupil is immaterial to image formation.

Alternatively, and more typically given photographic lenses' rotational symmetry, we can express the pupil’s coordinates in polar notation, with radius \rho=\sqrt{u^2+v^2} measured from the optical axis and angle \phi=\arctan(\frac{v}{u}) around it.  We could then express the aberration function equivalently as \Delta W(\rho,\phi).
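As a minimal sketch of that change of coordinates, assuming NumPy is available (the function name pupil_polar is hypothetical): note that it uses the two-argument arctangent rather than a plain arctan(v/u), so that the angle lands in the correct quadrant and u = 0 does not cause a division by zero.

```python
import numpy as np

def pupil_polar(u, v):
    """Convert rectangular pupil coordinates (u, v) to polar (rho, phi)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    rho = np.hypot(u, v)      # sqrt(u^2 + v^2), distance from the optical axis
    phi = np.arctan2(v, u)    # angle in radians, in (-pi, pi]
    return rho, phi

# Three points on the edge of a unit-radius pupil
rho, phi = pupil_polar([1.0, 0.0, -1.0], [0.0, 1.0, 0.0])
print(rho, phi)
```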

Path Length Error Example: Defocus

For example let’s take the wavefront of just defocus on axis, as shown in red in Figure 3.  There the path length error function \Delta W(u,v) is always positive, increasing with distance from the optical axis.  Defocus simply means that photons converge in front of or behind the image plane.  In other words the wavefront is still a sphere, coincident with the reference sphere at the origin of the pupil plane, but with a radius longer or shorter than that of the reference, z_i.  It is easy to show that under the paraxial approximation the defocus phase error can be expressed in the uv pupil plane as a function of distance from the optical axis \rho:[1]

(1)   \begin{equation*} \Delta W(u,v) = c \cdot (u^2+v^2) = c \cdot \rho^2 \end{equation*}

with c a positive or negative constant indicating the amount of defocus and whether the focus point is ahead of or behind the image plane respectively.

Figure 4. Phase error function ΔW(u,v) for defocus. Phase in the Exit Pupil uv plane changes with distance to the optical axis (rho) squared. The peak-to-valley coefficient c is normalized to one in this garish example, with black indicating zero OPD  to the reference sphere, and red maximum OPD.

We can think of \Delta W as a picture of the pupil plane, with every point in (u,v) representing the phase error by its intensity.  This is shown in Figure 4 as the garish contour plot on the uv plane, transitioning from black (no phase shift) to red (maximum shift).  Or we can think of it as a three dimensional surface in relief as shown.  Incidentally, with \rho normalized to one at the edge of the pupil, the defocus coefficient c in Equation (1) is the maximum value of \Delta W, the peak-to-valley error, denoted W_{020} in Seidel notation for instance.
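To make the 'picture' idea concrete, here is a rough sketch that samples Equation (1) on a square grid covering a circular pupil. It assumes NumPy, a pupil radius normalized to one and an arbitrary grid size; the function name defocus_map and its parameters are mine, for illustration only.

```python
import numpy as np

def defocus_map(c=1.0, n=101):
    """Sample Delta W = c * rho^2 on an n x n grid over a unit-radius pupil."""
    axis = np.linspace(-1.0, 1.0, n)
    u, v = np.meshgrid(axis, axis)
    rho2 = u**2 + v**2
    dw = c * rho2
    dw[rho2 > 1.0] = np.nan   # samples outside the circular pupil carry no light
    return u, v, dw

u, v, dw = defocus_map(c=1.0)

# Peak-to-valley error inside the pupil, ignoring the masked corners
peak_to_valley = np.nanmax(dw) - np.nanmin(dw)
print(peak_to_valley)
```

The resulting array dw can be displayed as an image or as a surface. With the radius normalized this way its peak-to-valley value matches the coefficient c passed in.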

Polynomial Characterization of Aberration Function

The types of aberrations present can be characterized by fitting the typically slowly varying ‘surface’ of the phase error function \Delta W with meaningful polynomial bases, in rectangular (u,v) or polar (\rho,\phi) coordinates.  Aberrations can then be expressed simply as a combination of the relative coefficients and associated with visible physical effects like defocus, tilt, spherical, coma etc.

The more common sets of polynomials in use are based on polar coordinates and take the names of their proponents, Seidel and Zernike.  In fact wavefront aberration theory and metrology based on such polynomials is an important part of lens design and measurement today, a good introduction to which is given in Wyant.[1] Here for instance is a list of the first Zernike polynomials from Wikipedia:

Figure 5. Table of the first Zernike Polynomials from Wikipedia.

Zernike polynomials are designed to be orthogonal to each other and additive in the root mean square sense, meaning that each additional term refines the previous approximation.   The idea is to keep adding terms with weights appropriate for the desired path length error until they match the surface satisfactorily.  The Aberration Function \Delta W(\rho,\phi) is then the weighted sum of the relative terms shown in column Z_j above.
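The orthogonality and rms-additive properties can be checked numerically. The sketch below assumes NumPy and uses only the two rotationally symmetric terms, defocus \sqrt{3}(2\rho^2-1) and primary spherical \sqrt{5}(6\rho^4-6\rho^2+1), in their rms-normalized form. The weights a and b are made-up values for illustration, not measurements, and the helper names are mine.

```python
import numpy as np

def z_defocus(rho):
    """Normalized Zernike defocus term."""
    return np.sqrt(3.0) * (2.0 * rho**2 - 1.0)

def z_spherical(rho):
    """Normalized Zernike primary spherical term."""
    return np.sqrt(5.0) * (6.0 * rho**4 - 6.0 * rho**2 + 1.0)

def disk_mean(values, rho):
    """Area-weighted mean over the unit disk of a radially symmetric function.

    mean = (1/pi) * integral over the disk = 2 * integral_0^1 f(rho) rho d(rho),
    evaluated here with the trapezoidal rule.
    """
    integrand = values * rho
    integral = np.sum((integrand[1:] + integrand[:-1]) * np.diff(rho)) / 2.0
    return 2.0 * integral

rho = np.linspace(0.0, 1.0, 20001)

# Orthogonality: the mean of the product of two different terms is ~0
cross = disk_mean(z_defocus(rho) * z_spherical(rho), rho)

# Normalization: the mean square of each term is ~1
norm_defocus = disk_mean(z_defocus(rho) ** 2, rho)

# Weighted sum with hypothetical coefficients (say, in micrometers)
a, b = 0.03, 0.05
delta_w = a * z_defocus(rho) + b * z_spherical(rho)
rms = np.sqrt(disk_mean(delta_w ** 2, rho))

print(cross, norm_defocus, rms, np.hypot(a, b))
```

Because the cross term vanishes, the rms of the sum is the root sum of squares of the individual weights, which is the sense in which the terms are 'additive'.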

As you can see there are several different notations; I typically use Wyant’s in these pages.

Spherical Aberration Polynomial for Best ‘Focus’

For example, in the center of a current, in-spec photographic lens there could just be some spherical aberration (which produces a \Delta W proportional to \rho^4).  We could therefore simulate its effects by using the polynomial with Zernike index 8.  This assumes that the focus shift caused by SA is counteracted by opposite defocus (which as we have seen goes with \rho^2), producing an rms circle of least confusion, what the camera would see as ‘best’ focus:

(2)   \begin{equation*} Z_8 = \sqrt5 (6\rho^4-6\rho^2 + 1) \end{equation*}

As discussed, the resulting aberrated curve describes phase so constants can be ignored because a constant optical phase shift does not impact the image.  Therefore in this case we could simplify it to \Delta W(\rho) = c\cdot (\rho^4-\rho^2), with c a factor indicating the strength of the effect, and use the Fourier Optics Equations in the next article to understand its implications on the image projected by the lens.
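A quick numerical check of that simplification, assuming NumPy and a pupil radius normalized to one. In this sketch the factor 6\sqrt{5} from Equation (2) is absorbed into c for a unit Zernike weight, which is my choice for the illustration rather than anything prescribed above.

```python
import numpy as np

rho = np.linspace(0.0, 1.0, 1001)

# Equation (2) as written in the text
z8 = np.sqrt(5.0) * (6.0 * rho**4 - 6.0 * rho**2 + 1.0)

# Drop the constant term so that the error is zero on the optical axis
z8_no_constant = z8 - z8[0]

# Simplified form, with c absorbing the factor 6 * sqrt(5)
c = 6.0 * np.sqrt(5.0)
simplified = c * (rho**4 - rho**2)

# The two curves should agree to within floating point error
max_abs_diff = np.max(np.abs(z8_no_constant - simplified))

# Location and depth of the minimum of the simplified curve
rho_at_min = rho[np.argmin(simplified)]
min_value = simplified.min()

print(max_abs_diff, rho_at_min, min_value)
```

The simplified curve is zero both on axis and at the edge of the pupil, dipping to about -c/4 near \rho = 1/\sqrt{2}, roughly 0.707 of the pupil radius.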

Figure 6. Zernike coefficients for a Nikkor Z 50mm f/1.8 S lens obtained from a slanted edge about 10m away in the center of the image. Z8 (c in the text) is about 50nm.

Figure 6 shows the aggregate MTF curve of the indicated imaging system, measured off a knife edge captured in the raw data according to the slanted edge method.  Given its other physical characteristics indicated in the Parameter list, it implies a lens with Spherical (and other \rho^4 ) aberrations described by a Z_8 coefficient about equal to 51.1nm at f/4 (corresponding to c in the discussion above).

If the desired aberration curve were more complex than that,  more terms could be added to obtain a better match.  This would indicate that more aberrations than just Spherical were present.

A similar train of thought can be followed with Seidel’s radially symmetric polynomials.

Next, the complex Pupil Function.

Notes and References

1. Basic Wavefront Aberration Theory for Optical Metrology, James C. Wyant and Katherine Creath.
2. Much inspiration for this article came from Introduction to Fourier Optics, 3rd Edition. Joseph W Goodman.
