Chapter 1: Ray and Matrix Optics: A Simple Theory of Light
Published: 2022
Partha Pratim Mondal, "Ray and Matrix Optics: A Simple Theory of Light", Classical and Quantum Optics, Partha Pratim Mondal
Ray or geometric optics is probably the simplest theory of light that is scientifically sound and mathematically rigorous. The theory is largely based on everyday observations and Fermat’s principle, which dictates that light takes minimal time to travel between two points. In a homogeneous medium, the path of minimum time is also the path of minimum distance. Using this as a guiding principle, the laws of reflection and refraction can be arrived at. Moreover, light propagation through a vacuum and various optical elements (e.g., lenses and optical plates), interfaces between different media, and reflection off surfaces (e.g., mirrors) can be explained. This leads to the lens-maker’s formula, one of the most important equations in imaging, which is central to the fabrication of optical elements. To facilitate simple formulations of problems, we use sign conventions extensively along with the paraxial ray approximation. In addition, we introduce the matrix formulation of geometric optics (also known as matrix optics) for analyzing complex optical systems (such as beam-expanders and objective lenses). Since aberrations are an integral part of optical system design, a generalized approach to aberration is also presented. The chapter thus covers the details needed for constructing precision optical systems.
Introduction
Among the existing theories of light, ray optics is probably the simplest theory and quantum optics the most complex. While quantum optics is capable of explaining even the minuscule effects of light, classical optics is better known for explaining everyday observations in the most simplistic and intuitive manner. So, there is a hierarchy of theories. However, note that each theory has advantages over the others and, hence, cannot be replaced. One may ask, “If we understand quantum optics, why are other theories necessary at all?” The answer to this lies in the rest of the chapters, where it will become apparent that each theory has a domain of influence. As an example, consider a scenario where we wish to build an imaging system. First, we need to manipulate and direct light emerging from a source. This requires a set of optical lenses, mirrors, and other components. Since the dimension of the objects through which the light passes is much larger than the wavelength of the light, we can consider the light to be a ray and employ simple ray optics to align/collimate the beam. Now, if we introduce a pinhole or slit that has a dimension close to the wavelength of the light, we need to use wave optics to understand its effects. This means that a sophisticated theory (such as electromagnetic optics) would be unnecessary to understand observations made in our everyday lives. Scalar wave optics, being simpler than electromagnetic optics, is quite capable of explaining the effects of a pinhole or slit. Now, consider incorporating an asymmetric optical element such as a quartz plate into the optical system. As the element is optically active (sensitive to the direction of the electric field of light), the electromagnetic theory of light is required to understand its effect. Finally, if the goal of the system is to evaluate properties related to the discrete nature of light (such as photon arrival time or photon bunching), then we need to employ quantum optics. 
Although there is some overlap in the theories, each has a well-defined domain.
We begin with a simplistic theory of light, i.e., ray/geometric optics, and incorporate increasingly complex theories of light (wave optics, electromagnetic optics, and quantum optics) as the need arises.
Ray/Geometric Optics
Ray optics is touted as the most straightforward theory of light that can explain everyday observations involving light. The theory describes light as a bunch of rays traveling in a medium of varying refractive indices, obeying a set of geometrical rules. Ray optics uses the position and direction of the rays to predict new positions and directions in accordance with geometrical laws. Therefore, this description is useful in everyday observations related to image formation after light passes through or reflects off one or more optical elements (e.g., mirrors, lenses, and plate glass) or as simple as light emerging from a source. The theory is also helpful in predicting the direction of the flow of optical energy.
The theory begins with a set of postulates that govern the path of the light or ray as it passes through materials and optical elements of varying refractive indices. When analyzing rays passing through different optical elements, the paraxial approximation facilitates simple problem formulation and easy interpretation. This approximation considers only paraxial rays (rays that travel at small angles with respect to the optical axis). The set of postulates for ray optics that determines ray propagation through different media is illustrated in Box 1.1.
The postulates describe the paths followed by light rays when they undergo reflection, refraction, and propagation through different media (both homogeneous and heterogeneous). Thus, these postulates explain the propagation of light through simple optical elements (such as glass plates, mirrors, lenses, and prisms) made of different optical materials (glass, polymer, or silica). Optical instruments made of basic optical elements rely heavily on ray optic formulations for alignment, functioning, and precision.
The basic postulates of ray/geometric optics
- Rays. Light travels in the form of a pencil of rays obeying geometric rules. They emanate from a source and cease at the detector/eye.
- Optical path-length. This is the distance traveled by light in a medium or a set of media, weighted by the refractive index. Assuming a single homogeneous medium of refractive index (RI) “n”, the time taken to travel a distance “d” in the medium is $t = d/c = nd/c_0$. Here, $c$ ($= c_0/n$) and $c_0$ are the speed of light in the medium and in vacuum (very nearly its speed in air), respectively. The effective path-length ($l$) is
$$l = c_0 t = nd. \tag{1.1}$$
However, in an inhomogeneous medium where the RI is a function of position, i.e., $n = n(\mathbf{r})$, the optical path-length is the integral
$$l = \int_{d_1}^{d_2} n(\mathbf{r})\, ds, \tag{1.2}$$
where d1 and d2 are, respectively, the beginning and end points of the optical path “s”.
- Fermat’s principle. Optical rays traveling between two arbitrary points (say, d1 and d2) follow a path such that the path-length, or equivalently the travel-time, is an extremum (usually a minimum) with respect to other possible paths from d1 to d2, i.e.,
$$\Delta \int_{d_1}^{d_2} n(\mathbf{r})\, ds = 0, \tag{1.3}$$
where Δ signifies an extremum. In general, the extremum is a minimum, which suggests, “Light rays travel along the path of minimal time or, equivalently, the shortest optical path.”
In a homogeneous medium, Fermat’s principle reduces to the statement that optical rays follow the path of the shortest distance, whereas in an inhomogeneous medium, light travels along the path of minimum time.
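The optical path-length postulate can be illustrated numerically. The following is a minimal sketch, not the author's method: it approximates the path integral of the refractive index with the trapezoidal rule along a straight path. The function name `optical_path_length`, the slab thickness, and the index profiles are illustrative assumptions.

```python
def optical_path_length(n_of_s, s_start, s_end, steps=1000):
    """Approximate l = integral of n(s) ds with the trapezoidal rule.

    n_of_s is a hypothetical callable returning the refractive index
    at arc length s along the (straight) path.
    """
    h = (s_end - s_start) / steps
    total = 0.5 * (n_of_s(s_start) + n_of_s(s_end))
    for k in range(1, steps):
        total += n_of_s(s_start + k * h)
    return total * h

# Homogeneous slab (assumed n = 1.5, thickness 2 mm): reduces to l = n * d.
l_glass = optical_path_length(lambda s: 1.5, 0.0, 2.0)

# Assumed linear index gradient rising from 1.0 to 1.5 across the same 2 mm.
l_graded = optical_path_length(lambda s: 1.0 + 0.25 * s, 0.0, 2.0)

print(l_glass, l_graded)
```

For the homogeneous slab, the integral collapses to the product of index and thickness, which is the homogeneous-medium case of the postulate.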
The Reflection and Refraction of Light
The success of ray optics lies in its ability to explain reflection, refraction, and transmission. Both reflection and refraction are known to occur at the interface of two media with different refractive indices as shown in Fig. 1.1, where a part of the beam gets reflected and the remaining part gets refracted. The laws of reflection and refraction (Snell’s law) are essentially a consequence of Fermat’s principle. This is readily evident in a homogeneous medium. Since the inhomogeneous case requires a more involved calculation, we stick to the simple homogeneous case for now. In a homogeneous medium, the speed of light is the same everywhere. According to Fermat’s principle, the path of minimum time is also the path of minimum distance between two points, i.e.,
$$\Delta \int_{d_1}^{d_2} n'\, ds = n'\, \Delta \int_{d_1}^{d_2} ds = 0,$$
where n′ is a constant in the homogeneous medium. Hence, the path traveled by light is a straight line connecting d1 and d2.
Figure 1.1 shows the reflection and refraction at the interface of two media. For convenience, a plane of incidence is defined and the rays (incident, reflected and refracted) lie in this plane. According to Fermat’s principle, the path traveled within a medium is, therefore, a straight line. Using simple geometry along with Fermat’s principle, it can be shown that light follows the laws of reflection.
“Laws of reflection: The light travels a path such that the angle of incidence is equal to the angle of reflection, θi = θr and both the rays lie in the plane of incidence.”
At the interface, a part of the light ray gets reflected and the rest gets refracted. Using Fermat’s principle, one can further show that the refracted and incident rays hold a relation given by Snell’s law.
“Laws of refraction (Snell’s law): Both the refracted and incident rays lie in the plane of incidence and the relation between them is given by n1 sin θ1 = n2 sin θ2, where θ1 (= θi) and θ2 are the angles of incidence and refraction, and n1 and n2 are the refractive indices of the two media.”
The incident and refracted rays travel such that the total travel time from beginning to end is minimized. For n1 < n2, this means a longer geometric path in the low refractive index medium n1 and a shorter one in the high refractive index medium n2, thereby minimizing the optical path-length.
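The link between Fermat’s principle and Snell’s law can be checked numerically. The sketch below is an illustration under assumed geometry: a source at height `a` above a flat interface, a target at depth `b` below it, and a horizontal separation `L`. It searches for the crossing point that minimizes the optical path-length (a ternary search, chosen here for simplicity because the path-length is a convex function of the crossing point) and then compares the two sides of Snell’s law.

```python
import math

def fermat_crossing(n1, n2, a, b, L, iterations=200):
    """Find the interface crossing point x that minimizes the optical path.

    The source sits at (0, a) in medium n1, the target at (L, -b) in
    medium n2, and the interface is the line y = 0 (assumed geometry).
    """
    def opl(x):
        return n1 * math.hypot(x, a) + n2 * math.hypot(L - x, b)

    lo, hi = 0.0, L
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if opl(m1) < opl(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

n1, n2 = 1.0, 1.5          # assumed indices (air to glass)
a, b, L = 1.0, 1.0, 2.0    # assumed geometry, arbitrary units

x = fermat_crossing(n1, n2, a, b, L)
sin_1 = x / math.hypot(x, a)            # sine of the angle of incidence
sin_2 = (L - x) / math.hypot(L - x, b)  # sine of the angle of refraction

print(n1 * sin_1, n2 * sin_2)  # the two products should agree closely
```

The crossing point shifts toward the target side, so the ray spends more of its geometric length in the lower-index medium, as described above.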
Paraxial Rays and Simple Optical Components
In general, paraxial rays are referred to as rays that make small angles to the optical axis such that the approximation, sin θ ≈ θ, holds good. In ideal conditions, only paraxial rays are considered for determining imaging parameters (image position and magnification) for a given optical component. This is better explained by the simplest example of spherical mirror that approximates a paraboloidal mirror for which all paraxial rays focus on a single point.
Spherical interface
We consider reflection and refraction occurring at a spherical interface as a ray traverses from left to right. Ray optics provides a generalized formulation for light propagation through curved and planar surfaces. This matters because most optical elements involve spherical and planar interfaces, so it is important to understand spherical surfaces (both concave and convex). A sign convention must be fixed before dealing with the actual problem. We adopt the following sign convention, assuming that the ray travels from left to right.
Sign convention
|A| The object distance is positive when the object is to the left of the interface (real object) and negative when it is to the right (virtual object).
|B| The image distance is positive when the image is to the left of the interface (real image) and negative when it is to the right (virtual image).
|C| The radius of curvature is positive for a convex interface (when the center of curvature is to the right of the interface) and negative for a concave interface.
We will see that these rules facilitate generalized ray optics formulation for both concave and convex surfaces.
Concave spherical mirror
Consider a ray reflected by a concave spherical mirror as shown in Fig. 1.2. The object is placed at a distance do from the interface, the image forms at di to the left of the interface, and the radius of curvature is negative (−R) for the concave mirror.
The spherical mirror (radius, R) has a focal length of magnitude |R|/2. Since the object is placed at a distance do on the left side of the spherical mirror, the object distance is positive. Consider two rays originating from the bottom of the object (blue and black rays). The blue ray hits the mirror at an angle θ and reflects back to a point I on the optical axis, obeying the laws of reflection, whereas the black ray travels along the optical axis, hits the mirror, and reflects back along the same line. The intersection of the two rays determines the position of the image on the axis (bottom of the image, I). Similarly, the rays (green and orange rays) emerging from the top of the object (placed at O) hit different parts of the spherical interface. The point where they meet forms the top of the image (off-axis at point I). The same argument works for intermediate points (bottom-to-top of the object at O). So, a complete image is formed at I (both along the axis and off-axis). In this case, the image is formed on the left side of the interface, so the distance di is positive. We further note that the radius of curvature, R, is negative (following the sign convention).
For a simple formulation, it is convenient to work with the rays close to the optical axis (small angles), where the spherical mirror can be approximated as a paraboloidal mirror. This ensures that the rays converge at a single point. Furthermore, this ensures that the first-order approximation of the sines and cosines of the angles (made by object and image rays) can be used, i.e.,
$$\sin\theta \approx \theta, \qquad \cos\theta \approx 1.$$
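We seek a relationship between do and di. This requires the angles subtended by the rays as they traverse from the object to the image plane (see Fig. 1.2). Taking α, β, and γ as the angles that the lines joining the object point, the center of curvature, and the image point to the point of reflection make with the optical axis, and θ as the angle of incidence (equal to the angle of reflection), the following relations are evident:
$$\beta = \alpha + \theta, \qquad \gamma = \beta + \theta \quad\Rightarrow\quad \alpha + \gamma = 2\beta.$$
Assuming the angles (α, β, γ) to be very small, and denoting by y the height of the point of reflection above the optical axis, the following relations are evident from Fig. 1.2(a):
$$\alpha \approx \frac{y}{d_o}, \qquad \beta \approx \frac{y}{-R}, \qquad \gamma \approx \frac{y}{d_i}.$$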
Using the above, the relation becomes
$$\frac{1}{d_o} + \frac{1}{d_i} = -\frac{2}{R}.$$
Note that the rays arriving from infinity (do = ∞) converge to di = −R/2. This is defined as the focal length of the spherical mirror, i.e., f = −R/2. So, the relation between object and image distances gets modified to
$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}. \tag{1.9}$$
It may be realized that for a plane mirror, the radius of the curvature is infinity (R → ∞), and the corresponding relation becomes do = −di. From Eq. (1.9), one can easily note the symmetric relation between the image and the object distances. Hence, image and object planes are also called conjugate planes.
Another aspect of the spherical interface is the magnification/demagnification of the object. From Fig. 1.2(b), it is seen that a ray originating from the top of the object (P(y, z)) gets reflected by the mirror and passes through the top of the image (P′(y′, z′)) at I. Since the angles of incidence and reflection are equal, we have $y/d_o = -y'/d_i$, so that the magnification is
$$m = \frac{y'}{y} = -\frac{d_i}{d_o}.$$
The negative sign merely indicates that the image is inverted and the magnitude, |m|, is the magnification factor.
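As a worked illustration of the mirror relations, the following sketch evaluates the image distance and magnification for a spherical mirror. It assumes the conventions stated in this section (R negative for a concave mirror, f = −R/2, 1/do + 1/di = 1/f, and m = −di/do); the function name `mirror_image` and the numerical values are hypothetical.

```python
import math

def mirror_image(d_o, R):
    """Return (d_i, m) for a spherical mirror in the paraxial limit.

    Assumes R < 0 for a concave mirror, f = -R/2,
    1/d_o + 1/d_i = 1/f, and m = -d_i/d_o.
    """
    f = -R / 2.0
    inv_d_i = 1.0 / f - 1.0 / d_o
    if inv_d_i == 0.0:
        # Object at the focal point: the image recedes to infinity.
        return math.inf, -math.inf
    d_i = 1.0 / inv_d_i
    return d_i, -d_i / d_o

# Concave mirror with assumed R = -20 cm and an object 30 cm away.
d_i, m = mirror_image(30.0, -20.0)

# Plane mirror as the limit of infinite radius: d_i = -d_o and m = +1.
d_plane, m_plane = mirror_image(30.0, math.inf)

print(d_i, m, d_plane, m_plane)
```

In the first case the image forms at about 15 cm with a magnification of about −0.5, i.e., inverted and half the size of the object. The second case recovers the plane-mirror relation do = −di.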
Convex spherical mirror
In this section, we explore a convex spherical mirror. Following the concave case (previous section), one can derive the relationship between the object and the image distances. In addition, we employ the sign convention along with the paraxial ray approximation to ensure that both cases (concave and convex surfaces) are represented by a single equation. Since the object is on the left side and the corresponding virtual image forms on the right side, the object and image distances are do and −di, respectively. The center of curvature is on the right side of the surface, so the radius of curvature is positive (+R). These considerations give rise to the same equation, i.e.,
$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f},$$
where f = R/2 for the convex mirror.
Refraction at a spherical interface
In this section, we discuss refraction at a spherical interface. The interface divides two different media of refractive indices, ni on the left and nr on the right, as shown in Fig. 1.3(a). The rays originate from the object point O and travel toward the spherical interface. Specifically, we choose two rays: the first ray travels along the optical axis and the second ray strikes the interface at an off-axis point Q. The first ray passes through the interface undeviated (see green line), whereas the second ray hits the interface at point Q (at an angle, say, θi with respect to the normal) and undergoes a change in the angle (to, say, θr) following Snell’s law. The corresponding object and image distances are do and di, respectively. Using Snell’s law at the point Q, we get
$$n_i \sin\theta_i = n_r \sin\theta_r.$$
Considering the triangles, CQI and IQO, and applying trigonometric relations, we get
Using the paraxial ray approximation, Snell’s relationship at point Q modifies to
$$n_i \theta_i = n_r \theta_r.$$
Neglecting the curvature of the spherical surface in the light of paraxial ray approximation, the above relation modifies to
Following the sign convention, i.e., do > 0, di < 0 and r < 0, and applying it to the above equation, we get a general form for refraction,
$$\frac{n_i}{d_o} + \frac{n_r}{d_i} = \frac{n_r - n_i}{r}.$$
Using the fact that the focal length is f = r/2 and the change in refractive index is Δn = (nr − ni), we get
Now, we focus our attention on Fig. 1.3(b), which is an equivalent representation of the refraction at point Q for an object, with height y located at a distance of z. The corresponding image is formed at a distance of di. Note that the curvature of the spherical interface does not change the results since the sign convention is in force, thereby making both concave and convex systems equivalent.
Applying Snell’s law at the interface point Q [see Fig. 1.3(b)] and using paraxial ray approximation, we get
and the magnification (m) is given by
Planar interface
It may be realized that a planar interface is a special case of the spherical interface. Hence, the results for the planar interface can be determined in the limit, r → ∞. Applying the limit, the imaging equation and the magnification for the planar interface become
$$d_i = -\frac{n_r}{n_i}\, d_o, \qquad m = 1.$$
The above equation suggests that the image forms at a distance that is scaled by the refractive index ratio ($n_r/n_i$) and has the same height as that of the object.
Combination of surfaces
After introducing the ray propagation (reflection and refraction) through spherical interfaces, we proceed to analyze biconvex and biconcave lenses. In brief, any lens can be simply realized as a combination of two spherical interfaces. This combination leads to three types: (1) convex–convex (biconvex), (2) concave–concave (biconcave), and (3) convex–concave (meniscus). The spherical surface, in combination with the planar surface, gives rise to other types: (4) plane–convex (plano–convex) and (5) plane–concave (plano–concave). Some of these combinations are as shown in Fig. 1.4 for parallel rays.
Thin biconvex lens
In this section, we discuss one of the most widely used surface combinations, i.e., the biconvex lens. The combination of two surfaces comprises: (1) the effect of the spherical surfaces on the ray-path and (2) the effect of the refractive index change as the ray traverses through low–high–low refractive indices (air–glass–air). Figure 1.5 shows the ray propagation through a typical biconvex lens.
In the present discussion, we neglect the thickness of the lens with respect to the object and image distances. This simplifies the calculation considerably. As usual, the relationship between image and object distances is derived in the paraxial ray approximation. From Fig. 1.5, it is clear that the ray traverses through two spherical surfaces (S1 and S2). At the first interface (of radius of curvature, R1), the ray undergoes refraction obeying the following relation:
$$\frac{n_i}{z_1} + \frac{n_l}{z_1'} = \frac{\Delta n}{R_1}, \tag{1.21}$$
where Δn = nl − ni, with ni and nl being the refractive indices of the surrounding immersion medium and the lens material, respectively, and z1 and z1′ being the object and image distances for S1.
At the second interface S2 (of radius R2), the ray undergoes refraction for the second time, giving
$$\frac{n_l}{z_2} + \frac{n_i}{z_2'} = -\frac{\Delta n}{R_2}, \tag{1.22}$$
where z2 and z2′ are the object and image distances for S2.
Since the refractive index is the same on both sides of the biconvex lens, and assuming that the thickness of the lens is negligible, we realize that the object for the second surface is the image formed by the first surface S1. Since this object is on the right of the surface, its distance must be negative, i.e., $z_2 = -z_1'$. With this substitution and adding both equations [Eqs. (1.21) and (1.22)], we get
$$\frac{n_i}{z_1} + \frac{n_i}{z_2'} = \Delta n \left( \frac{1}{R_1} - \frac{1}{R_2} \right). \tag{1.23}$$
We further simplify the above equation by using the fact that the focal length of a thin biconvex lens is defined as the image distance for parallel rays (equivalent to an object placed at infinity, z1 → ∞), i.e.,
$$\frac{1}{f} = \frac{\Delta n}{n_i} \left( \frac{1}{R_1} - \frac{1}{R_2} \right). \tag{1.24}$$
Note that the above equation forms the basis for fabricating a lens of specific focal length f, since this requires the specification of parameters such as the radii of curvature of the surfaces and the refractive indices of the lens material (nl) and the immersion medium (ni). The above equation is popularly known as the “lens-maker’s equation.”
A practical and popular lens-maker’s formula for fabricating an air-immersion biconvex lens (of focal length, fair) is obtained by substituting ni = 1 in Eq. (1.24), i.e.,
$$\frac{1}{f_{\mathrm{air}}} = (n_l - 1) \left( \frac{1}{R_1} - \frac{1}{R_2} \right).$$
So, the imaging equation (1.23) for a thin biconvex lens, with object distance z1 and image distance z2′, becomes
$$\frac{1}{z_1} + \frac{1}{z_2'} = \frac{1}{f}.$$
The magnification of the system is easy to calculate by noting that the equal angle θ (in the small angle approximation) gives rise to the following relation:
$$m = \frac{y'}{y} = -\frac{z_2'}{z_1},$$
where the negative sign indicates an inverted image.
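The lens-maker’s equation and the thin-lens imaging equation lend themselves to a short numerical check. The sketch below assumes the standard signed-radius form 1/f = ((nl − ni)/ni)(1/R1 − 1/R2), with R2 negative for the second surface of a biconvex lens, together with 1/z1 + 1/z2′ = 1/f. The function names and the numerical values are illustrative assumptions.

```python
def lensmaker_focal_length(n_l, n_i, R1, R2):
    """Focal length of a thin lens from the lens-maker's equation.

    Assumes signed radii: R1 > 0 and R2 < 0 for a biconvex lens.
    """
    return 1.0 / (((n_l - n_i) / n_i) * (1.0 / R1 - 1.0 / R2))

def thin_lens_image(z1, f):
    """Return (image distance, magnification) from 1/z1 + 1/z2 = 1/f.

    Not valid for z1 == f, where the image goes to infinity.
    """
    z2 = 1.0 / (1.0 / f - 1.0 / z1)
    return z2, -z2 / z1

# Assumed glass lens in air: n_l = 1.5, radii of +100 mm and -100 mm.
f_air = lensmaker_focal_length(1.5, 1.0, 100.0, -100.0)

# Object placed 150 mm in front of the lens.
z2, m = thin_lens_image(150.0, f_air)

print(f_air, z2, m)
```

With these values the focal length is about 100 mm, the image forms about 300 mm behind the lens, and the magnification is about −2 (inverted and enlarged).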
Biconcave lens
Similar to the biconvex cases, it is easy to work out the imaging equation and the magnification for a biconcave lens. We follow the sign convention to ensure that the relations do not change for a biconcave lens. The ray traversing through a biconcave lens is shown in Fig. 1.6. It may be noted that we have introduced a third ray (green color) passing straight through the center of the lens without any bend because the mid-part of the lens acts as a parallel plate. In short, thin lens approximation ensures that the ray displaces negligibly. Another way of looking at this is that the object distances are much larger than the object height so that the rays always appear parallel at the center of the lens. It may, however, be noted that the image for a biconcave lens is virtual.
The reader may take this as an exercise and derive the relationship between the object and the image distances and the corresponding magnification. For convenience, the sign convention for lenses is summarized below:
Sign convention for lenses
Note that the sign conventions for surfaces and lenses are different. However, the sign convention for lenses follows logically from the sign convention for surfaces. Here, we summarize the sign convention for lenses (for ready reference) with respect to light traveling from left to right:
- An object located to the left of the lens has a positive object distance; an object to the right has a negative one.
- An image formed to the right of the lens has a positive image distance; an image to the left has a negative one.
- The focal length is positive for a converging lens and negative for a diverging lens.
- Object and image heights are positive above the optical axis and negative below it.
The above rules for lenses ensure that real images form to the right side of the lens and virtual images to the left side.
The Matrix Method
The analyses carried out in the preceding sections are useful when dealing with single or at best a few surfaces. The techniques become trickier and difficult to analyze for complex systems such as microscopes and telescopes. Complex systems employ several optical elements that need to be considered and corresponding aberrations may need to be incorporated. In addition, a situation may arise when one of the optical elements in a complex instrument malfunctions and this may require rebuilding the entire system. This may not be feasible due to many reasons, including costs and remote locations such as outer-space. Thus, it is appropriate to design a systematic approach that can predict image formation and magnification as the ray makes its way, undergoing several reflections and refractions.
The matrix formulation provides a systematic approach where a ray is described by its position (y) from the optical axis and the angle (θ) it subtends with the optical axis. Both parameters are represented by a vector, $(y, \theta)^T$. In a similar way, the parameters at the image plane are described by an output vector, $(y', \theta')^T$. Under the paraxial ray approximation, the effects of multiple reflections by mirrors and refractions by lenses are represented by an unknown matrix, $M$. To understand the connection, let us consider a ray propagating from left to right through a homogeneous medium over a distance d, with the input and output vectors as shown in Fig. 1.7. In the paraxial ray approximation, one can immediately write
$$y' = y + \theta d, \qquad \theta' = \theta.$$
The above can be written in a matrix formulation as
$$\begin{pmatrix} y' \\ \theta' \end{pmatrix} = \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y \\ \theta \end{pmatrix}.$$
Note that the matrix uniquely determines the effect of the medium on the propagation of light. In general, the same treatment can be extended to incorporate optical elements such as mirrors and lenses. So, the relationship can be symbolically represented by
$$\begin{pmatrix} y' \\ \theta' \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} y \\ \theta \end{pmatrix},$$
where the elements of the matrix need to be determined. Conversely, the matrix is unique for a specific optical element and can also be interpreted as representing the element.
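The action of a ray-matrix on a ray vector can be sketched in a few lines. The example below is a minimal illustration: it represents a 2 × 2 matrix as nested tuples, applies it to a ray (y, θ), and uses the free-space translation matrix with an assumed propagation distance. The helper names `apply_matrix` and `translation` are hypothetical.

```python
def apply_matrix(M, ray):
    """Apply a 2x2 ray-matrix ((A, B), (C, D)) to a ray (y, theta)."""
    (A, B), (C, D) = M
    y, theta = ray
    return (A * y + B * theta, C * y + D * theta)

def translation(d):
    """Ray-matrix for paraxial propagation over a distance d."""
    return ((1.0, d), (0.0, 1.0))

# Assumed input ray: 1 mm above the axis, inclined at 0.01 rad.
ray_in = (1.0, 0.01)

# Propagate through 100 mm of a homogeneous medium.
ray_out = apply_matrix(translation(100.0), ray_in)

print(ray_out)  # height grows by d * theta; the angle is unchanged
```

The height increases by the product of distance and angle while the angle stays fixed, which is the content of the translation matrix.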
Revisiting spherical lenses
Concave mirror
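Consider the reflection at the spherical concave mirror as shown in Fig. 1.8. The rays originate from a point O and form an image I after reflection. Note that −γ = ϕ + β and β = α + ϕ, so that γ = α − 2β with β ≈ y/(−R). The key observations related to the position and angle of the ray before reflection (θ = α) and after reflection (θ′ = γ) are
$$y' = y, \qquad \theta' = \theta - \frac{y}{f},$$
where f = (−R/2) for the concave mirror.
The above relations can be encapsulated in a single matrix relation given by
$$\begin{pmatrix} y' \\ \theta' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix} \begin{pmatrix} y \\ \theta \end{pmatrix},$$
where the matrix elements are $A = 1$, $B = 0$, $C = -1/f$, and $D = 1$.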
Thin lens
Next, we analyze ray propagation through a thin lens. To do so, we return to the earlier section, where we analyzed image formation by a thin lens. A simplified diagram of a ray traversing a thin lens from left to right is shown in Fig. 1.9. Since a thin lens is essentially a combination of two surfaces, we have two matrix equations,
$$\begin{pmatrix} y' \\ \theta' \end{pmatrix} = M_1 \begin{pmatrix} y \\ \theta \end{pmatrix}, \qquad \begin{pmatrix} y'' \\ \theta'' \end{pmatrix} = M_2 \begin{pmatrix} y' \\ \theta' \end{pmatrix}.$$
Note that we have omitted a third matrix for the translation between the two surfaces, which is justified by the thin lens approximation. So, the substitution of the first equation into the second produces
$$\begin{pmatrix} y'' \\ \theta'' \end{pmatrix} = M_2 M_1 \begin{pmatrix} y \\ \theta \end{pmatrix}.$$
The ray travels from left to right making angles θ and θ′ before and after the first surface S1, whereas before and after the second surface, it makes angles θ′ and θ′′.
So, we have a relation connecting the vectors $(y, \theta)^T$ and $(y', \theta')^T$ at the surface S1. A similar relation exists at the surface S2, i.e., between $(y', \theta')^T$ and $(y'', \theta'')^T$.
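From Fig. 1.9 (top left), we get the following relations for a ray that travels from left to right. With β denoting the angle that the surface normal at the point of incidence makes with the optical axis, Snell’s law in the small angle approximation gives
$$n(\theta + \beta) = n'(\theta' + \beta).$$
Noting that β = y/R (with R = R1 for the first surface) and using Δn = n′ − n, we have
$$\theta' = \frac{n}{n'}\,\theta - \frac{\Delta n}{n' R_1}\, y.$$
We have y′ = y after the first surface. All these can be encapsulated in a matrix format as
$$\begin{pmatrix} y' \\ \theta' \end{pmatrix} = M_1 \begin{pmatrix} y \\ \theta \end{pmatrix}, \qquad M_1 = \begin{pmatrix} 1 & 0 \\ -\dfrac{\Delta n}{n' R_1} & \dfrac{n}{n'} \end{pmatrix}.$$
For the second (concave) interface, the relation can be obtained by interchanging n and n′ and noting that the radius of curvature of the second surface is −R2. This gives
$$M_2 = \begin{pmatrix} 1 & 0 \\ -\dfrac{\Delta n}{n R_2} & \dfrac{n'}{n} \end{pmatrix}.$$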
Since we are working with a thin biconvex lens, we neglect the thickness of the lens (with respect to the object and image distances). So, the total matrix for the thin biconvex lens is obtained by multiplying M1 and M2 in reverse order,
$$M = M_2 M_1 = \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix}, \tag{1.40}$$
where $\dfrac{1}{f} = \dfrac{\Delta n}{n} \left( \dfrac{1}{R_1} + \dfrac{1}{R_2} \right)$, with R1 and R2 taken as magnitudes. Equation (1.40) reproduces the imaging equation for a biconvex lens and incorporates the lens-maker’s formula. The table below summarizes the ray-matrix for frequently used optical components.
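Table summarizing the ray-matrix for common optical components
Ray propagation in free space (distance d): $\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}$
Reflection at a curved interface (radius of curvature, R): $\begin{pmatrix} 1 & 0 \\ 2/R & 1 \end{pmatrix}$
Thin lens (negligible thickness): $\begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix}$
Thick lens (thickness d): the product $M_2\, T\, M_1$ of the two surface refraction matrices and the translation matrix $T$ over the thickness d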
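The thin-lens matrix can be verified by multiplying two surface refraction matrices numerically. The sketch below assumes the paraxial refraction matrix with lower-left element −(n′ − n)/(n′R) and lower-right element n/n′, and it uses signed radii (positive when the center of curvature lies to the right), so the second surface of a biconvex lens enters with a negative radius. The helper names and lens parameters are illustrative assumptions.

```python
def matmul(P, Q):
    """Product P @ Q of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = P
    (e, f), (g, h) = Q
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def refraction(n, n_prime, R):
    """Paraxial refraction at a spherical surface of signed radius R,
    going from index n into index n_prime."""
    return ((1.0, 0.0), (-(n_prime - n) / (n_prime * R), n / n_prime))

# Assumed biconvex glass lens in air with |R1| = |R2| = 100 mm.
M1 = refraction(1.0, 1.5, 100.0)    # air to glass, center to the right
M2 = refraction(1.5, 1.0, -100.0)   # glass to air, center to the left

# Thin lens: no translation between the surfaces; M2 acts after M1.
M = matmul(M2, M1)
focal_length = -1.0 / M[1][0]

print(M, focal_length)
```

The product has the thin-lens form with unit diagonal, and the focal length read off from the lower-left element (about 100 mm here) agrees with the lens-maker’s formula for the same parameters.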
Combination of optical elements
Matrix formulation provides an easy and intuitive way for analyzing complex systems, comprising a combination of optical elements. The advantages of this approach are many, including compact representation and element-by-element analysis. Here, we will analyze some simple systems that are often used in optical instrumentation/imaging systems.
Beam-expander system
A beam-expander is an integral component for most of the optical systems used in microscopy and astronomy. We study and analyze the system through matrix formulation. Figure 1.10 shows a typical beam-expander that consists of two biconvex lenses of focal lengths, f1 and f2. The lenses are placed at a distance of f1 + f2 and the rays travel from left to right.
The system can be decomposed into three main parts: the first lens of focal length, f1, the intermediate space of distance, D, and the second lens of focal length, f2. Each of these parts can be independently represented by its own ray-matrix: M1, T, and M2. The composite system has the matrix, M, given by
$$M = M_2\, T\, M_1.$$
From the previous sections, the ray-matrix for lenses and the gap between them can be readily obtained. The matrices M1 and M2 are given by
$$M_1 = \begin{pmatrix} 1 & 0 \\ -1/f_1 & 1 \end{pmatrix}, \qquad M_2 = \begin{pmatrix} 1 & 0 \\ -1/f_2 & 1 \end{pmatrix},$$
and the translation matrix T is given by
$$T = \begin{pmatrix} 1 & D \\ 0 & 1 \end{pmatrix}.$$
So, the total system matrix becomes
$$M = \begin{pmatrix} 1 - \dfrac{D}{f_1} & D \\ -\dfrac{1}{f_1} - \dfrac{1}{f_2} + \dfrac{D}{f_1 f_2} & 1 - \dfrac{D}{f_2} \end{pmatrix}.$$
The input ray-vector and the output ray-vector are thus related by
$$\begin{pmatrix} y' \\ \theta' \end{pmatrix} = M \begin{pmatrix} y \\ \theta \end{pmatrix}.$$
This is a very general expression for a two-lens system, consisting of two biconvex lenses separated by an arbitrary distance D. Note that the new position y′ is related to the input position y by the following relation:
$$y' = \left( 1 - \frac{D}{f_1} \right) y + D\,\theta.$$
In our case, the light rays are parallel to the optical axis at the input, which means that the input angle is zero, i.e., θ = 0. This gives
$$y' = \left( 1 - \frac{D}{f_1} \right) y.$$
In practice, the lenses are separated by a distance D = f1 + f2 so that the configuration acts as a beam-expander for a parallel input beam. The substitution gives
$$y' = -\frac{f_2}{f_1}\, y.$$
So, the magnification of the system is given by
$$m = \frac{y'}{y} = -\frac{f_2}{f_1}.$$
This is a very useful relation and immediately tells us that the beam gets expanded by the factor $f_2/f_1$ (see Fig. 1.10). Beam-expanders are often used to broaden laser beams and are essential components in a complex optical system.
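The beam-expander analysis can be reproduced by composing the three ray-matrices numerically. The following sketch assumes two thin lenses with hypothetical focal lengths of 50 mm and 200 mm separated by the sum of the focal lengths, and traces a ray that enters parallel to the optical axis. The helper names are illustrative.

```python
def matmul(P, Q):
    """Product P @ Q of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = P
    (e, f), (g, h) = Q
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def thin_lens(f):
    """Ray-matrix of a thin lens of focal length f."""
    return ((1.0, 0.0), (-1.0 / f, 1.0))

def translation(d):
    """Ray-matrix for free-space propagation over a distance d."""
    return ((1.0, d), (0.0, 1.0))

f1, f2 = 50.0, 200.0     # assumed focal lengths in mm
D = f1 + f2              # lens separation for an afocal system

# The first element met by the ray is the rightmost factor.
M = matmul(thin_lens(f2), matmul(translation(D), thin_lens(f1)))

# Ray entering 1 mm above the axis, parallel to it (theta = 0).
y_in, theta_in = 1.0, 0.0
y_out = M[0][0] * y_in + M[0][1] * theta_in
theta_out = M[1][0] * y_in + M[1][1] * theta_in

print(y_out, theta_out)
```

The output ray is again parallel to the axis (the lower-left matrix element vanishes), and its height is scaled by −f2/f1 = −4, i.e., the beam is four times wider and inverted.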
Beam-shrinker system
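Although a beam-expander can be used as a beam-shrinker, an alternative approach is used in practice. This consists of a convex–concave combination. Figure 1.11 shows a typical beam-shrinker used in the industry. The rays originate from the left and pass through the biconvex and plano–concave lens combination.
The ray-matrix for the first biconvex lens (of focal length, f1) is given by
$$M_1 = \begin{pmatrix} 1 & 0 \\ -1/f_1 & 1 \end{pmatrix}.$$
The ray-matrix for the second plano–concave lens (of focal length, f2, which is negative for a diverging lens under the sign convention for lenses) is
$$M_2 = \begin{pmatrix} 1 & 0 \\ -1/f_2 & 1 \end{pmatrix}.$$
The focal length in the above expression can be obtained by letting R1 → ∞ (a planar first surface) in the biconvex lens formula.
Since the distance the light travels between the lenses is D = (f1 + f2), the corresponding translational ray-matrix is given by
$$T = \begin{pmatrix} 1 & f_1 + f_2 \\ 0 & 1 \end{pmatrix}.$$
Following the same steps as in the previous case, the ray-matrix for the combined system is
$$M = M_2\, T\, M_1 = \begin{pmatrix} -\dfrac{f_2}{f_1} & f_1 + f_2 \\ 0 & -\dfrac{f_1}{f_2} \end{pmatrix}.$$
The magnification of the system is given by
$$m = \frac{y'}{y} = -\frac{f_2}{f_1}.$$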
Since f2 < f1, this system results in demagnification and, hence, the resultant beam waist-size gets reduced by a factor f2/f1.
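A quick numerical check of the convex–concave arrangement is sketched below. It uses the closed-form matrix of two thin lenses separated by D, which follows from multiplying the two lens matrices with a translation in between, and it assumes hypothetical focal lengths of +200 mm for the converging lens and −50 mm for the diverging lens (negative under the sign convention for lenses).

```python
def two_lens_matrix(f1, f2, D):
    """Closed-form ray-matrix of two thin lenses separated by D."""
    A = 1.0 - D / f1
    B = D
    C = -1.0 / f1 - 1.0 / f2 + D / (f1 * f2)
    D_elem = 1.0 - D / f2
    return ((A, B), (C, D_elem))

f1, f2 = 200.0, -50.0      # assumed focal lengths in mm
separation = f1 + f2       # 150 mm, shorter than f1 alone

M = two_lens_matrix(f1, f2, separation)
magnification = M[0][0]    # y'/y for an input ray with theta = 0

print(M, magnification)
```

The lower-left element vanishes, so a parallel input beam leaves parallel, and the height is scaled by −f2/f1 = 0.25. The beam is reduced to a quarter of its width without inversion, and the lens separation is shorter than in the two-convex-lens arrangement.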
The objective/eye-piece
Although general lenses can produce magnified images, the observation of very tiny objects requires a large magnification (in the range of 10–200×). In terms of matrix formulation, this means that the exit angle (after objective) is very large (aperture angle in the range of ). This is often achieved by a combination of concave, convex, and half-ball lenses as shown in Fig. 1.12.
Figure 1.12 shows a ray diagram of light focused by the Amici objective. The first and second biconvex–plano–concave lens combinations are required to gradually bend the rays in a stepwise manner. Finally, a half-ball is employed to sharply bend the rays in order to achieve a large aperture angle. The ray-matrix for the biconvex(BC)-plano–concave(PC) combination is given by
where for the plano–convex lens because the first surface is convex.
The ray-matrix for the half-ball (MHB) can be simply obtained by assuming that the first surface is convex with a positive radius of curvature (+R1); the second surface has an infinite radius of curvature (R2 → ∞), i.e.,
where .
From Fig. 1.12, it can be seen that the objective has two such BC-PC optical elements back-to-back followed by one half-ball with the spacing represented by T1 and T2 matrices. So, the total matrix for the Amici objective is given by
where t1 and t2 are the intermediate distances represented by the ray matrices, T1 and T2, respectively.
Aberration and the Master Equation
Aberration in optics refers to a substantial departure of the actual behavior of an optical system from the ideal case. Ideally, we accept the paraxial ray approximation and expect our optical elements (lenses, mirrors, etc.) to behave ideally. In practice, however, real optical elements depart from the ideal case and one often needs to quantify the departure.
A simple way to understand this is through wavefront analysis at the spherical interface as shown in Fig. 1.13. Two wavefronts are shown: wavefront W1 resembles an ideal case (paraxial ray approximation) and W2 refers to an actual case. The wavefronts (after focusing) create an image at I and I′. This results in a shift along both transverse (XY) and longitudinal (Z) directions, which we can simply refer to as transverse and longitudinal aberration. From Fig. 1.13 (see insets), the incremental aberration (path-difference) between W1 and W2 in a refractive medium (n′) is given by
The axial aberration ξz (see Fig. 1.13) is given by
Since θ = y/(z′ + ξz) and θ′ = y/z′, we have
For clarity of illustration, ξz has been drawn comparable in size to z′ (see Fig. 1.13). In reality, however, ξz is much smaller than z′, i.e., z′ ≫ ξz. So, we have
Noting that (see insets in Fig. 1.13), we get
So, the aberration along the lateral y-direction is given by
A similar relation can be obtained for the lateral x-direction as
Equations (1.61)–(1.64) indicate that the imperfections in the optical components lead to distortion of the wavefront that ultimately results in aberrations.
Generalized approach
Next, we consider a generalized approach to understanding aberration in an optical system. Figure 1.14 shows the general case of refraction at a spherical interface. We know that aberration is essentially a deviation of rays from the ideal rays that follow the paraxial ray approximation. We also observed that the aberration can be transverse, longitudinal, or both. Fermat’s principle states that light rays follow the path of least time, which is also the path of least distance in a homogeneous medium. This means that rays travel through all the paths connecting P and I that have the same optical path-length. Two such paths, OAI and OBI, are shown in Fig. 1.14. The rays originating at O form an image at I, and aberration causes a ray to deviate from this path, resulting in a mismatch between the optical path-lengths of PAI and POI. This mismatch is different for different points on the curved surface (say, A′). Thus, the aberration function can be defined as
It may be noted that the other rays (PA′I′) on the spherical surface follow a different path and end up at a different point (I′) after refraction instead of the ideal location (I). So, the other rays do not strictly follow Fermat’s principle (within paraxial ray approximation), ultimately resulting in aberration.
Expanding the aberration function, we get
Applying cosine law in the triangles (PAC and CAI) and realizing that cos (180 − θ) = −cosθ, we get
Expanding cosϕ to fourth order in ϕ (cosϕ ≈ 1 − ϕ²/2 + ϕ⁴/24) and using the small-angle relation ϕ ≈ h/R, we get
Incorporating the cosϕ approximation, we get
Following binomial expansion and approximating (1 + x)^(1/2) ≈ 1 + x/2 − x²/8, we get
Incorporating r1/z and r2/z′ and by rearranging, we get the master equation for the aberration as
Note that the first term vanishes by virtue of the imaging equation for the spherical surface. The next term is the aberration term. In a compact form, we can write
where is a constant.
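The fourth-order behavior of the aberration function can be checked numerically. The sketch below is an illustration under stated assumptions rather than the chapter's derivation: it takes the object distance `z_obj` and image distance `z_img` as positive for a real object and image, takes R as positive for a surface convex toward the object, and compares the exact optical path difference (PAI) − (POI) with the standard third-order closed form for a single refracting surface. The function names and numerical values are hypothetical, and the notation may differ from the master equation in the text.

```python
import math

def image_distance(n1, n2, R, z_obj):
    """Paraxial image distance from n1/z_obj + n2/z_img = (n2 - n1)/R."""
    return n2 / ((n2 - n1) / R - n1 / z_obj)

def exact_aberration(n1, n2, R, z_obj, h):
    """Exact optical path difference (PAI) - (POI) for a ray meeting the
    spherical surface at height h; O is the vertex of the surface."""
    z_img = image_distance(n1, n2, R, z_obj)
    sag = R - math.sqrt(R * R - h * h)   # axial depth of point A
    PA = math.hypot(z_obj + sag, h)
    AI = math.hypot(z_img - sag, h)
    return (n1 * PA + n2 * AI) - (n1 * z_obj + n2 * z_img)

def fourth_order_aberration(n1, n2, R, z_obj, h):
    """Leading (h^4) term of the aberration function for one surface."""
    z_img = image_distance(n1, n2, R, z_obj)
    bracket = ((n1 / z_obj) * (1 / z_obj + 1 / R) ** 2
               + (n2 / z_img) * (1 / z_img - 1 / R) ** 2)
    return -(h ** 4 / 8.0) * bracket

# Hypothetical air-to-glass surface (lengths in mm).
n1, n2, R, z_obj = 1.0, 1.5, 50.0, 200.0
for h in (1.0, 2.0, 4.0):
    print(h,
          exact_aberration(n1, n2, R, z_obj, h),
          fourth_order_aberration(n1, n2, R, z_obj, h))
```

For small h the two columns agree closely, and the quadratic contribution is absent because the paraxial image distance satisfies the imaging equation.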
Next, we use the above master equation [Eq. (1.71)] to understand the different types of aberrations.
Consider rays emerging from the object A and the image forms at A′ as shown in Fig. 1.15. Note that the points O, S1, and S2 lie in a single vertical plane and approximate the wavefront at O. The aberration function at S1 and S2 is given by
To account for all types of aberration, we consider the off-axis aberration function, which is the difference between the aberrations at S1 and S2,
Applying cosine law to the triangle, OS1S2, we have
Noting that cos (180 − γ) = −cosγ and substituting the above expression, we get
Since OS2R and A′R′R are similar triangles, this suggests that u = c′h, where c′ is a proportionality constant. Substituting u, the aberration function gets modified to
So, the type of aberration in any optical system depends on the aperture of the lens (characterized by d), the symmetry around the axis (characterized by γ), and the distance from the axis on the image plane (given by the variable h).
We visually show the effect of each aberration in Fig. 1.16. Spherical aberration is strongly associated with the aperture of the imaging element (lens). This is evident from the first term, C1 ~ d⁴, where d represents the distance of an arbitrary point on the aperture (S1 in Fig. 1.16) from the optical axis. The second kind of aberration is astigmatism, which arises when rays traveling in two mutually perpendicular planes have different focal points. Astigmatism is due to an interplay of all three variables: d, the distance h of a point from the optical axis in the image plane, and the angle γ (symmetry around the axis). The third kind of aberration is coma, given by the third term, C3 ~ h d³ cosγ, as shown in Fig. 1.16(c). As is evident from the term, this is an off-axis aberration that is not symmetric about the optical axis. It is very sensitive to the aperture, d. The fourth kind of aberration is distortion, which is a deviation from rectilinear projection. It can be classified as negative (barrel) distortion, where points move radially inward, and positive (pincushion) distortion, where points move radially outward. This is represented by the term C4 ~ h³ d cosγ, which grows rapidly for image points far from the optical axis. The fifth kind is field curvature, which is symmetric around the optical axis and strongly depends on h and d. This aberration results in a spread of focus around the ideal focus along the optical axis. As a result, the object cannot be brought into focus on a flat imaging sensor; a curved (spherical) imaging sensor would be necessary instead. Image points near the optical axis will be in sharp focus, whereas off-axis rays come into focus before the image sensor and therefore appear blurred. This is shown in Fig. 1.16(e). A Petzval surface can correct this aberration. In fact, there is a Petzval surface associated with every optical element.
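The dependence of each aberration on d, h, and γ can be made concrete with a short sketch. It assumes the ordering of terms used in the paragraph above (C1 to C5) and the standard third-order forms for the two terms whose powers are not written out in the text, namely h² d² cos²γ for astigmatism and h² d² for field curvature. The coefficient values and the function name are hypothetical.

```python
import math

def aberration_terms(C, d, h, gamma):
    """Return the five third-order contributions for aperture radius d,
    image height h, and azimuthal angle gamma (radians).
    C = (C1, C2, C3, C4, C5) are illustrative coefficients."""
    C1, C2, C3, C4, C5 = C
    cg = math.cos(gamma)
    return {
        "spherical": C1 * d ** 4,
        "astigmatism": C2 * h ** 2 * d ** 2 * cg ** 2,
        "coma": C3 * h * d ** 3 * cg,
        "distortion": C4 * h ** 3 * d * cg,
        "field_curvature": C5 * h ** 2 * d ** 2,
    }

# Hypothetical coefficients, all set to one to expose the scaling only.
C = (1.0, 1.0, 1.0, 1.0, 1.0)

on_axis = aberration_terms(C, d=0.5, h=0.0, gamma=0.0)
off_axis = aberration_terms(C, d=0.5, h=0.2, gamma=0.0)
wide_open = aberration_terms(C, d=1.0, h=0.2, gamma=0.0)

print(on_axis)    # only the spherical term survives when h = 0
print(off_axis)
print(wide_open)  # doubling d: spherical x16, coma x8, distortion x2
```

Setting h = 0 isolates spherical aberration, which is why it is the only aberration present for an on-axis object point, while the odd powers of cosγ in coma and distortion make those terms change sign across the aperture.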
This concludes Chapter 1. In the subsequent chapter, we will focus on wave optics and its related wave phenomena. Some of the interesting articles and books related to ray optics are Grella (1892), Mondal and Diaspro (2013), Pedrotti and Pedrotti (1993), and Saleh and Teich (2007).