Chapter 1: Foundation of Ray Optics, Wave Optics, and Electromagnetic Optics
Published: 2021
"Foundation of Ray Optics, Wave Optics, and Electromagnetic Optics", Light Sheet Microscopy and Imaging, Partha Pratim Mondal
Light has fascinated humans for centuries. This has led to the development of many theories, beginning from simple ray optics to the most complex quantum optics. The foundations of light-sheet microscopy lie in the domain of classical optics (i.e., ray optics, wave optics, and electromagnetic optics). Thus, it has become imperative to understand the fundamental principles governing light propagation and its effects and apply them to realizing advanced optical systems. Almost all research disciplines employ light to understand and analyze physical/biological processes. Light has become the backbone of modern optical systems. This introductory chapter explores different theories of light, beginning from simple and moving to complex as the need arises. Subsequent chapters use light as a template to understand light-sheet techniques that have taken the fields of imaging and microscopy by storm.
Introduction
Light can be interpreted as a simple ray or as a complex stream of energy packets. It is advisable to begin with the simplest description of light and to move to more complex ones only as the problem requires. The classical description of light comprises ray optics, wave optics, and electromagnetic optics, whereas quantum optics deals with the quantization of light. For light-sheet-based microscopy and imaging, classical optics serves the purpose best. In this chapter, we develop the classical optics principles relevant to understanding light sheets and do not claim to address classical optics in its entirety.
As the name suggests, a light sheet is a sheet of light generated by optical means. Traditionally, light–matter interactions are studied with point-illumination-based optical systems that predominantly employ spherical optics. Point illumination has the advantage of being intense and precise, thereby facilitating enhanced light–matter interaction. The situation is different when visualizing fluorescently labeled biological specimens, which are delicate and have a low energy threshold. Most fluorescent specimens undergo photobleaching, i.e., they lose the ability to fluoresce after prolonged exposure to light. In addition, point illumination requires sequential scanning for 2D (xy), 3D (xyz), and 4D (xyzt) imaging of large specimens, which severely compromises temporal resolution. Hence, a low-power, nonscanning, single-shot solution that minimizes photobleaching is desired for delicate/sensitive specimens (such as cells and biological tissues).
In this chapter, we begin with the fundamentals of classical optics relevant to light sheet microscopy before exploring the technique in the rest of the book.
Ray Optics
Ray optics is probably the simplest theory of light that explains our day-to-day observations involving light. Here, we begin with the basic postulates of ray optics.
Basic postulates of ray optics
The crux of ray optics lies in Fermat’s principle of light propagation through air and other mediums. We begin with the postulates of ray optics:
I. A medium through which light travels is characterized by its refractive index, n ≥ 1. By definition, the refractive index of a medium is n = c0/c, where c0 and c are the speeds of light in vacuum and in the medium, respectively.
II. The optical path between two points in a homogeneous medium is s = nd, where d is the distance between the points. For an inhomogeneous medium with a position-dependent refractive index $n(\mathbf{r})$, the optical path between two points A and B is given by $s = \int_A^B n(\mathbf{r})\,dl$, where $dl$ is the element of length along the path. Thus, the time taken (d/c = nd/c0) to travel a distance d is directly proportional to the optical path (nd).
III. Fermat’s Principle: This states that rays traveling between two points follow a path such that the time of travel or, equivalently, the optical path between the two points is a minimum relative to neighboring paths, i.e.,
$$\delta \int_A^B n(\mathbf{r})\,dl = 0.$$
Often we are faced with the situation where light needs to travel from one medium to another medium and also through optical components. On its way, light gets reflected, refracted, and transmitted. As a consequence of the above postulates, one can safely assume that light travels in a straight line in a homogeneous medium (Hero’s principle). The postulates of ray optics can be effectively used to completely understand and build optical systems (optical microscopes, telescopes, and other optical systems) that require beam alignment, beam steering, beam expansion, and transmission through optical components.
Reflection, refraction, and Snell’s law
Rays propagating between two mediums with different refractive indexes give rise to reflection, refraction, and transmission. Within the domain of ray optics, these phenomena can be derived directly as a consequence of Fermat’s principle.
We begin with reflection of light at an interface of two mediums as shown in Fig. 1.1. Light emanating from a point A in medium 1 strikes the interface (say at O) and part of it is reflected into the same medium at a point C. The optical path traversed by the ray from A to C is n1(AO + OC). Fermat’s/Hero’s principle states that the distance n1(AO + OC) must be minimum. As C′ is a mirror image of point C, OC = OC′, and the distance n1(AO + OC′) must be minimum. Again, employing Hero’s principle, the minimum distance between A to C′ is a straight line, i.e., it must pass through the point O′. Thus, O coincides with O′, implying θ = θ′. Note that Hero’s principle fixes the point of reflection on the interface for a ray traveling from A to C.
Light impinging on an interface between two mediums is both reflected and refracted. A ray traveling from a point A in medium 1 (of refractive index n1) to a point C in the second medium (of refractive index n2) bends at the interface. This is shown schematically in Fig. 1.2. The refracted ray subtends an angle θr with the normal. Note that the light ray does not travel in a straight line from A to C.
Following Fermat’s principle, we seek to minimize the time or equivalently the total optical path length from A to C,
subject to the condition s2 = s − s1.
To minimize the total optical path length, we must equate the first derivative to 0 (and also check that the second derivative is >0), which yields
$$n_1 \sin\theta_i = n_2 \sin\theta_r,$$
where $\theta_i$ and $\theta_r$ are the angles that the incident and refracted rays make with the normal to the interface.
The above relation is popularly known as Snell’s law. In brief, it states that a light ray traveling from a lower to a higher refractive index bends towards the normal (or away from the interface) and vice versa. Later on, we describe an interesting phenomenon (called total internal reflection) based on this law.
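The minimization described above can also be checked numerically. The following is a minimal sketch, not the derivation used in the text: it assumes a flat interface, a point A at a hypothetical height `h1` above it, a point C at a hypothetical depth `h2` below it, and a lateral separation `s`. The crossing point `x` plays the role of s1, so that s2 = s − x. A golden-section search finds the crossing point of least optical path, and the two sides of Snell's law are then compared.

```python
import math

def optical_path(x, n1, n2, h1, h2, s):
    """Optical path A -> O -> C for a crossing point O at lateral position x.

    h1, h2 and s are illustrative names (heights of A and C from the
    interface and their lateral separation); they are not from the text.
    """
    return n1 * math.hypot(h1, x) + n2 * math.hypot(h2, s - x)

def fermat_crossing(n1, n2, h1, h2, s, tol=1e-12):
    """Golden-section search for the crossing point of least optical path."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, s
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if optical_path(a, n1, n2, h1, h2, s) < optical_path(b, n1, n2, h1, h2, s):
            hi = b  # minimum lies in [lo, b]
        else:
            lo = a  # minimum lies in [a, hi]
    return 0.5 * (lo + hi)

n1, n2 = 1.0, 1.5          # e.g. air to glass
h1, h2, s = 1.0, 1.0, 1.0  # arbitrary geometry, same length unit
x = fermat_crossing(n1, n2, h1, h2, s)

# Sines of the angles that the two ray segments make with the normal.
sin_i = x / math.hypot(h1, x)
sin_r = (s - x) / math.hypot(h2, s - x)
print(n1 * sin_i, n2 * sin_r)  # the two sides of Snell's law
```

Because the optical path is a convex function of the crossing point, a bracketing search is sufficient here; no derivative is needed.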
Reflection at a spherical surface
Concave mirror
Consider a ray reflected by a concave spherical mirror as shown in Fig. 1.3. The object is placed at a distance z1 from the surface, the image forms at z2, and the radius of curvature is negative (−R) for a concave mirror.
For a simple formulation, we consider only paraxial rays, i.e., rays close to the optical axis that make small angles with it. This ensures that the rays converge at a single point. Mathematically, this means that, to first order, the sine, cosine, and tangent of the angles (made by the object and image rays) are approximated as
$$\sin\theta \approx \theta, \qquad \cos\theta \approx 1, \qquad \tan\theta \approx \theta.$$
We seek a relationship between the object plane (z1) and image plane (z2). This requires determination of angles subtended by the rays at object and image points (O and F) as the light traverses from object to image planes. From Fig. 1.3, the following relations are evident:
Assuming the angles (α, β, γ) to be very small (paraxial approximation),
Substituting, we obtain the relation between angles,
It is immediately evident that rays from infinity (z1 = ∞) converge to z2 = −R/2. This is the focus of a spherical interface/mirror, and the focal length is f = −R/2. Thus, the relation between object and image distances is modified to
$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}.$$
Image formation by a spherical mirror: (a) ray propagation from object O to image I; (b) image magnification.
It is important to realize that for a plane mirror, the radius of curvature is infinity (R → ∞), and the corresponding relationship is modified to z2 = −z1. The symmetry between image and object distances is evident from Eq. (1.8). Hence, the image and object points are also called conjugate points.
Another aspect of a spherical interface is the magnification/de-magnification of the object. From Fig. 1.3(b), a ray originating from the top end of the object ends up at the top end of the image after being reflected by the spherical mirror. As the angles are equal, we have
$$m = \frac{y_2}{y_1} = -\frac{z_2}{z_1},$$
where y1 and y2 denote the heights of the object and the image, respectively.
The negative sign merely indicates that the image is inverted and the magnitude of m determines magnification. A similar equivalence can be derived for a convex spherical surface, again leading to Eq. (1.8) with positive radius of curvature. We leave the derivation for the reader to try out.
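As a quick numerical illustration of the mirror relations, the sketch below evaluates the paraxial imaging equation 1/z1 + 1/z2 = 1/f with f = −R/2 and the magnification −z2/z1. The function name and the sample values are illustrative assumptions; the sign convention (R < 0 for a concave mirror) is the one stated in the text.

```python
import math

def mirror_image(z1, R):
    """Paraxial image distance and magnification for a spherical mirror.

    Assumes the convention of the text: R < 0 for a concave mirror, so that
    the focal length f = -R/2 is positive.
    """
    f = -R / 2.0
    inv_z2 = 1.0 / f - 1.0 / z1
    if inv_z2 == 0.0:
        # Object at the focal point: the image forms at infinity.
        return f, math.inf, -math.inf
    z2 = 1.0 / inv_z2
    m = -z2 / z1  # a negative value means an inverted image
    return f, z2, m

# Hypothetical example: concave mirror with |R| = 20 cm, object 30 cm away.
f, z2, m = mirror_image(z1=0.30, R=-0.20)
print(f, z2, m)
```

For these values the object lies beyond the centre of curvature, so the image is real, inverted, and reduced in size.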
Light propagation through spherical and cylindrical lenses
Thin biconvex lens
A biconvex lens is one of the most common optical components that can be represented by a combination of two surfaces (S1 and S2) as shown in Fig. 1.4. Unlike previous cases, here we are faced with two issues: (1) the effect of the spherical surface on the ray path; and (2) the effect of refractive index mismatch as the ray traverses through low–high–low refractive indexes (air–glass–air).
At the first interface (of radius of curvature R1), the ray undergoes refraction obeying the following relation:
where Δn = nl − ni, with ni and nl being the refractive index of the surrounding lens immersion medium (here, air) and the lens material (glass).
At the second interface (of radius of curvature R2), the ray undergoes refraction for the second time, giving
As the refractive index is the same on both sides of the biconvex lens and assuming that the thickness of the lens is negligible for a thin lens, we realize that the object for the second surface S2 is the image formed by the first surface S1. As this object is on the right of the surface, its distance must be taken as negative, i.e., the object distance for S2 is the negative of the image distance for S1. With this substitution and adding both Eq. (1.10) and Eq. (1.11), we obtain
We can further simplify the above equation by noting that the focal length of a combination of surfaces (such as a biconvex lens) is defined as
The readers are encouraged to derive the above expression for a combination of intersecting spherical surfaces.
The above equation is the basis for fabricating a lens of a specific focal length, f, because it fixes the parameters that must be specified (the radii of curvature of the surfaces and the refractive indexes of the glass and the immersion medium). Hence, it is popularly known as the “lens-maker’s equation.”
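To make the role of each parameter concrete, here is a minimal sketch that evaluates a thin-lens focal length from the two radii and the two refractive indexes. It assumes the common textbook form 1/f = (Δn/ni)(1/R1 − 1/R2), with a radius counted as positive when the surface is convex towards the incoming light (so a biconvex lens has R1 > 0 and R2 < 0). This sign convention and the function name are assumptions and may differ from the convention of Fig. 1.4.

```python
import math

def lensmaker_focal_length(R1, R2, n_lens, n_medium=1.0):
    """Thin-lens focal length from 1/f = (dn / n_medium) * (1/R1 - 1/R2).

    Assumed sign convention: a radius is positive when the surface is convex
    towards the incoming light. Use math.inf for a flat surface.
    """
    dn = n_lens - n_medium
    power = (dn / n_medium) * (1.0 / R1 - 1.0 / R2)
    return 1.0 / power

# Hypothetical glass lens (n = 1.5) in air.
f_biconvex = lensmaker_focal_length(R1=0.10, R2=-0.10, n_lens=1.5)
f_plano = lensmaker_focal_length(R1=0.05, R2=math.inf, n_lens=1.5)
print(f_biconvex, f_plano)
```

With a flat second surface the expression reduces to f = R1/(n − 1) for a lens in air, which is the plano-convex case.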
Thus, the imaging equation for a thin biconvex lens can be approximated as
$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}.$$
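The magnification of the system is easy to calculate by noting that the equal angles ϕ (in the small angle approximation) give rise to the following relation:
$$\frac{y_1}{z_1} = -\frac{y_2}{z_2},$$
where y1 and y2 denote the heights of the object and the image, respectively.
The negative sign indicates that the image is inverted, and the magnification is
$$m = \frac{y_2}{y_1} = -\frac{z_2}{z_1}.$$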
Overall, light passing through a biconvex lens can be summarized by the following relation,
Matrix formulation and representation of optical elements
It is clear that the calculation becomes complex even for simple systems such as a biconvex lens and spherical mirrors. This becomes even more tedious and unmanageable for complex systems such as microscopes and telescopes that employ several optical lenses, mirrors, and other special optical elements. Thus, a convenient way of representing the ray-optics formulation is necessary. This gave rise to the matrix formulation which is a convenient and cohesive way to represent complex optical components.
The matrix formulation provides a systematic approach in which a ray is described by its position (y) from the optical axis and the angle (θ) it subtends with the optical axis. Both parameters are represented by a vector, $(y_1, \theta_1)^T$, for the input ray. In a similar way, the parameters at the image position are described by an output vector, $(y_2, \theta_2)^T$. Under the paraxial ray approximation, the optical system (between the input and the output), which may involve multiple reflections by mirrors and refractions by lenses, is represented by an unknown matrix, $M$. To understand the connection, consider a ray traveling from left to right over a distance d through a homogeneous medium, with the input and output vectors as shown in Fig. 1.5. In the paraxial ray approximation, one can immediately write
$$y_2 = y_1 + d\,\theta_1, \qquad \theta_2 = \theta_1.$$
Thus, the above set of equations can be represented in a cohesive matrix form as
$$\begin{pmatrix} y_2 \\ \theta_2 \end{pmatrix} = \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ \theta_1 \end{pmatrix}.$$
The same treatment can be generalized to incorporate optical elements such as mirrors and lenses. Thus, the relationship can be symbolically written as
$$\begin{pmatrix} y_2 \\ \theta_2 \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} y_1 \\ \theta_1 \end{pmatrix},$$
where the elements A, B, C, and D of the matrix need to be determined. Inversely, this matrix also represents the optical element.
As an example, let us revisit the concave mirror and determine the parameters using the matrix formulation. The rays originate from a point O and form an image I after reflection. Note that −γ = β + θ and β = α + θ. The key observations related to the position and angle of rays before reflection (θ2 = γ) and after reflection (θ1 = α) are
where f = (−R/2) for a concave mirror.
The above relations can be encapsulated in a single matrix form,
where the matrix elements are A = 1, B = 0, C = 1/f, and D = 1.
In a similar way, one can proceed to re-derive the expressions for lenses, plane mirrors, and glass plates. The readers are encouraged to work out the ray matrix M for these components.
For a convex lens (f > 0) and concave lens (f < 0), the ray-optics matrix is
For a spherical boundary [convex (R > 0) and concave (R < 0)] with rays traveling from a medium of refractive index n1 = 1 (air) to a medium of refractive index n2 = n (lens medium),
Similarly, M for a plano-convex lens can be derived from the general case of a biconvex lens by noting that the second surface (S2) is a plane for which the radius of curvature is infinity (R2 → ∞). Using Eqs. (1.12) and (1.13), this produces
$$\frac{1}{f} = \frac{\Delta n}{n_i}\,\frac{1}{R_1},$$
where $\Delta n = n_l - n_i$.
Assigning R1 = R (say) and nl = n (say), and assuming the lens immersion medium as air (ni = 1), we obtain
$$\frac{1}{f} = \frac{n - 1}{R}.$$
In summary, we obtain the ray matrix for a plano-convex lens,
In the next section, we proceed to use the matrix formulation for complex optical systems/components that are frequently used in light sheet microscopy.
Beam-expander optics
Often, laser beams require expansion for a variety of applications prevalent in optical microscopy and imaging. A beam expander is a portable optical system that is capable of expanding the beams by up to a factor of 10. Often this is realized with a combination of biconvex lenses. Another way of realizing beam expansion is to employ a combination of concave and convex lenses.
The simplest beam expander requires two biconvex lenses of different focal lengths f1 and f2 as shown in Fig. 1.6. The lenses are placed at a distance of f1 + f2 from each other and the rays travel from left to right. The system can be decomposed into three main parts: the first lens of focal length f1, the intermediate space of distance d, and the second lens of focal length f2. These three parts can be represented by three different ray matrices: M1, T, and M2. Because the ray meets the first lens first, its matrix acts first (rightmost), and the composite system has the matrix, M, given by
$$M = M_2\,T\,M_1.$$
From the previous sections and Appendix A, we know the ray matrix for lenses and the gap between them. The matrixes M1 and M2 are given by
and the translation matrix, T, is given by
$$T = \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}.$$
Thus, the total system matrix becomes
The input ray vector and the output ray vector are, thus, related by the following relation
This is a general expression for a beam-expander system that consists of two biconvex lenses separated by an arbitrary distance, d. Note that the new position y2 is related to the input position element, y1, by the following relation:
In our case, the light rays are parallel to the optical axis at the input which means that the input angle is θ1 = 0. This gives
In practice, however, it is often required to separate the lenses by a distance d = f1 + f2 to use this system as beam expander for the input parallel beam of rays. The substitution gives
Thus, the magnification of the system is given by
This is a very useful relation: it immediately tells us that the beam is expanded by a factor equal to the ratio of the two focal lengths, $f_2/f_1$, and that the output beam is parallel to the optical axis (see Fig. 1.6).
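The two-lens system can be checked by multiplying the three ray matrices numerically. The sketch below is an illustration under an assumed convention: it uses the widely used thin-lens matrix with C = −1/f, which may differ in sign from the convention adopted in this chapter, and all function names and focal lengths are hypothetical.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def thin_lens(f):
    """Thin-lens ray matrix (assumed convention: C = -1/f)."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

def free_space(d):
    """Translation over a distance d in a homogeneous medium."""
    return [[1.0, d], [0.0, 1.0]]

def beam_expander(f1, f2, d=None):
    """System matrix M = M2 * T * M1; d defaults to f1 + f2."""
    if d is None:
        d = f1 + f2
    return matmul(thin_lens(f2), matmul(free_space(d), thin_lens(f1)))

def apply_matrix(M, y, theta):
    """Map an input ray (y, theta) to the output ray."""
    return (M[0][0] * y + M[0][1] * theta,
            M[1][0] * y + M[1][1] * theta)

# Hypothetical lenses: f1 = 50 mm and f2 = 250 mm, input ray parallel to the axis.
M = beam_expander(0.05, 0.25)
y2, theta2 = apply_matrix(M, 1e-3, 0.0)
print(M)
print(y2, theta2)
```

With d = f1 + f2 the element C vanishes, so a parallel input ray leaves parallel to the axis, and its height is scaled by f2/f1. The negative sign of the A element in this convention only indicates that the ray crosses the axis between the lenses.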
Cylindrical lens
Cylindrical lenses are better represented in one dimension because they focus light along one axis only. A spherical lens has rotational symmetry about the optical axis and focuses light rays uniformly over the full azimuthal angle of 2π (i.e., along both lateral axes). A cylindrical lens lacks this symmetry: focusing is limited to one of the lateral axes (say, the y axis), and there is no focusing along the x axis. Thus, the two lateral axes (x and y) cannot be treated equally in a cylindrical system. For spherical optical elements, the formulation/derivation applies equally to both lateral axes, which is why we have not differentiated between the x and y axes so far.
Consider the ray diagram for a cylindrical lens as shown in Fig. 1.7. The rays are only focused along the y axis and do not have any focusing along the x axis. As a result, a cylindrical lens produces a line at the focus. The following equations capture the property of a cylindrical lens,
An equivalent representation in ray-matrix formulation is
where fCL is the focal length of a cylindrical lens.
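The one-dimensional focusing of a cylindrical lens can be illustrated by tracing a small bundle of parallel rays. This is a minimal sketch under stated assumptions: each ray is stored as a hypothetical tuple (x, θx, y, θy), the lens acts on the y components only, and the thin-lens deflection −y/fCL is the common textbook convention rather than the exact form of the equations above.

```python
def cylindrical_lens(ray, f_cl):
    """Thin cylindrical lens that focuses along y and leaves x untouched.

    ray = (x, theta_x, y, theta_y); the deflection -y/f_cl is an assumed
    (standard) thin-lens convention.
    """
    x, theta_x, y, theta_y = ray
    return (x, theta_x, y, theta_y - y / f_cl)

def drift(ray, d):
    """Free-space propagation over a distance d for both lateral axes."""
    x, theta_x, y, theta_y = ray
    return (x + d * theta_x, theta_x, y + d * theta_y, theta_y)

f_cl = 0.1  # hypothetical focal length in metres
offsets = (-1e-3, 0.0, 1e-3)
bundle = [(x, 0.0, y, 0.0) for x in offsets for y in offsets]

# Pass every ray through the lens and propagate it to the focal plane.
at_focus = [drift(cylindrical_lens(ray, f_cl), f_cl) for ray in bundle]
for ray in at_focus:
    print(ray[0], ray[2])  # x keeps its value, y collapses to (nearly) zero
```

At the focal plane all rays share y ≈ 0 while keeping their original x positions, which is the line focus described in the text.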
Wave Optics
Although the domain of ray optics is expansive for building new optical systems, it has its limitations. The most important of them is its inability to explain the wave nature of light such as diffraction and interference. Such effects have been observed and require a different approach to explain light. The traditional way of understanding light considers it as a geometric entity. It is noted that ray optics can explain optical phenomena when the wavelength of light is very small compared to the objects through or around which light is passing. On the other hand, observing the wave nature of light requires the dimension of objects to be comparable to the wavelength of light. Thus, only when light interacts with objects (such as pinholes) of the size of its wavelength (in micrometers) is the wave nature of light revealed. As we encounter objects that are macroscopic, the wave nature of light is not apparent in daily life.
In this section, we treat light as a scalar quantity and represent it by a wave function that obeys a wave equation. This approximate representation is formally incorporated as postulates in the scalar theory of light and has far-reaching consequences for understanding light, including interference and diffraction.
Postulates of wave optics
The best way to understand light and its properties that are similar to those of waves is to begin with well-founded postulates described as follows:
Postulate I. Light behaves and propagates in the form of waves (just like water waves or sound waves).
Postulate II. In a medium of refractive index n ≥ 1, light travels with a reduced speed, i.e.,
$$c = \frac{c_0}{n},$$
where $c_0 \approx 3 \times 10^8$ m/s is the speed of light in vacuum. This explains its propagation in a medium and in the optical elements through which it passes.
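Postulate III. Light is described by a scalar wave function $\psi(\mathbf{r}, t)$, which is a function of position $\mathbf{r} = (x, y, z)$ and time t. The wave function describing light satisfies the following wave equation,
$$\nabla^2 \psi - \frac{1}{c^2}\frac{\partial^2 \psi}{\partial t^2} = 0,$$
where $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$ is the Laplacian operator in Cartesian coordinates.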
Corollary 1: The linearity of the wave equation ensures that the superposition principle applies. Thus, if $\psi_1(\mathbf{r}, t)$ and $\psi_2(\mathbf{r}, t)$ are solutions of the wave equation, then $\psi_1(\mathbf{r}, t) + \psi_2(\mathbf{r}, t)$ is also a solution and represents an optical wave.
The wave equation determines the characteristics of an optical wave, which is a function of position ($\mathbf{r}$) and time (t). Accordingly, an optical wave propagates in space and evolves with time. To understand the wave dynamics in space and time, it is necessary to decouple these variables and solve the resultant equation separately in space and time. Subsequently, it should be possible to combine the space and time parts to determine the complete solution representing an optical wave. In the next section, we proceed to determine the solution in free space.
Helmholtz equation and wave representation
We begin with the time-independent wave equation and try to solve it. To do so, we begin with a possible known solution. Traditionally, waves are recognized by a periodic function of space and time. An excellent example would be water waves that travel from one end to another in the form of crests (high point of the wave) and troughs (low point of the wave) indicating periodic motion. Moreover, each point in the wave moves up and down with time, again indicating periodicity in time. Here, we begin with a simple function of this kind and assume it as a possible wave, i.e.,
$$\psi(\mathbf{r}, t) = \psi_0(\mathbf{r}) \cos[\omega t + \varphi(\mathbf{r})],$$
where ψ0 and φ can be recognized as the amplitude and phase of the optical wave, respectively, represented by the above equation. Here ω = 2πν is the angular frequency and ν is the frequency of the wave expressed in cycles per second or Hertz. The time period between two consecutive troughs/crests is T = 1/ν = 2π/ω.
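For convenience, we switch to exponential representation of the above cosine function and rewrite the wave function as
$$\psi(\mathbf{r}, t) = \mathrm{Re}\{\Psi(\mathbf{r}, t)\},$$
where
$$\Psi(\mathbf{r}, t) = \Psi(\mathbf{r})\,e^{i\omega t}, \qquad \Psi(\mathbf{r}) = \psi_0(\mathbf{r})\,e^{i\varphi(\mathbf{r})},$$
is the complex wave function.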
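Thus, we have successfully decoupled the space and time variables. Note that, just like the function $\psi(\mathbf{r}, t)$, the complex wave function $\Psi(\mathbf{r}, t) = \Psi(\mathbf{r})\,e^{i\omega t}$ satisfies the wave equation and the associated boundary conditions. Then, substituting into the wave equation (1.41), we obtain
$$\nabla^2 \Psi(\mathbf{r}, t) - \frac{1}{c^2}\frac{\partial^2 \Psi(\mathbf{r}, t)}{\partial t^2} = 0.$$
Substituting Eq. (1.44) into the above equation and further simplification gives
$$\left(\nabla^2 + \frac{\omega^2}{c^2}\right)\Psi(\mathbf{r}) = 0.$$
Using the well-known relation between ω and the wave number k, namely k = ω/c, we obtain the following form
$$\left(\nabla^2 + k^2\right)\Psi(\mathbf{r}) = 0.$$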
The above equation is popularly known as the Helmholtz equation and describes the properties of an optical wave in space.
The plane wave: Solution of the Helmholtz equation
The Helmholtz equation is time-independent and describes the propagation of light waves in space. Its solution provides a wealth of information regarding the nature of light. We seek a solution to the Helmholtz equation (1.47) in the Cartesian coordinate system. We assume that the variables of the solution are separable and, hence, we write Ψ(x, y, z) = Ψ1(x) × Ψ2(y) × Ψ3(z). Substituting in the Helmholtz equation produces three similar equations for the three separable variables, given by
$$\frac{d^2 \Psi_j}{du^2} + k_u^2\,\Psi_j = 0,$$
where u = x, y, z and j = 1, 2, 3 represent the x, y, z components, respectively. The components of the wavevector are related as $k_x^2 + k_y^2 + k_z^2 = k^2$.
The solution to the above second-order differential equation is simple and can be written directly as
$$\Psi_j(u) = A_j\,e^{\mp i k_u u}.$$
In general, one can generalize the above solution by assuming waves propagating in the $+\mathbf{k}$ and $-\mathbf{k}$ directions, for which the solution is
$$\Psi(\mathbf{r}) = A\,e^{-i\mathbf{k}\cdot\mathbf{r}} + B\,e^{+i\mathbf{k}\cdot\mathbf{r}},$$
where $\mathbf{k}$ and $\mathbf{r}$ are the wavevector and position vector, respectively.
Consider waves propagating in the positive direction, i.e.,
$$\Psi(\mathbf{r}) = A\,e^{-i\mathbf{k}\cdot\mathbf{r}}.$$
Neglecting the constant phase term (arg(A)), the resultant phase of the wave is $-\mathbf{k}\cdot\mathbf{r}$. The exponential wave function is periodic in nature with a periodicity of 2π, so the wavefronts (the surfaces of constant phase) of the resultant wave must obey
$$\mathbf{k}\cdot\mathbf{r} = 2\pi q,$$
where q is an integer.
The above equation suggests that, for q = 0, the plane represented by $\mathbf{k}\cdot\mathbf{r} = 0$ is perpendicular to the wavevector $\mathbf{k}$. Other values, say q = 1, lead to the condition $\mathbf{k}\cdot\mathbf{r} = 2\pi$, which represents a plane shifted by 2π/k = λ that is also perpendicular to the wavevector. The same argument applies for other values of q. Overall, the above expression represents planes that are perpendicular to the wavevector. This is a very important conclusion and forms the very basis of ideal plane waves.
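The plane-wave solution can be verified numerically. The following minimal sketch assumes the form A exp(−i k·r) with an arbitrarily chosen propagation direction and wavelength (both hypothetical values). It evaluates the Laplacian by central finite differences and checks that (∇² + k²)Ψ is close to zero, and that the wave repeats after a shift of one wavelength along the wavevector.

```python
import cmath
import math

def plane_wave(r, k_vec, amplitude=1.0):
    """Complex amplitude A * exp(-i k.r) at position r = (x, y, z)."""
    phase = sum(k_i * r_i for k_i, r_i in zip(k_vec, r))
    return amplitude * cmath.exp(-1j * phase)

def laplacian(func, r, h):
    """Central finite-difference Laplacian of func at the point r."""
    centre = func(r)
    total = 0.0
    for axis in range(3):
        fwd = list(r)
        bwd = list(r)
        fwd[axis] += h
        bwd[axis] -= h
        total += (func(fwd) + func(bwd) - 2.0 * centre) / h ** 2
    return total

wavelength = 0.5                      # hypothetical value, in micrometres
k = 2.0 * math.pi / wavelength
direction = (1 / 3, 2 / 3, 2 / 3)     # unit vector along the wavevector
k_vec = tuple(k * c for c in direction)

def psi(r):
    return plane_wave(r, k_vec)

r0 = (0.3, -0.2, 0.7)
residual = laplacian(psi, r0, h=wavelength / 500) + k ** 2 * psi(r0)

# Moving by one wavelength along the wavevector lands on an equivalent wavefront.
shifted = tuple(x + wavelength * c for x, c in zip(r0, direction))
print(abs(residual) / k ** 2, abs(psi(shifted) - psi(r0)))
```

The residual is not exactly zero because the finite-difference Laplacian carries a truncation error of order (kh)²; it shrinks as the step h is reduced.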
An important property of an optical wave is its periodicity in space and time. This is better illustrated if we assume light as a wave which is propagating in the z direction with time t [see Eqs. (1.43) and (1.44)], i.e.,
$$\Psi(z, t) = A\,e^{i\omega(t - z/c)} = A\,e^{i 2\pi\nu(t - z/c)}.$$
The above equation states that the optical wave [represented by Eq. (1.53)] is periodic in time with a period 1/ν and periodic in space with a period c/ν = λ.
In addition, the phase of the wave function varies with a function of (t − z/c), where c is called the phase velocity of the wave. Collected together, the description leads us to a specific representation of a plane wave called a paraxial wave as discussed in the next section.
Paraxial Helmholtz equation
Paraxial waves play a central role in wave optics because they enable simple problem formulation and solution. These waves are a solution to the paraxial Helmholtz equation that can be derived from the generalized Helmholtz equation.
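Consider a plane wave propagating along the z axis for which the wave function reduces to
$$\Psi(\mathbf{r}) = A(\mathbf{r})\,e^{-ikz}.$$
Substitution of the above solution in the Helmholtz equation gives
$$\nabla_T^2\left(A\,e^{-ikz}\right) + \frac{\partial^2}{\partial z^2}\left(A\,e^{-ikz}\right) + k^2 A\,e^{-ikz} = 0,$$
where $\nabla_T^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2$ is the transverse Laplacian operator. It may be noted that A is a function of $\mathbf{r} = (x, y, z)$.
Expansion and rearrangement gives
$$\nabla_T^2 A + \frac{\partial^2 A}{\partial z^2} - i2k\frac{\partial A}{\partial z} = 0.$$
Assuming that the variation of the envelope and its derivative is small, i.e., ΔA ≪ A for Δz ≈ λ, we have $\Delta A = (\partial A/\partial z)\,\Delta z \approx (\partial A/\partial z)\,\lambda \ll A$. Thus, $\partial A/\partial z \ll A/\lambda = kA/2\pi$, implying
$$\frac{\partial A}{\partial z} \ll kA.$$
Using the paraxial wave approximation along with the above inequality (applied to the derivative of the envelope), we obtain
$$\frac{\partial^2 A}{\partial z^2} \ll k\frac{\partial A}{\partial z}.$$
Further simplification, i.e., neglecting the second derivative with respect to z, leads us to the following differential equation:
$$\nabla_T^2 A - i2k\frac{\partial A}{\partial z} = 0.$$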
The above equation is a slowly varying envelope approximation of the Helmholtz equation, also called the paraxial Helmholtz equation. This equation becomes the basis for understanding special kinds of waves with specific beam shapes, such as Gaussian, Bessel, and others.
Solution of the paraxial Helmholtz equation: Gaussian beams, Bessel beams, and others
The solution of the paraxial Helmholtz equation gives rise to important beam shapes. These specialized beams have diverse applications in physical and engineering sciences. In this section, we explore some of these solutions.
Gaussian beam
One of the key solutions of the paraxial Helmholtz equation is a Gaussian beam. Incidentally, most light sources are Gaussian in nature especially at large distances. In this section, we show Gaussian beam as one of the solutions of the paraxial Helmholtz equation.
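Consider a plane wave traveling along the z axis with a complex envelope, i.e., $A(x, y, z) = A(x, y)\,e^{-ikz}$. The complex envelope of the wave satisfies the paraxial Helmholtz equation
$$\nabla_T^2 A - i2k\frac{\partial A}{\partial z} = 0.$$
Assuming that the solution has cylindrical symmetry, the paraxial Helmholtz equation can be expressed in cylindrical coordinates as
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial A}{\partial r}\right) - i2k\frac{\partial A}{\partial z} = 0.$$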
We try to solve the equation using a trial method by assuming the solution to be of the following form:
Substituting the above solution (1.62) into Eq. (1.61) and solving for A(r, z) produces the following expanded form,
where and C0 is a complex constant. The beam parameters are described as W(z), R(z), and χ(z):
It is important to note that the beam parameters are z-dependent. We show that this is something inherent to Gaussian functions and may not be true with other beam types/solutions of the wave equation. The reader is encouraged to derive the above solution.
Immediately, one can recognize important parameters of the Gaussian beam as discussed in the following.
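Gaussian beam intensity. At any z value, the lateral intensity profile of the beam is Gaussian,
$$I(r, z) = I_0\left[\frac{W_0}{W(z)}\right]^2 \exp\left[-\frac{2r^2}{W^2(z)}\right],$$
where I0 is the on-axis intensity at the beam waist (r = 0, z = 0).
Total phase. The phase of the Gaussian beam can be recognized as
$$\varphi(r, z) = kz - \chi(z) + \frac{k r^2}{2R(z)},$$
where kz is the phase collected by the wave as it propagates a distance z. Here $\chi(z) = \tan^{-1}(z/z_0)$ is the Gouy phase, which changes by a total of π (from −π/2 to +π/2) as the beam travels from −∞ to +∞.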
Beam width. The width of the beam is z-dependent, i.e.,
$$W(z) = W_0\sqrt{1 + \left(\frac{z}{z_0}\right)^2},$$
where $W_0 = \sqrt{\lambda z_0/\pi}$ is the smallest beam width, attained at z = 0. Here 2W0 is called the spot size.
Depth of focus. Using z0 to denote the Rayleigh length, the depth of focus is given by
$$2z_0 = \frac{2\pi W_0^2}{\lambda}.$$
Beam divergence. The width of the beam is least at z = 0 and increases with distance, i.e., the beam diverges. At large distances (z ≫ z0), we have (z/z0)² ≫ 1. Thus,
$$W(z) \approx \frac{W_0}{z_0}\,z = \theta_D\,z,$$
where θD = W0/z0 = λ/πW0 is the beam divergence and the beam diverges as a cone of half-angle θD.
All the above parameters are described in Fig. 1.8, which shows a Gaussian beam that assumes its best focus at z = 0 and grows gradually out of focus on either side. The divergence angle and depth of focus are also shown.
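The beam parameters can be evaluated with a few lines of code. The sketch below assumes the standard Gaussian-beam expressions W(z) = W0[1 + (z/z0)²]^(1/2), R(z) = z[1 + (z0/z)²], χ(z) = tan⁻¹(z/z0), and z0 = πW0²/λ; the function names and the sample waist and wavelength are illustrative choices only.

```python
import math

def rayleigh_length(w0, wavelength):
    """Rayleigh length z0 = pi * W0^2 / lambda."""
    return math.pi * w0 ** 2 / wavelength

def gaussian_beam(z, w0, wavelength):
    """Width W, wavefront radius R and Gouy phase chi at axial position z."""
    z0 = rayleigh_length(w0, wavelength)
    width = w0 * math.sqrt(1.0 + (z / z0) ** 2)
    radius = math.inf if z == 0 else z * (1.0 + (z0 / z) ** 2)
    gouy = math.atan(z / z0)
    return {"W": width, "R": radius, "chi": gouy}

def divergence(w0, wavelength):
    """Far-field half-angle theta_D = lambda / (pi * W0)."""
    return wavelength / (math.pi * w0)

def depth_of_focus(w0, wavelength):
    """Depth of focus 2 * z0."""
    return 2.0 * rayleigh_length(w0, wavelength)

# Hypothetical beam: 500 nm light focused to a 1 micrometre waist radius.
wavelength, w0 = 0.5e-6, 1.0e-6
z0 = rayleigh_length(w0, wavelength)
at_waist = gaussian_beam(0.0, w0, wavelength)
at_z0 = gaussian_beam(z0, w0, wavelength)
print(z0, depth_of_focus(w0, wavelength), divergence(w0, wavelength))
print(at_waist, at_z0)
```

At z = z0 the width has grown by a factor of √2 and the Gouy phase equals π/4, which is a convenient sanity check on any implementation.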
Bessel beam
The other solution of the paraxial Helmholtz equation is a Bessel beam that has the amazing property of self-reconstruction when disturbed or scattered by obstacles.
It is easy to show (by direct substitution) that another solution of the Helmholtz equation in the polar coordinate system is
$$\Psi(r, \phi, z) = A_m\,J_m(k_T r)\,e^{im\phi}\,e^{-ik_L z},$$
where Jm is the Bessel function of the first kind and order m, kL is the longitudinal wavevector, and $k_T^2 + k_L^2 = k^2$.
The intensity of the Bessel beam is given by
$$I(r, \phi, z) = |A_m|^2\,J_m^2(k_T r).$$
Thus, the intensity is independent of z and is circularly symmetric. This ensures that the power does not spread as the beam propagates along the z axis. Beams of this kind are known as Bessel beams; they can propagate over long distances without changing shape, i.e., they are nondiffracting in nature. In practice, a Bessel beam is better realized by using axicons.
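The z-independence of the intensity can be illustrated numerically. The sketch below assumes a field of the form A·Jm(kT r)·exp(imφ)·exp(−i kL z) with kT² + kL² = k², and evaluates Jm for integer m from Bessel's integral with the trapezoidal rule so that only the standard library is needed. The function names and numerical values are hypothetical.

```python
import cmath
import math

def bessel_j(m, x, steps=400):
    """J_m(x) for integer m from Bessel's integral (trapezoidal rule).

    J_m(x) = (1/pi) * integral over [0, pi] of cos(m*tau - x*sin(tau)) d tau.
    """
    def integrand(tau):
        return math.cos(m * tau - x * math.sin(tau))

    h = math.pi / steps
    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h / math.pi

def bessel_field(r, phi, z, m, k_t, wavelength, amplitude=1.0):
    """Assumed Bessel-beam field with transverse wavenumber k_t."""
    k = 2.0 * math.pi / wavelength
    k_l = math.sqrt(k ** 2 - k_t ** 2)  # longitudinal wavevector
    return (amplitude * bessel_j(m, k_t * r)
            * cmath.exp(1j * m * phi) * cmath.exp(-1j * k_l * z))

wavelength = 0.5e-6  # hypothetical wavelength
k_t = 1.0e6          # hypothetical transverse wavenumber (rad/m)
r, phi = 1.0e-6, 0.3

intensity_near = abs(bessel_field(r, phi, 0.0, 0, k_t, wavelength)) ** 2
intensity_far = abs(bessel_field(r, phi, 1.0e-3, 0, k_t, wavelength)) ** 2
print(intensity_near, intensity_far)
```

Only the phase factor depends on z, so the two intensities agree to rounding error. The first dark ring of the m = 0 beam sits where kT r equals the first zero of J0, about 2.405.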
In the subsequent sections, we discuss two key effects (interference and diffraction) that explain the temporal behavior of light. This is possible due to the wave nature of light as described by the wave theory.
Interference of two or more beams
Consider the superposition of two beams that are derived from the same source. The same source ensures that the beams share similarity within the coherence length (lc = c × τc) of the beam, where τc is the time over which the wave is continuous and so it has memory. This also ensures that the complex conjugate is defined within τc. Thus, the superposition yields
Here, G is defined as the autocorrelation function over which the wave function maintains correlation or remembers itself, i.e.,
and we define a new related function, that is generally used for autocorrelation,
where we have dropped the spatial coordinate (because is fixed).
The time over which the function maintains correlation (i.e., remembers itself) is the memory time, popularly known as the coherence time. Formally, coherence time is defined as
The corresponding spectral width is defined as
where S(ν) is the spectral density.
The coherence time (in the time domain) and spectral width (in the frequency domain) are related by an uncertainty relation given by
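Noting that g12 is complex, we can write it as $g_{12} = |g_{12}|\,e^{i\phi_g}$. Upon substitution, we obtain the following expanded form for the resultant intensity,
$$I = I_1 + I_2 + 2\sqrt{I_1 I_2}\,|g_{12}|\cos\phi_g.$$
Thus, for coherent beams 1 and 2, which are completely correlated, i.e., |g12| = 1, we obtain the following expression for interference,
$$I = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos\Delta\phi,$$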
where ϕg = Δϕ = ϕ1 − ϕ2. In addition, the phase difference and path difference between the optical arms (Δd = (d1 − d2)) are related by Δϕ = 2πΔd/λ (see Fig. 1.9).
(a) Interference of two waves equally split by a beam splitter BS50|50 (Michelson interferometer). (b) The resultant interference pattern.
Further simplification is possible if we consider both beams to be of equal intensity, I1 = I2 = I0 (say),
$$I = 2I_0\,(1 + \cos\Delta\phi) = 4I_0\cos^2\left(\frac{\Delta\phi}{2}\right).$$
However, for two uncorrelated beams, |g12| = 0, the resultant intensity is just the sum of two independent intensities (loosely termed as a statistical mixture),
$$I = I_1 + I_2.$$
In general, the output pattern for correlated beams of equal intensity is sinusoidal in the phase difference Δϕ, with a minimum and maximum of 0 and 4I0, respectively. The above expression shows that a phase difference equal to an odd integer multiple of π/2 gives the sum of the two intensities, i.e., 2I0, whereas an even integer multiple of π produces the maximum intensity of 4I0 and an odd integer multiple of π produces zero intensity. A schematic of a typical interferometer along with the interference pattern at the observation plane is shown in Fig. 1.9.
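The special cases quoted above are easy to reproduce numerically. The sketch below assumes the standard two-beam interference law I = I1 + I2 + 2√(I1 I2)|g12| cos Δϕ together with Δϕ = 2πΔd/λ; the function names and the sample wavelength are illustrative assumptions.

```python
import math

def phase_difference(delta_d, wavelength):
    """Phase difference for a path difference delta_d between the two arms."""
    return 2.0 * math.pi * delta_d / wavelength

def two_beam_intensity(i1, i2, delta_phi, g12_abs=1.0):
    """Resultant intensity of two beams with degree of correlation |g12|."""
    return i1 + i2 + 2.0 * math.sqrt(i1 * i2) * g12_abs * math.cos(delta_phi)

i0 = 1.0
bright = two_beam_intensity(i0, i0, 0.0)                 # in phase
dark = two_beam_intensity(i0, i0, math.pi)               # out of phase
quadrature = two_beam_intensity(i0, i0, math.pi / 2.0)   # odd multiple of pi/2
mixture = two_beam_intensity(i0, i0, 0.0, g12_abs=0.0)   # uncorrelated beams

# A path difference of half a wavelength gives a dark fringe.
wavelength = 0.5e-6  # hypothetical value
half_wave = two_beam_intensity(i0, i0, phase_difference(wavelength / 2.0, wavelength))
print(bright, dark, quadrature, mixture, half_wave)
```

Intermediate values of |g12| between 0 and 1 reduce the fringe contrast without changing the mean intensity I1 + I2.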
Diffraction of light
The diffraction of light is associated with the bending of light as it passes through or around microscopic objects. Unlike other popular texts, here we restrict ourselves to diffraction associated with light microscopy, such as microscopic apertures and lenses or a combination of them.
Fresnel diffraction
Fresnel diffraction follows from the Fresnel approximation, which states that the spatial frequencies (νx, νy) are much smaller than the cutoff frequency (λ⁻¹), so that $\nu_x^2 + \nu_y^2 \ll \lambda^{-2}$. Using this, the transfer function and the impulse response function can be determined to be (Saleh and Teich, 2007)
where d is the propagation distance between input and output planes.
Thus, the complex amplitude in the output plane (x, y) for waves originating in the input plane (x′, y′) is given by
where $h_0 = (i/\lambda d)\,e^{-ikd}$. A detailed discussion can be found in standard optics textbooks (e.g., Saleh and Teich, 2007).
Fraunhofer diffraction
Fraunhofer diffraction is a consequence of Fraunhofer approximation, which states that the only waves that contribute to the complex amplitude at the output plane are those that make angles θx ≈ x/d and θy ≈ y/d with the optical axis. This leads to the following complex amplitude at the output plane,
where ρ² = x² + y², and F(νx, νy) is the Fourier transform of the object function f(x, y) at the input plane, evaluated at the spatial frequencies νx = x/λd and νy = y/λd.
Diffraction through a circular aperture
Consider a microscopic hole illuminated by incoming light, as shown in Fig. 1.10. A plane wave ($Ae^{-ikz}$) illuminates the hole; part of it is blocked and the rest passes through. The aperture is defined by the function,
$$p(x, y) = \begin{cases} 1, & x^2 + y^2 \le r_0^2 \\ 0, & \text{otherwise} \end{cases}$$
where the circular aperture has a radius of r0.
Fraunhofer diffraction from a circular aperture at large distances (d ≫ 2r0).
The simplest theory of diffraction states that light inside the hole passes through and its propagation is described by either the Fresnel or the Fraunhofer approximation, depending upon the propagation distance. Thus, if we consider U(x, y) as the input complex amplitude and f(x, y) as the complex amplitude just after the aperture, then they are related by
$$f(x, y) = U(x, y)\,p(x, y)$$
We seek the field at a large distance d, where Fraunhofer diffraction applies. Strictly, this requires the Fresnel number r0²/λd to be much smaller than 1, which is a more demanding condition than d ≫ λ. Using Eq. (1.85), the resultant field at a distance d is given by
and the corresponding intensity is given by
where νx and νy are the spatial frequencies along x and y, respectively. Here, P(νx, νy) is the Fourier transform of the aperture function p(x, y).
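For a circular aperture, |P|² evaluates to the well-known Airy pattern, I(ρ) = I0[2J1(v)/v]² with v = 2πr0ρ/λd. This closed form is a standard textbook result and is not derived in the text above. The sketch below evaluates it using only the Python standard library. The Bessel function is computed from its integral representation, and the aperture radius, wavelength, and distance are assumed, illustrative values.

```python
import math

def bessel_j1(x, n=400):
    """J1(x) = (1/pi) * integral_0^pi cos(t - x*sin(t)) dt, trapezoid rule."""
    h = math.pi / n
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for k in range(1, n):
        t = k * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def airy_intensity(rho, r0, wavelength, d, i_peak=1.0):
    """Fraunhofer intensity at radial distance rho for an aperture of radius r0."""
    v = 2.0 * math.pi * r0 * rho / (wavelength * d)
    if v == 0.0:
        return i_peak
    return i_peak * (2.0 * bessel_j1(v) / v) ** 2

def fresnel_number(r0, wavelength, d):
    """Fraunhofer diffraction requires this number to be much smaller than 1."""
    return r0 ** 2 / (wavelength * d)

# Assumed illustrative values: 50 um radius, 500 nm light, 10 cm distance
r0, wavelength, d = 50e-6, 500e-9, 0.1

FIRST_ZERO_OF_J1 = 3.8317059702
rho_dark = FIRST_ZERO_OF_J1 / (2.0 * math.pi) * wavelength * d / r0

print(fresnel_number(r0, wavelength, d))          # about 0.05
print(rho_dark)                                   # about 0.61 * lambda * d / r0
print(airy_intensity(rho_dark, r0, wavelength, d))
```

The first dark ring falls at ρ ≈ 1.22λd/(2r0), the familiar Airy-disk radius.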
Single lens system
Consider an object f(x, y) at a distance d from the lens and an image g(x, y) formed at the focus of the lens (the back focal plane, at a distance f behind the lens); see Fig. 1.11. The system can be split into three parts: (I) propagation from the input object plane (x, y) to the lens plane (x′, y′) over the distance d, (II) transmission through the thin lens, and (III) propagation from the lens to the output image plane (x′′, y′′) located at the focus. We assume a thin lens and, hence, neglect its thickness.
Part I. The field at the lens plane (x′, y′) at a distance d from the input object plane (x, y) is given by
where $h_0 = (j/\lambda d)\,e^{-jkd}$ (here j denotes the imaginary unit, written as i elsewhere in this chapter).
Part II. The transmittance of a thin lens is given by (Saleh and Teich, 2007, p. 54). Thus, the lens is seen as imparting a phase proportional to x² + y² to the incident plane wave. Hence, the complex amplitudes to the left and right of the lens are connected by
The diffraction pattern observed on a plane placed at the focal length f of the lens.
Part III. The field at the output image plane (x′′, y′′) given by the field at input plane at (x′, y′),
where $h_1 = (j/\lambda f)\,e^{-jkf}$.
Combining parts I, II, and III, we obtain the complete expression as
where
, and .
Now, we use the well-known identities that are related by the Fourier transform,
This gives the following simplified form for the integral,
Incorporating the above expression into Eq. (1.94), we finally obtain
where
In the special case when d = f, we obtain
Thus, the above equation states that a lens performs a Fourier transform of the input field.
The optical intensity at the output plane is given by
where C is a constant.
4f imaging system
We now consider a two-lens system, also known as a 4f system. An optical configuration for a 4f system is shown in Fig. 1.12. The system can be thought of as a cascade in which the first subsystem (first lens) performs a Fourier transform and the second subsystem performs another Fourier transform (in the coordinate system of the inverted image plane), which is equivalent to an inverse Fourier transform.
A typical 4f system, performing Fourier and inverse Fourier transform.
To understand this, let us consider an object (represented by f(x, y)) placed at the object plane of the first lens. The lens performs Fourier transform of the input object function and splits its Fourier components (spatial frequencies) in the Fourier plane. This can be expressed as
The second lens performs inverse Fourier transform, i.e., it combines the Fourier components in the Fourier plane to produce an inverted image in the image plane. The process can be expressed as
The above equation can be generalized to incorporate filtering in the Fourier plane as shown in Fig. 1.13. Filtering can be achieved by introducing a mask, p(x, y), in the Fourier plane of the first lens. The mask blocks some of the components and allows other components, thereby behaving as a filter in the Fourier domain. The transmission just after the mask is
where x = λfνx and y = λfνy.
A typical 4f system with a spatial filter placed in the Fourier plane.
Finally, the filtered components undergo inverse Fourier transform by the second lens and produce the image g(x, y) at the focus of second lens,
The above expression (1.105) represents the inverted image produced by the object function f(x, y), which is located at the front focal plane of the first lens in the 4f system. We can draw a few important corollaries from the above discussion.
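Corollary 2: The transfer function of the filter realized by the mask is
$$H(\nu_x, \nu_y) = p(\lambda f \nu_x, \lambda f \nu_y)$$
Corollary 3: The impulse response of the system is
$$h(x, y) = \frac{1}{(\lambda f)^2}\, P\!\left(\frac{x}{\lambda f}, \frac{y}{\lambda f}\right)$$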
where P(νx, νy) is the Fourier transform of p(x, y).
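A discrete analogue makes the 4f picture concrete. In the sketch below, a naive discrete Fourier transform (DFT) stands in for each lens and a list of 0/1 weights stands in for the mask p(x, y). This is only a one-dimensional illustration under those assumptions; the helper names `dft` and `four_f` are hypothetical, and the physical scaling x = λfνx is not modeled.

```python
import cmath

def dft(samples):
    """Naive, unnormalized discrete Fourier transform."""
    n = len(samples)
    return [
        sum(samples[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

def four_f(samples, mask=None):
    """Lens 1 (DFT), optional mask in the Fourier plane, lens 2 (DFT again)."""
    n = len(samples)
    spectrum = dft(samples)                      # field in the Fourier plane
    if mask is not None:
        spectrum = [s * p for s, p in zip(spectrum, mask)]
    return [value / n for value in dft(spectrum)]

obj = [0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 0.0]

# No mask: two forward transforms return the object with reversed coordinates
image = four_f(obj)

# Mask passing only the zero-frequency (on-axis) component: a crude low-pass filter
dc_only = [1.0] + [0.0] * 7
filtered = four_f(obj, dc_only)

print([round(v.real, 6) for v in image])
print([round(v.real, 6) for v in filtered])
```

Without a mask the output is the input with its coordinate reversed, which is the discrete counterpart of the inverted image. With the on-axis mask all structure is removed and only the mean value of the object survives.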
In this section, we have discussed some of the key aspects of microscopy such as 2f (single lens) and 4f (double lens) systems. These systems are frequently used in light sheet microscopy. However, the polarization aspects of light in optical microscopes and imaging systems can be understood using the electromagnetic theory of light.
Electromagnetic Optics
Electromagnetic optics has its roots in electrostatics and magnetostatics. In the second half of the 19th century, Maxwell brought them together, along with his own contribution, in what we now call Maxwell’s equations. Electromagnetic optics spans a large swathe of the radiation spectrum, ranging from less than a kilohertz to more than a zettahertz. Maxwell’s equations form a unified theory that applies to the entire electromagnetic spectrum. Figure 1.14 shows a typical representation of the electromagnetic spectrum from low to high frequencies. The spectrum covers the full range of radiation known today, extending from very low (femto-electronvolt) to high (giga-electronvolt) photon energies.
The entire electromagnetic spectrum encompasses the radiation that we know today, i.e., visible, X-ray, γ-ray, ultraviolet, radio waves, microwaves, and very long waves (thousands of kilometers). Some of the historical milestones leading to the discovery of these waves are worth mentioning. As per the records, William Herschel discovered infrared radiation in 1800 when, while studying the different colors from a prism with a thermometer, he noticed that the highest temperature occurred beyond the red. Around the same time, Johann Ritter discovered ultraviolet rays when he found that invisible light rays induce certain chemical reactions. In 1886, Heinrich Hertz built an apparatus to generate and detect the low-frequency electromagnetic radiation that today we call radio waves. Hertz is also credited with the discovery of microwaves, which later became the backbone of the modern revolution (wireless technology and radio). Subsequently, Wilhelm Röntgen noticed a new type of radiation while carrying out high-voltage experiments with an evacuated tube. Later on, these rays were named X-rays and became the backbone of noninvasive medical investigation of the human body. The high-energy γ-rays were discovered by Paul Villard in 1900 when he was studying the radioactive emission of radium.
Later on, it was realized and experimentally demonstrated that all these waves are electromagnetic radiation and, thus, the laws of electromagnetic optics apply equally to all of them. This will be the subject of the subsequent sections.
Maxwell’s equations
The underlying postulates leading to electromagnetic optics are as follows:
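Postulate I. Light is an electromagnetic wave and the electromagnetic field is described by two interlinked vector fields: the electric field and the magnetic field ($\mathbf{E}$, $\mathbf{H}$).
Postulate II. Light obeys Maxwell’s equations. In free space, the equations are given by
$$\nabla \cdot \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{H} = 0, \qquad \nabla \times \mathbf{E} = -\mu_0 \frac{\partial \mathbf{H}}{\partial t}, \qquad \nabla \times \mathbf{H} = \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} \tag{1.108}$$
where μ0 = 4π × 10⁻⁷ H/m and ϵ0 ≈ (1/36π) × 10⁻⁹ F/m are the magnetic permeability and electric permittivity of free space, respectively.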
Maxwell’s equations are a restatement of the laws of electrostatics and magnetostatics, extended to time-varying fields. The first equation is the differential form of Gauss’ law ($\nabla \cdot \mathbf{E} = \rho/\epsilon_0$), where ρ is the charge density. In free space, ρ = 0, leading to the first Maxwell’s equation ($\nabla \cdot \mathbf{E} = 0$). The second equation is Gauss’ law for the magnetic field. As magnetic monopoles do not exist, and north and south magnetic poles always appear in pairs, the total magnetic charge density is always zero. This gives rise to the second Maxwell’s equation ($\nabla \cdot \mathbf{H} = 0$). The third equation is the differential form of Faraday’s law, which states that an electromotive force (EMF, in volts) is induced around the boundary path of a surface by a changing magnetic flux through the surface. The negative sign indicates that the induced EMF opposes the change in magnetic flux (also known as Lenz’s law). The fourth equation is the Ampere–Maxwell law expressed in differential form. It states that a circulating magnetic field is produced by an electric current, by an electric field that changes with time, or by both ($\nabla \times \mathbf{H} = \mathbf{J} + \epsilon_0\,\partial \mathbf{E}/\partial t$), where $\mathbf{J}$ is the current density; in free space, $\mathbf{J} = 0$. The individual laws were already known in some form, but it was the great idea of Maxwell to combine them and discover the theory of electromagnetic waves (Maxwell, 1865). Historically, Maxwell also identified light as an electromagnetic wave (Maxwell, 1873).
Maxwell’s equation in a dielectric medium
Optically, a dielectric medium is defined as a medium in which there are no free currents or charges. However, the electric and magnetic properties of the medium play prominent roles in deciding the behavior of light in the medium. In a dielectric medium, the polarization density ($\mathbf{P}$) is the sum of the electric dipole moments per unit volume induced by an external electric field. Similarly, the magnetization density ($\mathbf{M}$) depends on the magnetic properties of the medium and can be thought of as the sum of the magnetic dipole moments per unit volume in the presence of an external magnetic field (in analogy with the polarization density). Hence, the polarization and magnetization play prominent roles in a dielectric medium.
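The equations relating the flux densities and the fields are given by
$$\mathbf{D} = \epsilon_0 \mathbf{E} + \mathbf{P}, \qquad \mathbf{B} = \mu_0 \mathbf{H} + \mu_0 \mathbf{M}$$
where $\mathbf{P}$ and $\mathbf{M}$ are the polarization and magnetization density of the dielectric medium, respectively.
In a dielectric medium, Maxwell’s equations relate the displacement vector ($\mathbf{D}$), magnetic flux density ($\mathbf{B}$), electric field ($\mathbf{E}$), and magnetic field ($\mathbf{H}$),
$$\nabla \cdot \mathbf{D} = 0, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla \times \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t}$$
In free space, the polarization and magnetization densities vanish ($\mathbf{P} = \mathbf{M} = 0$), producing $\mathbf{D} = \epsilon_0 \mathbf{E}$ and $\mathbf{B} = \mu_0 \mathbf{H}$, and returning Maxwell’s equations in free space [see Eq. (1.108)].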
Wave propagation in free space
The postulates of electromagnetic theory laid the foundation for electromagnetic waves as we know it today. In this section, we derive the wave equation from the basic postulates.
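We seek the wave equation for both the electric field vector $\mathbf{E}$ and the magnetic field vector $\mathbf{H}$ in free space. From Maxwell’s equations (1.108), we take the third equation and apply the curl operator on both sides,
$$\nabla \times (\nabla \times \mathbf{E}) = -\mu_0 \frac{\partial}{\partial t} (\nabla \times \mathbf{H})$$
Using vector calculus, we expand the left-hand side as follows:
$$\nabla \times (\nabla \times \mathbf{E}) = \nabla (\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E}$$
Substituting the first equation ($\nabla \cdot \mathbf{E} = 0$) and the fourth equation ($\nabla \times \mathbf{H} = \epsilon_0\,\partial \mathbf{E}/\partial t$) from postulate II [Eq. (1.108)], we obtain
$$\nabla^2 \mathbf{E} - \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 \tag{1.113}$$
where $c = 1/\sqrt{\epsilon_0 \mu_0}$ is the speed of light in vacuum.
Similarly, one can show that the magnetic field vector satisfies the following wave equation,
$$\nabla^2 \mathbf{H} - \frac{1}{c^2} \frac{\partial^2 \mathbf{H}}{\partial t^2} = 0$$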
Note that the equations for both the electric and magnetic field vectors are similar to that of the wave equation used in wave optics. Thus, light is an electromagnetic wave and the respective electric and magnetic vectors individually satisfy the wave equation.
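As a quick numerical check, the speed of light follows from the two free-space constants via c = 1/√(ϵ0μ0). The sketch below uses the rounded values quoted with Maxwell’s equations above; the comparison value 299 792 458 m/s is the defined SI speed of light, added here only for reference.

```python
import math

MU_0 = 4.0 * math.pi * 1e-7               # H/m
EPS_0_APPROX = 1e-9 / (36.0 * math.pi)    # F/m, rounded textbook value

# Speed of light implied by the two constants
c_approx = 1.0 / math.sqrt(MU_0 * EPS_0_APPROX)

C_DEFINED = 299_792_458.0                 # m/s, defined SI value (for comparison)
relative_error = abs(c_approx - C_DEFINED) / C_DEFINED

print(c_approx)        # very close to 3e8 m/s
print(relative_error)  # below 0.1 percent
```

The rounded permittivity gives c ≈ 3 × 10⁸ m/s, within about 0.07% of the defined value.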
The solution of the wave equations
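The next goal is to solve the wave equations for $\mathbf{E}$ and $\mathbf{H}$. To do so, we work out the derivation in the Cartesian coordinate system. Our inference from wave optics indicates that the solution of the wave equation is periodic in time. Here, we seek the solution for the complete wave equation.
Thus, we begin with a solution of the form $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}(\mathbf{r})\,e^{-i\omega t}$, and likewise for $\mathbf{H}$. Substituting this in the complete wave equation (1.113), we obtain the modified forms for the coupled ($\mathbf{E}$, $\mathbf{H}$) waves,
$$\nabla^2 \mathbf{E}(\mathbf{r}) + k^2 \mathbf{E}(\mathbf{r}) = 0, \qquad \nabla^2 \mathbf{H}(\mathbf{r}) + k^2 \mathbf{H}(\mathbf{r}) = 0$$
where $k = \omega/c$ is the wave number.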
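The similarity of the equations indicates that the solution to one of them may be sufficient to guess the solution for the other. Hence, we pursue the solution of the equation related to the electric field vector ($\mathbf{E}$). Expressing the Laplacian operator in Cartesian coordinates and the electric field vector in its components, we obtain
$$\left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) (E_x \hat{x} + E_y \hat{y} + E_z \hat{z}) + k^2 (E_x \hat{x} + E_y \hat{y} + E_z \hat{z}) = 0$$
where the components have dependence on x, y, and z.
The above equation can be decoupled and split into three independent equations,
$$\nabla^2 E_x + k^2 E_x = 0, \qquad \nabla^2 E_y + k^2 E_y = 0, \qquad \nabla^2 E_z + k^2 E_z = 0$$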
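All the components have the same form, so we solve for the x component; the solutions for the others can be obtained by inspection. Expanding in Cartesian coordinates and using the separation of variables method for the x, y, z variables (i.e., Ex(x, y, z) = u(x) × v(y) × w(z)), we obtain
$$\frac{1}{u} \frac{d^2 u}{dx^2} + \frac{1}{v} \frac{d^2 v}{dy^2} + \frac{1}{w} \frac{d^2 w}{dz^2} + k^2 = 0$$
The above coupled equation can be separated by splitting k² as $k^2 = k_x^2 + k_y^2 + k_z^2$. This produces the following set of separate equations,
$$\frac{d^2 u}{dx^2} + k_x^2 u = 0, \qquad \frac{d^2 v}{dy^2} + k_y^2 v = 0, \qquad \frac{d^2 w}{dz^2} + k_z^2 w = 0$$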
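The general solutions for the above set of equations are
$$u(x) = a_1 e^{i k_x x} + a_2 e^{-i k_x x}, \qquad v(y) = b_1 e^{i k_y y} + b_2 e^{-i k_y y}, \qquad w(z) = c_1 e^{i k_z z} + c_2 e^{-i k_z z} \tag{1.120}$$
Equivalently, the above set of solutions can also be expressed in cosine and sine as
$$u(x) = a_1' \cos(k_x x) + a_2' \sin(k_x x), \qquad v(y) = b_1' \cos(k_y y) + b_2' \sin(k_y y), \qquad w(z) = c_1' \cos(k_z z) + c_2' \sin(k_z z) \tag{1.121}$$
where the coefficients are arbitrary constants. Combining the above individual solutions, we obtain the most general solution for the x component of an electric field,
$$E_x(x, y, z) = \left( a_1 e^{i k_x x} + a_2 e^{-i k_x x} \right) \left( b_1 e^{i k_y y} + b_2 e^{-i k_y y} \right) \left( c_1 e^{i k_z z} + c_2 e^{-i k_z z} \right)$$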
It may be noted that the exponential form [Eq. (1.120)] is useful to describe propagating waves in space (propagation through air or any medium), whereas the trigonometric form [Eq. (1.121)] is useful when dealing with light propagation in a restricted space (such as an optical cavity or optical fibers). Note that the solution for the application at hand requires specific boundary conditions.
In summary, the complete solution is obtained by combining all the components and incorporating the time component ($e^{-i\omega t}$), i.e.,
$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0\, e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \tag{1.123}$$
where $\mathbf{E}_0$ is a constant amplitude vector and $\mathbf{k} = (k_x, k_y, k_z)$ is the wave vector.
The above equation describes a monochromatic wave traveling in the direction of $\mathbf{k}$ with a periodicity of 2π/k = λ in space and a periodicity of 1/ν = 2π/ω in time.
Equation (1.123) is called a time-dependent electromagnetic wave. In reality, all waves have time dependence, but depending on the context of the problem at hand, we may take the time component, the space component, or both. Specifically, for explaining experiments involving interference at a fixed point, we need to consider the time component (i.e., $e^{-i\omega t}$) of the wave, whereas experiments involving polarization require only the space part (i.e., $e^{i \mathbf{k} \cdot \mathbf{r}}$). There are situations when we need both time and space variables (i.e., $e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}$). Thus, the problem at hand determines the form we need to choose.
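The plane-wave form can be checked numerically: its spatial part should satisfy ∇²ψ + k²ψ = 0 and repeat after one wavelength along the propagation direction. The sketch below uses a central finite-difference Laplacian. The wavelength, propagation direction, test point, and step size are assumed, illustrative values.

```python
import cmath
import math

wavelength = 500e-9                                # assumed wavelength (m)
k = 2.0 * math.pi / wavelength
direction = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)      # unit vector along k
kvec = tuple(k * d for d in direction)

def psi(r):
    """Spatial part of the plane wave, exp(i k.r)."""
    return cmath.exp(1j * sum(kc * rc for kc, rc in zip(kvec, r)))

def laplacian(func, r, h):
    """Second-order central-difference Laplacian in three dimensions."""
    total = -6.0 * func(r)
    for axis in range(3):
        for sign in (-1.0, 1.0):
            shifted = list(r)
            shifted[axis] += sign * h
            total += func(shifted)
    return total / h ** 2

point = (1.0e-7, -2.0e-7, 3.0e-7)

# Helmholtz residual, normalized by k^2 (limited by the finite-difference step)
residual = abs(laplacian(psi, point, wavelength / 1000) + k ** 2 * psi(point)) / k ** 2

# Moving one wavelength along the propagation direction reproduces the field
one_period_away = tuple(rc + wavelength * d for rc, d in zip(point, direction))
period_error = abs(psi(one_period_away) - psi(point))

print(residual, period_error)
```

The residual is small but nonzero because of the finite-difference approximation; it shrinks as the step size is reduced.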
Structure of electromagnetic waves
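Now, we are in a position to understand the structure of the electromagnetic field as the light wave propagates in space. To understand this, let us consider the electric and magnetic field vectors of a monochromatic wave,
$$\mathbf{E} = \mathbf{E}_0\, e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}, \qquad \mathbf{H} = \mathbf{H}_0\, e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}$$
Both vector fields satisfy Maxwell’s equations; thus we have
$$\nabla \cdot \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{H} = 0, \qquad \nabla \times \mathbf{E} = -\mu_0 \frac{\partial \mathbf{H}}{\partial t}, \qquad \nabla \times \mathbf{H} = \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}$$
Substituting the fields and noting that, for this plane-wave form, $\nabla \to i\mathbf{k}$ and $\partial/\partial t \to -i\omega$, the above set of equations becomes
$$\mathbf{k} \cdot \mathbf{E}_0 = 0, \qquad \mathbf{k} \cdot \mathbf{H}_0 = 0, \qquad \mathbf{k} \times \mathbf{E}_0 = \omega \mu_0 \mathbf{H}_0, \qquad \mathbf{k} \times \mathbf{H}_0 = -\omega \epsilon_0 \mathbf{E}_0$$
The above set of equations brings out the structure of an electromagnetic field, i.e., the relationship between the electric field vector, magnetic field vector, and propagation vector as light travels in free space. It is apparent that these vectors are mutually orthogonal at any point in time, with the electric and magnetic field vectors perpendicular to the propagation vector. Thus, the electric and magnetic field vectors lie in a plane that is orthogonal to the propagation vector of light. This is pictorially shown in Fig. 1.15. The wave is popularly called a transverse electromagnetic (TEM) wave.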
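The orthogonality of the three vectors can be verified with a short calculation. The sketch below assumes the plane-wave convention exp[i(k·r − ωt)], for which Faraday’s law reduces to k × E0 = ωμ0H0. The wavelength, the field amplitude, and the CODATA value of ϵ0 are assumed inputs.

```python
import math

MU_0 = 4.0 * math.pi * 1e-7       # H/m
EPS_0 = 8.8541878128e-12          # F/m (CODATA value, assumed here)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

c = 1.0 / math.sqrt(MU_0 * EPS_0)
wavelength = 500e-9                # assumed wavelength (m)
k = 2.0 * math.pi / wavelength
omega = k * c

kvec = (0.0, 0.0, k)               # propagation along z
E0 = (1.0, 0.0, 0.0)               # electric field along x, 1 V/m

# Faraday's law for a plane wave: k x E0 = omega * mu0 * H0
H0 = tuple(component / (omega * MU_0) for component in cross(kvec, E0))

# Ampere-Maxwell law for a plane wave: k x H0 = -omega * eps0 * E0
ampere_lhs = cross(kvec, H0)
ampere_rhs = tuple(-omega * EPS_0 * component for component in E0)

# Ratio |E0|/|H0| is the free-space impedance sqrt(mu0/eps0)
impedance = math.sqrt(dot(E0, E0)) / math.sqrt(dot(H0, H0))

print(H0, impedance)
```

With E0 along x and k along z, H0 comes out along y, and the ratio |E0|/|H0| is the free-space impedance of roughly 377 Ω.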
Boundary conditions for E and H
The above solution describing the propagation of the wave needs to obey boundary conditions when light passes from one dielectric medium to another. These boundary conditions arise from the requirement that certain components of the electric and magnetic fields be continuous at the interface of two media. They concern the fields $\mathbf{E}$, $\mathbf{H}$, $\mathbf{D}$, and $\mathbf{B}$: the tangential components of $\mathbf{E}$ and $\mathbf{H}$ and the normal components of $\mathbf{D}$ and $\mathbf{B}$. These boundary conditions are an integral part of Maxwell’s equations and affect the reflectance/transmittance of electromagnetic waves at interfaces and their propagation in a medium.
The boundary conditions are defined at the interface of two dielectric media (with zero surface current and zero surface charge density), given by
$$E_{1t} = E_{2t}, \qquad H_{1t} = H_{2t}, \qquad D_{1n} = D_{2n}, \qquad B_{1n} = B_{2n} \tag{1.127}$$
where the subscripts (1, 2) correspond to dielectric media (1, 2) and t and n represent the tangential and normal components of the respective fields.
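As an illustration of how these conditions are used, the sketch below computes the electric field just inside medium 2 from the field in medium 1. It assumes linear, isotropic media with D = ϵE and no surface charge; the function name and the permittivity values (air to glass) are illustrative.

```python
import math

def field_across_interface(e1_t, e1_n, eps_r1, eps_r2):
    """Return (E2t, E2n) from continuity of tangential E and normal D.

    Assumes linear isotropic media, D = eps * E, and zero surface charge.
    """
    e2_t = e1_t                        # tangential E is continuous
    e2_n = eps_r1 * e1_n / eps_r2      # normal D = eps * E is continuous
    return e2_t, e2_n

# Illustrative case: air (eps_r = 1) to glass (eps_r = 2.25, i.e., n = 1.5)
e1_t, e1_n = 1.0, 1.0
e2_t, e2_n = field_across_interface(e1_t, e1_n, 1.0, 2.25)

# Angle of the E vector measured from the interface normal, on each side
angle_1 = math.degrees(math.atan2(e1_t, e1_n))
angle_2 = math.degrees(math.atan2(e2_t, e2_n))

print(e2_t, e2_n, angle_1, angle_2)
```

The tangential component is unchanged while the normal component shrinks by the permittivity ratio, so the field vector tilts away from the normal in the denser medium.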
Polarization of Light
Because light consists of two intertwined complex vector fields, each field has a magnitude and a phase. Polarization of light is concerned with the time course of these vectors (electric or magnetic field vectors) at a point in space (say, $\mathbf{r}$) as a function of time t. These vectors lie in a plane tangential to the wavefront (i.e., normal to the propagation direction) and vary sinusoidally as the wave progresses, as shown in Fig. 1.16. In general, the electric field has components along both the x and y axes.
Time course of the electric and magnetic field as the EM wave propagates.
Consider a monochromatic plane wave traveling along the z axis. Thus, the complex electric field is bound to the XY plane and can be represented as
where the complex amplitude has two complex components (Ax, Ay) given by
Substituting Eq. (1.129) into Eq. (1.128) and representing the electric field in x and y components, we obtain
with
Careful inspection shows that the above set of equations is a parametric description of a generalized ellipse,
$$\frac{E_x^2}{|A_x|^2} + \frac{E_y^2}{|A_y|^2} - \frac{2 E_x E_y}{|A_x||A_y|} \cos \Delta\phi = \sin^2 \Delta\phi \tag{1.132}$$
where Δϕ is the phase difference between the x and y components of the electric field.
The above equation (1.132) is the master equation, which is capable of explaining all types of polarization states of light. The master equation determines the direction of the tip of the electric field vector in the xy plane as the wave propagates. The tip rotates helically along the z axis and repeats periodically with a period of λ. Thus, the polarization state of the wave is determined by the sense of rotation (clockwise or counter-clockwise) and the shape of the polarization ellipse as seen along the z axis (facing the advancing wave). This is shown in Fig. 1.17.
The polarization ellipse and the trace of the electric field ellipse as the wave propagates. The components of the complex amplitude are also shown.
Linear polarization
Let us analyze the master equation to understand different polarization states of light. We begin with the simplest case, when the phase difference Δϕ is an even integer multiple of π/2 (i.e., an integer multiple of π). Substituting this into the master equation, we obtain the following simpler form:
$$E_y = \pm \frac{|A_y|}{|A_x|} E_x$$
The above expression is the equation of a line, indicating linear polarization, for which the slope is given by ±|Ay|/|Ax| (positive for even multiples of π and negative for odd multiples). Specifically, when |Ax| = |Ay|, the slope is ±1, and the plane of polarization makes an angle of 45° with both the x and y axes. This is called plane-polarized light, and is shown in Fig. 1.18. In addition, when |Ay| = 0, Ey = 0 and only Ex survives. In this case, the plane of polarization collapses to a line along the x axis. This is called linearly polarized light along the x axis.
Circular polarization
Now, consider the other possibility, when the phase difference is an odd integer multiple of π/2. This produces the following simplified expression for the master equation,
$$E_x^2 + E_y^2 = A_0^2$$
where we have assumed |Ax| = |Ay| = A0.
The above equation can be immediately recognized as the equation of a circle with radius A0. This indicates that the tip of the field vector rotates in a circle as the wave propagates, completing each cycle in a distance λ. This is called circularly polarized light.
A specific case arises for Δϕ = +π/2 and Δϕ = −π/2 for which the tip of the field rotates either in a clockwise or anticlockwise direction. These states of light are called right and left circularly polarized (see Fig. 1.18). Finally, a generalized polarization state can be represented by a Poincaré sphere (Landau et al., 1984). Readers are encouraged to look at the standard textbooks on optics for better understanding of the Poincaré sphere.
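The special cases discussed above can be summarized in a small classifier. The sketch below assumes the parametrization Ex = |Ax| cos θ and Ey = |Ay| cos(θ + Δϕ), and uses the standard ellipse relation (Ex/|Ax|)² + (Ey/|Ay|)² − 2(ExEy/|Ax||Ay|) cos Δϕ = sin² Δϕ as the master equation. Handedness is deliberately not assigned because it depends on the sign and viewing conventions; the function names are illustrative.

```python
import math

def field_components(ax, ay, dphi, theta):
    """Real field components for amplitudes ax, ay and phase difference dphi."""
    return ax * math.cos(theta), ay * math.cos(theta + dphi)

def ellipse_residual(ex, ey, ax, ay, dphi):
    """Left side minus right side of the polarization-ellipse equation."""
    return ((ex / ax) ** 2 + (ey / ay) ** 2
            - 2.0 * ex * ey / (ax * ay) * math.cos(dphi)
            - math.sin(dphi) ** 2)

def classify(ax, ay, dphi, tol=1e-9):
    """Return 'linear', 'circular', or 'elliptical'."""
    if ax < tol or ay < tol or abs(math.sin(dphi)) < tol:
        return "linear"        # dphi is a multiple of pi, or one component is absent
    if abs(math.cos(dphi)) < tol and abs(ax - ay) < tol:
        return "circular"      # odd multiple of pi/2 with equal amplitudes
    return "elliptical"

# Every point traced by the field tip lies on the ellipse
worst = max(
    abs(ellipse_residual(*field_components(1.0, 2.0, 0.7, 0.1 * n), 1.0, 2.0, 0.7))
    for n in range(63)
)

print(classify(1.0, 1.0, 0.0))           # linear
print(classify(1.0, 1.0, math.pi / 2))   # circular
print(classify(1.0, 2.0, math.pi / 2))   # elliptical
print(worst)
```

The residual stays at rounding level for every θ, confirming that the parametric field components trace the ellipse for any amplitudes and phase difference.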
Transverse Electric and Magnetic Polarization
Transverse electric (TE) and transverse magnetic (TM) modes are better understood and visualized in a rectangular optical cavity. Whether a field is TE or TM depends on whether $\mathbf{E}$ or $\mathbf{H}$ is transverse to the axial direction of the waveguide or, equivalently, the direction of propagation. Here we choose the z axis as the axis of the waveguide, as shown in Fig. 1.19.
The TE mode corresponds to Ez = 0 and represents the following field configuration:
where all the components are functions of x, y, and z.
We proceed to determine the electromagnetic field for TE waves (Ez = 0). We use the fact that Hz satisfies the Helmholtz equation, i.e.,
$$\nabla^2 H_z + k^2 H_z = 0$$
To solve the above equation, we use the separation of variables and assume a solution of the form Hz (x, y, z) = u(x) × v(y) × w(z). Substitution gives
Dividing by uvw and rearranging gives
We have purposely fully separated the x, y variables on the left-hand side and the z variable on the right-hand side of the equation. The two sides can be equal for all x, y, and z only if both are equal to the same constant, which we express through kz² (so that the z equation reads d²w/dz² + kz²w = 0). This produces the following set of equations:
The solution to the second equation can be determined to be plane waves,
$$w(z) = A_1 e^{i k_z z} + A_2 e^{-i k_z z}$$
Thus, w(z) corresponds to a superposition of forward ($e^{i k_z z}$) and backward ($e^{-i k_z z}$) traveling plane waves along the z axis.
Now, we proceed to solve the first equation involving x, y variables,
Again, proceeding in a similar way, we end up with the following set of equations:
In a rectangular cavity as shown in Fig. 1.19, both equations obey boundary conditions that state that the normal component of the field is continuous at the boundary [see Eq. (1.127)], i.e.,
The solution to the second equation involving the y variable is
with the boundary condition on the y variable.
Taking the first derivative, , and enforcing the boundary conditions, we obtain
The corresponding solution is obtained by substituting B2 = 0 and , i.e.,
Now, we concentrate on the first equation involving x [Eq. (1.143)] given by
along with the boundary condition on the x variable.
The general solution to the above equation is similar and can be readily expressed as
where
Taking the first derivative, , and imposing the boundary conditions we obtain
Thus, the solution is obtained by substituting C2 = 0 and into Eq. (1.149), producing
Combining all the components, complete solution in space can be expressed as
where D is the total constant, and kz is given by
with (m, n) ≠ (0, 0).
Incorporating the time-dependent part (e−iωt), the complete solution in space–time can be generalized to
It may be realized that there are infinitely many discrete solutions corresponding to m = 0, 1, 2, … and n = 0, 1, 2, …, but (m, n) ≠ (0, 0). These (m, n) represent modes that correspond to a nonzero solution in a rectangular cavity. The modes are as shown in Fig. 1.20.
All the nonzero components need to satisfy Maxwell’s equations along with the boundary conditions. The conditions state that the tangential components of the electric field and the normal derivative of the tangential components of the magnetic field are zero at the boundaries. Thus, the corresponding TE and TM fields can be obtained from Maxwell’s equations. It can be shown that the spatial dependence of these components is
Each of these components satisfies the Helmholtz equation and the associated boundary conditions. The field corresponding to (m, n) is called the TEmn mode and there are infinitely many such modes. The reader is encouraged to derive each of these components taking into account the boundary conditions for the specific field components.
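The mode indices determine whether a given TEmn mode propagates at a given frequency. The sketch below assumes a hollow rectangular guide of cross-section a × b with kz² = k² − (mπ/a)² − (nπ/b)². The symbols a and b and the numerical dimensions (those of a common microwave guide) are assumptions for illustration, since the guide dimensions are defined only in the figure.

```python
import math

C = 299_792_458.0   # speed of light in vacuum (m/s)

def cutoff_frequency(m, n, a, b):
    """Cutoff frequency of the TE_mn mode for an a x b vacuum-filled guide."""
    if m == 0 and n == 0:
        raise ValueError("(m, n) = (0, 0) is not a valid TE mode")
    return 0.5 * C * math.sqrt((m / a) ** 2 + (n / b) ** 2)

def axial_wavenumber(freq, m, n, a, b):
    """Return kz for a propagating mode, or None below cutoff (evanescent)."""
    k = 2.0 * math.pi * freq / C
    kz_squared = k ** 2 - (m * math.pi / a) ** 2 - (n * math.pi / b) ** 2
    return math.sqrt(kz_squared) if kz_squared > 0.0 else None

# Assumed illustrative dimensions: 22.86 mm x 10.16 mm
a, b = 22.86e-3, 10.16e-3

fc_10 = cutoff_frequency(1, 0, a, b)
fc_01 = cutoff_frequency(0, 1, a, b)
kz_10_at_10ghz = axial_wavenumber(10e9, 1, 0, a, b)
kz_01_at_10ghz = axial_wavenumber(10e9, 0, 1, a, b)

print(fc_10, fc_01)                      # TE10 has the lower cutoff
print(kz_10_at_10ghz, kz_01_at_10ghz)    # TE10 propagates at 10 GHz, TE01 does not
```

Each (m, n) pair yields its own cutoff, so at a fixed frequency only a finite number of the infinitely many modes propagate.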
TM polarization
The TM mode corresponds to Hz = 0 and the corresponding field configurations are
where all the components are functions of x, y, and z.
We proceed to determine the electromagnetic field for TM waves (Hz = 0). We know that Ez satisfies the Helmholtz equation,
along with the following boundary conditions in a rectangular cavity,
Proceeding in a similar way to the TE case, we can show that
where F is a constant.
Again, there are infinitely many solutions, and nonzero fields exist for integer values of (m, n) with the condition (m, n) ≠ (0, 0); for perfectly conducting walls, where Ez vanishes at the boundaries, both m and n must in fact be nonzero. These indicate the modes of TM waves or TM polarized waves. The reader is encouraged to derive the above expression. Note that the TM modes are similar to the TE modes for a given frequency (kz). The modes for these cases are shown in Fig. 1.20. Similar to the TE case, the field components for the TM wave can be derived using the Helmholtz equation along with appropriate boundary conditions. The components corresponding to (m, n) comprise the TMmn mode, and there are infinitely many such modes.
This chapter is an introduction to special topics of light that are of interest to light sheet optical microscopy. Moreover, the chapter prepares the reader for advanced light sheet microscopy. The basics of light are essential for an in-depth understanding of optical systems and microscopy techniques. Specifically, we explore light sheet optical imaging systems in the following chapters.
AX2540-A axicon (350–700 nm), Thorlabs, USA, and plano-convex Bessel-grade axicons, Edmund Optics, Singapore.