This work has two aims. The first is to develop a Lagrange–Hamilton framework for the analysis of multi-degree-of-freedom nonlinear systems in which non-conservative effects are included in the variational principle of least action from the outset. The framework is a generalization of the Bateman approach in which a set of adjoint coordinates is introduced. A function termed the M-function is introduced as the Fourier transform over the momenta of the joint probability density function (JPDF) of the displacements and momenta, and it is shown that for statistical systems, this function can be written as an expectation involving the new principle function and a general dimensional constant ℏ. This leads to a concise derivation of the Fokker–Planck–Kolmogorov equation. It is found that the equation governing the M-function can be expressed in terms of the new Hamiltonian by replacing momenta by differential operators, meaning that the function satisfies the same equation as the quantum wave function. This gives rise to the second aim of this work: to explore relations between the developed classical framework and quantum mechanics. It is shown that for an undamped linear system, the solution of the M-function equation yields the response JPDF as a sum of Wigner functions. This classical analysis leads to a number of well-known results from quantum mechanics as ℏ → 0, and the extension of this result to nonlinear systems is discussed. The quantum wave function associated with the Hamiltonian is then considered, and the relevance of this function to the physical system is discussed.
I. INTRODUCTION
The equations of motion of a dynamic system can be derived directly by combining Newton’s laws of motion with other laws, such as the equations of elasticity or gravitation. Alternatively, the equations can be derived by defining the Lagrangian to be the difference between the kinetic and potential energies of the system and then using Lagrange’s equations to express the equations of motion in terms of partial derivatives of the Lagrangian. Likewise, the equations can also be derived by forming the Hamiltonian, which is a function of the displacements and momenta of the system, and then using Hamilton’s equations. Neither the Lagrangian nor the Hamiltonian includes non-conservative (dissipative) effects. The primary aim of this paper is to develop a Lagrange and Hamilton framework for the analysis of non-conservative systems. The developed framework naturally lends itself to a secondary aim, which is to explore connections between classical mechanics and quantum mechanics; this is done in a way that differs from the usual quantum-to-classical approach in being a classical-to-quantum analysis. These two aims are described sequentially in what follows.
It is well known that the classical variational principle of least action that underlies the Lagrange and Hamilton approaches to dynamics is limited to conservative systems, although non-conservative effects can be included by adding appropriate generalized forces to the equations of motion that result from the variational principle.1 There has been much interest in the development of alternative variational principles that include non-conservative effects from the outset, and such has been the research effort in this area that many different approaches have been presented in the literature:
The transformation of the system displacements to a set of new variables that grow exponentially in time, leading to a time dependent Lagrangian. The application of Lagrange’s equations to this Lagrangian then yields the equations of motion of the system, including non-conservative effects. Examples of this approach include Ref. 2 for a single degree of freedom system and Ref. 3 for a multi-degree of freedom linear system.
The use of a complex Lagrangian. With this approach, the Lagrangian is expressed in terms of a variable that is a complex linear combination of the system displacement and velocity. Lagrange’s equation then yields the equation of motion for the new variable, which leads to the correct equation of motion for the system. The method is applied to a single degree of freedom system in Ref. 4.
The Prelle–Singer approach. By considering time independent integrals of motion of the damped harmonic oscillator, Chandrasekar et al.5 developed a time independent Lagrangian that yields the correct equation of motion via Lagrange’s equation. The Lagrangian is a complicated function of the system displacement and velocity, and the form of the function differs depending on the nature of the system (for example, underdamped, overdamped, or critically damped). This approach was extended in Ref. 6 to the case of a single degree of freedom system with an nth order constant-coefficient differential equation of motion.
The local potential approach.7 In this method, two kinds of physical variables are introduced: when used in a variational principle, one type of variable (referred to as a thermodynamic variable) is subject to a variation, while the other type remains fixed. The two types of variable are used to form the Lagrangian, and after the application of Lagrange’s equation, the fixed variables are placed equal to the thermodynamic variables. In this way, the Lagrangian can be defined so that the resulting equations of motion correctly describe a non-conservative system.
The use of an action-dependent Lagrangian. The principle function is conventionally defined as the time integral of the Lagrangian, or equivalently, the time derivative of the principle function is equal to the Lagrangian. This relation can be generalized by allowing the Lagrangian to be dependent on the principle function, in which case equating the time derivative of the principle function to the Lagrangian leads to a first order differential equation for the principle function. As described in Ref. 8, the principle of least action then leads to a modified form of Lagrange’s equation. If the Lagrangian is made dependent on the principle function simply by adding a term that is proportional to the principle function, then the resulting equations of motion include linear damping. In Ref. 8, the method is applied to a single degree of freedom system with linear damping and to problems involving fields.
The Bateman approach,9 in which the physical system is augmented by an adjoint (or auxiliary) system. A Lagrangian is defined as a function of both the physical and adjoint displacements and velocities, and the application of Lagrange’s equation leads to an equation of motion for both the physical system and the adjoint system. Non-conservative effects can be included in this approach, in which case the adjoint equation of motion has negative damping or, equivalently, describes a system running in reverse time. This approach is applied to the harmonic oscillator in Refs. 9 and 10, and to both linear and nonlinear single degree of freedom systems, and to two bodies attracted by a central force, in Ref. 11.
Other related methods, including a field approach12 and the use of a Lagrangian with vanishing coefficients,13 are reviewed comprehensively by Vujanovic and Jones.14
The present approach to the development of a Lagrange and Hamilton framework for non-conservative systems has much in common with the method developed by Bateman.9 By introducing an adjoint system, a Lagrangian is defined in terms of both the physical and adjoint variables; the method is applicable to an arbitrary nonlinear multi-degree of freedom system and is based on knowledge of the conventional Lagrangian and the nature of the non-conservative forces either within the system or applied to the system. By analogy with the conventional approach, the principle function is defined as the integral of the Lagrangian over a time interval subject to the prescribed values of the physical and adjoint displacements at the initial and final times. Setting the first variation of the principle function to zero leads to the correct equations of motion for the physical system and also to a corresponding set of equations of motion for the adjoint variables. The adjoint equations are linear and homogeneous, run backward in time, and have (in the nonlinear case) coefficients that depend on the physical variables—this is very similar to the equations that are generated for the Lagrange multipliers in the constrained optimization of dynamic systems.15 The new Lagrangian leads naturally to a generalization of the Hamiltonian, Hamilton’s equations, the Hamilton–Jacobi equation, and the concept of Poisson brackets. As in the conventional case, if the Hamiltonian is not explicitly dependent on time, then it is a conserved quantity, the difference here being that this result holds even for non-conservative systems.
Having defined generalized expressions for the Lagrangian, the principle function, and the Hamiltonian, attention is then turned to statistical systems, which may have random initial conditions and/or random forcing. In this case, it is helpful to redefine the boundary conditions on the principle function: rather than taking specified displacements at the initial and final times, the displacements and momenta of the physical system are specified at the initial time, and the displacements of the adjoint system are specified at the initial and final times. A function termed the M-function is then defined as the Fourier transform over the momenta of the joint probability density function (JPDF) of the physical displacements and the momenta, and it is shown that this function can be expressed as a function of the principle function in a way that has loose analogies to the Feynman path integral in quantum mechanics and to semi-classical physics. It is shown that for systems with Gaussian white noise excitation, the expression for the M-function leads in a straightforward way to the Fokker–Planck–Kolmogorov (FPK) equation,16 and the Stratonovich16,17 version of the effect of parametric excitation arises naturally from this approach.
For dimensional reasons, the definition of the M-function includes a constant ℏ, which can be given an arbitrary value. A partial differential equation is derived for the M-function, and it is shown that this equation can be expressed in terms of the Hamiltonian by replacing momentum terms by differential operators. In this sense, the equation has exactly the same form as the Schrödinger equation for the quantum wave function, and this provides the motivation for the second aim of the present work, which is to explore connections between classical mechanics and quantum mechanics.
In the case of a linear undamped system, it is shown that a separable solution of the M-function equation leads to Schrödinger’s equation for the physical system, meaning that the M-function can be expressed as a sum of second order products of the quantum wave function but with the constant ℏ retaining an arbitrary value. Expressions for the probability density function of the displacements, and of the momenta, are derived, which are identical to those used in quantum mechanics, and if the coefficient matrix in the expansion of the M-function is positive semi-definite, then the uncertainty principle applies. If a single wave function is used, then the JPDF of the displacements and momenta has the form of a Wigner function,18–20 an expression that has been used previously to construct the JPDF from the quantum wave function. An investigation of the relationship between classical and quantum mechanics is then performed, considering the nature of the coefficient matrix, the positivity of the JPDF, and the situation as ℏ → 0. A numerical example is included to illustrate the key points. The case of a nonlinear system is then considered.
Many previous studies (for example Refs. 21–25) have considered the Schrödinger equation associated with the Bateman Hamiltonian for a damped oscillator. As discussed above, this equation also governs the classical M-function, and the solutions to the equation can therefore be interpreted in a classical way or as a quantum wave function. Having explored the properties of the M-function, attention is turned to the wave function interpretation of the solution, and in agreement with previous work,24,25 it is found that the equation cannot reproduce the conventional quantum behavior of the physical system. This raises questions as to the practical utility of the wave function solutions.
In what follows, the Lagrange and Hamilton formulations are discussed in Secs. II–V. Statistical dynamics is then considered in Sec. VI, including a derivation of the Fokker–Planck–Kolmogorov equation and the equation that governs the M-function. The relations between the present formulation and quantum mechanics are discussed in Secs. VII and VIII, and conclusions are given in Sec. IX.
II. LAGRANGIAN FORMULATION
The present analysis is concerned with the dynamics of a system that has N degrees of freedom, which are represented by the generalized coordinates qi(t), i = 1, 2, …, N. The classical Lagrangian for the system is defined as the difference between the kinetic energy T and the potential energy V, and it can be written in the form1
Here, the subscript on L1 is used to distinguish the classical Lagrangian from the generalized case that will be defined in what follows. It is well known that the equations of motion of the system can be derived from the Lagrangian in the form
where Qi represents the generalized force in degree of freedom i arising from external forces and internal dissipative forces. As discussed in Sec. I, there has been much interest in developing a form of the Lagrange approach that incorporates dissipative effects into the Lagrangian, and this is achieved here by defining a new Lagrangian in the form
where si(t), i = 1, 2, …, N, represents a set of adjoint coordinates; these coordinates can be viewed as a generalization of the dual model employed by Bateman.9 The summation convention is employed in Eq. (3) and all subsequent equations so that a repeated index indicates summation over that index. The action integral is now defined as the time integral of the Lagrangian between two times t0 and t, subject to specified values of the coordinates q and s at the beginning and end of this time interval, so that
where S is termed the principle function. It is readily shown that the first variation of the action integral with respect to the generalized and adjoint coordinates is given by
Noting that qi and si are specified at the start and end times, it can be seen that the first term on the right-hand side of Eq. (5) is zero. The complete expression will be zero, providing
It is readily shown that Eq. (6) is identical to Eq. (2), and thus, setting the first variation of the principle function to zero achieves two things: (i) the dynamical equations of motion of the system are recovered, and (ii) a set of equations of motion is derived for the adjoint coordinates.
As an example, in the very simple case of a damped linear system, the Lagrangian has the form
where M, C, and K are the mass, damping, and stiffness matrices, respectively, and F is a vector of external forces. Equations (6) and (7) in this case yield
Equation (9) is the well-known set of equations that govern the system, while Eq. (10) states that the adjoint coordinates satisfy a time reversed homogeneous set of equations.
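Equations (9) and (10) are not reproduced above; based on the description here, they presumably take the form

Mq̈ + Cq̇ + Kq = F(t),  Ms̈ − Cṡ + Ks = 0,

with C replaced by its transpose in the second equation if the damping matrix is asymmetric (this transpose is an assumption on my part). The second set has negative damping and therefore decays only when run in reverse time, consistent with the remarks made in Sec. I.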
Generalized momenta pi associated with the coordinates qi, and ri associated with the coordinates si, can be defined as
The generalized momenta defined by Eq. (11) coincide with those employed in the classical version of Lagrangian mechanics. It is clear from Eq. (14) that in the case of a nonlinear system, the adjoint coordinates satisfy a linear set of equations with time varying coefficients, and that these coefficients depend on the trajectories of the generalized coordinates qi. This situation is analogous to that which arises in the constrained optimization of dynamic systems using the adjoint method,15 where the adjoint coordinates (Lagrange multipliers) satisfy a set of equations very similar to Eq. (14). For future reference, it can be noted that the Lagrangian and the principle function can be written very simply in terms of the generalized momenta and the adjoint coordinates so that
To summarize the present section, it has been shown that if the Lagrangian and the principle function are defined, respectively, by Eqs. (3) and (4), then putting the first variation of the principle function equal to zero yields the equations of motion of both the physical system and the adjoint system. Dissipative and applied forces are both included in the Lagrangian, and so there is no need for a separate consideration of generalized forces. This approach is extended to Hamiltonian dynamics in Sec. III.
It can be noted that the Lagrangian for the linear system, Eq. (8), differs from that employed by Bateman and others9–11 for a single degree of freedom system. In the present form, the damping term has an asymmetry, which can be avoided by replacing it with a symmetrized version, as done by Bateman.9 This approach is not taken here for two reasons: (i) it is not readily generalized to nonlinear systems, and (ii) with the present approach, the momentum pj is exactly the physical momentum, but this is not the case for the Bateman version of the Lagrangian. The second point ensures that the results obtained here for q and p are entirely physical.
III. HAMILTONIAN FORMULATION
The classical Hamiltonian is written in terms of the classical Lagrangian in the form1
and the well-known Hamilton equations are
Equation (18) is identical to Eq. (11), and Eq. (19) is the equation of motion of the system, Eq. (13). By analogy with Eq. (17), it can be postulated that the Hamiltonian associated with the generalized Lagrangian of Sec. II is
where it can be noted that the independent variables are now taken to be the generalized coordinates and momenta so that the velocities are functions of these items, i.e.,
To demonstrate that Eq. (20) is indeed the appropriate expression for the generalized Hamiltonian, the function can be differentiated with respect to each of the independent variables to yield
where use has been made of the equations that appear in Sec. II. Collecting these results together gives the set of Hamilton equations for the system in the form
It can be verified that these equations reproduce Eqs. (11)–(14) of Sec. II, thus confirming that the Hamiltonian given by Eq. (20) is the appropriate generalization of the classical Hamiltonian. The “cross” nature of Eqs. (27)–(30) can be noted, in the sense that the Hamiltonian must be differentiated with respect to the adjoint variables to yield the rate of change of the physical variables, and vice-versa.
In the classical Hamiltonian approach, the rate of change of any function of the generalized coordinates can be expressed in terms of Poisson brackets.1 This approach can be generalized here by noting that for any function f(q, p, s, r, t), partial differentiation yields
where the Poisson bracket is defined as
By putting f = H, these equations imply that if the Hamiltonian is not explicitly dependent on time (i.e., ∂H/∂t = 0), then it is a conserved quantity. This generalizes the well-known corresponding result for the classical Hamiltonian of a conservative system.
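Equation (32) is not reproduced above; a form consistent with the cross structure of the Hamilton equations noted in connection with Eqs. (27)–(30) would be (this is a reconstruction rather than the typeset equation)

{f, g} = (∂f/∂qi)(∂g/∂ri) − (∂f/∂ri)(∂g/∂qi) + (∂f/∂si)(∂g/∂pi) − (∂f/∂pi)(∂g/∂si),

so that df/dt = ∂f/∂t + {f, H}, and the choice f = H gives dH/dt = ∂H/∂t because {H, H} = 0.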
As an example, the generalized Hamiltonian corresponding to the linear multi-degree of freedom system considered in Sec. II is
The Hamiltonian will not depend explicitly on time if the forcing term is zero, and in that case, it will be conserved. Physically, the decay of the physical coordinates due to damping is compensated for by the growth of the adjoint coordinates due to damping (in reverse time), and the Hamiltonian is constant.
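The conservation property can be illustrated numerically. The sketch below, which is illustrative rather than the computation used in the paper, integrates a single degree of freedom version of Eqs. (9) and (10) with F = 0 and monitors the quantity mq̇ṡ + kqs; treating this as the conserved Hamiltonian is an assumption here, since Eq. (33) is not reproduced above, but the quantity is exactly conserved for the stated pair of equations, and the variable names and RK4 integrator are likewise illustrative choices.

import numpy as np

# Physical system:  m*q'' + c*q' + k*q = 0
# Adjoint system:   m*s'' - c*s' + k*s = 0   (negative damping, i.e., reverse time)
m, c, k = 1.0, 0.3, 4.0

def rhs(z):
    q, qd, s, sd = z
    return np.array([qd, -(c*qd + k*q)/m, sd, (c*sd - k*s)/m])

def rk4_step(z, dt):
    k1 = rhs(z)
    k2 = rhs(z + 0.5*dt*k1)
    k3 = rhs(z + 0.5*dt*k2)
    k4 = rhs(z + dt*k3)
    return z + dt*(k1 + 2.0*k2 + 2.0*k3 + k4)/6.0

def hamiltonian(z):
    q, qd, s, sd = z
    return m*qd*sd + k*q*s   # candidate conserved quantity for the unforced case

z = np.array([1.0, 0.0, 0.5, -0.2])   # arbitrary initial conditions
dt, nsteps = 1.0e-3, 20000
H0 = hamiltonian(z)
for _ in range(nsteps):
    z = rk4_step(z, dt)
print("relative drift in H:", abs(hamiltonian(z) - H0)/abs(H0))
# The drift is at the level of the integration error, even though the physical
# response has decayed and the adjoint response has grown over the run.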
As a final point in the present section, it can be noted that if the function f in Eq. (31) depends only on the physical coordinates, then the result is
This equation can be expressed in the modified form
in view of the fact that
IV. DYNAMIC SYSTEMS
The example of a linear dynamic system was considered in Secs. II and III. For what follows, it is helpful to introduce notation for a more general dynamic system: given that the kinetic energy is always a quadratic function of velocity, a very wide class of systems can be represented in the form
Here, mij(q) is an inertia coefficient, while the remaining terms represent, respectively, a set of internal forces that can include dissipation and nonlinearity and a set of external forces that can include parametric components. The generalized Lagrangian for this case is given by
where
The generalized momenta are given by pi = mij(q)q̇j, and the Hamiltonian has the form
Here, the coefficient appearing in this expression is the ijth entry of the inverse of the tensor mij. The equations of motion of the system follow from either Eq. (6) or Eq. (30) and have the form
V. FURTHER PROPERTIES OF THE PRINCIPLE FUNCTION
It follows immediately from the first term on the right-hand side of Eq. (5) that the derivatives of the principle function S(q, s, t; q0, s0, t0) with respect to the physical and adjoint coordinates are given by
The partial derivative of the principle function with respect to the final time follows from Eq. (4) as
where it has been noted that in order for the specified boundary conditions to apply at the perturbed time t + δt, there must be a change of −q̇iδt in qi(t), and likewise for the adjoint coordinates. Similarly, the partial derivative of the principle function with respect to the initial time is
It follows from Eqs. (44), (45), and (48) that the principle function is governed by a first order nonlinear partial differential equation in the form
This equation is the direct analog of the Hamilton–Jacobi equation of classical mechanics.1 In principle, Eq. (50) can be solved subject to the boundary conditions on the coordinates, and the generalized momenta can then be recovered from Eqs. (44)–(47); in practice, finding a solution to Eq. (50) will be difficult for all but the simplest problems.
Thus far, the principle function has been expressed as a function of the initial and final values of the physical and adjoint coordinates. This implies that the physical motion of the system is governed by a two point boundary value problem, in which the N coordinates are specified at both the initial and final times. Following Eq. (16), the principle function can be written in a form that makes the boundary conditions explicit so that
where it is clear that the generalized momenta depend on the specified boundary conditions. In practical problems, it is much more common to specify the boundary conditions exclusively at the initial time so that the N values of the coordinates and the N values of the momenta are known at t0. There is no reason why Eq. (16) cannot be used for alternative boundary conditions, but care must be taken with the notation to make it clear that the principle function is now a function of different variables. A very useful set of boundary conditions to consider is that in which the physical coordinates and momenta are specified at the initial time, but the adjoint coordinates are specified at the initial and final times. In this case, the principle function can be written as
and it follows from Eq. (16) that the partial derivative of the principle function with respect to time is given by
VI. STATISTICAL DYNAMICS
If the system has random initial conditions or if it is subjected to random forcing, then there is interest in determining the statistics of the response. A key quantity is the conditional joint probability density function (JPDF) of the generalized coordinates and momenta for specified initial conditions q0 and p0; if this function is known and the statistical distribution of the random initial conditions is specified, then the statistical distribution of the response can be found. In what follows, the response JPDF is related to the principle function defined by Eq. (52), and this relationship enables governing equations to be established for the response statistics. As discussed below, rather than dealing directly with the JPDF, it is mathematically expedient to consider the Fourier transform of the function over the momenta, and for ease of reference, the resulting function is termed the M-function in what follows.
A. The M-function
The M-function is defined here as
where ℏ is any constant that is of dimension m2 kg/s; in later sections, the case in which ℏ is Planck’s constant will be discussed, but, in general, there is no restriction on the possible value of the constant. The M-function is a Fourier transform of the JPDF over the momenta, and the corresponding inverse transform is
The M-function can be connected to the analysis of Sec. V by noting that the exponent in Eq. (54) contains the principle function, as given by Eq. (52), so that
Here, the expectation E[] is taken over the ensemble of possible random trajectories of the system, with specified initial conditions q0 and p0 and with specified endpoints s and s0 on the adjoint variables—if the expectation is replaced by an integral over the JPDF of the response, then Eq. (54) is recovered.
There is a (weak) analogy between Eq. (56) and the Feynman path integral approach to quantum mechanics.26 In that approach, the quantum wave function ψ(q) is related to the sum over all possible trajectories (with specified endpoints) of the quantity exp(iS1/ℏ), where S1 is the standard principle function. There is a slightly stronger link with semi-classical physics, in which only classical trajectories are included in the sum;27,28 however, were Eq. (56) replaced by E[exp(iS/ℏ)], with S being the present principle function under two point boundary conditions, then the expectation would be taken over the momenta at the initial and final times, and the equation would yield the Fourier transform of the JPDF of these momenta (which would be a function of s and s0), rather than a wave function. It will be shown in Sec. VII that the M-function has other, stronger, connections to quantum mechanics.
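Since Eqs. (54)–(56) are not reproduced above, it is worth recording the structure implied by the surrounding description. Up to the sign of the exponent, any normalization constant, and any additional phase factor involving the initial data (all of which are assumptions here), the transform pair is presumably

M(q, s, t) = ∫ ρ(q, p, t) exp(ipjsj/ℏ) dp,  ρ(q, p, t) ∝ ∫ M(q, s, t) exp(−ipjsj/ℏ) ds,

with the conditioning on the initial data left implicit; Eq. (56) then follows by writing the exponent in terms of the principle function of Eq. (52) and averaging over the random trajectories.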
B. The Fokker–Planck–Kolmogorov equation
It is shown in this section that the definition of the M-function, together with the properties of the principle function, leads to a straightforward derivation of the Fokker–Planck–Kolmogorov (FPK) equation that governs the JPDF of the system response when the excitation is Gaussian white noise in the form
Here, the terms ξk(t) represent a set of uncorrelated white noise processes with the property
The dependency of the force in Eq. (57) on the generalized coordinates and momenta covers the general case of parametric excitation. Initially, it can be noted from Eqs. (56) and (53) that the time derivative of the M-function can be written as
where
It is shown in the Appendix that the term in Eq. (59) that involves the excitation force can be expressed as
where
Equation (62) can be re-expressed by (i) writing the expectation on the right-hand side as the integral over q and p of the JPDF multiplied by the quantity inside the expectation, (ii) expressing the generalized velocities in terms of the generalized momenta, (iii) integrating all of the terms by parts so that the derivatives no longer operate on χ, and (iv) taking the inverse Fourier transform of the equation over s, in the sense of Eq. (55). These steps are straightforward, if lengthy, and lead to the final result,
C. Partial differential equation for the M-function
A partial differential equation for the M-function can be derived by noting that the derivative with respect to qj that appears in dj on the right-hand side of Eq. (62) can be replaced by a derivative with respect to the corresponding argument of the M-function, and the differential operator can then be brought outside the expectation. This yields
where for simplicity of notation the argument of the M-function has been replaced by q, so the equation relates to M(q, s, t). Equation (65) is not quite as straightforward as it appears since the functions Ajk and gj can depend on the generalized momenta—in this case, the equation is the result obtained when the following substitutions are made:
A Taylor series expansion of the functions in terms of the momenta will then yield the differential equation, which will be of infinite order if the functions are not polynomials. If, for example, the system has non-parametric excitation, a linear damping matrix cjk, and a constant mass matrix, then
where hj = ∂V/∂qj. In this case, the equation for the M-function becomes
Equations (65) and (71) relate to the case in which the system is subjected to white noise excitation. If the system has deterministic forcing and the Hamiltonian is given by Eq. (42), i.e.,
then it can readily be shown from Eq. (59) that the equation for the M-function can be obtained by replacing the momenta and adjoint momenta in the Hamiltonian by operators so that
to give
This result is directly analogous to the Schrödinger equation in quantum mechanics, although the present analysis is purely classical. In fact, if ℏ is replaced by 2ℏ, then Eq. (75) is exactly the Schrödinger equation for the wave function ψ(q, s, t) associated with the Hamiltonian, and the solutions to this equation have been studied extensively in the past for the case of the Bateman formulation of a damped linear oscillator.21–24 In Sec. VII, the solutions of Eq. (75) for the classical M-function are explored, and in Sec. VIII, the solutions for ψ(q, s, t) are considered. Although the first part of the present work has been motivated to a great extent by damped systems, Secs. VII and VIII consider exclusively the undamped case, which is analytically tractable and yields results of physical interest.
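For orientation before proceeding, note that Eqs. (73)–(75) are not reproduced above; the structure implied by the description is presumably (a reconstruction, with signs depending on the Fourier convention adopted in Eq. (54))

pj → −iℏ ∂/∂sj,  rj → −iℏ ∂/∂qj,  iℏ ∂M/∂t = H(q, s, −iℏ ∂/∂s, −iℏ ∂/∂q)M,

reflecting the fact that multiplication by pj under the transform of Eq. (54) corresponds to differentiation with respect to the conjugate adjoint coordinate sj, with the analogous substitution assumed for rj.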
VII. THE M-FUNCTION AND THE SCHRÖDINGER EQUATION
In this section, the solution to Eq. (75) is explored for an undamped system, and it is found that for a linear system, the M-function can be expressed in terms of solutions to the Schrödinger equation relating to the physical system q, regardless of the value of ℏ. It is also found that this result holds for a nonlinear system providing ℏ is small, in a sense that is defined in what follows. The case of a linear system is considered in Subsection VII A, and the analysis is extended to a nonlinear system in Subsection VII B.
A. Linear systems
If the differential equation for the M-function, Eq. (75), is applied to the case of the linear system described by Eq. (9), then the result is
where the damping matrix has been omitted to limit the discussion to an undamped system. Although Eq. (75) was derived in Sec. VI for the case of deterministic initial conditions, the equation is also valid for the case in which the physical system is given an initial distribution ρ(q0, p0, t0) together with initial conditions s0 = 0 on the adjoint coordinates. In this case, the M-function is interpreted as the average value taken over the initial distribution—Eqs. (54) and (55) can each be multiplied by ρ(q0, p0, t0) and integrated over q0 and p0 to yield the relationship between this version of the M-function and the response JPDF; none of the subsequent equations are affected by this change. A solution to Eq. (76) can be sought by means of the transformation,
This transformation would be canonical if a suitable scaling factor were introduced on the right-hand side of each equation, but there is no need for the transformation to be canonical since the aim is merely to facilitate a solution to the differential equation, and the following algebra is tidier without the scaling factor. Equations (76) and (77) yield
A separable solution can now be sought in the form
and Eq. (78) then yields
Equation (80) is exactly the Schrödinger equation associated with the physical system,29 while Eq. (81) is the Schrödinger equation obtained under a time reversal. The solution for Mx yielded by Eq. (80) will be the quantum wave function ψ of the physical system, evaluated with the argument x. Likewise, the solution for My yielded by Eq. (81) will be the complex conjugate of the wave function, evaluated with the argument y. To consider specifically the case of an unforced system (Fj = 0), a solution to Eq. (80) can be sought in the form
which yields an eigenproblem with eigenvalue En and eigenfunction ψn so that
As an aside, for a single degree of freedom system with mass m and natural frequency ω0, the above equation reduces to the well-known Schrödinger equation for the harmonic oscillator
and the eigenvalues and eigenfunctions are given by29
where Hn is the nth Hermite polynomial and cn is a normalization constant. The equations of motion of the multi-degree of freedom system can be diagonalized by a modal transformation, and hence, the system can be represented as a set of uncoupled harmonic oscillators. This means that all of the wave function solutions to Eq. (83) will have the form of Eq. (86), with ω0 replaced by the appropriate natural frequency. This detail is mentioned for completeness but is not required for the analysis which follows.
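For reference, the standard harmonic-oscillator results quoted from Ref. 29 are, in the usual notation (and leaving aside the factor-of-two convention relating the constant used here to that in Eq. (75)),

En = ℏω0(n + 1/2),  ψn(q) = cnHn(q√(mω0/ℏ)) exp(−mω0q²/(2ℏ)),  n = 0, 1, 2, …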
Given Eq. (79), the solution to Eq. (76) can be expanded in terms of the eigenfunctions yielded by Eqs. (80) and (81) so that
The eigenfunctions are orthogonal, and hence, Eq. (87) can readily be inverted to yield an expression for the expansion coefficients anm in terms of the M-function at some specified time ts in the form
In practice, the boundary condition on a physical problem will be expressed as the JPDF ρ(q, p, ts) at ts, and the M-function to be used in Eq. (88) is the Fourier transform of this distribution. It can be demonstrated from Eqs. (54) and (77) that M(y, x, t) = M*(x, y, t), and hence, Eq. (88) implies that the expansion coefficients must be Hermitian, in the sense that
An eigenvalue analysis of the matrix anm yields the eigenvalues λr and the associated eigenvectors θnr; the fact that the matrix is Hermitian implies that the eigenvalues are real. The matrix can be expressed in terms of the eigenvalues and eigenvectors in the form
and Eq. (87) can then be written as
where
Each term in the summation in Eq. (93) has the form of the Wigner function,18 which is an established construction that gives an expression for the JPDF of the displacement and momentum in terms of the quantum wave function. This form has arisen naturally here as an exact solution to the classical equations.
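For a single degree of freedom, the standard Wigner construction referred to here takes the form (Ref. 18; the version appearing in Eq. (93) may differ by the factor-of-two convention on ℏ noted at the end of Sec. VI)

W(q, p) = (1/(2πℏ)) ∫ ψ*(q + u/2)ψ(q − u/2) exp(ipu/ℏ) du,

which integrates over p to |ψ(q)|² and over both q and p to unity.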
The probability density function of the generalized coordinates can be found by integrating Eq. (93) over the momenta, or equivalently by putting s = 0 in Eq. (91). The result is
By integrating Eq. (93) over the generalized coordinates, it can be shown that the probability density function of the generalized momenta is given by
where the momentum wave function appearing in Eq. (95) is the Fourier transform of the wave function
Equation (94) is mathematically identical to a quantum description of the system.29 The wave function ϕr(q, t) is a linear superposition of the eigenfunctions of the Schrödinger equation and is therefore also a solution of that equation, and Eq. (94) represents a mixture of quantum states with λr being the probability of being in state r (the probability density function associated with the state being the modulus squared wave function). Moreover, it can readily be confirmed that the sum of the eigenvalues (probabilities) λr, being the trace of the matrix anm, will be unity, provided the function M(x, y, ts) that appears in Eq. (88) is associated with a JPDF that has unit area. Equation (95) is also in agreement with quantum mechanics, stating that the wave function for the momenta is equal to the Fourier transform of the wave function for the generalized coordinates. For the single degree of freedom case, if there is a single term in the summations in Eqs. (94) and (95), then it follows from the properties of Fourier transforms that the standard deviations of the response satisfy the condition
which is a statement of the uncertainty principle.29 Equation (97) continues to hold if there are multiple terms in the summation, but only if every coefficient λr is a positive probability. Although the foregoing analysis has followed a logical sequence of steps, on reflection, Eq. (97) is not credible as a general result since the analysis has considered an arbitrary value of ℏ—the uncertainty principle cannot possibly apply for every value of this constant. As discussed in what follows, this apparent difficulty can be resolved by considering the nature of the expansion coefficients anm and the way in which these coefficients depend on the constant ℏ.
The matrix of coefficients anm is necessarily Hermitian, as stated in Eq. (89), but, in general, there is no reason why it must be positive semidefinite. This means that the eigenvalues λr can include negative values, and in that case, the uncertainty condition, Eq. (97), does not follow from Eqs. (94) and (95). Moreover, negative eigenvalues cannot be interpreted as probabilities in Eq. (94), and therefore, the equation lies outside the tenets of quantum mechanics. The conditions under which anm will be positive semidefinite can be explored by rewriting Eq. (88) in the form
It then follows that for any vector v
where
The matrix anm will be positive semidefinite if the result given by Eq. (99) is greater than or equal to zero for any vector v, or equivalently for any function θ. If the JPDF ρ(q, p, ts) is approximately constant over an interval of width 2Δ so that
then the integral over pj that appears in Eq. (99) can be performed over this interval to convert the exponential term in the integrand to a term that is proportional to sinc(2Δsj/ℏ). Now, if (Δ/ℏ) ≫ 1, then the sinc function will be approximately proportional to a Dirac delta function, and as a result, Eq. (99) can be approximated as
Hence, the matrix anm will be positive semidefinite if (Δ/ℏ) ≫ 1, or in physical terms, if the change in ρ(q, p, ts) is extremely small when p is changed by an amount of order ℏ. This situation will arise, for example, if ρ(q, p, ts) is a distribution describing motion at the classical scale and ℏ is Planck’s constant. Another way of viewing the situation is that Eq. (102) will become exact as ℏ tends to zero and so the classical and quantum descriptions agree in this limit, as is well known.
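The limit invoked in this argument is the standard distributional result

sin(ax)/(πx) → δ(x) as a → ∞,

so that for (Δ/ℏ) ≫ 1 the sinc factor arising from the integral over pj acts, up to a constant, as a delta function in sj.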
The foregoing points can be illustrated by considering the simple example of a single degree of freedom oscillator with unit mass and unit natural frequency so that m = ω0 = 1 in Eqs. (84)–(86). The JPDF at time ts is taken to be a joint Gaussian distribution so that
and the mean and standard deviations are given the values μq = −μp = σq = σp = 0.5 (where SI units are adopted for all quantities). The correlation coefficient between the displacement and the momentum is set to γ = 0.7. The M-function can be derived analytically as the Fourier transform of the JPDF over the momentum, and the entries of the coefficient matrix anm can then be evaluated numerically by using Eq. (88) in combination with the wave functions given by Eq. (86). The accuracy of the solution can be assessed by reconstructing the JPDF using Eq. (87) and comparing this to the specified function. The results obtained for the case ℏ = 1.5 are shown in Fig. 1, which presents a contour plot of the original JPDF and the reconstructed function—the two sets of contour lines are extremely close.
The eigenvalues λr of the coefficient matrix are shown in Fig. 2, where it is apparent that both positive and negative values are obtained. This is not surprising: were all of the eigenvalues positive, then the reconstructed JPDF would satisfy the uncertainty relation given by Eq. (97), but the original JPDF does not satisfy this relation for ℏ = 1.5.
The corresponding results for the case ℏ = 0.05 are shown in Figs. 3 and 4, where it can be seen that the contours of the JPDF show the same level of agreement as for the previous case, but the eigenvalues λr are now all positive. This means that the reconstructed JPDF will satisfy the uncertainty relation, which is not contradicted by the original JPDF in this case. These results illustrate the tendency for the matrix anm to become positive semi-definite with reducing ℏ, as predicted by the previous discussion. The probability density function of the displacement is given as a function of time by Eq. (94), and this equation has been employed for the present example for each of the two values ℏ = 0.05 and ℏ = 1.5. The resulting probability density function is shown in Fig. 5, where it can be seen that the results for ℏ = 0.05 (curves) are in excellent agreement with the results for ℏ = 1.5 (symbols). The curves are shown in time increments of 2π/9, and the sequence of the curves is indicated by the numbers of Fig. 5. The fact that the curves can be reconstructed by using different values of ℏ, which lead to very different values of the coefficients anm, demonstrates the arbitrary nature of ℏ in the present context.
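A compact numerical check of this positivity behavior can be made by assembling the coefficient matrix directly from the specified Gaussian JPDF. The sketch below is illustrative rather than the computation used to produce Figs. 1–5: it adopts the standard Weyl/Wigner convention for the constant ℏ (the convention used in Eq. (88) may differ by a factor of two, which moves the threshold but not the qualitative conclusion), and the grid sizes, basis truncations, and function names are assumptions. It forms the kernel obtained by Fourier transforming the Gaussian JPDF over the momentum, projects it onto the harmonic oscillator eigenfunctions for m = ω0 = 1, and examines the eigenvalues of the resulting Hermitian matrix.

import numpy as np

# Gaussian JPDF rho(q, p) at time ts, with the parameter values used in the example.
mu_q, mu_p, sig_q, sig_p, gam = 0.5, -0.5, 0.5, 0.5, 0.7

def kernel(x, xp, hbar):
    # Kernel obtained by Fourier transforming rho((x + x')/2, p) over p; the integral
    # is analytic because the conditional density of p given q is Gaussian.
    q = 0.5*(x + xp)
    tau = x - xp
    mu_cond = mu_p + gam*(sig_p/sig_q)*(q - mu_q)      # E[p | q]
    var_cond = (sig_p**2)*(1.0 - gam**2)               # Var[p | q]
    f_q = np.exp(-0.5*((q - mu_q)/sig_q)**2)/(np.sqrt(2.0*np.pi)*sig_q)
    return f_q*np.exp(1j*mu_cond*tau/hbar - 0.5*var_cond*(tau/hbar)**2)

def oscillator_eigenfunctions(x, nmax, hbar):
    # psi_n(x) for m = omega0 = 1, via the stable recurrence for normalized
    # Hermite functions: psi_n(x) = hbar**(-1/4) * h_n(x/sqrt(hbar)).
    u = x/np.sqrt(hbar)
    psi = np.zeros((nmax + 1, x.size))
    psi[0] = np.exp(-0.5*u**2)/(np.pi*hbar)**0.25
    psi[1] = np.sqrt(2.0)*u*psi[0]
    for n in range(1, nmax):
        psi[n + 1] = np.sqrt(2.0/(n + 1))*u*psi[n] - np.sqrt(n/(n + 1.0))*psi[n - 1]
    return psi

for hbar, nmax, L in [(1.5, 30, 10.0), (0.05, 80, 6.0)]:
    x = np.linspace(-L, L, 1201)
    dx = x[1] - x[0]
    psi = oscillator_eigenfunctions(x, nmax, hbar)
    K = kernel(x[:, None], x[None, :], hbar)
    a = dx*dx*(psi @ K @ psi.T)        # a[n, m] = <psi_n | kernel | psi_m>
    a = 0.5*(a + a.conj().T)           # enforce Hermitian symmetry
    lam = np.linalg.eigvalsh(a)
    print(f"hbar = {hbar}: trace = {lam.sum():.3f}, "
          f"min eigenvalue = {lam.min():.4f}, max eigenvalue = {lam.max():.4f}")
# Expected behavior: the trace is close to unity in both cases, negative eigenvalues
# appear for hbar = 1.5, and the eigenvalues are non-negative (to within the
# truncation and quadrature error) for hbar = 0.05.

For a Gaussian JPDF the outcome can also be anticipated analytically: in the convention adopted in the sketch, the matrix is positive semidefinite precisely when σq²σp²(1 − γ²) ≥ (ℏ/2)², which fails for ℏ = 1.5 and holds for ℏ = 0.05 with the parameter values above.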
To summarize the results obtained thus far, Eqs. (93)–(96) are a mathematically exact description of the response of the system for any value of ℏ. If the matrix of expansion coefficients anm is positive semidefinite, then Eqs. (94)–(96) are consistent with quantum mechanics, and the uncertainty principle, Eq. (97), will apply. If the motion is specified by an initial distribution ρ(q, p, ts) specified at a time ts, then the matrix will always become positive semidefinite as ℏ tends to zero.
The foregoing arguments start from the classical situation in which a valid JPDF is specified at some time. The other extreme would be to start from Eq. (93) and specify the probabilities λr. If only one term is included in the summation, then the JPDF is said to be given by a Wigner function,18 and it is well known that this function can predict negative values of the JPDF in certain regions.19,20 Hence, the foregoing analysis encompasses classical situations, which contradict the quantum equations by giving negative probabilities λr (for insufficiently small ℏ), and quantum situations, which contradict the fact that the classical JPDF should be everywhere positive (for example, if anm is of rank 1). However, it also includes solutions where both theories are valid, in the sense that the classical JPDF is everywhere positive and the quantum probabilities λr are also positive.
The Wigner function was originally introduced as an ansatz to produce a JPDF that obeys the rules of quantum mechanics, and it is already established in the literature18–20 that for a linear system, the resulting JPDF satisfies the Liouville equation (i.e., the FPK equation for unforced and undamped motion). The existing literature tends to proceed from quantum mechanics to the Liouville equation, via the Wigner function, whereas the present analysis has effectively progressed from exact solutions to the Liouville equation (or equivalently the M-function equation) to the equations of quantum mechanics and the Wigner function.
As a final comment in this section, it can be shown from Eq. (93) that the average value of the product of the displacement and momentum for coordinate j is given by
where no sum over j is intended in this case. This result does not agree with the quantum mechanical commutator relation, which in operator notation states that q̂p̂ − p̂q̂ = iℏ. Equation (104) is equivalent to the use of the “average” operator (q̂p̂ + p̂q̂)/2; because the equation arises from classical mechanics, it is not able to predict the non-classical behavior of the operators. In fact, this is a well-known property of the Wigner function, and it can be overcome by applying the Weyl transform to the classical operators.20 Of course, the issue becomes unimportant as ℏ → 0.
B. Nonlinear systems
If Eq. (75) for the M-function is applied to an undamped nonlinear system, then Eq. (76) is replaced by the more general result
where forcing has been neglected to slightly reduce the following algebra. The final term in Eq. (105) can be re-expressed by using a Taylor series expansion of the potential energy to yield
If ℏ is small, then the third term on the right-hand side of Eq. (107), and all higher order terms, can be neglected, and returning to the original coordinates, the equation for the M-function can be approximated as
The change of variables given by Eq. (77) can now be applied to yield
and this equation can be solved by the method of separation of variables, following exactly the steps outlined in Subsection VII A. For example, the Schrödinger equation given by Eq. (83) is replaced by
which is exactly the Schrödinger equation for the nonlinear system. All of the results of Sec. VII A then apply. The main difference is that the equations are mathematically exact for a linear system, whereas the results are approximate for a nonlinear system due to the fact that higher order terms have been neglected in Eq. (107) on the basis that ℏ is small. These results are consistent with the literature, where it has been established that for a nonlinear system, the Wigner function will only satisfy the Liouville equation if ℏ tends to zero.18–20 The issue has been approached from the reverse direction here, where it has been shown that solutions to the Liouville equation are expressible in terms of Wigner functions only as ℏ tends to zero.
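For reference, the result from Refs. 18–20 referred to here is, for a single degree of freedom and in the standard Wigner-function convention, the quantum Liouville (Moyal) equation

∂W/∂t = −(p/m)∂W/∂q + (∂V/∂q)(∂W/∂p) − (ℏ²/24)(∂³V/∂q³)(∂³W/∂p³) + O(ℏ⁴),

which reduces to the classical Liouville equation as ℏ → 0; the corresponding correction terms in the M-function representation of Eq. (107) are related to these through the transform of Eq. (54) and are not reproduced here.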
VIII. THE WAVE FUNCTION AND THE SCHRÖDINGER EQUATION
Numerous authors have used the Bateman Hamiltonian to investigate the quantum mechanics of the damped harmonic oscillator. In this case, Eq. (75) is replaced by
In the present context, this means that the constant ℏ/2 previously used here is replaced by ℏ, and the solution to the equation is interpreted as the quantum wave function ψ(q, s, t) rather than the M-function. It can be noted that the adjoint system has been employed in the principle function and the Hamiltonian solely as a means of generating a variational principle, which includes the effect of damping and which yields the correct equations of motion for the physical system. It has been shown in Sec. III that this approach does indeed yield the correct equations for the physical system in the classical case. In the quantum case, the emphasis of previous work has been on the solution of Eq. (111) for a damped system, but it is also interesting to consider the undamped case to see whether, as in the classical case, the equation can be used to recover the known results for the physical system. This issue can be readily explored here since the solution to Eq. (111) in the absence of damping can be written in the form of Eq. (87) with a suitable adjustment to the value of ℏ. For a single degree of freedom system, the wave functions and the energy levels are given by Eqs. (85) and (86), and Eq. (87) then yields
The physical implications of the foregoing result can be investigated by considering two special cases. In the first case, the coefficients anm are specified so that
where
and α is a normalization constant. The reason for choosing this case is that the formula30
can be applied to yield the following simple result at t = 0,
Moreover, at t = π/(2ω0), the result is
Over every quarter cycle of the natural period of the oscillator, the wave function oscillates between the form given in Eq. (116) and that given in Eq. (117), albeit with varying phase. This means that although the wave function ψn(q) appears at t = 0, the time dependency of the function is very different to that obtained from the conventional Schrödinger equation for the oscillator. The situation is even more complicated when it is recalled that the operator for the momentum p is not −iℏ∂/∂q but rather −iℏ∂/∂s.
A second special case of Eq. (112) that can be considered is the single term solution
In this case, Eq. (115) can be employed to yield
The joint probability density function of the response is given by the square of the wave function, and this can be integrated over s to yield the probability density function of the physical displacement in the form
where the orthogonality of the wave functions has been noted. Although Eq. (120) can also be obtained from the conventional Schrödinger equation, the difference here is that the probabilities λr cannot be specified independently since fixed values arise from Eq. (119).
Given the foregoing examples, it is clear that Eq. (111) does not generate conventional quantum behavior for the physical degree of freedom q. In most previous solutions to this equation, a canonical transformation similar to Eq. (77) is employed (with the addition of a scaling factor), which, in the undamped case, yields conventional quantum behavior in the new coordinate x. This does not address the original problem of concern, however, which is the quantum dynamics of q. These issues raise questions as to the physical significance of any solution to Eq. (111), with or without the presence of damping. This point has been noted previously,24 and a modified Hamiltonian has been proposed in Ref. 25. In contrast, the solutions to Eq. (111) are entirely physical in a classical sense if the equation is taken to govern the M-function rather than the wave function.
IX. CONCLUSIONS
This work has considered the extension of the conventional Lagrange and Hamilton formulations of dynamics to cover non-conservative systems, and the resulting approach has led to an investigation of some of the relations between classical and quantum mechanics. The main findings of the work are summarized below.
The current approach is based on the introduction of an adjoint system, and the Lagrange and Hamilton equations cover both the physical system and the adjoint system. The Lagrangian is given by Eq. (3), the principle function by Eq. (16), and the Hamiltonian by Eq. (20). The Hamilton equations that govern the motion are given by Eqs. (27)–(30), and the time derivative of any quantity is expressed in terms of Poisson brackets in Eq. (31). If the Hamiltonian does not depend explicitly on time, then it is conserved. The Lagrangian and Hamiltonian employed here differ from the Bateman approach in that the physical momenta are always given by the conventional formulas, regardless of the presence of damping.
If the displacements of the physical and adjoint systems are specified at the initial and final times, then the derivatives of the principle function are given by Eqs. (44)–(48), and the principle function satisfies the Hamilton–Jacobi equation, Eq. (50). If the displacements and momenta of the physical system are specified at the initial time and the displacements of the adjoint system are specified at the initial and final times, then Eqs. (44)–(50) no longer apply and are replaced by Eqs. (52) and (53).
The M-function is defined in Eq. (54) as the Fourier transform over the momenta of the JPDF of the displacements and momenta of the physical system. This can be expressed as an expectation involving the principle function, as shown in Eq. (56), which has some similarity to the quantum path integral and to semi-classical physics.
As shown in Sec. VI B, the representation of the M-function given by Eq. (56) can be combined with the properties of the principle function to derive the Fokker–Planck–Kolmogorov equation for a system excited by white noise. This approach automatically generates the Stratonovich form of the equation when parametric excitation is present.
In the absence of random forcing, the differential equation that governs the M-function can be expressed in the form of Eq. (75), where the momentum terms in the Hamiltonian have been replaced with differential operators, as indicated in Eqs. (73) and (74).
For a linear conservative system, a separable solution of Eq. (75) leads to the Schrödinger equation associated with the physical system in both forward and backward time. The result is that the M-function can be expressed in terms of the quantum wave functions in the form of Eq. (87), or alternatively Eq. (91), and the JPDF of the physical displacements and momenta can be written as a sum of Wigner functions in the form of Eq. (93). It should be emphasized that all of these equations hold for an arbitrary value of ℏ, which was introduced as a dimensional constant in the definition of the M-function. The standard quantum equations for the probability density function of the displacements, Eq. (94), and the momenta, Eq. (95), follow from these equations, although there is no guarantee that the “state probabilities” λr are positive.
If the coefficient matrix anm that appears in the expansion of the M-function, Eq. (87), is positive semi-definite, then the probabilities λr that appear in Eqs. (94) and (95) are positive, and the uncertainty principle applies. It has been shown that for any specified JPDF, the coefficient matrix will become positive semi-definite as ℏ → 0, and in this case, the conditions of both classical mechanics and quantum mechanics are satisfied. Of course, it is well known that classical and quantum mechanics should agree as ℏ → 0, but most arguments behind this statement start at the quantum level and consider the transition toward classical mechanics. The present analysis has been purely classical, and the transition has been from classical mechanics to quantum mechanics.
For a linear system, the Schrödinger equation of the physical system arises as an exact consequence of the M-equation, while for a nonlinear system, the equation arises as an approximation when ℏ is small.
It is important to emphasize that the present approach is not a “derivation” of quantum mechanics: for the nonlinear case, it has been shown that the Schrödinger equation arises with additional terms that become negligible as ℏ → 0; in quantum mechanics, these additional terms are not just negligible, they are identically zero. In addition, the commutator property does not hold unless the Weyl transform is used,20 and this lies outside the present theory. There is nothing in the present equations to specify the actual value of ℏ and nothing about spin or the more advanced aspects of quantum theory.
If the Bateman Hamiltonian is used directly to derive the Schrödinger equation for the wave function, as in much previous literature,21–24 then the present study of an undamped system (Sec. VIII) shows that the conventional quantum behavior of the physical system is not captured. This issue has been discussed by previous authors,24,25 but it is novel to note that exactly the same Schrödinger equation (with a factor of two on ℏ) governs the classical M-function and generates meaningful classical results for the physical system.
As mentioned in the Introduction, many previous approaches have been developed to extend the Lagrange and Hamilton methodologies to non-conservative systems. One advantage of the present approach is the relatively straightforward way in which the definitions of the Lagrangian, the Hamiltonian, the principle function, the Hamilton–Jacobi equation, and the Poisson bracket have been extended via the use of adjoint variables. Another advantage is the way in which the principle function can be used, via the M-function, to explore the statistical dynamics arising from either random forcing or random initial conditions and the resulting connection that can be made with quantum mechanics.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.
APPENDIX: DERIVATION OF EQ. (61)
The integral in this expression can be replaced with a trapezium approximation, and the momentum can then be inserted into the principle function, Eq. (52), to yield
This result can be inserted into Eq. (A1), and the expression can then be evaluated by taking an average over the white noise processes. To facilitate this, it can be noted that for any set of Gaussian variables xkn and any set of functions vk(x), the following result holds:
In the present case, this result can be applied to Eq. (A1) by taking
so that
Furthermore, the relevant set of functions is
and differentiation yields
where it has been noted that
for any smooth function wkr(tn, tm).