A particle confined to an impassable box is a paradigmatic and exactly solvable one-dimensional quantum system modeled by an infinite square well potential. Here, we explore some of its infinitely many generalizations to two dimensions, including particles confined to rectangle-, ellipse-, triangle-, and cardioid-shaped boxes using physics-informed neural networks. In particular, we generalize an unsupervised learning algorithm to find the particles’ eigenvalues and eigenfunctions, even in cases where the eigenvalues are degenerate. During training, the neural network adjusts its weights and biases, one of which is the energy eigenvalue, so that its output approximately solves the stationary Schrödinger equation with normalized and mutually orthogonal eigenfunctions. The same procedure solves the Helmholtz equation for the harmonics and vibration modes of waves on drumheads or transverse magnetic modes of electromagnetic cavities. Related applications include quantum billiards, quantum chaos, and Laplacian spectra.
I. INTRODUCTION
Artificial intelligence has impacted our culture and livelihood dramatically since the turn of the millennium, from smartphones to chatbots to self-driving cars. Scientists have recently begun using machine learning techniques not only to improve our understanding of the world around us but also to change the way we approach scientific and computational methods, including the methods used to solve the fundamental differential equations that model physical phenomena in our world.
In previous work, physics-informed neural networks have been used to solve classical physics problems, with Lagrangian and Hamiltonian formalisms, for both ordered and chaotic dynamics.1–3 On a more fundamental level, methods have been developed to search for symmetries,4 conservation laws,5 and invariants within such dynamical systems.6 Furthermore, studies have been conducted on specific problems such as heat transfer,7 irreversible processes,8 and energy-dissipating systems.9 These techniques have even been applied to quantum systems.10–13 In this study, we use physics-informed neural networks to solve the quantum eigenvalue problem for particles confined to impassable planar boxes of diverse shapes.
Our work extends that of Jin, Mattheakis, and Protopapas,12,13 who used neural networks to solve the one-dimensional quantum eigenvalue problem for a small number of systems. Here, we generalize their JMP algorithm to two dimensions and find the eigenvalues and eigenfunctions of the stationary Schrödinger differential equation with Dirichlet boundary conditions in two-dimensional regions that exhibit integrable, ergodic, or chaotic classical billiard dynamics, including irrational triangles and cardioids. Our extension readily handles degenerate energy levels.
A key feature of JMP (pronounced “jump”) is the use of unsupervised learning: during training, the neural network is not given the Schrödinger equation’s solution, which need not be available; instead, the network adjusts its parameters to approximately satisfy the Schrödinger equation and the orthonormality of its solutions. This characteristic showcases the natural ability of physics-informed neural networks to find solutions to problems that may not be solvable analytically or numerically by other methods. Extending this work to multi-dimensional systems advances both machine learning and physics by broadening the usefulness of neural networks and increasing the ways scientists can solve problems.
II. ARTIFICIAL NEURAL NETWORKS
Inspired by mammalian brains, conventional feed-forward neural networks are nested nonlinear functions that depend on many adjustable parameters called weights and biases. Training adjusts the weights and biases so that the network's outputs approximate the desired outcomes.
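For concreteness, a network with two hidden layers computes an output y from an input vector x via a nested form such as (our generic notation, not the paper's)

\[
  y(\vec{x}) \;=\; W_3\,\sigma\!\bigl(W_2\,\sigma(W_1 \vec{x} + \vec{b}_1) + \vec{b}_2\bigr) + b_3 ,
\]

where the Wi are weight matrices, the bi are bias vectors, and σ is a nonlinear activation function, such as a sigmoid, applied element-wise; training adjusts the entries of the Wi and bi by gradient descent on a loss function.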
III. 2D JMP ALGORITHM
Fig. 1. One step of a neural network gradient descent to a second eigenfunction ΨE → Ψ2 given a first eigenfunction Ψ1 (left column). Arrows represent weights, and circles represent biases. In practice, much of the computation involves reverse-mode automatic differentiation. Application to a particle confined to a cardioid-shaped box (right column). Classical billiards orbits are chaotic (top right), while the quantum ground state and first excited state are smooth (middle right). Plateaus in the energy plot indicate eigenvalues, and the addition of the orthogonality loss term forces the wavefunction to morph from Ψ1 to Ψ2 as the energy jumps from E1 to E2 (bottom right).
With relative loss weights λN = λO = 1, minimizing the differential equation loss LD enforces the stationary Schrödinger equation HΨ = EΨ, minimizing the normalization loss LN discourages the trivial zero solution, and minimizing the orthogonality loss LO encourages independent solutions. Learning continues until the total loss L, the differential equation loss LD, and the rate of change of LD are all small. Appendix B discusses alternate versions of the normalization loss, and Appendix C discusses our implementation details.
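As a rough sketch of how these terms might be assembled in PyTorch, assuming Monte Carlo estimates over the sampled points and a Hamiltonian proportional to the negative Laplacian with constants absorbed into the units (the paper's exact loss definitions are given in Fig. 1 and Appendix B and may differ in detail):

import torch

def jmp_loss(psi, lap_psi, E, previous_states, lam_N=1.0, lam_O=1.0):
    # Differential equation loss: mean squared residual of H psi = E psi,
    # with H taken proportional to the negative Laplacian inside the box.
    L_D = ((-lap_psi - E * psi) ** 2).mean()
    # Normalization loss: penalize deviation of <psi|psi> from 1,
    # discouraging the trivial solution psi = 0.
    L_N = ((psi ** 2).mean() - 1.0) ** 2
    # Orthogonality loss: penalize overlap with previously found eigenfunctions.
    L_O = sum(((psi * prev).mean()) ** 2 for prev in previous_states)
    return L_D + lam_N * L_N + lam_O * L_O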
IV. EXAMPLES
From the classical billiards perspective, rectangle- and ellipse-shaped boxes are non-ergodic and integrable, while cardioid-shaped boxes (with polar-coordinate boundary r = 1 − δ sin θ) have mixed phase spaces when convex (0 ≤ δ < 1/2) and are ergodic, mixing, and chaotic when concave (1/2 < δ ≤ 1). Triangle-shaped boxes whose angles are irrational multiples of π are ergodic and mixing but not chaotic, whereas triangle-shaped boxes with one or more angles that are rational multiples of π may not even be ergodic.18,19 From a spectral analysis perspective, for fixed Dirichlet boundary conditions, the only triangles with explicitly known Laplace spectra are the equilateral (60°–60°–60°), isosceles right (45°–45°–90°), and hemi-equilateral (30°–60°–90°) triangles.20 We expect the energy eigenvalue spacings of the integrable rectangle- and ellipse-shaped boxes to be distributed according to Poisson statistics and the eigenvalue spacings of the chaotic cardioid-shaped boxes to obey Gaussian Orthogonal Ensemble (GOE) statistics,21,22 with the spectral statistics of triangles somewhere in between.18
V. RESULTS
We construct a fully connected feed-forward neural network with 1 + 2 inputs, two hidden layers containing 100 neurons each, and one output. To train the neural network, we generate 100 random {x, y} points within the box and feed them into the network according to the algorithm outlined in Fig. 1. We typically repeat for 9 × 10⁴ training rounds (or epochs).
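A minimal PyTorch sketch of a network with this shape follows. The sigmoid activations, the omission of an explicit boundary factor, and the treatment of the energy are our assumptions: the paper stores the energy as one of the network's weights or biases and counts an extra constant input ("1 + 2 inputs"), which we replace here by a plain trainable parameter.

import torch
import torch.nn as nn

class EigenNet2D(nn.Module):
    """Fully connected network: two coordinate inputs, two hidden layers of 100 neurons, one output."""
    def __init__(self, hidden=100):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(2, hidden), nn.Sigmoid(),
            nn.Linear(hidden, hidden), nn.Sigmoid(),
            nn.Linear(hidden, 1),
        )
        # Trainable energy eigenvalue (a simplification of the paper's
        # scheme, in which the energy is one of the weights or biases).
        self.energy = nn.Parameter(torch.tensor(1.0))

    def forward(self, xy):
        # A boundary function B(x, y) that vanishes on the box walls
        # (cf. Table I) could multiply this output to enforce the
        # Dirichlet conditions; it is omitted here for brevity.
        return self.layers(xy), self.energy

net = EigenNet2D()
xy = torch.rand(100, 2, requires_grad=True)   # 100 random (x, y) points in a unit box
psi, E = net(xy)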
The neural network adjusts its weights and biases to converge to the ground-state energy. Adding the orthogonality term LO to the loss function causes the neural network to leave the ground-state energy plateau and rise to the energy of the first excited state (assuming the ground state is non-degenerate). Similarly, the network finds higher excited states.
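Schematically, reusing net and xy from the sketch above together with a hypothetical helper train that minimizes the Sec. III loss with orthogonality penalties against every previously found state (pseudocode, not the paper's implementation):

found_states = []                        # detached copies of converged eigenfunctions
for n in range(3):                       # ground state plus first two excited states
    psi_n, E_n = train(net, xy, orthogonal_to=found_states)  # hypothetical training routine
    found_states.append(psi_n.detach())  # freeze psi_n before seeking the next state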
Table I and Fig. 2 summarize the results for the first three energy eigenvalues and eigenfunctions for a quantum particle confined to boxes shaped like rectangles (simply solvable), ellipses (classically integrable), irrational triangles (classically ergodic), and circles (quantumly degenerate). The eigenfunction color palette stretches from fully saturated red (for Ψ > 0) to fully saturated blue (for Ψ < 0) via completely unsaturated white (for Ψ = 0). The reference energies listed in Table I are numerically computed in Mathematica23 (and checked exactly for the rectangle). Energy plateaus correspond to eigenvalues. The energy eigenvalue uncertainty is the wiggle of the energy plateau as it rings down to its mean value, and the relative error ΔE/E* is the estimated energy minus the reference energy, divided by the reference energy. All 2D JMP eigenvalues are within 1% of the reference values. Insets are classical billiards orbits.
Table I. Energy eigenvalue examples: boundary functions B(x, y), reference eigenvalues E* computed in Mathematica,23 neural network eigenvalue approximations En, and relative errors ΔE/E*.
| B(x, y) | E* | En | ΔE/E* (%) |
|---|---|---|---|
| Rectangle | 6 | 5.99 ± 0.01 | −0.17 |
|  | 12 | 12.0 ± 0.02 | 0.00 |
| where a = π/2, | 18 | 18.0 ± 0.03 | 0.00 |
| Ellipse | 4.32 | 4.32 ± 0.02 | 0.00 |
|  | 9.13 | 9.11 ± 0.02 | −0.22 |
| where a = 1, | 12.8 | 12.8 ± 0.01 | 0.00 |
| Circle | 5.78 | 5.80 ± 0.04 | +0.35 |
|  | 14.7 | 14.6 ± 0.05 | −0.68 |
| where a = 1, b = 1 | 14.7 | 14.6 ± 0.05 | −0.68 |
| Triangle | 4.09 | 4.09 ± 0.00 | 0.00 |
|  | 8.00 | 7.98 ± 0.06 | −0.25 |
| where a = 4, b = 4, | 10.8 | 10.8 ± 0.04 | 0.00 |
Fig. 2. 2D JMP results for simply solvable, classically integrable, classically ergodic, and quantumly degenerate quantum particles confined to rectangle-, ellipse-, triangle-, and circle-shaped boxes. Red-white-blue contours represent eigenfunctions, and energy plateaus are at eigenvalues. Insets show classical billiard orbits. For the circular degenerate case, where E2 = E3, the orthogonality conditions enable the network to find both eigenfunctions Ψ2 ≠ Ψ3.
Smaller boxes have higher energy states because the momentum p ∝ 1/λ implies the energy E = p²/2m ∝ 1/λ², where λ is the quantized wavelength. Thus, adjusting the scaling parameters a, b listed in Table I can keep the energy eigenvalues in a convenient numerical range. Figure 3 compares our boxes to scale.
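For example, for the familiar one-dimensional infinite square well of width a (a textbook result quoted here only to illustrate the scaling; Table I uses scaled units, so its numerical values need not match this formula),

\[
  \lambda_n = \frac{2a}{n}, \qquad
  E_n = \frac{p_n^2}{2m} = \frac{h^2}{2m\lambda_n^2}
      = \frac{n^2 \pi^2 \hbar^2}{2 m a^2} \propto \frac{1}{a^2}.
\]

Halving the width a thus quadruples every eigenvalue, so shrinking or stretching a box rescales its entire spectrum.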
The neural network implementing the 2D JMP algorithm successfully approximates the first three energy eigenvalues and eigenfunctions for particles confined to a wide range of boxes without assuming or imposing the boxes' symmetries. Higher excited states can be obtained similarly by adding further orthogonality terms to the loss function, and additional training can increase accuracy. Especially impressive is the network's discovery of the circle-shaped box's second and third states Ψ2 ≠ Ψ3, despite their degenerate energies E2 = E3, thanks to the orthogonality constraints ⟨Ψ3|Ψ1⟩ = 0 and ⟨Ψ3|Ψ2⟩ = 0 in the loss LO.
VI. OTHER APPLICATIONS
VII. CONCLUSIONS
Computers and the algorithms they instantiate are increasingly important in science, technology, and society, from spaceflight27,28 to cryptography.29 We have demonstrated that the JMP algorithm, when extended to two dimensions, enables neural networks to solve the stationary Schrödinger equation and find quantum energy eigenvalues and eigenfunctions for both classically regular and irregular billiards systems. Such capability is another example of physics-informed machine learning mastering dynamical systems that exhibit both order and chaos.1
This success is a proof of concept that simple feed-forward neural networks, incorporating physics intuition in their loss functions, can solve complicated eigenvalue problems. Well-established state-of-the-art numerical methods18 are currently faster or more accurate, but 2D JMP neural networks illustrate the quantum potential of machine learning. Future work includes exploiting spatial symmetries and focused sampling to reduce the number of training points and generalizing the approach to continuous and three-dimensional potential wells.
ACKNOWLEDGMENTS
This research was supported by Office of Naval Research Grant No. N00014-16-1-3066 and a gift from the United Therapeutics Corporation.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
E.G.H. generalized the neural network code and acquired all the data. J.F.L. generated reference results, created the boundary functions, and finalized figures. W.L.D. supervised the work. All authors contributed to the manuscript.
Elliott G. Holliday: Conceptualization (equal); Formal analysis (equal); Investigation (lead); Methodology (equal); Software (lead); Validation (equal); Visualization (supporting); Writing – original draft (equal); Writing – review & editing (supporting). John F. Lindner: Conceptualization (equal); Formal analysis (equal); Methodology (equal); Validation (equal); Visualization (lead); Writing – original draft (equal); Writing – review & editing (lead). William L. Ditto: Conceptualization (equal); Funding acquisition (lead); Project administration (lead); Resources (lead); Supervision (lead); Writing – review & editing (supporting).
DATA AVAILABILITY
The data that support the findings of this study are available within the article.
APPENDIX A: GRADIENT DESCENT
1. Dynamical analog
2. Newton’s method
APPENDIX B: NORMALIZATION LOSS
APPENDIX C: IMPLEMENTATION DETAILS
The Python sample code in Fig. 4 implements a simple neural network with sigmoid activation functions that learns the ground-state energy eigenvalue and eigenfunction of a particle in a one-dimensional box Ω = [0, 1]. Our implementation uses the PyTorch machine-learning library and represents data as tensors (multidimensional rectangular arrays of numbers) throughout.
Fig. 4. Example Python code of a neural network learning to model a particle in a one-dimensional box Ω = [0, 1].
The network class (lines 10–32) defines the network architecture (lines 14–19), implements a forward pass (lines 21–32), and enforces the Dirichlet boundary conditions (line 31). After instantiating a network object object_net from the class and initializing the stochastic gradient descent optimizer (lines 34–35), the variable x is created as a list or columnar array of equally spaced positions inside the box, with autograd tracking its operations (lines 37–38).
The for-loop (lines 40–54) manages the neural network training, first shuffling the x values (line 41) and then asking object_net for the latest eigenfunction and energy eigenvalue approximations (line 42). The grad function invokes autograd to compute the Laplacian (lines 44–46). The .pow() and .mean() methods help compute the loss function (lines 48–50).
The .backward() method also invokes autograd and computes the gradients of the loss with respect to the weights and biases, which are then updated by the optimizer (lines 52–54). More generally, the .backward() method computes the .grad attribute of every leaf tensor with requires_grad = True in the computational graph, whose root is the loss and whose leaves are the inputs and parameters. The optimizer then iterates through the list of weight and bias parameters it received when initialized and, for each parameter tensor with requires_grad = True, subtracts the value of its gradient (multiplied by the learning rate) stored in its .grad attribute.
Finally, the energy eigenvalue and eigenfunction are extracted as numbers and printed (lines 56–58). This working example returns a ground-state energy within about 1% of the exact value.
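The following self-contained sketch is our own reconstruction of the structure described above, not a reproduction of Fig. 4; its line numbering, hyperparameters, and some details necessarily differ from the published figure, and it assumes units in which H = −d²/dx², for which the exact ground-state energy is π² ≈ 9.87.

import torch
import torch.nn as nn

class Net(nn.Module):
    """Sigmoid network for a particle in the one-dimensional box [0, 1]."""
    def __init__(self, hidden=50):
        super().__init__()
        self.hidden1 = nn.Linear(1, hidden)
        self.hidden2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)
        self.energy = nn.Parameter(torch.tensor([4.0]))  # trainable energy eigenvalue (initial guess)

    def forward(self, x):
        h = torch.sigmoid(self.hidden1(x))
        h = torch.sigmoid(self.hidden2(h))
        psi = x * (1.0 - x) * self.out(h)   # Dirichlet conditions: psi(0) = psi(1) = 0
        return psi, self.energy

object_net = Net()
optimizer = torch.optim.SGD(object_net.parameters(), lr=0.01)
x = torch.linspace(0.0, 1.0, 100).reshape(-1, 1)
x.requires_grad_(True)                       # let autograd track operations on x

for epoch in range(30000):
    xs = x[torch.randperm(x.shape[0])]       # shuffle the sample points
    psi, E = object_net(xs)

    # First and second derivatives of psi via reverse-mode automatic differentiation.
    dpsi, = torch.autograd.grad(psi, xs, torch.ones_like(psi), create_graph=True)
    d2psi, = torch.autograd.grad(dpsi, xs, torch.ones_like(dpsi), create_graph=True)

    # Schrodinger residual (-psi'' = E psi) plus a normalization penalty.
    loss = ((-d2psi - E * psi).pow(2)).mean() + (psi.pow(2).mean() - 1.0).pow(2)

    optimizer.zero_grad()                    # clear old gradients
    loss.backward()                          # compute .grad for all parameters
    optimizer.step()                         # gradient-descent update

print("E =", E.item())                       # compare with the exact ground-state value pi**2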