Spherical harmonics provide a smooth, orthogonal, and symmetry-adapted basis to expand functions on a sphere, and they are used routinely in physical and theoretical chemistry as well as in different fields of science and technology, from geology and atmospheric sciences to signal processing and computer graphics. More recently, they have become a key component of rotationally equivariant models in geometric machine learning, including applications to atomic-scale modeling of molecules and materials. We present an elegant and efficient algorithm for the evaluation of the real-valued spherical harmonics. Our construction features many of the desirable properties of existing schemes and allows us to compute Cartesian derivatives in a numerically stable and computationally efficient manner. To facilitate usage, we implement this algorithm in sphericart, a fast C++ library that also provides C bindings, a Python API, and a PyTorch implementation that includes a GPU kernel.

The spherical harmonics Ylm are basis functions for the irreducible representations of the SO(3) group,1 which makes them a key tool for understanding physical phenomena that exhibit rotational symmetries and designing practical algorithms to model them on a computer. Examples include the distribution of charge in atoms,2 the behavior of gravitational3 and magnetic4,5 fields, and the propagation of light6 and sound7 in the atmosphere. Moreover, they provide a complete set of smooth, orthogonal functions defined on the surface of a sphere, and in this sense, they are widely used in many fields, including computer graphics,8,9 quantum10–13 and physical14–16 chemistry, and signal processing.17,18 More recently, they have become an essential tool in the context of geometric deep learning,19,20 as a structural descriptor needed in SO(3)-, O(3)-, and E(3)-equivariant machine learning models21–24 and more specifically in the construction of symmetry-adapted descriptors of atomic structures in chemical machine learning.25–29 Derivatives of spherical harmonics are also very often used in these applications. An example is the calculation of forces on a configuration of atoms by an E(3)-invariant machine learning model.30 Likewise, spherical harmonics gradients have been used in computer graphics for mid-range illumination, irradiance volumes, and optimization methods.9 

In many of these contexts, the spherical harmonics are used together with an expansion in the radial direction to compute expressions of the form
(1)
where Rnl indicates a radial basis, and rk=rkr̂k is a set of points that correspond to either particles or the positions of a Cartesian mesh. In others, such as message-passing neural networks,31 directional terms that depend on the orientation of interatomic separation vectors are multiplied by continuous filters that are a function of the radial distance. For this reason, although the spherical harmonics are most often defined as complex functions in spherical coordinates, in practical implementations it is often preferred to use their real-valued combinations and to express their value and derivatives directly in terms of the Cartesian coordinates of the points at which they are evaluated. With these applications in mind, we derive compact, efficient expressions to compute the real spherical harmonics and their derivatives of arbitrary order as polynomials of the Cartesian coordinates, we discuss the computational implications of this formulation, and we present a simple yet efficient implementation that can be used both as a C and C++ library and as a Python module.
The real-valued spherical harmonics can be defined in spherical coordinates (θ, ϕ) as
(2)
where Plm is an associated Legendre polynomial and Flm is a prefactor that takes the form
(3)
Possible strategies for a stable and efficient computation of the Flm are discussed in  Appendix C.
Traditionally,32 the real spherical harmonics of points in 3D space have been calculated by first converting the Cartesian coordinates x, y, and z into the spherical coordinates θ and ϕ and then using Eq. (2) to calculate the spherical harmonics, most often via the standard recurrence relations for Plm(t),
(4)
It should be noted how naive differentiation of these recursions introduces poles on the z axis where t = cos θ = ±1 as well as numerical instabilities in the calculation of the derivatives for points close to the z axis.

In contrast, our algorithm aims at calculating Ỹlm=rlYlm, where r=x2+y2+z2. These are the so-called solid harmonics, which consist of simple homogeneous polynomials of the Cartesian coordinates. This choice avoids the need to normalize r and leads to simple and stable iterations to compute Ỹlm as well as very compact expressions for their derivatives with respect to the Cartesian coordinates, which re-use the same factors needed to evaluate Ỹlm. In most applications that require spherical harmonics in Cartesian coordinates, the radial direction is dealt with by a separate expansion [cf. Eq. (1)], and the rl factor that is included in the scaled Ỹlm can be compensated for at little to no additional cost by a corresponding rl factor in the radial term. If, instead, spherical harmonics are needed in their conventional version, they can be recovered easily from the radially scaled (or solid) kind.

As shown in  Appendix A, the scaled Cartesian harmonics Ỹlm can be computed as
(5)
where we define
(6)
with rxy=x2+y2. Similar (but not equivalent) definitions have been used often in the literature, e.g., in Refs. 14, 33, and 34. The quantities Qlm, cm, and sm can be evaluated very efficiently by recursion in Cartesian coordinates. For example, the recursions for Qlm follow almost immediately (see  Appendix A) from those for Plm [Eq. (4)] and the definition of Qlm [Eq. (6)]:
(7)
Many other recursive expressions can be derived based on analogous, well-known relations for Plm, e.g.,
(8)
which can be used to iterate down from Qll, avoiding the pole at rxy = 0 that is present for the similar recursion for Plm.
Similarly, recursive relations for sm and cm can be derived (see  Appendix A) as
(9)
Once the Qlm, cm, and sm quantities are known, their Cartesian derivatives also follow in an extremely compact form:
(10)
These relationships (proven in  Appendix B) are much simpler than the standard recurrence relations for the derivatives of Plm (see e.g., Ref. 35), and they do not lead to poles or instabilities when combined to compute the derivatives of Ỹlm. The simplicity of the expressions in Eq. (10) opens the door to the efficient calculation of Cartesian derivatives of Ỹlm of any order. Indeed, our software implementation (Sec. III) provides derivatives of the spherical harmonics up to second order (Hessians). Previously to this algorithm, this had only been possible by finite differences or through the use of automatic differentiation tools, which, however, lead to significant computational overheads.
In addition, one can find several expressions that directly link some of the derivatives of the Cartesian spherical harmonics to other (l, m) values, e.g.,
(11)
which simplify the calculation of the derivatives of the spherical harmonics even further. For l ≤ 6, we use a computer algebra system to automatically find the expressions that provide Ỹlm and their derivatives in terms of lower-l values with the smallest number of multiplications, and we use them to generate hard-coded implementations, as discussed in Sec. III.

Most computational applications require the evaluation of all the spherical harmonics, and possibly their derivatives, for the values of l up to a maximum degree lmax. In many cases, the spherical harmonics have to be computed for many points simultaneously, e.g., for the interatomic separation vectors of all neighbors of a selected atom.

Even though we recommend using the scaled spherical harmonics in applications, accounting for the fact that accompanying radial terms have to compensate for the scaling, we also provide an implementation of “normalized” spherical harmonics by simply evaluating Ỹlm(x/r,y/r,z/r) and applying the chain rule to the derivatives. These additional operations typically result in an overhead of 5%–10% for the calculation of the spherical harmonics and their derivatives.

Our general routines take a list of Cartesian coordinates and evaluate the spherical harmonics and, optionally, their derivatives and/or second derivatives with respect to the inputs. We use C++ for the implementation using templates to exploit compile-time knowledge of the maximum angular momentum, the need to compute derivatives, and so on. We then define a pure C API covering the typical use cases on top of this C++ API. Since almost all programming languages have a way to call C functions, this enables using our code from most languages used in a scientific context. We also provide a Python package with a high-level interface to the C library.

Furthermore, we use the C++ API to provide an implementation compatible with PyTorch,36 using a custom backward function to compute gradients using the derivatives evaluated in the forward pass and making sure the code is compatible with TorchScript, allowing models to be used without a Python interpreter. Finally, we implement a GPU-accelerated version of the PyTorch code for NVIDIA GPUs, using the CUDA language. Table I compares the features of sphericart to those of other spherical harmonics implementations.41–50 

TABLE I.

Characteristics of some publicly available spherical harmonics libraries. Only e3nn and sphericart offer GPU support. An asterisk in the derivative section means that a workaround was necessary to deal with the poles in the derivatives. “Yes/No” in the hardcoded section means that low degrees are hardcoded for computational efficiency but that generic functions are also implemented, providing spherical harmonics up to arbitrary degrees.

PackageInputOutputYlmHardcoded
librascal41  Polar Real Yes* No 
QUIP/turbosoap42  Polar Real Yes* No 
PACE43  Cartesian Complex Yes No 
e3nn38  Cartesian Real Yes Yes 
ACE.jl44  Polar Complex Yes* No 
Dscribe45  Cartesian Real Yes Yes 
libcint/PySCF46  Cartesian Real No Yes 
SciPy47  Polar Complex No No 
gsl48  Polar Real Yes No 
Boost49  Polar Complex No No 
Google50  Polar Real No Yes/No 
sphericart Cartesian Real Yes Yes/No 
PackageInputOutputYlmHardcoded
librascal41  Polar Real Yes* No 
QUIP/turbosoap42  Polar Real Yes* No 
PACE43  Cartesian Complex Yes No 
e3nn38  Cartesian Real Yes Yes 
ACE.jl44  Polar Complex Yes* No 
Dscribe45  Cartesian Real Yes Yes 
libcint/PySCF46  Cartesian Real No Yes 
SciPy47  Polar Complex No No 
gsl48  Polar Real Yes No 
Boost49  Polar Complex No No 
Google50  Polar Real No Yes/No 
sphericart Cartesian Real Yes Yes/No 

We apply a number of trivial (and a few less obvious) optimizations. For example, we pre-compute the factors Flm [Eq. (3)] to minimize the number of operations in the inner loop of the iterative algorithm discussed above. In order to further accelerate the evaluation of low-l Cartesian harmonics, we use a computer algebra system to derive expressions that evaluate the full Ỹlm and their derivatives with hard-coded expressions using the smallest possible number of multiplications. As presented in Table II and as noted in previous implementations,24,33 there is considerable scope for optimization by using ad hoc expressions. However, the speedup is less remarkable when including the calculation of the derivatives through Eq. (10), which re-uses quantities that have already been computed when evaluating Ỹlm.

TABLE II.

Serial execution time (in ns/point) for computing Cartesian spherical harmonics and their derivatives up to the indicated value of lmax in double precision on an Intel Xeon Gold 6226R CPU, averaged over 1000 calls for 10 000 points, comparing our general purpose recursive algorithm to a hard-coded implementation.

General-purposeHard-coded
ỸlmỸlm,ỸlmỸlmỸlm,Ỹlm
lmax = 1 3.49 11.0 1.33 7.85 
lmax = 2 7.16 21.6 4.21 16.4 
lmax = 3 13.3 36.1 7.26 28.4 
lmax = 4 20.3 55.3 11.8 46.3 
lmax = 5 29.8 81.9 16.3 68.3 
lmax = 6 42.8 121 22.2 95.2 
General-purposeHard-coded
ỸlmỸlm,ỸlmỸlmỸlm,Ỹlm
lmax = 1 3.49 11.0 1.33 7.85 
lmax = 2 7.16 21.6 4.21 16.4 
lmax = 3 13.3 36.1 7.26 28.4 
lmax = 4 20.3 55.3 11.8 46.3 
lmax = 5 29.8 81.9 16.3 68.3 
lmax = 6 42.8 121 22.2 95.2 

As a compromise between the convenience of a function that works for arbitrary lmax and the efficiency of optimized expressions, we provide an interface for the hard-coded implementation up to lmax = 6. By default, our implementation applies the hard-coded version for small l and then switches to the general expression, making use of the alternative recursion (8) to avoid the computation of the Legendre polynomials Qlm. for small l. All our functions can be applied to many 3D points at once, and they are trivially parallelized over the sample index using OpenMP.37 Despite the simplicity of our design, we achieve good parallel scaling, particularly when using a relatively large lmax so that there is a substantial amount of computation for each thread, as illustrated in Fig. 1. The curves show some degree of irregular behavior, suggesting that there might be room for further optimization. As presented in Table III, for large lmax and numbers of data points, there is a substantial performance gain by using single-precision floating-point arithmetics. However, for smaller amounts of computation, there is a very small advantage, and in some corner cases (lmax = 8, nsamples = 10 000, and nthreads = 16), the single-precision version can be slower than that using 64-bit floating-point values, which further suggests that it may be possible to optimize memory-access patterns to improve parallel performance.

FIG. 1.

Scaling of wall-clock timing to evaluate the Cartesian spherical harmonics and their derivatives for 10 000 points using different lmax and numbers of OpenMP threads. Results refer to up to 64 cores of up to two Intel Xeon Platinum 8360Y (2.4 GHz) CPUs.

FIG. 1.

Scaling of wall-clock timing to evaluate the Cartesian spherical harmonics and their derivatives for 10 000 points using different lmax and numbers of OpenMP threads. Results refer to up to 64 cores of up to two Intel Xeon Platinum 8360Y (2.4 GHz) CPUs.

Close modal
TABLE III.

CPU-parallel execution time (in ns/point) for computing Cartesian spherical harmonics (or Cartesian spherical harmonics and their derivatives) up to the indicated value of lmax. Timings refer to 1 and 16 OpenMP threads, respectively, on Intel Xeon Gold 6226R CPU, averaged over 1000 calls for 10 000 points.

1 OpenMP thread16 OpenMP threads
ỸlmỸlm,ỸlmỸlmỸlm,Ỹlm
Single precision 
lmax = 1 1.08 2.00 0.572 0.680 
lmax = 2 3.74 12.4 0.564 1.10 
lmax = 4 9.47 35.8 0.922 2.58 
lmax = 8 56.4 169 17.5 54.2 
lmax = 16 240 1093 28.9 126 
lmax = 32 1128 3101 109 366 
Double precision 
lmax = 1 1.06 6.47 0.451 0.573 
lmax = 2 4.02 15.9 0.578 1.28 
lmax = 4 11.6 47.3 1.00 3.73 
lmax = 8 57.0 280 5.15 21.7 
lmax = 16 252 1602 18.3 164 
lmax = 32 1385 4651 148 440 
1 OpenMP thread16 OpenMP threads
ỸlmỸlm,ỸlmỸlmỸlm,Ỹlm
Single precision 
lmax = 1 1.08 2.00 0.572 0.680 
lmax = 2 3.74 12.4 0.564 1.10 
lmax = 4 9.47 35.8 0.922 2.58 
lmax = 8 56.4 169 17.5 54.2 
lmax = 16 240 1093 28.9 126 
lmax = 32 1128 3101 109 366 
Double precision 
lmax = 1 1.06 6.47 0.451 0.573 
lmax = 2 4.02 15.9 0.578 1.28 
lmax = 4 11.6 47.3 1.00 3.73 
lmax = 8 57.0 280 5.15 21.7 
lmax = 16 252 1602 18.3 164 
lmax = 32 1385 4651 148 440 

We adapt the implementation to a custom CUDA kernel, which allows efficient execution on GPUs. Similarly to the CPU implementation, we use hard-coded spherical harmonics and derivatives up to a reduced hard-coded lmax = 3 to lower shared memory requirements, and we use the general expression for the remaining terms. We parallelize the computation with a two-dimensional thread-block of 16 × 8 threads using a grid dimension of (nsamples/16) blocks. The first dimension in the thread-block parallelizes over samples, while the first thread in the second dimension is responsible for performing the computation. The remaining threads perform coalesced writing of the temporary buffers to global memory. We store the intermediary work buffers for the spherical harmonics and derivatives in fast shared memory, which defaults to 48 KB for most NVIDIA cards. Memory accesses are designed such that each sample-dimension thread is mapped to a shared memory bank contiguously, eliminating possible bank conflicts. The CUDA wrapper automatically re-allocates the necessary amount of shared memory beyond the default 48 KB on the first compute call or attempts to reduce the number of threads launched in the CUDA kernel if the required allocation is too large. As a consequence, for accurate timings, we call the forward step once to perform the initialization and then measure timings thereafter. We note that using the recursion (7) would require storing temporary values for all llmax, so that the default 48 KB shared memory allocation would be filled for lmax ≈ 8 when computing derivatives in 64-bit floating-point format. Using the alternative recursion (8) allows us to evaluate one l channel at a time, so values up to lmax ≈ 30 can be computed without adjusting the shared memory allocation or reducing the number of sample-dimension threads. For GPUs that support more than 48 KB shared memory per block, for example, the A100 (164 KB), the recursion (8) allows us to evaluate values up to lmax ≈ 82.

Table IV presents timings for GPU-accelerated computations on an A100 SXM4 (80 GB) card. In general, using single-precision floating-point arithmetic results in half the computation time than double-precision arithmetic. We note that the GPU per-sample timings reduce significantly upon an increasing number of samples, as the GPU is not fully saturated for nsamples = 10 000. For example, for nsamples = 100 000, the per-sample timings reduce by a factor of 1.5–4. Although the parallel CPU implementation is faster than the GPU for lmax ≤ 4 when using nsamples = 10 000, this trend reverses for higher values of lmax. For example, the 64-bit floating-point computations with nsamples = 10 000 and lmax = 16 show that the GPU outperforms the CPU by a factor of 8.

TABLE IV.

GPU-parallel execution time (in ns/point) for computing Cartesian spherical harmonics (or Cartesian spherical harmonics and their derivatives) up to the indicated value of lmax. Timings refer to a single A100 SXM4/80 GB GPU and are averaged over 10 000 calls on 10 000 and 100 000 points, respectively.

10k points100k points
 Ỹlm Ỹlm,Ỹlm Ỹlm Ỹlm,Ỹlm 
Single precision 
lmax = 1 2.1 2.4 0.3 0.4 
lmax = 2 2.2 2.4 0.3 0.4 
lmax = 4 2.4 3.1 0.5 0.8 
lmax = 8 3.1 4.8 1.0 3.3 
lmax = 16 5.6 15.1 2.8 10.9 
lmax = 32 15.5 43.6 9.6 33.5 
Double precision 
lmax = 1 2.2 2.6 0.4 0.6 
lmax = 2 2.3 2.9 0.4 0.8 
lmax = 4 2.7 4.0 0.7 1.7 
lmax = 8 4.2 8.5 1.7 4.6 
lmax = 16 9.3 22.2 5.2 16.0 
lmax = 32 28.8 75.0 20.6 65.1 
10k points100k points
 Ỹlm Ỹlm,Ỹlm Ỹlm Ỹlm,Ỹlm 
Single precision 
lmax = 1 2.1 2.4 0.3 0.4 
lmax = 2 2.2 2.4 0.3 0.4 
lmax = 4 2.4 3.1 0.5 0.8 
lmax = 8 3.1 4.8 1.0 3.3 
lmax = 16 5.6 15.1 2.8 10.9 
lmax = 32 15.5 43.6 9.6 33.5 
Double precision 
lmax = 1 2.2 2.6 0.4 0.6 
lmax = 2 2.3 2.9 0.4 0.8 
lmax = 4 2.7 4.0 0.7 1.7 
lmax = 8 4.2 8.5 1.7 4.6 
lmax = 16 9.3 22.2 5.2 16.0 
lmax = 32 28.8 75.0 20.6 65.1 

We note in closing that, even though precise timings depend on the hardware and the maximum angular momentum considered, sphericart is between 10 and 40 times faster than the spherical harmonics implementation in e3nn,38 a widely used library for equivariant neural networks. We provide wrappers that use the same conventions as e3nn, to simplify integration of our implementation in existing models and to accelerate computational frameworks that are limited by the evaluation of Ylm.

Spherical harmonics are ubiquitous in the computational sciences, and their efficient calculation has become even more important in light of the widespread adoption of equivariant machine-learning models in chemistry. Reformulating the calculation of spherical harmonics in a scaled form that corresponds to real-valued polynomials of the Cartesian coordinates provides simple expressions for their recursive evaluation. Derivatives can be obtained with little overhead, re-using the same factors that enter the definition of the scaled Ỹlm, and without numerical instabilities. The conventional form of the spherical harmonics can be recovered easily by computing Ỹlm at (x/r, y/r, z/r) and applying the corresponding correction to the derivatives. In many applications, this normalization may be unnecessary, as the scaling can be incorporated (explicitly or through regression) into additional radial terms. We provide an efficient implementation as a C++ library, complete with a C API, Python, and PyTorch bindings, that tackles the most common use case in scientific computing and geometric machine learning, i.e., the evaluation of real-valued spherical harmonics for all angular momentum channels up to a given cutoff lmax and possibly for many 3D points at once. The function call is parallelized over the sample direction, and it uses more efficient hard-coded expressions for low-l terms. Future efforts will focus on the extension of the software library to different programming languages and frameworks as well as on further optimization of new hardware platforms and improved parallelism.

The Authors would like to thank Kevin Kazuki Huguenin-Dumittan for useful discussions, Philip Loche for help with the sphericart library, and Prashanth Kanduri for help with the continuous integration framework. M.C. acknowledges funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 101001890-FIAMMA). GF acknowledges support from the Platform for Advanced Scientific Computing (PASC). M.C. and F.B. acknowledge support from the NCCR MARVEL, funded by the Swiss National Science Foundation (Grant No. 182892).

The authors have no conflicts to disclose.

Filippo Bigi and Michele Ceriotti conceived the project and performed analytical derivations. All authors contributed to software development and the writing of the paper.

Filippo Bigi: Conceptualization (equal); Formal analysis (equal); Software (equal); Writing – original draft (equal); Writing – review & editing (equal). Guillaume Fraux: Software (equal); Writing – review & editing (supporting). Nicholas J. Browning: Software (equal); Writing – review & editing (supporting). Michele Ceriotti: Conceptualization (equal); Formal analysis (equal); Methodology (equal); Resources (lead); Software (equal); Supervision (lead); Writing – original draft (equal); Writing – review & editing (equal).

The sphericart library can be freely downloaded under the Apache License version 2.0 from the public git repository https://github.com/lab-cosmo/sphericart. A Python package that provides a convenient class to compute spherical harmonics and derivatives from a NumPy39 or a PyTorch36 array of 3D coordinates is available on PyPI at https://pypi.org/project/sphericart/. The timing data can be generated using the benchmarking code that is included in the software distribution.

In this appendix, we derive the Cartesian form of the recurrence relations presented in the main text. Starting from Eq. (2), it is sufficient to multiply both sides by rl and multiply and divide the right-hand side by rxy|m| to obtain
(A1)
Then, using the definitions in Eq. (6), Eq. (5) follows.
Let us now derive Eq. (7). These relationships can be obtained as a combination of Eq. (4) and the definition of Qlm in Eq. (8), with the additional observation that t = cos θ = z/r,
(A2)
(A3)
(A4)
(A5)
And, finally, we can derive the recurrence relations for sm and cm in the following way:
(A6)
(A7)
where we have used some well-known trigonometric relations and the fact that sin ϕ = y/rxy and cos ϕ = x/rxy.
In this appendix, we prove the derivative formulas presented in the main text, i.e., Eq. (10). Let us start with the derivatives of sm and cm. To this end, we can rewrite Eq. (9) in matrix form:
(B1)
from which it is easy to see that
(B2)
Differentiation with respect to x yields
(B3)
which proves the x derivatives of sm and cm in Eq. (10), while differentiation with respect to y gives
(B4)
which results in the y derivatives of sm and cm in Eq. (10).
We can now turn to the Qlm/z derivative in Eq. (10), which we prove by induction over the l variable. The base case (l = m + 1) can easily be checked for any m by considering the third equality in Eq. (7), which implies
(B5)
Since Qmm does not depend on z [see Eq. (10)], differentiating with respect to z gives
(B6)
This is, indeed, exactly the Qlm/z derivative in Eq. (10) for l = m + 1. In the induction step, we prove the l case from the l − 1 and l − 2 cases, so it is also necessary to prove the l = m + 2 case in advance. From the last line in Eq. (7), we can write
(B7)
Differentiation leads to
(B8)
Using Qmm/z=0 [see Eq. (7)] and Qm+1m/z=(2m+1)Qmm (which we have just proved), we obtain
(B9)
where we have used (2m+1)zQmm=Qm+1m from the third equality of Eq. (7) and hidden some elementary algebra. This corresponds to the Qlm/z derivative in Eq. (10) for l = m + 2. For the induction step, we can now assume Ql1m/z=(l+m1)Ql2m and Ql2m/z=(l+m2)Ql3m. Differentiating the last equality of Eq. (7) with respect to z on both sides results in
(B10)
We can now insert the assumptions of the induction step into the right-hand-side expression to obtain
(B11)
An elementary manipulation of the zQl2m terms affords
(B12)
Finally, the application of the last equality of Eq. (7) absorbs the Ql2m and Ql3m terms into
(B13)
which simplifies into
(B14)
as required.
To prove the expressions for Qlm/x and Qlm/y in Eq. (10), we first note that, since Qlm is formally only a function of z and r, any x- or y-dependence in Qlm must come from the involvement of the r variable, so that
(B15)
and
(B16)
Hence, in order to justify the x and y derivatives of Qlm in Eq. (10), we simply need to prove that
(B17)
Before doing so, we borrow Eq. (1) from Ref. 40. It is important to note how Ref. 40 follows a different convention, and it does not include a (−1)m factor in the definition of the associated Legendre polynomials. Hence, compared to Ref. 40, we change the sign of the Pl1m1 term to obtain
(B18)
which is now consistent with our conventions. Noting that sin θ = rxy/r and multiplying by rlrxym on both sides gives
(B19)
which, thanks to the definition of Qlm in Eq. (6), translates to
(B20)
Equation (B20) and the last line of Eq. (7) provide two expressions for Qlm. Imposing their equality results in
(B21)
This can be rearranged into
(B22)
From this equation, Ql1m1 can be extracted as
(B23)
Let us now go back to Eq. (B17). We prove it by induction over the l variable, similar to what we did for the z-derivative. Given the ml constraint, the base case reads Qm+1m1/r=rQmm. From the third equality in Eq. (7), we have Qmm1=(2m1)zQm1m1. Now, using the last line of Eq. (7) with l = m + 1 and (abusing notation) m = m − 1, we can express Qm+1m1 as
(B24)
where we have used the second equality of Eq. (7) in the last line. Since Qmm does not depend on r [see Eq. (7)], differentiating both sides of Eq. (B24) with respect to r yields
(B25)
as required. As in the case of the z derivative, the base case also includes a further equality: Qm+2m1/r=rQmm+1. Using the last line of Eq. (7), Qm+2m1 can be written as
(B26)
partial differentiation of the above with respect to r yields
(B27)
However, we have proved Qm+1m1/r=rQmm, and we have Qmm1/r=0 from the first three equalities of Eq. (7). Hence,
(B28)
where we have used the third equality of 7 twice. This concludes the proof of the base cases.
To prove the induction step, we start by differentiating both sides of the last equality of Eq. (7) with respect to r. We obtain
(B29)
We can now assume that Ql1m/r=rQl2m+1 and Ql2m/r=rQl3m+1 to get
(B30)
Now, Eq. (B23) implies Ql2m=(zQl2m+1+r2Ql3m+1)/(lm2). Substituting this into Eq. (B30) leads to
(B31)
Adding like terms (i.e., those in zrQl2m+1 and those in r3Ql3m+1) results in
(B32)
However, by considering the last line of Eq. (7) with l decreased by one and m increased by one, the last expression can be simplified into
(B33)
which concludes the proof.
The prefactors Flm contain a ratio of factorials that can lead to numerical instabilities in a naïve implementation. It is, however, easy to see that one can compute them iteratively as
(C1)
It is also possible to incorporate the prefactors in the definition of the modified associated Legendre polynomials, defining Q̃lm=FlmQlm. This simplifies somehow the construction of Ỹlm and avoids possible instabilities connected with the fact that Flm become very small for large ml, at the price of complicating slightly the expressions for the recursion and derivatives of Qlm, e.g.,
(C2)
1.
C.
Müller
,
Spherical Harmonics
(
Springer
,
1966
).
2.
E.
Steiner
,
J. Chem. Phys.
39
,
2365
(
1963
).
3.
M.
Rexer
,
C.
Hirt
,
S.
Claessens
, and
R.
Tenzer
,
Surv. Geophys.
37
,
1035
(
2016
).
4.
R.
Knaack
and
J. O.
Stenflo
,
Astron. Astrophys.
438
,
349
(
2005
).
5.
A.
Morschhauser
,
V.
Lesur
, and
M.
Grott
,
J. Geophys. Res.: Planets
119
,
1162
, (
2014
).
7.
M. A.
Poletti
,
J. Audio Eng. Soc.
53
,
1004
(
2005
).
8.
N. L.
Max
and
E. D.
Getzoff
,
IEEE Comput. Graphics Appl.
8
,
42
(
1988
).
9.
P.-P.
Sloan
, in
Game Developers Conference
(
Peter-Pike Sloan Web Page
,
2008
), Vol.
9
, p.
42
.
10.
H. B.
Schlegel
and
M. J.
Frisch
,
Int. J. Quantum Chem.
54
,
83
(
1995
).
11.
S. A.
Varganov
,
A. T.
Gilbert
,
E.
Deplazes
, and
P. M.
Gill
,
J. Chem. Phys.
128
,
201104
(
2008
).
12.
P. M.
Gill
and
A. T.
Gilbert
,
Chem. Phys.
356
,
86
(
2009
).
13.
S.
Maintz
,
M.
Esser
, and
R.
Dronskowski
,
Acta Phys. Pol., B
47
,
1165
(
2016
).
14.
J. M.
Pérez-Jordá
and
W.
Yang
,
J. Chem. Phys.
104
,
8003
(
1996
).
15.
C. H.
Choi
,
J.
Ivanic
,
M. S.
Gordon
, and
K.
Ruedenberg
,
J. Chem. Phys.
111
,
8825
(
1999
).
16.
L.
Ding
,
M.
Levesque
,
D.
Borgis
, and
L.
Belloni
,
J. Chem. Phys.
147
,
094107
(
2017
).
17.
D. N.
Zotkin
,
R.
Duraiswami
, and
N. A.
Gumerov
, in
2009 IEEE Workshop on Applications of Signal Processing To Audio And Acoustics
(
IEEE
,
2009
), pp.
257
260
.
18.
X.
Li
,
S.
Yan
,
X.
Ma
, and
C.
Hou
,
Applied Acoustics
72
,
646
(
2011
).
19.
T.
Cohen
and
M.
Welling
, in
International Conference on Machine Learning
(
PMLR
,
2016
), pp.
2990
2999
.
20.
M. M.
Bronstein
,
J.
Bruna
,
T.
Cohen
, and
P.
Veličković
, arXiv:2104.13478 (
2021
).
21.
N.
Thomas
,
T.
Smidt
,
S.
Kearnes
,
L.
Yang
,
L.
Li
,
K.
Kohlhoff
, and
P.
Riley
, arXiv:1802.08219 (
2018
).
22.
B.
Anderson
,
T. S.
Hy
, and
R.
Kondor
, in
NeurIPS
(
Curran Associates
,
2019
), p.
10
.
23.
J.
Klicpera
,
F.
Becker
, and
S.
Günnemann
,
Adv. Neural Inf. Process.
34
,
6790
6802
(
2021
).
24.
M.
Geiger
and
T.
Smidt
, arXiv:2207.09453 (
2022
).
25.
A. P.
Bartók
,
R.
Kondor
, and
G.
Csányi
,
Phys. Rev. B
87
,
184115
(
2013
).
26.
A.
Thompson
,
L.
Swiler
,
C.
Trott
,
S.
Foiles
, and
G.
Tucker
,
J. Comput. Phys.
285
,
316
(
2015
).
27.
M. J.
Willatt
,
F.
Musil
, and
M.
Ceriotti
,
J. Chem. Phys.
150
,
154110
(
2019
).
29.
F.
Musil
,
A.
Grisafi
,
A. P.
Bartók
,
C.
Ortner
,
G.
Csányi
, and
M.
Ceriotti
,
Chem. Rev.
121
,
9759
(
2021
).
30.
A. S.
Christensen
and
O. A.
Von Lilienfeld
,
Mach. Learn.: Sci. Technol.
1
,
045018
(
2020
).
31.
K.
Schütt
,
O.
Unke
, and
M.
Gastegger
, in
International Conference on Machine Learning
(
PMLR
,
2021
), pp.
9377
9388
.
32.
W. H.
Press
,
Numerical Recipes: The Art of Scientific Computing
(
Cambridge University Press
,
2007
).
33.
P.-P.
Sloan
,
J. Comput. Graph. Tech.
2
,
84
(
2013
).
35.
P.
Alken
, “
GSL-TR-001-20220827 Implementation of associated Legendre functions in GSL,
” GSL Technical Report No. 1, 2022; available at https://www.gnu.org/software/gsl/tr/tr001.pdf.
36.
A.
Paszke
,
S.
Gross
,
F.
Massa
,
A.
Lerer
,
J.
Bradbury
,
G.
Chanan
,
T.
Killeen
,
Z.
Lin
,
N.
Gimelshein
,
L.
Antiga
,
A.
Desmaison
,
A.
Kopf
,
E.
Yang
,
Z.
DeVito
,
M.
Raison
,
A.
Tejani
,
S.
Chilamkurthy
,
B.
Steiner
,
L.
Fang
,
J.
Bai
, and
S.
Chintala
, in
Advances in Neural Information Processing Systems 32
, edited by
H.
Wallach
,
H.
Larochelle
,
A.
Beygelzimer
,
F.
dAlché-Buc
,
E.
Fox
, and
R.
Garnett
(
Curran Associates, Inc.
,
2019
), pp.
8024
8035
.
37.
L.
Dagum
and
R.
Menon
,
IEEE Comput. Sci. Eng.
5
,
46
(
1998
).
38.
M.
Geiger
, and
T.
Smidt
, “
e3nn: Euclidean neural networks
,” arXiv:2207.09453 (
2022
).
39.
C. R.
Harris
,
K. J.
Millman
,
S. J.
van der Walt
,
R.
Gommers
,
P.
Virtanen
,
D.
Cournapeau
,
E.
Wieser
,
J.
Taylor
,
S.
Berg
,
N. J.
Smith
et al,
Nature
585
,
357
(
2020
).
41.
F.
Musil
et al, “
Efficient implementation of atom-density representations,
J. Chem. Phys.
154
,
114109
(
2021
).
42.
M. A.
Caro
, “
Optimizing many-body atomic descriptors for enhanced computational performance of machine learning based interatomic potentials,
Phys. Rev. B
100
(
2
),
024112
(
2019
).
43.
Y.
Lysogorskiy
et al, “
Performant implementation of the atomic cluster expansion (PACE) and application to copper and silicon,
npj Comp. Mater.
7
(
1
),
97
(
2021
).
44.
D.
Dusson
et al, “
Atomic cluster expansion: Completeness, efficiency and stability,
J. Comp. Phys.
(published online 2022).
45.
L.
Himanen
et al, “
DScribe: Library of descriptors for machine learning in materials science,
Comp. Phys. Comm.
247
,
106949
(
2020
).
46.
Q.
Qiming
et al, “
PySCF: the Python?based simulations of chemistry framework,
Wiley Interdiscip. Rev.: Comput. Mol. Sci.
8
(
1
),
e1340
(
2018
).
47.
P.
Virtanen
et al, “
SciPy 1.0: fundamental algorithms for scientific computing in Python,
Nat. Methods
17
,
261
272
(
2020
).
48.
M.
Galassi
et al, “
GNU scientific library,
Godalming: Network Theory Limited,
2002
.
49
R.
Schäling
,
The Boost C++ Libraries
(Boris Schäling,
2011
).
50
M.
Ludwig
(2015). “
Spherical harmonics library,
Github.
https://github.com/google/spherical-harmonics
Published open access through an agreement with EPFL