There is significant interest in improving the performance of batteries to increase electrification of transportation and aviation. Recently, performance improvements have been in large part due to changes in the composition of the cathode material family, LiNi_{x}Mn_{y}Co_{(1−x−y)}O_{2} (e.g., 111–622–811). Despite the importance of these materials and tremendous progress with density functional theory (DFT) calculations in understanding basic design principles, it is computationally prohibitively expensive to make this problem tractable. Specifically, predicting the open circuit voltage for any cathode material in this family requires evaluation of stability in a quaternary phase space. In this work, we develop machine-learning potentials using fingerprinting based on atom-centered symmetry functions, used with a neural network model, trained on DFT calculations with a prediction accuracy of 3.7 meV/atom and 0.13 eV/Å for energy and force, respectively. We perform hyperparameter optimization of the fingerprinting parameters using Bayesian optimization through the Dragonfly package. Using this ML calculator, we first test its performance in predicting thermodynamic properties within the Debye–Grüneisen model and find good agreement for most thermodynamic properties, including the Gibbs free energy and entropy. Then, we use this to calculate the Li-vacancy ordering as a function of Li composition to simulate the process of discharging/charging of the cathode using grand canonical Monte Carlo simulations. The predicted voltage profiles are in good agreement with the experimental ones and provide an approach to rapidly perform design optimization in this phase space. This study serves as a proof-point of machine-learned DFT surrogates to enable battery materials optimization.

## I. INTRODUCTION

Improving the performance of current Li-ion batteries is crucial toward increasing the electrification of transportation and aviation.^{1,2} A key driver for improvements over the last decade has been modifications of the cathode composition, with LiNi_{x}Mn_{y}Co_{(1−x−y)}O_{2} (NMC) cathodes occupying the majority of EV market share.^{3} Besides, low-Co content in cathode materials has become important for resiliency against price volatility and supply chain concerns,^{4} and this has led the U.S. DOE to identify a target for lowering the Co content to 50 g/kW h, i.e., less than 5 kg of Co for typical 100 kW h electric vehicle battery pack.^{5} Thus, methods to rapidly downselect the large design space of cathode materials are necessary to efficiently enable the inverse design needed for next generation battery materials.^{6}

While density functional theory (DFT) calculations have been proven useful in building design rules related to the performance of NMC cathodes,^{7–9} even a relatively simple prediction of the open circuit voltage as a function of Li insertion involves searching over a quaternary design space and has only been done for limited compositions.^{10} This large compositional space is computationally intractable using DFT calculations. Thus, a DFT surrogate that obeys the underlying symmetries (e.g., translational, rotational, permutation), which can predict both energy and force for a given composition, can vastly improve the design and optimization efforts.

Machine-learning potentials have vastly improved in performance^{11,12} and are quickly approaching the accuracy of *ab initio* methods at a minuscule fraction of the computational cost. There exist numerous featurization and regression approaches with varying levels of complexity with different dataset size needs^{13} and increasing levels of interpretability.^{14} Featurizations are also necessary and effective for unsupervised learning in materials classification and inference.^{15} Additionally, these techniques have been shown to accurately and efficiently expand to many-component systems,^{16} enabling design searches not previously possible. In this work, featurization is performed using atom-centered symmetry functions and neural network as the regressor trained on DFT data for NMC cathodes with an accuracy of 3.7 meV/atom and 0.13 eV/Å for energy and force, respectively. The final calculator uses free, open-source software and can be easily distributed and shared. Hyperparameter optimization of parameters in the atom-centered symmetry functions is done using Bayesian optimization through the Dragonfly package. Using this machine-learned calculator, we test the ability of predicting thermodynamic properties within the Debye–Grüneisen model. We find good agreement with respect to benchmark DFT calculations for most thermodynamic properties, including the predicted lattice constants, Gibbs free energy, and entropy, indicating a robust learning of the energy landscape. Then, we use it to calculate the open circuit voltage as a function of Li insertion by simulating the Li-vacancy ordering using grand canonical Monte Carlo simulations. The predicted voltage profiles are in good agreement with the experimental ones. This provides a strong proof-point of open-source, easily shareable machine-learned DFT surrogates to revolutionize battery materials optimization.

## II. MODEL FITTING

### A. Data

A training set of density functional theory calculations was generated by sampling the NMC phase space in the layered O3^{17} (also known as H1^{18}) structure seen in Fig. 1 as well as the O1 structure. These data were generated in the same way as in the previous work exploring the fully lithiated NMC space through the use of cluster expansion^{7} and also include structures with lithium vacancies. This was done by sweeping through the composition space and randomly arranging the transition metals and lithium atoms. The uniqueness of the structures was determined by comparing their n-body cluster interactions up to four body interactions over a maximum length of 6 Å to ensure no repeated data. After the general atomic arrangements were picked and determined to be unique, a lattice constant search and internal relaxation of atomic coordinates were performed in order to give a collection of atomic positions and lengths of interactions for the same composition and general atomic arrangement. All data were generated using the GPAW package real space implementation of the projector augmented wave method of density functional theory.^{19} The Bayesian error estimation functional^{20} was used to handle the exchange correlation potential at the level of the generalized gradient approximation. Full details of the calculations are available in the supplementary material. The total dataset of 12 962 points was then split into training, testing, and validation sets with the proportions of 80:10:10.

### B. Atom-centered symmetry functions

Within this work, we utilize the atom-centered symmetry functions suggested by Behler and Parrinello^{21,22} as implemented in the Atomistic Machine-learning Package (AMP).^{23} In this scheme, the total energy of a material is decomposed into the sum of the contributions from each atom,

The contribution to the total energy of each atom is computed using a set of Gaussian descriptor functions passed to a fully connected feed forward neural network. These functions, which encode the atomic position as well as the local neighborhood of each atom, must however be invariant to rotations and translations. Additionally, the total energy must be invariant to permutations of atomic labelings, and therefore, the neural network weights must be identical for every atom of the same species. The so called Behler–Parrinello symmetry functions used here that satisfy these requirements are

where

The first Gaussian symmetry function $Gi2$ captures the interactions of an atom i with all atoms j within a cutoff distance, *r*_{c}, while the $Gi4$ symmetry function captures the three body interactions of i with all j and k such that all of the atomic separations *r*_{ij}, *r*_{ik}, and *r*_{jk} are less than the cutoff distance and *θ*_{ijk} is the angle formed by *r*_{ij} and *r*_{ik}. To encode the chemical identities of the material, there exists a separate symmetry function for each unique combination of atomic identities that could be interacting. In these symmetry functions, a series of tunable parameters exist that control which specific bond length and bond angle interactions are captured for each symmetry function. Additionally, many versions of these symmetry functions can be used in conjunction to provide a robust featurization of the data. To ensure a computationally efficient model, the number of total symmetry functions within this study was limited to two forms of $Gi2$ and two forms of $Gi4$ such that with the chemical identity combinatorics, there are a total of 40 symmetry functions for each atomic species. Each set of symmetry functions was fed to a neural network with two hidden layers each with 30 nodes per layer, to finally predict the energy combination of that atomic species with the total energy of the material. The force on each atom can also be predicted using the analytical gradients of these fingerprints with respect to atomic positions.

### C. Hyper-parameter optimization

In order to determine ideal parameters for fingerprinting, a Bayesian optimization approach was taken by using the Dragonfly package.^{24} The 40 total fingerprints used correspond to two distinct G2 fingerprints with *r*_{s} = 0 and two different *η* values to be optimized and two G4 fingerprint types with *ζ* = 4, *γ* = −1 and 1, and a single *η* parameter to be learned. The Dragonfly algorithm searched this three-dimensional parameter space, attempting to minimize the prediction error on a test set. Additionally, the Dragonfly architecture used a series of acquisition functions in order to ensure an efficient search for the optimum value as some acquisition function will perform better than others for some tasks. These acquisition functions include the Gaussian process upper confidence bound (GP-UCB),^{25,26} Thompson sampling,^{27} expected improvement,^{28} and top-two expected improvement.^{29} The optimization process can be seen in Fig. 2 with labeling for which the acquisition algorithm was used for each evaluation. The acquisition functions balance explore and exploit strategies with the Dragonfly algorithm deciding this based on the learned Gaussian process response surface up to that point. In general, we find that the explore strategy (GP-UCB) tends to have a higher RMSE but enables learning a richer response surface.

For each evaluation, the neural networks were trained on energies only to a level of 1 meV/atom training error. The objective function for the optimization was then the root mean squared error on a test set. The best model found using this algorithm had an RMSE of 3.81 meV/atom on the test set.

### D. Force training

While these atom-centered symmetry functions fed to a neural network have demonstrated success in predicting the energy of materials as a function of the identity and positions of the atoms, to approach the utility of the underlying DFT, the prediction of forces is also needed. Unfortunately, this has remained a struggle for multi-component systems. The procedure of force training involves setting proportion parameters for the relative weighting of the energy loss and the force loss. The loss function is given as

where $Ej^$ and *E*_{j} are the predicted and reference energies of the *j*th data point, $F^ik$ and *F*_{ik} are the predicted and reference forces on the atom i of the *j*th data point in the k direction, and *α* is the force coefficient that sets the weighting of the force and energy sum of squared errors. A larger weighting for the force error will ensure more accurate force predictions at the expense of energetic accuracy. Therefore, for a fixed set of fingerprints, the quality of the force and energy fit will depend largely on this force parameter. Additionally, due to the smoothness of the functional form chosen and its analytical integration, forces predicted from machine-learned potentials have been known to be more accurate than the underlying data.^{30} Therefore, given the possible errors associated with the prediction of forces from a finite-difference implementation of DFT, attempting to match the DFT predicted forces too precisely may harm the accuracy of the potential. We trained models with four different force parameters in an attempt to find the largest force parameter that can train a model with comparable energetic accuracy of the energy only model. During the neural network training for the different force parameter cases, the loss on a holdout set was evaluated every ten epochs and the various force parameter valued models were then compared. The optimal force parameter tested was *α* = 0.001 giving a test error of 2.15 meV/atom and 0.142 eV/Å in energy and forces, respectively. A more detailed discussion of the effect of the force coefficient is shown in the supplementary material. The results of this neural network training can be seen in Fig. 3. To verify the success of the final selected model, a third validation set was used giving a root mean squared error of 3.69 meV/atom and 0.129 eV/Å in energy and forces, respectively.

## III. PROPERTY PREDICTION

We now turn to investigate how we can use this computationally efficient and accurate prediction of energy and forces to quickly and effectively predict the properties and performance of these materials. Additionally, we will see the ability of this model to replicate both DFT predictions and structure optimizations and predict the voltage with relatively good experimental agreement.

### A. Thermodynamic properties

We first look at the prediction of vibrational properties. By approximating a material as an elastic continuum with a linear phonon dispersion, the Debye–Grüneisen model can successfully and computationally efficiently predict the vibrational and thermodynamic properties of materials.^{31,32} Using energy vs volume data along with the Poirier–Tarantola^{33} equation of state fit, we use the Debye–Grüneisen model as implemented in the dePye^{34} framework. We have predicted the vibrational properties, including entropy and Gibbs free energy, of LiNiO_{2} using both density functional theory and the machine-learning potential trained within this work. To generate the energy–volume curve as an input for both DFT and the Behler–Parrinello Neural Network (BPNN), starting from the experimental lattice parameters and atomic positions of LiNiO_{2}, the c lattice parameter is optimized first and then the a lattice parameter, as seen in the crystal structure in Fig. 1. This was done by computing the energy of the material that is 0.98, 0.99, 1.0, 1.01, and 1.02 times the original lattice parameter and then fitting an equation of state.^{35} The internal atomic positions were then relaxed to a maximum force of 0.05 eV/Å for DFT and 0.25 eV/Å for the BPNN. As seen in Fig. 4, there is good agreement in the energy–volume curves, the predicted minimum energy, and the predicted volume, demonstrating the ability of the machine-learning potential to recreate the structure optimization routine done by DFT. Additionally, in Fig. 4(c), the predicted difference in Gibbs energy at 300 K is 0.05 eV per formula unit, which is of the same order of magnitude as the accuracy of DFT. Predictions of other derived properties, enthalpy, volume, heat capacity at constant pressure, and bulk modulus, are shown in the supplementary material.

### B. Grand canonical Monte Carlo

We now turn to understanding these layered oxide materials as cathodes for lithium ion batteries. This is done by understanding the lithium vacancy ordering as the lithium atom is inserted or removed, representing discharge or charge, respectively. To simulate the Li-vacancy orderings within various compositions of NMC materials, grand canonical Monte Carlo was used. The initial transition metal orderings were simulated using a previous cluster expansion fit and canonical Monte Carlo at 300 K.^{7} Once the orderings were determined, the lattice constants were optimized and the internal atomic potentials of the fully lithiated cathode were relaxed with the machine-learning potential in this work to a maximum force of 0.25 eV/Å or a maximum of ten steps. As mentioned above, the c and a lattice constants were optimized separately using an equation of state fit. To ensure the most optimal lattice constants, the procedure of fitting the lattice constants and relaxing to a maximum force of 0.25 eV/Å or ten steps was repeated again for the fully lithiated state. All of the lithium atoms were then removed, and this procedure was repeated again twice with a maximum force stopping criterion of 0.55 eV/Å. The number of relaxation steps was capped at 10 to save computational costs and ensure there was no runaway in the prediction of forces. We found that in some cases, the first atom position relaxation would begin to quickly reach a maximum force close to the chosen cutoff but then show a constant increase in the maximum predicted force and did not always reach the desired stopping criterion. After the second lattice constant search, the forces would continue to decrease successfully. This could possibly be improved by modifying the optimization algorithm or using an adaptive step size for the position updates. Our choice of 10, however, yields good results with respect to reproducing DFT and experiment. We expect the prediction of forces for the lithiated structures to be more accurate, and therefore, we saw improved relaxation of the fully charged, delithiated state when it was previously relaxed in the discharged state. Additionally, since the positions and separations of the oxygen atoms differed in the two discharged states, relaxing to a higher maximum force yielded better predicted voltage curves. After the final relaxation of the fully delithiated state, a grand canonical Monte Carlo simulation was performed with the machine-learning potential as the estimator of the energy of the system.

The simulation was run for N sweeps, where N is the fixed number of lattice sites on which a Li atom may sit. Each sweep consists of a trial of either insertion or removal of a lithium atom, followed by N trial steps of swapping a lithium atom and a vacancy. The probability of acceptance for insertion or removal is given by^{36}

where Λ is the thermal de Broglie wavelength, N is the number of lithium atoms already in the material, and V is the volume accessible to the lithium atom given by^{37} $V=Vcell\u2212N43\pi rLi3$. Here, we use the van der Waals radius of lithium *r*_{Li} = 182 *pm*.^{38} Then, the lattice constants of the resulting structure for that sweep were optimized to account for the change in the lattice constants as a function of the lithium content.

The chemical potential of lithium was calculated as the energy of a single bcc Li atom. It has been previously shown that the prediction of formation enthalpies of two similar materials is correlated, and therefore, the accuracy of predicting the formation enthalpy can be improved if it is referenced to a material where the experimental formation enthalpy is well known.^{39} We therefore calculate the chemical potential of Li by referencing it to LiCl, which has an oxidation state of 1+ just as the lithium atom will have when inside the electrode materials being simulated here. Thus, $\mu Li=ELiClDFT\u2212EClDFT\u2212\Delta Hfexp$, where we use the experimental formation enthalpy $\Delta Hfexp=\u22124.231\u2009eV$.^{40} The DFT references are the gas state of diatomic Cl_{2} and crystalline LiCl.

Once the Monte Carlo simulation has run sufficiently long and has moved from the delithiated starting point to the maximum lithium discharged state, the stability of the lithium vacancy orderings is analyzed through a convex hull analysis. The results of the grand canonical Monte Carlo routine and the resulting convex hull for LiCoO_{2} and LiNiO_{2} can be seen in Fig. 5(a). We also show the lattice constants as a function of the lithium content to demonstrate the machine-learning potential’s ability to largely capture the known lattice constant changes as a function of the lithium content.^{41–43} We see the layer separation determined by the c lattice parameter in Fig. 5(c) expands as more lithium is inserted into the material. For LiCoO_{2}, we see a maximum in the separation around *x*_{Li} = 0.4 and, as lithium increases, a small decrease in the layer separation matching experimental measurements.^{43} For the in plane layer size, we also see the correct expansion with the increasing lithium content above *x*_{Li} = 0.4 for both LiCoO_{2} and LiNiO_{2};^{41} additionally, the lattice parameter was predicted to be larger for *x*_{Li} = 0 than for *x*_{Li} = 1 as is experimentally observed.^{44}

### C. Open circuit voltage

With the convex hull of stable lithium vacancy orderings known, the open circuit voltage of these cathode materials with respect to the lithium metal can be easily determined. First, the average potential over the entire discharge can be calculated as

For a fixed transition metal composition, M = Ni_{y}Mn_{z}Co_{1−y−z}. The results shown in Table I compare the predicted average voltage with that of the previous computational work using cluster expansion giving good agreement. Additionally, a transition between two different lithium compositions (states-of-charge), *x*_{1} to *x*_{2}, is given by

A detailed calculation of the open circuit voltage for this transition can be predicted by using the Nernst equation and the computational lithium electrode^{45} as

Phase . | CE . | BPNN . |
---|---|---|

111 | 3.88 | 3.93 |

532 | 3.89 | 3.87 |

811 | 3.86 | 3.81 |

622 | 3.92 | 3.96 |

001 | 4.10 | 4.05 |

Phase . | CE . | BPNN . |
---|---|---|

111 | 3.88 | 3.93 |

532 | 3.89 | 3.87 |

811 | 3.86 | 3.81 |

622 | 3.92 | 3.96 |

001 | 4.10 | 4.05 |

The grand canonical Monte Carlo routine to predict the convex hull of lithium content for a given transition metal ordering as described above was repeated for a collection of NMC phases, and the resulting voltage prediction can be seen in Fig. 6, demonstrating the ability of this approach to compare a large collection of transition metal compositions. We see from this plot that LiCoO_{2} has the highest voltage, with a general decrease as the amount of Co decreases. The popular NMC111 phase with equal parts of Ni, Mn, and Co is the highest Mn content material investigated here and has the largest variation in voltage values over the discharge of the material. As the amount of Ni increases, for example, from 622 to 811, we see a flattening out of the voltage curve. The average voltage throughout the materials however remains relatively constant despite these variations in the shape in the voltage curve, demonstrating the need for a detailed, but previously computationally impractical full open circuit voltage calculation.

To benchmark these predictions with respect to well known experimental voltage profiles, we show in Fig. 7 the comparison of our prediction with the experiment for LiCoO_{2} and LiNiO_{2}. We see qualitatively good agreement with the experiment with errors no worse than would be expected from using DFT. One of the main errors in our prediction is around *x* = 0.2 for the LiNiO_{2} curve. At this Li content, the layered structure is known to be in the so called H2 phase vs the H1 phase used for the grand canonical Monte Carlo simulations. While these phases only differ by an offset in the layer alignment, the machine-learning potential within this work has not been trained on these data, and therefore, it is not included.

The ability to generate the voltage profile and lattice structure dynamics as a function of the state-of-charge (SOC) for any arbitrary NMC composition marks the first step toward a computationally feasible optimization workflow for relevant performance properties of cathode materials. To produce each of the predictions seen in Fig. 6 with DFT would require well over 10 000 single point calculations—a nearly impossible task. Not only can the properties of known NMC materials be predicted, but new never before synthesized materials can be explored saving hugely cost and time in the development of high Ni cathodes needed for the next generation electrification. The use of a machine-learning potential enables the prediction of lattice expansion, which is known to cause a capacity fade in Ni-rich materials,^{48} a capability not available in previous cluster expansion approaches for computational analysis of cathode materials. This represents an important proof-point of the capability of machine-learning potentials for optimization of battery materials.

## IV. CONCLUSION

Within this work, a computationally efficient and accurate neural network potential has been generated to model LiNi_{x}Mn_{y}Co_{(1−x−y)}O_{2} cathode materials. For the first time, the prediction of structural effects due to lithium insertion and removal as well as the prediction of the open circuit voltage can be done for any NMC composition within hours using a single core with a higher fidelity than had been previously seen in any computational study. Additionally, we use the open-source AMP implementation of the Behler–Parrinello scheme, which enables easy transfer of the final machine-learning calculator. We hope this machine-learning potential, which can accurately reproduce energies, forces, and structural predictions of density functional theory, will enable fast optimization of these cathode materials previously thought to be intractable.

## SUPPLEMENTARY MATERIAL

See the supplementary material for the training plot for other force coefficients and a comparison of other thermodynamic properties predicted from the BPNN vs DFT.

## DATA AVAILABILITY

The data that support the findings of this study are openly available in Github at https://github.com/BattModels/LithiumVacancyOrdering.

## ACKNOWLEDGMENTS

G.H. gratefully acknowledges the funding support from the National Science Foundation under Award No. CBET-1604898. Acknowledgment is also given to the Extreme Science and Engineering Discovery Environment (XSEDE) for providing computational resources under Award No. TG-CTS180061.

## REFERENCES

_{x}Mn

_{y}Co

_{z}O

_{2}cathode materials

_{1−2x}Co

_{x}Mn

_{x}O

_{2}(NCM) cathode materials

_{1−x}Ni

_{1+x}O

_{2}prepared with the excess lithium method

_{2}and LiCoO

_{2}electrodes

_{2}(R3m) for 4 volt secondary lithium cells

_{2}and LiNiO

_{2}upon Li intercalation and de-intercalation

_{1-x}Co

_{x}O

_{2}metastable oxides prepared by soft chemistry

_{2}thin film electrodes prepared by PLD

_{x}NiO

_{2}for 0 ≤ x ≤ 1

_{x}Co

_{y}Mn

_{z}O

_{2}