Machine learning (ML) continues to revolutionize computational chemistry for accelerating predictions and simulations by training on experimental or accurate but expensive quantum mechanical (QM) calculations. Photodynamics simulations require hundreds of trajectories coupled with multiconfigurational QM calculations of excited-state potential energies surfaces that contribute to the prohibitive computational cost at long timescales and complex organic molecules. ML accelerates photodynamics simulations by combining nonadiabatic photodynamics simulations with an ML model trained with high-fidelity QM calculations of energies, forces, and non-adiabatic couplings. This approach has provided time-dependent molecular structural information for understanding photochemical reaction mechanisms of organic reactions in vacuum and complex environments (i.e., explicit solvation). This review focuses on the fundamentals of QM calculations and ML techniques. We, then, discuss the strategies to balance adequate training data and the computational cost of generating these training data. Finally, we demonstrate the power of applying these ML-photodynamics simulations to understand the origin of reactivities and selectivities of organic photochemical reactions, such as cis–trans isomerization, [2 + 2]-cycloaddition, 4π-electrostatic ring-closing, and hydrogen roaming mechanism.
I. INTRODUCTION
Photochemical reactions are ubiquitous and responsible for many processes, including photosynthesis, vision, photocatalysis, and photodissociation. They generally occur in mild conditions with high atomic economies and provide synthetic tools with excellent spatiotemporal control over chemoselectivity, regioselectivity, and stereoselectivity. These essential features enable building molecular frameworks from designed molecules1–3 to structurally complex natural products.4–6 As such, photochemical reactions are gaining increasing importance in academic research and industrial applications for the synthesis of energy-dense fuels,7–11 innovation of energy storage devices,12–16 advancement of drug design,17,18 and development of novel materials.19,20
Photochemical reactions begin with a ground-state molecule absorbing a photon with an appropriate frequency to promote an instantaneous electronic transition to an excited state of the same spin multiplicity. The subsequent processes redistribute the absorbed energy along the molecular vibration modes on the excited-state potential energy surface (PES), which leads to structural transformations. In contrast to light-induced luminescence as the consequence of the radiative decay, photochemical reaction undergoes a nonradiative decay relaxing the photoexcited molecule to the ground state as a product or ground-state reactive intermediate.
Developments in time-resolved spectroscopic technologies allow chemists to study photochemical reactions on an ultrafast timescale.21–23 In principle, the instrument, for instance, transient absorption spectroscopy, can probe the excited-state dynamics of photoexcited molecules and provides structural information. At the same time, the reacting species evolve during photochemical reactions. However, interpreting the molecular vibrations becomes increasingly complicated for conformationally flexible molecules with competing non-radiative mechanistic pathways. The increasing degrees of freedom lead to elusive spectroscopic data, where the excited-state vibrational modes involve multiple synchronous bond stretching or torsions.24 Moreover, the timescale resolution of spectroscopy techniques (i.e., transient absorption and time-of-flight mass spectroscopy) is infeasible to capture all transient intermediates and molecules in excited states.
Quantum mechanics provides a theoretical framework to understand the relationship between molecular excited-state structures and their reactivities and selectivities. Calculations of the electronic transitions and the mechanistic critical points along the relevant excited-state PES provide unprecedented knowledge of the time-dependent evolution of molecular excited states toward photoproducts.
The theories derived from a static PES in equilibrium often fail to explain the reaction mechanisms of the photochemical reactions over a vast range of timescales (10−15–10−3 s). As such, dynamical effects must be considered to understand experimental quantum yields or predict the outcome of unknown photochemical reactions. These dynamical effects can be simulated with nonadiabatic molecular dynamics (NAMD). The mixed quantum-classic (MQC) trajectory formalism of NAMD is widely used to capture the dynamical effects on the excited- and ground states on-the-fly. The MQC trajectory propagates along an excited-state PES using classical mechanics by solving Newton's equation of motion. Quantum chemical methods evaluate nuclear forces by solving approximations to Schrödinger's equation. Despite the flexibility and accuracy of NAMD simulations, they can have a prohibitive computational cost because they rely on thousands of consecutive quantum mechanical (QM) calculations along a reaction coordinate. Thus, the NAMD simulation based on the QM calculations of the excited state PESs, e.g., time-dependent density functional theory (TDDFT) and complete active space self-consistent field (CASSCF) calculations, are commonly limited to medium-sized molecules with 40–50 atoms and timescales of only a few picoseconds.25,26 Therefore, many research groups are developing new approaches to extend the length and timescale of NAMD, such as improving the accuracy of semiempirical methods,27–29 fitting analytical potential with linear vibronic coupling model,30,31 and accelerating QM calculations with GPU.32
With the rise of high-performance computing resources, researchers worldwide have generated massive amounts of data; machine learning (ML) has been used to design molecules that intelligently satisfy multiple properties.33,34 Examples of these include material design and drug discovery,35,36 reaction barrier prediction,37 transition states search,38–40 solving the Schrödinger equation41,42 modeling wave functions,43,44 optimizing density functionals,45,46 and predicting IR,47 UV-Vis,48,49 and nuclear magnetic resonance (NMR)50 spectra. The main advantage of ML is that it can effectively learn the N-dimensional relationships of any input–output relation datasets that are otherwise overwhelming to humans. This feature enables the development of ML potential as an alternative for QM calculations to predict excited-state PESs for NAMD at a negligible computational cost without losing accuracy.
This review provides an overview of recent developments in ML photodynamics. We will first discuss the fundamentals of the theoretical methods integrated into the ML photodynamics approach, including the QM methods available for generating accurate reference data of excited-state electronic properties. Then, we will review the NAMD approaches compatible with ML potentials for simulating photochemical reactions, the ML techniques ready for fitting excited-state PESs with QM reference data, and the training data generation strategies. Finally, we will discuss the emerging chemical problems in photochemical reaction research and the technical details of the employed QM and ML methods studying photochemical reaction mechanisms, such as cis–trans isomerization, [2 + 2]-cycloadditions, 4π-electrocyclic ring-closing reactions, and hydrogen roaming.
II. METHODS
We first introduce the methods used in the ML photodynamics simulation. Figure 1 illustrates a brief flow chart for choosing appropriate methods in our experience, explained by the following discussions.
A. Machine learning methods
ML methods are popular for developing fast and accurate potential models for molecular dynamics simulations.51,52 For photodynamics simulations, the ML potential learns the relationships between the molecular structures and the ground and excited-state electronic properties, for instance, energies, forces, nonadiabatic couplings (NACs), and spin–orbit couplings (SOCs). The molecular descriptor numerical representation encodes three-dimensional structural information. The ML model reads the molecular descriptor to predict the ground and excited-state energies, forces, NACs, and SOCs by minimizing the errors between the expected numbers and the ground-truth data. Selection of the ML models and molecular descriptors needs extraordinary carefulness because they are two main factors determining the accuracy of the ML potentials for photodynamics simulations.
Kernel and neural networks (NNs) are popular methods for training ML potentials (Fig. 1). The kernel methods (e.g., kernel ridge regression and Gaussian process regression) are nonparametric to search the form of the kernel functional depending on the training data that make probabilistic predictions. The complexity and number of parameters of kernel methods increase as the training data expands. Neural networks optimize a fixed number of predefined hidden layers and node parameters by minimizing a loss function (e.g., mean square error). This NN architecture is called a multiple-layer perceptron (MLP). The optimization of NN also requires tuning hyperparameters (e.g., activation functions, number of hidden layers and nodes, learning rates, and regularizations). Since the complexity of the NN is independent of the training data, it is expected to have good computational scaling and memory usage for large training sets, including photodynamics simulations. Various ML potentials have been developed to improve the accuracy of the predicted potential energies and forces, such as high-dimensional neural network potential (HDNNP),53 SchNet,54 PhysNet,55 Accurate NeurAl networK engINe for Molecular Energies (ANI),56,57 deep potential smooth edition (DeepPot-SE),58 symmetrized gradient-domain machine learning (sGDML),59 permutationally invariant kernel ridge regression using RE descriptor and the Gaussian kernel (pKREG),60 reproducing kernel Hilbert space (RKHS),61 Gaussian approximation potential (GAP),62 neural equivariant interatomic potential (NequIP),63 and local equivariant deep neural network interatomic potential (Allegro).64 Some packages of a collection of ML methods are MLatom60,65 and fast learning of atomic rare events (FLARE).66 A comparison of NN and kernel methods show similar accuracies for ML photodynamics simulations.67 More benchmarks on various ML potentials are reviewed in Refs. 51, 68, and 69.
Molecular descriptors must satisfy the translational and rotational invariance to hold a valid and machine-learnable mapping between molecular structures and corresponding energies and forces. These requirements motivate the development of global descriptors, such as internal coordinates, Coulomb Matrix,70 and inverse distance matrix.71 The global descriptors above are not invariant with respect to the permutation of chemically equivalent atoms. Permutation-invariant representations are created with sorting techniques (e.g., bag-of-bonds72 and randomly sorted Coulomb matrices73). However, it could result in discontinuous PESs, where the ML potential becomes challenging to train. Permutationally invariant polynomials74 is another solution applicable only for small molecules as the number of permutations grows exponentially with molecular complexity. We have introduced the permutation map technique75,76 to restore the permutationally invariant structural information from existing training data according to molecular symmetry. Alternatively, we can choose a local descriptor representing the local chemical environment around each atom with a set of basic functions in a given cutoff sphere. The local descriptors are permutationally invariant and size extensive because they decompose the total energy into atomic contributions. Some examples of local descriptors are atom-centered symmetry functions (ACSFs),77 smooth overlap of atomic positions (SOAP),78 and Faber−Christensen−Huang−Lilienfeld (FCHL).79 Pozdnyakov and co-workers addressed the significance of the higher-order terms in the local descriptors.80 The high-order terms improve the atomic structure representations by adding angular information beside the interatomic-distanced-based radial information. Using physics-informed descriptors can achieve higher accuracy in ML performance while substantially reducing the size of the required training data.
Recent NN-based ML potentials incorporate the local descriptor in the model, thus enabling simultaneous optimization of the descriptors and model parameters during the training. Message-passing neural networks (MPNNs) implement such descriptors.81 MPNNs embed the atomic information (e.g., atom type) using discrete numbers, called node features, and the interatomic information using a transformation function of the pairwise atomic distances, called edge features. The node and edge features go through the message-passing layer that updates the node features by convoluting the neighboring node features weighted by the corresponding edge features within a local environment following the molecular connectivity graph. Iterative updating of node features with a message-passing layer refines the encoded messages on each atomic center. Their sum is used to fit the molecular properties (e.g., total energy). Increasingly sophisticated convolutional layers are introduced to extract the atomic features acting as a pattern filter for the local atomic environment. This type of convolutional layer was implemented in deep tensor NN (DTNN)54 and the descendant SchNet,82 which uses Gaussian functions to learn the internuclear distance for fitting molecular potential and force field. Similar NNs are PhysNet,55 HIP-NN,83 DeepPot-SE,58 SchNOrb,43 and SpookyNet.84 The Gaussian functions only transform the node and edge features as scalers, which is inefficient for learning the angular information. Spherical harmonic functions allow NN to learn the node and edge features in the form of tensors. These geometrical tensors can represent the radial, angular, and high-order structural information, significantly accelerating the training with a minimum amount of data. The convolution of tensors requires special tensor product operations satisfying the translational and rotational invariance, which is implemented in Euclidean equivariant neural networks (e3nn).63 NequIP63 and Allegro64 are two applications of e3nn for learning the atomic potential. NequIP outperforms existing ML models with high accuracy but up to three orders of magnitude fewer datasets;63 and the Allegro, with similar accuracy to NequIP, achieves high learning and prediction efficiency in simulating a system with 100 × 106 atoms.64 To the best of our knowledge, many graph convolutional NNs (GCNN) are available for ground-state molecular dynamics, but only a few have been adapted for photodynamics simulations.
Researchers can learn excited-state PESs based on a time series of molecular geometries in NAMD trajectories with recurrent NNs (RNNs).85 The RNN uses a directional connective graph that passes the outputs of the previous step as inputs of the current step. It predicts further data by considering the historical evolution's memory information, which extracts the temporal features. The long short-term memory (LSTM) units are used to avoid gradient vanishing and exploding problems.86 The LSTM-NN has been used to predict mixed quantum-classical Liouville dynamics by Kananenka and co-workers.87 Recent work by Gu and Lan reported successful LSTM-NN in multi-configuration time-dependent Hartree (MCTDH) dynamics88 and quasi-classical dynamics based Meyer–Miller mapping Hamiltonian (MM-MQC).89 Due to the complexity of input data constructions—a time series of geometries rather than a single geometry, the LSTM-NN has not been used for simulating photochemical reactions. Nevertheless, the LSTM-NN shows a promising solution to avoid problems in learning NAC. The NACs are essential in the fewest switches surface hopping (FSSH) method because it determines the time evolution of the state population, which controls the surface hopping probability. A recent work by Shen and co-workers showed that LSTM-NN could learn the state population in a given number of consecutive snapshots in surface hopping trajectories.85 As such, the NACs are no longer needed as the predicted state population matrix can be directly used to compute the surface hopping probability for FSSH. Although this approach successfully worked for analytical models, applications in molecular systems have yet to be reported.
B. Nonadiabatic molecular dynamics
In principle, ML potentials can accelerate the NAMD in various flavors from quantum (i.e., wavepacket,90 exact factorization,91 MCTDH,92 and multiple spawning93) to mixed quantum-classic (e.g., mixed quantum-classic Liouville equation,94 Ehrenfest dynamics,95 and surface hopping96) regime. To date, most literature reported the application of ML potential for MCTDH88,97 and trajectory surface hopping (TSH).50,94 Our review of the ML photodynamics is focused on the NAMD methods using TSH trajectories, as it is the most applied approach to study the excited-state and photochemical reactions of molecular systems with ML. Nuclear motions are treated classically along a single PES. Omitting nuclear quantum effects substantially accelerates these simulations and enables the simulation of full-dimensional NAMD photodynamics of interesting, practical molecules.25,26 The most time-consuming step is the computation of ground- and excited-state energies and forces because the MQC trajectory requires sequential on-the-fly computations at each time step. Furthermore, a typical NAMD simulates an ensemble of independent MQC trajectories to obtain statistically meaningful results. As such, the number of trajectories multiplied by the number of nuclear geometries gives thousands of QM calculations resulting in a prohibitive computational cost.
Another solution to the NAC problem relies on on-the-fly computed NACs with the Baeck–An (BA) approximation;121,122 the BA approximation computes NACs as a function of the second-order derivative of the energy gap with respect to the nuclear positions. While calculating the second-order derivative of the energy gap is time-consuming, the auto differentiation techniques are compatible with ML techniques due to the analytical evaluations of second-order derivatives. By transferring the energy gap derivatives from the geometric domain to the time domain, we can numerically compute the second-order derivative by evaluating the temporal changes of energy gaps along with three adjacent time steps.123 The approximated NACs are also termed curvature-approximated time derivative couplings (κTDC), as their quantities only depend on the curvature of PES.123 The κTDC approach enables ML FSSH photodynamics, which is notably less empirical than ZNSH dynamics. Figure 1 demonstrates that the NAMD method choice for ML photodynamics simulations depends on the rigor of the available NAC computations.
C. Quantum mechanical methods
Choosing an appropriate QM method for excited-state calculations is essential to prepare accurate training data for ML photodynamics simulations (Fig. 1). Single and multireference methods are two common types of QM methods popular for studying electronic properties and nuclear dynamics in excited states. Single-reference (SR) methods compute excited-state electronic structures using the ground state as a reference under the adiabatic approximation, such as time-dependent density functional theory (TDDFT),124 approximate second-order coupled-cluster (CC2),125 and algebraic diagrammatic construction to the second order [ADC(2)].126,127 However, SR methods suffer some deficiencies in studying excited states in the following cases. (1) The SR method based on single excitations, e.g., TDDFT, cannot describe the double excitation. (2) The SR methods do not treat the ground and excited-state wavefunction on equal footing; thus, it cannot correctly compute the excited states degenerate to the ground state. These situations are often encountered in computing photochemical reaction pathways, where the doubly excited state could participate in the reaction, and the reactions could go through regions of degeneracy between the ground and excited state (e.g., avoided crossings and conical intersections). Moreover, the SR methods produce the wrong dimensionality for the branching plane of singlet conical intersections because the single excitation formulation computes incorrect interaction between the ground and excited states. This issue results in a problematic energy relaxation path from a conical intersection.128,129 Spin–flip technique (e.g., SF-TDDFT)130,131 was introduced to restore correct PESs surrounding conical intersections. Other methods, such as the mixed reference spin–flip TDDFT (MRSF-TDDFT)132 and spin-restricted ensemble-referenced Kohn–Sham (REKS),133 can compute conical intersection at the cost of SR methods. It should be noted the REKS method is supported with GPU acceleration in TeraChem, which makes it an efficient candidate to compute excited-state PESs.32 In addition, the development of particle–particle random phase approximation (pp-RPA),134–136 hole–hole Tamm–Dancoff approximation (hh-TDA),108,137 and multistate density functional theory (MSDFT)138–140 pave alternative ways to correct descriptions of conical intersections.
Multireference (MR) methods are sufficiently robust to describe conical intersections' electronic structures and nuclear geometries. Many works have shown that the conical intersections computed by multiconfigurational methods are required to explain the mechanism and obtain the structure–reactivity relationships in photochemical reactions.103–105 The MR methods often use CASSCF reference wave function. The CASSCF wavefunction includes the full configurational interactions in a subset of molecular orbitals, called active space, which can fully address issues in the above-mentioned SR method. Two popular MR methods are complete active space second-order perturbation theory (CASPT2)141,142 and multireference configuration interaction (MRCI).143–146 CASPT2 directly adds second-order perturbative corrections (i.e., dynamical correlation) to the CASSCF excited-state energies; MRCI computes the electronic excitations by generating single and double electron transitions on top of the CASSCF wavefunctions. Moreover, the extended multistate (XMS),147 dynamically weighted (XDW),148 and rotated multistate (RMS)149 variants of CASPT2 improve the energy corrections near the state-crossing regions. It is important to remember that the MR methods rooted in CASSCF have limitations. (1) The active space and the state-averaging choice are not trivial. In our experience, the results depend on the researcher's expertise with multiconfigurational methods and active space selection prowess. There has been an important effort in the community to automate the selection of active spaces with machine learning.150,151 These advances have lowered barriers to entry for these computations and are quite promising for vertical excitation energies. There is limited guidance on whether these active spaces will be physically or chemically relevant for light-driven bond-breaking or forming processes (i.e., photochemistry). (2) Modern QM software limits the active space size to 22 electrons and 22 orbitals due to the exponential growth of the configuration state functions. Recent developments in the adaptive sampling configuration interaction (ASCI) method152,153 have expanded the active space size beyond 50 electrons and 50 orbitals. However, the sampled configurations could vary for different structures, which could lead to discontinuity of the PES, making the dynamics simulation numerically unstable. The density matrix renormalization group (DMRG) can also be used to overcome the limit of active space.95,154,155 (3) MR calculations are significantly more expensive than SR calculations because they need to optimize the configuration and orbital coefficients simultaneously or asynchronously. They are only affordable to study the excited states of small molecules.104,156,157 CASSCF is more frequently used for medium-sized molecules,105,158 although the lack of dynamical correlation often overestimates the excitation energies.159 As such, the CASSCF results must be validated against the multireference calculations to confirm the consistent electronic nature (i.e., topology) of the excited-stated PESs along the photochemical reaction pathway. The multiconfiguration pair-density functional theory (MCPDFT) provide a promising compromise between the accuracy and cost of MR methods, which combines CASSCF for static correlations and DFT for dynamical correlations.160–162 In some cases, neither SR nor MR methods may be sufficient to describe the entire photochemical reaction pathways, where a combination of QM methods can be employed.163,164
Given the difficulties discussed above, one must choose a QM method carefully for training ML potential. We summarize the capabilities that need to be considered for choosing a proper QM method in Table I.
. | Static correlation . | Dynamic correlation . | Double excitation . | S0/Sn crossing . | Sn/Sm crossing . | Orbital selection . | Configuration selection . | Used to train ML potential . |
---|---|---|---|---|---|---|---|---|
TDDFT | No | Yes | No | No | Yes | No | No | |
ADC(2) | No | Yes | Yes | No | Yes | No | No | Ref. 163 |
CC2 | No | Yes | Yes | Yes | No | No | No | |
SF-TDDFTa | Yes | Yes | Yes | Yes | Yes | No | No | Ref. 115 |
MSDFT | Yes | Yes | Yes | Yes | Yes | No | No | |
pp-RPA/hh-TDA | Yes | Yes | Yes | Yes | Yes | No | No | |
REKS | Yes | Yes | Yes | Yes | Yes | Yes | No | Ref. 165 |
CASSCF | Yes | No | Yes | Yes | Yes | Yes | No | Refs. 111 and 75 |
ASCI | Yes | No | Yes | Yes | Yes | Yes | Auto-sampled | |
MCPDFT | Yes | Yes | Yes | Yes | Yes | Yes | No | |
MRCI | Yes | Yes | Yes | Yes | Yes | Yes | No | Refs. 71 and 94 |
CASPT2b | Yes | Yes | Yes | Yes | Yes | Yes | No | Refs. 76 and 163 |
. | Static correlation . | Dynamic correlation . | Double excitation . | S0/Sn crossing . | Sn/Sm crossing . | Orbital selection . | Configuration selection . | Used to train ML potential . |
---|---|---|---|---|---|---|---|---|
TDDFT | No | Yes | No | No | Yes | No | No | |
ADC(2) | No | Yes | Yes | No | Yes | No | No | Ref. 163 |
CC2 | No | Yes | Yes | Yes | No | No | No | |
SF-TDDFTa | Yes | Yes | Yes | Yes | Yes | No | No | Ref. 115 |
MSDFT | Yes | Yes | Yes | Yes | Yes | No | No | |
pp-RPA/hh-TDA | Yes | Yes | Yes | Yes | Yes | No | No | |
REKS | Yes | Yes | Yes | Yes | Yes | Yes | No | Ref. 165 |
CASSCF | Yes | No | Yes | Yes | Yes | Yes | No | Refs. 111 and 75 |
ASCI | Yes | No | Yes | Yes | Yes | Yes | Auto-sampled | |
MCPDFT | Yes | Yes | Yes | Yes | Yes | Yes | No | |
MRCI | Yes | Yes | Yes | Yes | Yes | Yes | No | Refs. 71 and 94 |
CASPT2b | Yes | Yes | Yes | Yes | Yes | Yes | No | Refs. 76 and 163 |
The original SF-TDDFT has spin-contamination, the spin adapted (SA), and mixed reference (MR) SF-TDDFT do not have this problem.
The original CAPST2 lead to incorrect state crossing, which is corrected by the extended approaches, XMS-, XDW, and RMS-CASPT2.
D. Training data
The performance of ML potential in predicting photochemical reactions highly relies on the quality of training data. The training data for photodynamics simulations require sufficiently sampling the structures on the ground and excited-state PESs (i.e., configurational space). This task requires different types of training data from the commonly used molecular structure and properties database across the chemical compound space, such as QM7,70 QM7b,166 QM8,167 and QM9.168 The MD17 database, for instance, is popular for benchmarking ML potential across ground-state configurational space for small organic molecules.169 A more recent WS22 database provides 1.18 × 106 equilibrium and non-equilibrium geometries of molecules up to 22 atoms, sampled from Wigner distributions centered at different ground-state conformations.170 The design of the WS22 database has been quite helpful for benchmarking ML potential for excited-state dynamics. The diversity in chemical composition and accessible conformations assures a broad and statistically robust representation of the PESs.
Photodynamics simulations could lead to an immense amount of excited-state molecular geometries outside the existing databases. Unfortunately, a universal ML potential trained with existing data for the excited states of the molecules with various sizes and compositions is not available. The main reason is that the excitation energy is an intensive quantity, which does not depend on the number of atoms. The ML potential can learn atomic contributions to the excitation energy of a molecule; however, they are not transferable to other molecules. This problem limits the transferability of the ML potential for predicting the excited-state energy across the chemical species space. Whether existing another quantity to transfer the excited-state information is yet to be determined. Second, the excited-state electronic energy is not a smooth function of nuclear configurations, especially near the state-crossing region, if the number of computed excited states is insufficient. However, it is impossible to determine the number of excited states for all molecules and photochemical reactions. Molecules can go through various states in photochemical reactions depending on the nature of absorption and wavelength of the employed light source. Therefore, training ML potentials for photodynamics simulations requires generating tailored training data for a specific photochemical reaction.
There are two main strategies for training data generation, human-designed, and data-driven approaches. The human-designed approach is based on well-designed molecular structure sampling methods, such as the MD-based methods (umbrella sampling,171 trajectory-guided sampling,172 enhanced sampling,173 and metadynamics174), stochastic surface walk,175 Wigner sampling,176 and normal mode scans.56,71 These methods have been applied for learning thermal reactions; however, the data sampling efficacy is limited for complex photochemical reactions. For instance, the ground-state reaction pathways are distinct from the excited state(s) due to typically different topologies of the ground- and excited-state PESs. As a result, the sampled structures during ground-state MD simulations are likely irrelevant to those involved in the excited-state dynamics. For this reason, the excited-state MD is more suitable for sampling the structures. However, the excited-state MD becomes prohibitively expensive as the number of degrees of freedom increases, and the reliability heavily depends on the choice of the QM methods, as discussed in 2.3. Wigner sampling for the reactant or product geometries captures only accessible nonequilibrium geometries near local or global minima. Normal mode scans include many irrelevant molecular vibrations. A composite scheme that combines Wigner sampling and geometrical interpolation could overcome the individual limitation to generate a compact yet relevant initial set.111 A similar approach has been used to generate the WS22 database.170 The data-driven approach, also called adaptive sampling (Fig. 2) or active learning, is more efficient and nearly eliminates human bias. The objective is to train an initial ML potential with available computed training data and use it to explore the excited-state PESs with the active learning algorithm.
The adaptive sampling can be done in the configuration space or energy landscape. Lan and co-workers showed that comparing the Coulomb matrix of the predicted structure and the training set can determine the out-of-sample structures112 and Gómez-Bombarelli and co-workers implemented a geometric clustering-based method to search uncertainty structures,115 which both correspond to adaptive sampling in the configuration space. Marquetand and co-workers used a measure of uncertainty in the ML predicted electronic properties to determine the undersampled data,71 corresponding to adaptive sampling in the energy space. The uncertainty metric is available in the kernel methods, such as Gaussian process regression (GPR), as it always returns the prediction's covariance. However, a single NN model does not have an uncertainty measure. Our group has followed the concept of the query by committee,177 where two NNs are independently trained, and the standard deviation between their predictions quantifies the uncertainty. The adaptive sampling in configuration space highly depends on the sampled configuration in the initial training set, thus, is suitable to train ML potential for photochemical reactions with a dominated pathway. For photochemical reactions with multiple and possibly unseen pathways, it is difficult to generate a training set sufficient for the entire configuration space at the beginning. Thus, adaptive sampling in the energy space is recommended as it does not require predefined configuration space. It should be noted that once the adaptive sampling in energy space sampled all relevant regions in the configuration space, one can switch to the adaptive sampling in the configuration space to refine the training data and fitted PES. In addition to sampling structures in configuration and energy space, the energy gap can also be used to detect the new structures to improve the fitted PESs near the state-crossing regions.112,115 Once the uncertain data are collected in adaptive sampling, they are recomputed with multiconfigurational QM calculations to expand the original training dataset, iteratively improving the ML potential by retraining the model with new data until no undersampled data are found.
The advantage of adaptive sampling is that the training process automatically explores the excited-state PESs, searching all possible photochemical reaction pathways. It benefits the training data preparation as it requires knowledge of one of the possible competing photochemical reactions. The recent study on the photochemical [2 + 2]-cycloaddition of [3]-ladderdiene toward cubane has shown the efficiency of adaptive sampling, discovering the unseen structures and reactions mechanism based on exclusive information about the [2 + 2]-cycloaddition pathway.75
E. Multiscale approaches with ML potential
Herein, we briefly discuss the QM/MM and ONIOM methods to show how ML potentials can be adapted to accelerate the multiscale calculations. The QM/MM method depends on the level of integration between the QM and MM regions [the last term in Eq. (2)]. Ignoring the interactions between QM and MM regions, known as mechanical embedding, usually leads to large errors. Including the electrostatic interactions as the QM regions seeing the MM atoms as point charges, known as electrostatic embedding, is popular to obtain reliable results. The van der Waals interactions are also commonly added into the electrostatic embedding approach. The inclusion of the electronic polarization between the QM and MM regions can further improve the energy descriptions while the development of polarizable MM potential is still in progress.183 The ONIOM method provides a convenient framework to combine any two types of calculations without additional developments on their coupling term. For the QM/MM type calculations, the ONIOM(QM:MM) methods left the QM–MM interactions in the MM calculations for the entire system [the first term in Eq. (3)]. It falls into the mechanical embedding of the QM/MM approach if the MM calculations use non-polarizable potential and the QM calculations have no embedding charges, which is not recommended in general. Thus, the common practices of the ONIOM(QM:MM) approaches employ the electrostatic embedding QM calculations, which is comparable to the electrostatic embedding of QM/MM approaches. Similarly, the polarizable embedding ONIOM(QM:MM) can be achieved if the MM potential is polarizable. It should be noted that the advantage of the ONIOM methods appears to be combining the QM and QM′ calculations. The simple mechanical embedding scheme of ONIOM(QM:QM′) already includes electrostatic interactions and electronic polarization at the QM′ level. More accurate electrostatic schemes can be obtained by using the charge embedding QM calculations.184
The ML potential can be used to accelerate the QM, QM′, or MM calculations, enabling various schemes for multiscale calculations, such as ML/MM, ML/semi-empirical (QM′), or QM/ML. In the QM/ML scheme, an ML potential trained with QM calculation data could work as the MM backend. It ensures the same accuracy of the QM and ML energies alleviating mismatched accuracies between the QM and MM energies.185 By treating the MM region as multilayer energy-based fragments, the QM/ML method has successfully simulated the excited-state dynamics of dimethyldiazene in a solvent box of 40 water molecules.186 In the ML/MM scheme, one can use the ML potential to correct the energies and forces obtained from the low-cost semiempirical QM (e.g., DFTB) calculations (i.e., Δ-learning), matching the results of DFT calculations. The Δ-learning approach has been reported to accelerate the MD simulations for thermal reactions (e.g., SN2 reaction and Claisen rearrangement) using the HDNNP187–189 and DeepPot-SE.190 On the other hand, the ML potential, such as HDNNP,191 FieldSchNet,192 and Deep-Pot-SE,193 have been adapted to a QM calculator that predicts the energies and forces at the same level as the QM training data. These methods belong to the electronic embedding ML potential. They incorporated additional electronic embedding features to learn the environment-polarized QM data, which properly retains the dependence of energies and forces on the external charges. A similar approach has been implemented to support the electrostatic embedding GPR model.194 A recent study demonstrated that the embedded atom neural networks (EANNs)195,196 in a mechanical embedding scheme could be as accurate as the electronic embedding ML potential when choosing an appropriate QM region for training data calculations. The ML/MM methods still have inconsistent energetic descriptions between the QM and MM regions. To mitigate the inconsistency during SchNet model training, a buffered region between the inner QM and outer MM regions was introduced.197 The buffer region experiences full electronic polarization by the inner QM region. The effective interactions with the QM region are calculated at the QM level, while the interactions with the MM region are described at the MM level. The total interactions are a combination of MM interactions and the effect of the QM region on the electronic degrees of freedom of the buffer region, which minimizes the artifacts arising from mixing QM and MM data.
III. APPLICATIONS OF ML PHOTODYNAMICS FOR RESOLVING PHOTOCHEMICAL REACTION MECHANISMS
In the past years, ML photodynamics has shown numerous applications for studying excited-state dynamics, nonradiative decay processes, and complex photoreactions.51 We briefly survey the contributions that realize the ML photodynamics from model system to practical molecular systems. In 2018, Lan and co-workers reported NAMD simulations for a 12-atom molecule, 6-aminopyrimidine, based on a KRR model;112 Dral, Barbatti, and Thiel published a KRR model for a 33-dimensional spin-boson system;198 Cui and co-workers trained the NN-based DPMD model for CH2NH.113 Later, they reported another DPMD model for five water molecules as the environment for the QM/ML NAMD of CH3N=NCH3.186 In 2019, Marquetand and co-workers demonstrated a feed-forward NN model for CH2NH2+.71 Their works achieved the first nanosecond-scale ML-accelerated photodynamics simulation. In 2020, Marquetand and co-workers successfully developed and used their SchNarc model for NAMD simulation of CH2NH2− and CSH2, respectively.110 Meanwhile, Brorsen and co-workers reported a KRR model for a three-dimensional spin-boson model system.114 For comparison between the KRR and NN, Maquetant and co-workers demonstrated that both models could learn the excited-state dynamics of small molecules, such as CH2NH2+, S=CH2, and SO2.67,110 Haberson and co-workers used the KRR model to perform MCTDH simulations for several molecular systems,97 explaining the excited-state hydrogen tunneling phenomena.199 Pavlo and co-workers applied KRR to learn the quantum dissipative dynamics of open systems;200 they also reported that a CNN model could predict 10-ps-long trajectories of open quantum systems dynamics in one shot.201 In addition, we want to point out that there are other strategies to accelerate the NAMD simulations, for instance, the linear vibronic coupling (LVC)30 model and the diabatic potential energy matrix (DPEM) representation.202–206 Interested readers are recommended to refer to the above-mentioned reference for more details. In the following section, we will discuss a few examples of the ML photodynamics application in resolving complex mechanistic questions in photochemical reactions.
A. Photochemical cis–trans isomerization of hexafluoro-2-butene
The first example demonstrating an automatic discovery of photochemical reaction pathways using ML photodynamics simulations is the NN-predicted cis–trans isomerization of hexafluoro-2-butene [Fig. 3(a)].111 The NN potential was constructed using two independently trained MLP models with the inverse distance matrix representation. The training set was initially generated with the Wigner sampling combined with geometrical interpolations from the cis to the trans-configuration, complemented with snapshots of short-time (50 fs) NAMD trajectories, which gives a total number of 4961 data points. The training data were computed at CASSCF(2,2)/cc-pVDZ level. Adaptive sampling searched the undersampled geometries. It propagated 250 trajectories from the S1 state in 500 fs. 80% of the trajectories have finished in five iterations finding 565 new geometries. Up to 28 iterations, 98% of the trajectories were completed. It collected an overall of 1516 new geometries leading to a final training set of 6207 data points. The mean absolute errors (MAEs) of energy predictions were 0.023–0.025 eV, satisfying the commonly accepted chemical accuracy of 0.043 eV (1 kcal mol−1).
The NN predictions of energies, forces, and NACs take 0.01 s on a single CPU, which was 3.4 × 104 times faster than the CASSCF(2,2)/cc-pVDZ calculations. The significant accelerations allow us to simulate the dynamics of trans-hexafluoro-2-butene up to 10 ns in 50 h using a single CPU. It also afforded more trajectories than the QM NAMD simulations. For example, the 500 fs simulations of trans-hexafluoro-2-butene cost-efficiently propagated more than 5000 trajectories to obtain a statistically sufficient sampling of the photochemical reaction pathways.
Due to the overfitting issue of NACs, the ML photodynamics simulations using FSSH with NN-predicted NACs overestimate the S1 lifetime of trans-hexafluoro-2-butene compared to the reference CASSCF(2,2)/cc-pVDZ trajectories [Fig. 3(b)]. The ZNSH with NN-predicted energies and forces could reproduce the QM reference value of 29.0 fs, predicting an S1 lifetime of 33.5 fs. Given the NN ZNSH trajectories, the final trans:cis ratio was 3.5:1, which is close to the trans:cis ratio of 2.8:1 found in the QM reference trajectories. Characterizations of the trajectories showed that the NN and QM trajectories underwent virtually identical geometrical changes at the S1/S0 surface hopping points, showing a similar topology of the S1/S0 crossing seam.
In addition to the cis–trans isomerization mechanism, photoexcited hexafluoro-2-butene can promote hydrogen migrations forming a carbene intermediate. This pathway was intentionally excluded from the initial training data as human-introduced bias in the underlying photochemical reactions of hexafluoro-2-butene. The adaptive sampling rediscovered the undersample structures corresponding to the hydrogen migration pathways. The final NN trajectories predicted a trans:carbene ratio of 1.1:1, which also agrees with the QM reference value of 1.6:1. It suggests that ML photodynamics are feasible to study photochemical reactions with minimal prior knowledge of the underlying reactions.
B. Photochemical 4π-electrocyclization of norbornyl cyclohexa-1,3-diene
ML photodynamics simulations can disentangle the photochemistry experiment results beyond the limits of the QM NAMD approach. One such example is the photochemical 4π-electrocyclization of norbornyl cyclohexa-1,3-diene [Fig. 4(a)].111 The experiment observed a stereoselective ring-closing reaction favoring the anti-configuration. Due to the structural complexity and the number of nuclear degrees of freedom, the NAMD simulations with CASSCF(4,3)/ANO-S-VDZP calculations took 17 days to obtain 1 ps dynamics with 240 trajectories. Within the simulation time, two intermediates were predicted and confirmed by geometry optimizations to local minima in the ground state, which were not observed in the experiments. The limited number of trajectories also missed predicting the minor product in endo-configuration. Thus, the QM NAMD simulations were insufficient to explain the experimental stereoselectivity.
We sought to resolve the mechanism of the 4π-electrocyclization of norbornyl cyclohexa-1,3-diene with the help of the ML photodynamics approach. The NN potential used the MLP model. The training data generations include the Wigner sampling, geometrical interpolation, and sampling from the 50 fs trajectories of QM NAMD simulations. This method generated 3349 training data computed at the CASSCF(4,3)/ANO-S-VDZP level. The adaptive sample further collects the undersampled structures by propagating 250 trajectories in 1 ps. The temperature was increased to 1200 K to accelerate the exploration of high-energy nonequilibrium geometries. During the adaptive sampling, the normal completion of trajectories reached 80% after six iterations, then fluctuated in the next 18 iterations. It approached 90% after 50 iterations and stopped at 96% in 100 iterations. The final number of training data is 6272. The MAE of the NN predicted energies are 0.027–0.031 eV.
The ML photodynamics simulations for norbornyl cyclohexa-1,3-diene obtained 3954 trajectories in 1 fs, where a single trajectory only spent 38 s, representing a 3.8 × 104-fold acceleration to the CASSCF(4,3)/ANO-S-VDZP calculation. The increased number of trajectories allows us to observe the less favored pathway toward endo-configurations. The predicted yields of the anti and endo-configuration were 0.7% and 0.2%, respectively. Since the simulations did not include re-excitation of the reformed reactant, the predicted yields are lower than the experimental values of 28% and 4% for the anti and endo-configuration. Nevertheless, the predicted anti:endo ratio in the first 200 fs matched the experimental measurements over 4 h. The ML photodynamics simulations suggested that the anti-configuration is favored because its S1/S0 crossing regions are more accessible than the endo-configuration.
The predicted intermediates in QM NAMD simulations raise a fundamental question on the prediction reliability of the CASSCF(4,3)/ANO-S-VDZP calculations simulating the photochemical reaction of norbornyl cyclohexa-1,3-diene. With ML accelerations, we extended the simulations to 1 ns to monitor the subsequent dynamics of the intermediates. The trajectories showed continued thermal isomerization of the intermediate back to reactant in 1 ns [Fig. 4(b)]. This finding confirmed the metastability of the intermediates and explained their absence in the experiments. Together with QM NAMD results, the ML photodynamics provides a dynamical perspective to understand the 4π-disrotatory electrocyclic ring-closing mechanism. The discovery of the unconventional intermediates suggests that the 4π-electrocyclic ring-closing reaction initially undergoes excited-state vibrations near the FC region resembling a cis–trans isomerization reaction localized at one C=C bond instead of a concerted disrotatory motions at the termini of the diene. This example highlighted the advantages of ML photodynamics simulations in studying complex (photo)chemical reaction mechanisms involving elusive and long-lived intermediates.
C. Photochemical [2 + 2] cycloaddition of substituted [3]-ladderdienes
The substituent effects play an essential role in the [2 + 2]-cycloaddition of [3]-ladderdiene that facilitates the formation of cubane, a class of highly-strained organic molecules. Experiments reported an increasing quantum yield (QY) of cubane following the order of methyl (CH3), trifluoromethyl (CF3), and cyclopropyl (cPr) groups in the octa-substituted [3]-ladderdienes [Fig. 5(a)].75 Uncovering substituent effects has been impossible for decades because octa-substitutions significantly raise the computational cost. For instance, the 1 ps NAMD simulations with CASSCF(8,7)/ANO-S-VDZP+ANO-S-MB(for substituents) calculations for the octamethyl [3]-ladderdiene would take several months. The ML photodynamics simulations can handle full dimensionality greater than 200 degrees of freedom with affordable computational cost. As such, ML photodynamics simulations have been applied to study the substituent effects.75
The NN potentials built by the MLP model showed the flexibility of the NN learning various kinds of photochemical reactions. The training data generation solely used the Wigner sampling for the octa-substituted [3]-ladderdiene and the geometrical interpolations from the optimized geometries of octa-substituted [3]-ladderdiene, the S1/S0 minimum energy conical intersection (MECI) of the [2 + 2]-cycloadditions, and the corresponding cubane derivatives. The nuclear displacements of the Wigner sampled geometries were mixed into the interpolated geometries to expand the structural diversity in the training data. Initial training data were 4321 for octamethyl [3]-ladderdiene and 3361 for octatrifluoromethyl and octacyclopropyl [3]-ladderdiene.
The octa-substituted [3]-ladderdienes have a C2 symmetry in their connection graph. Thus, a permutation map according to the C2 symmetry was used to automatically expand the training data maintaining the permutational invariance in the total energy and the covariance in the forces.75 The generated data contains the structural information in the permutationally equivalent geometries following the C2 relationship without computing more training data or spending more adaptive sampling iterations to search them. Thus, a permutation map helped minimize the requisite training data size and speed up active learning. It leads to a moderate increase in the data size in 1.98–2.23 times when it finds the optimal NN potential with 125 trail trajectories exploring the excited-state PESs [Fig. 5(c)]. Thus, it effectively maintains a manageable computational cost of the NN potential training at the CASSCF(8,7)/ANO-S-VDZP+ANO-S-MB(R) level. The final MAE of NN energy predictions were 0.040–0.046, 0.032–0.045, and 0.037–0.039 eV for the octamethyl, octatrifluoromethyl, and octacyclopropyl [3]-ladderdiene, respectively.
The NN potential showed 1.5–6.4 × 104-fold accelerations to the energies and forces computations with the CASSCF(8,7)/ANO-S-VDZP+ANO-S-MB(R) calculations for the substituted [3]-ladderdiene. The ML photodynamics simulations propagated 3835, 3259, and 3122 trajectories from the S1-FC points in 2 ps with a 0.5 fs time step for the octamethyl, octatrifluoromethy, and octacyclopropyl [3]-ladderdiene. The simulations uncovered four S1 relaxation pathways of octamethyl [3]-ladderdiene, including the [2 + 2]-cycloaddition, 4π-disrotatory electrocyclic ring-opening, σC–C cleavage, and 6π-conrotatory electrocyclic ring-opening [Fig. 5(b)]. It showed the 4π and 6π-ring-opening reactions are faster than the [2 + 2] cycloaddition when R=CH3. The CF3 has close-shell repulsion to block the 6π-ring-opening pathway enhancing the preference of the [2 + 2] cycloaddition. The cPr are steric repulsive, further accelerating the [2 + 2]-cycloaddition reaction and completing the 4π-ring-opening reaction. The predicted reaction QYs were 1%, 14%, and 15% for the octamethyl, octatrifluoromethy, and octacyclopropyl cubane, in line with the trend observed in the experiment.
D. Photochemical 4π-electrocyclization of fluorobenzenes
The nanosecond-scaled NAMD simulations enabled by the ML photodynamics approach provide an effective tool for studying the photochemical reactions involving long-lived photoexcited species. Fluorobenzenes are one example of the excited-state lifetimes falling in nanoseconds' time window [Fig. 6(a)].76 Moreover, the fluorinations of benzene introduce unusual low-lying πσ* states strongly coupled with the ππ* states. Thus, investigating the excited-state dynamics and subsequent photochemical reactions of fluorobenzenes requires a highly accurate description of the excited-state PES at the XMS-CASPT2 level. However, the NAMD simulations in nanoseconds at the XMS-CASPT2 level are far beyond the affordable computing resources.
The ML photodynamics simulations revealed the photochemical reaction mechanism of fluorobenzenes with high-fidelity structural information. The NN potentials built by the MLP model successfully learned the excited-state properties of the highly symmetric fluorobenzene structures with the help of a permutation map [Fig. 6(b)].76 The training data were generated using the Wigner sampled structures of fluorobenzenes at their equilibrium geometries and the interpolated structures from the optimized geometries of fluorobenzenes to the corresponding Dewar-fluorobenzenes via an S1/S0 MECI. The number of initial training data computed at XMS-CASPT2(6,7)/aug-cc-pVDZ level are 901, 3104, 1677, and 4543 for hexafluorobenzene, pentafluorobenzene, tetrafluorobenzene and trifluorobenzenes. The hexafluorobenzene and tetrafluorobenzene have notably smaller data sizes than the others because they have chemically equivalent reaction channels at symmetric carbon atoms. The adaptive samplings continued to explore the undersampled structures. The S1 PESs of fluorobenzenes display S1 minimum regions near the S1-FC regions, which prevent the trajectories from moving farther in picoseconds. As such, the adaptive sampling scaled the initial kinetic energies 2–3 times to fast forward the excited-state dynamics fluorobenzenes, so adaptive sampling can be done using 200 trajectories propagated from S1 in 10 ps. As shown in Fig. 6(c), using a permutation map notably accelerated the training process for hexafluorobenzene [Fig. 6(c)]. With permutations, the completion ratio of adaptive sampling increase to 0.8 in ten iterations and exceeds 0.9 in 30 iterations; without permutations, the adaptive sampling curve fluctuates in the first 20 iterations, which spent 29 iterations to meet a completion ratio of 0.8. The adaptive sampling with permutations found considerably fewer structures (2128) than that without permutations (3029). At the end of adaptive sampling, the training data size increased to 3051, 4037, 4710, and 10 204 for hexafluorobenzene, pentafluorobenzene, tetrafluorobenzene, and trifluorobenzene, respectively. The final MAE of NN energies were accordingly 0.027–0.028, 0.020–0.021, 0.030–0.036, and 0.023–0.024 eV.
The ML photodynamics simulations with about 1800 trajectories in 4 ns predicted time constants were 116, 60, 28, and 12 ps for tetrafluorobenzene, hexafluorobenzene, trifluorobenzene, and pentafluorobenzene. These results reproduced the experimentally observed trends in the S1 decay time constants. The trajectories revealed that the long lifetime of hexafluorobenzene originated from the pseudo-Jahn–Teller effects that create a relatively stable excited-state minimum by breaking the D6h symmetry of the benzene ring. The structural distributions of the S1/S0 surface hopping points of fluorobenzenes showed no correlation to the reaction coordinates along with the 4π-electrocyclic ring-closing pathway. This finding suggests the 4π-electrocyclization of fluorobenzene is controlled by the dynamical effects depending on the instantaneous nuclear momentum facilitating the 1,4-carbon bond formation. As a result, the underlying 4π-electrocyclic ring-closing reaction becomes inefficient with predicted QYs of 0.1%–0.5% in Dewar-fluorobenzenes, explaining the experimental QYs about 1%.
E. Transferable ML potentials to predict the cis–trans isomerizations of azobenzenes
The ongoing ML model development highly desires a transferable ML potential to predict unseen photochemical reactions. However, an ML potential that fits the excited-state PESs of molecules throughout the chemical space is still unavailable. Because of the intensive nature of excitation energy, the decomposition of excited-state energies into atomistic contributions in different kinds of molecules is questionable. The number of excited states involved in the photochemical reactions is also system and problem-dependent, which further complicated the development of the universal ML model for excited states. Recent work by Marquentand and co-workers has shown the first hint at the transferability of the excited states by training ML potential on two isoelectronic molecules.48 A later work by Gómez-Bombarelli demonstrated significant progress in developing the transferability of NNs for predicting the cis–trans isomerizations of a series of azobenzenes [Fig. 7(a)].115
The transferable NNs, called adiabatic artificial NN (DANN), were built upon the PaiNN model [Fig. 7(b)].207 It used scalar and vectorial features to embed the atomic node and edge information and updated the features through equivariant message-passing layers with the neighboring atoms. The scalar output features are mapped to predict atomic contributions to the adiabatic Hamiltonian matrix elements. The adiabatic Hamiltonian was then diagonalized to predict the total energy of each state. The diabatic Hamiltonian matrix helped NN learn the excited-state energies and forces of azobenzene derivatives. It also produces smooth diabatic couplings between the diabatic states, which were rotated from the diabatic basis to the adiabatic basis to fit the NACs. This strategy avoided the direct fitting of the NACs containing nondifferentiable cusp at the CI regions.
The training data contain the geometries of 8269 azobenzene derivatives, including 164 reported in the literature. The geometries were generated by scanning the central CNNC dihedral angle and the CNN/NNC angles corresponding to the rotation and inversion pathways yielding a total of 567 037 geometries. The S0 and S1 energies and forces were computed using spin–flip TDDFT calculation at the BHHLYP/6-31G* level.115 Adaptive sampling was then used to find undersampled geometries and retrain the NNs. As the training data included more equilibrium geometries of azobenzenes derivatives than the near-CI geometries, a sampling procedure based on the structural similarity and the S0/S1 energy gap was introduced to balance geometry selection during the adaptive sampling by giving higher sampling probability to underrepresented geometries.
To test the performance of the DANN, the energies, forces, NACs, and QYs of 40 unseen azobenzenes derivatives were predicted against the computed and experimental values. The prediction MAEs were 3.06 and 3.77 kcal mol−1 for the S0 and S1 states of the unseen species. It should be noted that the MAE of the S0–S1 gap was 1.89 kcal mol−1, contrasting the accuracy of the semiempirical spin–flip TD-DFTB method.208 The predictions of S0 and S1 force leads to errors of 1.72 and 2.31 kcal mol−1 Å−1 with a nice R2 correlation coefficient near 1. However, the predictions of NACs gave a poor R2 value of 0.50. These results suggested the diabatization cannot remove the curl component of the NAC vector, which emphasized the difficulty of predicting NACs. The predicted QYs showed a moderate correlation to the experimental yields. The R2 value was 0.41, indicating numerical errors in the predicted QYs compared to the mean predictor. The Spearman coefficient ρ was 0.71, which suggests the model predicted the QYs in a correct rank. The computational cost of DANN fitted a scaling of N0.49 for N atoms, whereas the SF-TDDFT calculations scaled with N2.8. Thus, the DANN accelerated the QM calculation by five to six orders of magnitude.
Given the accelerations and transferability of DANN, ML photodynamics simulations were then used to virtually screen azobenzene derivatives for largely red-shifted absorptions and high transformation ratio QYa→b/QYb→a. The dataset contains 3100 combinatorial azobenzene derivatives generated by the literature-informed substitution patterns.115 Several hypothetical derivatives were identified to have either the QYcis→trans, QYtrans→cis, or red-shifted absorption wavelength higher or longer than the average values of the compound space.
F. Roaming mechanism of photoexcited tyrosine
Amino acids feature ultrafast nonradiative decay from the electronically excited state to the ground state. The process is faster than the harmful photochemical reactions, thus preventing photodamage caused by UV/visible light. Understanding the photochemical reaction mechanism of amino acids can substantially contribute to improving the photostability of peptides and proteins in the design of novel drugs in phototherapy. Tyrosine is one of the amino acids prone to photoexcitation by sunlight. The major deactivation pathway of tyrosine is the photodissociation of the O–H bond located on the phenol ring. Previous theoretical works by Domcke et al.,209–211 Truhlar et al.,212,213 and Guo et al.214 reported the role of the πσ*excitation in the photodissociation mechanism of phenol, while the studies on excited-state of tyrosine have been limited to static calculations or low accuracy method.215,216 Moreover, two main dissociation channels in a fast and a slow timescale have been proposed for tyrosine after photoexcitation using 200 nm laser pulses,217 which require more comprehensive computational simulations to uncover the excited-state dynamics of tyrosine.
Marquetand and co-workers applied the SchNarc94 approach to comprehensively investigate the excited-state dynamics of photoexcited tyrosine. The SchNarc approach performs ML photodynamics simulation using trajectories surface hopping with arbitrary couplings (SHARC)218 and ML potential built by SchNet models.54 They trained two NNs to predict the excited-state energies, forces and approximated NACs and a third NN to predict the SOCs. The training data of tyrosine included the energies and forces for five singlet states and eight triplet states and the SOCs between them, which yield 29 energies, 29 forces, and 812 SOCs. Figure 8(a) illustrates a flow chart of training the SchNarc model. The structures were initially sampled by scanning the normal modes of the 12 lowest energy conformers of tyrosine and the transformation from the neutral to the zwitterionic form, which gave 1967 data points. The training data for the structures sampled near the FC regions were mainly computed with ADC(2)/cc-pVDZ calculations because they agreed better with the absorption spectrum of tyrosine than those of CASSCF or CASPT2. However, the ADC(2) method, by design, is unable to fully describe the electronic structure and degeneracy between the excited and ground states in photochemical reactions.219 Thus, ad hoc data were computed using CASPT2(12,11)/ANO-RCC-pVDZ calculations when the distance between the hydrogen atom and another atom is longer than empirically defined thresholds. These data were used in an initial adaptive sampling running 100 trajectories for each conformer until the number of data points approximately doubled. The initial adaptive sampling explored the PESs in each state without surface hopping between the same spin multiplicity, which collected 16 738 data points. To enable long timescale simulation, another adaptive sampling was used to train the ML potential of tyrosine with 200 trajectories propagated from the S4 state in 3 ps, finally giving 17 265 data points. The MLP models were used to train the ML potential during the adaptive sampling because it is 18 times faster than the SchNet models. The SchNet models were used to learn the final training data and ML photodynamics simulation, as they produced more accurate results and MLP.
About 1000 ML photodynamics trajectories were excited to the S4 within the excitation window of 6.5–7.0 eV and propagated in 10 ps to inform the photochemistry of tyrosine. 83% of the trajectories showed direct dissociation of a hydrogen atom, the formation of a zwitterionic species, and other fragmentations pathways. 17% of the trajectories found a roaming hydrogen atom after the photodissociation of the O–H bond [Fig. 8(b)]. Of these roaming trajectories, 36% stayed with roaming hydrogen atoms, whereas the other trajectories moved toward diverse fragmentation pathways. These results revealed the roaming mechanisms beyond the chemical intuition in peptides and proteins, which brings our knowledge further toward a better understanding of the photostability and photodamage of biological systems.
IV. CONCLUSION
ML photodynamics approach has become an emerging tool for learning photochemistry. The extraordinary accelerations by ML techniques offer great potential for resolving the elusive photochemical reaction mechanisms at an atomistic level in an ultrafast timescale. The recent development of ML models enables accurate predictions of energies and forces of the same quality as quantum chemical calculations. The training protocol with adaptive sampling can effectively search the undersampled data out of the initial set generated based on chemical intuition, reducing human bias in the training data generation. The reviewed studies have shown ML photodynamics simulations from picoseconds to nanoseconds for various molecules in medium and large sizes. The ML photodynamics simulations provided a high fidelity of trajectories to inform the excited-state structural information underlying the reactivities, stereoselectivity, and unseen mechanistic pathways in complex photochemical reactions.
While the applications of ML photodynamics in photochemistry grow rapidly, several limitations need to be addressed in the ongoing and future method development. First, an accurate ML model for predicting NACs is unavailable. Using the approximated NACs derived from the ML energies, forces, and Hessian does not guarantee accurate surface hopping when the excited and ground state energy gap is significant. The surface hopping calculations depends on the energy gap, which could be smaller than 1 kcal mol−1. Thus, ML potential needs higher accuracy than the commonly accepted chemical accuracy, at least for structures near the state-crossing regions. The transferability of the ML potential for excited states remains largely unexplored. The few studies describe a transferable ML potential limited to isoelectronic molecules or derivatives undergoing similar mechanistic pathways. Therefore, a universal ML potential for representing the excited states of molecules with arbitrary size and composition is desired for fully exploiting the advantages of ML photodynamics for studying photochemistry.
ACKNOWLEDGMENTS
J.L. thanks the supports from the Hoffmann Institute of Advanced Materials at Shenzhen Polytechnic. S.A.L. acknowledges the National Science Foundation CAREER Award (No. NSF-CHE-2144556) and NSF Institute for Data Driven Dynamical Design (No. NSF-OAC-2118201). All authors appreciate the assistance from the Northeastern Research Computing Team and the computing resources provided by the Massachusetts Life Science Center Grant (No. G00006360).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Jingbai Li: Writing – original draft (equal); Writing – review & editing (equal). Steven Lopez: Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.