A MASH simulation of the photoexcited dynamics of cyclobutanone

In response to a community prediction challenge, we simulate the nonadiabatic dynamics of cyclobutanone using the mapping approach to surface hopping (MASH). We consider the first 500 fs of relaxation following photo-excitation to the S2 state and predict the corresponding time-resolved electron-diffraction signal that will be measured by the planned experiment. 397 ab-initio trajectories were obtained on the fly with state-averaged complete active space self-consistent field (SA-CASSCF) using a (12,11) active space. To obtain an estimate of the potential systematic error 198 of the trajectories were calculated using an aug-cc-pVDZ basis set and 199 with a 6-31+G* basis set. MASH is a recently proposed independent trajectory method for simulating nonadiabatic dynamics, originally derived for two-state problems. As there are three relevant electronic states in this system, we used a newly developed multi-state generalisation of MASH for the simulation: the uncoupled spheres multi-state MASH method (unSMASH). This study therefore serves both as an investigation of the photo-dissociation dynamics of cyclobutanone, and also as a demonstration of the applicability of unSMASH to ab-initio simulations. In line with previous experimental studies, we observe that the simulated dynamics is dominated by three sets of dissociation products, C3H6+CO, C2H4+C2H2O and C2H4+CH2+CO, and we interpret our predicted electron-diffraction signal in terms of the key features of the associated dissociation pathways.


I. INTRODUCTION
In recent years, there has been a significant increase in experimental capabilities making it possible to follow ultrafast photochemical processes in real time.Nonetheless, obtaining a clear mechanistic interpretation of molecular quantum dynamics often requires theoretical calculations to be performed in tandem.Computer simulations of photochemistry are complicated by the breakdown of the Born-Oppenheimer approximation.We thus require methods that can describe nonadiabatic dynamics involving transitions between electronic states.
In any theory, it is desirable to minimize the complexity of the description as much as possible, in order to obtain a simple intuitive picture of the key processes at play.Nonadiabatic approaches based on independent semiclassical trajectories achieve just that, of which Tully's fewest-switches surface hopping (FSSH) 1 is the most commonly used.In addition, the favourable computational scaling of independent trajectories with system size means that a high-level of electronic-structure theory can be employed, which is crucial for making quantitative predictions of experiments.Unfortunately, FSSH has a number of well-known problems due to inconsistency and overcoherence. 2 a) Electronic mail: joseph.lawrence@nyu.edub) Electronic mail: jeremy.richardson@phys.chem.ethz.ch The mapping approach to surface hopping (MASH) 3 is a recently proposed alternative to the FSSH algorithm.It has a rigorous derivation based on mapping approaches [4][5][6] but instead of using a mean-field force, it hops between adiabatic states, similarly to FSSH.The key difference between MASH and FSSH is that MASH's dynamics are deterministic rather than stochastic.This has an important benefit, ensuring consistency at all times between the electronic variables and the active surface.Also, the MASH derivation uniquely determines how the momentum should be rescaled at attempted hops.One should rescale along the direction of the nonadiabatic coupling vector, and reflect in the case of all forbidden hops.Although this prescription is identical to Tully's original concept, 7 many alternative suggestions have been made, 2,8,9 and in practice, approximations are often taken. 10Previous work has shown that MASH is often more accurate than FSSH for a range of model systems 3 and there is reason to believe it can even be more accurate than ab initio multiple spawning (AIMS) 11,12 for photochemical problems. 13MASH was shown to correctly recover Marcus theory rates, 14 where FSSH is known to require complicated decoherence corrections. 15,16Additionally, unlike other mapping approaches, 4,6,[17][18][19] MASH rigorously captures the detailed balance necessary to thermalize to the correct equilibrium distribution. 20 this work, we simulate the photochemistry of cyclobutanone, in response to a community prediction chal-arXiv:2402.10410v1[physics.chem-ph]16 Feb 2024 lenge initiated by the Journal of Chemical Physics (JCP).
In the experiment that we seek to predict, cyclobutanone is initially photoexcited using a 200 nm pump pulse, which is assumed to excite the molecule from the S 0 to the S 2 adiabatic electronic state.The resulting nonequilibrium dynamics are then measured using time-resolved electron diffraction.Prior to the community challenge, all previous theoretical studies of cyclobutanone have instead considered the dynamics starting in S 1 , [21][22][23][24] meaning that the present study covers new ground.Simulating the photoexcited dynamics initialized in S 2 also poses additional theoretical challenges.S 2 is known to be a Rydberg state 25,26 and therefore requires a set of diffuse electronic basis functions to correctly describe the dynamics.
The original MASH formulation was limited to twostate problems and is therefore not directly applicable to the present study.Although a multi-state version of MASH has recently been proposed by Runeson and Manolopoulos, 27,28 we note that this does not reduce to the original two-state version and thus loses some of the key benefits of the MASH approach.This is particularly significant for photochemical applications, where the dynamics is expected to largely be a succession of effective two-state nonadiabatic transitions.We therefore developed a new multi-state generalization of MASH which does recover the original version in the case that two states are uncoupled from all others.This is the method employed for the present study.This new approach will be described in detail alongside application to a series of benchmarks in a forthcoming publication. 29he present work describes the first implementation of our new multi-state MASH method using ab initio electronic-structure methods.Due to the time constraints imposed by JCP's challenge, we were not able to implement the most powerful version of the algorithm and thus we limit ourselves to studying the internal conversion between singlet states only and employing an initial distribution obtained from a vertical transition according to the Franck-Condon principle.We note, however, that the MASH formalism can be rigorously extended to treat intersystem crossing to triplet states and to describe the excitation pulse explicitly. 30Future work will test the impact of this more complete description of the nonadiabatic process.This study therefore serves to provide a proof of principle that MASH can be used for realistic simulations of photochemistry and can compete with more established methods such as FSSH and AIMS.

II. METHODS
Before describing the algorithm used for generating MASH trajectories, we first turn to the question of sampling initial conditions.Most nonadiabatic trajectory simulations are initialized using a Wigner function based on a harmonic approximation around the ground-state equilibrium geometry.Ordinarily, we would have used this standard approach within a MASH simulation.However, in the S 0 state, cyclobutanone has a low-frequency puckering mode around a C 2v geometry, which is very anharmonic and depending on the electronic-structure method used may even be predicted to be a double well with a low barrier.It is therefore clear that the harmonic approximation is not valid for this mode.At the level of density-functional theory (B3LYP/def2-TZVP as implemented in ORCA), 31 we located the minimum-energy pathway between the two minima using the nudgedelastic band method 32 and performed a "stream-bed walk" 33 (in Cartesian coordinates) up the other side.We call this the puckering path and from now on we employ mass-weighted coordinates using atomic masses of the most common isotopes, unless otherwise stated.A one-dimensional discrete variable representation (DVR) calculation 34 was carried out to obtain the nuclear wavefunctions along the puckering path.The predicted fundamental vibrational transition is 33 cm −1 in good agreement with the experimental value of 35 cm −1 from farinfrared spectroscopy. 35There is also reasonable agreement with the higher excited vibrational states (supplementary material). 36We learned that the experiment will slightly heat the sample to avoid condensation. 37herefore, one-dimensional positions and momenta were sampled according to the thermal Wigner function at 325 K, which can itself be evaluated in terms of the DVR wavefunctions (supplementary material). 38Hessians were computed at a set of points along the puckering path and interpolated (in Cartesian coordinates).Translational and rotational modes along with the vector tangential to the path were projected out.It was found that the perpendicular frequencies are much larger than the energy-level spacing of the puckering vibrations and vary relatively slowly along the puckering path (supplementary material), which justifies our approach.Finally, the modes perpendicular to the puckering path were sampled using the thermal Wigner function within the standard harmonic approximation, and angular momentum was sampled from a classical Boltzmann distribution.More elaborate schemes for sampling Wigner distributions have been proposed, but these are not yet applicable to such large systems. 39w that the initial conditions for the nuclei are specified, we discuss the electronic-structure method used for the dynamical simulations.In order to capture the excited-state manifold and the bond-breaking dynamics after photo-excitation, a multireference method is required; we employ the state-averaged complete active space self-consistent field (SA-CASSCF) method in order to simultaneously describe the S 0 , S 1 and S 2 states.In SA-CASSCF, the ground and electronic excited states are optimized simultaneously with a common set of orbitals but different configuration-interaction (CI) coefficients, which are constrained to form an orthonormal set.This ensures that the electronic states are orthogonal, which is particularly important when using the overlaps of the electronic wavefunction in the dynamics.The common set of orbitals also allow for efficient implementations of analytic gradients and nonadiabatic coupling vectors (NACV).All CASSCF calculations were performed using Molpro 2023. 40A (12,11) active space was used for all calculations, which includes the aforementioned characteristic orbitals and adds additional valence bonding and non-bonding orbitals.We considered two different basis sets, the Dunning aug-cc-pVDZ basis and the Pople 6-31+G* basis.Both basis sets were chosen as they contain diffuse functions necessary to describe the Rydberg orbital.For both basis sets, the Rydberg orbital appears to be centred around the carbonyl group.
Due to the time constraints imposed by the prediction challenge, we were restricted by the size of the basis set that could be used.Hence, by considering calculations performed using two different basis sets we hope to obtain an estimate of the sensitivity of the calculation to the details of the electronic structure.The aug-cc-pVDZ gave a vertical excitation energy (6.231 eV) that was closer to the experimental S 2 peak maximum (6.4 eV) 25 than 6-31+G* (6.846 eV), and also much closer to the pump laser frequency of the planned experiment (6.2 eV). 42At the C 2v geometry, we find that the aug-cc-pVDZ and 6-31+G* basis sets produce slightly different orbitals in a (12,11) active space, which is likely the reason behind the differences in the excitation energy.
Given that the aug-cc-pVDZ basis seems to produce a more accurate excitation energy for cyclobutanone and is also slightly larger than the 6-31+G* basis, unless otherwise stated we choose to present the calculations performed using the former in the main text.The analogous calculations using the 6-31+G* basis set are however provided in the supplementary material, along with illustrations of the active orbitals for both basis sets.
At the C 2v geometry, the n → π * transition is electricdipole forbidden while the n → 3s is electric-dipole allowed.In addition, the pump-pulse energy is on resonance with the S 2 excitation.We therefore utilized the Franck-Condon approximation to initialize the electronic state in S 2 (according to the MASH procedure described below).In this way, we allowed vertical transitions for the entire initial distribution and do not take account of the bandwidth of the laser pulse, as it is not obvious how to do this in a rigorous way without explicitly simulating the light field.
We optimized the structures of relevant minimumenergy conical intersections (MECI) and crossing points (MECP) and present their relative energies in the supplementary material.Both the S 2 /S 1 MECI and the S 2 /T 2 MECP are below the Franck-Condon energy, implying that they are both energetically accessible.However, the spin-orbit coupling (SOC) at the S 2 /T 2 MECP is only 5 cm −1 which suggests that intersystem crossing may be quite unlikely.Similar conclusions are indicated for relaxation from the S 1 state, in which the SOC is zero at the S 1 /T 1 MECP.Taking cues from the study by Liu and Fang, 43 we identified three S 1 /S 0 MECI structures, which are lower or comparable in energy to the S 1 /T 1 MECP.Together, these results suggest that intersystem crossing is not important for the photodynamics of cyclobutanone.This is in agreement with previous theoretical studies of photochemistry of cyclic ketones on the S 1 state, which show that intersystem crossing only plays a role in molecules with rings of 5 or 6 carbon atoms. 44ur approximation of neglecting the triplet states in the dynamics of cyclobutanone is further tested by selecting 10 trajectories from the final aug-cc-pVDZ set, along which we simulated the electronic dynamics including 3 triplet states (i.e., a 6-state SA-CASSCF) with a (12,11) active space, starting on the S 2 state.The resulting electronic dynamics showed less than 0.6% population transfer to the triplet manifold.Although this simple test is not completely reliable, especially considering the small number of trajectories considered, it nonetheless lends further weight to justify our neglect of the triplet states.Further details of these calculations and plots of the triplet populations over time for a couple of representative trajectories are given in the supplementary material.
We now turn to our choice of dynamics method, MASH.For clarity, we give here a brief discussion of the important features of the method, highlighting key differences to FSSH.For a more detailed discussion see the supplementary material as well as Refs.3, 14 and 29.Before discussing the multi-state generalisation it is instructive to introduce the key ideas behind the original two-state implementation of MASH. 3 In this case there are two key differences to FSSH: first, how the active surface is determined, and second, how the initial electronic variables (wavefunction coefficients) are chosen.Unlike FSSH, in MASH the active state is obtained deterministically.In the two-state case, the active state is determined by the sign of the z component of the Bloch sphere, S

and when S
(2,1) z < 0 the active state is n = 1.That the dynamics can be fully deterministic might at first seem surprising, particularly given that it is the stochastic nature of the hopping in FSSH that allows it to capture wavepacket bifurcation.However, the stochastic nature of the FSSH hops is effectively replaced in MASH by an initial sampling of the wavefunction coefficients.For example, to initialize a system in a pure state on adiabat 2, in FSSH one chooses S MASH has been shown to offer a number of formal improvements over FSSH.Firstly, unlike FSSH it can be rigorously derived as a short-time approximation to the quantum-classical Liouville equation (QCLE), meaning that it can in principle be systematically improved towards this limit.Secondly, and perhaps most importantly, MASH does not suffer from the inconsistency error of FSSH.These methodological improvements are expected to lead to more reliable predictions for no extra computational cost.In fact, for a series of different model systems, MASH has been shown to be as accurate or more accurate than FSSH at reproducing quantum-mechanical benchmark results. 3,14Decoherence corrections can be derived rigorously for MASH, 3 although in the vast majority of cases, the dynamics are already accurate enough without them. 13hile the original MASH method was derived for twostate systems only, we have recently proposed an Nstate generalisation of MASH ideally suited for simulating photochemical systems, which we call the uncoupledspheres multi-state MASH method (unSMASH). 29The unSMASH method generalises the original two-state MASH by treating possible transitions between pairs of adiabatic states independently.This is done by introducing N − 1 independent effective two-state Bloch spheres between the current active state and each of the other states: S (n,j) for j ̸ = n, j = 1, . . ., N .Each sphere then evolves as it would in the original two-state MASH for the truncated electronic space consisting of the two corresponding adiabatic states.Attempted hops occur when one of the S (n,j) z changes sign.As in the two-state theory, the hops are accepted or rejected according to whether there is enough kinetic energy in the direction of the NACV between the active state and the possible new state.The component of the momentum along the NACV is then either rescaled to conserve energy in the case of allowed hops, or reflected in the case of rejected (frustrated) hops.The unSMASH method is a rigorous short-time approximation to the QCLE when there is only coherence between one pair of adiabatic states at a given time.It is therefore well suited to photochemical problems involving a series of successive separate transitions between adiabatic surfaces, as one would expect in the photochemical relaxation of a typical organic molecule such as cyclobutanone.
6][47][48] A full mathematical description of the integrator is given in the supplementary material; here we simply give some of its key features.The basic structure of the integrator involves first propagating the nuclear positions and momenta from t to t + δt using velocity Verlet, before the electronic variables are evolved from t to t + δt using information calculated at geometries q(t) and q(t + δt).Finally any attempted hops are treated, along with their associated momentum rescalings.As with FSSH propagation schemes, one must contend with the fact that close to conical intersections the NACV is very sharply peaked.This means that algorithms that rely solely on the NACV can require arbitrarily small time steps in order to capture the corresponding electronic transition.6][47][48] This can be implemented for MASH in the same way as it is for FSSH.However, in the present work we found that the calculation of NACVs was significantly less computationally expensive than computing the overlaps.For this reason, the integrator we employ uses a mixed scheme, only calculating overlaps when the adiabatic surfaces come close together (|∆V | < 2000 cm −1 ), and otherwise using the NACVs.We note that using the NACVs rather than the overlaps has an additional advantage when combined with CASSCF, as it means that discontinuities in the active space do not lead to spurious nonadiabatic transitions.This is similar to the advantage of calculating the forces analytically rather than by finite difference, which can lead to unphysical sudden large changes in the momenta when encountering a discontinuity in the energy.

III. RESULTS
In total we sampled 200 sets of initial positions and momenta from the Wigner function at 325 K.In each case the initial active state was S 2 , and the initial spheres were randomly sampled as described above.These initial samples were used to launch 200 separate unSMASH trajectories (using the aug-cc-pVDZ basis) with a time step of 0.5 fs.Of these initial 200, a total of 198 trajectories were run for the desired total time of 500 fs.For 123 of these trajectories, spin contamination in the excited-state manifold meant that the SCF step in the SA-CASSCF cycle did not converge within the maximum number of allowed iterations, which was chosen as 160.This typically only occurred at later times after the molecule had already reached S 0 , and had already undergone the primary dissociation step, with the majority (77) having already reached at least 200 fs before the SA-CASSCF failed to converge.Rather than discarding these trajectories (which would bias the results), we elected to finish those that had already reached S 0 by running them for the remaining time on the ground state.This was done using state-specific CASSCF (SS-CASSCF) with the same basis and active space size as for the three-state SA-CASSCF calculations.The 2 trajectories that could not be completed were those that crashed due to spin-contamination while on an excited electronic state; these trajectories are excluded from all analysis that follows.
To compare our simulated results with the results of the proposed experiment, the final set of 198 trajectories (of length 500 fs) were used to generate electrondiffraction signals.This was done based on elasticscattering calculations within the independent-atom model 49 using the ELSEPA program 50 as described in the supplementary material.Note that it would in principle be possible to go beyond this approximation, in the spirit of Ref. 51, using the ab initio two-electron density of the MASH active state, provided by the CASSCF calculations.However, in this work, we assume the independentatom model to be sufficient.The electron-diffraction signal was transformed to obtain the atomic pair distribution functions (PDF).Then ∆PDF was defined as the difference between the PDF at time t and the steadystate PDF as shown in Fig. 1.As experimental electrondiffraction results are typically presented with arbitrary units, 52 all results are given relative to the maximum peak height in the steady-state PDF, i.e. ∆PDF(r, t) = PDF(r, t) − PDF SS (r) max(PDF SS (r)) Finally, the results were convoluted with a Gaussian (160 fs FWHM) to simulate the instrument response function (accounting for both the width of the pump pulse as well as the detector).In Fig. 2, we present the the electron-diffraction signal predicted by our MASH simulation, both before and after convolution.From these results one immediately obtains a qualitative picture of the dynamics after photoexcitation.In the unconvolved signal, after only 50 fs the predicted electron-diffraction signal shows a significant positive peak at around 3.25 Å, along with a corresponding negative peak in the region 1.25-2.5Å.At 100 fs the positive peak has broadened and shifted towards 4 Å while the negative peak has deepened.This trend carries on until around 350 fs after which point the negative peak approaches a steady state and the positive peak has broadened and shifted to such large distances as to become almost invisible.Although in the convolved signal the locations and heights of the peaks are somewhat modified, the same qualitative behaviour can be observed.This behaviour is clearly indicative of a rapid dissociation following the photoexcitation of cyclobutanone.The dissociation leads to a depletion of short bond distances due to bond breaking, with a corresponding increase at continually larger and larger distances as the resulting fragments move apart.In contrast a simple ring-opening reaction, without dissociation, would result in a persistent positive signal between 4 Å < r < 6 Å, as seen for example in the photo-induced ring-opening of cyclohexadiene. 52From these results we can ascertain that the majority of the dissociation occurs within the first 250-300 fs, with the onset of dissociation occurring very rapidly at around 50 fs.
Although the electron diffraction signal contains a large amount of information, and gives an immediate qualitative picture of the nuclear dynamics, it is nevertheless hard to immediately extract detailed mechanistic information from the signal alone.This is why, despite the ever increasing resolution of experiments, molecular simulations are an important tool in understanding complex photochemical processes.In the following we consider what the additional information available from our molecular simulations tells us about the dissociation process before returning to discuss how signatures of these features could be observed in the experimental electrondiffraction signal.

A. Electronic dynamics
We first consider what our simulation predicts about the electronic dynamics after excitation.Figure 3  the average population on each adiabatic state as a function of time.From this we can clearly see that the system undergoes rapid electronic relaxation, with a half life on S 2 of about 50 fs.It appears that the system primarily undergoes a sequential transition, first from S 2 to S 1 and then from S 1 to S 0 .The resulting half life for the combined excited-state manifold (S 2 + S 1 ) is predicted to be about 100 fs, with about 90% of the molecules having relaxed to S 0 by 250 fs.This timescale matches closely the dynamics seen in the electron-diffraction signal, indicating that the energy released into the nuclear degrees of freedom by the electronic relaxation leads rapidly to dissociation.

B. Reaction products
To obtain a clearer quantitative picture of the photodissociation process it is helpful to analyse the trajectories according to the fragments formed in the dissociation process.We define a molecular fragment as a series of atoms that form a connected graph, where nodes of the graph correspond to atoms and edges of the graph indicate that the distance between the corresponding pair of atoms is within a cutoff distance of r < 2.0 Å.Table I shows the total yield of all observed molecular fragments at 500 fs.We note that, due to the presence of secondary dissociation processes that occur after 500 fs, this is not likely to be equivalent to the final product distribution.We can see that the most abundant product is carbon monoxide (CO), closely followed by cyclopropane/propene (C 3 H 6 ).There are also significant numbers of ethene (C 2 H 4 ) as well as ketene (C 2 H 2 O) and the highly reactive methene (CH 2 ), along with a handful of other fragments that are only observed in a small number of trajectories.Most notable of these are the 7 remaining C 4 H 6 O molecules that have not yet dissociated, of which 3 have already undergone ring-opening, 3 remain as cyclobutanone and 1 has undergone a rearrangement to form cyclopropanal.We note that, based on the analysis that follows, we expect the majority of them to eventually dissociate to give C 3 H 6 +CO.Further insight can be gained by grouping the trajectories according to the molecular fragments they produce.Table II shows the three most common sets of products present at 500 fs, along with the frequency at which they are observed.That these three reaction products should dominate is not a surprise based on previous theoretical and experimental work [21][22][23][24]54,55 such as that of Trentelman et al. 55 who rationalised their measurements on the CO produced after photo-excitation of cyclobutanone with 193 nm light in terms of these 3 possible reaction products. Folloing this earlier work, it is helpful to distinguish reactions according to whether the primary dissociation event produces ethene (C 2 H 4 ), labelled the C2 pathway, or cyclopropane/propene (C 3 H 6 ), labelled the C3 pathway.The C3 pathway is the simplest, involving the cleavage of two carbon-carbon bonds to form carbon monoxide and a C 3 H 6 diradical, which typically rapidly forms highly vibrationally excited cyclopropane.In a small number of trajectories the C 3 H 6 radical was observed to undergo a rearrangement to form the more stable propene, and we note that on longer timescales one would expect the excited cyclopropane to also undergo this rearrangement.The C2 pathway is more complicated.In their work forming ethene and ketene (ethenone), with a possible secondary dissociation step, in which ketene dissociates to form carbon monoxide and methene.Here, however, we also consider the possibility of a third process in which both (R2) and (R3a) occur in a single primary dissociation step,

C. Time-dependent fragment formation
To obtain a more detailed understanding of these dissociation pathways, in Fig. 4 we plot the time-dependent yield of each of the 5 major reaction products.It is instructive to first consider the yields of C 2 H 4 and C 3 H 6 .As was observed in the electron-diffraction signal, the onset of dissociation occurs at around 50 fs, where the number of observed fragments begins to increase sharply, and the vast majority of the primary dissociation is over by around 300 fs where the yields of both C 2 H 4 and C 3 H 6 are greater than 90% of their final value.It is notable that although the onset of formation of C 3 H 6 begins slightly earlier than C 2 H 4 , the timescale associated with the formation of C 2 H 4 is significantly shorter.In fact, while the yield of C 2 H 4 is essentially constant between 200 and 500 fs there is a notable 50% increase in the yield of C 3 H 6 over this timescale.This can be understood by noting that (as shown in the supplementary material Fig. S19), trajectories which form C 2 H 4 stay on S 2 for slightly longer but reach S 0 earlier than those which form C 3 H 6 .The rapid formation of C 2 H 4 is thus associated with a sudden successive relaxation from S and then S 1 to S 0 , releasing a large amount of energy and resulting in a rapid (almost concerted) bond breaking.
While C 3 H 6 can also form in this way, we see that there is another slower formation mechanism.This slower mechanism involves trajectories becoming temporarily trapped on S 1 without dissociating; when they eventually relax to S 0 they then predominantly form C 3 H 6 rather than C 2 H 4 .It seems likely that this is because the formation of C 2 H 4 requires a greater amount of energy in the corresponding carbon-carbon bond stretch, and that this slower pathway results in a more even distribution of the energy released from the relaxation from S 2 to S 1 .Returning to Fig. 4 we note that, in contrast to C 2 H 4 , the yield of CH 2 continues to increase beyond 200 fs.Specifically, the yield of CH 2 increases from around 4.5% at t = 200 fs to around 7.6% at t = 500 fs corresponding to an approximately 70% increase in population.This can be understood as arising from a secondary dissociation of ketene, as defined by (R3a).This is confirmed by the concomitant decrease in the yield of ketene over the same period.However, we note that the appearance of methene before 100 fs indicates that secondary dissociation is not the only pathway to its formation.If it were, one would expect the rate of formation of CH 2 to be proportional to the population of ketene.We therefore conclude that a significant fraction of methene is formed via the primary dissociation reaction shown in (R3b).
Although there are differences in the relative fractions of each product formed, the 6-31+G* calculations show similar qualitative behaviour to these aug-cc-pVDZ calculations.In the next subsection we will discuss these quantitative differences further.

D. C3 vs C2 ratio
These observations, along with directly visualising the trajectories, confirm that the dominant reaction pathways are those given in (R1) to (R3b), and allow us to make a prediction of the relative yields of the C3 and C2 pathways (the C3/C2 ratio).To do so we choose in all cases to identify reaction products I as belonging to the C3 pathway and reaction products II and III to the C2 pathway. 56The trajectories not in groups I, II or III can be categorised as those that have undergone hydrogen dissociation and those that have not yet dissociated.We choose to exclude those reactions that have undergone hydrogen dissociation from the discussion of the C3/C2 ratio.For the 7 remaining undissociated C 4 H 6 O, the fact that the yield of C 3 H 6 in Fig. 4 still shows a significant positive slope at 500 fs makes it seem likely that the majority will eventually dissociate according to (R1).For this reason we additionally categorise these trajectories as following the C3 pathway.On this basis, of the trajectories that follow either the C3 or C2 pathways, we observe that 77% dissociate via the C3 pathway, and can estimate the influence of statistical error using a 95% Wilson score interval 53 to give an lower and upper bound to our prediction of (70%, 82%).The corresponding C3/C2 ratio is found to be 3.3 and propagating the 95% Wilson score interval gives upper and lower bounds on the statistical error as (2.35, 4.6).Following the same analysis of the 6-31+G* product yields (given in the supplementary material) one arrives at a somewhat different C3/C2 ratio.There we find the fraction of all C3 and C2 trajectories that dissociate via C3 to be 0.5 with a 95% Wilson score confidence interval of (43%, 57%) corresponding to a C3/C2 ratio of 1 with the statistical confidence interval at 95% of (0.75, 1.35).The two sets of simulations therefore show a statistically significant difference in this quantity, and we will return to discuss how this influences our estimation of the systematic error in our predicted electron diffraction signal in Sec.III F.
As far as we are aware there does not exist a definitive experimental measurement of the C3/C2 ratio at 200 nm.The only attempt at a direct measurement that we could find in the literature was that of Shortridge et al. 57 who obtained a value of 1.2.This experiment used a full arc zinc lamp rather than laser excitation, observing only the 202.6 nm line showed appreciable absorption, and had a buffer gas of cyclopropane that somewhat complicated the interpretation of their results.In later work, Trentelman et al. chose to ignore this result and instead extrapolated the measurements of Denschlag and Lee 54 to obtain a ratio of 1.3 at 193 nm.It is, however, questionable whether this extrapolation is valid, given that it was based on excitation at wavelengths between 318 nm and 248 nm, which correspond to excitation to S 1 (centered at 280 nm) rather than S 2 .We additionally note that, if one takes the C3/C2 ratios for the lowest two wavelengths reported by Denschlag and Lee (253.7 nm and 248 nm) then a linear extrapolation in energy results in 0.6 0.4 0.2 0.0 6. Weighted average over the three dominant reaction products of the simulated ultrafast electron-diffraction results at 500 fs.f1 corresponds to the fraction of products I and f2 to the fraction of products II used in the weighted average, with the remaining fraction corresponding to products III.Results are shown here for the signal convoluted with with a 160 fs (FWHM) Gaussian to simulate the instrument response function.Dashed line shows the prediction from the aug-cc-pVDZ trajectories.
a significantly higher C3/C2 ratio of 2.5, in much closer agreement with the aug-cc-pVDZ ratio prediction.In any case what is clear from the existing experimental results is that within the S 1 band the C3/C2 ratio has a strong energetic dependence.Given this, systematic errors in the electronic structure and approximations used in the initial conditions might be expected to have a profound effect on the ratio seen in the simulations, which would certainly be consistent with the observed difference between the 6-31+G* and aug-cc-pVDZ results.

E. How will the different reaction pathways influence the experimental signal?
Analysing our calculated trajectories has allowed us to give a detailed prediction of what happens during the dissociation.However, in order for our predictions to be testable, we need to connect them to observable features of the planned experimental signal.Hence, in the follow-ing we return to consider how key features of the different reaction pathways could be observed in the experimental electron-diffraction signal.To do this we begin by calculating three hypothetical electron-diffraction signals corresponding to the trajectories that produce reaction products I, II and III.The full signals are given in the supplementary material.In the following we connect key features of these signals to the total signal and analyse how sensitive the planned experiment is to the relative fractions of reactions that follow each pathway.
In Fig. 5 we show the minimum value of the ∆PDF curve as a function of time for each of the reactions alongside the total signal.The minimum value occurs in all reactions and at all times close to 2.5 Å.This corresponds to the distance between the oxygen and the two β-carbons in cyclobutanone (Fig. 1), and also to the distance between the oxygen and the β-carbon in ketene.Hence, this minimum is closely associated with the formation of CO.We note, however, that drawing a direct connection between the depth and the amount of CO is complicated somewhat, both by the presence of two rather than just one β-carbons in cyclobutanone, and also by the fact that the distance between the carbons and the four hydrogens on the neighbouring carbons in cyclopropane is about 2.5 Å (although the lighter H atoms have a weaker signal).Nevertheless, comparing the curves for products I and products II we can see a clear parallel with the C 3 H 6 and C 2 H 4 curves in Fig. 4. The curve for products II has clearly plateaued by around 350 fs while products I continues to decrease gradually.Using this connection we can attribute the notably shorter timescale of the curve for products II vs. products I to the "concerted" vs. "sequential" nature of the dissociation reactions.In a similar manner, the slope at long time for products III can be understood as arising from the secondary dissociation of ketene to form methene and carbon monoxide.
Although these qualitative differences in the slopes are interesting, the most notable difference between the curves is their magnitude at 500 fs.This would therefore appear a good way of using the experimental result to judge the accuracy of our predicted product yields (and C3/C2 ratio).We remind the reader, that in order to compare the results shown here to the future experimental results, one should bear in mind that the results here are given relative to the maximum peak height of the steady-state PDF [Eq.( 1)]. 58If there were only two dominant sets of reaction products, the depth of the minimum alone would allow one to get a good estimate of the ratio of products, which could be done by fitting a weighted average of the corresponding peak depths.Since there are three dominant reaction products, determining the relative ratios of all three from the electron-diffraction signal is not so straightforward.However, by considering the full ∆PDF(r) signal at 500 fs one can additionally make use of the lineshape to help distinguish the reaction products present.Figure 6 does exactly that.Alongside the signal obtained from our MASH simulation (dashed line), it shows the hypothetical signal that would be obtained for different fractions f 1 and f 2 of products I and products II respectively, with the remaining contribution from products III (f 1 + f 2 + f 3 = 1).
[Note the individual signals that are being averaged can be seen in Figs.S20-S22 in the supplementary material.]This serves as a relatively simple testable quantitative prediction that one can use to assess the relative fractions of these key reaction products.While it is not highly sensitive, there is only a relatively small part of the parameter space which is consistent with the aug-cc-pVDZ results, i.e., 0.65 ≲ f 1 ≲ 0.85.Note the equivalent plot for the 6-31+G* calculations is given in the supplementary material and shows that in this case only the region 0.35 ≲ f 1 ≲ 0.65 would be consistent with the predicted result.This demonstrates that, although the predicted electron-diffraction signal is not radically different in the two cases, with a low enough experimental error one would be able to distinguish between them.
In case it is difficult to accurately determine the depth from the experimental signal, in the supplementary ma-terial we also consider how the lineshape at 500 fs alone could be used to distinguish different reaction products.Figure S23 shows the possible signals at 500 fs normalised so that the depth at their minimum is −1.From this one can see that the most notable change to the line shape with varying fractions, f 1 , of reaction products I, is the depth of the shoulder at about 1.8 Å.Although the changes are relatively subtle compared to the absolute differences of Fig. 6, we see that, with a small enough relative error, it should nevertheless be possible to distinguish between different fractions, f 1 , of reaction products I, and hence different C3/C2 ratios.This could therefore also be used as a direct test of our prediction of the product ratio.

F. Potential systematic errors
Finally, having focused on how sensitive the experiment would need to be to distinguish between different possible reaction pathways, it is natural to ask how confident we are in our predicted signal and how large our systematic errors are likely to be.There are four potential sources of error: 1. electronic-structure theory 2. initial conditions/treatment of the pulse 3. dynamics (including zero-point energy leakage) 4. calculation of the electron diffraction signal.
The established sensitivity of the C3/C2 ratio on the excitation wavelength, means that we expect that the results will be particularly sensitive to errors in the electronic-structure theory and initial conditions.Given the accuracy of MASH in previous benchmark tests and the established nature of the independent atom approximation, we therefore focus here on assessing the errors due to the electronic-structure theory and initial conditions.We have therefore performed a number of additional calculations for which additional figures are given in the supplementary material.
Ideally to confirm the accuracy of our prediction we would perform the same simulations at a higher level of electronic-structure theory, e.g., CASPT2 with the same (12,11) active space.Given the time constraints of the challenge, this is not possible.However, we have calculated CASPT2 energies with the aug-cc-pVDZ basis along linear interpolated (in internal coordinates) paths (Fig. S8), to examine the effect of dynamical correlation.Along the path that leads from the Franck-Condon geometry to the S 2 /S 1 MECI and the S 1 minimum, CASPT2 energies are very similar to CASSCF.This is also observed along the path connecting the S 1 minimum to the first S 1 /S 0 MECI.However, the barriers to access the second and third S 1 /S 0 MECIs increase by 0.7 eV and 0.8 eV, respectively, at the CASPT2 level of theory.Since, the third S 1 /S 0 MECI seems to lead to the C2 products (Fig. S15), it is likely that repeating our augcc-pVDZ simulations at the CASPT2 level would further increase the C3/C2 ratio.
Another thing that might influence the C3/C2 ratio is the initial conditions.Hence, to assess the extent to which the initial conditions may influence the predicted electron diffraction signal, we considered how the vertical excitation energy of each initially sampled geometry correlated with the final reaction products.Under the assumption that the strong wavelength dependence observed in the product ratio at longer wavelengths (within the S 1 window) implies a strong wavelength dependence when exciting to S 2 , then one would expect the initial excitation energy in our simulation to correlate with the final product.To test this hypothesis we performed a two-sample z test on the mean of the vertical excitation energy for products I and products II + III.This resulted in a difference in the means that was not statistically significant with a p value of 0.55.This goes against the hypothesis stated above.This implies that the C3/C2 ratio is not so sensitive at these shorter wavelengths, probably because the large excess of energy makes all conical intersections accessible, in contrast to excitation to S 1 where shorter wavelengths may open new channels significantly altering the product distribution.Hence, it appears from these tests at least that the initial conditions are perhaps not such a large source of systematic error.Of course, there are other features of the initial conditions, such as the dependence of transition-dipole moments which we have not included but could affect the product ratios.
Our most direct information about possible systematic errors comes from the difference between the augcc-pVDZ calculations and the 6-31+G* calculations.As discussed above, when comparing the dynamics within a particular channel both sets of calculations give very similar results.However, where they differ is in their predicted C3/C2 ratio (statistical intervals giving 2.35 to 4.6 for aug-cc-pVDZ vs. 0.75 to 1.35 for 6-31+G*).We have shown that this difference could in principle be observed in the electron-diffraction signal, but the overall difference is small.If one were forced to choose between the two without considering prior photochemical experiments, one would say that aug-cc-pVDZ should be preferred since it is a slightly larger basis than 6-31+G*.It was for this reason, in addition to the more accurate S 0 to S 2 vertical excitation energy that we chose to focus on the aug-cc-pVDZ results in the main paper.We note however, that the aug-cc-pVDZ are not expected to be so much more accurate than the 6-31+G* calculations that we should discount them entirely.Given that prior experimental results indicate that the C3/C2 ratio should be around 1.2-1.3, 54,55,57one might well suggest that the most accurate prediction could be obtained by taking an average of the two sets of calculations.We note, however, that the electron-diffraction signal that results from averaging (shown in supplementary material) is difficult to distinguish by eye from the signal in Fig. 2 or the 6-31+G* result (also in supplementary material).Hence we can conclude that, while there is some uncertainty in our predicted C3/C2 ratio, it seems reasonable to assume that the fraction of trajectories following the C3 pathway is between about 0.5 and 0.85.Overall, given the sensitivity of the electron-diffraction signal observed in Fig. 6 and additional figures in the supplementary material we can therefore be confident that our results are likely to accurately reproduce the timescales and other key features of the planned experimental signal.

IV. CONCLUSION
In conclusion, we have used molecular simulations to investigate the dynamics of cyclobutanone after photoexcitation to the S 2 state.We have focused here on details of the dynamics most relevant to JCP's community challenge.In particular we have made quantitative predictions of the electron-diffraction signal that will be observed in the planned experiment and explained this signal in terms of the main fragmentation reactions observed in our simulation.
In addition to understanding the dissociation of cyclobutanone, this study also serves as the first application of the newly proposed multi-state MASH method, called unSMASH, to an ab initio simulation.By choosing to use MASH for our simulations instead of the more commonly used FSSH, we expect to have gained the following advantages.Firstly, our simulations require no ad hoc decoherence corrections.This is because the deterministic dynamics of MASH means that the electronic variables always remain consistent with the active surface, therefore fixing the inconsistency problem of FSSH. 3,14This is likely to be important for accurately describing the photodissociation process, where it is known that the inconsistency error can lead to suppressed product yields. 13econdly, MASH is able to correctly describe the effects of the nonadiabatic force through its uniquely determined momentum rescaling algorithm.This is crucial for accurately describing the electronic population dynamics, which is known to be particularly sensitive to how the momentum rescaling is performed. 13,59While FSSH can in principle describe this effect by rescaling the momenta along the NACV at a hop and reflecting in the case of a frustrated hop, in practice an isotropic momentum rescaling is almost always used in ab initio FSSH simulations.
While we expect the present study to have captured the most important details of the system, there are a number of areas in which the simulation could be improved.Firstly, we could include intersystem crossing between the singlet manifold and the triplet manifold.This is something that can be rigorously achieved within the MASH framework, as will be described in an forthcoming publication. 30We note, however, that given the rapid nature of the dissociation observed in the present study and the weak spin-orbit coupling, it is unlikely there would be time for sufficient population transfer to the triplet manifold to significantly influence the dynam-ics.Secondly, we may seek to improve the accuracy of our initial conditions.The branching ratio between C3 and C2 channels may depend sensitively on the amount of energy imparted to the system by the photo-excitation. 54,55he Franck-Condon approach we have used, taking the initial distribution as the ground state nuclear Wigner distribution placed on the S 2 state and mimicking the finite width of the exciting laser pulse by convolution with a Gaussian, is a standard way of initialising a simulation in photo-excited systems, 52 however, it is not without approximation.It will, therefore, be interesting in future studies to explore the sensitivity of the results to the initial conditions used, or even explicitly simulate the pulse within the MASH dynamics. 30e note, however, that despite the possible improvements that could be made to the initial conditions, at present they would all rely on the use of a Wigner transformed initial density.This has the advantage that it introduces zero-point energy into the initial distribution.However, this is not without its limitations.Specifically, since the underlying nuclear dynamics is classical, the zero-point energy will eventually (due to anharmonicity) become evenly distributed among the molecular degrees of freedom.This so called "zero-point energy leakage" is a well-known problem. 605][66][67][68][69][70] However, for photochemical problems that are inherently far from the linear-response regime, where imaginary-time pathintegral methods have been shown to give very accurate results, [71][72][73][74][75] it is unclear whether a nonadiabatic RPMD would be the final solution to this problem.This is because RPMD effectively assumes a rapid decoherence of vibrational modes that may introduce additional errors in such low-pressure gas-phase systems. 76,77The search for an optimal method for such problems therefore continues.In future studies it would be interesting to assess the importance of zero-point energy leakage in this system by initialising all modes from a classical Boltzmann distribution, and comparing the resulting simulations.If the classical nuclear limit is valid, then one has the additional advantage that MASH has the correct detailed balance to thermalise to the correct distribution. 3he aspect that will probably have the most significant impact on our dynamics is the accuracy of the electronicstructure theory.In the present study we have chosen a level of theory that minimises the cost while still being sufficient to allow the simulation to describe the key qualitative features of the electronic subspace.We used two different basis sets and found that although the qualitative description of the electron-diffraction pattern is similar, the C3/C2 product ratio is significantly different.In future work it will therefore be interesting to investigate the system using even more accurate electronicstructure theory methods, such as using larger basis sets and including dynamic correlation via CASPT2, MRCI or coupled cluster theory. 78Although this would significantly increase the cost of an on-the-fly simulation, by exploiting modern machine-learning techniques 79,80 one may hope to make such calculations tractable.
There are more advantages of developing a machinelearned model than just making the use of high accuracy electronic-structure theory achievable.In particular, having trained a model, it would then be comparatively inexpensive to perform a systematic analysis of the sensitivity to different initial conditions.Furthermore, it would give the opportunity to use more expensive dynamics methods that are impractical for on-the-fly calculations.Such studies would complement the aims of the JCP challenge, by providing an objective comparison of the accuracy of different dynamics methods and the approximations they make, and helping to push the boundaries of accuracy in excited-state simulations.Given the right potential-energy surfaces and couplings, we believe that MASH can be a very powerful simulation tool for obtaining reliable predictions of photochemistry.

SUPPLEMENTARY MATERIAL
See the supplementary material for additional numerical results referred to in the main text.draft (supporting), Code testing (joint), Analysis of accuracy of electronic-structure theory (supporting).J.R.M. code testing (joint), writing manuscript first draft (supporting).M.A.M. Analysis of accuracy of electronicstructure theory (lead), development of interface between MASH code and molpro (joint), writing manuscript first draft (supporting).K.A. Triplet simulations (lead), writing manuscript first draft (supporting).A.K. writing manuscript first draft (supporting).J.O.R. Initial Conditions (lead), Calculation of electron-diffraction signal (lead), writing manuscript first draft (supporting), conceptualisation (lead), project management (lead).

DATA AVAILABILITY
The data that supports the findings of this study are available within the article and its supplementary material.

II. INITIAL DISTRIBUTION
The ground-state potential energy surface was explored at the level of B3LYP/def2-TZVP.These density-function theory calculations were carried out in ORCA. 7At this level of theory, cyclobutanone was found to show a doublewell structure in the puckering mode.The simple harmonic approximation therefore becomes unreliable.In fact, cyclobutanone was known to have a double well from early infrared and Raman spectroscopy. 8,9he saddle point of the double well is the most useful reference geometry.The predicted rotational constants using the rigid-rotor approximation around this geometry are given in Table V, which are in good agreement with the experimentally determined values except for A. However, note that A is strongly dependent on the vibrational state (for example, it drops to 10738.0MHz for the first vibrationally excited state) which explains why the calculated result based on the rigid-rotor approximation is too large.The harmonic frequencies at this geometry are presented in Table VI.Frequencies greater than 2500 cm −1 were scaled by 0.96 to account for anharmonic effects as well as deficiencies of the approximate DFT treatment.This is seen to give a systematic improvement in the comparison to experiment.Note that for the harmonic analysis as well as all the dynamics calculations, we use the atomic mass of the most abundant isotope.
The harmonic approximation is clearly not sufficient for the puckering mode.However, one can easily obtain the anharmonic Wigner function for this single mode directly from the DVR function using Fourier transforms of products of Hermite polynomials, much of which can be worked out analytically.
We first performed a scan along the puckering normal mode and obtained the vibrational wavefunctions for this single degree of freedom using Hermite DVR. 11Unfortunately, this gave a vibrational spacing of about 142 cm −1 , which is significantly larger than the experimentally observed result of 35 cm −1 .This implies that the approximation to treat the normal modes as separable is at fault.Next, we computed a "minimum-energy" pathway as described in the main text.Unlike the scan along the normal mode, this path is curved.The potential along the "puckering path" is plotted in Fig. S9.Using DVR we computed the vibrational energy levels and wavefunctions (shown in Table VII and Fig. S9).With this approach, the calculated levels match much better with experiment.
In order to compute the Wigner function along the puckering path, we employ a finite-basis representation of harmonic-oscillator eigenstates, for n = 0, 1, 2, . . ., where a is an arbitrary scaling factor.Using DVR, the ground-state wavefunction in the puckering mode is found as a linear combination of harmonic-oscillator states as transition DVR Expt.for the puckering mode based on the one-dimensional DVR along the minimum-energy pathway compared with experimental data from far-infrared spectroscopy. 9 The Wigner function is then defined as where for m < n and L n−m m is the generalized Laguerre polynomial.For the final step, we were inspired by Eq. (7.377) from Gradshteyn and Ryzhik, 12 although their formula appears to have an error which had to be corrected by exchanging the arguments of the two Hermite polynomials.
The Wigner function given above is valid only in the low-temperature limit.However, we wish to initialize the molecule at T = 325 K. Therefore, not just the ground state, but the lowest 25 vibrational states were computed (with energies up to about 8k B T ) and the thermal Wigner function defined as a Boltzmann weighted sum over these states: W (x, p) = ν W (ν) (x, p) e −βEν /Z, where Z = ν e −βEν and ν labels the vibrational state.
Because of the relatively high temperature in this case, it was found to give results quite similar to a simple classical approximation W cl (x, p) = e −βV (x) e −βp 2 /2 /(2πℏZ cl ), where Z cl = (2πℏ) −1 e −βV (x) e −βp 2 /2 dx dp.However, the quantum-mechanical approach is more general and was used in our simulations.The thermal Wigner function is shown in Fig. S10.
Coordinates and momenta were sampled from the Wigner function in such a way that in the final ensemble all trajectories will contribute with equal weight.We do this as generating samples is cheap, but running trajectories is computationally expensive and we wish to avoid running an expensive trajectory only to find that it contributes with a small weight.In principle it is possible for the Wigner function to have negative values, in which case the assigned weights would be negative.However, in practice, we did not encounter any negative values in our distribution due to the relatively high temperature.
Finally, the perpendicular modes were sampled.This was done by first interpolating Hessians along the puckering path.Then for each sample in the ensemble, we projected out translations and rotations as well as the vector along the puckering path and diagonalized the mass-weighted Hessian to obtain frequencies and perpendicular normal modes.Frequencies greater than 2500 cm −1 were scaled by 0.96 to account for anharmonic effects as well as deficiencies of the approximate DFT treatment.Coordinates and momenta corresponding to each of these modes were sampled from thermal Wigner functions using the harmonic approximation before being transformed back into Cartesian coordinates.The perpendicular frequencies obtained along the path are shown in Fig. S11.It is seen that they are much larger than the puckering frequency as well as relatively slowly varying, which justifies our approximation to adiabatically separate them from the puckering path.Angular momentum was included by sampling three mass-weighted angular momenta p θ , p ϕ and p ψ from a normal distribution with variance 1/β.These scalars were used to provide the magnitude of the angular momentum vector (computed for each geometry along the puckering path) that was added to the sampled momentum.
For each geometry in our ensemble, we ran a single-point CASSCF calculation to determine the excited-state energies and transition dipole moments.These results are presented in Figs.S12, S13 and S14.It is seen that only the S 2 state is in resonance with the excitation energy of 200 nm, which is equivalent to 6.2 eV.It is also seen that the transition dipole moment between S 0 and S 2 does not vary dramatically across the ensemble.
We found no correlation between the norm of the transition dipole moment and the puckering mode and instead found that it depends on the high-frequency modes.As these are treated quantum mechanically via the Wigner function, it would not be valid to simply weight each geometry in the ensemble by the square of the transition dipole moment.In fact, the correct way to account for the variation of the transition dipole moment is to include it in the calculation of the Wigner function.However, for simplicity, in our treatment, we effectively treat it as a constant.
We also neglect the effects of the pulse shape and allow all trajectories to start in S 2 regardless of the S 2 -S 0 energy gap.We can see no rigorous justification for simply rejecting samples whose S 2 -S 0 energy gap is outside the bandwidth of the laser.The only rigorous approach would require us to simulate the pulse in real time.We believe it will be possible to extend MASH to treat such problems in future work.

III. NUMERICAL IMPLEMENTATION OF MASH
We give here the technical details of how the multi-state MASH calculations were performed.Note that the method described here is the uncoupled-sphere MASH method (unSMASH).We consider here only the case of a system starting in a pure adiabatic state (i.e.no initial adiabatic coherences).

B. Equations of motion
Here we give a detailed description of the algorithm used to evolve the MASH equations of motion.We note that alternative (but formally equivalent) versions of the algorithm are possible.In particular, as discussed in the main text, the present algorithm predominantly makes use of the non-adiabatic coupling vectors: rather than the overlaps This is because the overlaps were found to be significantly more expensive to calculate than the non-adiabatic coupling vectors.However, it would be straightforward to modify the algorithm to predominantly make use of the overlaps if this were not the case.Of course, as is well documented, [13][14][15] it is necessary to use the overlaps rather than the nonadiabatic coupling vectors in the vicinity of a conical intersection, and for this reason the overlaps are used whenever one of the energy gaps drops bellow a predetermined threshold, set here as V cut = 2000 cm −1 .The algorithm is as follows: 1. Propagate the positions and momenta using a velocity Verlet step from t to t + δt on the current active surface: At this point the electronic structure code is called, and the new position is used to calculate the adiabatic potentials V , the force on the current active state, F = − ∂Vn ∂q , and the N − 1 derivative couplings between the active state and all other states, d n,j .Additionally, if any of the |∆V i,j (t)| < V cut or |∆V i,j (t + δt)| < V cut then the overlaps O i,j (t, t + δt) are calculated.Note the sign of the adiabatic wavefunctions are determined using the scheme described in Sec.III C.
The momenta are then updated under the new force (note these are not yet the final momenta) 2. Propagate the electronic variables from from t to t + δt according to where in which the averaged adiabatic energy gap, ∆V n,j , is given by ∆V n,j = ∆V n,j (q(t)) + ∆V n,j (q(t + δt)) 2 (12c) (∆V n,j = V n − V j ) and the averaged non-adiabatic coupling, T n,j , is calculated either directly from the nonadiabatic coupling vectors as or, if either |∆V n,j (t)| < V cut or |∆V n,j (t + δt)| < V cut then it is calculated from the matrix logarithm of the orthogonalised overlap matrix 13,14 T n,j = 2 δt asin Õ(n,j) a set of adiabatic wavefunctions with arbitrary signs at a series of times, {ϕ j (αδt) | α ∈ Z}, then we we define the set of wavefunctions with consistent signs as {ψ j (αδt) = σ j (αδt)ϕ j (αδt) | α ∈ Z}, where σ j (0) = 1.Away from conical intersections, in regions where the overlaps are not calculated, we impose the continuity of the adiabatic wavefunctions by requiring that sign d (n,j) q((α + 1)δt) • d (n,j) q(αδt) = 1 (14)   in terms of the derivative coupling calculated using the uncorrected wavefunctions, d (n,j) ϕ , this means σ j ((α + 1)δt)σ n ((α + 1)δt) × sign d Now since it is only the relative signs that are important we can define σ n ((α + 1)δt) = σ n (αδt), and hence we have from which we can obtain the sign corrected derivative coupling When the overlaps are available we simply define the signs to maintain a positive diagonal in the overlap matrix With this scheme we obtain a unique set of σ j (αδt), that give a continuous set of wavefunctions as the time step δt → 0. The overlap matrices are then simply calculated from the output of the electronic structure code as D. Interface to electronic structure package The unSMASH integrator was interfaced with Molpro 2023 16 for the calculation of all necessary electronic properties.This interface was designed to minimise the cost of the CASSCF calculations by requesting Molpro calculate only the necessary quantities at each step (e.g.gradients, NACVs or overlaps).Additionally to aid in the convergence of the SCF calculation, each step used the previous wavefunction as an initial guess.

IV. ANALYSIS OF HOPPING POINTS
To determine if there is a correlation between the products and the MECIs, we have projected the geometries of final hops in all trajectories along coordinates s a and s b , which are defined as sums of inverse C-C bond lengths: where, r αβ1 is the bond length between the α-C and one of the β-C atoms, and so on.We have verified that r a and r b can differentiate between the various critical points (solid diamonds in Fig. S15).The hopping points are coloured according to the product they result in.Note that product IV refers to trajectories where the fragments correspond to none of the products defined in the main text (this only happens in 11 (3) trajectories at the aug-cc-pVDZ (6-31+G*) level of theory).
We find that MECI-b is never visited by any of the hopping points, which is consistent with it being the highest energy MECI (Table IV).We also see that product I nearly always hops near MECI-a but products II, III and IV can be formed by hopping over a much wider region, which may be near MECI-a or MECI-c.

V. TRIPLET POPULATIONS
As described in the main text, we do not include the triplet states in the simulation.In order to check whether this approximation holds, using 6-state (3 singlet and 3 triplet states) SA-CASSCF with a (12,11) active space and aug-cc-pVDZ basis set, we calculated the energies, overlaps and SOCs between the singlet and triplet states over 10 randomly selected MASH trajectories.Figure S16 shows the energies of the singlet and triplet states along 2 of these trajectories.Note that the 6-state SA-CASSCF calculations yield slightly different energies for the singlet states compared to the 3-state calculations used to generate the trajectories, although the behaviour is qualitatively similar.
We then simulated the electronic dynamics along each of the selected trajectories (i.e., neglecting the back reaction), using the overlaps and the SOCs to determine the time-evolution of an electronic wavefunction initialized in a pure S 2 state.Figure S17 shows the singlet and total triplet populations for two of the selected trajectories.The total triplet population was found to be less than 0.6% along all trajectories considered.

VI. ELECTRON DIFFRACTION
The electron-diffraction signal was simulated within the single-scattering independent-atom approximation 17 using the following formulae for the differential cross sections I(s): where f i (s) and g i (s) are the atomic form factors for direct and spin-flip scattering events on atom i.Note that the spinflip g i (s) factors were included in our calculations even though they are significantly smaller than the corresponding f i (s) and could thus probably be safely neglected.Note that various definitions of the pair distribution function (PDF) are used in the literature.Our definition is equivalent to that used in Ref. 18 and is related to that used in Ref. 17 by P (r) = rPDF(r).
The atomic form factors were generated using the ELSEPA package. 19The default settings were used, that is a Fermi nuclear charge distribution, Dirac-Fock electron density and Furness-McCarthy exchange potential.The de Broglie wavelength of the relativistic scattering electron is λ = h/p, where E 2 = (pc) 2 + (mc 2 ) 2 and E = E kin + mc 2 .We assume the electron has kinetic energy of E kin = 3.7 MeV as in Ref. 18, which gives λ = 0.00297 Å.The momentum transfer for an elastic collision is ℏs, where and θ is the scattering angle (from 0 to π).
The steady-state signal generated from the initial distribution is given in Fig. S18.There are only very minor differences in the simulated PDF between the ensemble average and the result from a single geometry.

VII. ADDITIONAL NUMERICAL RESULTS
We give here a series of additional numerical results to support the findings of the main paper.This includes additional results for the calculations performed using the 6-31+G* basis.For these calculations we used the same set of 200 initial geometries, momenta and effective Bloch-spheres as for the aug-cc-pVDZ calculations.In total there were 199 trajectories that ran for 500 fs, trajectories for which spin-contamination resulted in a failure of the SCF convergence were again finished using SS-CASSCF.
Figures S20-S22 show the ultrafast electron diffraction signal resolved by the final reaction products for both the aug-cc-pVDZ basis and also the 6-31+G*.
The total yields of each molecular fragment are given in Table IX, from which we see that the products produced are broadly consistent between different choices of basis set.Table X shows the yield of the main reaction pathways at 500 fs for the calculations with the 6-31+G* basis set.From this (including also those undissociated molecules in the C3 pathway) we can calculate the fraction of all C3 and C2 trajectories that dissociate via C3 as 0.5 with a 95% Wilson score confidence interval of (43%, 57%) corresponding to a C3/C2 ratio of 1 with the a statistical confidence interval at 95% of (0.75, 1.35).
z is instead sampled from the probability density ρ 2 (S (2,1) z ) = 2h(S (2,1) z )|S (2,1) z | (where h(x) is the Heaviside step function), with S (2,1) x and S (2,1) y chosen uniformly from the corresponding circle on the Bloch sphere.It is due to this ensemble that MASH is able to describe wavepacket bifuraction.

FIG. 1 .
FIG. 1. Calculated steady-state PDF (in Å−2 ) for the initial distribution.Also shown in orange is a histogram of the atom pair distribution (carbons and oxygens only).The inset shows the atom pair distances (in Å) for the C2v geometry.

FIG. 2 .
FIG. 2. Simulated ultrafast electron-diffraction results.The panels on the left show the change in the probability density function relative to the initial configuration.The panels on the right show the same data convolved with a 160 fs (FWHM) Gaussian to simulate the instrument response function.Blue is loss, red is gain, with equally spaced contour levels showing the height of the ∆PDF(r) signal relative to the maximum peak height in the steady-state PDF.

2 FIG. 3 .
FIG.3.Average (unconvolved) adiabatic populations as a function of time for the 198 trajectories.Shaded region shows an approximate 95% confidence interval (the Wilson score interval).53 FIG.S3.The natural orbitals of AS3, a(12,11) with the aug-cc-pVDZ basis set active space, which adds on C-H σ-bonds and two C-C σ * -bonds.
FIG. S5.Geometries of the (a) Product I, (b) Product II and (c) Product III.Note that this data is also available as an xyz file in the supplementary material.
FIG. S7.Geometries of the minima on (a) S1 and (b) S2.Note that this data is also available as an xyz file in the supplementary material.
FIG.S9.Potential energy (blue) along the puckering path at B3LYP/def2-TZVP level.A one-dimensional DVR leads to the wavefunctions shown by the black dotted lines.The resulting quantum Boltzmann distribution at T = 325 K is shown in orange (with arbitrary units).This function is well approximated by a classical Boltzmann distribution, shown with a green dotted line.
FIG. S10.Wigner function along the puckering path.The small wiggles are probably due to the finite truncation and representation on a discrete grid, but are not expected to significantly influence the results.
FIG. S14.Scatter plot of the energies and norms of the transition dipole moments of the S1 and S2 electronic states (with respect to the ground state) for all structures in the initial distribution (aug-cc-pVDZ on the left and 6-31+G* on the right).The red horizontal line marks 6.2 eV which corresponds to the experimental excitation energy.
(n,j) z > 0 corresponds to being on state n.Note that, as is true for the two-state Bloch sphere FIG. S15.Final hops between S2 and S1 (top) and between S1 and S0 (bottom) have been projected along the ra and r b coordinates (aug-cc-pVDZ on the left and 6-31+G* on the right).
FIG. S18.(a) Calculated electron-diffraction signal for the initial distribution.(b) Calculated PDF based on transforming the signal from panel (a).Also shown in orange is a histogram of the atom pair distribution (carbons and oxygens only) without the smearing caused by the limited range of s accessible to experiment.The inset shows the atom pair distances in the C2v geometry.
FIG.S20.Simulated ultrafast electron-diffraction results for the trajectories that correspond to rxn. products I at t = 500 fs, left calculated using the aug-cc-pVDZ basis set and the right with the 6-31+G* basis set.The panels on the left of each subfigure show the change in the probability density function relative to the initial configuration.The panels on the right show the same data convolved with a 160 fs (FWHM) Gaussian to simulate the instrument response function.Blue is loss, red is gain, with equally-spaced contour levels showing the height of the ∆PDF(r) signal relative to the maximum peak height in the steady state PDF.
FIG.S22.Simulated ultrafast electron-diffraction results for the trajectories that correspond to rxn. products III at t = 500 fs, left calculated using the aug-cc-pVDZ basis set and the right with the 6-31+G* basis set.The panels on the left of each subfigure show the change in the probability density function relative to the initial configuration.The panels on the right show the same data convolved with a 160 fs (FWHM) Gaussian to simulate the instrument response function.Blue is loss, red is gain, with equally-spaced contour levels showing the height of the ∆PDF(r) signal relative to the maximum peak height in the steady state PDF.

65 FIG 65 FIG 2 FIG
FIG. S24.Simulated ultrafast electron-diffraction results calculated using the 6-31+G* basis set.The panels on the left show the change in the probability density function relative to the initial configuration.The panels on the right show the same data convolved with a 160 fs (FWHM) Gaussian to simulate the instrument response function.Blue is loss, red is gain, with equally-spaced contour levels showing the height of the ∆PDF(r) signal relative to the maximum peak height in the steady state PDF.

FIG
FIG. S30.Histogram showing the vertical excitation energy for trajectories that produce products I (C3) and those that produce products II+III (C2).

TABLE I .
Total product yields 500 fs after initial excitation.Note the total number of initial C 4 H 6 O molecules is 198.Fragments are identified by using a cutoff radius of 2 Å.H 6 +CO C 2 H 4 +C 2 H 2 O C 2 H 4 +CH 2 +CO 2 to S 1 ative to the initial configuration.Dashed lines show the unconvolved results from the 198 trajectories, and solid lines show the convolved results using a 160 fs (FWHM) Gaussian to simulate the instrument response function.Note the height of the ∆PDF(r) signal in all cases is given relative to the maximum peak height in the steady-state PDF.

TABLE I .
Vertical excitation energies (VEE) and the magnitude of the transition dipole moment (|µ|) for the three active spaces and two basis sets at the B3LYP/def2-TZVP C2v geometry, using 3-state SA-CASSCF.

TABLE IV .
Relative energies of minimum energy conical intersections (MECI) and crossing points (MECP) at the C2v geometry.The energy of a crossing is defined as the average energy of the two states involved and the spin-orbit coupling (SOC) is calculated using the Breit-Pauli Hamiltonian.All crossings were optimized at the SA-CASSCF/6-31+G* level of theory, with a state-averaging over S0, S1, and S2 (S0, T1, S1, T2 and S2) for the MECIs (MECPs).The three S1/S0 MECIs are numbered for future reference.

TABLE V .
Rotational constants (in MHz) calculated from rigid-rotor approximation using the C2v saddle-point structure compared with experimentally determined values for the ground vibrational state from Ref. 10.

TABLE VI
. Harmonic normal-mode frequencies (in wavenumbers) of ground-state C2v saddle point according to B3LYP/def2-TZVP (excluding the low-frequency puckering mode).Experimental results are from gas-phase infrared spectroscopy of Frei and Günthard 8 , except where indicated by a *, for which liquid-phase Raman spectrum are given instead.
The average is taken over the nuclear distribution obtained from the ensemble of trajectories at given time, where r ij is distance between atoms i and j.No extra broadening is necessary as the Wigner distribution used in our study accounts for zero-point energy effects.The Fourier transform used to generate PDF is damped by a Gaussian with parameters given in TableVIII.

TABLE IX .
FIG.S19.Average (unconvolved) adiabatic populations as a function of time for the 141 C3 trajectories (those that produce C 3 H 6 ) left and the 45 C2 trajectories (those that produce C 2 H 4 ) right (calculated with aug-cc-pVDZ basis).Shaded region shows an approximate 95% confidence interval (the Wilson score interval).20Totalproduct yields 500 fs after initial excitation for the calculations with the 6-31+G* basis set.Note the total number of initial C 4 H 6 O molecules is 199.Reaction products are identified by using a cutoff radius of 2 Å.H 6 + CO C 2 H 4 + C 2 H 2 O C 2 H 4 + CH 2 + CO

TABLE X .
Main reaction products at 500 fs for the calculations with a 6-31+G* basis set.Reaction products are identified by using a cutoff radius of 2 Å.