TRAVIS (“Trajectory Analyzer and Visualizer”) is a program package for post-processing and analyzing trajectories from molecular dynamics and Monte Carlo simulations, mostly focused on molecular condensed phase systems. It is an open source free software licensed under the GNU GPL, is platform independent, and does not require any external libraries. Nine years after the original publication of TRAVIS, we highlight some of the recent new functions and features in this article. At the same time, we shortly present some of the underlying algorithms in TRAVIS, which contribute to make trajectory analysis more efficient. Some modern visualization techniques such as Sankey diagrams are also demonstrated. Many analysis functions are implemented, covering structural analyses, dynamical analyses, and functions for predicting vibrational spectra from molecular dynamics simulations. While some of the analyses are known since several decades, others are very recent. For example, TRAVIS has been used to compute the first *ab initio* predictions in the literature of bulk phase vibrational circular dichroism spectra, bulk phase Raman optical activity spectra, and bulk phase resonance Raman spectra within the last few years.

## I. INTRODUCTION

Within the last few decades, the methods of molecular dynamics (MD) and Monte Carlo simulations have become a central part of computational chemistry and physics. In contrast to static calculations, they implicitly include the effect of entropy at finite temperatures. Most experimentally accessible observables are only defined as ensemble averages, rendering it indispensable to rely on ensemble averaging also in computations if experimental results shall be reproduced or predicted. The direct result of a simulation is a trajectory that contains the positions and velocities of the particles in equidistant discrete time steps. Mathematically speaking, such a trajectory is a path through a 6*n*-dimensional space if *n* particles have been simulated. Such a high-dimensional object is not suitable for direct evaluation—methods for reducing the dimensionality are required. These methods will be termed trajectory analyses here.

It is a commonly known phenomenon that many working groups from the field of computational science use their own in-house scripts and codes to evaluate and analyze simulation trajectories. These codes all have their specific scopes, strengths, and limitations, and despite being developed in independent groups, many methods and algorithms are for sure almost equivalent in all these tools. This comes with several disadvantages. On the one hand, it wastes a lot of valuable scientific manpower by “reinventing the wheel” hundreds of times. On the other hand, it increases the risk of subtle scientific or coding problems not being detected because the user base of the code is often only one working group and therefore rather small. Hence, it is highly desirable to have common tools for analyzing trajectories, which are utilized by a large group of scientists.

In 2011, we published the TRAVIS (“Trajectory Analyzer and Visualizer”) program package^{1} as an open source free software, which specifically aims at providing as many trajectory analyses as possible within a unified program so that many scientists are hopefully no longer forced to develop their own in-house solutions to analyze trajectories. There already exist several other software packages for analyzing trajectories, e.g., the Visual Molecular Dynamics (VMD) program package^{2} or Python frameworks such as MDAnalysis.^{3} However, by its huge number of available analyses and by its interactive command-line user interface, TRAVIS is in a certain sense unique. Since the release in 2011, the original TRAVIS article^{1} has already been cited more than 400 times by more than 800 different co-authors (*according to the journal homepage*), and around 95% of these publications actually used the software to produce results in the manuscript. This gives us the impression that our code is somehow useful for the community.

In this article, we want to highlight some of the recently added methods and features in TRAVIS. At the same time, we are going to shortly introduce some of the underlying algorithms that are typically invisible for users but are, of course, very important to achieve the desired efficiency and accuracy of trajectory analyses. This manuscript is structured as follows: After a discussion of general features of TRAVIS in Sec. II, we will discuss specific features from the fields of structural, dynamical, and spectroscopic analyses in Secs. III–V. In this context, structural analyses are defined as analyses that are invariant under any re-shuffling of the trajectory frames, while dynamical analyses somehow depend on the ordering of the frames. Of course, spectroscopic analyses are also dynamical analyses, but they are discussed separately. We end our article with some conclusions.

## II. GENERAL FEATURES

Being licensed under the GNU GPL v3 license, TRAVIS is an open source free software and can be downloaded from http://www.travis-analyzer.de/.

Apart from the source code, Windows executables can also be obtained from the homepage. New versions, including bug fixes and additional features, are regularly supplied. TRAVIS is written in C++, and the developers’ version currently contains around 270 000 lines of source code. It does not require any external libraries or dependencies (except from a C++ compiler) and is platform independent. TRAVIS is currently not parallelized, i.e., it runs on one single central processing unit (CPU) core. However, we are working on introducing optional OpenMP-based parallelization for some suitable modules.

TRAVIS reads many different input trajectory formats, currently including XYZ, PDB, mol2, DL_POLY,^{4} LAMMPS,^{5} Amber,^{6} Gaussian Cube, and compressed BQB trajectories.^{7} Non-orthorhombic cell geometries and variable cell vectors (*NpT ensemble*) are supported. The cell geometry can directly be read from the trajectory if present. Very large systems with several hundred thousand of atoms can be easily handled.

TRAVIS can be used as an interactive program, asking questions to the user in a text-mode console interface. Most of the questions are more or less self-explanatory, and reasonable default values are provided in most cases. On the other hand, TRAVIS can also be used for batch processing with the help of input files. When creating plots and visualizations, TRAVIS does not directly output image files, but rather input files for visualization programs such as VMD,^{2} xmgrace,^{8} Gnuplot,^{9} Povray,^{10} and Mathematica.^{11} By supplying these files to the corresponding programs, high-quality images can be easily obtained without requiring much knowledge of the visualization program.

### A. Fast molecule recognition via domain decomposition

TRAVIS only reads atom labels, positions, and optionally velocities from the input trajectory; any topological information (*molecules, bonds, etc.*) is ignored. Instead, a distance-based bond detection is executed in one trajectory frame, and molecules are defined on the basis of the detected covalent bonds. A pair of atoms is defined to be covalently bonded if its distance is smaller than the sum of the covalent radii of the two atom types (*taken from the literature*)^{12} multiplied with a constant to account for thermal vibrations (*by default 1.15, but can be chosen differently*). To detect bonds, a straight-forward (“canonical”) approach of checking the distance of all pairs of atoms in the system is very inefficient, as its computational time scales with $On2$. TRAVIS uses a more sophisticated approach, which is based on a spatial domain decomposition. Domain decompositions and similar approaches such as the linked list method^{13} are commonly used in computational chemistry (*in particular, in parallelized force field molecular dynamics codes such as LAMMPS*^{5} *and Gromacs*^{14}) but have rarely been applied to trajectory post-processing. First, the longest possible bond distance is determined by finding the element in the system with the largest covalent radius, and taking this radius twice, multiplied by the constant mentioned above. Then, the simulation cell is tessellated into cubes of equal size and shape. The number of cubes along each cell vector is chosen as large as possible while still satisfying the constraint that the edge length of the cube is larger than the longest possible bond distance. All atoms in the system are then binned into these cubes. To find all covalent bond partners of a certain atom inside of such a cube, only atoms within the same cube and within the 26 neighboring cubes need to be considered. This yields a very fast bond detection algorithm that essentially scales with $On$.

The efficiency of this approach is easily visible in Table I, where the molecule recognition times for similar systems (containing ion pairs of the ionic liquid [EMIm][OAc], i.e., 1-ethyl-3-methylimidazolium acetate) of increasing size are shown. While the fourth column gives the molecule recognition time with the canonical algorithm, the fifth column depicts the timing of the domain decomposition approach. For relatively small systems, both methods are acceptable. However, for a system with around 500 000 atoms in a 17 nm cubic cell, the canonical approach takes almost 1 h for the molecule recognition, while the implementation based on domain decomposition only runs for 1 s.

Ion pairs . | Atoms . | Cell (nm) . | t_{Can}(s)
. | t_{Dom} (s)
. |
---|---|---|---|---|

36 | 936 | 2.121 | 0.003 | 0.001 |

288 | 7 488 | 4.242 | 1.036 | 0.001 |

2 304 | 59 904 | 8.485 | 83.009 | 0.011 |

18 432 | 479 232 | 16.969 | 3240.000 | 1.002 |

Ion pairs . | Atoms . | Cell (nm) . | t_{Can}(s)
. | t_{Dom} (s)
. |
---|---|---|---|---|

36 | 936 | 2.121 | 0.003 | 0.001 |

288 | 7 488 | 4.242 | 1.036 | 0.001 |

2 304 | 59 904 | 8.485 | 83.009 | 0.011 |

18 432 | 479 232 | 16.969 | 3240.000 | 1.002 |

After the molecule recognition, topological atom codes are assigned to the atoms by using Shelley’s extension^{15} of the Morgan algorithm.^{16} The algorithm guarantees to give identical codes to topologically equivalent atoms and different codes otherwise. Atom numbers are assigned by descending atom codes. As a by-product, TRAVIS maintains a list of topologically identical atoms, which is helpful for many analyses. Furthermore, a full ring system detection and refinement^{17} is carried out, as some of the analyses are well-suited for cyclic structures. Finally, all individual molecules with identical sets of atom codes are considered to be topologically equivalent and grouped together as molecule types. We would like to point out again that this full protocol only takes 1 s on one CPU core for a system with around 500 000 atoms and 40 000 molecules, as shown in Table I.

### B. The BQB format—Lossless compression of trajectories and volumetric data

In 2018, we developed and published an approach for a highly efficient lossless compression of volumetric data trajectories^{7} and introduced the BQB file format to store the compressed data (“*lossless*” *indeed means that the input Gaussian Cube file is bit-wise reproduced*). The methods in TRAVIS for computing vibrational spectra (*see below*) require the total electron density on a grid along the trajectory, easily amounting to several terabytes of raw data. With our novel compression scheme, these data can be compressed significantly without losing any information. Figure 1 gives some details on the performance of the method. While multi-purpose compression tools such as bzip2 and xz fail to exploit the particular structure of the data (*smoothness in both space and time*) and only reach a compression ratio of around 5:1, the BQB format yields a lossless compression ratio of around 40:1 while still being faster than the former two methods in compression and decompression time. Furthermore, the BQB format allows for random access to the frames, which is, of course, not the case with multi-purpose compression formats. The high compression efficiency is achieved by applying a sequence of methods, including multi-dimensional polynomial extrapolation, Hilbert curve re-ordering, and canonical multi-table Huffman encoding.^{7}

While our original aim was to compress volumetric data trajectories, we noted that the same methodology is also able to losslessly compress standard position trajectories. When starting from a XYZ trajectory file with a frame interval of 0.5 fs, the BQB format reaches a compression ratio of around 20:1 here. This is much more efficient than other existing trajectory formats such as DCD or Gromacs XTC^{14} while not sacrificing accuracy. More details can be found in Ref. 7.

The BQB format and the methods described above are implemented in TRAVIS, which can be used to compress or decompress volumetric or position trajectories, or directly read and analyze such trajectories. Apart from that, there is also a stand-alone tool for compressing and decompressing trajectories, called the “bqbtool.” A library (“libbqb”) to add support for the BQB format to other program packages will be released soon. For more information, refer to the website of the BQB format.^{18}

## III. STRUCTURAL ANALYSES

In this section, some structural analyses implemented in TRAVIS will be presented. As mentioned above, structural analyses are defined as analyses that are invariant under any re-shuffling of the trajectory frames in this context. Due to this property, structural analyses can also be applied to Monte Carlo simulations because they do not require that a physical time elapses along the trajectory.

### A. Particle density histograms

Probably, the most common kinds of trajectory analyses are particle density histograms. In particular, the **radial distribution function** (RDF), also known as $gr$, is found in countless publications on the subject of liquid phase simulation. This was also one of the first analyses that has been implemented in TRAVIS. Apart from the RDF itself, TRAVIS also writes out the corresponding number integral curve so that, e.g., the number of molecules within the first solvation shell can be easily determined. Recently, we added a function that automatically recognizes the first maximum and the first minimum of the RDF and prints out the distance as well as the intensity of these two points in the log file together with the particle number found within the first shell. This is useful for batch processing if many different trajectories are evaluated. Another recent addition is an algorithm for computing RDFs up to larger distances than half of the cell diameter,^{19} which can save some computer time. RDFs computed with TRAVIS have appeared many times in the literature,^{20–32} including number integrals and coordination numbers.^{33–38}

Apart from RDFs, TRAVIS can also create other one-dimensional particle density histograms, e.g., involving angles and dihedral angles. The corresponding functions are then called **angular distribution functions** (ADFs) and **dihedral distribution functions** (DDFs), respectively. Furthermore, the distance between a point and a line can be monitored, where the point can be any atom in some molecule and the line can be defined by any other two atoms in the same or a different molecule. The corresponding analysis is called **point–line distribution function** (LiDF). Similarly, the distance between a point and a plane can be observed, where the plane can be defined either by three points in some molecules or by specifying a base point and a normal vector. This analysis is termed **point–plane distribution function** (PlDF). For all these functions, the temporal development of individual pairs can also be observed. In the case of RDFs, this would yield a plot with selected (*or all*) pair distances against the simulation time to give an example.

If the simulation is not in thermal equilibrium, some of the histograms might change over simulation time. TRAVIS offers to create a 2D contour plot with the histogram quantity on one axis and the simulation time on the other axis. This analysis, which we call time-dependent distribution function (TDDF), investigates if some processes (*phase separation or crystallization*) take place during a simulation. Currently, only RDFs are supported for TDDFs, but the other quantities will be implemented soon.

Another frequently used class of particle density histograms are **spatial distribution functions** (SDFs). Here, a reference molecule is fixed by defining a local coordinate system based on three points from the molecule. Afterward, the average particle density of some other molecules around the fixed reference molecule is computed. As the resulting particle density is a volumetric dataset, only iso-surfaces of these data can be visualized. TRAVIS offers to output the resulting volumetric data as a particle density (nm^{−3}) or relative to the uniform density, similar to RDFs. Furthermore, there are some advanced functions implemented such as smoothing SDFs, inverting SDFs, creating cut planes, and forcing SDFs to be symmetric to certain planes to improve sampling. There are many examples where SDFs computed by TRAVIS have been published in the literature.^{20,21,24,25,32,39–48}

If the simulation cell is non-isotropic (e.g., *systems with a surface or lipid bi-layer*), it can be interesting to compute **density profiles** (DProf) along a fixed vector in the simulation cell. The result is a histogram that gives the particle density of a selected particle type (*either in* nm^{−3} *or relative to uniform density*) in thin slices of the system perpendicular to the chosen vector. Density profiles computed by TRAVIS have already been published.^{29,49,50}

Recently, we implemented a function called **geometric density analysis** (GeoDens) to observe the temporal development of the average particle density within a geometric region along the trajectory. In the specific application, we defined a cylinder between the two amino acid molecules involved in a salt bridge.^{31} The cylinder has a fixed radius, but the starting and ending points are tied to the molecules so that the cylinder moves together with the molecules. By monitoring the particle density of anions within this cylinder relative to the uniform density, we were able to conclude that the anions are repelled from the salt bridge (see Fig. S-18 of Ref. 31).

### B. Combined distribution functions

If two observables are correlated, the individual histograms of the observables do not give a complete picture of the situation, and a multi-dimensional histogram is required to understand the correlation effects (for example, see Fig. 2). Within a liquid phase trajectory,^{23,24} the length of a certain hydrogen bond is depicted on the horizontal axis, while the hydrogen bond angle is shown on the vertical axis. The 2D contour plot possesses a very strong signal, indicating a hydrogen bond, at low distances and angles of around 180°. The plot gives the information that at small distances, the angle is preferentially found to be close to 180°. This is a correlation between both quantities, and the standard 1D histograms would not be able to give this information.

TRAVIS combines a multitude of scalar quantities into multi-dimensional histograms, which we call **combined distribution functions** (CDFs) here. By using combinations of distances, angles, dihedral angles, point–plane distances, and point–line distances, thousands of interesting CDFs can be computed from a single trajectory. There are many examples in the literature of CDFs computed with TRAVIS.^{21–26,30,33,37,40–42,46,51–54} It is even supported to combine three scalar quantities into a three-dimensional histogram in TRAVIS. The resulting histogram can be visualized by means of iso-surfaces, as it has been shown in the literature a few times.^{24,33,55}

An even more direct way for understanding correlation between quantities is the so-called **correlation plot** in TRAVIS. Consider Fig. 3 where a CDF with two distances is shown in the left panel. While the horizontal axis depicts the C–O bond length in an acetate anion within a bulk phase simulation,^{23,24} the vertical axis shows the length of the other C–O bond in the same acetate anion. If both quantities would be uncorrelated, the 2D histogram would simply be the Cartesian product of the two 1D bond length histograms. To reveal the correlation, TRAVIS internally creates this Cartesian product of the two histograms and subtracts it from the actual CDF. The resulting correlation plot is shown in the right-hand panel of Fig. 3. Positive values indicate that the probability of finding this configuration is larger (*i.e., positive correlation*) than if the two quantities would be uncorrelated, while negative values depict situations less probable than in the uncorrelated case (*negative correlation*). We find that configurations with one elongated and one shortened bond are more likely than expected from the individual bond length statistics, while cases where both bonds are either long or short are less likely—a clear correlation effect.

### C. Structure factors

A very important experimental technique for condensed phase systems is measuring structure factors, also called $Sq$. The structure factor is related to $gr$ of the system by a Fourier transform,

Despite this easy relation, it is slightly more involved to compute $Sq$ from a simulation trajectory, as all pairs of atoms contribute to it, and therefore, all possible pairwise RDFs of the system have to be computed, correctly weighted by the scattering cross sections of the corresponding element types, and finally summed up. TRAVIS does this automatically and directly outputs the total $Sq$, together with all partial structure factors (*i.e., contribution to the total structure factor from a certain pair of elements*).^{27} Several commonly used normalization factors are available. Furthermore, it is possible to compute individual intramolecular and intermolecular contributions to the structure factor. Neutron scattering cross sections for the most important isotopes^{56} as well as x-ray scattering cross sections (*given by Cromer–Mann coefficients*) are included in TRAVIS, so both neutron and x-ray structure factors can be computed.^{27} Recently, the option to apply a Lorch-type window function^{57} to the structure factor calculation was implemented. Structure factors computed by TRAVIS have been published in the literature several times.^{27,52,55,58–62} TRAVIS can also compute dynamic structure factors $Sq,\omega $, which is discussed in Sec. IV E.

### D. Order parameters

There exists a large variety of so-called order parameters to determine perturbed or ordered structures. TRAVIS only contains a few of them, which are suitable to characterize lipid bi-layers that we have been simulating.^{29} The first one is the **tail vector tilting analysis**. If a lipid bi-layer is in the sol phase, the fatty acid tail vectors are oriented in random directions and constantly fluctuating. However, when the bi-layer undergoes a transition into the gel phase, the tail vectors are all tilted with the same angle into the same direction. By projecting all tail vectors into the X–Y plane (*where the Z direction is the normal vector of the bi-layer*), it can be easily seen if the bi-layer is in the sol or gel state (see Fig. 3 of Ref. 29).

The second order parameter that is implemented in TRAVIS is *S*_{CD}, also called the deuterium order parameter, as it can be directly measured by nuclear magnetic resonance (NMR) experiments of deuterated samples.^{63,64} It is computed by the following equation:

where *k* depicts the position in the alkyl chain, *N* is the number of lipid molecules, *T* is the total number of time steps, $rikC\u2013Ht$ is one of the C–H (or C–D) bond vectors of the *k*th carbon atom of the chain in time step *t*, and **n** is the membrane normal vector. *S*_{CD} can take values between −0.5 and 1, where 0 corresponds to the statistical distribution (*i.e., no ordering*) and −0.5 indicates that the bond vectors are strictly within the membrane plane. An application of this analysis can be found in Fig. 4 of Ref. 29.

Finally, we implemented a function to compute the *gauche* ratio along alkyl chains, i.e., the average fraction of C–C–C–C dihedral angles at a given chain position being close to 60°, which is called the *gauche* configuration. This configuration is energetically less favorable than the *anti* configuration with dihedral angles of around 180° and therefore also statistically less populated. For an application of this analysis and interpretation of the results, see Fig. 5 and Table 2 of Ref. 29.

### E. Solvent orientation

In condensed phase simulations of solutions, it is often very interesting to understand how the solvent molecules are arranged and oriented around the solute. Information on the solvent shells can be gained from RDFs and SDFs, but no details on the orientation of the individual solvent molecules are gained from these analyses. CDFs involving distance and angle can give information on the solvent orientation, but it is still not apparent what the preferred solvent orientation at a specific position around the solute molecule is. To overcome this, we implemented the **plane projection analysis** (PlProj). Consider Fig. 4 where an [EMIm]^{+} cation from a simulation of one ion pair [EMIm][OAc] in water is fixed. The color scale of the plot depicts the average particle density of water at the given position relative to uniform density (*considering only a thin slice of the simulation cell defined by the [EMIm]*^{+} *ring plane*). A value larger than 1 means that water is more frequently found here than at some arbitrary position, while a value below 1 indicates a depletion of water molecules at that point. It can be seen that there exists a well-defined first solvation layer with particle densities above 1, which is separated from the water bulk phase by a clear region of depletion with particle densities below 1. The vectors in the plot depict the average orientation of water dipole vectors at each point. As expected, it can be seen that water forms hydrogen bonds to the three protic ring hydrogen atoms, and therefore, the water dipole vector points away from these atoms on average. Next to the non-polar regions of the molecule, there is no clear preferential orientation of the water molecules, resulting in very short average vectors.

### F. Hydrogen bond network topology

Many molecular liquids form hydrogen bonds, which often have a significant influence on the properties (*just consider water*). If there are several different hydrogen bond donors and acceptors in a complex system, a subtle balance of interactions is found, and even small changes in hydrogen bond donor or acceptor strengths can lead to large effects. Therefore, it is very important to understand the hydrogen bond topology and the competition for donor/acceptor positions in such systems. A simple estimate for the strength of a hydrogen bond is the distance and height of the first maximum in the X⋯H RDF, together with the number integral up to the first minimum, which gives the average number of hydrogen bonds of atom X. Let us consider a system where a cellulose oligomer is dissolved in a mixture of the ionic liquid [EMIm][OAc] and water. There are three hydrogen bond donors (cellulose alcohol H, water H, and [EMIm]^{+} ring protons) and three acceptors (cellulose O, water O, and [OAc]^{−} O). The hydrogen bond estimates from the pairwise RDFs could be written as a matrix of numbers, but this is not very illustrative. Recently, we found a much more appealing way to represent such data, which is based on Sankey diagrams.^{65} The concept is more than 120 years old now, but it is rarely found in computational chemistry and never has been used to visualize hydrogen bond topology to the best of our knowledge. Therefore, despite its early origin, we consider this a “modern” visualization technique. Some common plotting libraries such as MatPlotLib^{66} are able to produce Sankey diagrams, but not the circular variant described below. For a visualization of the hydrogen bond topology of the mixture described above, see Fig. 5. In the left-hand panel, a **linear Sankey diagram** is shown, which contains the hydrogen bond donors on the left-hand side and the acceptors on the right-hand side. The numbers correspond to the average hydrogen bond count of the donor/acceptor. The width of the connection bars is proportional to the hydrogen bond number. Note that the change in bar width from left to right arises from the different molecule counts. In the right-hand panel of Fig. 5, the same data are visualized as a **circular Sankey diagram**. Again, the numbers represent the total hydrogen bond count of the donors/acceptors, and the connection width is proportional to the hydrogen bond number for the two connected groups. We believe that Sankey diagrams are a very illustrative and clear tool to represent such topologies, and we hope to see them in many future computational chemistry papers from now on. As no common plotting program is able to produce both linear and circular Sankey diagrams as shown, TRAVIS directly outputs the images as SVG vector graphics.

For cases where an even more fine-grained analysis is required, we developed another analysis, which reveals the hydrogen bond pattern between several dozens of hydrogen bond donors and acceptors. For example, if it would be of interest which of the 16 different oxygen atoms in the cellulose pentamer forms hydrogen bonds to which acceptors, this analysis will give a shortcut to the desired results. Here, we consider a system with just cellulose and [EMIm][OAc], but no water (see Fig. 6). We call this analysis the **connection matrix analysis** (CMat). The rows of the matrix on the left-hand side correspond to all different hydrogen bond acceptors in the system, while the columns represent the different hydrogen bond donors. For each of the matrix elements, TRAVIS internally computes an RDF and extracts the distance and height of the first maximum. If the RDF does not indicate the presence of a hydrogen bond, the matrix element is simply filled with a black cross. If a hydrogen bond exists, the square is filled with a color according to a two-dimensional color scale that is shown on the right-hand side of Fig. 6. Both the distance and the height of the first RDF maximum are encoded in color. For example, red indicates a hydrogen bond with a very small distance and a very large maximum height (*i.e., a very strong hydrogen bond*). Yellow, on the other hand, corresponds to a hydrogen bond with a large maximum height but large distance, which is therefore not considered very strong. From looking at the matrix, it is immediately visible that the acetate anion forms strong hydrogen bonds to many donors, while many cellulose oxygen atoms are not involved in strong hydrogen bonds. There are two strong intramolecular hydrogen bonds clearly visible in cellulose, namely, O3⋯H25 and O8⋯H31. Obtaining this information just with standard RDF analyses would have required to compute and manually look at many hundred RDFs. We are convinced that the connection matrix analysis greatly simplifies that process.

### G. Voronoi analysis

Despite being more than 110 years old now, the Voronoi tessellation^{67} is still a fascinating mathematical formalism. When applied to a simulation cell with the atoms as Voronoi sites, in simple words, it partitions the box into Voronoi cells belonging to each atom in a way that all points in space that are closer to a particular atom than to any other atom will be assigned to that atom’s Voronoi cell. TRAVIS uses the free Voro++ library^{68,69} (*integrated in the TRAVIS source code so that it does not have to be provided externally*) to perform a periodic radical Voronoi tessellation.^{70,71} From such a tessellation, many interesting properties can be derived. On the one hand, Voronoi cells can be directly visualized by a 3D rendering via Povray.^{10} On the other hand, one can create statistics on the Voronoi cells’ volumes, surface areas, isoperimetric quotients (*measures for the “sphericity” of a cell*), and so on. It is also possible to monitor Voronoi surface coverage, i.e., which percentage of the faces of a molecule’s Voronoi cell is shared with other molecules of a certain type. Both the visualization and the statistics can be performed with TRAVIS, as already demonstrated in the literature.^{37,42,72,73}

Another interesting result from a Voronoi tessellation is a simple and unbiased neighborhood criterion: If the Voronoi cells of two atoms share a common face, these two atoms are said to be neighbors. In contrast to other neighborhood criteria, e.g., based on the first minimum of the RDF, this approach convinces through its great simplicity, as it does not possess any empirical parameters. It is therefore an unbiased criterion and transferable to all kinds of systems. In Fig. 7, the neighborhood probabilities for all pairs of atoms in a bulk phase simulation of [EMIm][OAc]^{23,24} are visualized as a matrix. White corresponds to a neighborhood probability of zero, i.e., these two atoms never share a common Voronoi face, while purple represents a neighborhood probability of one, i.e., the one atom type always has at least one shared Voronoi face with an atom of the other type. The strong hydrogen bonds between acetate oxygen atoms and the protic ring hydrogen atoms can be directly seen, while some interaction between the hydrogen atoms of the aliphatic side chains is also visible. Such a neighborhood matrix captures the full neighborhood relationship of the system in one image and can therefore be considered a fingerprint of the liquid structure.

### H. Domain analysis

To study microheterogeneity and microphase separation in complex liquid systems, we developed the Voronoi-based **domain analysis** (DomA) in 2015.^{28} First, the atoms of the system are categorized into different classes. This can be, e.g., polar, non-polar, side chain, aromatic, fluorinated, etc. Then, a radical Voronoi tessellation of the simulation cell is performed using all atoms of the system as Voronoi sites. The Voronoi cells of atoms from the same class, which are neighbors, i.e., share a common Voronoi face, are merged. This leads to domains of different sizes and shapes. A two-dimensional illustration of this concept is presented in Fig. 8, where the different fill colors represent the different atom classes and the bold black lines correspond to the domain boundaries. While in a perfectly homogeneous system, there will be many small domains, a certain degree of microheterogeneity will lead to a smaller number of larger domains because the atoms from the same class tend to form clusters. From performing this analysis in all trajectory frames and looking at the domain statistics (*average domain count, volume, surface area, isoperimetric quotient, etc.*), one can gain much insight into the microphase behavior of the system. This has already been successfully applied many times in the literature.^{27,60,73–78}

### I. Void analysis

In a solution, the exclusion volume of a solute perturbs the structure of the pure liquid. Obviously, the required amount of energy for this perturbation is decreased if there already exist void cavities in the pure liquid, which are suited to accommodate the solute. Therefore, it can be expected that the size, shape, and mobility of void cavities play a crucial role in many fundamental processes of solvation and mass transport in fluid systems.

With the aim to quantify this effect, we have implemented a Voronoi-based void analysis in TRAVIS, where a set of spheres is grown into the voids between the molecules in a simulation cell (*see* Fig. 9).^{79} In contrast to other algorithms,^{80,81} we enhance this by the definition of a void domain: The spheres are handled as pseudo-atoms and the corresponding Voronoi cells are combined (cf. the domain analysis). With the aid of the isoperimetric quotient and autocorrelation functions, information on the shape and mobility of the void space can be extracted.

### J. Generalized tuple analysis

Most of the particle density histograms discussed above only involve two molecules at a time. Any RDF is based on the distance between two atoms. In CDFs, there can be more atoms involved (e.g., *when observing a distance and an angle, as shown in* Fig. 2), but there are still only two molecules involved at a time. One can easily think of analyses that involve more than two molecules.

For example, let us consider a simulation of a concentrated aqueous sodium chloride solution. We want to investigate how the O–H bond length histogram depends on the distance from the H atom to Cl^{−} ions, but only under the condition that the oxygen atom is not further than 250 pm away from a Na^{+} ion. This analysis involves three particles at a time (one water molecule, one Na^{+}, and one Cl^{−}). A straightforward approach would loop over all possible 3-tuples of these particles, skipping all tuples that do not fulfill the conditions and using the remaining tuples to populate the multi-dimensional histogram. However, this is an inefficient algorithm, as most of these tuples will not fulfill the conditions, and it is slow to compute so many unnecessary distances. A more optimized approach would first define a maximum observation range. All distances beyond this range are neglected. Then, it would utilize a spatial domain decomposition, as explained for the molecule recognition in Sec. II A, using the maximum observation range as minimal cube diameter. Now, for a particular water molecule, only the ions within the central cube and the 26 surrounding cubes need to be considered for forming three-tuples with that water molecule, excluding all the ions that would not fulfill the conditions anyway. This leads to an implementation that essentially scales linearly with system size.

We are currently working on an unified formulation for all such analyses, which we call the **generalized tuple analysis** (GenTup). Based on a central particle, as many particles as required can be involved in the analysis at the same time, i.e., iterating over *n*-tuples of particles for any number *n*, which all have to be located within a maximum observation range around the central particle (*the maximum observation range can be chosen by the user*). It will be possible to use many different condition relations between the particles (*distances, angles, dihedral angles, triangle surface area, tetrahedron volume, etc.*). To the best of our knowledge, such a generalized analysis has not yet been published in the literature. Our implementation in TRAVIS will be available in some months.

## IV. DYNAMICAL ANALYSES

In this section, some dynamical analyses contained in TRAVIS will be shortly discussed. As already described above, dynamical analyses are defined as analyses that are not invariant under re-shuffling of the trajectory frames, i.e., they depend on the ordering of the frames in some way. Most dynamical analyses should not be applied to Monte Carlo simulations, as there exists no physical time in standard Monte Carlo approaches.

### A. Diffusion coefficients

A common quantity when studying dynamics of condensed phase systems are diffusion coefficients. They can be determined both experimentally and via simulation, and are therefore a valuable property for validating simulations. The diffusion coefficient *D* of some molecule can be computed from the mean square displacement (MSD) via the Einstein relation^{83}

where $pt$ is the position vector of a particle at time *t* and *d* represents the dimensionality of the system (*usually d = 3*). TRAVIS can compute MSDs and automatically performs a linear fit to the second half of the obtained function to determine the diffusion coefficient. Some years ago, a fast method for computing MSDs via Fourier transform has been proposed in the literature;^{84} we are currently implementing this in TRAVIS, which will be available in some months. MSDs and diffusion coefficients computed by TRAVIS have appeared in the literature several times.^{22,29,33,40,49,58,77,85–87}

Another approach to diffusion coefficients of simulations is the Green–Kubo diffusion relation,^{88} which relates the integral of the velocity autocorrelation function to the diffusion coefficient. It reads

where $vt$ is the velocity vector of the considered particle at time *t* and *d* again stands for the dimensionality of the system. In addition, the computation of velocity autocorrelation functions is implemented in TRAVIS so that results from both approaches can be compared (*they are typically not identical for finite-length trajectories, and it is unclear which one is “better”*).

### B. Reorientation dynamics

Another very interesting dynamical property is the reorientation dynamics of certain vectors in a simulation, also known as rotational relaxation time. It can directly be determined by certain NMR and electron paramagnetic resonance (EPR) techniques and therefore offers a good opportunity for comparison between experiment and simulation. Based on a trajectory, this quantity can easily be expressed as an integral over the corresponding vector autocorrelation function $C\tau $. For comparison with some experiments, it is desirable to apply a Legendre polynomial *P*_{n} of order *n* to the vector dot product in the correlation, then termed *n*th order correlation function $Cn\tau $ and *n*th order reorientation time *T*_{n}. This leads to the equations

where $ut$ is the chosen vector in the system at time *t*. As the first-order Legendre polynomial *P*_{1} is the identity, the reorientation dynamics for *n* = 1 equals the case without the Legendre polynomial. For typical simulation lengths, the vector autocorrelation function $C\tau $ does not decay to zero so that the integration to obtain *T* cannot be directly performed. To overcome this, TRAVIS automatically fits a poly-exponential function to $C\tau $ via a least-squares procedure (*using a Levenberg–Marquardt minimizer*),^{89} which is then analytically integrated to determine *T* as a result. Vector reorientation times computed with TRAVIS have already been published in the literature.^{20,31,58,90–92}

### C. Aggregation dynamics

In computer simulation, it is often desirable to determine the average lifetimes of certain aggregates (e.g., *ion pairs or hydrogen bond complexes*). It is for sure not a valid approach to simply observe the lifetime of such a complex once in a trajectory—a molecular dynamics simulation is a chaotic system, and by chance, any arbitrary lifetime can be obtained from such an “experiment” with some non-zero probability. A valid approach is to run a simulation until the investigated aggregate has formed and broken sufficiently many times and then use the autocorrelation formalism to determine its lifetime.^{93,94} To do so, first a geometric criterion for aggregation is chosen. This can be a simple distance criterion or a more complex variant. For example, when defining hydrogen bonds via a geometric criterion, it has shown to be effective to define a criterion for both the distance and angle (*see the yellow rectangle in* Fig. 2 *for a reasonable hydrogen bond criterion*). Then, an auxiliary function

is defined, where $\beta ijt=1$ exactly when an aggregate between molecules *i* and *j* at time *t* is intact according to the chosen geometric criterion. From this, the autocorrelation

is computed. The lifetime *T* of the aggregate can then be obtained as twice the total integral of this autocorrelation function

The factor of 2 can be understood from the following simple consideration: Imagine an aggregate that existed only once for a time interval *a*. The corresponding correlation function $c\tau $ is then a piecewise-linear function that runs through the two points $c0=1$ and $ca=0$. The integral of this function will be $a2$. A factor of 2 is required to retain the original lifetime of *a*.

In practice, the correlation function $c\tau $ will often not decay to zero within the simulation time. As explained in Sec. IV B, TRAVIS automatically fits a poly-exponential function to the correlation function via a least-squares procedure (*using a Levenberg–Marquardt minimizer*),^{89} which is integrated analytically to obtain the lifetime of the aggregate. Furthermore, in finite-size simulation cells, the autocorrelation function never decays to zero because any two molecules that formed an aggregate will eventually meet again. To obtain correct lifetimes, it is important to subtract the ensemble average value from the autocorrelation function before performing the exponential fit and the integration. TRAVIS automatically performs this correction.

The above-description corresponds to so-called *intermittent* autocorrelation functions. TRAVIS can also compute *continuous* autocorrelation functions, which are not discussed here. Aggregate lifetimes and autocorrelation functions computed by TRAVIS have been published in the literature several times.^{20,22,25,37,39,40,43,47,77,78}

#### 1. Efficient implementation

Computing aggregate lifetimes as described above would be highly inefficient. Imagine a liquid water force field trajectory with 1000 molecules and 10^{6} trajectory steps. Storing the value of the $\beta ijt$ function for all pairs and all time steps would require 10^{12} memory cells, i.e., 1 TB of RAM if every function value is stored in 1 byte. A much better approach is to keep a list of aggregates. For any aggregate between two molecules, it is only stored which two molecules are involved in that aggregate and from which to which time step the aggregate was intact. Such a list requires only a tiny fraction of the memory described above, and for every pair *i* and *j*, the function $\beta ijt$ can be reconstructed from the list. By this approach, only the data for one pair of *i* and *j* have to be stored in memory at a time—all the resulting autocorrelation functions are summed up anyway.

This is still not an optimal approach. It requires to compute autocorrelations of *i* · *j* different time series. Autocorrelations can be efficiently computed by using fast Fourier transform (FFT) due to the Wiener–Khintchine theorem,^{95} but it still takes some time. An even more efficient method can exploit the fact that $\beta ijt$ can only take the values 0 and 1 and can therefore be written as the sum of rectangular functions. Let

be a rectangular function. The cross-correlation of two such functions $Rabt$ and $Rcdt$ has a simple piecewise-linear form (*a trapezoid*)

with the substitutions

As $\beta ijt$ can be written as a sum of rectangular functions (*by knowing the aggregate lifetime intervals from the list*), the resulting product in the autocorrelation can be expanded (“multiplied out”) to a sum of trapezoid functions as shown above. Then, the autocorrelation function for a pair of *i* and *j* can be directly constructed as a sum of piecewise-linear functions without performing any actual correlation at all. This highly efficient approach is implemented in TRAVIS. To the best of our knowledge, it has not been described in the literature before.

### D. Reactive flux analysis

The lifetimes obtained from the approach described in Sec. IV C are in some cases suffering from a diffusion bias. The reactive flux approach of Luzar and Chandler^{96–98} eliminates this perturbation with the aid of a second operator *H*(*t*). By definition, it becomes unity if a pair of interactants is close enough to form the interaction in question and zero otherwise. This allows the calculation of the function

which gives the probability of finding a pair that was initially interacting at time *t* = 0 as not interacting anymore, but still in a distance at which an interaction would be possible at time *t* = *τ*. Finally, the two functions *c*(*t*) and *n*(*t*) can be fitted on the simple kinetic equation

The inverse of the rate constant of decay *k*_{d} is defined as the average lifetime $\tau lt=kd\u22121$ of the interaction.

The current implementation in TRAVIS^{30,99} enables the analysis for hydrogen bonding—based on a geometrical criterion as visualized in the left part of Fig. 10—as well as for ion pair dynamics based on a next neighbor criterion and a distance, as illustrated in the right part of Fig. 10. The analysis of the interaction dynamics allows a deeper insight into the molecular cosmos of chemical systems,^{100} especially in organocatalysis,^{101} electrolytes,^{102} and metal extractants.^{77}

### E. Van Hove correlation functions

As described above, an RDF is computed by taking the positions of two particles in the same trajectory frame, measuring the distance between the two positions, and adding the result to a histogram. It is interesting to take the positions of the two particles from *different* trajectory frames with a “lag time” of *τ* between them. If this is performed for different lag times and represented as a contour plot with the histogram value on the horizontal axis and the lag time on the vertical axis, the result is called a **van Hove correlation function** (VHCF).^{103} In Fig. 11, a van Hove correlation function for the O⋯H distance in a simulation of liquid water is presented. It can be seen that the standard RDF is reproduced at *τ* = 0, but the features in the RDF quickly are blurred out for an increase in lag time. Van Hove correlation functions are particularly interesting because their two-dimensional Fourier transform is the dynamic structure factor $Sq,\omega $, which can be measured by neutron scattering experiments. This offers another opportunity to directly compare experimental and computational results.

## V. SPECTROSCOPIC ANALYSES

It is known since a long time that vibrational spectra can be computed from molecular dynamics simulations by time correlation functions.^{104} This comes with several advantages over the widely used static–harmonic approach to spectra. Automatic conformer sampling takes place during the MD simulation, the full solvent effect can be included, realistic band shapes are directly obtained, and even some anharmonic effects (*such as temperature-dependent frequency shifts as well as approximate overtones and combination bands*) are covered.^{105} During the last few years, many functions for predicting vibrational spectra have been implemented in TRAVIS. This section will give an overview on these methods and techniques. A detailed step-by-step tutorial on computing vibrational spectra with CP2k^{106} and TRAVIS can be found on the Internet.^{107}

### A. General processing techniques

This section describes some general techniques used in TRAVIS, which are relevant for the calculation of all types of vibrational spectra. While most of these techniques are basic knowledge of signal processing, they are rarely discussed in articles on computer simulations.

#### 1. Fast correlation

All prediction methods for vibrational spectra from molecular dynamics simulations are based on either autocorrelation or cross-correlation functions of some time series along the trajectory. Typically, these correlations are computed for molecular properties, not for the properties of the whole simulation box. The reason is that correlating properties of the whole system is equivalent to taking into account all cross-correlations between different molecules. However, the motion of very distant molecules is not correlated at all so that these cross-correlations mostly introduce noise into the spectrum. Computing all the molecular correlation functions requires much computational power because each such calculation scales with $Om\u22c5n$, where *n* is the length of the time series and *m* is the chosen correlation depth. Fortunately, there exists the Wiener–Khintchine theorem,^{95} which states that computing the autocorrelation $C\tau $ of a time series $ft$ is equivalent to Fourier transforming the time series, taking the absolute square, and transforming back,

As the Fourier transform of a discrete dataset of length *n* can be efficiently computed by the fast Fourier transform (FFT) method that only scales with $On\u22c5log\u2009n$, this saves a lot of computational time. Equation (15) is only a special case of the cross-correlation theorem so that a similar approach can also be used to speed up the calculation of cross-correlation functions via FFT. TRAVIS computes all autocorrelation and cross-correlation functions based on these approaches using the KISS FFT library^{108} (*integrated in the TRAVIS source code*).

#### 2. Window function and zero padding

When considering Fourier transforms of data, it is very important to choose a suitable window function. There exists no Fourier transform without a window function—using no window function means using a rectangular window, which is a very poor choice. Applying a window function simply means element-wise multiplication of the discrete input data with the window function before Fourier transforming. By default, TRAVIS uses the Hann window function^{109} for computing spectra, which is given by

but other window functions (exponential, Gaussian) are also implemented. The total integral of the Fourier transform of a discrete dataset *a*_{i} depends only on the first value *a*_{0}. Therefore, any window function that fulfills the condition *W*_{0} = 1 does not modify the total integral at all. In other words, window functions alter the peak shapes of the spectrum but maintain the integral intensities of the bands.

By adding zeroes to the tail of the correlation function before Fourier transforming, the resolution of the resulting spectrum can be increased, which is called zero padding. No new spectral information is gained by this technique, it is just an interpolation between the data points of the original spectrum, i.e., a cosmetic measure. By default, TRAVIS uses a zero padding factor of 4.

#### 3. Finite difference correction

For most spectra, the time derivative instead of the quantity itself enters the correlation functions. For a cosine function $cos\omega t+\phi $, it can be shown^{105} that the second-order central finite difference derivative is given by

i.e., the exact value of the derivative is multiplied by a *sinus cardinalis* (sinc) function $sin\omega \Delta t\omega \Delta t$. Since the product of two such time derivatives is formed in the correlation functions, the final spectra need to be divided by a factor of $sin\omega \Delta t\omega \Delta t2$ to obtain the correct intensities. This correction is implemented in TRAVIS and active by default. Due to the correction, high-quality spectra with accurate intensities can be obtained even if the dipole moment only was computed every 4.0 fs.

#### 4. Improved sampling via time reversibility

Similar to most other properties, vibrational spectra should be invariant under reversal of the time direction: The trajectory with reversed time samples the same thermodynamic ensemble as the forward trajectory. As vibrational spectra are ensemble averages, they cannot differ for these two cases. Consider the cross-correlations of two time series $at$ and $bt$ along the trajectory. Let $\xe3t\u2254aT\u2212t$ and $b\u0303t\u2254bT\u2212t$ be the two time series of the trajectory with reversed time, where *T* is the total trajectory length. We find that

with the substitution $t\u0303\u2254T\u2212t\u2212\tau $. Thus, the reversal of the time direction is equivalent to swapping the first and second argument of the cross-correlations. If the spectra are invariant under time reversal, the underlying cross-correlation functions should also be invariant. Therefore, swapping the two arguments will yield identical correlation functions for an infinite trajectory length. However, in practice, sampling is not perfect, and the two correlation functions differ due to statistical noise. By simply taking the arithmetic average of both correlation functions (*with forward time and reversed time*) [see Eq. (19)], the sampling is effectively increased by a factor of 2. We call this approach the “commutator trick,”

It only improves sampling if $at$ and $bt$ are different time series, i.e., in the case of true cross-correlations. If *a* and *b* are identical (*i.e., autocorrelation, as with correlating the dipole moment for infrared spectra*), nothing is gained.

Another subtlety arises if one of both properties *a* and *b* is the magnetic dipole moment [*which is required for vibrational circular dichroism (VCD) and Raman optical activity (ROA) spectra; see below*]. This quantity depends linearly on the atomic velocities. A time reversal inverts all atomic velocities and therefore swaps the sign of the magnetic dipole moment. In this case, taking the average of both correlation functions would only yield noise, as both contributions would cancel out. The sign of the correlation function with reversed time has to be changed to obtain the desired effect, which is given in the following equation:

#### 5. Correcting the Verlet frequency shift

Describing the dynamics of a system with finite time steps does unfortunately not leave the vibrational frequencies of harmonic oscillators invariant. If a harmonic oscillator with “true” frequency *ω*_{0} is solved by a Verlet integration scheme^{110} with integrator time step Δ*t*, it can be derived^{105} that the observed frequency $\omega \u0303$ is given by

Fortunately, the effect is rather small. When simulating an oscillator with *ω*_{0} = 3000.0 cm^{−1} and Δ*t* = 0.5 fs, the observed frequency is $\omega \u0303=2990.0$ cm^{−1}, which is a deviation of around 0.3%. When computing spectra from a trajectory, the observed frequencies $\omega \u0303$ are given and the true frequencies *ω*_{0} are sought after. The inverse of Eq. (21) can be easily written as

By using Eq. (22), the frequency axis of the spectra can be corrected for the frequency shift of the Verlet integrator. The correction is implemented in TRAVIS.

### B. Voronoi integration

When vibrational spectra shall be computed from molecular dynamics simulations, molecular electric properties (*e.g., dipole moments*) are required. The standard approach to obtain those is the so-called Wannier localization.^{111,112} A unitary transformation that minimizes the spatial spread is applied to the set of molecular orbitals, yielding localized Wannier orbitals. For each such orbital, the center of mass (*called “Wannier center”*) can be considered an integer point charge. By assigning the Wannier centers to individual molecules, the molecular dipole moment can be computed. In the case of one isolated molecule, this is not even an approximation. However, this approach comes with several limitations. First, it requires a lot of computational time (*for systems in the order of 1000 atoms, easily more than the actual SCF cycle*). Even worse is the fact that the iterative approach to determine the Wannier centers sometimes does not converge at all. Apart from that, there arise problems with aromatic systems. The Wannier localization enforces an alternating single bond–double bond pattern in aromatic systems, which can introduce artificial spectral bands, as we have recently reported.^{113} When computing molecular polarizabilities for Raman spectra of aromatic systems, the numerical noise introduced by the Wannier localization is several orders of magnitude larger than the spectrum itself. Finally, to the best of our knowledge, it is not possible to derive molecular quadrupole moments from Wannier centers; however, these quadrupole moments are required for the computation of ROA spectra.

To overcome these issues when computing molecular properties, we developed the Voronoi integration in 2015.^{113} After a periodic radical Voronoi tessellation^{67,70,71} of the system (*with the atoms as Voronoi sites and atomic van der Waals radii*^{114} *as Voronoi radii*) has been computed by using the Voro++ library,^{68,69} the total electron density $\rho r$, which is supplied on a grid, is directly integrated within the Voronoi cell of a molecule to yield the desired molecular moments (*e.g., electric dipole moment* **p**^{Mol} *and electric quadrupole moment* $QijMol$),

where *q*_{n} and **r**_{n} are the core charge and position of the *n*th atom in the molecule, respectively, and *δ*_{ij} is the Kronecker delta. We have shown that the spectra obtained via Voronoi integration are very similar to those based on Wannier localization, but the problems with aromatic systems are no longer present.^{113} Furthermore, our integration algorithm is very fast (*∼*1 *s per full frame on one CPU core*) so that a lot of computational time is saved when compared to Wannier localization. Finally, also higher molecular multipole moments such as the electric quadrupole moment (*which is required to compute ROA spectra*) can easily be obtained. The required volumetric electron densities along the trajectory can be stored in our highly efficient lossless compression format BQB^{7} (*see above*). Therefore, we propose the Voronoi integration technique as a new standard method for obtaining molecular properties in bulk phase systems.

### C. Power spectra

The Fourier transform of the atomic velocity autocorrelation function yields the power spectrum of a system, also known as the vibrational density of states (VDOS). In contrast to the experimentally accessible vibrational spectra, the power spectrum is not based on selection rules but contains all motions of a system. If the units are chosen correctly, and if atom mass weighting is applied, the integral over a peak in the power spectrum directly yields the temperature of the corresponding degree of freedom in Kelvins. This is an interesting feature to determine if a simulation is well equilibrated: According to the equipartition theorem, all degrees of freedom should possess the same temperature. If the integrals in the power spectrum indicate that this might not be the case, a longer equilibration can be recommended. One example is presented in Fig. 12, where a power spectrum of a system containing 20 different diatomic molecules with harmonic bonds (*black curve*) is shown together with the integrals over the bands (*red curve*). As the integrals are all of similar magnitude, the system seems to be well-equilibrated. Power spectra computed by TRAVIS already appeared many times in the literature.^{22,25,33,37,41,72,115–117}

### D. Bulk phase normal modes

Based on an approach presented by Mathias *et al.*,^{118,119} we implemented a bulk phase normal mode analysis in TRAVIS.^{115} By iteratively diagonalizing the matrix of all atomic velocity cross-correlation functions, the power spectrum of the system is decomposed into normal modes that are maximally localized in the frequency space. To do so, a set of reference structures is required, onto which the molecules in the trajectory are projected. Due to this requirement, the approach works best for rigid molecules and has severe limitations for flexible side chains longer than ethyl groups. In Fig. 13, an exemplary application of the normal mode analysis is shown, where the individual C–H stretching motions of [EMIm]^{+} cations are distinguished in bulk phase simulations of pure [EMIm][OAc] and an [EMIm][OAc]/water mixture.^{115} There also exist other articles in the literature where this feature of TRAVIS has been applied.^{37,120–122}

### E. Infrared and Raman spectra

The infrared spectrum of a system can be computed as the Fourier transform of the molecular dipole autocorrelation function along a simulation trajectory. Soon after the first *ab initio* molecular dynamics (AIMD) simulations were performed, the first predicted infrared spectrum based on AIMD was published in 1997,^{123} and several applications and further developments followed.^{124–126} For Raman spectra, the molecular polarizability needs to be correlated instead. The first predicted Raman spectrum from AIMD appeared in 2002.^{127} It has been demonstrated many times in the literature that these spectra from AIMD simulations agree very well with the experiment. A module for the computation of such spectra was added to TRAVIS in 2013.^{128} While it initially made use of Wannier centers, it is now based on our Voronoi integration method (*see above*).^{113} Infrared and Raman spectra computed by TRAVIS have been published many times in the literature.^{87,122,129–134}

### F. VCD and ROA spectra

The vibrational circular dichroism (VCD) spectrum of a molecular system can be computed via cross-correlating the molecular electric dipole moment $pMolt$ and the molecular magnetic dipole moment $mMolt$. However, the magnetic dipole moment cannot be easily obtained from a Born–Oppenheimer molecular dynamics (BOMD) simulation, as magnetic moments arise from electric currents, and the Born–Oppenheimer approximation does not account for electric currents. One solution is to apply some kind of perturbation theory (e.g., NVPT^{135,136}), which, however, requires much computational resources. Recently, we developed a purely classical approach that reconstructs the electric currents from the difference in total electron density in subsequent steps along the trajectory.^{137} Once the electric current density $jr$ has been obtained on a grid, integration over the molecular Voronoi cells yields the molecular magnetic dipole moments **m**^{Mol} via the classical expression

where **v**_{n} is the velocity of the *n*th atom in the molecule. Based on this approach, which is implemented in TRAVIS, we published the first *ab initio* prediction of a bulk phase VCD spectrum in 2016.^{137}

Computing Raman optical activity (ROA) spectra from molecular dynamics simulations is slightly more involved. The required correlation functions are based on the electric dipole–electric dipole polarizability *α*^{Mol}, the electric quadrupole–electric dipole polarizability *A*^{Mol}, and the magnetic dipole–electric dipole polarizability *G*′^{Mol}. It can be shown^{139} that

with the electric dipole–electric quadrupole polarizability $AMol$ and the electric dipole–magnetic dipole polarizability $G\u2032Mol$ so that all three quantities can be obtained via finite differences of an applied external electric field **E**. The schematic approach to compute the required polarizabilities for ROA spectra is presented in Fig. 14. We implemented this approach in TRAVIS and were able to present the first *ab initio* prediction of a bulk phase ROA spectrum in the literature^{138} in 2017, which shows a very good agreement with the experiment^{140} (see Fig. 15).

### G. Resonance Raman spectra

Resonance Raman spectra can be computed via temporal correlation of the dynamic polarizability tensor along a trajectory. One approach to obtain this tensor is based on real-time time-dependent density functional theory (RT-TDDFT).^{141} It has been applied several times in the literature,^{142,143} but only to static calculations of isolated molecules in vacuum. In 2019, we implemented this method in TRAVIS and applied it to BOMD trajectories in order to obtain the resonance Raman spectrum.^{144} In contrast to the existing methods, our implementation works for bulk phase systems and therefore is able to take into account full solvent influence as well as some anharmonic effects. Apart from that, our approach is not limited to generalized gradient approximation (GGA) DFT but works with all real time electron structure methods, including, e.g., DFT with hybrid functionals. In our article, we presented the first *ab initio* prediction of a bulk phase resonance Raman spectrum in the literature on the example of uracil in water. We have also shown that the solvent effect is very important in this case, as only the bulk phase spectrum with full solvent influence agrees well with the experiment.

Our newly developed method for predicting resonance Raman spectra does not require a certain laser wavelength as input but yields the full set of resonance Raman spectra for all possible laser wavelengths in one pass. The spectra in dependence on the laser wavelength can be visualized as a contour plot, as shown in Fig. 16 on the example of uracil in water. This feature is not only helpful to understand the coupling between vibrational modes and electronic excitations in the system investigated but it is also beneficial in the design of new interesting experiments by picking laser wavelengths at which interesting resonance effects can be expected.

## VI. CONCLUSION

In this article, we have presented some methods and analyses that have recently been introduced to the TRAVIS program package while also giving a glimpse of some of the underlying algorithms that are essential for efficiency and accuracy of these analyses. While some functions in TRAVIS are known since decades, others are very recent and have been implemented directly after having been developed. For example, TRAVIS has been used to compute the first *ab initio* predictions in the literature of bulk vibrational circular dichroism (VCD) spectra,^{137} bulk phase Raman optical activity (ROA) spectra,^{138} and bulk phase resonance Raman spectra^{144} within the last years. Since the release in 2011, the original TRAVIS article^{1} has already been cited more than 400 times now, and more than 800 different co-authors contributed to these studies. As the authors of TRAVIS, we promise that we will continue to include new methods and features into the code while at the same time maintaining old and existing functions and improving the quality of the documentation.

## ACKNOWLEDGMENTS

We would like to thank all the TRAVIS users around the world for their trust (*and sometimes, unfortunately, also patience*). M.B. acknowledges financial support from the DFG through Project No. Br 5494/1-1.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

## REFERENCES

_{2}solvation in choline chloride/glycerol deep eutectic solvent

**)**,

_{2}]Cl

_{2}

_{2}-reactive aprotic heterocyclic anion ionic liquids from X-ray scattering and molecular dynamics

_{2}solvation in the 1-ethyl-3-methylimidazolium thiocyanate ionic liquid by incorporation into the extended cation–anion network

**)**,

_{2}so small when dissolved in a room temperature ionic liquid? Structure and dynamics of CO

_{2}dissolved in [BMIm

^{+}][PF

_{6}

^{−}]

_{3}(C

_{2}F

_{5})

_{3}]

^{−}vs. [PF

_{6}]

^{−}, flexible anion vs. rigid, and scaled charge vs. unit on the transport properties of [BMIm]

^{+}-based ionic liquids: A comparative MD study

**)**,

_{y}XO

_{4}(X = S, P, Se, y = 1, 2) from ab initio molecular dynamics simulations

_{3}]Cl

_{2}, a case study