Energy landscapes and the closely related cost function landscapes have been recognized in science, mathematics, and various other fields such as economics as being highly useful paradigms and tools for the description and analysis of the properties of many systems, ranging from glasses, proteins, and abstract global optimization problems to business models. A multitude of algorithms for the exploration and exploitation of such landscapes have been developed over the past five decades in the various fields of application, where many re-inventions but also much cross-fertilization have occurred. Twenty-five years ago, the desire to increase the fruitful interactions between workers in different fields led to the creation of workshops and small conferences dedicated to the study of energy landscapes in general instead of only focusing on specific applications. In this perspective, I will present some history of the development of energy landscape studies and try to provide an outlook on the directions in which the field might evolve in the future and on the larger challenges that lie ahead, both from a conceptual and a practical point of view, with the main focus on applications of energy landscapes in chemistry and physics.
I. INTRODUCTION
For a long time, there has been a conscious or unconscious division of the world into the rational, “hard” natural sciences on the one hand and the human, “soft” sciences of life and society on the other hand, and it has been suggested that the natural sciences and the humanities are qualitatively distinct entities.1 However, energy landscapes and the closely related cost function landscapes have been recognized in science, mathematics, and various other fields such as economics or sociology as being highly useful paradigms and tools for the description and analysis of the properties of many systems, ranging from glasses, proteins, and abstract global optimization problems to business models and the evolution of life and society.2 There are many parallels one can draw between the supposedly “hard” sciences and the “soft” humanities. This begins with the foundations of these fields as rational enterprises: the foundations of the natural sciences are the laws of physics, the tools of mathematics, and the substrate of materials, from which complexity emerges in the form of whole fields of science such as chemistry, biology, or all fields of engineering. Considering the science of human activities, we find as the foundation the rules of evolution, the tools of psychology, and the substrate of living entities, from which emerge the fields of sociology, economics, law, and society as such. The most obvious bridges between these “hard” and “soft” sciences are the fields of biology and medicine, where aspects of natural laws and the behavior of humans or, more generally, living beings merge.
Considering these fields in a very abstract fashion, we note that in physics, stable states often look like, or can be characterized by, the extremum of some function, e.g., the energy of the system. Quite generally, many dynamic, static, or stationary states, or processes, of a system can be described by a point, set of points, or trajectory on such an energy landscape,2 and it behooves us to ask whether similar landscapes exist or can be constructed in other fields such as evolution, economics, psychology, etc. In fact, the most direct analog is the cost function landscape of economics, where each strategy a business might pursue corresponds to a point on such a landscape with a well-defined cost associated with the strategy.3,4 Alternatively, we can consider the fitness landscape, where each individual state of the system is assigned a fitness value. Here, fitness can range from the evolutionary fitness of a biological species, where a point corresponds to the specific properties of the species and fitness is a measure of its likelihood to survive in a given external environment,5 to the fitness of states and societies, political parties, architectural design,6 or even memes (advertisement!) or ideologies as such. Similarly, the cost function could be the total value assigned to the society at a given moment for a given situation by all the individuals constituting the society, be it wealth, pride, or, more generally, the happiness of the citizens in the state7 at a given moment or averaged over some time interval, or the cost associated with a multi-year enterprise or business plan.
Trying to develop this analogy further, we note that the stable states in physics would correspond to stable societies in the world of humanity or, more generally, to stable ecosystems. The latter refers not only to living communities such as a forest or an isolated island but also to stable economic systems such as classical agricultural economies or modern consumer-driven macro- or micro-economic models, such as planned economies or market economies guided by the so-called invisible hand of the market.
Since perfectly stable situations are unlikely to exist in the open systems we are usually dealing with in practice, we note that the dynamics and the time evolution of systems in the natural sciences can often be visualized by the movement of a walker, or an ensemble of walkers, on the energy landscape of the system. Here, the energy serves as the driving potential. This means that differences in energy between neighboring states, or gradients for continuous energy landscapes, correspond to forces in the world of physics and chemistry, which can either be studied on the atomic level or be incorporated and realized in the form of phenomenological laws to describe the dynamics or evolution of a system. Furthermore, we can take the step from the microstates of a system, such as individual configurations of atoms in a molecule, to macrostates that encompass large stable regions of the energy landscape and correspond to a long-lived conformation of the molecule under consideration. On the time scale for which these macrostates are appropriate descriptions of the system, we can again define a landscape on the macrostate level, but now the cost associated with the macrostate would no longer be the (potential) energy of a single atom-level configuration but the local free energy of the stable conformation constituting the macrostate.
Clearly, in the world of living beings, this corresponds to the time evolution of biological species, ecosystems, businesses, clubs, societies, cultural memes, etc. Again, we can visualize the time evolution of the business or society as a movement on this cost, fitness, or value function landscape, where cost/fitness/value differences result in forces that drive the changes in an ecosystem or a society. The move from a microstate-based to a macrostate-based picture can be seen, e.g., in the step from a microstate such as the precise description of the political state of a society in terms of the millions of individuals constituting the society and their individual political preferences, to a metastable macrostate, corresponding to the actual realization of a state possibly organized according to some general (philosophical or social) principles such as democracy, oligarchy, dictatorship, etc. In that case, the cost function would no longer be, e.g., the instantaneous amount of “happiness” or of net profit, but would involve the accumulation of the instantaneous cost over several years or a whole generation, possibly weighted via some control parameter analogous to a temperature. In addition, just as we can often derive approximate phenomenological laws for the evolution of macrostates in physics or chemistry, we can try to formulate general rules for processes in business, society, and the living world in general.
Examples of “energy” landscapes in the natural sciences abound,2 especially as functions of atom configurations: the potential energy or potential enthalpy of a chemical or physical system, or the free energy applied to metastable states that are represented by locally ergodic regions of the landscape. However, we also find cost function landscapes in the natural sciences, where the cost is often the difference between calculated and measured data; such problems can be summarized as so-called inverse problems.8 Other cost functions can be, e.g., the output of a synthesis process, where the optimization proceeds via the choice of synthesis parameters in an appropriately formulated optimal control problem.9,10 A large number of prototypical cost function landscapes are found in a variety of practical and abstract optimization problems, such as the traveling salesman problem,11 the packing problem,12 graph-partitioning problems,13 etc. Of course, one should not forget the field of engineering, where cost function issues abound, ranging from minimal waste in production processes to optimal control problems in robotics.14 This is paralleled by a plethora of cost or fitness landscapes in the world of living beings: evolutionary biological fitness, cost functions of business plans, and budgets of various entities in society, ranging from individuals through various institutions to the whole society or state. However, there is also the social fitness of individuals within a given social environment or social group, the social fitness of different groups in competition, or the biological fitness of groups. Another important type of landscape would be the “migration landscape,” where the state space encompasses all possible distributions of people or biological organisms over some geographical region or even the whole earth. Here, we do not refer to a geographical landscape but to a cost function landscape, which allows us to describe the movement of people on earth or within a region in response to external and internal driving forces and which is related to the objective or cost functions people or biological organisms employ to consciously or subconsciously optimize their lives. Of course, the presence of physical barriers such as mountain ranges, large rivers, deserts, or oceans will influence this movement by increasing the cost associated with a move from one (low cost = “metastable local equilibrium”) distribution of people, or organisms in general, to another such local equilibrium distribution.
In the following, I will first give a limited overview of the history of the field of energy landscapes, where the focus will be on landscapes found and employed in physics and chemistry, but with some side glances at other fields. The reason is that many techniques and methods that have been developed for studying energy or cost function landscapes in one field can easily be transferred to, or applied in, other fields. Similarly, being familiar with, or at least aware of, other kinds of landscapes than those one encounters in one’s own specific research problems will open up new perspectives on one’s own area of research. Conversely, the tools one has developed for the study of landscapes in one’s own research projects could shed new light on, and be very useful in, seemingly unconnected fields of science, all the way to the world of human behavior and enterprise. Therefore, recalling the history of the field leads in a natural way to the present and future of energy landscapes and their applications, providing an outlook and perspective on possible research enterprises, both within individual disciplines and of an interdisciplinary character.
II. PRELIMINARIES
A. Definition and representation of landscapes
To begin with, we mention a couple of basic definitions and features of energy landscapes.2 An energy or cost function landscape consists of three elements: a state space, a neighborhood relation or moveclass in state space, and a cost or energy function. The state space is the set of all microstates of the system under investigation, e.g., all possible atom arrangements of N atoms in three-dimensional space, all possible atom arrangements of N atoms inside a periodically repeated cell (a periodic approximant of an infinite crystal) together with the cell parameters, or all possible routes among N cities that a traveling salesman will visit. Here, we note that, at first sight, the state space can be continuous or discrete; e.g., the set of all classical atom configurations is equivalent to R^{3N}, while the set of all combinations of spin orientations of a system of N localized spin-1/2 particles can be visualized as the 2^N corners of a cube in N dimensions if neighboring microstates differ by a single spin flip. In the case of a continuous n-dimensional state space where the energy or cost is a (differentiable) function of n real variables, one can usually visualize the energy landscape as a hypersurface over an n-dimensional subset of R^n, exhibiting local minima, maxima, and saddle points and well-defined gradients and higher derivatives. In contrast, for a discrete state space, the landscape is no longer continuous but possesses a graph-like structure once one has decided which microstates are “neighbors” of each other. As a consequence, one frequently calls such a landscape a metagraph8 in order to stress the discreteness of the landscape.
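To make these three elements concrete, consider the traveling salesman problem mentioned above. The following minimal sketch (in Python; all names are illustrative and not taken from the cited literature) encodes a microstate as a tour, the cost function as the tour length, and the moveclass as the exchange of two cities:

```python
import math
import random

# Minimal sketch of the three landscape elements for a traveling salesman
# problem: state space (permutations of the cities), cost function (tour
# length), and moveclass (swap of two cities). All names are illustrative.
cities = [(random.random(), random.random()) for _ in range(10)]

def cost(tour):
    """Cost function: total length of the closed tour."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def neighbor(tour):
    """Moveclass: a neighboring microstate differs by one swap of two cities."""
    i, j = random.sample(range(len(tour)), 2)
    new = list(tour)
    new[i], new[j] = new[j], new[i]
    return new

tour = list(range(len(cities)))   # one microstate of the discrete state space
print(cost(tour), cost(neighbor(tour)))
```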
However, the actual topology is determined by the type of dynamics and neighborhood we define for the walker on the landscape. Clearly, there exists a great variety in the shape of such landscapes, of which the ones shown in Figs. 1–4 represent only a small selection from the multitude of possible types. The neighborhood relation can, e.g., be inspired by physics or chemistry, i.e., we define two atom configurations to be neighbors if they are separated by only an infinitesimal distance in the atom positions or in the other parameters characterizing the microstates. Alternatively, the definition of the neighborhood can be driven by the algorithm we employ to explore the energy landscape, where the neighborhood relation would correspond to the so-called moveclass of the algorithm. In the latter case, the exploration moves can include (non-physical) large jumps such as exchanges of far-away atoms or the randomization of the atom positions inside a small cluster of a few neighboring atoms. In particular, such large jumps are often very efficient if the goal of the exploration is the identification of the global minimum of the landscape. However, while the global minimum is always defined, local minima on the physical energy landscape (with only infinitesimally close microstates being defined as neighbors) that in reality might correspond to metastable modifications can be destroyed by such a jump-move-type moveclass. Finally, the cost function of interest can be, e.g., the length of a traveling salesman’s tour, the (potential) energy of an atom configuration, or the potential enthalpy if we consider the system at non-zero pressures.
Quantities of interest that are commonly studied and searched for on an energy or cost function landscape are special points, such as local minima, maxima, or saddle points, which can be extended to special regions, such as locally ergodic regions,15,16 where the system can be in isolated local thermal equilibrium on given time scales, or transition regions. To study the dynamics, trajectories on the landscape—constructed according to physical laws like Newton’s equations or derived from phenomenological laws such as “laws of supply and demand” in economics—are simulated and explored. This includes both deterministic dynamics and stochastic evolutions, where the latter are reflected in the flows of probability on the landscape (visualized as many walkers on the landscape at the same time). From the analysis of the special points and the regions immediately associated with them, the network of their connections via saddle points, and the probability flows between them, one can deduce the generalized barrier structure that includes not only energy barriers but also kinetic and entropic barriers (cf. Fig. 5).17,18 For the locally ergodic regions, we can compute local free energies by employing local densities of states restricted to such regions. This also allows us to move beyond the dynamics at the level of microstates to the dynamics at the level of macrostates on larger time scales by constructing master equations from the measured probability flows and local densities of states.19
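For reference, such master equations are commonly assumed to take the standard form (notation illustrative; the specific equation is not displayed in this perspective)

\frac{dp_i(t)}{dt} = \sum_{j \neq i} \left[ W_{ij}\, p_j(t) - W_{ji}\, p_i(t) \right],

where p_i(t) is the occupation probability of macrostate i and W_{ij} is the rate for transitions from macrostate j to macrostate i, as estimated from the measured probability flows and local densities of states.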
Finally, we note that the high dimensionality of the continuous state space of a cost function landscape, or the gigantic number of microstates in the case of a discrete state space, usually makes visualization rather difficult. Therefore, various schemes of principal (i.e., most meaningful) coordinate analysis have been employed;21,22 a modern reincarnation is the self-selecting pattern recognition of machine learning algorithms for the identification of highly relevant generalized coordinates.23 Similarly, order parameters and reaction coordinates have been introduced to allow a depiction of the most illustrative (low-dimensional) cuts through the landscape. Another class of visualizations is graph representations, especially via tree graphs that allow us to represent the energies at which low-energy states of the system can be connected (cf. Fig. 6),24–29 or more general graphs that depict the minima-saddle-point network of the landscape or the probability flows on the landscape as a function of energy. One should note that the choice of a simplified representation can strongly influence our “intuitive” understanding of the dynamics of the system under investigation, possibly creating problematic preconceptions about the system’s expected behavior in the mind of the researcher.
B. Illustrative model system
To illustrate these concepts, we briefly discuss a system of three Ising spins (oriented along the z axis), where the energy landscape is a discrete metagraph. We note that not only the actual energy function but also the connectivity of the state space influences the shape of the metagraph and the relaxation processes occurring on the landscape. Figure 7 presents the system definition in the first row: three spins on a triangle with an attractive spin–spin interaction, plus a magnetic field parallel to the z axis. The complete state space consists of 8 = 2^3 possible orientations of the three spins, labeled as state 1 (all spins pointing up), states 2–4 (two up spins and one down spin), states 5–7 (one up spin and two down spins), and state 8 (three down spins). The second row depicts three moveclasses: (1) is the intuitive single-spin-flip moveclass (resulting in a cube-like metagraph), (2) adds a complete reversal of the system corresponding to a rotation of the triangle by 180° in the z-direction, and (3) is some generic moveclass, e.g., reflecting the specific interaction Hamiltonian employed to describe the allowed changes in the state space of the system. Furthermore, for the chosen magnetic field, we obtain energies of −3J (state 1), +3J (states 2, 3, 4, and state 8), and +5J (states 5, 6, and 7) for the eight microstates of the system.
In the left half of the third row, two tree graph representations of the energy landscape are shown for the case in which we employ moveclass (3). In the first case, the so-called single-lump tree graph, only the two local minima (states 1 and 8 with energies E = −3J and E = +3J, respectively) are shown, together with a “lumped” state at the energy level E = +5J, which indicates that the (possibly very complicated) connecting path with the lowest energy requires the walker to ascend to at least the energy of 5J (corresponding to some generalized “saddle” or transition region). The second tree graph includes the local densities of states (multi-lump tree graph), where for each occupied energy level above the minima, the number (or density) of states that can be accessed from the respective local minimum is indicated (here: the lump at energy +3J above the state 1 minimum contains the microstates 2–4, while for the transition region, the lump represents the states 5–7). Note that due to the complicated moveclass (3) describing the connectivity among the microstates, the actual path between the two minima is subject to some restrictions [compared to the simpler moveclass (1)], i.e., only the path through microstates 2 and 7 connects the two minima, while the states 3 and 4 above the state 1 minimum and similarly the states 5 and 6 above the state 8 minimum correspond to kinetic or entropic traps. We also note that if we were considering the landscape with moveclass (2), there would be only one minimum of energy for this landscape (at state 1) since there would exist a direct connection between states 1 and 8, thus eliminating state 8 as a local minimum.
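The effect of the moveclass on the set of local minima can be checked directly. The following minimal sketch (in Python, taking the energies quoted above as given data rather than deriving them from the Hamiltonian of Fig. 7) enumerates the eight microstates and identifies the local minima under moveclass (1) and under moveclass (2):

```python
import itertools

J = 1.0  # energy unit

# The 2^3 = 8 microstates of the three-spin system; +1 = up, -1 = down.
states = list(itertools.product([+1, -1], repeat=3))

def energy(s):
    """Energies as quoted in the text, keyed by the number of down spins:
    state 1 (all up): -3J; states 2-4: +3J; states 5-7: +5J; state 8: +3J."""
    return {0: -3 * J, 1: +3 * J, 2: +5 * J, 3: +3 * J}[s.count(-1)]

def neighbors(s, moveclass):
    nbrs = [tuple(-sj if j == k else sj for j, sj in enumerate(s))
            for k in range(3)]                # moveclass (1): single spin flips
    if moveclass == 2:
        nbrs.append(tuple(-sj for sj in s))   # moveclass (2): add complete reversal
    return nbrs

def local_minima(moveclass):
    return [s for s in states
            if all(energy(s) < energy(n) for n in neighbors(s, moveclass))]

print(local_minima(1))  # two minima: (1,1,1) = state 1 and (-1,-1,-1) = state 8
print(local_minima(2))  # only (1,1,1): the reversal connects state 8 to state 1
```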
Finally, in the right half of the third row, two equilibration trees are shown where the sequence of equilibrations between the states (or the minima) is depicted, again for the landscape with moveclass (3); note that for complex landscapes, this equilibration takes place on multiple exponentially increasing time scales and, therefore, the y axis shows the logarithm of the equilibration times. The first equilibration takes place between the high-lying minimum state 8 and the two trap states 5 and 6, followed by an equilibration between state 7 and the already equilibrated set (8,5,6). Next, state 1 and the trap states 3 and 4 equilibrate, until finally the whole two minima system is equilibrated. If one only focuses on the two minima, then the equilibration tree is rather trivial: the two minima are equilibrated at the time when all the states have been equilibrated.
C. Common types of tree graphs
Next, in Fig. 8, some frequently encountered types of tree graphs are shown, which exhibit characteristic features of energy landscapes. Graph (1) exhibits a landscape dominated by the main trunk of the tree, from which all side-minima can be reached directly, without any intermediate saddle regions at which the path to the minimum could deviate into another nearby minimum. Such a tree was observed, e.g., for the traveling salesman problem,26 even though the equilibration tree exhibited many self-similar subsets of states. Similar trees are also found in many cluster landscapes, e.g., for Cu4Ag4.31 In fact, the cluster landscapes often resemble the type of tree represented in graph (4), where many local minima exist, but they are quite deep once they have split from the main trunk of the tree.
Graph (2) corresponds to the structure of a landscape with many low-energy minima, which separate rather late on the downhill path on the landscape. Such landscapes are observed for systems consisting of short polymer molecules, where the individual minima correspond to locally frozen-in structures of these molecules, which are all quite similar as far as the energy and the generic structure are concerned; an example might be the landscape of short lattice polymers in two dimensions.32
Graph (3) shows a (self-similar) hierarchical tree; this type of tree structure has been employed to construct ultrametric trees,33 for which exact solutions for random walks can be found. Furthermore, it has been suggested as a prototype for landscapes of glassy systems;34 indications for such self-similar features have been observed for amorphous networks on a lattice.35 However, the dynamics of such systems, including the glass transition, is often controlled by entropic barriers associated with the near-exponential growth of the density of states (DOS) and minima in the low-energy region.36,37 Such a growth in the number of states as a function of energy is not necessarily implied by a hierarchical tree structure, due to the lack of information on how many microstates are associated with each node and how many microstate-microstate connections are included in an edge of the graph. Next, graph (5) sketches a landscape where the number of minima grows approximately exponentially with the energy, exhibiting a partly self-similar structure. Again, this kind of structure might appear in glassy systems.
Graph (6) is also quite generic, in the sense that most of the branches of the tree correspond to small basins, each containing several distinct local minima. Besides being quite generic, such trees are often found in mid- and large-size molecules,25,38 where individual basins correspond to the different bond networks that underlie the various classes of isomers of the system while still allowing some additional structural differentiation inside the basin through small changes in the bond network, i.e., the existence of closely related isomers and/or spatial distortions of the molecule. Finally, graph (7) depicts a tree graph where the local minima are separated from the main trunk of the tree by rather small energy barriers compared to the depth of the landscape, in contrast to the graph (4) example. This might correspond to a landscape with a funnel-type structure, where the system can rather easily escape a multitude of traps and relatively quickly reach the bottom of the landscape; this kind of landscape structure containing a funnel has been proposed for those proteins that are classified as fast folders.39 Further examples of frequently observed types of tree graphs are discussed in, e.g., Ref. 29.
Of course, the speed of the relaxation or folding process would greatly depend on the entropic barriers, which are not provided in this simple energy-based tree graph; examples of graphs including such barriers can be found in Ref. 40. Clearly, these tree graphs can only represent a small sample of the plethora of structures exhibited by complex energy landscapes and do not cover all possible types of landscapes. In general, we note that most energy landscapes will consist of a mixture of elements taken from these graphs. In particular, when one includes both distorted structures and all possible isomers as part of the state space of a given molecule, one often observes trees where many well-separated minima basins can exist.
Furthermore, tree graphs, whether of the single-lump or multi-lump kind, are in many cases only a first, though very useful, approximation of the landscape; especially the lack of information about entropic or kinetic barriers is problematic when trying to model or study the dynamics of the landscape.
D. Exploration algorithms
Exploration algorithms on landscapes can be categorized according to the properties of the landscape that are being explored: global optimizations, i.e., searches for the global minimum and, beyond that, for (all) local minima that are physically or chemically meaningful (or relevant from a biological evolution, business, or society point of view); barrier explorations, i.e., searches for saddle points and measurements of probability flows; and measurements of densities of states and densities of local minima as a function of energy, both locally and globally. A plethora of such algorithms have been developed for these purposes, with the majority designed for the purpose of identifying global and local minima. In spite of their enormous number, most of these algorithms consist of clever combinations of a few basic elements or elementary moves, followed in complexity by meta-type algorithms. The latter recombine basic search algorithms into more complex versions, often also including information acquired at an earlier stage of running the algorithm and/or proceeding by running many such algorithms in parallel, thus employing an interacting or non-interacting ensemble of walkers on the landscape.
The elementary step of essentially all exploration algorithms is the move from a current (micro)state to the next one—usually including the computation of the energy of the new state—where such a move can be split into the selection of the target state and the acceptance of this selected state. In the case of a deterministic algorithm, the selection follows from some simple rule, possibly including information about the trajectory up to the current state and/or knowledge of the gradient of the landscape at the current microstate, that allows a direct computation of the target state and usually automatically implies acceptance of the selected state. To this group belong molecular dynamics (MD) simulations and systematic (ideally exhaustive) explorations of the landscape, where in the former case the current velocity of the walker and in the latter case the list of already explored microstates take the history of the trajectory into account. However, if we perform constant-temperature MD simulations, the way we introduce the temperature may include some randomness in the process and, therefore, strictly speaking, such a simulation is no longer fully deterministic.
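As an illustration of such a deterministic move, the following sketch (in Python with NumPy; a generic velocity Verlet integrator with unit masses assumed for simplicity, not one of the specific MD codes cited in this perspective) propagates a walker according to Newton's equations, with the current velocity carrying the history of the trajectory:

```python
import numpy as np

def velocity_verlet(x, v, force, dt, n_steps):
    """Deterministic walker: integrate Newton's equations with the
    velocity Verlet scheme (unit masses assumed for simplicity)."""
    f = force(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * f * dt**2   # position update
        f_new = force(x)
        v = v + 0.5 * (f + f_new) * dt     # velocity update with averaged force
        f = f_new
    return x, v

# usage: a walker in a one-dimensional double well, E(x) = (x^2 - 1)^2
force = lambda x: -4.0 * x * (x**2 - 1.0)   # force = -dE/dx
x, v = velocity_verlet(np.array([0.1]), np.array([0.0]), force, 1e-3, 10_000)
```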
In contrast, stochastic exploration algorithms select a target state at random according to the given moveclass, i.e., from among the states in the neighborhood around the current microstate, where the choice of the actual move can depend on a probability distribution over the set of neighbor states. In this case, some deterministic or stochastic acceptance criterion is applied in order to move the walker from the current state to the target state. Perhaps the most well-known such criterion is the Metropolis criterion,41 where the move is accepted with the probability p_acc = min(1, exp(−(E_target − E_current)/C)). Here, C is a control parameter for the random walk, which for systems in chemistry and physics commonly equals C = k_B T, such that the landscape is sampled according to the Boltzmann distribution, and many algorithms have been based on designing optimal schedules for the control parameter with the goal of, e.g., efficiently determining the global minimum.42 An example of a deterministic acceptance criterion for such a stochastic walk would be to accept every proposed move or to accept only moves to a lower energy, corresponding to a random Metropolis walk at infinite temperature or zero temperature, respectively. The latter case is often called a stochastic quench; in contrast, a molecular dynamics simulation at zero temperature, where we follow the downhill gradient while instantaneously eliminating the gain in kinetic energy, would be denoted as a deterministic quench or gradient-based local minimization.
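A minimal sketch of one such stochastic move (in Python; the energy and neighbor-proposing functions are assumed to be supplied by the user) reads:

```python
import math
import random

def metropolis_step(state, energy, propose, C):
    """One stochastic move: select a target from the neighborhood and accept
    it with probability min(1, exp(-(E_target - E_current)/C))."""
    target = propose(state)                    # selection via the moveclass
    dE = energy(target) - energy(state)
    if dE <= 0 or random.random() < math.exp(-dE / C):
        return target                          # move accepted
    return state                               # move rejected; walker stays

# Limiting cases: C -> infinity accepts every proposed move, while C -> 0
# accepts only downhill moves, i.e., a stochastic quench.
```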
These elementary steps for a single walker can be refined by exploiting additional local information, such as second derivatives, etc., to speed up or optimize the exploration, or by employing information about which microstates have already been visited to restrict the moveclass when selecting the next target state. On a higher level of the algorithm, a meta-move can be designed where, e.g., each such move consists of picking a target state followed by a local minimization before the acceptance criterion is applied. In such a case, the criterion would, e.g., compare the energies of the minima accessible from the current state and those from the target state in order to decide whether to accept the move.43,44 Alternatively, one could, e.g., perform a short random (Metropolis) walk in the neighborhood of the target state before deciding on acceptance;45 clearly, the number of possibilities for designing complex moves is only limited by the imagination of the researcher. Similarly, when employing an ensemble of N walkers, the current state consists of the set of N microstates where the walkers reside, and the target state is an ensemble of N′ proposed microstates; here, the number of microstates in the target state might be equal to or larger than the number of microstates in the current state, N′ ≥ N. We note that, for N′ = N, we can visualize the state space of the ensemble as the N-fold product of the state space of an individual walker, creating some kind of meta-landscape in the process, where, e.g., the cost associated with such a state on the meta-landscape might be the average energy of the ensemble of walkers or the lowest energy of all the walkers. Furthermore, when generating the target ensemble, the moveclass can involve information from all microstates belonging to the current ensemble. In this fashion, we can ensure that the walkers stay far away from each other during their explorations,46 or we can pick as targets microstates that can be considered “mixtures” of two or more microstates in the current ensemble, e.g., the so-called cross-over move employed in genetic algorithms.47
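As one sketch of such a meta-move, in the spirit of the minima-comparing acceptance criterion just described (often referred to as basin hopping; perturb and local_minimize are assumed user-supplied), one could write:

```python
import math
import random

def minima_comparing_move(x_min, energy, perturb, local_minimize, C):
    """Meta-move: jump to a perturbed target state, quench it to the nearest
    local minimum, and apply the acceptance criterion to the two minima."""
    x_new = local_minimize(perturb(x_min))     # target state -> its local minimum
    dE = energy(x_new) - energy(x_min)
    if dE <= 0 or random.random() < math.exp(-dE / C):
        return x_new
    return x_min
```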
In the same fashion, the determination of saddle points can involve deterministic or stochastic search trajectories, where deterministic approaches tend to be more common. Sampling the local and global densities of states of the landscape can be performed via single-walker or multi-walker trajectories, where the walkers can experience the same or different temperatures and pressures.48 Again, complex acceptance criteria can be designed, and information about the history of the trajectories can be taken into account in order to achieve high sampling efficiency.
Here, one should stress the importance of time scales when studying complex energy or cost function landscapes: For realistic landscapes describing physical or chemical systems, the “natural” dynamics, i.e., the ones driven by the laws of physics, are often so slow that the system resides for macroscopic times in sub-regions R of the landscape. In that case, we need to distinguish between (a) the time scale of equilibration, τ_eq(R), (b) the time scale of escape, τ_esc(R), and (c) the observational time scale, t_obs. Here, τ_eq(R) corresponds to the time scale on which the system can explore the sub-region R thoroughly enough such that the time averages over measurements of some relevant or characteristic quantity equal the statistical averages of this quantity according to the Boltzmann distribution restricted to this region, i.e., the system is locally (thermally) equilibrated on this time scale in region R and, therefore, also for all times beyond τ_eq(R). The time scale of escape, τ_esc(R), denotes the time on which the system will leave region R with a high probability; as a consequence, the likelihood of the system to stay inside region R decreases for times larger than τ_esc(R), even though it would still be in local equilibrium as far as R is concerned as long as τ_eq(R) < t < τ_esc(R) holds. Finally, the observational time scale, t_obs, refers to the time scale of our experiment or observations, on which we study the system, either on the computer or in real life.
Therefore, if we place the system into a given sub-region R and observe it on a time scale larger than the equilibration time scale but shorter than the escape time scale, τ_eq(R) < t_obs < τ_esc(R), then the system’s properties we measure are thermal equilibrium properties, and we would consider the sub-region R as locally ergodic on the time scale t_obs.15,16 In the case of a chemical system, this sub-region would correspond to a metastable conformation, modification, or phase.
III. HISTORY
Complex cost function landscapes have been with us forever, in principle; one need only think of all the optimizations of more or less complex tasks that we perform in our daily lives. However, only over the past decades have we become aware of such landscapes as a way to describe and formalize such complex tasks, as a possibility to understand the behavior of complex systems, and as a means to systematically organize a multitude of such systems. This applies not only to the kinds of systems that we are studying in terms of their energy landscapes but also to the methods with which we perform such investigations. In addition, by trying to understand the dynamics and properties of the systems and their energy landscapes, we have also accumulated and evolved a set of energy landscape concepts that can be employed to acquire a deeper understanding of how such landscapes work.
Regarding the history of the systems studied, the methods developed, and the concepts conceived for exploring and understanding cost function landscapes in an appropriate fashion, we can divide the history of energy landscapes into several epochs. (1) The time before the invention and subsequent public availability of the computer (before ∼1950): people knew of the existence of highly complicated problems involving cost and energy landscapes, such as classical prototypical global optimization problems or the investigation of the dynamics and time evolution of chemical systems and processes, e.g., chemical reactions. However, they usually did not have the tools to address such problems, except for highly reduced or symmetric subsets of such landscapes where analytical methods could be employed, or with the help of large numbers of, from our modern point of view, extremely slow human computers. (2) The time from ∼1950 to ∼1980: people (a) were able to construct and implement computational algorithms on computers and employ numerical techniques to address formerly seemingly impossible problems, and (b) began to realize that many systems in the natural sciences, mathematics, economics, and the humanities demanded the introduction of complex multi-minima energy or cost function landscapes and, therefore, initiated the development of methods to systematically explore them, such as the first generation of global optimization algorithms. (3) The 1980s: scientists started seriously addressing problems drawn from realistic situations, especially in chemistry, physics, or biology but also in the humanities, using numerical techniques without the help of major simplifying assumptions regarding, e.g., the symmetry of the system; this was also the decade in which the basic ideas for many new numerical exploration algorithms were developed. (4) Finally, the time since the 1990s: the availability of much larger computational power, combined with a deeper realization of the importance of energy and cost function landscapes as such when trying to understand the behavior of complex systems, led to a true explosion in both the number and type of systems being studied and the tools and exploration methods being developed and applied.
In Subsections III A–III D, I will give a short and far from complete or exhaustive overview of the history of systems, methods, and concepts of energy and cost function landscapes; for more details, I refer to the literature being cited. Similarly, describing, e.g., how the many algorithms mentioned work in any detail beyond the general remarks in Sec. II D would be beyond the scope of this presentation, and I must regretfully refer to the literature cited. Since I am not a historian of science, this can only be a rough summary of the evolution of our treatment and understanding of complex landscapes, which obviously is going to reflect my own perspective and experience. Therefore, the study of cost functions in the context of biological evolution, business, or societal issues will not be discussed in much detail. However, this overview should give us a feeling for how far we have come over the past 75 years and, therefore, allow us to connect to Sec. IV of this perspective, where we will discuss possible future pathways and open questions in the field of cost function and energy landscapes and recognize how these might have evolved from older lines of investigations of energy landscape related problems.
A. Before 1950
1. Systems
As mentioned earlier, since the dawn of time, human beings have had to deal with complex optimization problems where a multitude of seemingly optimal (formally, “sub-optimal”) solutions appear to exist, and the task of picking the globally optimal one is highly non-trivial. Here, I will only refer to those instances where the problem has been formalized and/or systematized, usually in a mathematical context.
Perhaps the first formal global optimization problem was posed by Kepler in the 17th century:49 the question of the densest packing in three dimensions of a finite/infinite set of identical spheres, a problem that was finally solved only in 2017, concluding an exhaustive proof effort begun in the early 1990s.50 Another problem that was formulated quite early, going back at least to the 18th century, is the traveling salesman problem (TSP), which received its modern formulation in the 1930s and 1950s.11,51 Other classical global optimization problems were the knapsack problem in 189752 and the graph partitioning problem, which was already discussed in the 19th century by Hamilton and Kirkman53 but formulated more systematically in 1966.54–56 In economics, the concept of a Pareto optimization, i.e., a multi-objective optimization where the different objectives are weighted with a Pareto parameter, goes back at least to 1897,57 and modern business cost analysis was developed in the 1940s58 and 1950s.59
In addition, in the 1940s,60 so-called penalty functions were introduced in optimization problems in order to simplify the actual problem or to guide the search procedure more efficiently to a good suboptimal solution. In particular, such penalty functions can serve as an efficient implementation of constraints if the allowed state space of a specific cost function problem involves very complicated boundary conditions and restrictions on an originally very simple and intuitive state space. A direct application of such additional conditions on the level of the state space is often very tedious or complicated and might result in discontinuities in the landscape, which complicates the use of search algorithms that rely on derivatives of the energy or cost function. Therefore, penalty functions that assign extremely high energies to the forbidden states allow us to keep the simple full state space and maintain the continuity and differentiability of the landscape while eliminating the forbidden states for all practical purposes. In addition, one could even cite Darwin and his successors for the concept of fitness, whose (local) maximum describes the suboptimal realization of a given species for a given set of external boundary conditions such as climate, geological constraints, and the presence of other competing/non-competing species in the ecosystem.
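Returning to the penalty functions mentioned above, a minimal sketch of the idea (in Python; the quadratic form and the weight are illustrative choices, not taken from Ref. 60) augments the original cost function as follows:

```python
def penalized_cost(x, cost, violation, weight=1.0e3):
    """Augmented cost function: keep the simple, unrestricted state space and
    make forbidden states prohibitively expensive. violation(x) >= 0 measures
    how strongly x breaks the constraints; the quadratic form preserves the
    continuity and differentiability of the landscape."""
    return cost(x) + weight * violation(x) ** 2
```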
In parallel to this work, cost and energy landscape problems were being studied in physics and chemistry. Historically, the concept of energy was introduced by Leibniz in the late 1600s in the form of “vis viva” and “vis mortua,”61,62 which roughly correspond to what we now call kinetic energy and potential energy, respectively (where the term “vis” would literally be translated as “force”). In the 19th century, energy was identified as a fundamental entity in the context of Hamilton’s variational principle.63,64 Furthermore, the concept of general energy conservation65 for systems with time-independent potential energy functions was introduced, both in mechanics and in thermodynamics, thus generalizing and formalizing earlier work going back to Galileo, Bernoulli, Huygens, and Leibniz, finally achieving its general form in Noether’s theorems.66 Subsequently, by employing the definition of energy and potential energy in mechanical systems, the definition of the potential energy landscape as a function of the (generalized) coordinates describing the system of interest was reached, where in chemistry we usually employ the coordinates of the atoms for this purpose. Regarding cost functions in physics and chemistry, global optimization problems appeared in all inverse problem data analyses, and the Wulff construction67 was developed to determine the optimal shape of a crystal such that its surface energy was minimized.
In addition, small potential energy landscape problems, such as chemical reactions based on the potential energy surface68 and the dynamics of a walker in a double-well system,69 were studied. Similarly, the concept of excited electronic state surfaces and their interaction with the ground state surface in the context of the Born–Oppenheimer approximation,70 the so-called diabatic energy surfaces, was already developed in 1935.71 Already in the 1930s,72 models for potential energy surfaces of small molecules were considered; here, one should also mention the issue of the existence of multiple isomers, which were recognized as (metastable) versions of a given molecule already in the 19th century.73 At the same time, the viscosity of liquids74 and the plasticity of solids75 were discussed in terms of the activated (jump-like) transport of atoms or holes, and the concept of a rugged multi-minima energy landscape exhibiting a multitude of very similar low-energy minima was introduced to visualize and discuss the dynamics of liquids in a comprehensive fashion on the atomic level.74 The issue of highly complex dynamics had been on the minds of scientists since, at least, Laplace’s time, and by the beginning of the 20th century, the concept of chaotic trajectories even in systems with few degrees of freedom and seemingly quite simple energy landscapes had become formalized.76
2. Methods
Since modern computers were not available before 1950, dealing with complex landscapes was considerably more restricted. Nevertheless, one should recall that the concept of a landscape and its depiction go back at least 8500 years, as can be seen from a recently discovered77 ancient geographical/town map from ∼6500 B.C. (cf. Fig. 9). Regarding energy landscape representations, small sets of minima and saddle points were known, of course, and in the 1930s, the description of reaction paths in chemical systems was not uncommon.79 Furthermore, the concept of phase diagrams, where for given external (thermodynamic) conditions one indicates which modification or phase of a chemical or physical system dominates the corresponding energy landscape, was well known already in the 19th century. Regarding optimization methods, quite a number of approaches that should work in principle were known: by 1927, exhaustive (isomer) graph enumerations were performed,80 which would have corresponded to a systematic enumeration of all local minima on a restricted energy landscape of the molecular system, but as yet without a reliable ab initio energy function. Similarly, gradient minimization techniques have been known since the middle of the 19th century,81 and already in 1791, Wurm and Delambre had invented the well-known Verlet algorithm for the integration of differential equations.82 Such differential equations were typical results of global and local optimization problems in physics and mathematics, such as deriving the shape of a catenoid83 or determining a surface of minimal energy for given boundary conditions.84 These problems and the related optimal control problems were solved using the so-called calculus of variations, which goes back at least to the 17th century85,86 and can essentially always be associated with a cost function landscape where the microstates correspond to the trajectories or functions that fulfill the boundary conditions of the optimization or optimal control problem.87 Similarly, the first work on linear programming was performed in the 1930s88 and started to take off in the 1940s.89
Besides the search for global and local minima, the first studies dealing with the diffusional escape of particles from minima and the transition between minima via a saddle point appeared in 1940,69 while the general thermodynamic and statistical aspects of such transitions had been put together in the so-called transition state theory in 1935.90 Regarding the density of states of a physical or chemical system with a complex energy landscape, the classical thermodynamic/statistical mechanical point of view dominated, where no distinction was made between a local density of states and the global density of states, i.e., the analysis proceeded as if the system had infinite time available to completely explore the whole energy landscape. Therefore, the statistical properties referred to the whole landscape instead of only individual sub-regions, and the density of states contributing to the thermodynamically stable phase completely dominated the total free energy of the system, with the obvious exception of the points in thermodynamic space where two phases could co-exist in thermal equilibrium.
The idea of equality of time average and statistical mechanical ensemble average, i.e., the concept of global ergodicity, had already been introduced in the 19th century91 and been formalized by 1913, when the impossibility of ergodicity in purely mechanical systems was shown.92 However, it was already clear at that time that there could be different regions of stability on the most general landscape, or sets of landscapes characterized by certain control parameters, and methods like thermodynamic integration and thermodynamic perturbation were introduced in 193593 and 1954,94 respectively, to calculate the difference between free energies or densities of states of two chemical systems.
3. Concepts
With respect to energy landscape concepts that are found in the literature and in the general understanding of what is relevant in the context of complex energy landscapes, special points such as minima, maxima, and saddle points, together with energy barriers between two minima (via a saddle point) and reaction paths between two states of the chemical system, were known to be of importance already before 1950.79,90 Similarly, in the 1930s, the concept of a multi-minima (free) energy landscape appeared in the discussion of the atomic-level mechanisms controlling the viscous and plastic behavior of macroscopic liquids and crystals, where the movement of atoms or holes in the material was visualized as the activated hopping among a multitude of structurally highly similar minima of nearly equal energy on the landscape.74 This concept of a set of structurally essentially indistinguishable configurations with nearly identical energies that are separated by relatively small (free) energy barriers and whose properties determine the behavior of the system constitutes a precursor of the multi-minima energy landscape paradigm for the description of viscous liquids and glassy systems introduced in the late 1960s.95 Regarding the computational approaches, the major ideas were derived from analytics, such as exhaustive searches or the closely related systematic use of repeated gradient descents.
Perhaps the greatest difference with respect to the concepts and interpretation of the landscape, when compared to today's understanding, resides in the fact that, nearly instinctively, only the global minimum of the global optimization problem or of the chemical system appeared to count, as is perhaps reasonable from a purely mathematical point of view but not necessarily from a practical one. In contrast, at present, it has become clear that many low-energy side-minima can be realized as metastable chemical compounds or represent good alternative sub-optimal solutions to the global optimization problem. This is also reflected in the fact that, at that time, the thermodynamic point of view dominated the consideration of the landscape of chemical systems, where again the whole landscape is considered on essentially infinite time scales and, therefore, not the energy landscape but the thermodynamic space dominated the “world view” of most chemists or materials scientists. However, already before 1950, the issue and importance of the robustness of the solution of a global optimization problem were appreciated.96,97 It was clear that the sensitive dependence of the character of the optimal solution on the external conditions incorporated in the cost function landscape required, in general, more knowledge about the landscape than just the global minimum if optimal solutions were to be applied in practice.
B. 1950–1980
1. Systems
At the beginning of the computer era, new problems in the study of multi-minima energy landscapes were primarily so-called combinatorial optimization problems, such as the satisfiability problems in the 1950s98 or scheduling problems in 1966.99 However, already in the 1950s, simulated evolution was being investigated,100 followed by the introduction of fitness functions for evolutionary101 purposes in the 1960s and 1970s. Similarly, the first landscape study in the field of robotics took place in 1961.102 In the natural sciences, the 1960s saw the study of cost function landscapes, e.g., in the context of solving crystal structures from powder diffraction data.103 In addition, in biology and economics, global optimization techniques were applied to problems in, e.g., resource allocation104 and the modeling of real biological systems and evolution.105–107 Furthermore, in the fields of business and the humanities, the traveling salesman problem was reformulated to study various scheduling and delivery problems.108
However, a decisive step came with the (re-)introduction of the landscape paradigm for the understanding of viscous and glassy systems by Goldstein in 196995 and, analogously, with the formulation of the Levinthal paradox of protein folding,109 i.e., why and how does a protein succeed in finding the folded state in a finite time in spite of the overwhelming number of local minima and the presumably highly complex barrier structure of the energy landscape, which would require astronomical times to explore sufficiently? The arguments employed in that work relied on the assumption that liquids, glasses, proteins, and other large polymers, respectively, possess a complex multi-minima energy landscape. This was quickly followed by the introduction of other glass-like systems in the 1970s, such as spin glasses110,111 and the so-called Coulomb glass, in the context of electron transport in disordered semiconductors.112,113 These studies relied on the concept of energy landscapes “without” a unique global minimum that is separated from the remainder of the system’s local minima by a substantial energy difference and concomitant stabilizing energy barriers. We also find simulations of amorphous Lennard-Jones systems,114,115 which subsequently became a standard class of test systems for complex energy landscapes, or similarly, the use of discrete (Ising-spin) models for the description of phase transitions in alloys.116 Furthermore, model landscapes were formulated that captured the feature of a multitude of local minima being present at low energies, such as the multiple double-well model for glasses117 in 1972. One should note that in amorphous materials, these minima are energetically very similar but structurally quite different on the atomic level, in contrast to the many defect minima of a real crystalline compound that still exhibit essentially the same overall structure as the ideal crystal. At this time, the analysis of the landscape was focused on the statistical mechanics of the minima and the dynamics of double wells, but not yet on the global barrier structure of the system. Regarding the study of barriers, models for cis/trans barriers in molecules were investigated in a local sense, taking the configurational entropy into account.118,119
However, the concept of a chemical system being represented by a multitude of (free) energy minima was not restricted to purely theoretical considerations. Introducing such landscapes also became more common in the experimental investigation of large (bio)molecules, and free energies were measured and computed along various (known or assumed) reaction coordinates in order to analyze and interpret the dynamics of the system.120,121 In particular, one notes that often the focus of these studies was on the determination of intermediate structures along the folding trajectory, with an implied assumption that the trajectory follows a sequence of more or less well-defined kinetic intermediates on the energy landscape.122
2. Methods
As far as method development is concerned, a large number of basic algorithmic types for numerical global optimization were developed during this time period, both stochastic and deterministic in nature. Optimal control methods have been studied since the 1950s,123 and the systematic logic of branch-and-bound and branch-and-cut algorithms, which had been developed long before the existence of computers (and had, of course, been applied in a common-sense approach for exhaustively excluding alternative options), was transferred into the first computer algorithms124,125 in 1960. So-called directed search methods were developed at the same time,126 and a stochastic quench version was introduced as Directed Random Search.127 Similarly, since the late 1950s128 and early 1960s, evolutionary algorithms129 have been developed and subsequently applied, followed by their “evolution” into so-called genetic algorithms only a few years later.130–133 Finally, for many global optimization problems, heuristic search algorithms and special moveclasses were developed, which could subsequently be transferred to other related problems, such as the Lin–Kernighan algorithm for the traveling salesman problem.134
As far as landscape representations are concerned, principal coordinates were introduced in 1966,21 and reduced sets of coordinates such as dihedral angles and state spaces135 have been employed since at least 1963. Here, one should also mention approaches that employ tree graphs to represent the real-space structure of proteins136 in order to provide an efficient model for a (coarse-grained) representation of the feasible microstates in the state space of the protein. Otherwise, however, the landscape description remained rather abstract, with the landscape being visualized as a plethora of local minima (cf. Fig. 10) or double-well systems,117 without global connectivity. A detailed barrier analysis was essentially still restricted to the analytical and numerical exploration of two-minima problems, although the importance of the barrier structure was recognized in an abstract way.95 On the other hand, the availability of stochastic and deterministic exploration algorithms, such as Monte Carlo simulations employing the Metropolis criterion41 and molecular dynamics simulation codes,137 since 1953 and 1957, respectively, allowed the first simulations of the dynamics on energy landscapes beyond two-well systems, thereby taking, in principle, the whole complexity of the landscape into account. It is therefore not surprising that the potential residing in such Monte Carlo simulations for the purpose of global optimization was also addressed quite early, via applications of such simulations with varying temperatures to constrained optimization problems,138 which can be considered precursors of, e.g., the simulated annealing approach139 in the subsequent decade. Similarly, many variations of the standard Monte Carlo simulations were developed in order to study the dynamics of energy landscapes more efficiently, such as, e.g., the n-fold way approach.140
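To make this connection concrete, the following minimal sketch runs a Metropolis random walk with a slowly decreasing temperature, i.e., a bare-bones precursor of simulated annealing; the one-dimensional toy landscape f(x) and all parameter values are illustrative choices and not taken from the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Toy one-dimensional landscape with two minima; the global
    # minimum lies near x = 3.6 (illustrative choice only).
    return 0.25 * x**4 - x**3 - x**2 + 4.0

x = rng.uniform(-3.0, 3.0)               # random starting microstate
best_x, best_f = x, f(x)

for step in range(20000):
    T = 2.0 * 0.9995**step               # slowly decreasing "temperature"
    x_new = x + rng.normal(scale=0.3)    # local trial move
    dE = f(x_new) - f(x)
    # Metropolis criterion: always accept downhill moves, accept
    # uphill moves with Boltzmann probability exp(-dE/T).
    if dE <= 0.0 or rng.random() < np.exp(-dE / T):
        x = x_new
    if f(x) < best_f:
        best_x, best_f = x, f(x)

print(f"lowest minimum found: x = {best_x:.2f}, f(x) = {best_f:.2f}")
```

At high temperature, the walker crosses barriers almost freely; as T decreases, it is progressively confined to, and finally quenched into, one of the low-lying minima.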
By focusing on the distribution of local minima as a function of energy, together with assumptions about the local curvatures at each minimum (which, in physical and chemical systems such as crystals or molecules, represent the local vibrations inside a minimum basin), the global density of states could be computed for model landscapes in an approximate fashion, without needing to take connectivity and barrier structure into account. Going beyond the harmonic approximation, the density of states of a complex system could be computed relatively efficiently using the umbrella sampling technique introduced in 1976,141,142 which constituted a practical realization of the older concepts of thermodynamic integration93 and perturbation94 that evaluate the difference in free energy/density of states between a known, usually analytically solvable, reference system and the system of interest.
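Schematically, this superposition idea can be summarized as follows (standard classical statistical mechanics in the harmonic approximation, not a formula taken from the cited works): the partition function is approximated as a sum over the local minima α,

```latex
Z(T) \approx \sum_{\alpha} e^{-E_{\alpha}/k_{B}T}
      \prod_{j=1}^{3N-6} \frac{k_{B}T}{\hbar\,\omega_{\alpha j}},
```

where E_α is the energy of minimum α and the normal-mode frequencies ω_αj encode its local curvatures (3N−6 modes for a nonlinear cluster of N atoms); the global density of states then follows without any reference to the connectivity of, or the barriers between, the minima.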
3. Concepts
Perhaps the most fundamental mathematical concept introduced in the context of energy landscape studies in the 1970s is so-called NP-completeness,143 short for “nondeterministic polynomial-time complete”: while one can verify in polynomial time (as a function of the parameters describing the size of the problem) that a suggested solution is indeed a (the) true (satisfactory) solution, no polynomial-time algorithm for finding such solutions is known, and it is widely believed that none exists; in the worst case, one essentially needs to check all potential solutions before finding a (the) true one. Many difficult global optimization problems belong to this category, such as the knapsack problem or the traveling salesman problem mentioned earlier. Establishing whether a problem can be solved in polynomial time is a highly non-trivial task; e.g., for linear programming, it took until 1979 before a polynomial-time algorithm was demonstrated.144
Several new types of global optimization techniques were introduced, which were based on and inspired by different underlying concepts. The branch-and-bound methodology124 from 1960 is fundamentally based on the kind of systematic and exhaustive case-by-case analysis employed in pure mathematics for proving theorems. In addition, even before the Monte-Carlo simulation approach, random quenches, i.e., stochastic or deterministic local minimizations from randomly chosen starting points on the energy landscape, had been employed for a long time145,146 before more refined algorithms were developed that were able to directly or indirectly exploit information already gathered about the landscape. In particular, biology-inspired algorithms such as evolutionary129 and genetic131,132 algorithms were introduced in the early 1960s. In this context, one should draw a distinction between algorithmic concepts that are based only on information about the energies of the microstates, which are applicable to any landscape regardless of whether its state space is discrete or continuous, and those algorithms that rely on the existence of a continuous state space, ideally in analytical form, for which gradients or higher-order derivatives can be defined.
The new landscape features most relevant during this time were the local densities of states—usually in the harmonic approximation—and of local minima and saddle points. Still, older concepts remained influential: order parameters, which in statistical mechanics and thermodynamics allow the classification of large regions of the landscape according to certain phases, usually in the context of a continuum approximation,147 had already been introduced in the 1930s148 and have strongly shaped how people mentally work with landscapes on the level of stable sub-regions. The concept of stable states was only introduced in 1980,149 but it had been foreshadowed as an implicit assumption in the introduction of the order parameter concept, which implicitly relies on the presence of stable states, and, preceding this, in the concept of metastable phases in chemical and physical systems. We will return to this in the discussion of the future of energy landscapes. Concerning the way energy landscapes are investigated in the context of statistical mechanics and thermodynamics, the focus up to 1980 was on the concept of multiple minima all contributing to the total free energy of the system “at the same time.” In this context, the concept of configurational entropy150 was introduced in 1965 as a fundamental idea. Taken at face value, the configurational entropy implies the availability of infinite time for the system to be able to explore all the local minima that have essentially the same energy; their quasi-degeneracy can then be accounted for in a lumped fashion via an entropy term, i.e., the configurational entropy. One thereby includes all of these minima on an equal footing into the time and ensemble averages, whose equality requires the system to be ergodic on the time scale of observation.
Toward the end of this time period, in 1978, an important new dynamical feature was introduced, the so-called aging phenomenon151 in systems that usually exhibit a highly complex multi-minima energy landscape. Typically, in aging experiments, a system that has been quenched out of a high-temperature equilibrium state into some kind of disordered state at low temperatures stays in an equilibrium-like state for a certain period of time, as monitored via certain two-time correlation functions of the properties of the system, before these correlations start deviating from their equilibrium values. What is particularly fascinating is that the length of this equilibrium-like time interval grows monotonically with the time the system has been allowed to relax after the quench before the correlation measurements are started. In principle, such aging can also take place on a landscape where the complexity resides only in the entropic barrier structure, without a multitude of local minima being present, but that is usually not the case for the systems that exhibit aging in experiments.
C. 1980–1990
1. Systems
Many of the realistic models of systems and some of the mathematical problems that are associated with energy landscapes appeared as concrete objects of study in the 1980s. The random energy model (REM)152 from 1981 still belonged to the category of “purely conceptual” models insofar as only the generic features of the model were of relevance and not so much any direct correspondence or other relationship to, e.g., a real glass-forming material. To this class of models with characteristic generic features also belonged the tree graph models introduced in 1985 (ultrametric trees33) and 1988 (self-similar trees24). These served as analytically solvable, or at least analyzable, models for complex energy landscapes in order to provide guidance for more realistic future models, such as tree graph representations of the energy landscape of a real system. In most such realistic models, analytical solutions are not feasible, but numerical explorations can be performed; in those cases, however, the interpretation of the results and the derivation of general principles from the numerical output are often very difficult without analytical models with which the results can be compared.
However, by now, systems like binary Lennard-Jones glasses153 in 1983 or alloy models154–156 in the early 1980s were beginning to be studied in a global yet detailed fashion, and the Stillinger–Weber-type landscape models for liquids and glasses157–159 were introduced during that time. In chemistry, polymer glasses160 and protein models161,162 were discussed from the point of view of their energy landscapes, and structure predictions on the level of “from sequence to structure”163 have been performed since the late 1970s. Furthermore, by the late 1980s, the energy landscape of small atomic and nuclear clusters164 began to be studied, in parallel with the modeling of structural glasses165–167 and spin glasses.34,168,169 As an example of the inverse problem, the so-called reverse Monte-Carlo method170 was developed in 1988, where the cost function was the difference between the computed and measured pair or radial distribution functions of a glassy system. Similarly, purely theoretical structure comparisons of crystals were performed,171,172 where the candidate structures were derived by chemical intuition and constructed by hand.
In biology and economics, global optimization techniques were applied to problems in, e.g., resource allocation173 and the modeling of real biological evolution.174 Closely connected was work in applied mathematics where, besides optimal control problems of various kinds, the landscape of neural network parameters175–177 was first studied in the 1980s. In addition, a variety of analytical models and artificial landscapes were formulated as heuristics, as standardized test systems for future optimization algorithms, and as counter-examples.178 An example of the latter is the famous “Stillinger’s nightmare” landscape for optimization purposes, where the topology of the landscape is such that the global minimum corresponds to an extremely deep golf hole on top of a mountain, resembling the crater of a volcano.
2. Methods
In the field of landscape representations, the 1980s was the time of tree graphs. Besides the generic ones mentioned earlier, the so-called lumped tree graphs (cf. Fig. 6) for a simplified description of a landscape24 were introduced in 1988. Here, all the states in a minimum basin up to the saddle point are lumped together into one macrostate, with a certain capacity of microstates associated with it. In order to perform a stochastic Markov dynamics on such a tree graph according to a Metropolis dynamics at a given temperature, the corresponding Markov matrices were “Boltzmannized”179 in 1988.
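As a minimal sketch of what such a Boltzmannization might look like (the original prescription may differ in detail), consider two adjacent macrostates a and b on the tree, with energies E_a and E_b and microstate capacities g_a and g_b; a Metropolis-type choice for the transition matrix is

```latex
W_{b \leftarrow a} = C_{ab}\,
  \min\!\left(1,\;\frac{g_{b}\,e^{-E_{b}/k_{B}T}}{g_{a}\,e^{-E_{a}/k_{B}T}}\right),
\qquad
W_{b \leftarrow a}\,P^{\mathrm{eq}}_{a} = W_{a \leftarrow b}\,P^{\mathrm{eq}}_{b},
```

where C_ab = C_ba encodes the connectivity of the tree; the detailed balance relation on the right guarantees that the Markov dynamics relaxes toward the Boltzmann distribution over the macrostates.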
As far as the development of new global optimization methods is concerned, several classical approaches were introduced for the first time in 1983, such as simulated annealing,139,180 which imitates the physical annealing process in the heat treatment of metals. Another group are the so-called taboo searches181 from 1986, where the algorithm remembers the states or regions that have been visited in the past and, therefore, avoids revisiting them. A third class of algorithms, introduced in 1988, are the jump or bounce algorithms,168 which combine local optimizations with large, more or less random jumps into other regions of the landscape.
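The taboo idea, in particular, is easy to state in code. Below is a minimal sketch for a toy spin-glass cost function, where the walker always moves to the best neighboring state that is not on a fixed-length memory of recently visited states; the cost function and all parameters are illustrative choices only.

```python
from collections import deque
import numpy as np

rng = np.random.default_rng(1)
n = 16
J = rng.normal(size=(n, n)); J = (J + J.T) / 2; np.fill_diagonal(J, 0.0)

def energy(s):
    # Toy spin-glass cost function with many local minima (illustrative).
    return -0.5 * s @ J @ s

s = rng.choice([-1, 1], size=n)
tabu = deque(maxlen=50)                  # memory of recently visited states
best_s, best_E = s.copy(), energy(s)

for _ in range(2000):
    tabu.append(tuple(s))
    # Evaluate all single-spin-flip neighbors and move to the best
    # one that is not on the taboo list, even if it lies uphill.
    candidates = []
    for i in range(n):
        t = s.copy(); t[i] *= -1
        if tuple(t) not in tabu:
            candidates.append((energy(t), i))
    if not candidates:
        break                            # all neighbors taboo: stop (or diversify)
    E_new, i = min(candidates)
    s[i] *= -1
    if E_new < best_E:
        best_s, best_E = s.copy(), E_new

print("best energy found:", best_E)
```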
This decade also saw the development of methods for systematic saddle point searches: a first approach in 1981 with the search starting in local minima,182,183 and then, a few years later, the rational function optimization method184 and the so-called slowest-slides approach,164 just to name a few. In addition, the analysis of general transition paths started with first attempts to minimize the energy orthogonal to a given path on the landscape—sometimes called the drag method185—in order to identify saddle points connecting two minima and the optimal paths involved. Taking a leaf out of the book of analytical replica methods for the description of complex systems,186 replicas were introduced into the simulation of complex systems and phase transitions, at first using Monte Carlo simulations,187 a methodology frequently denoted as parallel tempering.188 Similarly, after about a decade of silence regarding new methods for the computation of local and global densities of states and the (local) free energies associated with them, a precursor to the currently quite commonly used weighted histogram analysis methods was developed in 1988.189,190 Furthermore, it became standard to compute the local density of states (DOS) in saddle point regions and for local minima via the width of the saddle point (represented by its positive curvatures) and the curvatures of the local minima, respectively.
3. Concepts
Regarding new concepts in the 1980s, in 1982 the idea of associating a set of local minima with a trajectory along a molecular dynamics simulation by periodically performing local gradient minimizations along the trajectory was introduced. This set of local minima was called the “inherent structures (of the landscape),”157,191 assuming that the paths followed by the walker were representative of the landscape. Similarly, the idea of measuring the local density of local minima for some region of, or for the whole, landscape was introduced in 1989,34 and the concept that landscape connectivity might be usefully represented by various kinds of connectivity trees or graphs was also introduced in 1985.25,33,192 Here, one should separate the generic qualitative description of a multi-minima landscape connected via various saddles by tree graphs, as employed in the discussion of the dynamics of glasses or proteins,162 from the formulation of tree models with analytically solvable dynamics33,192 and the construction of a real tree structure for the representation of the energy landscape based on, e.g., the analysis of actual reaction paths25 or derived from systematic global landscape explorations using tools like the lid algorithm.26 The first example of such a realistic tree was obtained from the analysis of the conformational transitions of a protein.25 To a certain extent, this tree graph visualization of the landscape of a protein could also be correlated with the tree model description of the (structural or dynamic) (sub)-domains of a protein going back at least to the 1970s, where the hierarchic structure of the protein could be represented as a tree graph,136,193 suggesting possible folding mechanisms for the protein and, therefore, constituting an implicit reference to the underlying energy landscape of the system governing the actual folding process.
Regarding the new global optimization and exploration methods, the inspiration again came from aspects of the real world, such as the technological annealing procedure that inspired simulated annealing,139 just as biological evolution had inspired evolutionary and genetic algorithms131,132 20 years earlier. The attempt to revive the idea of a comprehensive or exhaustive exploration of a landscape in the search for the global minimum, in particular for landscapes with discrete state spaces, led to the taboo searches,181,194 while the original saddle searches in 1981 still relied on analytical approaches.183,184 One important step beyond the traditional thermodynamic statistical mechanical picture was the concept of broken ergodicity in 1982,195 which made it very clear that time scales play an important role not only as practical limits on the duration of experiments and on the length of feasible simulations but also on an abstract theoretical level. Starting from a system in global equilibrium, i.e., on infinite time scales and/or at infinite temperature, we break the system into subsystems by reducing the available exploration time, or by reducing the temperature, i.e., the kinetic energy available for crossing the energy barriers separating sub-regions of the landscape, such that the system can no longer achieve global ergodicity within the available time. Sometimes, after such a breaking of global ergodicity, the time scales are still sufficiently long, and the landscape has a suitable structure, such that the individual sub-regions can reach ergodicity by themselves, until a further reduction of the time scale or temperature breaks their sub-region ergodicity in turn.
This conceptual framework brought home the lesson of Levinthal’s paradox109 and raised the question to what extent the configurational entropy of, e.g., a glass or a solid solution is a well-defined thermodynamic quantity. To address the paradox, the minimal frustration assumption was introduced in 1987,196 which evolved into the funnel picture:39 it was assumed that the energy landscape of a protein that folds to a unique structure on biologically relevant time scales must possess an overall “shape” that can be visualized as a funnel. In this fashion, from essentially wherever the protein resides on the landscape at the beginning of the funneling process (perhaps initiated by a slight reduction in temperature or in the ion concentration of the medium the protein is embedded in), there are no major traps that delay the downhill progress of the protein’s configuration toward the global minimum, which is identified with the folded configuration. As a consequence, folding began to be conceived of as a global multi-path process and no longer as a specific sequence of well-defined intermediary structures.
Similarly, the (quasi-)equilibrium often observed in a disordered state of a macroscopic system, such as a solid solution, an amorphous compound, or any material with a complex landscape, faces the problem of broken ergodicity: eons would have to pass before the system could explore the whole landscape densely enough to qualify as globally equilibrated (unless the glass crystallizes, which for structural glasses is usually possible in principle). Here, one should note that the same problem also appears for a crystalline macroscopic system: we cannot ensure global ergodicity, as this would require the system to also explore the realm of disordered states, alternative crystalline modifications, and configurations containing dislocations and domain boundaries; alternatively, the system would need to “ignore” the rest of the landscape as far as the crystal’s thermodynamic properties are concerned. However, even when excluding those landscape regions associated with alternative modifications or phases, the complete exploration of all the possible equilibrium defect states that are present in a real crystal at a given temperature and contribute to its free energy is usually not feasible.
As an answer to this question, possibly first arising in the context of comparisons of simulations of small model systems with real macroscopic glasses, the concept of self-averaging, originally proposed by Lifshitz already in the 1960s,197 was (re-)introduced in 1978,198 and its applicability (or non-applicability!) in systems with complex energy landscapes was subsequently explored more fully in the 1980s. Here, the problem that the system is macroscopic is turned into a solution by taking inspiration from the fact that major quantities like the energy, entropy, or volume of a macroscopic thermodynamic system are extensive, i.e., after breaking the body into two pieces, the sum of their energies or entropies equals that of the unbroken body. Conversely, one can imagine the macroscopic system as a union of many smaller but still quite large subsystems, where the total free energy is still (approximately) the sum of those of the individual pieces. Each such piece can now be considered a representative in an ensemble of examples of the original macroscopic system, such that each of these subsystems can approximately equilibrate by itself on the time scale of observation. However, since each of these pieces is a different representative of the complete system—they could even exhibit slightly different compositions—their ensemble average serves as quite a good approximation to the true free energy of the total system. Therefore, instead of having to consider billions of copies of the original body to obtain a statistical mechanical average, this one macroscopic body can be visualized as performing such an average over itself, i.e., a self-average. Of course, this procedure can only be used if the material is spatially homogeneous; for example, the global minimum of the macroscopic system should not correspond to a decomposition into two solids of different compositions. Identifying whether a system allows the application of the self-averaging assumption became an important issue when studying various random (quenched) media and when using tools like the replica method.199
D. 1990 and beyond
1. Systems
By the end of the 1990s, most of the types of energy landscapes people work on at present had been introduced and studied to a certain degree, including, of course, those that had already been on the “to-do list” since earlier times. At the beginning of this decade, analytically solvable graph models were studied, e.g., in 1991,200 and the verification of alloy phase diagrams and the study of solid solutions became partly routine.201 While these calculations still refer to the thermodynamic properties of the whole landscape, their computation requires performing statistical ensemble averages over very many minimum or near-minimum configurations.202 In 1997, the old REM was refurbished by adding connections between the local minima,203 i.e., a barrier structure was added to the original, purely minima-focused model. In the field of small molecules and clusters, the landscape was now being explored globally,204 and a number of folding landscape models39,205 were proposed. These global explorations were used to predict the global minimum structures of small and medium sized clusters44,206,207 and to construct lumped tree models of their landscape;28,29 similarly, the optimal lattice occupation in solid solutions was investigated.208–210
In 1990, crystal structures were determined for the first time for fixed given unit cells and unit cell content without direct comparison with a measured diffractogram, i.e., using no experimental information beyond the values of the cell parameters and the number of atoms in the unit cell; this led to the introduction of potential energy landscapes for crystalline approximants.211 At the beginning, empirical potentials or potentials guided by chemical intuition were employed, since single point ab initio energy calculations took several hours even for simple systems, making the use of ab initio energies in global optimizations infeasible. Nevertheless, by 1994, global optimizations for flexible unit cells and flexible unit cell content, i.e., without any experimental information except the identity of the atom types, became possible by using empirical potentials.212 Adding the pressure to the potential energy led to the study of enthalpy landscapes for crystalline systems, in principle, starting in 1995,213,214 although in most investigations the pressure was set close to zero; prediction at many, and in particular high, pressures became a dedicated enterprise only in the early 2000s.215 Similarly, in the field of molecular chemistry, we find the global study of energy landscapes for proteins on lattices in 1995,216 the introduction of lock-key protein landscapes in 1992217 (at least as a concept), and the first RNA landscape studies.218,219 In the latter cases, one should keep in mind the multitude of possible levels of landscapes that can appear in RNA or protein systems: the energy landscape of a given molecule with a fixed primary sequence; the landscape of possible secondary structures of such a molecule, where usually the free energy would be the cost function; and the evolutionary fitness landscape of feasible different primary sequences of the molecule, which are judged by, e.g., their folding ability.220
In more mathematically oriented fields such as computer science and robotics path planning, landscapes have been studied since the 1990s.221 “Probabilistic” or noisy landscapes were implicitly introduced in the study of neural networks, where the quality of the network varied depending on the training and testing sets. In order to analyze the landscape faced by an ensemble of walkers when using multi-walker algorithms like the genetic algorithm, meta-landscapes of ensembles of walkers were introduced in 1990.222 At present, perhaps the greatest upheaval in the study of energy landscapes might come from the introduction of machine learning for performing the energy calculations themselves. Sometimes, researchers simply try to replace the full, e.g., ab initio, energy calculation for individual atom configurations by a high-quality interpolation formula, hidden both in the neural network parameters and in the translation of the atom configuration—plus possibly additional chemical information—into an efficient input, the “descriptor,” “feature,” or “representation,”223 for the neural network.224 Yet one can also go beyond this and completely cut out the middle-man, i.e., the energy landscape, during, e.g., a structure prediction study,225 as a more powerful version of the earlier applications of neural networks in the structure prediction of proteins.226,227
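To make the interpolation idea concrete, the following sketch maps an atom configuration to a fixed-length descriptor, here crude Gaussian radial symmetry functions loosely in the spirit of Behler–Parrinello-type representations, and feeds it to a tiny feed-forward network; the descriptor parameters, the architecture, and the random (untrained) weights are placeholders, since a real model would be fitted to large sets of reference (e.g., ab initio) energies and forces.

```python
import numpy as np

rng = np.random.default_rng(2)

def descriptor(positions, etas=(0.5, 1.0, 2.0, 4.0), r_cut=6.0):
    """Map an N-atom configuration to a fixed-length vector of
    Gaussian radial symmetry functions (a much-simplified,
    Behler-Parrinello-style representation; parameters illustrative)."""
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    pair = d[np.triu_indices(len(positions), k=1)]
    pair = pair[pair < r_cut]                     # apply a radial cutoff
    return np.array([np.exp(-eta * (pair - 2.0) ** 2).sum() for eta in etas])

def mlp_energy(g, W1, b1, w2, b2):
    """Tiny feed-forward network: descriptor -> predicted energy.
    The weights here are random and untrained; in practice they are
    fitted to reference energies and forces."""
    h = np.tanh(W1 @ g + b1)
    return w2 @ h + b2

W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0

pos = rng.uniform(0.0, 5.0, size=(10, 3))        # random 10-atom cluster
print("predicted energy:", mlp_energy(descriptor(pos), W1, b1, w2, b2))
```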
2. Methods
In the field of method development, a multitude of new algorithms were proposed, many of which were inspired by older algorithms or combinations thereof; “beware the claims of the producer” became a watchword in this kind of Wild West era of global optimization, even though some of the new approaches became the prototypes of the currently most popular algorithms. Just to name a few, we might start with the deluge algorithm in 1993,228 a first surge in the use of machine learning for structure prediction starting in 1989,226,227 particle swarm optimization in 1995,229 and genetic algorithms for continuous systems in 1995.43 In the latter method, each mutation or crossover was followed by a local minimization, thus allowing the application of the classical genetic algorithm, which had been defined on a state space where each microstate could be mapped to a set of binary numbers, to continuous state spaces like those encountered in systems like clusters, molecules, or crystals. Sequential multi-quench, a variant of the bounce algorithm, was introduced in 1995,230 closely followed by another variant called thermal cycling in 1997231 and the basin hopping algorithm in 1997.44 In addition, in the field of robotics, the rapidly-exploring random tree (RRT) algorithms were first introduced in 1998,232,233 although graph-related methods as such had been around in robotics for about a decade by then.14,221
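The basin hopping idea, i.e., interleaving random jumps with immediate local minimizations and applying the Metropolis criterion to the quenched energies, fits in a few lines; the Rastrigin-like toy function, the jump size, and the temperature below are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

def f(x):
    # Toy multi-minimum landscape (illustrative): cosine ripples on
    # top of a paraboloid, with the global minimum at the origin.
    return np.sum(x**2) + 10.0 * np.sum(1.0 - np.cos(2.0 * np.pi * x))

x = rng.uniform(-4.0, 4.0, size=2)
res = minimize(f, x)                            # quench to a local minimum
x, E = res.x, res.fun
best_x, best_E = x, E
T = 1.0                                         # temperature for the accept step

for _ in range(200):
    trial = x + rng.normal(scale=1.0, size=2)   # random jump
    res = minimize(f, trial)                    # local minimization ("hop")
    # Metropolis criterion applied to the *minimized* energies, so the
    # walk effectively moves on a staircase landscape of basin energies.
    if res.fun < E or rng.random() < np.exp(-(res.fun - E) / T):
        x, E = res.x, res.fun
    if E < best_E:
        best_x, best_E = x, E

print("global minimum estimate:", best_x, "E =", best_E)
```

The local minimization removes the downhill portions of the walk, which typically makes the Metropolis step far more effective than on the raw landscape.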
Subsequently, many of these algorithms were slightly modified, combined with molecular dynamics, or mixed with one another; for example, particle swarm optimization evolved into the CALYPSO (Crystal Structure Analysis by Particle Swarm Optimization) algorithm.234,235 Still, some interesting new approaches resulted from these mixtures, such as the T-RRT algorithm in 2011,236 the ant algorithms inspired by the behavior of ants,237,238 threshold minimization in 2015,239 and the recently upgraded neural-network based (machine learning) prediction algorithms,240 just to name a few. Quite generally, combinations of algorithms and hierarchical algorithm constructions, both regarding the systematics and the (approximations of the) landscape features, led to many new variations of the classical global optimization techniques. Somewhat related was work on optimization algorithms in the context of machine learning, such as the Q-Learning algorithm,241 which formed the basis of many future algorithms.242
Similarly, the 1990s saw much progress in the development of algorithms for the study of the landscape beyond the local minimum, aiming at the exploration of the full barrier structure of the landscape. In 1991, eigenvector following was introduced243 to identify low-energy saddles, starting the search from a local minimum along a minimum curvature direction, while in 1994, the mostly by-hand attempts at constructing paths between two local minima took a great step forward with the introduction of the nudged elastic band (NEB) method.244 Trying to find efficient ways to map out transitions between local minima and to explore the landscape globally at realistic thermodynamic conditions led to many variations of “activated” molecular dynamics.245–250 Combining this with NEB features resulted in the transition path sampling methods in 1998,251 finally leading to the concept of a hierarchy of transition states in 2001.252
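A minimal NEB sketch on a two-dimensional double-well surface shows the essential ingredients: a chain of images between two minima, relaxed under the perpendicular component of the true force plus the parallel component of a spring force (plain central tangents, no climbing image; the surface and all parameters are illustrative).

```python
import numpy as np

def V(p):
    x, y = p[..., 0], p[..., 1]
    # Double-well toy surface: minima at (-1, 0) and (+1, 0),
    # saddle point at the origin (illustrative).
    return (x**2 - 1.0)**2 + 2.0 * y**2

def gradV(p):
    x, y = p[..., 0], p[..., 1]
    return np.stack([4.0 * x * (x**2 - 1.0), 4.0 * y], axis=-1)

n_img, k, step = 11, 2.0, 0.02
# Initial path: straight line between the two minima, bowed in y so the
# chain does not start exactly on the symmetry axis.
t = np.linspace(0.0, 1.0, n_img)
path = np.stack([2.0 * t - 1.0, 0.3 * np.sin(np.pi * t)], axis=-1)

for _ in range(2000):
    tang = path[2:] - path[:-2]                       # simple central tangents
    tang /= np.linalg.norm(tang, axis=-1, keepdims=True)
    g = gradV(path[1:-1])
    g_perp = g - (g * tang).sum(-1, keepdims=True) * tang       # true force, perp. part
    spring = k * (path[2:] - 2.0 * path[1:-1] + path[:-2])      # spring force
    spring_par = (spring * tang).sum(-1, keepdims=True) * tang  # parallel part only
    path[1:-1] += step * (-g_perp + spring_par)       # endpoints stay fixed

i = np.argmax(V(path))
print("saddle estimate:", path[i], "barrier:", V(path)[i] - V(path)[0])
```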
Another approach was the so-called record dynamics in 1993,228,253 which aimed at moving across the largest local barriers in the neighborhood—associated with record time scales—in order to efficiently explore larger and larger regions of the landscape. A completely different methodology was constituted by the lid26 and threshold27,254 algorithms, introduced in 1993 and 1996, respectively, where the landscape below a sequence of energy lids was explored exhaustively (lid) or stochastically (threshold), with or without features of the taboo algorithm, thus measuring not only energetic but also entropic barriers on the landscape. Subsequently, in a molecular dynamics version of the threshold algorithm,20 the stochastic random walks were replaced by MD simulations at very high temperatures, forced to stay below the prescribed energy lids. A related approach, metadynamics,255 developed in 2002, combined stochastic or deterministic explorations with a taboo-like feature that prohibits the return to regions already visited. Besides a number of variants of the NEB methods, several schemes were proposed to extend these ideas to larger paths on the landscape; examples are the prescribed path method from 200545,256 and the pathOpt approach from 2013.257
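A one-dimensional caricature of the threshold idea: a random walk is confined below a sequence of energy lids (any trial move whose energy stays below the lid is accepted), and periodic quenches record which minima have become mutually reachable; the landscape and all parameters are again illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

def f(x):
    # Toy 1D landscape: two minima separated by a barrier, with the
    # starting minimum the higher of the two (illustrative).
    return (x**2 - 1.0)**2 - 0.5 * x

x0 = float(minimize(f, -1.0).x[0])          # quench into the higher minimum
E0 = f(x0)

for dlid in [0.2, 0.4, 0.6, 0.8]:           # sequence of increasing lids
    lid = E0 + dlid
    x, reached = x0, set()
    for step in range(20000):
        x_new = x + rng.normal(scale=0.1)
        if f(x_new) < lid:                  # accept any move below the lid
            x = x_new
        if step % 1000 == 0:                # periodic quench: which basin?
            reached.add(round(float(minimize(f, x).x[0]), 1))
    print(f"lid = E0 + {dlid}: minima reached {sorted(reached)}")
```

Lids below the barrier confine the quenches to the starting basin; once the lid exceeds the barrier, the second minimum appears in the list, which brackets the (energetic) barrier between the two basins.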
As far as the (local) density of states and free energies of complex energy landscapes were concerned, the so-called computational alchemy scheme was developed in 1990,258 shortly afterward followed by the WHAM (Weighted Histogram Analysis Method) approach in 1992.259 These methods gave rise to many further variants in the 2000s;260–262 for example, the ParQ method was introduced in 2005.263 The local densities of states could also be derived from the threshold27,38 and lid26,35,264 explorations, together with the local density of local minima. This was accompanied by the further development of approaches belonging, or related, to the parallel tempering class of algorithms employing several walkers or replicas, first presented in the 1980s as an example of a replica exchange Monte Carlo algorithm.187 Most of these methods try to deal with the issue of diverging correlations at low temperatures and the very long time scales required to move from minimum to minimum and from multi-minima basin to multi-minima basin, as they also appear in the context of an efficient sampling of the density of states. Examples are, e.g., multi-canonical algorithms,265–267 the extension of replica exchange methods to molecular dynamics simulations,268 and hybrid procedures that combine random walk based simulations with analytical statistical tools.269 Besides being used to study phase transitions, these algorithms have been successfully employed for the prediction of, e.g., protein conformations267 and for the general exploration of the landscape of proteins, with a particular focus on the folding process,268 and many new variants have been added over the past three decades, such as the generalized replica exchange method.270
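Stripped to its essentials, replica exchange amounts to a ladder of otherwise independent Metropolis walkers at different temperatures plus occasional neighbor swaps; a minimal sketch with an illustrative toy landscape and temperature ladder:

```python
import numpy as np

rng = np.random.default_rng(5)

def f(x):
    # Toy rugged 1D landscape (illustrative).
    return 0.1 * x**2 + np.sin(3.0 * x)

temps = np.array([0.1, 0.3, 1.0, 3.0])       # temperature ladder
xs = rng.uniform(-5.0, 5.0, size=len(temps)) # one replica per temperature

for sweep in range(5000):
    # Independent Metropolis updates within each replica.
    for i, T in enumerate(temps):
        x_new = xs[i] + rng.normal(scale=0.5)
        dE = f(x_new) - f(xs[i])
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            xs[i] = x_new
    # Attempt to swap a random pair of neighboring replicas: accept with
    # probability min(1, exp[(1/T_i - 1/T_j)(E_i - E_j)]), which
    # preserves detailed balance in the extended ensemble.
    i = rng.integers(len(temps) - 1)
    d = (1.0 / temps[i] - 1.0 / temps[i + 1]) * (f(xs[i]) - f(xs[i + 1]))
    if rng.random() < np.exp(min(0.0, d)):
        xs[i], xs[i + 1] = xs[i + 1], xs[i]

print("coldest replica ended near x =", xs[0], "E =", f(xs[0]))
```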
In parallel to the exploration algorithms, the 1990s produced a large number of new, or at least modified, ways to represent or depict the energy landscape. The lid algorithm26 could be employed to generate the full metagraph of a discrete energy landscape restricted to a pocket below a given energy lid, and the exponentially slow equilibration of such a landscape could be visualized by so-called equilibration trees, introduced in 1993.26 The availability of algorithms like the threshold algorithm or efficient saddle point search algorithms allowed the construction of a number of variations of the single- and multi-lump tree graphs (for a discussion of various types of trees, see Sec. II C). Building on the first realizations of trees for proteins in the late 1980s,25 large-scale trees were derived from the systematic study of various landscapes, e.g., for the traveling salesman problem in 1993,26 for spin glasses in 1994,264 and for periodic solids in 1996,27 and re-invented in 1997 and 1998 under different names, such as one-dimensional projection,30 disconnectivity diagram,28 and disconnectivity graph,29 respectively. Such graphs were later extended to take kinetic aspects into account, e.g., in the form of so-called weighted tree graphs in 2005.271 A modified approach to the inherent structures introduced about twenty years earlier was the concept of characteristic regions in 2001,272 where the landscape was analyzed, energy slice by energy slice, with respect to the distribution of local minima that could be reached from the points in a given slice via unbiased stochastic trajectories. Those points from which essentially all stochastic quenches reached the same minimum were termed basin points, while those whose quenches split among two or more local minima characterized the size of the transition regions on the landscape at a given level of energy.
An important new aspect was the study and subsequent representation of probability flows in the energy landscape. Besides the exhaustive flow simulation in restricted pockets of the landscape,26 molecular dynamics and Monte Carlo simulations were analyzed in this regard as early as 1994, when a so-called basin-transition matrix was constructed,273 following earlier root mean square distance (RMSD) based conformation-correlation diagrams,274 and the related molecule optimal dynamic coordinates (MODC) representation was presented in 1997.275 At the same time, the importance of visualizing and modeling the time evolution of chemical systems that exhibit a highly complex multi-minima landscape, such as a protein, a glass, or a spin glass, as a flow of probability instead of a walker on a single trajectory became an important paradigm in the field of protein folding.39,276–278
Using threshold runs, so-called transition maps between local minima (regions) were constructed in 1999,40 together with the so-called return barriers,40 i.e., entropic barriers that prevent a walker from leaving a minimum region even though it resides at energies above the lowest saddle points. Using threshold runs, stochastic minimizations, and low-temperature Monte Carlo runs, entropic barriers could be identified in 2003 as stabilizing metastable modifications.17 A completely different approach to the depiction of energy landscape aspects is the study of which sub-regions of the energy landscape are locally ergodic on what time scale. This led, in 2012, to phase diagrams as a function of observation time scale, which include not only the thermodynamically stable phase but also the phases that are metastable on the time scale of observation,280 and, in 2015, to conceptual visualizations of the local free energy as a function of observation time scale.279
3. Concepts
Of course, the development of new methods and the addition of new systems to investigate have also led to deeper thinking about the concepts and motivations regarding energy landscapes. Concerning the features of interest in landscapes, by 1993, the study of the local density of states began to extend beyond that of local minima,26 and saddle + minimum networks were proposed and (partly) studied at the same time.204,206,281,282 An interesting feature of the local density of states in many complex systems was its approximately exponential growth26,37 in the low-energy regions of the landscape (still containing myriads of local minima), which led to the concept of exponential trapping and its application in 1997283 and 1998,36 both as a crucial aspect of the glass transition35,36 and possibly of aging phenomena, and as a reason for the success of some global optimization algorithms283 in spite of their rather short run times compared to the size of the system.
A quite important concept introduced in 1998 was the idea of locally ergodic regions,15,16 which could be stabilized not only by energetic but also by kinetic and entropic barriers, as analyzed in 200317 and 2005.18 The concept of a locally ergodic region formalized the earlier implicit or explicit idea of stable states, originally introduced in 1980149 and later re-introduced in 1995.284 The energetic, kinetic, and entropic barriers surrounding local regions on the landscape could be deduced from the probability flows on the landscape17,285 and were interpreted as generalized barriers. From such generalized barriers, the idea of free energy barriers as a function of time scales emerged in 2016.286 The characteristic regions mentioned earlier272 illustrated the fact that in many systems, the minima basins have a region of attraction up to quite high energies, even on multi-minima landscapes.
As far as conceptual progress in global exploration is concerned, regarding minima searches, saddle point determinations, and estimations of the local density of states, we note that combining the old global optimization techniques, such as genetic algorithms, simulated annealing, bounce algorithms, etc., with efficient local minimization techniques resulted in many new algorithms, like basin hopping,44 genetic algorithms acting on minima,43 thermal cycling,231 minima hopping,287 etc., which are employed in many of today’s global optimizations in chemistry and physics. Similarly, the concept of multi-walker schemes, with the walkers either interacting (by explicit multi-walker moves such as cross-overs, or by repelling each other in order to increase the diversity of the explored landscape regions) or non-interacting, was exploited, e.g., in the Demon algorithm from 199246 and in particle swarm optimization from 1995.229 The inspiration gained from robotics research led to the motion planning RRT algorithm in 1998,232 while at least the name of the deluge algorithm from 1993228 is reminiscent of the biblical flood. In addition, the lid algorithm developed in 199326 inherited features from tree model studies and taboo searches. Similarly, the neural networks that went through another phase of development in the late 1980s and early 1990s, and then rose again with the availability of much larger computers in the 2010s, were obviously inspired by the human brain. Combining statistical mechanical considerations about transitions with robotics led to the concepts behind the T-RRT algorithm in 2011,236 just as combining the threshold concept with the bounce algorithm led to the threshold minimization algorithm,239 while combining threshold and RRT concepts resulted in the IGLOO (Iterative Global Exploration and Local Optimization) algorithm.288–290 Similarly, the behavior of ant colonies inspired the various versions of ant algorithms,237,238 and the taboo concept, together with an order parameter scheme to classify different chemical modifications, is at the basis of the metadynamics from 2002.255
Regarding the concepts related to statistical mechanics and dynamics on landscapes in general, we note the conceptual realization in the early 1990s that the probability flows are both non-trivial and highly relevant on complex energy landscapes.26,276 Furthermore, the importance of the existence of an essentially continuous set of time scales in the dynamics of complex landscapes was demonstrated by the example of the traveling salesman problem.26 In this context, one should mention that even among the general rugged or frustrated multi-minima landscapes, there can be qualitative differences as far as the (energy) barrier structure—leading to a classification scheme into landscape archetypes29—and the time scales are concerned, suggesting that, e.g., landscapes of glasses might exhibit a continuous spectrum of (generalized) barriers and associated time scales while large biomolecules might evolve in a more hierarchical fashion.291
This foreshadowed the concept of local ergodicity,15,16 where those regions on the landscape of a chemical system that are locally ergodic on a given observational time scale correspond to metastable modifications of the system that can exist on this time scale. An important new conceptual insight was the relevance of rare events, which was brought to attention in 1997.292,293 For systems like glasses or other amorphous compounds, or for combinatorial optimization problems, too, the concept of local ergodicity does not hold on typical time scales of interest. Instead, the concept of marginal ergodicity, introduced in 2009,294 often applies. Here, the escape and equilibration times of a nested sequence of regions are of the same order of magnitude, which helps to explain the aging behavior of glassy systems such as spin glasses,34,295 polymer glasses,151,296 or amorphous ceramics.297,298 Once the idea of local ergodicity is asserted or demonstrated on a given observational time scale, local free energies can be properly defined on such time scales, which is reflected in the discussion and construction of (local) free energy landscapes, starting in 2002.299–301
Finally, we turn to the conceptual development of the interpretation and understanding of energy landscapes. Here, the 1990s and early 2000s brought not only properly defined local free energies but also, by combining energetic, entropic, and kinetic aspects into generalized barriers, a systematic conception of local free energy barriers.17,18 One should note that this kind of generalized barrier is different from the concept of a “barrier” in the free energy as a function of an order parameter, which is often used in discussions yet usually does not really have a dynamical meaning. Other conceptual developments are the phase diagram as a function of observation time scale, which includes metastable phases,16,280 the concept of quakes in exploration trajectories on energy landscapes,302 and the equilibration tree, which represents the way more and more parts of a complex landscape equilibrate26,27 and provides much insight beyond the traditional listing of relaxation times derived from the (negative) eigenvalues of the Markov matrix describing the evolution of the system.
In this context, one should mention the issue of properly comparing the probability flow, which represents the average over (infinitely) many walkers as a function of time, with the single-walker trajectory that the system itself follows: how representative is the probability flow, given that the real system exists only as one realization, and how well is one justified in setting up a master equation dynamics based on combining the data derived from the flow, the local densities of states, and the detailed balance conditions? These conditions were first introduced for energy landscape (models) in 1991,200 and their presence or violation in various exploration algorithms has frequently been discussed.248,303,304 Such considerations are of particular importance when one wants to acquire optimal control of the probability flows on the landscape in order to access certain regions or even individual minima with high probability; a simple example of such an optimal control problem was analyzed in 2013.285
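Schematically, such a master equation dynamics for the occupation probabilities P_a(t) of the regions a of the landscape takes the generic form (standard notation, not specific to any of the cited implementations):

```latex
\frac{dP_{a}(t)}{dt} = \sum_{b \neq a}
  \left[ W_{a \leftarrow b}\,P_{b}(t) - W_{b \leftarrow a}\,P_{a}(t) \right],
\qquad
W_{a \leftarrow b}\,P^{\mathrm{eq}}_{b} = W_{b \leftarrow a}\,P^{\mathrm{eq}}_{a},
```

where the rates W can be estimated from the observed probability flows and the local densities of states, and the detailed balance condition on the right constrains these estimates.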
IV. TOWARDS THE FUTURE
Energy landscape explorations, ranging from the identification of local minima and their classification to the construction of ever more refined model descriptions of the landscape and the probability flows on the landscape, are by now an important tool both in chemistry and in other natural sciences like physics or biology, as well as in mathematics and essentially all fields where the phenomena and processes can be described or visualized as taking place on a cost function hypersurface over a space of microstates. However, there are still many possible extensions and applications that can be envisioned or are being actively pursued, and we will give a brief overview and discussion in this section, where the focus will be on the landscapes of chemical and physical systems. For clarity, we will group these developments into ten different categories: (a) natural extensions of the potential energy landscape for the description of chemical and physical systems, needed for applications beyond the isolated system case typically studied so far; (b) landscapes for systems that are not described on the atom level but on the continuum level; (c) master equation approaches and optimal control of probability flows on energy landscapes; (d) energy landscapes on unusual state spaces and their mathematical implications; (e) energy landscapes and quantum mechanics; (f) energy landscapes with fundamental limits of precision; (g) the use of machine learning to represent the energy landscape of a chemical system, or a cost function in general; (h) correlation with secondary landscapes; (i) application of energy landscape concepts to cost functions outside of chemistry and physics; (j) speculative developments, most of which might only be realized in the more distant future.
The future being, by definition, rather unpredictable, this outlook toward the future of energy landscapes cannot be a completely well-defined research program but will instead provide suggestions of possible avenues for future developments and applications, which follow in a more or less obvious way from both the trends in the history of landscapes and the tasks we are confronted with in our fields of research, in particular, chemistry and physics. Furthermore, each of these envisioned developments would deserve its own perspective and, therefore, we often can only provide a summary or wish list that indicates the general direction of this class of developments.
A. Extensions of energy landscapes in chemistry and physics
As summarized in Sec. III A 1, the concept of energy underwent substantial developments until the definition of the potential energy landscape as a function of the (generalized) coordinates or parameters describing the system of interest was reached, where in chemistry typically the coordinates of the atoms are employed for this purpose. Several aspects should be noted here regarding the landscape of potential energy: (i) the chemical system is considered isolated from the environment when defining its potential energy via the interactions among the participating atoms, (ii) the energy function is time-independent, and (iii) the state space is also assumed to be time-independent.
However, in the real world, none of these assumptions holds strictly true since we are always exposed to the environment. As a consequence, the energy function needs to be extended in an appropriate fashion in order to take this fact into account, especially if we want to use energy landscape concepts and techniques to analyze the behavior of chemical systems when interacting with the environment. Here, the second root of the definition of energy, the thermodynamic one, serves as guidance. This is particularly natural when we want to implement (static) thermodynamic boundary conditions, such as applied pressure or stresses in general, and the presence of electric and magnetic fields. Therefore, we generalize the potential energy of a system of N atoms, E_pot({R_i}), to a potential enthalpy H_pot({R_i}) = E_pot({R_i}) + pV. Already, this step contains several subtleties, especially if we consider a finite system such as a cluster, where the prescription of a volume V associated with the N atoms is ill-defined. The same holds true for the introduction of time-independent electromagnetic fields via adding terms like −E·P or −H·M to the original potential energy, where E and H denote the applied electric and magnetic fields, and quantities like the (induced) polarization P or magnetization M of the material due to the field are actually only properly defined on the continuum level (described by the spatial variable x) and not on the atomic level [described by the atom position vectors R_i]. Furthermore, the applied fields add their own energy terms to the system, but since these fields are assumed to be constant in time, the electromagnetic field energy can usually be ignored in the analysis.2 However, we must consider the fact that, e.g., a constant electric field can induce a current flow in the system, which either leads to a build-up of charge at opposite ends of the material or establishes a steady-state current through the system, which in turn creates a magnetic field with an associated magnetic energy. We refer to the literature for a more in-depth discussion of these complications.2,305
While the pressure is assumed to be isotropically constant throughout the system, the electrostatic potential difference that creates such an electric current is not the only example of a spatially imbalanced external boundary condition. Other important examples are differences in the applied external temperature, which induce a flow of heat and/or a variation of temperature throughout the system, or differences in the applied chemical potential, which would cause steady-state particle currents or macroscopic differences in composition. Yet another externally imposed boundary condition is an enforced current, i.e., we consider the system with a given fixed electric or particle current and identify the (crystalline) modification with the lowest combined potential energy and electric energy (cf. Fig. 11). Again, we refer to the literature for more discussions of these extensions of the energy landscape due to externally imposed thermodynamic boundary conditions.2,305
Another important consequence of the interaction with the environment is the fact that these interactions can evolve with time, in an abrupt or adiabatic, randomly fluctuating, or periodic fashion, just to name some of the most common possibilities. As a consequence, the energy landscape becomes time-dependent, with consequences similar to those one knows from the analysis of simple time-dependent systems in physics, but now applied to a system with an essentially macroscopic number of degrees of freedom. Here, the time scales involved are of great importance: the time scale on which the external conditions change must be compared with the equilibration and escape times associated with the locally ergodic regions on the landscape. For a more detailed discussion of some of these issues, we again refer to the literature.2,305
One particularly important consequence of the interaction with the environment for the existence and definition of the energy landscape of a chemical system is the possible variation in the number of particles that belong to the system of interest. Early structure prediction algorithms allowed for a variation in the number of particles by introducing a constant chemical potential term in the cost function/extended potential energy/enthalpy landscape.214 However, here the goal was to find the best candidates for a crystalline system envisioned to be synthesized in a typical solid state fashion (“shake and bake”), where the final composition(s) of the product(s) are commonly not known before the experiment has actually been performed. Therefore, in principle, the number of particles in the periodically repeated variable simulation cell needs to be varied during the global search of the energy landscape of the chemical system in order to explore the full range of possible crystalline structure candidates. In practice, performing such a search with varying numbers of particles did not prove to be efficient; instead, the searches were repeated using many different numbers and compositions of particles per simulation cell, while keeping the composition and the number of formula units per cell fixed for each search run. However, when modeling the interaction with an environment, e.g., finding the possible stable modifications of a compound at a given partial pressure of a gas like oxygen, or of a material inserted in a liquid medium, then a variability in the number of particles must be allowed (perhaps within limits). As a consequence, the microstates of the system can vary in “size,” i.e., the state space must be constructed analogously to the Hilbert–Fock space in quantum mechanics. Furthermore, we usually want to add a chemical potential term for each type of atom in the system. Again, we refer to the literature for a more detailed discussion of some aspects of such an extension of the energy landscape.2,305
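Schematically, and in analogy to the pV term discussed above, such an extended landscape over variable particle numbers N_i of atom type i might be written as (a sketch of the general form only; conventions differ between implementations)

```latex
\tilde{H}(\{\vec{R}\};\,p,\{\mu_{i}\}) =
  E_{\mathrm{pot}}(\{\vec{R}\}) + pV - \sum_{i} \mu_{i} N_{i},
```

where μ_i is the chemical potential imposed by the environment (e.g., via the partial pressure of oxygen), and microstates with different N_i now belong to one common, Fock-space-like state space.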
As a final extension, we note that for many purposes, one only studies the electronic ground state energy landscape, commonly called the Born–Oppenheimer surface. However, as we have mentioned earlier, the energy surfaces of the excited states are also of relevance and, therefore, one would want to perform analogous studies of the diabatic/adiabatic energy surfaces, possibly even associating two or more energy values with each arrangement of the atoms. As is well-known from studies of quantum phase transitions,306 those atom arrangements (or those variations of the external forces) for which the excited and ground state surfaces become degenerate should be associated with second order quantum phase transitions, or at least be candidates for this. From the perspective of energy landscape studies, the investigation of the barriers between different atom arrangements that belong to different electronic ground states would be of great interest.
Quite generally, while some relatively straightforward suggestions for such extensions exist, not much has been realized in practice so far, and it is clear that there are many open questions related to such extended energy landscapes.
B. Landscapes for systems in the continuum approximation
As we have seen in Subsection IV A, some of the quantities that contribute to the natural extension of the potential energy when interacting with the environment are not well-defined on the atomic level, in the sense that we cannot simply add a corresponding term to the interaction energy of each individual atom. Instead, these quantities are properly defined in the continuum approximation,147 where we consider the system as consisting of so-called material particles, each labeled by its position x (note that this position can change with time), each of which has a diameter on the order of 1–10 nm and contains a varying number of atoms. In order to assign each material particle a well-defined number of atoms and volume, etc., we also consider the system only on time scales larger than a continuum time scale. Furthermore, we assume that the properties of neighboring material particles vary only very slowly as a function of x, such that we can consider their properties, such as density, velocity, etc., to be continuous (and differentiable except at boundaries between different materials) on length scales larger than the diameter of the material particle itself. In this field description of the material, we no longer consider individual atoms or even individual rigid material particles, but we are dealing with the system on the mesoscopic level. This is particularly appropriate if we perform qualitative discussions of the behavior of a macroscopic system, such as trying to model the evolution of a material during a complex synthesis with many intermediary stages in terms of an energy landscape picture. Such discussions are very common now in the experimental chemistry and materials chemistry communities when trying to understand and plan the synthesis of a new material. Of course, the continuum approach is also suitable when studying continuum properties such as conductivities (electric, heat, particle, etc.) on an energy landscape level. A major complication appears when we try to introduce mesoscopic properties that belong to the continuum level while exploring the energy landscape on the atomic level in order to find stable modifications under, e.g., steady-state current conditions, since now, for each atom-level configuration, we need to identify the corresponding continuum-level configuration to which this atom-level configuration contributes. For more discussions of some aspects of this issue, we refer to the literature.2
Quite generally, working with the energy landscape on the continuum level allows us to study applications in spatially macroscopic systems (instead of the atom-level periodic approximants of crystalline systems, where usually the diameter of the periodic unit cell is only a minute fraction of the size of a single material particle). Examples include exploring the presence of macroscopic gradients or of different phases on a single energy landscape, including their phase boundaries. Furthermore, such continuum landscapes add realism for finite-size systems by allowing us to include surfaces and to apply spatial distributions/variations of thermodynamic parameters such as temperature, pressure, chemical potentials, etc.
Another extension is the order parameter landscape that is frequently employed in qualitative discussions of phase transitions. Here, the major problem is that the locally ergodic regions on the atom-level energy landscape are not restricted to particular values of the order parameter. This restriction already fails for the two individual phases being compared, and even more so for intermediary order parameter values, where one often finds the implicit claim that one is allowed to compute the free energy of the system for a given order parameter value, as if the set of microstates exhibiting such an order parameter value were locally ergodic and, therefore, allowed us to write down a well-defined free energy. However, on the continuum level, with spatial separation between different phases of the system and a proper definition of transition regions and probability flows among the stable phases, it would be possible to formulate generalized barriers that would replace these problematic "free energy barriers" based on inappropriate calculations of free energies as a function of order parameter values. In this context, we also note that, originally, the order parameter was a mesoscopic entity: while one can often compute an order parameter value for a single microstate according to some mathematical prescription, the spatially varying physical order parameter employed in, e.g., the Landau or Landau–Ginzburg formalism307 was a function of material particles and, therefore, implicitly assumed a (thermodynamic) average over all the microstates associated with such a material particle. For a further discussion of this issue, we again refer to the literature.2
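For reference, a minimal sketch of the functional underlying this point: in the standard Landau–Ginzburg picture, the spatially varying order parameter φ(x) enters a free energy functional of the textbook form

\[
F[\phi] = \int d^3x \left[ \frac{a(T)}{2}\,\phi^2(x) + \frac{b}{4}\,\phi^4(x) + \frac{c}{2}\,\lvert\nabla\phi(x)\rvert^2 \right], \qquad a(T) \propto (T - T_c),
\]

where φ(x) is to be read as a field over material particles, i.e., as an implicit thermodynamic average over the microstates associated with each material particle, and not as a property of a single microstate.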
One specific problematic kind of interaction with the environment is the introduction of temperature, which by its nature cannot be introduced as a simple parameter in a thermodynamics-guided extension of the energy landscape analogous to, e.g., pressure. The underlying reason is the statistical definition of entropy, which is only defined for locally ergodic regions, i.e., on time scales larger than the local equilibration time of the system in some parts of the landscape. If we were to define an entropy for each individual microstate in order to add the term −TS to the potential energy, the entropy associated with this single microstate would be identically zero. However, when we consider the system on the continuum level, it is possible, in principle, to define a spatially varying temperature T(x). This leads to some fundamental questions of the chicken-and-egg kind: On the one hand, it only makes sense to define a temperature for a material particle or a region on the landscape containing a certain number of material particles (the same argument also holds for the atom-level energy landscape, of course) if these correspond to a locally ergodic region such that statistical thermodynamic properties are well-defined. On the other hand, we usually define a locally ergodic region for a given temperature, as discussed in the literature.16 In the microcanonical ensemble, the temperature is a local variable that requires time to be established, i.e., it needs local ergodicity to make sense. This is also connected to the problem that temperature is often ill-defined for small isolated systems such as small molecules or clusters (with a given fixed amount of energy).
Finally, this also raises the question of how to construct a well-defined energy landscape purely on the continuum level. In principle, one can use the underlying atom-level landscape, as discussed in the literature,2,305 but this is so computationally intensive that it practically defeats the purpose. Ideally, one would go over to a field description of the landscape, perhaps similar to moving from atom-level classical mechanics in a 6N-dimensional phase space to the distribution function f(x, p; t) over the six-dimensional (x, p) space and its continuum analog. In such a field description, we would presumably have multiple phase values associated with the material particle at (x, p) (or just at x if the momentum variable is irrelevant), where each of these values corresponds to a different metastable phase the material particle can belong to. However, the proper and efficient formulation of a continuum landscape independent of the underlying atom-level landscape is still not fully solved. Of course, we can always define a phenomenological landscape on the continuum level, similar to the way we often formulate phenomenological laws for the evolution of macroscopic systems instead of working on the atomistic level. However, as is well-known, the transfer or handshake between the different stages of the various intermediary descriptions of the system in a multi-scale approach is not always a smooth one, either.
C. Probability flows and optimal control
Due to the complexity of the energy landscape for chemical and physical systems on the atomic level, trying to follow the individual dynamics of the system using standard MD or MC methods at the ab initio level can be quite difficult, even though the computational power available at present is quite impressive; perhaps well-trained neural networks can reduce the computational time needed to evaluate energies and their gradients without paying too high a price in accuracy. However, in many applications, the individual system trajectory is not of particular interest; instead, the probability flow represented by the statistical outcome of multiple trajectories should be used to describe the time evolution of the system. Examples might be the possible transformation pathways of molecules and clusters in space, folding transitions, or the evolution of a system under controlled variation of external conditions such as pressure and temperature.
The concept of a master equation description of the dynamics in models of energy landscapes is an old one, but the applications have mostly involved artificial landscape models,34 very restricted sub-regions of a complex landscape,26 or greatly simplified landscapes of realistic systems.285 However, it is expected that future studies will construct master equations for real systems from data about the locally ergodic regions and their local densities of states, combined with the measured local probability flows from global exploration algorithms.19 These will be not only qualitatively but also quantitatively correct and will allow us to perform optimal control analyses that result in the optimal control of, e.g., transformations and syntheses of various metastable compounds in a given chemical system, as envisioned in the concept of a theory of solid state synthesis.308 In this context, we note that while achieving optimal control of a full synthesis process on the atomic or mesoscopic level would be the final goal, one would usually be quite delighted if one could propose or construct a successful macroscopic synthesis procedure for a predicted target compound that has not yet been synthesized, even if it is not the most efficient one. While in organic chemistry, systematically constructing such synthesis routes to a given target molecule (often called retro-synthesis) has been possible for a long time due to the exquisite control afforded by the tools of organic chemistry,309–311 the comparative lack of experimental control in solid state synthesis has prevented a similar level of success.214 However, recently, there have been many efforts to systematically propose such synthesis routes via machine learning, e.g., neural networks trained on the large number of syntheses available in the literature together with the results of syntheses performed in a more or less automated fashion,312,313 leading to a (hopefully) virtuous feedback cycle of network training and experiments plus ab initio calculations.
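As a minimal illustration of such a master equation approach, the sketch below propagates occupation probabilities over a handful of locally ergodic regions. The region energies, the symmetric saddle (barrier) energies, and the simple Arrhenius-type rates are all hypothetical placeholders; a real application would extract these quantities from global exploration data.

```python
# Minimal master-equation sketch over locally ergodic regions.
# Energies, saddle energies, and the Arrhenius rate ansatz are hypothetical.
import numpy as np
from scipy.linalg import expm

E = np.array([0.0, 0.2, 0.5])            # region energies (eV), toy values
B = np.array([[0.0, 0.8, 1.2],           # symmetric saddle energies B[i, j]
              [0.8, 0.0, 0.9],           # between regions i and j (eV)
              [1.2, 0.9, 0.0]])
kT = 0.05                                # thermal energy (eV)

K = np.exp(-(B - E[:, None]) / kT)       # rate of leaving i towards j
np.fill_diagonal(K, 0.0)
W = K.T - np.diag(K.sum(axis=1))         # generator: dP/dt = W @ P

P0 = np.array([1.0, 0.0, 0.0])           # system starts in region 0
for t in (1.0, 10.0, 100.0):
    print(t, expm(W * t) @ P0)           # occupation probabilities P(t)
```

By construction, the columns of W sum to zero (probability is conserved), and the symmetric saddle energies enforce detailed balance, so P(t) relaxes toward the Boltzmann distribution over the regions.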
Commonly, these probability flows are analyzed for constant temperature, i.e., in the canonical ensemble. However, one can also measure probability flows for constant overall energy in the system, i.e., in the microcanonical ensemble. In this latter case, applications would be in modeling the time evolution of a thermally isolated system, in particular small chemical systems like molecules or clusters that have been provided with some finite energy via, e.g., the absorption of a single photon, for which the required energy landscape information can be obtained.38 Again, one can construct a master equation for such a system (for details, see footnote 150 in Ref. 2), where the local densities of states employed are computed via the appropriate combination of the local density of locally ergodic regions on the potential energy landscape and the corresponding kinetic energy.
However, there are open basic questions with any master equation approach to the description of the time evolution of a chemical system. If we are dealing with a macroscopic number of, e.g., molecules, we can assume that each of them follows its own stochastic trajectory, but that the average over the evolution of all molecules is well-represented by the probability flow. The same should apply to the time evolution of a solid system, as long as the self-averaging hypothesis holds. However, what should be the correct interpretation if we are dealing with a single molecule, such as a protein, or with a solid on time scales where the self-averaging hypothesis does not apply? One could simply dismiss any master equation description in such a case and demand that one perform a single-molecule MD simulation. However, perhaps, as in quantum mechanics, the probabilistic interpretation of the outcome of a master equation based time evolution might still be applicable, or at least useful, for such a system, encapsulating, e.g., the (multi-walker) model of the protein folding process via an ensemble of structures in contrast to the (single-walker) description where folding occurs by following a specific sequence of intermediary structures.278
D. Energy landscapes on unusual state spaces
From a mathematical point of view, any combination of a state space, a neighborhood definition, and a scalar map from a microstate to a cost or energy qualifies as an energy landscape. Usually, the state spaces of such landscapes belong to two categories, continuous ones and discrete ones, where the n-dimensional continuous ones have a "natural" neighborhood definition corresponding to that of R^n or a subspace thereof. Here, we note that the map to the energy need not be a continuous function, in principle, but for most applications, the hypersurface of the energy is a continuous and often even differentiable function for systems with a natural continuous state space.
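To make this mathematical definition concrete, the following sketch encodes a landscape as the triple of state space, neighborhood (moveclass), and energy map for a toy discrete system; the class name, the bit-string states, and the target-matching energy are purely illustrative and not taken from any library.

```python
# A landscape as the triple (state space, neighborhood, energy map),
# sketched for a toy discrete system of bit-strings.
import random

class Landscape:
    def __init__(self, energy, neighbors):
        self.energy = energy        # microstate -> float
        self.neighbors = neighbors  # microstate -> list of microstates

# Example: states are bit-tuples, neighbors differ in exactly one bit,
# and the energy counts mismatches to a hypothetical target string.
target = (1, 0, 1, 1, 0)
def energy(s):
    return sum(a != b for a, b in zip(s, target))
def neighbors(s):
    return [s[:i] + (1 - s[i],) + s[i + 1:] for i in range(len(s))]

L = Landscape(energy, neighbors)
s = tuple(random.randint(0, 1) for _ in range(5))
print(s, L.energy(s), L.neighbors(s))
```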
Of course, when applying global optimization algorithms, one does not care too much about the natural neighborhood definition and instead defines an efficient moveclass, which can include jumps to far-away points on the landscape. Such points would only be reached on very long time scales by a time evolution according to the laws of physics, but these jumps are motivated by our chemical or physical intuition about the landscape, e.g., our knowledge that in many chemical systems atoms can substitute for one another and that, therefore, potentially better minima can be generated efficiently by exchanging atoms in addition to the usual small displacement moves.
Such jump moves can lead to rather unusual neighborhood constructions on the state space of the landscape, in addition to those that are already present due to certain approximations made for the system. For example, for solids, both crystalline and amorphous, one often employs periodic approximants to handle the essentially infinite number of atoms that belong to a real solid. Such a periodic approximant consists of a (variable) periodic unit cell containing a relatively small number of atoms. Due to the periodicity of the system, the three coordinates of the atoms with respect to a given set of unit cell vectors vary between 0 and 1, where the atom position for the value 1 is the same as for the value 0. As a consequence, the coordinate of atom i along the direction of cell vector a_j has the topology of a circle in the state space, and this applies for every atom along every cell-vector direction. Furthermore, the same infinite crystal can be described by an infinite number of (integer) combinations of the three cell vectors (assuming that the starting set of cell vectors describes a primitive unit cell), even with the restriction that the volumes of all these unit cells should be equal. This again leads to a complicated topology of the state space we are searching if we assign all these cell vector combinations plus the associated relative atom positions to the same infinite crystal and, therefore, allow large jumps in the state space.
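A minimal sketch of these two topological features for a periodic approximant, assuming fractional coordinates and an arbitrary integer recombination matrix (both the move and the matrix are illustrative choices):

```python
# Circle topology of fractional coordinates and equivalent cell descriptions
# in a periodic approximant; numbers and the recombination matrix are toys.
import numpy as np

def move_atom(frac_coords, i, delta):
    """Displace atom i in fractional coordinates; wrap into [0, 1)."""
    new = frac_coords.copy()
    new[i] = (new[i] + delta) % 1.0      # position 1 is identified with 0
    return new

coords = np.array([[0.95, 0.50, 0.10]])
print(move_atom(coords, 0, np.array([0.10, 0.0, 0.0])))  # x wraps to ~0.05

# The same infinite lattice in a different, equally valid cell description:
cell = np.eye(3)                          # rows are cell vectors
M = np.array([[1, 1, 0],
              [0, 1, 0],
              [0, 0, 1]])                 # integer matrix with det = +1
print(np.linalg.det(M), M @ cell)         # same crystal, same cell volume
```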
Besides this very important topological aspect, which everyone encounters when modeling solid materials using periodic approximants, we can study systems on unusual state spaces such as a Möbius strip (cf. Fig. 12) or its higher-dimensional analogs. Another route would be to investigate atoms on general orientable or non-orientable curved spaces, e.g., on a sphere or hyperboloid,314,315 or low-dimensional systems in general.316,317 Furthermore, one could study more abstract systems, where the atoms can have more than three spatial coordinates. Such landscapes might be inspired by the additional dimensions in, e.g., string theory, which are rolled up on very small length scales; here, the periodicity in these extra dimensions is not an artificial one, as was the case for the periodic approximant to the nearly infinite solid, but is part of the fundamental description of the space we live in. Quite generally, topological features of landscapes are a topic of research,318 and there might be interesting results awaiting their discovery. However, besides state spaces with integer local dimensionality, we should also consider and study state spaces that exhibit, e.g., a fractal structure. In some ways, they can probably be treated like a discrete state space for practical purposes, but there might well be fundamental aspects of energy landscapes on such systems that differ from those defined over standard continuous or discrete state spaces. After all, beyond mathematics, fractals have appeared in the description of a large variety of fundamental concepts in physics, biology, and chemistry, such as phase transitions or stochastic processes,319 which might be transferable to an energy landscape investigation.
Other state spaces of chemical or physical systems might be based on reciprocal space, i.e., wave vector space, instead of the position space of atom arrangements. For systems made up of atoms, this would usually not be a very useful choice of state space, but it might be suitable for systems where the wave description is more appropriate than the particle description, e.g., in the case of the energy landscape of photonic systems or the energy landscape of interacting vibrational modes.
E. Energy landscape of quantum mechanical systems
When considering unusual state spaces, one particular question looms especially large: to what extent can we combine the energy landscape concept with the curious physics of fully quantum mechanical systems? When studying the dynamics of a quantum system, such as a spin system, on its energy landscape, we usually pretend that after each “move” in state space we perform a measurement and, therefore, can associate a well-defined “quasi-classical” energy value with each microstate. However, this approach does not work for the treatment of quantum systems that exhibit non-classical behavior during their time evolution. To clarify the issue, consider the table below (Fig. 13), where we show the different stages of the “evolution” of the energy landscape concept from the (potential) energy landscape of a purely classical mechanical system to a fully quantum mechanical landscape. We distinguish four different stages: (a) a system of N atoms as classical entities with a classical energy as the sum of the kinetic and potential energy of the system; (b) a system of N atoms as classical entities but with a quantum mechanical energy evaluation of the electronic ground state; (c) a quantum system, where the microstates correspond to the eigenstates of the quantum mechanical Hamiltonian and, therefore, the energy is the eigenenergy of a given microstate while the moveclass corresponds to jumping to another eigenstate—visualized as a short quantum mechanical evolution caused by a “fictitious” perturbative Hamiltonian due to, e.g., the environment—followed by a measurement that forces the system into one of the eigenstates with a probability according to the perturbative time evolution; and (d) a full quantum system, where the state space consists of all possible wave functions or density matrices of the system, and the energy corresponds to the expectation value of the Hamiltonian for a given wave function/density matrix, where the time evolution of the wave function is caused by the Hamiltonian plus possible (small) perturbative terms acting on various time scales.
Each of these levels of description for the energy landscape of, e.g., a chemical system has its own strengths and weaknesses or inherent problems. To gain some insight, we consider these landscapes with respect to five different aspects: (i) the choice of energy function, (ii) the distinctness of the microstates, and (iii) the natural (physical) neighborhood implied in the choice of state space. Furthermore, we consider the kinds of time evolution we can realize for such a landscape, i.e., (iv) the essentially deterministic classical molecular dynamics and wave function time evolutions based on the proper dynamical laws of Newtonian mechanics and Schrödinger wave mechanics, respectively (we do not consider relativistic quantum mechanics or the alternative Heisenberg time evolution since we are focusing on fundamental concerns), and (v) the stochastic time evolution realized in a Monte Carlo simulation, which should automatically yield the appropriate statistical mechanical averages for locally ergodic regions.
Our intuition about energy landscapes and derived concepts such as stable or locally ergodic regions, transition regions, etc. is shaped by the energy landscape of a purely classical system. Therefore, it is perhaps no surprise that in case (a), the microstates of the system are well-defined and distinct and exhibit a natural continuous neighborhood, and both MD and MC time evolutions show no problems, at least as long as one employs a physical moveclass—without jump moves—based only on the natural neighborhood of the state space. The one problem is, of course, the classical energy function, which can only be an approximation of the true quantum mechanical energy function, together with, in principle, the classical laws of dynamics, which do not include, e.g., tunneling processes. When turning to case (b), we still have a well-defined and distinct set of microstates and the same well-defined physical neighborhood carried over from the purely classical case since the atoms are still considered essentially classical particles. As far as the energy is concerned, it is well-defined as the electronic ground state energy for a given atom arrangement, but only within the limits of the Born–Oppenheimer approximation (BOA). As a consequence, typical quantum mechanical contributions such as the zero point energy of the phonons in the system are ill-defined except for those microstates that correspond to minima of the BOA energy, and higher-order corrections to the true quantum mechanical energy, such as electron–phonon interactions, cannot easily be defined for a single microstate. Regarding the issue of considering only the electronic ground state, the inclusion of the diabatic landscape can help, but this is still not a panacea. However, within the limits of the Born–Oppenheimer approximation, the time evolution via ab initio molecular dynamics is more or less well-defined, and similarly, a stochastic evolution can be implemented without problems. The limitations and problems with this type of energy landscape thus lie in the lack of full quantum mechanics, just as in the purely classical case (a).
This changes when we consider the energy landscape of a quantum system, where we define the microstates of the system as the eigenstates of the main time-independent Hamiltonian of the system. For a well-defined Hamiltonian constant in time, these eigenstates can be chosen as an orthogonal set and, therefore, for a given choice of eigenstates, they are both well-defined and distinct. However, in case (c), where each microstate corresponds to exactly one of these eigenstates, the quantum mechanical time evolution is trivial, as can be seen by directly solving the Schrödinger equation with the initial condition of the system being in a given eigenstate: the time evolution just corresponds to a change in the phase of this eigenstate, such that the system remains in the same eigenstate forever. If we assume interactions with the environment analogous to, e.g., a temperature or the interaction with a probe particle, we are faced with the problem of selecting an appropriate interaction Hamiltonian that perturbs the system and, therefore, produces a linear combination of the eigenstates after some time has passed. Otherwise, we cannot compute the (deterministic) quantum mechanical time evolution according to the Schrödinger equation for the system.
However, even assuming that we can select an appropriate perturbation Hamiltonian, we note that during this time evolution, we are moving outside the state space of the system since, by our definition of the allowed distinct microstates, linear combinations are not permitted. Therefore, for such a system, a deterministic time-evolution is not possible inside the closed state space. Still, we can deal with this problem, at least in part, by demanding that after such a short time evolution, we perform a (classical) measurement of the energy with the main time-independent Hamiltonian and collapse the linear combination into one of the eigenstates. This procedure ensures that the time evolution move stays within the originally defined state space. The problem is, however, that we have no deterministic criterion for selecting the eigenstate into which the wave function collapses, and such a criterion cannot exist due to the quantum nature of the system. Therefore, going back to the statistical Born interpretation of quantum mechanical measurements, we would perform a random selection of the final eigenstate according to the probability distribution given by the squares of the coefficients of the linear combination of the eigenstates in the time-evolved wave function.
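The sketch below illustrates one such "move" for a toy three-level system: a short coherent evolution under an assumed perturbation Hamiltonian, followed by a Born-rule collapse onto an eigenstate of the main Hamiltonian. The matrices, time step, and energies are all hypothetical toy inputs.

```python
# One "move" in case (c): perturbative evolution, then Born-rule collapse.
# H0 (diagonal) and the perturbation V are toy matrices, dt is arbitrary.
import numpy as np

rng = np.random.default_rng(0)
E = np.array([0.0, 1.0, 2.5])            # eigenenergies of H0
H0 = np.diag(E)
V = 0.2 * np.array([[0, 1, 0],           # hypothetical perturbation
                    [1, 0, 1],
                    [0, 1, 0]])

def quantum_move(n, dt=0.5):
    """Start in eigenstate n of H0, evolve under H0 + V, then measure H0."""
    psi0 = np.zeros(3, dtype=complex)
    psi0[n] = 1.0
    w, U = np.linalg.eigh(H0 + V)        # spectral decomposition of H0 + V
    psi_t = U @ (np.exp(-1j * w * dt) * (U.conj().T @ psi0))
    p = np.abs(psi_t) ** 2               # Born probabilities in the H0 basis
    return rng.choice(3, p=p / p.sum())  # stochastic collapse onto eigenstate

print([quantum_move(0) for _ in range(10)])  # sequence of post-measurement states
```

Since H0 is diagonal, its eigenbasis is the computational basis, and the squared amplitudes of the evolved wave function directly give the collapse probabilities mentioned above.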
In contrast, a time evolution that is stochastic from the outset, i.e., a Monte Carlo simulation on the microstates using the Metropolis criterion, is well-defined for the case (c) of a quantum system with a state space of distinct microstates that are eigenstates of the Hamiltonian of the system. Therefore, we can define local ergodicity and locally ergodic regions in the energy landscape of the system. However, the fact that we are leaving the underlying state space when trying to perform a time evolution based on the laws of quantum mechanics is a basic shortcoming of this type of energy landscape in quantum systems. Similarly, the choice of the neighborhood of a given microstate needs to be introduced either by hand or according to an assumed perturbative Hamiltonian, as mentioned earlier. Furthermore, fundamentally quantum mechanical aspects of a system, such as entanglement, are not accessible for such a state space and the associated energy landscape.
Therefore, we might want to consider possible extensions of the energy landscape concept that might be able to incorporate the fully quantum mechanical nature of the system. Probably the most straightforward way would be to expand the state space to contain all possible wave functions of the system, i.e., all linear combinations of the eigenstates of the Hamiltonian, which is possible since the eigenstates form a complete set of functions. The energy we might assign to such a microstate would be the expectation value of the Hamiltonian for this wave function. However, while in this case (d), the microstates of the system are well-defined and the space is now large enough to encompass any time evolution based on physically reasonable interaction Hamiltonians, these microstates are no longer distinct since they are no longer orthogonal to each other and no longer form a minimal complete set of functions. This has some unpleasant consequences for a naïve transferal of the classical energy landscape procedures to such a system. For example, if we were to compute the time average of the energy expectation value during the time evolution of the wave function, even for some simple systems with only a few eigenstates, and compare this with the ensemble average, as would be required for the definition of local ergodicity, we would run into problems insofar as these averages do not agree with those we get from standard statistical mechanics, in contrast to case (c) described earlier.
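A toy computation of this discrepancy, under the assumption that the system evolves coherently under the main Hamiltonian alone (all numbers are illustrative):

```python
# Time average of <H> for a fixed superposition vs. the canonical ensemble
# average over the eigenstates; toy eigenenergies and coefficients.
import numpy as np

E = np.array([0.0, 1.0, 2.5])             # eigenenergies
c = np.array([0.8, 0.6, 0.0])             # fixed expansion coefficients, |c|^2 sums to 1
time_avg = np.sum(np.abs(c) ** 2 * E)     # <H> is constant under coherent evolution

kT = 1.0
w = np.exp(-E / kT)
ensemble_avg = np.sum(w * E) / np.sum(w)  # canonical average over eigenstates

print(time_avg, ensemble_avg)             # generally different values
```

For a superposition evolving under the Hamiltonian alone, the expectation value of the energy is constant in time and fixed by the initial coefficients, so it cannot in general agree with the canonical ensemble average, which is exactly the disagreement described above.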
We note that the main difference in the time evolution between cases (c) and (d) was the demand that we frequently, usually periodically, perform a measurement with the main Hamiltonian of the system, which collapses the wave function into one of the eigenstates. Furthermore, the ensemble averages in case (c) were taken over the one set of eigenstates that were selected as the microstates of the system, while for case (d), we would perform such an average over all wave functions in the state space. Now, if we perform such wave function collapsing measurements and restrict the ensemble average to only one complete set of (orthogonal) eigenstates, then the statistical mechanical averages, both over time and over the ensemble, are in agreement (by construction) with the result we expect from statistical mechanics for our system.
However, such an effective reduction of the state space to the one in case (c) is rather unsatisfactory since it eliminates the conceptual advantage of using all possible wave functions as microstates with the probabilistic energy function given by the expectation value of the energy. In particular, we can no longer describe entangled states and consider their equilibration as a function of time. One might suggest that perhaps the density matrix description would be more general and allow us to deal with the feature of the quantum mechanical state space that multiple eigenstates can contribute to the state of the system. However, this does not appear to be really a solution, since the probabilities employed in the density matrix description already imply the presence of a (statistical) mixture of the eigenstates and no longer allow for the specific quantum mechanical features of entangled states and other features of the coherent evolution of a wave function. After all, since the single wave function description of the system is a special case of a density matrix and since this wave function already exhibits the problems mentioned earlier regarding various averages, it is doubtful that the density matrix formalism would provide a simple solution to the problem.
While case (c) is a popular way to address the energy landscapes of quantum systems when trying to take the full Hamiltonian into account, and the naïve case (d) clearly has problems, it is an important open question on the conceptual level whether an energy landscape picture is appropriate for fully quantum mechanical systems, including phenomena like entanglement or coherence of states of the system. Furthermore, if such a combination of quantum mechanics and energy landscapes is possible at all, how far can we go with an energy landscape description, and what is the most efficient self-consistent way to do so that does not contradict other fields of physics such as statistical mechanics? However, if such a description can be found, then the apparatus of statistical mechanics and thermodynamics can be brought to bear on systems that are based on such complex quantum states, such as those involved in quantum computing320,321 or quantum encryption, to name just some current areas of research.
F. Noisy energy landscapes
Another question of interest concerns the presence of noise in the definition of the energy landscape, in particular in the definition of the energy function or in how accurately we can localize the system in state space (cf. Figs. 14 and 15, respectively). As we have seen in Subsection IV E, trying to combine full quantum mechanics with the energy landscape concept suggests the possible use of the expectation value of the energy as the function that assigns an energy value to a microstate. This kind of probabilistic energy reflects the probability distribution over the eigenstates of the Hamiltonian that are contained in the microstate when visualized as a general wave function with certain coefficients, as in case (d); unless the microstate equals a single eigenstate of the Hamiltonian, this implies a kind of uncertainty in the value of the energy.
However, there are other instances where we would expect the energy to be known only to a certain degree. In such cases, we can associate a noise level with the energy landscape, which in practice can often be represented by a noise temperature indicating that we should not extrapolate the results of a landscape study at various temperatures down to T = 0. Here, one should distinguish between the noise limiting the accuracy of the energy landscape itself and the accuracy of the "global minimum energy" solution. "Real" noise is incorporated in the definition of the landscape: in applications, we encounter finite process times and, therefore, do not reach the infinite time limit. We are thus faced with external fluctuations of parameters of the landscape, or fluctuations in the thermodynamic boundary conditions, which we can neither average out nor ignore by assuming that we can follow the ideal "noiseless" landscape during the simulation of the process.2 We also encounter noise due to limitations on our ability to evaluate the cost or energy function, which can be caused by a lack of information about the system: e.g., the number of atoms in the cluster we want to model is not known precisely, or the composition of the system is not fully established. Similarly, we might not know the correct parameter values in a model Hamiltonian or be dealing with variations in the parameters of an empirical interaction potential between atoms, i.e., the model itself is fine, but we lack information about the precise values of the parameters in the potential.
In those cases, where we can assign a quantitative measure to our inaccuracy in the energy, we can require that all simulations are performed at temperatures above this noise temperature or that all atom configurations that are within a certain energy range from the global minimum are treated as candidates of equal probability for describing the thermodynamically stable phase. The situation is somewhat more difficult if the inaccuracy resides in the microstates: In the usual simulations we, by construction, can define our microstates exactly and, therefore, at any moment in time, the walker is at one specific microstate, even if the true energy might not be known exactly. However, this does not apply when we try to model the energy landscape not via an abstract model but based on experimental data where we correlate measured energies with measured microstates to construct the landscape and, therefore, have to take the uncertainty in the microstate measurement into account. As a consequence, the simulation of walking on such an energy landscape would consist of movements on a set of basic “finite-size states” that correspond to the ideal microstates—perhaps defined via the average of many measurements—plus a finite region in state space around them.
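A minimal sketch of such a noise-aware treatment, in which all structure candidates within the noise window of the apparent global minimum are retained as equally plausible; the candidate names, energies, and noise level are hypothetical:

```python
# Treating candidates within the noise window of the apparent global
# minimum as equally plausible; energies and sigma are toy values.
import numpy as np

rng = np.random.default_rng(1)
true_E = {"alpha": -3.05, "beta": -3.02, "gamma": -2.60}  # hypothetical (eV)
sigma = 0.05                                              # energy noise level (eV)

measured = {k: v + rng.normal(0.0, sigma) for k, v in true_E.items()}
E_min = min(measured.values())
candidates = [k for k, v in measured.items() if v - E_min < 2 * sigma]
print(measured, "->", candidates)   # typically alpha and beta remain degenerate
```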
Such a lack of precise information about the value of the energy is actually encountered quite frequently, even for ab initio energy landscapes in inorganic crystal structure prediction, where solutions found with different basis sets, quasi-potentials, density functionals, etc. exhibit different energy rankings and, therefore, should be considered equally likely candidates for the true global minimum structure in the experiment. The same often applies to the ranking of candidates in molecular crystal structure prediction. In this context, we point out that these candidates with very similar energies can be quite different regarding their structure, such that typical measures of structure similarity do not correlate with the energy differences between the structures.322 Only if we study a sub-region of the landscape that is restricted to an ideal crystalline modification and all the defect structures that can be derived from it does such an energy difference to the global minimum energy of the sub-region correlate with the similarity measure. Other classes of predictions where such an energy difference vs structure difference correlation may apply are systems that exhibit a landscape with a multitude of amorphous or disordered structures with essentially equal energies and no clear qualitative structural differences on a mesoscopic level, e.g., as seen in the radial distribution function of the material. Examples would be glass formers when we consider the structures in the large part of the landscape away from the regions with crystalline modifications, clusters without high-symmetry global minima, or large molecules. In these cases, the structures involved are quite different on the atomic level, but there are no simplifying characteristics that allow us to group the (e.g., amorphous) structures into subsets and, therefore, for small energy differences, we may or may not find correlations with various similarity measures based on, e.g., structural features.
Other causes of fluctuations or uncertainties in the value of the energy appear when we deal with systems where either parameters of the landscape or the external boundary conditions vary with time, as mentioned before. In particular, when the fluctuations are fast compared to the equilibration and escape times of candidates for locally ergodic regions and the size of the fluctuations is small, introducing noise approximations into the landscape model can be very helpful. The same applies when we want to study the thermodynamics of small finite systems far away from the thermodynamic limit; note that this does not include systems where we employ periodic approximants with only a few atoms, i.e., a few degrees of freedom, to represent an infinite crystal, but refers to small clusters, molecules, or surfaces. In that case, phase transitions, e.g., are not discontinuous changes of the structures at a single temperature; instead, we deal with a smeared-out transition where the transition temperature lies within a temperature interval, and we observe jumps between the two competing locally ergodic regions on the landscape on finite time scales.
G. Energy landscapes and machine learning
We have already mentioned that there has been a revival in the use of machine learning in the field of structure prediction (cf. Fig. 16). Separately or in combination with this, neural networks are trained either to provide parameters for (empirical) model potentials for the purpose of fast energy computations or to yield a representation of the energy landscape directly via the optimized parameters inside the neural network together with a suitable choice of input variables for the network. Exploiting the large amount of information generated during the neural network driven prediction of (stable) structure candidates, one can, in principle, do both: let the network directly suggest such candidates, followed by a structural relaxation at the ab initio level, and subsequently generate efficient interatomic potentials based on the information obtained during the analysis and relaxation of these structure candidates.323 Here, we just want to mention a couple of questions that need to be addressed in the future. If the network provides parameters for analytical forms of model potentials, then it is straightforward to also compute the gradient or higher derivatives of the energy function, which are important for those search algorithms that rely on gradient minimizations instead of, or in addition to, stochastic minimizations. The same holds for the Hessian matrix of second derivatives, which is important for certain minimization routines and for the analysis of the minima identified, and whose eigenvalues yield a first estimate of the local density of states near a local minimum: its computation can be performed analytically and does not require additional information from the neural network.
The situation is different if we use the neural network as a black-box, which generates a value for the energy without any model potential in the background. In such a situation, we either need to also train the network to compute values for the gradient and higher derivatives at each point in state space or add a discrete minimizer routine; the latter, however, is a rather unsatisfactory situation since it should be much faster to have the network automatically generate the first and second derivatives based on its training. However, it may well be that the effort to train a network to also compute gradients that are consistent with the single point energies could be quite high and might involve trade-offs in accuracy between the energies and the gradients.
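One common way to realize such network-generated derivatives in practice is automatic differentiation through the network itself, rather than separate gradient training or finite differences. The sketch below (using PyTorch, with a tiny untrained network as a stand-in for a trained model) shows how first and second derivatives that are exactly consistent with the single-point energies can be obtained; this consistency says nothing, of course, about their physical accuracy outside the training region.

```python
# Gradients and second derivatives of a black-box neural-network energy via
# automatic differentiation; the tiny network is a placeholder, not a
# realistic interatomic potential.
import torch

net = torch.nn.Sequential(                # maps 3N coordinates -> energy
    torch.nn.Linear(9, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))

pos = torch.randn(9, requires_grad=True)  # 3 atoms x 3 coordinates
energy = net(pos).squeeze()
forces = -torch.autograd.grad(energy, pos, create_graph=True)[0]
print(energy.item(), forces.detach())

# Differentiating a force component once more yields second-derivative
# (Hessian-row) information, again consistent with the energy by construction.
hess_row = -torch.autograd.grad(forces[0], pos, retain_graph=True)[0]
print(hess_row)
```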
Apart from these issues, the main concern is, of course, the size of the network and the amount of training needed to generate a good energy function in the first place. Currently, much work is being invested into generating such well-trained networks, where it appears to be important to design a clever set of input variables describing the atom configuration. Furthermore, there are suggestions that relatively small numbers of neural network parameters might be sufficient, as well as only a moderate amount of training data, i.e., single point ab initio calculations for the system of interest, in order to achieve quite good agreement between the energies computed by the network and those calculated with the original ab initio code.326 However, just as was noted for general global optimizations at the ab initio level, where for many random atom configurations encountered during the search no convergence was achieved,327 we do have a problem once the atom configurations suggested to the network reside outside the region of state space employed for the training set. In this case, especially if the network does not parameterize a robust, physically sensible empirical potential but directly maps an atom configuration to an energy, there is no control over the neural network, and one can be lucky or very unlucky with the outcome of the energy calculation. This is similar to the case of empirical potentials in the past, where one could easily find potentials in the literature that were excellent in the neighborhood of a local minimum but terrible farther away. Extreme cases are molecular modeling potentials that enforce an attractive harmonic potential between two atoms that were close at the beginning of the calculation but were then moved far apart (e.g., during a randomization jump move) to generate a seemingly nice and chemically reasonable structure; after this move, these atoms should have no interactions at all, but they suddenly exhibit very high energies due to the enforced harmonic bond. Another example is a Born–Mayer potential between oppositely charged ions, which provides only a finite energy barrier against the Coulomb-driven collapse of the system, with the system releasing an infinite amount of Coulomb energy as the cation collapses onto the position of the anion.
Especially for the modeling of phase transitions involving fluid (liquid, gas) phases and solid phases, or interactions between a surface and a surrounding medium, one can expect problems to arise. Quite generally, it will be an important step to analyze how such a neural network potential energy deals with modeling the dynamics of a system, including both atom movements and charge transfers. In this context, another basic issue is the transferability of such potentials, i.e., can we employ the same neural network potential for modeling an organic molecule on, e.g., a Ti-surface, a TiO2 surface, and also on a layer of solid oxygen? Addressing such questions and establishing the reliability of the outcome of such simulations will be important tasks in the field of neural network modeling and the representation of energy landscapes.
H. Correlation with secondary landscapes
In particular, in the context of neural network applications for the study of energy landscapes, one should remark on the issue of secondary criteria for the quality of a solution, which arise when the energy or cost landscape mainly serves as a tool for identifying optimal or feasible stable solutions to a specific practical problem. Quite commonly, one has an intuitive or even semi-quantitative criterion for evaluating the quality of the solution presented by the global optimization algorithm, which is not directly connected to the energy or cost of the solution via a mathematical prescription. Such a criterion effectively constitutes a "secondary landscape" of the system of interest, shadowing the actual energy or cost function landscape. In a way, we might say that the use of neural networks tends to quantify hidden or explicit secondary criteria, thus implicitly creating a secondary criteria landscape. However, due to the high complexity of modern networks, extracting these secondary criteria from the cost function landscape associated with the trained network in an explicit form that is intelligible to humans is a highly nontrivial task and often might not be possible.
Of course, if the original cost function constitutes the true and complete criterion of the quality of the solution, then this secondary landscape present in our minds, commonly based on our past experiences or on a simplified model picture of the system, serves only as an “emotionally” satisfying confirmation of the outcome of the “essentially black box” exploration algorithm. In that case, any disagreement with the results of the global explorations must be attributed to the simplifications and limitations of the secondary landscape, and we would take such a divergence as an incentive to re-evaluate the quality of, and also the assumptions behind, the secondary criteria. This would be the case if we had generated the parameters of a neural network solely based on such secondary criteria, e.g., if we had assigned a solution or accepted such a solution to our problem based on a (possibly vast) set of past experiences. An example might be hurricane prediction, where we train the network on the correlation between the observed weather before the hurricane formed and the subsequent appearance of the storm and compare the predictions of the network with the outcome of weather simulations and their predictions. In the case of hurricane predictions, such secondary criteria based approaches have already become quite successful.328
The discussion above assumed that the energy landscape corresponds to a complete description of the system as far as the issue of interest is concerned. In contrast, if the energy function itself represents only an approximate assignment of energies or costs to the microstates of the system, then the judgment of the quality of the solutions based on the secondary criteria needs to be taken more seriously and might serve as a warning that the cost function is too simplistic or that the solutions proposed are beyond the range of applicability of the model behind the energy function. If the secondary criteria can be cast into a quantitative secondary cost function, one might even consider adding them to the original energy function, analogous to the multi-objective Pareto approach57 mentioned earlier. An example where a low-quality or incomplete cost function could be combined with a secondary cost function would be the structure determination based on a combination of a (possibly simplified) energy function and the difference between computed and measured powder diffractograms.329–331 Here, the diffractograms constitute the primary cost function since the structure of the material that generates the diffractogram is of interest, but by using a simple energy function as a secondary criterion, many physically and chemically unreasonable structures can be eliminated and the global optimization algorithm can be guided more efficiently to the global minimum solution.
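A schematic of such a combined objective, with the diffractogram mismatch as the primary cost and a weighted energy term as the secondary criterion; the function names, the simple squared-mismatch measure, and the weight are illustrative placeholders, not an established refinement protocol:

```python
# Combining a primary cost (diffractogram mismatch) with a secondary cost
# (a simple energy function) into a single objective; all names are toys.
import numpy as np

def diffraction_cost(pattern_calc, pattern_obs):
    """Normalized squared mismatch between computed and measured patterns."""
    return np.sum((pattern_calc - pattern_obs) ** 2) / np.sum(pattern_obs ** 2)

def combined_cost(structure, pattern_obs, energy_fn, pattern_fn, lam=0.1):
    """Primary diffraction cost plus a weighted secondary energy term."""
    return (diffraction_cost(pattern_fn(structure), pattern_obs)
            + lam * energy_fn(structure))

# Dummy usage with stand-in functions and data:
obs = np.array([1.0, 4.0, 2.0])
print(combined_cost(None, obs,
                    energy_fn=lambda s: 0.3,
                    pattern_fn=lambda s: np.array([1.1, 3.8, 2.2])))
```

The weight lam controls how strongly the secondary energy criterion steers the search away from physically or chemically unreasonable structures.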
A related example would be the task of predicting structures for, e.g., periodic systems in chemistry, i.e., (idealized) crystal structure prediction, mentioned earlier in Secs. IV F and IV G. Here, the intuition of chemists—usually sharpened by having seen, studied, and synthesized many chemical compounds—often guides their judgment as to whether a proposed structure is "chemically reasonable." This is a good supplement to the outcome of the exploration of the energy landscape, and, as in a data-mining approach,324,332–338 possibly combined with high-throughput workflows,339 we can use the secondary landscape and its criteria as tools to suggest promising starting points for the global explorations in the case of gigantic state spaces that would overwhelm a straightforward, totally unbiased search approach. However, the situation becomes more problematic when intuition disagrees with the results of the energy landscape exploration, i.e., the structures proposed by the algorithm as local minima are "weird looking" or the structures proposed by the secondary criteria minimize into completely different kinds of configurations. In the context of chemistry, this usually means either that the quality of the energy function is not sufficient or that we need to rethink our secondary criteria as far as the physics or chemistry of the system being studied is concerned. As was mentioned earlier, one often deals with energy functions that are simplified in order to allow for the fast evaluation of the energy during global explorations and, therefore, one usually would not be surprised that some of the structure candidates are either much higher in energy (when re-evaluated with the highest level of ab initio calculation tools available) or not stable at all when recomputed on such an ab initio level. Such practical issues with the choice of an energy function are well-known, do not pose fundamental problems, and do not invalidate the general approach being pursued.
In contrast, having to rethink one’s secondary criteria and the implied assumptions on which they are based is often more disconcerting for the researcher, because this points to some fundamental lack of understanding in our picture of the chemical system and its richness in feasible structures and possible chemical processes. However, frequently, one notes that the problems with the secondary criteria are mostly due to unjustified mental shortcuts that are the analogs to the kind of oversimplifications we mentioned regarding the choice of an energy function. It is often just psychologically quite difficult to realize such limitations in the use of supposedly well-established secondary criteria, which might go back to one’s first encounters with various chemistry or physics concepts and simplified examples as a student. On the other hand, if there is not such a simple explanation for the divergence between the results based on the energy landscape and the secondary criterion, then this might suggest a true leap in understanding ahead once this difference in the two criteria is resolved. Clearly, such progress would often prove much more valuable than a specific prediction of, e.g., the structure of a crystalline compound!
At the other extreme of the competition between the energy landscape and the secondary criteria, we deal with problems where no suitable energy landscape yet exists, but due to the importance of the task, we would be most happy to be able to rely on secondary criteria as a first step. Such instances are very common in more applied fields of science, such as geology, and in the humanities, where just to identify an appropriate cost function would constitute a great step forward. An urgent example in this regard is the challenge of earthquake prediction. Here, we are lacking a good cost function or even an appropriate state space for constructing an earthquake cost function landscape. Such a possible state space might be established by measuring, e.g., distortions of the earth’s surface on a fine grid using global positioning system (GPS) data, where each such set of distortions all over the earth, or at least for a large region, would constitute a microstate of the system. However, assigning a robust energy or cost function to such a distortion is a complex task, not to mention generating reliable predictions based on such a cost function, although steady progress is being made.340
Quite generally, quantifying secondary criteria such that they can be represented as a secondary energy or cost function landscape will be an increasingly important task for the future, and systematically studying the qualitative meaning of the correlations—or lack thereof—between the primary and secondary landscapes should provide many insights into the system of interest and the relevance of the associated energy landscape.
I. Landscape applications outside of physics and chemistry: Transferability of landscape concepts and dynamics employed in chemistry and physics to other cost function landscapes
In this penultimate section, let us turn to some open questions and issues in the study of cost function landscapes outside the fields of physics and chemistry. Besides the well-known examples of biological fitness functions or cost functions in the context of businesses, one might be interested in, e.g., the description and modeling of many types of migration: migration between countries, or the movement of politically active citizens and voters in general between political parties, religions, or cultures, just to name a few. For example, the movement of people might be governed by some "happiness index" (for people as a group or as individuals, depending on the microstates of such a model), where the motivation that induces changes in the population pattern depends on the variation (derivative) of this degree of "happiness = cost." In this context, physical (mountains, oceans, etc.) or societal (bureaucracy, anti-immigration sentiment) barriers against movement would also need to be included. Being able to establish such a migration landscape could prove invaluable in societal, economic, and political planning, with potentially enormous benefits for many humans—both migrants and residents—and for humanity as a whole.
Clearly, how best to model such an evolution of the distribution of human beings around the world is still an open question, regarding the appropriate cost function, the set of microstates, and the neighborhood structure alike. Already agreeing on a cost function proves very difficult: different people have different ideas about how to assign a cost to their personal situation, not to mention the time scale over which the cost function is evaluated, be it "happiness" at the moment (seconds or days), at a stage of life (months or years), or over a whole life (decades); thus, one cannot really give a universally valid definition of "happiness." As a consequence, the application of, e.g., optimal control methods in the humanities to achieve optimal-cost microstates and macrostates for the system is inherently a highly problematic activity. Similarly, there is often no fixed set of microstates; the number of people in a town will be variable, and their age profile will evolve with time. Furthermore, it is not clear whether one can always assign a valid cost to all the natural choices of microstates in the system; e.g., if the cost were taxes levied on residents, then how to assign taxes to people who are moving between places of residence would be unclear.
Another field where cost or fitness function landscapes should be of great importance is the evolution of memes or attitudes in society, analogous to species and their fitness in biological evolution. This naturally leads to extensions of current studies of fitness functions in biology that go beyond classical evolution modeling. In particular, the modeling of species evolution combined with the spatial distribution of environmental conditions, while allowing for migration of the species between, e.g., different valleys, islands, or climate zones, should be an important area of research.341 Here, we connect nicely to the issue of the migration of human beings and the evolution of ecosystems, where it is not obvious what the appropriate choice of fitness function would be.
Considering this plethora of cost functions in fields as varied as biology, economics, and the humanities, let us summarize the similarities and differences between such cost functions and the energy landscapes in physics and chemistry. From a mathematical point of view, there is no difference between a cost function landscape and an energy landscape, and the same holds true for performing global optimizations on such a landscape. We can employ the same kinds of general optimization algorithms, selecting both efficient moveclasses and algorithms that are suitable for the type of landscape under consideration. We might even be interested in a multitude of low-lying cost function minima. For example, in a traveling salesman problem, such a set of side-minima could be of interest as alternative routes if we expect some sudden road closures ahead, or in a business plan, we would want alternative plans at hand since we must take noise in customer demand or supply chains into account. This extends to the definition of stable regions on the landscape, which might be characterized by a group of many local minima that are separated by only small cost function barriers and are distinguished from other stable regions by large barriers. In the extreme case, we would need to establish cost function landscapes that vary with time, analogous to the energy landscapes of chemical systems where the environmental boundary conditions depend on time.2
Here, we should keep in mind that such barriers are, in a sense, artificial since we can define neighborhoods in a purely algorithmic sense via the moveclass, thus raising or lowering both energetic and entropic barriers on the landscape and even eliminating some local minima. In systems drawn from the realm of humanity, there is usually nothing like the natural laws of physics that underlie the natural moveclass of an energy landscape in chemistry or physics. Nevertheless, even general cost function problems often have an "intuitive" neighborhood according to which we explore the state space; for example, the traveling salesman might change the route by only switching the order of neighboring cities along the current route. This is not a very efficient move as far as global optimization is concerned, but it might be a simple first attempt appropriate when one is already on the road and has to adjust the schedule on the spot. Similar "intuitive" neighborhoods might exist for the case of modifying business plans where we only change the production portfolio in small steps of certain kinds: e.g., changing the colors or patterns of shirts might be a straightforward modification of the product, but moving from long-sleeve to short-sleeve might be a much bigger and less intuitive change, while switching from producing shirts to selling ice cream would be like a jump to a completely different cost function landscape. Furthermore, there might exist phenomenological laws derived from experience, such as the law of supply and demand in economics, or other well-tested "rules of thumb" for dealing with, e.g., inflationary pressures, which would imply an intuitive way to explore the state space of an economic system when searching for the "stable" set of relevant sub-optimal solutions to a problem in economics or business.
This inquiry regarding the existence and characterization of stable states in economics, biology, the humanities, and society poses the question: how far can we push the analogy to chemical systems? Given a cost function together with the underlying state space and the neighborhood structure, we can determine the barrier structure of the landscape. In the case of physical systems, we can next introduce the (local) temperature via its thermodynamic definition as a measure of how the local energy of the system changes with its local entropy; alternatively, we imagine the system to be in contact with an environment that allows the exchange of energy and is characterized by a temperature T. Formally, we can do the same for a generic cost function by picturing an exchange of “cost” with the environment. In addition, as far as, e.g., the use of the Metropolis criterion during a simulated annealing run is concerned, the temperature involved can be treated as just a control parameter of the global optimization algorithm, without any additional meaning.
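In formulas, using only standard textbook relations to make the analogy explicit: for a physical system, the temperature is defined via

\[
T = \left(\frac{\partial E}{\partial S}\right)_{V,N},
\]

whereas in a simulated annealing run over a general cost function C, the temperature enters only through the Metropolis acceptance probability

\[
p_{\text{accept}} = \min\left[1,\; \exp\left(-\frac{\Delta C}{T}\right)\right],
\]

where ΔC is the change in cost caused by a trial move; in this second reading, T is merely a control parameter carrying the units of the cost.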
However, an interesting question is: would it be possible to qualitatively picture or quantitatively describe the actual dynamics and time evolution of a system that does not belong to the world of physics or chemistry as movement on its cost function landscape? Can we interpret differences in costs between neighboring microstates, defined according to an “intuitive” or “phenomenological” moveclass/neighborhood, as forces that control the dynamics of the system? Or, on longer time scales where statistical evolution might take place, can we use, e.g., a random walk with an acceptance criterion constructed in analogy to the Metropolis criterion to model the real long-time evolution of the system? In the former case, a dynamical law based on such forces would be required; in the latter case, we would need to provide an interpretation of the temperature parameter that is appropriate for the type of system whose cost function we investigate and use as guidance for the time evolution of the system, always assuming that the moveclass is an intuitive (natural) one.
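A minimal sketch of such a Metropolis-type random walk on a generic cost function landscape might look as follows; all names (metropolis_walk, cost_of, neighbor_of) are placeholders for this illustration, and the interpretation of the temperature parameter would have to be supplied by the field in question:

```python
import math
import random

def metropolis_walk(state, cost_of, neighbor_of, temperature, n_steps):
    """Metropolis-type random walk on a generic cost function landscape,
    as a toy model for the long-time statistical evolution of a system."""
    cost = cost_of(state)
    trajectory = [state]
    for _ in range(n_steps):
        trial = neighbor_of(state)       # apply the "intuitive" moveclass
        delta = cost_of(trial) - cost    # change in cost for the trial move
        # Metropolis-like acceptance: downhill moves are always accepted,
        # uphill moves with probability exp(-delta / temperature).
        if delta <= 0 or random.random() < math.exp(-delta / temperature):
            state, cost = trial, cost + delta
        trajectory.append(state)
    return trajectory
```

Combined with the traveling salesman sketch above, metropolis_walk(route0, lambda r: route_cost(r, dist), intuitive_move, temperature=1.0, n_steps=10000) would generate one such model trajectory; whether such a trajectory says anything about the real evolution of a non-physical system is precisely the open question raised here.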
Following the analogy to chemistry and physics, a low temperature should correspond to a case where the evolution of the system, be it biological, economic, or drawn from the humanities, mostly remains inside a stable region and only very slowly transitions into a different state. In contrast, a very high temperature would imply that the system never even gets close to low-cost minima but rapidly moves among a wide range of microstates. In the case of biology, the low temperature might correspond to a very stable external environment, where a certain ecosystem has become established and changes only on very long time scales, while in the high temperature case, the environment is unstable and the ecosystem undergoes massive rapid changes. We would thus associate the temperature with some parameters of the system plus environment, such as the mutation rate of the various species. Here, the mutation rate refers not only to the speed with which genetic mutations appear (which might actually be more or less constant irrespective of the environment; exceptions would be variations in the background radiation or the presence of chemicals that influence the mutation rate), but also to the successful adaptation of the species via the mutations to the rapidly changing environment. Similarly, in economics, the temperature might correspond to the amount of “free money” in the system, i.e., money that is not yet allocated to specific businesses. The supply of this available money might depend on external factors such as central bank interest rates or the budgetary decisions of a government. Of course, these proposed interpretations of a temperature parameter in the case of cost functions can only be suggestions, and it might well be that similar ideas can already be found in the economics, ecosystem, and evolutionary biology literature.
Going beyond the temperature, one would start by considering to what extent quantities like local free energies associated with stable states would be useful in studying the equilibrium and dynamics of systems outside of physics and chemistry and what their interpretation might be. We might consider business models where a given model corresponds to a full cost function landscape, i.e., a microstate represents one setting of the parameters in the given business model. For such a model, several stable states or modes of running the business might exist, but, depending on the “temperature,” one mode might be more profitable than another. An example might be a business that can either specialize in one product with a high profit margin or offer a multitude of products: the sale of the single product relies on a stable economy (low temperature), while the multi-product strategy is more suitable for shoppers with highly variable incomes due to turbulence in the economy (high temperature). In the second strategy, the presence of multiple products that can be sold would yield a term analogous to a configurational entropy, which can stabilize a solid solution at high temperatures versus a single crystal modification that is the thermodynamically stable phase at low temperatures. One should note that the suggested interpretation of temperature in this business context is quite different from the one proposed above for the economy as a whole. Perhaps one could analyze how thermo-economics342–345 may be used as inspiration to address the issue of free energies in systems that can be described with cost function landscapes. One interesting question in this context is whether there might exist several types of temperature, each referring to a different kind of statistical interaction with the environment. Furthermore, there might also exist several external parameters that vary with time on different characteristic time scales, each of which might be associated with its own characteristic type of temperature.
In this context, it would also be an interesting question whether the concept of “aging” as conceived of in physical or chemical systems can be transferred to economics, evolution, or social systems. Here, we are not discussing an aging population or an aging organism; rather, we are concerned with the sudden freezing-in of degrees of freedom in the system upon reducing the effective temperature or abruptly changing other external parameters of the system. This external shock transfers the system into a non-equilibrium state, which then slowly evolves toward a new equilibrium through a sequence of intermediary “local” equilibrium states. Due to this sequence of marginally ergodic states of the system (for true aging, these equilibrium states are not locally ergodic but only marginally ergodic294), such aging is different from the usual straightforward approach to a single unique new equilibrium state on a single relaxation time scale after a disturbance has taken place. An example might be an ecosystem on an island where an apex predator is eliminated, and a sequence of new “candidates” for this apex position now needs to evolve from among the remaining species. Another case is the reaction of an ecosystem to a volcanic eruption, where a sequence of different classes of microbes, plants, and animals re-occupies the devastated region.346,347 In this situation, there is no direct movement toward reestablishing the original ecosystem, which, for the sake of argument, we can assume to be “optimal” for the general region and climate. Instead, the instantaneous local environment due to the presence of lava fields, etc., favors other species that reach a local equilibrium for some years but, by their presence, prepare the ground for the return of the original species combination, perhaps via several additional intermediary dominant species. Other examples are studies of disturbances in isolated systems such as hydrothermal vents,348 where such a recovery can be followed in detail via controlled perturbations of the ecosystem.
Such examples drawn from biology are instances of cost function landscapes where, considering the level of sophistication of the microstates and the cost function of the system, the fitness function as a measure of the survival of a species is comparable to, e.g., the potential energy function of a chemical system. Therefore, the transfer of the language of chemistry and physics to such systems appears reasonable. This becomes much more problematic for situations drawn from humanity, such as an economy or a society. In these examples, we, as human beings, are confronted not only with the “grand” picture that assumes some kind of homogeneous society of interchangeable individuals but are also always aware of and concerned with the fate of individual people. In particular, one must keep in mind that people are not interchangeable units like atoms but are highly diverse individuals with their own time-dependent cost functions that depend both on the mind of the given individual and on their complex interactions with the rest of society. Due to such non-linear effects and the inherent noise in the state of the person and society, individual costs cannot simply be added up to yield the cost function of society. Furthermore, the number of people involved, while seeming very large in our everyday experience, is far from the number needed to apply the analogy to the thermodynamic limits we typically assume in chemistry and physics. Therefore, uncritically transferring the methods developed for chemical systems to the realm of society is highly problematic. One need only recall the dangerous distortions that appeared when the theory of biological evolution was transferred to the development of societies in the late 19th century, with horrible consequences such as massive racial discrimination and wars of extermination both inside individual societies and in the then colonial empires, culminating in the Second World War, which was partly fought under the banner of racial superiority.349
Keeping this caveat in mind, when considering analogies to “aging” processes, we might discuss the re-establishment of a functioning economy, working administration, or political structure after, e.g., a natural or man-made disaster, such as massive droughts causing large-scale famines or a war resulting in massive destruction of lives and infrastructure. Again, a sequence of “societal and economic states,” like anarchy, black markets, planned or guided economies, gang rule, authoritarian or democratic governments, oligarchies, etc., might appear in quick succession until some kind of long-term equilibrium is reached. In this context, we note that history is full of examples of such events and processes. Being able to develop a formal societal landscape description might help in understanding and ameliorating at least some of the worst aspects of these developments during the slow evolution toward stable states that constitute local maxima of societal “happiness.” Possibly, this might even allow large-scale misery among the people affected to be avoided, by finding optimal paths toward a “happier” local (and even global) society.
An interesting example from literature that assumes the existence of well-defined laws of “psychohistory” governing the evolution of human society, together with possible complications due to the fact that humans are individuals, is the Foundation series by Isaac Asimov.350 Here, the psychohistorians try to shorten an “age of barbarism” following the collapse of a highly developed civilization from 30 000 to 1000 years. One can only speculate as to whether Asimov, a trained chemist and professor of biochemistry, imagined the existence of an underlying cost function from which the laws of psychohistory are derived. However, one must always keep in mind that the kind of social engineering implied by such an approach also has its dark sides. These have been sadly highlighted by the many totalitarian types of societies we have seen just in the past 100 years349 and are currently being discussed in the context of the use of modern information technology to influence or control the behavior of people.351–353
These kinds of landscapes and the dynamics on them also touch on problems of the optimal control of processes and the optimal distribution of limited resources. Examples are the optimal distribution of limited health or food resources, or of any kind of natural or organizational resource that exists only in finite amounts.3,354,355 Clearly, deciding on the “correct” cost function and state space (microstates on the level of individuals, age cohorts, social groups, etc.) is as much a problem of moral philosophy as of mathematics. Finally, we consider the landscapes associated with optimal control problems, which range from optimal syntheses and transformation routes in chemistry to motion problems in robotics to the running of processes in society, such as elections, the distribution of goods in the context of social welfare, the teaching and education of children, etc. Clearly, a nearly infinite variety of problems exists that can be cast in the shape of an optimal control problem. However, in the humanities, the associated cost landscapes will often be extremely noisy, due to lack of information, variation in time of the boundary conditions, or because the cost functions are inherently of the Pareto type, i.e., we are dealing with multi-objective cost functions that involve not only quantitative cost criteria but also qualitative ones that, e.g., refer to the value of human life. Furthermore, since establishing the existence of a natural neighborhood in the state spaces of systems drawn from the humanities is a highly non-trivial task, any attempt to properly define and solve the corresponding optimal control problem is highly fraught, and the proposed solutions should be reevaluated using a multitude of secondary criteria that take the “human” in humanity into account before one tries to put them into action. Still, studying such landscapes, even for ill-defined complex optimal control problems, will be a valuable tool in the future and, when properly employed, can improve the lives of many human beings in small and large ways.
Clearly, these attempts at establishing a connection between concepts of energy landscapes in physics and chemistry and analogous ones for cost function landscapes in biology, economics, or the humanities appear to be quite ad hoc. Nevertheless, the coherent transfer of those energy landscape concepts that go beyond the purely mathematical ones (such as minima or saddle points) from the fields of physics and chemistry to other applications of general cost functions appears to be an important open question that needs to be answered anew for every different type of cost function application. However, such an application of concepts and methods outside their original area of application promises many new insights into biology, economics, and the humanities. In addition, cost functions and other concepts developed outside the realm of chemistry and physics might inspire new ideas on how to explore and study the energy landscapes of chemical and physical systems.
J. Speculative developments
In Subsections IV A–IV I, we have considered many possible future developments of energy and cost function landscapes that can be seen as natural extensions of current and past uses of such landscapes in the natural sciences, especially chemistry and physics, but also in more applied fields of science. Furthermore, we have considered possible future applications in areas that belong more to the humanities, such as business or issues of optimization and optimal control in society. In this final subsection, we discuss some more unusual directions that the construction and study of energy landscapes might take, even though some of them are of a rather speculative kind.
As was discussed earlier, in many applications of energy landscapes, these landscapes serve, or can be visualized, as the place where the dynamics of a system takes place with regard to the time evolution of both micro- and macrostates. In the nearly prototypical case of a system of atoms interacting via a classical interaction potential, the gradient of the potential energy surface yields the forces that enter Newton’s equations and, therefore, allows us to determine the trajectory of the system in state space, i.e., on the energy landscape. However, as mentioned in Subsection IV I, in fields far away from basic physics and chemistry, such as the applied sciences or the humanities, the definition or construction of an appropriate landscape that can address the issues of concern is often quite a challenge, and deriving (phenomenological) laws of evolution for the system from such a landscape is not a straightforward procedure. Therefore, one might want to reverse this process: first determine the phenomenological laws of a system from experimental observations and subsequently attempt to reverse-engineer an energy or cost function landscape. After all, the variational formalism of classical mechanics and the proper definition of the energy came long after Newton had formulated his eponymous laws: starting from Newton’s equations, one can reach the Lagrange equations via the d’Alembert principle of virtual work356 (developed by d’Alembert and Bernoulli), then Hamilton’s variational principle, and finally obtain an energy function.357 Clearly, not all phenomenological laws can be written as, e.g., derivatives of some energy or cost function in analogy with (conservative) forces as derivatives of a potential energy. However, perhaps some generalizations of energy functions might be constructed that are fully determined by the set of phenomenological laws and from which, conversely, these laws could be derived in a systematic fashion. This would be especially important in the case of energy or cost function landscapes in fields like the humanities or applied science and engineering, where empirical or experimental observations of processes or dynamics are available but no suitable cost function has been identified.
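As a reminder, in standard textbook notation this chain runs from Newton’s equations via the Lagrange equations to a conserved energy function:

\[
m_k \ddot{\vec{r}}_k = \vec{F}_k
\;\longrightarrow\;
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0
\quad (L = T - V)
\;\longrightarrow\;
E = \sum_j \dot{q}_j \frac{\partial L}{\partial \dot{q}_j} - L.
\]

Conversely, a potential V with \(\vec{F} = -\nabla V\) exists only if the observed force field satisfies the integrability condition \(\nabla \times \vec{F} = 0\) (on a simply connected domain); an analogous integrability condition would have to hold for any set of phenomenological laws from which one hopes to reverse-engineer a cost function.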
Another interesting extension might be the construction of landscapes for conserved quantities other than the energy. After all, one of the inherent properties of a typical potential energy is our ability to derive a consistent set of forces by taking derivatives of the energy such that the total energy of the system is conserved along its trajectories. Therefore, one might envision the use of, e.g., the total angular momentum vector or its absolute value as the (vector-valued) function onto which the microstates are mapped; clearly, one would presumably need to employ the 6N-dimensional phase space of an N-atom system as the suitable state space. While one might wonder whether such an additional quantity is needed since we already have the energy, we recall that, e.g., in hydrodynamics, constraints on many phenomenological laws are based on such conserved quantities.147 Therefore, one would expect that studying the landscapes of conserved quantities other than the total energy should be a scientifically profitable endeavor. Clearly, such conserved quantities are most likely to exist in systems based on or closely related to fundamental physics or chemistry. However, it might well be that problems in ecology, business, or even society exhibit quantities that are conserved in a certain sense, at least approximately. One might think of the land available to various enterprises, ranging from agriculture and mining to living areas, or consider the amount of nutrients, minerals, or water accessible to organisms in an ecosystem.
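For instance, for an N-atom system, the total angular momentum defines such a vector-valued function on the 6N-dimensional phase space:

\[
\vec{L}(\vec{r}_1, \dots, \vec{r}_N; \vec{p}_1, \dots, \vec{p}_N) = \sum_{i=1}^{N} \vec{r}_i \times \vec{p}_i,
\]

which is conserved along the trajectories of any system whose interactions are invariant under rotations.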
Going beyond alternative quantities that might serve as stand-ins for the energy or the cost, we could generalize the scalar cost function to vector- or tensor-valued cost functions. These might be appropriate in some instances, especially when we are dealing with multi-objective problems that cannot easily be formulated as a Pareto problem by assigning each of the (scalar) objectives a weight (function) and adding up all weighted objectives.57 Of course, performing such an optimization (one possible version is sometimes referred to as “balanced optimization”358,359) is a non-trivial task. In particular, there is a certain, possibly quite large, degree of freedom in how to consistently formulate what we mean by an optimization problem in these kinds of situations. However, the purpose of a general study of such vector-valued landscapes clearly goes beyond just performing a global optimization. Instead, we would be interested in the trajectory of such a vector-valued system on the landscape and in the derivation of (phenomenological) forces and laws of evolution for such a system from the properties of the vector- or tensor-valued cost function.
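For comparison, the Pareto-type scalarization mentioned above combines the m scalar objectives C_k into a single cost function via weights w_k,

\[
C_{\vec{w}}(x) = \sum_{k=1}^{m} w_k\, C_k(x), \qquad w_k \ge 0, \quad \sum_{k=1}^{m} w_k = 1;
\]

however, sweeping the weights recovers only the convex part of the Pareto front, which is one motivation for retaining the full vector \((C_1(x), \dots, C_m(x))\) together with a partial order among microstates instead of a single scalar cost.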
Another possible generalization, mentioned already in Subsection IV A in the context of transitions between electronic ground states and excited states, is the use of multi-valued energy functions, where each microstate is assigned several values of energy or cost. Frequently, these n different energy values belong to n different smooth hypersurfaces over the same state space. Therefore, one could reformulate the multi-valued functions over the original state space as single-valued functions over an extended state space by replacing the original microstates x by extended microstates (x, i) (i = 1, …, n) and including jumps between different values of i for the same x (the same original microstate) as part of the neighborhood in the extended state space. Of course, we can have degeneracies in these surfaces in the sense that for some microstates, some of the n energy values can be the same, thus leading to a crossing of the hypersurfaces at these microstates. However, this poses no major problems and would actually be expected in the case of, e.g., electronic (quantum) phase transitions.306
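A minimal formal version of this construction, writing x for a microstate in the original state space X and E_i(x) for the n energy values, reads

\[
\tilde{X} = X \times \{1, \dots, n\}, \qquad \tilde{E}(x, i) = E_i(x),
\]

where the neighborhood of (x, i) contains both the usual moves (x, i) → (x′, i) within hypersurface i and the “vertical” jumps (x, i) → (x, j) between hypersurfaces at fixed x.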
However, we might be dealing with hypersurfaces that are not “nice,” in the sense that we cannot construct n smooth, or at least continuous, hypersurfaces over the original state space, even though this state space itself is continuous. Then, the construction of the extended state space is quite ambiguous, since there is no natural way to pick the labeling by i of the extended microstates (x, i). In this case, one would hope that additional information about the system is available, which might allow us to group the n energy values Ej(x) (j = 1, …, n) at each microstate x into n internally consistent sets that can be labeled by the same number i (i = 1, …, n). Each such set should exhibit some physical or other characteristic property, such that for each choice of i, the energies Ei(x) “belong together” in a sensible way for all microstates x, thus allowing us to construct a meaningful extended state space on which the energy is a single-valued function.
Clearly, if the original state space is discrete, then there is already, from the outset, some freedom in the choice of neighborhood in the state space, as discussed in Sec. II. Since the landscapes will not be continuous hypersurfaces but so-called metagraphs,2,8 the grouping of the n energies into meaningful consistent sets will definitely require some external information about the system beyond just the n values of the energy for each original microstate.
In this context, we might also return to the discussion of secondary cost functions in Subsection IV H, which “shadow” the original cost or energy function of interest. Here, the energy values assigned to a microstate by the secondary cost function usually do not correspond to actual energies. As a consequence, it is not clear to what extent the construction of an extended state space with extended microstates of the type (x, i) (i = 1, 2) would be meaningful in such a situation. In the case of multiple values of the energy corresponding, e.g., to excited electronic states, we can imagine the system jumping between these different electronic states due to interactions with an external electromagnetic field probing the system under investigation. However, jumping from the true energy landscape to the secondary landscape during a simulated time evolution of the system appears to be not very meaningful. An exception would be the case where the secondary criteria landscape corresponds to an alternative energy landscape that is switched on by the application of, e.g., different thermodynamic boundary conditions; in that case, however, we would be dealing with an example of an extended energy landscape as discussed in Subsection IV A.
Leaving the world of classical physics, we already discussed in Subsection IV E various issues that arise when applying energy landscape concepts to quantum systems and that should be addressed in the future. Another way to go beyond the classical world is to ask what energy landscapes mean in relativistic physics. Here, we will assume for the moment that we are not at energies where, e.g., pair production can take place and where, therefore, full relativistic quantum mechanics and quantum field theory would apply. On the level of special relativity, the dynamics of a particle takes place in four-dimensional space-time and not in the familiar 3+1 space+time setup. In the latter case, we were able to take time out of the picture, define our state space, and compute the energy at a given moment in time as a function of the 3D position and momentum vectors of the particles. In relativistic systems, however, the microstates should, in principle, also involve a time coordinate. Nevertheless, one could still select one reference frame, e.g., the laboratory frame, study the chemical or physical system of interest in this coordinate system, and compute the energy at each point in time as measured in the lab frame. In that case, the energy of, e.g., a material would involve the so-called relativistic corrections to the classical energy, which are well known in chemistry and physics.360 This would not be a problem unless we were to accelerate the whole material to relativistic velocities instead of only allowing the electrons to experience such speeds. In that situation, it would be more straightforward to analyze the system in the rest frame of the material; a drawback would be that our measurements in the laboratory would then involve Lorentz-transformation effects that need to be taken into account when comparing the measured energies with those of the material in its rest frame.
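For reference, for a system with rest-frame energy E_0 (and vanishing total momentum in its rest frame) moving at velocity v in the laboratory, the energy measured in the lab frame is

\[
E_{\text{lab}} = \gamma E_0, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}},
\]

which quantifies the Lorentz-transformation effects mentioned above.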
The situation becomes more extreme if we are dealing with a gravitational system for which the Newtonian approximation no longer holds. In this case, the energy is replaced by the energy-momentum tensor, and we can no longer ignore the 4D space-time aspects of our microstates; the dynamics is controlled by Einstein’s equations.361 While one can still formulate a variational Lagrangian-type approach to derive the field equations of general relativity, we are now dealing with a Lagrangian density instead of a simple energy function, and we no longer follow only the motion of particles but must at the same time keep track of the evolution of the geometry of the underlying spacetime. Constructing a meaningful energy landscape for this situation is clearly a challenge, especially considering the fact that in extreme cases, such as black holes, spacetime becomes extremely distorted, with a possible singularity at the center of the black hole.
Black holes clearly constitute matter under extreme conditions; another example is systems where the temperature (and, therefore, the available kinetic energy) is so high that they are best described as strongly interacting quantum systems, such as a quark-gluon plasma.362 Applying energy landscape concepts to such a system, e.g., to study possible phase transitions in the quark-gluon plasma, will require much thought regarding the appropriate state space (presumably a Hilbert–Fock space) and the energy function. One possible way to deal with this problem might be to consider a state space consisting of quark and gluon fields discretized on a four-dimensional (three space dimensions and one time dimension) grid, where the Hamiltonian that provides the energy function would have to be replaced by the QCD Lagrangian. This lattice field theory363 then allows us to explore the energy landscape; the open question, however, is the choice of moveclass, since for this we need a reasonable model of the time-evolution operation.305
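For reference, the QCD Lagrangian density that would take over the role of the energy function has the standard form

\[
\mathcal{L}_{\text{QCD}} = \sum_f \bar{\psi}_f \left( i \gamma^\mu D_\mu - m_f \right) \psi_f - \frac{1}{4}\, G^a_{\mu\nu} G^{a\,\mu\nu},
\]

where the sum runs over the quark flavors f, D_μ is the gauge-covariant derivative, and G^a_{μν} is the gluon field strength tensor; which moveclass on the discretized field configurations would correspond to a physically meaningful exploration of this landscape remains open.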
V. CONCLUDING REMARKS
In this perspective, I have tried to provide an overview of the long history of energy landscapes in the natural sciences, mathematics, and the humanities and to discuss some of the many open questions pointing toward possible future research directions in the field of energy and cost function landscapes. Not only are energy landscapes of fundamental importance in physics and chemistry, but their general applicability as cost and fitness functions in many fields of the natural sciences and humanities, ranging from applied mathematics and combinatorial optimization through biology and engineering to economics and human society, makes them a ubiquitous tool for furthering our understanding, and possibly control, of such systems in the decades ahead.
In particular, this involves the extension of the classical energy landscape of an isolated system in physics and chemistry to systems exposed to a complex, possibly time-varying environment with unavoidable noise in the state space and the energy function, and to material systems that need to be described on the continuum level instead of the atomic one. We need to address the issue of whether, and how, it is possible to accommodate all features of quantum mechanics in the construction of the energy landscape of physical systems. Furthermore, two closely connected issues of current and future research are the potential of machine learning for computing the cost or energy of the microstates of an energy landscape and for capturing crucial features of a landscape, and the question of how secondary criteria can allow us to qualitatively test the limitations of our landscape models, up to the accuracy with which a quantitative secondary criteria landscape can shadow the original landscape of a physical or chemical system.
Finally, the question of to what extent concepts of energy landscapes in physics and chemistry that go beyond the global optimization of a general cost function landscape can be naïvely transferred to cost or fitness function landscapes in other fields will be of great relevance in the future. In this regard, two matters of concern are the transferability to other fields of the role the energy landscape plays in the dynamics and time evolution of a system, and the applicability of optimal control methods based on the laws of physics as incorporated in the energy landscape description, i.e., not only in the energy of a microstate but also in the state space and its neighborhood topology. In particular, attempts to treat problems in human society, such as the allocation of finite resources, using cost function landscapes need to be carefully analyzed. After all, both the proper definition of the cost functions involved and the choice of the appropriate state space and its topology are highly complex issues, and it is not at all clear whether, e.g., the dynamics of a system consisting of human beings can be incorporated in the description of a cost function landscape. Quite likely, an appropriate way to deal with such optimal control problems will involve not only a proposed cost function landscape but also many secondary criteria landscapes that represent the moral issues and imperatives of a human society that are not captured by the straightforwardly proposed cost function. Nevertheless, driven by the needs of society, constructing and studying such landscapes and applying the tools developed in chemistry and physics for the investigation of energy landscapes will be a large and fruitful area of current and future research in the field of energy landscapes.
ACKNOWLEDGMENTS
This work has grown out of two presentations: a public lecture at the Telluride Science Research Center in 2020 and an invited presentation at the Energy Landscape Meeting at Porquerolles in 2023. I would like to thank all the participants of energy landscape and landscape-related workshops these past 35 years in Telluride (1990–2005, 2011, 2013, 2015, 2017, 2022), Chemnitz (2000, 2010), Petritoli (2007), Obergurgl (2012), Durham (2014), Porquerolles (2016, 2023), and Belgrade (2019), with whom I had the delightful opportunity to discuss conceptual issues of energy landscapes. In particular, I would like to thank Peter Salamon and Jim Nulton, who introduced me to energy landscapes during my postdoctoral years in San Diego; Paolo Sibani, who really kept my interest in energy landscapes going after my return to Europe; and my other collaborators on many projects involving the development of new energy landscape concepts: Sabine Abb, Patrick Alexa, Christian Ast, Bjarne Andresen, Bohdan Andriyevskii, Burak Bal, Emiliano Buelna-Garcia, Zeljko Cancarevic, Juan Cortes, Klaus Doll, Matt Farrow, Dieter Fischer, Rico Gutzler, Rene Haber, Alexander Hannemann, Chris Heard, Karl Heinz Hoffmann, Haonan Huang, Rudolf Hundt, Martin Jansen, Roy Johnston, Dusica Jovanovic, Klaus Kern, Aniket Kulkarni, Isa Lough, Anna Luna-Valenzuela, Atefe Mahmoodabadi, Branko Matovic, Mohsen Modaressi, Arnulf Möbius, Sridhar Neelamraju, Christina Oligschleger, Rafael Pachecco, Milan Pejic, Ilya Pentin, Alvaro Posada-Amarillas, D. L. V. K. Prasad, Holger Putz, Mathias Rapacioli, Stephan Rauschenbach, Gregoire Salomon, Oscar Sanders-Gutierrez, Manuel Santoro, Tamara Skundric, Nathalie Tarrat, Nico Toto, Markus Wevers, Scott Woodley, and Dejan and Jelena Zagorac. Special thanks go to Bjarne Andresen, Karl Heinz Hoffmann, and Arnulf Möbius for reading this paper and providing valuable feedback, and to Regine Noack for help with drawing the cover page.
AUTHOR DECLARATIONS
Conflict of Interest
The author has no conflicts to disclose.
Author Contributions
J. C. Schön: Conceptualization (lead); Data curation (lead); Investigation (lead); Methodology (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (lead).
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study.