Most people would’ve seen little reason to quibble with David Chandler’s talk at the spring 2011 Statistical Mechanics Conference. Chandler, a chemist at the University of California, Berkeley, was renowned for having cracked some of the thorniest problems in statistical mechanics. He had lent his insights and his name to a widely used model of equilibrium liquids, the Weeks-Chandler-Andersen theory. And he was a powerful and persuasive talker. “When he said something with conviction,” recalls chemist Austen Angell, “people would nod their heads and say, ‘Well, based on his track record, we think he’s likely to be right.’”
Chandler was at the conference, a gathering of 100 or so theorists on the campus of Rutgers University, to talk about water. He and a graduate student, David Limmer, had used simulations to explore what happens when liquid water is cooled far below its freezing point. It was well known that pristine water—free of dust and other impurities on which ice crystals can nucleate—can be supercooled tens of degrees below 0 °C without freezing. But below what’s called the homogeneous nucleation temperature, around –40 °C, the liquid crystallizes almost instantly, no matter the purity. Chandler and Limmer wanted to know what that deeply supercooled water looks like in the instant before it freezes. What they found was seemingly unremarkable: At every temperature and pressure, the liquid basically resembled ordinary water.
To Princeton University’s Pablo Debenedetti, however, that result was mind-boggling. Two years earlier, Debenedetti and his coworkers had done their own simulations of supercooled water, at temperatures and pressures similar to those Chandler described. The Princeton simulations had revealed something far more intriguing. Yes, the liquid could take a high-density form that resembled water. But it could also take a low-density form, with the molecules arranged into airy hexagons reminiscent of those in ice. The water could morph back and forth between those two forms in much the same way it morphs between ice and liquid, or liquid and vapor.
In his 20-minute presentation, attended by many of the biggest names in condensed-matter theory, Chandler was essentially declaring that the Princeton team had gotten it wrong. “It was a matter of people saying, ‘Who are you going to believe, Chandler or Debenedetti?’” recalls Angell. “And Chandler carried the bigger stick.”
Over the next seven years, the perplexing discrepancy would ignite a bitter conflict, with junior scientists caught in the crossfire. At stake were not only the reputations of the two groups but also a peculiar theory that sought to explain some of water’s deepest and most enduring mysteries. Earlier this year, the dispute was finally settled. And as it turns out, the entire ordeal was the result of botched code.
A unified theory of water
The Berkeley and Princeton teams were trying to solve the same 30-year-old puzzle. In 1976 Angell, then a professor at Purdue University, and his colleague Robin Speedy experimented to see how far they could supercool water, and how the liquid would behave at the extreme temperatures. What they saw surprised them: As water dipped below −20 °C, its isothermal compressibility began to soar, a sign that its density was fluctuating wildly at the molecular scale. The liquid seemed on the verge of some dramatic transformation. But whatever that transformation was, Angell and Speedy couldn’t actually see it; it occurred at temperatures below the homogeneous nucleation temperature, where the liquid state was too short-lived for the researchers to measure.
It fell to theorists, then, to speculate about what the hidden transition might be. In the early 1990s, a Boston University professor named Gene Stanley came up with a compelling explanation. Stanley’s theory hinged on the concept of critical points, special points in a phase diagram where two thermodynamic phases of matter—say, liquid and gas—meld into one. Water has a well-known critical point at about 374 °C and 218 atm, above which liquid water and water vapor become indistinguishable. Stanley posited that water has a second critical point, hidden deep in the supercooled regime. At temperatures below that point, there exist two distinct liquid phases of different densities; above that point, the liquid phases merge. In Stanley’s interpretation, the density fluctuations in Angell and Speedy’s experiment represented a kind of tug-of-war between the two phases, a sign that the critical point was nearby.
The beauty of Stanley’s critical-point formulation was that it meshed seamlessly with everything scientists knew about water at the time. Because a critical point in the supercooled regime would have ramifications throughout the phase diagram, it could explain anomalous properties of water we see at everyday temperatures and pressures, such as the density maximum at 4 °C and the puzzling minimum in isothermal compressibility at 46 °C. It could also explain an intriguing experiment a few years earlier by researchers in Canada, which showed that amorphous water ice could take a high- or low-density form, depending on the temperature and pressure at which it was prepared. That no one had ever actually seen water’s second liquid phase was perfectly consistent with Stanley’s theory: At ordinary temperatures and pressures, the two phases should be indistinguishable.
The credibility of Stanley’s theory was bolstered by his group’s own molecular dynamics simulations, which showed thermodynamic behaviors consistent with a critical point near the predicted temperature of the mystery transformation suggested by Angell and Speedy’s experiment. Over the next decade or so, Stanley’s theory won numerous adherents (see the article by Debenedetti and Stanley, Physics Today, June 2003, page 40). But during that time, neither Stanley nor anyone else performed the one calculation that could have definitively demonstrated the existence of two liquid phases in water simulations: a free-energy calculation. That’s where Chandler and Debenedetti came in.
One model, two results
Thermodynamic phases of matter are defined by minima, or basins, in a free-energy landscape. If you plot a liquid’s free energy as, say, a function of density and you see only one basin, that liquid has only one phase. If you see two basins, it has two. When Chandler and Debenedetti crossed paths at Rutgers in 2011, their respective groups had just completed some of the first free-energy calculations ever performed for supercooled water. The Princeton simulations yielded two basins, just as Stanley had predicted. But Chandler’s team found just one, and because his reputation was sterling, the result inevitably cast a pall of doubt over the two-liquid theory.
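The link between sampled densities and basins can be made concrete. If a simulation visits density ρ with probability P(ρ), the free-energy profile is, up to an additive constant, F(ρ) = -kT ln P(ρ), so each peak in the probability becomes a basin in the free energy. The sketch below is a minimal illustration in Python and is not either group's code: the density grid, peak positions, and peak widths are made-up numbers, and the function names are hypothetical.

```python
import math

def free_energy_profile(prob, kT=1.0):
    """Return F_i = -kT * ln(p_i), shifted so the lowest value is zero.

    Bins with zero probability get an infinite free energy.
    """
    f = [(-kT * math.log(p)) if p > 0 else math.inf for p in prob]
    f_min = min(f)
    return [value - f_min for value in f]

def count_basins(f):
    """Count strict interior local minima of a 1D free-energy profile."""
    return sum(
        1
        for i in range(1, len(f) - 1)
        if f[i] < f[i - 1] and f[i] < f[i + 1]
    )

def gaussian_peaks(x, centers, width):
    """Unnormalized sum of Gaussian peaks (an assumed toy distribution)."""
    return sum(math.exp(-0.5 * ((x - c) / width) ** 2) for c in centers)

# Hypothetical density grid in g/cm^3; the values are illustrative only.
densities = [0.85 + 0.005 * i for i in range(61)]

# One peak mimics a single liquid; two well-separated peaks mimic
# coexisting low- and high-density liquids.
one_phase = [gaussian_peaks(rho, [1.00], 0.04) for rho in densities]
two_phase = [gaussian_peaks(rho, [0.92, 1.08], 0.03) for rho in densities]

print("basins, single-peak case:", count_basins(free_energy_profile(one_phase)))
print("basins, double-peak case:", count_basins(free_energy_profile(two_phase)))
```

In a real calculation P(ρ) comes from a noisy histogram rather than a smooth formula, so a basin count like this one would need smoothing and error bars before it could be trusted.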
Simulating a complicated liquid such as water is inherently tricky, and discrepant results aren’t unusual. There are many ways to model the forces between molecules, and each of them is imperfect. What made this discrepancy so baffling, however, was that the Princeton and Berkeley groups had used the exact same water model, a variant known as ST2. The computed free-energy landscapes should have been identical.
Chandler and Debenedetti immediately began working together to get to the bottom of things. They used a private room at the conference site to discuss their methods and compare approaches. They agreed to exchange and test static configurations, a basic computational check to make sure the teams really were using the same model and parameters. Over the following months, Limmer, who had run the Berkeley simulations, corresponded intensely with Yang Liu, the graduate student who had run the Princeton simulations, to figure out what could have gone wrong.
Then, with the two teams still no closer to a solution, the initially cordial relationship soured. Both sides suggest the turning point was the Princeton group’s decision to publish new results in 2012, before the teams had reached a consensus. Debenedetti recalls that when he sent his Berkeley counterpart the preprint, “Chandler was not pleased.”
“I guess that was kind of a natural point that caused everyone to get frustrated and want to stop talking,” remembers Limmer. After that, he says, “we were communicating only through journals.”
And as communication between the groups chilled, the debate over their simulations heated up.
“A lot of shouting”
Almost from the outset, Chandler and Limmer had several working hypotheses for why the two teams’ simulations disagreed. The one that really seemed to stick was the idea that the Princeton team simply wasn’t running its simulations long enough. In essence, they said, what the Princeton researchers interpreted as a second liquid phase might be just a solid phase that wasn’t finished freezing. “It was quite clear if you ran [the simulation] longer, what you really saw was crystallization that was just really slow to progress,” Limmer says.
The problem was that the calculations were so computationally demanding that the Princeton scientists couldn’t run their simulations longer. Their computers weren’t fast enough. The Berkeley researchers, on the other hand, used an algorithmic trick to speed up their code, which they say was why they could see the putative second liquid phase for what it really was: ordinary ice.
Forced to play catch-up, Debenedetti hired Jeremy Palmer, a newly minted PhD with five years of programming experience, to help streamline the Princeton codes. Palmer recalls that within a few months, “we got like a factor of three or four speed-up.” But even with the longer simulations, the Princeton group still saw no hints of ice in their second liquid phase.
For the next year or two, the two sides were stuck in a stalemate. The Princeton team continued to publish results claiming clear demonstrations of two liquid phases; the Berkeley team continued to publish scathing rebuttals arguing that their Princeton counterparts were bungling the fundamental science.
Things came to a head at a 2013 conference on liquids in Bristol, UK. Debenedetti had planned to give a talk there but pulled out at the last minute and asked Palmer to speak instead. Palmer presented their latest simulation work, essentially a replication of the group’s original free-energy calculations except with a different computational method. Chandler was in the audience, ready to pounce.
Alan Soper, the session chair, recalls a particularly aggressive outburst from Chandler during the question-and-answer period after Palmer’s talk. The Berkeley professor took the stage unexpectedly and engaged Palmer in a heated exchange. It “certainly did not resolve anything as I remember it,” Soper says. “Just a lot of shouting at each other.”
Amid the shouts, however, was the seedling of a breakthrough. At one point in the discussion, Palmer recalls, “Chandler said, ‘Well, you’re getting this disagreement with our simulation results. Tell me what the problem is.’” Since the beginning, the Princeton theorists had been mum on the question. But by that point, they had checked virtually every aspect of their code. They had repeated their free-energy calculations using half a dozen different methods. They had shown that, although the second liquid phase does eventually turn to ice, the ice can transition back to the second liquid phase—a sign that the two phases are distinct.
As Debenedetti puts it, “We did as much computational proof as one can do.” So, challenged in front of his peers to posit an explanation for the conflicting results, Palmer gave public voice to the group’s private suspicions: that there was a bug in the Berkeley code.
The Berkeley swap
Palmer’s suggestion was purely speculative. To prove that the Berkeley team’s code was flawed, he and his colleagues would first need to get their hands on it. When he and Debenedetti asked their Berkeley counterparts for the code in late 2013, Limmer and Chandler didn’t so much say no as say nothing. “I don’t think there was ever a straightforward decline,” says Palmer. But the code didn’t materialize.
A year later, the Princeton group tried a new approach. After Chandler published an arXiv paper criticizing the Princeton work, Debenedetti and his coworkers responded by announcing that they would publish their code for all to see. “We were hoping they would show in-kind effort and post their code,” says Palmer. The gambit didn’t pay off. The Princeton team’s repeated requests for the Berkeley code went unanswered for more than two years.
Limmer maintains that he and his mentor weren’t trying to hide anything. “I had and was very willing to share the code,” he says. What he didn’t have, he says, was the time or personnel to prepare the code in a form that could be useful to an outsider. “When Debenedetti’s group was making their code available,” Limmer explains, “he brought people in to clean up the code and document and run tests on it. He had resources available to him to be able to do that.” At Berkeley, “it was just me trying to get that done myself.”
Nevertheless, the Berkeley team eventually forced its own hand. In early 2016, Chandler published in Nature a critique of simulations the Princeton group had reported two years earlier. Buried six paragraphs into the manuscript was the line that would finally crack open the supercooled water dispute:
“The issue is not in the reliability of simulation algorithms and codes, but rather in the using of codes in ways consistent with reversibility, which can be challenging owing to slow relaxation. LAMMPS codes used in refs 5 and 12 are standard and documented, with scripts freely available upon request.”
The LAMMPS codes Chandler referenced were the ones he and Limmer had used to perform their original free-energy calculations. In that five-word phrase, “scripts freely available upon request,” Palmer and Debenedetti saw an opening, and almost immediately they began to work it. Says Debenedetti, “We wrote to them and said, ‘Send us your code.’”
After several email exchanges with Limmer and Chandler and a direct plea to Nature editors, the Princeton group finally received a working version of the Berkeley code. Palmer, then an assistant professor at the University of Houston, led the effort to locate potential bugs. “We already had some guesses for where it might be easy for people to make mistakes,” he says. It took only a week or so for him to skim the basic structure of the code and identify three or four places where a bug might be. Then it was just a matter of testing each one, a process that took Palmer, Debenedetti, and their team of coworkers a few months. By summer, they had pinpointed the error.
As it turned out, the trouble stemmed from the algorithmic trick the Berkeley team had used to speed up its code. Both teams had performed their free-energy calculations using Monte Carlo simulations, which can be used to find the low-energy states of a molecular ensemble by randomly sampling—and systematically accepting or rejecting—various potential ensemble configurations. To do Monte Carlo, you need an efficient way to generate those sample configurations. The Berkeley team chose to generate them by running short molecular dynamics simulations in which molecules were initialized with random positions and velocities.
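The accept-or-reject step at the heart of Monte Carlo sampling can be sketched in a few lines. The example below is a generic textbook Metropolis sampler in Python, applied to a single particle in a harmonic well; it is not the Berkeley or Princeton code, and it proposes moves by simple random displacement rather than by short molecular dynamics runs. The potential, step size, seed, and function names are all assumptions made for illustration.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng):
    """Metropolis rule: always accept downhill moves; accept an uphill
    move with probability exp(-delta_e / kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

def harmonic_energy(x):
    """Toy potential U(x) = x^2 / 2 in reduced units (an assumption)."""
    return 0.5 * x * x

def sample_mean_energy(n_steps, kT=1.0, step=2.0, seed=0):
    """Run a 1D Metropolis chain and return the average potential energy."""
    rng = random.Random(seed)
    x = 0.0
    e = harmonic_energy(x)
    total = 0.0
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)   # propose a random displacement
        e_trial = harmonic_energy(trial)
        if metropolis_accept(e_trial - e, kT, rng):
            x, e = trial, e_trial              # move accepted
        total += e                             # rejected moves recount the old state
    return total / n_steps

mean_e = sample_mean_energy(200_000)
print("sampled mean potential energy:", round(mean_e, 3))
print("equipartition value kT/2:     ", 0.5)
```

For this toy potential, Boltzmann statistics predict an average potential energy of kT/2, so comparing the sampled mean with that value is a simple check that the chain is sampling the intended temperature.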
The procedure the Berkeley team used to initialize the molecular dynamics simulations was unorthodox—it involved randomly selecting a pair of molecules and then swapping the velocities of their constituent atoms. Palmer and company discovered that the technique produced sample configurations that seemed to flout basic laws of statistical mechanics: The energies deviated from the expected equilibrium values, governed by the Boltzmann distribution, and the molecules’ rotational and translational temperatures didn’t match up. Perhaps most important, the molecules behaved as if they were tens of degrees hotter than their assigned temperature.
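A mismatch between rotational and translational temperatures is the kind of symptom an equipartition check can expose. At equilibrium each quadratic degree of freedom carries kT/2 of kinetic energy on average, so temperatures computed separately from center-of-mass motion and from rigid-body rotation should agree. The Python sketch below illustrates that generic check and is not the Princeton team's analysis. It assumes rigid molecules with three translational and three rotational degrees of freedom each, reduced units with Boltzmann's constant set to 1, and hand-picked velocities; all names are hypothetical.

```python
def kinetic_temperature(kinetic_energy, n_dof, k_b=1.0):
    """Equipartition estimate: T = 2 * KE / (n_dof * k_B)."""
    return 2.0 * kinetic_energy / (n_dof * k_b)

def translational_ke(masses, com_velocities):
    """Kinetic energy of the molecular centers of mass."""
    return sum(
        0.5 * m * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses, com_velocities)
    )

def rotational_ke(inertias, angular_velocities):
    """Rigid-body rotational kinetic energy from principal moments of
    inertia and body-frame angular velocities."""
    return sum(
        0.5 * (ix * wx * wx + iy * wy * wy + iz * wz * wz)
        for (ix, iy, iz), (wx, wy, wz) in zip(inertias, angular_velocities)
    )

def temperatures_agree(t_a, t_b, rel_tol=0.05):
    """True if the two temperatures differ by at most rel_tol (relative)."""
    return abs(t_a - t_b) <= rel_tol * max(t_a, t_b)

# Four hypothetical rigid molecules with illustrative, hand-picked values.
n = 4
masses = [1.0] * n
inertias = [(1.0, 1.0, 1.0)] * n
velocities = [(1.0, 1.0, 1.0)] * n
omegas_ok = [(1.0, 1.0, 1.0)] * n    # rotations consistent with translations
omegas_hot = [(2.0, 2.0, 2.0)] * n   # rotations carrying too much energy

t_trans = kinetic_temperature(translational_ke(masses, velocities), 3 * n)
t_rot_ok = kinetic_temperature(rotational_ke(inertias, omegas_ok), 3 * n)
t_rot_hot = kinetic_temperature(rotational_ke(inertias, omegas_hot), 3 * n)

print("consistent sample agrees:", temperatures_agree(t_trans, t_rot_ok))
print("hot-rotation sample agrees:", temperatures_agree(t_trans, t_rot_hot))
```

Instantaneous temperatures fluctuate in any finite system, so in practice the comparison would be made on averages over many sampled configurations, with a tolerance chosen to match the expected fluctuations.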
Suddenly it made sense that the Berkeley researchers hadn’t seen a second liquid phase; they were effectively running their simulations at temperatures well above the critical point. The moment the Princeton group replaced the unorthodox sampling scheme with a standard one, the discrepancy went away.
“One person left”
In February 2017, the Princeton group submitted its findings to the Journal of Chemical Physics. We’ll never know if the results were enough to sway Chandler; that April, the 72-year-old professor emeritus succumbed to prostate cancer. But Limmer, now an assistant professor at Berkeley, continues to defend the team’s work. He maintains that, although the Berkeley team’s sampling technique may have been unconventional, it wasn’t necessarily wrong. “There is no reason why in principle one has to initialize velocities in a specific way in order to sample the right ensemble,” he says. “There are lots of different schemes that invent fictitious dynamics to help facilitate equilibration.”
At the invitation of the Journal of Chemical Physics, Limmer penned a response to the Princeton team’s claims, but it didn’t survive peer review. In April of this year, the Princeton team’s paper was published with no rebuttal, and a seven-year saga came to a close.
“I think the preponderance of opinion has probably swung to the Debenedetti side now,” says Alan Soper. Indeed, other groups have now corroborated the Princeton team’s free-energy calculations. Says Soper, “There’s only one person left supporting the Limmer–Chandler simulations.”
For Palmer, the ordeal exemplifies the importance of transparency in scientific research, an issue that has recently drawn heightened attention in the science community. “One of the real travesties,” he says, is that “there’s no way you could have reproduced [the Berkeley team’s] algorithm—the way they had implemented their code—from reading their paper.” Presumably, he adds, “if this had been disclosed, this saga might not have gone on for seven years.”
Debenedetti sees a silver lining. He notes that the seven-year back-and-forth forced theorists to think deeply about the nature of metastability, and it advanced the state of the art in simulation techniques. But when asked whether the resolution of the debate brings us any closer to resolving the question of whether water has two liquid phases, he takes a long pause, then answers, “No.”
Like any model of water, ST2 is but an imperfect imitation of the real thing: It gets some aspects of water’s behavior right but gets plenty of others wrong. A critical point in an ST2 simulation proves only that a waterlike substance can have two liquid phases, not necessarily that water does. Debenedetti thinks a definitive confirmation of the two-liquid theory of water will have to come from experiments.
Luckily, experimentalists are becoming increasingly deft at pulling back the “crystalline curtain” to see how water behaves below the homogeneous nucleation temperature. High-speed x-ray diffraction techniques, for instance, have made it possible to study supercooled water on a femtosecond time scale (see Physics Today, March 2018, page 18). Those experiments have yielded hints of a critical point, but they leave wiggle room for alternative interpretations. And so, as one battle ends, the war goes on.
Editor’s note, 23 August: The article has been updated to correct the job titles of Jeremy Palmer and David Limmer.