Matters have only gotten messier since Richard Feynman said, in his Nobel lecture in 1965, “We have a habit in writing articles published in scientific journals to make the work as finished as possible, to cover all the tracks, to not worry about the blind alleys or to describe how you had the wrong idea first. …”
It’s a problem that Joan Warnow-Blewett recognized years ago. And one that is exacerbated by the trend in recent decades toward large collaborations. “If you have an experiment spread among many institutions, and even countries, no one feels responsible for saving the documents,” she says. Science historians ask questions like, “What is the economy of very large projects? How are they paid for? What structure do they have?” says Peter Galison, a historian and high-energy physicist at Harvard University. “There are a whole host of issues that surround the way knowledge is certified and enters circulation. Scientists don’t write letters as they once did, but there is more documentation. From eight hours on a big collaboration, more data piles up than all the documentation we have on Galileo. Amidst a huge amount of data, what do you save?”
To see which documents to save and how to save them, Warnow-Blewett and colleagues at the American Institute of Physics’s Center for History of Physics carried out a decade-long study of physics collaborations involving at least three institutions. They visited archives and interviewed more than 600 participants from nearly 60 experiments from the 1970s through the present, focusing first on high-energy physics at accelerators in the US and at CERN; then on space physics and geophysics (including projects involving both NASA and ESA, the European Space Agency); and wrapping up with a medley of subfields including ground-based astronomy, particle and nuclear physics, medical physics, materials science, and experiments controlled remotely via computer.
“When we compare the scope of the records needed to document collaborations against our assessment of current archival policies and practices, the urgency of our project recommendations is abundantly clear,” the researchers write in Documenting Multi-Institutional Collaborations, which was completed earlier this year. “My particular job was to understand how the collaborations worked well enough to understand how to document them,” says Warnow-Blewett. (The report is available free of charge from AIP, Center for History of Physics, One Physics Ellipse, College Park, MD 20740-3843; phone 301-209-3165; e-mail [email protected]; or on the Web at http://www.aip.org/history/pubslst.htm#collabs.)
Rank and record
Perhaps the report’s most surprising finding is that project management and organizational structure—ranging from a pyramid hierarchy to a flat democracy—are not correlated with scientific specialty. “The only exception was high-energy physics, where decision-making is not hierarchical. They have reasons for getting their members involved in everything,” says Warnow-Blewett.
Initial proposals and other core records should be kept for all big physics collaborations, and archivists should become involved early in a project’s planning stages. In addition to such one-size-fits-all recommendations, the report sketches out by subfield the types of records that future historians are likely to want. “You have to find out, Where in the collaborations are the real decisions made? Which are the important committees?” says Warnow-Blewett. “That’s how you find the crucial documents.”
One of the report’s key recommendations is that more records be kept for projects deemed most scientifically significant. Among the records to keep in particle and nuclear physics, for example, are proposals for beam time on accelerators, contracts between host labs and collaborations, and records of lab directors. For top-ranked projects, add intracollaboration mailings and other records from spokespeople, group leaders, project managers and engineers, and some technical records. The list for materials science collaborations starts with proposals and contracts for using synchrotron light sources and other major facilities, and, for the most important projects, includes newsletters plus records from external advisory committees, annual meetings, spokespeople, and staff directors.
The report also recommends a small increase in overhead rates at universities to pay for documenting experiments. The cost would be less than 1% of grants, Warnow-Blewett says. The records that should be saved represent about 5% of the total generated, she adds. “[What’s saved has] got to be small in volume but dense in information.”
“Understanding and documenting large-scale science is a problem we will have for the foreseeable future,” says Galison, who served as an adviser to the AIP study. “The genome, Sloan Digital Sky Survey, CERN, Fermilab, are important examples. The science is not captured by one person’s work, by biography, or by a Nobel Prize. The AIP project focuses attention on the need to preserve documents in order to grasp the way knowledge is produced.”
DOE does it best
Take the Deep Sea Drilling Project, an international collaboration that was run mainly out of the Scripps Institution of Oceanography in La Jolla, California, and one of the projects studied by AIP. “It yielded results important to understanding the structure of Earth and confirming plate tectonics,” says Scripps archivist Deborah Day. “There would have been a chief scientist for each leg of the ship’s tracks. Those leg files are where the scientific meat of the expedition was—they gave information, photographs, reports of scientists. They are what people now come to study. [At first] people didn’t realize that the leg files would be permanently valuable. There was no person in charge of the records.”
What AIP has told archivists, says Day, “is that we need to be proactive. Find out what projects are going on, and select them for retention. We have to go to the principal investigators, and tell them, ‘I am here to work with you.’ If archivists don’t get involved while the project is still alive, Web pages disappear, scientists leave, and paper records get lost. At that point there is no history except memory.”
“It’s not an exaggeration to say that I have my job because of the AIP study,” says SLAC archivist Jean Deken. The study grew out of Warnow-Blewett’s earlier look at DOE national labs, and that’s where it’s had the most impact so far. Not only does SLAC now have a full-time archivist—and Lawrence Livermore National Laboratory is seeking one—but DOE’s new rules for identifying and saving R&D documents were influenced by AIP’s findings, including requiring that projects be ranked. “This shouldn’t be radical, but it was. Before, nobody ever took the project-by-project approach,” says Deken. “A Level 1 project is groundbreaking, it has international significance or alters the direction of research—all significant documents are kept permanently for these projects. Level 2 is important, it increases the scope of knowledge, but is not earth shattering—records are kept for 25 years. Everything else goes into Level 3, and records are kept for 10 years.” The BaBar detector and PEP-II storage ring at SLAC, and the National Ignition Facility at Lawrence Livermore, for example, are Level 1 projects. “The beauty of ranking,” says Deken, “is that scientists respect and understand it immediately.”
DOE keeps the best records, says Warnow-Blewett. “It could be used as a model for other federal agencies—which haven’t come to terms with the importance of documenting the science they’re in the business to do.”