Dual-anonymous peer review gains traction

16 December 2021

NASA and other agencies worldwide are adopting the anonymized process for awarding grants and instrument time after seeing its success for the Hubble Space Telescope.

ALMA antennas.
Two ALMA antennas point toward the skies over Chile with the arc of the Milky Way overhead. The latest round of proposals for ALMA observing time underwent dual-anonymous peer review. Credit: Pablo Carrillo (ESO/NAOJ/NRAO)

Three decades after the launch of the Hubble Space Telescope, the demographics of the astronomers and astrophysicists who win observation time on the telescope have begun to change.

In response to gender discrepancies and other biases that were perpetuated by its traditional time-allocation process, the Space Telescope Science Institute (STScI) in 2018 adopted an approach to proposal reviews in which neither the applicant nor the reviewer knows the other’s identity (see the article by Lou Strolger and Priyamvada Natarajan, Physics Today online, 1 March 2019). Evidence suggests that the Hubble dual-anonymous peer-review process, now in its fourth cycle, is reducing selection bias against women and early-career scientists.

Scientific agencies worldwide are following suit and expanding the process to diverse facilities. Over the next few years, NASA plans to implement dual-anonymous review across nearly all its programs, becoming the first federal science agency in the US to do so.

Anonymizing the process

The push to improve the Hubble process began in 2014, when Neill Reid, STScI’s associate director of science, examined the demographics of the principal investigators (PIs) who received observing time over 11 proposal cycles from 2001 to 2012. He found a statistically significant difference between the time awarded to female and male PIs. In response, the institute tried measures such as going through each proposal and removing the name of the PI from the list of research team members. “But nothing changed,” Reid says. “When you’re discussing a science proposal, you should just talk about the science. So that’s what we did next.”

As a result of both Reid’s study and recommendations by social scientists Stefanie Johnson and Jessica Kirk from the University of Colorado Boulder, the STScI established a working group to explore how to implement dual-anonymous peer review. “We didn’t want this to be harder for the investigators or for the referees,” says Lou Strolger, an STScI observatory scientist who led the working group. His team prepared guidelines for submitting a proposal under the new process. The biggest change was a requirement that proposers not take credit for previous research, so that they would not identify themselves. Applicants would also be asked to submit a non-anonymized “Team Expertise and Background” document describing their capability to do the proposed work.

Hubble proposal success rates, by gender.
The success rates for Hubble Space Telescope observing time proposals for Cycles 11 through 21, corresponding to 2001–12, indicate a bias toward male principal investigators. The overall success rate for male PIs was 23.5%, compared with 18.6% for female PIs. The line shows the fraction of submitted proposals with female PIs for each cycle. Source: I. N. Reid, PASP 126, 923 (2014); graph created with Flourish

The review process, conducted through panel discussions among the referees, was left largely unchanged, with the notable addition of an appointed impartial bystander to reorient discussion if it began to focus too much on the PI rather than the proposal. Only in a later round of review would the referees turn to the expertise documents, with the possible outcome of downgrading a proposal.

The STScI group polled the Hubble user community about its proposed review process. It found that more-junior scientists might be more enticed to start leading research efforts with dual-anonymous review than they would with a more traditional process, in which the PI is the primary focus of attention. Other users feared that the quality of the science would be reduced and that the application process would become cumbersome due to unfamiliarity with the guidelines. The feedback depended on seniority and gender, Reid says: Senior male astronomers were more likely to say anonymous review didn’t make sense because they wouldn’t be able to talk about what they’d done in the past. Female and more-junior researchers were more likely to say, “This is a great idea because I’m not going to be automatically disadvantaged.”

As reported by the National Academies of Sciences, Engineering, and Medicine in the recent report Pathways to Discovery in Astronomy and Astrophysics for the 2020s, the hopes of the STScI working group were validated. The three completed dual-anonymous peer-review cycles resulted in 30% of winning proposals coming from PIs who had not led successful proposals previously, up from around 5% in the prior decade. “That’s a phenomenal result,” Strolger says. “We didn’t expect it to be so striking.” Additionally, award rate differences by gender shrank to roughly one percentage point, compared with about five percentage points in previous rounds.
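The "about five percentage points" figure for earlier rounds can be checked against the Cycle 11 through 21 success rates quoted in the figure above (23.5% for male PIs, 18.6% for female PIs). The short Python sketch below does that arithmetic. The function names are illustrative only and are not taken from Reid's study or from STScI; the relative ratio is an extra way of viewing the same two numbers, not a statistic the article reports.

```python
def percentage_point_gap(rate_a: float, rate_b: float) -> float:
    """Absolute difference between two rates that are already in percent."""
    return rate_a - rate_b


def relative_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of rate_b to rate_a (1.0 would mean equal success rates)."""
    if rate_a <= 0:
        raise ValueError("rate_a must be positive")
    return rate_b / rate_a


# Success rates quoted in the article for Hubble Cycles 11 through 21.
male_rate = 23.5
female_rate = 18.6

# Rounded to absorb floating-point noise in the subtraction and division.
gap = round(percentage_point_gap(male_rate, female_rate), 1)
ratio = round(relative_ratio(male_rate, female_rate), 2)

print(f"Gap: {gap} percentage points")
print(f"Female-to-male success ratio: {ratio}")
```

A gap expressed in percentage points is an absolute difference; the ratio shows the same disparity in relative terms, which can be easier to compare across programs whose overall success rates differ.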

The new normal?

Daniel Evans, assistant deputy associate administrator for research at NASA, was impressed by the STScI team’s reversal of a nearly two-decade statistical bias against female and first-time investigators. Inspired by that success, NASA tested dual-anonymous peer review for 10 of its science programs in 2020 and expanded it to 20 programs in 2021. Last year NASA announced its intent to implement dual-anonymous review across all its Astrophysics General Observer and General Investigator programs, and Evans says it will soon expand to all NASA grant programs.

Hubble observing proposal success rates for first-time PIs.
Percentage of first-time principal investigators with successful proposals to use the Hubble Space Telescope, as a function of observing cycle. Observing cycles 26–29 used a dual-anonymous peer-review process. Source: R. A. Osten/STScI; graph created with Flourish

One of the programs from the 2021 trial is Cryospheric Sciences. “We’re a relatively small community,” says Thorsten Markus, the program manager. “Review panelists often know proposers and might have unconscious biases.” During traditional peer review for research proposals, that familiarity could, for example, allow established scientists to “get away with things that early-career scientists would not,” he says. For instance, a proposal that lacked scientific detail might nonetheless be rated highly on the basis of the reviewers’ knowledge of the investigator’s experience.

To Markus, the most important part of any review process is to “fund a balanced portfolio of both the science and the researchers.” In its first dual-anonymous review cycle in 2021, the Cryospheric Sciences Program selected 11 projects, roughly half of which were led by women and half by men, with a “balanced mix of early-career and senior scientists,” says Markus. “This is what we’d hoped for.” By contrast, last year just 4 of the 16 selected projects were led by women, and few were led by early-career scientists.

Other NASA programs have seen similar results. Prior to the trial of dual-anonymous peer review, women constituted 26% of the applicant pool in NASA’s Astrophysics Data Analysis Program (ADAP) but finished in the top two places in the rankings, which essentially guarantees funding, only 16% of the time. After the switch to dual-anonymous review, women constituted 31% of the pool and finished in the top two places 32% of the time. “That was a real ‘wow’ moment for us,” says Evans.
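One way to read the ADAP numbers is to divide women's share of top-ranked finishes by their share of the applicant pool. The Python sketch below computes that quotient from the percentages quoted above. The "representation ratio" name and the function are assumptions made for illustration; NASA does not describe its results with this metric in the article.

```python
def representation_ratio(share_of_top: float, share_of_pool: float) -> float:
    """Share of top-ranked finishes divided by share of the applicant pool.

    A value near 1.0 means a group reaches the top ranks in proportion to
    how often it applies; a value below 1.0 means it is underrepresented.
    Both arguments are percentages.
    """
    if share_of_pool <= 0:
        raise ValueError("share_of_pool must be positive")
    return share_of_top / share_of_pool


# ADAP percentages quoted in the article.
before = round(representation_ratio(share_of_top=16, share_of_pool=26), 2)
after = round(representation_ratio(share_of_top=32, share_of_pool=31), 2)

print(f"Before dual-anonymous review: {before}")
print(f"After dual-anonymous review:  {after}")
```

On these figures the ratio moves from roughly 0.6 to roughly 1.0. The quotient only summarizes the quoted percentages and says nothing about statistical significance, which would require the underlying proposal counts.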

Cryospheric Sciences, ADAP, and other NASA programs evaluate incoming proposals through a panel charged with ranking them on scientific merit, relevance, and cost. Markus and Evans both observed that the tenor of the NASA review panels’ discussions has changed since the introduction of dual-anonymous review. For example, reviewers say that they feel liberated to focus on the science—even if they can guess who the proposer might be. Following NASA’s first trial run last year, 80% of reviewers polled agreed or strongly agreed that the process improved the overall quality of the peer review.

International efforts

Dual-anonymous review is also gaining traction outside the US. In 2016 Carol Lonsdale of the National Radio Astronomy Observatory (NRAO) in Charlottesville, Virginia, reviewed data from four facilities around the world that are jointly operated by NRAO. She found that referees assigned proposals led by women significantly poorer rankings than they did proposals led by men. John Carpenter, an observatory scientist at one of those facilities, the Atacama Large Millimeter/submillimeter Array (ALMA) in Chile, performed a similar analysis for ALMA. He found that in addition to gender, the proposal rankings varied along with the country of the PI’s institution and the number of ALMA proposals the PI had submitted in the past. The results led ALMA to adopt a dual-anonymous peer-review system in May 2021.

“Thanks to Hubble’s success, it wasn’t controversial for us to get started,” says Carpenter. He and his colleagues published guidelines for investigators to anonymize their research proposals. “In some cases, it’s still obvious who the proposal team is,” he says. Still, review discussions have deemphasized investigator identity.

ALMA’s first dual-anonymous proposal cycle selected 253 projects that began observing in October 2021. “We’ve had positive feedback so far,” Carpenter says. “People view this as more fair.”

Australia's National Computational Infrastructure.
Gadi is Australia’s most powerful supercomputer. The National Computational Infrastructure where it is housed has begun a trial of reviewing research proposals with a dual-anonymous process. Credit: NCI Australia

In 2019 Isabelle Kingsley, a research associate in the Australian Government’s Women in STEM Ambassador office, and Lisa Harvey-Smith, an astrophysicist and Australia’s Women in STEM Ambassador, invited Australian research organizations to participate in a national trial of dual-anonymous peer review for in-demand research equipment and facilities. The National Computational Infrastructure and three other large organizations, whose facilities include accelerators, synchrotrons, supercomputers, and telescopes, accepted the invitation. The Australian study is among the first to extend dual-anonymous review to nonastronomical research facilities. The trial began in mid-2020 and will continue through the middle of 2022.

Some applicants have expressed concern that anonymization prevents accomplished researchers from leveraging their expertise and reputations, Kingsley acknowledges. (As with the NASA process, team expertise and background are reviewed only later in the process.) But guidelines and examples encourage applicants to describe that experience or access to necessary tools in a way that does not identify them, she says. For example, a proposal could say that the team has access to telescope time on the W. M. Keck Observatory, which will enable spectroscopic follow-up of the galaxies in the sample. “You don’t need to water down your expertise,” Kingsley says.

Looking ahead

The National Academies’ Astro2020 report recommends that dual-anonymous review trials evaluate whether the process broadens the community in terms of gender and other demographics, including ethnicity. But gathering demographic data, including on gender, involves some guesswork. To ensure anonymity and avoid bias, gender information cannot be made obvious in a proposal, and applicants can’t be required to share it separately. Strolger and Carpenter say that staff at STScI and ALMA try to determine each PI’s gender via online searches once the review process is complete.

“Even though there are policies and principles that keep us from collecting demographic data directly, it’s clear that in this instance having the data is important in tracking how things are progressing,” Strolger says. Priyamvada Natarajan, an astrophysicist at Yale University who chairs the Astronomy and Astrophysics Advisory Committee that advises NASA, NSF, and the Department of Energy, agrees. One possible approach, she says, is to set up an independent third-party body that would collect demographic data with assurance of total confidentiality to allay fears of improper uses of the data by funding agencies.

Early returns suggest that dual-anonymous peer review is a powerful tool for increasing equity in access to research funding. Ultimately, publication records will determine whether teams are similarly delivering on the promises they make in their proposals, but it will take time to make that determination. Evans adds that the review strategy “is not the be all, end all.” It should be viewed, he says, as one essential tool of many that are needed to make a transformative and long-lasting impact on diversity, equity, and inclusion within the scientific workforce.

