Many US physics departments are considering dropping the use of Graduate Record Examinations (GREs) in making admissions decisions (see, for example, the commentary by Alexander Rudolph, Physics Today, June 2019, page 10). They are concerned that the exams contribute to the profession’s nonrepresentative demographics. The American Physical Society (APS) Panel on Public Affairs is looking at adopting a similar position. Those decisions may be influenced by a widely publicized Science Advances paper entitled “Typical physics Ph.D. admissions criteria limit access to underrepresented groups but fail to predict doctoral completion,” by Casey Miller and coauthors.1
Although that paper uses data provided by many physics departments, I found some serious statistical flaws in its analysis. Contrary to its conclusions, proper statistical analysis of even the incomplete published features of the data indicates that an equal-weight sum of the quantitative and physics GREs is somewhat better than undergraduate grade point average at predicting who will graduate.2
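To make the kind of comparison at issue concrete, here is a minimal sketch, using entirely synthetic data and assumed coefficients, of how one might compare an equal-weight GRE composite against undergraduate GPA as single predictors of doctoral completion. It is not the analysis of reference 2; the generative model, the numbers, and the auc_for helper are illustrative assumptions only.

```python
# Hypothetical illustration: synthetic data, not the study's records.
# Compares two one-variable logistic regressions of Ph.D. completion:
# (1) an equal-weight sum of quantitative and physics GRE percentiles,
# (2) undergraduate GPA.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 2000

# Simulated applicant features (all distributions are assumptions).
gre_q = rng.normal(70, 15, n)   # quantitative GRE percentile
gre_p = rng.normal(55, 20, n)   # physics GRE percentile
gpa = rng.normal(3.5, 0.3, n)   # undergraduate GPA

# Assumed "true" completion model, used only to generate outcomes.
logit = -1.0 + 0.015 * (gre_q + gre_p) + 0.3 * (gpa - 3.5)
completed = rng.random(n) < 1 / (1 + np.exp(-logit))

def auc_for(feature):
    """Cross-validated AUC for a one-variable logistic model."""
    X = feature.reshape(-1, 1)
    scores = cross_val_predict(LogisticRegression(), X, completed,
                               cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(completed, scores)

print("AUC, equal-weight GRE sum:", round(auc_for(gre_q + gre_p), 3))
print("AUC, undergraduate GPA:   ", round(auc_for(gpa), 3))
```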
I believe the key issues raised are the need for more transparency and statistical literacy in handling data, but the effects of graduate admissions policies themselves are also important. Systematic uncertainties in estimating the effects of using GREs in admissions decisions would remain even after a proper analysis of more complete data,2 as is typical for any attempt to estimate causal parameters from observational data.3 Therefore, it may be worth trying a more robust way to get information on those effects.
Given the fairly large number of physics departments that are uncertain about what the GRE’s role in the admissions process should be, APS could ask departments to volunteer for a randomized controlled trial. Some departments would be assigned to GRE-aware admissions and others to GRE-blind admissions. Ideally, the assignments would be switched after a year. Beyond graduation rates, various other outcomes of interest could be tracked. Departments could participate in long-term follow-up even if they committed to only two years of randomized admissions policy. Incremental costs on top of the already labor-intensive selection procedures should be small, perhaps even negative, if one counts the time saved in decision making.
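As a rough sketch of that crossover design, the assignment step could look like the following. The department names are placeholders, not actual participants, and the assign_crossover helper is hypothetical.

```python
# Hypothetical sketch of the proposed trial: volunteer departments are
# randomized to GRE-aware or GRE-blind admissions in year 1, then the
# arms are swapped in year 2 (a simple crossover).
import random

def assign_crossover(departments, seed=2019):
    """Return {department: (year-1 arm, year-2 arm)} with a balanced split."""
    rng = random.Random(seed)
    shuffled = departments[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    plan = {}
    for dept in shuffled[:half]:
        plan[dept] = ("GRE-aware", "GRE-blind")
    for dept in shuffled[half:]:
        plan[dept] = ("GRE-blind", "GRE-aware")
    return plan

volunteers = ["Dept A", "Dept B", "Dept C", "Dept D"]  # placeholder names
for dept, (y1, y2) in assign_crossover(volunteers).items():
    print(f"{dept}: year 1 = {y1}, year 2 = {y2}")
```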
Although the information obtained might be inconclusive, at least the setup could be a model for approaching policy issues scientifically and honestly. That’s important when we consider that our credibility on the really big issues—climate, for example—has been challenged by people who wrongly claim we are just pushing political positions disguised as science.