Policy-guided Monte Carlo is an adaptive method to simulate classical interacting systems. It adjusts the proposal distribution of the Metropolis–Hastings algorithm to maximize the sampling efficiency, using a formalism inspired by reinforcement learning. In this work, we first extend the policy-guided method to deal with a general state space, comprising, for instance, both discrete and continuous degrees of freedom, and then apply it to a few paradigmatic models of glass-forming mixtures. We assess the efficiency of a set of physically inspired moves whose proposal distributions are optimized through on-policy learning. Compared to conventional Monte Carlo methods, the optimized proposals are two orders of magnitude faster for an additive soft sphere mixture but yield a much more limited speed-up for the well-studied Kob–Andersen model. We discuss the current limitations of the method and suggest possible ways to improve it.
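To make the idea concrete, below is a minimal, self-contained sketch of what tuning a Metropolis–Hastings proposal with a reinforcement-learning-style update can look like. It is not the implementation used in this work: the toy double-well potential, the Gaussian displacement proposal with a single learnable log step size, and the acceptance-times-squared-displacement reward are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-dimensional target: Boltzmann weight of a double-well potential
# (illustrative only, not a model studied in the paper).
def potential(x):
    return (x**2 - 1.0)**2

def log_target(x, beta=2.0):
    return -beta * potential(x)

# Parameterized proposal ("policy"): Gaussian displacement with step size exp(theta).
def propose(x, theta):
    sigma = np.exp(theta)
    dx = sigma * rng.standard_normal()
    return x + dx, dx

def grad_log_proposal(dx, theta):
    # d/dtheta of log N(dx; 0, sigma^2) with sigma = exp(theta)
    sigma = np.exp(theta)
    return (dx / sigma)**2 - 1.0

theta = np.log(0.1)      # initial log step size
x = 0.0
learning_rate = 1e-3

for step in range(20000):
    y, dx = propose(x, theta)
    # Metropolis acceptance probability (the proposal is symmetric in x and y)
    acc = np.exp(min(0.0, log_target(y) - log_target(x)))
    # Reward: acceptance probability times squared displacement,
    # a common efficiency proxy (the objective used in the paper may differ).
    reward = acc * dx**2
    # Score-function (REINFORCE-style) update of the proposal parameter
    theta += learning_rate * reward * grad_log_proposal(dx, theta)
    if rng.random() < acc:
        x = y

print("learned step size:", np.exp(theta))
```

In this sketch the score-function estimator nudges the proposal width toward values that balance large displacements against a reasonable acceptance rate; the physically inspired moves and the objective optimized in the paper are more elaborate.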
REFERENCES
We note that this holds even when q and p are “generalized” densities, as long as they are defined with respect to the same measure. We also point out that, in some applications, the Radon–Nikodym derivative cannot be expressed as a ratio of q and p. This occurs, for instance, in spatial point processes where the support of the proposal distribution changes dimension at each step [30, 67].
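For context, the acceptance probability that this note refers to is presumably of the standard Metropolis–Hastings form; the following is a sketch in assumed notation, with p the target density and q the proposal density, not an equation quoted from the main text:
\[
\alpha(x \to x') = \min\!\left(1,\; \frac{p(x')\, q(x \mid x')}{p(x)\, q(x' \mid x)}\right).
\]
In the general, measure-theoretic formulation the density ratio is replaced by a Radon–Nikodym derivative,
\[
\alpha(x \to x') = \min\!\left(1,\; \frac{\mathrm{d}\nu^{\mathrm{T}}}{\mathrm{d}\nu}(x, x')\right),
\qquad
\nu(\mathrm{d}x, \mathrm{d}x') = \pi(\mathrm{d}x)\, Q(x, \mathrm{d}x'),
\]
where \(\pi\) is the target probability measure, \(Q(x,\cdot)\) the proposal kernel (symbols introduced here for illustration), and \(\nu^{\mathrm{T}}\) the image of \(\nu\) under the swap \((x, x') \mapsto (x', x)\). The derivative reduces to the ratio of q and p only when both measures admit densities with respect to a common reference measure, which is precisely the caveat raised above for proposals whose support changes dimension.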
This method is analogous to hybrid Monte Carlo, which is well known in the context of liquid simulations.