There are many atomistic simulation methods with very different costs, accuracies, transferabilities, and numbers of empirical parameters. I show how statistical model selection can compare these methods fairly, even when they are very different. These comparisons are also useful for developing new methods that balance cost and accuracy. As an example, I build a semiempirical model for hydrogen clusters.
I considered several quantum chemistry codes for this work, but I could not get all of the necessary methods working in any single code without modifications. I ultimately chose PySCF because its open-source Python codebase made it easier to find bugs, fix them, and contribute the fixes. The bugs I encountered were logic errors in some post-HF methods when a fully spin-polarized system had no beta electrons, and a memory leak in the CCSD(T) code that caused crashes in long workflows.
The settings most relevant to calculation success rates were SCF convergence tolerances set to 10⁻⁸, a maximum of 10 DIIS vectors, and a maximum of 100 SCF cycles for def2-SVP and 200 for def2-QZVPP. The CCSD calculations used a maximum of 200 iterations and a convergence tolerance of 10⁻⁵ on the cluster operator.
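As a rough illustration, the settings above might be applied in PySCF along the following lines. This is a minimal sketch, not the workflow actually used: the helper names `scf_settings`, `CCSD_SETTINGS`, and `run_ccsd_t` are hypothetical, the choice of unrestricted (UHF/UCCSD) objects is an assumption, and mapping the SCF tolerance onto PySCF's `conv_tol` attribute and the cluster-operator tolerance onto `conv_tol_normt` is my reading of the text rather than something it states.

```python
# Sketch only: helper names are hypothetical, not taken from the original workflow.

# Maximum number of SCF cycles per basis set, as quoted in the text.
SCF_MAX_CYCLES = {"def2-svp": 100, "def2-qzvpp": 200}

# CCSD settings: iteration cap and tolerance on the cluster amplitudes.
# Assumption: "tolerance on the cluster operator" corresponds to conv_tol_normt.
CCSD_SETTINGS = {"max_cycle": 200, "conv_tol_normt": 1e-5}


def scf_settings(basis):
    """Return the SCF attribute settings for a given basis set name."""
    return {
        "conv_tol": 1e-8,  # assumption: mapped to PySCF's SCF energy tolerance
        "diis_space": 10,  # maximum number of DIIS vectors
        "max_cycle": SCF_MAX_CYCLES[basis.lower()],
    }


def run_ccsd_t(atom, basis, spin=0):
    """Run UHF, UCCSD, and the (T) correction; return the total energy in Hartree."""
    # Imported here so the settings helpers above work without PySCF installed.
    from pyscf import gto, scf, cc

    mol = gto.M(atom=atom, basis=basis, spin=spin, verbose=0)

    mf = scf.UHF(mol)
    for name, value in scf_settings(basis).items():
        setattr(mf, name, value)
    mf.kernel()
    if not mf.converged:
        raise RuntimeError("SCF did not converge")

    mycc = cc.UCCSD(mf)
    for name, value in CCSD_SETTINGS.items():
        setattr(mycc, name, value)
    mycc.kernel()
    if not mycc.converged:
        raise RuntimeError("CCSD did not converge")

    # ccsd_t() returns the perturbative triples energy correction.
    return mycc.e_tot + mycc.ccsd_t()
```

With PySCF installed, a call such as `run_ccsd_t("H 0 0 0; H 0 0 0.74", "def2-SVP")` would run the full chain for a single H2 molecule. Raising an error on non-convergence, rather than returning an unconverged energy, makes it straightforward to count failures when tallying success rates.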