TurboGenius: Python suite for high-throughput calculations of ab initio quantum Monte Carlo methods

TurboGenius is an open-source Python package designed to fully control ab initio quantum Monte Carlo (QMC) jobs using a Python script, which allows one to perform high-throughput calculations combined with TurboRVB [K. Nakano et al. J. Phys. Chem. 152, 204121 (2020)]. This paper provides an overview of the TurboGenius package and showcases several results obtained in a high-throughput mode. For the purpose of performing high-throughput calculations with TurboGenius, we implemented another open-source Python package, TurboWorkflows, that enables one to construct simple workflows using TurboGenius. We demonstrate its effectiveness by performing (1) validations of density functional theory (DFT) and QMC drivers as implemented in the TurboRVB package and (2) benchmarks of Diffusion Monte Carlo (DMC) calculations for several data sets. For (1), we checked inter-package consistencies between TurboRVB and other established quantum chemistry packages. By doing so, we confirmed that DFT energies obtained by PySCF are consistent with those obtained by TurboRVB within the local density approximation (LDA), and that Hartree-Fock (HF) energies obtained by PySCF and Quantum Package are consistent with variational Monte Carlo energies obtained by TurboRVB with the HF wavefunctions. These validation tests constitute a further reliability check of the TurboRVB package. For (2), we benchmarked atomization energies of the Gaussian-2 set, binding energies of the S22, A24, and SCAI sets, and equilibrium lattice parameters of 12 cubic crystals using DMC calculations. We found that, for all compounds analyzed here, the DMC calculations with the LDA nodal surface give satisfactory results, i.e., consistent either with high-level computational or with experimental reference values.


I. INTRODUCTION
In recent years, there has been a surge of interest in materials informatics and digital transformation paradigms in the materials science community, which involve utilizing information science and computational chemistry/physics techniques to design or search for novel materials.The kernel method for high-throughput electronic structure calculations is most commonly the Density Functional Theory (DFT), which has been successfully used for designing various materials in silico [1][2][3][4][5][6][7][8] .However, DFT sometimes loses the quantitative predictive power in particular cases, such as materials at extreme conditions or with strong electronic correlation [9][10][11][12][13][14] .Instead, ab initio quantum Monte Carlo (QMC) does not lose predictive power even for such materials because it does not rely on any exchange-correlation functionals in its formalism 15 .However, when it comes to ab initio QMC applications, one of the biggest drawbacks is its complicated computational procedure.Indeed, a QMC study usually requires many involved operations, such as generating trial wave functions, variational optimizations, time-step or lattice-size extrapolation, and finite-size corrections.For instance, a typical workflow of a QMC calculation using the TurboRVB quantum Monte Carlo package 16,17 , is shown in Fig. 1.Automating a) Electronic mail: kousuke 1123@icloud.comsuch required tasks can offer a significant improvement in our productivity, enabling researchers to spend more time doing physics and chemistry rather than launching and monitoring jobs.So far, many workflow management packages have been developed to achieve a more productive research activity and high-throughput calculations.Some representatives are Ai-iDA 18 , AFLOW 19 , Fireworks 20 , and atomate 21 , which have been widely used for generating and/or managing material science database such as NOMAD 22 and Materials Projects 23 .One could immediately exploit these established workflow systems also in QMC calculations, but the combination of a QMC code with these workflow packages is not straightforward due to the complexity of the QMC calculations.Thus, to manage them, we need to implement interfaces, if possible in Python, because most of the established workflow packages are also implemented based on Python, due to its appealing features.In fact, several packages for high-throughput QMC calculations, such as Nexus 24 and QMC-SW 25 , are Python implementations.Wheeler  PyQMC is an all-Python package; thus, it enables one to develop algorithms and complex workflows more flexibly and user-friendly.
TurboGenius is an open-source Python package meant to manage TurboRVB calculations using a python script.In this paper, we explain the basic concepts, designs, functionalities, implemented classes, user interfaces, command-line tools of the package.TurboGenius provides Python classes and command-line tools that fully control TurboRVB jobs, which allow one to realize high-throughput QMC calculations.TurboGenius includes PyTurbo as a sub-package for providing users with more fundamental but more flexible building blocks to control TurboRVB jobs.For demonstrating high-throughput QMC calculations, we also implemented another open-source python package, TurboWorkflows.The demonstrations contain: (1) validations of DFT and QMC implementations of the TurboRVB package and (2) benchmarks of Diffusion Monte Carlo (DMC) calculations for several data sets.As per (1), we confirmed that DFT energies obtained by PySCF 27,28 are consistent with those obtained by TurboRVB within the local density approximation (LDA), and Hartree-Fock (HF) energies obtained by PySCF and Quantum Package 29 are consistent with variational Monte Carlo (VMC) energies obtained by TurboRVB computed with the HF wavefunctions.The validation tests constitute a further reliability check of the TurboRVB package.As far as point (2)  is concerned, we benchmarked atomization energies of the Gaussian-2 set 30 , binding energies of the S22 31 , A24 32 , and SCAI 33 sets, and equilibrium lattice parameters of 12 cubic crystals.We found that, for all compounds analyzed in this study, the diffusion Monte Carlo (DMC) calculations with the LDA nodal surface give satisfactory results, i.e., consistent either with computational or with experimental reference values.
This paper is organized as follows: in Sec.II, we provide an overview of the TurboGenius program structure; in Sec.III, we provide an overview of the PyTurbo program structure; in Sec.IV, we describe the command-line tool and user interface implemented in TurboGenius; in Sec.V, we introduce TurboWorkflows, a python package for realizing highthroughput calculations using TurboGenius; and in Sec.VI, we showcase several validation and benchmark results obtained using TurboGenius and TurboWorkflows.

II. TURBOGENIUS: PROGRAM OVERVIEW
TurboGenius is implemented in Python 3 (the minimal requirement is Python 3.7).Python was chosen because it allows seamless integration with other major workflow frameworks.The main classes of TurboGenius are mainly composed of two types of classes.One is Wavefunction that stores and manipulates a wavefunction information including nuclear positions and pseudopotentials.The others are classes inheriting an abstract class (genius class).The classes enable one to control TurboRVB tasks, such as generating input files, launching jobs, and analyzing outcomes.The major classes of TurboGenius are listed in Table I.Hereafter, we will describe the functionalities of the two main classes.FIG. 1.A typical workflow of a QMC calculation using TurboRVB; (1) Preparation of a JAGP(s/u) ansatz file from a chosen basis set, pseudo potentials, and a structure using makefort10.xbinary.AGPs stands for the symmetric antisymmetrized geminal power (i.e., singlet correlation in the pairing function) 34 , while AGPu stands for the broken symmetry antisymmetrized geminal power, (i.e., singlet + triplet correlations in the pairing function) 34 .The generated WF is a template, i.e., it is fulfilled with random numbers; thus, one needs to initialize it, typically using DFT; (2) Conversion of the generated JAGP(s/u) ansatz to JSD one using convertfort10mol.xbinary because the subsequent DFT calculation works only with molecular orbitals (SD); (3) Initializing wave functions (WFs) using the buildin DFT module (prep.x);(3') If one wants to use JAGP(s/u) ansatz, one can convert the initialized JSD ansatz to a JAGP(s/u) ansatz before the VMC optimization; (4) Optimization of a WF at the VMC level using turborvb.xbinary; (4') One can convert the optimized WF to another type of ansatz.Pf stands for the Pfaffian anstaz, which is the most general form of the AGP WF 17 .(5) Computation of observables such as energy and forces at the VMC level using turborvb.xbinary; (6) Computation of observables such as energy and forces at the LRDMC level using turborvb.x;(7) Computation of other physical properties such as electron density.

A. Wavefunction
Wavefunction is a class manipulating wavefunctions, such as for generating a JSD (Jastrow Slater Determinant) or JAGP (Jastrow correlated Antisymmetrized Geminal Power 16 ) WF for a subsequent DFT initialization, and for converting a WF ansatz to another one (e.g., JSD to JAGP).The listing 1 shows a script to generate a WF file for the water molecule with the cc-pVQZ and cc-pVDZ basis sets for the determinant and the Jastrow parts, respectively, accompanied with the correlation consistent effective core potentials (ccECPs [35][36][37][38] ).The generated WF and PP files are fort.10and pseudo.dat,respectively.Notice that Wavefunction currently has pre-defined keywords for correlation-consistent basis sets implemented in Basis Set Exchange 39 for all-electron calculations, and correlation-consistent effective core potentials (ccECPs) [35][36][37][38] and Burkatzki-Filippi-Dolg (BFD) 40,41 basis sets for pseudopotential calculations.One can also use different basis set and pseudopotentials by providing them as a text list.The generated WF is a Slater Determinant WF with randomized MO coefficients.Indeed, a user should initialize it using the built-in DFT (prep) code before doing QMC calculations.
The Wavefunction class also allows one to generate a wavefunction file from a TREXIO file 42 .The TREXIO library has a standard format for storing wave functions, together with a C-compatible API such that it can be easily used in any programming language, which is being developed in TREX (Targeting Real Chemical Accuracy at the Exascale) project 43 .The listing 2 shows a script to convert a TREXIO file.Thus, instead of using the built-in DFT program, prep, one can use standard quantum chemistry packages such as GAMESS 44 and PySCF 27,28 , then convert it to TurboRVB WF format.The class supports both all-electron and pseudopotential WF with/without periodic boundary conditions (i.e., for both molecules and crystals), and both restricted (ROHF) and unrestricted (UHF) open-shell WFs.Notice that a restricted WF is converted to the symmetric antisymmetrized geminal power (AGPs) (i.e., singlet correlation in the pairing function) 34 , while an unrestricted WF is converted to the broken symmetry antisymmetrized geminal power (AGPu), (i.e., singlet + triplet correlations in the pairing function) 34

B. GeniusIO classes
GeniusIO is a parent class of the wrappers to manage complex QMC procedures.Several classes inheriting GeniusIO are implemented in TurboGenius, as shown in Table I.Here, to make the explanation simple, we focus on one of the classes, VMC genius.VMC genius is a class to control VMC job of TurboRVB.The listing 3 shows a Python script to compute VMC energy of the hydrogen dimer.4) After the VMC job is completed, one can post-process the outcomes depending on the type of jobs.For instance, in the VMC genius class, the method compute energy and forces allows one to get the means and variances of the energy, forces, and stresses using the TurboRVB built-in scripts implementing the bootstrap and jackknife methods 45 , where one can use the reblocking (binning) technique to remove the autocorrelation bias. 46.The related options are bin block (int): block length and warmupblocks (int): the number of disregarded blocks.

III. PYTURBO: PROGRAM OVERVIEW
PyTurbo is a sub-package of the TurboGenius package, which contains fundamental but flexible functionalities.The reason for implementing several classes in an independent sub-package is that there is a trade-off between flexibility and complexity (i.e., the availability of more sophisticated functionality).Indeed, we suppose advanced users will develop and exploit more elaborated procedures than the ones we expect at present.TurboGenius in its present status may not be flexible enough to support all of them.Therefore, we leave PyTurbo as an independent sub-package and keep its implementation as simple as possible, so that advanced users can be provided with the fundamental building blocks to fully control TurboRVB jobs in a Python environment.The major classes of PyTurbo are shown in shown in Table .II. fort.10 is the most fundamental file containing all the WF information except for that of pseudo potentials.The information of pseudo potential is stored in a separate file, pseudo.dat.IO fort10 and Pseudopotentials are classes for manipulating a WF file and PP files, respectively.PyTurbo contains several classes inheriting the abstract class fortranIO.They are essentially Fortran90 wrappers, i.e, in a one-to-one correspondence between a PyTurbo class and a TurboRVB Fortran binary (e.g., Makefort10 class in PyTurbo corresponds to makefort10.x in TurboRVB).The corresponding Tur-boRVB modules are listed in Table III.

A. IO fort10
IO fort10 is a class to manipulate the TurboRVB WF file fort.10 and to extract particular information (e.g., the basis set for the determinant and the Jastrow parts).IO fort10 internally uses the Basis sets class that can store basis-set information for both the determinant and the Jastrow parts.Other information, such as lattice vectors, atomic positions, MO coefficients, AGP and Jastrow matrix elements, are stored in the corresponding attributes implemented in IO fort10.The Atomic Simulation Environment (ASE) 47 package is used to read/write molecule and crystal structures for supporting various file formats.

B. Basis sets
The Basis sets class can store basis-set information for both the determinant and the Jastrow parts.TurboGenius supports the Gaussian-type localized atomic orbitals (GTOs): where the real and the imaginary part (m > 0) of the spherical harmonic function Y l,m,I (θ, φ) centered at R I are taken and rewritten in Cartesian coordinates in order to work with real defined and easy to compute orbitals, l is the corresponding angular momentum and m ≥ 0 is its projection number along the z−quantization axis.For the compatibility with TREXIO, the variables in the classes are defined in the exactly same way as in TREXIO.One can refer to the TREXIO documentation for the details 42 .The class also implements several parsers to write/read specific formats, such as GAMESS 48 and NWCHEM 49 .

C. Pseudopotentials
Pseudopotentials class can store pseudopotential information.PyTurbo supports only the standard semi-local form where r i,I = |r i − R I | is the distance between the i-th electron and the I-th ions, l max is the maximum angular momentum of the ion I, and ator on the spherical harmonics centered at the ion I.As it is now becoming a common practice not only in QMC, both the local V I loc r i,I and the non-local V I l r i,I functions, are expanded over a simple Gaussian basis parametrized by coefficients (e.g., effective charge Z eff and other simple constants), multiplying simple powers of r, and a corresponding Gaussian term: where α k,l , β k,l (usually small positive integers), and γ k,l are the parameters obtained by appropriate fitting.α k,l , β k,l and γ k,l are the parameters stored in the Pseudopotentials class.For the compatibility with TREXIO, the variables in the classes are defined in the exactly same way as in TREXIO.We refer to the documentation of TREXIO for the details 42 .

IV. COMMAND-LINE TOOL AND USER INTERFACE
TurboGenius provides a useful command-line interface, named turbogenius cli.The command-line tool can be run on a terminal by calling turbogenius, which is automatically installed during the setup procedure e.g., by pip.The command-line tool allows one to manipulate input and output files very efficiently and user-friendly.One of the most useful functions of the command-line tool is its helper, as shown in Listings 6 and 7

V. TURBOWORKFLOWS: A PYTHON PACKAGE FOR REALIZING HIGH-THROUGHPUT CALCULATIONS USING TURBOGENIUS
Since TurboGenius is able to fully control TurboRVB jobs, one can implement workflows by combining it with a file/job managing package.To demonstrate it, as a proof of concept, we developed an open-source python package, TurboWorkflows, that enables one to compose simple workflows by combining TurboGenius with a job managing python package, TurboFilemanager.We notice that one can exploit other established workflow packages such as AiiDA 18 and Fireworks 20 as job-managing systems.Combining established workflow managers with TurboGenius is an intriguing future work.The main classes of TurboWorkflows are summarized in Table IV.Each workflow class inherits the parent Workflow class with options useful for a QMC calculation.For instance, in the VMC workflow, one can specify a target accuracy (i.e., statistical error) of the VMC calculation.The VMC workflow first submits an initial VMC run to a machine with the specified MPI and OpenMP processes to get a stochastic error bar per Monte Carlo step.Since the error bar is inversely proportional to the square root of the number of Monte Carlo samplings, the necessary steps to achieve the target accuracy is readily estimated by the initial run.The VMC workflow then submits subsequent production VMC runs with the estimated necessary number of steps.Similar functionalities are also implemented in other workflow scripts such as VMCopt workflow, LRDMC workflow, and LRDMCopt workflow.TurboWorkflows can solve the dependencies of a given set of workflows and manage sequential jobs.The Launcher allows one to pass values and files obtained by a workflow to another workflow by using the Variable class, as shown in the listing 8.As shown in the listing 8, the Launcher class accepts workflows as a list, solves the dependencies of the workflows, and submits independent sequential jobs simultaneously.Launcher realizes this feature by the so-called topological ordering of a Directed Acyclic Graph (DAG) and the built-in python module, asyncio.The listing 8 shows a TurboWorkflows workflow script to perform a sequential job, PySCF → TREXIO → TurboRVB WF (JSD ansatz) → VMC optimization (Jastrow factor optimization) → VMC → LRDMC (lattice space → 0).Finally, we get the extrapolated LRDMC energy of the water dimer.

VI. DEMONSTRATIONS OF HIGH-THROUGHPUT CALCULATIONS
In this section, we will show several results obtained using TurboGenius and TurboWorkflows.Earlier versions of TurboGenius was used for computing phonon dispersion calculations 51 , equation of state calculations 51 , potential energy surfaces of dimers 52,53 , binding energies of molecules [52][53][54] , hydrogen liquids 55 , and hydrogen solids 14 .However, they were not fully performed using a python script.The current versions of TurboGenius and TurboWorkflows allow one to fully control TurboRVB calculations using a python script.In this paper, we show two types of demonstrations, validations of the TurboRVB package and benchmarking QMC calculations using TurboRVB.The summary of the demonstrations are shown in Tables V and VI.

Data set Description of the benchmark test G2-set
Benchmarking the atomization energies of 55 molecules.They were computed using LRDMC at the a → 0 limit.The JSD ansatz with the LDA-PZ nodal surface was employed.The references are experimental values.
The detail is written in Sec.VI B 1.

S22-set
Benchmarking the binding energies of 22 complex systems.They were computed using LRDMC at the a → 0 limit.The JSD ansatz with the LDA-PZ nodal surface was employed.The references are CCSD(T) values.The detail is written in Sec.VI B 1.

A24-set
Benchmarking the binding energies of 24 complex systems.They were computed using LRDMC at the a → 0 limit.The JSD ansatz with the LDA-PZ nodal surface was employed.The references are CCSD(T) values.The detail is written in Sec.VI B 1. SCAI-set Benchmarking the binding energies of 24 complex systems.They were computed using LRDMC at the a → 0 limit.The JSD ansatz with the LDA-PZ nodal surface was employed.

A. Validations of QMC implementations
Validation of scientific softwares, such as checking consistency among softwares that implement the same theory as employed in this study (i.e., inter-software test), is an important step to ensure one's software reliability.The widespread use of validation tests is also important to ensure the trustability of numerical simulations in general.Such validation tests have been performed not-so-widely in the ab initio QMC community, since QMC requires complex computational procedures, as mentioned in the introduction.TurboGenius and Tur-boWorkflows enable one to do the tests much more easily and efficiently.In this paper, we report the results of inter-software tests for the TurboRVB package (v1.0.0).We checked interpackage consistencies between TurboRVB and other established quantum chemistry packages, such as PySCF 27,28 and Quantum Package 29 .The details are written in Secs.VI A 1 and VI A 2. Notice that the data and Python scripts to reproduce the validation tests are available from our public repositories (see Sec. VIII).Using the implemented workflows, we have checked the consistency of the DFT-LDA calculations among packages.DFT energies should be consistent among packages as far as the same basis sets and ECPs are used even though those DFT codes employ different implementation schemes.For the reference calculations, we used the PySCF package (v2.0.1) 27,28 with the Perdew-Zunger (PZ81) local density approximation (PZ-LDA) 56 .For the test sets, we have chosen (1) 38 molecules with singlet spins, (2) 10 insulating crystals, where both orthorhombic and non-orthorhombic cells are included, and (3) 4 metallic crystals.For the crystals, we employed the k = 4 × 4 × 4 (i.e., twisted average) grid such that the reciprocal grid includes both real and complex points.For the real-space grid, which is needed for the numerical integration employed in the built-in DFT module (prep), we employed 0.05 Bohr for the molecules and insulating crystals, while 0.03 Bohr for the metallic crystals.We employed the Fermi-Dirac smearing method with 0.01 Ha for the metals.Figures.S-3, S-4, and S-5 show the consistencies between PySCF and TurboRVBprep LDA calculations for all the above cases.The very slight differences come from the implementations (i.e., TurboRVBprep module employs the numerical integration to compute the overlap and Hamiltonian matrix elements).The corresponding numbers are shown in Tables.S-I, S-II, and S-III.The consistencies between the packages show that the implementations of DFT calculations in the TurboRVB package is correctly done.One of the most prominent features of TurboGenius is the functionality to convert a TREXIO file to a TurboRVB WF because it allows one to use any quantum chemistry or DFT packages to generate trial WFs as long as they employ localized basis sets.We have carefully verified the implementation of the converter by confirming the consistency between HF and VMC energies without Jastrow factor, both for molecules (open systems) and crystals (periodic systems).More specifically, we computed the HF energies of 100 molecules and 9 crystals (such that both orthorhombic and non-orthorhombic cells are included) by PySCF v2.0.1,where the ccECP [35][36][37][38] with accompanied basis sets were employed.The obtained PySCF checkpoint files were converted to TurboRVB WFs via TREXIO files using the converter implemented in Tur-boGenius package.Then, we computed VMC energies of the TurboRVB WFs without Jastrow factor.The PySCF to TurboRVB conversion via TREXIO supports both restricted (i.e., ROHF) and unrestricted (i.e., UHF) open-shell WFs.We tested the former implementation using the 100 molecules, while the latter using the 49 molecules.For the 100 molecules, the same consistency checks were done between HF calculations by Quantum Package (v2.1.2) and TurboRVB.For the crystals, we tested both single-k and multi-k (i.e., twisted average) calculations.For the single-k tests, k=(0.00,0.00, 0.00) and (0.25, 0.25, 0.25) were used.For the twisted average tests, we employed k = 4 × 4 × 4 such that the grid includes both real and complex points.Notice that, in the twisted average case, we compared the VMC energies with the averages of the HF energies obtained at each k-point independently, since the VMC is independently done for each k point (i.e., MCMC is done for each k.)We did not use metals for the VMC validation tests.This is because, for metals, the HF energy obtained with the smearing technique is not consistent with the VMC ones due to the fact that the orbital contributions above the Fermi energy are truncated when converting the DFT orbitals to the single Slater-Determinant ansatz.The HF and VMC energies should be consistent within the statistical errors (i.e., within 3σ) as long as the conversions are done correctly.The results of the validation tests are shown in Figs.S-6-S-10.The corresponding values are shown in Tables S-IV-S-VIII.The results show that the HF and VMC energies are consistent within the statistical errors (i.e., within 3σ), implying that the implementations of the WF converter and VMC calculations in the TurboRVB package are correctly done.

B. Benchmarking of QMC calculations
Benchmarking a theory with several typical systems is a task as important as the validation test, with the aim at examining the accuracy of the theory.Such benchmark calculations can also be performed efficiently using TurboGenius and TurboWorkflows.In this work, we benchmarked atomization energies of the Gaussian-2 set 30 , binding energies of the S22 31 , A24 32 , and SCAI 33 sets, and equilibrium lattice parameters of 12 crystals via equation of states calculations.We found that, for all the compounds, the diffusion Monte Carlo calculations with the PZ-LDA nodal surface give satisfactory results, i.e., consistent either with CCSD(T) or experimental values.The details of the results are reported in the following parts.

G2, S22, A24, and SCAI benchmark sets: atomization energy and binding energy calculations
We report the result of the benchmark tests using the G2 30 , S22 31 , A24 32 , and SCAI 33 sets.The G2-set targets atomization energies of 55 molecules, while the other sets benchmark binding energies of complex systems.The benchmark calculations were performed by TurboWorkflows.Our python workflow launched a sequential job for each atom and molecule, PySCF → TREXIO → TurboRVB WF (Jastrow Slater determinant ansatz) → VMC optimization (Jastrow factor) → VMC → LRDMC (lattice space → 0).Finally, we got extrapolated LRDMC energies of atoms and molecules of the benchmark sets.The quality of the Jastrow optimization did not affect the final extrapolated LRDMC energy because the determinant localization approximation (DLA) 57 was employed.Indeed, the workflow was fully automatic and fully reproducible since the determinant part which determines the nodal surface was obtained deterministically, and the Jastrow factor, which was obtained by stochastic optimization, did not affect the extrapolated FN energy.The geometries of the G2 set were taken from previous benchmark studies [58][59][60] , while those of the other dataset were taken from Benchmark Energy and Geometry DataBase (BEGDB) 61 Figure .2 shows the benchmark result for the G2-set 30 .The corresponding numbers are shown in Table S-IX.We computed the atomization energies of the 55 molecules included in the G2-set by pseudo-potential LRDMC calculations with the JDFT ansatz.We employed the cc-pVQZ basis set with the accompanied ccECP [35][36][37][38] pseudo potentials.For the DFT calculations, we used the PySCF package (v2.0.1) 27,28 with PZ-LDA 56 exchange-correlation functional.The obtained Mean absolute deviation (MAD) with the JDFT ansatz is 3.242 kcal/mol, which is consistent with previous studies.Nemec et al. 62 used all-electron DMC on Slater determinant (SD) WF to obtain the binding energies of the G2 set to an MAD of 3.2 kcal/mol.Grossman 63 reported a similar accuracy, a MAD of 2.9 kcal/mol, in binding energies by DMC pseudopotential calculations.Abhishek et al. 54 reported a MAD of 3.2 kcal/mol by all-electron LRDMC calculations with the JDFT ansatz.Recently, Abhishek et al. 54 also reported a MAD of 1.6 kcal/mol by all-electron LRDMC calculations with the JAGPs ansatz.TurboGenius and TurboWorkflows support the same binding energy calculations with the JAGP ansatz also.However, to reproduce their results, large Jastrow basis sets and very careful optimizations are needed; otherwise, optimizations could be stuck at local mimina.The fully automatic optimization of WFs with many variational parameters is not established yet.This should be solved in the near future for realizing robust high-throughput QMC calculations with optimized (beyond-DFT) nodal surfaces.
The A24 dataset is a set of non-covalent systems large enough to include various types of interactions 32 .The dataset was intended for testing accuracy of computational methods which are used as a benchmark in larger model systems.Figure.4 shows the benchmark result for the A24-set.The corresponding numbers are shown in Table S-XI.We employed the cc-pVQZ basis set with the accompanied ccECP [35][36][37][38] pseudo potentials.For the DFT calculations, we used the PySCF package (v2.0.1) 27,28 with PZ-LDA 56 exchange-correlation functional.The obtained MAD is 0.315 kcal/mol.The full set of the A24-benchmark was studied by Dubecky et.al. 65 .They investigated the effects of the basis set, Jastrow factor, and optimization protocols.They finally obtained MAD of 0.15 kcal/mol with the single-determinant trial wave functions of Slater-Jastrow type using B3LYP orbitals and aug-TZV basis sets accompanied with the ECPs developed by Burkatzki et al. 40 .Their MAD (0.15 kcal/mol) is very closed to the value reported in this study (0.315 kcal/mol).
The SCAI dataset is developed to benchmark interactions between amino acid side chains 33 .The dataset contains a representative set of 24 of the 400 (i.e., 20 × 20) possible interacting side chain pairs.Figure .5 shows the benchmark result for the SCAI-set.The corresponding numbers are shown in Table .S-XII.We employed the cc-pVQZ basis set with the accompanied ccECP [35][36][37][38] pseudo potentials.For the DFT calculations, we used the PySCF package (v2.0.1) 27,28 with PZ-LDA 56 exchange-correlation functional.The obtained MAD is 0.402 kcal/mol, which is as small as the S22 and A24 benchmark sets.To the best of our knowledge, no one has benchmarked the SCAI data set using QMC.Among the systems, 704-DH1 and 705-DHN1 show the largest deviations, -1.33 (26) kcal/mol and -2.15(26) kcal/mol, respectively.

Potential Energy Surface (PES) calculations
Potential Energy Surface (PES) calculations are often computed for dimers in QMC for benchmarking, i.e., for comparing binding energies, equilibrium bond lengths, and harmonic frequencies with experimental values, and for checking if the obtained forces and pressures are biased or not 34,53,[69][70][71][72][73] .To reduce the computation costs of PES calculations and avoid being trapped at local minima, Jastrow factors are usually FIG. 2. Deviation of the LRDMC atomization energies obtained in this study from the experimentally obtained values.Corrections for zeropoint energies and relativistic effects have been included before computing the differences between the LRDMC and experimental values 62 .The blue and red bounds represent discrepancies ± 5 kcal/mol and ± 1 kcal/mol, respectively.DFT calculations with PZ-LDA 56 exchangecorrelation functional were used to generate the trial WFs.The cc-pVQZ basis set with the accompanied ccECP pseudo potentials [35][36][37][38] were employed for the DFT calculations.FIG. 3. The binding energy comparison of the S22 benchmark set.The differences between the LRDMC values obtained in this study and the reference CCSD(T) values are plotted.The blue and red bounds represent discrepancies ± 5 kcal/mol and ± 1 kcal/mol, respectively.DFT calculations with PZ-LDA 56 exchange-correlation functional were used to generate the trial WFs.The cc-pVQZ basis set with the accompanied ccECP pseudo potentials [35][36][37][38] were employed for the DFT calculations.FIG. 4. The binding energy comparison of the A24 benchmark set.The differences between the LRDMC values obtained in this study and the reference CCSD(T) values are plotted.The blue and red bounds represent discrepancies ± 5 kcal/mol and ± 1 kcal/mol, respectively.DFT calculations with PZ-LDA 56 exchange-correlation functional were used to generate the trial WFs.The cc-pVQZ basis set with the accompanied ccECP pseudo potentials [35][36][37][38] were employed for the DFT calculations.FIG. 5.The binding energy comparison of the SCAI benchmark set.The differences between the LRDMC values obtained in this study and the reference CCSD(T) values are plotted.The blue and red bounds represent discrepancies ± 5 kcal/mol and ± 1 kcal/mol, respectively.DFT calculations with PZ-LDA 56 exchange-correlation functional were used to generate the trial WFs.The cc-pVQZ basis set with the accompanied ccECP pseudo potentials [35][36][37][38] were employed for the DFT calculations.
optimized at a certain bond length and copied to WFs with other bond lengths which will then be optimized with a better starting point.TurboWorkflows automatizes these procedure and, for a good point, solves the dependency automatically.An example workflow is shown in Listing S-1, where the initial Jastrow optimization procedure is defined for a bond length (1.10 Å) and the optimized Jastrow factors are copied to WFs at other bond lengths and optimized again.The point is the Value instance that defines workflows that should be completed before.The workflow for the PES calculation is showin in Fig. 7.The PESs were computed at both the VMC and LRDMC levels.Two ansatz were employed in this study, JDFT and JAGPs.The JDFT ansatz were optimized according to the procedure described above using the linear method 74 at the VMC level and then they were converted to the JAGPs ansatz.The JAGPs were further optimized using the linear method at the VMC level.The optimized JDFT and JAGP ansatz were used for the subsequent LRDMC calculations.The T-move approach 75 with the lattice discretization = 0.30 Bohr was employed for the LRDMC calculations.The obtained PESs are shown in Fig. 6.The left-hand side of Fig. 6 shows that the PESs obtained with JDFT and JAGPs ansatz at the VMC level, while the righthand of Fig. 6 shows that the PESs obtained with JDFT and JAGPs ansatz at the LRDMC level.Table VII summarizes the equilibrium bond length and the harmonic frequency obtained from the VMC and LRDMC calculations and those obtained from experiments.The LRDMC calculation with the JAGPs ansatz gives the closest values to the experiment, as expected.The remaining discrepancies can be solved using a larger determinant and Jastrow basis sets as they both affect the quality of the nodal surfaces.Such a benchmark test for other molecules is an intersting future work.The workflow will be also useful for benchmarking new ansatz and algorithms implemented in TurboRVB.

Benchmarking Equations of State in Solids
We report the Equation of States (EOSs) of 12 solids computed at the VMC and LRDMC levels by TurboWorkflows.Such a benchmark has been done for several crystals to test the accuracy of QMC calculations 77,78 .Table VIII shows the Crystallography Open Database(COD)-IDs 79,80 of the 12 crystals computed in this demonstration.The 12 crystals were chosen because Ref. 81 summarizes experimental lattice parameters at 0 K with the zero-point energy subtracted, which are directly comparable with our results.
First of all, we carefully checked the basis-set convergence since PySCF, which generates trial wavefunctions for the subsequent TurboRVB calculations, employs the localized basis set also for the periodic systems.In other words, the convergence is sometimes difficult to be achieved unlike the planewave one.To check the basis-set convergence, we compared the EOSs of the 12 crystals obtained by PySCF (localized basis-set) and QuantumEspresso (Plane-Wave basis set with a 800 Ryd cutoff) with the same ccECP pseudopotentials [35][36][37][38] .The basis set convergence check using 1×1×1 supercell is shown in Fig. S-11 and the finally chosen basis sets are listed in Table VIII.We found that, to achieve the convergence, a large basis set (e.g., V5Z) is often needed.Notice that, in the localized basis sets, orbitals whose exponent is smaller than 0.10 were cut to avoid the numerical instability (i.e., lineardependency 51 ).The consistency holds also for the 2×2×2 supercells as shown in Fig. S-12.The slow convergence with respect to the basis set size comes from the fact that the provided basis sets were tuned using molecules, not solids.Indeed, the exponents are not suitable for solids.Therefore, to achieve a better convergence, we recommend to use basis sets optimized for solids 82 .
We have confirmed that the 2×2×2 supercell with k=2×2×2 twisted average is large enough to mitigate the one-body finite size effect for Diamond (Fig. S-13).It should be common among all the compounds we are studying because they are all insulators with similar lattice parameters.We have not checked whether 2×2×2 supercells are large enough to mitigate the two-body error, but we can assume so because the EOS calculations depend on the relative energies.In fact, Ref. 83 shows that larger supercells do not change the lattice parameter.
Figure 8 shows the EOSs of the 12 crystals computed at the VMC and LRDMC levels with the LDA-PZ nodal surfaces with the converged basis sets.The DFT calculations with the XC=LDA-PZ 56 , PBE 84 , and PBEsol 85 were performed using Quantum Espresso with the Ultra-soft PPs provided by the PS-library Project (v.1.0.0) 86 .The 2×2×2 supercells and k=2×2×2 meshes were employed for all the calculations.The obtained PESs were fitted by the Vinet function 87 .Table VIII shows the lattice parameters (a 0 ) obtained from the Vinet fittings and the available experimental values 81 .We evaluated the performance of the methods by mean absolute error (MAE: × 100)), where M is the sample size (i.e., M = 12 in this work).Fig. 9 shows percentage errors in the calculated lattice parameters compared to experimental ones.
On one hand, our results show that the accuracy of the VMC calculations on the lattice parameter is strongly dependent on crystals.For instance, the estimated equilibrium lattice pa- rameter of Diamond is well consistent with the experimental value both at the DFT and VMC levels, while that of NaCl was severely underestimated.The MARE for the VMC calculations is 0.5087(75) %, which is slightly better than that for the DFT-LDA calculations, while worse than that of the DFT-PBEsol calculations.The results imply the VMC calculations (with the small Jastrow factor employed in this study) is sometimes not sufficient to get accurate EOSs.
On the other hand, our results show that the DMC calculations with the LDA-PZ nodal surfaces are much less dependent on the choices of XCs for generating the trial wave functions, showing the accuracy of the methods.The conclusion is in line with the benchmark calculations presented in Refs.77 and 78, showing that DMC is highly accurate in describing the structural properties of a broad range of solids and that these structural properties are rather insensitive to the given nodal surfaces.The MARE for the DMC calculations is 0.229(11) %, which is the best result among the methods tested in this study.The remaining discrepancy between the experiments and calculations could come either from the fixed nodal surface (i.e., the LDA-PZ is used in this study), from the qualities of the PPs (ccECP), or from their related non-local properties (i.e., in case of T-moves, the quality of the Jastrow factor can still affect the results).In these regards, more comprehensive benchmark tests on the EOS calculations using TurboWorkflows will be an interesting perspective.
Figure S-1 shows a Unified Modeling Language (UML) diagram of the VMC workflow.The users can also define their own workflows inheriting the Workflow class.
Figure S-2 shows the UML diagram corresponding to the script shown in the listing 8.If one wants to manipulate files or values in a more complex way, one should define a new workflow class inheriting the Workflow class and pass it into the Encapsulated Workflow class.Listing 8.A python workflow to obtain the extrapolated LRDMC energy (a → 0) of the water molecule.
The references are CCSD(T) values.The detail is written in Sec.VI B 1. CO dimer Benchmarking the equilibrium bond length and harmonic vibrational frequency of the CO dimer.They were estimated from potential energy surfaces with the JSD (LDA-PZ nodal surface) and JAGP ansatz.The references are experimental values.The details is written in Sec.VI B 2. Cubic crystals Benchmarking the equilibrium lattice parameters 12 cubic crystals.They were estimated by fitting the equation of states obtained by LRDMC.The JSD ansatz with the LDA-PZ nodal surface was employed.The references are experimental values.The detail is written in Sec.VI B 3.

FIG. 6 .
FIG.6.PESs of the CO dimer computed by a python workflow implemented using TurboWorkflows.The left and right PESs were computed at the VMC and LRDMC levels, respectively.The green and blue vertical broken lines represent the equilibrium distances obtained from the JSD (with the LDA-PZ nodal surface) and JAGP PESs, respectively.The red vertical broken line shows the experimental equilibrium distance.

FIG. 8 .
FIG. 8. Results of the equation of state calculations for the 12 crystals by DFT, VMC, and LRDMC.The dot points are obtained values, and the solid lines are the curves obtained by a fit of the Vinet equation to calculations.The VMC and DMC points have statistical errors.In the plots, the error bars represent 2σ.The experimental equilibrium volumes and those obtained by a fit of the Vinet equation to calculations are plotted as vertical broken lines.Notice that the finite temperature thermal expansion and zero point energy were corrected in the experimental values 81 .

FIG. 9 .
FIG. 9. Percentage errors in the obtained lattice parameters as compared to the experimental values 81 .In the plots, the error bars represent 2σ.The positive (negative) percentage indicates that the calculation overestimates (underestimates) the lattice parameter.

ACKNOWLEDGMENTSK
FIG. S-1.UML activity diagram of the VMC workflow class.

FIG. S- 3 .
FIG. S-3.The comparison of the LDA-PZ energies obtained by PySCF and those energies obtained by TurboRVB-prep with the same basis sets for 38 molecules.

FIG. S- 11 .
FIG. S-11.Comparisons of the EOSs computed by Quantum Espresso and PySCF with 1×1×1 conventional cells and k = 2×2×2.The ccECP pseudo potentials were employed for the calculations.The cutoff energy 800 Ry was chosen for the plane-wave Quantum Espresso calculations as the ccECP PPs are hard.For the PySCF calculations, various basis sets from ccp-pVDZ to ccp-pV5Z were tested.The LDA-PZ exchange-correlation functional was used.

TABLE I .
Major classes of the TurboGenius package .
Listing 4. A python workflow to obtain the VMC energy of the Hydrogen dimer.

TABLE II .
Major classes of the PyTurbo package.Corresponding modules in TurboRVB are listed in TableIII.
. One can readily know what options are available and what each option does.This functionality is realized with the click package 50 .Listing 6.The helper implemented in TurboGenius.

TABLE IV .
Major classes of the TurboWorkflows package.

TABLE V .
Summary of the validation tests using TurboGenius and TurboWorkflows.Purpose of the validation 38 molecules LDA-DFT (prep) LDA-DFT Consistency check between the DFT module of PySCF and that of TurboRVB for open-boundary condition systems (molecules).The detail is written in Sec.VI A 1. 10 crystals LDA-DFT (prep) LDA-DFT Consistency check between the DFT module of PySCF and that of TurboRVB for periodic-boundary condition systems with insulating electronic states (without the smearing technique).k = 4×4×4 was used.The detail is written in Sec.VI A 1. 4 crystals LDA-DFT (prep) LDA-DFT Consistency check between the DFT module of PySCF and that of TurboRVB for periodic-boundary condition systems with metallic electronic states (with the smearing technique).k = 4×4×4 was used.The detail is written in Sec.VI A 1. 100 molecules VMC (turborvb) RHF/ROHF Consistency check between HF calculations done by PySCF and VMC calculations without Jastrow factor done by TurboRVB for open-boundary condition systems (molecules).RHF and ROHF were used for spin-unpolarized and spinpolarized systems, respectively.The same consistency check was done between TurboRVB and Quantum Package.The detail is written in sec.VI A 2. 49 molecules VMC (turborvb) UHF Consistency check between HF calculations done by PySCF and VMC calculations without Jastrow factor done by TurboRVB for open-boundary condition systems (molecules) with spin-polarized states (UHF was used).The detail is written in Sec.VI A 2. 9 crystals VMC (turborvb) RHF/ROHF Consistency check between HF calculations done by PySCF and VMC calculations without Jastrow factor done by TurboRVB for periodic-boundary condition systems.RHF and ROHF were used for spin-unpolarized and spinpolarized systems, respectively.k = Γ, k = (0.25, 0.25, 0.25), and k = 4×4×4 were tested.The detail is written in Sec.VI A 2. TABLE VI.A summary of the benchmarks using TurboGenius and TurboWorkflows.

TABLE VII .
Equilibrium bond distances r eq (Å) and harmonic frequencies ω (cm −1 ) of the CO dimer obtained with the JDFT and JAGPs ansatz both at the VMC and LRDMC levels.

TABLE VIII .
81e columns 1-5 contain, compounds, crystal type (A4, diamond; B1, CsCl; B3, zinc blende), CODID, basis set, and ECP.The columns 6-10 contain the equilibrium lattice parameters obtained from the Vinet fitting to DFT, VMC, and LRDMC calculations performed in this paper.The VMC and DMC results include statistical errors (1σ) in the individual calculations.The column 11 contains the experimental values, where the finite temperature thermal expansion and zero point energy were subtracted in the work of Hao et al.81.The lattice parameters are given in Å.The two statistics are shown in the table, mean absolute error (MAE: 1 M (Σ|a calc.− a exp.|)) and mean absolute relative error (MARE: 1 M (Σ |a calc.−aexp.|aexp.×100)), where M is the sample size (i.e., M = 12 in this work).

TABLE S -
I. The comparison of the LDA-PZ energies obtained by PySCF and those obtained by TurboRVB with the same basis sets for 38 molecules.The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.)

TABLE S -
II.The comparison of the LDA-PZ energies obtained by PySCF and those obtained by TurboRVB with the same basis sets for 10 crystals with twisted average (k = 4 × 4 × 4).The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.)

TABLE S -
III.The comparison of the LDA-PZ energies obtained by PySCF and those obtained by TurboRVB with the same basis sets for 4 crystals with twisted average (k = 4 × 4 × 4).The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.)

TABLE S -
IV: The comparison of the RHF(ROHF) energies obtained by PySCF and the VMC energies obtained by TurboRVB with the WFs converted from the corresponding PySCF checkpoints files via TREXIO for the 100 molecules.The error bars refer to 3σ of the VMC calculations.The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.) Molecules Charge Spin Basis set HF energy (Ha) [PySCF] HF energy (Ha) [TurboRVB] Difference (Ha)

TABLE S -
V: The comparison of the UHF energies obtained by PySCF and the VMC energies obtained by TurboRVB with the WFs converted from the corresponding PySCF checkpoints files via TREXIO for the 49 molecules.The error bars refer to 3σ of the VMC calculations.The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.) Molecules Charge Spin Basis set HF energy (Ha) [PySCF] HF energy (Ha) [TurboRVB] Difference (Ha)

TABLE S -
VI.The comparison of the RHF(ROHF) energies obtained by PySCF and the VMC energies obtained by TurboRVB with the WFs converted from the corresponding PySCF checkpoints files via TREXIO for 9 crystals at k = Γ.The error bars refer to 3σ of the VMC calculations.The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.) CODID Crystal Crystalsystem Spacegroup Charge Spin HF energy (Ha) [pySCF] HF energy (Ha) [TurboRVB] Difference (Ha)

TABLE S -
VII.The comparison of the RHF(ROHF) energies obtained by PySCF and the VMC energies obtained by TurboRVB with the WFs converted from the corresponding PySCF checkpoints files via TREXIO for 9 crystals at k = (0.25, 0.25, 0.25).The error bars refer to 3σ of the VMC calculations.The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.) CODID Crystal Crystalsystem Spacegroup Charge Spin HF energy (Ha) [PySCF] HF energy (Ha) [TurboRVB] Difference (Ha)

TABLE S -
VIII.The comparison of the RHF(ROHF) energies obtained by PySCF and the VMC energies obtained by TurboRVB with the WFs converted from the corresponding PySCF checkpoints files via TREXIO for 9 crystals with twisted average (k = 4 × 4 × 4).The error bars refer to 3σ of the VMC calculations.The TREXIO files used for these calculations are available from our public repository (See Sec.VIII in the main text for the details.)

TABLE S -
IX.The atomization energies of 55 molecule contained in the G2-set.They were computed by LRDMC with the LDA-PZ nodal surfaces and the DLA.The experimental values and the differences between the LRDMC and experimental values are also shown.