A portable and performant graphics processing unit (GPU)-accelerated library for electron repulsion integral (ERI) evaluation, named LibERI, has been developed and implemented using directive-based programming models (e.g., OpenMP and OpenACC) and standard language parallelism (e.g., Fortran DO CONCURRENT). The offloaded ERIs comprise integrals over low- and high-contraction s, p, and d functions, evaluated with the rotated-axis and Rys quadrature methods. The GPU codes are factorized based on previous developments [Pham et al., J. Chem. Theory Comput. 19(8), 2213–2221 (2023)] with two layers of integral screening and quartet presorting. In this work, the density screening is moved to the GPU to improve computational efficiency for large molecular systems. The L-shells in the Pople basis sets are also separated into pure S and P shells to increase ERI homogeneity and to reduce atomic operations and the memory footprint. LibERI is compatible with any quantum chemistry driver that supports the MolSSI Driver Interface. Benchmark calculations with LibERI interfaced to the GAMESS software package were carried out on various GPU architectures and molecular systems. The results show that the performance of LibERI is comparable to that of other state-of-the-art GPU-accelerated codes (e.g., TeraChem and GMSHPC) and, in some cases, exceeds that of conventionally developed ERI CUDA kernels (e.g., QUICK) while fully maintaining portability.
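As an illustration of the L-shell separation described above, the following minimal Python sketch splits a combined sp (L) shell, which shares one set of exponents between its s and p contractions, into a pure S shell and a pure P shell. It is not LibERI code: the `Shell` layout, the `split_l_shells` function, and the numerical values are hypothetical placeholders chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Shell:
    """Contracted Gaussian shell (hypothetical layout, for illustration only)."""
    kind: str                                     # "S", "P", "D", or "L" (combined sp)
    exponents: Tuple[float, ...]                  # primitive exponents
    s_coeffs: Optional[Tuple[float, ...]] = None  # s contraction (S and L shells)
    p_coeffs: Optional[Tuple[float, ...]] = None  # p contraction (P and L shells)


def split_l_shells(shells: List[Shell]) -> List[Shell]:
    """Replace every L shell with a pure S shell followed by a pure P shell.

    Both new shells reuse the exponents of the original L shell; all other
    shells are passed through unchanged and the original order is kept.
    """
    out: List[Shell] = []
    for sh in shells:
        if sh.kind == "L":
            out.append(Shell("S", sh.exponents, s_coeffs=sh.s_coeffs))
            out.append(Shell("P", sh.exponents, p_coeffs=sh.p_coeffs))
        else:
            out.append(sh)
    return out


# Placeholder numbers, not taken from any real basis set.
basis = [
    Shell("S", (10.0, 2.0), s_coeffs=(0.4, 0.6)),
    Shell("L", (1.5, 0.5), s_coeffs=(-0.1, 1.1), p_coeffs=(0.2, 0.9)),
]
split = split_l_shells(basis)
kinds = [sh.kind for sh in split]
```

After the split, every shell carries a single angular momentum, so shell quartets can be grouped by type (for example, all quartets built only from S shells) without a mixed sp case.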
