Accurate and explainable artificial-intelligence (AI) models are promising tools for accelerating the discovery of new materials. Symbolic regression has recently become an increasingly popular tool for explainable AI because it yields models that are relatively simple analytical descriptions of target properties. Owing to its deterministic nature, the sure-independence screening and sparsifying operator (SISSO) method is a particularly promising approach for this application. Here, we describe recent advances in the SISSO algorithm, as implemented in SISSO++, a C++ code with Python bindings. We introduce a new representation of the mathematical expressions found by SISSO, which is a first step toward introducing “grammar” rules into the feature-creation step. Importantly, by adding a controlled nonlinear optimization to the feature-creation step, we expand the range of descriptors that the methodology can find. Finally, we introduce refinements to the solver algorithms for both regression and classification, which drastically increase the reliability and efficiency of SISSO. For each of these improvements to the basic SISSO algorithm, we not only illustrate its potential impact but also fully detail how it operates, both mathematically and computationally.
