Polymer solar cells offer numerous potential advantages, including a low energy payback time and scalable high-speed manufacturing, but their power conversion efficiency is currently lower than that of their inorganic counterparts. In a blended polymer solar cell based on phenyl-C61-butyric-acid-methyl-ester (PCBM), the optical gap of the polymer and the energetic alignment of the lowest unoccupied molecular orbitals (LUMOs) of the polymer and the PCBM are crucial for the device efficiency. Searching for new and better materials for polymer solar cells with density functional theory (DFT) calculations is computationally costly. In this work, we propose a screening procedure that combines a simple string representation for a promising class of donor-acceptor polymers with a grammar variational autoencoder. The model is trained on a dataset of 3989 monomers obtained from DFT calculations and predicts the LUMO and the lowest optical transition energy of unseen molecules with mean absolute errors of 43 and 74 meV, respectively, without knowledge of the atomic positions. We demonstrate the merit of the model for generating new molecules with the desired LUMO and optical gap energies; this increases the chance of finding suitable polymers by more than a factor of five compared with the randomised search used to gather the training set.
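The quantities named in the abstract (mean absolute error, a screen on predicted LUMO and optical gap, and the improvement factor over a random search) can be illustrated with a minimal sketch. This is not the authors' code: the `Prediction` record, the function names, and any energy windows passed to `screen` are hypothetical, and the abstract does not state the target windows actually used.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Prediction:
    """Hypothetical record of model output for one candidate monomer."""
    name: str              # illustrative identifier, not from the source
    lumo_ev: float         # predicted LUMO energy in eV
    optical_gap_ev: float  # predicted lowest optical transition energy in eV


def mean_absolute_error(predicted: Sequence[float],
                        reference: Sequence[float]) -> float:
    """Average of |predicted - reference| over paired values."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def screen(candidates: Sequence[Prediction],
           lumo_window: Tuple[float, float],
           gap_window: Tuple[float, float]) -> List[Prediction]:
    """Keep candidates whose predicted energies fall inside both windows.

    The windows are inclusive (low, high) pairs in eV and are assumptions
    supplied by the caller; the abstract gives no numerical targets.
    """
    lumo_lo, lumo_hi = lumo_window
    gap_lo, gap_hi = gap_window
    return [c for c in candidates
            if lumo_lo <= c.lumo_ev <= lumo_hi
            and gap_lo <= c.optical_gap_ev <= gap_hi]


def enrichment_factor(hits_model: int, n_model: int,
                      hits_random: int, n_random: int) -> float:
    """Ratio of the model's hit rate to the hit rate of a random search."""
    if n_model <= 0 or n_random <= 0 or hits_random <= 0:
        raise ValueError("sample sizes and random-search hits must be positive")
    # Same as (hits_model / n_model) / (hits_random / n_random), arranged to
    # divide only once.
    return (hits_model * n_random) / (hits_random * n_model)
```

In such a workflow, DFT would only be run on the candidates that pass `screen`, and an `enrichment_factor` above five would correspond to the improvement the abstract reports relative to the randomised search.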
