Machine learning is quickly becoming an important tool in modern materials design. Whereas many of its successes are rooted in huge datasets, the most common applications in academic and industrial materials design deal with datasets of, at best, a few tens of data points. Harnessing the power of machine learning in this context is, therefore, of considerable importance. In this work, we investigate the intricacies introduced by these small datasets. We show that individual data points introduce a significant chance factor in both model training and quality measurement. This chance factor can be mitigated by introducing an ensemble-averaged model. This model presents the highest accuracy and, at the same time, is robust with regard to changes in dataset size. Furthermore, as only a single model instance needs to be stored and evaluated, it provides a highly efficient model for prediction purposes, ideally suited for the practical materials scientist.
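The idea of an ensemble-averaged model that is stored as a single model instance can be illustrated with a minimal sketch. This is not the procedure of the paper, whose model type and settings are not given in this abstract. It assumes a model that is linear in its parameters (here a straight-line fit), for which averaging the fitted coefficients gives the same predictions as averaging the predictions of all ensemble members. The function names, the synthetic data, the 80% training fraction, and the ensemble size of 200 are all hypothetical choices made for illustration.

```python
import random


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ensemble_average(points, n_models=200, train_fraction=0.8, seed=0):
    """Fit one line per random training subset, then average the coefficients.

    Returns the single averaged model and the list of individual fits.
    """
    rng = random.Random(seed)
    n_train = max(2, int(train_fraction * len(points)))
    fits = [fit_line(rng.sample(points, n_train)) for _ in range(n_models)]
    slope = sum(a for a, _ in fits) / n_models
    intercept = sum(b for _, b in fits) / n_models
    return (slope, intercept), fits


# Hypothetical small dataset: 10 noisy samples of y = 2x + 1.
data_rng = random.Random(42)
data = [(float(x), 2.0 * x + 1.0 + data_rng.gauss(0.0, 0.5)) for x in range(10)]

(avg_slope, avg_intercept), fits = ensemble_average(data)

# For a model linear in its parameters, the single averaged model predicts
# the same value as the mean of all individual model predictions.
x_new = 4.5
avg_model_prediction = avg_slope * x_new + avg_intercept
avg_of_predictions = sum(a * x_new + b for a, b in fits) / len(fits)
print(avg_model_prediction, avg_of_predictions)

# The spread of the individual slopes shows how much a single random
# training subset can change the fitted model on a small dataset.
print(min(a for a, _ in fits), max(a for a, _ in fits))
```

Only the two averaged coefficients need to be kept for prediction, which is what makes the averaged model cheap to store and evaluate in this sketch. For models that are not linear in their parameters, averaging coefficients and averaging predictions are generally not equivalent.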
Small data materials design with machine learning: When the average model knows best
7 August 2020
Research Article
August 03 2020
Special Collection:
Machine Learning for Materials Design and Discovery
Danny E. P. Vanpoucke (1, 2); Onno S. J. van Knippenberg (3); Ko Hermans (3); Katrien V. Bernaerts (1, a); Siamak Mehrkanoon (4)
1 Aachen-Maastricht Institute for Biobased Materials (AMIBM), Maastricht University, Brightlands Chemelot campus, Urmonderbaan 22, 6167 RD Geleen, The Netherlands
2 Institute for Materials Research (IMO), Hasselt University, 3590 Diepenbeek, Belgium
3 CCL Olympic B.V., Keizersveld 30, 5803 AN Venray, The Netherlands
4 Department of Data Science and Knowledge Engineering, Maastricht University, 6226 GS Maastricht, The Netherlands
a) Author to whom correspondence should be addressed: katrien.bernaerts@maastrichtuniversity.nl
Note: This paper is part of the special collection on Machine Learning for Materials Design and Discovery
J. Appl. Phys. 128, 054901 (2020)
Article history
Received: April 29 2020
Accepted: July 15 2020
Connected Content
A companion article has been published:
When the average model knows best
Citation
Danny E. P. Vanpoucke, Onno S. J. van Knippenberg, Ko Hermans, Katrien V. Bernaerts, Siamak Mehrkanoon; Small data materials design with machine learning: When the average model knows best. J. Appl. Phys. 7 August 2020; 128 (5): 054901. https://doi.org/10.1063/5.0012285
When the average model knows best
Scilight (August 2020)