Scientific databases offer remarkable potential for solving complex questions in materials science, such as global optimization of materials and designing unknown materials for novel properties. ThermoElectric materials eXplorer (TEXplorer) is a web-based platform designed to collect and share all types of thermoelectric materials data, including synthesis information, materials characterization, transport measurements, and electronic structures obtained from experiments and computations. TEXplorer also provides valuable tools, such as an easy upload and download system, retrieval, automatic post-processing calculations, visualization of datasets, and toolkits for predicting thermoelectric properties through machine learning models. Using the platform, we collected and managed the thermoelectric dataset of SnSe and Bi2Te3 with various doping/alloying elements in this study in order to investigate the complex relationship between doping/alloying elements and the thermoelectric properties of host materials. The web-based interactive data platform enables efficient management and utilization of experimental and computational datasets, supporting the acceleration of data-driven materials research and autonomous material synthesis.

High-quality data have become crucial as data-driven approaches have become ubiquitous in materials science. Detailed descriptions, accuracy, and large dataset sizes improve data quality, and the purposes and uses of data-driven research can be varied and enriched based on such datasets. Under these circumstances, data platforms that collect, verify, and display materials datasets are critical for creating and easily sharing high-quality datasets. Recently, open-access data platforms for materials, such as the Materials Project (MP),1 Automatic-Flow for Materials Discovery (AFLOW),2 the Open Quantum Materials Database (OQMD),3 Quantum Machines 9 (QM9),4,5 and High Throughput Experimental Materials (HTEM),6 have been providing valuable datasets of materials properties and have initiated numerous data-driven studies. Through these platforms, researchers have systematically collected materials data according to their interests. They can easily access the datasets through a web browser and manipulate them using useful data-treatment tools.7–9 Along with machine learning (ML) techniques, datasets acquired from these platforms have been utilized to solve various scientific problems, such as predicting the chemical and physical properties of materials,10–13 suggesting novel materials and their structures,14–16 and optimizing synthesis and process conditions,17,18 and have led to the extraction of design principles for target material properties.19–22

Data-driven research can accelerate the prediction of material properties with reduced time and cost10,23 and, in some cases, result in more accurate predictions than conventional methods.24,25 Furthermore, it can be applied to unsolved problems from a new perspective.26,27 High-quality datasets contain fundamental relationships between the properties of the material, and knowledge extraction from these data can provide significant hints for materials design. In this context, data-driven studies of thermoelectric materials present interesting challenges due to their inherently complex responses28,29 to extrinsic modifications.

The thermoelectric effect enables the conversion of waste heat into electric power, with a conversion efficiency determined by the thermoelectric figure of merit (ZT),
$$ZT = \frac{S^{2}\sigma T}{\kappa_{\mathrm{ele}} + \kappa_{\mathrm{lat}}}, \tag{1}$$
which combines the electrical conductivity (σ), Seebeck coefficient (S), absolute temperature (T), and the electronic and lattice contributions to the thermal conductivity (κele, κlat). Various approaches to engineering thermoelectric materials have been attempted to achieve high ZT values, such as doping/alloying,30–32 nano-structuring,33–38 and decoupling the charge carrier and phonon transport.39,40 However, it is still difficult to elucidate the mechanisms that enhance ZT because the transport parameters are complexly interconnected, and data-driven research has been presented as a new methodology in the past decade. Gaultois et al.41 provided an extensive database of thermoelectric materials on the materials research laboratory (MRL) platform by collecting over 18 000 thermoelectric property data points from publications. TE Design Lab42 offered computational and experimental data of thermoelectric materials, as well as visualization and mining tools. Starrydata243 was developed as an open database for experimental data of thermoelectric materials, containing data for more than 11 500 samples collected from published papers. Based on these databases, simple predictions of thermoelectric material properties using ML models became available,44–47 and a fast high-throughput search for high-ZT compounds, including heavy rare-earth elements, was suggested.44,48

Although these extensive databases41–43 have successfully launched many data-driven studies on thermoelectric materials, datasets obtained from the literature have serious weaknesses. Not all synthesis procedures or detailed information are described in the literature, and the extractable data vary from paper to paper. Therefore, only fundamental characteristics tend to be collected, and the resulting datasets are better suited to solving simple rather than practical problems. In addition, published data usually consist of the best-performance results, and data with degraded properties, so-called “dark data,”6,27 are rarely described in papers. Datasets consisting of only superior results without dark data are biased, and ML models trained on biased datasets are likely to fail to learn diverse material spaces. To solve practical problems using ML, it is crucial to systematically generate a dataset from diverse material spaces, including dark data, and to collect detailed information that describes the full cycle of experimental/computational procedures and measurement results.

For systematic data collection, including dark data, a web-based data platform, “ThermoElectric materials eXplorer (TEXplorer),” was developed and is specialized in searching for novel high-performance thermoelectric materials with various doping/alloying compositions in SnSe and Bi2Te3 systems. In Fig. 1, a workflow of the data-driven research using the TEXplorer platform is provided. TEXplorer collects all detailed information, such as synthesis conditions, transport measurements, and electronic structures, obtained from experiments and density functional theory (DFT) calculations. In addition, user-friendly tools for retrieval, visualization, and analysis of thermoelectric material properties make it convenient to use and manage data beyond the functionality of simple data storage. Furthermore, predictive models obtained from ML research are available on TEXplorer as a toolkit that guides researchers to select optimal doping and alloying elements for SnSe/Bi2Te3 thermoelectric systems based on our datasets.49 

FIG. 1.

Workflow of the data-driven research with a data platform of TEXplorer. Our work starts from the data generation of experimental and computational approaches, and the raw data are uploaded to TEXplorer. The collected raw data are parsed, standardized for clarity, and visualized with well-designed plots and tables in TEXplorer. The raw data are downloadable in JSON format. We constructed the ML models using the organized datasets to predict thermoelectric material properties and search for optimal materials compositions with high-ZT values. TEXplorer provides toolkits for predicting thermoelectric properties using ML models to easily predict thermoelectric properties for a given material composition.

The web-based data platform TEXplorer consists of the following parts: (1) Deposition of raw data files of experimental and computational datasets with parsing and standardizing the datasets on data storage; (2) visualization of the synthesis information, measurement results, and calculated electronic structures with tables and figures; and (3) ML toolkits for predicting thermoelectric materials properties. We describe the usage and functionalities of each application below.

A major function of TEXplorer is to collect and store thermoelectric materials data from experiments and computations. The experimental data consist of all information regarding synthesis, post-processing, characterization, and measurement of transport properties. The computational data include the crystal structures of doped/alloyed materials and their electronic structures. In this study, we performed experiments and computations for thermoelectric materials based on SnSe50–52 and Bi2Te353–57 with various doping and alloying. SnSe and Bi2Te3 are representative thermoelectric materials in the mid-to-high temperature range and near ambient temperature, respectively, and doping and alloying have been utilized as powerful strategies to improve their ZT. The main roles of doping/alloying in thermoelectric materials are to control the carrier concentration and to modulate the electronic structure of the host material. However, identifying an effective doping/alloying element and its optimal content is still challenging due to the complex interdependence among doping/alloying configurations, electronic structures, and changes in ZT. Thus, data-driven research is expected to accelerate the search for high-ZT compounds and uncover hidden knowledge about the intricacies of thermoelectric properties.

We investigated doped/alloyed SnSe and Bi2Te3 compounds with the chemical composition of Snx1Sex2:Ax3Bx4 and Bix1Tex2:Ax3Bx4Cx5, where A, B, and C are doping/alloying atoms, and x1−5 are the corresponding ratios of host and doped/alloyed elements (see the supplementary material for the details.) For experimental datasets, 617 SnSe- and 523 Bi2Te3-based compounds were considered with different doping/alloying compositions. We synthesized them by a solid-state/solution reaction, pulverized the product by hand/mechanical grinding, and consolidated the obtained powders using spark plasma sintering (SPS). We selectively performed post-processing, such as ball-milling (BM), oxygen reduction, and annealing, to enhance the thermoelectric performance. The crystal structure of the sample was then characterized with powder X-ray diffraction (XRD) using a Bruker D8 Advance diffractometer, and electrical and thermal transport were measured in a full range of temperatures [Fig. 2(c)]. σ and S were measured using Netzsch SBA45858,59 and Ulvac-Riko ZEM3,50,57,58,60 and thermal diffusivity (α) was measured using Netzsch LFA457.32,50,57,61 All parameters for the synthesis process and details for each sample were written in a spreadsheet file with pre-set templates, and the raw data files from measuring instruments were collected for the database.

FIG. 2.

Crystal structures of (a) SnSe and (b) Bi2Te3. Doping elements substitute host elements or locate at interstitial sites. Red, gray, blue, and orange balls represent Sn, Se, Bi, and Te atoms. Generated data from full-cycle (c) experimental and (d) computational procedures.

Coupled with the experimental dataset, we generated computational datasets of electronic band structures on doped/alloyed SnSe and Bi2Te3 compounds, as shown in Fig. 2(d). We performed the DFT calculations on SnSe and Bi2Te3 systems with 59 and 64 different elements, respectively, using the Vienna Ab initio Simulation Package (VASP).62,63 Doping sites were considered as Sn- and Se-substitution sites and an interstitial site between the layers for the SnSe system and Bi- and Te-substitution sites and two interstitial sites for the Bi2Te3 systems, as shown in Figs. 2(a) and 2(b). We used a homemade automation code to compute and gather valuable information, such as the electronic band structure, density of states, formation energy profiles, bulk modulus, and simulated XRD patterns of each system [Fig. 2(d)]. Each calculation step is organized into subfolders, and the raw data files, necessarily including the “vasprun.xml” file, are collected for the database. More details of the experiment and computation procedures are described in the supplementary material.

Based on these newly generated datasets, data deposition on the TEXplorer platform is designed to be user-friendly, through either a drag-and-drop interface on a webpage or the application programming interface (API). A pre-designed parser runs automatically to assign an ID number to each new dataset and to extract key/value pairs for data ingestion. For computational data, we used the pymatgen8 library for data parsing. If any required data file is missing or entered incorrectly, an error message pops up and the upload process stops. The collected data are standardized to match and unify keywords across multiple datasets created by different users. For example, many users recorded the ball-milling process with their own keywords, such as ballmill, Ball mill, ball-milling, and BM; we therefore unified these keywords as BM in the database, as shown in Fig. 1, to remove inconsistencies and improve data quality. All ingested data extracted from the spreadsheet and the raw data files are stored in a MongoDB64 database in JavaScript Object Notation (JSON) format and can be downloaded in JSON format.
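
The keyword unification described above can be illustrated with a minimal sketch. The lookup table and the function name `normalize_keyword` are assumptions made for illustration only; the actual standardization rules used by TEXplorer are not given in the text beyond the ball-milling example.

```python
import re

# Hypothetical lookup table: collapsed, lower-cased spellings -> canonical keyword.
# Only the ball-milling variants named in the text are covered here.
CANONICAL_KEYWORDS = {
    "ballmill": "BM",
    "ballmilling": "BM",
    "bm": "BM",
}


def normalize_keyword(raw: str) -> str:
    """Map a user-supplied process keyword to its canonical form.

    Spaces, hyphens, and underscores are removed and the case is folded
    before the lookup; unknown keywords are returned stripped but unchanged.
    """
    collapsed = re.sub(r"[\s\-_]+", "", raw).lower()
    return CANONICAL_KEYWORDS.get(collapsed, raw.strip())


if __name__ == "__main__":
    for keyword in ["ballmill", "Ball mill", "ball-milling", "BM", "annealing"]:
        print(f"{keyword!r} -> {normalize_keyword(keyword)!r}")
```

Folding case and separators before the lookup keeps the table small, since each new spelling variant usually collapses onto an existing key.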

The DataExplorer page provides a search function for each experimental and computational dataset. The experimental section allows users to search for target data by selecting elements from the Periodic Table and by advanced queries consisting of logical operators and additional keywords, such as molar mass, measuring device, and synthesis methods. When a sample is selected from the retrieved list, details of the synthesis process are displayed, and a new window with all experimental information for a given sample appears via a hyperlink on the ID number.
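
As a rough illustration of how an advanced query combining element selection, logical operators, and additional keywords could be evaluated, the sketch below filters a list of in-memory records. The record layout, the field names `synthesis_method` and `measuring_device`, the sample IDs, and all helper functions are hypothetical; they do not describe the TEXplorer API or its MongoDB schema.

```python
def has_elements(*symbols):
    """Predicate: the record's composition contains every listed element."""
    wanted = set(symbols)

    def predicate(record):
        present = {item["element"] for item in record["composition"]}
        return wanted <= present

    return predicate


def field_equals(key, value):
    """Predicate: the record has `key` set to exactly `value`."""
    return lambda record: record.get(key) == value


def all_of(*predicates):
    """Logical AND of several predicates."""
    return lambda record: all(p(record) for p in predicates)


def any_of(*predicates):
    """Logical OR of several predicates."""
    return lambda record: any(p(record) for p in predicates)


def negate(predicate):
    """Logical NOT of a predicate."""
    return lambda record: not predicate(record)


def search(records, predicate):
    """Return the IDs of all records that satisfy the predicate."""
    return [record["id"] for record in records if predicate(record)]


# Toy records with made-up IDs, used only to exercise the query helpers.
RECORDS = [
    {"id": "demo-1", "synthesis_method": "solid-state reaction", "measuring_device": "ZEM3",
     "composition": [{"element": "Sn", "amount": 0.99}, {"element": "Se", "amount": 1.0},
                     {"element": "Cr", "amount": 0.01}]},
    {"id": "demo-2", "synthesis_method": "solution reaction", "measuring_device": "SBA458",
     "composition": [{"element": "Bi", "amount": 2.0}, {"element": "Te", "amount": 3.0}]},
    {"id": "demo-3", "synthesis_method": "solid-state reaction", "measuring_device": "SBA458",
     "composition": [{"element": "Sn", "amount": 1.0}, {"element": "Se", "amount": 1.0}]},
]

if __name__ == "__main__":
    query = all_of(has_elements("Sn", "Se"), negate(field_equals("measuring_device", "ZEM3")))
    print(search(RECORDS, query))
```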

The new window displays a table of sample information and synthesis processes, as shown in Fig. 3, and the data types are listed in Table I. The “Data Information” section provides the name of the data creator, the date of the last update, links for downloading files, and additional information marked by a hashtag. The “Process” section lists the primary and post-processing parameters of the experiments, such as the nominal composition, the synthesis method with corresponding details (temperature, time, milling speed, and cooling method), and the SPS conditions (temperature with heating rate, time). If ball milling is performed, the milling speed and time are also provided. The “Results” section shows the transport properties of the chosen sample measured by ZEM3,50,57,58,60 SBA,58,59 and LFA,32,50,57,61 together with the corresponding powder XRD pattern. Plots of transport properties, such as σ, S, Lorenz number (L), power factor, α, total thermal conductivity (κtot), κlat, and ZT, are provided as a function of temperature. The sample thickness and diameter, sample density, and heat capacity are also reported. The XRD pattern is presented as a function of 2θ to analyze the crystal structure of the sample.

FIG. 3.

“DataExplorer” for an experimental dataset of Sn0.99Cr0.01Se (TE01246). The data information and synthesis process details are represented. Thermal diffusivity and thermal conductivity obtained from the LFA measurement are plotted among the measurement results.

TABLE I.

Data format for the experimental dataset.

| Collected data | Data type | Collected data | Data type |
| Composition | List of objects (a) | SPS | |
| Host | List of objects (a) | Temperature (K) | Integer |
| Dopant | List of objects (a) | ΔSPS temperature (K/min) | Integer |
| Created time and person’s name | String/integer | Time (min) | Integer |
| Uploaded time and person’s name | String/integer | SBA/ZEM3 measurement | |
| Synthesis process | | Direction | String |
| Date | Integer | Diameter | Float |
| Composition | List of objects (a) | Sample geometry | String |
| Synthesis method | String | Thickness | Float |
| Synthesis temperature (K) | Integer | Electrical conductivity (S/cm) | List of objects (b) |
| Synthesis time (min) | Integer | Seebeck coefficient (μV/K) | List of objects (b) |
| Milling speed | Integer | Power factor (μW/cm K²) | List of objects (b) |
| Milling time (min) | Integer | Lorenz number (10⁻⁸ W Ω/K²) | List of objects (b) |
| Cooling method | String | ZT | |
| Grinding method | String | Direction | String |
| LFA measurement | | ZT | List of objects (b) |
| Direction | String | XRD | |
| Thickness (mm) | Integer | Direction | String |
| References density | Float | Intensity (counts) | List of objects (c) |
| Diameter | Float | Lattice thermal conductivity | |
| Heat capacity | Float | Direction | String |
| Thermal diffusivity (mm²/s) | List of objects (b) | Lattice thermal conductivity (W/m K) | List of objects (b) |
| Thermal conductivity (W/m K) | List of objects (b) | | |
(a) “element”: string and “amount”: float.
(b) “temperature”: float and “target”: float.
(c) “2θ”: list of float and “target”: list of float.
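
To make the object shapes in footnotes (a) and (b) concrete, the following sketch builds one record and checks it against those shapes. The composition Sn0.99Cr0.01Se is the sample shown in Fig. 3; every key spelling (for example `electrical_conductivity_S_per_cm`) is an assumption for illustration, and the measurement value is a placeholder rather than real data.

```python
import json


def is_composition_list(items):
    """Footnote (a): each object holds an element symbol and a float amount."""
    return all(
        isinstance(item.get("element"), str) and isinstance(item.get("amount"), float)
        for item in items
    )


def is_temperature_series(points):
    """Footnote (b): each object pairs a float temperature with a float target value."""
    return all(
        isinstance(point.get("temperature"), float) and isinstance(point.get("target"), float)
        for point in points
    )


# Hypothetical record layout loosely following Table I.
record = {
    "composition": {
        "host": [{"element": "Sn", "amount": 0.99}, {"element": "Se", "amount": 1.0}],
        "dopant": [{"element": "Cr", "amount": 0.01}],
    },
    "synthesis_process": {"synthesis_method": "solid-state reaction"},
    "zem3_measurement": {
        # Placeholder point only; not a measured value.
        "electrical_conductivity_S_per_cm": [{"temperature": 300.0, "target": 0.0}],
    },
}

if __name__ == "__main__":
    assert is_composition_list(record["composition"]["host"])
    assert is_composition_list(record["composition"]["dopant"])
    assert is_temperature_series(record["zem3_measurement"]["electrical_conductivity_S_per_cm"])
    print(json.dumps(record, indent=2))
```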

An automatic calculation procedure is embedded to evaluate the specific heat capacity (Cp), κtot, κlat, L, and ZT from the raw data measured by the equipment. κtot is evaluated as the product of the measured thermal diffusivity, the density, and the specific heat capacity (κtot = αρCp, where α is the thermal diffusivity, ρ is the density, and Cp is the specific heat capacity); Cp is automatically calculated from the atomic composition of a given sample according to the Dulong–Petit law.65–67 The lattice contribution to the thermal conductivity is evaluated as κlat = κtot − κele, where κele is estimated via the Wiedemann–Franz relationship (κele = LσT).68 L is calculated using a single parabolic band model under the assumption of acoustic phonon scattering.40,69 ZT at a given T is obtained according to Eq. (1), where each quantity is linearly interpolated to T because the different instruments do not measure the transport properties at exactly the same temperatures.
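
The post-processing chain above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the platform's code: the function names are hypothetical, the unit conversions follow the units listed in Table I, and the Lorenz number is passed in as an input rather than computed from the single parabolic band model.

```python
from bisect import bisect_right

GAS_CONSTANT = 8.314462618  # J/(mol K)


def dulong_petit_cp(composition, atomic_masses):
    """Dulong-Petit specific heat in J/(g K): 3R per mole of atoms over the formula mass."""
    n_atoms = sum(composition.values())
    formula_mass = sum(atomic_masses[el] * amount for el, amount in composition.items())
    return 3.0 * GAS_CONSTANT * n_atoms / formula_mass


def interpolate(x, xs, ys):
    """Linear interpolation; xs must be strictly increasing and contain x in its range."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("temperature outside the measured range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def transport_at(temperature, t_el, sigma_s_cm, seebeck_uv_k, t_th, alpha_mm2_s,
                 density_g_cm3, cp_j_g_k, lorenz):
    """Evaluate kappa_tot, kappa_ele, kappa_lat, and ZT at one temperature (K).

    Electrical data (t_el grid) and thermal data (t_th grid) come from different
    instruments, so each series is interpolated to the requested temperature.
    """
    sigma = interpolate(temperature, t_el, sigma_s_cm) * 1e2        # S/cm -> S/m
    seebeck = interpolate(temperature, t_el, seebeck_uv_k) * 1e-6   # uV/K -> V/K
    alpha = interpolate(temperature, t_th, alpha_mm2_s) * 1e-6      # mm^2/s -> m^2/s
    kappa_tot = alpha * (density_g_cm3 * 1e3) * (cp_j_g_k * 1e3)    # W/(m K)
    kappa_ele = lorenz * sigma * temperature                        # Wiedemann-Franz
    kappa_lat = kappa_tot - kappa_ele
    zt = seebeck ** 2 * sigma * temperature / kappa_tot             # Eq. (1)
    return {"kappa_tot": kappa_tot, "kappa_ele": kappa_ele, "kappa_lat": kappa_lat, "ZT": zt}


if __name__ == "__main__":
    cp = dulong_petit_cp({"Sn": 1, "Se": 1}, {"Sn": 118.710, "Se": 78.971})
    # Illustrative inputs only; these are not measured values.
    result = transport_at(600.0, [300.0, 600.0, 900.0], [100.0, 100.0, 100.0],
                          [300.0, 300.0, 300.0], [323.0, 623.0, 923.0], [0.5, 0.5, 0.5],
                          density_g_cm3=6.0, cp_j_g_k=cp, lorenz=1.5e-8)
    print(cp, result)
```

Raising an error outside the measured range, instead of extrapolating, keeps ZT from being reported at temperatures where only one of the two instruments has data.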

In the computational data section, Periodic Table-based searches are available, and each hyperlink on the ID number leads to a new window showing the electronic structures and computational details for a given sample (Fig. 4). The collected data formats are presented in Table II. This page is specially optimized for doped/alloyed systems, in which host and dopant atoms are distinguished. The atomic coordinates and unit cell structure are represented by a table of xyz coordinates and a picture of a ball-and-stick model, and hovering the mouse over a row in the table of xyz coordinates highlights the selected atom in the crystal structure figure. The band structures, density of states, formation energies under the possible element-rich conditions, and simulated XRD patterns are displayed, and user-interactive zooming allows users to visualize a subset of the displayed data. The user-customized plot can be downloaded as an image file. The computational details, such as the types of van der Waals (vdW) functionals70 and pseudopotentials,71 k-points, and energy cutoff, are provided.

FIG. 4.

“DataExplorer” for a computational dataset of Sn31Se32Pb (TC00543). Atomic structures with unit cell information are visualized with a ball-and-stick model and xyz coordinates. The band structures and density of states are illustrated, and details of data information and calculation details are also given on the TEXplorer webpage.

TABLE II.

Data format for the computational dataset.

| Collected data | Data type | Collected data | Data type |
| Created time and person’s name | String/integer | Updated time and person’s name | String/integer |
| Calculation information | | | |
| Solver | String | K-points | List of integer |
| XC type | String | K-points shift | List of float |
| Pseudopotentials | String | Energy cutoff | Float |
| Structural information | | DFT results | |
| Host/dopant | String | Total energy (eV) | Float |
| Substitution site | String | Bandgap (eV) | Float |
| Chemical formula | List of objects (a) | Band structure | Using pymatgen8 |
| Space group | String | Density of states | Using pymatgen8 |
| Lattice parameters (Å) | Float | Formation energy (eV) | List of objects (b) |
| Unit cell volume (Å³) | Float | Seebeck coefficient | jpg image |
| Atomic coordinates (Å) | List of objects | XRD intensity (counts) | List of objects (c) |
(a) “element”: string and “amount”: list of float.
(b) rich_environment: (“energies”: list of float and “formation energy”: list of float).
(c) “2θ”: list of float and “target”: list of float.

The “Visualization-Graph” page allows comparison of thermoelectric properties between different samples through well-designed plots, as shown in Fig. 5. If we retrieve and select experimental samples on the list, the corresponding thermoelectric properties are plotted as a function of temperature. The following thermoelectric properties of σ, S, α, κtot, κlat, power factor, and ZT can be drawn with the dynamic zooming function. Hovering the mouse over the ID number on the legend highlights the graph line of the selected sample, and clicking an ID number hides or shows its graph line on the figure. The manipulated figure is also downloadable as an image file.

FIG. 5.

“Visualization-Graph.” Comparison of thermoelectric properties (here, ZT) among different samples in experimental and computational datasets. With the search function, users can manipulate the plots with sample selections and user-interactive zooming.

For computational datasets, a user-designed plot with three variables is available, in which the x-axis, the y-axis, and the size of the points can each be assigned one of the following parameters: ID number, uploader name, number of atomic sites, number of elements, incorporation of vdW interaction in the calculation, VASP version, space group of initial and final structures, cell volume, bandgap, Fermi energy, final total energy, and final total energy per atom. It is designed for practical statistical analysis of computational datasets.

Comparisons between electronic band structures and density of states for computational datasets are provided separately in the “Visualization-List” section. The figures of multiple samples are arranged in parallel, as shown in Fig. 6, and the changes in electronic structures by doping/alloying elements can be easily identified on this page.

FIG. 6.

“Visualization-List.” The electronic band structures and the density of states for selected samples are drawn in parallel for comparison.

In our previous work, we constructed ML models based on our thermoelectric database to predict the thermoelectric properties of SnSe-based materials with arbitrary doping.49 We trained the relationship between material compositions and thermoelectric properties of σ, S, and κtot within the experimental datasets. The computational datasets were used as feature vectors to describe the electronic structures of a given sample. Using the ML models and the solubility limit of dopants,49 we searched for optimal doping compositions with a fast screening of 2832 compositions and analyzed the physical mechanisms of the selected high-ZT compounds. Using our pre-trained ML model, the composition of Na0.01(Sn0.96Ge0.04)0.99Se gives the maximum ZT value of 2.393 at 800 K among 2832 compositions. Similarly, we also constructed ML models to predict the thermoelectric properties of Bi2Te3 systems with various doping/alloying.

TEXplorer provides ML toolkits for predicting thermoelectric properties of doped/alloyed SnSe/Bi2Te3 materials using pre-trained ML models. Once given an arbitrary doped/alloyed SnSe/Bi2Te3 composition with element types, molar ratios, and vacancy ratios, feature vectors are automatically generated in the background, and the ML models predict the thermoelectric properties of the given composition. The predicted results of σ, S, κtot, and ZT are plotted as a function of temperature, and the maximum value of ZT is displayed with the corresponding temperature (Fig. 7). Users can predict the thermoelectric properties of various compositions in SnSe/Bi2Te3 systems on TEXplorer and download the full results in Excel format.

FIG. 7.

“ML.” The predicting toolkit for thermoelectric properties of electrical conductivity, Seebeck coefficient, thermal conductivity, and ZT values. (a) By entering the doping element type, ratio, and amount of Sn vacancy, (b) the ML models predict the thermoelectric properties. The chemical formula, the maximum ZT with temperature, and corresponding plots are displayed. The predicted results of Na0.01(Sn0.96Ge0.04)0.99Se are displayed.

Based on the TEXplorer data platform, we systematically generated our own dataset of SnSe and Bi2Te3 systems with various doping and alloying, including dark data, to obtain design knowledge of high-ZT thermoelectric materials. TEXplorer is specially designed to collect and visualize experimental and computational data on the properties of thermoelectric materials. TEXplorer’s capabilities range from data collection with detailed information to interactive web-based visualization, which distinguishes it from other data platforms, such as the MRL platform,41 TE Design Lab,42 and Starrydata2,43 that rely on published paper data. TEXplorer offers researchers a user-friendly environment for data generation, processing, management, and data-driven research beyond the function of data repositories. By uploading data to TEXplorer, researchers can benefit from automated data processing and visualization, similar to an electronic laboratory notebook. Additionally, the data collected through TEXplorer are well organized into high-quality datasets that contain abundant information on experiments and computations. Therefore, TEXplorer helps researchers build their own database of thermoelectric materials by simply uploading their data to the platform, expanding the potential of data utilization in data-driven research.

With the high-quality datasets of SnSe systems collected through TEXplorer, we could understand the relationship between thermoelectric properties and doping/alloying conditions.49 The computational datasets proved essential in more effectively describing the experimental data. They also improved the predictive performance of thermoelectric properties for both known and unknown doping elements. The constructed ML models based on our datasets bring about successful material design with proper doping elements and ratios for high thermoelectric performance. These pre-trained models were uploaded on the platform, and users can easily estimate the thermoelectric performance of SnSe/Bi2Te3 systems with various doping/alloying by filling out simple parameters.

We also used ML predictive models and high-throughput screening to identify the design principles of SnSe systems with various doping elements and ratios.49 Furthermore, our simple predictive models for thermoelectric properties can be integrated with metaheuristic optimization algorithms, such as the artificial bee colony (ABC) algorithm,72 to search for high-ZT material compositions. Solving this inverse problem is expected to provide a shortcut to finding promising material compositions based on our datasets and ML predictive models. As the data size increases, the prediction accuracy and optimization performance can improve, so the ML toolkits in TEXplorer will also be updated with models trained on larger datasets.

Thanks to the rich information on the experimental conditions included in our datasets, the datasets can be applied to optimization problem studies. Bayesian optimization73 models and active learning74 can be employed to find high-ZT materials by optimizing experimental procedures with actively updated experimental results. Accurate predictive models based on the datasets can be a foundation for autonomous materials synthesis.75–77 The computational datasets are also available with more advanced features, such as density of states,78,79 without structural and elemental factors. Furthermore, the data platform can be employed universally for data collection and visualization, and the workflow and the structure of TEXplorer have been applied to various material platforms, such as perovskite solar cell materials,80 catalytic materials,81 and two-dimensional (2D) materials.82 

Although we initially built the database specifically for SnSe and Bi2Te3 systems with doping/alloying, TEXplorer was designed to be flexible, allowing the collection of thermoelectric properties for a wide variety of materials and synthesis methods. For example, users can upload thermoelectric property data for skutterudite or half-Heusler alloys together with their synthesis methods. Currently, TEXplorer offers template files for seven different synthesis methods, including solid-state reaction, ink printing, the Bridgman method, and the arc-melting method. Each template file contains all the information required for a given synthesis method and can be customized to meet the specific needs of individual researchers. The template files are available in the supplementary material and on the TEXplorer front page.

Data obtained from measurement devices, such as SBA, ZEM3, and LFA, can be parsed to extract σ, S, and κ. Additional parsers will be developed for data generated by new measurement equipment. Computational datasets can be collected more straightforwardly by uploading raw data files, including “vasprun.xml” files from the VASP package.62,63 With these capabilities, we aim to make TEXplorer more widely accessible to general registered users and establish it as a central hub for thermoelectric materials data. Similar to the Perovskite Database Project,83 which collects both literature (past) and experimental (future) data in one place to create a new standard for disseminating perovskite device data, TEXplorer can serve as a standard data management system in the field of thermoelectric materials. As more users share their data on the platform, the TEXplorer platform and database are expected to facilitate further analysis and understanding of thermoelectric materials using machine learning or artificial intelligence.
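
Once σ, S, and κ have been parsed, the dimensionless figure of merit follows from the standard relation ZT = S²σT/κ. The sketch below shows that post-processing step for a single measurement point. The `TransportPoint` record and its field names are hypothetical and do not reflect the platform's actual schema; SI units are assumed throughout, and the numerical values are illustrative rather than measured data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransportPoint:
    """One parsed measurement point (hypothetical field names, SI units assumed)."""
    temperature: float  # T in K
    sigma: float        # electrical conductivity in S/m
    seebeck: float      # Seebeck coefficient S in V/K
    kappa: float        # thermal conductivity in W/(m K)

def power_factor(p: TransportPoint) -> float:
    """Power factor S^2 * sigma in W/(m K^2)."""
    return p.seebeck ** 2 * p.sigma

def figure_of_merit(p: TransportPoint) -> float:
    """Dimensionless figure of merit ZT = S^2 * sigma * T / kappa."""
    if p.kappa <= 0:
        raise ValueError("thermal conductivity must be positive")
    return power_factor(p) * p.temperature / p.kappa

# Illustrative values only: S = 200 uV/K, sigma = 1e5 S/m, kappa = 1.5 W/(m K), T = 300 K.
point = TransportPoint(temperature=300.0, sigma=1.0e5, seebeck=200e-6, kappa=1.5)
zt = figure_of_merit(point)
print(f"PF = {power_factor(point):.3e} W/(m K^2), ZT = {zt:.3f}")
```

Because σ and S are typically measured on a different instrument from κ, a real pipeline would first interpolate the curves onto a common temperature grid before applying this formula point by point.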

The web-based research data platform TEXplorer was developed to collect, visualize, share, and utilize experimental and computational datasets on the properties of thermoelectric materials for data-driven research. In addition, TEXplorer is expected to make communication in collaborative work between experimental and computational groups more convenient, facilitate data management and processing, and establish a foundation for public-access databases. The TEXplorer platform is valuable both for providing systematic thermoelectric property datasets and for functioning as a convenient data-collection platform, which plays an essential role in the full cycle of data-driven material design research using ML approaches. We believe that TEXplorer will significantly increase the value and usefulness of datasets to the research community by providing an easy-to-use data collection platform and supporting data-driven research. Ultimately, it is expected to pave the way for accelerating and facilitating materials design and discovery in data-driven research.

See the supplementary material for detailed information on experimental procedures, computational details of the DFT calculations, the list of materials in the generated dataset, and TEXplorer templates for various experimental procedures.

We thank Professor Jae Sung Son, Kyu Hyoung Lee, Tae-Soo You, and Il-Ho Kim for sharing their experimental synthesis procedures. This work was supported by the Nano-Material Technology Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (Grant Nos. NRF-2017M3A7B4049273 and NRF-2017M3A7B4049274). H.L., T.K., S.B., and I.C. were partially supported by the Institute for Basic Science (Grant No. IBS-R009-G2). Computational resources were provided by the Korea Institute of Science and Technology Information (KISTI) supercomputing center.

The authors have no conflicts to disclose.

Y.-L.L. and H.L. contributed equally to this work.

Yea-Lee Lee: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Hyungseok Lee: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal); Writing – original draft (lead); Writing – review & editing (equal). Seunghun Jang: Conceptualization (equal); Data curation (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal). Jeongho Shin: Data curation (equal); Resources (equal); Software (equal). Taeshik Kim: Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal). Sejin Byun: Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal). In Chung: Conceptualization (equal); Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal). Jino Im: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Writing – original draft (equal); Writing – review & editing (equal). Hyunju Chang: Conceptualization (equal); Funding acquisition (equal); Project administration (equal); Writing – review & editing (equal).

All electronic data, including experimental and computational data, are available on our web-based platform, TEXplorer (https://www.texplorer.org/dataExplorer), with permission from the corresponding authors upon reasonable request.

1. A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, and K. A. Persson, APL Mater. 1(1), 011002 (2013).
2. S. Curtarolo, W. Setyawan, S. Wang, J. Xue, K. Yang, R. H. Taylor, L. J. Nelson, G. L. W. Hart, S. Sanvito, M. Buongiorno-Nardelli, N. Mingo, and O. Levy, Comput. Mater. Sci. 58, 227–235 (2012).
3. J. E. Saal, S. Kirklin, M. Aykol, B. Meredig, and C. Wolverton, JOM 65(11), 1501–1509 (2013).
4. L. Ruddigkeit, R. van Deursen, L. C. Blum, and J.-L. Reymond, J. Chem. Inf. Model. 52(11), 2864–2875 (2012).
5. R. Ramakrishnan, P. O. Dral, M. Rupp, and O. A. von Lilienfeld, Sci. Data 1(1), 140022 (2014).
6. A. Zakutayev, N. Wunder, M. Schwarting, J. D. Perkins, R. White, K. Munch, W. Tumas, and C. Phillips, Sci. Data 5(1), 180053 (2018).
7. L. Ward, A. Dunn, A. Faghaninia, N. E. R. Zimmermann, S. Bajaj, Q. Wang, J. Montoya, J. Chen, K. Bystrom, M. Dylla, K. Chard, M. Asta, K. A. Persson, G. J. Snyder, I. Foster, and A. Jain, Comput. Mater. Sci. 152, 60–69 (2018).
8. A. Jain, G. Hautier, C. J. Moore, S. Ping Ong, C. C. Fischer, T. Mueller, K. A. Persson, and G. Ceder, Comput. Mater. Sci. 50(8), 2295–2310 (2011).
9. G. Pizzi, A. Cepellotti, R. Sabatini, N. Marzari, and B. Kozinsky, Comput. Mater. Sci. 111, 218–230 (2016).
10. T. Xie and J. C. Grossman, Phys. Rev. Lett. 120(14), 145301 (2018).
11. C. C. Fischer, K. J. Tibbetts, D. Morgan, and G. Ceder, Nat. Mater. 5(8), 641–646 (2006).
12. G. Pilania, C. Wang, X. Jiang, S. Rajasekaran, and R. Ramprasad, Sci. Rep. 3(1), 2810 (2013).
13. J. Lee, A. Seko, K. Shitara, K. Nakayama, and I. Tanaka, Phys. Rev. B 93(11), 115104 (2016).
14. K. Kim, S. Kang, J. Yoo, Y. Kwon, Y. Nam, D. Lee, I. Kim, Y.-S. Choi, Y. Jung, S. Kim, W.-J. Son, J. Son, H. S. Lee, S. Kim, J. Shin, and S. Hwang, npj Comput. Mater. 4(1), 67 (2018).
15. R. Vasudevan, G. Pilania, and P. V. Balachandran, J. Appl. Phys. 129(7), 070401 (2021).
16. Y. Liu, T. Zhao, W. Ju, and S. Shi, J. Materiomics 3(3), 159–177 (2017).
17. H. W. Kim, S. W. Lee, G. S. Na, S. J. Han, S. K. Kim, J. H. Shin, H. Chang, and Y. T. Kim, React. Chem. Eng. 6(2), 235–243 (2021).
18. H. Gao, T. J. Struble, C. W. Coley, Y. Wang, W. H. Green, and K. F. Jensen, ACS Cent. Sci. 4(11), 1465–1476 (2018).
19. G. Hautier, A. Miglio, G. Ceder, G.-M. Rignanese, and X. Gonze, Nat. Commun. 4(1), 2292 (2013).
20. J. Im, S. Lee, T.-W. Ko, H. W. Kim, Y. Hyon, and H. Chang, npj Comput. Mater. 5(1), 37 (2019).
21. R. Armiento, B. Kozinsky, M. Fornari, and G. Ceder, Phys. Rev. B 84(1), 014103 (2011).
22. S. M. Moosavi, K. M. Jablonka, and B. Smit, J. Am. Chem. Soc. 142(48), 20273–20287 (2020).
23. S. Hong, C. H. Liow, J. M. Yuk, H. R. Byon, Y. Yang, E. Cho, J. Yeom, G. Park, H. Kang, S. Kim, Y. Shim, M. Na, C. Jeong, G. Hwang, H. Kim, H. Kim, S. Eom, S. Cho, H. Jun, Y. Lee, A. Baucour, K. Bang, M. Kim, S. Yun, J. Ryu, Y. Han, A. Jetybayeva, P.-P. Choi, J. C. Agar, S. V. Kalinin, P. W. Voorhees, P. Littlewood, and H. M. Lee, ACS Nano 15(3), 3971–3995 (2021).
24. J. Han, K.-J. Go, J. Jang, S. Yang, and S.-Y. Choi, npj Comput. Mater. 8(1), 196 (2022).
25. F. Brockherde, L. Vogt, L. Li, M. E. Tuckerman, K. Burke, and K.-R. Müller, Nat. Commun. 8(1), 872 (2017).
26. V. Tshitoyan, J. Dagdelen, L. Weston, A. Dunn, Z. Rong, O. Kononova, K. A. Persson, G. Ceder, and A. Jain, Nature 571(7763), 95–98 (2019).
27. P. Raccuglia, K. C. Elbert, P. D. F. Adler, C. Falk, M. B. Wenny, A. Mollo, M. Zeller, S. A. Friedler, J. Schrier, and A. J. Norquist, Nature 533(7601), 73–76 (2016).
28. G. Tan, L.-D. Zhao, and M. G. Kanatzidis, Chem. Rev. 116(19), 12123–12149 (2016).
29. P. Vaqueiro and A. V. Powell, J. Mater. Chem. 20(43), 9577–9584 (2010).
30. H.-S. Kim, N. A. Heinz, Z. M. Gibbs, Y. Tang, S. D. Kang, and G. J. Snyder, Mater. Today 20(8), 452–459 (2017).
31. C. Zhou, Y. Yu, Y. K. Lee, O. Cojocaru-Mirédin, B. Yoo, S.-P. Cho, J. Im, M. Wuttig, T. Hyeon, and I. Chung, J. Am. Chem. Soc. 140(45), 15535–15545 (2018).
32. C. Zhou, Y. Yu, Y.-L. Lee, B. Ge, W. Lu, O. Cojocaru-Mirédin, J. Im, S.-P. Cho, M. Wuttig, Z. Shi, and I. Chung, J. Am. Chem. Soc. 142(35), 15172–15186 (2020).
33. J. R. Sootsman, D. Y. Chung, and M. G. Kanatzidis, Angew. Chem., Int. Ed. 48(46), 8616–8639 (2009).
34. C. J. Vineis, A. Shakouri, A. Majumdar, and M. G. Kanatzidis, Adv. Mater. 22(36), 3970–3980 (2010).
35. M. G. Kanatzidis, Chem. Mater. 22(3), 648–659 (2009).
36. W. Liu, X. Yan, G. Chen, and Z. Ren, Nano Energy 1(1), 42–56 (2012).
37. P. Pichanusakorn and P. Bandaru, Mater. Sci. Eng.: R: Rep. 67(2–4), 19–63 (2010).
38. C. Zhou and I. Chung, Coord. Chem. Rev. 421, 213437 (2020).
39. G. Tan, F. Shi, S. Hao, L.-D. Zhao, H. Chi, X. Zhang, C. Uher, C. Wolverton, V. P. Dravid, and M. G. Kanatzidis, Nat. Commun. 7(1), 12167 (2016).
40. C. Zhou, Y. K. Lee, J. Cha, B. Yoo, S.-P. Cho, T. Hyeon, and I. Chung, J. Am. Chem. Soc. 140(29), 9282–9290 (2018).
41. M. W. Gaultois, T. D. Sparks, C. K. H. Borg, R. Seshadri, W. D. Bonificio, and D. R. Clarke, Chem. Mater. 25(15), 2911–2920 (2013).
42. P. Gorai, D. Gao, B. Ortiz, S. Miller, S. A. Barnett, T. Mason, Q. Lv, V. Stevanović, and E. S. Toberer, Comput. Mater. Sci. 112, 368–376 (2016).
43. Y. Katsura, M. Kumagai, T. Kodani, M. Kaneshige, Y. Ando, S. Gunji, Y. Imai, H. Ouchi, K. Tobita, K. Kimura, and K. Tsuda, Sci. Technol. Adv. Mater. 20(1), 511–520 (2019).
44. M. W. Gaultois, A. O. Oliynyk, A. Mar, T. D. Sparks, G. J. Mulholland, and B. Meredig, APL Mater. 4(5), 053213 (2016).
45. A. Furmanchuk, J. E. Saal, J. W. Doak, G. B. Olson, A. Choudhary, and A. Agrawal, J. Comput. Chem. 39(4), 191–202 (2018).
46. Y. Xu, L. Jiang, and X. Qi, Comput. Mater. Sci. 197, 110625 (2021).
47. G. S. Na, S. Jang, and H. Chang, npj Comput. Mater. 7(1), 106 (2021).
48. G. S. Na and H. Chang, npj Comput. Mater. 8(1), 214 (2022).
49. Y.-L. Lee, H. Lee, T. Kim, S. Byun, Y. K. Lee, S. Jang, I. Chung, H. Chang, and J. Im, J. Am. Chem. Soc. 144(30), 13748–13763 (2022).
50. L.-D. Zhao, S.-H. Lo, Y. Zhang, H. Sun, G. Tan, C. Uher, C. Wolverton, V. P. Dravid, and M. G. Kanatzidis, Nature 508(7496), 373–377 (2014).
51. L.-D. Zhao, G. Tan, S. Hao, J. He, Y. Pei, H. Chi, H. Wang, S. Gong, H. Xu, V. P. Dravid, C. Uher, G. J. Snyder, C. Wolverton, and M. G. Kanatzidis, Science 351(6269), 141 (2016).
52. C. Chang, M. Wu, D. He, Y. Pei, C.-F. Wu, X. Wu, H. Yu, F. Zhu, K. Wang, Y. Chen, L. Huang, J.-F. Li, J. He, and L.-D. Zhao, Science 360(6390), 778 (2018).
53. H. J. Goldsmid and R. W. Douglas, Br. J. Appl. Phys. 5(11), 386–390 (1954).
54. T. Zhu, L. Hu, X. Zhao, and J. He, Adv. Sci. 3(7), 1600004 (2016).
55. I. T. Witting, T. C. Chasapis, F. Ricci, M. Peters, N. A. Heinz, G. Hautier, and G. J. Snyder, Adv. Electron. Mater. 5(6), 1800904 (2019).
56. B. Zhu, X. Liu, Q. Wang, Y. Qiu, Z. Shu, Z. Guo, Y. Tong, J. Cui, M. Gu, and J. He, Energy Environ. Sci. 13(7), 2106–2114 (2020).
57. K. Park, K. Ahn, J. Cha, S. Lee, S. I. Chae, S.-P. Cho, S. Ryee, J. Im, J. Lee, S.-D. Park, M. J. Han, I. Chung, and T. Hyeon, J. Am. Chem. Soc. 138(43), 14458–14468 (2016).
58. O. Brand, G. K. Fedder, C. Hierold, J. G. Korvink, and O. Tabata, Thermoelectric Energy Conversion: Basic Concepts and Device Applications (John Wiley & Sons, 2017).
59. Y. K. Lee, K. Ahn, J. Cha, C. Zhou, H. S. Kim, G. Choi, S. I. Chae, J.-H. Park, S.-P. Cho, S. H. Park, Y.-E. Sung, W. B. Lee, T. Hyeon, and I. Chung, J. Am. Chem. Soc. 139(31), 10887–10896 (2017).
60. J. Mackey, F. Dynys, and A. Sehirlioglu, Rev. Sci. Instrum. 85(8), 085119 (2014).
61. W. J. Parker, R. J. Jenkins, C. P. Butler, and G. L. Abbott, J. Appl. Phys. 32(9), 1679–1684 (1961).
62. G. Kresse and J. Furthmüller, Phys. Rev. B 54(16), 11169–11186 (1996).
63. G. Kresse and J. Furthmüller, Comput. Mater. Sci. 6(1), 15–50 (1996).
64. See https://www.mongodb.com/ for MongoDB.
65. A. T. Petit and P. L. Dulong, Annales de Chimie et de Physique 10, 395–413 (1819).
66. C. Kittel, Introduction to Solid State Physics, 8th ed. (John Wiley & Sons, Inc., 2018).
67. J. W. Mellor, A Comprehensive Treatise on Inorganic and Theoretical Chemistry (Longmans, Green and Co, Ltd., 1923).
68. M. Jonson and G. D. Mahan, Phys. Rev. B 21(10), 4223–4229 (1980).
69. A. May, G. Snyder, and D. Rowe, Thermoelectrics Handbook: Thermoelectrics and Its Energy Harvesting (CRC Press, Boca Raton, FL, 2012).
70. A. Tkatchenko and M. Scheffler, Phys. Rev. Lett. 102(7), 073005 (2009).
71. J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77(18), 3865–3868 (1996).
72. D. Karaboga, Technical Report No. tr06, Erciyes University, Engineering Faculty, Computer, 2005.
73. J. Močkus, The Bayesian Approach to Local Optimization (Springer, 1989).
74. B. Settles, Computer Sciences Technical Report No. 1648, University of Wisconsin–Madison, 2010.
75. B. Burger, P. M. Maffettone, V. V. Gusev, C. M. Aitchison, Y. Bai, X. Wang, X. Li, B. M. Alston, B. Li, R. Clowes, N. Rankin, B. Harris, R. S. Sprick, and A. I. Cooper, Nature 583(7815), 237–241 (2020).
76. B. P. MacLeod, F. G. L. Parlane, T. D. Morrissey, F. Häse, L. M. Roch, K. E. Dettelbach, R. Moreira, L. P. E. Yunker, M. B. Rooney, J. R. Deeth, V. Lai, G. J. Ng, H. Situ, R. H. Zhang, M. S. Elliott, T. H. Haley, D. J. Dvorak, A. Aspuru-Guzik, J. E. Hein, and C. P. Berlinguette, Sci. Adv. 6(20), eaaz8867 (2020).
77. R. Shimizu, S. Kobayashi, Y. Watanabe, Y. Ando, and T. Hitosugi, APL Mater. 8(11), 111110 (2020).
78. V. Fung, G. Hu, P. Ganesh, and B. G. Sumpter, Nat. Commun. 12(1), 88 (2021).
79. M. Kuban, S. Rigamonti, M. Scheidgen, and C. Draxl, Sci. Data 9(1), 646 (2022).
80. See http://solar.chemdx.org/ for information about perovskite solar cell database.
81. See http://cat.chemdx.org/ for information about catalysis database.
82. See https://2dmat.chemdx.org for information about 2D materials database.
83. T. J. Jacobsson, A. Hultqvist, A. García-Fernández, A. Anand, A. Al-Ashouri, A. Hagfeldt, A. Crovetto, A. Abate, A. G. Ricciardulli, A. Vijayan, A. Kulkarni, A. Y. Anderson, B. P. Darwich, B. Yang, B. L. Coles, C. A. R. Perini, C. Rehermann, D. Ramirez, D. Fairen-Jimenez, D. Di Girolamo, D. Jia, E. Avila, E. J. Juarez-Perez, F. Baumann, F. Mathies, G. S. A. González, G. Boschloo, G. Nasti, G. Paramasivam, G. Martínez-Denegri, H. Näsström, H. Michaels, H. Köbler, H. Wu, I. Benesperi, M. I. Dar, I. Bayrak Pehlivan, I. E. Gould, J. N. Vagott, J. Dagar, J. Kettle, J. Yang, J. Li, J. A. Smith, J. Pascual, J. J. Jerónimo-Rendón, J. F. Montoya, J.-P. Correa-Baena, J. Qiu, J. Wang, K. Sveinbjörnsson, K. Hirselandt, K. Dey, K. Frohna, L. Mathies, L. A. Castriotta, M. H. Aldamasy, M. Vasquez-Montoya, M. A. Ruiz-Preciado, M. A. Flatken, M. V. Khenkin, M. Grischek, M. Kedia, M. Saliba, M. Anaya, M. Veldhoen, N. Arora, O. Shargaieva, O. Maus, O. S. Game, O. Yudilevich, P. Fassl, Q. Zhou, R. Betancur, R. Munir, R. Patidar, S. D. Stranks, S. Alam, S. Kar, T. Unold, T. Abzieher, T. Edvinsson, T. W. David, U. W. Paetzold, W. Zia, W. Fu, W. Zuo, V. R. F. Schröder, W. Tress, X. Zhang, Y.-H. Chiang, Z. Iqbal, Z. Xie, and E. Unger, Nat. Energy 7(1), 107–115 (2022).
