A multi-layer perceptron neural network was used to predict the laser transition figure of merit, a measure of the laser threshold gain, of over 900 × 10^{6} Quantum Cascade (QC) laser designs using only layer thicknesses and the applied electric field as inputs. Designs were generated by randomly altering the layer thicknesses of an initial 10-layer design. When validated with our 1D Schrödinger solver, the predicted values show 5%–15% error for the laser structures, well within QC laser design variations. The algorithm (i) allowed for the identification of high figure of merit structures, (ii) recognized which layers should be altered to maximize the figure of merit at a given electric field, and (iii) increased the figure of merit of the original design from 94.7 to 141.2 eV ps Å^{2}, a 1.5-fold improvement that is significant for QC lasers. The computational time for laser design data collection is greatly reduced, from 32 h for 27 000 designs using our 1D Schrödinger solver on a virtual machine to 8 h for 907 × 10^{6} designs using the machine learning algorithm on a laptop computer.

## I. INTRODUCTION

Quantum Cascade Lasers (QCLs) are optoelectronic devices that operate in the mid-infrared^{1–3} and THz regimes^{4,5} of the electromagnetic spectrum. These lasers are useful for atmospheric trace chemical and particle detection,^{6,7} low-visibility communication,^{8,9} and label-free medical imaging,^{10,11} among various other applications. A QCL design consists of alternating well and barrier material such that the band structure forms a multi-quantum well heterostructure. Electrons are pumped into the system, and photons are emitted through intersubband transitions in the conduction band; between optical transitions, the electrons continue traveling across the heterostructure through longitudinal optical (LO) phonon and other scattering. Ideally, above laser threshold, every electron releases a photon in every period of active region and injector present in the active core of the structure. The emission wavelength of the QCL can be tuned by changing the specific parameters of the design, such as layer thickness, applied electric field, material compositions, and material system. A combination of human intuition and computational analysis using solutions from a Schrödinger solver is a common approach to QCL design.^{12}

There have been several attempts at optimizing the large design parameter space of QCLs using various computational methods. Use of a genetic algorithm (GA) increased the wall-plug efficiency of a mid-infrared QCL by 7%,^{13} while another GA method was able to optimize a THz QCL transition frequency over a 2.9 THz range,^{14} among other applications.^{15–18} Simulated annealing on a triple step quantum well design kept the transition energy around 50 meV^{19} and also optimized superstructure gratings in QCLs to achieve non-equidistant frequencies.^{20} Inverse spectral theory maximizes the gain in a quantum well laser^{21} as well as optimizes the active region of a 12 *μ*m QCL.^{22} Machine learning (ML) for the optimization of semiconductor devices has been applied to defect identification during the fabrication process^{23–25} and to designing nanostructures in nanophotonics.^{26–28} ML approaches for QCLs have so far looked at improving calculation time for modal gain,^{29} predicting resonant mode characteristics of QCLs in the THz regime,^{30} predicting the emission spectra of THz quantum cascade random lasers,^{31} predicting the threshold gain from higher-order modes given the refractive index profile of a QCL cavity,^{32} as well as labeling of relevant wavefunctions.^{33} In Ref. 34, an inverse network predicts the active region design of a QCL and then a forward network predicts metrics of interest from that design, such as energy difference and LO lifetimes with low errors of 2%–15%, although entire QCL designs (active and injector regions) at different electric fields are not used for training the networks.

Here, we develop a framework for designing and optimizing QCLs by building a ML tool around the Schrödinger solver. In particular, the framework uses ML to optimize a figure of merit (FoM), a measure of the laser threshold gain, for an initial QC design. Given the number of layers and the layer thicknesses, along with the applied electric field, the algorithm predicts the FoM. Previous work started with collecting data for a QCL dataset to use in training an algorithm.^{35} A laser transition code was built to compile QCL datasets by identifying the electronic state-pair transition in a QCL design with a high FoM in the band structure.^{36} ML was then applied to various QCL datasets, optimizing the parameters of a multi-layer perceptron (MLP) neural network, such as the activation functions and number of neurons in a hidden layer.^{37}

This paper focuses on optimizing the FoM of a 10-layer structure by adding a random thickness in the −2 to +3 Å range to every layer, up to 30% variation of layer thickness, and changing the electric field from 10 to 150 kV/cm. The FoM of every design in the resulting design space is predicted by ML, and then, selected designs with very high predicted FoM values are evaluated using our 1D Schrödinger solver. ML (i) suggests new QCL designs with a high FoM and (ii) recognizes which layers should be altered, and by how much, in order to maximize the FoM for a specific starting design.

## II. METHODS

### A. Identifying starting design and QCL parameters

*µ*m wavelength QCL.^{1} The layer thickness sequence is **9**/57/**11**/54/**12**/45/**25**/34/**14**/33 (Å), where the bold font indicates the Al_{0.48}In_{0.52}As barrier material and the normal text indicates the In_{0.53}Ga_{0.47}As well material. ErwinJr2, our research group’s 1D Schrödinger solver,^{38} was used to calculate the eigenenergies and eigenfunctions of the multi-quantum well heterostructure at an applied electric field. ErwinJr2 also calculates the FoM, gain, and energy difference between any two eigenstates, among various other quantities. The FoM for this framework is calculated as

$$\mathrm{FoM} = E_{ul}\,\tau_{u}\left(1 - \frac{\tau_{l}}{\tau_{ul}}\right) z_{ul}^{2},$$

where *E*_{ul} is the energy difference between the upper and lower electronic subbands, *τ*_{u} and *τ*_{l} are the upper and lower scattering lifetimes, respectively, *τ*_{ul} is the LO phonon scattering lifetime between the upper and lower states, and *z*_{ul} is the dipole matrix element. Our FoM has units of [eV ps Å^{2}] and is used to assess the quality of a state-pair transition, with a higher FoM indicating a better laser transition, leading, for example, to a lower threshold.
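As a worked illustration, the sketch below evaluates the FoM from the quantities defined above. It assumes the product form *E*_{ul} *τ*_{u} (1 − *τ*_{l}/*τ*_{ul}) *z*_{ul}^{2}, which is consistent with the stated units of eV ps Å^{2}; the function name and the numerical inputs are illustrative and are not taken from ErwinJr2 or from the dataset.

```python
def figure_of_merit(e_ul_ev, tau_u_ps, tau_l_ps, tau_ul_ps, z_ul_angstrom):
    """Return the FoM in eV ps Å^2, using the assumed product form
    E_ul * tau_u * (1 - tau_l / tau_ul) * z_ul**2."""
    if tau_ul_ps <= 0:
        raise ValueError("tau_ul must be positive")
    # Effective upper-state lifetime, reduced by the lower-state lifetime.
    effective_lifetime_ps = tau_u_ps * (1.0 - tau_l_ps / tau_ul_ps)
    return e_ul_ev * effective_lifetime_ps * z_ul_angstrom ** 2


# Illustrative (made-up) inputs, not values from the paper.
fom = figure_of_merit(e_ul_ev=0.1, tau_u_ps=2.0, tau_l_ps=0.5,
                      tau_ul_ps=2.5, z_ul_angstrom=20.0)
print(round(fom, 3))  # 64.0
```

In this form, a longer upper-state lifetime, a faster depopulation of the lower state, or a larger dipole matrix element all raise the FoM.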

One “design” is defined as one QCL layer sequence at one electric field. Each design has a unique set of eigenenergies and eigenfunctions that change when any layer thickness, or the electric field, is altered. A unique aspect of QCLs is that there is a favorable state-pair interaction that releases photons at the energy difference of that electronic intersubband transition through stimulated emission. This “laser transition” is repeated for however many periods of the active region the structure contains. A code was developed to automatically identify this laser transition for every design using ErwinJr2 and can be found in Ref. 36. In short, given the layer thicknesses and electric field as inputs, the code collects the energy difference, scattering times, FoM, and gain coefficient from all the state-pair transitions in the multi-quantum well design. The laser transition is identified by first filtering out transitions that are in the continuum, have an energy larger than the conduction band offset (about 500 meV), or have scattering times larger than 100 ps. Then, the three state-pair electronic transitions with the highest FoM are identified, with the “laser transition” being the state-pair energetically in the middle relative to the other state-pairs. The code marks it in red in graphs (as shown in Fig. 1) and records it for a QCL dataset. Further details about the laser transition code can be found in Refs. 35 and 36.
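The selection steps described above can be sketched as follows. This is a minimal illustration and not the code of Ref. 36: the `Transition` record and its field names are hypothetical, and "energetically in the middle" is assumed here to refer to the upper-state energy of the three candidates.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    upper_energy_ev: float  # energy of the upper state (hypothetical field)
    energy_diff_ev: float   # transition energy E_ul
    scattering_ps: float    # scattering time checked by the filter
    fom: float              # figure of merit in eV ps Å^2
    in_continuum: bool


def pick_laser_transition(transitions, band_offset_ev=0.5, max_scatter_ps=100.0):
    """Filter the state pairs, keep the three with the highest FoM, and
    return the one that lies energetically in the middle."""
    kept = [t for t in transitions
            if not t.in_continuum
            and t.energy_diff_ev <= band_offset_ev
            and t.scattering_ps <= max_scatter_ps]
    top_three = sorted(kept, key=lambda t: t.fom, reverse=True)[:3]
    if len(top_three) < 3:
        return None  # too few candidates, e.g. no cascading at this field
    # Assumption: the middle pair is chosen by upper-state energy.
    return sorted(top_three, key=lambda t: t.upper_energy_ev)[1]


# Made-up candidates for illustration only.
candidates = [
    Transition(0.60, 0.11, 2.0, 90.0, False),
    Transition(0.45, 0.12, 1.5, 95.0, False),
    Transition(0.30, 0.10, 1.8, 80.0, False),
    Transition(0.90, 0.70, 1.0, 200.0, False),   # above the band offset
    Transition(0.50, 0.10, 150.0, 120.0, False), # scattering time too long
    Transition(0.95, 0.10, 1.0, 300.0, True),    # in the continuum
    Transition(0.20, 0.05, 1.0, 10.0, False),    # low FoM
]
laser = pick_laser_transition(candidates)
print(laser.fom, laser.upper_energy_ev)
```

Returning `None` when fewer than three candidates survive is a design choice of this sketch; it mirrors the case where a design does not cascade and no laser transition should be recorded.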

Figure 1 shows the band structure of our starting 10-layer design for four periods at 90 kV/cm. For our starting design, the FoM of 94.7 eV ps Å^{2} at 90 kV/cm was the highest for an electric field sweep of 10–150 kV/cm at 10 kV/cm increments. The energy difference of the laser transition is 108.9 meV, corresponding to an emission wavelength of 11.4 *µ*m.
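The conversion between transition energy and emission wavelength quoted above follows from λ = hc/E. A short check, with the helper name chosen here for illustration:

```python
HC_EV_UM = 1.239841984  # h*c expressed in eV·µm


def wavelength_um(energy_mev):
    """Convert a transition energy in meV to a vacuum wavelength in µm."""
    return HC_EV_UM / (energy_mev / 1000.0)


print(round(wavelength_um(108.9), 1))  # starting design: 11.4
```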

### B. [−2, +3] Å dataset

A dataset of 1800 structures was generated using the 10-layer starting structure, with an electric field range of 10–150 kV/cm applied to every structure in increments of 10 kV/cm. The 15 field iterations for 1800 structures give a total of 27 000 designs. The structures were generated by altering each of the ten layers separately by a random thickness. This random thickness was an integer value and could vary from −2 to +3 Å, including 0 Å. For ten layers and six integer random thicknesses, the total number of possible structures is 6^{10} or ∼60.5 × 10^{6}. Including the electric field sweep, we have 15 × 6^{10} or ∼907 × 10^{6} total designs, which is our design space to be predicted by ML. The design space grows rapidly with each additional layer pair or each additional random thickness value, which is why a 10-layer structure with a [−2, +3] Å random tolerance range was initially selected when making the dataset. The dataset can be found at Ref. 39 and was generated with the Microsoft Azure cloud computing environment using a Linux Ubuntu virtual machine with eight central processing units (CPUs) and 16 GB RAM. Other QCL datasets with different tolerance ranges, electric field sweeps, or different starting structures can be easily prepared in this same manner.
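The counting argument and the random generation of structures can be sketched in a few lines. The starting thicknesses are the layer sequence from Sec. II A; the variable names, the fixed seed, and the use of independent uniform draws per layer are assumptions of this sketch, not details taken from the dataset code.

```python
import random

START_ANGSTROM = [9, 57, 11, 54, 12, 45, 25, 34, 14, 33]  # starting design
OFFSETS_ANGSTROM = range(-2, 4)    # integer changes from -2 to +3 Å
FIELDS_KV_CM = range(10, 151, 10)  # 10 to 150 kV/cm in 10 kV/cm steps


def random_structure(rng):
    """Add an independent random integer offset to every layer."""
    return [t + rng.choice(OFFSETS_ANGSTROM) for t in START_ANGSTROM]


# Size of the full design space.
n_structures = len(OFFSETS_ANGSTROM) ** len(START_ANGSTROM)
n_designs = n_structures * len(FIELDS_KV_CM)

# A dataset-sized sample: 1800 structures, each at all 15 fields.
rng = random.Random(0)  # seed chosen arbitrarily for reproducibility
structures = [random_structure(rng) for _ in range(1800)]
designs = [(s, f) for s in structures for f in FIELDS_KV_CM]

print(n_structures, n_designs, len(designs))  # 60466176 906992640 27000
```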

### C. Building the ML algorithm

A multi-layer perceptron (MLP) neural network consisting of five layers is used to predict QCL FoMs. The first layer of the neural network, the input layer, takes the ten QCL layer thicknesses and the electric field of a design (11 inputs in total). Three hidden layers follow, each consisting of a fully connected layer of 50 neurons, a normalization layer, and a rectified linear unit (ReLU) activation layer. Finally, a fully connected output layer, followed by a regression layer, returns five quantities for the selected laser transition: FoM, gain coefficient, dipole matrix element, LO phonon scattering time, and energy difference. This multi-output regression model uses the root-mean-square error (RMSE) to assess the accuracy of the FoM. The network is trained with the Adam optimizer at a 0.001 initial learning rate for 150 epochs. The dataset is split into 70% for training the network, 15% for validation, and 15% for testing. ML is performed using the Deep Network Designer in MATLAB on a personal computer with four CPUs and 16 GB of RAM.^{40} More details on building the MLP neural network and optimization of network parameters for different QCL datasets can be found in Ref. 37. The trained ML algorithm can be found in Ref. 41.
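To make the network shape concrete, the following pure-Python sketch builds an untrained 11-50-50-50-5 network and runs one forward pass. It is not the MATLAB model of Ref. 41: the weights are random, training with Adam is omitted, and the normalization step is a simplified stand-in (zero mean and unit variance across the neurons of a layer), since the exact normalization layer is not specified here. The parameter count assumes one learnable scale and offset per hidden neuron in each normalization layer.

```python
import math
import random

LAYER_SIZES = [11, 50, 50, 50, 5]  # inputs, three hidden layers, outputs


def count_parameters(sizes):
    """Weights and biases of the fully connected layers, plus an assumed
    scale and offset per hidden neuron for the normalization layers."""
    dense = sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))
    norm = sum(2 * n for n in sizes[1:-1])
    return dense + norm


def init_network(sizes, rng):
    """Random weights and zero biases, for illustration only (untrained)."""
    network = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network


def normalize(values, eps=1e-5):
    """Simplified stand-in for the normalization layer."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def forward(network, inputs):
    """Fully connected -> normalization -> ReLU for each hidden layer,
    then a plain fully connected output layer."""
    x = list(inputs)
    for index, (weights, biases) in enumerate(network):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if index < len(network) - 1:  # hidden layers only
            x = [max(0.0, v) for v in normalize(x)]
    return x


network = init_network(LAYER_SIZES, random.Random(0))
design = [9, 57, 11, 54, 12, 45, 25, 34, 14, 33, 90]  # ten layers (Å) + field (kV/cm)
outputs = forward(network, design)  # five regression outputs

# 70/15/15 split of the 27 000 designs, using integer arithmetic.
n_train = 27_000 * 70 // 100
n_val = 27_000 * 15 // 100
n_test = 27_000 - n_train - n_val
print(count_parameters(LAYER_SIZES), len(outputs), n_train, n_val, n_test)
```

With only a few thousand parameters, a network of this size is cheap to evaluate, which is what makes predicting the full design space on a personal computer practical.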

## III. RESULTS

### A. [−2, +3] Å dataset and ML algorithm

The 27 000 laser transition FoM values of the [−2, +3] Å dataset^{39} are collected in ∼32 h using our Azure virtual machine. Figure 2(a) shows a FoM scatterplot of all 1800 structures at 100 kV/cm. Although the layer changes vary only from −2 to +3 Å, the FoM values range from the low 20s to the high 150s eV ps Å^{2}, with an average FoM of 91.3 eV ps Å^{2} (solid line). The average FoM for the entire dataset at all electric fields is 67.3 eV ps Å^{2} and is plotted as a dashed line for reference.

The MLP neural network^{41} was trained on the dataset in 412 s with a FoM RMSE of 16.4 eV ps Å^{2}. The FoM values for 15 randomly selected designs of the test subset are plotted in Fig. 2(b), with the actual dataset values represented as circles, the algorithm predictions as triangles, and the errors as crosses. The plot shows very good agreement between the actual and predicted FoM values of the testing set, with an average error of 5%–15% for predicted vs real laser transitions. This prediction accuracy is already well within what is needed for a successful QCL design. High errors (e.g., designs 3 and 13) can be attributed either to numerical errors inherent to the Schrödinger solver or to a lack of cascading at low and very high electric fields; such “designs” are filtered out when optimizing for high FoM structures.
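The two accuracy measures used here can be written out explicitly. The sketch below uses made-up FoM values, and the sign convention of the percent error (predicted minus actual, relative to actual) is an assumption chosen to match the negative errors listed in Table I when ML underestimates the ErwinJr2 value.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def percent_error(predicted, actual):
    """Signed relative error in percent (assumed convention)."""
    return 100.0 * (predicted - actual) / actual


# Made-up FoM values in eV ps Å^2, for illustration only.
predicted = [100.0, 80.0, 120.0]
actual = [90.0, 85.0, 130.0]
print(round(rmse(predicted, actual), 2))
print([round(percent_error(p, a), 1) for p, a in zip(predicted, actual)])
```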

### B. Design subspace visualization

The ML algorithm is used to predict the entire 10-layer [−2, +3] Å design space of 15 × 6^{10} designs. Prediction of the entire design space is done in 8 h by ML, 134 000 times faster, and with fewer computational resources, than using the laser transition code. To visualize the data, the design space was separated into six subspaces, in each of which the change to the first layer is kept constant. For example, the total number of structures where +2 Å is always added to the first layer is 6^{9}, and the corresponding number of designs is 15 × 6^{9} or ∼151 × 10^{6}.
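The quoted speed-up and subspace size follow directly from the numbers above, as this short arithmetic check shows (variable names are illustrative):

```python
# Throughput of each approach in designs per hour.
solver_rate = 27_000 / 32   # laser transition code on the virtual machine
ml_rate = 15 * 6**10 / 8    # ML prediction of the full design space

speedup = ml_rate / solver_rate
subspace_designs = 15 * 6**9  # first-layer change fixed, 15 fields

print(round(speedup, -3), subspace_designs)  # 134000.0 151165440
```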

The [+2] Å design subspace is visualized in Fig. 3(a), where every subspace structure is plotted in the x–y plane and every electric field iteration is plotted along the z-axis. The ML predicted FoM is visualized as the color of this 3D scatterplot. The structures are arranged sequentially in the x–y plane. Figure 3(b) shows the same 3D scatterplot but only for designs where the FoM is higher than 130 eV ps Å^{2}. Applying a FoM threshold filter allows us to see where the high performing structures are. Of the over 150 × 10^{6} designs plotted in Fig. 3(a), 20 have a FoM greater than 130 eV ps Å^{2} and are shown in Fig. 3(b). The largest FoM designs are in the 90 and 100 kV/cm electric field range.
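The subspace enumeration and threshold filter can be sketched as below. The predictor is a hypothetical toy function standing in for the trained MLP and has no physical meaning; only a small slice of the 6^{9} structures is scanned to keep the example fast.

```python
from itertools import islice, product

OFFSETS = (-2, -1, 0, 1, 2, 3)  # allowed layer changes in Å


def subspace_structures(first_layer_offset):
    """Yield, in sequential order, all 6**9 offset tuples whose first
    entry is fixed."""
    for rest in product(OFFSETS, repeat=9):
        yield (first_layer_offset,) + rest


def high_fom_designs(structures, fields, predict_fom, threshold):
    """Return (offsets, field, FoM) for every design above the threshold."""
    hits = []
    for offsets in structures:
        for field in fields:
            fom = predict_fom(offsets, field)
            if fom > threshold:
                hits.append((offsets, field, fom))
    return hits


def toy_predictor(offsets, field_kv_cm):
    """Hypothetical stand-in for the trained MLP; NOT a physical model."""
    return 100.0 + 2.0 * sum(offsets) - 0.5 * abs(field_kv_cm - 100)


# Scan the first 1000 structures of the [+2] Å subspace at all 15 fields.
sample = islice(subspace_structures(2), 1000)
hits = high_fom_designs(sample, range(10, 151, 10), toy_predictor, threshold=95.0)
print(len(hits))
```

Because the structures are generated lazily and in a fixed order, the position of a structure in the sequence can double as its plotting index.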

The highest FoM design is found at an applied electric field of 100 kV/cm. Figure 4(a) shows the 2D scatterplot of all 6^{9} designs of the [+2] Å subspace at 100 kV/cm with the color representing the FoM. A FoM threshold of 129 eV ps Å^{2} is applied, and the 51 structures with FoM higher than this threshold are plotted as red dots on the 2D scatterplot. Figure 4(b) shows the band structure for this highest FoM design, S1, with the laser transition highlighted in red, ErwinJr2 FoM of 141.2 eV ps Å^{2}, and an error of −7.4% between ML and ErwinJr2 FoM.

### C. High FoM structures identified using ML

Table I lists the top two FoM structures for each of the six subspaces. The thickness added to each layer, the electric field, the predicted ML FoM, the actual ErwinJr2 FoM (EJ2), and the error between the FoMs are listed. Several layer thickness columns have identical values across all high FoM subspace designs, indicating which layers are essential to best optimize the original 10-layer structure. These are layer 2 (L02) and layer 9 (L09) with +3 Å for all designs, as well as layer 5 (L05) and layer 10 (L10) with −2 Å for every design. The two highest FoM values for predicted ML designs appear in the [+3] Å subspace, and the highest actual ErwinJr2 FoM values are in the [+2] Å subspace.

| L01 (Å) | L02 (Å) | L03 (Å) | L04 (Å) | L05 (Å) | L06 (Å) | L07 (Å) | L08 (Å) | L09 (Å) | L10 (Å) | E-field (kV/cm) | FoM_{ML} (eV ps Å^{2}) | FoM_{EJ2} (eV ps Å^{2}) | Error (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| −2 | 3 | −2 | −1 | −2 | −2 | −2 | 3 | 3 | −2 | 110 | 117.6 | 123 | −4.4 |
| −2 | 3 | −2 | 1 | −2 | −2 | −2 | 2 | 3 | −2 | 110 | 117.0 | 120.6 | −3.0 |
| −1 | 3 | −2 | 1 | −2 | −1 | 0 | 3 | 3 | −2 | 100 | 120.4 | 126.1 | −4.5 |
| −1 | 3 | −2 | 2 | −2 | −2 | 0 | 3 | 3 | −2 | 100 | 120.5 | 124.5 | −3.2 |
| 0 | 3 | −2 | 1 | −2 | −2 | 1 | 3 | 3 | −2 | 100 | 124.8 | 130.4 | −4.3 |
| 0 | 3 | −2 | 1 | −2 | −1 | 0 | 3 | 3 | −2 | 100 | 124.7 | 129.8 | −4.0 |
| 1 | 3 | −2 | −1 | −2 | 0 | −1 | 3 | 3 | −2 | 100 | 127.8 | 135.4 | −5.6 |
| 1 | 3 | −2 | −2 | −2 | 0 | 0 | 3 | 3 | −2 | 100 | 127.7 | 134.1 | −4.7 |
| **2** | **3** | **−2** | **−2** | **−2** | **1** | **−2** | **3** | **3** | **−2** | **100** | **130.7** | **141.2** | **−7.4** |
| 2 | 3 | −2 | −2 | −2 | 0 | −1 | 3 | 3 | −2 | 100 | 130.8 | 140.2 | −6.7 |
| 3 | 3 | −1 | −2 | −2 | 3 | −1 | 3 | 3 | −2 | 90 | 133.3 | 139.4 | −4.4 |
| 3 | 3 | 0 | −2 | −2 | 3 | −2 | 3 | 3 | −2 | 90 | 133.7 | 138.3 | −3.3 |


Design S1 from Fig. 4(b) is in the [+2] Å subspace and has the highest actual ErwinJr2 FoM of 141.2 eV ps Å^{2} with a wavelength of 10.21 *µ*m (121.4 meV) at 100 kV/cm. Design S1 is underlined and boldface in Table I. The original 10-layer QCL design has a FoM of 94.7 eV ps Å^{2} at 90 kV/cm, while the QCL design S1, identified by ML, increases the FoM 1.5-fold, a significant improvement for QCLs.
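The observations drawn from Table I can be reproduced with a short script. The rows below are copied from Table I and the starting thicknesses from Sec. II A; the variable names are illustrative.

```python
LAYERS = ["L%02d" % i for i in range(1, 11)]
START_ANGSTROM = [9, 57, 11, 54, 12, 45, 25, 34, 14, 33]
ORIGINAL_FOM = 94.7  # eV ps Å^2 at 90 kV/cm

# (layer changes L01..L10 in Å, field in kV/cm, FoM_ML, FoM_EJ2) from Table I
TABLE = [
    ((-2, 3, -2, -1, -2, -2, -2, 3, 3, -2), 110, 117.6, 123.0),
    ((-2, 3, -2, 1, -2, -2, -2, 2, 3, -2), 110, 117.0, 120.6),
    ((-1, 3, -2, 1, -2, -1, 0, 3, 3, -2), 100, 120.4, 126.1),
    ((-1, 3, -2, 2, -2, -2, 0, 3, 3, -2), 100, 120.5, 124.5),
    ((0, 3, -2, 1, -2, -2, 1, 3, 3, -2), 100, 124.8, 130.4),
    ((0, 3, -2, 1, -2, -1, 0, 3, 3, -2), 100, 124.7, 129.8),
    ((1, 3, -2, -1, -2, 0, -1, 3, 3, -2), 100, 127.8, 135.4),
    ((1, 3, -2, -2, -2, 0, 0, 3, 3, -2), 100, 127.7, 134.1),
    ((2, 3, -2, -2, -2, 1, -2, 3, 3, -2), 100, 130.7, 141.2),
    ((2, 3, -2, -2, -2, 0, -1, 3, 3, -2), 100, 130.8, 140.2),
    ((3, 3, -1, -2, -2, 3, -1, 3, 3, -2), 90, 133.3, 139.4),
    ((3, 3, 0, -2, -2, 3, -2, 3, 3, -2), 90, 133.7, 138.3),
]

# Layers whose change is identical in every listed design.
constant_layers = [name for i, name in enumerate(LAYERS)
                   if len({row[0][i] for row in TABLE}) == 1]

# Design with the highest ErwinJr2 FoM (S1) and its resulting thicknesses.
best = max(TABLE, key=lambda row: row[3])
s1_thicknesses = [t + d for t, d in zip(START_ANGSTROM, best[0])]
improvement = best[3] / ORIGINAL_FOM

print(constant_layers)          # ['L02', 'L05', 'L09', 'L10']
print(s1_thicknesses, best[1])  # S1 layer sequence in Å and field in kV/cm
print(round(improvement, 1))    # 1.5
```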

The second highest ErwinJr2 FoM design in the [+2] Å subspace (denoted S2) has the same operating field of 100 kV/cm, with a predicted ML FoM of 130.8 eV ps Å^{2} and an actual ErwinJr2 FoM of 140.2 eV ps Å^{2}, as seen in Table I. The ErwinJr2 FoM values for designs S1 and S2 differ by only 1 eV ps Å^{2}, demonstrating the numerical sensitivity of the 1D Schrödinger approach to layer thicknesses and boundary conditions. Similarly, the predicted ML FoM values for these designs differ by only 0.1 eV ps Å^{2}, showcasing the accuracy of the algorithm for the identification of high FoM designs. Structures S1 and S2 differ by only 1 Å in each of layers 6 (L06) and 7 (L07). The band structures of these QCL designs are very similar as well, indicating that these two layers, by themselves, do not play a major role in maximizing the FoM for the layer thickness tolerance range of [−2, +3] Å.

## IV. CONCLUSION

A ML framework has been developed for optimizing the FoM for a 10-layer QCL design. A design space of 6^{10} structures with 15 electric field iterations, i.e., ∼907 × 10^{6} unique designs, is created. The FoM for 27 000 designs from this design space forms our initial dataset and is collected in 32 h on a virtual machine. The dataset is split into training, validation, and testing subsets to train a MLP neural network, which is afterward used to predict the entire design space of ∼907 × 10^{6} designs in 8 h using a personal laptop computer. The large design space is visualized by splitting the predicted FoM values into six subspaces, in each of which the first layer thickness is constant. By comparing the layer thicknesses of high FoM structures in the subspaces with each other, ML (i) finds designs with high FoM and (ii) identifies which layers are most important when maximizing the FoM for a starting design.

This ML framework is used to identify a new QCL design, S1, in the [+2] Å subspace, with a FoM 1.5 times larger than the original design. The design operates around the same electric field without adding or removing layers, changing the layer thicknesses by only −2 to +3 Å, and is identified faster than using the laser transition code or a 1D Schrödinger solver and human intuition. In the future, the framework will be used to expand the layer thickness design space and to identify new QCL design strategies based on other predicted parameters such as energy difference (emission frequency), dipole matrix element, gain, or scattering times. We expect that the combination with ML will help speed up or enable other QCL theoretical analysis or design tools^{12} as well as the most current research frontiers in QCLs.^{42–45}

## ACKNOWLEDGMENTS

The authors acknowledge the financial support from the Schmidt DataX Fund at Princeton University made possible through a major gift from the Schmidt Futures Foundation, the National Science Foundation, under Grant No. DGE-2039656, and by a grant from the Fund for Energy Research with Corporate Partners administered by the Andlinger Center for Energy and the Environment at Princeton University. This work was partially funded by the Center for Statistics and Machine Learning at Princeton University through the support of Microsoft. We thank Dr. Ming Lyu for the current version of ErwinJr2^{38} used in this paper.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Andres Correa Hernandez**: Conceptualization (equal); Data curation (lead); Formal analysis (lead); Investigation (lead); Methodology (equal); Software (lead); Validation (equal); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). **Claire F. Gmachl**: Conceptualization (equal); Formal analysis (supporting); Funding acquisition (lead); Investigation (supporting); Methodology (equal); Supervision (lead); Validation (equal); Visualization (supporting); Writing – original draft (supporting); Writing – review & editing (equal).

## DATA AVAILABILITY

The laser transition code is openly available in the Princeton Data Commons repository at http://doi.org/10.34770/z3r2-hg07.^{36}

The ErwinJr2 software is openly available to download on the GitHub repository at https://github.com/ErwinJr2/ErwinJr2.^{38}

The 10-layer [−2, +3] Å dataset is openly available in the Princeton Data Commons repository at http://doi.org/10.34770/r7nr-ee50.^{39}

The code to train the MLP neural network, as well as the algorithm used to predict the 900 × 10^{6} structures, is available in the Princeton Data Commons repository at http://doi.org/10.34770/e034-4670.^{41}

## REFERENCES
