Physical models can help improve solar cell efficiency during the design phase and for quality control after the fabrication process. We present a data-driven approach to inverse modeling that can predict the underlying parameters of a finite element method solar cell model based on an electroluminescence (EL) image of a solar cell with known cell geometry and laser scribed defects. For training the inverse model, 75 000 synthetic EL images were generated with randomized parameters of the physical cell model. We combine 17 deep convolutional neural networks based on a modified VGG19 architecture into a deep ensemble to add uncertainty estimates. Using the silicon solar cell model, we show that such a novel approach to data-driven statistical inverse modeling can help apply recent developments in deep learning to new engineering applications that require real-time parameterizations of physical models augmented by confidence intervals. The trained network was tested on four different physical solar cell samples, and the estimated parameters were used to create the corresponding model representations. Resimulations of the measurements yielded relative deviations of the calculated and the measured junction voltage values of 0.2% on average with a maximum of 10%, demonstrating the validity of the approach.
I. INTRODUCTION
The transition of the world energy system to renewable, fossil-free sources relies heavily on photovoltaic (PV) electricity.1 Due to the high energy consumption in the manufacturing process, the energy payback time of monocrystalline solar modules is in the range of 10% of the total lifetime.2 Therefore, further improvements in cell design and cell production are desirable.
One possible way to shorten the energy payback time of solar cells is to reduce power conversion losses due to electrical resistance in the solar cell and due to cell defects.3,4 To investigate these losses in detail, various measurement techniques, such as electroluminescence (EL), photoluminescence (PL), and infrared (IR) imaging, have been established both during the design process in the laboratory and for quality assurance integrated in the production line.5,6
By combining the imaging techniques with physical modeling, it is possible to gain detailed insight into the impact of specific design decisions and defects on the expected module performance. To do so, the defining parameters of the cell have to be extracted from measurements. It has been shown that the characteristics of solar cells can be modeled with Finite Element Method (FEM) models that solve an equivalent circuit representation that is defined by spatially distributed resistances, such as the sheet resistance of the electrodes, contact resistivity, and parallel shunt resistivity, and spatially distributed diode model parameters.7
The present work is part of a larger effort to use model calculations of solar cells to characterize defects, quantify their effects on the solar cell, and predict the impact of defects and design decisions during upscaling to larger cell areas and modules.8 Standard silicon cells are used in this work because of better reproducibility and ease of use in measurements. However, the method is intended to be transferred to novel cell technologies, such as perovskite cells, where the origin of the defects is less well understood and major challenges exist regarding inhomogeneity, upscaling, and stability.9,10
Parameterization of the numerical model of a physical system can be challenging if its parameters are not directly measurable. Finding an appropriate set of model parameters that will enable the model to reproduce important features of the system must often be done indirectly, by finding a set of parameters that lead to model results that agree with the available measurements. Solving such an inverse problem is often a non-trivial, ill-posed task. In research and laboratory work, model parameters are often manually fitted to match model predictions with measurements. This process can be automated by minimizing the model error with least squares optimization techniques.11 If statistical analysis is required, the problem can be reformulated in a Bayesian framework and Markov Chain Monte Carlo (MCMC) algorithms can be used to sample from the posterior distribution of the input parameters.12
For many foreseeable applications of physical models of solar cells, such as quality control in a production line, these traditional approaches are insufficient due to their high computation time. This is particularly true in the use case presented here where two- or three-dimensional models are needed to describe features of the measured solar cell images. Therefore, an explicit form of an inverse model that can replace these steps by directly computing the model parameters based on measured data would be beneficial.13 Deep neural networks (DNNs) are promising candidates to serve as such inverse models, as they are able to adapt arbitrary nonlinear mappings when trained on a sufficiently large training dataset.14–17
The use of a neural network as an inverse model has successfully been demonstrated by training an autoencoder surrogate model to recover the material parameters from simulated current–voltage (IV) curves.18 A hybrid approach in which part of the model parameters are predicted by a convolutional neural network (CNN) while the others are estimated with traditional optimization techniques has shown promising results for determining layer properties based on reflectance spectroscopy.19 In addition, it has been demonstrated that CNNs can be used to predict the cell efficiency or the location of defects based on electroluminescence images.20–22 By combining EL, PL, and reflectance spectroscopy, CNNs have been used to predict the full IV curve of a cell together with other key characteristics of the cell.23,24 In a further work, a spatially resolved approach based on a U-net has been used to determine the local dark saturation current.25
In this work, we propose an approach that uses a CNN as an inverse model to compute cell parameters from EL measurements of silicon solar cells. The CNN is trained on a set of synthetic EL images simulated with the numerical model that it aims to invert.26 The simulated training set has a predefined cell geometry. Shunt defects with high parallel conductivity are added to the cell geometry by placing standardized rectangular subdomains. The numerical model is a 2D+1D FEM model that calculates the 2D potential distribution of the cell’s top and bottom electrodes based on domain specific electrode properties and coupling laws. The model is implemented in the simulation software Laoss 4.0, which is distributed by Fluxim AG.8,27 In order to provide an uncertainty estimate of the calculated model parameters, we use a deep ensemble CNN model that is able to predict a Gaussian approximation of the model parameter probability distribution.28 The CNN model can then be applied to data acquired in an EL measurement setup to estimate the underlying physical model parameters of the solar cells. If successfully applied, this method enables the development and training of a CNN that can almost instantly create a model representation of the presented measurement sample.
The proposed method demonstrates a novel implementation of recent developments in machine learning that could extend existing engineering applications of deep learning for industrial practice.29,30 Bridging the gap between data-driven and physical models raises new challenges that are less frequently discussed in the deep learning literature: First, CNN network architectures are designed and implemented for classification tasks in the majority of cases. Dealing with physical parameters requires the use of multivariate regression models, which increases the complexity of the training. Second, in science and engineering, there is often a need to provide an uncertainty measure to evaluate the confidence in a result. Common deep neural network architectures are designed to provide point predictions. The use of deep learning in the context of physical models, therefore, makes it necessary to test and exploit the potential of recent developments in network architectures that provide uncertainty estimates.31 Our work provides a comprehensive example of an engineering application of a regression CNN that incorporates such an uncertainty estimation. The methodology could be applied to several other regression tasks in the surging area of physics-based deep learning where either sufficient training data or a detailed numerical model is available.32
The contributions of this work can be summarized as follows: We train deep ensemble CNNs to estimate physical parameters based on PV cell imaging data. The training data are generated by a physical simulation model whose inverse is to be approximated by the CNN. To our knowledge, using deep ensemble CNNs for inverse modeling in this way is a novel combination of concepts that are each already successful in science and engineering. It is shown that the presented method can be used to estimate physical model parameters without relying on specific symmetries in the layout of the grid lines and busbars that form the front contact of the PV cell. Therefore, our approach can be applied more generally to different PV cell types. We show that the extracted parameter sets can be used to parameterize the model and reproduce the measured images with high precision. We discuss the changes in the CNN structure and training hyperparameters required to implement and train such a deep ensemble CNN inverse model. In doing so, we provide further evidence that standard implementations of deep neural networks can be adapted for use in a scientific setting with minor modifications.
A schematic of the proposed workflow is shown in Fig. 1. The steps of the developed method are described in this paper as follows: The measurement samples are described in Sec. II A. The geometry of this cell sample was extracted from an EL measurement and used as the basis for the inverse model. The Laoss simulation model is explained in Sec. II B. Section II C shows how Laoss is used to model the measured solar cell samples. Additional post-processing steps applied to the Laoss simulation results are explained in Sec. II D. The structure of the inverse model is defined in Sec. II E. The inverse model is trained on a set of simulated EL images. The Laoss parameterization used to generate the training data is discussed in Sec. II F. The network architecture and training of the inverse CNN model are explained in Sec. II G. The results of the inverse model method are evaluated in Sec. III. A discussion and a brief outlook are given in Secs. IV and V.
II. MATERIALS AND METHODS
A. Measurement sample
The proposed method was carried out and tested for a monocrystalline solar cell of type XS156B3-200R from Motech Industries.33 Due to electrical current strength limitations in the available measurement equipment, a laser cutter was used to cut a 2 × 2 cm2 area from the wafer cell with a busbar at the top. The smaller area of the cell sample also reduced the computation time of the FEM model. An additional laser cut was used to introduce an artificial shunt between two grid lines. Two contact strips were then soldered to the busbar and rear solder pads of the sample cell.
The electroluminescence signal of the sample cell was recorded with a Nikon D800 digital SLR camera. The calibration constant for relating the EL signal with the junction voltage was determined with a low forward voltage of 0.55 V assuming a negligible current and, therefore, a constant voltage across the whole image.34 The high-voltage EL image was acquired with a forward voltage in the range of 0.60–0.64 V. To increase the signal-to-noise ratio of the measurement, the camera exposure time was set to 450 s at an ISO level of 200. The measurement image was exported in NEF (RAW) format and converted to grayscale. The resulting measurement image is shown in Fig. 2(a).
Fig. 2. (a) EL measurement of a sample cell. (b) Cell geometry of the sample cell used for simulation (white: active area, black: grid lines, red: shunt). (c) Unprocessed simulated EL image. (d) Post-processed simulated EL image.
The measured image shows dark areas at the edges of the cell that are not present in the simulation. It is assumed that these edge effects are caused by defects in the junction resulting from the laser cutting process. Since edge effects were not included in the simulation model, this discrepancy was removed by cropping a region in the center of the image that is not affected by the edge effects and using it as the input to the inverse model [see Fig. 2(a)]. The calibration constant, however, was calculated from the averaged luminance signal of the low-voltage image under the assumption of a constant junction voltage. The edge effects decrease this average low-voltage signal and, therefore, lead to an overestimation of the junction voltage in the area used as the model input.
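The calibration step and the bias introduced by dark edges can be illustrated with a minimal sketch. It assumes the commonly used exponential relation between the EL signal and the junction voltage, φ = C·exp(V/VT); this relation, the function names, and the synthetic arrays are illustrative assumptions and not taken from the measurement pipeline itself.

```python
import numpy as np

V_T = 2.38e-2  # thermal voltage in V (value quoted in Sec. II C)

def calibration_constant(el_low, v_low, v_t=V_T):
    """Estimate C from a low-voltage image, assuming V = v_low everywhere."""
    # The luminance signal is averaged over the whole image, as in the text.
    return float(np.mean(el_low)) / np.exp(v_low / v_t)

def junction_voltage(el_high, c, v_t=V_T):
    """Convert an EL image to a voltage map via V = V_T * ln(phi / C)."""
    el = np.clip(el_high, 1e-12, None)  # guard against log(0) in dark pixels
    return v_t * np.log(el / c)

# Synthetic check: build both images from a known constant and voltage map.
rng = np.random.default_rng(0)
c_true = 3.0e-7                                   # hypothetical constant
v_map = rng.uniform(0.60, 0.64, size=(40, 80))    # hypothetical voltage map
el_low = np.full((40, 80), c_true * np.exp(0.55 / V_T))
el_high = c_true * np.exp(v_map / V_T)

c_est = calibration_constant(el_low, 0.55)
v_est = junction_voltage(el_high, c_est)

# Dark edge columns in the low-voltage image lower the average signal,
# which lowers C and therefore raises every reconstructed voltage.
el_low_dark = el_low.copy()
el_low_dark[:, :5] *= 0.5
c_dark = calibration_constant(el_low_dark, 0.55)
v_over = junction_voltage(el_high, c_dark)
```

In this toy setting, `v_over` exceeds `v_map` at every pixel, which mirrors the overestimation described above.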
B. FEM model
The additional parameters used in Eq. (2) are the dark saturation current j0 and the thermal voltage VT. A schematic representation of the model used in Laoss as well as the input parameters required and the output parameters provided is shown in Fig. 3.
Alternatively, the algebraic diode model can also be replaced by a numerical charge drift-diffusion model considering the entire vertical cell structure (Fluxim’s drift-diffusion simulation software Setfos integrated with the 2D+1D model in Laoss). The advantage of the algebraic diode model, however, is the small number of free model parameters to define the coupling law.
C. Model representation of the measurement sample
The bias voltage applied to the solar cell during an EL experiment can be represented in Laoss as a fixed potential boundary condition. In the experiment, we applied the forward voltage at the busbar, which is visible as a dark area at the top edge of the measurement image and highlighted by an orange line in Fig. 2(a). The corresponding position of the fixed potential boundary condition in the model representation with value Vappl is shown analogously in Figs. 2(b)–2(d). The busbar was not included in the simulation geometry assuming that the fixed potential boundary condition represents the intersection between the busbar and the grid lines as well as the active area on the top electrode. 75 000 parameter/image pairs have been generated to build the training data of the inverse model. For each parameter set, two images were generated at different bias voltages to account for the calibration procedure used in the measurement. The simulated images were generated at random bias voltages. The lower voltage was sampled from a uniform distribution in the range [0.54, 0.55] V, and the higher bias voltage was sampled in the range [0.60, 0.64] V.
A coupling law with a non-zero value for the internal resistivity ρint is not yet implemented in Laoss 4. Therefore, Eq. (2) was solved externally with the hybrid algorithm of MINPACK and passed to Laoss as discrete values in a text file. The other values of the diode law were ideality factor nid = 1, thermal voltage VT = 2.38 × 10^−2 V, and internal resistivity ρint = 2.88 × 10^−4 Ω m^2. These values remained unchanged during the simulation.
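Solving an implicit diode law externally and tabulating it could look like the following sketch. SciPy's `fsolve` wraps MINPACK's hybrid routines. The implicit form used here (a diode with a series internal resistivity and a parallel path) is an assumption made for illustration and is not necessarily identical to Eq. (2); the function names and the voltage grid are also hypothetical.

```python
import io
import numpy as np
from scipy.optimize import fsolve  # wrapper around MINPACK's hybrd/hybrj

N_ID = 1.0         # ideality factor (Sec. II C)
V_T = 2.38e-2      # thermal voltage in V (Sec. II C)
RHO_INT = 2.88e-4  # internal resistivity in Ohm m^2 (Sec. II C)

def residual(j, v, j0, rho_par):
    """Residual of the ASSUMED implicit diode law; zero at the solution."""
    v_diode = v - RHO_INT * j  # voltage remaining after the internal drop
    return j - j0 * (np.exp(v_diode / (N_ID * V_T)) - 1.0) - v_diode / rho_par

def tabulate_coupling_law(voltages, j0, rho_par):
    """Solve for the current density at each voltage; returns (V, j) rows."""
    rows, guess = [], 0.0
    for v in voltages:
        j = fsolve(residual, guess, args=(v, j0, rho_par))[0]
        rows.append((v, j))
        guess = j  # warm start for the next, slightly higher voltage
    return np.array(rows)

# Hypothetical voltage grid; j0 and 1/rho_par = 50 S/m^2 lie in the Table I ranges.
table = tabulate_coupling_law(np.linspace(0.3, 0.7, 41), j0=1e-9, rho_par=1 / 50)

# Write the discrete values as text (an in-memory buffer stands in for the file).
buffer = io.StringIO()
np.savetxt(buffer, table, header="V (V)  j (A/m^2)")
```

The warm start keeps each solve close to the previous solution, which helps with the steep exponential at higher voltages.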
An unprocessed image of SEL(x, y) generated with parameters set manually to reproduce the measured sample is shown in Fig. 2(c). The silver grid is assumed to completely block fluorescent light and, therefore, appears as sharp dark lines.
D. Simulation post-processing
The simulated image and the camera image of the measurement sample were brought into closer agreement in an additional post-processing step. EL images exhibit inherent blurring due to lateral carrier diffusion in the emitting silicon cell,39 photon scattering in the silicon CCD caused by absorption depths that are longer than the pixel size,40,41 and metal finger scattering.42 Therefore, the simulated image was convolved with a Gaussian kernel of size 5 pixels to match the blur level of the EL measurement. In a second step, 148 patches of dark areas were cropped from the camera image. These were then scaled to the range of the pixel values of the simulated image, randomly rotated and flipped, and overlaid with the simulated image to imitate the precise camera noise of the setup that consists of a combination of Gaussian noise, shot noise, and salt-and-pepper noise. The post-processed version of the simulation image example is shown in Fig. 2(d). A comparison with the unprocessed image in Fig. 2(c) shows the effects of blurring and noise, resulting in features similar to the measured image in Fig. 2(a).
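The two post-processing steps can be sketched with NumPy as follows. Only the kernel size of 5 pixels and the number of patches come from the text; the kernel width `sigma`, the additive overlay, the `noise_level` scaling, and the random stand-in patches are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel_1d(size=5, sigma=1.0):
    """Normalized 1D Gaussian kernel; sigma is an assumed value."""
    x = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(image, size=5, sigma=1.0):
    """Separable Gaussian blur with edge padding; output keeps the input shape."""
    k = gaussian_kernel_1d(size, sigma)
    padded = np.pad(image, size // 2, mode="edge")
    # Convolve along rows, then along columns; 'valid' removes the padding.
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def add_camera_noise(image, patches, rng, noise_level=0.05):
    """Overlay one randomly rotated and flipped dark patch (assumed additive)."""
    patch = patches[rng.integers(len(patches))]
    patch = np.rot90(patch, k=int(rng.integers(4)))
    if rng.random() < 0.5:
        patch = np.fliplr(patch)
    h, w = image.shape
    patch = patch[:h, :w]  # square patches are assumed to cover the image
    patch = patch - patch.mean()
    span = np.ptp(patch)
    if span > 0:
        patch = patch / span
    # Scale the noise relative to the pixel value range of the simulated image.
    return image + noise_level * np.ptp(image) * patch

rng = np.random.default_rng(1)
simulated = np.tile(np.linspace(0.60, 0.64, 80), (40, 1))
simulated[:, ::10] = 0.0                   # crude stand-in for dark grid lines
patches = rng.normal(size=(148, 80, 80))   # stand-in for the 148 dark crops
processed = add_camera_noise(blur(simulated), patches, rng)
```

Using real dark-frame crops instead of a parametric noise model keeps the mixture of noise types found in the actual setup without having to fit each component.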
E. CNN inverse model
F. Training and validation data generation
The simulations for generating the training data were performed using Laoss 4.38 The geometry used has three subdomains (active area, grid, and shunt) and is shown in Fig. 2(b). The input geometry was manually adjusted to match the dimensions and gridline structure of the sample cell. In the final workflow, this step could be replaced by an algorithm that assists in extracting the cell geometry from the EL image using edge detection and morphological operations. A random number of up to four shunts was placed on each artificial image. The position of each shunt was randomly chosen but was constrained by the following conditions:
The shunts cannot intersect with a grid line.
The shunts cannot intersect each other.
All shunts have the same dimensions and orientation (height = 0.01 mm; width = 1 mm). This corresponds to the assumed shape of the laser shunt scribed into the measurement sample.
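The three conditions above can be enforced with simple rejection sampling, as in the following sketch. The rectangle representation, the grid line layout, and the choice to allow zero shunts are assumptions for illustration; only the shunt dimensions and the maximum of four shunts come from the text.

```python
import numpy as np

SHUNT_W, SHUNT_H = 1.0, 0.01  # shunt width and height in mm (from the text)

def overlaps(a, b):
    """True if two axis-aligned rectangles (x, y, w, h) intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def place_shunts(grid_lines, cell_size, rng, max_shunts=4, max_tries=1000):
    """Rejection-sample up to max_shunts shunts that satisfy all conditions."""
    n_target = int(rng.integers(0, max_shunts + 1))  # zero allowed (assumption)
    shunts = []
    for _ in range(max_tries):
        if len(shunts) == n_target:
            break
        x = rng.uniform(0.0, cell_size - SHUNT_W)
        y = rng.uniform(0.0, cell_size - SHUNT_H)
        candidate = (x, y, SHUNT_W, SHUNT_H)  # same size and orientation for all
        # Reject candidates that touch a grid line or an existing shunt.
        if any(overlaps(candidate, r) for r in grid_lines + shunts):
            continue
        shunts.append(candidate)
    return shunts

# Hypothetical layout: vertical grid lines, 0.1 mm wide, every 2 mm on a 20 mm cell.
grid = [(float(x), 0.0, 0.1, 20.0) for x in np.arange(1.0, 20.0, 2.0)]
shunts = place_shunts(grid, 20.0, np.random.default_rng(2))
```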
A total of 75 000 images were simulated and used for training the inverse CNN model. In each of the three subdomains, four free parameters are chosen to be predicted later by the CNN model and, therefore, varied during the simulation of the training images. An overview of the parameters is given in Table I. Some of the parameters are sampled from a probability distribution. The others, such as the parallel resistivity ρpar of the active area, are kept constant at a physically meaningful value. The values of ρpar were independently sampled for shunts on the same simulated cell, resulting in shunts with different intensities that the neural network should learn to distinguish.
Table I. Model parameters used for the simulation of the training data.
Subdomain | Vappl (V) | j0 (A/m^2) | 1/ρpar (S/m^2) | R□ (Ω/□) |
---|---|---|---|---|
Active area | Uniform [0.6, 0.7] | Log-uniform [1 × 10^−10, 1 × 10^−8] | Constant (50) | Uniform [10, 120] |
Grid line | Uniform [0.6, 0.7] | Log-uniform [1 × 10^−10, 1 × 10^−8] | Constant (50) | Log-uniform [1 × 10^−4, 1 × 10^−2] |
Shunt | Not defined | Log-uniform [1 × 10^−10, 1 × 10^−8] | Log-uniform [1 × 10^3, 2 × 10^6] | Constant (10) |
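Drawing one parameter set from the distributions in Table I can be sketched as follows. The dictionary layout and the key names are hypothetical. Each entry is drawn independently here for simplicity, whereas in an actual simulation the applied voltage would presumably be shared by all subdomains of one cell.

```python
import numpy as np

def uniform(lo, hi):
    """Sampler for a uniform distribution on [lo, hi]."""
    return lambda rng: rng.uniform(lo, hi)

def log_uniform(lo, hi):
    """Sampler that is uniform in log10 space, so each decade is equally likely."""
    return lambda rng: 10.0 ** rng.uniform(np.log10(lo), np.log10(hi))

def constant(value):
    """Sampler that always returns the same value."""
    return lambda rng: value

# Keys: v_appl (V), j0 (A/m^2), g_par = 1/rho_par (S/m^2), r_sheet (Ohm/sq).
TABLE_I = {
    "active_area": {"v_appl": uniform(0.6, 0.7), "j0": log_uniform(1e-10, 1e-8),
                    "g_par": constant(50.0), "r_sheet": uniform(10.0, 120.0)},
    "grid_line": {"v_appl": uniform(0.6, 0.7), "j0": log_uniform(1e-10, 1e-8),
                  "g_par": constant(50.0), "r_sheet": log_uniform(1e-4, 1e-2)},
    # The applied voltage is not defined for shunt subdomains.
    "shunt": {"j0": log_uniform(1e-10, 1e-8), "g_par": log_uniform(1e3, 2e6),
              "r_sheet": constant(10.0)},
}

def sample_parameters(subdomain, rng):
    """Draw one value per parameter for the given subdomain."""
    return {name: draw(rng) for name, draw in TABLE_I[subdomain].items()}

rng = np.random.default_rng(3)
shunt_draws = [sample_parameters("shunt", rng) for _ in range(1000)]
active_draws = [sample_parameters("active_area", rng) for _ in range(1000)]
```

Sampling in log space matters for parameters that span several orders of magnitude: a plain uniform draw over the shunt conductance range would place almost all samples in the top decade.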
G. Network setup and training
The network was implemented using the Keras/TensorFlow framework. The network architecture was adapted from the VGG19 implementation of Keras.43 All modifications follow the recommendations given for transferring CNN networks from classification to regression problems.44 In all hidden layers, ReLU activation functions are used. Dropout layers have been included after the two dense layers at the top of the network, and a batch normalization layer was included in front of the last dropout layer. The top layer was removed and replaced by a dense layer consisting of eight output neurons with a linear activation function to obtain a regression network for the four target parameters’ (μ, σ2) pairs.
The input image presented to the CNN was built from the cropped, post-processed simulation images. In order to reduce the memory consumption during training, the images were downsampled to a resolution of 80 × 40 pixels. Since every subdomain of a simulated EL image has its own distinct model parameters, we designed the CNN such that it predicts the values for one subdomain at a time. The first channel of the input contains the complete voltage image. The second channel encodes the mask in which the pixels defining the subregion for which the parameters should be predicted are set to 1 and the pixels of areas that should be ignored are set to 0. Since a standard implementation of VGG19 was used, a third channel was present but remained empty in all images to keep the original architecture and dependent hyperparameter ranges intact. Figure 4 shows the structure of the input and the output data for the example image of Fig. 2(b), which contains a single shunt. The approach results in three different two-channel images as inputs to the CNN model with identical values in the first voltage image channel and different masks in the second channel. The parameters of the subdomains can then be collected and used to build a complete simulation model. The CNN model provides all four parameters regardless of the subdomain defined in the mask. For shunt subdomains, Vappl has no direct physical definition since the fixed voltage boundary condition is only applied to the upper edge of the grid lines and the active area. Therefore, the parameter can be omitted when constructing the simulation model based on the predicted parameters and is put in brackets in Fig. 4. The training set consisted of 94% of all available images, while the validation set contained the remaining 6%. The target values were scaled to a feature range of 0.01–0.99 before training. A batch size of 512 and the Adam solver were used.
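The input encoding and target scaling described above can be sketched with NumPy. The array orientation (40 rows by 80 columns), the integer label map, and the function names are assumptions for illustration; a linear min-max scaling is assumed because the text does not state whether any parameter was scaled logarithmically.

```python
import numpy as np

def build_input(voltage_image, subdomain_mask):
    """Stack the voltage image, a binary mask, and an empty third channel."""
    x = np.zeros(voltage_image.shape + (3,), dtype=np.float32)
    x[..., 0] = voltage_image                            # full voltage image
    x[..., 1] = (subdomain_mask > 0).astype(np.float32)  # 1 = predict here
    return x                                             # channel 2 stays zero

def scale_targets(y, y_min, y_max, lo=0.01, hi=0.99):
    """Map targets linearly from [y_min, y_max] to [lo, hi]."""
    return lo + (y - y_min) * (hi - lo) / (y_max - y_min)

def unscale_targets(y_scaled, y_min, y_max, lo=0.01, hi=0.99):
    """Inverse of scale_targets, used to recover physical units."""
    return y_min + (y_scaled - lo) * (y_max - y_min) / (hi - lo)

# Hypothetical label map: 0 = active area, 1 = grid line, 2 = shunt.
labels = np.zeros((40, 80), dtype=int)
labels[:, ::10] = 1
labels[20, 31:39] = 2
voltage = np.full((40, 80), 0.62, dtype=np.float32)

# One input per subdomain: same voltage channel, different mask channel.
inputs = [build_input(voltage, labels == k) for k in (0, 1, 2)]

y_min, y_max = np.array([0.6, 10.0]), np.array([0.7, 120.0])
y = np.array([0.65, 65.0])
y_scaled = scale_targets(y, y_min, y_max)
```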
Training was performed over a total of 200 epochs with an early stopping as soon as the validation loss did not improve for 20 epochs. The negative log-likelihood loss function defined in Eq. (10) can lead to numerical instabilities during network training when intermediate predictions of the variance σ2 are zero or close to zero. To avoid this, the activation of the output neurons for the variance σ2 was set to a strictly positive ELU+1 function with alpha = 1. In addition, the values for the variance σ2 have been clipped to the range in the calculation of the loss function and gradient clipping with a value of 0.5 was used in the Adam solver.
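The stabilization measures can be illustrated with a NumPy sketch of a Gaussian negative log-likelihood. The loss form below is the standard heteroscedastic Gaussian NLL and is assumed to correspond to Eq. (10); the clipping bounds `VAR_MIN` and `VAR_MAX` are hypothetical placeholders, and the ordering of the eight outputs is an assumption.

```python
import numpy as np

VAR_MIN, VAR_MAX = 1e-6, 1e3  # HYPOTHETICAL clipping bounds, not from the source

def elu_plus_one(z, alpha=1.0):
    """ELU(z) + 1: equals z + 1 for z > 0 and exp(z) for z <= 0 when alpha = 1."""
    z = np.asarray(z, dtype=float)
    negative = alpha * (np.exp(np.minimum(z, 0.0)) - 1.0) + 1.0
    return np.where(z > 0, z + 1.0, negative)

def split_outputs(raw):
    """Split (batch, 8) raw outputs into means and positive variances."""
    mu, raw_var = raw[:, :4], raw[:, 4:]  # assumed ordering: 4 means, 4 variances
    return mu, elu_plus_one(raw_var)

def gaussian_nll(y, mu, var):
    """Mean Gaussian negative log-likelihood, up to an additive constant."""
    var = np.clip(var, VAR_MIN, VAR_MAX)  # keeps log and division well defined
    return float(np.mean(0.5 * np.log(var) + 0.5 * (y - mu) ** 2 / var))

raw = np.array([[0.5, 0.5, 0.5, 0.5, -2.0, 0.0, 1.0, 3.0]])
mu, var = split_outputs(raw)
loss = gaussian_nll(np.full((1, 4), 0.6), mu, var)
```

For a fixed residual, this loss is minimized when the predicted variance equals the squared residual, which is why the network is rewarded for reporting large variances where its mean prediction is poor.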
III. RESULTS
Four different measured samples (MO, M1, B1, and T2) were used to test the proposed approach. For all samples, the previously trained deep ensemble was used to predict a model parameterization of the FEM simulation model. The quality of this inverse model was then tested by comparing the forward simulation based on these parameters with the original measured data. In addition to the measured test data, a simulated image (MO-sim) included in the validation set during CNN training was used to analyze the extent to which the performance of the approach degrades when real instead of synthetic data are used as input.
The results of the regression output of a single CNN model from the deep ensemble are shown in Fig. 5. The x axis value represents the value used during the simulation of the EL image. The points on the y axis show the mean of the predicted Gaussian probability distribution of the parameter. The error bars show the standard deviation calculated from the predicted variance. The results show that the CNN learns to predict Vappl from the simulated EL image with very high accuracy, which is correctly represented by the corresponding low variance predictions. The predictions of the parameters ρpar and R□ show higher uncertainty values, which also correspond to larger offsets between predicted and true values. The largest uncertainties with respect to the defined parameter range are found in the predictions of j0. Interestingly, the model seems to correctly identify values with large offsets by predicting high variance values in these cases.
Fig. 5. Regression results of a single instance inverse CNN model’s predictions for the applied voltage (a), the sheet resistance (b), the dark saturation current (c), and the parallel resistivity (d), including predicted uncertainties. The plot shows the results for 100 randomly selected images for both the validation set and the training set. The value ranges of the different subdomains are highlighted in (c) and (d). The sheet resistance of the shunts and the parallel resistivity of the active area as well as the grid line were constant in all images. Therefore, all data points of the two sets are superimposed.
The trained deep ensemble consists of 17 CNNs. The ensemble results of the simulated validation image MO-sim are shown in Fig. 6. In the case of the simulated validation image, the original parameters used to simulate the image are known and can be compared to the predictions. In general, the mean predictions of each network vary significantly and so the predicted confidence intervals do not necessarily overlap. The true parameter used for the simulation of MO-sim is within the confidence interval of the predicted ensemble distribution in the case of Vappl, R□ of grid lines, and ρpar of the laser cut shunt area. The predicted probability distributions of j0 and R□ parameters of the active area deviate significantly from the actual simulation parameters, indicating that the EL image does not provide sufficient information to determine the two parameters.
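One common way to merge the Gaussian predictions of the individual networks into a single ensemble distribution is to treat them as a uniformly weighted mixture and match its first two moments. The sketch below assumes this rule; the text does not spell out the exact combination, and the function name and example numbers are hypothetical.

```python
import numpy as np

def combine_ensemble(mu, var):
    """Moment-match a uniform mixture of Gaussians.

    mu, var: arrays of shape (n_models, n_params).
    Returns the mixture mean and variance, each of shape (n_params,).
    """
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    mu_star = mu.mean(axis=0)
    # Mixture variance = average member variance + spread of the member means.
    var_star = (var + mu ** 2).mean(axis=0) - mu_star ** 2
    return mu_star, var_star

# Two hypothetical members that disagree on the first parameter only.
mu = np.array([[1.0, 2.0], [3.0, 2.0]])
var = np.array([[0.1, 0.2], [0.3, 0.2]])
mu_star, var_star = combine_ensemble(mu, var)
```

With this rule, disagreement between members widens the ensemble interval even when each member is individually confident, which matches the situation described above where single-model intervals do not overlap.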
Fig. 6. Model predictions for the applied voltage (a), the sheet resistance (b), the dark saturation current (c), and the parallel resistivity (d) for the validation image MO-sim. The marker shapes indicate the subdomain type for which the parameters were varied and learned during training (see Table I). They are listed in the legend in black and apply to both single model (blue) and ensemble prediction (orange) data points. The color markers and the black shape markers have to be used in combination (e.g., the blue diamonds correspond to single model predictions in the active area). The green cross shows the values used during the simulation of MO-sim.
The full parameter prediction results for all four measurement samples and the simulated validation image are shown in Fig. 9. The validation images and the corresponding resimulations using the mean values of the distribution predicted by the deep ensemble are shown in Fig. 7 for MO-sim, MO, and M1. The part of the image that was used as the input of the CNN is highlighted with a red rectangle. A general visual inspection of the images shows good agreement in terms of EL intensity and voltage drops due to the gridline layout. The intensity of the resimulated shunt resembles the measurement closely in the case of the simulated test image. For the measured images, the predicted distribution’s mean values for ρpar of the laser cut area result in visually more pronounced voltage drops. The average error in the junction voltage value when comparing the resimulated image with the measured image is 0.2%. The maximum error that can be found in both images is 10%.
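The two error figures quoted above are pixelwise statistics and can be computed as in this short sketch. The function name and the synthetic arrays are hypothetical and only illustrate the definition of the average and maximum relative deviation.

```python
import numpy as np

def relative_deviation(v_resimulated, v_measured):
    """Return the mean and maximum pixelwise relative deviation."""
    rel = np.abs(v_resimulated - v_measured) / np.abs(v_measured)
    return float(rel.mean()), float(rel.max())

# Synthetic example: a uniform 0.2% offset plus one strongly deviating pixel.
v_meas = np.full((40, 80), 0.60)
v_sim = v_meas * 1.002
v_sim[20, 35] = 0.66
mean_dev, max_dev = relative_deviation(v_sim, v_meas)
```

A small mean combined with a much larger maximum, as in this toy case, indicates that the deviation is concentrated in a few pixels rather than spread over the image.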
Fig. 7. Comparison of the input images (a)–(c) and the corresponding resimulations (d) and (e) based on the parameters predicted by the CNN inverse model.
The horizontal and vertical cross sections of the test cells are shown in Fig. 8. For each pixel, the standard deviation has been calculated by simulating 24 images with parameters that have been sampled from the parameter distributions predicted by the deep ensemble. The values of the cross section confirm the impression that the applied voltage range and voltage drops between grid lines are correctly modeled by the mean values of the predicted distributions. The deep ensemble tends to overestimate 1/ρpar in the laser cut region. In addition, the parallel resistance of the cell samples M1, B1, and T2 was calculated by fitting a lumped-parameter equivalent circuit diode model to each cell’s current–voltage curve. The global parallel resistance has then been multiplied with the total shunt area to estimate the area specific resistivity of the shunts. In the case of M1, it is assumed that the two shunts contribute equally to the measured global parallel resistance. The results shown in Fig. 9 confirm that the CNN model overestimates the value 1/ρpar, which leads to a stronger voltage drop in the vicinity of the shunt subdomain. However, the calculated confidence ranges for 1/ρpar show that the model is able to correctly deliver uncertainty values such that three out of four measured values lie within the predicted confidence interval.
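Both calculations in this paragraph can be sketched in a few lines. The forward model below is a toy stand-in for the FEM simulation and is purely hypothetical, as are the function names and example numbers; only the 24 samples and the shunt dimensions (1 mm × 0.01 mm) are taken from the text.

```python
import numpy as np

def pixelwise_statistics(forward, mu, var, rng, n_samples=24):
    """Sample parameters from the predicted Gaussians and simulate each draw."""
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    params = rng.normal(mu, np.sqrt(var), size=(n_samples, mu.size))
    images = np.stack([forward(p) for p in params])
    return images.mean(axis=0), images.std(axis=0)

def toy_forward(p, shape=(40, 80)):
    """HYPOTHETICAL stand-in for the FEM model: linear voltage drop along x."""
    v_appl, slope = p
    ramp = np.linspace(0.0, 1.0, shape[1])[None, :]
    return v_appl - slope * ramp * np.ones(shape)

def shunt_resistivity(r_parallel_ohm, n_shunts, width_m=1e-3, height_m=1e-5):
    """Area-specific resistivity (Ohm m^2) from a global parallel resistance."""
    total_area = n_shunts * width_m * height_m  # equal shunts are assumed
    return r_parallel_ohm * total_area

rng = np.random.default_rng(4)
mean_img, std_img = pixelwise_statistics(toy_forward, [0.62, 0.02], [1e-6, 1e-5], rng)
rho_shunt = shunt_resistivity(100.0, n_shunts=2)  # hypothetical 100 Ohm fit result
```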
Fig. 8. Horizontal (a)–(c) and vertical (d)–(f) cross sections of the input image and the corresponding resimulation based on the parameters predicted by the CNN inverse model, including the simulation model uncertainty.
Fig. 9. Detailed ensemble parameter predictions for the validation image and the investigated measurement samples.
IV. DISCUSSION
The results presented in Sec. III show that the deep ensemble CNN used is a promising candidate for an inverse model for a silicon solar cell. It was possible to train the network to predict the model parameters with high accuracy based on the simulation results. In cases with lower prediction accuracy, the model correctly reports large error bars, thanks to the negative log-likelihood loss function used. Resimulations based on the parameters predicted by the deep ensemble inverse model confirm the overall consistency of the approach: forward simulations using the inversely calculated parameters agree well with the original data.
High uncertainties exist in the inverse prediction of j0 and R□. Nevertheless, the forward calculation confirmed that the predicted parametrizations lead to a valid model representation of the given sample. This is an indication that the inverse problem defined by the equations implemented in the Laoss model is ill-posed. By substituting Eq. (2) into Eq. (1), one can show that in the regime of low internal resistivity ρint, the derivative of Δvt depends only on the product of j0 and R□. This leads to a strong correlation of the two parameters that makes it difficult to resolve them independently from a voltage image alone. The dependency is also confirmed by the resimulation of the ensemble prediction results of MO-sim shown in Fig. 6. The CNN model underestimates the value of j0. Since the value of R□ is simultaneously overestimated, the resimulation of MO-sim in Fig. 8 agrees well with the CNN input image. This behavior is also consistent with luminescence imaging theory, which requires a combination of EL and PL imaging to determine j0 and R□ separately. Therefore, if only a model-based reconstruction of the EL image is of interest, a possible modification for inverse modeling could be to predict only the product j0 · R□, which would simplify the problem. Similarly, in the discussed regime of low internal resistivity ρint, the parallel resistivity affects Eq. (1) only through the quotient R□/ρpar. Because one of the two values of the quotient was always held constant and there is no domain in which both parameters were varied simultaneously, this had no further implications in the present study. An alternative to avoiding the ill-posedness of the problem by guessing well-defined parameter combinations could be a physics-informed neural network that includes knowledge of the governing equations during training of the inverse model. Such an approach could help to force the network to correctly account for interdependent model parameters.
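Why predicting only the product j0 · R□ would simplify the problem can be illustrated with a small numerical sketch. It assumes hypothetical ensemble member predictions in which a lower j0 is compensated by a higher R□. None of the values or units are taken from the paper. In logarithmic space the product becomes a sum, and anti-correlated errors largely cancel.

```python
import math
import statistics

# Hypothetical member predictions (illustrative values and units only):
# each member trades a lower j0 against a higher sheet resistance.
j0_members = [1.0e-9, 2.0e-9, 4.0e-9, 8.0e-9]
rsheet_members = [80.0, 41.0, 19.5, 10.2]

log_j0 = [math.log10(v) for v in j0_members]
log_rs = [math.log10(v) for v in rsheet_members]

# log10(j0 * R_sheet) = log10(j0) + log10(R_sheet)
log_product = [a + b for a, b in zip(log_j0, log_rs)]

# Population standard deviation across members as a simple spread measure.
spread_j0 = statistics.pstdev(log_j0)
spread_rs = statistics.pstdev(log_rs)
spread_product = statistics.pstdev(log_product)
```

In this constructed example, the spread of the product across members is more than an order of magnitude smaller than the spread of either factor, which mirrors the situation in which the voltage image constrains the product well but not the individual parameters.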
The parallel resistivity of the shunt region was less accurately predicted by the inverse deep ensemble model. Since the deep ensemble performed significantly better on the simulated validation image, it can be assumed that this is partly due to differences in the detailed appearance of the shunt region in the measurement when compared to the simulated data. Although much effort was put into accurately modeling the resulting voltage drop, even the best modeling results showed significant deviations from the measured data. During training, the CNN only learns to recognize the shape of the model-based shunts. Therefore, shunts in the measured data with different appearances are not expected to be handled correctly by the deep ensemble CNN. This is particularly evident for the second shunt in sample M1, which has a more triangular shape compared to the ellipsoidal shunts in the other test cells. Due to this unknown geometry, the CNN significantly overestimates the conductance of the shunt in this case. Another reason for the overestimation of the shunt conductivities is that the junction voltage of the measurement is overestimated in the area used as input to the CNN model, because the calibration procedure includes areas affected by edge effects (see Sec. II A). The CNN model interprets this as a higher applied voltage, in which case the same absolute voltage drop in the shunt area is only possible with an increased shunt conductivity.
The use of synthetic data for training a neural network can only be successful if the distribution of the training data fully covers all samples of interest to which the model is to be applied. If the network is applied to data that have features that are not present during the training phase, uncontrollable extrapolations are possible. The deep ensemble did not provide reasonable results for two of the four test samples. The results of the failed cases are shown in the Appendix in Fig. 10 and in Fig. 11. In these cases, the measurement samples were not perfectly aligned due to inaccuracies in the chosen measurement setup. As a result, the grid lines and the mask in channel 2, which define the pixels of the subregion to be predicted, have a small shift that causes the network to calculate values for R□ that are averaged between the high conductivity of the grid line and the low conductivity of the active region of the cell. This leads to a complete breakdown of the method. The results demonstrate the sensitivity of data-driven methods to the quality and comprehensiveness of the training dataset.
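Since a small shift between the mask and the grid lines was enough to break the prediction, a simple alignment check before inference may be useful. The following is only a sketch of one possible check, not part of the presented method. It assumes hypothetical one-dimensional intensity profiles taken across the grid lines and estimates the integer pixel shift by cross-correlation. The helper `best_shift` is introduced here for illustration.

```python
def best_shift(reference, measured, max_shift):
    """Return the integer shift s that maximizes sum(reference[i] * measured[i + s])."""
    best_s, best_score = 0, float("-inf")
    n = len(reference)
    for s in range(-max_shift, max_shift + 1):
        score = 0.0
        for i in range(n):
            j = i + s
            if 0 <= j < n:  # ignore pixels shifted outside the profile
                score += reference[i] * measured[j]
        if score > best_score:
            best_s, best_score = s, score
    return best_s


# Hypothetical profiles: 1 marks a grid-line pixel, 0 the active area.
mask_profile = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
image_profile = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]

shift = best_shift(mask_profile, image_profile, max_shift=3)
is_aligned = shift == 0
```

A nonzero shift would indicate that the sample should be re-registered, or excluded, before it is passed to the inverse model.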
V. CONCLUSION AND OUTLOOK
In many engineering applications, finding model parameters of numerical models based on an indirect measurement can be a difficult and time-consuming task. In this paper, an inverse modeling approach based on a deep ensemble CNN was demonstrated utilizing a numerical model for the simulation of EL images of silicon solar cells with known cell geometry and known defect areas.
The work confirmed that a CNN is a valuable candidate for a data-driven inverse model. In total, 75 000 simulated images have been created with Laoss based on parameters randomly sampled from a predefined range. With 94% of the images in the training set and 6% in the validation set, the CNN model has successfully been trained to learn the inverse mapping from the measurement image to the corresponding model parameters. By using a deep ensemble CNN model, an uncertainty prediction for the model parameters was included, which is a key component for using the method in a scientific environment.
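The data generation and split described above can be sketched as follows. The image count and the 94%/6% split are taken from the text. The uniform sampling, the parameter names, the normalized ranges, and the random seed are assumptions made for illustration only.

```python
import random

N_IMAGES = 75_000    # number of simulated images stated in the text
TRAIN_PERCENT = 94   # training share stated in the text


def sample_parameters(rng, ranges):
    """Draw one parameter set from predefined (low, high) ranges.

    Uniform sampling is an assumption; the text only states that the
    parameters were randomly sampled from a predefined range.
    """
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


def split_indices(n, train_percent, rng):
    """Shuffle the image indices and split them into training and validation sets."""
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = n * train_percent // 100
    return indices[:n_train], indices[n_train:]


rng = random.Random(0)
# Hypothetical normalized ranges, not the ranges used for the Laoss model.
ranges = {"j0": (0.0, 1.0), "r_sheet": (0.0, 1.0)}
example = sample_parameters(rng, ranges)
train_idx, val_idx = split_indices(N_IMAGES, TRAIN_PERCENT, rng)
```

With these numbers, the split corresponds to 70 500 training images and 4500 validation images.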
The model performance was tested with forward simulations based on the predicted model parameter distributions. The tests showed relative deviations of the calculated mean junction voltage from the original measured junction voltage of 0.2% on average with a maximum of 10%. The measured junction voltage was within the estimated uncertainties of the model results. The resistivities of the shunt subdomains have been estimated based on the measured current–voltage curve and compared to the values predicted by the CNN model. Three out of four measurements are within the predicted uncertainty range, which confirms the consistency of the approach. Failures of the method can be explained by mismatches between the simulation model results and the measurement data, which leads to a simulation–reality gap. This critical dependency on the accuracy of the synthetic training data is well known in similar methods.45,46
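The two checks used in this evaluation, the relative deviation between resimulated and measured junction voltage and the test of whether a reference value lies within the predicted uncertainty, can be sketched as below. The voltages are hypothetical, and the width of the uncertainty band (`n_sigma`) is an assumption, since the text does not specify it here.

```python
def relative_deviations(simulated, measured):
    """Element-wise relative deviation |sim - meas| / |meas|."""
    return [abs(s - m) / abs(m) for s, m in zip(simulated, measured)]


def within_uncertainty(reference, predicted_mean, predicted_std, n_sigma=1.0):
    """True if the reference lies within n_sigma predicted standard deviations."""
    return abs(reference - predicted_mean) <= n_sigma * predicted_std


# Hypothetical junction voltages in volts (not measured data).
measured = [0.600, 0.598, 0.590, 0.550]
simulated = [0.601, 0.598, 0.589, 0.572]

devs = relative_deviations(simulated, measured)
mean_dev = sum(devs) / len(devs)
max_dev = max(devs)

# Hypothetical shunt resistivity check against a reference from an I-V curve.
covered = within_uncertainty(1.0, 1.2, 0.3)
missed = within_uncertainty(1.0, 2.0, 0.3)
```

Reporting both the mean and the maximum deviation is useful because a low average can hide a localized mismatch, for example in a shunt region.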
By essentially pre-calculating an inverse model, the method offers a promising way to calculate the parameters of a physical cell model quickly and accurately. Since domain-specific information is only present during the training data simulation stage, the approach is highly transferable to other types of solar cells or other engineering applications where a numerical simulator is available. For each cell layout, a separate model has to be trained. In addition, the presented results are restricted to shunts with a standardized affected area and orientation. To cover other defect types, these would have to be included in the training data generation. This would significantly increase the computation time needed for the generation of the images. However, this training stage accounts for the main effort of the method, and the model can still provide fast and reliable results during the prediction stage. In particular, we expect that such a model can be used in the near future for efficient extraction of parameters of batches of novel solar cells with identical layout, for quality control, and to enable further improvements in the production and upscaling process. The full potential of such a model would be realized if it were used for quality control in an industry-scale production line to characterize cells or modules and improve quality assurance, defect classification, and defect removal.
In a next step, we plan to apply the method to perovskite cells, where there is a high potential for improvement through detailed defects and cell characterization. Future improvements of the methods could also include integrating physical knowledge into other parts of deep neural network training, such as the loss function or network architecture, to improve data efficiency and out-of-sample predictions.47
ACKNOWLEDGMENTS
This work was supported by Innosuisse (Grant No. 37304.1 IP-ENG). Franz Baumgartner and Hartmut Nussbaumer from the ZHAW Institute for Energy Systems and Fluid Engineering are thanked for providing the EL imaging camera and laser scribing system used here.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
M. Battaglia: Conceptualization (equal); Data curation (equal); Methodology (equal); Software (lead); Visualization (lead); Writing – original draft (lead). E. Comi: Conceptualization (equal); Data curation (equal); Methodology (equal); Software (supporting); Writing – review & editing (equal). T. Stadelmann: Conceptualization (supporting); Methodology (supporting); Supervision (supporting); Writing – review & editing (equal). R. Hiestand: Methodology (supporting); Software (supporting); Supervision (supporting); Writing – review & editing (supporting). B. Ruhstaller: Conceptualization (equal); Data curation (supporting); Methodology (equal); Supervision (supporting); Writing – review & editing (equal). E. Knapp: Conceptualization (equal); Funding acquisition (lead); Methodology (equal); Project administration (lead); Software (supporting); Supervision (lead); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings will be available in GitHub at https://doi.org/10.5281/zenodo.7528885 following an embargo from the date of publication until January 01, 2028, to allow for the commercialization of research findings.
APPENDIX: ADDITIONAL FIGURES
Detailed ensemble parameter predictions for the validation image and the investigated measurement samples.
Comparison of the input images (a) and (b) and the corresponding resimulations (c) and (d) based on the parameters predicted by the CNN inverse model (failed cases).
Horizontal (a) and (b) and vertical (c) and (d) cross sections of the input image and the corresponding resimulation based on the parameters predicted by the CNN inverse model, including the simulation model uncertainty (failed cases).