Separating lithium metal foil into individual anodes is a critical process step in all-solid-state battery production. With the use of nanosecond-pulsed laser cutting, a characteristic quality-decisive cut edge geometry is formed depending on the chosen parameter set. This cut edge can be characterized by micrometer-scale imaging techniques such as confocal laser scanning microscopy. Currently, experimental determination of suitable process parameters is time-consuming and biased by the human measurement approach, while no methods for automated quality assurance are known. This study presents a deep-learning computer vision approach for geometry characterization of lithium foil laser cut edges. The convolutional neural network architecture Mask R-CNN was implemented and applied for categorizing confocal laser scanning microscopy images showing defective and successful cuts, achieving a classification precision of more than 95%. The algorithm was trained for automatic pixel-wise segmentation of the quality-relevant melt superelevation along the cut edge, reaching segmentation accuracies of up to 88%. Influence of the training data set size on the classification and segmentation accuracies was assessed confirming the algorithm’s industrial application potential due to the low number of 246 or fewer original images required. The segmentation masks were combined with topography data of cut edges to obtain quantitative metrics for the quality evaluation of lithium metal electrodes. The presented computer vision pipeline enables the integration of an automated image evaluation for quality inspection of lithium foil laser cutting, promoting industrial production of all-solid-state batteries with lithium metal anode.

The pursuit of improved energy storage solutions drives efforts to commercialize lithium metal battery (LMB) technologies as potential substitutes for conventional lithium-ion batteries (LIBs). Adopting lithium metal as an anode active material promises next-generation battery types with increased specific energies and energy densities (Placke et al., 2017). This capability originates from lithium's exceptional theoretical specific capacity of 3860 mAh g⁻¹ and its low electrochemical potential of −3.04 V vs. the standard hydrogen electrode (Xu et al., 2014). However, electrochemical hurdles, such as a low Coulombic efficiency and the safety-critical formation of lithium dendrites, have thus far precluded the utilization of lithium metal anodes in conjunction with liquid electrolytes (Lin et al., 2017). Therefore, several post-lithium-ion battery chemistries leveraging lithium metal anodes, particularly combined with solid electrolytes, are extensively researched. These technologies include, among others, inorganic and organic all-solid-state batteries (ASSBs), lithium-sulfur batteries (LSBs), and lithium-air batteries (LABs) (Varzi et al., 2020). While significant advancements in the cell chemistry and design of LMBs have been made, a notable gap in research addressing their industrialization remains (Frith et al., 2023; Tan et al., 2022; and Xu et al., 2020b). To facilitate the commercial availability of LMBs, manufacturing systems, production processes, and quality assurance measures have to be developed.

Cutting out anodes of a specified geometry from lithium metal coil substrates with typical thicknesses in the low micrometer range is one of the critical process steps in industrial LMB production (Duffner et al., 2021 and Schnell et al., 2018). In laboratory-scale LMB manufacturing, lithium metal substrates are manually separated using hand tools, such as scissors or punches (Stumper et al., 2023). As lithium metal adheres to mechanical cutting devices (Jansen et al., 2018) due to its plastic deformation at low strain rates (Grady, 1980), the cutting tools require successive cleaning. Progressive tool contamination complicates the transfer of fine blanking from conventional LIB to LMB production (Jansen et al., 2019) due to the decreasing cut edge quality. Proposed techniques to maintain blade cleanliness, such as applying special coatings (Weber, 2019) or sacrificial interlayers (Backlund, 1977), are intricate to implement in high-throughput industrial production lines.

Thus, laser cutting is favorable given its non-contact, wear-free, and flexible working principle (Duffner et al., 2021). In the realm of LIB production, nanosecond-pulsed laser systems are preferentially utilized for electrode cutting (Kriegler et al., 2021; Lee and Suk, 2020; and Lutey et al., 2015). This established application has positioned nanosecond-pulsed laser radiation as a promising choice for separating lithium metal substrates in LMB manufacturing. Moreover, it was recently demonstrated that laser pulses in the nanosecond range enable the separation of lithium metal substrates at exceptional cutting speeds of more than 5 m s⁻¹ (Kriegler et al., 2022).

Beam-matter interaction in the short-pulsed laser processing of metals is characterized by absorption, heat conduction, melting, melt expulsion, evaporation, and plasma formation (Leitz et al., 2011). It was demonstrated that, depending on the selected process parameters, melt displacement results in a raised edge along the cutting kerf when lithium metal substrates are cut by nanosecond laser radiation (Jansen et al., 2018 and Kriegler et al., 2022). This melt superelevation represents a surface feature of critical importance, as it is suspected to promote lithium dendrite growth. Such dendrites are needlelike structures that extend from the electrode surface, causing battery short-circuiting when piercing the separator layer.

It was demonstrated in the literature that inhomogeneities on the surface of lithium metal anodes promote fluctuations in local current densities as they influence the contact area to adjacent layers (Gireaud et al., 2006). Consequently, melt superelevations draw an increased influx of lithium ions and are subject to accelerated lithium deposition rates during lithium plating, referred to as current focusing (Krauskopf et al., 2020). Moreover, if a critical current density is surpassed during lithium stripping, a self-reinforcing mechanism successively accumulates voids and increases the current density. Hence, the increased local current densities during plating may initiate lithium dendrite formation when exceeding a characteristic value, ultimately causing cell death by electrical short circuits (Kasemchainan et al., 2019). In addition, lithium hydroxide may form in the heat-affected zone around the cut edge by reaction with water residues (Bocksrocker, 2022 and Jansen et al., 2018). It is theorized that an uneven lithium surface composition can lead to non-uniform ionic surface conductivities, which, in turn, encourage the emergence of dendritic lithium depositions (He et al., 2019). Thus, choosing process parameters suitable to diminish melt formation is paramount to prevent lithium dendrite formation. Furthermore, detecting defective laser cuts contributes to quality-controlled LMB production. Whereas the cut edge quality is typically not controlled in the laboratory-scale fabrication of prototype LMBs, the efficient industrial production of LMBs with consistent performance characteristics demands scalable and quality-assured separation processes.

Optically inspecting the cut edge of lithium metal substrates using imaging techniques combined with automated image analysis accelerates the laborious identification of feasible process parameters and allows product quality control. The automatic feature extraction from digital images is commonly referred to as computer vision. The underlying methods can be used, among others, for image classification, object detection, semantic segmentation, and instance segmentation (Lin et al., 2014). Recently, the machine learning subfield of deep learning has gained increasing interest in processing image data. Deep multilayered neural networks learn implicit relations within data sets, expanding the detection capabilities compared to conventional image analysis methods (LeCun et al., 2015) like thresholding techniques (Ng, 2006). Their high versatility renders deep learning algorithms robust against fluctuating production environments, such as changing lighting conditions, and allows their transfer to modified production scenarios (Smith et al., 2021). These advantages, in combination with the emerging abundance of computing resources, have promoted deep learning based on neural networks as a leading computer vision method (Chai et al., 2021 and Deng, 2014).

Convolutional neural networks (CNNs) consisting of convolutional, pooling, and fully connected layers represent deep, feed-forward networks well-suited for computer vision tasks (LeCun et al., 2015). From the various evolving CNN architectures (Bharati and Pramanik, 2020 and Guo et al., 2016), the algorithm type and implementation parameters must be selected according to the envisaged application-specific trade-off between runtime and accuracy (Huang et al., 2017). The region-based CNN (R-CNN), initially introduced as an algorithm for bounding-box object detection (Girshick et al., 2014), and its extensions (Girshick, 2015 and Ren et al., 2015) served as the basis for Mask R-CNN (Mask Region-based CNN). Mask R-CNN complements object detection with instance segmentation, creating pixel-to-pixel segmentation masks for each region of interest (ROI) (He et al., 2017), and stands out for its high accuracy (Bharati and Pramanik, 2020). Thus, Mask R-CNN offers a comprehensive solution for detecting and classifying objects through bounding boxes and pixel-level segmentation.

Data acquisition is often elaborate for industrial computer vision tasks, particularly considering the large amount of annotated data demanded by CNNs (Krizhevsky et al., 2017). Therefore, a CNN can be pre-trained with abundant data to learn low-level, data-unspecific features. Following this initial phase, domain-specific fine-tuning can be applied to repurpose the learned features for a target data set and task (Girshick et al., 2014). Such a transfer learning approach (Yosinski et al., 2014) alleviates the scarcity of task-specific training data (Bengio, 2012).

Applying Mask R-CNN and comparable computer vision algorithms has been proposed within a multitude of domains, including agriculture (Gené-Mola et al., 2020; Gonzalez et al., 2019; and Qiao et al., 2019), infrastructure (Guo et al., 2021 and Xu et al., 2022), medicine (Anantharaman et al., 2018 and Ronneberger et al., 2015), and materials science (Masubuchi et al., 2020). Although CNNs exhibit exceptional detection capabilities, published research regarding their utilization for computer vision in industrial production is limited (Wuerschinger et al., 2020). This scarcity may partly be attributed to the industry's reluctance to adopt black-box approaches. Nonetheless, the potential of CNNs in industrial quality control has been showcased several times.

Various CNNs were applied for the automated visual inspection of friction stir welds using camera and topography images (Hartl et al., 2019). Courtier et al. (2021) used a CNN to classify laser-cut stainless steel samples according to the applied cutting speed. In additive manufacturing, CNNs were applied for characterizing surface defects in scanning electron microscopy images of samples produced by selective laser melting (Wang et al., 2022). For femtosecond laser processing, CNNs were implemented to predict process parameters (Mills et al., 2019) and to detect beam misalignments (Xie et al., 2019) using camera images. Furthermore, CNNs were utilized in battery production to classify laser weld defects of battery safety vents in digital images (Yang et al., 2020a and Yang et al., 2020b). Moreover, the approach was extended by a pixel-level localization of weld defects using semantic segmentation networks (Yang et al., 2022 and Zhu et al., 2021).

Assigning pixel-level masks by instance segmentation allows the derivation of quantitative values on the location, shape, and size of objects, rendering it an excellent method for feature evaluation in microscopy images. Therefore, this study addresses the applicability of CNN-based computer vision for parameter selection and quality assurance of lithium metal laser cutting in ASSB production. Mask R-CNN is proposed as a feasible algorithm for classifying and segmenting confocal laser scanning microscopy (LSM) images showing cut edges of lithium metal foils separated by laser radiation. The segmentation masks are applied to gain quantitative information on the cut edge geometry by combining them with topography data.

Battery-grade lithium metal foils (China Energy Lithium, China) with a thickness of 50 μm were processed using a nanosecond-pulsed fiber laser (SP-200P-A-EP-Z-L-Y, TRUMPF, formerly SPI, Germany) emitting radiation with a wavelength of 1060 nm. The laser source allowed the adjustment of the pulse waveform and enabled average output powers of up to 200 W. The laser beam was deflected by a high-speed galvanometric scanning unit (Superscan IV-30, Raylase, Germany) and focused via a telecentric F-theta lens (S4LFT2163/126, Sill Optics, Germany) with a focal length of 163 mm to a spot radius of approximately 14 μm. Due to the high reactivity of lithium metal, the samples were enclosed in a container filled with dry air, which allowed the laser beam to enter via a transparent laser window. Cuts of 10 mm length were produced using 288 parameter combinations, varying the laser power, the pulse repetition rate, the pulse duration, and the laser beam scanning velocity. The experimental plan encompassed a wide range of process parameters and is detailed in Table V in the Appendix. Depending on the process parameters used, material removal was based on melt expulsion and evaporation, leading to a characteristic melt superelevation along the cutting kerf. The experimental setup and the cause-effect relationships between process parameters and cutting kerf features are detailed in a previous publication (Kriegler et al., 2022).

Images of the laser cuts in the lithium metal samples were obtained using LSM (VK-X 1000, Keyence, Japan) at a 480-fold magnification, resulting in a captured image region of approximately 702 × 527 μm². The cutting kerfs were manually centered in the microscope's image field. The samples were evaluated at approximately 5 mm from the incision point to exclude process instabilities at the start of the laser cut. The illumination strength was automatically determined by the microscope's measurement software (VK.H2X, Keyence, Japan).

A total of 246 color images with a resolution of 1024 pixels × 768 pixels were recorded (see Fig. 7) using the complementary metal-oxide-semiconductor (CMOS) sensor integrated into the LSM. Additionally, the topography of the electrode surface was captured using the confocal laser height measurement function of the LSM with a laser beam wavelength of 661 nm. The acquisition frame rate was 15 Hz and the vertical scan step size was 0.75 μm. The images were tilt-corrected and the workpiece surface was referenced to zero height. The height information was exported to a comma-separated values (csv) file format.
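The tilt correction and zero-height referencing can be reproduced from such an export. The following is a minimal sketch, assuming the csv file holds one height value per pixel and that a least-squares plane fit over the image approximates the workpiece surface; the file name and delimiter are illustrative.

```python
import numpy as np

# Load the exported LSM height matrix (one height value per pixel).
heights = np.loadtxt("cut_edge_topography.csv", delimiter=",")  # shape (768, 1024)

# Tilt correction: fit a plane z = a*x + b*y + c to the surface by least
# squares and subtract it, referencing the workpiece surface to zero height.
rows, cols = np.indices(heights.shape)
A = np.column_stack([cols.ravel(), rows.ravel(), np.ones(heights.size)])
coeffs, *_ = np.linalg.lstsq(A, heights.ravel(), rcond=None)
leveled = heights - (A @ coeffs).reshape(heights.shape)
```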

FIG. 1.

Processing pipeline showing the training of Mask R-CNN and its application for automatic image segmentation, enabling the quantitative evaluation of the melt superelevations of laser cut edges on lithium metal foils.

The color images were converted to the Joint Photographic Experts Group (jpg) format. Subsequently, the data set was artificially augmented by 180° rotation and horizontal mirroring to quadruple the data stock to 984 images. Data augmentation by modifying images on a pixel level, for instance, by altering the brightness, was disregarded, as the constant illumination during image acquisition ensures consistent image quality, also in industrial production. Each of the 984 images from the data stock was assigned to one of the three classes (see Table I) and ground-truth labeled with polygon lines by a human expert. The open-source software LabelMe (Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, USA) was used to segment the melt superelevations.
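The geometric augmentation amounts to three additional variants per image. A minimal sketch, assuming an illustrative folder layout (the polygon labels must be transformed in the same way):

```python
from pathlib import Path
from PIL import Image, ImageOps

src, dst = Path("images_original"), Path("images_augmented")
dst.mkdir(exist_ok=True)

# Quadruple the data stock: original, 180° rotation, horizontal mirror,
# and the mirrored rotation.
for path in src.glob("*.jpg"):
    img = Image.open(path)
    variants = {
        "orig": img,
        "rot180": img.rotate(180),
        "mirror": ImageOps.mirror(img),
        "rot180_mirror": ImageOps.mirror(img.rotate(180)),
    }
    for tag, variant in variants.items():
        variant.save(dst / f"{path.stem}_{tag}.jpg")
```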

TABLE I.

Identifier, exemplary confocal laser scanning microscope image, class description, and data set size for the three defined classes.

Class 1 (defective cut): Class 1 images show a defective cut with a non-continuous cutting kerf and fused melt superelevations. Defective cuts can occur for various reasons, such as unsuitable process parameter selection, a laser malfunction, or a workpiece positioning error. Workpieces corresponding to class 1 images are allocated as rejects for industrial lithium metal battery production. Data set size: 336 images.

Class 2 (successful cut/regular melt): Class 2 images show a continuous cutting kerf with clearly separated melt rims at its sides and even melt superelevations with a quasi-constant width, characteristic of a stable cutting process. It is assumed that class 2 cuts can be accepted within industrial lithium metal battery production if the topography deviations are in a reasonable range. Data set size: 105 images.

Class 3 (successful cut/irregular melt): Class 3 images show a continuous cutting kerf with clearly separated melt rims at its sides but with melt superelevations irregular in width and shape, indicating an unstable process behavior. It is assumed that class 3 cuts can be accepted within industrial lithium metal battery production if the topography deviations are in a reasonable range. However, the inconsistency in the product quality may render the corresponding process parameter set rather undesirable. Data set size: 483 images.

The labeled images were converted to the common objects in context (coco) format (Lin et al., 2014), saved as javascript object notation (json) files, and used for training, validation, and testing of the CNN. Sixty images were selected for model testing and excluded from model training/validation, with equal shares of class 1, class 2, and class 3 images. From the remaining 924 images, 700 and 224 were assigned to the training and validation data sets, respectively. The validation data set was used for hyperparameter tuning, particularly for early-stage detection of overfitting. Smaller data sets with 22/7, 44/14, 88/28, 175/56, and 350/112 training/validation images were created by randomly removing images to test the influence of the training/validation data quantity.
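The nested subsets can be drawn as follows; a sketch under the assumption that the 60 test images are already held out in a separate folder (paths and seed are illustrative):

```python
import random
from pathlib import Path

# Illustrative reconstruction of the data splits: the 60 test images are
# assumed to reside elsewhere; only training/validation images are listed.
train_val = sorted(Path("images_augmented/train_val").glob("*.jpg"))  # 924 images

random.seed(0)
random.shuffle(train_val)
train, val = train_val[:700], train_val[700:]  # 700 training, 224 validation

# Nested subsets for the data-set-size study (22/7 ... 350/112)
subsets = {n: (train[:n], val[:m])
           for n, m in [(22, 7), (44, 14), (88, 28), (175, 56), (350, 112)]}
```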

In another experiment, all rotated and mirrored variants of the original test images were removed from the training/validation set to evaluate the influence of data augmentation on the model performance. The resulting 744 images for training and validation were again divided into data sets of 22/7, 44/14, 88/28, 175/56, and 350/112 training/validation images. This modified data augmentation approach is referred to as limited data augmentation, and the data sets are referred to by their number of training images throughout this work.

Code implementation and execution were realized on a standard personal computer (see Table II). Computational analysis was performed using Python 3.7 with the TensorFlow 1.15 and Keras 2.0 frameworks. An open-source version of Mask R-CNN (Waleed, 2017) with a ResNet101 (He et al., 2016) backbone was selected as the model basis due to its high accuracy, providing bounding boxes and semantic feature masks. No particular hyperparameter optimization, for example, of the learning rate, was performed as this study focused on model application to an industrial use case. A learning rate of 10⁻⁴ was chosen for all experiments based on explorative preliminary tests. The image characteristics and the model hyperparameters for training are summarized in Table III. A transfer learning approach was followed using a pre-trained Mask R-CNN model (Waleed, 2017) with initial weights based on 35 000 images from the generic coco data set (Lin et al., 2014), reducing the training effort.
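The corresponding setup with the open-source implementation can be sketched as follows; the configuration mirrors Table III, while the dataset objects (subclasses of mrcnn.utils.Dataset), the weight file path, and the choice of trainable layers are assumptions for illustration:

```python
from mrcnn.config import Config
from mrcnn import model as modellib

class CutEdgeConfig(Config):
    """Training configuration following Table III."""
    NAME = "lithium_cut_edge"
    NUM_CLASSES = 1 + 3            # background + the three classes of Table I
    BACKBONE = "resnet101"
    LEARNING_RATE = 1e-4
    WEIGHT_DECAY = 1e-4
    LEARNING_MOMENTUM = 0.9
    IMAGES_PER_GPU = 1             # assumption: one 1024 x 768 image per step
    DETECTION_MIN_CONFIDENCE = 0.7

config = CutEdgeConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs")

# Transfer learning: initialize with COCO weights and re-initialize the
# class-specific output heads for the new number of classes.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])

# dataset_train and dataset_val are prepared mrcnn.utils.Dataset instances
# built from the coco-format annotations described above.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE, epochs=100, layers="all")
```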

TABLE II.

Computing resources used for model training and testing within this work.

| CPU(a) | RAM(b) | GPU(c) | VRAM(d) | CUDA capability(e) | Operating system |
| Intel(R) Core(TM) i5-6400 | 16 GB | NVIDIA GTX 1070 | 8 GB | 6.1 | Windows 10 |

(a) CPU, central processing unit.

(b) RAM, random-access memory.

(c) GPU, graphics processing unit.

(d) VRAM, video random-access memory.

(e) The compute unified device architecture (CUDA) allows the usage of GPUs for general-purpose computing.

TABLE III.

Image characteristics and hyperparameters for the training of the Mask R-CNN model presented within this work.

| Image dimensions | Image size | Original image type | Learning rate | Weight decay(a) | Momentum(a) | Batch size | Epochs |
| 1024 pixels × 768 pixels | ≈2.1 MB | png(b) | 10⁻⁴ | 1 × 10⁻⁴ | 0.9 | | 100 |

(a) The weight decay and momentum were chosen according to He et al. (2017).

(b) png, portable network graphics.

The model performance was evaluated using well-known metrics for object classification and instance segmentation. Only one of the three defined object classes could be present in an image as they were mutually exclusive. Thus, an image was categorized as true positive (TP) if the ground-truth object class was detected correctly, while an image was categorized as false positive (FP) if a non-existing object was detected or an existing object was detected in a misplaced position (Padilla et al., 2020) (see Table VI). Thereby, a correct detection was assumed if the generated bounding box overlapped with the ground-truth bounding box to a degree of at least 70%. The corresponding model hyperparameter was the region proposal network threshold (RPN threshold). The mean precision,

$$P = \frac{TP}{TP + FP} \times 100\%, \tag{1}$$

in % for all the test images was calculated individually for class 1 and class 2/class 3 as the metric to identify the relevant classes (Padilla et al., 2020). Additionally, an accumulated mean precision P was calculated for all classes to account for the overall algorithm performance.
In order to evaluate the instance segmentation accuracy, the intersection over union (IoU) in % was calculated as the ratio of the area of overlap to the area of union of the predicted area $A_P$ and the ground-truth area $A_G$ (Padilla et al., 2020),

$$\mathrm{IoU} = \frac{|A_P \cap A_G|}{|A_P \cup A_G|} \times 100\%. \tag{2}$$
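Both metrics follow directly from the detection counts and the binary masks; a minimal sketch:

```python
import numpy as np

def precision(tp: int, fp: int) -> float:
    """Mean precision according to Eq. (1), in percent."""
    return tp / (tp + fp) * 100.0

def iou(mask_pred: np.ndarray, mask_gt: np.ndarray) -> float:
    """Intersection over union according to Eq. (2) for binary masks, in percent."""
    intersection = np.logical_and(mask_pred, mask_gt).sum()
    union = np.logical_or(mask_pred, mask_gt).sum()
    return intersection / union * 100.0
```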

Furthermore, the data set training time and the image test time were recorded to assess the applicability of the methodology in industrial production.

The melt height, the melt width, and the width of the cutting kerf were defined as quality features of the cut edge and were analyzed both automatically and, for reference, manually by a human expert.

For an automatic quantitative assessment of the laser cut edges, the following steps were carried out using the segmentation masks (a code sketch of these steps follows the list):

  • Creation of binary segmentation matrices (size: 1024 pixels × 768 pixels) from the segmentation masks by setting every value corresponding to the segmentation mask to 1 and all other values to 0.

  • Extraction of LSM topography data by exporting height matrices (size: 1024 pixels × 768 pixels) containing the height values at each spatial position.

  • Segmentation matrix correction using the topography data by removing all pixels with height values below the reference plane from the segmentation matrices (i.e., setting their value to 0).

  • Segmentation matrix correction by deletion of the 50 uppermost and 50 lowermost pixel rows corresponding to approximately 34 μm at the top and the bottom of the image to reduce the influence of segmentation mask inaccuracies at the image margins.

  • Multiplication of the binary matrices with the width of one pixel (0.686 μm), line-wise summation of the matrix values for the left/right melt superelevation, and extraction of the mean melt width.

  • Calculation of the standard deviation of the melt width using the line values to allow an assessment of the homogeneity of the melt superelevation.

  • Determination of the mean kerf width by line-wise subtraction of the innermost point of the left melt superelevation from the innermost point of the right melt superelevation and calculation of the mean value of all rows.

  • Multiplication of the binary matrices with the height values from the topography data, line-wise summation of the matrix values for the left/right melt superelevation, and calculation of the mean melt height and its according standard deviation.
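These steps condense into a few array operations. The following is a minimal sketch, assuming one binary mask per melt superelevation, the tilt-corrected height matrix from above, and a vertically oriented kerf; names and structure are illustrative:

```python
import numpy as np

PIXEL_WIDTH_UM = 0.686  # lateral size of one pixel in micrometers
TRIM_ROWS = 50          # rows removed at the top and bottom of the image

def evaluate_cut(mask_left, mask_right, heights):
    """Extract melt and kerf metrics from two binary segmentation matrices
    (one per melt superelevation) and the height matrix, all (768, 1024)."""
    metrics, masks = {}, {}
    for side, mask in (("left", mask_left), ("right", mask_right)):
        m = mask.astype(bool) & (heights > 0)    # drop pixels below the reference plane
        m = m[TRIM_ROWS:-TRIM_ROWS]              # trim ~34 um at the image margins
        h = heights[TRIM_ROWS:-TRIM_ROWS]
        widths = m.sum(axis=1) * PIXEL_WIDTH_UM  # line-wise melt width
        metrics[f"melt_width_mean_{side}"] = widths.mean()
        metrics[f"melt_width_std_{side}"] = widths.std()
        metrics[f"melt_height_mean_{side}"] = h[m].mean()
        metrics[f"melt_height_std_{side}"] = h[m].std()
        masks[side] = m

    # Kerf width: line-wise distance between the innermost points of the
    # left and the right melt superelevation (kerf assumed vertical).
    kerf = []
    for l_row, r_row in zip(masks["left"], masks["right"]):
        if l_row.any() and r_row.any():
            inner_left = len(l_row) - 1 - np.argmax(l_row[::-1])  # last True index
            inner_right = np.argmax(r_row)                        # first True index
            kerf.append((inner_right - inner_left) * PIXEL_WIDTH_UM)
    metrics["kerf_width_mean"] = float(np.mean(kerf))
    metrics["kerf_width_std"] = float(np.std(kerf))
    return metrics
```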

To obtain reference values for the automatically determined metrics, the topography data were analyzed by a human expert using the microscope's software module (Multifile Analyzer, Keyence, Japan). Average profile lines were generated by calculating the mean of 768 profile lines perpendicular to the cutting kerf. The distance between the profile lines was 0.686 μm, corresponding to the size of one pixel. The processing pipeline is summarized in Fig. 1 and detailed in Table VII with exemplary intermediate and final images for the three defined classes.
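Continuing the sketches above, the averaged cross section underlying the manual reference measurement reduces to a mean over the image rows:

```python
import numpy as np

# Average profile perpendicular to the kerf: the mean of the 768 row
# profiles (assumes the kerf runs vertically in the tilt-corrected
# height matrix 'leveled' from the earlier sketch).
avg_profile = leveled.mean(axis=0)          # 1024 lateral positions
x_um = np.arange(avg_profile.size) * 0.686  # position in micrometers
```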

FIG. 2.

Learning curves showing the training loss for training data sets over the number of training epochs with (a) 22 images, (b) 44 images, (c) 88 images, (d) 175 images, (e) 350 images, and (f) 700 images. The error intervals in subfigures (a)–(e) depict the standard deviation for three independent training runs with randomly selected input images from the data stock. In subfigure (f), the error interval corresponds to the standard deviation for three training runs using all available training images.

Mask R-CNN was used for classifying the images into defective cuts (class 1) and successful cuts (class 2 and 3). Such a classification can be useful for identifying a rough parameter window or for detecting production defects. At this point, no distinction was made between class 2 and class 3 images since these classes do not indicate production defects but allow conclusions on the process behavior. Reducing the effort for image acquisition, data preparation, and model training is crucial for implementing computer vision applications in industrial production. The required amount of annotated data for reaching a certain precision with a CNN is generally unknown. Thus, the influence of the data set size on the classification precision was evaluated by training the model with six data sets containing 22–700 training images randomly selected from the data stock.

The test data set contained 20 images each of defective cuts (class 1), successful cuts with a regular melt superelevation (class 2), and successful cuts with an irregular melt superelevation (class 3). The training time per image was approximately 2.2 min, and the absolute training time scaled virtually linearly with the number of training images. Accordingly, absolute training times between 0.81 and 25.1 h resulted for 22 and 700 training images, respectively (see Table IV).

Figure 2 shows the learning curves over the number of epochs for the six different data set sizes. The training loss converges to values between 0.1 and 0.2, indicating a successful training process for all data set sizes. Convergence was already reached for the smallest data set of 22 training images, but lower training losses were reached after fewer epochs when images were added to the training data set. Also, the standard deviation between three different training runs decreases when more training images are used, indicating an enhanced reproducibility of the results by reducing the dependence on individual images.

FIG. 3.

Evolution of precision (P) over the number of training epochs for classification of the 60 test images using training data sets with (a) 22 images, (b) 44 images, (c) 88 images, (d) 175 images, (e) 350 images, and (f) 700 images. The error intervals in subfigures (a)–(e) depict the standard deviation of three independent training runs with randomly selected input images from the data stock. In subfigure (f), the error interval corresponds to the standard deviation of three training runs using all available training images.

As expected, while the training loss is reduced over the number of training epochs, the precision of the model for object classification increases (see Fig. 3). While class 2 and class 3 objects were widely classified correctly even for small training data sets, class 1 images were misclassified more frequently, which is also visible in the confusion matrices presented in Table VI. The lower classification accuracy for class 1 images might result from the lower number of images showing defective cuts in the data set (336 class 1 images vs 588 class 2/class 3 images). Also, an assignment to class 1 is sometimes not unambiguously possible even for a human expert, since the color images in some cases impede determining whether a continuous cut is present. When using 175 training images or more, the precision significantly exceeded 90% and reached 98.3% for 700 training images, with a correct classification of all class 2/class 3 images in two of the three training runs.

FIG. 4.

Evolution of the intersection over union (IoU) over the number of training epochs for the 60 test images using training data sets with (a) 22 images, (b) 44 images, (c) 88 images, (d) 175 images, (e) 350 images, and (f) 700 images. An IoU of zero was assumed for images with a failed classification. The error bars in subfigures (a)–(e) represent the standard deviation for three independent training runs with randomly selected input images from the data stock. In subfigure (f), the standard deviation accounts for three training runs using all available training images. The inset in subfigure (f) details the IoU of Mask R-CNN approaching human accuracy.

The decreasing standard deviation of the training loss and the precision between the separate training runs underlines the relevance of individual images for the training process. This indicates that, by carefully selecting training images covering a broad range of object peculiarities, a further boost of the algorithm performance can be reached despite a low number of original images.

For the data sets of 22–350 training images, an additional training run was performed in which no mirrored or rotated version of the test images had been seen during training, limiting the artificial data augmentation (compare Sec. II C).

In this way, it was studied whether the usage of mirrored or rotated versions of the same original image for training and testing caused overfitting. The precision gained by testing with solely unseen original images was within or in proximity to the standard deviation range of the results gained using the comprehensive data augmentation approach (see Fig. 8 and Table IV). However, for 22–88 training images, the limited data augmentation approach led to a higher precision.

FIG. 5.

Exemplary binarized images and segmented topography images for class 2 and class 3 showing the segmented melt superelevations; automatically determined geometry metrics for the class 2 and the class 3 image extracted from the binarized images and the segmented topography images.

For 175 and 350 training images, the limited data augmentation yielded slightly lower precisions than the conventional approach. This might be due to the higher chance of having mirrored or rotated versions of a test image in the training data set when the comprehensive data augmentation approach is used, causing an adaptation to image-specific features. Thus, a marginal overfitting might have occurred for the comprehensive data augmentation approach. Nevertheless, the high precisions reached render data augmentation a feasible method to reduce the necessary amount of original data for the presented use case.

The fast convergence and the comparably high precision achieved with small data sets are presumably a consequence of the model pre-training and the rather simple classification task with only two object classes. The Mask R-CNN model returns a class probability for detected objects in an image, which can be used to reduce misclassification. Therefore, the RPN threshold, which was set to 0.7 within this study, can be increased to achieve a quasi-zero false negative rate for the classification as a defective cut (class 1) (Gené-Mola et al., 2020 and Xu et al., 2020a). Thus, if an image cannot be unambiguously classified, a manual post-inspection by an operator can be triggered in industrial battery production.
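Such a decision rule can be sketched with the detection output of the open-source implementation; the confidence floor and the two handler functions are hypothetical:

```python
# 'model' is the Mask R-CNN built with mode="inference"; detect() returns
# per-instance bounding boxes, class IDs, confidence scores, and masks.
CONFIDENCE_FLOOR = 0.9  # stricter than the 0.7 used in this study; illustrative

results = model.detect([image], verbose=0)[0]
if results["scores"].size == 0 or results["scores"].max() < CONFIDENCE_FLOOR:
    flag_for_manual_inspection(image)   # ambiguous: trigger operator review
elif results["class_ids"][results["scores"].argmax()] == 1:
    reject_workpiece(image)             # class 1: defective cut
```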

Mask R-CNN is not only suitable for object classification but also returns a pixel-level object segmentation mask. Figure 4 demonstrates that as few as 44 training images yielded an IoU of 74.4% after model training for 100 epochs. The high IoU confirms the capability of Mask R-CNN to achieve an accurate segmentation even with a scarce data set, as also shown for a use case from the medicine domain (Anantharaman et al., 2018). When 22 training images were used, the IoU standard deviation, especially for class 1, was still high, owing to the dependence on the individual images in the training data set. However, when increasing the training image number, the standard deviation between individual training runs diminished significantly.

FIG. 6.

Depiction of quantitative values characterizing the laser cut comparing human-based and automatic measurements of the (a) mean melt heights, (b) mean melt widths, and (c) mean kerf widths; deviations of the human-based and automatic measurements are represented by a 10% and a 20% interval. Scatter plots of the automatically determined standard deviations of (d) the melt heights, (e) the melt widths, and (f) the kerf widths for individual cuts indicating process stability during laser cutting. In subfigures (d) and (e), "left" and "right" refer to the respective position of the melt superelevation in relation to the cutting kerf.

This substantiates the negligible influence of individual images in the data set, allowing for a high reproducibility. The highest IoU of 87.9% was obtained for the largest training data set containing 700 images, while the standard deviation only amounted to 0.1%.

The 60 test images were separately labeled by two experts to estimate the human segmentation reproducibility. Both experts were equally instructed on the LabelMe software and the relevant image features. The segmentation of one image took around 1–2 min for a trained expert. The mean IoU between the manually accomplished image masks was 90.7%, resulting from individual labeling preferences and a limited accuracy when adjusting the polygon line in a reasonable amount of time. Thus, the inherent inaccuracy in human-based image segmentation might limit the training progress and explains why an IoU of 100% between human-generated and automatically generated segmentation masks is practically impossible. However, the IoU of 87.9% for all classes using 700 training images approached the human masking accuracy, which underlines the suitability of the chosen algorithm for the underlying segmentation task. Furthermore, when only considering correctly classified images, even higher IoUs resulted, with almost no differences between the separate classes at higher image numbers (see Fig. 9).

FIG. 7.

(a) Color image, (b) height image, and (c) an average profile line of a lithium metal foil sample surface with a cutting kerf obtained using confocal laser scanning microscopy.

The test time for object detection and segmentation amounted to approximately 850 ms per image, corresponding to a frame rate of 1.18 frames per second. This renders the presented approach a reasonable basis for online applications considering that a standard personal computer and no dedicated high-performance computing resource was used (see Table II). A further test time reduction for enabling inline quality control could be accomplished by increasing the computational performance or reducing the image resolution to increase the signal-to-noise ratio. The appropriate resolution must be selected based on the sensor unit being used, which for inline applications might be a high-resolution camera system or a laser-triangulation sensor working at a high sampling rate.

Another approach for image size reduction could be image cropping to remove nonrelevant image sections, such as parts of the background at the image borders. As for the classification precision, no correlation between the achieved IoU and the data augmentation approach was found.

The relevant performance metrics achieved for object classification and image segmentation are summarized in Table IV. The high precision and the IoU approximating the human labeling accuracy render the presented Mask R-CNN implementation feasible for application in industrial battery production.

TABLE IV.

Precision (P) after 100 training epochs, intersection over union (IoU) after 100 training epochs, and training time in relation to the number of training images; for determining the precision with limited data augmentation, rotated and mirrored versions of the test images were removed from the training and validation data sets. The same 60 test images were used for all experiments. The human repeatability for instance segmentation was determined to be 90.7%.

| No. of training images | Absolute training time (h) | Precision (P)(a),(b) | Precision (P) with limited data augmentation(b) | Intersection over union (IoU)(a),(b) | Intersection over union (IoU) with limited data augmentation(b) |
| 22 | 0.81 | 70.0 ± 7.0% | 81.7% | 63.4 ± 6.1% | 71.0% |
| 44 | 1.55 | 86.1 ± 2.1% | 91.2% | 74.4 ± 2.6% | 78.9% |
| 88 | 3.01 | 85.6 ± 5.1% | 85.0% | 74.5 ± 5.0% | 74.3% |
| 175 | 5.87 | 97.2 ± 2.1% | 93.3% | 85.7 ± 1.5% | 81.8% |
| 350 | 11.91 | 96.1 ± 1.0% | 90.0% | 85.4 ± 0.6% | 79.9% |
| 700 | 25.09 | 98.3 ± 0.0% | — | 87.9 ± 0.1% | — |

(a) Mean value and standard deviation of three test runs with a random set of training and validation images.

(b) After 100 epochs.

As this study did not concentrate on algorithm optimization but on the method's applicability in industry, a further improvement of the results is presumably achievable through comprehensive hyperparameter tuning (He et al., 2017). Yet, even without such tuning, a low number of training images resulted in a high precision P and IoU within this study.

The parameter selection for laser micro-machining is complex due to the high number of process parameters and their complex interdependencies. The shape of the melt superelevation makes it possible to investigate the process behavior and is a quality-relevant feature for the laser cutting of lithium metal within all-solid-state battery production (Jansen et al., 2018 and Kriegler et al., 2022). Therefore, using the segmentation masks for gaining quantitative values characterizing the melt superelevation supports the process design. Furthermore, an automated inspection of the cut edge quality allows process fluctuations to be detected, enabling corrections. Figure 5 shows exemplary binarized images with the segmentation masks for the melt superelevations as well as the height images resulting from overlaying the topography images with the binarized segmentation masks. While the geometry metrics of the class 2 image possess a low standard deviation, high standard deviations of the melt width, the melt height, and the kerf width characterize the cutting kerf in the class 3 image. The automatic assessment of the geometry parameters was validated by comparison to measurements performed by human experts (see Sec. II E). Most of the automatically determined melt heights, melt widths, and kerf widths within the test images deviated by less than 20% from the human-based measurements [Figs. 6(a)–6(c)].

FIG. 8.

Evolution of the precision (P) over the number of training epochs for image classification using training data sets with (a) 22 images, (b) 44 images, (c) 88 images, (d) 175 images, and (e) 350 images with limited data augmentation neglecting augmented images from the test data set in the training and validation data sets; (f) precision (P) reached after 100 training epochs for different numbers of training images; comparison of the conventional comprehensive data augmentation approach and the limited data augmentation approach.
FIG. 9.

Evolution of the intersection over union (IoU) over the number of training epochs for the 60 test images using training data sets with (a) 22 images, (b) 44 images, (c) 88 images, (d) 175 images, (e) 350 images, and (f) 700 images. Only the IoU of correctly classified images is depicted. The error bars in subfigures (a)–(e) represent the standard deviation for three independent training runs with randomly selected input images from the data stock. In subfigure (f), the standard deviation accounts for three training runs using all available training images. The inset in subfigure (f) details the IoU of Mask R-CNN approaching human accuracy.

The slight divergence of the values can be explained by the differing measurement approaches. A systematic downward deviation in the automatically determined melt heights is visible in Fig. 6(a). For the automatic measurement, the mean melt height is determined by calculating the mean of all pixel height values that are part of the segmentation mask.

In contrast, in the human-based measurement approach, an average cross section is created by calculating mean height values for each image column. Furthermore, as the manual measurements are based on averaged cross sections, the transition from a melt superelevation to the surrounding bulk material is defined either at the position where the averaged cross section line undershoots the reference plane or by visual criteria in the color images. Thus, since the manual measurements partly depend on the subjective perception of the human expert, errors might arise, especially for class 3 images, where an irregular melt complicates the boundary identification [see Figs. 6(b) and 6(c)]. The lowest melt heights and widths detected were below 10 and 40 μm, respectively. Correlating these values to the underlying process parameters enables the selection of feasible parameter sets. Additionally, a low kerf width, which might be the consequence of unsuitable process parameters, for example, an insufficient laser power or an excessive scanning velocity, can be used to predict a transition from class 2/class 3 to class 1.

The mere consideration of the mean melt height and the mean melt width does not allow for a classification into class 2 or class 3, as a homogeneous melt superelevation might have the same quantitative mean values as an irregular melt superelevation. Extracting the standard deviation of the melt height, the melt width, and the kerf width allows an alternative classification approach [see Figs. 6(d)–6(f)]. Class 2 is characterized by a low standard deviation of all quantitative values, while class 3 shows a higher standard deviation.
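Building on the metrics dictionary from the earlier sketch, such a rule could look as follows; the thresholds are assumptions for illustration, not values derived in this study:

```python
def melt_regularity(metrics, width_std_max=5.0, height_std_max=3.0):
    """Distinguish class 2 from class 3 by the scatter of the line-wise
    metrics; the threshold values in micrometers are hypothetical."""
    regular = all(metrics[f"melt_width_std_{side}"] <= width_std_max
                  and metrics[f"melt_height_std_{side}"] <= height_std_max
                  for side in ("left", "right"))
    return "class 2 (regular melt)" if regular else "class 3 (irregular melt)"
```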

Since separate segmentation masks are generated for the left and the right melt superelevation, individual quantitative values are calculated for each side. This allows directionalities in melt formation, for example, resulting from a beam misalignment, to be detected.

The execution of the entire test pipeline, including the segmentation process and the determination of quantitative values, took around 1.3 s per image. Thus, the data set of 60 images could be analyzed in less than 1.5 min, while a human-based inspection would take up to one hour. The developed pipeline covering object detection, instance segmentation, and the determination of quantitative values is not limited to the presented task but can be transferred to other laser micro-machining tasks. As nanosecond-pulsed laser cutting is industrially well established (Meijer, 2004 and Mishra and Yadava, 2015), for example, in the photovoltaic (Bovatsek et al., 2010), packaging (Lutey et al., 2013), and battery industries (Kriegler et al., 2021), a high potential for applying the presented approach arises. Moreover, the proposed method could be used for inspecting the output of other process steps in battery manufacturing, such as the detection of electrode defects (Badmos et al., 2020), the characterization of laser-structured LIB electrodes (Hille et al., 2023), or the analysis of three-dimensionally structured ceramic solid electrolyte layers (Kriegler et al., 2023).

In this study, a computer vision pipeline was presented, allowing the qualitative and quantitative assessment of lithium metal foil cut edges separated by laser radiation for quality inspection in all-solid-state battery production. The state-of-the-art deep learning algorithm Mask R-CNN was implemented for detecting and segmenting the melt superelevations along cut edges in color images recorded by confocal laser scanning microscopy. A total of 246 images showing cut edges stemming from laser cutting with various parameter sets were captured, and the data set was artificially augmented. The classification ability of the algorithm was used to distinguish between defective and successful laser cuts. The relation between the training data set size and the classification accuracy was discussed, showing a precision of more than 95% for 175 or more training images. The low number of original training images required to reach high classification precisions underlines the high applicability of the approach for industrial use cases where data acquisition is complicated. The melt superelevation was automatically segmented, reaching an intersection over union between 63.4% and 87.9% depending on the number of training images. These values approach the human-based segmentation repeatability of 90.7%, substantiating the high suitability of Mask R-CNN for feature extraction. The segmentation masks were employed to determine quantitative values characterizing the geometry of the cutting kerf and the melt superelevations, which enabled an automated cut edge quality evaluation allowing conclusions on the suitability of a parameter set for lithium metal laser cutting. The presented pipeline can be easily transferred to quality inspection for other micro-machining applications. Furthermore, the approach's high versatility makes it applicable to other imaging techniques providing three-dimensional information, such as white light interferometry. Overall, the developed approach supports the production of high-quality all-solid-state batteries by facilitating the selection of feasible process parameters and automated quality assurance for the laser cutting of lithium metal foils.

Future works may include implementing the method for a continuous quality inspection by transferring the approach to feasible inline sensor systems and increasing the algorithm’s computational efficiency. Moreover, a transfer of the approach to related applications is targeted, considering other laser processes, materials, and quality characteristics.

This work was funded by the German Federal Ministry of Education and Research (BMBF) under Grant No. 03XP0184l (ProFeLi). The authors gratefully acknowledge this support. The authors thank Elena Jaimez-Farnham for manual image labeling.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Johannes Kriegler: Conceptualization (lead); Data curation (equal); Formal analysis (lead); Investigation (lead); Methodology (lead); Validation (equal); Visualization (lead); Writing – original draft (lead); Writing – review & editing (lead). Tianran Liu: Data curation (lead); Formal analysis (equal); Investigation (equal). Roman Hartl: Validation (equal); Writing – review & editing (equal). Lucas Hille: Formal analysis (equal); Writing – review & editing (equal). Michael F. Zaeh: Funding acquisition (lead); Project administration (lead); Supervision (lead); Writing – review & editing (equal).

Additional information, such as image data, software code of the trained Mask R-CNN, and software code for the generation of quantitative quality indicators are available upon reasonable request.

Exemplary images of a lithium metal foil sample obtained using confocal laser scanning microscopy are shown in Fig. 7.

Evolution of the precision (P) using training data sets with limited data augmentation neglecting augmented images from the test data set in the training and validation data sets are shown in Fig. 8.

Evolution of the intersection over union (IoU) for correctly classified images only are shown in Fig. 9.

An overview of the experimental series in which the training, test, and validation images were created is presented in Table V.

Confusion matrices for the classification of lithium laser cut images are presented in Table VI.

The image processing pipeline, illustrated by exemplary class images, is presented in Table VII.

TABLE V.

Images were acquired from five experimental series that aimed to study various effects during pulsed laser cutting of lithium metal. The process behavior is addressed in a separate publication (Kriegler et al., 2022). While experimental series I and II included a full factorial experimental design, experimental series III–V served to determine the maximum scanning speed allowing for a continuous cut. From the total of 288 images, 246 images were selected to train the Mask R-CNN model to provide for an approximate 1:2 ratio between defective (class 1) and successful (class 2/class 3) cuts.

| Experimental series | No. of images | Pulse duration (ns) | Pulse repetition rate (kHz) | Laser power (W) | Scanning speed (mm s⁻¹) | Experiment repetitions |
| I | 12 | 261, 508, 820, 1220 | 200 | 100 | 1000 | 3 |
| II | 96 | 29, 108, 177, 261 | 800 | 100 | 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400 | 3 |
| III | 48 | 261 | 200, 500, 750, 1000 | 50, 100, 150, 200 | Maximum | 3 |
| IV | 84 | 29 | 1000, 1500, 2000, 2500, 3000, 3500, 4000 | 50, 100, 150, 200 | Maximum | 3 |
| V | 48 | 13 | 2950, 3350, 3750, 4150 | 50, 100, 150, 200 | Maximum | 3 |
TABLE VI.

Confusion matrices for the classification of lithium laser cut images into defective cuts (class 1) and successful cuts (class 2/class 3) after 100 training epochs. Confusion matrices give the number of correctly detected instances of a particular object and the number of misclassified instances of a particular object. True classes are in rows and the predicted classes are in columns.

|               | Predicted positive | Predicted negative  |
| True positive | True positive (TP) | False negative (FN) |
| True negative | False positive (FP) | True negative (TN) |

|                     |                 | Test run 1 |               | Test run 2 |               | Test run 3 |               | Precision after 100 epochs(a) |
|                     | True class      | Class 1 | Class 2/class 3 | Class 1 | Class 2/class 3 | Class 1 | Class 2/class 3 |             |
| 22 training images  | Class 1         | 17      |                 | 11      |                 | 20      |                 | 70.0 ± 7.0% |
|                     | Class 2/class 3 |         | 40              |         | 39              |         | 40              |             |
| 44 training images  | Class 1         | 14      |                 | 18      |                 | 12      |                 | 86.1 ± 2.1% |
|                     | Class 2/class 3 |         | 39              |         | 32              |         | 40              |             |
| 88 training images  | Class 1         | 16      |                 | 15      |                 | 13      |                 | 85.6 ± 5.1% |
|                     | Class 2/class 3 |         | 31              |         | 39              |         | 40              |             |
| 175 training images | Class 1         | 18      |                 | 17      |                 | 20      |                 | 97.2 ± 2.1% |
|                     | Class 2/class 3 |         | 40              |         | 40              |         | 40              |             |
| 350 training images | Class 1         | 18      |                 | 17      |                 | 18      |                 | 96.1 ± 1.0% |
|                     | Class 2/class 3 |         | 40              |         | 40              |         | 40              |             |
| 700 training images | Class 1         | 19      |                 | 20      |                 | 19      |                 | 98.3 ± 0.0% |
|                     | Class 2/class 3 |         | 40              |         | 39              |         | 40              |             |
(a) Mean value and standard deviation of three test runs with a random set of training and validation images.
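
The precision values in the last column follow the standard definition P = TP/(TP + FP). A minimal sketch of how such values can be computed from confusion matrices with true classes in rows and predicted classes in columns, as in Table VI (the example matrices are placeholders, not the measured data):

```python
import numpy as np

def precision(cm, positive=0):
    """Precision TP / (TP + FP) for the class with index `positive`;
    cm[i, j] counts images of true class i predicted as class j."""
    tp = cm[positive, positive]
    fp = cm[:, positive].sum() - tp   # other classes predicted as positive
    return tp / (tp + fp)

# Mean and standard deviation over three test runs (placeholder values)
runs = [np.array([[19, 1], [0, 40]]),
        np.array([[20, 0], [1, 39]]),
        np.array([[19, 1], [0, 40]])]
values = [precision(cm) for cm in runs]
print(f"{np.mean(values):.1%} ± {np.std(values):.1%}")
```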

TABLE VII.

Executed steps in image processing, shown for an exemplary laser cut of each object class. For model training, the original images were labeled manually with image masks. During testing, the images were labeled automatically by the Mask R-CNN and binarized masks were derived.

| Processing step | Class 1 | Class 2 | Class 3 |
| Original image(a) | (image) | (image) | (image) |
| Image labeled by a human expert | (image) | (image) | (image) |
| Binarized version of an image labeled by a human expert | (image) | (image) | (image) |
| Automatically segmented image with bounding box and pixel-level mask | (image) | (image) | (image) |
| Binarized automatically segmented image | (image) | (image) | (image) |
| Binarized automatically segmented image corrected using height values | No quantitative evaluation because categorized as defective laser cut | (image) | (image) |
| Binarized automatically segmented image corrected by deletion of the 50 uppermost and lowermost pixel lines | No quantitative evaluation because categorized as defective laser cut | (image) | (image) |
| Automatically segmented height image created by overlaying the binarized segmentation mask on the topography data | No quantitative evaluation because categorized as defective laser cut | (image) | (image) |
(a) Scale bars are inserted to facilitate the reader's understanding of the dimensions but are not included in the automatically processed images.
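
The corrections listed in the last three rows of Table VII can be sketched as follows: the pixel-level mask from the Mask R-CNN is binarized, implausible pixels are removed using the height values, the 50 uppermost and lowermost pixel lines are deleted, and the remaining mask is overlaid on the confocal topography data to extract quantitative metrics of the melt superelevation. A minimal sketch, assuming a pixel-score mask and a height map as NumPy arrays of equal shape (the score and height thresholds are illustrative, not the values used in the study):

```python
import numpy as np

def evaluate_cut_edge(mask_scores, topography, score_threshold=0.5,
                      border=50, min_height=5.0):
    """Derive melt-superelevation metrics from a Mask R-CNN pixel-score
    mask and the corresponding confocal height map (heights in um)."""
    mask_scores = np.asarray(mask_scores)
    topography = np.asarray(topography)
    # Binarize the pixel-level segmentation mask.
    mask = mask_scores >= score_threshold
    # Correct the mask using height values (drop implausibly low pixels).
    mask &= topography >= min_height
    # Delete the 50 uppermost and lowermost pixel lines.
    mask[:border, :] = False
    mask[-border:, :] = False
    # Overlay the mask on the topography data and compute metrics.
    heights = topography[mask]
    if heights.size == 0:
        return None  # e.g., defective cut: no quantitative evaluation
    return {"max_height": float(heights.max()),
            "mean_height": float(heights.mean()),
            "melt_area_px": int(mask.sum())}
```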

1. Anantharaman, R., Velazquez, M., and Lee, Y., "Utilizing Mask R-CNN for detection and segmentation of oral diseases," in 2018 IEEE International Conference on Bioinformatics and Biomedicine, edited by H. Zheng, Z. Callejas, D. Griol, H. Wang, X. Hu, H. Schmidt, J. Baumbach, J. Dickerson, and L. Zhang (IEEE, Piscataway, NJ, 2018), pp. 2197–2204.
3. Badmos, O., Kopp, A., Bernthaler, T., and Schneider, G., "Image-based defect detection in lithium-ion battery electrode using convolutional neural networks," J. Intell. Manuf. 31, 885–897 (2020).
4. Bengio, Y., "Deep learning of representations for unsupervised and transfer learning," in Proceedings of the ICML Workshop on Unsupervised and Transfer Learning, Bellevue, WA, edited by I. Guyon, G. Dror, V. Lemaire, G. Taylor, and D. Silver (JMLR.org, 2012), pp. 17–37.
5. Bharati, P., and Pramanik, A., "Deep learning techniques—R-CNN to mask R-CNN: A survey," in Computational Intelligence in Pattern Recognition (Springer, Singapore, 2020), pp. 657–668.
6. Bocksrocker, O., "Method for processing a lithium foil or a lithium-coated metal foil by a laser beam," US 2022/0234140 A1 (2022) (last accessed May 12, 2023).
7. Bovatsek, J., Tamhankar, A., Patel, R. S., Bulgakova, N. M., and Bonse, J., "Thin film removal mechanisms in ns-laser processing of photovoltaic materials," Thin Solid Films 518, 2897–2904 (2010).
8. Chai, J., Zeng, H., Li, A., and Ngai, E. W. T., "Deep learning in computer vision: A critical review of emerging techniques and application scenarios," Mach. Learn. Appl. 6, 100134 (2021).
9. Courtier, A. F., McDonnell, M., Praeger, M., Grant-Jacob, J. A., Codemard, C., Harrison, P., Mills, B., and Zervas, M., "Modelling of fibre laser cutting via deep learning," Opt. Express 29, 36487–36502 (2021).
10. Deng, L., "A tutorial survey of architectures, algorithms, and applications for deep learning," SIP 3, 1–29 (2014).
11. Duffner, F., Kronemeyer, N., Tuebke, J., Leker, J., Winter, M., and Schmuch, R., "Post-lithium-ion battery cell production and its compatibility with lithium-ion cell production infrastructure," Nat. Energy 6, 123–134 (2021).
12. Frith, J. T., Lacey, M. J., and Ulissi, U., "A non-academic perspective on the future of lithium-based batteries," Nat. Commun. 14, 420 (2023).
13. Gené-Mola, J., Sanz-Cortiella, R., Rosell-Polo, J. R., Morros, J.-R., Ruiz-Hidalgo, J., Vilaplana, V., and Gregorio, E., "Fruit detection and 3D location using instance segmentation neural networks and structure-from-motion photogrammetry," Comput. Electron. Agric. 169, 105165 (2020).
14. Gireaud, L., Grugeon, S., Laruelle, S., Yrieix, B., and Tarascon, J.-M., "Lithium metal stripping/plating mechanisms studies: A metallurgical approach," Electrochem. Commun. 8, 1639–1649 (2006).
15. Girshick, R., "Fast R-CNN," in 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile (IEEE, Piscataway, NJ, 2015), pp. 1440–1448.
16. Girshick, R., Donahue, J., Darrell, T., and Malik, J., "Rich feature hierarchies for accurate object detection and semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH (IEEE, Piscataway, NJ, 2014), pp. 580–587.
17. Gonzalez, S., Arellano, C., and Tapia, J. E., "Deepblueberry: Quantification of blueberries in the wild using instance segmentation," IEEE Access 7, 105776–105788 (2019).
18. Grady, H. R., "Lithium metal for the battery industry," J. Power Sources 5, 127–135 (1980).
19. Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., and Lew, M. S., "Deep learning for visual understanding: A review," Neurocomputing 187, 27–48 (2016).
20. Guo, F., Qian, Y., Wu, Y., Leng, Z., and Yu, H., "Automatic railroad track components inspection using real-time instance segmentation," Comput.-Aided Civ. Infrastruct. Eng. 36, 362–377 (2021).
21. Hartl, R., Landgraf, J., Spahl, J., Bachmann, A., and Zaeh, M. F., "Automated visual inspection of friction stir welds: A deep learning approach," in Multimodal Sensing: Technologies and Applications, edited by E. Stella (SPIE, Bellingham, WA, 2019), Vol. 11059, pp. 1105909-1–1105909-24.
22. He, K., Gkioxari, G., Dollár, P., and Girshick, R., "Mask R-CNN," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy (IEEE, Piscataway, NJ, 2017), pp. 2961–2969.
23. He, K., Zhang, X., Ren, S., and Sun, J., "Deep residual learning for image recognition," in Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV (IEEE, Piscataway, NJ, 2016), pp. 770–778.
24. He, Y., Ren, X., Xu, Y., Engelhard, M. H., Li, X., Xiao, J., Liu, J., Zhang, J.-G., Xu, W., and Wang, C., "Origin of lithium whisker formation and growth under stress," Nat. Nanotechnol. 14, 1042–1047 (2019).
25. Hille, L., Hoffmann, P., Kriegler, J., Mayr, A., and Zaeh, M. F., "Automated geometry characterization of laser-structured battery electrodes," Prod. Eng. 17, 773–783 (2023).
26. Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., Fischer, I., Wojna, Z., Song, Y., Guadarrama, S., and Murphy, K., "Speed/accuracy trade-offs for modern convolutional object detectors," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI (IEEE, Piscataway, NJ, 2017), pp. 3296–3297.
27. Jansen, T., Blass, D., Hartwig, S., and Dilger, K., "Processing of advanced battery materials—Laser cutting of pure lithium metal foils," Batteries 4, 37 (2018).
28. Jansen, T., Kandula, M. W., Blass, D., Hartwig, S., Haselrieder, W., and Dilger, K., "Evaluation of the separation process for the production of electrode sheets," Energy Technol. 8, 1–11 (2019).
29. Kasemchainan, J., Zekoll, S., Jolly, D. S., Ning, Z., Hartley, G. O., Marrow, J., and Bruce, P. G., "Critical stripping current leads to dendrite formation on plating in lithium anode solid electrolyte cells," Nat. Mater. 18, 1105–1111 (2019).
30. Krauskopf, T., Richter, F. H., Zeier, W. G., and Janek, J., "Physicochemical concepts of the lithium metal anode in solid-state batteries," Chem. Rev. 120, 7745–7794 (2020).
31. Kriegler, J., Binzer, M., and Zaeh, M. F., "Process strategies for laser cutting of electrodes in lithium-ion battery production," J. Laser Appl. 33, 012006 (2021).
32. Kriegler, J., Duy Nguyen, T. M., Tomcic, L., Hille, L., Grabmann, S., Jaimez-Farnham, E. I., and Zaeh, M. F., "Processing of lithium metal for the production of post-lithium-ion batteries using a pulsed nanosecond fiber laser," Results Mater. 15, 100305 (2022).
33. Kriegler, J., Jaimez-Farnham, E., Scheller, M., Dashjav, E., Konwitschny, F., Wach, L., Hille, L., Tietz, F., and Zaeh, M. F., "Design, production, and characterization of three-dimensionally-structured oxide-polymer composite cathodes for all-solid-state batteries," Energy Storage Mater. 57, 607–617 (2023).
34. Krizhevsky, A., Sutskever, I., and Hinton, G. E., "ImageNet classification with deep convolutional neural networks," Commun. ACM 60, 84–90 (2017).
35. LeCun, Y., Bengio, Y., and Hinton, G., "Deep learning," Nature 521, 436–444 (2015).
36. Lee, D., and Suk, J., "Laser cutting characteristics on uncompressed anode for lithium-ion batteries," Energies 13, 2630 (2020).
37. Leitz, K.-H., Redlingshoefer, B., Reg, Y., Otto, A., and Schmidt, M., "Metal ablation with short and ultrashort laser pulses," Phys. Procedia 12, 230–238 (2011).
38. Lin, D., Liu, Y., and Cui, Y., "Reviving the lithium metal anode for high-energy batteries," Nat. Nanotechnol. 12, 194–206 (2017).
39. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L., "Microsoft COCO: Common objects in context," in Computer Vision—ECCV 2014, edited by D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars (Springer International Publishing, Cham, 2014), Vol. 8693, pp. 740–755.
40. Lutey, A. H., Sozzi, M., Carmignato, S., Selleri, S., Cucinotta, A., and Molari, P. G., "Nanosecond and sub-nanosecond pulsed laser ablation of thin single and multi-layer packaging films," Appl. Surf. Sci. 285, 300–308 (2013).
41. Lutey, A. H. A., Fortunato, A., Ascari, A., Carmignato, S., and Leone, C., "Laser cutting of lithium iron phosphate battery electrodes: Characterization of process efficiency and quality," Opt. Laser Technol. 65, 164–174 (2015).
42. Masubuchi, S., Watanabe, E., Seo, Y., Okazaki, S., Sasagawa, T., Watanabe, K., Taniguchi, T., and Machida, T., "Deep-learning-based image segmentation integrated with optical microscopy for automatically searching for two-dimensional materials," npj 2D Mater. Appl. 4, 3 (2020).
43. Meijer, J., "Laser beam machining (LBM): State of the art and new opportunities," J. Mater. Process. Technol. 149, 2–17 (2004).
44. Mills, B., Heath, D. J., Grant-Jacob, J. A., Xie, Y., and Eason, R. W., "Image-based monitoring of femtosecond laser machining via a neural network," J. Phys.: Photonics 1, 015008 (2019).
45. Mishra, S., and Yadava, V., "Laser beam micromachining (LBMM)—A review," Opt. Lasers Eng. 73, 89–122 (2015).
46. Ng, H.-F., "Automatic thresholding for defect detection," Pattern Recognit. Lett. 27, 1644–1649 (2006).
47. Padilla, R., Netto, S. L., and Da Silva, E. A. B., "A survey on performance metrics for object-detection algorithms," in Proceedings of the 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), edited by A. C. Paiva, A. Conci, G. Braz, Jr., J. D. S. Almeida, and L. A. F. Fernandes (IEEE, Piscataway, NJ, 2020), pp. 217–242.
48. Placke, T., Kloepsch, R., Duehnen, S., and Winter, M., "Lithium ion, lithium metal, and alternative rechargeable battery technologies: The odyssey for high energy density," J. Solid State Electrochem. 21, 1939–1964 (2017).
49. Qiao, Y., Truman, M., and Sukkarieh, S., "Cattle segmentation and contour extraction based on mask R-CNN for precision livestock farming," Comput. Electron. Agric. 165, 104958 (2019).
50. Ren, S., He, K., Girshick, R., and Sun, J., "Faster R-CNN: Towards real-time object detection with region proposal networks," in Advances in Neural Information Processing Systems: 29th Annual Conference on Neural Information Processing Systems, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (Neural Information Processing Systems Foundation, La Jolla, CA, 2015), Vol. 39, pp. 1137–1149.
51. Ronneberger, O., Fischer, P., and Brox, T., "U-Net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015, edited by N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi (Springer Nature, Cham, 2015), Vol. 9351, pp. 234–241.
52. Schnell, J., Guenther, T., Knoche, T., Vieider, C., Koehler, L., Just, A., Keller, M., Passerini, S., and Reinhart, G., "All-solid-state lithium-ion and lithium metal batteries—Paving the way to large-scale production," J. Power Sources 382, 160–175 (2018).
53. Smith, M. L., Smith, L. N., and Hansen, M. F., "The quiet revolution in machine vision—A state-of-the-art survey paper, including historical review, perspectives, and future directions," Comput. Ind. 130, 103472 (2021).
54. Stumper, B., Mayr, A., Mosler, K., Kriegler, J., and Daub, R., "Investigation of the direct contact prelithiation of silicon-graphite composite anodes for lithium-ion batteries," J. Electrochem. Soc. 170, 060518 (2023).
55. Tan, D. H. S., Meng, Y. S., and Jang, J., "Scaling up high-energy-density sulfidic solid-state batteries: A lab-to-pilot perspective," Joule 6, 1755–1769 (2022).
56. Varzi, A., Thanner, K., Scipioni, R., Di Lecce, D., Hassoun, J., Doerfler, S., Altheus, H., Kaskel, S., Prehal, C., and Freunberger, A. S., "Current status and future perspectives of lithium metal batteries," J. Power Sources 480, 228803 (2020).
57. Waleed, A., "Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow" (2017), see https://github.com/matterport/Mask_RCNN (last accessed December 13, 2022).
58. Wang, R., Cheung, C. F., Wang, C., and Cheng, M. N., "Deep learning characterization of surface defects in the selective laser melting process," Comput. Ind. 140, 103662 (2022).
59. Weber, D. A., "Coating for a tool for handling lithium metal, tool and method for producing such a tool," WO 2019/162314 A1 (2019), see https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2019162314.
60. Wuerschinger, H., Muehlbauer, M., Winter, M., Engelbrecht, M., and Hanenkamp, N., "Implementation and potentials of a machine vision system in a series production using deep learning and low-cost hardware," Procedia CIRP 90, 611–616 (2020).
61. Xie, Y., Heath, D. J., Grant-Jacob, J. A., Mackay, B. S., McDonnell, M. D. T., Praeger, M., Eason, R. W., and Mills, B., "Deep learning for the monitoring and process control of femtosecond laser machining," J. Phys.: Photonics 1, 035002 (2019).
62. Xu, B., Wang, W., Falzon, G., Kwan, P., Guo, L., Chen, G., Tait, A., and Schneider, D., "Automated cattle counting using mask R-CNN in quadcopter vision system," Comput. Electron. Agric. 171, 105300 (2020a).
63. Xu, L., Lu, Y., Zhao, C.-Z., Yuan, H., Zhu, G.-L., Hou, L.-P., Zhang, Q., and Huang, J.-Q., "Toward the scale-up of solid-state lithium metal batteries: The gaps between lab-level cells and practical large-format batteries," Adv. Energy Mater. 6, 2002360 (2020b).
64. Xu, W., Wang, J., Ding, F., Chen, X., Nasybulin, E., Zhang, Y., and Zhang, J.-G., "Lithium metal anodes for rechargeable batteries," Energy Environ. Sci. 7, 513–537 (2014).
65. Xu, X., Zhao, M., Shi, P., Ren, R., He, X., Wei, X., and Yang, H., "Crack detection and comparison study based on faster R-CNN and mask R-CNN," Sensors 22, 1215 (2022).
66. Yang, Y., He, Y., Guo, H., Chen, Z., and Zhang, L., "Semantic segmentation supervised deep-learning algorithm for welding-defect detection of new energy batteries," Neural Comput. Appl. 34, 19471–19484 (2022).
67. Yang, Y., Pan, L., Ma, J., Yang, R., Zhu, Y., Yang, Y., and Zhang, L., "A high-performance deep learning algorithm for the automated optical inspection of laser welding," Appl. Sci. 10, 933 (2020a).
68. Yang, Y., Yang, R., Pan, L., Ma, J., Zhu, Y., Diao, T., and Zhang, L., "A lightweight deep learning algorithm for inspection of laser welding defects on safety vent of power battery," Comput. Ind. 123, 103306 (2020b).
69. Yosinski, J., Clune, J., Bengio, Y., and Lipson, H., "How transferable are features in deep neural networks?," in Advances in Neural Information Processing Systems 27 (NIPS 2014), Montreal, Quebec, Canada (MIT Press, Cambridge, MA, 2014), Vol. 27, pp. 3320–3328.
70. Zhu, Y., Yang, R., He, Y., Ma, J., Guo, H., Yang, Y., and Zhang, L., "A lightweight multiscale attention semantic segmentation algorithm for detecting laser welding defects on safety vent of power battery," IEEE Access 9, 39245–39254 (2021).