The management and disposal of waste is a severe social issue and an essential part of ecological sustainability. As an important component of the green, low-carbon, and recycling economic system, the identification and classification of recyclable waste is the premise of its reuse and energy conservation. The main issues at hand are to improve the classification accuracy and reliability of recyclable waste and to achieve automatic classification. The methods based on physical characteristics and image-based methods are inaccurate and unreliable. The current spectroscopy methods need to process the detected samples in advance, unsuitable for automatic detection. Based on material composition properties, the Laser-Induced Breakdown Spectroscopy (LIBS) technology is here proposed to accurately and reliably identify and classify recyclable waste into six categories at the level of consumer, such as paper, plastic, glass, metal, textile, and wood. The method is also used to subclassify the same category of waste for reuse at the level of a recycling factory. We subclassified metals into iron, stainless steel, copper, and aluminum and plastics into polyvinylchloride, polyoxymethylene, acrylonitrile-butadiene-styrene, polyamide, polyethylene, and polytetrafluoroethylene. The drop-dimension methods of LIBS spectra of waste were researched to eliminate noise and redundant information by principal component analysis (PCA) and linear discriminant analysis (LDA), respectively. Their clustering effects were analyzed to choose a suitable dimension. Combining the random forest (RF), back propagation neural network (BPNN), and convolutional neural network (CNN), we established and compared five classification models, PCA + RF, PCA + BPNN, LDA + RF, LDA + BPNN, and 1D-CNN. For the classification of six categories, the accuracies of proposed classification models are all more than 96%, and LDA(5D) + RF has 100% accuracy and optimal classification performance indices. 
For the subclassification of metals and plastics, PCA(9D) + RF has the highest classification accuracies of 98.77% and 99.52%, respectively.
I. INTRODUCTION
Population growth and improved living standards have led to a sharp increase in both the variety and the quantity of waste. If waste is not properly managed and disposed of, it causes serious environmental pollution and loss of resources. Waste management is a severe and steadily growing social issue and an essential part of social, economic, and ecological sustainability.1 To maximize the reuse of material, waste products have to be disposed of according to their material properties. The identification and classification of waste based on material composition is therefore the key step to realizing its reuse.
Most waste identification and classification methods are based on differences in physical properties, such as gravity separation,2 magnetic separation,3 electrostatic separation,4 and eddy current separation (ECS).5 These methods are simple but can detect only a few categories of waste, and their classification results are too inaccurate to support waste reuse. With the development of image processing and spectroscopic techniques, more categories of waste can be identified and classified by obtaining their physical and chemical information, for example, by image-based waste classification,6–10 hyperspectral image (HSI) classification,11 near-infrared and infrared spectroscopy (NIRS and IRS) classification,12,13 and fluorescence spectroscopy classification.14 Image-based waste classification methods require images of a large amount of waste to train machine learning and deep learning algorithms and establish classification models. However, because of the influence of background, lighting, camera angle, the geometry and color of the waste, and other factors, their classification results are unstable and their accuracy depends on the sample dataset. Spectroscopic techniques identify and classify waste by spectral information, which is not affected by the ambient environment or the geometry of the waste. NIRS (IRS) classifies waste according to its absorption and reflection in the near-infrared (infrared) band. HSI integrates imaging and spectral analysis to classify waste by detecting the reflection and absorption of light at different wavelengths, providing both spectral and spatial information. Fluorescence spectroscopy distinguishes the categories of waste by differences in the laser-induced fluorescence spectra of their molecules. However, surface stains on waste samples affect the reflection spectra, so NIRS (IRS) and HSI require the samples to be cleaned and clipped in advance. Fluorescence spectroscopy also requires the waste samples to be ground and chipped, and its classification results are affected by the light source and by the color and surface stains of the samples. These methods therefore cannot realize real-time identification and classification.
Laser-Induced Breakdown Spectroscopy (LIBS) is an atomic emission spectroscopy technology that detects the elemental composition of a sample from its emission spectra. It is not affected by the ambient environment and light or by the shape and color of the samples, and it requires no sample preprocessing. Combined with machine learning, LIBS technology has been applied to classify samples within a single category, such as iron ore,15 rock,16,17 plastic,18 coal,19 soil,20 ceramics,21,22 and alloys.23 Recyclable waste is a special and complex kind of sample because it contains diverse materials. LIBS technology can not only classify the different categories of waste but also simultaneously identify and subclassify different waste samples within the same category for waste reuse. In this paper, LIBS technology is used to collect the spectra of 80 recyclable waste samples and to classify them into paper, plastic, glass, metal, textile, and wood according to their LIBS spectra. Considering the recycling and reuse of metals and plastics in industry, we subclassify metal into iron, stainless steel, copper, and aluminum and plastic into polyvinylchloride (PVC), polyoxymethylene (POM), acrylonitrile-butadiene-styrene (ABS), polyamide (PA), polyethylene (PE), and polytetrafluoroethylene (PTFE). Because LIBS spectral data are high-dimensional and contain a large amount of redundant information, the dimensions of the collected full spectra are dropped by principal component analysis (PCA) and linear discriminant analysis (LDA). The drop-dimensional spectra are used as input to the machine learning models random forest (RF) and back propagation neural network (BPNN) for training and prediction. Deep learning, a branch of machine learning, can learn the features of data automatically in one pass and simplify machine-learning workflows. Thus, a convolutional neural network (CNN) whose input is the full LIBS spectra is also used. The classification models PCA + RF, PCA + BPNN, LDA + RF, LDA + BPNN, and CNN are proposed, and their classification accuracies and stabilities are compared and analyzed using classification evaluation indices. The method could be used for the automatic identification and classification of recyclable waste.
II. SAMPLES AND METHODS
A. Samples and LIBS experimental system
The samples cover six categories of everyday recyclable waste, 80 samples in total: 12 paper samples, including books, paper boxes, etc.; 31 plastics, including plastic bottles, plastic combs, plastic boxes, etc.; 9 glasses, including lunch boxes, glass bottles, etc.; 12 metals, including cans, copper, aluminum, stainless steel standard samples (Standard No. YSBS37378-16), etc.; 8 textiles, including clothes, canvas bags, sheets, etc.; and 8 wood samples, including waste furniture fragments, ice cream sticks, wooden combs, etc. Some of the samples are shown in Fig. 1.
The LIBS system shown in Fig. 2 is composed of a 1064 nm pulsed Nd:YAG laser (pulse energy of 50 mJ, repetition rate of 20 Hz, pulse width of 5 ns), an Avaspec-ULS2048 spectrometer (wavelength range 200–1100 nm, spectral resolution of about 0.11 nm), and a digital delay generator (DG645). The high-energy laser beam is directed by a 45° total reflector and focused on the surface of the waste sample by lens 1, which has a focal length of 100 mm. The LIBS technique is based on the interaction of a laser pulse with a sample, which produces dielectric breakdown, i.e., the formation of a plasma. The plasma emission is collected by lens 2, which also has a focal length of 100 mm, and coupled by fiber to the spectrometer for spectral analysis. In the early stage of plasma formation, continuous spectra are generated that interfere with the extraction and analysis of the characteristic spectra. The working delay time between the laser and the spectrometer is controlled by the DG645 to avoid collecting the continuous spectra and to acquire characteristic spectra with a high signal-to-noise ratio. In the LIBS experiment, the optimal working delay time is 2 µs for most waste samples. In real detection, the detected objects are unknown, so the working delay time cannot be adjusted to the detected object. Therefore, for the method to be universally applicable, the delay time in this experiment is set to 2 µs for all samples. To avoid the influence of coatings and contaminants on the sample surface, several shots are fired at the same position of the sample until the change in spectral intensity falls within the range of 100–200 counts, at which point the spectra are regarded as stable. Then, the single-shot spectra are acquired. After five single-shot spectra are collected at one position, the sample is moved by a displacement stage so that the laser strikes a new position. In this experiment, 100 single-shot spectra are collected for each sample.
B. LIBS data preprocessing
The LIBS spectra are affected by experimental conditions, such as fluctuations of laser energy, thermal noise of the detector, dark current noise, and matrix effects, which reduce the accuracy of spectral analysis. The most commonly used spectral preprocessing method is to average the collected single-shot spectra. However, the average method cannot effectively reduce the influence of sample inhomogeneity. To make the analytical spectra contain more spectral information and to reduce the influence of random spectral fluctuation, we preprocess the raw spectra by random averaging with replacement (bootstrap).24 The bootstrap method is as follows: 100 single-shot spectra are collected from a waste sample; five spectra are randomly selected from these 100, with the same single-shot spectrum allowed to be selected repeatedly; and their average is taken as one bootstrap spectrum. We obtain 50 bootstrap spectra for each sample.
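The bootstrap averaging described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bootstrap_spectra`, the fixed random seed, and the synthetic 8-channel spectra are assumptions made for the example, while the counts (100 single shots, 5 per average, 50 bootstrap spectra) follow the text.

```python
import random

def bootstrap_spectra(single_shots, n_boot=50, n_pick=5, seed=0):
    """Return n_boot spectra, each the mean of n_pick shots drawn with replacement."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability only
    n_channels = len(single_shots[0])
    result = []
    for _ in range(n_boot):
        # choices() samples with replacement, so the same shot may be picked twice
        picked = rng.choices(single_shots, k=n_pick)
        mean = [sum(s[i] for s in picked) / n_pick for i in range(n_channels)]
        result.append(mean)
    return result

# Synthetic stand-in for the 100 single-shot spectra of one sample (8 channels each)
_noise = random.Random(1)
shots = [[100.0 + _noise.gauss(0, 10) for _ in range(8)] for _ in range(100)]
boot = bootstrap_spectra(shots)
print(len(boot), len(boot[0]))
```

Sampling with replacement is what distinguishes this from a plain block average: each bootstrap spectrum mixes shots from different positions on the sample.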
The spectral integral area is calculated by the trapezoidal integration method. After data preprocessing, the variation of the spectral integral area should be as small as possible. The relative standard deviation (RSD) of the spectral integral area is used to compare the average method and the bootstrap method, as shown in Fig. 3. For most waste samples, the RSDs of the spectral integral area obtained with the bootstrap method are much lower than those obtained with the average method. For the six categories of samples (paper, plastic, glass, metal, textile, and wood), the average RSDs of the bootstrap method are lower than those of the average method by 3.01%, 6.74%, 2.28%, 5.56%, 4.80%, and 8.44%, respectively.
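As a rough sketch of the two quantities used in this comparison, the code below computes a spectral integral area by the trapezoidal rule and the RSD of a set of areas. The function names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which form is used.

```python
import statistics

def trapezoid_area(wavelengths, intensities):
    """Integrate intensity over wavelength with the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(wavelengths)):
        width = wavelengths[i] - wavelengths[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area

def rsd_percent(values):
    """Relative standard deviation in percent (sample standard deviation assumed)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Toy spectrum: three wavelength points (nm) and their intensities (counts)
area = trapezoid_area([200.0, 201.0, 202.0], [1.0, 3.0, 1.0])
spread = rsd_percent([9.0, 10.0, 11.0])
print(area, spread)
```

A lower RSD over the areas of one sample's spectra indicates that the preprocessing has suppressed more of the shot-to-shot fluctuation.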
FIG. 3. RSDs of spectral integral areas of the bootstrap method and the average method: (a) paper, (b) plastic, (c) glass, (d) metal, (e) textile, and (f) wood.
C. Spectral drop-dimension algorithms and evaluation indices
There are 50 bootstrap spectra for each sample, i.e., 4000 spectra for the 80 samples in total, and each spectrum has 7250 dimensions. The LIBS spectra thus cover a wide wavelength range, form a high-dimensional dataset (4000 × 7250), and contain a lot of noise and redundant information, which would affect the classification accuracy. A spectral drop-dimension method can eliminate irrelevant information from the spectra and reduce their dimension, while the drop-dimensional spectra should retain the effective information and characteristics of the original spectra. Principal component analysis (PCA) and linear discriminant analysis (LDA) are used to drop the spectral dimension. PCA extracts the principal components (PCs) of the spectral data by eigendecomposition to eliminate data redundancy,25 and the PCs contain the main information of the original data. LDA is a supervised drop-dimension algorithm that extracts the canonical variables of the spectral data to drop their dimension based on the classification.26 The principle of LDA is to make samples of the same category as close as possible and samples of different categories as far apart as possible. Thus, the data dropped by LDA have a certain distinguishability that contributes to improving the classification accuracy. In this paper, the first k principal components extracted by PCA and the first k canonical variables extracted by LDA are both referred to as the first k dimensional data (kD).
D. Classification models and performance evaluation
Random forest (RF), back propagation neural network (BPNN), and convolutional neural network (CNN) are each used to establish classification models of recyclable waste. RF is an ensemble machine learning algorithm that combines “bootstrap aggregating” and “random subspace.”27 It is a classifier made up of multiple decision trees, each of which deals with a subset of the training samples. According to the voting rule, the class that receives the most votes among all decision trees is the final classification result. The prediction result is inaccurate only when more than half of the decision trees are incorrect; thus, the results of RF are stable and RF has a good ability to prevent overfitting. The performance of RF is related to the number of decision trees, the number of randomly selected features, and the maximum depth of each decision tree. In this paper, each decision tree is grown independently without pruning, and the number of features randomly selected at each node is set to the square root of the number of input variables. In theory, the more decision trees there are, the better the prediction results but the longer the training time. For the PCA + RF and LDA + RF models, 170 and 25 decision trees are selected, respectively.
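Two of the RF ingredients described above, the voting rule and the square-root feature rule, can be illustrated with a short sketch. The helper names are hypothetical, and rounding the square root down to an integer is an assumption, since the text does not state how non-integer values are handled.

```python
import math
from collections import Counter

def features_per_node(n_inputs):
    """Number of candidate features per split: floor of sqrt(n_inputs), at least 1."""
    return max(1, math.isqrt(n_inputs))

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Votes from five hypothetical trees for one spectrum
votes = ["metal", "glass", "metal", "metal", "wood"]
winner = majority_vote(votes)
print(winner, features_per_node(9), features_per_node(5))
```

With 9D PCA input the rule gives three candidate features per node, and with 5D LDA input it gives two under the rounding assumed here.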
BPNN is a multilayer feed-forward network trained by error back propagation and has the advantages of self-learning, self-adaptation, fault tolerance, and good generalization ability.28 During training on the dataset, the weights and thresholds of the BPNN are optimized so that the network error is minimized. The BPNN model consists of three layers: an input layer, a hidden layer, and an output layer. The hidden layer contains ten neurons with the tansig activation function. The output layer contains six neurons with the softmax activation function. The number of training steps is set to 10 000, the target error is set to 0.01, and the learning rate is set to 0.1.
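The forward pass of a network with this shape can be sketched as below. It is only an illustration of the layer sizes and activation functions named in the text: the weights are random rather than trained, the 5D input is an assumed example (as for LDA(5D) input), and tansig is written as `math.tanh`, to which it is mathematically equal.

```python
import math
import random

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def layer(x, weights, biases):
    """Affine map: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def forward(x, w1, b1, w2, b2):
    hidden = [math.tanh(v) for v in layer(x, w1, b1)]  # tansig hidden layer
    return softmax(layer(hidden, w2, b2))              # six class probabilities

rng = random.Random(0)
n_in, n_hidden, n_out = 5, 10, 6  # input size is an assumption; 10 and 6 follow the text
w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [0.0] * n_out

probs = forward([0.2, -0.1, 0.4, 0.0, 0.3], w1, b1, w2, b2)
predicted = probs.index(max(probs))
print(predicted, round(sum(probs), 6))
```

Training by back propagation would adjust `w1`, `b1`, `w2`, and `b2`; that step is omitted here.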
CNN has the ability to automatically extract sample features for classification. CNN is a feed-forward neural network consisting of multiple layers, such as convolutional and pooling layers. We design a one-dimensional convolutional neural network (1D-CNN) architecture that has two convolutional layers with ReLU activation function, two max pooling layers, a flatten layer, and two dense layers. The input data of the first convolutional layer are 7250 variables of the full LIBS spectra. The max-pooling layers follow the convolutional layers and have 64 kernels, the same as the convolutional layers. Following that, a dropout layer with a 30% dropout rate, a flatten layer, and a dense layer with a ReLU activation function are added to the model in sequence. The final layer is a dense layer that has six neurons with a softmax activation function. A categorical cross-entropy loss function and an Adam optimizer are utilized to adjust the trainable parameters of the 1D-CNN model.
K-fold cross-validation is used to evaluate the generalization ability of classification models. In this paper, K = 10, i.e., the dataset is divided into ten groups. Each group is treated as a prediction set in turn, and the remaining nine groups are treated as a training set.
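A minimal sketch of the tenfold split described above is given below. The function name `kfold_indices` and the shuffling with a fixed seed are assumptions; the sample count of 4000 and K = 10 follow the text.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold serves once as the prediction set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffling is an assumption
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

splits = list(kfold_indices(4000, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each spectrum appears in exactly one prediction fold, so every spectrum is predicted once by a model that did not see it during training.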
III. RESULTS AND DISCUSSION
The flow diagram of the recyclable waste classification is shown in Fig. 4. First, the collected LIBS spectral data are preprocessed by the bootstrap method, and then the dimensions of the preprocessed full-spectra data are dropped by PCA and LDA, respectively. Combined with the machine learning algorithms RF and BPNN, four classification models, PCA + RF, PCA + BPNN, LDA + RF, and LDA + BPNN, are proposed. The 1D-CNN model is established with the full spectra as input. The sample data are divided randomly into a training set and a prediction set in a ratio of 8:2, that is, 3200 sample data are used for training and 800 for prediction. The accuracies of the classification models are calculated from five prediction results. Finally, the AR, precision, recall, and F-measure of the classification models are analyzed and compared with tenfold cross-validation to evaluate their performance.
A. Spectral analysis
Samples of the same category with different elemental compositions and contents produce markedly different spectra and may be identified as different categories. Conversely, samples of different categories with similar elemental compositions may have similar spectral information and be assigned to the same category, resulting in misclassification. The LIBS spectra of some recyclable waste samples are shown in Fig. 5. The characteristic spectral lines of the elements in the waste samples are identified by reference to the National Institute of Standards and Technology (NIST) database. In Fig. 5(a), the express box and the textbook both belong to paper. Their spectra contain the same characteristic spectral lines of carbon (C) and oxygen (O), but the intensities of these lines differ. Different plastics contain different elements, including metallic elements, such as potassium (K), sodium (Na), magnesium (Mg), calcium (Ca), etc.; non-metallic elements, such as C, hydrogen (H), O, nitrogen (N), chlorine (Cl), etc.; and molecular bands (CC, CN).29 In Fig. 5(b), the mineral water bottle and the cosmetic bottle both belong to plastic. Characteristic spectral lines of Mg are detected in the mineral water bottle, while characteristic spectral lines of Ca are detected in the cosmetic bottle. The elemental composition of metal is diverse, as shown in Fig. 5(d): the element types, characteristic spectral lines, and characteristic peak intensities of the same elements are obviously different for the stainless steel and the can. As can be seen, for samples of the same category, the characteristic spectral lines of the main elements are similar, but the spectral profiles and characteristic peak intensities are different. Figure 5(g) shows the spectra of a sheet and timber, which belong to different categories, yet their spectra are similar and share many characteristic spectral lines.
FIG. 5. Spectra of recyclable waste samples. Spectra of different samples of the same category: (a) paper, (b) plastic, (c) glass, (d) metal, (e) textiles, and (f) wood. (g) Spectra of samples of different categories: textiles and wood.
B. Spectral drop-dimension analysis
PCA extracts the principal components (PCs) of spectra data to drop dimensions according to their contribution rates. The contribution rates of PCs are shown in Fig. 6(a). The contribution rates of the first three PCs are 71.86%, 10.27%, and 9.05%, respectively. The cumulative contribution rate of the first three PCs is 91.18%, i.e., the first three PCs contain the most information on spectral data. The maximum dimension of the spectral data dropped by LDA is k-1, where k is the number of categories. There are six categories of waste samples in our experiment, so the dimensions of spectral data dropped by LDA are 1, 2, 3, 4, and 5 (1D-5D), respectively.
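The two pieces of arithmetic in this paragraph can be checked with a short sketch: the cumulative contribution rate of the first three PCs and the upper bound on the number of LDA dimensions. The function names are hypothetical; the bound min(k − 1, number of features) is the standard LDA limit, which reduces to k − 1 here because the spectra have far more than five variables.

```python
from itertools import accumulate

def cumulative_contribution(rates):
    """Running sum of per-component contribution rates (in percent)."""
    return list(accumulate(rates))

def lda_max_dimension(n_classes, n_features):
    """LDA yields at most min(n_classes - 1, n_features) canonical variables."""
    return min(n_classes - 1, n_features)

rates = [71.86, 10.27, 9.05]  # first three PCs, from the text
cumulative = cumulative_contribution(rates)
print([round(c, 2) for c in cumulative])
print(lda_max_dimension(6, 7250))
```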
FIG. 6. (a) The contribution rates of the first 11 PCs. (b) 3D visualization of the spectra after drop-dimension by PCA. (c) 3D visualization of the spectra after drop-dimension by LDA.
PCA and LDA are each used to drop the original full-spectra data of the waste samples to three dimensions (3D), and the resulting 3D visualizations are shown in Figs. 6(b) and 6(c). In Fig. 6(b), processed by PCA, most samples of paper, plastic, and glass are clearly separated. Metal, wood, and glass are relatively close in three-dimensional space, and a few of them overlap. The textile samples overlap with other categories and cannot be distinguished. In Fig. 6(c), processed by LDA, samples of the same category are tightly clustered and samples of different categories are clearly separated. Although plastic and textile are completely separated in three-dimensional space, they are relatively close. Most samples of different categories can thus already be distinguished in three-dimensional space after spectral drop-dimension, which suggests that using the drop-dimensional spectra as the input of machine learning models should make the classification results more accurate.
The evaluation indices SCS, DBI, and CHS of PCA from 3D to 10D and those of LDA from 1D to 5D are calculated, as given in Tables I and II. The SCS and DBI are analyzed to select the optimal dimensions of PCA and LDA, and the SCS, DBI, and CHS together are used to compare the clustering effects of the spectra dropped by PCA and by LDA. In Table I, the SCS values increase and the DBI values decrease as the dimension increases up to 9D, indicating that, at higher dimensions, samples of the same category cluster more easily and samples of different categories disperse more. The SCS and DBI values of 8D-10D are close for PCA; at 9D, the SCS value is the largest (−0.353) and the DBI value is the smallest (2.082). Therefore, PCA at 9D shows the best clustering performance. However, the SCS values of PCA are all negative, which means that samples of different categories might not be completely separated by PCA.
TABLE I. The evaluation indices in dimension from 3D to 10D by PCA.
Dimension (D) | SCS | DBI | CHS |
---|---|---|---|
3 | −0.395 | 3.293 | 671.081 |
4 | −0.392 | 3.213 | 648.377 |
5 | −0.390 | 3.046 | 631.368 |
6 | −0.385 | 2.445 | 625.149 |
7 | −0.380 | 2.345 | 618.206 |
8 | −0.369 | 2.159 | 611.036 |
9 | −0.353 | 2.082 | 610.719 |
10 | −0.357 | 2.103 | 608.274 |
TABLE II. The evaluation indices in dimension from 1D to 5D by LDA.
Dimension (D) | SCS | DBI | CHS |
---|---|---|---|
1 | 0.622 | 0.471 | 10 936 153.211 |
2 | 0.928 | 0.188 | 6 866 558.198 |
3 | 0.930 | 0.170 | 4 639 716.830 |
4 | 0.995 | 0.077 | 3 504 870.973 |
5 | 0.996 | 0.059 | 2 816 554.031 |
For LDA in Table II, the SCS and DBI values change sharply from 1D to 2D, which means that the clustering of samples of the same category is poor in 1D. From 2D to 5D, the SCS values are at least 0.928 and the DBI values are at most 0.188, and the values at 4D and 5D are very close. At 5D, the SCS value is the largest (0.996, close to 1) and the DBI value is the smallest (0.059). Therefore, when the dimension of the LIBS spectra is dropped to 5D by LDA, the recyclable waste samples of different categories are best distinguished. Comparing 8D-10D PCA with 1D-5D LDA, the CHS values of LDA are much higher than those of PCA, and the other clustering evaluation indices of LDA are also better than those of PCA. We deduce that the clustering effect of the spectra dropped by LDA is better than that of the spectra dropped by PCA.
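The behavior of the SCS discussed above (negative values for overlapping categories, values near 1 for well-separated ones) matches that of the silhouette coefficient. The sketch below assumes that SCS denotes the mean silhouette coefficient, which the text does not state explicitly, and uses Euclidean distance and toy 2D points; it is an illustration of the index, not the authors' code.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient; requires at least two distinct labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items() if lab != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)

pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
separated = silhouette_score(pts, [0, 0, 1, 1])   # labels follow the two groups
mixed = silhouette_score(pts, [0, 1, 0, 1])       # labels cut across the groups
print(separated > 0.8, mixed < 0)
```

When labels follow the natural groups the score is close to 1, and when they cut across the groups it turns negative, which is the pattern seen for LDA and PCA in Tables II and I, respectively.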
C. Classification results
Based on the above results, the drop-dimensional spectra of 8D-10D PCA and 2D-5D LDA are used as the input variables of the RF and BPNN models, and the categories of recyclable waste are the outputs. The full LIBS spectra are used as the input variables of the 1D-CNN model. We establish five classification models: PCA + RF, PCA + BPNN, LDA + RF, LDA + BPNN, and 1D-CNN.
Table III lists the classification results of the models. For PCA(8D-10D) + RF, the accuracies of the training set are 100%, and those of the prediction set are at least 99.28%. For PCA(8D-10D) + BPNN, the accuracies of the training set and the prediction set are below 99%. For LDA(2D-5D) + RF, the accuracies of the training set are all 100%, and for LDA(4D-5D) + RF, the accuracies of the prediction set are also 100%. For LDA(2D-5D) + BPNN, the accuracies of the training set and the prediction set are at least 98.87%. For 1D-CNN, the accuracies of the training set and the prediction set are 96.82% and 96.63%, which are lower than those of all the other models except PCA(8D) + BPNN.
TABLE III. Classification results of PCA + RF, PCA + BPNN, LDA + RF, LDA + BPNN, and 1D-CNN. Note: ART is the accuracy of the training set; ARP is the accuracy of the prediction set.
Models | Dimension (D) | ART (%) | ARP (%) |
---|---|---|---|
PCA + RF | 8 | 100.00 | 99.28 |
PCA + RF | 9 | 100.00 | 99.53 |
PCA + RF | 10 | 100.00 | 99.45 |
PCA + BPNN | 8 | 96.11 | 96.08 |
PCA + BPNN | 9 | 98.93 | 98.72 |
PCA + BPNN | 10 | 98.27 | 97.92 |
LDA + RF | 2 | 100.00 | 99.85 |
LDA + RF | 3 | 100.00 | 99.95 |
LDA + RF | 4 | 100.00 | 100.00 |
LDA + RF | 5 | 100.00 | 100.00 |
LDA + BPNN | 2 | 99.01 | 98.87 |
LDA + BPNN | 3 | 99.31 | 99.07 |
LDA + BPNN | 4 | 99.95 | 99.96 |
LDA + BPNN | 5 | 100.00 | 100.00 |
1D-CNN | 7250 | 96.82 | 96.63 |
Table III shows that PCA(9D) + RF, PCA(9D) + BPNN, LDA(4D-5D) + RF, and LDA(5D) + BPNN have the highest accuracies. These models and 1D-CNN are evaluated by the average accuracy (AR), precision, recall, and F-measure with tenfold cross-validation to compare their stabilities and performances. The results are given in Table IV. LDA(4D-5D) + RF and LDA(5D) + BPNN perform better than PCA(9D) + RF, PCA(9D) + BPNN, and 1D-CNN. LDA(5D) + RF has the best classification performance, with every evaluation index of the tenfold cross-validation reaching 1. 1D-CNN has the lowest classification accuracy, precision, and F-measure. Overall, LDA drop-dimension to 5D combined with the RF model, i.e., the LDA(5D) + RF model, is optimal for classifying the recyclable waste.
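For reference, the evaluation indices named above can be computed from true and predicted labels as in the sketch below. Macro averaging over the categories is an assumption, since the text does not say how the per-category values are combined, and the function name and toy labels are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F-measure."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f_scores = [], [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F-measure is the harmonic mean of precision and recall
        f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_scores.append(f)
    n = len(classes)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f_scores) / n

y_true = ["paper", "paper", "glass", "glass"]
y_pred = ["paper", "glass", "glass", "glass"]
acc, prec, rec, f1 = macro_metrics(y_true, y_pred)
print(acc, prec, rec, f1)
```

In a tenfold cross-validation, these values would be computed on each prediction fold and then averaged.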
TABLE IV. The variation of evaluation indices of tenfold cross-validation.
Models | AR (%) | Precision | Recall | F-measure |
---|---|---|---|---|
PCA(9D) + RF | 99.47 | 0.9937 | 0.9934 | 0.9934 |
PCA(9D) + BPNN | 98.08 | 0.9833 | 0.9723 | 0.9772 |
LDA(4D) + RF | 99.93 | 0.9989 | 0.9997 | 0.9993 |
LDA(5D) + RF | 100.00 | 1.00 | 1.00 | 1.00 |
LDA(5D) + BPNN | 99.93 | 0.9968 | 0.972 | 0.9967 |
1D-CNN | 95.94 | 0.9597 | 0.9753 | 0.9657 |
D. Subclassification of metals and plastics
Waste samples of the same category can be further subclassified for recycling and reuse by the proposed methods. In the recycling industry, waste metal is usually subclassified into steel, aluminum alloy, copper, and stainless steel and then sent to different production procedures for reprocessing and reuse. In the LIBS spectra of metal, different metal samples have their own characteristic spectra. For example, stainless steel shows characteristic spectra of nickel (Ni), while cans show characteristic spectra of aluminum (Al), as shown in Fig. 5(d). We use the PCA(9D) + RF, PCA(9D) + BPNN, LDA(5D) + RF, LDA(5D) + BPNN, and 1D-CNN models to subclassify the ten metal samples of the experiment into four categories: iron, stainless steel, copper, and aluminum. The subclassification results of metals are given in Table V. The classification accuracies of the five models are all above 93%. The subclassification results of the PCA-based models are better than those of the LDA-based models, from which we infer that PCA is more suitable for dropping the dimension of metal LIBS spectra. As shown in Figs. 6(b) and 6(c), the metal samples lie close together in the LDA drop-dimensional space, while they are more scattered in the PCA space. This means that PCA can preliminarily identify and classify different metal samples. The results of PCA(9D) + RF, PCA(9D) + BPNN, and 1D-CNN are similar, and PCA(9D) + RF has the highest classification accuracy of 98.77%, with precision, recall, and F-measure of 0.9892, 0.9863, and 0.9871, respectively.
TABLE V. The variation of evaluation indices of tenfold cross-validation.
Models | AR (%) | Precision | Recall | F-measure |
---|---|---|---|---|
PCA(9D) + RF | 98.77 | 0.9892 | 0.9863 | 0.9871 |
PCA(9D) + BPNN | 98.70 | 0.9870 | 0.9882 | 0.9868 |
LDA(5D) + RF | 93.10 | 0.9627 | 0.9389 | 0.9434 |
LDA(5D) + BPNN | 96.19 | 0.9493 | 0.9520 | 0.9465 |
1D-CNN | 98.00 | 0.9881 | 0.9761 | 0.9817 |
We use the PCA(9D) + RF, PCA(9D) + BPNN, LDA(5D) + RF, LDA(5D) + BPNN, and 1D-CNN models to subclassify plastics into six categories: PVC, POM, ABS, PA, PE, and PTFE. The subclassification results of plastics are given in Table VI. The classification accuracies of the models are all at least 98%. PCA(9D) + RF has the highest classification accuracy of 99.52%, with precision, recall, and F-measure of 0.9970, 0.9910, and 0.9937, respectively.
TABLE VI. The variation of evaluation indices of tenfold cross-validation.
Models | AR (%) | Precision | Recall | F-measure |
---|---|---|---|---|
PCA(9D) + RF | 99.52 | 0.9970 | 0.9910 | 0.9937 |
PCA(9D) + BPNN | 98.61 | 0.9762 | 0.9799 | 0.9844 |
LDA(5D) + RF | 98.30 | 0.9889 | 0.9815 | 0.9857 |
LDA(5D) + BPNN | 98.25 | 0.9913 | 0.9870 | 0.9885 |
1D-CNN | 98.00 | 0.9732 | 0.9612 | 0.9672 |
E. Comparison of recyclable waste classification methods
The recyclable waste classification results of this paper are compared with those of other methods, namely image-based methods, the hyperspectral image method, near-infrared spectroscopy, infrared spectroscopy, and fluorescence spectroscopy, as given in Table VII. The image-based methods are combined with deep learning algorithms, and their highest classification accuracies are less than 96%.6,8 Among the spectroscopy-based methods, the accuracies of the hyperspectral image method are 99.33% and 99% with PCA-SAM (spectral matched discriminant analysis) and PCA-Fisher (Fisher discriminant analysis), respectively,11 the accuracy of near-infrared spectroscopy is 98.67%,12 the accuracies of infrared spectroscopy are between 91.6% and 98.1%,13 and the accuracies of the fluorescence spectroscopy method are relatively low, no more than 93.5%.14 The LIBS technology proposed in this paper has the highest classification accuracy, 100% with the LDA(5D) + RF model. It can not only classify recyclable waste but also subclassify a given category, such as plastic, for which the accuracy is 99.52% with the PCA(9D) + RF model.
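Table VII. Classification accuracies of different methods for recyclable waste.
Technology | Algorithm | Waste sample category | AR (%) | Reference |
---|---|---|---|---|
Image | InceptionV3 | Plastic, glass, paper, metal, and fabric | 92.97 | 6 |
Image | MobileNetV2 | Plastic, glass, paper, metal, and fabric | 92.77 | 6 |
Image | MobileNetV3 | Plastic, glass, paper, metal, and fabric | 94.12 | 6 |
Image | ResNet50 | Plastic, glass, paper, metal, and fabric | 92.52 | 6 |
Image | ResNet101 | Plastic, glass, paper, metal, and fabric | 92.71 | 6 |
Image | ResNet152 | Plastic, glass, paper, metal, and fabric | 92.53 | 6 |
Image | Xception | Plastic, glass, paper, metal, and fabric | 94.44 | 6 |
Image | ResNet 18 | Paper, glass, plastic, metal, and cardboard | 95.87 | 8 |
Hyperspectral image | PCA-SAM | Paper, plastic, wood | 99.3 | 11 |
Hyperspectral image | PCA-Fisher | Paper, plastic, wood | 99 | 11 |
Near-infrared spectroscopy | PCA-PSO-OL-ELM | Foam, plastic, brick, concrete mix, and wood | 98.67 | 12 |
Fluorescence spectroscopy | KNN | Industrial black plastic particles | 89.8 | 14 |
Fluorescence spectroscopy | ENSEMBLE | Industrial black plastic particles | 89.8 | 14 |
Fluorescence spectroscopy | SVM | Industrial black plastic particles | 86.8 | 14 |
Fluorescence spectroscopy | CNN | Industrial black plastic particles | 93.5 | 14 |
Infrared spectroscopy | PNN | Cellulose, vinyl-polymers, woods, and low-value residual wastes | 98.1 | 13 |
Infrared spectroscopy | GRNN | Cellulose, vinyl-polymers, woods, and low-value residual wastes | 94.4 | 13 |
Infrared spectroscopy | RF | Cellulose, vinyl-polymers, woods, and low-value residual wastes | 91.6 | 13 |
Infrared spectroscopy | SVM | Cellulose, vinyl-polymers, woods, and low-value residual wastes | 92.6 | 13 |
LIBS | PCA(9D) + RF | Paper, glass, plastic, metal, textile, and wood | 99.47 | |
LIBS | LDA(5D) + RF | Paper, glass, plastic, metal, textile, and wood | 100 | |
LIBS | 1D-CNN | Paper, glass, plastic, metal, textile, and wood | 95.94 | |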
IV. CONCLUSIONS
In this paper, recyclable waste was identified and classified by Laser-Induced Breakdown Spectroscopy (LIBS) technology. The LIBS spectra of recyclable waste samples were collected and preprocessed, and 4000 spectral data were obtained by the bootstrap method. PCA and LDA were each used to reduce the dimension of the full spectra data, and their clustering effects were analyzed with the evaluation indices SCS, DBI, and CHS. The drop-dimensional spectra of 8D-10D PCA and of 2D-5D LDA showed the better clustering effects and were used as input to the RF and BPNN models. The advantage of the 1D-CNN model is that no dimension reduction is necessary; its input is the full spectra data. The classification models PCA + RF, PCA + BPNN, LDA + RF, LDA + BPNN, and 1D-CNN were proposed based on LIBS technology. The accuracies of PCA(9D) + RF and PCA(9D) + BPNN were 99.53% and 98.72%, respectively. The accuracies of LDA(4D-5D) + RF and LDA(5D) + BPNN were 100%. The tenfold cross-validation method was used to compare and verify the classification models, and their accuracy, precision, recall, and F-measure were calculated and analyzed. The results showed that LDA(5D) + RF is the optimal classification model, with 100% accuracy and precision, recall, and F-measure all equal to 1. For the recycling and reuse of metal and plastic in industry, we subclassified metals and plastics with the proposed models, and the results showed that the PCA(9D) + RF model had the highest accuracy: 98.77% for metal and 99.52% for plastic. Therefore, LDA(5D) + RF is optimal for classifying recyclable waste, and PCA(9D) + RF is suitable for subclassifying metal and plastic. LIBS technology combined with drop-dimension algorithms and machine learning algorithms can realize high-precision identification, classification, and subclassification of recyclable waste, which provides a new automatic real-time detection method and technology for the environmental protection field.
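As an illustration of how a clustering index can guide the choice of dimension, the sketch below computes one such index in plain Python. It assumes that DBI denotes the Davies-Bouldin index, for which lower values indicate tighter and better separated clusters. The function name and the toy points are hypothetical and are not taken from the experiment.

```python
import math
from typing import Dict, List, Sequence


def davies_bouldin(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Davies-Bouldin index: mean over clusters of the worst-case ratio
    (s_i + s_j) / d_ij, where s is the mean distance of a cluster's points
    to its centroid and d_ij is the distance between centroids.
    Requires at least two clusters with distinct centroids."""
    clusters: Dict[int, List[Sequence[float]]] = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    centroids = {label: [sum(col) / len(members) for col in zip(*members)]
                 for label, members in clusters.items()}
    scatter = {label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
               for label, members in clusters.items()}

    keys = list(clusters)
    worst_ratios = []
    for i in keys:
        worst_ratios.append(max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in keys if j != i))
    return sum(worst_ratios) / len(keys)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    tight = [(0, 0), (0, 2), (4, 0), (4, 2)]  # compact clusters
    loose = [(0, 0), (0, 4), (4, 0), (4, 4)]  # same centroid spacing, wider spread
    print("tight:", davies_bouldin(tight, labels))
    print("loose:", davies_bouldin(loose, labels))
```

Under this assumption, a projection such as 9D PCA or 5D LDA would be preferred over another when its index is lower for the same class labels.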
ACKNOWLEDGMENTS
This work was supported by the National Natural Science Foundation of China (Grant No. 61903116), the Enterprise Cooperation Project (Grant No. W2022JSKF0096), and the National Key Instrument Development and Application Project (Grant No. 2013YQ220749).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Lei Yang: Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Methodology (equal); Writing – review & editing (equal). Yong Xiang: Conceptualization (equal); Data curation (equal); Methodology (equal); Software (equal); Writing – original draft (equal); Writing – review & editing (equal). Yinchuan Li: Visualization (equal). Wenyi Bao: Visualization (equal). Feng Ji: Formal analysis (equal). Jingtao Dong: Formal analysis (equal). Jingjing Chen: Conceptualization (equal). Mengjie Xu: Formal analysis (equal). Rongsheng Lu: Formal analysis (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.