This study aimed to develop a reliable machine learning algorithm to classify steel castings for tire reinforcement according to the number and properties of their inclusions, which were determined experimentally. A total of 855 castings were available for training, validation and testing. The 140 parameters monitored during fabrication are the features of the analysis; the output is 1 or 0 depending on whether the casting is rejected or not. The following algorithms were employed: Logistic Regression, K-Nearest Neighbors, Support Vector Classifier, Random Forest, AdaBoost, Gradient Boosting and Artificial Neural Networks. Because the rejection rate is low, classification must be carried out on an imbalanced dataset. Resampling methods and scores suited to imbalanced datasets (Recall, Precision and AUC rather than Accuracy) were therefore used. Random Forest was the most successful method, providing an area under the curve of 0.85 on the test set. No significant improvements were detected after resampling. The results show that this tool allows the samples with a higher probability of being rejected to be selected, improving the effectiveness of the quality control. In addition, the optimized Random Forest made it possible to identify the most important features, which have been satisfactorily interpreted on a metallurgical basis.
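The preference for Recall, Precision and AUC over Accuracy on an imbalanced dataset can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 20 labels, the scores, the 0.5 threshold and the helper names (`confusion_counts`, `roc_auc`, `top_k`) are hypothetical and are not the study's data or code. It uses plain Python rather than any particular machine learning library.

```python
# Synthetic illustration only: these labels and scores are invented,
# not taken from the study's 855 castings.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = rejected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def recall(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0

def roc_auc(y_true, scores):
    """AUC as the fraction of (rejected, accepted) pairs in which the
    rejected casting gets the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def top_k(scores, k):
    """Indices of the k castings with the highest predicted scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Hypothetical batch: 2 rejected castings out of 20 (10 % rejection rate).
y_true = [1, 1] + [0] * 18
# Hypothetical rejection probabilities from some classifier.
scores = [0.80, 0.35, 0.60, 0.30, 0.20] + [0.10] * 15

always_accept = [0] * len(y_true)                    # trivial baseline
thresholded = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

for name, pred in [("always accept", always_accept),
                   ("thresholded", thresholded)]:
    print(name,
          "accuracy:", accuracy(y_true, pred),
          "recall:", recall(y_true, pred),
          "precision:", precision(y_true, pred))

print("AUC:", roc_auc(y_true, scores))
print("castings to inspect first:", top_k(scores, 3))
```

In this toy batch the trivial "always accept" rule and the thresholded classifier reach the same Accuracy, yet only the second one finds any rejected casting, which is what Recall and Precision expose. Ranking the castings by score, as in `top_k`, mirrors the idea of sending the samples most likely to be rejected to quality control first.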
