The relationship between crowd noise and crowd behavioral dynamics is relatively unexplored. Signal processing and machine learning (ML) may be useful for classifying and predicting crowd emotional state. This paper describes the use of both supervised and unsupervised ML methods to automatically distinguish types of crowd noise. Features include A-weighted spectral levels, low-level audio signal parameters, and Mel-frequency cepstral coefficients. For the unsupervised approach, K-means clustering is applied to the spectral levels and yields six distinct clusters: four correspond to different amounts of crowd involvement, and two correspond to different amounts of band or public address system noise. For the supervised approach, random forests are used, and validation and testing accuracies are found to be similar. These investigations help differentiate types of crowd noise, a necessary step for future work on automatically determining and classifying crowd emotional state.
