Severe hearing loss can be treated with a surgically implanted electrical device called a cochlear implant (CI). CI users struggle to perceive complex audio signals such as music; however, previous studies show that CI recipients find music more enjoyable when the vocals are enhanced with respect to the background music. In this manuscript, source separation (SS) algorithms are used to remix pop songs by applying gain to the lead singing voice. Four approaches are considered: deep convolutional auto-encoders, a deep recurrent neural network, a multilayer perceptron (MLP), and non-negative matrix factorization. They are evaluated objectively and subjectively through two different perceptual experiments involving normal-hearing subjects and CI recipients. The evaluation assesses the relevance of the artifacts introduced by the SS algorithms alongside their computation time, because this study aims to propose one of the algorithms for real-time implementation. Results show that the MLP performs robustly across the tested data while producing levels of distortion and artifacts that CI users do not perceive. Thus, an MLP is proposed for real-time monaural audio SS to remix music for CI users.
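The remixing step described in the abstract, applying gain to the separated lead vocals before recombining them with the background, can be illustrated with a minimal sketch. It assumes the separation has already been performed by some SS algorithm and that both estimates are available as equal-length lists of samples. The function name `remix_with_vocal_gain` and the parameter `gain_db` are hypothetical and do not come from the paper.

```python
def remix_with_vocal_gain(vocals, accompaniment, gain_db):
    """Boost the estimated vocals by gain_db (in dB) and add the accompaniment.

    Assumption: both inputs are equal-length sequences of float samples
    produced by a separate source separation step (not shown here).
    """
    if len(vocals) != len(accompaniment):
        raise ValueError("vocals and accompaniment must have the same length")
    gain = 10.0 ** (gain_db / 20.0)  # convert dB to a linear amplitude factor
    return [gain * v + a for v, a in zip(vocals, accompaniment)]


# Toy samples standing in for the outputs of a separation algorithm.
vocals = [0.1, -0.2, 0.3]
accompaniment = [0.05, 0.05, -0.1]
remixed = remix_with_vocal_gain(vocals, accompaniment, gain_db=6.0)
```

With a gain of 0 dB the remix is simply the sum of the two estimates; positive values raise the vocals relative to the background. In practice, the output would also need to be checked for clipping after the boost.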
June 19 2018
Deep learning models to remix music for cochlear implant users
Tom Gajęcki, Waldo Nogueira; Deep learning models to remix music for cochlear implant users. J. Acoust. Soc. Am. 1 June 2018; 143 (6): 3602–3615. https://doi.org/10.1121/1.5042056