Severe hearing loss can be treated by providing the affected person with a surgically implanted electrical device called a cochlear implant (CI). CI users struggle to perceive complex audio signals such as music; however, previous studies show that CI recipients find music more enjoyable when the vocals are enhanced relative to the background music. In this manuscript, source separation (SS) algorithms are used to remix pop songs by applying gain to the lead singing voice. Deep convolutional auto-encoders, a deep recurrent neural network, a multilayer perceptron (MLP), and non-negative matrix factorization are evaluated objectively and subjectively through two perceptual experiments involving normal-hearing subjects and CI recipients. The evaluation assesses the relevance of the artifacts introduced by the SS algorithms as well as their computation time, as this study aims to propose one of the algorithms for real-time implementation. Results show that the MLP performs robustly across the tested data while introducing levels of distortion and artifacts that are not perceived by CI users. Thus, the MLP is proposed for real-time monaural audio SS to remix music for CI users.
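The remixing step described in the abstract — boosting the separated lead vocals relative to the accompaniment before re-mixing — can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes the SS front end (e.g., an MLP-based separator) has already produced time-aligned vocal and accompaniment signals, and the function name and parameters are illustrative:

```python
import numpy as np

def remix(vocals, accompaniment, vocal_gain_db=6.0):
    """Remix a song by boosting the separated lead vocals.

    vocals, accompaniment: mono signals of equal length, as produced
    by a source separation algorithm (hypothetical front end).
    vocal_gain_db: relative gain applied to the vocals before mixing.
    """
    gain = 10.0 ** (vocal_gain_db / 20.0)  # dB -> linear amplitude
    mix = gain * vocals + accompaniment
    # Normalize only if the boost pushed the mix past full scale,
    # to avoid clipping in the output signal.
    peak = np.max(np.abs(mix))
    if peak > 1.0:
        mix = mix / peak
    return mix
```

A positive `vocal_gain_db` raises the vocals-to-accompaniment ratio, which is the kind of preprocessing the cited studies found CI listeners prefer.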
Published online: June 19, 2018 (June 2018 issue)
Deep learning models to remix music for cochlear implant users
Tom Gajęcki,a) Waldo Nogueira
Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all, Hannover, 30625, Germany
a)Also at: the Music Technology Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, 08018, Spain. Electronic mail: tomgajecki@gmail.com
J. Acoust. Soc. Am. 143, 3602–3615 (2018)
Article history: Received December 8, 2017; Accepted May 25, 2018
Citation
Tom Gajęcki, Waldo Nogueira; Deep learning models to remix music for cochlear implant users. J. Acoust. Soc. Am. 1 June 2018; 143 (6): 3602–3615. https://doi.org/10.1121/1.5042056