Music perception remains rather poor for many cochlear implant (CI) users because of their deficient pitch perception. Nevertheless, comprehensible vocals and simple music structures are well perceived by many CI users. In previous studies, researchers re-mixed songs to make music more enjoyable for CI users, favoring the preferred music elements (vocals or beat) and attenuating the others. Mixing music, however, requires the individually recorded tracks (multitracks), which are usually not accessible. To overcome this limitation, source separation (SS) techniques are proposed to estimate the multitracks, which are then re-mixed to create more pleasant music for CI users. A drawback is that SS may introduce undesirable audible distortions and artifacts. Experiments conducted with CI users (N = 9) and normal hearing listeners (N = 9) show that CI users can have different mixing preferences than normal hearing listeners and that these preferences are user dependent. The experiments also show that SS methods can be used successfully to create preferred re-mixes even though distortions and artifacts are present. Finally, CI users' preferences are used to propose a benchmark that defines the maximum acceptable levels of SS distortion and artifacts for two different mixes proposed by CI users.
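
The re-mixing step described above, favoring one element and attenuating the others, can be illustrated with a minimal sketch. It assumes the (estimated) multitracks are already available as equal-length lists of samples. The stem names, the helper functions `db_to_gain` and `remix`, and the gain values are hypothetical and chosen only for illustration; they are not the settings or the implementation used in the study.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def remix(stems, gains_db):
    """Sum equal-length stems after applying a per-stem gain in dB.

    stems:    dict mapping a stem name to a list of float samples
              (names such as "vocals" are hypothetical examples)
    gains_db: dict mapping a stem name to a gain in dB;
              stems that are not listed are left unchanged (0 dB)
    """
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same number of samples")
    n = lengths.pop()

    mix = [0.0] * n
    for name, samples in stems.items():
        gain = db_to_gain(gains_db.get(name, 0.0))
        for i, x in enumerate(samples):
            mix[i] += gain * x
    return mix


# Illustrative use: keep the vocals and attenuate the accompaniment by 20 dB.
stems = {"vocals": [0.5, -0.5], "accompaniment": [0.2, 0.2]}
vocal_forward_mix = remix(stems, {"accompaniment": -20.0})
```

When the stems come from a source separation algorithm rather than from the original recording session, any distortions and artifacts in the estimated stems are carried into the re-mix, scaled by the same gains.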
