Cantor Digitalis, a real-time formant synthesizer controlled by a graphic tablet and a stylus, is used to assess melodic precision and accuracy in singing synthesis. Melodic accuracy and precision are measured in three experiments for groups of 20 and 28 subjects. The subjects' task is to sing musical intervals and short melodies, at various tempi, using chironomy (hand-controlled singing), mute chironomy (without audio feedback), and their own voices. The results show that all the subjects achieved high accuracy and precision in chironomic control of singing synthesis. Some subjects performed significantly better in chironomic singing than in natural singing, while others showed comparable proficiency in both. For the chironomic condition, mean note accuracy is less than 12 cents and mean interval accuracy is less than 25 cents for all the subjects. Comparing chironomy and mute chironomy shows that the skills used for writing and drawing carry over to chironomic singing, but that audio feedback improves interval accuracy. Analysis of blind chironomy (without visual reference) indicates that visual feedback greatly improves both note and interval accuracy and precision. This study demonstrates the capabilities of chironomy as a precise and accurate means of controlling singing synthesis.
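The accuracy figures above are given in cents, where 100 cents is one equal-tempered semitone and the deviation of a frequency f from a reference f_ref is 1200·log2(f / f_ref). The sketch below shows one way such measures could be computed. It assumes, following common usage in the singing literature rather than anything stated in this abstract, that accuracy is the mean signed deviation from the target and precision is the standard deviation of those deviations. The function names and the sample frequencies are hypothetical.

```python
import math
from statistics import mean, stdev


def cents(f_hz, ref_hz):
    """Deviation of f_hz from ref_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f_hz / ref_hz)


def note_errors(sung_hz, target_hz):
    """Per-note signed deviation from the target pitch, in cents."""
    return [cents(f, t) for f, t in zip(sung_hz, target_hz)]


def interval_errors(sung_hz, target_hz):
    """Per-interval signed error: sung interval minus target interval, in cents."""
    sung = [cents(b, a) for a, b in zip(sung_hz, sung_hz[1:])]
    target = [cents(b, a) for a, b in zip(target_hz, target_hz[1:])]
    return [s - t for s, t in zip(sung, target)]


def accuracy_and_precision(errors):
    """Assumed definitions: accuracy = mean error, precision = sample std dev."""
    return mean(errors), stdev(errors)


if __name__ == "__main__":
    # Hypothetical three-note melody: targets A4, C#5, E5 in equal temperament.
    target = [440.0, 440.0 * 2 ** (4 / 12), 440.0 * 2 ** (7 / 12)]
    sung = [441.0, 553.0, 661.0]  # made-up measured fundamental frequencies
    acc, prec = accuracy_and_precision(note_errors(sung, target))
    print(f"note accuracy {acc:.1f} cents, precision {prec:.1f} cents")
    print("interval errors (cents):", interval_errors(sung, target))
```

Working in cents rather than hertz keeps the error scale independent of register, so deviations on low and high notes can be averaged together.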
