Real-time Magnetic Resonance Imaging (rtMRI) was used to examine mechanisms of sound production by an American male beatbox artist. rtMRI was found to be a useful modality with which to study this form of sound production, providing a global dynamic view of the midsagittal vocal tract at frame rates sufficient to observe the movement and coordination of critical articulators. The subject's repertoire included percussion elements generated using a wide range of articulatory and airstream mechanisms. Many of the same mechanisms observed in human speech production were exploited for musical effect, including patterns of articulation that do not occur in the phonologies of the artist's native languages: ejectives and clicks. The data offer insights into the paralinguistic use of phonetic primitives and the ways in which they are coordinated in this style of musical performance. A unified formalism for describing both musical and phonetic dimensions of human vocal percussion performance is proposed. Audio and video data illustrating production and orchestration of beatboxing sound effects are provided in a companion annotated corpus.
