Vibrations generated in the vocal tract during speech propagate through tissue to the skin surface, where they can be picked up and used to transmit speech. Obtaining high-quality speech signals from skin vibration is desirable but difficult, primarily because sound is produced at several locations along the vocal tract. The objective of this study was to characterize the frequency content of speech signals at various locations on the head and neck. Signals were recorded using a microphone and accelerometers attached at 15 locations on the heads and necks of 14 males and 10 females. The subjects voiced various phonemes and one phrase. The power spectral densities (PSDs) of the phonemes were used to determine a quality ranking for each location and sound. Spectrograms were used to examine signal frequency content at selected locations. A perceptual listening test was conducted and its results were compared with the PSD rankings. The signal-to-noise ratio was determined for each location with and without background noise. These results are presented and discussed. Notably, while high-frequency content is attenuated at the throat, it is shown to be detectable at some other locations. The best locations for speech transmission were found to be generally common to males and females.
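As a rough illustration of the signal-to-noise comparison mentioned above, the following minimal sketch computes an SNR in decibels for one sensor location. The abstract does not state how the SNR was defined, so the definition used here (mean-square power of a speech segment divided by the mean-square power of a noise-only segment) is an assumption, as are the function names, the sample rate, and the synthetic signals, which merely stand in for recorded data.

```python
import math


def mean_square(samples):
    """Average power (mean of squared values) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(speech_segment, noise_segment):
    """SNR in dB under the assumed definition:
    power of the speech segment over power of the noise-only segment."""
    return 10.0 * math.log10(mean_square(speech_segment) / mean_square(noise_segment))


# Synthetic stand-ins for one sensor location (hypothetical, not study data).
fs = 8000  # assumed sample rate in Hz
speech = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]
noise = [0.1 * math.sin(2 * math.pi * 1234 * n / fs) for n in range(fs)]

print(f"SNR: {snr_db(speech, noise):.1f} dB")
```

In a real recording the speech segment also contains noise, so this ratio is closer to (signal + noise)/noise; repeating the calculation per location, with and without background noise, would give the kind of comparison the abstract describes.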
