The high-frequency region (above 4–5 kHz) of the speech spectrum has received substantial research attention over the past decade, with a host of studies documenting the presence of important and useful information in this region. The purpose of the current experiment was to compare the presence of indexical and segmental information in the low- and high-frequency regions of speech (below and above 4 kHz) and to determine the extent to which information from these regions can be used in a machine learning framework to correctly classify indexical and segmental aspects of the speech signal. Naturally produced vowel segments from ten male and ten female talkers were used as input to a temporal dictionary ensemble classification model in unfiltered, low-pass filtered (below 4 kHz), and high-pass filtered (above 4 kHz) conditions. Classification performance in the unfiltered and low-pass filtered conditions was approximately 90% or better for the vowel categorization, talker sex, and individual talker identity tasks. Classification performance for high-pass filtered signals composed of energy above 4 kHz was well above chance for the same tasks. For two of the classification tasks (talker sex and talker identity), high-pass filtering had minimal effect on classification performance, suggesting the preservation of indexical information above 4 kHz.
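The low-pass and high-pass conditions described above can be illustrated with a minimal sketch. The abstract does not specify the filter design, so the brick-wall DFT split below is an assumption for illustration only and is not the authors' method. The function names (`split_bands`, `energy`), the 16 kHz sampling rate, and the synthetic two-tone signal are likewise hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short signals."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_bands(x, fs, cutoff_hz=4000.0):
    """Split x into components below and at/above cutoff_hz (brick-wall)."""
    n = len(x)
    low, high = [], []
    for k, coeff in enumerate(dft(x)):
        # Bins above n/2 mirror negative frequencies; fold them back.
        freq = min(k, n - k) * fs / n
        if freq < cutoff_hz:
            low.append(coeff)
            high.append(0j)
        else:
            low.append(0j)
            high.append(coeff)
    return idft(low), idft(high)


def energy(x):
    """Sum of squared sample values."""
    return sum(v * v for v in x)


# Hypothetical test signal: a strong 1 kHz tone plus a weak 6 kHz tone.
FS = 16000  # assumed sampling rate in Hz
N = 256     # both tones fall on exact DFT bins at this length
signal = [
    math.sin(2 * math.pi * 1000 * t / FS) + 0.1 * math.sin(2 * math.pi * 6000 * t / FS)
    for t in range(N)
]

low_band, high_band = split_bands(signal, FS)
print(f"low-band energy:  {energy(low_band):.3f}")
print(f"high-band energy: {energy(high_band):.3f}")
```

In this toy signal the band above 4 kHz carries far less energy than the band below it, yet it remains a distinct component that a classifier could in principle exploit. A real pipeline would more likely use a standard filter design and pass the filtered vowel segments to a time-series classifier.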
November 2023
November 16 2023
Classification of indexical and segmental features of human speech using low- and high-frequency energy a)
Special Collection:
Perception and Production of Sounds in the High-Frequency Range of Human Speech
Jeremy J. Donai 1,b); D. Dwayne Paschall 2; Saad Haider 3
1 Department of Speech, Language, and Hearing Sciences, Texas Tech University Health Sciences Center, Lubbock, Texas 79430, USA
2 Predictive Market Analytics, Frisco, Texas 75035, USA
3 Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, Texas 79409, USA
a) This paper is part of a special issue on Perception and Production of Sounds in the High-Frequency Range of Human Speech.
b) Email: [email protected]
J. Acoust. Soc. Am. 154, 3201–3209 (2023)
Article history
Received: February 27 2023
Accepted: October 31 2023
Citation
Jeremy J. Donai, D. Dwayne Paschall, Saad Haider; Classification of indexical and segmental features of human speech using low- and high-frequency energy. J. Acoust. Soc. Am. 1 November 2023; 154 (5): 3201–3209. https://doi.org/10.1121/10.0022414