Current evidence supports the contribution of extended high frequencies (EHFs; >8 kHz) to speech recognition, especially for speech-in-speech scenarios. However, it is unclear whether the benefit of EHFs is due to phonetic information in the EHF band, EHF cues to access phonetic information at lower frequencies, talker segregation cues, or some other mechanism. This study investigated the mechanisms of benefit derived from a mismatch in EHF content between target and masker talkers for speech-in-speech recognition. EHF mismatches were generated using full-band (FB) speech and speech low-pass filtered at 8 kHz. Four filtering combinations with independently filtered target and masker speech were used to create two EHF-matched and two EHF-mismatched conditions for one- and two-talker maskers. Performance was best with the FB target and the low-pass filtered masker in both one- and two-talker masker conditions, but the effect was larger for the two-talker masker. No benefit of an EHF mismatch was observed for the low-pass filtered target. A word-by-word analysis indicated higher recognition odds with increasing EHF energy level in the target word. These findings suggest that the audibility of target EHFs provides target phonetic information or target segregation and selective attention cues, but that the audibility of masker EHFs does not confer any segregation benefit.
