Acoustic variation is central to the study of speaker characterization. In this respect, vowels have received far more attention than other phonemic classes such as fricatives. Fricatives exhibit substantial aperiodic energy, which can extend into a high-frequency range beyond the one conventionally considered in phonetic analyses, often limited to 12 kHz. Here, we adopt an extended frequency range of up to 20.05 kHz to study a corpus of 15 812 fricatives produced in Russian, a language with a rich fricative inventory, by 59 speakers. We extracted two sets of parameters: the first comprises 11 parameters derived from the frequency spectrum and duration (acoustic set), while the second comprises 13 mel-frequency cepstral coefficients (MFCCs). As a first step, we implemented machine learning methods to evaluate the potential of each set to predict gender and speaker identity. We show that gender can be predicted with good accuracy from the acoustic set and even better from MFCCs (accuracy of 0.72 and 0.88, respectively). MFCCs also predict speaker identity to some extent (accuracy = 0.64), unlike the acoustic set. In a second step, we provide a detailed analysis of the observed intra- and inter-speaker acoustic variation.
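The evaluation idea summarized above (predicting gender from a fixed-length feature vector per fricative token and reporting accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the 13-dimensional vectors only stand in for MFCCs, the nearest-centroid classifier is a placeholder rather than the method used in the study, and the function names are hypothetical. Holding out whole speakers for testing is likewise an assumed design choice, and it applies only to gender prediction, since speaker identification requires the same speakers in training and test data.

```python
import random


def make_synthetic_tokens(n_speakers=20, tokens_per_speaker=30, n_features=13, seed=0):
    """Return (features, gender, speaker) lists of synthetic 'fricative tokens'.

    The 13 features merely stand in for MFCCs; no real speech is involved.
    """
    rng = random.Random(seed)
    X, gender, speaker = [], [], []
    for spk in range(n_speakers):
        g = spk % 2  # hypothetical coding of the two gender labels as 0 and 1
        # Speaker-specific centre with a gender-dependent shift (assumed effect size).
        centre = [rng.gauss(0.8 * g, 0.5) for _ in range(n_features)]
        for _ in range(tokens_per_speaker):
            X.append([c + rng.gauss(0.0, 1.0) for c in centre])
            gender.append(g)
            speaker.append(spk)
    return X, gender, speaker


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_nearest_centroid(X, y):
    """Placeholder classifier: one mean vector per class label."""
    return {c: centroid([x for x, lab in zip(X, y) if lab == c]) for c in sorted(set(y))}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda c: sq_dist(model[c], x))


def accuracy(y_true, y_pred):
    """Proportion of tokens whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def gender_accuracy_speaker_held_out(X, gender, speaker, test_speakers):
    """Train on all other speakers, test on tokens from the held-out speakers."""
    train = [i for i, s in enumerate(speaker) if s not in test_speakers]
    test = [i for i, s in enumerate(speaker) if s in test_speakers]
    model = fit_nearest_centroid([X[i] for i in train], [gender[i] for i in train])
    preds = [predict(model, X[i]) for i in test]
    return accuracy([gender[i] for i in test], preds)


X, gender, speaker = make_synthetic_tokens()
held_out = {16, 17, 18, 19}  # two speakers per gender label under the coding above
acc = gender_accuracy_speaker_held_out(X, gender, speaker, held_out)
print(f"gender accuracy on held-out speakers: {acc:.2f}")
```

Running the same function once per feature set, with real features in place of the synthetic ones, would give one way to compare two parameter sets on an equal footing.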

