In a forensic-voice-comparison (FVC) case, one speaker (A) was talking on a mobile telephone while another (B) stood a short distance away. Later, B moved closer to the telephone. Shortly thereafter came a section of speech in which the identity of the speaker was disputed. All material for training an FVC system could be extracted from this single recording, but there was a near-far mismatch: training data for A were near, training data for B were far, and the disputed speech was near. We describe a procedure for assessing the degree of validity and reliability of an FVC system under such conditions before it is applied to the casework recording: sections of recordings of pairs of speakers of known identity are used to train an A and a B model; multiple other sections from each of the A and B recordings are used as test data; a likelihood ratio is calculated for each test section; and system validity and reliability are assessed. Prior to training and testing, the A and B recordings were played through loudspeakers and rerecorded via a mobile-telephone network; B was rerecorded twice, once with the loudspeaker near the telephone and once with it far from the telephone.
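The train, test, and score steps of the procedure can be illustrated with a minimal sketch. The abstract does not specify the features, the model type, or the validity metric, so everything below is an assumption chosen for illustration: each section of speech is reduced to a single hypothetical scalar feature, each speaker model is a univariate Gaussian, the data are synthetic, and validity is summarized with the log-likelihood-ratio cost (Cllr), a metric commonly used for likelihood-ratio systems. Reliability assessment is not covered here.

```python
import math
import random
import statistics

def fit_gaussian(values):
    """Fit a univariate Gaussian (mean, standard deviation) to training values."""
    return statistics.mean(values), statistics.stdev(values)

def log_pdf(x, mean, sd):
    """Natural log of the Gaussian density at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def log_lr(x, model_a, model_b):
    """Natural-log likelihood ratio: speaker-A hypothesis over speaker-B hypothesis."""
    return log_pdf(x, *model_a) - log_pdf(x, *model_b)

def cllr(llrs_a_true, llrs_b_true):
    """Log-likelihood-ratio cost from natural-log LRs.

    llrs_a_true: log LRs for test sections actually spoken by A.
    llrs_b_true: log LRs for test sections actually spoken by B.
    Lower is better; a system that always outputs LR = 1 scores exactly 1.
    """
    pen_a = sum(math.log2(1 + math.exp(-l)) for l in llrs_a_true) / len(llrs_a_true)
    pen_b = sum(math.log2(1 + math.exp(l)) for l in llrs_b_true) / len(llrs_b_true)
    return 0.5 * (pen_a + pen_b)

# Synthetic stand-ins for per-section features (hypothetical values, not case data).
rng = random.Random(0)
train_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
train_b = [rng.gauss(2.0, 1.0) for _ in range(200)]
test_a = [rng.gauss(0.0, 1.0) for _ in range(100)]
test_b = [rng.gauss(2.0, 1.0) for _ in range(100)]

# Train one model per speaker, then score every test section.
model_a = fit_gaussian(train_a)
model_b = fit_gaussian(train_b)
llrs_a = [log_lr(x, model_a, model_b) for x in test_a]
llrs_b = [log_lr(x, model_a, model_b) for x in test_b]

system_cllr = cllr(llrs_a, llrs_b)
print(f"Cllr = {system_cllr:.3f}")
```

In a setup that mirrors the case conditions, the training and test values would come from the rerecorded near and far material rather than from a random generator, so that the mismatch is reflected in the resulting likelihood ratios.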
May 2013
Meeting abstract.
May 01 2013
Mismatched distances from speakers to telephone in a forensic-voice-comparison case
Ewald Enzinger
Forensic Voice Comparison Lab., School of Elec. Eng. & Telecom., Univ. of New South Wales, Sydney, NSW 2052
J. Acoust. Soc. Am. 133, 3294 (2013)
Citation
Ewald Enzinger; Mismatched distances from speakers to telephone in a forensic-voice-comparison case. J. Acoust. Soc. Am. 1 May 2013; 133 (5_Supplement): 3294. https://doi.org/10.1121/1.4805425