This study assesses how effectively speaker verification detects cloned voices, particularly in safety-critical applications such as healthcare documentation and banking biometrics. It compares deep neural networks such as DeepSpeaker with human listeners in recognizing cloned voices, and underlines the severe implications of voice cloning in these sectors. In healthcare, cloned voices could endanger patient safety by altering medical records, leading to inaccurate diagnoses and treatments. In banking, they threaten biometric security, increasing the risk of financial fraud and identity theft. We tested feature extraction strategies using up to 40 parameters, including MFCC, GFCC, LFCC, CQCC, MSRCC, and others, computed with the Simplified Python Audio Features Extraction (spafe) library or Librosa. We verified the feature vectors using feature ranking derived from a Random Forest and performed dimensionality reduction using Principal Component Analysis (PCA). Our central research question was whether voice cloning can be used to attack advanced authentication systems effectively. The results show that the neural network outperforms human listeners at detecting cloned voices, underscoring the urgent need for sophisticated AI-based security.
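The feature-ranking and dimensionality-reduction steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's code: scikit-learn is assumed as the toolkit, and a synthetic 40-column matrix stands in for real MFCC/GFCC/LFCC features. The labels (1 = cloned, 0 = genuine), the number of trees, and the 95% variance target are all illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

N_FEATURES = 40  # mirrors the "up to 40 parameters" in the abstract


def make_synthetic_features(n_samples=400, seed=0):
    """Stand-in for cepstral features; NOT real speech data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, N_FEATURES))
    # Assumption: only columns 0 and 1 carry information about the label.
    y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = "cloned", 0 = "genuine"
    return X, y


def rank_features(X, y, seed=0):
    """Rank columns by Random Forest impurity-based importance."""
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1]  # most important first
    return order, importances


def reduce_with_pca(X, variance_to_keep=0.95):
    """Standardize, then keep enough components to reach the variance target."""
    X_scaled = StandardScaler().fit_transform(X)
    pca = PCA(n_components=variance_to_keep)
    return pca.fit_transform(X_scaled), pca


X, y = make_synthetic_features()
order, importances = rank_features(X, y)
X_reduced, pca = reduce_with_pca(X)

print("Top five feature indices:", order[:5].tolist())
print("Components kept by PCA:", X_reduced.shape[1])
```

With real recordings, the synthetic matrix would be replaced by per-utterance feature vectors from spafe or Librosa. Standardizing before PCA is a design choice made here so that features on larger numeric scales do not dominate the principal components.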
13 May 2024
186th Meeting of the Acoustical Society of America and the Canadian Acoustical Association
13–17 May 2024
Ottawa, Ontario, Canada
Signal Processing in Acoustics: Paper 1pPAa1
November 25 2024
Enhancing voice biometric security: Evaluating neural network and human capabilities in detecting cloned voices
Andrzej Czyzewski
Department of Multimedia Systems, Gdańsk University of Technology (Politechnika Gdańska), Gdańsk, Pomerania, 80-233, Poland; [email protected]
Proc. Mtgs. Acoust. 54, 055005 (2024)
Article history
Received: July 26 2024
Accepted: November 09 2024
Citation
Andrzej Czyzewski; Enhancing voice biometric security: Evaluating neural network and human capabilities in detecting cloned voices. Proc. Mtgs. Acoust. 13 May 2024; 54 (1): 055005. https://doi.org/10.1121/2.0001978