Individual head-related transfer functions (HRTFs) are usually measured with high spatial resolution or modeled from anthropometric parameters. This study proposed an HRTF individualization method that needs only spatially sparse measurements and uses a convolutional neural network (CNN). The HRTFs were represented as two-dimensional images in which the horizontal and vertical axes indicated direction and frequency, respectively. Using a prior HRTF database, the CNN was trained with the HRTF images measured at specific sparse directions as input and the corresponding high-spatial-resolution images as output. The trained CNN can then recover the HRTFs of a new subject from that subject's sparsely measured HRTFs. Objective experiments showed that, when 23 directions were used to recover individual HRTFs at 1250 directions, the spectral distortion (SD) was around 4.4 dB; when 105 directions were used, the SD decreased to around 3.8 dB. Subjective experiments showed that the individualized HRTFs recovered from 105 directions had a smaller discrimination proportion than the baseline method and were perceptually indistinguishable in many directions. This method combines the spectral and spatial characteristics of HRTFs for individualization and has potential for improving the virtual reality experience.
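The abstract reports spectral distortion (SD) in dB without giving its formula. A minimal sketch follows, assuming the commonly used definition: the root-mean-square difference between log-magnitude spectra across frequency bins, averaged over directions. The function names, the `eps` guard, and the toy magnitudes are illustrative assumptions, not the authors' implementation or data.

```python
import math


def spectral_distortion(ref_mags, est_mags, eps=1e-12):
    """RMS log-magnitude difference in dB across frequency bins (one direction).

    Assumed definition; the paper's exact formula is not stated in the abstract.
    """
    if len(ref_mags) != len(est_mags):
        raise ValueError("magnitude spectra must have the same number of bins")
    squared_errors = [
        (20.0 * math.log10((r + eps) / (e + eps))) ** 2
        for r, e in zip(ref_mags, est_mags)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_spectral_distortion(ref_by_direction, est_by_direction):
    """Average the per-direction SD over all directions."""
    sds = [
        spectral_distortion(ref, est)
        for ref, est in zip(ref_by_direction, est_by_direction)
    ]
    return sum(sds) / len(sds)


# Toy magnitudes (hypothetical, not from the paper): 4 directions x 3 bins.
reference = [
    [1.0, 0.5, 0.25],
    [0.8, 0.4, 0.2],
    [0.6, 0.3, 0.9],
    [0.7, 0.7, 0.1],
]
# Simulate a recovered HRTF set that is off by a uniform 3 dB gain.
gain = 10 ** (3.0 / 20.0)
estimate = [[m * gain for m in spectrum] for spectrum in reference]

mean_sd = mean_spectral_distortion(reference, estimate)
print(f"mean SD: {mean_sd:.2f} dB")
```

Under this definition, a recovered HRTF that differs from the measured one by a constant gain of 3 dB at every bin has an SD of 3 dB, which gives a rough sense of scale for the 4.4 dB and 3.8 dB figures reported above.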
January 2023
January 13 2023
Modeling individual head-related transfer functions from sparse measurements using a convolutional neural network a)
Special Collection:
3D Sound Reconstruction For Virtual Auditory Displays: Applications In Buildings
Ziran Jiang 1,b); Jinqiu Sang 2,c); Chengshi Zheng 1,b); Andong Li 1,b); Xiaodong Li 1,b)
1 Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
2 Shanghai Institute of AI for Education, East China Normal University, Shanghai 200062, China
a) This paper is part of a special issue on 3D Sound Reconstruction for Virtual Auditory Displays: Applications in Buildings.
b) Also at: University of Chinese Academy of Sciences, Beijing 100190, China.
c) Electronic mail: [email protected]
J. Acoust. Soc. Am. 153, 248–259 (2023)
Article history
Received: September 02 2022
Accepted: December 24 2022
Citation
Ziran Jiang, Jinqiu Sang, Chengshi Zheng, Andong Li, Xiaodong Li; Modeling individual head-related transfer functions from sparse measurements using a convolutional neural network. J. Acoust. Soc. Am. 1 January 2023; 153 (1): 248–259. https://doi.org/10.1121/10.0016854