The COVID-19 pandemic has brought widespread use of masks, and with it a drop in speech intelligibility. Existing microphone systems struggle to obtain high-quality signals from masked speakers at a distance and in noisy conditions.

Optical microphones offer a viable alternative by quantifying the rich speech information contained in the vibrations of medical masks. Lin et al. developed an optical speech acquisition system based on the principles of interferometry.

In their setup, a head and torso simulator was used to produce artificial speech, and its mouth was covered with a medical mask. A laser beam was directed at the mask’s surface, which vibrates with speech, and the resulting vibration signal was captured with a laser Doppler vibrometer.
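
As a rough illustration of the measurement principle, and not the authors' implementation, a laser Doppler vibrometer infers surface velocity from the Doppler shift of light reflected off a moving surface, using the standard relation f_D = 2v/λ. The sketch below assumes a 633 nm helium-neon laser wavelength and a hypothetical mask vibration of 1 micrometer amplitude at 200 Hz. None of these values, and none of the function names, come from the study.

```python
import math

# Assumed wavelength (helium-neon laser); the article does not state the laser used.
WAVELENGTH_M = 633e-9


def doppler_shift_hz(velocity_m_s, wavelength_m=WAVELENGTH_M):
    """Doppler frequency shift of light reflected from a surface moving
    at velocity_m_s along the beam axis: f_D = 2 * v / wavelength."""
    return 2.0 * velocity_m_s / wavelength_m


def velocity_from_shift(shift_hz, wavelength_m=WAVELENGTH_M):
    """Invert the relation above to recover surface velocity from the shift."""
    return shift_hz * wavelength_m / 2.0


def peak_velocity(displacement_amplitude_m, frequency_hz):
    """Peak velocity of a sinusoidal vibration x(t) = A * sin(2*pi*f*t)."""
    return 2.0 * math.pi * frequency_hz * displacement_amplitude_m


if __name__ == "__main__":
    # Hypothetical mask vibration: 1 micrometer amplitude at 200 Hz.
    v_peak = peak_velocity(1e-6, 200.0)
    shift = doppler_shift_hz(v_peak)
    print(f"peak surface velocity: {v_peak:.6e} m/s")
    print(f"peak Doppler shift:    {shift:.1f} Hz")
```

Because the shift depends only on the surface velocity and the laser wavelength, the measurement in this idealized picture does not pick up airborne sound directly, which is consistent with the noise robustness the article describes.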

For comparison, the team also recorded the speech with an omnidirectional microphone and a directional microphone at various distances and under different ambient noise conditions. The optical system performed significantly better than the traditional alternatives at large distances and under loud background noise.

“Using a laser beam to sense object vibration can avoid the influence of environmental noise and achieve the function of long-distance signal acquisition,” said author Ying-Hui Lai. “Also, compared to traditional vibration sensors which need to contact an object, one of the advantages of optical-based speech acquisition is that it doesn’t need to directly contact the object’s surface.”

The system could be applied in hearing aids, medical fields, and human-computer interfaces. The researchers plan to employ deep learning to account for the vibration characteristics of different mask materials and further improve speech quality and intelligibility.

Source: “Study of optical-based speech acquisition system using vibration signals from speakers’ medical masks,” by Yu-Min Lin, Ji-Yan Han, Wei-Zhong Zheng, Yi-Chieh Lin, Cheng-Hung Lin, and Ying-Hui Lai, JASA Express Letters (2022). The article can be accessed at https://doi.org/10.1121/10.0010491.