Phonological contrasts are usually signaled by multiple cues, and tonal languages typically distinguish tones along several dimensions (e.g., duration, pitch contour, and voice quality). Although the topic has been studied extensively, most research has relied on small datasets. This study uses a deep neural network (DNN) based speech recognizer trained on the AISHELL-1 speech corpus (Bu et al., 2017; 178 hours of read speech) to explore the tone space of Mandarin Chinese. A recent study shows that DNN models learn linguistically interpretable information for distinguishing vowels (Weber et al., 2016): from a low-dimensional Bottleneck layer, the model learns features comparable to F1 and F2. In the current study, we propose a more complex Long Short-Term Memory (LSTM) model, with a Bottleneck layer implemented in the hidden layers, to account for variable duration, an important cue for tone discrimination. By interpreting the features learned in the Bottleneck layer, we explore which acoustic dimensions are involved in distinguishing tones. The large amount of data in the speech corpus also makes the results more convincing and provides insights not available from studies with more limited datasets.
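The architecture described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BottleneckLSTM`, the two-unit bottleneck width, the four tone classes, the three input features per frame, and the random untrained weights are all assumptions chosen for illustration. The sketch only shows how a recurrent model can map syllables of different durations to a fixed low-dimensional vector that could then be inspected.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class BottleneckLSTM:
    """Hypothetical sketch: LSTM -> linear bottleneck -> softmax over tones.

    All sizes are assumptions; the weights are random and untrained.
    """

    def __init__(self, n_in, n_hidden, n_bottleneck=2, n_tones=4, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to [x_t ; h_{t-1} ; 1] (bias term).
        self.gates = {g: rand_matrix(n_hidden, n_in + n_hidden + 1, rng)
                      for g in "ifog"}
        self.W_bn = rand_matrix(n_bottleneck, n_hidden, rng)
        self.W_out = rand_matrix(n_tones, n_bottleneck, rng)

    def encode(self, frames):
        """Run the LSTM over all frames and return the bottleneck vector."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in frames:
            z = list(x) + h + [1.0]
            i = [sigmoid(a) for a in matvec(self.gates["i"], z)]    # input gate
            f = [sigmoid(a) for a in matvec(self.gates["f"], z)]    # forget gate
            o = [sigmoid(a) for a in matvec(self.gates["o"], z)]    # output gate
            g = [math.tanh(a) for a in matvec(self.gates["g"], z)]  # candidate
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        # The final hidden state is projected down to the bottleneck.
        return matvec(self.W_bn, h)

    def predict_proba(self, frames):
        """Softmax over tone classes, computed from the bottleneck only."""
        logits = matvec(self.W_out, self.encode(frames))
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Two made-up syllables of different durations (5 and 12 frames).
rng = random.Random(1)
short_syllable = [[rng.random() for _ in range(3)] for _ in range(5)]
long_syllable = [[rng.random() for _ in range(3)] for _ in range(12)]

model = BottleneckLSTM(n_in=3, n_hidden=8)
for syllable in (short_syllable, long_syllable):
    print(len(syllable), "frames ->", model.encode(syllable))
```

Because the classifier sees only the bottleneck output, whatever a trained model of this shape needs to tell tones apart would have to pass through those few dimensions, which is what makes them candidates for acoustic interpretation.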
Meeting abstract. No PDF available.
March 01 2019
A deep neural network approach to investigate tone space in languages
Bing'er Jiang;
McGill Univ., 1085 Dr. Penfield, Montreal, QC H3A 1A7, Canada, binger.jiang@mail.mcgill.ca
Tim O'Donnell;
McGill Univ., 1085 Dr. Penfield, Montreal, QC H3A 1A7, Canada, binger.jiang@mail.mcgill.ca
Meghan Clayards
McGill Univ., 1085 Dr. Penfield, Montreal, QC H3A 1A7, Canada, binger.jiang@mail.mcgill.ca
J. Acoust. Soc. Am. 145, 1913 (2019)
Citation
Bing'er Jiang, Tim O'Donnell, Meghan Clayards; A deep neural network approach to investigate tone space in languages. J. Acoust. Soc. Am. 1 March 2019; 145 (3_Supplement): 1913. https://doi.org/10.1121/1.5101949