Zhaoyan Zhang

 

Computational models of voice production

Our goal is to develop computationally efficient reduced-order models of phonation for practical applications in the clinic and in natural speech synthesis. Based on previous experimental and numerical studies, we have developed a computationally efficient three-dimensional model of voice production (Zhang, 2015, 2016a). The model consists of a subglottal system driven by respiratory muscular force (Zhang, 2016b), a three-dimensional anisotropic vocal fold model, and a vocal tract model. It builds on our previous research on human voice production and has been shown to reproduce experimental observations reasonably well (Zhang et al., 2002; Zhang and Luu, 2012; Farahani and Zhang, 2016). The following video shows vibration produced by this three-dimensional voice production model. Many features typical of normal phonation can be observed, including complete glottal closure and a clear wave motion along both the medial and superior surfaces.

 

 

This model has been used in speech synthesis (Zhang, 2017). The example below shows the acoustic waveform of a simulated /aha/ utterance, together with the corresponding glottal area waveform, spectrogram, and F0 as a function of time. For comparison, the figure also shows the waveform of a recorded human voice, although no attempt was made to reproduce all the features and details of that voice in the simulation.
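For readers who want to compute an F0-versus-time curve from a waveform of their own, the following is a minimal sketch based on a generic autocorrelation peak search. It is not the analysis used in Zhang (2017); the function names `estimate_f0` and `f0_contour`, the search range of 60 to 400 Hz, and the frame and hop sizes are all illustrative assumptions.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 (Hz) of one frame from the peak of its autocorrelation.

    Assumed helper for illustration only. Returns 0.0 when no positive
    autocorrelation peak is found in the search range.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove DC offset before correlating

    # Candidate lags (in samples) corresponding to the allowed F0 range.
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Plain (biased) autocorrelation: fewer terms at longer lags,
        # which mildly favors the shortest period over its multiples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag if best_lag else 0.0


def f0_contour(signal, sample_rate, frame_len, hop):
    """Return one F0 estimate per frame, stepping through the signal by `hop`."""
    return [
        estimate_f0(signal[start:start + frame_len], sample_rate)
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]
```

Each frame should span several pitch periods for the peak search to be reliable. A plain autocorrelation search like this can make octave errors on real voices, so its output is best treated as a rough estimate.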

    Zhang, Z., Mongeau, L., Frankel, S. (2002). Experimental verification of the quasi-steady approximation for aerodynamic sound generation by pulsating jets in tubes, J. Acoust. Soc. Am., 112(4), 1652-1663.
    Zhang, Z., Luu, T.H. (2012). Asymmetric vibration in a two-layer vocal fold model with left-right stiffness asymmetry: Experiment and simulation, J. Acoust. Soc. Am., 132, 1626-1635.
    Zhang, Z. (2015). Regulation of glottal closure and airflow in a three-dimensional phonation model: Implications for vocal intensity control, J. Acoust. Soc. Am., 137(2), 898-910.
    Zhang, Z. (2016a). Cause-effect relationship between vocal fold physiology and voice production in a three-dimensional phonation model, J. Acoust. Soc. Am., 139(4), 1493-1507.
    Zhang, Z. (2016b). Respiratory laryngeal coordination in airflow conservation and reduction of respiratory effort of phonation, Journal of Voice, 30(6), 760.e7–760.e13.
    Farahani, M., Zhang, Z. (2016). Experimental validation of a three-dimensional reduced-order continuum model of phonation, J. Acoust. Soc. Am., 140, EL172-EL177.
    Zhang, Z. (2017). Toward real-time physically-based voice simulation: An eigenmode-based approach, Proceedings of Meetings on Acoustics, vol. 30, pp. 060002.

 

 
