C5: Information Density Aware Text-to-Speech Synthesis

Project C5 investigates how text-to-speech (TTS) synthesis techniques can be enhanced to take information density and encoding density into account. The project explores methods to connect and align the processing of high-level information with its encoding into low-level phonetic parameters in TTS synthesis. The approach is to encode information density in two stages: first, directly as high-level parameters during TTS voice building (offline), and second, during runtime synthesis (online). Quantifying information density also makes it possible to model listeners’ susceptibility to synthesis artifacts. Such a model can be used to predict the perceived output quality automatically and to improve it pre-emptively, by selecting a sequence of acoustic units that realizes the desired variation and encoding density for a given degree of information density.



Le Maguer, Sébastien; Möbius, Bernd; Steiner, Ingmar; Lolive, Damien

De l'utilisation de descripteurs issus de la linguistique computationnelle dans le cadre de la synthèse par HMM [On the use of descriptors from computational linguistics in HMM-based synthesis] Inproceedings

Proc. Journées d'Etudes sur la Parole, Paris, 2016.

Le Maguer, Sébastien; Möbius, Bernd; Steiner, Ingmar

Toward the use of information density based descriptive features in HMM based speech synthesis Inproceedings

8th International Conference on Speech Prosody, pp. 1029–1033, Boston, MA, USA, 2016.

Le Maguer, Sébastien; Steiner, Ingmar; Möbius, Bernd

Toward a Speech Synthesis Guided by the Modeling of Unexpected Events Inproceedings

Schweitzer, Antje; Dogil, Grzegorz (Eds.): Workshop on Modeling Variability in Speech, Stuttgart, Germany, 2015.

