C5: Information Density Aware Text-to-Speech Synthesis

Project C5 investigates how text-to-speech (TTS) synthesis can be enhanced to take information density and encoding density into account. The project explores methods to connect and align the processing of high-level information with its encoding into low-level phonetic parameters in TTS synthesis. Information density is encoded in two stages: first, offline, as high-level parameters during TTS voice building; second, online, during runtime synthesis. Quantifying information density can also support a model of listeners’ susceptibility to synthesis artifacts. Such a model could automatically predict the perceived output quality and pre-emptively improve it by selecting a sequence of acoustic units that produces the desired variation and density of encoding for a given degree of information density.

Publications

2016

Le Maguer, Sébastien; Möbius, Bernd; Steiner, Ingmar; Lolive, Damien

De l'utilisation de descripteurs issus de la linguistique computationnelle dans le cadre de la synthèse par HMM

Proc. Journées d'Etudes sur la Parole, Paris, 2016.

Le Maguer, Sébastien; Möbius, Bernd; Steiner, Ingmar

Toward the use of information density based descriptive features in HMM based speech synthesis

8th International Conference on Speech Prosody, pp. 1029–1033, Boston, MA, USA, 2016.

2015

Le Maguer, Sébastien; Steiner, Ingmar; Möbius, Bernd

Toward a Speech Synthesis Guided by the Modeling of Unexpected Events

Schweitzer, Antje; Dogil, Grzegorz (Eds.): Workshop on Modeling Variability in Speech, Stuttgart, Germany, 2015.

Ingmar Steiner

PI

Sébastien Le Maguer

Postdoc
