C5: Information Density Aware Text-to-Speech Synthesis
Project C5 investigates how text-to-speech (TTS) synthesis techniques can be enhanced to take knowledge about information and encoding density into account. The project explores methods to connect and align the processing of high-level information with its encoding into low-level phonetic parameters in TTS synthesis. The approach is to encode information density in two stages: first, directly as high-level parameters during TTS voice building (offline) and, second, during runtime synthesis (online). Quantifying information density can also support a model of listeners’ susceptibility to synthesis artifacts. Such a model could automatically predict the perceived output quality and pre-emptively improve it by selecting a sequence of acoustic units whose variation and encoding density match a defined degree of information density.
Studying Mutual Phonetic Influence With a Web-Based Spoken Dialogue System Inproceedings Forthcoming
20th International Conference on Speech and Computer (SPECOM), Leipzig, Germany, Forthcoming.
Phonetic Accommodation in HCI: Introducing a Wizard-of-Oz Experiment Inproceedings Forthcoming
Phonetik & Phonologie 14, Vienna, Austria, Forthcoming.
Convergence of Pitch Accents in a Shadowing Task Inproceedings
9th International Conference on Speech Prosody, pp. 225-229, Poznań, Poland, 2018.
11th Language Resources and Evaluation Conference (LREC), pp. 3171-3175, Miyazaki, Japan, 2018.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25 (12), pp. 2351-2361, 2017.
Blizzard Challenge, Stockholm, Sweden, 2017.
Interspeech, pp. 3797-3801, Stockholm, Sweden, 2017.
Interspeech, pp. 239-243, Stockholm, Sweden, 2017.
Trouvain, Jürgen ; Steiner, Ingmar ; Möbius, Bernd (Ed.): 28th Conference on Electronic Speech Signal Processing (ESSV), pp. 186-192, Saarbrücken, Germany, 2017.
Trouvain, Jürgen ; Steiner, Ingmar ; Möbius, Bernd (Ed.): 28th Conference on Electronic Speech Signal Processing (ESSV), pp. 254-261, Saarbrücken, Germany, 2017.
Uprooting MaryTTS: Agile Processing and Voicebuilding Inproceedings
Trouvain, Jürgen ; Steiner, Ingmar ; Möbius, Bernd (Ed.): 28th Conference on Electronic Speech Signal Processing (ESSV), pp. 152-159, Saarbrücken, Germany, 2017.
The MaryTTS entry for the Blizzard Challenge 2016 Inproceedings
Blizzard Challenge, Cupertino, CA, USA, 2016.
Proc. Journées d'Etudes sur la Parole, Paris, 2016.
8th International Conference on Speech Prosody, pp. 1029-1033, Boston, MA, USA, 2016.
Schweitzer, Antje ; Dogil, Grzegorz (Ed.): Workshop on Modeling Variability in Speech, Stuttgart, Germany, 2015.