C1: Information Density and the Predictability of Phonetic Structure

Project C1 addresses the relation between information density and linguistic encoding in phonetics and human speech processing. In the second funding period, an elaborate account of the prosodic hierarchy and its interaction with the ID profile of utterances will be implemented. The project will also investigate effects of channel characteristics and audience design on production and perception. With respect to methodology, the project will develop a procedure for evaluating the contribution of phonetic features to the informativity of linguistic units and investigate the combination of language models across different linguistic levels.
Publications
2020 |
Raveh, Eran; Twig, Maya; Möbius, Bernd; Zehavi, Oded Prosodic alignments in shadowed singing of familiar and novel music Inproceedings Proceedings of Speech Prosody 2020, Tokyo, Japan, 2020. @inproceedings{Raveh/etal:2020a, title = {Prosodic alignments in shadowed singing of familiar and novel music}, author = {Eran Raveh and Maya Twig and Bernd M\"{o}bius and Oded Zehavi}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2020/04/raveh_etal_sp2020.pdf}, year = {2020}, date = {2020-00-00}, booktitle = {Proceedings of Speech Prosody 2020}, address = {Tokyo, Japan}, abstract = {This paper presents a study comprising two singing shadowing tasks focusing on prosodic features of music. The first experiment investigated alignment effects in a song known to the participants. They sang the song before and after listening to a recorded version of it. The second experiment tested which prosodic elements are best preserved in replications of an unfamiliar song. Methods used in phonetic accommodation studies were adapted and used to measure the effects. Results show that convergence occurs in singing, but not in the same manner across all tested features. Additionally, participants preserved rhythmic patterns better than the tonal contour in the unfamiliar music piece.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Werner, Raphael; Trouvain, Jürgen; Möbius, Bernd Ein sprachübergreifender Vergleich des Pausenverhaltens natürlicher Sprecher in verschiedenen Sprechtempi mit TTS-Systemen Inproceedings Elektronische Sprachsignalverarbeitung 2020, Tagungsband der 31. Konferenz, pp. 101-108, TUD Press, Magdeburg, 2020. @inproceedings{Werner/etal:2020a, title = {Ein sprach\"{u}bergreifender Vergleich des Pausenverhaltens nat\"{u}rlicher Sprecher in verschiedenen Sprechtempi mit TTS-Systemen}, author = {Raphael Werner and J\"{u}rgen Trouvain and Bernd M\"{o}bius}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2020/04/werner_etal_essv2020.pdf}, year = {2020}, date = {2020-00-00}, booktitle = {Elektronische Sprachsignalverarbeitung 2020, Tagungsband der 31. Konferenz}, pages = {101-108}, publisher = {TUD Press}, address = {Magdeburg}, series = {Studientexte zur Sprachkommunikation}, abstract = {Die vorliegende Studie vergleicht die Pausensetzung in nat\"{u}rlicher und in synthetischer Sprache sprach\"{u}bergreifend (Deutsch, Franz\"{o}sisch, Englisch) mit Bezug auf Ort, Dauer und Anzahl der Pausen, sowie h\"{o}rbare Atemger\"{a}usche. Von den nat\"{u}rlichen Sprechern (drei Sprecher je Sprache) wurden Texte in f\"{u}nf Sprechgeschwindigkeiten (von sehr langsam bis sehr schnell) vorgelesen. Im Vergleich zur Normalgeschwindigkeit bei nat\"{u}rlichen Sprechern haben die TTS-Systeme (vier Systeme je Sprache) tendenziell langsamere Artikulationsgeschwindigkeiten (in allen drei Sprachen), k\"{u}rzere Pausendauern (Deutsch, Franz\"{o}sisch) und mehr Pausen (Deutsch, Englisch). Die TTS-Systeme unterscheiden sich zus\"{a}tzlich von den nat\"{u}rlichen Sprechern dadurch, dass sie g\"{a}nzlich auf h\"{o}rbare Atemger\"{a}usche in Pausen verzichten. Dar\"{u}ber hinaus zeigten sich individuell verschiedene Strategien der Pausengestaltung der nat\"{u}rlichen Sprecher.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } This study compares pause placement in natural and synthetic speech across languages (German, French, English) with respect to the location, duration, and number of pauses, as well as audible breathing noises. The natural speakers (three per language) read texts aloud at five speech rates (from very slow to very fast). Compared with natural speakers at normal rate, the TTS systems (four per language) tend to have slower articulation rates (in all three languages), shorter pause durations (German, French), and more pauses (German, English). The TTS systems further differ from the natural speakers in that they produce no audible breathing noises in pauses at all. In addition, the natural speakers showed individually different pausing strategies. |
Andreeva, Bistra; Möbius, Bernd; Whang, James Effects of surprisal and boundary strength on phrase-final lengthening Inproceedings Proc. 10th International Conference on Speech Prosody 2020, pp. 146-150, 2020. @inproceedings{Andreeva2020, title = {Effects of surprisal and boundary strength on phrase-final lengthening}, author = {Bistra Andreeva and Bernd M\"{o}bius and James Whang}, url = {http://dx.doi.org/10.21437/SpeechProsody.2020-30 http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2020/06/Andreeva_etal2020-1.pdf}, doi = {10.21437/SpeechProsody.2020-30}, year = {2020}, date = {2020-00-00}, booktitle = {Proc. 10th International Conference on Speech Prosody 2020}, pages = {146-150}, abstract = {This study examines the influence of prosodic structure (pitch accents and boundary strength) and information density (ID) on phrase-final syllable duration. Phrase-final syllable durations and following pause durations were measured in a subset of a German radio-news corpus (DIRNDL), consisting of about 5 hours of manually annotated speech. The prosodic annotation is in accordance with the autosegmental intonation model and includes labels for pitch accents and boundary tones. We treated pause duration as a quantitative proxy for boundary strength. ID was calculated as the surprisal of the syllable trigram of the preceding context, based on language models trained on the DeWaC corpus. We found a significant positive correlation between surprisal and phrase-final syllable duration. Syllable duration was statistically modeled as a function of prosodic factors (pitch accent and boundary strength) and surprisal in linear mixed effects models. The results revealed an interaction of surprisal and boundary strength with respect to phrase-final syllable duration. Syllables with high surprisal values are longer before stronger boundaries, whereas low-surprisal syllables are longer before weaker boundaries. This modulation of pre-boundary syllable duration is observed above and beyond the well-established phrase-final lengthening effect.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Meier, David; Andreeva, Bistra Einflussfaktoren auf die Wahrnehmung von Prominenz im natürlichen Dialog Inproceedings Elektronische Sprachsignalverarbeitung 2020, Tagungsband der 31. Konferenz, pp. 257-264, Magdeburg, 2020. @inproceedings{Meier2020, title = {Einflussfaktoren auf die Wahrnehmung von Prominenz im nat\"{u}rlichen Dialog}, author = {David Meier and Bistra Andreeva}, url = {http://www.essv.de/?year=2020 http://www.essv.de/pdf/2020_257_264.pdf}, year = {2020}, date = {2020-00-00}, booktitle = {Elektronische Sprachsignalverarbeitung 2020, Tagungsband der 31. Konferenz}, pages = {257-264}, address = {Magdeburg}, abstract = {Turnbull et al. [1] stellen fest, dass sich auf die Wahrnehmung der prosodischen Prominenz von isolierten Adjektiv-Nomen-Paaren mehrere konkurrierende Faktoren auswirken, n\"{a}mlich die Phonologie, der Diskurskontext und das Wissen \"{u}ber den Diskurs. Der vorliegende Beitrag hat das Ziel, den relativen Einfluss der evozierten Fokussierung (eng kontrastiv vs. weit kontrastiv) und der Akzentuierung (akzentuiert vs. nicht akzentuiert) auf die Wahrnehmung von Prominenz zu untersuchen und zu \"{u}berpr\"{u}fen, ob die in Turnbull et al. vorgestellten Konzepte in einer Umgebung reproduzierbar sind, die eher mit einem nat\"{u}rlichsprachlichen Dialog vergleichbar ist. F\"{u}r die Studie wurden 144 realisierte S\"{a}tze eines einzelnen m\"{a}nnlichen Sprechers so zusammengeschnitten, dass ein semantischer Kontrast entweder auf dem betreffenden Nomen oder auf dem Adjektiv entsteht. Die metrisch starken Silben des Adjektivs oder des Nomens waren entweder entsprechend der Fokusstruktur oder gegen Erwartung akzentuiert. Die Ergebnisse zeigen, dass die Akzentuierung einen gr\"{o}{\ss}eren Einfluss auf die Prominenzwahrnehmung als die Fokusbedingung hat, was im Einklang mit den Ergebnissen von Turnbull et al. ist. Adjektive werden zudem konsequent als prominenter eingestuft als Nomen in vergleichbaren Kontexten. Eine Erweiterung des Diskurskontextes und der Hintergrundinformationen, die dem Versuchsteilnehmer zur Verf\"{u}gung standen, haben in dem hier vorgestellten Versuchsaufbau allerdings nur vernachl\"{a}ssigbare Effekte.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } Turnbull et al. [1] find that several competing factors affect the perception of prosodic prominence in isolated adjective-noun pairs, namely phonology, discourse context, and knowledge about the discourse. This paper aims to examine the relative influence of evoked focus (narrow contrastive vs. broad contrastive) and accentuation (accented vs. unaccented) on the perception of prominence, and to test whether the concepts presented by Turnbull et al. can be reproduced in a setting that more closely resembles natural dialogue. For the study, 144 sentences produced by a single male speaker were spliced so that a semantic contrast arises either on the noun in question or on the adjective. The metrically strong syllables of the adjective or the noun were accented either in line with the focus structure or contrary to expectation. The results show that accentuation has a greater influence on prominence perception than the focus condition, in line with the findings of Turnbull et al. Moreover, adjectives are consistently rated as more prominent than nouns in comparable contexts. However, extending the discourse context and the background information available to the participant has only negligible effects in the experimental setup presented here. |
Gessinger, Iona; Möbius, Bernd; Andreeva, Bistra; Raveh, Eran; Steiner, Ingmar Phonetic accommodation of L2 German speakers to the virtual language learning tutor Mirabella Inproceedings Proceedings of Interspeech 2020, pp. 4118-4122, Shanghai, China, 2020. @inproceedings{Gessinger2020, title = {Phonetic accommodation of L2 German speakers to the virtual language learning tutor Mirabella}, author = {Iona Gessinger and Bernd M\"{o}bius and Bistra Andreeva and Eran Raveh and Ingmar Steiner}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2020/11/Gessinger2020.pdf http://www.interspeech2020.org/index.php?m=content&c=index&a=show&catid=351&id=1147}, doi = {10.21437/Interspeech.2020-2701}, year = {2020}, date = {2020-00-00}, booktitle = {Proceedings of Interspeech 2020}, pages = {4118-4122}, address = {Shanghai, China}, abstract = {The present paper compares phonetic accommodation of L1 French speakers in interaction with the simulated virtual language learning tutor for German, Mirabella, to that of L1 German speakers from a previous study. In a question-and-answer exchange, the L1 French speakers adapted the intonation contours of wh-questions as falling or rising according to the variant produced by Mirabella. However, they were not sensitive to a change of the nuclear pitch accent placement. In a map task, the L1 French speakers increased the number of dispreferred variants for the allophonic contrast [ɪ\c{c}] vs. [ɪk] in the word ending <-ig> when Mirabella used this variant. For the contrast [ɛː] vs. [eː] as a realization of stressed <-\"{a}->, such a convergence effect was not found. Overall, the non-native speakers showed a similar degree of accommodative behavior towards Mirabella as the L1 German speakers. This suggests that incidental inductive learning through accommodation is possible. However, phenomena of the target language that deviate too radically from the native pattern seem to require more explicit training.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2019 |
Brandt, Erika; Andreeva, Bistra; Möbius, Bernd Information density and vowel dispersion in the productions of Bulgarian L2 speakers of German Inproceedings Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019), pp. 3165-3169, Melbourne, 2019. @inproceedings{Brandt2019, title = {Information density and vowel dispersion in the productions of Bulgarian L2 speakers of German}, author = {Erika Brandt and Bistra Andreeva and Bernd M\"{o}bius}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2019/09/brandt_etal_icphs2019.pdf}, year = {2019}, date = {2019-09-01}, booktitle = {Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019)}, pages = {3165-3169}, address = {Melbourne}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Whang, James Effects of phonotactic predictability on sensitivity to phonetic detail Journal Article Laboratory Phonology: Journal of the Association for Laboratory Phonology, 10(1):8, pp. 1-28, 2019. @article{Whang2019, title = {Effects of phonotactic predictability on sensitivity to phonetic detail}, author = {James Whang}, url = {https://www.journal-labphon.org/articles/10.5334/labphon.125/ http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2019/05/Whang-LabPhon-2019-1.pdf}, year = {2019}, date = {2019-04-23}, booktitle = {Laboratory Phonology: Journal of the Association for Laboratory Phonology}, journal = {Laboratory Phonology: Journal of the Association for Laboratory Phonology}, volume = {10(1):8}, pages = {1-28}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Brandt, Erika Information density and phonetic structure: explaining segmental variability PhD Thesis Saarland University, 2019. @phdthesis{Brandt_diss_2019, title = {Information density and phonetic structure: explaining segmental variability}, author = {Erika Brandt}, editor = {Bernd M\"{o}bius [Akademische Betreuung]}, url = {http://nbn-resolving.de/urn:nbn:de:bsz:291--ds-279181}, doi = {10.22028/D291-27918}, year = {2019}, date = {2019-00-00}, address = {Saarbr\"{u}cken}, school = {Saarland University}, abstract = {There is growing evidence that information-theoretic principles influence linguistic structures. Regarding speech several studies have found that phonetic structures lengthen in duration and strengthen in their spectral features when they are difficult to predict from their context, whereas easily predictable phonetic structures are shortened and reduced spectrally. Most of this evidence comes from studies on American English, only some studies have shown similar tendencies in Dutch, Finnish, or Russian. In this context, the Smooth Signal Redundancy hypothesis (Aylett and Turk 2004, Aylett and Turk 2006) emerged claiming that the effect of information-theoretic factors on the segmental structure is moderated through the prosodic structure. In this thesis, we investigate the impact and interaction of information density and prosodic structure on segmental variability in production analyses, mainly based on German read speech, and also listeners' perception of differences in phonetic detail caused by predictability effects. Information density (ID) is defined as contextual predictability or surprisal ($S(\mathrm{unit}_i) = -\log_2 P(\mathrm{unit}_i \mid \mathrm{context})$) and estimated from language models based on large text corpora. In addition to surprisal, we include word frequency, and prosodic factors, such as primary lexical stress, prosodic boundary, and articulation rate, as predictors of segmental variability in our statistical analysis. As acoustic-phonetic measures, we investigate segment duration and deletion, voice onset time (VOT), vowel dispersion, global spectral characteristics of vowels, dynamic formant measures and voice quality metrics. Vowel dispersion is analyzed in the context of German learners' speech and in a cross-linguistic study. As results, we replicate previous findings of reduced segment duration (and VOT), higher likelihood to delete, and less vowel dispersion for easily predictable segments. Easily predictable German vowels have less formant change in their vowel section length (VSL), F1 slope and velocity, are less curved in their F2, and show increased breathiness values in cepstral peak prominence (smoothed) than vowels that are difficult to predict from their context. Results for word frequency show similar tendencies: German segments in high-frequency words are shorter, more likely to delete, less dispersed, and show less magnitude in formant change, less F2 curvature, as well as less harmonic richness in open quotient smoothed than German segments in low-frequency words. These effects are found even though we control for the expected and much more effective effects of stress, boundary, and speech rate. In the cross-linguistic analysis of vowel dispersion, the effect of ID is robust across almost all of the six languages and the three intended speech rates. Surprisal does not affect vowel dispersion of non-native German speakers. Surprisal and prosodic factors interact in explaining segmental variability. Especially, stress and surprisal complement each other in their positive effect on segment duration, vowel dispersion and magnitude in formant change. Regarding perception we observe that listeners are sensitive to differences in phonetic detail stemming from high and low surprisal contexts for the same lexical target.}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
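The thesis abstract above defines information density as surprisal, S(unit_i) = -log2 P(unit_i | context), estimated from language models. As a rough illustration only, the following Python sketch estimates trigram surprisal from raw counts on a toy corpus. The function names, the `<s>` padding symbol, and the add-alpha smoothing are assumptions made for this example; the language models actually used in the project's publications are not specified on this page beyond being trained on large text corpora.

```python
import math
from collections import Counter


def train_trigram_counts(sequences):
    """Count trigrams and their two-unit contexts in padded sequences.

    Hypothetical helper: each sequence is a list of units (e.g. syllables
    or phonemes), padded with two "<s>" start symbols.
    """
    trigrams, contexts = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>", "<s>"] + list(seq)
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            trigrams[context + (padded[i],)] += 1
            contexts[context] += 1
    return trigrams, contexts


def surprisal(unit, context, trigrams, contexts, vocab_size, alpha=1.0):
    """Return -log2 P(unit | context) in bits, with add-alpha smoothing.

    The smoothing scheme is an assumption for this sketch; with alpha=0
    the estimate is the plain relative frequency (undefined for unseen
    trigrams or contexts).
    """
    numerator = trigrams[tuple(context) + (unit,)] + alpha
    denominator = contexts[tuple(context)] + alpha * vocab_size
    return -math.log2(numerator / denominator)


# Toy corpus of unit sequences (illustrative data, not from any study).
corpus = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c"]]
trigrams, contexts = train_trigram_counts(corpus)
vocab_size = len({unit for seq in corpus for unit in seq})

# "c" follows ("a", "b") more often than "d", so it is less surprising.
s_c = surprisal("c", ("a", "b"), trigrams, contexts, vocab_size)
s_d = surprisal("d", ("a", "b"), trigrams, contexts, vocab_size)
print(f"S(c | a b) = {s_c:.3f} bits, S(d | a b) = {s_d:.3f} bits")
```

In this toy setup the more frequent continuation receives the lower surprisal value, which is the quantity the abstracts relate to duration and vowel dispersion.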
Gessinger, Iona; Möbius, Bernd; Andreeva, Bistra; Raveh, Eran; Steiner, Ingmar Phonetic accommodation in a Wizard-of-Oz experiment: Intonation and segments Inproceedings Proceedings of Interspeech 2019, pp. 301-305, Graz, Austria, 2019. @inproceedings{Gessinger/etal:2019b, title = {Phonetic accommodation in a Wizard-of-Oz experiment: Intonation and segments}, author = {Iona Gessinger and Bernd M\"{o}bius and Bistra Andreeva and Eran Raveh and Ingmar Steiner}, url = {http://dx.doi.org/10.21437/Interspeech.2019-2445 http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2020/04/gessinger_etal_is2019.pdf}, year = {2019}, date = {2019-00-00}, booktitle = {Proceedings of Interspeech 2019}, pages = {301-305}, address = {Graz, Austria}, abstract = {This paper discusses phonetic accommodation of 20 native German speakers interacting with the simulated spoken dialogue system Mirabella in a Wizard-of-Oz experiment. The study examines intonation of wh-questions and pronunciation of allophonic contrasts in German. In a question-and-answer exchange with the system, the users produce predominantly falling intonation patterns for wh-questions when the system does so as well. The number of rising patterns on the part of the users increases significantly when Mirabella produces questions with rising intonation. In a map task, Mirabella provides information about hidden items while producing variants of two allophonic contrasts which are dispreferred by the users. For the [ɪ\c{c}] vs. [ɪk] contrast in the suffix <-ig>, the number of dispreferred variants on the part of the users increases significantly during the map task. For the [ɛː] vs. [eː] contrast as a realization of stressed <-\"{a}->, such a convergence effect is not found on the group level, yet still occurs for some individual users. Almost every user converges to the system to a substantial degree for a subset of the examined features, but we also find maintenance of preferred variants and even occasional divergence. This individual variation is in line with previous findings in accommodation research.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2018 |
Brandt, Erika; Zimmerer, Frank; Andreeva, Bistra; Möbius, Bernd Impact of prosodic structure and information density on dynamic formant trajectories in German Inproceedings Speech Prosody 2018, Poznan, 2018. @inproceedings{Brandt2018SpPro, title = {Impact of prosodic structure and information density on dynamic formant trajectories in German}, author = {Erika Brandt and Frank Zimmerer and Bistra Andreeva and Bernd M\"{o}bius}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2018/08/brandt_etal_sp2018.pdf}, doi = {10.21437/SpeechProsody.2018-24}, year = {2018}, date = {2018-04-13}, booktitle = {Speech Prosody 2018}, address = {Poznan}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Zimmerer, Frank; Brandt, Erika; Andreeva, Bistra; Möbius, Bernd Idiomatic or literal? Production of collocations in German read speech Inproceedings Speech Prosody 2018, Poznan, 2018. @inproceedings{Zimmerer2018SpPro, title = {Idiomatic or literal? Production of collocations in German read speech}, author = {Frank Zimmerer and Erika Brandt and Bistra Andreeva and Bernd M\"{o}bius}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2018/08/zimmerer_etal_sp2018-1.pdf}, doi = {10.21437/SpeechProsody.2018-87}, year = {2018}, date = {2018-04-13}, booktitle = {Speech Prosody 2018}, address = {Poznan}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Malisz, Zofia; Brandt, Erika; Möbius, Bernd; Oh, Yoon Mi; Andreeva, Bistra Dimensions of segmental variability: interaction of prosody and surprisal in six languages Journal Article Frontiers in Communication / Language Sciences, 3 (25), pp. 1-18, 2018, (sfb1102, c1, speech rate, information density, surprisal, duration, vowel distinctiveness, spectral emphasis). @article{Malisz2018, title = {Dimensions of segmental variability: interaction of prosody and surprisal in six languages}, author = {Zofia Malisz and Erika Brandt and Bernd M\"{o}bius and Yoon Mi Oh and Bistra Andreeva}, url = {https://www.frontiersin.org/articles/10.3389/fcomm.2018.00025/full}, year = {2018}, date = {2018-00-00}, journal = {Frontiers in Communication / Language Sciences}, volume = {3}, number = {25}, pages = {1-18}, note = {sfb1102, c1, speech rate, information density, surprisal, duration, vowel distinctiveness, spectral emphasis}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2017 |
Brandt, Erika; Zimmerer, Frank; Andreeva, Bistra; Möbius, Bernd Mel-cepstral distortion of German vowels in different information density contexts Inproceedings Proceedings of Interspeech (Stockholm, Sweden), 2017, (C1, sfb1102). @inproceedings{Brandt/etal:2017, title = {Mel-cepstral distortion of German vowels in different information density contexts}, author = {Erika Brandt and Frank Zimmerer and Bistra Andreeva and Bernd M\"{o}bius}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2018/08/brandt_etal_is2017-1-1.pdf}, year = {2017}, date = {2017-08-01}, booktitle = {Proceedings of Interspeech (Stockholm, Sweden)}, note = {C1, sfb1102}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Zimmerer, Frank; Andreeva, Bistra; Möbius, Bernd; Malisz, Zofia; Ferragne, Emmanuel; Pellegrino, François; Brandt, Erika Perzeption von Sprechgeschwindigkeit und der (nicht nachgewiesene) Einfluss von Surprisal Inproceedings Trouvain, Jürgen; Steiner, Ingmar; Möbius, Bernd (Ed.): Elektronische Sprachsignalverarbeitung 2017 - Tagungsband der 28. Konferenz, Saarbrücken, 15.-17. März 2017. Studientexte zur Sprachkommunikation, Band 86, pp. 174-179, 2017. @inproceedings{Zimmerer/etal:2017a, title = {Perzeption von Sprechgeschwindigkeit und der (nicht nachgewiesene) Einfluss von Surprisal}, author = {Frank Zimmerer and Bistra Andreeva and Bernd M\"{o}bius and Zofia Malisz and Emmanuel Ferragne and Fran\c{c}ois Pellegrino and Erika Brandt}, editor = {J\"{u}rgen Trouvain and Ingmar Steiner and Bernd M\"{o}bius}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2017/03/zimmerer_etal_essv2017.pdf}, year = {2017}, date = {2017-03-15}, booktitle = {Elektronische Sprachsignalverarbeitung 2017 - Tagungsband der 28. Konferenz, Saarbr\"{u}cken, 15.-17. M\"{a}rz 2017. Studientexte zur Sprachkommunikation, Band 86}, pages = {174-179}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2016 |
Le Maguer, Sébastien; Möbius, Bernd; Steiner, Ingmar; Lolive, Damien De l'utilisation de descripteurs issus de la linguistique computationnelle dans le cadre de la synthèse par HMM Inproceedings Proc. Journées d'Etudes sur la Parole, Paris, 2016. @inproceedings{Lemaguer/etal:2016b, title = {De l'utilisation de descripteurs issus de la linguistique computationnelle dans le cadre de la synth\`{e}se par HMM}, author = {S\'{e}bastien Le Maguer and Bernd M\"{o}bius and Ingmar Steiner and Damien Lolive}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2017/07/JEP_2016_paper_17.pdf}, year = {2016}, date = {2016-01-01}, booktitle = {Proc. Journ\'{e}es d'Etudes sur la Parole}, address = {Paris}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Le Maguer, Sébastien; Möbius, Bernd; Steiner, Ingmar Toward the use of information density based descriptive features in HMM based speech synthesis Inproceedings 8th International Conference on Speech Prosody, pp. 1029-1033, Boston, MA, USA, 2016. @inproceedings{LeMaguer2016SP, title = {Toward the use of information density based descriptive features in HMM based speech synthesis}, author = {S\'{e}bastien Le Maguer and Bernd M\"{o}bius and Ingmar Steiner}, url = {http://www.isca-speech.org/archive/SpeechProsody_2016/abstracts/190.html}, doi = {10.21437/SpeechProsody.2016-211}, year = {2016}, date = {2016-01-01}, booktitle = {8th International Conference on Speech Prosody}, pages = {1029-1033}, address = {Boston, MA, USA}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Schulz, Erika; Oh, Yoon Mi; Malisz, Zofia; Andreeva, Bistra; Möbius, Bernd Impact of Prosodic Structure and Information Density on Vowel Space Size Inproceedings Proceedings of Speech Prosody 2016 (Boston, MA, USA), pp. 350-354, 2016, (information density, vowel space, prosody, sfb1102). @inproceedings{Schulz/etal:2016a, title = {Impact of Prosodic Structure and Information Density on Vowel Space Size}, author = {Erika Schulz and Yoon Mi Oh and Zofia Malisz and Bistra Andreeva and Bernd M\"{o}bius}, url = {http://www.sfb1102.uni-saarland.de/wp/wp-content/uploads/2017/07/schulz_etal_sp2016-1.pdf}, year = {2016}, date = {2016-01-01}, booktitle = {Proceedings of Speech Prosody 2016 (Boston, MA, USA)}, pages = {350-354}, note = {information density, vowel space, prosody, sfb1102}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Malisz, Zofia; O'Dell, Michael; Nieminen, Tommi; Wagner, Petra Perspectives on speech timing: Coupled oscillator modeling of Polish and Finnish Journal Article Phonetica, 73, pp. 229-255, 2016. @article{Malisz/etal:2016, title = {Perspectives on speech timing: Coupled oscillator modeling of Polish and Finnish}, author = {Zofia Malisz and Michael O'Dell and Tommi Nieminen and Petra Wagner}, year = {2016}, date = {2016-00-00}, journal = {Phonetica}, volume = {73}, pages = {229-255}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2015 |
Schulz, Erika; Malisz, Zofia; Andreeva, Bistra; Möbius, Bernd Einfluss von Informationsdichte und prosodischer Struktur auf Vokalraumausdehnung Inproceedings Phonetik und Phonologie 11, Marburg, 2015. @inproceedings{pundp11, title = {Einfluss von Informationsdichte und prosodischer Struktur auf Vokalraumausdehnung}, author = {Erika Schulz and Zofia Malisz and Bistra Andreeva and Bernd M\"{o}bius}, year = {2015}, date = {2015-01-01}, booktitle = {Phonetik und Phonologie 11}, address = {Marburg}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Malisz, Zofia; Schulz, Erika; Oh, Yoon Mi; Andreeva, Bistra; Möbius, Bernd Dimensions of segmental variability: relationships between information density and prosodic structure Inproceedings Workshop "Modeling variability in speech", Stuttgart, 2015. @inproceedings{malisz15, title = {Dimensions of segmental variability: relationships between information density and prosodic structure}, author = {Zofia Malisz and Erika Schulz and Yoon Mi Oh and Bistra Andreeva and Bernd M\"{o}bius}, year = {2015}, date = {2015-01-01}, booktitle = {Workshop "Modeling variability in speech"}, address = {Stuttgart}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |