Causal inferences under uncertainty in speech perception and production

T. Florian Jaeger - University of Rochester

T. Florian Jaeger
Brain and Cognitive Sciences, Computer Science,
University of Rochester

[This work is based on collaborations with Wednesday Bushong, Esteban Buz, Linda Liu, and Michael K. Tanenhaus, and funded by NSF CAREER IIS-1150028 and NIH R01 HD075797-01]


Although I focus on low-level speech processing, the questions I discuss apply to higher levels of language understanding and production as well. In the first part of the talk, I focus on speech perception and, specifically, the learning mechanisms underlying adaptation to talker-specific differences in the boundaries between phonetic categories. I present evidence that listeners can draw on inferences about the causes of unexpected pronunciations. Rather than merely passively integrating perceptual evidence from a novel talker into a talker-specific acoustic model, listeners seem to consider alternative causes for the evidence they perceive. I also show that listeners can maintain uncertainty about alternative causes for surprisingly long times, suggesting that the memory systems underlying speech perception are less bounded than is often assumed.

In the second part of the talk, I turn to speech production. I focus on millisecond-level changes to the speech signal due to context-driven hyper-articulation. I present evidence that even such subtle (and likely highly automatic) modulations are sensitive to causal inferences. Specifically, speakers make targeted changes to their articulations based on what they perceive to have been the likely cause of previous miscommunications. These hyper-articulations, I show, are smart solutions for a system that is implemented through a noisy motor system and thus has uncertainty about the perceptual outcome of its motor plans.

Both case studies I present argue for systems that allow ‘smart’ causal inference to permeate low-level aspects of linguistic de- and encoding. Both studies also highlight that these ‘smart’ inferences are often conducted under uncertainty. Failing to keep this in mind, as often happens in debates about, e.g., audience design and perspective taking, leads to misleading conclusions.

If you would like to meet with the speaker, please contact Mihaela Vela.

Key references:

Bushong, W. and Jaeger, T. F. 2017. Maintenance of Perceptual Information in Speech Perception. In XXX (eds.) Proceedings of the 39th Annual Meeting of the Cognitive Science Society (CogSci17), XXX-XXX. Austin, TX: Cognitive Science Society.

Buz, E., Tanenhaus, M. K., and Jaeger, T. F. 2016. Dynamically adapted context-specific hyper-articulation: Feedback from interlocutors affects speakers’ subsequent pronunciations. Journal of Memory and Language 89, 68-86. [10.1016/j.jml.2015.12.009, IF: 4.014]

Liu, L. and Jaeger, T. F. 2017. Speech perception as causal inference under uncertainty. Ms., University of Rochester.

Seyfarth, S., Buz, E., and Jaeger, T. F. 2016. Dynamic hyperarticulation of coda voicing contrasts. Journal of the Acoustical Society of America 139(2), EL31-37. [10.1121/1.4942544, IF: 1.503]
