Information Density and Linguistic Encoding

Language not only provides the expressiveness needed to communicate, but also offers speakers a multitude of choices regarding how they may encode their messages – from the choice of words, through the structuring of syntactic elements, to the arrangement of sentences in discourse. The SFB addresses the hypothesis that language variation and language use can be better understood in terms of the goal of speakers to modulate the amount of information conveyed in an utterance. While previous efforts have sought to understand language systems and their use in terms of complexity, the definition of this notion is often imprecise and specific to particular linguistic levels. More recently, however, evidence has emerged that the ease of processing linguistic material is correlated with its contextually determined predictability. This has led to the hypothesis that complexity may be appropriately indexed by Shannon’s notion of information, referred to in recent linguistic work as surprisal. The SFB investigates the hypotheses that

(i) processing complexity is indexed by surprisal across linguistic levels, and

(ii) variation in language use may be characterised by the optimal distribution of information across the linguistic signal.

Under this view, speakers exploit possible variation in their linguistic encoding – modulating the order, density and specificity of their expressions – so as to avoid informational peaks and troughs that result in inefficient communication. This view naturally extends to all aspects and levels of linguistic communication, thus offering the potential for a deeper understanding of the relationship between the nature of variation offered by our linguistic systems and the way it is exploited in language use. Crucially, just as the surprisal of linguistic material can be determined at different levels of granularity, from phonemes to phrases to entire propositions, so do speakers have encoding choices that span these levels – from varying properties of the acoustic realisation to the broader structuring of the discourse. The aim of the SFB is thus to investigate the extent to which notions of surprisal and the optimal distribution of information offer a unifying explanation of observed patterns of variation in language use within and across linguistic levels, and in a range of communicative settings.
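
To make the notion of surprisal concrete, the following minimal sketch computes the surprisal of each word in a sentence as the negative base-2 logarithm of its probability given the preceding word. The toy corpus, the bigram model with add-one smoothing, and the function names are assumptions chosen purely for illustration; they are not the models or methods used in the SFB.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenised sentences (toy model)."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens  # "<s>" marks the sentence start
        vocab.update(padded)
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(prev, word, context_counts, bigram_counts, vocab):
    """Surprisal in bits: -log2 P(word | prev), with add-one smoothing."""
    p = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))
    return -math.log2(p)


def surprisal_profile(tokens, model):
    """Per-word surprisal values for one sentence."""
    padded = ["<s>"] + tokens
    return [surprisal(prev, word, *model) for prev, word in zip(padded, padded[1:])]


# Hypothetical three-sentence corpus, for illustration only.
corpus = [
    ["the", "dog", "barked"],
    ["the", "dog", "slept"],
    ["the", "cat", "slept"],
]
model = train_bigram(corpus)
profile = surprisal_profile(["the", "dog", "barked"], model)

for word, bits in zip(["the", "dog", "barked"], profile):
    print(f"{word}: {bits:.2f} bits")
```

In this toy setting the highly predictable sentence-initial "the" carries the least information and the less predictable "barked" the most. A sequence of such values is one simple way to picture the informational peaks and troughs described above, here at the word level only.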
