Open Positions

Various PhD and Postdoc positions will be available starting from July 1st (see descriptions below).

Please check the official job advertisement at Saarland University’s website.

Please quote reference number W1376 in your application. Applications should be sent to:
Saarland University
Collaborative Research Center (SFB) 1102
Coordination Office
Universität des Saarlandes
Campus A2.2, 2.10
sfb1102@uni-saarland.de

Project A1: Neurobehavioural Correlates of Surprisal in Online Comprehension
PIs: Matthew Crocker and Harm Brouwer

This project continues to investigate how world knowledge about likely events and probabilistic linguistic experience combine to determine a person’s expectations, and thus “semantic surprisal”, during online sentence comprehension (Venhuizen, Crocker & Brouwer, 2018). In phase two, electrophysiological and behavioural experiments will be used to test the predictions, and inform the further development, of a neurocomputational model of sentence comprehension (Brouwer, Crocker, Venhuizen, & Hoeks, 2017) with the aim of explicitly linking semantic surprisal to the underlying neurocognitive processes that are indexed by event-related brain potentials (ERPs). We will carry out a set of neurophysiological and reading time experiments to test three predictions that follow from the model: 1) that the P600 is an index of interpretation-level surprisal, 2) that surprisal reflects the generation of forward and backward inferences, and 3) that surprisal reflects the interaction of world knowledge and linguistic experience. By arriving at a neurocomputational model that links electrophysiological and behavioural metrics of processing difficulty, this project thus seeks to illuminate the neurocognitive basis of surprisal.

PhD Student (TV-L 13 65%) – Experimental Psycholinguist: Candidates for this position should have completed a master’s degree in psycholinguistics or a related discipline. Experience with psycholinguistic methods, such as ERPs and/or eye-tracking, as well as associated inferential statistical analysis techniques is expected. As experiments are conducted primarily in German, a good working knowledge of the language is also desired.

Starting date: October 1st 2018

Project A3: Modelling the Information Density of Event Sequences in Texts
PIs: Stefan Thater, Alexander Koller, Vera Demberg

The goal of project A3 is to learn formalized script knowledge – a specific type of commonsense knowledge about prototypical sequences of events – from data, and to use it to improve algorithms for natural language processing and our understanding of linguistic encoding choice and interpretation in human communication. In the first phase of the project, we used crowdsourcing techniques to learn script knowledge. In the next phase, we will develop wide-coverage methods that allow script representations to be learned directly from raw data. We will develop NLP systems which process arbitrary texts, detect the script events that they describe (even implicitly), measure their information density (conditioned on script knowledge), and draw script-based inferences from them.

In a second line of research, we will add a deeper level of knowledge to our script representations by connecting events to their preconditions and effects. This will allow us to model the (in)coherence of texts in terms of long-term effects of script events; we will measure the accuracy of these models in a novel story generation task. Both of these lines of work will feed into empirical research in which we will measure what events experimental subjects infer from a story, and which events can be left out without making the story seem incoherent. We will build upon our wide-coverage models of script knowledge with preconditions and effects to develop quantitative cognitive models of linguistic encoding, extending the RSA model beyond toy domains for the first time.

PhD Student (TV-L 13 75%) – Candidates for this position should have completed an MSc degree in computational linguistics or computer science or a related discipline. A strong background in machine learning (including neural networks), excellent programming skills, and very good working knowledge of English are expected. The focus of this PhD project will be the extension of script representations with causal information (preconditions and effects) and the automatic induction of causal knowledge from corpora and crowdsourced data. A second potential topic is to learn wide-coverage script knowledge directly from unannotated corpora, by combining Bayesian techniques and deep learning. This will make it possible, for the first time, to infer events that the speaker left unsaid from text, at large scale. You will evaluate the quality and coverage of the learned script models in a number of NLP applications including question answering and natural language generation.

PhD Student (TV-L 13 65%) – Candidates for this position should have completed an MSc degree in psycholinguistics or computational linguistics, or a related discipline. Experience in experimental methodology and statistical analysis is expected. Background knowledge and interest in pragmatics is desired. The focus of this PhD project will be on experimental pragmatics and computational modelling using the Rational Speech Act (RSA) theory: What inferences are triggered by uttering redundant (inferable) material? How can we distinguish redundant material that triggers inferences from material that does not? Do inferences triggered by redundant material cause processing difficulty, and how can we measure this processing difficulty? What consequences does this have for computational models of pragmatics?

Starting date: July 1st 2018 (or later)

Project A4: Language Comprehension in a Noisy Channel
PIs: Vera Demberg, Jutta Kray, Dietrich Klakow

In realistic environments, language comprehension depends not only on the amount of information that needs to be transferred, but also on the quality of this information transfer. Existing research shows that a substantial portion of the comprehension difficulty in elderly adults might be due to perceptual problems with hearing or vision (when reading) and/or to cognitive problems. In the experimental part of the project, we plan to vary the quality of information transfer by degrading the auditory speech signal or by inducing environmental noise. We will compare groups of younger and older adults. In the modelling part, we propose a noisy channel model, consisting of a component that models comprehension at different levels of hearing ability, and a generation component that can model the confusability of words and can in turn optimize the system-generated output in order to minimize confusability of words, while also adapting the output to a target channel capacity.

PhD Student (TV-L 13 65%) – Candidates for this position should have completed a master’s degree in psychology or psycholinguistics, or a related discipline. Experience with psychophysiological methods, such as ERPs, as well as knowledge of statistical analysis techniques is expected. The experiments will be conducted in German, and younger as well as older adults will be tested. Therefore, some knowledge of the German language as well as background knowledge in cognitive and developmental psychology is desired.

PhD Student (TV-L 13 65%) – Candidates for this position should have completed an MSc degree in phonetics, psycholinguistics or computational linguistics, or a related discipline. Experience with eye-tracking as well as knowledge of statistical analysis is expected. Background knowledge in phonetics and experience in working with spoken language are a plus. Experiments will be conducted in German. A focus of this PhD project will be on the phenomenon of “false hearing” in various noise conditions, involving younger as well as older adults. False hearing refers to cases where the hearer is confident of having heard something other than what the speaker said. Furthermore, this PhD student will investigate the role of uniform information density in understanding infrequent words in noisy conditions and with limited attention (during dual tasking).

PhD Student (TV-L 13 75%) – Candidates for this position should have completed an MSc degree in computer science or computational linguistics, or a related discipline. A strong background in machine learning (including neural networks) is expected. This PhD project will focus on quantifying the phonetic similarity between words and utterances, and on creating a natural language generation system which avoids generating utterances that are auditorily difficult to understand or can easily be mistaken for something else.

Starting date: July 1st 2018 (or later)

Project A5: The Role of Language Experience and Visual Context in Surprisal
PIs: Maria Staudte, Jutta Kray and Nivedita Mani

This project will examine the interplay of linguistic and visual context in determining surprisal, and their interdependence with language development and individual differences. The goal is to go beyond exploring the probabilistic (language) system that adults have established over time and consider the development of such a system in childhood: Does limited experience lead to simplified predictions, which would likely lead to more frequent (prediction) errors eliciting high surprisal? The project will also examine the effect of prediction, and prediction error, on acquiring and storing novel word meanings in both children and adults. Building on the findings from the first phase, the relation between a word’s expectancy and its induced cognitive load, as well as the role of visual context and the individual linguistic and cognitive abilities, will be considered. Exploring this across development will contribute to our understanding of how and when these factors interact with each other and potentially provide insights into possible connections between prediction, error/surprisal and learning.

PhD Student (TV-L 13 65%) – Experimental Psycholinguist/Psychologist: Candidates should have a master’s degree in psycholinguistics, psychology or a related programme, ideally with some experience in using ERP methods and with statistical analysis. Since experiments will be conducted in German and with children, a good working knowledge of German is necessary.

Starting date: July 1st 2018 (or later)

Project A6: The Role of Semantic Surprisal for Memory Formation and Retrieval
PI: Axel Mecklinger

The main goal of this project is to explore how semantic surprisal of a new event in a given sentence context modulates the formation and retrieval of episodic memories. Our key assumptions are (i) that semantically incongruent and congruent events differ in their expectedness during online comprehension, (ii) that the ensuing semantic surprisal is a key determinant for the formation of new episodic memories and (iii) that events differing in semantic surprisal affect memory formation by different mechanisms. We plan to use behavioural and event-related brain potential (ERP) measures to explore how different aspects of semantic congruency modulate memory formation. We plan to carry out combined ERP and memory experiments to address the following main research questions: (1) Do different forms of semantic surprisal affect memory formation by the same or by different mechanisms? (2) Does semantic surprisal not only modulate the way events are encoded but also affect how events are subsequently remembered? (3) Can the mnemonic consequences of semantic surprisal be generalized to tests of implicit memory? (4) Does new fictional knowledge support the formation of new memories in a similar way as previously established world knowledge does? (5) How do contextual factors influence the effects of semantic surprisal on memory formation and retrieval? By illuminating the mnemonic consequences of semantic surprisal, the project aims to bridge the gap between experimental research on memory and psycholinguistic research.

Two PhD Students (TV-L 13 65%): Candidates for these two positions should have a master’s degree in psychology, psycholinguistics or a related discipline. Experience with psychophysiological methods (ideally ERP) and with experimental psychological or psycholinguistic research, as well as a good background in statistical analysis, is expected. As the experiments will be conducted in German, good German language skills are necessary.

Starting date: August 1st 2018 (or later)

Project A7: Controlling Information Density in Discourse Generation
PIs: Alexander Koller, Jörg Hoffmann

The aim of this project is to develop a system that generates instructions in the context of the computer game “Minecraft”. In generating technical instructions, it is essential to strike a balance between communicative risk and communicative efficiency: The speaker would like to help the listener get the job done quickly (efficiency), but needs to ensure that the listener can also reliably understand these instructions (risk avoidance). We will develop a probabilistic model of a rational listener, who interprets an utterance by updating a prior probability distribution over what the speaker may have meant. We will define communicative risk and efficiency in these terms, and develop a natural language generation (NLG) system for Minecraft instruction videos which balances efficiency and risk and brings together methods from chart-based sentence generation and AI planning.

Postdoctoral Researcher (TV-L E13 100%) – Candidates for this position should have completed a PhD in computational linguistics or computer science, or a related discipline. Expertise in the design and implementation of efficient algorithms for natural language generation, parsing, or machine translation and the ability to communicate clearly in spoken and written English are required. Background knowledge in natural language generation, dialog systems, artificial intelligence (in particular planning), and/or machine learning is desired.

The focus of this position will be on designing and implementing a sentence generation system for Minecraft instructions. This system will be called frequently by a discourse planner (developed by a PhD student in the same project), and will thus need to run extremely efficiently while generating sentences that realize a given risk-efficiency tradeoff. You will integrate this system with a module for generating visual representations of the generated instructions (obtaining a complete system for generating instruction videos), and coordinate the evaluation of the overall system.

Starting date: July 1st 2018 (or later)

Project B1: Information Density in English Scientific Writing: A Diachronic Perspective
PI: Elke Teich (supporting staff: Stefania Degaetano-Ortlieb, Katrin Menzel)

Project B1 is concerned with modeling the temporal dynamics of English from the late Modern period onwards, focusing on the language of science. Language models (information density, surprisal) are employed as instruments for capturing patterns of diachronic variation using a corpus specifically compiled for the project, the Royal Society Corpus (RSC; Kermes et al., 2016). Results show that typical features exhibit different levels of productivity over time indexed by ID, indicating phases of change (linguistic expansion vs. consolidation) (cf. Fankhauser et al., 2014; Degaetano & Teich, 2016). In this phase of the project, we want to assess the role of ID in favoring/impeding change and investigate effects of change at different linguistic levels and interactions (e.g. between morphology and syntax). For this, the repertoire of suitable models will have to be extended (language models on different linguistic levels, embeddings, neural models) to enhance the analysis of temporal dynamics.

Two types of knowledge and skills are needed in the project: diachronic corpus linguistics (with good computational skills) and computational language modeling (with good linguistic skills). Two positions are available, a Postdoc (TV-L 13 50%) and a PhD position (TV-L 13 65%). Depending on qualifications, either the PhD student or the Postdoc may fill the corpus-linguistic position, with the other filling the computational position.

The formal prerequisite for a PhD student position is a Master’s degree in English Linguistics, Computational Linguistics, or a related discipline. Experience with corpus-linguistic methods and/or computational language modeling as well as statistical analysis is expected. A good working knowledge of German is desired; excellent knowledge of English is required.

For a Postdoc position, the formal prerequisite is a PhD in English Linguistics, Computational Linguistics, or a related discipline. Experience with corpus-linguistic methods and/or computational language modeling as well as statistical analysis is expected. A good working knowledge of German is desired; excellent knowledge of English is required.

Starting date: July 1st 2018 (or later)

Project B6: Neural Feature and Representation Learning for Information Density Based Translationese Classification
PIs: Josef van Genabith, Raphael Rubino

In this project we are using deep learning and information theory to model key aspects of human and machine translation. The objective is to better understand human translation to improve machine translation.

PhD Student (TV-L 13 75%) – Deep Learning for Modelling Machine Translation and Human Translation. Requirements: MSc/MA (or BSc/BA) in Natural Language Processing, AI or Computer Science. Strong programming, mathematics, problem-solving, creative and analytic capabilities, and independent thinking. Strong interest and expertise in language, natural language processing, statistics, machine learning, and deep learning. Able to work well in a team. Good writing and communication skills in English. German language skills are a plus, but not required.

Postdoctoral Researcher (TV-L 13 50%) – Deep Learning for Modelling Machine Translation and Human Translation. Requirements: PhD in Machine Translation, Natural Language Processing, Machine Learning, AI or Computer Science. Strong problem-solving, creative and analytic capabilities and independent thinking, together with strong skills in mathematics, statistics, machine learning, deep learning and software development. Strong track record in international peer-reviewed publications. Able to work well in a team. Good writing and communication skills in English. German language skills are a plus, but not required.

Project B7: Modelling Human Translation with a Noisy Channel
PI: Elke Teich (supporting staff: Mihaela Vela)

Project B7 is concerned with modeling human translation from the perspective of rational communication. With this topic, the project follows a trend in current empirical translation research taking up information theory (cf. Carl et al., 2016). Research in the project involves (i) building noisy channel models of translation (as used in statistical machine translation) to investigate translation adequacy and relating the results to “translationese” effects (notably shining through and normalization) and (ii) formalizing translation complexity based on entropy with possible interpretations in terms of translation difficulty (cf. Martinez-Martinez & Teich, 2017).

Depending on qualification level, the position available in this project is a PhD (TV-L 13 65%) or Postdoc position (TV-L 13 50%). A background in corpus-based translatology and/or computational language modeling/machine translation is required. For a PhD position, the formal prerequisite is a Master’s degree in Translatology, Corpus Linguistics, Computational Linguistics or a related discipline; for a Postdoc position, the formal prerequisite is a PhD in Translatology, Corpus Linguistics, Computational Linguistics or a related discipline. Expected knowledge and skills are experience in corpus building, corpus-based translation analysis (“translationese”) and/or statistical machine translation as well as statistical analysis (e.g. with R).

Starting date: July 1st 2018 (or later)

Project C1: Information Density and the Predictability of Phonetic Structure
PIs: Bernd Möbius, Bistra Andreeva

Project C1 is concerned with the relation between information density (ID) and phonetic encoding. In the first funding period we have analysed the effects of ID on phonetic encoding, in particular on segmental duration, vowel dispersion, vocalic spectral emphasis, and consonantal center of gravity, while controlling for several basic factors related to the prosodic structure. In phase two we will incorporate a more elaborate account of the prosodic hierarchy and its interaction with the information density profile of utterances. Another extension is the investigation of the effects of channel characteristics and listener orientation on speakers’ productions, complemented by an analysis of listeners’ exploitation of the predicted enhancements that speakers implement. With respect to methodology, we will explore a continuous feature extraction scheme that can accurately quantify the contribution of individual features of speech signals, and their combination, to the informativity of units of speech. Finally, we plan to investigate the combination of language models on different sub-word and supra-word levels.

Postdoctoral Researcher (TV-L 13 100%) – Phonetician or Speech Scientist: Candidates for this position should have completed their Ph.D. in phonetics, speech science, or a related discipline. Experience with acoustic-phonetic analyses, experimental and statistical methods, and an interest in computational modeling are expected. A good command of English is mandatory. Working knowledge of German is desirable but not a prerequisite. Candidates must have completed their Ph.D. by the time of the appointment.

Starting date: July 1st 2018 (or later)

PhD Student (TV-L 13 65%) – Phonetician or Speech Scientist: Candidates for this position should have completed a master’s degree in phonetics, speech science, or a related discipline. The successful candidate is expected to prepare a PhD dissertation on the interaction of ID effects and the prosodic hierarchy.

Starting date: November 1st 2018

Project C3: Rational Encoding and Decoding of Referring Expressions
PIs: Matthew Crocker and Heiner Drenhaus

This project investigates visually situated language processing – how people produce and comprehend utterances that directly reference the objects in the visual context – examining the extent to which information-theoretic accounts can explain encoding and decoding behaviours. Two advantages of situated comprehension are that (i) it is easier to ensure the meaning equivalence of alternative encodings, and (ii) it enables the more controlled investigation of production, comprehension, and interactive communication tasks. From a theoretical perspective, referring expressions allow us to examine key information-theoretic aspects of syntactic encoding—the linearization of information or surprisal (in pre- versus post-nominal expressions) and redundancy (overspecification)—in order to evaluate the predictions of both bounded (e.g. UID) and pragmatic (e.g., RSA; Frank and Goodman, 2012) accounts of rational communication. More specifically, we examine the extent to which RSA scales to both more complex and natural online communicative tasks, and when limited cognitive resources result in more bounded rational behaviours. The project exploits a range of dependent measures in order to assess differences in the nature of online cognitive load resulting from situated surprisal, and will also develop a neurocomputational model (jointly with project A1) which aims to instantiate bounded rational comprehension and establish explicit links to behavioural and neurophysiological indices of processing.

PhD Student (TV-L 13 65%) – Experimental Psycholinguist: Candidates for this position should have completed a master’s degree in psycholinguistics or a related discipline. Experience with psycholinguistic methods, such as ERPs and/or eye-tracking, as well as associated inferential statistical analysis techniques is expected. While some knowledge of German would be an asset, it is not strictly essential.

Starting date: October 1st 2018

Project C4: Mutual Intelligibility and Surprisal in Slavic Intercomprehension
PIs: Tania Avgustinova, Bernd Möbius, Dietrich Klakow

The research project is concerned with the analysis of cross-lingual mutual intelligibility between Slavic languages. It studies the auditory-perceptual intercomprehension of Slavic languages based on analyses of the acoustic, phonetic and phonological structure of spoken utterances. This line of investigation will be complemented by using adaptation techniques established in speech synthesis and recognition to measure the distance between languages. In addition, similarity will be determined on the level of complete utterances.

Postdoctoral Researcher (TV-L 13 100%) – The successful candidate should have a Ph.D./Master’s in Computer Science, Computational Linguistics, or a related discipline, with a strong background in speech science and speech technology, in particular TTS and ASR. Strong programming skills are essential. A good command of English is mandatory. Working knowledge of German is desirable but not a prerequisite. Candidates must have completed their Ph.D. by the time of the appointment.

PhD Student (TV-L 13 65%) – The successful candidate should have a background in computational linguistics or machine translation, with experience in multilingual applications involving Slavic languages, and is expected to prepare a PhD dissertation in this domain.

Starting date: October 1st 2018 (or later)