B4: Modeling and Measuring Information Density

Classical language models predict a word given a sequence of preceding words. We will extend this to condition on knowledge from the environment, that is, to condition not only on the linguistic context but also on context from the real world. In one branch of the project, we will consider language models that also condition on an image. Knowledge of the image in whose context the text was produced should help to predict the next word. In a second branch of the project, we will consider knowledge bases, question-answer data sets and states of a game as additional context. For example, the surprisal and the predictability of an utterance like “Pawn from E2 to E4” depend on the present state of a chess game.
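
The effect of extra-linguistic context on surprisal can be illustrated with a minimal sketch. It assumes surprisal is measured in bits as the negative base-2 logarithm of the probability a model assigns to the utterance. All probabilities and state names below are invented for illustration and do not come from any model in the project.

```python
import math


def surprisal(probability: float) -> float:
    """Surprisal in bits: -log2(p). Impossible events get infinite surprisal."""
    if probability <= 0.0:
        return math.inf
    return -math.log2(probability)


UTTERANCE = "Pawn from E2 to E4"

# Hypothetical probability when conditioning on the linguistic context only.
p_text_only = 0.0625

# Hypothetical probabilities when also conditioning on the board state.
p_given_state = {
    "opening position": 0.5,       # assumed: a very common first move
    "E2 pawn already moved": 0.0,  # assumed: the move is no longer possible
}

text_only_bits = surprisal(p_text_only)
state_bits = {state: surprisal(p) for state, p in p_given_state.items()}

print(f"{UTTERANCE!r}, text only: {text_only_bits:.1f} bits")
for state, bits in state_bits.items():
    print(f"{UTTERANCE!r}, given {state!r}: {bits:.1f} bits")
```

In this toy setting the same utterance is less surprising than under the text-only estimate in one game state and maximally surprising in another, which is the dependence on context that the project aims to model.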

Publications

2017

Klakow, Dietrich; Trost, Thomas

Parameter Free Hierarchical Graph-Based Clustering for Analyzing Continuous Word Embeddings. Inproceedings

In Workshop Proceedings of TextGraphs-11: Graph-based Methods for Natural Language Processing (Workshop at ACL 2017), 2017.

Oualil, Youssef; Klakow, Dietrich

A batch noise contrastive estimation approach for training large vocabulary language models Inproceedings

18th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2017.

Singh, Mittul; Oualil, Youssef; Klakow, Dietrich

Approximated and domain-adapted LSTM language models for first-pass decoding in speech recognition Inproceedings

18th Annual Conference of the International Speech Communication Association (INTERSPEECH), Stockholm, Sweden, 2017.

Oualil, Youssef; Klakow, Dietrich

A neural network approach for mixing language models Inproceedings

ICASSP 2017, 2017.

2016

Singh, Mittul; Greenberg, Clayton; Oualil, Youssef; Klakow, Dietrich

Sub-Word Similarity based Search for Embeddings: Inducing Rare-Word Embeddings for Word Similarity Tasks and Language Modelling Inproceedings

Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 2061–2070, The COLING 2016 Organizing Committee, Osaka, Japan, 2016.

Varjokallio, Matti; Klakow, Dietrich

Unsupervised morph segmentation and statistical language models for vocabulary expansion Inproceedings

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 175–180, Association for Computational Linguistics, Berlin, Germany, 2016.

Sayeed, Asad; Greenberg, Clayton; Demberg, Vera

Thematic fit evaluation: an aspect of selectional preferences Journal Article

Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, pp. 99–105, 2016, ISBN: 9781945626142.

Schneegass, Stefan; Oualil, Youssef; Bulling, Andreas

SkullConduct: Biometric User Identification on Eyewear Computers Using Bone Conduction Through the Skull Inproceedings

Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 1379–1384, ACM, New York, NY, USA, 2016, ISBN: 978-1-4503-3362-7.

Oualil, Youssef; Greenberg, Clayton; Singh, Mittul; Klakow, Dietrich

Sequential recurrent neural networks for language modeling Journal Article

Interspeech 2016, pp. 3509–3513, 2016.

Singh, Mittul; Greenberg, Clayton; Klakow, Dietrich

The Custom Decay Language Model for Long Range Dependencies Book Chapter

Sojka, Petr; Horák, Aleš; Kopeček, Ivan; Pala, Karel (Eds.): Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Brno, Czech Republic, September 12-16, 2016, Proceedings, pp. 343–351, Springer International Publishing, Cham, 2016, ISBN: 978-3-319-45510-5.

Oualil, Youssef; Singh, Mittul; Greenberg, Clayton; Klakow, Dietrich

Long-short range context neural networks for language models Inproceedings

EMNLP 2016, 2016.

2015

Greenberg, Clayton; Demberg, Vera; Sayeed, Asad

Verb Polysemy and Frequency Effects in Thematic Fit Modeling Inproceedings

Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics, pp. 48–57, Association for Computational Linguistics, Denver, Colorado, 2015.

Greenberg, Clayton; Sayeed, Asad; Demberg, Vera

Improving Unsupervised Vector-Space Thematic Fit Evaluation via Role-Filler Prototype Clustering Inproceedings

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 21–31, Association for Computational Linguistics, Denver, Colorado, 2015.

Oualil, Youssef; Schulder, Marc; Helmke, Hartmut; Schmidt, Anna; Klakow, Dietrich

Real-Time Integration of Dynamic Context Information for Improving Automatic Speech Recognition Inproceedings

INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, 2015.

Dietrich Klakow

PI

Clayton Greenberg

PhD

Thomas Trost

Postdoc

Aditya Mogadala

Postdoc