B4: Modeling and Measuring Information Density

Project B4 is concerned with the computational modeling of information density by means of language models. The main goal of the project is to improve current language modeling approaches by developing a more sophisticated notion of context. While existing models stay within the sentence boundary, this project extends the notion of context to much larger stretches of text. The proposed approach further allows for the gradual development and forgetting of context as the text evolves. As a secondary goal, project B4 will provide a toolbox of standard language modeling techniques. Both the new models and the toolbox will be used in many other CRC projects to obtain measures of information density.



Klakow, Dietrich; Trost, Thomas

Parameter Free Hierarchical Graph-Based Clustering for Analyzing Continuous Word Embeddings Inproceedings

Workshop Proceedings of TextGraphs-11: Graph-based Methods for Natural Language Processing (Workshop at ACL 2017), 2017.



Singh, Mittul; Greenberg, Clayton; Oualil, Youssef; Klakow, Dietrich

Sub-Word Similarity based Search for Embeddings: Inducing Rare-Word Embeddings for Word Similarity Tasks and Language Modelling Inproceedings

Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, The COLING 2016 Organizing Committee, Osaka, Japan, 2016.

Varjokallio, Matti; Klakow, Dietrich

Unsupervised morph segmentation and statistical language models for vocabulary expansion Inproceedings

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Association for Computational Linguistics, Berlin, Germany, 2016.

Sayeed, Asad; Greenberg, Clayton; Demberg, Vera

Thematic fit evaluation: an aspect of selectional preferences Inproceedings

Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, 2016.


Schneegass, Stefan; Oualil, Youssef; Bulling, Andreas

SkullConduct: Biometric User Identification on Eyewear Computers Using Bone Conduction Through the Skull Inproceedings

Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, ACM, New York, NY, USA, 2016.

Oualil, Youssef; Greenberg, Clayton; Singh, Mittul; Klakow, Dietrich

Sequential recurrent neural networks for language modeling Inproceedings

Interspeech 2016, 2016.


Singh, Mittul; Greenberg, Clayton; Klakow, Dietrich

The Custom Decay Language Model for Long Range Dependencies Book Chapter

Sojka, Petr; Horák, Aleš; Kopeček, Ivan; Pala, Karel (Ed.): Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Brno, Czech Republic, September 12-16, 2016, Proceedings, Springer International Publishing, Cham, 2016.


Greenberg, Clayton; Demberg, Vera; Sayeed, Asad

Verb Polysemy and Frequency Effects in Thematic Fit Modeling Inproceedings

Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics, Association for Computational Linguistics, Denver, Colorado, 2015.

Greenberg, Clayton; Sayeed, Asad; Demberg, Vera

Improving Unsupervised Vector-Space Thematic Fit Evaluation via Role-Filler Prototype Clustering Inproceedings

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Denver, Colorado, 2015.

Oualil, Youssef; Schulder, Marc; Helmke, Hartmut; Schmidt, Anna; Klakow, Dietrich

Real-Time Integration of Dynamic Context Information for Improving Automatic Speech Recognition Inproceedings

INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, 2015.

