B6: Neural Feature and Representation Learning for Information Density Based Translationese Classification

The continuation phase of project B6 addresses limitations of the information density framework that was implemented and evaluated during the first phase of the project and that relies on hand-crafted features. It uses neural network approaches to capture and explore information density-based textual features, namely surprisal, with applications to translationese identification and to machine translation evaluation and improvement. A systematic comparison between neural and standard count-based textual features will be conducted to explore the information encoded in continuous representations obtained by unsupervised and end-to-end learning methods. Combining hand-crafted and neural information density features will extend the classic surprisal measure. In addition, neural approaches make it possible to compute surprisal from input representations at multiple granularities and with various context sizes. An important part of project B6 within the CRC is the analysis and visualisation of the representations learned by neural networks, so that they can be compared with features inspired by our linguistic intuitions.

Publications

2016

Bojar, Ondřej; Chatterjee, Rajen; Federmann, Christian; Graham, Yvette; Haddow, Barry; Huck, Matthias; Jimeno Yepes, Antonio; Koehn, Philipp; Logacheva, Varvara; Monz, Christof; Negri, Matteo; Névéol, Aurélie; Neves, Mariana; Popel, Martin; Post, Matt; Rubino, Raphael; Scarton, Carolina; Specia, Lucia; Turchi, Marco; Verspoor, Karin; Zampieri, Marcos

Findings of the 2016 Conference on Machine Translation

Proceedings of the First Conference on Machine Translation, pp. 131–198, Association for Computational Linguistics, Berlin, Germany, 2016.

Rubino, Raphael; Lapshinova-Koltunski, Ekaterina; van Genabith, Josef

Information Density and Quality Estimation Features as Translationese Indicators for Human Translation Classification

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 960–970, Association for Computational Linguistics, 2016.

Rubino, Raphael; Degaetano-Ortlieb, Stefania; Teich, Elke; van Genabith, Josef

Modeling Diachronic Change in Scientific Writing with Information Density

Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 750–761, The COLING 2016 Organizing Committee, 2016.

Josef van Genabith

PI

Raphaël Rubino

PI

Ahmad Taie

Member