B6: Neural Feature and Representation Learning for Information Density Based Translationese Classification

The B6 continuation application addresses limitations of the information density framework implemented and evaluated during the first phase of the project, which relied on hand-crafted features. It uses neural network approaches to capture and explore information density-based textual features, namely surprisal, with applications to translationese identification and to machine translation evaluation and improvement. A systematic comparison between neural and standard count-based textual features will explore the information encoded in continuous representations obtained by unsupervised and end-to-end learning methods. Combining hand-crafted and neural information density features will extend the classic surprisal measure. In addition, neural approaches facilitate multi-granularity input representations and variable context sizes for computing surprisal. Within the CRC, an important part of project B6 is the analysis and visualisation of the representations learned by neural networks, so that they can be compared with features inspired by linguistic intuition.
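For readers unfamiliar with the measure, the surprisal of a word is its negative log probability given the preceding context, -log2 P(w | context). The following is a minimal sketch of a count-based version with a configurable context size. It is not the project's implementation: the function name `ngram_surprisal`, the add-alpha smoothing, and the `<s>` padding symbol are assumptions chosen for illustration.

```python
import math
from collections import Counter


def ngram_surprisal(train_tokens, test_tokens, n=2, alpha=1.0):
    """Return per-token surprisal (in bits) of test_tokens under an
    add-alpha smoothed n-gram model estimated from train_tokens.

    n controls the context size: each token is conditioned on the
    n - 1 tokens before it (n=1 uses no context at all).
    """
    # Hypothetical design choice: the vocabulary covers both texts so
    # that unseen test words still receive non-zero probability.
    vocab = set(train_tokens) | set(test_tokens)
    pad = ["<s>"] * (n - 1)

    # Count contexts and full n-grams in the training text.
    context_counts = Counter()
    ngram_counts = Counter()
    train = pad + list(train_tokens)
    for i in range(n - 1, len(train)):
        context = tuple(train[i - n + 1:i])
        context_counts[context] += 1
        ngram_counts[context + (train[i],)] += 1

    # Score each test token given its preceding context.
    test = pad + list(test_tokens)
    surprisals = []
    for i in range(n - 1, len(test)):
        context = tuple(test[i - n + 1:i])
        numerator = ngram_counts[context + (test[i],)] + alpha
        denominator = context_counts[context] + alpha * len(vocab)
        surprisals.append(-math.log2(numerator / denominator))
    return surprisals


if __name__ == "__main__":
    train = "the cat sat on the mat".split()
    test = "the cat".split()
    for n in (1, 2):
        print(n, ngram_surprisal(train, test, n=n))
```

A neural language model would replace the smoothed count ratio with the probability predicted by the network, which is what makes longer contexts and subword or character inputs practical.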

Publications

2020

Bizzoni, Yuri; Juzek, Tom S; España-Bonet, Cristina; Chowdhury, Koel Dutta; van Genabith, Josef; Teich, Elke

How Human is Machine Translationese? Comparing Human and Machine Translations of Text and Speech

The 17th International Workshop on Spoken Language Translation, Seattle, WA, United States, 2020.


2019

Lapshinova-Koltunski, Ekaterina; España-Bonet, Cristina; van Genabith, Josef

Analysing Coreference in Transformer Outputs

Proceedings of the Fourth Workshop on Discourse in Machine Translation (DiscoMT 2019), pp. 1-12, Association for Computational Linguistics, Hong Kong, 2019.


2016

Bojar, Ondřej; Chatterjee, Rajen; Federmann, Christian; Graham, Yvette; Haddow, Barry; Huck, Matthias; Yepes, Antonio Jimeno; Koehn, Philipp; Logacheva, Varvara; Monz, Christof; Negri, Matteo; Névéol, Aurélie; Neves, Mariana; Popel, Martin; Post, Matt; Rubino, Raphael; Scarton, Carolina; Specia, Lucia; Turchi, Marco; Verspoor, Karin; Zampieri, Marcos

Findings of the 2016 Conference on Machine Translation

Proceedings of the First Conference on Machine Translation, pp. 131-198, Association for Computational Linguistics, Berlin, Germany, 2016.


Rubino, Raphael; Lapshinova-Koltunski, Ekaterina; van Genabith, Josef

Information Density and Quality Estimation Features as Translationese Indicators for Human Translation Classification

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 960-970, Association for Computational Linguistics, 2016.


Rubino, Raphael; Degaetano-Ortlieb, Stefania; Teich, Elke; van Genabith, Josef

Modeling Diachronic Change in Scientific Writing with Information Density

Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 750-761, The COLING 2016 Organizing Committee, 2016.


Josef van Genabith

PI


Koel Dutta Chowdhury

PhD


Cristina España i Bonet

Member


Daria Pylypenko

Member
