Can machine translation be evaluated by the crowd alone?
Yvette Graham - School of Computing, Dublin City University
Crowd-sourced assessments of machine translation quality allow evaluations to be carried out cheaply and on a large scale. It is essential, however, that the crowd’s work be filtered to avoid contamination of results through the inclusion of false assessments. One method is to filter via agreement with experts, but even amongst experts, agreement levels may not be high. In this talk, I will present a new methodology for crowd-sourcing human assessments of translation quality, which allows individual workers to develop their own assessment strategy. Agreement with experts is no longer required, and a worker is deemed reliable if they are consistent relative to their own previous work. Individual translations are assessed in isolation from all others in the form of direct estimates of translation quality. This allows more meaningful statistics to be computed for systems and enables significance to be determined on smaller sets of assessments. We demonstrate the methodology’s feasibility in large-scale human evaluation through replication of the human evaluation component of the WMT shared translation task for two language pairs, Spanish-to-English and English-to-Spanish. Results for measurement based solely on crowd-sourced assessments show system rankings in line with those of the original evaluation. Comparison of results produced by the relative preference approach and the direct estimate method described here demonstrates that the direct estimate method has a substantially increased ability to identify significant differences between translation systems. In addition, the talk will cover how this method of evaluation can be adapted for segment-level assessment of machine translation.
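As a rough illustration of the idea of judging a worker against their own previous work, the sketch below compares the scores a worker gave to the same translations on two occasions, then averages direct estimates per system using only workers who pass. The 0-100 rating scale, the mean-absolute-difference criterion, the threshold value, and all function names are assumptions made for this example; the abstract does not specify them, and the methodology presented in the talk may use a different consistency test.

```python
from statistics import mean


def is_self_consistent(first_pass, second_pass, max_mean_diff=10.0):
    """Return True if a worker's repeated ratings of the same items are close.

    first_pass / second_pass: scores (assumed 0-100) given by one worker to
    the same translations on two occasions. The threshold is hypothetical.
    """
    diffs = [abs(a - b) for a, b in zip(first_pass, second_pass)]
    if not diffs:
        return False  # no repeated items, so consistency cannot be judged
    return mean(diffs) <= max_mean_diff


def system_scores(assessments, reliable_workers):
    """Average the direct estimates per system, keeping reliable workers only.

    assessments: iterable of (worker_id, system_id, score) tuples.
    """
    by_system = {}
    for worker, system, score in assessments:
        if worker in reliable_workers:
            by_system.setdefault(system, []).append(score)
    return {system: mean(scores) for system, scores in by_system.items()}


# Hypothetical repeated ratings for two workers.
repeats = {
    "w1": ([80, 40, 60], [78, 45, 58]),
    "w2": ([10, 90, 50], [85, 20, 55]),
}
reliable = {w for w, (a, b) in repeats.items() if is_self_consistent(a, b)}

# Hypothetical direct estimates of individual translations.
ratings = [
    ("w1", "sysA", 80), ("w1", "sysB", 40),
    ("w2", "sysA", 10), ("w2", "sysB", 90),
]
print(reliable, system_scores(ratings, reliable))
```

Because each translation receives its own score rather than a rank relative to others, the per-system lists of scores can be passed directly to a standard significance test.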
If you would like to meet with the speaker, please contact Raphael Rubino.
Yvette Graham is a natural language processing researcher at Dublin City University. Her research interests include machine translation, quality-controlled crowd-sourcing and evaluation of natural language processing systems. She obtained an M.Sc. in Computational Linguistics from Trinity College Dublin in 2005 and a Ph.D. from Dublin City University in 2011.