Computational Linguistics and Digital Humanities: Chances and Challenges
Caroline Sporleder - Georg-August-Universität Göttingen
Department of Computational Linguistics and Digital Humanities
Digital Humanities (DH) is a field that has grown immensely in recent years. It is also a very diverse field, covering (in its broadest definition) everything from corpus linguistics and computational philology to quantitative history and computational archaeology. Because the field has its roots in corpus linguistics and computational philology, and because data in the Humanities and Social Sciences are often (but not always) textual, digital text representation, processing, and mining are a major area of attention. Computational linguistics (CL) has a lot to contribute here, both at the lower end of the scale (e.g., tools for OCR error correction and preprocessing) and at the higher end (e.g., sophisticated text mining tools). Computational linguistics can also benefit from evaluating its algorithms and tools on data from the Humanities, as these data are often difficult, e.g., due to non-standard language and spelling, missing sentence boundaries, noisy input, and domains that differ from those typically considered in CL. Hence, CL for DH requires the development of very robust methods that work well on noisy data and do not require large amounts of training data. In this talk, I will address some of the chances and challenges that arise when applying computational linguistic methods to data from the Humanities and Social Sciences.
If you would like to meet with the speaker, please contact Annemarie Friedrich.