The name of the game is comparable corpora

Ruslan Mitkov, University of Wolverhampton

 

Comparable corpora are the most versatile and valuable resource for multilingual Natural Language Processing. The speaker will argue that comparable corpora can support a wider range of applications than has been demonstrated so far in the state of the art. The talk will present completed and ongoing work conducted by the speaker and colleagues from his research group where comparable corpora are employed for different tasks including but not limited to the identification of cognates and false friends, validation of translation universals, language change and translation of multiword expressions. The presentation will conclude with the results of an interesting experiment as part of this study which sought to establish whether large loosely comparable data would yield better results than smaller but strictly comparable corpora.

 


Prof. Dr. Ruslan Mitkov has been working in Natural Language Processing (NLP), Computational Linguistics, Corpus Linguistics, Machine Translation, Translation Technology and related areas since the early 1980s. Whereas Prof. Mitkov is best known for his seminal contributions to the areas of anaphora resolution and automatic generation of multiple-choice tests, his extensively cited research (more than 240 publications including 14 books, 35 journal articles and 36 book chapters) also covers topics such as machine translation, translation memory and translation technology in general, bilingual term extraction, automatic identification of cognates and false friends, natural language generation, automatic summarisation, computer-aided language processing, centering, evaluation, corpus annotation, NLP-driven corpus-based study of translation universals, text simplification, NLP for people with language disabilities and computational phraseology. Mitkov is author of the monograph Anaphora resolution (Longman) and Editor of the most successful Oxford University Press Handbook - The Oxford Handbook of Computational Linguistics. Current prestigious projects include his role as Executive Editor of the Journal of Natural Language Engineering published by Cambridge University Press and Editor-in-Chief of the Natural Language Processing book series of John Benjamins publishers. Dr. Mitkov is also working on the forthcoming Oxford Dictionary of Computational Linguistics (Oxford University Press, co-authored with Patrick Hanks) and the forthcoming second, substantially revised edition of the Oxford Handbook of Computational Linguistics. Prof. Mitkov has been invited as a keynote speaker at a number of international conferences including conferences on translation and translation technology. He has acted as Programme Chair of various international conferences on Natural Language Processing (NLP), Machine Translation, Translation Technology, Translation Studies, Corpus Linguistics and Anaphora Resolution. He is asked on a regular basis to review for leading international funding bodies and organisations and to act as a referee for applications for Professorships both in North America and Europe. Ruslan Mitkov is regularly asked to review for leading journals, publishers and conferences and serve as a member of Programme Committees or Editorial Boards. Prof. Mitkov has been an external examiner of many doctoral theses and curricula in the UK and abroad, including Master’s programmes related to NLP, Translation and Translation Technology. Dr. Mitkov has considerable external funding to his credit (more than є 20,000,000) and is currently acting as Principal Investigator of several large projects, some of which are funded by UK research councils, by the EC as well as by companies and users from the UK and USA. Ruslan Mitkov received his MSc from the Humboldt University in Berlin, his PhD from the Technical University in Dresden and worked as a Research Professor at the Institute of Mathematics, Bulgarian Academy of Sciences, Sofia. Mitkov is Professor of Computational Linguistics and Language Engineering at the University of Wolverhampton which he joined in 1995 and where he set up the Research Group in Computational Linguistics. His Research Group has emerged as an internationally leading unit in applied Natural Language Processing and members of the group have won awards in different NLP/shared-task competitions. In addition to being Head of the Research Group in Computational Linguistics, Prof. Mitkov is also Director of the Research Institute in Information and Language Processing. The Research Institute consists of the Research Group in Computational Linguistics and the Research Group in Statistical Cybermetrics, which is another top performer internationally. Ruslan Mitkov is Vice President of ASLING, an international Association for promoting Language Technology. Dr. Mitkov is a Fellow of the Alexander von Humboldt Foundation, Germany and was invited as Distinguished Visiting Professor at the University of Franche-Comté in Besançon, France; he also serves as Vice-Chair for the prestigious EC funding programme ‘Future and Emerging Technologies’. In recognition of his outstanding professional/research achievements, Prof. Mitkov was awarded the title of Doctor Honoris Causa at Plovdiv University in November 2011. At the end of October 2014 Dr. Mitkov was also conferred Professor Honoris Causa at Veliko Tarnovo University.