Introduction into Handwritten Text Recognition
Part 3: Transkribus 3





4 Zoom online sessions | Oct 21, Nov 4 and 25, Dec 9
3-day in-person workshop in Vienna | December 19-21
SCHEDULE
OCT 21 | What is HTR?
OCT 21 | Transkribus 1 (first hands-on)
OCT 21 | Introduction to manuscripts and working in groups
NOV 4 | Transkribus 2 (uploading documents, layout recognition, simple transcription)
NOV 4 | Working in groups (first transcriptions)
NOV 25 | Transkribus 3 (training of custom models, learning curves, tagging)
DEC 9 | Presentation of resulting models
DEC 9 | Transkribus 4 (exporting documents and further processing)
DEC 19 | Publication of the ground truth
DEC 19 | Creating a simple website 1
DEC 19 | Alternatives to Transkribus
DEC 20 | Creating a simple website 2
DEC 20 | VCEditor and other tools
DEC 20 | Library visit
DEC 21 | Time to finish the work
Introduction into HTR | Technologies of Medieval Manuscripts – Latin|German|Czech
A revolution has slowly begun in the study of historical documents: machine learning tools have been developed that allow for the automatic transcription of documents. Over the last decade, these tools have matured to the point where they can assist in producing texts from medieval manuscripts at previously unobtainable levels of accuracy. Today, libraries use them to make their collections searchable, while researchers use them to speed up the creation of text editions and have adopted them for the study of medieval documents.
The course will offer an introduction to some of these ongoing projects but, more importantly, an introduction to the practice of studying medieval documents with Handwritten Text Recognition (HTR) technologies. The course will have two main parts: four online sessions and a three-day in-person workshop in Vienna. During the first phase, participants will be introduced to both the theory of handwritten text recognition and its practical application using the Transkribus (transkribus.eu) tool. We will then work in four groups, focusing on four different periods and languages: Carolingian Latin, late medieval Latin, late medieval German, and late medieval Czech. Each group will have its own supervisor, and its goal will be to train an HTR model for its type of writing.
During the in-person workshop in Vienna, we will finalize the four projects and publish our results online: both the transcriptions and the HTR models. We will also visit libraries in Vienna to see selected manuscripts in person. Finally, we will compare the automatic transcription results of other machine learning tools and try out further digital tools. The course will be taught by a team of experts in HTR, medieval manuscript studies, and Latin, German and Czech philology. At the end of the course, you will receive a certificate.
Requirements
The course is primarily designed for Master's and PhD students, but we will consider other applications as well. You are expected to be familiar with the language of the group you want to join: Medieval Latin, Medieval German, or Medieval Czech. We also expect you to have at least basic knowledge of medieval palaeography and manuscript studies. If you need to prepare for the course, we would be delighted to offer resources for self-study in these languages and in manuscript studies.
Costs
There is no participation fee, but you will be expected to cover the costs of your trip to Vienna, including accommodation (c. 300 Euros). We do hope, however, to be able to offer bursaries for students who do not have support from their institutions. If you would like to apply for a bursary, please let us know as early as possible.