This interdisciplinary and international project focuses on German-language travelogues in the collections of the Austrian National Library, covering the period from 1500 to 1876. In order to analyze perceptions of “the other” and “the orient” in a large-scale text corpus, algorithms for the semi-automatized search for, and evaluation of, digitally available texts are being created. Several institutions participate in the project: the Institute for Modern and Contemporary Historical Research at the Austrian Academy of Sciences, the Austrian Institute of Technology, the Austrian National Library, and the research center L3S at the University of Hannover.
Travelogues offer a wide range of information on perceptions of otherness related to foreign regions, cultures, and religions. They are strongly influenced by the people involved in their production and by those people's self-representations; this influence in turn offers insights into how origin cultures handled “otherness”. The main challenges for the analysis are the large number of available texts and the heterogeneity of the source type. The number of people involved in the production of travelogues varied, as did their socio-cultural backgrounds, intentions, aims, motives, and intended audiences. As the project covers almost four centuries, the diachronic contextualization is methodologically demanding. Further obstacles are grammatical and semiotic changes over time and the variable quality of digitization and optical character recognition (OCR).
The project meets these challenges with an innovative methodology for the quantitative and qualitative analysis of digitized texts: researchers from the fields of history, computer science, and library and information science are jointly developing algorithms for the semi-automatized detection and evaluation of perceptions of the other in a large-scale text corpus. To handle the data, state-of-the-art machine learning, text-mining, and adaptive topic-modeling techniques are being applied, and a novel, neural network-based expression-detection approach is being developed. In this way, German-language travelogues are being systematically collected and analyzed, both quantitatively and qualitatively, for the first time. Moreover, the tools are being designed to be easily adaptable to other languages and concepts, thus providing a basis for further research.
The historical focus is on perceptions of “the other” and “the orient”. The project identifies different types of perceptions, analyzes whether and how they changed during the selected time period, and considers the influence of the geographic or socio-cultural background of the people involved in the production of travelogues. Such analyses provide a historical context for current challenges: in particular, the findings on historical strategies for dealing with otherness may be applied to today’s challenges connected with worldwide phenomena such as mass tourism, the internationalization of consumer cultures, transnational migrations, and globalization.
1. Semi-automatized creation of a text corpus. Identification of characteristic elements of travelogues and assessment of the options for collecting them automatically, based on the digitized collections of the Austrian National Library (ca. 600,000 books).
2. Collection of metadata. Creation of a database of the identified travelogues, including further information such as the names and places of origin of the travelers or intertextual relations between the publications (traditions of copying texts, inherent references, etc.).
3. Analysis of the topics “otherness” and “orient” within the travelogues. The linguistic and semantic changes of these terms, as well as the associations connected with them, will be quantitatively and qualitatively examined based on the automatized recognition of text passages.
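As a rough illustration of the third step, the following minimal Python sketch shows how text passages mentioning a search term could be located and counted per century. It is not the project's method, which relies on machine learning and topic modeling; it is a simple pattern-matching stand-in. The term pattern, the function names `find_passages` and `hits_per_century`, and the sample records are all assumptions invented for this example.

```python
import re
from collections import Counter

# Hypothetical term pattern: the project's actual term lists are not given
# in the text. It matches "Orient", "orientalisch", "Morgenland", etc.
TERM_PATTERN = re.compile(r"\b(orient\w*|morgenl[aä]nd\w*)", re.IGNORECASE)


def find_passages(text, window=40):
    """Return (matched term, surrounding context) for every hit in text."""
    hits = []
    for match in TERM_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        hits.append((match.group(0).lower(), text[start:end]))
    return hits


def hits_per_century(records):
    """Count term hits per century for an iterable of (year, text) records."""
    counts = Counter()
    for year, text in records:
        century = year // 100 + 1  # e.g. 1587 falls in the 16th century
        counts[century] += len(find_passages(text))
    return counts


if __name__ == "__main__":
    # Invented placeholder records, not taken from any real travelogue.
    sample = [
        (1587, "Reise in den Orient und das Morgenland"),
        (1790, "Die orientalischen Sitten"),
        (1801, "Eine Reise nach Wien"),
    ]
    for century, count in sorted(hits_per_century(sample).items()):
        print(f"{century}th century: {count} hit(s)")
```

A plain pattern match like this would miss OCR errors and most historical spelling variants, which is one reason the more adaptive techniques described above are needed.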