04. February 2019 – 02. March 2019

ACDH virtual hackathon series

Two tasks are still live!

The Open Data movement is gaining momentum, not only in the Digital Humanities but also in other research areas. One example of this development is the International Open Data Day, which is becoming more prominent every year; the City of Vienna has also launched its own annual Open Data event. In DH, too, Open Data is becoming a vital part of the research process, as projects such as ELEXIS are opening up their (research) data. One benefit of Open Data is that anyone can use it for any purpose, which makes it an ideal resource for hackathons.

Hackathons usually take place on site: participants are given tasks to solve within a given time frame at a fixed location. This requires programmers to be flexible and available and to have access to travel funding. A virtual hackathon, on the other hand, lets people all over the globe participate and contribute without having to travel.

For these reasons the proposed contribution is an Open Data hackathon series in a virtual format. Participants will be given a specific Open Data set and a fixed time frame within which to solve a given task on the basis of the provided data. Programmers will be able to participate either as individuals or in teams. Their code will be evaluated by a group of experts and the best submissions will be awarded winners’ certificates and prizes (1st place: 700€, 2nd: 500€, 3rd: 300€). In the end, all submissions (not only the winners’) will be released as Open Source so that all results created in the course of the event series are available to and reusable by the entire DH community.

The three hackathons will centre on three data sets that are connected to Open Data events taking place in spring 2019. The first task will be to enrich a lexicographical dataset and will take place at the time of the ELEXIS Observer Event. The second task, focusing on cartographic data, will be based on an Austrian Open Government Data set provided by the City of Vienna and will take place at the time of the Open Data Day Vienna. The final highlight of the series will be the third hackathon, which takes place in the context of the International Open Data Day and is carried out in cooperation with Europeana. The core goal of this event series is to promote Openness in DH and show a path towards putting it into practice.


DATES

Registration is open: 17.01.2019 - 01.02.2019 (noon CET)

1. Hackathon (ELEXIS hack - lexicographical data): 04.02.2019 (2 p.m. CET) - 20.02.2019 (midnight CET)

2. Hackathon (Vienna Open Data Day hack - cartographic data): 11.02.2019 (2 p.m. CET) - 28.02.2019 (midnight CET)

3. Hackathon (International Open Data Day hack - textual data): 14.02.2019 (2 p.m. CET) - 02.03.2019 (midnight CET)

Announcement of winners: 10.05.2019


ELEXIS hack: YOUR TASK

THIS TASK HAS BEEN CLOSED. THANK YOU FOR YOUR SUBMISSIONS.

The first hack of the ACDH virtual hackathon series focuses on Open Data that is of interest for the ELEXIS project. In this hack, your task is to develop a creative mode of processing a lexicographical dataset. It is your decision what to do with the data: You could enrich it, visualize it, do statistical analysis, integrate/aggregate it with other resources (e.g. LOD) or develop an innovative application to explore it.

The dataset to be worked on in this task is the “Digital Dictionary of Tunis Arabic”. This set of XML/TEI data was built not only on data from the corpus of spoken language that was compiled in the TuniCo project, but also on a range of additional sources: data elicited from complementary interviews with young Tunisians and lexical material taken from various published historical sources dating from the middle of the 20th century and earlier. Several front-ends to this dataset already exist. For an overview, have a look at this website:

https://tunico.acdh.oeaw.ac.at/about_dictionary.html

The dataset is available for download here:

https://id.acdh.oeaw.ac.at/uuid/175b8cdf-5d04-f4d3-a778-67910aa8fd37

It is a single XML file following the TEI schema. It consists of two parts. The first part contains a dictionary in which each term provides basic grammatical information as well as meanings in three languages (English, German & French). Some terms are illustrated with sample phrases and/or links to the second part of the dictionary. The second part contains usage examples taken from the spoken language. Each example is a short sentence with English, German & French translations.
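As a starting point, the first part of such a file could be read roughly as follows. This is only a minimal sketch: the element layout (entry, form/orth, gramGrp/pos, cit type="translation" with xml:lang) follows common TEI dictionary practice and is an assumption about this dataset, and the inline sample entry is invented for illustration and not taken from the dictionary. Check the downloaded file before relying on these paths.

```python
import xml.etree.ElementTree as ET

# TEI namespace and the expanded name of the xml:lang attribute.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample entry (NOT from the dataset); the structure is an assumption.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <entry xml:id="sample_001">
      <form type="lemma"><orth>lemma1</orth></form>
      <gramGrp><pos>noun</pos></gramGrp>
      <sense>
        <cit type="translation" xml:lang="en"><quote>word</quote></cit>
        <cit type="translation" xml:lang="de"><quote>Wort</quote></cit>
        <cit type="translation" xml:lang="fr"><quote>mot</quote></cit>
      </sense>
    </entry>
  </body></text>
</TEI>"""


def read_entries(xml_text):
    """Return one dict per <entry>: lemma, part of speech, translations by language."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iterfind(".//tei:entry", TEI_NS):
        lemma = entry.findtext("tei:form/tei:orth", default="", namespaces=TEI_NS)
        pos = entry.findtext("tei:gramGrp/tei:pos", default="", namespaces=TEI_NS)
        translations = {}
        for cit in entry.iterfind(".//tei:cit[@type='translation']", TEI_NS):
            lang = cit.get(XML_LANG)
            quote = cit.findtext("tei:quote", default="", namespaces=TEI_NS)
            translations.setdefault(lang, []).append(quote)
        entries.append({"lemma": lemma, "pos": pos, "translations": translations})
    return entries


entries = read_entries(SAMPLE)
```

For the real file, the same function could be applied to the text read from disk; for very large files, ET.iterparse would avoid loading everything into memory at once.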

See this task on GitHub

GitHub topic tag: #ACDHhackathonELEXIS


Vienna Open Data Day hack: YOUR TASK

The second hack of the ACDH virtual hackathon series focuses on Open Government Data provided by the City of Vienna. In this hack, your task is to develop a creative mode of processing cartographical data. It is your decision what to do with the data: You could enrich it, automatically vectorize it or develop an innovative application to explore it.

The data to be worked on in this task are cartographic data showing war damage to buildings in Vienna, apparently compiled by the City of Vienna in 1946.

The data are described here:

https://www.data.gv.at/katalog/dataset/87282445-a02d-4f7f-9bf6-196d73d9b3a9

The actual map data are provided by a WMS service. You can access them using any GIS software by adding a WMS layer and providing a GetCapabilities URL (https://data.wien.gv.at/daten/wms?service=WMS&request=GetCapabilities&version=1.1.1). The layer providing the war damage map has the id BOMBENSCHADENOGD and the label Kriegsschäden um 1946. Be aware that the same WMS service provides over 200 layers, ranging from a historical map from 1904 to up-to-date public toilet locations. If you want to embed the data on a webpage, all major map libraries support WMS layers (Leaflet doc, OpenLayers doc).
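If you prefer to fetch map images from a script rather than from GIS software, a WMS 1.1.1 GetMap request can be assembled as in the sketch below. The base URL and layer id come from the description above; the bounding box, image size, coordinate system (EPSG:4326) and image format are illustrative assumptions, so check the GetCapabilities response for what the service actually supports.

```python
from urllib.parse import urlencode

# Service endpoint and layer id as given in the task description.
WMS_BASE = "https://data.wien.gv.at/daten/wms"
LAYER_ID = "BOMBENSCHADENOGD"


def build_getmap_url(bbox, width=1024, height=768,
                     srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (min_x, min_y, max_x, max_y); with EPSG:4326 in WMS 1.1.1
    that means (min_lon, min_lat, max_lon, max_lat).
    """
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": LAYER_ID,
        "styles": "",  # empty value = default style of the layer
        "srs": srs,    # assumption: verify supported SRS in GetCapabilities
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": image_format,
    }
    return f"{WMS_BASE}?{urlencode(params)}"


# Illustrative bounding box roughly around central Vienna (an assumption).
VIENNA_CENTRE_BBOX = (16.35, 48.19, 16.40, 48.22)
url = build_getmap_url(VIENNA_CENTRE_BBOX)
```

The resulting URL can be opened in a browser or downloaded with any HTTP client; tiling a larger area is then a matter of calling the function with adjacent bounding boxes.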

See this task on GitHub

GitHub topic tag: #ACDHhackathonODDVie


International Open Data Day hack: YOUR TASK

The third hack of the ACDH virtual hackathon series is happening around the international Open Data Day and offers you a choice of two datasets. Pick the one that inspires you most and develop a creative mode of processing a textual dataset. It is your decision what to do with the data: You could enrich it, visualize it, do statistical analysis or develop an innovative application to explore it. You might even be inspired to develop a solution that incorporates both datasets.

The first dataset to be worked on in this task is a collection of XML/TEI transcriptions of early German travel guides on non-European countries which were released by the Baedeker publishing house between 1875 and 1914 (5 volumes, first editions). The texts are tokenized, labelled with lemmas as well as part-of-speech tags, and semantically annotated. The dataset was compiled in the context of the travel!digital project.

The data and the accompanying TEI schema are available for download here:
* https://id.acdh.oeaw.ac.at/traveldigital/Corpus
* https://id.acdh.oeaw.ac.at/traveldigital/Auxiliary_Files/TEI_schema

The focus should be on functionalities that go beyond what the travel!digital web-app already offers.
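Because the travel guide texts are tokenized and carry lemma and part-of-speech labels, simple statistics can be derived directly from the markup. The sketch below counts lemma frequencies, optionally restricted to one part-of-speech tag. It assumes that tokens are encoded as TEI w elements with a lemma attribute and with the part-of-speech tag in a type attribute; both attribute names are assumptions to be checked against the provided TEI schema, and the sample text is invented for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sample (NOT from the corpus); attribute names are assumptions.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
<w lemma="die" type="ART">Die</w> <w lemma="Reise" type="NN">Reise</w>
<w lemma="die" type="ART">der</w> <w lemma="Reise" type="NN">Reisen</w>
</p></body></text></TEI>"""


def lemma_counts(xml_text, pos=None):
    """Count lemmas of all <w> tokens; if pos is given, keep only that tag."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for token in root.iter(TEI + "w"):
        lemma = token.get("lemma")
        if lemma is None:
            continue  # skip tokens without lemma annotation
        if pos is not None and token.get("type") != pos:
            continue
        counts[lemma] += 1
    return counts


all_counts = lemma_counts(SAMPLE)
noun_counts = lemma_counts(SAMPLE, pos="NN")
```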

We are offering the second dataset in cooperation with Europeana, the digital platform for cultural heritage funded by the European Union. It is composed of high-definition images and full text of 8 newspapers in German contributed by the Hamburg State and University Library Carl von Ossietzky to Europeana's project on newspapers: http://www.europeana-newspapers.eu/featured-partner-hamburg-state-and-university-library-carl-von-ossietzky/

Information on how to download the dataset can be found in a dedicated section at:
https://pro.europeana.eu/resources/apis/iiif#download

The documentation page for the IIIF APIs is available at:
https://pro.europeana.eu/resources/apis/iiif
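For orientation, IIIF Image API requests follow a fixed URL pattern: base URL, identifier, region, size, rotation, then quality and format. The sketch below only assembles such URLs. The base URL and identifier used here are hypothetical placeholders and are not Europeana's actual endpoint or record ids; take the real values from the documentation page above. The size keyword "max" is assumed to be supported by the server (older servers may expect "full" instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    The identifier is percent-encoded so that slashes do not break the path.
    """
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """URL of the image information document (sizes, tiles, supported features)."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Hypothetical base URL and identifier, for illustration only.
BASE = "https://iiif.example.org/image"
page_url = iiif_image_url(BASE, "newspaper/page 1")
# Region is x,y,width,height in pixels; "400," scales to 400 px width.
detail_url = iiif_image_url(BASE, "newspaper/page 1",
                            region="0,0,800,600", size="400,")
```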

We invite you to imagine its reuse in your field of research and/or to develop new modes of exploration and visualisation of large volumes of resources, in particular with full-text content.

See this task on GitHub

GitHub topic tag: #ACDHhackathonODD


PARTICIPATION & SUBMISSION

Participants can register as individuals or in teams of up to 3 persons. Only registered participants will be able to submit their hacks and be eligible to win prizes. Each submission has to include code, instructions on how to run it (readme), a short presentation of what was done (in a format of the participants' choice: poster, video, description, ...), enriched data (if applicable) and statistics on the data (if applicable).

Incomplete submissions will not be reviewed.

Registration is already closed.


On the day a hackathon opens, the task will be posted on this site and additionally made available to all registered participants via GitHub. Participants will submit their solutions to a dedicated GitHub repository. Participation in the hackathons requires a GitHub account.

Hackathon winners will be determined by a panel of judges. The winners will be announced by May 10, 2019.


OPENNESS

Participants will work on their solutions in their own dedicated GitHub repository. Once you have finished your work, set your repo to public, tag it with the topic tag given for your task and send the repo link to vanessa.hannesschlaeger[at]oeaw.ac.at and tanja.wissik[at]oeaw.ac.at.

All code submitted by all participants (as well as all reviews by all judges) shall be made permanently publicly available on GitHub under an MIT license (or, for submissions of enriched data and presentation material, a suitable open license). By submitting their code to this challenge, participants agree to these terms.


WHO SHOULD PARTICIPATE

All skill levels are welcome. We are looking for DH developers, students, researchers, and anyone interested in working with Open Data in the context of digital humanities.


JUDGEMENT CRITERIA

Your submission will be judged according to the following criteria:

  • Creativity, innovation (e.g. Is the approach/idea new and unique? Does it do something that hasn’t been done before? Does it provide new insights into the data? Does the hack provide a new/faster/clearer solution to an old problem?)
  • Accessibility, reusability, reproducibility (e.g. Is the code properly documented? Is the technical approach reproducible?)
  • Elegance (e.g. Is the code easy to modify and reuse? Is it readable for others? Is modularity considered in the design? Is the code simple and concise?)