ACDH-CH Lecture 6.2

Online, 13 October 2020

Crowdsourcing methods to increase public access and engagement with cultural heritage and academic datasets

Victoria Van Hyning
University of Maryland School of Information

Crowdsourcing, also known as “cultural heritage co-creation” or “commons-based peer production,” is the practice of organizations inviting volunteers from diverse backgrounds to engage with collections such as digitized manuscript images, camera trap data, or microscopy images, in order to create new datasets to aid researchers and patrons of many kinds. Ideally, crowdsourcing projects further research and widen access to information, while also giving participants the opportunity to gain or sharpen their own research skills and deepen their interest in the subject area. Academics and cultural heritage practitioners can leverage crowdsourcing to democratize access to knowledge and to skills such as digital literacy and palaeography, the ability to interpret handwriting and transcribe documents.

Transcription radically improves access to collections by making them word-searchable, as well as legible to screen readers used by visually or cognitively impaired people. Crowdsourcing transcription projects have proliferated in the last fifteen years, and are frequently led by cultural heritage organizations keen to increase access to their holdings. This talk will explore two different methods for online crowdsourced text transcription developed by Van Hyning and her colleagues at Zooniverse.org (Oxford University), and the Library of Congress, and their different but overlapping methods of community engagement.

While at Zooniverse, Dr Van Hyning led an interdisciplinary team to develop several line-by-line transcription approaches that attempt to lower barriers to participation, while also improving the accuracy of transcriptions. Each line of text is transcribed by multiple independent volunteers, and the resulting transcriptions are then compared using various algorithms. At the Library of Congress she worked with an interdisciplinary team to develop a different transcription approach for By the People (https://crowd.loc.gov). She developed communication and engagement strategies for both projects to motivate and encourage volunteers; these strategies should be valuable to anyone considering hosting or participating in a crowdsourcing project.
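
As a rough illustration of the line-by-line idea described above, the following Python sketch compares several independent transcriptions of one line and picks the most common reading. It is not the algorithm used by Zooniverse or the Library of Congress, which the abstract does not specify; the function names `normalize` and `consensus`, the lowercasing and whitespace rules, and the use of a simple majority vote with a character-level similarity score are all assumptions made for the sake of the example.

```python
from __future__ import annotations

from collections import Counter
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse runs of whitespace and lowercase the text (an assumed rule)."""
    return " ".join(text.split()).lower()


def consensus(transcriptions: list[str]) -> tuple[str, float]:
    """Return the most common normalized reading of one line, together with
    its mean character-level similarity to every submitted transcription.

    Hypothetical helper: a plain majority vote, not any project's real method.
    """
    if not transcriptions:
        raise ValueError("need at least one transcription")
    normalized = [normalize(t) for t in transcriptions]
    # Majority vote: the reading submitted most often wins.
    best, _ = Counter(normalized).most_common(1)[0]
    # Agreement: average similarity ratio between the winner and each submission.
    agreement = sum(
        SequenceMatcher(None, best, t).ratio() for t in normalized
    ) / len(normalized)
    return best, agreement


if __name__ == "__main__":
    # Three volunteers transcribe the same line; one misreads the last letter.
    line, score = consensus(["Dear  Sir", "dear sir", "Dear Sin"])
    print(line, round(score, 3))
```

A low agreement score could be used to flag a line for further review, though the threshold for doing so would depend on the material and is left open here.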

The ACDH-CH Lecture 6.2 will address the need for further comparative analysis of these and other crowdsourcing projects’ design and engagement strategies, and their resulting data, and will briefly discuss possible applications for these data in Machine Learning training sets. Professor Van Hyning is very interested in the connection between crowdsourcing and Machine Learning, and hopes to hear from audience members who have experience and/or interest in these intersections as well.

Victoria Van Hyning

Professor Victoria Van Hyning joined the University of Maryland School of Information (College Park) in 2020, as an Assistant Professor of Library Innovation. Before this she served as a Senior Innovation Specialist and a Community Manager for Collections and Data on the crowdsourcing project By the People (https://crowd.loc.gov), at the Library of Congress in Washington, D.C. She joined the Library in July 2018 after holding two consecutive postdoctoral fellowships at the University of Oxford, one with the crowdsourcing research group Zooniverse.org in the Department of Astrophysics, and the other in the English Faculty, funded by the British Academy. From 2015 to 2018 she served as Humanities Principal Investigator of Zooniverse, leading projects such as Shakespeare’s World, AnnoTate, and the Institute of Museum and Library Services grant-funded project “Transforming Libraries and Archives through Crowdsourcing.”

Victoria earned a Master’s in medieval English literature from Oxford, and a PhD in early modern English literature from the University of Sheffield. She publishes on various literary, digital humanities, and crowdsourcing topics. Her first monograph, Convent Autobiography: Early Modern English Nuns in Exile, was published by Oxford University Press in 2019.

Date

13 October 2020 – 16:45 CEST

Place

Online via Zoom
