Human Capital Data Lab

Reconstruction of populations by levels of education in the 20th Century


Long-term information on education helps policy makers and stakeholders set priorities for the future. The work on historical trends in educational development has therefore been extended back beyond the 1970s in a project titled EDU20C (Principal Investigator: Anne Goujon), funded by the Anniversary Fund of the City of Vienna for the Austrian Academy of Sciences. The objective of this project is to reconstruct levels of educational attainment back to the beginning of the 20th century for a selected number of countries. The reconstructed dataset will enable a broader understanding of educational expansion and of the impact of education on social, economic, technological, and ecological changes in the 20th century, which in turn will help to anticipate future challenges in different settings.

The EDU20C project focuses on an initial list of 30 countries. Their populations will be back-projected using several validated and harmonized historical datasets. The reconstructed data will be provided and visualised on the project website. In addition to the reconstructed data, the website will include a repository of the original historical data collection. These data come from census reports and, more rarely, from international or national statistical compendia, collected at the libraries of INED (National Institute of Demographic Studies, Paris), INSEE (National Institute of Statistics and Economic Studies, Paris), the Bodleian Libraries (Oxford), the United Nations Dag Hammarskjöld Library (New York), and the Library of Congress (Washington).


Moreover, we would like to take into account the different levels of data quality over time and across countries, and to produce estimates with associated measures of uncertainty. Another direction for the work under EDU20C is therefore to invest in Bayesian modeling. The Human Capital Data Lab will use the available R package for Bayesian reconstruction of past populations developed by Wheldon et al. (2016) for single-state populations and aims to enhance it so that it can reconstruct levels of educational attainment for the 20th century. Since the methodology also provides information about demographic determinants as outputs, it is complementary to the more ‘classical’ back-projection methodology. Beyond that, comparing the two procedures is of methodological interest in its own right and may lead to further applications in the field of global human capital research.
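
The ‘classical’ back-projection mentioned above is not specified in detail on this page. As a rough illustration only, the sketch below shows one common form of the idea, reverse survival: a population observed at time t is divided by a survival ratio to estimate the same cohort five years earlier. It assumes a population closed to migration, attainment that no longer changes after age 25, and education-specific survival ratios that are already known. The function name, the parameters, and all numbers are hypothetical and are not taken from the EDU20C project.

```python
def back_project_step(pop_by_age, survival_ratio, min_age=25):
    """Reverse-survive one education group by one 5-year step.

    pop_by_age:     {age_group_start: head count} observed at time t
    survival_ratio: {age_group_start: share of people in that age group
                     at t-5 who are still alive at t, five years older}
    min_age:        youngest age group kept at t-5; below it, attainment
                    may still change, so this sketch does not estimate it
    """
    earlier = {}
    for age, count in pop_by_age.items():
        younger = age - 5
        if younger >= min_age and younger in survival_ratio:
            # Survivors at t = population at t-5 * survival ratio,
            # so dividing recovers the population at t-5.
            earlier[younger] = count / survival_ratio[younger]
    return earlier


# Made-up numbers, for illustration only (hypothetical country and years).
pop_1970 = {
    "primary": {30: 450.0, 35: 380.0},
    "tertiary": {30: 96.0, 35: 96.0},
}
survival_1965_1970 = {
    "primary": {25: 0.90, 30: 0.95},
    "tertiary": {25: 0.96, 30: 0.96},
}

# Apply the same step separately to each education group.
pop_1965 = {
    edu: back_project_step(pop_1970[edu], survival_1965_1970[edu])
    for edu in pop_1970
}
print(pop_1965)
```

Repeating this step moves the estimate further into the past, but each step drops the oldest observed cohort's predecessor information and compounds any error in the survival ratios. This is one reason why estimates with explicit uncertainty, as targeted by the Bayesian approach, are a useful complement.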


Contact: Anne Goujon

Research team: Ramon Bauer, Jakob Eder, Sandra Jurasszovich, Samir KC (IIASA), Markus Speringer, Dilek Yildiz


Project website: