
Data Curation

Research and education initiatives focused on challenges associated with the curation and federation of digital collections for long-term distributed use. Work in this area relates to all parts of the data lifecycle, including data cleaning, metadata standards and tools, end-user tool development, knowledge representation, and conceptual foundations.


Bertram Ludäscher

Director, Center for Informatics Research in Science and Scholarship

Current People

Recent Publications

Budden, A. E., Jones, M. B., Ludaescher, B., & Vieglais, D. (2019, December). Beyond Bibliographic Citation: Provenance and Dependency Metadata for Complex Research Objects. In AGU Fall Meeting 2019. AGU.

Chard, K., Gaffney, N., Jones, M. B., Kowalik, K., Ludäscher, B., Nabrzyski, J., ... & Willis, C. (2019, June). Implementing computational reproducibility in the Whole Tale environment. In Proceedings of the 2nd International Workshop on Practical Reproducible Evaluation of Computer Systems (pp. 17-22).

Li, L., Ludäscher, B., & Zhang, Q. (2019). Towards more transparent, reproducible, and reusable data cleaning with OpenRefine. iConference 2019 Proceedings.

Franz, N. M., Musher, L. J., Brown, J. W., Yu, S., & Ludäscher, B. (2019). Verbalizing phylogenomic conflict: Representation of node congruence across competing reconstructions of the neoavian explosion. PLoS Computational Biology, 15(2), e1006493.


Current Projects

The Whole Tale: Merging Science and Cyberinfrastructure Pathways

Whole Tale is a five-year NSF CC*DNI DIBBS-funded project that will enable researchers to examine, transform, and then seamlessly republish research data that was used in an article.  These "living articles" will enable new discovery by allowing researchers to construct representations and syntheses of data.

Bertram Ludäscher, PI (Illinois); Kyle Chard, co-PI (U of Chicago); Victoria Stodden, co-PI (Illinois); Matthew Turk, co-PI (Illinois); Niall Gaffney, co-PI (Texas Advanced Computing Center)

Designing Synthesized Knowledge of Past Environments (SKOPE)

This project will design and prototype SKOPE (Synthesized Knowledge of Past Environments), an online research tool that will provide state-of-the-art information about the environment experienced by humans at a given place and time, past or present. In response to a specific query, SKOPE will extract the latest data from diverse online databases. Using explicit and repeatable procedures, it will process the data to yield a cutting-edge synthesis of environmental information specifically tailored to the user’s request. Initially the tool will be developed for the Southwest US over the last 2000 years, but it will be designed to be readily extended to other places and times.

PI: Keith Kintigh (Arizona State); PI: Timothy Kohler (Washington State); PI: Bertram Ludäscher (iSchool at Illinois)

Kurator: A Provenance-enabled Workflow Platform and Toolkit to Curate Biodiversity Data

Data curation is a critical step in scientific data digitization, sharing, integration and use. The considerable resources allocated to digitization of natural science collections in the U.S. and globally require a focus on both digitization efficiencies and the utility of the generated data. One way to address both issues is to employ workflow software to automate and streamline data curation processes. We are developing Kurator, a suite of biodiversity data quality tools aimed at collection management specialists with little or no programming experience, database administrators and researchers with some scripting language experience, and developers.

PI: Bertram Ludäscher; co-PI: James Macklin (Agriculture and Agri-Food Canada); PI: James Hanken (Director, Museum of Comparative Zoology, Harvard)


Recent News

February 9, 2021

Ludäscher to present keynote at reusable research webinar

Professor and Center for Informatics Research in Science and Scholarship (CIRSS) Director Bertram Ludäscher will be the keynote speaker for a webinar hosted by the Council on Library and Information Resources (CLIR) on February 10. The webinar, …

February 21, 2020

iSchool researchers organize provenance workshop in Ireland

PhD students Michael Gryk and Jessica Cheng and alumna Rhiannon Bettivia (PhD '16) organized a provenance workshop, which was held on February 17 in conjunction with the 15th International Digital Curation Conference (IDCC) in Dublin, Ireland. The…

September 24, 2019

iSchool researchers present at ro2019

CIRSS researchers will present their work at the Workshop on Research Objects 2019 (ro2019), which will be held in conjunction with eScience 2019 on September 24-27 in San Diego, California. The Research Objects approach proposes a way to "packa…

May 31, 2019

Ludäscher Lab to present research at Philadelphia Logic Week

Professor Bertram Ludäscher will be presenting research with group members during Philadelphia Logic Week 2019. The event, which will be held from June 3-7 at St. Joseph's University, brings together several conferences dedicated to the rese…


Past Events

October 30, 2020

Agreeing to Disagree: Applying A Logic-Based Approach To Reconciling and Merging Multiple Taxonomies

Taxonomies are Knowledge Organization Systems (KOS) that classify concepts into hierarchies via parent-child (is-a) relationships. While taxonomies are largely used in information systems as tools fo…

March 6, 2020

Katy, Millie, Misty, and Me: Participatory Culture in Teen Fashion and Humor Comics

This paper explores pre-Internet, print-based participatory culture in the form of reader-contributed content to fashion and humor comics associated with characters such as Archie Comics’ Katy K…

September 27, 2019

Extracted Features: Opening Access to HathiTrust and Beyond

The HathiTrust Digital Library (HTDL) contains nearly 17 million volumes (approximately 6 billion pages). Unfortunately, around 11 million HTDL volumes are under copyright restrictions and cannot be s…
