Research Problems in Data Curation: Outcomes from the Data Curation Education in Research Centers Program

Palmer, C. L., Mayernik, M. S., Weber, N., Baker, K. S., Kelly, K., Marlino, M. R., & Thompson, C. A. (2013, December). Research Problems in Data Curation: Outcomes from the Data Curation Education in Research Centers Program. Poster presented at the 46th annual Fall Meeting of the American Geophysical Union, San Francisco, CA.

The need for data curation is being recognized in numerous institutional settings as national research funding agencies extend data archiving mandates to cover more types of research grants. Data curation, however, is not only a practical challenge. It presents many conceptual and theoretical challenges that must be investigated to design appropriate technical systems, social practices and institutions, policies, and services. This presentation reports on outcomes from an investigation of research problems in data curation conducted as part of the Data Curation Education in Research Centers (DCERC) program. DCERC is developing a new model for educating data professionals to contribute to scientific research. The program is organized around foundational courses and field experiences in research and data centers for both master’s and doctoral students. The initiative is led by the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign, in collaboration with the School of Information Sciences at the University of Tennessee and with library and data professionals at the National Center for Atmospheric Research (NCAR).
At the doctoral level, DCERC is educating future faculty and researchers in data curation and establishing a research agenda to advance the field. The doctoral seminar, Research Problems in Data Curation, was developed and taught in 2012 by the DCERC principal investigator and two doctoral fellows at the University of Illinois. It was designed to define the problem space of data curation, examine relevant concepts and theories related to both technical and social perspectives, and articulate research questions that are either unexplored or undertheorized in the current literature. There was a particular emphasis on the Earth and environmental sciences, with guest speakers brought in from NCAR, the National Snow and Ice Data Center (NSIDC), and Rensselaer Polytechnic Institute.
Through the assignments, students constructed dozens of research questions informed by class readings, presentations, and discussions. A technical report is in progress on the resulting research agenda, which covers data standards; infrastructure; research context; data reuse; sharing and access; preservation; and conceptual foundations. This presentation will discuss the agenda and its importance for the geosciences, highlighting high-priority research questions. It will also introduce the related research to be undertaken by two DCERC doctoral students at NCAR during the 2013-2014 academic year and other data curation research in progress by the DCERC doctoral team.