Successes, pain points, and lessons learned when curating data at scale: The NACJD experience
Established in 1978, the National Archive of Criminal Justice Data (NACJD) archives and disseminates data on crime and justice for secondary analysis. To support research on crime and justice, NACJD staff curate datasets from three U.S. federal agencies: 1) the Bureau of Justice Statistics, 2) the National Institute of Justice, and 3) the Office of Juvenile Justice and Delinquency Prevention. This presentation will describe how NACJD retooled established data curation efforts in the face of staff turnover, organizational change, increasingly complex data deposits, and growth in deposit counts.
In the first part of our presentation, we describe NACJD’s history. We show that data release counts remained stable until 2005, when they began increasing. We attribute this increase to widespread broadband Internet access and shifting expectations for data sharing among social scientists. Next, we describe the period from 2006 to 2018. We show that most data NACJD released during this period were quantitative. Finally, we demonstrate that key staff retirements, restructuring at the Inter-university Consortium for Political and Social Research, and an uptick in mixed-method data archiving caused releases to decline.
In the second part of our presentation, we discuss how NACJD reversed this decline. Specifically, we argue that new hires and workflows helped NACJD increase data releases through ongoing collaboration with stakeholders. We describe these hires and workflows, focusing on how they connect to the data curation lifecycle. However, we also note that the lessons we learned cannot address the challenges associated with curating complex (large and non-tabular) datasets, so we propose new avenues for data curation at scale.