Preserving Social Science Data: How Much Replication Do We Need?

Presenter 1
Myron P. Gutmann
Presenter 2
Nancy Y. McGovern
Presenter 3
Bryan Beecher
Presenter 4
T.E. Raghunathan
University of Michigan

Those responsible for digital preservation face a tension between the need to spend resources on preservation and the scarcity of those resources. Ideal preservation would save many copies forever, but the potential cost is large, so we need to be certain that we are preserving the right number of replicas. This paper raises issues that derive from a core attribute of most social science data: they are often created by drawing a random sample from a population and studying the behavior or attributes of that sample. The sampled character of these data has implications for preservation. Although losing cases from a sample is undesirable, a sample that has suffered some loss still has validity and can be used for future research. From this, the paper argues that replication for preservation purposes may require thinking at the level of cases or variables rather than entire data files. The number of replicas may vary within a data file, depending on the attributes of the overall sample and of its cases and variables. The need to protect the confidentiality of data makes the situation more complex.
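The claim that a sample remains usable after some loss can be illustrated with a small simulation. The sketch below is not taken from the paper: the population, the 10% loss rate, and the function names `mean_and_se` and `simulate_case_loss` are all hypothetical, and it assumes cases are lost independently and at random. Real storage failures may instead remove contiguous blocks of cases, which could bias estimates in ways this sketch does not capture.

```python
import math
import random
import statistics


def mean_and_se(values):
    """Return the sample mean and its estimated standard error."""
    n = len(values)
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


def simulate_case_loss(sample, loss_rate, rng):
    """Drop each case independently with probability loss_rate.

    Assumption: loss is completely random, unrelated to the case values.
    """
    return [x for x in sample if rng.random() >= loss_rate]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population attribute (for example, a survey score).
population = [rng.gauss(50.0, 10.0) for _ in range(100_000)]

# The archived data file: a simple random sample of 2,000 cases.
sample = rng.sample(population, 2_000)

# The same file after losing roughly 10% of its cases at random.
damaged = simulate_case_loss(sample, 0.10, rng)

full_mean, full_se = mean_and_se(sample)
kept_mean, kept_se = mean_and_se(damaged)

print(f"intact : n={len(sample)}, mean={full_mean:.2f}, se={full_se:.3f}")
print(f"damaged: n={len(damaged)}, mean={kept_mean:.2f}, se={kept_se:.3f}")
```

Under these assumptions the estimate from the damaged file stays close to the original, and the cost of the loss shows up as a somewhat larger standard error rather than as an unusable file.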
