Archiving, Preservation, Curation

Defining the nature of data archives, their functions and their operation

IASSIST Quarterly (IQ) volume 40-2 is now on the website: Revolution in the air

Welcome to the second issue of Volume 40 of the IASSIST Quarterly (IQ 40:2, 2016). We present three papers in this issue.

First, there are two papers on the Data Documentation Initiative (DDI) that have their own special introduction. I want to express my respect and gratitude to Joachim Wackerow (GESIS - Leibniz Institute for the Social Sciences). For many years, Joachim (Achim) and Mary Vardigan (University of Michigan) have repeatedly informed and advised the readers of the IASSIST Quarterly on the continuing development of the DDI. Metadata is central to the use and reuse of data, and we have come a long way through the efforts of many people.

The IASSIST 2016 conference in Bergen was a great success - I am told. I was not able to attend but heard that the conference again was 'the best ever'. I was also told that among the many interesting talks and inputs at the conference Matthew Woollard's keynote speech on 'Data Revolution' was high on the list. Good to have well informed informers! Matthew Woollard is Director of the UK Data Archive at the University of Essex. Here in the IASSIST Quarterly we bring you a transcript of his talk. Woollard starts his talk on the data revolution with the possibility of bringing to users access to data, rather than bringing data to users. The data is in the 'cloud' - in the air - 'Revolution in the air' to quote a Nobel laureate. We are not yet in the post-revolutionary phase and many issues still need to be addressed. Woollard argues that several data skills are in demand, like an understanding of data management and of the many ethical issues. Although he is not enthusiastic about the term 'Big Data', Woollard naturally addresses the concept as these days we cannot talk about data - and surely not about data revolution - without talking about Big Data. I fully support his view that we should proceed with caution, so that we are not simply replacing surveys where we 'ask more from fewer' with big data that give us 'less from more'. The revolution gives us new possibilities, and we will see more complex forms of research that will challenge data skills and demand solutions at data service institutions.  

Papers for the IASSIST Quarterly are always very welcome. We welcome input from IASSIST conferences or other conferences and workshops, from local presentations or papers especially written for the IQ. When you are preparing a presentation, give a thought to turning your one-time presentation into a lasting contribution. We permit authors 'deep links' into the IQ as well as deposition of the paper in your local repository. Chairing a conference session with the purpose of aggregating and integrating papers for a special issue IQ is also much appreciated as the information reaches many more people than the session participants, and will be readily available on the IASSIST website at

Authors are very welcome to take a look at the instructions and layout:

Authors can also contact me via e-mail: Should you be interested in compiling a special issue for the IQ as guest editor(s) I will also be delighted to hear from you.

Karsten Boye Rasmussen   
Editor, IASSIST Quarterly

IQ double issue 38(4)/39(1) is up, and so is vol 39(2)!

Hi folks! As a lovely gift for your reading pleasure over the holidays, we present two, yes, TWO issues of the IASSIST Quarterly. The first is the double issue, 38(4)/39(1), with guest editors Joachim Wackerow of GESIS – Leibniz Institute for the Social Sciences in Germany and Mary Vardigan of ICPSR at the University of Michigan, USA. This issue focuses on the Data Documentation Initiative (DDI) and how it makes meta-analysis possible. The second issue is 39(2), and is all about data: avoiding statistical disclosure, using data, and improving digital preservation. Although we usually post the full text of the Editor's Notes in the blog post, it seems lengthy to do that for both issues. You will find them, though, on the web site: the Editor's Notes for the double issue, and the Editor's Notes for issue 39(2).

Michele Hayslett, for the IQ Publications Committee

A decade against decay: the 10th International Digital Curation Conference

The International Digital Curation Conference (IDCC) is now ten years old. On the evidence of its most recent conference, it is in rude health and growing fast.

This IDCC marks the first time IASSIST has formally supported another organisation's conference. I think it was a wise investment given the quality of plenaries, presentations, posters, and discussions.

DCC already has available a number of blogs covering the substance of sessions, including an excellent summary by IASSIST web editor, Robin Rice. Presentations and posters are already available, and video from plenary sessions will soon be online.

Instead I will use this opportunity to pick up on hanging issues and suggestions for future conferences.

One was apportionment of responsibility. Ultimately, researchers are responsible for management of their data, but they can only do so if supporting infrastructure is in place to help them. So, who is responsible for providing that: funders or institutions? This theme emerged in the context of the UK’s Engineering and Physical Sciences Research Council, which will soon enforce expectations identifying the institution as responsible for supporting good Research Data Management.

Related to that was a discussion on the role of libraries in this decade. Are they relevant? Can they change to meet new challenges? Starting out as a researcher who became a data archivist and is now a librarian, I wouldn’t be here if libraries weren’t meeting these challenges. There’s a “hush” of IASSIST members also ready to take issue with the suggestion that libraries aren’t relevant or aren’t engaged with data. In fact, they did so at our last conference.

Melissa Terras (UCL) did a fantastic job presenting work in the digital humanities that is innovative in not only preserving, but rescuing objects – and all done on small change research budgets. I hope a future IDCC finds space for a social sciences person to present on issues we face in preservation and reuse. Clifford Lynch (CNI) touched on the problems of data reuse and human subjects, which remained one of the few glancing references to a significant problem and one IASSIST members are addressing. Indeed, thanks must go to a former president of this association, Peter Burnhill (Edinburgh), who mentioned IASSIST and how it relates to the IDCC audience on more than one occasion.

Finally, if you were stimulated by IDCC’s talk of data, reuse, and preservation then don’t forget our own conference in Minneapolis later this year.

Version 4, Research Data Curation Bibliography & the IQ

Another reason to write for the IQ: you might get yourself into Charles Bailey's prestigious bibliography, at

I'm pleased to see no fewer than seven IQ articles in the latest version. I didn’t count IASSISTers who published elsewhere, but several of those were in the list as well.

Research Data Curation Bibliography

Charles W. Bailey, Jr.

Houston: Digital Scholarship

Version 4: 6/23/2014

Altman, Micah, and Mercè Crosas. "The Evolution of Data Citation: From Principles to Implementation." IASSIST Quarterly 37, no. 1-4 (2013): 62-70.

Bender, Stefan, and Jorg Heining. "The Research-Data-Centre in Research-Data-Centre Approach: A First Step towards Decentralised International Data Sharing." IASSIST Quarterly 35, no. 3 (2011): 10-16.

Mooney, Hailey. "A Practical Approach to Data Citation: The Special Interest Group on Data Citation and Development of the Quick Guide to Data Citation." IASSIST Quarterly 37, no. 1-4 (2013): 71-77.

Ribeiro, Cristina, and Maria Eugénia Matos Fernandes. "Data Curation at U. Porto: Identifying Current Practices across Disciplinary Domains." IASSIST Quarterly 35, no. 4 (2011): 14-17.

Schumann, Natascha. "Tried and Trusted: Experiences with Certification Processes at the GESIS Data Archive." IASSIST Quarterly 36, no. 3/4 (2012): 24-27.

Schumann, Natascha, and Astrid Recker. "De-mystifying OAIS compliance: Benefits and challenges of mapping the OAIS reference model to the GESIS Data Archive." IASSIST Quarterly 36, no. 2 (2012): 6-11.

Yoon, Ayoung, and Helen Tibbo. "Examination of Data Deposit Practices in Repositories with the OAIS Model." IASSIST Quarterly 35, no. 4 (2011): 6-13.

Congratulations to the authors.

Robin Rice, IASSIST Web Editor

Research Data Management Issues Across Environments

Lots of conversations are going on these days in different venues where people are asking many of the same questions: how do we teach researchers about data management with limited staff, and what data management services should we offer? How do we find sustainable ways to manage data that leverage the efforts of many different repositories, whether governmental, institutional or disciplinary? How do we coalesce standard practice and reasonable but effective policies at least at the national level, and preferably on a global scale? What roles should governments play? How much can we as data professionals accomplish on our own? The Data Management and Curation SIG will host a workshop to talk about these and other issues across different countries and environments next Tuesday. Our speakers will include:

  • Dan Gillman, U.S. Bureau of Labor Statistics
  • Marcel Hebing, DIW Berlin
  • Chuck Humphrey, University of Alberta
  • Steven McEachern, Australian Data Archive
  • Barry Radler, Institute on Aging, University of Wisconsin-Madison
  • Robin Rice, EDINA and Data Library at the University of Edinburgh
  • Kathleen Shearer, Confederation of Open Access Repositories and Research Data Canada

Looking forward to seeing many of you in Toronto!

Michele Hayslett, University of North Carolina at Chapel Hill & Stefan Kramer, American University

White Paper Urges New Approaches to Assure Access to Scientific Data

Press release posted on behalf of Mark Thompson-Kolar, ICPSR.

12/12/2013:  (Ann Arbor, MI)—More than two dozen data repositories serving the social, natural, and physical sciences today released a white paper recommending new approaches to funding sharing and preservation of scientific data. The document emphasizes the need for sustainable funding of domain repositories—data archives with ties to specific scientific communities.

“Sustaining Domain Repositories for Digital Data: A White Paper,” is an outcome of a meeting convened June 24-25, 2013, in Ann Arbor. The meeting, organized by the Inter-university Consortium for Political and Social Research (ICPSR) and supported by the Alfred P. Sloan Foundation, was attended by representatives of 22 data repositories from a wide spectrum of scientific disciplines.

Domain repositories accelerate intellectual discovery by facilitating data reuse and reproducibility. They leverage in-depth subject knowledge as well as expertise in data curation to make data accessible and meaningful to specific scientific communities. However, domain repositories face an uncertain financial future in the United States, as funding remains unpredictable and inadequate. Unlike our European competitors who support data archiving as necessary scientific infrastructure, the US does not assure the long-term viability of data archives.

“This white paper aims to start a conversation with funding agencies about how secure and sustainable funding can be provided for domain repositories,” said ICPSR Director George Alter. “We’re suggesting ways that modifications in US funding agencies’ policies can help domain repositories to achieve their mission.”

Five recommendations are offered to encourage data stewardship and support sustainable repositories:

  • Commit to sustaining institutions that assure the long-term preservation and viability of research data
  • Promote cooperation among funding agencies, universities, domain repositories, journals, and other stakeholders
  • Support the human and organizational infrastructure for data stewardship as well as the hardware
  • Establish review criteria appropriate for data repositories
  • Incentivize Principal Investigators (PIs) to archive data

While a single funding model may not fit all disciplines, new approaches are urgently needed, the paper says.

“What’s really remarkable about this effort—the meeting and the resulting white paper—has been the consensus across disciplines from astronomy to archaeology to proteomics,” Alter said. “More than two dozen domain repositories from so many disciplines are saying the same thing: Data sharing can produce more science, but data stewards must know the needs of their scientific communities.”

This white paper is a must-read for anyone who wants to understand scientific domain repositories and their critical role in the advancement of science. It can be downloaded at


The Inter-university Consortium for Political and Social Research (ICPSR), based in Ann Arbor, MI, is the largest archive of behavioral and social science research data in the world. It advances research by acquiring, curating, preserving, and distributing original research data.

The Alfred P. Sloan Foundation is a philanthropic, not-for-profit grantmaking institution based in New York City. Established in 1934, the Foundation makes grants in support of original research and education in science, technology, engineering, mathematics, and economic performance.


IASSIST Fellows 2013


The IASSIST Fellows Committee is glad to announce through this post the six recipients of the 2013 IASSIST Fellowship award. We are extremely excited to have such a diverse and interesting group with different backgrounds and experience and encourage IASSISTers to welcome them at our conference in Cologne, Germany.

Please find below their names, countries and brief bios:

Chifundo Kanjala (Tanzania) 

Chifundo currently works as a data manager and data documentalist for an HIV research group called the ALPHA network, based at the London School of Hygiene and Tropical Medicine's Department of Population Health. Chifundo spends most of his time in Mwanza, Tanzania, but travels from time to time around Southern and Eastern Africa to work with colleagues in the ALPHA network. Before joining the London School of Hygiene and Tropical Medicine, he worked as a data analyst consultant at UNICEF, Zimbabwe. He is currently working part time on a PhD with the London School of Hygiene and Tropical Medicine. He has an MPhil in Demography from the University of Cape Town, South Africa, and a BSc Statistics Honours degree from the University of Zimbabwe.

Judit Gárdos (Hungary) 

Judit Gárdos studied Sociology and German Language and Literature in Budapest, Vienna and Berlin. She is a PhD candidate in sociology, with a topic on the philosophy, sociology and anthropology of quantitative sociology. She is a young researcher at the Institute of Sociology of the Hungarian Academy of Sciences. Judit has been working at the digital archive and research group called "" that is collecting qualitative, interview-based sociological research collections of the last 50 years. She is coordinating the work at the newly funded Research Documentation Center of the Center for Social Sciences at the Hungarian Academy of Sciences.

Cristina Ribeiro (Portugal) 

Cristina Ribeiro is an Assistant Professor in Informatics Engineering at Universidade do Porto and a researcher at INESC TEC. She graduated in Electrical Engineering and holds a Master's in Electrical and Computer Engineering and a Ph.D. in Informatics. Her teaching includes undergraduate and graduate courses in information retrieval, digital libraries, knowledge representation and markup languages. She has been involved in research projects in the areas of cultural heritage, multimedia databases and information retrieval. Currently her main research interests are information retrieval, digital preservation and the management of research data.

Aleksandra Bradić-Martinović (Serbia) 

Aleksandra Bradić-Martinović, PhD, is a Research Fellow at the Institute of Economic Sciences, Belgrade, Serbia. Her field of expertise is research on the implementation of information and communication technology in the economy, especially in banking, payment system operations and stock exchange operations. Aleksandra also teaches at the Belgrade Banking Academy in the following subjects: E-banking and Payment Systems, Stock Market Dealings and Management Information Systems. She has been engaged in several projects in the field of education. On the FP7 SERSCIDA project she is the Serbia team coordinator.

Anis Miladi (Tunisia) 

Anis Miladi earned his Bachelor's degree in computer sciences and multimedia in 2007 and a Master's degree in Management of Information Systems and Organizations in 2008, and he is currently finalizing his Master's degree in project management (projected date summer 2013). Before joining the Social and Economic Survey Research Institute at Qatar University as Survey Research Technology Specialist in 2009, he worked as a programmer analyst in a private IT services company in Tunisia. His areas of expertise include managing computer-assisted surveys (CAPI and CATI, using the Blaise surveying system) as well as Enterprise Document Management Systems and Enterprise Portals (SharePoint).

Lejla Somun-Krupalija (Sarajevo) 

Lejla currently serves as the Senior Program and Research Officer at the Human Rights Centre of the University of Sarajevo. She has over 15 years of experience in research and policy development on social inclusion issues. She is the Project Coordinator of the SERSCIDA FP7 project, which aims to open data services/archives in the Western Balkan region in cooperation with CESSDA members. She was previously engaged in the NGO sector, particularly on issues of capacity building and policy development in the areas of gender equality, the rights of persons with disabilities, social inclusion and forced migration. She teaches academic writing, qualitative research, and gender and nationalism at the University of Sarajevo.

Some reflections on research data confidentiality, privacy, and curation by Limor Peer

Maintaining research subjects’ confidentiality is an essential feature of the scientific research enterprise. It also presents special challenges to the data curation process. Does the effort to open access to research data complicate these challenges?

A few reasons why I think it does: more data are discoverable and could be used to re-identify previously de-identified datasets; systems are increasingly interoperable, potentially bridging what may have been insular academic data with other data and information sources; growing pressure to open data may weaken some of the safeguards previously put in place; and some data are inherently identifiable.

But these challenges should not diminish the scientific community’s firm commitment to both principles. It is possible, and desirable, for openness and privacy to co-exist. It will not be simple to do, and here’s what we need to keep in mind:

First, let’s be clear about semantics. Open data and public data are not the same thing. As Melanie Chernoff observed, “All open data is publicly available. But not all publicly available data is open.” This distinction is important because what our community means by open (standards, format) may not be what policy-makers and the public at large mean (public access). Chernoff rightly points out that “whether data should be made publicly available is where privacy concerns come into play. Once it has been determined that government data should be made public, then it should be done so in an open format.” So, yes, we want as much data as possible to be public, but we most definitely want data to be open.

Another term that could be clarified is usefulness. In the academic context, we often think of data re-use by other scholars, in the service of advancing science. But what if the individuals from whom the data were collected are the ones who want to make use of it? It’s entirely conceivable that the people formerly known as “research subjects” begin demanding access to, and control over, their own personal data as they become more accustomed to that in other contexts. This will require some fresh ideas about regulation and some rethinking of the concept of informed consent (see, for example, the work of John Wilbanks, NIH, and the National Cancer Institute on this front). The academic community is going to have to confront this issue.

Precisely because terms are confusing and often vaguely defined, we should use them carefully. It’s tempting to pit one term against the other, e.g., usefulness vs. privacy, but it may not be productive. The tension between privacy and openness or transparency does not mean that we have to choose one over the other. As Felix Wu says, “there is nothing inherently contradictory about hiding one piece of information while revealing another, so long as the information we want to hide is different from the information we want to disclose.” The complex reality is that we have to weigh them carefully and make context-based decisions.

I think the IASSIST community is in a position to lead on this front, as it is intimately familiar with issues of disclosure risk. Just last spring, the 2012 IASSIST conference included a panel on confidentiality, privacy and security. IASSIST has a special interest group on Human Subjects Review Committees and Privacy and Confidentiality in Research. Various IASSIST members have been involved with heroic efforts to create solutions (e.g., via the DDI Alliance, UKDA and ICPSR protocols) and educate about the issue (e.g., ICPSR webinar, ICPSR summer course, and MANTRA module). A recent panel at the International Digital Curation Conference in Amsterdam showcased IASSIST members’ strategies for dealing with this issue (see my reflections about the panel).

It might be the case that STEM is leading the push for open data, but these disciplines are increasingly confronted with problems of re-identification, while the private sector is increasingly being scrutinized for its practices (see this on “data hops”). The social (and, of course, medical) sciences have a well-developed regulatory framework around the issue of research ethics that many of us have been steeped in. Government agencies have their own approaches and standards (see recent report by the U.S. Government Accountability Office). IASSIST can provide a bridge; we have the opportunity to help define the conversation and offer some solutions.

IQ Special Quadruple Issue: The Book of the Bremen Workshop

Welcome to this very special IASSIST Quarterly issue. We now present volume 34 (3 & 4) of 2010 and volume 35 (1 & 2) of 2011. Normally we have about three papers in a single issue. In this super-mega-special issue we have fourteen papers from the following countries: Finland, Ireland, United Kingdom, Austria, Czech Republic, Denmark, Germany, Norway, Slovenia, Belarus, Hungary, Lithuania, Poland and Switzerland. This will be known in IASSIST as “The book of the Bremen Workshop”.

The workshop took place in April 2009 at the University of Bremen. The workshop was hosted by the Archive for Life Course Research at Bremen and funded by the Timescapes Initiative with support from CESSDA. The background and context of the workshop as well as short introductions to the many papers are found in the Editorial Introduction by the guest editors Bren Neale and Libby Bishop. The many papers are the result of the effort of numerous authors who were instrumental in the development and fulfillment of the many outcomes of the workshop. The introduction by the guest editors shows impressive lists of short-term activities, agreed goals, and also strategies for development. There are future initiatives and the future looks bright and interesting.

The focus of the Bremen Workshop is on “qualitative (Q) and qualitative longitudinal (QL) research and resources across Europe”. I would have called that a qualitative workshop but you can see from the introduction and the papers that this subject is often referred to as “qualitative and QL data”. The “and QL” emphasizes that the longitudinal aspect is the special and important issue. In the beginning of IASSIST, data was equivalent to quantitative data. However, in the next wave digital archives found that qualitative data, also of great value, could be made available for secondary research. The aspect of “longitudinal” further accentuates that value creation.

This is a growing subject area. During the processing one of the authors wanted to update her paper and asked us to replace the sentence “80 archived qualitative datasets and yearly around 30-40 datasets are ordered for re-use” with “115 archived qualitative datasets and yearly around 50-60 datasets are ordered for re-use”. Yes, we do have a somewhat long processing time but this is still a very fast growth rate. I want to thank Libby Bishop for not being annoyed when I persistently reminded her of the IQ special issues. I’m sure the guest editors contacted the authors with similar persistence. It was worth it.

As in Sherlock Holmes, we might look for what is not there, as when curiosity is raised by the fact that “the dog did not bark”. IASSIST has had and continues to have a majority of its membership in North America, so it is also remarkable that we here present the initiative on “qualitative (Q) and qualitative longitudinal (QL) research” with a European angle. Hopefully the rest of the world will enjoy these papers, and there will probably be more papers both from Europe and from the other regions covered by the IASSIST members.

Articles for the IQ are always very welcome. They can be papers from IASSIST conferences or other conferences and workshops, from local presentations or papers especially written for the IQ. If you don't have anything to offer right now, then please prepare yourself for the next IASSIST conference and start planning for participation in a session there. Chairing a conference session with the purpose of aggregating and integrating papers for a special issue IQ is much appreciated as the information in the form of an IQ issue reaches many more people than the session participants and will be readily available on the IASSIST website at

Authors are very welcome to take a look at the description for layout and sending papers to the IQ:
Authors can also contact me via e-mail: Should you be interested in compiling a special issue for the IQ as guest editor (editors) I will also be delighted to hear from you.

Karsten Boye Rasmussen
Editor August 2011

Image Credit: by mitko-denev on flickr

