OR2013: Open Repositories Confront Research Data

By Chuck | July 19, 2013

Open Repositories 2013 was hosted by the University of Prince Edward Island from July 8 to 12. A strong research data stream ran throughout this conference, which was attended by over 300 participants from around the globe. To my delight, many IASSISTers were in attendance, including the current IASSIST President and four Past-Presidents! Rarely do such sightings happen outside an IASSIST conference.

IASSIST Five Presidents by Limor Peer

This was my first Open Repositories conference, and after the cool reception that research data received at the SPARC IR meetings in Baltimore a few years ago, I was unsure how data would be treated at this conference. I was pleasantly surprised by this community's enthusiastic interest in research data. It helped that there were many IASSISTers present, but the interest in research data extended well beyond our community. This conference truly found an appropriate intersection between the communities of social science data and open repositories.

Thanks go to Robin Rice (IASSIST), Angus Whyte (DCC), and Kathleen Shearer (COAR) for organizing a workshop entitled “Institutional Repositories Dealing with Data: What a difference a ‘D’ makes!” Michael Witt, Courtney Matthews, and I joined these three organizers to address a range of issues that research data pose for those operating repositories. The registration for this workshop was capped at 40 because of our desire to host six discussion tables of approximately seven participants each. The workshop was fully subscribed, and Kathleen counted over 50 participants prior to the coffee break. That turnout clearly reflects the wider interest in research data at OR2013.

Dealing with Data speakers by Robin Rice

Our workshop helped set the stage for other sessions during the week.  For example, we talked about environmental drivers popularizing interest in research data, including topics around academic integrity.  Regarding this specific issue, we noted that the focus is typically directed toward specific publication-related datasets and the access needed to support the reproducibility of published research findings.  Both the opening and closing plenary speakers addressed aspects of academic integrity and the role of repositories in supporting the reproducibility of research findings.  Victoria Stodden, the opening plenary speaker, presented a compelling and articulate case for access to both the data and computer code upon which published findings are based.  She calls herself a computational scientist and defends the need to preserve computer code as well as data to facilitate the reproducibility of scientific findings.  Jean-Claude Guédon, the closing plenary speaker, bracketed this discussion on academic integrity.  He spoke about scholarly publishing and how the commercial drive toward indicators of excellence has resulted in cheating.  He likened some academics to Lance Armstrong, cheating to become number one.  He feels that quality rather than excellence is a better indicator of scientific success.

Between these two stimulating plenary speakers, there were a number of sessions during which research data were discussed. I was particularly interested in a panel of six entitled “Research Data and Repositories,” especially because the speakers were from the repository community instead of the data community. They took turns responding to questions about what their repositories do now regarding research data and what they see happening in the future. In a nutshell, their answers tended to describe the desire to make better connections between the publications in their repositories and the data underpinning the findings in these articles. They also spoke about the need to support more stages of the research lifecycle, which often involves aspects of the data lifecycle within research. There were also statements that reinforced the need for our (IASSIST’s) continued interaction with the repository community. The use of readme files in the absence of standards-based metadata, along with other practices that our data community's best-practice yardstick has moved well beyond, demonstrates the need for our communities to continue in dialogue.

Chuck Humphrey