By LHorton 2 | May 1, 2015
The Digital Curation Centre’s most recent Research Data Management Forum took place last week in London.
UK Data Service’s Louise Corti began the day with an overview of their acquisitions process. The Service (under various names) is almost 50 years old, which gives it experience and perspective many institutions do not have. Lessons from those years include the importance of a collections development policy that’s allowed to evolve. The Archive evaluates data on the basis of its value for teaching and for re-use in validation and replication. They have learnt from past mistakes and now keep access licences to three options: open, safeguarded (requiring registration), and controlled (locked-down access). Common problems persist, however: poor file names, weak description of methods and contextual documentation, limited metadata, and unexplained missing data files. The UK Data Service play a number of roles as a data service, from hand-holders and evangelical preachers to the Economic and Social Research Council’s police officer for non-compliance on data sharing.
Suzanne Embury made a valuable point in her presentation. Of course, the one thing we know is that we don’t know how other people will re-use data in the future. But we can reasonably guess that they will want to discover, integrate, and aggregate it. To this end, simple things can help – check spellings, aim for standardised vocabularies, avoid acronyms. Finally, apply a domain expert test to see if people in the discipline can independently understand the data. With that, echoes of Gary King’s replication standard came to mind.
A presentation on meeting the RDM challenge focused on Loughborough University, who have adopted a data preservation and sharing solution based on figshare and Arkivum support. Loughborough want to make depositing data as easy as possible for researchers by taking care of as much of the back-end stuff as possible. But at what cost, in both finances and quality? At the last IASSIST we learnt RDM takes a village, but Loughborough acknowledged the contribution of 61 people in setting up their service, so maybe it really takes a small metropolitan statistical area.
IASSIST’s own web editor Robin Rice directed us through data deposit at the University of Edinburgh, guided by former IASSIST president Peter Burnhill’s refrain of “helping researchers to do the right thing”. Edinburgh provide support throughout the data lifecycle with strong training resources (Research Data MANTRA), plus face-to-face sessions on managing data, creating a DMP, good practice, handling data in SPSS, and working with personal and sensitive research data. Like the UK Data Service, they recognise the value in keeping things simple and offering good incentives. Take licence options, for example: their repository only accepts open data (CC-BY 4.0), but depositing requires only five metadata fields. In return, depositors get their data made available quickly, with open download stats for every item.
The afternoon sessions split into three discussion groups. Emerging from them were thoughts on keeping metadata requirements as simple as possible, while recognising the need to concentrate on different aspects depending on the discipline; some disciplines require precision while others do not require so much. There was an acknowledgement that data discovery is often undertaken through Google. Also, while there inevitably is a range of people providing a service, there needs to be a person connecting existing resources in a university. Finally, raising awareness is a problem, since demand is related to institutional awareness.
Presentations from the event are available from the DCC, and tweets can be found under the hashtag #rdmf13. The DCC will be blogging about the discussion group sessions.