LHorton's blog

International Digital Curation Conference 2016 (IDCC16)

The International Digital Curation Conference 2016 took place in Amsterdam on 23-24 February.

IASSIST was again a sponsor, and presented a poster on IASSIST members’ activities. In addition, plenty of familiar faces were present including our current IASSIST president and three former ones.

This year’s conference was the eleventh IDCC and took the title of "Visible data, invisible infrastructure". This asks what we can do to make the hard work of preserving data, and of making and keeping it usable, as easy as possible for researchers and as unobtrusive as possible in their work.

One feature of this year’s conference was the importance of terminology. In his opening keynote, Barend Mons made a good point that accessible data is not open data and sharing data does not make it reusable. Reusability is what is important. In his plenary, Andrew Sallans spoke of openness and sharing as core to scientific activity. His presentation was insightful on how data is lost (paywalls, broken links, TIF walls), as was his call for five percent of research budgets to be reserved for data stewardship and for Europe to train 500,000 data experts in the next decade. The final keynote from Susan Halford was a warning about sloppy research methodology as researchers gorge on new big data sources. Using social media as an example, she cautioned that these are not “naturally occurring” data but are mediated by private companies using methods we do not know about.

The rest of the conference split into concurrent sessions with either a national or institutional focus, or featuring demonstrations and elaborations on tools and services. It is interesting to see how ventures like Dataverse and DMPonline/Tool fit into national infrastructure initiatives like Australian National Data Service or Canada’s Portage and institutional ones like those demonstrated by the universities of Oxford and California. If they are to do so successfully, it will be with a vision of enabling researchers to do better science rather than compelling them to comply with bureaucracy, and the route to achieving this will be through open standards and building on existing initiatives rather than constructing new tools to do essentially the same job.

An impressive feature of IDCC is the methodological rigour applied to research papers. An example to highlight from the programme was Renata Curty’s research on Factors influencing research data reuse in social sciences.

The final notable aspect of IDCC16 was how almost none of the suggestions in keynotes and tools presented supported “traditional” academic publishing. Reuse needs discoverable, machine readable, contextualised data with minimal barriers to access and minimal limits on usage – not the business model on which some well-known academic publishers thrive.

All presentations, posters, demonstrations, as well as blogs reporting on IDCC16 can be found on the DCC website.

Spring forward! The Jisc Research Data Spring programme

On 26/27 February, I attended Jisc Data Spring “Sandpit 1” in the English city of Birmingham. Data Spring is a funding programme supporting UK based projects in Research Data Management (RDM), and something of a successor to the successful Managing Research Data programmes (MRD) that did so much to get RDM training and tools underway in the UK’s education sector.

Unlike the traditional proposal-evaluation-funding model, Data Spring takes a more collaborative, interactive approach, splitting the programme into separate stages at which projects may no longer receive funding. If that sounds like the approach of modern TV entertainment shows, then you would not be wrong. Beginning with an open call, some 70 proposals were available online for voting and comments. These were reduced to 44 by the time of a workshop [PDF] at the recent IDCC conference. At the “Sandpit” (metaphorical, not literal, sadly), these proposals had to fit 27 available slots to proceed to the next stage. Through a process of negotiation, mergers and acquisitions, and hasty matchmaking, all 44 managed to get through in some form from the first day to the second.

The second day consisted of the now 27 projects making four-minute pitches to a panel of judges. By mid-March, successful projects will receive notice of three months testing and prototype funding before reporting to a similar event in June. Following this event, projects may receive a further four months of funding before a final workshop in November allows six months of funding leading to the programme’s conclusion in 2016.

Having been part of the JISCMRD Program (Jisc has since switched to sentence case from caps), it was notable how much the area has moved on since those days, from evidence gathering and basic training tools to RDM support focused on integration into existing workflows. That this occurred is a testament to the original MRD programme, and the support, work, and imaginations of those involved. Whatever projects make it through to the end of Data Spring, I have no doubt they will be worth the attention of people involved in Research Data Management both inside and outside the UK.

You can review projects at the Data Spring ideascale and figshare pages and tweet about them using #dataspring.

UPDATE: a storify of the event is also available.

A decade against decay: the 10th International Digital Curation Conference

The International Digital Curation Conference (IDCC) is now ten years old. On the evidence of its most recent conference, it is in rude health and growing fast.

This IDCC was the first time IASSIST decided to formally support another organisation's conference. I think it was a wise investment given the quality of plenaries, presentations, posters, and discussions.

DCC already has available a number of blogs covering the substance of sessions, including an excellent summary by IASSIST web editor, Robin Rice. Presentations and posters are already available, and video from plenary sessions will soon be online.

Instead, I will use this opportunity to pick up on hanging issues and suggestions for future conferences.

One was apportionment of responsibility. Ultimately, researchers are responsible for management of their data, but they can only do so if supporting infrastructure is in place to help them. So, who is responsible for providing that: funders or institutions? This theme emerged in the context of the UK’s Engineering and Physical Sciences Research Council who will soon enforce expectations identifying the institution as responsible for supporting good Research Data Management.

Related to that was a discussion on the role of libraries in this decade. Are they relevant? Can they change to meet new challenges? Starting out as a researcher who became a data archivist and is now a librarian, I wouldn’t be here if libraries weren’t meeting these challenges. There’s a “hush” of IASSIST members also ready to take issue with the suggestion that libraries aren’t relevant or engaged with data; in fact, they did so at our last conference.

Melissa Terras (UCL) did a fantastic job presenting [PDF] work in the digital humanities that is innovative in not only preserving, but rescuing objects – and all done on small change research budgets. I hope a future IDCC finds space for a social sciences person to present on issues we face in preservation and reuse. Clifford Lynch (CNI) touched on the problems of data reuse and human subjects, which remained one of the few glancing references to a significant problem and one IASSIST members are addressing. Indeed, thanks must go to a former president of this association, Peter Burnhill (Edinburgh), who mentioned IASSIST and how it relates to the IDCC audience on more than one occasion.

Finally, if you were stimulated by IDCC’s talk of data, reuse, and preservation then don’t forget our own conference in Minneapolis later this year.

Hallelujah and praise the LARD! The first London Area Research Data group meeting

LARD is London Area Research Data and this was its inaugural meeting, informally bringing together various people from London based institutions (and as far away as Reading) who are charged in some way with Research Data Management (RDM) - be it research support or repository work.

These are my notes, which lack attribution partly because I couldn't remember where every person was from, and also it wasn't clear if the meeting was on or off the record. Nonetheless, I felt there were some interesting points that deserve sharing as an insight into how UK universities (and one research centre) are dealing with RDM less than a year away from the EPSRC deadline on expectations of compliance for research data.

The first item in what was a free form discussion (think RDM jazz - hence my beat style kind of note taking, with full stops however) was policies. Some institutions have data policies, some have draft policies, and others have no policy. The mood seemed to be that a policy was more effective as a mandate for focusing university attention and resources on support services, not so much for grabbing researchers’ attention. Researchers, it was said, tend to react more to what funders want rather than university policies or documents. Those universities that competed for Medical Research Council (MRC) funding felt the MRC demanded institutional data policies, and so those institutions tended to adopt or have drafts ready for adoption. Yet most researchers are not funded by one of the RCUK councils, and their funders often have no data mandates. The group found a problem telling researchers that they don’t own their own data (it’s often funders or institutions through employee created works clauses). There was also a sense that researchers worry about data protection and are looking for practical guidance on how to keep data safe and secure. There was also a recognition that disciplines matter: those disciplines that do not have a strong culture of sharing data can be helped by the weight of institutional support providing the infrastructure to support RDM. This tackles the disciplinary focus of researchers, or localism. An example of how a bad experience can focus attention was mentioned: a researcher lost data by plugging a malware infected hard drive into a university network and had to have the drive and the copy of the data destroyed. Episodes like this can be used to tackle the culture of “improvisation” when it comes to researchers “backing-up” their data without, or without engaging, institutional support. Aside from acting as a “wake-up” for researchers, they can push universities into providing workable, easy to use, institutional storage - either working storage or preservation in an institutional repository.

Discussion then moved round to the EPSRC expectations for research data, with those who attended a recent DCC event on the EPSRC expectations reporting that the EPSRC are not looking to get rid of opportunities for supporting research, so are not likely to cut off funding come May 2015. However, they do expect to see evidence that institutions are working towards or trying to improve storage, support, and data discovery and access. Nonetheless, there is no doubt the EPSRC policy has focused knowledge and effort in institutions towards RDM. Then training was mentioned. When the “T” word is mentioned I often think of that line about if people don't want to come how are you going to stop them? To save us from preparing to teach to empty rooms, the thinking now seems to be towards providing support when people need it and building up a directory of experts to refer to when appropriate. Structured support is based on identifying four key stages in the data lifecycle: submitting a proposal (for help on data management planning), when proposals are accepted (implementing RDM), mid-project (supporting implementation), and towards the close to talk about preservation. The key is to keep engagement with researchers. One institution is trying to do this for all research projects at that institution so is working with their research office to target RCUK funded projects. Another institution initially plans to work with a sample of projects.

By now the discussion had moved on to data management planning. One institution had a Data Management Plan (DMP) template and DMP requirement as part of its data policy, with separate plans for staff and postgraduate students. The feeling was that template texts are not such a good thing if they are copied and pasted into DMPs. A case was mentioned of one research funder refusing to fund a project because the DMP used identical text to another DMP submitted from that institution. The DCC’s DMPOnline tool was mentioned, particularly its ability to be customised towards an institution. It was also mentioned that DMPOnline has been much improved in later versions. A policy was mentioned at one institution of not offering storage until a DMP has been completed; another institution reported that there is a checkbox in the research office to signify that the DMP has been looked at by the data management officer.

The RDM equivalent of Godwin's law (or Godwin's Rule of Nazi Analogies) is that at some point cost will be mentioned. How to cost RDM is an ongoing problem. One difficulty is identifying costs that specifically relate to RDM activity, as opposed to typical research requirements that have an RDM aspect. An additional problem is that RCUK funders mostly allow budgeting for RDM, but that budgeting must not identify activity that is supported as part of general institutional funding. Auditing costs is a problem. Storage tends to have the easier to identify costs (storage per byte for example), but this can be a problem if data is stored in an institutional repository when the budget for the project identified separate storage costs. For this reason, solutions like Arkivum may be advantageous as they can be specified as an auditable cost.

The coda to this discussion concerned metadata. It was said that funders were keen on ensuring that good quality metadata accompanies research data generated by projects they support, and that they are willing to allow proposals that factor in additional time and resources for metadata. However, an obvious problem is who should be adding that metadata: is it researchers, who know the data but do not necessarily know the standard or see its importance in the way RDM support staff do; or should it be RDM staff, particularly repository staff, who know the type of information required but do not necessarily know the data or discipline that well? Finally, hitting on a standard that is applicable to all data is a problem. Social science is not the same as genetics; art history is not the same as management. It was then asked if there was a way to harvest metadata when that metadata is created elsewhere (say, the UK Data Service). Both the DCC and UK Data Service are working on a Jisc funded Research Data Registry and Discovery Service, and the European Union is also working on data discovery platforms that import/export catalogue record metadata.

The feeling at the end of this initial meeting was LARD provided a useful forum for sharing practice and learning from contemporaries and there was enthusiasm for follow-up meetings including those based around structured themes. If you work in a big city, and there are people doing similar things to you in that city, take advantage and get together to talk. So, thanks to Gareth Knight (LSHTM), Stephen Grace (UEL), and Veronica Howe (KCL) for organising, facilitating, and hosting LARD #1.

IASSIST 2014 Conference Submission Deadline EXTENDED to December 20

By popular request (and due to the tight holiday schedule this year), the IASSIST 2014 Conference Programme Committee has extended the conference submission deadline!

Submissions for all formats are now due December 20, 2013.

Thank you for all of your submissions to date; we look forward to the review.

Please let us know if you have any questions - email.

Best,

Program Chairs

Jen Green
Johan Fihn
Chuck Humphrey

(And in case this is new to you...)

The theme of the conference is "Aligning Data and Research Infrastructure" and the meeting will be held in Toronto, Canada 3-6 June 2014. The conference program emphasizes three tracks: Research Data Management, Professional Development, and Data Developers and Tools. Participants may propose individual papers, complete sessions, poster/demonstrations, Pecha Kucha, roundtable discussions, and workshops.

Conference overview: http://www.library.yorku.ca/cms/iassist/
Call for Papers: http://www.library.yorku.ca/cms/iassist/call-for-papers/
Online submissions: http://staff.lib.muohio.edu/~sekyerk/iassist14/
Workshop proposals: email Workshop Coordinator Lynda Kellam

Please spread the word about the impending submission deadline and IASSIST's exciting 40th Anniversary conference!

I am he as you are he as you are me and we are all together

I'm just in the process of updating who we follow from our @iassistdata twitter account (we follow members who follow us - when I get round to updating things, sorry).

Given the huge* number of followers we now have, (595, thank you one and all) I thought it would be interesting to see what we looked like according to our twitter bios.

No surprises: we define ourselves as data people or organisations, in terms of "research", "librarian" (and library related terms), "social" "science", "digital", "information", and "universities". It suggests people following us are the type of people that should be following us given the organisation's goals, and hopefully are getting some value from following @iassistdata.

*Obviously a subjective assessment when Justin Bieber has 44,625,042.

 @iassistdata twitter follower bios

Ich bin ein IASSISTer

From 28-31 May, GESIS - Leibniz Institute for the Social Sciences hosted the 39th Annual Conference of the International Association for Social Science Information Service and Technology, aka #iassist2013

IASSIST conferences provide an overview of what’s happening in information technology and data services and allow exchange of ideas between participants working in different backgrounds - from social science and humanities to information and computer science. The aim of this year's event was to help us move closer to the dream of technical and organizational measures that make research data discoverable and accessible.

Two hundred and eighty-five participants were welcomed to Cologne by GESIS President York Sure-Vetter ahead of a program of workshops, presentations, posters and discussions around this year’s topic of "Data Innovation: Increasing Accessibility, Visibility, and Sustainability".

The first day of the conference offered eight workshops, providing participants the opportunity to look at specific topics like licensing data, data visualization or DOI assignment. Sessions on a variety of tools and methods were also offered, specifically the OLAP analysis method, R open source software, and CharmStats - GESIS’s newly developed data harmonization software which was formally launched at IASSIST.

Over the following three days there were a total of three plenaries and 32 concurrent sessions organized in three tracks.

Presentations and discussions were concentrated in the track "Research Data Management" (RDM). This embraced a spectrum of topics related to all aspects of the data lifecycle. Emphasis was on policies, strategies and tools to support researchers in managing their research data. In addition presentations demonstrated various supporting collaborative infrastructures and virtual research environments at institutional, national or international level. Another focus was data citation and publications to enhance discoverability of data and professional credit for data sharing. Additional discussion offered answers to the question of how responsible use of complex or sensitive data can be facilitated. Finally sessions in the RDM track dedicated themselves to the subject of data curation and long-term preservation.

The track "Data Developers and Tools" presented a technical point of view with offerings from those working in application development – seasoning their work with a good dash of metadata. Questions were asked and solutions presented on the topics of interoperability, interconnection and integration, and preservation of data. The DDI metadata standard played a special role here: many tools and applications were introduced that simplify the creation and management of DDI metadata or provide value-added services built on the standard.

The track "Data Public Services/Librarianship" confronted aspects of access to research data. Here, development of data services from country-specific perspective (Bosnia and Herzegovina, Serbia, Croatia) was highlighted, but the track also managed to look at specific data types (non-digital, historical, confidential and sensitive data).

Slides of the presentations and video recordings of selected events will be published in the coming weeks on the IASSIST website, providing you an opportunity to plunge into the world of IASSIST 2013. Let’s do it all again in Toronto for IASSIST 2014!

Astrid Recker, Laurence Horton, Alexia Katsanidou
GESIS Archive and Data Management Training Center 

IASSIST 2013 by the numbers

  • 285 participants from 29 countries (a new IASSIST record!)
  • Nearly two-thirds from Europe (64%), one-third from North America
  • 2 participants from Africa and 9 from the Asia-Pacific region

Top 5 countries represented

  • Germany: 88
  • United States: 66
  • UK: 32
  • Canada: 23
  • Netherlands: 10

Activity

  • 8 workshops with 103 participants
  • 32 parallel sessions featuring 126 presentations
  • 35 Posters
  • 11 Pecha Kuchas
  • 3 Plenary Sessions
  • 2 songs
  • 1 Banquet
  • Lots of white asparagus served
  • Many glasses of Kölsch drunk
  • ∞ Complaints about the venue Wi-Fi