Blogs

Chronology of data libraries and data centres

A few days ago I asked on the IASSIST mailing list for help finding out when data libraries, data centres and similar services were created. It was overwhelming to receive answers from colleagues everywhere, with dates and other useful information about the establishment of local data support and national services.

There is a wealth of information in this community around these issues and with the increasing importance of data services we need to make sure we collect and make this information accessible. After all, our data obsession comes with the trade. ; )

Many colleagues asked for all the information to be compiled and shared. I have therefore prepared an initial Google Sheet titled "Chronology of data libraries and data services" with the information from all responses.

I have added a few extra fields, such as country and type of service, but I am sure there are many others that could be interesting. The list is by no means complete or perfect, so I ask again for help from colleagues to add or edit entries (you will need to request edit access for this).

I also wonder whether other information about IASSIST membership could be merged in to construct an even more powerful dataset. All comments, suggestions and offers to volunteer are welcome.

IASSIST Fellows Program 2014-15

The IASSIST Fellows Program is pleased to announce that it is now accepting applications for financial support to attend the IASSIST 2015 conference in Minneapolis [https://sites.google.com/a/umn.edu/iassist-2015/], from data professionals who are developing, supporting and managing data infrastructures at their home institutions.

Please be aware that funding is not intended to cover the entire cost of attending the conference. The applicant's home institution must provide some level of financial support to supplement an IASSIST Fellow award. Strong preference will be given to first time participants and applicants from those countries currently with insufficient representation at IASSIST. Only fully completed applications will be considered. Applicants submitting a paper for the conference will be given priority consideration for funding.

You may apply for funding via this form <https://docs.google.com/spreadsheet/viewform?usp=drive_web&formkey=dEhLcnNIcE4xWW9NUzBwZnViNy1sUWc6MA#gid=0>. The deadline for applications is the 31st of January 2015.

For more information, to apply for funding or to nominate a person for a Fellowship, please send an email to the Fellows Committee chairs, Florio Arguillas (foa2@cornell.edu) and Stuart Macdonald (stuart.macdonald@ed.ac.uk).

Hallelujah and praise the LARD! The first London Area Research Data group meeting

LARD is London Area Research Data and this was its inaugural meeting, informally bringing together various people from London based institutions (and as far away as Reading) who are charged in some way with Research Data Management (RDM) - be it research support or repository work.

These are my notes, which lack attribution partly because I couldn't remember where every person was from, and also it wasn't clear if the meeting was on or off the record. Nonetheless, I felt there were some interesting points that deserve sharing as an insight into how UK universities (and one research centre) are dealing with RDM less than a year away from the EPSRC deadline on expectations of compliance for research data.

The first item in what was a free form discussion (think RDM jazz - hence my beat style kind of note taking, with full stops, however) was policies. Some institutions have data policies, some have draft policies, and others have no policy. The mood seemed to be that a policy was more effective as a mandate for focusing university attention and resources on support services, not so much for grabbing researchers’ attention. Researchers, it was said, tend to react more to what funders want than to university policies or documents. Those universities that competed for Medical Research Council (MRC) funding felt the MRC demanded institutional data policies, and so those institutions tended to have adopted policies or to have drafts ready for adoption. Yet most researchers are not funded by one of the RCUK councils, and their funders often have no data mandates. The group found it a problem to tell researchers that they don’t own their own data (it’s often funders or institutions, through employee-created works clauses). There was also a sense that researchers worry about data protection and are looking for practical guidance on how to keep data safe and secure. There was also a recognition that disciplines matter: those disciplines that do not have a strong culture of sharing data can be helped by the weight of institutional support providing the infrastructure for RDM. This tackles the disciplinary focus of researchers, or localism. An example of how a bad experience can focus attention was mentioned: a researcher lost data by plugging a malware-infected hard drive into a university network and had to have the drive and the copy of the data destroyed. Episodes like this can be used to tackle the culture of “improvisation” when it comes to researchers “backing-up” their data without institutional support, or without engaging with it. Aside from acting as a “wake-up” for researchers, such episodes can push universities into providing workable, easy to use, institutional storage - either working storage or preservation in an institutional repository.

Discussion then moved round to the EPSRC expectations for research data, with those who attended a recent DCC event on the EPSRC expectations reporting that the EPSRC are not looking to get rid of opportunities for supporting research, so are not likely to cut off funding come May 2015. However, they do expect to see evidence that institutions are working towards or trying to improve storage, support, and data discovery and access. Nonetheless, there is no doubt the EPSRC policy has focused knowledge and effort in institutions towards RDM. Then training was mentioned. When the “T” word is mentioned I often think of that line about if people don't want to come how are you going to stop them? To save us from preparing to teach to empty rooms, the thinking now seems to be towards providing support when people need it and building up a directory of experts to refer to when appropriate. Structured support is based on identifying four key stages in the data lifecycle: submitting a proposal (for help on data management planning), when proposals are accepted (implementing RDM), mid-project (supporting implementation), and towards the close to talk about preservation. The key is to keep engagement with researchers. One institution is trying to do this for all research projects at that institution so is working with their research office to target RCUK funded projects. Another institution initially plans to work with a sample of projects.

By now the discussion had moved on to data management planning. One institution had a Data Management Plan (DMP) template and DMP requirement as part of its data policy, with separate plans for staff and postgraduate students. The feeling was that template texts are not such a good thing if they are copied and pasted into DMPs. A case was mentioned of one research funder refusing to fund a project because the DMP used identical text to another DMP submitted from that institution. The DCC’s DMPOnline tool was mentioned, particularly its ability to be customised for an institution. It was also mentioned that DMPOnline has been much improved in later versions. One institution has a policy of not offering storage until a DMP has been completed; another reported that there is a checkbox in the research office to signify that the DMP has been looked at by the data management officer.

The RDM equivalent of Godwin's law (or Godwin's Rule of Nazi Analogies) is that at some point cost will be mentioned. How to cost RDM is an ongoing problem. One difficulty is identifying costs that specifically relate to RDM activity, as opposed to typical research requirements that have an RDM aspect. An additional problem is that RCUK funders mostly allow budgeting for RDM, but that budgeting must not identify activity that is supported as part of general institutional funding. Auditing costs is a problem. Storage tends to have the easier-to-identify costs (storage per byte, for example), but this can be a problem if data is stored in an institutional repository when the budget for the project identified separate storage costs. For this reason, solutions like Arkivum may be advantageous as they can be specified as an auditable cost.

The coda to this discussion concerned metadata. It was said that funders were keen on ensuring that good quality metadata accompanies research data generated by projects they support, and that they are willing to allow proposals that factor in additional time and resources for metadata. However, an obvious problem is who should be adding that metadata. Is it researchers, who know the data but do not necessarily know the standard or see its importance in the way RDM support staff do? Or should it be RDM staff, particularly repository staff, who know the type of information required but do not necessarily know the data or discipline that well? Finally, hitting on a standard that is applicable to all data is a problem. Social science is not the same as genetics; art history is not the same as management. It was then asked if there was a way to harvest metadata when that metadata is created elsewhere (say, the UK Data Service). Both the DCC and UK Data Service are working on a Jisc funded Research Data Registry and Discovery Service, and the European Union is also working on data discovery platforms that import and export catalogue record metadata.

The feeling at the end of this initial meeting was that LARD provided a useful forum for sharing practice and learning from contemporaries, and there was enthusiasm for follow-up meetings, including some based around structured themes. If you work in a big city, and there are people doing similar things to you in that city, take advantage and get together to talk. So, thanks to Gareth Knight (LSHTM), Stephen Grace (UEL), and Veronica Howe (KCL) for organising, facilitating, and hosting LARD #1.

Version 4, Research Data Curation Bibliography & the IQ

Another reason to write for the IQ: you might get yourself into Charles Bailey's prestigious bibliography, at

http://digital-scholarship.org/rdcb/rdcb.htm

I'm pleased to see no fewer than 7 IQ articles in the latest version. I didn’t count IASSISTers who published elsewhere, but several of those were in the list as well.

Research Data Curation Bibliography

Charles W. Bailey, Jr.

Houston: Digital Scholarship

Version 4: 6/23/2014

Altman, Micah, and Mercè Crosas. "The Evolution of Data Citation: From Principles to Implementation." IASSIST Quarterly 37, no. 1-4 (2013): 62-70. http://www.iassistdata.org/iq/evolution-data-citation-principles-implementation

Bender, Stefan, and Jorg Heining. "The Research-Data-Centre in Research-Data-Centre Approach: A First Step towards Decentralised International Data Sharing." IASSIST Quarterly 35, no. 3 (2011): 10-16. http://www.iassistdata.org/iq/research-data-centre-research-data-centre-approach-first-step-towards-decentralised-international

Mooney, Hailey. "A Practical Approach to Data Citation: The Special Interest Group on Data Citation and Development of the Quick Guide to Data Citation." IASSIST Quarterly 37, no. 1-4 (2013): 71-77. http://iassistdata.org/iq/practical-approach-data-citation-special-interest-group-data-citation-and-development-quick-guide

Ribeiro, Cristina, and Maria Eugénia Matos Fernandes. "Data Curation at U. Porto: Identifying Current Practices across Disciplinary Domains." IASSIST Quarterly 35, no. 4 (2011): 14-17. http://www.iassistdata.org/iq/data-curation-uporto-identifying-current-practices-across-disciplinary-domains

Schumann, Natascha. "Tried and Trusted: Experiences with Certification Processes at the GESIS Data Archive." IASSIST Quarterly 36, no. 3/4 (2012): 24-27. http://www.iassistdata.org/iq/tried-and-trusted-experiences-certification-processes-gesis-data-archive-0

Schumann, Natascha, and Astrid Recker. "De-mystifying OAIS compliance: Benefits and challenges of mapping the OAIS reference model to the GESIS Data Archive." IASSIST Quarterly 36, no. 2 (2012): 6-11. http://www.iassistdata.org/iq/de-mystifying-oais-compliance-benefits-and-challenges-mapping-oais-reference-model-gesis-data-arc

Yoon, Ayoung, and Helen Tibbo. "Examination of Data Deposit Practices in Repositories with the OAIS Model." IASSIST Quarterly 35, no. 4 (2011): 6-13. http://www.iassistdata.org/downloads/iqvol35_tibbo.pdf

Congratulations to the authors.

Robin Rice, IASSIST Web Editor

Data Visualization Support Roles

Hello IASSISTers,

Since our last entry, the Data Visualization Working Group (DVIG) has been connecting through email to gather information and share knowledge about data visualization tools, best practices, teaching, and events of interest. A major theme of conversation has been open source programming frameworks, such as R and its statistical packages, for producing visualizations. Many non-programming tools have also been discussed and shared. The question of tools is not an easy one, and there are a lot out there!

For a list of tools (not comprehensive) see: DVIG Tool list (opens in Pearl Trees)

What are we doing with visualization?

Some members are considering licensing software, such as Tableau (http://www.tableausoftware.com/), for their institutions. Others are considering adding visualization features to existing data repositories or portals, while others are considering these for upcoming data collections and repository development. The creation of a blog series on experiences with different repository software has been discussed, as well as a list of criteria for choosing between licensed software and repository systems. Many are concerned about the scalability of tools and are interested in the application of visualization techniques across disciplines and groups.

Now, there are a lot of ways to visualize data. The following diagram describes the process of creating and making sense of visualizations, and might be helpful for our discussions and understanding.

[Figure: Process – Information Workflow]

Taken from: Aisch, Gregory. "Using Data Visualization to Find Insights in Data." Data Journalism Handbook, Open Knowledge Foundation. http://datajournalismhandbook.org/1.0/en/understanding_data_7.html. Accessed 2014-07-11.

Regardless of where we are, we all agree that data visualization is COOL! And it needs support. Unfortunately, the skills needed to perform data visualization and “data wrangling” projects are not taught widely in higher education. However, some institutions have made strides in developing these core skills and training, and others are now developing curricula.

Here is a short list of current courses and teaching materials:

University of Washington, Data Visualization (winter 2014)

New York University, Certificate in Data Visualization

Columbia University, Data Visualization

University of Kansas, Managing Research Data in the Social Sciences (incl data wrangling) (summer 2014)

University of British Columbia, Information Visualization

University of Toronto, Big Data Analytics (fall 2014)

A major theme at this past IASSIST Conference was data support roles. Sessions on data visualization topics, such as R programming packages, developing library support services, teaching tools and undergraduate pedagogy, and current research, were very well attended. The IASSIST community is engaged in data visualization at almost all stages of the process workflow (see above).

Libraries can play an important role in supporting researchers…

Libraries serve as an ideal place on campus to support visualization for a number of reasons. Data visualization is a truly interdisciplinary activity that is growing in importance in a wide variety of fields. Even the techniques involved draw on diverse fields, from statistics to computer science to design. As such an interdisciplinary field, and one that benefits so many diverse fields, visualization has a natural home in the library. Rather than individual disciplines each developing support, knowledge and tools for visualization, these advances can be shared across campus by making the library a central point of visualization activities.

Furthermore, providing support for data visualization in the library can amplify other data related activities. As libraries increasingly move to collecting, managing and preserving complex datasets, offering services that can assist in making sense of that data will make it all the more valuable. Moreover, visualization services provide additional opportunities to inform researchers of support for research and data within libraries.

Data Viz at University of Michigan Libraries

At the University of Michigan we are in the process of developing our services to support visualization. Historically our support for data visualization has developed in two different parts of the library. Both our 3D Lab (part of our Digital Media Commons group) and our Spatial and Numeric Data Services (SAND – part of the Clark Library for Maps, Government Information and Data Services) have supported and continue to support various types of data visualization, mapping and working with complex types of data. In SAND, where I am located, we focus primarily on helping researchers, students and faculty through consultations where we teach people techniques and how to use appropriate software, rather than producing finished products. While we all have our favorite pieces of software, we attempt to balance our patrons’ familiarity with particular tools against the most effective software for their goals. We also offer open workshops and course instruction on various data visualization and mapping topics.

While we would ideally like to be able to support the entire spectrum of data visualization activities, one of the most challenging aspects of supporting visualization is providing a scalable service, or at least supporting a variety of scales, to best benefit one’s campus. Providing consultations around producing graphs and charts for presentations and publications seems easily within the scope and scale of traditional library consultations. Providing production services and assisting with large scale projects, such as the creation of interactive web environments or visualizing terabytes of data, often requires more time and effort than we usually have to devote to individual projects. Still, libraries, in providing a space for whatever assistance is possible and helping researchers and students understand the resources required for a given project, can offer an invaluable service to our campus communities.

Supported projects at the University of Michigan:

19th Century Acts - http://19thcenturyacts.com/

Mapping Moby Dick - http://record.umich.edu/articles/technology-meets-literature-students-map-classic-novel

 

Many thanks for all the collaboration on the DVIG list, 

Amber Leahey (University of Toronto) & Justin Joque (University of Michigan)

IASSIST 2014 conference song

Sung to the tune of Gordon Lightfoot's  "If You Could Read My Mind"; thanks to San, Paula, Bill, and Vince for their suggestions, and Vince, Dan and Kate for helping to drown out my own voice :) If anyone has a video version of this, send me the link, so I can add it here:


If you could read my data
What a tale these points would tell
At Toronto's IASSIST 40
Data folk began to dwell
At the hockey game
We began to meet
With colleagues and old friends
These meetings never end
Aligning data with infrastructure of research is what it's all about

If you were at this IASSIST
Many fantastic talks you heard
All of the sexy specialists
Seemed to tweet on what they heard
When you reached the talk 'bout the data dude
The seating was all gone
The tweeting would go on
We don't want the talks to end
Because excitement's just too hard to fake

Chuck walked away with the plenary
When the speaker didn't show
Improv -- way to go!
IASSIST's big tent will cover everyone who works with data now
We will show them how
The Steam Whistle banquet was great
Where we could get more beer with tickets
Myron's talk went slightly wrong, his script was gone but he managed to get it back

If you could read my hashtag
You'd have seen some tweets galore
How Justin Hayes raps data
Declare variables not war
Robin's free at last, 'cause her session's done
And Bit Rot Bitter's cool
The tweeting never ends
If you read the twitter feed
You soon will see the many things we do
The talent's always there
We always seem to feel this way
And we've got to say that we really get it
The only thing that seems so wrong
Is the long time before we meet again.


IASSIST SIGDMC Annual Report 2013-2014

By Carol Perry & Stefan Kramer, co-chairs
Last updated: 2014-05-29 by CP

  • The major activity of the Data Management & Curation Interest Group (SIGDMC) in the last year was the conceptualization, organization, submission, and offering of the June 2, 2014, morning workshop Data Management & Curation: Lessons from Government, Academia, and Research. It features seven invited presenters, and session and breakout group moderators from the SIGDMC membership, which also provided input on the breakout group topics.
  • As of May 26, 2014 SIGDMC membership is at just under 70, having been fairly steady over the year in terms of Google Group membership.  
  • The Data Management and Curation Resources page on the IASSIST website has been reviewed and updated. The list now contains 59 resources;  9 new resources were added since May 2013. Minglu Wang, Limor Peer and Wendy Mann are responsible for this resource. 
  • Progress was made in keeping the IASSIST blog active; however, we did not quite meet our goal of one blog post per month.
  • The members who attend the annual IASSIST conference in Toronto have been invited to participate in an in-person meeting on June 4, where the outcome of the election of a successor to Carol Perry as co-chair will first be announced, and future goals for the group will be discussed.

Research Data Management Issues Across Environments

Lots of conversations are going on these days in different venues where people are asking many of the same questions: how do we teach researchers about data management with limited staff, and what data management services should we offer? How do we find sustainable ways to manage data that leverage the efforts of many different repositories: those in government, institutional ones and disciplinary ones? How do we coalesce standard practice and reasonable but effective policies at least at the national level, and preferably on a global scale? What roles should governments play? How much can we as data professionals accomplish on our own? The Data Management and Curation SIG will host a workshop to talk about these and other issues across different countries and environments next Tuesday. Our speakers will include:

  • Dan Gillman, U.S. Bureau of Labor Statistics
  • Marcel Hebing, DIW Berlin
  • Chuck Humphrey, University of Alberta
  • Steven McEachern, Australian Data Archive
  • Barry Radler, Institute on Aging, University of Wisconsin-Madison
  • Robin Rice, EDINA and Data Library at the University of Edinburgh
  • Kathleen Shearer, Confederation of Open Access Repositories and Research Data Canada

Looking forward to seeing many of you in Toronto!

Michele Hayslett, University of North Carolina at Chapel Hill & Stefan Kramer, American University

One week to IASSIST 2014!

It’s a week to go until IASSIST 2014 begins! (If you haven’t registered yet, better get to it!) Here are a few things you might like to know before you get here:

  • If you haven’t looked at them yet, please peruse the “How to get here” and “Local information” pages on the conference website, under ‘Visitors’ on the top menu (where you can find a map-based guide to Toronto attractions):

    • “How to get here” lists ways to get to the downtown core from the airport;
    • “Local information” has links to maps, information on how to get around Toronto, links to weather information from Environment Canada, plus links to guides to restaurants and drinking establishments in and around town.

  • If you’re a baseball fan, the Toronto Blue Jays are playing home games on June 1, 6, and 7; see the June schedule for details.
  • If, at any point, you want to pick up some groceries, there's a huge grocery store -- right next to the Mattamy Centre (site of the Tuesday reception) -- known as Loblaws at Maple Leaf Gardens; check out the "Wall of Cheese"!
  • If your taste runs to reading, there's an Indigo store (Canada's major bookstore chain) about 10 min from the conference hotel by foot, on Yonge St.

As far as food and drink go, the variety you'll find in the city, even restricted to the downtown area, is nothing short of remarkable; there's even a multicultural food court just down the street from the conference venue. Later this week we'll be adding to the above links a personal guide to eating and drinking establishments prepared by one of our volunteers.

Your Local Arrangements Committee and volunteers will do our best to ensure you have a pleasant and productive stay in Toronto. Please let us know if you have any specific questions. We look forward to greeting you next week for IASSIST 2014!

New 'Special Issue' IQ now available!

Editor’s notes

Special issue: A pioneer data librarian

Welcome to the special volume of the IASSIST Quarterly (IQ (37):1-4, 2013). This special issue started as an exchange of ideas between Libbie Stephenson and Margaret Adams to collect papers relating to the work of Sue A. Dodd. Margaret Adams (Peggy) acted as the guest editor, and the background and content of this volume are described in her preface on the following page. As editor I want to especially thank Peggy and Libbie for pursuing and finalizing their excellent idea. I also want to thank all the authors who contributed to this volume. As one of the authors I can attest that Peggy did a great job.

Articles for the IASSIST Quarterly are always very welcome. They can be papers from IASSIST conferences or other conferences and workshops, from local presentations or papers especially written for the IQ. When you are preparing a presentation, give a thought to turning your one-time presentation into a lasting contribution to continuing development. As an author you are permitted “deep links” where you link directly to your paper published in the IQ. Chairing a conference session with the purpose of aggregating and integrating papers for a special issue IQ is also much appreciated as the information reaches many more people than the session participants, and will be readily available on the IASSIST website at http://www.iassistdata.org.

Authors are very welcome to take a look at the instructions and layout:
http://iassistdata.org/iq/instructions-authors

Authors can also contact me via e-mail: kbr@sam.sdu.dk. Should you be interested in compiling a special issue for the IQ as guest editor(s) I will also be delighted to hear from you.

 

Karsten Boye Rasmussen

April 2014

Editor
