Q and A on data-librarianship

By jajacobs | June 22, 2006

During the colloquy hosted by the Chronicle of Higher Education today, there were a number of questions about the role of libraries and librarians in preservation of and access to data. Although the focus was on e-science more than social science, the questions and answers might be of interest. All the questions and answers are available here. D. Scott Brandt, associate dean of libraries at Purdue University, is helping to build a repository of scientific data.


– jim jacobs




Question from Lila Guterman, The Chronicle of Higher Education: I’m curious about how scientists and librarians are trained to deal with all these data sets. Do library students have to take statistics? Do science graduate students take computer science?


D. Scott Brandt: The researchers we’ve talked to recognize that they don’t have the skills (or the time) for dealing with the organization of their data—formatting it, enhancing it with metadata, curating and/or archiving it. At Purdue they look to librarians to help them. The librarian situation is interesting. For instance, we are posting a position called data research scientist, which is a library-based researcher applying technical skills to metadata problems. But librarians have an important role in curating and archiving data, applying library science principles to e-science to resolve problems related to metadata, ontologies, data management, etc. Thanks!


Question from Sarah Everts, Chemical & Engineering News: I’m interested in the people who might consider a career trying to solve this problem. What skills are required to handle this data? If someone gets training in digital science data storage (i.e., to handle the tech side of things), what kind of workforce demand is there now and 5-10 years in the future?


D. Scott Brandt: Here’s how we describe the data research scientist (DRS) position which works in this area: carry out sponsored research projects related to data, datasets and data mining applications, including data description and enhancement; collaborate with the Libraries’ and university data producers and repository contributors to develop cost effective and efficient strategies and reliable data streams for managing data and importing it into the institutional repository; organize access to data and related resources using traditional and emerging metadata schema; track developments in data management practices, as well as recommend and design appropriate applications to facilitate and enhance access to data sets and other collections. My guess is that it will take 5-10 years to develop the tools and techniques to make this doable for the future.


Question from Charon, Research Institution: Do you believe that librarians have the necessary skills to build data repositories for such disparate data? Archiving the data as a single “blob” will not provide the value that archiving via a relational database would.


D. Scott Brandt: While we work with data, we are really more concerned with the metadata and the collection management process, which covers not only preservation, but access and use. And we believe that for data to interoperate, metadata has to interoperate.


Question from Michael, a Cal State Univ. Campus: Beyond archiving raw data, what is the role of libraries in archiving and hosting web portals to access the resultant scholarship? We are struggling with whether our library should assume this role or if the colleges with which the faculty are affiliated should take on the responsibility for developing and maintaining websites for journals and other online resources. Your thoughts?


D. Scott Brandt: At Purdue, we’ve developed a distributed institutional repository framework that provides an application layer, and we work to support protocols and interfaces for not only our own applications (e.g., portal), but those developed by others outside of the Libraries. We’ve tried to provide the tools they need and support for them.


Question from Brian Simboli: As a librarian, I know that we’re already quite busy with many other things. Specifically, what will the staff costs be? Do you see dedicating someone to maintain this at least part time?


D. Scott Brandt: The role of the librarian is changing… In the face of large scale digitization and rapidly advancing technology, these are both exciting and perhaps threatening times for many librarians. We see curating datasets as a function of building collections, not unlike acquiring other material… We predict that in ten years this will be a comfortable and familiar facet of librarianship—and could be part of any librarian’s collection development responsibilities.