Conference Presentations 2008

  • IASSIST 2008 - Technology of Data: Collection, Communication, Access and Preservation, Stanford, CA
    Host Institution: Stanford University Libraries and Academic Information Resources

F1: Implementation, Application, and Sharing of DDI Resources (Fri, 2008-05-30)
Chair: Stefan Kramer, Yale University

  • Metadata Share Project (MSP)
    Joel Herndon (Duke University)
    Rob O'Reilly (Emory University)


    While the exponential growth of web-based data sources has expanded access for the research community, this same growth has presented a series of challenges to Data Libraries that attempt to promote a mixture of online data resources alongside library-licensed resources. Even with advanced content management systems, many Data Libraries devote a great deal of effort to describing similar sets of web-based data resources for local patrons. The Metadata Share Project attempts to reduce the effort required to document the growing number of web-based data sources by sharing DDI-compliant data descriptions across research libraries. This paper describes a test initiative by Duke University's Data & GIS Services and Emory University's Electronic Data Center to share library DDI resources in order to expand resources at both institutions while reducing the burden of documenting data resources in DDI. We hope to expand the discussion of sharing DDI resources across data libraries by discussing our experience.

  • Creating Enriched Publications with MPEG-21 DIDL, DDI 3.0 and Primary Research Data
    Rob Grim (Tilburg University)
    Paul Plaatsman (Erasmus Data Service Centre)


    One of the challenges for both scientific researchers and research libraries in the eScience era is the creation of scientific publication packages (SPPs)[1], wherein publications are combined with the primary resources that were used for the publication, such as research data, statistical programming code or stimulus materials. At the end of 2007, the library of the University of Tilburg and Erasmus University Rotterdam (EUR) started an interuniversity and multidisciplinary SURF[2]-financed project called “Together in Sharing” that aims to create SPPs for the economic and social science research domains. For this project, primary research data were used from surveys (European Values Study), experimental economics (CentERlab) and finance (EUR). The project used MPEG-21 DIDL as a general data model to represent and pack the digital objects in an SPP. DDI 3.0 instances are created to capture the metadata that researchers consider relevant for the enhanced publications, and as a means to build metadata records that can be harvested within a library portal environment. The production of SPPs also raises more fundamental questions, such as where to store SPPs in a national infrastructure that is equipped for archiving research data separately from publications.

  • DDI 3.0: Final Revisions and Future Directions
    Arofan Gregory (Open Data Foundation)
    Wendy Thomas (Minnesota Population Center)


    DDI 3.0 represented a major change from preceding versions of the standard, and was subject to several rounds of internal and public review as it developed. The last stage of review was a Candidate period of testing prior to its final release. This stage included applying the proposed standard to a variety of different real-world data sets, and its implementation in software tools. This paper summarizes the experience of the DDI Technical Implementation Committee during this final phase. It will highlight what was learned from the use cases and software implementations, in terms of both process and required adjustments in the standard, as well as discuss the anticipated future direction of the standard as it matures.

  • Documentation of German Labor Force Data at the IAB: First Experiences with DDI 3.0
    Claudia Lehnert (Institute for Employment Research)
    Joachim Wackerow (GESIS/ZUMA)


    The Federal Employment Agency (BA) is one of the most important producers of administrative data about the labor market in Germany. BA data are collected in the notification process of the social security system and in BA internal procedures for computer-aided benefit allowance, job placement and the administration of employment and training measures. The preparation and documentation of these process-generated data for researchers is performed by the Institute for Employment Research (IAB). Most of these data have been documented in PDF and MS Word formatted codebooks, with some information contained in an SQL database, but the documentation in its current structure is reaching its limits. We have therefore started to establish a new database organized according to the DDI 3.0 standard. This enables us to reuse metadata for different data collections and to track changes of variable definitions and frequencies over time. Our presentation focuses on the general structure of the documentation for administrative data, especially the use of different data sources for several data sets. Our initial experiences with DDI 3.0 will be illustrated by a structural outline and selected examples.

F2: Integration and Linking: Bringing Data and Documents Together (Fri, 2008-05-30)
Chair: Hans Jorgen Marker, DDA

  • Data in DSpace: Linking Archival Primary Documents and Quantitative Datasets
    Ann Marshall (University of Rochester)


    This paper investigates the potential for archiving primary source documents and the datasets created from these documents in the institutional repository DSpace. Through an initiative at the University of Rochester (U.R.), the Friends of the U.R. Libraries awarded a dissertation grant in support of depositing a unique dataset into the University’s digital archive. This grant funded the acquisition and digitization of WWII military documents currently located at the Bundesarchiv in Berlin, Germany. As a condition of receiving these funds, the doctoral student agreed to deposit both the digitized primary documents and the unique dataset created from these documents into DSpace. This approach has the potential to increase the awareness and use of DSpace, while also capitalizing on the contribution that doctoral students might bring to data depositories. The paper also discusses the use of DSpace technology as a data depository and considers current and future enhancements to DSpace. Issues such as metadata, the availability of funds, interface functionality, and copyright are important considerations for expanding this initiative.

  • Implementing a Digital Repository for the Preservation of Interdisciplinary Data
    Robert R. Downs (CIESIN, Columbia University)
    Robert S. Chen (CIESIN, Columbia University)


    Digital scientific data created during the last few decades offer potential for analysis by future users and for integration with other data from different disciplines to support interdisciplinary analysis, discovery, decision-making, and education. However, significant barriers remain in managing and documenting such data sufficiently to meet the needs of future and interdisciplinary users. One possible approach to overcoming these barriers is to develop and implement digital repository systems within an appropriate institutional context. We report here on progress in implementing a digital repository using the Fedora open source software, working with the Columbia University Libraries. After discussing platform selection, feasibility testing, and collection development policy issues, we describe our experience with data migration and parallel ingest of data. We then discuss current system enhancements, challenges, and plans to improve capabilities for ingesting data and for enabling dissemination that supports future applications and use.

  • KombiFiD – Combined Firm Data for Germany
    Tanja Hethey (Research Data Centre of the German Federal Employment Agency at the Institute for Employment Research)
    Anja Spengler (Research Data Centre of the German Federal Employment Agency at the Institute for Employment Research)


    In Germany, process-generated data and survey data on firms are collected by different data producers. Each data producer provides access for researchers to its data, but the combination of datasets from different producers is not possible at the moment. The KombiFiD project aims to overcome this limitation: firm data collected by the German Statistical Offices, the German Central Bank and the Federal Employment Agency will be merged for the first time. Our goals are twofold: to gauge the possibilities of merging selected datasets beyond the limits of individual labour market data producers, and to provide combined datasets to science, thereby creating new research opportunities. Our presentation outlines the status quo of the project. We describe the datasets selected for merging and explain potential merging problems. Moreover, we address research questions that can be analyzed for the first time with the unique new data, among them the possibility of monitoring the history of businesses on the basis of their combined single units.

  • We Inhabit the Same World: Integrating Socio-economic and Environmental Data
    Dr. Veerle Van den Eynden (UKDA)


    The Rural Economy and Land Use programme provides examples of how interdisciplinary research projects carried out by teams of social and natural scientists combine the use of socio-economic and environmental data. Data may be integrated through spatial integration (GIS), modelling, relational databases or data conceptualisations and visualisations. Integrations must take into account differences in data scale, area and framework. Whilst social science data are usually organised according to administrative areas, natural science data are based on grids or ecological zoning. Researchers use different approaches to optimise communication between such diverging data. Experiences of data integration also provide information on how best to organise and archive data to enable their long-term use within various research disciplines.


F3: The Challenges of Data Preservation (Fri, 2008-05-30)
Chair: Libby Bishop, University of Leeds

  • Aligning Digital Preservation Policies with Community Standards
    Nancy McGovern (ICPSR)


    Digital preservation policies are an essential component of an organization's digital preservation program. Yet, recent surveys show that many organizations that manage digital content do not have an explicit policy statement to delineate the mandate, purpose, scope, principles, and objectives of their digital preservation program. The Digital Preservation Management workshop series developed at Cornell University by Anne R. Kenney and Nancy Y. McGovern (Digital Preservation Officer at ICPSR) between 2003 and 2006 produced version 1.0 of a digital preservation policy framework. The resulting framework is a high-level policy document that is structured to be sharable with other organizations. Version 2.0 of the digital preservation policy framework was developed and tested at ICPSR. It builds on the Cornell model by aligning the components of the framework with the attributes of a trusted digital repository and incorporating key components of the Open Archival Information System (OAIS) Reference Model. This paper will discuss the digital preservation policy framework, present examples from the version 1.0 and version 2.0 models, discuss the structure and development of a comprehensive set of digital preservation policies for an organization, consider the connections between recent research and development on policy engines for digital preservation, and propose next steps for community policy development.

  • Challenges in Preserving Neuroimaging Research Data
    Angus Whyte (University of Edinburgh, Digital Curation Centre)


    Preserving neuroimaging research data for sharing and re-use involves practical challenges for those concerned in its use and curation. These are exemplified in a case study of a psychiatry research group. The study is one of a series with two aims: first, to discover more about disciplinary approaches and attitudes to digital curation through “immersion” in selected cases; second, to apply known good practice and, where possible, to identify new lessons from practice in the selected discipline areas. These aims were addressed through ethnographic study of current practices, and action research to assess risks, challenges, and opportunities for change. The challenges are in some ways archetypal of fields that are embracing “e-science”: how to reconfigure practice to improve data sharing and re-use, given the capabilities afforded by “cyberinfrastructure.” The evolution of those in neuroimaging is tied to the social and technological infrastructure underpinning the domain, and imaging centres such as the psychiatric research group in question. Its preservation challenges may be understood by examining relationships between its history, nature of the data collected, innovations in analysis, practices of sharing data and methods, and the evolution of data repositories in the domain.
