Conference Presentations 2008

  • IASSIST 2008 - Technology of Data: Collection, Communication, Access and Preservation, Stanford, CA
    Host Institution: Stanford University Libraries and Academic Information Resources

G3: Beyond Numbers: Preserving and Delivering Non-numeric Collections (Fri, 2008-05-30)
Chair: Jennifer Green, University of Michigan

  • Two New Content Services on the 1956 Institute Portal
    Zoltan Lux (1956 Institute)


    The 50th anniversary of the 1956 Hungarian Revolution prompted many new written and electronic works. We have now created a database of these works (books, films, websites and conferences) and of reviews of them. The photographic documentation database has taken a further step forward with the project entitled “Processing of Photographic Life’s Works”. Our plan is to process the work of four to six living photographers each year. This will include recording a life interview with each photographer, which can be read and heard on the internet service. The photographs selected to present the life’s work can be ordered for various uses or commented upon online. The content service will also be accessible in English.

    Under development: both projects will be developed further. The subject area of the “Review” database will be broadened, with the eventual intention of having it cover historical publications on contemporary Hungarian history (if possible in two languages). This is more a problem of content development. We would also like to adapt the framework system of the “Photographic Life’s Works” database to our earlier photo-documentary database.


H1: The CESSDA ESFRI Project - Setting Up a One-Stop Shop for European Data (Fri, 2008-05-30)
Chair: Ken Miller, UK Data Archive

  • The Vision and Requirements - Legal, Financial, Governance
    Kevin Schurer (UK Data Archive)


    The principal outcomes of achieving these objectives will be the establishment of CESSDA as a legally constituted entity, membership of which will mark organisations as quality-assured centres of expertise in the preservation, management and dissemination of research resources. The undertaking will also position CESSDA as the first choice for the deposit of data by both national and pan-European bodies, for example Eurostat and DG Research.

    In line with the recently published NSF (National Science Foundation) report on cyberinfrastructures, the broad vision of the CESSDA RI upgrade can be stated as wishing to develop ‘a system of …. data collections that is open, extensible, and evolvable; and to support development of a new generation of tools and services for data discovery, integration, visualization, analysis and preservation…[consisting of]…. a range of data collections and managing organizations, networked together in a flexible technical architecture using standard, open protocols and interfaces, and designed to contribute to the emerging global information commons.’

  • The Infrastructure - Availability, Authentication and Access
    Atle Alvheim (NSD)
    Vigdis Kvalheim (NSD)


    The project will undertake strategic work to plan functionality enhancements in data access and data exploration. It will also develop functional specifications for a Shibboleth-based single registration and sign-on system for data users, incorporating a common CESSDA-wide end-user licence and logging of data use for statistical reporting and evaluation purposes. In addition, it will explore potential solutions to the problem of a common identifier system for data resources, including version identification for version control. However, all this depends on thorough procedural groundwork. Europe still consists of some 30+ independent countries, each with national legislation protecting data and privacy. The project aims to build an infrastructure of content, and the paper will discuss this as a two-fold problem of availability and access, covering both the technical and the procedural (legal and administrative) measures needed to ensure access and use. Who can access what, and for which purposes, is still a very relevant question.

  • The Services - Data Discovery, Harmonisation, Analysis and Dissemination
    Uwe Jensen (GESIS-ZA)


    The project will set up strategic plans for developments in metadata, data models and software upgrades for data and metadata capture, processing and management, which will complement the guiding services in online publishing, discovery, access and analysis of complex data types. Particular attention will be given to the advanced options of DDI 3.0 for handling complex datasets along the entire data life cycle.

    The extension of the thesaurus to other languages will enable natural language resource discovery for an increased number of European researchers. Accordingly, developments to the CESSDA portal will enhance the system for researchers by facilitating access to resources discovered via the thesaurus.

    The needs of comparative research associated with data harmonisation are the subject of a dedicated effort. On the content side, the central issues are identifying the demands for harmonisation to standards and carrying out conceptual work on metadata structures for harmonisation rules and conversion keys. Drafting a database and constructing middleware to facilitate the application of harmonisation techniques through the portal are the key tasks in setting up the functional specifications and realising the related technical solutions.

  • The Staff - Professionalisation, Network of Excellence, Training
    Adrian Dusa (RODA)


    The project will deal with the skills gap between the better developed archives and those which are less well resourced. It will establish a working group, undertake an audit of expertise, organise a workshop for CESSDA members and draft a report based on the results of the audit and the workshop.

    The aim is to extend the CESSDA network in terms of membership, professionalism and skills, and in terms of associations with similar organisations. The precise differences between CESSDA archives will be examined to identify the gaps (both technical and organisational) that will need to be filled to bring all organisations (both existing members and those which may seek to join at a later stage) to a common standard. This work will be complemented by the development of a self-assessment tool which will enable all archives to measure their conformance with the OAIS Reference Model. The combined effect of this work will be to raise standards within the community.

  • The Widening – Exploring Potentials and Possibilities of Extending the Research Infrastructure
    Brigitte Hausstein (GESIS-SAEE)


    The project will focus on the need for widening the participation in the CESSDA RI both directly by fostering membership in new countries and indirectly by deepening involvement and extending the CESSDA RI to agencies and organizations which remain outside of CESSDA yet continue to host important data collections. Although the current CESSDA network is extensive, including 21 countries, it is not totally comprehensive. Equally, the CESSDA network is currently rather heterogeneous, with some country members being younger and less-developed. These imbalances will be addressed.

    A special action plan will be set up in order to extend the existing CESSDA network and foster the development of national data archiving initiatives in those countries which are not currently part of CESSDA. The obvious purpose of this activity is to spread the CESSDA-network to each EU-member state and to create and maintain a ‘complete’ pan-European SSH network, including representation from emerging and candidate countries. Equally, some countries with organizations within CESSDA have specialised research-led project-based teams whose data currently fall outside of the CESSDA network, and mechanisms likewise need to be put in place to create better comprehensiveness and coherency.


H2: New Data, New Tools: the State of Software Development at the Minnesota Population Center (Fri, 2008-05-30)
Chair: Peter Clark and Bill Block, University of Minnesota

  • A Unified System for Processing Microdata Projects with Disparate Hierarchical Data Models
    Justin Coyne (University of Minnesota)


    Historically, the Minnesota Population Center (MPC) has dealt with 2-tier (person-household) hierarchical datasets that originate primarily from census data. Recent projects have data hierarchies that go well beyond the limits of the systems developed to process the 2-tier data. This has necessitated the development of a new system able to handle data structured in a multi-tier configuration. We contrast the structure of two datasets (American Time Use Survey and Integrated Health Interview Survey) and explore how our unified system addresses the unique challenges posed by each dataset. We will examine the techniques we used to identify and classify hierarchy, translate flat datasets into a relational database, and create queries with the relational data model. We offer a brief overview of the 4-stage pipeline that we use to process data, from collection through integration to extraction back into a flat data file.
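
    As a rough illustration of translating a flat hierarchical file into relational tables and extracting it back into flat rows, the following minimal sketch uses Python's standard `sqlite3` module. The record-type codes (`H`, `P`), the table and column names, and the sample lines are all hypothetical; the abstract does not describe the MPC's actual schema or tooling.

```python
import sqlite3

# Hypothetical 2-tier flat file: the first field is a record-type code.
# "H" = household; "P" = person belonging to the most recent household.
FLAT_LINES = [
    "H,1001,MN",
    "P,1,34",
    "P,2,36",
    "H,1002,WI",
    "P,1,71",
]


def load_flat_file(lines, conn):
    """Split a flat hierarchical file into household and person tables."""
    conn.execute("CREATE TABLE household (hh_id INTEGER PRIMARY KEY, state TEXT)")
    conn.execute(
        "CREATE TABLE person ("
        "hh_id INTEGER REFERENCES household(hh_id), "
        "person_no INTEGER, age INTEGER)"
    )
    current_hh = None
    for line in lines:
        fields = line.split(",")
        if fields[0] == "H":
            current_hh = int(fields[1])
            conn.execute(
                "INSERT INTO household VALUES (?, ?)", (current_hh, fields[2])
            )
        elif fields[0] == "P":
            if current_hh is None:
                raise ValueError("person record before any household record")
            conn.execute(
                "INSERT INTO person VALUES (?, ?, ?)",
                (current_hh, int(fields[1]), int(fields[2])),
            )
        else:
            raise ValueError(f"unknown record type: {fields[0]!r}")
    conn.commit()


def extract_flat(conn):
    """Join the tiers back into one flat row per person."""
    return conn.execute(
        "SELECT h.hh_id, h.state, p.person_no, p.age "
        "FROM household h JOIN person p ON p.hh_id = h.hh_id "
        "ORDER BY h.hh_id, p.person_no"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_flat_file(FLAT_LINES, conn)
rows = extract_flat(conn)
```

    A deeper hierarchy would add one table per tier, each carrying a key to its parent, which is one plausible way a multi-tier dataset could be accommodated.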

  • Domain Specific Languages for Data Editing
    Colin Davis (University of Minnesota)


    The Minnesota Population Center (MPC) offers microdata with constructed variables not found in the original datasets. These variables represent an important addition to the data disseminated by the MPC. However, recent MPC projects have involved datasets with more complex structure, necessitating the development of new tools to import, integrate and disseminate these datasets. The new toolset performs variable construction and complex data editing as the final phase of variable integration. The tools accomplish this by executing scripts that apply edits to variables. The editing scripts are written in a domain-specific language, or DSL, and are meant to be easy for a researcher without a programming background to read and modify. The MPC has used a rudimentary form of a DSL for the data editing of US Census microdata. For datasets with more richly structured data, a similar but more expressive and powerful domain-specific language was created. We examine the two styles of the data editing DSL and look at the challenges posed by the form of the American Time Use Survey data and the Integrated Health Interview Survey data compared with traditional census data.
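
    To give a sense of what a small data editing DSL might look like, here is a minimal sketch of an interpreter for one hypothetical rule form, `set TARGET = VALUE when VAR OP NUMBER`. The syntax, the variable names (`AGE`, `AGEGRP`) and the "later rules override earlier ones" behaviour are assumptions made for illustration; they are not the MPC's actual language.

```python
import operator
import re

# Comparison operators the hypothetical DSL understands.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}

# One rule per line: set TARGET = VALUE when VAR OP NUMBER
RULE = re.compile(
    r"^set\s+(\w+)\s*=\s*(-?\d+)\s+when\s+(\w+)\s*(<=|>=|==|!=|<|>)\s*(-?\d+)$"
)


def parse_script(text):
    """Turn script text into a list of (target, value, var, op, threshold)."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = RULE.match(line)
        if match is None:
            raise ValueError(f"cannot parse rule: {line!r}")
        target, value, var, op, threshold = match.groups()
        rules.append((target, int(value), var, OPS[op], int(threshold)))
    return rules


def apply_rules(rules, record):
    """Apply rules in order to a copy of the record; later rules override."""
    edited = dict(record)
    for target, value, var, op, threshold in rules:
        if op(edited[var], threshold):
            edited[target] = value
    return edited


SCRIPT = """
# Construct an age-group variable (hypothetical variable names).
set AGEGRP = 1 when AGE < 18
set AGEGRP = 2 when AGE >= 18
set AGEGRP = 3 when AGE >= 65
"""

rules = parse_script(SCRIPT)
edited = apply_rules(rules, {"AGE": 70})
```

    Keeping each rule on one declarative line is what makes a script of this kind readable by someone who does not program; the interpreter, not the script author, handles parsing and ordering.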

  • Building an Extensible Data Access System for Longitudinal Surveys
    Marcus Peterson (University of Minnesota)


    Though traditionally dealing in simply structured microdata, the Minnesota Population Center (MPC) has recently begun to move toward handling more complex types of demographic data. One such project involves the harmonization and dissemination of the National Survey of Families and Households (NSFH), a longitudinal dataset containing some 13,000 respondents surveyed over 16 years. In developing a database-driven, web-based access system for this dataset, the MPC aims to build an interface capable of disseminating a broad range of other similarly complex datasets. The NSFH dataset introduced several new programming challenges for the MPC, most relating to the presentation and storage of longitudinal data comprising more than 27,000 variables. Initially working with an assorted collection of codebooks, DDI and data files, the MPC developed tools to import the NSFH metadata and data into a database that is easily queried from a web application. By making this disparate data accessible through a database, the MPC has taken a big step toward realizing an access system suitable for this and other equally complex datasets. Here we will discuss the building of the data access portion of the NSFH dissemination system.

  • Time Well Spent: Building a System for Time Use Research
    Benjamin Ortega (University of Minnesota)


    The Minnesota Population Center is working with the University of Maryland's Joint Program in Survey Methodology to develop a data access system for the American Time Use Survey, a collection of time diary information from participants in the U.S. Census Bureau's Current Population Survey (CPS). This unique dataset offers potential for research on topics including work and family time, social policy impact, and many others, but the data's inherent complexity has limited its use so far to a small group of researchers. With the aim of facilitating research use of this data, we are building a system that combines comprehensive metadata and documentation with a tool that lets researchers work intuitively with time use and survey data. Our system enables researchers to specify customized aggregations of time spent involving particular activities, locations, times of day, and other criteria. Users can then view this data alongside CPS responses. In this talk, we will discuss our efforts to create the end-user component of this system, drawing on the MPC's previous experience building such tools while incorporating traditional and emerging trends in software architecture and usability to address the new challenges presented by the disparate qualities of this data.
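
    The kind of customized aggregation described above can be sketched as a filter-and-sum over diary episodes. The episode layout, activity and location labels, and the `total_minutes` helper below are hypothetical and only illustrate the idea; they do not reflect the actual American Time Use Survey file format or the MPC's system.

```python
from collections import defaultdict

# Hypothetical diary episodes:
# (respondent_id, activity, location, start_minute_of_day, duration_minutes)
EPISODES = [
    (1, "childcare", "home", 420, 60),
    (1, "work", "workplace", 540, 480),
    (1, "childcare", "home", 1080, 45),
    (2, "work", "home", 480, 300),
    (2, "childcare", "park", 900, 30),
]


def total_minutes(episodes, activities=None, locations=None, start_range=None):
    """Sum episode durations per respondent for episodes matching all criteria.

    A criterion left as None is not applied. start_range is a half-open
    (first_minute, last_minute) interval on the episode's start time.
    """
    totals = defaultdict(int)
    for resp, activity, location, start, duration in episodes:
        if activities is not None and activity not in activities:
            continue
        if locations is not None and location not in locations:
            continue
        if start_range is not None and not (start_range[0] <= start < start_range[1]):
            continue
        totals[resp] += duration
    return dict(totals)


# Time spent on childcare at home, per respondent.
childcare_home = total_minutes(EPISODES, activities={"childcare"}, locations={"home"})

# Time spent working in episodes that start before noon (minute 720).
morning_work = total_minutes(EPISODES, activities={"work"}, start_range=(0, 720))
```

    The per-respondent totals could then be placed alongside survey responses keyed on the same respondent identifier.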
