Conference Presentations 2015

  • IASSIST 2015-Bridging the data divide: Data in the international context, Minneapolis
    Host Institution: University of Minnesota

Pecha Kuchas (Thu, 2015-06-04)

  • Uncorked: Leveraging Data to Drink Better
    Ashley Jester (Columbia University)


    Since 2010, my partner and I have been recording information about the wine that is consumed by our household, making a point to gather data about each unique bottle. Together, we have accumulated detailed information about over 400 bottles of wine, including tasting notes, varietals, origin, and importer. While this data collection was originally intended to keep us from buying “bad” wines again, it has turned out to be a rich trove of information about the varietals we like, the importers we can trust, and the years that have proven to be good vintages. This Pecha Kucha will present an overview of this data, revealing both some of the substantive findings from our dataset and also the methodologies that have been applied to create the analysis. This Pecha Kucha will be a quick and fun tour of the international landscape through the lens of wine, with a focus on finding out the best way to use data to make more informed consumption choices.

  • The Data Service Centre (DSC) at Statistics Netherlands: Storing and Exchanging Statistical Data and Metadata
    Harold Kroeze (Statistics Netherlands)


    The Data Service Centre (DSC) is the central repository for datasets across the entire statistical field of Statistics Netherlands (SN). Its purpose is to archive the datasets as well as to enable easy, secure and monitored exchange of data and metadata. The DSC has the following characteristics:

    - Metadata first, data second.
    - Datasets are stored as text files (csv or fixed-width) and are described according to a metadata model.
    - Public access to metadata within SN; data access only after authorisation by the data owner.
    - Service-oriented approach: the backend system uses web services for communication with client tools.
    - It promotes re-use of variables and definitions.

    An organisation-wide project ("The treasure chest unlocked") was set up to describe and store the microdata sets that form the basis of our published data. The project also produced a number of tools to manage metadata and data (for example a Metadata editor and a Catalogue). This resulted in a very noticeable increase in the volume of metadata and datasets stored at the DSC. Data and metadata can now easily be shared within the organisation, and also with external researchers through our remote access facility.
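
    The "metadata first, data second" characteristics described in this abstract can be illustrated with a minimal sketch. It is not the DSC's actual design: the class names, fields, and methods below are hypothetical and only show the idea that metadata is visible to everyone while data access requires authorisation by the data owner.

```python
from dataclasses import dataclass, field


@dataclass
class VariableDescription:
    """Hypothetical description of one variable in a dataset."""
    name: str
    definition: str


@dataclass
class DatasetRecord:
    """Hypothetical repository entry: the metadata describes a text file."""
    dataset_id: str
    owner: str
    file_format: str  # "csv" or "fixed-width", as in the abstract
    variables: list[VariableDescription]
    authorised_users: set[str] = field(default_factory=set)

    def metadata(self) -> dict:
        # Metadata is readable by anyone inside the organisation.
        return {
            "dataset_id": self.dataset_id,
            "owner": self.owner,
            "file_format": self.file_format,
            "variables": [v.name for v in self.variables],
        }

    def authorise(self, granting_user: str, user: str) -> None:
        # Only the data owner may grant access to the data itself.
        if granting_user != self.owner:
            raise PermissionError("only the data owner can authorise access")
        self.authorised_users.add(user)

    def can_read_data(self, user: str) -> bool:
        return user == self.owner or user in self.authorised_users


# Illustrative values only; none of these come from Statistics Netherlands.
record = DatasetRecord(
    dataset_id="example-001",
    owner="owner_a",
    file_format="csv",
    variables=[VariableDescription("age", "Age in completed years")],
)
```

    Keeping the authorisation check on the record itself means that any client tool calling the repository gets the same answer, which fits the service-oriented approach the abstract mentions.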


D3: Leveraging metadata (Thu, 2015-06-04)
Chair: Sam Spencer

  • New ICPSR tools for data discovery and classification
    Sandra Ionescu (Inter-university Consortium for Political and Social Research (ICPSR))


    Capitalizing on rich, standardized DDI-XML metadata, ICPSR continues to develop its suite of tools for data discovery and analysis with new features and applications. Recently, ICPSR launched an innovative tool that enables linking individual variables with concepts to help increase granularity in the comparison of variables and/or questions across studies and series of studies. The tool allows users to create personalized concept lists and tag variables from multiple studies with these concepts; interactive crosswalks display the variable-concept associations to further assist in data analysis, comparison, and harmonization projects. In addition to personal concept lists, it is possible to create public lists so that an organization can apply its own authoritative tagging and make this resource publicly available. The concept tagging tool is integrated with ICPSR's variable search and comparison functions that have also been upgraded with a novel feature allowing retrieval of separate lists of variables measuring different concepts within the same study. We will present and discuss the tagging tool and the enhanced search features using live examples, and will also introduce the public ICPSR classification of the American National Election Studies and General Social Survey collections and the resulting crosswalk displaying the ANES Time Series and the GSS iterations by individual years.
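
    The variable-concept crosswalk described in this abstract can be sketched as a simple grouping of tags. This is an illustration under stated assumptions, not ICPSR's implementation: the function name `build_crosswalk`, the tuple layout, and the study and variable names are all hypothetical.

```python
from collections import defaultdict


def build_crosswalk(tags):
    """Group (study, variable, concept) tags into concept -> study -> variables."""
    crosswalk = defaultdict(lambda: defaultdict(list))
    for study, variable, concept in tags:
        crosswalk[concept][study].append(variable)
    # Convert to plain dicts so the result is easy to print or serialise.
    return {concept: dict(by_study) for concept, by_study in crosswalk.items()}


# Hypothetical tags; the study and variable names are placeholders.
example_tags = [
    ("Study A", "V101", "party identification"),
    ("Study B", "Q7", "party identification"),
    ("Study A", "V205", "trust in government"),
]
example_crosswalk = build_crosswalk(example_tags)
```

    Reading the result by concept shows which variable in each study measures it, which is the kind of side-by-side comparison a crosswalk display is meant to support.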

  • Public APIs: Extending access to the UK Data Service
    John Shepherdson (UK Data Archive, University of Essex)


    The UK Data Service is providing access to its data and metadata holdings by making public some of its web service APIs. These REST APIs facilitate a self-service approach for our data producers and researchers alike, whilst also enabling third-party developers to write applications that consume our APIs and present novel and exciting ways of accessing and viewing some of the data collections that we hold. We have put new infrastructure in place to enable the provision of these APIs and have already run an App Challenge (for external developers to build mobile applications against our APIs) and added a data collection usage "leader board" as initial tests of the functionality, capacity, account management, developer documentation and performance aspects of our public APIs. The main infrastructure elements are an API management service, HTTP caching and routing, and various API endpoints. The other major consideration was a set of design principles for the APIs so that developers have a consistent and predictable experience. This presentation will elaborate on the key components of the infrastructure and the API design guidelines.

  • Building a public opinion knowledge base at the Roper Center
    Elise Dunham (Roper Center for Public Opinion Research)
    Marmar Moussa (Roper Center for Public Opinion Research)


    A central and ongoing priority of the Roper Center for Public Opinion Research is the development and enhancement of state-of-the-art online retrieval systems that promote the discovery and reuse of public opinion data. It has become clear that foundational changes to the way the Center produces and manages its descriptive metadata throughout the data lifecycle would provide new and more efficient avenues for web application and tool development. In a collaborative effort to solidify the connection between cataloging and retrieval system development goals, the Center is developing a knowledge base system for managing and facilitating access to our vast collection of public opinion datasets. This presentation will provide an overview of the networked system of thesauri and controlled vocabularies that the Center is implementing to create the knowledge base as well as describe the automated classification process the team has developed using machine-learning techniques to repurpose existing metadata and enhance process integration throughout the metadata production workflow.

  • Colectica Portal vNext: Addressing new data discovery challenges
    Dan Smith (Colectica)


    A data portal designed to present managed research data has many tasks, including making the data discoverable, documenting the research data management process, stating data access policies, providing standardized metadata, supporting data linkages and longitudinal data, offering programmatic access to the data and metadata, and integrating the data with existing systems. Colectica Portal has always focused solely on providing standardized metadata and metadata discovery, while many other tasks were left to other systems. This sole focus on metadata created challenges in integrating rich DDI-Lifecycle information stored in the Portal with other applications that do not support the standard. This presentation will describe how the Colectica vNext project addresses these challenges in two distinct ways. One aspect of the vNext project is to present an integrated view of metadata and data. While the Colectica Portal historically presented DDI metadata in a metadata-centric fashion, the vNext project creates focus areas centered around surveys, research datasets, and study documentation. This gives users a familiar and user-friendly view laid on top of the more advanced metadata descriptions. The second aspect is a focus on data discovery. The Portal vNext project supports a new programmatic API for both metadata and data search, allowing easier integration with existing systems.


Pecha Kuchas (Thu, 2015-06-04)
Chair: Jennifer Doty

  • University Data Ownership and Management Policies
    Abigail Goben (University of Illinois-Chicago)
    Lisa Zilinski (Carnegie Mellon University)
    Kristin Briney (University of Wisconsin–Milwaukee)


    Data ownership and management policies can affect how research data are supported at a university. This Pecha Kucha presentation will highlight the preliminary results of our current research on university data ownership and management policies. In contrast to previous studies on institutional data management policies, we examined the university websites of 206 institutions with a Carnegie Classification of Institutions of Higher Education of either "High" or "Very High" research level as of July 2014. Some of the major questions we asked included: Does the institution have a data sharing or management policy? What does the policy cover? Who owns the policy (e.g. Office of Research, Information Technology, Libraries)? What happens to the ownership of the data if a researcher leaves the institution? Are universities with data management services provided by the library more likely to have a policy on data management? Ultimately, our goal is to determine if universities support data management comprehensively with complementary policies and services. The topics that will be covered include data stewardship, ownership, retention, and sharing with regard to university research data policies.

  • Call Me Maybe? It's Not Crazy! Data Collection Offices Are a Good Partner in Data Management
    Alicia Hofelich Mohr (University of Minnesota)
    Andrew Sell (University of Minnesota)


    For data management professionals, attention is largely focused on the beginning and end of the research process, as many researchers are worried about meeting federal requirements for data management plans (DMPs) and are looking for ways to share and archive their data. As a University office specializing in survey and experimental data collection, we have seen how the "middle" steps of data collection and analysis can be influenced by, and be an influence on, these upstream and downstream data management processes. In this Pecha Kucha, we will present relevant data management lessons we have learned from designing, developing, and hosting data collection tools. Challenges of anonymity and paying participants, quirks of statistical files produced by data collection tools, and transparency in the research process are among the issues we will discuss. As many of these challenges directly impact later sharing and curation of the data collected, we emphasize that data collection offices can be important partners in data management efforts.

  • Trends in Data Submissions at a Social Science Data Archive
    Amy Pienta (ICPSR, University of Michigan)


    In recent years, new data sharing policies in the US have encouraged scientists to share research data with others, many accomplishing this through archiving their data with a domain repository. Related to this trend, there is strong demand from social scientists for access to research data for a variety of secondary data analysis uses, including support of new grant applications, classroom research papers, and research projects that lead to conference presentations and publications. Given that many users search for potential secondary data through Google or through the search feature of a data repository, it is possible to create and mine a database for emerging patterns in search behavior that help us better understand the demand for data and how well a domain repository is able to meet that demand. We explore data from the 100 most frequently searched keywords/phrases at ICPSR in 2014. We match these popular terms to the depth of the ICPSR holdings related to these searches to determine areas where ICPSR may be lacking data. We also identify common search terms where the users exit the ICPSR web site after searching for data. We find, for example, "demoralization" was searched for 323 times in 2014 and 94% of users exited the ICPSR web site after results from the search were returned. Looking forward, ICPSR expects the number of scientists wanting access to research data collected by others to increase and this user search model may provide a greater understanding of data user needs.
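
    The per-term search counts and exit percentages mentioned in this abstract could be derived from a search log along the following lines. This is a minimal sketch under assumptions: the log layout (one `(term, exited)` pair per search), the function name `exit_rates`, and the sample rows are hypothetical and are not ICPSR data.

```python
from collections import Counter


def exit_rates(search_log):
    """Return {term: (number_of_searches, share_of_searches_followed_by_exit)}."""
    searches = Counter()
    exits = Counter()
    for term, exited in search_log:
        searches[term] += 1
        if exited:
            exits[term] += 1
    return {term: (count, exits[term] / count) for term, count in searches.items()}


# Made-up log rows for illustration only.
sample_log = [
    ("term one", True),
    ("term one", False),
    ("term two", True),
]
sample_rates = exit_rates(sample_log)
```

    Terms that combine a high search count with a high exit share are the candidates for gaps in the holdings, which is the comparison the abstract describes.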

  • The Census with Kittens: Not Just a Gimmick
    Amy West (University of Minnesota Libraries)


    Yes, illustrating a two-hour virtual class session on the history of the Census Bureau and its surveys with cat pictures was initially a gimmick to maintain student engagement. It turns out, though, that cats are particularly effective at illustrating the more complex aspects of how the Census Bureau has developed, the functions that decennial censuses serve and the controversies they engender. I'll demonstrate this unique qualification with comparisons to other charismatic fauna such as puppies, red pandas and otters.
