
Community of Data Professionals

News about IASSIST members.

Now Accepting Proposals for IASSIST 2013

IASSIST 2013 will be hosted by GESIS – Leibniz Institute for the Social Sciences at Maternushaus in Cologne, Germany from May 28-31.

The Conference Website can be accessed here:

As announced previously, the theme of this year’s conference is Data Innovation: Increasing Accessibility, Visibility and Sustainability.

This theme reflects recent efforts across the globe by the largest government agencies down to the smaller independent research units to make data (be it survey, administrative, geospatial, or scientific) more open, accessible and understandable for all.

With an ever-increasing availability of new technologies offering unparalleled opportunities to sustainably deliver, share, model and visualize data, we anticipate that there is much to share with and much to learn from one another.  Interdisciplinarity is a large part of where innovation comes from, and we hope to receive submissions from those in the social sciences, humanities, sciences, and computer science fields.

We welcome submissions on the theme outlined above, and encourage conference participants to propose papers and sessions that would be of interest to a diverse audience. In order to make session formation and scheduling more streamlined, we have created three distinct tracks.  If you are not sure where your submission fits, or feel that it fits into more than one track, that’s perfectly fine. Please do still make your submission, and if accepted, we will find an appropriate fit.

Online submission forms and guidelines for BOTH conference content and workshops can be found here:

NOTE: The top of the page is for sessions/papers/posters/round tables/pecha kuchas; the bottom is for workshops – please note that the submission forms are completely separate.

All submissions are due by December 5, 2012.  Notification of acceptance will be made by February 5, 2013.

Questions about session/paper submissions may be sent to
Questions about workshop submissions may be sent to the Workshop Coordinator, Lynda Kellam at

Data-related blog posts coming out of Open Repositories 2012 conference

I'd been meaning to write an IASSIST blog post about OR 2012, hosted by the University of Edinburgh's Host Organising Committee led by Co-Chair and IASSISTer Stuart Macdonald in July, because it had such good DATA content.

Fortunately Simon Hodson, the UK's JISC Managing Research Data Programme Manager, has provided this introduction and has allowed me to post it here, with further links to his analytic blog posts, and even those contain further links to OTHER blog posts talking about OR2012 and data!

There are also more relevant pointers from the OR 2012 home page here:

I think there's enough here to easily keep people going until next year's conference in Prince Edward Island next July. Oh, and Peter Burnhill, Past President IASSIST, made a good plug for IASSIST in his closing keynote, pointing it out to repository professionals as a source of expertise and community for would-be data professionals.

Enjoy! - Robin Rice, University of Edinburgh


It has been widely remarked that OR 2012 saw the arrival of research data in the repository world.  Using a Wordle of #or2012 tweets in his closing summary, Peter Burnhill noted that ‘Data is the big arrival. There is a sense in which data is now mainstream.’  (See Peter’s summary on the OR2012 YouTube channel:

I have written a series of blog posts reflecting on the contributions made by *some* of those working on research data repositories, and particularly on the development of research data services.

These posts may be of interest to subscribers to this list and are listed below.

Institutional Data Repositories and the Curation Hierarchy: reflections on the DCC-ICPSR workshop at OR2012 and the Royal Society’s Science as an Open Enterprise report

‘Data is now Mainstream’: Research Data Projects at OR2012 (Part 1…)

Pulling it all Together: Research Data Projects at OR2012 (Part 2…)

Making the most of institutional data assets: Research Data Projects at OR2012 (Part 3…)

Manage locally, discover (inter-)nationally: research data management lessons from Australia at OR2012

Simon Hodson [reposted with permission]

Knight Foundation Data Challenge (FYI)

The Knight Foundation announced a data-related grant opportunity in a kind of "tweet your grant proposal" format.   The call has closed but over 800 of the applications are viewable through Tumblr.   

A simple search of the applications for the words "data professionals" returned 148 results.  Of these, some of the more relevant are:

  1. My submission:  A passion for data, and the professionals who keep it alive
  2. A proposal for a dating service for data professionals :} DATABLE | Data based dating
  3. MetaLayer Turns Anyone into a Data Scientist

Have a look and see if there might be someone or some group out there you could collaborate with!

IASSIST Fellows application now closed

The application for IASSIST Fellows is now closed. Over 40 applications from 23 different countries have been received with the following number of applications by region:

  • 23 Latin America
  • 12 Africa
  • 6 Asia

The Fellows Committee is now working to evaluate the applications and will make the decisions in the following weeks. Good luck to all participants.

IASSIST 2012 Fellows Program

The IASSIST Fellows Program is now accepting applications for financial support to attend the IASSIST 2012 conference in Washington, from data professionals from countries with emerging economies who are developing and managing data infrastructures at their home institutions.

Please be aware that funding is not intended to cover the entire cost of attending the conference. The applicant’s home institution must provide some level of financial support to supplement the IASSIST Fellow award. Strong preference will be given to first time participants, and applicants from Latin-American countries. Only fully completed applications will be accepted. Applicants submitting a paper for the conference will be given priority consideration for funding.

 You may apply for funding via this form.

For more information, to apply for funding or nominate a person for a Fellowship, please send an email to the Fellows Committee chair, Luis Martínez-Uribe.


Council of Professional Associations on Federal Statistics (COPAFS) meeting notes

I was lucky enough to be able to sit in on the most recent COPAFS meeting in place of our regular liaison Judith Rowe.  While the topics were very different than the issues I usually deal with at work, I found the presentations really interesting. Here's an abridged version of my notes.



Ed Spar will be stepping down as Executive Director at the end of 2012.  The board will be launching a search and will be engaging a search firm.

Director's update:

The budgetary situation ranges from grim to worse and the outlook isn't any better. Every agency will wish they had last year's budget. Census numbers reflect a very bad year coming up. The meeting dates for next year are: March 16, June 1, Sept 14, December 7.

Update on National Center for Education Statistics (NCES)- Marilyn Seastrom
NCES is the statistical agency within Dept of Education.  They have a small staff but lots of contractors and may be lucky enough to be level funded next year.

Assessment: it was the busiest year in the history of national assessment.  They are ready to release the state mapping report, which compares assessment measures across states by mapping state assessments to the National Assessment of Educational Progress (NAEP). For example, there is only one state (MA) where a 4th grader who is deemed proficient on the state exam is also proficient at the national level.  There are many states where students are proficient at the state level but don't even make the "basic" cut for the national assessment. They are also ready to release the Reading and Mathematics report card.

Elementary and Secondary update: They've done an expansion of NCES Geo-mapping application which works with the ACS to provide data by school district boundaries.

Miscellaneous: there's a new OECD adult literacy study (PIAAC, the first international assessment done on laptops in the home). The National Household Education Survey (what goes on outside of school) is no longer a random digit dial sample because of deteriorating response rates; it is now an address-based sample (mail).
There's new stuff on the horizon: a middle school study, and a NAEP-TIMSS (Trends in International Mathematics and Science Study) link, which will be an ambitious study using 8th grade level achievement in math and science.

American Demographic History: Campbell Gibson (demographer retired from Census)
Website of demographic history :
Developed over a few years with David Kennedy and Herbert Kline (Stanford) - about 130 graphics through 2000 for both state and national charts which are freely available and can be downloaded.
Source: all decennial censuses, with some drawn from compendia or IPUMS files.
He showed a variety of slides - all of which are available on the website and most of which were fascinating.  Can you guess the changes in the set of the top five languages spoken in the home of non-US born residents?

Rural Statistical Areas: Mike Radcliffe, Geography Division, Census
The presentation described a three year joint research project with 23 states. The goal was to define Rural Statistical Areas (RSAs): geographic areas defined using counties, county subdivisions and census tracts as building blocks. The aim was to be able to tabulate ACS 1 year estimates for areas of 65K+ people. These areas would have a rural focus, unlike PUMAs, which use 100K and cover mostly urban areas.  They started with the most rural parts and built from there, so urban is really the residual.

RSA delineation process: counties with 65K+ would be standalone RSAs if they had a rural focus.  They used the urban influence codes (UIC, from USDA) to get at "ruralness" and grouped counties, with some boundary tweaks made by the State Data Center Steering Committee. He showed maps of UIC ratings and then discussed how to aggregate counties: they created an aggregation net using state boundaries, interstate highways and rivers to form a lattice for thinking about how to group counties.  They started with UIC category 12 and aggregated up by county until they hit the 65K+ measure. It's an imperfect measure: there were some problems with differences between adjacent counties, and they sometimes had to sacrifice resolution.
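The aggregation step described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the Census Bureau's procedure: the `County` class, the `build_rsas` function, the toy counties and their populations are all hypothetical, a higher UIC is assumed to mean "more rural", and the aggregation net (state boundaries, highways, rivers) is not modeled at all.

```python
from dataclasses import dataclass, field

THRESHOLD = 65_000  # population floor for ACS 1-year estimates, from the notes


@dataclass
class County:
    name: str
    uic: int                # urban influence code; assumed higher = more rural
    population: int
    neighbors: set = field(default_factory=set)  # names of adjacent counties


def build_rsas(counties, threshold=THRESHOLD):
    """Group counties, most rural first, until each group reaches the threshold."""
    by_name = {c.name: c for c in counties}
    unassigned = set(by_name)
    groups = []
    # Seed groups from the most rural counties (highest UIC) downward.
    for seed in sorted(counties, key=lambda c: (-c.uic, c.name)):
        if seed.name not in unassigned:
            continue
        members = [seed.name]
        unassigned.discard(seed.name)
        total = seed.population
        while total < threshold:
            # Unassigned counties adjacent to any county already in the group.
            frontier = {
                n for m in members for n in by_name[m].neighbors if n in unassigned
            }
            if not frontier:
                break  # nothing left to add; a real process would need review
            # Prefer the most rural adjacent county, ties broken by name.
            nxt = min(frontier, key=lambda n: (-by_name[n].uic, n))
            members.append(nxt)
            unassigned.discard(nxt)
            total += by_name[nxt].population
        groups.append((members, total))
    return groups


# Hypothetical toy data: four counties in a line, A-B-C-D.
toy = [
    County("A", 12, 20_000, {"B"}),
    County("B", 11, 30_000, {"A", "C"}),
    County("C", 9, 25_000, {"B", "D"}),
    County("D", 2, 90_000, {"C"}),
]
rsas = build_rsas(toy)
for members, total in rsas:
    print(members, total)
```

In this toy case the three rural counties are merged into one group of 75,000 people and the large county stands alone. The sketch makes no judgment about whether a standalone county has a "rural focus", which is exactly the kind of decision the notes say was left to review by the states.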

The resulting definitions for RSAs by state were sent to the state and they were able to move things around a bit to help smooth out some of the initial classification imperfections. Some states suggested alternative definitions; for example, Vermont wanted to use their planning regions.

Questions on the table:

  • Should RSAs be contiguous? Census has a preference for yes but states disagree - e.g. Alabama might have similar demographics between north and south counties that would match better for an RSA than using geography.
  • Can a variety of building blocks be used to form RSAs?  Initial proposal was counties but they may not be the best units to start with.  States found that in some cases sub-county divisions or census tracts worked better.
  • Why not cross state lines?  It makes sense for some questions, but State Data Centers need to address rural areas within their states.
  • Should counties of 65K+ be split into multiple areas?

Next steps:
State data centers have asked Census to define these as statistical areas but Census has said that in some cases (like Los Angeles) you just can't call them rural.  What do you call them? The project needs to get wider review including public comment through a Federal Register notice.

Research on measuring same sex couples - Nancy Bates - Census
Motivation: definition of marriage has changed; new terms and different state recognition and no federal recognition of same sex couples. According to 2008 ACS, there are about 150,000 self described same-sex married couples but only around 32,000 same-sex legally married couples.

Possible causes:

  • Classification error:  maybe people think of themselves as married even if they aren't.
  • First response: on ACS the husband/wife category is first in list but unmarried partner is 13th
  • Errors elsewhere: false positives due to incorrect gender response

Research: some based on focus groups - 18 groups in 8 different areas with different legal recognition of same sex marriage.  Mostly gay couples but some unmarried straight couples.  Most people interpreted the question on the federal form as indicating "legal status".  Some thought it meant "legally married anywhere".  Many groups noted they were missing categories for civil unions or domestic partnerships. And there is the "functional equivalence" problem: couples had the equivalent of a marriage but nowhere to put themselves.

Research: some based on cognitive interviews - 40 interviews with both gay and straight participants across different legal jurisdictions. Participants filled out forms, were debriefed afterwards, and were then shown an alternative form and asked for their preference.
Results: most survey results aligned with "true" legal status.  Specifically calling out same sex or opposite sex in the marital status question was preferred but was also flagged as potentially sensitive. Would this delineation increase unit non-response? Also, there was some confusion about the definition of civil union/domestic partnership.  Most people found it useful to have a cohabitation question.
Next steps: interagency group review, then piggybacking on an ACS test for a larger trial. That test is mail only, so they still need to test in other modes, and they would love to have a re-interview component.

Research on measuring same sex couples - Martin O'Donnell - showing some data
Showed a comparison of ACS data and census stuff - but comparability may not be perfect.
Changes in ACS forms and editing caused a drop of self reported same sex spouses from 350K+ to 150K+.

The 2010 Census results showed a much higher level of same sex households than the 2010 ACS.  There was a huge difference between mail forms and non-mail forms.  Approximately 3 times as many households reported themselves as same sex households in mail forms as non-mail forms for ACS where the non-mail were nonresponse follow up (NRFU). On the pre-2008 ACS and 2010 Census NRFU form, the matrix format for the form didn't yield consistent results.  The ACS 2008+ and 2010 Census forms had a person based column format which had much more consistent responses.  This is truly non-sampling error for small populations: you only need 4 errors per 1000 opposite sex households to generate the 250K+ error in the same sex spouses because there are 60 million of them.
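The arithmetic behind that last point can be checked in a couple of lines. This is just a worked illustration using the rounded figures quoted in the notes (60 million opposite sex households, 4 errors per 1,000); the function name is made up for the example.

```python
def misclassified_households(households, errors_per_thousand):
    """Households wrongly counted as same sex at a given sex-item error rate."""
    return households * errors_per_thousand // 1000


opposite_sex_households = 60_000_000  # rounded figure from the notes
false_same_sex = misclassified_households(opposite_sex_households, 4)
print(false_same_sex)  # 240000
```

With these rounded inputs the result is 240,000, in the neighborhood of the 250K+ error mentioned above. A tiny error rate in a very large group can swamp the count for a much smaller one.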

Problem: the bad matrix form was approved and printed before these results were available. Now short form data wave 1 is published, including one table about same sex couples, but they can't stop the processing of the entire 2010 Census to allow for the correction of one table. Now how do they fix it?

They tested the quality of the reporting on sex.  They used a name index giving the probability that a person has a name associated with a male (John or Thomas has a very high index, Virginia or Elizabeth a very low one), with state controls for cultural differences (Jean may be more likely to be a male in French areas).  Those with index values of 0-50 were likely to be female and those with 950-1000 were likely to be male.  Couples with a female partner with a name at the highest index value or a male partner with a name at the lowest index value were then considered to have incorrectly marked the sex item on the question and they were dropped from the same sex couples category. Ex: 9000 male-male couples in Texas out of 31,000 have names that indicate they are probably male-female couples - nearly one third of the same sex marriage stats in American Factfinder may be incorrect.
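A rough sketch of that name-index edit, for illustration only. The 0-50 and 950-1000 bands come from the notes; everything else (the function names, the tuple layout, the sample index values, and the use of the full bands rather than only the very highest and lowest values) is an assumption, and the state-level controls are not modeled.

```python
FEMALE_MAX = 50   # index 0-50: name very likely belongs to a female
MALE_MIN = 950    # index 950-1000: name very likely belongs to a male


def sex_implied_by_name(name_index):
    """Return "F" or "M" when the name is strongly gendered, else None."""
    if name_index <= FEMALE_MAX:
        return "F"
    if name_index >= MALE_MIN:
        return "M"
    return None


def is_probable_sex_error(reported_sex, name_index):
    """True when the reported sex contradicts a strongly gendered name."""
    implied = sex_implied_by_name(name_index)
    return implied is not None and implied != reported_sex


def keep_as_same_sex_couple(partner_a, partner_b):
    """Each partner is a (reported_sex, name_index) pair (hypothetical layout)."""
    if partner_a[0] != partner_b[0]:
        return False  # not reported as a same sex couple in the first place
    return not (is_probable_sex_error(*partner_a) or is_probable_sex_error(*partner_b))


# Hypothetical index values: 990 for a "John", 20 for a "Virginia".
print(keep_as_same_sex_couple(("M", 990), ("M", 20)))   # dropped: likely miscoded
print(keep_as_same_sex_couple(("M", 990), ("M", 975)))  # kept
```

Names in the ambiguous middle of the index are left alone here, so only strongly gendered names trigger the edit. For the Texas example in the notes, 9,000 out of 31,000 is about 29 percent, which is the "nearly one third" quoted above.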

Geographic distribution with inconsistent name reporting: swath from Florida north west to ND - matches high rate of NRFU forms.
Summary: They reissued the numbers, which matched the 2010 ACS better once the name mismatched folks were thrown out. The spousal household estimate is the most improved. The American Factfinder page shows people where to go to get the preferred estimate. Census PUMS is based on edited data.  They aren't recalculating the entire Census data but they are publishing the edited data and there will be a flag on data that are affected.

Stephen S. Clark Library for Maps, Government Information, and Data Services is open for business!

Three cheers for Jen Green!!! 

When not keeping IASSIST finances in check as the IASSIST Treasurer, Jennifer Green, director of the new Stephen S. Clark Library for Maps, Government Information, and Data Services, at the University of Michigan has been busy getting the library in shape for the recent opening day! 

Check out the announcement of the grand opening festivities in the Record Update (a publication of the Office of the Vice President for Communications at the University of Michigan) and don't miss the brand new website of the Stephen S. Clark Library.

Green says the new library’s unique combination of collections, government information expertise, and data services will provide scholars and researchers with unprecedented opportunities for exploration, discovery, and collaboration.

“Before the Clark, there was a large degree of interaction among these three units,” Green says. “Our new proximity, in a purposefully designed and equipped space, means that we can more effectively collaborate with each other, which in turn really enhances our ability to creatively collaborate with students, faculty, and researchers.”

From the Record Update

IASSIST Latin Engagement Action Group

The Latin Engagement Action Group have come up with a number of outreach activities aimed at supporting data professionals from Spanish and Portuguese speaking educational institutions, namely:

1. Research Data Management Webinars (complete with IASSIST contribution) for Spanish/Portuguese data specialists

Stuart Macdonald and Luis Martínez-Uribe in collaboration with Alicia López Medina (UNED, Spain), the Spanish Agency of Science and Technology (FECYT) and the network of Spanish repositories RECOLECTA have organised a programme of webinars in 3 strands starting in October to discuss RDM issues:

Strand 1 - Research Data Management Strategy (presentations from FECYT, RedIris, and Simon Hodson, JISC Managing Research Data (MRD) Programme Manager)

Strand 2 - RDM Tools and models (presentations from Sarah Jones (DCC) on DAF/DMP online and Stuart Macdonald (EDINA) on IASSIST Latin Engagement, RDM at Edinburgh & Research Data MANTRA)

Strand 3 - Research Data Management Experiences (presentations from Kate McNeil-Harmen (MIT), Luis Martinez Uribe (Institute Juan March), and colleagues from University of Porto)

Several members of IASSIST have been invited and the work of the group will be presented in order to keep promoting the organization to colleagues in Spain, Portugal and Latin-America.

2. Preparation of a Latin-American session in next IASSIST annual conference in collaboration with outreach committee

Organise another Latin-American session at IASSIST 2012 (complete with NGO representation) led by Bobray Bordelan (Princeton). Liaise with the outreach committee to fund and invite data professional colleagues from Latin America to participate in this session.

3. Spanish and Portuguese translation of the main pages of the IASSIST site - May 2012

Working with the IASSIST web editor Robin Rice to scope and implement (voluntary) translation of the main landing pages on the IASSIST website (e.g. Home page, About page, Becoming a member of IASSIST, FAQ, IASSIST at a Glance, About IQ, Instruction for Authors).

Image: Toledo by Pat Barker on Flickr, CC-BY-NC licence

Data can be cool

As I prepare to leave Guelph there are lots of things I will miss - but what I will maybe miss most is the Data Resource Centre and the creative people who work there.   If you link to the Picasa album below you will see some awesome posters they have made to showcase services and bring people into the world of Data and GIS. The images on some of the posters are really powerful....

posters on picasa
