iBlog

"Data files should contain data."

For those tech-types who do their own data munging, here's a rant from Mark Dominus, a Perl programming wizard who was briefly stymied by trying to process a large data file from Census. As we face these issues daily in my office, I thought I'd share the frustration!

Of course, he doesn't mention where he thinks metadata "should" go but I have a pretty good idea what he would suggest.... ;-)

Comments

On a related theme, I was just perusing the tables released with this year's budget. In a table labeled "Grants to states for Medicaid", the last few rows are:

Wyoming American Samoa Guam Northern Mariana Islands Puerto Rico Freely Associated States Virgin Islands Indian Tribes Undistributed Survey & Certification Fraud Control Units Vaccines for Children Vaccines for Children Collection Medicare Part B Transfer Incurred but Not Reported Adjustments

These tables are clearly meant for humans given the formatting, but still...

I didn't miss that point; I omitted it. Anyway, they couldn't be numbers, because then there wouldn't have been any way to get the commas in! :-)

I quite enjoyed Mark Dominus' rant and have berated CSV on occasion myself. One important point that Mark missed about the Census file he was using is that the counts were not numbers but rather strings. While there was no violation of variable types in his examples, we could have an entirely separate rant about using strings instead of numbers in data files!
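For anyone stuck with a file like this, here is a minimal sketch in Python of turning comma-formatted count strings back into numbers. The column names (`area`, `count`), the sample rows, and the helper names `parse_count` and `read_counts` are all made up for illustration; they are not taken from the Census file, and a real file would likely need more careful handling of footnote flags and missing-value codes.

```python
import csv
import io

# Hypothetical sample: counts stored as quoted strings with thousands
# separators. The values are invented for illustration only.
SAMPLE = 'area,count\nArea A,"1,234"\nArea B,"56,789"\nArea C,""\n'


def parse_count(text):
    """Convert a count such as '1,234' to an int; return None if blank."""
    cleaned = text.strip().replace(",", "")
    return int(cleaned) if cleaned else None


def read_counts(handle):
    """Return {area: count} from a CSV with 'area' and 'count' columns."""
    reader = csv.DictReader(handle)
    return {row["area"]: parse_count(row["count"]) for row in reader}


counts = read_counts(io.StringIO(SAMPLE))
print(counts)
```

Anything that is not a plain integer after the commas are stripped raises a `ValueError`, which is probably what you want: the oddity gets noticed rather than silently loaded.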
