Cataloging query: how does cataloging data differ from traditional library material?

In their never-ending quest for information, Jen, Tiffani and Paula would like to know about people's experiences cataloging data. Do you use MARC format? Have you tried using DDI? At what level do you tend to catalog collections? How does it depart from cataloging traditional library material? Enquiring minds want to know.


- Contributed by Jen Darragh



At DPLS we use a home-grown cataloging system to describe our 3,000-plus datasets. Please check out our online catalog. This home-grown system contains descriptions of the study, documentation and files for each of our datasets. It has worked for us for the last 30-some years. More recently, we have used the DDI standard to mark up a small number of our collections in a local consortium called Better Access to Data for Global Interdisciplinary Research (BADGIR). We are also looking into Dublin Core, as used in DSpace, for our institutional digital repository, Minds@UW.


Historically, AACR2 incorporated the machine-readable data file as a general material designation (GMD) in 1978. At our University, we were cataloguing our data collection in MARC format as early as 1983. Here is the unformatted display of the MARC record for the first study in our Data Library collection:

World handbook of political and social indicators, 1961-1963 [Machine-readable data file]

    245: 00 : World handbook of political and social indicators, 1961-1963|h[Machine-readable data file] /|cBruce Russett ... [et al.].
    250: : Rev. ed.
    260: 0 : Ann Arbor, Mich. :bICPSR,|c[1968?].
    300: : 1 data file (564 logical records).
    440: 0 : U of A study ;|vno. 1
    830: 0 : ICPSR (Series)|v5022.
    650: 0 : Social indicators
    650: 0 : Political indicators|xStatistics.
    650: 0 : Social history|y20th century|xStatistics.
    538: : Format : Raw.
    500: : Accompanied by codebook [cdbk] (43 p. ; 29 cm.).
    500: : Data for 1961-1963.
    500: : Authors: Bruce Russett, Karl Deutsch, Harold Lasswell, Hayward Alker.
    500: : 4 cards per case.
    500: : 70 variables.
    516: : Statistical.
    565: : 141 cases.
    490: 0 : ICPSR ;|v5022
    700: 10 : Russett, Bruce M.
    700: 10 : Deutsch, Karl Wolfgang,|d1912-
    700: 10 : Lasswell, Harold Dwight,|d1902-
    700: 10 : Alker, Hayward R.
    710: 20 : Inter-university Consortium for Political and Social Research.

The GMD changed from machine readable data file to electronic resource around the beginning of the 1990s at our institution (I always felt that this was less descriptive than MRDF), and later computer file became mixed with electronic resource. We did our own cataloguing in the Data Library because the concept of study-level description was too abstract for the liking of most of our cataloguers, who at the time preferred working with concrete, physical objects. The study-level consisted of two basic elements: the set of files belonging to a title and the corresponding physical codebook or data documentation in print.
Traditional cataloguers would look at the codebook and say, "We can catalogue this because we can see it, measure it and describe it." Our Data Library cataloguer would say, "And I can see, measure and describe the digital files with which that codebook co-exists." The computer files and the print documentation together made up the study-level. A title would wrap them together, although titles often were not descriptive and one could encounter multiple titles for the same study.

In the mid-1990s, the Library in Statistics Canada began producing detailed catalogue records for their microdata products. We began to copy their MARC records for the titles that we received from them through DLI. Here is an example of the detail that they provide in MARC format:

Canadian community health survey, 2000-2001: cycle 1.1 (2000-2001) [electronic resource] : [public-use microdata file]

    245: 00 : Canadian community health survey, 2000-2001: cycle 1.1 (2000-2001)|h[electronic resource] :b[public-use microdata file].
    256: : Computer data.
    260: : [Ottawa] :bSpecial Surveys Division, Statistics Canada,|c2003-
    246: 1 : |iAlso known as:|aCCHS
    246: 1 : |iTitle on user's guide:|aCanadian Community Health Survey 2000-2001 :buser's guide for the public use microdata file
    650: 0 : Public health|zCanada|xStatistics.
    650: 0 : Health surveys|zCanada.
    650: 0 : Health status indicators|zCanada.
    650: 0 : Medical care|xUtilization|zCanada|xStatistics.
    362: 0 : 2000/2001-
    310: : Biennial.
    538: : System requirements: Software such as Beyond 20/20, SAS and SPSS. Adobe Acrobat reader is required to view and print files in PDF format.
    515: : Data Dictionary revised August 2003.
    530: : Available also on CD-ROM.
    506: : Available through the University of Alberta Data Library. Restricted to non-commercial use by U of A clientele only.
    516: : Statistical. Files in ASCII and binary format.
    520: : The Canadian Community Health Survey (CCHS) collects information on the health of the Canadian population, aged 12 and older and related socio-economic data. The survey is designed to provide reliable cross-sectional estimates at provincial and territorial and health region levels. This public use microdata file (PUMF) contains data collected during Cycle 1.1 of the CCHS, 2000/01. For Cycle 1.1, approximately 130,000 respondents provided detailed health and socio-economic information. The survey collected information covering the broad topic areas of health status, determinants of health and health system utilization.
    522: : Canada, provinces, and the three territories the Yukon, Northwest Territories and Nunavut, excluding persons living on Indian Reserves or Crown lands, clientele of institutions, full-time members of the Canadian Armed Forces and residents of certain remote regions.
    565: : |dUniverse: The CCHS targets persons aged 12 years or older who are living in private dwellings in the ten provinces and the three territories. Persons living on Indian Reserves or Crown lands, clientele of institutions, full-time members of the Canadian Armed Forces and residents of certain remote regions are excluded from this survey. The CCHS covered approximately 98% of the Canadian population aged 12 or older.
    567: : The Canadian Community Health Survey (CCHS) is a cross-sectional survey that collects information related to health status, health care utilization and health determinants for the Canadian population. The CCHS operates on a two-year collection cycle. The first year of the survey cycle ".1" is a large sample, general population health survey, designed to provide reliable estimates at the health region level. The second year of the survey cycle ".2" is a smaller survey designed to provide provincial level results on specific focused health topics. This Microdata File contains data collected in the first year of collection for the CCHS (Cycle 1.1). Information was collected between September 2000 and November 2001, for 136 health regions, covering all provinces and territories. The CCHS (Cycle 1.1) questionnaire was administered using computer-assisted interviewing (CAI). Sample units selected from the area frame were interviewed using the Computer-Assisted Personal Interviewing (CAPI) method while units selected from Random Digit Dialling (RDD) and telephone list frames were interviewed using the Computer-Assisted Telephone Interviewing (CATI) method. Selection of individual respondents was designed to ensure over-representation of youths (12 to 19) and seniors (65 or older). The selection strategy was designed to consider user needs, cost, design efficiency, response burden and operational constraints. Among the households from the area frame, one person aged 12 or older was randomly selected from among 82% of the sampled households and two persons (12 or older) were randomly chosen in the remaining 18%. The sample size targeted was 133,300. In total and after removing the out-of-scope units, 136,937 households were selected to participate in the CCHS (Cycle 1.1). Out of these selected households a response was obtained for 125,159 which results in an overall household-level response rate of 91.4%. Among these responding households, 142,421 individuals were selected to participate in the CCHS (Cycle 1.1) out of which a response was obtained for 130,827 which results in an overall person-level response rate of 91.9%. At the Canada level, this would yield a combined response rate of 84.7% for the CCHS (Cycle 1.1).
    710: 2 : Statistics Canada.|bSpecial Surveys Division.

-------------------------------------------------------------------------------------------

Elements of Parts 2 & 3 (the study and file descriptions, respectively) of DDI versions 1 & 2 capture the essence of the study-level that we catalogued at our institution. Part 4 (the variable description), however, introduced a whole new, and much desired, level of metadata. The paradigm for the first two versions of DDI was an elaboration of the paper codebook consisting of study, file and variable descriptions. These are very reasonable objects to catalogue, and versions 1 & 2 provided the means to do this. However, as we learn more about describing processes as well as objects, our metadata will become increasingly rich and, consequently, immensely more powerful. This is the paradigm for DDI version 3: a shift to the life cycle of data.
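
For anyone who wants to pull apart display lines like the ones in the two MARC records in this comment, here is a minimal Python sketch. It is only an illustration, not part of any cataloguing system mentioned here. It assumes each field arrives on its own line in the "tag: indicators : content" layout of the display, that subfields are introduced by "|" followed by a one-character code, and that any text before the first "|" belongs to subfield a (the display appears to omit the leading |a). The function name parse_display_line is hypothetical.

```python
import re

# Assumed layout of one display line: "TAG: INDICATORS : CONTENT",
# where INDICATORS is zero to two digits, e.g. "245: 00 : ..." or "250: : ...".
FIELD_RE = re.compile(r"^(\d{3}):\s*(\d{0,2})\s*:\s*(.*)$")

# Assumed subfield delimiter: "|" followed by a single lowercase letter or digit.
SUBFIELD_RE = re.compile(r"\|([a-z0-9])")


def parse_display_line(line):
    """Split one display line into tag, indicators and (code, text) subfields.

    Returns None when the line does not match the assumed layout.
    """
    match = FIELD_RE.match(line.strip())
    if match is None:
        return None
    tag, indicators, content = match.groups()

    # re.split with a capturing group yields
    # [leading_text, code1, text1, code2, text2, ...].
    parts = SUBFIELD_RE.split(content)
    subfields = []
    leading = parts[0].strip()
    if leading:
        # Assumption: undelimited leading text is subfield "a".
        subfields.append(("a", leading))
    for code, text in zip(parts[1::2], parts[2::2]):
        subfields.append((code, text.strip()))

    return {"tag": tag, "indicators": indicators, "subfields": subfields}


# Example with a subject heading from the first record:
field = parse_display_line("650: 0 : Social history|y20th century|xStatistics.")
print(field["tag"], field["subfields"])
# 650 [('a', 'Social history'), ('y', '20th century'), ('x', 'Statistics.')]
```

A sketch like this only handles the on-screen display. Anything that looks like ":b" in the records is left as plain text, because it does not match the assumed "|" delimiter.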

