IASSIST Conference 2024


Metadata Augmentation for Social Science Datasets Using Generative AI

Curating metadata with controlled terminology is a critical yet time-consuming task in social science data management. Data depositors often provide insufficient metadata, so data repository staff must enhance it extensively. Traditionally, this means searching a wide array of controlled terms, which demands substantial time and expertise and sometimes requires creating new terms.

To address these challenges, we introduce a model built on generative AI technology (ChatGPT). The tool is designed to significantly reduce the time data repository staff spend on metadata curation while improving the accuracy of term matching. It does so by rapidly analyzing text and extracting pertinent keywords from established thesauri, including ICPSR, ELSST, and Library of Congress, supplemented by ChatGPT's own recommendations. This approach not only speeds up the curation process but also improves the precision and recall of the results.
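
The term-matching step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the matching logic, so fuzzy string matching from Python's standard library stands in for it here. The vocabulary excerpt, the `match_terms` helper, and the similarity cutoff are all hypothetical, and the candidate keywords are assumed to have been suggested already by a generative model.

```python
from difflib import get_close_matches

# Hypothetical excerpt of controlled vocabularies; real thesauri are far larger.
THESAURUS = {
    "ICPSR": ["voting behavior", "public opinion", "income inequality"],
    "ELSST": ["VOTING BEHAVIOUR", "PUBLIC OPINION", "UNEMPLOYMENT"],
}


def match_terms(candidates, thesaurus=THESAURUS, cutoff=0.8):
    """Map each candidate keyword to the closest controlled term per vocabulary.

    `candidates` stands in for keywords proposed by a generative model
    (an assumption; no model is called here).
    """
    results = {}
    for candidate in candidates:
        key = candidate.lower().strip()
        hits = []
        for source, terms in thesaurus.items():
            # Compare case-insensitively but report the term as the thesaurus spells it.
            lowered = {term.lower(): term for term in terms}
            for match in get_close_matches(key, list(lowered), n=1, cutoff=cutoff):
                hits.append((source, lowered[match]))
        results[candidate] = hits
    return results


suggestions = match_terms(["Voting behavior", "climate anxiety"])
for keyword, hits in suggestions.items():
    # An empty list marks a keyword with no controlled match,
    # i.e. a possible candidate for a new term.
    print(keyword, "->", hits)
```

In this sketch, a keyword that matches nothing is returned with an empty list, which corresponds to the case where curators may need to create a new term.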

Caden Picard
University of Michigan - Flint
United States

Jared Lyle
ICPSR
United States

Jay Winkler
ICPSR
United States

Murali Mani
University of Michigan - Flint
United States

 


