IASSIST Conference 2023


Powerful DDI-CDI Metadata – the How and the Why?

Here at the UK Data Archive we have a vision of using DDI-CDI (DDI metadata for Cross-Domain Interoperability) to power our future data dissemination tools. However, populating these rich metadata structures at scale is not a trivial task, so we are exploring the use of current state-of-the-art machine learning models. With the resulting tools we aim to improve access to our data while protecting people’s privacy. For sensitive or disclosive datasets, particularly in Social Science, we aim to break open the current binary utility/risk trade-off between Secure Access and Open Access, and to reduce the delay in giving researchers access to data. Detailed metadata will not only allow researchers to select, filter and link data but will also provide the input needed to drive SDCMicro functions (well-used command-line tools we have rewritten as web services). These functions drive machine-assisted disclosure assessment, which in turn feeds a decision tree of real-time access outcomes, giving researchers richer and more flexible choices based on real-time mitigations of key variable sensitivity (e.g. global recoding, top/bottom recoding). As well as giving a live demo of the tool, we outline our methodology for building and improving machine learning models capable of scaling up rich metadata creation. We are all aware of the potential of machine learning to reduce our manual workload, but what are the practical considerations for applying these tools to create Social Science metadata, how successful have we been, and what more can we do to improve the resulting accuracy?
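To make the mitigation step concrete, the following is a minimal Python sketch of how top/bottom recoding and global recoding of a key variable might change an access outcome. It is an illustration only, not the tool described in the abstract: it does not call SDCMicro, the function names, sample data and the threshold `k=3` are all hypothetical, and a simple k-anonymity check stands in for whatever disclosure assessment and decision tree are actually used.

```python
from collections import Counter


def top_bottom_code(values, lower, upper):
    """Top/bottom recoding: clamp every value into the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def global_recode(values, width):
    """Global recoding: replace each value with the lower bound of its band."""
    return [(v // width) * width for v in values]


def min_group_size(records, key_vars):
    """Size of the smallest group of records sharing the same key values."""
    counts = Counter(tuple(r[k] for k in key_vars) for r in records)
    return min(counts.values())


def access_outcome(records, key_vars, k=3):
    """Hypothetical one-rule decision: 'open' if k-anonymous, else 'secure'."""
    return "open" if min_group_size(records, key_vars) >= k else "secure"


# Invented sample data: age and region are treated as the key variables.
ages = [19, 23, 24, 37, 38, 31, 95]
regions = ["N", "N", "N", "S", "S", "S", "S"]
key_vars = ["age", "region"]

raw = [{"age": a, "region": r} for a, r in zip(ages, regions)]

# Mitigate: clamp ages into 20..39, then group them into 10-year bands.
recoded_ages = global_recode(top_bottom_code(ages, 20, 39), 10)
mitigated = [{"age": a, "region": r} for a, r in zip(recoded_ages, regions)]

print(access_outcome(raw, key_vars))        # every raw record is unique
print(access_outcome(mitigated, key_vars))  # smallest group now has 3 records
```

In this toy example the raw records are all unique on the key variables, while the recoded records fall into groups of at least three, so the same rule yields a less restrictive outcome once the mitigation is applied.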

Deirdre Lungley
UK Data Service
United Kingdom

Thomas Gilders
UK Data Service
United Kingdom

 

