Boosting Data Findability: The Role of AI-Enhanced Keywords
In today’s data-driven world, data has become more valuable than ever, yet finding the relevant data within a vast collection can be challenging. Researchers have been working on methods that offer relevant data to users easily and swiftly. With their focus on data and its reusability, the FAIR principles emphasize the importance of data findability. Making relevant data easy to find simplifies the work of data users and, at the same time, increases the reuse of the data, which benefits data producers. For data archives, providing relevant data to data consumers is therefore important. Data archives use keywords to describe a study. These keywords are mostly chosen from available controlled vocabularies (CVs) in the form of thesauri. Data producers who cannot find suitable keywords for their study sometimes use their own keywords, called user-defined keywords. User-defined keywords solve the data producers’ problem, but they make it harder for the data archive to keep data findable for future data consumers. One such example is GESIS Search, a web portal that provides a platform for finding surveys and social science research data. Users can query the research data by metadata fields such as Topic, Author and many more. In this paper, we focus on the Topic field to make the research data more findable. The user-defined keywords in the Topic field act as a hindrance to data findability, so assigning CV terms to user-defined keywords is crucial to solving this challenge. Assigning them manually consumes considerable resources and time. We aim to employ Artificial Intelligence (AI) techniques to automate the assignment of broader terms from CVs to user-defined keywords, improving the findability of the studies in GESIS Search along the way.
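To make the task concrete, the following is a minimal sketch of what assigning a controlled term to a user-defined keyword could look like. It is not the method of this paper: the abstract does not specify the AI technique, so plain string similarity from Python's standard library stands in for it. The vocabulary, the function names and the 0.6 threshold are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy vocabulary (an assumption, not an actual GESIS thesaurus):
# each controlled broader term maps to a few example narrower phrases.
CONTROLLED_VOCABULARY = {
    "employment": ["unemployment", "job market", "labour market"],
    "education": ["school", "university", "vocational training"],
    "health": ["illness", "health care", "mental health"],
}


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_broader_term(keyword, vocabulary, threshold=0.6):
    """Return the controlled term whose label or narrower phrases best
    match the user-defined keyword, or None if nothing reaches the
    (assumed) similarity threshold."""
    best_term, best_score = None, 0.0
    for term, narrower in vocabulary.items():
        for candidate in [term, *narrower]:
            score = similarity(keyword, candidate)
            if score > best_score:
                best_term, best_score = term, score
    return best_term if best_score >= threshold else None


if __name__ == "__main__":
    for kw in ["Job markets", "university students", "xyz"]:
        print(kw, "->", suggest_broader_term(kw, CONTROLLED_VOCABULARY))
```

Surface similarity of this kind cannot relate keywords that share meaning but not spelling, which is one reason to consider AI techniques for the assignment instead.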