Back to the rough ground! Retrieving concepts in survey research and its potential uses
Survey design and questionnaire fielding devote significant effort to asking the right questions to elicit high-quality data from respondents. Yet for a researcher coming to the data from archives, much of this information is lost or locked up in PDFs that are burdensome to use, which is a barrier to the ambitions of FAIR.
The technical capability to serve up such metadata is well provided for by standards such as the DDI suite. Populating such schemas at scale will, however, need a step change in the way metadata is utilised in the data lifecycle. The absence of high-quality question banks and the paucity of ‘this is how you do it’ projects are demotivating factors for adoption.
The ESRC Future Data Services pilot project between CLOSER, University of Surrey, UK Data Service and Scotcen is tackling these issues. It uses the CLOSER metadata repository as a training (meta)data set to develop novel machine learning approaches in three areas: the extraction of metadata from survey questionnaires, conceptual extraction and alignment of questions, and the use of concepts to drive machine-actionable disclosure assessment.
The presentation will report on progress in these three areas.