Workshops available
The workshops run on Tuesday the 3rd of June and are supplementary to the conference.
Each workshop costs £50 to attend. The workshops will run in parallel, so you can register for up to one in the morning and one in the afternoon.
The workshops will be held in person at UWE Bristol's Frenchay campus rather than at the main conference venue. Each is a three-hour session. To keep costs as low as possible they do not include catering, but there are plenty of catering outlets on campus.
The workshops have limited availability. Although you can register for them at a later date, if there is a workshop you are particularly keen to attend you are advised to register for it at the time of booking to avoid disappointment.
The morning workshops are:
- Bridging Design and Accessibility: Creating FAIR and Inclusive Visualizations
- Teaching Qualitative Data Analysis using Open Data, Standards, and Tools
- AI-enabled data practices for metadata discovery and access: Best practices for developing training data
- Learn how to create synthetic data
The afternoon workshops are:
- Embedding sensitive data management best practice in institutional workflows
- The Art of Transcription: Using open-source tools to optimize transcription processes
- Fundamentals of MAST & IDEAL Metadata
- Working with Census Data in R: Open Source Tools for Social Science Research
Morning Workshops
1: Bridging Design and Accessibility: Creating FAIR and Inclusive Visualizations
Sarah Siddiqui (University of Rochester); Heather Charlotte Owen (University of Rochester)
This interactive workshop is aimed at making visualizations more appealing, accessible, and FAIR. Participants will explore strategies to determine the accessibility of a graph, learn tips for creating accessible graphs, and create their own visualizations with various tools. Beginning with design theory and then diving into individual elements through hands-on exercises, participants will experience the full data visualization cycle and practice creating accessible graphs using software such as Excel, Tableau, RStudio, and Jupyter Notebooks. While all of these tools are open-source and/or freely available (except Excel, which is fairly ubiquitous), they differ significantly in how they render graphs, requiring distinct approaches to accessibility. In addition, participants will learn how to create accessible tables and documentation to ensure a fully FAIR package that can be understood and built upon by the public.
As data sharing becomes increasingly prevalent, it is imperative that data professionals become knowledgeable on how visualizations can be shared in a way that is FAIR, but also accessible to individuals with disabilities. Raw data and code receive the most attention when it comes to ensuring research is FAIR, but visualizations should also receive the same treatment. While data and code increase the reproducibility of research, it is the visualizations that make it understandable to the general public. This workshop will help bridge this gap and encourage data professionals to expand the range of research outputs they assist researchers in producing.
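To give a flavour of the hands-on element, here is a minimal R sketch of the kinds of adjustments covered; the dataset, palette, and alt text are illustrative choices, not the workshop's actual materials.

    # A minimal sketch of common accessibility adjustments in R, using
    # ggplot2's built-in mpg data. Palette, text size, and alt text are
    # illustrative choices.
    library(ggplot2)

    ggplot(mpg, aes(class, fill = drv)) +
      geom_bar(position = "dodge") +
      scale_fill_viridis_d(option = "cividis") +  # colour-blind-safe palette
      theme_minimal(base_size = 14) +             # larger, legible text
      labs(
        title = "Car count by vehicle class and drive type",
        x = "Vehicle class", y = "Count", fill = "Drive",
        alt = "Grouped bar chart of car counts by vehicle class, split by drive type."
      )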
Workshop Learning Objectives:
- Participants will be able to evaluate if a graph is accessible.
- Participants will be able to design accessible visualizations in various tools such as Excel, Tableau, RStudio, and Jupyter Notebooks.
- Participants will be able to write documentation and create tables to ensure their visualizations and associated materials are accessible and FAIR.
Workshop Prerequisites:
The workshop is open to everyone regardless of programming or visualization experience. Participants are expected to download Excel, Tableau (either the free Tableau Public Desktop or the paid Desktop version is fine), RStudio, and Jupyter Notebooks onto their laptops. Apart from Excel, these tools are all open and free. The workshop instructors will email download instructions to registered participants before the workshop/conference begins, and are happy to answer troubleshooting questions before the workshop or during its first few minutes.
Workshop Setting:
Classroom - participants bring laptops
Workshop Computer Requirements:
Windows is preferred but other operating systems are also fine. If a computer lab has all the necessary software (Excel, Tableau, RStudio, and Jupyter Notebooks), this is preferable to laptops, as participants will not need to download software.
2: Teaching Qualitative Data Analysis using Open Data, Standards, and Tools
Sebastian Karcher (Qualitative Data Repository); Nathaniel Porter (Virginia Tech)
Teaching Computer-Assisted Qualitative Data Analysis (CAQDAS) can be a challenge: qualitative data for instruction can be hard to find, and CAQDAS data formats, like all the leading tools, are proprietary. In this "train the trainer" workshop, we show how instructors can leverage the recent opening of qualitative research infrastructure and build an effective training around open qualitative data (as can be found in the Qualitative Data Repository), open standards (here, the REFI-QDA standard for the exchange of qualitative data projects), and open tools (the open-source QualCoder software).
The workshop follows the structure of an “Introduction to Computer Assisted Qualitative Data Analysis” workshop, but focuses on the technical backgrounds and pedagogical issues that instructors face when teaching such a course. We begin with an introduction to the Qualitative Data Repository, the REFI-QDA format, and the QualCoder software. We discuss identifying qualitative data for teaching a workshop as well as common challenges in setting up and working with the QualCoder tool. We then jointly engage in an abbreviated version of a set of qualitative data coding exercises that highlights good practices in teaching the coding of qualitative data. The workshop concludes with some considerations for sharing coded qualitative data.
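For readers unfamiliar with the REFI-QDA project format: a .qdpx file is a zip archive containing a project.qde XML description plus the project's source files, so it can be inspected with standard tools. A minimal R sketch, assuming a hypothetical file name:

    # A .qdpx (REFI-QDA project) file is a zip archive containing a
    # project.qde XML file plus source files. The file name below is
    # hypothetical.
    library(xml2)

    unzip("example_project.qdpx", exdir = "qdpx_contents")
    list.files("qdpx_contents", recursive = TRUE)

    # The project description is plain XML and can be read directly.
    proj <- read_xml(file.path("qdpx_contents", "project.qde"))
    xml_name(proj)  # root element of the REFI-QDA project file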
The target audience for the workshop is principally data professionals with some familiarity with qualitative research who are considering providing support for qualitative researchers. That said, the workshop has no prerequisites and is also open to adventurous beginners interested in learning about qualitative data analysis and its developing open infrastructure.
Workshop Learning Objectives:
After participating in this workshop, participants:
- will be able to find and identify open qualitative data for teaching
- will be able to use the FLOSS QualCoder software and, with prior experience, teach with it
- will know how to identify and address common challenges in teaching introductory qualitative data analysis classes using open tools and data.
Workshop Prerequisites:
None, though some familiarity with qualitative data analysis will benefit participants.
Workshop Setting:
Classroom - participants bring laptops
Workshop Computer Requirements:
The workshop relies exclusively on open tools and software that participants can install on their own laptops for free. As installation on macOS can be challenging, we will provide support for it prior to the workshop.
3: AI-enabled data practices for metadata discovery and access: Best practices for developing training data
Wing Yan Li (University of Surrey); Chandresh Pravin (University of Surrey)
Continued investment in new and existing data collection infrastructures (such as surveys and smart data) highlights the growing need to create efficient, robust, and scalable data resources that help researchers find and access data. Recent advances in artificial intelligence (AI) methods for the automatic analysis of large text collections provide a unique opportunity, at the intersection of computational techniques and research methodologies, to develop data resources that can meet the current and future needs of the research community.
With the widening application of AI and machine learning (ML) pipelines for processing large text corpora, this workshop focuses on a fundamental prerequisite before setting up any pipeline for downstream tasks: the Dataset. It is a common perception that ML models are data hungry and require a vast amount of data to enhance model performance. While understandable, this perception can sometimes overshadow the importance of data quality. In collaboration with CLOSER, this workshop will cover a typical “packaging” of data to train and evaluate models. The workshop will explore various aspects that contribute towards good practice for creating quality training datasets, including exploratory data analysis, selection of evaluation metrics, model selection and model evaluation.
Conventionally, models are evaluated both quantitatively, via appropriate metrics, and qualitatively. Qualitatively analysing every sample is tedious, but relying on random sampling can be problematic. In the section on model evaluation, workshop participants will be introduced to the problem of data biases and gaps. By bridging technological approaches with social science research needs, this workshop offers an exploration of data transformation techniques that enhance research reproducibility and computational analysis capabilities.
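As a toy illustration of why metric choice matters (an example of ours, not the workshop's materials): with imbalanced labels, accuracy alone can look reassuring while hiding a useless model.

    # Toy example in base R: with a 10/90 class imbalance, a model that
    # always predicts the majority class scores 90% accuracy while its
    # recall on the minority class is zero.
    truth <- factor(c(rep("pos", 10), rep("neg", 90)), levels = c("pos", "neg"))
    pred  <- factor(rep("neg", 100), levels = c("pos", "neg"))

    tab <- table(pred, truth)
    accuracy <- sum(diag(tab)) / sum(tab)              # 0.90: looks fine
    recall   <- tab["pos", "pos"] / sum(tab[, "pos"])  # 0: misses every positive
    c(accuracy = accuracy, recall = recall)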
Workshop Learning Objectives:
- Gain knowledge of best practice in creating good training data for machine learning models.
- Identify and explain relevant evaluation metrics.
- Analyse instances of bias in training data.
Workshop Length:
Half-day, 3-hour session
Workshop Setting:
Classroom - participants bring laptops
4: Learn how to create synthetic data
Gillian Raab (University of Edinburgh); Lee Williamson (University of Edinburgh)
The workshop will discuss the use of synthetic data for disclosure control: what synthetic data can be used for, how, and the different types of synthetic data. A practical session on creating low-fidelity synthetic data with the R package synthpop will be included, and participants will be guided on how they might proceed to create synthetic data with greater utility. The workshop will also cover how to assess synthetic data for utility and disclosure control.
We will also review examples of cases where synthetic data has been released to the public.
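As a taste of the practical session, a minimal synthpop sketch might look like the following; SD2011 is the example survey bundled with the package, and the variable selection and seed are illustrative.

    # Low-fidelity synthesis with synthpop, using the bundled SD2011
    # example survey. Variable selection and seed are illustrative.
    library(synthpop)

    ods <- SD2011[, c("sex", "age", "edu", "income")]  # original data subset
    sds <- syn(ods, seed = 2025)                       # synthesise (CART defaults)
    summary(sds)

    # Quick utility check: compare marginal distributions of the
    # synthetic data against the original.
    compare(sds, ods)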
Workshop Learning Objectives:
- To understand the ways that synthetic data can be used to widen access to sensitive data
- To learn how to use the R package synthpop to create low-fidelity synthetic data
Workshop Prerequisites:
- A laptop with R installed
- Familiarity with the R programming language
Workshop Setting:
Classroom - participants bring laptops
Workshop Computer Requirements:
Participants can download the R package and all software and data required.
Afternoon Workshops
5: Embedding sensitive data management best practice in institutional workflows
Zosia Beckles (University of Bristol); Kirsty Merrett (University of Bristol); Alice Motes (University of Bath); Christopher Tibbs (University of Exeter); Kellie Snow (Cardiff University)
This half-day workshop will provide an overview of approaches to managing sensitive research data, based on the collective experience of GW4 (https://gw4.ac.uk/) research data management support staff at the University of Bath, University of Bristol, Cardiff University, and University of Exeter, and of how these approaches may be adapted for different institutional contexts. This will include the design of consent forms, data management plans, and data sharing strategies, and how to embed these within individual institutional workflows and policy frameworks.
The workshop is structured as a train-the-trainer package, developed from material previously delivered as part of the UKRN Train-the-Trainer programme (https://www.ukrn.org/training/data-sharing-from-planning-to-publishing-8-may/). It will include both subject matter content and workshop/training delivery design.
In the first part of the workshop, we will present current GW4 best practice on designing consent forms that enable data sharing, data management planning for studies involving sensitive data, and strategies for sharing sensitive data.
Following this, there will be a discussion of pedagogic approaches to delivering training on these topics. The last part of the workshop will be devoted to developing content based on the topics covered earlier: attendees will have space to develop training materials tailored to their own institutional contexts, and will be able to design and test content blocks and receive feedback from peers and the workshop organisers.
Workshop Learning Objectives:
Attendees will:
- Understand the approaches for sensitive data management used at all four GW4 institutions
- Explore how decisions in the research design and consenting documents affect data sharing at the project’s completion
- Develop their own guidance and training materials, suited to their specific institutional context and infrastructure
- Share challenges and generate solutions for sensitive data management with other sector experts
Workshop Prerequisites:
This workshop is aimed at repository managers, data librarians, research governance/research support staff, and academics working with sensitive data. It may also be relevant for funding bodies that develop their own training. Familiarity with the general principles of research data management is required, but experience of sensitive data management is not.
Workshop Setting:
Classroom - discussion style/movable furniture
Workshop Computer Requirements:
Projector or display compatible with Windows 11 laptops and HDMI connectors.
6: The Art of Transcription: Using open-source tools to optimize transcription processes
Maureen Haaker (UK Data Service)
One of the most significant challenges when working with qualitative data is the substantial amount of time and resources required to prepare text for analysis and sharing. A key step in this preparation is the process of converting audio files into text, or transcription. This task involves multiple elements, such as speaker segmentation, tagging, editing, anonymization, and notation. Each of these steps is essential for producing accurate, high-quality transcripts, but can also create substantial barriers to the efficient curation of qualitative data. In short, the effort involved in completing a full and accurate transcription can often slow down the overall workflow and limit the accessibility of data.
However, despite these challenges, complete transcriptions are essential for ensuring data privacy, enabling secure data sharing, and maximizing the reuse and value of qualitative collections. When done well, transcription makes it possible to share sensitive research data in a way that respects confidentiality and privacy while making the data more accessible to other researchers. As such, transcription is just as vital for the curation and long-term management of qualitative data as it is for research analysis itself.
This workshop is designed for individuals working with both audio and text data who are seeking solutions to streamline their transcription workflows. It will focus on exploring how open-source tools can ease the burden of transcription, notation, summarization, and anonymization. Participants will learn how to build a semi-automated curation pipeline that efficiently converts audio files into shareable, anonymized transcripts. They will also have the opportunity to discuss how these tools can be integrated into existing research workflows, helping to improve the efficiency and quality of transcriptions. By leveraging these powerful open-source tools, researchers can optimize their transcription process, reduce the workload involved, and critically evaluate the choices they make when handling qualitative data.
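As one illustration of what a pipeline step might look like (a sketch of ours, not the workshop's materials), the open-source Whisper command-line tool can be scripted, here from R; the audio file name is hypothetical and Whisper must be installed separately.

    # Transcribe an audio file by calling the open-source Whisper CLI
    # from R. The file name is hypothetical; Whisper is installed
    # separately (e.g. pip install openai-whisper).
    system2("whisper", args = c("interview_01.wav",
                                "--model", "small",
                                "--language", "en",
                                "--output_format", "txt"))

    # Read the transcript back in for editing, notation, and anonymisation.
    transcript <- readLines("interview_01.txt")
    head(transcript)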
Workshop Learning Objectives:
By the end of this course, participants will be able to:
- Understand and explain the differences in transcription approaches;
- Identify key features of a transcript; and
- Effectively use online tools and software to produce a high-quality, accurate, qualitative transcript.
Workshop Prerequisites:
None
Workshop Setting:
Classroom - participants bring laptops
Workshop Computer Requirements:
Whisper, FAMTAFOS (or Textwash), and a web browser (with a Wi-Fi connection)
7: Fundamentals of MAST & IDEAL Metadata
Samuel Spencer (Aristotle Metadata)
Fundamentals of MAST & IDEAL Metadata is a short course in the MAST/IDEAL Methodology, a pragmatic framework for building understanding of, and organisational support for, data management through good processes rather than specific tools or standards.
The MAST/IDEAL Methodology is a standards-agnostic, tools-independent approach that complements existing frameworks such as the Data Management Body of Knowledge and the DDI Data Lifecycle: where those frameworks describe what data practitioners must do, it provides a step-by-step guide to how practitioners can develop the skills and culture to perform these tasks.
This course will be delivered in a practitioner-led group session, including:
- Peer discussions on case studies highlighting existing challenges in change management and education in data governance
- Examination of sample templates for documenting data and how to adopt these for existing projects
- Peer discussions on the use of MAST & IDEAL for the selection of data standards and software
Because MAST/IDEAL is an emerging framework, participants will be invited to provide feedback on the methods and frameworks presented, contributing to ongoing research into the framework. Participants will also be invited to complete an online quiz to earn a digital micro-certification that evidences their experience and knowledge.
Workshop Learning Objectives:
- Understand the principles of the MAST (Metadata-Analysis-Standards-Teamwork) Manifesto
- Understand the stages in the IDEAL (Inventory-Document-Endorse-Audit-Leadership) Framework
- Communicate and promote the importance of metadata to a range of audiences, from researchers and data staff to the administrative and executive level
- Identify the skills specialist teams need to develop a culture of metadata peer review that supports modern data governance
- Communicate steps and focus effort in data documentation and governance to maximise the success of data initiatives
- Identify appropriate metadata standards and software for successful data projects
Workshop Setting:
Classroom - discussion style/movable furniture
8: Working with Census Data in R: Open Source Tools for Social Science Research
Aditya Ranganath (University of Colorado Boulder)
Data librarians and social science information professionals often need to use Census data, whether for their own projects or to assist researchers and students with their empirical research needs. Working with these data products can be complex, however; as a result, academic libraries often subscribe to commercial software applications that make the process more straightforward. These solutions are expensive, though, and may not be readily interoperable with the open-source tools and platforms that researchers typically use to conduct empirical research.
Recent years have witnessed the development of robust open-source R packages that leverage relevant APIs to interact with Census data from a variety of countries and sources across the world. This hands-on workshop provides an introduction to this cross-national, package-based Census data ecosystem in R (encompassing packages such as "tidycensus", "idbr", "ipumsr", and others), which offers an accessible alternative to proprietary software applications for social scientists working with Census data. We will demonstrate how to identify, locate, extract, query, visualize, and analyze Census-based demographic data from a variety of countries in an efficient, cost-effective, and user-friendly manner. Upon completing the workshop, participants will be empowered to use R as a platform for Census-based demographic research in their own work; consult with and effectively support social science students and researchers who wish to work with Census data in R; and develop hands-on teaching materials that introduce relevant campus stakeholders to important Census data resources, packages, and tools through the R programming language.
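As a flavour of this ecosystem (an illustrative sketch, not the workshop's materials), a few lines of "tidycensus" suffice to pull and map an American Community Survey estimate; the state, year, and variable code are example choices, and a Census API key must be set first.

    # Illustrative tidycensus query: median household income
    # (variable B19013_001) by county in Colorado, with geometry for
    # mapping. Requires a Census API key (see census_api_key()).
    library(tidycensus)
    library(ggplot2)

    co_income <- get_acs(
      geography = "county",
      variables = "B19013_001",
      state     = "CO",
      year      = 2022,
      geometry  = TRUE
    )

    ggplot(co_income, aes(fill = estimate)) +
      geom_sf(colour = NA) +
      scale_fill_viridis_c() +
      labs(title = "Median household income by county, Colorado",
           fill  = "USD")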
Workshop Learning Objectives:
Participants will learn how to:
- Query and extract Census American Community Survey datasets to their local R environment using the “tidycensus” package
- Wrangle and process Census datasets to prepare them for analysis
- Create basic maps and visualizations using the census data they extract
Workshop Prerequisites:
Participants should have some prior exposure to the R programming language and the “tidyverse”, though the workshop will not presuppose extensive prior experience. The completion of a “Carpentries” or Carpentries-style introductory workshop on the R programming language would be sufficient preparation to participate in this workshop.
Workshop Computer Requirements:
Participants should attend with their personal laptops, with R and RStudio installed. Before the workshop, they must also apply for a Census API key (https://api.census.gov/data/key_signup.html); typically, a key is issued immediately upon applying. An API key is needed to participate in the workshop.