IASSIST 2025: IASSIST at 50! Bridging oceans, harbouring data & anchoring the future


The Art of Transcription: Using open-source tools to optimize transcription processes

One of the most significant challenges when working with qualitative data is the substantial amount of time and resources required to prepare text for analysis and sharing. A key step in this preparation is transcription, the process of converting audio files into text. This task involves multiple elements, such as speaker segmentation, tagging, editing, anonymization, and notation. Each of these steps is essential for producing accurate, high-quality transcripts, but together they can create substantial barriers to the efficient curation of qualitative data. In short, the effort involved in completing a full and accurate transcription can slow down the overall workflow and limit the accessibility of data.

Despite these challenges, complete transcriptions are essential for ensuring data privacy, enabling secure data sharing, and maximizing the reuse and value of qualitative collections. When done well, transcription makes it possible to share sensitive research data in a way that respects confidentiality and privacy while making the data more accessible to other researchers. As such, transcription is just as vital for the curation and long-term management of qualitative data as it is for research analysis itself.

This workshop is designed for individuals working with both audio and text data who are seeking solutions to streamline their transcription workflows. It will focus on exploring how open-source tools can ease the burden of transcription, notation, summarization, and anonymization. Participants will learn how to build a semi-automated curation pipeline that efficiently converts audio files into shareable, anonymized transcripts. They will also have the opportunity to discuss how these tools can be integrated into existing research workflows, helping to improve the efficiency and quality of transcriptions. By leveraging these powerful open-source tools, researchers can optimize their transcription process, reduce the workload involved, and critically evaluate the choices they make when handling qualitative data.
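
As a rough illustration of the final stage of such a pipeline, the sketch below takes speaker-segmented text and produces a timestamped, anonymized transcript. It is a minimal sketch only: the abstract does not name the tools used in the workshop, so this example relies solely on the Python standard library and assumes that transcription and speaker segmentation have already been done by some speech-to-text tool. The names `Segment`, `anonymize`, and `format_transcript`, the placeholder tags, and the pseudonym table are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """One speaker turn, assumed to come from an earlier transcription step."""
    speaker: str   # speaker label, e.g. "INT" or "RES"
    start: float   # start time in seconds
    text: str      # transcribed speech


# Simple pattern for e-mail addresses; real data would need broader rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def anonymize(text, pseudonyms):
    """Replace e-mail addresses and curator-listed names with placeholders."""
    # Mask e-mails first so a listed name inside an address is not half-replaced.
    text = EMAIL_RE.sub("[EMAIL]", text)
    # Replace longer names first so "Alice Smith" is handled before "Alice".
    for name in sorted(pseudonyms, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, pseudonyms[name], text, flags=re.IGNORECASE)
    return text


def format_transcript(segments, pseudonyms):
    """Render segments as '[MM:SS] SPEAKER: text' lines with anonymized text."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg.start), 60)
        clean = anonymize(seg.text, pseudonyms)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg.speaker}: {clean}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example data; a curator would build this table by hand.
    table = {"Alice Smith": "[PARTICIPANT_1]", "Leeds": "[CITY]"}
    turns = [
        Segment("INT", 0.0, "Can you tell me your name?"),
        Segment("RES", 65.4, "I'm Alice Smith, I live in Leeds."),
    ]
    print(format_transcript(turns, table))
```

A lookup table of this kind only catches identifiers that a curator has already spotted, which is one reason the pipeline is described as semi-automated: the output still needs human review before sharing.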

Maureen Haaker
UK Data Service
United Kingdom