35. Librarian teach thyself: Exploring natural language processing tools for de-identification as window into AI for librarians
This poster will present a project that helped librarians learn and teach artificial intelligence (AI) tools for data de-identification. Librarians continue to play an important role in research data management services, offering crucial outreach and instruction to faculty and researchers about data tools, FAIR data, and open science. As AI tools continue to grow in use and popularity, information professionals are poised to offer guidance on data sharing in the AI landscape. Training librarians to train researchers in AI concepts and AI-based data tools helps increase librarian skills in both AI and data management. Data de-identification is a good example of a data management skill that is well suited to AI-based tools. Therefore, a one-hour webinar was created to introduce librarians and information professionals to AI and natural language processing (NLP), and to provide fundamental context about how they work, using de-identification as the example.
Using a train-the-trainer model, this class explored openly available tools that use NLP to find and redact personally identifying information, and discussed the distinctions between privacy, de-identification, anonymization, and (USA-specific) HIPAA compliance. The removal and de-identification of personal and sensitive information is often a repetitive, time-intensive process. These tools can be used to produce HIPAA-compliant data that can be more safely shared and help researchers comply with data sharing policies. Teaching librarians about freely available data tools can improve their understanding of AI and expand their institution’s data service offerings. Although the workshop presented was focused on librarians, our planning process may be of interest to other trainers interested in introducing AI basics through practical tools. This presentation will describe curriculum development, highlighted tools, and results from a pilot run.