Enhancing Fairness and Trustworthiness in Healthcare Language Models

A Framework for Bias Detection and Mitigation

Jakob, Nadia, 2025

The rapid adoption of language models in healthcare has opened new opportunities for improving patient care, knowledge dissemination, and decision-making. However, these models are often undermined by systemic biases embedded in their outputs, posing risks to fairness, reliability, and trustworthiness in sensitive healthcare applications.
This thesis addresses these challenges by proposing a systematic framework to detect, analyse, and mitigate biases in large language models tailored for healthcare use cases, particularly focusing on HIV-related queries. The framework includes input guardrails, trusted knowledge retrieval, and iterative bias inspection and mitigation, implemented within a modular design adaptable to evolving definitions of bias.
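To make the pipeline concrete, the following is a minimal sketch of how such a modular flow could be wired together. All names (run_pipeline, PipelineResult, the callable parameters) are illustrative assumptions, not the thesis's actual implementation; the point is the order of the stages: input guardrail, trusted retrieval, generation, and an iterative bias inspection and mitigation loop.

```python
# Hypothetical sketch of the modular framework described above.
# Every name here is an assumption for illustration only.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PipelineResult:
    answer: str
    bias_score: float
    iterations: int


def run_pipeline(
    query: str,
    guardrail: Callable[[str], bool],           # input guardrail: reject unsafe or off-topic queries
    retrieve: Callable[[str], List[str]],       # trusted knowledge retrieval
    generate: Callable[[str, List[str]], str],  # LLM call grounded in retrieved passages
    inspect_bias: Callable[[str], float],       # bias inspection; lower score is better
    mitigate: Callable[[str, str], str],        # rewrite the answer when bias is detected
    threshold: float = 0.2,
    max_iterations: int = 3,
) -> PipelineResult:
    if not guardrail(query):
        return PipelineResult("Query rejected by input guardrail.", 0.0, 0)

    context = retrieve(query)
    answer = generate(query, context)
    score = inspect_bias(answer)

    # Iterative bias inspection and mitigation loop.
    iterations = 0
    while score > threshold and iterations < max_iterations:
        answer = mitigate(query, answer)
        score = inspect_bias(answer)
        iterations += 1

    return PipelineResult(answer, score, iterations)
```

Because each stage is passed in as a callable, individual modules (for example, the bias inspector) can be swapped out as definitions of bias evolve, which is the adaptability the modular design aims for.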
The framework was evaluated using expert-reviewed datasets on HIV-related topics and tested across models such as GPT-4o, GPT-3.5-turbo, and Llama 3.1 8B. The evaluation relied on metrics including bias scores and semantic similarity measures. The results provide a foundation for future research and practical implementations aimed at enhancing the fairness and reliability of AI in sensitive healthcare settings.
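The abstract does not specify how semantic similarity is computed; one plausible reading, shown below purely as an assumption, is a cosine similarity over sentence embeddings between an expert-reviewed reference answer and a model-generated answer. The model name and helper function are hypothetical.

```python
# Illustrative semantic similarity measure, assuming a sentence-embedding
# approach; not necessarily the metric used in the thesis.
from sentence_transformers import SentenceTransformer, util

_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model


def semantic_similarity(reference: str, candidate: str) -> float:
    """Cosine similarity in [-1, 1] between an expert-reviewed reference
    answer and a model-generated answer."""
    embeddings = _model.encode([reference, candidate], convert_to_tensor=True)
    return util.cos_sim(embeddings[0], embeddings[1]).item()


# Example: compare a model's answer to an HIV-related query against
# the expert-reviewed reference from the evaluation dataset.
# score = semantic_similarity(expert_answer, model_answer)
```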
Type of Thesis
Master Thesis
Authors
Jakob, Nadia
Supervisor
Martin, Andreas
Publication Year
2025
Thesis Language
English
Confidentiality
Public
Study Program
Business Information Systems (Master)
Location
Olten