Enhancing Fairness and Trustworthiness in Healthcare Language Models
A Framework for Bias Detection and Mitigation
Jakob, Nadia, 2025
Type of Work: Master Thesis
Client:
Supervising Lecturers: Martin, Andreas
The rapid adoption of language models in healthcare has opened new opportunities for improving patient care, knowledge dissemination, and decision-making. However, these models are often undermined by systemic biases embedded in their outputs, posing risks to fairness, reliability, and trustworthiness in sensitive healthcare applications.
This thesis addresses these challenges by proposing a systematic framework to detect, analyse, and mitigate biases in large language models tailored for healthcare use cases, particularly focusing on HIV-related queries. The framework includes input guardrails, trusted knowledge retrieval, and iterative bias inspection and mitigation, implemented within a modular design adaptable to evolving definitions of bias.
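To make the described flow concrete, the sketch below shows one way such a pipeline could be wired together in Python. All names (run_pipeline, the injected component callables, the bias threshold) are illustrative assumptions for exposition, not the thesis's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

# Illustrative sketch only: component names, signatures, and the threshold
# are assumptions, not the thesis's actual code.

@dataclass
class PipelineResult:
    answer: str
    bias_score: float
    mitigation_rounds: int

def run_pipeline(
    query: str,
    is_allowed: Callable[[str], bool],          # input guardrail
    retrieve: Callable[[str], List[str]],       # trusted knowledge retrieval
    generate: Callable[[str, List[str]], str],  # underlying language model
    inspect_bias: Callable[[str], float],       # bias score; lower is better
    mitigate: Callable[[str, float], str],      # rewrite a flagged answer
    threshold: float = 0.2,
    max_rounds: int = 3,
) -> PipelineResult:
    """Guardrail -> retrieval -> generation -> iterative bias inspection."""
    if not is_allowed(query):
        return PipelineResult("Query rejected by input guardrail.", 0.0, 0)
    context = retrieve(query)             # ground generation in trusted sources
    answer = generate(query, context)
    score = inspect_bias(answer)
    rounds = 0
    while score > threshold and rounds < max_rounds:
        answer = mitigate(answer, score)  # revise, then re-inspect
        score = inspect_bias(answer)
        rounds += 1
    return PipelineResult(answer, score, rounds)
```

Because each stage is passed in as a callable, the guardrail, retriever, and bias inspector can be swapped independently, which reflects the modular design adaptable to evolving definitions of bias.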
The framework was evaluated using expert-reviewed datasets on HIV-related topics and tested across models such as GPT-4o, GPT-3.5-turbo, and Llama 3.1 8B. Evaluation metrics included bias scores and semantic similarity measures. These findings provide a foundation for future research and practical implementations aimed at enhancing the fairness and reliability of AI in sensitive healthcare settings.
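As an illustration of one such metric, the snippet below sketches a semantic similarity measure based on embedding cosine similarity. The embed argument is a placeholder for any sentence-embedding model; the metrics actually used in the thesis may be defined differently.

```python
import numpy as np
from typing import Callable

# Illustrative sketch only: embed() stands in for any sentence-embedding
# model; the thesis's actual metric definitions may differ.

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity in [-1, 1]; values near 1 indicate answers
    semantically close to the reference."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_similarity(model_answer: str, reference_answer: str,
                        embed: Callable[[str], np.ndarray]) -> float:
    # Compare a model's answer against an expert-reviewed reference answer.
    return cosine_similarity(embed(model_answer), embed(reference_answer))
```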
Degree Programme: Business Information Systems (Master)
Confidentiality: public