Enhancing Fairness and Trustworthiness in Healthcare Language Models

A Framework for Bias Detection and Mitigation

Jakob, Nadia, 2025

The rapid adoption of language models in healthcare has opened new opportunities for improving patient care, knowledge dissemination, and decision-making. However, these models are often undermined by systemic biases embedded in their outputs, posing risks to fairness, reliability, and trustworthiness in sensitive healthcare applications.
This thesis addresses these challenges by proposing a systematic framework to detect, analyse, and mitigate biases in large language models tailored for healthcare use cases, particularly focusing on HIV-related queries. The framework includes input guardrails, trusted knowledge retrieval, and iterative bias inspection and mitigation, implemented within a modular design adaptable to evolving definitions of bias.
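To make the described pipeline concrete, a minimal sketch follows. Every name in it (input_guardrail, retrieve_trusted_passages, inspect_bias, the acceptance threshold, and the prompt-rewriting mitigation step) is a hypothetical placeholder chosen for illustration; the thesis's actual implementation is not reproduced in this abstract.

    # Hypothetical sketch of the modular pipeline described above.
    # All names and thresholds are illustrative assumptions, not the
    # thesis's actual code.

    from dataclasses import dataclass

    @dataclass
    class PipelineResult:
        answer: str
        bias_score: float   # lower is assumed to mean less biased
        iterations: int

    def input_guardrail(query: str) -> bool:
        """Reject queries outside the supported scope (assumed rule)."""
        return "hiv" in query.lower()

    def retrieve_trusted_passages(query: str) -> list[str]:
        """Placeholder for retrieval from an expert-curated knowledge base."""
        return []  # a real system would return vetted passages here

    def generate(query: str, passages: list[str]) -> str:
        """Placeholder for a call to GPT-4o, GPT-3.5-turbo, or Llama 3.1 8B."""
        return "model answer grounded in trusted passages"

    def inspect_bias(answer: str) -> float:
        """Placeholder bias scorer; the thesis's metric is not shown here."""
        return 0.0

    def answer_query(query: str, max_iterations: int = 3,
                     threshold: float = 0.2) -> PipelineResult | None:
        if not input_guardrail(query):
            return None  # input guardrail: refuse out-of-scope queries
        passages = retrieve_trusted_passages(query)
        answer = generate(query, passages)
        score = inspect_bias(answer)
        iterations = 1
        # Iterative inspection-and-mitigation loop: regenerate with a
        # mitigating instruction until the bias score falls below the
        # (assumed) acceptance threshold or the iteration budget runs out.
        while score > threshold and iterations < max_iterations:
            answer = generate(query + " Respond neutrally and inclusively.",
                              passages)
            score = inspect_bias(answer)
            iterations += 1
        return PipelineResult(answer, score, iterations)

Because each stage is a separate function, individual modules (for example the bias scorer) can be swapped out as definitions of bias evolve, which is the modularity the framework aims for.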
The framework was evaluated using expert-reviewed datasets on HIV-related topics and tested across models such as GPT-4o, GPT-3.5-turbo, and Llama 3.1 8B. Evaluation metrics included bias scores and semantic similarity measures. The findings provide a foundation for future research and practical implementations aimed at enhancing the fairness and reliability of AI in sensitive healthcare settings.
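As an illustration of the semantic similarity side of the evaluation, the sketch below compares a model answer to an expert-reviewed reference via cosine similarity over embedding vectors. The toy vectors and the choice of cosine similarity are assumptions made for this sketch; the abstract does not specify the exact measure or embedding model used in the thesis.

    # Minimal sketch of a semantic similarity check between a model answer
    # and an expert-reviewed reference answer. Cosine similarity and the
    # fixed toy vectors below are assumptions that keep the example
    # self-contained; in practice the vectors would come from a
    # sentence-embedding model.

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        """Cosine similarity between two embedding vectors."""
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    reference_embedding = np.array([0.2, 0.7, 0.1])
    answer_embedding = np.array([0.25, 0.65, 0.05])

    score = cosine_similarity(reference_embedding, answer_embedding)
    print(f"semantic similarity: {score:.3f}")  # close to 1.0 means similar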
Keywords
Type of Work
Master Thesis
Authors
Jakob, Nadia
Supervising Lecturers
Martin, Andreas
Year of Publication
2025
Language of the Thesis
English
Confidentiality
public
Degree Programme
Business Information Systems (Master)
Programme Location
Olten