Augmenting and Evaluating Biomedical Hypothesis through Knowledge Graph Integration and Large Language Models

The exponential growth of biomedical data and scientific publications has significantly reshaped the process of scientific discovery. While automated hypothesis generation techniques have become increasingly effective at producing large numbers of candidate biomedical hypotheses, they have also shifted the primary research bottleneck from hypothesis generation to hypothesis evaluation. Researchers now face information overload, fragmented evidence, limited interpretability of model outputs, and the absence of robust mechanisms for prioritizing hypotheses in a transparent and systematic manner.

Berdini, Angelica, 2025

Type of thesis: Master Thesis

Supervisors: Laurenzi, Emanuele; Callisto De Donato, Massimo

This thesis addresses these challenges by exploring the integration of knowledge graphs (KGs) and large language models (LLMs) as a means of supporting the evaluation of automatically generated biomedical hypotheses. Following the design science research (DSR) methodology, a hybrid decision-support prototype was designed, developed, and evaluated with the goal of assisting domain experts during hypothesis evaluation rather than replacing expert judgment.

The proposed system uses KGs as a structured and traceable backbone for representing biomedical entities and the semantic relationships among them, derived from scientific literature and specialized databases. By exposing explicit graph paths, publication metadata, and statistical indicators such as the Mutual Information Score, the system aims to provide a transparent exploration of hypotheses and their supporting context. LLMs are integrated as an interpretative layer that provides structured natural language explanations grounded in graph-based evidence, with the intention of improving the accessibility and interpretability of complex biomedical relationships.

The evaluation results indicate that, while the prototype is a promising exploratory tool, it does not yet provide sufficiently incisive support to autonomously validate or reject biomedical hypotheses. In several cases, the system alone was not sufficient to strongly support a hypothesis, and additional contextual elements or expert-driven interpretation were required. These outcomes highlight intrinsic structural limitations related to data completeness, KG coverage, and the current capabilities of LLMs when applied to complex biomedical reasoning tasks.

From a critical perspective, the prototype should be regarded primarily as an assistive and exploratory system rather than as a definitive decision-making tool. Although it improves transparency and evidence aggregation compared to isolated approaches, its effectiveness is constrained by incomplete biological relations, uneven publication coverage, and simplified scoring strategies. These factors reduce the system’s ability to provide conclusive support for certain hypotheses and underline the continued necessity of human expertise in the evaluation process.

Despite these limitations, the findings of this thesis support the validity of combining KGs and LLMs as a technological direction for addressing the hypothesis evaluation bottleneck. KGs provide essential structure, traceability, and grounding of evidence, while LLMs contribute to the interpretation and communication of complex information. The observed limitations therefore do not indicate an inappropriate technological choice, but rather the need for richer datasets, more expressive graph schemas, and more advanced integration strategies.

This work also reinforces the importance of human-in-the-loop approaches in AI-assisted scientific discovery. Rather than aiming for full automation, the proposed system demonstrates how hybrid technologies can augment expert reasoning by reducing cognitive load, improving transparency, and supporting informed decision-making. This perspective aligns with current trends in explainable and trustworthy artificial intelligence, particularly in high-stakes domains such as biomedical research.

Several directions for future work emerge from this research. These include the integration of additional curated biomedical databases, the refinement of multidimensional scoring and ranking mechanisms, and the exploration of more advanced explanation strategies that better capture biological causality. Moreover, larger-scale user studies would be required to assess more fully the system’s impact on real-world research workflows.
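The graph paths and co-occurrence statistics that the system exposes can be illustrated with a minimal sketch. It is not the thesis prototype: the entities, relations, and function names below are invented, and it assumes that the Mutual Information Score can be approximated by pointwise mutual information computed from publication counts, which the abstract does not specify.

```python
import math
from collections import deque

# Toy knowledge graph as (subject, relation, object) triples.
# All entities and relations are hypothetical, for illustration only.
TRIPLES = [
    ("drug_A", "inhibits", "gene_X"),
    ("gene_X", "regulates", "pathway_P"),
    ("pathway_P", "associated_with", "disease_D"),
    ("drug_A", "binds", "protein_Y"),
    ("protein_Y", "associated_with", "disease_D"),
]


def build_adjacency(triples):
    """Index the triples by subject for directed traversal."""
    adjacency = {}
    for subj, rel, obj in triples:
        adjacency.setdefault(subj, []).append((rel, obj))
    return adjacency


def find_paths(adjacency, source, target, max_hops=3):
    """Enumerate simple directed paths from source to target, shortest first."""
    paths = []
    queue = deque([(source, [])])
    while queue:
        node, path = queue.popleft()
        if node == target and path:
            paths.append(path)
            continue
        if len(path) == max_hops:
            continue
        visited = {source} | {step[2] for step in path}
        for rel, nxt in adjacency.get(node, []):
            if nxt not in visited:
                queue.append((nxt, path + [(node, rel, nxt)]))
    return paths


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information (base 2) from publication counts.

    Assumed stand-in for the Mutual Information Score named in the abstract.
    """
    if n_xy == 0:
        return float("-inf")
    return math.log2((n_xy * n_total) / (n_x * n_y))


if __name__ == "__main__":
    adjacency = build_adjacency(TRIPLES)
    for path in find_paths(adjacency, "drug_A", "disease_D"):
        print(" -> ".join(f"{s} [{r}] {o}" for s, r, o in path))
    # Invented counts: 20 joint mentions, 100 and 50 individual, 1000 papers.
    print(pmi(20, 100, 50, 1000))
```

Each returned path is a list of triples, so an expert can trace every hop back to its source. A positive PMI means the two entities are mentioned together more often than independence would predict, which is weak statistical support rather than evidence of a causal link.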

The main research question can be answered as follows. The proposed hybrid system supports transparency and improves the inspection and interpretation of automatically generated biomedical hypotheses, but it does not support trustworthy autonomous evaluation. The system is useful as an exploratory tool, yet it remains at an early stage of maturity and cannot be considered a complete or independent solution.

In conclusion, this thesis demonstrates that hybrid systems combining KGs and LLMs are not yet capable of fully replacing expert-driven biomedical hypothesis evaluation. However, they represent a concrete and promising step toward more transparent, scalable, and user-centered scientific support tools. The prototype and empirical insights presented in this work contribute to a deeper understanding of both the potential and the current limitations of these technologies, providing a solid foundation for future research in explainable and trustworthy scientific discovery systems.

Degree programme: Business Information Systems (Master)

Confidentiality: public

Language of thesis: English

Programme location: Olten