Ontology-Based Prompt Engineering with Large Language Models

A test of an ontology-based prompt engineering approach in extracting information from annual financial statements

Balsiger, David, 2025

Type of thesis: Master Thesis
Supervisor: Hanne, Thomas
Large language models (LLMs) have become a widely discussed topic with a variety of potential applications, many of which are accessible to people with little technical knowledge and no coding skills. One such application is information extraction, for example extracting data from tables in PDF files. This task is very common in the financial industry, where data from annual reports must be gathered and made available for further analysis. However, large language models tend to hallucinate and in some cases provide wrong information, which creates a need for improved accuracy. A common strategy for improving accuracy is prompt engineering, which influences the output of a large language model by changing its input.
This master thesis created an accounting-domain-specific ontology consisting of 76 classes and applied it in several experiments to evaluate how it influences the accuracy of LLMs when extracting information from tables in PDF annual reports of Swiss companies.
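As a rough illustration of what ontology-based prompt engineering can look like, the following minimal Python sketch serializes a tiny ontology fragment and places it in front of an extraction instruction. The class names, the `OntologyClass` type, the `ontology_to_text` and `build_prompt` helpers, the prompt wording and the table figures are all hypothetical; none of them are taken from the thesis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyClass:
    """One class of a hypothetical accounting ontology."""
    name: str
    parent: Optional[str] = None  # name of the superclass, if any
    definition: str = ""


# Illustrative fragment only; the thesis ontology has 76 classes.
ONTOLOGY = [
    OntologyClass("BalanceSheetItem", None, "Any line item of a balance sheet."),
    OntologyClass("Asset", "BalanceSheetItem", "Resource controlled by the company."),
    OntologyClass("CurrentAsset", "Asset", "Asset expected to be realised within one year."),
    OntologyClass("Liability", "BalanceSheetItem", "Present obligation of the company."),
]


def ontology_to_text(classes):
    """Serialize the ontology as one plain-text line per class."""
    lines = []
    for c in classes:
        parent = f" (subclass of {c.parent})" if c.parent else ""
        lines.append(f"- {c.name}{parent}: {c.definition}")
    return "\n".join(lines)


def build_prompt(classes, table_text, target):
    """Assemble a prompt: ontology first, then the table, then the task."""
    return (
        "Use the following accounting ontology to interpret the table.\n"
        f"{ontology_to_text(classes)}\n\n"
        f"Table:\n{table_text}\n\n"
        f"Task: return the value of '{target}'. "
        "Answer 'not found' if the table does not contain it."
    )


table = "Current assets | 1,200\nTotal liabilities | 800"  # made-up figures
prompt = build_prompt(ONTOLOGY, table, "CurrentAsset")
print(prompt)
```

Swapping `ontology_to_text` for another serialization (for example JSON or an RDF syntax) would be one way to vary the format in which the ontology is supplied, which the results describe as affecting different LLMs differently.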
The results show that this approach decreased the number of LLM hallucinations by 9.5% but did not increase the overall accuracy of information extraction. Further experiments showed that the format of the ontology has opposite effects on different LLMs. A combination of additional instructions and ontology-based prompt engineering showed promising results, providing the highest accuracy and the lowest number of hallucinations overall. As with the ontology format, further experiments showed that different wordings of the additional instructions can have inconsistent effects on the performance of different LLMs.
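That hallucinations can fall while accuracy stays flat may seem counterintuitive. One way this can happen is sketched below, under the assumption (not confirmed by the abstract) that a hallucination is a returned value that appears nowhere in the source table, whereas a declined answer is wrong but not hallucinated. The `score` function and all figures are invented for illustration and are not results from the thesis.

```python
def score(answers, truth, table_values):
    """Return (accuracy, hallucination_count) for a set of extractions.

    answers: dict field -> extracted value, or None if the model declined
    truth: dict field -> correct value
    table_values: set of all values that actually occur in the source table
    """
    correct = sum(1 for field, value in truth.items() if answers.get(field) == value)
    # Assumed definition: a hallucination is a non-empty answer that does
    # not occur anywhere in the source table.
    hallucinated = sum(
        1 for value in answers.values()
        if value is not None and value not in table_values
    )
    return correct / len(truth), hallucinated


# Made-up example data.
truth = {"revenue": 500, "net_income": 40, "total_assets": 900}
table_values = {500, 40, 900, 120}

# Baseline run invents the value 45; the second run declines instead.
baseline = {"revenue": 500, "net_income": 45, "total_assets": 900}
with_ontology = {"revenue": 500, "net_income": None, "total_assets": 900}

print(score(baseline, truth, table_values))
print(score(with_ontology, truth, table_values))
```

In this toy case both runs get two of three fields right, but only the baseline returns a value that is absent from the table, so the hallucination count drops from one to zero while accuracy is unchanged.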
Degree program: Business Information Systems (Master)
Confidentiality: public
Publication year: 2025
Language of thesis: English
Degree program location: Olten