Creation of RAG Systems for Managing Massive Data in Vector Databases

Leveraging Query Transformation for Enhanced Accuracy in Technical Documentation Retrieval

Joho, Luca, 2025

Type of thesis: Master Thesis
Client
Supervising lecturer: Martin, Andreas
This master's thesis explores the development and optimization of a Retrieval-Augmented Generation (RAG) pipeline designed to extract contextually rich, accurate, and detailed responses from extensive, multilingual technical documents stored in a vector database. Grounded in a design science research methodology, the study takes an iterative, artifact-centric approach that builds and refines the RAG pipeline and systematically evaluates its effectiveness. A comprehensive literature review provided the theoretical basis for the choice of embedding models, evaluation metrics, and prompt templates. Based on these insights, a first conceptual design was created before any coding, so that the practical implementation stayed closely aligned with the best practices, new techniques, and knowledge gaps identified in the literature.
Subsequent iterative development cycles introduced and tested components such as automated query transformation and a vocabulary-based RAG system, which serves as an alternative to time- and resource-intensive fine-tuning. Multiple pipeline configurations, ranging from different embedding models to different chunk and overlap sizes, were examined to determine their impact on retrieval accuracy, response completeness, and runtime. The pipeline's modular design exposes these parameters, along with the prompt templates, as adjustable settings, allowing for rapid experimentation and tailored solutions.
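The abstract does not describe how chunking or vocabulary-based query transformation were implemented, so the following is only an illustrative sketch of the two ideas, not the thesis's code. It assumes character-based chunks and a simple substring lookup; the function names `chunk_text` and `expand_query`, the example vocabulary, and the identifier `P-101` are all hypothetical.

```python
from typing import Dict, List


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def expand_query(query: str, vocabulary: Dict[str, List[str]]) -> str:
    """Append related domain terms for every vocabulary key found in the query.

    Hypothetical stand-in for vocabulary-based query transformation: the
    vocabulary maps a trigger term to related terms (synonyms, translations,
    part identifiers) that are added to the query before retrieval.
    """
    lowered = query.lower()
    extra: List[str] = []
    for term, related in vocabulary.items():
        if term.lower() in lowered:
            for candidate in related:
                if candidate.lower() not in lowered and candidate not in extra:
                    extra.append(candidate)
    return query if not extra else f"{query} ({', '.join(extra)})"


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
    print(expand_query("How do I reset the pump?", {"pump": ["Pumpe", "P-101"]}))
```

In this sketch, a larger overlap produces more chunks to embed and search, and the vocabulary lookup adds a step before retrieval. Both are the kind of configuration choice that trades completeness against runtime.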
Performance was evaluated using both established metrics and advanced methods such as G-EVAL, supplemented by critical input from domain experts. This hybrid evaluation revealed that vocabulary-driven query transformation improves the completeness and relevance of retrieved content but can increase runtime. The findings underscore the importance of balancing efficiency, adaptability, and accuracy, as no single configuration emerged as optimal for all use cases. Instead, the refinement of embedding models, prompt templates, and retrieval strategies proved to be context-dependent, shaped by domain-specific requirements and practical resource constraints. By combining theoretical insights with real-world applications and expert feedback, this master's thesis provides actionable guidelines for implementing robust, scalable RAG pipelines.
Degree programme: Business Information Systems (Master)
Keywords
Confidentiality: public
Language of the thesis: English
Programme location: Olten