Information Extraction from Building Insurance Policies

Schmid, Celia Lorena, 2020

Type of thesis: Master Thesis
Supervisor: Pustulka, Elzbieta
Extracting information from PDF documents such as Swiss insurance policies is a challenge for which no well-established solution exists yet. An ongoing FHNW project has made first attempts at extracting information from insurance policies with a machine-learning-based approach. However, that solution is not yet ready for the market because its accuracy is insufficient. The goal of this master thesis was to explore manually what kind of information a building insurance policy contains and to develop a rule-based approach for annotating building insurance policies. We received a database containing the bounding boxes of about 3,000 scanned documents that contain the term “Gebäudeversicherung” (building insurance). We programmed an algorithm that automatically annotates OCR-processed building insurance policies issued by the company Mobiliar. The algorithm returns useful output, with most of the text boxes labelled correctly. These results can be used to further develop the machine-learning-based approach of the FHNW project.
Degree programme: Business Information Systems (Master)
Confidentiality: public
Publication year
2020
Language of thesis
English
Programme location
Olten