Information Extraction from Building Insurance Policies

Schmid, Celia Lorena, 2020

Type of Thesis: Master Thesis
Supervisor: Pustulka, Elzbieta
Extracting information from PDF documents such as Swiss insurance policies is a challenge for which no well-established solution exists yet. An ongoing FHNW project has made first attempts at extracting information from insurance policies with a machine-learning-based approach. However, that solution is not yet market-ready because its accuracy is insufficient. The goal of this master thesis was to explore manually what kind of information a building insurance policy contains and to develop a rule-based approach for annotating building insurance policies. We received a database containing the bounding boxes of about 3,000 scanned documents that contain the term “Gebäudeversicherung” (building insurance). We programmed an algorithm that automatically annotates OCR-processed building insurance policies of the company Mobiliar. The algorithm returns useful output, with most of the text boxes labelled correctly. These results can be used to further develop the machine-learning-based approach of the FHNW project.
Study program: Business Information Systems (Master)
Confidentiality: Public
Thesis Language
English
Location
Olten