Information Extraction from Building Insurance Policies
Schmid Celia Lorena, 2020
Supervisor: Elzbieta Pustulka
Extracting information from PDF documents such as Swiss insurance policies remains a challenge for which no well-established solution exists. An ongoing FHNW project has made first attempts to extract information from insurance policies with a machine-learning-based approach. However, that solution is not yet ready for the market because its accuracy is insufficient. The goal of this master's thesis was to explore manually what kind of information a building insurance policy contains and to develop a rule-based approach for annotating building insurance policies. We received a database containing the bounding boxes of about 3,000 scanned documents that contain the term “Gebäudeversicherung” (building insurance). We programmed an algorithm that automatically annotates OCR-processed building insurance policies issued by the company Mobiliar. The algorithm returns useful output, with most of the text boxes labelled correctly. These results can be used to further develop the machine-learning-based approach of the FHNW project.
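To illustrate what rule-based annotation of OCR text boxes can look like in general, here is a minimal Python sketch. It is not the algorithm developed in the thesis: the `TextBox` layout, the label names, and the regular expressions in `RULES` are all hypothetical and chosen only for illustration. Each box is given the label of the first rule whose pattern matches its text.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextBox:
    """One OCR text box: its text plus a bounding box (assumed layout)."""
    text: str
    x: float
    y: float
    width: float
    height: float


# Hypothetical rules as (label, pattern) pairs; the first match wins.
# These patterns are illustrative assumptions, not taken from the thesis.
RULES = [
    ("policy_number", re.compile(r"\bPolicen?\s*-?\s*Nr\b", re.IGNORECASE)),
    ("insurance_type", re.compile(r"Gebäudeversicherung", re.IGNORECASE)),
    ("amount_chf", re.compile(r"\bCHF\s*\d[\d'’.,]*")),
    ("date", re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b")),
]


def annotate(box: TextBox) -> Optional[str]:
    """Return the label of the first matching rule, or None if no rule matches."""
    for label, pattern in RULES:
        if pattern.search(box.text):
            return label
    return None


def annotate_all(boxes: list) -> list:
    """Pair every box with its label so unlabelled boxes stay visible."""
    return [(box, annotate(box)) for box in boxes]
```

Because the rules are checked in order, more specific patterns should come before more general ones. Boxes that no rule matches keep the label `None`, which makes it easy to count how much of a document remains unannotated.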
Degree programme: Business Information Systems (Master)
Subject area of the thesis: Business Informatics & IT Management