Eesti keele nimeolemite märgendaja analüüs ja parandamine (Analysis and Improvement of the Estonian Named Entity Tagger)
Date
2020
Publisher
Tartu Ülikool
Abstract
Named entity recognition is an information extraction task that aims to find proper names in text and categorize them. Only one previously published study addresses named entity recognition for Estonian; it produced a named entity recognizer that is accessible through the EstNLTK project. The purpose of this thesis is to port the recognizer to the newest version of EstNLTK and analyse its performance. Based on that analysis, rule-based improvements to the recognizer are proposed, and those that have a positive effect on its performance are implemented.
Keywords
named entity recognition, natural language processing, statistics, rule-based models, machine learning