Analysis and Improvement of the Estonian Named Entity Recognizer
Named entity recognition is an information extraction task that aims to find proper names in text and categorize them. One previously published study addresses named entity recognition for Estonian; it resulted in a named entity recognizer for Estonian, which is accessible through the EstNLTK project. The purpose of this thesis is to port the recognizer to the newest version of EstNLTK and analyse its performance. Based on that analysis, rule-based improvements to the recognizer are proposed, and those that have a positive effect on its performance are implemented.
named entity recognition, natural language processing, statistics, rule-based models, machine learning