Browse by Author "Tolmats, Norman"

    Nimeüksuste tuvastamine ajaloolistes Tartu Linnavolikogu protokollides [Named Entity Recognition in Historical Tartu City Council Protocols]
    (University of Tartu, 2025) Tolmats, Norman; Orasmaa, Siim, supervisor; University of Tartu. Faculty of Science and Technology; University of Tartu. Institute of Computer Science
    This thesis explores the use of machine learning for named entity recognition (NER) in the meeting protocols of the Tartu City Council from 1918 to 1940, which are in Estonian. Most existing named entity recognition models for Estonian have been developed using modern language data and perform poorly when applied to historical texts. To effectively annotate valuable historical documents, it is necessary either to train specialized models or to adapt existing ones — particularly when only a small amount of labeled data is available. This study analyzes current NER models and evaluates their suitability for older language. Given the limited availability of high-quality labeled data, the best-performing model is adapted using machine learning techniques to be more suitable for these historical meeting protocols. The results demonstrate that, by using a small amount of labeled data and a large corpus of unlabeled historical documents, it is possible to improve model performance through weakly supervised learning — achieving better results on older language than models trained on modern language data.
