Morfoloogilise muuttüübi automaatne tuvastamine (Automatic Identification of the Morphological Inflection Type)

Date

2024

Publisher

Tartu Ülikool

Abstract

The Estonian language is constantly evolving as new words are coined in various ways. Language users often know intuitively how to inflect a new word; in linguistics, this intuition is formalized as inflection types. This work investigates how to automate the identification of inflection types. To this end, two LSTM-based models were built to detect and predict inflection types. The source data for the models come from Vabamorf’s morphological lexicon, which contains almost 74,000 lemmas. All possible word forms are synthesized for each lemma, and the result is converted into a form suitable for the LSTM-based models. One model is trained on words alone and reaches an accuracy of 95.8%; the other is trained on words together with their parts of speech and reaches an accuracy of 97.8%.

Keywords

LSTM, inflection types, Vabamorf, artificial neural networks, artificial intelligence, machine learning, classification
