Lihtsate eesti keele lausete grammatika tuletamine korpusest geneetilise algoritmiga
Date
2013
Publisher
Tartu Ülikool
Abstract
The aim of this work is to analyse genetic algorithms as a tool for the
automatic induction of grammars and, more specifically, to find a grammar for
simple Estonian sentences. Inducing grammars from example sentences is a useful
means of natural language analysis when writing a grammar by hand is not
practical; one example is the automatic induction of a grammar for medical
reports, treated as a domain-specific sublanguage, in [Kat11]. Grammar induction
has also found applications in lossless data compression, for example [NMWM94].
This work concentrated on generating context-free grammars with genetic algorithms. Its main purpose was to generate a grammar for simple Estonian sentences. We gave an overview of previous work in the field, which discussed ways to represent grammars for genetic algorithms and potential problems such as getting stuck in a local maximum or generating overly permissive grammars. We also gave a brief overview of genetic algorithms and of context-free grammars. We evaluated our version of the genetic algorithm by inducing a known grammar of three-variable algebraic expressions, and concluded that the approach is promising but the implementation is problematic. Our experiment of inducing a grammar from a morphologically tagged corpus was not successful. We concluded that the results might be improved by using a more detailed evaluation function.