Inducing a Grammar of Simple Estonian Sentences from a Corpus with a Genetic Algorithm (Lihtsate eesti keele lausete grammatika tuletamine korpusest geneetilise algoritmiga)

Date

2013

Publisher

Tartu Ülikool

Abstract

The aim of this work is to analyze genetic algorithms as a tool for the automatic induction of grammars and, more specifically, to find a grammar for simple Estonian sentences. Grammar induction from example sentences is a useful tool for natural language analysis when writing a grammar by hand is not practical; one example is the automatic induction of a grammar for medical reports, treated as a domain-specific sublanguage, in [Kat11]. Grammar induction has also found applications in lossless data compression, for example [NMWM94].
This work concentrated on generating context-free grammars with genetic algorithms. Its main purpose was to generate a grammar for simple Estonian sentences. We gave an overview of previous work in the field, which discussed ways to represent grammars for genetic algorithms and potential problems such as getting stuck in a local maximum or generating overly permissive grammars. We also gave a brief overview of genetic algorithms and of context-free grammars. We evaluated our version of the genetic algorithm by inducing a known grammar of three-variable algebraic expressions. We concluded that the approach is promising but that the implementation is problematic. Our experiment of inducing a grammar from a morphologically tagged corpus was not successful. We concluded that the results might be improved by using a more detailed evaluation function.
