Grammatiliste vigade parandamine mitmekeelse neuromasintõlkega (Grammatical Error Correction with Multilingual Neural Machine Translation)

dc.contributor.advisor: Fišel, Mark, supervisor
dc.contributor.author: Luhtaru, Agnes
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-10-27T12:54:56Z
dc.date.available: 2023-10-27T12:54:56Z
dc.date.issued: 2020
dc.description.abstract: We introduce an approach to grammatical error correction that does not require annotated training data. We train a multilingual neural machine translation model using only parallel translation data. Far more translations are openly available than grammatical error correction corpora, especially for low-resource languages like Estonian. We find that this system has high recall but low precision: it corrects many mistakes but also introduces many errors into text that was already correct. Adding artificial mistakes to the training data increases recall and has a strongly positive impact on spelling error correction. Our model reliably corrects grammatical errors, such as subject-verb agreement and noun number, but struggles with lexical errors and produces unnecessary paraphrases. [et]
dc.identifier.uri: https://hdl.handle.net/10062/93808
dc.language.iso: est [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: natural language processing [et]
dc.subject: neural machine translation [et]
dc.subject: grammatical error correction [et]
dc.subject.other: bakalaureusetööd (bachelor's theses) [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Grammatiliste vigade parandamine mitmekeelse neuromasintõlkega (Grammatical Error Correction with Multilingual Neural Machine Translation) [et]
dc.type: Thesis [et]

Files

Original bundle

Name: Luhtaru_informaatika_2020.pdf
Size: 328.33 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission