Grammatical Error Correction with a Frequency-Based Synthetic Dataset (Grammatiliste vigade parandamine sageduspõhise sünteetilise andmestikuga)

dc.contributor.advisor: Luhtaru, Agnes (supervisor)
dc.contributor.advisor: Fišel, Mark (supervisor)
dc.contributor.author: Univer, Jakob
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-08-23T06:28:16Z
dc.date.available: 2023-08-23T06:28:16Z
dc.date.issued: 2022
dc.description.abstract: In this thesis we introduce a grammatical error correction method that uses a neural network trained only on synthetic data. The method is useful for languages, such as Estonian, that lack large corpora for training a grammatical error correction model. From a smaller human-corrected corpus, we estimated the probabilities of word deletion, word insertion, word substitution and word-order errors. Using these probabilities, we generated a larger synthetic corpus and trained a neural network for grammatical error correction on it. We found that the error probabilities do not have to be very precise, and that the trained network can correct spelling mistakes as well as grammatical mistakes. [et]
dc.identifier.uri: https://hdl.handle.net/10062/91682
dc.language.iso: est [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Grammatical Error Correction [et]
dc.subject: neural network [et]
dc.subject: synthetic data [et]
dc.subject.other: bakalaureusetööd (bachelor's theses) [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Grammatiliste vigade parandamine sageduspõhise sünteetilise andmestikuga [et]
dc.type: Thesis [et]
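
The abstract describes corrupting clean text with word-level errors (deletion, insertion, substitution, word-order change) drawn according to probabilities estimated from a small corrected corpus. A minimal Python sketch of that idea follows. It is not the thesis's implementation: the function names `corrupt` and `make_pairs`, the probability values, and the tiny vocabulary are hypothetical placeholders chosen only for illustration.

```python
import random

# Order in which error types are tried; the names are illustrative.
ERROR_TYPES = ("delete", "insert", "substitute", "swap")


def corrupt(tokens, probs, vocab, rng):
    """Return a noisy copy of `tokens`.

    `probs` maps an error type to its per-word probability (assumed to
    sum to at most 1); `vocab` supplies words for insertions and
    substitutions; `rng` is a random.Random instance.
    """
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        r = rng.random()  # uniform in [0, 1)
        op, threshold = "keep", 0.0
        for name in ERROR_TYPES:
            threshold += probs.get(name, 0.0)
            if r < threshold:
                op = name
                break
        if op == "delete":
            pass  # drop the word
        elif op == "insert":
            out.extend([rng.choice(vocab), tok])  # extra word before tok
        elif op == "substitute":
            out.append(rng.choice(vocab))  # replace the word
        elif op == "swap" and i + 1 < len(tokens):
            out.extend([tokens[i + 1], tok])  # exchange with next word
            i += 1  # the next word has already been emitted
        else:
            out.append(tok)  # keep the word unchanged
        i += 1
    return out


def make_pairs(sentences, probs, vocab, seed=0):
    """Build (noisy, clean) training pairs from clean sentences."""
    rng = random.Random(seed)
    return [(" ".join(corrupt(s.split(), probs, vocab, rng)), s)
            for s in sentences]


if __name__ == "__main__":
    # Placeholder probabilities, not values reported in the thesis.
    probs = {"delete": 0.03, "insert": 0.02, "substitute": 0.05, "swap": 0.02}
    vocab = ["ja", "on", "ei", "see"]
    for noisy, clean in make_pairs(["see on väga hea raamat"], probs, vocab):
        print(noisy, "->", clean)
```

A correction model would then be trained to map the noisy side of each pair back to the clean side. Since the abstract reports that the probabilities need not be very precise, rough estimates such as the placeholders above may be a reasonable starting point.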

Files

Original bundle

Name:
Univer_Informaatika_2022.pdf
Size:
188.62 KB
Format:
Adobe Portable Document Format
Description:

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description: