Metaphor Identification for Estonian

dc.contributor.advisor: Barbu, Eduard, supervisor
dc.contributor.author: Kittask, Claudia
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-09-28T08:39:55Z
dc.date.available: 2023-09-28T08:39:55Z
dc.date.issued: 2021
dc.description.abstract: Metaphors are a common facet of written and spoken language. Humans identify and interpret metaphors easily, but machines struggle to match this capability. Much research on metaphors has been done in recent decades, mainly for English, using approaches that range from rule-based to deep learning-based systems. As of the date of this thesis, no research has been done on computational metaphor processing for the Estonian language. In this thesis, research from the field of computational metaphor processing is applied explicitly to Estonian. All the implemented methods are unsupervised or semisupervised, because no metaphor resources exist for Estonian. This thesis also attempts to incorporate contextualized embeddings from the BERT language model into metaphor identification systems to enhance performance. To test the performance of the methods, a new evaluation dataset for the Estonian language was created. This dataset contains 500 sentences, of which 232 contain a VERB-NOUN phrase in which the VERB is used metaphorically and 268 contain one in which the VERB is used literally. The best results were obtained using BERT embeddings together with information from Estonian WordNet. [et]
dc.identifier.uri: https://hdl.handle.net/10062/93204
dc.language.iso: eng [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Metaphors [et]
dc.subject: clustering [et]
dc.subject: natural language processing [et]
dc.subject: unsupervised learning [et]
dc.subject: semisupervised learning [et]
dc.subject: metaphor identification [et]
dc.subject: BERT [et]
dc.subject.other: magistritööd [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Metaphor Identification for Estonian [et]
dc.type: Thesis [et]

Files

Original bundle

Name: kittask_computerscience_2021.pdf
Size: 2.72 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission