Creation and Analysis of an Estonian Subreddit Corpus (Eesti alamredditi korpuse loomine ning analüüs)
dc.contributor.advisor | Orasmaa, Siim, supervisor | |
dc.contributor.author | Tamm, Tauno | |
dc.contributor.other | Tartu Ülikool. Loodus- ja täppisteaduste valdkond | et |
dc.contributor.other | Tartu Ülikool. Arvutiteaduse instituut | et |
dc.date.accessioned | 2024-09-26T07:29:16Z | |
dc.date.available | 2024-09-26T07:29:16Z | |
dc.date.issued | 2024 | |
dc.description.abstract | Reddit is the world's largest forum, visited by about 1.2 billion users monthly. The largest Estonian subreddit is r/Eesti. In this master's thesis, a language corpus was created from r/Eesti data and then analyzed. The analysis addressed how and when posts are made and what they discuss. To answer these research questions, several transformer-based models were fine-tuned for sentiment analysis, the Python library Lingua was used for language detection, and BERTopic was employed for topic analysis. The results revealed that r/Eesti can be considered bilingual, as a significant portion of posts and comments are in English. The sentiment analysis indicated that users posting and commenting in Estonian are mostly negative, while those writing in English tend to be neutral, with a slight lean towards positivity. In both languages, “Education” is the most common topic. | |
dc.identifier.uri | https://hdl.handle.net/10062/104920 | |
dc.language.iso | en | |
dc.publisher | Tartu Ülikool | et |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Estonia | en |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/ee/ | |
dc.subject | Natural Language Processing | |
dc.subject | Sentiment Analysis | |
dc.subject | Language Detection | |
dc.subject | r/Eesti | |
dc.subject | BERTopic | |
dc.subject.other | magistritööd | et |
dc.subject.other | informaatika | et |
dc.subject.other | infotehnoloogia | et |
dc.subject.other | informatics | en |
dc.subject.other | infotechnology | en |
dc.title | Eesti alamredditi korpuse loomine ning analüüs | |
dc.type | Thesis | en |