Predicting Depression Symptoms Based on Reddit Posts

dc.contributor.advisor: Sirts, Kairit (supervisor)
dc.contributor.author: Koljal, Kaire
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-08-24T06:06:24Z
dc.date.available: 2023-08-24T06:06:24Z
dc.date.issued: 2022
dc.description.abstract: Using social media posts to predict mental health problems has become a popular topic in Natural Language Processing (NLP). Machine learning has been used to detect either a diagnosis or single symptoms associated with depression. Because the clinical picture of depression differs between people, it is better to detect symptoms than a diagnosis from social media posts. In this work, depression symptoms are predicted from posts on the subreddit r/depression using NLP methods and multi-label classification. The work focuses on evaluating the quality of the annotations and analysing whether such data can be used to train a predictive model. Each post is annotated by three annotators, and the labels are aggregated in three ways to create three datasets that are used to train Transformer models. The results show that, on a small dataset with lower annotation agreement, a majority vote over annotations gives the most reliable dataset and results. The RoBERTa model shows the best learning and generalization ability in this work. [et]
dc.identifier.uri: https://hdl.handle.net/10062/91714
dc.language.iso: eng [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Multi-label classification [et]
dc.subject: Transformers [et]
dc.subject: symptom prediction [et]
dc.subject: depression [et]
dc.subject: social media [et]
dc.subject.other: magistritööd (master's theses) [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Predicting Depression Symptoms Based on Reddit Posts [et]
dc.type: Thesis [et]

Files

Original bundle
Name: Koljal_ComputerScience_2022.pdf
Size: 665.53 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission