Tervisesündmuste üldistamine sõnavektorite abil (Generalising health events using word vectors)

dc.contributor.advisor: Reisberg, Sulev, juhendaja
dc.contributor.author: Saarse, Kermo
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2024-10-02T08:34:32Z
dc.date.available: 2024-10-02T08:34:32Z
dc.date.issued: 2024
dc.description.abstract: In an electronic health record, each visit to a doctor can generate multiple data points. The same health issue can be linked to multiple diagnoses, drug prescriptions and measurements, all of which are recorded as separate events. Such high data resolution makes analysis difficult. In this thesis, a word2vec model and K-means clustering are used to aggregate related health events into generalised events in an OMOP CDM dataset. It is shown that word2vec can successfully identify related events. As the number of clusters grows, each cluster becomes more homogeneous, but the number of similar clusters also increases. As a result of generalisation, the number of events in a patient's dataset decreased significantly.
dc.identifier.uri: https://hdl.handle.net/10062/105008
dc.language.iso: et
dc.publisher: Tartu Ülikool [et]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Estonia [en]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/ee/
dc.subject: Word2vec
dc.subject: K-means
dc.subject: OMOP CDM
dc.subject: ICD10
dc.subject.other: magistritööd [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [en]
dc.subject.other: infotechnology [en]
dc.title: Tervisesündmuste üldistamine sõnavektorite abil
dc.type: Thesis [en]

Files

Original bundle

Name: Saarse_andmeteadus_2024.pdf
Size: 2.24 MB
Format: Adobe Portable Document Format