Tervisesündmuste üldistamine sõnavektorite abil [Generalising health events using word vectors]

Saarse, Kermo

dc.contributor.advisor	Reisberg, Sulev, supervisor
dc.contributor.author	Saarse, Kermo
dc.contributor.other	Tartu Ülikool. Loodus- ja täppisteaduste valdkond	et
dc.contributor.other	Tartu Ülikool. Arvutiteaduse instituut	et
dc.date.accessioned	2024-10-02T08:34:32Z
dc.date.available	2024-10-02T08:34:32Z
dc.date.issued	2024
dc.description.abstract	In an electronic health record, each visit to a doctor can generate multiple data points. The same health issue may be linked to several diagnoses, drug prescriptions and measurements, all recorded as separate events. This high resolution makes the data difficult to analyse. In this thesis, a word2vec model and K-means clustering are used to aggregate related health events into generalised events in an OMOP CDM dataset. The results show that word2vec can successfully identify related events. As the number of clusters grows, each cluster becomes more homogeneous, but the number of similar clusters also increases. After generalisation, the number of events in a patient's dataset decreased significantly.
dc.identifier.uri	https://hdl.handle.net/10062/105008
dc.language.iso	et
dc.publisher	Tartu Ülikool	et
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Estonia	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/ee/
dc.subject	Word2vec
dc.subject	K-means
dc.subject	OMOP CDM
dc.subject	ICD10
dc.subject.other	magistritööd	et
dc.subject.other	informaatika	et
dc.subject.other	infotehnoloogia	et
dc.subject.other	informatics	en
dc.subject.other	infotechnology	en
dc.title	Tervisesündmuste üldistamine sõnavektorite abil
dc.type	Thesis	en
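
The clustering and generalisation steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the event codes, the 2-D vectors (standing in for word2vec embeddings), the seeding rule, and the helper names kmeans and generalise are all hypothetical, and collapsing repeated clusters within one visit is only an assumed way of counting generalised events.

```python
import math

# Hypothetical 2-D vectors standing in for word2vec embeddings of event codes.
# Codes and values are invented for illustration; they are not thesis data.
EVENT_VECTORS = {
    "dx_hypertension": (0.90, 0.10),
    "dx_fracture": (0.10, 0.90),
    "rx_antihypertensive": (0.80, 0.20),
    "meas_blood_pressure": (0.85, 0.15),
    "rx_painkiller": (0.20, 0.80),
}


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; the first k vectors seed the centroids."""
    names = list(vectors)
    centroids = [vectors[name] for name in names[:k]]
    labels = {}
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = {
            name: min(range(k), key=lambda c: math.dist(vectors[name], centroids[c]))
            for name in names
        }
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[n] for n in names if labels[n] == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def generalise(visit_events, labels):
    """Replace each event with its cluster id and drop repeats within the visit."""
    seen = []
    for event in visit_events:
        cluster = labels[event]
        if cluster not in seen:
            seen.append(cluster)
    return seen


labels = kmeans(EVENT_VECTORS, k=2)
visit = ["dx_hypertension", "meas_blood_pressure", "rx_antihypertensive"]
generalised = generalise(visit, labels)
print(len(visit), "events ->", len(generalised), "generalised event(s)")
```

In this toy setup the three hypertension-related events fall into one cluster, so the visit shrinks from three events to one generalised event. With real data the vectors would come from a trained word2vec model and the choice of k would drive the trade-off the abstract describes between cluster homogeneity and the number of similar clusters.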

Files

Original bundle

Name: Saarse_andmeteadus_2024.pdf
Size: 2.24 MB
Format: Adobe Portable Document Format

Collections

MTAT magistritööd – Master's theses