Clustering Methods for Interpreting Medical Data

dc.contributor.advisor: Laur, Sven, supervisor
dc.contributor.author: Teesaar, Egert Georg
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-11-07T13:20:27Z
dc.date.available: 2023-11-07T13:20:27Z
dc.date.issued: 2020
dc.description.abstract: Medical bills can be analyzed to identify disease trajectories. Machine learning methods can help answer questions such as which diagnoses occur together and what these conditions arise from. This study uses several clustering methods, including Bernoulli mixture models and autoencoder compression followed by K-means, to divide patients into groups based on the diagnoses they have received. The model results are visualized as heatmaps showing how likely specific diagnoses are in each group. A guided hidden Markov model was also used to assemble a lifelong disease path from short segments of different patients’ treatment histories. This makes it possible to observe how certain conditions arise at different ages and to track disease development over time. The model reproduced results previously reported in medical studies, such as the development of J35 from H65. Model interpretability was further improved by using support vector machines as a feature selection method for I11. This removed the diagnoses that had no connection to I11 and kept only those contributing to the development of the disease. Results on the processed data also agreed with medical findings, such as the development of I50 from I11. [et]
dc.identifier.uri: https://hdl.handle.net/10062/94084
dc.language.iso: eng [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: disease trajectory [et]
dc.subject: medical bills [et]
dc.subject: clustering [et]
dc.subject: visualisation [et]
dc.subject: interpretation [et]
dc.subject: diagnose ranking [et]
dc.subject: unsupervised learning [et]
dc.subject: Bernoulli mixture models [et]
dc.subject: hidden Markov models [et]
dc.subject: K-means [et]
dc.subject: autoencoder [et]
dc.subject: support vector machines [et]
dc.subject.other: magistritööd (master's theses) [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Clustering Methods for Interpreting Medical Data [et]
dc.type: Thesis [et]

Files

Original bundle

Name: master_thesis_teesaar_egert_georg.pdf
Size: 726.08 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission