Laur, Sven, supervisor
Talvet, Annika
University of Tartu. Faculty of Science and Technology
University of Tartu. Institute of Computer Science
2024-10-03
2024-10-03
2024
https://hdl.handle.net/10062/105084

When interpreting the results of patients' clinical analyses, reference ranges are important because they define the range within which a measurement result can fall for a healthy individual. These ranges can depend on age and gender, and may also vary with the methodology used in a particular laboratory. Using analysis results that are discretized based on reference ranges simplifies data analysis and model training. However, analysis results may be associated with incorrect LOINC codes or units of measurement. The aim of this Master's thesis is to identify analyses and reference ranges that are grouped incorrectly or recorded with incorrect units. It also investigates whether discretized analysis results are beneficial for predicting medical events, and whether prediction accuracy differs between discretization methods. To identify incorrectly grouped analysis results, the data were clustered using a Gaussian mixture model. To assess the predictive capability of discretized results, dependencies between the occurrence of medical events and differently discretized measurements, as well as measurement facts, were examined, and models were trained to predict the occurrence of medical events. The results revealed no significant difference in prediction accuracy between models using different inputs. This suggests that, in predicting medical events, the occurrence of a measurement is as important as the discretized analysis result.

en
Attribution-NonCommercial-NoDerivs 3.0 Estonia
Laboratory analysis; reference range; cluster analysis; master's theses; informatics; information technology
Laborianalüüside diskretiseerimine ja analüüs (Discretization and Analysis of Laboratory Analyses)
Thesis