Browse by Author "Kull, Meelis, supervisor"
Now showing 1 - 20 of 28
Item: A Competitive Scenario Forecaster using XGBoost and Gaussian Copula (Tartu Ülikool, 2023) Kolomiiets, Denys; Shahroudi, Novin, supervisor; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
In recent years, scenario forecasting has been explored and developed by multiple authors. It is a useful technique for settings such as renewable energy production, which is extremely important for a society transitioning away from fossil fuel energy generation. Currently, one approach to the task of scenario forecasting is generative models. The primary goal of this thesis is to develop an approach that outperforms the current best model, using decision tree models. This work also discusses possible improvements for decision tree models in the scenario forecasting setting. Our approach has surpassed the performance of generative models, making it a solid new baseline for future researchers to beat.

Item: Andmepunktide panuse visualiseerimine binaarklassifitseerija kaos (Tartu Ülikool, 2021) Paal, Magnus; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
A decrease in the output of a classifier’s loss function is one way to tell whether the classifier is improving. The output of a loss function, also known as the loss, is just one value and does not give a complete overview of the classifier and the dataset as a whole. The aim of this thesis was to find a way to interpret the loss through individual datapoints and to visualize it. The resulting visualizations can help to grasp how each datapoint contributes to the total loss. These visualizations could be used to find out which sets of datapoints contribute the most to the loss: the ones whose predicted value is farther from their actual value and which make up a smaller number of points, or those whose predicted value is closer to the actual value and which make up a larger number of points.
Secondly, these visualizations could be used to compare the losses of two classifiers and find out which datapoints contribute most to the difference. Lastly, the visualizations could be used to find out which datapoints, with which features, contribute the most to the loss.

Item: Calibration of Multi-Class Probabilistic Classifiers (Tartu Ülikool, 2022) Valk, Kaspar; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Classifiers, machine learning models that predict probability distributions over classes, are not guaranteed to produce realistic output. A classifier is considered calibrated if the produced output corresponds to the actual class distribution. Calibration is essential in safety-critical tasks where small deviations between the predicted probabilities and the actual class distribution can incur large costs. A common approach to improving the calibration of a classifier is to use a hold-out data set and a post-hoc calibration method to learn a correcting transformation for the classifier’s output. This thesis explores the field of post-hoc calibration methods for classification tasks with multiple output classes: several existing methods are visualized and compared, and three new non-parametric post-hoc calibration methods are proposed. The proposed methods are shown to work well on data sets with fewer classes, managing to improve the state-of-the-art in some cases. The basis of the three suggested algorithms is the assumption of similar calibration errors in close neighborhoods on the probability simplex, which has been previously used but never clearly stated in the calibration literature.
Overall, the thesis offers additional insight into the field of multi-class calibration and allows for the construction of more trustworthy classifiers.

Item: Cost-sensitive classification with deep neural networks (Tartu Ülikool, 2020) Baum, Andreas; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Traditional classification focuses on maximizing the accuracy of predictions. This approach works well if all types of errors have the same cost. Unfortunately, in many real-world applications the misclassification costs differ, and some errors may be much worse than others. In such cases, it is useful to consider the costs and build a classifier that minimizes the total cost of all predictions. So far, cost-sensitive learning has received very little research attention on balanced datasets. It has mostly been considered as one of the measures for solving the class imbalance problem. As the basis of the class imbalance problem is similar to that of cost-sensitive learning, we can mainly rely on the research done on the class imbalance problem. The purpose of this thesis is to examine experimentally how successful different cost-sensitive techniques are at minimizing the total cost compared to an ordinary neural network. The techniques used involve making a neural network cost-sensitive based on the output probabilities. Additionally, oversampling, undersampling and loss functions that consider the class weights are used. The experiments are performed on 3 datasets with different degrees of difficulty and involve binary and multiclass classification tasks. In addition, 3 different cost matrix types are considered. The results show that all the techniques reduce the total prediction cost compared to an ordinary neural network.
The best results were achieved using oversampling and cost-sensitive output modifications for both the binary and multiclass cases.

Item: Dirichlet’ kalibreerimismeetodi analüüs (Tartu Ülikool, 2020) Grjaznov, Kirill; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
In machine learning, one of the problems with classification methods is that classifiers give overconfident probabilities. The solution to the problem is calibration, which applies a correction to the predicted probabilities. In this bachelor's thesis, the Dirichlet calibration method is analyzed. The thesis studies how the calibration matrix changes over the classifier training process and how it affects the results at different training stages, and interprets the elements of the calibration matrix. It describes how calibration is performed with the Dirichlet calibration method and how the calibration matrix reveals and improves the confidence of the classifier. The experiments were performed on deep neural network classifiers with the ResNet110, Wide ResNet32 and DenseNet40 architectures on the CIFAR-10 dataset. The analysis showed that the classifiers were overconfident throughout the whole training process, and that the Dirichlet calibration method improves confidence at each stage of the training process.

Item: Disentanglement of features in variational autoencoders (Tartu Ülikool, 2022) Tark, Kaarel; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Machine learning models, especially neural networks, have shown excellent performance in classifying different images. The features these models learn are often complex and hard to interpret. Learning disentangled features from images is a way to tackle explainability and create features with semantic meaning. A learned feature is disentangled if it represents only a single property of an object.
For example, if we had an image of a chair, we would assume that one feature changes its size, but nothing else. Another feature changes the chair leg shape and nothing else. Beta variational autoencoders (β-VAE) have shown promising performance in learning disentangled features from images without supervision. If there is enough data, the model can learn the features without needing large amounts of labelled data. After learning features, we can use a smaller amount of labelled data to train an additional model on top of the learned features (few-shot learning). Experiments with β-VAE architectures have so far used simple images with known generative factors. Usually, all generative factors are independent, and the architecture assumes that there is a small number of them. Recently a new dataset has been published where some features are dependent (the Boxhead dataset). Experiments with existing architectures showed that β-VAE-based architectures capture those features relatively poorly. Based on an exploratory analysis of β-VAE-based models, we propose a new architecture to improve the result. For evaluation, we introduce new metrics in addition to the commonly used ones. Our results showed no substantial performance difference between our proposed architecture and β-VAE architectures. Based on the results of the main experiments, we conduct additional exploratory experiments on a dataset where the object does not rotate.

Item: DNA-motiivide analüüsimise tarkvara m:Profiler (Tartu Ülikool, 2009) Adari, Mirko; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut

Item: Ettekannete automaatne sessioonideks jagamine teaduskonverentside jaoks (Tartu Ülikool, 2024) Heikla, Mia Marta; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
The aim of this work is to create a method, using a language model, that can divide the articles accepted for a scientific conference into sessions by topic, so that participants can listen to a succession of presentations on a similar topic. Developing the method requires first understanding the conference scheduling problem, the principles of creating a good session title, and the principles of creating story prompts. The method is given the titles and abstracts of the scientific articles. Based on this data, the language model is queried to generate possible session titles, followed by the use of prompts to generalise the titles. The resulting titles are reviewed to ensure their suitability in the context of a machine learning conference, and low-value titles are removed. The remaining titles are evaluated with a language model according to the content and title of each presentation, followed by a segmentation into sessions using an integer linear optimization algorithm. The process is completed by using Levenshtein distance to estimate the similarity of the segmentation of the sessions.

Item: Evaluating Slow Feature Analysis on Time-Series Data (Tartu Ülikool, 2021) Kaasla, Kaarel; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
In this thesis, we investigate Slow Feature Analysis (SFA) as a method of extracting slowly-varying signals from quickly-varying input data. The main aim of the thesis is two-fold. The first objective is to evaluate how the level of noise in the input data affects the performance of SFA for different input feature combinations. The second objective is to compare the performance of the classical formulation of SFA to a biologically plausible version of the algorithm. The first half of the thesis gives the reader a theoretical overview of how the algorithm works and explores some of its previous applications.
The second half conducts three experiments that explore the primary research questions of the thesis and discusses possible further research directions.

Item: Exploring Out-of-Distribution Detection Using Vision Transformers (Tartu Ülikool, 2022) Haavel, Karl Kaspar; Kull, Meelis, supervisor; Leelar, Bhawani Shankar, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Current state-of-the-art artificial neural network (ANN) image classifiers perform well on input data from the same distribution that they were trained on, also known as in-distribution (InD) data, yet have worse results on out-of-distribution (OOD) samples. An input can be considered OOD for many reasons, such as containing a new concept (e.g. a new class) or random noise generated by a sensor. Knowing whether a new data point is OOD is necessary for deploying models in real-world safety-critical applications (e.g. self-driving cars, healthcare) to make safer decisions. For example, a self-driving car slows down when it detects an OOD object or gives control back to the human. The primary method used for OOD detection is to utilise an ANN as a feature extractor of embeddings to approximate where the new data point will be in the embedding space and compare it to trained embeddings using distance metrics. We use a Vision Transformer (ViT) as the ANN because there has been a push to use large-scale pre-trained Transformers to improve a range of OOD tasks. Improvements stem from ViT’s state-of-the-art performance as a feature extractor, which can be used out-of-the-box for OOD detection, in contrast to convolutional neural networks (CNNs), which require custom training methods and exposure to OOD data to reach similar results. In this thesis, a ViT was used as a feature extractor, and the performance of OOD detection was compared using various distance metrics to determine the robustness and choose the best distance metric in ViT’s embedding space.
Three separate experiments were conducted with multiple datasets, methods, models and approaches. The experiments showed that ViT is capable of OOD detection out-of-the-box without any custom training methods or exposure to OOD data. However, none of the distance metrics could noticeably improve the results of OOD detection obtained with the baseline Mahalanobis distance. Nonetheless, ViT has considerably better OOD detection performance on most datasets and is more generalisable than a similarly trained CNN. Furthermore, ViT is more robust to various distance metrics, proving that the features extracted from the model are good enough to discriminate between InD and OOD. Finally, it was shown that ViT with Mahalanobis distance has the best OOD detection performance when blending InD and OOD at various ratios. Future work can consider ensembling multiple distance metrics to utilise the properties of each distance metric and applying the same methodology to other ANN architectures.

Item: Forecasting and Trading Financial Time Series with LSTM Neural Network (Tartu Ülikool, 2021) Madisson, Vahur; Raus, Toomas, supervisor; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
The growing importance of data science and the development of machine learning make it possible to implement algorithms created in recent decades with new, more capable technologies. Machine learning methods can challenge statistical forecasting methods when applied to financial time series, as such data may exhibit nonlinear characteristics. The objective of the thesis is to present a theoretical introduction and practical steps to construct, test, and implement forecasting methods on a stock market index, using an artificial intelligence algorithm called the long short-term memory (LSTM) neural network. A corresponding trading strategy is developed to implement the model predictions.
The empirical study focuses on finding the best configuration of the LSTM model to enhance its forecasting ability, using the Keras library in the Python programming language. The results are assessed in terms of forecast accuracy measures and profitability when applying the trading strategy, and are compared against selected benchmark methods. The results demonstrate that the LSTM forecast accuracy is competitive and that its trading results outperform the selected benchmark methods.

Item: Forecasting Human Trajectories with Uncertainty Estimation (Tartu Ülikool, 2022) Riis, Karl; Kull, Meelis, supervisor; Shahroudi, Novin, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Human trajectory forecasting is a task that has been receiving increasing attention in recent years. It is often used in robotics research, as autonomous robots have to be well aware of the movement patterns of surrounding pedestrians to ensure safe and collision-free navigation. Many recent trajectory prediction works have focused on neural network based solutions, which need to be trained on large amounts of data. We propose a new generative trajectory forecasting method which does not need to be trained beforehand and is algorithmically simple and intuitive. Our method produces a multi-modal output to convey the uncertainty in human motion and is configurable with a set of parameters to adapt it to various environments. We show that our method performs nearly as well as, and in some cases better than, state-of-the-art forecasting models on the task of predicting trajectories in an unseen environment. The results indicate that when deploying a forecasting model in an environment for which little data is available, a neural network can be rivaled by a simpler approach.

Item: Hierarchical Forecasting Methods in Day-Ahead Electricity Consumption Forecasting (Tartu Ülikool, 2024) Kuusk, Carel; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
In many applications, multiple time series can be organized into a single hierarchy so that the time series at the lower levels of the hierarchy can be aggregated into higher-level time series. Forecasts of such time series must be reconciled with one another to guarantee that the aggregation constraints that hold in the time series being forecast also hold in their forecasts. The aim of this master's thesis is to develop and analyze hierarchical forecasting methods for hourly electricity consumption time series. As a result, hierarchical models based on LightGBM and ridge regression models have been developed and analyzed. Two more complex linear reconciliation methods, OLS and the minimum trace method (MinT), are compared with the bottom-up reconciliation method, and in the process significant shortcomings of the OLS and MinT approaches are identified. These shortcomings stem from the covariance structure of the electricity consumption forecast errors. Nevertheless, reconciliation methods can be used to obtain forecasts for the intermediate levels of the hierarchy.

Item: Instance-based Label Smoothing for Better Classifier Calibration (Tartu Ülikool, 2020) Abdelrahman, Mohamed Maher; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Binary classification is one of the fundamental tasks in machine learning, which involves assigning one of two classes to an instance defined by a set of features. Although accurate predictions are essential in most tasks, knowing the model confidence is indispensable in many of them. Many probabilistic classifiers’ predictions are not well-calibrated and tend to be overconfident, requiring further calibration as a post-processing step after model training. Logistic calibration is one of the most popular calibration methods, which fits a logistic regression model to map the outputs of a classification model into calibrated class probabilities.
Various regularization methods could be applied to logistic regression fitting to reduce its overfitting on the training set. Platt scaling is one of these methods, which applies label smoothing to the class labels and transforms them into target probabilities before fitting the model to reduce its overconfidence. Label smoothing is also widely used in classification neural networks. In previous works, it was shown that label smoothing has a positive calibration and generalization effect on the network predictions. However, it erases information about the similarity structure of the classes by treating all incorrect classes as equally probable, which impairs the distillation performance of the network model. In this thesis, we aim to find better ways of reducing overconfidence in logistic regression. We derive the formula of a Bayesian approach for the optimal predicted probabilities when the generative model distribution of the dataset is known. This formula is then approximated by a sampling approach so that it can be applied in practice. Additionally, we propose a new instance-based label smoothing method for logistic regression fitting. This method motivated us to present a novel label smoothing approach that enhanced the distillation and calibration performance of neural networks compared with standard label smoothing. The evaluation experiments confirmed that the approximated formula for the derived optimal predictions significantly outperforms all other regularization methods on synthetic datasets with a known generative model distribution. However, in more realistic scenarios where this distribution is unknown, our proposed instance-based label smoothing performed better than Platt scaling on most of the synthetic and real-world datasets in terms of log loss and calibration error.
In addition, neural networks trained with instance-based label smoothing outperformed those trained with standard label smoothing in terms of log loss, calibration error, and network distillation.

Item: Isejuhtiva auto objektituvastusmudeli riski jaotuse hindamine Poissoni protsessiga (2020) Põru, Getter; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Matemaatika ja statistika instituut
The aim of this master's thesis is to estimate the distribution of object detection risks for a self-driving car from camera images. Vehicles and pedestrians are treated separately. To estimate missed detections, the frames in which detected objects were first detected are used, based on the knowledge that in some situations the object is already present and detectable in the frame preceding the first detection. Thus a potential missed detection occurred in the frame preceding the first detection, and since the object's location in two consecutive frames is very similar, the location of the missed detection is approximated by the location of the first detection. To obtain a better estimate of missed detections, situations where the object appeared from behind another object or from outside the frame are filtered out. This approach makes it possible to estimate the distribution of an object detection model's missed detections also during deployment in a novel situation for which no annotated data are available. The theoretical part of the thesis gives an overview of Poisson processes and of estimating the density functions of the Gaussian mixture model and the gamma distribution. In the practical part, intensity functions of Poisson processes are fitted to the data. The intensity function is defined using a gamma distribution and a Gaussian mixture model fitted to the marginal distributions of the object location coordinates.

Item: Kasutajaliidese loomine JavaScript-tehnoloogiatega töövootarkvara m:Profiler näitel (Tartu Ülikool, 2009) Paas, Aivo; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut

Item: Klassifitseerija kalibreerituse testi võimsuse suurendamine (Tartu Ülikool, 2020) Valk, Kaspar; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
In machine learning, a classifier is said to be calibrated if its predicted class probabilities match the actual class distribution of the data. In classification tasks where safety matters, it is important that the classifier’s predictions are neither over- nor underconfident but calibrated. Calibration can be evaluated using the ECE measure, and based on its value it is possible to construct a calibration test: a statistical test which makes it possible to check whether the hypothesis that the model is calibrated holds. In the thesis, experiments were performed to find optimal parameters for calculating ECE, so that the calibration test based on it would be as powerful as possible. That is, for a miscalibrated classifier the test would reject the null hypothesis that the model is calibrated as frequently as possible. The work concluded that to make the calibration test as powerful as possible, the datapoints should be placed into separate bins when calculating ECE. If the dataset is expected to contain datapoints for which the classifier is largely miscalibrated, then it is best to use a variant of ECE with the logarithmic distance measure inspired by Kullback-Leibler divergence. Otherwise, it is more reasonable to use absolute or square distance. These recommendations differ significantly from the conventional parameter values used when calculating ECE in previous scientific literature. The results of this thesis allow for improved identification of miscalibration in classifiers.

Item: Konvolutsiooniliste neurovõrkude kalibreerimine järkjärgulise külmutamise meetodiga (Tartu Ülikool, 2023) Savolainen, Oliver; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Neural networks have been used very successfully in many fields and tasks; one of the best examples is the use of convolutional neural networks for image classification. However, these models often cannot be trusted, because they are generally overconfident in their predictions, that is, they assign too high a probability to the predicted class. For this reason, the models need to be calibrated. Several calibration methods have been created to solve this problem, one of the most effective being temperature scaling. This thesis investigates gradual freezing as a means of calibration, more precisely of reducing overconfidence. Gradual freezing refers to a method of training a neural network in which, at certain points, the weights of selected layers stop being updated. The thesis used the source code and results of Kängsepp's 2018 master's thesis. First, the best ways of applying the method were determined based on one model and dataset. Two gradual freezing schemes were selected, and then several convolutional neural networks were implemented with the help of Kängsepp's work, gradual freezing was applied to them, and they were trained. The results of the resulting models were compared with the results of Kängsepp's work. Although it cannot be claimed, based on the metrics studied, that gradual freezing helped to reduce overconfidence, freezing did reduce training time while achieving similar results for many of the models.

Item: Machine learning for assessing toxicity of chemicals identified with mass spectrometry (Tartu Ülikool, 2023) Rahu, Ida; Kruve, Anneli, supervisor; Kull, Meelis, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Real-world samples can contain hundreds to thousands of chemicals, with endocrine-disrupting chemicals (EDCs) posing a severe threat to human health. Unfortunately, reliable and rapid methods for detecting these compounds in complex mixtures are lacking.
One potential solution could be to leverage the capabilities of non-target liquid chromatography high-resolution mass spectrometry (LC/HRMS) combined with machine learning methods. This study aimed to investigate whether the biochemical activity of compounds can be estimated based on chemical fingerprints calculated from HRMS spectra, and thereby to flag the compounds that require further analysis due to the potential risk they pose to human health. To this end, several classification models based on a variety of machine learning algorithms were trained, and their accuracy was evaluated using chemical fingerprints derived from experimental mass spectra. As a result, it was found that the proposed methodology has great potential in the field of in silico toxicology.

Item: Masinõppe rakendamine makseviivituse tõenäosuse hindamisel (Tartu Ülikool, 2022) Praks, Martti; Kängsepp, Markus, supervisor; Kull, Meelis, supervisor; Kõiv, Kuldar, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Estimating the probability of default is one of a financial institution's key activities in assessing credit risk. The probability of default is an important indicator used to decide whether and on what terms to grant credit, and to monitor the quality of the entire portfolio of credit products. Broadly, the models in use can be divided into two groups: statistical approaches and machine learning techniques. The main result of this master's thesis is a comparison between models built with logistic regression and with other machine learning methods, using real data from AS LHV Group. The thesis demonstrates the results of default estimation models obtained with the different methods and discusses the advantages of each. According to the metrics, the best result was achieved by a decision forest, which is based on the decision tree algorithm. The thesis applies various methods for explaining the decision forest model, supporting the use of this method in practice.
Given the recent success of various decision-tree-based machine learning methods in many fields, the results obtained are not surprising. The predictions of the decision forest are explained through the model's general relationships with the features of the dataset and, more concretely, by illustrating how the predictions for individual examples are formed. Whether these results are sufficient for using a decision forest in practice is left for the end user to decide.