Calibration of Multi-Class Probabilistic Classifiers

dc.contributor.advisor: Kull, Meelis, supervisor
dc.contributor.author: Valk, Kaspar
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-08-24T09:20:33Z
dc.date.available: 2023-08-24T09:20:33Z
dc.date.issued: 2022
dc.description.abstract: Classifiers, machine learning models that predict probability distributions over classes, are not guaranteed to produce realistic output. A classifier is considered calibrated if the produced output is in correspondence with the actual class distribution. Calibration is essential in safety-critical tasks where small deviations between the predicted probabilities and the actual class distribution can incur large costs. A common approach to improve the calibration of a classifier is to use a hold-out data set and a post-hoc calibration method to learn a correcting transformation for the classifier’s output. This thesis explores the field of post-hoc calibration methods for classification tasks with multiple output classes: several existing methods are visualized and compared, and three new non-parametric post-hoc calibration methods are proposed. The proposed methods are shown to work well with data sets with fewer classes, managing to improve the state of the art in some cases. The basis of the three suggested algorithms is the assumption of similar calibration errors in close neighborhoods on the probability simplex, which has been previously used but never clearly stated in the calibration literature. Overall, the thesis offers additional insight into the field of multi-class calibration and allows for the construction of more trustworthy classifiers. [et]
dc.identifier.uri: https://hdl.handle.net/10062/91734
dc.language.iso: eng [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: machine learning [et]
dc.subject: multi-class [et]
dc.subject: classifier [et]
dc.subject: calibration [et]
dc.subject.other: magistritööd (master's theses) [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Calibration of Multi-Class Probabilistic Classifiers [et]
dc.type: Thesis [et]

Files

Original bundle

Name: Valk_informaatika_2022.pdf
Size: 16.67 MB
Format: Adobe Portable Document Format

Name: msc-knn-cal-main.zip
Size: 5.84 MB
Format: Compressed ZIP
Description: Appendices

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission