Calibration of Multi-Class Probabilistic Classifiers
Date
2022
Publisher
Tartu Ülikool
Abstract
Probabilistic classifiers, machine learning models that predict probability distributions
over classes, are not guaranteed to produce reliable probabilities. A classifier is considered
calibrated if its predicted probabilities correspond to the actual class distribution.
Calibration is essential in safety-critical tasks where small deviations between the predicted
probabilities and the actual class distribution can incur large costs. A common
approach to improve the calibration of a classifier is to use a hold-out data set and a
post-hoc calibration method to learn a correcting transformation for the classifier’s output.
This thesis explores the field of post-hoc calibration methods for classification tasks with
multiple output classes: several existing methods are visualized and compared, and three
new non-parametric post-hoc calibration methods are proposed. The proposed methods
are shown to work well with data sets with fewer classes, managing to improve the
state-of-the-art in some cases. The basis of the three suggested algorithms is the assumption
of similar calibration errors in close neighborhoods on the probability simplex, which
has been previously used but never clearly stated in the calibration literature. Overall,
the thesis offers additional insight into the field of multi-class calibration and allows for
the construction of more trustworthy classifiers.
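The post-hoc setting described in the abstract, learning a correcting transformation on a hold-out set, can be illustrated with temperature scaling, a standard parametric baseline rather than one of the methods proposed in the thesis. All data and function names below are illustrative; a single temperature parameter is fitted by grid search to minimize the hold-out negative log-likelihood:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: divide logits by T before normalizing.
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(probs, labels):
    # Negative log-likelihood of the true classes under the predicted probabilities.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.1, 5.0, 200)):
    # Choose the temperature that minimizes NLL on the hold-out set.
    losses = [nll(softmax(logits, T), labels) for T in grid]
    return grid[int(np.argmin(losses))]

# Toy hold-out set: deliberately overconfident logits for a 3-class problem.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=500)
logits = rng.normal(size=(500, 3))
logits[np.arange(500), labels] += 2.0  # make the true class the likely one...
logits *= 3.0                          # ...then inflate confidence across the board
T = fit_temperature(logits, labels)
print(T)  # T > 1 shrinks the inflated logits back toward calibrated probabilities
```

Temperature scaling is parametric and accuracy-preserving (it never changes the arg-max class); the non-parametric methods explored in the thesis instead learn more flexible corrections over the probability simplex.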
Keywords
machine learning, multi-class, classifier, calibration