Radial Softmax: A Novel Activation Function for Neural Networks to Reduce Overconfidence in Out-Of-Distribution Data
Date
2020
Publisher
Tartu Ülikool
Abstract
Neural networks are widely used and give state-of-the-art results in fields such as machine
translation, image classification, and speech recognition. These networks operate under
the assumption that they predict on data drawn from the same distribution as the
training data. When this assumption is violated, the model outputs incorrect results, often
with very high confidence. In this work we explain why the commonly used softmax
function is unable to mitigate these problems and propose a new function, called radial
softmax, designed to reduce out-of-distribution (OOD) overconfidence. We show
that radial softmax mitigates OOD overconfidence in almost all
cases. Based on our literature review, this is the first time an improvement to softmax
has been proposed for this issue. We also show that no changes to the training cycle or
to intermediate activation functions are needed: with this function, models can be made
more resistant to OOD data without modifications to the larger architecture
or the training cycle. Models known to be resistant to OOD data allow us to be
more confident in their output and to use them for applications where mistakes are
unacceptable, such as healthcare, the defence industry, or autonomous driving.
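
To make the overconfidence problem concrete, the minimal sketch below (an illustration of standard softmax, not code from the thesis) shows the saturation behaviour the abstract refers to: softmax depends only on relative logit differences, so scaling a logit vector by a constant drives the top probability toward 1. An OOD input that happens to produce large logits is therefore reported with near-certain confidence.

    import numpy as np

    def softmax(z):
        """Standard softmax; shifted by the max for numerical stability."""
        e = np.exp(z - np.max(z))
        return e / e.sum()

    # The same logit pattern at growing magnitudes: softmax saturates,
    # assigning near-total confidence to the largest logit.
    for scale in (1.0, 5.0, 25.0):
        logits = scale * np.array([1.0, 0.5, -0.5])
        print(scale, softmax(logits).round(4))
    # scale=1.0  -> [0.5466, 0.3315, 0.1220]
    # scale=25.0 -> [1.0, 0.0, 0.0] (rounded): maximal confidence,
    # regardless of whether the input resembles the training data.

The radial softmax proposed in the thesis replaces this output activation; its exact definition is given in the full text, not in this record.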
Keywords
Neural Networks, Machine Learning, Softmax, Out-of-Distribution data, Overconfidence