Exploring Out-of-Distribution Detection Using Vision Transformers

dc.contributor.advisor: Kull, Meelis, supervisor
dc.contributor.advisor: Leelar, Bhawani Shankar, supervisor
dc.contributor.author: Haavel, Karl Kaspar
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut
dc.date.accessioned: 2023-08-24T06:12:48Z
dc.date.available: 2023-08-24T06:12:48Z
dc.date.issued: 2022
dc.description.abstract: Current state-of-the-art artificial neural network (ANN) image classifiers perform well on input data drawn from the same distribution they were trained on, known as in-distribution (InD) data, yet perform worse on out-of-distribution (OOD) samples. An input can be OOD for many reasons: it may contain a new concept (e.g. a new class), or it may be corrupted by sensor noise. Knowing whether a new data point is OOD is necessary for deploying models in safety-critical real-world applications (e.g. self-driving cars, healthcare) so that safer decisions can be made. For example, a self-driving car can slow down when it detects an OOD object, or hand control back to the human driver. The primary method for OOD detection is to use an ANN as a feature extractor: the new data point is mapped into the embedding space and its embedding is compared to the training embeddings using distance metrics. We use a Vision Transformer (ViT) as the ANN because there has been a push to use large-scale pre-trained Transformers to improve a range of OOD tasks. The improvements stem from ViT's state-of-the-art performance as a feature extractor, which can be used out-of-the-box for OOD detection, whereas convolutional neural networks (CNNs) require custom training methods and exposure to OOD data to reach similar results. In this thesis, a ViT was used as a feature extractor, and OOD detection performance was compared across various distance metrics to determine robustness and to choose the best distance metric in ViT's embedding space. Three separate experiments were conducted with multiple datasets, methods, models and approaches. The experiments showed that ViT is capable of OOD detection out-of-the-box, without any custom training methods or exposure to OOD data. However, none of the distance metrics noticeably improved on the results obtained with the baseline Mahalanobis distance.
Nonetheless, ViT has considerably better OOD detection performance on most datasets and generalises better than a similarly trained CNN. Furthermore, ViT is more robust across distance metrics, indicating that the features extracted from the model are good enough to discriminate between InD and OOD. Finally, it was shown that ViT with the Mahalanobis distance has the best OOD detection performance when InD and OOD data are blended at various ratios. Future work could ensemble multiple distance metrics to exploit the properties of each, and apply the same methodology to other ANN architectures.
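The baseline described in the abstract, Mahalanobis distance between a new embedding and class-conditional Gaussians fitted on InD embeddings, can be sketched as follows. This is an illustrative reconstruction, not the thesis's code: embedding extraction (e.g. by a ViT) is assumed to have happened already, and the function names are our own.

```python
import numpy as np

def fit_mahalanobis(train_embeddings, train_labels):
    """Fit per-class means and a shared precision matrix on InD embeddings.

    Illustrative sketch of the common Mahalanobis baseline; the thesis's
    actual implementation details are not reproduced here.
    """
    classes = np.unique(train_labels)
    means = {c: train_embeddings[train_labels == c].mean(axis=0) for c in classes}
    # Shared (tied) covariance estimated over class-centred embeddings.
    centred = np.concatenate(
        [train_embeddings[train_labels == c] - means[c] for c in classes]
    )
    cov = np.cov(centred, rowvar=False)
    prec = np.linalg.pinv(cov)  # pseudo-inverse for numerical safety
    return means, prec

def ood_score(x, means, prec):
    """Negative of the minimum class-conditional Mahalanobis distance.

    Lower (more negative) scores mean the input is far from every InD
    class centroid, i.e. more likely to be OOD.
    """
    dists = [np.sqrt((x - mu) @ prec @ (x - mu)) for mu in means.values()]
    return -min(dists)
```

In use, embeddings of a held-out InD set would receive higher scores than OOD samples, and a threshold on the score yields the detector whose performance metrics (e.g. AUROC) the thesis compares across distance metrics.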
dc.identifier.uri: https://hdl.handle.net/10062/91717
dc.language.iso: eng
dc.publisher: Tartu Ülikool
dc.rights: openAccess
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: deep learning
dc.subject: neural networks
dc.subject: vision transformer
dc.subject: out-of-distribution detection
dc.subject.other: magistritööd (master's theses)
dc.subject.other: informaatika (informatics)
dc.subject.other: infotehnoloogia (information technology)
dc.subject.other: informatics
dc.subject.other: infotechnology
dc.title: Exploring Out-of-Distribution Detection Using Vision Transformers
dc.type: Thesis

Files

Original bundle

Name: haavel_datascience_2022.pdf
Size: 2.51 MB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission