Sim-to-Real Generalization of Computer Vision with Domain Adaptation, Style Randomization, and Multi-Task Learning

dc.contributor.advisor: Matiisen, Tambet (supervisor)
dc.contributor.author: Liik, Hannes
dc.contributor.other: University of Tartu. Faculty of Science and Technology
dc.contributor.other: University of Tartu. Institute of Computer Science
dc.date.accessioned: 2023-11-06T13:41:26Z
dc.date.available: 2023-11-06T13:41:26Z
dc.date.issued: 2020
dc.description.abstract: In recent years, supervised deep learning has been very successful in computer vision applications. This success comes at the cost of the large amount of labeled data required to train artificial neural networks, and manual labeling can be very expensive. Semantic segmentation, the task of pixel-wise classification of images, requires painstaking pixel-level annotation, which motivates research into alternatives. One solution is to use simulations, which can generate semantic segmentation ground truth automatically. Unfortunately, in practice, simulation-trained models have been shown to generalize poorly to the real world. This work considers a simulation environment, used to train models for semantic segmentation, and real-world environments, used to evaluate their generalization. Three approaches to improving generalization from simulation to reality are studied. The first uses a generative image-to-image model to make the simulation look realistic. The second uses style randomization, a form of data augmentation based on style transfer, to make the model more robust to changes in visual style. The third uses depth estimation as an auxiliary task to enable learning of geometry. Our results show that the first method, image-to-image translation, improves performance in environments similar to the simulation. With style randomization, the trained models generalized better to completely new environments. The additional depth estimation task did not improve performance, except by a small amount when combined with style randomization.
dc.identifier.uri: https://hdl.handle.net/10062/94055
dc.language.iso: eng
dc.publisher: University of Tartu
dc.rights: openAccess
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Computer Vision
dc.subject: Machine Learning
dc.subject: Deep Learning
dc.subject: Domain Adaptation
dc.subject: Data Augmentation
dc.subject: Multi-Task Learning
dc.subject: Convolutional Neural Networks
dc.subject.other: master's theses
dc.subject.other: informatics
dc.subject.other: infotechnology
dc.title: Sim-to-Real Generalization of Computer Vision with Domain Adaptation, Style Randomization, and Multi-Task Learning
dc.type: Thesis
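
The third approach named in the abstract, depth estimation as an auxiliary task alongside semantic segmentation, can be illustrated with a combined training loss. The following is only a minimal sketch and is not taken from the thesis: the choice of cross-entropy for segmentation, an L1 loss for depth, the function names, and the `depth_weight` value are all assumptions made for illustration.

```python
import math


def segmentation_cross_entropy(probs, labels):
    """Mean pixel-wise cross-entropy.

    probs: list of per-pixel class-probability lists.
    labels: list of ground-truth class indices, one per pixel.
    """
    eps = 1e-12  # avoids log(0)
    total = sum(-math.log(p[y] + eps) for p, y in zip(probs, labels))
    return total / len(labels)


def depth_l1(pred, target):
    """Mean absolute error between predicted and true per-pixel depth."""
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(target)


def multitask_loss(seg_probs, seg_labels, depth_pred, depth_target,
                   depth_weight=0.5):
    """Segmentation loss plus a weighted auxiliary depth loss.

    depth_weight is a hypothetical hyperparameter balancing the two tasks.
    """
    seg = segmentation_cross_entropy(seg_probs, seg_labels)
    depth = depth_l1(depth_pred, depth_target)
    return seg + depth_weight * depth


# Two pixels, two classes: segmentation is perfect, depth is off by 2 on one pixel.
loss = multitask_loss(
    seg_probs=[[1.0, 0.0], [0.0, 1.0]],
    seg_labels=[0, 1],
    depth_pred=[1.0, 2.0],
    depth_target=[1.0, 4.0],
)
```

In a setup like this, a shared encoder would feed both a segmentation head and a depth head, and the weight controls how strongly the auxiliary task shapes the shared features.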

Files

Original bundle
Name: Liik_ComputerScience_2020.pdf
Size: 37.45 MB
Format: Adobe Portable Document Format