Disentanglement of features in variational autoencoders
Date
2022
Publisher
Tartu Ülikool
Abstract
Machine learning models, especially neural networks, have shown excellent performance
in image classification. The features these models learn are often complex and
hard to interpret. Learning disentangled features from images is a way to tackle explainability
and create features with semantic meaning. A learned feature is disentangled if
it represents only a single property of an object. For example, given an image of
a chair, we would expect one feature to change its size and nothing else, and another
feature to change the shape of its legs and nothing else. Beta variational autoencoders
(β-VAE) have shown promising performance in learning disentangled features from
images without supervision. Given enough unlabelled data, the model can learn the features
without needing large amounts of labelled data. After the features are learned, a
smaller amount of labelled data suffices to train an additional model on top of them
(few-shot learning). Experiments with β-VAE architectures have so far used simple
images with known generative factors. Usually, all generative factors are independent,
and the architecture assumes that there are only a few of them. Recently, a new
dataset, Boxhead, was published in which some features are dependent.
Experiments with existing architectures showed that β-VAE-based architectures capture
those features relatively poorly. Based on an exploratory analysis of β-VAE-based
models, we propose a new architecture to improve the results. For
evaluation, we introduce new metrics in addition to the commonly used ones. Our results
showed no substantial performance difference between our proposed architecture and the β-VAE architectures.
Based on the results of the main experiments, we conduct additional exploratory
experiments on a dataset where the object does not rotate.
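
For readers unfamiliar with the β-VAE mentioned above, the following is a minimal sketch of the commonly cited β-VAE training objective: a reconstruction term plus a KL divergence term weighted by β. It is not taken from the thesis. The function name `beta_vae_loss`, the squared-error reconstruction term, and the default `beta=4.0` are assumptions made for illustration, and the encoder is assumed to output the mean and log-variance of a diagonal Gaussian.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Illustrative beta-VAE loss (hypothetical helper, not the thesis code).

    x, x_recon : arrays of shape (batch, ...) with inputs and reconstructions
    mu, logvar : arrays of shape (batch, latent_dim), the parameters of the
                 diagonal Gaussian posterior q(z | x)
    beta       : weight on the KL term; beta = 1 gives the standard VAE loss
    """
    x = np.asarray(x, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    mu = np.asarray(mu, dtype=float)
    logvar = np.asarray(logvar, dtype=float)
    batch = x.shape[0]

    # Reconstruction term: summed squared error, averaged over the batch
    # (assumption: a Gaussian likelihood; a Bernoulli likelihood is also common).
    recon = np.sum((x - x_recon) ** 2) / batch

    # Closed-form KL divergence between N(mu, diag(exp(logvar))) and N(0, I),
    # summed over latent dimensions and averaged over the batch.
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar)) / batch

    return recon + beta * kl, recon, kl
```

Values of β above 1 put more pressure on the posterior to match the factorised prior, which is the mechanism usually credited with encouraging disentangled latent features, at some cost in reconstruction quality.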
Keywords
machine learning, variational autoencoder, unsupervised learning, image processing, disentanglement