In Search of the Best Activation Function

dc.contributor.advisor: Matiisen, Tambet (supervisor)
dc.contributor.author: Liibert, Marti Ingmar
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut
dc.date.accessioned: 2023-08-24T09:49:45Z
dc.date.available: 2023-08-24T09:49:45Z
dc.date.issued: 2022
dc.description.abstract: The choice of activation function in a neural network can significantly affect its performance. Designing and discovering new activation functions that improve performance or address shortcomings of existing ones is an active research field. In this thesis, a trainable activation function is proposed: a weighted linear combination of activation functions whose weights are normalized with Softmax, inspired by the DARTS network architecture search method. The activation function is applied at the layer, kernel, and neuron levels. The activation-function weights are optimized on the training loss and on the validation loss, as in DARTS. The activation function was tested on two simple datasets (sine wave and spiral), on image classification tasks, and on a robotics task. For image classification, using the trainable activation function for initial training increased accuracy by 5% over the baseline on CIFAR10 and by 1% over the baseline on ImageNet. For the robotics task, CartPole, the mean reward increased by 10 points out of a maximum of 200 when reusing the already-learned activation functions with Deep Q-learning; with Proximal Policy Optimization, the mean reward increased by approximately 2 points over the baseline. For future work, more difficult robotics tasks and a longer initial search for image classification could be explored.
dc.identifier.uri: https://hdl.handle.net/10062/91736
dc.language.iso: eng
dc.publisher: Tartu Ülikool
dc.rights: openAccess
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: activation function
dc.subject: trainable activation function
dc.subject: artificial neural network
dc.subject: image classification
dc.subject: reinforcement learning
dc.subject: robotics
dc.subject: CIFAR10
dc.subject: ImageNet
dc.subject: CartPole
dc.subject.other: master's theses
dc.subject.other: informatics
dc.subject.other: information technology
dc.subject.other: informatics
dc.subject.other: infotechnology
dc.title: In Search of the Best Activation Function
dc.type: Thesis
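The abstract describes a Softmax-normalized weighted combination of activation functions, in the spirit of the DARTS mixed operation. A minimal PyTorch sketch of that idea is shown below; the candidate set (ReLU, Tanh, Sigmoid) and the layer-level weight granularity are illustrative assumptions, not the thesis's exact configuration, and the bilevel training/validation optimization from DARTS is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedActivation(nn.Module):
    """Trainable activation: a Softmax-weighted linear combination of
    candidate activation functions (one weight per candidate, i.e.
    layer-level granularity)."""

    def __init__(self, candidates=(torch.relu, torch.tanh, torch.sigmoid)):
        super().__init__()
        self.candidates = candidates
        # One trainable logit per candidate; Softmax turns these into
        # mixture weights that sum to 1.
        self.alpha = nn.Parameter(torch.zeros(len(candidates)))

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)  # normalized mixture weights
        return sum(wi * f(x) for wi, f in zip(w, self.candidates))
```

In a DARTS-style setup, `alpha` would be optimized on the validation loss while the ordinary network weights are optimized on the training loss; applying the mixture at kernel or neuron level would simply give `alpha` an extra dimension per kernel or per neuron.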

Files

Original bundle
Name: liibert_informaatika_129537_2022.pdf
Size: 8.42 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Description: Item-specific license agreed upon to submission