Multihead Attention Enhanced Memory Augmented Neural Network for Multimodal Trajectory Prediction
Publisher
Tartu Ülikool
Abstract
Autonomous driving has attracted increasing interest over the last two decades. One of the
problems in autonomous driving that researchers are actively trying to solve is agent
trajectory prediction: predicting the future trajectories of surrounding agents such as other
cars, cyclists, pedestrians, and any other road users around an autonomous vehicle. Deep
learning has shown promising results in tackling the problem. Various deep learning
approaches have been applied to it, and one of them combines a Memory Augmented Neural
Network (MANN) with a multi-head attention layer. Memory augmented neural networks have
been proposed in the literature for multimodal trajectory prediction (in a model called
Memory Augmented Networks for Multiple Trajectory Prediction, or MANTRA), but they do
not use multi-head attention layers. Meanwhile, multi-head attention layers have also been
investigated in the literature, but in different contexts within this research topic.
In this work we propose two models, both of which add multi-head attention layers to the
memory augmented neural network model. We name the models Multihead Attention
Enhanced MANN (MAEMANN) 1 and 2, abbreviated MAEMANN-1 and MAEMANN-2. Similar to
MANTRA, MAEMANN uses an AutoEncoder, a Memory Controller, and an iterative refinement
module (IRM). While the AutoEncoder and Memory Controller are responsible for memory,
the IRM combines the output from the memory with input from the surrounding agents in
the environment. MAEMANN-1 uses a multi-head self-attention layer in the memory network
to improve future trajectory prediction by attending to multiple neighboring memories, while
MAEMANN-2 uses multi-head attention in the IRM to improve perception of the surrounding
agents. Our experimental results show that both MAEMANNs (i.e. models 1 and 2)
outperform the MANTRA model when tested on the KITTI dataset, where we predict 4
seconds of future trajectory given 2 seconds of past trajectory. In multimodal prediction
with 5 modes, MAEMANN-1 improves the Final Displacement Error (FDE) and
Average Displacement Error (ADE) at t = 4 seconds by 10.58% and 9.24%, respectively. For
MAEMANN-2, the corresponding improvements in FDE and ADE are 14.39% and 13.47%.
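
For readers unfamiliar with the two metrics, the following is a minimal sketch of how ADE and FDE are commonly computed, and how they are often reported for multimodal prediction. It is not taken from the thesis: the function names, the toy trajectories, and the best-of-K convention (lowest error over the K predicted modes, with ADE and FDE minimised independently) are assumptions made for illustration.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length lists of (x, y) positions in metres,
    one entry per future time step.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each step.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # average over the whole horizon
    fde = dists[-1]                # error at the final time step only
    return ade, fde

def best_of_k(preds, truth):
    """Assumed multimodal convention: lowest ADE and lowest FDE over K modes."""
    errors = [displacement_errors(p, truth) for p in preds]
    return min(e[0] for e in errors), min(e[1] for e in errors)

# Hypothetical toy data: a straight-line ground truth and two predicted modes.
truth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
modes = [
    [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 3.0)],  # exact until the last step
    [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (4.0, 1.0)],  # constant 1 m offset
]
ade, fde = best_of_k(modes, truth)
print(ade, fde)
```

In this toy case the first mode gives the lower ADE and the second the lower FDE, which shows why the two metrics can rank predictions differently.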
Keywords
Trajectory prediction, Memory Augmented Neural Network, Autonomous Driving, Transformer