INTUIT-VLNCE: Autonomous Navigation through Vision-and-Language

Date

2024

Publisher

Tartu Ülikool

Abstract

The aim of this thesis was to develop an Embodied Agent with explicit intuition, capable of navigating indoor Continuous Environments from a provided Natural Language instruction and the Agent's egocentric vision, as part of a Vision-and-Language Navigation task. The thesis proposes creating explicit intuition by having the Agent predict not only the action to perform at a given time step, but also the actions it will take in the future. The Agent's policy was trained following the training procedure of the LAW-VLNCE project [1]. Evaluation showed negative results after the proposed method was implemented.

Description

The aim of this thesis was to develop an Embodied Agent with explicit intuition. The Embodied Agent must navigate through a continuous indoor environment using a given Natural Language instruction and egocentric vision, as described in the Vision-and-Language task. The thesis proposes creating explicit intuition: the Agent predicts both the action to be performed at the current moment and the actions it intends to perform in the future.

Keywords

robotics, natural language, embodied agent, autonomy
