Document Type
Conference Proceeding
Publication Title
IJCAI International Joint Conference on Artificial Intelligence
Abstract
Vision-language navigation (VLN) requires an agent to perceive visual observations in a house scene and navigate step by step following natural language instructions. Due to the high cost of data annotation and collection, current VLN datasets provide limited instruction-trajectory samples. Learning vision-language alignment for VLN from limited data is challenging, since visual observations and language instructions are both complex and diverse. Previous works generate augmented data only from the original scenes and fail to produce samples from unseen scenes, which limits the generalization ability of the navigation agent. In this paper, we introduce the Knowledge-driven Environmental Dreamer (KED), a method that leverages knowledge of the embodied environment to generate unseen scenes for a navigation agent to learn from. Generating unseen environments with texture consistency and structure consistency is challenging. To address this problem, we incorporate three knowledge-driven regularization objectives into KED and adopt a reweighting mechanism for self-adaptive optimization. Our KED method generates unseen embodied environments without extra annotations. We use KED to generate 270 houses and 500K instruction-trajectory pairs. The navigation agent trained with KED outperforms state-of-the-art methods on various VLN benchmarks, such as R2R, R4R, and RxR. Both qualitative and quantitative experiments show that our proposed KED method generates high-quality augmented data with texture consistency and structure consistency.
First Page
1840
Last Page
1848
Publication Date
8-19-2023
Keywords
Artificial intelligence, Textures, Visual languages
Recommended Citation
F. Zhu et al., "Vision Language Navigation with Knowledge-driven Environmental Dreamer," IJCAI International Joint Conference on Artificial Intelligence, vol. 2023-August, pp. 1840–1848, Aug. 2023.
Additional Links
https://www.ijcai.org/proceedings/2023/204
Comments
Archived thanks to IJCAI
Uploaded: June 19, 2024