Pattern Recognition and Artificial Intelligence
Pattern Recognition and Artificial Intelligence, 2024, Vol. 37, Issue 9: 839-849. DOI: 10.16451/j.cnki.issn1003-6059.202409007
Researches and Applications
Semantic Topological Maps-Based Reasoning for Vision-and-Language Navigation in Continuous Environments
XIE Zilong¹, XU Ming¹
1. Software College, Liaoning Technical University, Huludao 125105

Abstract  Existing vision-and-language navigation methods reason poorly in continuous environments. To address this, a reasoning method based on semantic topological maps is proposed for vision-and-language navigation in continuous environments. First, regions and objects in the navigation environment are identified through auxiliary scene understanding tasks, and a knowledge base of spatial proximity is constructed. Second, the agent interacts with the environment in real time during navigation, collecting location information, encoding visual features, and predicting semantic labels of regions and objects, so that a semantic topological map is generated incrementally. On this basis, an auxiliary reasoning and localization strategy is designed: a self-attention mechanism extracts object and region information from the navigation instructions, and the spatial proximity knowledge base is combined with the semantic topological map to infer and localize those objects and regions. This strategy assists navigation decisions and keeps the agent's trajectory aligned with the instructions. Experimental results on the public datasets R2R-CE and RxR-CE show that the proposed method achieves a higher navigation success rate.
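The map-building and localization steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that region and object labels are already predicted as plain strings, that the target mentioned in the instruction is given directly (the paper extracts it with a self-attention mechanism, which is omitted here), and that the spatial proximity knowledge base is a simple dictionary of made-up scores. The names `MapNode`, `SemanticTopoMap`, `score_nodes`, and `demo` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MapNode:
    """One visited location with its predicted semantic labels."""
    node_id: int
    position: tuple   # (x, y) location collected during navigation
    region: str       # predicted region label, e.g. "kitchen"
    objects: set = field(default_factory=set)  # predicted object labels


class SemanticTopoMap:
    """Topological map grown one node at a time as the agent moves."""

    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(set)

    def add_node(self, node, connected_to=None):
        self.nodes[node.node_id] = node
        if connected_to is not None:
            # Edges are undirected: the agent can travel either way.
            self.edges[node.node_id].add(connected_to)
            self.edges[connected_to].add(node.node_id)


def score_nodes(topo_map, target, proximity):
    """Score how likely the target is at or near each node.

    `proximity` is an assumed knowledge base mapping (target, label)
    pairs to a closeness score in [0, 1]; unseen pairs count as 0.
    """
    scores = {}
    for node_id, node in topo_map.nodes.items():
        labels = {node.region} | node.objects
        if target in labels:
            scores[node_id] = 1.0  # target already observed at this node
        else:
            scores[node_id] = max(
                proximity.get((target, label), 0.0) for label in labels
            )
    return scores


def demo():
    topo_map = SemanticTopoMap()
    topo_map.add_node(MapNode(0, (0.0, 0.0), "hallway"))
    topo_map.add_node(MapNode(1, (2.0, 0.0), "kitchen", {"fridge"}), connected_to=0)
    topo_map.add_node(MapNode(2, (0.0, 3.0), "bedroom", {"bed"}), connected_to=0)
    # Illustrative proximity values only; not taken from the paper.
    proximity = {
        ("sink", "kitchen"): 0.9,
        ("sink", "fridge"): 0.6,
        ("sink", "bedroom"): 0.1,
    }
    return topo_map, score_nodes(topo_map, "sink", proximity)
```

In this toy setup the instruction target "sink" has not been observed anywhere, yet the kitchen node receives the highest score because the knowledge base records sinks as spatially close to kitchens. That is the kind of inference the abstract attributes to combining the knowledge base with the map.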
Key words: Vision-and-Language Navigation; Visual Reasoning; Multi-modal Data; Embodied Intelligence
Received: 09 May 2024     
CLC Number: TP391.41
Fund: Doctoral Scientific Research Foundation of Liaoning Technical University (No. 21-1027)
Corresponding Author: XU Ming, Ph.D., associate professor. His research interests include spatiotemporal data mining, deep learning and intelligent transportation.
About the author: XIE Zilong, master's student. His research interests include embodied intelligence and robot navigation.
Cite this article:
XIE Zilong, XU Ming. Semantic Topological Maps-Based Reasoning for Vision-and-Language Navigation in Continuous Environments[J]. Pattern Recognition and Artificial Intelligence, 2024, 37(9): 839-849.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202409007      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2024/V37/I9/839
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax:0551-65591176 Email: bjb@iim.ac.cn