Commands for autonomous vehicles by progressively stacking visual-linguistic representations

Document Type

Conference Proceeding

Publication Title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract

In this work, we focus on the object referral problem in the autonomous driving setting. We use a stacked visual-linguistic BERT model to learn a generic visual-linguistic representation. Each element of the input is either a word or a region of interest from the input image. To train the deep model efficiently, we use a stacking algorithm to transfer knowledge from a shallow BERT model to a deep BERT model.
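The stacking step the abstract describes — transferring knowledge from a shallow BERT to a deep one — can be sketched as duplicating the trained shallow layers to initialize a model twice as deep. This is a minimal illustrative sketch, not the paper's implementation: layers are represented as plain parameter dicts rather than real transformer blocks, and all names are hypothetical.

```python
from copy import deepcopy

def progressive_stack(shallow_layers):
    """Initialize a 2L-layer model from a trained L-layer model by
    copying the shallow stack on top of itself (the stacking idea
    referenced in the abstract, in toy form)."""
    return ([deepcopy(layer) for layer in shallow_layers]
            + [deepcopy(layer) for layer in shallow_layers])

# Hypothetical 3-layer "shallow BERT": each layer is a dict of parameters.
shallow = [{"W": [float(i)]} for i in range(3)]
deep = progressive_stack(shallow)

print(len(deep))          # deep model has twice as many layers
print(deep[3] == shallow[0])  # layer i+L is initialized from layer i
```

Because each layer is deep-copied, the duplicated layers can then be trained independently rather than sharing parameters with their shallow counterparts.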

First Page

27

Last Page

32

DOI

10.1007/978-3-030-66096-3_2

Publication Date

1-3-2021

Keywords

Bidirectional Encoder Representations from Transformers (BERT), image classification, natural language processing

Comments

IR Deposit conditions:

  • OA version (pathway a)
  • Accepted version 12 month embargo
  • Must link to published article
  • Set statement to accompany deposit
