Guest Editorial Introduction to the Special Section on Transformer Models in Vision
Document Type
Editorial
Publication Title
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract
Transformer models have achieved outstanding results on a variety of language tasks, such as text classification, machine translation, and question answering. This success in the field of Natural Language Processing (NLP) has sparked interest in the computer vision community in applying these models to vision and multi-modal learning tasks. However, visual data has a unique structure, which requires rethinking network designs and training methods. As a result, Transformer models and their variations have been successfully used for image recognition, object detection, segmentation, image super-resolution, video understanding, image generation, text-to-image synthesis, and visual question answering, among other applications.
First Page
12721
Last Page
12725
DOI
10.1109/TPAMI.2023.3306164
Publication Date
10-3-2023
Keywords
Special issues and sections, Transformers, Text categorization, Machine translation, Natural language processing
Recommended Citation
S. Khan et al., "Guest Editorial Introduction to the Special Section on Transformer Models in Vision," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 11, pp. 12721-12725, Oct. 2023.
The definitive version is available at https://doi.org/10.1109/TPAMI.2023.3306164
Comments
IR Deposit conditions:
OA version (pathway a) Accepted version
No embargo
When accepted for publication, a set statement must accompany the deposit (see policy)
Must link to publisher version with DOI
Publisher copyright and source must be acknowledged