Computer Vision Faculty Publications

Face Pyramid Vision Transformer

Khawar Islam, FloppyDisk.AI
Muhammad Zaigham Zaheer, Mohamed Bin Zayed University of Artificial Intelligence
Arif Mahmood, Information Technology University

Document Type

Conference Proceeding

Publication Title

BMVC 2022 - 33rd British Machine Vision Conference Proceedings

Abstract

A novel Face Pyramid Vision Transformer (FPVT) is proposed to learn discriminative multi-scale facial representations for face recognition and verification. In FPVT, Face Spatial Reduction Attention (FSRA) and Dimensionality Reduction (FDR) layers are employed to make the feature maps compact, thus reducing the computations. An Improved Patch Embedding (IPE) algorithm is proposed to exploit the benefits of CNNs in ViTs (e.g., shared weights, local context, and receptive fields) to model features ranging from lower-level edges to higher-level semantic primitives. Within the FPVT framework, a Convolutional Feed-Forward Network (CFFN) is proposed that extracts locality information to learn low-level facial information. The proposed FPVT is evaluated on seven benchmark datasets and compared with ten existing state-of-the-art methods, including CNNs, pure ViTs, and Convolutional ViTs. Despite having fewer parameters, FPVT has demonstrated excellent performance over the compared methods. The project page is available at https://khawar-islam.github.io/fpvt/.
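
To illustrate why compacting the feature maps reduces computation, the sketch below counts the attention-score entries for one head with and without spatial reduction. It is not taken from the paper: it assumes FSRA behaves like the spatial-reduction attention of the Pyramid Vision Transformer, where keys and values are downsampled by a factor along each spatial axis while queries keep full resolution. The function name `attention_cost`, the 28x28 feature map, and the reduction ratio of 4 are hypothetical values chosen only for illustration.

```python
def attention_cost(height, width, reduction=1):
    """Return (num_queries, num_keys, score_entries) for one attention head.

    Assumption (not from the paper): keys/values are spatially downsampled
    by `reduction` along each axis, as in PVT-style spatial-reduction
    attention, while queries stay at full resolution.
    """
    if height % reduction or width % reduction:
        raise ValueError("height and width must be divisible by reduction")
    num_queries = height * width
    num_keys = (height // reduction) * (width // reduction)
    # The attention score matrix has one entry per (query, key) pair.
    return num_queries, num_keys, num_queries * num_keys


# Hypothetical 28x28 feature map, with and without a reduction ratio of 4.
full = attention_cost(28, 28)
reduced = attention_cost(28, 28, reduction=4)
print(full)                     # (784, 784, 614656)
print(reduced)                  # (784, 49, 38416)
print(full[2] / reduced[2])     # 16.0
```

Under this assumption, a reduction ratio of R shrinks the score matrix by a factor of R squared, which is the kind of saving the abstract attributes to making the feature maps compact.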

Publication Date

11-24-2022

Keywords

Computer vision, Convolution, Semantics

Comments

IR conditions: non-described

Open Access version available on BMVC

Recommended Citation

K. Islam et al., "Face Pyramid Vision Transformer," BMVC 2022 - 33rd British Machine Vision Conference Proceedings, Nov 2022.

Additional Links

https://bmvc2022.mpi-inf.mpg.de/758/

Included in

Artificial Intelligence and Robotics Commons
