On Self-Supervised Learning of Vision Transformers for Fingerprint Presentation Attack Detection

With the rapid growth of fingerprint-based biometric systems, it is imperative to ensure their security and reliability, given the serious vulnerabilities to which they are exposed. Presentation attacks, in which a fake or altered fingerprint is presented to the sensor, are one such threat, and fingerprint presentation attack detection (FPAD) methods have proven effective against them in numerous studies. Despite the work done over the past years, these detection systems share a key shortcoming: because fingerprint images captured by different sensors exhibit unique characteristics arising from the various sensing technologies, sensor noise, and resolutions, the methods fail to generalize well to unseen data and thus remain prone to attacks. It is therefore important to enhance the generalizability of FPAD algorithms in cross-sensor and cross-material settings. In this work, we introduce two self-supervised learning approaches, one based on Masked Image Modeling (MIM) and one on image-text alignment, both utilizing vision transformer backbones, to create robust models that improve cross-sensor performance without any knowledge of the target sensor. Although both approaches demonstrate excellent generalizability, we show that the MIM-based framework outperforms the text-alignment-based framework for the task of FPAD.
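The MIM pretraining strategy mentioned above can be illustrated with a minimal patch-masking sketch: an image is split into non-overlapping patch tokens, a random majority of them is hidden, and the model is trained to reconstruct the hidden patches from the visible ones. The function below is a hypothetical NumPy illustration of that masking step only (the function name, patch size, and 75% mask ratio are assumptions for the example, not details taken from the thesis).

```python
import numpy as np

def mask_patches(image, patch_size=4, mask_ratio=0.75, seed=0):
    """Split a square grayscale image into non-overlapping patch tokens
    and randomly mask a fraction of them, as in MIM-style pretraining.
    Illustrative sketch only; not the thesis implementation."""
    h, w = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    # Rearrange the image into (num_patches, patch_dim) flattened tokens.
    patches = (image.reshape(h // patch_size, patch_size,
                             w // patch_size, patch_size)
                    .swapaxes(1, 2)
                    .reshape(-1, patch_size * patch_size))
    rng = np.random.default_rng(seed)
    num_masked = int(mask_ratio * patches.shape[0])
    mask = np.zeros(patches.shape[0], dtype=bool)
    mask[rng.choice(patches.shape[0], num_masked, replace=False)] = True
    visible = patches[~mask]   # the encoder sees only these tokens
    targets = patches[mask]    # the decoder must reconstruct these
    return patches, mask, visible, targets

# Example: a 16x16 image -> 16 patches of dim 16; 12 are masked at 75%.
img = np.arange(256, dtype=float).reshape(16, 16)
patches, mask, visible, targets = mask_patches(img)
print(patches.shape, int(mask.sum()), visible.shape, targets.shape)
```

A pretrained encoder that learns to recover masked fingerprint patches must model ridge structure itself rather than sensor-specific artifacts, which is one intuition for why MIM pretraining can aid cross-sensor generalization.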

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Dr. Karthik Nandakumar, Dr. Shijian Lu

With a 2-year embargo period
