Computer Vision Faculty Publications

Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning

Chun Mei Feng, A-Star, Institute of High Performance Computing
Kai Yu, A-Star, Institute of High Performance Computing
Yong Liu, A-Star, Institute of High Performance Computing
Salman Khan, Mohamed Bin Zayed University of Artificial Intelligence
Wangmeng Zuo, Harbin Institute of Technology

Document Type

Conference Proceeding

Publication Title

Proceedings of the IEEE International Conference on Computer Vision

Abstract

Thanks to prompt tuning, pre-trained vision-language models, e.g., CLIP, have shown promising performance on a wide range of downstream tasks in recent years. In this paper, we focus on a particular setting in which adaptive prompts are learned on the fly for each test sample from an unseen new domain, known as test-time prompt tuning (TPT). Existing TPT methods typically rely on data augmentation and confidence selection. However, conventional data augmentation techniques, e.g., random resized crops, suffer from a lack of data diversity, while entropy-based confidence selection alone is not sufficient to guarantee prediction fidelity. To address these issues, we propose a novel TPT method, named DiffTPT, which leverages pre-trained diffusion models to generate diverse and informative new data. Specifically, we incorporate data augmented by both conventional methods and a pre-trained Stable Diffusion model to exploit their respective merits, improving the model's ability to adapt to unknown new test data. Moreover, to ensure the prediction fidelity of the generated data, we introduce a cosine similarity-based filtration technique that selects the generated data with higher similarity to the single test sample. Our experiments on test datasets with distribution shifts and unseen categories demonstrate that DiffTPT improves zero-shot accuracy by an average of 5.13% compared to the state-of-the-art TPT method.
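
The two selection steps named in the abstract, entropy-based confidence selection and cosine similarity-based filtration, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `keep_ratio` parameter, and the plain-list feature vectors are all hypothetical. The abstract does not state how many views are kept or which encoder produces the features; the sketch simply assumes non-zero feature vectors (for example, from an image encoder such as CLIP) and per-view class probabilities are already available.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filter_by_cosine(test_feat, aug_feats, keep_ratio=0.5):
    """Keep the augmented views most similar to the single test sample.

    keep_ratio is an assumed hyperparameter, not a value from the paper.
    Returns the kept indices (most similar first) and all similarities.
    """
    sims = [cosine_similarity(test_feat, f) for f in aug_feats]
    n_keep = max(1, math.ceil(keep_ratio * len(aug_feats)))
    ranked = sorted(range(len(aug_feats)), key=lambda i: sims[i], reverse=True)
    return ranked[:n_keep], sims


def select_by_entropy(probs, keep_ratio=0.5):
    """Keep the views whose predicted class distribution has lowest entropy.

    Each row of probs is assumed to be a probability distribution over classes.
    Returns the kept indices (most confident first) and all entropies.
    """
    entropies = [-sum(p * math.log(p) for p in row if p > 0) for row in probs]
    n_keep = max(1, math.ceil(keep_ratio * len(probs)))
    ranked = sorted(range(len(probs)), key=lambda i: entropies[i])
    return ranked[:n_keep], entropies


if __name__ == "__main__":
    # Toy 2-D features: one test sample and four augmented views.
    test_feature = [1.0, 0.0]
    augmented = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0], [-1.0, 0.0]]
    kept, sims = filter_by_cosine(test_feature, augmented)
    print("kept by cosine similarity:", kept)

    # Toy class probabilities for three views.
    predictions = [[0.9, 0.1], [0.5, 0.5], [0.99, 0.01]]
    confident, _ = select_by_entropy(predictions)
    print("kept by entropy:", confident)
```

In this sketch the two filters are independent functions, so they can be applied one after the other or separately. How DiffTPT combines them is described in the paper itself, not here.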

First Page

2704

Last Page

2714

DOI

10.1109/ICCV51070.2023.00255

Publication Date

1-1-2023

Recommended Citation

C. Feng et al., "Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning," Proceedings of the IEEE International Conference on Computer Vision, pp. 2704-2714, Jan 2023.

The definitive version is available at https://doi.org/10.1109/ICCV51070.2023.00255

This document is currently not available here.
