Computer Vision Faculty Publications

Is Synthetic Dataset Reliable for Benchmarking Generalizable Person Re-Identification?

Cuicui Kang, Mohamed Bin Zayed University of Artificial Intelligence

Document Type

Conference Proceeding

Publication Title

2022 IEEE International Joint Conference on Biometrics, IJCB 2022

Abstract

Recent studies show that models trained on synthetic datasets can outperform models trained on real-world datasets for generalizable person re-identification (GPReID). Given the limitations of real-world person ReID datasets, it would also be valuable to use large-scale synthetic datasets as test sets for benchmarking algorithms. This raises a critical question: are synthetic datasets reliable for benchmarking GPReID? The literature offers no evidence on this point. To address this, we design a method called Pair-wise Ranking Analysis (PRA) that quantitatively measures ranking similarity and performs a statistical test of identical distributions. Specifically, we use Kendall rank correlation coefficients to evaluate the pairwise similarity between algorithm rankings on different datasets. A non-parametric two-sample Kolmogorov-Smirnov (KS) test then judges whether the ranking correlations between synthetic and real-world datasets and those between real-world datasets alone follow identical distributions. We conduct comprehensive experiments with ten representative algorithms, three popular real-world person ReID datasets, and three recently released large-scale synthetic datasets. Based on the pairwise ranking analysis and these evaluations, we conclude that ClonedPerson, a recent large-scale synthetic dataset, can be reliably used to benchmark GPReID, statistically the same as real-world datasets. This study therefore supports the use of synthetic datasets as both source training sets and target test sets, without the privacy concerns of real-world surveillance data. The study may also inspire future designs of synthetic datasets.
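
The two statistical steps named in the abstract (Kendall rank correlation between algorithm rankings, followed by a two-sample KS test) can be sketched as below. This is a minimal illustration and not the author's implementation: the function name `pairwise_ranking_analysis`, the dataset labels, and the randomly generated scores are all hypothetical placeholders, and the abstract does not specify exactly which dataset pairs or metrics form each sample. The sketch assumes one synthetic dataset is compared against a set of real-world datasets, using `scipy.stats.kendalltau` and `scipy.stats.ks_2samp`.

```python
from itertools import combinations

import numpy as np
from scipy.stats import kendalltau, ks_2samp


def pairwise_ranking_analysis(real_scores, synthetic_scores):
    """Hypothetical sketch of a pairwise ranking analysis.

    real_scores: dict mapping a real dataset name to per-algorithm scores.
    synthetic_scores: per-algorithm scores on one synthetic dataset.
    The algorithm order must be the same in every score list.
    """
    # Kendall tau for every pair of real-world datasets.
    real_real = [
        kendalltau(real_scores[a], real_scores[b])[0]
        for a, b in combinations(real_scores, 2)
    ]
    # Kendall tau between the synthetic dataset and each real-world dataset.
    syn_real = [
        kendalltau(synthetic_scores, scores)[0]
        for scores in real_scores.values()
    ]
    # Two-sample KS test: do the two sets of correlations look like
    # draws from the same distribution?
    result = ks_2samp(syn_real, real_real)
    return real_real, syn_real, result.statistic, result.pvalue


# Placeholder data only: random numbers, NOT results from the paper.
rng = np.random.default_rng(0)
n_algorithms = 10  # the abstract mentions ten algorithms
base = rng.random(n_algorithms)
real = {
    name: base + 0.1 * rng.standard_normal(n_algorithms)
    for name in ("real_A", "real_B", "real_C")  # hypothetical labels
}
synthetic = base + 0.1 * rng.standard_normal(n_algorithms)

real_real, syn_real, ks_stat, p_value = pairwise_ranking_analysis(real, synthetic)
print("real-real taus:", np.round(real_real, 3))
print("synthetic-real taus:", np.round(syn_real, 3))
print("KS statistic:", ks_stat, "p-value:", p_value)
```

Under this reading, a large p-value means the test finds no evidence that the synthetic-to-real correlations differ in distribution from the real-to-real ones. With only a few dataset pairs per sample the test has little power, so the sample construction in the full paper may well differ from this sketch.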

DOI

10.1109/IJCB54206.2022.10007952

Publication Date

1-17-2023

Keywords

Ranking (statistics), Training, Correlation coefficient, Data privacy, Surveillance, Design methodology, Benchmark testing

Comments

IR conditions: non-described

Recommended Citation

C. Kang, "Is Synthetic Dataset Reliable for Benchmarking Generalizable Person Re-Identification?," 2022 IEEE International Joint Conference on Biometrics (IJCB), Abu Dhabi, United Arab Emirates, 2022, pp. 1-8, doi: 10.1109/IJCB54206.2022.10007952.
