Document Type

Conference Proceeding

Publication Title

Advances in Neural Information Processing Systems

Abstract

Given an unsupervised novelty detection task on a new dataset, how can we automatically select a “best” detection model while simultaneously controlling the error rate of the best model? For novelty detection analysis, numerous detectors have been proposed to detect outliers on a new unseen dataset based on a score function trained on available clean data. However, due to the absence of labeled anomalous data for model evaluation and comparison, there is a lack of systematic approaches that are able to select the “best” model/detector (i.e., the algorithm as well as its hyperparameters) and achieve certain error rate control simultaneously. In this paper, we introduce a unified data-driven procedure to address this issue. The key idea is to maximize the number of detected outliers while controlling the false discovery rate (FDR) with the help of Jackknife prediction. We establish non-asymptotic bounds for the false discovery proportions and show that the proposed procedure yields valid FDR control under some mild conditions. Numerical experiments on both synthetic and real data validate the theoretical results and demonstrate the effectiveness of our proposed AutoMS method. The code is available at: https://github.com/ZhangYifan1996/AutoMS.

Publication Date

12-2022

Keywords

Error detection, Petroleum reservoir evaluation, Statistics

Comments

Available at NeurIPS Proceedings

Share

COinS