Document Type

Conference Proceeding

Publication Title

Proceedings of Machine Learning Research

Abstract

The noise transition matrix plays a central role in the problem of learning with noisy labels. Among many other reasons, a large number of existing solutions rely on the knowledge of it. Identifying and estimating the transition matrix without ground truth labels is a critical and challenging task. When label noise transition depends on each instance, the problem of identifying the instance-dependent noise transition matrix becomes substantially more challenging. Despite recently proposed solutions for learning from instance-dependent noisy labels, the literature lacks a unified understanding of when such a problem remains identifiable. The goal of this paper is to characterize the identifiability of the label noise transition matrix. Building on Kruskal's identifiability results, we are able to show the necessity of multiple noisy labels in identifying the noise transition matrix at the instance level. We further instantiate the results to explain the successes of the state-of-the-art solutions and how additional assumptions alleviated the requirement of multiple noisy labels. Our result reveals that disentangled features improve identification. This discovery led us to an approach that improves the estimation of the transition matrix using properly disentangled features. Code is available at https://github.com/UCSC-REAL/Identifiability.

First Page

22137

Last Page

22176

Publication Date

7-2023

Keywords

Ground truth, Identifiability, Kruskal, Noisy labels, State of the art, Transition matrices

Comments

Open Access version from PMLR

Uploaded on June 19, 2024

Share

COinS