Machine Learning Faculty Publications

Triply Stochastic Gradient Method For Large-Scale Nonlinear Similar Unlabeled Classification

Wanli Shi, Nanjing University of Information Science & Technology & Mohamed bin Zayed University of Artificial Intelligence
Bin Gu, Nanjing University of Information Science & Technology & Mohamed bin Zayed University of Artificial Intelligence & JD Finance America Corporation, USA
Xiang Li, University of Western Ontario, Canada
Cheng Deng, Xidian University, Xi'an, China
Heng Huang, JD Finance America Corporation, USA

Document Type

Article

Publication Title

Machine Learning

Abstract

Similar unlabeled (SU) classification is pervasive in many real-world applications, where only similar data pairs (two data points have the same label) and unlabeled data points are available to train a classifier. Recent work has identified a practical SU formulation and has derived the corresponding estimation error bound. It evaluated SU learning with linear classifiers on medium-sized datasets. However, in practice, we often need to learn nonlinear classifiers on large-scale datasets for superior predictive performance. How this could be done in an efficient manner is still an open problem for SU classification. In this paper, we propose a scalable kernel learning algorithm for SU classification using a triply stochastic optimization framework, called TSGSU. Specifically, in each iteration, our method randomly samples an instance from the similar pairs set, an instance from the unlabeled set, and their random features to calculate the stochastic functional gradient for the model update. Theoretically, we prove that our method can converge to a stationary point at the rate of O(1/T) after T iterations. Experiments on various benchmark datasets and high-dimensional datasets not only demonstrate the scalability of TSGSU but also show the efficiency of TSGSU compared with existing SU learning algorithms while retaining similar generalization performance.
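
The per-iteration sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the RBF random Fourier feature map, the step-size schedule, and the loss-derivative callbacks `dloss_s` and `dloss_u` are all assumptions made for illustration, and the actual SU risk and update rule are defined in the paper itself.

```python
import numpy as np

def rff(x, feat_seed, sigma):
    """One random Fourier feature of an RBF kernel, regenerated from its seed."""
    rng = np.random.default_rng(feat_seed)
    omega = rng.normal(scale=1.0 / sigma, size=x.shape[0])
    b = rng.uniform(0.0, 2.0 * np.pi)
    return np.sqrt(2.0) * np.cos(omega @ x + b)

def predict(x, alphas, sigma):
    """Evaluate f(x) = sum_i alpha_i * phi_i(x); feature i is rebuilt from seed i."""
    return sum(a * rff(x, i, sigma) for i, a in enumerate(alphas))

def triply_stochastic_sketch(X_s, X_u, dloss_s, dloss_u,
                             T=50, lam=1e-3, step=0.5, sigma=1.0, seed=0):
    """Hypothetical sketch of a triply stochastic functional gradient loop.

    X_s: points taken from similar pairs, shape (n_s, d) (assumed layout)
    X_u: unlabeled points, shape (n_u, d)
    dloss_s, dloss_u: derivatives of the per-sample losses w.r.t. f(x)
                      (placeholders; the paper's SU risk defines the real ones)
    """
    rng = np.random.default_rng(seed)
    alphas = []
    for t in range(T):
        gamma = step / (t + 1)                 # assumed decaying step size
        x_s = X_s[rng.integers(len(X_s))]      # randomness 1: similar-pair instance
        x_u = X_u[rng.integers(len(X_u))]      # randomness 2: unlabeled instance
        # randomness 3: a fresh random feature, identified by the seed t
        g = (dloss_s(predict(x_s, alphas, sigma)) * rff(x_s, t, sigma)
             + dloss_u(predict(x_u, alphas, sigma)) * rff(x_u, t, sigma))
        # shrink old coefficients (L2 regularization), then add the new one
        alphas = [(1.0 - gamma * lam) * a for a in alphas]
        alphas.append(-gamma * g)
    return alphas
```

Because each random feature is regenerated from its seed, the model only stores one coefficient per iteration rather than an explicit kernel matrix, which is the usual motivation for this family of methods on large datasets.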

First Page

2005

Last Page

2033

DOI

10.1007/s10994-021-05980-1

Publication Date

8-1-2021

Keywords

Kernel method, Large-scale optimization, SU classification, Weakly-supervised learning

Comments

IR deposit conditions:

OA (accepted version) - pathway b
12 months embargo
Published source must be acknowledged
Must link to publisher version with DOI

Recommended Citation

W. Shi, B. Gu, X. Li, C. Deng, and H. Huang, “Triply stochastic gradient method for large-scale nonlinear similar unlabeled classification,” Machine Learning, vol. 110, no. 8, pp. 2005–2033, Aug. 2021, doi: 10.1007/S10994-021-05980-1.
