Machine Learning Faculty Publications

Random-Reshuffled SARAH Does Not Need a Full Gradient Computations

Aleksandr Beznosikov, Moscow Institute of Physics and Technology (MIPT) & Higher School of Economics (HSE University)Follow
Martin Takáč, Mohamed bin Zayed University of Artificial IntelligenceFollow

Document Type

Article

Publication Title

arXiv

Abstract

The StochAstic Recursive grAdient algoritHm (SARAH) algorithm is a variance reduced variant of the Stochastic Gradient Descent (SGD) algorithm that needs a gradient of the objective function from time to time. In this paper, we remove the necessity of a full gradient computation. This is achieved by using a randomized reshuffling strategy and aggregating stochastic gradients obtained in each epoch. The aggregated stochastic gradients serve as an estimate of a full gradient in the SARAH algorithm. We provide a theoretical analysis of the proposed approach and conclude the paper with numerical experiments that demonstrate the efficiency of this approach. Copyright © 2021, The Authors. All rights reserved.

DOI

10.48550/arXiv.2111.13322

Publication Date

11-26-2021

Keywords

Gradient algorithm; Gradients computation; Numerical experiments; Objective functions; Stochastic gradient; Stochastic gradient descent algorithm; Stochastics

Comments

Preprint: arXiv

Recommended Citation

A. Beznosikov and M. Takáč, "Random-reshuffled SARAH does not need a full gradient computations," 2021, arXiv:2111.13322

Link to Full Text

COinS

Machine Learning Faculty Publications

Random-Reshuffled SARAH Does Not Need a Full Gradient Computations

Document Type

Publication Title

Abstract

DOI

Publication Date

Keywords

Comments

Recommended Citation

Browse

Contribute

Links

Machine Learning Faculty Publications

Random-Reshuffled SARAH Does Not Need a Full Gradient Computations

Authors

Document Type

Publication Title

Abstract

DOI

Publication Date

Keywords

Comments

Recommended Citation

Share

Browse

Contribute

Links