On Preconditioning Polyak Step-Size and its Variants

Document Type



The family of stochastic gradient methods with Polyak step-size offers an update rule that removes the need to fine-tune an optimizer's learning rate. Recent work [13] introduced a slack variable, which makes these methods applicable outside the interpolation regime. In this thesis, we combine preconditioning and slack in a single optimization algorithm and evaluate its performance on badly scaled and/or ill-conditioned datasets. We use Hutchinson’s method to obtain an estimate of the Hessian, which serves as the preconditioner.
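To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the thesis's actual algorithm. It assumes a diagonal preconditioner built from Hutchinson's estimator with Rademacher probe vectors, a known optimal value f* = 0, and the commonly used preconditioned Polyak step; the slack variable of [13] is omitted. The function names `hutchinson_diag` and `preconditioned_polyak_step`, the toy quadratic, and the constants are all hypothetical and chosen only for illustration.

```python
import numpy as np

def hutchinson_diag(hvp, dim, num_samples, rng):
    """Estimate diag(H) as the average of z * (H z) over Rademacher vectors z.

    `hvp` is an assumed callable returning the Hessian-vector product H @ z.
    """
    est = np.zeros(dim)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        est += z * hvp(z)
    return est / num_samples

def preconditioned_polyak_step(w, loss, grad, d, f_star=0.0, eps=1e-12):
    """One Polyak step in the metric induced by the diagonal preconditioner d.

    Assumed update: w - (loss - f_star) / (g^T D^{-1} g) * D^{-1} g.
    """
    d_inv_g = grad / d
    step = (loss - f_star) / (grad @ d_inv_g + eps)
    return w - step * d_inv_g

# Toy badly scaled quadratic f(w) = 0.5 * w^T A w, with minimum value 0.
rng = np.random.default_rng(0)
A = np.diag([1.0, 100.0, 10000.0])
f = lambda w: 0.5 * w @ A @ w

# Take absolute values and clip so the preconditioner stays positive.
d = np.maximum(np.abs(hutchinson_diag(lambda z: A @ z, 3, 10, rng)), 1e-8)

w = np.array([1.0, 1.0, 1.0])
for _ in range(20):
    w = preconditioned_polyak_step(w, f(w), A @ w, d)
print(f(w))
```

In this toy problem the Hessian is diagonal, so the Hutchinson estimate is exact and every step halves the iterate regardless of how badly the coordinates are scaled. With a non-diagonal Hessian the estimate is only approximate and would typically be averaged over more probe vectors or across iterations.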

Publication Date



Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Dr. Martin Takac, Dr. Bin Gu

Online access available for MBZUAI patrons