Study on Scale-Invariance Framework of Polyak-type Methods

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Machine Learning

Department

Machine Learning

First Advisor

Dr. Martin Takac

Second Advisor

Dr. Samuel Horvath

Abstract

Stochastic gradient descent (SGD) suffers from two significant drawbacks: it requires careful tuning of the step size, and it converges slowly on ill-conditioned problems. Researchers have proposed a number of techniques to accelerate SGD, including momentum, adaptive step sizes, and preconditioning. Momentum methods, adaptive optimizers such as Adam, AdaGrad, and AdaHessian, and preconditioning methods have become important tools for training deep neural networks (DNNs) because they adjust the search direction using curvature information of the objective function. However, they still require manual step-size tuning, which can be significantly time consuming. To address this, we introduce the Polyak step size, which is ideally parameter-free when the interpolation condition holds. From the perspective of mirror descent, we give a principled explanation for combining the Polyak step size with a preconditioner on ill-conditioned problems, and we offer guidance on how to choose the preconditioner. Finally, we consolidate our work into an optimization framework called SANIA, from which one can derive most popular methods, including SGD, classic AdaGrad, and Adam, as well as our scale-invariant Polyak-type methods. The framework is designed to eliminate the need to manually tune step-size hyperparameters and to improve performance on badly scaled or ill-conditioned problems. Our extensive empirical analysis spans several classification tasks in both convex and non-convex settings, demonstrating the effectiveness of the proposed approach.
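For context, a minimal sketch of the preconditioned Polyak-type step referenced above, assuming the standard squared-norm Polyak rule, a diagonal preconditioner, and the interpolation condition (so the optimal value f* can be taken as 0); the function and parameter names are hypothetical illustrations, not the thesis's or SANIA's actual API:

import numpy as np

def preconditioned_polyak_step(x, f, grad, D_inv, f_star=0.0, eps=1e-12):
    """One Polyak-type step with a diagonal preconditioner.

    Step size: gamma = (f(x) - f_star) / ||g||^2_{D^{-1}},
    where ||g||^2_{D^{-1}} = g^T D^{-1} g. Under interpolation,
    f_star = 0 for common losses, making the rule parameter-free.
    """
    g = grad(x)
    # D_inv holds the diagonal of D^{-1} as a vector
    denom = g @ (D_inv * g) + eps
    gamma = (f(x) - f_star) / denom
    # Preconditioned update: x - gamma * D^{-1} g
    return x - gamma * (D_inv * g)

With D_inv set to the all-ones vector, this reduces to the classical (stochastic) Polyak step size; other choices of preconditioner recover scale-invariant variants.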

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Dr. Martin Takac, Dr. Samuel Horvath

Online access available for MBZUAI patrons
