Enhancing Policy Gradient with the Polyak Step-Size Adaption
Date of Award
4-30-2024
Document Type
Thesis
Degree Name
Master of Science in Machine Learning
Department
Machine Learning
First Advisor
Dr. Martin Takac
Second Advisor
Dr. Kun Zhang
Abstract
Policy gradient is a cornerstone of reinforcement learning (RL), widely adopted and foundational to many modern algorithms. While it offers convergence guarantees and greater stability than many other RL algorithms, its practical utility is often limited by sensitivity to hyper-parameters, notably the step-size. In this thesis, we introduce the Polyak step-size to RL, a mechanism that adjusts the step-size automatically without requiring prior knowledge. Adapting this method to the RL setting raises several challenges, chief among them the unknown optimal value f∗ appearing in the Polyak step-size formulation. We also evaluate the Polyak step-size empirically within RL frameworks through designed experiments. The results show that the Polyak step-size achieves faster convergence and more stable policies across diverse RL environments.
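For context, the classical Polyak step-size for minimizing f sets γ_t = (f(x_t) − f∗) / ‖∇f(x_t)‖², which requires the optimal value f∗. The sketch below is a minimal illustration, not the thesis's algorithm: it applies the ascent form of the Polyak step-size to an exact policy gradient on a toy bandit where the optimal return J∗ happens to be known; handling the unknown f∗ in realistic RL settings is precisely the challenge the thesis addresses.

```python
import numpy as np

# Toy 3-armed bandit with known mean rewards and a softmax policy pi_theta.
# Hypothetical illustration only: J_star is known here, whereas the thesis
# tackles the realistic case where the optimal value is unknown.

rewards = np.array([0.2, 0.5, 1.0])   # mean reward of each arm
J_star = rewards.max()                # optimal expected return (assumed known)

theta = np.zeros(3)                   # policy logits

for t in range(200):
    pi = np.exp(theta - theta.max())
    pi /= pi.sum()                    # softmax policy over arms
    J = pi @ rewards                  # expected return J(theta)
    grad = pi * (rewards - J)         # exact policy gradient: dJ/dtheta_i = pi_i (r_i - J)
    g2 = grad @ grad
    if g2 < 1e-12:                    # stop once the gradient vanishes
        break
    eta = (J_star - J) / g2           # Polyak step-size, ascent form
    theta += eta * grad               # gradient ascent on J

pi = np.exp(theta - theta.max())
pi /= pi.sum()
print(f"final expected return: {pi @ rewards:.4f} (optimum {J_star:.4f})")
```

In a full RL problem J∗ is unknown and the gradient is stochastic; a plausible surrogate, mentioned here purely as an assumption, is a running estimate such as the best return observed so far.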
Recommended Citation
Y. Li, "Enhancing Policy Gradient with the Polyak Step-Size Adaption," M.Sc. thesis, MBZUAI, Apr. 2024.
Comments
Thesis submitted to the Deanship of Graduate and Postdoctoral Studies
In partial fulfilment of the requirements for the M.Sc. degree in Machine Learning
Advisors: Dr. Martin Takac and Dr. Kun Zhang
Online access available for MBZUAI patrons