Machine Learning Dissertations and Theses

Enhancing Policy Gradient with the Polyak Step-Size Adaption

Yunxiang Li, Mohamed bin Zayed University of Artificial Intelligence

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Machine Learning

Department

Machine Learning

First Advisor

Dr. Martin Takac

Second Advisor

Dr. Kun Zhang

Abstract

Policy gradient is a cornerstone of reinforcement learning (RL), widely adopted and foundational to the field. Although it is valued for its convergence guarantees and its stability relative to other RL algorithms, its practical use is often hindered by sensitivity to hyper-parameters, most notably the step-size. In this manuscript, we integrate the Polyak step-size into policy gradient, a mechanism that adjusts the step-size automatically without requiring prior knowledge. Adapting this method to RL settings raises several challenges, chief among them the unknown f∗ in the Polyak step-size formulation. We also present an empirical evaluation of the Polyak step-size in RL settings through designed experiments. The results indicate improved performance with the Polyak step-size, with faster convergence and more stable policies across diverse RL environments.
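For readers unfamiliar with the step-size rule named in the abstract, the following is a minimal sketch of the classical Polyak step-size, step = (f(x) - f∗) / ||∇f(x)||², applied to plain gradient descent on a small convex quadratic where f∗ is known to be 0. It is not the thesis's RL method: the abstract does not describe how the unknown f∗ is handled in policy gradient, so nothing here should be read as that adaptation. The function name `polyak_gradient_descent`, the test objective, and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def polyak_gradient_descent(f, grad, x0, f_star, steps=50, eps=1e-12):
    """Gradient descent with the classical Polyak step-size.

    Assumes the optimal value f_star is known, which is exactly the
    assumption the abstract says does not hold in RL settings.
    """
    x = np.asarray(x0, dtype=float)
    history = [f(x)]
    for _ in range(steps):
        g = grad(x)
        g_norm_sq = float(g @ g)
        if g_norm_sq < eps:
            break  # gradient is numerically zero, so stop
        # Polyak step-size: optimality gap divided by squared gradient norm
        step = (f(x) - f_star) / g_norm_sq
        x = x - step * g
        history.append(f(x))
    return x, history

# Hypothetical test problem: f(x) = 0.5 * x^T A x with minimum value 0 at x = 0
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x_final, history = polyak_gradient_descent(f, grad, x0=[5.0, 1.0], f_star=0.0)
print("final objective:", history[-1])
```

No step-size is tuned by hand in this sketch; the rule derives it from the current optimality gap and gradient norm, which is why it depends on knowing f∗.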

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc degree in Machine Learning

Advisors: Dr. Martin Takac, Dr. Kun Zhang

Online access available for MBZUAI patrons

Recommended Citation

Y. Li, "Enhancing Policy Gradient with the Polyak Step-Size Adaption," Apr 2024.
