VC Dimensions for Deep Neural Networks with Bounded-Rank Weight Matrices

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Machine Learning

Department

Machine Learning

First Advisor

Dr. Huan Xiong

Second Advisor

Dr. Zhiqiang Xu

Abstract

Deep neural networks (DNNs) have seen immense success over the past decade, yet their lack of interpretability remains a challenge. Recent research on the VC (Vapnik-Chervonenkis) dimension of DNNs has provided valuable insight into the mechanisms underlying deep learning's powerful generalization. Understanding the VC dimension offers a promising path toward unraveling the enigma of deep learning, ultimately leading to more interpretable and trustworthy AI systems. In this thesis, we study the VC dimension of DNNs with piecewise-polynomial activations and bounded-rank weight matrices. Our main result shows that the VC dimension of a DNN whose weight matrices have rank at most $r$ is $\mathcal{O}(nrL^2\log (nrL))$, where $n$ is the width and $L$ is the depth of the network. We also construct a rank-$r$ ReLU DNN that achieves VC dimension $\Omega(nr)$, confirming that our upper bound is nearly tight for large $n$. Based on these bounds, we compare the generalization power, in terms of VC dimension, of various DNN architectures.
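The bounded-rank constraint in the abstract can be made concrete with a small numerical sketch. The following Python illustration (not code from the thesis; all function names are hypothetical) parameterizes each weight matrix as a product $W = UV$ with inner dimension $r$, so its rank is at most $r$, and evaluates the stated $\mathcal{O}(nrL^2\log (nrL))$ upper bound up to its unspecified constant.

import numpy as np

def bounded_rank_relu_net(widths, r, rng):
    """Sample weight factors for a ReLU net whose weight matrices
    each have rank at most r, via the factorization W = U @ V."""
    layers = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        k = min(r, fan_in, fan_out)           # rank cannot exceed either dimension
        U = rng.standard_normal((fan_out, k))
        V = rng.standard_normal((k, fan_in))
        layers.append((U, V))
    return layers

def forward(layers, x):
    """Apply ReLU after every layer except the last (linear output)."""
    for U, V in layers[:-1]:
        x = np.maximum(U @ (V @ x), 0.0)
    U, V = layers[-1]
    return U @ (V @ x)

def vc_upper_bound(n, r, L, c=1.0):
    """The thesis's O(n r L^2 log(n r L)) bound, up to an unspecified constant c."""
    return c * n * r * L**2 * np.log(n * r * L)

rng = np.random.default_rng(0)
net = bounded_rank_relu_net(widths=[8, 8, 8, 1], r=2, rng=rng)
y = forward(net, rng.standard_normal(8))
print(vc_upper_bound(n=8, r=2, L=3))

Note that the factorization stores $2nr$ parameters per $n \times n$ layer instead of $n^2$, which mirrors how the VC bound scales with $nr$ rather than $n^2$.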

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Samuel Horvath, Bin Gu

Online access available for MBZUAI patrons
