Machine Learning Faculty Publications

Layer-Wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs

Lei Huang, Inception Institute of Artificial Intelligence
Jie Qin, Inception Institute of Artificial Intelligence
Li Liu, Inception Institute of Artificial Intelligence
Fan Zhu, Inception Institute of Artificial Intelligence
Ling Shao, Inception Institute of Artificial Intelligence & Mohamed bin Zayed University of Artificial Intelligence

Document Type

Conference Proceeding

Publication Title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract

Conditioning analysis uncovers the landscape of an optimization objective by exploring the spectrum of its curvature matrix. This has been well explored theoretically for linear models. We extend this analysis to deep neural networks (DNNs) in order to investigate their learning dynamics. To this end, we propose layer-wise conditioning analysis, which explores the optimization landscape with respect to each layer independently. Such an analysis is theoretically supported under mild assumptions that approximately hold in practice. Based on our analysis, we show that batch normalization (BN) can stabilize training, but can sometimes give the false impression of a local minimum, which has detrimental effects on learning. We also observe experimentally that BN can improve the layer-wise conditioning of the optimization problem. Finally, we find that the last linear layer of a very deep residual network displays ill-conditioned behavior. We solve this problem by adding just one BN layer before the last linear layer, which improves performance over the original and pre-activation residual networks.
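As a rough illustration of what "exploring the spectrum of a curvature matrix" means in the linear case the abstract refers to, the sketch below computes the condition number of the Hessian of a two-feature least-squares objective before and after standardizing the inputs. It is a toy example under stated assumptions, not the authors' layer-wise method: the synthetic data, the restriction to two features (so the eigenvalues have a closed form), and the helper names `second_moment_2d`, `condition_number`, and `standardize` are all hypothetical choices made for illustration.

```python
import math
import random


def second_moment_2d(points):
    """Entries (a, b, c) of the 2x2 matrix (1/n) * sum(x x^T) = [[a, b], [b, c]].

    For the least-squares loss (1/2n) * sum((w.x - y)^2) this matrix is the
    Hessian with respect to w, i.e. the curvature matrix of the objective.
    """
    n = len(points)
    a = sum(x * x for x, _ in points) / n
    b = sum(x * y for x, y in points) / n
    c = sum(y * y for _, y in points) / n
    return a, b, c


def condition_number(a, b, c):
    """Ratio of largest to smallest eigenvalue of the symmetric [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return (mean + radius) / (mean - radius)


def standardize(points):
    """Center each feature and scale it to unit variance."""
    n = len(points)
    columns = list(zip(*points))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n)
        for col, m in zip(columns, means)
    ]
    return [
        tuple((v - m) / s for v, m, s in zip(p, means, stds))
        for p in points
    ]


# Hypothetical synthetic inputs: two features with very different scales.
rng = random.Random(0)
raw = [(rng.gauss(5.0, 10.0), rng.gauss(0.0, 0.1)) for _ in range(1000)]

kappa_raw = condition_number(*second_moment_2d(raw))
kappa_std = condition_number(*second_moment_2d(standardize(raw)))

print(f"condition number, raw inputs:          {kappa_raw:.1f}")
print(f"condition number, standardized inputs: {kappa_std:.3f}")
```

A large condition number means the objective is much steeper along some directions than others, which slows gradient descent. Standardizing the inputs here is only loosely analogous to the effect of BN on a layer's input; the paper's analysis concerns each layer of a deep network rather than a single linear model.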

First Page

384

Last Page

401

DOI

10.1007/978-3-030-58536-5_23

Publication Date

11-3-2020

Keywords

Conditioning analysis, Normalization, Residual network

Comments

IR conditions: non-described

Recommended Citation

L. Huang, J. Qin, L. Liu, F. Zhu, and L. Shao, "Layer-Wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs", in 16th European Conference on Computer Vision (ECCV 2020), Lecture Notes in Computer Science, vol 12347, Aug 2020. doi:10.1007/978-3-030-58536-5_23

Additional Links

DOI link: https://doi.org/10.1007/978-3-030-58536-5_23

