Implicit Bias of Deep Learning Optimization: A Mathematical Examination.
Material Type  
 Thesis (Dissertation)
Control Number  
0017163821
International Standard Book Number  
9798384466994
Dewey Decimal Classification Number  
004
Main Entry-Personal Name  
Lyu, Kaifeng.
Publication, Distribution, etc. (Imprint)  
[S.l.] : Princeton University, 2024
Publication, Distribution, etc. (Imprint)  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical Description  
392 p.
General Note  
Source: Dissertations Abstracts International, Volume: 86-04, Section: B.
General Note  
Advisor: Arora, Sanjeev.
Dissertation Note  
Thesis (Ph.D.)--Princeton University, 2024.
Summary, Etc.  
Deep learning has achieved remarkable success in recent years, yet training neural networks often involves a delicate combination of guesswork and hyperparameter tuning. A critical aspect of this process is the "implicit bias" of optimization methods: minor changes in the optimization setup, without affecting the small training loss at convergence, can drastically shift the solution to which the model converges, thereby affecting test performance. This dissertation presents a collection of results that mathematically characterize this implicit bias in various training regimes.

The first part of this dissertation explores how gradient descent, even without explicit regularization, can converge to solutions that maximize the margin. Previous results have established the first-order optimality of margin for homogeneous neural networks in general, but the global optimality of margin is not guaranteed due to their non-convex nature. This dissertation provides in-depth theoretical analyses when the data has simple structure: for linearly separable data, we present both positive and negative results on whether the global optimality of margin can be attained. Furthermore, we show how this margin-based view can be used to explain interesting generalization phenomena in training neural networks with or without explicit regularization, including the simplicity bias and grokking phenomena.

The second part of the dissertation presents two results that capture the implicit biases induced by a finite learning rate. Many existing analyses, including the margin-based ones in the first part, describe implicit biases that hold even when the learning rate is infinitesimal. However, practical implementations use finite learning rates, which have been empirically observed to benefit generalization. We analyze how full-batch GD with finite learning rates, combined with key training components such as normalization layers and weight decay, creates a bias towards flatter minima, which are positively correlated with better generalization. Additionally, we study the implicit bias in stochastic optimization and derive rigorous approximations for the dynamics of adaptive gradient methods such as Adam and RMSprop via Stochastic Differential Equations (SDEs) to capture the effect of finite learning rates. Based on this, we also derive the square root scaling rule as a practical guideline for adjusting the optimization hyperparameters of adaptive gradient methods when changing batch size.
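Two illustrations of notions named in the abstract may help orient the reader; neither is quoted from the dissertation. First, the margin being maximized: for an L-homogeneous network f(\theta; x), meaning f(c\theta; x) = c^L f(\theta; x) for all c > 0, the normalized margin on labeled data (x_i, y_i) is commonly defined in the homogeneous-network literature as

    \gamma(\theta) = \min_i \frac{y_i f(\theta; x_i)}{\|\theta\|_2^L}

with margin maximization posed as minimizing \|\theta\|_2^2 / 2 subject to y_i f(\theta; x_i) \ge 1 for all i; this standard formulation is stated here only for context. Second, the square root scaling rule: a minimal Python sketch, assuming the core of the rule is to scale the learning rate by the square root of the batch-size ratio. The function name is illustrative, and the full rule may also prescribe adjustments to other hyperparameters (e.g., the moment coefficients of Adam).

import math

def scale_adaptive_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Learning rate suggested by the square root scaling rule (sketch)."""
    kappa = new_batch / base_batch      # batch-size ratio
    return base_lr * math.sqrt(kappa)   # lr scales as sqrt(kappa)

# Example: hyperparameters tuned at batch size 256 with lr 3e-4; moving to
# batch size 1024 (kappa = 4) suggests lr 3e-4 * sqrt(4) = 6e-4.
print(scale_adaptive_lr(3e-4, 256, 1024))  # 0.0006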
Subject Added Entry-Topical Term  
Computer science.
Subject Added Entry-Topical Term  
Applied mathematics.
Index Term-Uncontrolled  
Deep learning
Index Term-Uncontrolled  
Hyperparameter tuning
Index Term-Uncontrolled  
Finite learning rate
Added Entry-Corporate Name  
Princeton University Computer Science
Host Item Entry  
Dissertations Abstracts International. 86-04B.
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:657145
Holdings
Reg No.    Call No.  Location            Status                Lend Info
TQ0033363  T         Full-text resource  Viewable / printable  Viewable / printable

