Implicit Bias of Deep Learning Optimization: A Mathematical Examination.
Material Type  
 Thesis (Dissertation)
Control Number  
0017163821
International Standard Book Number  
9798384466994
Dewey Decimal Classification Number  
004
Main Entry-Personal Name  
Lyu, Kaifeng.
Publication, Distribution, etc. (Imprint)  
[S.l.] : Princeton University, 2024
Publication, Distribution, etc. (Imprint)  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical Description  
392 p.
General Note  
Source: Dissertations Abstracts International, Volume: 86-04, Section: B.
General Note  
Advisor: Arora, Sanjeev.
Dissertation Note  
Thesis (Ph.D.)--Princeton University, 2024.
Summary, Etc.  
Deep learning has achieved remarkable success in recent years, yet training neural networks often involves a delicate combination of guesswork and hyperparameter tuning. A critical aspect of this process is the "implicit bias" of optimization methods: minor changes in the optimization setup, without affecting the small training loss at convergence, can drastically shift the solution to which the model converges, thereby affecting test performance. This dissertation presents a collection of results that mathematically characterize this implicit bias in various training regimes.

The first part of this dissertation explores how gradient descent, even without explicit regularization, can converge to solutions that maximize the margin. Previous results have established the first-order optimality of margin for homogeneous neural networks in general, but the global optimality of margin is not guaranteed due to their non-convex nature. This dissertation provides in-depth theoretical analyses when the data has simple structure: for linearly separable data, we present both positive and negative results on whether the global optimality of margin can be attained. Furthermore, we show how this margin-based view can be used to explain interesting generalization phenomena in training neural networks with or without explicit regularization, including the simplicity bias and grokking phenomena.

The second part of the dissertation presents two results that capture the implicit biases induced by a finite learning rate. Many existing analyses, including the margin-based ones in the first part, describe implicit biases that hold even when the learning rate is infinitesimal. However, practical implementations use finite learning rates, which have been empirically observed to benefit generalization. We analyze how full-batch GD with finite learning rates, combined with key training components such as normalization layers and weight decay, creates a bias towards flatter minima, which are positively correlated with better generalization. Additionally, we study the implicit bias in stochastic optimization and derive rigorous approximations for the dynamics of adaptive gradient methods such as Adam and RMSprop via Stochastic Differential Equations (SDEs) to capture the effect of finite learning rates. Based on this, we also derive the square root scaling rule as a practical guideline for adjusting the optimization hyperparameters of adaptive gradient methods when changing batch size.
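Two illustrations of notions named in the abstract may help orient the reader; neither is quoted from the dissertation. First, the margin being maximized: for an L-homogeneous network f(\theta; x), meaning f(c\theta; x) = c^L f(\theta; x) for all c > 0, the normalized margin on labeled data (x_i, y_i) is commonly defined in the homogeneous-network literature as

    \gamma(\theta) = \min_i \frac{y_i f(\theta; x_i)}{\|\theta\|_2^L}

with margin maximization posed as minimizing \|\theta\|_2^2 / 2 subject to y_i f(\theta; x_i) \ge 1 for all i; this standard formulation is stated here only for context. Second, the square root scaling rule: a minimal Python sketch, assuming the core of the rule is to scale the learning rate by the square root of the batch-size ratio. The function name is illustrative, and the full rule may also prescribe adjustments to other hyperparameters (e.g., the moment coefficients of Adam).

import math

def scale_adaptive_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Learning rate suggested by the square root scaling rule (sketch)."""
    kappa = new_batch / base_batch      # batch-size ratio
    return base_lr * math.sqrt(kappa)   # lr scales as sqrt(kappa)

# Example: hyperparameters tuned at batch size 256 with lr 3e-4; moving to
# batch size 1024 (kappa = 4) suggests lr 3e-4 * sqrt(4) = 6e-4.
print(scale_adaptive_lr(3e-4, 256, 1024))  # 0.0006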
Subject Added Entry-Topical Term  
Computer science.
Subject Added Entry-Topical Term  
Applied mathematics.
Index Term-Uncontrolled  
Deep learning
Index Term-Uncontrolled  
Hyperparameter tuning
Index Term-Uncontrolled  
Finite learning rate
Added Entry-Corporate Name  
Princeton University Computer Science
Host Item Entry  
Dissertations Abstracts International. 86-04B.
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:657145
Holdings
Reg No.    Call No.  Location            Status                Lend Info
TQ0033363  T         Full-text resource  Viewable / printable  Viewable / printable

