Distributionally Robust Machine Learning.
- Material Type
- Thesis (Dissertation)
- Control Number
- 0017163727
- International Standard Book Number
- 9798342107174
- Dewey Decimal Classification Number
- 616.07
- Main Entry-Personal Name
- Sagawa, Shiori.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : Stanford University., 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 212 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-04, Section: B.
- General Note
- Advisor: Liang, Percy.
- Dissertation Note
- Thesis (Ph.D.)--Stanford University, 2024.
- Summary, Etc.
- Machine learning models can unexpectedly fail in the wild due to distribution shifts: mismatches in data distributions between training and deployment. Distribution shifts are often unavoidable and pose significant reliability challenges in many real-world applications. For example, models can fail on certain subpopulations (e.g., language models can fail on non-English languages) and on new domains unseen during training (e.g., medical models can fail on new hospitals). In this thesis, we aim to build reliable machine learning models that are robust to distribution shifts in the wild. In the first part, we mitigate subpopulation shifts by developing methods that leverage distributionally robust optimization (DRO). These methods overcome the computational and statistical obstacles of applying DRO on modern neural networks and on real-world shifts. In the second part, to tackle domain shifts, we introduce WILDS, a benchmark of real-world shifts, and show that existing methods fail on WILDS even though they perform well on synthetic shifts from prior benchmarks. We then develop a method that successfully mitigates real-world domain shifts. We propose an alternative to domain invariance---a key principle behind the prior methods---to reflect the structure of real-world shifts, and our method achieves state-of-the-art performance on multiple WILDS datasets. Altogether, the algorithms developed in this thesis mitigate real-world distribution shifts by addressing key statistical and computational challenges of training robust models, while anchoring to and leveraging the structure of real-world distribution shifts. These algorithms successfully improve robustness to a wide range of distribution shifts in the wild, from subpopulation shifts in language modeling to domain shifts in wildlife monitoring and histopathology to spurious correlations.
- Subject Added Entry-Topical Term
- Histopathology.
- Subject Added Entry-Topical Term
- Deep learning.
- Subject Added Entry-Topical Term
- Demographics.
- Subject Added Entry-Topical Term
- Benchmarks.
- Subject Added Entry-Topical Term
- Demography.
- Subject Added Entry-Topical Term
- Pathology.
- Added Entry-Corporate Name
- Stanford University.
- Host Item Entry
- Dissertations Abstracts International. 86-04B.
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:658405