Data-Efficient and Robust Deep Learning From Large Vision and Language Data.
- Material Type
- Dissertation
- Control Number
- 0017165081
- International Standard Book Number
- 9798346813828
- Dewey Decimal Classification Number
- 004
- Main Entry-Personal Name
- Yang, Yu.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : University of California, Los Angeles, 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 265 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-06, Section: A.
- General Note
- Advisor: Mirzasoleiman, Baharan.
- Dissertation Note
- Thesis (Ph.D.)--University of California, Los Angeles, 2024.
- Summary, Etc.
- Deep learning has revolutionized fields like computer vision, natural language processing, and multimodal learning, but its reliance on large datasets brings challenges such as rising computational costs, vulnerability to data poisoning attacks, and difficulty achieving robustness against spurious correlations. My research addresses these challenges through a data-centric approach, improving data selection, curriculum design, and weighting strategies. This dissertation is organized into three parts. First, for efficient training, CREST identifies coresets for deep vision models with theoretical guarantees, and S2L reduces fine-tuning costs for large language models by prioritizing subsets based on proxy model loss trajectories. Second, for robust training against data poisoning, EPIC iteratively detects and excludes malicious examples during training, effectively mitigating the attacks. Finally, to address spurious correlations, SPARE mitigates these biases early in training by separating and rebalancing biased groups, PDE progressively expands balanced subsets to guide models toward learning core features, and a multimodal fine-tuning method enhances robustness in vision-language models like CLIP by reducing reliance on spurious features, achieving significant gains in worst-group accuracy. Together, my research demonstrates how focusing on the properties and selection of data helps address core limitations in deep learning, providing scalable and effective solutions that bridge theoretical insights with practical needs across diverse real-world applications.
- Subject Added Entry-Topical Term
- Computer science.
- Subject Added Entry-Topical Term
- Computer engineering.
- Subject Added Entry-Topical Term
- Information science.
- Index Term-Uncontrolled
- Deep learning
- Index Term-Uncontrolled
- Data poisoning
- Index Term-Uncontrolled
- Robust training
- Index Term-Uncontrolled
- Vision-language models
- Added Entry-Corporate Name
- University of California, Los Angeles Computer Science 0201
- Host Item Entry
- Dissertations Abstracts International. 86-06A.
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:657862