Exploring the Theoretical Foundations of Contrastive Learning in Self-Supervised Learning.
- Material Type
- Thesis/Dissertation
- Control Number
- 0017164901
- International Standard Book Number
- 9798346390329
- Dewey Decimal Classification Number
- 150
- Main Entry-Personal Name
- Zhang, Haochen.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : Stanford University, 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 193 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-05, Section: B.
- General Note
- Advisor: Ma, Tengyu.
- Dissertation Note
- Thesis (Ph.D.)--Stanford University, 2024.
- Summary, Etc.
- High-quality data representations can serve as a foundation for various practical machine learning applications, ranging from search to data-efficient adaptation to new tasks and domains. Many successful representation learning algorithms rely heavily on supervised learning, which requires costly and time-consuming annotation of data [Salakhutdinov and Hinton, 2007]. In contrast to the expensive and limited labeled data, a much larger amount of rich and cheap unlabeled data exists on the internet. Unsupervised representation learning aims to find patterns in data without pre-existing labels and to generate representations that capture the essential features of the raw data. This approach offers a promising path toward training transferable data representations that can be effectively adapted to a wide range of downstream tasks.
In particular, contrastive learning has recently emerged as a powerful approach for learning representations from unlabeled data. The central idea of contrastive learning is the notion of "positive pairs": pairs of datapoints that are semantically close and can be constructed directly from unlabeled data without human labelling. Correspondingly, "negative pairs" are pairs of datapoints that are typically semantically unrelated. In the computer vision domain, a positive pair is typically composed of two images generated via data augmentation from the same original image, and a negative pair is composed of two independently sampled images. Given the positive and negative pairs, contrastive learning learns representations of datapoints by encouraging positive pairs to have representations that are close together and negative pairs to have representations that are far apart.
Many contrastive learning methods learn features with siamese networks [Bromley et al., 1993], in which two neural networks with shared weights are applied to the two datapoints in a positive pair, and the representations are the outputs of the networks on the raw inputs. The seminal work of SimCLR [Chen et al., 2020b] demonstrates that contrastively learned representations leveraging a siamese network structure can achieve linear probing accuracy on downstream classification tasks that is competitive with supervised learning. Several follow-up works [Chen and He, 2020, Grill et al., 2020, Bardes et al., 2021] have explored different loss objectives and regularization techniques, aiming to remove some seemingly ad-hoc and unnatural aspects of the algorithm, such as the stop-gradient operation (i.e., stopping gradient backpropagation through one branch of the siamese networks during training) or the necessity of a large batch size. Nevertheless, most of them still revolve around the same siamese network structure.
These methods have achieved impressive empirical successes, often surpassing the performance of fully supervised models without requiring labeled data. Furthermore, the learned representations often have nice structure such as linear separability, where a linear classifier trained on top of these representations can perform well on downstream classification problems. The surprising simplicity of these methods and the structure encoded in the contrastively learned representations suggest that the method leverages some intrinsic property of the data distribution via the positive-pair construction.
However, developing a comprehensive theoretical understanding of why these self-supervised representations are so effective has remained a significant challenge. Novel mathematical frameworks that go beyond classical statistical learning theory are needed to fully explain their performance, and the prevalent use of deep neural networks in contrastive learning further complicates the analysis.
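The contrastive objective described in the abstract can be made concrete with a short sketch. The following is a minimal, illustrative SimCLR-style NT-Xent loss in PyTorch, not code from the dissertation; the encoder f, the augment function, and all variable names are assumptions made for the sake of the example.
```python
# Illustrative sketch (not from the dissertation): a SimCLR-style NT-Xent loss.
# Assumes z1 and z2 hold the representations of two augmented views of the same
# batch of images (positive pairs); all other in-batch pairings act as negatives.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Pull positive-pair representations together, push negative pairs apart."""
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2n, d), unit-norm rows
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))           # exclude self-similarity
    # For row i, the positive is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Usage with a shared-weight (siamese) encoder f and an augmentation pipeline:
#   z1, z2 = f(augment(x)), f(augment(x))
#   loss = nt_xent_loss(z1, z2)
```
Minimizing this cross-entropy pulls each representation toward its augmented counterpart and pushes it away from the other images in the batch, which is also why batch size matters for methods of this kind.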
- Subject Added Entry-Topical Term
- Success.
- Subject Added Entry-Topical Term
- Computer vision.
- Subject Added Entry-Topical Term
- Neural networks.
- Subject Added Entry-Topical Term
- Adaptation.
- Subject Added Entry-Topical Term
- Clustering.
- Subject Added Entry-Topical Term
- Eigenvectors.
- Subject Added Entry-Topical Term
- Computer science.
- Subject Added Entry-Topical Term
- Mathematics.
- Added Entry-Corporate Name
- Stanford University.
- Host Item Entry
- Dissertations Abstracts International. 86-05B.
- Electronic Location and Access
- This material is available after login.
- Control Number
- joongbu:654272