On the Feature Alignment of Deep Vision Models: Explainability and Robustness Connected at Hip [electronic resource]
Detailed Information
- Material Type
- Thesis (Dissertation)
- Control Number
- 0016934847
- International Standard Book Number
- 9798380142427
- Dewey Decimal Classification Number
- 621.3
- Main Entry-Personal Name
- Wang, Zifan.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : Carnegie Mellon University, 2023
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2023
- Physical Description
- 1 online resource (153 p.)
- General Note
- Source: Dissertations Abstracts International, Volume: 85-02, Section: B.
- General Note
- Advisor: Datta, Anupam; Fredrikson, Matt.
- Dissertation Note
- Thesis (Ph.D.)--Carnegie Mellon University, 2023.
- Restrictions on Access Note
- This item must not be sold to any third party vendors.
- Summary, Etc.
- Deep Neural Networks (DNNs) have recently demonstrated remarkable performance comparable to that of humans. However, these models pose a challenge when it comes to answering whether their behaviors, ethical values, and morality always align with humans' interests. This issue is known as (mis)alignment of intelligent systems. One basic requirement for deep classifiers to be considered aligned is that their output is always semantically equivalent to that of a human who possesses the necessary knowledge and tools to solve the problem at hand. Unfortunately, verifying the alignment between models and humans on outputs is often not feasible, as it would be impractical to test every sample from the distribution.
  A lack of output alignment of DNNs is evidenced by their vulnerability to adversarial noise, which is unlikely to affect a human's response. This weakness originates from the fact that important features used by the model may not be semantically meaningful from a human perspective, an issue we term feature (mis)alignment in vision tasks. Being (perceptually) aligned with humans on useful features is necessary to preserve output alignment. Thus, the goal of this thesis is to evaluate and enhance the feature alignment of deep vision classifiers to promote output alignment.
  To evaluate feature alignment, we introduce locality, a metric based on explainability tools that faithfully identify the important features contributing to a model's outputs. The first contribution of the thesis shows that modern architectures, e.g., Vision Transformers (ViTs), the state-of-the-art classifiers on many tasks, are misaligned in features. Our second contribution, on the other hand, shows that improved adversarial robustness leads to improved locality. Specifically, we find that a robust model has better locality than any non-robust model and that the locality of a model increases as it becomes more robust. Inspired by this finding, our third contribution improves robustness with a novel technique, TrH regularization, based on a direct minimization of a PAC-Bayesian generalization bound for robustness. Our technique provides new state-of-the-art robustness for ViTs. However, as robustness is often measured by running existing attacks, the guarantee is only empirical and may fail against adaptive attacks. The last contribution of this thesis introduces GloRo Nets, which entail a built-in formal robustness verification layer based on the global Lipschitz constant of the model. Unlike the probabilistic guarantee provided by Randomized Smoothing, GloRo Nets have a deterministic guarantee and significantly improve the state-of-the-art provable robustness under ℓ2-norm-bounded threats.
  Robustness is necessary for feature alignment but is probably not sufficient, as there are many other unspecified requirements that, if unmet, would result in misalignment. In conclusion, the thesis discusses the issue of under-specification in classification and its connection to alignment, together with potential remedies for addressing the issue as another step towards feature alignment in deep learning. (An illustrative sketch of the Lipschitz-based certificate behind GloRo Nets appears after the record details below.)
- Subject Added Entry-Topical Term
- Computer engineering.
- Subject Added Entry-Topical Term
- Electrical engineering.
- Index Term-Uncontrolled
- Adversarial robustness
- Index Term-Uncontrolled
- Explainability tool
- Index Term-Uncontrolled
- Feature alignment
- Index Term-Uncontrolled
- Machine learning
- Index Term-Uncontrolled
- Vision models
- Added Entry-Corporate Name
- Carnegie Mellon University Electrical and Computer Engineering
- Host Item Entry
- Dissertations Abstracts International. 85-02B.
- Host Item Entry
- Dissertation Abstract International
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:641403
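
The abstract above describes GloRo Nets as adding a built-in verification layer based on the model's global Lipschitz constant. As a rough illustration only, and not the thesis's actual implementation, the sketch below certifies an ℓ2 robustness radius for a plain feed-forward PyTorch network by bounding the Lipschitz constant of each logit margin with the product of per-layer spectral norms; the toy model, layer structure, and eps value are hypothetical.

```python
import torch
import torch.nn as nn

def certify_l2(model: nn.Sequential, x: torch.Tensor, eps: float) -> bool:
    """Return True if the predicted class for x provably cannot change under
    any l2 perturbation of norm at most eps.

    The Lipschitz constant of the network up to the last linear layer is
    upper-bounded by the product of per-layer spectral norms (ReLU is
    1-Lipschitz); the margin logit_y - logit_j is then Lipschitz with
    constant ||w_y - w_j||_2 times that product.
    """
    logits = model(x)                      # shape: (1, num_classes)
    y = int(logits.argmax(dim=-1).item())  # predicted class

    linear_layers = [m for m in model if isinstance(m, nn.Linear)]
    *hidden, last = linear_layers

    # Product of spectral norms bounds the Lipschitz constant of the
    # penultimate representation.
    penult_lip = torch.tensor(1.0)
    for layer in hidden:
        penult_lip = penult_lip * torch.linalg.matrix_norm(layer.weight, ord=2)

    for j in range(logits.shape[-1]):
        if j == y:
            continue
        # Lipschitz constant of the margin logit_y - logit_j.
        k_yj = torch.linalg.vector_norm(last.weight[y] - last.weight[j]) * penult_lip
        margin = logits[0, y] - logits[0, j]
        if margin <= eps * k_yj:
            return False  # class j could overtake y inside the eps-ball
    return True

# Hypothetical toy model and input; eps is the radius to certify.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
x = torch.randn(1, 4)
print(certify_l2(model, x, eps=0.1))
```

The check is sound but conservative: a product of spectral norms can substantially overestimate the true Lipschitz constant, which is why GloRo-style training incorporates the bound into the network itself rather than only checking it at inference time.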
MARC
■008240221s2023 ulk 00 kor
■001000016934847
■00520240214101701
■006m o d
■007cr#unu||||||||
■020 ▼a9798380142427
■035 ▼a(MiAaPQ)AAI30635765
■040 ▼aMiAaPQ▼cMiAaPQ
■0820 ▼a621.3
■1001 ▼aWang, Zifan.
■24510▼aOn the Feature Alignment of Deep Vision Models :▼bExplainability and Robustness Connected at Hip▼h[electronic resource]
■260 ▼a[S.l.]▼bCarnegie Mellon University. ▼c2023
■260 1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2023
■300 ▼a1 online resource(153 p.)
■500 ▼aSource: Dissertations Abstracts International, Volume: 85-02, Section: B.
■500 ▼aAdvisor: Datta, Anupam; Fredrikson, Matt.
■5021 ▼aThesis (Ph.D.)--Carnegie Mellon University, 2023.
■506 ▼aThis item must not be sold to any third party vendors.
■520 ▼aDeep Neural Networks (DNNs) have recently demonstrated remarkable performance comparable to that of humans. However, these models pose a challenge when it comes to answering whether their behaviors, ethical values, and morality always align with humans' interests. This issue is known as (mis)alignment of intelligent systems. One basic requirement for deep classifiers to be considered aligned is that their output is always semantically equivalent to that of a human who possesses the necessary knowledge and tools to solve the problem at hand. Unfortunately, verifying the alignment between models and humans on outputs is often not feasible, as it would be impractical to test every sample from the distribution. A lack of output alignment of DNNs is evidenced by their vulnerability to adversarial noise, which is unlikely to affect a human's response. This weakness originates from the fact that important features used by the model may not be semantically meaningful from a human perspective, an issue we term feature (mis)alignment in vision tasks. Being (perceptually) aligned with humans on useful features is necessary to preserve output alignment. Thus, the goal of this thesis is to evaluate and enhance the feature alignment of deep vision classifiers to promote output alignment. To evaluate feature alignment, we introduce locality, a metric based on explainability tools that faithfully identify the important features contributing to a model's outputs. The first contribution of the thesis shows that modern architectures, e.g., Vision Transformers (ViTs), the state-of-the-art classifiers on many tasks, are misaligned in features. Our second contribution, on the other hand, shows that improved adversarial robustness leads to improved locality. Specifically, we find that a robust model has better locality than any non-robust model and that the locality of a model increases as it becomes more robust. Inspired by this finding, our third contribution improves robustness with a novel technique, TrH regularization, based on a direct minimization of a PAC-Bayesian generalization bound for robustness. Our technique provides new state-of-the-art robustness for ViTs. However, as robustness is often measured by running existing attacks, the guarantee is only empirical and may fail against adaptive attacks. The last contribution of this thesis introduces GloRo Nets, which entail a built-in formal robustness verification layer based on the global Lipschitz constant of the model. Unlike the probabilistic guarantee provided by Randomized Smoothing, GloRo Nets have a deterministic guarantee and significantly improve the state-of-the-art provable robustness under ℓ2-norm-bounded threats. Robustness is necessary for feature alignment but is probably not sufficient, as there are many other unspecified requirements that, if unmet, would result in misalignment. In conclusion, the thesis discusses the issue of under-specification in classification and its connection to alignment, together with potential remedies for addressing the issue as another step towards feature alignment in deep learning.
■590 ▼aSchool code: 0041.
■650 4▼aComputer engineering.
■650 4▼aElectrical engineering.
■653 ▼aAdversarial robustness
■653 ▼aExplainability tool
■653 ▼aFeature alignment
■653 ▼aMachine learning
■653 ▼aVision models
■690 ▼a0800
■690 ▼a0544
■690 ▼a0464
■71020▼aCarnegie Mellon University▼bElectrical and Computer Engineering.
■7730 ▼tDissertations Abstracts International▼g85-02B.
■773 ▼tDissertation Abstract International
■790 ▼a0041
■791 ▼aPh.D.
■792 ▼a2023
■793 ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T16934847▼nKERIS▼zThe full text of this material is provided by the Korea Education and Research Information Service (KERIS).
■980 ▼a202402▼f2024