On the Feature Alignment of Deep Vision Models: Explainability and Robustness Connected at Hip [electronic resource]
Detailed Information
- Material Type
- Thesis (Dissertation)
- Control Number
- 0016934847
- International Standard Book Number
- 9798380142427
- Dewey Decimal Classification Number
- 621.3
- Main Entry-Personal Name
- Wang, Zifan.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : Carnegie Mellon University, 2023
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2023
- Physical Description
- 1 online resource (153 p.)
- General Note
- Source: Dissertations Abstracts International, Volume: 85-02, Section: B.
- General Note
- Advisor: Datta, Anupam; Fredrikson, Matt.
- Dissertation Note
- Thesis (Ph.D.)--Carnegie Mellon University, 2023.
- Restrictions on Access Note
- This item must not be sold to any third party vendors.
- Summary, Etc.
- Deep Neural Networks (DNNs) have recently demonstrated remarkable performance comparable to that of humans. However, these models pose a challenge when it comes to answering whether their behaviors, ethical values, and morality always align with humans' interests. This issue is known as (mis)alignment of intelligent systems. One basic requirement for deep classifiers to be considered aligned is that their output is always semantically equivalent to that of a human who possesses the necessary knowledge and tools to solve the problem at hand. Unfortunately, verifying the alignment between models and humans on outputs is often not feasible, as it would be impractical to test every sample from the distribution.
  A lack of output alignment of DNNs is evidenced by their vulnerability to adversarial noise, which is unlikely to affect a human's response. This weakness originates from the fact that important features used by the model may not be semantically meaningful from a human perspective, an issue we term feature (mis)alignment in vision tasks. Being (perceptually) aligned with humans on useful features is necessary to preserve output alignment. Thus, the goal of this thesis is to evaluate and enhance the feature alignment of deep vision classifiers to promote output alignment.
  To evaluate feature alignment, we introduce locality, a metric based on explainability tools that faithfully identify the important features contributing to a model's outputs. The first contribution of the thesis shows that modern architectures, e.g., Vision Transformers (ViTs), the state-of-the-art classifiers on many tasks, are misaligned in features. Our second contribution, on the other hand, shows that improved adversarial robustness leads to improved locality. Specifically, we find that a robust model has better locality than any non-robust model and that the locality of a model increases as it becomes more robust. Inspired by this finding, our third contribution improves robustness with a novel technique, TrH regularization, based on a direct minimization of a PAC-Bayesian generalization bound for robustness. Our technique provides new state-of-the-art robustness for ViTs. However, as robustness is often measured by running existing attacks, the guarantee is only empirical and may fail against adaptive attacks. The last contribution of this thesis introduces GloRo Nets, which entail a built-in formal robustness verification layer based on the global Lipschitz constant of the model. Unlike the probabilistic guarantee provided by Randomized Smoothing, GloRo Nets have a deterministic guarantee and significantly improve the state-of-the-art provable robustness under ℓ2-norm-bounded threats.
  Robustness is necessary for feature alignment but is probably not sufficient, as there are many other unspecified requirements that, if unmet, would result in misalignment. In conclusion, the thesis discusses the issue of under-specification in classification and its connection to alignment, together with potential remedies for addressing the issue as another step towards feature alignment in deep learning. (An illustrative sketch of the Lipschitz-based certificate behind GloRo Nets appears after the record details below.)
- Subject Added Entry-Topical Term
- Computer engineering.
- Subject Added Entry-Topical Term
- Electrical engineering.
- Index Term-Uncontrolled
- Adversarial robustness
- Index Term-Uncontrolled
- Explainability tool
- Index Term-Uncontrolled
- Feature alignment
- Index Term-Uncontrolled
- Machine learning
- Index Term-Uncontrolled
- Vision models
- Added Entry-Corporate Name
- Carnegie Mellon University Electrical and Computer Engineering
- Host Item Entry
- Dissertations Abstracts International. 85-02B.
- Host Item Entry
- Dissertation Abstract International
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:641403
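
The abstract above describes GloRo Nets as adding a built-in verification layer based on the model's global Lipschitz constant. As a rough illustration only, and not the thesis's actual implementation, the sketch below certifies an ℓ2 robustness radius for a plain feed-forward PyTorch network by bounding the Lipschitz constant of each logit margin with the product of per-layer spectral norms; the toy model, layer structure, and eps value are hypothetical.

```python
import torch
import torch.nn as nn

def certify_l2(model: nn.Sequential, x: torch.Tensor, eps: float) -> bool:
    """Return True if the predicted class for x provably cannot change under
    any l2 perturbation of norm at most eps.

    The Lipschitz constant of the network up to the last linear layer is
    upper-bounded by the product of per-layer spectral norms (ReLU is
    1-Lipschitz); the margin logit_y - logit_j is then Lipschitz with
    constant ||w_y - w_j||_2 times that product.
    """
    logits = model(x)                      # shape: (1, num_classes)
    y = int(logits.argmax(dim=-1).item())  # predicted class

    linear_layers = [m for m in model if isinstance(m, nn.Linear)]
    *hidden, last = linear_layers

    # Product of spectral norms bounds the Lipschitz constant of the
    # penultimate representation.
    penult_lip = torch.tensor(1.0)
    for layer in hidden:
        penult_lip = penult_lip * torch.linalg.matrix_norm(layer.weight, ord=2)

    for j in range(logits.shape[-1]):
        if j == y:
            continue
        # Lipschitz constant of the margin logit_y - logit_j.
        k_yj = torch.linalg.vector_norm(last.weight[y] - last.weight[j]) * penult_lip
        margin = logits[0, y] - logits[0, j]
        if margin <= eps * k_yj:
            return False  # class j could overtake y inside the eps-ball
    return True

# Hypothetical toy model and input; eps is the radius to certify.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
x = torch.randn(1, 4)
print(certify_l2(model, x, eps=0.1))
```

The check is sound but conservative: a product of spectral norms can substantially overestimate the true Lipschitz constant, which is why GloRo-style training incorporates the bound into the network itself rather than only checking it at inference time.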
MARC
■008240221s2023 ulk 00 kor
■001000016934847
■00520240214101701
■006m o d
■007cr#unu||||||||
■020 ▼a9798380142427
■035 ▼a(MiAaPQ)AAI30635765
■040 ▼aMiAaPQ▼cMiAaPQ
■0820 ▼a621.3
■1001 ▼aWang, Zifan.
■24510▼aOn the Feature Alignment of Deep Vision Models :▼bExplainability and Robustness Connected at Hip▼h[electronic resource]
■260 ▼a[S.l.]▼bCarnegie Mellon University. ▼c2023
■260 1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2023
■300 ▼a1 online resource(153 p.)
■500 ▼aSource: Dissertations Abstracts International, Volume: 85-02, Section: B.
■500 ▼aAdvisor: Datta, Anupam; Fredrikson, Matt.
■5021 ▼aThesis (Ph.D.)--Carnegie Mellon University, 2023.
■506 ▼aThis item must not be sold to any third party vendors.
■520 ▼aDeep Neural Networks (DNNs) have recently demonstrated remarkable performance comparable to that of humans. However, these models pose a challenge when it comes to answering whether their behaviors, ethical values, and morality always align with humans' interests. This issue is known as (mis)alignment of intelligent systems. One basic requirement for deep classifiers to be considered aligned is that their output is always semantically equivalent to that of a human who possesses the necessary knowledge and tools to solve the problem at hand. Unfortunately, verifying the alignment between models and humans on outputs is often not feasible, as it would be impractical to test every sample from the distribution. A lack of output alignment of DNNs is evidenced by their vulnerability to adversarial noise, which is unlikely to affect a human's response. This weakness originates from the fact that important features used by the model may not be semantically meaningful from a human perspective, an issue we term feature (mis)alignment in vision tasks. Being (perceptually) aligned with humans on useful features is necessary to preserve output alignment. Thus, the goal of this thesis is to evaluate and enhance the feature alignment of deep vision classifiers to promote output alignment. To evaluate feature alignment, we introduce locality, a metric based on explainability tools that faithfully identify the important features contributing to a model's outputs. The first contribution of the thesis shows that modern architectures, e.g., Vision Transformers (ViTs), the state-of-the-art classifiers on many tasks, are misaligned in features. Our second contribution, on the other hand, shows that improved adversarial robustness leads to improved locality. Specifically, we find that a robust model has better locality than any non-robust model and that the locality of a model increases as it becomes more robust. Inspired by this finding, our third contribution improves robustness with a novel technique, TrH regularization, based on a direct minimization of a PAC-Bayesian generalization bound for robustness. Our technique provides new state-of-the-art robustness for ViTs. However, as robustness is often measured by running existing attacks, the guarantee is only empirical and may fail against adaptive attacks. The last contribution of this thesis introduces GloRo Nets, which entail a built-in formal robustness verification layer based on the global Lipschitz constant of the model. Unlike the probabilistic guarantee provided by Randomized Smoothing, GloRo Nets have a deterministic guarantee and significantly improve the state-of-the-art provable robustness under ℓ2-norm-bounded threats. Robustness is necessary for feature alignment but is probably not sufficient, as there are many other unspecified requirements that, if unmet, would result in misalignment. In conclusion, the thesis discusses the issue of under-specification in classification and its connection to alignment, together with potential remedies for addressing the issue as another step towards feature alignment in deep learning.
■590 ▼aSchool code: 0041.
■650 4▼aComputer engineering.
■650 4▼aElectrical engineering.
■653 ▼aAdversarial robustness
■653 ▼aExplainability tool
■653 ▼aFeature alignment
■653 ▼aMachine learning
■653 ▼aVision models
■690 ▼a0800
■690 ▼a0544
■690 ▼a0464
■71020▼aCarnegie Mellon University▼bElectrical and Computer Engineering.
■7730 ▼tDissertations Abstracts International▼g85-02B.
■773 ▼tDissertation Abstract International
■790 ▼a0041
■791 ▼aPh.D.
■792 ▼a2023
■793 ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T16934847▼nKERIS▼zThe full text of this material is provided by the Korea Education and Research Information Service (KERIS).
■980 ▼a202402▼f2024