Adversarial Robustness for Estimation and Alignment.
Material Type  
 Dissertation
 
0017161278
Date and Time of Latest Transaction  
20250211151333
ISBN  
9798382830964
DDC  
310
Author  
Chao, Patrick.
Title/Author  
Adversarial Robustness for Estimation and Alignment.
Publish Info  
[S.l.] : University of Pennsylvania., 2024
Publish Info  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Material Info  
216 p.
General Note  
Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
General Note  
Advisor: Dobriban, Edgar.
Dissertation Note  
Thesis (Ph.D.)--University of Pennsylvania, 2024.
Abstracts/Etc  
Summary: As machine learning models are deployed in a multitude of settings with increasing levels of influence and competency, there is growing interest in ensuring these models are robust and align with human intentions. To this end, we analyze robust models and adversarial inputs in a variety of settings. We explore statistical estimation under the adversarial setting of Wasserstein distribution shifts, where every data point may undergo a bounded perturbation. We analyze several statistical problems, including location estimation, linear regression, and non-parametric density estimation. Furthermore, we evaluate alignment in modern foundation models, and propose automated methods to construct adversarial inputs. We develop black-box automated algorithms to generate adversarial prompts for text-to-image models and jailbreaks for language models. Lastly, we introduce a benchmark, JailbreakBench, for reproducible jailbreak evaluation.
Subject Added Entry-Topical Term  
Statistics.
Subject Added Entry-Topical Term  
Information technology.
Index Term-Uncontrolled  
Adversarial prompts
Index Term-Uncontrolled  
Adversarial robustness
Index Term-Uncontrolled  
Distribution shifts
Index Term-Uncontrolled  
Jailbreaking
Index Term-Uncontrolled  
Minimax estimation
Index Term-Uncontrolled  
Red teaming
Added Entry-Corporate Name  
University of Pennsylvania Statistics and Data Science
Host Item Entry  
Dissertations Abstracts International. 85-12B.
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:658472
Material
Reg No. Call No. Location Status Lend Info
TQ0034790 T   Full-text material Viewable/Printable Viewable/Printable