Understanding the Role of Data in Model Decisions.
Material Type  
 Thesis/Dissertation
Control Number  
0017160266
International Standard Book Number  
9798382191348
Dewey Decimal Classification Number  
401
Main Entry-Personal Name  
Gupta, Arushi.
Publication, Distribution, etc. (Imprint)  
[S.l.] : Princeton University, 2024
Publication, Distribution, etc. (Imprint)  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical Description  
154 p.
General Note  
Source: Dissertations Abstracts International, Volume: 85-10, Section: A.
General Note  
Advisor: Arora, Sanjeev.
Dissertation Note  
Thesis (Ph.D.)--Princeton University, 2024.
Summary, Etc.  
As neural networks are increasingly employed in high-stakes applications such as criminal justice and medicine, it becomes increasingly important to understand why these models make the decisions they do. For example, it is important to develop tools to analyze whether models are perpetuating, in their future decisions, harmful demographic inequalities found in their training data. However, neural networks typically require large training sets, have "black-box" decision making, and have costly retraining protocols, all of which increase the difficulty of this problem. This work considers three questions. Q1) What is the relationship between the elements of an input and the model's decision? Q2) What is the relationship between the individual training points and the model's decision? And finally, Q3) To what extent do there exist (efficient) approximations that would allow practitioners to predict how model performance would change given different training data or a different training protocol? Part I addresses Q1 for masking saliency methods. These methods implicitly assume that grey pixels in an image are "uninformative." We find experimentally that this assumption may not always be true, and define "soundness," which measures a desirable property of a saliency map. Part II addresses Q2 and Q3 in the context of influence functions, which aim to approximate the effect of removing a training point on the model's decision. We use harmonic analysis to examine a particular type of influence method, namely datamodels, and find that there is a relationship between the coefficients of the datamodel and the Fourier coefficients of the target function. Finally, Part III addresses Q3 in the context of test data. First, we assess whether held-out test data is necessary to approximate the outer loop of meta learning, or whether recycling training data constitutes a sufficient approximation. We find that held-out test data is important, as using it yields learned representations that are low rank. Then, inspired by the PGDL competition, we investigate whether GAN-generated data, despite well-known limitations, can be used to approximate generalization performance when no test or validation set is available, and find that it can.
Subject Added Entry-Topical Term  
Linguistics.
Subject Added Entry-Topical Term  
Computer science.
Subject Added Entry-Topical Term  
Information technology.
Index Term-Uncontrolled  
Neural networks
Index Term-Uncontrolled  
Decision making
Index Term-Uncontrolled  
Training data
Index Term-Uncontrolled  
Meta learning
Index Term-Uncontrolled  
Datamodels
Added Entry-Corporate Name  
Princeton University Computer Science
Host Item Entry  
Dissertations Abstracts International. 85-10A.
Control Number  
joongbu:654798