Interpretation Errors: Extracting Functionality From Generative Models of Language by Understanding Them Better [electronic resource]
자료유형  
 학위논문
Control Number  
0016934766
International Standard Book Number  
9798380328883
Dewey Decimal Classification Number  
004
Main Entry-Personal Name  
Holtzman, Ari.
Publication, Distribution, etc. (Imprint)
[S.l.] : University of Washington, 2023
Publication, Distribution, etc. (Imprint)
Ann Arbor : ProQuest Dissertations & Theses, 2023
Physical Description  
1 online resource (129 p.)
General Note  
Source: Dissertations Abstracts International, Volume: 85-03, Section: A.
General Note  
Advisor: Zettlemoyer, Luke.
Dissertation Note  
Thesis (Ph.D.)--University of Washington, 2023.
Restrictions on Access Note  
This item must not be sold to any third party vendors.
Summary, Etc.  
The rise of large language models as the workhorse of NLP, and the continuous release of better models (OpenAI, 2023; Pichai, 2023; Schulman et al., 2022, inter alia), has created a strange situation: we have models that are more powerful language generators than ever before, but since we did not design them for a specific purpose, we struggle to understand how they should be used or what their idiosyncrasies are.

This dissertation describes three empirical projects that sought to characterize the underlying behavior of language models and, importantly, to make them more reliable tools for generating and selecting text where this behavior does not match the tasks we would like models to complete. Each project attempts to understand what language models and accompanying inference methods currently optimize for, to characterize the gap between that and the true objective of a potential user, and to close that gap with a new inference method. An emergent theme across these works is that models are already doing what we trained them to do quite well; it is often the experimenters and practitioners who misunderstand precisely what we trained models to do in the first place. We conclude with a conceptual analysis of how we should study generative models going forward, as models keep improving and new, unanticipated uses and misuses become ever more available.

The first half of this dissertation concerns two works, Neural Text Degeneration and Surface Form Competition, which document two failure modes of generative models that occur when probability is treated as equivalent to "correctness" in text generation and multiple-choice scenarios, respectively. For these works we describe the resulting issues and propose inference methods that largely alleviate them.

The second half goes deeper into the question of how generative models of language capture the communicative goals that humans are optimizing: first with Learning to Write, which operationalizes communicative goals as auxiliary search objectives for text decoding, and then with Generative Models as a Complex Systems Science, which presents a framework for studying generative models as NLP shifts to analyzing systems that are often infeasible to replicate.

How does a model that is predicting the distribution of next tokens understand, and fail to understand, the structure of an essay? This is precisely the kind of question we must face head-on in the new science of generative models.
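The abstract does not spell out the inference methods proposed; in the published Neural Text Degeneration work (Holtzman et al., 2020), the remedy for degeneration is nucleus (top-p) sampling, which draws the next token only from the smallest set of tokens whose cumulative probability reaches a threshold p. A minimal NumPy sketch under that assumption follows; the function name and the toy logits are illustrative, not taken from the dissertation:

import numpy as np

def nucleus_sample(logits, p=0.95, rng=None):
    """Nucleus (top-p) sampling: draw a token id from the smallest
    set of tokens whose cumulative probability reaches p."""
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())       # numerically stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]             # token ids, most probable first
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1   # smallest prefix with mass >= p
    nucleus = order[:cutoff]
    renorm = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=renorm))

# Toy example: with a peaked 5-token distribution and p=0.9,
# the low-probability tail is truncated and never sampled.
token_id = nucleus_sample(np.array([4.0, 2.0, 1.0, 0.5, 0.1]), p=0.9)

Truncating before sampling is what distinguishes this from plain temperature sampling: the unreliable tail of the next-token distribution, which drives degenerate repetition and incoherence, is removed adaptively per step rather than by a fixed top-k count.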
Subject Added Entry-Topical Term  
Computer science.
Subject Added Entry-Topical Term  
Computer engineering.
Subject Added Entry-Topical Term  
Linguistics.
Index Term-Uncontrolled  
Language models
Index Term-Uncontrolled  
Communicative goals
Index Term-Uncontrolled  
Text decoding
Index Term-Uncontrolled  
Analyzing systems
Index Term-Uncontrolled  
Interpretation errors
Added Entry-Corporate Name  
University of Washington Computer Science and Engineering
Host Item Entry  
Dissertations Abstracts International. 85-03A.
Host Item Entry  
Dissertations Abstracts International
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:640285

Holdings Information

Registration No.   Call No.   Location           Loan Availability        Loan Info
TQ0026106          T          Online full text   Viewable / printable     Viewable / printable