Interpretation Errors: Extracting Functionality From Generative Models of Language by Understanding Them Better- [electronic resource]
- Material Type
- Thesis/Dissertation
- Control Number
- 0016934766
- International Standard Book Number
- 9798380328883
- Dewey Decimal Classification Number
- 004
- Main Entry-Personal Name
- Holtzman, Ari.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : University of Washington., 2023
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2023
- Physical Description
- 1 online resource (129 p.)
- General Note
- Source: Dissertations Abstracts International, Volume: 85-03, Section: A.
- General Note
- Advisor: Zettlemoyer, Luke.
- Dissertation Note
- Thesis (Ph.D.)--University of Washington, 2023.
- Restrictions on Access Note
- This item must not be sold to any third party vendors.
- Summary, Etc.
- The rise of large language models as the workhorse of NLP, and the continuous release of better models (OpenAI, 2023; Pichai, 2023; Schulman et al., 2022, inter alia), have created a strange situation: we have models that are more powerful language generators than ever before, but since we did not design them for a specific purpose, we struggle to understand how they should be used or what their idiosyncrasies are. This dissertation describes three empirical projects that sought to characterize the underlying behavior of language models and, importantly, to make them more reliable tools for generating and selecting text where this behavior does not match the tasks we would like models to complete. Each project attempts to understand what language models and accompanying inference methods currently optimize for, to characterize the gap between that and the true objective of a potential user, and to close it with some new inference method. An emergent theme through these works is that models are already doing what we trained them to do quite well, and it is often the experimenters and practitioners who misunderstand precisely what we trained models to do in the first place. We conclude with a conceptual analysis of how we should study generative models going forward, as models keep improving and new, unanticipated uses and misuses become ever more available. The first half of this dissertation concerns two works, Neural Text Degeneration and Surface Form Competition: two failure modes of generative models that occur when probability is viewed as equivalent to "correctness" in text generation and multiple choice scenarios, respectively. For these works we describe the resultant issues and propose inference methods that largely alleviate them. The second half of this dissertation goes deeper into the question of how generative models of language capture the communicative goals that humans are optimizing: first with Learning to Write, which operationalizes communicative goals into auxiliary search objectives for text decoding, and then with Generative Models as a Complex Systems Science, which presents a framework for studying generative models as NLP shifts to analyzing systems that are often infeasible to replicate. How does a model that is predicting the distribution of next tokens understand, and fail to understand, the structure of an essay? This is precisely the kind of question we must face head-on in the new science of generative models.
- Subject Added Entry-Topical Term
- Computer science.
- Subject Added Entry-Topical Term
- Computer engineering.
- Subject Added Entry-Topical Term
- Linguistics.
- Index Term-Uncontrolled
- Language models
- Index Term-Uncontrolled
- Communicative goals
- Index Term-Uncontrolled
- Text decoding
- Index Term-Uncontrolled
- Analyzing systems
- Index Term-Uncontrolled
- Interpretation errors
- Added Entry-Corporate Name
- University of Washington Computer Science and Engineering
- Host Item Entry
- Dissertations Abstracts International. 85-03A.
- Host Item Entry
- Dissertations Abstracts International
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:640285