Exploiting Cross-lingual Representations for Natural Language Processing
- Material Type
- Thesis/Dissertation
- Control Number
- 0015490782
- International Standard Book Number
- 9781085565288
- Dewey Decimal Classification Number
- 004
- Main Entry-Personal Name
- Upadhyay, Shyam.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : University of Pennsylvania, 2019
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2019
- Physical Description
- 210 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 81-02, Section: B.
- General Note
- Advisor: Roth, Dan.
- Dissertation Note
- Thesis (Ph.D.)--University of Pennsylvania, 2019.
- Restrictions on Access Note
- This item must not be sold to any third party vendors.
- Summary, Etc.
- Traditional approaches to supervised learning require a generous amount of labeled data for good generalization. While such annotation-heavy approaches have proven useful for some Natural Language Processing (NLP) tasks in high-resource languages (like English), they are unlikely to scale to languages where collecting labeled data is difficult and time-consuming. Translating supervision available in English is also not a viable solution, because developing a good machine translation system requires expensive-to-annotate resources which are not available for most languages. In this thesis, I argue that cross-lingual representations are an effective means of extending NLP tools to languages beyond English without resorting to generous amounts of annotated data or expensive machine translation. These representations can be learned in an inexpensive manner, often from signals completely unrelated to the task of interest. I begin with a review of different ways of inducing such representations using a variety of cross-lingual signals and study algorithmic approaches of using them in a diverse set of downstream tasks. Examples of such tasks covered in this thesis include learning representations to transfer a trained model across languages for document classification, assist in monolingual lexical semantics like word sense induction, identify asymmetric lexical relationships like hypernymy between words in different languages, or combine supervision across languages through a shared feature space for cross-lingual entity linking. In all these applications, the representations make information expressed in other languages available in English, while requiring minimal additional supervision in the language of interest.
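The core idea of the abstract, a model trained on English that transfers to another language through a shared embedding space, can be illustrated with a minimal sketch. All data below is toy and hypothetical (hand-made 2-d "cross-lingual embeddings" and a trivial centroid classifier), not the thesis's actual method; it only shows why zero-shot transfer works when translation pairs land near each other in one vector space.

```python
# Toy sketch of zero-shot cross-lingual document classification via a
# shared embedding space. The embeddings below are hand-made assumptions
# for illustration: translation pairs (economy/economia, match/partido)
# are placed near each other in one 2-d space.
import numpy as np

emb = {
    "economy": np.array([0.9, 0.1]), "economia": np.array([0.88, 0.12]),
    "match":   np.array([0.1, 0.9]), "partido":  np.array([0.12, 0.88]),
}

def doc_vec(tokens):
    """Average the word vectors of known tokens into one document vector."""
    return np.mean([emb[t] for t in tokens if t in emb], axis=0)

# "Train" a trivial nearest-centroid classifier on English documents only.
train = {"business": ["economy"], "sports": ["match"]}
centroids = {label: doc_vec(toks) for label, toks in train.items()}

def classify(tokens):
    """Assign the label whose centroid is closest in the shared space."""
    v = doc_vec(tokens)
    return min(centroids, key=lambda c: np.linalg.norm(v - centroids[c]))

# The same model labels Spanish input with no Spanish supervision,
# because "partido" sits near "match" in the shared space.
print(classify(["partido"]))  # → sports
```

A real system would replace the toy dictionary with learned cross-lingual word or sentence embeddings and the centroid rule with a trained classifier, but the transfer mechanism is the same: supervision attaches to regions of the shared space, not to a particular language.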
- Subject Added Entry-Topical Term
- Computer science
- Added Entry-Corporate Name
- University of Pennsylvania Computer and Information Science
- Host Item Entry
- Dissertations Abstracts International. 81-02B.
- Host Item Entry
- Dissertations Abstracts International
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:566985