Empowering Large Language Models With Efficient and Automated Systems.
- Material Type
- Dissertation
- Control Number
- 0017161854
- International Standard Book Number
- 9798384449218
- Dewey Decimal Classification Number
- 004
- Main Entry-Personal Name
- Li, Zhuohan.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : University of California, Berkeley, 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 153 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-03, Section: A.
- General Note
- Advisor: Stoica, Ion.
- Dissertation Note
- Thesis (Ph.D.)--University of California, Berkeley, 2024.
- Summary, Etc.
- Large Language Models (LLMs) have shown remarkable capabilities in a variety of tasks, including chatting, programming, and searching. However, the high costs of LLMs are preventing these models from being deployed for the vast majority of applications. In this dissertation, we focus on building efficient and automated systems to reduce costs and democratize access to large language models. We first introduce systems to optimize computational efficiency and reduce the engineering overhead for distributed LLM training. We develop TeraPipe, which proposes a new dimension to perform pipeline parallel training for LLMs, and also Alpa, the world's first compiler capable of automatically distributing arbitrary neural networks with all existing parallelization methods. While training is typically a one-time cost, deploying and serving an LLM requires running LLM inference continuously, which is the top blocker for the real-world deployment of LLMs. We improve the serving scalability with AlpaServe through model parallelism, and increase the memory utilization and the LLM inference throughput with a new attention algorithm, PagedAttention, and an end-to-end serving system, vLLM. Overall, these systems provide comprehensive solutions that significantly improve both training and inference efficiency for large language models. Together, these systems lower the high costs associated with large language models, democratizing their deployment across various real-world applications.
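The abstract's central serving idea, PagedAttention, manages the KV cache in fixed-size blocks rather than one contiguous reservation per sequence. The sketch below is a hypothetical toy allocator illustrating that idea only; the class name, `BLOCK_SIZE`, and all methods are illustrative assumptions, not the vLLM implementation.

```python
# Toy sketch of a paged KV cache: each sequence's cache is split into
# fixed-size blocks drawn from a shared physical pool, so memory grows
# on demand instead of being reserved contiguously up front.

BLOCK_SIZE = 4  # tokens per block (illustrative)

class PagedKVCache:
    def __init__(self, num_physical_blocks):
        self.free_blocks = list(range(num_physical_blocks))
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.seq_lens = {}      # seq_id -> number of tokens cached

    def append_token(self, seq_id):
        """Reserve cache space for one new token of a sequence."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.seq_lens.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:  # current block full (or none yet)
            if not self.free_blocks:
                raise MemoryError("KV cache pool exhausted")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1

    def physical_slot(self, seq_id, pos):
        """Translate a logical token position to a (block, offset) slot."""
        block = self.block_tables[seq_id][pos // BLOCK_SIZE]
        return block, pos % BLOCK_SIZE

    def free(self, seq_id):
        """Return a finished sequence's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)
```

Because blocks return to a shared pool when a sequence finishes, many concurrent sequences can share physical memory without per-sequence worst-case reservations, which is the utilization gain the abstract refers to.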
- Subject Added Entry-Topical Term
- Computer science.
- Subject Added Entry-Topical Term
- Linguistics.
- Subject Added Entry-Topical Term
- Information technology.
- Index Term-Uncontrolled
- Deep learning
- Index Term-Uncontrolled
- Distributed systems
- Index Term-Uncontrolled
- Large language models
- Index Term-Uncontrolled
- Machine learning
- Added Entry-Corporate Name
- University of California, Berkeley Electrical Engineering & Computer Sciences
- Host Item Entry
- Dissertations Abstracts International. 86-03A.
- Electronic Location and Access
- This material can be viewed after logging in.
- Control Number
- joongbu:654578