Generative Models of Vision and Action.
Material Type  
Thesis (Dissertation)
Control Number  
0017163752
International Standard Book Number  
9798342107396
Dewey Decimal Classification Number  
620
Main Entry-Personal Name  
Gupta, Agrim.
Publication, Distribution, etc. (Imprint)  
[S.l.] : Stanford University, 2024
Publication, Distribution, etc. (Imprint)  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical Description  
129 p.
General Note  
Source: Dissertations Abstracts International, Volume: 86-04, Section: B.
General Note  
Advisor: Li, Fei-Fei.
Dissertation Note  
Thesis (Ph.D.)--Stanford University, 2024.
Summary, Etc.  
Animals and humans display a remarkable ability to build internal representations of the world and to use them to simulate, evaluate, and select among different possible actions. This capability is learned primarily from observation and without any supervision. Endowing autonomous agents with similar capabilities is a fundamental challenge in machine learning. In this thesis I will explore new algorithms that enable scalable representation learning from videos via prediction, generative models of visual data, and their applications to robotics. To begin, I will discuss the challenges associated with using predictive learning objectives to learn visual representations. I'll introduce a simple predictive learning architecture and objective that enables learning visual representations capable of solving a wide range of visual correspondence tasks in a zero-shot manner. Subsequently, I'll present a transformer-based approach for photorealistic video generation via diffusion modeling. Our approach jointly compresses images and videos within a unified latent space, enabling training and generation across modalities. Finally, I will illustrate the practical applications of generative models for robot learning. Our non-autoregressive, action-conditioned video generation model can act as a world model, enabling embodied agents to plan using visual model-predictive control. Furthermore, I'll showcase a generalist agent trained via next-token prediction to learn from diverse robotic experiences across various robots and tasks.
Subject Added Entry-Topical Term  
Robots.
Subject Added Entry-Topical Term  
Success.
Subject Added Entry-Topical Term  
Failure analysis.
Subject Added Entry-Topical Term  
Video recordings.
Subject Added Entry-Topical Term  
Semantics.
Subject Added Entry-Topical Term  
Film studies.
Subject Added Entry-Topical Term  
Logic.
Subject Added Entry-Topical Term  
Robotics.
Added Entry-Corporate Name  
Stanford University.
Host Item Entry  
Dissertations Abstracts International. 86-04B.
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:657555
Holdings  
Registration No.  Call No.  Collection  Status  
TQ0033777  T  Full-text material  Viewable / printable
