Generative Models of Vision and Action.
Material Type  
Thesis (Dissertation)
Control Number  
0017163752
International Standard Book Number  
9798342107396
Dewey Decimal Classification Number  
620
Main Entry-Personal Name  
Gupta, Agrim.
Publication, Distribution, etc. (Imprint)  
[S.l.] : Stanford University, 2024
Publication, Distribution, etc. (Imprint)  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical Description  
129 p.
General Note  
Source: Dissertations Abstracts International, Volume: 86-04, Section: B.
General Note  
Advisor: Li, Fei-Fei.
Dissertation Note  
Thesis (Ph.D.)--Stanford University, 2024.
Summary, Etc.  
Animals and humans display a remarkable ability to build internal representations of the world and to use them to simulate, evaluate, and select among different possible actions. This capability is learned primarily from observation and without any supervision. Endowing autonomous agents with similar capabilities is a fundamental challenge in machine learning. In this thesis I will explore new algorithms that enable scalable representation learning from videos via prediction, generative models of visual data, and their applications to robotics. To begin, I will discuss the challenges associated with using predictive learning objectives to learn visual representations. I'll introduce a simple predictive learning architecture and objective that enables learning visual representations capable of solving a wide range of visual correspondence tasks in a zero-shot manner. Subsequently, I'll present a transformer-based approach for photorealistic video generation via diffusion modeling. Our approach jointly compresses images and videos within a unified latent space, enabling training and generation across modalities. Finally, I will illustrate the practical applications of generative models for robot learning. Our non-autoregressive, action-conditioned video generation model can act as a world model, enabling embodied agents to plan using visual model-predictive control. Furthermore, I'll showcase a generalist agent trained via next-token prediction to learn from diverse robotic experiences across various robots and tasks.
Subject Added Entry-Topical Term  
Robots.
Subject Added Entry-Topical Term  
Success.
Subject Added Entry-Topical Term  
Failure analysis.
Subject Added Entry-Topical Term  
Video recordings.
Subject Added Entry-Topical Term  
Semantics.
Subject Added Entry-Topical Term  
Film studies.
Subject Added Entry-Topical Term  
Logic.
Subject Added Entry-Topical Term  
Robotics.
Added Entry-Corporate Name  
Stanford University.
Host Item Entry  
Dissertations Abstracts International. 86-04B.
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:657555
Holdings  
Registration No.  Call No.  Collection  Status  
TQ0033777  T  Full-text material  Viewable / printable
