Discovering the 4D World Behind Any Video.
- Material Type
- Thesis/Dissertation
- Control Number
- 0017163649
- International Standard Book Number
- 9798384448730
- Dewey Decimal Classification Number
- 004
- Main Entry-Personal Name
- Ye, Vickie.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : University of California, Berkeley, 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 118 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-03, Section: B.
- General Note
- Advisor: Kanazawa, Angjoo.
- Dissertation Note
- Thesis (Ph.D.)--University of California, Berkeley, 2024.
- Summary, Etc.
- As we begin to interact with AI systems, we need them to be able to interpret the visual world in 4D, that is, to perceive the geometry and motion in the world. However, pixel differences in image space result from either geometry (via camera motion) or scene motion in the world. Disentangling these two sources from a single video is extremely under-constrained. In this thesis, I build several systems that recover scene representations from limited image observations. Specifically, I study a series of problems that build toward the 4D monocular recovery problem, each one addressing a different aspect of the under-constrained nature of the problem. First, I study the problem of recovering shape from under-constrained inputs, without scene motion. Specifically, I present pixelNeRF, a method to synthesize novel views of a static scene from single or few views. We learn a scene prior by training a 3D neural representation conditioned on image features across multiple scenes. This learned scene prior enables 3D scene completion from the under-constrained inputs of single or few images. Next, I study the problem of recovering motion without 3D shape. In particular, I present Deformable Sprites, a method to extract persistent elements of a dynamic scene from an input video. We represent each element as a 2D image layer that deforms across the video. Finally, I present two studies of the joint recovery of both the shape and motion of the 4D world from any single video. I first study the special case of dynamic humans and present SLAHMR, in which we recover from a single video the global poses of all the humans and the camera in the world coordinate frame. I then move on to the general case of recovering any dynamic objects from a single video in Shape of Motion, in which we recover the entire scene as 4D Gaussians, which we can use for dynamic novel view synthesis and 3D tracking.
- Subject Added Entry-Topical Term
- Computer science.
- Subject Added Entry-Topical Term
- Computer engineering.
- Subject Added Entry-Topical Term
- Information technology.
- Index Term-Uncontrolled
- 3D reconstruction
- Index Term-Uncontrolled
- Computer graphics
- Index Term-Uncontrolled
- Computer vision
- Index Term-Uncontrolled
- Video understanding
- Index Term-Uncontrolled
- 4D monocular recovery problem
- Added Entry-Corporate Name
- University of California, Berkeley Electrical Engineering & Computer Sciences
- Host Item Entry
- Dissertations Abstracts International. 86-03B.
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:658426