Vision-Based Manipulation In-The-Wild.

Detailed Information

Material Type  
 Thesis (Dissertation)
Control Number  
0017162576
International Standard Book Number  
9798383200384
Dewey Decimal Classification Number  
004
Main Entry-Personal Name  
Chi, Cheng.
Publication, Distribution, etc. (Imprint)  
[S.l.] : Columbia University., 2024
Publication, Distribution, etc. (Imprint)  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical Description  
131 p.
General Note  
Source: Dissertations Abstracts International, Volume: 86-01, Section: B.
General Note  
Advisor: Song, Shuran; Vondrick, Carl.
Dissertation Note  
Thesis (Ph.D.)--Columbia University, 2024.
Summary, Etc.  
Deploying robots in real-world environments involves immense engineering complexity, potentially surpassing the resources required for autonomous vehicles due to the increased dimensionality and task variety. To maximize the chances of successful real-world deployment, finding a simple solution that minimizes engineering complexity at every level, from hardware to algorithm to operations, is crucial. In this dissertation, we consider a vision-based manipulation system that can be deployed in-the-wild when trained to imitate sufficient quantity and diversity of human demonstration data on the desired task. At deployment time, the robot is driven by a single diffusion-based visuomotor policy, with raw RGB images as input and robot end-effector pose as output. Compared to existing policy representations, Diffusion Policy handles multimodal action distributions gracefully, being scalable to high-dimensional action spaces and exhibiting impressive training stability. These properties allow a single software system to be used for multiple tasks, with data collected by multiple demonstrators, deployed to multiple robot embodiments, and without significant hyperparameter tuning. We developed a Universal Manipulation Interface (UMI), a portable, low-cost, and information-rich data collection system to enable direct manipulation skill learning from in-the-wild human demonstrations. UMI provides an intuitive interface for non-expert users by using hand-held grippers with mounted GoPro cameras. Compared to existing robotic data collection systems, UMI enables robotic data collection without needing a robot, drastically reducing the engineering and operational complexity.
Trained with UMI data, the resulting diffusion policies can be deployed across multiple robot platforms in unseen environments for novel objects and to complete dynamic, bimanual, precise, and long-horizon tasks. The Diffusion Policy and UMI combination provides a simple full-stack solution to many manipulation problems. The turn-around time of building a single-task manipulation system (such as object tossing and cloth folding) can be reduced from a few months to a few days.
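The abstract describes a policy that maps raw RGB observations to end-effector actions by iteratively denoising a sampled action sequence. As an illustrative sketch only (not the dissertation's actual code), the inference loop of such a diffusion-style visuomotor policy can be outlined as follows; the encoder, noise model, dimensions, and step count here are all stand-in assumptions:

```python
# Illustrative sketch of diffusion-policy inference (hypothetical stand-ins,
# not the dissertation's implementation): starting from Gaussian noise, an
# action sequence is refined over several denoising steps, conditioned on
# features extracted from an RGB observation.
import numpy as np

rng = np.random.default_rng(0)

ACTION_DIM = 7   # assumed: 6-DoF end-effector pose + gripper width
HORIZON = 8      # assumed: action steps predicted per inference call
STEPS = 10       # assumed: number of denoising iterations

def encode_image(rgb):
    """Stand-in visual encoder: flatten and normalize the RGB frame."""
    return rgb.reshape(-1)[:32] / 255.0

def noise_model(actions, obs_feat, t):
    """Stand-in for a trained network that predicts the noise at step t.

    A real policy would use a learned U-Net or transformer; here we simply
    compute the offset from an observation-dependent target, so repeated
    denoising steps pull the actions toward that target.
    """
    target = np.tile(obs_feat[:ACTION_DIM], (HORIZON, 1))
    return actions - target

def sample_actions(rgb):
    """Run the reverse (denoising) process to produce an action sequence."""
    obs_feat = encode_image(rgb)
    actions = rng.standard_normal((HORIZON, ACTION_DIM))  # pure noise start
    for t in reversed(range(STEPS)):
        eps = noise_model(actions, obs_feat, t)
        actions = actions - (1.0 / STEPS) * eps           # one denoising step
    return actions

obs = rng.integers(0, 256, size=(8, 8, 3))                # toy RGB frame
plan = sample_actions(obs)
print(plan.shape)  # (8, 7)
```

Iterating a denoising step instead of regressing a single action is what lets this family of policies represent multimodal action distributions, as the abstract notes.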
Subject Added Entry-Topical Term  
Computer science.
Subject Added Entry-Topical Term  
Computer engineering.
Index Term-Uncontrolled  
Universal Manipulation Interface
Index Term-Uncontrolled  
Vision manipulation
Index Term-Uncontrolled  
Diffusion Policy
Added Entry-Corporate Name  
Columbia University Computer Science
Host Item Entry  
Dissertations Abstracts International. 86-01B.
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:654033

MARC

 008250224s2024        us  ||||||||||||||c||eng  d
■001000017162576
■00520250211152028
■006m          o    d                
■007cr#unu||||||||
■020    ▼a9798383200384
■035    ▼a(MiAaPQ)AAI31333675
■040    ▼aMiAaPQ▼cMiAaPQ
■0820  ▼a004
■1001  ▼aChi, Cheng.
■24510▼aVision-Based Manipulation In-The-Wild.
■260    ▼a[S.l.]▼bColumbia University.▼c2024
■260  1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2024
■300    ▼a131 p.
■500    ▼aSource: Dissertations Abstracts International, Volume: 86-01, Section: B.
■500    ▼aAdvisor: Song, Shuran; Vondrick, Carl.
■5021  ▼aThesis (Ph.D.)--Columbia University, 2024.
■520    ▼aDeploying robots in real-world environments involves immense engineering complexity, potentially surpassing the resources required for autonomous vehicles due to the increased dimensionality and task variety. To maximize the chances of successful real-world deployment, finding a simple solution that minimizes engineering complexity at every level, from hardware to algorithm to operations, is crucial. In this dissertation, we consider a vision-based manipulation system that can be deployed in-the-wild when trained to imitate sufficient quantity and diversity of human demonstration data on the desired task. At deployment time, the robot is driven by a single diffusion-based visuomotor policy, with raw RGB images as input and robot end-effector pose as output. Compared to existing policy representations, Diffusion Policy handles multimodal action distributions gracefully, being scalable to high-dimensional action spaces and exhibiting impressive training stability. These properties allow a single software system to be used for multiple tasks, with data collected by multiple demonstrators, deployed to multiple robot embodiments, and without significant hyperparameter tuning. We developed a Universal Manipulation Interface (UMI), a portable, low-cost, and information-rich data collection system to enable direct manipulation skill learning from in-the-wild human demonstrations. UMI provides an intuitive interface for non-expert users by using hand-held grippers with mounted GoPro cameras. Compared to existing robotic data collection systems, UMI enables robotic data collection without needing a robot, drastically reducing the engineering and operational complexity. Trained with UMI data, the resulting diffusion policies can be deployed across multiple robot platforms in unseen environments for novel objects and to complete dynamic, bimanual, precise, and long-horizon tasks. The Diffusion Policy and UMI combination provides a simple full-stack solution to many manipulation problems. The turn-around time of building a single-task manipulation system (such as object tossing and cloth folding) can be reduced from a few months to a few days.
■590    ▼aSchool code: 0054.
■650  4▼aComputer science.
■650  4▼aComputer engineering.
■653    ▼aUniversal Manipulation Interface
■653    ▼aVision manipulation
■653    ▼aDiffusion Policy
■690    ▼a0984
■690    ▼a0464
■71020▼aColumbia University▼bComputer Science.
■7730  ▼tDissertations Abstracts International▼g86-01B.
■790    ▼a0054
■791    ▼aPh.D.
■792    ▼a2024
■793    ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17162576▼nKERIS▼zThe full text of this material is provided by KERIS (Korea Education and Research Information Service).

Holdings
Book number: TQ0033004
Call number: T
Location: Full-text material
Status: Viewable / printable