Reinforcement Learning for Autonomous Self-Improving Robotic Systems.
- Material Type
- Thesis (Dissertation)
- Control Number
- 0017164887
- International Standard Book Number
- 9798346382348
- Dewey Decimal Classification Number
- 620
- Main Entry-Personal Name
- Sharma, Archit.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : Stanford University, 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 142 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-05, Section: B.
- General Note
- Advisor: Finn, Chelsea.
- Dissertation Note
- Thesis (Ph.D.)--Stanford University, 2024.
- Summary, Etc.
- A generally capable robotic system that can solve a wide variety of tasks in a diverse set of environments has been an aspirational goal; achieving it would allow robots to go from structured environments such as industrial supply chains to unstructured environments such as homes, offices, and restaurants. The recent success of large language models [Brown et al., 2020, Touvron et al., 2023, Team et al., 2023], among several others, indicates that broad language abilities and tasks can be learned by training on large amounts of natural language data, usually on the order of trillions of words. This has inspired similar efforts in the context of robotics, where robotic interaction data from several scenes, robots, and tasks have been consolidated [Fang et al., 2023, Padalkar et al., 2023, Khazatsky et al., 2024] to train more broadly capable robotic agents [Brohan et al., 2022, 2023, Kim et al., 2024]. While there are emergent signs of generalization to new objects, scenes, and even tasks, the scale of training data available is much smaller than that for other modalities such as language, images, and videos. The scale and diversity of robotic data are limited because data collection in recent robotic datasets is driven by human teleoperation. If robots could interact with their environments autonomously with minimal human supervision, both the number of robots collecting data and the volume of data they collect would be easier to scale up. Behavior cloning (BC) has delivered remarkable robot learning results recently [Chi et al., 2023, Zhao et al., 2023, Shi et al., 2023], but autonomous data collection requires moving beyond BC, as the data no longer consists of human-supervised robotic interactions and may contain substantially suboptimal interactions.
Reinforcement learning (RL) provides a natural learning-based framework for trial-and-error learning in the presence of such suboptimal interactions, and has been used successfully for robot learning [Levine et al., 2016, Kalashnikov et al., 2018, 2021]. However, standard RL algorithms are often developed for episodic settings, where the environment is reset to allow the robot to try the task again. This introduces the reset problem: standard RL algorithms require a human to supervise robot training in the real world and reset the environment after every trial [Han et al., 2015a, Eysenbach et al., 2018a, Zhu et al., 2020b, Xu et al., 2020b, Gupta et al., 2021b]. As a result, standard RL algorithms are not amenable to autonomous data collection. This dissertation addresses the reset problem, allowing us to construct robotic systems that can collect data with a high degree of autonomy and self-improve from the collected data. In Chapter 2, we first formalize the problem setting of autonomous reinforcement learning, where a robotic agent has to learn from autonomous interactions with the environment, i.e., with minimal human supervision to reset the environment. We distinguish between two objectives that an agent may care about: the continuing setting, where the goal is to accumulate as much reward as possible during the agent's lifetime, and the deployment setting, where the goal is to maximize the performance of the final policy used for deployment after training.
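The reset problem described above can be illustrated with a minimal sketch. This is not code from the dissertation; the environment, policy, and function names are hypothetical placeholders. The point is structural: a standard episodic training loop calls `env.reset()` before every trial, and in real-world robot learning each such call corresponds to a human physically resetting the scene.

```python
# Illustrative sketch of the reset problem (hypothetical names, not from
# the dissertation): episodic RL assumes the environment can be reset
# after every trial, which in the real world requires human supervision.
import random

class SimpleEnv:
    """Toy 1-D environment: the agent tries to reach position 0 from +5."""
    def __init__(self):
        self.pos = 5

    def reset(self):
        # In a real robotic setting, this is the step a human performs.
        self.pos = 5
        return self.pos

    def step(self, action):
        # action is -1 or +1; reward 1.0 only when the goal is reached.
        self.pos += action
        reward = 1.0 if self.pos == 0 else 0.0
        done = self.pos == 0
        return self.pos, reward, done

def policy(state):
    # Placeholder policy: move toward 0 most of the time.
    return -1 if state > 0 and random.random() < 0.9 else 1

def episodic_training(num_episodes=10, max_steps=50):
    """Standard episodic RL loop: one env.reset() per episode."""
    env = SimpleEnv()
    resets, total_reward = 0, 0.0
    for _ in range(num_episodes):
        state = env.reset()          # <-- human intervention needed here
        resets += 1
        for _ in range(max_steps):
            state, reward, done = env.step(policy(state))
            total_reward += reward
            if done:
                break
    return resets, total_reward

resets, _ = episodic_training()
print(resets)  # one reset per episode -> 10
```

Autonomous reinforcement learning, as formalized in Chapter 2 of the dissertation, asks how an agent can keep learning when those `env.reset()` calls are unavailable or severely rationed.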
- Subject Added Entry-Topical Term
- Robots.
- Subject Added Entry-Topical Term
- Sensitivity analysis.
- Subject Added Entry-Topical Term
- Benchmarks.
- Subject Added Entry-Topical Term
- Robotics.
- Added Entry-Corporate Name
- Stanford University.
- Host Item Entry
- Dissertations Abstracts International. 86-05B.
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:655971