Efficient and Reconfigurable Approximate Value Functions for Task Scheduling, Path Planning, and Control.

Detailed Information

Material Type  
 Thesis (dissertation)
Control Number  
0017162958
International Standard Book Number  
9798384340515
Dewey Decimal Classification Number  
330
Main Entry-Personal Name  
Washington, Patrick Henry.
Publication, Distribution, etc. (Imprint)  
[S.l.] : Stanford University., 2024
Publication, Distribution, etc. (Imprint)  
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical Description  
149 p.
General Note  
Source: Dissertations Abstracts International, Volume: 86-03, Section: A.
General Note  
Includes supplementary digital materials.
General Note  
Advisor: Schwager, Mac.
Dissertation Note  
Thesis (Ph.D.)--Stanford University, 2024.
Summary, Etc.  
Task scheduling, path planning, and control are all problems in robotics that involve choosing the best action to take given the current state of the system. For task scheduling, this means observing the condition of the robots and deciding which robot should do which task and when. Path planning involves choosing how the robot should navigate the space, and control deals with choosing inputs to best follow that plan. The concept that links these ideas is that of a value function. This thesis addresses two key challenges in value functions: efficiency and reconfigurability.

First, when dealing with teams of robots, the number of state variables grows quickly and we run into what is known as the curse of dimensionality, where traditional methods of dynamic programming have trouble finding exact solutions because of the number of possible states in the system. This is exemplified in task scheduling problems, where there are teams of robots that need to manage various tasks. The task scheduling problem presented in this thesis is the persistent surveillance problem. It involves several robots deciding when to charge while maintaining surveillance coverage over a region. With each robot having a position and battery level, along with the surveillance position moving over time, it is intractable to find an exact value function and corresponding policy. We first demonstrate a modified Monte Carlo Tree Search algorithm that estimates the value of actions through model predictive control ideas. This method uses the idea that in a scenario where most action sequences fail, searching for any successful performance can outperform searching for the best expected performance. We then present Reduced State Value Iteration, an algorithm that builds on other approximate value iteration ideas to simplify the decision-making state space and efficiently solve for an approximate value function. It leverages knowledge of the problem's structure to vastly reduce the number of states relevant to decision making.

Second, many algorithms that develop value functions for path planning and control suffer from an inability to be reconfigured in the event of a change to the problem. Sometimes this comes in the form of finding a single trajectory from predefined start and end points and building a feedback law to follow that trajectory. However, if either point changes or the robot strays too far from the desired trajectory, the plan is useless and needs to be recomputed. On the other hand, algorithms that solve for feedback policies over the entire state space are generally slow and unable to adapt to new state or control constraints, requiring extensive recomputation. We introduce GrAVITree, a graph-based planning algorithm that builds a value function and feedback policy for simultaneous path planning and control. It samples backwards in time from the goal and maintains a graph that stores state and control information. One advantage of this is that by sampling backwards in time to branch, we only explore regions of the state space that can actually reach the goal, rather than solving over the whole state space. By storing the graph, we can also easily change the goal, the cost function, and the constraints after solving by editing the graph and reselecting the optimal edges to account for the new changes; thus it is a reconfigurable approximate value function. Another key feature is that we build the graph one step at a time, bypassing the need for a complex local controller to connect points in the graph, instead using derivative-free sampling methods to make local connections. This enables us to use black-box dynamics models, such as those represented by trained neural networks, where we are unable to leverage the structure of the model to aid in building the value function.

Finally, we introduce a method that augments GrAVITree by determining the valid region of the dynamics model for image-based systems. Such systems use autoencoders to map image outputs into low-dimensional latent states. These state spaces are convenient due to their dimension but are not easily interpreted by humans. As such, there is not always a natural way to determine what latent states are actually valid and do not violate a constraint or correspond to an image.
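The abstract's central object, a value function solved by dynamic programming, can be illustrated with a minimal tabular value-iteration sketch. This is textbook Bellman iteration, not the thesis's Reduced State Value Iteration; the toy grid world, rewards, and discount factor are all invented for illustration.

```python
# Generic tabular value iteration on a toy 1-D grid world (illustrative only).
GAMMA = 0.9          # discount factor (assumed)
N = 5                # states 0..4; state 4 is the goal
ACTIONS = (-1, +1)   # move left or move right

def step(s, a):
    """Deterministic toy dynamics: move and clamp to the grid."""
    return max(0, min(N - 1, s + a))

def reward(s):
    """+1 for being at the goal state, 0 elsewhere."""
    return 1.0 if s == N - 1 else 0.0

# Repeat Bellman backups until the value table stops changing.
V = [0.0] * N
while True:
    V_new = [max(reward(step(s, a)) + GAMMA * V[step(s, a)] for a in ACTIONS)
             for s in range(N)]
    if max(abs(x - y) for x, y in zip(V, V_new)) < 1e-9:
        break
    V = V_new

# The policy falls out of the value function: pick the action whose
# successor state has the highest (reward + discounted value).
policy = [max(ACTIONS, key=lambda a, s=s: reward(step(s, a)) + GAMMA * V[step(s, a)])
          for s in range(N)]
```

The curse of dimensionality the abstract describes is visible here: the table `V` has one entry per state, so adding robots (each with position and battery level) multiplies the table size until exact iteration becomes intractable.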
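The backward-from-goal, derivative-free construction described for GrAVITree can be sketched generically as follows. This is not the dissertation's algorithm; the single-integrator stand-in dynamics, sampling ranges, acceptance threshold, and cost bookkeeping are all invented for the sketch. The only ideas carried over from the abstract are (a) growing from the goal so every stored state can reach it, and (b) making local connections by sampling rather than by differentiating the (black-box) model.

```python
# Toy backward tree growth from a goal state (illustrative only).
import random

random.seed(0)

def dynamics(x, u, dt=0.1):
    """Black-box forward model; here a 2-D single-integrator stand-in."""
    return (x[0] + dt * u[0], x[1] + dt * u[1])

GOAL = (0.0, 0.0)
tree = {GOAL: (0.0, None)}  # state -> (cost-to-go, control applied there)

for _ in range(300):
    # Branch backwards in time from a random existing node.
    x_child, (cost_child, _) = random.choice(list(tree.items()))
    best = None
    for _ in range(30):
        # Derivative-free local connection: guess a predecessor state and a
        # control, score by how close forward simulation gets to the child.
        x_prev = (x_child[0] + random.uniform(-0.2, 0.2),
                  x_child[1] + random.uniform(-0.2, 0.2))
        u = (random.uniform(-1, 1), random.uniform(-1, 1))
        x_fwd = dynamics(x_prev, u)
        err = abs(x_fwd[0] - x_child[0]) + abs(x_fwd[1] - x_child[1])
        if best is None or err < best[0]:
            best = (err, x_prev, u)
    err, x_prev, u = best
    if err < 0.05:  # accept only connections that actually reach the child
        cost = cost_child + 0.1 * (abs(u[0]) + abs(u[1]))
        x_prev = (round(x_prev[0], 2), round(x_prev[1], 2))  # coarse dedup
        if x_prev not in tree or cost < tree[x_prev][0]:
            tree[x_prev] = (cost, u)
```

Because every node stores its cost-to-go and control, the dictionary doubles as an approximate value function and feedback policy over the explored region, and edits to costs or constraints amount to rescoring and repruning stored entries rather than re-solving from scratch.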
Subject Added Entry-Topical Term  
Aircraft.
Subject Added Entry-Topical Term  
Scheduling.
Subject Added Entry-Topical Term  
Dynamic programming.
Subject Added Entry-Topical Term  
Planning.
Subject Added Entry-Topical Term  
Decision making.
Subject Added Entry-Topical Term  
Neural networks.
Subject Added Entry-Topical Term  
Robots.
Subject Added Entry-Topical Term  
Military bases.
Subject Added Entry-Topical Term  
Drones.
Subject Added Entry-Topical Term  
Surveillance.
Subject Added Entry-Topical Term  
National parks.
Subject Added Entry-Topical Term  
Robotics.
Subject Added Entry-Topical Term  
Computer science.
Subject Added Entry-Topical Term  
Military studies.
Added Entry-Corporate Name  
Stanford University.
Host Item Entry  
Dissertations Abstracts International. 86-03A.
Electronic Location and Access  
This material is available after logging in.
Control Number  
joongbu:657367

MARC

 008250224s2024        us  ||||||||||||||c||eng  d
■001000017162958
■00520250211152117
■006m          o    d                
■007cr#unu||||||||
■020    ▼a9798384340515
■035    ▼a(MiAaPQ)AAI31460316
■035    ▼a(MiAaPQ)Stanfordmz769gm8320
■040    ▼aMiAaPQ▼cMiAaPQ
■0820  ▼a330
■1001  ▼aWashington, Patrick Henry.
■24510▼aEfficient and Reconfigurable Approximate Value Functions for Task Scheduling, Path Planning, and Control.
■260    ▼a[S.l.]▼bStanford University.▼c2024
■260  1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2024
■300    ▼a149 p.
■500    ▼aSource: Dissertations Abstracts International, Volume: 86-03, Section: A.
■500    ▼aIncludes supplementary digital materials.
■500    ▼aAdvisor: Schwager, Mac.
■5021  ▼aThesis (Ph.D.)--Stanford University, 2024.
■590    ▼aSchool code: 0212.
■650  4▼aAircraft.
■650  4▼aScheduling.
■650  4▼aDynamic programming.
■650  4▼aPlanning.
■650  4▼aDecision making.
■650  4▼aNeural networks.
■650  4▼aRobots.
■650  4▼aMilitary bases.
■650  4▼aDrones.
■650  4▼aSurveillance.
■650  4▼aNational parks.
■650  4▼aRobotics.
■650  4▼aComputer science.
■650  4▼aMilitary studies.
■690    ▼a0771
■690    ▼a0800
■690    ▼a0984
■690    ▼a0750
■71020▼aStanford University.
■7730  ▼tDissertations Abstracts International▼g86-03A.
■790    ▼a0212
■791    ▼aPh.D.
■792    ▼a2024
■793    ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17162958▼nKERIS▼zThe full text of this material is provided by KERIS (Korea Education and Research Information Service).


    Holdings Information

    Registration No. | Call No. | Location | Availability
    TQ0033544 | T | Full-text materials | Viewing available / Printing available
