
Hierarchical mission control of automata with human supervision

Prof. David A. Castañon, Boston University

Problem of Interest

• Coordination of heterogeneous teams to accomplish tasks in uncertain, risky environments

- Vehicles with different capabilities, resources
  - Some resources are renewable (sensors), others are not
- Tasks are spatially distributed, require combinations of capabilities
- Successful completion of tasks not guaranteed
  - Likelihood of success depends on resources assigned
- Tasks arrive, depart randomly
- Task types may be unknown until observed
- Vehicles may fail randomly, depending on trajectories

• Key aspect: Real-time adaptation to events

• Human Supervision
- Determine task priority/value
- Modify individual vehicle task assignments when desired
- Determine specific vehicle schedules when desired

Problem Illustration

Experiment model

• Multiple robots search for and perform tasks at BU’s Mechatronics Lab

Why is this a hard problem

• Uncertain environment and dynamics
- Unknown targets
- Uncertain effectiveness of sensing, actions
- Requires highly adaptive system, anticipative of and responsive to new information
- Hedge against loss of assets, new arrivals, action failures, …

• Diverse set of vehicles with multiple capabilities
- Dynamic role selection, ad hoc teaming

• Dual control problems: Manage both information acquisition and action
- Trade off search and sensing versus actions
- Dynamic coupling of available capabilities to achieve desired effects

• Support and adapt to human control inputs
- Goals, constraints, fixed decisions
- Provide information to assess effects of changes

Classes of algorithms

• Operations Research
- Deterministic and stochastic multi-vehicle task assignment and scheduling
  - Large vehicles, small tasks, limited cooperation, homogeneous activities
  - No risk, limited uncertainty to new task arrivals, departures independent of vehicle actions
- Search theory and sensor management
- Large-scale resource allocation and integer programming

• Stochastic Control
- Control of stochastic queuing systems in communications
- Single vehicle routing and low level vehicle trajectory control
- Swarm control approaches with stability and performance guarantees
  - Homogeneous vehicles
- Approximate dynamic programming techniques
  - Not focused on combinatorial optimization in general, rare exceptions
- Model predictive control of complex stochastic systems

• Artificial Intelligence/Computer Science
- Constraint satisfaction, temporal planning systems
  - Non-real time, off-line combinatorial constraint-based search
  - Limited incorporation of risk/reward, information dynamics
- Behavioral control in robotics for simple tasks
- Reinforcement learning for stochastic planning in well-defined repeated environments (e.g. games)

Proposed Approach: Hierarchical Model Predictive Control

• Hierarchical approach: avoid combinatorial explosion of complexity through decomposition

Team strategy selection: address uncertainty
- Allocate team capabilities to tasks, hedging against task type uncertainty, new task arrivals, action success probabilities
- Simplify distribution of resources across vehicles

Team activity scheduling: address combinatorial complexity
- Allocate team activities to platforms
- Select schedules and routes

• Model Predictive Control: re-solve algorithms in response to new information or human directives
- Receding horizon control
- Respond to new tasks, changes in task status, platform loss, …
- Adapt to human guidance and constraints
- Requires fast algorithms for real-time control
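The receding horizon idea can be illustrated with a minimal sketch: plan from the currently observed state, implement only the current stage, observe outcomes, then plan again. Everything named here is an assumption made for illustration, not the presentation's algorithm: `greedy_plan` is a hypothetical stand-in planner, the rule that spreads the remaining resources evenly over the remaining stages is a hypothetical reserve rule, and each resource is assumed to complete its task independently with probability `probs[i]`.

```python
import random

def greedy_plan(values, probs, active, budget):
    """Hypothetical stand-in planner: hand out resources one at a time to the
    open task with the largest marginal drop in expected remaining value."""
    alloc = {i: 0 for i in active}

    def gain(i):
        # Expected value recovered by one more resource on task i.
        return values[i] * probs[i] * (1.0 - probs[i]) ** alloc[i]

    for _ in range(budget):
        if not alloc:
            break
        best = max(sorted(alloc), key=gain)
        if gain(best) <= 0.0:
            break  # no remaining task benefits from another resource
        alloc[best] += 1
    return alloc

def receding_horizon(values, probs, total_resources, stages, rng):
    """Re-plan at every stage from the observed state; execute one stage only."""
    active = set(range(len(values)))
    remaining = total_resources
    log = []
    for k in range(stages):
        # Hypothetical reserve rule: spread what is left over the stages left.
        budget_now = remaining // (stages - k)
        plan = greedy_plan(values, probs, active, budget_now)
        remaining -= sum(plan.values())
        # Observe which tasks were completed during this stage.
        done = {i for i, n in plan.items()
                if rng.random() < 1.0 - (1.0 - probs[i]) ** n}
        active -= done
        log.append((k, plan, sorted(done)))
    return active, remaining, log

if __name__ == "__main__":
    still_open, left, log = receding_horizon(
        values=[10.0, 4.0, 1.0], probs=[0.6, 0.5, 0.3],
        total_resources=6, stages=3, rng=random.Random(1))
    for stage, plan, done in log:
        print(stage, plan, done)
    print("open tasks:", sorted(still_open), "unused resources:", left)
```

Because each stage plans only for tasks still open, resources that would have gone to an already completed task are automatically redirected, which is the adaptation the slide describes.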

Team Strategy Selection

• Stochastic dynamic programming formulation- Multistage formulation, with outcomes observed after each stage

[Figure: resources allocated to Task 1 through Task N across Stage 1, Stage 2, and Stage 3; labels for task Types 1 to 4 and for newly arriving tasks N+1 through N+M.]

Notation

• N tasks i = 1, …, N

• M resource types j = 1, …, M

• Assume independence of all task completion events

• $V_i$: value of task $i$

• $R_j$: cost of using a resource of type $j$

• $x_{ij}$: number of resources of type $j$ assigned to task $i$

• $p_{ij}$: probability that a single resource of type $j$ successfully completes task $i$

• $M_j$: number of resources of type $j$

Example: Two-Stage Single Resource Problem

• Define a task completion state after each stage

- Task completion state observed after each stage

• Decisions are now feedback policies

• Task completion state dynamics: Controlled Markov chain
- Resources assigned determine transition probabilities
- Independence of completion event outcomes decouples transition dynamics across tasks

$\omega_i(k) \in \{0, 1\}$ denotes the completion state of task $i$ after stage $k$

$\omega(k) = \{\omega_1(k), \ldots, \omega_N(k)\}$ is the overall task completion state after stage $k$

$x_i(k, \omega(k-1))$: resources assigned to task $i$ in stage $k$

$x(k, \omega(k-1))$: vector of resource allocations at stage $k$

$$P\big(\omega_i(k) = 1 \mid \omega_i(k-1) = 1,\ x_i(k, \omega(k-1)) = n\big) = (1 - p_i(k))^n$$
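A small sketch of the controlled Markov chain may help. It assumes the reading that state 1 means a task is still open and state 0 means it is completed, that an open task assigned $n$ resources stays open with probability $(1 - p_i)^n$, and that completed tasks stay completed. The last point and the function names are assumptions made for illustration.

```python
from itertools import product

def prob_still_open(p_i, n):
    """Probability that an open task stays open after n independent attempts,
    each succeeding with probability p_i."""
    return (1.0 - p_i) ** n

def joint_transition_prob(prev_state, next_state, probs, alloc):
    """Probability of moving from prev_state to next_state in one stage.

    States are tuples with 1 = open and 0 = completed (assumed convention).
    Independence across tasks makes the joint probability a product of
    per-task terms.
    """
    prob = 1.0
    for w_prev, w_next, p, n in zip(prev_state, next_state, probs, alloc):
        if w_prev == 0:
            # Assumption: a completed task never reopens.
            prob *= 1.0 if w_next == 0 else 0.0
        else:
            stay = prob_still_open(p, n)
            prob *= stay if w_next == 1 else 1.0 - stay
    return prob

if __name__ == "__main__":
    probs, alloc = [0.5, 0.2], [2, 1]
    for nxt in product((0, 1), repeat=2):
        print(nxt, joint_transition_prob((1, 1), nxt, probs, alloc))
```

The per-task factorization is what later allows strategies that look only at a task's own state to be analyzed separately for each task.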

Two-Stage Problem Statement

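• Objective: minimize expected uncompleted task value plus expected resource use costs

$$\min_{\{x(1),\, x(2, \omega(1))\}} E\left\{ \sum_{i=1}^{N} \Big[ V_i\, I\{\omega_i(2) = 1\} + R_1 \big( x_i(1) + x_i(2, \omega(1)) \big) \Big] \right\}$$

• Constraints: Resource limits

$$\sum_{i=1}^{N} \big( x_i(1) + x_i(2, \omega(1)) \big) \le M_1 \quad \text{for all outcomes } \omega(1)$$

$$x_i(1),\ x_i(2, \omega(1)) \in \{0, 1, \ldots, M_1\}$$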
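For very small instances the two-stage problem can be solved by brute force, which makes the value of feedback concrete. The sketch below is an enumeration for illustration only, not the fast algorithm the presentation refers to. It assumes a single resource type with unit cost `cost`, the same success probability `probs[i]` in both stages, all tasks open at the start, and state 1 meaning "still open"; the function names are hypothetical.

```python
from itertools import product

def allocations(n_tasks, budget):
    """All nonnegative integer vectors of length n_tasks summing to at most budget."""
    return [x for x in product(range(budget + 1), repeat=n_tasks)
            if sum(x) <= budget]

def second_stage_cost(state, values, probs, cost, budget):
    """Best expected cost-to-go once the stage-1 outcome `state` is observed."""
    best = float("inf")
    for x2 in allocations(len(values), budget):
        expected_open_value = sum(
            w * v * (1.0 - p) ** n
            for w, v, p, n in zip(state, values, probs, x2))
        best = min(best, cost * sum(x2) + expected_open_value)
    return best

def solve_two_stage(values, probs, cost, total):
    """Enumerate first-stage allocations; pick the best response per outcome."""
    n_tasks = len(values)
    best_cost, best_x1 = float("inf"), None
    for x1 in allocations(n_tasks, total):
        remaining = total - sum(x1)  # hard limit must hold for every outcome
        expected = cost * sum(x1)
        for state in product((0, 1), repeat=n_tasks):
            pr = 1.0
            for w, p, n in zip(state, probs, x1):
                stay = (1.0 - p) ** n
                pr *= stay if w == 1 else 1.0 - stay
            if pr > 0.0:
                expected += pr * second_stage_cost(
                    state, values, probs, cost, remaining)
        if expected < best_cost:
            best_cost, best_x1 = expected, x1
    return best_cost, best_x1

if __name__ == "__main__":
    print(solve_two_stage(values=[10.0], probs=[0.5], cost=1.0, total=2))
```

In the one-task example, committing both resources up front costs 4.5 in expectation, while assigning one resource and using the second only if the task is still open costs 4.0. The enumeration grows exponentially with the number of tasks, which is why a relaxation is attractive.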

Relaxed Two-Stage Problem

• Original problem is stochastic integer program
- PSPACE-complete, hard

• Expand set of admissible feedback strategies in second stage
- Generates lower bound to optimal value function
- New constraint on average number of resources
  - Relaxes exponential number of constraints to a single constraint
- Simple result: All feasible strategies in original problem are feasible in relaxed problem
- Lower bound on original performance
- Idea: select optimal strategies for lower bound

$$\sum_{\omega(1)} P\big(\omega(1) \mid x(1)\big) \sum_{i=1}^{N} \big( x_i(1) + x_i(2, \omega(1)) \big) \le M_1$$

$$x_i(1),\ x_i(2, \omega(1)) \in \{0, 1, \ldots, M_1\}$$
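The difference between the original per-outcome resource limit and the relaxed average limit can be checked numerically. The sketch below assumes a single resource type, all tasks open at the start, state 1 meaning "still open", and a stay-open probability of $(1 - p_i)^n$ for a task given $n$ resources. The function names and the example policy are hypothetical.

```python
from itertools import product

def outcome_prob(state, probs, x1):
    """Probability of stage-1 outcome `state` when x1[i] resources go to task i."""
    pr = 1.0
    for w, p, n in zip(state, probs, x1):
        stay = (1.0 - p) ** n
        pr *= stay if w == 1 else 1.0 - stay
    return pr

def resource_use(x1, policy, probs):
    """Return (worst-case, expected) total resource use of a two-stage policy.

    `policy` maps each stage-1 outcome tuple to a second-stage allocation.
    Outcomes with zero probability are ignored.
    """
    worst, expected = 0, 0.0
    for state in product((0, 1), repeat=len(x1)):
        pr = outcome_prob(state, probs, x1)
        if pr == 0.0:
            continue
        used = sum(x1) + sum(policy[state])
        worst = max(worst, used)
        expected += pr * used
    return worst, expected

if __name__ == "__main__":
    # One task with success probability 0.5: use one resource first, then two
    # more only if the task is still open.
    worst, expected = resource_use(
        x1=(1,), policy={(0,): (0,), (1,): (2,)}, probs=[0.5])
    print(worst, expected)
```

With two resources available, this policy uses three in its worst case, so it violates the original constraint, yet it uses two on average and so satisfies the relaxed one. The relaxed feasible set is therefore strictly larger, consistent with the relaxed optimum being a lower bound.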

Characterization of Optimal Strategies

• Important concept: Mixed local strategies
- Local strategies: feedback strategies such that the actions on a given task depend only on the state of that task: $x_i(2, \omega(1)) = x_i(2, \omega_i(1))$
- Mixed strategy: random combination of pure strategies
- Mixed strategies may achieve better performance than pure strategies in relaxed problem

• Theorem: In relaxed problem, for every pure strategy, there is a mixed local strategy which uses same resources and achieves same expected performance
- Proven by construction
- Restricts search to local mixed strategies
- Fast algorithm for solution of optimal strategies using convex optimization principles!
- Can solve exactly in complexity $O((M_1 + N)\log N)$

Comments and Extensions

• MPC approach guarantees feasibility of approximate problem solution in terms of original problem
- Obtain approximate solution, but implement only first stage allocations
- Re-solve problem when new observations are available, with receding horizon
- Fast algorithm allows for rapid computation

• Main extensions:
- Multiple stages
- Multiple resource types
  - Multiple renewable and non-renewable resources
  - Solution NP-hard, but can solve approximately
- Multiple task types: sensing and action
  - Must sense to observe outcomes
- New task arrivals, discovered by searching
- Unknown task types: Detect presence, but must observe to determine task type
- Task departures, deadlines

Team Activity Scheduling

• Inputs from team strategy selection
- Desired resources assigned to each task in current period
- Desired resources held in reserve when future information is collected

• Guidance and constraints from human operators
- Task values, select platform task assignments, select task resource assignments

• Known parameters
- Vehicle locations and resources in each vehicle, task locations

• Problem: assign resource deliveries for tasks to individual vehicles, and select sequence of activities for vehicle
- Deterministic multi-vehicle routing problem (VRP)
- NP-hard, with many useful approximate approaches available

Team Activity Assignment Formulation

Problem Formulation

[Equation: minimize discounted cost, subject to: visit customers; N vehicles to route; integrality.]

• VRP is an NP-hard problem (traveling salesman) wrapped in an NP-hard problem (bin packing).

• Classical Application: Truck Routing

Team Activity Assignment Algorithm

• Candidate algorithm: Tabu Search
- Locally perturbs trial solutions
- Uses “Tabu” list to avoid local minima
- Evaluated by AFIT for UAV routing
- Fast replanning, leads to rapid response to events
- Handles time window constraints instead of precedence constraint

• Significant extensions to date
- Multiple task types
- Multiple resource types
- Compound tasks involving multiple vehicles

• Alternative algorithms (AFOSR-sponsored)
- Mixed Integer-Linear Programming, J. How, MIT
- Receding horizon controller, C. Cassandras, BU
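The Tabu Search idea (perturb the current solution locally and keep a short list of recent moves that may not be repeated) can be sketched on a toy routing instance. This is a generic illustration under stated assumptions, not the AFIT-evaluated implementation: cost is plain Euclidean travel distance, vehicles do not return to their start points, time windows and resource types are ignored, and the neighborhood (relocating one task), the tenure, and all function names are hypothetical choices.

```python
import math
from collections import deque

def route_length(start, route, points):
    """Travel distance from a vehicle's start point through its tasks in order."""
    length, here = 0.0, start
    for task in route:
        length += math.dist(here, points[task])
        here = points[task]
    return length

def total_cost(starts, routes, points):
    return sum(route_length(s, r, points) for s, r in zip(starts, routes))

def tabu_search(starts, points, iterations=100, tenure=2):
    """Relocate one task per iteration; recently moved tasks are tabu."""
    # Initial solution: every task on the first vehicle, in index order.
    routes = [list(range(len(points)))] + [[] for _ in starts[1:]]
    best_cost = total_cost(starts, routes, points)
    best = [r[:] for r in routes]
    tabu = deque(maxlen=tenure)  # oldest entries drop off automatically
    for _ in range(iterations):
        candidate = None
        for v_from in range(len(routes)):
            for pos, task in enumerate(routes[v_from]):
                base = [r[:] for r in routes]
                base[v_from].pop(pos)
                for v_to in range(len(base)):
                    for new_pos in range(len(base[v_to]) + 1):
                        if v_to == v_from and new_pos == pos:
                            continue  # would recreate the current solution
                        trial = [r[:] for r in base]
                        trial[v_to].insert(new_pos, task)
                        cost = total_cost(starts, trial, points)
                        # Aspiration: a tabu move is allowed only if it
                        # beats the best solution found so far.
                        if task in tabu and cost >= best_cost:
                            continue
                        if candidate is None or cost < candidate[0]:
                            candidate = (cost, trial, task)
        if candidate is None:
            break
        # Accept the best admissible neighbor even if it is worse than the
        # current solution; this is what lets the search leave local minima.
        cost, routes, moved = candidate
        tabu.append(moved)
        if cost < best_cost:
            best_cost, best = cost, [r[:] for r in routes]
    return best, best_cost

if __name__ == "__main__":
    starts = [(0.0, 0.0), (10.0, 0.0)]
    points = [(1.0, 0.0), (9.0, 0.0), (2.0, 0.0), (8.0, 0.0)]
    print(tabu_search(starts, points))
```

Each iteration is cheap for small teams, so the search can be restarted from the current solution whenever a task arrives or a vehicle is lost, in line with the fast replanning noted on the slide.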

Comments

• Algorithms available for dynamic control of automata performing tasks in uncertain, risky environments
- Fast generation of desired courses of action
- Hedge against uncertain outcomes, adapt to new information

• Operator interaction through value structure, plus fixed decision variables and constraints
- Allows for “micro”-management
- Very limited insight into effects of operator inputs on automata behavior and performance

• Fundamental problem for this MURI research: prediction of course of action in the presence of uncertainty
- Not a single plan, but a contingency tree of possible actions/responses
- Hard to modify, approve

Experimental Platform for Research

• Multiple robots search for and perform tasks at BU’s Mechatronics Lab
- Can provide operator control of some platforms: human-automata teams
- Control information displayed, risk to each operator using video

Future Activities

• Implement research experiments involving tasks with performance uncertainty in test facility
- Vary tempo, size, uncertainty, information

• Develop algorithms to interact with operators in alternative roles
- Supervisory control
- Team partners

• Extend existing algorithms to different classes of tasks
- Area search, task discovery, risk to platforms

• Develop algorithms to assist operators in predicting behavior of automata teams in uncertain environments

• Collaborate with MURI team to design and analyze experiments involving alternative structures for human-automata teams
