TDDD10 AI Programming
Cooperation and Coordination 1
Cyrille Berger

Lectures
1 AI Programming: Introduction
2 Introduction to RoboRescue
3 Agents and Agents Architecture
4 Multi-Agent and Communication
5 Multi-Agent Decision Making
6 Cooperation and Coordination 1
7 Cooperation and Coordination 2
8 Machine
9 Knowledge Representation
10 Putting It All

Lecture goals

Lecture content
Cooperative Sensing & Exploration
  Cooperative State Estimation
  Extracting Predicates
  Robot Exploration
Coalitions & Roles
  Dynamic Role Assignment
  Coalition Formation

Cooperative Sensing & Exploration
Cooperative State Estimation

Why State Estimation?
Robots need to be aware of their current state in order to perform meaningful actions!
Where am I located in the world?
Where are the victims?

Modeling Sensor Noise (1/2)
Sensors are represented by a probabilistic sensor model p(z|x)
It answers the question: what is the probability of measuring z given that I am located in state x?
Example: a laser scanner located 1 m from the wall returns 1.20 m on average every 10th time
Typically represented by a Gaussian

Modeling Sensor Noise (2/2)
Data from sensors is noisy, e.g., the distance measurement of a laser at one meter can be 1 m ± 1 cm
Sensor noise is typically modeled by a normal distribution
Fully described by mean μ and variance σ²
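
Gaussians (1/2)
Univariate: p(x) = 1/(√(2π) σ) · exp(-(x - μ)² / (2σ²))
Multivariate (dimension d, mean vector μ, covariance matrix Σ): p(x) = 1/((2π)^(d/2) |Σ|^(1/2)) · exp(-½ (x - μ)ᵀ Σ⁻¹ (x - μ))

Gaussians (2/2)
[Figures: 1D and 2D Gaussian]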

State Estimation
Continuous integration of sensor data according to probability distributions
Sensor observations are taken in different coordinate frames, e.g., camera, laser
Transformation of
State estimation is the process of integrating multiple observations to estimate a state, i.e. robot location, locations of

Example of Bayesian State Estimation
Suppose a robot obtains measurement z
What is P(open|z)?

Causal vs Diagnostic Reasoning
P(open|z) is diagnostic
P(z|open) is causal, i.e., the sensor model
Often causal knowledge is easier to obtain
Bayes rule allows us to use causal knowledge: P(open|z) = P(z|open) P(open) / P(z)

Example
P(z|open) = 0.6, P(z|¬open) = 0.3
P(open) = P(¬open) = 0.5
Marginalization: compute the marginal probability P(z) = P(z|open) P(open) + P(z|¬open) P(¬open)
z raises the probability of the belief that the door is open
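
The door example can be checked numerically. The sketch below plugs the sensor model and prior from the slide into Bayes' rule; the function name `posterior_open` is a hypothetical helper, not part of the lecture material.

```python
def posterior_open(p_z_given_open, p_z_given_not_open, p_open):
    """Return P(open | z) using Bayes' rule."""
    p_not_open = 1.0 - p_open
    # Marginalization: P(z) summed over both door states.
    p_z = p_z_given_open * p_open + p_z_given_not_open * p_not_open
    return p_z_given_open * p_open / p_z


p = posterior_open(0.6, 0.3, 0.5)
print(round(p, 3))  # 0.667
```

The posterior (about 0.67) is higher than the prior of 0.5, which is the sense in which z raises the belief that the door is open.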

General Framework: Recursive Bayesian Filtering
zₜ: sensor observation at time t
xₜ: state at time t
Terms of the update equation: likelihood (sensor model), prior, transition (or motion) model
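
The three labelled terms fit the standard form of the recursive Bayes filter update. The equation below is that textbook form and not necessarily the exact notation of the slide; the symbols Bel for the belief and η for the normalizing constant are assumptions.

```latex
Bel(x_t) = \eta \;
  \underbrace{p(z_t \mid x_t)}_{\text{likelihood (sensor model)}}
  \int \underbrace{p(x_t \mid x_{t-1})}_{\text{transition (motion) model}} \;
  \underbrace{Bel(x_{t-1})}_{\text{prior}} \; dx_{t-1}
```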

Algorithms for Bayesian Filtering
Kalman Filter: optimal for linear systems and normal distributions, very efficient, uni-modal
Monte Carlo Localization (Particle Filter): good for any distribution, can be computationally expensive, multi-modal

Kalman Filter vs Simple Averaging
[Figure panels: Triangulation, Kalman filtering, Simple averaging]
Kalman filtering compared to simple averaging: highly confident estimates are more strongly weighted
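
A small sketch can make the weighting effect concrete. It assumes two independent 1D Gaussian estimates of the same quantity and fuses them by inverse-variance weighting, which is the measurement update of a 1D Kalman filter for a static state. The observation values are hypothetical.

```python
def simple_average(estimates):
    """Plain mean that ignores how confident each estimate is."""
    return sum(mean for mean, _ in estimates) / len(estimates)


def fuse_gaussians(estimates):
    """Fuse independent 1D Gaussian estimates given as (mean, variance) pairs.

    Each estimate is weighted by the inverse of its variance, so confident
    (low-variance) estimates dominate the result.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    mean = sum(w * m for w, (m, _) in zip(weights, estimates)) / total
    return mean, 1.0 / total


# Hypothetical position estimates from two robots: (mean in m, variance in m^2).
observations = [(1.0, 0.01), (2.0, 1.0)]
fused_mean, fused_var = fuse_gaussians(observations)
print(simple_average(observations), fused_mean, fused_var)
```

The simple average lands halfway between the two estimates, while the fused mean stays close to the confident one and its variance is smaller than either input variance.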

Importance of State Estimation

Monte Carlo Localization (MCL) as Observation Filter
The Kalman filter can only handle a single hypothesis
However, color thresholding on a soccer field might confuse, for example, “red t-shirts” with the ball
Consequently, Kalman filtering yields poor results
MCL: simultaneous tracking of multiple hypotheses
Can be used to filter out hypotheses weakly supported by observations over time

Markov Localization

Monte Carlo Localization (MCL)
Goal: approach for dealing with arbitrary distributions

Key Idea: Samples
Use multiple samples to represent arbitrary distributions

Particle Set
Set of weighted samples
x[i]: state
w[i]: importance weight
The samples represent the posterior

Particles for Approximation
Particles for function approximation
The more particles fall into an interval, the higher its probability density.
How to obtain such samples?

Particle Filter
Recursive Bayes filter: non-parametric approach
Models the distribution by samples
Prediction: draw from the proposal
Correction: weighting by the ratio of target and proposal
The more samples we use, the better the estimate!

Particle Filter Algorithm
1 Sample the particles using the proposal distribution
2 Compute the importance
3 Resampling: “Replace unlikely samples by more likely ones”

Monte Carlo Localization
Each particle is a pose hypothesis
Proposal is the motion model
Correction via the observation model
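
The three steps can be sketched for a robot moving along a line. This is a minimal illustration under stated assumptions, not the lecture's implementation: the state is a 1D position, the sensor measures that position directly, both noise models are Gaussian, and all parameter values are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, control, measurement, rng,
                         motion_sigma=0.1, sensor_sigma=0.2):
    """One predict / correct / resample cycle for a 1D position estimate."""
    # 1. Prediction: sample each particle from the motion model (the proposal).
    predicted = [x + control + rng.gauss(0.0, motion_sigma) for x in particles]
    # 2. Correction: the importance weight is the observation likelihood p(z | x).
    weights = [gaussian_pdf(measurement, x, sensor_sigma) for x in predicted]
    # 3. Resampling: draw particles with probability proportional to their weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]  # start position unknown
true_position = 2.0
for _ in range(10):
    true_position += 0.5                       # robot moves 0.5 m per step
    z = true_position + rng.gauss(0.0, 0.2)    # noisy position measurement
    particles = particle_filter_step(particles, 0.5, z, rng)

estimate = sum(particles) / len(particles)
print(estimate, true_position)
```

Resampling here uses weighted sampling with replacement from the standard library; a lower-variance scheme could be substituted without changing the other two steps.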

Particle Filter for Localization

Resampling
Needed as we have a limited number of samples
Survival of the fittest: “Replace unlikely samples by more likely ones”
“Trick” to avoid that many samples cover unlikely states

Resampling
Roulette wheel: binary search, O(n log n)
Stochastic universal sampling: low variance, O(n)
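
Stochastic universal sampling can be sketched as a single pass over the cumulative weights with n equally spaced pointers. The function name and the toy particle labels below are hypothetical.

```python
import random


def low_variance_resample(particles, weights, rng):
    """Stochastic universal sampling: O(n) resampling with one random draw."""
    n = len(particles)
    step = sum(weights) / n
    # One random offset, then n equally spaced pointers over the cumulative weights.
    pointer = rng.uniform(0.0, step)
    resampled = []
    index = 0
    cumulative = weights[0]
    for _ in range(n):
        # Advance to the particle whose weight interval contains the pointer.
        while pointer >= cumulative and index < n - 1:
            index += 1
            cumulative += weights[index]
        resampled.append(particles[index])
        pointer += step
    return resampled


result = low_variance_resample(["a", "b", "c", "d"], [0.0, 0.5, 0.5, 0.0], random.Random(1))
print(result)  # ['b', 'b', 'c', 'c']
```

Because the pointers are equally spaced, a particle with weight share w is copied either floor(n·w) or ceil(n·w) times, which is why the variance is lower than with n independent roulette-wheel draws.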

Phantom Balls: Development of Probability Distribution
[Figure panels: first to sixth observation]

Extracting Predicates

Case Study: Extracting Predicates for Playing Soccer
Predicates are needed for symbolic
Predicates are the basis for action selection and strategic decision making
Can be considered as world model abstractions
Simple predicates of objects (can be directly computed from positions):
InOpponentsGoal(object): object in opponent goal?
InOwnGoal(object): object in own goal?
CloseToBorder(object): distance to any border is below a threshold?
FrontClear(): neither another object nor the border is in front?
InDefense(object): object in the last third of the soccer field?

Case Study: Extracting Predicates for Playing Soccer
Extended predicates:
Computed by normalized grids (fi: ℜ × ℜ ⇒ [0..1])
Discretized into cells, e.g., 10 x 10 cm size
Examples:
ffree: indicates positions under the influence of the opponent
fcovered: indicates positions covered by teammates
fdesired: indicates tactically good positions

Robot Exploration
A team of robots has to explore an initially unknown environment by sensor coverage
Find an assignment of target locations to robots that minimizes the overall exploration time
Variants:
Centralized coordination via world model data exchange
Centralized coordination with assignment optimization
Decentralized coordination by peer-to-peer communication

Frontier Exploration
Robots fuse and share their local maps
The frontiers between free space and unknown areas are potential target locations
Find a good assignment of frontier locations to robots to minimize overall exploration time

Level of Coordination
No exchange of information
Implicit coordination: sharing a joint map
  Communication and fusion of local maps
  Central mapping system
  Frontier Exploration (Yamauchi et al., 98)
Explicit coordination: determine better target locations to distribute the robots
  Combinatorial problem: “planner” for robot-target assignment

Example: Need for Explicit Coordination

Explicit Coordination
Choose target locations at the frontier to the unexplored area by trading off the expected information gain and travel costs
Reduce the utility of target locations whenever they are expected to be covered by the sensors of another robot
Use cooperative sensing, a.k.a. distributed state estimation, to compute the joint map

The Coordination Algorithm
1 Determine the set of frontier
2 Compute for each robot i the cost Vⁱ(x,y) for reaching each frontier cell <x,y>
3 Set the utility of all frontier cells to
4 While there is one robot left without a
  Determine a robot i and a frontier cell <x,y> which satisfies: (i,<x,y>) = argmax{i',<x',y'>} (U(x',y') - Vⁱ'(x',y'))
  Reduce the utility of each target point <x',y'> in the visibility area of the selected <x,y> according to: U(x',y') ← U(x',y') ⨯ (1 - P(<x,y>,<x',y'>))
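
The loop in step 4 can be sketched as follows. Several details are assumptions made only for illustration: the initial utility is set to 1.0, travel costs are given as a precomputed table on the same scale as the utilities, and the visibility probability P is a hypothetical model that falls off linearly with distance.

```python
import math


def coordinate(costs, frontier, visibility_prob):
    """Greedy robot-to-frontier assignment with utility reduction.

    costs[i][cell] is the travel cost V^i of robot i to a frontier cell.
    visibility_prob(a, b) is the probability P(a, b) that cell b is covered
    by a robot standing at cell a.
    """
    utility = {cell: 1.0 for cell in frontier}  # assumed initial utility
    unassigned = set(costs)
    assignment = {}
    while unassigned:
        # Pick the (robot, cell) pair that maximizes U(cell) - V^i(cell).
        robot, target = max(
            ((i, cell) for i in sorted(unassigned) for cell in frontier),
            key=lambda pair: utility[pair[1]] - costs[pair[0]][pair[1]],
        )
        assignment[robot] = target
        unassigned.remove(robot)
        # Discount every cell the chosen target is expected to cover.
        for cell in frontier:
            utility[cell] *= 1.0 - visibility_prob(target, cell)
    return assignment


def linear_visibility(a, b, sensor_range=3.0):
    """Hypothetical visibility model: probability falls off linearly with distance."""
    return max(0.0, 1.0 - math.dist(a, b) / sensor_range)


frontier = [(0, 0), (0, 1), (8, 0)]
costs = {
    "r1": {(0, 0): 0.05, (0, 1): 0.2, (8, 0): 0.7},
    "r2": {(0, 0): 0.2, (0, 1): 0.1, (8, 0): 0.6},
}
assignment = coordinate(costs, frontier, linear_visibility)
print(assignment)  # {'r1': (0, 0), 'r2': (8, 0)}
```

Without the utility reduction, r2 would head for the nearby cell (0, 1) next to r1's target; with it, r2 is sent to the distant cell (8, 0) that r1's sensors will not cover.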

Example Revised

Typical Trajectories
[Figure panels: left: implicit coordination; right: explicit coordination]

Exploration Time

Drawbacks
The assignment considered so far is a greedy assignment:
Approaches closer to optimal:
Hungarian Method
  Computes the optimal assignment of jobs to machines given a fixed cost matrix
Market economy-based approaches (auctions)
  Robots trade with targets
  Computational load is shared between the robots
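
A tiny cost matrix is enough to show how a greedy assignment can miss the optimum. Note that the optimal solver below is a brute-force search over all permutations, used here as a stand-in: the Hungarian method returns the same assignment in polynomial time but is not implemented in this sketch. The cost values are hypothetical.

```python
from itertools import permutations


def greedy_assignment(cost):
    """Repeatedly commit the cheapest remaining (robot, target) pair."""
    robots = set(range(len(cost)))
    targets = set(range(len(cost[0])))
    result = {}
    while robots:
        r, t = min(((r, t) for r in robots for t in targets),
                   key=lambda pair: cost[pair[0]][pair[1]])
        result[r] = t
        robots.remove(r)
        targets.remove(t)
    return result


def optimal_assignment(cost):
    """Exhaustive search over all one-to-one assignments (square cost matrix)."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[r][perm[r]] for r in range(n)))
    return dict(enumerate(best))


def total_cost(cost, assignment):
    return sum(cost[r][t] for r, t in assignment.items())


cost = [[1, 2],
        [2, 10]]
print(total_cost(cost, greedy_assignment(cost)))   # 11
print(total_cost(cost, optimal_assignment(cost)))  # 4
```

Greedy grabs the cheapest pair first and thereby forces the second robot onto its expensive target; the optimal assignment accepts a slightly worse first choice for a much lower total.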

Coalitions & Roles
Dynamic Role Assignment

Dynamic Role Assignment
A mechanism to efficiently coordinate agents
Predefined roles (e.g. Attacker, Defender,
Role-specific behaviors selection
Assignment: mapping between N roles and M
Can be according to the context (e.g. team formation)
Suited for dynamic domains (e.g. robot
Example: Robot Soccer
  Avoid swarm behavior and interference (e.g. neither attack your own teammates nor get into the way of an attacking or defending robot)
  Task decomposition and task (re-)allocation (e.g. the player closest to the ball should go to the ball
  Dynamic role changes (e.g. if a player is blocked, another should take over)
  Coordinating joint execution (e.g. passing the

General Algorithm
Assumptions:
  Fixed ordering of roles {1,2,…,N}, e.g. role 1 must be assigned first, followed by role 2, etc.
  Each agent can be assigned to only one role
  The utility uij reflects how appropriate agent i is for role j given the context

for all agents in parallel
  I := ∅; // Committed assignments with ordering
  for each role j = 1,…,N
    compute utility ui,j; // Own preference of agent i
    broadcast ui,j; // To all other agents
  end;
  Wait until all ui,j are received // From all the other agents
  for each role j = 1,…,N
    assign role j to agent i* = argmax i∉I {ui,j};
    I := I ∪ {i*}; // Add assignment
  end;
end.
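
The assignment loop can be sketched in a few lines once all utilities have been received. The broadcast step is not modelled: the sketch assumes every agent already holds the same utility table, so each agent computes the same result locally. Agent names and utility values are hypothetical.

```python
def assign_roles(utilities, num_roles):
    """Assign roles 0..num_roles-1 in fixed order; utilities[agent][role] is u_ij."""
    committed = set()  # the set I of agents that already hold a role
    roles = {}
    for role in range(num_roles):
        candidates = [a for a in sorted(utilities) if a not in committed]
        if not candidates:
            break  # more roles than agents
        # i* = argmax over uncommitted agents of u_ij
        best = max(candidates, key=lambda a: utilities[a][role])
        roles[role] = best
        committed.add(best)
    return roles


# Hypothetical utilities for three agents and three roles.
utilities = {
    "a": [0.9, 0.8, 0.1],
    "b": [0.7, 0.9, 0.2],
    "c": [0.8, 0.3, 0.5],
}
print(assign_roles(utilities, 3))  # {0: 'a', 1: 'b', 2: 'c'}
```

Iterating over the agents in sorted order makes tie-breaking deterministic, which matters when every agent runs the procedure independently and all must agree.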

Case Study: CS-Freiburg Soccer
Each player can have one of four
goalie (fixed): special hardware setup, thus unable to change this
active player (in charge of dealing with the ball): can approach the ball or bring the ball forward towards the opponent goal
strategic player: maintains a position back in its own
supporter: (supports either active or
  in defensive play it complements the team’s defensive
  in offensive play it presents itself to receive a pass close to the opponent’s goal

Role Utilities
Placement: each role has a preferred location, which depends on the situation:
  ball position, position of teammates and opponents
  defensive situation or attack
  computed by potential
Utility uij for each role:
  “Negative utility (costs)” for reaching the preferred location of the role
  Costs are computed from partial costs for distance (ud), turn angle (ut), objects on the path (uo)
  Weighted sum to ensure utilities between 0..1: Uij = wd * ud + wt * ut + wo * uo
[Figure panels: active role, strategic role, support role]

Dynamic Role Assignment
Each player computes utilities uij and broadcasts results
Group utility: consider all possible assignments and compute the summed utility from each agent’s individual utility for its assigned role
Take the assignment with the highest utility sum as solution (under the assumption that every agent does so)
Roles are re-assigned only when
  the role change is significant, i.e. the new utility >> old utility (hysteresis factor to avoid oscillation)
  two players agree (by
Note that agents might lie since “opinion” about global position can differ (even with a global world model)
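
The group-utility search and the hysteresis check can be sketched as below. The additive `hysteresis` margin is an assumption standing in for the slide's "new utility >> old utility" condition, the agreement step between players is not modelled, and all utility values are hypothetical.

```python
from itertools import permutations


def best_group_assignment(utilities, roles):
    """Return (assignment, summed utility) that maximizes the group utility.

    utilities[agent][role] is u_ij; each agent receives exactly one role.
    """
    agents = sorted(utilities)
    best, best_sum = None, float("-inf")
    for perm in permutations(roles, len(agents)):
        total = sum(utilities[a][r] for a, r in zip(agents, perm))
        if total > best_sum:
            best, best_sum = dict(zip(agents, perm)), total
    return best, best_sum


def reassign(current, utilities, roles, hysteresis=0.1):
    """Switch roles only if the group utility improves by more than `hysteresis`."""
    candidate, new_sum = best_group_assignment(utilities, roles)
    old_sum = sum(utilities[a][r] for a, r in current.items())
    return candidate if new_sum > old_sum + hysteresis else current


roles = ["active", "strategic", "support"]
utilities = {
    "p1": {"active": 0.8, "strategic": 0.5, "support": 0.4},
    "p2": {"active": 0.75, "strategic": 0.6, "support": 0.3},
    "p3": {"active": 0.2, "strategic": 0.4, "support": 0.7},
}
current = {"p1": "strategic", "p2": "active", "p3": "support"}
print(reassign(current, utilities, roles, hysteresis=0.1))  # p1 and p2 swap roles
print(reassign(current, utilities, roles, hysteresis=0.2))  # assignment unchanged
```

With the small margin the improvement from 1.95 to 2.1 triggers a swap; with the larger margin the same improvement is ignored, which is how hysteresis suppresses oscillating role changes.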

Example for Role Switching (1/2)
Attack against Team Osaka (Japan). The attacking robot is blocked by a defender and consequently replaced by an unblocked player

Example for Role Switching (2/2)
Defense against Artisti Veneti (Italy). The roles active and strategic player are switched a couple of times

Failed Ball-Passing
A pass in the semi-final against the Italian ART Italy team (RoboCup 1999). This was based on a standard plan: “if it is not possible to score directly, wait until supporter arrives, then make the pass”

Coalition Formation

Coalition Formation
Necessary when tasks are more efficiently solved by a specific combination of agent capabilities
  E.g. a disaster location requires ambulance and fire
Assignment of groups to tasks is necessary when tasks cannot be performed by a single agent
  E.g. a single fire brigade cannot extinguish a large
A group of agents is called a coalition
A coalition structure is a partitioning of the set of agents into disjoint coalitions
An agent participates in only one coalition
A coalition may consist of only a single agent
Generally, coalitions consist of heterogeneous agents

Applications for Coalition Formation
In e-commerce, buyers can form coalitions to purchase a product in bulk and take advantage of price discounts (Tsvetovat et al., 2000)
In Real Time Strategy (RTS) games, groups of heterogeneous agents can jointly attack bases of the opponent. Mixtures of agents have to be chosen according to the defense strategy of the opponent
Distributed vehicle routing among delivery companies with their own delivery tasks and vehicles (Sandholm 1997)
Wide-area surveillance by autonomous sensor networks (Dang 2006)
In Rescue, team formation to solve particular sub-problems, e.g. larger robots deploy smaller robots into confined spaces

Fire Brigade Example

Three Activities in Coalition Formation
Coalition structure generation:
  Partitioning of the agents into exhaustive and disjoint
  Inside the coalitions, agents will coordinate their activities, but agents will not coordinate between coalitions
Solving the optimization problem in each coalition:
  Pooling the tasks and resources of the agents in the coalition and solving the joint problem
  The coalition objective could be to maximize the monetary value, or the overall expected utility
Dividing the value of the generated solution:
  In the end, each agent will receive a value (money or utility) as a result of participating in the coalition
  In some problems, the coalition value the agents have to share is negative, being a shared cost

Problem Formulation
A group of agents S ⊆ A is called a coalition, where A denotes the set of all agents and S ≠ ∅
The coalition of all the agents is called the grand coalition
A coalition structure (CS) partitions the set of agents into coalitions
The value of each coalition S is given by a function vS
Each coalition value is independent of non-members’ actions
CS* is the social welfare maximizing coalition structure

Coalition Structure Generation
The value of a coalition structure is given by: V(CS) = ∑{S∊CS} vS
The goal is to maximize the social welfare of a set of agents A by finding a coalition structure that satisfies: CS* = argmax{CS∊Partitions(A)} V(CS)
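
For a handful of agents, CS* can be found by enumerating every partition and summing the coalition values. The sketch below does exactly that; the value function, which depends only on coalition size, is a hypothetical example.

```python
def partitions(agents):
    """Yield every partition of `agents` into disjoint, non-empty coalitions."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for partial in partitions(rest):
        # Put `first` into each existing coalition in turn ...
        for i in range(len(partial)):
            yield partial[:i] + [partial[i] | {first}] + partial[i + 1:]
        # ... or let it form a coalition of its own.
        yield partial + [frozenset({first})]


def best_coalition_structure(agents, value):
    """Return (CS*, V(CS*)) by exhaustive search; value(S) gives v_S."""
    return max(
        ((cs, sum(value(s) for s in cs)) for cs in partitions(agents)),
        key=lambda pair: pair[1],
    )


def value(coalition):
    """Hypothetical coalition value that depends only on coalition size."""
    return {1: 2, 2: 5, 3: 6}[len(coalition)]


cs_star, welfare = best_coalition_structure([1, 2, 3], value)
print(cs_star, welfare)
```

In this example a pair plus a singleton (value 7) beats both the grand coalition and the all-singletons structure (value 6 each), so the values are neither super-additive nor sub-additive.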

Special Coalition Values
Coalition values are super-additive iff for every pair of disjoint coalitions S, T ⊆ A: vS∪T ≥ vS + vT
  If coalition values are super-additive, then the coalition structure containing the grand coalition gives the highest value
  Agents cannot do worse by teaming up
Coalition values are sub-additive iff for every pair of disjoint coalitions S, T ⊆ A: vS∪T < vS + vT
  If coalition values are sub-additive, then the coalition structure {{a} | a ∈ A} in which no agent cooperates gives the highest value
Is the ambulance rescue task in the RoboCup Rescue domain super-additive, sub-additive, or neither?

Coalition Structure Generation
Input: all possible coalitions and their values
Example: A = {1,2,3,4}
For N agents, the number of possible coalitions is 2^N - 1, and the number of possible coalition structures grows faster than N^(N/2) (it is O(N^N) and ω(N^(N/2)))
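
Both counts can be computed exactly for small N. The number of coalition structures is the number of set partitions, i.e. the Bell number, which the sketch below evaluates with the standard recurrence; the function names are hypothetical.

```python
from math import comb


def num_coalitions(n):
    """Number of non-empty subsets of n agents."""
    return 2 ** n - 1


def num_coalition_structures(n):
    """Number of partitions of n agents (the Bell number B_n)."""
    bell = [1]  # B_0
    for m in range(n):
        # Recurrence: B_{m+1} = sum over k of C(m, k) * B_k
        bell.append(sum(comb(m, k) * bell[k] for k in range(m + 1)))
    return bell[n]


for n in (4, 10, 15):
    print(n, num_coalitions(n), num_coalition_structures(n))
```

For the four-agent example both counts are 15, but at 15 agents there are already more than a billion coalition structures against 32767 coalitions.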

Coalition Graph
Nodes represent coalition structures
Arcs represent either merges (downwards) or splits (upwards)

Coalition Structure Search Time
Searching the whole coalition graph for the optimal coalition structure is intractable (only feasible if |A| < 15)

Approximate Solution to Structure Search
Can we approximate the search by visiting only a subset of L nodes?
Choose a set L (a subset of all coalition structures of A) and pick the best coalition structure seen: CSL* = argmax{CS∊L} V(CS)
One requirement is to guarantee that the found coalition structure is within a worst-case bound from optimal: k * V(CSL*) ≥ V(CS*)

Approximate Solution
If the bottom two levels of the graph are considered, then: k = |A|
and the number of nodes searched is n = 2^(|A|-1)
It can be proven that no other search algorithm can establish a better bound k while searching n = 2^(|A|-1) or fewer nodes

Coalition Structure Search Algorithm
1 Search the bottom two levels of the coalition structure graph
2 Continue with breadth-first search from the top of the graph as long as there is time left, or until the entire graph has been searched
3 Return the coalition structure that has the highest welfare among those seen so far
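
Steps 1 and 3 can be sketched directly; the anytime breadth-first refinement of step 2 is left out. The sketch assumes the bottom two levels consist of the grand coalition and every split of the agents into two coalitions. The size-based value function is hypothetical.

```python
from itertools import combinations


def bottom_two_levels(agents):
    """Yield the grand coalition and every split of the agents into two coalitions."""
    agents = list(agents)
    everyone = frozenset(agents)
    yield [everyone]
    first, rest = agents[0], agents[1:]
    # Keep `first` in the left part so that each two-way split appears only once.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = frozenset((first,) + extra)
            yield [left, everyone - left]


def search_bottom_levels(agents, value):
    """Best coalition structure among the bottom two levels of the graph."""
    return max(bottom_two_levels(agents), key=lambda cs: sum(value(s) for s in cs))


def value(coalition):
    """Hypothetical coalition value that depends only on coalition size."""
    return {1: 2, 2: 5, 3: 6, 4: 7}[len(coalition)]


agents = [1, 2, 3, 4]
nodes = list(bottom_two_levels(agents))
best = search_bottom_levels(agents, value)
print(len(nodes), best, sum(value(s) for s in best))
```

For four agents the sketch visits 8 = 2^(4-1) nodes, matching the node count on the slide, and here returns a split into two pairs with welfare 10.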

Case Study: ResQ Freiburg Task Allocation
N ambulance teams have to rescue M civilians after an earthquake
Civilians are characterized by Buriedness, Damage and Hit-Points
Costs are the time to rescue a civilian, composed of the coalition’s joint max travel time to reach the victim, and the time needed for the rescue
The overall utility is the number of rescued civilians (the civilians brought to a refuge)
We considered the ambulance rescue task as super-additive
  The rescue operation itself is super-additive
  Are there possibly situations where the ambulance team needs to be split?

Problem as Sequence Assignment
Assign a sequence R of tasks (here victims) to the grand coalition of agents A (here ambulances)
R = <r1, r2, …, rN> where ri denotes a rescue task and i the position in the sequence
U(R) denotes the predicted utility (the number of survivors) when executing sequence R
Hence, the problem is to find the optimal sequence from the set of all possible sequences: R* = argmax U(R)
Enumerating all possible sequences is intractable (N!)
Greedy solutions:
  Prefer victims that can be rescued fast (small
  Prefer urgent victims (high

Results RoboCup 2004

Summary
Cooperative sensing with Kalman Filter and Particle Filter
  Make use of sensor information
Coordination technique for exploration of the environment
Dynamic role assignment is an efficient method for team coordination
Action selection and coordination are essential when acting in groups
Coalition formation is the process of finding the “social welfare” maximizing coalition structure among a set of agents