lectures - department of computer and information science ...tddd10/lectures/2014/06... · tddd10...

TDDD10 AI Programming
Cooperation and Coordination 1

Cyrille Berger

Lectures
1 AI Programming: Introduction
2 Introduction to RoboRescue
3 Agents and Agents Architecture
4 Multi-Agent and Communication
5 Multi-Agent Decision Making
6 Cooperation and Coordination 1
7 Cooperation and Coordination 2
8 Machine
9 Knowledge Representation
10 Putting It All

Lecture goals

Lecture content
- Cooperative Sensing & Exploration
  - Cooperative State Estimation
  - Extracting Predicates
  - Robot Exploration
- Coalitions & Roles
  - Dynamic Role Assignment
  - Coalition Formation

Cooperative Sensing & Exploration

Cooperative State Estimation

Why State Estimation?
Robots need to be aware of their current state in order to perform meaningful actions!
- Where am I located in the world?
- Where are the victims?

Modeling Sensor Noise (1/2)
- Sensors are represented by a probabilistic sensor model p(z|x)
- Answers the question: what is the probability of measuring z given that I am located in state x?
- Example: a laser scanner located 1 m from the wall returns 1.20 m on average every 10th time
- Typically represented by a Gaussian

Modeling Sensor Noise (2/2)
- Data from sensors is noisy, e.g., the distance measurement of a laser at one meter can be 1 m ± 1 cm
- Sensor noise is typically modeled by a normal distribution
- Fully described by mean μ and variance σ²

Gaussians (1/2)
Univariate:

Multivariate:
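For reference, the standard Gaussian densities can be written as follows. The univariate case uses the mean μ and variance σ² from the sensor noise slides; the dimension d and covariance matrix Σ in the multivariate case are notation assumed here.

```latex
% Univariate Gaussian with mean \mu and variance \sigma^2
p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Multivariate Gaussian in d dimensions with mean vector \mu and covariance \Sigma
p(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^d \det\Sigma}}
       \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}
       \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)
```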

Gaussians (2/2)
[Figure panels: 1D, 2D]

State Estimation
- Continuous integration of sensor data according to probability distributions
- Sensor observations are taken in different coordinate frames, e.g., camera, laser
- Transformation of

State estimation is the process of integrating multiple observations to estimate a state, i.e. robot location, locations of

Example of Bayesian State Estimation
- Suppose a robot obtains measurement z
- What is P(open|z)?

Causal vs Diagnostic Reasoning
- P(open|z) is diagnostic
- P(z|open) is causal, i.e., the sensor model
- Often causal knowledge is easier to obtain
- Bayes rule allows us to use causal knowledge: P(open|z) = P(z|open) P(open) / P(z)

Example
- P(z|open) = 0.6, P(z|¬open) = 0.3
- P(open) = P(¬open) = 0.5
- Marginalization: compute the marginal probability P(z) = P(z|open) P(open) + P(z|¬open) P(¬open)
- z raises the probability of the belief that the door is open
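The door example can be checked numerically. The following is a minimal sketch; the function name `posterior_open` is chosen here for illustration and does not come from the slides.

```python
def posterior_open(p_z_given_open, p_z_given_not_open, prior_open):
    """Return P(open | z) using Bayes rule and marginalization over the door state."""
    # Marginal probability of the measurement: P(z)
    p_z = p_z_given_open * prior_open + p_z_given_not_open * (1.0 - prior_open)
    return p_z_given_open * prior_open / p_z


p = posterior_open(0.6, 0.3, 0.5)
print(round(p, 3))  # 0.667
```

The posterior of about 0.67 is higher than the prior of 0.5, which is the sense in which z raises the belief that the door is open.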

General Framework: Recursive Bayesian Filtering
- zₜ: sensor observation at time t
- xₜ: state at time t

Labeled terms: likelihood (sensor model), prior, transition (or motion) model
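A common way to write the recursive update that combines these three terms is sketched below. It assumes no explicit control input; the belief notation Bel and the normalizer η are assumptions of this sketch.

```latex
Bel(x_t) = \eta \,
  \underbrace{p(z_t \mid x_t)}_{\text{likelihood (sensor model)}}
  \int \underbrace{p(x_t \mid x_{t-1})}_{\text{transition (motion) model}} \,
       \underbrace{Bel(x_{t-1})}_{\text{prior}} \, dx_{t-1}
```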

Algorithms for Bayesian Filtering
- Kalman Filter: optimal for linear systems and normal distributions, very efficient, uni-modal
- Monte Carlo Localization (Particle Filter): good for any distribution, can be computationally expensive, multi-modal

Kalman Filter vs Simple Averaging
[Figure panels: Triangulation, Kalman filtering, Simple averaging]

Kalman filtering compared to simple averaging: highly confident estimates are more strongly weighted
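The weighting effect can be illustrated with the static, one-dimensional special case of the Kalman update, which is inverse-variance weighting. This is a sketch under that simplification, not the full filter, and the sample observations are made up for illustration.

```python
def simple_average(estimates):
    """Unweighted mean of (value, variance) estimates; the variances are ignored."""
    return sum(value for value, _ in estimates) / len(estimates)


def variance_weighted_fusion(estimates):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Returns the fused value and the variance of the fused estimate.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    fused = sum(w * value for w, (value, _) in zip(weights, estimates)) / total
    return fused, 1.0 / total


# Two robots observe the same object; the first is far more confident.
observations = [(10.0, 1.0), (14.0, 4.0)]
print(simple_average(observations))            # 12.0
print(variance_weighted_fusion(observations))  # (10.8, 0.8)
```

The fused value stays close to the confident observation, and the fused variance is smaller than either input variance.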

Importance of State Estimation

Monte Carlo Localization (MCL) as Observation Filter
- The Kalman filter can only handle a single hypothesis
- However, color thresholding on a soccer field might confuse, for example, “red t-shirts” with the ball
- Consequently, Kalman filtering yields poor results
- MCL: simultaneous tracking of multiple hypotheses
- Can be used to filter out hypotheses weakly supported by observations over time

Markov Localization

Monte Carlo Localization (MCL)
- Goal: approach for dealing with arbitrary distributions

Key Idea: Samples
- Use multiple samples to represent arbitrary distributions

Particle Set
- Set of weighted samples
- x[i]: state
- w[i]: importance weight
- The samples represent the posterior

Particles for Approximation
- Particles for function approximation
- The more particles fall into an interval, the higher its probability density
- How to obtain such samples?

Particle Filter
- Recursive Bayes filter
- Non-parametric approach
- Models the distribution by samples
- Prediction: draw from the proposal
- Correction: weighting by the ratio of target and proposal
- The more samples we use, the better the estimate!

Particle Filter Algorithm
1 Sample the particles using the proposal distribution
2 Compute the importance
3 Resampling: “Replace unlikely samples by more likely ones”


Monte Carlo Localization
- Each particle is a pose hypothesis
- Proposal is the motion model
- Correction via the observation model
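The three steps (sample from the motion model, weight by the observation model, resample) can be sketched for a robot moving along a line and measuring its distance to a wall. The one-dimensional world, the Gaussian noise parameters and all names below are assumptions made for illustration.

```python
import math
import random


def gaussian_likelihood(z, expected, sigma):
    """Unnormalized Gaussian sensor model p(z | x)."""
    return math.exp(-0.5 * ((z - expected) / sigma) ** 2)


def mcl_step(particles, control, z, wall, rng, motion_sigma=0.1, sensor_sigma=0.2):
    """One prediction / correction / resampling cycle.

    particles: list of position hypotheses on a line.
    z: measured distance to a wall located at position `wall`.
    """
    # 1. Prediction: sample from the motion model (the proposal)
    predicted = [x + control + rng.gauss(0.0, motion_sigma) for x in particles]
    # 2. Correction: importance weights from the observation model
    weights = [gaussian_likelihood(z, wall - x, sensor_sigma) for x in predicted]
    if sum(weights) == 0.0:
        return predicted  # no hypothesis explains z; keep the prediction
    # 3. Resampling: draw particles with probability proportional to weight
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]  # uniform initial belief
true_x, wall = 2.0, 10.0
for _ in range(10):
    true_x += 0.5          # the robot really moves 0.5 m per step
    z = wall - true_x      # noise-free range reading, for simplicity
    particles = mcl_step(particles, 0.5, z, wall, rng)
estimate = sum(particles) / len(particles)
print(round(estimate, 1), true_x)
```

After a few cycles the particle mean should lie close to the true position, although the initial belief was uniform over the whole line.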

Particle Filter for Localization

Resampling
- Needed as we have a limited number of samples
- Survival of the fittest: “Replace unlikely samples by more likely ones”
- “Trick” to avoid that many samples cover unlikely states

Resampling
- Roulette wheel: binary search, O(n log n)
- Stochastic universal sampling: low variance, O(n)
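Stochastic universal sampling can be sketched as follows: one random offset places n equally spaced pointers on the cumulative weight axis, so a single pass suffices. The function name and signature are chosen here for illustration.

```python
import random


def low_variance_resample(particles, weights, rng=random):
    """Stochastic universal (low-variance) resampling in O(n)."""
    n = len(particles)
    step = sum(weights) / n
    pointer = rng.uniform(0.0, step)  # the only random draw
    resampled = []
    cumulative = weights[0]
    i = 0
    for _ in range(n):
        # Advance to the particle whose cumulative interval contains the pointer
        while pointer > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(particles[i])
        pointer += step
    return resampled


sample = low_variance_resample(["a", "b", "c", "d"], [0.1, 0.1, 0.7, 0.1])
print(sample)
```

With these weights the heavy particle "c" is always drawn two or three times, whereas independent roulette wheel draws could select it anywhere from zero to four times.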

Phantom Balls: Development of Probability Distribution
[Figure panels: first observation through sixth observation]

Extracting Predicates

Case Study: Extracting Predicates for Playing Soccer
- Predicates are needed for symbolic
- Predicates are the basis for action selection and strategic decision making
- Can be considered as world model abstractions
- Simple predicates of objects (can be directly computed from positions):
  - InOpponentsGoal(object): object in opponent goal?
  - InOwnGoal(object): object in own goal?
  - CloseToBorder(object): the distance to any border is below a threshold?
  - FrontClear(): neither another object nor the border is in front?
  - InDefense(object): object in the last third of the soccer field?
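Two of the simple predicates can be sketched directly from object positions. The field dimensions, the threshold, the coordinate convention (own goal line at x = 0) and the reading of "last third" as the third next to the own goal are all assumptions of this sketch.

```python
from dataclasses import dataclass

# Hypothetical field geometry in meters: own goal line at x = 0,
# opponent goal line at x = FIELD_LENGTH.
FIELD_LENGTH = 10.0
FIELD_WIDTH = 5.0
BORDER_THRESHOLD = 0.5


@dataclass
class Position:
    x: float
    y: float


def close_to_border(p: Position, threshold: float = BORDER_THRESHOLD) -> bool:
    """True if the distance to the nearest field border is below the threshold."""
    nearest = min(p.x, FIELD_LENGTH - p.x, p.y, FIELD_WIDTH - p.y)
    return nearest < threshold


def in_defense(p: Position) -> bool:
    """True if the object is in the third of the field next to the own goal."""
    return p.x <= FIELD_LENGTH / 3.0


ball = Position(x=0.3, y=2.5)
print(close_to_border(ball), in_defense(ball))  # True True
```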

Case Study: Extracting Predicates for Playing Soccer
- Extended predicates:
  - Computed by normalized grids: (fi: ℜ x ℜ ⇒ [0..1])
  - Discretized into cells, e.g., 10 x 10 cm size
- Examples:
  - ffree: indicates positions under the influence of the opponent
  - fcovered: indicates positions covered by teammates
  - fdesired: indicates tactically good positions

Robot Exploration

Robot Exploration
- A team of robots has to explore an initially unknown environment by sensor coverage
- Find an assignment of target locations to robots that minimizes the overall exploration time
- Variants:
  - Centralized coordination via world model data exchange
  - Centralized coordination with assignment optimization
  - Decentralized coordination by peer-to-peer communication

Frontier Exploration
- Robots fuse and share their local maps
- The frontiers between free space and unknown areas are potential target locations
- Find a good assignment of frontier locations to robots to minimize overall exploration time

Level of Coordination
- No exchange of information
- Implicit coordination: sharing a joint map
  - Communication and fusion of local maps
  - Central mapping system
  - Frontier Exploration (Yamauchi et al., 98)
- Explicit coordination: determine better target locations to distribute the robots
  - Combinatorial problem: “planner” for robot-target assignment

Example: Need for Explicit Coordination

Explicit Coordination
- Choose target locations at the frontier to the unexplored area by trading off the expected information gain and travel costs
- Reduce the utility of target locations whenever they are expected to be covered by the sensors of another robot
- Use cooperative sensing, aka distributed state estimation, to compute the joint map

The Coordination Algorithm
1 Determine the set of frontier
2 Compute for each robot i the cost Vⁱ(x,y) for reaching each frontier cell <x,y>
3 Set the utility of all frontier cells to
4 While there is one robot left without a
  - Determine a robot i and a frontier cell <x,y> which satisfies: (i,<x,y>) = argmax{i',<x',y'>} (U(x',y') - Vⁱ'(x',y'))
  - Reduce the utility of each target point <x',y'> in the visibility area of the selected <x,y> according to: U(x',y') ← U(x',y') ⨯ (1 - P(<x,y>,<x',y'>))
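The loop of the coordination algorithm can be sketched as below. The initial utility of 1.0, the dictionary-based cost tables and the `coverage_prob` callback standing in for P are assumptions of this sketch; costs and utilities are also assumed to be on a comparable scale.

```python
def assign_targets(costs, frontier_cells, coverage_prob):
    """Greedy robot-to-frontier assignment with utility discounting.

    costs[i][c]: travel cost V^i of robot i to frontier cell c.
    coverage_prob(c, c2): probability that sensing from c also covers c2.
    Returns a dict mapping robot index to its chosen frontier cell.
    """
    utility = {c: 1.0 for c in frontier_cells}  # assumed initial utility
    unassigned = set(range(len(costs)))
    assignment = {}
    while unassigned:
        # Pick the (robot, cell) pair with the best utility-minus-cost trade-off
        robot, cell = max(
            ((i, c) for i in sorted(unassigned) for c in frontier_cells),
            key=lambda pair: utility[pair[1]] - costs[pair[0]][pair[1]],
        )
        assignment[robot] = cell
        unassigned.remove(robot)
        # Discount every cell that the selected target is expected to cover
        for other in frontier_cells:
            utility[other] *= 1.0 - coverage_prob(cell, other)
    return assignment


# Both robots are closest to cell "A"; a chosen cell covers only itself here.
costs = [{"A": 0.1, "B": 0.5}, {"A": 0.2, "B": 0.4}]
result = assign_targets(costs, ["A", "B"], lambda c, c2: 1.0 if c == c2 else 0.0)
print(result)  # {0: 'A', 1: 'B'}
```

Without the discounting step both robots would head for "A"; reducing its utility after the first assignment sends the second robot to "B".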

Example Revised

Typical Trajectories
Left: implicit. Right: explicit.

Exploration Time

Drawbacks
- The assignment considered so far is a greedy assignment
- Better approaches:
  - Hungarian Method: computes the optimal assignment of jobs to machines given a fixed cost matrix
  - Market economy-based approaches (auctions): robots trade with targets; computational load is shared between the robots
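The gap between a greedy and an optimal assignment can be seen on a tiny cost matrix. The optimal solver below is a brute-force search over all permutations, swapped in for the Hungarian Method for brevity; it returns the same optimum but is only practical for a handful of robots. The cost values are made up for illustration.

```python
from itertools import permutations


def greedy_assignment(cost):
    """Repeatedly take the cheapest remaining (robot, target) pair."""
    n = len(cost)
    free_robots, free_targets = set(range(n)), set(range(n))
    result = {}
    while free_robots:
        r, t = min(((r, t) for r in free_robots for t in free_targets),
                   key=lambda rt: cost[rt[0]][rt[1]])
        result[r] = t
        free_robots.remove(r)
        free_targets.remove(t)
    return result


def optimal_assignment(cost):
    """Exhaustive search over all one-to-one assignments (tiny n only)."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[r][perm[r]] for r in range(n)))
    return dict(enumerate(best))


def total_cost(cost, assignment):
    """Sum of the costs of all assigned (robot, target) pairs."""
    return sum(cost[r][t] for r, t in assignment.items())


cost = [[1, 2],
        [2, 100]]
print(total_cost(cost, greedy_assignment(cost)))   # 101
print(total_cost(cost, optimal_assignment(cost)))  # 4
```

Greedy grabs the cheapest pair first and is then forced into the expensive one, while the optimal assignment accepts two moderately priced pairs.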

Coalitions & Roles

Dynamic Role Assignment

Dynamic Role Assignment
- A mechanism to efficiently coordinate agents
  - Predefined roles (e.g. Attacker, Defender,
  - Role-specific behaviors selection
- Assignment: mapping between N roles and M
  - Can be according to the context (e.g. team formation)
  - Suited for dynamic domains (e.g. robot
- Example: Robot Soccer
  - Avoid swarm behavior and interference (e.g. neither attack your own teammates nor get into the way of an attacking or defending robot)
  - Task decomposition and task (re-)allocation (e.g. the player closest to the ball should go to the ball)
  - Dynamic role changes (e.g. if a player is blocked, another should take over)
  - Coordinating joint execution (e.g. passing the

General Algorithm
Assumptions:
- Fixed ordering of roles {1,2,…,N}, e.g. role 1 must be assigned first, followed by role 2, etc.
- Each agent can be assigned to only one role
- The utility uij reflects how appropriate agent i is for role j given the context

for all agents in parallel
    I := ∅;                            // Committed assignments with ordering
    for each role j = 1,…,N
        compute utility ui,j;          // Own preference of agent i
        broadcast ui,j;                // To all other agents
    end;
    wait until all ui,j are received   // From all the other agents
    for each role j = 1,…,N
        assign role j to agent i* = argmax i∉I {ui,j};
        I := I ∪ {i*};                 // Add assignment
    end;
end.
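The assignment loop that every agent runs after receiving all utilities can be sketched as follows. The broadcast step is left out: the table `utilities` is assumed to already hold the values u_ij from all agents, and the agent names and numbers are made up for illustration.

```python
def assign_roles(utilities):
    """Assign the ordered roles 0..N-1 greedily.

    utilities[i][j] is u_ij, the suitability of agent i for role j.
    Every agent evaluates the same table and so derives the same result.
    Returns a dict mapping role index to agent.
    """
    committed = set()  # the set I of agents that already hold a role
    roles = {}
    n_roles = len(next(iter(utilities.values())))
    for j in range(n_roles):
        candidates = [i for i in sorted(utilities) if i not in committed]
        if not candidates:
            break  # more roles than agents
        best = max(candidates, key=lambda i: utilities[i][j])
        roles[j] = best
        committed.add(best)
    return roles


table = {
    "r1": [0.9, 0.2, 0.1],
    "r2": [0.8, 0.7, 0.3],
    "r3": [0.4, 0.6, 0.5],
}
print(assign_roles(table))  # {0: 'r1', 1: 'r2', 2: 'r3'}
```

Iterating over the agents in sorted order makes tie-breaking deterministic, which matters because all agents must arrive at the same assignment independently.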

Case Study: CS-Freiburg Soccer
Each player can have one of four
- goalie (fixed): special hardware setup, thus unable to change this
- active player (in charge of dealing with the ball): can approach the ball or bring the ball forward towards the opponent goal
- strategic player: maintains a position back in its own
- supporter: (supports either active or
  - in defensive play it complements the team’s defensive
  - in offensive play it presents itself to receive a pass close to the opponent's goal

Role Utilities
- Placement: each role has a preferred location, which depends on the situation:
  - ball position, position of teammates and opponents
  - defensive situation or attack
  - computed by potential
- Utility uij for each role:
  - “Negative utility (costs)” for reaching the preferred location of the role
  - Costs are computed from partial costs for distance (ud), turn angle (ut), objects on the path (uo)
  - Weighted sum to ensure utilities between 0..1: Uij = wd*ud + wt*ut + wo*uo

[Figure panels: active role, strategic role, support role]

Dynamic Role Assignment
- Each player computes utilities uij and broadcasts results
- Group utility: consider all possible assignments and compute the summed utility from each agent's individual utility for its assigned role
- Take the assignment with the highest utility sum as solution (under the assumption that every agent does so)
- Roles are re-assigned only when
  - the role change is significant, i.e. the new utility >> old utility (hysteresis factor to avoid oscillation)
  - two players agree (by
- Note that agents might lie since “opinion” about global position can differ (even with a global world model)
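Group utility maximization with a hysteresis margin can be sketched as below. The additive margin of 0.1 is an assumed stand-in for the hysteresis factor, the utility table is assumed to be square (as many agents as roles), and the agreement step between players is not modeled.

```python
from itertools import permutations


def best_group_assignment(utilities):
    """Return (assignment, summed utility) with the highest group utility.

    utilities[i][j]: utility of agent i for role j.
    assignment[i] is the role given to agent i.
    """
    n = len(utilities)
    best = max(permutations(range(n)),
               key=lambda roles: sum(utilities[i][roles[i]] for i in range(n)))
    return list(best), sum(utilities[i][best[i]] for i in range(n))


def reassign(current, utilities, hysteresis=0.1):
    """Switch roles only if the new assignment is better by a clear margin."""
    old_value = sum(utilities[i][current[i]] for i in range(len(current)))
    candidate, new_value = best_group_assignment(utilities)
    return candidate if new_value > old_value + hysteresis else current


utilities = [[0.6, 0.5],
             [0.55, 0.5]]
print(reassign([1, 0], utilities))                  # [1, 0]
print(reassign([1, 0], utilities, hysteresis=0.0))  # [0, 1]
```

The swap would raise the group utility only from 1.05 to 1.1, so with the margin in place the current roles are kept; this is what suppresses oscillation when utilities fluctuate slightly.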

Example for Role Switching (1/2)
Attack against Team Osaka (Japan). The attacking robot is blocked by a defender and consequently replaced by an unblocked player.

Example for Role Switching (2/2)
Defense against Artisti Veneti (Italy). The roles of active and strategic player are switched a couple of times.

Failed Ball-Passing
A pass in the semi-final against the Italian ART Italy team (RoboCup 1999). This was based on a standard plan: “if it is not possible to score directly, wait until supporter arrives, then make the pass”

Coalition Formation

Coalition Formation
- Necessary when tasks are more efficiently solved by a specific combination of agent capabilities
  - E.g. a disaster location requires ambulance and fire
- Assignment of groups to tasks is necessary when tasks cannot be performed by a single agent
  - E.g. a single fire brigade cannot extinguish a large
- A group of agents is called a coalition
- A coalition structure is a partitioning of the set of agents into disjoint coalitions
- An agent participates in only one coalition
- A coalition may consist of only a single agent
- Generally, coalitions consist of heterogeneous agents

Applications for Coalition Formation
- In e-commerce, buyers can form coalitions to purchase a product in bulk and take advantage of price discounts (Tsvetovat et al., 2000)
- In Real Time Strategy (RTS) games, groups of heterogeneous agents can jointly attack bases of the opponent. Mixtures of agents have to be chosen according to the defense strategy of the opponent
- Distributed vehicle routing among delivery companies with their own delivery tasks and vehicles (Sandholm 1997)
- Wide-area surveillance by autonomous sensor networks (Dang 2006)
- In Rescue, team formation to solve particular sub-problems, e.g. larger robots deploy smaller robots into confined spaces

Fire Brigade Example

Three Activities in Coalition Formation
- Coalition structure generation:
  - Partitioning of the agents into exhaustive and disjoint
  - Inside the coalitions, agents will coordinate their activities, but agents will not coordinate between coalitions
- Solving the optimization problem in each coalition:
  - Pooling the tasks and resources of the agents in the coalition and solving the joint problem
  - The coalition objective could be to maximize the monetary value, or the overall expected utility
- Dividing the value of the generated solution:
  - In the end, each agent will receive a value (money or utility) as a result of participating in the coalition
  - In some problems, the coalition value the agents have to share is negative, being a shared cost

Problem Formulation
- A group of agents S ⊆ A is called a coalition, where A denotes the set of all agents and S ≠ ∅
- The coalition of all the agents is called the grand coalition
- A coalition structure (CS) partitions the set of agents into coalitions
- The value of each coalition S is given by a function vS
- Each coalition value is independent of non-members' actions
- CS* is the social welfare maximizing coalition structure

Coalition Structure Generation
- The value of a coalition structure is given by: V(CS) = ∑{S∊CS} vS
- The goal is to maximize the social welfare of a set of agents A by finding a coalition structure that satisfies: CS* = argmax{CS∊Partitions(A)} V(CS)
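For a handful of agents, CS* can be found by enumerating every partition and summing the coalition values. This is a brute-force sketch; the size-based value function in the example is invented for illustration.

```python
def partitions(agents):
    """Yield every partition of the list `agents` into non-empty, disjoint coalitions."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for partial in partitions(rest):
        # Put `first` into each existing coalition in turn ...
        for k in range(len(partial)):
            yield partial[:k] + [[first] + partial[k]] + partial[k + 1:]
        # ... or into a new coalition of its own
        yield [[first]] + partial


def structure_value(cs, value):
    """V(CS): the sum of the coalition values v_S over all coalitions in CS."""
    return sum(value(frozenset(s)) for s in cs)


def best_structure(agents, value):
    """Return a coalition structure CS* that maximizes V(CS)."""
    return max(partitions(agents), key=lambda cs: structure_value(cs, value))


# Hypothetical values by coalition size: pairs are the most efficient.
size_value = {1: 1.0, 2: 3.0, 3: 3.5}
v = lambda s: size_value[len(s)]
cs_star = best_structure([1, 2, 3], v)
print(cs_star, structure_value(cs_star, v))
```

With these values any structure made of one pair and one singleton reaches the maximum of 4.0, ahead of the grand coalition (3.5) and of three singletons (3.0).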

Special Coalition Values
- Coalition values are super-additive iff for every pair of disjoint coalitions S, T ⊆ A: vS∪T ≥ vS + vT
  - If coalition values are super-additive, then the coalition structure containing the grand coalition gives the highest value
  - Agents cannot do worse by teaming up
- Coalition values are sub-additive iff for every pair of disjoint coalitions S, T ⊆ A: vS∪T < vS + vT
  - If coalition values are sub-additive, then the coalition structure {{a} | a ∈ A} in which no agent cooperates gives the highest value
- Is the ambulance rescue task in the RoboCup Rescue domain super-additive, sub-additive, or neither?
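Both properties can be tested by brute force over all pairs of disjoint coalitions. This is a sketch for small agent sets; the two example value functions are invented for illustration.

```python
import math
from itertools import combinations


def coalitions(agents):
    """All non-empty subsets of `agents` as frozensets."""
    agents = list(agents)
    return [frozenset(c) for r in range(1, len(agents) + 1)
            for c in combinations(agents, r)]


def is_super_additive(agents, value):
    """Check v(S ∪ T) >= v(S) + v(T) for every pair of disjoint coalitions."""
    subsets = coalitions(agents)
    return all(value(s | t) >= value(s) + value(t)
               for s in subsets for t in subsets if not (s & t))


def is_sub_additive(agents, value):
    """Check v(S ∪ T) < v(S) + v(T) for every pair of disjoint coalitions."""
    subsets = coalitions(agents)
    return all(value(s | t) < value(s) + value(t)
               for s in subsets for t in subsets if not (s & t))


agents = [1, 2, 3, 4]
print(is_super_additive(agents, lambda s: len(s) ** 2))      # True
print(is_sub_additive(agents, lambda s: math.sqrt(len(s))))  # True
```

A value function can fail both checks, which corresponds to the "neither" case raised in the question above.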

Coalition Structure Generation
- Input: all possible coalitions and their values
- A = {1,2,3,4}
- For N agents the number of possible coalitions is 2^N - 1, and the number of possible coalition structures is the Bell number of N, which is O(N^N) and ω(N^(N/2))
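Both counts are easy to compute for small N. The sketch below obtains the number of coalition structures (the number of partitions of the agent set, i.e. the Bell number) with the Bell triangle; the function names are chosen for illustration.

```python
def count_coalitions(n):
    """Number of non-empty coalitions of n agents."""
    return 2 ** n - 1


def count_coalition_structures(n):
    """Number of partitions of n agents (the Bell number), via the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        # Each new row starts with the last entry of the previous row
        new_row = [row[-1]]
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
    return row[-1]


for n in (4, 10):
    print(n, count_coalitions(n), count_coalition_structures(n))
# 4 15 15
# 10 1023 115975
```

For A = {1,2,3,4} both counts happen to be 15, but already at ten agents the number of coalition structures is more than a hundred times the number of coalitions.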

Coalition Graph
- Nodes represent coalition structures
- Arcs represent either merges (downwards) or splits (upwards)

Coalition Structure Search Time
Searching the whole coalition graph for the optimal coalition structure is intractable (only feasible if |A| < 15)

Approximate Solution to Structure Search
- Can we approximate the search by visiting only a subset L of the nodes?
- Choose a set L (a subset of all coalition structures of A) and pick the best coalition structure seen: CSL* = argmax{CS∊L} V(CS)
- One requirement is to guarantee that the found coalition structure is within a worst-case bound from optimal: k * V(CSL*) ≥ V(CS*)

Approximate Solution
- If the bottom two levels of the graph are considered, then:
  - k = |A|
  - and the number of nodes searched is n = 2^(|A|-1)
- It can be proven that no other search algorithm can establish a bound k while searching n = 2^(|A|-1) nodes or fewer

Coalition Structure Search Algorithm
1 Search the bottom two levels of the coalition structure graph
2 Continue with breadth-first search from the top of the graph as long as there is time left, or until the entire graph has been searched
3 Return the coalition structure that has the highest welfare among those seen so far
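Step 1 of the algorithm can be sketched as follows, taking the bottom two levels to be the grand coalition plus every split of the agents into exactly two coalitions. The breadth-first continuation of step 2 is not implemented here, and the value function in the example is invented for illustration.

```python
from itertools import combinations


def bottom_two_levels(agents):
    """Yield the grand coalition and every split of the agents into two coalitions."""
    agents = list(agents)
    everyone = frozenset(agents)
    yield [everyone]
    first, rest = agents[0], agents[1:]
    # Keep `first` in the left coalition so that each split appears exactly once
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = frozenset((first,) + extra)
            yield [left, everyone - left]


def search_bottom_levels(agents, value):
    """Best coalition structure among the bottom two levels of the graph."""
    return max(bottom_two_levels(agents),
               key=lambda cs: sum(value(s) for s in cs))


agents = [1, 2, 3, 4]
print(len(list(bottom_two_levels(agents))))  # 8
best = search_bottom_levels(agents, lambda s: 3.0 if len(s) == 2 else 1.0)
print(sorted(len(s) for s in best))          # [2, 2]
```

The node count of 8 for four agents matches n = 2^(|A|-1).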

Case Study: ResQ Freiburg Task Allocation
- N ambulance teams have to rescue M civilians after an earthquake
- Civilians are characterized by Buriedness, Damage and Hit-Points
- Costs are the time to rescue a civilian, composed of the coalition’s joint max travel time to reach the victim, and the time needed for the rescue
- The overall utility is the number of rescued civilians (the civilians brought to a refuge)
- We considered the ambulance rescue task as super-additive
  - The rescue operation itself is super-additive
  - Are there possibly situations where the ambulance team needs to be split?

Problem as Sequence Assignment
- Assign a sequence R of tasks (here victims) to the grand coalition of agents A (here ambulances)
- R = <r1,r2,…,rN>, where ri denotes a rescue task and i the position in the sequence
- U(R) denotes the predicted utility (the number of survivors) when executing sequence R
- Hence, the problem is to find the optimal sequence from the set of all possible sequences: R* = argmax U(R)
- Enumerating all possible sequences is intractable (N!)
- Greedy solutions:
  - Prefer victims that can be rescued fast (small
  - Prefer urgent victims (high

Results RoboCup 2004

Summary
- Cooperative sensing with Kalman Filter and Particle Filter
  - Make use of sensor information
- Coordination technique for exploration of the environment
- Dynamic role assignment is an efficient method for team coordination
- Action selection and coordination are essential when acting in groups
- Coalition formation is the process of finding the “social welfare” coalition structure among a set of agents
