TRANSCRIPT
Game Theory for Safety and Security
Arunesh Sinha
Motivation: Real-World Security Issues
Central Problem
Allocating limited security resources against an adaptive, intelligent adversary
Prior Work
• Stackelberg Games have been very successful in practice
Defender-Adversary Interaction: Stackelberg Game
Payoff matrix (defender, adversary):
  (1, 0)    (−3, 3)
  (−9, 9)   (1, 0)
Defender's mixed strategy: p1 = 0.75, p2 = 0.25
• Defender moves first, laying out the defense
• Adversary knows the defender's mixed strategy
  • But does not know the coin flips (the realized allocation)
• Stackelberg Equilibrium: optimal randomization
[Figure: randomized allocations over Days 1–5]
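The optimal randomization can be computed with the standard multiple-LPs method: for each adversary pure strategy, solve a linear program that maximizes defender utility subject to that strategy being the adversary's best response, then keep the best solution. A minimal sketch for the 2×2 game on this slide, using SciPy (payoff matrices read off the slide; rows are defender pure strategies):

```python
import numpy as np
from scipy.optimize import linprog

# Payoff matrices from the slide: rows = defender pure strategies,
# columns = adversary targets; entries split into (defender, adversary).
D = np.array([[1.0, -3.0], [-9.0, 1.0]])  # defender payoffs
A = np.array([[0.0, 3.0], [9.0, 0.0]])    # adversary payoffs

best = (-np.inf, None, None)  # (defender value, mixed strategy, induced target)
for j in range(A.shape[1]):
    # LP: maximize defender utility subject to target j being a best response.
    c = -D[:, j]  # linprog minimizes, so negate
    # For every other target k, adversary utility of k must not exceed that of j.
    A_ub = np.array([A[:, k] - A[:, j] for k in range(A.shape[1]) if k != j])
    b_ub = np.zeros(A_ub.shape[0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, 2)), b_eq=[1.0], bounds=[(0, 1)] * 2)
    if res.success and -res.fun > best[0]:
        best = (-res.fun, res.x, j)

value, p, target = best
# p ≈ (0.75, 0.25), matching the slide; defender value -1.5 at target 0
```

Ties are broken in the defender's favor (strong Stackelberg equilibrium), which is why the adversary's indifference point p1 = 0.75 is the optimal randomization.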
Outline
• Threat Screening Games
• Audit Games
• Crime Prediction using Learning in Games
Threat Screening Games
Screening for Threats
Airport Passenger Screening Problem
• The Transportation Security Administration (TSA) screens over 800 million passengers annually
• Dynamic Aviation Risk Management Solution (DARMS) [with USC/CREATE and Milind Tambe]
• An intelligent approach to screening passengers
Screening Effectiveness
Timely Screening
Threat Screening Games
Actors
• Screener (TSA)
• Adversary (e.g., terrorist)
• Benign Screenees
Threat Screening Games
Current Screening Approach
• Two broad passenger categories: TSA PreCheck and general
• Same type of screening within each category (some exceptions, e.g., children)
• Long queues
• A lot of screening time is spent on benign passengers
Proposed Solution
• Finer passenger categories: risk level and flight
• Randomized screening

Screening mix by category (% of passengers assigned to each team combination; rows sum to 100):

Category                   X-Ray + Metal Detector   X-Ray + AIT   X-Ray   Metal Detector
Low Risk, Domestic                    4                    2        90          4
Low Risk, International              40                   10        30         20
High Risk, International              5                   95         0          0
Threat Screening Games
Actions of Players
• Defender: Allocation of screening teams to passengers
• Resource capacity constraints: For example, X-ray can be used only 40 times/hour
• Passenger flow constraints: All passengers in all categories must be screened
• Adversary: Choose a passenger category to arrive in
Threat Screening Games
Payoffs of Players
• Defender payoff: measures the loss incurred from a successful attack
  • The probability of a successful attack is a function of the defender and adversary strategies
• Adversary payoff: measures the gain from a successful attack
  • The probability of a successful attack is a function of the defender and adversary strategies
Optimization Problem
• Maximize defender payoff (i.e., minimize loss)
• The objective is a function of the defender's randomized strategy and the adversary's best response
• Subject to: the adversary plays a best response
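As a toy instance of this kind of optimization (not the actual TSG formulation, which is vastly larger): two passenger categories, two screening team types with capacity limits, and an adversary who arrives in the category with the highest expected damage, so the defender minimizes worst-case loss with a linear program over marginal screening fractions. All detection rates, capacities, and loss values below are invented for illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Toy numbers (all illustrative): 90 low-risk and 10 high-risk passengers;
# a "basic" team (detection 0.5, capacity 80) and a "thorough" team
# (detection 0.9, capacity 30). Variable x_c = fraction of category c
# screened thoroughly; the rest go through basic screening (flow constraint).
N = np.array([90.0, 10.0])
loss = np.array([1.0, 5.0])   # defender loss if a category-c attack succeeds

# Variables: [x0, x1, z]; minimize z = worst-case expected loss.
# Miss probability for category c is 1 - (0.5 + 0.4 * x_c) = 0.5 - 0.4 * x_c.
c = [0.0, 0.0, 1.0]
A_ub = [
    [-0.4 * loss[0], 0.0, -1.0],   # loss_0 * (0.5 - 0.4*x0) <= z
    [0.0, -0.4 * loss[1], -1.0],   # loss_1 * (0.5 - 0.4*x1) <= z
    [N[0], N[1], 0.0],             # thorough-team capacity: 90*x0 + 10*x1 <= 30
    [-N[0], -N[1], 0.0],           # basic-team capacity: 100 - (90*x0 + 10*x1) <= 80
]
b_ub = [-0.5 * loss[0], -0.5 * loss[1], 30.0, -20.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1), (0, 1), (None, None)])
worst_case_loss = res.fun
```

At the optimum every high-risk passenger is screened thoroughly and the remaining thorough capacity goes to the low-risk category; scaling this marginal-probability idea to the full game is the intuition behind MGA.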
Threat Screening Games
Technical Challenges
• Very large game: ~10^41 defender actions
• The equilibrium computation is NP-hard
• Current large-scale optimization approaches like Column Generation (CG) fail
• The compact representation (CR) approach yields invalid solutions
Threat Screening Games
Technical Contribution
• We propose the Marginal Guided Algorithm (MGA)
• Brown, Sinha, Schlenker, Tambe; One Size Does Not Fit All: A Game-Theoretic Approach for Dynamically and Effectively Screening for Threats [AAAI 2016]
[Figure: two plots comparing MGA and CG as the number of flights grows from 10 to 50 — screener utility, and runtime in seconds on a log scale]
Threat Screening Games
General Model for Screening
Outline
• Threat Screening Games
• Audit Games
• Crime Prediction using Learning in Games
Audit Games
Privacy Concerns in Healthcare
What's Going On?
• Permissive access control regime
• Trust employees to do the right thing
• Malicious insiders can cause breaches
Audit Games
• Post-hoc inspection of employee accesses to patient health records
• Detect violations
• Punish violators
• Auditing is ubiquitous and effective against insider threat
• Financial auditing, computer security auditing
Audit Game Model
• n suspicious cases
• Auditors: k inspections, k ≪ n
• Adversary
Audit Games
Actions of Players
• Defender chooses a randomized allocation of limited resources
• Also, chooses a punishment level
• Adversary plays his best response: chooses a misdeed to commit
• Adversary gets punished if the misdeed is caught
Audit Games
Payoffs of Players
• Defender payoff includes the loss incurred from a successful breach
  • The probability of a breach is a function of the defender and adversary strategies
• Defender payoff includes the loss from a high punishment level (punishment is not free)
  • Negative work environment → loss for the organization
  • Immediate loss from punishment → suspension/firing means loss for the organization
• Adversary payoff includes the gain from a successful breach
  • The probability of a breach is a function of the defender and adversary strategies
• Adversary payoff includes the loss due to punishment when caught
Optimization Problem
• Maximize defender payoff (i.e., minimize loss)
• The objective is a function of the defender's randomized strategy, the punishment level, and the adversary's best response
• Subject to: the adversary plays a best response (a non-linear constraint)
• Techniques such as Second-Order Cone Programs enable fast computation [IJCAI 2013, AAAI 2015]
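To see the deterrence vs. punishment-cost tradeoff concretely, here is a brute-force sketch of a toy audit game. All parameter values are invented, and the grid search stands in for the much harder non-linear program that the papers solve via SOCPs. The defender picks an inspection probability q ≤ k/n and a punishment level P; the adversary violates only if the expected gain is positive, with ties broken in the defender's favor:

```python
# Toy audit game (all numbers illustrative).
n, k = 100, 10            # n suspicious cases, k inspections -> q <= k/n
L = 50.0                  # defender loss from an undetected violation
gain = 0.8                # adversary gain from a violation
pun_cost = 0.5            # organizational cost per unit of punishment level
q_max = k / n

best = (float("inf"), None)
for i in range(int(q_max * 100) + 1):         # inspection probability grid
    q = i / 100
    for j in range(101):                      # punishment level grid over [0, 10]
        P = j / 10
        # Adversary best response: violate only if expected gain is positive
        # (ties broken in the defender's favor, as in strong Stackelberg).
        violates = gain - q * P > 0
        expected_breach_loss = (1 - q) * L if violates else 0.0
        total = expected_breach_loss + pun_cost * P   # punishment is not free
        if total < best[0]:
            best = (total, (q, P))

total, (q, P) = best
# Deterrence is achieved at q = 0.1, P = 8.0, total cost 4.0 —
# far cheaper than tolerating a breach ((1 - 0.1) * 50 = 45).
```

Raising P deters the adversary but costs the organization; lowering P saves that cost but invites violations, which is exactly the tradeoff the audit-game model optimizes.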
Audit Games
General Model for Auditing
Punishment costs lead to tradeoff between deterrence and loss due to misdeed
Optimal inspection allocation and punishment policy can be computed efficiently
Outline
• Threat Screening Games
• Audit Games
• Crime Prediction using Learning in Games
A Big Problem
Urban Crime
In 2009: 7,857,000 crimes, with an estimated cost of $10,994,562,000
Crime Prediction
Challenges
• Model the behavior of criminals
  • What is their utility?
• Criminals are not homogeneous
• Crime has spatial aspects
  • Opportunity
• Real-world data about frequent defender-adversary interaction is available
Predictive Policing Solution
• Our contribution [AAMAS 2015, 2016]
  • Learn crime and crime evolution in response to patrolling
  • Then design optimal patrols
• Distinct from the "crime predicts crime" philosophy in criminology [Chen 2004; McCue 2015]
• Deployment: licensed to the start-up ArmorWay; deployed at the University of Southern California
Crime Prediction
[Figure: prediction accuracy (0 to 1) of EMC2 vs. "crime predicts crime" vs. random]
The Role of Learning in Stackelberg Games
[Diagram: feedback loop — data about past interaction → learn adversary behavior → adversary model → plan optimal defender strategy → defender strategy → new interaction data]
• Five patrol areas
• Eight hour shifts
• Crime data: number of crimes/shift/area
• Patrol data: number of officers/shift/area
32
Crime Prediction
Learning Model
• Dynamic Bayesian Network (AAMAS 2015)
• Captures the interaction between officers and criminals
  • D: number of defenders (known)
  • X: number of criminals (hidden)
  • Y: number of crimes (known)
  • T: step = shift
• Expectation-Maximization with intelligent factoring
[Diagram: DBN unrolled across shifts T and T+1]
Crime Prediction
Planning
• DBN (criminal model): takes a defender strategy as input and outputs predicted crime numbers
• Search problem: the search space grows exponentially with the number of steps planned ahead
• DOGS algorithm (AAMAS 2015)
• Apply Dynamic Programming in the search problem
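DOGS itself operates on the learned DBN, but the dynamic-programming idea can be sketched on a toy deterministic abstraction: memoize the best patrol schedule from each (shift, criminal-level) state instead of enumerating every allocation sequence. The crime and evolution functions and all numbers below are invented for illustration:

```python
from functools import lru_cache

# Toy deterministic abstraction over two patrol areas (all numbers invented).
K = 3   # officers available per shift
H = 4   # planning horizon in shifts

def crimes(x, d):
    """Expected crimes in an area with criminal level x and d officers (toy)."""
    return max(0.0, x * (1.0 - 0.3 * d))

def evolve(x, d):
    """Criminal level next shift: patrols deter, new criminals trickle in (toy)."""
    return round(max(0.0, x - 0.5 * d) + 1.0, 6)

@lru_cache(maxsize=None)
def plan(t, state):
    """Dynamic program: minimum total expected crimes from shift t onward,
    memoized on the (shift, criminal-levels) state."""
    if t == H:
        return 0.0, ()
    best = (float("inf"), ())
    for d0 in range(K + 1):                    # officers sent to area 0
        alloc = (d0, K - d0)                   # the rest patrol area 1
        cost = sum(crimes(x, d) for x, d in zip(state, alloc))
        nxt = tuple(evolve(x, d) for x, d in zip(state, alloc))
        future, tail = plan(t + 1, nxt)
        if cost + future < best[0]:
            best = (cost + future, (alloc,) + tail)
    return best

total, schedule = plan(0, (5.0, 2.0))   # initial criminal levels per area
```

The memoization collapses repeated (shift, state) pairs, which is the dynamic-programming step; in the full model the state is the DBN's belief over hidden criminal counts rather than a known level.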
Crime Prediction
Experimental Results
Crime heat map without patrol
Crime heat map with random patrol
Crime heat map with optimal patrol
Data Enables Learning in Games
[Diagram: the same learning loop — data about past interaction → learn adversary behavior → adversary model → plan optimal defender strategy → defender strategy]
Takeaway
Game theory enables intelligent, randomized allocation of limited security resources against an adaptive adversary.
Thank You