TRANSCRIPT
Game Theory for Safety and Security
Arunesh Sinha
Motivation: Real-World Security Issues
Central Problem
Allocating limited security resources against an adaptive, intelligent adversary
Prior Work
• Stackelberg Games have been very successful in practice
Defender-Adversary Interaction: Stackelberg Game
Payoff matrix (defender, adversary):
  (1, 0)    (−3, 3)
  (−9, 9)   (1, 0)
Defender's mixed strategy: p1 = 0.75, p2 = 0.25
• Defender moves first, laying out the defense
• Adversary knows the defender's mixed strategy
  • But does not know the coin flips (the realized allocation)
• Stackelberg Equilibrium: optimal randomization
[Figure: randomized allocations over Days 1–5]
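The optimal randomization can be computed with the standard multiple-LPs method: for each adversary pure strategy, solve a linear program that maximizes defender utility subject to that strategy being the adversary's best response, then keep the best solution. A minimal sketch for the 2×2 game on this slide, using SciPy (payoff matrices read off the slide; rows are defender pure strategies):

```python
import numpy as np
from scipy.optimize import linprog

# Payoff matrices from the slide: rows = defender pure strategies,
# columns = adversary targets; entries split into (defender, adversary).
D = np.array([[1.0, -3.0], [-9.0, 1.0]])  # defender payoffs
A = np.array([[0.0, 3.0], [9.0, 0.0]])    # adversary payoffs

best = (-np.inf, None, None)  # (defender value, mixed strategy, induced target)
for j in range(A.shape[1]):
    # LP: maximize defender utility subject to target j being a best response.
    c = -D[:, j]  # linprog minimizes, so negate
    # For every other target k, adversary utility of k must not exceed that of j.
    A_ub = np.array([A[:, k] - A[:, j] for k in range(A.shape[1]) if k != j])
    b_ub = np.zeros(A_ub.shape[0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, 2)), b_eq=[1.0], bounds=[(0, 1)] * 2)
    if res.success and -res.fun > best[0]:
        best = (-res.fun, res.x, j)

value, p, target = best
# p ≈ (0.75, 0.25), matching the slide; defender value -1.5 at target 0
```

Ties are broken in the defender's favor (strong Stackelberg equilibrium), which is why the adversary's indifference point p1 = 0.75 is the optimal randomization.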
Outline
• Threat Screening Games
• Audit Games
• Crime Prediction using Learning in Games
Threat Screening Games
Screening for Threats
Airport Passenger Screening Problem
• The Transportation Security Administration (TSA) screens over 800 million passengers annually
• Dynamic Aviation Risk Management Solution (DARMS) [with USC/CREATE and Milind Tambe]
• An intelligent approach to screening passengers
Screening Effectiveness
Timely Screening
Threat Screening Games
Actors
• Screener (TSA)
• Adversary (e.g., terrorist)
• Benign Screenees
Threat Screening Games
Current Screening Approach
• Two broad passenger categories: TSA PreCheck and general
• Same type of screening within each category (some exceptions, e.g., children)
• Long queues
• A lot of screening time is spent on benign passengers
Proposed Solution
• Finer passenger categories: risk level and flight
• Randomized screening

Screening mix by category (% of passengers assigned to each team combination; rows sum to 100):

Category                   X-Ray + Metal Detector   X-Ray + AIT   X-Ray   Metal Detector
Low Risk, Domestic                    4                    2        90          4
Low Risk, International              40                   10        30         20
High Risk, International              5                   95         0          0
Threat Screening Games
Actions of Players
• Defender: Allocation of screening teams to passengers
• Resource capacity constraints: For example, X-ray can be used only 40 times/hour
• Passenger flow constraints: All passengers in all categories must be screened
• Adversary: Choose a passenger category to arrive in
Threat Screening Games
Payoffs of Players
• Defender payoff: measures the loss incurred from a successful attack
  • The probability of a successful attack is a function of the defender and adversary strategies
• Adversary payoff: measures the gain from a successful attack
  • The probability of a successful attack is a function of the defender and adversary strategies
Optimization Problem
• Maximize defender payoff (i.e., minimize loss)
• The objective is a function of the defender's randomized strategy and the adversary's best response
• Subject to: the adversary plays a best response
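As a toy instance of this kind of optimization (not the actual TSG formulation, which is vastly larger): two passenger categories, two screening team types with capacity limits, and an adversary who arrives in the category with the highest expected damage, so the defender minimizes worst-case loss with a linear program over marginal screening fractions. All detection rates, capacities, and loss values below are invented for illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Toy numbers (all illustrative): 90 low-risk and 10 high-risk passengers;
# a "basic" team (detection 0.5, capacity 80) and a "thorough" team
# (detection 0.9, capacity 30). Variable x_c = fraction of category c
# screened thoroughly; the rest go through basic screening (flow constraint).
N = np.array([90.0, 10.0])
loss = np.array([1.0, 5.0])   # defender loss if a category-c attack succeeds

# Variables: [x0, x1, z]; minimize z = worst-case expected loss.
# Miss probability for category c is 1 - (0.5 + 0.4 * x_c) = 0.5 - 0.4 * x_c.
c = [0.0, 0.0, 1.0]
A_ub = [
    [-0.4 * loss[0], 0.0, -1.0],   # loss_0 * (0.5 - 0.4*x0) <= z
    [0.0, -0.4 * loss[1], -1.0],   # loss_1 * (0.5 - 0.4*x1) <= z
    [N[0], N[1], 0.0],             # thorough-team capacity: 90*x0 + 10*x1 <= 30
    [-N[0], -N[1], 0.0],           # basic-team capacity: 100 - (90*x0 + 10*x1) <= 80
]
b_ub = [-0.5 * loss[0], -0.5 * loss[1], 30.0, -20.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1), (0, 1), (None, None)])
worst_case_loss = res.fun
```

At the optimum every high-risk passenger is screened thoroughly and the remaining thorough capacity goes to the low-risk category; scaling this marginal-probability idea to the full game is the intuition behind MGA.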
Threat Screening Games
Technical Challenges
• Very large game: ~10^41 defender actions
• The equilibrium computation is NP-hard
• Current large-scale optimization approaches like Column Generation (CG) fail
• The compact representation (CR) approach yields invalid solutions
Threat Screening Games
Technical Contribution
• We propose the Marginal Guided Algorithm (MGA)
• Brown, Sinha, Schlenker, Tambe; One Size Does Not Fit All: A Game-Theoretic Approach for Dynamically and Effectively Screening for Threats [AAAI 2016]
[Figure: two plots comparing MGA and CG as the number of flights grows from 10 to 50 — screener utility, and runtime in seconds on a log scale]
Threat Screening Games
General Model for Screening
Outline
• Threat Screening Games
• Audit Games
• Crime Prediction using Learning in Games
Audit Games
Privacy Concerns in Healthcare
What's Going On?
• Permissive access control regime
• Trust employees to do the right thing
• Malicious insiders can cause breaches
Audit Games
• Post-hoc inspection of employee accesses to patient health records
• Detect violations
• Punish violators
• Auditing is ubiquitous and effective against insider threat
• Financial auditing, computer security auditing
Audit Game Model
• n suspicious cases
• Auditors: k inspections, k ≪ n
• Adversary
Audit Games
Actions of Players
• Defender chooses a randomized allocation of limited resources
• Also, chooses a punishment level
• Adversary plays his best response: chooses a misdeed to commit
• Adversary gets punished if the misdeed is caught
Audit Games
Payoffs of Players
• Defender payoff includes the loss incurred from a successful breach
  • The probability of a breach is a function of the defender and adversary strategies
• Defender payoff includes the loss from a high punishment level (punishment is not free)
  • Negative work environment → loss for the organization
  • Immediate loss from punishment → suspension/firing means loss for the organization
• Adversary payoff includes the gain from a successful breach
  • The probability of a breach is a function of the defender and adversary strategies
• Adversary payoff includes the loss due to punishment when caught
Optimization Problem
• Maximize defender payoff (i.e., minimize loss)
• The objective is a function of the defender's randomized strategy, the punishment level, and the adversary's best response
• Subject to: the adversary plays a best response (a non-linear constraint)
• Techniques such as Second-Order Cone Programs enable fast computation [IJCAI 2013, AAAI 2015]
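To see the deterrence vs. punishment-cost tradeoff concretely, here is a brute-force sketch of a toy audit game. All parameter values are invented, and the grid search stands in for the much harder non-linear program that the papers solve via SOCPs. The defender picks an inspection probability q ≤ k/n and a punishment level P; the adversary violates only if the expected gain is positive, with ties broken in the defender's favor:

```python
# Toy audit game (all numbers illustrative).
n, k = 100, 10            # n suspicious cases, k inspections -> q <= k/n
L = 50.0                  # defender loss from an undetected violation
gain = 0.8                # adversary gain from a violation
pun_cost = 0.5            # organizational cost per unit of punishment level
q_max = k / n

best = (float("inf"), None)
for i in range(int(q_max * 100) + 1):         # inspection probability grid
    q = i / 100
    for j in range(101):                      # punishment level grid over [0, 10]
        P = j / 10
        # Adversary best response: violate only if expected gain is positive
        # (ties broken in the defender's favor, as in strong Stackelberg).
        violates = gain - q * P > 0
        expected_breach_loss = (1 - q) * L if violates else 0.0
        total = expected_breach_loss + pun_cost * P   # punishment is not free
        if total < best[0]:
            best = (total, (q, P))

total, (q, P) = best
# Deterrence is achieved at q = 0.1, P = 8.0, total cost 4.0 —
# far cheaper than tolerating a breach ((1 - 0.1) * 50 = 45).
```

Raising P deters the adversary but costs the organization; lowering P saves that cost but invites violations, which is exactly the tradeoff the audit-game model optimizes.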
Audit Games
General Model for Auditing
Punishment costs lead to tradeoff between deterrence and loss due to misdeed
Optimal inspection allocation and punishment policy can be computed efficiently
Outline
• Threat Screening Games
• Audit Games
• Crime Prediction using Learning in Games
A Big Problem
Urban Crime
In 2009: 7,857,000 crimes, with an estimated cost of $10,994,562,000
Crime Prediction
Challenges
• Model the behavior of criminals
  • What is their utility?
• Criminals are not homogeneous
• Crime has spatial aspects
  • Opportunity
• Real-world data about frequent defender-adversary interaction is available
Predictive Policing Solution
• Our contribution [AAMAS 2015, 2016]
  • Learn crime and crime evolution in response to patrolling
  • Then design optimal patrols
• Distinct from the "crime predicts crime" philosophy in criminology [Chen 2004; McCue 2015]
• Deployment: licensed to the start-up ArmorWay; deployed at the University of Southern California
Crime Prediction
[Figure: prediction accuracy (0 to 1) of EMC2 vs. "crime predicts crime" vs. random]
The Role of Learning in Stackelberg Games
[Diagram: feedback loop — data about past interaction → learn adversary behavior → adversary model → plan optimal defender strategy → defender strategy → new interaction data]
• Five patrol areas
• Eight hour shifts
• Crime data: number of crimes/shift/area
• Patrol data: number of officers/shift/area
32
Crime Prediction
Learning Model
• Dynamic Bayesian Network (AAMAS 2015)
• Captures the interaction between officers and criminals
  • D: number of defenders (known)
  • X: number of criminals (hidden)
  • Y: number of crimes (known)
  • T: step = shift
• Expectation-Maximization with intelligent factoring
[Diagram: DBN unrolled across shifts T and T+1]
Crime Prediction
Planning
• DBN (criminal model): takes a defender strategy as input and outputs predicted crime numbers
• Search problem: the search space grows exponentially with the number of steps planned ahead
• DOGS algorithm (AAMAS 2015)
• Apply Dynamic Programming in the search problem
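DOGS itself operates on the learned DBN, but the dynamic-programming idea can be sketched on a toy deterministic abstraction: memoize the best patrol schedule from each (shift, criminal-level) state instead of enumerating every allocation sequence. The crime and evolution functions and all numbers below are invented for illustration:

```python
from functools import lru_cache

# Toy deterministic abstraction over two patrol areas (all numbers invented).
K = 3   # officers available per shift
H = 4   # planning horizon in shifts

def crimes(x, d):
    """Expected crimes in an area with criminal level x and d officers (toy)."""
    return max(0.0, x * (1.0 - 0.3 * d))

def evolve(x, d):
    """Criminal level next shift: patrols deter, new criminals trickle in (toy)."""
    return round(max(0.0, x - 0.5 * d) + 1.0, 6)

@lru_cache(maxsize=None)
def plan(t, state):
    """Dynamic program: minimum total expected crimes from shift t onward,
    memoized on the (shift, criminal-levels) state."""
    if t == H:
        return 0.0, ()
    best = (float("inf"), ())
    for d0 in range(K + 1):                    # officers sent to area 0
        alloc = (d0, K - d0)                   # the rest patrol area 1
        cost = sum(crimes(x, d) for x, d in zip(state, alloc))
        nxt = tuple(evolve(x, d) for x, d in zip(state, alloc))
        future, tail = plan(t + 1, nxt)
        if cost + future < best[0]:
            best = (cost + future, (alloc,) + tail)
    return best

total, schedule = plan(0, (5.0, 2.0))   # initial criminal levels per area
```

The memoization collapses repeated (shift, state) pairs, which is the dynamic-programming step; in the full model the state is the DBN's belief over hidden criminal counts rather than a known level.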
Crime Prediction
Experimental Results
Crime heat map without patrol
Crime heat map with random patrol
Crime heat map with optimal patrol
Data Enables Learning in Games
[Diagram: the same learning loop — data about past interaction → learn adversary behavior → adversary model → plan optimal defender strategy → defender strategy]
Takeaway
Game theory enables intelligent, randomized allocation of limited security resources against an adaptive adversary.
Thank You