2013.10 operating * by the numbers
DESCRIPTION
Discusses how new approaches to managing business risk and software services (like Dev Ops and Platform Engineering/Management) can draw from their forefather concepts: Operations Management and Decision Science.
TRANSCRIPT
OPERATING * BY THE NUMBERS
Allison Miller
@selenakyle
Overview
- How we got here
- Improving systems using models
- Model building
- Back to the Numbers
- Beg, Borrow, Steal
A Shift to Operations
- Life at Layer 8
- The modern operating environment
  - High complexity
  - High stakes
- Operations
  - Process of transforming inputs into outputs
Layer 8
What Would You Say You Do Here?
8. Business Logic
7. Application
6. Presentation
5. Session
4. Transport
3. Network
2. Data Link
1. Physical
The Modern World
- Buzzword Bingo
  - Big Data / NoSQL / Graph DBs
  - Machine Learning
  - Agile development/delivery (aka Dev Ops)
  - Cloud / Anything...as a Service
- The New Hotness is Old School
  - Management science
  - Operations research
  - Decision Science
  - Six Sigma / TQM / Kaizen
Who Cares
- Relevant to control systems
- Tools to improve running an operation/business
  - Automation
  - Optimization
  - Prediction / Forecasting
- Modeling as an operations tool
I’m no model lady. A model’s just an imitation of the real thing.
–Mae West
Improving Systems Using Models
- What are models
  - Not reality, but an approximation
  - 90% likelihood vs 90% of behavior observed
- Why do we employ models
  - Design (how to build/design a system)
  - Management (goal setting & performance monitoring)
  - Live / Production / Operations (automation)
- How do we know if they work?
Combat Modeling Spectrum
Abstraction <-> Realism
Prescriptive <-> Descriptive
Washburn & Kress, Combat Modeling, International Series in Operations Research & Management
Quality cannot be improved by trying harder.
–W.E. Deming
Operating Better Systems
- Operations – a transformative process that converts inputs into outputs
Example: Data Driven Defense
- What’s a risk decisioning system?
- Where do you put it?
- What does it cost?
- What do you need to build it?
- How do you build it?
- Operating Risk by the numbers
  - Forecasting / Prediction
  - Automation
  - Optimization
Big Data & Little Loops
123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "GET /pics/wpaper.gif HTTP/1.0" 200 6248 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:47 -0400] "GET /asctortf/ HTTP/1.0" 200 8130 "http://search.netscape.com/Computers/Data_Formats/Document/Text/RTF" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "GET /pics/5star2000.gif HTTP/1.0" 200 4005 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
[Tue Mar 9 22:02:41 2004] [info] created shared memory segment #10813446
[Tue Mar 9 22:02:41 2004] [notice] Apache/1.3.29 (Unix) mod_ssl/2.8.16 OpenSSL/0.9.7c configured -- resuming normal operations
[Tue Mar 9 22:02:41 2004] [info] Server built: Mar 7 2004 13:38:59
pausing [http://xmlrevenue.com/s.php?username=jenneypan&keywords=Online+Gambling] for 50000 ms
[Tue Mar 9 22:04:16 2004] [error] [client 218.93.92.137] mod_security: Access denied with code 200. Pattern match "Basic" at HEADER.
[Tue Mar 9 22:07:16 2004] [error] [client 203.121.182.190] mod_security: Invalid character detected [4]
123.123.123.123 - - [26/Apr/2000:00:23:50 -0400] "GET /pics/5star.gif HTTP/1.0" 200 1031 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] "GET /pics/a2hlogo.jpg HTTP/1.0" 200 4282 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] "GET /cgi-bin/newcount?jafsof3&width=4&font=digital&noshow HTTP/1.0" 200 36 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
[Tue Mar 9 22:02:41 2004] [notice] Accept mutex: sysvsem (Default: sysvsem)
[Tue Mar 9 22:03:26 2004] [error] [client 218.93.92.137] mod_security:
[Tue Mar 9 22:07:16 2004] [error] [client 203.121.182.190] mod_security: Invalid character detected [4]
123.123.123.123 - - [26/Apr/2000:00:23:50 -0400] "GET /pics/5star.gif HTTP/1.0" 200 1031 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] "GET /pics/a2hlogo.jpg HTTP/1.0" 200 4282 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] "GET /cgi-bin/newcount?jafsof3&width=4&font=digital&noshow HTTP/1.0" 200 36 "http://www.jafsoft.com/asctortf/" "Mozilla/4.05 (Macintosh; I; PPC)"
[Tue Mar 9 22:02:41 2004] [notice] Accept mutex: sysvsem (Default: sysvsem)
Big Data & Little Loops
* Loop Disposition: Logic, Human, or Other?
Big Data & Little Loops
Why are you picking on me?
Boo-yah!
Still getting away with it.
<Sigh> Nobody understands me.
SHALL WE PLAY A GAME?
(SINCE WE CAN’T PLAY “CLUE” FOR EVERY
LOGIN, TRANSACTION, NEW USER MESSAGE, FRIEND REQUEST, ATTACHMENT, PACKET, WINK, POKE, CLICK, BIT
WE BUILD RISK MODELS)
Applying Decisions
Risk management is decision management
ACTOR ATTEMPTS ACTION
SUBMIT
WHAT IS THE REQUEST
HOW TO HONOR THE REQUEST
SHOULD WE HONOR?
RESULT
ACTION OCCURS
Applied where?
Where risks manifest in observable behavior
Where system owners make decisions
Where controls can be optimized by better recognizing identity, intent, or change
Decisions, Decisions
                      RESPONSE
POPULATION   Authorize         Block
Good                           false positive
Bad          false negative
Incorrect decisions have a cost
Correct decisions are free (usually)
Good Action Gets Blocked
Bad Action Gets Through
Downstream Impacts
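The good/bad versus authorize/block grid can be turned into a single number by pricing each cell. The following is a minimal sketch, not the speaker's method: the unit costs and the outcome counts are hypothetical values chosen only for illustration, and `total_cost` is a name invented here.

```python
from collections import Counter

# Hypothetical unit costs (assumption; the slides give no figures).
COST = {
    ("good", "authorize"): 0,   # correct decision: free (usually)
    ("bad", "block"): 0,        # correct decision: free (usually)
    ("good", "block"): 5,       # false positive: good action gets blocked
    ("bad", "authorize"): 50,   # false negative: bad action gets through
}

def total_cost(outcomes):
    """Sum the cost of a list of (population, response) pairs."""
    counts = Counter(outcomes)
    return sum(COST[pair] * n for pair, n in counts.items())

# Hypothetical day of traffic: 1,000 decisions.
outcomes = (
    [("good", "authorize")] * 950
    + [("good", "block")] * 20
    + [("bad", "block")] * 25
    + [("bad", "authorize")] * 5
)

print(total_cost(outcomes))  # 20 * 5 + 5 * 50
```

Only the two incorrect cells contribute, which is the point of the slide: the cost of a decision system is the cost of its mistakes.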
Such as...
Populations - Users, Transactions, Messages, Packets, API calls, Files
Actions - Allow, Block, Challenge, Review, Retry, Quarantine, Add privileges, Upgrade privileges, Make Offer
Costs - Fraud, Data leakage, Customer churn, Customer contacts, Downstream liability
For example:
ACTOR ATTEMPTS Payment
p (actor attempting payment is accountholder)
Decision
- Authorize
- Review
- Refer
- Request Authentication
- Decline
f(variable A + Variable B + ...)
SUBMIT
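One way to read the payment example is that the model outputs p(actor is accountholder) and a small rule table turns that probability into a disposition. The sketch below assumes hypothetical cut-offs and an assumed ordering of dispositions. "Refer" is left out because the slide does not say where it sits relative to the others.

```python
# Hypothetical cut-offs on p(actor is accountholder), highest first.
# Both the numbers and the ordering are assumptions for illustration.
THRESHOLDS = [
    (0.95, "Authorize"),
    (0.80, "Request Authentication"),
    (0.50, "Review"),
]

def decide(p_accountholder):
    """Return the first disposition whose cut-off the probability meets."""
    for cutoff, decision in THRESHOLDS:
        if p_accountholder >= cutoff:
            return decision
    return "Decline"

for p in (0.99, 0.85, 0.60, 0.10):
    print(p, decide(p))
```

Keeping the cut-offs in data rather than in nested conditionals makes them easy to tune as costs change.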
Flavors of Risk Models
I deviate significantly from a normal (good) pattern
I summarize a known bad pattern
fa(x), fb(x), fc(x)
fq(x), fr(x), fs(x)
What is normal?
http://en.wikipedia.org/wiki/Normal_distribution
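For the "deviates from normal" flavor, a common first cut is a z-score: how many standard deviations a new value sits from a customer's own history. This is a minimal sketch that assumes the history is roughly normally distributed. The purchase amounts and the threshold of 3 are hypothetical.

```python
from statistics import mean, stdev

def z_score(history, value):
    """Standard deviations between `value` and the mean of `history`."""
    return (value - mean(history)) / stdev(history)

def is_anomalous(history, value, threshold=3.0):
    """Flag values more than `threshold` standard deviations from normal."""
    return abs(z_score(history, value)) > threshold

# Hypothetical past purchase amounts for one customer.
history = [20, 25, 22, 30, 24, 27, 23, 26]

print(is_anomalous(history, 28))   # close to past behavior
print(is_anomalous(history, 250))  # far outside past behavior
```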
WHAT IS BAD? WHAT IS GOOD?
Model Development Process
- Target: Yes/No questions best
- Find Data, Variable Creation: Best part
- Data Prep: Worst part
- Model Training: Pick an algorithm
- Assessment: Catch vs FP rate
- Deployment: Decisioning vs Detection
User IP Country <> Billing Country
Buying prepaid mobile phones
Add new shipping address in cart
Buyer = Phone reseller, static machine ID
How much $$ is at risk?
What is “normal” for this customer?
What “bad” profiles does this match?
Geolocate IP
Convert geo to country code
Flag on Mismatch
Cart Category
Merch Risk Level
Date Added
Address Type
String Matching
Customer Profile
Device ID
Device History
TXN-$-AMT
Churn Risk, CLV, ...TXNs, logins, ...
Stolen CC, Collusion
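The "User IP Country <> Billing Country" variable follows three steps on the slide: geolocate the IP, convert to a country code, flag on mismatch. A minimal sketch of that variable is below. The lookup table is a hypothetical stand-in for a real geolocation source, and it uses documentation-only address ranges.

```python
import ipaddress

# Hypothetical geolocation table (assumption). A real system would query
# a geolocation database here; these are documentation-only ranges.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "US",
    ipaddress.ip_network("198.51.100.0/24"): "RO",
}

def ip_country(ip):
    """Steps 1 and 2: geolocate the IP and return a country code, or None."""
    addr = ipaddress.ip_address(ip)
    for network, country in GEO_TABLE.items():
        if addr in network:
            return country
    return None

def country_mismatch(ip, billing_country):
    """Step 3: True on mismatch, False on match, None if the IP is unknown."""
    country = ip_country(ip)
    if country is None:
        return None
    return country != billing_country.upper()

print(country_mismatch("192.0.2.10", "US"))
print(country_mismatch("198.51.100.7", "us"))
print(country_mismatch("203.0.113.5", "US"))
```

Returning None for an unknown IP keeps "could not geolocate" distinct from "matched", so a model can treat missing data as its own signal.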
Model Training
Some algorithms:
- Regression: Determines the best equation to describe the relationship between the dependent variable and the independent variables
  - Linear Regression: Best equation is a line
  - Logistic Regression: Best equation is an S-shaped curve (exponential properties)
- Bayesian: Used to estimate regression models, useful when working w/small data sets
- Neural Nets: Can approximate any type of non-linear function, often highly predictive, but don’t explain the relationship between the dependent and independent variables
LOGISTIC <DEPVAR> <VAR1> <VAR2>...
p-value of significance, throw out if > .05
Variance in dependent variable explained by independent variables
Dependent Variable
Independent Variables
Factor by which the odds of the dependent variable go up when the independent variable is incremented
p-value should be < significance level (.05)
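To make the odds-factor annotation concrete, here is a minimal pure-Python sketch of logistic regression fitted by batch gradient descent. It does not reproduce the `LOGISTIC` command on the slide and computes no p-values. The data set, the learning rate, and the step count are hypothetical. The single independent variable is a 0/1 mismatch flag and the dependent variable is 1 for "bad".

```python
import math

def fit_logistic(rows, labels, lr=0.5, steps=3000):
    """Return weights [intercept, b1, ...] minimizing mean log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted p(bad)
            err = p - y
            grad[0] += err
            for j, xi in enumerate(x, start=1):
                grad[j] += err * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

# Hypothetical data: flag=0 -> 10 bad of 100; flag=1 -> 60 bad of 100.
rows = [[0]] * 100 + [[1]] * 100
labels = [0] * 90 + [1] * 10 + [0] * 40 + [1] * 60

weights = fit_logistic(rows, labels)
odds_ratio = math.exp(weights[1])  # factor applied to the odds when flag goes 0 -> 1
print(round(odds_ratio, 2))
```

With one binary variable the result can be checked by hand: the odds of bad are 60/40 with the flag and 10/90 without, so the ratio is 13.5, which the fitted coefficient should approach.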
Operating a Risk System
Disposition & Time
Attempt: Email, CC#, Items, Total, Submit
Outcome: Maybe / No / Yes
Black & Whitelists
Machine Learning
Velocity & Spend caps
Geo & IP Logic
Linking
Data
- Reporting
- Metrics
- Analysis
- Modeling
Good, Bad, Indeterminate
The Better Mousetrap
Automates defensive action x-platform
- Fast
  - In Real Time
  - In Time to Minimize Loss
- Accurate
  - Reasonable False Positives
  - As good as a human specialist
- Cheap
  - Reduces More Loss than Cost Created
  - Cheaper than Manual intervention
GAIN
- More gain/lift = more efficient predictions
- Catch as much as possible (as many of the “bads”)
- Minimize the overall affected % of population
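Gain and lift can be computed directly from scored outcomes: rank by score, act on the top slice of the population, and count how many of the bads fall inside it. This is a minimal sketch with hypothetical scores and labels. `gain_at` is a name invented for illustration.

```python
def gain_at(scores, labels, fraction):
    """Share of all bads caught when acting on the top `fraction` by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = int(len(ranked) * fraction)
    caught = sum(label for _, label in ranked[:cutoff])
    return caught / sum(labels)

# Hypothetical model scores and known outcomes (1 = bad).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

gain = gain_at(scores, labels, 0.2)  # bads caught by touching 20% of population
lift = gain / 0.2                    # improvement over picking 20% at random
print(gain, lift)
```

Here touching 20% of the population catches half of the bads, so the model is 2.5 times as efficient as random selection at that depth.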
[Chart: Cost vs. Number of Defects Produced. Curves: Cost of Control, Cost of Defects, Total Cost]
“Alice: Which way should I go? Cat: That depends on where you are going. Alice: I don’t know. Cat: Then it doesn’t matter which way you go.” ― Lewis Carroll, Alice in Wonderland
[Chart: Cost vs. Number of Defects Produced. Curves: Cost of Control, Cost of Defects, Total Cost. Marked points: CV. Additional axis label: % of population]
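The total-cost curve can be traced numerically by sweeping a blocking cut-off. In this sketch the cost of control is modeled as the false positives at that cut-off, the cost of defects as the false negatives, and the operating point is wherever their sum is lowest. The scores, labels, and unit costs are hypothetical, and reading "cost of control" as false-positive cost is an assumption made for the example.

```python
def cost_curve(scores, labels, fp_cost, fn_cost):
    """Return (cutoff, control_cost, defect_cost, total) per candidate cut-off."""
    rows = []
    for cutoff in sorted(set(scores)) + [float("inf")]:
        blocked = [s >= cutoff for s in scores]
        fp = sum(1 for b, y in zip(blocked, labels) if b and y == 0)
        fn = sum(1 for b, y in zip(blocked, labels) if not b and y == 1)
        control, defects = fp * fp_cost, fn * fn_cost
        rows.append((cutoff, control, defects, control + defects))
    return rows

# Hypothetical model scores and known outcomes (1 = bad).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

curve = cost_curve(scores, labels, fp_cost=20, fn_cost=50)
best = min(curve, key=lambda row: row[3])
print(best)
```

Blocking everything drives defect cost to zero but maximizes control cost, and blocking nothing does the reverse. The minimum of the sum sits between the two extremes.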
Finding the * approach in the wild
- Operating * by the numbers in many disciplines
  - Automation
  - Optimization
  - Forecasting / Prediction
- Such as…
  - Science
  - Finance
  - Marketing / Advertising
  - Software Development
  - Site/Network Ops
  - Manufacturing
  - Military
Is all fun and game until you are need of put it in production – @devopsborat
Beg, Borrow, Steal
- A/B Testing
- Control Charts
- Highly engaged change management
- Sample strategy
- Instrument everything
- Poka-Yoke
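As one illustration of borrowing A/B testing for a control system, the sketch below compares the false positive rate of two rule sets with a two-proportion z-test. The counts are hypothetical and the test choice is an assumption; the talk names A/B testing without prescribing a method.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for the difference in two proportions."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical experiment: good users wrongly blocked under each rule set.
z, p_value = two_proportion_z(hits_a=50, n_a=1000, hits_b=30, n_b=1000)
print(round(z, 2), round(p_value, 3))
```

A p-value below the chosen significance level (.05 is the level used elsewhere in the talk) suggests the difference between the two rule sets is unlikely to be noise.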
Recap
Operating systems effectively means:
- Using data to understand and improve performance
- Using tools to:
  - Automate (Efficiency, Scale, Standardization)
  - Optimize (Set goals cognizant of tradeoffs)
  - Forecast / Predict (Plan, course correct)
Designing data-driven defenses
- Decisions that can be automated w/data
- Where/what data sets to use
- Business drivers to keep in mind
Numbers, Numbers, Numbers
p (bad)
f(variable A + Variable B + ...)
Prediction is very difficult, especially about the future
Niels Bohr
Allison Miller
@selenakyle
Metrics vs Analytics
METRICS ANALYTICS
Such as...
Metrics                          Analytics
$ Loss Txns                      Purchase trends of high loss users
# Compromised Accts              IP Sources of bad login attempts
% of Spam Messages Delivered     Spam subject lines generating most clicks
Minutes of downtime              Most process-intensive applications
# Customer Contacts Generated    Highest-contact exception flows
The first rule of any technology used in a business is that automation applied to an efficient operation will magnify the efficiency. The second is that automation applied to an inefficient operation will magnify the inefficiency.
–Bill Gates