Bob Rudis • Managing Principal & Senior Data [email protected]
From Data to Decision Makers
A Behind the Scenes Look at Building the Most Respected Report in Cybersecurity
ABOUT ME (Briefly)
• DBIR team manager/author (more on this in a bit)
• Former cyber risk director for a Fortune 100 insurance company
• Serial #rstats Tweeter (@hrbrmstr), blogger (rud.is/b & @ddsecblog) & regular helper on StackOverflow
• Author of and contributor to 14 CRAN packages
• Co-author of Data-Driven Security (@ddsecbook)
• Co-host of the Data-Driven Security Podcast (@ddsecpodcast)
• Die-hard ggplot2 advocate, widgeteer, heavily addicted cartographer & shameless user of the forward assignment operator ←4EVA→
WHAT IS THE DBIR?
The Verizon Data Breach Investigations Report (DBIR)
“The Verizon Data Breach Investigations Report (DBIR) is an annual publication that provides
analysis of information security incidents, with a specific focus on data breaches.”
http://searchsecurity.techtarget.com/definition/Verizon-Data-Breach-Investigations-Report-DBIR
verizonenterprise.com/DBIR
WHO IS THE DBIR?
Wade Baker Dave Hylender Marc Spitler Jay Jacobs
Kevin Thompson Suzanne Widup Bhaskar Karambelkar Gabriel Bassett
The DBIR
• Started in 2008
• Cited by virtually every other cybersecurity report by the 3❡
• Read by individual contributors up through senior leadership at virtually every global enterprise
• A lot of fun to work on
#RSAC #DBIR
[Chart: values by year]
2008 2009 2010 2011 2012 2013 2014 2015
1 1 2 3 6 18 50 70
WHAT DOES THIS HAVE TO DO WITH R?
200,000
Vocabulary for Event Recording and Incident Sharing (VERIS)
veriscommunity.net
vcdb.org
library(verisr)
vcdb <- json2veris(jsondir)
summary(vcdb) # too big to show

getenum(vcdb, "actor")
##       enum   x
## 1 external 955
## 2 internal 535
## 3  partner 100
## 4  unknown  85

getenum(vcdb, "actor", add.n=TRUE, add.freq=TRUE)
##       enum   x    n  freq
## 1 external 955 1643 0.581
## 2 internal 535 1643 0.326
## 3  partner 100 1643 0.061
## 4  unknown  85 1643 0.052
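The freq column in the second call looks like the count x divided by n. A minimal base R sketch, using only the numbers shown on the slide, checks that reading. The data frame name `actors` is made up for illustration, and the x / n relationship is an assumption inferred from the output, not taken from the verisr documentation.

```r
# Counts copied from the getenum() output above; `actors` is an illustrative name.
actors <- data.frame(
  enum = c("external", "internal", "partner", "unknown"),
  x    = c(955, 535, 100, 85),
  stringsAsFactors = FALSE
)
n <- 1643  # the n column from the slide

# Assumption: freq is simply x / n, rounded to three places.
actors$n    <- n
actors$freq <- round(actors$x / n, 3)
print(actors)
```

The counts add up to more than n, so the frequencies do not sum to 1. That would be consistent with one incident being able to carry more than one actor type, although the slide does not say so.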
vz-risk.github.io/dbir/2015/19/
• 200m successful vulnerability exploits across 20,000 enterprises
• 170m malware events across over 10,000 enterprises
• 6 months of malware traffic data from 30+m mobile devices
• Live botnet traffic from compromised organizations
• Millions of Indicators of Compromise
• Details of all Denial of Service activity for 2014
PUTTING IT ALL TOGETHER
Getting the data
PUTTING IT ALL TOGETHER
Creating, organizing and sharing analyses
.R .Rmd .json .Rdata
1. Assign areas to each researcher
2. For “standard VERIS” analyses, generate reports from core Rmd
3. Have “Findings Review” collaborative meetings where we peer-review the work
4. (Repeat step 3 after refinement of findings)
5. Decide on final sections for the report and assign authors
6. Add rough draft visualizations to the findings
7. Lock in content
8. Refine visualizations
9. Finalize text content
10. Work with Marketing & Graphics
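Step 2 above, generating reports from a core Rmd, can be sketched roughly as follows. This is not the team's actual template. The template text, the `incidents` data frame and the area names are all invented for illustration, and knitr::knit() is used in place of a full rmarkdown::render() so the example needs no pandoc.

```r
library(knitr)

fence <- strrep("`", 3)  # build the chunk delimiter without typing it literally

# A stand-in for the "core Rmd": one template shared by every analysis area.
core_rmd <- c(
  "# Findings for `r area`",
  "",
  paste0(fence, "{r}"),
  "table(incidents[[area]])",
  fence
)
template_path <- file.path(tempdir(), "core.Rmd")
writeLines(core_rmd, template_path)

# Made-up incident data; real inputs would come from the VERIS data set.
incidents <- data.frame(
  actor  = c("external", "internal", "external"),
  action = c("hacking", "misuse", "malware"),
  stringsAsFactors = FALSE
)

# Knit the same template once per area, each in its own environment.
areas <- c("actor", "action")
outputs <- vapply(areas, function(area) {
  env <- new.env()
  env$area <- area
  env$incidents <- incidents
  knit(template_path,
       output = file.path(tempdir(), paste0("findings-", area, ".md")),
       envir = env, quiet = TRUE)
}, character(1))
```

Each researcher then reviews the same report layout for their own area, which keeps the "Findings Review" meetings comparable across areas.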
FIGURATIVELY SPEAKING
• Create one “Master Rmd” for all visualization figures, using canned data from the outputs of the analyses. Render one master (giant) HTML document plus multiple individual PDF versions to give to the creative staff to work with
Why PDF? Complex ggplot2 SVGs crash Illustrator and the fonts are horrible (they get converted to polygons).
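A minimal sketch of the "one PDF per figure" idea, assuming the figures are kept in a named list. The list, the file names and the output directory are hypothetical, and the data are the actor counts shown earlier in the deck.

```r
library(ggplot2)

# Illustrative "canned" data, copied from the actor counts shown earlier in the deck.
actors <- data.frame(
  enum = c("external", "internal", "partner", "unknown"),
  x    = c(955, 535, 100, 85)
)

# A named list of figures; the names become file names.
figures <- list(
  "actor-counts" = ggplot(actors, aes(x = enum, y = x)) +
    geom_col() +
    labs(x = NULL, y = "Count")
)

out_dir <- file.path(tempdir(), "figures")
dir.create(out_dir, showWarnings = FALSE)

# Write each figure to its own PDF; ggsave() picks the device from the extension.
pdf_paths <- vapply(names(figures), function(nm) {
  path <- file.path(out_dir, paste0(nm, ".pdf"))
  ggsave(path, plot = figures[[nm]], width = 6, height = 4)  # inches
  path
}, character(1))
```

Keeping the figures in one list means the same objects can feed both the giant HTML document and the individual PDFs.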
• When you decide you want to use a figure from the analysis, spend the time to make it look as amazing (and final) as possible to save $$, save time down the road, and avoid seeing your creations on @wtfviz
LESSONS LEARNED
R Markdown (Rmd) makes it super amazingly awesomely easy to
document, iterate, modify & share analyses.
Spinning (knitr::spin()) is cool too.
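For anyone who has not met spinning: knitr::spin() turns a plain R script with specially marked comments into R Markdown. A small sketch follows; the script contents are invented for illustration.

```r
library(knitr)

# A plain R script: lines starting with #' become prose, everything else becomes code.
script <- c(
  "#' # Actor summary",
  "#' Counts copied by hand from a slide, for illustration only.",
  "counts <- c(external = 955, internal = 535)",
  "round(counts / sum(counts), 3)"
)

# knit = FALSE stops after the conversion, so we get R Markdown source back.
rmd <- spin(text = script, knit = FALSE, format = "Rmd")
cat(rmd, sep = "\n")
```

This suits analyses that start life as scripts: the code stays runnable as ordinary R while the annotations travel with it.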
ggplot2 makes it super amazingly awesomely straightforward to make
“camera ready” visualizations (PDF vs SVG)
Do not upgrade your analysis stack or experiment with RStudio during the
core analysis phase
Packages (even for analyses) > loosely connected documents and scripts
Source code control & data version control are extremely important
A fellow researcher must be able to reproduce your analyses with the same
data & Rmd and understand your reasoning in the annotation
Freezing or at least recording versions of packages you use may be vitally
important to your ability to reproduce at a later date (store them in version
control with analyses or perhaps embed in a container like Docker)
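Recording versions can be as small as writing a manifest next to the analysis. The sketch below uses only base R; the file name is hypothetical, and dedicated tools such as packrat go much further by snapshotting the packages themselves.

```r
# Collect the version of every package currently loaded in this session.
# The "base" entry doubles as a record of the R version itself.
pkgs <- sort(loadedNamespaces())
versions <- data.frame(
  package = pkgs,
  version = vapply(pkgs, function(p) as.character(packageVersion(p)), character(1)),
  stringsAsFactors = FALSE,
  row.names = NULL
)

# Hypothetical file name; in practice this would be committed next to the analysis.
manifest <- file.path(tempdir(), "package-versions.csv")
write.csv(versions, manifest, row.names = FALSE)
```

Run at the end of an analysis, this captures what was actually loaded rather than what happened to be installed.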
ABOUT THE COVER
• @vzdbir
• [email protected]
• verizonenterprise.com/dbir
• veriscommunity.net
• vcdb.org
• github.com/vz-risk

• @wadebaker
• @davehylender
• @marc_spitler
• @bfist
• @jayjacobs
• @SuzanneWidup
• @bhaskar_vk
• @gdbassett
• @hrbrmstr