November 2013 HUG: Cyber Security with Hadoop

Cyber Security Analytics & Big Data Padmanabh Dabke, PhD VP, Analytics & Visualization Narus Inc.

Upload: yahoo-developer-network

Post on 26-Jan-2015


Page 1: November 2013 HUG: Cyber Security with Hadoop

Cyber Security Analytics & Big Data

Padmanabh Dabke, PhD

VP, Analytics & Visualization

Narus Inc.

Page 2: November 2013 HUG: Cyber Security with Hadoop

Agenda

• Company overview

• Narus Technology

• Key challenges & solutions

• Summary

Narus Confidential, © 2013 Narus, Inc.

Page 3: November 2013 HUG: Cyber Security with Hadoop

Company Overview

• Cybersecurity & R&D software company based in Silicon Valley (Sunnyvale, CA)

• Focused on fusing semantic and data planes and applying the result to cybersecurity and risk management

• Making sense of physical, content, and social networks

• Innovative technology protected by a broad IP portfolio

• Established customer and partner base

• Wholly-owned subsidiary of The Boeing Company

Page 4: November 2013 HUG: Cyber Security with Hadoop

Journey to Cyber 3.0: Semantic Web & Cyber Intersect

Web 1.0: Static Web
Primarily read-only content and static HTML websites

Web 2.0: Social Web
User/community-generated content and the read-write web

Web 3.0: Semantic Web
Adds “context” to data/Internet traffic based on a superior understanding of relationships within data

Cyber 1.0: Siloed Cyber
• Voluminous, homogeneous information
• Siloed, on-demand, non-interactive content
• Limited number of applications and protocols
• Resources and missions not fully aligned
• Manual contextualization of data

Cyber 2.0: Integrated Cyber
• Voluminous, high-velocity data
• Growth of applications and protocols
• People connecting with each other & content
• Human approaches to extract & contextualize
• Variety of interactions driving looser control (new threats)

Cyber 3.0: Intelligent Cyber
• High volume, velocity, and variety of data
• Explosion in applications and protocols
• Hyper-connected people & content; interactivity among people and machines
• Automated alignment of resources & missions
• Machine learning for intelligence & context

Page 5: November 2013 HUG: Cyber Security with Hadoop

Changing Landscape: Volume, Velocity & Variety

[Chart: Network Bandwidth in Gb (axis ticks 1, 10, 100)]

• Daily traffic volume: 20.33 PB/day (2015), 10+ PB/day (2012)

• 2.5 devices per person and 19 billion connections (2016)

• Fixed-line speeds growing from 10 Gbps (2012) to 100 Gbps (2015)

• ~1.5 million applications in the Android & Apple stores

• Types of data: growing media (data, voice, video), protocols, and network types (local, cloud, virtualized, hybrid)

Page 6: November 2013 HUG: Cyber Security with Hadoop

Visibility, Context & Control: Key to Enhance Cybersecurity & Protect Assets

Visibility
• Need for continuous visibility into every dimension (hosts, users, etc.)

Context
• Impact and root-cause analysis is manual and requires highly skilled, highly paid analysts to digest overwhelming amounts of data

Control
• More efficient spending, faster resolution, dynamic approach to solve a dynamic problem
• Lots of tools, but policies not aligned with mission to allow tighter control

Page 7: November 2013 HUG: Cyber Security with Hadoop

Narus’ Innovative Technology: An Integrated Perspective

Page 8: November 2013 HUG: Cyber Security with Hadoop

Narus nSystem: Comprehensive & Adaptive Analytics to Enhance Cybersecurity and Protect Critical Assets with Machine Learning

nAnalytics
• Single UI with interactive dashboards offering multi-dimensional views of cyber activity
  – Network, Semantic & User Analytics
  – Targeted Session Captures
• Advanced analytics for automated data fusion with machine learning

nProcessing
• Centralized, scalable data processing & storage framework
• Automated ability to deal with petabytes of data
• Support for streaming, query-based, and big-data analytics
• Machine learning applied to large volumes of data

nCapture
• Architected for distribution at multiple sites & links
  – 100% of packets examined, metadata with necessary session fidelity
• Plugins to assimilate data from heterogeneous sources
• Precision targeted full packet capture
• Support for 20G (duplex 10G) per link, path to 100G

Page 9: November 2013 HUG: Cyber Security with Hadoop

Narus Analytics Framework

Confidential / For Internal Use Only / © 2013 Narus, Inc.

[Diagram: analytics framework]
Latency tiers: Real Time (< 5 sec latency); Close Enough (5 min latency); On Demand
Components: Protocol Vector Creator; In-Memory Analytics; Real Time Visualization; Map Reduce Jobs; ETL; Hadoop; HBase; RDBMS; Ad-Hoc/Sliding Window Analytics

Real Time Analytics
• Volumetric & Topical Trends
• Anomaly Detection
• Classification
• Clustering
• Summarization

Data-At-Rest Analytics
• Long term trends
• Opportunistic Correlations
• Model Training
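The framework above names sliding-window analytics and volumetric trends without showing how they are computed. The following is a minimal sketch of one way to keep per-key counts over a trailing time window. The class name `SlidingWindowCounter`, the 300-second window (borrowed from the "Close Enough" latency tier purely for illustration), and the sample events are all assumptions, not part of the Narus system. It also assumes events arrive in non-decreasing timestamp order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Counts events per key over the most recent `window_seconds`.

    Hypothetical helper for illustration; assumes timestamps never decrease.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, key) pairs in arrival order
        self.counts = Counter()  # key -> number of events still in the window

    def add(self, timestamp, key):
        self.events.append((timestamp, key))
        self.counts[key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events at or before the start of the window that ends at `now`.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

    def top(self, n):
        """Return the `n` most frequent keys currently in the window."""
        return self.counts.most_common(n)


# Illustrative events: (seconds since start, protocol name).
window = SlidingWindowCounter(window_seconds=300)
for ts, protocol in [(0, "http"), (10, "dns"), (20, "http"), (400, "dns")]:
    window.add(ts, protocol)

top_protocols = window.top(1)
```

After the event at t=400 the three earlier events have aged out, so only the latest "dns" event remains in the window.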

Page 10: November 2013 HUG: Cyber Security with Hadoop

Machine Learning for Cyber Security

• Automated Signature Generation
  – Protocol Identification
  – Parser Generation
  – Mobile App Detection
  – App Categorization

• Text Analytics
  – Topic Detection
  – Sentiment Analysis

• Anomaly Detection
  – Baseline profile generation
  – Alerts & workflow
  – Malicious Application Detection
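The anomaly-detection items above (baseline profile generation, then alerts) can be illustrated with a very simple statistical baseline. This sketch is not the Narus method, which the slides do not describe. It assumes a baseline made of the mean and sample standard deviation of historical measurements, and a z-score threshold of 3.0. The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical samples as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical history, e.g. connections per minute from one host.
history = [100, 104, 98, 101, 97, 102, 99, 103]
baseline = build_baseline(history)

# New observations to score against the baseline.
alerts = [v for v in [101, 250, 96] if is_anomalous(v, baseline)]
```

Only the value 250 exceeds the threshold here. A production baseline would more plausibly be kept per host, user, or application and refreshed over time.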

Page 11: November 2013 HUG: Cyber Security with Hadoop

Key Challenges

• Increasing network traffic
  – Line speeds from 20 Gbps to 600 Gbps and above
  – 210 TB to 6.3 PB per day

• Diversity of deployments
  – Data rates, vertical application areas, SLA, price points: everything is a variable

• Operational issues
  – Datacenter connectivity
  – Burstiness of network traffic

• Data security
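The relationship between line speed and daily volume quoted above is straightforward arithmetic, sketched below. The function name and the `utilization` parameter are assumptions introduced for illustration. At 100% utilization with decimal units the result is 216 TB/day for 20 Gbps and 6.48 PB/day for 600 Gbps, slightly above the slide's 210 TB and 6.3 PB. The slide does not state how its figures were derived. A sustained utilization of about 97% would reproduce them.

```python
SECONDS_PER_DAY = 86_400
TB = 1e12  # decimal terabyte, in bytes
PB = 1e15  # decimal petabyte, in bytes


def daily_volume_bytes(line_rate_gbps, utilization=1.0):
    """Bytes carried in one day by a link running at the given rate.

    `utilization` is the assumed fraction of line rate actually in use.
    """
    bits_per_second = line_rate_gbps * 1e9 * utilization
    return bits_per_second / 8 * SECONDS_PER_DAY


low_tb = daily_volume_bytes(20) / TB    # 20 Gbps link, in TB/day
high_pb = daily_volume_bytes(600) / PB  # 600 Gbps link, in PB/day
```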

Page 12: November 2013 HUG: Cyber Security with Hadoop

Lessons Learned/Solutions

• Extract and store all metadata and provide full packets as identified by the analyst
  – 90% reduction in data volume

• Use domain knowledge for message compression
  – Short codes for enumerated values (mobile apps, protocols, etc.)
  – Session associations to eliminate referential fields

• HBase over HDFS: provides abstractions useful for modelling dynamic schema

• Offload CPU work to special-purpose co-processors to accelerate performance
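The two message-compression ideas above (short codes for enumerated values, and session associations that remove repeated referential fields) can be sketched as follows. The code tables, field names, and record layout are hypothetical and are not the Narus wire format. The example only shows that storing session-level fields once and replacing names with small integers shrinks the serialized form.

```python
import json

# Hypothetical code tables; real ones would cover many more values.
PROTOCOL_CODES = {"http": 1, "dns": 2, "tls": 3}
APP_CODES = {"webmail": 1, "video": 2, "chat": 3}


def flat_records(session):
    """Verbose form: every message repeats the session-level fields."""
    return [
        {
            "src_ip": session["src_ip"],
            "dst_ip": session["dst_ip"],
            "protocol": session["protocol"],
            "app": session["app"],
            "ts": msg["ts"],
            "bytes": msg["bytes"],
        }
        for msg in session["messages"]
    ]


def compress_session(session):
    """Compact form: session fields stored once, enumerations as short codes."""
    header = [
        session["src_ip"],
        session["dst_ip"],
        PROTOCOL_CODES[session["protocol"]],
        APP_CODES[session["app"]],
    ]
    messages = [[msg["ts"], msg["bytes"]] for msg in session["messages"]]
    return {"h": header, "m": messages}


session = {
    "src_ip": "10.0.0.5",
    "dst_ip": "192.0.2.10",
    "protocol": "http",
    "app": "video",
    "messages": [
        {"ts": 0, "bytes": 512},
        {"ts": 1, "bytes": 1460},
        {"ts": 2, "bytes": 1460},
    ],
}

verbose = json.dumps(flat_records(session))
compact = json.dumps(compress_session(session), separators=(",", ":"))
```

The saving grows with the number of messages per session, because the header cost is paid once while each verbose record repeats it.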

Page 13: November 2013 HUG: Cyber Security with Hadoop

Lessons Learned/Solutions

• Relational databases are not evil
  – Believe it or not, relational algebra is quite powerful
  – We use it for fast, in-memory computations in combination with Java code for processing rule sets

• SQL interfaces on HDFS/HBase are catching up

[Chart: Analytics Data Store Performance. Average query processing time (y-axis, ticks 1 to 7) versus database size (x-axis, ticks 10, 20, 500) for Big SQL, Impala, and MySQL Cluster.]
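The point above about relational algebra for fast, in-memory rule processing can be illustrated with an in-memory SQL engine. The slides say the Narus system pairs relational computation with Java code. This sketch instead uses Python's standard `sqlite3` module as a stand-in, and the table names, columns, and the single rule are hypothetical. It expresses a rule set as a table and evaluates it with a join and an aggregate.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sessions (host TEXT, app TEXT, bytes_out INTEGER);
    CREATE TABLE rules (app TEXT, max_bytes_out INTEGER);
    """
)

# Hypothetical session metadata.
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        ("10.0.0.5", "ftp", 900),
        ("10.0.0.5", "ftp", 400),
        ("10.0.0.7", "ftp", 200),
        ("10.0.0.7", "http", 5000),
    ],
)

# Hypothetical rule: flag hosts whose total outbound FTP bytes exceed 1000.
conn.executemany("INSERT INTO rules VALUES (?, ?)", [("ftp", 1000)])

# Join sessions to rules, aggregate per host and app, keep rule violations.
violations = conn.execute(
    """
    SELECT s.host, s.app, SUM(s.bytes_out) AS total
    FROM sessions AS s
    JOIN rules AS r ON r.app = s.app
    GROUP BY s.host, s.app, r.max_bytes_out
    HAVING SUM(s.bytes_out) > r.max_bytes_out
    ORDER BY s.host
    """
).fetchall()

conn.close()
```

Keeping rules in a table means new thresholds can be added as rows rather than as code changes, which is one reason a relational layer suits this kind of workload.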

Page 14: November 2013 HUG: Cyber Security with Hadoop

Business Considerations

• Optimizing Total Cost of Ownership (TCO)
  – System acquisition
  – Data center costs
  – Administration and maintenance

• Analytics development and skillset required

• Global support

Page 15: November 2013 HUG: Cyber Security with Hadoop

Data Warehousing vs. Hadoop

Source: “Big Data: What does it really cost?” By Winter Corp

Page 16: November 2013 HUG: Cyber Security with Hadoop

Summary

• We blend network, semantic, and user-oriented views to create unique insights
  – Data Loss Prevention
  – Threat Detection
  – Network pattern mining

• Real Time & At-Rest Analytics
  – Stateless analysis and short term trends and classification
  – At-Rest analysis for training models, opportunistic correlations, and mega-trends

• Hybrid Approach
  – Hadoop/HBase for horizontal scaling and cost-effective storage and processing of massive data sets
  – Relational databases for creating efficient business intelligence views