bd cloud v3

The Big Data Cloud: Are You Ready for the Zettabyte? Steven C. Markey, MSIS, PMP, CISSP, CIPP, CISM, CISA, STS-EV, CCSK, CompTIA Cloud Essentials Principal, nControl, LLC Adjunct Professor President, Cloud Security Alliance – Delaware Valley Chapter (CSA-DelVal)

Upload: scm24

Post on 15-Jan-2015


Page 1: Bd cloud v3

The Big Data Cloud: Are You Ready for the Zettabyte?

Steven C. Markey, MSIS, PMP, CISSP, CIPP, CISM, CISA, STS-EV, CCSK, CompTIA Cloud Essentials

Principal, nControl, LLC

Adjunct Professor

President, Cloud Security Alliance – Delaware Valley Chapter (CSA-DelVal)

Page 2: Bd cloud v3

• Presentation Overview
  – Why Should You Care?
  – Cloud Overview
  – Big Data Overview
  – Cloud-Based Big Data Offerings
  – Securing Cloud-Based DB Solutions

Big Data Cloud

Page 3: Bd cloud v3

• Why Should You Care
  – Organizational Cost Reduction Requirements
    • Justify Investments
    • Improve Efficiencies (Productivity, Time to Market)
  – Digital Information – ~60% Annual Growth Rate (AGR)
  – Data Storage – 15-20% AGR Capital Expense (CapEx)
  – Categorization, Classification & Retention Magnify
    • Compliance, Legal & Privacy Regulations
  – Prevalent & Interconnected Business Ecosystems
    • Supply Chains
    • Business Process Outsourcers (BPO)
    • Information Technology Outsourcers (ITO)
    • Vendor’s Vendors

Source: IDC
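The digital information growth rate on this slide compounds, which is easy to underestimate. The sketch below is a minimal illustration, assuming the roughly 60% rate holds constant every year and starting from a hypothetical 100 TB; neither assumption comes from the slide or from IDC.

```python
import math


def projected_size(initial, annual_growth_rate, years):
    """Size after compounding annual_growth_rate for the given number of years."""
    return initial * (1 + annual_growth_rate) ** years


def doubling_time(annual_growth_rate):
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting volume in terabytes
    for year in range(6):
        print(year, round(projected_size(start_tb, 0.60, year), 1))
    print("doubling time (years):", round(doubling_time(0.60), 2))
```

At a constant 60% rate the volume doubles in under a year and a half, which is why storage spend and retention policy come up so early in the deck.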

Page 4: Bd cloud v3

Source: NIST

Page 5: Bd cloud v3

Service Delivery Models

Source: Swain Techs

Page 6: Bd cloud v3

Source: Matthew Gardiner, Computer Associates

Page 7: Bd cloud v3

Source: Flickr

Page 8: Bd cloud v3

• Big Data Overview
  – Aggregated Data from the Following Sources
    • Traditional
    • Source
    • Social

Page 9: Bd cloud v3

• Traditional Data
  – Database Management Systems
    • Relational Database Management Systems (RDBMS)
    • Object-Oriented Database Management Systems (OODBMS)
    • Non-Relational, Distributed DB Management Systems (NRDBMS)
    • Mobile Databases (SQLite, Oracle Lite)
  – Online Transaction Processing (OLTP)
    • Real-Time Data Warehousing
  – Online Analytical Processing (OLAP)
    • Operational Data Stores (ODS)
    • Enterprise Data Warehouse (EDW)

Page 10: Bd cloud v3

• Traditional Data
  – OLAP
    • Business Intelligence (BI)
      – Data Mining
      – Reporting
      – OLAP (Continued)
        » Relational OLAP (ROLAP)
        » Multi-Dimensional OLAP (MOLAP)
        » Hybrid OLAP (HOLAP)

OLTP → ODS → EDW (Data Marts) → BI (Data Mining)
OLTP → ODS → EDW (Data Marts) → BI (Reporting)
OLTP → ODS → EDW (Data Marts) → BI (OLAP)
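The last stage of the OLTP, ODS, EDW, BI flow on this slide can be illustrated in the ROLAP style, where aggregate queries run directly against a relational table. This is a minimal sketch using Python's built-in sqlite3 module; the `sales` table, its columns, and its rows are hypothetical and stand in for a data mart.

```python
import sqlite3


def build_sales_mart():
    """Create an in-memory table standing in for a small data mart."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("East", "Q1", 100.0),
            ("East", "Q2", 150.0),
            ("West", "Q1", 80.0),
            ("West", "Q2", 120.0),
        ],
    )
    return conn


def revenue_by(conn, dimension):
    """Roll the fact rows up along one dimension (ROLAP-style GROUP BY)."""
    # Column names cannot be bound as parameters, so whitelist them instead.
    if dimension not in ("region", "quarter"):
        raise ValueError("unknown dimension: " + dimension)
    rows = conn.execute(
        f"SELECT {dimension}, SUM(amount) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    mart = build_sales_mart()
    print(revenue_by(mart, "region"))
    print(revenue_by(mart, "quarter"))
```

A MOLAP engine would precompute these aggregates into a cube rather than running GROUP BY at query time; HOLAP mixes the two approaches.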

Page 11: Bd cloud v3

Source: Flickr

Page 12: Bd cloud v3

• Source Data
  – Log Files
    • Event Logs / Operating System (OS)-Level
    • Appliance / Peripherals
    • Analyzers / Sniffers
  – Multimedia
    • Image Logs
    • Video Logs
  – Web Content Management (WCM)
    • Web Logs
    • Search Engine Optimization (SEO)
  – Web Metadata

Page 13: Bd cloud v3
Page 14: Bd cloud v3

• Big Data Overview
  – Aggregators
    • Mostly NRDBMS Implementations
      – Not Only Structured Query Language (NoSQL)
    • NRDBMS Examples
      – Column Family Stores: BigTable (Google), Cassandra & HBase (Apache)
      – Key-Value Stores: App Engine Datastore (Google), DynamoDB & SimpleDB (AWS)
      – Document Databases: CouchDB, MongoDB
      – Graph Databases: Neo4j
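The four NRDBMS families listed on this slide differ mainly in how a record is shaped and addressed. The sketch below models each shape with plain Python structures only; it does not use any of the named products, and every key, field, and value is hypothetical.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"user:42": '{"name": "Ada"}'}

# Document database: a self-describing, nested record the store can index into.
document = {
    "_id": "user:42",
    "name": "Ada",
    "orders": [{"sku": "A1", "qty": 2}],
}

# Column family store: row key -> column family -> column -> value.
column_family = {
    "user:42": {
        "profile": {"name": "Ada"},
        "activity": {"last_login": "2012-05-01"},
    }
}

# Graph database: nodes plus typed, directed edges between them.
graph_nodes = {"ada": {"type": "user"}, "bob": {"type": "user"}}
graph_edges = [("ada", "FOLLOWS", "bob")]


def neighbors(edges, node, relation):
    """Return the nodes reachable from `node` over edges of type `relation`."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]


if __name__ == "__main__":
    print(key_value["user:42"])
    print(document["orders"][0]["sku"])
    print(column_family["user:42"]["activity"]["last_login"])
    print(neighbors(graph_edges, "ada", "FOLLOWS"))
```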

Page 15: Bd cloud v3

• Big Data Overview
  – Serial Processing
    • Hadoop
      – Hadoop Distributed File System (HDFS)
      – Hive – DW
      – Pig – Querying Language
    • Riak
  – Parallel Processing
    • HadoopDB
  – Analytics
    • Google MapReduce
    • Apache MapReduce
    • Splunk (for Security Information / Event Management [SIEM])
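The MapReduce model named under Analytics has three steps: map each record to key/value pairs, group the pairs by key, and reduce each group to a result. The sketch below is a single-process word count that shows only the shape of the model; it is not the Hadoop or Google API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle, and reduce in sequence over the input records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(map_reduce(["big data cloud", "big data"]))
```

In a real cluster the map and reduce calls run on many nodes at once and the shuffle moves data across the network; the per-key independence of `reduce_phase` is what makes that distribution possible.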

Page 16: Bd cloud v3

Source: Cloudera

Page 17: Bd cloud v3

Source: Wikispaces

Page 18: Bd cloud v3

Source: Google

Page 19: Bd cloud v3

Source: Cloudera

Page 20: Bd cloud v3

• Cloud-Based Big Data Solutions
  – PaaS
    • DBaaS
      – Amazon Web Services (AWS)
        » DynamoDB
        » SimpleDB
        » Relational Database Service (RDS): Oracle 11g / MySQL
      – Google App Engine
        » Datastore
      – Microsoft SQL Azure
      – Oracle Public Cloud: 11g
    • Processing
      – AWS Elastic MapReduce (EMR)
      – Google App Engine MapReduce: Mapper API
      – Microsoft: Apache Hadoop for Azure
      – IBM SmartCloud Enterprise on IBM InfoSphere BigInsights Basics

Page 21: Bd cloud v3
Page 22: Bd cloud v3
Page 23: Bd cloud v3
Page 24: Bd cloud v3
Page 25: Bd cloud v3
Page 26: Bd cloud v3
Page 27: Bd cloud v3
Page 28: Bd cloud v3

• Cloud-Based Database Solutions
  – IaaS
    • Basic Components: Compute & Storage Nodes
      – AWS Elastic Compute Cloud (EC2)
      – AWS Elastic Block Store (EBS)
      – OpenStack Compute (Nova)
      – OpenStack Storage (Swift)
    • Advanced Components
      – Apache Hadoop
      – Apache Hadoop MapReduce
    • Commercial Applications
      – Cloudera
      – DataStax
      – MapR
      – Splunk

Page 29: Bd cloud v3

[Figure: AWS architecture diagram. Labels: Internet, AWS Cloud, EC2 Availability Zone, EC2, EBS, EBS Snapshot, S3 Storage]

Source: Amazon

Page 30: Bd cloud v3
Page 31: Bd cloud v3

• Big Data in the Cloud Use Cases
  – Public Cloud
    • AWS: EC2 Hadoop & S3
    • AWS: EC2 Hadoop, DynamoDB & EMR
    • AWS: EC2 Linux, Apache (w/ Tomcat), DynamoDB & EMR
    • AWS: EC2 Cloudera Hadoop & EMR
    • AWS: EC2 Splunk
  – Hybrid
    • Oracle Big Data Appliance & Connector, Google App Engine
    • OpenStack Swift, AWS EC2 Cloudera Hadoop & EMR
  – Private Cloud
    • OpenStack Nova & Swift, Apache Hadoop
    • OpenStack Nova & Swift, Cloudera Hadoop

Page 32: Bd cloud v3

Page 33: Bd cloud v3

Source: Flickr

Page 34: Bd cloud v3

• Securing Cloud-Based NRDBMS Solutions
  – General
    • Focus on Application / Middleware-Level Security
      – SQL Injections Are Still Possible
      – Leverage Application IAM for NRDBMS User Rights Mgmt (URM)
      – Leverage Application & System Logging for Authentication, Authorization & Accounting (AAA)
    • Segregation of Duties
      – Read / Write Namespaces
      – Read-Only Namespaces
  – Specific
    • Document
      – Consistency Assurance
    • Key / Value
      – Ensure Referential Integrity
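Two points on this slide lend themselves to a short illustration: injection remains possible without SQL, and duties can be split by namespace. In stores whose filters are nested objects (MongoDB-style operators such as `$ne` are the usual example), user-supplied JSON can smuggle an operator in where a plain value was expected. The sketch below shows application-level checks only and assumes no particular database driver; the role names, namespace name, and grants table are hypothetical.

```python
READ_ONLY = "read"
READ_WRITE = "read_write"

# Hypothetical mapping of application roles to per-namespace permissions.
NAMESPACE_GRANTS = {
    "analyst": {"events": READ_ONLY},
    "ingest_service": {"events": READ_WRITE},
}


def safe_equality_filter(field, user_value):
    """Build an equality filter, rejecting anything that is not a scalar.

    A dict such as {"$ne": None} would otherwise be read by the store as
    a query operator rather than as a value to compare against.
    """
    if not isinstance(user_value, (str, int, float, bool)):
        raise ValueError("only scalar values are accepted in filters")
    return {field: user_value}


def can_read(role, namespace):
    """Any grant on the namespace, read-only or read/write, allows reads."""
    return namespace in NAMESPACE_GRANTS.get(role, {})


def can_write(role, namespace):
    """Only an explicit read/write grant allows writes."""
    return NAMESPACE_GRANTS.get(role, {}).get(namespace) == READ_WRITE


if __name__ == "__main__":
    print(safe_equality_filter("user", "ada"))
    print(can_read("analyst", "events"), can_write("analyst", "events"))
```

Keeping these checks in the application or middleware tier matches the slide's advice, since many non-relational stores of this kind offer limited per-user rights management of their own.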

Page 35: Bd cloud v3

Page 36: Bd cloud v3

• Securing Big Data in the Cloud
  – Identity & Access Management (IAM)
    • Security Assertion Markup Language (SAML)
    • Representational State Transfer (REST)
      – AWS IAM
      – Windows Azure Access Control Service (ACS)
    • Web Services Trust Language (WS-Trust)
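REST-based IAM integrations commonly authenticate each call with a keyed signature computed over the request. The sketch below shows that general idea with Python's standard hmac module. It is a generic illustration, not the AWS request-signing process and not the ACS token format; the canonical string layout and function names are assumptions made for the example.

```python
import hashlib
import hmac


def sign_request(secret_key, method, path, timestamp, body=b""):
    """Return a hex HMAC-SHA256 signature over the parts of a request.

    The newline-joined layout below is a hypothetical canonical form.
    """
    message = b"\n".join(
        [
            method.upper().encode("utf-8"),
            path.encode("utf-8"),
            timestamp.encode("utf-8"),
            hashlib.sha256(body).hexdigest().encode("utf-8"),
        ]
    )
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_key, method, path, timestamp, body, signature):
    """Recompute the signature server-side and compare in constant time."""
    expected = sign_request(secret_key, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"hypothetical-shared-secret"
    sig = sign_request(key, "GET", "/items", "2012-01-01T00:00:00Z")
    print(verify_request(key, "GET", "/items", "2012-01-01T00:00:00Z", b"", sig))
```

Including the timestamp in the signed message lets the server reject stale requests, which limits replay of a captured call.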

Page 37: Bd cloud v3

Source: OASIS

Page 38: Bd cloud v3

Source: Intuit

Page 39: Bd cloud v3


Page 40: Bd cloud v3

Source: Apache

Page 41: Bd cloud v3
Page 42: Bd cloud v3
Page 43: Bd cloud v3
Page 44: Bd cloud v3


Page 45: Bd cloud v3

Page 46: Bd cloud v3

• Securing Big Data in the Cloud
  – Electronic Discovery (eDiscovery)
    • eDiscovery Reference Model (EDRM)
    • Legal Holds
    • Litigation Response
  – Records & Information Management (RIM)
    • Generally Accepted Recordkeeping Principles (GARP®)
    • Information Governance Reference Model (IGRM)
    • Information Lifecycle Management (ILM)
    • MIKE2.0
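Legal holds and information lifecycle management interact in one specific way: a hold has to override the normal retention schedule. The sketch below is a minimal illustration of that rule, assuming a simple fixed-years retention period; the function names and the seven-year figure in the usage lines are hypothetical, not taken from EDRM, GARP, or any regulation.

```python
from datetime import date


def retention_expiry(created, retention_years):
    """Date on which a record's retention period ends."""
    try:
        return created.replace(year=created.year + retention_years)
    except ValueError:
        # created was 29 February and the target year is not a leap year.
        return created.replace(year=created.year + retention_years, day=28)


def may_dispose(created, retention_years, on_legal_hold, today):
    """A record may be disposed of only if no hold applies and retention has run out."""
    if on_legal_hold:
        return False
    return today >= retention_expiry(created, retention_years)


if __name__ == "__main__":
    today = date(2012, 6, 1)
    print(may_dispose(date(2005, 1, 1), 7, False, today))
    print(may_dispose(date(2005, 1, 1), 7, True, today))
```

In a distributed store the same check has to reach every replica and snapshot of the record, which is where cloud deployments make eDiscovery harder than the logic itself suggests.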

Page 47: Bd cloud v3

Page 48: Bd cloud v3

• Privacy & Data Protection for Big Data Clouds
  – Jurisdictions*
    • Regional: EU DPA
    • National: PIPEDA, GLBA, HIPAA / HITECH, COPPA, Safe Harbor
    • Statutory: Bavarian, CA SB 1386 / 24, MA 201 CMR 17, NV SB 227
  – Data Flow & Jurisdictional Adherence
    • Data Sharing with Third Parties
  – Pseudonymization / De-Identification
    • Consent & Notices
  – Contract Clauses
    • Model Contracts
  – Privacy Best Practices
    • Generally Accepted Privacy Principles (GAPP)

* Not all-inclusive.
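Pseudonymization, listed on this slide, is often implemented by replacing a direct identifier with a keyed hash so that records can still be joined without exposing the original value. The sketch below shows one such approach using HMAC-SHA256. The field names and key are hypothetical, and whether this technique satisfies any of the listed regulations is a legal question the code does not answer.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Map an identifier to a stable token that cannot be reversed without the key."""
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def pseudonymize_record(record, identifying_fields, secret_key):
    """Return a copy of the record with the named fields replaced by tokens."""
    return {
        field: pseudonymize(str(value), secret_key)
        if field in identifying_fields
        else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    key = b"hypothetical-key-held-by-the-data-owner"
    row = {"email": "Ada@example.com", "purchases": 3}
    print(pseudonymize_record(row, {"email"}, key))
```

Using a keyed hash rather than a plain one matters: without the key, a third party cannot rebuild the mapping by hashing a list of known email addresses.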

Page 49: Bd cloud v3

• Presentation Take-Aways
  – Big Data in the Cloud is Here to Stay
  – It Has to be Secure
    – Segregation of Data
    – Access Controls
    – Separation / Segregation of Duties
    – Federated Identities
    – Logging

Page 50: Bd cloud v3

• Questions?
• Contact
  – Email: [email protected]
  – Twitter: markes1
  – LI: http://www.linkedin.com/in/smarkey
  – CSA-DelVal: http://www.csadelval.org/