Microsoft and Hortonworks Deliver the Modern Data Architecture for Big Data
DESCRIPTION
Joint webinar with Microsoft and Hortonworks on the power of combining the Hortonworks Data Platform with Microsoft’s ubiquitous Windows, Office, SQL Server, Parallel Data Warehouse, and Azure platform to build the Modern Data Architecture for Big Data.
TRANSCRIPT
© Hortonworks Inc. 2014
Hybrid Modern Data Architecture with Microsoft and Apache Hadoop
Your Presenters
• Oliver Chiu (twitter name ) – Title – Years of experience – Fun Fact
• John Kreisa (@marked_man)
– VP Strategic Marketing, Hortonworks
– Over 20 years in data management as a developer and a marketer
– Avid camper
Poll 1: What stage are you at in looking at Hadoop?
• Research
• Evaluation
• Trial
• Haven’t started research
Today’s Topics
• Introduction
• What is a Hybrid Modern Data Architecture (MDA)?
• Apache Hadoop in the Hybrid MDA
• The Hybrid MDA and Microsoft
• Q&A
Existing Data Architecture
[Diagram layers]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs)
Source: IDC
• 2.8 ZB in 2012
• 85% from New Data Types
• 15x Machine Data by 2020
• 40 ZB by 2020
Modern Data Architecture Enabled
[Diagram layers]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DEV & DATA TOOLS: BUILD & TEST
OPERATIONAL TOOLS: MANAGE & MONITOR
DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP)
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
Hadoop Powers Modern Data Architecture
Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment.
Hadoop Cluster
[Diagram: a row of cluster nodes, each labeled "compute & storage"]
Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware
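The distributed data processing mentioned above is commonly illustrated with the MapReduce pattern. The following is a minimal single-process sketch of that pattern, not Hadoop code: the function names and the sample `blocks` are hypothetical, and on a real cluster the map step would run in parallel on the nodes that hold each block.

```python
from collections import defaultdict


def map_phase(block):
    """Emit (word, 1) pairs for one block of text, as a mapper would."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts for each word, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks):
    # Each block is mapped independently of the others; that independence
    # is what lets a cluster spread the work across many nodes.
    pairs = [pair for block in blocks for pair in map_phase(block)]
    return reduce_phase(shuffle(pairs))


# Hypothetical input: three blocks standing in for data spread across nodes.
blocks = ["big data on Hadoop", "Hadoop stores big data", "data data data"]
counts = word_count(blocks)
```

Adding more blocks only adds more independent map calls, which is the sense in which this style of processing scales out.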
3 Requirements for Hadoop’s Role in the Modern Data Architecture
Requirements for Hadoop Adoption
• Integrated: Interoperable with existing data center investments
• Key Services: Platform, operational and data services essential for the enterprise
• Skills: Leverage your existing skills: development, operations, analytics
Use Cases for the MDA
Industry | Use Case | Type of Data
Financial Services | New Account Risk Screens | Text, Server Logs
Financial Services | Trading Risk | Server Logs
Financial Services | Insurance Underwriting | Geographic, Sensor, Text
Telecom | Call Detail Records (CDRs) | Machine, Geographic
Telecom | Infrastructure Investment | Machine, Server Logs
Telecom | Real-time Bandwidth Allocation | Server Logs, Text, Social
Retail | 360° View of the Customer | Clickstream, Text
Retail | Localized, Personalized Promotions | Geographic
Retail | Website Optimization | Clickstream
Manufacturing | Supply Chain and Logistics | Sensor
Manufacturing | Assembly Line Quality Assurance | Sensor
Manufacturing | Crowdsourced Quality Assurance | Social
Healthcare | Use Genomic Data in Medical Trials | Structured
Healthcare | Monitor Patient Vitals in Real-Time | Sensor
Pharmaceuticals | Recruit and Retain Patients for Drug Trials | Social, Clickstream
Pharmaceuticals | Improve Prescription Adherence | Social, Unstructured, Geographic
Oil & Gas | Unify Exploration & Production Data | Sensor, Geographic & Unstructured
Oil & Gas | Monitor Rig Safety in Real-Time | Sensor, Unstructured
Government | ETL Offload in Response to Federal Budgetary Pressures | Structured
Government | Sentiment Analysis for Government Programs | Social
Microsoft in the Modern Data Architecture
[Diagram layers]
APPLICATIONS
DEV & DATA TOOLS
OPERATIONAL TOOLS
DATA SYSTEM
INFRASTRUCTURE
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
Microsoft Applications
New! Power BI (Public Preview)
Hortonworks and Microsoft
• Engineering alignment
• Corporate alignment
• Field alignment
End-to-End Data Platform
• PDW vNext (PDW + HDInsight)
• Windows Azure HDInsight
• Hortonworks Data Platform
• PDW
• SQL Server for DW in Azure
• SQL Server
Hadoop Solutions From Microsoft
• PDW vNext (PDW + HDInsight)
• Windows Azure HDInsight
• Hortonworks Data Platform
Hortonworks Data Platform for Windows
Hortonworks Data Platform
Parallel Data Warehouse Next w/ HDInsight
PDW vNext (PDW + HDInsight)
PolyBase
[Diagram: a Select… query spans Hadoop data and relational data and returns a single result set]
Scale out technologies in SQL Server Parallel Data Warehouse
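The PolyBase slide describes one query returning a single result set from both Hadoop data and relational data. The sketch below illustrates only that idea and is not PolyBase syntax: it uses Python's built-in `sqlite3` as the relational side, and the `HADOOP_FILE` contents, table names, and `federated_query` function are hypothetical. It also copies the file rows into the database before joining, whereas the slide implies the data is queried where it lives.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a delimited file stored in Hadoop.
HADOOP_FILE = "customer_id,clicks\n1,40\n2,7\n3,19\n"


def federated_query(min_clicks):
    """Join relational customer rows with file-based click counts."""
    conn = sqlite3.connect(":memory:")

    # Relational side: an ordinary table.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ana"), (2, "Ben"), (3, "Chen")],
    )

    # File side: parse the delimited text and expose it as a table so that
    # a single SELECT can reach both data sets.
    conn.execute("CREATE TABLE clickstream (customer_id INTEGER, clicks INTEGER)")
    reader = csv.DictReader(io.StringIO(HADOOP_FILE))
    conn.executemany(
        "INSERT INTO clickstream VALUES (?, ?)",
        [(int(row["customer_id"]), int(row["clicks"])) for row in reader],
    )

    # One query, one result set, two kinds of source data.
    result = conn.execute(
        "SELECT c.name, s.clicks FROM customers c "
        "JOIN clickstream s ON s.customer_id = c.id "
        "WHERE s.clicks >= ? ORDER BY s.clicks DESC",
        (min_clicks,),
    ).fetchall()
    conn.close()
    return result


result = federated_query(10)
```

The caller sees only rows of `(name, clicks)`; nothing in the result reveals which columns came from the file and which from the table.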
Windows Azure HDInsight
Master Chief meets Big Data
• In-game analysis detects cheaters and improves experience for everyone
• Enables targeted campaigns that improve customer retention
Hadoop Solutions From Microsoft
• PDW vNext (PDW + HDInsight)
• Windows Azure HDInsight
• Hortonworks Data Platform
Development and Data Tools
Hortonworks & Microsoft
[Diagram labels]
• Query/Visualization/Reporting/Analytics
• INTERFACE: ODBC, JDBC, JAVA RPC
• HADOOP DATA SERVICES: HIVE, HBASE, PIG, HCATALOG, SQOOP
• MAPREDUCE, TEZ, YARN, HDFS
• AMBARI
• Governance, Exchange, Replication
Reference Architecture
[Diagram labels]
• SOURCE DATA: JMS Queues, Servers & Mainframe, Files, Databases, Sensor data, Social
• LOAD: SQOOP, FLUME, WebHDFS
• Enterprise Repositories
• Management and Monitoring
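Of the LOAD options listed, WebHDFS is a REST interface to HDFS, so a load step can be expressed as plain HTTP requests. The sketch below only builds request URLs in the documented `/webhdfs/v1/<path>?op=...` form and sends nothing over the network. The host name, port, and paths are assumptions for illustration, and the `webhdfs_url` helper is hypothetical, not part of any Hadoop client library.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for the given HDFS path and operation."""
    query = urlencode({"op": op, **params})
    # quote() leaves "/" intact, so the HDFS path keeps its structure.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed NameNode address; replace with the values for your cluster.
HOST, PORT = "namenode.example.com", 50070

# List a directory (sent as an HTTP GET).
list_url = webhdfs_url(HOST, PORT, "/data/logs", "LISTSTATUS")

# Create a file (sent as an HTTP PUT; the NameNode answers with a
# redirect to a DataNode, which receives the file content).
create_url = webhdfs_url(HOST, PORT, "/data/logs/app.log", "CREATE", overwrite="true")
```

Because the interface is plain HTTP, the same URLs work from any language or tool that can issue GET and PUT requests, including clients running on Windows.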
Question & Answer session will be conducted electronically, using the panel to the right of your screen
More about Microsoft and Hortonworks http://hortonworks.com/labs/Microsoft
Get started with Hortonworks Sandbox http://hortonworks.com/hadoop-tutorial/partner-tutorial-microsoft/
Follow us: @hortonworks @MicrosoftBI