Greg Pedley Canadian Sales Lead
Big Data, Small Data, ALL Data with
Agenda
• Introductions
• Handling the Data Deluge!
• Modern Data Warehouse
• Hadoop
• Polybase Demonstration
• What is SQL PDW?
• PDW Customer Use Case
• Resources
• Q & A
CUSTOMER DRIVEN DATA: POS data, loyalty data, etc.
Today we have more data than ever but …
… it has never been harder to understand it.
SOCIAL CHANNELS: Customer preferences & brand perception
INTERNAL SYSTEMS: Profitability & segmentation data
Data is complex, time-consuming & hard to get at…
• Quantity: an explosion of data
• Integration: data locked in silos
• Quality: data quality is not reliable
• Action: slow to get value from data
… but at the same time, it has never been more important to understand massive amounts of data.
Wouldn’t it be good if you could do the following?
Traditional Data Warehouses At A Tipping Point
Difficulties with Data Warehousing Today…
[Diagram labels: Operational Systems, Enterprise Data Warehouse, Data Marts, Business Intelligence]
1. Get the data model right, up front.
2. Load, clean, transform data fast.
3. Improve query performance from hours to seconds.
4. Manage multiple types of data.
…and with the Modern Data Warehouse
[Diagram labels: Files, Documents, Blobs, Cube, SQL, Traditional, Relational, Business Intelligence]
How is Microsoft Unique?
[Diagram labels: Business Intelligence, SQL Query, Polybase, Cloud, Appliance]
Types of Big Data?
• Log files
• Spatial & GPS coordinates
• Data market feeds
• eGov feeds
• Weather
• Text/image
• Click stream
• Wikis/blogs
• Sensors/RFID/devices
• Social sentiment
• Audio/video
[Chart labels: Data complexity: variety and velocity; Petabytes; Big Data]
What is Hadoop?
Hadoop = MapReduce + HDFS
• MapReduce (Job Scheduling/Execution System)
• HDFS (Hadoop Distributed File System)
• HBase (Column DB)
• Hive
• Mahout
• Oozie
• Sqoop
• HBase/Cassandra/Couch/MongoDB
• Avro
• ZooKeeper
• Pig
• Flume
• Cascading
• R
• Ambari
• HCatalog
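To make the "Hadoop = MapReduce + HDFS" line concrete, here is a minimal single-process sketch of the MapReduce pattern, using a word count as the example. It is an illustration only and does not use Hadoop itself: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the in-memory list of lines stands in for blocks of a file that HDFS would spread across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Each string stands in for one block of a distributed file (assumption).
    blocks = ["big data small data", "all data"]
    print(word_count(blocks))
```

In a real cluster the map and reduce calls would run on many machines in parallel, close to where the data blocks are stored; the sketch only shows the shape of the computation.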
SQL Polybase Demonstration…
What is SQL Parallel Data Warehouse?
• PDW = Parallel Data Warehouse
• Massively Parallel Processing (MPP) for high performance
• Sold as an appliance with software preloaded
• Microsoft software running on HP or Dell hardware
• Based on proven MS SQL Server 2012 platform
• Lowest cost of ownership in the industry
• Integral Part of Microsoft’s BIG DATA & Cloud Strategies
• ****Dedicated Region for Hadoop****
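The Massively Parallel Processing (MPP) bullet above can be pictured with a small scatter/gather sketch. This is a rough illustration of the general idea, not PDW's actual implementation: NODE_COUNT, the function names and the sample sales rows are all hypothetical, and threads stand in for separate compute nodes.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NODE_COUNT = 4  # hypothetical number of compute nodes


def node_for(key):
    """Pick a node using a stable hash of the distribution key."""
    return zlib.crc32(str(key).encode("utf-8")) % NODE_COUNT


def distribute(rows, key_column):
    """Scatter rows across nodes so each node holds a disjoint slice."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_column])].append(row)
    return nodes


def local_sum(rows, group_column, value_column):
    """Work done on one node: aggregate only the rows stored there."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_column]] += row[value_column]
    return dict(totals)


def parallel_sum(rows, group_column, value_column):
    """Run the aggregate on every node at once, then merge the partial results."""
    nodes = distribute(rows, group_column)
    with ThreadPoolExecutor(max_workers=NODE_COUNT) as pool:
        partials = list(
            pool.map(lambda part: local_sum(part, group_column, value_column),
                     nodes.values())
        )
    merged = defaultdict(float)
    for partial in partials:
        for key, total in partial.items():
            merged[key] += total
    return dict(merged)


if __name__ == "__main__":
    sales = [
        {"store": "A", "amount": 10.0},
        {"store": "B", "amount": 5.0},
        {"store": "A", "amount": 2.5},
    ]
    print(parallel_sum(sales, "store", "amount"))
```

Because rows are distributed on the same column used for grouping, each group lives entirely on one node and the final merge is cheap; distributing on a different column would still give the right answer, with more merging at the end.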
Scale out relational data to petabytes
Scale out technologies in SQL Server Parallel Data Warehouse
Scale out non-relational data
Scale out non-relational data in HDInsight (Azure, Windows, or PDW)
In-memory performance
Distributed Data Warehouse Architecture
• High-Performance Reporting
• SQL Server Analysis Services
• Data Files
• 3rd Party Data Integration
• ETL Tools
• 3rd Party RDBMS
• Central EDW Hub
• Departmental Reporting
• Accessible from Anywhere
• SQL Database
Resources
• www.UpgradeToPDW.com for a quick video, a downloadable white paper, ROI Calculator, case studies, migration guide, etc.
• Greg Pedley – Canadian Sales Lead – [email protected]
• Tom Pizzato – PDW Technical Lead – [email protected]
Conclusion
• Introductions
• Handling the Data Deluge!
• Modern Data Warehouse
• Hadoop
• Polybase Demonstration
• What is SQL PDW?
• PDW Customer Use Case
• Resources
• Q & A