MICROSOFT BIG DATA
@ashishjaiman
/in/ashishjaiman
ashishjaiman.com
Ashish Jaiman
Director, Startup Strategy
WHAT IS BIG DATA?
Data Complexity: Variety and Velocity
Terabytes
Gigabytes
Megabytes
Petabytes
Big Data
Log files
Spatial & GPS coordinates
Data market feeds
eGov feeds
Weather
Text/image
Click stream
Wikis/blogs
Sensors/RFID/devices
Social sentiment
Audio/video
Web 2.0
Web Logs
Digital Marketing
Search Marketing
Recommendations
Advertising
Mobile
Collaboration
eCommerce
ERP/CRM
Payables
Payroll
Inventory
Contacts
Deal Tracking
Sales Pipeline
How do I optimize my fleet based on weather and traffic patterns?
SOCIAL & WEB ANALYTICS
LIVE DATA FEEDS
ADVANCED ANALYTICS
What’s the social sentiment for my brand or products?
How do I better predict future outcomes?
A NEW SET OF QUESTIONS
NEW OPPORTUNITIES
Revenue Growth
Increases ad revenue by processing 3.5 billion events per day
Massive Volumes
Processes 464 billion rows per quarter, with average query time under 10 secs.
Business Innovation
Measures and ranks online user influence by processing 3 billion signals per day
Cloud Connectivity
Connects across 15 social networks via the cloud for data and API access
Operational Efficiencies
Uses sentiment analysis and web analytics for its internal cloud
GE
Real-Time Insight
Improves operational decision making for IT managers and users
THE BIG DATA LIFECYCLE
Manage, Enrich, Insight
Relational, Non-Relational, Streaming
MANAGE ANY DATA, ANY SIZE, ANYWHERE
Unified Monitoring, Management & Security
Data Movement
Extremely large volume of unstructured web logs
Ad hoc analysis of logs to prototype patterns
Hadoop data cluster feeds large 24 TB cube
Business users analyze cube data
6 PB Hadoop Cluster
24 TB SQL Server AS Cube
Microsoft BI Tools
E.g. STRUCTURED & UNSTRUCTURED DATA
HADOOP INTEGRATED INTO THE DATA PLATFORM
Non-Relational
Enterprise-class security, HA & management
Seamlessly integrated with Microsoft BI tools
Windows simplicity and manageability
Provisioned in minutes on Windows Azure
Microsoft HDInsight Server for on-premises
Windows Azure HDInsight Service for cloud
BUILT ON HORTONWORKS DATA PLATFORM (HDP)
POLYBASE: COMBINING RELATIONAL AND NON-RELATIONAL DATA
The future of query processing
select... results set
Hadoop Data Warehouse
PolyBase
Single query for relational & Hadoop data
Process data in place
Future expansion to other data sources
Seamless: regular T-SQL command
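To make the idea of one query spanning relational and log-style data concrete, here is a minimal Python sketch. It does not use PolyBase or T-SQL: the standard-library sqlite3 module stands in for the database, and the table names, columns, and log lines are all hypothetical. Unlike the in-place processing described on the slide, this stand-in copies the parsed log lines into a table before querying.

```python
import sqlite3

# Hypothetical relational data: a small customer table.
customers = [(1, "Contoso"), (2, "Fabrikam")]

# Hypothetical non-relational data: raw log lines of the form "<customer_id> <url>".
raw_logs = ["1 /home", "2 /pricing", "1 /checkout"]


def parse_log(line):
    """Split one raw log line into (customer_id, url)."""
    customer_id, url = line.split(" ", 1)
    return int(customer_id), url


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
conn.execute("CREATE TABLE web_logs (customer_id INTEGER, url TEXT)")
conn.executemany(
    "INSERT INTO web_logs VALUES (?, ?)", [parse_log(line) for line in raw_logs]
)

# One SQL statement joins the relational table with the parsed log data.
rows = conn.execute(
    """
    SELECT c.name, COUNT(*) AS hits
    FROM customers AS c
    JOIN web_logs AS w ON w.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)
```

The point of the sketch is only the shape of the final query: the caller writes ordinary SQL and does not need to know which side of the join started life as log files.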
WHILE DRAMATICALLY SIMPLIFYING PROGRAMMING ON HADOOP
Integration with .NET and new JavaScript libraries for Hadoop
JS
MapReduce programs in JavaScript
Simplified programming
Simplified deployment of MapReduce jobs
Benefits
Key Features
Deploy JavaScript Hadoop jobs from a simple web browser on any supported device
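For readers new to the MapReduce model mentioned on this slide, here is a minimal word-count sketch. It is written in plain Python rather than the JavaScript libraries the slide refers to, runs in a single process, and does not use Hadoop; the function names are illustrative only.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for each word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data", "Big insights from big data"])
print(counts)
```

In a real cluster the map and reduce functions run on many machines and the framework handles the shuffle; the programmer supplies only the two small functions, which is the simplification the slide is describing.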
Manage, Enrich, Insight
THE BIG DATA LIFECYCLE
ENRICH BY CONNECTING TO THE WORLD'S DATA
Discover
Combine
Refine
DISCOVER DATA
FROM
TO
SEARCH
RECOMMEND
IDENTITY
DOC CONTEXT
SOCIAL GRAPHS
DATA EXPLORER
DATA HUB
POWER OF COMBINING THE WORLD'S DATA
Personal Data
Organizational Data
Community Data
World Data
Value
REFINE DATA
Enterprise Information Management & Full Analytic Spectrum
Credible, Consistent Data
Advanced Analytics
Data Mining
E.g. VALUE OF EXTERNAL DATA
“When it comes to business intelligence, Microsoft SQL Server 2012 demonstrates that the platform has continued to advance and keep up with the innovations that are happening in big data.”
David Mariani, Vice President of Engineering
Connects to more than 1 billion signals
Across 15 leading social networks, including Facebook
Generates a ‘Klout’ score for individual people, brands & partners
Enables analysis, targeting and social graphs
Manage, Enrich, Insight
THE BIG DATA LIFECYCLE
INSIGHTS ON ANY DATA, ALL USERS, WHEREVER THEY ARE
Relational, Non-Relational, Streaming
BI Professionals
Business Analysts
Data Scientists
INSIGHTS FOR ALL USERS THROUGH FAMILIAR TOOLS
Advanced Analytics from Microsoft and 3rd parties
Self Service Analysis with PowerPivot & Power View
Interactivity & exploration with Hadoop data in Excel
PB TB GB
BI Professionals
Business Analysts
Data Scientists
MICROSOFT BIG DATA
Parallel Data Warehouse
PowerPivot
Power View
Manage, Enrich, Insight
Microsoft HDInsight Server
HDInsight Service
ADDITIONAL RESOURCES
LEARN MORE
1. Microsoft Big Data Solution: www.microsoft.com/bigdata
2. Windows Azure: www.windowsazure.com/en-us/home/scenarios/big-data
3. Microsoft BI blog: http://blogs.msdn.com/b/microsoft_business_intelligence1/
TRY NOW
4. Preview of the Windows Azure HDInsight Service: https://www.hadooponazure.com
5. Developer CTP of Microsoft HDInsight Server for Windows Server: http://www.microsoft.com/bigdata
BizSpark for Startups
Accelerate
3 Years
Top Partners
Connect
Graduation
Criteria
<5 Years Old
<$1M Revenue
Software Product
Privately Held
Benefits
Software
MSDN Ultimate
Cloud Services
Support
Marketing
http://aka.ms/aj-bz
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
BIG DATA IS A GROWTH OPPORTUNITY FOR PARTNERS
1. McKinsey&Company, McKinsey Global Survey Results, Minding Your Digital Business, 2012
2. IDC Market Analysis, Worldwide Big Data Technology and Services 2012–2015 Forecast, 2012
Big Data is a Big Priority for Customers
49% of top CEOs and CIOs are currently using Big Data for customer analytics [1]
Big Data Services Growth (billions $)
2012: 2.7
2013: 3.9
2014: 5.1
2015: 6.5
39% compound annual growth rate [2]
Big Data Software Growth (billions $)
2012: 1.8
2013: 2.5
2014: 3.4
2015: 4.6
34% compound annual growth rate [2]
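The compound annual growth rate (CAGR) implied by the two growth charts above can be worked out with a short calculation. This is a sketch that uses only the four values shown per chart (2012 to 2015, in billions of dollars); the helper names are illustrative, and the slide does not say which period its own 39% and 34% labels cover.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def series_cagr(series):
    """CAGR from the earliest to the latest year in a {year: value} mapping."""
    first, last = min(series), max(series)
    return cagr(series[first], series[last], last - first)


# Values read from the two charts, in billions of dollars.
services = {2012: 2.7, 2013: 3.9, 2014: 5.1, 2015: 6.5}
software = {2012: 1.8, 2013: 2.5, 2014: 3.4, 2015: 4.6}

services_rate = series_cagr(services)
software_rate = series_cagr(software)
print(f"Services: {services_rate:.1%}")
print(f"Software: {software_rate:.1%}")
```

Computed over 2012 to 2015 alone, the plotted points give roughly 34% for services and 37% for software. These need not match the labels on the charts, which may refer to a different base period in the cited forecast.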
WHAT IS BIG DATA AND WHY NOW?
Hadoop
Volume
Variety
Velocity
Cheap, Distributed Storage & Processing
By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent.
– Gartner, Mark Beyer, “Information Management in the 21st Century”
Changing economics
Data explosion
Discover data with Data Explorer
Combine with information from other sources via Azure Marketplace
Refine with advanced analytics
Connecting with the World’s Data
MICROSOFT BIG DATA
Immersive insights for all users
Insights on any data
Embedded insights with simplified programming
Immersive Insight, Wherever you are
Enterprise-ready Hadoop
Windows simplicity and Manageability for Hadoop
Extend data warehouse with Hadoop
Scale & elasticity of the cloud
Open Big Data Platform
Any Data, Any Size, Anywhere
Parallel Data Warehouse
PowerPivot
Power View
BIG DATA REQUIRES TRADITIONAL AND NEW CAPABILITIES
TRADITIONAL
Relational Database Management System
NEW
Petabyte-Scale Services
HADOOP ON PREMISES AND IN THE CLOUD
Enterprise-class Big Data platform on-premises
Hadoop-based distribution on Windows Server with Microsoft HDInsight
Elastic Big Data platform in the cloud
Hadoop-based Service on Windows Azure platform
Hadoop connectors for SQL Server
Extend your EDW with Big Data
MANAGE ANY DATA, ANY SIZE, ANYWHERE
Non-Relational
Relational
SQL Server Database & Parallel Data Warehouse
Hadoop on Windows
Hadoop on Azure
Streaming
StreamInsight
Data Movement
Hadoop Connectors & ETL
Unified Monitoring, Management & Security
SIMPLICITY AND MANAGEABILITY OF WINDOWS FOR HADOOP
Easy setup on-premises and in the cloud
Hadoop-based service with Windows Azure HDInsight
Smart packaging of Hadoop on Windows with Microsoft HDInsight
Integration with Microsoft System Center
Simplified management
Enterprise-class security
Integration with Windows Server® Active Directory
STREAMING DATA WITH STREAMINSIGHT
Complex Event Processing with StreamInsight (on-premises)
On-premises analysis of streaming data in real time
Event Processing in the Cloud with Windows Azure SQL StreamInsight
Cloud designed analysis of streaming data
StreamInsight SQL StreamInsight
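Analysis of streaming data commonly aggregates events over fixed time windows. The following minimal Python sketch shows a tumbling (non-overlapping) window count. It does not use the StreamInsight API; the event format of (timestamp in seconds, key) and the sensor names are assumptions made for illustration, and events are assumed to arrive in time order.

```python
def tumbling_counts(events, window_seconds):
    """Yield (window_start, counts_by_key) each time a window closes.

    `events` is an iterable of (timestamp, key) pairs in time order.
    """
    current_start = None
    counts = {}
    for timestamp, key in events:
        start = (timestamp // window_seconds) * window_seconds
        if current_start is not None and start != current_start:
            # The event belongs to a later window, so emit the finished one.
            yield current_start, counts
            counts = {}
        current_start = start
        counts[key] = counts.get(key, 0) + 1
    if current_start is not None:
        # Flush the last, still-open window when the input ends.
        yield current_start, counts


events = [(0, "sensor-a"), (3, "sensor-a"), (7, "sensor-b"), (12, "sensor-a"), (14, "sensor-a")]
windows = list(tumbling_counts(events, window_seconds=10))
print(windows)
```

Because the function is a generator, results for a window become available as soon as the first event of the next window arrives, rather than after the whole input has been read.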
EXTEND YOUR DATA WAREHOUSE WITH HADOOP
Integration with enterprise BI solutions
Microsoft SQL Server connector for Apache Hadoop with SQOOP (SQL to Hadoop)
Integration with Microsoft Data Warehousing
SQL Server Parallel Data Warehouse connector for Apache Hadoop with SQOOP
Deeper insights from structured and unstructured data
CONNECT HADOOP TO THE WORLD VIA WINDOWS AZURE MARKETPLACE
Mashing up of internal and public data sets
Integration with third-party data and services
Sharing of data and insights through Windows Azure Marketplace
Integration with Windows Azure Marketplace through ODATA
ENRICHMENT VIA INTEGRATION WITH SOCIAL MEDIA
Integration of social media data with business applications
Microsoft Codename "Social Analytics"
Stronger customer relationships
Integration with social media sites
Models augmented with publicly available data from social media sites
ADVANCED ANALYSIS WITH HADOOP
Unlock rare patterns from bespoke data mining models
Mahout & Pegasus libraries already supported on Azure
New business insights with predictive analytics from Microsoft
Hive ODBC Driver connects Hadoop to SQL Server Data Mining tools in SSAS
Support for open source Advanced Analytics tools such as Mahout & Pegasus
MICROSOFT ENTERPRISE DATA WAREHOUSING
Software
Appliances
Reference Architectures
Fast Track for
Dell Parallel Data Warehouse
HP Enterprise Data Warehouse
Dell Quickstart Data Warehouse
HP Business Data Warehouse