eSynergy - Windows Azure: Introduction to Big Data and Hadoop

Post on 22-Jan-2015


DESCRIPTION

eSynergy Solutions was the Gold sponsor for the UK Microsoft Open Source Cloud event held in June 2012.

TRANSCRIPT

Welcome to Windows Azure by Microsoft.


Introduction to Big Data and Hadoop

Defining Big Data

Volume, Velocity, Variety

The world of data is changing

10x increase every five years

85% from new data types

Cheap Distributed Storage and Processing

4.3 connected devices per adult

Data explosion

By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent.

– Gartner, Mark Beyer, “Information Management in the 21st Century”

Easy Accessibility of Extended Data

New questions are being asked by the business:

SOCIAL & WEB ANALYTICS: What’s the social sentiment for my brand or products?

LIVE DATA FEEDS: How do I optimize my fleet based on weather and traffic patterns?

ADVANCED ANALYTICS: How do I better predict future outcomes?

OPERATIONAL DATA

Traditional E-Commerce Data Flow

NEW USER REGISTRY

NEW PURCHASE

NEW PRODUCT

Excess Data

Logs

ETL Some Data

Data Warehouse

OPERATIONAL DATA

New E-Commerce Big Data Flow

Raw Data “Store it All” Cluster

NEW USER REGISTRY

NEW PURCHASE

NEW PRODUCT

Data Warehouse

Logs

How much do views for certain products increase when our TV ads run?
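A question like this one becomes answerable once the raw view logs are kept rather than discarded. The following is a minimal sketch of the comparison, assuming a hypothetical log format (product name plus minute of the view) and a hypothetical ad schedule; every record, name, and number in it is made up for illustration and does not come from the deck.

```javascript
// All data below is invented for illustration only.
var viewLog = [
    { product: "kettle", minute: 5 },
    { product: "kettle", minute: 12 },
    { product: "kettle", minute: 14 },
    { product: "kettle", minute: 15 },
    { product: "toaster", minute: 13 },
    { product: "toaster", minute: 40 }
];

// Hypothetical ad schedule in minutes; start is inclusive, end is exclusive.
// Windows are assumed not to overlap.
var adWindows = [{ start: 10, end: 20 }];
var totalMinutes = 60;

function inAnyWindow(minute, windows) {
    return windows.some(function (w) {
        return minute >= w.start && minute < w.end;
    });
}

// Compare views per minute while the ad runs with views per minute otherwise.
function adLift(log, product, windows, periodMinutes) {
    var adMinutes = windows.reduce(function (sum, w) {
        return sum + (w.end - w.start);
    }, 0);
    var otherMinutes = periodMinutes - adMinutes;
    var adViews = 0;
    var otherViews = 0;
    log.forEach(function (entry) {
        if (entry.product !== product) {
            return;
        }
        if (inAnyWindow(entry.minute, windows)) {
            adViews++;
        } else {
            otherViews++;
        }
    });
    // Ratio of the two per-minute rates; not meaningful if otherViews is 0.
    var lift = (adViews * otherMinutes) / (otherViews * adMinutes);
    return { adViews: adViews, otherViews: otherViews, lift: lift };
}

var kettle = adLift(viewLog, "kettle", adWindows, totalMinutes);
console.log(kettle);
```

On a real cluster the same counting would run as a distributed job over the full logs; the sketch only shows the shape of the calculation.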

Big data creates New Business Opportunities

Revenue Growth

Increases ad revenue by processing 3.5 billion events per day

Massive Volumes

Processes 464 billion rows per quarter, with average query time under 10 secs.

Business Innovation [1]

Measures and ranks online user influence by processing more than 1 billion signals per day

Cloud Connectivity

Connects across 15 social networks via the cloud for data and API access

Operational Efficiencies

Identify faults in gas turbines before they happen

GE

Near Real-Time Insight

Receive signals from turbines and compare them with normal signals and with signals recorded before a fault subsequently occurred

1. Klout Case Study: http://www.microsoft.com/casestudies/Microsoft-SQL-Server-2012-Enterprise/Klout/Data-Services-Firm-Uses-Microsoft-BI-and-Hadoop-to-Boost-Insight-into-Big-Data/710000000129

Hadoop - the Basics

So How Does It Work?

FIRST, STORE THE DATA

[Diagram labels: Files; Server (x4)]
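The storage step can be sketched as follows: a file is cut into fixed-size blocks and each block is copied to several servers. This is only an illustration. The function name `placeBlocks` and the server names are hypothetical, the 64 MB block size and three replicas are assumed here as common defaults of Hadoop releases from this period, and the round-robin placement is a simplification (real HDFS placement is decided by the NameNode and is rack-aware).

```javascript
// Hypothetical illustration only; this is not how HDFS chooses servers.
function placeBlocks(fileSizeBytes, servers, blockSizeBytes, replication) {
    var blockCount = Math.ceil(fileSizeBytes / blockSizeBytes);
    var placement = [];
    for (var b = 0; b < blockCount; b++) {
        var replicas = [];
        for (var r = 0; r < replication; r++) {
            // Consecutive servers, wrapping around. Replicas of one block land
            // on distinct servers as long as replication <= servers.length.
            replicas.push(servers[(b + r) % servers.length]);
        }
        placement.push({ block: b, servers: replicas });
    }
    return placement;
}

var MB = 1024 * 1024;
var placement = placeBlocks(
    200 * MB,
    ["server1", "server2", "server3", "server4"],
    64 * MB, // assumed block size
    3        // assumed replication factor
);

placement.forEach(function (p) {
    console.log("block " + p.block + " -> " + p.servers.join(", "));
});
```

With these assumed numbers a 200 MB file becomes four blocks, and losing any single server still leaves two copies of every block.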

So How Does It Work?

SECOND, TAKE THE PROCESSING TO THE DATA

// Map Reduce function in JavaScript

// map: split each input line into words and emit (word, 1) for every word
var map = function (key, value, context) {
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

// reduce: add up the counts emitted for one word
var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};

[Diagram labels: Code; Runtime; Server (x4)]
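To see what the runtime does with the two functions above, here is a minimal single-process sketch of the map, shuffle, and reduce phases. It is not Hadoop: `runLocally` is a hypothetical helper written for this illustration, and the `context.write`, `hasNext`, and `next` objects are stand-ins shaped to match what the slide's functions call.

```javascript
// Word-count functions from the slide.
var map = function (key, value, context) {
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};

// Hypothetical local driver: same three phases, one process, no cluster.
function runLocally(lines, mapFn, reduceFn) {
    var groups = {};
    var mapContext = {
        write: function (k, v) {
            // Shuffle: collect every emitted value under its key.
            if (!Object.prototype.hasOwnProperty.call(groups, k)) {
                groups[k] = [];
            }
            groups[k].push(v);
        }
    };
    // Map: call mapFn once per input line.
    lines.forEach(function (line, offset) {
        mapFn(offset, line, mapContext);
    });

    var results = {};
    var reduceContext = {
        write: function (k, v) { results[k] = v; }
    };
    // Reduce: call reduceFn once per key with an iterator over its values.
    Object.keys(groups).sort().forEach(function (k) {
        var i = 0;
        var iterator = {
            hasNext: function () { return i < groups[k].length; },
            next: function () { return groups[k][i++]; }
        };
        reduceFn(k, iterator, reduceContext);
    });
    return results;
}

var counts = runLocally(
    ["Store the data", "Take the processing to the data"],
    map,
    reduce
);
console.log(counts);
```

On a cluster, the map calls run on the servers that already hold each block of the input, and only the much smaller (word, count) pairs travel over the network to the reducers.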

Hadoop Architecture

MapReduce – Workflow

The Hadoop Ecosystem

ETL Tools, BI Reporting, RDBMS

Reference: Tom White’s Hadoop: The Definitive Guide

Traditional RDBMS vs. MapReduce

             TRADITIONAL RDBMS          MAPREDUCE
Data Size    Gigabytes (Terabytes)      Petabytes (Exabytes)
Access       Interactive and Batch      Batch
Updates      Read / Write many times    Write once, Read many times
Structure    Static Schema              Dynamic Schema
Integrity    High (ACID)                Low
Scaling      Nonlinear                  Linear
DBA Ratio    1:40                       1:3000

Reference: Tom White’s Hadoop: The Definitive Guide

Microsoft and Hadoop

Hadoop on Windows

Insights to all users by activating new types of data

Integrate with Microsoft Business Intelligence

Hive ODBC Driver & Hive Add-in for Excel

Choice of deployment on Windows Server + Windows Azure

Integrate with Windows Components (AD, System Center)

Easy installation and configuration of Hadoop on Windows

Simplified programming with .NET & JavaScript integration

Integrate with SQL Server Data Warehousing

Differentiation

© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

www.esynergy-solutions.co.uk

0207 444 4080

info@esynergy-solutions.co.uk
