Webconference: Jan 29, 2013
TRANSCRIPT
• Key Takeaways
• Introduction to Big Data
• Global Landscape and Trends
• The Big Data Opportunity
• Big Data’s Big Impact on Businesses
Key Takeaways
• The Big Data market opportunity is expected to witness strong growth in the next 5 years
– Expected to touch US$25 billion globally; the ‘BIG’ opportunity for India lies in the IT & IT-enabled Services space, which is likely to be a ~US$ 10-11 billion market globally in 2015
– India is likely to garner a ~10% share of the ~US$ 10-11 billion global Big Data IT Services market by 2015
– Data-related regulations like Dodd-Frank and Basel III to impact Big Data implementations
• Initially, North America & Europe are likely to drive the Big Data opportunity, since over 85% of the world’s data resides in these 2 regions today
• New database architectures and innovative analytics tools & techniques to facilitate Big Data implementations
• By end of 2012, around 90% of Fortune 500 companies had some initiatives underway related to Big Data
• Key verticals driving demand for Big Data analytics: Financial services, Retail, Telecom, Healthcare and Manufacturing
• Key risk – potential shortfall of 1.5 million Data-Savvy Managers and 140,000-190,000 Data Scientists in the US by 2018
Source: CRISIL GR&A analysis
An Introduction to Big Data
• Definition of Big Data
• Big Data ecosystem
• Benefits of Big Data to enterprises
• Key applications for end consumers
Big Data is Defined by Volume, Variety and Velocity
[Figure: Big Data and Big Data analytics compared with traditional analytics and advanced analytics. X-axis: size of data; Y-axis: speed, accuracy and complexity of intelligence]
What is Big Data?
Big Data relates to rapidly growing structured and unstructured datasets whose sizes are beyond the ability of conventional database tools to store, manage, and analyze. Beyond size and complexity, the term also refers to the ability of such data to support “evidence-based” decision-making that has a high impact on business operations
3Vs
1. Volume: Large quantity of data, which may be enterprise-specific or general, and public or private
2. Variety: Diverse set of data being created, such as social networking feeds, video and audio files, email, sensor data and other raw data
3. Velocity: Speed of data inflow as well as the rate at which this fast-moving data needs to be stored
[Figure: scale of data size (gigabytes, terabytes, petabytes, zettabytes), spanning small data sets and traditional analytics through to Big Data]
Source: CRISIL GR&A analysis
Global Data is Likely to Grow at a CAGR of 41%
Growth of global data, 2009-2020 (zettabytes)
2009: 0.8
2011: 1.9
2015: 7.9
2020: 35.0
CAGR (2009-2020): 41.0%
Implication for an organization
• Need for large storage capacity and quick retrieval of data
• Enable informed decision-making by effectively leveraging large data sets
– Turn 12 TB of tweets created each day into improved product sentiment analysis
– Convert 350 billion annual meter readings to better predict power consumption
Note: ZB stands for zettabytes; Source: IDC; CRISIL GR&A analysis
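The 41% figure can be checked against the chart values with the standard compound annual growth rate formula. The sketch below is only an illustration: the function name `cagr` and the dictionary `GLOBAL_DATA_ZB` are hypothetical names introduced here, and the data points are the four values quoted on the slide.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values that are `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Global data volume in zettabytes, as quoted on the slide (IDC figures).
GLOBAL_DATA_ZB = {2009: 0.8, 2011: 1.9, 2015: 7.9, 2020: 35.0}

first_year, last_year = min(GLOBAL_DATA_ZB), max(GLOBAL_DATA_ZB)
growth = cagr(
    GLOBAL_DATA_ZB[first_year],
    GLOBAL_DATA_ZB[last_year],
    last_year - first_year,
)
print(f"CAGR {first_year}-{last_year}: {growth:.1%}")
```

Growing from 0.8 ZB to 35.0 ZB over 11 years works out to roughly 41% per year, which matches the headline figure.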
Today 80% of Data Existing in any Enterprise is Unstructured Data
Introduction
• The variety of sources from which data is being generated has undergone a shift
• The types of data being created have changed from structured to semi-structured to unstructured data
Structured Data
• Resides in formal data stores (RDBMS and data warehouses), grouped in the form of rows or columns
• Accounts for ~10% of the total data existing currently
Semi-Structured Data
• A form of structured data that does not conform to the formal structure of data models
• Accounts for ~10% of the total data existing currently
Unstructured Data
• Comprises data formats that cannot be stored in row/column format, such as audio files, video and clickstream data
• Accounts for ~80% of the total data existing currently
Example data types shown: audio, video, weather patterns, blogs, location co-ordinates, text messages, web logs & clickstreams, RDBMS (e.g., ERP and CRM), data warehousing, Microsoft Project plan files, sensor data/M2M, email, social media, geospatial data
Implication for organization
• Need to manage a broad range of data types
• Process analytic queries across numerous data types
Solutions required
• The need to extract meaningful analysis from this data has led several technologies to gain traction
• Examples include NoSQL databases to store unstructured data, as well as innovative processing methods like Hadoop and massively parallel processing (MPP)
Source: Industry reporting; CRISIL GR&A analysis
Big Data will Enable Real Time Analytics
• Big Data is also characterized by velocity or speed, i.e., the frequency of data generation or the frequency of data delivery
• New-age communication channels such as mobile phones, email and social networking have increased the rate of information flows
Big Data velocity enabling real-time use of data
Examples:
• Telcos adopting location-based marketing based on user location sensed by mobile towers
• Satellite images can help monitor and analyze troop movements, a flood plain, cloud patterns, or forest fires
• Video analysis systems could monitor a sensitive or valuable facility, watching for possible intruders and alerting authorities in real time
Data velocity per minute
• 600+ videos on YouTube
• 200 million+ emails sent
• 2 million+ Google search queries
• 400,000+ minutes of Skype calling
• 400,000+ tweets
• US$ 300,000+ spent on online shopping
• 700,000+ Facebook updates
• 7,000+ photos on Flickr
• 1,500+ blog posts
• 3,500+ ticks in securities trading
Source: Industry reporting; CRISIL GR&A analysis
Big Data Analytics is Application of Advanced Techniques on Big Datasets; Answers Questions Previously Considered Beyond Reach
• Big Data analytics is where advanced analytic techniques are applied to Big Data sets
• The term came into use in late 2011 to early 2012
Basic analytics
• What happened?
• When did it happen?
• What was its impact?
Advanced analytics
• Why did it happen?
• When will it happen again?
• What caused it to happen?
• What can be done to avoid it?
[Figure: Evolution of analytics. X-axis: time (late 1990s; 2000 onwards); Y-axis: level of complexity. Phase labels: analytics as a separate value chain function; in-database analytics. Stages: descriptive analytics, predictive analytics, prescriptive analytics, Big Data analytics. Techniques shown: standard reports, ad hoc reports, query drill-down, alerts, statistical analysis, forecasting, predictive modeling, optimization, stochastic optimization, natural language processing, complex event processing, multivariate statistical analysis, time series analysis, behavioral analytics, data mining, constraint-based BI, social network analytics, semantic analytics, online analytical processing (OLAP), extreme SQL, visualization, analytic database functions]
Source: CRISIL GR&A analysis
Big Data Management, Analytics, IT Services & Applications are the Key Constituents of Big Data Ecosystem
What does the Big Data Ecosystem Constitute?
Four key elements:
1. Big Data management & storage
• Data storage infrastructure and technologies
2. Big Data analytics
• Includes the technologies and tools to analyze the data and generate insight from it
3. Big Data’s application & use
• Involves enabling the Big Data insights to work in BI and end-user applications
4. IT services, including
• System integration
• Consulting
• Project management and customization
[Figure: Components of the Big Data ecosystem. Layers: data management & storage; data analytics & its application and use; IT services (SI, customization, consulting, system design). Elements shown: data sources and input data (unstructured data: text, web pages, social media content, video etc.; structured data stored in MPP, RDBMS and DW*; operational data); data architecture (Hadoop/Big Data technology framework, MapReduce etc.; NoSQL, Hadoop-based, MPP, RDBMS, DW); data administration tools; system tools; workflow/scheduler products; ETL & data integration products; developer environments (languages (Java), environments (Eclipse & NetBeans), programming interfaces (MapReduce)); analytics products (Avro, Apache Thrift); BI & visualization tools; applications (mobile, search, web); end users; business analysts]
*MPP: massively parallel processing; RDBMS: relational database management systems; DW: data warehouse
Source: CRISIL GR&A analysis
Global Landscape and Trends
• Big Data – Geographic Analysis
• Market Trends & Developments
North America & Europe Drive the Big Data Opportunity with over 85% of the World’s Data
As North America and Europe account for the lion’s share of the world’s data, the initial opportunity for both Big Data implementations and analytics lies in these geographies, i.e., developed economies
[Figure: World map of the amount of new Big Data stored (petabytes), 2010, shaded by data generated from high to low. Regions: North America, South America, Europe, Middle East, India, China, Japan. Values shown: >3,500; >2,000; >400; >250; >200; >50; >40]
Regional callouts:
• Key verticals: Healthcare, Manufacturing, Retail, Digital Marketing
– Demand trend: High demand for Big Data analytics
• Key verticals: Telecom, Retail, Banking
– Demand trend: Still embryonic; most organizations have a wait-and-watch approach
• Demand trend: Current demand appears to be limited; however, lack of skills may drive outsourcing of Big Data analytics
– Low awareness levels
• Key verticals: Technology, Financial services, Oil & Gas, Utilities, Manufacturing
– Demand trend: European MNCs are still in the early stages of the adoption cycle
• Key verticals: Manufacturing, Telecom, Health & Life Sciences
– Demand trend: Demand for BI to derive operational efficiency
• Key verticals: Telecom, Bioinformatics, Retail
– Demand trend: Industry is in a nascent stage with demand catching up, particularly in retail
Source: McKinsey Global Institute; CRISIL GR&A analysis
Emergence of Niche Startups and Large IT Players Enhancing their Big Data Capabilities are key enablers for the Industry
Market Trends and Developments
1. Emergence of niche Big Data startups driving technological innovation
2. Large IT players leveraging M&As to add Big Data capabilities to their service portfolios
3. Financial Services, Retail and Telecom are likely to be the early adopters in the Big Data space
4. Talent shortage is one of the biggest challenges of the Big Data space
Source: CRISIL GR&A analysis
Emergence of niche Big Data start-ups to boost technological innovation
A new class of companies specializing in Big Data technologies has emerged to capitalize on the opportunities in the Big Data domain
Big Data start-ups – Key characteristics
1. Specialized in niche Big Data technologies like Hadoop, NoSQL systems, in-memory analytics, massively parallel processing, and analytical platforms
2. Majority of start-ups generate revenue of less than USD 50 million and exhibit double-digit annual revenue growth
3. Most start-ups are raising funding from private ventures or being acquired by large IT players
Technology Area (Players*)
• Hadoop distributions
• Non-Hadoop Big Data platforms
• Analytic platforms and applications
• Cloud-based Big Data applications
*Indicative list of players
Source: Industry reporting; CRISIL GR&A analysis
Large IT Players Leveraging M&As to add Big Data Capabilities to their Service Portfolios
Key highlights
• Acquisition targets are mainly innovative Big Data start-ups
• M&As with bigger deal values are happening in data management
• M&As in the Big Data space tripled in the first half of 2012
Area | Acquirer | Target company | Date | Deal value | Rationale
Data Management | | | Oct. '11 | USD 1.1 billion | Develop a comprehensive platform to analyze Big Data
Data Management | | | Mar. '11 | USD 263 million | Strengthen position in data warehousing market through expertise in SQL and MapReduce-based analysis
Advanced analytics | | | Jun. '12 | N.A. | Extend Smarter Commerce suite with qualitative analytics software
Advanced analytics | | | May. '12 | N.A. | Leverage data navigation technologies for Big Data by automating discovery through innovative index and search capabilities
Advanced analytics | | | May. '12 | N.A. | Addition of sales performance analytics
Advanced analytics | | | May. '12 | N.A. | Enhance Big Data marketing analytics
Advanced analytics | | | Apr. '12 | N.A. | Acquisition of spend and procurement analytics
Advanced analytics | | | Mar. '12 | N.A. | Accelerate development of Big Data analytic applications
Advanced analytics | | | Mar. '11 | N.A. | Enhance real-time business analytics for Big Data
N.A. is not available. Source: Industry reporting; CRISIL GR&A analysis
1. Retail: Sears is leveraging Big Data analytics internally and is also keen on offering analytics services externally
Sears Holding is a leading integrated retailer with ~4,000 full-line and specialty retail stores in the US and Canada. It operates through its subsidiaries including Sears, Roebuck and Co. and Kmart Corp.
Challenge/Business Need
IT need
• Manage increasing volumes of data, such as customer personal information, PoS data and online purchases, which pose a challenge
• Capacity run-out on its mainframe, with additional capacity proving to be expensive
Business need
• The need to set prices quickly and in real time
• The need to drive customer loyalty
Solution
• Leverages its global in-house center in Pune, India for Big Data analytics
• Implemented a Big Data architecture using Hadoop
• Used MapReduce algorithms to analyze data on individual customer activity across all 4,000 locations and feed results back into the mainframe
Benefits
Across IT environment
• Utilization of 100% of collected data, against 10% utilization earlier
• Ability to run price elasticity algorithms in one week, as opposed to eight weeks previously
• Cost savings of USD 600,000 per year
Across business
• More relevant and personalized customer communications and offers to an active customer base (~80 million)
• Increased shopping and higher spend per transaction by active members
Looking at the current and potential benefits of Big Data analytics, Sears aims to expand into newer areas and sell its data management and analytics services technology to other companies through its subsidiary MetaScale
*Massively Parallel Processing; Source: Industry reporting; CRISIL GR&A analysis
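The MapReduce pattern mentioned in the Sears solution can be illustrated with a small, single-machine sketch. It is not Sears's implementation and does not use Hadoop: the record layout, the sample transactions and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are all hypothetical, chosen only to show how per-customer activity could be aggregated in map, shuffle and reduce steps.

```python
from collections import defaultdict

# Hypothetical point-of-sale records: (customer_id, store_id, amount_usd).
TRANSACTIONS = [
    ("c1", "store_001", 20.0),
    ("c2", "store_002", 35.5),
    ("c1", "store_003", 15.0),
    ("c3", "store_001", 8.25),
    ("c2", "store_001", 4.5),
]

def map_phase(record):
    """Map step: emit one (key, value) pair per input record."""
    customer_id, _store_id, amount = record
    yield customer_id, amount

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)

def run_job(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

SPEND_BY_CUSTOMER = run_job(TRANSACTIONS)
print(SPEND_BY_CUSTOMER)
```

In a real Hadoop deployment the map and reduce steps would run in parallel across many nodes and the framework would handle the shuffle; the sketch only shows the shape of the computation.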
2. Financial Services: Witnessing increased adoption of Big Data analytics, to reduce risk and uncover new market opportunities
• The need to meet growing regulatory compliance requirements, detect fraud and create new market opportunities is driving the growth of Big Data analytics in the financial services sector
• Key Big Data sources include customer & transaction data from multiple channels like branches, kiosks, mobile and web; social media; emails; credit card data; insurance claims data; stock market data; statistical data; PDF & Excel files; videos; and government filings
Big Data applications across Financial Services sub-sectors
Sub-sectors: Banking, Capital Markets/Trading, Insurance
Applications shown: trading surveillance; intraday analysis; trading pattern analysis; credit line optimization; credit reward program analysis; pre-trade decision support analytics; fraud detection; portfolio analytics; compliance & regulatory reporting; CRM; entering new markets; predicting client longevity, along with analyzing prospective clients’ medical status; using weather and calamity information for managing exposures and losses; risk management/assessment
Source: Industry reporting; CRISIL GR&A analysis
Potential Shortfall of 1.5 million Data-Savvy Managers and ~150,000 Data Scientists in the US in 2018
Data Scientists
• Role in ecosystem: Big Data analytics, business intelligence, visualization
• Requisite educational qualifications: Advanced degree like M.S. or Ph.D. in mathematics, statistics, economics, computer science or any decision sciences
• Other expertise: Expertise in data analytics skills to extract data and use of modeling & simulations; multi-disciplinary knowledge of business to find insights
Data-savvy Managers
• Role in ecosystem: Project management across the Big Data ecosystem (consulting services, implementation, infrastructure management, analytics)
• Requisite educational qualifications: Advanced business degree such as MBA, M.S. or managerial diplomas
• Other expertise: Knowledge of statistics and/or machine learning to frame key questions and analyze answers; conceptual knowledge of business to interpret and challenge the insights; ability to make decisions using Big Data insights
Technical Engineers
• Role in ecosystem: Technical support in hardware & software across the Big Data ecosystem for data architecture, data administration, developer environment and applications
• Requisite educational qualifications: Degree in computer science, information technology, systems engineering or related disciplines
• Other expertise: Data management knowledge; IT skills to develop, implement, and maintain hardware and software
Demand-supply gap for data scientists* in US, 2018
• Supply: 300K
• Demand: 440K-490K
• Gap: 140K-190K (50%-60% gap relative to supply)
Demand-supply gap for data-savvy managers** in US, 2018
• Supply: 2.5 million
• Demand: 4.0 million
• Gap: 1.5 million (60% gap relative to supply)
*Analysts with deep analytical training; **Managers to analyze Big Data and make decisions based on their findings; Source: McKinsey Global Institute; CRISIL GR&A analysis
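The gap figures on this slide follow from simple arithmetic on supply and demand. The sketch below reproduces that arithmetic as an illustration; the function name `gap_relative_to_supply` is hypothetical, and reading 300K and 2.5 million as supply and 440K-490K and 4.0 million as demand is an assumption based on how the chart values add up.

```python
def gap_relative_to_supply(supply, demand):
    """Return (absolute gap, gap as a fraction of supply)."""
    gap = demand - supply
    return gap, gap / supply

# Data scientists, in thousands (assumed reading of the chart labels).
DS_SUPPLY = 300
DS_DEMAND_LOW, DS_DEMAND_HIGH = 440, 490

# Data-savvy managers, in millions (assumed reading of the chart labels).
MGR_SUPPLY, MGR_DEMAND = 2.5, 4.0

ds_gap_low, ds_ratio_low = gap_relative_to_supply(DS_SUPPLY, DS_DEMAND_LOW)
ds_gap_high, ds_ratio_high = gap_relative_to_supply(DS_SUPPLY, DS_DEMAND_HIGH)
mgr_gap, mgr_ratio = gap_relative_to_supply(MGR_SUPPLY, MGR_DEMAND)

print(f"Data scientists gap: {ds_gap_low}K-{ds_gap_high}K "
      f"({ds_ratio_low:.0%}-{ds_ratio_high:.0%} of supply)")
print(f"Data-savvy managers gap: {mgr_gap} million ({mgr_ratio:.0%} of supply)")
```

The manager gap comes out at 60% of supply, as stated on the slide. For data scientists the computed range is about 47% to 63% of supply, which is broadly in line with the 50%-60% quoted.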
The Big Data Opportunity
• Forecasted market size
• Future outlook
Global Big Data market to reach ~USD 25 billion by 2015, with a 45% share of IT & IT-enabled services
Global Big Data Market Size, 2011 – 2015E (US$ billion)
Values shown: 5.3-5.6; 8.0-8.5; 25.0-26.0
• The global Big Data market is expected to grow at a CAGR of about 46% over 2012-2015
• IT & ITES, including analytics, is expected to grow the fastest, at a rate of more than 60%
– Its share in the total Big Data market is expected to increase to ~45% in 2015 from ~31% in 2011
• The USD 25 billion opportunity represents the initial wave of the opportunity. This opportunity is set to expand even more rapidly after 2015, given the pace at which data is being generated.
Global Big Data Market Size, 2015F: ~US$25 billion
Segments: Big Data analytics & IT & IT-enabled services; Software; Hardware
Segment values shown: US$ 6-6.5 billion; US$ 7-7.5 billion; US$ 10-11 billion
The lion’s share of the Big Data hardware and software market is expected to be occupied by IT giants like IBM, HP, Microsoft, SAP, SAS, Oracle, etc.
The opportunity for India lies in capturing the slice of IT services that includes Big Data analytics and IT & IT-enabled services
Source: Industry reporting; CRISIL GR&A analysis
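As a rough cross-check of the growth rate and the services share, the slide's figures can be plugged into the usual formulas. This is a sketch under stated assumptions: it assumes the 8.0-8.5 value is the 2012 market size and 25.0-26.0 is the 2015 estimate, and the names `cagr`, `MARKET_2012`, `MARKET_2015` and `SERVICES_2015` are hypothetical labels introduced for the illustration.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values that are `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# US$ billion, (low, high) ranges. The year assignment is an assumption.
MARKET_2012 = (8.0, 8.5)
MARKET_2015 = (25.0, 26.0)
SERVICES_2015 = (10.0, 11.0)  # IT & IT-enabled services, per the takeaways

growth_low_ends = cagr(MARKET_2012[0], MARKET_2015[0], 2015 - 2012)
growth_high_ends = cagr(MARKET_2012[1], MARKET_2015[1], 2015 - 2012)

# Services share of the ~US$25 billion total quoted for 2015.
share_low = SERVICES_2015[0] / 25.0
share_high = SERVICES_2015[1] / 25.0

print(f"CAGR 2012-2015: {growth_high_ends:.0%} to {growth_low_ends:.0%}")
print(f"Services share in 2015: {share_low:.0%} to {share_high:.0%}")
```

Under these assumptions the growth rate lands at roughly 45% to 46% per year, in line with the quoted CAGR, and the services share at about 40% to 44%, slightly under the ~45% headline.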
Conclusion
• The Big Data market opportunity is expected to witness strong growth in the next 5 years
– Expected to touch US$25 billion globally; the ‘BIG’ opportunity for India lies in the IT & IT-enabled Services space, which is likely to be a ~US$ 10-11 billion market globally in 2015
– Data-related regulations like Dodd-Frank and Basel III to impact Big Data implementations
• New database architectures and innovative analytics tools & techniques to facilitate Big Data implementations
• Key verticals driving demand for Big Data analytics: Financial services, Retail, Telecom, Healthcare and Manufacturing
• Key risk – potential shortfall of 1.5 million Data-Savvy Managers and 140,000-190,000 Data Scientists in the US by 2018
Source: CRISIL GR&A analysis