Hadoopable Problems and Hadoop Application Challenges
Dr. Tariq Mahmood
Chief Data Scientist, NexDegree Pvt. Ltd.
Email: [email protected]
[Professor at PAF-KIET, Karachi]
Agenda
Enterprise Hadoop Architecture: Decisions and Process
6 Common Hadoopable Problems
Hadoop Application Challenges
HADOOPING PROCESS
ZERO IN
REVELATIONS
Distinguish Big Data from Small Data
Specify Big Data Analytical Requirements
From Requirements to MapReduce Code
Develop Integrated Hadoop Base
Generate Smart Data for Big Data
Generate Smart Data for Small Data
VISUALIFY
SMART DATA
Credit Card Late Payment Risk
Small Data Analytics:
Significant segments of Credit Card Customers
Risk of Late Payment for each Segment
Accuracy of Risk Prediction for each Segment
Big Data Analytics with Hadoop:
Discover Segments in Real-Time
Discover Late Payment Risk of each Segment in Real-Time
Discover Per Segment Accuracy in Real-Time
Apply Risk-Aversion Policies Per Segment in Real-Time
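As a rough illustration of how "Risk of Late Payment for each Segment" might be phrased in MapReduce terms, here is a minimal single-process Python sketch. The record fields (`segment`, `paid_late`), the function names, and the sample data are all hypothetical. A real Hadoop job would run the map and reduce phases across a cluster rather than in one process.

```python
from collections import defaultdict

def map_payment(record):
    """Map phase: emit (segment, (late_flag, 1)) for one payment record."""
    return record["segment"], (1 if record["paid_late"] else 0, 1)

def reduce_segment(values):
    """Reduce phase: late-payment rate = late payments / all payments."""
    late = sum(v[0] for v in values)
    total = sum(v[1] for v in values)
    return late / total

def late_payment_risk(records):
    """Group mapped pairs by segment (the shuffle step), then reduce each group."""
    grouped = defaultdict(list)
    for record in records:
        key, value = map_payment(record)
        grouped[key].append(value)
    return {segment: reduce_segment(vals) for segment, vals in grouped.items()}

# Made-up sample data, for illustration only.
sample = [
    {"segment": "students", "paid_late": True},
    {"segment": "students", "paid_late": False},
    {"segment": "retirees", "paid_late": False},
    {"segment": "retirees", "paid_late": False},
]
risk = late_payment_risk(sample)
```

Emitting `(late_flag, 1)` pairs keeps the reduce step a plain sum, so the same counts could also be pre-aggregated per node before the final division.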
Credit Card Late Payment Risk
Card Activation
Authorized Name Match
CVV Validation
Fraud Risk Identification Services
Internal Fraud Monitoring
FICO Card Usage Anomaly Prediction
Customer Churn Analysis
Discover Customer Segments in Real-Time
Discover Churn Rate of Each Segment in Real-Time
Dynamic Customer Retention Policies per Segment
Product Recommendations
Discover Product Preferences Per Segment in Real-Time
Apply Personalization Strategies Per Segment Dynamically
Ad Targeting
Discover Web Usage Behavior Per Segment in Real-Time
Apply Ad Targeting Policies Per Segment in Real-Time
Mmm… I could go for some Pizza Tonight
Plan to Shop for Clothes this Weekend… Wanna Join?
Going Berserk over my new iPad
Going on a Long Drive to Uncle John’s this Friday
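The sample posts above hint at how web usage could drive ad targeting. The following is a minimal sketch, assuming a simple keyword lookup. The `AD_KEYWORDS` table, the category names, and the helper functions are hypothetical, and a production system would more likely use a trained text classifier over a much larger stream.

```python
# Hypothetical keyword-to-ad-category table, for illustration only.
AD_KEYWORDS = {
    "food": {"pizza"},
    "apparel": {"clothes", "shop"},
    "electronics": {"ipad"},
    "travel": {"drive"},
}

def tokenize(post):
    """Lowercase each word and strip punctuation from its ends."""
    return {word.strip(".,!?…'’\"").lower() for word in post.split()}

def ad_categories(post):
    """Return the sorted ad categories whose keywords appear in the post."""
    words = tokenize(post)
    return sorted(cat for cat, keys in AD_KEYWORDS.items() if words & keys)

posts = [
    "Mmm… I could go for some Pizza Tonight",
    "Plan to Shop for Clothes this Weekend… Wanna Join?",
    "Going Berserk over my new iPad",
    "Going on a Long Drive to Uncle John’s this Friday",
]
targets = [ad_categories(p) for p in posts]
```

Each post is matched independently of the others, so this step would fit naturally into a map phase, with a later reduce phase counting categories per customer segment.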
POS Transaction Analysis
Discover Customer Segments in Real-Time
Discover Shopping Patterns per Segment in Real-Time
Dynamically Manage Promotion Policies per Segment
Real-Time Inventory Control
Real-Time Purchasing
Real-Time Warehouse Stock Transfers
Real-Time Cash Management
BOUTIQUE
FASHION STORE
SPORTING GOODS
APPAREL STORE
CLOTHING STORE
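To make "Real-Time Inventory Control" concrete, the sketch below applies a stream of POS transactions to per-store stock levels and flags items that fall to a reorder level. The threshold, the `(store, item, quantity)` tuple layout, and the sample figures are assumptions made for illustration, not values from the talk.

```python
REORDER_LEVEL = 5  # hypothetical threshold, in units per store and item

def apply_sales(stock, transactions, reorder_level=REORDER_LEVEL):
    """Decrement stock for each POS transaction and collect reorder alerts."""
    alerts = []
    for store, item, qty in transactions:
        key = (store, item)
        stock[key] -= qty
        # Flag each (store, item) at most once when it reaches the threshold.
        if stock[key] <= reorder_level and key not in alerts:
            alerts.append(key)
    return alerts

# Made-up stock levels and sales, for illustration only.
stock = {("boutique", "scarf"): 8, ("sporting goods", "ball"): 20}
transactions = [
    ("boutique", "scarf", 2),
    ("sporting goods", "ball", 3),
    ("boutique", "scarf", 2),
]
alerts = apply_sales(stock, transactions)
```

The resulting alerts could then feed the purchasing and warehouse stock transfer decisions listed above.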
Why Is Hadoop Challenging?
Hadoop Continues to Evolve
Lack of a standardized implementation infrastructure – too much breadth
A Huge Clash of Technologies – A Big Muddle!
Business Intelligence, Statistics, Data Mining, Machine Learning, Analytics, Data Warehousing, Distributed Computing (Hadoop), Cloud, Computer Visualization, Natural Language Processing
Lack of Relevant Talent to Harness Big Data Technologies
List of Challenges
Stream Analysis: Develop, Drill and Standardize
Difficult to Standardize Adaptation to the 3 Vs (Volume, Velocity, Variety)
Adapting MapReduce dynamics and Hardware to generate Smart Data – clusters, configurations
Too Much Focus on Big Players
Full Resource Optimization not Guaranteed
Big Data – Effective Game Plan
Conviction in Mind
Analyses Required – Think Small for Big Data and Don’t Expect Too Much
Hire the Competence – rigorous process
Focus on Data Revelations for Some Time
To Hadoop or not to Hadoop? To Cloud or not to Cloud? – Technology Compromise
Ensure Smart Data Validity
Merge Infographics with Dashboards
Be on your Guard all the time