Real-Time Big Data Use Cases
John Leach, CTO, Splice Machine
2
disruptive
Distributed Computing Across Commodity Servers
Before: PhDs; Data Expensive to Store
After: Java Programmers; Data Cheap to Store
3
obstacles
MapReduce Java programmers are scarce and costly
Limited use cases because of batch nature of Hadoop
4
Moving Hadoop Beyond Batch Analytics to Power Real-Time Apps
Hadoop – not just for data scientists anymore
Distributed File System
Java MapReduce Programs
Read-Only
Batch Analytics
Real-Time Datastores
Distributed RDBMS
SQL-99 Queries
Real-Time Updates with ACID Transactions
Real-Time Apps and Analytics
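To make "Real-Time Updates with ACID Transactions" concrete, here is a minimal sketch of an all-or-nothing update. It uses Python's built-in sqlite3 module purely as a stand-in for a transactional SQL database; the accounts table and the transfer, make_db, and balances helpers are hypothetical names chosen for illustration and are not Splice Machine's API.

```python
import sqlite3

def make_db():
    """Create an in-memory database with two illustrative accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; undo everything on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {account id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A failed transfer leaves both rows exactly as they were, which is the property a read-only batch system cannot offer to an application that updates data in place.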
5
real-time Big Data use cases
Ad Technology
Digital Marketing
Fraud Detection
Internet of Things
Cyberthreat Security
Network Monitoring
Personalized Medicine
6
case study: Rocket Fuel
7
case study: digital marketing
Replaced Oracle RAC DB
Initial Results
- Powers Unica app and Cognos
- Scale-out with commodity servers
- Made queries 3x-7x faster
- Achieved over 10x price/perf improvement
Clients Consumers
Unica
Real-Time Personalization
Real-Time Actions
Cross-Channel Campaigns
Oracle
8
fraud detection
Intelligent Fraud Detection
Roadtrip to Nevada
• Start in San Francisco
1. Use credit card for gas in Sacramento
2. Use credit card in Tahoe for lunch
3. Credit card denied for gas because you left CA
4. Spend 15 minutes on phone to get credit card reinstated
Benefits
- Correlate spending patterns based on real-time movements or trips
- Move beyond simple rules
- Prevent false positives
- Catch fraud faster
- Increase customer satisfaction
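The road-trip scenario can be sketched as a comparison between a simple "left the home state" rule and a check that correlates each purchase with the cardholder's recent movements. This is only an illustration under stated assumptions: the Txn record, the coordinates and times, the third stop (Carson City, standing in for the unnamed Nevada gas station), and the 120 km/h plausibility threshold are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

@dataclass
class Txn:
    place: str
    state: str
    lat: float
    lon: float
    hour: float  # hours since midnight on the day of the trip

def km_between(a, b):
    """Great-circle (haversine) distance in kilometres."""
    dlat, dlon = radians(b.lat - a.lat), radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def simple_rule(txn, home_state="CA"):
    """The 'simple rule' from the slide: approve only inside the home state."""
    return txn.state == home_state

def trip_aware(history, txn, max_kmh=120.0):
    """Approve if the cardholder could plausibly have driven here since the last purchase."""
    if not history:
        return True
    last = history[-1]
    elapsed = txn.hour - last.hour
    return elapsed > 0 and km_between(last, txn) / elapsed <= max_kmh

# Illustrative data only (approximate coordinates, made-up times).
ROADTRIP = [
    Txn("Sacramento gas", "CA", 38.58, -121.49, 10.0),
    Txn("Tahoe lunch", "CA", 38.94, -119.98, 12.5),
    Txn("Carson City gas", "NV", 39.16, -119.77, 14.0),
]
```

Under these assumptions the simple rule declines the Nevada purchase, while the trip-aware check approves it because it is a short drive from the lunch stop. A purchase in New York one hour later would still be declined.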
9
IOT: network monitoring
Proactive Fault Response
Cable Set-Top Boxes
Telemetry Data
Scale-out RDBMS
Network Monitoring App
Remote Resets
Benefits
- Detect and isolate faults by trending real-time events
- Perform remote resets
- Increase customer satisfaction
- Reduce costly calls and “truck rolls”
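One way to read "trending real-time events" is a sliding-window count of error events per device, with a remote reset triggered once the count crosses a threshold. The sketch below assumes exactly that; the FaultTrend class, the 60-second window, and the threshold of 3 are hypothetical choices, not details from the presentation.

```python
from collections import defaultdict, deque

class FaultTrend:
    """Track recent error events per device over a sliding time window."""

    def __init__(self, window_s=60.0, threshold=3):
        self.window_s = window_s      # assumed window length in seconds
        self.threshold = threshold    # assumed error count that triggers a reset
        self.errors = defaultdict(deque)  # device_id -> timestamps of recent errors

    def observe(self, device_id, ts, is_error):
        """Record one telemetry event; return True if the device should be reset."""
        q = self.errors[device_id]
        if is_error:
            q.append(ts)
        # Drop errors that have aged out of the window.
        while q and ts - q[0] > self.window_s:
            q.popleft()
        return len(q) >= self.threshold
```

Because each event is evaluated as it arrives, a reset can be issued before the customer calls, which is where the reduction in "truck rolls" would come from.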
10
IOT: cyberthreat security
Real-Time Threat Response
Network
Firewalls
Network Events
Scale-out RDBMS
Security Monitoring App
Real-Time Responses
Benefits
- Correlate millions of events/sec against 3-5 years of firewall history to identify “sleepers” waking up
- Prevent loss of sensitive data such as credit cards
- Reduce embarrassing public exposure
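Correlating live events against history can be sketched as a lookup of each incoming source in a "last seen" index. The sketch assumes a "sleeper" is a source that appears in the firewall history but has been silent for a long time before showing up again; that definition, the find_sleepers function, and the 180-day dormancy period are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def find_sleepers(last_seen, events, dormant_for=timedelta(days=180)):
    """Return sources in `events` that reappear after a long silence.

    last_seen: dict mapping source -> datetime of its most recent historical event
               (updated in place as live events are folded in).
    events:    iterable of (source, datetime) live events, in time order.
    """
    woke = []
    for source, ts in events:
        previous = last_seen.get(source)
        if previous is not None and ts - previous >= dormant_for:
            woke.append(source)
        last_seen[source] = ts  # the live event becomes part of the history
    return woke
```

In a real deployment the index would live in the database rather than in a Python dict, so that the lookup and the update happen against the same multi-year history the slide describes.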
11
IOT: personalized medicine
Personalized Treatment Plans
Genomic Data
Electronic Medical Records (EMRs)
Medical Monitoring Devices
Scale-out RDBMS
Personalized Treatment App
Doctors
Personalized Treatment Plans
Alerts
Benefits
- Coordinate care with EMRs
- Identify complications with genetic data
- Drive real-time response with device data
- Reduce hospital readmissions
- Eliminate lost revenue under ObamaCare
12
scale up vs. scale-out
How do I scale?
Scale Up
- e.g., Exadata
- Very expensive
- Poor price/performance
Scale Out
NoSQL
- e.g., MongoDB
- Limited SQL
- No transactions
- May have weak consistency or no joins
NewSQL
Proprietary
- e.g., NuoDB
- Unproven scalability
- No Hadoop
SQL-on-Hadoop
Analytic Engines
- e.g., Impala
- No transactions
- No real-time updates
- Can’t power a real-time app
Hadoop RDBMS
- e.g., Splice Machine
- Proven scale-out architecture
- Transactional RDBMS
- Power real-time apps
13
The only Hadoop RDBMS
- Standard ANSI SQL
- Horizontal Scale-Out
- Real-Time Updates
- ACID Transactions
- Powers OLAP and OLTP
- Seamless BI Integration
Splice Machine
14
proven building blocks
SQL
Scale
≈Apache Derby
15
how we do it
16
distributed query processing
- Parallelized computation across cluster
- Moves computation to the data
- Utilizes HBase co-processors
- No MapReduce
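The idea of moving computation to the data can be sketched as follows: each partition computes a small partial aggregate over its own rows, and a coordinator only merges those partial results. This is a simplified, single-process illustration that uses threads to stand in for cluster nodes; local_aggregate and distributed_sum are hypothetical names, and nothing here uses the HBase co-processor API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def local_aggregate(region_rows):
    """Runs 'next to' one partition: SUM(amount) GROUP BY channel over local rows only."""
    totals = Counter()
    for channel, amount in region_rows:
        totals[channel] += amount
    return totals

def distributed_sum(regions):
    """Coordinator: fan out to every partition in parallel, then merge the partial results."""
    merged = Counter()
    with ThreadPoolExecutor() as pool:
        for partial in pool.map(local_aggregate, regions):
            merged.update(partial)  # adds the per-channel sums together
    return dict(merged)
```

Only the per-channel totals cross the (simulated) network, not the raw rows, which is why this style of execution can answer queries without a batch MapReduce job.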
17
summary
Distributed Computing
- Disruptive technology
- Data now cheap to store
Real-Time Use Case Types
- Port existing operational applications experiencing cost or scaling issues
- Develop new applications that can leverage historical data in real-time
Examples
- Digital marketing
- Ad Tech
- Fraud Detection
- Internet of Things
18
Questions?