InfoSphere Streams
Jaskiran Bhatia, Country Manager – Information Management
Rules! Mining! Custom Analytics!
... in microseconds
& these opportunities are everywhere
Stock market
Impact of weather on securities prices
Analyze market data at ultra-low latencies
Transportation
Intelligent traffic Management
Health & Life Sciences
Neonatal ICU monitoring
Epidemic early warning system
Remote healthcare monitoring
e-Science
Space weather prediction
Detection of transient events
Synchrotron atomic research
Telephony
CDR processing
Social analysis
Churn prediction
Geomapping
Law Enforcement, Defense & Cyber Security
Real-time multimodal surveillance
Situational awareness
Cyber security detection
Traffic Control System in City of Stockholm
• Data sources
– GPS from 1000s of taxis
– Loop sensors
  • Speed of traffic
  • Flow: density of traffic (cars per second)
– CCTV video inside tunnels
– Real-time weather data
• Output
– Travel time forecasts
  • Via SMS
  • Now, in 30 minutes, 1 hour, 2 hours, etc.
• Integrate with existing system
Traffic Management for Sustainability and Efficiency
• Multimodal Data Streams
– GPS
– Cell-phones (location tracking)
– Public Transport (bus, docking)
– Pollution measurements
– Weather Conditions (including road conditions)
– Optical traffic flow detectors
– Travel time data based on plate recognition
– Induction loop detector data
– Accidents in network as they are being recorded
– Road closures (road work, etc)
– Still pictures from road cameras
• Real Time Traffic Monitoring & Information
• (Multimodal) Travel Planner
[Diagram components: GPS data; Streams; real-time transformation logic; real-time geo mapping; real-time speed & heading estimation; real-time aggregates & statistics; storage adapters; data warehouse; web server; Google Earth; offline statistical analysis; interactive visualization]
Only 4 x86 Blade servers to process 250,000 GPS probes per second, maps of 630,000 line segments
Real Time Geo Mapping & Speed Estimation
[Figure labels: GPS probe; matching map artifact; estimated path; estimated speed & heading]
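As a rough illustration of what speed and heading estimation from GPS probes involves, the sketch below derives both values from two consecutive probes of one vehicle. It is not the Streams application from the slide: the function names, the `(timestamp, lat, lon)` tuple layout and the spherical-Earth approximation are all assumptions made for this example, and map matching against line segments is left out.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing (0 = north, 90 = east) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def speed_and_heading(prev, curr):
    """prev and curr are hypothetical probes: (timestamp_s, lat_deg, lon_deg).

    Returns (speed in m/s, heading in degrees), or None when the probes
    are out of order or share a timestamp.
    """
    t0, lat0, lon0 = prev
    t1, lat1, lon1 = curr
    dt = t1 - t0
    if dt <= 0:
        return None
    distance = haversine_m(lat0, lon0, lat1, lon1)
    return distance / dt, initial_bearing_deg(lat0, lon0, lat1, lon1)


if __name__ == "__main__":
    # Two probes ten seconds apart, moving due north.
    print(speed_and_heading((0, 59.000, 18.0), (10, 59.001, 18.0)))
```

In a streaming setting the same calculation would run per vehicle, keeping only the last probe as state, which is what makes it cheap enough to apply to every incoming probe.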
Recommendations Based on Hurricane Forecast
Web Zero platform
Capture weather sensor data, analyze predicted hurricane path
Estimate impact on portfolios
Compute portfolio market indicators (low latency)
Make recommendations and notify
Capture market data (high volume)
System S platform
DHTML result rendering
Real-time projections of hurricane path
Dynamically updated risk assessment for assets in projected path
Correlate combined risk and trade VWAP to determine buy/sell recommendations
‘World’s fastest’ options trading prototype
• Identify and execute trades
• Process over 5M events per second with average latency of 150 microseconds
• Expand to incorporate content feeds, news text, audio, video, to establish greater context for better decisions
CIO, TD Bank: "TD Bank Financial Group worked with IBM Research to develop a first-of-a-kind architecture capable of consuming, analyzing and acting on real-time market data while maintaining sub-millisecond response times even under extreme data loads"
Equities Trading “Starter Application”
Modular design
Components are plug-replaceable: extend these or substitute your own
Demonstrates how trading strategies may be swapped out at runtime, without stopping the rest of the application
TradingStrategy: module looks for opportunities that have specific quality values and trends
OpportunityFinder: module looks for opportunities and computes quality metrics
SimpleVWAPCalculator: module computes a running volume-weighted average price metric
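A running volume-weighted average price can be maintained with two accumulators, as in the minimal sketch below. The slide only names the SimpleVWAPCalculator module; this Python class, its name and its method are hypothetical stand-ins written for illustration and do not reproduce the Streams code.

```python
class RunningVWAP:
    """Keeps a running VWAP = sum(price * volume) / sum(volume) over all trades seen."""

    def __init__(self):
        self.notional = 0.0  # running sum of price * volume
        self.volume = 0      # running sum of traded volume

    def update(self, price, volume):
        """Fold one trade into the running totals and return the current VWAP."""
        if volume <= 0:
            raise ValueError("trade volume must be positive")
        self.notional += price * volume
        self.volume += volume
        return self.notional / self.volume


if __name__ == "__main__":
    vwap = RunningVWAP()
    for price, volume in [(10.0, 100), (11.0, 300)]:
        print(vwap.update(price, volume))
```

Because each update touches only two numbers, the per-trade cost is constant. A windowed variant would also subtract trades as they leave the window.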
Predictive Analytics using InfoSphere Streams in a neonatal ICU helps detect life-threatening conditions up to 24 hours earlier
• Real-time analytics and correlations on physiological data streams
– Blood pressure, temperature, EKG, blood oxygen saturation, etc.
• Early detection of the onset of potentially life-threatening conditions
– Up to 24 hours earlier than current medical practices
– Early intervention leads to lower patient morbidity and better long-term outcomes
• Technology also enables physicians to verify new clinical hypotheses
Law Enforcement and Security – Federal Government
• Streams of information including video surveillance, wire taps, communications, call records, etc.
• Millions of streams per second with low density of critical data
• Identify patterns and relationships among vast information sources
"The US Government has been working with IBM Research since 2003 on a radical new approach to data analysis that enables high speed, scalable and complex analytics of heterogeneous data streams in motion. The project has been so successful that US Government will deploy additional installations to enable other agencies to achieve greater success in various future projects" - US Government
SPSS Modeler to Build Model, Streams to Detect Quickly
Characterization of ‘Motive’
Build rulesets (‘profiles’) of various cause categories
Utilizing crime scene information such as . . .
Crime reports when entered
REAL-TIME ANALYTIC PROCESSING
Data stored for future auditing and evidence requirements
Data from 911 calls, satellite feeds, imagery from city traffic cameras
Streams determines the geospatial location of the call by running analytics in real time over a satellite communication link, and draws in city camera feeds from the surrounding area
Real time support for 911 dispatcher and field personnel
Government and Law Enforcement: e911 Support
Sharpe Engineering and US Navy Phase 1 SBIR Research
Navy Sensors
Advanced Analytics
InfoSphere Streams
• Identify/build sample problem
• Preliminary sizing
• Est. development, deployment and operation costs
• Possible use cases
– Maritime commerce & anti-piracy
– Unmanned surface vehicles
– Disaster relief
– Cyber security
+ + =
Sharpe Engineering
Command & Control
Temporal Anomalies, Event / Destination Correlations, Partial periodicity
From Data Analytics To Smarter Cyber Security
IDS
Human annotations
InfoSphere Warehouse
Firewall
IPS/ADS
Sensors
DNS
ID & NAC
App/DB
Live Data
Logs
Unsupervised Supervised
Channel Profiles
Botnet Models
ADS Models
Statistical Models
Security Events
Event Normalization
Historical data summaries & evidence
Data Repository: Log aggregation / normalization
Temporal Analysis
Security Analytics, e.g., Botnet Analytics
Entity Analytics (GNR, ...)
Self-tuning
Feedback
Manual Tuning
True Positives/False Positives
Forecasting Space Weather at LOFAR Outrigger in Scandinavia (LOIS)
Triaxial Antenna InfoSphere Streams
Radio signal input and data preparation
Signal detection and noise filtering
Strength and 3D directional analysis
Swedish Institute of Space Physics
Solar Flares
Space weather prediction regarding impact on satellites and electric grids
+ + =
Telco Moving to Agile, Real-time Processes & Analytics
Information Management Evolution
[Chart axes: Business Value; timeline 2006, 2007/8, 2010]
Corporate Visibility
Data Infrastructure Optimization
Marketing Campaign & Service Analytics
Large Indian wireless telco with 100+ million customers and >10% annual growth, expanding operations abroad and growing to provide real-time services and 3G capabilities to customers.
Reduce Complexity / Manage Risk / Reduce Cost
• Reduce data latency from 6-12 hours to seconds
• Improvement in data processing throughput
• Implemented fault-tolerant and flexible solution
• Consolidated existing integration systems by 50%
• Streamlined development & maintenance of data services
Single, real-time data feed for Fraud, BI & Revenue Assurance systems
Enterprise Data Warehouse, Data Marts & Reporting
Cross-sell/Up-sell, Reduced Activation Time
Operational Efficiency
Customer Analytics and Business Process Management
Real-time CDR processing
Streams provides tight integration with existing Information / Analytics Infrastructure for Call Detail Record Processing
Cognos
Spreadsheets
Applications
Info Server
Data Marts
SOA Web Service
Fin Planning
Mashups
InfoSphere Warehouse
InfoSphere Streams
DB2
• ERP, CRM and Other Data Sources
Real Time User Analytics
Analytic Models
Pre-processed Data
CDRs
Real-Time Charging System: Telco in Japan
1. The Pain Point
Real-time charging can increase revenue/profit. But:
• 20 million users
• Dramatic call and IP traffic growth
• SMS grows at 30% annually
Even hourly summaries were hard to do
2. The Solution
InfoSphere Streams with solidDB processes:
• 55K CDRs per second on 1 octo-core node
  • 10 million in 200 seconds
• 160K CDRs per second on 3 octo-core nodes
  • 10 million in 60 seconds
• Nearly linear growth
• Architectural pattern to prevent data loss
• Demonstrated high software productivity
3. The Happy Ending
Telco is extending the pilot; in production in 1H 2011
Platform to create new real-time billing system
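The throughput figures on this slide can be sanity-checked with simple arithmetic. The sketch below uses only the rates and record counts quoted above; the function names are made up for illustration, and "scaling efficiency" here is assumed to mean measured speedup divided by the increase in node count.

```python
def seconds_to_process(records, rate_per_second):
    """Time needed to process a batch at a steady rate."""
    return records / rate_per_second


def scaling_efficiency(base_rate, base_nodes, scaled_rate, scaled_nodes):
    """Measured speedup divided by the ideal (linear) speedup."""
    speedup = scaled_rate / base_rate
    ideal = scaled_nodes / base_nodes
    return speedup / ideal


if __name__ == "__main__":
    one_node = seconds_to_process(10_000_000, 55_000)
    three_nodes = seconds_to_process(10_000_000, 160_000)
    efficiency = scaling_efficiency(55_000, 1, 160_000, 3)
    print(f"1 node:  {one_node:.1f} s for 10 million CDRs")
    print(f"3 nodes: {three_nodes:.1f} s for 10 million CDRs")
    print(f"scaling efficiency: {efficiency:.0%}")
```

At the quoted rates, 10 million CDRs take about 182 seconds on one node and 62.5 seconds on three, roughly in line with the slide's rounded 200 and 60 seconds. Three nodes deliver about 2.9 times the single-node rate, which is consistent with the "nearly linear growth" claim.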