Accelerating Big Data with ioMemory, Cisco UCS, and NoSQL
DESCRIPTION
When great companies work together, an even greater outcome is possible. I am presenting this at Oracle OpenWorld 2012 in the Cisco theater. Could one possibly support a Twitter-like workload with just one server and a few ioDrives? It's all here.
TRANSCRIPT
ACCELERATING BIG DATA: IOMEMORY, CISCO UCS AND NOSQL
Ashok Joshi, Senior Director – Oracle NoSQL development, Oracle
Sumeet Bansal, Principal Solutions Architect, Fusion-io
AGENDA
▸ Big Data overview
▸ Oracle NoSQL Database overview
▸ Real-time big data management – a business perspective
▸ NoSQL testing with YCSB
▸ The Fusion-io value
October 1, 2012 2
VOLUME, VELOCITY, VARIETY, VALUE
Big Data characteristics
▸ Terabytes, petabytes
▸ Multiple sources for data
▸ Text, images, XML, JSON, sensor readings…
▸ Not “master” data, but important for business
▸ “Real-time” needs
WHO USES BIG DATA?
▸ Web Services
• Clickstream analysis
• Abuse prevention
▸ Government
• Regulatory compliance
• Environmental monitoring
• Cyber security
▸ Large-scale E-commerce
• Recommendation engines
• Cross-channel analytics
• Golden path to purchase
▸ Big Energy
• Granular rate plans
• Grid management
▸ Financial Services
• Customer loyalty
• Risk
• Trading
• Fraud
• Compliance
• Credit scoring
▸ Telco
• Churn reduction
• Network optimization
▸ Storage
• Cost efficient
• Analytics-ready
• Data store
• Scalable
• Distributed
BIG DATA BUSINESS BENEFITS
▸ US health care: $300 B increase in industry value per year
▸ US retail: 60+% increase in net margin
▸ Manufacturing: 50% decrease in development and assembly costs
▸ Global personal location data: $100 B increase in service provider revenue
▸ Europe public sector administration: €250 B increase in industry value per year
“In a big data world, a competitor that fails to
sufficiently develop its capabilities will be left behind.”
NOSQL DATABASE ARCHITECTURE
Highlights
▸ Available; scalable; fast
▸ Simple administration, key-value data model; transaction support
▸ Transparent load balancing; elastic
▸ Commercial grade software and support
▸ Integrated with related Oracle technologies
[Diagram: applications, each with a NoSQL Database driver, in front of groups of storage nodes]
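The slide describes a key-value data model in which a driver inside each application spreads requests across storage nodes. A minimal sketch of how such routing can work is shown here, assuming a simple hash-to-partition, partition-to-shard scheme. The function names, the partition count, and the choice of SHA-256 are illustrative assumptions, not the Oracle NoSQL Database driver API or its actual hashing scheme.

```python
import hashlib

NUM_PARTITIONS = 300  # hypothetical value; assumed fixed when the store is created
NUM_SHARDS = 10       # matches the largest configuration tested in this deck


def partition_for_key(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash the key to a stable partition number (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def shard_for_key(key: str, num_partitions: int = NUM_PARTITIONS,
                  num_shards: int = NUM_SHARDS) -> int:
    """Map the key's partition onto a shard, i.e. a group of storage nodes."""
    return partition_for_key(key, num_partitions) % num_shards


if __name__ == "__main__":
    for key in ("user/1001", "user/1002", "tweet/98765"):
        print(key, "-> shard", shard_for_key(key))
```

Because the mapping depends only on the key, every client computes the same shard for the same key without consulting a central coordinator, which is one way load balancing can stay transparent to the application.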
NOSQL DB AND CISCO UCS COLLABORATION: WHY DOES IT MATTER?
▸ Many components: network, processors, memory, software, storage – tested, tuned and optimized
▸ Business can focus on core competency and leveraging benefits of big data
NOSQL DB AND FUSION-IO COLLABORATION: WHY DOES IT MATTER?
▸ Speed (latency) is critical
• Amazon study: every 100-millisecond increase in latency costs 1% in sales
http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it
▸ ioMemory enables consistent, extremely low latency and extreme throughput
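To make the Amazon figure concrete, the following sketch applies it as a simple linear rule: 1% of sales lost per 100 ms of added latency. The linear extrapolation, the function name, and the example sales figure are assumptions for illustration only; the cited study is not claimed to hold outside the range it measured.

```python
def estimated_sales_loss(added_latency_ms: float, annual_sales: float,
                         loss_per_100ms: float = 0.01) -> float:
    """Estimate lost sales, assuming loss scales linearly with added latency."""
    return (added_latency_ms / 100.0) * loss_per_100ms * annual_sales


if __name__ == "__main__":
    # Hypothetical business with $10M in annual sales and 250 ms of extra latency.
    print(f"${estimated_sales_loss(250, 10_000_000):,.0f}")
```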
ORACLE NOSQL DB, CISCO UCS, FUSION-IO
▸ Commercial grade solution and support
▸ Tested, tuned, optimized for real-time data management
▸ For equivalent performance, much lower CapEx and OpEx compared to commodity (DIY) solutions
SYSTEM UNDER TEST
▸ YCSB (Yahoo! Cloud Serving Benchmark)
• 10 client machines generate load
• Mixed workload (5% updates/95% reads)
▸ 15 UCS C240 M3 Rack Servers
▸ 30 Fusion-io ioDrive2 devices
▸ 2 TB of data
CONFIGURATION DETAILS
[Diagram legend: NoSQL Database on ioDrive2; UCS C240 M3 server; client machine (YCSB driver)]
Note: only two shards are shown in the illustration
PERFORMANCE TEST RESULTS
Number of shards                                          2          4          8          10
Mixed workload (95% read/5% write) throughput (ops/sec)   302,152    558,569    1,028,868  1,244,550
Read latency (milliseconds)                               0.76       0.79       0.85       0.88
Mixed workload update latency (milliseconds)              3.08       3.82       4.29       4.47
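The results above can be checked for how close to linear the scaling is. The sketch below uses only the figures from the table and derives per-shard throughput, scaling efficiency relative to the 2-shard run, and a blended latency weighted by the 95/5 operation mix. The function names and the choice of the 2-shard run as the baseline are assumptions made for this illustration.

```python
SHARDS = [2, 4, 8, 10]
THROUGHPUT_OPS = [302_152, 558_569, 1_028_868, 1_244_550]
READ_LATENCY_MS = [0.76, 0.79, 0.85, 0.88]
UPDATE_LATENCY_MS = [3.08, 3.82, 4.29, 4.47]
READ_FRACTION = 0.95  # 95% reads, 5% updates


def per_shard_throughput(i: int) -> float:
    """Throughput divided by the number of shards for run i."""
    return THROUGHPUT_OPS[i] / SHARDS[i]


def scaling_efficiency(i: int) -> float:
    """Per-shard throughput of run i relative to the 2-shard baseline."""
    return per_shard_throughput(i) / per_shard_throughput(0)


def blended_latency_ms(i: int) -> float:
    """Average latency weighted by the read/update mix."""
    return (READ_FRACTION * READ_LATENCY_MS[i]
            + (1 - READ_FRACTION) * UPDATE_LATENCY_MS[i])


if __name__ == "__main__":
    for i, shards in enumerate(SHARDS):
        print(f"{shards:>2} shards: {per_shard_throughput(i):>10,.0f} ops/sec per shard, "
              f"efficiency {scaling_efficiency(i):.1%}, "
              f"blended latency {blended_latency_ms(i):.2f} ms")
```

On these numbers, going from 2 to 10 shards keeps roughly 82% of the per-shard throughput while the blended latency stays near 1 ms.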
PUTTING PERFORMANCE IN CONTEXT
Highlights
▸ For example, Twitter: ~150K API calls/sec
▸ We can achieve that performance on a single UCS C240 server using two ioDrive2s
▸ Plenty of capacity to handle fluctuating demand without compromising performance
http://blog.programmableweb.com/2011/05/25/who-belongs-to-the-api-billionaires-club/
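As a rough capacity-planning illustration, the sketch below estimates how many shards a target request rate would need, using the per-shard throughput from the 10-shard test run (1,244,550 ops/sec over 10 shards) as a conservative figure. Treating one API call as one key-value operation, the function name, and the 3x peak factor are assumptions; the deck does not publish a measured single-server number.

```python
import math

# Per-shard throughput from the 10-shard run reported in the test results.
PER_SHARD_OPS = 1_244_550 / 10


def shards_needed(target_ops_per_sec: float,
                  per_shard_ops: float = PER_SHARD_OPS) -> int:
    """Smallest whole number of shards whose combined throughput meets the target."""
    return math.ceil(target_ops_per_sec / per_shard_ops)


if __name__ == "__main__":
    steady = 150_000     # the ~150K calls/sec figure quoted for Twitter
    peak = 3 * steady    # hypothetical 3x burst
    print("steady:", shards_needed(steady), "shards")
    print("peak:  ", shards_needed(peak), "shards")
```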
CUT-THROUGH ARCHITECTURE AND VSL FOR EXTREME THROUGHPUT AND LOW LATENCY
▸ Sophisticated architecture
• maximum performance
▸ Intelligent software
• advanced features
[Diagram labels: Host; CPU and cores; Applications/Databases; File System; Kernel; Virtual Storage Layer (VSL); DRAM / Memory / Operating System and Application Memory; ioMemory Virtualization Tables; Commands; Data Transfers; PCIe; ioDrive; ioMemory Data-Path Controller; Channels; Wide; Banks]
SOFTWARE DEVELOPMENT KIT ADVANTAGES FOR GREATER PERFORMANCE OPTIMIZATION
[Diagram labels: Application; Application source code; Transactional Block; Native File; Logging; Key-Value Pair; Auto-Commit Memory™; Simple Block; Network File; Traditional Storage; Proprietary Storage OS; Storage Media; Native Flash Translation Layer; Software Defined Storage; Conventional access; Memory access; Direct access I/O]
TYPES OF IOMEMORY
For Cisco UCS C-Series Rack Servers For Cisco UCS B-Series Blade Servers
Mezzanine Card
365 GB, 785 GB
365 GB, 785 GB, 1.2 TB
2.4 TB
BIG DATA ANALYTICS - HADOOP
COMPLETE BIG DATA SOLUTION
▸ Right Partners – Cisco and Fusion-io
▸ Multiple technologies for a comprehensive big data solution – NoSQL, map-reduce, relational
▸ Tested, integrated, optimized, commercially supported solution delivered by leaders
▸ Cost-effective, reliable, ready for the enterprise
THANK YOU