geode introduction
Introduction
Apache Geode (incubating)
Swapnil Bawaskar, @sbawaskar
Agenda
• Introduction
• What?
• Who?
• Why?
• How?
• DEBS
• Roadmap
• Q&A
Introduction
A distributed, memory-based data management platform for data-oriented apps that need:
• high performance, scalability, resiliency and continuous availability
• fast access to critical data sets
• location-aware distributed data processing
• event-driven data architecture
Incubating… but rock solid
• 1000+ systems in production (real customers)
• Cutting-edge use cases
Incubating… but rock solid
• 17 billion records in memory
  • GE Power & Water's Remote Monitoring & Diagnostics Center
• 3 TB operational data in memory, 400 TB archived
  • China Railways
• 4.6 million transactions a day / 40K transactions a second
  • China Railways
• 120,000 concurrent users
  • Indian Railways
Incubating… but rock solid
• China Railway Corporation: population 1,401,586,609
• Indian Railways: population 1,251,695,616
• World: ~7,349,000,000
• Together, ~36% of the world population
[Chart: YCSB Workloads. Operations per second (0 to 1,000,000), Cassandra vs. Geode, for workloads A Reads, A Updates, B Reads, B Updates, C Reads, D Inserts, D Reads, F Reads, F Updates]
Horizontal scaling for reads, consistent latency and CPU
[Chart: speedup, latency (ms) and CPU % plotted against the number of server hosts (2 to 10)]
• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers
• Partitioned region with redundancy and 1K data size
What makes it go fast?
• Minimize copying
• Minimize contention points
• Run user code in-process
• Partitioning and parallelism
• Avoid disk seeks
• Automated benchmarks
Hands-on: Build & run
• Clone & Build
    git clone https://github.com/apache/incubator-geode
    cd incubator-geode
    ./gradlew build -Dskip.tests=true
• Start a server
    cd gemfire-assembly/build/install/apache-geode
    ./bin/gfsh
    gfsh> start locator --name=locator
    gfsh> start server --name=server
    gfsh> create region --name=myRegion --type=REPLICATE
Hands on
• Docker
    $ docker run -it apachegeode/geode:1.0.0-incubating.M1 gfsh
Concepts
• Cache
• Region
• Member
• Client Cache
• Functions
• Listeners
• High Availability
• Serialization
Concepts
• Cache
  • In-memory storage and management for your data
  • Configurable through XML, Java API or CLI
  • Collection of Regions
Concepts
• Region
  • Distributed java.util.Map on steroids (Key/Value)
  • Consistent API regardless of where or how data is stored
  • Observable (reactive)
  • Highly available, redundant on cache Member(s)
Concepts
• Region
  • Local, Replicated or Partitioned
  • In-memory or persistent
  • Redundant
  • LRU
  • Overflow
LOCAL
LOCAL_HEAP_LRU
LOCAL_OVERFLOW
LOCAL_PERSISTENT
LOCAL_PERSISTENT_OVERFLOW
PARTITION
PARTITION_HEAP_LRU
PARTITION_OVERFLOW
PARTITION_PERSISTENT
PARTITION_PERSISTENT_OVERFLOW
PARTITION_PROXY
PARTITION_PROXY_REDUNDANT
PARTITION_REDUNDANT
PARTITION_REDUNDANT_HEAP_LRU
PARTITION_REDUNDANT_OVERFLOW
PARTITION_REDUNDANT_PERSISTENT
PARTITION_REDUNDANT_PERSISTENT_OVERFLOW
REPLICATE
REPLICATE_HEAP_LRU
REPLICATE_OVERFLOW
REPLICATE_PERSISTENT
REPLICATE_PERSISTENT_OVERFLOW
REPLICATE_PROXY
Concepts
• Persistent Regions
  • Durability
  • WAL for efficient writing
  • Consistent recovery
  • Compaction
Persistence - Shared Nothing
[Diagram sequence: Server 1, Server 2 and Server 3 (in general, Server 1 to Server N) each persist their own data; buckets B1, B2 and B3 each have a Primary and a Secondary copy spread across the servers]
• Server 1 waits for others when it starts
• Fetches missed operations on restart
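The primary/secondary layout in the diagrams can be illustrated with a small sketch. This is not Geode's placement algorithm: the hash-modulo routing, the round-robin placement and the class name BucketPlacementSketch are assumptions made only to show how each bucket can end up with two copies on different servers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: NOT Geode's actual bucket placement logic.
class BucketPlacementSketch {
    private final int bucketCount;
    private final List<String> servers;

    BucketPlacementSketch(int bucketCount, List<String> servers) {
        if (bucketCount < 1 || servers.size() < 2) {
            throw new IllegalArgumentException("need >= 1 bucket and >= 2 servers for one redundant copy");
        }
        this.bucketCount = bucketCount;
        this.servers = new ArrayList<>(servers);
    }

    // Route a key to a bucket; floorMod keeps the result non-negative.
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // Assumed round-robin placement of the primary copy.
    String primaryFor(int bucket) {
        return servers.get(bucket % servers.size());
    }

    // The secondary goes to the next server, so it never shares a host with the primary.
    String secondaryFor(int bucket) {
        return servers.get((bucket + 1) % servers.size());
    }
}
```

With three buckets and three servers, each server holds one primary and one secondary, which matches the shape of the diagram; losing any single server still leaves one copy of every bucket.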
Persistence - Operational Logs
[Diagram: Member 1 receives Put k6->v6; operation log files Oplog1.crf and Oplog2.crf]
Logged operations, in order:
• Create k1->v1
• Create k2->v2
• Modify k1->v3
• Create k4->v4
• Modify k1->v5
• Create k6->v6
Append to operation log
Persistence - Operational Logs - Compaction
[Diagram: the same operation log as on the previous slide (Oplog1.crf, Oplog2.crf; Create k1->v1 through Create k6->v6)]
• Append to operation log
• Copy live data forward
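The two ideas on these slides, append-only writes and compaction by copying live data forward, can be sketched in memory. This is an illustration only: the class OplogSketch and its methods are hypothetical, and nothing here reflects the real on-disk .crf format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of an append-only operation log.
class OplogSketch {
    // One logged operation: the key and the value written.
    static final class Entry {
        final String key;
        final String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    private List<Entry> log = new ArrayList<>();
    // Position of the live (most recent) entry for each key.
    private Map<String, Integer> liveIndex = new HashMap<>();

    // Creates and modifications are both appends; nothing is updated in place.
    void put(String key, String value) {
        log.add(new Entry(key, value));
        liveIndex.put(key, log.size() - 1);
    }

    // Compaction: copy live entries forward into a fresh log, dropping superseded ones.
    void compact() {
        List<Entry> freshLog = new ArrayList<>();
        Map<String, Integer> freshIndex = new HashMap<>();
        for (int i = 0; i < log.size(); i++) {
            Entry e = log.get(i);
            if (liveIndex.get(e.key) == i) {
                freshIndex.put(e.key, freshLog.size());
                freshLog.add(e);
            }
        }
        log = freshLog;
        liveIndex = freshIndex;
    }

    // Recovery: replay the log in order; later entries win.
    Map<String, String> recover() {
        Map<String, String> data = new HashMap<>();
        for (Entry e : log) data.put(e.key, e.value);
        return data;
    }

    int logSize() { return log.size(); }
}
```

Replaying the six operations from the slide leaves six log entries; after compact() only the four live ones (k1->v5, k2->v2, k4->v4, k6->v6) remain, and recover() returns the same data before and after.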
Concepts
• Member
  • A process that has a connection to the system
  • A process that has created a cache
  • Embeddable within your application
[Diagram: Client, Locator, Server]
Concepts
• Client cache
  • A process connected to the Geode server(s)
  • Can have a local copy of the data
    • Run OQL queries on local data
  • Can be notified about events on the servers
Concepts
• Client Notifications
  • Register Interest
    • Individual Keys OR RegEx for Keys
    • Updates Local Copy
    • Examples:
      • region.registerInterest("key-1");
      • region1.registerInterestRegex("[a-z]+");
  • Continuous Query
    • Receive Notification when Query condition met on server
    • Example:
      • SELECT * FROM /tradeOrder t WHERE t.price > 100.00
    • Can be DURABLE
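To make the two interest styles concrete, here is a minimal sketch of the matching a server would have to do before pushing an update to a client. The class InterestSketch is hypothetical and uses only java.util.regex; it borrows the method names from the slide but is not the Geode client API, and whole-key regex matching is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical model of per-client interest registration.
class InterestSketch {
    private final Set<String> keys = new HashSet<>();
    private final List<Pattern> patterns = new ArrayList<>();

    // Interest in one individual key.
    void registerInterest(String key) {
        keys.add(key);
    }

    // Interest in every key matching a regular expression.
    void registerInterestRegex(String regex) {
        patterns.add(Pattern.compile(regex));
    }

    // Should an update to this key be pushed to the client's local copy?
    boolean matches(String key) {
        if (keys.contains(key)) return true;
        for (Pattern p : patterns) {
            if (p.matcher(key).matches()) return true; // whole-key match (assumption)
        }
        return false;
    }
}
```

After registering "key-1" and the regex "[a-z]+", updates to "key-1" and "abc" would be delivered, while "key-2" would not.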
Concepts
• Functions
  • Used for distributed concurrent processing (Map/Reduce, stored procedure)
  • Highly available
  • Data oriented
  • Member oriented
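The Map/Reduce style of function execution can be sketched with plain threads standing in for members. Everything here is an assumption for illustration: FunctionExecutionSketch and executeOnAll are hypothetical names, each Map stands in for the data hosted on one member, and the real function service is not used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch of data-oriented function execution.
class FunctionExecutionSketch {
    // Run fn against every partition in parallel and sum the partial results.
    static long executeOnAll(List<Map<String, Integer>> partitions,
                             Function<Map<String, Integer>, Long> fn) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (Map<String, Integer> partition : partitions) {
                // "Map" step: the function runs next to the data it needs.
                Callable<Long> task = () -> fn.apply(partition);
                partials.add(pool.submit(task));
            }
            long total = 0;
            // "Reduce" step: the caller combines the partial results.
            for (Future<Long> partial : partials) total += partial.get();
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("function failed on a partition", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the design is that only the small partial results travel back to the caller, not the data itself.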
Concepts
• Listeners
  • CacheWriter / CacheListener
  • AsyncEventListener (queue / batch)
    • Parallel or Serial
    • Conflation
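Conflation and batching for an asynchronous listener queue can be illustrated as follows. This is a sketch under assumptions, not the AsyncEventListener API: ConflatingQueueSketch, enqueue and drainBatch are hypothetical names, and events are reduced to key/value strings.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a conflating, batch-draining event queue.
class ConflatingQueueSketch {
    // Insertion-ordered map: at most one pending event per key.
    private final Map<String, String> pending = new LinkedHashMap<>();

    // Conflation: a newer value for an already queued key replaces the older one.
    // The key keeps its original position in the queue.
    synchronized void enqueue(String key, String value) {
        pending.put(key, value);
    }

    // Remove and return up to batchSize pending events, oldest key first.
    synchronized List<Map.Entry<String, String>> drainBatch(int batchSize) {
        List<Map.Entry<String, String>> batch = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = pending.entrySet().iterator();
        while (it.hasNext() && batch.size() < batchSize) {
            Map.Entry<String, String> e = it.next();
            batch.add(new AbstractMap.SimpleImmutableEntry<>(e)); // copy before removal
            it.remove();
        }
        return batch;
    }
}
```

If k1 is updated twice before the listener runs, the listener sees only the latest value, which reduces work for a slow downstream system at the cost of not observing every intermediate state.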
Concepts - HA
Fixed or flexible schema?
id | name | age | pet_id
or
{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" } }
Portable Data eXchange
C#, C++, Java, JSON
• No IDL, no schemas, no hand-coding
• Schema evolution (Forward and Backward Compatible)
* domain object classes not required
| header                       | data             |
| pdx | length | dsid | typeid | fields | offsets |
Efficient for queries
{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" } }
SELECT p.name FROM /Person p WHERE p.pet.type = "dino"
single field deserialization
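Why an offsets section makes single-field access cheap can be shown with a toy byte layout. The layout below (field bytes, then one int offset per field, then a field count) is invented for this sketch and is not the PDX wire format; OffsetTableSketch and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical layout: | length-prefixed field bytes ... | int offset per field | int field count |
class OffsetTableSketch {
    static byte[] serialize(String... fields) {
        byte[][] encoded = new byte[fields.length][];
        int dataLength = 0;
        for (int i = 0; i < fields.length; i++) {
            encoded[i] = fields[i].getBytes(StandardCharsets.UTF_8);
            dataLength += 4 + encoded[i].length; // length prefix + payload
        }
        ByteBuffer buf = ByteBuffer.allocate(dataLength + 4 * fields.length + 4);
        int[] offsets = new int[fields.length];
        for (int i = 0; i < fields.length; i++) {
            offsets[i] = buf.position();          // remember where this field starts
            buf.putInt(encoded[i].length).put(encoded[i]);
        }
        for (int offset : offsets) buf.putInt(offset);
        buf.putInt(fields.length);
        return buf.array();
    }

    // Decode only the requested field; the other fields are never read.
    static String readField(byte[] bytes, int index) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int count = buf.getInt(bytes.length - 4);
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("no field " + index);
        }
        int tableStart = bytes.length - 4 - 4 * count;
        int offset = buf.getInt(tableStart + 4 * index);
        int length = buf.getInt(offset);
        return new String(bytes, offset + 4, length, StandardCharsets.UTF_8);
    }
}
```

A query such as the one on the slide only needs p.pet.type, so a reader can jump straight to that field through the offset table instead of rebuilding the whole object.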
But how to serialize data?
Benchmark: https://github.com/eishay/jvm-serializers
Schema evolution
[Diagram: Member A and Member B share Distributed Type Definitions; Application #1 and Application #2 use type versions v1 and v2]
• v2 objects preserve data from missing fields
• v1 objects use default values to fill in new fields
PDX provides forwards and backwards compatibility, no code required
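The two compatibility rules can be sketched with records modeled as field maps. This is an illustration under assumptions, not how PDX is implemented: SchemaEvolutionSketch is a hypothetical class, and a "type version" is reduced to a set of known fields with default values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one instance models a reader compiled against one type version.
class SchemaEvolutionSketch {
    private final Map<String, Object> knownFieldsWithDefaults;

    SchemaEvolutionSketch(Map<String, Object> knownFieldsWithDefaults) {
        this.knownFieldsWithDefaults = new LinkedHashMap<>(knownFieldsWithDefaults);
    }

    // Read a record written by any version:
    //  - fields the writer did not have are filled with this reader's defaults;
    //  - fields this reader does not know are kept, so they survive a write-back.
    Map<String, Object> read(Map<String, Object> serialized) {
        Map<String, Object> result = new LinkedHashMap<>(knownFieldsWithDefaults);
        result.putAll(serialized);
        return result;
    }
}
```

A reader that knows id, name and age sees age filled with its default when reading an older record without age, and a reader that knows only id and name still carries age along untouched when it reads a newer record.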
Adapters
memcached
• write-through as opposed to cache-aside
• Stale Cache
• Inconsistent Cache
• Thundering Herds
Redis
• Scalable Data-Structures
• Use All Cores
• WAN Replication
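The write-through point from the memcached list can be sketched in a few lines. With cache-aside, the application writes to the backing store and updates or invalidates the cache in separate steps, which leaves a window for stale or inconsistent entries. In the hypothetical WriteThroughSketch below, a single put performs both; the BiConsumer stands in for whatever writes to the backing store and is an assumption of this sketch, not a Geode interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch of a write-through cache.
class WriteThroughSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final BiConsumer<String, String> backingStoreWriter;

    WriteThroughSketch(BiConsumer<String, String> backingStoreWriter) {
        this.backingStoreWriter = backingStoreWriter;
    }

    // Write-through: the store is written first, then the cache, inside one call.
    // If the store write throws, the cache is left unchanged.
    synchronized void put(String key, String value) {
        backingStoreWriter.accept(key, value);
        cache.put(key, value);
    }

    // Reads are served from the cache.
    synchronized String get(String key) {
        return cache.get(key);
    }
}
```

Because callers never touch the store directly, there is no separate cache-update step for them to forget or to reorder.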