geode introduction

Introduction

Swapnil Bawaskar (@sbawaskar)

(incubating)

• Introduction
• What?
• Who?
• Why?
• How?
• DEBS
• Roadmap
• Q&A

Agenda

Introduction

A distributed, memory-based data management platform for data-oriented apps that need:
• high performance, scalability, resiliency and continuous availability
• fast access to critical data sets
• location-aware distributed data processing
• event-driven data architecture

Introduction

• 1000+ systems in production (real customers)
• Cutting edge use cases

Incubating… but rock solid

• 17 billion records in memory
  • GE Power & Water's Remote Monitoring & Diagnostics Center
• 3 TB operational data in-memory, 400 TB archived
  • China Railways
• 4.6 Million transactions a day / 40K transactions a second
  • China Railways
• 120,000 Concurrent Users
  • Indian Railways


World population: ~7,349,000,000
China Railway Corporation (China, population 1,401,586,609) and Indian Railways (India, population 1,251,695,616): ~36% of the world population


[Figure: YCSB Workloads. Operations per second (0 to 1,000,000) for Cassandra and Geode across A Reads, A Updates, B Reads, B Updates, C Reads, D Inserts, D Reads, F Reads and F Updates.]

Horizontal scaling for reads, consistent latency and CPU

[Figure: speedup, latency (ms) and CPU % versus number of server hosts (2 to 10)]

• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers
• Partitioned region with redundancy and 1K data size

What makes it go fast?

• Minimize copying

• Minimize contention points

• Run user code in-process

• Partitioning and parallelism

• Avoid disk seeks

• Automated benchmarks

Hands-on: Build & run

• Clone & Build
    git clone https://github.com/apache/incubator-geode
    cd incubator-geode
    ./gradlew build -Dskip.tests=true
• Start a server
    cd gemfire-assembly/build/install/apache-geode
    ./bin/gfsh
    gfsh> start locator --name=locator
    gfsh> start server --name=server
    gfsh> create region --name=myRegion --type=REPLICATE
• Docker
    $ docker run -it apachegeode/geode:1.0.0-incubating.M1 gfsh

https://github.com/apache/incubator-geode

Hands on

• Cache
• Region
• Member
• Client Cache
• Functions
• Listeners
• High Availability
• Serialization

Concepts

• Cache
  • In-memory storage and management for your data
  • Configurable through XML, Java API or CLI
  • Collection of Regions

Concepts

• Region
  • Distributed java.util.Map on steroids (Key/Value)
  • Consistent API regardless of where or how data is stored
  • Observable (reactive)
  • Highly available, redundant on cache Member(s)

Concepts

• Region
  • Local, Replicated or Partitioned
  • In-memory or persistent
  • Redundant
  • LRU
  • Overflow
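To make "Partitioned" and "Redundant" concrete, here is a minimal sketch of how keys could be hashed into buckets and how each bucket could get a primary and a secondary copy on different servers. It is an illustration under simple assumptions (round-robin placement, a hypothetical class name `BucketPlacementSketch`), not Geode's actual placement logic or API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; the class and its placement rule are hypothetical.
class BucketPlacementSketch {
    private final int bucketCount;
    private final List<String> servers;

    BucketPlacementSketch(int bucketCount, List<String> servers) {
        if (servers.size() < 2) {
            throw new IllegalArgumentException("need at least 2 servers for one redundant copy");
        }
        this.bucketCount = bucketCount;
        this.servers = new ArrayList<>(servers);
    }

    // Hash the key into a bucket; floorMod keeps the result non-negative.
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // The primary copy of a bucket lives on one server (assumed round-robin).
    String primaryFor(int bucket) {
        return servers.get(bucket % servers.size());
    }

    // The secondary copy goes to the next server, so it never shares a host with the primary.
    String secondaryFor(int bucket) {
        return servers.get((bucket + 1) % servers.size());
    }
}
```

With three servers and three buckets, every server ends up holding one primary and one secondary bucket, so losing a single server leaves a copy of every bucket available.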

Concepts

LOCAL
LOCAL_HEAP_LRU
LOCAL_OVERFLOW
LOCAL_PERSISTENT
LOCAL_PERSISTENT_OVERFLOW
PARTITION
PARTITION_HEAP_LRU
PARTITION_OVERFLOW
PARTITION_PERSISTENT
PARTITION_PERSISTENT_OVERFLOW
PARTITION_PROXY
PARTITION_PROXY_REDUNDANT
PARTITION_REDUNDANT
PARTITION_REDUNDANT_HEAP_LRU
PARTITION_REDUNDANT_OVERFLOW
PARTITION_REDUNDANT_PERSISTENT
PARTITION_REDUNDANT_PERSISTENT_OVERFLOW
REPLICATE
REPLICATE_HEAP_LRU
REPLICATE_OVERFLOW
REPLICATE_PERSISTENT
REPLICATE_PERSISTENT_OVERFLOW
REPLICATE_PROXY

• Persistent Regions
  • Durability
  • WAL for efficient writing
  • Consistent recovery
  • Compaction

Concepts

[Figure: Server 1 ... Server N]

Persistence - Shared Nothing

[Figure: Server 1, Server 2 and Server 3 holding primary and secondary copies of buckets B1, B2 and B3]

Server 1 waits for others when it starts

Fetches missed operations on restart

Persistence - Operational Logs

[Figure: Member 1 handles Put k6->v6 by appending to the operation log. Oplog1.crf and Oplog2.crf hold the records Create k1->v1, Create k2->v2, Modify k1->v3, Create k4->v4, Modify k1->v5, Create k6->v6.]

Append to operation log

Persistence - Operational Logs - Compaction

[Figure: Member 1 handles Put k6->v6 by appending to the operation log. Oplog1.crf and Oplog2.crf hold the records Create k1->v1, Create k2->v2, Modify k1->v3, Create k4->v4, Modify k1->v5, Create k6->v6.]

Append to operation log

Copy live data forward
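The append-then-compact idea can be sketched in a few lines. This is a simplified model that assumes an in-memory list stands in for an oplog file; the class `OplogSketch` and its methods are hypothetical and do not reflect Geode's on-disk format or API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not Geode's oplog format.
class OplogSketch {
    // One appended record: a create or modify of a key.
    static final class LogEntry {
        final String key;
        final String value;

        LogEntry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<LogEntry> log = new ArrayList<>();

    // Writes only ever append to the end of the log.
    void append(String key, String value) {
        log.add(new LogEntry(key, value));
    }

    int size() {
        return log.size();
    }

    // Recovery: replay the log from the start to rebuild the key/value state.
    Map<String, String> replay() {
        Map<String, String> state = new LinkedHashMap<>();
        for (LogEntry e : log) {
            state.put(e.key, e.value); // later records overwrite earlier ones
        }
        return state;
    }

    // Compaction: only the newest record per key is live; copy those forward into a fresh log.
    OplogSketch compact() {
        OplogSketch compacted = new OplogSketch();
        for (Map.Entry<String, String> e : replay().entrySet()) {
            compacted.append(e.getKey(), e.getValue());
        }
        return compacted;
    }
}
```

Applied to the six records on the slide, compaction would keep four live records (k1->v5, k2->v2, k4->v4, k6->v6) and drop the superseded values of k1.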

• Member
  • A process that has a connection to the system
  • A process that has created a cache
  • Embeddable within your application

Concepts

[Figure: Client, Locator, Server]

• Client cache
  • A process connected to the Geode server(s)
  • Can have a local copy of the data
    • Run OQL queries on local data
  • Can be notified about events on the servers

Concepts

• Client Notifications
  • Register Interest
    • Individual Keys OR RegEx for Keys
    • Updates Local Copy
    • Examples:
      • region.registerInterest("key-1");
      • region1.registerInterestRegex("[a-z]+");
  • Continuous Query
    • Receive Notification when Query condition met on server
    • Example:
      • SELECT * FROM /tradeOrder t WHERE t.price > 100.00
    • Can be DURABLE
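As a rough model of what a regex interest means, the sketch below filters server-side updates by key pattern before they reach a client. It uses only java.util.regex and assumes whole-key matching; `InterestFilterSketch` is a hypothetical name and this is not how Geode implements interest registration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only; models interest as a filter on keys.
class InterestFilterSketch {
    private final Pattern keyPattern;
    // Keys whose updates would be pushed to the client's local copy.
    final List<String> delivered = new ArrayList<>();

    InterestFilterSketch(String keyRegex) {
        this.keyPattern = Pattern.compile(keyRegex);
    }

    // Called for each update on the server; only keys matching the whole pattern are delivered.
    void onServerUpdate(String key, Object value) {
        if (keyPattern.matcher(key).matches()) {
            delivered.add(key);
        }
    }
}
```

Under this whole-key assumption, the pattern "[a-z]+" from the example would accept a key such as "abc" but not "key-1", which contains a hyphen and a digit.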

Concepts

• Functions
  • Used for distributed concurrent processing (Map/Reduce, stored procedure)
  • Highly available
  • Data oriented
  • Member oriented
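The Map/Reduce flavour of a data-oriented function can be sketched as: run the same function against each partition where the data lives, then combine the partial results. The sketch below assumes each partition is a plain Map and uses a parallel stream in place of real servers; `DataOrientedFunctionSketch` is a hypothetical name, not Geode's function execution API.

```java
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Illustrative sketch only; partitions are local maps rather than remote members.
class DataOrientedFunctionSketch {
    // "Map" step: apply fn to every partition in parallel.
    // "Reduce" step: sum the partial results into one answer.
    static long executeOnPartitions(List<Map<String, Integer>> partitions,
                                    ToLongFunction<Map<String, Integer>> fn) {
        return partitions.parallelStream().mapToLong(fn).sum();
    }
}
```

A caller could, for example, pass a function that counts the entries above some threshold in each partition; only the small partial counts travel back, not the data itself.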

Concepts


• Listeners
  • CacheWriter / CacheListener
  • AsyncEventListener (queue / batch)
    • Parallel or Serial
    • Conflation
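Queueing, batching and conflation can be illustrated with a small queue that keeps only the newest pending value per key and hands events out in batches. This is a sketch under simple assumptions (string keys and values, a hypothetical class `ConflatingQueueSketch`); it is not Geode's AsyncEventListener API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; a single-process stand-in for an async event queue.
class ConflatingQueueSketch {
    // Insertion-ordered, so events drain in the order their keys were first queued.
    private final Map<String, String> pending = new LinkedHashMap<>();

    // Conflation: a newer value for an already queued key replaces the older one.
    synchronized void enqueue(String key, String value) {
        pending.put(key, value);
    }

    // Batching: remove and return up to batchSize events for one listener call.
    synchronized List<Map.Entry<String, String>> drainBatch(int batchSize) {
        List<Map.Entry<String, String>> batch = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = pending.entrySet().iterator();
        while (it.hasNext() && batch.size() < batchSize) {
            Map.Entry<String, String> e = it.next();
            // Copy the entry before removing it from the underlying map.
            batch.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
            it.remove();
        }
        return batch;
    }
}
```

The trade-off is that a slow consumer sees fewer, fresher events: intermediate values for a key are skipped, which suits use cases that only need the latest state.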

Concepts

Concepts - HA

Fixed or flexible schema?

id name age pet_id

or

{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" }}

C#, C++, Java, JSON

No IDL, no schemas, no hand-coding
Schema evolution (Forward and Backward Compatible)

* domain object classes not required

| header | data |
| pdx | length | dsid | typeid | fields | offsets |

Portable Data eXchange

Efficient for queries

{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" }}

SELECT p.name FROM /Person p WHERE p.pet.type = "dino"

single field deserialization
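Why an offset table makes single-field reads cheap can be shown with a toy encoding: field bytes first, then a table of offsets, so a reader can jump straight to one field. The layout below is a simplified assumption (string fields only, 4-byte lengths and offsets) and `OffsetTableSketch` is a hypothetical name; it is not the real PDX wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only. Assumed layout:
// [int fieldCount][field 0][field 1]...[int offset 0][int offset 1]...
// where each field is [int length][UTF-8 bytes].
class OffsetTableSketch {
    static byte[] serialize(String... fields) {
        byte[][] encoded = new byte[fields.length][];
        int dataSize = 0;
        for (int i = 0; i < fields.length; i++) {
            encoded[i] = fields[i].getBytes(StandardCharsets.UTF_8);
            dataSize += 4 + encoded[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + dataSize + 4 * fields.length);
        buf.putInt(fields.length);
        int[] offsets = new int[fields.length];
        for (int i = 0; i < fields.length; i++) {
            offsets[i] = buf.position(); // remember where this field starts
            buf.putInt(encoded[i].length);
            buf.put(encoded[i]);
        }
        for (int offset : offsets) {
            buf.putInt(offset); // offset table goes after the fields
        }
        return buf.array();
    }

    // Reads one field by jumping to its offset; the other fields are never decoded.
    static String readField(byte[] bytes, int index) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int count = buf.getInt(0);
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("no field " + index);
        }
        int tableStart = bytes.length - 4 * count;
        int offset = buf.getInt(tableStart + 4 * index);
        int length = buf.getInt(offset);
        return new String(bytes, offset + 4, length, StandardCharsets.UTF_8);
    }
}
```

A query that filters on one field only needs `readField` for that field on each candidate entry, rather than rebuilding the whole object.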

But how to serialize data?

Benchmark: https://github.com/eishay/jvm-serializers


Schema evolution

[Figure: Member A and Member B share Distributed Type Definitions (v1, v2), used by Application #1 and Application #2]

• v2 objects preserve data from missing fields
• v1 objects use default values to fill in new fields

PDX provides forwards and backwards compatibility, no code required
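The two compatibility rules can be modelled with plain maps standing in for serialized objects. This is a sketch of the idea only: `SchemaEvolutionSketch` and its method names are hypothetical, and PDX itself does this without application code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; a Map stands in for a serialized object's fields.
class SchemaEvolutionSketch {
    // A newer reader given older data: fields missing from the data take default values.
    static Map<String, Object> readWithDefaults(Map<String, Object> stored,
                                                Map<String, Object> defaults) {
        Map<String, Object> result = new LinkedHashMap<>(defaults);
        result.putAll(stored); // stored values win over defaults
        return result;
    }

    // An older writer updating newer data: fields it does not know about are carried over unchanged.
    static Map<String, Object> writePreservingUnknown(Map<String, Object> stored,
                                                      Map<String, Object> knownUpdates) {
        Map<String, Object> result = new LinkedHashMap<>(stored);
        result.putAll(knownUpdates); // only the fields the writer knows are replaced
        return result;
    }
}
```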

Adapters

• write-through as opposed to cache-aside
  • Stale Cache
  • Inconsistent Cache
  • Thundering Herds
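The difference from cache-aside can be sketched as a cache that owns the write path: every put goes to the backing store and the cache together, and a miss is loaded once per key. The sketch assumes a Map as a stand-in for the database; `WriteThroughSketch` is a hypothetical class, not the Geode memcached adapter.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; backingStore is a hypothetical stand-in for a database.
class WriteThroughSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> backingStore;
    // Counts loads from the backing store, to show how often misses hit it.
    final AtomicInteger loads = new AtomicInteger();

    WriteThroughSketch(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    // Write-through: store and cache are updated in the same call,
    // so readers of the cache do not see a value older than the store's.
    void put(String key, String value) {
        backingStore.put(key, value);
        cache.put(key, value);
    }

    // Read-through on a miss: computeIfAbsent applies the loader at most once per
    // absent key, so concurrent readers do not all stampede the backing store.
    String get(String key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return backingStore.get(k);
        });
    }
}
```

With cache-aside, by contrast, the application writes to the database and separately invalidates or updates the cache, which is where stale and inconsistent entries can creep in.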

memcached

• Scalable Data-Structures
• Use All Cores
• WAN Replication

Redis
