iot platform using geode and activemq platform using geode and activemq swapnil bawaskar @sbawaskar...
TRANSCRIPT
IoT Platform using Geode and ActiveMQ
Swapnil Bawaskar@sbawaskar
Scalable IoT Platform
• Introduction • IoT • MQTT • Apache ActiveMQ Artemis • Apache Geode
• Real world use case • Q&A
2
Agenda
• Devices collect and send data to brokers
• Clients process data to deliver business value
• IoT data platform considerations • Protocol • How to read • How to Analyze • How to scale
IoT
3
!" #
• MQTT • Message Queuing Telemetry Transport • Based on TCP/IP • Optimized binary protocol • No type system • Provides different QoS levels • Low energy consumption
Protocol
4
• Subproject of ActiveMQ • Non blocking architecture
• High Performance • Multi Protocol • Embeddable • Clustered
• Persistence • Journaled • Relational database
ActiveMQ Artemis
5
Scaling
6
!"
#
"
"
"
"
!
!
!
!
• When dealing with large number of devices
Scaling
7
!"
#
"
"
"
"
!
!
!
!
Cluster• Brokers form cluster • Clients are load balanced
Scaling
8
!"
#
"
"
"
"
!
!
!
!
Cluster• Need to scale the
processors
Scaling
9
!"
#
"
"
"
"
!
!
!
!
Cluster
#
#
• Processors do not see all data
Scaling
10
Scaling
11
GEODE
12
What is it?
A distributed, memory-based data management platform for data oriented apps that need: • high performance, scalability, resiliency and continuous
availability • fast access to critical data set • location aware distributed data processing • event driven data architecture
13
What is it?
Numbers Everyone Should Know
14
L1 cache reference 0.5 ns Branch mispredict 5 ns L2 cache reference 7 ns Mutex lock/unlock 100 ns Main memory reference 100 ns Compress 1K bytes with Zippy 10,000 ns 0.01 ms Send 1K bytes over 1 Gbps network 10,000 ns 0.01 ms Read 1 MB sequentially from memory 250,000 ns 0.25 ms Round trip within same datacenter 500,000 ns 0.5 ms Disk seek 10,000,000 ns 10 ms Read 1 MB sequentially from network 10,000,000 ns 10 ms Read 1 MB sequentially from disk 30,000,000 ns 30 ms Send packet CA->Netherlands->CA 150,000,000 ns 150 ms
http://static.googleusercontent.com/media/research.google.com/en/us/people/jeff/stanford-295-talk.pdf
• 17 billion records in memory • GE Power & Water's Remote Monitoring & Diagnostics Center
• 3 TB operational data in-memory, 400 TB archived • China Railways
• 4.6 Million transactions a day / 40K transactions a second • China Railways
• 120,000 Concurrent Users • Indian Railways
15
Who are the users?
World: ~7,349,000,000
~36% of the world population
Population: 1,251,695,6161,401,586,609
China RailwayCorporation
Indian Railways
• Distributed key-value store (java.util.concurrent.ConcurrentMap)
• Region due to old JSR-107 spec • Both Keys as well as Values can be domain objects
17
Regions
Server 3Server 2Server 1
Key1 value1Key2 value2
Key1 value1Key2 value2
Key1 value1Key2 value2
Key1 value1Key2 value2
Partitioned
Replicated
• Deploy Function on all servers • Runs in-process with the servers
18
Functions
Server 2
# # #
Server 2Server 1 Server 3
Key1 value1
Key2 value2
Key1 value1
Key2 value2
Key1 value1
Key2 value2
19
Functions
Server 2
# # #
Server 2Server 1 Server 3
Key1 value1
Key2 value2
Key3 value1
Key4 value2
Key5 value1
Key6 value2
• Deploy Function on all servers • Runs in-process with the servers
• Object Query Language (OQL) • Similar to SQL
• SELECT DISTINCT * FROM /exampleRegion WHERE status = ‘active’
• You can drill down into domain objects • SELECT p.name FROM /person p WHERE p.pet.type=‘dino’
• You can also invoke methods on your domain objects • SELECT DISTINCT * FROM /person p WHERE p.children.size >= 2
• Joins Possible • Between Replicate regions • Between one Partitioned and Replicate regions
• SELECT portfolio1.ID, portfolio2.status FROM /exampleRegion portfolio1, /exampleRegion2 portfolio2 WHERE portfolio1.status = portfolio2.status
20
Query
• Enables event-driven apps • Register a Query with the server
• SELECT * FROM /tradeOrder t WHERE t.symbol=‘VMW’ AND t.price > 100.00
• The server then notifies when the query condition is met • Client implements the CqListener callback
• HA support • Domain objects not required on the server’s class-path
21
Continuous Query
Fixed or flexible schema?
id name age pet_id
or
{id:1,name:“Fred”,age:42,pet:{name:“Barney”,type:“dino”}}
C#, C++, Java, JSON
No IDL, no schemas, no hand-coding Schema evolution (Forward and Backward Compatible)
* domain object classes not required No need to bring down cluster when domain objects change
|header|data||pdx|length|dsid|typeid|fields|offsets|
Portable Data eXchange
Efficient for queries
{id:1,name:“Fred”,age:42,pet:{name:“Barney”,type:“dino”}}
SELECTp.nameFROM/PersonpWHEREp.pet.type=“dino”
single field deserialization
But how fast is it?
Benchmark: https://github.com/eishay/jvm-serializers
Schema evolutionMember A Member B
Distributed Type Definitions
v2v1
Application #1
Application #2
v2 objects preserve data from missing fields
v1 objects use default values to fill in new fields
PDX provides forwards and backwards compatibility, no code required
• Telemetry data from machines • Predicting failure
• Outside one standard deviation • Evaluating markov model
• Use functions to iterate over data • Use CQs to notify • Update CQs based on function results
IoT Use Case
27
Questions?
28