
06/03/18 D. Donsez, V. Quema, IoT Mashup 1

IoT ETL Mashup

Didier Donsez, Vivien Quéma

(c) Didier Donsez & Vivien Quéma, 2016-2018


Outline

● Reminder: Reference Architecture
● Stream APIs of IoT sources
● Formats
● Hands-on practice


Reference Architecture

[Figure: reference architecture diagram. Layers, from devices to applications: fixed and mobile endpoints (labelled by customer: A, B, C); base stations/gateways (GW); core network of Network Servers (highly available, high performance, transient storage), with geo-replication and failover; applications and long-term storage (App Customer A, B, C), reached through an API client.]


Collection APIs

● HTTP Callback
● MQTT
● WebSocket
● Persistent logs: Kafka, Flume, ...
● Others: AMQP, gRPC, PubNub, Confluent …
● Time-series databases


HTTP Callback

● Client-server protocol (de facto)
● Operating mode
– The client publishes a public HTTP endpoint.
– The NS (Network Server) sends a request (POST or GET) to the endpoint for each LoRaWAN message received (or for a batch of LoRaWAN messages received within a time interval T).
– If the public endpoint is unavailable, the NS temporarily stores the undelivered messages (with a retention of X days).
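The client side of this operating mode can be sketched with Python's standard library. This is a minimal illustration, not a production endpoint: the `/message` path, the port, the JSON body format and the names `handle_uplink`, `CallbackHandler` and `serve` are assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # in-memory store, for illustration only


def handle_uplink(body: bytes) -> dict:
    """Parse one uplink posted by the NS (assumed to be JSON) and record it."""
    message = json.loads(body.decode("utf-8"))
    received.append(message)
    return message


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/message":  # hypothetical callback path
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            handle_uplink(self.rfile.read(length))
        except (ValueError, UnicodeDecodeError):
            self.send_error(400, "invalid JSON")
            return
        self.send_response(204)  # acknowledged, no response body
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the public endpoint (blocks until interrupted)."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Calling `serve()` starts the endpoint. In a real deployment it would sit behind TLS and a load balancer.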


HTTP Callback

● Operating mode
– Diagram

[Figure: exchange between the Network Server and the client application, with two requests: POST /message and GET /message?undelivered=true.]


HTTP Callback

● Advantages
– HTTP
● Drawbacks
– The endpoint must have a public IP address.
– The endpoint must be operated in high-availability mode (load balancer, security (SSL, source address filtering)).
– The client application must retrieve undelivered messages from the NS (via an HTTP REST API).
– In general, one endpoint per AppEUI.
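Retrieving undelivered messages could look like the sketch below. The base URL, the `/message?undelivered=true` resource and the JSON list response are assumptions; a real Network Server defines its own REST API and authentication.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def undelivered_url(ns_base_url: str) -> str:
    """Build the (hypothetical) URL listing undelivered messages."""
    query = urlencode({"undelivered": "true"})
    return ns_base_url.rstrip("/") + "/message?" + query


def fetch_undelivered(ns_base_url: str, timeout: float = 10.0) -> list:
    """GET the undelivered messages; assumes the NS answers with a JSON list."""
    with urlopen(undelivered_url(ns_base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A client would typically call `fetch_undelivered` once at startup, before relying on callbacks again.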


Publish-Subscribe Communication Model

[Figure: producers (publishers) send events (E1, E2, E3, E4, E6) to a PubSub broker, which forwards them to consumers (subscribers) according to their topic filters. Publisher topics: "s10/geiger/pps", "s11/hum", "s12/temp", "s13/wind". Subscriber filters (event.topics): {"#/location"}, {"#/temp","#/hum"}, {"s11/#","s13/#"}, {"s14/#"}. Example implementations: MQTT, Kafka, AMQP.]
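How a broker decides which subscribers receive an event can be sketched as a topic-filter match. The function below follows MQTT wildcard rules in simplified form (`+` matches one level, `#` matches all remaining levels and must be the last level), so a filter written "#/temp" in the diagram becomes "+/temp" here. The name `topic_matches` is an assumption for the example.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT-style match between a topic filter and a topic."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level of the filter
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Topics taken from the diagram
assert topic_matches("s11/#", "s11/hum")
assert topic_matches("+/temp", "s12/temp")
assert not topic_matches("s14/#", "s13/wind")
```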


MQTT

● PubSub protocol dedicated to the IoT
– Decoupling between publishers and subscribers
● Operating mode
– The client application subscribes to a topic (usually the AppEUI) and receives messages as they are produced.

[Figure: exchange between the Network Server and the client application: SUBSCRIBE xnet/3/#, then PUBLISH xnet/3/1234 {"water":10} and PUBLISH xnet/3/5678 {"water":100}.]
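The subscribe-then-receive exchange can be simulated without a network. The sketch below is an in-memory stand-in for a broker, not an MQTT client: `TinyBroker` and `on_message` are hypothetical names, only exact filters and a trailing `/#` are supported, and treating the last topic level as a device identifier is an assumption.

```python
import json
from collections import defaultdict


class TinyBroker:
    """In-memory stand-in for an MQTT broker (no network, no QoS)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic filter -> callbacks

    def subscribe(self, topic_filter, callback):
        self._subscribers[topic_filter].append(callback)

    def publish(self, topic, payload):
        for topic_filter, callbacks in self._subscribers.items():
            if self._matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)

    @staticmethod
    def _matches(topic_filter, topic):
        # Only exact filters and a trailing '/#' are handled in this sketch.
        if topic_filter.endswith("/#"):
            prefix = topic_filter[:-2]
            return topic == prefix or topic.startswith(prefix + "/")
        return topic_filter == topic


readings = {}


def on_message(topic, payload):
    device = topic.split("/")[-1]  # assumption: last level identifies the device
    readings[device] = json.loads(payload)["water"]


broker = TinyBroker()
broker.subscribe("xnet/3/#", on_message)
broker.publish("xnet/3/1234", '{"water":10}')
broker.publish("xnet/3/5678", '{"water":100}')
broker.publish("xnet/4/9999", '{"water":1}')  # not delivered: different prefix
print(readings)  # {'1234': 10, '5678': 100}
```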


MQTT

● Advantages
– Several applications can subscribe to the same topics
– Many implementations (clients and brokers)
– Supported by most IoT PaaS (IBM, Cayenne, …)
– Handles reconnections and the liveness of the TCP connection
● Drawbacks
– "Ad hoc" failover
– "No" retention when a consuming application stops → use the NS REST API to retrieve the frames that were not received.


WebSockets

● "PubSub" over HTTP
● Operating mode
– Same as MQTT
● Note
– MQTT brokers offer a WebSocket endpoint.


Kafka

● Persistent distributed log
– High performance
– High availability
– PubSub communication model
  ● Group of publishers
  ● Group of subscribers
– Data retention from several hours to several days.
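The combination of a retained log and subscriber groups can be illustrated with a toy model. This is a sketch of the idea only: `RetainedLog` is a hypothetical class, and partitions, replication and time-based retention are not modelled.

```python
class RetainedLog:
    """Append-only log with one read offset per consumer group."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # group name -> index of the next record to read

    def publish(self, record):
        self._records.append(record)

    def poll(self, group, max_records=10):
        start = self._offsets.get(group, 0)  # in this sketch a new group starts at 0
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = RetainedLog()
log.publish("m1")
log.publish("m2")
first = log.poll("group-a")          # group A reads m1 and m2
log.publish("m3")                    # published while group A is stopped
after_restart = log.poll("group-a")  # group A resumes where it left off
late_group = log.poll("group-b")     # another group reads the retained records
```

Because the records stay in the log, a stopped application catches up on its own when it restarts, without asking the NS for missed frames.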


Kafka

● Operating mode (TBC)

[Figure (labels only): Network Server; Kafka + ZK (ZooKeeper); four PUB /iot arrows; nodes A1 and A2 (Group A), R1 and B2.]


Kafka

● Advantages
– High performance
– High availability
– No need for ad hoc handling of messages not yet delivered to the application(s)
– "Big Data ready"
  ● Feed channel for most Big Data stacks (Hadoop, Storm, Spark, Flink).
● Drawbacks
– 2f+1 machines + 2f+1 ZooKeeper nodes
  ● (f being the number of tolerated faults)
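The sizing rule above can be worked through numerically. The helper names below are assumptions; the arithmetic simply applies the 2f+1 formula from the slide to both the Kafka machines and the ZooKeeper nodes.

```python
def nodes_required(tolerated_faults: int) -> int:
    """Apply the 2f+1 rule for f tolerated faults."""
    if tolerated_faults < 0:
        raise ValueError("tolerated_faults must be non-negative")
    return 2 * tolerated_faults + 1


def deployment_size(tolerated_faults: int) -> dict:
    """Machines needed when Kafka and ZooKeeper each follow the 2f+1 rule."""
    n = nodes_required(tolerated_faults)
    return {"kafka": n, "zookeeper": n, "total": 2 * n}


print(deployment_size(1))  # {'kafka': 3, 'zookeeper': 3, 'total': 6}
print(deployment_size(2))  # {'kafka': 5, 'zookeeper': 5, 'total': 10}
```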


Data Models & Database Systems
One does not fit all!

SQL
● Oracle, MySQL/MariaDB, Postgres, HSQL ...

NoSQL (Not 1 NF)
● File Systems
– HDFS
● Table
– HBase (Big Table)
● Directories (LDAP)
● Key-Value Stores
– Cassandra, Redis, Memcached, ...
● Document-oriented DB
– MongoDB, CouchDB, ...
● ….
● Graph-oriented DB
– Neo4j, ...
● Time-Series DB
– OpenTSDB, InfluxDB, …
● Text Oriented
– Lucene, OpenNLP, Elasticsearch
– Geolocation
  ● GIS, Geo extensions in MongoDB, Postgres, MySQL, ...
– Streams
  ● Kafka, Flume

Performance
● In-memory DB
– MySQL Cluster, Redis, ...


Database Systems
Multiple Data Models


CAP Theorem (Brewer)

• A distributed system can guarantee at most two of the three properties (consistency, availability, partition tolerance).


Time-Series Databases

● Storage and querying of time-indexed data.
– High performance
– Expressive time-based queries
– Configurable data retention
● InfluxDB, OpenTSDB
● TSDBaaS: OVH Metrics, Azure, Bluemix, AWS, InfluxData, Elasticsearch ...


Data Formats
Serializer/Deserializer

– JSON
– CSV
– Others: XML, BSON, Thrift, Avro, Protobuf, Parquet, ...
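A round trip through the two main formats shows the practical difference between them. The sample readings and field names below are invented for the example; only Python's standard `json` and `csv` modules are used.

```python
import csv
import io
import json

readings = [
    {"device": "1234", "water": 10},
    {"device": "5678", "water": 100},
]

# JSON keeps the value types: numbers stay numbers.
json_text = json.dumps(readings)
from_json = json.loads(json_text)

# CSV is flat text: every value comes back as a string.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["device", "water"])
writer.writeheader()
writer.writerows(readings)
csv_text = buffer.getvalue()
from_csv = list(csv.DictReader(io.StringIO(csv_text)))

print(from_json[0]["water"] + 1)  # 11
print(from_csv[0]["water"])       # 10 (a string; convert before computing)
```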


NodeRED


Payload Decoding Example

● Adeunis Pulse


Adeunis Pulse Payload Decoding

02 F0 06 00502D8A 00 00000000
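Splitting such a frame into its byte groups can be sketched with `struct`. The grouping (1, 1, 1, 4, 1 and 4 bytes) follows the spacing of the hexadecimal string above; the big-endian byte order is an assumption, and the meaning of each group is not given here, so the function returns unnamed integers rather than named fields. `split_payload` is a hypothetical name.

```python
import struct


def split_payload(hex_payload: str) -> tuple:
    """Split a 12-byte frame into groups of 1, 1, 1, 4, 1 and 4 bytes."""
    raw = bytes.fromhex(hex_payload)  # spaces between groups are ignored
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    # '>' = big-endian (assumed), B = unsigned byte, I = unsigned 32-bit integer
    return struct.unpack(">BBBIBI", raw)


groups = split_payload("02 F0 06 00502D8A 00 00000000")
print(groups)  # (2, 240, 6, 5254538, 0, 0)
```

Mapping these integers to frame type, status or counters requires the device's frame specification.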


Adeunis Pulse Payload Decoding

02 F0 06 00502D8A 00 00000000


Adeunis Pulse Payload Decoding


Chronograf on InfluxDB


MongoDB - Compass


Dashboard-as-a-Service
Example: Cayenne


Dashboard-as-a-Service
Example: Jyse.io


Alerting

● Triggering an action when a threshold is crossed (mail, SMS, Trello …)
– Huginn, Kapacitor, Grafana
– Alert-as-a-Service: Cayenne, IFTTT, Azure, Bluemix BI, ...
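Threshold-crossing detection can be sketched in a few lines. This is an illustration, not how any of the listed tools work internally: the sample values, the threshold of 100 and the `notify` action (which only records a message instead of sending mail or SMS) are invented for the example.

```python
def crossings(values, threshold):
    """Yield (index, value) each time the series rises above the threshold."""
    above = False
    for index, value in enumerate(values):
        now_above = value > threshold
        if now_above and not above:  # fire on the crossing, not on every sample
            yield index, value
        above = now_above


alerts = []


def notify(index, value):
    # Placeholder action: a real setup would send a mail, an SMS, etc.
    alerts.append(f"sample {index}: value {value} exceeds threshold")


for index, value in crossings([10, 40, 120, 130, 90, 150], threshold=100):
    notify(index, value)

print(len(alerts))  # 2
```

Firing only on the crossing avoids one notification per sample while the value stays above the threshold.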


The TICK stack


Getting started

● With Docker
● NodeRED
● Mosquitto
● InfluxDB
● Chronograf
● Grafana
● MongoDB
● Huginn