Cosmos, Big Data GE implementation in FIWARE
Description of the Cosmos GE, the Big Data solution adopted into the FIWARE ecosystem.
Open APIs for Open Minds
Building your first application using FI-WARE
Cosmos, Big Data GE implementation
Big Data: What is it and how much data is there?
What is big data?
> small data
What is big data?
> big data
http://commons.wikimedia.org/wiki/File:Interior_view_of_Stockholm_Public_Library.jpg
How much data is there?
Data growth forecast, 2012 to 2017
• Global users: 2.3 billion (2012), 3.6 billion (2017)
• Global networked devices: 12 billion (2012), 19 billion (2017)
• Global broadband speed: 11.3 Mbps (2012), 39 Mbps (2017)
• Global traffic: 0.5 zettabytes (2012), 1.4 zettabytes (2017)
http://www.cisco.com/en/US/netsol/ns827/networking_solutions_sub_solution.html#~forecast
1 zettabyte = 10^21 bytes = 1,000,000,000,000,000,000,000 bytes
It is not only about storing big data but using it!
> tools
> big data
http://commons.wikimedia.org/wiki/File:Interior_view_of_Stockholm_Public_Library.jpg
How to deal with it: The Hadoop reference
Hadoop was created by Doug Cutting at Yahoo!...
… based on the MapReduce patent by Google
Well, MapReduce was really invented by Julius Caesar
Divide et impera*
* Divide and conquer
An example
How many pages are written in Latin among the books in the Ancient Library of Alexandria?
Books (language, reference, pages): LATIN REF1 P45, GREEK REF2 P128, EGYPT REF3 P12, LATIN REF4 P73, LATIN REF5 P34, EGYPT REF6 P10, GREEK REF7 P20, GREEK REF8 P230
• Step 1: the Mappers start reading books in parallel. One emits "LATIN pages 45" (ref 1), one discards an Egyptian book, one is still reading. Reducer: 45 (ref 1)
• Step 2: the Mappers discard the Greek and Egyptian books they have read; one is still reading. Reducer: 45 (ref 1)
• Step 3: the Mappers emit "LATIN pages 73" (ref 4) and "LATIN pages 34" (ref 5), and discard an Egyptian book. Reducer: 45 (ref 1) + 73 (ref 4) + 34 (ref 5)
• Step 4: the Mappers discard the remaining Greek books (refs 7 and 8); one Mapper is idle. Reducer: 45 (ref 1) + 73 (ref 4) + 34 (ref 5)
• Step 5: all the Mappers are idle. Reducer: 45 (ref 1) + 73 (ref 4) + 34 (ref 5) = 152 TOTAL
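The Library of Alexandria example can be sketched in plain Java, without Hadoop. This is only an illustration under assumptions: the class and method names (`LatinPageCount`, `Book`, `countLatinPages`) are made up for this sketch, and a parallel stream merely stands in for several Mappers working at the same time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only: plain Java, not the Hadoop API.
final class LatinPageCount {

    /** One book: language label, reference number and number of pages. */
    static final class Book {
        final String language;
        final int ref;
        final int pages;

        Book(String language, int ref, int pages) {
            this.language = language;
            this.ref = ref;
            this.pages = pages;
        }
    }

    /** The eight books listed in the example. */
    static List<Book> library() {
        return Arrays.asList(
                new Book("LATIN", 1, 45), new Book("GREEK", 2, 128),
                new Book("EGYPT", 3, 12), new Book("LATIN", 4, 73),
                new Book("LATIN", 5, 34), new Book("EGYPT", 6, 10),
                new Book("GREEK", 7, 20), new Book("GREEK", 8, 230));
    }

    /** Reduce step: add up every page count emitted by the mappers. */
    static int reduce(List<Integer> emittedPages) {
        int total = 0;
        for (int pages : emittedPages) {
            total += pages;
        }
        return total;
    }

    static int countLatinPages(List<Book> books) {
        // Map step: each "mapper" reads a book and emits its page count
        // only when the book is written in Latin; other books emit nothing.
        List<Integer> emitted = books.parallelStream()
                .filter(book -> "LATIN".equals(book.language))
                .map(book -> book.pages)
                .collect(Collectors.toList());
        return reduce(emitted);
    }

    public static void main(String[] args) {
        // 45 (ref 1) + 73 (ref 4) + 34 (ref 5)
        System.out.println(countLatinPages(library()) + " TOTAL");
    }
}
```

Running `main` prints `152 TOTAL`, the figure reached by the Reducer in the last step of the example.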
Hadoop architecture
head node
FI-WARE proposal: Cosmos Big Data
What is Cosmos?
• Cosmos is Telefónica's Big Data platform
  • Dynamic creation of private computing clusters as a service
  • Infinity, a cluster for persistent storage
• Cosmos is Hadoop ecosystem-based
  • HDFS as its distributed file system
  • Hadoop core as its MapReduce engine
  • HiveQL and Pig for querying the data
  • Oozie as remote MapReduce jobs and Hive launcher
• Plus other proprietary features
  • Infinity protocol (secure WebHDFS)
  • Cygnus, an injector for context data coming from Orion CB
Cosmos architecture
What can be done with Cosmos?
| What | Locally (ssh'ing into the Head Node) | Remotely (connecting your app) |
|---|---|---|
| Clusters operation | Cosmos CLI | REST API |
| I/O operation | 'hadoop fs' command | REST API (WebHDFS, HttpFS, Infinity protocol) |
| Querying tools (basic analysis) | Hive CLI | JDBC, Thrift* |
| MapReduce (advanced analysis) | 'hadoop jar' command | Oozie REST API |
Clusters operation: Getting your own Roman legion
Using the RESTful API (1)
Using the RESTful API (2)
Using the RESTful API (3)
Using the Python CLI
• Creating a cluster
  $ cosmos create --name <STRING> --size <INT>
• Listing all the clusters
  $ cosmos list
• Showing a cluster's details
  $ cosmos show <CLUSTER_ID>
• Connecting to the Head Node of a cluster
  $ cosmos ssh <CLUSTER_ID>
• Terminating a cluster
  $ cosmos terminate <CLUSTER_ID>
• Listing available services
  $ cosmos list-services
• Creating a cluster with specific services
  $ cosmos create --name <STRING> --size <INT> --services <SERVICES_LIST>
How to exploit the data: Commanding your Roman legion
1. Hadoop filesystem commands
• Hadoop general command
  $ hadoop
• Hadoop file system subcommand
  $ hadoop fs
• Hadoop file system options
  $ hadoop fs -ls
  $ hadoop fs -mkdir <hdfs-dir>
  $ hadoop fs -rmr <hdfs-file>
  $ hadoop fs -cat <hdfs-file>
  $ hadoop fs -put <local-file> <hdfs-dir>
  $ hadoop fs -get <hdfs-file> <local-dir>
• http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html
2. WebHDFS/HttpFS REST API
• List a directory
  GET http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS
• Create a new directory
  PUT http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=MKDIRS[&permission=<OCTAL>]
• Delete a file or directory
  DELETE http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=DELETE[&recursive=<true|false>]
• Rename a file or directory
  PUT http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=RENAME&destination=<PATH>
• Concat files
  POST http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CONCAT&sources=<PATHS>
• Set permission
  PUT http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETPERMISSION[&permission=<OCTAL>]
• Set owner
  PUT http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETOWNER[&owner=<USER>][&group=<GROUP>]
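All the operations listed above share the same URL layout, so a client can build them with one small helper. The following is a minimal sketch, not part of Cosmos or Hadoop: the class name `WebHdfsUrls`, the host `namenode.example.org` and the port `14000` are hypothetical values chosen for illustration, and parameter values are not URL-encoded.

```java
// Illustrative sketch only: builds WebHDFS-style URLs, performs no network I/O.
final class WebHdfsUrls {

    /**
     * Builds http://HOST:PORT/webhdfs/v1/PATH?op=OP[&key=value...].
     * Each extra parameter must already be in "key=value" form;
     * values are appended as given (no URL encoding in this sketch).
     */
    static String url(String host, int port, String path, String op,
                      String... extraParams) {
        // the HDFS path is appended right after the /webhdfs/v1/ prefix
        String cleanPath = path.startsWith("/") ? path.substring(1) : path;
        StringBuilder sb = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append("/webhdfs/v1/").append(cleanPath)
                .append("?op=").append(op);
        for (String param : extraParams) {
            sb.append('&').append(param);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // hypothetical host and port, used only to show the resulting URLs
        String host = "namenode.example.org";
        int port = 14000;
        System.out.println("GET " + url(host, port, "/user/frb/input", "LISTSTATUS"));
        System.out.println("PUT " + url(host, port, "/user/frb/newdir", "MKDIRS",
                "permission=755"));
        System.out.println("DELETE " + url(host, port, "/user/frb/old", "DELETE",
                "recursive=true"));
    }
}
```

The HTTP verb is kept outside the helper because, as the list shows, it depends on the operation (GET, PUT, POST or DELETE).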
2. WebHDFS/HttpFS REST API (cont.)
• Create a new file with initial content (two-step operation)
  PUT http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE[&overwrite=<true|false>][&blocksize=<LONG>][&replication=<SHORT>][&permission=<OCTAL>][&buffersize=<INT>]
  HTTP/1.1 307 TEMPORARY_REDIRECT
  Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE...
  Content-Length: 0
  PUT -T <LOCAL_FILE> http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE...
• Append to a file (two-step operation)
  POST http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=APPEND[&buffersize=<INT>]
  HTTP/1.1 307 TEMPORARY_REDIRECT
  Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND...
  Content-Length: 0
  POST -T <LOCAL_FILE> http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND...
2. WebHDFS/HttpFS REST API (cont.)
• Open and read a file (two-step operation)
  GET http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN[&offset=<LONG>][&length=<LONG>][&buffersize=<INT>]
  HTTP/1.1 307 TEMPORARY_REDIRECT
  Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=OPEN...
  Content-Length: 0
  GET http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=OPEN...
• http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
• HttpFS does not redirect to the Datanode but to the HttpFS server, hiding the Datanodes (and saving tens of public IP addresses)
  • The API is the same
  • http://hadoop.apache.org/docs/current/hadoop-hdfs-httpfs/index.html
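The two-step pattern used by the create, append and open operations can be sketched with the JDK's own HTTP client (Java 11 or later). This is a hedged sketch rather than an official client: the class `WebHdfsTwoStepCreate` and its method names are assumptions made for illustration, and the create URL and local file are expected as command-line arguments.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.util.Optional;

// Illustrative sketch only: a two-step "create file" against a WebHDFS-style API.
final class WebHdfsTwoStepCreate {

    /** Checks the answer to step 1 and returns the address to use in step 2. */
    static URI redirectTarget(int status, Optional<String> location) {
        if (status != 307 || !location.isPresent()) {
            throw new IllegalStateException(
                    "Expected 307 with a Location header, got status " + status);
        }
        return URI.create(location.get());
    }

    static int create(HttpClient client, String createUrl, Path localFile)
            throws IOException, InterruptedException {
        // Step 1: PUT without any data; the server answers with a redirection
        HttpRequest first = HttpRequest.newBuilder(URI.create(createUrl))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<Void> redirect =
                client.send(first, HttpResponse.BodyHandlers.discarding());
        URI target = redirectTarget(redirect.statusCode(),
                redirect.headers().firstValue("Location"));

        // Step 2: PUT the file content to the location returned in step 1
        HttpRequest second = HttpRequest.newBuilder(target)
                .PUT(HttpRequest.BodyPublishers.ofFile(localFile))
                .build();
        return client.send(second, HttpResponse.BodyHandlers.discarding())
                .statusCode();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: WebHdfsTwoStepCreate <create-url> <local-file>");
            return;
        }
        // the redirection is handled by hand, so automatic following is disabled
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        System.out.println("HTTP status: " + create(client, args[0], Path.of(args[1])));
    }
}
```

Since HttpFS exposes the same API, the same code should apply there as well; the only difference is that the `Location` header points back to the HttpFS server instead of a Datanode.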
3. Local Hive CLI
• Hive is a querying tool
• Queries are expressed in HiveQL, a SQL-like language
  • https://cwiki.apache.org/confluence/display/Hive/LanguageManual
• Hive uses pre-defined MapReduce jobs for
  • Column selection
  • Fields grouping
  • Table joining
  • …
• All the data is loaded into Hive tables
3. Local Hive CLI (cont.)
• Log on to the Master node
• Run the hive command
• Type your SQL-like sentence!
$ hive
Hive history file=/tmp/myuser/hive_job_log_opendata_XXX_XXX.txt
hive> select column1,column2,otherColumns from mytable where column1='whatever' and columns2 like '%whatever%';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Starting Job = job_201308280930_0953, Tracking URL = http://cosmosmaster-gi:50030/jobdetails.jsp?jobid=job_201308280930_0953
Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=cosmosmaster-gi:8021 -kill job_201308280930_0953
2013-10-03 09:15:34,519 Stage-1 map = 0%, reduce = 0%
2013-10-03 09:15:36,545 Stage-1 map = 67%, reduce = 0%
2013-10-03 09:15:37,554 Stage-1 map = 100%, reduce = 0%
2013-10-03 09:15:44,609 Stage-1 map = 100%, reduce = 33%
…
4. Remote Hive client
• Hive CLI is OK for human-driven testing purposes
  • But it is not usable by remote applications
• Hive has no REST API
• Hive has several drivers and libraries
  • JDBC for Java
  • Python
  • PHP
  • ODBC for C/C++
  • Thrift for Java and C++
  • https://cwiki.apache.org/confluence/display/Hive/HiveClient
• A remote Hive client usually performs:
  • A connection to the Hive server (TCP/10000)
  • The query execution
4. Remote Hive client – Get a connection
    // imports needed by this method
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    private Connection getConnection(String ip, String port, String user, String password) {
        try {
            // dynamically load the Hive JDBC driver
            Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        } catch (ClassNotFoundException e) {
            System.out.println(e.getMessage());
            return null;
        } // try catch

        try {
            // return a connection based on the Hive JDBC driver, default DB
            return DriverManager.getConnection("jdbc:hive://" + ip + ":" + port
                    + "/default?user=" + user + "&password=" + password);
        } catch (SQLException e) {
            System.out.println(e.getMessage());
            return null;
        } // try catch
    } // getConnection
https://github.com/telefonicaid/fiware-connectors/tree/develop/resources/hive-basic-client
4. Remote Hive client – Do the query
    // imports needed by this method
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    // con is assumed to be a Connection field obtained through getConnection()
    private void doQuery() {
        try {
            // from here on, everything is SQL!
            Statement stmt = con.createStatement();
            ResultSet res = stmt.executeQuery("select column1,column2,"
                    + "otherColumns from mytable where column1='whatever' and "
                    + "columns2 like '%whatever%'");

            // iterate on the result
            while (res.next()) {
                String column1 = res.getString(1);
                Integer column2 = res.getInt(2); // ResultSet has getInt, not getInteger
                // whatever you want to do with this row, here
            } // while

            // close everything
            res.close();
            stmt.close();
            con.close();
        } catch (SQLException ex) {
            // report the error instead of exiting silently with a success code
            System.out.println(ex.getMessage());
            System.exit(1);
        } // try catch
    } // doQuery
https://github.com/telefonicaid/fiware-connectors/tree/develop/resources/hive-basic-client
4. Remote Hive client – Plague Tracker demo
https://github.com/telefonicaid/fiware-connectors/tree/develop/resources/plague-tracker
5. MapReduce applications
• MapReduce applications are commonly written in Java
  • Can be written in other languages through Hadoop Streaming
• They are executed in the command line
  $ hadoop jar <jar-file> <main-class> <input-dir> <output-dir>
• A MapReduce job consists of:
  • A driver, a piece of software where to define inputs, outputs, formats, etc. and the entry point for launching the job
  • A set of Mappers, given by a piece of software defining its behaviour
  • A set of Reducers, given by a piece of software defining its behaviour
• There are two APIs
  • org.apache.hadoop.mapred, the old one
  • org.apache.hadoop.mapreduce, the new one
• Hadoop is distributed with MapReduce examples
  • [HADOOP_HOME]/hadoop-examples.jar
5. MapReduce applications – Map
    /* org.apache.hadoop.mapred example */
    public static class MapClass extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            /* use the input value, the input key is the offset within the
               file and it is not necessary in this example */
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);

            /* iterate on the string, getting each word */
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                /* emit an output (key,value) pair based on the word and 1 */
                output.collect(word, one);
            } // while
        } // map
    } // MapClass
5. MapReduce applications – Reduce
    /* org.apache.hadoop.mapred example */
    public static class ReduceClass extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;

            /* iterate on all the values and add them */
            while (values.hasNext()) {
                sum += values.next().get();
            } // while

            /* emit an output (key,value) pair based on the word and its count */
            output.collect(key, new IntWritable(sum));
        } // reduce
    } // ReduceClass
5. MapReduce applications – Driver
    /* org.apache.hadoop.mapred example */
    package my.org;

    import java.io.IOException;
    import java.util.*;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.util.*;

    public class WordCount {
        // MapClass and ReduceClass are declared "public static class",
        // so they are expected to be nested here

        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(WordCount.class);
            conf.setJobName("wordcount");
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);
            conf.setMapperClass(MapClass.class);
            conf.setCombinerClass(ReduceClass.class);
            conf.setReducerClass(ReduceClass.class);
            conf.setInputFormat(TextInputFormat.class);
            conf.setOutputFormat(TextOutputFormat.class);
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));
            JobClient.runJob(conf);
        } // main
    } // WordCount
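What the WordCount job computes can be followed locally, without a cluster, by simulating its three phases in plain Java. The sketch below is an assumption-laden illustration, not Hadoop code: the class `LocalWordCount` and its methods are made up, and the grouping step in `run` plays the role of the shuffle that Hadoop performs between the Mappers and the Reducers.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Illustrative sketch only: simulates map, shuffle and reduce in memory.
final class LocalWordCount {

    /** Map: emit one (word, 1) pair per token found in the line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    /** Reduce: add up all the values received for a single word. */
    static int reduce(String word, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    static Map<String, Integer> run(List<String> lines) {
        // Shuffle: group the emitted values by key (sorted, as a TreeMap)
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                        .add(pair.getValue());
            }
        }
        // Reduce each group into a single (word, count) pair
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("divide et impera", "divide and conquer")));
    }
}
```

With the two sample lines, `main` prints `{and=1, conquer=1, divide=2, et=1, impera=1}`.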
6. Launching tasks with Oozie
• Oozie is a workflow scheduler system to manage Hadoop jobs
  • Java map-reduce
  • Pig and Hive
  • Sqoop
  • System specific jobs (such as Java programs and shell scripts)
• Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) of actions
• Writing Oozie applications is about including in a package
  • The MapReduce jobs, Hive/Pig scripts, etc. (executable code)
  • A Workflow
  • Parameters for the Workflow
• Oozie can be used locally or remotely
• https://oozie.apache.org/docs/4.0.0/index.html#Developer_Documentation
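Because a workflow is a DAG of actions, a valid execution order always exists and can be computed with a topological sort. The sketch below only illustrates that idea; it is not the Oozie API nor the way Oozie workflows are written. The class `WorkflowDag` and the action names in `sampleWorkflow` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: orders the actions of a workflow DAG (Kahn's algorithm).
final class WorkflowDag {

    /** Hypothetical workflow: each action maps to the actions it depends on. */
    static Map<String, List<String>> sampleWorkflow() {
        Map<String, List<String>> dependsOn = new LinkedHashMap<>();
        dependsOn.put("mapreduce-job", Collections.<String>emptyList());
        dependsOn.put("hive-query", Arrays.asList("mapreduce-job"));
        dependsOn.put("pig-script", Arrays.asList("mapreduce-job"));
        dependsOn.put("export", Arrays.asList("hive-query", "pig-script"));
        return dependsOn;
    }

    static List<String> executionOrder(Map<String, List<String>> dependsOn) {
        // number of unfinished dependencies per action
        Map<String, Integer> pending = new LinkedHashMap<>();
        // reverse edges: action -> actions waiting for it
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
        }
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            for (String dependency : e.getValue()) {
                pending.putIfAbsent(dependency, 0);
                dependents.computeIfAbsent(dependency, k -> new ArrayList<>())
                        .add(e.getKey());
            }
        }

        // actions without pending dependencies can run right away
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String action = ready.poll();
            order.add(action);
            for (String waiting : dependents.getOrDefault(action,
                    Collections.<String>emptyList())) {
                if (pending.merge(waiting, -1, Integer::sum) == 0) {
                    ready.add(waiting);
                }
            }
        }
        if (order.size() != pending.size()) {
            throw new IllegalArgumentException("The workflow contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(executionOrder(sampleWorkflow()));
    }
}
```

For the sample workflow, `main` prints `[mapreduce-job, hive-query, pig-script, export]`; a graph with a cycle is rejected, which is the reason workflows must be acyclic.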
6. Launching tasks with Oozie – Java client
    // imports needed by this snippet
    import java.util.Properties;
    import org.apache.oozie.client.OozieClient;
    import org.apache.oozie.client.WorkflowJob;

    OozieClient client = new OozieClient("http://130.206.80.46:11000/oozie/");

    // create a workflow job configuration and set the workflow application path
    Properties conf = client.createConfiguration();
    conf.setProperty(OozieClient.APP_PATH, "hdfs://cosmosmaster-gi:8020/user/frb/mrjobs");
    conf.setProperty("nameNode", "hdfs://cosmosmaster-gi:8020");
    conf.setProperty("jobTracker", "cosmosmaster-gi:8021");
    conf.setProperty("outputDir", "output");
    conf.setProperty("inputDir", "input");
    conf.setProperty("examplesRoot", "mrjobs");
    conf.setProperty("queueName", "default");

    // submit and start the workflow job
    String jobId = client.run(conf);

    // wait until the workflow job finishes, printing the status every 10 seconds
    while (client.getJobInfo(jobId).getStatus() == WorkflowJob.Status.RUNNING) {
        System.out.println("Workflow job running ...");
        Thread.sleep(10 * 1000);
    } // while

    System.out.println("Workflow job completed");
Useful references
• Hive resources:
  • HiveQL language https://cwiki.apache.org/confluence/display/Hive/LanguageManual
  • How to create Hive clients https://cwiki.apache.org/confluence/display/Hive/HiveClient
  • Hive client example https://github.com/telefonicaid/fiware-connectors/tree/develop/resources/hive-basic-client
  • Plague Tracker demo https://github.com/telefonicaid/fiware-livedemoapp/tree/master/cosmos/plague-tracker
  • Plague Tracker instance http://130.206.81.65/plague-tracker/
• Hadoop filesystem commands:
  • http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html
• WebHDFS and HttpFS REST APIs:
  • http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
  • http://hadoop.apache.org/docs/current/hadoop-hdfs-httpfs/index.html
• Oozie:
  • https://oozie.apache.org/docs/4.0.0/index.html#Developer_Documentation
Cosmos place in FI-WARE: Typical scenarios
General IoT platform
[Architecture diagram. Labelled components: IoT Backend Device Management, CKAN, COSMOS (Big Data: data processing, data querying), Context Broker, CEP, GIS, Context Adapters, Service Orchestration, City Services, BI, ETL, Rules Definition, Operational Dashboard, KPI Governance, Open Data Portals, Short Term Historic, Real Time Processing, Accounting & Payment & Billing, IDM & Auth, Sensor 2 Things, IoT/Sensor Open Data. Labelled flows: measures / commands, subs, open data.]
You don’t have to use them all!
Real time context data persistence (architecture)
https://forge.fi-ware.eu/plugins/mediawiki/wiki/fiware/index.php/How_to_persist_Orion_data_in_Cosmos
https://github.com/telefonicaid/fiware-connectors/tree/develop/flume
Real time context data persistence (detail)
Real time context data persistence (examples)
• Information coming from city sensors
  • Presence map gradients, agglomerations…
  • Services usage distributions, top users (if available), top POIs, unused resources…
• Information generated by smartphones
  • Geolocation routes, map gradients, agglomerations…
  • Issues reporting: top neighbourhoods in incidents, criminality, noises, garbage, plagues…
• Any other real time information
  • Depending on your app, this could be product likes, product consumption, user-2-user feedback… recommendations, advertisement…
Roadmap: More functionalities and integrations
Roadmap
• Integrate the clusters creation with the cloud portal
  • No more REST API work
• Streaming analysis capabilities
  • Not all the analysis can wait for batch processing
• Geolocation analysis capabilities
  • An important source of data nowadays
• Integrate with CKAN
  • As a source of batch data
• Integrate with the Marketplace
  • Selling datasets
  • Selling analysis results
  • Selling applications and algorithms
http://fi-ppp.eu
http://fi-ware.eu
Follow @Fiware on Twitter!
Thanks!