PROACTIVE ENTERPRISE ANALYTIC SENSING
ARCHITECTURE
Deliverable nº: D2.2.2
EC‐GA Number: 612329
Project full title: The Proactive Sensing Enterprise
Work Package: WP2
Type of document: Prototype
Date: 07.07.2016
Grant Agreement No 612329
Partners: SINTEF
Responsible: SINTEF
Dissemination
level: Public
Title: D2.2.2 Proactive Enterprise Analytic
Sensing Architecture
Version: 1.0
Deliverable D2.2.2 Proactive Enterprise Analytic Sensing
Architecture
DUE DELIVERY DATE: M32
ACTUAL DELIVERY DATE: JULY 2016 (M33)
D2.2.2 Proactive Enterprise Analytic Sensing Architecture Page 1
Document History
Vers. Issue Date Content and changes Author(s)
0.1  21.06.2016  Initial draft based on D2.2.1 with revised document structure for the 2nd prototype.  Brian Elvesæter
0.2  23.06.2016  Updated all sections describing the 2nd prototype.  Brian Elvesæter
0.3  24.06.2016  Ready for internal peer review.  Brian Elvesæter
0.4  01.07.2016  Revisions based on feedback from internal review by Dominik Riemer.  Brian Elvesæter
0.5  05.07.2016  Revisions based on feedback from internal review by Nenad Stojanovic.  Brian Elvesæter
1.0  07.07.2016  Final formatting and layout. Ready for submission.  Brian Elvesæter
Document Authors
Partners Contributors
SINTEF Brian Elvesæter, Arne‐Jørgen Berre, Nicolas Ferry, Anatoly Vasilevskiy, Sobah Abbas Petersen
Dissemination Level
Public (PU)
Document Approvers
Partners Approvers
FZI Dominik Riemer
NISSA Nenad Stojanovic
Executive Summary
This deliverable documents the 2nd version of the software prototype implemented as a result of Task T2.2 "Sensing architecture" within the scope of Work Package 2 (WP2) "Observe Phase: Smart Sensing". The implementation is based on the revised ProaSense Conceptual Architecture of Deliverable D1.3. The software prototype consists of two main components that provide services for the Storage Layer and the Sensing Layer of the conceptual architecture.
The ProaSense Storage component provides the storage, query and registry functionality of the Storage Layer. For the 2nd release, bug fixes were applied to improve the storage functionality, while the query and registry functionality were enhanced.
o The Storage Writer Service is responsible for storing the sensor events (external data collected by adapters) and the ProaSense system events (internal data published by components of the ProaSense system).
o The Storage Reader Service provides a RESTful API for querying the stored event data (i.e., sensor events and ProaSense system events).
o The Storage Registry Service provides a RESTful API for querying the contextual information (i.e., sensor metadata, product information, etc.) that is modelled for a particular use case (i.e., MHWirth or HELLA).
The ProaSense Adapter Library component supports the Sensing Layer. It provides a Java library for developing specific adapters to acquire data from hardware sensors, software sensors and human sensors. For the 2nd release, further testing and bug fixes were applied to improve the adapter functionality, including finalizing a new adapter for the HELLA use case. A total of 11 specific adapters have been developed to support the ProaSense use cases:
o Three specific adapters were developed to support the MHWirth use case.
o Eight specific adapters were developed to support the HELLA use case.
Table of Contents
1. Introduction
1.1 Objectives of the Deliverable
1.2 Relations to other ProaSense Deliverables and Tasks
1.3 Structure of the Deliverable
2. Overview of the Sensing Architecture
2.1 Technical Overview
2.2 Evaluation Feedback and Issues Addressed
2.3 Context Model for Integration with Other ProaSense Components
3. Technical Specification of the 2nd Prototype
3.1 ProaSense Storage
3.1.1 Storage Writer Service
3.1.2 Storage Reader Service
3.1.3 Storage Registry Service
3.2 ProaSense Adapter Library
3.2.1 Generic adapters
3.2.2 Specific adapters
4. Implementation of the 2nd Prototype
4.1 Source code
4.1.1 ProaSense Storage
4.1.2 ProaSense Adapter Library
4.2 APIs and libraries
4.2.1 ProaSense Storage
4.2.2 ProaSense Adapter Library
4.3 Preparation of installation
5. Technical Validation
5.1 Testing
5.1.1 Mongo Management Studio
5.1.2 Benchmark test code
5.1.3 Testing RESTful services
5.1.4 Testing adapters
5.2 Benchmarking
5.2.1 Write performance
5.2.2 Kafka benchmark
5.2.3 Query response time
6. Conclusions
7. References
List of Figures
Figure 1‐1: ProaSense implemented architecture
Figure 2‐1: Sensing architecture components (sensing & storage layers)
Figure 2‐2: Identified issues in the human adapters
Figure 2‐3: Context Model implemented as an ontology in Protégé
Figure 3‐1: ProaSense Storage component and services (storage layer)
Figure 3‐2: Storage Writer Service – implementation design
Figure 3‐3: Storage Reader Service – implementation design
Figure 3‐4: Storage Registry Service – implementation design
Figure 3‐5: ProaSense Adapter Library (sensing layer)
Figure 3‐6: Generic adapters – implementation design
Figure 3‐7: Human adapter – simple generic web form
Figure 3‐8: Sensory data acquisition in the MHWirth use case
Figure 3‐9: Inspection report form
Figure 3‐10: Maintenance report form
Figure 3‐11: Sensory data acquisition in the HELLA use case
Figure 3‐12: Material certificate form
Figure 3‐13: Material change form
Figure 3‐14: Production plan form
Figure 4‐1: Maven project structure for the ProaSense Storage
Figure 4‐2: Maven project structure for a file adapter of the ProaSense Adapter Library
Figure 5‐1: Mongo Management Studio
Figure 5‐2: Storage Writer Service – test classes
Figure 5‐3: Storage Reader Service – Example of default query for simple events
Figure 5‐4: Storage Reader Service – Example of average query for simple events
Figure 5‐5: Laptop computer – Synchronous driver, bulksize = 1000, maxwait = 1000 ms
Figure 5‐6: Desktop computer – Synchronous driver, bulksize = 1000, maxwait = 1000 ms
Figure 5‐7: Cloud infrastructure w/Kafka – Synchronous driver, bulksize = 1000, maxwait = 1000 ms
Figure 5‐8: Query response times (average results)
Acronyms
Acronym Explanation
API Application Programming Interface
EDA Event Driven Architecture
HDD Hard Disk Drive
IETF Internet Engineering Task Force
IPR Intellectual Property Rights
JSON JavaScript Object Notation
OLE Object Linking and Embedding
OPC OLE for Process Control
RDF Resource Description Framework
REST Representational State Transfer
RPC Remote Procedure Call
SSD Solid State Drive
SSN Semantic Sensor Network
SOAP Simple Object Access Protocol
1. Introduction
The goal of Task T2.2 "Sensing architecture" of Work Package 2 (WP2) "Observe Phase: Smart Sensing" is to provide a Sensing Architecture prototype for the ProaSense system. The Sensing Architecture must be able to support large‐scale sensory input and provide scalable storage both for sensed data and for the ProaSense system components that process large amounts of data events.
1.1 OBJECTIVES OF THE DELIVERABLE
This deliverable presents the 2nd version of the software prototype implemented as a result of Task T2.2. The software prototype consists of two main components:
The ProaSense Storage, which is responsible for storage and query functionality. It provides services for storing all the sensor data events (external data) and system component events (internal data) of the ProaSense system.
The ProaSense Adapter Library, which provides a Java library for developing specific adapters to acquire data from hardware sensors, software sensors and human sensors.
1.2 RELATIONS TO OTHER PROASENSE DELIVERABLES AND TASKS
This deliverable is closely related to other deliverables that have already been submitted as part of WP2. It provides updated descriptions of the general architecture and implementation‐specific details of the 2nd software prototype resulting from Task T2.2. The prototype is an improved and extended version of the 1st version described in Deliverable D2.2.1. Amongst the extensions is support for the enterprise context model developed in Task T2.1 and documented in Deliverable D2.1 "Proactive Enterprise Sensing Logic Services and Management". Additional improvements and bug fixes have been made based on technical integration activities in WP6 and feedback from the first evaluation period in WP7.
The implementation of the prototype is aligned with the requirements and specifications defined in Deliverable D1.3 "Conceptual architecture and data format and protocols" and Deliverable D1.2 "Functional specification of the final prototype", which are the results of Task T1.4 "Conceptual architecture" and Task T1.3 "Data formats and protocols" respectively.
Figure 1‐1 shows the ProaSense implemented architecture and highlights the components that fall within the scope of Task T2.2 in WP2. The ProaSense Storage provides the storage capabilities of the Storage Layer, and the ProaSense Adapter Library supports development of sensor adapters in the Sensing Layer.
Figure 1‐1: ProaSense implemented architecture
1.3 STRUCTURE OF THE DELIVERABLE
This deliverable is structured as follows:
Section 2 provides an overview of the Sensing Architecture software components, summarises the feedback and issues addressed from the first evaluation, and presents the new technical requirements arising from the integration with other ProaSense software components.
Section 3 outlines the technical specification of the software prototype consisting of a high‐level architecture design diagram and more detailed specification of all subcomponents.
Section 4 describes implementation‐specific aspects of the software prototype by focusing on the software structure and 3rd party libraries being used in our software components.
Section 5 presents preliminary testing results and planned tests to be conducted in the respective tasks concerned with the presented software components.
Section 6 concludes this deliverable and presents the plan for the finalization steps for system integration and deployment of the 2nd version of the software prototype.
2. Overview of the Sensing Architecture
In this section, we give an overview of the Sensing Architecture. In subsection 2.1 we briefly describe the revised technical architecture and its main components. The following subsection 2.2 summarises the evaluation feedback from Task T7.5 and how the identified issues were addressed. Finally, in subsection 2.3 we present new technical requirements for the Storage Layer, in particular towards the Context Model for integration with other ProaSense components.
2.1 TECHNICAL OVERVIEW
The ProaSense implemented architecture shown in Figure 1‐1 above provides the starting point for the implementation of the sensing and storage layers. Figure 2‐1 below shows a revised1 technical architecture that zooms in on the software components of the sensing and storage layers:
The Sensing Layer provides data acquisition from external sensors and systems into the ProaSense system.
o Adapters provide data acquisition capabilities for retrieving data from hardware, software and human sensors. A library of general ProaSense adapters supporting different input types (i.e., OPC server, Web service, file, Oracle database, Twitter and human) allows for the development of specific sensory adapters.
The Storage Layer provides the storage, query and registry capabilities for the ProaSense system.
o The Storage Writer Service monitors the message broker for events that are published to the ProaSense system. A set of Event Listeners, which can be configured to listen to different topics, consumes the events that should be stored into the Scalable Storage.
o The Storage Reader Service provides a RESTful Query API that offers query capabilities for big data.
o The Storage Registry Service provides a RESTful Registry API that allows retrieving relevant contextual information for the industrial use cases, i.e., the list of sensors and sensor properties, the list of products, etc. A separate context model that contains the relevant information has been defined for each use case.
1 The technical architecture figure from Deliverable D2.2.1 has been revised so that it matches the ProaSense implemented architecture figure.
Figure 2‐1: Sensing architecture components (sensing & storage layers)
2.2 EVALUATION FEEDBACK AND ISSUES ADDRESSED
The first evaluation feedback gathered as part of Task T7.5 focused on the ProaSense components with a graphical user interface (GUI). Most of the components developed as part of Task T2.2 do not provide a GUI towards human users directly, but instead provide services through an event‐based interface or a RESTful interface intended for integration with other software components of the ProaSense system. Thus, most of these sensing architecture components were out of scope for this first evaluation.
ProaSense provides a general adapter library targeted at software developers that can be used to develop specific adapters, including human adapters for different systems and use cases. In addition to the general adapter library, SINTEF has also been responsible for developing a number of specific adapters to support the two use case partners in ProaSense. A total of 11 specific adapters were developed. Five of these specific adapters were developed as human adapters, i.e., functional prototypes with a simple web‐based GUI interface to collect information to be entered by human operators.
Two human adapters were developed for the MHWirth use case and three human adapters were developed for the HELLA use case. These human adapters were subject to the first evaluation in Task T7.3, which focused on scripted trials where evaluators were presented with several scenarios and, at each step, identified usability issues based on several heuristics, along with a textual description and severity level. Additionally, after finishing the scenarios, the evaluators were asked, using questionnaires, about the ergonomics and ease of use of the platform, the user interface and their stress level when using the component.
A number of issues were identified for the human adapters, as summarised in Figure 2‐2. Based on the understanding that the human adapters were developed as functional prototypes, the most important issues (i.e., those marked as catastrophe and major) were chosen to be addressed and fixed for the 2nd release of the human adapters, whereas minor and cosmetic issues were disregarded. Catastrophe issues were found to be mostly related to the use of incompatible web browsers. Examples of major issues were "missing option in a drop‐down menu" and "lack of confirmation after pushing submit button".
Figure 2‐2: Identified issues in the human adapters
2.3 CONTEXT MODEL FOR INTEGRATION WITH OTHER PROASENSE COMPONENTS
New technical requirements for query functionality in the Storage Layer were identified as part of the integration activities of WP6. In particular, there was a need to make the Context Model developed in Task T2.1 available as a queryable software model so that relevant contextual information for each business use case could be retrieved. Examples of queries required by the WP5 components were:
Get list of sensors and their properties
Get list of products and their properties
Get list of injection moulding machines and their properties (specific to the HELLA use case)
Get list of moulds and their properties (specific to the HELLA use case)
Deliverable D2.1 "Proactive Enterprise Sensing Logic Services and Management" describes the development of an enterprise model for proactive sensing enterprises. The generic model was developed to identify the technological perspectives and their relevance to the business. The generic model is meant to be instantiated for each specific business use case. The instantiation is implemented as an ontology model in Protégé2 as seen in Figure 2‐3. This Context Model contains information about
2 http://protege.stanford.edu/
sensor data and other contextual information relevant to each business use case (e.g., products and moulds in the HELLA use case).
Figure 2‐3: Context Model implemented as an ontology in Protégé
In order to make the Context Model available to other ProaSense system components, we needed to make it available as a queryable information service. For this purpose we decided to replace the Storage Registry Service from the 1st prototype, which was based on SensApp3 [Mosser, et al. 2012], with a new service based on Apache Fuseki4, which serves the RDF5 ontology data defined in Protégé over a RESTful API6 through the SPARQL7 query language.
However, to make it easier to query and retrieve relevant information from the Context Model, we did not want to force other ProaSense system components to use SPARQL. Thus, we implemented a simple wrapper service on top of Apache Fuseki that provides an easier‐to‐use RESTful API based on the JSON8 data format. System developers and integrators familiar with RDF and SPARQL technologies can of course use the SPARQL endpoint provided by Apache Fuseki directly.
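The two access routes can be sketched as follows. This is an illustration only: the namespace, dataset name, endpoint paths and ontology property names below are assumptions and not the actual ProaSense API; only the Fuseki default port (3030) is taken from the Fuseki documentation.

```java
// Sketch of the two ways a client could retrieve the sensor list from the
// Context Model service. All names below are hypothetical.
public class ContextModelQueries {

    // Direct route: a SPARQL query sent to the Fuseki endpoint.
    static String sensorListSparql() {
        return String.join("\n",
            "PREFIX pa: <http://www.proasense.eu/context#>",  // hypothetical namespace
            "SELECT ?sensor ?property",
            "WHERE { ?sensor a pa:Sensor ; pa:hasProperty ?property . }");
    }

    // Hypothetical Fuseki SPARQL endpoint (3030 is the Fuseki default port;
    // the "context" dataset name is assumed).
    static String fusekiEndpoint(String host) {
        return "http://" + host + ":3030/context/sparql";
    }

    // Hypothetical JSON wrapper endpoint that hides SPARQL from clients.
    static String wrapperEndpoint(String host) {
        return "http://" + host + ":8080/registry/sensors";
    }

    public static void main(String[] args) {
        System.out.println(sensorListSparql());
        System.out.println(fusekiEndpoint("localhost"));
        System.out.println(wrapperEndpoint("localhost"));
    }
}
```

The wrapper route returns plain JSON, so clients need no RDF tooling, while the SPARQL route keeps the full expressiveness of the ontology available.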
Another requirement was the need for batch processing, in particular to calculate average values and sums to show daily, weekly and monthly aggregates in the HELLA use case.
3 https://github.com/SINTEF‐9012/sensapp/wiki
4 https://jena.apache.org/documentation/fuseki2/
5 https://www.w3.org/RDF/
6 https://jena.apache.org/documentation/serving_data/
7 https://www.w3.org/TR/sparql11‐overview/
8 http://www.json.org/
3. Technical Specification of the 2nd Prototype
In this section, we elaborate on the technical specification and design choices of the 2nd prototype. Subsection 3.1 covers the specification for the ProaSense Storage and subsection 3.2 covers the specification for the ProaSense Adapter Library.
3.1 PROASENSE STORAGE
The ProaSense Storage component represents the scalable storage that is responsible for the storage capability of the ProaSense platform. The storage component must be able to 1) scale and absorb large amounts of time series sensor data collected by adapters, 2) store events produced by the ProaSense system components and 3) provide responsive storage/query capabilities to the analytics layer components, including batch processing (e.g., aggregate values per day, week or month).
Figure 3‐1: ProaSense Storage component and services (storage layer)
Figure 3‐1 above highlights the services that are part of the design and implementation of the ProaSense Storage component. The following subsections provide more details on the specification of the three storage services (i.e., writer, reader and registry).
3.1.1 STORAGE WRITER SERVICE
3.1.1.1 EVENT TYPES AND DEFINITIONS
The ProaSense Storage must be able to store all the specified event types used in the ProaSense system. The real‐time processing of the ProaSense system is built on top of an event‐driven architecture (EDA) in which a message broker is used to pass message events between different system components. These events follow a well‐defined message definition that has been defined in collaboration between the ProaSense technical partners. Table 1 summarises the different event types used in the ProaSense system. The right column contains the event definition in the Apache Thrift9 schema definition language10 that we are using for the ProaSense system.
Table 1: ProaSense event types

Simple event
Simple event types are used to encode all types of sensory input data by the ProaSense adapters. The event structure has been defined to be flexible. Each simple event has a timestamp, a sensor ID and can have an arbitrary number of event properties, i.e., to encode different measurements.

enum VariableType { LONG, STRING, DOUBLE, BLOB, BOOLEAN }

struct ComplexValue {
    1: optional string value;
    2: optional list<string> values;
    3: required VariableType type;
}

struct SimpleEvent {
    1: required long timestamp;
    2: required string sensorId;
    3: required map<string,ComplexValue> eventProperties;
}

Derived event
Derived event types are produced by the enricher and CEP components of the ProaSense system. The event structure is similar to the simple event, but contains two additional mandatory properties indicating the component ID and the event name.

struct DerivedEvent {
    1: required long timestamp;
    2: required string componentId;
    3: required string eventName;
    4: required map<string,ComplexValue> eventProperties;
}

Predicted event
Predicted events are produced by the prediction component of the ProaSense system.

enum PDFType { EXPONENTIAL, HISTOGRAM }

struct PredictedEvent {
    1: required long timestamp;
    2: required PDFType pdfType;
    3: required map<string,ComplexValue> eventProperties;
    4: required list<double> params;
    5: optional list<long> timestamps;
    6: required string eventName;
}

Anomaly event
Anomaly events are produced by the online and offline analysis components of the ProaSense system.

struct AnomalyEvent {
    1: required long timestamp;
    2: required string anomalyType;
    3: required string blob;
}

Recommendation event
Recommendation events are produced by the recommendation component of WP4.

struct RecommendationEvent {
    1: required string recommendationId;
    2: required string action;
    3: required long timestamp;
    4: required string actor;
    5: required map<string,ComplexValue> eventProperties;
    6: required string eventName;
}

Feedback event
Feedback events are produced by the business components of WP5.

enum Status { SUGGESTED, IMPLEMENTED, SUCCESSFUL, UNSUCCESSFUL }

struct FeedbackEvent {
    1: required string actor;
    2: required long timestamp;
    3: required Status status;
    4: optional string comments;
    5: required string recommendationId;
}

9 http://thrift.apache.org/
10 Apache Thrift is a framework for defining RPC systems. It provides a schema definition language that is used for defining message formats. The schema is compiled into platform‐native structures (i.e., classes and structs) and appropriate (de)serialization helpers.
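To illustrate how the flexible SimpleEvent structure carries measurements, the following sketch mirrors the Thrift definition with hand-written Java stand-ins. In the actual system these classes are generated by the Thrift compiler; the sensor ID and property names here are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-ins mirroring the Thrift SimpleEvent definition.
// Only the fields needed for the illustration are included.
enum VariableType { LONG, STRING, DOUBLE, BLOB, BOOLEAN }

class ComplexValue {
    String value;         // optional in the Thrift schema
    VariableType type;    // required
    ComplexValue(String value, VariableType type) {
        this.value = value;
        this.type = type;
    }
}

class SimpleEvent {
    long timestamp;       // required
    String sensorId;      // required
    Map<String, ComplexValue> eventProperties = new HashMap<>();
    SimpleEvent(long timestamp, String sensorId) {
        this.timestamp = timestamp;
        this.sensorId = sensorId;
    }
}

public class SimpleEventExample {
    public static void main(String[] args) {
        // A hypothetical drilling-sensor reading with two event properties.
        SimpleEvent event = new SimpleEvent(System.currentTimeMillis(), "mhwirth.drilling.rpm");
        event.eventProperties.put("value", new ComplexValue("118.4", VariableType.DOUBLE));
        event.eventProperties.put("unit", new ComplexValue("rpm", VariableType.STRING));
        System.out.println(event.sensorId + " carries " + event.eventProperties.size() + " properties");
    }
}
```

Because eventProperties is an open map, an adapter can attach one measurement or many to the same event without changing the schema.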
3.1.1.2 IMPLEMENTATION DESIGN
Figure 3‐2: Storage Writer Service – implementation design
Figure 3‐2 above illustrates the design of the Storage Writer Service. The service has been designed as a multi‐threaded Java application. The main class StorageWriterMongoService represents an instance of the service which contains a set of running threads:
One or more event listener threads.
o An event listener can be either of class EventListenerKafkaFilter or of class EventListenerKafkaTopic. The Filter class listens for multiple topics defined via a filter string, while the Topic class listens for a specific topic defined by a string.
One or more event writer threads.
o An event writer can be either of class EventWriterMongoSync or of class EventWriterMongoAsync. The Sync class uses the MongoDB synchronous driver to write to the database, while the Async class uses the MongoDB asynchronous driver.
One event heartbeat thread.
o This thread implements a heartbeat event that signals the event writer to flush data in the writer queue to the database in case there is a stop in the input stream.
Additionally, the Storage Writer Service is dependent on a base module that contains a set of helper classes (i.e., EventConverter, EventDocument, EventDocumentConvert and EventProperties) that provide functionality to convert between the message format used in the ProaSense system (i.e., Apache Thrift events) and the document format used in the storage component (i.e., MongoDB documents).
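The role of these converter helpers can be sketched as follows. This is a simplified illustration that uses a plain Map in place of a MongoDB document; the field layout and method name are assumptions, not the actual base-module implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of converting an event into a flat key/value document suitable for
// storage. The "timestamp" and "sensorId" field names follow the Thrift
// SimpleEvent schema; the document layout itself is assumed for illustration.
public class EventToDocumentSketch {

    static Map<String, Object> toDocument(long timestamp, String sensorId,
                                          Map<String, String> eventProperties) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("timestamp", timestamp);
        doc.put("sensorId", sensorId);
        // Each event property becomes a field of the stored document.
        doc.putAll(eventProperties);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("value", "118.4");
        System.out.println(toDocument(1467900000000L, "mhwirth.drilling.rpm", props));
    }
}
```

The reverse direction, rebuilding a Thrift event from a stored document, would walk the same fields in the opposite order.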
The Storage Writer Service has been designed to be configurable and provides a set of properties that can be configured to optimize the write performance. Table 2 below shows relevant configurable server properties for the writer service.
Table 2: Storage Writer Service – server properties
Property (value: default or example) – Description
zookeeper.connect (192.168.1.111:2181) – Kafka connection setting for Kafka consumers.
kafka.bootstrap.servers (192.168.1.111:9092) – Kafka connection setting for Kafka producers.
proasense.storage.event.[event_type].listeners (1) – Number of event listener threads to run for the indicated [event_type] (i.e., simple, derived, predicted, anomaly, recommendation and feedback).
proasense.storage.event.[event_type].topic (eu.proasense.internal.sensing.(mhwirth.*|hella.*)) – The topic(s) to listen for. The example value is a filter string that matches events published to the topics eu.proasense.internal.sensing.mhwirth.* and eu.proasense.internal.sensing.hella.*.
proasense.storage.event.[event_type].filter (true) – Boolean value that indicates whether the topic should be read as a filter string or a specific topic string.
proasense.storage.mongodb.url (mongodb://127.0.0.1:27017) – The URL of the MongoDB database. Default is to run the writer service on the same machine as the database.
proasense.storage.mongodb.writers (1) – Number of event writer threads to run for this service instance.
proasense.storage.mongodb.bulksize (1000) – The number of events to queue before writing to storage. Write performance can be significantly improved by doing bulk writes.
proasense.storage.mongodb.maxwait (100) – The number of ms to wait before doing the bulk write. This ensures that the queue is written to storage in a timely manner when fewer events than the bulk size have been received, e.g., when a sensor stops sending events.
proasense.storage.mongodb.syncdriver (true) – Boolean value that indicates whether to use the MongoDB synchronous or asynchronous driver.
The design of the service and its configurable server properties give users flexibility in tuning the write performance for each specific business use case. For instance, if a business use case involves a large number and range of sensors with varying sampling rates, the Storage Writer Service can be configured accordingly. The configuration is done at deployment time.
Specific topics and listeners for sensor data with sampling intervals above 10 seconds should be configured so that events are written to storage as soon as they arrive, i.e., setting bulksize = 1.
Specific topics and listeners for numerous sensors with sampling intervals in the ms range should be configured with a suitable bulksize. Moreover, depending on the number of sensors, multiple listeners and writers can be configured in order to improve both the consumption of messages from the message broker and the write performance to the storage.
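Since the bulksize, maxwait and writer settings apply per service instance, the two strategies above would typically be realised as two separately configured Storage Writer Service instances. A sketch, where the concrete topic names are made-up examples:

```properties
# Instance A: slow sensors (sampling interval above 10 s), write immediately
proasense.storage.event.simple.listeners=1
proasense.storage.event.simple.topic=eu.proasense.internal.sensing.mhwirth.slow
proasense.storage.event.simple.filter=false
proasense.storage.mongodb.bulksize=1

# Instance B: many fast sensors (ms sampling), bulk writes with a short maxwait
proasense.storage.event.simple.listeners=4
proasense.storage.event.simple.topic=eu.proasense.internal.sensing.hella.*
proasense.storage.event.simple.filter=true
proasense.storage.mongodb.writers=4
proasense.storage.mongodb.bulksize=1000
proasense.storage.mongodb.maxwait=100
```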
3.1.2 STORAGE READER SERVICE
3.1.2.1 IMPLEMENTATION DESIGN
Figure 3‐3: Storage Reader Service – implementation design
Figure 3‐3 above illustrates the design of the Storage Reader Service. The service has been designed as a Java Web application. The main class StorageReaderMongoService represents an instance of the service, which spawns a thread of class EventReaderMongoSync for each incoming query request. The query can be of type default, average, minimum or maximum, corresponding to the standard queries described in subsection 3.1.2.2 for each of the event types (i.e., simple, derived, predicted, anomaly, recommendation and feedback). The default query returns a list of events in the Apache Thrift message format. The conversion from the storage document format to the Apache Thrift message format is done via the helper classes in the base storage module.
Table 3 below shows relevant configurable server properties for the deployment of the storage reader service.
Table 3: Storage Reader Service – server properties
Property Value (default or example) Description
proasense.storage.mongodb.url mongodb://127.0.0.1:27017 The URL to the deployed MongoDB database.
proasense.storage.mongodb.database proasense_db The name of the MongoDB database to use.
3.1.2.2 API DOCUMENTATION FOR QUERIES
The ProaSense Storage must provide query capabilities for all the specified event types used in the ProaSense system. Standard queries such as sensor and event data for a specific time period are supported. Additionally, it is possible to query the average, minimum and maximum values of the sensor and event data for a specific time period. Finally, batch processing is currently being set up for calculating average values and sums to show daily, weekly and monthly aggregates in the HELLA use case.
The Storage Reader Service supports a request/response interaction through a REST API. Specific REST methods can be designed for easy access to specific data resources required by the ProaSense system components.
The reader service provides a simple API for queries. To use the API, you invoke an HTTP GET method on the Web resource
[storage_reader_url]/query/[event_type]/[operation_type]
indicating the event type (i.e., simple, derived, predicted, anomaly, recommendation and feedback) and the operation type (i.e., default, average, minimum and maximum). The query parameters vary slightly depending on the event type. Table 4 below describes the API in more detail. The response type for all query methods is JSON. Examples of query testing can be seen in subsection 5.1.3.
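As a sketch, a client might assemble such a query URL as follows. The class name, host, port and parameter values are examples, not part of the actual ProaSense code:

```java
// Illustrative sketch of building a Storage Reader Service query URL of
// the form [storage_reader_url]/query/[event_type]/[operation_type]?...
public class QueryUrlSketch {

    public static String buildQuery(String baseUrl, String eventType, String operationType,
                                    String sensorId, long startTime, long endTime, String propertyKey) {
        return baseUrl + "/query/" + eventType + "/" + operationType
                + "?sensorId=" + sensorId
                + "&startTime=" + startTime
                + "&endTime=" + endTime
                + "&propertyKey=" + propertyKey;
    }

    public static void main(String[] args) {
        // Average value of a (hypothetical) temperature property over one hour
        String url = buildQuery("http://127.0.0.1:8080/storage-reader", "simple", "average",
                "temperatureSensor1", 1420070400000L, 1420074000000L, "temperature");
        System.out.println(url);
    }
}
```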
Table 4: Storage Reader Service – REST API
Method Resource Description
GET /query/simple/default Lists the simple events for the specified time period.
Query parameters:
o sensorId – The sensor ID.
o startTime – The start of the time period.
o endTime – The end of the time period.
Output: List of simple events.
GET /query/simple/average
/query/simple/minimum
/query/simple/maximum
Calculates the average, minimum or maximum value over the specified time period for the specified property.
Query parameters:
o sensorId – The sensor ID.
o startTime – The start of the time period.
o endTime – The end of the time period.
o propertyKey – The property for which to calculate.
Output: The calculated value.
GET /query/derived/default Lists the derived events for the specified time period.
Query parameters:
o componentId – The component ID.
o startTime – The start of the time period.
o endTime – The end of the time period.
Output: List of derived events.
GET /query/derived/average
/query/derived/minimum
/query/derived/maximum
Calculates the average, minimum or maximum value over the specified time period for the specified property.
Query parameters:
o componentId – The component ID.
o startTime – The start of the time period.
o endTime – The end of the time period.
o propertyKey – The property for which to calculate.
Output: The calculated value.
GET /query/predicted/default Lists the predicted events for the specified time period.
Query parameters:
o eventName – The event name (optional filter).
o startTime – The start of the time period.
o endTime – The end of the time period.
Output: List of predicted events.
GET /query/predicted/average
/query/predicted/minimum
/query/predicted/maximum
Calculates the average, minimum or maximum value over the specified time period for the specified property.
Query parameters:
o eventName – The event name (optional filter).
o startTime – The start of the time period.
o endTime – The end of the time period.
o propertyKey – The property for which to calculate.
Output: The calculated value.
GET /query/anomaly/default Lists the anomaly events for the specified time period.
Query parameters:
o anomalyType – The anomaly type (optional filter).
o startTime – The start of the time period.
o endTime – The end of the time period.
Output: List of anomaly events.
GET /query/recommendation/default Lists the recommendation events for the specified time period.
Query parameters:
o recommendationId – The recommendation ID (optional filter).
o actor – The actor ID (optional filter).
o eventName – The event name (optional filter).
o startTime – The start of the time period.
o endTime – The end of the time period.
Output: List of recommendation events.
GET /query/recommendation/average
/query/recommendation/minimum
/query/recommendation/maximum
Calculates the average, minimum or maximum value over the specified time period for the specified property.
Query parameters:
o startTime – The start of the time period.
o endTime – The end of the time period.
o propertyKey – The property for which to calculate.
Output: The calculated value.
GET /query/feedback/default Lists the feedback events for the specified time period.
Query parameters:
o actor – The actor ID (optional filter).
o startTime – The start of the time period.
o endTime – The end of the time period.
o status – The status type (optional filter).
o recommendationID – The recommendation ID (optional filter).
Output: List of feedback events.
3.1.3 STORAGE REGISTRY SERVICE
3.1.3.1 IMPLEMENTATION DESIGN
The 1st release of the Storage Registry Service was based on SensApp. For the 2nd release the Context Model defined in Task T2.1, implemented as an ontology model in Protégé, was used. The ontology model is based on the SSN (Semantic Sensor Network) ontology to model sensors. The Context Model is exported as an RDF11 dataset from Protégé and imported into Apache Fuseki12, which serves the RDF dataset over a REST API13 using the SPARQL14 query language. As explained in section 2.3, we have implemented a simple wrapper service on top of Apache Fuseki that provides an easier‐to‐use RESTful API based on JSON data formats.
Figure 3‐4 below illustrates the design of the Storage Registry Service. The service has been designed as a Java Web application. The main class StorageRegistryFusekiService represents an instance of the service and acts as a RESTful service wrapper on top of Apache Fuseki. The registry service sends specific SPARQL queries to Apache Fuseki (e.g., list sensors and list products), processes the response from Apache Fuseki into a structured JSON format and sends that as an HTTP response to the ProaSense system component invoking the registry service.
Figure 3‐4: Storage Registry Service – implementation design
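As an illustration of the kind of query the wrapper sends to Apache Fuseki, a SPARQL query listing registered sensors might look as follows. The exact ontology terms depend on the Context Model; ssn:Sensor is taken from the SSN ontology the model is based on, and the namespace prefix is an assumption:

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ssn: <http://purl.oclc.org/NET/ssnx/ssn#>

# List the IDs of all registered sensors in the dataset
SELECT ?sensor
WHERE {
  ?sensor rdf:type ssn:Sensor .
}
```

The registry service would then convert the SPARQL result rows into the structured JSON format it returns to the caller.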
Table 5 below shows relevant configurable server properties for the deployment of the storage registry service.
Table 5: Storage Registry Service – server properties
Property Value (default or example) Description
proasense.storage.fuseki.sparql.url http://192.168.84.88:8080/fuseki The URL to the deployed Apache Fuseki service.
proasense.storage.fuseki.dataset.default hella The default dataset (context model) to use.
11 https://www.w3.org/RDF/
12 https://jena.apache.org/documentation/serving_data/
13 https://jena.apache.org/documentation/serving_data/
14 https://www.w3.org/TR/sparql11‐overview/
3.1.3.2 API DOCUMENTATION FOR QUERIES
The ProaSense Storage must be able to provide query capabilities to retrieve relevant contextual information for the use cases. The Storage Registry Service supports a request/response interaction through a REST API. Specific REST methods can be designed for easy access to specific data resources, including information about sensors and the schema of the sensor data collected, as well as equipment and product information.
The registry service provides a simple API for queries. To use the API you invoke an HTTP GET method on the Web resource
[storage_registry_url]/query/[resource_type]/[operation_type]
indicating the resource type (i.e., sensor, machine, product and mould) and the operation type (i.e., list and properties). Table 6 below describes the API in more detail. The response type for all query methods is JSON.
Table 6: Storage Registry Service – REST API
Method Resource Description
GET /query/sensor/list Lists registered sensors.
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
Output: List of sensor IDs.
GET /query/sensor/properties Lists the sensor properties, including a description of the event schema (i.e., the simple event properties) captured by the sensor.
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
o sensorId – The sensor ID.
Output: List of sensor properties.
GET /query/machine/list Lists registered machines (i.e., injection moulding machines in the HELLA use case).
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
Output: List of machine IDs.
GET /query/machine/properties Lists the machine properties.
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
o machineId – The machine ID.
Output: List of machine properties.
GET /query/product/list Lists registered products (i.e., headlight products in the HELLA use case).
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
Output: List of product IDs.
GET /query/product/properties Lists the product properties.
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
o productId – The product ID.
Output: List of product properties.
GET /query/mould/list Lists registered moulds (i.e., headlight moulds in the HELLA use case).
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
Output: List of mould IDs.
GET /query/mould/properties Lists the mould properties.
Query parameters:
o Dataset – Optional parameter to indicate which contextual model/dataset to use (i.e., mhwirth or hella). If no parameter is given, then the default model (specified as part of the deployment setting) will be used.
o mouldId – The mould ID.
Output: List of mould properties.
3.2 PROASENSE ADAPTER LIBRARY
The sensing layer deals with data acquisition: it senses relevant sources, transforms data into a well‐defined format useful for further analysis and pushes the data to the ProaSense system. This layer implements several adapters based on the adapter requirements collected from the MHWirth and HELLA business use cases.
Figure 3‐5: ProaSense Adapter Library (sensing layer)
Figure 3‐5 above highlights the adapters that are part of the design and implementation of the ProaSense Adapter Library. We use the terms hardware, software and human sensors to classify sensory input from three different types of sources in a broad sense:
Hardware sensors refer to 1) sensor/process data (e.g. temperature, RPM and pressure) from DCS, SCADA and PLC systems and 2) other sensor data produced by standalone physical sensors (e.g. temperature and environmental sensors).
Software sensors refer to information from software applications that produce information relevant for the ProaSense decision making. This includes enterprise information systems such as ERP and other business applications (e.g. maintenance schedules and reports in the MHWirth use case and scrap reports in the HELLA use case), as well as business context data from the social Web (e.g., Facebook, Twitter, LinkedIn, etc).
Human sensors refer to relevant data and information that cannot be acquired automatically through hardware or software sensors, because the information is not easily accessible through existing system interfaces and thus has to be entered manually by human operators. Examples of such information are manual reports, observations and contextual data provided by a user.
However, in order to implement adapters that retrieve data from these three types of sensory sources, we need a framework that allows us to connect to different types of system interfaces. The Sensing
Architecture includes a library of generic adapters that supports different types of inputs. The following subsections provide more details on the specification of these adapters.
3.2.1 GENERIC ADAPTERS
We are developing a Java library of general adapters that can be used to develop specific adapters. The general adapters will support different types of inputs (file, Oracle database, Web Service, OPC, human and Twitter). Each of these adapters will extend a base adapter that contains the necessary code to connect to the ProaSense system using the Kafka message broker.
Figure 3‐6: Generic adapters – implementation design
Figure 3‐6 above illustrates the design of the generic adapters. Each generic adapter inherits from the super class AbstractBaseAdapter, which contains an output port of type KafkaProducerOutput that publishes simple events to the Kafka message broker. Each adapter in turn specifies an input port corresponding to the type of adapter, i.e., file, Oracle, Web Service, OPC, Web form or Twitter. As such each generic adapter will contain one input port which consumes data from a source system and one output port that publishes simple events to the ProaSense system. Additionally, each adapter is configurable through an adapter properties file.
A file adapter monitors folders/files and processes the contents of any newly created files. This can be used to monitor a file folder with event logs or process file reports generated from enterprise systems.
An Oracle adapter queries an Oracle database at regular intervals (e.g., every 30 seconds) and processes the result of the query. This can be used to retrieve data from the databases of enterprise systems.
A Web Service adapter provides a Web endpoint where sensors can publish their data. This can be used to retrieve data from sensors with the ability to push/archive data, e.g., XML data via the SOAP protocol.
An OPC adapter allows you to connect to an OPC server. This can be used to retrieve data from PLC systems that support the OPC protocol.
A human adapter provides a simple Web form to retrieve data from a human operator. A generic Web form (see Figure 3‐7) can be used, but it is highly recommended to create specific Web forms that ensure correctness of the input entered by human operators.
A Twitter adapter searches Twitter for tweets containing specific hashtags. This can be used to extract relevant social data from specific tweets.
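As an illustration of one of these adapters, the Oracle adapter's polling pattern can be sketched as follows. The class, method and field names are illustrative, and an in-memory list stands in for the Oracle database; a real adapter would run poll() on a timer (e.g., every 30 seconds) and publish each new row as a simple event:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of incremental polling: each poll fetches only the
// rows added since the previous poll, tracked by a monotonically
// increasing row position. An in-memory list stands in for the database.
public class OraclePollSketch {
    private final List<String> table = new ArrayList<>(); // stand-in for the DB table
    private int lastRow = 0; // number of rows already processed

    public void insertRow(String row) {
        table.add(row);
    }

    // Returns the rows added since the previous poll.
    public List<String> poll() {
        List<String> newRows = new ArrayList<>(table.subList(lastRow, table.size()));
        lastRow = table.size();
        return newRows;
    }

    public static void main(String[] args) {
        OraclePollSketch adapter = new OraclePollSketch();
        adapter.insertRow("row1");
        adapter.insertRow("row2");
        System.out.println(adapter.poll()); // first poll returns both rows
        adapter.insertRow("row3");
        System.out.println(adapter.poll()); // second poll returns only row3
    }
}
```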
In order to develop a specific adapter for ProaSense, a developer will need to:
1. Extend one of the general adapters with the necessary Java code for translating input data to the ProaSense‐specific simple events (see Listing 1 below). In the case of a human adapter, it is also recommended to define a specific Web form for the human operator to enter the data in question. The simple event structure defines three mandatory properties:
a. timestamp – time in milliseconds
b. sensorId – a unique ID for the sensor
c. eventProperties – a list of properties coded as {key, value, type}15
2. Configure the adapter.properties file with the correct properties.
3. Compile and deploy the adapter (build scripts are provided).
Listing 1: Definition of the simple event structure
enum VariableType {
LONG,
STRING,
DOUBLE,
BLOB
}
struct ComplexValue {
1: optional string value;
2: optional list<string> values;
3: required VariableType type;
}
struct SimpleEvent {
1: required long timestamp;
2: required string sensorId;
3: required map<string,ComplexValue> eventProperties;
}
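The translation in step 1 can be sketched as follows. Plain Java maps stand in for the Thrift-generated SimpleEvent and ComplexValue classes, and the CSV layout "timestamp,sensorId,key=value" is a made-up example input format:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of translating an input line into the three
// mandatory simple-event fields (timestamp, sensorId, eventProperties).
public class CsvToSimpleEventSketch {

    public static Map<String, Object> translate(String csvLine) {
        String[] fields = csvLine.split(",");
        Map<String, Object> event = new HashMap<>();
        event.put("timestamp", Long.parseLong(fields[0])); // time in ms
        event.put("sensorId", fields[1]);                  // unique sensor ID

        // Remaining fields become {key, value, type} properties; for
        // simplicity every value is marked with variable type STRING.
        Map<String, String[]> properties = new HashMap<>();
        for (int i = 2; i < fields.length; i++) {
            String[] kv = fields[i].split("=");
            properties.put(kv[0], new String[]{kv[1], "STRING"});
        }
        event.put("eventProperties", properties);
        return event;
    }

    public static void main(String[] args) {
        Map<String, Object> event = translate("1420070400000,sensor1,temperature=21.5");
        System.out.println(event.get("sensorId")); // prints "sensor1"
    }
}
```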
Figure 3‐7 below depicts the generic Web form that is part of the default human adapter. The generic Web form has been used to test and verify the functionality of the human adapter. Specific sensors can be pre‐defined with different sensor properties. The human operator can add additional properties, enter the value and specify the variable type according to the simple event structure. While the generic Web form can be used, we recommend that specific Web forms with better error handling (e.g., checking that the value is within a certain range) are developed for each required human adapter.
15 Each event property in a SimpleEvent can be viewed as a tuple with three values {key, value, type}. A map contains pairs {key, ComplexValue}, where the ComplexValue is itself represented as a tuple with two values {value, type}. This gives us: {key, ComplexValue} => {key, {value, type}}, or {key, value, type} for simplicity.
Figure 3‐7: Human adapter – simple generic web form
Table 7 below shows relevant configurable adapter properties common to all adapters. Each adapter type will contain additional adapter properties that are relevant to it.
Table 7: Generic adapters – adapter properties
Property (value: default or example) – Description
kafka.bootstrap.servers (192.168.1.111:9092) – Kafka connection setting for Kafka producers.
proasense.adapter.base.topic (eu.proasense.internal.sensing.simple.baseadapter) – The default topic to be used when publishing the simple events. This can be overridden in the specific adapter code.
proasense.adapter.base.sensorid (baseadapter) – The default sensor ID to be used for the simple events. This can be overridden in the specific adapter code.
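A minimal adapter.properties sketch for a hypothetical specific adapter, using the properties above (the topic suffix and sensor ID are made-up examples):

```properties
# Kafka connection for the adapter's output port
kafka.bootstrap.servers=192.168.1.111:9092

# Topic and sensor ID for the published simple events (both can be
# overridden in the specific adapter code)
proasense.adapter.base.topic=eu.proasense.internal.sensing.simple.myadapter
proasense.adapter.base.sensorid=myadapter
```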
3.2.2 SPECIFIC ADAPTERS
3.2.2.1 MHWIRTH ADAPTERS
Figure 3‐8 below gives an overview of the sensory data acquisition in the MHWirth use case. Each facility offshore has a local installation of the RigLogger system that can potentially record data from MHWirth drilling control and monitoring systems installed on the facility. MHWirth provides the following systems16: 1) local equipment controls/PLCs, 2) MH control systems, 3) MH drilling systems, 4) necessary computer hardware and 5) drilling control cabins. The data recorded at the facility is sent to a central RigLogger system onshore.
The central RigLogger will be the integration point towards the sensing layer of the ProaSense system. This system can be integrated by developing specific adapters based on either the generic base or OPC adapter. In addition, we have relevant contextual data from maintenance and inspection reports that can be acquired through the development of specific human adapters.
Figure 3‐8: Sensory data acquisition in the MHWirth use case
Three specific adapters were developed by SINTEF to support the MHWirth use case:
RigLogger: This adapter reads batches of sensor data using the RigLogger Web service. The read data is queued and replayed according to the timestamps, thus simulating a real‐time stream of events (but delayed by a configurable number of minutes). You can specify the signals (points) to be read.
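The replay logic can be sketched as follows (the class and method names are illustrative): given the recorded timestamps, the adapter computes the gap between consecutive events and waits for that gap before publishing the next event, reproducing the original stream rate:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of timestamp-based replay: compute the inter-event
// delays from recorded timestamps so each event can be published after
// sleeping for the same gap that separated it from its predecessor.
public class ReplaySketch {

    // Returns the delay (in ms) to wait before publishing each event
    // after the first one.
    public static List<Long> interEventDelays(List<Long> timestamps) {
        List<Long> delays = new ArrayList<>();
        for (int i = 1; i < timestamps.size(); i++) {
            delays.add(timestamps.get(i) - timestamps.get(i - 1));
        }
        return delays;
    }

    public static void main(String[] args) {
        // Events recorded at 0 ms, 1000 ms and 1500 ms
        System.out.println(interEventDelays(Arrays.asList(0L, 1000L, 1500L))); // prints [1000, 500]
    }
}
```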
16 https://mhwirth.com/news‐media/brochures/our‐products‐services‐2015/
Inspection report: Inspection reports contain relevant information such as the inspection period, inspection findings and recommended actions (e.g., increase frequency of oil change, perform visual inspection), oil sample data (e.g., parts per million of various metals, viscosity), etc., that is important for condition‐based maintenance (CBM) use cases. Relevant data from an inspection report can be published to the ProaSense system using a Web form (human adapter) as shown in Figure 3‐9.
Figure 3‐9: Inspection report form
Maintenance report: Maintenance reports contain relevant information such as maintenance time usage, cost and equipment condition after execution. Relevant data from a maintenance report can be published to the ProaSense system using a Web form (human adapter) as shown in Figure 3‐10.
Figure 3‐10: Maintenance report form
3.2.2.2 HELLA ADAPTERS
Figure 3‐11 below gives an overview of the sensory data acquisition in the HELLA use case. Similar to the MHWirth use case there exists a central datahub, the HYDRA Manufacturing Execution System (MES), for collecting and integrating sensor data from process control and PLC systems. The HYDRA Process Communication Controller 17 provides an extensive library of communication modules to integrate machine and process data. Currently a test installation of the HYDRA MES system can integrate data from different machines and process systems at HELLA (e.g., injection moulding machines, scrap inspection station, dryer system, lacquering system and several robots).
In the HELLA use case we require adapters that acquire data from the injection moulding machines and the scrap inspection station. This is done by querying the underlying Oracle database that is used by the HYDRA MES system. Additionally, production data from the SAP ERP system and events from the Montrac transportation line are required. This is done via a file adapter that processes the exported daily SAP reports and the Montrac event files from a file server. Finally, adapters for retrieving data from web sensors and human input (i.e., material certificates, raw material change at moulding machines, and production and workshift plans) are also needed.
Figure 3‐11: Sensory data acquisition in the HELLA use case
Eight specific adapters were developed by SINTEF to support the HELLA use case:
Montrac: This adapter processes event files produced by the Montrac system.
Material movement: This adapter processes daily SAP reports that contain information about material movements.
17 http://www.mpdv‐usa.com/PDF‐Files/Data‐Collection‐Brochure_English.pdf
Product plan: This adapter processes daily SAP reports that contain information about the production plan.
IMM: This adapter collects moulding parameters for the five injection moulding machines (IMMs) from the HYDRA MES Oracle database.
Scrap: This adapter collects scrap data from the HYDRA MES Oracle database.
Raw material certificate: This adapter provides a Web form (see Figure 3‐12) that allows a human operator to enter relevant material certificate data and publish it to the ProaSense system.
Figure 3‐12: Material certificate form
Raw material change: This adapter provides a Web form (see Figure 3‐13) that allows a human operator to indicate which material type is used for each of the five IMM machines.
Figure 3‐13: Material change form
Production and shift plan: This adapter provides a Web form (see Figure 3‐14) that allows a human operator to define the production plan (i.e., which products to mould) for each of the five IMM machines according to the shift plan (i.e., three 8‐hour shifts per day and night, plus the following shift the next day).
Figure 3‐14: Production plan form
In addition, JSI has assisted HELLA in setting up an environmental array that measures temperature, humidity and dust particles at different locations in the HELLA manufacturing line, and developed a corresponding adapter to collect and publish these measurements to the ProaSense system.
4. Implementation of the 2nd Prototype
We deliver a 2nd prototype of the Sensing Architecture that consists of two main components, the ProaSense Storage and the ProaSense Adapter Library. In this section we briefly describe the software project organization and outline external libraries used as part of our prototype.
4.1 SOURCE CODE
The implementation of the 2nd prototype of the Sensing Architecture has been released as open source at GitHub under the Apache License.
4.1.1 PROASENSE STORAGE
The sources for the ProaSense Storage component are available at the following GitHub repository:
https://github.com/SINTEF‐9012/proasense‐storage
The source code for this component is organized as a multi‐module Maven project as shown in Figure 4‐1. The parent project contains four modules. The source code of the Storage Writer Service, the Storage Reader Service and the Storage Registry Service is organised in three separate modules. These modules contain implementation classes corresponding to the implementation design described in section 3 above. Common functionality used by all three services has been refactored out into a separate module, storage‐base.
Figure 4‐1: Maven project structure for the ProaSense Storage
4.1.2 PROASENSE ADAPTER LIBRARY
The sources for the ProaSense Adapter Library component have been split into several GitHub repositories:
https://github.com/SINTEF‐9012/proasense‐adapter‐base
https://github.com/SINTEF‐9012/proasense‐adapter‐file
https://github.com/SINTEF‐9012/proasense‐adapter‐oracle
https://github.com/SINTEF‐9012/proasense‐adapter‐webservice
https://github.com/SINTEF‐9012/proasense‐adapter‐opc
https://github.com/SINTEF‐9012/proasense‐adapter‐human
https://github.com/SINTEF‐9012/proasense‐adapter‐twitter
https://github.com/SINTEF‐9012/proasense‐virtual‐sensor
The source code for an adapter type is typically organized as a multi‐module project, with the general adapter as one module and the specific adapters that extend the general adapter as separate modules. Figure 4‐2 below shows an example of the Maven project structure for the file adapter of the ProaSense Adapter Library. The parent project contains two modules. The file‐base module contains implementation classes for the generic file adapter, whereas the file‐montrac module contains implementation classes for a specific adapter used in the HELLA use case.
We use JitPack18 to manage dependencies between the different GitHub repositories of the ProaSense Adapter Library. JitPack builds GitHub projects on demand and publishes ready‐to‐use packages. It is used by the general adapter repositories to specify a dependency to the base adapter repository.
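Following JitPack's documented usage, a module's pom.xml declares the JitPack repository and then depends on a GitHub repository by its com.github coordinates. A sketch, where the version tag is an example:

```xml
<!-- Resolve GitHub repositories as Maven artifacts via JitPack -->
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependencies>
    <!-- Dependency on the base adapter repository (example version tag) -->
    <dependency>
        <groupId>com.github.SINTEF-9012</groupId>
        <artifactId>proasense-adapter-base</artifactId>
        <version>master-SNAPSHOT</version>
    </dependency>
</dependencies>
```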
Figure 4‐2: Maven project structure for a file adapter of the ProaSense Adapter Library
18 http://jitpack.io/
4.2 APIS AND LIBRARIES
We briefly describe external APIs and libraries that we used to build the prototype of the Sensing Architecture. We only describe the most important libraries that are being used. A detailed survey of all external libraries used as part of the Sensing Architecture will be completed within the analysis of the intellectual property rights (IPR) as part of WP8.
4.2.1 PROASENSE STORAGE
Table 8 below summarises the main 3rd party libraries used for the development of the ProaSense Storage component of the Sensing Architecture.
Table 8: 3rd party libraries used for the development of the ProaSense Storage
API/Library Version Description
Apache Kafka19 (0.8.2.1) – Message broker used as the underlying event‐driven architecture (EDA) for the real‐time processing of the ProaSense system.
Apache Thrift20 (0.9.2) – Used for defining the event messages and (de)serializing the messages for publishing on Kafka.
Jersey21 (1.19) – Used for implementing the REST services of the ProaSense Storage.
MongoDB22 (3.0.0) – NoSQL document store that is used as the underlying database.
4.2.2 PROASENSE ADAPTER LIBRARY
Table 9 below summarises the main 3rd party libraries used for the development of the ProaSense Adapter Library component of the Sensing Architecture.
Table 9: 3rd party libraries used for the development of the ProaSense Adapter Library
API/Library Version Description
Apache Kafka (0.8.2.1) – Message broker used as the underlying event‐driven architecture (EDA) for the real‐time processing of the ProaSense system.
Apache Thrift (0.9.2) – Used for defining the event messages and (de)serializing the messages for publishing on Kafka.
Bootstrap23 (3.3.4) – Web framework for developing the Web forms of the human sensors.
Jersey (1.19) – Used for implementing the REST services of the ProaSense Storage.
Twitter4J24 (3.0.3) – Used by the Twitter adapter to connect to and retrieve data from Twitter.
19 http://kafka.apache.org/ 20 http://thrift.apache.org/ 21 http://jersey.java.net/ 22 http://docs.mongodb.org/manual/release‐notes/3.0/ 23 http://getbootstrap.com/ 24 http://twitter4j.org/en/index.html
Utgard25 1.1.0 Utgard is a vendor‐independent, 100% pure Java OPC Client API. It is used by the OPC adapter to connect to and retrieve data from an OPC server.
4.3 PREPARATION OF INSTALLATION
Feedback from the first deployment at the use case premises clearly demonstrated a need for improving the installation procedures. To make the installation procedure simpler for the 2nd deployment, it was decided to use Docker26 to install all technical components.
Docker is a framework that allows us to package software components as images and to deploy these images as running instances called containers. We use the Docker framework to package ProaSense components as separate images that can be easily deployed on physical or virtual machines running at the use case premises.
Docker Hub provides a library of existing images that contain the 3rd party software required by the ProaSense components. For the ProaSense Storage services described in this deliverable, the following base Docker images have been selected:
MongoDB Docker image27 for the ProaSense Storage Writer Service
Apache Tomcat image28 for the ProaSense Storage Reader Service
Apache Fuseki image29 for the ProaSense Storage Registry Service
These base images will be used to create ProaSense‐specific images for the services listed above. In addition, the Java and Tomcat images can be used to create adapter‐specific images for each use case. The creation of these Docker images will be finalized as part of the installation and deployment guides to be developed in WP6.
25 http://openscada.org/projects/utgard/ 26 https://www.docker.com/ 27 https://hub.docker.com/r/tutum/mongodb/ 28 https://hub.docker.com/_/tomcat/ 29 https://hub.docker.com/r/stain/jena‐fuseki/
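As an illustration, a ProaSense‐specific image for the Storage Reader Service could be layered on the Tomcat base image roughly as follows. This is a hypothetical sketch: the base image tag, the WAR file name and the paths are our assumptions, as the actual Dockerfiles will be finalised in WP6.

```dockerfile
# Hypothetical Dockerfile sketch for a Storage Reader Service image.
# The base image tag and WAR file name are illustrative assumptions.
FROM tomcat:7-jre7
# Deploy the service WAR into Tomcat's webapps directory.
COPY target/storage-reader.war /usr/local/tomcat/webapps/storage-reader.war
EXPOSE 8080
CMD ["catalina.sh", "run"]
```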
5. Technical Validation
This section describes the testing and benchmarking that has been done to measure and tune the ProaSense Storage component.
5.1 TESTING
5.1.1 MONGO MANAGEMENT STUDIO
To test the storage capabilities of the ProaSense Storage component we have been using the free version of Mongo Management Studio30. This tool allows us to view the stored data and verify that all data pushed by the components of the ProaSense system are stored correctly in the ProaSense Storage. Figure 5‐1 shows the tool after performing a few integration tests during a virtual coding meeting in which the Event Replay Utility (documented in Deliverable D1.3) was used to replay data from the MHWirth use case.
Figure 5‐1: Mongo Management Studio
The raw sensor data from an integration test run are stored as separate collections named simple.[sensorId], and the events published by the ProaSense components are stored as separate collections named [eventType].[componentId] (if the event contains a component id) or [eventType].system (if the event does not contain a component id). The tree pane (left side of the
30 http://www.litixsoft.de/english/mms/
figure) shows that separate data collections for derived.CEP, derived.enricher, predicted.system and recommendation.system have been created for the integration test run "proasense_db_test_007" in addition to the raw sensor data. The view pane (right side of the figure) shows the data stored for the collection derived.enricher. In this particular integration test run, a total of 45 918 events published by the enricher were stored.
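The collection naming scheme above can be sketched as a small helper; the class and method names here are illustrative, not the actual Storage Writer code:

```java
// Sketch of the collection-naming scheme used by the ProaSense Storage.
// Class and method names are illustrative assumptions.
public class CollectionNames {

    // Raw sensor data: one collection per sensor id.
    public static String forSensor(String sensorId) {
        return "simple." + sensorId;
    }

    // System events: one collection per component id if present,
    // otherwise the shared "system" collection for that event type.
    public static String forEvent(String eventType, String componentId) {
        if (componentId == null || componentId.isEmpty()) {
            return eventType + ".system";
        }
        return eventType + "." + componentId;
    }

    public static void main(String[] args) {
        System.out.println(forSensor("1000693"));            // simple.1000693
        System.out.println(forEvent("derived", "enricher")); // derived.enricher
        System.out.println(forEvent("predicted", null));     // predicted.system
    }
}
```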
5.1.2 BENCHMARK TEST CODE
Figure 5‐2: Storage Writer Service – test classes
Figure 5‐2 above shows the test classes that have been developed for the Storage Writer Service. The test classes are used to test and benchmark the write performance of the service. Two different benchmark applications have been developed: a local benchmark application that performs load testing of the underlying MongoDB document store, and a Kafka benchmark application that performs load testing of the Storage Writer Service operating in an infrastructure using Kafka as the message broker. The test code for the Storage Writer Service consists of the following classes:
Table 10: Test classes for the Storage Writer Service
Class Description
RandomEventGenerator Base code for generating events that is used to test and benchmark the ProaSense Storage. It contains functions for generating Simple, Derived, Predicted, Anomaly, Recommendation and Feedback events. The functions take an id as a parameter and generate a corresponding event. The event contains random values generated using the Apache Commons Math library31.
RandomEventKafkaGenerator Java thread class that implements Kafka producers for generating random events. This is used for benchmarking storage performance on the Kafka broker.
RandomEventLocalGenerator Java thread class that implements local producers for generating random events that are put on a blocking queue. This is used for benchmarking local storage performance, i.e., bypassing the Kafka broker in order to measure the database storage performance.
StorageWriterMongoServiceKafkaBenchmark Main class for benchmarking the write performance of the ProaSense Storage component publishing to and consuming events from the Kafka infrastructure.
StorageWriterMongoServiceLocalBenchmark Main class for benchmarking the write performance of the ProaSense Storage component publishing to and consuming events from a local blocking queue.
31 http://commons.apache.org/proper/commons‐math/
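In the same spirit as RandomEventGenerator, a minimal random simple‐event generator can be sketched as follows. This sketch uses java.util.Random in place of the Apache Commons Math library, and the SimpleEvent shape is an illustrative stand‐in for the Thrift‐defined event classes, not the actual ProaSense code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Simplified sketch of a random simple-event generator. The SimpleEvent
// class is an illustrative stand-in for the Thrift-defined events.
public class RandomSimpleEvents {

    public static class SimpleEvent {
        public final String sensorId;
        public final long timestamp;
        public final Map<String, String> properties;

        public SimpleEvent(String sensorId, long timestamp, Map<String, String> properties) {
            this.sensorId = sensorId;
            this.timestamp = timestamp;
            this.properties = properties;
        }
    }

    private final Random random = new Random();

    // Generate one event for the given sensor id with a random value in [0, 100).
    public SimpleEvent generateSimpleEvent(String sensorId) {
        Map<String, String> properties = new HashMap<>();
        properties.put("value", Double.toString(random.nextDouble() * 100.0));
        return new SimpleEvent(sensorId, System.currentTimeMillis(), properties);
    }

    public static void main(String[] args) {
        RandomSimpleEvents generator = new RandomSimpleEvents();
        SimpleEvent e = generator.generateSimpleEvent("1000693");
        System.out.println(e.sensorId + " @ " + e.timestamp + " -> " + e.properties);
    }
}
```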
The main classes for the benchmarking read properties from a client.properties file:
Table 11: Client properties for the main test classes
Property Value (default or example) Description
zookeeper.connect 192.168.1.111:2181 Kafka connection setting for Kafka consumers.
kafka.bootstrap.servers 192.168.1.111:9092 Kafka connection setting for Kafka producers.
proasense.benchmark.kafka.[event_type].generators 100 Number of generators (or sensors to simulate) for the benchmarking of the ProaSense Storage on the Kafka infrastructure.
proasense.benchmark.kafka.[event_type].rate 20 Rate in ms between each generated event.
proasense.benchmark.kafka.[event_type].messages 10000 Number of events to generate.
proasense.benchmark.kafka.[event_type].topic eu.proasense.internal.sensing.mhwirth.simple The topic to be used for publishing events.
proasense.benchmark.kafka.[event_type].filter false Boolean flag for topic filters. If set to true then the topic will be considered as a prefix for topics and each generator will publish events on topic.[thread_number].
proasense.benchmark.local.[event_type].generators 100 Number of generators (or sensors to simulate) for the benchmarking of the ProaSense Storage on the local infrastructure (bypassing Kafka).
proasense.benchmark.local.[event_type].rate 20 Rate in ms between each generated event.
proasense.benchmark.local.[event_type].messages 10000 Number of events to generate.
proasense.benchmark.common.logfile false Boolean flag for benchmark logfiles. If set to true then files named EventWriterMongo[Sync|Async]_benchmark_[thread number].txt will be created.
proasense.benchmark.common.logsize 10000 The average events/second is calculated for every logsize written events.
proasense.benchmark.load.testing false Boolean flag to indicate load testing. If set to true then only simple events (simulating sensors only) are generated.
proasense.benchmark.load.sensors 1000 Number of sensors to simulate.
proasense.benchmark.load.rate 20 The sampling rate in ms for the simulated sensors.
proasense.benchmark.load.messages 10000 The number of messages to generate for each sensor.
proasense.storage.mongodb.url mongodb://127.0.0.1:27017 The URL of the MongoDB database. The default is to run the writer service on the same machine as the database.
proasense.storage.mongodb.writers 1 Number of event writer threads to run for this service instance.
proasense.storage.mongodb.bulksize 1000 The number of events to queue before writing to storage. Write performance can be significantly improved by doing bulk writes.
proasense.storage.mongodb.maxwait 1000 The number of ms to wait before doing the bulk write. This ensures that the queue is written to storage in a timely manner when less data than the queue size has been received, e.g., when a sensor stops sending events.
proasense.storage.mongodb.syncdriver true Boolean value that indicates whether to use the MongoDB synchronous or asynchronous driver.
Note that the last five proasense.storage.mongodb.* properties are only valid for running a local benchmark, in which the benchmark application generates the random events and stores them in the local MongoDB. When running a Kafka benchmark using a message broker, the benchmark application only generates and publishes the random events to Kafka. An instance of the Storage Writer Service, which reads configuration values from its server.properties file, has to be running for the random events to be written to the storage.
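The interplay between the bulksize and maxwait properties can be sketched as a drain loop: events are batched until either bulksize events have accumulated or maxwait ms have passed, and the batch is then flushed in one bulk operation. This is a simplified illustration with a callback in place of the actual MongoDB bulk write; the class and method names are ours, not the service's:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of the bulksize/maxwait batching behaviour of the Storage Writer
// Service. The MongoDB bulk write is replaced by a callback, so this
// illustrates the batching logic only.
public class BulkBatcher {

    public interface BulkWriter {
        void write(List<String> batch);
    }

    // Drain totalEvents events from the queue, flushing a batch whenever
    // bulkSize events have accumulated or maxWaitMs has elapsed with a
    // non-empty partial batch.
    public static void drain(BlockingQueue<String> queue, int bulkSize,
                             long maxWaitMs, int totalEvents, BulkWriter writer) {
        List<String> batch = new ArrayList<>(bulkSize);
        int written = 0;
        while (written < totalEvents) {
            String event;
            try {
                // A null return means the maxwait timeout fired.
                event = queue.poll(maxWaitMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (event != null) {
                batch.add(event);
            }
            if (batch.size() >= bulkSize || (event == null && !batch.isEmpty())) {
                writer.write(new ArrayList<>(batch));
                written += batch.size();
                batch.clear();
            }
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 25; i++) queue.offer("event-" + i);
        // With bulksize = 10 and maxwait = 100 ms this flushes batches of
        // 10, 10 and finally 5 when the timeout fires on the partial batch.
        drain(queue, 10, 100, 25, batch ->
                System.out.println("bulk write of " + batch.size() + " events"));
    }
}
```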
5.1.3 TESTING RESTFUL SERVICES
In order to test the RESTful services of the ProaSense Storage, i.e., the Storage Reader Service and Storage Registry Service, we use the functional testing tool SoapUI32. The SoapUI tests are also available in the GitHub repository. Figure 5‐3 and Figure 5‐4 show examples of running default and average queries on a dataset of simple events, which was stored in the storage using the random event generator of the benchmark test code.
Figure 5‐3: Storage Reader Services – Example of default query for simple events
32 http://www.soapui.org/
Figure 5‐4: Storage Reader Service – Example of average query for simple events
5.1.4 TESTING ADAPTERS
The actual testing of adapters was done at the use case sites during deployment with real data from the local systems. However, since these systems are not available outside the use case sites, we had to use a different approach when testing the adapters during the development phase. We used different testing approaches for the different types of adapters:
For the MHWirth Riglogger adapter we developed a specific test that simulated the Riglogger Web service and generated random sensor measurements. This allowed the adapter to be tested against a simulated service, which was later replaced with the actual service during deployment.
For the HELLA adapters that process files and reports (i.e., Montrac event files and SAP reports), we obtained sample data from HELLA in order to test the adapters.
For the HELLA adapters that collect data from the HYDRA MES Oracle database, we obtained a description of the database schema as well as some exported sample data so that we could set up a local database to test the adapters.
The human adapters (i.e., the Web forms) were tested manually by the developers.
In testing all of these adapters we used Mongo Management Studio to verify that all events submitted by the adapters were correctly stored in the ProaSense Storage.
5.2 BENCHMARKING
Extensive benchmarking was done to test and tune the write performance of the storage component during the 1st release. Here we provide a summary of the main results. Our benchmark target is to support at least 1 000 sensors at 20 ms, which equals a rate of 50 000 events/second. In order to test that we are achieving this target we simulate 2 000 sensors at 20 ms, which equals a rate of 100 000 events/second. In each load test, we generate 20 million events that are stored in the ProaSense Storage. The storage of these events is verified using the Mongo Management Studio tool described above.
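The target rates quoted above follow from a simple calculation: n sensors sampled every p ms produce n * (1000 / p) events per second. A minimal sketch of this check (class and method names are illustrative):

```java
// Sanity check of the benchmark targets: n sensors sampled every p ms
// produce n * (1000 / p) events per second.
public class EventRates {

    public static int eventsPerSecond(int sensors, int samplingRateMs) {
        return sensors * (1000 / samplingRateMs);
    }

    public static void main(String[] args) {
        System.out.println(eventsPerSecond(1000, 20)); // benchmark target: 50000
        System.out.println(eventsPerSecond(2000, 20)); // load test rate: 100000
    }
}
```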
For the 2nd release we also did some benchmarking to measure the query response time of the storage component. The results of the query response time benchmarks are presented in subsection 5.2.3.
5.2.1 WRITE PERFORMANCE
Local benchmarks were performed on both a laptop computer and a desktop computer as described in the subsections below.
5.2.1.1 LAPTOP COMPUTER
Local benchmarks were executed on a laptop computer used by the main developer of the ProaSense Storage. The specification of the laptop computer is as follows:
CPU: Intel Core i7‐4600 @ 2.10 ‐ 2.70 GHz (dual core)
RAM: 16 GB
HDD: 256 GB SSD (approximately 25 GB free storage between each benchmark run)
Software: MongoDB 3.0 running on Windows 8.1 Enterprise
Several benchmarks, each generating a total of 20 million events, were performed using different bulk size settings for bulk write. This setting significantly affects the write performance of the Storage Writer Service. The reason for including this configurable property is to be able to tune the write performance according to the different types of sensors used in the use cases. Different Storage Writer Service threads can be configured: a high bulk size value should be set for sensors with a high sampling rate, and a low bulk size value for sensors with a low sampling rate.
Figure 5‐5 shows a scatter graph resulting from one of the benchmarks where the bulk write size was set to 1000 and the max wait time for the write queue was set to 1000 ms. The scatter graph shows the average events/second for every 10 000 events written to the storage. As can be seen, the write performance is not uniform; it varies slightly over the duration of the load test.
Figure 5‐5: Laptop computer – Synchronous driver, bulksize = 1000, maxwait = 1000 ms
5.2.1.2 DESKTOP COMPUTER
The laptop computer used in the benchmarks described above has top‐of‐the‐range ultraportable hardware, but it is not necessarily representative of the server hardware recommended for a storage component. To indicate the effect of vertical scaling we ran the local benchmark with the same settings as the test of Figure 5‐5 (synchronous driver, bulksize = 1000, maxwait = 1000 ms) on a more powerful desktop computer with the following specifications:
CPU: Intel Core i7‐3770 @ 3.50 GHz (quad core)
RAM: 16 GB
HDD: 180 GB SSD (approximately 62 GB free storage)
Software: MongoDB 3.0 running on Windows 8.1 Pro
Figure 5‐6 below shows a significant increase in performance, i.e., 70 175 events/second compared to 55 248 events/second for the laptop computer, which is approximately a 27% increase in write performance. According to the CPU benchmark software PassMark33 the desktop CPU is more than 50% faster than the laptop CPU.
33 http://www.cpubenchmark.net/cpu_list.php
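The quoted 27% figure follows directly from the two measured averages; a minimal sketch of the calculation (class and method names are illustrative):

```java
// Relative increase in write performance, in percent, between two
// measured averages.
public class WriteSpeedup {

    public static double percentIncrease(double faster, double slower) {
        return (faster - slower) / slower * 100.0;
    }

    public static void main(String[] args) {
        // Desktop (70 175 events/s) vs laptop (55 248 events/s): ~27%.
        System.out.printf("%.1f%%%n", percentIncrease(70175, 55248));
    }
}
```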
[Chart for Figure 5‐5: Laptop computer – Synchronous driver, bulksize = 1000, maxwait = 1000 ms (average events/second = 55 248)]
Figure 5‐6: Desktop computer – Synchronous driver, bulksize = 1000, maxwait = 1000 ms
5.2.2 KAFKA BENCHMARK
For the Kafka benchmark tests, we deployed the Storage Writer Service in a cloud infrastructure with one node (virtual machine) running the Kafka message broker and the ProaSense Storage deployed on a separate node (virtual machine).
Below we show the result of running the Storage Writer Service with a single writer thread using the synchronous driver, with the bulk size and max wait settings that achieved the best performance in the local benchmark tests (see Figure 5‐5 and Figure 5‐6).
We ran multiple benchmarks with different numbers of concurrent event listeners configured (i.e., 1, 2, 4 and 8). We were expecting performance figures closer to the local benchmark, i.e., above 50 000 events/second. The number of concurrent event listeners did not seem to matter, and we were only able to achieve a throughput of around 43 000 events/second for these tests. When running the Kafka tests we had problems configuring a proper value for the client session timeouts. Initially, we had it set to 4 seconds, but experienced several disconnect/connect issues which lowered the throughput to around 30 000 events/second. We finally increased the client session timeout value to 60 seconds in order to run the tests without any disconnect/connect issues. Thus, we assume that some of the variation in the scatter graph below can be explained by instability in the client connections.
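The session timeout change can be expressed in the consumer configuration. The fragment below is an illustrative sketch based on the Kafka 0.8 high‐level consumer property names, not a copy of the project's actual configuration files:

```properties
# Illustrative Kafka 0.8 high-level consumer settings (assumed file layout).
zookeeper.connect=192.168.1.111:2181
# The initial 4-second timeout caused repeated disconnect/connect cycles;
# raising it to 60 seconds let the benchmarks run without interruptions.
zookeeper.session.timeout.ms=60000
```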
[Chart for Figure 5‐6: Desktop computer – Synchronous driver, bulksize = 1000, maxwait = 1000 ms (average events/second = 70 175)]
Figure 5‐7: Cloud infrastructure w/Kafka – Synchronous driver, bulksize = 1000, maxwait = 1000 ms
5.2.3 QUERY RESPONSE TIME
In order to test the query response time of the ProaSense Storage we used the load testing benchmark described in subsection 5.1.2 to generate sample databases with 1 000, 10 000, 100 000 and 1 000 000 sensor events. Then we used SoapUI as described in subsection 5.1.3 to measure the response time for the average, minimum and maximum queries.
For the default query that returns a list of sensor events, we used the network panel of the developer tools in Google Chrome34 to measure the page load time. For these measurements, we also give the transfer size of the list.
Table 12 below summarises the results of this benchmarking. The operations were executed three times. The average result of these measurements is plotted in Figure 5‐8 below. The results indicate that the queries to find the average, minimum and maximum values are reasonably responsive, while the time to retrieve a list of events from the ProaSense Storage depends on the number of events. Retrieving up to 100 000 events is reasonably responsive, but larger quantities may require a different access mechanism. It is of course possible to access the database directly, bypassing the REST API of the Storage Reader Service, in order to retrieve data in a different manner. Another possibility is to publish the resulting events of the query as an event stream similar to the Event Replay Utility developed in WP6.
34 https://developers.google.com/web/tools/chrome‐devtools/profile/network‐performance/resource‐loading
[Chart for Figure 5‐7: Cloud infrastructure w/Kafka – Synchronous driver, bulksize = 1000, maxwait = 1000 ms (average events/second = 43 010)]
Table 12: Query response times
Operation: Get list of events (response times #1, #2, #3; average; transfer size of list)
1 000 events (140 KB): 37 ms, 44 ms, 35 ms; avg. 39 ms
10 000 events (1,4 MB): 260 ms, 226 ms, 234 ms; avg. 240 ms
100 000 events (13,7 MB): 25 450 ms (25,45 s), 32 010 ms (32,01 s), 24 850 ms (24,85 s); avg. 27 437 ms (27,44 s)
1 000 000 events (137 MB): ~192 000 ms (3,2 min) in all three runs; avg. ~192 000 ms (3,2 min)
Operation: Get average value
1 000 events: 29 ms, 22 ms, 31 ms; avg. 27 ms
10 000 events: 97 ms, 83 ms, 86 ms; avg. 89 ms
100 000 events: 685 ms, 613 ms, 669 ms; avg. 656 ms
1 000 000 events: 6 291 ms, 5 861 ms, 5 983 ms; avg. 6 045 ms
Operation: Get minimum value
1 000 events: 27 ms, 31 ms, 32 ms; avg. 30 ms
10 000 events: 91 ms, 77 ms, 76 ms; avg. 81 ms
100 000 events: 626 ms, 599 ms, 639 ms; avg. 621 ms
1 000 000 events: 5 813 ms, 5 646 ms, 5 933 ms; avg. 5 797 ms
Operation: Get maximum value
1 000 events: 29 ms, 23 ms, 29 ms; avg. 27 ms
10 000 events: 75 ms, 96 ms, 93 ms; avg. 88 ms
100 000 events: 755 ms, 704 ms, 627 ms; avg. 695 ms
1 000 000 events: 5 143 ms, 5 242 ms, 5 527 ms; avg. 5 304 ms
Figure 5‐8: Query response times (average results)
6. Conclusions
In this deliverable we presented the prototype developed as a result of Task T2.2. Our prototype implements the 2nd release of the Sensing Architecture, which covers the functionality of the Sensing Layer and the Storage Layer of the ProaSense implemented architecture. The prototype consists of two main components:
The ProaSense Storage component provides the storage, query and registry functionality of the Storage Layer. For the 2nd release, bug fixes were applied to improve the storage functionality, while the query and registry functionality were enhanced.
o The Storage Writer Service is responsible for storing the sensor events (external data collected by adapters) and the ProaSense system events (internal data published by components of the ProaSense system).
o The Storage Reader Service provides a RESTful API for querying the event data (i.e., sensor events and ProaSense system events) that are stored.
o The Storage Registry Service provides a RESTful API for querying the contextual information (i.e., sensor metadata, product information, etc.) that is modelled for a particular use case (i.e., MHWirth or HELLA).
The ProaSense Adapter Library component supports the Sensing Layer. It provides a Java library for developing specific adapters to acquire data from hardware sensors, software sensors and human sensors. For the 2nd release, further testing and bug fixes were applied to improve the adapter functionality, including finalizing a new adapter for the HELLA use case. A total of 11 specific adapters have been developed to support the ProaSense use cases:
o Three specific adapters were developed to support the MHWirth use case.
o Eight specific adapters were developed to support the HELLA use case.
It is expected that some minor revisions, i.e., mainly bug fixes and possibly support for a few additional queries, may need to be developed as part of the WP6 integration and system testing that is planned for the next few months. During this period the installation procedures and guidelines will also be finalised as part of the work in WP6.