Security and Privacy-Aware IoTCrawler Framework

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 779852


Title: D2.2 Security and Privacy-Aware IoTCrawler Framework
Document Version: 1.2
Project Number: 779852
Project Acronym: IoTCrawler
Project Title: IoTCrawler
Contractual Delivery Date: 31/01/2019
Actual Delivery Date: 31/01/2019
Deliverable Type* - Security**: R - PU
Responsible and Editor/Author: Antonio F. Skarmeta
Organization: University of Murcia
Contributing WP: WP2

* Type: P - Prototype, R - Report, D - Demonstrator, O - Other
** Security Class: PU - Public, PP - Restricted to other programme participants (including the Commission), RE - Restricted to a group defined by the consortium (including the Commission), CO - Confidential, only for members of the consortium (including the Commission)


Authors (organizations):

Antonio F. Skarmeta (UMU)

José Santa (UMU)

Juan A. Martínez (OdinS)

Martin Strohbach (AGT)

Hien Truong (NEC)

Josiane Xavier Parreira (SIEMENS)

Abstract:

The Internet of Things (IoT) offers an incredible innovation potential for developing smarter

applications and services. However, today we see solutions in the development of vertical

applications and services reflecting what used to be the early days of the Web, leading to

fragmentation and intra-nets of Things. To achieve an open IoT ecosystem of systems and platforms,

several key enablers are needed for effective, adaptive and scalable mechanisms for exploring and

discovering IoT resources and their data/capabilities. This deliverable discusses the framework

created in the EU H2020 IoTCrawler project and, particularly, the architecture to be considered for

the integration of external IoT platforms with the aim of covering the objectives of the project. Its

focus is on the integration and interoperability across different platforms, through dynamic and

reconfigurable solutions for discovery and integration of data and services from legacy and new

systems. This is complemented with adaptive, privacy-aware and secure solutions for crawling,

indexing and searching in distributed IoT systems. It is important to consider that IoTCrawler targets

IoT development and demonstrations with a focus on Industry 4.0, Social IoT, Smart City and Smart

Energy use cases.


Keywords:

Architecture, Framework, Security, IoT, Federation, Brokering, IoTCrawler.

Disclaimer: The present report reflects only the authors’ view. The European Commission is not

responsible for any use that may be made of the information it contains.


Revision History

The following table describes the main changes made to the document since its creation.

Revision | Date | Description | Author (Organization)
V0.1 | 23/10/2018 | Initial Proposal including TOC and text about the overall architecture | Antonio F. Skarmeta (UMU), José Santa (UMU), Juan A. Martínez (OdinS)
V0.2 | 26/10/2018 | Contribution to Section 4 and 5 | Juan A. Martínez (OdinS)
V0.3 | 23/11/2018 | Overall architecture update | Antonio F. Skarmeta (UMU), José Santa (UMU), Juan A. Martínez (OdinS)
V0.4 | 12/12/2018 | Section 5. Blockchain-enabled | Hien Truong (NEC)
V0.5 | 21/12/2018 | Document update with agreed architecture | Antonio F. Skarmeta (UMU), José Santa (UMU), Juan A. Martínez (OdinS)
V0.6 | 02/01/2019 | Review and generation of a complete document | Antonio F. Skarmeta (UMU), José Santa (UMU), Juan A. Martínez (OdinS)
V0.7 | 03/01/2019 | Inclusion of DID content | Antonio F. Skarmeta (UMU), José Santa (UMU)
V0.8 | 11/01/2019 | Update of sections 3 and 4 | Martin Strohbach (AGT)
V0.9 | 11/01/2019 | Update of sections 1-4 | Josiane Parreira (SIE)
V0.91 | 13/01/2019 | Integration of updates in section 5 | Hien Truong
V1.0 | 14/01/2019 | Editing and creation of a complete version | Antonio F. Skarmeta (UMU), José Santa (UMU)
V1.1 | 15/01/2019 | Updated Figure, minor edits and comments | Martin Strohbach (AGT)
V1.2 | 30/01/2019 | Updates after UoS and DW review | Antonio F. Skarmeta (UMU), José Santa (UMU)


Abbreviations

ABAC Attribute-Based Access Control

API Application Programming Interface

CP-ABE Ciphertext-Policy Attribute-Based Encryption

DCapBAC Distributed Capability-Based Access Control

DHT Distributed Hash Table

DID Decentralized Identifier

DLT Distributed Ledger Technology

DSP Data Source Proxy

ECC Elliptic-Curve Cryptography

ECDSA Elliptic Curve Digital Signature Algorithm

GPS Global Positioning System

IoT Internet of Things

JSON JavaScript Object Notation

MDR Metadata Repository

NGSI Next Generation Service Interface

NGSI-LD NGSI Linked Data

PKI Public Key Infrastructure

QoI Quality of Information

QoS Quality of Service

RBAC Role-Based Access Control

REST Representational State Transfer

SPKI Simple Public Key Infrastructure

URI Uniform Resource Identifier

VNF Virtual Network Function

XACML Extensible Access Control Markup Language

XML Extensible Markup Language

ZBAC Authorization-Based Access Control


Executive Summary

This deliverable discusses the framework created in the EU H2020 IoTCrawler project

to address the challenge of connecting vertical IoT deployments and, particularly, the

architecture to be considered for the integration of these external IoT platforms with

the aim of covering the objectives of the project. Its focus is on the integration and

interoperability across different platforms, through dynamic and reconfigurable

solutions for discovery and integration of data and services from legacy and new

systems. This is complemented with adaptive, privacy-aware and secure solutions for

crawling, indexing and searching in distributed IoT systems.


Disclaimer

This project has received funding from the European Union’s Horizon 2020 research and

innovation programme under grant agreement No 779852, but this document only reflects the

consortium’s view. The European Commission is not responsible for any use that may be made

of the information it contains.


Table of Contents

1. INTRODUCTION
2. THE OVERALL IOTCRAWLER FRAMEWORK
2.1. IoT Framework of Interoperable (Distributed) Systems
2.2. Holistic Security, Privacy and Trust
2.3. Crawling, Discovery and Indexing of Dynamic IoT Resources
2.4. Machine-Initiated Semantic Search
3. IOT RESOURCE INTEGRATION ARCHITECTURE
3.1. IoT Domain Interconnection Needs
3.2. Distributed IoT Framework
3.3. Layered Interconnection Architecture
3.3.1. Micro layer
3.3.2. Domain layer
3.3.3. Inter-domain layer
3.3.4. Internal processing layer
3.3.5. Application layer
3.4. Interconnection of Constrained Domains
3.5. Dynamical Coordination of Data Processing
4. BROKER FEDERATION SOLUTION
5. INTER-DOMAIN POLICY MANAGEMENT
5.1. Design overview
5.2. Distributed Ledger Technology (DLT)
5.3. Smart Contracts
5.4. Private Data sharing Enabler
5.5. Decentralized Identifiers
6. IOT SECURITY: FROM SENSORS TO THE SEARCHING ENGINE
6.1. Context-Aware Privacy Policies
6.2. Capability-Based Access Control
6.3. Privacy Enabler
7. CONCLUSIONS
8. REFERENCES


Table of Figures

Figure 1: Key concepts considered in IoTCrawler conception
Figure 2: Overall architecture of the IoTCrawler framework
Figure 3: Interconnection of IoT domains
Figure 4: Distributed IoT Framework of IoTCrawler
Figure 5: IoT resource integration architecture
Figure 6: Interconnection of constrained IoT domains
Figure 7: Coordination of processing resources in the IoTCrawler architecture
Figure 8: FogFlow architecture depiction
Figure 9: Usage of NGSI 9/10 in the scope of the IoT Broker
Figure 10: Architecture mapping of a logical view described above to an implementation architecture based on IoTBroker
Figure 11: Inter-domain policy management
Figure 12: Distributed smart contract
Figure 13: Smart contract application. Blockchain handler
Figure 14: Blockchain-enabled private data sharing enabler
Figure 15: DCapBAC approach for authorization application
Figure 16: Privacy requirement for IoTCrawler framework


1. INTRODUCTION

Efficient and secure access to Big IoT Data will be a pivotal factor for the prosperity of European

industry and society. However, today's data and service discovery, search, access methods and

solutions for the IoT are in their infancy, like Web search in its early days. IoT search is different

from Web search because of dynamicity and pervasiveness of the resources in the network.

Current methods are suited for static resource repositories. Moreover, previous reports show

that a large part of the developers’ time is spent on integration. There is yet no adaptable and

dynamic solution for effective integration of distributed and heterogeneous IoT data and

support of data reuse in compliance with security and privacy needs, thereby enabling a true

digital single market. In general, the following issues limit the adoption of dynamic IoT-based

applications:

• The heterogeneity of various data sources hinders the uptake of innovative cross-domain

applications.

• Data records lack embedded meta-information, which hinders their use across platforms.

• Missing security and neglected privacy are major concerns in most domains and a

challenge for constrained IoT resources.

• The large-scale, distributed and dynamic nature of IoT resources requires new methods

for crawling, discovery, indexing, physical location identification and ranking.

• IoT applications require new search engines, such as bots that automatically initiate

search based on user’s context. This requires machine intelligence.

• The complexity involved in discovery, search, and access methods makes the

development of new IoT enabled applications a complex task.

• Integrating data from diverse sources requires new reputation mechanisms that can offer trust guarantees to services and users.

Some ongoing efforts, such as Shodan (https://www.shodan.io/) and Thingful (https://www.thingful.net/), provide IoT search solutions, but they rely mainly on centralised indexing and manually provided metadata. Moreover, they are rather static and neglect privacy and security issues. To enable the use of IoT data and to exploit the business potential of IoT applications, an effective approach needs to provide:

• An adaptive distributed framework enabling abstraction from heterogeneous data

sources and dynamic integration of volatile IoT resources.

• Security, privacy and trust by design as an integral part of all processes, from publication, indexing, discovery and subscription to higher-level application access.

• Scalable methods for crawling, discovery, indexing and ranking of IoT resources in large-

scale cross-platform and cross-disciplinary systems and scenarios.

• Machine-initiated semantic search to enable automated, context-dependent access to

IoT resources.

• Monitoring and analysing the Quality of Service (QoS) and Quality of Information (QoI)

to support fault recovery and service continuity in IoT environments.

IoTCrawler addresses the above challenges by proposing efficient and scalable methods for

crawling, discovery, indexing and ranking of IoT resources in large-scale cross-platform and

cross-disciplinary systems and scenarios. It develops enablers for secure and privacy-aware

discovery and access to the resources. It also monitors and analyses QoS and QoI to rank suitable

resources and to support fault recovery and service continuity. The project evaluates the

developed methods and tools in various use-cases, such as Smart City, Social IoT, Smart Energy

and Industry 4.0. The key elements of IoTCrawler are shown in Figure 1.

Figure 1: Key concepts considered in IoTCrawler conception

The project aims to create scalable and flexible IoT resource discovery by using metadata and resource descriptions in a dynamic data model. This means that a search may return results that are not optimal but that still fit the user’s expectations. For this, the system should understand the user’s priorities (which often arrive as machine-initiated queries and search requests) and provide results accordingly, using adaptive and dynamic techniques.

The rest of the document describes the overall framework of IoTCrawler and details its main

architectural elements. The abstract architecture is followed by concrete instantiations using

particular technologies and platforms.


2. THE OVERALL IOTCRAWLER FRAMEWORK

IoTCrawler provides novel approaches to support an IoT framework of interoperable systems

including security and privacy-aware mechanisms, and offers new methods for discovery,

crawling, indexing and search of dynamic IoT resources. It supports and enables machine-

initiated knowledge-based search in the IoT world. Figure 2 depicts the IoTCrawler framework

and highlights its key components, which are detailed next.

Figure 2: Overall architecture of the IoTCrawler framework

[Figure 2 labels: Security, Privacy & Trust; IoT Resources: sensors and actuators; Use cases (Smart city, Social IoT, Smart energy, Industry 4.0); API; Machine initiated semantic search; Search; Data analysis; IoT discovery; Context management; Monitoring & fault recovery; Multi-criteria ranking; Adaptive indexing; Dynamic crawling; Distributed IoT framework; Cloud broker; Edge broker (three instances).]

2.1. IoT Framework of Interoperable (Distributed) Systems

The diversity of the market has resulted in a variety of sophisticated IoT platforms that will continue to exist. However, to evolve and enable the full benefits of IoT, these platforms need access to data, information and services across various IoT networks and systems within an integrated ecosystem of IoT systems and platforms. IoTCrawler envisions a cooperation of platforms and systems to provide smart, integrated IoT-based services. Instead of defining an overarching hyper-platform on top, IoTCrawler aims at integrating the platforms and systems by proposing a common interface that enables cooperation and interconnection of various platforms. This is achieved by making their data and services discoverable and accessible by other applications and services. An IoTCrawler-enabled platform can be implemented internally in different ways; it only has to support the common and open interfaces to join the ecosystem. As discussed later, the open IoT interfaces are split into two planes, called the control and data planes (analogous to OpenFlow in software-defined networks). The control plane will coordinate and control the data and information processing in the platforms (monitoring and quality analysis). The data plane will allow for IoT data flow exchange between platforms (crawling, indexing and search).

2.2. Holistic Security, Privacy and Trust

An ecosystem of IoT platforms brings immense benefits but also potential risks for users and stakeholders. The very principle that makes the IoT so powerful, the potential to share data instantly with everyone and everything, creates huge security and privacy risks. Since IoT systems are, by their nature, distributed and often operate in uncontrolled environments, maintaining security, privacy and trust is a challenging task. IoTCrawler addresses quality, privacy, trust and security issues by employing a holistic, end-to-end approach to the whole workflow, from data and service publication to search and access. Device and connectivity management will ensure that end devices only connect to trusted access networks. IoTCrawler develops solutions for mitigating privacy intrusion without removing capabilities from the platform. Hence, privacy is preserved while correlation of data collected from multiple sources remains possible. Both technical and information governance procedures and guidelines are defined and implemented. This ensures that technical solutions are in place to avoid security and privacy risks, and that appropriate information governance procedures, best practices and measures are followed in the development, deployment and utilisation of the use-cases and third-party applications.

2.3. Crawling, Discovery and Indexing of Dynamic IoT Resources

Information access and retrieval in the early days of the Internet and the Web relied mainly on simple functions and methods. For example, Yahoo’s first search engine was simply based on the “grep” function in Unix, and the AltaVista search engine initially did not have a ranking mechanism. The Internet and the Web have come a long way in the past two decades in improving the way we access information on the Web. There are several sophisticated methods and solutions that provide crawling, indexing, ranking, search and retrieval of extremely large volumes of information on the Internet. The new generations of Web search engines now focus on information extraction with personalised and customised knowledge extraction techniques and solutions. Some early works are demonstrated by Google’s knowledge graph, Wolfram Alpha and Microsoft Bing. The current information access and retrieval methods on the IoT are still at the stage that the Web and the Internet were at in their early days. Information retrieval in large-scale IoT systems is currently based on the assumption that the sources are known to the devices and consumers, or that opportunistic methods will send discovery and negotiation messages to find and interact with other relevant resources within their reach (e.g. Google’s recent Physical Web project, https://google.github.io/physical-web/, is designed based on this assumption). Overall, IoT systems have many ad-hoc resources that do not comply with document and URL processing and indexing norms; resources such as mobile phones and sensing devices can publish data and then move to another location or disappear. Service and data crawling and discovery for smart connected devices and services will also involve automated associations and integration to provide an extensible framework for information access and retrieval in the IoT. IoTCrawler focuses on providing reliable, high-quality, resource-aware and scalable mechanisms for data and service publishing, crawling, indexing and ranking in very large-scale, distributed and dynamic IoT environments.

2.4. Machine-Initiated Semantic Search

In the past, search engines were mainly used by human users to search for content and information. In the newly emerging search model, information is provided depending on the context and requirements of the user, whether a human or a machine (for example, location, time, activity, previous records and profile). Information access can be initiated without the user’s explicit query or instruction, triggered instead by its necessity and relevance (context-aware search). This will require machine-interpretable search results in semantic forms. Moreover, social media, physical sensors (numerical streaming values) and Web documents must be better integrated, and the search results should become more machine-interpretable rather than remaining pure links (e.g. Web search engines mainly return a list of links to pages as their results, with some exceptions for popular questions and topics).

IoTCrawler enables context-aware search and automated processing of data by semantic annotation of the data streams, thus making their characteristics and capabilities available in a machine-processable way. There are several existing works that provide methods and techniques for semantic annotation and description of IoT devices, services and their messages and data. However, most of these methods rely on centralised solutions and complex query mechanisms that hinder their scalability and their wide-scale deployment and use in the IoT. IoTCrawler supports an ecosystem of multiple platforms and develops dynamic semantic annotation and reasoning methods that will allow continuous and seamless integration of new devices and services by exploiting and adapting existing annotations based on similarity measures.

The automatic discovery has to consider the current context. Context-awareness requires the

integration and analysis of social, physical and cyber data. IoTCrawler develops enablers for

context-aware IoT search. Hence the requirements of different applications are mapped to the

solutions by selecting resources considering parameters such as security and privacy level,

quality, latency, availability, reliability and continuity. IoTCrawler improves reliability and

robustness by fault recovery mechanisms and mitigation of malfunctioning devices, using device

activation/deactivation in the associated area. The fault recovery also requires mechanisms to

support communication among networked IoT resources located in diverse locations and across

different platforms, and to provide secure and efficient re-distribution of information in case

of failure.
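The selection of resources by parameters such as security level, quality, latency and availability can be illustrated with a small sketch. This is not the IoTCrawler ranking method: the weighted-sum scoring, the `ResourceProfile` fields, the weights and the sensor names are all assumptions chosen for illustration, and every score is assumed to be normalised to the range [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceProfile:
    """Hypothetical per-resource scores, each normalised to [0, 1]."""
    name: str
    security: float
    quality: float
    latency: float        # assumed already inverted: higher means lower latency
    availability: float


def rank_resources(resources, weights, min_security=0.0):
    """Drop resources below the required security level, then sort by weighted score."""
    def score(resource):
        return sum(weight * getattr(resource, key) for key, weight in weights.items())

    eligible = [r for r in resources if r.security >= min_security]
    return sorted(eligible, key=score, reverse=True)


candidates = [
    ResourceProfile("parking-sensor-a", security=0.9, quality=0.7, latency=0.6, availability=0.95),
    ResourceProfile("parking-sensor-b", security=0.4, quality=0.9, latency=0.9, availability=0.99),
    ResourceProfile("parking-sensor-c", security=0.8, quality=0.8, latency=0.5, availability=0.90),
]

# Illustrative application requirements: security matters most, availability least.
weights = {"security": 0.4, "quality": 0.3, "latency": 0.2, "availability": 0.1}
ranked = rank_resources(candidates, weights, min_security=0.5)
```

Here the security level acts as a hard requirement (a filter), while the remaining parameters only influence the ordering. A real deployment could just as well treat latency or availability as hard constraints.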


3. IOT RESOURCE INTEGRATION ARCHITECTURE

In order to take a step forward in defining the IoTCrawler architecture, a study of already

deployed IoT resources and platforms was initially performed among the project partners. The

details of this work are included in deliverable D2.1. The analysis resulted in a set of IoT

domains with particular features and connectivity requirements.

Since a key objective of the IoTCrawler project is the interconnection of already existing IoT domains, one part of the overall framework depicted in Figure 2, the Distributed IoT framework, needs to be discussed in further detail.

3.1. IoT Domain Interconnection Needs

The analysis performed in WP2 and reported in deliverable D2.1 raised the need for a cloud-

based solution that enables the interconnection of different IoT domains and the extraction of

metadata to cope with the data discovery/indexing and the semantic search. An overlay

infrastructure as depicted in Figure 3 is needed. Currently, IoT domains coming from the project

partners are considered, but the architecture of interconnection and data extraction of the

IoTCrawler framework must be prepared to cope with generic IoT data resources of any kind.

Figure 3: Interconnection of IoT domains

The key requirements identified for such an interconnection architecture are:

• The system must be able to incorporate IoT resources at different levels of granularity, ranging from single IoT gateways to complete domains with brokering capabilities.


• The system should support both the extraction of data and metadata, following the

IoTCrawler perspective of adding semantic search on top of the framework.

• The system must scale as the number of IoT domains increases. Here the computing and storage requirements must be especially considered, given the potential increase of data volumes and Big Data requirements in the coming years.

• Apart from metadata extraction, semantic features should be supported by the indexing

and ranking modules identified in the IoT Discovery layer of the IoTCrawler framework

depicted in Figure 2.

• Security and Privacy must be considered from the very beginning of the system,

implementing the cross-layer nature of security features within IoTCrawler.

• The reputation of data must be assured, which is a challenging task because of the very different nature of the resources to be integrated in the framework.

3.2. Distributed IoT Framework

A more detailed view of the problem to be tackled by the desired IoT resource integration architecture is shown in Figure 4. Here, three particular domains are included, in the areas of Industry 4.0, Smart City and Smart Buildings.

In Figure 4 it is important to notice that domains involving an IoT data broker must be interconnected by IoTCrawler. In this way, data about each particular deployment can be shared within a global platform. This is the case of the MiMurcia platform, which is based on the FIWARE technology and the Context Broker entity. In other domains, however, such as the Smart Home, interconnection may instead rely on direct connectivity with IoT gateways.

In the Smart Home domain, we further envision that due to the heterogeneity of IoT devices

and gateway technologies, local crawling and automated semantic annotations are required in

order to easily make IoT data available in the distributed IoTCrawler framework (see Section

3.3.1).

A concept that is essential in the interconnection of domains is the “Micro Broker”, as used in the Industry 4.0 IoT domain depicted in Figure 4. Micro brokers provide brokering capabilities to small IoT deployments that do not offer gateway capabilities using well-known protocols. Hence, micro brokers allow an easier interconnection through standardized message exchanges, most of them based on REST, such as the well-known NGSI.
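As an illustration of such a REST-based exchange, the sketch below builds (but does not send) an entity-creation request in the style of the FIWARE NGSI v2 API, where entities are created with a POST to `/v2/entities`. The choice of NGSI v2 rather than another NGSI version is an assumption, and the broker address, entity identifier and attribute are hypothetical.

```python
import json
from urllib.request import Request


def build_entity_request(broker_url, entity_id, entity_type, attributes):
    """Build (but do not send) an NGSI v2 style 'create entity' request."""
    entity = {"id": entity_id, "type": entity_type}
    for name, (value, attr_type) in attributes.items():
        # Each attribute carries its value together with a type label.
        entity[name] = {"value": value, "type": attr_type}
    return Request(
        url=f"{broker_url}/v2/entities",
        data=json.dumps(entity).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_entity_request(
    "http://micro-broker.example:1026",   # hypothetical micro broker endpoint
    "urn:machine:press-01",               # hypothetical entity identifier
    "Machine",
    {"temperature": (71.5, "Number")},
)
payload = json.loads(request.data)
```

Passing `request` to `urllib.request.urlopen` would actually send it. That step is left out here so that the sketch runs without a broker.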


Figure 4: Distributed IoT Framework of IoTCrawler

3.3. Layered Interconnection Architecture

A detailed architecture that covers the distributed IoT framework requirements of IoTCrawler

is proposed in Figure 5. It comprises the following layers (following a bottom-up approach):

Micro layer, Domain layer, Inter-domain layer, Internal processing layer and Application layer.

The different IoT domains are interconnected with the base IoTCrawler platform through a

federation approach using Metadata Repositories (MDRs) at different levels, over which

semantics empower user-level searches.

The separation of our platform into control plane, represented by dotted arrows, and data

plane, continuous lines, is one of the key aspects of our architecture.

The control plane manages metadata coming from IoT gateways, IoT devices or complete IoT platforms, and is responsible for setting up the data delivery between sources and applications. Our key design principle is that we only maintain metadata and links to the original data sources (IoT devices, gateways, IoT platforms), but do not replicate or manage the data itself. A guiding principle here is that we manage the metadata based on semantic data models: any metadata added to the Micro layer MDR contains a basic semantic annotation following the IoTCrawler annotation model described in deliverable D4.1. The metadata can then be further enriched in the Internal processing layer (Semantic Enrichment) with quality of information and application-domain-specific attributes. Moreover, information about the access protocols used is also considered metadata and is therefore stored in the MDRs.

Figure 5: IoT resource integration architecture

By contrast, the data path between a data source and a consumer (an IoTCrawler application or internal component) is only established in the data plane on demand, i.e. when a source needs to be accessed as a result of a search in the IoTCrawler MDR. Typically, we expect a need for at least one Data Source Proxy (DSP) in the data path. An important instance of such a proxy is an authorization enabler, which in many cases will manage access to the data source and hide the actual data source endpoint. Note that, in principle, a data path can also be established directly between application and data source. However, this requires that IoTCrawler is able to manage trust between the source and the application in an appropriate way.

3.3.1. Micro layer

Regarding the individual layers of the architecture, the first one, the Micro layer, is where crawling takes place. Figure 5 illustrates three different kinds of information sources that can be crawled: Local IoT Gw, IoT Platform and IoT Dev. Although a common context source to be integrated is an IoT Platform, integrating IoT resources of different natures means there will be cases in which local gateways act as intermediary nodes to very constrained IoT devices, or even cases in which only the end devices themselves are available in domains with Internet access. In these cases, the IoTCrawler proposal relies on providing interconnection gateways, taking advantage of virtualization technologies, as explained later.

For all three cases, the crawling task requires that the representation incorporate semantic annotation provided as an NGSI-LD representation [1]. Here we distinguish between two

cases: when the underlying platform provides no semantic annotation, in which case a Semantic

Annotation module is used; and when an initial set of annotations is available, in which case

the annotations are enhanced by the Semantic Enrichment module.
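The two cases above can be sketched as follows. This is a minimal illustration, not the IoTCrawler implementation: the raw reading format, the entity type `Sensor`, the `completeness` attribute and both function names are assumptions made for the example. Only the general NGSI-LD entity shape (`id`, `type`, `Property` attributes, `@context`) follows the NGSI-LD data model.

```python
# Assumed location of the NGSI-LD core context published by ETSI.
NGSI_LD_CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def annotate(raw: dict) -> dict:
    """Semantic Annotation case: wrap a raw reading (hypothetical format)
    in a minimal NGSI-LD entity."""
    return {
        "id": f"urn:ngsi-ld:Sensor:{raw['device']}",
        "type": "Sensor",
        "temperature": {
            "type": "Property",
            "value": raw["temp_c"],
            "unitCode": "CEL",  # UN/CEFACT code for degrees Celsius
        },
        "@context": [NGSI_LD_CORE_CONTEXT],
    }


def enrich(entity: dict, completeness: float) -> dict:
    """Semantic Enrichment case: return a copy of an already annotated
    entity with one extra (hypothetical) quality attribute."""
    enriched = dict(entity)
    enriched["completeness"] = {"type": "Property", "value": completeness}
    return enriched
```

A source without annotations would pass through `annotate` first, while a source that already exposes NGSI-LD entities would only pass through `enrich`.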

The Micro layer may also contain a DSP, for instance for transforming the data into the common data protocol and formats used by a concrete framework instantiation, such as NGSI-LD.

3.3.2. Domain layer

In the Domain layer we consider not only the case of a single MDR, but also a distribution of several. Using several MDRs in a domain enables load-balancing mechanisms, which can be necessary when the number of users or services accessing or reporting data is too large, or when the volume of data resources to store makes it advisable to use different servers. We must also highlight the interaction with the Authorization Enabler and the PEP_Proxy, which guarantees controlled registration of and access to the information stored in the IoTCrawler platform, as explained later.

3.3.3. Inter-domain layer

The inter-domain layer of the architecture federates metadata from different domains into a

global (although distributed) data platform and exposes a distributed MDR. This layer is

responsible for tracking where to search for information about IoT resources interconnected with the IoTCrawler ecosystem. Searches for non-indexed data could be initiated through a Distributed Hash Table (DHT) approach that provides the basis of the IoTCrawler discovery mechanism. Additionally, a

security mechanism based on Distributed Ledger Technology (DLT) using the Blockchain

Handler ensures the secure communication between the Distributed MDRs of this layer.
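As an illustration of how a DHT-style lookup could decide which distributed MDR to ask first, the sketch below uses consistent hashing on a ring. The text does not specify the DHT design, so the class `MdrRing`, the MDR names and the choice of SHA-256 are assumptions made purely for the example.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the hash ring (first 8 bytes of SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class MdrRing:
    """Hypothetical ring of MDR nodes; each key belongs to the next node clockwise."""

    def __init__(self, mdr_names):
        if not mdr_names:
            raise ValueError("at least one MDR is required")
        self._ring = sorted((ring_position(name), name) for name in mdr_names)
        self._positions = [pos for pos, _ in self._ring]

    def responsible_mdr(self, key: str) -> str:
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring
```

Every node that knows the same set of MDR names computes the same answer for a given key, so no central directory is needed for this first routing step.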

3.3.4. Internal processing layer

The internal processing layer has the following responsibilities:

1) Using the Semantic Enrichment component, it further enriches the available knowledge

about data sources

2) Based on the available metadata maintained by the Distributed MDR, it manages the

Metadata Index for enabling efficient searches on the available IoT resources

3) It uses the Ranking component to rank search results

4) It provides a Monitoring function offering access to data from IoT resources and is used

by the Semantic Enrichment component

The Semantic Enrichment may access the Distributed MDR and infer additional metadata or

directly monitor data sent by the sources. This may for instance be necessary when calculating

certain Quality of Information (QoI) attributes. In this case the Semantic Enrichment component

makes use of the Monitoring component which is included in the data plane. See Deliverable

D4.1 for more information on the QoI component.

This layer is also responsible for creating indices for the gathered large-scale and dynamic data

sources with heterogeneous parameters (i.e. location, time and theme related attributes). The

aim is to aggregate indices from the different data domains based on probabilistic machine

learning techniques, and to use them in the ranking procedure.

The Indexing component is responsible for creating and updating the Metadata index to allow

quick search and retrieval of the metadata stored in the Distributed MDR.

The Ranking component is triggered by a search request and accesses the index to retrieve a

list of resources matching the request. It ranks the results, e.g. based on QoI parameters, and

offers a ranked list of resources to the application. As the availability of resources and their

characteristics change over time, the ranking component will be asked for ranking updates upon

the detection of relevant metadata changes. If necessary, the ranking needs to be updated

accordingly and indicated to the application. For this reason, the architecture considers

algorithms and mechanisms for adaptive ranking and selection of IoT resources and services,

providing a dynamic solution for management and orchestration of the selected resources based

on application requirements.
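The ranking step can be illustrated with a weighted sum over QoI attributes. This is only a sketch under stated assumptions: the attribute names (`completeness`, `timeliness`), the weights and the resource dictionaries are hypothetical, and the actual ranking algorithms are not specified in this section.

```python
def rank(resources, weights):
    """Return resources sorted by a weighted sum of their QoI values (highest first)."""

    def score(resource):
        qoi = resource["qoi"]
        return sum(weight * qoi.get(name, 0.0) for name, weight in weights.items())

    return sorted(resources, key=score, reverse=True)
```

When a relevant metadata change is detected, for example an updated QoI value, calling `rank` again on the updated list yields the new order that would be indicated to the application.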

3.3.5. Application layer

The Application layer is the last layer of this platform. It comprises the Application and the

Orchestrator. The application is a client to IoTCrawler that issues search requests to the

orchestrator. The orchestrator is in charge of managing these requests, accessing both indexes

and live rankings. When the concrete data sources are identified, the orchestrator provides a

list of ranked results to the application. The application selects one of the results by sending a

message to the orchestrator. Finally, the orchestrator establishes the necessary data paths to

the data source in the data plane involving all the necessary DSPs and monitoring components

as described above.
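The interaction between application and orchestrator described above can be sketched as a small class. All names here (`Orchestrator`, `search`, `select`, the `proxies` and `endpoint` fields) are hypothetical and only mirror the steps in the text: search, ranked results, selection, data path set-up.

```python
class Orchestrator:
    """Hypothetical orchestrator: answers searches and records data paths."""

    def __init__(self, index, rank_fn):
        self.index = index        # keyword -> list of resource descriptions
        self.rank_fn = rank_fn    # callable that orders a list of resources
        self.data_paths = {}      # (application id, resource id) -> path

    def search(self, keyword):
        """Control plane: look up the index and return ranked results."""
        return self.rank_fn(self.index.get(keyword, []))

    def select(self, app_id, resource):
        """Data plane set-up: source, then any DSPs, then the application."""
        path = [resource["endpoint"], *resource.get("proxies", []), app_id]
        self.data_paths[(app_id, resource["id"])] = path
        return path
```

In this sketch a resource without proxies yields a direct path between source and application, which corresponds to the directly established data path mentioned earlier.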

3.4. Interconnection of Constrained Domains

A challenge for the IoTCrawler architecture is the interconnection of IoT domains with strict

resource constraints. As previously introduced, the solution given by the architecture is the use

of local brokers dynamically deployed to gather data from these domains. This idea is depicted

in Figure 6. Here, a set of “Micro brokers” are in charge of gathering data from interconnected

IoT assets using appropriate interfaces. Well-known application protocols such as NGSI or equivalent will be used to make the interconnection easier.

Figure 6: Interconnection of constrained IoT domains

Once the data is collected through these micro brokers, the regular architecture described

earlier can be used, through the inter-domain layer, the metadata layer and the application

layer.

3.5. Dynamical Coordination of Data Processing

Different IoT devices, distributed among a number of smart spaces, imply a challenging scenario

where huge amounts of data must be processed at the different layers of the previous

architecture. With the aim of coping with both the strict computing needs and the dynamicity

of the IoTCrawler ecosystem, the base architecture is envisioned with processing coordination capabilities that will attend to the dynamic computing requirements of the various platform components.

The adopted solution relies on dynamically instantiating MDR entities as required by the framework. This is particularly useful for local MDRs to be deployed by the system in domains where no data repositories are present. A particular case is that of the micro MDRs depicted in Figure 6. Nevertheless, it is important to attend to the potential computing and storage requirements of the higher layers of the architecture. Both the brokering scheme and the metadata processing modules could present computing and data storage issues as the number of integrated IoT resources or users performing searches increases.

Figure 7: Coordination of processing resources in the IoTCrawler architecture

As seen in Figure 7, the IoTCrawler architecture will contain a processing coordination function.

This is a cross-layer element responsible for managing the computing requirements of the

platform. The system monitors computing and storage resources not only by communicating periodically with already deployed modules to obtain performance parameters, but also by using the information from the Monitor module to scale the system according to its needs. The monitor provides information about the IoT assets involved in the searches and data

retrieval operations. This way the processing coordinator can modify the resources assigned to

already deployed modules or instantiate new modules to balance the load.

At the local, domain and inter-domain layers, new MDRs could be created to cope with growing

demands of data requests or data reports. Also, as the number of IoT data assets registered

increases, these could be dynamically assigned to new MDRs to scale the capabilities of the

architecture.
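A very simple scaling rule of the kind the processing coordinator could apply is sketched below. It assumes, purely for illustration, that load is measured in requests per second and that each MDR has a known capacity; neither metric nor the function names come from the deliverable.

```python
import math


def mdrs_needed(requests_per_second: float, capacity_per_mdr: float) -> int:
    """Number of MDR instances required for the observed load (at least one)."""
    if capacity_per_mdr <= 0:
        raise ValueError("capacity_per_mdr must be positive")
    return max(1, math.ceil(requests_per_second / capacity_per_mdr))


def scaling_delta(current_mdrs: int, requests_per_second: float,
                  capacity_per_mdr: float) -> int:
    """Positive: instantiate that many new MDRs; negative: that many can be released."""
    return mdrs_needed(requests_per_second, capacity_per_mdr) - current_mdrs
```

A predictive variant would feed a forecast load, rather than the currently observed one, into the same function.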

Machine learning algorithms are being considered for integration in the processing coordination module, to predict and anticipate system scaling upon the appearance of indicative events. Such events include potential high-load periods due to an increase in searches, periodic yearly, monthly or weekly load peaks, performance indicators received from launched MDRs, or requests for registering new IoT data assets.

4. BROKER FEDERATION SOLUTION

There are plenty of IoT platforms that integrate numerous and heterogeneous IoT systems. For

this reason, the target of this project is not to create a new platform, but to create a crawling

framework which allows for integrating existing IoT deployments. As discussed, we have based that integration on two premises: first, the adoption of NGSI-LD and, second, the use of a broker that adopts this interface to provide APIs to query, subscribe to and update the metadata managed by the MDR.

Considering the scalability and heterogeneity properties of IoTCrawler, a solution for a distributed architecture, as well as support for federations of platforms, is a must in the scope of this project.

IoTCrawler will use FogFlow, a FIWARE Generic Enabler for cloud-edge orchestration leveraging

edge elements as computational resources, as well as cloud computing. Figure 8 presents an

overview of its architecture where we can highlight three different layers or scopes: Context

Management, Service Management and Data Processing.

Figure 8: FogFlow architecture depiction

Following FIWARE’s architecture (in which FogFlow is embedded), long term data storage,

historic information, Big Data processing, and other high-level services are provided by

different FIWARE components, other than FogFlow.

Context management is handled by the IoT Broker and IoT Discovery components, which are concrete instantiations of the MDR concept. IoT Brokers are in charge of holding the information (in the form of context entities) and allowing the remaining elements of the system to query and update those entities. They also allow subscription to content changes. Finally, IoT

Discovery is responsible for keeping track of where the context information is stored (which

broker) and what information is available in the system. It also allows subscription to new

content creation. The adoption of NGSI 9 and 10 and how they are implemented and treated in

this approach is shown in Figure 9.
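The division of labour between the two components can be sketched with an in-memory model. This is not the FogFlow or IoT Broker API: the class and method names are hypothetical and only mirror the roles described above, with discovery keeping track of which broker holds an entity (the availability role of NGSI-9) and the broker handling query, update and subscription (the context role of NGSI-10).

```python
class IoTDiscovery:
    """Keeps track of which broker stores which context entity."""

    def __init__(self):
        self._broker_of = {}

    def register(self, entity_id, broker_name):
        self._broker_of[entity_id] = broker_name

    def lookup(self, entity_id):
        return self._broker_of.get(entity_id)


class IoTBroker:
    """Holds context entities and notifies subscribers on updates."""

    def __init__(self, name, discovery):
        self.name = name
        self._discovery = discovery
        self._entities = {}
        self._subscribers = {}

    def update(self, entity_id, attributes):
        if entity_id not in self._entities:
            self._discovery.register(entity_id, self.name)  # announce new content
        self._entities.setdefault(entity_id, {}).update(attributes)
        for callback in self._subscribers.get(entity_id, []):
            callback(entity_id, dict(self._entities[entity_id]))

    def query(self, entity_id):
        return dict(self._entities.get(entity_id, {}))

    def subscribe(self, entity_id, callback):
        self._subscribers.setdefault(entity_id, []).append(callback)
```

A consumer would first ask discovery which broker holds an entity and then query or subscribe at that broker.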

Figure 9: Usage of NGSI 9/10 in the scope of the IoT Broker

Figure 10: Architecture mapping of a logical view described above to an implementation architecture based on IoTBroker

Please note that in the concrete instantiation of the IoTCrawler framework we consider that

many data sources will provide NGSI-LD interfaces as well. This means that in a possible

implementation, the MDR in the micro layer may be the same IoTBroker that is actually

gathering the data. Figure 10 illustrates this mapping, which focuses on the micro layer and the interaction of the MDRs with the domain layer.

These mechanisms are heavily used in FogFlow’s architecture by the task manager and the tasks

themselves, as a means to discover new context information that will trigger new data

processing flows, and to feed those flows with context data.

As a result, FogFlow’s architecture is robust and flexible, allowing for a graceful degradation

and scalability, as many components can be replicated (the IoT Broker for instance) and the

task management system takes into consideration available computing resources on the system

during task allocation.

5. INTER-DOMAIN POLICY MANAGEMENT

In the inter-domain layer, we have presented a federation of MDRs with a distributed approach.

The relationship between MDRs must be controlled to allow only legitimate MDRs to participate

in the federation. We also assume that these MDRs, being linked to multiple domains, do not maintain established trust relationships, yet they have to agree on common policies for accessing and sharing data resources. We need a security mechanism that allows us to define global data sharing policies between the different MDRs of our federation, which must be contractually agreed by all participating domains of the federation. For example, “No data owner (domain) can give access to its data to untrusted parties such as embargoed countries”. At this layer, we introduce the use of a blockchain handler to solve inter-domain policy management. Our policy model includes global policies and domain-specific policies. Note that

the “domain” here refers to a virtual concept about data source providers. Each domain is

responsible for a set of sensor data it provides to the system.

Global data sharing policies: In a federation model that includes many IoT domains, it is necessary to maintain a common set of rules that the domains involved must comply with. Here we define global policies as those that are mutually agreed by all (or a majority of) domains and complied with by all domains. Data sharing communications with external entities must follow those policies. Once created, global policies are updated or removed only with the agreement of a majority of or all domains in the federation.

Domain-specific data access policies: Each domain is a data source owner and has full rights to set its own policies. These policies are typically about who can access which data under which circumstances. Importantly, domain-specific policies are not allowed to conflict with global policies, so that the federation remains consistent in managing and sharing data securely.

Thanks to this security mechanism, the system only executes agreed and validated policies. Once a domain policy is validated, further modifications require the approval of the other domains of the network before they are executed. Another important point is that a domain cannot revoke a policy, as this might break services relying on it.
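The two checks described in this section, consistency with global policies and approval by the other domains, can be sketched as follows. The policy representation (a set of allowed consumers per resource, a global deny list) and the function names are assumptions made for the example; the deliverable does not fix a policy format here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainPolicy:
    domain: str
    resource: str
    allowed_consumers: frozenset


def conflicts_with_global(policy: DomainPolicy, globally_denied: frozenset) -> bool:
    """A domain policy conflicts if it grants access to a globally denied party."""
    return bool(policy.allowed_consumers & globally_denied)


def is_approved(votes: dict, require_all: bool = False) -> bool:
    """votes maps domain name -> bool; approve by strict majority or unanimity."""
    yes = sum(1 for vote in votes.values() if vote)
    if require_all:
        return yes == len(votes)
    return 2 * yes > len(votes)
```

Under this sketch a new or modified domain policy would be accepted only if `conflicts_with_global` returns `False` and `is_approved` returns `True` for the votes collected from the other domains.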

5.1. Design overview

Figure 11 shows our design for federating policy management. As described previously, domains agree on validated policies even if they do not have prior mutual trust relationships. In this design, we consider IoTCrawler a special domain that serves external entities such as applications, client services and other consumers. The security technology most suitable for

this trustless model is based on Distributed Ledger Technology (DLT); blockchain is one type of DLT. In the following, we describe the background of DLT and elaborate on how it is applied to inter-domain policy management.

Figure 11: Inter-domain policy management

5.2. Distributed Ledger Technology (DLT)

DLT aims to solve the problem of a P2P network of untrusted entities exchanging transactions. Additionally, this P2P network does not have a central administration role and does not rely on a particular hierarchy, because all participants have the same capabilities.

In these networks, each new member must be approved by all peers, and a single ledger of

ordered transactions is shared by all peers in real time. So, the ledger is consistently replicated

by each peer. This is the reason why tampering with data is impossible without simultaneously

hacking every peer in the network. Furthermore, additional restrictions can be applied so that

access is granted only to a limited set of entities.

A blockchain is one instance of DLT, defined as an immutable distributed ledger that records

transactions which happened in the network of mutually untrusting peers. The ledger is

replicated by peers and these peers execute a consensus protocol to validate transactions,

group them into blocks and add new blocks to the ledger.
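Why a replicated, append-only ledger makes tampering detectable can be shown with a minimal hash chain. This is a teaching sketch only: it has no peers, no consensus protocol and no signatures, and the block layout is an assumption made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash, transactions):
    """Hash of a block: covers its transactions and the previous block's hash."""
    payload = json.dumps({"prev": prev_hash, "txs": transactions}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Ledger:
    def __init__(self):
        self.blocks = []

    def append(self, transactions):
        prev = self.blocks[-1]["hash"] if self.blocks else GENESIS
        txs = list(transactions)
        self.blocks.append({"prev": prev, "txs": txs, "hash": block_hash(prev, txs)})

    def is_valid(self):
        """Recompute every hash; any edited block breaks the chain."""
        prev = GENESIS
        for block in self.blocks:
            if block["prev"] != prev or block["hash"] != block_hash(prev, block["txs"]):
                return False
            prev = block["hash"]
        return True
```

Because each block's hash covers the previous hash, changing an old transaction invalidates that block and every later one, which honest peers holding their own replica can detect.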

Blockchain allows executing business logic on transactions programmed in the form of

chaincodes (smart contracts) which are run concurrently in the distributed network by many

peers. The security of smart contracts relies on underlying consensus among peers.

Hyperledger Fabric (simply Fabric) is an open-source blockchain platform managed by the Linux Foundation. Fabric is widely used in prototypes, proofs of concept and industrial production. Its use cases span areas such as supply chain management, contract management, data provenance and identity management. Fabric is a blockchain architecture that overcomes the limitations of previous permissioned blockchain platforms regarding flexibility, scalability and confidentiality. With this goal, Fabric is designed as a modular and extensible permissioned blockchain. It supports the execution of distributed applications written in general-purpose programming languages such as Go and Java, following the execute-order-validate paradigm for untrusted code in an untrusted environment. A distributed application consists of a chaincode (smart contract) and an endorsement policy. The chaincode implements the application logic and runs in the execution phase. The endorsement policy is evaluated in the validation phase and can only be modified by trusted entities, e.g. administrators. Fabric uses a hybrid replication design which incorporates primary-backup (passive) replication and active replication. Primary-backup replication in Fabric means that every transaction is executed only by a subset of peers, based on endorsement policies. Active replication means that transactions are written to the ledger once consensus is reached on their total order. This hybrid design is what makes Fabric a scalable permissioned blockchain. Fabric contains the following modular blocks:

• Ordering service that broadcasts updates to peers and establishes consensus on

transaction orders.

• Membership Service Provider (MSP) that associates peers with cryptographic identities.

• Peer-to-peer gossip service to disseminate blocks to all peers in the blockchain network.

• Smart Contracts that run application logic in container environments.

• Ledger that is maintained by peers in append-only form.

A Fabric permissioned blockchain consists of a set of nodes enrolled by the modular MSP. Each

node can be a client, a peer or an orderer. As a client, the node can submit transaction

proposals for execution, execute them and then broadcast them for ordering. As a peer, the

node can execute transaction proposals and validate transactions. The node only executes

transactions that are endorsed, as specified by the endorsement policy, and it stores

transactions in an append-only ledger. As an orderer, the node runs the ordering service to establish a total order for transactions in Fabric.
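The execute-order-validate flow and the role of the endorsement policy can be illustrated with a toy model. This is not the Fabric API: the functions below are hypothetical, peers are plain strings, and signatures, read-set version checks and the ordering consensus protocol are all omitted.

```python
def execute(proposal, state, chaincode, endorsers):
    """Execution phase: each endorsing peer simulates the chaincode on a copy
    of the state and returns the writes it would make."""
    return {peer: chaincode(dict(state), proposal) for peer in endorsers}


def order(pending):
    """Ordering phase: fix a total order (here simply arrival order)."""
    return list(pending)


def validate_and_commit(ordered, state, required_endorsers, ledger):
    """Validation phase: check the endorsement policy, apply valid writes,
    and record every transaction in the append-only ledger."""
    for tx in ordered:
        results = tx["endorsements"]
        write_sets = list(results.values())
        endorsed = (
            bool(write_sets)
            and required_endorsers <= set(results)
            and all(writes == write_sets[0] for writes in write_sets)
        )
        if endorsed:
            state.update(write_sets[0])
        ledger.append({"proposal": tx["proposal"], "valid": endorsed})


def set_policy(state, proposal):
    """Example chaincode (hypothetical): write one key."""
    return {proposal["key"]: proposal["value"]}
```

In this sketch the endorsement policy is simply "all peers in `required_endorsers` must have endorsed and agree on the result"; a transaction that fails the check is still recorded, but marked invalid and not applied to the state.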

5.3. Smart Contracts

A smart contract is computer code running on top of a blockchain that contains a set of rules

under which a set of parties agree to interact with each other. If the pre-defined rules are met,

the agreement is automatically enforced. The smart contract code facilitates, verifies, and

enforces the negotiation or performance of an agreement or transaction. It is the simplest form

of decentralized automation. The use of smart contracts provides the following advantages:

Figure 12: Distributed smart contract

• They can enforce custom policies and/or business logic.

• They can be adjusted according to participants' needs or as defined by a regulator supervising the network.

• Business outcomes are set in stone because computer programs are unambiguous.

• They are Turing-complete programs that can validate, approve, or deny transactions.

• They can implement complex business logic.

• Blockchain transactions can be invoked by relevant parties.

• They can be updated at any time (e.g., policy updates).

• They are fully automated: once deployed, smart contracts operate without human supervision.

• They enable parties to agree on business-specific policies in a flexible manner.

Regarding the validation of federation policies, policies can themselves be implemented as smart contracts. This way, at the checking point, each policy is either approved and added to the ledger, or rejected.

The management of policies, i.e. adding new ones or modifying existing ones from any domain, is verified and validated by the network. These checks are performed through the execution of the smart contracts.

In a digitally connected world, a vast number of devices can collect data, communicate and share it with each other to provide a wide range of services. Major players in the industry have developed platforms to integrate sensors, devices, networks and applications as a complete IoT ecosystem. These platforms are limited by the lack of transparent user control over their data. In addition, they require highly complex communications between different entities to control access to data based on policies.

Figure 13: Smart Contract application. Blockchain Handler

By leveraging blockchain technology, as can be seen in Figure 13, we can provide monetization

of data as a solution for transparency where users are able to set up policies that set a limited

time of use (e.g. only for 1 month), license fees (e.g. in case of data resale), etc.
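A usage policy with a limited time of use can be checked with a few lines of code. The policy fields (`licensees`, `granted_at`, `valid_for`) and the function name are hypothetical; in the proposed design such a check would be part of the logic agreed on the blockchain rather than a local function.

```python
from datetime import datetime, timedelta, timezone


def access_allowed(policy: dict, consumer: str, now: datetime) -> bool:
    """True if the consumer holds a licence and the usage period has not expired."""
    start = policy["granted_at"]
    end = start + policy["valid_for"]
    return consumer in policy["licensees"] and start <= now < end


# Example policy: one licensee, valid for roughly one month from 1 February 2019.
example_policy = {
    "licensees": {"app-1"},
    "granted_at": datetime(2019, 2, 1, tzinfo=timezone.utc),
    "valid_for": timedelta(days=30),
}
```

Licence fees could be modelled in the same way, as a further field that must be settled before the check succeeds.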

Enhancing user digital experience on IoT systems by leveraging blockchain has been a promising

solution investigated by researchers and practitioners. We propose an extension to existing IoT

platforms with blockchain to track data ownership and access rights. We will allow the IoT

Platform users to bind their data pushing actions to blockchain transactions. In turn, the IoT

Platform will enforce the access control policies that are defined and agreed upon on the

blockchain (i.e. blockchain consensus is necessary to alter data access rights). By doing so, we will provide greater transparency and control over the data.

5.4. Private Data sharing Enabler

Although our specific use case is the adoption of DLT for inter-domain policy management, we have opted to design and develop a generic private data sharing enabler. The enabler can, of course, also be used to manage policies.

In our generic design, we consider the main IoT platform components, including IoT domains (IoT stakeholders, IoT Discovery and IoT Brokers). Which components participate in the blockchain channel depends on the specific deployment. Figure 14 shows how IoT domains are linked to the blockchain. The Router and Blockchain handler operate on top of an existing IoT platform, e.g. FIWARE. They are meeting points through which existing IoT components communicate with additional services such as the data market and data storage.

Figure 14: Blockchain-enabled private data sharing enabler

The router plays a central role: all remaining components connect to it. It checks incoming and outgoing data and directs the data to the corresponding component. When data is received from IoT stakeholders, it is handled by either the cloud handler or the Blockchain handler, depending on the content of the incoming messages. If the incoming messages are about data offers, the router directs the request to the data marketplace via the Blockchain handler. To serve as a discovery service, the Router also updates the IoT Broker and IoT Discovery with the appropriate metadata, depending on the design of the specific system.
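The content-based dispatch performed by the router can be sketched in a few lines. The message field `kind`, its value `data_offer` and the handler callables are assumptions made for illustration; the actual message format is not given in this section.

```python
def route(message: dict, cloud_handler, blockchain_handler):
    """Send data offers to the blockchain handler, everything else to the cloud handler."""
    if message.get("kind") == "data_offer":
        return blockchain_handler(message)
    return cloud_handler(message)
```

Both handlers are passed in as plain callables, so the same routing rule can be exercised with stub handlers before any blockchain or storage back end is attached.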

The blockchain handler manages all communications to the blockchain when data market

operations take place. It will handle data submitted to the blockchain and also data queries. It

also supports the IoT access control component. Data becomes available for access or retrieval when the access policies are satisfied. IoT access control keeps policies synchronized with the data market via the blockchain handler.

The data sharing enabler leveraging blockchain technology enhances transparency of data

exchanges in an IoT framework. It simplifies the data management process while providing

transparency and monetization. For proof-of-concept deployment, this enabler will be

implemented on FIWARE and Hyperledger Fabric platforms.

When the enabler is used for inter-domain policy management, each policy is created and validated via smart contract executions. Assuming that the blockchain network is secure, no malicious peer (domain) can alter any validated policy. Any change to an existing policy requires a new approval through the execution of new smart contracts. IoTCrawler is presented

as a special domain which is also the gateway to communicate to external entities. Hence, it

will handle data access according to valid policies in the blockchain channel of inter-domain

policies.

5.5. Decentralized Identifiers

Taking advantage of the blockchain features inherent in the IoTCrawler framework, we also propose the use of Decentralized Identifiers (DIDs) to represent all subjects considered in the data models. A DID (see footnote 4) is a globally unique digital identity of a subject that can be validated using a blockchain environment. DIDs are resolved to a resource describing the identified entity, in a similar way to how a URL is resolved to Web content. Hence, the DID is linked to a document that contains cryptographic material enabling authentication against a network service.
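The resolution step can be illustrated with a tiny in-memory resolver. This is a sketch under stated assumptions: the `did:example:` method, the sensor identifier, the key placeholder and the dictionary-based registry are all hypothetical, and the document field names follow the W3C DID data model as we understand it, which may differ from the draft cited in the footnote.

```python
def parse_did(did: str):
    """Split 'did:<method>:<method-specific-id>' into (method, identifier)."""
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a valid DID: {did!r}")
    return parts[1], parts[2]


# Hypothetical registry standing in for the distributed ledger.
REGISTRY = {
    "did:example:sensor42": {
        "id": "did:example:sensor42",
        "verificationMethod": [
            {
                "id": "did:example:sensor42#key-1",
                "type": "Ed25519VerificationKey2018",
                "controller": "did:example:sensor42",
                "publicKeyBase58": "<public key placeholder>",
            }
        ],
        "authentication": ["did:example:sensor42#key-1"],
    }
}


def resolve(did: str) -> dict:
    """Return the DID document for a DID, much as a URL resolves to content."""
    parse_did(did)  # raises ValueError on malformed input
    return REGISTRY[did]
```

The document returned by `resolve` carries the public key material that a verifier would use to authenticate the subject.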

In IoTCrawler, it is possible to go a step further in the identification of sensors, people or virtual

resources, through the management of DIDs representing such subjects. This way,

authentication of these entities is embedded into a blockchain decentralized environment, in

which common centralized entities in Public Key Infrastructure (PKI) environments will no

longer be needed. Hence, identification information about an entity is now maintained by the entity itself, and not by the service, as in common registration processes. When needed, the

entity presents the DID, whose linked descriptive document is distributed in the network. This

means, for instance, that in a common data gathering scenario from a sensor, data packets

4 https://w3c-ccg.github.io/did-spec/

reported can be associated with a DID and signed using the corresponding private key. The public key can be obtained from the DID document to verify the data received from the sensor.

The use of DIDs is also a good option for preserving user privacy, given that they are digital identifiers that can be used for particular features or contexts, and can remain unlinkable to a physical person when such linking is not desired.

6. IOT SECURITY: FROM SENSORS TO THE SEARCHING ENGINE

IoTCrawler has been conceived as a means to allow users to access the information provided by the billions of IoT devices which are already connected to the Internet around the world, but with one differentiating aspect: this access must be provided in a secure manner. Considering this added value from the beginning allows us to treat security as a transversal requirement which must be satisfied at every layer: local, intra-domain, inter-domain, meta and application layer, as presented in Figure 5.

Security as such is a very broad term with different implications. Let us highlight the most common ones regarding access to specific resources. Within this scope, authentication, authorization and privacy are the three most relevant terms. We can also consider other relevant terms such as integrity, usually referring to the exchanged information, as well as trust, which defines the relationship between different platforms according to the truthfulness of the exchanged information.

Bearing in mind that security is a transversal necessity for IoTCrawler, we need to equip our system with enablers that provide such security technologies. In particular we have considered:

• Identity management, as a key component in which the subjects

accessing the information are registered.

• An authorization enabler based on XACML and capabilities, which provides a decoupled

approach where a Capability Token is presented as an authorization token that can be easily

validated without contacting third parties.

• A privacy enabler for the information broadcast to a group of consumers, which allows

only the legitimate consumers to decrypt the received information based on the

attributes of the logical entity corresponding to these consumers. These attributes are stored in

the identity management enabler.

In the following subsections we expand on the motivation for adopting these

components.

6.1. Context-Aware Privacy Policies

Given the pervasive, distributed and dynamic nature of IoT, context should be a first-class

security component to drive the behavior of devices. This would allow smart objects to be

enabled with context-aware security solutions, in order to make security decisions adaptive to

the context in which transactions are performed. At the same time, context information should

be managed with security and privacy in mind. In particular, current IoT

devices (e.g. smartphones) can obtain context information from other entities in their

surrounding environment, as well as provide contextual data to other smart objects. These

communications can be performed through lossy networks and constrained devices, which must

be secured by suitable security mechanisms and protocols. Additionally, trust and reputation

mechanisms should be employed to assess the trustworthiness of data being provided by other

entities in the environment. In this way, smart objects can discard information that comes from

less reliable devices. Moreover, high-level context information can be reasoned about and inferred

while taking privacy concerns into account. Thus, a smartphone could be configured to provide information

about a person’s location with less granularity (e.g. giving the name of the city where they are,

but not the GPS coordinates), or only at long intervals, so that the daily habits

of a person cannot be inferred by other entities.
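
The location example above can be illustrated with a minimal sketch. It is not part of the IoTCrawler design: the class name `LocationReporter`, the rounding to one decimal degree (roughly 11 km in latitude) and the one-hour minimum interval are assumptions chosen only to show how granularity and reporting frequency could be reduced before context data leaves the device.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LocationReporter:
    """Hypothetical helper that coarsens and rate-limits location reports."""

    decimals: int = 1               # assumed granularity: 0.1 degree
    min_interval_s: float = 3600.0  # assumed minimum time between reports
    last_report: Optional[float] = None

    def report(self, lat: float, lon: float, now: float) -> Optional[Tuple[float, float]]:
        """Return coarsened coordinates, or None if it is too early to report again."""
        if self.last_report is not None and now - self.last_report < self.min_interval_s:
            return None  # suppress frequent updates so daily habits are harder to infer
        self.last_report = now
        return (round(lat, self.decimals), round(lon, self.decimals))
```

A caller would pass the current time explicitly (for instance `time.time()`), which keeps the sketch deterministic and easy to test.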

In order to make security adaptive to context, the IoTCrawler system will provide a platform

that enables users and smart objects to share information while keeping the interacting

entities decoupled. In this sense, the resulting platform will be used for the secure exchange

of contextual data, so users and smart objects can adapt their security and privacy behavior

according to it. For example, this information could be used by an Identity Management

component, which is intended to manage the identities of users and smart objects in an

(optionally) privacy-preserving way. It is based on the use of anonymous credential systems

(e.g. Idemix [2]), enabling users and smart objects to select partial identities (i.e. a subset of

their identity attributes) to be used when accessing platform services.

In addition to identity management, authorization functionality could also be based on

contextual information. Specifically, the proposed enabler for access control management is

based on a combination of access control models and techniques. It makes use of access control

policies (e.g. eXtensible Access Control Markup Language (XACML) 3.0 [3]), which are employed

to generate lightweight authorization tokens based on the capability-based approach

[5][6][7], which is explained in Section 6.2.

In addition to more established token-based access control approaches, there will be common

situations in which information needs to be outsourced or shared through the use of a central

data management platform. For these scenarios, an approach based on advanced cryptographic

schemes, such as the Ciphertext Policy Attribute Based Encryption (CP-ABE) scheme [4], is key

to guarantee security properties when this data needs to be shared with groups of users or

services. In this case, the high-level context information could be used to select a specific

CP-ABE policy to encrypt a certain piece of data.

In particular, this component could contain a set of Sharing Policies specifying how the

information should be disseminated according to contextual data. These policies are intended

to be evaluated before information is disseminated by the smart objects. The result of the

evaluation of these policies could be, in turn, a CP-ABE policy indicating the set of entities

which will be enabled to decrypt the information to be shared.

An example sharing policy could be "IF contextA=atPub AND data=myLocation, THEN CP-ABE

policy=myfriends OR myfamily", specifying that the location of a user is shared with friends or family

members when he/she is at a pub. Accordingly, when a sharing policy is successfully evaluated, the

resulting CP-ABE policy is used to encrypt the information to be shared. If two or

more sharing policies are successfully evaluated, the most restrictive CP-ABE policy could be

selected. After the information is encrypted and disseminated, the corresponding component of the smart objects

receiving such data will try to decrypt it with the CP-ABE keys related to their identity attributes.
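
The evaluation of sharing policies described above can be sketched as follows. This is only an illustration under stated assumptions: the names `SharingPolicy` and `select_cpabe_policy` are hypothetical, a CP-ABE policy is modelled as a plain set of attributes joined by OR, and "most restrictive" is interpreted as the policy that admits the fewest attributes. No encryption is performed; the selected policy would be handed to a CP-ABE engine.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass
class SharingPolicy:
    """Hypothetical model of 'IF context AND data THEN CP-ABE policy'."""

    required_context: Dict[str, str]  # e.g. {"contextA": "atPub"}
    data_item: str                    # e.g. "myLocation"
    cpabe_policy: FrozenSet[str]      # attributes joined by OR (assumed form)


def select_cpabe_policy(
    policies: List[SharingPolicy],
    context: Dict[str, str],
    data_item: str,
) -> Optional[FrozenSet[str]]:
    """Return the CP-ABE policy to encrypt with, or None if nothing matches."""
    matches = [
        p.cpabe_policy
        for p in policies
        if p.data_item == data_item
        and all(context.get(k) == v for k, v in p.required_context.items())
    ]
    if not matches:
        return None
    # Assumption: with OR-only policies, fewer accepted attributes means
    # fewer entities can decrypt, i.e. a more restrictive policy.
    return min(matches, key=len)


policies = [
    SharingPolicy({"contextA": "atPub"}, "myLocation", frozenset({"myfriends", "myfamily"})),
    SharingPolicy({"contextA": "atPub"}, "myLocation", frozenset({"myfamily"})),
]
chosen = select_cpabe_policy(policies, {"contextA": "atPub"}, "myLocation")
```

With the two example policies both matching, the sketch selects the one that only admits `myfamily`. A different ordering criterion could be substituted without changing the overall flow.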

It should be noted that such an approach could be integrated into end devices (e.g.

smartphones) that will share their data with other users through the platform. At the same

time, such an approach could be included in the platform itself, in case other devices (e.g. sensors)

are not able to deploy this mechanism.

6.2. Capability-Based Access Control

The inherent requirements and constraints of IoT environments, as well as the nature of the

potential applications of these scenarios, have brought about a greater consensus among

academia and industry to consider access control as one of the key aspects to be addressed for

full acceptance by all IoT stakeholders.

Due to the heterogeneous nature of IoT devices and networks, most recent access control

proposals have been designed as centralized approaches in which a central entity or

gateway is responsible for managing the corresponding authorization mechanisms, allowing or

denying requests from external entities. Since this component is usually instantiated by

unconstrained entities or back-end servers, standard access control technologies are directly

applied. However, significant drawbacks arise when centralized approaches are considered in

a real IoT deployment. On the one hand, the inclusion of a central entity for each access request

clearly compromises end-to-end security properties, which are considered an essential

requirement in IoT, due to the sensitivity level of potential applications. On the other hand,

the dynamic nature of IoT scenarios with a potentially huge number of devices complicates the

trust management with the central entity, affecting scalability. Moreover, access control

decisions do not consider contextual conditions which are locally sensed by end devices.

These issues could be addressed by a decentralized approach, in which IoT devices (e.g.

smartphones, sensors, actuators, etc.) are enabled with authorization logic without the need

to delegate this task to a different entity when receiving an access request. In this case, end

devices are enabled with the ability to obtain, process and transmit information to other

entities in a protected way. However, in a fully distributed approach, the feasibility of

applying traditional access control models, such as Role-Based Access Control (RBAC) or

Attribute-Based Access Control (ABAC), has not been demonstrated so far. Indeed, as previously

mentioned, such models require a mutual understanding of the meaning of roles and attributes,

as well as complex access control policies, which makes their application on IoT

devices challenging. Moreover, the impact of the potential applications of IoT in all aspects of our lives is

shifting security aspects from an enterprise-centric vision to a more user-centric one.

Therefore, usability is a key factor to be considered, since untrained users should be able to

control how their devices and data are shared with other users and services.

As already mentioned, DCapBAC has been postulated as a feasible approach to be deployed in

IoT scenarios even in the presence of devices with tight resource constraints (Figure 15).

Inspired by Simple Public Key Infrastructure (SPKI) Certificate Theory and Authorization-Based

Access Control (ZBAC) foundations [8], it is based on a lightweight and flexible design that

embeds authorization functionality on IoT devices, providing the advantages of a distributed

security approach for IoT in terms of scalability, interoperability and end-to-end security.

The key element of this approach is the concept of capability, defined as a "token, ticket, or key that

gives the possessor permission to access an entity or object in a computer system". This token

is usually composed of a set of privileges which are granted to the entity holding the token.

Additionally, the token must be tamper-proof and unequivocally identified in order to be

considered in a real environment. Therefore, it is necessary to consider suitable cryptographic

mechanisms to be used even on resource-constrained devices which enable an end-to-end

secure access control mechanism. This concept is applied to IoT environments and extended by

defining conditions which are locally verified on the constrained device. This feature enhances

the flexibility of DCapBAC since any parameter which is read by the smart object could be used

in the authorization process. DCapBAC will be part of the access control system and integrated

with a policy-based approach using XACML, in order to infer the access control privileges to

be embedded into the capability token.
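
The capability concept described above can be sketched as follows. This is an illustration, not the DCapBAC token format: all field names are hypothetical, and an HMAC with a shared demo key stands in for the ECC-based signatures suggested by [7], so that the example runs with the Python standard library alone. It shows the three checks a device would perform locally: token integrity, granted privileges, and conditions evaluated against values read by the smart object.

```python
import hashlib
import hmac
import json
import secrets
from typing import Any, Dict, List

DEMO_KEY = b"demo-issuer-key"  # assumption: stand-in for the issuer's signing key


def _mac(body: Dict[str, Any], key: bytes) -> str:
    """Compute a MAC over a canonical JSON encoding of the token body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def issue_token(subject: str, device: str, rights: List[Dict[str, str]],
                conditions: List[Dict[str, Any]], not_after: float,
                key: bytes = DEMO_KEY) -> Dict[str, Any]:
    """Build a token whose privileges would come from a policy decision (e.g. XACML)."""
    body = {
        "id": secrets.token_hex(8),  # unique token identifier
        "subject": subject,
        "device": device,
        "rights": rights,            # e.g. [{"action": "GET", "resource": "/temperature"}]
        "conditions": conditions,    # e.g. [{"parameter": "battery", "min": 20, "max": 100}]
        "not_after": not_after,
    }
    return {**body, "sig": _mac(body, key)}


def authorize(token: Dict[str, Any], action: str, resource: str,
              local_state: Dict[str, float], now: float,
              key: bytes = DEMO_KEY) -> bool:
    """Validate the token on the device itself, without contacting a third party."""
    body = {k: v for k, v in token.items() if k != "sig"}
    if not hmac.compare_digest(str(token.get("sig", "")), _mac(body, key)):
        return False  # token was modified or not issued with this key
    if now > body["not_after"]:
        return False  # token expired
    if not any(r["action"] == action and r["resource"] == resource
               for r in body["rights"]):
        return False  # requested privilege not granted
    for cond in body["conditions"]:
        value = local_state.get(cond["parameter"])
        if value is None or not (cond["min"] <= value <= cond["max"]):
            return False  # locally sensed condition not satisfied
    return True
```

In a deployment following [7], the MAC would be replaced by a public-key signature so that devices only need the issuer's public key; the validation flow would otherwise remain the same.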

Figure 15: DCapBAC approach for authorization application

6.3. Privacy Enabler

Privacy is also fundamental for IoTCrawler. For this reason, we need an enabler which

guarantees a secure distribution of information to a group of consumers. A promising technology

in this research field is Attribute-Based Encryption (ABE), and in particular Ciphertext-Policy

ABE (CP-ABE). Thanks to this cryptographic scheme, the IoTCrawler framework enables a secure

data sharing mechanism with groups of entities (i.e. communities and bubbles of smart objects)

in such a way that only legitimate consumers are able to decrypt the received information

(Figure 16).

Specifically, this component contains a set of Sharing Policies defining how the information is

disseminated according to contextual data. These policies are evaluated by the engine before

information is disseminated by the subject. The result of the evaluation of these policies is, in

turn, a CP-ABE policy, which is employed by the CP-ABE engine to encrypt the information with

that policy.

After the information is encrypted and disseminated, the enabler of a target entity (acting as

a consumer) will try to decrypt it with CP-ABE keys related to its identity attributes through its

CP-ABE engine.
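
The consumer side can be illustrated with a small sketch of the access structure only. In CP-ABE the check below is enforced cryptographically: decryption simply fails when the attributes bound to the consumer's keys do not satisfy the policy. The tuple-based policy representation and the function name `satisfies` are assumptions made for this example.

```python
from typing import Set, Tuple, Union

# A policy is either an attribute name or a tuple ("AND" | "OR", child, child, ...).
Policy = Union[str, Tuple]


def satisfies(policy: Policy, attributes: Set[str]) -> bool:
    """Return True if the given identity attributes satisfy the policy tree."""
    if isinstance(policy, str):
        return policy in attributes
    op, *children = policy
    results = [satisfies(child, attributes) for child in children]
    if op == "AND":
        return all(results)
    if op == "OR":
        return any(results)
    raise ValueError(f"unknown operator: {op!r}")


# Policy from the earlier example: "myfriends OR myfamily".
location_policy = ("OR", "myfriends", "myfamily")
```

A consumer whose identity attributes include `myfamily` would satisfy `location_policy`, while one holding only unrelated attributes would not be able to decrypt.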

Figure 16: Privacy requirement for IoTCrawler framework

7. CONCLUSIONS

In this deliverable we have outlined the overall IoTCrawler framework. For this purpose, we

have presented a diagram defining the different layers comprising this framework, their

entities, as well as the interactions among them. One remarkable aspect of the architecture of

this framework is the separation of the control and data planes. Indeed, our framework only

integrates the meta information that accompanies the real data, while the data itself remains

in its original context sources.

We have also identified in this architecture where the crawling task takes place. This task

requires a unified representation of information based on NGSI-LD, as well as an initial semantic

annotation which allows our framework to start its analysis, indexing and ranking processes.

Since security and privacy are fundamental to control the access to the

information in our framework, we have also identified in this deliverable the requirements

associated with these properties at different layers. First, an authorization enabler is

required to control the way producers add new information to our framework. Privacy is also

considered by using an attribute-based encryption mechanism which allows for a secure

broadcast of information. Finally, at the inter-domain level, we have also introduced

Distributed Ledger Technology, and more specifically the use of Smart Contracts and blockchain

for the definition of the agreement of global policies.

Thanks to the technologies presented throughout this document we are ready to develop this ambitious

framework so as to allow users to securely access IoT information on the Internet.

8. REFERENCES

[1] Context Information Management (CIM); NGSI-LD API. Version 0.9.2 ETSI, Sophia Antipolis December, 2018. Available https://docbox.etsi.org/ISG/CIM/Open

[2] Specification of the Identity Mixer Cryptographic Library. Version 2.3.40 IBM Research, Zurich January 30, 2013. Available http://www.zurich.ibm.com/idemix/

[3] Moses, T. eXtensible Access Control Markup Language (XACML) Version 2.0. OASIS Standard, Feb. 1, 2005. OASIS Open. Source: http://docs.oasis-open.org/xacml/2.0/access_control-xacml-2.0-core-spec-os.pdf; see also http://www.oasis-open.org/committees/tc_home.php.

[4] J. Bethencourt, A. Sahai, and B. Waters, “Ciphertext-policy attribute-based encryption,” in Security and Privacy, 2007. SP’07. IEEE Symposium on. IEEE, 2007, pp. 321-334.

[5] M. Naedele, An access control protocol for embedded devices, Industrial Informatics, 2006 IEEE International Conference on, IEEE, 2006, pp. 565-569.

[6] P.N. Mahalle, B. Anggorojati, N.R. Prasad, and R. Prasad, Identity driven capability based access control (ICAC) for the internet of things, Proceedings of the 6th IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), Bangalore, India, IEEE, December, 2012, pp. 49-54.

[7] J. L. Hernández-Ramos, A. J. Jara, L. Marín, and A. F. Skarmeta, “Dcapbac: Embedding authorization logic into smart things through ecc optimizations,” International Journal of Computer Mathematics, no. just-accepted, pp. 1-22, 2014.

[8] Karp, A. H., Haury, H., & Davis, M. H. (2010). From ABAC to ZBAC: the evolution of access control models. Journal of Information Warfare, 9(2), 38-46.

[9] S. Li, J. Hoebeke, F. Van den Abeele, and A. Jara, Conditional observe in CoAP, Constrained RESTful Environments (CoRE) Working Group, Internet Engineering Task Force (IETF), work in progress, draft-li-core-conditional-observe-04, June 2013. Available at http://tools.ietf.org/html/draft-li-core-conditional-observe-04.

[10] C. Jennings, J. Arkko, and Z. Shelby, Media types for sensor markup language (SENML), Network Working Group, Internet Engineering Task Force (IETF), work in progress, draft-jennings-senml-10, October 2012. Available at http://tools.ietf.org/html/draft-jennings-senml-10.
