Cloud Computing • Edge Computing • Data Analytics • Cyber-Physical Systems
JUNE 2019 www.computer.org

COMPSAC 2019: DATA DRIVEN INTELLIGENCE FOR A SMARTER WORLD
Hosted by Marquette University, Milwaukee, Wisconsin, USA, July 15–19. www.compsac.org
In the era of "big data," there has been an unprecedented increase in the amount of data collected in data warehouses. Extracting meaning and knowledge from these data is crucial for governments and businesses to support their strategic and tactical decision making. Furthermore, artificial intelligence (AI) and machine learning (ML) make it possible for machines, processing large amounts of such data, to learn and execute tasks never before accomplished. Advances in big data-related technologies are increasing rapidly. For example, virtual assistants, smart cars, and smart home devices in the emerging Internet of Things world can, we think, make our lives easier. But despite the perceived benefits of these technologies and methodologies, there are many challenges ahead. What will be the social, cultural, and economic challenges arising from these developments? What are the technical issues related, for example, to the privacy and security of data used by AI/ML systems? How might humans interact with, rely on, or even trust AI predictions or decisions emanating from these technologies? How can we prevent such data-driven intelligence from being used to make malicious decisions?
Authors are invited to submit original, unpublished research work, as well as industrial practice reports. Simultaneous submission to other publication venues is not permitted. All submissions must adhere to IEEE Publishing Policies, and all will be vetted through the IEEE CrossCheck portal. For full CFP and conference information, please visit the conference website at WWW.COMPSAC.ORG
IMPORTANT DATES
April 7, 2019: Paper notifications
April 15, 2019: Workshop papers due
May 1, 2019: Workshop paper notifications
May 17, 2019: Camera-ready submissions and advance author registration due
ORGANIZING COMMITTEE
General Chairs: Jean-Luc Gaudiot, University of California, Irvine, USA; Vladimir Getov, University of Westminster, UK
Program Chairs in Chief: Morris Chang, University of South Florida, USA; Stelvio Cimato, University of Milan, Italy; Nariyoshi Yamai, Tokyo University of Agriculture & Technology, Japan
Workshop Chairs: Hong Va Leong, Hong Kong Polytechnic University, Hong Kong; Yuuichi Teranishi, National Institute of Information and Communications Technology, Japan; Ji-Jiang Yang, Tsinghua University, China
Local Organizing Committee Chair: Praveen Madiraju, Marquette University, USA
Standing Committee Chair: Sorel Reisman, California State University, USA
Standing Committee Vice Chair: Sheikh Iqbal Ahamed, Marquette University, USA
STAFF
Editor: Cathy Martin
Publications Operations Project Specialist: Christine Anthony
Publications Marketing Project Specialist: Meghan O’Dell
Production & Design: Carmen Flores-Garvey
Publications Portfolio Managers: Carrie Clark, Kimberly Sperka
Publisher: Robin Baldwin
Senior Advertising Coordinator: Debbie Sims
Circulation: ComputingEdge (ISSN 2469-7087) is published monthly by the IEEE Computer Society. IEEE Headquarters, Three Park Avenue, 17th Floor, New York, NY 10016-5997; IEEE Computer Society Publications Office, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720; voice +1 714 821 8380; fax +1 714 821 4010; IEEE Computer Society Headquarters, 2001 L Street NW, Suite 700, Washington, DC 20036.
Postmaster: Send address changes to ComputingEdge-IEEE Membership Processing Dept., 445 Hoes Lane, Piscataway, NJ 08855. Periodicals Postage Paid at New York, New York, and at additional mailing offices. Printed in USA.
Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author’s or firm’s opinion. Inclusion in ComputingEdge does not necessarily constitute endorsement by the IEEE or the Computer Society. All submissions are subject to editing for style, clarity, and space.
Reuse Rights and Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their own Web servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copy-editing, proofreading, and formatting added by IEEE. For more information, please go to: http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html. Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141 or [email protected]. Copyright © 2019 IEEE. All rights reserved.
Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
Unsubscribe: If you no longer wish to receive this ComputingEdge mailing, please email IEEE Computer Society Customer Service at [email protected] and type “unsubscribe ComputingEdge” in your subject line.
IEEE prohibits discrimination, harassment, and bullying. For more information, visit www.ieee.org/web/aboutus/whatis/policies/p9-26.html.
IEEE COMPUTER SOCIETY computer.org • +1 714 821 8380
www.computer.org/computingedge 1
IEEE Computer Society Magazine Editors in Chief
Computer: David Alan Grier (Interim), Djaghe LLC
IEEE Software: Ipek Ozkaya, Software Engineering Institute
IEEE Internet Computing: George Pallis, University of Cyprus
IT Professional: Irena Bojanova, NIST
IEEE Security & Privacy: David Nicol, University of Illinois at Urbana-Champaign
IEEE Micro: Lizy Kurian John, University of Texas, Austin
IEEE Computer Graphics and Applications: Torsten Möller, University of Vienna
IEEE Pervasive Computing: Marc Langheinrich, University of Lugano
Computing in Science & Engineering: Jim X. Chen, George Mason University
IEEE Intelligent Systems: V.S. Subrahmanian, Dartmouth College
IEEE MultiMedia: Shu-Ching Chen, Florida International University
IEEE Annals of the History of Computing: Gerardo Con Diaz, University of California, Davis
JUNE 2019 • VOLUME 5, NUMBER 6
8 Towards AI-Powered Multiple Cloud Management
24 Error-Resilient Server Ecosystems for Edge and Cloud Datacenters
39 Multimodal Sentiment Analysis: Addressing Key Issues and Setting Up the Baselines
48 Computing in the Real World Is the Grandest of Challenges

Subscribe to ComputingEdge for free at www.computer.org/computingedge.
Cloud Computing
8 Towards AI-Powered Multiple Cloud Management
BENIAMINO DI MARTINO, ANTONIO ESPOSITO, AND ERNESTO DAMIANI
16 Serverless Computing: From Planet Mars to the Cloud JOSÉ LUIS VÁZQUEZ-POLETTI AND IGNACIO MARTÍN LLORENTE
Edge Computing
24 Error-Resilient Server Ecosystems for Edge and Cloud Datacenters
GEORGIOS KARAKONSTANTIS, DIMITRIOS S. NIKOLOPOULOS, DIMITRIS GIZOPOULOS, PEDRO TRANCOSO, YIANNAKIS SAZEIDES, CHRISTOS D. ANTONOPOULOS, SRIKUMAR VENUGOPAL, AND SHIDHARTHA DAS
28 A Serverless Real-Time Data Analytics Platform for Edge Computing
STEFAN NASTIC, THOMAS RAUSCH, OGNJEN SCEKIC, SCHAHRAM DUSTDAR, MARJAN GUSEV, BOJANA KOTESKA, MAGDALENA KOSTOSKA, BORO JAKIMOVSKI, SASKO RISTOV, AND RADU PRODAN
Data Analytics
36 The Unreasonable Effectiveness of Software Analytics TIM MENZIES
39 Multimodal Sentiment Analysis: Addressing Key Issues and Setting Up the Baselines
SOUJANYA PORIA, NAVONIL MAJUMDER, DEVAMANYU HAZARIKA, ERIK CAMBRIA, ALEXANDER GELBUKH, AND AMIR HUSSAIN
Cyber-Physical Systems
48 Computing in the Real World Is the Grandest of Challenges
MARILYN WOLF
51 Society 5.0: For Human Security and Well-Being
YOSHIHIRO SHIROISHI, KUNIO UCHIYAMA, AND NORIHIRO SUZUKI
Departments
4 Magazine Roundup
7 Editor’s Note: Trends in Cloud Computing
72 Conference Calendar
4 June 2019 Published by the IEEE Computer Society 2469-7087/19/$33.00 © 2019 IEEE
CS FOCUS

MAGAZINE ROUNDUP
The IEEE Computer Society’s lineup of 12 peer-reviewed technical magazines covers cutting-edge topics ranging from software design and computer graphics to Internet computing and security, from scientific applications and machine intelligence to visualization and microchip design. Here are highlights from recent issues.
Computer
Enabling Human-Centric Smart Cities: Crowdsourcing-Based Practice in China
Existing smart city systems are designed mainly for urban management from a governmental perspective, but they fail to cover the wide spectrum of citizens’ daily lives. The authors of this article from the December 2018 issue of Computer propose a crowdsourcing-based platform through which multiple players can collaborate to produce more abundant, personalized, and proactive services.
Computing in Science & Engineering
Mixed Precision: A Strategy for New Science Opportunities
Since the days of vector supercomputers, computational scientists have relied on high-precision arithmetic to accurately solve problems. But changes to hardware, spurred by the demand for more computing capability and growth in machine learning, have researchers considering lower-precision alternatives. Read more in the November/December 2018 issue of Computing in Science & Engineering.
IEEE Annals of the History of Computing
How Modeless Editing Came To Be
Several word processing and desktop publishing innovations grew out of 1970s research by Larry Tesler and his collaborators at Stanford and Xerox PARC. One prototype was PUB, a markup language for print publishing. Another was Gypsy, a text editor with a cut/copy/paste command suite that became the de facto standard in desktop publishing and most other applications. Read more in the July–September 2018 issue of IEEE Annals of the History of Computing.
IEEE Computer Graphics and Applications
CellPAINT: Interactive Illustration of Dynamic Mesoscale Cellular Environments
CellPAINT allows non-expert users to create interactive mesoscale illustrations that integrate a variety of biological data. Like popular digital painting software, scenes are created using a palette of molecular “brushes.” The authors of this article from the November/December 2018 issue of IEEE Computer Graphics and Applications describe how the current release allows creation of animated scenes with an HIV virion, blood plasma, and a simplified T-cell.
IEEE Intelligent Systems
Autonomous Nuclear Waste Management
This article from the November/December 2018 issue of IEEE Intelligent Systems focuses on the design, development, and demonstration of a reconfigurable rational agent-based robotic system that aims to highly automate nuclear waste management processes. The proposed system is being demonstrated through a down-sized, lab-based setup incorporating a small-scale robotic arm, a time-of-flight camera, and a high-level rational agent-based decision making and control framework.
IEEE Internet Computing
Network Neutrality Is About Money, Not Packets
The topic of network neutrality has occupied engineers, economists, and telecommunication lawyers since at least 2003. Network engineers often largely perceive this as yet another debate about quality-of-service, but this constitutes probably the least interesting part of the problem and has served more often to distract from other, more important, issues. Almost all network neutrality problems are rooted in economic incentives, and thus are likely to require approaches related to pricing and enhancing competition. Read more in the November/December 2018 issue of IEEE Internet Computing.
IEEE Micro
Performance Assessment of Emerging Memories through FPGA Emulation
Emerging memory technologies offer the prospect of large capacity, high bandwidth, and a range of access latencies ranging from DRAM-like to SSD-like. In this article from the January/February 2019 issue of IEEE Micro, the authors evaluate the performance of parallel applications on CPUs whose main memories sweep a wide range of latencies within a bandwidth cap. The article highlights the performance impact of higher latency on concurrent applications and identifies conditions under which future high-latency memories can effectively be used as main memory.
IEEE MultiMedia
Clustering of Musical Pieces through Complex Networks: An Assessment over Guitar Solos
Musical pieces can be modeled as complex networks. This fosters innovative ways to categorize music, paving the way toward novel applications in multimedia domains, such as music didactics, multimedia entertainment, and digital music generation. Clustering these networks through their main metrics allows for grouping similar musical tracks. To show the viability of the approach, the authors of this article from the October–December 2018 issue of IEEE MultiMedia provide results on a dataset of guitar solos.
IEEE Pervasive Computing
Invisible, Inaudible, and Impalpable: Users’ Preferences and Memory Performance for Digital Content in Thin Air
The authors of this article from the October–December 2018 issue of IEEE Pervasive Computing address novel interfaces that enable users to access digital content “in thin air” in the context of smart spaces that superimpose digital layers upon the topography of the physical environment. The authors collect and evaluate users’ preferences for pinning digital content in thin air. They also set guidelines for practitioners to assist the design of novel user interfaces that implement digital content superimposed on the physical space.
IEEE Security & Privacy
The Need for Speed: An Analysis of Brazilian Malware Classifiers
Using a dataset containing about 50,000 samples from Brazilian cyberspace, the authors of this article from the November/December 2018 issue of IEEE Security & Privacy show that relying solely on conventional machine-learning systems without taking into account the change of the subject’s concept decreases the performance of classification. They emphasize the need to update the decision model immediately after concept drift occurs.
IEEE Software
Relationships between Project Size, Agile Practices, and Successful Software Development: Results and Analysis
Large-scale software development succeeds more often when using agile methods. Flexible scope, frequent deliveries to production, a high degree of requirement changes, and more competent providers are possible reasons. Read more in the March/April 2019 issue of IEEE Software.
IT Professional
Lightweight Access Control System for Wearable Devices
Wearable devices are being used in health and military environments where, due to information sensitivity, it is necessary to be able to control the way information is handled by the wearable device. However, current security solutions for wearable devices focus mostly on protecting information access from unauthorized parties. With this in mind, the authors of this article from the January/February 2019 issue of IT Professional propose a wearable device access control system that, in addition to protecting the information from unauthorized access, allows for defining and enforcing a set of restrictions about how the information should be handled.
From the analytical engine to the supercomputer, from Pascal to von Neumann, IEEE Annals of the History of Computing covers the breadth of computer history. The quarterly publication is an active center for the collection and dissemination of information on historical projects and organizations, oral history activities, and international conferences.
www.computer.org/annals
EDITOR’S NOTE

Trends in Cloud Computing

The adoption of cloud computing has grown enormously in the past decade, with the vast majority of organizations now relying on the cloud. As cloud computing continues to evolve in 2019, new trends are emerging. This issue of ComputingEdge discusses some of the biggest trends in cloud computing: multicloud, artificial intelligence (AI), and serverless.

More people are choosing to use multiple cloud-computing services in one heterogeneous architecture (multicloud) to improve flexibility and cut costs. In IEEE Internet Computing’s “Towards AI-Powered Multiple Cloud Management,” the authors describe how AI supports automated management across clouds. Many people are also moving toward serverless computing, where they pay cloud providers for only the resources they consume. Computing in Science & Engineering’s “Serverless Computing: From Planet Mars to the Cloud” presents a serverless architecture for scientific data analysis.

Cloud computing is often used alongside edge computing in Internet of Things (IoT) applications. Computer’s “Error-Resilient Server Ecosystems for Edge and Cloud Datacenters” argues that more efficient and resilient hardware and software server stacks are needed for IoT cloud and edge computing. “A Serverless Real-Time Data Analytics Platform for Edge Computing,” from IEEE Internet Computing, shows how data from IoT devices can be better analyzed with serverless computing.

Serverless computing isn’t the only new idea in data analytics. IEEE Software’s “The Unreasonable Effectiveness of Software Analytics” examines why software analytics can predict software project behavior. IEEE Intelligent Systems’ “Multimodal Sentiment Analysis: Addressing Key Issues and Setting Up the Baselines” dives into how deep learning is improving multimodal sentiment classification.

Finally, this ComputingEdge issue covers cyber-physical systems (CPS), which integrate computation and physical components. Computer’s “Computing in the Real World Is the Grandest of Challenges” emphasizes the importance of CPS in our modern world. Another article from Computer, “Society 5.0: For Human Security and Well-Being,” describes a Japanese initiative aimed at making society more sustainable and prosperous through CPS.
Towards AI-Powered Multiple Cloud Management

Beniamino Di Martino, University of Campania “Luigi Vanvitelli”
Antonio Esposito, University of Campania “Luigi Vanvitelli”
Ernesto Damiani, Khalifa University and University of Milan
Abstract—Cloud users worldwide are looking at the next generation of artificial intelligence (AI) powered cloud management tools to automate cloud performance tuning and anomaly detection. To be effective across clouds, AI tools need a common representation of cloud services and support for machine learning optimization targeting multiple objectives. We put forward the notion that ontology-based models can support both.
View from the Cloud

IEEE Internet Computing • January/February 2019
Digital Object Identifier 10.1109/MIC.2018.2883839
Date of current version: 6 March 2019
1089-7801 © 2019 IEEE • Published by the IEEE Computer Society

CLOUD COMPUTING’S SERVICE model, based on elastic on-demand allocation of virtual resources, turned out to be suitable for supporting data-intensive applications such as artificial intelligence (AI) pipelines, which exploit cloud scaling to perform large-volume data ingestion, preparation, model training, and inference. Today, many companies worldwide rely on the cloud for large AI workloads, making cloud management and control a key issue for AI pipelines. Experience has shown that different stages of an AI pipeline may have diverse nonfunctional requirements (e.g., different data confidentiality levels); for this reason, more and more organizations adopt edge-cloud or multicloud deployment strategies, deploying each pipeline stage on a different public or private cloud. Multicloud deployment promises to prevent provider lock-in, take advantage of dynamic resource pricing at run-time, and secure the content exchanged or stored on the multicloud.

Early approaches to multicloud deployment were mostly programmatic: They consisted of multicloud libraries, which allowed run-time mapping of computations to the resources of multiple cloud providers. However, programmatic control of multicloud deployment hard-codes deployment decisions in scripts, which may lead to lack of flexibility and, ultimately, to inefficiency and loss of control. Some recent models of multicloud services, such as MANTUS [9], support decoupling of the architecture model and the cloud resources used. This separation allows users to apply dynamic reconfiguration to tune the resources used on each cloud according to performance and cost targets. Still, reconfiguration procedures are mostly manually coded, and little support is available for automated management across clouds. Our own research efforts focused on cloud architecture representations that rely on a service ontology for defining their entities [9]. Using the Web Ontology Language for Services (OWL-S) standard to model cloud resources, services, and patterns [5] provides a high-level framework that can be used not only for architecture description and maintenance but also, through automated reasoning, for multiple cloud service discovery and brokering, and for architecture-agnostic application development and deployment.

The approach can be extremely useful when dealing with applications that can be deployed in a multicloud environment. Consider a common business intelligence application in which an extraction, transformation, and loading (ETL) process retrieves data from a database management system (DBMS), a customer relationship management (CRM) system, and an enterprise resource planning (ERP) system, and preprocesses the data for further analysis after storing them in a data warehouse. The CRM and ERP components use data coming from their own databases, while the data recorded in the data warehouse are used by the OLAP system and by the data mining component to perform market analysis. Such an architecture raises a series of concerns, first of all regarding interoperability, since the application’s components need to interact to achieve the business goal. If each of these components were hosted by a different cloud provider, because of the specific functionalities it offers or for economic reasons, one could be concerned about the real ability of the components to communicate, owing to differences in the communication interfaces provided by the providers. A semantic representation, such as the one presented in this paper, can ease these communication difficulties and enable interoperability.
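To make the interoperability idea concrete, here is a minimal sketch (not the paper's actual framework) of how annotating provider-specific operations with shared ontology terms lets a broker match components despite differing native APIs. All class names, operation names, and ontology terms below are hypothetical, invented for illustration.

```python
# Hypothetical sketch: provider-specific operations are annotated with terms
# from a shared vocabulary, so a broker can match capabilities regardless of
# each provider's native interface names.

ONTOLOGY_TERMS = {"StoreRecord", "QueryRecords"}  # shared vocabulary (invented)

class Operation:
    def __init__(self, native_name, ontology_term):
        assert ontology_term in ONTOLOGY_TERMS
        self.native_name = native_name      # provider-specific API name
        self.ontology_term = ontology_term  # common semantic annotation

# Two providers expose the "same" capability under different interfaces.
crm_ops = [Operation("putCustomer", "StoreRecord"),
           Operation("findCustomer", "QueryRecords")]
warehouse_ops = [Operation("ingest_row", "StoreRecord"),
                 Operation("run_query", "QueryRecords")]

def match(required_term, operations):
    """Return native operation names whose annotation satisfies the need."""
    return [op.native_name for op in operations
            if op.ontology_term == required_term]

# The ETL stage needs to store records in the warehouse, whatever the API:
print(match("StoreRecord", warehouse_ops))  # ['ingest_row']
```

The matching happens entirely at the level of the shared terms, which is the essence of how a common semantics bridges heterogeneous provider interfaces.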
Such capabilities have been demonstrated through several industrial case studies within the FP7 mOSAIC project [10]. The feasibility of ontology-based models for representing the computation of AI pipelines has likewise been explored in several industrial case studies within the H2020 TOREADOR project [2].
SEMANTIC REPRESENTATION OF CLOUD PATTERNS AND SERVICES

The semantic representation reported in Figure 1 has been developed to ease the portability and interoperability issues that may arise when either composing multiple cloud services or migrating data and applications from one platform to another [8]. The representation, of which we report an overview, is constituted by a multilayered stack of conceptual models, connected to one another in a graph-like structure but still independent. The conceptual models are divided into the following layers.

Figure 1. Semantic representation of cloud patterns and services.

- The Application Pattern layer describes general applications and their components, with no specific connection to an implementing technology or platform. The solutions represented by such patterns can in principle be applied to different scenarios, independently of the platforms, programming languages, or technologies involved.
- The Cloud Pattern layer provides the description of cloud-based solutions to be applied when implementing applications. The semantic representation helps not only to efficiently categorize them, but also to identify connections, similarities, and related application scenarios. As many cloud patterns can be combined to form more complex ones, the representation also supports pattern composition. Cloud patterns’ components are represented by cloud services, which are described in the lower layer.
- The Cloud Service layer focuses on the description of cloud services and the functionalities they expose. The objective of this layer is to provide a common semantics for the description of services exposed by cloud platforms so that comparisons can be made at both functional and operational levels.
- The Operation layer describes functionalities exposed by services and not referring to the cloud, including resources exposed through the web.
- The Ground layer represents the connection between the abstract description of input/output parameters and operations exposed by services, and their actual implementation. While the WSDL standard is the native format used for this purpose, any other standard way to represent the invocation interface of cloud or web services can be used.
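The five-layer stack above can be sketched as a small object model. This is only an illustrative analogy: the actual representation is expressed in OWL/OWL-S, not Python classes, and every name and URL below is invented.

```python
from dataclasses import dataclass, field

# Hypothetical object model mirroring the five-layer stack: an application
# pattern is realized by cloud patterns, which compose cloud services, which
# expose operations, each grounded in a concrete invocation interface.

@dataclass
class Grounding:            # Ground layer: concrete invocation interface
    wsdl_url: str

@dataclass
class Operation:            # Operation layer: one exposed functionality
    name: str
    grounding: Grounding

@dataclass
class CloudService:         # Cloud Service layer
    name: str
    operations: list = field(default_factory=list)

@dataclass
class CloudPattern:         # Cloud Pattern layer: composed of cloud services
    name: str
    services: list = field(default_factory=list)

@dataclass
class ApplicationPattern:   # Application Pattern layer: technology-agnostic
    name: str
    realized_by: list = field(default_factory=list)  # candidate cloud patterns

store = CloudService("ObjectStorage",
                     [Operation("put", Grounding("https://example.org/put.wsdl"))])
pattern = CloudPattern("StaticContentHosting", [store])
app = ApplicationPattern("ContentDelivery", [pattern])
print(app.realized_by[0].services[0].name)  # ObjectStorage
```

The point of the sketch is the direction of the links: each layer refers only to the layer below it, which is what keeps the application description independent of any one provider's grounding.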
Different technologies have been employed to implement the layers, but they are all based on the standard OWL language for ontology description. The application and cloud pattern layers have been implemented using a combination of ODOL [3], a semantic representation of design patterns that has been augmented and adapted to describe the static entities of patterns, and OWL-S, an ontology designed to describe web services, which in our case has been exploited to define the dynamic behavior of patterns. OWL-S has also been used to describe the service and operation layers, for which it provides native support. Figure 2 portrays the graphical representation of the semantic description of patterns and services that we have briefly introduced. In particular, Figure 2(a) shows the whole representation, with all the classes and connections existing among its components, while Figure 2(b) zooms in on the pattern class and its direct connections.

All layers are supported by a background ontology, which provides a common set of terms to describe each component of the described patterns, services, and operations, and which will be further described in the following sections.
Figure 2. Graphical representation of the semantic multilayered description of patterns and services.

NONFUNCTIONAL PROPERTIES: EXAMPLE FOR LEGAL CONSTRAINTS

While functional composition of cloud services is by itself a complex, yet interesting, matter, there are nonfunctional aspects that should be taken into consideration when selecting a specific service to fulfill user needs. The semantic description presented in the previous section covers the functional requirements of applications and cloud services, making it possible to select and compose them according to the required functionalities. However, it does not take into consideration nonfunctional requirements, and legal constraints regarding the use of data in a specific environment or country are a strong limitation when selecting a cloud service. The framework [6], of which we report an overview here, provides a user-oriented tool that is able to check the compliance of cloud services offered by different cloud platforms with respect to Italian regulations. A simplified graphical interface allows the user to interact with the framework, enabling her to specify the application’s requirements. More specifically, users can perform the following tasks.
- Select the nature of the data to be treated from a set of predefined categories (sensitive, health, judicial, or not subject to data protection).
- Decide the scope and aim of the data treatment by selecting one of the available categories (scientific, statistical, historical, or generic).
- Specify the cloud service provider she wants to use for her resources, and optionally narrow down the selection of available services offered by that provider according to the requirements of her application.
- Decide the location of the data center, choosing among the possible locations offered by the selected provider.
The framework was developed around the Italian legislation and, in particular, is based on a formalization of Italian Legislative Decree 196/2003 and the Italian Code for Digital Administration. It can be used in different ways. The basic usage allows the user to establish the compliance of a specific service, running in a certain location or exploiting a specific data center, with the regulations. Of course, the kind of data used and the purpose of their processing should be known beforehand. Conversely, the user can discover the kinds of data processing that are allowed for a certain combination of cloud provider and data-center location.
The knowledge base exploited by the framework contains two main information sets:

- the semantic rules derived from the legislation;
- the semantic description of the terms of service of the providers’ cloud services (such as Amazon, Microsoft, and IBM).

The first set of information has been obtained by analyzing, via natural language processing (NLP) techniques, the reference laws on privacy. The NLP analysis translated prescriptive sentences into logical rules, while ontologies have been used to describe the terms of service exposed by the cloud providers. Figure 3 shows the main components of the framework and how they are related.
There are two main components: the back end, composed of the Ontology Cache, the OWL Parser, and the SWIPL Facade; and the front end, represented by the user interface implemented in HTML. The OWL Parser extracts information from ontologies coded in OWL and converts it into Prolog facts, which are then queried against the rules by the SWIPL Facade component.
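The facts-plus-rules pipeline can be sketched in a few lines of Python as a toy stand-in for the OWL Parser and SWIPL Facade: service terms become ground facts, and a hand-written rule (standing in for the rules derived from legislation via NLP) is evaluated against them. All predicates, service names, and the rule itself are invented for illustration and do not reflect the real legislative rules.

```python
# Toy stand-in for the OWL Parser + SWIPL Facade pipeline. In the real
# framework, facts come from parsed OWL ontologies and rules from the
# NLP-analyzed legislation; here both are hypothetical.

facts = {
    ("service", "S3-Frankfurt"),
    ("datacenter_in_eu", "S3-Frankfurt"),
    ("encrypts_at_rest", "S3-Frankfurt"),
    ("service", "Obj-US"),
}

def compliant_for_sensitive_data(service, facts):
    """Invented example rule: sensitive data may only go to an EU data
    center that encrypts data at rest."""
    return (("datacenter_in_eu", service) in facts and
            ("encrypts_at_rest", service) in facts)

def allowed_services(facts):
    """Discovery mode: list the services satisfying the rule, mirroring
    the framework's 'which processing is allowed where' usage."""
    return sorted(s for (p, s) in facts
                  if p == "service" and compliant_for_sensitive_data(s, facts))

print(allowed_services(facts))  # ['S3-Frankfurt']
```

The two query directions described above map onto the two functions: checking one named service corresponds to the basic compliance check, while scanning all services corresponds to the discovery mode.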
Figure 3. Architecture of the legislation-aware framework.

CLOUD ONTOLOGY

An ontology is needed to provide a common background terminology for all the semantic layers of the cloud representation. The cloud ontology [7] describes an approach that aims at providing such an ontology, via a description of
23mic01-dimartino-2883839.3d (Style 5) 05-04-2019 15:24
they expose. The objective of this layer is to
provide a common semantics for the descrip-
tion of services exposed by cloud platforms
so that comparisons can be done both at func-
tional and operational levels.� The Operation layer describes functionalities
exposed by services and not referring to the
cloud, including resources exposed through
the web.� The Ground layer represents the connec-
tion between the abstract description of
input/output parameters and operations
exposed by services, and their actual
implementation. While the WSDL standard
is the native format used for this purpose,
any other standard way to represent the
invocation interface of cloud or web serv-
ices can be used.
Different technologies have been employed
to implement the layers, but they are all based
on the standard OWL language for ontology
description. The application and cloud patterns
layers have been implemented using a combina-
tion of ODOL,3 a semantic representation of
design patterns which has been augmented and
adapted to describe the static entities of pat-
terns, and OWL-S, an ontology designed to
describe web services which, in our case, has
been exploited to define the dynamic behavior
of patterns. OWL-S has also been used to
describe the services and operations layers,
for which it provides native support. Figure 2
portrays the graphical representation of the
semantic description of patterns and services
that we have briefly introduced. In particular,
Figure 2(a) shows the whole representation,
with all the classes and connections existing
among its components, while Figure 2(b)
zooms in on the pattern class and its direct
connections.
All layers are supported by a background
ontology, which provides a common set of
terms to describe each component of the
described patterns, services, and operations,
and which will be further described in the fol-
lowing sections.
NONFUNCTIONAL PROPERTIES: EXAMPLE FOR LEGAL CONSTRAINTS
While functional composition of cloud serv-
ices is by itself a complex, yet interesting, mat-
ter, there are nonfunctional aspects that should
be taken into consideration when selecting a specific service to fulfill user needs. The semantic
description presented in the previous section
covers functional requirements of applications
and cloud services, making it possible to select
and compose them according to the required
functionalities. However, it does not take into consideration nonfunctional requirements, and legal
constraints regarding the use of data in a specific
environment or country are a strong limitation
when selecting a cloud service. The frame-
work,6 of which we report an overview here,
Figure 2. Graphical representation of the semantic multilayered description of patterns and services.
View from the Cloud
66 IEEE Internet Computing
12 ComputingEdge June 2019
cloud services based on well-known semantic
technologies, such as OWL and OWL-S. Here, we
provide an overview of such an ontology, as it supplies the set of semantic definitions that the ontology descriptions in the previous sections require to be effective.
The whole ontology consists of a set of interrelated subontologies, connected
through ad-hoc links and properties, which
define a common representation base for
existing cloud services, together with the
operations they expose and the parameters
they exchange.
Three main layers compose the cloud
ontology.
• The upper layer contains the Agnostic Service Description Ontology, which provides a
common terminology to describe cloud
services, resources, operations, and
parameters. Through this ontology, it is
possible to annotate cloud entities by
using a general and shareable catalog of
concepts, which enables their discovery
and comparison.
• The central layer is represented by the Cloud
Services Categorization Ontology, which offers
a categorization of cloud services and virtual
appliances. The categories are based
on the specific functionalities the different
services and appliances offer, as declared
by their respective vendors. By importing the
upper ontology, references to platform-specific services and resources, which are
organized according to the proposed
categorization, can be directly related to
Agnostic descriptions to enable comparisons.
• The bottom layer is in turn composed of two
different groups of ontologies, describing
proprietary specific services, operations,
and parameters. In particular, the Cloud Pro-
vider Ontology set defines proprietary con-
cepts that describe a specific cloud
provider’s offer: a single and independent
ontology exists for each cloud provider,
which can be added or removed independently, in a modular fashion. The purpose of this set of ontologies is to describe proprietary
services, resources, and related operations
with the providers’ specific concepts, which
can be categorized against the Cloud Serv-
ices Categorization Ontology and further
annotated through the Agnostic Service
Description Ontology.
On the other hand, the Cloud Services OWL-S
Description set describes the services offered by the different providers in terms of their internal workflow and grounding; in this
way, the ontology provides the information
needed to automatically instantiate such serv-
ices. Figure 4 portrays the aforementioned ontol-
ogy, highlighting the three different layers and
their connections.
SCOPE: A CLOUD SERVICES COMPOSITION TOOL
The use and management of the semantic
representations proposed in the previous sections require knowledge of the ontology languages used to describe them, of logical languages,
such as Prolog or SWRL, and of query languages,
such as SPARQL, to interact with them. However,
it is not realistic to assume that all users know how to build such queries, or that they are familiar with the semantic technologies backing them; that is why the graphical tool SCOPE was created.2
services and patterns just by dragging and drop-
ping boxes and arrows, while the system sug-
gests which connections to establish or services
to use.
The architecture of the cloud composer consists
of three main components.
Figure 4. Architecture of the cloud ontology.
• The Parser, which takes as input OWL plus OWL-S files describing the desired services, patterns, and internal orchestration, and automatically generates an internal representation. Such a representation is used to keep track of the modifications made by the users before actually saving them into the ontologies.
• A Graph model, which is the internal representation for cloud patterns and services.
Such a model is implemented as a Java data
structure.
• The GUI, which allows visualization, creation,
and modification of all elements defined in
the graph model.
The tool is organized so that the users do not
deal directly with the ontologies describing serv-
ices and patterns, nor with the internal repre-
sentation. The necessary SPARQL queries are
dynamically produced and the ontologies are
modified accordingly using the Jena OWL Web
Ontology Language API. Such an API is at the core
of the Parser component: through Jena, the tool
automatically analyzes the services and patterns
ontologies, and then a graph model is produced
following the semantic-based representation.
The graph model has been implemented in Java: vertices of the graph represent both cloud services and pattern participants, while edges represent both calls between services and relations between pattern participants. Using such a representation, the tool can create a graph that simultaneously describes both a pattern’s structure and workflow.
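An illustrative sketch of such a graph model follows (in Python here, although the tool's model is a Java data structure; the vertex and edge names are ours):

```python
# Sketch of a graph model in the spirit described above: one graph whose
# vertices are services or pattern participants and whose labeled edges
# capture both service calls and pattern relations.
from dataclasses import dataclass, field

@dataclass
class CloudGraph:
    vertices: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (src, dst, kind) tuples

    def add_vertex(self, name, kind):
        # kind distinguishes a cloud service from a pattern participant
        self.vertices.add((name, kind))

    def add_edge(self, src, dst, kind):
        # kind is either a call between services or a pattern relation
        self.edges.append((src, dst, kind))

g = CloudGraph()
g.add_vertex("LoadBalancer", "participant")
g.add_vertex("WebServer", "service")
g.add_edge("LoadBalancer", "WebServer", "calls")
g.add_edge("LoadBalancer", "WebServer", "pattern-relation")
```

Keeping both edge kinds in the same structure is what lets a single graph describe structure and workflow at once.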
MACHINE LEARNING APPROACH TO CLOUD MANAGEMENT
Public cloud providers’ ability to offer
affordable machine learning (ML) services has
triggered interest in using ML for cloud manage-
ment as well. ML techniques analyze patterns
in the cloud entities’ parameters (especially
numerical ones) to classify, predict, and opti-
mize cloud workloads. However, their seamless
application across different clouds requires a
shared representation of such entities. To dis-
cuss how the above ontology-based models for
representing cloud resources and computations
can support ML-driven optimization, let us con-
sider a simple example: an agent using an ML
model for allocating VMs in a multicloud sce-
nario including a public and a private cloud.
To allow the agent to position new VMs, we
could set up an ML classifier targeting categories
{public, private}. The simplest ML model for
doing so is probably a Voronoi (1-NN) model
that represents each VM as a numerical resource
usage vector V = (CPU, RAM, disk) and decides
where to allocate it based on a set of “correct”
allocations, in turn based on experience. The
implementation of our model would periodically
monitor current CPU, disk, and memory usage of
VMs and decide their allocation choosing
between public and private based on the alloca-
tion of the closest VM in the reference set.
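A minimal sketch of such a 1-NN allocator could look like the following; the two-entry reference set and all usage values are invented for illustration:

```python
# Sketch of the 1-NN (Voronoi) allocator described above: each VM is a
# (CPU, RAM, disk) usage vector, and a new VM is placed in the cloud of
# its nearest neighbor in a reference set of "correct" past allocations.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Reference set: usage vector -> {public, private}, based on experience.
reference = [
    ((0.9, 0.8, 0.7), "public"),   # heavy VMs burst to the public cloud
    ((0.2, 0.1, 0.3), "private"),  # light VMs stay on premises
]

def allocate(vm, reference, distance=euclidean):
    _, label = min(reference, key=lambda entry: distance(vm, entry[0]))
    return label

print(allocate((0.8, 0.9, 0.6), reference))  # public
print(allocate((0.1, 0.2, 0.2), reference))  # private
```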
The behavior of our ML model clearly
depends on the quality of the reference set,
which needs to represent the best workload dis-
tribution for the specific environment. Perhaps
more importantly, model behavior also
depends on the notion of “closeness,” i.e., on
the definition of distance used. Euclidean dis-
tance between raw vectors is the obvious
choice, but it is not necessarily the distance a
human engineer would want to consider when
writing a configuration cron job for the same
purpose. Using Euclidean distance, two VMs
exhibiting coherent behavior in their usage vec-
tors (all components of one go “up” when the
corresponding ones of the other also go “up” by the same amount, only with respect to a different baseline) could be farther apart from each
other than another pair whose loads are uncor-
related but have the same average and vari-
ance. Thus, these two VMs could be matched to
different reference points by the 1-NN model, and
different allocation decisions could result, while
the human engineer would make the same alloca-
tion decision for both, disregarding the baseline
difference.
The standard data science solution to this
problem is to recenter and rescale the data, sub-
tracting from each component the average and
dividing by the standard deviation computed
across all VMs. However, this adds to the
computational load of the model’s execution,
which uses cloud resources that cannot be billed
to customers.
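The recenter-and-rescale step can be sketched as plain per-component z-scoring (the VM usage vectors below are invented):

```python
# Sketch of the standard recenter-and-rescale step mentioned above: each
# component is shifted by its mean and divided by its standard deviation,
# both computed across all monitored VMs.
import statistics

def standardize(vectors):
    """Z-score each column of a list of (CPU, RAM, disk) usage vectors."""
    columns = list(zip(*vectors))
    means = [statistics.mean(c) for c in columns]
    stdevs = [statistics.pstdev(c) for c in columns]
    return [
        tuple((x - m) / s if s else 0.0 for x, m, s in zip(v, means, stdevs))
        for v in vectors
    ]

vms = [(0.2, 0.4, 0.6), (0.4, 0.6, 0.8)]
print(standardize(vms))
```

After this transformation the two VMs above differ only in sign per component, which is exactly why coherent-but-offset workloads end up close together.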
Another reason why Euclidean distance may
not work as expected is related to the model’s
goal. Indeed, human-perceived distances related
to nonfunctional properties such as fairness or
confidentiality are often not isotropic, i.e., they
do not treat all features equally (like Euclidean
distance does). In terms of our example, if
we want to use our model to achieve allocation
fairness (rather than optimize our load), then
disk and RAM usage should not compensate
each other.
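One simple way to realize such a non-isotropic distance is to weight each feature before applying the Euclidean formula; the weight values below are illustrative, not from the article:

```python
# Hedged sketch of a non-isotropic distance: per-feature weights keep,
# say, disk and RAM differences from compensating each other when the
# goal is fairness rather than load optimization.
import math

def weighted_distance(a, b, weights):
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

vm_a = (0.5, 0.9, 0.1)   # (CPU, RAM, disk)
vm_b = (0.5, 0.1, 0.9)

# Isotropic: RAM and disk differences contribute equally.
iso = weighted_distance(vm_a, vm_b, (1, 1, 1))
# Fairness-oriented: weight RAM far more heavily than disk.
fair = weighted_distance(vm_a, vm_b, (1, 10, 0.1))
print(iso, fair)
```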
In summary, we have seen that the behavior and execution cost of the ML model
depend on a hidden discretionary parameter
(the distance used) that may not be uniform
across clouds, impairing the use of ML models in
multicloud environments.
Distance Based on Equivalence Relations
Let us now take a step toward explicit
representation of this parameter by refining
our problem statement. Instead of treating our
VMs’ features as vectors of numbers, we
model them as elements of an equivalence
class S, defined by some relation Fs. In terms
of our example’s 1-NN model, the reference
set will contain one representative for each
equivalence class. Two VMs fall in the same class when the value of Fs computed on their representations is the same. Then, we
will be able to assign distance between clas-
ses using a simple lookup table. The lookup
table represents explicitly the metric struc-
ture we want to give to the resource space
and can be published and shared. Also, we
can define multiple tables corresponding to
different goals. Not surprisingly, tabled distan-
ces can be learnt,11 or computed automati-
cally by other AI approaches: methods to do
so include the classic isometric feature map-
ping procedure, or isomap, which can reliably
recover low-dimensional structure in percep-
tual datasets. Another well-studied technique
more suitable for stochastic usage data is t-SNE by van der Maaten and Hinton.5
ML-FRIENDLY REPRESENTATIONS OF THE CLOUD RESOURCES SPACE
The above discussion suggests that seman-
tic representations of cloud resources should
be complemented with an ontological repre-
sentation of the distance to be used in the
resources’ metrics space, which in turn will
support ML models in achieving the specific objective to be attained in managing the multicloud. The straightforward way to do so is to
augment the ontology with the representation
of the equivalence relations and of the associ-
ated distance tables. In general, for each cloud,
the AI agent will:
1. choose the equivalence relation that
expresses one or more of its nonfunctional
objectives;
2. identify the corresponding lookup table on
the equivalence classes of resources;
3. use the table to build and tune the corre-
sponding ML model.
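The three steps above can be sketched as follows; the equivalence relation, the lookup table, and the reference set are all invented for illustration:

```python
# Hedged sketch of the agent's three steps: map each VM to an equivalence
# class via a relation Fs, look distances up in a shared table, and use
# that tabled distance inside a 1-NN model.

def fs(vm):
    """Step 1: toy equivalence relation, classifying by dominant resource."""
    cpu, ram, disk = vm
    return "cpu-bound" if cpu >= max(ram, disk) else "io-bound"

# Step 2: symmetric lookup table on the equivalence classes (illustrative).
table = {
    ("cpu-bound", "cpu-bound"): 0.0,
    ("io-bound", "io-bound"): 0.0,
    ("cpu-bound", "io-bound"): 1.0,
    ("io-bound", "cpu-bound"): 1.0,
}

def tabled_distance(a, b):
    return table[(fs(a), fs(b))]

# Step 3: a 1-NN model built and tuned on the tabled distance.
reference = [((0.9, 0.2, 0.1), "public"), ((0.1, 0.3, 0.9), "private")]

def allocate(vm):
    _, label = min(reference, key=lambda e: tabled_distance(vm, e[0]))
    return label

print(allocate((0.8, 0.1, 0.2)))  # public (cpu-bound)
```

Because the table, unlike a hard-coded metric, is explicit data, it can be published, shared, and swapped per objective.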
It is important to remark that steps 2 and 3
are the interface between symbolic (ontology-
based) and subsymbolic (ML-based) operation of
the multicloud management agent, and there-
fore, equivalence relations are a key point of
interest for future standardization of multicloud
ontologies.
The extension of the cloud ontology to represent equivalence relations among resources12 will
guarantee the uniform behavior (if not uniform
achievements) on the part of all ML models
reconfiguring the cloud to reach an objective
defined at the ontology (symbolic) level, such as
a nonfunctional property mentioned in a service
level agreement.
It is also important to remark that the AI
agent can pursue multiple goals simultaneously:
to jointly achieve the goals of fairness and performance, two different metric spaces with different lookup tables can be considered, and two different ML models can be
tuned and used. Alternatively, a “hybrid” equiva-
lence class can be defined, and a natural metric
used on it.
ACKNOWLEDGMENT
This work was supported in part by the Euro-
pean Union’s Horizon 2020 Research and Innova-
tion Program through the TOREADOR Project under Grant 688797 and in part by the mOSAIC
Project under Grant 256910.
REFERENCES
1. E. Damiani, C. Ardagna, P. Ceravolo, and N. Scarabottolo, “Toward model-based big data-as-a-service: The TOREADOR approach,” in Proc. Eur. Conf. Adv. Databases Inf. Syst., 2017, pp. 3–9.
2. B. Di Martino and A. Esposito, “A tool for mapping and editing of cloud patterns: The semantic cloud patterns editor,” Stud. Inform. Control, vol. 27, no. 1, pp. 117–126, 2018.
3. J. Dietrich and C. Elgar, “An ontology based representation of software design patterns,” in Proc. Des. Pattern Formalization Techn., Jan. 2007, pp. 258–279.
4. C. Fehling, F. Leymann, R. Retter, W. Schupeck, and P. Arbitter, Cloud Computing Patterns—Fundamentals to Design, Build, and Manage Cloud Applications. New York, NY, USA: Springer, 2014.
5. L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” J. Mach. Learn. Res., vol. 9, pp. 2579–2605, 2008.
6. B. Di Martino, G. Cretella, and A. Esposito, “Towards a legislation-aware cloud computing framework,” Procedia Comput. Sci., vol. 68, pp. 127–135, 2015.
7. B. Di Martino, G. Cretella, A. Esposito, and G. Carta, “An OWL ontology to support cloud portability and interoperability,” Int. J. Web Grid Services, vol. 11, no. 3, pp. 303–326, 2015.
8. B. Di Martino, A. Esposito, and G. Cretella, “Semantic representation of cloud patterns and services with automated reasoning to support cloud application portability,” IEEE Trans. Cloud Comput., vol. 5, no. 4, pp. 765–779, Oct./Dec. 2017.
9. A. Palesandro, M. Lacoste, N. Bennani, C. Ghedira-Guegan, and D. Bourge, “MANTUS: Putting aspects to work for flexible multi-cloud deployment,” in Proc. IEEE 10th Int. Conf. Cloud Comput., 2017, pp. 656–663.
10. D. Petcu et al., “Experiences in building a mosaic of clouds,” J. Cloud Comput., vol. 2, no. 1, pp. 2–12, 2013.
11. A. Bar-Hillel, “Learning distance functions using equivalence relations,” in Proc. 20th Int. Conf. Mach. Learn., 2003, pp. 11–18.
12. N. Guarino and C. Welty, “Identity, unity, and individuality: Towards a formal toolkit for ontological analysis,” in Proc. 14th Eur. Conf. Artif. Intell., 2000, pp. 219–223.
Beniamino Di Martino is a Full Professor at the
University of Campania. He is an author of 14
international books and more than 300 publica-
tions in international journals and conferences;
has been a coordinator of the EU-funded FP7-ICT Project mOSAIC, and participates in various international research projects; is an editor/associate
editor of seven international journals and EB Mem-
ber of several international journals; is vice chair
of the Executive Board of the IEEE CS Technical
Committee on Scalable Computing; is a member
of the IEEE WG for the IEEE P3203 Standard on
Cloud Interoperability, IEEE Intercloud Testbed
Initiative, IEEE Technical Committees on Scalable
Computing and on Big Data, Cloud Standards
Customer Council, and Cloud Experts’ Group of
the European Commission. Contact him at: benia-
Antonio Esposito is a Postdoc with the Depart-
ment of Engineering, University of Campania
“Luigi Vanvitelli” (Italy). He received the Ph.D.
degree in December 2016, with a thesis on a pat-
tern-guided semantic approach to the solution of
portability and interoperability issues in the cloud
enabling automatic services composition. Contact
him at: [email protected].
Ernesto Damiani is a Full Professor at the Univer-
sity of Milan, where he leads the Secure Software
Architectures Lab. He is the founding director of the
Cyber-Physical Systems Research Center, Khalifa
University, UAE. He is an author of more than 500
publications in international journals and conferen-
ces and of several patents. He has been the coordi-
nator of the EU-funded H2020 Project TOREADOR and participates in various international research projects. He is an ACM Distinguished Scientist and a
recipient of the Stephen Yau Award. He received a
doctorate honoris causa from INSA Lyon (France) for
his contributions to research and innovation architec-
tures for big data analytics. Contact him at: ernesto.
This article originally appeared in IEEE Internet Computing, vol. 23, no. 1, 2019.
16 June 2019 Published by the IEEE Computer Society 2469-7087/19/$33.00 © 2019 IEEE
Serverless Computing: From Planet Mars to the Cloud
Serverless computing is a new way of managing
computations in the cloud. We show how it can be put to
work for scientific data analysis. For this, we detail our
serverless architecture for an application analyzing data
from one of the instruments onboard the ESA Mars
Express orbiter, and then, we compare it with a
traditional server solution.
SERVERLESS AND (PUBLIC) CLOUD COMPUTING
Serverless computing is an execution model where the user provides code to be run without any involvement in server management or capacity planning. Many serverless implementations are offered in the form of compute runtimes, which execute application logic but do not persistently store data. As the uploaded code is exposed to the outside world, the developer no longer needs to be involved in multitasking, handling requests, or operating system costs (installation, maintenance, and licenses). Serverless computing is a form of utility computing and, paradoxically, relies on actual servers maintained by cloud computing providers.
Cloud computing is a provision model that gives access to dynamic, elastic, and on-demand computational resources. Within the general deployment models, public clouds have attracted much interest among the scientific community in recent years. Public cloud services are provided by an independent organization that owns compute resources, which are then offered to their customers. Public cloud users, especially those from the science field, find this pay-as-you-go pricing model to be very convenient. Its great flexibility allows budgets to be easily adapted to computing needs.
José Luis Vázquez-Poletti, Universidad Complutense de Madrid
Ignacio Martín Llorente, Universidad Complutense de Madrid and Harvard University
Editors: Konrad Hinsen, [email protected]; Matthew Turk, [email protected]
DEPARTMENT: SCIENTIFIC PROGRAMMING
Computing in Science & Engineering, November/December 2018, 73
Published by the IEEE Computer Society, 1521-9615/18/$33.00 © 2018 IEEE
Amazon Web Services (AWS) is one of the most widely used public cloud platforms. Among its numerous services, Lambda (aws.amazon.com/lambda) offers what can be regarded as serverless computing. Under the premise “run code, not servers,” AWS Lambda makes it possible to run code in response to specific events while transparently managing the underlying compute resources for the user.
AN APPLICATION FROM A NEIGHBORING PLANET
The application ported to the Lambda serverless environment processes data from the Mars Advanced Radar for Subsurface and Ionosphere Sounding (MARSIS).1 MARSIS is a pulse-limited and low-frequency radar sounder and altimeter installed on the Mars Express orbiter from the European Space Agency (sci.esa.int/mars-express/), which has been traveling around Mars since 2003. The instrument gained fame in July 2018 when it detected liquid water hidden beneath the Martian South Pole.2
The application processes MARSIS data from the active ionospheric sounding (AIS) experiment and displays it graphically in order to identify magnetic fields. It also allows for the detection of induced magnetic fields deep in the Martian ionosphere, compressing its plasma.3 Through this process, new avenues have become available to study the effects of solar wind and understand dust storms.
Figure 1 shows the process performed by the application, which reuses software previously used in the mission.
Our objective was to process every data file once it became available. This minimized response time, achieving an optimal performance-cost trade-off. The Mars Express mission is expected to be extended to at least the year 2022, and the study of the ionosphere has been identified as one of the mission’s priorities, as observations performed by the orbiter will see their coverage augmented and their long-time series extended.
In light of this, a large MARSIS dataset was generated as our starting point to evaluate the best computing approach. This consisted of 9761 files with a total size of 92 GB and was retrieved between 2005 and 2016 (see Figure 2). The idea was that if our proposal worked with this large dataset, it should also be suitable for future data.
Figure 1. Application overview. Data from the AIS experiment are converted into images, which are then used to calculate magnetic fields.
20mcse06-vzquezpoletti-2875315.3d (Style 4) 05-04-2019 15:32
Serverless Computing:From Planet Marsto the Cloud
Serverless computing is a new way of managing
computations in the cloud. We show how it can be put to
work for scientific data analysis. For this, we detail our
serverless architecture for an application analyzing data
from one of the instruments onboard the ESA Mars
Express orbiter, and then, we compare it with a
traditional server solution.
SERVERLESS AND (PUBLIC) CLOUD COMPUTINGServerless computing is an execution model where the user provides a code to be run without anyinvolvement in server management or capacity planning. Many serverless implementations are offeredin the form of compute runtimes, which execute application logic but do not persistently store data. Asthe uploaded code is exposed to the outside world, the developer no longer needs to be involved inmultitasking, handling requests, or operating system costs (installation, maintenance, and licenses).Serverless computing is a form of utility computing and paradoxically relies on actual serversmaintained by cloud computing providers.
Cloud computing is a provision model that gives access to dynamic, elastic, and on-demandcomputational resources. Within the general deployment models, public clouds have attractedmuch interest among the scientific community in recent years. Public cloud services are providedby an independent organization that owns compute resources, which are then offered to theircustomers. Public cloud users, especially those from the science field, find this pay-as-you-gopricing model to be very convenient. Its great flexibility allows budgets to be easily adapted tocomputing needs.
José Luis Vázquez-Poletti, Universidad Complutense de Madrid
Ignacio Martín Llorente, Universidad Complutense de Madrid; Harvard University
Editors: Konrad Hinsen, [email protected]; Matthew Turk, [email protected]
DEPARTMENT: SCIENTIFIC PROGRAMMING
Computing in Science & Engineering, November/December 2018, 73
Published by the IEEE Computer Society, 1521-9615/18/$33.00 ©2018 IEEE
18 ComputingEdge June 2019
Execution time may take from milliseconds to half a minute for each data file depending on its size (see Figure 3). As there were many independent, simple processes involved, where each computation needed to be optimized for latency and completed in near real time, we considered porting to a serverless environment to be the next logical step.
GETTING IT DONE WITH AWS LAMBDA
With regard to Lambda, a so-called function was created that contained the following.
1. An executable that performs the core operations on the specified data file. This executable, programmed in C, was generated on an Amazon Linux box with static libraries. Its size is 21 KB.
2. The main function code that handles the data file, invoking the executable and copying the output to a specified repository. Node.js 6.10 was used to program it, but Lambda also offers Java, C#, and Python bindings. Its size is 11.1 KB.
The main function code is also responsible for retrieving a reference file needed in the process. The reason for not including it in the function bundle is its size (77 MB), which exceeds the maximum allowed. All output files must be generated in a scratch directory (/tmp) limited to 512 MB. The rest of the filesystem where the function is executed is read-only.
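The handler's first job is to work out which object fired the trigger. The authors' actual handler is written in Node.js, but since Lambda also offers Python bindings, the event-parsing step can be sketched in Python. The event shape below is the standard S3 notification payload; the function name and bucket/key values are illustrative, not taken from the paper's code.

```python
def dat_files_in_event(event):
    """Extract (bucket, key) pairs for .dat objects from an S3 trigger event.

    Lambda passes S3 notifications as a dict with a "Records" list; each
    record carries the bucket name and object key that fired the trigger.
    """
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in event.get("Records", [])
        if r["s3"]["object"]["key"].endswith(".dat")
    ]

# A minimal fake event with the same shape as a real S3 notification:
event = {"Records": [{"s3": {"bucket": {"name": "marsis-input"},
                             "object": {"key": "AIS_2016_001.dat"}}}]}
```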
Figure 2. Arrival dates and times of the entire dataset from the AIS experiment between 2005 and 2016.
Figure 3. Distribution of data file sizes from the AIS experiment between 2005 and 2016 (in MB).
The function was then configured to be triggered every time a new file with the .dat suffix appeared in a specified simple storage service (S3) bucket. In Lambda, there are many event sources that can be used for triggering functions, all of them related to AWS services. Some examples are email reception by simple email service, data addition to a DynamoDB table, and scheduled triggers using CloudWatch events.
Returning to our function, output files are moved to another S3 bucket. An overview of the architecture is shown in Figure 4.
Since Lambda is a serverless service, users do not need to be concerned with server specifications. Functions were conceived to be small and short, and the price depends on the following parameters.
• Memory assigned to the function. We specified 704 MB, with a maximum of 3008 MB.
• Deadline. Even if 30 s could be enough, we specified 1 min, with a maximum of 5 min.
When one of the previous limits is reached, Lambda stops the execution of the function.
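For reference, these two limits can be set from the AWS CLI. The function name below is hypothetical; the memory size is in MB and the deadline maps to the function timeout in seconds.

```shell
# Set the memory (704 MB) and deadline (1 min) used in the article.
aws lambda update-function-configuration \
    --function-name marsis-process \
    --memory-size 704 \
    --timeout 60
```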
TUNING OUR FUNCTION: COLD AND WARM CONTAINERS
When a function is invoked, Lambda releases a container (sandbox) to be allocated within the AWS infrastructure, function files are copied onto it, and execution takes place. If several trigger events occur at the same time, several containers are released in order to meet the demand.
The lifetime of each of these containers is 15 min after the execution of the function. If there is a new invocation, the container is reused, including all of its files. We took advantage of this in order to speed up the execution of the function. For instance, a warm (reused) container does not require the function to obtain the reference file (77 MB), because it has already been downloaded. Also, a warm container skips the Node.js language initialization and the code initialization itself.
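A common way to exploit this reuse is to keep the reference file in /tmp and guard the download with module-level state, since both module state and /tmp contents survive in a warm container. This is a minimal Python sketch of the idea only; the path and the injected download callable are hypothetical, not the authors' Node.js implementation.

```python
import os

REF_PATH = "/tmp/reference.bin"  # hypothetical path; /tmp persists across warm invocations
_warm = False                    # module state also survives in a warm container

def ensure_reference(download):
    """Fetch the large reference file on a cold start only.

    `download` is any callable that writes the file to the given path
    (e.g., an S3 GET); warm invocations skip the transfer entirely.
    """
    global _warm
    if not (_warm and os.path.exists(REF_PATH)):
        download(REF_PATH)
        _warm = True
    return REF_PATH
```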
Figure 4. Architecture overview showing the AWS services involved. Official icon set provided by Amazon Web Services.
We finally estimated the container average warming time (execution time in cold containers minus that in warm ones) to be 5 s.
The full function structure is described in Figure 5.
SERVERLESS OR SERVERFUL? A PERFORMANCE AND COST ANALYSIS
This section compares our serverless solution with a more traditional server-based one, but still in the cloud, using AWS Elastic Compute Cloud (EC2).
In the AWS EC2 case, a t2.small machine (1 virtual CPU, 2 GB memory) was prepared with everything needed for processing the input files when they became available, including the reference file. As keeping this machine up 24/7 was not a valid option, the following procedure was considered.
1. Start the machine and download the .dat file from the S3 input bucket.
2. Run the executable, one .dat file at a time.
3. Upload the output file to the S3 output bucket.
4. If there is a new .dat file in less than 1 min, go to 2.
5. Stop the machine.
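The five steps above can be sketched as a small control loop. The callables below (boot, next_file, run, upload, stop) are placeholders for the real EC2 and S3 operations, so the sketch stays testable without AWS; next_file is expected to return None once no new .dat file appears within the 1-min window.

```python
def ec2_session(boot, next_file, run, upload, stop):
    """One up-cycle of the processing machine, following steps 1-5."""
    boot()                   # step 1: start the machine
    f = next_file()          # step 1: fetch the first .dat file
    while f is not None:     # step 4: next_file waits up to 1 min for new input
        upload(run(f))       # steps 2-3: process one file, upload its output
        f = next_file()
    stop()                   # step 5: stop the machine
```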
In the EC2 simulations, we assumed the existence of an external program that detects the availability of new .dat files in the S3 input bucket. As explained before, this process is automatic in Lambda, where the user can easily define triggers.
Figure 5. Flowchart describing the lifecycle of our AWS Lambda function.
Stopping the machine ensures that its configuration and files (executable, reference file) will be ready for the next process without any further cost (which is $0.023/h when running). Starting the machine takes an average of 26 s, and the file transfer takes around 4 s.
The arrival of files, as detailed in Figure 2, was simulated in EC2. The machine would be up a total of 173 h 38 min (including the 1-min wait for new .dat files). The execution of 4776 processes (48.92% of the dataset) was delayed because of machines that were either stopped or busy. In the first case, boot time delays were applied. In the second, a new machine was started if its boot time was lower than the estimated remaining time for the active process.
During the simulations, no more than two simultaneous instances were needed. This happened 512 times. However, the accumulated delay was 39 h 20 min: an average of 29 s for each of the 4776 affected processes.
Using the virtual machine approach costs $3.99. Storage with S3 involves a very small cost, as it charges $0.023/GB when transfers are less than 50 TB/month. The total cost would be $2.21, with $2.12 corresponding to input transfers (92 GB) and $0.092 to output transfers (4 GB).
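The storage figures follow directly from the quoted S3 rate; a quick arithmetic check, with the rate and volumes taken from the text above:

```python
S3_PRICE_PER_GB = 0.023             # USD/GB, first 50-TB/month tier quoted above

input_cost = 92 * S3_PRICE_PER_GB   # 92 GB of input transfers
output_cost = 4 * S3_PRICE_PER_GB   # 4 GB of output transfers
total = input_cost + output_cost

print(round(input_cost, 2), round(output_cost, 3), round(total, 2))
# 2.12 0.092 2.21
```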
Moving to Lambda, the total usage time was 10 h 11 min, but AWS would bill 8 min more because of rounding up. With regard to container temperature, 898 executions were cold and 8863 were warm, taking advantage of the tweak described before. Also, 5 s was the average time for container warming, whereas the accumulated delay for Lambda was 1 h 15 min.
Table 1 shows a comparison of the execution times in both solutions for a set of file sizes. As shown, Lambda offers the best performance for all selected cases.

Table 1. Comparison of execution times in AWS EC2 and Lambda considering file sizes.

File size (MB)   Execution time AWS EC2 (ms)   Execution time AWS Lambda (ms)
4.9              3374                          1800
13               9228                          3700
17               11796                         4600

Prices in Lambda depend on the chosen configuration. In our case (704 MB and a 1-min deadline), the total cost would be $4.25 just for execution. Storage has the same cost as in the EC2 solution ($2.21), as S3 buckets are used.

As can be seen in the breakdown of costs for each of the solutions shown in Table 2, the EC2 solution is slightly cheaper than the Lambda one. On the other hand, Table 1 shows that execution time and delays for Lambda are lower. Moreover, execution on EC2 would require the development of a trigger system.

Table 2. Costs for each of the solutions.

            AWS EC2   AWS Lambda
Execution   $3.99     $4.25
Storage     $2.21     $2.21

AWS Lambda can be a valid solution if the demand is for low latency processing at a reasonable cost, and the underlying infrastructure is not a concern (only two performance parameters can be specified). Serverless computing requires processes to be atomized, and Lambda is limited by the Amazon Linux image it comes with, static libraries, and what can be installed and used in less than the time available for execution. If you rely on some of the many services offered by Amazon or you are willing to
migrate from other providers, orchestration with Lambda is a must. Obviously, the drawback comes in the form of vendor lock-in.
ACKNOWLEDGMENTS
The authors would like to thank the ESA Mars Express mission team (in particular, its Project Scientist, Dr. Dmitri Titov) and their colleagues from the UCM Martian Studies Group (in particular, Prof. Luis Vázquez, Dr. María Ramírez-Nicolás, Dr. Pedro J. Pascual, Dr. Salvador Jiménez, and Dr. David Usero).
The authors are also grateful for the support provided by the Spanish Ministry of Economy and Competitiveness under project number TIN2015-65469-P and by the European Commission under the IN-TIME project under Grant 823934.
REFERENCES
1. G. Picardi, D. Biccardi, R. Seu, J. Plaut, W. T. K. Johnson, and R. L. Jordan, "MARSIS: Mars advanced radar for subsurface and ionosphere sounding," in Mars Express: The Scientific Payload. Noordwijk, The Netherlands: ESA Publ. Div., 2004, pp. 51–69.
2. R. Orosei et al., "Radar evidence of subglacial liquid water on Mars," Science, 2018; http://science.sciencemag.org/content/early/2018/07/24/science.aar7268.
3. M. Ramírez-Nicolás et al., "The effect of the induced magnetic field on the electron density vertical profile of the Mars' ionosphere: A Mars Express MARSIS radar data analysis and interpretation, a case study," Planet. Space Sci., vol. 20, pp. 49–62, 2016.
ABOUT THE AUTHORS
José Luis Vázquez-Poletti is an Associate Professor with the Universidad Complutense de Madrid, Madrid, Spain, where he is also the Head of its Open Source Software and Open Technologies Office. His research interests include distributed, parallel, and data-intensive computing technologies, as well as innovative applications of those technologies to the scientific area. He received the Ph.D. degree in computer architecture from the Universidad Complutense de Madrid. Contact him at [email protected].
Ignacio Martín Llorente is a Visiting Professor with the Harvard John A. Paulson School of Engineering and Applied Sciences, Cambridge, MA, USA, a Professor in computer architecture and the Head of the Data-Intensive Cloud Lab, Universidad Complutense de Madrid, Madrid, Spain, and the Co-Founder and Director of the OpenNebula open-source project for cloud computing management. His research interests include distributed, parallel, and data-intensive computing technologies, as well as innovative applications of those technologies to business and scientific problems. Contact him at [email protected] or [email protected].
This article originally appeared in Computing in Science & Engineering, vol. 20, no. 6, 2018.
Published by the IEEE Computer Society, 2469-7087/19/$33.00 © 2019 IEEE. Originally published in Computer, 0018-9162/17/$33.00 © 2017 IEEE.
CLOUD COVER
In the past few decades, aggressive miniaturization of semiconductor circuits has driven the explosive performance improvement of digital systems that have radically reshaped the way we work, entertain, and communicate. At the same time, new paradigms such as cloud and edge computing, as well as the Internet of Things (IoT), enable billions of devices to interconnect intelligently. These devices will generate huge volumes of data that must be processed and analyzed in centralized or decentralized datacenters located close to users. The analysis of this data could lead to new scientific discoveries and new applications that will improve our lives.1

Such advances are at risk, however, because ongoing technology miniaturization appears to be ending, as it frequently causes otherwise-identical nanoscale circuits to exhibit different performance or power-consumption behaviors, even though they're designed using the same processes and architecture (see Figure 1). Such variations are caused by imperfections in the manufacturing process that are magnified as circuits get smaller. This can result in many fabricated chips not meeting their intended performance and power specifications, thereby endangering the correct functionality of products that use them.
Error-Resilient Server Ecosystems for Edge and Cloud Datacenters
Georgios Karakonstantis and Dimitrios S. Nikolopoulos, Queen's University Belfast
Dimitris Gizopoulos, National and Kapodistrian University of Athens
Pedro Trancoso and Yiannakis Sazeides, University of Cyprus
Christos D. Antonopoulos, University of Thessaly
Srikumar Venugopal, IBM Research
Shidhartha Das, ARM Research
The explosive growth of Internet-connected devices that form the Internet of Things and the flood of data they yield require new energy-efficient and error-resilient hardware and software server stacks for next-generation cloud and edge datacenters.
EDITOR: San Murugesan, BRITE Professional Services; [email protected]
Manufacturers try to deal with the huge performance and power variability in fabricated chips—and hide it from the software layers—by adopting pessimistic safety timing margins and redundant error-correction schemes designed to counteract the worst possible scenario.2 Typically, such measures are extremely pessimistic because they're based on rare worst-case operating conditions and because the capabilities of the worst performing chips are far inferior to those of the vast majority of identically manufactured circuits. As a result, the majority of the chips are constrained to operate at the low speed and high power consumption of the relatively few worst-case chips, not at the speed and power they could actually achieve.3
The use of pessimistic timing margins—along with the inability to save power efficiently by scaling down the supply voltage in nanoscale circuits because that would make circuits even more prone to failure—has elevated energy efficiency's significance.4
Reducing processors' energy consumption could not only enable products to meet their tight power budgets but could also let users improve performance by employing more resources or by operating the chips at a higher frequency.5 This would be particularly important for servers, which will soon have to handle the huge amounts of data that the increasing number of interconnected devices will generate, a total estimated to reach 24.3 exabytes per month in 2019.6
IMPROVEMENTS REQUIRE NEW DESIGN APPROACHES
Substantially improving energy efficiency requires new types of error-resilient server ecosystems that can handle hardware components' increased power and performance variability more intelligently than conventional pessimistic paradigms.
The computing industry should see such heterogeneity not as a problem but as an opportunity to improve energy efficiency. This could be done by not artificially constraining all chips' performance based on a few outliers but rather by letting each chip operate according to its true capabilities. Exploiting such heterogeneity requires a shift away from current approaches and a redesign of next-generation servers' hardware and system software.
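The contrast between the two policies can be made concrete with a small numerical sketch. Assume each chip's minimum stable voltage is known from characterization (the values and the guardband below are invented for illustration): the conventional approach runs every chip at the worst chip's voltage plus a margin, while the per-chip approach gives each chip its own operating point.

```python
def operating_voltages(vmin_per_chip, guardband=0.02):
    """Return (pessimistic voltage applied to all chips, per-chip voltages).

    vmin_per_chip maps a chip id to its measured minimum stable voltage (V);
    guardband is a small safety margin added on top of each measurement.
    """
    pessimistic = max(vmin_per_chip.values()) + guardband
    per_chip = {c: v + guardband for c, v in vmin_per_chip.items()}
    return pessimistic, per_chip

vmins = {"chip0": 0.86, "chip1": 0.90, "chip2": 0.83}  # illustrative values
pessimistic, per_chip = operating_voltages(vmins)
```

Under the pessimistic policy, chip2 runs well above the voltage it actually needs; under the per-chip policy it runs just above its own measured minimum.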
SERVER ECOSYSTEM
Making the most of heterogeneity requires automated firmware-level procedures to expose each processor's and memory resource's capabilities, even as they change over time. This requires the embedding of diagnostic and health-monitoring daemons in the firmware of any server that evaluates hardware components' operations during the product lifetime (see Figure 2). These daemons would access on-chip sensors and error-detection circuitry to collect and analyze various parameters, such as correctable and uncorrectable errors, performance counters, system crashes and hangs, and thermal and power behavior. This would be similar to the procedures that the machine-check architecture in x86 systems has adopted (www.mcelog.org).
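As a rough sketch of what such a daemon's bookkeeping could look like (the class and event names are invented for illustration; a real implementation would read on-chip sensors and MCA-style error banks rather than take explicit calls):

```python
from collections import Counter

class HealthMonitor:
    """Aggregates per-resource event counts into a health log."""

    def __init__(self):
        self.log = Counter()

    def record(self, resource, event):
        """Log one observation, e.g. ("cpu0", "correctable_error")."""
        self.log[(resource, event)] += 1

    def count(self, resource, event):
        return self.log[(resource, event)]

mon = HealthMonitor()
mon.record("cpu0", "correctable_error")
mon.record("cpu0", "correctable_error")
mon.record("dimm1", "uncorrectable_error")
```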
The new system would use a hardware exposure interface (HEI) enhanced to collect the required information and communicate it to the software stack, which would identify energy-efficient voltage, frequency, and refresh-rate states for processors and memory subsystems.
We should also rethink the design of all system-software layers—including hypervisors and resource-management frameworks such as OpenStack—used in today's datacenters. These layers should be able to operate hardware close to its performance and power limits. For example, hypervisors could use this capability to allocate processor and memory resources with different reliability, power, and performance-efficiency characteristics to virtual machines so that they improve performance while reducing the server's energy consumption. Cloud-management frameworks could leverage the same capability to provide better quality of service (QoS) to users and applications.
However, operating hardware outside its normal safety margins could introduce critical system-software
Figure 1. Identical chips, in this case CPUs, might have substantially different performance characteristics, such as operating frequency (Freq), even if they were designed and manufactured using the same processes.
errors and even crash servers. We need system-software-resilience approaches to avoid or mitigate such errors and sustain high server availability. In datacenters, this means not degrading the QoS of a customer workload in case of a server crash and adopting mechanisms for speeding up the recovery process.
EMPOWERING INTERNET EVOLUTION
Manufacturers could integrate our proposed ecosystem into standard high-end servers and the newly introduced microservers. Microservers don't perform as well as mainstream servers yet, but they can service many types of application requests with an appropriate level of performance and with significantly less power consumption.
Integrating our proposed software and hardware ecosystem into servers would help power next-generation datacenters in the cloud and at the network edge, where energy efficiency is particularly critical for minimizing power supply, cooling, and maintenance costs.
Energy-efficient servers would help create a more sustainable Internet. Presently, most Internet processing and storage takes place in the cloud, in massive centralized datacenters that contain tens of thousands of servers, consume as much electricity as a small city, and utilize expensive cooling mechanisms. This won't be practical in the IoT era because the current Internet infrastructure's limited network capacity won't accommodate the exabytes of data that Internet-connected devices will soon generate. However, using the typical centralized datacenters along with new decentralized datacenters at the network edge, closer to users, could limit the load put on the Internet infrastructure by allowing preprocessing and selective forwarding of data to the cloud. This paradigm, used by both edge and fog computing, has advantages over the cloud paradigm and is being promoted by major companies such as Cisco, Huawei, IBM, and Intel as a way to transform the next-generation Internet.7
Finally, edge resources' ability to provide all necessary services within a home or small business improves privacy because the data they carry doesn't have to travel through the public network or reside in third-party datacenters.
Figure 2. This server ecosystem spans all layers of the system stack and enhances it with technologies for monitoring hardware health and optimizing overall operation under an extended range of operating points. The figure shows applications and virtual machines running on a variability-aware OpenStack and a scalable fault-tolerant hypervisor, firmware-level stress daemons and a health monitor feeding stress and health logs, a hardware exposure interface, and intrinsically and functionally heterogeneous hardware (cores, static/dynamic memories).
Realizing our proposed error-resilient, energy-efficient ecosystem faces many challenges, in part because it requires the design of new technologies and the adoption of a system operation philosophy that departs from the current pessimistic one.
The UniServer Consortium (www.uniserver2020.eu)—consisting of academic institutions and leading companies such as AppliedMicro Circuits, ARM, and IBM—is working toward such a vision. Its goal is the development of a universal system architecture and software ecosystem for servers used for cloud- and edge-based datacenters. The European Community's Horizon 2020 research program is funding UniServer (grant no. 688540).
The consortium is already implementing our proposed ecosystem in a state-of-the-art X-Gene2 eight-core, ARMv8-based microserver with 28-nm feature sizes. The initial characterization of the server's processing cores shows that there is a significant safety margin in the supply voltage used to operate each core. Results show that some cores could operate at 10 percent below the nominal supply voltage that the manufacturer advises. This could lead to a 38 percent power savings.8
Similarly promising is the characterization of the DRAM memories used on the ARMv8-based microserver, showing that the refresh rate and supply voltage could be decreased by 98 percent and 5 percent from nominal levels, respectively. Such reductions could lead to an average power savings of more than 22 percent across a range of benchmarks.9
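A back-of-the-envelope sketch helps put the voltage result above in perspective, assuming the classic rule that dynamic CMOS power scales with C·V²·f. Note that the measured 38 percent reported in the article also reflects leakage reduction and workload behavior that this simple rule does not capture, so the sketch is illustrative only.

```python
def dynamic_power_ratio(v_scale: float, f_scale: float = 1.0) -> float:
    """Dynamic power relative to nominal after scaling voltage and frequency,
    using the approximation P_dyn ~ C * V^2 * f."""
    return (v_scale ** 2) * f_scale

# A 10 percent undervolt at constant frequency cuts dynamic power to
# about 81 percent of nominal, i.e. roughly 19 percent savings.
print(f"{1.0 - dynamic_power_ratio(0.90):.0%}")  # 19%
```

The gap between this 19 percent estimate and the reported 38 percent is exactly why empirical per-core characterization, rather than a closed-form rule, underpins the UniServer approach.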
REFERENCES
1. H. Bauer, M. Patel, and J. Veira, The Internet of Things: Sizing up the Opportunity, online report, McKinsey & Co., December 2014; www.mckinsey.com/industries/semiconductors/our-insights/the-internet-of-things-sizing-up-the-opportunity.
2. S. Ghosh and K. Roy, "Parameter Variation Tolerance and Error Resiliency: New Design Paradigm for the Nanoscale Era," Proc. IEEE, vol. 98, no. 10, 2010, pp. 1718–1751.
3. P.N. Whatmough et al., "An All-Digital Power-Delivery Monitor for Analysis of a 28nm Dual-Core ARM Cortex-A57 Cluster," Proc. 2015 IEEE Solid-State Circuits Conf. (ISSCC 15), 2015, pp. 1–3.
4. H. Wong et al., "Implications of Historical Trends in the Electrical Efficiency of Computing," IEEE Annals of the History of Computing, vol. 33, no. 3, 2011, pp. 46–54.
5. H. Esmaeilzadeh et al., "Dark Silicon and the End of Multicore Scaling," IEEE Micro, vol. 32, no. 3, 2012, pp. 122–134.
6. Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016–2021, online white paper, Cisco Systems, March 2017; www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-white-paper-c11-520862.html.
7. W. Shi et al., "Edge Computing: Vision and Challenges," IEEE Internet of Things J., vol. 3, no. 5, 2016, pp. 637–646.
8. G. Papadimitriou et al., "Harnessing Voltage Margins for Energy Efficiency in Multicore CPUs," Proc. 50th IEEE/ACM Int'l Symp. Microarchitecture (MICRO 17), 2017, pp. 503–516.
9. K. Tovletoglou, D. Nikolopoulos, and G. Karakonstantis, "Relaxing DRAM Refresh Rate through Access Pattern Scheduling: A Case Study on Stencil-Based Algorithms," Proc. 23rd IEEE Int'l Symp. Online Testing and Robust System Design (IOLTS 17), 2017, pp. 45–50.
GEORGIOS KARAKONSTANTIS is an assistant professor in the School of Electronics, Electrical Engineering, and Computer Science (EEECS) at Queen's University Belfast and the scientific coordinator of the UniServer project. Contact him at [email protected].

DIMITRIOS S. NIKOLOPOULOS is a professor and the head of the School of EEECS at Queen's University Belfast. Contact him at

DIMITRIS GIZOPOULOS is a professor in the Department of Informatics and Telecommunications at the National and Kapodistrian University of Athens, where he leads the Computer Architecture Laboratory. Contact him at [email protected].

PEDRO TRANCOSO is an associate professor in the University of Cyprus' Department of Computer Science. Contact him at [email protected].

YIANNAKIS SAZEIDES is an associate professor in the University of Cyprus' Department of Computer Science. Contact him at [email protected].

CHRISTOS D. ANTONOPOULOS is an assistant professor in the University of Thessaly's Electrical and Computer Engineering Department. Contact him at [email protected].

SRIKUMAR VENUGOPAL is a research scientist at IBM Research–Ireland. Contact him at [email protected].

SHIDHARTHA DAS is a principal research engineer at ARM Research and a Royal Academy of Engineering visiting professor at Newcastle University. Contact him at [email protected].
80 COMPUTER WWW.COMPUTER.ORG/COMPUTER
CLOUD COVER
errors and even crash servers. We need system-software-resilience approaches to avoid or mitigate such errors and sustain high server availability. In datacenters, this means not degrading the QoS of a customer workload in case of a server crash and adopting mechanisms for speeding up the recovery process.
EMPOWERING INTERNET EVOLUTION
Manufacturers could integrate our proposed ecosystem into standard high-end servers and the newly introduced microservers. Microservers don't perform as well as mainstream servers yet, but they can service many types of application requests with an appropriate level of performance and with significantly less power consumption.
Integrating our proposed software and hardware ecosystem into servers would help power next-generation datacenters in the cloud and at the network edge, where energy efficiency is particularly critical for minimizing power supply, cooling, and maintenance costs.
Energy-efficient servers would help create a more sustainable Internet. Presently, most Internet processing and storage takes place in the cloud, in massive centralized datacenters that contain tens of thousands of servers, consume as much electricity as a small city, and use expensive cooling mechanisms. This won't be practical in the IoT era because the current Internet infrastructure's limited network capacity won't accommodate the exabytes of data that Internet-connected devices will soon generate. However, using the typical centralized datacenters along with new decentralized datacenters at the network edge, closer to users, could limit the load put on the Internet infrastructure by allowing preprocessing and selective forwarding of data to the cloud. This paradigm, used by both edge and fog computing, has advantages over the cloud paradigm and is being promoted by major companies such as Cisco, Huawei, IBM, and Intel as a way to transform the next-generation Internet.7
Finally, edge resources' ability to provide all necessary services within a home or small business improves privacy because the data they carry doesn't have to travel through the public network or reside in third-party datacenters.
This article originally appeared in Computer, vol. 50, no. 12, 2017.
28 June 2019 Published by the IEEE Computer Society 2469-7087/19/$33.00 © 2019 IEEE
Internet of Things, People, and Processes
Editor: Schahram Dustdar • [email protected]
64 Published by the IEEE Computer Society 1089-7801/17/$33.00 © 2017 IEEE IEEE INTERNET COMPUTING
A Serverless Real-Time Data Analytics Platform for Edge Computing
Stefan Nastic, Thomas Rausch, Ognjen Scekic, and Schahram Dustdar • TU Wien
Marjan Gusev, Bojana Koteska, Magdalena Kostoska, and Boro Jakimovski • Ss. Cyril and Methodius University
Sasko Ristov and Radu Prodan • University of Innsbruck
A novel approach implements cloud-supported, real-time data analytics in edge computing applications. The authors introduce their serverless edge-data analytics platform and application model and discuss their main design requirements and challenges, based on real-life healthcare use case scenarios.
With the increasing growth of the Internet of Things (IoT) and edge computing,1-3 an abundance of geographically dispersed computing infrastructure and edge resources remains largely underused for data analytics applications. At the same time, the value of data is effectively lost at the edge by remaining inaccessible to the more powerful data analytics in the cloud due to networking costs, latency issues, and limited interoperability between edge devices. The reason for both of these shortcomings is that today's cloud models do not optimally support data analytics at the volume and variety of data originating from sensors and edge devices, which are typically characterized by high latencies and response times. There is a hard line between the edge and the cloud parts of analytics applications in terms of responsibilities, design, and runtime considerations. While contemporary solutions for cloud-supported, real-time data analytics mostly apply analytics techniques in a rigid bottom-up approach regardless of the data's origin,4,5 doing data analytics on the edge forces developers to resort to ad hoc solutions specifically tailored to the available infrastructure. The process is largely manual, task-specific, and error-prone, and usually requires good knowledge of the underlying infrastructure. Consequently, when faced with large-scale, heterogeneous resource pools, performing effective data analytics is difficult, if not impossible.
A promising approach to address these issues is the serverless computing paradigm. Serverless computing is an emerging cloud-based execution model in which user-defined functions are seamlessly and transparently hosted and managed by a distributed platform.6 There are multiple commercial and open source implementations of serverless platforms, such as Amazon Web Services Lambda (see http://aws.amazon.com/lambda), Apache OpenWhisk (http://openwhisk.org), or OpenLambda.6 The benefits of the serverless model become especially evident in the context of the described cloud and edge model, as both models seek to mitigate inefficient, error-prone, and costly infrastructure and application management.
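A minimal sketch illustrates the serverless style described above: the developer supplies only a function, and the platform hosts, invokes, and scales it. The event shape and names here are illustrative, not the actual API of Lambda, OpenWhisk, or OpenLambda.

```python
def handler(event: dict) -> dict:
    """User-defined function: the only code the developer writes.
    The platform decides where and how to run it."""
    readings = event.get("readings", [])
    avg = sum(readings) / len(readings) if readings else None
    return {"count": len(readings), "average": avg}

# The platform would invoke the function on each incoming event:
print(handler({"readings": [72, 75, 71]}))
```

Everything outside the function body (provisioning, routing, scaling) is the platform's responsibility, which is precisely what the article proposes to extend to edge resources.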
In this article, we propose a uni�ed cloud and edge data analytics platform, which extends the notion of serverless computing to the edge and facilitates joint programmatic resource and
analytics management. We introduce a reference architecture for our platform and a serverless application execution model, which enable uniform development and operation of analytics functions, thereby freeing users from worrying about the complexity of the underlying edge infrastructure. Finally, we outline a set of research challenges and design principles to serve as a road map toward a fully fledged platform for cloud and edge real-time data analytics.
Use Case Scenarios
To better understand the need for distributed cloud and edge processing, we present two scenarios from IoT mobile healthcare (mHealth) and discuss holistic data analytics.
Use Case 1: Measuring Human Vital Signs in Disasters
In case of a major disaster, prompt paramedic attention is crucial to save people's lives. An extension of the major disaster protocol includes wearable biosensors that can be attached to injured people by on-site medics, providing critical information about the patient's medical condition and helping determine a priority queue for further patient processing.
As soon as the sensors are attached, they start emitting data to nearby portable edge devices, such as a smartphone. Such devices perform only basic, lightweight analytics, such as a triage decision tree, to determine the overall stability of the injured person and inform on-site medics. Once a network connection becomes available, selected data can be transferred to the cloud for more complex analysis that will predict and prioritize more precisely the patients' treatments, helping improve coordination between hospitals, optimize the time needed to reach a hospital, and provide timely information about patients' conditions.
Use Case 2: Measuring Human Vital Signs in Everyday Life
In this use case, a wearable biosensor measures a patient's vital signs during everyday life activities. Sensors continuously stream electrocardiogram readings to a nearby edge device that performs simple data analytics to monitor the patient's health condition. If an anomaly such as heart failure is detected, the system immediately notifies medical emergency services. Preprocessed and filtered data are subsequently sent to the cloud, where comprehensive data analytics can be performed to gain better insight into the patient's overall condition and help with diagnostics.
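The kind of lightweight, edge-side check this use case describes can be sketched as a simple band check on incoming readings, keeping a small buffer of filtered data for later upload to the cloud. Thresholds and names below are illustrative assumptions, not clinical guidance or the authors' code.

```python
from collections import deque

class EdgeVitalsMonitor:
    """Simple edge analytics: flag readings outside a safe band."""
    def __init__(self, low: float = 40.0, high: float = 150.0, window: int = 16):
        self.low, self.high = low, high
        self.buffer = deque(maxlen=window)  # filtered data bound for the cloud

    def ingest(self, heart_rate: float) -> bool:
        """Return True if this reading should trigger an emergency alert."""
        self.buffer.append(heart_rate)
        return not (self.low <= heart_rate <= self.high)

monitor = EdgeVitalsMonitor()
alerts = [hr for hr in (72, 75, 180, 71) if monitor.ingest(hr)]
print(alerts)  # [180]
```

Real deployments would replace the threshold with proper ECG anomaly detection, but the division of labor is the same: cheap per-reading checks at the edge, heavy analysis in the cloud.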
Holistic Data Analytics in mHealth
The presented use cases demonstrate the need for consolidating cloud- and edge-based data analytics techniques. To enable prompt reactions to a patient's changing health condition, low-latency algorithms should process the data at the edge in real time. Conversely, detecting patterns in large amounts of historic patient data requires analytics techniques that depend on cloud storage and processing capabilities. In an architecture that combines cloud and edge data analytics, edge devices should act as a data gateway. Preprocessed and filtered data can be sent to the cloud, where they become persistent and highly available for compute-intensive analytics. To consolidate different techniques and transparently handle data management, a holistic data analytics platform that unifies cloud and edge resources is needed.
Serverless Platform
Our main objective is to provide a full-stack platform for supporting real-time data analytics across cloud and edge in a uniform manner. The key role of the distributed cloud and edge platform is to facilitate automated management of the underlying resource pool and optimal placement of analytics functions to support the envisioned serverless execution model. This approach enables combining the benefits of the edge (lower response time and heterogeneous data management) with the computational and storage capabilities of the cloud. For example, time-sensitive data, such as life-critical vital signs, can be analyzed at the edge, close to where data are generated, instead of being transported to the cloud for processing. Alternatively, selected data can be forwarded to the cloud for further, more powerful analysis and long-term storage.
Platform Use and Architecture Overview
Figure 1a shows a high-level view of the platform and the main top-down control process (left) and bottom-up data management and delivery process (right). The proposed serverless data analytics paradigm is particularly suitable for managing different granularities of data analytics approaches bottom-up. This means that the edge focuses on local views (for example, per edge gateway), while the cloud supports global views, that is, combining and analyzing data from different edge devices, regions, or even domains. Data are collected from the underlying devices and delivered to the applications via consumption APIs. More importantly, the data analytics can be performed on edge nodes, cloud nodes, or both, and delivered from any of the nodes directly to the application, based on the desired view. Moreover, the top-down control process allows decoupling of application requirements (the what) from the concrete realization of those requirements (the how). This allows developers to simply define the analytics function behavior, data-processing business logic, and application goals (for example,
regarding provisioning) instead of dealing with the complexity of different management, orchestration, and optimization processes.
Figure 1b shows the platform's core architecture. We detail the layers of the architecture in the following.
The analytics function wrapper and APIs layer. This layer focuses on executing and managing user-provided data analytics functions, for example, delivering required data to the function and creating the resulting endpoints. To this end, this layer wraps the analytics functions in executable artifacts such as Linux containers and relies on the underlying layers to perform concrete runtime actions and execution steps.
The orchestration layer. This layer interprets and executes user-defined real-time analytics functions, requirements, and configuration models. It acts as a gluing component, bringing together an application's configuration model, user-defined analytics functions, and the platform's runtime mechanisms. The orchestration layer receives the application configuration directives in terms of high-level objectives, such as optimizing network latency. It interprets and analyzes these goals and decides how to orchestrate the underlying resources, as well as the user-defined functions, by invoking the underlying runtime mechanisms. To this end, this layer contains micro (edge-based) and macro (cloud-based) orchestration and control loops. For example, it can use the scheduling and placement mechanisms to determine the most suitable node (cloud or edge) for an analytics function to reduce the network latency.
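The placement decision just described can be sketched as follows: among nodes with spare capacity, pick the one with the lowest estimated network latency. Node names and metrics are illustrative assumptions.

```python
def place_function(nodes: list) -> str:
    """Pick the candidate node with spare capacity and lowest latency."""
    candidates = [n for n in nodes if n["free_slots"] > 0]
    best = min(candidates, key=lambda n: n["latency_ms"])
    return best["name"]

nodes = [
    {"name": "edge-gw-1", "latency_ms": 5, "free_slots": 0},   # full
    {"name": "edge-gw-2", "latency_ms": 8, "free_slots": 2},
    {"name": "cloud-dc", "latency_ms": 40, "free_slots": 64},
]
print(place_function(nodes))  # edge-gw-2
```

A real orchestration layer would weigh many more criteria (energy, cost, data locality), but the shape of the decision, filter by feasibility and then optimize an objective, is the same.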
The runtime mechanisms layer. This is an extensible plug-in layer, providing mechanisms to support executing the actions initiated by the orchestration layer. The deployment, scheduling, elasticity, and basic reasonable defaults for quality of service (QoS) are core runtime mechanisms. More precisely, the platform determines the minimally required elastic resources, provisions them, and then deploys, schedules, and executes analytics functions that will satisfy QoS requirements. On the other hand, the governance, placement, fault tolerance, and extended QoS mechanisms are optional. For example, in some cases the data could be confidential and some geographical regions should be excluded. Placing the functions closer to the data and deciding whether to use cloud or edge resources could improve the QoS. Additionally, a k-fault-tolerant platform that mitigates failure risks to acceptable levels could further improve the QoS.
Serverless Stream Model
To facilitate the serverless execution of edge real-time data analytics applications, we propose an
Figure 1. Cloud and edge real-time data analytics platform. (a) High-level usage context. (b) Internal software architecture.
[Figure 1 residue. Panel (a) labels: data consumption endpoints/APIs; data delivery views; data analytics; data collection; application/control requirements/goals; goal-to-execution mapping; runtime management/enforcement mechanisms; resource configuration and management; serverless execution model; infrastructure resource pool; distributed cloud-and-edge platform; cloud; edge/fog devices; IoT devices. Panel (b) labels: stream application model; analytics function wrapper and APIs layer; orchestration layer (monitor, analyze, plan, execute); plugins/extensions runtime mechanisms layer (deployment, scheduling, placement, elasticity, fault tolerance, QoS, governance); resources abstraction/virtualization layer; distributed cloud-and-edge platform.]
extension of the traditional stream processing model. In our serverless stream model (see Figure 2), the transformation function is the core concept and encapsulates user-defined data analytics logic to process data along the stream. These functions are then composed into topologies that enable complex data processing applications. In our model, we consider streams as first-class citizens; that is, streams are defined through all of the presented concepts, as well as function wrappers and stream contracts.
The wrapper is responsible for encapsulating the transformation functions and exposing a thin API layer, enabling the analytics function layer to treat functions as microservices. This lets our system transparently schedule and deploy functions using container-based deployment strategies, and compose functions into complex topologies. For stateful functions, these wrappers also provide implicit state management. The wrapper transparently handles state replication and migration, and access to a function's state is controlled via the exposed API.
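Composing transformation functions into a topology can be sketched as chaining plain callables, with state held in an object that a wrapper could replicate or migrate. Names and signatures are illustrative assumptions, not the authors' API.

```python
from functools import reduce

def parse(item: str) -> float:
    """Stateless stage: decode a raw stream item."""
    return float(item)

class Smoother:
    """Stateful stage; the wrapper would manage this state explicitly."""
    def __init__(self, alpha: float = 0.5):
        self.alpha, self.last = alpha, None

    def __call__(self, value: float) -> float:
        # Exponential smoothing over the stream.
        self.last = value if self.last is None else (
            self.alpha * value + (1 - self.alpha) * self.last)
        return self.last

def compose(*stages):
    """Chain stages into a simple linear topology."""
    return lambda item: reduce(lambda acc, stage: stage(acc), stages, item)

pipeline = compose(parse, Smoother())
print([pipeline(x) for x in ("10", "20", "30")])  # [10.0, 15.0, 22.5]
```

The point of the wrapper abstraction is that each stage could run in its own container on a different node; the composition itself need not change.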
The contract is a high-level description of how the platform manages deployment and execution of streams and their functions. Specifically, a contract gives a user fine-grained control over the platform's runtime mechanisms, that is, placement and scaling policies, governance rules, and QoS requirements. A contract is divided into sections, where each section specifies a respective runtime mechanism. The placement section allows control over how analytics functions are deployed across the infrastructure, such as which cloud or edge resources to use. The scaling section allows control over elasticity strategies, that is, how the platform should adapt to varying workloads. Governance rules allow additional definition of restrictions regarding security or privacy, such as the exclusion of certain geographical regions in the distributed infrastructure. The QoS section allows control over QoS requirements the platform should respect, for example, maximum stream latency or minimum stream throughput. Unless explicitly defined in the respective sections, the platform will enact sane defaults for each runtime mechanism.
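A hedged sketch of what such a contract might look like in code: one section per runtime mechanism, with platform defaults enacted for anything the user leaves out. All field names and default values are illustrative assumptions.

```python
DEFAULTS = {
    "placement": {"prefer": "edge"},
    "scaling": {"max_replicas": 4},
    "governance": {"excluded_regions": []},
    "qos": {"max_latency_ms": 1000},
}

def resolve_contract(user_contract: dict) -> dict:
    """Merge user-specified sections over the platform defaults."""
    return {section: {**defaults, **user_contract.get(section, {})}
            for section, defaults in DEFAULTS.items()}

contract = resolve_contract({
    "qos": {"max_latency_ms": 50},                     # tight latency bound
    "governance": {"excluded_regions": ["region-x"]},  # privacy restriction
})
print(contract["qos"])  # {'max_latency_ms': 50}
```

The merge-over-defaults pattern mirrors the article's "sane defaults" rule: users specify only the sections they care about.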
Design Requirements and Challenges
This section outlines the design requirements and challenges to realize our real-time data analytics platform and its main runtime mechanisms at the edge.
Provisioning Data Analytics Functions and Edge Resources
The serverless paradigm and application execution model undoubtedly have the potential to offer a wide range of benefits for provisioning and managing real-time edge data analytics functions. Unfortunately, due to the inherently different nature of edge infrastructure, for example, in terms of available resources and networks, provisioning solutions designed for cloud-based serverless landscapes are hardly applicable out of the box in this new computing environment. Fundamental architecture and design assumptions behind such approaches need to be reexamined and specifically tailored for the edge infrastructure to support seamless provisioning of both infrastructure resources and application components.
In our previous work, we developed models7 and middleware8 that enable provisioning edge/IoT resources and applications across large-scale IoT and edge deployments. Although these solutions offer numerous advantages to application developers and operations managers, such as a logically centralized point of operation in a geographically dispersed infrastructure, uniform interaction patterns with both cloud and edge resources,
Figure 2. Serverless stream application model. The transformation function is the core concept and encapsulates user-defined data analytics logic to process the stream data.
[Figure 2 residue. Labels: stream; data; data analytics function; state; analytics business logic; wrapper API; contract.]
and utility-based resource consumption, there are still a number of challenges to address to enable our vision of a serverless real-time data analytics platform for the edge. These include fully automated provisioning solutions, where user interaction is limited to providing high-level policies and goals that need to be fulfilled by the platform; design and implementation of provisioning and orchestration mechanisms for the edge network (for example, based on network function virtualization slicing or software-defined networks); and models and techniques for secure edge-resource negotiation based on smart contracts and blockchain technologies.
Availability and Scheduling of Loosely Coupled Edge Resources
The analytics wrapper and APIs layer (see Figure 1b) is an abstraction level that addresses management of data analytics functions. These abstraction concepts, which arrive at different speeds stochastically, must be mapped to real runtime mechanisms. This orchestration and configuration cannot occur in an easy and predictable manner because of the limited resources in the runtime mechanisms layer. If there existed unlimited, tightly coupled resources as in the cloud, then the mapping would be ideal, because each new function would be mapped to a new resource runtime mechanism in the described monitor, analyze, plan, and execute loop.
Resource availability in edge computing complicates the execution of the activities planned by the orchestration layer. Multiple methods can address the problem of limited available resources and the required processing, communication, or storage demands, such as inserting queues in front of each resource in the pool, or a common (grouped) queue.
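The common-queue idea above can be sketched as follows: when the pool has fewer free resources than incoming functions, requests wait in one shared queue until a slot frees up. All names are illustrative assumptions.

```python
from collections import deque

class ResourcePool:
    def __init__(self, slots: int):
        self.free = slots
        self.queue = deque()  # one common (grouped) queue for the whole pool

    def submit(self, task: str) -> str:
        if self.free > 0:
            self.free -= 1
            return f"running {task}"
        self.queue.append(task)  # no free slot: wait in the shared queue
        return f"queued {task}"

    def release(self):
        """A slot freed up; dispatch the oldest waiting task, if any."""
        self.free += 1
        if self.queue:
            self.free -= 1
            return f"running {self.queue.popleft()}"
        return None

pool = ResourcePool(slots=1)
print(pool.submit("f1"), "|", pool.submit("f2"), "|", pool.release())
# running f1 | queued f2 | running f2
```

A per-resource queue variant would simply attach one such queue to each resource instead of sharing it across the pool.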
Scheduling tasks over provisioned resources is by itself a challenge, as it is an NP-hard, real-time problem. Scheduling in the edge's loosely coupled infrastructure is even more challenging than in the cloud's tightly coupled one. Even if the edge provisions elastic resources, the edge's distribution will generate huge network latency because of its slower wide area network (WAN) compared to the cloud's high-speed LAN. Therefore, the platform should provide lightweight algorithms for optimizing real-time decision making for scheduling, considering multiple conflicting criteria, such as energy efficiency, dependability, real-time response, resiliency, reliability, time-predictability, fault tolerance, and system cost. To reduce the tradeoff between the acceptable schedule and the one that can be calculated fastest (in real time), a footprint analyzer can be implemented, whose historical outputs can be used
Related Work in Scalable Data Analytics Applications
Traditionally, scalable data analytics applications have been realized with cloud-supported, distributed, data-stream processing systems. Maintaining low end-to-end latencies under high data velocity is a major challenge for such systems, particularly in large-scale Internet of Things scenarios. Systems like StreamCloud1 and Twitter Heron2 were developed to handle massive amounts of data, using concepts such as auto-parallelization of stream operators, clustering, and elastic scaling. While these approaches address scalability issues, they don't consider edge-specific features such as locality awareness, which are crucial for achieving low-latency, real-time analytics.
Different approaches extended traditional stream processing with novel algorithms for deploying and scheduling operators at the edge. For example, Apostolos Papageorgiou and colleagues extended the stream topology deployment algorithm of Apache Storm.3 In particular, their deployment optimization approach incorporates quality of service (QoS) metrics of topology-external interactions to reduce communication latencies within stream-processing topologies. Valeria Cardellini and colleagues propose a similar approach, where the scheduling algorithm takes into account network QoS metrics, such as latency, between stream operators.4
Only a few efforts have been made to develop novel architectures for data analytics platforms in the edge. Mahadev Satyanarayanan and colleagues propose GigaSight, a hybrid architecture for computer-vision analytics based on cloudlets.5 In GigaSight, cloudlets filter and process mobile video streams in near real time. Only processing results, such as recognized objects, and corresponding metadata are sent to the cloud, thereby reducing end-to-end latencies as well as bandwidth usage.
References
1. P. Valduriez et al., "StreamCloud: An Elastic and Scalable Data Streaming System," IEEE Trans. Parallel and Distributed Systems, vol. 23, no. 12, 2012, pp. 2351–2365.
2. S. Kulkarni et al., "Twitter Heron: Stream Processing at Scale," Proc. ACM Sigmod Int'l Conf. Management of Data, 2015, pp. 239–250.
3. A. Papageorgiou, E. Poormohammady, and B. Cheng, "Edge-Computing-Aware Deployment of Stream Processing Tasks Based on Topology-External Information: Model, Algorithms, and A Storm-Based Prototype," Proc. IEEE Int'l Conf. Big Data, 2016, pp. 259–266.
4. V. Cardellini et al., "On QoS-Aware Scheduling of Data Stream Applications Over Fog Computing Infrastructures," Proc. IEEE Symp. Computers and Comm., 2015, pp. 271–276.
5. M. Satyanarayanan et al., "Edge Analytics in the Internet of Things," IEEE Pervasive Computing, vol. 14, no. 2, 2015, pp. 24–31.
www.computer.org/computingedge 33
A Serverless Real-Time Data Analytics Platform for Edge Computing
JULY/AUGUST 2017 69
as heuristics to predict edge resource behavior without interrupting real-time data analytics.
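As a rough sketch of such a lightweight scheduling decision, the following hypothetical footprint analyzer predicts latency from historical observations and feeds a weighted multi-criteria score. The resource names, weights, and normalized criterion values are invented for illustration; the article does not prescribe this particular scoring scheme:

```python
def predicted_latency(history, default=50.0):
    """Stand-in footprint analyzer: predict a resource's next
    latency (ms) as the mean of its recorded observations."""
    return sum(history) / len(history) if history else default

def choose_resource(resources, weights):
    """Lightweight decision: pick the resource minimizing a weighted
    sum of (normalized) criteria, where lower is better for each."""
    def score(res):
        return sum(w * res[criterion] for criterion, w in weights.items())
    return min(resources, key=score)

# Hypothetical candidates; criterion values are already normalized to [0, 1].
candidates = [
    {"name": "edge-1", "latency": predicted_latency([20, 40]) / 100,
     "energy": 0.5, "cost": 0.2},
    {"name": "cloud-1", "latency": predicted_latency([]) / 100,
     "energy": 0.3, "cost": 0.6},
]
best = choose_resource(candidates, {"latency": 0.5, "energy": 0.3, "cost": 0.2})
```

In this toy example the historical footprint favors the nearby edge resource; tuning the weights trades latency against energy and cost.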
Resource Virtualization and Heterogeneous Infrastructure Mapping

Most of the data collection devices (sensors) in the infrastructure resource pool are abstracted by their virtualized model in the resources abstraction layer. The main idea of the model is to transfer data and realize the data analytics functions via cloud/fog computing servers.
The autonomous operation of IoT devices is another challenge our model must address. Autonomous processing means that certain data analytics algorithms are also performed at a lower level; that is, the processing is located closer to the source (in the edge). This leads to a dew computing design, with a dew server (edge device) on the path between the IoT devices/sensors and the cloud. Our model implements this autonomous function by mapping the main data analytics functions directly to the IoT devices.
In this context, an open challenge is interoperability and integration between different device implementations at the infrastructure level. Our resource virtualization layer directly addresses these issues, because each device is represented by its functions.
Finally, portability at the lower layer means transferring defined functions between various devices. This is addressed, for example, by the fault tolerance and elasticity runtime mechanisms, which replicate a function or enable a spare device that continues performing the defined functions if a resource fails, or that shares the load if the load increases.
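A minimal sketch of this spare-device failover behavior, with hypothetical device callables standing in for real virtualized resources (the function names are invented for illustration):

```python
def run_with_failover(devices, payload):
    """Portability sketch: try each device hosting the virtualized
    function in turn; if one fails, a spare continues the work."""
    last_error = None
    for device in devices:
        try:
            return device(payload)
        except RuntimeError as err:  # treat as a device failure
            last_error = err
    raise RuntimeError("all devices failed") from last_error

def failing_device(payload):
    raise RuntimeError("sensor node offline")

def spare_device(payload):
    return f"processed:{payload}"
```

Here the first device fails, and the spare transparently takes over the defined function.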
Rapid Elasticity at the Edge

Cloud-based serverless platforms use commodity infrastructure, and their workloads have a small footprint and short execution duration, combined with statistical multiplexing of a large number of heterogeneous workloads over time.9 The edge, on the other hand, has different characteristics because of its different infrastructure: its deployment and geographic dispersion, the topology of network connectivity, and locality awareness. One challenge is the discoverability of resources. Given the number of heterogeneous devices, standard discovery mechanisms could become burdened with massive workloads; as a solution, and to ensure elasticity, dynamic strategies should be in place for sharing resources. Another perspective is geographic dispersion: edge devices are scattered around the network and have limited capacity. A framework should be established that can dynamically select appropriate devices with regard to proximity to data sources, while considering latency and mobility. The framework should differentiate critical and noncritical functions by setting priorities for services in the case of a heavy workload in a single location. The topology of network connectivity represents another issue in rapid provisioning of processing capabilities. It should allow imperceptible handling of any increased workload without increasing latencies, and cope with heterogeneous devices while maintaining a secure environment. As a result, the communication protocols among the nodes should be carefully selected and established.
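Such a selection framework might, as a sketch, rank candidate devices by latency while honoring capacity and criticality. All device records and field names below are hypothetical, invented to illustrate the proximity-plus-priority idea:

```python
def select_device(devices, needed_capacity, critical=False):
    """Pick the lowest-latency device with enough free capacity.
    Under heavy load, devices reserved for critical functions are
    off-limits to noncritical requests."""
    eligible = [d for d in devices
                if d["free_capacity"] >= needed_capacity
                and (critical or not d["reserved"])]
    return min(eligible, key=lambda d: d["latency_ms"]) if eligible else None

# Hypothetical edge/cloud candidates near a data source.
devices = [
    {"name": "gw-near", "latency_ms": 5, "free_capacity": 1, "reserved": True},
    {"name": "gw-mid", "latency_ms": 20, "free_capacity": 4, "reserved": False},
    {"name": "dc-far", "latency_ms": 80, "free_capacity": 64, "reserved": False},
]
```

A critical function gets the nearest (reserved) gateway; a noncritical one is pushed to the next-closest device; an oversized request falls through to no match, signaling that provisioning is needed.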
Data Management

Apart from the five Vs (volume, velocity, variety, veracity, and value) of big data analytics, some real-time analytics applications require additional data management, which should be performed in the edge as well. The data must be preprocessed before going through the transformation functions. Further, data ingestion, transcription services, deduplication, and natural language processing algorithms are required.
Many real-time analytics, such as our first use case, require updating the model in real time according to a shorter data horizon. This requirement favors incremental learning algorithms, which compensate for the edge's lack of resources compared to the cloud. The lack of storage capacity, in turn, is offset by faster data obsolescence, that is, prediction over smaller data series. The second use case is a good representative of this. In both use cases, the data can be forgotten or destroyed after being processed, which mitigates the risk of data leakage and brings data privacy and security to an acceptable level. For other real-time analytics, with longer data horizons or slower data obsolescence, the same challenges remain as in the cloud.
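An incremental learner over a short data horizon can be sketched as an exponentially weighted moving average. This toy predictor (not the article's actual algorithm) updates its state and then discards each sample, illustrating the forget-after-processing idea:

```python
class IncrementalPredictor:
    """Tiny incremental model: an exponentially weighted moving
    average. Each sample updates the running estimate and is then
    discarded, so no raw data needs to be stored at the edge."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha       # weight of the newest sample
        self.estimate = None     # the only state retained

    def update(self, x):
        if self.estimate is None:
            self.estimate = float(x)
        else:
            self.estimate = self.alpha * x + (1 - self.alpha) * self.estimate
        return self.estimate

p = IncrementalPredictor(alpha=0.5)
estimates = [p.update(x) for x in (10, 20, 20)]
```

Only one number survives each update, so storage stays constant regardless of how long the stream runs, and older data decays out of the estimate on its own.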
To tackle these requirements, one challenge is to design a whole workflow of dependent processes in the orchestration layer. Placement, QoS, and fault tolerance should be added to the core runtime mechanisms for these types of real-time analytics, as straggling or failing execution nodes could lead to service-level agreement violations.
Despite the elastic computing power of the cloud, along with high-speed networks, real-time analytics in the novel edge computing landscape is becoming ever more challenging. Our platform facilitates real-time data analytics over cloud and edge computing in a uniform manner. The proposed serverless stream model makes the underused underlying heterogeneous edge infrastructure transparent, and enables easier and more intuitive development of various real-time analytics functions,
without worrying about nonfunctional requirements.
Our model will shift the current view of centralized, on-premises and cloud real-time analytics toward more distributed, ubiquitous, edge-based real-time analytics, in which the data's value won't be lost at the edge and all computing layers will be used evenly. Our vision is that all computing layers work together like a team, without making the cloud the team's most important player. Our platform will act as a team manager that follows the road map toward a fully fledged platform for cloud and edge real-time data analytics. It will overcome the challenges of provisioning data analytics functions at edge resources, making them highly available and rapidly elastic, and processing and managing the data in real time without (or before) passing it to the cloud.
Acknowledgments
This work is partially supported by JPI Urban Europe, ERA-NET, under project 5631209, and by bilateral MKD-AUT project 18779 (Scalability and Elasticity Performance of Cloud Services).
References
1. M. Satyanarayanan, "The Emergence of Edge Computing," Computer, vol. 50, no. 1, 2017, pp. 30–39.
2. V. Bahl, "Cloud 2020: Emergence of Micro Data Centers (Cloudlets) for Latency Sensitive Computing," keynote address, Devices and Networking Summit, 21 Apr. 2015; https://channel9.msdn.com/Events/Microsoft-Research/Devices-and-Networking-Summit-2015/Closing-Keynote.
3. F. Bonomi et al., "Fog Computing and Its Role in the Internet of Things," Proc. MCC Workshop Mobile Cloud Computing, 2012, pp. 13–16.
4. P. Valduriez et al., "StreamCloud: An Elastic and Scalable Data Streaming System," IEEE Trans. Parallel and Distributed Systems, vol. 23, no. 12, 2012, pp. 2351–2365.
5. S. Kulkarni et al., "Twitter Heron: Stream Processing at Scale," Proc. ACM Sigmod Int'l Conf. Management of Data, 2015, pp. 239–250.
6. S. Hendrickson et al., "Serverless Computation with OpenLambda," Proc. Usenix Conf. Hot Topics in Cloud Computing, 2016, pp. 33–39.
7. S. Nastic et al., "Provisioning Software-Defined IoT Cloud Systems," Proc. Int'l Conf. Future Internet of Things and Cloud, 2014; doi:10.1109/FiCloud.2014.52.
8. S. Nastic, H.-L. Truong, and S. Dustdar, "A Middleware Infrastructure for Utility-Based Provisioning of IoT Cloud Systems," Proc. 1st IEEE/ACM Symp. Edge Computing, 2016; doi:10.1109/SEC.2016.35.
9. D. Breitgand et al., "SLA-Aware Resource Over-Commit in an IaaS Cloud," Proc. 8th Int'l Conf. Network and Service Management, 2012, pp. 73–81.
Stefan Nastic is a postdoctoral research assistant at the Distributed Systems Group (DSG), TU Wien, Austria. His research interests include Internet of Things (IoT) and edge computing, cloud computing, big data analytics, and smart cities. Nastic has a DrTech in programming, provisioning, and governing IoT cloud systems from TU Wien. Contact him at [email protected].

Thomas Rausch is a PhD student at the DSG, TU Wien, Austria. His research interests include IoT, edge computing, and event-based systems. Rausch has an MS in software engineering and Internet computing from TU Wien. Contact him at [email protected].

Ognjen Scekic is a postdoctoral university assistant at the DSG, TU Wien, Austria. His research interests include social computing, collective adaptive systems, and smart cities. Scekic has a PhD in automated incentive management for social computing (foundations, models, tools and algorithms) from TU Wien. Contact him at [email protected].

Schahram Dustdar is a full professor of computer science, and he heads the DSG at TU Wien. His work focuses on distributed systems. Dustdar is an IEEE Fellow, a member of the Academia Europaea, an ACM Distinguished Scientist, and recipient of the IBM Faculty Award. He is on the editorial boards of IEEE Internet Computing and Computer. He's an associate editor of IEEE Transactions on Services Computing, ACM Transactions on the Web, and ACM Transactions on Internet Technology. He's the editor-in-chief of Springer Computing. Contact him at [email protected]; http://dsg.tuwien.ac.at/staff/sd/.

Marjan Gusev is a professor at Ss. Cyril and Methodius University, Skopje, Macedonia. His research interests include IoT, cloud computing, and eHealth solutions. Gusev has a PhD in electrical sciences from the University of Ljubljana. Contact him at [email protected].

Bojana Koteska is a PhD student and teaching and research assistant at Ss. Cyril and Methodius University. Her research interests include scientific and cloud computing, software quality, and distributed computing. Koteska has an MS in software engineering from Ss. Cyril and Methodius University. Contact her at [email protected].

Magdalena Kostoska is an assistant professor at Ss. Cyril and Methodius University. Her research interests include cloud computing and IoT. Kostoska has a PhD in cloud computing from Ss. Cyril and Methodius University. Contact her at [email protected].

Boro Jakimovski is an associate professor at Ss. Cyril and Methodius University. His research interests include grid computing, high-performance computing, parallel and distributed processing, and genetic algorithms. Jakimovski has a PhD in distributed systems from Ss. Cyril and Methodius University. Contact him at [email protected].

Sasko Ristov is a university assistant (postdoc) at the University of Innsbruck, Austria,
and an assistant professor at Ss. Cyril and Methodius University, Skopje, Macedonia. His research interests include performance modeling and optimization and parallel and distributed systems. Ristov has a PhD degree in computer science from Ss. Cyril and Methodius University. Contact him at

Radu Prodan is an associate professor at the University of Innsbruck, Austria. His research interests include parallel and distributed systems and software tools (performance, debugging, scheduling, fault tolerance, and energy). Prodan has a habilitation degree in computer science from the University of Innsbruck. He's an associate editor of IEEE Transactions on Parallel and Distributed Systems. Contact him at [email protected].
This article originally appeared in IEEE Internet Computing, vol. 21, no. 4, 2017.
36 June 2019 Published by the IEEE Computer Society 2469-7087/19/$33.00 © 2019 IEEE
REDIRECTIONS
MARCH/APRIL 2018 | IEEE SOFTWARE 97
Furthermore, to make those estimates, we need only a remarkably small number of attributes. For example, feature subset selection (FSS) is an automatic technique for finding which attributes we can remove without damaging our ability to make a prediction from the data. Recent results show that traditional FSS methods (for example, stepwise regression) can be improved by AI search algorithms that quickly search very large subsets of the attributes to find the most useful ones.7 Applying FSS to software defect predictors or software effort estimators often reduces datasets with 24 to 42 attributes to sets with only two or three attributes.5,8 This means that (24 − 3)/24 ≈ 88 percent to (42 − 3)/42 ≈ 93 percent of the collected attributes aren't essential to predicting software quality. That is, much of what we thought was important for quality prediction turns out to be mostly irrelevant. And, more interestingly, we can't tell beforehand which attributes will be the most useful9 until we test those attributes on real project data.
This result is surprising, to say the least. Software engineers tend to emphasize the complexities, rather than the simplicity, of software projects. Much has been written about what factors might influence a software project, so developers often spend much effort collecting dozens of attributes. Yet for the projects I've mentioned, nearly all those attributes are irrelevant for prediction.
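One simple FSS strategy is greedy forward selection. The sketch below is illustrative only: the scoring function and attribute names are invented, and it stands in for the more sophisticated AI search methods the column cites. It adds attributes only while they improve the score, which is how a handful of attributes can emerge from dozens:

```python
def forward_select(features, score, max_features=3):
    """Greedy forward feature subset selection: repeatedly add the
    feature that most improves score(subset); stop when no feature
    helps or the size cap is reached."""
    selected, best = [], score([])
    while len(selected) < max_features:
        gains = [(score(selected + [f]), f)
                 for f in features if f not in selected]
        top_score, top_f = max(gains)
        if top_score <= best:          # nothing improves the score
            break
        selected.append(top_f)
        best = top_score
    return selected, best

def toy_score(subset):
    # Invented evaluator: rewards two genuinely predictive
    # attributes and mildly penalizes subset size.
    return len(set(subset) & {"loc", "complexity"}) - 0.1 * len(subset)

features = ["loc", "complexity", "os", "editor"]
selected, _ = forward_select(features, toy_score)
```

In practice `score` would be cross-validated predictive accuracy on real project data; the greedy loop then discards the attributes that carry no signal.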
Important Questions

Why are so many things irrelevant? Software engineering data often contains much noise. Collecting data from multiple projects is difficult because the collected data's meaning can vary from project to project. We can remove such noisy attributes without damaging predictive prowess.
We should also remove most of the closely associated attributes. Suppose a software company assigns its most skilled programmers to mission-critical projects. In that data, "programming skill" would be associated with "criticality." We could dispense with either (but not both) of those attributes without losing important information.
In addition, there's the effect of context. Figure 1 shows data from NASA regarding software projects at the Jet Propulsion Laboratory (JPL). Most of the projects are of high complexity. Thus, feature selection would tend to delete "high complexity" because it's (mostly) a constant across all the data. That is, although no one doubts that software complexity contributes to software cost, for the JPL data, it's mostly irrelevant.
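The "mostly constant" check is easy to mechanize. This sketch (threshold and sample data are illustrative, not taken from the JPL dataset) flags an attribute whose most common value dominates the column:

```python
from collections import Counter

def near_constant(column, threshold=0.9):
    """Flag an attribute whose most common value covers at least
    `threshold` of the rows; such columns carry little signal
    (like 'high complexity' in the JPL data) and can be dropped."""
    counts = Counter(column)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(column) >= threshold
```

Running it over each column of a project table gives a quick first pass at pruning context-constant attributes before any heavier feature selection.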
So, what does this mean for the practice of analytics? The previous examples tell us that real software projects can surprise and confound our expectations. In new projects, we should check all expectations (that some factor contributes to software quality). That's the bad news. The good news is that such checks are now fast to run, given the ready availability of data-mining tools (and developers skilled in using them).
More generally, note how these examples are all motivation for this new Redirections department. Our field is rife with any number of truisms that are commonly quoted but rarely checked. Perhaps it's time to reverse that trend. Let's all look over old results in software engineering with a fresh eye and ask, "Which of those results are most applicable?" and "Can we confirm those results using contemporary data?" Hopefully, this department will prompt many such inquiries.
References
1. R. Prikladnicki and T. Menzies, "From Voice of Evidence to Redirections," IEEE Software, vol. 35, no. 1, 2018, pp. 11–13.
2. T. Menzies and T. Zimmermann, "Software Analytics: So What?," IEEE Software, vol. 30, no. 4, 2013,
FIGURE 1. The distribution of complexity in a NASA project dataset.5 Complexity is a constant across nearly all the data, so it could be removed from consideration as an attribute during software analytics. (Histogram: number of projects versus complexity, rated low, nominal, high, very high, and extra high.)
96 IEEE SOFTWARE | PUBLISHED BY THE IEEE COMPUTER SOCIETY | 0740-7459/18/$33.00 © 2018 IEEE
REDIRECTIONS | Editor: Tim Menzies, North Carolina State University, [email protected]
The Unreasonable Effectiveness of Software Analytics
Tim Menzies
AS RAFAEL PRIKLADNICKI and I commented in last issue’s Voice of Evidence article, it’s time to ask, “What’s surprising about software engineering?”1 Accordingly, in this article, I explore one of the great mysteries of software analytics: why does it work at all?
Software analytics distills large amounts of low-value data into small chunks of very-high-value data. Such chunks are often predictive; that is, they can offer a somewhat accurate prediction about some quality attribute of future projects, for example, the location of potential defects or the development cost.
In theory, software analytics shouldn't work because software project behavior shouldn't be predictable. Consider the wide, ever-changing range of tasks being implemented by software and the diverse, continually evolving tools used for software's construction (for example, IDEs and version control tools). Let's make that worse. Now consider the constantly changing platforms on which the software executes (desktops, laptops, mobile devices, RESTful services, and so on) or the system developers' varying skills and experience.
Given all that complex and continual variability, every software project could be unique. And, if that were true, any lesson learned from past projects would have limited applicability for future projects.
This turns out not to be the case. One of the lessons of software analytics is that software projects have predictable properties2 and that at least some of those properties hold for future projects. Stranger still, the number of variables required to make those predictions is small—which means that most of the things we think might affect software quality have little impact in practice.
Not as Complex as We Thought
Consider the task of predicting how long it takes to build software. Given dozens of attributes describing a software project, we can usually guess that project's development time. We can do this using qualitative methods (for example, planning poker,3 which is favored by the agile community) or parametric-modeling methods (favored by large government projects4,5). However we do it, such estimates are surprisingly accurate.3,5,6
Call for Submissions
Do you have a surprising result or industrial experience? Something that challenges decades of conventional thinking in software engineering? If so, email a one-paragraph synopsis to [email protected] (use the subject line "REDIRECTIONS: Idea: [your idea]"). If that looks interesting, I'll ask you to submit a 1,000- to 2,400-word article (where each graph, table, or figure is worth 250 words) for review for IEEE Software. Note: Heresies are more than welcome (if supported by well-reasoned industrial experiences, case studies, or other empirical results). —Tim Menzies
Furthermore, to make those estimates, we need only a remarkably small number of attributes. For example, feature subset selection (FSS) is an automatic technique for finding what attributes we can remove without damaging our ability to make a prediction from the data. Recent results show that traditional FSS methods (for example, stepwise regression) can be improved by AI search algorithms that quickly search very large subsets of the attributes to find the most useful ones.7 Applying FSS to software defect predictors or software effort estimators often reduces datasets with 24 to 42 attributes to sets with only two or three attributes.5,8 This means that (24 − 3)/24 ≈ 88 percent to (42 − 3)/42 ≈ 93 percent of the collected attributes aren't essential to predicting software quality. That is, much of what we thought was important for quality prediction turns out to be mostly irrelevant. And, more interestingly, we can't tell beforehand which attributes will be the most useful9 until we test those attributes on real project data.
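Checks like this are indeed quick to run with stock tools. As a minimal sketch (not the exact method from the cited studies), scikit-learn's recursive feature elimination can rank the attributes of a dataset and show how few survive; the "project" data here is synthetic and purely illustrative.

```python
# Sketch: feature subset selection on a synthetic "effort" dataset.
# Illustrative only -- the data is made up; 24 attributes, 3 informative.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# 200 "projects" described by 24 attributes, only 3 of which matter.
X, y = make_regression(n_samples=200, n_features=24, n_informative=3,
                       noise=10.0, random_state=0)

# Recursively drop the weakest attribute until 3 remain.
selector = RFE(LinearRegression(), n_features_to_select=3).fit(X, y)
kept = np.flatnonzero(selector.support_)
print(f"kept {len(kept)} of 24 attributes: {kept}")
print(f"discarded {(24 - len(kept)) / 24:.0%} of the attributes")
```

With 3 of 24 attributes kept, the discard rate matches the (24 − 3)/24 ≈ 88 percent figure quoted above.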
This result is surprising, to say the least. Software engineers tend to emphasize the complexities, rather than the simplicity, of software projects. Much has been written about what factors might influence a software project, so developers often spend much effort collecting dozens of attributes. Yet for the projects I've mentioned, nearly all those attributes are irrelevant for prediction.
Important Questions
Why are so many things irrelevant? Software engineering data often contains much noise. Collecting data from multiple projects is difficult because the collected data's meaning can vary from project to project.
We can remove such noisy attributes without damaging predictive prowess.
We should also remove most of the closely associated attributes. Suppose a software company assigns its most skilled programmers to mission-critical projects. In that data, "programming skill" would be associated with "criticality." We could dispense with either (but not both) of those attributes without losing important information.
In addition, there’s the effect of context. Figure 1 shows data from NASA regarding software projects at the Jet Propulsion Laboratory (JPL). Most of the projects are of high complexity. Thus, feature selection would tend to delete “high complex-ity” because it’s (mostly) a constant across all the data. That is, although no one doubts that software com-plexity contributes to software cost, for the JPL data, it’s mostly irrelevant.
So, what does this mean for the practice of analytics? The previous examples tell us that real software projects can surprise and confound our expectations. In new projects, we should check all expectations (that some factor contributes to software quality). That's the bad news. The good news is that such checks are now fast to run, given the ready availability of data-mining tools (and developers skilled in using them).
More generally, note how these examples are all motivation for this new Redirections department. Our field is rife with any number of truisms that are commonly quoted but rarely checked. Perhaps it's time to reverse that trend. Let's all look over old results in software engineering with a fresh eye and ask, "Which of those results are most applicable?" and "Can we confirm those results using contemporary data?" Hopefully, this department will prompt many such inquiries.
References
1. R. Prikladnicki and T. Menzies, "From Voice of Evidence to Redirections," IEEE Software, vol. 35, no. 1, 2018, pp. 11–13.
2. T. Menzies and T. Zimmermann, "Software Analytics: So What?," IEEE Software, vol. 30, no. 4, 2013,
[Figure 1: bar chart of the number of projects at each complexity rating (Low, Nominal, High, Very High, Extra High).]
FIGURE 1. The distribution of complexity in a NASA project dataset.5 Complexity is a constant across nearly all the data, so it could be removed from consideration as an attribute during software analytics.
pp. 31–37; doi:10.1109/MS.2013.86.
3. K. Molokken-Ostvold and N.C. Haugen, "Combining Estimates with Planning Poker—an Empirical Study," Proc. 18th Australian Software Eng. Conf. (ASWEC 07), 2007, pp. 349–358; doi:10.1109/ASWEC.2007.15.
4. F. Sarro, A. Petrozziello, and M. Harman, "Multi-objective Software Effort Estimation," Proc. 38th Int'l Conf. Software Eng. (ICSE 16), 2016, pp. 619–630; doi:10.1145/2884781.2884830.
5. Z. Chen et al., "Finding the Right Data for Software Cost Modeling," IEEE Software, vol. 22, no. 6, 2005, pp. 38–46; doi:10.1109/MS.2005.151.
6. F. Zhang et al., "Towards Building a Universal Defect Prediction Model with Rank Transformed Predictors," Empirical Software Eng., vol. 21, no. 5, 2016, pp. 2107–2145.
7. M.A. Hall and G. Holmes, "Benchmarking Attribute Selection Techniques for Discrete Class Data Mining," IEEE Trans. Knowledge and Data Eng., vol. 15, no. 6, 2003, pp. 1437–1447.
8. T. Menzies et al., "Defect Prediction from Static Code Features: Current Results, Limitations, New Approaches," Automated Software Eng., vol. 17, no. 4, 2010, pp. 375–407; doi:10.1007/s10515-010-0069-5.
9. R. Krishna and T. Menzies, "Bellwethers: A Baseline Method for Transfer Learning," 3 Dec. 2017; arxiv.org/abs/1703.06218.
ABOUT THE AUTHOR
TIM MENZIES is a full professor at North Carolina State University, where he leads the RAISE (Real-World AI for Software Engineering) research group. Contact him at [email protected]; menzies.us.
This article originally appeared in IEEE Software, vol. 35, no. 2, 2018.
2469-7087/19/$33.00 © 2019 IEEE | Published by the IEEE Computer Society | June 2019
Multimodal Sentiment Analysis: Addressing Key Issues and Setting Up the Baselines
We compile baselines, along with dataset splits, for multimodal sentiment analysis. In this paper, we explore three different deep-learning-based architectures for multimodal sentiment classification, each improving upon the previous. Further, we evaluate these architectures with multiple datasets with a fixed train/test partition. We also discuss some major issues frequently ignored in multimodal sentiment analysis research, e.g., the role of speaker-exclusive models, the importance of different modalities, and generalizability. This framework illustrates the different facets of analysis to be considered while performing multimodal sentiment analysis and, hence, serves as a new benchmark for future research in this emerging field.
Emotion recognition and sentiment analysis are opening up numerous opportunities pertaining to social media in terms of understanding users' preferences, habits, and their contents.10 With the advancement of communication technology, an abundance of mobile devices, and the rapid rise of social media, a large amount of data is being uploaded as video, rather than text.2 For example, consumers tend to record their opinions on products using a webcam and upload them on social media platforms, such as YouTube and Facebook, to inform subscribers of their views. Such videos often contain comparisons of products from competing brands, pros and cons of product specifications, and other information that can aid prospective buyers in making informed decisions.
Soujanya Poria, Nanyang Technological University
Navonil Majumder, Instituto Politécnico Nacional
Devamanyu Hazarika, National University of Singapore
Erik Cambria, Nanyang Technological University
Alexander Gelbukh, Instituto Politécnico Nacional
Amir Hussain, Edinburgh Napier University

DEPARTMENT: Affective Computing and Sentiment Analysis
Editor: Erik Cambria, [email protected]
IEEE Intelligent Systems, November/December 2018 | Published by the IEEE Computer Society | 1541-1672/18/$33.00 © 2018 IEEE
The primary advantage of analyzing videos over mere text analysis, for detecting emotions and sentiment, is the surplus of behavioral cues. Videos provide multimodal data in terms of vocal and visual modalities. The vocal modulations and facial expressions in the visual data, along with text data, provide important cues to better identify true affective states of the opinion holder. Thus, a combination of text and video data helps to create a better emotion and sentiment analysis model.
Recently, a number of approaches to multimodal sentiment analysis producing interesting results have been proposed.11,13 However, there are major issues that remain mostly unaddressed in this field, such as the consideration of context in classification, the effect of speaker-inclusive and speaker-exclusive scenarios, the impact of each modality across datasets, and the generalization ability of a multimodal sentiment classifier. Not tackling these issues has made it difficult to compare different multimodal sentiment analysis methods effectively. In this paper, we outline some methods that address these issues and set up a baseline based on state-of-the-art methods. We use a deep convolutional neural network (CNN) to extract features from the visual and text modalities.
This paper is organized as follows: The "Related Work" section provides a brief literature review on multimodal sentiment analysis. The "Unimodal Feature Extraction" section briefly discusses the baseline methods; experimental results and discussion are given in the "Experiments and Observations" section; and, finally, the "Conclusion" section concludes the paper.
RELATED WORK
In 1970, Ekman et al.6 carried out extensive studies on facial expressions. Their research showed that universal facial expressions are able to provide sufficient clues to detect emotions. Recent studies on speech-based emotion analysis4 have focused on identifying relevant acoustic features, such as fundamental frequency (pitch), the intensity of utterance, bandwidth, and duration.
As to fusing audio and visual modalities for emotion recognition, two of the early works were done by De Silva et al.5 and Chen et al.3 Both works showed that a bimodal system yielded a higher accuracy than any unimodal system.
While there are many research papers on audio-visual fusion for emotion recognition, only a few research works have been devoted to multimodal emotion or sentiment analysis using text clues along with visual and audio modalities. Wollmer et al.14 fused information from audio, visual, and text modalities to extract emotion and sentiment. Metallinou et al.8 fused audio and text modalities for emotion recognition. Both approaches relied on feature-level fusion.
In this paper, we study the behavior of the method proposed in Poria et al.,12 in aspects rarely addressed by other authors, such as speaker independence, the generalizability of the models, and the performance of individual modalities.
UNIMODAL FEATURE EXTRACTION
For the unimodal feature extraction, we follow the procedures of bc-LSTM.12
Textual Feature Extraction
We employ a CNN for textual feature extraction. Following Kim,16 we obtain n-gram features from each utterance using three distinct convolution filters of sizes 3, 4, and 5, respectively, each having 50 feature maps. Outputs are then subjected to max-pooling followed by rectified linear unit (ReLU) activation. These activations are concatenated and fed to a 100-dimensional (100-D) dense layer, which is regarded as the textual utterance representation. This network is trained at the utterance level with the emotion labels.
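For readers who want to reproduce this pipeline, a minimal PyTorch sketch of the Kim-style text CNN just described follows. The filter sizes (3, 4, 5), the 50 feature maps per filter, and the 100-D dense layer come from the text; the vocabulary size, embedding dimension, and number of emotion classes are illustrative assumptions.

```python
# Sketch of the Kim-style text CNN described above (PyTorch).
# vocab, emb, and n_classes are illustrative assumptions; filter sizes
# 3/4/5 with 50 feature maps each and the 100-D dense layer follow the text.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab=5000, emb=300, n_classes=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb, 50, kernel_size=k) for k in (3, 4, 5))
        self.dense = nn.Linear(3 * 50, 100)   # 100-D utterance representation
        self.out = nn.Linear(100, n_classes)  # trained on emotion labels

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)   # (batch, emb, seq_len)
        # Max-pool each convolution over time, apply ReLU, then concatenate.
        feats = [torch.relu(c(x).max(dim=2).values) for c in self.convs]
        rep = self.dense(torch.cat(feats, dim=1))  # utterance representation
        return self.out(rep)

logits = TextCNN()(torch.randint(0, 5000, (8, 20)))
print(logits.shape)  # torch.Size([8, 4])
```

In the paper's pipeline, the 100-D `dense` output (not the logits) would be used downstream as the textual feature vector.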
Audio and Visual Feature Extraction
Identical to Poria et al.,12 we use a 3-D-CNN and openSMILE7 for visual and acoustic feature extraction, respectively.
Fusion
In order to fuse the information extracted from the different modalities, we concatenated the feature vectors representative of the given modalities and sent the combined vector to a classifier for classification. This scheme is called feature-level fusion. Since the fusion involved only concatenation, with no overlapping, merging, or combination, scaling and normalization of the features were avoided. We discuss the results of this fusion in the "Experiments and Observations" section.
Baseline Method
1. bc-LSTM: We follow the bc-LSTM method,12 which uses a bidirectional LSTM to capture the context from the surrounding utterances and generate a context-aware utterance representation.
2. SVM: After extracting the features, we merged them and sent them to an SVM with an RBF kernel for the final classification.
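The concatenation-based fusion and SVM baseline can be sketched as follows. The three feature matrices are random stand-ins for the real text, audio, and visual features produced by the extractors above, and their dimensions are assumptions (only the 100-D textual representation is stated in the text).

```python
# Sketch: feature-level fusion by concatenation, then an RBF-kernel SVM.
# The three matrices are random stand-ins for text/audio/visual features;
# the audio and visual dimensions are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
text = rng.normal(size=(n, 100))    # 100-D textual representation (per the text)
audio = rng.normal(size=(n, 60))    # stand-in for openSMILE features
video = rng.normal(size=(n, 30))    # stand-in for 3-D-CNN features
labels = rng.integers(0, 2, size=n)

# Concatenate per-utterance vectors -- no scaling or normalization,
# matching the fusion scheme described above -- and classify.
fused = np.concatenate([text, audio, video], axis=1)
clf = SVC(kernel="rbf").fit(fused, labels)
print(fused.shape, clf.predict(fused[:1]))
```

The fused vector is simply the 190-D concatenation of the three per-utterance feature vectors.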
EXPERIMENTS AND OBSERVATIONS
In this section, we discuss the datasets and the experimental settings. We also analyze the results yielded by the aforementioned methods.
Datasets
1. Multimodal Sentiment Analysis Datasets: For our experiments, we used the MOUD dataset, developed by Perez-Rosas et al.9 They collected 80 product review and recommendation videos from YouTube. Each video was segmented into its utterances (498 in total), and each of these was categorized by a sentiment label (positive, negative, and neutral). On average, each video has six utterances, and each utterance is five seconds long. In our experiment, we did not consider neutral labels, which led to a final dataset of 448 utterances. We dropped the neutral label to maintain consistency with previous work. In a similar fashion, Zadeh et al.15 constructed a multimodal sentiment analysis dataset called multimodal opinion-level sentiment intensity (MOSI), which is bigger than MOUD, consisting of 2,199 opinionated utterances and 93 videos by 89 speakers. The videos address a large array of topics, such as movies, books, and products. In the experiment to address the generalizability issues, we trained a model on MOSI and tested it on MOUD. Table 1 shows the train/test split of these datasets.
2. Multimodal Emotion Recognition Dataset: The IEMOCAP database1 was collected for the purpose of studying multimodal expressive dyadic interactions. This dataset contains 12 hours of video data split into five-minute dyadic interactions between professional male and female actors. Each interaction session was split into spoken utterances. At least three annotators
Table 1. Person-independent train/test split details of each dataset (~70/30% split). Note: X → Y represents train: X and test: Y; validation sets are extracted from the shuffled train sets using an 80/20% train/val ratio.

Dataset       Train (utterances / videos)   Test (utterances / videos)
IEMOCAP       4290 / 120                    1208 / 31
MOSI          1447 / 62                     752 / 31
MOUD          322 / 59                      115 / 20
MOSI → MOUD   2199 / 93                     437 / 79
assigned to each utterance one emotion category: happy, sad, neutral, angry, surprised, excited, frustration, disgust, fear, and other. In this paper, we considered only the utterances with majority agreement (i.e., at least two out of three annotators labeled the same emotion) in the emotion classes of angry, happy, sad, and neutral. Table 1 shows the train/test split of this dataset.
Speaker-Exclusive Experiment
Most research work on multimodal sentiment analysis is performed with datasets having common speakers between the train and test splits. However, given this overlap, results do not scale to true generalization. In real-world applications, the model should be robust to speaker variance. Thus, we performed speaker-exclusive experiments to emulate unseen conditions. This time, our train/test splits of the datasets were completely disjoint with respect to speakers. During testing, our models had to classify emotions and sentiments from utterances by speakers they had never seen before. Below, we elaborate on this speaker-exclusive experiment:
• IEMOCAP: As this dataset contains ten speakers, we performed a ten-fold speaker-exclusive test, where in each round exactly one of the speakers was included in the test set and missing from the train set. The same SVM model was used as before, and accuracy was used as the performance metric.
• MOUD: This dataset contains videos of about 80 people reviewing various products in Spanish. Each utterance in the video has been labeled as positive, negative, or neutral. In our experiments, we consider only samples with positive and negative sentiment labels. The speakers were partitioned into five groups, and a five-fold person-exclusive experiment was performed, where in every fold one of the five groups was in the test set. Finally, we took the average of the accuracy to summarize the results (Table 2).
• MOSI: The MOSI dataset is rich in sentimental expressions, where 93 people review various products in English. The videos are segmented into clips, where each clip is assigned a sentiment score between −3 and +3 by five annotators. We took the average of these labels as the sentiment polarity and naturally considered two classes (positive and negative). Like MOUD, the speakers were divided into five groups, and a five-fold person-exclusive experiment was run. For each fold, on average 75 people were in the training set and the remainder in the test set. The training set was further shuffled and partitioned into an 80%/20% split to generate train and validation sets for parameter tuning.
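The speaker-exclusive protocol above amounts to grouped cross-validation: no speaker may appear in both the train and test splits of a fold. A minimal sketch using scikit-learn's GroupKFold (the speaker IDs and features here are synthetic, and the fold counts are illustrative):

```python
# Sketch: speaker-exclusive folds via grouped cross-validation.
# Speakers and features are synthetic; the point is that each fold's
# test speakers never appear in its training split.
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_utt, n_speakers = 300, 10
speaker = rng.integers(0, n_speakers, size=n_utt)  # speaker ID per utterance
X = rng.normal(size=(n_utt, 8))

for fold, (tr, te) in enumerate(
        GroupKFold(n_splits=5).split(X, groups=speaker)):
    train_sp, test_sp = set(speaker[tr]), set(speaker[te])
    assert train_sp.isdisjoint(test_sp)  # no speaker overlap
    print(f"fold {fold}: test speakers {sorted(test_sp)}")
```

Setting `n_splits` equal to the number of speakers would give the leave-one-speaker-out scheme used for IEMOCAP.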
Table 2. Accuracy reported for speaker-exclusive (Sp-Ex) and speaker-inclusive (Sp-In) splits for concatenation-based fusion. IEMOCAP: ten-fold speaker-exclusive average. MOUD: five-fold speaker-exclusive average. MOSI: five-fold speaker-exclusive average. Legend: A stands for Audio, V for Video, T for Text.

Modality        IEMOCAP           MOUD              MOSI
combination     Sp-In   Sp-Ex     Sp-In   Sp-Ex     Sp-In   Sp-Ex
A               66.20   51.52     –       53.70     64.00   57.14
V               60.30   41.79     –       47.68     62.11   58.46
T               67.90   65.13     –       48.40     78.00   75.16
T + A           78.20   70.79     –       57.10     76.60   75.72
T + V           76.30   68.55     –       49.22     78.80   75.06
A + V           73.90   52.15     –       62.88     66.65   62.40
T + A + V       81.70   71.59     –       67.90     78.80   76.66
1) Speaker-Inclusive versus Speaker-Exclusive: In comparison with the speaker-inclusive experiment, the speaker-exclusive setting yielded inferior results. This is caused by the absence of knowledge about the speakers during the testing phase. Table 2 shows the performance obtained in the speaker-inclusive experiment. It can be seen that the audio modality consistently performs better than the visual modality in both the MOSI and IEMOCAP datasets. The text modality plays the most important role in both emotion recognition and sentiment analysis. The fusion of the modalities shows more impact for emotion recognition than for sentiment analysis. The root mean square error (RMSE) and TP-rate of the experiments using different modalities on the IEMOCAP and MOSI datasets are shown in Figure 1.
Contributions of the Modalities
As expected, bimodal and trimodal models performed better than unimodal models in all experiments. Overall, the audio modality performed better than the visual modality on all datasets. Except for the MOUD dataset, the unimodal performance of the text modality is substantially better than that of the other two modalities (Figure 2).
Figure 1. Experiments on the IEMOCAP and MOSI datasets. The top-left figure shows the RMSE of the models on IEMOCAP and MOSI. The top-right figure shows the dataset distribution. The bottom-left and bottom-right figures present the TP-rate of the models on the IEMOCAP and MOSI datasets, respectively.
Figure 2. Performance of the modalities on the datasets. The red line indicates the median of the accuracy.
Affective Computing and Sentiment Analysis
November/December 2018 21 www.computer.org/intelligent
00mis00-poria-2882362.3d (Style 4) 05-04-2019 15:37
assigned to each utterance one emotion category: happy, sad, neutral, angry, surprised, excited,frustration, disgust, fear and other. In this paper, we considered only the utterances withmajorityagreement (i.e., at least two out of three annotators labeled the same emotion) in the emotionclasses of angry, happy, sad, and neutral. Table 1 shows the split of train/test of this dataset.
Speaker-Exclusive ExperimentMost of the research work on multimodal sentiment analysis is performed with datasets having acommon speaker(s) between train and test splits. However, given this overlap, results do not scale totrue generalization. In real-world applications, the model should be robust to speaker variance. Thus,we performed speaker-exclusive experiments to emulate unseen conditions. This time, our train/testsplits of the datasets were completely disjoint with respect to speakers. While testing, our models hadto classify emotions and sentiments from utterances by speakers they have never seen before. Below,we elaborate this speaker-exclusive experiment:
� IEMOCAP: As this dataset contains ten speakers, we performed a ten-fold speaker-exclusive test, where in each round exactly one of the speakers was included in the test setand missing from the train set. The same SVM model was used as before and accuracy wasused as a performance metric.
� MOUD: This dataset contains videos of about 80 people reviewing various products inSpanish. Each utterance in the video has been labeled as positive, negative, or neutral. In ourexperiments, we consider only samples with positive and negative sentiment labels. Thespeakers were partitioned into five groups and a five-fold person-exclusive experiment wasperformed, where in every fold one out of the five group was in the test set. Finally, we tookan average of the accuracy to summarize the results (Table 2).
� MOSI:MOSI dataset is rich in sentimental expressions, where 93 people review variousproducts in English. The videos are segmented into clips, where each clip is assigned asentiment score between 3 to þ3 by five annotators. We took the average of these labels asthe sentiment polarity and naturally considered two classes (positive and negative). LikeMOUD, speakers were divided into five groups and a five-fold person-exclusive experimentwas run. For each fold, on average 75 people were in the training set and the remaining inthe test set. The training set was further partitioned and shuffled into 80%– 20% split togenerate train and validation sets for parameter tuning.
Table 2. Accuracy reported for speaker-exclusive (Sp-Ex) and speaker-inclusive (Sp-In) splits for concatenation-based fusion. IEMOCAP: ten-fold speaker-exclusive average. MOUD: five-fold speaker-exclusive average. MOSI: five-fold speaker-exclusive average. Legend: A stands for Audio, V for Video, T for Text.
Modality Combination | IEMOCAP Sp-In | IEMOCAP Sp-Ex | MOUD Sp-In | MOUD Sp-Ex | MOSI Sp-In | MOSI Sp-Ex
A                    | 66.20         | 51.52         | –          | 53.70      | 64.00      | 57.14
V                    | 60.30         | 41.79         | –          | 47.68      | 62.11      | 58.46
T                    | 67.90         | 65.13         | –          | 48.40      | 78.00      | 75.16
T + A                | 78.20         | 70.79         | –          | 57.10      | 76.60      | 75.72
T + V                | 76.30         | 68.55         | –          | 49.22      | 78.80      | 75.06
A + V                | 73.90         | 52.15         | –          | 62.88      | 66.65      | 62.40
T + A + V            | 81.70         | 71.59         | –          | 67.90      | 78.80      | 76.66
IEEE INTELLIGENT SYSTEMS
November/December 2018 20 www.computer.org/intelligent
44 ComputingEdge June 2019
Generalizability of the Models
To test the generalization ability of the models, we trained the framework on the MOSI dataset in a speaker-exclusive fashion and tested it on the MOUD dataset. From Table 3, we can see that the model trained on the MOSI dataset performed poorly on the MOUD dataset.
This is mainly because the reviews in the MOUD dataset were recorded in Spanish, whereas the MOSI dataset contains reviews in English, so both the audio and text modalities fail badly at recognition. A more comprehensive study would perform generalizability tests on datasets of the same language; however, we were unable to do this due to the lack of benchmark datasets. Similarly, cross-dataset generalization experiments were not performed for emotion detection, given the availability of only a single dataset (IEMOCAP).
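The cross-dataset protocol behind Table 3 reduces to fitting on one corpus and scoring accuracy on the other. A sketch under the assumption of an sklearn-style fit/predict model object; the helper name and corpus tuples are hypothetical, not the authors' pipeline:

```python
def cross_corpus_accuracy(model, train_corpus, test_corpus):
    """Train on one dataset (e.g., MOSI) and report accuracy on another
    (e.g., MOUD). Each corpus is a (features, labels) pair."""
    X_train, y_train = train_corpus
    X_test, y_test = test_corpus
    model.fit(X_train, y_train)            # fit only on the source corpus
    predictions = model.predict(X_test)    # score only on the target corpus
    correct = sum(p == y for p, y in zip(predictions, y_test))
    return correct / len(y_test)
```

Any classifier exposing fit/predict (the SVM and bc-LSTM fusion models included) can be dropped into this protocol unchanged.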
Table 3. Cross-dataset results: model (with the previous configurations) trained on the MOSI dataset and tested on the MOUD dataset.
Modality Combination | SVM   | bc-LSTM
T                    | 46.5% | 46.9%
V                    | 43.3% | 49.6%
A                    | 42.9% | 47.2%
T + A                | 50.4% | 51.3%
T + V                | 49.8% | 49.8%
A + V                | 46.0% | 49.6%
T + A + V            | 51.1% | 52.7%
Table 4. Accuracy reported for speaker-exclusive classification. IEMOCAP: ten-fold speaker-exclusive average. MOUD: five-fold speaker-exclusive average. MOSI: five-fold speaker-exclusive average. Legend: A represents Audio, V represents Video, T represents Text.
Modality Combination | IEMOCAP SVM | IEMOCAP bc-LSTM | MOUD SVM | MOUD bc-LSTM | MOSI SVM | MOSI bc-LSTM
A                    | 52.9        | 57.1            | 51.5     | 59.9         | 58.5     | 60.3
V                    | 47.0        | 53.2            | 46.3     | 48.5         | 53.1     | 55.8
T                    | 65.5        | 73.6            | 49.5     | 52.1         | 75.5     | 78.1
T + A                | 70.1        | 75.4            | 53.1     | 60.4         | 75.8     | 80.2
T + V                | 68.5        | 75.6            | 50.2     | 52.2         | 76.7     | 79.3
A + V                | 67.6        | 68.9            | 62.8     | 65.3         | 58.6     | 62.1
T + A + V            | 72.5        | 76.1            | 66.1     | 68.1         | 77.9     | 80.3
Comparison Among the Baseline Methods
Table 4 consolidates and compares the performance of all the baseline methods on all the datasets. We evaluated SVM and bc-LSTM fusion on the MOSI, MOUD, and IEMOCAP datasets.
From Table 4, it is clear that bc-LSTM outperforms SVM across all the experiments, which makes it apparent that taking context into account in the classification process substantially boosts performance.
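To see concretely what "taking context into account" means, one can hand a context-free classifier the features of the neighboring utterances from the same video, a crude hand-rolled stand-in for what bc-LSTM learns end-to-end. The window size and zero padding below are illustrative assumptions, not the paper's architecture:

```python
def add_context(features, window=1):
    """For each utterance in a video, concatenate the feature vectors of
    the `window` preceding and following utterances (zero-padded at the
    video boundaries), so a flat classifier sees local context."""
    n = len(features)
    dim = len(features[0])
    zero = [0.0] * dim
    out = []
    for i in range(n):
        row = []
        for j in range(i - window, i + window + 1):
            row.extend(features[j] if 0 <= j < n else zero)
        out.append(row)
    return out
```

The expanded rows can then be fed to the same SVM as before; bc-LSTM instead learns which parts of the surrounding context matter, over the whole utterance sequence.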
Visualization of the Datasets
The MOSI visualizations present information regarding the dataset distribution within single and multiple modalities (Figure 3). For the textual and audio modalities, the classes form dense clusters with substantial overlap. The video modality and the combined-modality view decluster into a more structured layout, but the overlap is appreciably reduced only in the multimodal case. This offers an intuitive explanation for the improved performance in the multimodal scenario. The IEMOCAP visualizations provide insight into the four-class distribution in the uni- and multimodal scenarios, where clearly the
Figure 3. t-SNE 2-D visualization of the MOSI and IEMOCAP datasets when unimodal features and multimodal features are used.
multimodal distribution has the least overlap (the red and blue clusters, in particular, separate from the rest), with the sparser distribution aiding the classification process.
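Visualizations of this kind can be reproduced with off-the-shelf t-SNE. The sketch below assumes scikit-learn is available; the perplexity and random seed are illustrative choices, not the settings used for Figure 3:

```python
import numpy as np
from sklearn.manifold import TSNE

def embed_2d(features, seed=0):
    """Project high-dimensional utterance features to 2-D with t-SNE,
    suitable for per-class scatter plots like those in Figure 3."""
    tsne = TSNE(n_components=2, perplexity=5, init="random",
                random_state=seed)
    return tsne.fit_transform(np.asarray(features, dtype=np.float32))
```

Scattering the 2-D output colored by class label makes the cluster overlap (or lack of it) directly visible for each modality combination.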
CONCLUSION
We have presented useful baselines for multimodal sentiment analysis and multimodal emotion recognition. We also discussed some major aspects of the multimodal sentiment analysis problem, such as performance in the unknown-speaker setting and the cross-dataset performance of the models.
Our future work will focus on extracting semantics from the visual features, the relatedness of the cross-modal features, and their fusion. We will also include contextual dependency learning in our model to overcome the limitations mentioned in the previous section.
REFERENCES
1. C. Busso et al., "IEMOCAP: Interactive emotional dyadic motion capture database," Lang. Resour. Eval., vol. 42, no. 4, pp. 335–359, 2008.
2. S. Poria, E. Cambria, D. Hazarika, N. Mazumder, A. Zadeh, and L.-P. Morency, "Multi-level multiple attentions for context-aware multimodal sentiment analysis," in Proc. Int. Conf. Data Mining, 2017, pp. 1033–1038.
3. L. S. Chen, T. S. Huang, T. Miyasato, and R. Nakatsu, "Multimodal human emotion/expression recognition," in Proc. 3rd IEEE Int. Conf. Autom. Face Gesture Recognit., 1998, pp. 366–371.
4. D. Datcu and L. Rothkrantz, "Semantic audio-visual data fusion for automatic emotion recognition," Euromedia, 2008; http://mmi.tudelft.nl/pub/dragos/_datcu_euromedia08.pdf
5. L. C. De Silva, T. Miyasato, and R. Nakatsu, "Facial emotion recognition using multimodal information," in Proc. IEEE ICICS, 1997, pp. 397–401.
6. P. Ekman, "Universal facial expressions of emotion," Culture and Personality: Contemporary Readings, Chicago, pp. 151–158, 1974.
7. F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE: The Munich versatile and fast open-source audio feature extractor," in Proc. 18th ACM Int. Conf. Multimedia, 2010, pp. 1459–1462.
8. A. Metallinou, S. Lee, and S. Narayanan, "Audio-visual emotion recognition using Gaussian mixture models for face and voice," in Proc. 10th IEEE Int. Symp. ISM, 2008, pp. 250–257.
9. V. Pérez-Rosas, R. Mihalcea, and L.-P. Morency, "Utterance-level multimodal sentiment analysis," in Proc. 51st Annu. Meeting Assoc. Comput. Linguistics, 2013, pp. 973–982.
10. S. Poria, E. Cambria, R. Bajpai, and A. Hussain, "A review of affective computing: From unimodal analysis to multimodal fusion," Inf. Fusion, vol. 37, pp. 98–125, 2017.
11. N. Majumder, D. Hazarika, A. Gelbukh, E. Cambria, and S. Poria, "Multimodal sentiment analysis using hierarchical fusion with context modeling," Knowl. Based Syst., vol. 161, pp. 124–133, 2018.
12. S. Poria, E. Cambria, D. Hazarika, N. Majumder, A. Zadeh, and L.-P. Morency, "Context-dependent sentiment analysis in user-generated videos," in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics (Volume 1: Long Papers), Jul. 2017, pp. 873–883.
13. S. Poria, I. Chaturvedi, E. Cambria, and A. Hussain, "Convolutional MKL based multimodal emotion recognition and sentiment analysis," in Proc. Int. Conf. Data Mining, 2016, pp. 439–448.
14. M. Wöllmer et al., "YouTube movie reviews: Sentiment analysis in an audio-visual context," IEEE Intell. Syst., vol. 28, no. 3, pp. 46–53, May/Jun. 2013.
15. A. Zadeh, R. Zellers, E. Pincus, and L.-P. Morency, "Multimodal sentiment intensity analysis in videos: Facial gestures and verbal messages," IEEE Intell. Syst., vol. 31, no. 6, pp. 82–88, Nov./Dec. 2016.
16. Y. Kim, "Convolutional neural networks for sentence classification," arXiv:1408.5882, 2014.
ABOUT THE AUTHORS
Soujanya Poria is a Presidential Postdoctoral Fellow with the School of Computer Science and Engineering, Nanyang Technological University, Singapore. Contact him at [email protected].

Navonil Majumder is currently working toward the Ph.D. degree at the Centro de Investigación en Computación (CIC) of the Instituto Politécnico Nacional, Mexico City, Mexico. Contact him at [email protected].

Devamanyu Hazarika is currently working toward the Ph.D. degree at the School of Computing, National University of Singapore, Singapore. Contact him at [email protected].

Erik Cambria is an Assistant Professor with the School of Computer Science and Engineering, Nanyang Technological University, Singapore. Contact him at [email protected].

Alexander Gelbukh is a Research Professor with the CIC of the Instituto Politécnico Nacional, Mexico City, Mexico. Contact him at [email protected].

Amir Hussain is a Full Professor with the School of Computing, Edinburgh Napier University, Edinburgh, U.K. Contact him at [email protected].
This article originally appeared in IEEE Intelligent Systems, vol. 33, no. 6, 2018.
2469-7087/19/$33.00 © 2019 IEEE. Published by the IEEE Computer Society. Originally published in Computer, 0018-9162/18/$33.00 © 2018 IEEE.
CYBER-PHYSICAL SYSTEMS
Cyber-physical systems (CPS) are often regarded as small and simple when it comes to computing. The opposite is true. These systems rely on large and complex hardware and software. CPS not only suffer from all of the problems of information technology (IT) systems, they also present unique and difficult challenges.
Ford Motor announced at CES 2016 that their F-150 pickup, which for years has been the most popular vehicle in America, now includes over 150 million lines of software in its design. That is a big software project by any measure. Software systems of this size present a number of challenges. They are traditionally built using software from multiple sources and often in multiple languages. Version control and configuration management must be exercised on the code base. The software must be tested.
Now consider that all of this software is wrapped in 4,000 pounds of metal and plastic. The physicality of cyber-physical computing raises the stakes. The cost of failures is measured not just in lost productivity but in physical damage and lives.

Computing systems tied to physical plants must meet constraints that do not concern IT systems. Timing is key to the behavior of cyber-physical systems. Failure to meet timing constraints and deadlines can cause the system to fail. Timing is a first-class functional characteristic in the case of CPS.
Cyber-physical systems also present lifecycle challenges that dwarf even the considerable long lifetimes of many legacy IT systems. Automobiles are typically in service for 10 to 20 years. Many airplanes operate for a half century or more. Roadways and buildings, which are increasingly instrumented with IoT sensors, might last for centuries. Electric power grids operate over continental scales and must operate continuously.
Computing in the Real World Is the Grandest of Challenges
Marilyn Wolf, Georgia Tech
Cyber-physical systems are no longer a sideline for computing—they are a core concern for computer scientists and engineers.
EDITOR DIMITRIOS SERPANOS ISI/ATHENA and University of Patras; [email protected]
Security is also a key challenge that takes on new dimensions in the context of cyber-physical systems. Insecure computing devices in cyber-physical systems—and there are many—not only threaten information but also physical safety. Dimitrios Serpanos and I recently argued ("Scanning the Issue," Proc. IEEE, vol. 106, no. 1, Jan. 2018, pp. 7–8; https://doi.org/10.1109/JPROC.2017.2777799) that safety and security can't be considered separate and distinct in cyber-physical systems: insecure systems degrade safety; safety considerations also mean that some traditional computer security approaches can't be used in cyber-physical systems.
What do these challenges mean? What should we do about them? Do we need to treat cyber-physical computation differently than the development of IT systems?
The first thing we need to do is take CPS seriously. The professional societies and universities often treat cyber-physical systems as secondary. We should promote cyber-physical computing to the forefront of professional concerns.
All computer scientists and engineers should know a few basic principles of cyber-physical computing, just as computer scientists are expected to know the fundamentals of algorithms and operating systems. Real-time and low-power computing are fundamental to CPS and also underlie other major computing applications. Multimedia and virtual reality rely on the principles established by embedded and cyber-physical computing. Data centers must live under power constraints, a topic that was pioneered by embedded computing.
CPS practitioners need to make use of the best practices from IT and scientific computing, a principle that isn't always followed. CPS designers also need to understand when those principles are lacking or inadequate. In many cases, we know how to adapt computing system design to the needs of CPS; in some cases, particularly in the case of safety and security, we have more work to do.
To manage the challenges of safety and security, we must extend traditional engineering methodologies. Physical plant designers are used to building machines to the specifications of an unchanging world. Unfortunately, computer systems face an ever-changing set of security challenges and potential attacks. Even cyber-physical systems that aren't directly connected to the Internet are vulnerable to these attack vectors—a number of studies have demonstrated indirect attacks on cyber-physical installations. We need to develop cyber-physical systems that can be updated to respond to new cyberthreats. And we need cyber-physical architectures that are inherently resilient in the face of a wide range of attacks.
Cyber-physical systems are no longer a sideline for computing—they are a core concern for computer scientists and engineers. It's too late to turn back the clock to a day when computers were isolated in data centers. These challenges can be met with a combination of effort and skill.
MARILYN WOLF is the Rhesa "Ray" S. Farmer, Jr. Distinguished Chair in Embedded Computing Systems at Georgia Tech and a Georgia Research Alliance Eminent Scholar. She is a Fellow of IEEE, a Fellow of ACM, a Golden Core Member of the IEEE Computer Society, and a recipient of the ASEE Frederick E. Terman Award. Contact her at [email protected].
This article originally appeared in Computer, vol. 51, no. 5, 2018.
2469-7087/19/$33.00 © 2019 IEEE. Published by the IEEE Computer Society. Originally published in Computer, July 2018, 0018-9162/18/$33.00 © 2018 IEEE.
Today, more than ever, advances through innovations in science and technology, such as the dramatic increase in computing power, are contributing to improvements in business and society. At the same time, the world is facing global-scale challenges such as depletion of natural resources, global warming, growing economic disparity, and terrorism. We live in a challenging age of uncertainty, with complexity growing at all levels. Thus, it's critical that we leverage ICT to its fullest to gain new knowledge and create new values by making connections between "people and things" and between the "real and cyber" worlds to effectively and efficiently resolve issues in society, create better lives for its people, and sustain healthy economic growth. Overcoming these challenges by encouraging various stakeholders at multiple levels to share a common future vision will be vital to realizing such a society through digitalization.
Society 5.0: For Human Security and Well-Being
Yoshihiro Shiroishi, Kunio Uchiyama, and Norihiro Suzuki, Hitachi Research and Development Group
The Japanese Cabinet's "Society 5.0" initiative seeks to create a sustainable society for human security and well-being through a cyber-physical system. Keidanren (Japan Business Federation) is well aligned to proactively deliver on the United Nations' Sustainable Development Goals to end poverty, protect the planet, and ensure prosperity for all through the creation of Society 5.0. Typical collaborative ecosystem activities for Society 5.0 in Japan are outlined in this column.
In 2016, an initiative called "Society 5.0" was proposed by the Japanese Cabinet in its 5th Science and Technology Basic Plan,1 with a vision toward creating a "Super Smart Society." The Super Smart Society is positioned as the fifth developmental stage in human society, following the hunter/gatherer, pastoral/agrarian, industrial, and information stages,2 and represents a sustainable society connected by digital technologies that attends in detail to the various needs of that society by providing necessary items or services to the people who require them, when they are required, in the amount required, thus enabling its citizens to live an active and comfortable life through high-quality services regardless of age, sex, region, language, and so on. Note, however, that digitalization is only the means, and that it is essential that we humans remain the central actors so that a firm focus is kept on building a society that makes us happy and provides us with a sense of worth. The Japanese government presented its vision of Society 5.0, together with exhibits by supporting companies from Japan, at CeBIT 2017,3 Europe's business festival for innovation and digitalization that covers the digitalization of business, government, and society from every angle.
International discussion is proceeding on the implementation of the United Nations' Sustainable Development Goals (SDGs),4 which were adopted in September 2015 as guideposts for the entire world. The driving principle is to realize peace and prosperity for all people and the planet by responding to the challenges with an inclusiveness that "leaves no one behind." The Japanese government has made the SDGs Implementation Guiding Principles—science, technology, and innovation (STI)—a key policy and priority area. The Advisory Board for the Promotion of Science and Technology Diplomacy deliberated on these concepts and prepared a recommendation5 identifying four action areas to mobilize "STI for SDGs":
› creating a global future through Society 5.0,
› enabling solutions using global data,
› promoting cooperation at a global level, and
› fostering human resources to undertake STI efforts for SDGs.
The 12 service platforms shown in Figure 1 will be developed by fully utilizing the Internet of Things (IoT): big data, computation, artificial intelligence (AI), display, and robotics technologies. A series of government initiatives are now in progress in Japan, including "Robot Industry"6 and "Connected Industries,"7 which were introduced by the Ministry of Economy, Trade, and Industry (METI); and "Conference toward AI Network Society,"8 introduced by the Ministry of Internal Affairs and Communications (MIC). These initiatives essentially target the development of advanced common platform technologies, services and systems, and system of
Figure 1. The 12 service platforms for creating a Super Smart Society.2
systems for new market creation and transformation into a prosperous society by creating new values through cyber-physical systems (CPS).9 As the CPS in Figure 2 shows, various big data items—collected from intelligent sensing devices with low power and networks and kept in information storage devices—can be analyzed and visualized using analytic tools such as AI with high computing power in cyberspace. This valuable data, often hard for humans alone to notice, will inform actions taken by decision-makers to provide solutions to societal issues and economic growth in the physical world.
In Japan, a series of government projects, such as ImPACT (Impulsing Paradigm Change through Disruptive Technologies Program) and SIP (cross-ministerial Strategic Innovation Promotion program),10 are geared toward realizing such technologies and service platforms. ImPACT is designed to develop industry- and society-changing disruptive STIs through high-risk, high-impact research and development, while SIP covers the entire path from basic research to effective exit strategies (practical application/commercialization) as well as taking on initiatives to reform regulations and systems. In SIP, for example, projects such as an automated driving system; energy carriers; cybersecurity for critical infrastructure; and technologies for creating next-generation agriculture, forestry, and fisheries are now in progress with allocated budgets from the Council for Science, Technology, and Innovation (CSTI), which is responsible for planning and coordinating STI policies under the Japanese Cabinet. In addition, several academic studies on authentic third proposals and policy proposals for Society 5.0, including the "SDGs Project," "Next-Generation Computing Project," and "Telexistence Project," are being carried out by The Engineering Academy of Japan (EAJ),11 a unique non-governmental engineering academy in Japan.
TOWARD HUMAN SECURITY AND WELL-BEING
We have also entered an era in which the human lifespan—because of STI—is reaching 100 years; therefore, it's increasingly vital to empower humans by including a wider range of stakeholders and digitalization technologies. We are just beginning to strive for true human security and well-being. We must foster a pioneering spirit and the ability to be disruptive if necessary by increasing the number of people working under their own initiative and acting as game changers. To achieve a sustainable society on a global scale as soon as possible, it will be necessary to pursue transformation through a collaborative ecosystem that brings together ideas from industry, academia, and citizens.
Japan's most important business federation, Keidanren, is well aligned with this game-changing initiative. Keidanren issued the fifth revision of its Charter of Corporate Behavior12 with the primary aim of proactively
[Figure 2 shows devices with MCUs, RFID, and sensors (temperature, acceleration, position, etc.) performing real-world sensing and actuation at the end point/user; edge computing collaborating with cloud computing in data centers (many-core CPUs and GPUs, in-memory databases, distributed key-value storage (KVS)) over a wide-area IoT network; AI running batch processing on accumulated data, continuous processing on streaming data, and modeling and mapping; business applications and service solutions (actuation) feeding back to industry, humans, nature, and the smart society; and security, privacy, and digital-rights protection making raw data available for analytics.]

Figure 2. A cyber-physical system.
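The edge/cloud collaboration depicted in Figure 2—edge nodes near the sensors condensing a raw stream so that only compact summaries cross the wide-area network to cloud analytics—can be illustrated with a small sketch. The function and field names here are illustrative assumptions, not from any real IoT platform:

```python
# Hedged sketch of edge/cloud collaboration: the edge aggregates a raw
# sensor stream into per-window summaries; the cloud analyzes summaries
# instead of raw readings, reducing wide-area network traffic.

def edge_summarize(raw_stream, window):
    """Edge side: reduce a raw sensor stream to per-window summaries."""
    summaries = []
    for i in range(0, len(raw_stream), window):
        chunk = raw_stream[i:i + window]
        summaries.append({
            "min": min(chunk),
            "max": max(chunk),
            "mean": sum(chunk) / len(chunk),
        })
    return summaries

def cloud_analyze(summaries):
    """Cloud side: analytics over compact summaries, e.g., peak value."""
    return max(s["max"] for s in summaries)

raw = [20.1, 20.3, 25.7, 19.8, 21.0, 26.4]
summaries = edge_summarize(raw, window=3)
print(cloud_analyze(summaries))  # 26.4
```

Here six raw readings become two summaries, showing the design choice the figure implies: move computation toward the data source and ship only what the cloud-side analytics actually need.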
CYBER-PHYSICAL SYSTEMS
In 2016, an initiative called "Society 5.0" was proposed by the Japanese Cabinet in its 5th Science and Technology Basic Plan,1 with a vision toward creating a "Super Smart Society." The Super Smart Society is positioned as the fifth developmental stage in human society, following the hunter/gatherer, pastoral/agrarian, industrial, and information stages.2 It represents a sustainable society connected by digital technologies that attend in detail to that society's various needs—providing the necessary items or services to the people who require them, when they are required, in the amount required—thus enabling its citizens to live active and comfortable lives through high-quality services regardless of age, sex, region, or language. Note, however, that digitalization is only the means; it is essential that we humans remain the central actors so that a firm focus is kept on building a society that makes us happy and provides us with a sense of worth. The Japanese government presented its vision of Society 5.0, together with exhibits by supporting companies from Japan, at CeBIT 2017,3 Europe's business festival for innovation and digitalization that covers the digitalization of business, government, and society from every angle.
International discussion is proceeding on the implementation of the United Nations' Sustainable Development Goals (SDGs),4 which were adopted in September 2015 as guideposts for the entire world. The driving principle is to realize peace and prosperity for all people and the planet by responding to the challenges with an inclusiveness that "leaves no one behind." In its SDGs Implementation Guiding Principles, the Japanese government has made science, technology, and innovation (STI) a key policy and priority area. The Advisory Board for the Promotion of Science and Technology Diplomacy deliberated on these concepts and prepared a recommendation5 identifying four action areas to mobilize "STI for SDGs":
› creating a global future through Society 5.0,
› enabling solutions using global data,
› promoting cooperation at a global level, and
› fostering human resources to undertake STI efforts for SDGs.
The 12 service platforms shown in Figure 1 will be developed by fully utilizing Internet of Things (IoT), big data, computation, artificial intelligence (AI), display, and robotics technologies. A series of government initiatives are now in progress in Japan, including "Robot Industry"6 and "Connected Industries,"7 which were introduced by the Ministry of Economy, Trade, and Industry (METI); and "Conference toward AI Network Society,"8 introduced by the Ministry of Internal Affairs and Communications (MIC). These initiatives essentially target the development of advanced common platform technologies, services and systems, and system of
[Figure 1 shows the 12 service platforms—intelligent transportation systems, energy value chains, new manufacturing systems, smart manufacturing systems, smart food-chain systems, regional inclusive care systems, hospitality systems, integrated material development systems, a society resilient against natural disasters, a global environment information platform, infrastructure maintenance and updates, and new business and services—supported by cross-cutting measures: advanced security and social implementation; utilizing existing systems for positioning, authentication, etc.; utilization of standard data; standardization of interfaces and data formats; human resource development; strengthening the development of information communication platforms; and regulatory and institutional reform for new services.]

Figure 1. The 12 service platforms for creating a Super Smart Society.2
delivering on SDGs through the creation of Society 5.0. Figure 3 summarizes the concept of Society 5.0 for SDGs, as well as the challenges, key technologies, and systems of Society 5.0 and the 17 goals of SDGs.13 Although STI has in many ways greatly enhanced the convenience of our lifestyle, it has also increased social complexity, revealing some negative aspects of a digital society. Society 5.0 can provide approaches to reducing or eliminating these negative aspects. However, doing so will require breaking down what the position paper calls the "five walls": ministries and agencies, the legal system, technologies, human resources, and social acceptance.14 This will be our global challenge, and professional societies such as the IEEE Computer Society are expected to play a leading role in fully fledged cooperation with industrial society on STI, trans-science, and multidisciplinary issues.
We are living in a challenging age of societal complexity and uncertainty. The Japanese Cabinet's Society 5.0 initiative envisions the creation of a Super Smart Society—a sustainable society where various types of values are connected through CPS and where people can live in safety, security, and comfort. CPS can bridge different sectors, countries, regions, and societies that otherwise tend to be divided. The key to implementing Society 5.0/SDGs is that stakeholders share and address the challenges together by fully utilizing the potential of CPS.
To move toward greater human security and well-being, we will need to pursue transformation through a collaborative ecosystem with a shared vision for the future, created with the participation of all stakeholders. Specifically, we should take the following actions:
› present a future vision of changes in society through STI;
› grasp and overcome challenges by creating new values through CPS; and
› establish collaborations among industry, multidisciplinary academia, and public and private sectors.
REFERENCES
1. "The 5th Science and Technology Basic Plan," Government of Japan, 22 Jan. 2016; http://www8.cao.go.jp/cstp/english/basic/5thbasicplan.pdf.
2. Y. Harayama, "Society 5.0: Aiming
[Figure 3 maps the 17 SDGs to Society 5.0 systems (smart cities, smart agriculture and smart food, smart grid systems, i-Construction, e-learning systems, early warning alert systems, empowerment of women, a global innovation ecosystem, and utilization of remote sensing, oceanographic, meteorological, and other observation data), key technologies (IoT, AI, big data, robots, drones, 5G, AR/VR/MR, 3D printing, sensor cloud, edge and mobile computing, PKI, sharing, and on-demand services), and "x-Tech" application domains ranging from FinTech, AgriTech, and HealthTech to EdTech, GovTech, and UrbanTech.]

Figure 3. Society 5.0 for SDGs (Sustainable Development Goals).13
for a New Human-Centered Society," Hitachi Rev., vol. 66, no. 6, 2017, pp. 556–557; http://www.hitachi.com/rev/archive/2017/r2017_06/pdf/p08-13_TRENDS.pdf.
3. "CeBIT: Japan's Vision of Society 5.0," Euronews; http://www.euronews.com/tag/cebit-2017.
4. "Sustainable Development Goals: 17 Goals to Transform Our World," United Nations; http://www.un.org/sustainabledevelopment.
5. "Recommendation for the Future—STI as a Bridging Force to Provide Solutions for Global Issues: Four Actions of Science and Technology Diplomacy to Implement the SDGs," Advisory Board for the Promotion of Science and Technology Diplomacy, 12 May 2017; http://www.mofa.go.jp/files/000255801.pdf.
6. "Robot Industry," Ministry of Economy, Trade and Industry (METI), 31 Jan. 2018; http://www.meti.go.jp/english/policy/mono_info_service/robot_industry/index.html.
7. "Connected Industry," METI, 13 June 2018; http://www.meti.go.jp/english/policy/mono_info_service/connected_industries/index.html.
8. "Conference toward AI Network Society," Ministry of Internal Affairs and Communications; http://www.oecd.org/going-digital/ai-intelligent-machines-smart-policies/conference-agenda/ai-intelligent-machines-smart-policies-sudoh.pdf.
9. R. Poovendran et al., "Special Issue on Cyber-Physical Systems," Proc. IEEE, vol. 100, no. 1, 2012, pp. 6–12.
10. "Pioneering the Future: Japanese Science, Technology and Innovation 2017," Cabinet Office, SIP brochure; http://www8.cao.go.jp/cstp/panhu/sip_english/sip_en.html.
11. "Engineering Academy of Japan"; https://www.eaj.or.jp.
12. "Revision of the Charter of Corporate Behavior," Keidanren (Japan Business Federation) Policy Proposals, 8 Nov. 2017; http://www.keidanren.or.jp/en/policy/csr/charter2017.html.
13. "Society 5.0 for SDGs," Keidanren (Japan Business Federation), 8 Nov. 2017; https://www.keidanren.or.jp/en/policy/csr/2017reference2.pdf.
14. "Toward Realization of the New Economy and Society: Reform of the Economy and Society by the Deepening of 'Society 5.0'," Keidanren (Japan Business Federation) Policy and Action, 19 Apr. 2016; http://www.keidanren.or.jp/en/policy/2016/029_outline.pdf.
DISCLAIMER
This article does not necessarily reflect the positions or views of the authors' employer.
YOSHIHIRO SHIROISHI is a technology advisor in the R&D Group at Hitachi, Ltd.; a Fellow of IEEE; and a member of EAJ. Contact him at [email protected].

KUNIO UCHIYAMA is a technology advisor in the R&D Group at Hitachi, Ltd.; a Fellow of IEEE; and a member of EAJ. Contact him at [email protected].

NORIHIRO SUZUKI is vice president, executive officer, and chief technology officer of Hitachi, Ltd., and general manager of its R&D Group; he is a Senior Member of IEEE and a member of EAJ. Contact him at [email protected].
This article originally appeared in Computer, vol. 51, no. 7, 2018.
June 2019. Published by the IEEE Computer Society. 2469-7087/19/$33.00 © 2019 IEEE
IEEE Computer Society conferences are valuable forums for learning on broad and dynamically shifting topics from within the computing profession. With over 200 conferences featuring leading experts and thought leaders, we have an event that is right for you.
Conference Calendar — Questions? Contact [email protected]
Find a region: Africa ■ Asia ▲ Australia ◆ Europe ● North America ◗ South America ★

JULY
8 July
• ICME (IEEE Int'l Conf. on Multimedia and Expo) ▲
• ICMEW (IEEE Int'l Conf. on Multimedia & Expo Workshops) ▲
• SERVICES (IEEE World Congress on Services) ●
• SNPD (20th IEEE/ACIS Int'l Conf. on Software Eng., Artificial Intelligence, Networking and Parallel/Distributed Computing) ▲
15 July
• ASAP (IEEE 30th Int'l Conf. on Application-specific Systems, Architectures and Processors) ◗
• CBI (IEEE 21st Int'l Conf. on Business Informatics) ●
• COMPSAC (IEEE 43rd Annual Computer Software and Applications Conf.) ◗
• ICALT (19th IEEE Int'l Conf. on Advanced Learning Technologies) ★
• ISVLSI (IEEE Computer Society Annual Symposium on VLSI) ◗
23 July
• ICCI*CC (IEEE 18th Int'l Conf. on Cognitive Informatics & Cognitive Computing) ●
30 July
• IRI (IEEE 20th Int'l Conf. on Information Reuse and Integration for Data Science) ◗
• SMC-IT (IEEE Int'l Conf. on Space Mission Challenges for Information Technology) ◗

AUGUST
1 August
• CSE (IEEE Int'l Conf. on Computational Science and Eng.) ◗
• EUC (IEEE Int'l Conf. on Embedded and Ubiquitous Computing) ◗
5 August
• TrustCom (IEEE Int'l Conf. on Trust, Security and Privacy in Computing and Communications) ◆
8 August
• 2019 Cloud Summit ◗
9 August
• SmartIoT (IEEE Int'l Conf. on Smart Internet of Things) ▲
10 August
• HPCC (IEEE 21st Int'l Conf. on High Performance Computing and Communications) ▲
• SmartCity (IEEE 17th Int'l Conf. on Smart City) ▲
• DSS (IEEE 5th Int'l Conf. on Data Science and Systems) ▲
15 August
• NAS (IEEE Int'l Conf. on Networking, Architecture and Storage) ▲
18 August
• HCS (IEEE Hot Chips 31 Symposium) ◗
• RTCSA (IEEE 25th Int'l Conf. on Embedded and Real-Time Computing Systems and Applications) ▲
27 August
• ASONAM (IEEE/ACM Int'l Conf. on Advances in Social Networks Analysis and Mining) ◗

SEPTEMBER
13 September
• EWDTS (IEEE East-West Design & Test Symposium) ●
19 September
• AVSS (16th IEEE Int'l Conf. on Advanced Video and Signal Based Surveillance) ▲
• ESEM (ACM/IEEE Int'l Symposium on Empirical Software Eng. and Measurement) ★
23 September
• CLUSTER (IEEE Int'l Conf. on Cluster Computing) ◗
• PACT (28th Int'l Conf. on Parallel Architectures and Compilation Techniques) ◗
• RE (IEEE 27th Int'l Requirements Eng. Conf.) ▲
• SecDev (IEEE Secure Development) ◗
25 September
• HCC (IEEE Int'l Conf. on Humanized Computing and Communication) ◗
29 September
• ICSME (IEEE Int'l Conf. on Software Maintenance and Evolution) ◗

OCTOBER
1 October
• MCSoC (IEEE 13th Int'l Symposium on Embedded Multicore/Many-core Systems-on-Chip) ▲
2 October
• DFT (IEEE Int'l Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems) ●
6 October
• MODELS (ACM/IEEE 22nd Int'l Conf. on Model Driven Eng. Languages and Systems) ●
12 October
• MICRO (52nd Annual IEEE/ACM Int'l Symposium on Microarchitecture) ◗
14 October
• ISMAR (IEEE Int'l Symposium on Mixed and Augmented Reality) ▲
• LCN (IEEE 44th Conf. on Local Computer Networks) ●
• VL/HCC (IEEE Symposium on Visual Languages and Human-Centric Computing) ◗
15 October
• AIPR (IEEE Applied Imagery Pattern Recognition Workshop) ◗
16 October
• FIE (IEEE Frontiers in Education Conf.) ◗
20 October
• ICCV (IEEE/CVF Int'l Conf. on Computer Vision) ▲
• VIS (IEEE Visualization Conf.) ◗
28 October
• EDOC (IEEE 23rd Int'l Enterprise Distributed Object Computing Conf.) ●
• ISSRE (IEEE 30th Int'l Symposium on Software Reliability Eng.) ●

NOVEMBER
4 November
• ICTAI (IEEE 31st Int'l Conf. on Tools with Artificial Intelligence) ◗
7 November
• SEC (IEEE/ACM Symposium on Edge Computing) ◗
8 November
• ICDM (IEEE Int'l Conf. on Data Mining) ▲

Learn more about IEEE Computer Society Conferences
www.computer.org/conferences