data-driven discovery of cyber-physical systemscyber-physical systems (cpss) embed software into the...

83
Data-driven Discovery of Cyber-Physical Systems Ye Yuan 1,2 , Xiuchuan Tang 1,3 , Wei Pan 4 , Xiuting Li 1 , Wei Zhou 1 , Hai-Tao Zhang 1,2 , Han Ding 2,3,* , Jorge Goncalves 5,6,* 1 School of Automation, Huazhong University of Science and Technology 2 State Key Lab of Digital Manufacturing Equipment and Technology 3 School of Mechanical Science and Engineering, Huazhong University of Science and Technology 4 Department of Cognitive Robotics, Delft University of Technology 5 Department of Engineering, University of Cambridge 6 Luxembourg Centre for Systems Biomedicine, University of Luxembourg * E-mail: [email protected], [email protected]. Cyber-physical systems (CPSs) embed software into the physical world. They appear in a wide range of applications such as smart grids, robotics, intelligent manufacture and medical monitoring. CPSs have proved resistant to modeling due to their intrinsic complexity arising from the combination of physical com- ponents and cyber components and the interaction between them. This study proposes a general framework for reverse engineering CPSs directly from data. The method involves the identification of physical systems as well as the inference of transition logic. It has been applied successfully to a number of real-world ex- amples ranging from mechanical and electrical systems to medical applications. The novel framework seeks to enable researchers to make predictions concern- ing the trajectory of CPSs based on the discovered model. Such information has been proven essential for the assessment of the performance of CPS, the design of failure-proof CPS and the creation of design guidelines for new CPSs. 1 arXiv:1810.00697v1 [cs.SY] 1 Oct 2018

Upload: others

Post on 23-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Data-driven Discovery of Cyber-Physical Systems

Ye Yuan1,2, Xiuchuan Tang1,3, Wei Pan4, Xiuting Li1, Wei Zhou1,

Hai-Tao Zhang1,2, Han Ding2,3,∗, Jorge Goncalves5,6,∗

1School of Automation, Huazhong University of Science and Technology

2State Key Lab of Digital Manufacturing Equipment and Technology

3School of Mechanical Science and Engineering,

Huazhong University of Science and Technology

4Department of Cognitive Robotics, Delft University of Technology

5 Department of Engineering, University of Cambridge

6Luxembourg Centre for Systems Biomedicine, University of Luxembourg

∗E-mail: [email protected], [email protected].

Cyber-physical systems (CPSs) embed software into the physical world. Theyappear in a wide range of applications such as smart grids, robotics, intelligentmanufacture and medical monitoring. CPSs have proved resistant to modelingdue to their intrinsic complexity arising from the combination of physical com-ponents and cyber components and the interaction between them. This studyproposes a general framework for reverse engineering CPSs directly from data.The method involves the identification of physical systems as well as the inferenceof transition logic. It has been applied successfully to a number of real-world ex-amples ranging from mechanical and electrical systems to medical applications.The novel framework seeks to enable researchers to make predictions concern-ing the trajectory of CPSs based on the discovered model. Such information hasbeen proven essential for the assessment of the performance of CPS, the design offailure-proof CPS and the creation of design guidelines for new CPSs.

1

arX

iv:1

810.

0069

7v1

[cs

.SY

] 1

Oct

201

8

Page 2: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Introduction

Since the invention of computers, software has quickly become ubiquitous in our daily lives. Soft-ware controls domestic machines, such as washing and cooking appliances, aerial vehicles such asquadrotors, the scheduling of power generation and the monitoring of human body vital signals.These technologies embed cyber components throughout our physical world. In fact, almost allmodern engineering systems involve the integration of cyber and physical processes.

The integration of cyber and physical components provides new opportunities and challenges.On one hand, this integration produces new functionality in traditional physical systems, such asbrakes and engines in vehicles, intelligent control systems for biochemical processes and wear-able devices [1–3]. On the other hand, the integration of cyber components adds new layers ofcomplexity, potentially seriously complicating their design and guaranteeing their performance.CPSs, such as modern power grids or autonomous cars, require guarantees on performance to beeconomically and safely integrated into society. In power grids, the failure of transformer taps,capacitors and switching operations alter the dynamics of the grid. Changes in dynamics of thepower grid can be extremely costly. We have, after all, already witnessed a massive power outagein Southern California on September 2011 due to a cascading failure from a single line tripping(which was not detected by operators using their model), costing billions of USD. In autonomousdriving, when operating in multiple complex scenarios – from driving on multi-lane highway toturning at intersections while obeying rules – high-level software makes decisions while low-levelcomputer control systems realize the command using a combination of GPS/IMU, camera, radarand LIDAR data [4]. In such complex scenarios, guaranteeing CPS’s performance poses a funda-mental challenge.

For performance guarantees, we require reliable models that capture essential dynamics. Thecentral question this study seeks to answer, therefore, is how to reliably and efficiently automatemechanistic modeling of CPSs from data [6,7]. An appropriate mathematical model of CPS shouldrecognize the hybridity of CPS, which comprise of discrete and continuous components due to theintegration of software and physical systems, respectively. Hybrid dynamical systems (detailed inthe Materials and Methods section below and Supplementary Materials Section S1.2) use finite-state machines to model the cyber components and dynamical systems for the physical counter-parts. Hybrid dynamical models can produce accurate predictions and enable assessments of theCPS’s performance [8]. This paper presents a new method, namely identification of hybrid dy-namical systems (IHYDE), for automating the mechanistic modeling of hybrid dynamical systemsfrom observed data and without any prior knowledge. IHYDE has low computational complexity

2

Page 3: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

and is robust to noise, enabling its application to real-world CPS problems.

There are various methods for identifying nonhybrid dynamical systems. Schmidt and Lipson [9]

proposed a data-driven approach to determine the underlying structure and parameters of time-invariant nonlinear dynamical systems. Schmidt and Lipson’s method uses symbolic regressionto identify the system, balancing model complexity and accuracy. However, symbolic regressionhas its limitations: it is computationally expensive, does not scale to large systems, and is proneto overfitting. Although recent research [10–13] managed to reduce the expensive computationalburden using compressive sensing and sparse learning, these methods cannot be applied to hybriddynamical systems because of the complexity in hybrid models; basically, these algorithms cannotaccount for an unknown number of unknown subsystems that interact via unknown transition logic.

There has been a number of interesting results in hybrid dynamical system identification in the pasttwo decades [14–26]. Researchers have been developing different methods across several fieldssuch as algebraic-geometric [17], mixed integer programming [20], bounded-error [21], Bayesianlearning [22], clustering-based strategies [23], and multi-modal symbolic regression [24]. Refer-ence [16] gives a comprehensive literature review, which summarizes all major progresses at thetime. Later, pioneering works [18, 19] use ideas from compressive sensing [27] to identify theminimum number of submodels by recovering a sparse vector-valued sequence. Despite the clearmerits of all these pioneering contributions, yet most research on hybrid system identification hasbeen dealing with the most basic hybrid dynamical model– the piecewise affine model with lineartransition rules [15]. These methods require some type of prior knowledge of the hybrid system,such as number of subsystems, parametrization of subsystems dynamics or transition logic. In con-trast, IHYDE removes all these assumptions, and with no prior knowledge of the system (exceptperhaps the general field of the system), provides the number of subsystems, their dynamics, andthe transition logic. IHYDE deals with this problem in two parts: first, the algorithm discovers howmany subsystems are interacting – and identifies a model for each one; second, the algorithm infersthe transition logic between each pair of subsystems. Later in this work, we will propose IHYDEand detail the two-step method for discovering hybrid dynamical systems from data directly. Next,we present the results of using IHYDE on a number of examples, ranging from power engineer-ing and autonomous driving to medical applications, to demonstrate the algorithm’s application tovarious types of datasets.

3

Page 4: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Results

This section is divided in two major parts. The first presents the proposed inference-based IHYDEalgorithm using a simple example– a thermostat, while the second illustrates its applicability toa wide range of systems, from real physical systems to challenging in silico systems, and fromlinear to nonlinear dynamics and transition rules. Details of both the algorithm and how data wasacquired or generated can be found in Materials and Methods or Supplementary Materials.

The inference-based IHYDE algorithm applied to a thermostat

This section explains the key concepts of IHYDE using one of the simplest and ubiquitous hybriddynamical systems: a room temperature control system consisting of a heater and a thermostat.The objective of the thermostat is to keep the room temperature y(t) near a user specified temper-ature. At any given time, the thermostat can turn the heater on or off. When the heater is off, thetemperature dissipates to the exterior at a rate of −ay(t) degrees Celsius per hour, where a > 0 isrelated to the insulation of the room. When the heater is on, it provides a temperature increase rateof 30a degrees Celsius per hour (Fig. 1A).

Assume a desired temperature is set to 20 degrees Celsius. Thermostats are equipped with hystere-sis to avoid chattering, i.e., fast switching between on and off. A possible transition rule is to turnthe heater on when the temperature falls bellow 19 degrees, and switching it off when it reaches21 degrees (Fig. 1B). The goal of IHYDE is to infer both subsystems plus the transition logic fromonly the observed time-series data of the temperature (Fig. 1C).

Inferring subsystems

The first step of the algorithm is to iteratively discover which subsystem of the thermostat generatedwhich time-series data. Initially, the algorithm searches for the subsystem that captures the mostdata, since this subsystem would explain the largest amount of data. In this case, the algorithmwould firstly find subsystem 2 (heater on) since more than half of the data corresponds to thatsubsystem (Fig. 1C). The time-series portion of the data (Fig. 1D) is then used to find the dynamicsof subsystem 2. The algorithm is then repeated on the remaining data (Fig. 1E). In this case, thereis only one subsystem left (heater off). Hence, the algorithm classified all the data to a subsystemand identified the corresponding dynamics.

4

Page 5: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Temperaturey(t)

System1(OFF) System2(On)

y 19

y � 21

Temperaturey(t)

Temperaturey(t)

(A)(C)

On On

(D)

(E)

19

20

21

Off

19

20

21

19

20

21

On On

Time

Time

TimeOff

y = �ay y = a(30 � y)

2019 21

(B)Temperaturey(t)

On

Off

Fig. 1. Thermostat example. (A) The physical dynamic equations plus the transition rules of thehybrid system. (B) The transition rules of the relay hysteresis. (C) A simulation of the temperatureof the thermostat system. Red (blue) is associated with the heater on (off). (D) (E) Values of thetemperature corresponding to the heater on (off).

Inferring transition logic

The second and final step is to identify the transition logics between the two subsystems, i.e. whattriggered the transitions from on to off and from off to on. Starting with subsystem 2 (heater on) andits associated data in Fig. 1D, the algorithm first learns that no switch occurs when the temperaturechanges from just below 19 to near 21. Since the switch happens when the temperature reaches 21degrees, the algorithm concludes that the switch from on to off happens when y(t) = 21 degrees.In practice, however, the software detects the switches when y(t) ≥ 21. Similarly, from Fig. 1E,the algorithm learns that the switch from on to off happens when y(t)≤ 19. In summary, IHYDEautomatically learns the dynamics of all subsystems and the transition rules from one subsystemto another. While this is a simple system, as we will show next, this is true even in the presence ofa large number of subsystems, potentially with nonlinear dynamics and transition rules.

Universal application

Next, we illustrate how IHYDE can be applied a wide range of applications, from power engineer-ing to robotics to medicine, showing the flexibility, applicability and power of IHYDE to modelcomplex systems. Here, we consider the following systems. 1) Autonomous vehicles and robots:design and validation of an autonomous vehicle. 2) Large scale electronics: Chua’a circuit. 3)Monitoring of industrial processes: monitoring a wind turbine. 4) Power systems: transmissionlines and smart grids. 5) Medical applications: heart atrial AP monitoring. To test IHYDE’s per-formance, these systems will include both real experiments and synthetic datasets. Details can be

5

Page 6: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

found below and in Section S3 of Supplementary Materials.

Table 1 contains a summary of the most important systems analyzed in the paper. The first threeexamples are based on real data, while the other three are based on simulated data. The first columnillustrates the systems, while the second column shows the different subsystems plus the transitionrules. Each subsystem is associated with a particular color. The third column shows the originaltime-series data (dots) in the color associated with the subsystem that generated it, the fitted datafrom the identified models (lines connecting the dots), and the location of the transitions (changesin colors). Note that, at this resolution, the original data and the data obtained from the fitted mod-els are indistinguishable. The last column presents the relative error ratio [28] between the truedata and the data simulated by the fitted model. A small error ratio indicates a good agreement be-tween the true and modeled systems, and serves as a measure of the performance of IHYDE. Datais either collected (real systems) or simulated (synthetic systems) and captures all key transitions.As seem in column 3 and column 4 of Table 1, IHYDE successfully modeled the original dynam-ics that generated the data in all examples with extremely high precision (nearly zero identificationerrors). First, it was able to classify each time point according to the respective subsystem thatgenerated it. Second, it identified the dynamics of each subsystem with a very small error (lessthan 0.3% on all simulated examples). Finally, it correctly identified the transition rules betweensubsystems.

Autonomous vehicles and robots: design and validation

To demonstrate IHYDE’s usefulness in designing and validating complex systems, we tested thealgorithm on an autonomous vehicle, custom built in the lab (Table 1A). Typically, the designprocess of complex systems consists of an arduous, time-consuming, and trial-and-error based ap-proach: start from an initial design, evaluate it’s performance and revise it until the performanceis satisfied. A primary issue with this iterative approach is that when a design fails to meet desiredspecifications, many times engineers have little to no insight on how to improve the next iteration.Often, an engineer cannot discern whether the issue is due to poor mechanical design, issues withthe software, or factors that were not considered. And this is also true with other general complexsystems that involve interactions between physical/mechanical parts and software.

The autonomous electrical car consists of a body, a MK60t board, a servo motor, a driving motor,and a camera. The design goal of the autonomous car was to successfully run through a windingtrack as quickly as possible. Using an embedded camera, the software captures information of theupcoming road layout to ascertain whether a straightway or a curve is coming up. Based on thisinformation, the motor chooses an appropriate power to match the desired speed control strategy.

6

Page 7: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

System Hybrid dynamical System Data fitting and transitions

Straight Curve

Low-voltage

model

High-voltage

model

Middle-voltage

model

Normal Broken

Base

configuration

Changed

configuration

Normal Line Fault

Normal Disease

Autonomous

vehicles and robots

A

Large scale

electronics

B

Monitoring of

industrial processes

C

Smart grid

D

Power systems

fault

monitoring

E

Medical

applications

F

Relative error ratio (%)

0.24%

5.2%

2.5%

0.000081%

0.00080%

0.029%

Differential motor input

Gating variable

Current

Derivative of Voltage

Apparent power

Current

Time

Table 1. Summary of IHYDE algorithm applied to numerous examples.

For the purpose of illustration, we considered a simple controller that provides higher velocities onstraightways and lower velocities on curves. In addition, simple feedback controllers help the carfollow the chosen speed and stay on the track. The speed control strategy is based on incrementalproportional and integral (PI) control that keeps the car at the correct speed, while the switchingrule decides on the correct speed depending on whether a straight or curve is coming up.

For the first design, we deliberately swapped the straightway and curve speeds to mimic a soft-ware bug. As a consequence, the car travelled rather slow in the straights and left the tracks in thecurves (Supplementary Materials: movie S1). While in this case it was rather easy for engineers tospot the software bug, debugging, in general, can be extremely difficult, sometimes only possibleby trial and error, and, as a consequence, very time consuming. One would like to check whetherthese types of bugs could be pinpointed by IHYDE. Indeed, from the data generated by the faultysystem, IHYDE showed that the models had incorrect speed controllers. Hence, from data alone,IHYDE successfully reverse engineered the control strategy of the CPS.

7

Page 8: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Large scale electronics (Chua’s circuit)

Debugging and verifying large scale electronics can be a daunting experience. Modeling couldhelp identify whether a device has been built according to the desired specifications by identifyingfaulty connections or incorrect implementations. Simple electrical circuits, such as RLC circuits,are linear and easy to model. However, most electronic circuits introduce both nonlinear dynamicsand switches (e.g., diodes and transistors), which can lead to extremely complex behaviors. Thus,modeling such systems can be very challenging.

To illustrate IHYDE in this scenario, we built an electronic circuit that exhibits complex behaviors.We chose a well known system, called the Chua’s circuit [29], that exhibits chaotic trajectories (Ta-ble 1B). Chaotic systems constitute a class of systems that depend highly on initial conditions, andmakes simulation and modeling very challenging. Our circuit consists of an inductor, two capaci-tors, a passive resistor and an active nonlinear resistor, which fits the condition for chaos with theleast components. The most important active nonlinear resistor is a conceptual component that canbe built with operational amplifiers and linear resistors. The resulting nonlinear resistor is piece-wise linear, making the Chua’s circuit a hybrid dynamical system with a total of three subsystems.

After collecting real data measured from the circuit, IHYDE successfully captures the dynam-ics of system and the transition rules between identified subsystems. In particular, the nonlineardynamics are consistent with the true parameters of the circuit elements. As with all examples,modeling of the Chua’s circuit was achieved using only the data, and no other assumptions ondynamics or switching behaviors.

Monitoring of industrial processes (wind turbines)

Next, we consider the problem of real-time monitoring industrial processes. Modeling large scaleindustrial processes is challenging due to the large number of parts involved, nonlinear dynamicsand switching behaviors. Switches, in particular, are caused by breaking down of parts (due towear and tear) and turning processes on and off, which introduce discontinuities in the dynamics.We propose IHYDE as a tool to detect these switches as quickly as possible to prevent lengthy andexpensive downtimes in industrial processes.

To put IHYDE to the test, we used real data from a wind turbine platform built in [30]. We mea-sured the current generated by the wind turbine under different operating conditions (Table 1C).The system included a 380V power supply, a variable load, a power generator, a motor, a fan,two couplings and a gearbox that transmits the energy generated by the wind wheel to the power

8

Page 9: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

generator [30]. We performed experiments under normal and faulty conditions (a broken toothof gearbox (Fig. S14)) and measured the current of the wind turbine with sampling frequency1000Hz. In both experiments, the generator speed was 200 revolutions per minute and the loadwas 1.5 KNm.

IHYDE was tested under two different scenarios: offline modeling, used, for example, at the de-sign stage; and online modeling, for real-time monitoring. In offline modeling, all the data areavailable for modeling, while in real-time monitoring only past data are available, and the systemis continuously modeled as new data is gathered. In offline modeling, IHYDE identifies two linearsubsystems, corresponding to the system in the two different conditions. In addition, it correctlydetects the fault right after it happens and infers the transition logic. In online modeling, a modelpredicts the next time-series data point, and compares it with the real one, when this becomesavailable. If the difference is high, IHYDE detects a transition, builds a new model, and comparesit with the old model to pinpoint the location of the fault. In this example, we focus on the onlinemodeling: the fault is detected within only 3 data points, or within 1 sec, following its occur-rence. This application demonstrates the capabilities of IHYDE in online monitoring of industrialprocesses.

Power systems (smart grids and transmission lines)

Smart grids have been gaining considerable attention in the last decades and are transforming howpower systems are developed, implemented and operated. They considerably improve efficiency,performance and makes renewable power feasible. In addition, it overhauls aging equipment andfacilitates real-time troubleshooting, which decreases brownouts, blackouts, and surges. As withall critical infrastructures, smart grids require strict safety and reliability constraints. Thus, it isof great importance to design monitoring schemes to diagnose anomalies caused by unpredictedor sudden faults [31]. Here, we consider two examples of power systems: real-time modeling tocontrol smart grids and pinpointing the location of a transmission line failure.

We start by illustrating how IHYDE can model and control smart grids in real-time. Accuratemodel information is not only necessary for daily operation and scheduling, but also critical forother advanced techniques such as state estimation and optimal power flow computation. However,such information is not always available in distribution systems due to frequent model changes.These changes include: high uncertainty in distributed energy resources, such as components be-ing added and removed from the network; unexpected events, such as line faults and unreportedline maintenance; and trigger of automatic control and protection measures. We apply IHYDE toidentify network models and infer transition logics, capturing model changes from advanced me-

9

Page 10: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

tering infrastructure data and in real-time. The 33-bus benchmark distribution system [32], shownin Figure S8, generates the data. It is a hypothetical 12.66 KV system with a substation, 4 feeders,32 buses, and 5 tie switches [32]. The system is not well-compensated and lossy, and is widelyused to study network reconfiguration problems. Assume the loads on some remote nodes of afeeder suddenly increase, causing voltage sag. Subsequently, an operator takes switch action forload balancing and voltage regulation. Figure S9 depicts the switching topologies and the realtransition logic. Suppose we can measure all active and reactive power consumption, and voltagephasors of the nodes. Hence, the system is changing between two configurations corresponding totopologies when some switches turn on and off. For each node and subsystem, IHYDE success-fully identifies the responding column of the admittance matrices with nearly zero identificationerrors. The identified admittance matrices at the switching time instants are very different fromthat of the previous moments, indicating a model switching (corresponding to changes in colors onthe data in Table 1D). Indeed, the identified logic is consistent with the real logic and demonstratesthat IHYDE can reveal voltage drops at specific nodes in real-time and suggest switch action toavoid sharp voltage drops.

To simulate a transmission line failure, assume a transmission line fails between two buses inthe network. We will use a standard benchmark IEEE 14-bus power network (please see [33] andFig. S7). This system consists of generators, transmission lines, transformers, loads and capacitorbanks. Looking directly at the generated data (Table 1E), it is not clear when the fault occurred, andmuch less what happened at the time of failure and where it was located. This is because the powersystem compensated the failure by rerouting power across other lines. IHYDE, however, can im-mediately detect the occurrence of this event and determine its location. This is done by estimatingthe new admittance matrix using only 10 measurements following the failure (corresponding to166.7 milliseconds, according to the IEEE synchrophasor measurements standard C37.118, 2011).Basically, it successfully discovers both subsystems (normal and failure) from data and calculatesthe difference of the discovered subsystems (leading to the location of the fault). Given the fre-quency at which Phasor Measurement Units (PMUs) sample voltage and current, IHYDE is ableto locate the fault in a few hundred milliseconds after the event occurs, enabling the operators todetect the event, identify its location, and take remedial actions in real-time.

Medical applications (heart atrial AP monitoring)

The development of medical devices is another active research area. Especially, with the widespreaduse of wearables and smart devices, there is an exponential growth of data collection. These datarequires personalized modeling algorithms to extract critical information for diagnosis and treat-ments. Within this context, we apply IHYDE to model data gathered from a human atrial action

10

Page 11: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

potential (AP) system [34]. The human atrial AP and ionic currents that underlie its morphologyare of great importance to our understanding and prediction of the electrical properties of atrialtissues under normal and pathological conditions.

The model captures the spiking of the atrial AP. In particular, two gating variables capture thefast and slow inactivation with switching dynamics. Following a spike, these two variables raise,preventing a new spike. Eventually, as the AP returns to low values, the inactivation dynamicsswitch back, and in time allow a new spike to take place. The goal is to test whether IHYDE candetect these transitions, together with the rules that led to the switch. In this study, IHYDE in-deed identifies the two subsystems, together with their dynamics, and pinpoints the changing logiccorrectly (Table 1F). Hence, IHYDE provides a reliable model to study the system and to builddevices to detect abnormal AP.

Discussion

This work presents a new algorithm for identifying CPS from data. The algorithm does not re-quire any prior knowledge and assumptions, except perhaps the general area of the system (e.g.a power grid or a biological system). IHYDE successfully identifies complex mechanistic mod-els directly from data, including the subsystem dynamics and their associated transition logics.The proposed method differs from classical machine learning tools, such as deep neural networkmodels [5], which typically do not provide insight on the underlying mechanisms of the systems.While IHYDE is inspired by prior work in symbolic regression [35], it has much lower computa-tional complexity due to the use of sparse identification and artificial intelligence. As a result, itcan solve large-scale CPSs, facilitating its application to complex real-world problems.

After IHYDE models a CPS, the resulting model can help verify the design specifications and pre-dict future trajectories. If the CPS model reveals design flaws or fails to meet desired requirements,it can guide the redesign to achieve the required performance. Applications include robotics andautomated vehicles (such as cars and unmanned air/spacecraft), where data-driven models promiseto reduce the reliance on trial and error. Furthermore, IHYDE can monitor, detect, and pinpointreal-time faults of CPSs (for example, power systems), thereby helping avoid catastrophic failures.As seen in the results section, IHYDE can be applied to a wide range of applications. Supple-mentary Materials includes additional examples on canonical hybrid dynamical systems [35]. Asbefore, IHYDE successfully identifies both the subsystems and the transition rules with virtuallyzero error (see Example 1-4 in Supplementary Materials).

11

Page 12: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

IHYDE unifies previous results as it can discover not only hybrid dynamical systems, but alsononhybrid dynamical systems (i.e., time-invariant linear and nonlinear systems [10,13]) as specialcases. This was confirmed in Supplementary Materials Section S2, where IHYDE successfullyidentified the original canonical dynamical systems from the data in [13] (Table S31). Hence,IHYDE provides a unified approach to the discovery of hybrid and non-hybrid dynamical systems.

While the approach has a number of advantages, there are still some open questions. First, itrequires a new theory to understand when particular datasets are informative enough to uniquelyidentify a single (the true) hybrid dynamical system. Identifiability is a central topic in systemidentification and provides guarantees that there does not exist multiple systems that can producethe same data. This is illustrated in Supplementary Materials Section S4, where we construct sev-eral hybrid dynamical systems that yield the exact same data, and hence cannot be differentiatedfrom data alone. A second issue is on the choice of a suitable tuning parameter of IHYDE thatbalances model complexity and fitness in the identification process. This often requires fine-tuningand cross validation; the result will vary considerably according to decision made.

Materials and Methods

Hybrid dynamical systems

A formal definition of hybrid dynamical systems can be found in [35, 36] and in SupplementaryMaterials. Here, we summarize the key aspects. Physical systems are characterized by inputsu(t) ∈ Rm and outputs y(t) ∈ Rn. Based on these variables, at any given time a particular modem(t) is chosen from a possible total of K modes, i.e., m(t) ∈ {1,2, ...,K}. Each mode correspondsto particular sets of physical parameters.

The physical system evolves according to sets of differential equations:

dy(t)dt

= Fk (y(t),u(t)) , k = 1,2, . . . ,K,

where each Fk(y(t),u(t))) is related to the dynamics of subsystem k. Assume y(t) and u(t) aresampled at a rate h > 0, i.e. sampled at times 0,h,2h,3h.... For fast enough sampling (for smallsampling period h), one of the simplest method to approximate derivatives is to consider

dy(t)dt≈ y(t +h)− y(t)

h,

12

Page 13: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

which yields the discrete-time system

y(t +h) = y(t)+h Fk(y(t),u(t)), fk(y(t),u(t)), k = 1,2, . . . ,K.

At any given time, the decision of the transition logic to switch to another subsystem is governedtransition rules of the form

m(t +h) = T (m(t),y(t),u(t)).

Hence, the current input-output variables y(t),u(t) plus the current subsystem m(t) determine, viaa function T , the next subsystem. Without loss of generality, we can rescale the time variablet so that h = 1. Thus, we can construct the following mathematical model for hybrid dynamicalsystems

m(t +1) = T (m(t),y(t),u(t))

y(t +1) = f(m(t),y(t),u(t)) =

f1(y(t),u(t)), if m(t) = 1... ,

...

fK(y(t),u(t)), if m(t) = K

Subsystem identification

Let Y and U denote column vectors containing all the samples of y(t) and u(t), respectively, fort = 1,2, . . . ,M + 1, where M + 1 is the total number of samples. The first step to identify thesubsystems is to construct a library Φ(Y,U) of nonlinear functions from the input-output data. Theexact choice of nonlinear functions in this library depends on the actual application. For example,for simple mechanical systems, Φ would consist of constant, linear and trigonometric terms; inbiological networks, Φ would consist of polynomial (mass action kinetics) and sigmoidal (enzymekinetics) terms. Let

Y =

y(1) y(2) . . . y(M)

T

, U =

u(1) u(2) . . . u(M)

T

.

As an illustration, for polynomials (with U = 0) we would have

Φ(Y,U) =[1 Y Y P2 · · ·

]

13

Page 14: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

where higher polynomials are denoted as Y P2,Y P3, etc. For instance, Y P2 denotes quadratic nonlin-earities:

Y P2 =

y21(1) y1(1)y2(1) · · · y2

n(1)

y21(2) y1(2)y2(2) · · · y2

n(2)...

... . . . ...

y21(M) y1(M)y2(M) · · · y2

n(M)

.

Basically, each column of Φ(Y,U) represents a candidate function for a nonlinearity in fk.

These libraries of functions may be very large. However, since only a very small number ofthese nonlinearities appear in each row of Φ, we set up a sparse regression problem to determinethe sparse vectors of coefficients W =

[w1 w2 . . . wn

]. The nonzero elements in W determine

which nonlinearities are active and the corresponding parameters. Letting

Y ,

y1(2) · · · y1(M+1)

y2(2) · · · y2(M+1)... . . . ...

yn(2) · · · yn(M+1)

T

,

and define residual Z = Y −ΦW −ξ, then the first objective is to find the sparsest possible Z, i.e.,

W ∗ = argminW‖Z‖`0

subject to: Z = Y −ΦW −ξ.

This step identifies which time points correspond to which subsystem. The second objective per-forms a similar optimization, but only for those data points where Z is zero, and searching forsparse W . This step identifies the actual dynamics of each subsystem. These two steps are doneiteratively until all subsystems have been identified. Further details are found in Algorithm 1 inSupplementary Materials.

Transition logic identification

Define γi(t) as the set membership: it equals to 1 only if the subsystem i is active at discrete-timet, otherwise it equals to 0. These functions are known from the information in the subsystem iden-tification above. Here, we are interested in learning what functions trigger the switch from one

14

Page 15: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

subsystem to another. Define also step(x), which equals to 1 if x ≥ 0, and 0 otherwise. Mathe-matically, we are searching for a nonlinear function g, such that step(g(y(t),u(t))) specifies themembership. Due to non-differentiability of step functions at 0, we alternatively relax the stepfunction to a sigmoid function, i.e., γ j(t +1)≈ 1

1+e−g(y(t),u(t)) , where j is a potential subsystem thatwe can jump to at time t+1. Assuming we are in subsystem i at time t, the fitness function to jumpto subsystem j at time t +1 is then

M−1

∑t=1

γi(t)∥∥∥∥γ j(t +1)− 1

1+ e−g(y(t),u(t))

∥∥∥∥2

`2

. (1)

To minimize eq. (1), we can parameterize g(y(t),u(t)) as a linear combination of over-determineddictionary matrix, i.e., g(y(t),u(t)) , Ψ(Y,U)v, in which Ψ can be constructed similarly to Φ inthe previous subsection and v is a vector of to-be-discovered parameters. Further details are foundin Algorithm 2 in Supplementary Materials.

References

[1] R. Poovendran, Cyber-physical systems: Close encounters between two parallel worlds[point of view]. Proceedings of the IEEE 98(8), 1363-1366 (2010).

[2] P. Antsaklis, A Brief Introduction to the Theory and Applications of Hybrid Systems. Pro-

ceedings of the IEEE 88(7), 879-887 (2000).

[3] K. Aihara, H. Suzuki, Theory of hybrid dynamical systems and its applications to biolog-ical and medical systems. Philosophical Transactions of the Royal Society of London A:

Mathematical, Physical and Engineering Sciences, 368(1930), 4893-4914 (2010).

[4] D. Wooden, M. Powers, M. Egerstedt, H. Christensen, T. Balch, A modular, hybrid systemarchitecture for autonomous, urban driving. Journal of Aerospace Computing Information

and Communication, 4, 1047-1058 (2012).

[5] Y. Lecun, Y. Bengio, G. Hinton, Deep learning. Nature, 521(7553), 436-444 (2015).

[6] J. N. Kutz, Data-driven modeling & scientific computation: methods for complex systems &

big data. (Oxford University Press, 2013).

[7] W. X. Wang, Y. C. Lai, C. Grebogi, Data based identification and prediction of nonlinearand complex dynamical systems. Physics Reports, 644, 1-76 (2016).

15

Page 16: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

[8] A. van der Schaft, J. Schumacher, An Introduction to Hybrid Dynamical Systems, SpringerLecture Notes in Control and Information Sciences, Vol. 251, Springer-Verlag, London,(2000).

[9] M. Schmidt, H. Lipson, Distilling free-form natural laws from experimental data. Science,324(5923), 81-85 (2009).

[10] W. Pan, Y. Yuan, J. Goncalves, G. B. Stan, Reconstruction of arbitrary biochemical reac-tion networks: A compressive sensing approach. In 51st IEEE Conference on Decision andControl, 2334-2339 (2012).

[11] W. X. Wang, R. Yang, Y. C. Lai, V. Kovanis, C. Grebogi, Predicting catastrophes in nonlin-ear dynamical systems by compressive sensing. Physical Review Letters, 106(15), 154101(2011).

[12] Y. H. Chang, C. Tomlin, Data-driven graph reconstruction using compressive sensing. In51st IEEE Conference on Decision and Control, 1035-1040 (2012).

[13] S. L. Brunton, J. L. Proctor, J. N. Kutz, Discovering governing equations from data bysparse identification of nonlinear dynamical systems. Proceedings of the National Academy

of Sciences, 113(15), 3932-3937 (2016).

[14] H. Ohlsson, L. Ljung, Identification of switched linear regression models using sum-of-norms regularization. Automatica, 49(4), 1045-1050 (2013).

[15] A. Bemporad, A. Garulli, S. Paoletti, A. Vicino, A bounded-error approach to piecewiseaffine system identification. IEEE Transactions on Automatic Control, 50(10),1567-1580(2005).

[16] S. Paoletti, A. L. Juloski, G. Ferrari-Trecate, R. Vidal, Identification of hybrid systems atutorial. European Journal of Control, 13(2), 242-260 (2007).

[17] R. Vidal, S. Soatto, Y. Ma, S. Sastry, An algebraic geometric approach to the identificationof a class of linear hybrid systems. In Proceedings of the IEEE Conference on Decision andControl, 167-172 (2003).

[18] L. Bako, Identification of switched linear systems via sparse optimization. Automatica,47(4), 668-677 (2011).

[19] N. Ozay, M. Sznaier, C. Lagoa, O. Camps. A sparsification approach to set membershipidentification of a class of affine hybrid systems. In Proceedings of the IEEE Conference onDecision and Control, 123-130 (2008).

16

Page 17: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

[20] J. Roll, A. Bemporad, L. Ljung, Identification of piecewise affine systems via mixed-integerprogramming. Automatica, 40(1), 37-50 (2004).

[21] A. Bemporad, A. Garulli, S. Paoletti, A. Vicino, A bounded-error approach to piecewiseaffine system identification. IEEE Transactions on Automatic Control, 50(10),1567-1580(2005).

[22] A. L. Juloski, S. Weiland, W. Heemels, A Bayesian approach to identification of hybridsystems. IEEE Transactions on Automatic Control, 50(10),1520-1533 (2005).

[23] H. Nakada, K. Takaba, T. Katayama, Identification of piecewise affine systems based onstatistical clustering technique. Automatica, 41(5), 905-913 (2005).

[24] G. Ferrari-Trecate, M. Muselli, D. Liberati, M. Morari, A clustering technique for the iden-tification of piecewise affine systems. Automatica, 39(2), 205-217 (2003).

[25] M. Oishi, E. May, Addressing biological circuit simulation accuracy: Reachability for pa-rameter identification and initial conditions. In Proceedings of the IEEE-NIH Life ScienceSystems and Applications Workshop,152-155 (2007).

[26] J. Thai, A. M. Bayen, State estimation for polyhedral hybrid systems and applications to theGodunov scheme for highway traffic estimation. IEEE Transactions on Automatic Control,60(2), 311-326 (2015).

[27] E. J. Candes, Compressive sampling. In Proceedings of the international congress of math-ematicians, 1433-1452 (2006).

[28] L. Ljung, System identification: theory for the user. (PTR Prentice Hall, Upper Saddle River,NJ 1999).

[29] L. O. Chua, M. Itoh, L. Kocarev, K. Eckert, Chaos synchronization in Chua’s circuit. Journal

of Circuits System & Computers, 2(03), 705-708 (2011).

[30] Q. He, Y. Guo, X. Wang, Z. Ren, J. Li, Gearbox fault diagnosis based on RB-SSD andMCKD. China Mechanical Engineering, 28(13), 1528-1534 (2017).

[31] W. Pan, Y. Yuan, H. Sandberg, J. Goncalves, G. B. Stan, Online fault diagnosis for nonlinearpower systems. Automatica, 55, 27-36 (2015).

[32] M. Baran, F. Wu, Network reconfiguration in distribution systems for loss reduction andload balancing. IEEE Trans Power Delivery, 4(2), 1401-1407 (1989).

17

Page 18: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

[33] Power Systems Test Case Archive. https://www2.ee.washington.edu/research/pstca/pf14/pg tca14bus.htm.

[34] M. Courtemanche, R. Ramirez, S. Nattel, Ionic mechanisms underlying human atrial actionpotential properties: insights from a mathematical model. Am J Physiol, 275(2), 301-321(1998).

[35] D. L. Ly, H. Lipson, Learning symbolic representations of hybrid dynamical systems. Jour-

nal of Machine Learning Research, 13, 3585-3618 (2012).

[36] J. Lygeros, C. Tomlin, S. Sastry, Hybrid Systems: Modeling, Analysis and Control, mono-graph.

Acknowledgements

General: The first author would like to sincerely thank Prof. Claire J. Tomlin for insightful dis-cussion and continuous help. We thank Prof. Guang Yang for help on the experimental setup.We thank Mr. Anthony Haynes, Mr. Frank Jiang, Dr. Anija Dokter and Mrs. Karen Haynes forediting. We thank Dr. Daniel Ly and Prof. Ke Li for sharing datasets. Author contributions:Y.Y. developed the IHYDE algorithms. Y.Y. and X.T. developed simulation codes for the exampleproblems considered. All authors participated in designing and discussing the study and writingthe paper. Competing interests: The authors declare that they have no competing interests. Dataand materials availability: All data needed to evaluate the conclusions in the paper are present inthe paper and/or the Supplementary Materials. The code implementation of IHYDE is available athttps://github.com/HAIRLAB/CPSid.

18

Page 19: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart
Page 20: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Supplementary Materials

Contents

S1 Preliminaries 21S1.1 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21S1.2 Introduction to Hybrid Dynamical Systems . . . . . . . . . . . . . . . . . . . . . 21

S2 IHYDE Algorithm 23S2.1 Inferring Sub-systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

S2.1.1 Application to nonhybrid dynamical systems . . . . . . . . . . . . . . . . 26S2.1.2 Variants of Algorithm 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

S2.2 Inferring Transition Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27S2.3 Cross Validation for Parameter Tuning . . . . . . . . . . . . . . . . . . . . . . . . 29

S3 Results for IHYDE 31S3.1 Example 1: Hysteresis Relay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31S3.2 Example 2: Continuous Hysteresis Loop . . . . . . . . . . . . . . . . . . . . . . . 34S3.3 Example 3: Phototaxis Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37S3.4 Example 4: Non-linear Hybrid System . . . . . . . . . . . . . . . . . . . . . . . . 40S3.5 Example 5: Power Grid Fault Detection . . . . . . . . . . . . . . . . . . . . . . . 43S3.6 Example 6: Identification of Real-time Models for Smart Grid . . . . . . . . . . . 45S3.7 Example 7: Discovery of Human Atrial Action Potential Models . . . . . . . . . . 49S3.8 Example 8: Monitoring of Industrial Processes . . . . . . . . . . . . . . . . . . . 54S3.9 Example 9: Chua’s Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59S3.10Example 10: Autonomous Car . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64S3.11Example 11: Non-hybrid Dynamical Systems . . . . . . . . . . . . . . . . . . . . 69

S4 Discussion 70S4.1 Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70S4.2 Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

A Extension of the Proposed IHYDE Algorithm 72

B User’s Manual of the Code 78

20

Page 21: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S1 Preliminaries

S1.1 Notations

Rn denotes the n-dimensional Euclidean space.

Z denotes the set of integers, . . . ,�1,0,1, . . ..

kxk`0: the `0-norm of a vector x.

kxk`1: the `1-norm of a vector x, i.e., kxk`1 =nÂ

i=1|xi|.

kxk`2: the `2-norm of a vector x, i.e., kxk`2 = (|x1|2 + · · ·+ |xn|2)1/2.

A: For a matrix A 2 RM⇥N , A[i, j] 2 R denotes the element in the ith row and jth column,A[i, :] 2 R1⇥N denotes its ith row, A[:, j] 2 RM⇥1 denotes its jth column.

a: For a column vector a 2 RN⇥1, a[i] denotes its ith element.

S1.2 Introduction to Hybrid Dynamical Systems

Before giving formal definition to hybrid dynamical systems, we shall first give a brief review todynamical systems. A dynamical system describes how state variables (typically physical quanti-ties) evolve with respect to time. Following definitions in [36], we define three types of variables.

1. continuous state variables: if the state variable takes value in Rn for n � 1.

2. discrete state variables: if the state variable takes value in a finite set, for example, {1,2,3, . . .}.

3. hybrid state variables: if a part of the state variables are continuous and the other discrete.

Based on the time set over which the state evolves, we classify the dynamical systems as:

• continuous time: if the set of time is a subset of the real line R. Normally we use t 2 R todenote the continuous time. The evolution of the state-variables in continuous time can bedescribed as ordinary differential equations.

• discrete time: if the set of time is a subset of the integers. Normally we use k 2 Z to de-note discrete time. The evolution of the state-variables in discrete time can be described asdifference equations.

A hybrid dynamical system H , is defined as a tuple, H = (W ,M ,F ,T ) with the followingdefinitions:

21

Page 22: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

• W defines a subspace in Rm+n for input-output variables u(t) 2 Rm,y(t) 2 Rn;

• M defines a countable, discrete set of modes in which only a single mode, m 2 {1,2, . . . ,K},is occupied at a given time;

• F defines a countable discrete set of first-order differential equations:

F =

⇢dy(t)

dt= Fk (y(t),u(t))) | k = 1,2, . . . ,K

• T defines a countable discrete set of transitions, where Ti! j denotes a Boolean expressionthat represents the condition to transfer from mode i to j.

The signals y(t) and u(t) are sampled at a rate h > 0, i.e. sampled at times 0,h,2h,3h.... For fastenough sampling (or low h), standard system identification typically obtains first a discrete-timesystem, and then coverts it to a continuous-time system [28]. One of the simplest methods toapproximate derivatives is to consider

dy(t)dt

⇡ y(t +h)� y(t)h

which yields the discrete-time system

y(t +h) = y(t)+h Fk(y(t),u(t)) , fk(y(t),u(t)), k 2 {1,2, . . . ,K}

For simplification of notation, assume the system can be written as

y(t +h) = fk(y(t),u(t)) , Ik(y(t))+hk(u(t)), k 2 {1,2, . . . ,K}

Hence, the class of systems considered is discrete-time, Markovian and nonlinear. While this isalready a very rich class of systems, it can be easily extended to more general nonlinear systems,including, for example, dynamics of non-separable nonlinear functions of (y(t),u(t)).

Without loss of generality, we can rescale the time variable t so that h = 1. Thus, we canconstruct a mathematical model for hybrid dynamical systems

m(t +1) = T (m(t),y(t),u(t))

y(t +1) = f(m(t),y(t),u(t)) =

8>>><>>>:

f1(y(t),u(t)), if m(t) = 1... ,

...

fK(y(t),u(t)), if m(t) = K

(2)

22

Page 23: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Example 1. Consider again the temperature control system in Fig. 1, consisting of a heater and a

thermostat. Variables that are included in a model of such a system are the room temperature y(t)

and the operating mode of the heater (on or off). Assuming a sampling time of h > 0, we obtain

the following approximate difference equations for the temperature

Subsystem 1 (heat off) :y(t +1)� y(t)

h⇡�ay(t), ) y(t +1) = (1�ah)y(t)

Subsystem 2 (heat on) :y(t +1)� y(t)

h⇡�a(y(t)�30), ) y(t +1) = (1�ah)y(t)+30ah.

These two equations model how the temperature changes under the heater off or on, respec-

tively. The transition logics between the two subsystems are

Transition logic from subsystem 1 to 2 T1!2 : y 19

Transition logic from subsystem 2 to 1 T2!1 : y � 21,

representing the control for the operating mode of the heater. This is the hybrid dynamical system

model for the thermostat. Given the hybrid dynamical system, we can study the stability of such a

system or simulate the system to check the trajectories of state variables. Note that, in practice, the

hybrid dynamical system model is usually unknown or only partially known. The goal of this paper

is to infer both the above subsystems and the transition logic (Fig. 1(A)) from only time-series data

of the temperature in Fig. 1(C).

S2 IHYDE Algorithm

When a hybrid dynamical system has a single subsystem, i.e., K = 1 in eq. (2), it becomes a time-invariant nonlinear dynamical system. We start by briefly reviewing identification tools for thisclass of systems from [10, 13], since parts of our proposed algorithm are based on these tools. Asexplained before, our algorithm uses only time-series data to directly model the system. Hence,the first step is to collect time-course input-output data (y(t),u(t)) uniformly sampled at a numberof discrete time indices t = 1,2, . . . ,M +1. Let

Y =

26664 y(1) y(2) . . . y(M)

37775

T

, U =

26664 u(1) u(2) . . . u(M)

37775

T

.

Note that y(t)2Rn and u(t)2Rm, and so Y 2RM⇥n and U 2RM⇥m. Next, we construct an overde-termined library F(Y,U) consisting of potential nonlinear functions that appear in fk in eq. (2). It is

23

Page 24: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

expected that the true nonlinearities are part of this library. The choice of these functions is guidedby the particular field of study. For example, the library would consist of sinusoidal functions inpendulums, and polynomial and sigmoidal functions in biochemical networks. As an illustration,a library consisting of constant or polynomials would result in the following dictionary matrix

F(Y,U) =h1 Y Y P2 · · · U UP2 · · ·

i.

Here, higher polynomials are denoted as Y P2 ,Y P3 , etc. For example, Y P2 denotes the quadraticnonlinearities in the state variable Y , given by:

Y P2 =

26666664

y21(1) y1(1)y2(1) · · · y2

n(1)

y21(2) y1(2)y2(2) · · · y2

n(2)...

... . . . ...

y21(M) y1(M)y2(M) · · · y2

n(M)

37777775

.

Basically, each column of F(Y,U) represents a candidate function for a nonlinearity in f. Thelibrary of functions may be very large. However, since only a very small number of these nonlin-earities appear in each row of F(Y,U), we can set up a sparse regression problem to determine thesparse matrices of coefficients W =

hw1 w2 . . . wn

i, where wi 2 RP⇥1 and P is the total num-

ber of total number of candidate functions in the library. The nonzero elements in W determinewhich nonlinearities are active [10, 13] and the corresponding parameters.

Let

Y ,

26666664

y1(2) . . . yn(2)

y1(3) . . . yn(3)... . . . ...

y1(M +1) . . . yn(M +1)

37777775

.

This results in the overall model Y = F(Y,U)W +⇠, where ⇠ =h⇠1 ⇠2 . . . ⇠n

iand ⇠i 2 RM⇥1

is zero-mean i.i.d. Gaussian measurement noise with covariance matrix s2I, for some s � 0. Thework in [10,13], developed methods based on Lasso and Sparse Bayesian Learning for identifyingeach wi in the above equation as the following optimization:

w⇤i = argmin

wkY [:, i]�Fwik2

`2+lkwik`1 . (3)

For the completeness of this paper, we briefly review the results in [10] in the Appendix A.

24

Page 25: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S2.1 Inferring Sub-systems

When K > 1, we can use a similar formulation as above. However, the challenge is that there isno single W typically fits all the data due to the hybrid nature of the dynamical system. Next, weintroduce a new method to tackle such a challenge.

Define Z = Y �FW �⇠, where ⇠ is defined similarly as realizations of zero-mean i.i.d. Gaus-sian measurement noise with covariance matrix s2I. The goal is to find a Z⇤,

hZ⇤

1 Z⇤2 . . . Z⇤

n

i,

Y �FW ⇤ �⇠ as sparse as possible, i.e.,

W ⇤ = argminW

n

Âi=1

kZik`0

subject to: Z = Y �FW �⇠.

(4)

Correspondingly, we have Z⇤i = Y �Fw⇤

i . The interpretation of this optimization is to find a W

(or equivalently a subsystem) that fits most of the input-output data. As a result, the index ofthe zero entries of Z⇤ corresponds to the index for input-output that is can be fitted by a singlesubsystem. This initial idea was similar to those presented in [18], yet we later extend this ideato a robust Bayesian algorithm that works well for noisy data. To solve eq. (4), assume, withoutloss of generality, that the dictionary matrix F is full rank. Define a transformation matrix Q inwhich each column spans the left null space of F. Then, it follows that QY = QZ + Q⇠. Usingan appropriate Lagrange multiplier lz, we now can rewrite the above problem as an unconstrainedminimization:

minZ

12( ˜Y �QZ)T⌃�1( ˜Y �QZ)+lz

n

Âi=1

kZik`0 ,

where ˜Y , QY and ⌃ = QQT .

Remark 1. This is the key step in the later proposed algorithm; there is no W in this optimization

after the transformation. Instead, we are optimizing over the residual Z.

However, this problem is known to be computationally expensive. Instead, we use the follow-ing convex relaxation

Z⇤ = argminZ

12( ˜Y �QZ)>⌃�1( ˜Y �QZ)+lz

n

Âi=1

kZik`1 .

We can decompose the above optimization to a number of smaller optimizations: for i = 1, . . . ,n

Z⇤i = argmin

Zi

12( ˜Y [:, i]�QZi)

>⌃�1( ˜Y �QZi)+lzkZik`1 . (5)

25

Page 26: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Once this problem is solved, we consider the index set seq = { j| |Z⇤i [ j]| ez} and further

identify the sparse coefficients w⇤i using the following optimization

w⇤i = argmin

wi

12kY [seq, i]�F[seq, :]wik2

`2+lwkwik`1 .

The variables w⇤i are the coefficients of the identified subsystem.

Remark 2. The reason to enforce w⇤i to be sparse is due to the constructed redundant dictionary

matrix F.

We further define error = abs(Y [:, i]�Fw⇤i ) (here abs is an elementary-wise operator which

returns the absolute value of every element of a vector) and we set the jth element of Y [:, i]: Y [ j, i] =

0 and the jth row of Q: Q[ j, :] = 0 if the jth element of error is less than ew, for some small ew > 0.This removes the data that has already been fitted by the subsystem.

Once we have the new Y and Q, we can solve the same problem with the remaining timepoints (where the corresponding elements of Y and the corresponding row of Q are nonzero) usingthe exact same procedure. The number of iterations gives the minimum number of subsystems.The proposed algorithm is summarized in Algorithm 1. The code implementation is availableat https://github.com/HAIRLAB/CPSid with User’s Manual in Appendix B. In what follows, weshall briefly discuss extensions and variants of Algorithm 1, which can empirically improve theperformance of IHYDE.

S2.1.1 Application to nonhybrid dynamical systems

When there is only one subsystem, we show that Zi should be an all 0 matrix from the first opti-mization in eq. (5). Eq. (6) should be the same as eq. (3) since seq = {1,2, . . . ,M + 1}, which re-covers the results for time-invariant nonlinear system identification in [10,13]. As a result, IHYDEprovides a unified point of view to the subsystems identification problem for any K 2 {1,2, . . .}.

S2.1.2 Variants of Algorithm 1

One of the issues in Algorithm 1, according to the experiments presented in the Results section, isquite sensitive to both noise and redundancy in the dictionary matrix. The reason for this sensitivityis that the proposed algorithm has a sequence of optimizations, thus, when an error occurs earlyon, the optimization propagates it to later times, causing large final errors. We hereby propose anew variant of Algorithm 1 to improve the empirical performance of optimizations using Bayesianstatistics. The derivations are presented in Appendix A.

26

Page 27: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Fig. S1. Schematics of the proposed subsystems identification algorithm. We construct alibrary of nonlinear functions to start with and a dictionary matrix F. We formulate an iterativeconvex optimization method to infer the number of subsystems and the underlying system modelsfor every subsystem. More specifically, we first identify a best model that fits the majority of data,then we remove the fitted data and re-do the identification until no data are left.

S2.2 Inferring Transition Logic

Once the subsystems have been identified, we can assign every input-output data point (u(t),y(t))

to a specific subsystem as shown in Fig. S1. The next step is to identify the transition logic be-tween different subsystems. We first convert the problem of identifying the transition logic to astandard sparse logistic regression problem which can be efficiently solved by many methods inthe literature. The scheme is illustrated in Fig. S2.

To proceed, we define gi(t) as the set membership which equals to 1 only if the subsystem i isactive at discrete-time t or otherwise it equals to 0. The goal is to identify the transition rules Ti! j

between any subsystems i, j. These functions are known from the information in the subsystemidentification above. Define also step(x), which equals 1 if x� 0, and 0 otherwise. Mathematically,we are searching for a nonlinear function g, such that step(g(y(t),u(t))) specifies the membership.Due to non-differentiability of step functions at 0, we alternatively relax the step function to asigmoid function, i.e., g j(t +1) ⇡ 1

1+e�g(y(t),u(t)) , where j is a potential subsystem that we can jumpto at time t +1. Assuming we are in subsystem i at time t, the fitness function to jump to subsystem

27

Page 28: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Algorithm 1 Sub-systems Identification Algorithm

1: Input: Collect input-output data u(t) and y(t) for t = 1,2, . . . ,M+1. Two pre-specified thresh-olds ez and ew, two tuning parameters lz and lw

2: Output: Return {W i} for i = 1, . . . ,K and the number of subsystems K3: Construct dictionary matrix F(Y,U) based on prior knowledge of the system4: for j = 1, . . . ,n do5: for i = 1, . . . ,Kmax do6: Compute Q in which all column span the left null space of Q: QF = 07: Solve for Zi

j from eq. (5)8: if Zi

j = 0 then9: K = i, Break

10: end if11: h = 1 and seq = [ ]12: for l = 1 . . . ,M do13: if the lth element of Zi

j, i.e., abs(Zij[l]) ez then

14: Set seq[h] = l and h++15: end if16: end for17: Solve the following convex optimization

wij = argmin

w

12kY [seq, j]�F[seq, :]wk2

`2+lwkwk`1

(6)

18: error = abs(Y [:, j]�Fwij)

19: for l = 1 . . . ,M do20: if the lth element of error, i.e., error[l] ew then21: Set Y [l, j] = 0 and Q[l, :] = 022: end if23: end for24: end for25: end for26: Return nonzero W i ,

hwi

1 . . . win

ifor i = 1, . . . ,K and the number of subsystems K

j at time t +1 is thenM�1

Ât=1

gi(t)����g j(t +1)� 1

1+ e�g(y(t),u(t))

����2

`2

. (7)

To solve the optimization in (7), we can parameterize g(y(t),u(t)) as a linear combination of over-determined dictionary matrix, i.e., g(y(t),u(t)) , Y(y(t),u(t))v, in which Y can be constructedsimilarly as F in the previous section and v is a vector of to-be-discovered parameters. The costfunction only takes non-zero value when gi(t) = 1.

28

Page 29: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Fig. S2. Illustration of the proposed Algorithm to identify transition logic. Using the mem-bership of every classified data point, we apply logistic regression to infer the logic between everypair of identified subsystems, i.e., Ti!i0 for every i and i0.

Let V , {t|gi(t) = 1}, then

M�1

Ât=1

gi(t)����g j(t +1)� 1

1+ e�g(y(t),u(t))

����2

`2

= Âk2V

����g j(t +1)� 11+ e�Yv

����2

`2

. (8)

After this transformation, the minimization of eq. (8) is known as a the logistic regression. Hence,we can use the standard gradient descent method to solve the logistic regression [39].

Similarly, we can also add an `1 regularizer in the optimization, i.e., we minimize

Ât2V

����g j(t +1)� 11+ e�Yv

����2

`2

+bkvk`1 . (9)

There are many Matlab codes for sparse linear logistic regression. Here, we adopt the implemen-tation framework proposed in [44].

S2.3 Cross Validation for Parameter Tuning

We use cross-validation as a model validation technique for assessing the results of the identifiedsystem in our experiments. We firstly identify a model from a dataset, and later we test it on adataset which is unseen in the modeling stage. The goal of cross validation is to prevent prob-

29

Page 30: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Algorithm 2 Transition Logic Identification Algorithm

1: Input: Input-output data y(t),u(t) and gi(t), i = 1,2 . . . ,K and t = 1,2, . . . ,M2: Output: Transition logic Ti! j(y(t),u(t)) for any pair i, j3: for i = 1, . . . ,K do4: for j 6= i do5: Construct the dictionary matrix from prior knowledge Y as described in the main text6: The solution to the logistic regression in eq. (9) gives the transition model for Ti! j7: end for8: end for9: Return all transition logic mapping T

lems like overfitting, give an insight on how the model can be generalized to different datasets.In particular, we use s-fold cross-validation. The original sample is randomly partitioned into s

equal sized subsamples. Of the s subsamples, a single subsample is retained as the validation datafor testing the model, and the remaining s� 1 subsamples are used as training data. The cross-validation process is then repeated s times, with each of the s subsamples used exactly once as thevalidation data. The s results from the folds can then be averaged to produce a single estimation.The advantage of this method over repeated random sub-sampling is that all observations are usedfor both training and validation, and each observation is used for validation exactly once.

30

Page 31: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3 Results for IHYDE

This section applies IHYDE to a number of examples ranging from power systems to robotics,showcasing the wide range of applicability of the proposed IHYDE method.

S3.1 Example 1: Hysteresis Relay

One of the most common Cyber Physical Systems is the Hysteresis Relay. It is found, for example,in almost all thermostats: the heater is turned on when the temperature is below a threshold, andturned off when the temperature is above another threshold. Typically, the low and high temper-ature switching are different to avoid frequent switching, which could damage the system. TheHysteresis Relay can be found in physical, chemical, engineering and biological applications.

The datasets for discovery are generated by Ly et. al. in [35]. The additive noise level variesfrom 0% to 6% in 2% increments, i.e., Np = snoise

sy⇥ 100%, where snoise is the noise variance

and sy is the variance of the measurement. We apply the proposed IHYDE to data generated byan unknown Hysteresis Relay to discover its hybrid dynamical model (shown in Fig. S3). Thediscovered systems are shown in Table S1 and Table S2 using 2000 data-points repsectively.

Subsystem 1 Subsystem 2

u > 0.98

u < �0.98

y = 0.5u2

+u � 0.5

y = �0.5u2

+u + 0.5

Fig. S3. The hybrid dynamical system model for the Hysteresis Relay. u and y are the measuredoutputs of the Hysteresis Relay.

Using IHYDE for subsystems identification, we are able to successfully identify that there areonly two subsystems that generate the datasets. In addition, the two identified subsystems areconsistent with or close to the true ones from both noiseless and noisy data. Specifically, with orwithout redundant basis functions, we are able to identify the true systems, achieving very similardiscovery results. This, in other words, demonstrates that the IHYDE is able to discover the truesubsystems, the number of subsystems together with parameterizations of every subsystem.

31

Page 32: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S1. The identified systems of Hysteresis Relay for both noiseless and noisy datasets. Wealso present the tuning parameters for different noise levels.

Noise Np = 0 Np = 2% Np = 4% Np = 6%Library F 1

Identified subsystem 1 y = 0.9999 y = 1.0013 y = 1.0027 y = 1.0040Identified subsystem 2 y = �0.9999 y = �1.0004 y = �1.0009 y = �1.0014

lz 1e�3ez 1e�4

lw 0.05

ew 0.2 0.3Number of

misclassified points0

Table S2. The identified result and details of Hysteresis Relay with redundant basis functions.

Noise Np = 0 Np = 2% Np = 4% Np = 6%Library F 1 u u2 u3 u4 u5

Identified subsystem 1 y = 0.9999 y = 1.0013 y = 1.0027 y = 1.0038Identified subsystem 2 y = �0.9999 y = �1.0004 y = �1.0009 y = �1.0014

lz 1e�3ez 1e�4

lw 0.05

ew 0.2Number of

misclassified points0

Once all subsystems have been identified and all data points have been classified, IHYDE iden-tifies the transition logic between subsystems. Even when there exists redundant basis function,IHYDE is able to precisely identify the correct transition logic. The identified results are shown inTable S3 and Table S4.

32

Page 33: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S3. The identified transition logic of the Hysteresis Relay using noiseless data and redundantbasis functions.

Systems Subsystem 1 Subsystem 2Subsystem 1 u > 0.4995Subsystem 2 u < �0.4987

Library Y 1 ub 0.1

Table S4. The identified transition logic of the Hysteresis Relay using noiseless data and redundantbasis functions.

Systems Subsystem 1 Subsystem 2Subsystem 1 u > 0.4995Subsystem 2 u < �0.4987

Library Y 1 u e(10⇤(sin(u2))+10) log(|u|)sin(u) u2 u4

b 0.1

33

Page 34: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.2 Example 2: Continuous Hysteresis Loop

A Continuous Hysteresis Loop is yet another classical hybrid system– here we use the Preisachmodel [35] for data simulation. In our setup, each subsystem has its own input-output behaviorwhile the transitions occur when the input hits certain thresholds. We apply the IHYDE to reverseengineering the Continuous Hysteresis Loop using 2000 data points generated by [35].

u > 0.5

u < �0.5

Subsystem 1 Subsystem 2y = 1

y=-1y = �1 y = �1

Fig. S4. The hybrid dynamical systems model for the Continuous Hysteresis Loop.

The identified systems are shown in Table S5 and Table S6. In contrast with the previous Hys-teresis Relay example, the IHYDE starts obtaining false classification results as the noise levelincreases and with redundant basis functions. Yet, IHYDE is still able to identify the actual sub-system dynamics up to some precision.

Once all subsystems have been identified and all data points have been classified, IHYDEidentifies the transition logic between subsystems. When there does not exist redundant basisfunction (i.e., when prior knowledge is available about the structure of transition logic), IHYDE isable to precisely identify the correct transition logic. The identified results are shown in Table S7.When there exists redundant basis functions, IHYDE still successfully identifies the transitionlogic. The identified results are shown in Table S8.

34

Page 35: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S5. The identified result and details of Continuous Hysteresis Loop.

Noise Np = 0 Np = 2% Np = 4% Np = 6%Library F polynomials in u up to second order

Identifiedsubsystem 1

y = 0.4998u2

+1.0000u�0.4999

y = 0.4989u2

+1.0003u�0.4996

y = 0.4920u2

+0.9999u�0.4982

y = 0.4842u2

+0.9980u�0.4961

Identifiedsubsystem 2

y = �0.5042u2

+1.0021u+0.5008

y = �0.5072u2

+1.0031u+0.5015

y = �0.5172u2

+1.0055u+0.5032

y = �0.5133u2

+0.9966u+0.5027

lz 0.005 0.005 0.008 0.03ez 1e�4

lw 0.005 0.001 0.005 0.008ew 0.04 0.04 0.08 0.105

Number ofmisclassified points

0 17 45 88

Table S6. The identified result and details of Continuous Hysteresis Loop with redundant basisfunctions.

Noise Np = 0 Np = 2% Np = 4% Np = 6%

Library F 1 u u2 e�u u3

eucos(2u)sin(u)3

Identifiedsubsystem 1

y = 0.4999u2

+1.0000u�0.4999

y = 0.5001u2

+1.0001u�0.4998

y = 0.4919u2

+0.9995u�0.4979

y = 0.4811u2

+0.9994u�0.4956

Identifiedsubsystem 2

y = �0.5010u2

+0.9995u+0.5002

y = �0.4979u2

+0.9990u+0.5001

y = �0.5123u2

+1.0000u+0.5034

y = �0.5275u2

+0.9999u+0.5047

lz 0.005 0.005 0.008 0.03ez 1e�4

lw 0.005 0.001 0.005 0.008ew 0.04 0.04 0.08 0.105

Number ofmisclassified points

0 11 45 94

35

Page 36: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S7. The identified transition logic of Continuous Hysteresis Loop.

System Subsystem 1 Subsystem 2Subsystem 1 u > 0.9803Subsystem 2 u < �0.9799

Library Y 1 ub 10

Table S8. The identified transition logic of Continuous Hysteresis Loop when existing redundantbasis functions.

System Subsystem 1 Subsystem 2Subsystem 1 u > 0.9803Subsystem 2 u < �0.9799

Library Y 1 u 1u2

cosusinu3

b 10

36

Page 37: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.3 Example 3: Phototaxis Robot

Consider a Phototaxis Robot with a hybrid dynamical system model shown in Fig. S5 [45], therobot has phototaxis movement: it approaches, avoids, or remains stationary depending on thecolor of light. As described in [35], the output y is velocity of the robot. There are five inputs:u1 and u2 are the absolute positions of the robot and the light, respectively, while {u3,u4,u5} is abinary, one-hot encoding of the light color, where 0 indicates the light is off and 1 indicates thelight is on.

Subsystem 1 Subsystem 2

Subsystem 3

u4 = 1

u3 = 1

u3 = 1

u5 = 1

u5 = 1u4 = 1

y = u2 � u1 y =1

u1 � u2

y = 0

Fig. S5. The hybrid dynamical system model of Phototaxis Robot.

Similar to previous examples, 2000 data points are used. IHYDE starts obtaining false classi-fication results as the noise level increases (shown in Table S9 and in Table S10). Yet, IHYDE isstill able to identify the actual subsystem dynamics without redundant basis functions when noiseintensity is low. When there exists redundant basis functions, IHYDE can identify all the subsys-tems when there is no noise. When noise level increases, IHYDE still identifies the right numberof subsystems, yet the third identified subsystem is different from the true one, i.e., y = 0.

Again, once all subsystems have been identified and all data points have been classified,IHYDE identifies the transition logic between subsystems. IHYDE is able to precisely identifythe correct transition logic both when there is no redundant basis function (Table S11) and whenthere is (Table S12). At a first glance, the inferred transition logic is different from the actual ones.Given u3,u4,u5 are binary values, still, the inferred transition logic are equivalent to the actualones.

37

Page 38: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S9. The identified result and details of tuning parameters in the Phototaxis Robot example.

Noise Np = 0 Np = 2% Np = 4% Np = 6%LibraryF u1 �u2

1u1�u2

Identifiedsubsystem 1

y = �0.9980(u1 �u2)

y = �0.9978(u1 �u2)

y = �0.9964(u1 �u2)

y = �0.99555(u1 �u2)

Identifiedsubsystem 2

y = 0.9908u1�u2

y = 0.9947u1�u2

y = 0.9820u1�u2

y = 0.9821u1�u2

Identified subsystem 3 y = 0 y = 0 y = 0.0068(u1 �u2) y = 0.0095(u1 �u2)

lz 5e�4 5e�4 5e�4 0.001ez 1e�4

lw 0.05 0.05 0.1 0.1ew 0.05 0.06 0.2 0.2

Number ofmisclassified points

0 11 28 47

Table S10. The identified result and details of tuning parameters in the Phototaxis Robot examplewith redundant basis functions.

Noise Np = 0 Np = 2% Np = 4% Np = 6%LibraryF 1 u1 �u2

1u1�u2

u21 u2

2

Identifiedsubsystem 1

y = �1.0000(u1 �u2)

y = �0.9957(u1 �u2)

y = �0.9944(u1 �u2)

y = �0.9941(u1 �u2)

Identifiedsubsystem 2

y = 0.9998u1�u2

y = 0.9891u1�u2

y = 0.9727u1�u2

y = 0.9689u1�u2

Identifiedsubsystem 3

y = 0 y = �0.0002u22

y = 0.0046(u1 �u2)

+0.0014u21

y = 0.0064(u1 �u2)

+0.0019u21

lz 1e�4 1e�4 5e�4 1e�3ez 1e�4

lw 1e�3 0.1 0.15 0.15ew 0.005 0.06 0.2 0.2

Number ofmisclassified points

0 11 28 48

38

Page 39: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S11. The identified result and details of tuning parameters in the Phototaxis Robot example.

System Subsystem 1 Subsystem 2 Subsystem 3Subsystem 1 u3 < 0.4986u4 u3 < 0.5085u5

Subsystem 2 u4 < 0.4553u3 u4 < 0.5055u5

Subsystem 3 u5 < 0.5242u3 u5 < 0.4543u4

Library Y 1 u3 u4 u5

b 0.5

Table S12. The identified transition logic for the Phototaxis Robot example.

System Subsystem 1 Subsystem 2 Subsystem 3Subsystem 1 u3 < 0.4986u4 u3 < 0.5085u5

Subsystem 2 u4 < 0.4553u3 u4 < 0.5055u5

Subsystem 3 u5 < 0.5242u3 u5 < 0.4543u4

LibraryY 1 u�11 u2 sin(u1) cos(u2) eu1u2 u3 u4 u5

b 0.5

39

Page 40: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.4 Example 4: Non-linear Hybrid System

Consider the Nonlinear Hybrid System shown in Fig. S6. This example is a system without anyphysical counterpart, yet it is useful to evaluate the capabilities of IHYDE for finding non-linearexpressions. The system consists of three subsystems, where all of the behaviors and transitionlogic consist of non-linear equations which cannot be modeled via parametric regression withoutincorporating prior knowledge. All the expressions are a function of the variables u1 and u2, thediscriminant functions are not linearly separable and the transitions are modally dependent.

Subsystem 1 Subsystem 2

Subsystem 3u21 + u2

2 < 9

u21 + u2

2 > 25

u1u2 < 0

y = u1u2 y =6u1

u2 + 6

y =u1 + u2

u1 � u2

Fig. S6. The nonlinear hybrid dynamical system model.

Using 2000 data points in dataset generated by [35], the identified results are shown in Ta-ble S13 and Table S14. Using IHYDE for subsystems identification, we are able to successfullyidentify that there are three subsystems that generate the datasets. In addition, the three identifiedsubsystems are consistent with or close to the true ones from both noiseless and noisy data. IHYDEis also able to precisely identify the correct transition logic both when there is no redundant basisfunction (Table S15) and when there is (Table S16).

40

Page 41: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S13. The identified result and details of Non-linear Hybrid System.

Noise Np = 0 Np = 2% Np = 4% Np = 6%LibraryF u1u2

6u16+u2

u1+u2u1�u2

Identified subsystem 1 y = 0.9998u1u2 y = 0.9958u1u2 y = 0.9962u1u2 y = 0.9953u1u2

Identified subsystem 2 y = 0.9983 6u16+u2

y = 0.9961 6u16+u2

y = 0.9947 6u16+u2

y = 0.9939 6u16+u2

Identified subsystem 3 y = 0.9990u1+u2u1�u2

y = 0.9945u1+u2u1�u2

y = 0.9940u1+u2u1�u2

y = 0.9949u1+u2u1�u2

lz 1e�6 1.5e�4 1.5e�4 1.5e�4ez 1e�4

lw 0.03

ew 0.6 2 2 2Number of

misclassified points0 63 129 177

Table S14. The identified result and details of Non-linear Hybrid Systems when there exists re-dundant basis functions.

Noise eps = 0 eps = 2% eps = 4% eps = 6%Library F u1u2

6u16+u2

u1+u2u1�u2

u1 u2 sin(u1) sin(u2) u21 u2

2

Identified subsystem 1 y = 0.9997u1u2 y = 0.9999u1u2 y = 0.9944u1u2 y = 0.9953u1u2

Identified subsystem 2 y = 0.9983 6u16+u2

y = 0.9960 6u16+u2

y = 0.9960 6u16+u2

y = 0.9963 6u16+u2

Identified subsystem 3 y = 0.9990u1+u2u1�u2

y = 0.9975u1+u2u1�u2

y = 0.9898u1+u2u1�u2

y = 0.9877u1+u2u1�u2

lz 1e�5 5e�5 1.5e�4 1.5e�4ez 1e�4

lw 0.03 0.03 0.032 0.0282ew 0.6 0.8 2 2

Number ofmisclassified points

0 67 129 175

Table S15. The identified transition logic of Non-linear Hybrid System.

System Subsystem 1 Subsystem 2 Subsystem 3Subsystem 1 u1

2 +u22 < 8.9993

Subsystem 2 u12 +u2

2 > 24.9803Subsystem 3 u1u2 < �0.013

Library Y 1 u1u2 u12 +u2

2

b 0.01

41

Page 42: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S16. The identified transition logic of Non-linear Hybrid System when there are redundantbasis functions.

System Subsystem 1 Subsystem 2 Subsystem 3

Subsystem 130.8698u1

2 +31.9412u22

< 282.0993

Subsystem 211.7323u1

2 +11.7384u22

> 292.7959Subsystem 3 u1u2 < �0.013

Library Y 1 u1 u2 eu1+u2 u1u2 u12 u2

2

b 0.01

42

Page 43: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.5 Example 5: Power Grid Fault Detection

The next example illustrates how IHYDE can be used in real-time monitoring applications. Con-sider the fault detection problem in a smart grid. The design of monitoring schemes to diagnoseanomalies caused by unpredicted or sudden faults on power networks is of great importance.

Here we consider a benchmark power network in Fig. S7. Suppose the line connecting buses 6and 12 disconnects at time 31, changing the admittance between these two buses to zero. We simu-late the data and only pass the data to IHYDE without other information. IHYDE can immediatelydetect the occurrence of this event and estimate the new admittance matrix using measurementsof the next 10 measurements. It successfully discovers two different subsystems from data andpinpoints the difference in the discovered subsystems which corresponds to the fault. Given thefrequency at which PMUs sample voltage and current, IHYDE is able to locate the fault in a fewhundred milliseconds after the event occurs, enabling the operators to detect the event, identify itslocation, and take remedial actions in near real-time.

Table S17. The identified result and details of power system fault detection.

Bus Bus 6 and bus 12 Other bus except bus 1 Bus 1True time for fault occurance 31 None None

Identified time for fault occurance 31 None Nonelz 1e�3 1e�3 1e�3ez 0.008 0.008 0.008lw 1e�6 1e�6 1e�9ew 0.05 0.05 0.05

Library Y 1 tb 0.01 None None

43

Page 44: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Fig. S7. The grid topology (IEEE 14-bus test case) that we used to simulate the data. Figureis obtained from http://www2.ee.washington.edu/research/pstca/pf14/pg tca14bus.html. We usedMATPOWER to simulate the system to obtain the data. All codes are available.

44

Page 45: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.6 Example 6: Identification of Real-time Models for Smart Grid

This example illustrates how the proposed IHYDE method can be used to solve the identifica-tion problem in smart grid, which contains two major parts, smart infrastructure system and smartmanagement system [46]. It is crucial to obtain real-time models for smart management systemto achieve resilient and efficient operations. Accurate model information is not only necessary fordaily operation and scheduling, but also critical for other advanced techniques such as state esti-mation and optimal power flow computation. However, such information is not always availablein distribution systems due to frequent model changes. For example, the model of a distributionsystem connected with photovoltaic panels maybe change once every eight hours [47]. Further-more, some unexpected events, such as line faults and unreported line maintenance, can lead tomodel changes. Moreover, network reconfiguration (such as switch action for balancing loads andavoiding voltage sag) happens frequently in distribution systems. Therefore, model identificationin real-time is meaningful.

We apply IHYDE to identify network models in real-time and to infer transition logic for modelchanges using data from advanced metering infrastructure. The 33-bus benchmark distribution sys-tem [32] shown in Figure S8 is used to generate data. Consider the situation where loads at someremote nodes of a feeder suddenly increase causing voltage sag, subsequently, an operator takesswitch action for load balancing and voltage regulation. Figure S9 depicts the switching topolo-gies and the real transition logic. The detailed actions and switching time are shown in Table S18.Measurements are generated via solving nonlinear power flow equations using MATPOWER tool-box [48] in MATLAB.

Suppose that we can measure all the active and reactive power consumption, and voltage pha-sors of the nodes, denoted by M as follows.

M =

26664

P(1)1 Q(1)

1 U (1)1 · · · P(1)

33 Q(1)33 U (1)

33...

......

......

......

P(m)1 Q(m)

1 U (m)1 · · · P(m)

33 Q(m)33 U (m)

33

37775 (10)

where U (m)i , P(m)

i and Q(m)i are the voltage phasor, active and reactive power of Bus i at time

instant m, respectively. The total sampling time m is set to 180 in the following simulation andmeasurement noise is not considered.

For each node, we apply IHYDE to identify the responding column of the admittance matrix.The output yi 2 R2m⇥1 of Bus i is constructed according to its active and reactive power as yi =

45

Page 46: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Feeder 1

1

2

19

3

23

20 21 22

24 25

4 5 6 7 8 9 10

26 27 28 29 30 31 32 33

1112

13

1415161718Su

bsta

tion

Feeder 2

Feeder 3

Feeder 4

Fig. S8. Diagram of the 33-bus distribution system with four feeders. Every node representsa concentrated load with active and reactive power consumption. The solid line stands for theelectric line connecting two distinct nodes in service with close sectionalizing switch, and dottedlines represent the open tie switches.

[P(1)i ,Q(1)

i , · · · ,P(m)i ,Q(m)

i ]T . Its dictionary matrix Fi 2 R2m⇥66 can be formulated as follows.

Fi =

26666666664

|U (1)i ||U (1)

1 |cosq (1)i1 · · · |U (1)

i ||U (1)33 |cosq (1)

i33 |U (1)i ||U (1)

1 |sinq (1)i1 · · · |U (1)

i ||U (1)33 |sinq (1)

i33

|U (1)i ||U (1)

1 |sinq (1)i1 · · · |U (1)

i ||U (1)33 |sinq (1)

i33 �|U (1)i ||U (1)

1 |cosq (1)i1 · · · �|U (1)

i ||U (1)33 |cosq (1)

i33...

......

......

...

|U (m)i ||U (m)

1 |cosq (m)i1 · · · |U (m)

i ||U (m)33 |cosq (m)

i33 |U (m)i ||U (m)

1 |sinq (m)i1 · · · |U (m)

i ||U (m)33 |sinq (m)

i33

|U (m)i ||U (m)

1 |sinq (m)i1 · · · |U (m)

i ||U (m)33 |sinq (m)

i33 �|U (m)i ||U (m)

1 |cosq (m)i1 · · · �|U (m)

i ||U (m)33 |cosq (m)

i33

37777777775

.

where |U (m)i | and q (m)

i j denote the voltage magnitude of Bus i and the phase difference betweennodal voltages of Bus i and j at time instant m, respectively.

Table S19 shows the identified results and the detailed tuning parameters of the proposed al-

Table S18. The detailed parameters of switch operators.

Model Switching Time Opened Switch Closed Switch Bus of load changeT1 ! T2 31,91,151 11�12 12�22 9,10,11T2 ! T1 61,121 12�22 11�12 20,21,22

46

Page 47: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Substation

Feeder 1

1

219

323

20

21

222425

4

5

6

7

8

9

10

26

27

28

29

30

31

32

33

11

1213

14

15

16

17

18

Feeder 2

Feeder 3

Feeder 4

Base configuration

Feeder 1

1

219

323

20

21

222425

4

56

7

8

9

10

26

27

28

29

30

31

32

33

11

1213

14

15

16

17

18

Substation

Feeder 2

Feeder 3

Feeder 4

Changed configuration

Δ|U10 |< -0.05

Δ|U21 | < -0.05

Fig. S9. Subsystem models and transition logic of the smart grid example.

gorithm. For example, at Bus 12, the maximum relative identification ratio of Base configurationand Changed configuration are 0.00057% and 0.00182%, respectively. The identified admittancematrices at time instants 31,61,91,121,151 are very different from that of the previous moments,which indicates the model switching. The results demonstrate that IHYDE can identify the mod-els accurately and pinpoint model switching time correctly. We add the difference of voltagemagnitude between different times, denoted by D|Ui|, into dictionary matrix for logic identifica-tion. Table S20 indicates that the identified logic is consistent with the real logic with small error.Specifically, the result of T1 !T2 (switching from T1 to T2) reveals that the voltage drop of node10 at feeder 3 are more than 0.0500 at time 30, subsequently, switch action is taken to avoid sharpvoltage drop. The tie switch between Bus 12 and 22 is closed, while the sectionalizing switchbetween Bus 11 and 12 opens. This is consistent with our preset reason that loads at Bus 9,10,11increase rapidly at time 30. There are many indistinct physical phenomenons in actual powersystem and IHYDE can be utilized to help engineers understand the hidden mechanism behind it.

47

Page 48: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S19. The identified result and detailed parameters.

System model T1 ! T2 T2 ! T1

True switching time 31,91,151 61,121Identified switching time 31,91,151 61,121

lz 5e�3ez 1.5e�2lw 1e�6ew 5e�2

Number of misclassified points 0

Table S20. The identified transition logic for the model switching in smart grid.

System Subsystem 1 Subsystem 2Subsystem 1 D|U10| < �0.0500Subsystem 2 D|U21| < �0.0473

Library Y 1 D|U1| · · ·D|U33|b 1e�4

48

Page 49: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.7 Example 7: Discovery of Human Atrial Action Potential Models

In this section, we apply IHYDE to a human atrial action potential (AP) model proposed in [34] toshow the applicability of IHYDE to the discovery in biology. The parameters of the human atrialAP model are determined based on the data that is directly measured on human atrial cells and thatis from AP model of guinea pig ventricular and rabbit atrial. The AP model can reproduce a varietyof observed AP behaviors and provide potential insights into its underlying ionic mechanisms.The human atrial AP and ionic currents that underlie its morphology are of great importance toour understanding and prediction of the electrical properties of atrial tissues under normal andpathological conditions.

Specifically, the cell membrane is modeled as a capacitor connected in parallel with variableresistances and batteries representing the ionic channels and driving forces. The AP model includes21 differential equations and 163 parameters in total (see the code for detailed information). Themembrane potential formulation is as follows

dVdt

=�(Iion + Ist)

Cm, (11)

where V is membrane potential, and Cm is the constant total membrane capacitance. Iion and Ist arethe total ionic current and stimulus current flowing across the membrane, respectively.

Figure S10 shows that the action potential generated by the AP model through voltage clampmethod is a spike-and-dome morphology commonly observed in human atrial AP recordings. Weapply the stimulation current with 2 ms pulses of 2 nA amplitude across the cell membrane ev-ery 1000 ms. To check the performance of the IHYDE method, we focus on two representativeequations about gating variables h and j with time-varying parameters as follows:

dhdt

= ah �h(ah +bh) (12)

d jdt

= a j � j(a j +b j), (13)

where h and j are fast and slow inactivation gating variables for fast inward Na+ current, respec-tively. For convenience, we present the time-varying parameters ah,a j,bh,b j:

ah =

8<:

ah1 , 0.135exp(�V+806.8 ) V < �40

ah2 , 0 V ��40

49

Page 50: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

a j =

8>>><>>>:

a j1 , [�1.2714⇥105 exp(0.2444V )�3.474⇥10�5 exp(�0.04391)] V+37.78

1+exp[0.311(V+79.23)] V < �40

a j2 , 0 V ��40

bh =

8<:

bh1 , 3.56exp(0.079V )+3.1⇥105 exp(0.35V ) V < �40

bh2 , {0.13[1+ exp(�V+10.6611.1 )]}�1 V ��40

b j =

8<:

b j1 = 0.1212 exp(�0.01052V )1+exp[�0.1378(V+40.14)] V < �40

b j2 = 0.3exp(�2.535⇥10�7V )1+exp[�0.1(V+32)] V ��40.

When the gating variables h and j are equal to 1, the fast inward Na2+ current is inactivecompletely. Figure S11 depicts that they gradually rise to their resting values 0.9775 and 0.9649after stimulus.

0 100 200 300 400 500 600

Time (ms)

-100

-80

-60

-40

-20

0

20

40

V (

mV

)

Fig. S10. Model action potential V during stimulation at the frequency of 1 Hz.

It is clearly observed that membrane voltage gradually returns to its stable resting potential�81mV after the stimulation from Figure S10. During the process, the dynamics for gating vari-ables h and j has been switched as shown in Figure S11 when the membrane voltage V goesthrough �40 mV. We apply IHYDE to discover the different models and the transition logic onlyusing measurements. The first-order differential values of h and j are considered as their output,respectively. For instance, we down-sample the differential value of h during 120�500 ms as itsoutput

yh =

dh(120)

dt,dh(120+dt)

dt, · · · , dh(499.8)

dt

�T

2 R1267⇥1.

The sampling period dt is set to 0.3 ms, and there are 1267 data points for each variable. The

50

Page 51: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

dictionary matrix of gating variables h and j, denoted by Fh and F j, respectively, are establishedbased on the terms of the above equations

Fh =

26664

exp(�V (t0)+806.8 ) h(t0)exp(�V (t0)+80

6.8 ) h(t0)bh1(t0) h(t0)bh2(t0)...

......

...

exp(�V (t1)+806.8 ) h(t1)exp(�V (t1)+80

6.8 ) h(t1)bh1(t1) h(t1)bh2(t1)

37775 ,

F j =

26664

a j1(t0) j(t0)a j1(t0) j(t0)exp(�0.01052V (t0))

1+exp[�0.1378(V (t0)+40.14)] j(t0)exp(�2.535⇥10�7V (t0))1+exp[�0.1(V (t0)+32)]

......

......

a j1(t1) j(t1)a j1(t1) j(t1)exp(�0.01052V (t1))

1+exp[�0.1378(V (t1)+40.14)] j(t1)exp(�2.535⇥10�7V (t1))1+exp[�0.1(V (t1)+32)]

37775 ,

where t0 and t1 are 120 and 499.8 ms, respectively.

100 150 200 250 300 350 400 450 5000

0.2

0.4

0.6

0.8

h

subsystem 1

subsystem 2

100 150 200 250 300 350 400 450 500

time(ms)

0

0.2

0.4

0.6

j

subsystem 1

subsystem 2

Fig. S11. The value of gating variable h and j. Different colors denote data that are produced fromdifferent subsystems.

The identified results and the detailed parameters are summarized in Table S21. We can seethat IHYDE identifies the subsystem and pinpoints the changing time correctly. The identifiedlogic for both gating variables are V �40.0093, which is very close to the real logic V �40.

51

Page 52: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S21. The identified results and detailed parameters.

Gatingvariable

h j

Actualsubsystem 1 h = �hbh2 j = �0.3 j exp(�2.535⇥10�7V )

1+exp[�0.1(V+32)]

Actualsubsystem 2

h = 0.135exp(�V+806.8 )�

0.135hexp(�V+806.8 )�hbh1

j = a j1 � ja j1 �0.1212 j exp(�0.01052V )

1+exp[�0.1378(V+40.14)]

Actual changetime 332.10 ms 332.10 ms

Identifiedsubsystem 1 h = �0.9999hbh2 j = �0.3000 j exp(�2.535⇥10�7V )

1+exp[�0.1(V+32)]

Identifiedsubsystem 2

h = 0.1349exp(�V+806.8 )�

0.1349exp(�V+806.8 )h�

0.9987hbh1

j = 1.0000a j1 �1.0000 ja j1 �0.1212 j exp(�0.01052V )

1+exp[�0.1378(V+40.14)]

Identifiedchange time 332.10 ms 332.10 ms

lz 1e-4 1e-4ez 3e-5 3e-5lw 3e-5 1e-5ew 5e-5 5e-5

Library F exp(�V+806.8 ) hexp(�V+80

6.8 )hbh1 hbh2

a j1 ja j1 j exp(�0.01052V )1+exp[�0.1378(V+40.14)]

j exp(�2.535⇥10�7V )1+exp[�0.1(V+32)]

Number ofmisclassfied

points0 0

Table S22. The identified transition logic for gating variable h.

gating variable h Subsystem 1 Subsystem 2Subsystem 1 V �40.0093

Library Y 1 Vb 1e-6

52

Page 53: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S23. The identified transition logic for gating variable j.

gating variable j Subsystem 1 Subsystem 2Subsystem 1 V �40.0093

Library Y 1 Vb 1e-6

53

Page 54: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.8 Example 8: Monitoring of Industrial Processes

The next example illustrates how IHYDE can be used for fault detection in mechanical engineering.Experiments conducted on a wind turbine system platform [30] shown in Figure S12 and S13 areused to verify its effectiveness.

This system contains a power supply of 380V , an inverter, a motor, a gearbox, a power gener-ator, and a load. The platform is used to simulate the process of air flow through wind turbines togenerate electricity. Specifically, the motor with a gear reducer of 20 : 1 ratio supplies the generatorwith mechanical power through the gearbox. In this experiment, we adopt the mode of gearboxinversion to simulate the operation of wind turbine system. The gearbox has been widely used toprovide speed and torque conversions from a motor to generator in wind turbines [49]. This sys-tem has a gearbox with three shafts, i.e., shaft with low speed, shaft with intermediate speed andshaft with high speed. The load consumes the power generated by the generator. We can measurethe root-mean-square current and voltage of the motor from the inverter. The current of generatorcan be captured by oscilloscope and its voltage is measured through multimeter. We measure thevoltage of the load in the same way.

We perform experiments under normal and faulty conditions. Both experiments are performedin the situation where the generator speed is 200 revolutions per minute and the load is 1.5 KNm.One-third of the tooth width cut off from the gear tooth on the high-speed shaft is considered asthe faulty condition shown in Figure S14. In the normal operation, the motor power is 383.01W

and the generator power is 53.28W ; the load voltage is 75V in the faulty condition.Two sets of current data are measured at the frequency of 1000Hz connected in series for iden-

tification. The first dataset contains 19,995 data points sampled under normal operating condition,the other has 20,000 data points obtained from the faulty condition. Then, we down-sample at theperiod of 0.3s and denote as i(k) in which k = 1, . . . ,133.

As described in the main text, here we used an online monitoring scheme. We construct the out-put y 2 R64⇥1, including 61 current measurements from 1.8s to 19.8s in the normal condition and3 data points from 20.1s to 20.7s in the faulty condition when the mismatch is large. Specifically

y =hi(7) i(8) . . . i(70)

iT.

The kth row of the dictionary matrix F2R64⇥28 is all the polynomial combinations of i(k), . . . , i(k+

54

Page 55: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Fig. S12. A picture of the wind turbine system platform built in [30].

5) to second order as follows:

F =

26664

1 i(6) · · · i(1) i2(6) . . . i2(1)...

......

......

......

1 i(69) . . . i(64) i2(69) . . . i2(64)

37775 ,

Figure S15 shows that the relative fitting error ratio is small. The identified results and details ofthe IHYDE are presented in Table S24, which shows that the identified time for the fault occurrenceis the same as the real fault time 68. We only use three fault points to realize the fault detection.This example demonstrates the capability of IHYDE to the identification of the fault in industrialprocesses.

55

Page 56: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Gearbox

Motor

Generator

Load

Couplings

Couplings

Power Supply(380V)

Inverter

Fan

A

B

C

DE

F

Fig. S13. The corresponding schematic diagram of the wind turbine system platform.

Fig. S14. The broken tooth in the gearbox [30].

56

Page 57: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

0 20 40 60 80k

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

i

Fig. S15. The data fitting curve using data obtained from the wind turbine platform usingIHYDE.The original time-series data (lines connecting the dots) in the color associated with thesubsystem, the fitted data from the identified models (dots), and the location of the transitions(changes in colors).

57

Page 58: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S24. The identified result and details of the gearbox broken tooth fault with redundant basisfunctions.

System GearboxTrue fault time 68

Identified fault time 68lz 1.5ez 1e�4lw 5e�5ew 0.026

Library F all the polynomial combinations of y(k) · · ·y(k�5) to second orderNumber of misclassified points 0

Library Y 1 kb 0.1

58

Page 59: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.9 Example 9: Chua’s Circuit

In this subsection and the next one, we shall apply IHYDE to data that is obtained from exper-iments. Chua’s circuit shown in Fig. S16 is the simplest electronic circuit that exhibits classicchaotic behavior. It consists of an inductor, two capacitors, a passive resistor and an active nonlin-ear resistor which fits the condition for chaos with the least components. The most important activenonlinear resistor is a conceptual component and the resistor can be built with operational ampli-fiers and linear resistors. The current-voltage characteristics of the nonlinear resistor are plotted inFig. S17.

Fig. S16. The circuit structure of the Chua’s circuit.

Fig. S17. The current-voltage characteristics of the nonlinear resistor.

By design, the current-voltage relationship can be described as follows:

i(U) =

8>>><>>>:

aU +(b�a)(U �E), U > E

aU, �E < U < E

aU +(b�a)(U +E), U < �E,

(14)

59

Page 60: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

or equivalently

i(U) = bU +12(a�b)(|U +E|� |U �E|). (15)

In both equations, a, b, E are parameters defined in Fig. S17.The nonlinear resistor can be built using the circuit realization as shown in Fig. S18.

Fig. S18. The circuit structure of nonlinear resistor with specified current-voltage realization.

From KCL and KVL, we obtain

C1dU1

dt=

U2 �U1

R� i(U1)

C2dU2

dt=

U1 �U2

R+ IL

�LdIL

dt= U2.

(16)

Then we introduce a number of variables to simplify the above equations:

y1 =U1 ⇤ threshold

E

y2 =U2 ⇤ threshold

E

y3 =IL ⇤ threshold

E,

(17)

where the variables are defined as below:

• C1: Capacity of Capacitor1. C2: Capacity of Capacitor2. L: Inductance of the Inductor.

• U1 : Voltage through Capacitor1. U2: Voltage through Capacitor2.

• IL: Current through Inductor. i: Current through the nonlinear resistor.

60

Page 61: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

• E: the threshold voltage for the nonlinear resistor. a: the slope of low voltage for the nonlin-ear resistor. b: the slope of high voltage for the nonlinear resistor

• threshold: a parameter of the Chua’s circuit, which equals to E in this experiment.

Leta =

1RC1

b =1L

f (y)|y=x thresholdE

= Rthreshold

Ei(x),

(18)

we have the equation in the following form:

dy1

dt= a[y2 � y1 � f (y1)]

RC2dy2

dt= y1 � y2 +Ry3

dy3

dt= �by1,

(19)

where f (x) is given by

f (x) =

8><>:

k1x+b1, x < �threshold (20a)

k0x, �threshold < x < threshold (20b)

k1x+b2. x > threshold (20c)

andk0 = Ra

k1 = Rb

b1 = R(a�b)threshold

b2 = R(a�b)threshold.

(21)

The behavior of the system will be changed between chaos and non-chaos depending on the valueof R. Each mode in eq. (20a), eq. (20b) and eq. (20c), corresponds to subsystem 1, subsystem 2and subsystem 3 respectively. We focus on the discovery of the first equation in eq. (19) and onlycollect the value of y1 and y2. The output data from the Chua’s circuit can be seen in Fig. S19.

From the true parameters in Table S25, we can compute the true coefficients of f (x) to deter-

61

Page 62: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Fig. S19. The output trajectory of the Chua’s circuit. Different colors in the trajectory specify theoutput generated by different subsystems.

mine dy1dt :

10�5 dy1

dt=

8><>:

1.0858y2 �0.2115y1 �0.5349, y1 < �1.5 (22a)

1.0858y2 +0.1451y1, �1.5 < y1 < 1.5 (22b)

1.0858y2 �0.1994y1 +0.5168. y1 > 1.5. (22c)

Table S25. True parameters of the built Chua’s circuit.

Item Value Item Valuea �1.2309e�3 c1 0.01µFb �8.743e�4 c2 0.1µFb0 �8.864e�4 dt 10�5s

threshold 1.5 L 6.8mHR 921

The algorithm accurately infers the form of eq. (22c) from the data as shown in Table S26.

62

Page 63: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S26. The identified subsystems and transition logic of Chua’s circuit which contains allsubsystems.

Library F 1 y1 y2

Identified subsystems10�5 dy1

dt = 1.0889y2 �0.2095y1 �0.5622 y1 < �1.588810�5 dy1

dt = 1.0813y2 +0.1593y1 �1.5627 < y1 < 1.313710�5 dy1

dt = 1.0870y2 �0.2128y1 +0.4860 y1 > 1.4627lz 0.01ez 0.05lw 0.01ew 0.045

Library Y 1 y1

b 0.01

Table S27. The identified subsystems and transition logic of Chua’s circuit which contains allsubsystems with redundant basis functions.

Library F 1 y1 y2 ey1 y1y2

cos(0.1y1)2

1+y22

cos(y1 + y2)2

Identified subsystems10�5 dy1

dt = 1.0758y2 �0.2028y1 �0.5405 y1 < �1.434810�5 dy1

dt = 1.0793y2 +0.1576y1 �1.5627 < y1 < 1.313710�5 dy1

dt = 1.0869y2 �0.2127y1 +0.4859 y1 > 1.4627lz 0.05ez 0.012lw 0.01ew 0.044

Library Y 1 y1 y2 sin(y2)cos(y1)dy1dt

sin(y1)+dy1dt

dy1dty2

dy1dt

b 0.01

63

Page 64: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.10 Example 10: Autonomous Car

This example presents the results of IHYDE applying to an autonomous car built in our lab. Theautonomous car consists of a body, a MK60t board, a servo motor, tow driving motors and a cam-era. During execution, the embedded camera captures the upcoming road layouts to check whetherthere is an upcoming straightaway or curve. Naturally, the car will drive faster on straightawaysand slower on the curves.

Based on this design principle, we would like to design a hybrid dynamical system with twosubsystems and simple transition logic to realize this goal as shown in the right panel of Figure S21.The car measures current speed by encoder and calculates the Du, a control input to the motor. Thespeed control strategy is based on an incremental PI control algorithm, which is widely used incontrol systems. The incremental PI algorithm is developed from position PI algorithm. Theposition PI model is described as below and can be seen in Fig. S20. r(t) represents the inputof the whole system (the expected speed vexpect(t)) and c(t) represents the output of the wholesystem (the real speed observed v(t)). In the figure, u(t) is the output of the controller and it can

Fig. S20. The position PI controller structure for the autonomous car.

be calculated from e(t) (where TI is the time constant for the integral control):

u(t) = P

24e(t)+

1TI

tZ

0

e(t)dt

35 . (23)

In the Laplace domain, eq. (23) is equivalent to U(s) = D(s)E(s), where U(s) and E(s) are theLaplace transform of u(t) and e(t) respectively, D(s) represents the transfer function of the con-troller:

D(s) =U(s)E(s)

= P✓

1+1

TIs

◆. (24)

Since the controller is implemented by a computer, it must be first converted to discrete time. The

64

Page 65: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

integral can be approximated by

tZ

0

e(t)dt ⇡k

Âi=0

Te(i) ) de(t)dt

⇡ e(k)� e(k�1)

T. (25)

So we obtain the following control law

u(k) = P

"e(k)+

TTI

k

Âi=0

e(i)

#. (26)

The position PI algorithm is usually approximated by an incremental PI algorithm:

Du(k) , u(k)�u(k�1) = P[e(k)� e(k�1)]+ Ie(k), (27)

where I , PTTI

. In the autonomous car example, we have

r(k) = vexpect(k), c(k) = v(k), e(k) = vexpect(k)� v(k). (28)

Here vexpect(k) is the expected velocity depending on whether there is an upcoming straightaway orcurve from the camera. We set up a faster velocity on straightaways and slower one on the curves.Substituting e(k) into the equation, we obtain

Du(k) = P[v(k�1)� v(k)]+P[vexpect(k)� vexpect(k�1)]+ I(vexpect(k)� v(k)). (29)

When the car changes its expected velocity, this could lead to a more complicated hybrid dynamicalsystem than we would design as shown in Figure S21. This side effect is due to the abrupt switchingand discretization. In practice, we normally neglect these subsystems in the modeling, analysis anddesign. The flow chart of the PI control algorithm is shown in Fig. S22.

Next, we demonstrate how IHYDE can help in the design process. In the first experiment, theautonomous car failed to drive through the track. We collected the experimental data and usedIHYDE to discover the failed system. We compared the discovered system model with the to-bedesigned one and found an implementation error that led the system to failure. We expected ahigher speed when the car is running in a straight line and a lower speed while it is running ona curve. The model from the failed experiments showed that the transition logistics should bereversed as shown in Fig. S20 and Table S28. We fixed the bug and as a result the autonomous carwas able to run through the track. Finally, as a validation, we collected the data and repeated themodeling process in Table S29 and Table S30.

In summary, IHYDE successfully reverse engineered the control strategy of the CPS. Addi-

65

Page 66: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Subsystem 1

Subsystem 2

Subsystem 3 Subsystem 4�u(k) = P (v(k � 1) � v(k))

+P (vexpect2 � vexpect1)

+I(vexpect2 � v(k))

�u(k) = P (v(k � 1) � v(k))

+P (vexpect1 � vexpect2)

+I(vexpect1 � v(k))

�u(k) = P (v(k � 1)

�v(k)) + I(vexpect1

�v(k))

�u(k) = P (v(k � 1)

�v(k)) + I(vexpect2

�v(k))

Subsystem 1

Subsystem 2

Designed SystemImplemented System

straight(k) = 0

straight(k � 1) = 0

straight(k) = 0

straight(k � 1) = 1

straight(k) = 1

straight(k � 1) = 1

straight(k) = 1

straight(k � 1) = 0

straight(k) = 1

straight(k) = 0

�u(k) = P (v(k � 1)

�v(k)) + I(vexpect1

�v(k))

�u(k) = P (v(k � 1)

�v(k)) + I(vexpect2

�v(k))

Fig. S21. Left: a more complicated hybrid dynamical system model due to discretization andswitching. Right: the correct hybrid dynamical system model that we would like to design.

Fig. S22. The flow chart of the PI algorithm.

tionally, we deliberately swapped the straightway and curve speeds to mimic a software bug. Themodeled system immediately pinpointed the location of the faulty software and yielded importantinformation for debugging the system.

66

Page 67: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Fig. S23. The IHYDE pinpoints the implementation error that leads to a failure in the autonomouscar experiment. Left: the identified model from experimental data. Right: the designed system.The two subsystems are swapped around due to a design bug.

Table S28. The identified transition logic of autonomous car testbed.

System Straightaway CurveStraightaway straight < 0.3318

Curve straight > 0.6072

Library Y 1 straight sin(v(k)) cos(v(k)) tan(v(k)) v(k�1)�v(k�4)v(k�2) v(k�1) tan(v(k�3))

b 0.001

Table S29. The identified result and details of autonomous car testbed.

Library F 1 v(k) v(k�1)

Speed control strategy Straightaway Curve

True strategyDu(k) = 9.5(380� v(k))

+48(v(k�1)� v(k))Du(k) = 9.5(280� v(k))

+48(v(k�1)� v(k))True times 147 249

Identified strategyDu(k) = 9.4957(379.9550� v(k))

+47.9742(v(k�1)� v(k))Du(k) = 9.4961(279.9469� v(k))

+47.9927(v(k�1)� v(k))Identified switching times 147 249

lz 0.1ez 100lw 1e�5ew 8

67

Page 68: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S30. The identified result and details of autonomous car testbed with redundant basis func-tions.

Library F all the polynomial combinations of v(k), . . . ,v(k�4) to fourth orderSpeed control strategy Straightaway Curve

True strategyDu(k) = 9.5(380� v(k))

+48(v(k�1)� v(k))Du(k) = 9.5(280� v(k))

+48(v(k�1)� v(k))True times 147 249

Identified strategyDu(k) = 9.4957(379.9554� v(k))

+47.9741v(k�1)� v(k))Du(k) = 9.4959(279.9465� v(k))

+47.9888(v(k�1)� v(k))Identified switching times 147 249

lz 0.01ez 100lw 1e�5ew 8

68

Page 69: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S3.11 Example 11: Non-hybrid Dynamical Systems

We also tested the proposed method to non-hybrid dynamical systems using datasets in [13] toillustrate the applicability of IHYDE. The results are summarized in Table S31 and Table S32.These tables show that the proposed subsystem identification algorithm unifies previous results forthe discovery of nonhybrid dynamical systems, such as examples in [13].

Table S31. The identified results using datasets in [13]. Five prototypical systems were exam-ined. We used the actual systems to simulate datasets and then used IHYDE to reverse engineeringthese systems from data.

Examples Actual systems Discovered systems

Linear 2D ddt

"xy

#=

"�0.1 2�2 �0.1

#"xy

#ddt

"xy

#=

"�0.0993 2.0054�2.0004 �0.1048

#"xy

#

Cubic 2D ddt

"xy

#=

"�0.1 2�2 �0.1

#"x3

y3

#ddt

"xy

#=

"�0.1015 2.0005�2.0010 �0.1002

#"x3

y3

#

Linear 3D ddt

264

xyz

375=

264�0.1 2 0�2 �0.1 00 0 �0.3

375

264

xyz

375 d

dt

264

xyz

375=

264�0.0992 2.0002 0�1.9999 �0.0991 0

0 0 �0.2983

375

264

xyz

375

Lorenz systemx = 10y�10x

y = 28x� xz� yz = xy�2.6667z

x = 10.0060y�9.9968xy = 27.9480x�0.9957xz�0.9954y

z = 1.0010xy�2.6673z

Logistic mapxk+1 = µkxk(1� xk)

µk+1 = µk

xk+1 = µkxk(1.0005�1.0006xk)

µk+1 = 1.0000µk

Table S32. The identified results using datasets in [13]. The tuning parameters are presentedfor these prototypical examples.

Examples Noise lz ez lw ew Numbers of points

Linear 2D 0.05 1 1e�4 2e�3 0.2 1001

Cubic 2D 0.05 1 1e�4 2e�3 0.2 1001

Linear 3D 0.01 1 1e�4 2e�3 0.2 1001

Lorenz system 1 1 1e�4 1e�3 4 1000

Logistic map 0.01 1 1e�4 2e�3 0.2 990

69

Page 70: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

S4 Discussion

The IHYDE algorithm has been tested in a number of examples. As the number of basis functionsand the amount of noise increases, the algorithm is eventually unable to identify the actual model.Although it can fit data very well, it usually obtains more complex models than the true ones.This is actually a typical problem in system identification [28]. When the data is not informative,it leads to non-identifiability issues, i.e., there will exist multiple hybrid dynamical systems thatcan produce the same data, which prevents the proposed IHYDE algorithm from finding the truesystem.

S4.1 Example 1

Consider the following linear system with unknown parameters k1 and k2

ddt

24x1

x2

35=

24k1 1+ k2

0 k1 + k2

3524x1

x2

35+

240

1

35u

y =h0 1

i24x1

x2

35 .

(30)

The observed output is plotted as follows in Fig. S24 (the system is stimulated by an impulse input,i.e., u(t) = d (t) where d (·) is the Dirac delta function):

0 2 4 6 8 10

time (t)

0

0.5

1

y

Fig. S24. The observed output of eq. (30) when a Dirac delta function is applied to stimulate thesystem.

However, any k1,k2 with k1 + k2 = 0.8 produces the same input-output data. For example, theactual system

70

Page 71: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

ddt

24x1

x2

35=

240.4 1.4

0 0.8

3524x1

x2

35+

240

1

35u

y =h0 1

i24x1

x2

35 ,

(31)

andddt

24x1

x2

35=

240.3 1.5

0 0.8

3524x1

x2

35+

240

1

35u

y =h0 1

i24x1

x2

35

(32)

are indistinguishable from the input-output data alone. Hence, without more information, the trueparameters cannot be identified using any methods.

S4.2 Example 2

The previous example demonstrates that, when the parameterization is not identifiable, no algo-rithm is able to identify the correct parameters. Next, we shall demonstrate another example where,even though the subsystem is identifiable, the data is not informative enough. For example, someof the logic transitions never occur. Consider the following hybrid dynamical system in Fig. S25.If the system starts at initial condition y(0) = 18, then it always stays in subsystem 1. Hence, withthe data generated, no algorithm is able to identify the complete hybrid dynamical system.

Subsystem 1 Subsystem 2

y = �ay y = �a(y � 30)

y 21

y � 19

Fig. S25. A counterexample of a constructed hybrid dynamical system that is not able to beidentified from data.

71

Page 72: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

A Extension of the Proposed IHYDE Algorithm

Both subsystems identification algorithms require solving a number of `1 regularized optimizationproblems. In practice, we introduced an alternative method to formulate the optimization problemby introducing a sparse prior and re-deriving the process using Bayesian statistics. This can leadto recursive `1 regularized optimizations (shown below). This section is a brief introduction to theresults in [10], for the completeness of the paper, we present it here.

In the following, we shall first interpret the `1-minimization problem from a Bayesian perspec-tive and show that the regularized least square problem is equivalent to introducing a sparsity-inducing prior in the Bayesian framework. We demonstrate that the corresponding cost functionis concave both in the hyperparameter space and in the weight space. Finally, we analyze the costfunction and propose algorithms to relax the concave optimization problem into an iterative convexoptimization problem based on the reweighted `1-minimization algorithm.

Bayesian modeling treats all unknowns as stochastic variables with certain probability distri-butions. Consider a generalized model y = Qw + ⇠ with ⇠ being a vector containing stochas-tic variables, it’s assumed that the stochastic variables in ⇠ are i.i.d. Gaussian distributed with⇠ s N (0,s 2I). Then the likelihood of the output y given a weight w is

P(y|w) = N (y|Qw,s2I) µ exp� 1

2s2 ky�Qwk22

�. (33)

Note that obtaining the maximum likelihood estimates for w in eq. (33) is equivalent to searchingfor a minimal `2-norm solution w, i.e., minky�Qwk2

2 . Given the likelihood function in eq. (33)and specifying a prior of w

P(w) = ’j

P(w j), (34)

we compute the posterior distribution over w via Bayes’ rule:

P(w|y) =likelihood ⇥ prior

normalizing f actor=

P(y|w)P(w)

P(y)=

P(y|w)P(w)RP(y|w)P(w)dw

. (35)

Let P(w) be defined as

P(w) µ exp�1

2g(w)

�= exp

"�1

2 Âj

g(w j)

#, (36)

where g(w j) is an arbitrary penalty function of w j. Combining P(y|w) and P(w), we get the

72

Page 73: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

posterior distributionP(w|y) µ P(y|w)P(w). (37)

Based on the fact that the posterior distributions P(w|y) and P(y) are independent of w, we canderive the following cost function

L (w) ,�2logP(w|y) µ1

s2 ky�Qwk22 +g(w). (38)

We then formulate a maximum a posteriori (MAP) estimate w to y = Qw+⇠:

wMAP = argmaxw

P(w|y) = argminw

{ky�Qwk22 +s2g(w)}. (39)

From a Bayesian viewpoint, MAP estimation is equivalent to a regularized least square problem.Setting g(w) = kwk1 and lz = s2, we recover the `1 regularized optimization in eq. (6). We mayalso consider super-Gaussian priors, which yield a lower bound for the priors P(w j) [41]. Morespecifically, if we define � , [g1, . . . ,gN ]T 2 RN

+, we can represent the prior in eq. (34) in thefollowing relaxed (variational) form [40]:

P(w) =n

’j=1

P(w j), P(w j) = maxg j>0

N (w j|0,g j)j(g j), (40)

where j(g j) is a nonnegative function which is treated as a hyperprior with g j being associatedhyperparameters. Throughout, we call j(g j) the potential function. This Gaussian relaxation ispossible if and only if logP(

pw j) is concave on (0,•) shown in [40].For a fixed � = [g1, . . . ,gN ]T , define a relaxed prior which is a joint probability distribution of

w and �

P(w;�) = ’j

N (w j|0,g j)j(g j) = P(w|�)P(�), (41)

where P(w|�) , ’ j N (w j|0,g j),P(�) , ’ j j(g j). Since is P(y|w) is Gaussian in eq. (33), wecan get a relaxed Gaussian posterior

P(w|y,�) =P(y|w)P(w;�)R

P(y|w)P(w;�)dw= N (mw,Sw), (42)

where

mw = GQT (s2I +QG�1QT )�1y, (43)

Sw = G�GQT (s2I +QG�1QT )�1Q, (44)

73

Page 74: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

with G , diag[�]. Next we shall choose the most appropriate � = � = [g1, . . . , gN ]T to maximize

’ j N (w j|0,

g j)j(g j) such that P(w|y, �) can be a good relaxation to P(w|y). Using the product rule for proba-bilities, we can write the full posterior

P(w,�|y) µ P(w|y,�)P(�|y) = N (mw,Sw)⇥ P(y|�)P(�)

P(y). (45)

Since P(y) is independent of �, the quantity

P(y|�)P(�) =Z

P(y|w)P(w|�)P(�)dw

is the prime target for variational methods [42]. This quantity is known as evidence or marginallikelihood. One way to select � is to choose it as the minimizer of the sum of the misalignedprobability mass, e.g.,

� = argmin��0

ZP(y|w)|P(w)�P(w;�)|dw

= argmax��0

ZP(y|w)

n

’j=1

N (w j|0,g j)j(g j)dw. (46)

Since P(w) is not a function of �, and from the variational representation in eq. (40), we can removethe absolute value. The procedure in eq. (46) is referred to as evidence maximization or type-IImaximum likelihood [43]. It means that the marginal likelihood can be maximized by selectingthe most probable hyperparameters able to explain the observed data. Once � is computed, anestimate of the unknown weights can be obtained by setting w to the posterior mean in eq. (43)

w = E(w|y; �) = GQT (s2I +QGQT )�1y. (47)

We shall now propose an algorithm to compute � in eq. (46). Then, from this computed � we canobtain an estimation of the posterior mean w in eq. (47) based on the reweighted `1-minimizationalgorithm.

Theorem 1. [10] The optimal hyperparameters � in (46) can be achieved by minimizing the

following cost function

L� (�) = log��s2I +QG�1QT ��+ yT (s2I +QG�1QT )�1y+

N

Âj=1

p(g j), (48)

where p(g j) = �2logj(g j).

74

Page 75: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Then, from eq. (47), we compute the posterior mean to get an estimate to w:

w = E(w|y; �) = GQT (s2I +QGQT )�1y

where G = diag[�].The data-dependent term can be re-expressed as

yT �s2I +QG�1QT��1y = s�2yT y�s�2yT QSwQT s�2y

= s�2yT (y�Qmw)

= s�2ky�Qmwk22 +mT

wS�1w mw �s�2mT

wQT Qmw

= s�2ky�Qmwk22 +mT

wGmw

= minw

{ 1s2 ky�Qwk2

2 +wT Gw}. (49)

From eq. (49), we can create a strict upper bounding auxiliary function L�,w(�,w) on L�(�) ineq. (46),

L�,w(�,w) , h�⇤,�i�h⇤(�⇤)+ yT �s2I +QG�1QT��1y

=1

s2 ky�Qwk22 +Â

j

w2

j

g j+ g⇤j g j

!�h⇤(�⇤). (50)

Then we define the terms excluding h⇤(�⇤) in eq. (50) as

L�⇤(�,w) , 1s2 ky�Qwk2

2 +Âj

w2

j

g j+ g⇤j g j

!. (51)

For a fixed �⇤, we notice that L�⇤(�,w) is jointly convex in w and � and can be globally minimizedby solving over � and then w. Since w2

j/g j +g⇤j g j � 2w j

q�⇤

j , for any w, g j = |w j|/q

g⇤j minimizesL�⇤(�,w).

The next step is to find a w that minimizes L�⇤(�,w). When g j = |w j|/q

g⇤j is substituted intoL�⇤(�,w), w can be obtained by solving the following weighted convex `1-minimization problem

w = argminw

{ky�Qwk22 +2s2

N

Âj=1

u j|w j|}

= argminw

{ky�Qwk22 +2s2

N

Âj=1

q�⇤

j |w j|}. (52)

75

Page 76: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

whereq�⇤

j are the weights. We can then set

g j =|w j|q

g⇤j,8 j, (53)

and, as a consequence, L�⇤(�,w) will be minimized for any fixed �⇤.Consider L�,w(�,w) in eq. (50) again, for any fixed � and w, the tightest bound can be obtained

by minimizing over �⇤. The tightest value of �⇤ = �⇤ equals the slope at the current � of thefunction h(�) , log |s2I +QG�1QT |+ÂN

j=1 p(g j). Using basic principles of convex analysis, wethen obtain the following analytic form for the optimizer �⇤:

�⇤ = —�

log |s2I +QG�1QT |+

N

Âj=1

p(g j)

!

= diaghQT �s2I +QG�1QT��1 Q

i+ p0(�), (54)

where p0(�) = [p0(g1), . . . , p0(gN)]T .The algorithm is then based on successive iterations of eq. (52), eq. (53) and eq. (54) until it

converges to �. We can then compute the posterior mean and covariance as in eq. (43) and eq. (44)

w = GQT (s2I +QGQT )�1y

Sw = G� GQT (s2I +QGQT )�1Q (55)

where G = diag[�]. The above described procedure is summarized in Algorithm 3.

76

Page 77: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Algorithm 3 Reweighted `1-minimizationData: y and QResult: Posterior mean for w.

Step 1. Set iteration count k to zero and initialize each u(0)j =

qg⇤j , with randomly chosen initial

values for g⇤j , 8 j, e.g., with g⇤j = 1, 8 j. Set k = 1.

Step 2. At the kth iteration, solve the reweighted `1-minimization problem

w(k) = argminw

{ky�Qwk22 +2s2

N

Âj=1

u(k)j |w j|}.

Step 3. Compute g(k)j

g(k)j =

|w(k)j |

qg⇤(k)j

,8 j,

Step 4. Update �⇤(k+1) using eq. (54)

�⇤(k+1)= diag

QT⇣

s2I +QG(k)QT⌘�1

Q�+ p0(�(k)).

Step 5. Update weights u(k+1)j for the `1-minimization at the next iteration u(k+1)

j =

q�⇤

j(k+1)

.

Step 6. Set k = k +1. Iterate Steps 2 to 5 until convergence to some �.

Step 7. Compute w = E(w|y; �) = GQT (s2I +QGQT )�1y.

77

Page 78: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

B User’s Manual of the Code

Identification of hybrid dynamical systems (IHYDE) is a open-source Matlab toolbox for automat-ing the mechanistic modeling of hybrid dynamical systems from observed data. IHYDE has muchlower computational complexity comparing with genetic algorithms [35], enabling its applicationto real-world CPS problems. IHYDE implements the clustering-based algorithms described in theData-driven Discovery of Cyber Physical Systems. It can also be used, positively, for the creationof guidelines for designing new CPSs. IHYDE uses routines of the CVX and SLR toolboxes forconstructing and solving disciplined convex programs (DCPs).

Download the latest version of IHYDE toolbox in a directory and add its path (and the pathof the subdirectories) to the Matlab path. The IHYDE toolbox consists of directories listed inTable S33.

Table S33. Directories in the IHYDE toolbox.

Directories Description/IHYDE main functions and examples

/IHYDE/docs pdf and html documentation/IHYDE/data datasets we used in the paper

/IHYDE/tools functions for IHYDE

Table S34, Table S35, Table S36 and Table S37 contain the introduction of IHYDE’s API.

Table S34. The introduciton of function library which can construct library for further identifica-tion.

Function library Description

yinan m by n matrix which contains time-course input-output data.

In here, m is the sample number, and n is the number of variables.polyorder used to construct the polynomial of the highest order (up to fifth order).

basis function add more basis functions. It can be turned off, if basis function.work set as ’off’.Phi constructed dictionary matrix F.

78

Page 79: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

Table S35. The introduction of function ihyde. The ihyde can be used to identify each subsystem.

Function ihyde Descriptionparameter.y the output data.

parameter.normalize y set to 1 if y need to be normalized.parameter.max s the max number of subsystems that could be identified by IHYDE.parameter.epsilon a 1 by 2 vector. For finding new subsystems.parameter.lamdba a 1 by 2 vector. For the identification of a new subsystems.

parameter.MAXITER the max number of iterations that the sparsesolver function solves.result. idx sys the index of each subsystem.

result.sys the model of each subsystem.result.theta z of each identified subsystem.result.error the fitting error.

Table S36. The introduction of function finetuning. Based on the minimum error principle, itfinetunes the result from ihyde and outputs the final result.

Function finetuning Description

result.lambdathe trading-off parameter l of the sparsesolver function.

Parameter.lamdba(2) is set as the default value.result.epsilon the threshold in finetuning. Parameter.epsilon(2) is set as the default value.

result.threshold the threshold for subsystem clustering.final result.idx the index of each subsystem.final result.sys the model of each subsystem.

final result.allerror the error which compared with the true output.

Table S37. The introduction of function ihydelogic.

Function ihydelogic Descriptionpara log.Phi2 constructed dictionary matrix Y for inferring transition logic of each subsystem.

para log.idx sys the index of each subsystem.para log.beta the tradeoff parameter in the `1 sparse logistic regression.

para log.y the output data.para log.normalize set to 1 if Phi2 need to be normalized.

79

Page 80: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

To quickly familiarize with IHYDE, examples are in the directory /IHYDE/examples. These.m files can also be used as templates for other experiments. We shall use the autonomous carexample to explain the code. First, we load the data.

clear all ,close all ,clc

addpath(’./ tools’);

addpath(’./data’);

basis_function.work=’off’;

data=load(’normal_car.mat’); %% load Data

index = 1000:1400;

flag = data.flag(index); % 1: straghtway 0:curve

dy = data.dy(index);

v = data.v(index)/10;

memory = 4; %The data covers the time from k-4 to k

polyorder = 4; % The highest order of the polynomial is 4

order

usesine = 0;%no sin and cos

A= library(v,polyorder ,usesine ,memory ,basis_function);%

make a library

A = A(memory +2:end ,:);

dy = dy(memory +2: end); %dpwm_{k}

flag = flag(memory +2: end); %flag_{k}

v_k1 = v(memory +1:end -1,:); % v_{k-1}

v_k2 = v(memory:end -2,:);% v_{k-2}

v_k3 = v(memory -1:end -3,:);% v_{k-3}

v = v(memory +2:end ,:); % v_{k}

Then, we initialize the parameters and identify the systems by function ihyde.

parameter.MAXITER = 5; %the iter for the sparsesolver

algorithm

parameter.max_s = 20; % the max number of

subsystems

parameter.epsilon = [100 8]; % the fist element in

lambda is epsilon_z and the second is epsilon_w

parameter.Phi = A; % the library

parameter.y = dy; %dpwm

parameter.normalize_y = 1; % normalize :1 unnormalize

:0

80

Page 81: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

[result ]= ihyde(parameter); % inferring subsystems

Function ihyde will return a preliminary identified result which contains the details of subsys-tems. Since we want to get a better result based on the minimum error principle, we use functionfinetuning to fine-tune the results.

result.epsilon = parameter.epsilon (2);% use epsilon_w as

the epsilon in finetuning

result.lambda = parameter.lambda (2);% use lambda_w as the

epsilon in finetuning

result.threshold = [0.05]; %set a threshold for

clustering

final_result = finetuning(result); % finetuning each

subsystems

sys = final_result.sys; % get the identified subsystems

idx_sys = final_result.idx;% get the index of each

subsystems

The code for inferring transition logic between subsystems is shown below.

Phi2 = [ones(size(flag)) flag 1./v sin(v) cos(v) v.^2

v_k1./v_k2 v_k3 .^2 ];% library for inferring

transition logic between subsystems.

para_log.idx_sys = idx_sys;

para_log.beta= 0.5; % the tradeoff of l1-sparse

logistic regression

para_log.y = dy;

para_log.Phi2 = Phi2;

[syslogic ,labelMat ,data] = ihydelogic(para_log);

The identified results are saved in sys, idx sys and syslogic.

References

[34] A. Visintin, Differential Models of Hysteresis. (Springer, 1994).

[35] M. Grant, S. Boyd, CVX: Matlab software for disciplined convex programming, version 2.0beta. http://cvxr.com/cvx, (2013).

81

Page 82: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

[36] E. Candes, J. Romberg, T. Tao, Robust uncertainty principles: Exact signal reconstructionfrom highly incomplete frequency information. IEEE Trans. on Information Theory, 52(2),489-509 (2006).

[37] D. L. Donoho, M. Elad, Optimally sparse representation in general (nonorthogonal) dictio-naries via l1minimization. Proceedings of the National Academy of Sciences, 100(5), 2197,(2003).

[38] T. Hastie, R. Tibshirani, J. Friedman, T. Hastie, J. Friedman, R. Tibshirani, The elements of

statistical learning, (Springer, 2009).

[39] K.P. Murphy, Machine learning: A probabilistic perspective (MIT Press, 2012).

[40] J. Palmer, D. Wipf, K. Kreutz-Delgado, B. Rao, Variational EM algorithms for non-Gaussian latent variable models, Advances in neural information processing systems, 18,1059 (2006).

[41] C. Bishop, Pattern Recognition and Machine Learning. (Springer New York, 2006).

[42] M. Wainwright, M. Jordan, Graphical models, exponential families, and variational infer-ence, Foundations and Trends in Machine Learning, 1, no. 1-2, 1-305 (2008).

[43] M. Tipping, Sparse bayesian learning and the relevance vector machine, The Journal of

Machine Learning Research, 1, 211-244 (2001).

[44] O. Yamashita, M. A. Sato, T. Yoshioka, F. Tong, Y. Kamitani, Sparse estimation automati-cally selects voxels relevant for the decoding of fmri activity patterns, Neuroimage, 42(4),1414-1429 (2008).

[45] B. Reger, K. Fleming, V. Sanguineti, S. Alford, F. Mussa-Ivaldi, Connecting brains torobots: an artificial body for studying the computational properties of neural tissues, Ar-

tificial Life, 6(4), 307 (2000).

[46] X. Fang, S. Misra, G. Xue, D. Yang, Smart grid the new and improved power grid: A survey.IEEE Communications Surveys & Tutorials, 14(4), 944-980 (2012).

[47] R. Jabr, Minimum loss operation of distribution networks with photovoltaic generation. IET

Renewable Power Generation, 8(1), 33-44 (2014).

[48] R. Zimmerman, C. Murillo-Sanchez, R. Thomas, MATPOWER: Steady-State Operations,Planning, and Analysis Tools for Power Systems Research and Education. IEEE Transac-

tions on Power Systems, 26(1), 12-19 (2011).

82

Page 83: Data-driven Discovery of Cyber-Physical SystemsCyber-physical systems (CPSs) embed software into the physical world. They They appear in a wide range of applications such as smart

[49] Z. Hameed, T. Hong, Y. Cho, Condition monitoring and fault detection of wind turbines andrelated algorithms. Renewable & Sustainable Energy Reviews, 13(1), 1-39 (2009).

83