

422 Journal of Chemical Engineering of Japan Copyright © 2017 The Society of Chemical Engineers, Japan

Journal of Chemical Engineering of Japan, Vol. 50, No. 6, pp. 422–429, 2017

Improvement of Process State Recognition Performance by Noise Reduction with Smoothing Methods

Hiromasa Kaneko and Kimito Funatsu
Department of Chemical System Engineering, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan

Keywords: Process Control, Fault Detection, Noise Reduction, Smoothing, Hyperparameter

Multivariate statistical process control (MSPC) is important for monitoring multiple process variables and their relationships while controlling chemical and industrial plants efficiently and stably. Although many MSPC methods have been developed to improve the accuracy of fault detection, noise in the operating data, such as measurement noise and sensor noise, conceals important variations in process variables. This noise makes it difficult to recognize process states, but has not been fully considered in traditional MSPC methods. In this study, to improve the process state recognition performance, we apply several smoothing methods to each process variable. The best smoothing method and its hyperparameters are selected based on the normal distribution and variation of the reduced noise. Through case studies using numerical data and dynamic simulation data from a virtual plant, it is confirmed that the fault detection and identification accuracy are improved using the proposed method, which leads to enhanced state recognition performance.

Introduction

To operate industrial and chemical plants in a safe and stable manner, it is important to monitor and control the operating conditions. Because of the abundant amount of operating data in plants, data-based process control systems have received significant attention in recent years. Attempting to control each process variable independently is inefficient, as the number of process variables is enormous and there exist correlations between process variables. One practical solution is multivariate statistical process control (MSPC) (Kourti, 2005), which monitors multiple process variables and their relationships simultaneously.

A major technique in MSPC is principal component analysis (PCA)-based process control (Ku et al., 1995). A process fault is detected when one of the T2 or Q statistics based on PCA exceeds some threshold. Partial least squares (PLS) (Chiang et al., 2000) and independent component analysis (Kaneko et al., 2009) have also been applied to MSPC. When there is a nonlinear relationship between process variables and the data distribution is multimodal, nonlinear MSPC methods such as kernel PCA (Lee et al., 2004), kernel PLS (Godoy et al., 2014), generative topographic mapping (Escobar et al., 2015), and the one-class support vector machine (OCSVM) (Mahadevan and Shah, 2009) are more effective than linear approaches.

Although many MSPC methods have been developed to improve the fault detection accuracy, noise in the operating data (such as measurement noise and sensor noise) conceals important variations in process variables. This makes it difficult to recognize process states, detect and isolate faults, and diagnose faulty process variables. When there is some variation in a process variable, it is important to judge whether this is due to a process change and disturbance or to process and sensor noise. However, in actual plants, noisy process variables make this very difficult, decreasing the process state recognition performance significantly. Thus, it is vital to handle noise appropriately in state recognition, fault detection, fault isolation, and fault diagnosis.

Chemometric methods such as PCA and OCSVM can handle noise. However, these methods do not consider the characteristics of the operating data or time-series data. For example, data measured closely in time have strong relationships and correlations. In soft sensor analysis, smoothing methods such as the simple moving average (SMA), linearly weighted moving average (LMA), exponentially weighted moving average (EMA), and Savitzky–Golay filtering (SGF) can be used to separate important variations in process variables from noise (Kaneko and Funatsu, 2015b). This enables the noise in the predicted values to be reduced, which enhances the predictive ability of soft sensors and improves the control performance.

We aim to improve the process state recognition performance using smoothing methods. It may seem easy to apply smoothing methods to process variables. However, there are many smoothing methods (such as SMA, LMA, EMA, and SGF), and the hyperparameters for each method must be optimized before smoothing the process variables. The appropriate smoothing method and its hyperparameters depend on the process variables, which makes it difficult to determine the optimal design. If an inappropriate smoothing method and hyperparameters are used for the process variables, the noise may not be reduced and important

Received on November 8, 2016; accepted on April 13, 2017
DOI: 10.1252/jcej.16we325
Partly presented at the 7th International Symposium on Design, Operation and Control of Chemical Processes (PSE Asia), Tokyo, July 2016
Correspondence concerning this article should be addressed to K. Funatsu (E-mail address: [email protected]).

Research Paper

This article is published under a CC BY-NC-ND.


variations could be incorrectly neglected.

In this study, SMA, LMA, EMA, and SGF are applied to each process variable, and the best method and its hyperparameters are selected based on the normal distribution and variation of the reduced noise. After smoothing the process variables, a fault detection model is constructed. In online fault detection, the values of the process variables are smoothed online before being input into fault detection and faulty variable diagnosis models.

Through case studies with numerical data and dynamic simulation data from a virtual plant, the fault detection and identification accuracy of the proposed method are demonstrated, and the state recognition performance is evaluated.

1. Method

In spectral analysis, SMA, LMA, EMA, and SGF techniques are typically applied to each spectrum, i.e., each sample. In the present study, however, the smoothing methods are not applied to each sample, but to the time-series data of each process variable. Because each process variable forms a temporally continuous time series that can be treated in the same way as a spectrum, SMA, LMA, EMA, and SGF reduce the noise in each process variable just as they do in spectra.

The process variable data are replaced by data that have been smoothed by SMA, LMA, EMA, and SGF. In constructing a fault detection and identification model, the true relationships between process variables can be modeled using the noise-free dataset. In the prediction of faults and faulty variables, the data are smoothed every time the process variables are measured.

1.1 Moving averages in process monitoring

We introduce three types of moving average. Equations (1)–(3) compute the SMA, LMA, and EMA of recent values of the i-th process variable at target time t.

x_{\mathrm{smoothed},i}^{(t)} = \frac{1}{W} \sum_{j=t-(W-1)}^{t} x_i^{(j)}   (1)

x_{\mathrm{smoothed},i}^{(t)} = \frac{\sum_{j=t-(W-1)}^{t} \{W-(t-j)\}\, x_i^{(j)}}{\sum_{j=1}^{W} j}   (2)

x_{\mathrm{smoothed},i}^{(t)} = \alpha \left\{ x_i^{(t)} + (1-\alpha)\, x_i^{(t-1)} + (1-\alpha)^2\, x_i^{(t-2)} + \cdots \right\}   (3)

Here, x_{\mathrm{smoothed},i}^{(t)} denotes the smoothed value of the i-th process variable at t, x_i^{(t-j)} denotes the raw value of the i-th process variable at t−j, W is the window size, and α is a constant smoothing factor between 0 and 1. The method of determining W and α is explained in Section 1.3.

In online process state recognition, SMA, LMA, and EMA are applied to each process variable every time the process variable is measured, and the smoothed data are input to a process monitoring model such as a soft sensor model, fault detection model, fault identification model, or fault diagnosis model.
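As a concrete illustration, the three moving averages above can be sketched as causal filters over the most recent W points. The function names are ours, and the recursive form used for the EMA is equivalent to the infinite series of Eq. (3) up to the choice of initial value:

```python
import numpy as np

def sma(x, W):
    """Causal simple moving average over the last W points (Eq. (1))."""
    return np.array([x[max(0, t - W + 1):t + 1].mean() for t in range(len(x))])

def lma(x, W):
    """Causal linearly weighted moving average (Eq. (2)):
    the newest point in the window gets weight W, the oldest weight 1."""
    out = np.empty(len(x))
    for t in range(len(x)):
        seg = x[max(0, t - W + 1):t + 1]
        w = np.arange(1, len(seg) + 1)        # oldest -> newest weights
        out[t] = (w * seg).sum() / w.sum()
    return out

def ema(x, alpha):
    """Exponentially weighted moving average (Eq. (3)), recursive form."""
    out = np.empty(len(x))
    out[0] = x[0]
    for t in range(1, len(x)):
        out[t] = alpha * x[t] + (1 - alpha) * out[t - 1]
    return out
```

Each filter uses only values at t or earlier, so it can be applied online every time a new measurement arrives.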

1.2 Savitzky–Golay filtering in process monitoring

The SGF method, which is mainly used in spectral analysis, performs smoothing (noise reduction) and numerical differentiation simultaneously. Noisy spectra can be effectively smoothed while data characteristics such as peaks are retained. In soft sensor analysis, SGF can be used as a smoothing method. Li et al. (2012) applied SGF to time-series data while assuming that (2N+1) points centered on the target point i, i.e., time, could be used for filtering. However, SGF cannot be used for online monitoring under their assumption, because future data, i.e., from point (i+1) to (i+N), cannot be prepared beforehand.

It is assumed that the i-th process variable x_i has a value of x_i^{(t)} at time t and the measurement time interval is a constant k (for example, 1 min). In SGF, x is approximated by a high-order expression in terms of time t, which leads to a smooth curve. This high-order expression is then numerically differentiated with respect to t. In spectral analysis, the (2N+1) points centered on the target point (t, x_i^{(t)}), i.e., (t−N, x_i^{(t−N)}), (t−N+1, x_i^{(t−N+1)}), …, (t, x_i^{(t)}), …, (t+N−1, x_i^{(t+N−1)}), (t+N, x_i^{(t+N)}), are used to approximate the smoothing function. However, only the (2N+1) points at t or earlier, i.e., (t−2N, x_i^{(t−2N)}), (t−2N+1, x_i^{(t−2N+1)}), …, (t−1, x_i^{(t−1)}), (t, x_i^{(t)}), can be used in online monitoring.

When we calculate the smoothed value x_{\mathrm{smoothed},i}^{(t)}, the M-th order function of t is given by Eq. (4).

f(t) = \sum_{j=1}^{M} b_j t^j   (4)

Here, b_j is estimated by the method of least squares using the (2N+1) points (t−2N, x_i^{(t−2N)}), (t−2N+1, x_i^{(t−2N+1)}), …, (t−1, x_i^{(t−1)}), (t, x_i^{(t)}). In other words, multiple regression relations are computed between the objective variable x_i and the explanatory variables t, t^2, …, t^M. The smoothed value at t is given by inputting t into f(t) in Eq. (4). The coefficients b_j can be calculated more efficiently than by the direct method of least squares; for details, see Yoshimura and Takayanagi (2012). The optimization program for SGF can be downloaded from https://jp.mathworks.com/matlabcentral/fileexchange/4038-savitzky-golay-smoothing-and-differentiation-filter/content/sgsdf.m. The method of determining the hyperparameters M and N is described in Section 1.3.

In monitoring the process variables online, SGF is used for each process variable every time a process variable is measured. The smoothed data of the process variables are input into a process monitoring model such as a soft sensor model, fault detection model, fault identification model, or fault diagnosis model.
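A minimal sketch of this one-sided SGF follows, using an ordinary polynomial least-squares fit on the (2N+1) most recent points; `np.polyfit` stands in here for the more efficient calculation of b_j referenced above, and the function name is ours:

```python
import numpy as np

def causal_sgf(x, N, M):
    """One-sided Savitzky-Golay smoothing: fit an M-th order polynomial
    to the (2N+1) most recent points (times t-2N, ..., t) by least
    squares and evaluate it at t. The first 2N points are left as-is."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    win = 2 * N + 1
    for t in range(win - 1, len(x)):
        ts = np.arange(t - win + 1, t + 1, dtype=float)
        coef = np.polyfit(ts, x[t - win + 1:t + 1], M)
        out[t] = np.polyval(coef, float(t))
    return out
```

Because the fitted polynomial is exact for signals of order up to M, a noise-free polynomial trend passes through the filter unchanged, while high-frequency noise is attenuated.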

1.3 Determination of the best smoothing method and its hyperparameters

Kaneko and Funatsu (2015a) determined the hyperparameter values for soft sensor analysis through a trial and error process. The optimal smoothing method can also be determined through the same process. However, this depends on the data analysis. In addition, training data including the objective variable are required to optimize the smoothing methods and hyperparameters. In fault detection and diagnosis problems, for example, there are no training data or objective variables, as the normal state data domain is defined in the fault detection and diagnosis model using only normal data, and faults and faulty process variables are detected when new samples are far from these normal data.

Therefore, in the present study, we automatically determine the best smoothing method and its hyperparameters based on the normal distribution of the local reduced noise and the variance of the global reduced noise. Figure 1 shows the procedure of the proposed method, which is computed for each process variable. For a given dataset, the window size p is set and the normal distribution of the reduced noise is checked for all combinations of smoothing methods and hyperparameters using p continuous data points. We use the Kolmogorov–Smirnov test with a significance level of 5%. This is repeated q times with randomly selected start points. The smoothing methods and hyperparameters for which the reduced noise is normally distributed all q times are then saved. Among these, the combination with the highest variance of the reduced noise over the whole dataset is selected. The proposed method thus ensures both a normal distribution of the local reduced noise and global noise reduction by smoothing the process variables.

It is difficult to mathematically justify that process noise has a normal distribution; however, this assumption is reasonable, as previous studies (Patra and Kot, 2002; Atkinson et al., 2012) produced good results by adopting it. In addition, the Kalman filter (Gordon et al., 1993; Vo and Ma, 2006) also assumes that process noise and observation noise follow normal distributions.
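The selection procedure above can be sketched as follows. Function names are ours, and two simplifications should be noted: the Kolmogorov–Smirnov statistic is computed after standardizing each window with its own mean and standard deviation (strictly a Lilliefors-type test), and 1.36/√p is the usual large-sample approximation to the 5% critical value:

```python
import numpy as np
from math import erf, sqrt

def ks_stat_normal(z):
    """One-sample Kolmogorov-Smirnov statistic of standardized data
    against the standard normal distribution."""
    z = np.sort(np.asarray(z, dtype=float))
    n = len(z)
    cdf = 0.5 * (1.0 + np.array([erf(v / sqrt(2.0)) for v in z]))
    return max((cdf - np.arange(n) / n).max(),
               (np.arange(1, n + 1) / n - cdf).max())

def select_smoother(x, candidates, p=50, q=20, seed=0):
    """For each (name, smooth_fn) pair, check that the reduced noise
    (raw minus smoothed) looks normal in q randomly placed windows of
    length p; among the pairs that pass every window, return the one
    whose reduced noise has the largest variance over the whole series."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    starts = rng.integers(0, len(x) - p, size=q)
    crit = 1.36 / sqrt(p)   # approximate 5% KS critical value
    best_name, best_var = None, -np.inf
    for name, fn in candidates:
        noise = x - fn(x)
        ok = all(
            ks_stat_normal((noise[s:s + p] - noise[s:s + p].mean())
                           / noise[s:s + p].std()) < crit
            for s in starts
        )
        if ok and noise.var() > best_var:
            best_name, best_var = name, noise.var()
    return best_name
```

Each candidate pairs one smoothing method with one hyperparameter setting (for example, SMA with W = 10), so the full grid of Table 1 is covered by enumerating the candidate list.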

2. Results and Discussion

To verify the effectiveness of the proposed method, we analyzed numerical simulation data and Tennessee Eastman Process (TEP) data. The simulation dataset includes noise. We used PCA and OCSVM (Mahadevan and Shah, 2009) as fault detection methods. For PCA, process faults are detected when one of the T2 or Q statistics exceeds some threshold. The threshold was set so that 99.7% of the data were classified as being normal (i.e., the 3σ rule). The number of principal components was determined so that the cumulative contribution ratio was above 0.99. For OCSVM, we used OCSVMind (which gives a negative value of the OCSVM output) as an index for fault detection and used the Gaussian kernel function (see Appendix A). In this study, LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) was used to construct the OCSVM model. Table 1 lists the candidates for W, α, N, and M in the proposed method.

We set p to 50 and q to 20 in the smoothing method and hyperparameter decision procedure.
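The PCA-based monitoring statistics used here can be sketched as follows (a simplified autoscaled implementation via SVD; the function name is ours):

```python
import numpy as np

def pca_monitor(X_train, X_test, cum_contrib=0.99):
    """Fit PCA on (autoscaled) normal data, keeping enough principal
    components to exceed the given cumulative contribution ratio, and
    return the T^2 and Q (squared prediction error) statistics for X_test."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    Z = (X_train - mu) / sd
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    var = S**2 / (len(Z) - 1)                    # PC variances
    ratio = np.cumsum(var) / var.sum()
    k = int(np.searchsorted(ratio, cum_contrib)) + 1
    P = Vt[:k].T                                 # loading matrix
    Zt = (X_test - mu) / sd
    T = Zt @ P                                   # scores
    T2 = np.sum(T**2 / var[:k], axis=1)
    Q = np.sum((Zt - T @ P.T)**2, axis=1)        # residual statistic
    return T2, Q
```

Following the 3σ rule above, thresholds can then be set so that 99.7% of the normal training data fall below them, e.g. `np.percentile(T2_train, 99.7)`.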

2.1 Numerical simulation data including noise

We used the two-input/two-output system described by Ku et al. (1995), which can be written as Eqs. (5) and (6).

x(t) = \begin{bmatrix} 0.118 & -0.191 \\ 0.847 & 0.264 \end{bmatrix} x(t-1) + \begin{bmatrix} 1.0 & 2.0 \\ 3.0 & -4.0 \end{bmatrix} u(t-1)   (5)

y(t) = x(t) + v(t)   (6)

In this system, v is generated from a Gaussian distribution with zero mean and variance 0.1. Values of u are generated as follows.

u(t) = \begin{bmatrix} 0.811 & -0.266 \\ 0.477 & 0.415 \end{bmatrix} u(t-1) + \begin{bmatrix} 0.193 & 0.689 \\ -0.320 & -0.749 \end{bmatrix} w(t-1)   (7)

Here, w is generated from an uncorrelated Gaussian distribution with zero mean and unit variance. u and y are used as process variables. The first 1000 data points are the normal data and the next 1000 data points are the test data. In the test data, the mean of w changes from 0 to 1 at t=501 to represent an abnormal situation.
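A sketch of this simulator follows. The sign pattern of the matrices is reconstructed from Ku et al. (1995) (the extracted text lost the minus signs), and the seeding and zero initialization are our own choices:

```python
import numpy as np

# State-space matrices of Eqs. (5)-(7); signs reconstructed from Ku et al. (1995)
A = np.array([[0.118, -0.191], [0.847, 0.264]])
B = np.array([[1.0, 2.0], [3.0, -4.0]])
C = np.array([[0.811, -0.266], [0.477, 0.415]])
D = np.array([[0.193, 0.689], [-0.320, -0.749]])

def simulate(n, w_mean=0.0, seed=0):
    """Simulate n steps of the two-input/two-output system; returns the
    measured outputs y and the inputs u. Shifting w_mean from 0 to 1
    reproduces the abnormal situation described above."""
    rng = np.random.default_rng(seed)
    x, u = np.zeros(2), np.zeros(2)
    Y, U = np.empty((n, 2)), np.empty((n, 2))
    for t in range(n):
        w = rng.normal(w_mean, 1.0, size=2)               # disturbance in Eq. (7)
        x = A @ x + B @ u                                 # Eq. (5), uses u(t-1)
        u = C @ u + D @ w                                 # Eq. (7)
        Y[t] = x + rng.normal(0.0, np.sqrt(0.1), size=2)  # Eq. (6), variance 0.1
        U[t] = u
    return Y, U
```

Concatenating `simulate(1000, w_mean=0.0)` with a run at `w_mean=1.0` reproduces the normal/faulty split used in this case study.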

Fig. 1 Procedure of the proposed method

Table 1 Candidates for W, α, 2N+1, and M

W 5, 10, …, 195, 200
α 0.005, 0.01, …, 0.495, 0.5
N 5, 10, …, 195, 200
M 1, 2, 3, 4


In this case, Q is not used because the number of X-variables and the number of principal components are both two. The fault detection performance using the test data is presented in Table 2 for the traditional PCA, the proposed PCA with smoothing methods, the traditional OCSVM, and the proposed OCSVM with smoothing methods. The accuracy rate (AR), detection rate (DR), and precision (PR) are defined as follows.

\mathrm{AR} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{TN} + \mathrm{FN}}   (8)

Fig. 2 Time plots of T2 and OCSVMind in numerical simulation data analysis; blue lines denote T2 and OCSVMind values and red lines denote the thresholds

Table 3 Fault detection performance for process fault 6 in TEP data analysis

Traditional PCA Proposed PCA Traditional OCSVM Proposed OCSVM

Accuracy rate [%] 99.8 99.9 99.9 99.8
Detection rate [%] 100 100 99.9 99.8
Precision [%] 99.8 99.9 100 100
ARL 0 0 0 0

Table 4 Fault detection performance for process fault 7 in TEP data analysis

Traditional PCA Proposed PCA Traditional OCSVM Proposed OCSVM

Accuracy rate [%] 99.7 98.9 100 99.9
Detection rate [%] 100 100 100 99.9
Precision [%] 99.6 98.6 100 100
ARL 0 0 0 1

Table 2 Fault detection performance in numerical simulation data analysis

Traditional PCA Proposed PCA Traditional OCSVM Proposed OCSVM

Accuracy rate [%] 63.1 98.5 72.2 99.2
Detection rate [%] 26.6 97.0 45.4 99.4
Precision [%] 98.5 100.0 97.8 99.0
ARL 14.2 9.4 10.1 6.2


Table 5 Fault detection performance for process fault 9 in TEP data analysis

Traditional PCA Proposed PCA Traditional OCSVM Proposed OCSVM

Accuracy rate [%] 19.1 75.7 17.5 75.6
Detection rate [%] 3.1 78.4 1.9 80.6
Precision [%] 92.6 91.3 68.2 89.1
ARL 13 0 4 0

Table 6 Fault detection performance for process fault 10 in TEP data analysis

Traditional PCA Proposed PCA Traditional OCSVM Proposed OCSVM

Accuracy rate [%] 52.2 91.9 54.0 81.6
Detection rate [%] 43.1 92.1 44.8 82.3
Precision [%] 98.9 98.0 100 94.9
ARL 35 15 24 5

Fig. 3 Time plots of T2, Q, and OCSVMind for process fault 6 in TEP data analysis; blue lines denote T2, Q, and OCSVMind values and red lines de-note the thresholds


\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}   (9)

\mathrm{PR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}   (10)

Here, TP denotes the number of true positives, i.e., the number of samples for which the model detects the fault correctly; TN represents the number of true negatives, i.e., samples that are normal and for which the model does not detect a fault; FP denotes the number of false positives, i.e., samples for which the model detects a fault incorrectly; and FN represents the number of false negatives, i.e., samples that are actually abnormal but are not detected as faulty. The average run length (ARL) is the time required to detect the fault. Using the proposed PCA and OCSVM with smoothing, AR, DR, and PR improve compared with the traditional PCA and OCSVM. The proposed OCSVM achieved close to 100% accuracy, detection, and precision. The proposed method can smooth the process variables appropriately, which leads to improved fault detection performance.
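The detection metrics of Eqs. (8)–(10) can be computed directly from boolean detection results and ground-truth labels (the function name is ours):

```python
import numpy as np

def detection_metrics(detected, faulty):
    """Accuracy rate, detection rate, and precision (Eqs. (8)-(10))
    from boolean arrays: detected[i] = the model flags sample i as
    faulty; faulty[i] = sample i is actually faulty."""
    d = np.asarray(detected, dtype=bool)
    f = np.asarray(faulty, dtype=bool)
    TP = np.sum(d & f)
    TN = np.sum(~d & ~f)
    FP = np.sum(d & ~f)
    FN = np.sum(~d & f)
    AR = 100.0 * (TP + TN) / (TP + FP + TN + FN)
    DR = 100.0 * TP / (TP + FN)
    PR = 100.0 * TP / (TP + FP)
    return AR, DR, PR
```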

Figure 2 shows time plots of T2 and OCSVMind for the traditional model and the proposed model using the test data (Q does not exist because the number of X-variables and the number of principal components are both two). Although the traditional T2 and OCSVMind could not detect the fault correctly and stably, their values increased sharply at t=500. The proposed method detected the fault from t=500. In addition, the T2 and OCSVMind values given by the proposed method were very smooth compared with those of the traditional method. This confirms that our proposed method can successfully smooth the process variables in process state recognition.

Fig. 4 Time plots of T2, Q, and OCSVMind for process fault 9 in TEP data analysis; blue lines denote T2, Q, and OCSVMind values and red lines de-note the thresholds


2.2 Tennessee Eastman Process data

The TEP was developed by the Eastman Chemical Company to simulate a real industrial process, and has been used to compare the performance of process control and monitoring methods; a TEP diagram has been published by Downs and Vogel (1993). TEP consists of a reactor, condenser, compressor, separator, and stripper, as well as eight components, labeled A–H. The liquid products G and H and the by-product F are produced from the gaseous reactants A, C, D, and E through chemical reactions. Fifty-two variables, including manipulated variables, easy-to-measure process variables, and composition measurements, are included in the TEP data.

There are 21 preprogrammed faults in TEP, including five unknown faults. In this study, we targeted the sixth, seventh, ninth, and tenth process faults. The sixth and seventh process faults are easily detected by previous approaches, whereas the ninth and tenth process faults are generally difficult to detect. The training and test data were obtained from Russell et al. (2000). Training data for the normal condition consisted of 500 observations (1500 min), and the test data consisted of 960 observations (2880 min), including 160 normal data (480 min), for each process fault.

Tables 3–6 present the fault detection results using the test data and the traditional and proposed PCA and OCSVM methods for the sixth, seventh, ninth, and tenth process faults, respectively. For the sixth and seventh process faults, the fault detection performance of the traditional PCA and OCSVM is almost the same as that of the proposed PCA and OCSVM with smoothing. All AR, DR, and PR scores were close to 100% and the ARL was negligible in all cases. For the ninth process fault, although AR, DR, and PR of the proposed PCA and OCSVM with smoothing are not close to 100%, they are considerably better than those given by the traditional PCA and OCSVM, especially DR. For the tenth process fault, although PR decreased slightly, AR and DR were increased by using the proposed PCA and OCSVM. By successfully smoothing the process variables, the proposed method detected the faults more accurately.

Figures 3 and 4 show time plots of T2, Q, and OCSVMind given by the traditional and proposed methods for the sixth and ninth process faults, respectively. The scales of the y-axes have no absolute meaning, because the PCA and OCSVM models are derived from different datasets in the traditional and proposed methods, that is, raw data in the traditional method and smoothed data in the proposed method. T2, Q, and OCSVMind in the proposed method are therefore completely different from those in the traditional method, and it is difficult to compare the range of variation between the traditional and proposed plots in Figure 4. For the sixth process fault, the indexes exceed the thresholds sharply at t=161 (483 min), which means that the fault could be detected appropriately in all cases. However, for the ninth fault, the T2, Q, and OCSVMind values given by the traditional method rarely exceed the thresholds, which means that fault detection failed almost completely. Using the proposed smoothing method, T2, Q, and OCSVMind are clearly smoother. Additionally, the proposed fault detection models, especially Q and OCSVMind, detected the fault when it actually occurred. Although the actual fault started at t=161 (483 min), it was detected before this point, which suggests that the indexes with smoothing are too sensitive to small changes in process states. However, the proposed method can detect faults that are difficult to detect using traditional methods.

Conclusion

In the present study, we applied smoothing methods to process state recognition, fault detection, and fault identification. The optimal smoothing method and its hyperparameters are selected based on the normal distribution of the local reduced noise and the variance of the global reduced noise. The proposed method offers appropriate smoothing and noise reduction, and improves the fault detection performance of PCA and OCSVM for both numerical simulation data and TEP data. However, the fault was detected too early in TEP, and the process variables after smoothing are very sensitive to small process changes. This issue will be considered in future work. We believe that industrial and chemical plants can be operated stably and safely by applying our proposed method to process control.

Acknowledgement

The authors acknowledge the support of the Mizuho Foundation for the Promotion of Sciences and the Core Research for Evolutionary Science and Technology (CREST) project "Development of a knowledge-generating platform driven by big data in drug discovery through production processes" of the Japan Science and Technology Agency (JST).

Nomenclature

AR = accuracy rate [%]
b = constant [—]
DR = detection rate [%]
EMA = exponentially weighted moving average [—]
K = kernel function [—]
LMA = linearly weighted moving average [—]
M = degree of a polynomial [—]
MSPC = multivariate statistical process control [—]
m = number of process variables [—]
N = one side of window [—]
n = number of normal data [—]
OCSVM = one-class support vector machine [—]
PCA = principal component analysis [—]
PLS = partial least squares [—]
PR = precision [%]
p = number of continuous data points used in determining the best smoothing method and its hyperparameters [—]
q = number of times p continuous data points are selected in determining the best smoothing method and its hyperparameters [—]
SGF = Savitzky–Golay filtering [—]
SMA = simple moving average [—]
TEP = Tennessee Eastman process [—]
t = time [—]
u = input variable [—]
v = random number [—]
W = window size [—]
w = random number that changes when a fault occurs [—]
x = process variable [—]
y = output variable [—]

α = smoothing factor [—]
ξ = slack variable [—]
γ = hyperparameter [—]

Literature Cited

Atkinson, P. M., C. Jeganathan, J. Dash and C. Atzberger; "Inter-Comparison of Four Models for Smoothing Satellite Sensor Time-Series Data to Estimate Vegetation Phenology," Remote Sens. Environ., 123, 400–417 (2012)

Chiang, L. H., E. L. Russell and R. D. Braatz; "Fault Diagnosis in Chemical Processes Using Fisher Discriminant Analysis, Discriminant Partial Least Squares, and Principal Component Analysis," Chemom. Intell. Lab. Syst., 50, 243–252 (2000)

Downs, J. J. and E. F. Vogel; “A Plant-Wide Industrial Process Control Problem,” Comput. Chem. Eng., 17, 245–255 (1993)

Escobar, M. S., H. Kaneko and K. Funatsu; "Combined Generative Topographic Mapping and Graph Theory Unsupervised Approach for Non-linear Fault Identification," AIChE J., 61, 1559–1571 (2015)

Godoy, J. L., D. A. Zumoffen, J. R. Vega and J. L. Marchetti; "New Contributions to Non-linear Process Monitoring through Kernel Partial Least Squares," Chemom. Intell. Lab. Syst., 135, 76–89 (2014)

Gordon, N. J., D. J. Salmond and A. F. M. Smith; "Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation," IEE Proc. F, 140, 107–113 (1993) DOI: 10.1049/ip-f-2.1993.0015

Kaneko, H., M. Arakawa and K. Funatsu; "Development of a New Soft Sensor Method Using Independent Component Analysis and Partial Least Squares," AIChE J., 55, 87–98 (2009)

Kaneko, H. and K. Funatsu; “Fast Optimization of Hyperparameters for Support Vector Regression Models with Highly Predictive Ability,” Chemom. Intell. Lab. Syst., 142, 64–69 (2015a)

Kaneko, H. and K. Funatsu; “Smoothing-combined Soft Sensors for Noise Reduction and Improvement of Predictive Ability,” Ind. Eng. Chem. Res., 54, 12630–12638 (2015b)

Kourti, T.; “Application of Latent Variable Methods to Process Control and Multivariate Statistical Process Control in Industry,” Int. J. Adapt. Control Signal Process., 19, 213–246 (2005)

Ku, W., R. H. Storer and C. Georgakis; “Disturbance Detection and Isolation by Dynamic Principal Component Analysis,” Chemom. Intell. Lab. Syst., 30, 179–196 (1995)

Lee, J. M., C. K. Yoo, S. W. Choi, P. A. Vanrolleghem and I. B. Lee; "Nonlinear Process Monitoring Using Kernel Principal Component Analysis," Chem. Eng. Sci., 59, 223–234 (2004)

Li, Q., Q. Du, W. Ba and C. Shao; “Multiple-Input Multiple-Output Soft Sensors Based on KPCA and MKLS-SVM for Quality Prediction in Atmospheric Distillation Column,” Int. J. Innov. Comput., Inf. Control, 8, 8215–8230 (2012)

Mahadevan, S. and S. L. Shah; "Fault Detection and Diagnosis in Process Data Using One-class Support Vector Machines," J. Process Contr., 19, 1627–1639 (2009)

Patra, J. C. and A. C. Kot; "Nonlinear Dynamic System Identification Using Chebyshev Functional Link Artificial Neural Networks," IEEE Trans. Syst. Man Cybern. B Cybern., 32, 505–511 (2002)

Russell, E. L., L. H. Chiang and R. D. Braatz; "Fault Detection in Industrial Processes Using Canonical Variate Analysis and Dynamic Principal Component Analysis," Chemom. Intell. Lab. Syst., 51, 81–93 (2000)

Vo, B. N. and W. K. Ma; “The Gaussian Mixture Probability Hypothesis Density Filter,” IEEE Trans. Signal Process., 54, 4091–4104 (2006)

Yoshimura, N. and M. Takayanagi; “Chemometrics Calculations with Microsoft Excel (5),” J. Comput. Chem. Jpn., 11, 149–158 (2012)

Appendix A. One-Class Support Vector Machine (OCSVM)

An OCSVM model f is given as follows.

f(\mathbf{x}^{(i)}) = \phi(\mathbf{x}^{(i)})\,\mathbf{w} - b   (A.1)

Here, x(i)∈R1×m (m is the number of process variables) is a sample of normal data, ϕ is a nonlinear function, w is a weight vector, and b is a constant.

In OCSVM modelling, we aim to minimize Eq. (A.2) subject to Eq. (A.3).

\frac{1}{2}\|\mathbf{w}\|^2 + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - b   (A.2)

\phi(\mathbf{x}^{(i)})\,\mathbf{w} \ge b - \xi_i, \quad \xi_i \ge 0   (A.3)

Here, n is the number of normal data. ν∈(0, 1) is interpreted as the fraction of outliers in the normal data, and controls the size of the normal data domain. In the present study, ν is set to 0.003 in reference to the 3σ limit. ξi is a slack variable. Using the method of Lagrange multipliers, the output for a query x∈R^{1×m} is given as follows.

f(\mathbf{x}) = \sum_{i=1}^{n} \alpha_i K(\mathbf{x}, \mathbf{x}^{(i)}) - b   (A.4)

Here, K is the kernel function. We use the Gaussian kernel function, which can be written as follows.

K(\mathbf{x}^{(i)}, \mathbf{x}^{(j)}) = \exp\left(-\gamma \|\mathbf{x}^{(i)} - \mathbf{x}^{(j)}\|^2\right)   (A.5)

Here, γ is a hyperparameter that can be optimized to maximize the variance in K using normal data.

In the present study, we use OCSVMind as an index for fault detection. This is given as follows.

\mathrm{OCSVMind}(\mathbf{x}) = -f(\mathbf{x}) = -\sum_{i=1}^{n} \alpha_i K(\mathbf{x}, \mathbf{x}^{(i)}) + b   (A.6)

High OCSVMind values indicate faulty situations. The threshold of OCSVMind is set as 99.7% of the index using normal data in reference to the 3σ limit.
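Equations (A.5) and (A.6) can be sketched directly. Here α_i and b are treated as given (in practice they come from a trained OCSVM model, e.g. via LIBSVM as in the main text), the γ-selection rule maximizes the variance of the Gram matrix on normal data as described above, and the function names are ours:

```python
import numpy as np

def gaussian_kernel(X1, X2, gamma):
    """Gaussian kernel matrix, Eq. (A.5): K_ij = exp(-gamma * ||x1_i - x2_j||^2)."""
    sq = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq)

def optimize_gamma(X_normal, gammas):
    """Pick the gamma that maximizes the variance of the Gram matrix
    computed on normal data."""
    return max(gammas, key=lambda g: gaussian_kernel(X_normal, X_normal, g).var())

def ocsvm_index(X_query, X_normal, alphas, b, gamma):
    """OCSVMind, Eq. (A.6): -f(x) = -sum_i alpha_i K(x, x_i) + b.
    alphas and b are assumed to come from a trained OCSVM model."""
    return -(gaussian_kernel(X_query, X_normal, gamma) @ alphas) + b
```

A query far from the normal data gives kernel values near zero, so its index approaches b and exceeds that of queries inside the normal data domain, matching the rule that high OCSVMind values indicate faulty situations.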