
ACCEPTED MANUSCRIPT SMCA-13-04-0221 1

Multi-source Data Ensemble Modeling for Clinker Free Lime Content Estimate in Rotary Kiln Sintering Processes

Weitao Li, Dianhui Wang, Senior Member, IEEE, and Tianyou Chai, Fellow, IEEE

Abstract: Clinker free lime (f-CaO) content plays a crucial role in determining the quality of cement. However, the existing methods are mainly based on laboratory analysis and suffer from significant time delays, which makes closed-loop control of f-CaO content impossible. In this paper, a multi-source data ensemble learning-based soft sensor model is developed for online estimation of clinker f-CaO content. To build such a soft sensor model, input flame images, process variables, and the corresponding output f-CaO content data for a rotary cement kiln were collected from the No. 2 rotary kiln at Jiuganghongda Cement Plant, which produces 2,000 t of clinker per day. The raw data were pre-processed to distinguish the flame image regions of interest (ROI) and remove process variable outliers. Three types of flame image ROI features, i.e., color, global configuration, and local configuration features, were then extracted without segmentation. Further, a kernel partial least squares (KPLS) technique was applied to extract the compressed score matrix features from the concatenated flame image features and filtered process variables to avoid high-dimensionality, nonlinearity, and correlation problems. Feed-forward neural networks with random weights were employed as base learners in our proposed ensemble modeling framework, which aims to enhance the model's reliability and prediction performance. A total of 157 flame images, the associated process variable data, and the experimentally measured f-CaO content data were used in our experiments. A comparative study of f-CaO content estimators built with various feature compression techniques and learner models, together with a robustness analysis, was carried out. The results indicate that the proposed multi-source data ensemble soft sensor model performs favorably and has good potential in real-world applications.

Index Terms: f-CaO content, soft sensor, multi-source data, ensemble modeling, neural networks with random weights.

I. INTRODUCTION

THE rotary kiln, as a large-scale heat exchange facility, is widely used in the metallurgical, cement, chemical, and environmental protection industries. A major issue in the rotary kiln sintering process is the online index measurement for the system output, i.e., clinker quality. Unfortunately, there is no analyzer instrument available so far for real-time sensing of clinker quality owing to the special structure of the kiln. Some relevant work utilizing either process variables or flame image features has been done based on statistical approaches [1]–[3]. It is well known that clinker quality directly affects the quality of cement. Thus, it is important to effectively and efficiently estimate f-CaO content as feedback to design controllers. Generally, f-CaO content is obtained by offline lab analysis from manual sampling at 1 h intervals. Therefore, significant time delays occur between the real-time control of clinker quality and the availability of the f-CaO content feedback signals. This makes f-CaO content-based closed-loop control impossible. So far, open-loop control schemes are implemented by observing burning zone flame images and process variables. Operators estimate the current burning state via flame images and process variables, and then regulate the manipulated variables to drive the controlled variables into preset ranges so that the f-CaO content can be estimated. Nevertheless, the accuracy of the estimated f-CaO content can be affected by an operator's mental state, work experience, and attitude. Fluctuating f-CaO content will significantly impact the stability of the cement quality, and lab analysis can only be a reference and guide for operators in subsequent operations. Therefore, any method for online estimation will greatly help in reducing the amount of clinker rejections and in implementing closed-loop control strategies.

W. T. Li is with the Department of Electric Engineering and Automation (Hefei University of Technology), Hefei, Anhui Province 230009, China (e-mail: [email protected])

D. H. Wang is with the Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, VIC 3086, Australia; he is also with The State Key Laboratory of Synthetical Automation for Process Industries (Northeastern University), Shenyang, Liaoning Province 110004, China (e-mail: [email protected])

T. Y. Chai is with The State Key Laboratory of Synthetical Automation for Process Industries (Northeastern University), Shenyang, Liaoning Province 110004, China (e-mail: [email protected])

With the development of computational intelligence techniques, soft sensor techniques have received considerable attention in process industries. The soft sensor, as a signal reconstruction modeling approach, is used to estimate hard-to-measure process variables from easy-to-measure process variables that are available online. Moreover, recent developments in measurement techniques enable us to collect, store, and analyze a large amount of process data, and make data-driven soft sensor modeling methods possible. Although soft sensor techniques have been applied in various domains [4]–[6], they share common components and properties, such as input variable selection and estimator design. It is important to select a subset of the whole feature set to build a robust model with better generalization capability. As for regressor design, many learner models, such as the support vector regressor (SVR) [7] and random vector functional-link (RVFL) networks [8], [9], can be employed.

Based on the operator's experience, it is believed that both flame images and process variables have a close relationship with the clinker f-CaO content. From an operator's understanding, the color and configuration features of the ROI, i.e., the material region and flame region, of burning zone flame images (see Fig. 1) are two critical visual features by which to estimate f-CaO content [10]. The color of the ROI indicates the combustion region and the distribution of the temperature field, whilst the configuration of the ROI characterizes the heat source, disturbance from the smoke and dust, and the clinker sintering status. On the other hand, process variables are usually employed to evaluate the burning state under smoke and dust disturbance. In [11], a segmentation-free approach is introduced for the first time to extract the color and configuration features of the flame image ROI to identify the burning state, which aims to avoid some of the uncertainties caused by unreliable ROI segmentation.

Fig. 1. A burning zone flame image.

This work is built on our previous work in [11], and our technical contributions in this paper can be summarized as follows: (i) we propose an ensemble learning-based clinker f-CaO content soft sensor model, where data on both flame images and process variables are utilized to improve on the accuracy of single-source data-based regression models; (ii) we address randomness in system design and suggest a recursive process to select the input variables to achieve better modeling performance in terms of prediction accuracy, model reliability, and generalization. Our proposed method is composed of several modules, including data pre-processing, feature extraction and compression, and ensemble estimator design and implementation. Note that this paper does not aim merely to connect these modules in series, but to develop a novel multi-source data ensemble-driven clinker f-CaO content soft sensor model, which has not been previously developed. Specifically, in the data pre-processing step, a compact Gabor filter bank [12] and a modified median filter are employed to distinguish the flame image ROI and remove process variable outliers to facilitate the following feature extraction procedure. For image feature extraction, the color, global configuration, and local configuration features of the flame image ROI are computed by multivariate image analysis (MIA) [2], principal component analysis (PCA) [13], and a scale invariant feature transform (SIFT) operator [14] combined with the "bag of visual words" (BoVW) descriptor [15] and the latent semantic analysis (LSA) technique [16]. Further, the kernel partial least squares (KPLS) algorithm [17] is applied to extract the low dimensional score matrix features to compress the concatenated flame image features and filtered process variables. The extracted score matrix features are used as the potential inputs of the soft sensor model. Finally, in the ensemble estimator design phase, feed-forward neural networks with random weights are employed and a performance-based recursive feature selection method is used to determine the input variables of the learner models. Numerous experiments with comprehensive comparisons are carried out. The results indicate the superiority of our proposed multi-source data ensemble clinker f-CaO content soft sensor model compared with other soft sensor models.

The remainder of the paper is organized as follows. Section 2 details the cement producing process and our multi-source data modeling approach. Section 3 describes the flame image and process variable pre-processing steps and the feature extraction methods used in this study. Section 4 details our ensemble learning-based regressor design method. Section 5 reports our experimental results with comparisons. Section 6 concludes this work and describes further research.

II. CEMENT PRODUCING PROCESS DESCRIPTION AND MULTI-SOURCE DATA MODELING APPROACH

A. Brief description of cement producing process

For cement production, the raw materials are limestone, clay, laterite, and red ochre in the required proportions. These materials, preset in size, are fed into the core equipment of the cement production process, i.e., a rotary kiln, as shown in Fig. 2. At the kiln head, coal powders from the coal feeder are mixed with the primary air to form a fuel flow that is sprayed into the kiln head hood and combusts with the air from the cooler. The hot gas is brought to the kiln tail by the induced draft fan, whilst the raw material moves to the kiln head by the rotation of the kiln and its gravity, in a counter-clockwise direction. The raw material passes through the drying zone, the pre-heating zone, the decomposing zone, the burning zone, and the cooling zone in sequence to create the final product, i.e., clinker. Specifically, in the burning zone, at 1300 °C, the liquid phase of tetracalcium aluminoferrite (C4AF), tricalcium aluminate (C3A), and dicalcium silicate (C2S) will absorb the free CaO (f-CaO) to produce the final tricalcium silicate (C3S).

The clinker is ground with gypsum in a ball mill to produce cement [18]. The clinker comprises 85% of the cement, hence its quality determines the cement performance to a large extent. Due to volume expansion with water, excess f-CaO in the clinker is the main cause of cement instability [18]. The f-CaO content in the cement is set at 0.5% to 2.0% by the state standard. However, owing to the special rotary kiln structure, the clinker can only be manually sampled at the kiln outlet at 1 h intervals, and its f-CaO content is measured offline using a mixed solution of clinker and ethanediol or anhydrous glycerin-ethanol for 5 to 20 min [18]. Therefore, significant delays make timely clinker quality-based control impossible, which leads to the rejection or recycling of substandard clinker.

As an alternative, an f-CaO content indirect control mode is used, as shown in Fig. 2, that is, with input raw material, by observing the burning zone clinker sintering and coal burning status, operators extract visual features related to the clinker quality, and then combine these with process variables


Fig. 2. Control schema of rotary kiln sintering process. K1, coal feeder valve; K2, induced draft fan valve; PT1, kiln head pressure sensor; PT2, kiln tail pressure sensor; TT1, kiln head temperature sensor; TT2, kiln tail temperature sensor; U1, coal feeder valve manipulated variable; U2, induced draft fan valve manipulated variable; U3, raw material pump current manipulated variable; U4, main motor current manipulated variable.

TABLE I
FLAME IMAGE FEATURES AND ASSOCIATED PROCESS VARIABLES FOR CEMENT ROTARY KILN F-CAO CONTENT SOFT SENSOR MODEL

Soft sensor model input and output variables

Flame image features (input)        Color of ROI
                                    Global configuration of ROI
                                    Local configuration of ROI
Kiln operating variables (input)    Coal feeding (Wc)
                                    Opening degree of induced draft fan (Od)
                                    Kiln main motor current (Ik)
                                    Raw material pump current (Im)
                                    Kiln tail temperature (Tt)
                                    Kiln head temperature (Th)
                                    Kiln head pressure (Ph)
Raw material quality (input)        Lime saturation factor (KH)
                                    Silicic acid rate (SM)
                                    Alumina modulus (AM)
                                    Granularity (Gr)
Clinker quality (output)            f-CaO content

(including operation variables and boundary conditions, i.e., raw material components), and the expected f-CaO content, to identify the current burning state, and then regulate the manipulated variables, i.e., coal feeder valve (U1), induced draft fan valve (U2), raw material pump current (U3), and main motor current (U4), to guarantee f-CaO content. However, the resulting f-CaO content can be affected by an operator's mental state, work experience, and attitude. It is well known that fluctuating f-CaO content has a significant impact on the stability of cement quality. Although lab testing is the only reliable reference to guide subsequent operation, a soft sensor model that provides an online estimate of f-CaO content would make timely, clinker quality-based closed-loop control possible. Jiuganghongda Cement Plant, with a capacity of 5,000 tonnes of clinker per day, is a subsidiary of the largest steel enterprise in China's northwest, located in Jiayuguan, Gansu province. In order to achieve our soft sensor goal, burning zone flame images, associated process variables, and f-CaO content measured under various conditions were collected from the No. 2 rotary kiln, which produces 2,000 tonnes of clinker per day, as shown in Table 1.

Fig. 3. Flow chart of the clinker f-CaO content prediction system. In both the learning and testing stages, flame images and process variables are pre-processed (compact Gabor filter bank and improved median filter), features are extracted (MIA, PCA, SIFT+BoVW+LSA) and combined into the feature vector F = [fa, fb, fc, fd], and the KPLS+RVFL soft sensor model outputs the estimated f-CaO value.

B. Multi-source data ensemble learning modeling

Through the above analysis, to achieve the goal of clinker quality-based timely control, a clinker f-CaO content soft sensor model based on multi-source data ensemble learning is first designed, as depicted in Fig. 3, with the following components:

• Training flame images are pre-processed by a compact Gabor filter bank to distinguish the ROI. Then, the color feature $f_a$, global configuration feature $f_b$, and local configuration feature $f_c$ of the ROI are extracted.

• The filtered training process variables $f_d$ are concatenated with the extracted flame image features, and then fed into KPLS to extract their low dimensional score matrix features, which form the input of the f-CaO soft sensor model.

• For every score matrix feature subset, an RVFL-based ensemble predictor is built to predict the f-CaO content. The optimal predicted result is selected as the f-CaO predicted values of the training dataset, together with the corresponding optimal score matrix feature subset.

• Based on the well-trained parameters in the pre-processing, feature extraction, and soft sensor model design steps, testing flame images and process variables can be used to obtain their f-CaO content predicted values in the same way.

The next section describes each part of the proposed f-CaO content soft sensor model in our study.

C. Performance evaluation metrics

The performance of the f-CaO content soft sensor model can be evaluated by using the root mean squared error (RMSE) and the goodness of fit value $R^2$ produced by the training and testing datasets.

1) The RMSE is given by [19]:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{s_c} (y_i - \hat{y}_i)^2}{s_c}} \tag{1}$$

where $s_c$, $y_i$, and $\hat{y}_i$ denote the sample number, measured value, and predicted value of f-CaO content, respectively. The smaller the RMSE, the smaller the mean fitting error, and the better the model performance.
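As a minimal sketch of Eq. (1), the RMSE can be computed with NumPy as follows. The function name and the sample values are illustrative assumptions, not data from this study.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error of Eq. (1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up f-CaO values (in %), for illustration only.
measured = np.array([1.0, 1.5, 0.8, 1.2])
predicted = np.array([1.1, 1.4, 0.9, 1.0])
print(rmse(measured, predicted))
```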

2) The goodness of fit ($R^2$) is given by [20]:

$$R^2 = 1 - \beta\,\frac{\sum_{i=1}^{s_c} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{s_c} (y_i - \bar{y})^2 - \sum_{i=1}^{s_c} (y_i - \hat{y}_i)^2} \tag{2}$$

where $\beta = (s_c - 1)/(s_c - f_d - 1)$; $f_d$ and $\bar{y}$ denote the feature dimension and the mean of the measured f-CaO content variables, respectively; and $R^2 \in [0, 1]$, where a value approaching 1 demonstrates a high clustering degree of the dataset around the fitting curve, whilst 0 indicates an inferior fitting performance.

III. DATA PRE-PROCESSING AND FEATURE EXTRACTION

A. Data pre-processing

Owing to the smoke, dust, and harsh kiln environment, the sampled burning zone flame images and process variables still exhibit heavy noise and large fluctuations, even though the CCD cameras are meticulously maintained on a daily basis to avoid blurry and dirty flame images and online sensors are available. Therefore, filtering must be employed to minimize the influence of the disturbances. The pre-processing procedures for flame images and process variables are described in turn below.

1) Pre-processing of flame images: Discriminative ROI facilitate feature extraction, and the ROI have distinct texture attributes. Since the Gabor filter has emerged as the most popular texture analysis method, it is employed here to discriminate the ROI. Reference [12] reported that only a subset of a Gabor filter bank may be useful, while the other filters are redundant and offer little improvement to (or even reduce) discriminative power due to the peaking phenomenon [21]. Hence, the method of [12] is used to generate a compact Gabor filter bank to enhance the separability of the ROI.

As a fixed camera provides a rough location estimate for the ROI, to avoid the difficult ROI segmentation issue, two fixed windows, 25 × 25 pixels in size, are used to sample the ROI to represent their texture attributes, as shown in Fig. 4. Assume a total of $2n_{tr}$ flame and material texture images $T_1, T_2, \ldots, T_{2n_{tr}}$ are sampled from the $n_{tr}$ gray-scale images transformed from the $n_{tr}$ training RGB flame images $I_1, I_2, \ldots, I_{n_{tr}}$.

Fig. 4. Fixed windows of burning zone flame image (labeled regions: flame region, material region, coal region, kiln wall, material height, and the two fixed windows).

Let $f_1, f_2, \ldots, f_{n_G}$ denote the feature groups extracted from the filtered texture sample images by using $n_G = 64$ initial Gabor filters (filter bank parameters set as $f_m = \gamma/(2\gamma + 2\sqrt{\log 2}/\pi)$, $n_f = 4$, $n_o = 4$, $\gamma = 0.5, 1.0$, and $\eta = 0.5, 1.0$ [22]), where $f_z = [f_{1,z}, f_{2,z}, \ldots, f_{2n_{tr},z}]^T$, $z = 1, \ldots, n_G$. For each filtered texture image, the mean $\mu$ and standard deviation $\sigma$ features are extracted, i.e., $f_{j,z} = [\mu_{j,z}, \sigma_{j,z}]$. The Mahalanobis separability measure $J_M(z)$ is employed as the metric function to evaluate and sort the discriminative power of $f_z$ and the associated filter:

$$J_M(z) = (g_{i,z} - g_{j,z})\, g_{c,z}^{-1}\, (g_{i,z} - g_{j,z})^T, \quad i, j = 1, 2 \tag{3}$$

where $g_{i,z}$ and $g_{j,z}$ denote the mean vectors of the flame class and the material class, and $g_{c,z}$ denotes the covariance matrix, in the feature space associated with $f_z$. In this study, such a metric is incorporated with a forward selection technique to automatically select the best uncorrelated feature groups and associated filters to distinguish the ROI and facilitate the following feature extraction step [12].
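The ranking step based on Eq. (3) can be sketched as below. This is only an illustration under stated assumptions: $g_{c,z}$ is taken to be the pooled within-class covariance of the two classes, the function names are hypothetical, and the forward selection of uncorrelated feature groups from [12] is not reproduced.

```python
import numpy as np

def mahalanobis_separability(flame_feats, material_feats):
    """J_M of Eq. (3) for one filter; rows are [mean, std] features of texture samples."""
    g_i = flame_feats.mean(axis=0)
    g_j = material_feats.mean(axis=0)
    n_i, n_j = len(flame_feats), len(material_feats)
    # Assumption: g_c is the pooled within-class covariance matrix.
    cov_i = np.cov(flame_feats, rowvar=False)
    cov_j = np.cov(material_feats, rowvar=False)
    g_c = ((n_i - 1) * cov_i + (n_j - 1) * cov_j) / (n_i + n_j - 2)
    diff = g_i - g_j
    return float(diff @ np.linalg.solve(g_c, diff))

def rank_filters(features, labels):
    """Sort filters by decreasing J_M.

    features: array of shape (n_filters, n_samples, 2)
    labels:   array of shape (n_samples,), 1 = flame window, 2 = material window
    """
    scores = np.array([
        mahalanobis_separability(f[labels == 1], f[labels == 2]) for f in features
    ])
    return np.argsort(scores)[::-1], scores
```

A filter whose mean and standard deviation features separate the two windows well receives a large $J_M$ and is placed first in the returned order.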

Once the best compact filter bank with $n_C$ filters for the training dataset is selected, the filters are applied to the three channel sub-images ($I_R$, $I_G$, and $I_B$) of each training and testing RGB flame image, respectively. Then, the complex image $I$ of the mean image of the three channels for the $n_C$ filtered images is used as the pre-processed flame image.

2) Pre-processing of process variables: During online measurement, outliers often occur in the collected process variable dataset, which might lead to model deterioration and incorrect analysis results. According to [23], Hampel's method is more efficient in detecting outliers, and hence it is applied to each process variable series as follows.

Step 1: For the collected process variable $x(\iota)$ at the current moment $\iota$, create a moving window sequence of width $\varsigma$:

$$\{x(\iota - \varsigma),\ x(\iota - \varsigma + 1),\ \ldots,\ x(\iota - 2),\ x(\iota - 1)\} \tag{4}$$

Step 2: Compute the filtered value $x_f(\iota)$ of $x(\iota)$ using the standard median filtering algorithm:

$$x_f(\iota) = \begin{cases} x(\iota), & \text{if } |x(\iota) - x_m(\iota)| < L_1 \cdot \mathrm{MAD}(\iota) \\ x_m(\iota), & \text{else} \end{cases} \tag{5}$$

where $L_1$ and $x_m(\iota)$ are a threshold and the median of the window sequence, and $\mathrm{MAD} = 1.4826 \times v_D$ [23].

Step 3: In order to eliminate possible large peak disturbances, an improved step is added to compute the final $x_f(\iota)$:

$$x_f(\iota) = \begin{cases} x_f(\iota - 1), & \text{if } \mathrm{MAD}(\iota) > L_2 \\ x_f(\iota), & \text{else} \end{cases} \tag{6}$$

where $L_2$ is a threshold to detect the existence of a large disturbance. In our study, $\varsigma = 7$, $L_1 = 0.5$, and $L_2 = 8$.
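Steps 1 to 3 can be sketched as a short routine. This is a hedged illustration rather than the authors' implementation: it assumes that $v_D$ is the median absolute deviation of the window around its median, and that the first $\varsigma$ samples, for which no full window exists, are passed through unchanged.

```python
import numpy as np

def improved_median_filter(x, width=7, l1=0.5, l2=8.0):
    """Hampel-style filter following Eqs. (4)-(6) for one process variable."""
    x = np.asarray(x, dtype=float)
    xf = x.copy()  # assumption: samples without a full window are kept as-is
    for t in range(width, len(x)):
        window = x[t - width:t]                        # Eq. (4): previous `width` samples
        xm = np.median(window)
        # Assumption: v_D is the median absolute deviation of the window.
        mad = 1.4826 * np.median(np.abs(window - xm))
        # Eq. (5): keep the sample if it is close to the window median.
        value = x[t] if abs(x[t] - xm) < l1 * mad else xm
        # Eq. (6): hold the previous filtered value during a large disturbance.
        xf[t] = xf[t - 1] if mad > l2 else value
    return xf

# Illustrative series with a single spike (made-up numbers).
series = np.array([10.0] * 8 + [50.0, 10.0])
print(improved_median_filter(series))
```

The window is built from raw samples, as in Eq. (4); a variant that slides over already filtered values would behave differently around long disturbances.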

For $x_f(\iota)$, a normalization operation maps $x_f(\iota)$ into the range 0 to 1 to avoid the influence of differing variable dimensions and magnitudes. Based on Table 1, kiln operating variables with a 10 s sampling period use the above pre-processing procedure, whilst raw material quality with a 1 h lab analysis period is considered constant during such an interval and normalized into $[0, 1]$; values at the same sampling moment are then selected to match the operating variables. The final pre-processed process variables can be denoted as $f_d = [KH, SM, AM, Gr, O_d, W_c, P_h, I_m, I_k, T_h, T_t]$.
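The normalization and the alignment of hourly lab values with the 10 s operating variables can be illustrated as follows. The helper names are hypothetical, and min-max scaling is assumed as the normalization (the text only states that values are mapped into the range 0 to 1).

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column into [0, 1]; assumed here to be min-max scaling."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (x - lo) / span

def hold_hourly(values, samples_per_hour=360):
    """Repeat each hourly lab value over that hour's 10 s samples (3600 s / 10 s = 360)."""
    return np.repeat(np.asarray(values, dtype=float), samples_per_hour, axis=0)

kh_hourly = [0.90, 0.92]           # made-up lime saturation factor values
kh_10s = hold_hourly(kh_hourly)    # one value per 10 s operating sample
print(kh_10s.shape, min_max_normalize([2.0, 4.0, 6.0]))
```

In practice the minimum and maximum would be taken from the training data and reused for testing data, so that both are scaled consistently.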

B. Flame Image Feature Extraction

Due to the enormous number of pixels in an image, the use of direct pixel-level features may result in the deterioration of performance. Hence, more meaningful high-level features should be employed to mitigate this problem. Based on the operator's understanding, two key visual features closely related to f-CaO content, i.e., the color and configuration of the ROI, can be used to represent the flame image.

1) Flame image color feature: The color of the ROI is one key factor which indicates the combustion region and the temperature field distribution. An appropriate color indicates a normal burning state and satisfactory f-CaO content. Compared with methods that track a turbulent flame directly in the image space to extract color features, the MIA technique [2] characterizes the color efficiently without the difficult flame location step.

The key idea of MIA is to project image pixels of a similar color into a common region of the score space regardless of their spatial location, and then retrieve the locations of pixels of a similar color in the image space. MIA is based on multi-way PCA, which unfolds the 3-dimensional filtered flame image $\underline{I}$ into a 2-dimensional matrix $I$ without considering the pixel locations, and then applies PCA to it:

$$\underline{I}_{\,n_x \times n_y \times 3} \xrightarrow{\text{unfold}} I_{(n_x \cdot n_y) \times 3} = \sum_{m=1}^{PC} t_m p_m^T + E \tag{7}$$

where $\underline{I}$ has the size $n_x \times n_y \times 3$, whilst $I$ has the size $(n_x \cdot n_y) \times 3$; $t_m$ and $p_m$ are score vectors and loading vectors, respectively.

After scaling and rounding to the range 0 to 255, $t_m$ is denoted as $s_m$. Inspection of the $t_1$-$t_2$ score plot is a common method in PCA. However, many pixels may have nearly identical $t_1$-$t_2$ values. A compressed 256×256 score plot histogram $TT$ is more useful to describe the score plot space, as shown in Fig. 5(b) [24], where each element is computed as:

$$TT_{i,j} = \sum_{\tau} 1, \quad \forall \tau:\ s_{1,\tau} = i,\ s_{2,\tau} = j, \quad i, j = 0, \ldots, 255 \tag{8}$$

In such a score plot space, pixels of a similar color in the image space are clustered together. Moreover, as the burning state changes, the pixel locations in such a plot change significantly. This enables one to use masking to obtain a more meaningful ROI to characterize various clinkers. According to [24], a 256×256 binary masking matrix $M$ with fixed graphics is constructed, also as shown in Fig. 5(b).

Fig. 5. (a) Original flame image, (b) Score plot of flame image with masking (yellow; axes $s_1$ and $s_2$), (c) Flame image with overlay of highlighted pixels.

Then, the area feature $f_a$ of the training and testing flame images can be calculated to characterize the color of the ROI:

$$f_a = \sum_{i,j} TT_{i,j}, \quad \forall i, j:\ M_{i,j} = 1 \tag{9}$$
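A compact sketch of Eqs. (7) to (9) is given below. Several details are assumptions made for illustration: PCA is carried out by an SVD of the unfolded, uncentred pixel matrix, the scores are scaled to 0..255 by min-max scaling, and the mask passed in is arbitrary (the fixed mask of [24] is not reproduced).

```python
import numpy as np

def mia_area_feature(image, mask):
    """Score-plot histogram TT (Eq. 8) and masked area feature f_a (Eq. 9)."""
    nx, ny, _ = image.shape
    unfolded = image.reshape(nx * ny, 3).astype(float)     # Eq. (7): unfold
    # Assumption: PCA loadings from an SVD of the uncentred pixel matrix.
    _, _, vt = np.linalg.svd(unfolded, full_matrices=False)
    scores = unfolded @ vt[:2].T                           # t_1 and t_2
    # Assumption: min-max scaling of each score vector to integers 0..255.
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    s = np.rint(255.0 * (scores - lo) / span).astype(int)  # s_1 and s_2
    tt = np.zeros((256, 256), dtype=int)
    np.add.at(tt, (s[:, 0], s[:, 1]), 1)                   # Eq. (8): count pixels per bin
    f_a = int(tt[mask == 1].sum())                         # Eq. (9): pixels under the mask
    return tt, f_a
```

With a mask of all ones, `f_a` equals the number of pixels in the image, which gives a quick sanity check on the histogram.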

2) Flame image global configuration feature: The configuration of the ROI is another key factor characterizing the heat source, disturbance from the smoke and dust, and clinker sintering status, and has a close relationship with f-CaO content. Good flame region circularity and appropriate material region height indicate a proper heat supply, fine ventilation, and satisfactory clinker quality [10].

Eigen-flame image decomposition based on PCA can be applied to extract features that represent the global configuration of the ROI. Suppose $I'_1, I'_2, \ldots, I'_{n_{tr}}$ represent the $n_{tr}$ gray images obtained from the filtered RGB training flame images $I_1, I_2, \ldots, I_{n_{tr}}$, and $Y_1, Y_2, \ldots, Y_{n_{tr}}$ denote the eigen-flame images after applying PCA to such a gray dataset. Notice that the correlation coefficients between a flame image and the eigen-flame images can be considered as a feature to represent the global configuration of the ROI. In our study, the contribution rate criterion is employed to select the first $p$ ($p \le n_{tr}$) eigen-flame images, i.e., global configuration features $f_b$, for the training and testing flame images to reduce the global feature dimension:

$$\kappa = \frac{\sum_{i=1}^{p} \lambda_i}{\sum_{i=1}^{n_{tr}} \lambda_i} \tag{10}$$

where $\lambda_i$ denotes the eigenvalue of the covariance matrix in PCA, and $\kappa$ is set as $\kappa \ge 0.95$.
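The contribution rate criterion of Eq. (10) can be sketched as follows. As an assumption for illustration, the global features are taken to be the projection coefficients of each mean-centred image onto the retained eigen-flame images; the function name is hypothetical.

```python
import numpy as np

def eigen_flame_features(images, kappa=0.95):
    """Select the first p eigen-flame images by Eq. (10) and project onto them.

    images: array of shape (n_images, height, width), gray-scale.
    """
    x = images.reshape(len(images), -1).astype(float)
    mean = x.mean(axis=0)
    xc = x - mean
    # Rows of vt are eigen-flame images; s**2 is proportional to the eigenvalues.
    _, s, vt = np.linalg.svd(xc, full_matrices=False)
    eigvals = s ** 2
    ratio = np.cumsum(eigvals) / eigvals.sum()
    p = int(np.searchsorted(ratio, kappa) + 1)   # smallest p with ratio >= kappa
    basis = vt[:p]
    # Assumption: features are projection coefficients onto the eigen-flames.
    return xc @ basis.T, basis, mean, p
```

A testing image would be centred with the stored `mean` and projected onto the same `basis`, so that training and testing features live in the same $p$-dimensional space.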

3) Flame image local configuration feature: Generally, local features are considered to contain more valuable details to complement global features. The SIFT operator is superior to other feature detection methods, and hence is employed to extract the local configuration features of the ROI. Fig. 6 gives the detected SIFT keypoints for $I_1, I_2, \ldots, I_{n_{tr}}$.

With the 128-dimension SIFT descriptor, it is possible to draw on research in image and text retrieval. BoVW can transform the $n_S$ SIFT descriptors of the $n_{tr}$ training images into a visual word-image table $\phi_{\zeta \times n_{tr}}$, where $\zeta$ is the cluster number of visual words. Since $\phi$ actually lists the frequency of each visual word appearing in each image, synonymies caused by the zero-frequency problem [25] might deteriorate the performance. LSA is hence implemented to map the original visual word space to a conceptual latent semantic space to mitigate such a problem.
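How the visual word-image table $\phi$ is filled can be illustrated with a short sketch. It assumes that a codebook of $\zeta$ cluster centres is already available (for example from k-means clustering of the training descriptors, which is an assumption here) and that each descriptor is assigned to its nearest centre; the function name is hypothetical.

```python
import numpy as np

def visual_word_table(descriptor_sets, codebook):
    """phi[w, k] = number of descriptors of image k assigned to visual word w."""
    zeta = codebook.shape[0]
    phi = np.zeros((zeta, len(descriptor_sets)), dtype=int)
    for k, desc in enumerate(descriptor_sets):
        # Squared distance from every descriptor to every codebook centre.
        d2 = ((desc[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        words = d2.argmin(axis=1)                  # nearest visual word per descriptor
        phi[:, k] = np.bincount(words, minlength=zeta)
    return phi

# Toy 3-dimensional descriptors instead of 128-dimensional SIFT descriptors.
codebook = np.eye(3)
image_1 = np.array([[0.9, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.0, 0.0]])
image_2 = np.array([[0.0, 0.0, 0.8]])
print(visual_word_table([image_1, image_2], codebook))
```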


Fig. 6. Detected SIFT keypoints of a flame image.

LSA generates a semantic space by a singular value decomposition (SVD), which can be written as:

$$\phi = U \Sigma V^T \tag{11}$$

where $U$, $V$, and $\Sigma$ are the matrices of visual words, images, and singular values. The best approximation $\phi_\alpha$ with rank $\alpha$, capturing most of the important structure of $\phi$, is given by

$$\phi_\alpha = U_\alpha \Sigma_\alpha V_\alpha^T \tag{12}$$

where $U_\alpha$ is the first $\alpha$ columns of $U$, $V_\alpha^T$ is the first $\alpha$ rows of $V^T$, and $\Sigma_\alpha$ is the first $\alpha$ factors of $\Sigma$. Thus, the local configuration features of a training image and a testing image of a ROI can be represented as vectors in $\alpha$-dimensional space, respectively:

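$$\hat{\rho} = \rho^T U_\alpha \Sigma_\alpha^{-1} \tag{13}$$

$$\hat{\varrho} = \varrho^T U_\alpha \Sigma_\alpha^{-1} \tag{14}$$

The contribution rate criterion ($\ge 0.95$) for $\Sigma$ is also used to select semantics, i.e., local configuration features $f_c$. Moreover, $\zeta$ is determined by the f-CaO content estimate result.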
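The truncated SVD and the projection of a visual word count vector into the latent semantic space can be sketched as follows. The function names are hypothetical, and the contribution rate is assumed to be computed over the singular values themselves.

```python
import numpy as np

def lsa_fit(phi, rate=0.95):
    """Truncated SVD of the visual word-image table (Eqs. 11-12)."""
    u, s, _ = np.linalg.svd(phi, full_matrices=False)
    # Assumption: contribution rate taken over the singular values.
    ratio = np.cumsum(s) / s.sum()
    alpha = int(np.searchsorted(ratio, rate) + 1)
    return u[:, :alpha], s[:alpha]

def lsa_project(counts, u_alpha, s_alpha):
    """Map word-count vectors (one per row) to the alpha-dimensional space.

    Dividing by s_alpha is the same as right-multiplying by the inverse of
    the diagonal matrix Sigma_alpha.
    """
    return np.asarray(counts, dtype=float) @ u_alpha / s_alpha
```

Projecting the columns of the training table itself returns the first $\alpha$ columns of $V$, which is a convenient consistency check; testing images are projected with the same `u_alpha` and `s_alpha`.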

C. Compressed Score matrix feature based on KPLS

In building the regressor model, a major issue is the high dimensionality and collinearity of the input dataset, which make it difficult or even impossible to reliably predict the outputs. A powerful tool, partial least squares (PLS), which finds a low dimensional set of latent variables through the projection of the input and output dataset onto a subspace, is widely used to overcome such disadvantages. However, PLS, which assumes linear process data, is inappropriate for describing complex industrial processes because such systems may exhibit strong nonlinear characteristics. To tackle data nonlinearity, KPLS was developed. Compared with PLS, KPLS nonlinearly transforms the original input dataset into a feature space of arbitrary dimensionality via nonlinear kernel functions, and then a linear PLS model is generated in the feature space to avoid nonlinear optimization. Moreover, because of its ability to use different kernel functions, KPLS can handle a wide range of nonlinearities.

In order to derive KPLS, firstly, the training filtered process variables $f_d$ and flame image features $f_a$, $f_b$, and $f_c$ are concatenated as the high dimensional input dataset $F = [f_a, f_b, f_c, f_d]$ of KPLS. Secondly, consider a nonlinear transformation of

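TABLE II
KPLS ALGORITHM

(1) $i = 1$;
(2) Randomly initialize vector $u_i$;
(3) $w_i = \psi_i^T u_i / \lVert \psi_i^T u_i \rVert$, $t_i = K_i u_i$, $t_i \leftarrow t_i / \lVert t_i \rVert$;
(4) $c_i = Y_{c_i}^T t_i / (t_i^T t_i)$;
(5) $u_i = Y_{c_i} c_i / (c_i^T c_i)$;
(6) Repeat steps 3-5 until $u_i$ converges to obtain $u_i$, $t_i$, and $w_i$, $c_i$;
(7) Deflate $K_i$, $Y_{c_i}$: $K_i = (I - t_i t_i^T / t_i^T t_i)\, K_i\, (I - t_i t_i^T / t_i^T t_i)$, $Y_{c_i} = Y_{c_i} - t_i t_i^T Y_{c_i} / t_i^T t_i$;
(8) $i = i + 1$, goto step 2;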

the training input dataset $\{F_i\}_{i=1}^{n_{tr}}$ into a feature space $\mathcal{F}$ by the mapping $\psi$: $\{F_i\} \in R^m \rightarrow \psi(F_i) \in \mathcal{F}$, where it is assumed that $\sum_{i=1}^{n_{tr}} \psi(F_i) = 0$ and $\psi(\cdot)$ is a nonlinear function that maps the original input dataset into $\mathcal{F}$. By employing the kernel trick, $\psi(F_i)^T \psi(F_j) = K(F_i, F_j)$, both the explicit nonlinear mapping and the dot products in $\mathcal{F}$ can be avoided. $\psi\psi^T$ represents the $(n_{tr} \times n_{tr})$ kernel Gram matrix $K$ of the cross dot products between all mapped training inputs $\{\psi(F_i)\}_{i=1}^{n_{tr}}$:

$$K(i, j) = \langle \psi(F_i), \psi(F_j) \rangle \tag{15}$$

According to the KPLS algorithm presented in Table 2, the high-dimensional training input dataset $\{F_i\}_{i=1}^{n_{tr}}$ and output f-CaO content $Y_c$ can be projected to obtain their coefficient matrices with regard to the projection axes $w$ and $c$, i.e., the low dimensional score matrices $T_{n_{tr}} = [t_1, \ldots, t_{n_{lv}}]$ and $U_{n_{tr}} = [u_1, \ldots, u_{n_{lv}}]$, where the dimension of the latent variables $t_i$ and $u_i$ equals the number of input training samples, and $n_{lv}$ is the obtained maximal latent variable number. Here, the low dimensional score matrix $T_{n_{tr}}$ is selected as the compressed feature vector to reduce the dimension of the representation of the training input dataset $\{F_i\}_{i=1}^{n_{tr}}$. Specifically, with the kernel trick, the score matrix feature $T_{n_{tr}}$ of the training input $\{F_i\}_{i=1}^{n_{tr}}$ and output $Y_c$ can be computed as follows:

$$T_{n_{tr}} = \psi\psi^T U_{n_{tr}} (T_{n_{tr}}^T K U_{n_{tr}})^{-1} = K U_{n_{tr}} (T_{n_{tr}}^T K U_{n_{tr}})^{-1} \tag{16}$$

For the testing input dataset $\{F_i\}_{i=1}^{n_{te}}$, the score matrix feature $T_{n_{te}}$ can be calculated in the same way, i.e.,

$$T_{n_{te}} = \psi_{n_{te}} \psi^T U_{n_{tr}} (T_{n_{tr}}^T K U_{n_{tr}})^{-1} = K_{n_{te}} U_{n_{tr}} (T_{n_{tr}}^T K U_{n_{tr}})^{-1} \tag{17}$$

where $\psi_{n_{te}}$ is the matrix of the mapped testing input dataset, and $K_{n_{te}}$ is an $(n_{te} \times n_{tr})$ testing matrix whose elements are $K_{n_{te}}(i, j) = K(F_i, F_j)$. Finally, the low dimensional score matrices $T_{n_{tr}}$ and $T_{n_{te}}$ of the training and testing datasets can be regarded as the input feature vectors of the f-CaO content soft sensor model, and $T^{*}_{n_{tr}} = [t_1, \ldots, t_{n_{re}}]$, $n_{re} < n_{lv}$, as the retained latent variable subset, will be selected by the optimal RMSE and $R^2$ criterion for the f-CaO content predicted values of the training dataset.
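The KPLS iteration of Table II and the score computations of Eqs. (16) and (17) can be sketched as below. This is an illustrative sketch, not the authors' code: a Gaussian kernel with a hypothetical `width` parameter is assumed, the kernel matrices are centred explicitly to satisfy the zero-mean assumption, and $w_i$ is not computed because it requires the explicit mapping $\psi$.

```python
import numpy as np

def rbf_kernel(a, b, width):
    """Gaussian kernel matrix between the rows of a and the rows of b (assumed kernel)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def center_train(k):
    """Centre the training Gram matrix so the mapped data have zero mean."""
    n = k.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ k @ h

def center_test(k_te, k_tr):
    """Centre a test kernel matrix using the uncentred training Gram matrix."""
    n = k_tr.shape[0]
    ones_te = np.ones((k_te.shape[0], n)) / n
    return (k_te - ones_te @ k_tr) @ (np.eye(n) - np.ones((n, n)) / n)

def kpls_nipals(k, y, n_lv, tol=1e-10, max_iter=500, seed=0):
    """Steps (1)-(8) of Table II on a centred Gram matrix; returns T and U."""
    rng = np.random.default_rng(seed)
    n = k.shape[0]
    kd = k.copy()
    yd = np.asarray(y, dtype=float).reshape(n, -1).copy()
    ts, us = [], []
    for _ in range(n_lv):
        u = rng.standard_normal(n)                 # step (2)
        for _ in range(max_iter):                  # steps (3)-(6)
            t = kd @ u
            t /= np.linalg.norm(t)
            c = yd.T @ t                           # t has unit length, so t^T t = 1
            u_new = yd @ c / (c @ c)
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        proj = np.eye(n) - np.outer(t, t)          # step (7): deflate K and Y
        kd = proj @ kd @ proj
        yd = yd - np.outer(t, t @ yd)
        ts.append(t)
        us.append(u)
    return np.column_stack(ts), np.column_stack(us)

def score_matrix(k_any, k_train, t, u):
    """Eqs. (16)-(17): K_any U (T^T K U)^{-1}, with K the centred training Gram matrix."""
    return k_any @ u @ np.linalg.inv(t.T @ k_train @ u)
```

For training data, `score_matrix(k, k, t, u)` reproduces the score matrix `t` obtained from the iteration; for testing data, `k_any` is the centred $(n_{te} \times n_{tr})$ kernel matrix from `center_test`.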

IV. RVFL-BASED ENSEMBLE PREDICTOR DESIGN

Random vector functional-link (RVFL) networks were proposed by Pao and his co-workers in [8], and a significant approximation theory was established in [9]. It states that any nonlinear map defined over a compact set can be approximated arbitrarily well with probability one, provided that the number of basis functions is large enough. Indeed, the RVFL network is functionally equivalent to a random basis function approximator [26], where the weights between the input layer and the hidden layer are randomly assigned and the output layer weights (namely $\omega$) can be obtained by solving the following least mean square regression problem [27]:

$$\omega = \arg\min_{\omega} \sum_{i=1}^{n_{tr}} [r_i^T \omega - y_i]^2 \tag{18}$$

where $r_i = [r_{i,1}, r_{i,2}, \ldots, r_{i,n_h}]^T$ and $y_i$ are the hidden layer output vector and the actual output for the $i$th training sample, respectively, and $n_h$ is the number of neurons in the hidden layer. Eq. (18) can be rewritten as

$$\omega = R^{\dagger} Y \tag{19}$$

where $Y = [y_1, y_2, \ldots, y_{n_{tr}}]^T$, $R$ is the output matrix at the hidden layer, and $R^{\dagger} = (RR^T)^{-1}R$ is the Moore-Penrose generalized inverse of $R$.
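As a minimal sketch of Eqs. (18) and (19), the following trains a single RVFL-style regressor on synthetic data. It assumes hidden weights and biases drawn uniformly from [-1, 1] (the paper does not state the sampling range in this section) and a hidden output matrix `H` whose rows are $r_i^T$, so that $R = H^T$. When $RR^T$ is invertible, $(RR^T)^{-1}RY$ is the same least-squares solution in exact arithmetic; `np.linalg.pinv` is used here only for numerical robustness. All names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rvfl(X, y, n_hidden, rng):
    """Random hidden layer, then least-squares output weights (Eq. (18))."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # assumed range
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # assumed range
    H = sigmoid(X @ W + b)             # shape (n_tr, n_h); row i is r_i^T
    omega = np.linalg.pinv(H) @ y      # least-squares output weights
    return W, b, omega

def predict_rvfl(model, X):
    W, b, omega = model
    return sigmoid(X @ W + b) @ omega

# Synthetic example (not the kiln data).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=40)

model = train_rvfl(X, y, n_hidden=10, rng=rng)
y_hat = predict_rvfl(model, X)

# At a least-squares solution the residual is orthogonal to the hidden outputs.
W, b, omega = model
H = sigmoid(X @ W + b)
normal_eq_residual = H.T @ (H @ omega - y)
```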

To the best of our knowledge, RVFL networks can be regarded as an extension of feed-forward neural networks with random weights, which were originally proposed by Schmidt and his co-workers in [28]. From an implementation point of view, RVFL networks are handy candidates for real-time data modeling, although we have a limited understanding of the randomness in such learner models. In practice, different RVFL networks inevitably perform unevenly, because each RVFL model can be viewed as an instance of a random variable; the performance of a single randomly generated RVFL network therefore depends on the particular draw of its random weights. Theoretically, RVFL-based ensemble models should perform better in terms of model reliability. This was shown empirically in [29], which forms the basis of this soft sensor model.

The two key factors in ensemble design are the number of hidden nodes $n_h$, which relates to the under-fitting or over-fitting property in data regression exercises, and the number of RVFL base models $\chi$. For $n_h$, we randomly took values between fifty and seventy for each RVFL. By referring to the simulation results reported in [29] and our trials for the predicted performance evaluation of the soft sensor model, we employed 20 RVFL networks in our soft sensor model. In this study, the sigmoidal function is used as the activation function in all hidden neurons of the RVFL networks. The proposed RVFL-based ensemble predictor design is given as follows:

Step 1: For $\zeta$ ($\zeta_{min} \le \zeta \le \zeta_{max}$) visual words, extract the training image ROI's local configuration feature $f_c$, and then concatenate it with the extracted ROI's color feature $f_a$, the ROI's global configuration feature $f_b$, and the filtered process variables $f_d$ to build the high dimensional training input dataset $\{F_i\}_{i=1}^{n_{tr}}$.

Step 2: With $\{F_i\}_{i=1}^{n_{tr}}$ and the associated f-CaO content $Y_c$, compute the low dimensional score matrices $T_{n_{tr}}$ and $U_{n_{tr}}$ by KPLS as the input feature vectors for the soft sensor model, where the kernel function is selected as $K(F_i, F_j) = F_i^T F_j$.

Step 3: For all candidate score matrix feature subsets $\{t_1\}, \{t_1, t_2\}, \ldots, \{t_1, t_2, \ldots, t_{n_{lv}}\}$, generate $\chi$ RVFL regressors by Eq. (19) to obtain the f-CaO content predicted values $Y_{c_i}$ ($i = 1, \ldots, \chi$) and the associated RMSE and $R^2$. The mean of the $\chi$ predicted values $Y_{c_1}, \ldots, Y_{c_\chi}$ is considered as the final predicted result for the candidate score matrix feature subset.

Step 4: Repeat Step 3 to obtain the optimal score matrix feature subset $T'_{n_{tr}} = [t_1, t_2, \ldots, t_{n_{re}}]$ among all candidates and its optimal RVFL-based ensemble predictor with respect to the $\zeta$ visual words.

Step 5: Increase the value of $\zeta$ and repeat Steps 1∼4 to obtain the predicted results for a validation dataset. The final f-CaO content predictor is determined by taking the ensemble model with the best prediction performance over the validation dataset. The associated optimal visual word number $\zeta^*$ and the score matrices $T^*_{n_{tr}}$ and $U^*_{n_{tr}}$ are obtained simultaneously.

Step 6: For the testing flame images and process variables, with $\zeta^*$, $T^*_{n_{tr}}$, and $U^*_{n_{tr}}$, the score matrix feature $T_{n_{te}}$ is generated by Eq. (17) and fed into the trained RVFL-based ensemble predictor to obtain the f-CaO content predicted values.
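Steps 3 and 4 can be sketched as follows, assuming the score matrix has already been computed. The sketch trains $\chi = 20$ RVFL regressors with $n_h$ drawn between 50 and 70 for each nested subset of score columns, averages their predictions, and keeps the subset with the lowest training RMSE. The outer loop over $\zeta$ and the validation set of Step 5 are omitted. The score matrix and targets are synthetic, the hidden-weight range [-1, 1] is an assumption, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_rvfl(T, y, n_hidden, rng):
    """One RVFL base learner: random hidden layer, least-squares output weights."""
    W = rng.uniform(-1.0, 1.0, size=(T.shape[1], n_hidden))  # assumed range
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    omega = np.linalg.pinv(sigmoid(T @ W + b)) @ y
    return W, b, omega

def predict_rvfl(model, T):
    W, b, omega = model
    return sigmoid(T @ W + b) @ omega

def fit_ensemble(T, y, rng, n_models=20, nh_low=50, nh_high=70):
    """chi base learners, each with its own randomly chosen hidden-layer size."""
    return [fit_rvfl(T, y, int(rng.integers(nh_low, nh_high + 1)), rng)
            for _ in range(n_models)]

def predict_ensemble(models, T):
    """Final prediction is the mean of the base-learner predictions (Step 3)."""
    return np.mean([predict_rvfl(m, T) for m in models], axis=0)

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def select_score_subset(T_tr, y_tr, rng):
    """Try {t1}, {t1, t2}, ... and keep the subset with the lowest training RMSE."""
    best = None
    for k in range(1, T_tr.shape[1] + 1):
        models = fit_ensemble(T_tr[:, :k], y_tr, rng)
        err = rmse(y_tr, predict_ensemble(models, T_tr[:, :k]))
        if best is None or err < best[0]:
            best = (err, k, models)
    return best

# Synthetic stand-in for a KPLS score matrix (79 training samples, 5 latent variables).
rng = np.random.default_rng(0)
T_tr = rng.normal(size=(79, 5))
y_tr = (1.0 + 0.3 * np.tanh(T_tr[:, 0]) - 0.2 * T_tr[:, 1]
        + 0.02 * rng.normal(size=79))

best_rmse, n_re, ensemble = select_score_subset(T_tr, y_tr, rng)
y_hat = predict_ensemble(ensemble, T_tr[:, :n_re])
```

Selecting on training RMSE follows the criterion stated in the text; with 50 to 70 hidden nodes and 79 samples the training error alone can be optimistic, which is one reason a separate validation set (Step 5) is useful.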

V. EXPERIMENTAL RESULTS

In order to validate the presented method, flame images, the associated process variables, and the f-CaO values measured under various conditions were collected from the No. 2 rotary kiln at Jiuganghongda Cement Plant. A color CCD camera (Panasonic WV-CP450) is installed outside the peephole of the kiln head, and is meticulously maintained on a daily basis by operators to avoid blurry and dirty flame images as much as possible. The output signal of the CCD is digitized using an image grabber card (Matrox Meteor II). Each digital flame image is 512×384 pixels in size with 300 dpi, and each pixel is composed of red (R), green (G), and blue (B) components. The sampling period for the flame image and kiln operation variables is set to 10 s, whilst that for the raw material quality and f-CaO content is 1 h.

A total of 157 f-CaO content measurements and the associated flame images and process variables were collected and split into two parts by random selection. One half (79 samples) was used for soft sensor model development (training data) and the other half (78 samples) for soft sensor model validation (testing data). This procedure was repeated 50 times to estimate the average performance. Some of the collected flame images and process variable samples are shown in Fig. 7 and Table 3, respectively. Our operating environment is MATLAB 2010a, with 4 GB RAM and an Intel Core i5-3380M CPU at 2.90 GHz.
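The repeated random split described above can be sketched as follows. The sketch assumes 157 synthetic samples and a plain least-squares model as a placeholder regressor, and it uses the usual coefficient of determination as a stand-in for the paper's $R^2$ definition, which is given outside this section. The function names are hypothetical.

```python
import numpy as np

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def repeated_holdout(X, y, fit, predict, n_train=79, n_rep=50, seed=0):
    """Random 79/78 split repeated n_rep times; returns mean and std of (RMSE, R2)."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_rep):
        idx = rng.permutation(len(y))
        tr, te = idx[:n_train], idx[n_train:]
        model = fit(X[tr], y[tr])
        y_hat = predict(model, X[te])
        scores.append((rmse(y[te], y_hat), r2(y[te], y_hat)))
    scores = np.array(scores)
    return scores.mean(axis=0), scores.std(axis=0)

# Placeholder regressor: linear least squares with an intercept.
def fit_linear(X, y):
    A = np.c_[X, np.ones(len(X))]
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict_linear(coef, X):
    return np.c_[X, np.ones(len(X))] @ coef

# Synthetic data with the same sample count as the study (not the kiln data).
rng = np.random.default_rng(42)
X = rng.normal(size=(157, 4))
y = 1.0 + X @ np.array([0.3, -0.2, 0.1, 0.05]) + 0.01 * rng.normal(size=157)

mean_scores, std_scores = repeated_holdout(X, y, fit_linear, predict_linear)
print("testing RMSE: %.3f +/- %.3f" % (mean_scores[0], std_scores[0]))
print("testing R2:   %.3f +/- %.3f" % (mean_scores[1], std_scores[1]))
```

Reporting mean and standard deviation over the replicas corresponds to the "value ± deviation" entries used in the result tables.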

Fig. 8 shows the flow chart of our soft sensor experiment. Firstly, the training flame images and process variables are pre-processed. Then, the extracted image ROI features and the filtered process variables are concatenated as the high dimensional input $F$. The compressed score matrix features for $F$ are extracted by KPLS, and are then fed into an RVFL-based ensemble regressor to predict the f-CaO content. Based on the well-trained parameters in the training step, the f-CaO content predicted values of the testing dataset can be obtained. Each part of the experiment is described in detail as follows.


Fig. 7. Some collected flame images.

Fig. 8. Flow chart of experiments. (Blocks: input flame images and process variables; preprocessing; feature extraction and concatenation; concatenated feature F; KPLS; compressed feature; RVFL regressor; f-CaO estimated value.)

Fig. 9. Discriminative power of each Gabor filter for two-class training texture images. (x-axis: each Gabor filter; y-axis: discriminative power.)

A. Experiments for data pre-processing

1) Flame image pre-processing: According to the method presented in Section III A, flame and material texture images are first sampled from the training gray flame images, and then feature groups $f_1, f_2, \ldots, f_{n_G}$ are extracted from the filtered texture images by the initial Gabor filters. Fig. 9 shows the discriminative power of each filter for two-class texture images by Eq. (3). As we can see, many filters possess similar attributes, so building a compact filter bank without redundant filters is important to discriminate the ROI.

After the generation of the training candidate feature group subsets, each subset is fed into the learner models to obtain the discrimination results. In our study, four learner models are employed, i.e., probabilistic neural networks (PNN)

TABLE III
SOME COLLECTED DATA SAMPLES

Variable   1        2        3        ...   156      157
KH         0.93     0.92     0.94     ...   0.93     0.94
SM         2.13     2.40     2.20     ...   1.99     2.32
AM         1.35     1.48     1.36     ...   1.16     1.39
Gr         16.8     16.2     16.5     ...   16.6     16.9
Od         64.2     64.2     64.1     ...   64.2     64.2
Wc         2.49     2.55     2.48     ...   2.50     2.51
Ph         -201.8   -207.7   -203.1   ...   -203.2   -201.3
Im         216.1    213.1    214.8    ...   215.4    217.4
Ik         103.8    100.0    102.4    ...   102.6    106.0
Th         1027.2   1049.8   1037.4   ...   1035.5   1020.5
Tt         1026.9   1068.2   1047.2   ...   1037.4   1013.4
f-CaO      0.93     1.11     1.12     ...   0.88     0.78

Fig. 10. Discrimination performance of training texture images vs. candidate Gabor filter bank subset. (x-axis: candidate Gabor filter bank; y-axis: discrimination performance (%); curves: RVFL, SVM, PNN, BPNN.)

Fig. 11. Filtering results comparison of process variables between standard median filter and improved median filter. (x-axis: sample; y-axis: kiln head temperature (°C); curves: original data, basic L*MAD, improved L*MAD.)

[30], back-propagation neural networks (BPNN) [31], support vector machines (SVM), and RVFL, where the smoothing parameter of PNN is computed by [32]; the number of hidden layer nodes and the activation function of BPNN are set as two times the input vector dimension plus one and the sigmoid function [11]; the kernel function, cost parameter $C$, and kernel function parameter $\sigma_S$ of SVM are selected as the Gaussian function, $C \in \{2^{12}, \ldots, 2^{-2}\}$, and $\sigma_S \in \{2^{4}, \ldots, 2^{-10}\}$ [11]; the setting of RVFL is the same as described in Section IV.

Fig. 10 gives the discrimination results of the candidate Gabor filter bank subsets with respect to the various learner models for the training texture images. From Fig. 10, the best performance on the training texture dataset is obtained with the six most discriminative Gabor filters and RVFL. Such a compact filter bank is hence selected to filter the training and testing flame images to discriminate the ROI as much as possible in one experiment.

2) Process variables pre-processing: Taking $T_h$ as an example, Fig. 11 shows the filtering results of $T_h$ based on the standard median filter and the improved median filter. As we can see, compared with the standard median filter, the proposed filter not only eliminates impulsive noise of minor width, but also removes the large peak disturbance.

We apply our filtering algorithm to the $O_d$, $W_c$, $P_h$, $I_m$, $I_k$, $T_h$, and $T_t$ measured sequences, and then select the normalized kiln operating variables and raw material quality corresponding to the collected f-CaO content to form the process variable dataset.
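For orientation only, a generic moving-window median filter with an L·MAD outlier rule (a Hampel-type filter) is sketched below. This is a common textbook rule and is not the authors' improved filter, which is defined in Section III outside this excerpt. The window length, the threshold `L`, the 1.4826 scaling constant, and the synthetic kiln head temperature signal are all assumptions.

```python
import numpy as np

def mad_filter(x, half_window=5, L=3.0):
    """Replace samples farther than L * MAD from the local median with that median."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    for i in range(len(x)):
        lo = max(0, i - half_window)
        hi = min(len(x), i + half_window + 1)
        window = x[lo:hi]
        med = np.median(window)
        # 1.4826 scales the MAD to the standard deviation for Gaussian data.
        mad = 1.4826 * np.median(np.abs(window - med))
        if np.abs(x[i] - med) > L * mad:
            y[i] = med
    return y

# Synthetic temperature-like sequence with two impulsive spikes (not plant data).
rng = np.random.default_rng(0)
n = 200
th = 1030.0 + 5.0 * np.sin(0.1 * np.arange(n)) + rng.normal(0.0, 1.0, size=n)
th[50] = 1500.0
th[120] = 1450.0

th_filtered = mad_filter(th)
```

A rule of this kind removes isolated spikes; a wide peak that fills most of the window shifts the local median itself, which is the kind of disturbance the text says the improved filter handles better.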


Fig. 12. RMSE of f-CaO content predicted values of training dataset vs. number of visual words. (x-axis: number of visual words; y-axis: RMSE.)

B. Experiments for feature extraction and soft sensor model design

Following the procedure in Section III B, via the fixed masking matrix $M$, the area feature $f_a$ of the flame images can be extracted to represent the color of the ROI, whilst according to Eq. (10), the number of selected eigen-flame images, i.e., the dimension of the extracted ROI's global configuration feature $f_b$, can be chosen for each replica. For the ROI's local configuration feature $f_c$, $\zeta_{min} = 2$ and $\zeta_{max} = 20$, and the optimal $\zeta^*$ is then selected based on the f-CaO content predicted values of the training dataset with respect to the above 19 candidate local feature subsets. Subsequently, during each experiment, corresponding to the selected training f-CaO content, the associated training image features and filtered process variables are selected and concatenated to form the training input dataset $\{F_i\}_{i=1}^{n_{tr}}$ for building the f-CaO content soft sensor model.

After the calculation of all candidate score matrix feature subsets for the training input dataset $\{F_i\}_{i=1}^{n_{tr}}$ derived from $\zeta$ visual words, each candidate score matrix feature subset is fed to $\chi$ RVFL regressors to predict its f-CaO content. Corresponding to the optimum, the optimal score matrix feature subset is retained in the $\zeta$ visual words sense. Fig. 12 gives the RMSE of the f-CaO content predicted values of the training dataset for all visual word numbers in one experiment, where every point results from the minimum RMSE over all score matrix feature subsets in the $\zeta$ visual words sense. From this figure, the global optimum RMSE is obtained with the 9th candidate $\{F_i\}_{i=1}^{n_{tr}}$, i.e., the optimal $\zeta^*$ is 10, and the corresponding optimal $T^*_{n_{tr}}$ and $U^*_{n_{tr}}$ are recorded.

Once the optimal parameters for building the soft sensor model have been obtained, the testing process for the proposed f-CaO content soft sensor model can be carried out by simulating the model with the testing dataset. The f-CaO content values predicted by the well-trained RVFL-based ensemble regressor model for the testing dataset in one experiment are shown in Fig. 13. As we can see, the proposed multi-source data ensemble learning-based method is feasible for modeling the f-CaO content of the rotary kiln. Such results support the previously mentioned point that a close relationship exists between the flame images, process variables, and f-CaO content.

Fig. 13. f-CaO content predicted values of testing dataset. (x-axis: observation number; y-axis: predicted value of f-CaO content; curves: actual output, KPLS+RVFL output.)

C. Contrasting experiments

In order to validate the effectiveness of our proposed multi-source data ensemble learning-based clinker f-CaO content soft sensor model, we tested the performance of other soft sensor models: a different feature compression method (PLS combined with RVFL ensemble regressors); no feature compression (high dimensional input $F$ fed directly into RVFL ensemble regressors); a different regressor (KPLS combined with an SVR regressor); a different feature compression method and a different regressor (PLS combined with an SVR regressor); no feature compression and a different regressor (high dimensional input $F$ fed directly into an SVR regressor); and feature compression without a separate regressor (KPLS-based and PLS-based). With the same candidate training input dataset $\{F_i\}_{i=1}^{n_{tr}}$, their corresponding optimal $\zeta^*$, $T^*_{n_{tr}}$ and $U^*_{n_{tr}}$, and regressor parameters can be selected. Based on the selected optimal parameters, the f-CaO content of the testing input dataset can be predicted in the same way.

The detailed RMSE, $R^2$, and running time for the various soft sensor model design methods with 50 replicas are listed in Table 4. Moreover, the impact of the single source data-based methods, i.e., flame images and process variables used individually for f-CaO content prediction, is also studied, as shown in Table 5. From Tables 4∼5, the following conclusions are drawn:

a) Our proposed multi-source data ensemble learning-based method achieves the minimum RMSE = 0.092 ± 0.01 and the maximum $R^2$ = 0.958 ± 0.01 for the testing dataset. Moreover, the performance of our method is better than that of the single source data-based methods. Such results indicate that both the flame images and the process variables have a close relationship with the f-CaO content. Hence, flame images and process variables should be combined to model the f-CaO content.

b) Data regression using neural networks suffers from model architecture selection and parameter settings. Moreover, the correlation between the input feature vectors and the output f-CaO content is difficult to judge. In this study, in contrast to the other feature compression methods and the methods without feature compression, the KPLS+RVFL-based ensemble framework is employed to lessen the uncertainties caused by the factors mentioned above.


TABLE IV
COMPARISON OF PREDICTION RESULTS AMONG DIFFERENT MODELING METHODS (FLAME IMAGES COMBINED WITH PROCESS VARIABLES)

Soft sensor      RMSE         RMSE         R2           R2           Running time   Running time
design method    (training)   (testing)    (training)   (testing)    (training)     (testing)
KPLS+RVFL        0.044±0.01   0.092±0.01   0.973±0.01   0.958±0.01   1118.7±5.4     25.6±0.4
KPLS+SVR         0.031±0.02   0.112±0.01   0.974±0.02   0.940±0.01   155.1±4.1      25.7±0.1
KPLS             0.009±0.01   0.135±0.01   0.980±0.02   0.935±0.01   48.5±1.3       25.5±0.2
PLS+SVR          0.010±0.01   0.153±0.03   0.979±0.01   0.911±0.01   155.9±2.7      25.7±0.1
PLS+RVFL         0.009±0.01   0.164±0.06   0.979±0.01   0.899±0.02   173.2±2.4      25.6±0.1
PLS              0.044±0.01   0.165±0.01   0.975±0.01   0.898±0.01   43.9±1.3       25.5±0.2
SVR              0.007±0.01   0.161±0.06   0.982±0.01   0.899±0.01   144.4±3.5      25.7±0.2
RVFL             0.007±0.01   0.158±0.02   0.982±0.01   0.912±0.01   88.2±1.6       25.5±0.1

TABLE V
COMPARISON OF PREDICTION RESULTS AMONG DIFFERENT MODELING METHODS (FLAME IMAGES AND PROCESS VARIABLES USED RESPECTIVELY)

Flame images
Soft sensor      RMSE         RMSE         R2           R2
design method    (training)   (testing)    (training)   (testing)
KPLS+RVFL        0.036±0.07   0.133±0.05   0.972±0.08   0.935±0.04
KPLS+SVR         0.044±0.06   0.146±0.06   0.970±0.05   0.933±0.07
KPLS             0.132±0.06   0.181±0.03   0.939±0.08   0.886±0.06
PLS+SVR          0.010±0.01   0.201±0.09   0.978±0.02   0.870±0.05
PLS+RVFL         0.009±0.02   0.208±0.09   0.976±0.01   0.869±0.05
PLS              0.105±0.03   0.188±0.02   0.973±0.01   0.882±0.01
SVR              0.011±0.01   0.199±0.05   0.978±0.01   0.875±0.01
RVFL             0.012±0.02   0.191±0.03   0.978±0.01   0.878±0.02

Process variables
Soft sensor      RMSE         RMSE         R2           R2
design method    (training)   (testing)    (training)   (testing)
KPLS+RVFL        0.027±0.01   0.140±0.01   0.975±0.04   0.918±0.01
KPLS+SVR         0.028±0.01   0.149±0.01   0.975±0.02   0.917±0.01
KPLS             0.148±0.01   0.180±0.01   0.937±0.03   0.886±0.02
PLS+SVR          0.010±0.01   0.203±0.08   0.979±0.01   0.869±0.03
PLS+RVFL         0.011±0.01   0.186±0.09   0.978±0.01   0.882±0.02
PLS              0.150±0.01   0.191±0.01   0.915±0.02   0.874±0.02
SVR              0.011±0.01   0.195±0.04   0.977±0.01   0.873±0.01
RVFL             0.012±0.01   0.194±0.05   0.977±0.01   0.877±0.03

c) In the comparison of the various regressors with the same compressed features, the KPLS+SVR-based method lags behind the KPLS+RVFL-based method in the generalization capability of the soft sensor model, i.e., how well the designed model performs on the testing dataset. This is probably because over-fitting of the SVR parameters decreases the training RMSE, while the resulting poor generalization increases the testing RMSE.

d) KPLS maps the original input dataset into a high dimensional space. The extracted compressed score matrix feature not only copes with the nonlinearity of the input dataset, but also reduces the feature representation dimension, which increases the generalization capability of the designed soft sensor model. Hence, a smaller RMSE and a larger $R^2$ are obtained compared with the corresponding PLS compressed feature-based prediction results.

e) Regressors can build a complex mapping between the input feature space and the expected output space. For the soft sensor models without a regressor, the compressed feature alone is inadequate to represent the nonlinear relationship between the input and output datasets, so their performance is inferior to that of the corresponding soft sensor models with a regressor. Conversely, when the high dimensional input $F$ is fed directly into the various regressors, the high dimensional, nonlinear, and correlated problems existing in the input dataset make the generalization capability and performance inferior to those of the compressed feature soft sensor models.

f) The computational complexity of the various soft sensor models is evaluated by the running time, where the running time includes pre-processing, feature extraction, and soft sensor model building or testing. As we can see, the difference in the running time of the different methods is almost negligible

TABLE VI
COMPARISON OF PREDICTION RESULTS FOR VARIOUS IMAGE NOISES AND IMAGE RESOLUTIONS

Various      RMSE         RMSE         R2           R2
methods      (training)   (testing)    (training)   (testing)
120dpi       0.048±0.01   0.118±0.06   0.968±0.02   0.936±0.02
96dpi        0.041±0.02   0.123±0.05   0.970±0.07   0.935±0.05
72dpi        0.042±0.02   0.133±0.07   0.970±0.04   0.935±0.04
Pepper       0.051±0.02   0.094±0.02   0.967±0.03   0.956±0.04
Poisson      0.018±0.02   0.161±0.04   0.974±0.03   0.896±0.04
Gaussian     0.050±0.02   0.094±0.02   0.968±0.03   0.960±0.03

for the testing dataset. Moreover, considering the RMSE and $R^2$, our method gives the superior predicted result while its complexity remains comparable to that of the other methods. Such running time covers the whole training or testing dataset, and one testing flame image with its process variables only occupies 0.33 s on average. After the offline training, compared with the 10 s sampling time, the f-CaO content predicted values of the testing dataset can be obtained online well within one sampling period, which supports future timely closed-loop control based on the f-CaO content.
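As a quick check of the per-sample figure quoted above, assuming the testing running time of KPLS+RVFL in Table 4 is in seconds and covers all 78 testing samples:

```latex
\frac{25.6\ \text{s}}{78\ \text{samples}} \approx 0.33\ \text{s per sample} \ll 10\ \text{s (sampling period)}
```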

D. Robustness analysis

Due to the harsh industrial operational environment of the rotary kiln, the CCD camera is meticulously maintained on a daily basis by the operators, as much as possible. Blurry and dirty flame images may nevertheless appear, so a robustness analysis of the proposed soft sensor model should be carried out. In order to imitate the blurry and dirty images taken by the CCD camera, although not real, three kinds of noise are added to the flame images to test the robustness of our method, namely zero-mean Gaussian white noise with a variance of 0.01, Poisson noise, and salt and pepper noise with a density of 0.05. Moreover, as the collected flame images are 300 dpi, the impact of various image resolutions on the proposed soft sensor model should also be studied. In the present study, the collected flame images are transformed into 120 dpi, 96 dpi, and 72 dpi to test the robustness of our method. The same flame image features are extracted from these transformed flame images and combined with the filtered process variables, and the result is fed into KPLS to extract the compressed score matrix feature. RVFL-based ensemble regressors are finally employed to predict the f-CaO content in the same way. The predicted results of the above mentioned experiments with 50 replicas are provided in Table 6. From Table 6, the following conclusions can be drawn:

i) As the image resolution decreases, the prediction performance on the testing dataset declines: the testing RMSE rises from 0.118 at 120 dpi to 0.133 at 72 dpi, and the testing $R^2$ stays at about 0.935, below the 0.958 obtained with the original images. This can be easily interpreted from the human-vision point of view. The more blurry the flame images, the harder it is to extract meaningful image features. The extracted image features with larger deviation gradually deteriorate the predicted f-CaO content. Hence, the CCD camera should be carefully maintained on a daily basis to avoid blurry and dirty flame images.

ii) The quality of the collected industrial images is not as fine as that of standard images. Nevertheless, the proposed soft sensor model is robust to Gaussian white noise and salt and pepper noise; such noises have little impact on the predicted f-CaO content. With Poisson noise, however, the performance worsens markedly (testing RMSE of 0.161 and testing $R^2$ of 0.896). To address this, a Poisson noise filter will be added in our future work to further enhance the robustness of our method.
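The perturbations used in the robustness analysis can be imitated with the following sketch. It assumes images stored as floating-point arrays in [0, 1], a Poisson model applied to 8-bit-scaled intensities, and nearest-neighbour resampling as a stand-in for the dpi change; the paper does not state how the noise or the resolution change was implemented, so these choices and all function names are assumptions.

```python
import numpy as np

def add_gaussian_noise(img, rng, var=0.01):
    """Zero-mean Gaussian white noise with the given variance."""
    return np.clip(img + rng.normal(0.0, np.sqrt(var), img.shape), 0.0, 1.0)

def add_poisson_noise(img, rng, peak=255.0):
    """Poisson noise: each intensity (scaled to counts) is the mean of a Poisson draw."""
    return np.clip(rng.poisson(img * peak) / peak, 0.0, 1.0)

def add_salt_pepper_noise(img, rng, density=0.05):
    """Set roughly `density` of the pixels to black or white, half each."""
    out = img.copy()
    r = rng.random(img.shape[:2])
    out[r < density / 2] = 0.0        # pepper
    out[r > 1 - density / 2] = 1.0    # salt
    return out

def rescale_nearest(img, scale):
    """Nearest-neighbour resampling, e.g. scale = 120 / 300 for 300 dpi to 120 dpi."""
    h, w = img.shape[:2]
    rows = np.minimum((np.arange(int(h * scale)) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(int(w * scale)) / scale).astype(int), w - 1)
    return img[rows][:, cols]

# Flat grey RGB test image with the same size as the flame images (384 rows x 512 columns).
rng = np.random.default_rng(0)
img = np.full((384, 512, 3), 0.5)

img_gauss = add_gaussian_noise(img, rng)
img_poisson = add_poisson_noise(img, rng)
img_sp = add_salt_pepper_noise(img, rng)
img_120dpi = rescale_nearest(img, 120 / 300)
```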

VI. CONCLUSIONS

Online estimation of clinker f-CaO content will greatly help in reducing the amount of poor quality clinker by offering significant information to operators. Unfortunately, there are few effective methods for estimating such content. In this study, a multi-source data ensemble learning-based soft sensor model is developed using real input-output data collected from a cement rotary kiln. Firstly, an improved compact Gabor filter bank and a modified median filter are used in the pre-processing step to discriminate the flame image ROI and eliminate the process variable outliers. Then, the color, global configuration, and local configuration features of the flame image ROI are extracted by MIA, PCA, and the SIFT operator combined with the BoVW descriptor and LSA. By applying the KPLS technique, the compressed score matrix features are extracted from the concatenated flame image features and filtered process variables to avoid high dimensional, nonlinear, and correlated problems for soft sensor modeling. The final f-CaO content predicted result is based on the RVFL-based ensemble model with a subset of the compressed score matrix features. Comprehensive experiments were carried out, including various compressed feature and regressor design methods, without compressed feature and regressor design methods, and single source data-based methods, i.e., flame image features and process variables used individually. A robustness analysis of the proposed soft sensor model is given, where various image noises and image resolutions are considered. The experimental results demonstrate the positive potential of our soft sensor for real world applications.

In our proposed f-CaO content soft sensor model, the selection of the compressed feature vector is realized by the KPLS technique, which cannot guarantee the optimality of the feature combination. Moreover, the random weights assigned by the RVFL networks and the number of RVFL base models used in the ensemble model are not optimal either. Therefore, a further study on feature selection and ensemble model parameter optimization is anticipated. It has been noticed that several uncertainties exist in the data collected for building the soft sensor model. Thus, robust modeling techniques, such as neuro-fuzzy systems, should be explored in future studies. Also, it is strongly anticipated that we will build an f-CaO content-based rotary kiln closed-loop control system to maintain proper clinker quality and minimize the cost of energy.

ACKNOWLEDGMENT

The authors are grateful to the Editors and Reviewers for their constructive and detailed suggestions which truly helped us to substantially improve the quality of this publication.

This work is supported by National Natural Science Foundation of China (Grant No. 61305029), Natural Science Foundation of Anhui Province (Grant No. 1408085QF133), The Key Support Plan of Overseas Experts, China Postdoctoral Science Foundation (Grant No. 2013M541820), and Fundamental Research Funds for the Central Universities (Grant No. 2013HGBH0010, 2013HGQC0012).

REFERENCES

[1] B. Lin, B. Recke, J. K. H. Knudsen, and S. B. Jørgensen, “A systematic approach for soft sensor development”, Computers and Chemical Engineering, vol. 31, pp. 419-425, May 2007.

[2] G. Szatvanyi, C. Duchesne, and G. Bartolacci, “Multivariate image analysis of flames for product quality and combustion control in rotary kilns”, Ind. Eng. Chem. Res., vol. 45, no. 13, pp. 4706-4715, May 2006.

[3] A. K. Pani, V. K. Vadlamudi, and H. K. Mohanta, “Development and comparison of neural network based soft sensors for online estimation of cement clinker quality”, ISA Transactions, vol. 52, no. 1, pp. 19-29, Jan. 2013.

[4] B. Lin and S. B. Jorgensen, “Soft sensor design by multivariate fusion of image features and process measurements”, Journal of Process Control, vol. 21, no. 4, pp. 547-553, Apr. 2011.

[5] C. K. N. Kartik and S. Narasimhan, “A theoretically rigorous approach to soft sensor development using Principal Components Analysis”, Computer Aided Chemical Engineering, vol. 29, pp. 793-797, 2011.

[6] P. Kadlec, R. Grbic, and B. Gabrys, “Review of adaptation mechanisms for data-driven soft sensors”, Computers and Chemical Engineering, vol. 35, no. 1, pp. 1-24, Jan. 2011.

[7] V. N. Vapnik, The Statistical Learning Theory, New York: Wiley, 1998.

[8] Y. H. Pao and Y. Takefuji, “Functional-link net computing, theory, system architecture, and functionalities”, IEEE Comput., vol. 25, no. 5, pp. 76-79, May 1992.

[9] B. Igelnik and Y. H. Pao, “Stochastic choice of basis functions in adaptive function approximation and the functional-link net”, IEEE Trans. Neural Netw., vol. 6, no. 6, pp. 1320-1329, Nov. 1995.

[10] P. Sun, T. Y. Chai, and X. J. Zhou, “Rotary kiln flame image segmentation based on FCM and Gabor wavelet based texture coarseness”, in Proc. of the 7th World Congress on Intelligent Control and Automation, 2008, pp. 7615-7620.

[11] W. T. Li, D. H. Wang, and T. Y. Chai, “Flame Image-Based Burning State Recognition for Sintering Process of Rotary Kiln Using Heterogeneous Features and Fuzzy Integral”, IEEE Trans. on Industrial Informatics, vol. 8, no. 4, pp. 780-790, Nov. 2012.

[12] W. T. Li, K. Z. Mao, H. Zhang, and T. Y. Chai, “Designing compact Gabor filter banks for efficient texture feature extraction”, in Proc. of the International Conference on Control, Automation, Robotics and Vision, 2010, pp. 1193-1198.

[13] J. Yang, D. Zhang, A. F. Frangi, and J. Y. Yang, “Two-dimensional PCA: a new approach to appearance-based face representation and recognition”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 26, no. 1, pp. 131-137, Jan. 2004.

[14] D. G. Lowe, “Distinctive image features from scale-invariant keypoints”, International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, Jan. 2004.

[15] J. Sivic and A. Zisserman, “A video Google: a text retrieval approach to object matching in videos”, in Proc. of the 9th IEEE International Conference on Computer Vision, 2003, pp. 1470-1477.

[16] S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman, “Indexing by latent semantic analysis”, Journal of the American Society of Information Science, vol. 41, no. 6, pp. 391-407, Sep. 1990.

[17] R. Rosipal and L. J. Trejo, “Kernel partial least squares regression in reproducing kernel Hilbert space”, Journal of Machine Learning Research, vol. 2, pp. 97-123, Dec. 2001.

[18] P. Zhou, Cement Sintering Technology and Equipment, Wuhan University of Technology Press, 2004.

[19] R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast accuracy”, International Journal of Forecasting, vol. 22, no. 4, pp. 679-688, Oct. 2006.

[20] D. M. Glover, W. J. Jenkins, and S. C. Doney, Least Squares and Regression Techniques, Goodness of Fit and Tests, Non-linear Least Squares Techniques, Woods Hole Oceanographic Institute, 2008.

[21] A. K. Jain, R. P. W. Duin, and J. C. Mao, “Statistical pattern recognition: a review”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-37, Jan. 2000.

[22] F. Bianconi and A. Fernandez, “Evaluation of the effects of Gabor filter parameters on texture classification”, Pattern Recognition, vol. 40, no. 12, pp. 3325-3335, Dec. 2007.

[23] M. N. Nounou and B. R. Bakshi, “On-line multiscale filtering of random and gross errors without process models”, AIChE Journal, vol. 45, no. 5, pp. 1041-1058, May 1999.

[24] H. L. Yu and J. F. MacGregor, “Monitoring flames in an industrial boiler using multivariate image analysis”, AIChE Journal, vol. 50, no. 7, pp. 1474-1483, Jul. 2004.

[25] I. H. Witten and T. C. Bell, “The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression”, IEEE Trans. on Information Theory, vol. 37, no. 4, pp. 1085-1094, Jul. 1991.

[26] I. Tyukin and D. Prokhorov, “Feasibility of random basis function approximators for modeling and control”, in Proceedings of IEEE Multi-Conference on Systems and Control, 2009, pp. 1391-1396.

[27] S. F. McLoone and G. Irwin, “Improving neural network training solutions using regularisation”, Neurocomputing, vol. 37, no. 1-4, pp. 71-90, Apr. 2001.

[28] W. Schmidt, M. Kraaijveld, and R. Duin, “Feedforward neural networks with random weights”, in Proceedings of 11th IAPR International Conference on Pattern Recognition Methodology and Systems, 1992, pp. 1-4.

[29] M. Alhamdoosh and D. H. Wang, “Fast decorrelated neural network ensembles with random weights”, Information Sciences, vol. 264, pp. 104-117, Apr. 2014.

[30] D. F. Specht, “Probabilistic neural network”, Neural Network, vol. 3, no. 1, pp. 109-118, 1990.

[31] S. B. Park, J. W. Lee, and S. K. Kim, “Content-based image classification using a neural network”, Pattern Recognition Lett., vol. 25, no. 3, pp. 287-300, Feb. 2004.

[32] R. E. Shaffer, S. L. Rose-Pehrsson, and R. A. McGill, “A comparison study of chemical sensor array pattern recognition algorithms”, Analytica Chimica Acta, vol. 384, no. 1, pp. 305-317, Apr. 1999.

Weitao Li received his Ph.D. in Industrial Automation from Northeastern University in 2012.

From Sept. 2008 to April 2010, he worked as a research associate in the Department of Computing Science, University of Alberta, Canada. He is currently employed as a Lecturer and working in the College of Electrical Engineering and Automation, Hefei University of Technology, Hefei, China. His research interests include image processing and pattern recognition for control engineering.

Dianhui Wang (M’03-SM’05) was awarded a Ph.D. from Northeastern University, Shenyang, China, in 1995.

From 1995 to 2001, he worked as a Postdoctoral Fellow with Nanyang Technological University, Singapore, and a Researcher with The Hong Kong Polytechnic University, Hong Kong, China. He is currently a Reader and Associate Professor with the Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Victoria, Australia. He is also associated with The State Key Laboratory of Synthetical Automation of Process Industries, Northeastern University. His current research interests include data mining and computational intelligence systems for bioinformatics and engineering applications.

Tianyou Chai (M’90-SM’97-F’08) was awarded a Ph.D. in control theory and engineering from Northeastern University, Shenyang, China, in 1985.

He is the director of the State Key Laboratory of Synthetical Automation of Process Industries, Northeastern University. His current research interests include adaptive control, intelligent decoupling control, and integrated automation of industrial process.

Dr. Chai was elected as a member of the Chinese Academy of Engineering in 2003, an Academician of the International Eurasian Academy of Sciences in 2007, an IEEE Fellow in 2008, and an International Federation of Automatic Control Fellow in 2008.
