reviewarticle data mining techniques for wireless sensor ...since data mining is a broad discipline...

Hindawi Publishing CorporationInternational Journal of Distributed Sensor NetworksVolume 2013, Article ID 406316, 24 pageshttp://dx.doi.org/10.1155/2013/406316

Review ArticleData Mining Techniques for Wireless Sensor Networks: A Survey

Azhar Mahmood, Ke Shi, Shaheen Khatoon, and Mi Xiao

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

Correspondence should be addressed to Ke Shi; [email protected]

Received 18 February 2013; Revised 15 June 2013; Accepted 20 June 2013

Academic Editor: Marimuthu Palaniswami

Copyright © 2013 Azhar Mahmood et al. This is an open access article distributed under the Creative Commons AttributionLicense, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properlycited.

Recently, data management and processing for wireless sensor networks (WSNs) has become a topic of active research in severalfields of computer science, such as the distributed systems, the database systems, and the data mining. The main aim of deployingthe WSNs-based applications is to make the real-time decision which has been proved to be very challenging due to the highlyresource-constrained computing, communicating capacities, and huge volume of fast-changed data generated by WSNs. Thischallenge motivates the research community to explore novel data mining techniques dealing with extracting knowledge fromlarge continuous arriving data from WSNs. Traditional data mining techniques are not directly applicable to WSNs due to thenature of sensor data, their special characteristics, and limitations of the WSNs.This work provides an overview of how traditionaldata mining algorithms are revised and improved to achieve good performance in a wireless sensor network environment. Acomprehensive survey of existing data mining techniques and their multilevel classification scheme is presented. The taxonomytogether with the comparative tables can be used as a guideline to select a technique suitable for the application at hand. Based onthe limitations of the existing technique, an adaptive data mining framework of WSNs for future research is proposed.

1. Introduction

Advances in wireless communication and microelectronicdevices led to the development of low-power sensors and thedeployment of large-scale sensor networks.With the capabili-ties of pervasive surveillance, sensor networks have attractedsignificant attention in many applications domains, such ashabitat monitoring [1, 2], object tracking [3, 4], environmentmonitoring [5–7], military [8, 9], disaster management [10],as well as smart environments. In these applications, real-time and reliable monitoring is essential requirement. Theseapplications yield huge volume of dynamic, geographicallydistributed and heterogeneous data. This raw data, if effi-ciently analyzed and transformed to usable informationthrough data mining, can facilitate automated or human-induced tactical/strategic decision. Therefore, it is essentialto develop techniques to mine the sensor data for patterns inorder to make intelligent decisions promptly.

Recently, extracting knowledge from sensor data hasreceived a great deal of attention by the data mining com-munity. Different approaches focusing on clustering [11–14], association rules [15, 16], frequent patterns [17–20],

sequential patterns [21–23], and classification [24–26] havebeen successfully used on sensor data. However, the designand deployment of sensor networks creates unique researchchallenges due to their large size (up to thousands of sensornodes), random and hazardous deployment, lossy communi-cating environment, limited power supply, and high failurerate. These challenges make traditional mining techniquesinapplicable because traditionally mining is centralized andcomputationally expensive, and it focuses on disk-residenttransactional data. As a result, new algorithms have beencreated, and some of the data mining algorithms havebeen modified to handle the data generated from sensornetworks. A plethora of knowledge discoverymethodologies,techniques, and algorithms have been proposed during thelast ten years.

For example, a decent amount of work is done fordetection of the outlier in WSNs which is presented in [27–29]. Most of the techniques examined in [27, 28] heavilyrely on data mining techniques, but their focus is detec-tion of irregularities in WSNs data rather than informationextraction and analysis. A survey [29] presented the anomaly

2 International Journal of Distributed Sensor Networks

Table 1: Difference between traditional and sensor data processing.

Traditional data WSNs dataProcessing architecture Centralized DistributedData type Static DynamicMemory usage Unlimited RestrictedProcessing time Unlimited RestrictedComputational power High WeakEnergy No constraints LimitedData flow Stationary ContinuousData length Bounded UnboundedResponse time Non-real-time Real timeUpdate speed Low HighNumber of passes Multipass Single

detection in multiple domains using data mining as well asstatistical information theoretic and spectral techniques.

Since data mining is a broad discipline and can beapplied to any domain data, more general surveys on datamining techniques can be found in [30], where authorsexamined the machine-learning and data mining techniquesfor analyzing medical data. Since the classification of datamining techniques in this survey is based on frequent patternmining, clustering, and classification, there are plenty ofsurveys available on each of these techniques. For example,frequent pattern mining over data stream is presented in [31,32]. A survey on clustering algorithm for WSNs is presentedin [33, 34]. The clustering techniques examined in thosepapers exclusively focus on architecture and managementof network rather than information discovery. A survey onclassificationmethods over data stream is given in [35], wherethe author examined conventional classification techniquesover data streams.

However, none of the above surveys examined datamining techniques that focus on information extractionand analysis from WSNs data. In comparison with theabove-mentioned surveys, this paper examines algorithmsand approaches specially designed for WSNs data, not onlyleading to a different classification, evaluation, and discussionon different domains but also presenting different choices ofa solution. We examined how data mining algorithms will beutilized to make the sensor network applications intelligent.The research method consists of review of data mining tech-niques for WSNs such as frequent pattern mining, sequentialpatternmining, clustering, and classification. Problem-basedtaxonomy is presented to classify and compare existing datamining techniques adopted forWSNs. In addition, evaluationof each technique is presented. Based on the limitations ofexisting techniques and special characteristics of WSNs, weproposed a new hybrid data mining architecture for WSNs,which combines the offline learning with distributive andonline data processing.

The rest of the paper is organized as follows. After theintroduction in Section 1, how traditional data mining pro-cess is different with data mining process in WSNs andchallenges of data mining for WSNs data are discussed inSection 2. In Section 3, taxonomy of categorizing the existing

data mining techniques for WSNs is presented. In Section 4,we analyzed a collection of published studies using the taxon-omy framework. The comparison of data mining techniquesfor WSNs is presented in Section 5. The limitations of thiswork are given in Section 6, and future research directionsare presented in Section 7. Finally, the paper ends with theconclusion in Section 8.

2. Fundamentals of Data Mining in WSNs

2.1. Data Mining Process in WSNs. Data mining in sensornetworks is the process of extracting application-orientedmodels and patterns with acceptable accuracy from a contin-uous, rapid, and possibly nonended flow of data streams fromsensor networks. In this case, whole data cannot be stored andmust be processed immediately. Data mining algorithm hasto be sufficiently fast to process high-speed arriving data.Theconventional data mining algorithms are meant to handle thestatic data and use the multistep techniques and multiscanmining algorithms for analyzing static data-sets. Therefore,conventional data mining techniques are not suitable forhandling the massive quantity, high dimensionality, anddistributed nature of the data generated by theWSNs. Table 1shows the summary of difference between traditional dataand WSNs data mining process.

It can be observed from Table 1 that traditional data min-ing is centralized, computationally expensive, and focused ondisk-resident transactional data. It directly collects data at thecentral sitewhich is not bounded by computational resources.In comparison with traditional data-sets, the WSNs dataflows continuously in systems with varying update rates. Dueto huge amount and high storage cost, it is impossible tostore the entire WSNs data or to scan through it multipletimes. These characteristics of sensor data and the specialdesign issues of sensor networks make traditional datamining techniques challenging. Hence, it is crucial to developdata mining technique that can analyze and process WSNsdata in multidimensional, multilevel, single-pass, and onlinemanner.

2.2. Challenges. According to the following reasons, conven-tional data mining techniques for handling sensor data inWSNs are challenging.

(i) Resource Constraint. The sensor nodes are resourceconstraints in terms of power, memory, communica-tion bandwidth, and computational power. The mainchallenge faced by data mining techniques for WSNsis to satisfy the mining accuracy requirements whilemaintaining the resource consumption of WSNs to aminimum.

(ii) Fast and Huge Data Arrival. The inherent nature ofWSNs data is its high speed. In many domains, dataarrives faster than we are able to mine. Additionally,spatiotemporal embedding of sensor data plays animportant role in WSNs application. This may causemany classical data processing techniques to performpoorly on spatiotemporal sensor data. The challengefor data mining techniques is how to cope with the

International Journal of Distributed Sensor Networks 3

continuous, rapid, and changing data streams andalso how to incorporate user interaction during high-speed data arrival.

(iii) Online Mining. In WSNs, environment data is geo-graphically distributed, inputs arrive continuously,and newer data items may change the results basedon older data substantially. Most of data mining tech-niques that analyze data in an offline manner do notmeet the requirement of handling distributed streamdata. Thus, a challenge for data mining techniques ishow to process distributed streaming data online.

(iv) Modeling Changes of Mining Results Over Time. Whenthe data-generating phenomenon is changing overtime, the extracted model at any time should beup-to-date. Due to the continuity of data streams,some researchers have pointed out that capturing thechange of mining results is more important in thisarea than themining results.The research issue is howto model this change in the results.

(v) Data Transformation. Since sensor nodes are limitedin terms of bandwidth, transforming original dataover the network is not feasible. Knowledge structuretransformation is an important issue. After extractingmodel and patterns locally from WSNs data, theoutput is transferred to the base station.The challengefor data mining technique is how to efficiently rep-resent data and discovered patterns over network fortransmission.

(vi) DynamicNetwork Topology. Sensor network deployedin potentially harsh, uncertain, heterogenic, anddynamic environments. Moreover, sensor nodes maymove among different locations at any point overtime. Such dynamicity and heterogeneity increase thecomplexity of designing an appropriate data miningtechnique for WSNs.

To address these challenges, researchers have modifiedthe conventional data mining techniques and also proposednew data mining algorithms to handle the data generatedfrom sensor networks. In the following section we haveprovided the taxonomy of these data mining techniquesbased on the discipline from which they adopt their ideas.

3. Taxonomy of Data Mining Techniquesfor WSNs

In this section, a classification scheme for existing approachesdesigned for mining WSNs data is presented. The highest-level classification is based upon the general data miningclasses used such as frequent pattern mining, sequential pat-tern mining, clustering, and classification. Most of the frequentpattern mining and sequential pattern mining approacheshave adapted the traditional frequent mining techniquessuch as the Apriori and frequent pattern (𝐹𝑃) growth-basedalgorithms to find the association among large WSNs data.Cluster-based approaches have adapted the K-mean, hier-archical, and data correlation-based clustering, based upon

the distance among the datapoint, whereas, classification-based approaches have adapted the traditional classificationtechniques such as decision tree, rule-based, nearest neighbor,and support vector machines methods based on type ofclassification model that they used. These algorithms havevery different and distinct roles; therefore, in order to choosethe algorithm forWSNs application, one has to decide in termof these top-level classes.

The second level of classification is based upon eachapproach’s ability to process data on centralized or distributedmanner. Since WSNs nodes are limited in terms of resourcesuch as power, computation, bandwidth, and memory, there-fore, the approach meant for distributed processing requiresone-pass algorithms to complete a part of data mininglocally and then aggregate the results. The objective to usethe distributed approaches is to limit the messages andcommunication energy of sensor nodes while transferringdata to central server. It also helps to improve the WSNs life-time and can extract maximum data from the environment,whereas, the centralized processing data from entire networkis collected and stored at central server for analysis. Sincethe central server is rich in resources, therefore, there are nosuch constraints for choosing the accurate algorithm. Thisapproach is always discouraged for the researchers becauseit generates huge amount of dataflow and communicationwhich can create bottlenecks and wastage of communicationbandwidth. These two data processing/storage architectureshave a large impact on type of data mining algorithm tochoose; therefore, one has to decide the processing\storagearchitecture for choosing the data mining algorithm forWSNs application.

The third level of classification is selected according tothe attitude towards solving a specific problem. Researchin WSNs area has focused on two separate aspects ofissues, namely, WSNs performance issues and applicationissues. As WSNs nodes are usually resource constrainedsuch as energy, communication bandwidth, memory, andresource, aware algorithms are needed to maximize theWSNs performance. On the other hand, a WSNs applicationrequires data precision and accuracy, fault tolerance, eventprediction, scalability, and robustness, and it often needsabundant use of energy, communication, and redundancies.This leads to resource tradeoff: whether someone sacrificesthe application’s performance in favor of network efficiency orwants to get the best application performance and deal withthe network resource issues such as energy in some other way(larger battery; renewable sources with the nodes). For thisreason, WSNs performances or application-specific-orientedapproaches have been selected as the lowest-level classifica-tion criteria. The taxonomy of data mining techniques forWSNs is presented in Figure 1.

4. State of the Art of Data Mining Techniquesfor WSNs

In this section, data mining techniques designed for WSNsare classified using the taxonomy framework presented inSection 3, and the characteristics and performance analysisof each technique is discussed.


Data mining techniques for WSNs

ClassificationClusteringSequential miningFrequent mining

Distributed Centralized Distributed Centralized Distributed Centralized Distributed Centralized

WSN performance

WSN performance

WSN performance

WSN performance

WSN performance

WSN performance

WSN performance

Application based

Application based

Application based

Application based

Application based

Application based

DSARMCARM

Distributed data aggregation

Association rules mining framework

Online algorithm Lightweight rule

learning

MPGPTSP

Relational frameworkEpisode discovery

Contextual patterns discovery

Pattern learner MSAP

DCC

Prediction model CAG

Clustering sensory data Attribute-based clustering

DHCS

EEDC

Prediction framework FVLD

online learning

Person identification algorithms

NNTC Fuzzy predictor model

LWClass

SP-treeH-cluster

In-network datamining

TMP-mine One-class quarter-sphere SVM

Figure 1: Taxonomy of data mining techniques for sensor networks.

4.1. Frequent Pattern Mining. In this section, we review someof the works that have been proposed for mining frequentpatterns from WSNs data. Frequent pattern mining is usedto find the group of variables that co-occur frequently inthe data-set. The aim is to find the most interesting relationsbetween variables. Traditional frequent pattern mining algo-rithms [36–39] are the CPU and the I/O intensive, making itvery expensive to mine dynamic nature of WSN data. Unlikethe mining static database, dynamic nature of WSNs data ledto the study of online mining of frequent itemset. As a result,traditional frequent pattern mining algorithms are modifiedaccording to nature of WSNs data.

The basic frequent pattern mining technique is associ-ation rule mining technique. The first known associationrule mining algorithm is Apriori [40]. It is based on level-wise candidate generation and test methodology by makingseveral scans over database. In each iteration, the patternsfound to be frequent are used to generate possible frequentpatterns (the candidates) to be counted in the next iteration.Therefore, theApriori technique finds the frequent patterns oflength 𝑘 from the set of already generated candidate patternsof length 𝑘 − 1. In the subsequent step, the association rulesare generated by computing the support and confidence ofeach frequent item in given database 𝐷 which is defined asfollows:

Support (𝐴) =Sup (𝐴)𝐷, (1)

where Sup(𝐴) is the number of occurrence of 𝐴 in database𝐷. Consider the following:

Confidence (𝐴 → 𝐵) =Sup (𝐴 ∪ 𝐵)Sup (𝐴)

. (2)

This is impractical in the context of sensor networksas it implies that all data has to be stored somewhere.However, recently, there has been a growing amount of workon discovering frequent item-sets from a data stream oftransactions such that every transaction is considered onlyonce and can be deleted afterwards.

The other basic approach from mining association ruleis FP-growth [41] which can discover frequent patterns byreducing the database scans by two and eliminating therequirement of candidate generation as compared with Apri-ori. With the first database scan, the algorithm finds theset of distinct items with respective support count (i.e.,frequency) in the database. Then, with the second database,scan the algorithm summarizes the database in the form ofa frequency-descending tree (i.e., the FP-tree). The completeset of frequent patterns is, then, mined from the FP-treeby recursively applying a divide-and-conquer-based patterngrowth approach, called the FP-growth algorithm, withoutadditional database scan. The highly compact FP-tree struc-ture introduced a new wing of research in mining frequentpatterns. However, the static nature of the FP-tree and twodatabase scans still limit its applicability to frequent patternmining over a WSNs data. Recently, several centralized and


distributed solutions have been proposed with the aimto maximize the WSNs’ performance and maximize theapplication-based performance by applying Apriori-like andFP-growth methods over WSNs data.

4.1.1. Centralized Approaches Aim to SolveWSNs’ Application-Based Issues. Halatchev and Gruenwald [42] proposed acentralized methodology called data stream association rulemining (DSARM) to identify the missing sensor’s readings. Ituses the association rulemining algorithm to identify sensorsthat report the same data for a number of times in a slidingwindow called related sensors and then estimates the missingdata from a sensor by using the data reported by its relatedsensors. Due to the stream nature of sensor data, applyingan association mining algorithm such as Apriori directly tosensor data is not possible. This situation led the authorsto propose the DSARM framework that adapts the Apriorialgorithm to make it applicable to the data stream receivedfrom sensor nodes.This technique is evaluated by simulationexperiments on real data collected by the Department ofTransportation in Austin, TX, USA, to estimate missingvalue in related data streams. Performance evaluations wereconducted to compare DSARM and alternative approaches.The results show that DSARM requires more memory spaceand takes longer to produce estimation than the consideredalternative approaches; it achieves better accuracy of theestimated value than the alternative approaches do. However,there exist some limitations in DSARM. First, it is basedon two frequent itemsets association rule mining, whichmeans that it can discover the relationships only between twosensors and ignore the cases where missing values are relatedwith multiple sensors. Second, it finds those relationshipsonly when both sensors report the same value and ignoresthe cases where missing values can be estimated by therelationships between sensors that report different values.

Jiang and Gruenwald [43, 44] proposed a data estimationtechnique called CARM (closed item-sets-based associationrule mining), which can derive the most recent associationrules between the sensors in the current sliding window. Thetechnique is based on the closed frequent item-sets miningalgorithmof data streams calledCFI-stream [45]. Itmaintainsan in-memory data structure called direct update (DIU) treeto store closed item-sets. When a new transaction arrives,the algorithm checks each item-set in the transaction over adata stream slidingwindowonline and incrementally updatesthe closed item-sets’ support. If CRAM found some missingvalues in sensor reading, instead of generating all possibleassociation rules, it generates the rules that have strongrelationships with the current round of sensor readingswhereone or more readings are missing. Based on these rules andselected closed item-sets, CRAM generates the estimatedvalues which contain item values that are not included inthe original readings. Figure 2 redrawn from [43] shows theDIU tree after receiving first four transactions. It shows thatcurrently there are four closed item-sets: C, AB, CD, andABCin the DIU tree, and their associated supports at the right-upper corner are 3, 3, 1, and 2. A basic set of rules is generatedfrom these frequent item-sets. All other rules can be inferredfrom this basic rule set.

Φ

CDTim

eline

TID

1

2

3

4

Items

C, D

A, B

A, B, C

A, B, C

AB 3

C 3

ABC 2

Figure 2: Lexicographical-ordered direct update tree.

4.1.2. Centralized Approaches Aim to Maximize WSNs’ Per-formance. Loo et al. [46] have proposed online one-passalgorithms for mining large sensor streams. They mine thefrequent value set from sensor stream data by transformingthe stream data into interval list (IL) under lossy countingframework [47]. The time is divided into equal-size intervaland snapshot from the sensor reading is taken when there isan update on sensor reading. Sensors’ value at that snapshotconstructs the value sets stored in database. An Apriori-based strategy is used to mine the value sets. The analysisof IL-based presentation of stream data showed favorableresults using synthetic data-set. However, while computingthe IL of candidate value set, redundant intersection ofIL is inevitable, which affects the performance in termsof time and computation cost. The proposed technique isevaluated by comparing the performance of ILB againstan application of lossy counting (LC) using a weightedtransformation method on synthetic dataset. According totheir experiments, ILB outperforms LC significantly for largesensor networks. Moreover, both the processing time andmemory consumption of ILB are more stable than those ofLC.

Chong et al., [48] proposed a rule-learning model thatfinds strong rules from sensor readings. The rules are used asa trigger to control sensor network operations; for example,they can be used to sleep sensor or reduce data transmissionto conserve energy. To mine the rules, Apriori is modified tocount the number of transactions that are frequent insteadof the item-sets within transactions, and transactions areprocessed in batches 𝑏

1, 𝑏2, . . . , 𝑏

𝑋. Suppose, there is node

𝑀 that collects light, temperature, and microphone readingfrom three other sensor streams 𝑆

0, 𝑆1, and 𝑆

2. Initially, 𝑀

is queried to collect all sensory values, it is used to generatea rule of the form of 𝑎

𝑛which implies 𝑎

𝑛−1; therefore, the

rule is extracted and only 𝑎𝑛is sent to the base station. Upon

receiving the reading 𝑎𝑛and utilizing knowledge of the rule,

the reading of 𝑎𝑛−1

can be inferred. All extracted rules arestored in rule repository. The proposed method is validatedby using simulation implemented in C language on syntheticdataset. In the experiment, the first correlated data receivedfrom sensor is used to extract rules. For subsequent phase,these rules are used to infer reading of sensor for the nextround.

Tanbeer et al. [49] proposed a tree-based data structurecalled sensor pattern tree (SP-tree) to generate association


rules from WSNs data with one database scan. The mainidea of the proposed approach is to obtain the frequencyof all event-detecting sensors’ data, construct a prefix-treebased on that in any canonical order, and then reorganizethe tree in a frequency descending order. Through thereorganization, the SP-tree canmaintain the frequently event-detecting sensors’ nodes at the upper part of the tree, whichin turn provides high compactness in the tree structure.Once the SP-tree is constructed, FP-growthmining techniqueis applied to find the frequent event-detecting sensor sets.Experiments are performed to verify the improvement inmemory consumption and runtime that SP-tree achieves overPLT [50]. The experiments show that SP-tree outperformsPLT in time and memory consumption. The reason of suchgain is two folds: first, the PLT construction requires twodatabase scans, while SP-tree constructs the tree by scanningthe database only once; second, the mining phase of SP-tree is highly efficient due to the frequency-descending treestructure.

4.1.3. Distributed Approaches Aim to SolveWSNs’ Application-Based Issues. Romer [51] proposed an in-network data min-ing technique to discover frequent patterns of events withcertain spatial and temporal properties. In this approach, userspecifies the upper boundmaxscope andmaxhistory (variableto be measured in seconds) for the patterns of interest. Thesensor collects these events and applies amining algorithm todiscover the pattern that satisfies the given parameters. Eachnode in the network collects the events from its neighborswithin themaximum scope and keeps a history of their eventsfor duration of the maximum history. After that, each nodeapplies a mining algorithm to discover the local frequentpatterns. The resulting frequent patterns are converted toassociation rules that describe an event of type 𝐸 that occursat node 𝑛 with support 𝑆 and confidence 𝐶. Local patternsare sent to the sink where secondary mining is performed tocompute the global picture of entire network. The algorithmis implemented on BT node (bluetooth radio) platform [52],and the tradeoff between scope of the query and resourceconsumption on real dataset is evaluated. Results show, byreducing the scope of the query, that the proposed approachcould decrease resource consumption. Major issues in thisapproach are memory consumption of itemset discoveryalgorithms and the communication overhead of event collec-tion.

4.1.4. Distributed Approaches Aim to Maximize WSNs’ Perfor-mance. Boukerche and Samarah [15] presented a distributeddata extraction methodology to aggregate the data on sensornode which reduced the number of messages during trans-mission. The distributed solution sends some parameterssuch as support, time-slot size, and historic period from sink toall nodes within network. Each sensor node has its own bufferentry to set the support value. After each time slot, nodescheck whether there are messages received during this timeslot; if yes, then that node will set its buffer entry. When thehistoric period ended, each node will traverse its buffer; if thenumber of set value is more than or equal to support value

provided initially, then the message would be transfered tosink. To evaluate the validity of the distributed approach, it iscompared with the centralized methodology on real dataset.They conducted two experiments using historical periods of 5and 10 days with minimum support values ranging from 10%to 90% and a time-slot size equal to 30 seconds. All of thereported results show a reduction in the number of messagesand the data sizewhile increasing in the support values.Majorissues in thismethodology are increase in cost for node bufferand also delay in crucial messages in case of high supportvalue.

Boukerche and Samarah [50] proposed the positionallexicographic tree (PLT) structure for mining associationrules in which the event-detecting sensors are the mainobjects of the rules regardless of their values. Similar to theFP-growth approach, PLT follows a pattern growth miningtechnique. The mining begins with the sensor having themaximum rank by generating the frequent patterns from itsPLT in a recursive way. The computation is required at eachrecursion to update the PLT involved in the prefix part ofa pattern. Therefore, two database scans requirement andthe additional PLT update operations during mining limitthe efficient use of this approach in handling WSNs data.The performance evaluation is done by comparing the PLTstructure with the FP-growth algorithm. According to theirresults, PLT structure outperforms FP-growth in terms ofCPU time and memory usage for all of the support valuesused; the enhanced performance using PLT when comparedwith FP-growth ranges from 30 percent to 50 percent.

4.2. Sequential PatternMining (SPM). Frequent patternmin-ing has been extended to find more complex structuresuch as sequential pattern mining. It discovers frequentsubsequences as patterns in a sequence database. A sequencedatabase stores a number of records, where all records aresequences of ordered events, with orwithout concrete notionsof time. A large number of real-world domains such as userprofiling, medicine, local weather forecast, and bioinformat-ics show an inherent tendency to be modeled by means ofsequences of events/objects related to each other. This greatvariety of applications of sequential pattern mining makesthis problem one of the central topics in WSNs data miningas shown by the research efforts produced in the recent years.The sequential pattern mining techniques in sensor networkbased on either traditional sequential mining algorithmssuch as Apriori-like algorithm [53], Apriori-based methods:GSP [54] PSP [55], and pattern growth approaches: FreeSpanand PrefixSpan [56, 57] or some new algorithm are devisedspecifically to work with sensor network environment.

4.2.1. Centralized Approaches Aim to SolveWSNs’ Application-Based Issues. Esposito et al. [58, 59] presented a multi-dimensional relational sequence mining framework to iden-tify the hidden frequent temporal correlations betweensensor nodes. The algorithm is based on generic level-wise search method called APRIORI [60] for discoveringcorrelated sensors. The framework exploits the relationallanguage to describe the temporal evolution of a sensor


network along with contextual information by working intwo phases. Firstly, an abstraction step is to segment andlabel the real-valued time series into similar subsequencesby using a kernel density estimator approach. Then, theknowledge is enriched by adding interval-based operatorsbetween the subsequences obtained in the discretization step,and the relation pattern mining algorithm has been extendedin order to deal with these new operators. By taking intoaccount the interval-based temporal data along with contex-tual information about events, it discovers interesting andmore human-readable patterns. The framework is evaluatedon real dataset collected from a wireless sensor networkmade up of 54 Mica2Dot [61] sensors deployed in the IntelBerkeley Research Lab [62]. Each sensor collected topologyinformation, along with humidity, temperature, light, andvoltage values once every 31 seconds. Results show the strongcorrelation among some measurements, which is useful foranomaly detection.

Cook et al. [21] present MavHome smart home archi-tecture which focuses on the creation of an intelligenthome, perceiving the state of the home through sensors andacting upon the environment through device controllers. Animportant characteristic of the proposed architecture is theability to make decisions based on predicted activities. Topredict the activities, an algorithm called episode discovery(ED) is proposed, which is based on the work of Srikantand Agrawal [54] for mining sequential patterns from time-ordered transactions. Values that can be predicted include theusage pattern of devices in the home, the movement patternsof the inhabitants, and the typical activities of the inhabitants.They utilize prediction algorithms on action sequences storedin inhabitant event history to forecast user actions. Actionscan then be automated based on the significance of minedpatterns as well as the predictive accuracy of the next event.A key disadvantage is the fact that the entire action historymust be stored and processed off line, which is not practicalfor large prediction tasks over a long period of time. Cook etal. demonstrated the effectiveness of MavHome on syntheticsmart home data and real data collected by students usingX10controllers in their homes. Experiments show a predictiveaccuracy as high as 53.4% on the real data and 94.4% on thesynthetic data.

Rabatel et al. [22] presented a strategy to detect anomaliesfrom sensor data to improve the railway maintenance. Theyextract sequential pattern from real railway data and identifythe abnormal behavior. Based on these abnormal findings,alarms are automatically triggered to notify potential fail-ures. This abnormal behavior depends on environmental(weather conditions, travel characteristics) and structural(route, episode index in the route) changes in data. ThePSP [55] algorithm has been used to identify the sequentialpatterns. To tackle the environments conditions, a contextualknowledge-based method is proposed, which is able toprovide information on the seriousness and possible causesof a deviation. The proposed technique helps in proactivemaintenance of train. However, real-time context can beimproved by providing precise and exact information foranomaly detection.

a q kTq,kTa,q

Figure 3: Example of sequential alarm pattern.

Guralnik and Haigh [23] use sequential pattern miningto learn typical behaviors of humans in their homes. Humanbehavior is inferred by using motion sensors, pressure pads,door latch sensors, and toilet flush sensors. They installed10–20 sensors of different types in a home and built modelsof what sensor firings correspond to what activities, in whatorder, and at what time. For example, “In 60% of the days,the Kitchen-Motion sensor fires between 18h00 and 18h30,and then the Living-Room-Motion sensor fires between18h20 and 20h00, and then the Bedroom-Motion sensor firesbetween 19h45 and 22h00.”Their algorithm uses these data tolearn the sequences of rooms in which the person was acting,and it uses domain knowledge to extract the sequences ofrooms the person was acting in. These sequences are thenanalyzed by a human expert to identify complex behaviormodels. These models can be used to select the appropriateresponse plan to the action of elderly.

Wu et al. [63] proposed a new algorithm for miningsequential alarm patterns (MSAPs) from the alarm datagenerated by GSM system. Sequential events are identifiedfrom alarm data by defining time interval between adjacentevents. For example, if time is set as six hours, then thesequential alarm pattern (𝑎, 𝑏, 𝑐) indicates that 𝑎, 𝑏, and 𝑐happen in order and that the time interval between 𝑎 and𝑏 and between 𝑏 and 𝑐 is less than six hours. An exampleof sequential alarm sequence redrawn from [63] is shown inFigure 3.

The number in circle represents the error ID, and 𝑇𝑎,𝑞

denotes the time difference between alarm event 𝑎 and alarmevent 𝑞. The knowledge extracted is not only useful foridentifying relevance between two events, but it is also predictthe alarm sequence and takes proper steps to prevent theoccurrence of the alarms if at all possible. For example, if thenetwork operator detects that, the alarm 𝑎 occurring at time 𝑡operator should dissipate this alarm before the time 𝑡+𝑇

𝑎,𝑞to

alleviate the abnormal situations incurred. The limitation inthis technique is that it cannot discover other possible time-interval patterns between the events.

It is observed that there is none of centralized solutionswhich aim to maximize the WSNs’ performance.

4.2.2. DistributedApproaches Aim to SolveWSNs’ Application-Based Issues. Tseng and Lu [64] proposed an object trackingstrategy named themultilevel object tracking (MLOT) to dis-cover sequential patterns in object tracking sensor networks(OTSNs) by mining the movement log in sensor networks. Amultilevel hierarchical structure is adapted by using the clus-tering mechanism that represents the hierarchical relationsamong sensor nodes to achieve the goal of keeping track ofmoving objects in a real-time manner. The movement logsof the moving objects are analyzed by developing the data


mining algorithm movement pattern generation (MPG) toobtain themovement patterns, which are then used to predictthe next position of a moving object and to activate the leastsensor node. The MPG is based on Apriori which uses thefrequency of the inference pattern to evaluate the confidenceof the pattern and which with the highest frequency serves asthe basis of the prediction.

4.2.3. Distributed Approaches Aim to Maximize WSNs’ Per-formance. Tseng and Lin [65] proposed an object trackingstrategy named TMP-mine to discover sequential patternsin object tracking sensor networks (OTSNs) by mining thetemporal movement patterns (TMPs) logs. The discoveredtemporal movement rules (TMRs) are used to predict thelocation of next objects for saving energy. In the proposedmodel object is able to record the sensor nodes it visitedalong with the arrival time at each node.Themovement log iscollected by equipping the sensor nodes with storage devices.TheWSN collects and integrates themovement log ofmovingobjects. The integrated movement log is used as the input tothe data mining method named the TMP-miner which usesthe pattern growth approach for discovering the TMPs. Byapplying the TMP-mine algorithm, the TMPs are discovered,and then the temporalmovement rules (TMRs) are generatedfor predicting next location of moving object. Suppose thatthe following two rules are discovered by vehicle trackingsystem:

Rule 1. (Station A → interval 10min → Station B →interval 5min → Station C).

Rule 2. (Station A → interval 20min → Station B →interval 5min → Station → D).

By dispatching these rules to the corresponding sensornodes, the tracking can be made in energy-efficient way. Forexample, if a car moves with the pattern as (Station A →interval 10min → Station B → interval 5min) that matcheswith Rule 1, then the node in Station B has only to activatethe node in Station C rather than that in Station D or thosearound Station B.

Samarah et al. [66] proposed an energy-efficientprediction-based tracking technique by using the sequentialpatterns (PTSPs). This technique helps to predict the futurelocation of a moving object with the minimum number ofsensor nodes while keeping the other sensor nodes in thenetwork in sleep mode. The PTSP is based on the inheritedpatterns of the objects movements in the network and theutilization of sequential patterns to predict in which sensornode the moving object will be heading next.

4.3. Clustering. Clustering is unsupervised learning, wheregiven data is categorized into subsets so that each subsetrepresents a cluster which has distinctive properties. It hasbeen considered a useful technique especially for applicationsthat require scalability to large number of sensor nodes.Clustering also supports aggregation of data in order tosummarize the overall transmitted data.

ClustersInput sensor data

Feedback

Identification ofdata correlation Grouping data

Figure 4: Data clustering for sensor networks.

In the current literatures, problems related to clusteringare addressed by node clustering or data clustering. Recently,large numbers of node clustering algorithms have beendesigned for WSNs [67–83]. These clustering techniqueswidely vary in their objectives depending on the node deploy-ment and bootstrapping schemes, the pursued networkarchitecture, the characteristics of the cluster head (CH),and the network operation model. Although node clusteringmay be related to data clustering, for example, consideringdata similarity of neighboring node, many popular nodeclustering algorithms that partition the sensor nodes into anumber of small groups and elect a cluster head for everygroup do not use the data mining techniques directly. In thisstudy, we only focus on data clustering techniques to efficientdata mining and find data correlations among the nodes.Figure 4 shows the commonly used data clustering in datamining process.

This work adapted the K-mean, hierarchical, and datacorrelation-based methods. The k-mean algorithm takes theinput parameter, k, and partitions a set of 𝑛 objects into kclusters so that the resulting intracluster similarity is high,but the intercluster similarity is low. Cluster similarity ismeasured with respect to the mean value of the objectsin a cluster. Hierarchical method creates a hierarchicaldecomposition of the given set of data objects. It works bygrouping data objects into a tree of clusters, whereas, datacorrelation-based clustering forms clusters based on spatialand temporal correlations with similar node sensory valueswithin a given threshold, and these clusters remain fixeduntil the sensory value threshold has changed over time.When the threshold values change, the related sensor nodeswill then communicate with neighboring nodes associatedwith other clusters to change their cluster memberships. Thedrawback of this type of clustering is that it does not considernode residual energy. It is observed from the survey that thecentralized and distributed clustering solutions are aim tomaximize the WSNs performance.

4.3.1. Centralized Approaches Aim to Maximize WSNs’ Per-formance. Liu et al. [84] proposed a centralized graph-basedenergy-efficient data collection (EEDC). EEDC is on-demandclustering algorithm that clusters node into groups such thatmembers have similar sensor readings, and thus the protocolclusters the network with an awareness of the phenomenabeing sensed. EEDC is a centralized approach where thesink compares data from different nodes with a user-defineddissimilarity measure. EEDC models the cluster creationprocess as a clique-covering problem by constructing a graph𝐺 such that each sensor node is a vertex in the graph. An edge(𝑢, V) is drawn if the dissimilarity measure between vertex𝑢 and vertex V is less than or equal to the given intracluster


dissimilarity measure thresholdmax dst. A cluster is a cliquein the graph, and the clustering problem uses the minimumnumber of cliques to cover all vertices in the graph. Thisprocess minimizes the number of clusters and maximizes theenergy saving. The sink also dynamically adjusts the clustersbased on spatial correlation and the received data from thesensors. The algorithm produces robust and well-balancedclusters. However, due to centralized processings it is notsuitable for large-scale WSNs.

4.3.2. Distributed Approaches Aim toMaximizeWSNs’ Perfor-mance. Guo et al. [85] proposed the H-cluster, a distributedalgorithm to cluster sensory data.The input of this algorithmis the set of sensory data collected by all of the sensorsfrom the time WSN starts working up to the current time.The output of the algorithm is a set of cluster featuresthat summarize the clusters of the input sensory data-set.Hilbert-Map mapping algorithm has been used to map ad-dimensional sensory data space into a 2-dimensional areacovered by a given WSN. H-cluster has 2 phases: (1) itmerges connected grid features with local cluster featuresof (sensory dimensional) D at each destination node; (2)it combines the connected local clusters to global clusters.The experiments on the centralized and distributed dataare carried out to compare the H-Cluster with C-Cornerand C-Center algorithms. During experiment, four types ofenvironment attributes are sensed by the sensors, which aretemperature, humidity, light, and voltage. The results showthatH-Cluster algorithm ismuch efficient in data loss, energy,and the quality of cluster data in small WSN.The results alsoshows that as the amount of sensory data delivered increasesthe amount of data loss also increases and energy efficiencydecreases by increasing the size of WSNs.

Yeo et al. [86] proposed data correlation-based clusteringscheme (DCC) based on similarity of sensor data along aspatial suppression scheme which helps to reduce the datasize. DCC enhances the advertisement phase of HEED [71]in which cluster heads are selected according to probabilityof becoming a cluster head; during this phase, sensor nodescommunicate with each other, and the resulting clustersare organized by sensor nodes which have similar readings.Spatial suppression is performed on cluster head, and italso computes the difference between sensor reading andrepresentative value. If a cluster head has redundant data,it will remove it except for the node identification. Theexperimental results justify the hypothesis claim that theclustering based on data correlation has better compressionperformance than ordinary clustering based on locality ofcommunication, they show that DCC reduces 40% of datasize through suppression and prolongs network lifetime20%–30%. However, for the large-scale network applications(nodes > 500), DCC is inefficient because each cluster headneeds more energy to collect similar data readings and alsoto communicate with several nodes. Also in case of lowpercentage of similar data reading, DCC is ineffective due tohigher rate of cluster head creation.

Beyens et al. [87] proposed a cluster-based architecturefor wireless sensor networks in which cluster heads spa-tiotemporally correlate and predict the measurements of the

cluster members by executing their prediction model. Intheir approach, the cluster heads execute a prediction model,while gateway nodes at the circumference of the clusters areresponsible for the routing task. Prediction model is used toselect a suitable node of the cluster to be activated. The ideais to put a sensor node to sleep when there are no objects inits sensing region.

Yoon and Shahabi [88] present the clustered aggregation(CAG) algorithm that forms clusters of nodes sensing similarvalues within a given threshold (spatial correlation), andthese clusters remain unchanged as long as the sensor valuesstay within a threshold over time (temporal correlation).By grouping nodes on similar values, CAG only transmitsone reading per group. When the threshold values change,the related sensor nodes will then communicate with neigh-boring nodes associated with other clusters to change theircluster memberships. CAG guarantees the result to be withina user-specified error-tolerance threshold. Cluster formationis performed while queries are disseminated to the network(query phase), where clusters group nodes sensing similarvalues. Subsequently, CAG enters the response phase whereinonly one aggregated value per cluster is transmitted up theaggregation tree. CAG is a lossy clustering algorithm (mostsensory readings are never reported) which trades a lowerresult precision for a significant energy, storage, computation,and communication saving.

Taherkordi et al. [67] proposed a communication-efficient distributed protocol for clustering sensory data.A distributed version of 𝐾-Mean clustering algorithm isproposed and sends summarized data towards sink whichreduces the communication transmission, time, and powerconsumption of sensor nodes. The sensor network is dividedinto clusters and cluster head node will only communicatewith sink. Initially, base station transmits current centerlocations to cluster heads. Cluster head collects data fromits sensor node and sends it to the base station includingcount and vector sum of its local sensory data points aswell as sum of the squared distance from each local pointto its center. On receiving data from CH, the base stationupdates the cluster mean, and the algorithm repeats until thefunction convergence is met. The efficiency of the algorithmis evaluated via simulations. Several programs are run to getthe average number of transmissions over the network duringeach test. According to results, the communication cost isindependent of the number of sensors (𝑁) and increaseslinearly by increasing the number of centers. Major issuesare extra memory for cluster head and computation powerfor summarization of data before transmitting to sink. Alsothe algorithm requires multiple rounds of message passingbetween cluster heads and the base station; this may have aserious effect on communication efficiency when the numberof sensors is relatively high.

Wang et al. [89] promoted the idea of clustering theWSNs based on the queries and attributes of the data. Themain motive is to achieve efficient dissemination of data inthe network. The concept resembles the data-centric designmodel of WSNs. The clustering is established by mappinga hierarchy of data attributes to the network topology. Thebase station starts the clustering process by asking nodes


Class label (Y)

Attribute set (X)

OutputInput Classification model

Figure 5: Classification maps input attribute set (X) to class label(Y).

to form clusters. Those nodes that hear the request decidewhether they should nominate themselves as CHs basedon their energy. After receiving the base-station request,sensor nodes having intention to become CHs wait for arandom time period that is based on the remaining batterysupply. If a node nominates itself, then it broadcasts anannouncement to all nodes. A node joins the CH that itcan reach over the least number of hops. Upon hearing aCH announcement from a node whose attribute is different,the recipient node establishes a new cluster for that attributeand becomes a CH. To evaluate the attribute-based clusteringscheme, the authors have provided the theoretical analysis ofit with flooding-based schemes. Analysis shows its attribute-based clustering scheme yield that gains over flooding-basedschemeswhen there are subregions in the sensor network thatare more targeted than others, that is, when the distributionof inquiries is not uniformly distributed over time and space.

Ma et al. [90] the proposed distributed, hierarchicalclustering and Summarization algorithm (DHCS) for onlinedata analysis and mining in sensor networks. The proposedmethod clusters sensor nodes based on their current datavalues aswell as their geographical proximity, and it computesa summary for each cluster. The algorithm adopts severaltechniques, such as difference and hop count thresholds, tomodel node, and distance-based clustering. Initially, eachnode treats itself as an active cluster. Then, similar adjacentclusters are merged into larger clusters round by round. Ineach round, each cluster will try to combine with its mostsimilar adjacent cluster simultaneously. Two clusters can bemerged only if both consider one another as the most similarneighbor. DHCS terminates when no merging happens anymore. The final clusters, which cannot be merged any more,are called steady clusters.

4.4. Classification. Classification is a task of assigning newobject into a class of predefined object categories. Classifi-cation model is learned using the set of training data andclassifies new data into one of the learned class. Figure 5shows that classification maps input attribute set (X) to classlabel (Y).

Classification-based approaches have adapted the tra-ditional classification techniques such as decision tree-based, rule-based, nearest neighbor-based, and support vectormachines-based techniques based on type of the classificationmodel that they used. Decision tree is a classifier in the formof tree and classifies the instance by starting at the root oftree and moving through it until a leaf node where class labelis assigned. The internal nodes are used to partition datainto subsets by applying test condition to separate instancesthat have different characteristics. Nearest neighbor-basedapproaches classify dataset based on closet training examples.

The training examples are vectors in a multidimensionalfeature space with corresponding class labels. A nearestneighbor classifier is a lazy learner that does not processpatterns during training [91]. To respond, a request to classifya query vector is made to locate the closest training vectorsaccording to the distance metric.The classes of these trainingvectors are used to assign a class to the query vector.

Rule-based classifier groups the dataset in predefinedclasses by using “if. . .then. . .” rules of following form:

(Condition) → Y: condition is a conjunction ofattribute, and Y is a class label.

SVM (support vector machine) techniques partition thedata belonging to different classes by fitting a hyperplanebetween them which maximizes the partition. The data ismapped into a higher-dimensional feature space where it canbe easily partitioned by a hyperplane. Furthermore, a kernelfunction is used to approximate the dot products between themapped vectors in the feature space to find the hyperplane.

4.4.1. Centralized Approaches Aim to SolveWSNs’ Application-Based Issues. Chikhaoui et al. [92] proposed the decisionTree (DT-) based classification technique for sensor data.They applied the classification model to identify the personsin ubiquitous environment. In order to identify persons,the proposed approach first extracts frequent patterns calledepisodes from the datasets using the Apriori algorithm [53].The next step evaluates the extracted patterns and assignsweights to these episodes to construct frequent episodeweight matrix (FEWM).

Finally, the classification algorithm Decision tree (DT) isapplied on FEWM.DT builds pattern classifier from a labeledtraining data-set using a divide-and-conquer approach. Tobuild up a DT model, it recursively selects the attribute thatis used to partition the training data-set into subsets untileach leaf node in the tree has uniform class membership.The proposed approach is validated by experiment usingdata collected from the Domus Laboratory [93] and theTestbed smart home [94]. The general performance andclassification accuracy of algorithm are evaluated by usingthe Weka framework version 3.7.0 [95]. Experiment resultsshow good classification. However, using frequent episodesalone without temporal constraints and deep analysis doesnot guarantee good identification.

Sharma et al. [96] proposed amethodology for classifyingthe sensors data by using nearest neighbor trajectory clas-sification (NNTC). The training phase simply stores everytraining example with its label. To make a prediction for atest example, first, its distance to every training example iscomputed.Then, 𝑘 closest training examples are stored,where𝑘 is a fixed integer and 𝑘 ≥ 1; among the 𝑘 examples, itlooks for the label that is most frequent. This label is theprediction for this test example. The algorithm is evaluatedby building a classifier from the preprocessed training datagenerated from NS2 [97] and test trajectory data [98] usingclass labels. Experimental investigation yields a significantoutput in terms of the correctly classified success rate, 92.3%.

Akhlaghinia et al. [99] proposed the prediction techniquein smart home environments to predict the behavior pattern


of occupants.The sensor NWs collect the variety of attributesincluding environmental changes and occupant’s interactionwith the environment. The collected data is then used by thelearning approach to construct a classification-based predic-tive model to predict the ambient intelligence environmentoccupancy. The occupancy is predicted by using the fuzzyrules which are modeled by using the past value of timeseries data. In the learning process, input from the sensor iscompared with stored rules to take appropriate action. Theprediction-based approach improves the energy saving insmart homes and enhances the safety and security of occu-pants. The result shows the ability of the proposed techniqueto predict the combined occupancy time series. However, themodel is implemented in single-user environment and unableto predict the complex environmental patterns in multi-userenvironment over long period.

4.4.2. Centralized Approaches Aim toMaximizeWSNs’ Perfor-mance. Gaber et al. [100] proposed the lightweight classifica-tion (LWClass), a one-pass algorithm for on-board miningof data streams in sensor networks. They used the algorithmoutput granularity (AOG) [101, 102] technique to preserve thelimited memory size and change the algorithm output rateaccording to data rate, available memory, algorithm outputrate history, and time constraints to fill the available memorywith generated knowledge.The algorithmworks by searchingfor the nearest instance stored in main memory when a newelement arrives. All instances are already stored in the mainmemory according to a prespecified distance threshold. Thethreshold here represents the similarity measure acceptableby the algorithm to consider two or more elements as oneelement according to the elements attribute values. If thealgorithm finds this element, then it checks the class label.If the class label is the same, then it increases the weightfor this instance by one; otherwise, it decrements the weightby one. If the weight becomes zero, then this element isreleased from the memory. The algorithm is empiricallyvalidated using synthetic streaming data under the resource-constrained environment of a common handheld computer.

4.4.3. DistributedApproaches Aim to SolveWSNs’ Application-Based Issues. McConnell and Skillicorn [103] presented adistributed framework for building and deploying predictorsin sensor networks. By using the computational power ofeach sensor, a powerful learning structure on whole networkis constructed. A distributed voting approach is proposedin which each sensor is a leaf of tree (DT) to performlocal prediction. Instead of sending the raw data, the localpredictive models built on sensors transmit the target class tothe sink. At sink, the local predication models are combinedto construct global prediction model. It shows how thelocal model enables sensors to respond to the change intarget by relearning local models. The proposed frameworkis useful especially for sensor networks with limited energy,computation, and bandwidth resources. It makes efficientthe distributed data mining in the presence of movingclass boundaries. Data is also confidentially achieved bytransmitting a predictivemodel instead of original data to the

sink. The distributed prediction model is evaluated using J48decision tree (implemented in WEKA) on variety of datasetfor both simple and weighted voting schemes. According toresults, distributed prediction model has the potential of anincrease in accuracy combined with a reduction in modelsize and runtime as compared with a centralized approach.Major issues in this framework are the need of an expensiveCPU on each sensor node for computing and building localpredictive model, and also extra memory is required to storelocal predictive model.

4.4.4. Distributed Approaches Aim to Maximize WSNs’ Per-formance. Malhotra et al. [104] proposed a distributed clas-sification scheme to generate effective feature vectors of lowdimension (FVLD) for wireless audio network. A distributedcluster-based algorithm for detection and classification ofvehicles has been proposed. Sensors form clusters on-demand for the sake of running a classification task based onthe produced feature vectors. The monitoring area is dividedinto clusters, and a cluster head is selected for each cluster.All sensors send their feature vector to cluster heads. Thecluster head combines all received feature vectors (includingone from itself), executes the classification task using, forexample, KNN or ML classifiers, and makes decision on theclass of the unknown vehicle. Two approacheswere proposed:the first combines extracted features and the second combinesindividual decisions. Classification using decision fusion anda maximum likelihood (ML) classifier led to the best results.ML is also compared with KNN classifier with varioussettings of data and decision fusion schemes. The proposedtechnique produced the best classification accuracy of 89.46%as compared with all other approaches.

Flouri et al. [105–107] have proposed distributed andincremental techniques for learning classification rules usingSVM-based (support vector machine) technique in a sensornetwork. The authors proposed two distributed algorithms:the distributed fix partition SVM (DFP-SVM) and theweighted distributed fix partition SVM (WDFP-SVM) fortraining a SVM applied to the classification problem in aWSN. SVM is incrementally trained on example set calledsupport vector. The fact with SVM is that the number ofsupport vectors is very small comparedwith the number of allsample values. Besides, the support vectors (and offset) revealcompressed representation of separating SVM hyperplane.That is why sending only the support vectors instead ofall training samples to the next cluster head is obviouslyvery energy efficient due to communication reduction. Aftertraining, the required parameters of the kernel functions aretransferred to each node for classification. The performanceof the proposed approach is evaluated by running number ofsimulation, and comparison is made with centralized algo-rithm. The results show that energy consumption decreaseswhen the SVM is trained incrementally as compared with thecentralized case. However, the challenges for SVM formula-tions are computational complexity and the choice of properkernel function.

Rajasegarar et al. [108] proposed the SVM-based tech-nique for outlier detection in sensor data. This techniqueuses one-class quarter-sphere SVM to identify local outliers


at each node and to minimize the computational complexity.The sensor data that lies outside the quarter sphere isconsidered as an outlier. Each node communicates onlythe radius information of sphere with its parent for outlierclassification. This technique identifies outliers from the datameasurements collected after a long-time window and is notperformed in real time. The technique also ignores spatialcorrelation of neighboring nodes, which makes the results oflocal outliers inaccurate. The technique is evaluated by usingthe real sensor measurement collected from deployment ofwireless sensors in the Great Duck Island Project [2] formonitoring the habitat of sea birds. The algorithm is imple-mented in Matlab and two simulations were run to measurethe computational strategy and various kernel functions.Results reveal that the proposed technique achieves signifi-cant energy savings in terms of communication overhead inthe network.

5. Comparison of Data Mining Techniquesfor WSNs

This section identifies several common and different aspectsof data mining techniques specially designed for WSNsdiscussed above. These aspects will be used as metrics in thecomparative Tables 2, 3, 4, 5, and 6. First, evaluation aspectsfor different techniques are discussed, and, then, comparativetables are presented to compare and differentiate existing datamining techniques for WSNs data.

5.1. Input Sensor Data. Sensor data can be viewed as largevolume of real-valued data that is continuously collectedfrom WSNs. The type of input sensor data demonstrateswhich data mining techniques can be used to analyze thedata. Data mining techniques usually consider following twocharacteristics of data.

Attribute. Mining techniques can identify the associationbetween data attributes. Attributes can be homogenous [50] orheterogeneous [33, 48]. Homogenous attribute means sensingsingle-value attribute, for example, temperature only. Forheterogeneous case, each nodemay be equippedwithmultiplesensors and can sense multiple attributes, for example, tem-perature, humidity, and pressure. The data mining techniqueshould be able to identify the correlation between multipleattributes.

Correlation. Two types of data correlation appear at eachsensor node. The first type is attribute correlation, that is,dependency among data attributes. The second type is interms of time and space, that is, temporal and spatial corre-lation. Temporal correlation indicates that the readings fromdifferent sensor node are observed at the same time instant,and readings observed at one time instant are related tothe readings observed at the previous time instant, whereas,spatial correlation indicates that the readings from sensornodes geographically close to each other are expected tobe largely correlated. Capturing spatiotemporal correlation

helps to predict future trend of sensor reading and identifica-tion of dead node if reading from correlated sensor ismissing.

5.2. Processing Architecture. In order to apply data miningtechnique on sensor data, we need to determine the modelsof computation. There are two general models. Consider thefollowing.

Centralized.The simplest way to analyzeWSNs data is to use acentralized model. In this approach, entire raw data collectedfromWSNs is transferred to central server whichmaintains adatabase of readings from all of the sensors.The central serverperforms offline extensive analysis in order to find interestingpatterns from the aggregated data. With the size of WSNsincreasing, the amount of data transmitted in the system willbecome huge. The obvious drawback of this approach is highconsumption of energy and bandwidth. Furthermore, it is notscalable to very large number of sensors.

Distributed. Another computation approach uses distributedmodel, in which sensor nodes use their processing abilitiesto carry out some mining tasks locally and transmit onlythe required and partially processed data called local model.Local models contain the compact event patterns rather thanraw data. For example, data collected from different sensorcan be aggregated before being transmitted to central server.In these systems, an intermediate node called “aggregator” isused to collect and aggregate the data from different sensors.Since sensor nodes are constrained in resources, the challengefor this approach is how to satisfy the mining accuracywhile keeping the communication overhead, memory, andcomputational cost low.

5.3. Data Mining Method. It refers to the data miningalgorithm adapted or developed for unique characteristic ofWSNs data. Distributed approaches use one-scan algorithmsfor real-time processing in order to deal with the high dataarrival rate; the mining results are expected to be availablewithin short response times, whereas centralized approachescollect the sensory data to single site and applies offlinemultiscan technique for extensive data analysis.

5.4. Node Properties. The proposed techniques are largelyinfluenced by following types of node properties.

Connectivity. Single-hop communication is a direct commu-nication between the sensor node and the base station. It issimple and easy to implement but limited by communicationdistance.Multihop communication uses some kinds of nodesas relays when transmitting data packets from the source tothe sink, which is more complex.

Mobility. Node mobility increases the complexity of design-ing an appropriate data mining technique for WSNs. Themajority of techniques assumes that sensor nodes are static,only a few techniques consider the node mobility. Whennodes are mobile, maintaining a certain structure for data


Table2:Com

paris

onof

dataminingtechniqu

esforw

irelesssensor

networks.

Approach

Objectiv

eDM

metho

d

Processin

gSensor

data

Nod

eproperties

Implem

entatio

nLimitatio

nsArchitecture

Attributes

Correlatio

nCon

nectivity

Mob

ility

Nod

erole

Nod

etask

Applicationarea

Evaluatio

nmetho

dDatas

ource

Opt.

objective

Distributed

Central

Homogenous

Heterogeneous

Attribute

Spatial

Temporal

Singlehop

Multihops

Static

Mobile

ClusterheadSensorRelay

Simulation

AnalyticalMod.

Real

Synthetic

Frequent

patte

rnmining

DSA

RM[42]

Missingdata

estim

ation

Aprio

rilik

e√√

√√

√√

Sensea

ndsend

Traffi

cmon

itorin

g√

√Data

accuracy

Igno

rethes

ensor

thatrepo

rts

different

values

In-networkdata

mining[51]

Eventspatte

rns

discovery

Aprio

rilik

e√

√√√

√√

√

Aggregatio

n,localp

attern

mining

Environm

ental

mon

itorin

g√√√

Scalability

Highmem

oryand

commun

ication

Distrib

uted

data

aggregation[15]

ImproveW

SNperfo

rmance

Aprio

rilik

e√

√√

√√

√Supp

ort-b

ased

aggregation

WSN

sperfo

rmance

mon

itorin

g√

√Datas

ize

Increasesb

uffer

cost,

delayed

crucialm

essages

Onlinea

lgorith

m[46]

Intervallist

ofrepresentatio

nof

WSN

sdata

Lossy

coun

ting

√√

√√

√√

Perio

dical

sensing

WSN

smon

itorin

g√

√Timea

ndmem

ory

Datar

edun

dancy

Lightweightrule

learning

[48]

Identifyhigh

lycorrelated

rules

forsensin

gAp

riorilik

e√

√√

√√

√Query-based

data

sensing

Con

trolW

SNs

operations

√√

Energy

Not

valid

ated

well

onrealdata

CARM

[43]

Missingdata

estim

ation

FP-growth

based

√√

√√

√√

Sensea

ndsend

Dataa

nalysis

√√

Data

accuracy

Ineffi

cientfor

hand

ling

high

-speed

data


Table3:Com

paris

onof

dataminingtechniqu

esforw

irelesssensor

networks,con

tinued.

Approach

Objectiv

eDM

metho

d

Processin

gSensor

data

Nod

eproperties

Implem

entatio

nLimitatio

nsArchitecture

Attributes

Correlatio

nCon

nectivity

Mob

ility

Nod

erole

Nod

etask

Applicationarea

Evaluatio

nmetho

dDatas

ource

Opt.

objective

Distributed

Central

Homogenous

Heterogeneous

Attribute

Spatial

Temporal

Singlehop

Multihops

Static

Mobile

Clusterhead

Sensor

Relay

Simulation

Analyticalmod.

Real

Synthetic

Frequent

patte

rnmining

Associationrules

mining

fram

ework[50]

Faultand

future

event

predictio

n

FP-growth

usingPL

T-str

ucture√

√√

√√

√√

Aggregatio

nMon

itorW

SNs

quality

ofservice√

√No.of

messages

Increase

costdu

eto

multip

leDBscan

SP-tr

ee[49]

Disc

over

events

patte

rns

FP-growth

based

√√

√√

√√

Sensea

ndsend

Generic

mon

itorin

g√√√

Mem

ory

Hightre

econstructio

ncost

Sequ

entia

lpattern

mining

Relatio

nal

fram

ework[58]

Multi-

dimensio

nal

correlation

discovery

Aprio

rilik

e√

√√√

√√

Sensea

ndsend

Environm

ental

mon

itorin

g√√

Data

representatio

nMem

oryandtim

econsum

ing

Episo

dediscovery(ED)

[21]

Actio

npredictio

n

Generalized

sequ

entia

lpatte

rn(G

SP)

√√

√√

√Sensea

ndsend

Inhabitants

behavior

predictio

n√√√

Predictio

naccuracy

Ineffi

cientfor

complex

activ

ities

MPG

[64]

Predicto

bject’s

future

movem

ent

Aprio

rilik

e√

√√

√√√

Clusterin

gRe

al-timeo

bject

tracking

√√

Tracking

time

andenergy

Not

analyzed

onrealdataset

Con

textual

patte

rns

discovery[22]

Ano

maly

detection

PSP

√√√

√√

√Sensea

ndsend

Railw

aymaintenance

√√

Ano

maly

precision

Missingreal-time

anom

alypredictio

n


Table4:Com

paris

onof

dataminingtechniqu

esforw

irelesssensor

networks,con

tinued.

Approach

Objectiv

eDM

metho

d

Processin

gSensor

data

Nod

eproperties

Implem

entatio

nLimitatio

nsArchitecture

Attributes

Correlatio

nCon

nectivity

Mob

ility

Nod

erole

Nod

etask

Applicationarea

Evaluatio

nmetho

dDatas

ource

Opt.objectiv

e

Distributed

Central

Homogenous

Heterogeneous

Attribute

Spatial

Temporal

Singlehop

Multihops

Static

Mobile


Simulation

Analyticalmod.

Real

Synthetic

Sequ

entia

lpattern

mining

TMP-mine[65]

Predicto

bject’s

future

movem

ent

Patte

rngrow

thusingTM

P-tre

econstructio

n√

√√

√√

√Ru

le-based

node

activ

ation

Real-timeo

bject

tracking

√√

Energy

Highmissing

rateandtim

e

Patte

rnlearner[23]B

ehavior

recogn

ition

Tree

projectio

n√

√√

√√

√Sensea

ndsend

Behavior

mon

itorin

g√√

No.of

patte

rns

learned

Com

plex

and

redu

ndant

patte

rns

MSA

P[63]

Faultp

rediction

Cand

idate

constructio

n√√

√√√

√Sensea

ndsend

Telecommun

ication

√√

Patte

rnsa

ccuracy

Cand

idate

constructio

nis

expensiveto

compu

te

PTSP

[66]

Object’s

future

movem

ent

predictio

n

Sequ

entia

lpatte

rngeneratio

n√

√√

√√

√Ru

le-based

node

activ

ation

Objecttracking

√√

Energy

Ineffi

cientto

predict

high

-speed

objects

Clusterin

g

DCC

[86]

WSN

slon

gevity

Data

correlation-

based

cluste

ring

√√

√√√

√√

Data

supp

ression

GenericWSN

sapplication

√√

Energy

anddata

size

Highclu

sterin

grate

H-cluste

r[85]

In-network

commun

ication

Data

correlation-

based

cluste

ring

√√

√√√

√√

Data

summarization

Real-time

mon

itorin

g√

√√

Com

mun

ication

Highdataloss

rate


Table5:Com

paris

onof

dataminingtechniqu

esforw

irelesssensor

networks,con

tinued.

Approach

Objectiv

eDM

metho

d

Processin

gSensor

data

Nod

eproperties

Implem

entatio

nLimitatio

nsArchitecture

Attributes

Correlatio

nCon

nectivity

Mob

ility

Role

Nod

etask

Applicationarea

Evaluatio

nmetho

dDatas

ource

Opt.objectiv

e

Distributed

Central

Homogenous

Heterogeneous

Attribute

Spatial

Temporal

Singlehop

Multihops

Static

Mobile


Simulation

Analyticalmod.

Real

Synthetic

Clusterin

gPredictio

nmod

el[87]

Predictio

n-based

mon

itorin

gHeuris

ticscheme

√√

√√

√√

√√√

Localprediction

mod

elEn

vironm

ental

mon

itorin

g√

√Com

mun

ication

Clustero

verla

pping

CAG[88]

WSN

sbandw

idth

gain

Data

correlation-

based

cluste

ring

√√

√√

√√

√√

Dataa

ggregatio

nGenericWSN

sapplications

√√

Com

mun

ication

Sensorydataloss

EEDC[84]

On-demand

cluste

ring

Data

correlation-

based

cluste

ring

√√

√√

√√

√Sensea

ndsend

Surveillanced

ata

analysis

√√√

Energy

Ineffi

cientfor

large

WSN

s

Clusterin

gsensorydata[67]Com

mun

ication

efficiency

K-means

√√√

√√

√√

Data

summarization

Dataa

nalysis

√√

Com

mun

ication

Ineffi

cientfor

large

WSN

sAttributeb

ased

cluste

ring[89]

WSN

sbandw

idth

gain

Hierarchal

cluste

ring√

√√

√√

√√

Datac

luste

ring

Mon

itorin

gand

tracking

√√

Com

mun

ication

Highcompu

tatio

ncost

DHCS

[90]

Uniform

data

distr

ibution

Hierarchal

cluste

ring√

√√√

√√

√√

Datac

luste

ring

and

summarization

Interactived

ata

analysis

√Message

redu

ction

Nod

esenergy

isigno

red.


Table6:Com

paris

onof

dataminingtechniqu

esforw

irelesssensor

networks,con

tinued.

Approach

Objectiv

eDM

metho

d

Processin

gSensor

data

Nod

eproperties

Implem

entatio

nLimitatio

nsArchitecture

Attributes

Correlatio

nCon

nectivity

Mob

ility

Role

Nod

etask

Applicationarea

Evaluatio

nmetho

dDatas

ource

Opt.

objective

Distributed

Central

Homogenous

Heterogeneous

Attribute

Spatial

Temporal

Singlehop

Multihops

Static

Mobile


Simulation

Analyticalmod.

Real

Synthetic

Classifi

catio

nPerson

identifi

catio

nalgorithm

s[109]

Identifyhu

man

behavior

Decision

tree

√√√

√√

√Sensea

ndsend

Health

care

√√

Classifi

catio

naccuracy

Doesn

otgu

arantee

thec

orrectness

Predictio

nfram

ework[103]

Distrib

uted

predictio

nDecision

tree

√√

√√√

√√

Localprediction

Generic

√√

Predictio

naccuracy

Com

putatio

nal

complexity

NNTC

[96]

Real-time

classificatio

nNearest

neighb

or√√

√√

√√

Sensea

ndsend

Generic

√√√

Classifi

catio

naccuracy

Not

evaluatedon

realdataset

LWClass[100]

Preserve

WSN

sresources

KNN

√√

√√

√√

Sensea

ndsend

Ubiqu

itous

environm

ents

√√

Resource

awareness

Non

adaptio

nto

conceptd

rift

FVLD

[104

]Lo

w-dim

ensio

nfeaturev

ector

generatio

nKN

N,M

L√

√√

√√

√√

Classifi

catio

nVe

hicle

classificatio

n√

√En

ergy

Highcostof

feature

vector

transm

ission

Fuzzypredictor

mod

el[99]

Occup

ancy

predictio

nFu

zzyrules

√√

√√

√√

Sensea

ndsend

Health

care

√√

Predictio

naccuracy

Ineffi

cientfor

complex

scenarios

Onlinelearning

[105]

Increm

ental

classificatio

nSV

M√

√√

√√

√√

Classifi

catio

nEn

vironm

ental

mon

itorin

g√

√En

ergy

Com

putatio

nal

complexity

One-class

quarter-sphere

SVM

[108]

Ano

maly

detection

SVM

√√

√√

√√√

Localano

maly

detection

Habitat

mon

itorin

g√

√En

ergy

Igno

resspatia

lcorrelation


mining becomes difficult because updates on this structureshould be persisted over time.

Node Role. Node can perform three types of role [33] asfollows.

(i) Regular Sensor. These are the nodes with limitedresources, and they are used to sense the phenomenaand send the sensed data to the base station.

(ii) Cluster Head. Cluster head can be a regular sensornode, or it can be rich in resources. In centralizedapproaches, cluster head is a regular sensor node thatonly controls the cluster membership. In distributedapproaches, besides responding for cluster formation,CHs perform aggregation/fusion of collected sensors’data. Therefore, they are equipped with significantlymore computation and communication resources.

(iii) Relay. It is the node that acts as medium to transmitthe data packet from one node to the others.

Node Task. In centralized approach, node task is to sense thephenomena being monitored and send the sensed data to thebase station. In distributed approaches, node can performcomputation and can take action based on the detectedphenomena or target.

5.5. Application Area. We also evaluated the type of applica-tion benefited fromWSNs data mining techniques. Here, weexemplify some real-world applications as follows.

(i) First is the environmental monitoring [5–7, 51, 58,87], in which sensors are deployed in harsh andunattended regions to monitor the natural environ-ment. Data mining techniques can identify when andwhere an event may occur and trigger an alarm upondetection.

(ii) Second is the habitant and health monitoring [1, 2,99, 109], in which patients/humans are equipped withsmall sensors on multiple different positions of theirbody tomonitor their health or behavior.Dataminingtechnique can identify the abnormal behavior andhelp to take effective action.

(iii) Third is the object tracking [3, 4, 65, 66]. in whichsensors are embedded inmoving targets to track themin real-time. Data mining techniques help to improvethe estimation of the location of targets and also tomake tracking more efficient and accurate.

(iv) Fourth is the WSNs performance [46, 48, 50, 51].WSNs are usually unattended and deployed in harshenvironment. Sensor nodes are resource constrainedespecially in terms of power. Data mining techniqueshelp to identify the faulty or dead nodes. Theyalso help to conserve energy by using in-networkprocessing in which aggregated data is sent to centralside.

(v) Fifth is the data analysis [67, 84, 90]. Data miningtechniques help to discover potentially interesting

data patterns in a sensor network for a certainapplication.

(vi) Sixth is the real-time monitoring [64, 65, 85]. Datamining techniques especially distributed techniqueshelp to identify certain patterns and predict futureevents in a given time window, which make real-timeresponse and action feasible.

5.6. Implementation. Each technique is also evaluated interms of experimental validation, that is, which dataset isused, which WSNs optimization objectives are achieved, andso forth.

Evaluation Method. Analytical modeling, simulation

reviewarticle data mining techniques for wireless sensor ...since data mining is a broad discipline...

Documents