This project is funded by Structural Funds of the European Union and state budget of the Czech Republic
Data Mining and Fusion Techniques for WSNs as a Source of The Big Data
Mohamed Mostafa Fouad, PhD.
Arab Academy for Science, Technology, and Maritime Transport, Cairo - Egypt; IT4Innovations, VSB-Technical University of Ostrava, Ostrava - Czech Republic.
Member at SRGE Research Group (www.egyptscience.net).
Agenda
• WSN: An Overview
  – WSN Design challenges
• Big Data Challenges
  – Big Data Classification
  – Big Data Analysis Challenges
• Sensory Data Processing Techniques
  – Data Mining
  – Data Fusion
• Conclusions
WSN: An Overview
WSN Design challenges
Resource Constraints
Depletable Energy Source
Security & Privacy
Distribution Strategy
Fault Tolerance
Heterogeneity
Big Data Challenge
The Basic Big Data Challenge
• The continuous, high-speed data streaming.
BIG DATA STORAGE
Big Data Classification
Data Classification: Structured, Semi-structured, Unstructured
Big Data Analysis Challenges
• Input Data Sources: e.g. cloud computing, WSN, IoT, social networks, search engines, biomedical, mobile, NFC, etc.
• Stages shown in the figure: Input Data Sources, Produced Data, Pre-Processing, Data Analysis, Value Generating, Output Values.
• Data quality issues: inconsistent data, redundant data, incomplete data.
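As a rough illustration of the pre-processing stage, the sketch below handles redundant, inconsistent, and incomplete readings on a small list of sensor records. The record layout (timestamp, sensor id, value), the valid range, and the carry-forward fill strategy are all assumptions made for this example and are not taken from the talk.

```python
from typing import Dict, List, Optional, Tuple

# (timestamp, sensor_id, value); this layout is an assumption for the example
Reading = Tuple[int, str, Optional[float]]


def preprocess(readings: List[Reading], lo: float, hi: float) -> List[Reading]:
    """Drop redundant rows, discard inconsistent values, fill incomplete ones."""
    seen = set()
    last_good: Dict[str, float] = {}  # sensor_id -> last accepted value
    cleaned: List[Reading] = []
    # Sort on (timestamp, sensor_id) only, so None values are never compared.
    for ts, sid, val in sorted(readings, key=lambda r: (r[0], r[1])):
        if (ts, sid) in seen:
            continue  # redundancy: same sensor and timestamp already handled
        seen.add((ts, sid))
        if val is not None and not (lo <= val <= hi):
            val = None  # inconsistency: out-of-range value treated as missing
        if val is None:
            if sid not in last_good:
                continue  # incompleteness with nothing to fill from yet
            val = last_good[sid]  # carry the last accepted value forward
        last_good[sid] = val
        cleaned.append((ts, sid, val))
    return cleaned


raw = [
    (1, "s1", 21.0),
    (1, "s1", 21.0),   # duplicate
    (2, "s1", None),   # missing
    (3, "s1", 500.0),  # out of range
    (4, "s1", 22.0),
]
print(preprocess(raw, lo=-40.0, hi=85.0))
```

Carrying the last accepted value forward is only one possible fill strategy; interpolation or model-based estimation may suit other deployments better.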
The Sensor Data Processing Techniques
• Standard data processing models (such as RDBMS) may not be applicable to WSNs.
• From our perspective, the new processing techniques should not only be designed specifically for WSNs but should also benefit high-volume, high-velocity big data. Such techniques include:
  – Data Mining, and
  – Data Fusion.
Data Mining Over WSNs
• The need to extract knowledge from the sensor data collected by WSNs has become an important issue in real-time decision systems.
• However, traditional data mining techniques are not applicable due to the characteristics of WSNs.
• Data mining algorithms for WSNs can be broadly classified into:
  – Centralized data mining.
  – Distributed data mining.
Centralized Data Mining Approaches
• In the centralized approaches, all sensors send their data to a centralized computing resource, usually the sink node, to be processed.
• The centralized approach usually requires high computational power and an unbounded energy source.
Examples of Centralized Data Mining Approaches:
  – Estimating the sensors' missing data
  – Mining for WSN-Web based applications
  – Mining for multiple data streams
Examples of Centralized Data Mining Approaches
• Mining used to estimate the sensors' missing data, such as:
  – Data Stream Association Rule Mining (DSARM) framework.
  – Adaptive Multiple Regression (AMR) framework (the enhanced version of DSARM).
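To make the idea of estimating missing sensor data concrete, the following sketch fills a gap in one sensor's readings from a correlated neighbouring sensor using an ordinary least-squares line. It is not the DSARM or AMR framework; the function names, the single-neighbour setup, and the sample values are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_missing(neighbor: List[float],
                     target: List[Optional[float]]) -> List[float]:
    """Fill None entries in `target` using a line fitted on the complete pairs."""
    pairs = [(x, y) for x, y in zip(neighbor, target) if y is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [y if y is not None else slope * x + intercept
            for x, y in zip(neighbor, target)]


# Hypothetical readings: the target sensor lost its third sample.
neighbor = [20.0, 21.0, 22.0, 23.0, 24.0]
target = [21.0, 22.0, None, 24.0, 25.0]
filled = estimate_missing(neighbor, target)
print(filled)
```

A multiple-regression variant would fit on several neighbours at once; a single neighbour keeps the sketch short.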
Examples of Centralized Data Mining Approaches
• Mining used for sensor-based applications that rely heavily on the World Wide Web (WWW), such as:
  – The Sensor Web, which is considered another layer added to the WWW.
  – XML provides a suitable way to connect the sensors directly to web applications, but the problem is the tree structure of the XML document.
  – Paik et al. have proposed a reformulation of association rules for streamed XML data.
  – The main idea of the solution is that the association rules are used with the Label Projection Approach to generate frequent XML tree items without any redundancy.
Examples of Centralized Data Mining Approaches
• Mining used for dealing with multiple data streams, such as:
  – The MG-join algorithm, which uses Discrete Fourier Transforms (DFTs) to reduce the dimensionality of the streamed data to a small number of coefficients.
  – It uses an incremental methodology to update the streamed data.
  – The main issue is that increasing the number of coefficients affects the performance of the algorithm.
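The DFT-based reduction can be sketched as follows: a window of readings is transformed, only the first few coefficients are kept, and the window is approximately rebuilt from them. This is a plain DFT illustration, not the MG-join algorithm itself; the function names, window length, and sample signal are assumptions.

```python
import cmath
import math
from typing import List


def dft(signal: List[float]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform of a real-valued window."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def reduce_window(signal: List[float], k: int) -> List[complex]:
    """Keep only the first k DFT coefficients as a compact summary."""
    return dft(signal)[:k]


def reconstruct(coeffs: List[complex], n: int) -> List[float]:
    """Approximate the real window from its first k coefficients.

    Uses the conjugate symmetry of a real signal's DFT, so every kept
    frequency must lie strictly below n / 2.
    """
    assert 2 * (len(coeffs) - 1) < n, "kept frequencies must be below n / 2"
    out = []
    for t in range(n):
        total = coeffs[0].real  # DC term
        for f in range(1, len(coeffs)):
            total += 2 * (coeffs[f] * cmath.exp(2j * cmath.pi * f * t / n)).real
        out.append(total / n)
    return out


# Hypothetical window: a constant level plus one slow oscillation.
window = [10.0 + 2.0 * math.cos(2 * math.pi * t / 8) for t in range(8)]
summary = reduce_window(window, k=2)      # 8 readings summarised by 2 coefficients
approx = reconstruct(summary, n=len(window))
print(max(abs(a - b) for a, b in zip(window, approx)))
```

Keeping more coefficients improves the approximation of less regular signals but increases the work per update, which is the trade-off noted in the last bullet.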
Distributed Data Mining Approaches
• Each node uses its limited computing resources to perform the mining process. The main advantage of this approach is that it reduces the raw data streams to be delivered to the sink node.
• However, it may deplete the network resources in terms of memory footprint and energy consumption.
Examples of Distributed Data Mining Approaches:
  – The on-disk data structure (DSTable)
  – Trained classifiers for capturing semantic features
  – Deep neural network (DNN), a machine learning approach
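The reduction in traffic to the sink can be illustrated with a deliberately simple stand-in for a mining step: each node condenses its local window into a fixed-size summary, and the sink merges the summaries instead of receiving raw readings. The `Summary` type, node names, and readings below are hypothetical and do not correspond to any of the approaches listed above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Summary:
    count: int
    total: float
    minimum: float
    maximum: float


def summarize(window: List[float]) -> Summary:
    """Runs on a sensor node: condense a local window into four numbers."""
    return Summary(len(window), sum(window), min(window), max(window))


def merge(summaries: List[Summary]) -> Summary:
    """Runs on the sink: combine node summaries without any raw readings."""
    return Summary(
        count=sum(s.count for s in summaries),
        total=sum(s.total for s in summaries),
        minimum=min(s.minimum for s in summaries),
        maximum=max(s.maximum for s in summaries),
    )


# Hypothetical per-node windows of temperature readings.
node_windows: Dict[str, List[float]] = {
    "n1": [20.0, 21.0, 22.0],
    "n2": [30.0, 31.0],
    "n3": [25.0],
}
local = [summarize(w) for w in node_windows.values()]
network = merge(local)
print(network, network.total / network.count)
```

Each node forwards four numbers per window regardless of how many readings the window holds, at the cost of spending its own memory and energy on the computation.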
Example of Distributed Data Mining
• The Stream Mining Application (SMA) for distributed mining in WSN.
Data Fusion Over WSNs
• Data fusion is an important concept in both big data and WSNs. In the big data context, fusion is achieved at the computational platform, while in the WSN context, fusion is performed inside the network (i.e. as an in-network process).
Fusion based on Input Sources Relation
Fusion based on Level of Abstraction
• Low-level fusion: combining a number of raw input data into a new and accurate raw data.
• Medium-level fusion: provides an abstraction map of all features and attributes of the entry data.
• High-level fusion: combining symbols/decisions from different sensor sources to establish a single accurate symbol/decision.
• Multilevel fusion: combining more than one fusion approach.
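Two of these levels are easy to sketch. For low-level fusion, the example below combines raw readings with an inverse-variance weighted average; for high-level fusion, it combines per-sensor decisions by majority vote. Both are common textbook choices picked here for illustration, not methods named in the talk, and the variances and decision labels are assumed to be known.

```python
from collections import Counter
from typing import List


def low_level_fusion(values: List[float], variances: List[float]) -> float:
    """Fuse raw readings; sensors with lower noise variance get more weight."""
    weights = [1.0 / v for v in variances]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def high_level_fusion(decisions: List[str]) -> str:
    """Fuse per-sensor decisions by majority vote.

    On a tie, the decision that appeared first in the list wins.
    """
    return Counter(decisions).most_common(1)[0][0]


# Hypothetical readings from two sensors; the second is three times noisier.
fused_value = low_level_fusion([10.0, 12.0], variances=[1.0, 3.0])
fused_decision = high_level_fusion(["obstacle", "clear", "obstacle"])
print(fused_value, fused_decision)
```

A multilevel system would chain such steps, for example fusing features first and then fusing the decisions derived from them.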
Example of the Fusion based on Level of Abstraction
• A multilevel fusion system that combines medium-level and high-level fusion in an automated obstacle detection application.
Conclusions
• The talk has focused on the need to apply pre-processing techniques, such as data mining and data fusion techniques for WSNs, to the data collected from the WSNs (in-network pre-processing operations on sensor data), rather than transmitting large amounts of continuous streaming data to big data storage.
• The advantages and limitations of centralized and distributed data mining techniques for WSNs have been analyzed.
• Moreover, the data fusion techniques, which ensure the accuracy and trustworthiness of the collected data, and their sub-classes (i.e. abstraction-based and input-source-relation-based) have been discussed and analyzed in terms of the energy consumption and the limited resources of WSNs.
• It is then concluded that, since WSNs are a main source of big data, it is vital for the sensor data to be processed in-network, as this would prolong the WSN lifetime and reduce the volume of the big data, thus accelerating the process of discovering value from it.
Thank you
Mohamed Mostafa [email protected]
Ostrava, Faculty of Electrical Engineering and Computer Science (20th April 2015)