
Research Article
MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning

Yang Liu,1 Jie Yang,1 Yuan Huang,1 Lixiong Xu,1 Siguang Li,2 and Man Qi3

1 School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China
2 The Key Laboratory of Embedded Systems and Service Computing, Tongji University, Shanghai 200092, China
3 Department of Computing, Canterbury Christ Church University, Canterbury, Kent CT1 1QU, UK

Correspondence should be addressed to Lixiong Xu; xulixiong@scu.edu.cn

Received 28 May 2015; Accepted 11 August 2015

Academic Editor: Ladislav Hluchy

Copyright © 2015 Yang Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. However, ANNs are notably slow in computation, especially when the size of the data is large. Nowadays, big data has received a momentum from both industry and academia. To fulfill the potentials of ANNs for big data applications, the computation process must be speeded up. For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model to facilitate data intensive applications. Three data intensive scenarios are considered in the parallelization process in terms of the volume of the classification data, the size of the training data, and the number of neurons in the neural network. The performance of the parallelized neural networks is evaluated in an experimental MapReduce computer cluster from the aspects of accuracy in classification and efficiency in computation.

1. Introduction

Recently, big data has received a momentum from both industry and academia. Many organizations are continuously collecting massive amounts of datasets from various sources such as the World Wide Web, sensor networks, and social networks. In [1], big data is defined as a term that encompasses the use of techniques to capture, process, analyze, and visualize potentially large datasets in a reasonable time frame not accessible to standard IT technologies. Basically, big data is characterized with three Vs [2]:

(i) Volume: the sheer amount of data generated.
(ii) Velocity: the rate at which the data is being generated.
(iii) Variety: the heterogeneity of data sources.

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. The back-propagation neural network (BPNN), the most popular ANN, can approximate any continuous nonlinear function to arbitrary precision given a sufficient number of neurons [3]. Normally, BPNN employs the back-propagation algorithm for training, which requires a significant amount of time when the size of the training data is large [4]. To fulfill the potentials of neural networks in big data applications, the computation process must be speeded up with parallel computing techniques such as the Message Passing Interface (MPI) [5, 6]. In [7], Long and Gupta presented a scalable parallel artificial neural network using MPI for parallelization. It is worth noting that MPI was designed for data intensive applications with high performance requirements, but it provides little support for fault tolerance. If any fault happens, an MPI computation has to be restarted from the beginning. As a result, MPI is not suitable for big data applications, which would normally run for many hours, during which some faults might happen.

This paper presents a MapReduce based parallel back-propagation neural network (MRBPNN). MapReduce has become a de facto standard computing model in support of big data applications [8, 9]. MapReduce provides a reliable, fault-tolerant, scalable, and resilient computing framework for storing and processing massive datasets.



MapReduce scales well with ever increasing sizes of datasets due to its use of hash keys for data processing and the strategy of moving computation to the closest data nodes. In MapReduce, there are mainly two functions, which are the Map function (mapper) and the Reduce function (reducer). Basically, a mapper is responsible for the actual data processing and generates intermediate results in the form of ⟨key, value⟩ pairs. A reducer collects the output results from multiple mappers with secondary processing, including sorting and merging the intermediate results based on the key values. Finally, the Reduce function generates the computation results.

We present three MRBPNNs (i.e., MRBPNN_1, MRBPNN_2, and MRBPNN_3) to deal with different data intensive scenarios. MRBPNN_1 deals with a scenario in which the dataset to be classified is large. The input dataset is segmented into a number of data chunks which are processed by mappers in parallel. In this scenario, each mapper builds the same BPNN classifier using the same set of training data. MRBPNN_2 focuses on a scenario in which the volume of the training data is large. In this case, the training data is segmented into data chunks which are processed by mappers in parallel. Each mapper still builds the same BPNN but uses only a portion of the training dataset to train the BPNN. To maintain a high accuracy in classification, we employ a bagging based ensemble technique [10] in MRBPNN_2. MRBPNN_3 targets a scenario in which the number of neurons in a BPNN is large. In this case, MRBPNN_3 fully parallelizes and distributes the BPNN among the mappers in such a way that each mapper employs a portion of the neurons for training.

The rest of the paper is organized as follows. Section 2 gives a review of the related work. Section 3 presents the designs and implementations of the three parallel BPNNs using the MapReduce model. Section 4 evaluates the performance of the parallel BPNNs and analyzes the experimental results. Section 5 concludes the paper.

2. Related Work

ANNs have been widely applied in various pattern recognition and classification applications. For example, Jiang et al. [11] employed a back-propagation neural network to classify high resolution remote sensing images to recognize roads and roofs in the images. Khoa et al. [12] proposed a method to forecast the stock price using BPNN.

Traditionally, ANNs are employed to deal with a small volume of data. With the emergence of big data, ANNs have become computationally intensive for data intensive applications, which limits their wide application. Rizwan et al. [13] employed a neural network for global solar energy estimation. They considered the research a big task, as traditional approaches are based on an extreme simplicity of the parameterizations. A neural network was designed which contains a large number of neurons and layers for complex function approximation and data processing. The authors reported that in this case the training time will be severely affected. Wang et al. [14] pointed out that currently large scale neural networks are one of the mainstream tools for big data analytics. The challenge in processing big data with large scale neural networks includes two phases, which are the training phase and the operation phase. To speed up the computations of neural networks, there are some efforts that try to improve the selection of initial weights [15] or control the learning parameters [16] of neural networks. Recently, researchers have started utilizing parallel and distributed computing technologies such as cloud computing to solve the computation bottleneck of a large neural network [17-19]. Yuan and Yu [20] employed cloud computing mainly for the exchange of privacy data in a BPNN implementation in processing ciphered text classification tasks. However, cloud computing as a computing paradigm simply offers infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). It is worth noting that cloud computing still needs big data processing models such as the MapReduce model to deal with data intensive applications. Gu et al. [4] presented a parallel neural network using in-memory data processing techniques to speed up the computation of the neural network but without considering the accuracy aspect of the implemented parallel neural network. In this work, the training data is simply segmented into data chunks which are processed in parallel. Liu et al. [21] presented a MapReduce based parallel BPNN in processing a large set of mobile data. This work further employs AdaBoosting to accommodate the loss of accuracy of the parallelized neural network. However, the computationally intensive issue may exist not only at the training phase but also at the classification phase. In addition, although AdaBoosting is a popular sampling technique, it may enlarge the weights of wrongly classified instances, which would deteriorate the algorithm accuracy.

3. Parallelizing Neural Networks

This section presents the design details of the parallelized MRBPNN_1, MRBPNN_2, and MRBPNN_3. First, a brief review of BPNN is introduced.

3.1. Back-Propagation Neural Network. A back-propagation neural network is a multilayer feed forward network which trains on the training data using an error back-propagation mechanism. It has become one of the most widely used neural networks. BPNN can perform a large volume of input-output mappings without knowing their exact mathematical equations. This benefits from the gradient-descent feature of its back-propagation mechanism. During the error propagation, BPNN keeps tuning the parameters of the network until it adapts to all input instances. A typical BPNN is shown in Figure 1, which consists of an arbitrary number of inputs and outputs.

Generally speaking, a BPNN can have multiple network layers. However, it has been widely accepted that a three-layer BPNN would be enough to fit the mathematical equations which approximate the mapping relationships between inputs and outputs [3]. Therefore, the topology of a BPNN usually contains three layers: the input layer, one hidden layer, and the output layer. The number of inputs in the input layer is mainly determined by the number of elements in an input eigenvector; for instance, let s denote an input instance:

s = a_1, a_2, a_3, \ldots, a_n.  (1)


Figure 1: The structure of a typical BPNN (inputs x_0, x_1, ..., x_{n−1} and outputs y_0, y_1, ..., y_{n−1}).

Then the number of inputs is n. Similarly, the number of neurons in the output layer is determined by the number of classifications, and the number of neurons in the hidden layer is determined by users. Every input of a neuron has a weight w_ij, where i and j represent the source and destination of the input. Each neuron also maintains an optional parameter θ_j, which is actually a bias for varying the activity of the jth neuron in a layer. Therefore, let o_j' denote the output from a previous neuron and let o_j denote the output of this layer; the input I_j of neurons located in both the hidden and output layers can be represented by

I_j = \sum_i w_{ij} o_{j'} + \theta_j.  (2)

The output of a neuron is usually computed by the sigmoid function, so the output o_j can be computed by

o_j = \frac{1}{1 + e^{-I_j}}.  (3)

After the feed forward process is completed, the back-propagation process starts. Let Err_j represent the error-sensitivity and let t_j represent the desirable output of neuron j in the output layer; thus

Err_j = o_j (1 - o_j)(t_j - o_j).  (4)

Let Err_k represent the error-sensitivity of one neuron in the last layer and let w_kj represent its weight; thus Err_j of a neuron in the other layers can be computed using

Err_j = o_j (1 - o_j) \sum_k Err_k w_{kj}.  (5)

After Err_j is computed, the weights and biases of each neuron are tuned in the back-propagation process using

\Delta w_{ij} = Err_j o_j,
w_{ij} = w_{ij} + \Delta w_{ij},
\Delta\theta_j = Err_j,
\theta_j = \theta_j + \Delta\theta_j.  (6)

After the first input vector finishes tuning the network, the next round starts for the following input vectors. The input keeps training the network until (7) is satisfied for a single output or (8) is satisfied for multiple outputs:

\min(E[e^2]) = \min(E[(t - o)^2]),  (7)

\min(E[e^T e]) = \min(E[(t - o)^T (t - o)]).  (8)
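To make the notation above concrete, the following is a minimal NumPy sketch of equations (2) to (6) for a three-layer network. The class name, layer sizes, learning rate, and stopping rule (a fixed number of epochs rather than the error criteria in (7) and (8)) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # equation (3): o_j = 1 / (1 + e^(-I_j))
    return 1.0 / (1.0 + np.exp(-x))

class TinyBPNN:
    """A three-layer BPNN with one hidden layer, trained instance by instance."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.1):
        # weights w_ij and biases theta_j, initialised in (-1, 1) as in Algorithms 1-3
        self.w1 = rng.uniform(-1, 1, (n_in, n_hidden))
        self.b1 = rng.uniform(-1, 1, n_hidden)
        self.w2 = rng.uniform(-1, 1, (n_hidden, n_out))
        self.b2 = rng.uniform(-1, 1, n_out)
        self.lr = lr  # learning rate eta, as used in the algorithm listings

    def forward(self, a):
        # equation (2) followed by (3), for the hidden and then the output layer
        self.o_h = sigmoid(a @ self.w1 + self.b1)
        self.o_o = sigmoid(self.o_h @ self.w2 + self.b2)
        return self.o_o

    def backward(self, a, t):
        # equation (4): error-sensitivity of the output layer
        err_o = self.o_o * (1.0 - self.o_o) * (t - self.o_o)
        # equation (5): error-sensitivity propagated back to the hidden layer
        err_h = self.o_h * (1.0 - self.o_h) * (err_o @ self.w2.T)
        # equation (6): tune weights and biases (scaled by the learning rate)
        self.w2 += self.lr * np.outer(self.o_h, err_o)
        self.b2 += self.lr * err_o
        self.w1 += self.lr * np.outer(a, err_h)
        self.b1 += self.lr * err_h

    def train(self, X, T, epochs=100):
        # keep tuning the network instance by instance, as described above
        for _ in range(epochs):
            for a, t in zip(X, T):
                self.forward(a)
                self.backward(a, t)
```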

3.2. MapReduce Computing Model. MapReduce has become the de facto standard computing model in dealing with data intensive applications using a cluster of commodity computers. Popular implementations of the MapReduce computing model include Mars [22], Phoenix [23], and the Hadoop framework [24, 25]. The Hadoop framework has been widely taken up by the community due to its open source feature. Hadoop has its Hadoop Distributed File System (HDFS) for data management. A Hadoop cluster has one name node (Namenode) and a number of data nodes (Datanodes) for running jobs. The name node manages the metadata of the cluster, whilst a data node is the actual processing node. The Map functions (mappers) and Reduce functions (reducers) run on the data nodes. When a job is submitted to a Hadoop cluster, the input data is divided into small chunks of an equal size and saved in the HDFS. In terms of data integrity, each data chunk can have one or more replicas according to the cluster configuration. In a Hadoop cluster, mappers copy and read data from either remote or local nodes based on data locality. The final output results will be sorted, merged, and generated by reducers in HDFS.
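The toy script below illustrates, purely in-process, the ⟨key, value⟩ contract described above: mappers emit intermediate pairs, the framework sorts them by key, and reducers merge the values that share a key. On a real cluster these functions would be executed by Hadoop (for example through Hadoop Streaming); the record contents here are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

def mapper(record):
    # a mapper processes one record and emits one or more <key, value> pairs
    key, value = record
    yield key, value

def reducer(key, values):
    # a reducer merges all intermediate values emitted under the same key
    return key, sorted(values)

records = [("item1", 0.2), ("item2", 0.7), ("item1", 0.3)]

# map phase
intermediate = [pair for record in records for pair in mapper(record)]
# shuffle/sort phase: MapReduce groups the intermediate pairs by key
intermediate.sort(key=itemgetter(0))
# reduce phase
for key, group in groupby(intermediate, key=itemgetter(0)):
    print(reducer(key, [value for _, value in group]))
```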

3.3. The Design of MRBPNN_1. MRBPNN_1 targets the scenario in which a BPNN has a large volume of testing data to be classified. Consider a testing instance s_i = a_1, a_2, a_3, ..., a_in, s_i ∈ S, where

(i) s_i denotes an instance;
(ii) S denotes a dataset;
(iii) in denotes the length of s_i; it also determines the number of inputs of a neural network;
(iv) the inputs are capsulated by a format of ⟨instance_k, target_k, type⟩;
(v) instance_k represents s_i, which is the input of a neural network;
(vi) target_k represents the desirable output if instance_k is a training instance;
(vii) the type field has two values, "train" and "test", which marks the type of instance_k; if the "test" value is set, the target_k field should be left empty.

Files which contain instances are saved into HDFS initially. Each file contains all the training instances and a portion of the testing instances. Therefore, the file number n determines the number of mappers to be used. The file content is the input of MRBPNN_1.

When the algorithm starts, each mapper initializes a neural network. As a result, there will be n neural networks in the cluster. Moreover, all the neural networks have exactly the same structure and parameters.


Figure 2: MRBPNN_1 architecture.

Each mapper reads data in the form of ⟨instance_k, target_k, type⟩ from a file and parses the data records. If the value of the type field is "train", instance_k is input into the input layer of the neural network. The network computes the output of each layer using (2) and (3) until the output layer generates an output, which indicates the completion of the feed forward process. And then the neural network in each mapper starts the back-propagation process. It computes and updates new weights and biases for its neurons using (4) to (6). The neural network then inputs instance_{k+1} and repeats the feed forward and back-propagation process until all the instances which are labeled as "train" are processed and the error criterion is satisfied.

Each mapper then starts classifying instances labeled as "test" by running the feed forward process. As each mapper only classifies a portion of the entire testing dataset, the efficiency is improved. At last, each mapper outputs an intermediate output in the form of ⟨instance_k, o_jm⟩, where instance_k is the key and o_jm represents the output of the mth mapper.

One reducer starts collecting and merging all the outputs of the mappers. Finally, the reducer outputs ⟨instance_k, o_jm⟩ into HDFS. In this case, o_jm represents the final classification result of instance_k. Figure 2 shows the architecture of MRBPNN_1 and Algorithm 1 shows the pseudocode.
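As a sketch of what a single MRBPNN_1 mapper does, the Hadoop Streaming style script below trains a local BPNN on the "train" records of its input file and then classifies only the "test" records, emitting ⟨instance_k, o_jm⟩ pairs. It reuses the TinyBPNN class from the sketch in Section 3.1 (here assumed to live in a module named tiny_bpnn), and the tab/comma record encoding of ⟨instance_k, target_k, type⟩ is an illustrative serialisation, not the authors' exact format.

```python
import sys
import numpy as np

from tiny_bpnn import TinyBPNN  # the class sketched in Section 3.1 (hypothetical module name)

train_X, train_T, test = [], [], []

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")        # instance <TAB> target <TAB> type
    features = np.array([float(v) for v in fields[0].split(",")])
    if fields[2] == "train":
        train_X.append(features)
        train_T.append(np.array([float(v) for v in fields[1].split(",")]))
    else:
        test.append((fields[0], features))        # keep the raw instance text as the key

# every mapper builds an identical BPNN and trains it on the full training set ...
net = TinyBPNN(n_in=len(train_X[0]), n_hidden=16, n_out=len(train_T[0]))
net.train(train_X, train_T)

# ... and then classifies only its own portion of the testing data
for key, features in test:
    output = net.forward(features)
    print(key + "\t" + ",".join(str(round(o, 4)) for o in output))
```

The single reducer then only needs to collect these ⟨instance_k, o_jm⟩ lines and write them to HDFS, since each test instance is classified by exactly one mapper.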

3.4. The Design of MRBPNN_2. MRBPNN_2 focuses on the scenario in which a BPNN has a large volume of training data. Consider a training dataset S with a number of instances. As shown in Figure 3, MRBPNN_2 divides S into n data chunks, of which each data chunk s_i is processed by a mapper for training, respectively:

S = \bigcup_{i=1}^{n} s_i, \quad \forall s \in s_i \mid s \notin s_n, \ i \neq n.  (9)

Each mapper in the Hadoop cluster maintains a BPNN, and each s_i is considered as the input training data for the neural network maintained in mapper_i. As a result, each BPNN in a mapper produces a classifier based on the trained parameters:

(mapper_i, BPNN_i, s_i) \rightarrow classifier_i.  (10)

To reduce the computation overhead, each classifier_i is trained with a part of the original training dataset. However, a critical issue is that the classification accuracy of a mapper will be significantly degraded using only a portion of the training data. To solve this issue, MRBPNN_2 employs an ensemble technique to maintain the classification accuracy by combining a number of weak learners to create a strong learner.

3.4.1. Bootstrapping. Training diverse classifiers from a single training dataset has been proven to be simple compared with the case of finding a strong learner [26]. A number of techniques exist for this purpose. A widely used technique is to resample the training dataset based on bootstrap aggregating, such as bootstrapping and majority voting. This can reduce the variance of misclassification errors and hence increase the accuracy of the classifications.

As mentioned in [26], balanced bootstrapping can reduce the variance when combining classifiers. Balanced bootstrapping ensures that each training instance appears equally often in the bootstrap samples. It might not always be the case that each bootstrapping sample contains all the training instances. The most efficient way of creating balanced bootstrap samples is to construct a string of the instances X_1, X_2, X_3, ..., X_n repeated B times so that a sequence Y_1, Y_2, Y_3, ..., Y_{Bn} is achieved. A random permutation p of the integers from 1 to Bn is taken. Therefore, the first bootstrapping sample can be created from Y_{p(1)}, Y_{p(2)}, Y_{p(3)}, ..., Y_{p(n)}. In addition, the second bootstrapping sample is created from Y_{p(n+1)}, Y_{p(n+2)}, Y_{p(n+3)}, ..., Y_{p(2n)}, and the process continues until Y_{p((B−1)n+1)}, Y_{p((B−1)n+2)}, Y_{p((B−1)n+3)}, ..., Y_{p(Bn)} forms the Bth bootstrapping sample. The bootstrapping samples can be used in bagging to increase the accuracy of classification.
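A small sketch of the balanced bootstrap construction described above follows: the index sequence 1..n is repeated B times, randomly permuted, and cut into B samples of n instances each, so that every training instance appears exactly B times across the samples. The function name and the NumPy-based implementation are illustrative.

```python
import numpy as np

def balanced_bootstrap(instances, B, seed=0):
    """Return B bootstrap samples in which each instance appears exactly B times overall."""
    n = len(instances)
    rng = np.random.default_rng(seed)
    repeated = np.tile(np.arange(n), B)      # the sequence Y_1 ... Y_Bn (as indices)
    permuted = rng.permutation(repeated)     # a random permutation p of the Bn positions
    # the b-th sample is the b-th slice of n permuted positions
    return [[instances[i] for i in permuted[b * n:(b + 1) * n]] for b in range(B)]

samples = balanced_bootstrap(list(range(10)), B=3)
print(samples)  # three samples of size 10; each of the 10 instances occurs 3 times in total
```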

3.4.2. Majority Voting. This type of ensemble classifier performs classifications based on the majority votes of the base classifiers [26]. Let us define the prediction of the ith classifier P_i as P_{ij} ∈ {1, 0}, i = 1, ..., I and j = 1, ..., c, where I is the number of classifiers and c is the number of classes. If the ith classifier chooses class j, then P_{ij} = 1; otherwise P_{ij} = 0. Then the ensemble prediction for class k is computed using

P_{ik} = \max_{j=1}^{c} \sum_{i=1}^{I} P_{ij}.  (11)
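Equation (11) amounts to counting the votes per class and choosing the class with the most votes; a minimal sketch, with an illustrative 0/1 prediction matrix, is given below.

```python
import numpy as np

def majority_vote(P):
    """P is an (I, c) 0/1 matrix: P[i][j] = 1 if base classifier i predicts class j."""
    votes = P.sum(axis=0)          # sum_i P_ij, the vote count of each class j
    return int(np.argmax(votes))   # the class with the maximum vote count, as in (11)

# example: two of three classifiers vote for class 1
P = np.array([[0, 1, 0],
              [0, 1, 0],
              [1, 0, 0]])
print(majority_vote(P))            # -> 1
```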

3.4.3. Algorithm Design. At the beginning, MRBPNN_2 employs balanced bootstrapping to generate a number of subsets of the entire training dataset:

balanced bootstrapping \rightarrow S_1, S_2, S_3, \ldots, S_n, \quad \bigcup_{i=1}^{n} S_i = S,  (12)

where S_i represents the ith subset which belongs to the entire dataset S and n represents the total number of subsets.

Each S_i is saved in one file in HDFS. Each instance s_k = a_1, a_2, a_3, ..., a_in, s_k ∈ S_i, is defined in the format of ⟨instance_k, target_k, type⟩, where

(i) instance_k represents one bootstrapped instance s_k, which is the input of the neural network;
(ii) in represents the number of inputs of the neural network;
(iii) target_k represents the desirable output if instance_k is a training instance;
(iv) the type field has two values, "train" and "test", which marks the type of instance_k; if the "test" value is set, the target_k field should be left empty.


Figure 3: MRBPNN_2 architecture.

Input: S, T; Output: C; m mappers, one reducer
(1) Each mapper constructs one BPNN with in inputs, o outputs, and h neurons in the hidden layer
(2) Initialize w_ij = random1_ij ∈ (−1, 1), θ_j = random2_j ∈ (−1, 1)
(3) For all s ∈ S, s_i = a_1, a_2, a_3, ..., a_in:
    input a_i → in_i; neuron j in the hidden layer computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^{−I_jh})
(4) Input o_jh → out_i; neuron j in the output layer computes
    I_jo = Σ_i o_jh · w_ij + θ_j,  o_jo = 1 / (1 + e^{−I_jo})
(5) At each output compute Err_jo = o_jo (1 − o_jo)(target_j − o_jo)
(6) In the hidden layer compute Err_jh = o_jh (1 − o_jh) Σ_{i=1}^{o} Err_i w_io
(7) Update w_ij = w_ij + η · Err_j · o_j, θ_j = θ_j + η · Err_j
    Repeat (3), (4), (5), (6), (7) until min(E[e^2]) = min(E[(target_j − o_jo)^2]); training terminates
(8) Divide T into T_1, T_2, T_3, ..., T_m, ∪_{i=1}^{m} T_i = T
(9) Each mapper inputs t_j = a_1, a_2, a_3, ..., a_in, t_j ∈ T_i
(10) Execute (3), (4)
(11) The mapper outputs ⟨t_j, o_j⟩
(12) The reducer collects and merges all ⟨t_j, o_j⟩
    Repeat (9), (10), (11), (12) until T is traversed
(13) The reducer outputs C
End

Algorithm 1: MRBPNN_1.


When MRBPNN_2 starts, each mapper constructs one BPNN and initializes weights and biases with random values between −1 and 1 for its neurons. And then a mapper inputs one record in the form of ⟨instance_k, target_k, type⟩ from the input file.

The mapper firstly parses the data and retrieves the type of the instance. If the type value is "train", the instance is fed into the input layer. Secondly, each neuron in the different layers computes its output using (2) and (3) until the output layer generates an output, which indicates the completion of the feed forward process. Each mapper starts a back-propagation process and computes and updates weights and biases for its neurons using (4) to (6). The training process finishes when all the instances marked as "train" are processed and the error is satisfied. All the mappers then start feed forwarding to classify the testing dataset. In this case, each neural network in a mapper generates the classification result of an instance at the output layer. Each mapper generates an intermediate output in the form of ⟨instance_k, o_jm⟩, where instance_k is the key and o_jm represents the outputs of the mth mapper.

Finally, a reducer collects the outputs of all the mappers. The outputs with the same key are merged together. The reducer runs majority voting using (11) and outputs the result of instance_k into HDFS in the form of ⟨instance_k, r_k⟩, where r_k represents the voted classification result of instance_k. Figure 3 shows the algorithm architecture and Algorithm 2 presents the pseudocode of MRBPNN_2.

3.5. The Design of MRBPNN_3. MRBPNN_3 aims at the scenario in which a BPNN has a large number of neurons. The algorithm enables an entire MapReduce cluster to maintain one neural network across it. Therefore, each mapper holds one or several neurons.

There are a number of iterations in the algorithm. With l layers, MRBPNN_3 employs l − 1 MapReduce jobs to implement the iterations. The feed forward process runs in l − 1 rounds, whilst the back-propagation process occurs only in the last round. A data format in the form of ⟨index_k, instance_n, w_ij, θ_j, target_n, w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j⟩ has been designed to guarantee the data passing between Map and Reduce operations, where

(i) index_k represents the kth reducer;
(ii) instance_n represents the nth training or testing instance of the dataset; one instance is in the form of instance_n = a_1, a_2, ..., a_in, where in is the length of the instance;
(iii) w_ij represents a set of weights of the input layer, whilst θ_j represents the biases of the neurons in the first hidden layer;
(iv) target_n represents the encoded desirable output of a training instance instance_n;
(v) the list w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j represents the weights and biases of the next layers; it can be extended based on the layers of the network; for a standard three-layer neural network this option becomes w_ijo, θ_jo.

Before MRBPNN_3 starts, each instance and the information defined by the data format are saved in one file in HDFS. The number of layers is determined by the length of the w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j field. The number of neurons in the next layer is determined by the number of files in the input folder. Generally, different from MRBPNN_1 and MRBPNN_2, MRBPNN_3 does not initialize an explicit neural network; instead, it maintains the network parameters based on the data defined in the data format.

When MRBPNN_3 starts, each mapper initially inputs one record from HDFS. And then it computes the output of a neuron using (2) and (3). The output is generated by a mapper which labels index_k as the key and the neuron's output as the value in the form of ⟨index_k, o_j, w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j, target_n⟩, where o_j represents the neuron's output.

Parameter index_k can guarantee that the kth reducer collects the output, which maintains the neural network structure. It should be mentioned that if the record is the first one processed by MRBPNN_3, w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j will also be initialized with random values between −1 and 1 by the mappers. The kth reducer collects the results from the mappers in the form of ⟨index_k′, o_j, w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j, target_n⟩.

These k reducers generate k outputs. The index_k′ of the reducer output explicitly tells the k′th mapper to start processing this output file. Therefore, the number of neurons in the next layer can be determined by the number of reducer output files, which are the input data for the next layer of neurons. Subsequently, mappers start processing their corresponding inputs by computing (2) and (3) using w^2_ij and θ^2_j.

The above steps keep looping until reaching the last round. The processing of this last round consists of two steps. The first step is that the mappers also process ⟨index_k′, o_j, w^{l−1}_ij, θ^{l−1}_j, target_n⟩, compute the neurons' outputs, and generate results in the form of ⟨o_j, target_n⟩. One reducer collects the output results of all the mappers in the form of ⟨o_j1, o_j2, o_j3, ..., o_jk, target_n⟩. In the second step, the reducer executes the back-propagation process. The reducer computes new weights and biases for each layer using (4) to (6). MRBPNN_3 retrieves the previous outputs, weights, and biases from the input files of the mappers, and then it writes the updated weights and biases w_ij, θ_j, w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j into the initial input file in the form of ⟨index_k, instance_n, w_ij, θ_j, target_n, w^2_ij, θ^2_j, ..., w^{l−1}_ij, θ^{l−1}_j⟩. The reducer then reads the second instance in the form of ⟨instance_{n+1}, target_{n+1}⟩, for which the fields instance_n and target_n in the input file are replaced by instance_{n+1} and target_{n+1}. The training process continues until all the instances are processed and the error is satisfied.


Input: S, T; Output: C; n mappers, one reducer
(1) Each mapper constructs one BPNN with in inputs, o outputs, and h neurons in the hidden layer
(2) Initialize w_ij = random1_ij ∈ (−1, 1), θ_j = random2_j ∈ (−1, 1)
(3) Bootstrap S_1, S_2, S_3, ..., S_n, ∪_{i=1}^{n} S_i = S
(4) Each mapper inputs s_k = a_1, a_2, a_3, ..., a_in, s_k ∈ S_i:
    input a_i → in_i; neuron j in the hidden layer computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^{−I_jh})
(5) Input o_jh → out_i; neuron j in the output layer computes
    I_jo = Σ_i o_jh · w_ij + θ_j,  o_jo = 1 / (1 + e^{−I_jo})
(6) At each output compute Err_jo = o_jo (1 − o_jo)(target_j − o_jo)
(7) In the hidden layer compute Err_jh = o_jh (1 − o_jh) Σ_{i=1}^{o} Err_i w_io
(8) Update w_ij = w_ij + η · Err_j · o_j, θ_j = θ_j + η · Err_j
    Repeat (4), (5), (6), (7), (8) until min(E[e^2]) = min(E[(target_j − o_jo)^2]); training terminates
(9) Each mapper inputs t_i = a_1, a_2, a_3, ..., a_in, t_i ∈ T
(10) Execute (4), (5)
(11) The mapper outputs ⟨t_j, o_j⟩
(12) The reducer collects ⟨t_j, o_jm⟩, m = (1, 2, ..., n)
(13) For each t_j compute C = max_{j=1}^{c} Σ_{i=1}^{I} o_jm
    Repeat (9), (10), (11), (12), (13) until T is traversed
(14) The reducer outputs C
End

Algorithm 2: MRBPNN_2.

For classification, MRBPNN_3 only needs to run the feed forward process and collect the reducer output in the form of ⟨o_j, target_n⟩. Figure 4 shows the three-layer architecture of MRBPNN_3 and Algorithm 3 presents the pseudocode.
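A sketch of the per-neuron mapper step of MRBPNN_3 in Hadoop Streaming style is given below: each record carries everything the neuron needs, the mapper evaluates equations (2) and (3), and the result is emitted keyed by index_k so that the kth reducer can reassemble the layer. The tab/comma serialisation of ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩ is an illustrative encoding, not the exact format used by the authors.

```python
import sys
import math

for line in sys.stdin:
    # index_k <TAB> instance_n <TAB> w_ij <TAB> theta_j <TAB> target_n <TAB> next-layer params
    index_k, instance, weights, theta, target, next_params = line.rstrip("\n").split("\t")
    a = [float(v) for v in instance.split(",")]
    w = [float(v) for v in weights.split(",")]
    # equation (2): weighted sum of the neuron's inputs plus its bias ...
    I_j = sum(a_i * w_i for a_i, w_i in zip(a, w)) + float(theta)
    # ... and equation (3): the sigmoid output of this neuron
    o_j = 1.0 / (1.0 + math.exp(-I_j))
    # emit keyed by index_k; the next layer's parameters and the target ride along
    print(f"{index_k}\t{o_j}\t{next_params}\t{target}")
```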

4. Performance Evaluation

We have implemented the three parallel BPNNs using Hadoop, an open source implementation framework of the MapReduce computing model. An experimental Hadoop cluster was built to evaluate the performance of the algorithms. The cluster consisted of 5 computers, in which 4 nodes are Datanodes and the remaining one is the Namenode. The cluster details are listed in Table 1.

Two testing datasets were prepared for evaluations. The first dataset is a synthetic dataset. The second is the Iris dataset, which is a published machine learning benchmark dataset [27]. Table 2 shows the details of the two datasets.

Table 1: Cluster details.

Namenode: CPU Core i7, 3 GHz; Memory 8 GB; SSD 750 GB; OS Fedora
Datanodes: CPU Core i7, 3.8 GHz; Memory 32 GB; SSD 250 GB; OS Fedora
Network bandwidth: 1 Gbps
Hadoop version: 2.3.0, 32 bits


Input: S, T; Output: C; m mappers, m reducers
Initially each mapper inputs ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩, k = (1, 2, ..., m)
(1) For instance_n = a_1, a_2, a_3, ..., a_in:
    input a_i → in_i; the neuron in the mapper computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^{−I_jh})
(2) The mapper outputs ⟨index_k, o_jh, w_ijo, θ_jo, target_n⟩, k = (1, 2, ..., m)
(3) The kth reducer collects the output and outputs ⟨index_k′, o_jh, w_ijo, θ_jo, target_n⟩, k′ = (1, 2, ..., m)
(4) Mapper k′ inputs ⟨index_k′, o_jh, w_ijo, θ_jo, target_n⟩, executes the computation in (1) using w_ijo and θ_jo, and outputs ⟨o_jo, target_n⟩
(5) The mth reducer collects and merges all ⟨o_jo, target_n⟩ from the m mappers; feed forward terminates
(6) In the mth reducer compute
    Err_jo = o_jo (1 − o_jo)(target_n − o_jo)
    Using the HDFS API retrieve w_ij, θ_j, o_jh, instance_n
    Compute Err_jh = o_jh (1 − o_jh) Σ_{i=1}^{o} Err_i w_io
(7) Update w_ij, θ_j, w_ijo, θ_jo:
    w_ij = w_ij + η · Err_j · o_j, θ_j = θ_j + η · Err_j
    into ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩; back-propagation terminates
(8) Retrieve instance_{n+1} and target_{n+1} and update them into ⟨index_k, instance_{n+1}, w_ij, θ_j, target_{n+1}, w_ijo, θ_jo⟩
    Repeat (1), (2), (3), (4), (5), (6), (7), (8) until min(E[e^2]) = min(E[(target_j − o_jo)^2]); training terminates
(9) Each mapper inputs ⟨index_k, instance_t, w_ij, θ_j, target_n, w_ijo, θ_jo⟩, instance_t ∈ T
(10) Execute (1), (2), (3), (4), (5) and output ⟨o_jo, target_n⟩
(11) The mth reducer outputs C
End

Algorithm 3: MRBPNN_3.

Table 2: Dataset details.

Data type        Instance number    Instance length    Element range    Class number
Synthetic data   200                32                 0 and 1          4
Iris data        150                4                  (0, 8)           3

We implemented a three-layer neural network with 16 neurons in the hidden layer. The Hadoop cluster was configured with 16 mappers and 16 reducers. The number of instances was varied from 10 to 1000 for evaluating the precision of the algorithms. The size of the datasets was varied from 1 MB to 1 GB for evaluating the computation efficiency of the algorithms. Each experiment was executed five times and the final result was an average. The precision p is computed using

p = \frac{r}{r + w} \times 100\%,  (13)

where r represents the number of correctly recognized instances, w represents the number of wrongly recognized instances, and p represents the precision.


Figure 4: MRBPNN_3 structure.

Figure 5: The precision of MRBPNN_1 on the two datasets (precision (%) versus the number of training instances). (a) The synthetic dataset. (b) The Iris dataset.

4.1. Classification Precision. The classification precision of MRBPNN_1 was evaluated using a varied number of training instances. The maximum number of training instances was 1000, whilst the maximum number of testing instances was also 1000. The large number of instances is based on data duplication. Figure 5 shows the precision results of MRBPNN_1 in classification using 10 mappers. It can be observed that the precision keeps increasing with an increase in the number of training instances. Finally, the precision of MRBPNN_1 on the synthetic dataset reaches 100%, while the precision on the Iris dataset reaches 97.5%. In this test, the behavior of the parallel MRBPNN_1 is quite similar to that of the standalone BPNN. The reason is that MRBPNN_1 does not distribute the BPNN among the Hadoop nodes; instead, it runs on Hadoop to distribute the data.

To evaluate MRBPNN_2, we designed 1000 training instances and 1000 testing instances using data duplication. The mappers were trained by subsets of the training instances and produced the classification results of the 1000 testing instances based on bootstrapping and majority voting. MRBPNN_2 employed 10 mappers, each of which inputs a number of training instances varying from 10 to 1000. Figure 6 presents the precision results of MRBPNN_2 on the two testing datasets. It also shows that, along with the increasing number of training instances in each subneural network, the achieved precision based on majority voting keeps increasing. The precision of MRBPNN_2 on the synthetic dataset reaches 100%, whilst the precision on the Iris dataset reaches 97.5%, which is higher than that of MRBPNN_1.

MRBPNN_3 implements a fully parallel and distributed neural network using Hadoop to deal with a complex neural network with a large number of neurons. Figure 7 shows the performance of MRBPNN_3 using 16 mappers. The precision also increases along with the increasing number of training instances for both datasets. It can also be observed that the stability of the curve is quite similar to that of MRBPNN_1. Both curves have more fluctuations than that of MRBPNN_2.


Figure 6: The precision of MRBPNN_2 on the two datasets (precision (%) versus the number of instances in each sub-BPNN). (a) The synthetic dataset. (b) The Iris dataset.

Figure 7: The precision of MRBPNN_3 on the two datasets (precision (%) versus the number of training instances). (a) The synthetic dataset. (b) The Iris dataset.

Figure 8 compares the overall precision of the three parallel BPNNs. MRBPNN_1 and MRBPNN_3 perform similarly, whereas MRBPNN_2 performs the best using bootstrapping and majority voting. In addition, the precision of MRBPNN_2 in classification is more stable than that of both MRBPNN_1 and MRBPNN_3.

Figure 9 presents the stability of the three algorithms on the synthetic dataset, showing that the precision of MRBPNN_2 in classification is highly stable compared with that of both MRBPNN_1 and MRBPNN_3.

4.2. Computation Efficiency. A number of experiments were carried out in terms of computation efficiency using the synthetic dataset. The first experiment was to evaluate the efficiency of MRBPNN_1 using 16 mappers. The volume of data instances was varied from 1 MB to 1 GB. Figure 10 clearly shows that the parallel MRBPNN_1 significantly outperforms the standalone BPNN. The computation overhead of the standalone BPNN is low when the data size is less than 16 MB. However, the overhead of the standalone BPNN increases sharply with increasing data sizes. This is mainly because MRBPNN_1 distributes the testing data onto 4 data nodes in the Hadoop cluster, which run in parallel in classification.

Figure 11 shows the computation efficiency of MRBPNN_2 using 16 mappers. It can be observed that when the data size is small, the standalone BPNN performs better. However, the computation overhead of the standalone BPNN increases rapidly when the data size is larger than 64 MB. Similar to MRBPNN_1, the parallel MRBPNN_2 scales with increasing data sizes using the Hadoop framework.

Figure 12 shows the computation overhead of MRBPNN_3 using 16 mappers. MRBPNN_3 incurs a higher overhead than both MRBPNN_1 and MRBPNN_2. The reason is that both MRBPNN_1 and MRBPNN_2 run training and classification within one MapReduce job, which means mappers and reducers only need to start once. However, MRBPNN_3 contains a number of jobs. The algorithm has to start mappers and reducers a number of times. This process incurs a large system overhead, which affects its computation efficiency. Nevertheless, Figure 12 shows the feasibility of fully distributing a BPNN in dealing with a complex neural network with a large number of neurons.


Figure 8: Precision comparison of the three parallel BPNNs. (a) The synthetic dataset. (b) The Iris dataset.

Figure 9: The stability of the three parallel BPNNs (precision (%) versus the number of tests).

Figure 10: Computation efficiency of MRBPNN_1 (execution time (s) versus data volume (MB), standalone BPNN versus MRBPNN_1).

Figure 11: Computation efficiency of MRBPNN_2 (execution time (s) versus data volume (MB), standalone BPNN versus MRBPNN_2).

Figure 12: Computation efficiency of MRBPNN_3 (training efficiency (s) versus volume of data (MB)).

5. Conclusion

In this paper, we have presented three parallel neural networks (MRBPNN_1, MRBPNN_2, and MRBPNN_3) based on the MapReduce computing model in dealing with data intensive scenarios in terms of the size of the classification dataset, the size of the training dataset, and the number of neurons, respectively. Overall, the experimental results have shown that the computation overhead can be significantly reduced using a number of computers in parallel. MRBPNN_3 shows the feasibility of fully distributing a BPNN in a computer cluster but incurs a high overhead of computation due to the continuous starting and stopping of mappers and reducers in the Hadoop environment. One of the future works is to research in-memory processing to further enhance the computation efficiency of MapReduce in dealing with data intensive tasks with many iterations.



Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors would like to appreciate the support from the National Science Foundation of China (no. 51437003).

References

[1] Networked European Software and Services Initiative (NESSI), "Big data: a new world of opportunities," NESSI White Paper, 2012, http://www.nessi-europe.com/Files/Private/NESSI_WhitePaper_BigData.pdf.
[2] P. C. Zikopoulos, C. Eaton, D. deRoos, T. Deutsch, and G. Lapis, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill, 2012.
[3] M. H. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS Publishing, 1996.
[4] R. Gu, F. Shen, and Y. Huang, "A parallel computing platform for training large scale neural networks," in Proceedings of the IEEE International Conference on Big Data, pp. 376–384, October 2013.
[5] V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing, Benjamin Cummings/Addison Wesley, San Francisco, Calif, USA, 2002.
[6] Message Passing Interface, 2015, http://www.mcs.anl.gov/research/projects/mpi.
[7] L. N. Long and A. Gupta, "Scalable massively parallel artificial neural networks," Journal of Aerospace Computing, Information, and Communication, vol. 5, no. 1, pp. 3–15, 2008.
[8] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[9] Y. Liu, M. Li, M. Khan, and M. Qi, "A MapReduce based distributed LSI for scalable information retrieval," Computing and Informatics, vol. 33, no. 2, pp. 259–280, 2014.
[10] N. K. Alham, M. Li, Y. Liu, and M. Qi, "A MapReduce-based distributed SVM ensemble for scalable image classification and annotation," Computers and Mathematics with Applications, vol. 66, no. 10, pp. 1920–1934, 2013.

[11] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of back propagation neural network in the classification of high resolution remote sensing image: take remote sensing image of Beijing for instance," in Proceedings of the 18th International Conference on Geoinformatics, pp. 1–6, June 2010.
[12] N. L. D. Khoa, K. Sakakibara, and I. Nishikawa, "Stock price forecasting using back propagation neural networks with time and profit based adjusted weight factors," in Proceedings of the SICE-ICASE International Joint Conference, pp. 5484–5488, IEEE, Busan, Republic of Korea, October 2006.
[13] M. Rizwan, M. Jamil, and D. P. Kothari, "Generalized neural network approach for global solar energy estimation in India," IEEE Transactions on Sustainable Energy, vol. 3, no. 3, pp. 576–584, 2012.
[14] Y. Wang, B. Li, R. Luo, Y. Chen, N. Xu, and H. Yang, "Energy efficient neural networks for big data analytics," in Proceedings of the 17th International Conference on Design, Automation and Test in Europe (DATE '14), March 2014.
[15] D. Nguyen and B. Widrow, "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights," in Proceedings of the International Joint Conference on Neural Networks, vol. 3, pp. 21–26, June 1990.
[16] H. Kanan and M. Khanian, "Reduction of neural network training time using an adaptive fuzzy approach in real time applications," International Journal of Information and Electronics Engineering, vol. 2, no. 3, pp. 470–474, 2012.
[17] A. A. Ikram, S. Ibrahim, M. Sardaraz, M. Tahir, H. Bajwa, and C. Bach, "Neural network based cloud computing platform for bioinformatics," in Proceedings of the 9th Annual Conference on Long Island Systems, Applications and Technology (LISAT '13), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2013.
[18] V. Rao and S. Rao, "Application of artificial neural networks in capacity planning of cloud based IT infrastructure," in Proceedings of the 1st IEEE International Conference on Cloud Computing for Emerging Markets (CCEM '12), pp. 38–41, October 2012.
[19] A. A. Huqqani, E. Schikuta, and E. Mann, "Parallelized neural networks as a service," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '14), pp. 2282–2289, July 2014.
[20] J. Yuan and S. Yu, "Privacy preserving back-propagation neural network learning made practical with cloud computing," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 1, pp. 212–221, 2014.
[21] Z. Liu, H. Li, and G. Miao, "MapReduce-based back propagation neural network over large scale mobile data," in Proceedings of the 6th International Conference on Natural Computation (ICNC '10), pp. 1726–1730, August 2010.

[22] B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang, "Mars: a MapReduce framework on graphics processors," in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT '08), pp. 260–269, October 2008.
[23] K. Taura, K. Kaneda, T. Endo, and A. Yonezawa, "Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources," ACM SIGPLAN Notices, vol. 38, no. 10, pp. 216–229, 2003.
[24] Apache Hadoop, 2015, http://hadoop.apache.org/.
[25] J. Venner, Pro Hadoop, Springer, New York, NY, USA, 2009.
[26] N. K. Alham, Parallelizing support vector machines for scalable image annotation [Ph.D. thesis], Brunel University, Uxbridge, UK, 2011.
[27] The Iris Dataset, 2015, https://archive.ics.uci.edu/ml/datasets/Iris.


Page 2: Research Article MapReduce Based Parallel Neural Networks in …downloads.hindawi.com/journals/cin/2015/297672.pdf · 2019. 7. 31. · Research Article MapReduce Based Parallel Neural

2 Computational Intelligence and Neuroscience

scales well with ever increasing sizes of datasets due to its use of hash keys for data processing and the strategy of moving computation to the closest data nodes. In MapReduce there are mainly two functions, which are the Map function (mapper) and the Reduce function (reducer). Basically, a mapper is responsible for the actual data processing and generates intermediate results in the form of ⟨key, value⟩ pairs. A reducer collects the output results from multiple mappers with secondary processing, including sorting and merging the intermediate results based on the key values. Finally, the Reduce function generates the computation results.

We present three MRBPNNs (i.e., MRBPNN_1, MRBPNN_2, and MRBPNN_3) to deal with different data intensive scenarios. MRBPNN_1 deals with a scenario in which the dataset to be classified is large. The input dataset is segmented into a number of data chunks which are processed by mappers in parallel. In this scenario, each mapper builds the same BPNN classifier using the same set of training data. MRBPNN_2 focuses on a scenario in which the volume of the training data is large. In this case, the training data is segmented into data chunks which are processed by mappers in parallel. Each mapper still builds the same BPNN but uses only a portion of the training dataset to train the BPNN. To maintain a high accuracy in classification, we employ a bagging based ensemble technique [10] in MRBPNN_2. MRBPNN_3 targets a scenario in which the number of neurons in a BPNN is large. In this case, MRBPNN_3 fully parallelizes and distributes the BPNN among the mappers in such a way that each mapper employs a portion of the neurons for training.

The rest of the paper is organized as follows. Section 2 gives a review of the related work. Section 3 presents the designs and implementations of the three parallel BPNNs using the MapReduce model. Section 4 evaluates the performance of the parallel BPNNs and analyzes the experimental results. Section 5 concludes the paper.

2. Related Work

ANNs have been widely applied in various pattern recognition and classification applications. For example, Jiang et al. [11] employed a back-propagation neural network to classify high resolution remote sensing images to recognize roads and roofs in the images. Khoa et al. [12] proposed a method to forecast the stock price using BPNN.

Traditionally, ANNs are employed to deal with a small volume of data. With the emergence of big data, ANNs have become computationally intensive for data intensive applications, which limits their wide application. Rizwan et al. [13] employed a neural network for global solar energy estimation. They considered the research a big task, as traditional approaches are based on extreme simplicity of the parameterizations. A neural network was designed which contains a large number of neurons and layers for complex function approximation and data processing. The authors reported that in this case the training time will be severely affected. Wang et al. [14] pointed out that large scale neural networks are currently one of the mainstream tools for big data analytics. The challenge in processing big data with large scale neural networks covers two phases, namely the training phase and the operation phase. To speed up the computations of neural networks, there are some efforts that try to improve the selection of initial weights [15] or control the learning parameters [16] of neural networks. Recently, researchers have started utilizing parallel and distributed computing technologies such as cloud computing to solve the computation bottleneck of a large neural network [17–19]. Yuan and Yu [20] employed cloud computing mainly for the exchange of privacy data in a BPNN implementation for processing ciphered text classification tasks. However, cloud computing as a computing paradigm simply offers infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). It is worth noting that cloud computing still needs big data processing models such as the MapReduce model to deal with data intensive applications. Gu et al. [4] presented a parallel neural network using in-memory data processing techniques to speed up the computation of the neural network, but without considering the accuracy aspect of the implemented parallel neural network. In this work the training data is simply segmented into data chunks which are processed in parallel. Liu et al. [21] presented a MapReduce based parallel BPNN for processing a large set of mobile data. This work further employs AdaBoosting to accommodate the loss of accuracy of the parallelized neural network. However, the computationally intensive issue may exist not only at the training phase but also at the classification phase. In addition, although AdaBoosting is a popular sampling technique, it may enlarge the weights of wrongly classified instances, which would deteriorate the accuracy of the algorithm.

3. Parallelizing Neural Networks

This section presents the design details of the parallelized MRBPNN_1, MRBPNN_2, and MRBPNN_3. First, a brief review of BPNN is given.

3.1. Back-Propagation Neural Network. A back-propagation neural network is a multilayer feed forward network which is trained with an error back-propagation mechanism. It has become one of the most widely used neural networks. A BPNN can perform a large volume of input-output mappings without knowing their exact mathematical equations, which benefits from the gradient-descent feature of its back-propagation mechanism. During the error propagation, the BPNN keeps tuning the parameters of the network until it adapts to all input instances. A typical BPNN is shown in Figure 1, which consists of an arbitrary number of inputs and outputs.

Generally speaking, a BPNN can have multiple network layers. However, it has been widely accepted that a three-layer BPNN is enough to fit the mathematical equations which approximate the mapping relationships between inputs and outputs [3]. Therefore, the topology of a BPNN usually contains three layers: an input layer, one hidden layer, and an output layer. The number of inputs in the input layer is mainly determined by the number of elements in an input eigenvector; for instance, let s denote an input instance:

$$ s = \{a_1, a_2, a_3, \ldots, a_n\}. \quad (1) $$

Figure 1: The structure of a typical BPNN.

Then the number of inputs is n. Similarly, the number of neurons in the output layer is determined by the number of classifications, and the number of neurons in the hidden layer is determined by users. Every input of a neuron has a weight w_ij, where i and j represent the source and the destination of the input. Each neuron also maintains an optional parameter θ_j, which is actually a bias for varying the activity of the jth neuron in a layer. Therefore, let o_j′ denote the output from a previous neuron and let o_j denote the output of this layer; the input I_j of the neurons located in both the hidden and output layers can be represented by

$$ I_j = \sum_i w_{ij} o_{j'} + \theta_j. \quad (2) $$

The output of a neuron is usually computed by the sigmoid function, so the output o_j can be computed by

$$ o_j = \frac{1}{1 + e^{-I_j}}. \quad (3) $$

After the feed forward process is completed, the back-propagation process starts. Let Err_j represent the error-sensitivity and let t_j represent the desirable output of neuron j in the output layer; thus

$$ \mathrm{Err}_j = o_j (1 - o_j)(t_j - o_j). \quad (4) $$

Let Err_k represent the error-sensitivity of one neuron in the last layer and let w_kj represent its weight; thus Err_j of a neuron in the other layers can be computed using

$$ \mathrm{Err}_j = o_j (1 - o_j) \sum_k \mathrm{Err}_k w_{kj}. \quad (5) $$

After Err_j is computed, the weights and biases of each neuron are tuned in the back-propagation process using

$$ \Delta w_{ij} = \mathrm{Err}_j\, o_j, \qquad w_{ij} = w_{ij} + \Delta w_{ij}, \qquad \Delta\theta_j = \mathrm{Err}_j, \qquad \theta_j = \theta_j + \Delta\theta_j. \quad (6) $$

After the first input vector finishes tuning the network, the next round starts for the following input vectors. The input keeps training the network until (7) is satisfied for a single output or (8) is satisfied for multiple outputs:

$$ \min\left(E[e^2]\right) = \min\left(E\left[(t - o)^2\right]\right), \quad (7) $$

$$ \min\left(E[e^{T} e]\right) = \min\left(E\left[(t - o)^{T}(t - o)\right]\right). \quad (8) $$
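
To make equations (2)–(6) concrete, the following minimal Python sketch runs one feed forward and one back-propagation step of a three-layer BPNN with sigmoid activations. The class name, network sizes, and the learning rate are illustrative assumptions rather than the paper's implementation; the learning rate η is included as in Algorithm 1.

```python
import math
import random

def sigmoid(x):
    # Equation (3): o_j = 1 / (1 + e^(-I_j))
    return 1.0 / (1.0 + math.exp(-x))

class ThreeLayerBPNN:
    def __init__(self, n_in, n_hidden, n_out, lr=0.5):
        rnd = lambda: random.uniform(-1.0, 1.0)
        # Weights w_ij and biases theta_j, initialised in (-1, 1).
        self.w_ih = [[rnd() for _ in range(n_hidden)] for _ in range(n_in)]
        self.th_h = [rnd() for _ in range(n_hidden)]
        self.w_ho = [[rnd() for _ in range(n_out)] for _ in range(n_hidden)]
        self.th_o = [rnd() for _ in range(n_out)]
        self.lr = lr

    def feed_forward(self, x):
        # Equation (2): I_j = sum_i w_ij * o_j' + theta_j, then equation (3).
        self.h = [sigmoid(sum(x[i] * self.w_ih[i][j] for i in range(len(x))) + self.th_h[j])
                  for j in range(len(self.th_h))]
        self.o = [sigmoid(sum(self.h[i] * self.w_ho[i][j] for i in range(len(self.h))) + self.th_o[j])
                  for j in range(len(self.th_o))]
        return self.o

    def back_propagate(self, x, target):
        # Equation (4): error sensitivity of the output neurons.
        err_o = [self.o[j] * (1 - self.o[j]) * (target[j] - self.o[j]) for j in range(len(self.o))]
        # Equation (5): error sensitivity of the hidden neurons.
        err_h = [self.h[j] * (1 - self.h[j]) * sum(err_o[k] * self.w_ho[j][k] for k in range(len(err_o)))
                 for j in range(len(self.h))]
        # Equation (6): tune weights and biases (learning rate added as in Algorithm 1).
        for i in range(len(self.h)):
            for j in range(len(err_o)):
                self.w_ho[i][j] += self.lr * err_o[j] * self.h[i]
        for j in range(len(err_o)):
            self.th_o[j] += self.lr * err_o[j]
        for i in range(len(x)):
            for j in range(len(err_h)):
                self.w_ih[i][j] += self.lr * err_h[j] * x[i]
        for j in range(len(err_h)):
            self.th_h[j] += self.lr * err_h[j]

# Usage: one training step on a toy two-input instance.
net = ThreeLayerBPNN(n_in=2, n_hidden=4, n_out=1)
net.feed_forward([1.0, 0.0])
net.back_propagate([1.0, 0.0], [1.0])
```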

3.2. MapReduce Computing Model. MapReduce has become the de facto standard computing model for dealing with data intensive applications using a cluster of commodity computers. Popular implementations of the MapReduce computing model include Mars [22], Phoenix [23], and the Hadoop framework [24, 25]. The Hadoop framework has been widely taken up by the community due to its open source feature. Hadoop has its Hadoop Distributed File System (HDFS) for data management. A Hadoop cluster has one name node (Namenode) and a number of data nodes (Datanodes) for running jobs. The name node manages the metadata of the cluster, whilst a data node is the actual processing node. The Map functions (mappers) and Reduce functions (reducers) run on the data nodes. When a job is submitted to a Hadoop cluster, the input data is divided into small chunks of an equal size and saved in the HDFS. In terms of data integrity, each data chunk can have one or more replicas according to the cluster configuration. In a Hadoop cluster, mappers copy and read data from either remote or local nodes based on data locality. The final output results are sorted, merged, and generated by reducers in HDFS.
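
The ⟨key, value⟩ flow described above can be illustrated with a plain-Python simulation (not the Hadoop API): mappers emit pairs, a shuffle step groups values by key, and reducers merge each group.

```python
from collections import defaultdict

def run_mapreduce(chunks, mapper, reducer):
    """Toy MapReduce driver: mappers emit (key, value) pairs,
    the shuffle groups them by key, reducers merge each group."""
    # Map phase: each mapper processes one data chunk independently.
    intermediate = []
    for chunk in chunks:
        intermediate.extend(mapper(chunk))
    # Shuffle/sort phase: group intermediate values by key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)
    # Reduce phase: merge the values of each key into a final result.
    return {key: reducer(key, values) for key, values in groups.items()}

# Usage: a word-count style job over two input chunks.
counts = run_mapreduce(
    chunks=[["big", "data"], ["big", "neural", "data"]],
    mapper=lambda chunk: [(word, 1) for word in chunk],
    reducer=lambda key, values: sum(values),
)
print(counts)  # {'big': 2, 'data': 2, 'neural': 1}
```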

3.3. The Design of MRBPNN_1. MRBPNN_1 targets the scenario in which a BPNN has a large volume of testing data to be classified. Consider a testing instance s_i = {a_1, a_2, a_3, …, a_in}, s_i ∈ S, where

(i) s_i denotes an instance;
(ii) S denotes a dataset;
(iii) in denotes the length of s_i; it also determines the number of inputs of a neural network;
(iv) the inputs are encapsulated in a format of ⟨instance_k, target_k, type⟩;
(v) instance_k represents s_i, which is the input of a neural network;
(vi) target_k represents the desirable output if instance_k is a training instance;
(vii) the type field has two values, "train" and "test", which marks the type of instance_k; if the "test" value is set, the target_k field should be left empty.

Files which contain instances are saved into HDFS initially. Each file contains all the training instances and a portion of the testing instances. Therefore, the file number n determines the number of mappers to be used. The file content is the input of MRBPNN_1.

When the algorithm starts, each mapper initializes a neural network. As a result, there will be n neural networks in the cluster. Moreover, all the neural networks have exactly the same structure and parameters.

Figure 2: MRBPNN_1 architecture.

Each mapper reads data in the form of ⟨instance_k, target_k, type⟩ from a file and parses the data records. If the value of the type field is "train", instance_k is input into the input layer of the neural network. The network computes the output of each layer using (2) and (3) until the output layer generates an output, which indicates the completion of the feed forward process. And then the neural network in each mapper starts the back-propagation process. It computes and updates new weights and biases for its neurons using (4) to (6). The neural network then inputs instance_{k+1}. The feed forward and back-propagation processes are repeated until all the instances which are labeled as "train" are processed and the error is satisfied.

Each mapper then starts classifying the instances labeled as "test" by running the feed forward process. As each mapper only classifies a portion of the entire testing dataset, the efficiency is improved. At last, each mapper outputs an intermediate output in the form of ⟨instance_k, o_jm⟩, where instance_k is the key and o_jm represents the output of the mth mapper.

One reducer starts collecting and merging all the outputs of the mappers. Finally, the reducer outputs ⟨instance_k, o_jm⟩ into HDFS. In this case, o_jm represents the final classification result of instance_k. Figure 2 shows the architecture of MRBPNN_1 and Algorithm 1 shows the pseudocode.
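
As a rough sketch of this design (an illustrative simulation rather than the paper's Hadoop code), every "mapper" below builds the same classifier from the full training data and classifies only its own chunk of testing instances, and a single "reducer" merges the ⟨instance, output⟩ pairs; train_bpnn is a deliberately trivial stand-in for the BPNN of Section 3.1.

```python
def train_bpnn(training_data):
    """Stand-in for BPNN training (assumption for illustration only):
    the 'model' simply memorises the majority label of the training set."""
    labels = [label for _, label in training_data]
    majority = max(set(labels), key=labels.count)
    return lambda instance: majority  # acts as the trained classifier

def mrbpnn_1(training_data, test_chunks):
    results = {}
    # Each mapper builds the same classifier from the full training data
    # and classifies only its own chunk of testing instances.
    for chunk in test_chunks:
        classifier = train_bpnn(training_data)
        mapper_output = [(instance, classifier(instance)) for instance in chunk]
        # The single reducer merges all <instance, output> pairs.
        results.update(mapper_output)
    return results

# Usage: two test chunks classified in (simulated) parallel.
train = [((0, 0), "A"), ((1, 1), "B"), ((1, 0), "B")]
print(mrbpnn_1(train, test_chunks=[[(0, 1)], [(1, 1), (0, 0)]]))
```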

3.4. The Design of MRBPNN_2. MRBPNN_2 focuses on the scenario in which a BPNN has a large volume of training data. Consider a training dataset S with a number of instances. As shown in Figure 3, MRBPNN_2 divides S into n data chunks, of which each data chunk s_i is processed by a mapper for training, respectively:

$$ S = \bigcup_{1}^{n} s_i, \qquad \forall s \in s_i \mid s \notin s_n, \ i \neq n. \quad (9) $$

Each mapper in the Hadoop cluster maintains a BPNN, and each s_i is considered as the input training data for the neural network maintained in mapper_i. As a result, each BPNN in a mapper produces a classifier based on the trained parameters:

$$ (\text{mapper}_i, \text{BPNN}_i, s_i) \longrightarrow \text{classifier}_i. \quad (10) $$

To reduce the computation overhead, each classifier_i is trained with a part of the original training dataset. However, a critical issue is that the classification accuracy of a mapper will be significantly degraded using only a portion of the training data. To solve this issue, MRBPNN_2 employs an ensemble technique to maintain the classification accuracy by combining a number of weak learners to create a strong learner.

3.4.1. Bootstrapping. Training diverse classifiers from a single training dataset has been proven to be simpler than finding a single strong learner [26]. A number of techniques exist for this purpose. A widely used technique is to resample the training dataset based on bootstrap aggregating, combining bootstrapping and majority voting. This can reduce the variance of misclassification errors and hence increases the accuracy of the classifications.

As mentioned in [26], balanced bootstrapping can reduce the variance when combining classifiers. Balanced bootstrapping ensures that each training instance appears equally often in the bootstrap samples; it might not always be the case that each bootstrapping sample contains all the training instances. The most efficient way of creating balanced bootstrap samples is to construct a string of instances X_1, X_2, X_3, …, X_n repeated B times, so that a sequence Y_1, Y_2, Y_3, …, Y_Bn is obtained. A random permutation p of the integers from 1 to Bn is taken. Therefore, the first bootstrapping sample is created from Y_p(1), Y_p(2), Y_p(3), …, Y_p(n), the second bootstrapping sample is created from Y_p(n+1), Y_p(n+2), Y_p(n+3), …, Y_p(2n), and the process continues until Y_p((B−1)n+1), Y_p((B−1)n+2), Y_p((B−1)n+3), …, Y_p(Bn) is the Bth bootstrapping sample. The bootstrapping samples can then be used in bagging to increase the accuracy of classification.
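
A minimal sketch of the balanced bootstrapping scheme described above, assuming the samples are built by permuting B concatenated copies of the training indices (function and variable names are illustrative):

```python
import random

def balanced_bootstrap_samples(instances, B):
    """Create B balanced bootstrap samples: every instance
    appears exactly B times across all samples."""
    n = len(instances)
    # Concatenate B copies of the instance indices (the string X1..Xn repeated B times).
    pool = list(range(n)) * B
    # Take a random permutation p of the integers 1..Bn.
    random.shuffle(pool)
    # Slice the permuted sequence into B consecutive samples of size n.
    return [[instances[i] for i in pool[b * n:(b + 1) * n]] for b in range(B)]

# Usage: 4 balanced samples over a toy training set.
samples = balanced_bootstrap_samples(["s1", "s2", "s3", "s4", "s5"], B=4)
```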

3.4.2. Majority Voting. This type of ensemble classifier performs classifications based on the majority votes of the base classifiers [26]. Let us define the prediction of the ith classifier P_i as P_ij ∈ {1, 0}, i = 1, …, I and j = 1, …, c, where I is the number of classifiers and c is the number of classes. If the ith classifier chooses class j, then P_ij = 1; otherwise P_ij = 0. Then the ensemble prediction for class k is computed using

$$ P_{ik} = \max_{j=1}^{c} \sum_{i=1}^{I} P_{ij}. \quad (11) $$
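
The voting rule in (11) picks the class that receives the most votes from the base classifiers; a minimal sketch, assuming each base prediction is simply a class label, could look like this:

```python
from collections import Counter

def majority_vote(predictions):
    """Return the class label chosen by most base classifiers.

    predictions: list of class labels, one per base classifier
    (e.g. the outputs o_jm collected by the reducer).
    """
    votes = Counter(predictions)
    # most_common(1) returns [(label, count)] for the top-voted class.
    return votes.most_common(1)[0][0]

# Usage: three classifiers vote for class 1, one for class 2.
print(majority_vote([1, 1, 2, 1]))  # -> 1
```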

3.4.3. Algorithm Design. At the beginning, MRBPNN_2 employs balanced bootstrapping to generate a number of subsets of the entire training dataset:

$$ \text{balanced bootstrapping} \longrightarrow S_1, S_2, S_3, \ldots, S_n, \qquad \bigcup_{i=1}^{n} S_i = S, \quad (12) $$

where S_i represents the ith subset, which belongs to the entire dataset S, and n represents the total number of subsets. Each S_i is saved in one file in HDFS. Each instance s_k = {a_1, a_2, a_3, …, a_in}, s_k ∈ S_i, is defined in the format of ⟨instance_k, target_k, type⟩, where

(i) instance_k represents one bootstrapped instance s_k, which is the input of the neural network;
(ii) in represents the number of inputs of the neural network;
(iii) target_k represents the desirable output if instance_k is a training instance;
(iv) the type field has two values, "train" and "test", which marks the type of instance_k; if the "test" value is set, the target_k field should be left empty.

Figure 3: MRBPNN_2 architecture.

Input: S, T; Output: C.
m mappers, one reducer.
(1) Each mapper constructs one BPNN with in inputs, o outputs, and h neurons in the hidden layer.
(2) Initialize w_ij = random1_ij ∈ (−1, 1), θ_j = random2_j ∈ (−1, 1).
(3) For all s ∈ S, s_i = {a_1, a_2, a_3, …, a_in}:
    input a_i → in_i; neuron j in the hidden layer computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^{−I_jh}).
(4) Input o_j → out_i; neuron j in the output layer computes
    I_jo = Σ_{i=1}^{h} o_jh · w_ij + θ_j,  o_jo = 1 / (1 + e^{−I_jo}).
(5) In each output compute Err_jo = o_jo (1 − o_jo)(target_j − o_jo).
(6) In the hidden layer compute Err_jh = o_jh (1 − o_jh) Σ_{i=1}^{o} Err_i w_io.
(7) Update w_ij = w_ij + η · Err_j · o_j, θ_j = θ_j + η · Err_j.
    Repeat (3), (4), (5), (6), (7) until min(E[e²]) = min(E[(target_j − o_jo)²]); training terminates.
(8) Divide T into T_1, T_2, T_3, …, T_m, ⋃_{i=0}^{m} T_i = T.
(9) Each mapper inputs t_j = {a_1, a_2, a_3, …, a_in}, t_j ∈ T_i.
(10) Execute (3), (4).
(11) Mapper outputs ⟨t_j, o_j⟩.
(12) Reducer collects and merges all ⟨t_j, o_j⟩.
    Repeat (9), (10), (11), (12) until T is traversed.
(13) Reducer outputs C.
End

Algorithm 1: MRBPNN_1.


When MRBPNN_2 starts, each mapper constructs one BPNN and initializes the weights and biases of its neurons with random values between −1 and 1. And then a mapper inputs one record in the form of ⟨instance_k, target_k, type⟩ from the input file.

The mapper firstly parses the data and retrieves the type of the instance. If the type value is "train", the instance is fed into the input layer. Secondly, each neuron in the different layers computes its output using (2) and (3) until the output layer generates an output, which indicates the completion of the feed forward process. Each mapper then starts a back-propagation process and computes and updates the weights and biases for its neurons using (4) to (6). The training process finishes when all the instances marked as "train" are processed and the error is satisfied. All the mappers then start feed forwarding to classify the testing dataset. In this case, each neural network in a mapper generates the classification result of an instance at the output layer. Each mapper generates an intermediate output in the form of ⟨instance_k, o_jm⟩, where instance_k is the key and o_jm represents the output of the mth mapper.

Finally, a reducer collects the outputs of all the mappers. The outputs with the same key are merged together. The reducer runs majority voting using (11) and outputs the result of instance_k into HDFS in the form of ⟨instance_k, r_k⟩, where r_k represents the voted classification result of instance_k. Figure 3 shows the algorithm architecture and Algorithm 2 presents the pseudocode of MRBPNN_2.
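
Putting Sections 3.4.1–3.4.3 together, a compact single-process sketch of the MRBPNN_2 flow is given below; the weak learner is a deliberately trivial stand-in (it predicts the most frequent label of its bootstrap subset) used only to show the bootstrap-train-vote structure, not the paper's BPNN.

```python
import random
from collections import Counter

def train_weak_learner(subset):
    """Stand-in for one mapper's BPNN (assumption for illustration):
    it simply predicts the most frequent label of its bootstrap subset."""
    labels = [label for _, label in subset]
    majority = Counter(labels).most_common(1)[0][0]
    return lambda instance: majority

def mrbpnn_2(training_data, test_instances, n_mappers=4):
    n = len(training_data)
    # Balanced bootstrapping: each instance appears n_mappers times overall.
    pool = list(range(n)) * n_mappers
    random.shuffle(pool)
    subsets = [[training_data[i] for i in pool[m * n:(m + 1) * n]] for m in range(n_mappers)]
    # Map phase: one weak learner per bootstrap subset.
    learners = [train_weak_learner(subset) for subset in subsets]
    # Reduce phase: majority voting over the learners' outputs, as in (11).
    results = {}
    for instance in test_instances:
        votes = Counter(learner(instance) for learner in learners)
        results[instance] = votes.most_common(1)[0][0]
    return results

# Usage: classify two instances with a toy ensemble.
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 1), "B")]
print(mrbpnn_2(train, test_instances=[(1, 0), (0, 0)]))
```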

3.5. The Design of MRBPNN_3. MRBPNN_3 aims at the scenario in which a BPNN has a large number of neurons. The algorithm enables an entire MapReduce cluster to maintain one neural network across it. Therefore, each mapper holds one or several neurons.

There are a number of iterations in the algorithm with l layers. MRBPNN_3 employs l − 1 MapReduce jobs to implement the iterations. The feed forward process runs in l − 1 rounds, whilst the back-propagation process occurs only in the last round. A data format in the form of ⟨index_k, instance_n, w_ij, θ_j, target_n, w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j⟩ has been designed to guarantee the data passing between the Map and Reduce operations, where

(i) index_k represents the kth reducer;
(ii) instance_n represents the nth training or testing instance of the dataset; one instance is in the form of instance_n = {a_1, a_2, …, a_in}, where in is the length of the instance;
(iii) w_ij represents a set of weights of the input layer, whilst θ_j represents the biases of the neurons in the first hidden layer;
(iv) target_n represents the encoded desirable output of a training instance instance_n;
(v) the list w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j represents the weights and biases for the next layers; it can be extended based on the layers of the network; for a standard three-layer neural network this option becomes w_ijo, θ_jo.

Before MRBPNN_3 starts, each instance and the information defined by the data format are saved in one file in HDFS. The number of layers is determined by the length of the w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j field. The number of neurons in the next layer is determined by the number of files in the input folder. Generally, different from MRBPNN_1 and MRBPNN_2, MRBPNN_3 does not initialize an explicit neural network; instead it maintains the network parameters based on the data defined in the data format.

When MRBPNN_3 starts, each mapper initially inputs one record from HDFS. And then it computes the output of a neuron using (2) and (3). The output is generated by a mapper which labels index_k as the key and the neuron's output as the value, in the form of ⟨index_k, o_j, w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j, target_n⟩, where o_j represents the neuron's output.

The parameter index_k guarantees that the kth reducer collects the output, which maintains the neural network structure. It should be mentioned that if the record is the first one processed by MRBPNN_3, w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j will also be initialized with random values between −1 and 1 by the mappers. The kth reducer collects the results from the mappers in the form of ⟨index_k′, o_j, w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j, target_n⟩.

These k reducers generate k outputs. The index_k′ of the reducer output explicitly tells the k′th mapper to start processing this output file. Therefore, the number of neurons in the next layer can be determined by the number of reducer output files, which are the input data for the next-layer neurons. Subsequently, mappers start processing their corresponding inputs by computing (2) and (3) using w^2_ij and θ^2_j.

The above steps keep looping until reaching the last round. The processing of this last round consists of two steps. In the first step, mappers also process ⟨index_k′, o_j, w^{l−1}_ij, θ^{l−1}_j, target_n⟩, compute the neurons' outputs, and generate results in the form of ⟨o_j, target_n⟩. One reducer collects the output results of all the mappers in the form of ⟨o_j1, o_j2, o_j3, …, o_jk, target_n⟩. In the second step, the reducer executes the back-propagation process. The reducer computes new weights and biases for each layer using (4) to (6). MRBPNN_3 retrieves the previous outputs, weights, and biases from the input files of the mappers, and then it writes the updated weights and biases w_ij, θ_j, w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j into the initial input file in the form of ⟨index_k, instance_n, w_ij, θ_j, target_n, w^2_ij, θ^2_j, …, w^{l−1}_ij, θ^{l−1}_j⟩. The reducer then reads the second instance in the form of ⟨instance_{n+1}, target_{n+1}⟩, for which the fields instance_n and target_n in the input file are replaced by instance_{n+1} and target_{n+1}. The training process continues until all the instances are processed and the error is satisfied.


Input: S, T; Output: C.
n mappers, one reducer.
(1) Each mapper constructs one BPNN with in inputs, o outputs, and h neurons in the hidden layer.
(2) Initialize w_ij = random1_ij ∈ (−1, 1), θ_j = random2_j ∈ (−1, 1).
(3) Bootstrap S_1, S_2, S_3, …, S_n, ⋃_{i=0}^{n} S_i = S.
(4) Each mapper inputs s_k = {a_1, a_2, a_3, …, a_in}, s_k ∈ S_i:
    input a_i → in_i; neuron j in the hidden layer computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^{−I_jh}).
(5) Input o_j → out_i; neuron j in the output layer computes
    I_jo = Σ_{i=1}^{h} o_jh · w_ij + θ_j,  o_jo = 1 / (1 + e^{−I_jo}).
(6) In each output compute Err_jo = o_jo (1 − o_jo)(target_j − o_jo).
(7) In the hidden layer compute Err_jh = o_jh (1 − o_jh) Σ_{i=1}^{o} Err_i w_io.
(8) Update w_ij = w_ij + η · Err_j · o_j, θ_j = θ_j + η · Err_j.
    Repeat (4), (5), (6), (7), (8) until min(E[e²]) = min(E[(target_j − o_jo)²]); training terminates.
(9) Each mapper inputs t_i = {a_1, a_2, a_3, …, a_in}, t_i ∈ T.
(10) Execute (4), (5).
(11) Mapper outputs ⟨t_j, o_j⟩.
(12) Reducer collects ⟨t_j, o_jm⟩, m = (1, 2, …, n).
(13) For each t_j compute C = max_{j=1}^{c} Σ_{i=1}^{I} o_jm.
    Repeat (9), (10), (11), (12), (13) until T is traversed.
(14) Reducer outputs C.
End

Algorithm 2: MRBPNN_2.

For classification, MRBPNN_3 only needs to run the feed forward process and collects the reducer output in the form of ⟨o_j, target_n⟩. Figure 4 shows the three-layer architecture of MRBPNN_3 and Algorithm 3 presents the pseudocode.
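
The neuron-level parallelism of MRBPNN_3 can be pictured with the following simplified sketch of one feed forward round: each "mapper" holds the weights of a single hidden neuron and emits its output, and a "reducer" assembles those outputs and evaluates a single output neuron. This is an illustrative simplification, not the multi-job Hadoop implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_neuron_mapper(instance, w_column, theta):
    """One mapper holds one hidden neuron: its weights w_ij and bias theta_j.
    It emits the neuron's output o_jh for the current instance."""
    return sigmoid(sum(a * w for a, w in zip(instance, w_column)) + theta)

def output_layer_reducer(hidden_outputs, w_out, theta_out):
    """The reducer of the last round assembles the hidden outputs and
    computes the output-layer neuron (a single output for simplicity)."""
    return sigmoid(sum(o * w for o, w in zip(hidden_outputs, w_out)) + theta_out)

# Usage: 3 hidden neurons spread over 3 "mappers" for one instance.
instance = [0.2, 0.7]
hidden_weights = [[0.1, -0.4], [0.3, 0.8], [-0.5, 0.2]]   # one weight column per neuron
hidden_biases = [0.0, 0.1, -0.1]
hidden_outputs = [hidden_neuron_mapper(instance, w, b)
                  for w, b in zip(hidden_weights, hidden_biases)]
print(output_layer_reducer(hidden_outputs, w_out=[0.6, -0.2, 0.4], theta_out=0.05))
```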

4. Performance Evaluation

We have implemented the three parallel BPNNs using Hadoop, an open source implementation framework of the MapReduce computing model. An experimental Hadoop cluster was built to evaluate the performance of the algorithms. The cluster consisted of 5 computers, in which 4 nodes are Datanodes and the remaining one is the Namenode. The cluster details are listed in Table 1.

Two testing datasets were prepared for the evaluations. The first dataset is a synthetic dataset. The second is the Iris dataset, which is a published machine learning benchmark dataset [27]. Table 2 shows the details of the two datasets.

Table 1: Cluster details.

Namenode | CPU: Core i7, 3 GHz | Memory: 8 GB | SSD: 750 GB | OS: Fedora
Datanodes | CPU: Core i7, 3.8 GHz | Memory: 32 GB | SSD: 250 GB | OS: Fedora
Network bandwidth | 1 Gbps
Hadoop version | 2.3.0, 32 bits

Input: S, T; Output: C.
m mappers, m reducers.
Initially each mapper inputs ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩, k = (1, 2, …, m).
(1) For instance_n = {a_1, a_2, a_3, …, a_in}:
    input a_i → in_i; the neuron in the mapper computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^{−I_jh}).
(2) Mapper outputs ⟨index_k, o_jh, w_ijo, θ_jo, target_n⟩, r = (1, 2, …, m).
(3) The kth reducer collects the output and outputs ⟨index_k′, o_jh, w_ijo, θ_jo, target_n⟩, k′ = (1, 2, …, m).
(4) Mapper k′ inputs ⟨index_k′, o_jh, w_ijo, θ_jo, target_n⟩; execute (2); outputs ⟨o_jo, target_n⟩.
(5) The mth reducer collects and merges all ⟨o_jo, target_n⟩ from the m mappers; feed forward terminates.
(6) In the mth reducer:
    compute Err_jo = o_jo (1 − o_jo)(target_n − o_jo);
    using the HDFS API retrieve w_ij, θ_j, o_jh, instance_n;
    compute Err_jh = o_jh (1 − o_jh) Σ_{i=1}^{o} Err_i w_io.
(7) Update w_ij, θ_j, w_ijo, θ_jo:
    w_ij = w_ij + η · Err_j · o_j, θ_j = θ_j + η · Err_j,
    into ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩; back-propagation terminates.
(8) Retrieve instance_{n+1} and target_{n+1}; update into ⟨index_k, instance_{n+1}, w_ij, θ_j, target_{n+1}, w_ijo, θ_jo⟩.
    Repeat (1), (2), (3), (4), (5), (6), (7), (8) until min(E[e²]) = min(E[(target_j − o_jo)²]); training terminates.
(9) Each mapper inputs ⟨index_k, instance_t, w_ij, θ_j, target_n, w_ijo, θ_jo⟩, instance_t ∈ T.
(10) Execute (1), (2), (3), (4), (5), (6); outputs ⟨o_jo, target_n⟩.
(11) The mth reducer outputs C.
End

Algorithm 3: MRBPNN_3.

Table 2: Dataset details.

Data type | Instance number | Instance length | Element range | Class number
Synthetic data | 200 | 32 | 0 and 1 | 4
Iris data | 150 | 4 | (0, 8) | 3

We implemented a three-layer neural network with 16 neurons in the hidden layer. The Hadoop cluster was configured with 16 mappers and 16 reducers. The number of instances was varied from 10 to 1000 for evaluating the precision of the algorithms. The size of the datasets was varied from 1 MB to 1 GB for evaluating the computation efficiency of the algorithms. Each experiment was executed five times and the final result was an average. The precision p is computed using

$$ p = \frac{r}{r + w} \times 100\%, \quad (13) $$

where r represents the number of correctly recognized instances, w represents the number of wrongly recognized instances, and p represents the precision.

Figure 4: MRBPNN_3 structure.

Figure 5: The precision of MRBPNN_1 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset.

4.1. Classification Precision. The classification precision of MRBPNN_1 was evaluated using a varied number of training instances. The maximum number of training instances was 1000, whilst the maximum number of testing instances was also 1000. The large number of instances is based on data duplication. Figure 5 shows the precision results of MRBPNN_1 in classification using 10 mappers. It can be observed that the precision keeps increasing with an increase in the number of training instances. Finally, the precision of MRBPNN_1 on the synthetic dataset reaches 100%, while the precision on the Iris dataset reaches 97.5%. In this test, the behavior of the parallel MRBPNN_1 is quite similar to that of the standalone BPNN. The reason is that MRBPNN_1 does not distribute the BPNN among the Hadoop nodes; instead, it runs on Hadoop to distribute the data.

To evaluate MRBPNN_2, we designed 1000 training instances and 1000 testing instances using data duplication. The mappers were trained by subsets of the training instances and produced the classification results of the 1000 testing instances based on bootstrapping and majority voting. MRBPNN_2 employed 10 mappers, each of which inputs a number of training instances varying from 10 to 1000. Figure 6 presents the precision results of MRBPNN_2 on the two testing datasets. It also shows that, along with the increasing number of training instances in each subneural network, the achieved precision based on majority voting keeps increasing. The precision of MRBPNN_2 on the synthetic dataset reaches 100%, whilst the precision on the Iris dataset reaches 97.5%, which is higher than that of MRBPNN_1.

MRBPNN_3 implements a fully parallel and distributed neural network using Hadoop to deal with a complex neural network with a large number of neurons. Figure 7 shows the performance of MRBPNN_3 using 16 mappers. The precision also increases along with the increasing number of training instances for both datasets. It also can be observed that the stability of the curve is quite similar to that of MRBPNN_1. Both curves have more fluctuations than that of MRBPNN_2.

Figure 6: The precision of MRBPNN_2 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset.

Figure 7: The precision of MRBPNN_3 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset.

Figure 8 compares the overall precision of the three parallel BPNNs. MRBPNN_1 and MRBPNN_3 perform similarly, whereas MRBPNN_2 performs the best using bootstrapping and majority voting. In addition, the precision of MRBPNN_2 in classification is more stable than that of both MRBPNN_1 and MRBPNN_3.

Figure 9 presents the stability of the three algorithms on the synthetic dataset, showing that the precision of MRBPNN_2 in classification is highly stable compared with that of both MRBPNN_1 and MRBPNN_3.

4.2. Computation Efficiency. A number of experiments were carried out in terms of computation efficiency using the synthetic dataset. The first experiment was to evaluate the efficiency of MRBPNN_1 using 16 mappers. The volume of data instances was varied from 1 MB to 1 GB. Figure 10 clearly shows that the parallel MRBPNN_1 significantly outperforms the standalone BPNN. The computation overhead of the standalone BPNN is low when the data size is less than 16 MB. However, the overhead of the standalone BPNN increases sharply with increasing data sizes. This is mainly because MRBPNN_1 distributes the testing data onto the 4 data nodes in the Hadoop cluster, which run in parallel in classification.

Figure 11 shows the computation efficiency of MRBPNN_2 using 16 mappers. It can be observed that when the data size is small, the standalone BPNN performs better. However, the computation overhead of the standalone BPNN increases rapidly when the data size is larger than 64 MB. Similar to MRBPNN_1, the parallel MRBPNN_2 scales with increasing data sizes using the Hadoop framework.

Figure 12 shows the computation overhead of MRBPNN_3 using 16 mappers. MRBPNN_3 incurs a higher overhead than both MRBPNN_1 and MRBPNN_2. The reason is that both MRBPNN_1 and MRBPNN_2 run training and classification within one MapReduce job, which means the mappers and reducers only need to start once. However, MRBPNN_3 contains a number of jobs. The algorithm has to start mappers and reducers a number of times. This process incurs a large system overhead which affects its computation efficiency. Nevertheless, Figure 12 shows the feasibility of fully distributing a BPNN in dealing with a complex neural network with a large number of neurons.

Figure 8: Precision comparison of the three parallel BPNNs: (a) the synthetic dataset; (b) the Iris dataset.

Figure 9: The stability of the three parallel BPNNs.

Figure 10: Computation efficiency of MRBPNN_1.

Figure 11: Computation efficiency of MRBPNN_2.

Figure 12: Computation efficiency of MRBPNN_3.

5. Conclusion

In this paper we have presented three parallel neural networks (MRBPNN_1, MRBPNN_2, and MRBPNN_3) based on the MapReduce computing model in dealing with data intensive scenarios in terms of the size of the classification dataset, the size of the training dataset, and the number of neurons, respectively. Overall, the experimental results have shown that the computation overhead can be significantly reduced using a number of computers in parallel. MRBPNN_3 shows the feasibility of fully distributing a BPNN in a computer cluster but incurs a high computation overhead due to the continuous starting and stopping of mappers and reducers in the Hadoop environment. One of the future works is to research in-memory processing to further enhance the computation efficiency of MapReduce in dealing with data intensive tasks with many iterations.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors would like to acknowledge the support from the National Science Foundation of China (no. 51437003).

References

[1] Networked European Software and Services Initiative (NESSI), "Big data: a new world of opportunities," NESSI White Paper, 2012, http://www.nessi-europe.com/Files/Private/NESSI_WhitePaper_BigData.pdf.

[2] P. C. Zikopoulos, C. Eaton, D. deRoos, T. Deutsch, and G. Lapis, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill, 2012.

[3] M. H. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS Publishing, 1996.

[4] R. Gu, F. Shen, and Y. Huang, "A parallel computing platform for training large scale neural networks," in Proceedings of the IEEE International Conference on Big Data, pp. 376–384, October 2013.

[5] V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing, Benjamin Cummings/Addison Wesley, San Francisco, Calif, USA, 2002.

[6] Message Passing Interface, 2015, http://www.mcs.anl.gov/research/projects/mpi/.

[7] L. N. Long and A. Gupta, "Scalable massively parallel artificial neural networks," Journal of Aerospace Computing, Information, and Communication, vol. 5, no. 1, pp. 3–15, 2008.

[8] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[9] Y. Liu, M. Li, M. Khan, and M. Qi, "A MapReduce based distributed LSI for scalable information retrieval," Computing and Informatics, vol. 33, no. 2, pp. 259–280, 2014.

[10] N. K. Alham, M. Li, Y. Liu, and M. Qi, "A MapReduce-based distributed SVM ensemble for scalable image classification and annotation," Computers and Mathematics with Applications, vol. 66, no. 10, pp. 1920–1934, 2013.

[11] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of back propagation neural network in the classification of high resolution remote sensing image: take remote sensing image of Beijing for instance," in Proceedings of the 18th International Conference on Geoinformatics, pp. 1–6, June 2010.

[12] N. L. D. Khoa, K. Sakakibara, and I. Nishikawa, "Stock price forecasting using back propagation neural networks with time and profit based adjusted weight factors," in Proceedings of the SICE-ICASE International Joint Conference, pp. 5484–5488, IEEE, Busan, Republic of Korea, October 2006.

[13] M. Rizwan, M. Jamil, and D. P. Kothari, "Generalized neural network approach for global solar energy estimation in India," IEEE Transactions on Sustainable Energy, vol. 3, no. 3, pp. 576–584, 2012.

[14] Y. Wang, B. Li, R. Luo, Y. Chen, N. Xu, and H. Yang, "Energy efficient neural networks for big data analytics," in Proceedings of the 17th International Conference on Design, Automation and Test in Europe (DATE '14), March 2014.

[15] D. Nguyen and B. Widrow, "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights," in Proceedings of the International Joint Conference on Neural Networks, vol. 3, pp. 21–26, June 1990.

[16] H. Kanan and M. Khanian, "Reduction of neural network training time using an adaptive fuzzy approach in real time applications," International Journal of Information and Electronics Engineering, vol. 2, no. 3, pp. 470–474, 2012.

[17] A. A. Ikram, S. Ibrahim, M. Sardaraz, M. Tahir, H. Bajwa, and C. Bach, "Neural network based cloud computing platform for bioinformatics," in Proceedings of the 9th Annual Conference on Long Island Systems, Applications and Technology (LISAT '13), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2013.

[18] V. Rao and S. Rao, "Application of artificial neural networks in capacity planning of cloud based IT infrastructure," in Proceedings of the 1st IEEE International Conference on Cloud Computing for Emerging Markets (CCEM '12), pp. 38–41, October 2012.

[19] A. A. Huqqani, E. Schikuta, and E. Mann, "Parallelized neural networks as a service," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '14), pp. 2282–2289, July 2014.

[20] J. Yuan and S. Yu, "Privacy preserving back-propagation neural network learning made practical with cloud computing," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 1, pp. 212–221, 2014.

[21] Z. Liu, H. Li, and G. Miao, "MapReduce-based back propagation neural network over large scale mobile data," in Proceedings of the 6th International Conference on Natural Computation (ICNC '10), pp. 1726–1730, August 2010.

[22] B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang, "Mars: a MapReduce framework on graphics processors," in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT '08), pp. 260–269, October 2008.

[23] K. Taura, K. Kaneda, T. Endo, and A. Yonezawa, "Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources," ACM SIGPLAN Notices, vol. 38, no. 10, pp. 216–229, 2003.

[24] Apache Hadoop, 2015, http://hadoop.apache.org/.

[25] J. Venner, Pro Hadoop, Springer, New York, NY, USA, 2009.

[26] N. K. Alham, Parallelizing support vector machines for scalable image annotation [Ph.D. thesis], Brunel University, Uxbridge, UK, 2011.

[27] The Iris Dataset, 2015, https://archive.ics.uci.edu/ml/datasets/Iris.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 3: Research Article MapReduce Based Parallel Neural Networks in …downloads.hindawi.com/journals/cin/2015/297672.pdf · 2019. 7. 31. · Research Article MapReduce Based Parallel Neural

Computational Intelligence and Neuroscience 3

x0

x1

xnminus1

y0

y1

ynminus1

intintint

int intint

intintint

Figure 1 The structure of a typical BPNN

Then the number of inputs is 119899 Similarly the numberof neurons in the output layer is determined by the numberof classifications And the number of neurons in the hiddenlayer is determined by users Every input of a neuron has aweight119908

119894119895 where 119894 and 119895 represent the source and destination

of the input Each neuron also maintains an optional param-eter 120579

119895which is actually a bias for varying the activity of the

119895th neuron in a layerTherefore let 1199001198951015840 denote the output from

a previous neuron and let 119900119895denote the output of this layer

the input 119868119895of neurons located in both the hidden and output

layer can be represented by

119868119895= sum119894

1199081198941198951199001198951015840 + 120579119895 (2)

The output of a neuron is usually computed by thesigmoid function so the output 119900

119895can be computed by

119900119895=

1

1 + 119890minus119868119895 (3)

After the feed forward process is completed the back-propagation process starts Let Err

119895represent the error-

sensitivity and let 119905119895represent the desirable output of neuron

119895 in the output layer thus

Err119895= 119900119895(1 minus 119900

119895) (119905119895minus 119900119895) (4)

Let Err119896represent the error-sensitivity of one neuron in

the last layer and let 119908119896119895represent its weight thus Err

119895of a

neuron in the other layers can be computed using

Err119895= 119900119895(1 minus 119900

119895)sum119896

Err119896119908119896119895 (5)

After Err119895is computed the weights and biases of each

neuron are tuned in back-propagation process using

Δ119908119894119895

= Err119895119900119895

119908119894119895

= 119908119894119895+ Δ119908119894119895

Δ120579119895= Err119895

120579119895= 120579119895+ Δ120579119895

(6)

After the first input vector finishes tuning the net-work the next round starts for the following input vectors

The input keeps training the network until (7) is satisfied fora single output or (8) is satisfied for multiple outputs

min (119864 [1198902]) = min (119864 [(119905 minus 119900)

2]) (7)

min (119864 [119890119879119890]) = min (119864 [(119905 minus 119900)

119879(119905 minus 119900)]) (8)

32 MapReduce Computing Model MapReduce has becomethe de facto standard computing model in dealing with dataintensive applications using a cluster of commodity comput-ers Popular implementations of the MapReduce computingmodel include Mars [22] Phoenix [23] and Hadoop frame-work [24 25]The Hadoop framework has been widely takenup by the community due to its open source feature Hadoophas its HadoopDistributed File System (HDFS) for dataman-agement A Hadoop cluster has one name node (Namenode)and a number of data nodes (Datanodes) for running jobsThe name node manages the metadata of the cluster whilst adata node is the actual processing node The Map functions(mappers) and Reduce functions (reducers) run on the datanodes When a job is submitted to a Hadoop cluster theinput data is divided into small chunks of an equal size andsaved in the HDFS In terms of data integrity each datachunk can have one or more replicas according to the clusterconfiguration In a Hadoop cluster mappers copy and readdata from either remote or local nodes based on data localityThe final output results will be sorted merged and generatedby reducers in HDFS

33 The Design of MRBPNN 1 MRBPNN 1 targets the sce-nario in which BPNN has a large volume of testing data to beclassified Consider a testing instance 119904

119894= 1198861 1198862 1198863 119886

119894119899

119904119894isin 119878 where

(i) 119904119894denotes an instance

(ii) 119878 denotes a dataset(iii) 119894119899 denotes the length of 119904

119894 it also determines the

number of inputs of a neural network(iv) the inputs are capsulated by a format of

⟨instance119896 target

119896 type⟩

(v) instance119896represents 119904

119894 which is the input of a neural

network(vi) target

119896represents the desirable output if instance

119896is

a training instance(vii) type field has two values ldquotrainrdquo and ldquotestrdquo which

marks the type of instance119896 if ldquotestrdquo value is set

target119896field should be left empty

Files which contain instances are saved into HDFSinitially Each file contains all the training instances and aportion of the testing instances Therefore the file number119899 determines the number of mappers to be used The filecontent is the input of MRBPNN 1

When the algorithm starts each mapper initializes aneural network As a result there will be 119899 neural networks inthe cluster Moreover all the neural networks have exactly thesame structure and parameters Each mapper reads data in

4 Computational Intelligence and Neuroscience

Mapper

Mapper

Mapper

Training data

Training data

Training data

Data chunk 2

Data chunk 1

Data chunk n

ReducerFinal results

Figure 2 MRBPNN 1 architecture

the formof ⟨instance119896 target

119896 type⟩ fromafile and parses the

data records If the value of type field is ldquotrainrdquo instance119896is

input into the input layer of the neural networkThe networkcomputes the output of each layer using (2) and (3) untilthe output layer generates an output which indicates thecompletion of the feed forward process And then the neuralnetwork in each mapper starts the back-propagation processIt computes and updates new weights and biases for its neu-rons using (4) to (6) The neural network inputs instance

119896+1

Repeat the feed forward and back-propagation process untilall the instances which are labeled as ldquotrainrdquo are processedand the error is satisfied

Each mapper starts classifying instances labeled as ldquotestrdquoby running the feed forward process As each mapper onlyclassifies a portion of the entire testing dataset the efficiencyis improved At last each mapper outputs intermediate out-put in the form of ⟨instance

119896 119900119895119898

⟩ where instance119896is the key

and 119900119895119898

represents the output of the 119898th mapperOne reducer starts collecting and merging all the outputs

of the mappers Finally the reducer outputs ⟨instance119896 119900119895119898

into HDFS In this case 119900119895119898

represents the final classifica-tion result of instance

119896 Figure 2 shows the architecture of

MRBPNN 1 and Algorithm 1 shows the pseudocode

34 The Design of MRBPNN 2 MRBPNN 2 focuses on thescenario inwhich a BPNNhas a large volume of training dataConsider a training dataset 119878 with a number of instances Asshown in Figure 3 MRBPNN 2 divides 119878 into 119899 data chunksof which each data chunk 119904

119894is processed by a mapper for

training respectively

119878 =

119899

⋃1

119904119894 forall119904 isin 119904

119894| 119904 notin 119904

119899 119894 = 119899 (9)

Each mapper in the Hadoop cluster maintains a BPNNand each 119904

119894is considered as the input training data for the

neural network maintained in mapper119894 As a result each

BPNN in a mapper produces a classifier based on the trainedparameters

(mapper119894BPNN

119894 119904119894) 997888rarr classifier

119894 (10)

To reduce the computation overhead each classifier119894is

trained with a part of the original training dataset However

a critical issue is that the classification accuracy of a mapperwill be significantly degraded using only a portion of thetraining data To solve this issue MRBPNN 2 employsensemble technique to maintain the classification accuracyby combining a number of weak learners to create a stronglearner

3.4.1. Bootstrapping. Training diverse classifiers from a single training dataset has been proven to be simple compared with the case of finding a single strong learner [26]. A number of techniques exist for this purpose. A widely used approach is to resample the training dataset based on bootstrap aggregating (bagging), combining bootstrapping with majority voting. This reduces the variance of the misclassification errors and hence increases the accuracy of the classifications.

As mentioned in [26], balanced bootstrapping can reduce the variance when combining classifiers. Balanced bootstrapping ensures that each training instance appears equally often in the bootstrap samples; it is not always the case that each bootstrap sample contains all the training instances. The most efficient way of creating balanced bootstrap samples is to construct a string of the instances X_1, X_2, X_3, ..., X_n repeated B times, so that a sequence Y_1, Y_2, Y_3, ..., Y_Bn is obtained. A random permutation p of the integers from 1 to Bn is then taken. The first bootstrap sample is created from Y_p(1), Y_p(2), Y_p(3), ..., Y_p(n); the second bootstrap sample is created from Y_p(n+1), Y_p(n+2), Y_p(n+3), ..., Y_p(2n); and the process continues until Y_p((B-1)n+1), Y_p((B-1)n+2), Y_p((B-1)n+3), ..., Y_p(Bn) forms the B-th bootstrap sample. The bootstrap samples can then be used in bagging to increase the accuracy of classification.
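As a sketch of the balanced bootstrapping procedure just described (the function name and the use of NumPy are assumptions for illustration), each of the B samples is a block of n positions taken from one random permutation of the B-fold repeated instance list, so every instance appears exactly B times across all samples:

```python
# Illustrative sketch of balanced bootstrapping: every instance appears exactly B times
# across the B bootstrap samples of size n.
import numpy as np

def balanced_bootstrap(instances, B, seed=0):
    n = len(instances)
    rng = np.random.default_rng(seed)
    repeated = np.tile(np.arange(n), B)       # X_1..X_n repeated B times -> Y_1..Y_Bn
    perm = rng.permutation(repeated)          # random permutation p of the Bn positions
    # consecutive blocks of length n form the B balanced bootstrap samples
    return [[instances[i] for i in perm[b * n:(b + 1) * n]] for b in range(B)]

samples = balanced_bootstrap(list(range(10)), B=3)
assert sum(s.count(7) for s in samples) == 3  # instance 7 appears exactly B times overall
```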

3.4.2. Majority Voting. This type of ensemble classifier performs classifications based on the majority votes of the base classifiers [26]. Let the prediction of the i-th classifier for class j be P_ij ∈ {1, 0}, with i = 1, ..., I and j = 1, ..., c, where I is the number of classifiers and c is the number of classes. If the i-th classifier chooses class j, then P_ij = 1; otherwise P_ij = 0. The ensemble prediction for class k is then computed using

$$P_{ik} = \max_{j=1}^{c} \sum_{i=1}^{I} P_{ij}. \tag{11}$$
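A small sketch of the vote in (11), assuming the base classifiers' 0/1 predictions have been gathered into a matrix P with one row per classifier and one column per class (the names and shapes are illustrative):

```python
# Majority voting over I base classifiers and c classes, following equation (11).
import numpy as np

def majority_vote(P):
    # P[i, j] = 1 if classifier i chose class j, else 0 (shape: I x c)
    votes = P.sum(axis=0)          # total votes per class
    return int(votes.argmax())     # class with the maximum number of votes

P = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 1, 0]])
print(majority_vote(P))            # -> 1 (the class that received two of the three votes)
```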

3.4.3. Algorithm Design. At the beginning, MRBPNN_2 employs balanced bootstrapping to generate a number of subsets of the entire training dataset:

$$\text{balanced bootstrapping} \longrightarrow \{S_1, S_2, S_3, \ldots, S_n\}, \qquad \bigcup_{i=1}^{n} S_i = S, \tag{12}$$

where S_i represents the i-th subset of the entire dataset S and n represents the total number of subsets. Each S_i is saved in one file in HDFS. Each instance s_k = {a_1, a_2, a_3, ..., a_in}, s_k ∈ S_i, is defined in the format ⟨instance_k, target_k, type⟩, where

(i) instance_k represents one bootstrapped instance s_k, which is the input of the neural network;


Figure 3: MRBPNN_2 architecture. The training data is bootstrapped into n samples; each sample forms the data chunk of one mapper's BPNN, and the reducer collects the results of all the BPNNs and performs majority voting.

Input: S, T; Output: C; m mappers, one reducer
(1) Each mapper constructs one BPNN with in inputs, o outputs, and h neurons in the hidden layer
(2) Initialize w_ij = random1_ij ∈ (-1, 1), θ_j = random2_j ∈ (-1, 1)
(3) ∀s ∈ S, s_i = {a_1, a_2, a_3, ..., a_in}
    Input a_i → in_i; neuron j in the hidden layer computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^(-I_jh))
(4) Input o_j → out_i; neuron j in the output layer computes
    I_jo = Σ_{i=1}^{h} o_ih · w_ij + θ_j,  o_jo = 1 / (1 + e^(-I_jo))
(5) In each output compute Err_jo = o_jo (1 - o_jo)(target_j - o_jo)
(6) In the hidden layer compute Err_jh = o_jh (1 - o_jh) Σ_{i=1}^{o} Err_i · w_io
(7) Update w_ij = w_ij + η · Err_j · o_j,  θ_j = θ_j + η · Err_j
    Repeat (3), (4), (5), (6), (7)
    Until min(E[e^2]) = min(E[(target_j - o_jo)^2]); training terminates
(8) Divide T into {T_1, T_2, T_3, ..., T_m}, ⋃_{i=1}^{m} T_i = T
(9) Each mapper inputs t_j = {a_1, a_2, a_3, ..., a_in}, t_j ∈ T_i
(10) Execute (3), (4)
(11) Mapper outputs ⟨t_j, o_j⟩
(12) Reducer collects and merges all ⟨t_j, o_j⟩
     Repeat (9), (10), (11), (12)
     Until T is traversed
(13) Reducer outputs C
End

Algorithm 1: MRBPNN_1.


(ii) in represents the number of inputs of the neural network;
(iii) target_k represents the desirable output if instance_k is a training instance;
(iv) the type field has two values, "train" and "test", which mark the type of instance_k; if the value "test" is set, the target_k field should be left empty.

When MRBPNN_2 starts, each mapper constructs one BPNN and initializes the weights and biases of its neurons with random values between -1 and 1. A mapper then inputs one record in the form of ⟨instance_k, target_k, type⟩ from the input file.

The mapper firstly parses the data and retrieves the type of the instance. If the type value is "train", the instance is fed into the input layer. Secondly, each neuron in the different layers computes its output using (2) and (3) until the output layer generates an output, which indicates the completion of the feed forward process. Each mapper then starts a back-propagation process and computes and updates the weights and biases of its neurons using (4) to (6). The training process finishes when all the instances marked as "train" are processed and the error criterion is satisfied. All the mappers then start feed forwarding to classify the testing dataset. In this case, each neural network in a mapper generates the classification result of an instance at the output layer. Each mapper generates an intermediate output in the form of ⟨instance_k, o_jm⟩, where instance_k is the key and o_jm represents the output of the m-th mapper.

Finally, a reducer collects the outputs of all the mappers. The outputs with the same key are merged together. The reducer runs majority voting using (11) and outputs the result of instance_k into HDFS in the form of ⟨instance_k, r_k⟩, where r_k represents the voted classification result of instance_k. Figure 3 shows the algorithm architecture and Algorithm 2 presents the pseudocode of MRBPNN_2.
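For illustration, a Hadoop Streaming-style sketch of the MRBPNN_2 reducer side is given below; the tab-separated "instance key, predicted class" input format is an assumed simplification of the ⟨instance_k, o_jm⟩ pairs, and the helper names are hypothetical:

```python
# Sketch of the MRBPNN_2 reducer: outputs with the same instance key are merged and
# the voted class is written out, mirroring <instance_k, r_k>. Assumed input format:
# "instance_key<TAB>predicted_class" lines coming from the n mappers.
import sys
from collections import Counter, defaultdict

def reduce_votes(lines):
    votes = defaultdict(Counter)
    for line in lines:
        if not line.strip():
            continue
        key, predicted = line.rstrip("\n").split("\t")
        votes[key][predicted] += 1                 # one vote from one mapper's BPNN
    for key, counter in votes.items():
        r_k, _ = counter.most_common(1)[0]         # majority-voted class r_k
        print(f"{key}\t{r_k}")

if __name__ == "__main__":
    reduce_votes(sys.stdin)
```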

3.5. The Design of MRBPNN_3. MRBPNN_3 aims at the scenario in which a BPNN has a large number of neurons. The algorithm enables an entire MapReduce cluster to maintain one neural network across it; therefore, each mapper holds one or several neurons.

A number of iterations exist in the algorithm for a network with l layers: MRBPNN_3 employs l - 1 MapReduce jobs to implement the iterations. The feed forward process runs in l - 1 rounds, whilst the back-propagation process occurs only in the last round. A data format of the form ⟨index_k, instance_n, w_ij, θ_j, target_n, {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j}⟩ has been designed to guarantee the data passing between the Map and Reduce operations, where

(i) index_k represents the k-th reducer;
(ii) instance_n represents the n-th training or testing instance of the dataset; one instance is of the form instance_n = {a_1, a_2, ..., a_in}, where in is the length of the instance;
(iii) w_ij represents the set of weights of the input layer, whilst θ_j represents the biases of the neurons in the first hidden layer;
(iv) target_n represents the encoded desirable output of a training instance instance_n;
(v) the list {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j} represents the weights and biases of the following layers; it can be extended based on the number of layers of the network; for a standard three-layer neural network this part reduces to {w_ijo, θ_jo}.

Before MRBPNN_3 starts, each instance and the information defined by the data format are saved in one file in HDFS. The number of layers is determined by the length of the {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j} field, and the number of neurons in the next layer is determined by the number of files in the input folder. Differently from MRBPNN_1 and MRBPNN_2, MRBPNN_3 does not initialize an explicit neural network; instead, it maintains the network parameters based on the data defined in the data format.
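To make the record layout concrete, the sketch below parses one three-layer MRBPNN_3 record of the form ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩ from a delimited line; the ';' and ',' separators and the helper name are assumptions for illustration, not the format used by the authors.

```python
# Hypothetical parser for a three-layer MRBPNN_3 record stored as one line in HDFS:
# index_k ; instance_n ; w_ij ; theta_j ; target_n ; w_ijo ; theta_jo
# (fields separated by ';', numeric lists separated by ',') -- an assumed layout.
def parse_record(line):
    fields = line.rstrip("\n").split(";")
    vec = lambda s: [float(v) for v in s.split(",")] if s else []
    return {
        "index":    int(fields[0]),     # k-th reducer the output is routed to
        "instance": vec(fields[1]),     # a_1 ... a_in
        "w_ij":     vec(fields[2]),     # input-layer weights for this neuron
        "theta_j":  float(fields[3]),   # bias of this hidden-layer neuron
        "target":   vec(fields[4]),     # encoded desirable output
        "w_ijo":    vec(fields[5]),     # hidden-to-output weights
        "theta_jo": float(fields[6]),   # output-layer bias
    }

rec = parse_record("2;0.1,0.9,0.3;0.5,-0.2,0.7;0.1;1,0;0.3,0.4;0.2")
print(rec["index"], rec["instance"])
```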

When MRBPNN_3 starts, each mapper initially inputs one record from HDFS and computes the output of a neuron using (2) and (3). The output is emitted by the mapper with index_k as the key and the neuron's output as the value, in the form of ⟨index_k, o_j, {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j}, target_n⟩, where o_j represents the neuron's output.

The parameter index_k guarantees that the k-th reducer collects the output, which maintains the neural network structure. It should be mentioned that if the record is the first one processed by MRBPNN_3, {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j} will also be initialized with random values between -1 and 1 by the mappers. The k-th reducer collects the results from the mappers in the form of ⟨index_k', o_j, {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j}, target_n⟩. These k reducers generate k outputs. The index_k' of a reducer output explicitly tells the k'-th mapper to start processing this output file. Therefore, the number of neurons in the next layer can be determined by the number of reducer output files, which are the input data for the neurons of the next layer. Subsequently, the mappers start processing their corresponding inputs by computing (2) and (3) using w^2_ij and θ^2_j.

The above steps keep looping until the last round is reached. The processing of this last round consists of two steps. In the first step, the mappers process ⟨index_k', o_j, {w^(l-1)_ij, θ^(l-1)_j}, target_n⟩, compute the neurons' outputs, and generate results in the form of ⟨o_j, target_n⟩; one reducer collects the output results of all the mappers in the form of ⟨o_j1, o_j2, o_j3, ..., o_jk, target_n⟩. In the second step, the reducer executes the back-propagation process. The reducer computes new weights and biases for each layer using (4) to (6). MRBPNN_3 retrieves the previous outputs, weights, and biases from the input files of the mappers and then writes the updated weights and biases w_ij, θ_j, {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j} into the initial input file in the form of ⟨index_k, instance_n, w_ij, θ_j, target_n, {w^2_ij, θ^2_j}, ..., {w^(l-1)_ij, θ^(l-1)_j}⟩. The reducer then reads the next instance in the form of ⟨instance_(n+1), target_(n+1)⟩, for which the fields instance_n and target_n in the input file are replaced by instance_(n+1) and target_(n+1). The training process continues until all the instances are processed and the error criterion is satisfied.
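A rough sketch of one feed-forward round of MRBPNN_3 follows: each mapper holds one neuron, computes its output from the record, and emits it keyed by index_k so the k-th reducer can collect the layer's outputs. The key/value encoding reuses the hypothetical record dictionary from the previous sketch, and the final-round back-propagation in the reducer is omitted for brevity.

```python
# Sketch of one MRBPNN_3 feed-forward round: each mapper holds one neuron and emits
# <index_k, o_j> so that the k-th reducer can rebuild the layer's output vector.
import math

def neuron_output(inputs, weights, bias):
    I_j = sum(a * w for a, w in zip(inputs, weights)) + bias   # cf. equations (2) and (3)
    return 1.0 / (1.0 + math.exp(-I_j))

def map_round(record):
    # record is the parsed dictionary from the previous sketch
    o_j = neuron_output(record["instance"], record["w_ij"], record["theta_j"])
    # emit: the key routes the value to the k-th reducer; the remaining fields are
    # carried along so later rounds can keep computing with w_ijo and theta_jo
    return (record["index"], {"o_j": o_j,
                              "w_ijo": record["w_ijo"],
                              "theta_jo": record["theta_jo"],
                              "target": record["target"]})

def reduce_round(key, values):
    # the k-th reducer simply collects the neuron outputs of this layer
    return key, [v["o_j"] for v in values]
```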


Input: S, T; Output: C; n mappers, one reducer
(1) Each mapper constructs one BPNN with in inputs, o outputs, and h neurons in the hidden layer
(2) Initialize w_ij = random1_ij ∈ (-1, 1), θ_j = random2_j ∈ (-1, 1)
(3) Bootstrap {S_1, S_2, S_3, ..., S_n}, ⋃_{i=1}^{n} S_i = S
(4) Each mapper inputs s_k = {a_1, a_2, a_3, ..., a_in}, s_k ∈ S_i
    Input a_i → in_i; neuron j in the hidden layer computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^(-I_jh))
(5) Input o_j → out_i; neuron j in the output layer computes
    I_jo = Σ_{i=1}^{h} o_ih · w_ij + θ_j,  o_jo = 1 / (1 + e^(-I_jo))
(6) In each output compute Err_jo = o_jo (1 - o_jo)(target_j - o_jo)
(7) In the hidden layer compute Err_jh = o_jh (1 - o_jh) Σ_{i=1}^{o} Err_i · w_io
(8) Update w_ij = w_ij + η · Err_j · o_j,  θ_j = θ_j + η · Err_j
    Repeat (4), (5), (6), (7), (8)
    Until min(E[e^2]) = min(E[(target_j - o_jo)^2]); training terminates
(9) Each mapper inputs t_i = {a_1, a_2, a_3, ..., a_in}, t_i ∈ T
(10) Execute (4), (5)
(11) Mapper outputs ⟨t_j, o_j⟩
(12) Reducer collects ⟨t_j, o_jm⟩, m = (1, 2, ..., n)
(13) For each t_j compute C = max_{j=1}^{c} Σ_{i=1}^{I} o_jm
     Repeat (9), (10), (11), (12), (13)
     Until T is traversed
(14) Reducer outputs C
End

Algorithm 2: MRBPNN_2.

For classification, MRBPNN_3 only needs to run the feed forward process and collect the reducer output in the form of ⟨o_j, target_n⟩. Figure 4 shows the three-layer architecture of MRBPNN_3 and Algorithm 3 presents the pseudocode.

4. Performance Evaluation

We have implemented the three parallel BPNNs using Hadoop, an open source implementation framework of the MapReduce computing model. An experimental Hadoop cluster was built to evaluate the performance of the algorithms. The cluster consisted of 5 computers, in which 4 nodes are Datanodes and the remaining one is the Namenode. The cluster details are listed in Table 1.

Two testing datasets were prepared for the evaluations. The first dataset is a synthetic dataset; the second is the Iris dataset, which is a published machine learning benchmark dataset [27].

Table 1: Cluster details.

Namenode: CPU Core i7, 3 GHz; memory 8 GB; SSD 750 GB; OS Fedora.
Datanodes: CPU Core i7, 3.8 GHz; memory 32 GB; SSD 250 GB; OS Fedora.
Network bandwidth: 1 Gbps. Hadoop version: 2.3.0, 32 bits.

Table 2 shows the details of the two datasets.


Input: S, T; Output: C; m mappers, m reducers
Initially each mapper inputs ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩, k = (1, 2, ..., m)
(1) For instance_n = {a_1, a_2, a_3, ..., a_in}
    Input a_i → in_i; the neuron in the mapper computes
    I_jh = Σ_{i=1}^{in} a_i · w_ij + θ_j,  o_jh = 1 / (1 + e^(-I_jh))
(2) Mapper outputs ⟨index_k, o_jh, w_ijo, θ_jo, target_n⟩, k = (1, 2, ..., m)
(3) The k-th reducer collects and outputs ⟨index_k', o_jh, w_ijo, θ_jo, target_n⟩, k' = (1, 2, ..., m)
(4) Mapper k' inputs ⟨index_k', o_jh, w_ijo, θ_jo, target_n⟩
    Execute (2), outputs ⟨o_jo, target_n⟩
(5) The m-th reducer collects and merges all ⟨o_jo, target_n⟩ from the m mappers
    Feed forward terminates
(6) In the m-th reducer compute Err_jo = o_jo (1 - o_jo)(target_n - o_jo)
    Using the HDFS API retrieve w_ij, θ_j, o_jh, instance_n
    Compute Err_jh = o_jh (1 - o_jh) Σ_{i=1}^{o} Err_i · w_io
(7) Update w_ij, θ_j, w_ijo, θ_jo:
    w_ij = w_ij + η · Err_j · o_j,  θ_j = θ_j + η · Err_j
    into ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ijo, θ_jo⟩
    Back-propagation terminates
(8) Retrieve instance_(n+1) and target_(n+1)
    Update into ⟨index_k, instance_(n+1), w_ij, θ_j, target_(n+1), w_ijo, θ_jo⟩
    Repeat (1), (2), (3), (4), (5), (6), (7), (8)
    Until min(E[e^2]) = min(E[(target_j - o_jo)^2]); training terminates
(9) Each mapper inputs ⟨index_k, instance_t, w_ij, θ_j, target_n, w_ijo, θ_jo⟩, instance_t ∈ T
(10) Execute (1), (2), (3), (4), (5), (6), outputs ⟨o_jo, target_n⟩
(11) The m-th reducer outputs C
End

Algorithm 3: MRBPNN_3.

Table 2: Dataset details.

Data type        Instance number    Instance length    Element range    Class number
Synthetic data   200                32                 0 and 1          4
Iris data        150                4                  (0, 8)           3

We implemented a three-layer neural network with 16 neurons in the hidden layer. The Hadoop cluster was configured with 16 mappers and 16 reducers. The number of instances was varied from 10 to 1000 for evaluating the precision of the algorithms, and the size of the datasets was varied from 1 MB to 1 GB for evaluating the computation efficiency of the algorithms. Each experiment was executed five times and the final result is the average. The precision p is computed using

$$p = \frac{r}{r + w} \times 100\%, \tag{13}$$

where r represents the number of correctly recognized instances, w represents the number of wrongly recognized instances, and p represents the precision.
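A one-line check of (13) with hypothetical counts:

```python
# Precision as defined in equation (13).
def precision(r, w):
    return r / (r + w) * 100.0

print(precision(195, 5))   # -> 97.5, e.g. 195 of 200 testing instances recognized correctly
```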


Figure 4: MRBPNN_3 structure. In each of the l - 1 rounds the mappers take an instance together with the current weights and biases and compute the neuron outputs; the reducer collects the outputs and, in the last round, adjusts the weights and biases before the next instance is processed.

Figure 5: The precision of MRBPNN_1 on the two datasets, plotted as precision (%) against the number of training instances: (a) the synthetic dataset; (b) the Iris dataset.

4.1. Classification Precision. The classification precision of MRBPNN_1 was evaluated using a varied number of training instances. The maximum number of training instances was 1000, whilst the maximum number of testing instances was also 1000; the large numbers of instances were obtained by data duplication. Figure 5 shows the precision results of MRBPNN_1 in classification using 10 mappers. It can be observed that the precision keeps increasing with an increase in the number of training instances. Finally, the precision of MRBPNN_1 on the synthetic dataset reaches 100%, while the precision on the Iris dataset reaches 97.5%. In this test the behavior of the parallel MRBPNN_1 is quite similar to that of the standalone BPNN. The reason is that MRBPNN_1 does not distribute the BPNN among the Hadoop nodes; instead, it runs on Hadoop to distribute the data.

To evaluate MRBPNN_2, we designed 1000 training instances and 1000 testing instances using data duplication. The mappers were trained on subsets of the training instances and produced the classification results of the 1000 testing instances based on bootstrapping and majority voting. MRBPNN_2 employed 10 mappers, each of which inputs a number of training instances varying from 10 to 1000. Figure 6 presents the precision results of MRBPNN_2 on the two testing datasets. It shows that, along with the increasing number of training instances in each sub-neural network, the precision achieved by majority voting keeps increasing. The precision of MRBPNN_2 on the synthetic dataset reaches 100%, whilst the precision on the Iris dataset reaches 97.5%, which is higher than that of MRBPNN_1.

MRBPNN_3 implements a fully parallel and distributed neural network using Hadoop to deal with a complex neural network with a large number of neurons. Figure 7 shows the performance of MRBPNN_3 using 16 mappers. The precision also increases along with the increasing number of training instances for both datasets.


Figure 6: The precision of MRBPNN_2 on the two datasets, plotted as precision (%) against the number of instances in each sub-BPNN: (a) the synthetic dataset; (b) the Iris dataset.

Figure 7: The precision of MRBPNN_3 on the two datasets, plotted as precision (%) against the number of training instances: (a) the synthetic dataset; (b) the Iris dataset.

It can also be observed that the stability of the curves is quite similar to that of MRBPNN_1; both curves show more fluctuations than that of MRBPNN_2.

Figure 8 compares the overall precision of the three parallel BPNNs. MRBPNN_1 and MRBPNN_3 perform similarly, whereas MRBPNN_2 performs the best using bootstrapping and majority voting. In addition, the precision of MRBPNN_2 in classification is more stable than that of both MRBPNN_1 and MRBPNN_3.

Figure 9 presents the stability of the three algorithms on the synthetic dataset, showing that the precision of MRBPNN_2 in classification is highly stable compared with that of both MRBPNN_1 and MRBPNN_3.

4.2. Computation Efficiency. A number of experiments were carried out in terms of computation efficiency using the synthetic dataset. The first experiment was to evaluate the efficiency of MRBPNN_1 using 16 mappers. The volume of data instances was varied from 1 MB to 1 GB. Figure 10 clearly shows that the parallel MRBPNN_1 significantly outperforms the standalone BPNN. The computation overhead of the standalone BPNN is low when the data size is less than 16 MB. However, the overhead of the standalone BPNN increases sharply with increasing data sizes. This is mainly because MRBPNN_1 distributes the testing data onto the 4 Datanodes in the Hadoop cluster, which run in parallel in classification.

Figure 11 shows the computation efficiency of MRBPNN_2 using 16 mappers. It can be observed that when the data size is small the standalone BPNN performs better. However, the computation overhead of the standalone BPNN increases rapidly when the data size is larger than 64 MB. Similar to MRBPNN_1, the parallel MRBPNN_2 scales with increasing data sizes using the Hadoop framework.

Figure 12 shows the computation overhead of MRBPNN_3 using 16 mappers. MRBPNN_3 incurs a higher overhead than both MRBPNN_1 and MRBPNN_2. The reason is that both MRBPNN_1 and MRBPNN_2 run training and classification within one MapReduce job, which means the mappers and reducers only need to start once, whereas MRBPNN_3 consists of a number of jobs, so the algorithm has to start the mappers and reducers a number of times.


Figure 8: Precision comparison of the three parallel BPNNs (precision (%) against the number of training instances for MRBPNN_1, MRBPNN_2, and MRBPNN_3): (a) the synthetic dataset; (b) the Iris dataset.

Figure 9: The stability of the three parallel BPNNs (precision (%) over five tests for MRBPNN_1, MRBPNN_2, and MRBPNN_3).

Figure 10: Computation efficiency of MRBPNN_1 (execution time in seconds against data volumes from 1 MB to 1024 MB, standalone BPNN versus MRBPNN_1).

Figure 11: Computation efficiency of MRBPNN_2 (execution time in seconds against data volumes from 1 MB to 1024 MB, standalone BPNN versus MRBPNN_2).

This process incurs a large system overhead, which affects its computation efficiency. Nevertheless, Figure 12 shows the feasibility of fully distributing a BPNN to deal with a complex neural network with a large number of neurons.

5. Conclusion

In this paper we have presented three parallel neural networks (MRBPNN_1, MRBPNN_2, and MRBPNN_3) based on the MapReduce computing model for dealing with data intensive scenarios in terms of the size of the classification dataset, the size of the training dataset, and the number of neurons, respectively. Overall, the experimental results have shown that the computation overhead can be significantly reduced by using a number of computers in parallel. MRBPNN_3 shows the feasibility of fully distributing a BPNN in a computer cluster but incurs a high overhead of computation due to the continuous starting and stopping of the mappers and reducers in the Hadoop environment.


Figure 12: Computation efficiency of MRBPNN_3 (training efficiency in seconds against data volumes from 1 MB to 1024 MB).

One of the future works is to research in-memory processing to further enhance the computation efficiency of MapReduce in dealing with data intensive tasks with many iterations.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors appreciate the support from the National Science Foundation of China (no. 51437003).

References

[1] Networked European Software and Services Initiative (NESSI), "Big data: a new world of opportunities," NESSI White Paper, 2012, http://www.nessi-europe.com/Files/Private/NESSI_WhitePaper_BigData.pdf.
[2] P. C. Zikopoulos, C. Eaton, D. deRoos, T. Deutsch, and G. Lapis, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill, 2012.
[3] M. H. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS Publishing, 1996.
[4] R. Gu, F. Shen, and Y. Huang, "A parallel computing platform for training large scale neural networks," in Proceedings of the IEEE International Conference on Big Data, pp. 376-384, October 2013.
[5] V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing, Benjamin Cummings/Addison Wesley, San Francisco, Calif, USA, 2002.
[6] Message Passing Interface, 2015, http://www.mcs.anl.gov/research/projects/mpi/.
[7] L. N. Long and A. Gupta, "Scalable massively parallel artificial neural networks," Journal of Aerospace Computing, Information, and Communication, vol. 5, no. 1, pp. 3-15, 2008.
[8] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.
[9] Y. Liu, M. Li, M. Khan, and M. Qi, "A MapReduce based distributed LSI for scalable information retrieval," Computing and Informatics, vol. 33, no. 2, pp. 259-280, 2014.
[10] N. K. Alham, M. Li, Y. Liu, and M. Qi, "A MapReduce-based distributed SVM ensemble for scalable image classification and annotation," Computers and Mathematics with Applications, vol. 66, no. 10, pp. 1920-1934, 2013.
[11] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of back propagation neural network in the classification of high resolution remote sensing image: take remote sensing image of Beijing for instance," in Proceedings of the 18th International Conference on Geoinformatics, pp. 1-6, June 2010.
[12] N. L. D. Khoa, K. Sakakibara, and I. Nishikawa, "Stock price forecasting using back propagation neural networks with time and profit based adjusted weight factors," in Proceedings of the SICE-ICASE International Joint Conference, pp. 5484-5488, IEEE, Busan, Republic of Korea, October 2006.
[13] M. Rizwan, M. Jamil, and D. P. Kothari, "Generalized neural network approach for global solar energy estimation in India," IEEE Transactions on Sustainable Energy, vol. 3, no. 3, pp. 576-584, 2012.
[14] Y. Wang, B. Li, R. Luo, Y. Chen, N. Xu, and H. Yang, "Energy efficient neural networks for big data analytics," in Proceedings of the 17th International Conference on Design, Automation and Test in Europe (DATE '14), March 2014.
[15] D. Nguyen and B. Widrow, "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights," in Proceedings of the International Joint Conference on Neural Networks, vol. 3, pp. 21-26, June 1990.
[16] H. Kanan and M. Khanian, "Reduction of neural network training time using an adaptive fuzzy approach in real time applications," International Journal of Information and Electronics Engineering, vol. 2, no. 3, pp. 470-474, 2012.
[17] A. A. Ikram, S. Ibrahim, M. Sardaraz, M. Tahir, H. Bajwa, and C. Bach, "Neural network based cloud computing platform for bioinformatics," in Proceedings of the 9th Annual Conference on Long Island Systems, Applications and Technology (LISAT '13), pp. 1-6, IEEE, Farmingdale, NY, USA, May 2013.
[18] V. Rao and S. Rao, "Application of artificial neural networks in capacity planning of cloud based IT infrastructure," in Proceedings of the 1st IEEE International Conference on Cloud Computing for Emerging Markets (CCEM '12), pp. 38-41, October 2012.
[19] A. A. Huqqani, E. Schikuta, and E. Mann, "Parallelized neural networks as a service," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '14), pp. 2282-2289, July 2014.
[20] J. Yuan and S. Yu, "Privacy preserving back-propagation neural network learning made practical with cloud computing," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 1, pp. 212-221, 2014.
[21] Z. Liu, H. Li, and G. Miao, "MapReduce-based back propagation neural network over large scale mobile data," in Proceedings of the 6th International Conference on Natural Computation (ICNC '10), pp. 1726-1730, August 2010.
[22] B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang, "Mars: a MapReduce framework on graphics processors," in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT '08), pp. 260-269, October 2008.
[23] K. Taura, K. Kaneda, T. Endo, and A. Yonezawa, "Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources," ACM SIGPLAN Notices, vol. 38, no. 10, pp. 216-229, 2003.
[24] Apache Hadoop, 2015, http://hadoop.apache.org/.
[25] J. Venner, Pro Hadoop, Springer, New York, NY, USA, 2009.
[26] N. K. Alham, Parallelizing support vector machines for scalable image annotation [Ph.D. thesis], Brunel University, Uxbridge, UK, 2011.
[27] The Iris Dataset, 2015, https://archive.ics.uci.edu/ml/datasets/Iris.


Page 4: Research Article MapReduce Based Parallel Neural Networks in …downloads.hindawi.com/journals/cin/2015/297672.pdf · 2019. 7. 31. · Research Article MapReduce Based Parallel Neural

4 Computational Intelligence and Neuroscience

Mapper

Mapper

Mapper

Training data

Training data

Training data

Data chunk 2

Data chunk 1

Data chunk n

ReducerFinal results

Figure 2 MRBPNN 1 architecture

the formof ⟨instance119896 target

119896 type⟩ fromafile and parses the

data records If the value of type field is ldquotrainrdquo instance119896is

input into the input layer of the neural networkThe networkcomputes the output of each layer using (2) and (3) untilthe output layer generates an output which indicates thecompletion of the feed forward process And then the neuralnetwork in each mapper starts the back-propagation processIt computes and updates new weights and biases for its neu-rons using (4) to (6) The neural network inputs instance

119896+1

Repeat the feed forward and back-propagation process untilall the instances which are labeled as ldquotrainrdquo are processedand the error is satisfied

Each mapper starts classifying instances labeled as ldquotestrdquoby running the feed forward process As each mapper onlyclassifies a portion of the entire testing dataset the efficiencyis improved At last each mapper outputs intermediate out-put in the form of ⟨instance

119896 119900119895119898

⟩ where instance119896is the key

and 119900119895119898

represents the output of the 119898th mapperOne reducer starts collecting and merging all the outputs

of the mappers Finally the reducer outputs ⟨instance119896 119900119895119898

into HDFS In this case 119900119895119898

represents the final classifica-tion result of instance

119896 Figure 2 shows the architecture of

MRBPNN 1 and Algorithm 1 shows the pseudocode

34 The Design of MRBPNN 2 MRBPNN 2 focuses on thescenario inwhich a BPNNhas a large volume of training dataConsider a training dataset 119878 with a number of instances Asshown in Figure 3 MRBPNN 2 divides 119878 into 119899 data chunksof which each data chunk 119904

119894is processed by a mapper for

training respectively

119878 =

119899

⋃1

119904119894 forall119904 isin 119904

119894| 119904 notin 119904

119899 119894 = 119899 (9)

Each mapper in the Hadoop cluster maintains a BPNNand each 119904

119894is considered as the input training data for the

neural network maintained in mapper119894 As a result each

BPNN in a mapper produces a classifier based on the trainedparameters

(mapper119894BPNN

119894 119904119894) 997888rarr classifier

119894 (10)

To reduce the computation overhead each classifier119894is

trained with a part of the original training dataset However

a critical issue is that the classification accuracy of a mapperwill be significantly degraded using only a portion of thetraining data To solve this issue MRBPNN 2 employsensemble technique to maintain the classification accuracyby combining a number of weak learners to create a stronglearner

341 Bootstrapping Training diverse classifiers from a singletraining dataset has been proven to be simple compared withthe case of finding a strong learner [26] A number of tech-niques exist for this purpose A widely used technique is toresample the training dataset based on bootstrap aggregatingsuch as bootstrapping and majority voting This can reducethe variance of misclassification errors and hence increasesthe accuracy of the classifications

Asmentioned in [26] balanced bootstrapping can reducethe variance when combining classifiers Balanced bootstrap-ping ensures that each training instance equally appears in thebootstrap samples It might not be always the case that eachbootstrapping sample contains all the training instances Themost efficient way of creating balanced bootstrap samples isto construct a string of instances119883

1 1198832 1198833 119883

119899repeating

119861 times so that a sequence of 1198841 1198842 1198843 119884

119861119899can be

achieved A random permutation 119901 of the integers from 1 to119861119899is taken Therefore the first bootstrapping sample can be

created from 119884119901(1) 119884119901(2) 119884119901(3) 119884

119901(119899) In addition the

second bootstrapping sample is created from119884119901(119899+1) 119884

119901(119899+

2) 119884119901(119899 + 3) 119884

119901(2119899) and the process continues until

119884119901((119861minus1)119899+1) 119884

119901((119861minus1)119899+2) 119884

119901((119861minus1)119899+3) 119884

119901(119861119899) is

the119861th bootstrapping sampleThebootstrapping samples canbe used in bagging to increase the accuracy of classification

342 Majority Voting This type of ensemble classifiers per-forms classifications based on the majority votes of the baseclassifiers [26] Let us define the prediction of the 119894th classifier119875119894as 119875119894119895

isin 1 0 119894 = 1 119868 and 119895 = 1 119888 where 119868 is thenumber of classifiers and 119888 is the number of classes If the 119894thclassifier chooses class 119895 then 119875

119894119895= 1 otherwise 119875

119894119895= 0

Then the ensemble prediction for class 119896 is computed using

119875119894119896

=119888max119895=1

119868

sum119894=1

119875119894119895 (11)

343 Algorithm Design At the beginning MRBPNN 2employs balanced bootstrapping to generate a number ofsubsets of the entire training dataset

balanced bootstrapping 997888rarr 1198781 1198782 1198783 119878

119899

119899

⋃119894=1

119878119894= 119878

(12)

where 119878119894represents the 119894th subset which belongs to entire

dataset 119878 119899 represents the total number of subsetsEach 119878

119894is saved in one file in HDFS Each instance

119904119896

= 1198861 1198862 1198863 119886

119894119899 119904119896

isin 119878119894 is defined in the format of

⟨instance119896 target

119896 type⟩ where

(i) instance119896represents one bootstrapped instance 119904

119896

which is the input of neural network

Computational Intelligence and Neuroscience 5

Training data

Sample 3

Sample 2

Sample n

Sample 1

MR 2Data

chunk

MR 1Data

chunk

MR 3Data

chunk

MR nData

chunk

Major voting

Reducecollects results of all BPNNs

Figure 3 MRBPNN 2 architecture

Input 119878 119879 Output 119862119898 mappers one reducer(1) Each mapper constructs one BPNN with 119894119899 inputs 119900 outputs ℎ neurons in hidden layer(2) Initialize 119908

119894119895= random

1119894119895isin (minus1 1) 120579

119895= random

2119895isin (minus1 1)

(3) forall119904 isin 119878 119904119894= 1198861 1198862 1198863 119886

119894119899

Input 119886119894rarr 119894119899

119894 neuron

119895in hidden layer computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(4) Input 119900119895

rarr out119894 neuron

119895in output layer computes

119868119895119900

=

sum119894=1

119900119895ℎ

sdot 119908119894119895+ 120579119895

119900119895119900

=1

1 + 119890119868119895119900

(5) In each output computeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119895minus 119900119895119900)

(6) In hidden layer compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(7) Update119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

Repeat (3) (4) (5) (6) (7)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(8) Divide 119879 into 119879

1 1198792 1198793 119879

119898 ⋃119898119894=0

119879119894= 119879

(9) Each mapper inputs 119905119895= 1198861 1198862 1198863 119886

119894119899 119905119895isin 119879119894

(10) Execute (3) (4)(11) Mapper outputs ⟨119905

119895 119900119895⟩

(12) Reducer collects and merges all ⟨119905119895 119900119895⟩

Repeat (9) (10) (11) (12)Until 119879 is traversed

(13) Reducer outputs 119862End

Algorithm 1 MRBPNN 1

6 Computational Intelligence and Neuroscience

(ii) 119894119899 represents the number of inputs of the neuralnetwork

(iii) target119896represents the desirable output if instance

119896is

a training instance(iv) type field has two values ldquotrainrdquo and ldquotestrdquo which

marks the type of instance119896 if ldquotestrdquo value is set

target119896field should be left empty

When MRBPNN 2 starts each mapper constructs oneBPNN and initializes weights and biases with random valuesbetween minus1 and 1 for its neurons And then a mapper inputsone record in the form of ⟨instance

119896 target

119896 type⟩ from the

input fileThe mapper firstly parses the data and retrieves the type

of the instance If the type value is ldquotrainrdquo the instance is fedinto the input layer Secondly each neuron in different layerscomputes its output using (2) and (3) until the output layergenerates an output which indicates the completion of thefeed forward process Eachmapper starts a back-propagationprocess and computes and updates weights and biases forneurons using (4) to (6) The training process finishes untilall the instances marked as ldquotrainrdquo are processed and error issatisfied All the mappers start feed forwarding to classify thetesting dataset In this case each neural network in a mappergenerates the classification result of an instance at the outputlayer Each mapper generates an intermediate output in theform of ⟨instance

119896 119900119895119898

⟩ where instance119896is the key and 119900

119895119898

represents the outputs of the 119898th mapperFinally a reducer collects the outputs of all the mappers

The outputs with the same key are merged together Thereducer runs majority voting using (11) and outputs theresult of instance

119896into HDFS in the form of ⟨instance

119896 119903119896⟩

where 119903119896represents the voted classification result of instance

119896

Figure 3 shows the algorithm architecture and Algorithm 2presents the pseudocode of MRBPNN 2

35 The Design of MRBPNN 3 MRBPNN 3 aims at thescenario inwhich a BPNNhas a large number of neuronsThealgorithm enables an entire MapReduce cluster to maintainone neural network across it Therefore each mapper holdsone or several neurons

There are a number of iterations that exist in thealgorithm with 119897 layers MRBPNN 3 employs a number of119897 minus 1 MapReduce jobs to implement the iterations Thefeed forward process runs in 119897 minus 1 rounds whilst theback-propagation process occurs only in the last round Adata format in the form of ⟨index

119896 instance

119899 119908119894119895 120579119895 target

119899

1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895⟩ has been designed to guarantee the

data passing between Map and Reduce operations where

(i) index119896represents the 119896th reducer

(ii) instance119899

represents the 119899th training or testinginstance of the dataset one instance is in the form ofinstance

119899= 1198861 1198862 119886

119894119899 where 119894119899 is length of the

instance(iii) 119908

119894119895represents a set of weights of an input layer whilst

120579119895represents the biases of the neurons in the first

hidden layer

(iv) target119899represents the encoded desirable output of a

training instance instance119899

(v) the list of 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 represents the

weights and biases for next layers it can be extendedbased on the layers of the network for a standardthree-layer neural network this option becomes119908119894119895119900

120579119895119900

Before MRBPNN 3 starts each instance and the infor-mation defined by data format are saved in one file inHDFS The number of the layers is determined by thelength of 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 field The number of neurons

in the next layer is determined by the number of filesin the input folder Generally different from MRBPNN 1and MRBPNN 2 MRBPNN 3 does not initialize an explicitneural network instead it maintains the network parametersbased on the data defined in the data format

When MRBPNN 3 starts each mapper initially inputsone record fromHDFS And then it computes the output of aneuronusing (2) and (3)Theoutput is generated by amapperwhich labels index

119896as a key and the neuronrsquos output as a value

in the form of

⟨index119896 119900119895 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 target

119899⟩ where 119900

119895

represents the neuronrsquos output

Parameter index119896can guarantee that the 119896th reducer col-

lects the output which maintains the neural network struc-ture It should be mentioned that if the record is the first oneprocessed by MRBPNN 3 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 will be also

initialized with random values between minus1 and 1 by the map-pers The 119896th reducer collects the results from the mappersin the form of ⟨index

1198961015840 119900119895 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 target

119899⟩

These 119896 reducers generate 119896 outputs The index1198961015840 of the

reducer output explicitly tells the 1198961015840th mapper to start

processing this output file Therefore the number of neuronsin the next layer can be determined by the number of reduceroutput files which are the input data for the next layerneurons Subsequently mappers start processing their corre-sponding inputs by computing (2) and (3) using 119908

2

119894119895and 1205792

119895

The above steps keep looping until reaching the lastround The processing of this last round consists of twosteps The first step is that mappers also process ⟨index

1198961015840

119900119895 119908119897minus1

119894119895 120579119897minus1

119895 target

119899⟩ compute neuronsrsquo outputs and gen-

erate results in the forms of ⟨119900119895 target

119899⟩ One reducer

collects the output results of all the mappers in the form of⟨1199001198951 1199001198952 1199001198953 119900

119895119896 target

119899⟩ In the second step the reducer

executes the back-propagation process The reducer com-putes new weights and biases for each layer using (4) to(6) MRBPNN 3 retrieves the previous outputs weights andbiases from the input files of mappers and then it writesthe updated weights and biases 119908

119894119895 120579119895 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895

into the initial input file in the form of ⟨index119896 instance

119899 119908119894119895

120579119895 target

119899 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895⟩The reducer reads the sec-

ond instance in the form of ⟨instance119899+1

target119899+1

⟩ for whichthe fields instance

119899and target

119899in the input file are replaced

by instance119899+1

and target119899+1

The training process continuesuntil all the instances are processed and error is satisfied

Computational Intelligence and Neuroscience 7

Input 119878 119879 Output 119862119899 mappers one reducer(1) Each mapper constructs one BPNN with 119894119899 inputs 119900 outputs ℎ neurons in hidden layer(2) Initialize 119908

119894119895= random

1119894119895isin (minus1 1) 120579

119895= random

2119895isin (minus1 1)

(3) Bootstrap 1198781 1198782 1198783 119878

119899 ⋃119899119894=0

119878119894= 119878

(4) Each mapper inputs 119904119896= 1198861 1198862 1198863 119886

119894119899 119904119896isin 119878119894

Input 119886119894rarr 119894119899

119894 neuron

119895in hidden layer computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(5) Input 119900119895

rarr out119894 neuron

119895in output layer computes

119868119895119900

=

sum119894=1

119900119895ℎ

sdot 119908119894119895+ 120579119895

119900119895119900

=1

1 + 119890119868119895119900

(6) In each output computeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119895minus 119900119895119900)

(7) In hidden layer compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(8) Update119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

Repeat (3) (4) (5) (6) (7)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(9) Each mapper inputs 119905

119894= 1198861 1198862 1198863 119886

119894119899 119905119894isin 119879

(10)Execute (4) (5)(11) Mapper outputs ⟨119905

119895 119900119895⟩

(12) Reducer collects ⟨119905119895 119900119895119898

⟩ 119898 = (1 2 119899)

(13) For each 119905119895

Compute 119862 = max119888119895=1

sum119868

119894=1119900119895119898

Repeat (9) (10) (11) (12) (13)Until 119879 is traversed(14) Reducer outputs 119862End

Algorithm 2 MRBPNN 2

For classification MRBPNN 3 only needs to run the feedforwarding process and collects the reducer output in theform of ⟨119900

119895 target

119899⟩ Figure 4 shows three-layer architecture

of MRBPNN 3 and Algorithm 3 presents the pseudocode

4 Performance Evaluation

We have implemented the three parallel BPNNs usingHadoop an open source implementation framework of theMapReduce computing model An experimental Hadoopcluster was built to evaluate the performance of the algo-rithmsThe cluster consisted of 5 computers in which 4 nodesare Datanodes and the remaining one is Namenode Thecluster details are listed in Table 1

Two testing datasets were prepared for evaluations Thefirst dataset is a synthetic datasetThe second is the Iris dataset

Table 1 Cluster details

NamenodeCPU Core i73GHz

Memory 8GBSSD 750GBOS Fedora

DatanodesCPU Core i738GHz

Memory 32GBSSD 250GBOS Fedora

Network bandwidth 1GbpsHadoop version 230 32 bits

which is a published machine learning benchmark dataset[27] Table 2 shows the details of the two datasets

8 Computational Intelligence and Neuroscience

Input 119878 119879 Output 119862119898 mappers 119898 reducersInitially each mapper inputs

⟨index119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ 119896 = (1 2 119898)

(1) For instance119899= 1198861 1198862 1198863 119886

119894119899

Input 119886119894rarr 119894119899

119894 neuron inmapper computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(2)Mapper outputs output⟨index

119896 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 119903 = (1 2 119898)

(3) 119896th reducer collects outputOutput ⟨index

1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 1198961015840 = (1 2 119898)

(4) Mapper1198961015840 inputs

⟨index1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩

Execute (2) outputs ⟨119900119895119900 target

119899⟩

(5) 119898th reducer collects and merges all ⟨119900119895119900 target

119899⟩ from 119898 mappers

Feed forward terminates(6) In 119898th reducer

ComputeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119899minus 119900119895119900)

Using HDFS API retrieve 119908119894119895 120579119895 119900119895ℎ instance

119899

Compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(7) Update 119908119894119895 120579119895 119908119894119895119900

120579119895119900

119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

into⟨index

119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩

Back propagation terminates(8) Retrieve instance

119899+1and target

119899+1

Update into ⟨index119896 instance

119899+1 119908119894119895 120579119895 target

119899+1 119908119894119895119900

120579119895119900⟩

Repeat (1) (2) (3) (4) (5) (6) (7) (8) (9)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(9) Each mapper inputs ⟨index

119896 instance

119905 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ instance

119905isin 119879

(10) Execute (1) (2) (3) (4) (5) (6) outputs ⟨119900119895119900 target

119899⟩

(11) 119898th reducer outputs 119862End

Algorithm 3 MRBPNN 3

Table 2 Dataset details

Data type Instancenumber

Instancelength

Elementrange

Classnumber

Syntheticdata 200 32 0 and 1 4

Iris data 150 4 (0 8) 3

We implemented a three-layer neural network with16 neurons in the hidden layer The Hadoop cluster wasconfigured with 16 mappers and 16 reducers The numberof instances was varied from 10 to 1000 for evaluating

the precision of the algorithms The size of the datasets wasvaried from 1MB to 1GB for evaluating the computation effi-ciency of the algorithms Each experiment was executed fivetimes and the final result was an average The precision 119901 iscomputed using

119901 =119903

119903 + 119908times 100 (13)

where 119903 represents the number of correctly recognizedinstances 119908 represents the number of wrongly recognizedinstances 119901 represents the precision

Computational Intelligence and Neuroscience 9

Figure 4: MRBPNN 3 structure. (Each of the l − 1 feed-forward rounds distributes an instance together with weights and biases across mappers 1 to n; a reducer collects the outputs, adjusts the weights and biases, and passes them on with the next instance.)

Figure 5: The precision of MRBPNN 1 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset. (Both panels plot precision (%) against the number of training instances, from 10 to 900.)

4.1. Classification Precision. The classification precision of MRBPNN 1 was evaluated using a varied number of training instances. The maximum number of the training instances was 1000, whilst the maximum number of the testing instances was also 1000. The large number of instances is based on the data duplication. Figure 5 shows the precision results of MRBPNN 1 in classification using 10 mappers. It can be observed that the precision keeps increasing with an increase in the number of training instances. Finally, the precision of MRBPNN 1 on the synthetic dataset reaches 100%, while the precision on the Iris dataset reaches 97.5%. In this test the behavior of the parallel MRBPNN 1 is quite similar to that of the standalone BPNN. The reason is that MRBPNN 1 does not distribute the BPNN among the Hadoop nodes; instead, it runs on Hadoop to distribute the data.

To evaluate MRBPNN 2, we designed 1000 training instances and 1000 testing instances using data duplication. The mappers were trained by subsets of the training instances and produced the classification results of the 1000 testing instances based on bootstrapping and majority voting. MRBPNN 2 employed 10 mappers, each of which inputs a number of training instances varying from 10 to 1000. Figure 6 presents the precision results of MRBPNN 2 on the two testing datasets. It also shows that, along with the increasing number of training instances in each subneural network, the precision achieved by majority voting keeps increasing. The precision of MRBPNN 2 on the synthetic dataset reaches 100%, whilst the precision on the Iris dataset reaches 97.5%, which is higher than that of MRBPNN 1.
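The majority-voting step that turns the sub-BPNN outputs into MRBPNN 2's final label can be sketched as below. The ten votes and the class names are made-up illustrations; the paper's reducer merges ⟨instance_k, o_jm⟩ records rather than plain strings, so this only shows the voting logic itself.

from collections import Counter

def majority_vote(votes):
    # Return the class predicted by most sub-BPNNs (mappers);
    # ties fall to the label counted first.
    return Counter(votes).most_common(1)[0][0]

# Hypothetical votes from 10 mappers for one testing instance.
votes_for_instance = ["setosa", "setosa", "versicolor", "setosa", "setosa",
                      "setosa", "virginica", "setosa", "setosa", "setosa"]
print(majority_vote(votes_for_instance))  # setosa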

Figure 6: The precision of MRBPNN 2 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset. (Both panels plot precision (%) against the number of training instances in each sub-BPNN, from 10 to 900.)

MRBPNN 3 implements a fully parallel and distributed neural network using Hadoop to deal with a complex neural network with a large number of neurons. Figure 7 shows the performance of MRBPNN 3 using 16 mappers. The precision also increases along with the increasing number of training instances for both datasets. It also can be observed that the stability of the curve is quite similar to that of MRBPNN 1; both curves have more fluctuations than that of MRBPNN 2.

Figure 7: The precision of MRBPNN 3 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset. (Both panels plot precision (%) against the number of training instances, from 10 to 900.)

Figure 8 compares the overall precision of the three parallel BPNNs. MRBPNN 1 and MRBPNN 3 perform similarly, whereas MRBPNN 2 performs the best using bootstrapping and majority voting. In addition, the precision of MRBPNN 2 in classification is more stable than that of both MRBPNN 1 and MRBPNN 3.

Figure 9 presents the stability of the three algorithms on the synthetic dataset, showing that the precision of MRBPNN 2 in classification is highly stable compared with that of both MRBPNN 1 and MRBPNN 3.

Figure 8: Precision comparison of the three parallel BPNNs on (a) the synthetic dataset and (b) the Iris dataset. (Both panels plot precision (%) against the number of training instances for MRBPNN_1, MRBPNN_2, and MRBPNN_3.)

Figure 9: The stability of the three parallel BPNNs. (Precision (%) of MRBPNN_1, MRBPNN_2, and MRBPNN_3 over five test runs on the synthetic dataset.)

4.2. Computation Efficiency. A number of experiments were carried out in terms of computation efficiency using the synthetic dataset. The first experiment was to evaluate the efficiency of MRBPNN 1 using 16 mappers. The volume of data instances was varied from 1 MB to 1 GB. Figure 10 clearly shows that the parallel MRBPNN 1 significantly outperforms the standalone BPNN. The computation overhead of the standalone BPNN is low when the data size is less than 16 MB. However, the overhead of the standalone BPNN increases sharply with increasing data sizes. This is mainly because MRBPNN 1 distributes the testing data onto the 4 data nodes of the Hadoop cluster, which run in parallel in classification.

Figure 10: Computation efficiency of MRBPNN 1. (Execution time (s) of the standalone BPNN and MRBPNN_1 for data volumes from 1 MB to 1024 MB.)

Figure 11 shows the computation efficiency of MRBPNN 2 using 16 mappers. It can be observed that when the data size is small the standalone BPNN performs better. However, the computation overhead of the standalone BPNN increases rapidly when the data size is larger than 64 MB. Similar to MRBPNN 1, the parallel MRBPNN 2 scales with increasing data sizes using the Hadoop framework.

Figure 11: Computation efficiency of MRBPNN 2. (Execution time (s) of the standalone BPNN and MRBPNN_2 for data volumes from 1 MB to 1024 MB.)

Figure 12 shows the computation overhead of MRBPNN 3 using 16 mappers. MRBPNN 3 incurs a higher overhead than both MRBPNN 1 and MRBPNN 2. The reason is that both MRBPNN 1 and MRBPNN 2 run training and classification within one MapReduce job, which means mappers and reducers only need to start once. However, MRBPNN 3 contains a number of jobs, so the algorithm has to start mappers and reducers a number of times. This process incurs a large system overhead, which affects its computation efficiency. Nevertheless, Figure 12 shows the feasibility of fully distributing a BPNN in dealing with a complex neural network with a large number of neurons.
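To illustrate why this repeated start-up is costly, the following is a rough, hypothetical driver loop for MRBPNN 3 written against Hadoop Streaming. The jar location, the mrbpnn3.py script, and the HDFS paths are placeholders rather than the authors' code; the point is only that every layer-to-layer round launches a complete MapReduce job.

# Hypothetical driver: MRBPNN_3 needs l - 1 streaming jobs per feed-forward pass,
# so mappers and reducers are started and stopped again and again.
import subprocess

STREAMING_JAR = "/usr/lib/hadoop-mapreduce/hadoop-streaming.jar"  # placeholder path
LAYERS = 3  # a standard three-layer BPNN needs l - 1 = 2 rounds

def run_round(input_dir, output_dir):
    # Each call launches a full MapReduce job: JVM start-up, task scheduling,
    # and HDFS I/O are paid again for every round.
    subprocess.check_call([
        "hadoop", "jar", STREAMING_JAR,
        "-files", "mrbpnn3.py",                 # placeholder mapper/reducer script
        "-mapper", "python mrbpnn3.py map",
        "-reducer", "python mrbpnn3.py reduce",
        "-input", input_dir,
        "-output", output_dir,
    ])

def feed_forward(instance_idx):
    current = f"/mrbpnn3/instance_{instance_idx}/layer_0"
    for r in range(LAYERS - 1):
        nxt = f"/mrbpnn3/instance_{instance_idx}/layer_{r + 1}"
        run_round(current, nxt)
        current = nxt
    return current  # directory holding the <o_j^o, target_n> records

if __name__ == "__main__":
    feed_forward(0)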

Figure 12: Computation efficiency of MRBPNN 3. (Training efficiency (s) for data volumes from 1 MB to 1024 MB.)

5. Conclusion

In this paper we have presented three parallel neural networks (MRBPNN 1, MRBPNN 2, and MRBPNN 3) based on the MapReduce computing model in dealing with data intensive scenarios in terms of the size of the classification dataset, the size of the training dataset, and the number of neurons, respectively. Overall, experimental results have shown that the computation overhead can be significantly reduced using a number of computers in parallel. MRBPNN 3 shows the feasibility of fully distributing a BPNN in a computer cluster but incurs a high overhead of computation due to the continuous starting and stopping of mappers and reducers in the Hadoop environment. One of the future works is to research in-memory processing to further enhance the computation efficiency of MapReduce in dealing with data intensive tasks with many iterations.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors would like to acknowledge the support from the National Science Foundation of China (no. 51437003).

References

[1] Networked European Software and Services Initiative (NESSI), "Big data: a new world of opportunities," NESSI White Paper, 2012, http://www.nessi-europe.com/Files/Private/NESSI_WhitePaper_BigData.pdf.

[2] P. C. Zikopoulos, C. Eaton, D. deRoos, T. Deutsch, and G. Lapis, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill, 2012.

[3] M. H. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS Publishing, 1996.

[4] R. Gu, F. Shen, and Y. Huang, "A parallel computing platform for training large scale neural networks," in Proceedings of the IEEE International Conference on Big Data, pp. 376–384, October 2013.

[5] V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing, Benjamin Cummings/Addison Wesley, San Francisco, Calif, USA, 2002.

[6] Message Passing Interface, 2015, http://www.mcs.anl.gov/research/projects/mpi/.

[7] L. N. Long and A. Gupta, "Scalable massively parallel artificial neural networks," Journal of Aerospace Computing, Information, and Communication, vol. 5, no. 1, pp. 3–15, 2008.

[8] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[9] Y. Liu, M. Li, M. Khan, and M. Qi, "A MapReduce based distributed LSI for scalable information retrieval," Computing and Informatics, vol. 33, no. 2, pp. 259–280, 2014.

[10] N. K. Alham, M. Li, Y. Liu, and M. Qi, "A MapReduce-based distributed SVM ensemble for scalable image classification and annotation," Computers and Mathematics with Applications, vol. 66, no. 10, pp. 1920–1934, 2013.

[11] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of back propagation neural network in the classification of high resolution remote sensing image: take remote sensing image of Beijing for instance," in Proceedings of the 18th International Conference on Geoinformatics, pp. 1–6, June 2010.

[12] N. L. D. Khoa, K. Sakakibara, and I. Nishikawa, "Stock price forecasting using back propagation neural networks with time and profit based adjusted weight factors," in Proceedings of the SICE-ICASE International Joint Conference, pp. 5484–5488, IEEE, Busan, Republic of Korea, October 2006.

[13] M. Rizwan, M. Jamil, and D. P. Kothari, "Generalized neural network approach for global solar energy estimation in India," IEEE Transactions on Sustainable Energy, vol. 3, no. 3, pp. 576–584, 2012.

[14] Y. Wang, B. Li, R. Luo, Y. Chen, N. Xu, and H. Yang, "Energy efficient neural networks for big data analytics," in Proceedings of the 17th International Conference on Design, Automation and Test in Europe (DATE '14), March 2014.

[15] D. Nguyen and B. Widrow, "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights," in Proceedings of the International Joint Conference on Neural Networks, vol. 3, pp. 21–26, June 1990.

[16] H. Kanan and M. Khanian, "Reduction of neural network training time using an adaptive fuzzy approach in real time applications," International Journal of Information and Electronics Engineering, vol. 2, no. 3, pp. 470–474, 2012.

[17] A. A. Ikram, S. Ibrahim, M. Sardaraz, M. Tahir, H. Bajwa, and C. Bach, "Neural network based cloud computing platform for bioinformatics," in Proceedings of the 9th Annual Conference on Long Island Systems, Applications and Technology (LISAT '13), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2013.

[18] V. Rao and S. Rao, "Application of artificial neural networks in capacity planning of cloud based IT infrastructure," in Proceedings of the 1st IEEE International Conference on Cloud Computing for Emerging Markets (CCEM '12), pp. 38–41, October 2012.

[19] A. A. Huqqani, E. Schikuta, and E. Mann, "Parallelized neural networks as a service," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '14), pp. 2282–2289, July 2014.

[20] J. Yuan and S. Yu, "Privacy preserving back-propagation neural network learning made practical with cloud computing," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 1, pp. 212–221, 2014.

[21] Z. Liu, H. Li, and G. Miao, "MapReduce-based back propagation neural network over large scale mobile data," in Proceedings of the 6th International Conference on Natural Computation (ICNC '10), pp. 1726–1730, August 2010.

[22] B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang, "Mars: a MapReduce framework on graphics processors," in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT '08), pp. 260–269, October 2008.

[23] K. Taura, K. Kaneda, T. Endo, and A. Yonezawa, "Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources," ACM SIGPLAN Notices, vol. 38, no. 10, pp. 216–229, 2003.

[24] Apache Hadoop, 2015, http://hadoop.apache.org/.

[25] J. Venner, Pro Hadoop, Springer, New York, NY, USA, 2009.

[26] N. K. Alham, Parallelizing support vector machines for scalable image annotation [PhD thesis], Brunel University, Uxbridge, UK, 2011.

[27] The Iris Dataset, 2015, https://archive.ics.uci.edu/ml/datasets/Iris.



Input 119878 119879 Output 119862119898 mappers 119898 reducersInitially each mapper inputs

⟨index119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ 119896 = (1 2 119898)

(1) For instance119899= 1198861 1198862 1198863 119886

119894119899

Input 119886119894rarr 119894119899

119894 neuron inmapper computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(2)Mapper outputs output⟨index

119896 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 119903 = (1 2 119898)

(3) 119896th reducer collects outputOutput ⟨index

1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 1198961015840 = (1 2 119898)

(4) Mapper1198961015840 inputs

⟨index1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩

Execute (2) outputs ⟨119900119895119900 target

119899⟩

(5) 119898th reducer collects and merges all ⟨119900119895119900 target

119899⟩ from 119898 mappers

Feed forward terminates(6) In 119898th reducer

ComputeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119899minus 119900119895119900)

Using HDFS API retrieve 119908119894119895 120579119895 119900119895ℎ instance

119899

Compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(7) Update 119908119894119895 120579119895 119908119894119895119900

120579119895119900

119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

into⟨index

119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩

Back propagation terminates(8) Retrieve instance

119899+1and target

119899+1

Update into ⟨index119896 instance

119899+1 119908119894119895 120579119895 target

119899+1 119908119894119895119900

120579119895119900⟩

Repeat (1) (2) (3) (4) (5) (6) (7) (8) (9)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(9) Each mapper inputs ⟨index

119896 instance

119905 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ instance

119905isin 119879

(10) Execute (1) (2) (3) (4) (5) (6) outputs ⟨119900119895119900 target

119899⟩

(11) 119898th reducer outputs 119862End

Algorithm 3 MRBPNN 3

Table 2 Dataset details

Data type Instancenumber

Instancelength

Elementrange

Classnumber

Syntheticdata 200 32 0 and 1 4

Iris data 150 4 (0 8) 3

We implemented a three-layer neural network with16 neurons in the hidden layer The Hadoop cluster wasconfigured with 16 mappers and 16 reducers The numberof instances was varied from 10 to 1000 for evaluating

the precision of the algorithms The size of the datasets wasvaried from 1MB to 1GB for evaluating the computation effi-ciency of the algorithms Each experiment was executed fivetimes and the final result was an average The precision 119901 iscomputed using

119901 =119903

119903 + 119908times 100 (13)

where 119903 represents the number of correctly recognizedinstances 119908 represents the number of wrongly recognizedinstances 119901 represents the precision

Computational Intelligence and Neuroscience 9

Instance

Instance

weights and biases

and biases

and biases

Instance weights

weights

Mapper 1

Mapper 2

Mapper n

Mapper 1

Mapper 2

Mapper n

Mapper 1

Mapper 2

Mapper n

Reducer outputs l-1 rounds

Reducercollects

andadjustsweights

andbiases

Outputs

Next instance

Weights biases

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

Figure 4 MRBPNN 3 structure

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

Number of training instances

0

20

40

60

80

100

120

Prec

ision

()

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 5 The precision of MRBPNN 1 on the two datasets

41 Classification Precision The classification precision ofMRBPNN 1 was evaluated using a varied number of traininginstances The maximum number of the training instanceswas 1000whilst themaximumnumber of the testing instanceswas also 1000 The large number of instances is based onthe data duplication Figure 5 shows the precision resultsof MRBPNN 1 in classification using 10 mappers It can beobserved that the precision keeps increasing with an increasein the number of training instances Finally the precision ofMRBPNN 1 on the synthetic dataset reaches 100 while theprecision on the Iris dataset reaches 975 In this test thebehavior of the parallel MRBPNN 1 is quite similar to thatof the standalone BPNNThe reason is that MRBPNN 1 doesnot distribute the BPNN among the Hadoop nodes insteadit runs on Hadoop to distribute the data

To evaluate MRBPNN 2 we designed 1000 traininginstances and 1000 testing instances using data duplication

The mappers were trained by subsets of the traininginstances and produced the classification results of 1000testing instances based on bootstrapping andmajority votingMRBPNN 2 employed 10 mappers each of which inputs anumber of training instances varying from 10 to 1000 Fig-ure 6 presents the precision results of MRBPNN 2 on the twotesting datasets It also shows that along with the increasingnumber of training instances in each subneural network theachieved precision based onmajority voting keeps increasingThe precision ofMRBPNN 2 on the synthetic dataset reaches100 whilst the precision on the Iris dataset reaches 975which is higher than that of MRBPNN 1

MRBPNN 3 implements a fully parallel and distributedneural network using Hadoop to deal with a complex neuralnetwork with a large number of neurons Figure 7 shows theperformance of MRBPNN 3 using 16 mappersThe precisionalso increases along with the increasing number of training

10 Computational Intelligence and Neuroscience

0

20

40

60

80

100

120Pr

ecisi

on (

)

Number of instances in each sub-BPNN

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

0

20

40

60

80

100

120

Prec

ision

()

Number of instances in each sub-BPNN

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 6 The precision of MRBPNN 2 on the two datasets

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

20

0

40

60

80

100

120Pr

ecisi

on (

)

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 7 The precision of MRBPNN 3 on the two datasets

instances for both datasets It also can be observed that thestability of the curve is quite similar to that of MRBPNN 1Both curves have more fluctuations than that of MRBPNN 2

Figure 8 compares the overall precision of the three par-allel BPNNsMRBPNN 1 andMRBPNN 3 perform similarlywhereas MRBPNN 2 performs the best using bootstrappingandmajority voting In addition the precision ofMRBPNN 2in classification is more stable than that of both MRBPNN 1and MRBPNN 3

Figure 9 presents the stability of the three algorithms onthe synthetic dataset showing the precision of MRBPNN 2in classification is highly stable compared with that of bothMRBPNN 1 and MRBPNN 3

42 Computation Efficiency A number of experiments werecarried out in terms of computation efficiency using thesynthetic dataset The first experiment was to evaluate theefficiency of MRBPNN 1 using 16 mappers The volume ofdata instances was varied from 1MB to 1GB Figure 10 clearlyshows that the parallel MRBPNN 1 significantly outperforms

the standalone BPNN The computation overhead of thestandalone BPNN is lowwhen the data size is less than 16MBHowever the overhead of the standalone BPNN increasessharply with increasing data sizes This is mainly becauseMRBPNN 1 distributes the testing data into 4 data nodes inthe Hadoop cluster which runs in parallel in classification

Figure 11 shows the computation efficiency ofMRBPNN 2 using 16 mappers It can be observed that whenthe data size is small the standalone BPNN performs betterHowever the computation overhead of the standalone BPNNincreases rapidly when the data size is larger than 64MBSimilar to MRBPNN 1 the parallel MRBPNN 2 scales withincreasing data sizes using the Hadoop framework

Figure 12 shows the computation overhead ofMRBPNN 3 using 16 mappers MRBPNN 3 incurs a higheroverhead than both MRBPNN 1 and MRBPNN 2 Thereason is that bothMRBPNN 1 andMRBPNN 2 run trainingand classification within one MapReduce job which meansmappers and reducers only need to start once HoweverMRBPNN 3 contains a number of jobs The algorithm has to

Computational Intelligence and Neuroscience 11

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

MRBPNN_1MRBPNN_2MRBPNN_3

(a) The synthetic dataset

20

0

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

MRBPNN_1MRBPNN_2MRBPNN_3

(b) The Iris dataset

Figure 8 Precision comparison of the three parallel BPNNs

05

1015202530354045

1 2 3 4 5

Prec

ision

()

Number of tests

MRBPNN_1MRBPNN_2MRBPNN_3

Figure 9 The stability of the three parallel BPNNs

0100020003000400050006000700080009000

10000

1 2 4 8 16 32 64 128 256 512 1024

Exec

utio

n tim

e (s)

Data volume (MB)

Standalone BPNNMRBPNN_1

Figure 10 Computation efficiency of MRBPNN 1

0

100

200

300

400

500

600

700

800

1 2 4 8 16 32 64 128 256 512 1024

Exec

utio

n tim

e (s)

Data volume (MB)

Standalone BPNNMRBPNN_2

Figure 11 Computation efficiency of MRBPNN 2

start mappers and reducers a number of times This processincurs a large system overhead which affects its computationefficiency Nevertheless Figure 12 shows the feasibility offully distributing a BPNN in dealing with a complex neuralnetwork with a large number of neurons

5 Conclusion

In this paper we have presented three parallel neural net-works (MRBPNN 1MRBPNN 2 andMRBPNN 3) based ontheMapReduce computingmodel in dealing with data inten-sive scenarios in terms of the size of classification datasetthe size of the training dataset and the number of neuronsrespectively Overall experimental results have shown thecomputation overhead can be significantly reduced using anumber of computers in parallel MRBPNN 3 shows the fea-sibility of fully distributing a BPNN in a computer cluster butincurs a high overhead of computation due to continuous

12 Computational Intelligence and Neuroscience

0

1000

2000

3000

4000

5000

6000

7000

1 2 4 8 16 32 64 128 256 512 1024

Trai

ning

effici

ency

(s)

Volume of data (MB)

Figure 12 Computation efficiency of MRBPNN 3

starting and stopping of mappers and reducers in Hadoopenvironment One of the future works is to research in-memory processing to further enhance the computation effi-ciency ofMapReduce in dealingwith data intensive taskswithmany iterations

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgment

The authors would like to appreciate the support from theNational Science Foundation of China (no 51437003)

References

[1] Networked European Software and Services Initiative (NESSI)ldquoBig data a new world of opportunitiesrdquo Networked EuropeanSoftware and Services Initiative (NESSI) White Paper 2012httpwwwnessi-europecomFilesPrivateNESSI WhitePa-per BigDatapdf

[2] P C Zikopoulos C Eaton D deRoos T Deutsch andG LapisUnderstanding Big Data Analytics for Enterprise Class Hadoopand Streaming Data McGraw-Hill 2012

[3] M H Hagan H B Demuth and M H Beale Neural NetworkDesign PWS Publishing 1996

[4] RGu F Shen andYHuang ldquoA parallel computing platform fortraining large scale neural networksrdquo in Proceedings of the IEEEInternational Conference on Big Data pp 376ndash384 October2013

[5] V Kumar A Grama A Gupta and G Karypis Introduction toParallel Computing Benjamin CummingsAddisonWesley SanFrancisco Calif USA 2002

[6] Message Passing Interface 2015 httpwwwmcsanlgovresearchprojectsmpi

[7] L N Long and A Gupta ldquoScalable massively parallel artificialneural networksrdquo Journal of Aerospace Computing Informationand Communication vol 5 no 1 pp 3ndash15 2008

[8] J Dean and SGhemawat ldquoMapReduce simplified data process-ing on large clustersrdquo Communications of the ACM vol 51 no1 pp 107ndash113 2008

[9] Y Liu M Li M Khan and M Qi ldquoA mapreduce based dis-tributed LSI for scalable information retrievalrdquo Computing andInformatics vol 33 no 2 pp 259ndash280 2014

[10] N K Alham M Li Y Liu and M Qi ldquoA MapReduce-baseddistributed SVM ensemble for scalable image classification andannotationrdquoComputers andMathematics with Applications vol66 no 10 pp 1920ndash1934 2013

[11] J Jiang J Zhang G Yang D Zhang and L Zhang ldquoApplicationof back propagation neural network in the classification of highresolution remote sensing image take remote sensing imageof Beijing for instancerdquo in Proceedings of the 18th InternationalConference on Geoinformatics pp 1ndash6 June 2010

[12] N L D Khoa K Sakakibara and I Nishikawa ldquoStock priceforecasting using back propagation neural networks with timeand profit based adjusted weight factorsrdquo in Proceedings ofthe SICE-ICASE International Joint Conference pp 5484ndash5488IEEE Busan Republic of korea October 2006

[13] M Rizwan M Jamil and D P Kothari ldquoGeneralized neuralnetwork approach for global solar energy estimation in IndiardquoIEEE Transactions on Sustainable Energy vol 3 no 3 pp 576ndash584 2012

[14] Y Wang B Li R Luo Y Chen N Xu and H Yang ldquoEnergyefficient neural networks for big data analyticsrdquo in Proceedingsof the 17th International Conference on Design Automation andTest in Europe (DATE rsquo14) March 2014

[15] D Nguyen and B Widrow ldquoImproving the learning speed of 2-layer neural networks by choosing initial values of the adaptiveweightsrdquo in Proceedings of the International Joint Conference onNeural Networks vol 3 pp 21ndash26 June 1990

[16] H Kanan and M Khanian ldquoReduction of neural networktraining time using an adaptive fuzzy approach in real timeapplicationsrdquo International Journal of Information and Electron-ics Engineering vol 2 no 3 pp 470ndash474 2012

[17] A A Ikram S Ibrahim M Sardaraz M Tahir H Bajwa andC Bach ldquoNeural network based cloud computing platform forbioinformaticsrdquo in Proceedings of the 9th Annual Conference onLong Island Systems Applications and Technology (LISAT rsquo13)pp 1ndash6 IEEE Farmingdale NY USA May 2013

[18] V Rao and S Rao ldquoApplication of artificial neural networksin capacity planning of cloud based IT infrastructurerdquo inProceedings of the 1st IEEE International Conference on CloudComputing for Emerging Markets (CCEM rsquo12) pp 38ndash41 Octo-ber 2012

[19] A A Huqqani E Schikuta and E Mann ldquoParallelized neuralnetworks as a servicerdquo in Proceedings of the International JointConference on Neural Networks (IJCNN rsquo14) pp 2282ndash2289 July2014

[20] J Yuan and S Yu ldquoPrivacy preserving back-propagation neuralnetwork learning made practical with cloud computingrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 1pp 212ndash221 2014

[21] Z Liu H Li and G Miao ldquoMapReduce-based back propaga-tion neural network over large scalemobile datardquo in Proceedingsof the 6th International Conference on Natural Computation(ICNC rsquo10) pp 1726ndash1730 August 2010

[22] B HeW Fang Q Luo N K Govindaraju and TWang ldquoMarsa MapReduce framework on graphics processorsrdquo in Proceed-ings of the 17th International Conference on Parallel Architectures

Computational Intelligence and Neuroscience 13

and Compilation Techniques (PACT rsquo08) pp 260ndash269 October2008

[23] K Taura K Kaneda T Endo and A Yonezawa ldquoPhoenix aparallel programming model for accommodating dynamicallyjoiningleaving resourcesrdquo ACM SIGPLAN Notices vol 38 no10 pp 216ndash229 2003

[24] Apache Hadoop 2015 httphadoopapacheorg[25] J Venner Pro Hadoop Springer New York NY USA 2009[26] N K Alham Parallelizing support vector machines for scalable

image annotation [PhD thesis] Brunel University UxbridgeUK 2011

[27] The Iris Dataset 2015 httpsarchiveicsuciedumldatasetsIris

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 6: Research Article MapReduce Based Parallel Neural Networks in …downloads.hindawi.com/journals/cin/2015/297672.pdf · 2019. 7. 31. · Research Article MapReduce Based Parallel Neural

6 Computational Intelligence and Neuroscience

(ii) 119894119899 represents the number of inputs of the neuralnetwork

(iii) target119896represents the desirable output if instance

119896is

a training instance(iv) type field has two values ldquotrainrdquo and ldquotestrdquo which

marks the type of instance119896 if ldquotestrdquo value is set

target119896field should be left empty

When MRBPNN 2 starts each mapper constructs oneBPNN and initializes weights and biases with random valuesbetween minus1 and 1 for its neurons And then a mapper inputsone record in the form of ⟨instance

119896 target

119896 type⟩ from the

input fileThe mapper firstly parses the data and retrieves the type

of the instance If the type value is ldquotrainrdquo the instance is fedinto the input layer Secondly each neuron in different layerscomputes its output using (2) and (3) until the output layergenerates an output which indicates the completion of thefeed forward process Eachmapper starts a back-propagationprocess and computes and updates weights and biases forneurons using (4) to (6) The training process finishes untilall the instances marked as ldquotrainrdquo are processed and error issatisfied All the mappers start feed forwarding to classify thetesting dataset In this case each neural network in a mappergenerates the classification result of an instance at the outputlayer Each mapper generates an intermediate output in theform of ⟨instance

119896 119900119895119898

⟩ where instance119896is the key and 119900

119895119898

represents the outputs of the 119898th mapperFinally a reducer collects the outputs of all the mappers

The outputs with the same key are merged together Thereducer runs majority voting using (11) and outputs theresult of instance

119896into HDFS in the form of ⟨instance

119896 119903119896⟩

where 119903119896represents the voted classification result of instance

119896

Figure 3 shows the algorithm architecture and Algorithm 2presents the pseudocode of MRBPNN 2

35 The Design of MRBPNN 3 MRBPNN 3 aims at thescenario inwhich a BPNNhas a large number of neuronsThealgorithm enables an entire MapReduce cluster to maintainone neural network across it Therefore each mapper holdsone or several neurons

There are a number of iterations that exist in thealgorithm with 119897 layers MRBPNN 3 employs a number of119897 minus 1 MapReduce jobs to implement the iterations Thefeed forward process runs in 119897 minus 1 rounds whilst theback-propagation process occurs only in the last round Adata format in the form of ⟨index

119896 instance

119899 119908119894119895 120579119895 target

119899

1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895⟩ has been designed to guarantee the

data passing between Map and Reduce operations where

(i) index119896represents the 119896th reducer

(ii) instance119899

represents the 119899th training or testinginstance of the dataset one instance is in the form ofinstance

119899= 1198861 1198862 119886

119894119899 where 119894119899 is length of the

instance(iii) 119908

119894119895represents a set of weights of an input layer whilst

120579119895represents the biases of the neurons in the first

hidden layer

(iv) target119899represents the encoded desirable output of a

training instance instance119899

(v) the list of 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 represents the

weights and biases for next layers it can be extendedbased on the layers of the network for a standardthree-layer neural network this option becomes119908119894119895119900

120579119895119900

Before MRBPNN 3 starts each instance and the infor-mation defined by data format are saved in one file inHDFS The number of the layers is determined by thelength of 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 field The number of neurons

in the next layer is determined by the number of filesin the input folder Generally different from MRBPNN 1and MRBPNN 2 MRBPNN 3 does not initialize an explicitneural network instead it maintains the network parametersbased on the data defined in the data format

When MRBPNN 3 starts each mapper initially inputsone record fromHDFS And then it computes the output of aneuronusing (2) and (3)Theoutput is generated by amapperwhich labels index

119896as a key and the neuronrsquos output as a value

in the form of

⟨index119896 119900119895 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 target

119899⟩ where 119900

119895

represents the neuronrsquos output

Parameter index119896can guarantee that the 119896th reducer col-

lects the output which maintains the neural network struc-ture It should be mentioned that if the record is the first oneprocessed by MRBPNN 3 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 will be also

initialized with random values between minus1 and 1 by the map-pers The 119896th reducer collects the results from the mappersin the form of ⟨index

1198961015840 119900119895 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895 target

119899⟩

These 119896 reducers generate 119896 outputs The index1198961015840 of the

reducer output explicitly tells the 1198961015840th mapper to start

processing this output file Therefore the number of neuronsin the next layer can be determined by the number of reduceroutput files which are the input data for the next layerneurons Subsequently mappers start processing their corre-sponding inputs by computing (2) and (3) using 119908

2

119894119895and 1205792

119895

The above steps keep looping until reaching the lastround The processing of this last round consists of twosteps The first step is that mappers also process ⟨index

1198961015840

119900119895 119908119897minus1

119894119895 120579119897minus1

119895 target

119899⟩ compute neuronsrsquo outputs and gen-

erate results in the forms of ⟨119900119895 target

119899⟩ One reducer

collects the output results of all the mappers in the form of⟨1199001198951 1199001198952 1199001198953 119900

119895119896 target

119899⟩ In the second step the reducer

executes the back-propagation process The reducer com-putes new weights and biases for each layer using (4) to(6) MRBPNN 3 retrieves the previous outputs weights andbiases from the input files of mappers and then it writesthe updated weights and biases 119908

119894119895 120579119895 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895

into the initial input file in the form of ⟨index119896 instance

119899 119908119894119895

120579119895 target

119899 1199082

119894119895 1205792

119895 119908

119897minus1

119894119895 120579119897minus1

119895⟩The reducer reads the sec-

ond instance in the form of ⟨instance119899+1

target119899+1

⟩ for whichthe fields instance

119899and target

119899in the input file are replaced

by instance119899+1

and target119899+1

The training process continuesuntil all the instances are processed and error is satisfied

Computational Intelligence and Neuroscience 7

Input 119878 119879 Output 119862119899 mappers one reducer(1) Each mapper constructs one BPNN with 119894119899 inputs 119900 outputs ℎ neurons in hidden layer(2) Initialize 119908

119894119895= random

1119894119895isin (minus1 1) 120579

119895= random

2119895isin (minus1 1)

(3) Bootstrap 1198781 1198782 1198783 119878

119899 ⋃119899119894=0

119878119894= 119878

(4) Each mapper inputs 119904119896= 1198861 1198862 1198863 119886

119894119899 119904119896isin 119878119894

Input 119886119894rarr 119894119899

119894 neuron

119895in hidden layer computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(5) Input 119900119895

rarr out119894 neuron

119895in output layer computes

119868119895119900

=

sum119894=1

119900119895ℎ

sdot 119908119894119895+ 120579119895

119900119895119900

=1

1 + 119890119868119895119900

(6) In each output computeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119895minus 119900119895119900)

(7) In hidden layer compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(8) Update119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

Repeat (3) (4) (5) (6) (7)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(9) Each mapper inputs 119905

119894= 1198861 1198862 1198863 119886

119894119899 119905119894isin 119879

(10)Execute (4) (5)(11) Mapper outputs ⟨119905

119895 119900119895⟩

(12) Reducer collects ⟨119905119895 119900119895119898

⟩ 119898 = (1 2 119899)

(13) For each 119905119895

Compute 119862 = max119888119895=1

sum119868

119894=1119900119895119898

Repeat (9) (10) (11) (12) (13)Until 119879 is traversed(14) Reducer outputs 119862End

Algorithm 2 MRBPNN 2

For classification MRBPNN 3 only needs to run the feedforwarding process and collects the reducer output in theform of ⟨119900

119895 target

119899⟩ Figure 4 shows three-layer architecture

of MRBPNN 3 and Algorithm 3 presents the pseudocode

4. Performance Evaluation

We have implemented the three parallel BPNNs using Hadoop, an open source implementation framework of the MapReduce computing model. An experimental Hadoop cluster was built to evaluate the performance of the algorithms. The cluster consisted of 5 computers, in which 4 nodes are Datanodes and the remaining one is the Namenode. The cluster details are listed in Table 1.

Two testing datasets were prepared for the evaluations. The first dataset is a synthetic dataset. The second is the Iris dataset, which is a published machine learning benchmark dataset [27]. Table 2 shows the details of the two datasets.

Table 1: Cluster details.

Namenode:  CPU Core i7, 3 GHz; Memory 8 GB; SSD 750 GB; OS Fedora
Datanodes: CPU Core i7, 3.8 GHz; Memory 32 GB; SSD 250 GB; OS Fedora
Network bandwidth: 1 Gbps; Hadoop version: 2.3.0, 32 bits


Input: S, T; Output: C; m mappers, m reducers
Initially, each mapper inputs ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ij^o, θ_j^o⟩, k = (1, 2, ..., m)
(1) For instance_n = a_1, a_2, a_3, ..., a_in
      Input a_i → in_i; neuron j in the mapper computes
      I_j^h = Σ_{i=1..in} a_i · w_ij + θ_j,  o_j^h = 1 / (1 + e^(−I_j^h))
(2) Mapper outputs ⟨index_k, o_j^h, w_ij^o, θ_j^o, target_n⟩, r = (1, 2, ..., m)
(3) The kth reducer collects the output and outputs ⟨index_k′, o_j^h, w_ij^o, θ_j^o, target_n⟩, k′ = (1, 2, ..., m)
(4) Mapper k′ inputs ⟨index_k′, o_j^h, w_ij^o, θ_j^o, target_n⟩
      Execute (2) and output ⟨o_j^o, target_n⟩
(5) The mth reducer collects and merges all ⟨o_j^o, target_n⟩ from the m mappers
      Feed forward terminates
(6) In the mth reducer, compute Err_j^o = o_j^o (1 − o_j^o)(target_n − o_j^o)
      Using the HDFS API, retrieve w_ij, θ_j, o_j^h, instance_n
      Compute Err_j^h = o_j^h (1 − o_j^h) Σ_{i=1..o} Err_i w_io
(7) Update w_ij, θ_j, w_ij^o, θ_j^o:
      w_ij = w_ij + η · Err_j · o_j,  θ_j = θ_j + η · Err_j
      into ⟨index_k, instance_n, w_ij, θ_j, target_n, w_ij^o, θ_j^o⟩
      Back propagation terminates
(8) Retrieve instance_(n+1) and target_(n+1)
      Update into ⟨index_k, instance_(n+1), w_ij, θ_j, target_(n+1), w_ij^o, θ_j^o⟩
      Repeat (1), (2), (3), (4), (5), (6), (7), (8)
      Until min(E[e^2]) = min(E[(target_j − o_j^o)^2])
      Training terminates
(9) Each mapper inputs ⟨index_k, instance_t, w_ij, θ_j, target_n, w_ij^o, θ_j^o⟩, instance_t ∈ T
(10) Execute (1), (2), (3), (4), (5), (6) and output ⟨o_j^o, target_n⟩
(11) The mth reducer outputs C
End

Algorithm 3: MRBPNN_3.
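For readers who prefer executable notation, the following small Python sketch mirrors the per-neuron computations used in Algorithms 2 and 3: the sigmoid feed-forward, the output and hidden error terms, and the η-scaled weight and bias updates (using the standard incoming-activation form of the update rule). It is a single-machine illustration under the assumption of one hidden layer, not the distributed Hadoop implementation itself.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feed_forward(a, w_h, th_h, w_o, th_o):
    # I_j = sum_i a_i * w_ij + theta_j; o_j = sigmoid(I_j), hidden layer then output layer.
    o_h = [sigmoid(sum(a_i * w for a_i, w in zip(a, col)) + th)
           for col, th in zip(w_h, th_h)]
    o_o = [sigmoid(sum(o * w for o, w in zip(o_h, col)) + th)
           for col, th in zip(w_o, th_o)]
    return o_h, o_o

def backprop_step(a, target, w_h, th_h, w_o, th_o, eta=0.5):
    o_h, o_o = feed_forward(a, w_h, th_h, w_o, th_o)
    # Err_j^o = o_j(1 - o_j)(target_j - o_j); Err_j^h = o_j(1 - o_j) * sum_k Err_k^o * w_jk
    err_o = [o * (1 - o) * (t - o) for o, t in zip(o_o, target)]
    err_h = [o_h[j] * (1 - o_h[j]) * sum(err_o[k] * w_o[k][j] for k in range(len(err_o)))
             for j in range(len(o_h))]
    # w_ij = w_ij + eta * Err_j * o_i; theta_j = theta_j + eta * Err_j
    for k in range(len(w_o)):
        w_o[k] = [w + eta * err_o[k] * o for w, o in zip(w_o[k], o_h)]
        th_o[k] += eta * err_o[k]
    for j in range(len(w_h)):
        w_h[j] = [w + eta * err_h[j] * x for w, x in zip(w_h[j], a)]
        th_h[j] += eta * err_h[j]
    return o_o

# Toy usage: 2 inputs, 2 hidden neurons, 1 output.
w_h, th_h = [[0.1, -0.2], [0.3, 0.4]], [0.0, 0.0]
w_o, th_o = [[0.5, -0.5]], [0.0]
print(backprop_step([1.0, 0.0], [1.0], w_h, th_h, w_o, th_o))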

Table 2: Dataset details.

Data type        Instance number   Instance length   Element range   Class number
Synthetic data   200               32                0 and 1         4
Iris data        150               4                 (0, 8)          3

We implemented a three-layer neural network with 16 neurons in the hidden layer. The Hadoop cluster was configured with 16 mappers and 16 reducers. The number of instances was varied from 10 to 1000 for evaluating the precision of the algorithms. The size of the datasets was varied from 1 MB to 1 GB for evaluating the computation efficiency of the algorithms. Each experiment was executed five times, and the final result was an average. The precision p is computed using

p = r / (r + w) × 100, (13)

where r represents the number of correctly recognized instances, w represents the number of wrongly recognized instances, and p represents the precision.
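As a quick sanity check of (13), the snippet below computes the precision for a hypothetical run; the counts are illustrative values only, not experimental results.

def precision(r, w):
    # p = r / (r + w) * 100, as in equation (13)
    return 100.0 * r / (r + w)

print(precision(r=975, w=25))  # hypothetical counts -> 97.5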


[Figure 4: MRBPNN_3 structure. Instances, weights, and biases flow through n mappers per layer for l−1 rounds; a reducer collects the outputs, adjusts the weights and biases, and then proceeds to the next instance.]

[Figure 5: The precision of MRBPNN_1 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset. Both panels plot precision (%) against the number of training instances (10 to 900).]

4.1. Classification Precision. The classification precision of MRBPNN_1 was evaluated using a varied number of training instances. The maximum number of training instances was 1000, whilst the maximum number of testing instances was also 1000. The large number of instances is based on data duplication. Figure 5 shows the precision results of MRBPNN_1 in classification using 10 mappers. It can be observed that the precision keeps increasing with an increase in the number of training instances. Finally, the precision of MRBPNN_1 on the synthetic dataset reaches 100%, while the precision on the Iris dataset reaches 97.5%. In this test, the behavior of the parallel MRBPNN_1 is quite similar to that of the standalone BPNN. The reason is that MRBPNN_1 does not distribute the BPNN among the Hadoop nodes; instead, it runs on Hadoop to distribute the data.

To evaluate MRBPNN_2, we designed 1000 training instances and 1000 testing instances using data duplication. The mappers were trained by subsets of the training instances and produced the classification results of the 1000 testing instances based on bootstrapping and majority voting. MRBPNN_2 employed 10 mappers, each of which inputs a number of training instances varying from 10 to 1000. Figure 6 presents the precision results of MRBPNN_2 on the two testing datasets. It also shows that, along with the increasing number of training instances in each subneural network, the precision achieved through majority voting keeps increasing. The precision of MRBPNN_2 on the synthetic dataset reaches 100%, whilst the precision on the Iris dataset reaches 97.5%, which is higher than that of MRBPNN_1.

MRBPNN_3 implements a fully parallel and distributed neural network using Hadoop to deal with a complex neural network with a large number of neurons. Figure 7 shows the performance of MRBPNN_3 using 16 mappers.


[Figure 6: The precision of MRBPNN_2 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset. Both panels plot precision (%) against the number of instances in each sub-BPNN (10 to 900).]

[Figure 7: The precision of MRBPNN_3 on the two datasets: (a) the synthetic dataset; (b) the Iris dataset. Both panels plot precision (%) against the number of training instances (10 to 900).]

The precision also increases along with the increasing number of training instances for both datasets. It can also be observed that the stability of the curve is quite similar to that of MRBPNN_1; both curves have more fluctuations than that of MRBPNN_2.

Figure 8 compares the overall precision of the three parallel BPNNs. MRBPNN_1 and MRBPNN_3 perform similarly, whereas MRBPNN_2 performs the best using bootstrapping and majority voting. In addition, the precision of MRBPNN_2 in classification is more stable than that of both MRBPNN_1 and MRBPNN_3.

Figure 9 presents the stability of the three algorithms on the synthetic dataset, showing that the precision of MRBPNN_2 in classification is highly stable compared with that of both MRBPNN_1 and MRBPNN_3.

4.2. Computation Efficiency. A number of experiments were carried out in terms of computation efficiency using the synthetic dataset. The first experiment was to evaluate the efficiency of MRBPNN_1 using 16 mappers. The volume of data instances was varied from 1 MB to 1 GB. Figure 10 clearly shows that the parallel MRBPNN_1 significantly outperforms the standalone BPNN. The computation overhead of the standalone BPNN is low when the data size is less than 16 MB. However, the overhead of the standalone BPNN increases sharply with increasing data sizes. This is mainly because MRBPNN_1 distributes the testing data onto the 4 Datanodes in the Hadoop cluster, which run in parallel in classification.

Figure 11 shows the computation efficiency of MRBPNN_2 using 16 mappers. It can be observed that, when the data size is small, the standalone BPNN performs better. However, the computation overhead of the standalone BPNN increases rapidly when the data size is larger than 64 MB. Similar to MRBPNN_1, the parallel MRBPNN_2 scales with increasing data sizes using the Hadoop framework.

Figure 12 shows the computation overhead of MRBPNN_3 using 16 mappers. MRBPNN_3 incurs a higher overhead than both MRBPNN_1 and MRBPNN_2. The reason is that both MRBPNN_1 and MRBPNN_2 run training and classification within one MapReduce job, which means the mappers and reducers only need to start once. However, MRBPNN_3 contains a number of jobs.


[Figure 8: Precision comparison of the three parallel BPNNs (MRBPNN_1, MRBPNN_2, and MRBPNN_3): (a) the synthetic dataset; (b) the Iris dataset. Both panels plot precision (%) against the number of training instances (10 to 900).]

[Figure 9: The stability of the three parallel BPNNs on the synthetic dataset: precision (%) over five test runs for MRBPNN_1, MRBPNN_2, and MRBPNN_3.]

[Figure 10: Computation efficiency of MRBPNN_1: execution time (s) against data volume (1 MB to 1024 MB) for the standalone BPNN and MRBPNN_1.]

[Figure 11: Computation efficiency of MRBPNN_2: execution time (s) against data volume (1 MB to 1024 MB) for the standalone BPNN and MRBPNN_2.]

The algorithm has to start the mappers and reducers a number of times. This process incurs a large system overhead, which affects its computation efficiency. Nevertheless, Figure 12 shows the feasibility of fully distributing a BPNN in dealing with a complex neural network with a large number of neurons.

5. Conclusion

In this paper, we have presented three parallel neural networks (MRBPNN_1, MRBPNN_2, and MRBPNN_3) based on the MapReduce computing model, dealing with data intensive scenarios in terms of the size of the classification dataset, the size of the training dataset, and the number of neurons, respectively. Overall, the experimental results have shown that the computation overhead can be significantly reduced using a number of computers in parallel. MRBPNN_3 shows the feasibility of fully distributing a BPNN in a computer cluster.


[Figure 12: Computation efficiency of MRBPNN_3: training efficiency (s) against the volume of data (1 MB to 1024 MB).]

However, MRBPNN_3 incurs a high computation overhead due to the continuous starting and stopping of mappers and reducers in the Hadoop environment. One piece of future work is to research in-memory processing to further enhance the computation efficiency of MapReduce in dealing with data intensive tasks with many iterations.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors would like to appreciate the support from the National Science Foundation of China (no. 51437003).

References

[1] Networked European Software and Services Initiative (NESSI), "Big data: a new world of opportunities," NESSI White Paper, 2012, http://www.nessi-europe.com/Files/Private/NESSI_WhitePaper_BigData.pdf.
[2] P. C. Zikopoulos, C. Eaton, D. deRoos, T. Deutsch, and G. Lapis, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill, 2012.
[3] M. H. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS Publishing, 1996.
[4] R. Gu, F. Shen, and Y. Huang, "A parallel computing platform for training large scale neural networks," in Proceedings of the IEEE International Conference on Big Data, pp. 376–384, October 2013.
[5] V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing, Benjamin Cummings/Addison Wesley, San Francisco, Calif, USA, 2002.
[6] Message Passing Interface, 2015, http://www.mcs.anl.gov/research/projects/mpi.
[7] L. N. Long and A. Gupta, "Scalable massively parallel artificial neural networks," Journal of Aerospace Computing, Information, and Communication, vol. 5, no. 1, pp. 3–15, 2008.
[8] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[9] Y. Liu, M. Li, M. Khan, and M. Qi, "A MapReduce based distributed LSI for scalable information retrieval," Computing and Informatics, vol. 33, no. 2, pp. 259–280, 2014.
[10] N. K. Alham, M. Li, Y. Liu, and M. Qi, "A MapReduce-based distributed SVM ensemble for scalable image classification and annotation," Computers and Mathematics with Applications, vol. 66, no. 10, pp. 1920–1934, 2013.
[11] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of back propagation neural network in the classification of high resolution remote sensing image: take remote sensing image of Beijing for instance," in Proceedings of the 18th International Conference on Geoinformatics, pp. 1–6, June 2010.
[12] N. L. D. Khoa, K. Sakakibara, and I. Nishikawa, "Stock price forecasting using back propagation neural networks with time and profit based adjusted weight factors," in Proceedings of the SICE-ICASE International Joint Conference, pp. 5484–5488, IEEE, Busan, Republic of Korea, October 2006.
[13] M. Rizwan, M. Jamil, and D. P. Kothari, "Generalized neural network approach for global solar energy estimation in India," IEEE Transactions on Sustainable Energy, vol. 3, no. 3, pp. 576–584, 2012.
[14] Y. Wang, B. Li, R. Luo, Y. Chen, N. Xu, and H. Yang, "Energy efficient neural networks for big data analytics," in Proceedings of the 17th International Conference on Design, Automation and Test in Europe (DATE '14), March 2014.
[15] D. Nguyen and B. Widrow, "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights," in Proceedings of the International Joint Conference on Neural Networks, vol. 3, pp. 21–26, June 1990.
[16] H. Kanan and M. Khanian, "Reduction of neural network training time using an adaptive fuzzy approach in real time applications," International Journal of Information and Electronics Engineering, vol. 2, no. 3, pp. 470–474, 2012.
[17] A. A. Ikram, S. Ibrahim, M. Sardaraz, M. Tahir, H. Bajwa, and C. Bach, "Neural network based cloud computing platform for bioinformatics," in Proceedings of the 9th Annual Conference on Long Island Systems, Applications and Technology (LISAT '13), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2013.
[18] V. Rao and S. Rao, "Application of artificial neural networks in capacity planning of cloud based IT infrastructure," in Proceedings of the 1st IEEE International Conference on Cloud Computing for Emerging Markets (CCEM '12), pp. 38–41, October 2012.
[19] A. A. Huqqani, E. Schikuta, and E. Mann, "Parallelized neural networks as a service," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '14), pp. 2282–2289, July 2014.
[20] J. Yuan and S. Yu, "Privacy preserving back-propagation neural network learning made practical with cloud computing," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 1, pp. 212–221, 2014.
[21] Z. Liu, H. Li, and G. Miao, "MapReduce-based back propagation neural network over large scale mobile data," in Proceedings of the 6th International Conference on Natural Computation (ICNC '10), pp. 1726–1730, August 2010.
[22] B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang, "Mars: a MapReduce framework on graphics processors," in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT '08), pp. 260–269, October 2008.
[23] K. Taura, K. Kaneda, T. Endo, and A. Yonezawa, "Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources," ACM SIGPLAN Notices, vol. 38, no. 10, pp. 216–229, 2003.
[24] Apache Hadoop, 2015, http://hadoop.apache.org.
[25] J. Venner, Pro Hadoop, Springer, New York, NY, USA, 2009.
[26] N. K. Alham, Parallelizing support vector machines for scalable image annotation [PhD thesis], Brunel University, Uxbridge, UK, 2011.
[27] The Iris Dataset, 2015, https://archive.ics.uci.edu/ml/datasets/Iris.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article MapReduce Based Parallel Neural Networks in …downloads.hindawi.com/journals/cin/2015/297672.pdf · 2019. 7. 31. · Research Article MapReduce Based Parallel Neural

Computational Intelligence and Neuroscience 7

Input 119878 119879 Output 119862119899 mappers one reducer(1) Each mapper constructs one BPNN with 119894119899 inputs 119900 outputs ℎ neurons in hidden layer(2) Initialize 119908

119894119895= random

1119894119895isin (minus1 1) 120579

119895= random

2119895isin (minus1 1)

(3) Bootstrap 1198781 1198782 1198783 119878

119899 ⋃119899119894=0

119878119894= 119878

(4) Each mapper inputs 119904119896= 1198861 1198862 1198863 119886

119894119899 119904119896isin 119878119894

Input 119886119894rarr 119894119899

119894 neuron

119895in hidden layer computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(5) Input 119900119895

rarr out119894 neuron

119895in output layer computes

119868119895119900

=

sum119894=1

119900119895ℎ

sdot 119908119894119895+ 120579119895

119900119895119900

=1

1 + 119890119868119895119900

(6) In each output computeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119895minus 119900119895119900)

(7) In hidden layer compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(8) Update119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

Repeat (3) (4) (5) (6) (7)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(9) Each mapper inputs 119905

119894= 1198861 1198862 1198863 119886

119894119899 119905119894isin 119879

(10)Execute (4) (5)(11) Mapper outputs ⟨119905

119895 119900119895⟩

(12) Reducer collects ⟨119905119895 119900119895119898

⟩ 119898 = (1 2 119899)

(13) For each 119905119895

Compute 119862 = max119888119895=1

sum119868

119894=1119900119895119898

Repeat (9) (10) (11) (12) (13)Until 119879 is traversed(14) Reducer outputs 119862End

Algorithm 2 MRBPNN 2

For classification MRBPNN 3 only needs to run the feedforwarding process and collects the reducer output in theform of ⟨119900

119895 target

119899⟩ Figure 4 shows three-layer architecture

of MRBPNN 3 and Algorithm 3 presents the pseudocode

4 Performance Evaluation

We have implemented the three parallel BPNNs usingHadoop an open source implementation framework of theMapReduce computing model An experimental Hadoopcluster was built to evaluate the performance of the algo-rithmsThe cluster consisted of 5 computers in which 4 nodesare Datanodes and the remaining one is Namenode Thecluster details are listed in Table 1

Two testing datasets were prepared for evaluations Thefirst dataset is a synthetic datasetThe second is the Iris dataset

Table 1 Cluster details

NamenodeCPU Core i73GHz

Memory 8GBSSD 750GBOS Fedora

DatanodesCPU Core i738GHz

Memory 32GBSSD 250GBOS Fedora

Network bandwidth 1GbpsHadoop version 230 32 bits

which is a published machine learning benchmark dataset[27] Table 2 shows the details of the two datasets

8 Computational Intelligence and Neuroscience

Input 119878 119879 Output 119862119898 mappers 119898 reducersInitially each mapper inputs

⟨index119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ 119896 = (1 2 119898)

(1) For instance119899= 1198861 1198862 1198863 119886

119894119899

Input 119886119894rarr 119894119899

119894 neuron inmapper computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(2)Mapper outputs output⟨index

119896 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 119903 = (1 2 119898)

(3) 119896th reducer collects outputOutput ⟨index

1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 1198961015840 = (1 2 119898)

(4) Mapper1198961015840 inputs

⟨index1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩

Execute (2) outputs ⟨119900119895119900 target

119899⟩

(5) 119898th reducer collects and merges all ⟨119900119895119900 target

119899⟩ from 119898 mappers

Feed forward terminates(6) In 119898th reducer

ComputeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119899minus 119900119895119900)

Using HDFS API retrieve 119908119894119895 120579119895 119900119895ℎ instance

119899

Compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(7) Update 119908119894119895 120579119895 119908119894119895119900

120579119895119900

119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

into⟨index

119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩

Back propagation terminates(8) Retrieve instance

119899+1and target

119899+1

Update into ⟨index119896 instance

119899+1 119908119894119895 120579119895 target

119899+1 119908119894119895119900

120579119895119900⟩

Repeat (1) (2) (3) (4) (5) (6) (7) (8) (9)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(9) Each mapper inputs ⟨index

119896 instance

119905 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ instance

119905isin 119879

(10) Execute (1) (2) (3) (4) (5) (6) outputs ⟨119900119895119900 target

119899⟩

(11) 119898th reducer outputs 119862End

Algorithm 3 MRBPNN 3

Table 2 Dataset details

Data type Instancenumber

Instancelength

Elementrange

Classnumber

Syntheticdata 200 32 0 and 1 4

Iris data 150 4 (0 8) 3

We implemented a three-layer neural network with16 neurons in the hidden layer The Hadoop cluster wasconfigured with 16 mappers and 16 reducers The numberof instances was varied from 10 to 1000 for evaluating

the precision of the algorithms The size of the datasets wasvaried from 1MB to 1GB for evaluating the computation effi-ciency of the algorithms Each experiment was executed fivetimes and the final result was an average The precision 119901 iscomputed using

119901 =119903

119903 + 119908times 100 (13)

where 119903 represents the number of correctly recognizedinstances 119908 represents the number of wrongly recognizedinstances 119901 represents the precision

Computational Intelligence and Neuroscience 9

Instance

Instance

weights and biases

and biases

and biases

Instance weights

weights

Mapper 1

Mapper 2

Mapper n

Mapper 1

Mapper 2

Mapper n

Mapper 1

Mapper 2

Mapper n

Reducer outputs l-1 rounds

Reducercollects

andadjustsweights

andbiases

Outputs

Next instance

Weights biases

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

Figure 4 MRBPNN 3 structure

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

Number of training instances

0

20

40

60

80

100

120

Prec

ision

()

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 5 The precision of MRBPNN 1 on the two datasets

41 Classification Precision The classification precision ofMRBPNN 1 was evaluated using a varied number of traininginstances The maximum number of the training instanceswas 1000whilst themaximumnumber of the testing instanceswas also 1000 The large number of instances is based onthe data duplication Figure 5 shows the precision resultsof MRBPNN 1 in classification using 10 mappers It can beobserved that the precision keeps increasing with an increasein the number of training instances Finally the precision ofMRBPNN 1 on the synthetic dataset reaches 100 while theprecision on the Iris dataset reaches 975 In this test thebehavior of the parallel MRBPNN 1 is quite similar to thatof the standalone BPNNThe reason is that MRBPNN 1 doesnot distribute the BPNN among the Hadoop nodes insteadit runs on Hadoop to distribute the data

To evaluate MRBPNN 2 we designed 1000 traininginstances and 1000 testing instances using data duplication

The mappers were trained by subsets of the traininginstances and produced the classification results of 1000testing instances based on bootstrapping andmajority votingMRBPNN 2 employed 10 mappers each of which inputs anumber of training instances varying from 10 to 1000 Fig-ure 6 presents the precision results of MRBPNN 2 on the twotesting datasets It also shows that along with the increasingnumber of training instances in each subneural network theachieved precision based onmajority voting keeps increasingThe precision ofMRBPNN 2 on the synthetic dataset reaches100 whilst the precision on the Iris dataset reaches 975which is higher than that of MRBPNN 1

MRBPNN 3 implements a fully parallel and distributedneural network using Hadoop to deal with a complex neuralnetwork with a large number of neurons Figure 7 shows theperformance of MRBPNN 3 using 16 mappersThe precisionalso increases along with the increasing number of training

10 Computational Intelligence and Neuroscience

0

20

40

60

80

100

120Pr

ecisi

on (

)

Number of instances in each sub-BPNN

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

0

20

40

60

80

100

120

Prec

ision

()

Number of instances in each sub-BPNN

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 6 The precision of MRBPNN 2 on the two datasets

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

20

0

40

60

80

100

120Pr

ecisi

on (

)

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 7 The precision of MRBPNN 3 on the two datasets

instances for both datasets It also can be observed that thestability of the curve is quite similar to that of MRBPNN 1Both curves have more fluctuations than that of MRBPNN 2

Figure 8 compares the overall precision of the three par-allel BPNNsMRBPNN 1 andMRBPNN 3 perform similarlywhereas MRBPNN 2 performs the best using bootstrappingandmajority voting In addition the precision ofMRBPNN 2in classification is more stable than that of both MRBPNN 1and MRBPNN 3

Figure 9 presents the stability of the three algorithms onthe synthetic dataset showing the precision of MRBPNN 2in classification is highly stable compared with that of bothMRBPNN 1 and MRBPNN 3

42 Computation Efficiency A number of experiments werecarried out in terms of computation efficiency using thesynthetic dataset The first experiment was to evaluate theefficiency of MRBPNN 1 using 16 mappers The volume ofdata instances was varied from 1MB to 1GB Figure 10 clearlyshows that the parallel MRBPNN 1 significantly outperforms

the standalone BPNN The computation overhead of thestandalone BPNN is lowwhen the data size is less than 16MBHowever the overhead of the standalone BPNN increasessharply with increasing data sizes This is mainly becauseMRBPNN 1 distributes the testing data into 4 data nodes inthe Hadoop cluster which runs in parallel in classification

Figure 11 shows the computation efficiency ofMRBPNN 2 using 16 mappers It can be observed that whenthe data size is small the standalone BPNN performs betterHowever the computation overhead of the standalone BPNNincreases rapidly when the data size is larger than 64MBSimilar to MRBPNN 1 the parallel MRBPNN 2 scales withincreasing data sizes using the Hadoop framework

Figure 12 shows the computation overhead ofMRBPNN 3 using 16 mappers MRBPNN 3 incurs a higheroverhead than both MRBPNN 1 and MRBPNN 2 Thereason is that bothMRBPNN 1 andMRBPNN 2 run trainingand classification within one MapReduce job which meansmappers and reducers only need to start once HoweverMRBPNN 3 contains a number of jobs The algorithm has to

Computational Intelligence and Neuroscience 11

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

MRBPNN_1MRBPNN_2MRBPNN_3

(a) The synthetic dataset

20

0

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

MRBPNN_1MRBPNN_2MRBPNN_3

(b) The Iris dataset

Figure 8 Precision comparison of the three parallel BPNNs

05

1015202530354045

1 2 3 4 5

Prec

ision

()

Number of tests

MRBPNN_1MRBPNN_2MRBPNN_3

Figure 9 The stability of the three parallel BPNNs

0100020003000400050006000700080009000

10000

1 2 4 8 16 32 64 128 256 512 1024

Exec

utio

n tim

e (s)

Data volume (MB)

Standalone BPNNMRBPNN_1

Figure 10 Computation efficiency of MRBPNN 1

0

100

200

300

400

500

600

700

800

1 2 4 8 16 32 64 128 256 512 1024

Exec

utio

n tim

e (s)

Data volume (MB)

Standalone BPNNMRBPNN_2

Figure 11 Computation efficiency of MRBPNN 2

start mappers and reducers a number of times This processincurs a large system overhead which affects its computationefficiency Nevertheless Figure 12 shows the feasibility offully distributing a BPNN in dealing with a complex neuralnetwork with a large number of neurons

5 Conclusion

In this paper we have presented three parallel neural net-works (MRBPNN 1MRBPNN 2 andMRBPNN 3) based ontheMapReduce computingmodel in dealing with data inten-sive scenarios in terms of the size of classification datasetthe size of the training dataset and the number of neuronsrespectively Overall experimental results have shown thecomputation overhead can be significantly reduced using anumber of computers in parallel MRBPNN 3 shows the fea-sibility of fully distributing a BPNN in a computer cluster butincurs a high overhead of computation due to continuous

12 Computational Intelligence and Neuroscience

0

1000

2000

3000

4000

5000

6000

7000

1 2 4 8 16 32 64 128 256 512 1024

Trai

ning

effici

ency

(s)

Volume of data (MB)

Figure 12 Computation efficiency of MRBPNN 3

starting and stopping of mappers and reducers in Hadoopenvironment One of the future works is to research in-memory processing to further enhance the computation effi-ciency ofMapReduce in dealingwith data intensive taskswithmany iterations

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgment

The authors would like to appreciate the support from theNational Science Foundation of China (no 51437003)

References

[1] Networked European Software and Services Initiative (NESSI)ldquoBig data a new world of opportunitiesrdquo Networked EuropeanSoftware and Services Initiative (NESSI) White Paper 2012httpwwwnessi-europecomFilesPrivateNESSI WhitePa-per BigDatapdf

[2] P C Zikopoulos C Eaton D deRoos T Deutsch andG LapisUnderstanding Big Data Analytics for Enterprise Class Hadoopand Streaming Data McGraw-Hill 2012

[3] M H Hagan H B Demuth and M H Beale Neural NetworkDesign PWS Publishing 1996

[4] RGu F Shen andYHuang ldquoA parallel computing platform fortraining large scale neural networksrdquo in Proceedings of the IEEEInternational Conference on Big Data pp 376ndash384 October2013

[5] V Kumar A Grama A Gupta and G Karypis Introduction toParallel Computing Benjamin CummingsAddisonWesley SanFrancisco Calif USA 2002

[6] Message Passing Interface 2015 httpwwwmcsanlgovresearchprojectsmpi

[7] L N Long and A Gupta ldquoScalable massively parallel artificialneural networksrdquo Journal of Aerospace Computing Informationand Communication vol 5 no 1 pp 3ndash15 2008

[8] J Dean and SGhemawat ldquoMapReduce simplified data process-ing on large clustersrdquo Communications of the ACM vol 51 no1 pp 107ndash113 2008

[9] Y Liu M Li M Khan and M Qi ldquoA mapreduce based dis-tributed LSI for scalable information retrievalrdquo Computing andInformatics vol 33 no 2 pp 259ndash280 2014

[10] N K Alham M Li Y Liu and M Qi ldquoA MapReduce-baseddistributed SVM ensemble for scalable image classification andannotationrdquoComputers andMathematics with Applications vol66 no 10 pp 1920ndash1934 2013

[11] J Jiang J Zhang G Yang D Zhang and L Zhang ldquoApplicationof back propagation neural network in the classification of highresolution remote sensing image take remote sensing imageof Beijing for instancerdquo in Proceedings of the 18th InternationalConference on Geoinformatics pp 1ndash6 June 2010

[12] N L D Khoa K Sakakibara and I Nishikawa ldquoStock priceforecasting using back propagation neural networks with timeand profit based adjusted weight factorsrdquo in Proceedings ofthe SICE-ICASE International Joint Conference pp 5484ndash5488IEEE Busan Republic of korea October 2006

[13] M Rizwan M Jamil and D P Kothari ldquoGeneralized neuralnetwork approach for global solar energy estimation in IndiardquoIEEE Transactions on Sustainable Energy vol 3 no 3 pp 576ndash584 2012

[14] Y Wang B Li R Luo Y Chen N Xu and H Yang ldquoEnergyefficient neural networks for big data analyticsrdquo in Proceedingsof the 17th International Conference on Design Automation andTest in Europe (DATE rsquo14) March 2014

[15] D Nguyen and B Widrow ldquoImproving the learning speed of 2-layer neural networks by choosing initial values of the adaptiveweightsrdquo in Proceedings of the International Joint Conference onNeural Networks vol 3 pp 21ndash26 June 1990

[16] H Kanan and M Khanian ldquoReduction of neural networktraining time using an adaptive fuzzy approach in real timeapplicationsrdquo International Journal of Information and Electron-ics Engineering vol 2 no 3 pp 470ndash474 2012

[17] A A Ikram S Ibrahim M Sardaraz M Tahir H Bajwa andC Bach ldquoNeural network based cloud computing platform forbioinformaticsrdquo in Proceedings of the 9th Annual Conference onLong Island Systems Applications and Technology (LISAT rsquo13)pp 1ndash6 IEEE Farmingdale NY USA May 2013

[18] V Rao and S Rao ldquoApplication of artificial neural networksin capacity planning of cloud based IT infrastructurerdquo inProceedings of the 1st IEEE International Conference on CloudComputing for Emerging Markets (CCEM rsquo12) pp 38ndash41 Octo-ber 2012

[19] A A Huqqani E Schikuta and E Mann ldquoParallelized neuralnetworks as a servicerdquo in Proceedings of the International JointConference on Neural Networks (IJCNN rsquo14) pp 2282ndash2289 July2014

[20] J Yuan and S Yu ldquoPrivacy preserving back-propagation neuralnetwork learning made practical with cloud computingrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 1pp 212ndash221 2014

[21] Z Liu H Li and G Miao ldquoMapReduce-based back propaga-tion neural network over large scalemobile datardquo in Proceedingsof the 6th International Conference on Natural Computation(ICNC rsquo10) pp 1726ndash1730 August 2010

[22] B HeW Fang Q Luo N K Govindaraju and TWang ldquoMarsa MapReduce framework on graphics processorsrdquo in Proceed-ings of the 17th International Conference on Parallel Architectures

Computational Intelligence and Neuroscience 13

and Compilation Techniques (PACT rsquo08) pp 260ndash269 October2008

[23] K Taura K Kaneda T Endo and A Yonezawa ldquoPhoenix aparallel programming model for accommodating dynamicallyjoiningleaving resourcesrdquo ACM SIGPLAN Notices vol 38 no10 pp 216ndash229 2003

[24] Apache Hadoop 2015 httphadoopapacheorg[25] J Venner Pro Hadoop Springer New York NY USA 2009[26] N K Alham Parallelizing support vector machines for scalable

image annotation [PhD thesis] Brunel University UxbridgeUK 2011

[27] The Iris Dataset 2015 httpsarchiveicsuciedumldatasetsIris

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article MapReduce Based Parallel Neural Networks in …downloads.hindawi.com/journals/cin/2015/297672.pdf · 2019. 7. 31. · Research Article MapReduce Based Parallel Neural

8 Computational Intelligence and Neuroscience

Input 119878 119879 Output 119862119898 mappers 119898 reducersInitially each mapper inputs

⟨index119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ 119896 = (1 2 119898)

(1) For instance119899= 1198861 1198862 1198863 119886

119894119899

Input 119886119894rarr 119894119899

119894 neuron inmapper computes

119868119895ℎ

=

119894119899

sum119894=1

119886119894sdot 119908119894119895+ 120579119895

119900119895ℎ

=1

1 + 119890119868119895ℎ

(2)Mapper outputs output⟨index

119896 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 119903 = (1 2 119898)

(3) 119896th reducer collects outputOutput ⟨index

1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩ 1198961015840 = (1 2 119898)

(4) Mapper1198961015840 inputs

⟨index1198961015840 119900119895ℎ 119908119894119895119900

120579119895119900 target

119899⟩

Execute (2) outputs ⟨119900119895119900 target

119899⟩

(5) 119898th reducer collects and merges all ⟨119900119895119900 target

119899⟩ from 119898 mappers

Feed forward terminates(6) In 119898th reducer

ComputeErr119895119900

= 119900119895119900

(1 minus 119900119895119900) (target

119899minus 119900119895119900)

Using HDFS API retrieve 119908119894119895 120579119895 119900119895ℎ instance

119899

Compute

Err119895ℎ

= 119900119895ℎ

(1 minus 119900119895ℎ)

119900

sum119894=1

Err119894119908119894119900

(7) Update 119908119894119895 120579119895 119908119894119895119900

120579119895119900

119908119894119895= 119908119894119895+ 120578 sdot Err

119895sdot 119900119895

120579119895= 120579119895+ 120578 sdot Err

119895

into⟨index

119896 instance

119899 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩

Back propagation terminates(8) Retrieve instance

119899+1and target

119899+1

Update into ⟨index119896 instance

119899+1 119908119894119895 120579119895 target

119899+1 119908119894119895119900

120579119895119900⟩

Repeat (1) (2) (3) (4) (5) (6) (7) (8) (9)Until

min (119864 [1198902]) = min (119864 [(target

119895minus 119900119895119900)2

])

Training terminates(9) Each mapper inputs ⟨index

119896 instance

119905 119908119894119895 120579119895 target

119899 119908119894119895119900

120579119895119900⟩ instance

119905isin 119879

(10) Execute (1) (2) (3) (4) (5) (6) outputs ⟨119900119895119900 target

119899⟩

(11) 119898th reducer outputs 119862End

Algorithm 3 MRBPNN 3

Table 2 Dataset details

Data type Instancenumber

Instancelength

Elementrange

Classnumber

Syntheticdata 200 32 0 and 1 4

Iris data 150 4 (0 8) 3

We implemented a three-layer neural network with16 neurons in the hidden layer The Hadoop cluster wasconfigured with 16 mappers and 16 reducers The numberof instances was varied from 10 to 1000 for evaluating

the precision of the algorithms The size of the datasets wasvaried from 1MB to 1GB for evaluating the computation effi-ciency of the algorithms Each experiment was executed fivetimes and the final result was an average The precision 119901 iscomputed using

119901 =119903

119903 + 119908times 100 (13)

where 119903 represents the number of correctly recognizedinstances 119908 represents the number of wrongly recognizedinstances 119901 represents the precision

Computational Intelligence and Neuroscience 9

Instance

Instance

weights and biases

and biases

and biases

Instance weights

weights

Mapper 1

Mapper 2

Mapper n

Mapper 1

Mapper 2

Mapper n

Mapper 1

Mapper 2

Mapper n

Reducer outputs l-1 rounds

Reducercollects

andadjustsweights

andbiases

Outputs

Next instance

Weights biases

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

middot middot middotmiddot middot middotmiddot middot middot

Figure 4 MRBPNN 3 structure

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

Number of training instances

0

20

40

60

80

100

120

Prec

ision

()

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 5 The precision of MRBPNN 1 on the two datasets

41 Classification Precision The classification precision ofMRBPNN 1 was evaluated using a varied number of traininginstances The maximum number of the training instanceswas 1000whilst themaximumnumber of the testing instanceswas also 1000 The large number of instances is based onthe data duplication Figure 5 shows the precision resultsof MRBPNN 1 in classification using 10 mappers It can beobserved that the precision keeps increasing with an increasein the number of training instances Finally the precision ofMRBPNN 1 on the synthetic dataset reaches 100 while theprecision on the Iris dataset reaches 975 In this test thebehavior of the parallel MRBPNN 1 is quite similar to thatof the standalone BPNNThe reason is that MRBPNN 1 doesnot distribute the BPNN among the Hadoop nodes insteadit runs on Hadoop to distribute the data

To evaluate MRBPNN 2 we designed 1000 traininginstances and 1000 testing instances using data duplication

The mappers were trained by subsets of the traininginstances and produced the classification results of 1000testing instances based on bootstrapping andmajority votingMRBPNN 2 employed 10 mappers each of which inputs anumber of training instances varying from 10 to 1000 Fig-ure 6 presents the precision results of MRBPNN 2 on the twotesting datasets It also shows that along with the increasingnumber of training instances in each subneural network theachieved precision based onmajority voting keeps increasingThe precision ofMRBPNN 2 on the synthetic dataset reaches100 whilst the precision on the Iris dataset reaches 975which is higher than that of MRBPNN 1

MRBPNN 3 implements a fully parallel and distributedneural network using Hadoop to deal with a complex neuralnetwork with a large number of neurons Figure 7 shows theperformance of MRBPNN 3 using 16 mappersThe precisionalso increases along with the increasing number of training

10 Computational Intelligence and Neuroscience

0

20

40

60

80

100

120Pr

ecisi

on (

)

Number of instances in each sub-BPNN

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

0

20

40

60

80

100

120

Prec

ision

()

Number of instances in each sub-BPNN

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 6 The precision of MRBPNN 2 on the two datasets

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(a) The synthetic dataset

20

0

40

60

80

100

120Pr

ecisi

on (

)

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

(b) The Iris dataset

Figure 7 The precision of MRBPNN 3 on the two datasets

instances for both datasets It also can be observed that thestability of the curve is quite similar to that of MRBPNN 1Both curves have more fluctuations than that of MRBPNN 2

Figure 8 compares the overall precision of the three par-allel BPNNsMRBPNN 1 andMRBPNN 3 perform similarlywhereas MRBPNN 2 performs the best using bootstrappingandmajority voting In addition the precision ofMRBPNN 2in classification is more stable than that of both MRBPNN 1and MRBPNN 3

Figure 9 presents the stability of the three algorithms onthe synthetic dataset showing the precision of MRBPNN 2in classification is highly stable compared with that of bothMRBPNN 1 and MRBPNN 3

42 Computation Efficiency A number of experiments werecarried out in terms of computation efficiency using thesynthetic dataset The first experiment was to evaluate theefficiency of MRBPNN 1 using 16 mappers The volume ofdata instances was varied from 1MB to 1GB Figure 10 clearlyshows that the parallel MRBPNN 1 significantly outperforms

the standalone BPNN The computation overhead of thestandalone BPNN is lowwhen the data size is less than 16MBHowever the overhead of the standalone BPNN increasessharply with increasing data sizes This is mainly becauseMRBPNN 1 distributes the testing data into 4 data nodes inthe Hadoop cluster which runs in parallel in classification

Figure 11 shows the computation efficiency ofMRBPNN 2 using 16 mappers It can be observed that whenthe data size is small the standalone BPNN performs betterHowever the computation overhead of the standalone BPNNincreases rapidly when the data size is larger than 64MBSimilar to MRBPNN 1 the parallel MRBPNN 2 scales withincreasing data sizes using the Hadoop framework

Figure 12 shows the computation overhead ofMRBPNN 3 using 16 mappers MRBPNN 3 incurs a higheroverhead than both MRBPNN 1 and MRBPNN 2 Thereason is that bothMRBPNN 1 andMRBPNN 2 run trainingand classification within one MapReduce job which meansmappers and reducers only need to start once HoweverMRBPNN 3 contains a number of jobs The algorithm has to

Computational Intelligence and Neuroscience 11

0

20

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

MRBPNN_1MRBPNN_2MRBPNN_3

(a) The synthetic dataset

20

0

40

60

80

100

120

Prec

ision

()

Number of training instances

10

30

50

70

90

110

130

150

170

190

210

230

250

270

290

310

330

350

370

390

500

700

900

MRBPNN_1MRBPNN_2MRBPNN_3

(b) The Iris dataset

Figure 8 Precision comparison of the three parallel BPNNs

05

1015202530354045

1 2 3 4 5

Prec

ision

()

Number of tests

MRBPNN_1MRBPNN_2MRBPNN_3

Figure 9 The stability of the three parallel BPNNs

0100020003000400050006000700080009000

10000

1 2 4 8 16 32 64 128 256 512 1024

Exec

utio

n tim

e (s)

Data volume (MB)

Standalone BPNNMRBPNN_1

Figure 10 Computation efficiency of MRBPNN 1

0

100

200

300

400

500

600

700

800

1 2 4 8 16 32 64 128 256 512 1024

Exec

utio

n tim

e (s)

Data volume (MB)

Standalone BPNNMRBPNN_2

Figure 11 Computation efficiency of MRBPNN 2

start mappers and reducers a number of times This processincurs a large system overhead which affects its computationefficiency Nevertheless Figure 12 shows the feasibility offully distributing a BPNN in dealing with a complex neuralnetwork with a large number of neurons

5 Conclusion

In this paper we have presented three parallel neural net-works (MRBPNN 1MRBPNN 2 andMRBPNN 3) based ontheMapReduce computingmodel in dealing with data inten-sive scenarios in terms of the size of classification datasetthe size of the training dataset and the number of neuronsrespectively Overall experimental results have shown thecomputation overhead can be significantly reduced using anumber of computers in parallel MRBPNN 3 shows the fea-sibility of fully distributing a BPNN in a computer cluster butincurs a high overhead of computation due to continuous

12 Computational Intelligence and Neuroscience

0

1000

2000

3000

4000

5000

6000

7000

1 2 4 8 16 32 64 128 256 512 1024

Trai

ning

effici

ency

(s)

Volume of data (MB)

Figure 12 Computation efficiency of MRBPNN 3

starting and stopping of mappers and reducers in Hadoopenvironment One of the future works is to research in-memory processing to further enhance the computation effi-ciency ofMapReduce in dealingwith data intensive taskswithmany iterations

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgment

The authors would like to appreciate the support from theNational Science Foundation of China (no 51437003)

References

[1] Networked European Software and Services Initiative (NESSI), "Big data: a new world of opportunities," NESSI White Paper, 2012, http://www.nessi-europe.com/Files/Private/NESSI_WhitePaper_BigData.pdf.
[2] P. C. Zikopoulos, C. Eaton, D. deRoos, T. Deutsch, and G. Lapis, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill, 2012.
[3] M. H. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS Publishing, 1996.
[4] R. Gu, F. Shen, and Y. Huang, "A parallel computing platform for training large scale neural networks," in Proceedings of the IEEE International Conference on Big Data, pp. 376–384, October 2013.
[5] V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing, Benjamin Cummings/Addison Wesley, San Francisco, Calif, USA, 2002.
[6] Message Passing Interface, 2015, http://www.mcs.anl.gov/research/projects/mpi/.
[7] L. N. Long and A. Gupta, "Scalable massively parallel artificial neural networks," Journal of Aerospace Computing, Information, and Communication, vol. 5, no. 1, pp. 3–15, 2008.
[8] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[9] Y. Liu, M. Li, M. Khan, and M. Qi, "A MapReduce based distributed LSI for scalable information retrieval," Computing and Informatics, vol. 33, no. 2, pp. 259–280, 2014.
[10] N. K. Alham, M. Li, Y. Liu, and M. Qi, "A MapReduce-based distributed SVM ensemble for scalable image classification and annotation," Computers and Mathematics with Applications, vol. 66, no. 10, pp. 1920–1934, 2013.
[11] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, "Application of back propagation neural network in the classification of high resolution remote sensing image: take remote sensing image of Beijing for instance," in Proceedings of the 18th International Conference on Geoinformatics, pp. 1–6, June 2010.
[12] N. L. D. Khoa, K. Sakakibara, and I. Nishikawa, "Stock price forecasting using back propagation neural networks with time and profit based adjusted weight factors," in Proceedings of the SICE-ICASE International Joint Conference, pp. 5484–5488, IEEE, Busan, Republic of Korea, October 2006.
[13] M. Rizwan, M. Jamil, and D. P. Kothari, "Generalized neural network approach for global solar energy estimation in India," IEEE Transactions on Sustainable Energy, vol. 3, no. 3, pp. 576–584, 2012.
[14] Y. Wang, B. Li, R. Luo, Y. Chen, N. Xu, and H. Yang, "Energy efficient neural networks for big data analytics," in Proceedings of the 17th International Conference on Design, Automation and Test in Europe (DATE '14), March 2014.
[15] D. Nguyen and B. Widrow, "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights," in Proceedings of the International Joint Conference on Neural Networks, vol. 3, pp. 21–26, June 1990.
[16] H. Kanan and M. Khanian, "Reduction of neural network training time using an adaptive fuzzy approach in real time applications," International Journal of Information and Electronics Engineering, vol. 2, no. 3, pp. 470–474, 2012.
[17] A. A. Ikram, S. Ibrahim, M. Sardaraz, M. Tahir, H. Bajwa, and C. Bach, "Neural network based cloud computing platform for bioinformatics," in Proceedings of the 9th Annual Conference on Long Island Systems, Applications and Technology (LISAT '13), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2013.
[18] V. Rao and S. Rao, "Application of artificial neural networks in capacity planning of cloud based IT infrastructure," in Proceedings of the 1st IEEE International Conference on Cloud Computing for Emerging Markets (CCEM '12), pp. 38–41, October 2012.
[19] A. A. Huqqani, E. Schikuta, and E. Mann, "Parallelized neural networks as a service," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '14), pp. 2282–2289, July 2014.
[20] J. Yuan and S. Yu, "Privacy preserving back-propagation neural network learning made practical with cloud computing," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 1, pp. 212–221, 2014.
[21] Z. Liu, H. Li, and G. Miao, "MapReduce-based back propagation neural network over large scale mobile data," in Proceedings of the 6th International Conference on Natural Computation (ICNC '10), pp. 1726–1730, August 2010.
[22] B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang, "Mars: a MapReduce framework on graphics processors," in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT '08), pp. 260–269, October 2008.
[23] K. Taura, K. Kaneda, T. Endo, and A. Yonezawa, "Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources," ACM SIGPLAN Notices, vol. 38, no. 10, pp. 216–229, 2003.
[24] Apache Hadoop, 2015, http://hadoop.apache.org/.
[25] J. Venner, Pro Hadoop, Springer, New York, NY, USA, 2009.
[26] N. K. Alham, Parallelizing support vector machines for scalable image annotation [Ph.D. thesis], Brunel University, Uxbridge, UK, 2011.
[27] The Iris Dataset, 2015, https://archive.ics.uci.edu/ml/datasets/Iris.
