
Big Data Analytics for Internet of Things

Edited by

Tausifa Jan Saleem
National Institute of Technology
Srinagar, India

Mohammad Ahsan Chishti
Central University of Kashmir
Ganderbal, Kashmir, India

This edition first published 2021
© 2021 John Wiley & Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Tausifa Jan Saleem and Mohammad Ahsan Chishti to be identified as the author(s) of the editorial material in this work has been asserted in accordance with law.

Registered Office
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office
111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: Saleem, Tausifa Jan, editor. | Chishti, Mohammad Ahsan, editor.
Title: Big data analytics for Internet of things / edited by Tausifa Jan Saleem, Mohammad Ahsan Chishti.
Description: First edition. | Hoboken, NJ : Wiley, 2021. | Includes bibliographical references and index.
Identifiers: LCCN 2020049761 (print) | LCCN 2020049762 (ebook) | ISBN 9781119740759 (hardback) | ISBN 9781119740766 (adobe pdf) | ISBN 9781119740773 (epub)
Subjects: LCSH: Big data. | Internet of things.
Classification: LCC QA76.9.B45 B4995 2021 (print) | LCC QA76.9.B45 (ebook) | DDC 005.7–dc23
LC record available at https://lccn.loc.gov/2020049761
LC ebook record available at https://lccn.loc.gov/2020049762

Cover Design: Wiley
Cover Image: © Blue Planet Studio/iStock/Getty Images Plus/Getty Images

Set in 9.5/12.5pt STIXTwoText by SPi Global, Pondicherry, India


Contents

List of Contributors
List of Abbreviations

1 Big Data Analytics for the Internet of Things: An Overview
Tausifa Jan Saleem and Mohammad Ahsan Chishti

2 Data, Analytics and Interoperability Between Systems (IoT) is Incongruous with the Economics of Technology: Evolution of Porous Pareto Partition (P3)
Shoumen Palit Austin Datta, Tausifa Jan Saleem, Molood Barati, María Victoria López López, Marie-Laure Furgala, Diana C. Vanegas, Gérald Santucci, Pramod P. Khargonekar, and Eric S. McLamore
2.1 Context
2.2 Models in the Background
2.3 Problem Space: Are We Asking the Correct Questions?
2.4 Solutions Approach: The Elusive Quest to Build Bridges Between Data and Decisions
2.5 Avoid This Space: The Deception Space
2.6 Explore the Solution Space: Necessary to Ask Questions That May Not Have Answers, Yet
2.7 Solution Economy: Will We Ever Get There?
2.8 Is This Faux Naïveté in Its Purest Distillate?
2.9 Reality Check: Data Fusion
2.10 “Double A” Perspective of Data and Tools vs. The Hypothetical Porous Pareto (80/20) Partition
2.11 Conundrums
2.12 Stigma of Partition vs. Astigmatism of Vision
2.13 The Illusion of Data, Delusion of Big Data, and the Absence of Intelligence in AI
2.14 In Service of Society
2.15 Data Science in Service of Society: Knowledge and Performance from PEAS
2.16 Temporary Conclusion
Acknowledgements
References

3 Machine Learning Techniques for IoT Data Analytics
Nailah Afshan and Ranjeet Kumar Rout
3.1 Introduction
3.2 Taxonomy of Machine Learning Techniques
3.2.1 Supervised ML Algorithm
3.2.1.1 Classification
3.2.1.2 Regression Analysis
3.2.1.3 Classification and Regression Tasks
3.2.2 Unsupervised Machine Learning Algorithms
3.2.2.1 Clustering
3.2.2.2 Feature Extraction
3.2.3 Conclusion
References

4 IoT Data Analytics Using Cloud Computing
Anjum Sheikh, Sunil Kumar, and Asha Ambhaikar
4.1 Introduction
4.2 IoT Data Analytics
4.2.1 Process of IoT Analytics
4.2.2 Types of Analytics
4.3 Cloud Computing for IoT
4.3.1 Deployment Models for Cloud
4.3.1.1 Private Cloud
4.3.1.2 Public Cloud
4.3.1.3 Hybrid Cloud
4.3.1.4 Community Cloud
4.3.2 Service Models for Cloud Computing
4.3.2.1 Software as a Service (SaaS)
4.3.2.2 Platform as a Service (PaaS)
4.3.2.3 Infrastructure as a Service (IaaS)
4.3.3 Data Analytics on Cloud
4.4 Cloud-Based IoT Data Analytics Platform
4.4.1 Atos Codex
4.4.2 AWS IoT
4.4.3 IBM Watson IoT
4.4.4 Hitachi Vantara Pentaho, Lumada
4.4.5 Microsoft Azure IoT
4.4.6 Oracle IoT Cloud Services
4.5 Machine Learning for IoT Analytics in Cloud
4.5.1 ML Algorithms for Data Analytics
4.5.2 Types of Predictions Supported by ML and Cloud
4.6 Challenges for Analytics Using Cloud
4.7 Conclusion
References

5 Deep Learning Architectures for IoT Data Analytics
Snowber Mushtaq and Omkar Singh
5.1 Introduction
5.1.1 Types of Learning Algorithms
5.1.1.1 Supervised Learning
5.1.1.2 Unsupervised Learning
5.1.1.3 Semi-Supervised Learning
5.1.1.4 Reinforcement Learning
5.1.2 Steps Involved in Solving a Problem
5.1.2.1 Basic Terminology
5.1.2.2 Training Process
5.1.3 Modeling in Data Science
5.1.3.1 Generative
5.1.3.2 Discriminative
5.1.4 Why DL and IoT?
5.2 DL Architectures
5.2.1 Restricted Boltzmann Machine
5.2.1.1 Training Boltzmann Machine
5.2.1.2 Applications of RBM
5.2.2 Deep Belief Networks (DBN)
5.2.2.1 Training DBN
5.2.2.2 Applications of DBN
5.2.3 Autoencoders
5.2.3.1 Training of AE
5.2.3.2 Applications of AE
5.2.4 Convolutional Neural Networks (CNN)
5.2.4.1 Layers of CNN
5.2.4.2 Activation Functions Used in CNN
5.2.4.3 Applications of CNN
5.2.5 Generative Adversarial Network (GANs)
5.2.5.1 Training of GANs
5.2.5.2 Variants of GANs
5.2.5.3 Applications of GANs
5.2.6 Recurrent Neural Networks (RNN)
5.2.6.1 Training of RNN
5.2.6.2 Applications of RNN
5.2.7 Long Short-Term Memory (LSTM)
5.2.7.1 Training of LSTM
5.2.7.2 Applications of LSTM
5.3 Conclusion
References

6 Adding Personal Touches to IoT: A User-Centric IoT Architecture
Sarabjeet Kaur Kochhar
6.1 Introduction
6.2 Enabling Technologies for BDA of IoT Systems
6.3 Personalizing the IoT
6.3.1 Personalization for Business
6.3.2 Personalization for Marketing
6.3.3 Personalization for Product Improvement and Service Optimization
6.3.4 Personalization for Automated Recommendations
6.3.5 Personalization for Improved User Experience
6.4 Related Work
6.5 User Sensitized IoT Architecture
6.6 The Tweaked Data Layer
6.7 The Personalization Layer
6.7.1 The Characterization Engine
6.7.2 The Sentiment Analyzer
6.8 Concerns and Future Directions
6.9 Conclusions
References

7 Smart Cities and the Internet of Things
Hemant Garg, Sushil Gupta, and Basant Garg
7.1 Introduction
7.2 Development of Smart Cities and the IoT
7.3 The Combination of the IoT with Development of City Architecture to Form Smart Cities
7.3.1 Unification of the IoT
7.3.2 Security of Smart Cities
7.3.3 Management of Water and Related Amenities
7.3.4 Power Distribution and Management
7.3.5 Revenue Collection and Administration
7.3.6 Management of City Assets and Human Resources
7.3.7 Environmental Pollution Management
7.4 How Future Smart Cities Can Improve Their Utilization of the Internet of All Things, with Examples
7.5 Conclusion
References

8 A Roadmap for Application of IoT-Generated Big Data in Environmental Sustainability
Ankur Kashyap
8.1 Background and Motivation
8.2 Execution of the Study
8.2.1 Role of Big Data in Sustainability
8.2.2 Present Status and Future Possibilities of IoT in Environmental Sustainability
8.3 Proposed Roadmap
8.4 Identification and Prioritizing the Barriers in the Process
8.4.1 Internet Infrastructure
8.4.2 High Hardware and Software Cost
8.4.3 Less Qualified Workforce
8.5 Conclusion and Discussion
References

9 Application of High-Performance Computing in Synchrophasor Data Management and Analysis for Power Grids
C.M. Thasnimol and R. Rajathy
9.1 Introduction
9.2 Applications of Synchrophasor Data
9.2.1 Voltage Stability Analysis
9.2.2 Transient Stability
9.2.3 Out of Step Splitting Protection
9.2.4 Multiple Event Detection
9.2.5 State Estimation
9.2.6 Fault Detection
9.2.7 Loss of Main (LOM) Detection
9.2.8 Topology Update Detection
9.2.9 Oscillation Detection
9.3 Utility Big Data Issues Related to PMU-Driven Applications
9.3.1 Heterogeneous Measurement Integration
9.3.2 Variety and Interoperability
9.3.3 Volume and Velocity
9.3.4 Data Quality and Security
9.3.5 Utilization and Analytics
9.3.6 Visualization of Data
9.4 Big Data Analytics Platforms for PMU Data Processing
9.4.1 Hadoop
9.4.2 Apache Spark
9.4.3 Apache HBase
9.4.4 Apache Storm
9.4.5 Cloud-Based Platforms
9.5 Conclusions
References

10 Intelligent Enterprise-Level Big Data Analytics for Modeling and Management in Smart Internet of Roads
Amin Fadaeddini, Babak Majidi, and Mohammad Eshghi
10.1 Introduction
10.2 Fully Convolutional Deep Neural Network for Autonomous Vehicle Identification
10.2.1 Detection of the Bounding Box of the License Plate
10.2.2 Segmentation Objective
10.2.3 Spatial Invariances
10.2.4 Model Framework
10.2.4.1 Increasing the Layer of Transformation
10.2.4.2 Data Format of Sample Images
10.2.4.3 Applying Batch Normalization
10.2.4.4 Network Architecture
10.2.5 Role of Data
10.2.6 Synthesizing Samples
10.2.7 Invariances
10.2.8 Reducing Number of Features
10.2.9 Choosing Number of Classes
10.3 Experimental Setup and Results
10.3.1 Sparse Softmax Loss
10.3.2 Mean Intersection Over Union
10.4 Practical Implementation of Enterprise-Level Big Data Analytics for Smart City
10.5 Conclusion
References

11 Predictive Analysis of Intelligent Sensing and Cloud-Based Integrated Water Management System
Tanuja Patgar and Ripal Patel
11.1 Introduction
11.2 Literature Survey
11.3 Proposed Six-Tier Data Framework
11.3.1 Primary Components
11.3.2 Contact Unit (FC-37)
11.3.3 Internet of Things Communicator (ESP8266)
11.3.4 GSM-Based ARM and Control System
11.3.5 Methodology
11.3.6 Proposed Algorithm
11.4 Implementation and Result Analysis
11.4.1 Water Report for Home 1 and Home 2 Modules
11.5 Conclusion
References

12 Data Security in the Internet of Things: Challenges and Opportunities
Shashwati Banerjea, Shashank Srivastava, and Sachin Kumar
12.1 Introduction
12.2 IoT: Brief Introduction
12.2.1 Challenges in a Secure IoT
12.2.2 Security Requirements in IoT Architecture
12.2.2.1 Sensing Layer
12.2.2.2 Network Layer
12.2.2.3 Interface Layer
12.2.3 Common Attacks in IoT
12.3 IoT Security Classification
12.3.1 Application Domain
12.3.1.1 Authentication
12.3.1.2 Authorization
12.3.1.3 Depletion of Resources
12.3.1.4 Establishment of Trust
12.3.2 Architectural Domain
12.3.2.1 Authentication in IoT Architecture
12.3.2.2 Authorization in IoT Architecture
12.3.3 Communication Channel
12.4 Security in IoT Data
12.4.1 IoT Data Security: Requirements
12.4.1.1 Data: Confidentiality, Integrity, and Authentication
12.4.1.2 Data Privacy
12.4.2 IoT Data Security: Research Directions
12.5 Conclusion
References

13 DDoS Attacks: Tools, Mitigation Approaches, and Probable Impact on Private Cloud Environment
R. K. Deka, D. K. Bhattacharyya, and J. K. Kalita
13.1 Introduction
13.1.1 State of the Art
13.1.2 Contribution
13.1.3 Organization
13.2 Cloud and DDoS Attack
13.2.1 Cloud Deployment Models
13.2.1.1 Differences Between Private Cloud and Public Cloud
13.2.2 DDoS Attacks
13.2.2.1 Attacks on Infrastructure Level
13.2.2.2 Attacks on Application Level
13.2.3 DoS/DDoS Attack on Cloud: Probable Impact
13.3 Mitigation Approaches
13.3.1 Discussion
13.4 Challenges and Issues with Recommendations
13.5 A Generic Framework
13.6 Conclusion and Future Work
References

14 Securing the Defense Data for Making Better Decisions Using Data Fusion
Syed Rameem Zahra
14.1 Introduction
14.2 Analysis of Big Data
14.2.1 Existing IoT Big Data Analytics Systems
14.2.2 Big Data Analytical Methods
14.2.3 Challenges in IoT Big Data Analytics
14.3 Data Fusion
14.3.1 Opportunities Provided by Data Fusion
14.3.2 Data Fusion Challenges
14.3.3 Stages at Which Data Fusion Can Happen
14.3.4 Mathematical Methods for Data Fusion
14.4 Data Fusion for IoT Security
14.4.1 Defense Use Case
14.5 Conclusion
References

15 New Age Journalism and Big Data (Understanding Big Data and Its Influence on Journalism)
Asif Khan and Heeba Din
15.1 Introduction
15.1.1 Big Data Journalism: The Next Big Thing
15.1.2 All About Data
15.1.3 Accessing Data for Journalism
15.1.4 Data Analytics: Tools for Journalists
15.1.5 Case Studies – Big Data
15.1.5.1 BBC Big Data
15.1.5.2 The Guardian Data Blog
15.1.5.3 Wikileaks
15.1.5.4 World Economic Forum
15.1.6 Big Data – Indian Scenario
15.1.7 Internet of Things and Journalism
15.1.8 Impact on Media/Journalism
References

16 Two Decades of Big Data in Finance: Systematic Literature Review and Future Research Agenda
Nufazil Altaf
16.1 Introduction
16.2 Methodology
16.3 Article Identification and Selection
16.4 Description and Classification of Literature
16.4.1 Research Method Employed
16.4.2 Articles Published Year Wise
16.4.3 Journal of Publication
16.5 Content and Citation Analysis of Articles
16.5.1 Citation Analysis
16.5.2 Content Analysis
16.5.2.1 Big Data in Financial Markets
16.5.2.2 Big Data in Internet Finance
16.5.2.3 Big Data in Financial Services
16.5.2.4 Big Data and Other Financial Issues
16.6 Reporting of Findings and Research Gaps
16.6.1 Findings from the Literature Review
16.6.1.1 Lack of Symmetry
16.6.1.2 Dominance of Research on Financial Markets, Internet Finance, and Financial Services
16.6.1.3 Dominance of Empirical Research
16.6.2 Directions for Future Research
References

Index

List of Contributors

Nailah Afshan
Department of Computer Science and Engineering
Islamic University of Science and Technology
Pulwama, India

Nufazil Altaf
School of Business Studies
Central University of Kashmir
Kashmir, India

Asha Ambhaikar
Department of Computer Science and Engineering
Kalinga University
Naya Raipur
Chhattisgarh, India

Shashwati Banerjea
Department of Computer Science and Engineering
Motilal Nehru National Institute of Technology Allahabad
Prayagraj
Uttar Pradesh, India

Molood Barati
School of Engineering, Computer and Mathematical Sciences
Auckland University of Technology
Auckland, New Zealand

Dhruba Kumar Bhattacharyya
Department of Computer Science and Engineering
School of Engineering
Tezpur University
Tezpur
Assam, India

Mohammad Ahsan Chishti
Department of Information Technology
Central University of Kashmir
Kashmir, India

Shoumen Palit Austin Datta
MIT Auto-ID Labs
Department of Mechanical Engineering
Massachusetts Institute of Technology
Cambridge, MA, USA

Rup Kumar Deka
Department of Computer Science and Engineering
Assam Don Bosco University
Guwahati
Assam, India

Heeba Din
Department of Mass Communication
Islamic University of Science and Technology
Pulwama, India

Mohammad Eshghi
Computer Engineering Department
Shahid Beheshti University
Tehran, Iran

Amin Fadaeddini
Department of Computer Engineering
Faculty of Engineering
Khatam University
Tehran, Iran

Marie-Laure Furgala
Institut Supérieur de Logistique Industrielle
KEDGE Business School
Talence, France

Basant Garg
P.S to MOS, Ministry of Commerce and Industry
Government of India
Udyog Bhawan, New Delhi, India

Hemant Garg
The PSCADB LTD
Chandigarh, India

Sushil Gupta
Department of Bio-Sciences
Lovely Professional University
Punjab, India

Jugal K. Kalita
Department of Computer Science
College of Engineering and Applied Science
University of Colorado
Boulder, CO, USA

Ankur Kashyap
Bennett University
Greater Noida, India

Asif Khan
School of Media Studies
Central University of Kashmir
Kashmir, India

Sarabjeet Kaur Kochhar
Department of Computer Science
Indraprastha College for Women
University of Delhi
New Delhi, India

Sachin Kumar
Department of Computer Science and Engineering
Motilal Nehru National Institute of Technology Allahabad
Prayagraj
Uttar Pradesh, India

Sunil Kumar
Department of Electrical and Electronics Engineering
Kalinga University
Naya Raipur
Chhattisgarh, India

María Victoria López López
Departamento Arquitectura de Computadores y Automática
Universidad Complutense de Madrid
Madrid, Spain

Babak Majidi
Emergency and Rapid Response Simulation (ADERSIM) Artificial Intelligence Group
Faculty of Liberal Arts and Professional Studies
York University
Toronto, ON, Canada

Eric S. McLamore
Department of Agricultural Sciences
Clemson University
Clemson
SC, USA

Snowber Mushtaq
Department of Computer Science and Engineering
Islamic University of Science and Technology
Pulwama, India

Ripal Patel
Department of Electronics and Communication
Dr. Ambedkar Institute of Technology
Bengaluru, India

Tanuja Patgar
Department of Electronics and Communication
Dr. Ambedkar Institute of Technology
Bengaluru, India

R. Rajathy
Department of Electrical and Electronics Engineering
Pondicherry Engineering College
Puducherry, India

Ranjeet Kumar Rout
Department of Computer Science and Engineering
National Institute of Technology
Srinagar, India

Tausifa Jan Saleem
Department of Computer Science and Engineering
National Institute of Technology
Srinagar, India

Gérald Santucci
INTEROP-VLab
Bureau Nouvelle Région Aquitaine Europe
Brussels, Belgium

Anjum Sheikh
Department of Electronics and Communication
Kalinga University
Naya Raipur
Chhattisgarh, India

Omkar Singh
Department of Electronics and Communication Engineering
National Institute of Technology
Srinagar, India

Shashank Srivastava
Department of Computer Science and Engineering
Motilal Nehru National Institute of Technology Allahabad
Prayagraj
Uttar Pradesh, India

C.M. Thasnimol
Department of Electrical and Electronics Engineering
Pondicherry Engineering College
Puducherry, India

Diana C. Vanegas
Interdisciplinary Group for Biotechnological Innovation and Ecosocial Change BioNovo
Universidad del Valle
Cali, Colombia

Syed Rameem Zahra
Department of Computer Science and Engineering
National Institute of Technology
Srinagar, India

List of Abbreviations

AI artificial intelligence
ALPR Automatic License Plate Recognition
ANN artificial neural network
AWS Amazon web services
BDA big data analytics
CBSP cloud-based security provider
CCP Cluster Communication Protocol
CHARGEN Character Generator Protocol
CNN convolutional neural network
CRF conditional random field
CSP cloud service provider
CVM core vector machine
DBSCAN density-based spatial clustering of applications
DDoS distributed denial of service
DNS Domain Name System
DoS denial of service
E-DoS economic denial of sustainability
FCNN fully convolutional neural network
HTTP Hyper-Text Transfer Protocol
IaaS infrastructure as a service
ICMP Internet Control Message Protocol
IDS Intrusion Detection System
IoRT internet of robotic things
IoT Internet of Things
IP Internet Protocol
IPS Intrusion Prevention Systems
IRAS Intrusion Responsive Autonomic System
IWD intelligent water drop
KDD Knowledge Discovery in Databases
KNN K-nearest neighbor
LAN local area networks
LAND local area network denial
M-IoU mean intersection over union
ML machine learning
NaaS networking as a service
NTP Network Time Protocol
OCSVM one class support vector machine
PaaS platform as a service
PCA principal component analysis
PDC Phasor Data Concentrator
PMU phasor measurement unit
POS part of speech
QoS quality of service
RDBMS Relational DataBase Management System
RNN recurrent neural network
SaaS software as a service
SCADA supervisory control and data acquisition
SCIT self-cleansing intrusion tolerance
SDN software-defined networking
SDNFV software-defined network function virtualization
SLA service-level agreement
SNMP Simple Network Management Protocol
SOAP Simple Object Access Protocol
SQL structured query language
SSDP Simple Service Discovery Protocol
SVM support vector machine
SVR support vector regression
SYN synchronize
TCP Transmission Control Protocol
TTL time-to-live
UPnP universal plug and play
VAE variational auto-encoder
WAMS wide area monitoring system
WAN wide area networks

Big Data Analytics for Internet of Things, First Edition. Edited by Tausifa Jan Saleem and Mohammad Ahsan Chishti. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc.

1

Big Data Analytics for the Internet of Things: An Overview

Tausifa Jan Saleem1 and Mohammad Ahsan Chishti2

1 Department of Computer Science and Engineering, National Institute of Technology Srinagar, India
2 Department of Information Technology, Central University of Kashmir, Kashmir, India

The Internet of Things (IoT) is an emerging idea that has the potential to completely reshape the outlook of businesses. The goal of the IoT is to make day-to-day objects smart by utilizing a broad range of sophisticated technologies, from embedded devices and communication technologies to data analytics. The IoT is bound to transform the way we work and live every day. The number of IoT devices is anticipated to reach several billion in the next few years. This unpredictable growth in the number of devices connected to the IoT and the exponential rise in data consumption show how the expansion of big data coincides with that of the IoT. The growth of big data and the IoT is swiftly accelerating and affecting all areas of technology and business. The main objective of data analytics in the IoT is to identify trends in the data, extract concealed information, and dig out valuable information from the raw data generated by IoT systems. This is crucial for delivering high-quality services to IoT users. In this regard, investigating the technological advancements in this area becomes indispensable. To this end, this book uncovers the recent trends in big data analytics for IoT applications so that novel, optimized, and efficient designs of IoT use cases can be formulated.

This book contains high-quality research articles discussing various aspects of IoT data analytics, such as the enabling technologies of IoT data analytics, the types of IoT data analytics, and the challenges in IoT data analytics. This is critically important for keeping researchers up to date with the ecosystem they have to deal with. The IoT is being used as a field for garnering huge business profits. It is extremely important to extract the best decisions or wisdom from the data that is being fed into the systems of business organizations. The book discusses ways of extracting valuable insights from big data. It presents the techniques that are suitable for deriving the best decisions from the enormous volume of IoT data in order to gain control of IoT devices. The book discusses almost every aspect of IoT data analytics.

The following topics are explored in this book:

● Enabling technologies for IoT Big Data Analytics
● Machine Learning Techniques for IoT Data Analytics
● Types of IoT Data Analytics
● IoT Data Analytical Platforms
● Challenges in IoT Data Analytics
● Deep Learning Architectures for IoT Data Analytics
● Personalization in IoT
● How IoT makes cities smarter
● Role of IoT and Big Data in Environmental Sustainability
● Synchrophasor Data Management in Power Grids
● Autonomous Vehicle Identification in Smart Transportation
● Cloud-based Water Management System
● Security and Privacy Requirements in IoT
● Mitigation of DDoS attacks
● Opportunities provided by Data Fusion
● Role of IoT and Big Data in Journalism
● Role of IoT and Big Data in Finance

The book comprises sixteen chapters. The following provides a glimpse of their contributions:

The second chapter, entitled “Data, Analytics and Interoperability Between Systems (IoT) is Incongruous with the Economics of Technology: Evolution of Porous Pareto Partition (P3),” argues that tools and data related to the affluent world are not a template to be “copied” or applied to systems in the remaining parts of the world (80%), which suffer from economic constraints. The chapter suggests that we need different thinking that resists the inclination of the affluent 20% of the world to treat the rest of the world (80% of the population) as a market. The 80/20 concept evokes the Pareto theme in P3, and the implication is that ideas may float between (porous) the 80/20 domains (partition).

The third chapter, entitled “Machine Learning Techniques for IoT Data Analytics,” discusses the various supervised and unsupervised machine learning approaches and their significant role in the smart analysis of IoT data. A detailed taxonomy of machine learning algorithms is presented, together with their strengths, challenges, and shortcomings. Following this, the chapter reviews application areas and use cases for each algorithm. This review helps the reader understand the usage of each algorithm and choose a suitable data analytics algorithm for a particular problem. The chapter concludes that machine learning has a lot of scope in the world of IoT and is proving highly beneficial for the efficient analysis of smart data.

The fourth chapter, entitled “IoT Data Analytics Using Cloud Computing,” discusses the cloud computing framework for IoT data analytics. The chapter also presents the importance of machine learning in IoT data analytics and lists the challenges faced by IoT data analytics when the cloud is used as a computing platform.

The fifth chapter, entitled “Deep Learning Architectures for IoT Data Analytics,” explores the opportunities created by Deep Learning in IoT data analytics. Deep Learning has shown phenomenal performance in diverse domains, including image recognition, speech recognition, robotics, natural language processing, and human-computer interfaces. The chapter describes the various Deep Learning architectures and presents their role in IoT data analytics.

The sixth chapter, entitled “Adding Personal Touches to IoT: A User-Centric IoT Architecture,” focuses on using personalization to take human-computer interaction to the next level. Personalization is a powerful instrument that has the potential to shape the quality of IoT products and services so that they keep pace with constantly evolving customer needs. Use cases and real-life examples demonstrate how users' personal insights can boost IoT systems across a variety of domains, such as business, marketing, recommendation systems, and commercial and industrial IoT systems and services. The chapter investigates how personalization is assuming an important, irreplaceable role in the development of IoT systems deployed across multiple domains and in the lives of the associated strata of users, such as business owners, marketing professionals, business analysts, data analysts, designers, and end users. The work takes stock of the current scenario and establishes, through use cases and examples, that personalization is already being exploited for huge benefits but that the concept itself is being given rather ad hoc treatment. This is evident because personalization finds no mention in the IoT architecture itself; it is left as a last-minute job in most of the IoT systems developed so far. Concerns regarding the use of personalization, namely privacy and the filter bubble, are also taken into consideration to point out future directions of work in big data analytics of IoT systems.

The seventh chapter, entitled “Smart Cities and the Internet of Things,” investigates the development of smart cities from the perspective of the IoT. The chapter uses existing examples of smart cities to forecast what the future holds for cities seeking to utilize the IoT in optimizing their operations and resource usage.

The eighth chapter, entitled “A Roadmap for Application of IoT-Generated Big Data in Environmental Sustainability,” describes the role of IoT-generated big data in environmental sustainability. The chapter proposes a roadmap for achieving better environmental sustainability and discusses the obstacles that hinder it.

The ninth chapter, entitled “Application of High-Performance Computing in Synchrophasor Data Management and Analysis for Power Grids,” discusses the various problems associated with big data analysis, with particular reference to the handling of Phasor Measurement Unit (PMU) data, and introduces modern techniques and tools to address those problems.

The tenth chapter, entitled “Intelligent Enterprise-Level Big Data Analytics for Modeling and Management in Smart Internet of Roads,” proposes a method based on a Fully Convolutional Neural Network for semantic segmentation of vehicle license plates in a complex and multi-language environment. First, the license plates are detected, and then the digits in the license plates are segmented. The performance of the proposed algorithm is evaluated using a dataset of real and manually generated data. The impact of various parameters on the accuracy of the proposed algorithm is investigated. The experimental results show that the proposed framework can detect and segment license plates in complex scenarios, and the results can be used in smart highway and smart road applications.
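Segmentation quality of this kind is commonly scored with mean intersection over union (M-IoU, which appears in this book's list of abbreviations). The following is only a minimal sketch of that metric under stated assumptions, not the chapter authors' implementation: the function name `mean_iou`, the toy 2 × 3 masks, and the two-class labeling (0 for background, 1 for plate) are all hypothetical and chosen purely for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class intersection over union of two label masks.

    pred and target are 2-D lists of integer class ids (hypothetical format).
    Classes that appear in neither mask are skipped.
    """
    # Flatten both masks into parallel lists of per-pixel class ids.
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        pairs = list(zip(pred_flat, target_flat))
        inter = sum(1 for p, t in pairs if p == c and t == c)
        union = sum(1 for p, t in pairs if p == c or t == c)
        if union == 0:
            continue  # class absent from both masks
        ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: 0 = background, 1 = license plate (illustrative labels only).
target = [[0, 1, 1],
          [0, 1, 1]]
pred = [[0, 0, 1],
        [0, 1, 1]]
score = mean_iou(pred, target, num_classes=2)
print(round(score, 4))
```

In this toy case the background class has an IoU of 2/3 and the plate class 3/4, so the mean rewards a prediction only when every class overlaps well, which plain pixel accuracy would not show when one class dominates the image.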

The eleventh chapter, entitled “Predictive Analysis of Intelligent Sensing and Cloud-Based Integrated Water Management System,” proposes a water management system with the following characteristics: real-time measurement of consumption, monitoring of leakages, the ability to control the water supply if there is a leakage, and a completely automated platform for societies and apartment complexes to set up their billing systems. The proposed system consists of a flow sensor meter installed in the main water inlet pipe that captures information about water usage and communicates through a WiFi network with iOS- and Android-compatible applications.

The twelfth chapter, entitled “Data Security in the Internet of Things: Challenges and Opportunities,” highlights IoT security threats and vulnerabilities. The chapter categorizes IoT security by application context, architecture, and communication. Furthermore, the chapter discusses research directions in confidentiality, privacy, and IoT data security.

The thirteenth chapter entitled “DDoS Attacks: Tools, Mitigation Approaches, and Probable Impact on Private Cloud Environment” discusses the seriousness of the threats posed by DDoS attacks in the context of the cloud, particularly in the personal private cloud. The chapter discusses several prominent approaches introduced to counter DDoS attacks in private clouds. The chapter presents a generic framework to defend against DDoS attacks in an individual private cloud environment, taking into account different challenges and issues.

The fourteenth chapter entitled “Securing the Defense Data for Making Better Decisions using Data Fusion” gives an idea of the problems that arise in defense-related IoT big data analytics, with special attention to security. Data fusion is introduced as a probable solution to tackle these problems. The chapter guides researchers regarding the issues of data fusion, the stages where it could be used, and the mathematical techniques that could be adopted to implement it on IoT big data.

The fifteenth chapter entitled “New age Journalism and Big data (Understanding big data & its influence on Journalism)” tries to identify how big data is altering the way journalism is practiced in the twenty-first century. For this purpose, the chapter takes as case studies award-winning data journalism projects, which have not only used big data for their stories but have also revolutionized the practice of journalism by converging big data with new media practices of interactive visualization. The chapter not only provides a glimpse into how big data is changing journalism but also critically examines the impact, practices, and methods involved, to lay out a guide for future research into this genre. The chapter concludes that both IoT and Big Data have tremendous potential to influence the economies of global markets and, at the same time, change the way content (information) is collected and produced for audiences.

The last chapter entitled “Two decades of big data in finance: Systematic literature review and future research agenda” presents a review of IoT and big data in finance. The chapter identifies the gaps in the current body of knowledge to deliberate upon areas of future research. The study uses a systematic literature review method on a sample of 105 articles published from 2000 to 2019. Work on big data in finance is dominated by empirical studies of financial markets, internet finance, and financial services. The chapter provides a comprehensive catalogue of publications on big data in finance, classified according to various attributes. The chapter should be useful to anyone concerned with big data.

Big Data Analytics for Internet of Things, First Edition. Edited by Tausifa Jan Saleem and Mohammad Ahsan Chishti. © 2021 John Wiley & Sons, Inc. Published 2021 by John Wiley & Sons, Inc.


2

Data, Analytics and Interoperability Between Systems (IoT) is Incongruous with the Economics of Technology: Evolution of Porous Pareto Partition (P3)

Shoumen Palit Austin Datta1,2,3,*, Tausifa Jan Saleem4, Molood Barati5, María Victoria López López6, Marie-Laure Furgala7, Diana C. Vanegas8, Gérald Santucci9, Pramod P. Khargonekar10, and Eric S. McLamore11

1 MIT Auto-ID Labs, Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
2 MDPnP Interoperability and Cybersecurity Labs, Biomedical Engineering Program, Department of Anesthesiology, Massachusetts General Hospital, Harvard Medical School, 65 Landsdowne Street, Cambridge, MA 02139, USA
3 NSF Center for Robots and Sensors for Human Well-Being, Collaborative Robotics Lab, School of Engineering Technology, Purdue University, 193 Knoy Hall, West Lafayette, IN 47907, USA
4 Department of Computer Science and Engineering, National Institute of Technology Srinagar, Jammu & Kashmir 190006, India
5 School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland 1010, New Zealand
6 Facultad de Informática, Departamento Arquitectura de Computadores y Automática, Universidad Complutense de Madrid, Calle Profesore Santesmases 9, 28040 Madrid, Spain
7 Director, Institut Supérieur de Logistique Industrielle, KEDGE Business School, 680 Cours de la Libération, 33405 Talence, France
8 Biosystems Engineering, Department of Environmental Engineering and Earth Sciences, Clemson University, Clemson, SC 29631, USA
9 Former Head of the Unit, Knowledge Sharing, European Commission (EU) Directorate General for Communications Networks, Content and Technology (DG CONNECT); Former Head of the Unit Networked Enterprise & Radio Frequency Identification (RFID), European Commission; Former Chair of the Internet of Things (IoT) Expert Group, European Commission (EU); INTEROP-VLab, Bureau Nouvelle Région Aquitaine Europe, 21 rue Montoyer, 1000 Brussels, Belgium
10 Vice Chancellor for Research, University of California, Irvine and Distinguished Professor of Electrical Engineering and Computer Science, University of California, Irvine, California 92697
11 Department of Agricultural Sciences, Clemson University, Clemson, SC 29634, USA

Opinions expressed in this essay (chapter) are due to the corresponding author and may not reflect the views of the institutions with which the author is affiliated. Listed coauthors are not responsible and may not endorse any/all comments and criticisms.


2.1 Context

Since 1999, the concept of the Internet of Things (IoT) has been nurtured as a marketing term [2] which may have succinctly captured the idea of data about objects stored on the Internet [3] in the networked physical world. The idea evolved while transforming the use of radio frequency identification (RFID) where an alphanumeric unique identifier (64‐bit EPC [4] or electronic product code) was stored on the chip (tag [5]) but the voluminous raw data were stored on the Internet, yet inextricably and uniquely linked via the EPC, in a manner resembling the structure of internet protocols [6] (32‐bit IPv4 and 128‐bit IPv6 [7]). IoT and, later, cloud of data [8] were metaphors for ubiquitous connectivity and concepts originating from ubiquitous computing, a term introduced by Mark Weiser [9] in 1988. The underlying importance of data from connected objects and processes usurped the term big data [10] and then twisted the sound bites to create the artificial myth of “Big Data” sponsored and accelerated by consulting companies. The global drive to get ahead of the “Big Data” tsunami flooded both businesses and governments, big and small. The chatter about big data garnished with dollops of fake AI became parlor talk among fish mongers [11] and gold miners, inviting the sardonicism of doublespeak, which is peppered throughout this essay.

Much to the chagrin of the thinkers, the laissez‐faire approach to IoT percolated by the tinkerers overshadowed hard facts. The “quick & dirty” anti‐intellectual chaos adumbrated the artifact‐fueled exploding frenzy for new revenue from “IoT Practice” which spawned greed in the consulting [12] world. The cacophony of IoT in the market [13] is a result of that unstoppable transmutation of disingenuous tabloid fodder to veritable truth, catalyzed by pseudo‐science hacks, social gurus, and glib publicity campaigns to drum up draconian “dollar‐sign‐dangling” predictions [14] about “trillions of things connected to the internet” to feed mass hysteria, to bolster consumption. Few ventured to correct the facts and point out that connectivity without discovery is a diabolical tragedy of egregious errors. Even fewer recognized that the idea of IoT is not a point but an ecosystem, where collaboration adds value.

The corporate orchestration of the digital by design metaphor of IoT was warped solely to create demand for sales by falsely amplifying the lure of increasing performance, productivity, and profit, far beyond what digital transformation could deliver by embracing the rational principles of IoT (Figures 2.1–2.4).

Ubiquitous connectivity is associated with high cost of products (capex or capital expense) but extraction of “value” to generate return on investment (ROI) rests on the ability to implement SARA, a derivative of the PEAS paradigm (see Figures 2.7 and 2.8). SARA – Sense, Analyze, Respond, Actuate – is not a linear concept. Data and decisions necessary for SARA make the conceptual illustration more akin to The Sara Cycle, perhaps best illustrated by the analogy to the Krebs