
Research Article
Network Traffic Anomaly Detection Based on ML-ESN for Power Metering System

S. T. Zhang,1 X. B. Lin,1 L. Wu,1 Y. Q. Song,2 N. D. Liao,2 and Z. H. Liang3

1CSG Power, Dispatching Control Center, Guangzhou 510663, China
2Changsha University of Science and Technology, Changsha 410114, China
3CSG Power, Digital Grid Research Institute, Guangzhou 510623, China

Correspondence should be addressed to Y. Q. Song; acl158474361@stu.csust.edu.cn

Received 25 February 2020; Revised 20 June 2020; Accepted 2 July 2020; Published 14 August 2020

Academic Editor: Ivo Petras

Copyright © 2020 S. T. Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Hindawi, Mathematical Problems in Engineering, Volume 2020, Article ID 7219659, 21 pages. https://doi.org/10.1155/2020/7219659

Due to the diversity and complexity of power network system platforms, some traditional network traffic detection methods work well for small sample datasets. However, network data detection for complex power metering system platforms suffers from low accuracy and high false-positive rates. In this paper, through a combination of exploration and feedback, a solution for power network traffic anomaly detection based on a multilayer echo state network (ML-ESN) is proposed. This method first relies on the Pearson and Gini coefficient methods to calculate the statistical distribution and correlation of network flow characteristics and then uses the ML-ESN method to classify abnormal network attacks. Because the ML-ESN method abandons the backpropagation mechanism, the problem of the model's nonlinear fitting ability is addressed. In order to verify the effectiveness of the proposed method, a simulation test was conducted on the UNSW_NB15 network security dataset.

The test results show that the average accuracy of this method is more than 97%, which is significantly better than a single-layer echo state network, a shallow BP neural network, and some traditional machine learning methods.

1. Introduction

At present, the traditional power grid is developing towards the smart grid. Due to the need to improve efficiency, flexibility, and reliability and to reduce losses, power advanced metering infrastructure (AMI) has been rapidly developed. The system integrates smart meters, communication networks, data centers, and software systems [1].

Various application servers are mainly responsible for data collection, business application operations, and system maintenance. Large-scale measurement terminals need to access the measurement automation master station through a virtual private network. These communication processes are very vulnerable to attacks [2]. Therefore, the safe operation of power metering systems must rely on reliable communication networks and security protection, detection, and analysis technologies.

Network security experts have discovered that AMI, as an important infrastructure in modern society, is one of the important targets of cyberattacks launched by hostile organizations. The main attack methods against power networks include malicious attacks, denial of service attacks, data spoofing, and network monitoring [3].

Due to the key information exchanged in AMI communication, AMI needs reliable protection to prevent unauthorized access and malicious attacks. Therefore, when migrating to AMI facilities, we must use security mechanisms and intrusion detection technology [3].

At present, intrusion detection methods are divided into host-based intrusion detection and network-based intrusion detection. Host-based intrusion detection mainly addresses the collection, forensics, and auditing of host intrusion traces; network-based intrusion detection is mainly used to analyze network flows and judge network attack behavior in real time.

Among them, researchers at home and abroad have applied network intrusion detection technology to anomaly detection of AMI network flows and proposed a variety of anomaly detection and analysis models, such as deep neural networks [1], Markov models [4], density statistics [5], BP neural networks [6], attack-graph-based information fusion [7], and principal component analysis [8].

In [5], Fathnia and Javidi tried to use OPTICS density-based technology to immediately diagnose AMI anomalies in customer information and intelligent data. In order to improve the efficiency of the method, they used LOF indexing technology. This technology actually detects factors related to data anomalies and judges abnormal behavior based on factor scores.

In [7], an AMI intrusion detection system (AMIDS) was proposed. This system uses information fusion technology to combine sensors and consumption data in smart meters to more accurately detect energy theft.

From most existing research, we find that there are many studies on detecting AMI electricity-theft anomalies but few on detecting AMI network traffic attack anomalies.

At present, there are still some problems in the existing research on AMI network traffic anomaly detection; for example, the attack rules in [5] must be updated regularly for the dynamic AMI network environment. In [6], the authors established a BP neural network training model based on six kinds of simple AMI data and carried out simulation tests in Matlab software. However, that model is still a long way from real engineering applications.

In this study, unlike previous AMI anomaly detection work, we focus on anomalies in AMI platform network flows. By continuously extracting AMI network traffic characteristics, such as protocol type, average packet size, maximum and minimum packet size, packet duration, and other flow-based characteristics, the type of attack anomaly encountered by the AMI platform can be accurately analyzed.

We make the following contributions to AMI network attack anomaly detection by using deep learning methods based on stream feature extraction and multilayer echo state networks:

(1) This paper proposes a deep learning method for AMI network attack anomaly detection based on multilayer echo state networks.

(2) By extracting the statistical features of the collected network data streams, the importance and correlation of these statistical features are found, the data input of the deep learning model is optimized, the model training effect is improved, and the model training time is greatly reduced.

(3) In order to verify the validity and accuracy of the method, we tested it on the UNSW_NB15 public benchmark dataset. Experimental results show that our method can detect AMI anomalous attacks and is superior to other methods.

The rest of the paper is organized as follows. Section 2 describes related research. Section 3 introduces the AMI network architecture and security issues. Section 4 proposes security solutions. Section 5 focuses on the application of the ML-ESN classification method in AMI. Section 6 completes experiments and comparisons. Finally, this paper summarizes the research work and puts forward some problems that need to be solved in the future.

2. Related Work

The smart grid combines computer and network communication technology with physical facilities to form a complex system, which is essentially a huge cyber-physical system (CPS) [9].

AMI is regarded as one of the most basic implementation technologies of the smart grid, but so far a large number of potential vulnerabilities have been discovered. For example, in the AMI network, smart meters, smart data collectors, and data processing centers have their own storage spaces, and these spaces store a lot of information. However, this information can easily be tampered with through the placement of malware.

In order to solve the security problems of the AMI system, the AMI Network Engineering Task Force (AMI-SEC) [10] pointed out that intrusion detection systems or related technologies can better monitor the AMI network and analyze and discover different attacks through technical means.

At present, domestic and foreign scholars have conducted a lot of research on AMI security, mainly focusing on power fraud detection, malicious code detection, and network attack detection [11].

2.1. Power Fraud Detection. In terms of power spoofing, attacks are generally divided into two cases according to the consequences of the attack.

One is to inject false data into the power grid to launch an attack that causes the power grid to oscillate; once successful, it will have a large-scale impact on the power grid and its users. The second is for attackers to obtain direct economic benefits by stealing electricity.

Jokar et al. in [12] present a new energy theft detector based on consumption patterns. The detector uses the predictability of normal and malicious consumption patterns of users and distribution transformer electricity meters to shortlist areas with a high probability of power theft, and it identifies suspicious customers by monitoring abnormal conditions in consumption patterns.

The authors in [13] proposed a semisupervised anomaly detection framework to solve the problem of energy theft in the public utility database that leads to changes in user usage patterns. Compared with other methods (such as one-class SVM and autoencoders), the framework can control the detection intensity through a detection index threshold.
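For context, the one-class SVM baseline mentioned above can be sketched with scikit-learn. The synthetic consumption readings and the `nu`/`gamma` settings below are illustrative assumptions, not the configuration used in [13]:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(7)

# Train only on normal consumption-like readings (semisupervised
# setting: no labelled attacks are needed at training time).
normal_usage = rng.normal(loc=10.0, scale=1.0, size=(300, 1))
detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal_usage)

# Score unseen readings: +1 = consistent with normal usage, -1 = anomaly.
test_points = np.array([[10.2], [9.5], [2.0]])  # last one mimics theft
verdict = detector.predict(test_points)
```

The one-class formulation learns a boundary around normal behavior only, which is why it serves as a natural comparison point for semisupervised theft detection.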

2.2. Malicious Code Detection. Since the smart meter transmits power consumption information to the grid terminal, the detection of malicious code can be extended to the detection of executable code. Once it is confirmed that the data uploaded by the meter contain executable code, the data are likely to be malicious code [14].


In order to achieve rapid detection of AMI malicious code attacks, the authors in [15] proposed a secure and privacy-protected aggregation scheme based on additive homomorphic encryption and proxy reencryption operations in the Paillier cryptosystem.

In [16], Euijin et al. used a disassembler and statistical analysis to detect AMI malicious code. The method first looks for the characteristics of each data type, uses a disassembler to study the distribution of instructions in the data, and performs statistical analysis on the data payload to determine whether it is malicious code.

2.3. Network Attack Detection. Extensive statistics show that the main point of attack for hackers targeting the AMI network is the smart meter (SM).

The SM is the key equipment constituting the AMI network. It realizes two-way communication between the power company and the user: on the one hand, the user's consumption data are collected and transmitted to the power company through the AMI network; on the other, the company's electricity prices and instructions are presented to users.

The intrusion detection mechanism is an important part of current smart meter security protection. It monitors the events that occur in the smart meter and analyzes them. Once an attack occurs or a potential security threat is discovered, the intrusion detection mechanism issues an alarm so that the system and its managers can adopt corresponding response mechanisms.

The current research on AMI network security threats mainly analyzes whether there are abnormalities from the perspective of network security, especially data and network security modeling for smart meter security. The main reason is that physical attacks against AMI, while often the strongest and most effective, are easier to detect.

The existing AMI network attack detection methods mainly include simulation methods [17, 18], k-means clustering [1, 19, 20], data mining [21–23], prequential evaluation [24], and PCA [25].

In [17], the authors investigated the puppet attack mechanism, compared it with other attack types, and evaluated the impact of puppet attacks on AMI through simulation experiments.

In [18], the authors also used the simulation tool NeSSi to study the impact of large-scale DDoS attacks on the information and communication infrastructure of the smart grid AMI network.

In order to more accurately analyze AMI network anomalies, some researchers start with AMI network traffic and use machine learning methods to determine whether various anomalous attacks have occurred on the network.

In [20], the authors use distributed intrusion detection and sliding window methods to monitor the data flow of AMI components and propose a real-time unsupervised AMI data flow mining detection system (DIDS). The system mainly uses the mini-batch k-means algorithm to cluster network flows by type in order to discover abnormal attack types.
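The clustering step just described can be sketched with scikit-learn's MiniBatchKMeans. The synthetic two-feature flow matrix and the cluster count below are illustrative assumptions, not the actual configuration of DIDS [20]:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

# Synthetic flow features: [mean packet size, flow duration] for two
# behaviour types (placeholder for real AMI flow statistics).
normal = rng.normal(loc=[500.0, 1.0], scale=[50.0, 0.2], size=(200, 2))
attack = rng.normal(loc=[60.0, 30.0], scale=[10.0, 5.0], size=(40, 2))
flows = np.vstack([normal, attack])

# Mini-batch k-means clusters flows incrementally, which suits a
# sliding-window, streaming setting.
model = MiniBatchKMeans(n_clusters=2, batch_size=64, n_init=10,
                        random_state=0)
labels = model.fit_predict(flows)

# A small cluster far from the dominant one is flagged as suspicious.
counts = np.bincount(labels)
suspect_cluster = int(np.argmin(counts))
```

The mini-batch variant updates centroids from small random batches, which is what makes it practical for continuous network-flow streams.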

In [22], the authors use an artificial immune system to detect AMI network attacks. This method first uses the Pcap network packets obtained by the AMI detection equipment and then classifies the attack types through artificial immune methods.

With the increase of AMI traffic feature dimensions and noisy data, traffic anomaly detection methods based on traditional machine learning face the problems of low accuracy and poor robustness of traffic feature extraction, which reduces the performance of traffic attack detection to a certain extent. Therefore, anomaly detection methods based on deep learning have become a hot topic in current network security research [26–34].

Wang et al. [27] proposed a technique that uses deep learning for malicious traffic detection. This technology is mainly divided into two implementation steps: one is to use a CNN (convolutional neural network) to learn the spatial characteristics of traffic, and the other is to extract data packets from the data stream and learn spatiotemporal characteristics through CNN and RNN (recurrent neural network).

Currently, there are three main methods of anomaly detection based on deep learning:

(1) Anomaly detection methods based on deep Boltzmann machines [28]: this kind of method can extract essential features by learning from high-dimensional traffic data, so as to improve the detection rate of traffic attacks. However, this type of method has poor robustness in extracting features; when the input data contain noise, its attack detection performance deteriorates.

(2) Anomaly detection methods based on stacked autoencoders (SAE) [29]: this type of method can learn and extract traffic data layer by layer. However, the robustness of the extracted features is poor; when the measured data are corrupted, the detection accuracy of this method decreases.

(3) Anomaly detection methods based on CNN [27, 30]: the traffic features extracted by this type of method have strong robustness, and the attack detection performance is high, but the network traffic needs to be converted into an image first, which increases the data processing burden, and the influence of network structure information on the accuracy of feature extraction is not fully considered.

In recent years, the achievements of deep learning in the field of time series prediction have also received more and more attention. When a task needs to process sequence information, RNN offers advantages in time series processing compared to the single-input processing of fully connected neural networks and CNNs.

As a new type of RNN, the echo state network (ESN) is composed of an input layer, a hidden layer (i.e., the reserve pool, or reservoir), and an output layer. One of the advantages of the ESN is that the entire network only needs to train the Wout layer, so its training process is very fast. In addition, the ESN has a clear advantage in the processing and prediction of one-dimensional time series [32].
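A minimal single-layer ESN illustrating this property can be sketched in NumPy. The reservoir size, spectral radius, ridge penalty, and toy sine-wave task below are illustrative choices, not the parameters used in this paper:

```python
import numpy as np

rng = np.random.default_rng(42)

n_in, n_res = 1, 100  # one-dimensional input, reservoir size
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
# Scale the spectral radius below 1 for the echo state property.
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()

def run_reservoir(u):
    """Collect reservoir states for an input sequence u of shape [T, n_in]."""
    x = np.zeros(n_res)
    states = []
    for t in range(len(u)):
        x = np.tanh(W_in @ u[t] + W @ x)  # fixed, untrained dynamics
        states.append(x.copy())
    return np.array(states)

# Toy one-step-ahead prediction task on a sine wave.
seq = np.sin(np.linspace(0, 20 * np.pi, 1000))[:, None]
X, y = run_reservoir(seq[:-1]), seq[1:]

# Only the readout W_out is trained, via ridge regression; this is
# why ESN training is fast compared with backpropagation.
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
pred = X @ W_out
```

Note that `W_in` and `W` are generated randomly and never updated; all learning happens in the single linear solve for `W_out`.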

Because the ESN has such advantages, it is also used by more and more researchers to analyze and predict network attacks [33, 34].

Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested the method on the KDD standard dataset, and found that it has faster convergence and better performance for IDS.

At present, some researchers have found through experiments that there are still some problems with the single-layer echo state network: (1) model training can only adjust the output weights; (2) the randomly generated reserve pool has nothing to do with the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reserve pool is high. Therefore, applying echo state networks to AMI network traffic anomaly detection requires improvement and optimization.

From the previous review, we can find that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, cluster-based, and information-theoretic (entropy-based). In addition, different deep learning methods are constantly being tried and applied.

The above methods have different advantages and disadvantages for different research objects and purposes. This article focuses on making full use of the advantages of the ESN method and trying to solve the problem that a single-layer ESN network cannot be directly applied to complex AMI network traffic detection.

3. AMI Network Architecture and Security Issues

The AMI network is generally divided into three network layers from the bottom up: home area network (HAN), neighborhood area network (NAN), and wide area network (WAN). The hierarchical structure is shown in Figure 1.

In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in the home of a grid user, and its gateway is a smart meter. The neighborhood network is formed by multiple home networks through communication interconnection between smart meters or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, and aggregate data to the FAN's area data concentrator. Many NANs and FANs are interconnected to form a WAN through switches or routers to achieve communication with the power company's data and control centers.

The reliable deployment and safe operation of the AMI network is the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only information and network security but also the security of physical equipment and human safety [35].

As Fadwa and Zeyar [20] mention, AMI faces various security threats, such as privacy disclosure, monetary gain, energy theft, and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.

Researchers generally believe that AMI security detection, defense, and control mainly rely on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after an attack.

4. Proposed Security Solution

At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices have provided certain areas with security detection and defense capabilities, but they bring some problems: (1) the devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.

To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect AMI network traffic in real time; second, have each traffic probe upload a traffic file in a unified standard format to the control center; and finally, analyze network flow anomalies in real time to improve the security detection and identification capabilities of AMI.

As shown in Figure 2, we deploy traffic probes on some important network nodes to collect real-time network flow information from all nodes.

Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In this case, processing can also be done per device and per area. For example, when collecting data from different devices, perform preprocessing such as data cleaning, data filtering, and data completion before analysis; then use the Pearson and Gini coefficient methods mentioned in this article to find important feature correlations; it is also feasible to then use the ML-ESN algorithm to classify abnormal network attacks.
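The Pearson and Gini coefficient screening mentioned above can be sketched as follows. The synthetic features, the thresholds, and the use of a random forest's impurity-based (Gini) importance are illustrative assumptions about the procedure, not the paper's exact implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500

# Synthetic flow features: only f0 (and its copy f2) relate to the label.
label = rng.integers(0, 2, size=n)
f0 = label * 2.0 + rng.normal(0, 0.5, size=n)  # correlated with attacks
f1 = rng.normal(0, 1, size=n)                  # pure noise
f2 = f0 * 0.8 + rng.normal(0, 0.1, size=n)     # redundant with f0
X = np.column_stack([f0, f1, f2])

# Pearson correlation of each feature with the label (linear relevance).
pearson = np.array([abs(np.corrcoef(X[:, j], label)[0, 1])
                    for j in range(X.shape[1])])

# Gini-based importance from a random forest (nonlinear relevance).
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, label)
gini = forest.feature_importances_

# Keep features scoring well on either criterion (thresholds are arbitrary).
keep = np.where((pearson > 0.3) | (gini > 0.1))[0]
```

Combining a linear measure (Pearson) with an impurity-based one (Gini) catches both linearly correlated and nonlinearly informative features before they are fed to the classifier.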

The main reasons for adopting standardized processing are as follows:

(1) Improve the centralized processing and visual display of network flow information.

(2) Partly overcome the problem of inadequate information collection caused by having a single device or too few devices.

(3) Use multiple devices to collect information and standardize the process to improve information fusion, so as to enhance the accuracy and robustness of classification.

Other power companies that have not performed centralized and standardized processing can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions.

The goal is the same as in this article: to quickly find abnormal network attacks in a large amount of network flow data.

4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.

Metadata are composed of strings; each information element occupies a fixed position in the string, the strings are separated by "^", and the last string is also terminated by "^". In addition, a missing information element in the metadata is represented as follows: if an information element defined below exists in the metadata but its position does not need to be filled in, two "^" characters are adjacent at that position. If an extracted information element contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.

The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^2019-07-29T03:08:23.969^^^TCP^^^10107110^1010721241^^^".

Part of the above probe flow is explained as follows, according to the metadata standard definition: (1) 6 is the metadata version;

Figure 2: Simple deployment diagram of traffic probes (data processing center, firewall, data concentrators, flow probes, smart electric meters, and electricity users).

Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT translated source IP
20  NatSrcPort        String    1024    NAT translated source port
21  NatDestIP         IPstring  1024    NAT translated destination IP
22  NatDestPort       String    1024    NAT translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol

Figure 1: AMI network layered architecture [35]. (Layer technologies shown: HAN: ZigBee, Bluetooth, RFID, PLC; NAN: mesh network, Wi-Fi, WiMAX, PLC; WAN: fiber optic, WiMAX, satellite, BPL; with smart meters, repeaters, data concentrators, smart home applications, energy storage, PHEV/PEV, and the utility centre.)


(2) 69085d3e5432360300000000 is the metadata ID; (3) 10107110 is the source IP; (4) 1010721241 is the destination IP; (5) 19341 is the source port; (6) 22 is the destination port; and (7) 6 is the protocol (TCP).
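The field layout just described can be parsed by splitting on the "^" delimiter, where adjacent carets mark an empty information element. This is a sketch based on the first record shown in Figure 3; the field names are hypothetical labels taken from the explanation above:

```python
# Parse one probe stream record (from Figure 3) on the '^' delimiter;
# adjacent '^^' pairs denote empty (absent) information elements.
record = ("6^69085d3e5432360300000000^10107110^1010721241"
          "^19341^22^6^40^1^40^1^1564365874^1564365874^^^^")

fields = record.split("^")

# Positions per the explanation in the text; the remaining positions
# vary by record and may be empty.
parsed = {
    "version":     fields[0],
    "metadata_id": fields[1],
    "src_ip":      fields[2],
    "dst_ip":      fields[3],
    "src_port":    fields[4],
    "dst_port":    fields[5],
    "protocol":    fields[6],  # 6 = TCP
}
empty_fields = sum(1 for f in fields if f == "")
```

A real parser would also have to unescape carets inside element values, per the escape rule described earlier.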

4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of elements, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of a single metadata element could not be directly reflected, and second, the dimensionality of the analysis data would be particularly high, resulting in particularly long calculation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.

In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.

The framework mainly includes three processing stages, and the three steps are as follows:

Step 1: collect network flow metadata information in real time through network probe flow collection devices deployed in different areas.
Step 2: first, compute statistics over the collected network flow metadata, by time series or by segment, to obtain the statistical characteristics of each part of the network flow. Second, standardize the statistically obtained characteristic values according to certain data standardization guidelines. Finally, further filter the standardized features in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies.
Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction; part of the data is used as training data and part as test data. Cross-validation is performed on the two types of data to check the correctness and performance of the proposed model.

4.3. Feature Extraction. Generally speaking, to realize the classification and identification of network traffic, it is necessary to capture the traffic and statistical behavior characteristics that best reflect different network attack behaviors.

A network flow [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related data in the set.

As is known, some simple characteristics can be extracted directly from network traffic, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.

In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.

First, network packets are aggregated into network flows, that is, each network flow is distinguished according to the network behavior that generated it. Second, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.
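The 5-tuple aggregation step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the packet field names (src_ip, dst_ip, src_port, dst_port, proto) are hypothetical, and the first observed packet of a flow is assumed to fix its forward direction.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group raw packets into bidirectional flows keyed by the 5-tuple.

    Each packet is a dict with (hypothetical) keys: src_ip, dst_ip,
    src_port, dst_port, proto. The endpoint that sent the first packet
    of a flow defines the forward ("fwd") direction; replies from the
    other endpoint are stored as backward ("bwd") packets.
    """
    flows = defaultdict(lambda: {"fwd": [], "bwd": []})
    for p in packets:
        fwd = (p["src_ip"], p["dst_ip"], p["src_port"], p["dst_port"], p["proto"])
        bwd = (fwd[1], fwd[0], fwd[3], fwd[2], fwd[4])
        if bwd in flows:            # reverse direction of an existing flow
            flows[bwd]["bwd"].append(p)
        else:                       # first packet fixes the forward direction
            flows[fwd]["fwd"].append(p)
    return dict(flows)
```

Per-flow statistics (packet counts, byte totals, inter-arrival intervals) can then be computed separately over the "fwd" and "bwd" lists of each flow.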

In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward-to-backward packet ratio.

Statistical characteristics of time: duration, and the maximum, minimum, average, and standard deviation of forward and backward packet intervals.

In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:

Figure 3: Part of the real probe stream data.

6 Mathematical Problems in Engineering

Time interval: maximum, minimum, average interval time, and standard deviation.

Packet size: maximum, minimum, average size, and packet distribution.

Number of data packets: out and in.

Data amount: input byte amount and output byte amount.

Stream duration: duration from start to end.

Some of the main features of network traffic extracted in this paper are shown in Table 2.

4.4. Feature Standardization. Because the various attributes of the power probe stream contain values of different data types, and the differences between these values are relatively large, the data cannot be used directly for analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced data elimination.

At present, the main feature standardization methods are [38]: Z-score, min-max, and decimal scaling.

Because there may be some nondigital data in the standard protocol, such as protocol, IP, and TCP flags, these data cannot be processed directly by standardization, so nondigital data need to be converted to digital form; for example, the character "dhcp" is changed to the value "1".
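A minimal sketch of this categorical-to-numeric conversion follows. The function name and the policy of assigning codes in order of first appearance are illustrative assumptions, not the paper's exact mapping:

```python
def encode_categoricals(rows, cat_cols):
    """Map string-valued columns (e.g. proto, service, TCP flags) to integers.

    A fresh integer code is assigned the first time each distinct value is
    seen in a column, so the first value encountered becomes 0, the next 1,
    and so on. Rows are modified in place; the codebooks are returned so
    the same mapping can be reused on test data.
    """
    codebooks = {c: {} for c in cat_cols}
    for row in rows:
        for c in cat_cols:
            book = codebooks[c]
            row[c] = book.setdefault(row[c], len(book))
    return codebooks
```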

In this paper, Z-score is selected as the standardization method, based on the characteristics of uneven data distribution and differing value ranges in the power probe stream. Z-score normalization is shown in the following formula:

x′ = (x − x̄)/δ,  (1)

where x̄ is the mean value of the original data and δ is the standard deviation of the original data, δ = sqrt(((x1 − x̄)² + (x2 − x̄)² + ⋯ + (xn − x̄)²)/n), with n the number of samples per feature.
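Formula (1) can be applied per feature column as in the following sketch (plain Python for illustration; the zero-variance guard for constant features is an added assumption, not part of the paper's formula):

```python
import math

def z_score(values):
    """Standardize one feature column: x' = (x - mean) / std, formula (1)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:                     # constant feature: map everything to zero
        return [0.0] * n
    return [(v - mean) / std for v in values]
```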

4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is the currently popular feature filtering method: it regards features as independent objects, evaluates the importance of features according to quality metrics, and selects important features that meet the requirements.

At present, there are many data correlation methods. The more commonly used methods are chart correlation analysis (line chart and scatter chart), covariance and covariance matrix, correlation coefficient, unary and multiple regression, and information entropy and mutual information.

Because the power probe flow contains many statistical characteristics, and the main characteristics of different types of attacks differ, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain, in order to quickly locate the important characteristics of different attacks.

The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its calculation is efficient and simple and is therefore more suitable for real-time processing of large-scale power probe streams.

The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y), and its calculation ρxy is shown in the following formula:

ρxy = cov(x, y)/(σxσy) = E[(x − μx)(y − μy)]/(σxσy),  (2)

where cov(x, y) is the covariance of x and y, σx is the standard deviation of x, and σy is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, usually expressed as r:

r = Σⁿᵢ₌₁ (xi − x̄)(yi − ȳ) / sqrt(Σⁿᵢ₌₁ (xi − x̄)² · Σⁿᵢ₌₁ (yi − ȳ)²),  (3)

where n is the number of samples, xi and yi are the observations at point i corresponding to variables x and y, x̄ is the sample mean of x, and ȳ is the sample mean of y. The value of r is between −1 and 1. When the value is 1, it indicates a completely positive correlation between the two random variables; when

Figure 4: Proposed AMI network traffic detection framework (probe traffic collection → feature extraction and statistical flow characteristics → feature standardization → feature filtering → construction of the multilayer echo state network → classification, verification, and performance evaluation).

Table 2: Some of the main features.

ID   Name             Description
1    SrcIP            Source IP address
2    SrcPort          Source IP port
3    DestIP           Destination IP address
4    DestPort         Destination IP port
5    Proto            Network protocol, mainly TCP, UDP, and ICMP
6    total_fpackets   Total number of forward packets
7    total_fvolume    Total size of forward packets
8    total_bpackets   Total number of backward packets
9    total_bvolume    Total size of backward packets
...  ...              ...
29   max_biat         Maximum backward packet arrival interval
30   std_biat         Time interval standard deviation of backward packets
31   duration         Network flow duration


the value is −1, it indicates a completely negative correlation between the two random variables; when the value is 0, it indicates that the two random variables are linearly independent.

Because the Pearson method can only detect the linear relationship between features and classification categories, the nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.

In the classification problem, assuming that there are K classes and that the probability that a sample point belongs to class i is Pi, the Gini index of the probability distribution is defined as follows [39]:

Gini(P) = Σᴷᵢ₌₁ Pi(1 − Pi) = 1 − Σᴷᵢ₌₁ Pi².  (4)

Given the sample set D, the Gini coefficient is expressed as follows:

Gini(D) = 1 − Σᴷₖ₌₁ (|Ck|/|D|)²,  (5)

where Ck is the subset of samples in D belonging to the kth class and K is the number of classes.
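Formulas (3) and (4) can be computed as in the following sketch (an illustrative implementation, not the paper's code):

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient, formula (3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def gini_index(labels):
    """Gini index of a class-label distribution, formula (4)."""
    n = len(labels)
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())
```

A feature whose |r| against an already-kept feature exceeds a threshold (the paper uses 0.5) would be discarded as redundant, while a lower Gini index marks a purer, more discriminative feature.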

5. ML-ESN Classification Method

The ESN is a new type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has 3 layers: input layer, hidden layer (reservoir), and output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

U(t) = [u1(t), u2(t), …, uK(t)]ᵀ,
x(t) = [x1(t), x2(t), …, xN(t)]ᵀ,
y(t) = [y1(t), y2(t), …, yL(t)]ᵀ.  (6)

Win (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t − 1) to x(t); Wout (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and Wback (N × L) represents the connection weights from y(t − 1) to x(t), this last matrix being optional.

When u(t) is input, the updated state equation of the reservoir is given by

x(t + 1) = f(Win·u(t + 1) + W·x(t)),  (7)

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

y(t + 1) = f′(Wout·[u(t + 1); x(t + 1)]).  (8)

Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome these existing problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]:

x1(n + 1) = f(Win·u(n + 1) + W1·x1(n)),
xk(n + 1) = f(Winter(k−1)·x(k−1)(n + 1) + Wk·xk(n)),
xM(n + 1) = f(Winter(M−1)·x(M−1)(n + 1) + WM·xM(n)).  (9)

The ML-ESN output is calculated from the result of formula (9):

y(n + 1) = fout(Wout·xM(n + 1)).  (10)
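The state updates of equation (9) can be sketched in NumPy as follows. This is an illustrative implementation, not the paper's code: the spectral-radius rescaling helper reflects the scaling step described in Algorithm 1, and all matrix shapes and the choice of tanh are assumptions.

```python
import numpy as np

def scale_spectral(w, alpha):
    """Rescale a weight matrix so that its spectral radius equals alpha."""
    radius = max(abs(np.linalg.eigvals(w)))
    return alpha * w / radius

def ml_esn_states(u, w_in, w_res, w_inter, f=np.tanh):
    """Run the layer-by-layer state updates of equation (9).

    u: (T, K) input sequence; w_in: (N, K) input weights;
    w_res: list of M (N, N) internal reservoir matrices W1..WM;
    w_inter: list of M-1 (N, N) inter-reservoir matrices Winter.
    Returns the final reservoir's states, shape (T, N).
    """
    T, N, M = u.shape[0], w_in.shape[0], len(w_res)
    x = [np.zeros(N) for _ in range(M)]      # x_k(0) = 0 for all layers
    states = np.zeros((T, N))
    for t in range(T):
        # first reservoir is driven by the input, equation (7)
        x[0] = f(w_in @ u[t] + w_res[0] @ x[0])
        # each deeper reservoir is driven by the previous layer's new state
        for k in range(1, M):
            x[k] = f(w_inter[k - 1] @ x[k - 1] + w_res[k] @ x[k])
        states[t] = x[M - 1]
    return states
```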

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time does not change much. However, when the network system is attacked, the statistical characteristic entropy value becomes abnormal within a certain time range, and even large fluctuations can occur.

Figure 5: ESN basic model (input layer U(t) connected to the reservoir x(t) via Win; internal reservoir weights W; reservoir connected to the output layer y(t) via Wout; optional feedback Wback).


It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data, map the input feature value set from the low-dimensional input space to the high-dimensional state space, and finally train the network using linear regression or similar methods on that high-dimensional state space.

However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the ESN is not suitable for directly classifying AMI network traffic anomalies.

On the contrary, when the size of a single reservoir is small, the ML-ESN network model can satisfy the echo state property of the internal training network by adding multiple reservoirs, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The tests use multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deep structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and more accurately reflects the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of each feature value is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be used directly for processing. Before processing, the data are standardized; some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN)/(TP + TN + FP + FN),
FPR = FP/(FP + TN),
TPR = TP/(FN + TP),
precision = TP/(TP + FP),
recall = TP/(FN + TP),
F-score = (2 × precision × recall)/(precision + recall).  (11)

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.

TN (true negative): the number of normal network traffic flows successfully detected.

Figure 6: ML-ESN basic model (input layer U(t) connected via Win to reservoir 1; reservoirs 1 to M with internal weights W1 … WM, chained through the Winter weights; reservoir M connected via Wout to the output layer y(t)).


FP (false positive): the number of normal network traffic flows identified as abnormal.

FN (false negative): the number of abnormal network traffic flows identified as normal.
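The indicators of formula (11) can be computed directly from these four confusion counts, as in this illustrative sketch (it uses the standard definition FPR = FP/(FP + TN)):

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute the evaluation indicators of formula (11) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                      # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                   # also the TPR
    f_score = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "FPR": fpr, "precision": precision,
            "recall": recall, "F-score": f_score}
```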

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID   Type            Number of packets   Size (MB)
1    Normal          56000               3.632
2    Analysis        1560                0.108
3    Backdoors       1746                0.36
4    DoS             12264               2.42
5    Exploits        33393               8.31
6    Fuzzers         18184               4.62
7    Generic         40000               6.69
8    Reconnaissance  10491               2.42
9    Shellcode       1133                0.28
10   Worms           130                 0.044

(1) Input:
(2) D1: training dataset
(3) D2: test dataset
(4) U(t): input feature value set
(5) N: the number of neurons in each reservoir
(6) Ri: the number of reservoirs
(7) α: interconnection weight spectral radius
(8) Output:
(9) Training and testing classification results
(10) Steps:
(11) Step 1. Initially set the parameters of the ML-ESN and determine the corresponding number of input and output units according to the dataset:
(i) set the training data length trainLen;
(ii) set the test data length testLen;
(iii) set the number of reservoirs Ri;
(iv) set the number of neurons in each reservoir N;
(v) set the update speed of the reservoirs α;
(vi) set xi(0) = 0 (1 ≤ i ≤ M).
(12) Step 2. Initialize the input connection weight matrix Win, the internal connection weights of the reservoirs Wi (1 ≤ i ≤ M), and the weights of the external connections between reservoirs, Winter:
(i) randomly initialize the values of Win, Wi, and Winter;
(ii) through statistical normalization and spectral radius calculation, rescale Wi and Winter to meet the sparsity requirements; the calculation formulas are Wi = α(Wi/|λi|) and Winter = α(Winter/|λinter|), where λi and λinter are the spectral radii of the Wi and Winter matrices, respectively.
(13) Step 3. Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables:
(i) for t from 1 to T:
(a) calculate x1(t) according to equation (7);
(b) for i from 2 to M, calculate xi(t) according to equations (7) and (9);
(c) collect the matrix H = [x(t + 1); u(t + 1)].
(14) Step 4. Solve for the weight matrix Wout from the reservoir to the output layer to obtain the trained ML-ESN network structure: Wout = DHᵀ(HHᵀ + βI)⁻¹, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix, respectively.
(15) Step 5. Calculate the ML-ESN output according to formula (10), selecting the SoftMax activation function to compute the output value fout.
(16) Step 6. Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
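Step 4 of Algorithm 1, the ridge-regression readout, can be sketched as follows (illustrative NumPy; the matrix shapes are assumptions consistent with the formula, not taken from the paper's code):

```python
import numpy as np

def train_readout(H, D, beta=1e-6):
    """Ridge-regression readout of Algorithm 1, step 4:
    Wout = D H^T (H H^T + beta * I)^-1.

    H: (N, T) collected state matrix (one column per time step);
    D: (L, T) expected output matrix;
    beta: ridge regularization parameter.
    Returns Wout with shape (L, N).
    """
    n = H.shape[0]
    return D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(n))
```

With a small beta the readout recovers the least-squares solution; larger beta values trade fitting accuracy for robustness against ill-conditioned state matrices.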


Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, deduplication, completion, and normalization, to obtain normalized and standardized data. Standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.

As can be seen from Figure 8, after normalization most of the attack type data are concentrated between 0.4 and 0.6, but Generic attack data are concentrated between 0.7 and 0.9, and Normal data are concentrated between 0.1 and 0.3.

Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini indexes for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, while the correlation between spkts and ct_srv_src (number of connections containing the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the threshold for the Pearson correlation coefficient is initially set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing strong positive correlations. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, i.e., very small.

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of Gini coefficients, the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.

Based on the results of the Pearson and Gini coefficients for feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, ms), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment, and the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features.

Figure 7: Partial feature data after standardization (rows of standardized values for the features dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).

Figure 8: Normalized data distribution (box plots, from 0.0 to 1.0, for the data labels Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal).


For example, in the UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically: it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of the ML-ESN is that the reservoirs generate a complex dynamic space that changes with the input. When this state space is sufficiently complex, the internal states can be linearly combined to produce the required output. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 with a mean of 0, which is more conducive to improving training efficiency. Second, tanh gives a better detection effect when features differ significantly, and the neuron fitting process in the ML-ESN reservoirs continuously amplifies this feature effect.

The output layer uses the sigmoid activation function because the output value of sigmoid is between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important for tuning the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, mainly based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1.

The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45:1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (correlation heatmap, from 0.0 to 1.0, over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).


The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether on a small or a large data sample, the classification effect without filtering is lower than with filtering.

In addition, using a single filtering method is not as good as using the combination of the two. For example, with 160000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy is 0.95; when the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy reaches 0.99.

6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes for filtering and then applies the ML-ESN training algorithm.

Figure 10: The Gini values for UNSW_NB15 (heatmap, from 0.0 to 1.0, over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).

Table 4: The parameters of the ML-ESN experiment.

Parameters                   Values
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      Tanh
Output layer activation fn   Sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10⁻⁶


The trained model is then verified with the test data, and the detection results for the different types of attacks are obtained. The classification results for the nine types of abnormal attacks are shown in Figure 12.

From the detection results in Figure 12, it is completely feasible to use the ML-ESN network learning model, combined with Pearson and Gini coefficient feature filtering and optimization, to quickly classify anomalous network traffic attacks.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and the F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the depth of the model reservoir increases; for example, with 1000 neurons, a reservoir depth of 5 takes 211 ms, while a depth of 3 takes only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model at first increases gradually as the depth of the model reservoir increases; for example, with a reservoir depth of 3 and 1000 neurons, the detection accuracy is 0.96, while at depth 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy drops to 0.95.

The main reason for this phenomenon is that, at the beginning, with the increase of the training level, the training parameters of the model are gradually optimized, so the training accuracy also constantly improves. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time performance, the decision tree method takes the least time, only 0.0013 seconds, and the BP method takes the most time, 0.0024 seconds. In addition, in terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results reflect that the method proposed in this paper has good detection ability for different attack types after model self-learning.
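The depth-versus-time behavior above can be reproduced with a toy multilayer reservoir. This is only a sketch of an ML-ESN forward pass, not the paper's implementation; the leak rate, spectral radius, and weight ranges are assumptions:

```python
import time
import numpy as np

rng = np.random.default_rng(0)

def reservoir_states(u, depth=3, n_res=500, leak=0.3, rho=0.9):
    """Drive a stack of `depth` leaky reservoirs with input sequence u
    and return the top-layer states; only a readout would be trained."""
    layers = []
    for d in range(depth):
        fan_in = u.shape[1] if d == 0 else n_res
        w_in = rng.uniform(-0.5, 0.5, (n_res, fan_in))
        w = rng.uniform(-0.5, 0.5, (n_res, n_res))
        w *= rho / np.max(np.abs(np.linalg.eigvals(w)))  # scale spectral radius
        layers.append((w_in, w))
    x = [np.zeros(n_res) for _ in range(depth)]
    states = np.zeros((len(u), n_res))
    for t_step, ut in enumerate(u):
        inp = ut
        for d, (w_in, w) in enumerate(layers):
            x[d] = (1 - leak) * x[d] + leak * np.tanh(w_in @ inp + w @ x[d])
            inp = x[d]
        states[t_step] = x[-1]
    return states

u = rng.normal(size=(200, 8))  # 200 flow records with 8 features each
for depth in (2, 3, 4, 5):
    t0 = time.perf_counter()
    reservoir_states(u, depth=depth, n_res=200)
    print(f"depth {depth}: {time.perf_counter() - t0:.3f} s")
```

Each extra reservoir adds one matrix-vector product per time step, which is why deeper stacks cost proportionally more training time, as in Figure 13(a).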

Figure 11: Classification effect of different filtering methods (classification accuracy at different data sizes with no filtering, Pearson filtering, Gini filtering, and Pearson + Gini filtering).

14 Mathematical Problems in Engineering

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are mainly concentrated around 5.0; in particular, for feature A, the values hardly exceed 6.0. In addition, a small part of the values of feature B is concentrated between 5 and 10, and only a few exceed 10.
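The Pearson-plus-Gini filtering that produced these reduced feature sets is not spelled out in code in the paper; one plausible two-stage sketch, in which the thresholds and the use of a random forest to obtain Gini (impurity-based) importances are assumptions, is:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def select_features(X, y, corr_thresh=0.95, gini_thresh=0.01):
    """Two-stage filter: drop one of each highly Pearson-correlated
    pair, then keep features whose Gini importance exceeds a threshold."""
    corr = X.corr(method="pearson").abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    redundant = [c for c in upper.columns if (upper[c] > corr_thresh).any()]
    X = X.drop(columns=redundant)
    forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    return list(X.columns[forest.feature_importances_ > gini_thresh])

rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = pd.DataFrame({"a": a, "b": 2 * a, "c": rng.normal(size=200)})
y = (a > 0).astype(int)
feats = select_features(X, y)  # "b" duplicates "a" and is dropped
```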

Secondly, this paper focuses on comparing simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
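All four baselines named above are available in scikit-learn; a minimal comparison harness on synthetic multi-class data standing in for the UNSW_NB15 flows might look like:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the network flow features and attack labels
X, y = make_classification(n_samples=5000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

baselines = {
    "GaussianNB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=300, random_state=0),
}
accs = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
        for name, clf in baselines.items()}
for name, acc in accs.items():
    print(f"{name}: {acc:.3f}")
```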

This simulation experiment focuses on five test datasets of different scales, namely, 5,000, 20,000, 60,000, 120,000, and 160,000 records, and each dataset contains 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record dataset, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms dropped significantly; in particular, the GaussianNB algorithm has accuracy rates below 50%, while the other algorithms are close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and the accuracy of the algorithm improves rapidly. For example, on the 120,000-record dataset, the accuracy of the algorithm reached 96.75%, and on the 160,000-record dataset, the accuracy reached 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for the nine attack types: Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).

Figure 13: ML-ESN results at different reservoir depths: (a) detection time and (b) accuracy at depths 2 to 5 with 500, 1000, and 2000 neurons; (c) accuracy and time consumption of BP, DecisionTree, single-layer ESN, and ML-ESN.

Figure 14: Distribution map of the first two statistical characteristics.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate the experimental performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR (true-positive rate) on the vertical axis. Generally speaking, a ROC graph uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
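The per-class ROC curves used here follow the standard one-vs-rest construction; a small sketch in which the class names match the dataset but the scores are invented toy values:

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

classes = ["Generic", "DoS", "Worms"]
y_true = np.array(["Generic", "DoS", "Worms", "Generic", "DoS"])
y_score = np.array([[0.8, 0.1, 0.1],   # per-class scores, e.g. predict_proba
                    [0.2, 0.7, 0.1],
                    [0.1, 0.2, 0.7],
                    [0.6, 0.3, 0.1],
                    [0.3, 0.6, 0.1]])

# Binarize the labels, then score each class against the rest
y_bin = label_binarize(y_true, classes=classes)
aucs = {cls: roc_auc_score(y_bin[:, i], y_score[:, i])
        for i, cls in enumerate(classes)}
for cls, auc in aucs.items():
    print(f"{cls} ROC curve (area = {auc:.2f})")
```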

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.

From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the other attack types are 99%. However, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate of the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect, because its detection success rate is generally less than 80% and its false-positive rate is close to 35%.

Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes.

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).

Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).

7. Conclusion

This article first analyzes the current research situation of AMI network security at home and abroad, elicits some problems in AMI network security, and introduces the contributions of existing researchers in AMI network security.

Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-capacity network traffic data in the existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly solve the problem of extracting the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, there are still some issues that need attention and optimization in this paper, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and different regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not really established a security monitoring platform for information fusion.

Therefore, the authors of this article suggest that, before analyzing the network flow, it is best to perform a certain multicollection device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main points of the next stage of this work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find out the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model learning ability, for example, through parallel training, to greatly reduce the learning time and classification time; and (4) studying the special AMI network protocols and establishing an optimized ML-ESN network traffic deep learning model that is more in line with the actual application of AMI, so as to apply it to actual industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System (no. ZDKJXM20170002)" of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability (no. SJCX201970)" for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI Network Engineering Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, IEEE, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, IEEE, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, IEEE, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.


Researchers have proposed a variety of anomaly detection and analysis models, such as deep neural networks [1], Markov models [4], density statistics [5], BP neural networks [6], attack graph-based information fusion [7], and principal component analysis [8].

In [5], Fathnia and Javidi tried to use OPTICS density-based technology to immediately diagnose AMI anomalies in customer information and intelligent data. In order to improve the efficiency of the method, they used LOF indexing technology. This technology actually detects factors related to data anomalies and judges abnormal behavior based on factor scores.
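The density-based idea above (scoring points by how much sparser their neighborhood is than their neighbors') can be sketched with scikit-learn's LocalOutlierFactor; the data here are a synthetic stand-in for meter readings, not the dataset used in [5]:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # typical readings
anomalies = rng.uniform(low=6, high=8, size=(5, 2))     # injected outliers
X = np.vstack([normal, anomalies])

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)  # -1 marks low-density (anomalous) points
print((labels[-5:] == -1).sum(), "of 5 injected anomalies flagged")
```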

In [7], an AMI intrusion detection system (AMIDS) was proposed. This system uses information fusion technology to combine sensors and consumption data in smart meters to more accurately detect energy theft.

From most existing research, we find that there are many studies on the detection of AMI electricity-theft behavior anomalies but fewer studies on the detection of AMI network traffic anomaly attacks.

At present, there are still some problems in the existing research on AMI network traffic anomaly detection; for example, the attack rules in [5] for the AMI dynamic network environment must be updated regularly. In [6], the authors established a BP neural network training model based on 6 kinds of simple AMI data and carried out simulation tests in Matlab software. However, the model is still a long way from real engineering applications.

In this study, unlike previous AMI anomaly detection work, we focus on the abnormal situation of AMI platform network flows. By continuously extracting AMI network traffic characteristics, such as protocol type, average packet size, maximum and minimum packet size, packet duration, and other related flow-based characteristics, it is possible to accurately analyze the types of attack anomalies encountered by the AMI platform.

We make the following contributions to AMI network attack anomaly detection by using deep learning methods based on stream feature extraction and multilayer echo state networks:

(1) This paper proposes a deep learning method for AMI network attack anomaly detection based on multilayer echo state networks.

(2) By extracting the statistical features of the collected network data streams, the importance and correlation of the statistical features of the network streams are found, the data input of deep learning is optimized, the model training effect is improved, and the model training time is greatly reduced.

(3) In order to verify the validity and accuracy of the method, we tested it on the UNSW_NB15 public benchmark dataset. Experimental results show that our method can detect AMI anomalous attacks and is superior to other methods.

The rest of the paper is organized as follows: Section 2 describes related research; Section 3 introduces the AMI network architecture and security issues; Section 4 proposes security solutions; Section 5 focuses on the application of the ML-ESN classification method in AMI; Section 6 completes the experiments and comparisons; finally, this paper summarizes the research work and puts forward some problems that need to be solved in the future.

2. Related Work

The smart grid introduces computer and network communication technology and physical facilities to form a complex system, which is essentially a huge cyber-physical system (CPS) [9].

AMI is regarded as one of the most basic implementation technologies of the smart grid, but so far a large number of potential vulnerabilities have been discovered. For example, in the AMI network, smart meters, smart data collectors, and data processing centers have their own storage spaces, and these spaces store a lot of information. However, this information can easily be tampered with due to the placement of malware.

In order to solve the security problems of the AMI system, the AMI Network Engineering Task Force (AMI-SEC) [10] pointed out that intrusion detection systems or related technologies can better monitor the AMI network and analyze and discover different attacks through technical means.

At present, domestic and foreign scholars have conducted a lot of research on the security of AMI, mainly focusing on power fraud detection, malicious code detection, and network attack detection [11].

2.1. Power Fraud Detection. Power spoofing attacks are generally divided into two cases according to the consequences of the attack.

One is to inject wrong data into the power grid to launch an attack, which causes the power grid to oscillate; once successful, it will cause a large-scale impact on the power grid and users. The second is to enable attackers to obtain direct economic benefits by stealing electricity.

Jokar et al. in [12] present a new energy theft detector based on consumption patterns. The detector uses the predictability of the normal and malicious consumption patterns of users and distribution transformer electricity meters to shortlist areas with a high probability of power theft and identifies suspicious customers by monitoring abnormal conditions in consumption patterns.

The authors in [13] proposed a semisupervised anomaly detection framework to solve the problem of energy theft in public utility databases that leads to changes in user usage patterns. Compared with other methods (such as one-class SVM and autoencoders), the framework can control the detection intensity through a detection index threshold.

2.2. Malicious Code Detection. Since the smart meter transmits power consumption information to the grid terminal, the detection of malicious code can be extended to the detection of executable code. Once it is confirmed that the data uploaded by the meter contain executable code, the data are likely to be malicious code [14].


In order to achieve rapid detection of AMI malicious code attacks, the authors in [15] proposed a secure and privacy-protected aggregation scheme based on additive homomorphic encryption and proxy reencryption operations in the Paillier cryptosystem.

In [16], Euijin et al. used a disassembler and statistical analysis methods for AMI malicious code detection. The method first looks for the characteristics of each data type, uses a disassembler to study the distribution of instructions in the data, and performs statistical analysis on the data payload to determine whether it is malicious code.

2.3. Network Attack Detection. At present, a large number of statistical discoveries show that the main attack point for hackers against the AMI network is the smart meter (SM).

The SM is the key equipment that constitutes the AMI network. It realizes two-way communication between the power company and the user: on the one hand, the user's consumption data are collected and transmitted to the power company through the AMI network; on the other hand, the company's electricity prices and instructions are presented to users.

The intrusion detection mechanism is an important part of current smart meter security protection. It monitors the events that occur in the smart meter and analyzes them. Once an attack occurs or a potential security threat is discovered, the intrusion detection mechanism issues an alarm so that the system and its managers can adopt corresponding response mechanisms.

The current research on AMI network security threats mainly analyzes whether there are abnormalities from the perspective of network security, especially data and network security modeling for smart meter security. The main reason is that physical attacks against AMI are often strong and the most effective, but they are easier to detect.

The existing AMI network attack detection methods mainly include simulation methods [17, 18], k-means clustering [1, 19, 20], data mining [21–23], prequential evaluation [24], and PCA [25].

In [17], the authors investigated the puppet attack mechanism, compared it with other attack types, and evaluated the impact of the puppet attack on AMI through simulation experiments.

In [18], the authors also use the simulation tool NeSSi to study the impact of large-scale DDoS attacks on the information communication infrastructure of the smart grid AMI network.

In order to more accurately analyze AMI network anomalies, some researchers start with AMI network traffic and use machine learning methods to determine whether a variety of anomalous attacks have occurred on the network.

In [20], the authors use distributed intrusion detection and sliding window methods to monitor the data flow of AMI components and propose a real-time unsupervised AMI data flow mining detection system (DIDS). The system mainly uses the mini-batch k-means algorithm to perform type clustering on network flows to discover abnormal attack types.
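The mini-batch k-means flow-clustering idea can be roughly sketched as follows; the cluster count, batch size, distance-to-centroid scoring, and 99th-percentile threshold are all assumptions for illustration, not details from [20]:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
flows = rng.normal(size=(1000, 6))  # one sliding window of flow features

km = MiniBatchKMeans(n_clusters=8, batch_size=256, n_init=3, random_state=0)
km.fit(flows)

# Flows far from every centroid are treated as candidate anomalies
dist = km.transform(flows).min(axis=1)
threshold = np.percentile(dist, 99)  # assumption: top 1% are suspicious
suspicious = np.where(dist > threshold)[0]
print(len(suspicious), "suspicious flows in this window")
```

In a streaming setting, `partial_fit` can replace `fit` so the centroids are updated as each new window of flows arrives.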

In [22], the authors use an artificial immune system to detect AMI network attacks. This method first uses the Pcap network packets obtained by the AMI detection equipment and then classifies the attack types through artificial immune methods.

With the increase of AMI traffic feature dimensions and noise data, traffic anomaly detection methods based on traditional machine learning face the problems of low accuracy and poor robustness of traffic feature extraction, which reduces the performance of traffic attack detection to a certain extent. Therefore, anomaly detection methods based on deep learning have become a hot topic in current network security research [26–34].

Wang et al. [27] proposed a technique that uses deep learning to complete malicious traffic detection. This technology is mainly divided into two implementation steps: one is to use a CNN (convolutional neural network) to learn the spatial characteristics of traffic, and the other is to extract data packets from the data stream and learn the spatiotemporal characteristics through CNN and RNN (recurrent neural network).

Currently, there are three main methods of anomaly detection based on deep learning:

(1) Anomaly detection methods based on the deep Boltzmann machine [28]: this kind of method can extract the essential features of high-dimensional traffic data through learning, so as to improve the detection rate of traffic attacks. However, this type of method has poor robustness in extracting features; when the input data contain noise, its attack detection performance degrades.

(2) Anomaly detection methods based on stacked autoencoders (SAE) [29]: this type of method can learn and extract traffic data layer by layer. However, the robustness of the extracted features is poor; when the measured data are corrupted, the detection accuracy of this method decreases.

(3) Anomaly detection methods based on CNN [27, 30]: the traffic features extracted by this type of method have strong robustness and the attack detection performance is high, but the network traffic needs to be converted into an image first, which increases the data processing burden, and the influence of network structure information on the accuracy of feature extraction is not fully considered.

In recent years, the achievements of deep learning in the field of time series prediction have also received more and more attention. When a task needs to process sequence information, RNNs can exploit their time series processing advantages, in contrast to the single-input processing of fully connected neural networks and CNNs.

As a new type of RNN, the echo state network is composed of an input layer, a hidden layer (i.e., the reservoir), and an output layer. One of the advantages of ESN is that the entire network only needs to train the output weights Wout, so its training process is very fast. In addition, for the processing and

Mathematical Problems in Engineering 3

prediction of one-dimensional time series, ESN has a very good advantage [32].

Because ESN has these advantages, it is used by more and more researchers to analyze and predict network attacks [33, 34].

Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested the method on the KDD standard dataset, and found that it has faster convergence and better performance in IDS.

At present, some researchers have found through experiments that there are still some problems with the single-layer echo state network: (1) model training can only adjust the output weights; (2) the randomly generated reservoir has nothing to do with the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reservoir is high. Therefore, applying echo state networks to AMI network traffic anomaly detection requires improvement and optimization.

From the previous review, we can find that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, cluster-based, and information-theoretic (entropy-based). In addition, different deep learning methods are constantly being tried and applied.

The above methods have different advantages and disadvantages for different research objects and purposes. This article focuses on making full use of the advantages of the ESN method and trying to solve the problem that a single-layer ESN network cannot be directly applied to AMI complex network traffic detection.

3. AMI Network Architecture and Security Issues

The AMI network is generally divided into three network layers from the bottom up: home area network (HAN), neighborhood area network (NAN), and wide area network (WAN). The hierarchical structure is shown in Figure 1.

In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in the home of a grid user, and its gateway is a smart meter. The neighborhood network is formed by multiple home networks through communication interconnection between smart meters or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, and aggregate data to the FAN's area data concentrator. Many NANs and FANs are interconnected to form a WAN through switches or routers to achieve communication with power company data and control centers.

The reliable deployment and safe operation of the AMI network is the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only information and network security but also the security of physical equipment and human safety [35].

As Fadwa and Zeyar [20] mention, AMI faces various security threats such as privacy disclosure, monetary gain, energy theft, and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.

Researchers generally believe that AMI security detection, defense, and control mainly rely on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after the attack.

4. Proposed Security Solution

At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices have provided certain areas with security detection and defense capabilities, but they bring some problems: (1) these devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.

To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect the AMI network traffic in real time; second, each traffic probe uploads a unified standard traffic file to the control center; and finally, network flow anomalies are analyzed in real time to improve the security detection and identification capabilities of AMI.

As shown in Figure 2, we deploy traffic probes on some important network nodes to collect real-time network flow information of all nodes.

Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In that case, data can also be processed by device and by area. For example, to collect data from different devices, perform preprocessing such as data cleaning, data filtering, and data completion before data analysis; then use the Pearson and Gini coefficient methods mentioned in this article to find important feature correlations; it is also feasible to use the ML-ESN algorithm to classify network attacks.

The main reasons for adopting standardized processing are as follows:

(1) Improve the centralized processing and visual display of network flow information.

(2) Partly eliminate and overcome the problem of inadequate information collection caused by single or too few devices.

(3) Use multiple devices to collect information and standardize the process to improve the ability of information fusion, so as to enhance the accuracy and robustness of classification.

Other power companies that have not performed centralized and standardized processing can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions.

The goal is the same as in this article: to quickly find abnormal network attacks in a large amount of network flow data.

4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.

Metadata are composed of strings; each information element occupies a fixed position in the string, the elements are separated by "^", and the last element is also terminated by "^". An information element that does not exist in the metadata is handled as follows: if an information element defined below does not need to be filled in at its position, two "^" characters are adjacent at that point. If an extracted information element itself contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.

The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^2019-07-29T03:08:23.969^^^TCP^^^10107110^1010721241^^^".

Part of the above probe flow is explained as follows, according to the metadata standard definition: (1) 6: metadata

Figure 2. Traffic probe simple deployment diagram (components: electricity users, smart electric meters, flow probes, data concentrator, firewall, and data processing center).

Table 1. Some important metadata information.

ID | Name | Type | Length | Description
1 | EventID | String | 64 | Event ID
2 | ReceiveTime | Long | 8 | Receive time
3 | OccurTime | Long | 8 | Occur time
4 | RecentTime | Long | 8 | Recent time
5 | ReporterID | Long | 8 | Reporter ID
6 | ReporterIP | IPstring | 128 | Reporter IP
7 | EventSrcIP | IPstring | 128 | Event source IP
8 | EventSrcName | String | 128 | Event source name
9 | EventSrcCategory | String | 128 | Event source category
10 | EventSrcType | String | 128 | Event source type
11 | EventType | Enum | 128 | Event type
12 | EventName | String | 1024 | Event name
13 | EventDigest | String | 1024 | Event digest
14 | EventLevel | Enum | 4 | Event level
15 | SrcIP | IPstring | 1024 | Source IP
16 | SrcPort | String | 1024 | Source port
17 | DestIP | IPstring | 1024 | Destination IP
18 | DestPort | String | 1024 | Destination port
19 | NatSrcIP | IPstring | 1024 | NAT-translated source IP
20 | NatSrcPort | String | 1024 | NAT-translated source port
21 | NatDestIP | IPstring | 1024 | NAT-translated destination IP
22 | NatDestPort | String | 1024 | NAT-translated destination port
23 | SrcMac | String | 1024 | Source MAC address
24 | DestMac | String | 1024 | Destination MAC address
25 | Duration | Long | 8 | Duration (seconds)
26 | UpBytes | Long | 8 | Up traffic bytes
27 | DownBytes | Long | 8 | Down traffic bytes
28 | Protocol | String | 128 | Protocol
29 | AppProtocol | String | 1024 | Application protocol

Figure 1. AMI network layered architecture [35]: smart meters, repeaters, smart home applications, energy storage, and PHEV/PEV devices connect through the HAN (ZigBee, Bluetooth, RFID, PLC); NANs are mesh networks (Wi-Fi, WiMAX, PLC) aggregating to data concentrators; and the WAN (fiber optic, WiMAX, satellite, BPL) connects to the utility centre.


version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol, i.e., TCP.
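The caret-delimited layout above can be parsed mechanically. The sketch below is illustrative only: the field names and the leading field order are assumed from the explanation above, and escaping of embedded carets is ignored for simplicity.

```python
# Hypothetical parser for one caret-separated probe stream record.
# Field names for the leading positions are assumed from the text above.
def parse_probe_record(line):
    fields = line.rstrip('^').split('^')
    names = ['version', 'metadata_id', 'src_ip', 'dst_ip',
             'src_port', 'dst_port', 'protocol']
    # Empty strings mark information elements that were not filled in
    # (two adjacent carets in the raw record).
    return {name: (fields[i] if i < len(fields) else '')
            for i, name in enumerate(names)}

record = parse_probe_record(
    '6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1')
assert record['dst_port'] == '22'
assert record['protocol'] == '6'   # 6 = TCP
```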

4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of fields, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of a single metadata element could not be directly reflected, and second, the dimensionality of the analysis data would be particularly high, resulting in particularly long calculation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.

In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.

The framework mainly includes three processing stages, and the three steps are as follows:

Step 1: collect network flow metadata information in real time through network probe flow collection devices deployed in different areas.

Step 2: first, the collected network flow metadata are aggregated over time series or segments to obtain the statistical characteristics of each part of the network flow. Second, the statistically obtained characteristic values are standardized according to certain data standardization guidelines. Finally, in order to quickly find the important features, and the correlations among them, that reflect network attack anomalies, the standardized features are further filtered.

Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two types of data to check the correctness and performance of the proposed model.

4.3. Feature Extraction. Generally speaking, to realize the classification and identification of network traffic, it is necessary to extract features that better reflect the network traffic and statistical behavior characteristics of different network attack behaviors.

Network traffic [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related data in the set.

As is well known, some simple network characteristics can be extracted directly, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source IP address, destination IP address, source port, and destination port are also interchanged, which reflects the bidirectionality of the flow.

In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.

Firstly, network packets are aggregated into network flows, that is, to distinguish whether each network flow is generated by different network behaviors. Secondly, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.

In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward/backward packet ratio.

Statistical characteristics of time: duration, and maximum, minimum, average, and standard deviation of forward and backward packet intervals.

In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:

6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^
6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^
6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^
6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^
6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^
6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^
6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^
6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^
6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^
6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^
6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^
6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^

Figure 3. Part of the real probe stream data.


Time interval: maximum, minimum, average interval time, and standard deviation.

Packet size: maximum, minimum, average size, and packet distribution.

Number of data packets: outgoing and incoming.

Data amount: input byte amount and output byte amount.

Stream duration: duration from start to end.
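These per-flow statistics can be computed directly from packet sizes and timestamps. The sketch below is a minimal, hypothetical illustration; the feature names follow the spirit of Table 2 rather than the paper's exact schema, and only the forward direction is shown.

```python
# Minimal sketch: per-flow statistical features from forward-direction packets.
# Feature names (mean_fiat, etc.) are illustrative, not the paper's exact schema.
import statistics

def flow_features(fwd_sizes, fwd_times):
    """fwd_sizes: packet sizes in bytes; fwd_times: arrival timestamps in seconds."""
    iats = [t2 - t1 for t1, t2 in zip(fwd_times, fwd_times[1:])]  # inter-arrival times
    return {
        'total_fpackets': len(fwd_sizes),
        'total_fvolume': sum(fwd_sizes),
        'min_fpkt': min(fwd_sizes), 'max_fpkt': max(fwd_sizes),
        'mean_fpkt': statistics.mean(fwd_sizes),
        'std_fpkt': statistics.pstdev(fwd_sizes),
        'min_fiat': min(iats), 'max_fiat': max(iats),
        'mean_fiat': statistics.mean(iats),
        'duration': fwd_times[-1] - fwd_times[0],
    }

f = flow_features([40, 40, 80], [0.0, 0.1, 0.3])
assert f['total_fvolume'] == 160
```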

Some of the main features of network traffic extracted in this paper are shown in Table 2.

4.4. Feature Standardization. Because the various attributes of the power probe stream contain different data types and the differences between their values are relatively large, the data cannot be used directly for analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including operations such as feature standardization and unbalanced data elimination.

At present, the main feature standardization methods are [38] Z-score, min-max, and decimal scaling, etc.

Because there may be some nonnumeric data in the standard protocol fields, such as protocol names, IP addresses, and TCP flags, these data cannot be standardized directly, so nonnumeric data need to be converted to numeric values. For example, change the string "dhcp" to the value "1".

In this paper, Z-score is selected as the standardization method, given the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:

x′ = (x − x̄)/δ,  (1)

where x̄ is the mean value of the original data, δ is the standard deviation of the original data, and δ = sqrt(((x1 − x̄)² + (x2 − x̄)² + … + (xn − x̄)²)/n), with n the number of samples per feature.
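Equation (1) amounts to centring each feature column on its mean and dividing by its population standard deviation; a small NumPy sketch (the zero-variance guard is our addition for constant features):

```python
# Z-score standardization per equation (1).
import numpy as np

def z_score(column):
    x = np.asarray(column, dtype=float)
    mean, std = x.mean(), x.std()   # x.std() is the population standard deviation
    if std == 0:                    # constant feature: return zeros rather than divide by 0
        return np.zeros_like(x)
    return (x - mean) / std

z = z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
# This classic sample has mean 5 and standard deviation 2, so the first value maps to -1.5.
assert abs(z[0] - (-1.5)) < 1e-9
```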

4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is the currently popular feature filtering method: it regards features as independent objects, evaluates the importance of features according to quality metrics, and selects important features that meet the requirements.

At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and covariance matrices, correlation coefficients, unary and multiple regression, information entropy, and mutual information, etc.

Because the power probe flow contains many statistical characteristics and the main characteristics of different types of attacks differ, in order to quickly locate the important characteristics of different attacks, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain.

The Pearson coefficient is used to calculate the correlation of the feature data. The main reason is that the calculation of the Pearson coefficient is efficient and simple, making it more suitable for real-time processing of large-scale power probe streams.

The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y), and its calculation ρxy is shown in the following formula:

ρxy = cov(x, y)/(σx σy) = E[(x − μx)(y − μy)]/(σx σy),  (2)

where cov(x, y) is the covariance of x and y, σx is the standard deviation of x, and σy is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, usually denoted r:

r = Σⁿᵢ₌₁ (xᵢ − x̄)(yᵢ − ȳ) / sqrt(Σⁿᵢ₌₁ (xᵢ − x̄)² · Σⁿᵢ₌₁ (yᵢ − ȳ)²),  (3)

where n is the number of samples, xᵢ and yᵢ are the observations at point i for variables x and y, x̄ is the sample mean of x, and ȳ is the sample mean of y. The value of r is between −1 and 1. When the value is 1, it indicates that there is a completely positive correlation between the two random variables; when

Figure 4. Proposed AMI network traffic detection framework: traffic collection (flow probes) → feature extraction (statistical flow characteristics, standardized features, characteristic filter) → classification and evaluation (construction of the multilayer echo state network; verification and performance evaluation).

Table 2. Some of the main features.

ID | Name | Description
1 | SrcIP | Source IP address
2 | SrcPort | Source IP port
3 | DestIP | Destination IP address
4 | DestPort | Destination IP port
5 | Proto | Network protocol, mainly TCP, UDP, and ICMP
6 | total_fpackets | Total number of forward packets
7 | total_fvolume | Total size of forward packets
8 | total_bpackets | Total number of backward packets
9 | total_bvolume | Total size of backward packets
… | … | …
29 | max_biat | Maximum backward packet arrival interval
30 | std_biat | Standard deviation of backward packet time intervals
31 | duration | Network flow duration


the value is −1, it indicates that there is a completely negative correlation between the two random variables; and when the value is 0, it indicates that the two random variables are linearly independent.
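The sample coefficient r of equation (3) is straightforward to compute per feature pair (equivalently, NumPy's np.corrcoef gives the whole matrix at once); a minimal sketch:

```python
# Sample Pearson correlation coefficient per equation (3).
import numpy as np

def pearson_r(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()          # centre both samples
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

assert abs(pearson_r([1, 2, 3], [2, 4, 6]) - 1.0) < 1e-9   # completely positive
assert abs(pearson_r([1, 2, 3], [6, 4, 2]) + 1.0) < 1e-9   # completely negative
```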

Because the Pearson method can only detect the linear relationship between features and classification categories, the nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and the network attack behavior.

In the classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is Pᵢ, the Gini index of the probability distribution is defined as follows [39]:

Gini(P) = Σᴷᵢ₌₁ Pᵢ(1 − Pᵢ) = 1 − Σᴷᵢ₌₁ Pᵢ².  (4)

Given the sample set D, the Gini coefficient is expressed as follows:

Gini(D) = 1 − Σᴷₖ₌₁ (|Cₖ|/|D|)²,  (5)

where Cₖ is the subset of samples in D belonging to the kth class and K is the number of classes.
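Equation (5) can be evaluated directly from class labels; a minimal sketch:

```python
# Gini index of a sample set per equation (5): 1 minus the sum of squared
# class proportions. Lower values mean lower impurity.
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

assert gini(['dos', 'dos', 'dos']) == 0.0            # pure set: lowest impurity
assert abs(gini(['dos', 'normal']) - 0.5) < 1e-9     # evenly mixed two classes
```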

5. ML-ESN Classification Method

ESN is a new type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has 3 layers: input layer, hidden layer (reservoir), and output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

u(t) = [u1(t), u2(t), …, uK(t)]ᵀ,
x(t) = [x1(t), x2(t), …, xN(t)]ᵀ,
y(t) = [y1(t), y2(t), …, yL(t)]ᵀ.  (6)

Win (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t − 1) to x(t); Wout (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and Wback (N × L) represents the connection weights from y(t − 1) to x(t), which are optional.

When u(t) is input, the updated state equation of the reservoir is given by

x(t + 1) = f(Win · u(t + 1) + W · x(t)),  (7)

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

y(t + 1) = f′(Wout · [u(t + 1); x(t + 1)]).  (8)
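Equations (7) and (8) can be sketched as a single state update. The code below is illustrative: tanh is assumed for f, the identity for f′, the sizes are arbitrary, the optional Wback feedback is omitted, and scaling the spectral radius of W below 1 is the usual way to keep the echo state property.

```python
# One ESN state update and readout per equations (7) and (8); sizes illustrative.
import numpy as np

rng = np.random.default_rng(0)
K, N, L = 3, 20, 2                              # inputs, reservoir neurons, outputs
W_in = rng.uniform(-0.5, 0.5, (N, K))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W))) # scale spectral radius to 0.9 < 1
W_out = rng.uniform(-0.5, 0.5, (L, K + N))      # obtained by regression in practice

x = np.zeros(N)                                 # initial reservoir state
u = rng.uniform(-1, 1, K)                       # one input sample
x = np.tanh(W_in @ u + W @ x)                   # equation (7), Wback feedback omitted
y = W_out @ np.concatenate([u, x])              # equation (8), identity f'
assert y.shape == (L,)
```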

Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome the existing problems of ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]

x1(n + 1) = f(Win · u(n + 1) + W1 · x1(n)),
xk(n + 1) = f(Winter(k−1) · x(k−1)(n + 1) + Wk · xk(n)),
xM(n + 1) = f(Winter(M−1) · x(M−1)(n + 1) + WM · xM(n)).  (9)

The ML-ESN output is then calculated from the final state in formula (9):

y(n + 1) = fout(Wout · xM(n + 1)).  (10)
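Equations (9) and (10) stack the update across reservoirs: the first reservoir is driven by the input, and each deeper reservoir k by the state of reservoir k−1. The sketch below uses illustrative sizes and a softmax readout (the activation named in Algorithm 1); the weight values are random placeholders.

```python
# Stacked-reservoir forward pass per equations (9) and (10); sizes illustrative.
import numpy as np

rng = np.random.default_rng(1)
K, N, L, M = 3, 10, 4, 3                         # inputs, neurons/reservoir, classes, layers
W_in = rng.uniform(-0.5, 0.5, (N, K))
W_inter = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M - 1)]
W_k = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M)]
W_out = rng.uniform(-0.5, 0.5, (L, N))

def ml_esn_step(u, states):
    new = [np.tanh(W_in @ u + W_k[0] @ states[0])]               # first reservoir
    for k in range(1, M):                                        # deeper reservoirs
        new.append(np.tanh(W_inter[k - 1] @ new[k - 1] + W_k[k] @ states[k]))
    z = W_out @ new[-1]                                          # equation (10)
    y = np.exp(z - z.max()); y /= y.sum()                        # softmax readout
    return new, y

states = [np.zeros(N) for _ in range(M)]
states, y = ml_esn_step(rng.uniform(-1, 1, K), states)
assert abs(y.sum() - 1.0) < 1e-9 and len(states) == M
```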

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system is attacked, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations will occur.

Figure 5. ESN basic model: the input layer U(t) feeds the reservoir state x(t) through Win; the reservoir has internal weights W and optional output feedback Wback; the output y(t) is produced through Wout.


It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data, map the input feature value set from the low-dimensional input space to the high-dimensional state space, and finally train the network using linear regression and other methods on the high-dimensional state space.

However, in the ESN network, the number of neurons in the reservoir is difficult to balance. If the number of neurons is relatively large, the fitting effect is weakened; if the number of neurons is relatively small, the generalization ability cannot be guaranteed. Therefore, it is not suitable for directly classifying AMI network traffic anomalies.

On the contrary, the ML-ESN network model can satisfy the echo state property of the internal training network by adding multiple reservoirs when the size of a single reservoir is small, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSL-KDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-format datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of each feature value is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be processed directly. Before processing, the data are standardized, and some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN)/(TP + TN + FP + FN),
FPR = FP/(FP + TN),
TPR = TP/(FN + TP),
precision = TP/(TP + FP),
recall = TP/(FN + TP),
F-score = 2 · precision · recall/(precision + recall).  (11)

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected. TN (true negative): the number of normal network traffic flows successfully detected.

Figure 6. ML-ESN basic model: the input layer U(t) feeds reservoir 1 (W1, x1) through Win; reservoirs are chained through Winter weights (reservoir 2 (W2, x2), …, reservoir M (WM, xM)); the output layer y(t) is produced from xM through Wout.


FP (false positive): the number of normal network traffic flows identified as abnormal. FN (false negative): the number of abnormal network traffic flows identified as normal.
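The indicators of equation (11) follow directly from these four counts; the sketch below computes the false-positive rate against the negatives (FP + TN), the standard definition:

```python
# Evaluation indicators of equation (11) from confusion-matrix counts.
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                         # false-positive rate over negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # = TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, precision, recall, f_score

acc, fpr, p, r, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
assert acc == 0.925 and fpr == 0.05
```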

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3. The statistics of the training dataset.

ID | Type | Number of packets | Size (MB)
1 | Normal | 56000 | 3.63
2 | Analysis | 1560 | 0.108
3 | Backdoors | 1746 | 0.36
4 | DoS | 12264 | 2.42
5 | Exploits | 33393 | 8.31
6 | Fuzzers | 18184 | 4.62
7 | Generic | 40000 | 6.69
8 | Reconnaissance | 10491 | 2.42
9 | Shellcode | 1133 | 0.28
10 | Worms | 130 | 0.044

Input:
D1: training dataset
D2: test dataset
U(t): input feature value set
N: the number of neurons in each reservoir
Ri: the number of reservoirs
α: interconnection weight spectral radius

Output:
Training and testing classification results

Steps:
(1) Initially set the parameters of the ML-ESN and determine the corresponding number of input and output units according to the dataset:
(i) set the training data length trainLen;
(ii) set the test data length testLen;
(iii) set the number of reservoirs Ri;
(iv) set the number of neurons in each reservoir N;
(v) set the reservoir update speed α;
(vi) set xi(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix Win, the internal connection weights of the reservoirs Wi (1 ≤ i ≤ M), and the external connection weights between reservoirs Winter:
(i) randomly initialize the values of Win, Wi, and Winter;
(ii) through statistical normalization and spectral radius calculation, scale Winter and Wi to meet the sparsity requirements. The calculation formulas are Wi = α(Wi/|λin|) and Winter = α(Winter/|λinter|), where λin and λinter are the spectral radii of the Wi and Winter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables:
(i) for t from 1 to T:
(a) calculate x1(t) according to equation (7);
(b) for i from 2 to M, calculate xi(t) according to equations (7) and (9);
(c) collect the matrix H = [x(t + 1); u(t + 1)].
(4) Solve for the weight matrix Wout from the reservoir to the output layer to obtain the trained ML-ESN network structure:
(i) Wout = DHᵀ(HHᵀ + βI)⁻¹, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
(i) select the SoftMax activation function and calculate the output fout value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
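Step (4) of Algorithm 1 is a closed-form ridge regression over the collected states. The sketch below uses synthetic state and target matrices purely to illustrate the shapes involved:

```python
# Ridge-regression readout of step (4): W_out = D H^T (H H^T + beta*I)^-1.
# H and D here are synthetic placeholders, one column per training step.
import numpy as np

rng = np.random.default_rng(2)
T, n_state, n_out = 200, 30, 4
H = rng.normal(size=(n_state, T))      # collected state matrix
D = rng.normal(size=(n_out, T))        # expected (e.g., one-hot) output matrix
beta = 1e-4                            # ridge regression parameter

W_out = D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(n_state))
assert W_out.shape == (n_out, n_state)
```

With a small β, W_out @ H approximates D in the least-squares sense while the βI term keeps the inversion well conditioned.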


Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, data deduplication, data completion, and data normalization, to obtain normalized and standardized data; the standardized data are shown in Figure 7 and the normalized data distribution in Figure 8.

As can be seen from Figure 8, after normalizing the data, most of the attack-type data are concentrated between 0.4 and 0.6, Generic attack-type data are concentrated between 0.7 and 0.9, and normal-type data are concentrated between 0.1 and 0.3.
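The preprocessing of Step 2 can be sketched with pandas. This is an illustrative sketch, not the authors' pipeline: the paper does not give the exact formulas, so standard z-score standardization and min-max normalization to [0, 1] are assumed here.

```python
import pandas as pd

def preprocess(df: pd.DataFrame):
    """Cleaning, deduplication, completion, then scaling (illustrative)."""
    df = df.drop_duplicates()                      # data deduplication
    df = df.fillna(df.median(numeric_only=True))   # data completion
    num = df.select_dtypes("number").columns
    # standardization (zero mean, unit variance), as displayed in Figure 7
    standardized = (df[num] - df[num].mean()) / df[num].std()
    # min-max normalization to [0, 1], as displayed in Figure 8
    normalized = (df[num] - df[num].min()) / (df[num].max() - df[num].min())
    return standardized, normalized
```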

Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini indexes for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching a value of 0.97, while the correlation between spkts and ct_srv_src (number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the Pearson correlation threshold is initially set to 0.5: of any feature pair with a Pearson value greater than 0.5, one feature is discarded, while features with values less than 0.5 are retained.

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.
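The Pearson-based filtering described above can be sketched as a greedy pass over the correlation matrix. This is an assumption-laden sketch: the paper does not state which feature of a correlated pair is dropped, so here the first-seen feature is kept.

```python
import pandas as pd

def pearson_filter(df: pd.DataFrame, threshold: float = 0.5) -> list:
    """Keep a feature only if its |Pearson correlation| with every
    already-kept feature is at most the threshold (0.5 in the paper)."""
    corr = df.corr(method="pearson").abs()
    keep = []
    for col in corr.columns:
        if all(corr.loc[col, k] <= threshold for k in keep):
            keep.append(col)
    return keep
```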

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of Gini coefficients, the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.
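A Gini-based feature score of this kind can be sketched as the label impurity weighted over a feature's value bins. This is one common construction, assumed here since the paper does not give its exact computation; the quantile binning with 10 bins is illustrative.

```python
import numpy as np

def gini_impurity(labels) -> float:
    """Gini impurity of a label set: 1 - sum_k p_k^2 (0 = perfectly pure)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def feature_gini(values, labels, bins=10) -> float:
    """Weighted Gini impurity of the class labels across quantile bins
    of a numeric feature; lower means the feature separates classes better."""
    edges = np.quantile(values, np.linspace(0, 1, bins + 1))
    idx = np.digitize(values, edges[1:-1])  # bin index per sample
    total = len(labels)
    return sum(
        (np.sum(idx == b) / total) * gini_impurity(np.asarray(labels)[idx == b])
        for b in np.unique(idx)
    )
```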

Based on the results of the Pearson and Gini coefficients for feature selection in the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, in msec), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. Relevant parameters were initially set in the experiment, and the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

[Figure 7: Partial feature data after standardization. The figure shows a table of standardized values (rows 0–8) for the features dur, proto, service, state, spkts, dpkts, sbytes, and dbytes.]

[Figure 8: Normalized data distribution. The figure shows the distribution of normalized values (0.0–1.0) by data label: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal.]


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.

Generally speaking, for the same dataset, as the number of reservoirs increases, model training time gradually increases, but detection accuracy does not increase monotonically: it first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of ML-ESN is to generate from the reservoirs a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its value range lies between −1 and 1 with a mean of 0, which is more conducive to improving training efficiency. Second, when features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies this feature effect.

The output layer uses the sigmoid activation function because the sigmoid output lies between 0 and 1, which directly reflects the probability of a given attack type.

In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

[Figure 9: The Pearson coefficient values for UNSW_NB15. The figure shows a correlation heatmap (−1.0 to 1.0) over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm.]


The experimental environment was a Windows 10 Home 64-bit operating system, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU at 1.7 GHz.

6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset with neither of the two filtering methods, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether on a small or a large data sample, the classification effect without filtering is lower than with filtering.

In addition, using a single filtering method is not as good as using the combination of the two. For example, on the 160,000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used for filtering, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy reaches 0.99.

6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter and then uses the ML-ESN training

[Figure 10: The Gini values for UNSW_NB15. The figure shows a heatmap (0.0–1.0) of Gini values over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit.]

Table 4: The parameters of the ML-ESN experiment.

Parameters                    Values
Input dimension number        5
Output dimension number       10
Reservoir number              3
Reservoir neurons number      1000
Reservoir activation fn       Tanh
Output layer activation fn    Sigmoid
Update rate                   0.9
Random seed                   50
Regularization rate           1.0 × 10⁻⁶


algorithm to learn, and then uses the test data to verify the trained model, obtaining the test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

The detection results in Figure 12 show that it is entirely feasible to use the ML-ESN network learning model, with the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization, to quickly classify anomalous network traffic attacks.

The detection results for accuracy, F1-score, and FPR are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, and the FPR is only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
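The per-class metrics reported above can be computed from a one-vs-rest confusion matrix. A minimal sketch, assuming the usual definitions of accuracy, F1-score, and false-positive rate (the paper does not spell out its formulas):

```python
def per_class_metrics(y_true, y_pred, positive):
    """One-vs-rest accuracy, F1-score, and FPR for one attack class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return accuracy, f1, fpr
```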

6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the model training time also increases; for example, with 1000 neurons, a reservoir depth of 5 takes 211 ms, while a depth of 3 takes only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, when the reservoir depth is 3 with 1000 neurons, the detection accuracy is 0.96, while at depth 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.

The main reason for this phenomenon is that, at the beginning, with increasing depth, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method detects the different attack types well.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

[Figure 11: Classification effect of different filtering methods. The figure plots accuracy (0.4–1.0) against the number of training packets (20,000–160,000) for no filtering, Pearson only, Gini only, and Pearson + Gini.]


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most values of feature A and feature B are concentrated around 50; in particular, the values of feature A hardly exceed 60. In addition, a small part of the values of feature B is concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper focuses on comparative simulation experiments against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment focuses on five test datasets of different scales, namely, 5,000, 20,000, 60,000, 120,000, and 160,000 packets, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
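The four baselines are standard scikit-learn estimators, so such a comparison can be sketched as below. This runs on synthetic stand-in data (the real experiment used the UNSW_NB15 splits), and the hyperparameters are library defaults rather than the paper's settings.

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Synthetic stand-in for the filtered UNSW_NB15 data: 5 features, 10 classes
X, y = make_classification(n_samples=2000, n_features=5, n_informative=5,
                           n_redundant=0, n_classes=10, n_clusters_per_class=1,
                           random_state=50)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.32, random_state=50)

baselines = {
    "GaussianNB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=50),
    "MLPClassifier": MLPClassifier(max_iter=500, random_state=50),
}
scores = {}
for name, clf in baselines.items():
    clf.fit(X_tr, y_tr)
    scores[name] = clf.score(X_te, y_te)  # test-set classification accuracy
```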

From the experimental results in Figure 15, it can be seen that on small test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-packet data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm falls below 50% accuracy, while the other algorithms are close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-packet dataset the accuracy of the algorithm reached 96.75%, and on the 160,000-packet dataset it reached 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) curves to evaluate performance. A ROC curve is a graph with FPR (false-positive rate) as the horizontal axis and TPR

[Figure 12: Classification results of the ML-ESN method. The figure plots detection rate (accuracy, F1-score, and FPR) per attack type (Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, Worms); accuracy and F1-score range from 0.94 to 1.0, and FPR ranges from 0.01 to 0.02.]


[Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) versus reservoir depth (2–5) for 500, 1000, and 2000 neurons; (b) accuracy versus reservoir depth for 500, 1000, and 2000 neurons; (c) accuracy and time (s) for the BP, DecisionTree, ESN, and ML-ESN algorithms.]

[Figure 14: Distribution map of the first two statistical characteristics. The figure plots the feature distribution of Feature A and Feature B against the number of packages (0–160,000).]


(true-positive rate) as the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
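Per-class ROC curves of this kind are built one-vs-rest from the classifiers' per-class scores. A minimal sketch with scikit-learn (the class names and score matrix below are illustrative, not the paper's data):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

def per_class_roc(y_true, y_score, classes):
    """One-vs-rest ROC AUC per attack class.
    y_score has one column of scores per class, in the order of `classes`."""
    y_bin = label_binarize(y_true, classes=classes)
    aucs = {}
    for i, c in enumerate(classes):
        fpr, tpr, _ = roc_curve(y_bin[:, i], y_score[:, i])
        aucs[c] = auc(fpr, tpr)
    return aucs
```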

From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

[Figure 15: Detection results of different classification methods under different data sizes. The figure plots accuracy (0.4–1.0) against dataset size (20,000–160,000) for GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN.]

[Figure 16: Classification ROC diagram of the single-layer ESN algorithm. AUC per class: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97.]


[Figure 18: Classification ROC diagram of the DecisionTree algorithm. AUC per class: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78.]

[Figure 19: Classification ROC diagram of our ML-ESN algorithm. AUC per class: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00.]

[Figure 17: Classification ROC diagram of the BP algorithm. AUC per class: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95.]


the other attack types are 99%. However, in the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, identifies some problems in AMI network security, and introduces the contributions of existing researchers to AMI network security.

Secondly, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) study of the special protocols of AMI networks, in order to establish an optimized ML-ESN network traffic deep learning model more in line with the actual application of AMI, so as to apply it in actual industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System (no. ZDKJXM20170002)" of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability (no. SJCX201970)" for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96–111, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350–355, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization Baidu Encyclopediardquo 2020 httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin

[39] H Li Statistical Learning Methods Tsinghua University PressBeijing China 2018

[40] Z K Malik A Hussain and Q J Wu ldquoMultilayered echostate machine a novel architecture and algorithmrdquo IEEETransactions on Cybernetics vol 47 no 4 pp 946ndash959 2017

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

20 Mathematical Problems in Engineering

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] "UNSW-NB15 dataset," 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.



In order to achieve rapid detection of AMI malicious code attacks, the authors in [15] proposed a secure and privacy-protected aggregation scheme based on additive homomorphic encryption and proxy re-encryption operations in the Paillier cryptosystem.

In [16], Euijin et al. used a disassembler and statistical analysis to detect AMI malicious code. The method first looks for the characteristics of each data type, uses a disassembler to study the distribution of instructions in the data, and performs statistical analysis on the data payload to determine whether it is malicious code.

2.3. Network Attack Detection. At present, a large body of statistics shows that the main target of hackers attacking the AMI network is the smart meter (SM).

The SM is the key equipment of the AMI network: it realizes two-way communication between the power company and the user. On the one hand, the user's consumption data are collected and transmitted to the power company through the AMI network; on the other hand, the company's electricity prices and instructions are delivered to users.

The intrusion detection mechanism is an important part of current smart meter security protection. It monitors and analyzes the events that occur in the smart meter. Once an attack occurs or a potential security threat is discovered, the intrusion detection mechanism issues an alarm so that the system and its managers can adopt corresponding response mechanisms.

Current research on AMI network security threats mainly analyzes whether there are abnormalities from the perspective of network security, especially data and network security modeling for smart meter security. The main reason is that although physical attacks against AMI are often the strongest and most effective, they are also easier to detect.

Existing AMI network attack detection methods mainly include simulation [17, 18], k-means clustering [1, 19, 20], data mining [21–23], prequential evaluation [24], and PCA [25].

In [17], the authors investigated the puppet attack mechanism, compared it with other attack types, and evaluated the impact of puppet attacks on AMI through simulation experiments.

In [18], the authors also used the simulation tool NeSSi to study the impact of large-scale DDoS attacks on the information and communication infrastructure of the smart grid AMI network.

To analyze AMI network anomalies more accurately, some researchers start with AMI network traffic and use machine learning methods to determine whether various anomalous attacks have occurred on the network.

In [20], the authors used distributed intrusion detection and sliding window methods to monitor the data flow of AMI components and proposed a real-time unsupervised AMI data-flow mining detection system (DIDS). The system mainly uses the mini-batch k-means algorithm to cluster network flows by type in order to discover abnormal attack types.

In [22], the authors used an artificial immune system to detect AMI network attacks. This method first takes the Pcap network packets obtained by the AMI detection equipment and then classifies the attack types through artificial immune methods.

With the increase of AMI traffic feature dimensionality and noisy data, traffic anomaly detection based on traditional machine learning faces low accuracy and poor robustness of traffic feature extraction, which reduces the performance of traffic attack detection to a certain extent. Therefore, anomaly detection based on deep learning has become a hot topic in current network security research [26–34].

Wang et al. [27] proposed a technique that uses deep learning for malicious traffic detection. The technique has two main steps: first, a CNN (convolutional neural network) learns the spatial characteristics of the traffic; second, data packets are extracted from the data stream and their spatiotemporal characteristics are learned through a combination of CNN and RNN (recurrent neural network).

Currently, there are three main deep-learning-based anomaly detection methods:

(1) Anomaly detection based on deep Boltzmann machines [28]: this kind of method can extract the essential features of high-dimensional traffic data through learning, so as to improve the detection rate of traffic attacks. However, it is not robust in feature extraction: when the input data contain noise, its attack detection performance degrades.

(2) Anomaly detection based on stacked autoencoders (SAE) [29]: this type of method can learn and extract traffic data layer by layer. However, the robustness of the extracted features is poor; when the measured data are corrupted, the detection accuracy of this method decreases.

(3) Anomaly detection based on CNN [27, 30]: the traffic features extracted by this type of method are robust and attack detection performance is high, but the network traffic must first be converted into an image, which increases the data-processing burden, and the influence of network structure information on the accuracy of feature extraction is not fully considered.

In recent years, the achievements of deep learning in time series prediction have also received increasing attention. For tasks that must process sequence information, RNNs offer advantages in time series processing over the single-input processing of fully connected neural networks and CNNs.

As a new type of RNN, the echo state network (ESN) is composed of an input layer, a hidden layer (i.e., the reservoir), and an output layer. One advantage of the ESN is that the entire network only needs to train the Wout layer, so its training process is very fast. In addition, for the processing and


prediction of one-dimensional time series, the ESN has a very good advantage [32].

Because the ESN has these advantages, it is being used by more and more researchers to analyze and predict network attacks [33, 34].

Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested it on the KDD standard dataset, and found that the method converges faster and performs better in IDS.

At present, some researchers have found through experiments that the single-layer echo state network still has some problems: (1) model training can only adjust the output weights; (2) the randomly generated reservoir is unrelated to the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reservoir is high. Therefore, applying the echo state network to AMI network traffic anomaly detection requires improvement and optimization.

From the previous review, we can find that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, cluster-based, and information-theoretic (entropy-based). In addition, different deep learning methods are constantly being tried and applied.

The above methods have different advantages and disadvantages for different research objects and purposes. This article focuses on making full use of the advantages of the ESN method and tries to solve the problem that the single-layer ESN network cannot be directly applied to complex AMI network traffic detection.

3. AMI Network Architecture and Security Issues

The AMI network is generally divided into three network layers, from the bottom up: home area network (HAN), neighborhood area network (NAN), and wide area network (WAN). The hierarchical structure is shown in Figure 1.

In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in the home of a grid user, and its gateway is a smart meter. The NAN is formed by multiple home networks through communication interconnection between smart meters, or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, and aggregate data to the FAN's area data concentrator. Many NANs and FANs are interconnected through switches or routers to form a WAN, achieving communication with power company data and control centers.

The reliable deployment and safe operation of the AMI network is the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only information and network security but also the security of physical equipment and human safety [35].

As Fadwa and Zeyar [20] mention, AMI faces various security threats, such as privacy disclosure, monetary gain, energy theft,

and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.

Researchers generally believe that AMI security detection, defense, and control rely on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after the attack.

4. Proposed Security Solution

At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices provide certain areas with security detection and defense capabilities, but they bring some problems: (1) the devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.

To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect AMI network traffic in real time; second, each traffic probe uploads a traffic file in a unified standard format to the control center; finally, network flow anomalies are analyzed in real time to improve the security detection and identification capabilities of AMI.

As shown in Figure 2, we deploy traffic probes on some important network nodes to collect real-time network flow information from all nodes.

Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In this case, data can also be processed by device and by area. For example, to collect data from different devices, perform preprocessing such as data cleaning, data filtering, and data completion before data analysis; then use the Pearson and Gini coefficient methods mentioned in this article to find important feature correlations; using the ML-ESN algorithm to classify network attack anomalies is also feasible.

The main reasons for adopting standardized processing are as follows:

(1) Improve the centralized processing and visual display of network flow information.

(2) Partly eliminate and overcome the problem of inadequate information collection caused by a single device or too few devices.

(3) Use multiple devices to collect information and standardize the process to improve information fusion, so as to enhance the accuracy and robustness of classification.

Other power companies that have not performed centralized and standardized processing can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions.

The goal is the same as in this article: to quickly find abnormal network attacks in a large amount of network flow data.

4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.

Metadata are composed of strings; each information element occupies a fixed position in the string, the elements are separated by '^', and the last element is also terminated by '^'. An information element that is absent from a metadata record is represented as follows: if an information element defined below does not need to be filled in at its position, two '^' characters are adjacent at that point. If an extracted information element itself contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.

The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^2019-07-29T03:08:23.969^^^TCP^^^10107110^1010721241^^^".

Part of the above probe flow is explained as follows, according to the metadata standard definition:

Figure 2: Traffic probe simple deployment diagram (data processing center, firewall, data concentrator, flow probes, smart electric meters, and electricity users).

Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT translated source IP
20  NatSrcPort        String    1024    NAT translated source port
21  NatDestIP         IPstring  1024    NAT translated destination IP
22  NatDestPort       String    1024    NAT translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol

Figure 1: AMI network layered architecture [35] (HAN: Zigbee, Bluetooth, RFID, PLC; NAN: mesh network, Wi-Fi, WiMAX, PLC; WAN: fiber optic, WiMAX, satellite, BPL; smart meters, repeaters, data concentrator, and the utility centre).


(1) 6: metadata version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol (TCP).
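The caret-delimited record format above can be parsed positionally. The following is a minimal sketch, assuming the field order given in the explanation (version, metadata ID, source/destination IP, source/destination port, protocol); the field names are illustrative, not the standard's official identifiers.

```python
# Leading field positions assumed from the in-text explanation of Figure 3.
FIELD_NAMES = ["version", "metadata_id", "src_ip", "dest_ip",
               "src_port", "dest_port", "protocol"]

def parse_probe_record(line: str) -> dict:
    """Split one probe stream record on '^'; absent fields stay as ''."""
    parts = line.split("^")
    record = dict(zip(FIELD_NAMES, parts))
    # Remaining positions hold byte/packet counts, timestamps, etc.
    record["extra"] = parts[len(FIELD_NAMES):]
    return record

rec = parse_probe_record(
    "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6"
    "^40^1^40^1^1564365874^1564365874^^^^")
print(rec["src_ip"], rec["dest_port"], rec["protocol"])  # 10107110 22 6
```

A production parser would also unescape carets inside field values, which this sketch omits.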

4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of fields, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of a single metadata field could not be directly reflected, and second, the analysis data would have a particularly high dimensionality, resulting in particularly long computation times. Therefore, the original probe stream metadata cannot be used directly; they need further preprocessing and analysis.

In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.

The framework mainly includes three processing stages, as follows:

Step 1: collect network flow metadata information in real time through network probe flow collection devices deployed in different areas.

Step 2: first, the collected network flow metadata are aggregated over time series or segments to obtain the statistical characteristics of each part of the network flow. Second, the statistical characteristic values are standardized according to certain data standardization guidelines. Finally, in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies, the standardized features are further filtered.

Step 3: establish a multilayer echo state network deep learning model and classify the feature-extracted data, part of which is used as training data and part as test data. Cross-validation is performed on the two sets to check the correctness and performance of the proposed model.

4.3. Feature Extraction. Generally speaking, to classify and identify network traffic, it is necessary to capture statistical behavior characteristics that distinguish the network traffic of different network attack behaviors.

Network traffic [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related data in the set.

As is known, some simple network characteristics can be extracted directly, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.

In order to reflect the characteristics of different types of network attacks more accurately, it is necessary to aggregate network flows and collect their statistical characteristics.

First, network packets are aggregated into network flows, that is, each network flow is distinguished according to whether it is generated by a different network behavior. Second, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.

In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packet sizes, and the forward-to-backward packet ratio.

Statistical characteristics of time: duration, and maximum, minimum, average, and standard deviation of forward and backward packet intervals.

In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:

6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^

6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^

6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^

6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^

6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^

6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^

6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^

6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^

6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^

6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^

6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^

Figure 3: Part of the real probe stream data.


Time interval: maximum, minimum, average, and standard deviation of inter-arrival times.
Packet size: maximum, minimum, average size, and packet size distribution.
Number of data packets: outgoing and incoming.
Data amount: input bytes and output bytes.
Stream duration: duration from start to end.

Some of the main features of network traffic extracted in this paper are shown in Table 2.
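The per-flow statistics listed above (size, inter-arrival time, counts, duration) can be sketched as follows. This is a minimal illustration, not the paper's exact feature extractor; the function and field names are hypothetical.

```python
# Sketch: compute basic statistical flow features from packet sizes (bytes)
# and packet timestamps (seconds) of one aggregated network flow.
import statistics

def flow_features(sizes, timestamps):
    """Max/min/mean/std of packet sizes and inter-arrival times, plus totals."""
    iats = [t2 - t1 for t1, t2 in zip(timestamps, timestamps[1:])]
    feats = {
        "num_packets": len(sizes),
        "total_bytes": sum(sizes),
        "max_size": max(sizes),
        "min_size": min(sizes),
        "mean_size": statistics.mean(sizes),
        "std_size": statistics.pstdev(sizes),
        "duration": timestamps[-1] - timestamps[0],
    }
    if iats:  # inter-arrival statistics need at least two packets
        feats.update(max_iat=max(iats), min_iat=min(iats),
                     mean_iat=statistics.mean(iats),
                     std_iat=statistics.pstdev(iats))
    return feats

f = flow_features([40, 60, 40], [0.0, 0.5, 2.0])
print(f["duration"], f["mean_iat"])  # 2.0 1.0
```

In the actual framework these statistics would be computed separately for the forward and backward directions of each flow.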

4.4. Feature Standardization. Because the various attributes of the power probe stream contain different data types and the differences between their values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced data elimination.

At present, the main feature standardization methods are [38] Z-score, min-max, and decimal scaling.

Because there may be some nonnumeric data in the standard protocol fields, such as protocol names, IP addresses, and TCP flags, these data cannot be standardized directly, so nonnumeric data first need to be converted to numeric values. For example, the character string "dhcp" is changed to the value "1".

In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and the differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:

x' = \frac{x - \bar{x}}{\delta},  (1)

where \bar{x} is the mean value of the original data and \delta is its standard deviation,

\delta = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}},

with n the number of samples per feature.
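Formula (1) applied per feature column can be sketched as follows; the example column and function name are illustrative, and nonnumeric values are assumed to have been mapped to integers beforehand.

```python
# Sketch of Z-score standardization per formula (1), one feature column
# at a time, using the population standard deviation.
import statistics

def zscore(column):
    mean = statistics.mean(column)
    std = statistics.pstdev(column)  # population std, as in formula (1)
    if std == 0:
        return [0.0 for _ in column]  # a constant feature carries no signal
    return [(v - mean) / std for v in column]

col = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
# mean = 5, std = 2
print(zscore(col))  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```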

4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is a currently popular feature filtering approach: it regards features as independent objects, evaluates the importance of each feature according to quality metrics, and selects the important features that meet the requirements.

At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.

Because the power probe flow contains many statistical characteristics and the main characteristics differ across attack types, in order to quickly locate the important characteristics of different attacks, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain.

The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its calculation is efficient and simple, making it suitable for real-time processing of large-scale power probe streams.

The Pearson correlation coefficient mainly reflects the linear correlation between two random variables (x, y); its calculation \rho_{xy} is shown in the following formula:

\rho_{xy} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - u_x)(y - u_y)]}{\sigma_x \sigma_y},  (2)

where cov(x, y) is the covariance of x and y, \sigma_x is the standard deviation of x, and \sigma_y is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, usually denoted by r:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}},  (3)

where n is the number of samples, x_i and y_i are the observations at point i of variables x and y, \bar{x} is the mean of the x samples, and \bar{y} is the mean of the y samples. The value of r lies between −1 and 1. A value of 1 indicates a completely positive correlation between the two random variables;

Figure 4: Proposed AMI network traffic detection framework (probe traffic collection; statistical flow feature extraction; feature standardization and filtering; construction of the multilayer echo state network; classification, verification, and performance evaluation).

Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
...
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Standard deviation of backward packet time intervals
31  duration        Network flow duration


a value of −1 indicates a completely negative correlation between the two random variables; and a value of 0 indicates that the two random variables are linearly uncorrelated.
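The sample coefficient r of formula (3) can be sketched directly; the function name is illustrative, and in the framework it would be applied between each flow feature and the class label.

```python
# Sketch of the sample Pearson correlation coefficient r, formula (3).
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x) *
                    sum((yi - my) ** 2 for yi in y))
    return num / den

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0  (completely positive)
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0 (completely negative)
```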

Because the Pearson method can only detect the linear relationship between features and classification categories, the nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data-distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.

In a classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is P_i, the Gini index of the probability distribution is defined as follows [39]:

\mathrm{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2.  (4)

Given the sample set D, the Gini coefficient is expressed as follows:

\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2,  (5)

where C_k is the subset of samples in D belonging to the kth class and K is the number of classes.
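Formula (5) computed from class counts can be sketched as follows; the labels are illustrative placeholders.

```python
# Sketch of the Gini coefficient of a sample set D, formula (5):
# 1 minus the sum of squared class proportions.
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["normal", "normal", "attack", "attack"]))  # 0.5 (maximal for 2 classes)
print(gini(["normal"] * 4))                            # 0.0 (pure set)
```

A lower Gini value after splitting on a feature indicates that the feature separates the attack classes well, which is the basis for the nonlinear feature screening described above.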

5. ML-ESN Classification Method

The ESN is a new type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has three layers: input layer, hidden layer (reservoir), and output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T,
x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T,
y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T.  (6)

Win(NlowastK) represents the connection weight of theinput layer to the reservoir W(NlowastN) represents theconnection weight from x(t minus 1) to x(t) Wout (Llowast (K +

N + L)) represents the weight of the connection from thereservoir to the output layerWback(NlowastL) represents theconnection weight of y (t minus 1) to x (t) and this value isoptional

When u(t) is input, the updated state equation of the reservoir is given by

x(t + 1) = f(W_in u(t + 1) + W x(t) + W_back y(t)),  (7)

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

y(t + 1) = f′(W_out [u(t + 1); x(t + 1)]).  (8)
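A minimal NumPy sketch of the reservoir update of equation (7), with the optional feedback term W_back y(t) omitted; the layer sizes and the 0.9 spectral-radius target are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 5, 100   # input and reservoir sizes (illustrative)

W_in = rng.uniform(-0.5, 0.5, (N, K))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # rescale spectral radius to 0.9

def esn_step(x, u, f=np.tanh):
    """Reservoir update of equation (7) with the optional W_back term dropped."""
    return f(W_in @ u + W @ x)

x = np.zeros(N)
x = esn_step(x, rng.standard_normal(K))
print(x.shape)   # (100,)
```

Keeping the spectral radius of W below 1 is the usual way of preserving the echo-state property, i.e., of making the reservoir state a fading memory of the input history.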

Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome these problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]

x_1(n + 1) = f(W_in u(n + 1) + W_1 x_1(n)),
x_k(n + 1) = f(W_inter^(k−1) x_{k−1}(n + 1) + W_k x_k(n)),  2 ≤ k ≤ M,
x_M(n + 1) = f(W_inter^(M−1) x_{M−1}(n + 1) + W_M x_M(n)).  (9)

The ML-ESN output is then computed from the final reservoir state of formula (9):

y(n + 1) = f_out(W_out x_M(n + 1)).  (10)
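The stacked update of equation (9) can be sketched as follows (layer sizes are illustrative, and the spectral-radius scaling of the weight matrices is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)
K, N, M = 5, 50, 3   # input size, neurons per reservoir, number of reservoirs

W_in = rng.uniform(-0.5, 0.5, (N, K))
W = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M)]            # internal weights W_k
W_inter = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M - 1)]  # inter-reservoir weights

def ml_esn_step(states, u, f=np.tanh):
    """One update of equation (9): reservoir 1 is driven by the input u(n+1),
    and reservoir k (k >= 2) by the fresh state of reservoir k-1."""
    new = [f(W_in @ u + W[0] @ states[0])]
    for k in range(1, M):
        new.append(f(W_inter[k - 1] @ new[k - 1] + W[k] @ states[k]))
    return new

states = [np.zeros(N) for _ in range(M)]
states = ml_esn_step(states, rng.standard_normal(K))
print(len(states), states[-1].shape)   # 3 (50,)
```

Only the last reservoir state x_M feeds the readout of equation (10), so the earlier reservoirs act as progressively richer nonlinear transformations of the input.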

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system suffers an abnormal attack, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations may occur.

Figure 5: ESN basic model (input layer U(t) connected to the reservoir x(t) through W_in, recurrent weights W within the reservoir, output y(t) produced through W_out, and optional feedback W_back).

8 Mathematical Problems in Engineering

It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The approach is to use a large-scale random sparse network (the reservoir), composed of neurons, as the processing medium for the data: the input feature set is mapped from the low-dimensional input space to a high-dimensional state space. Finally, the network is trained on the high-dimensional state space using linear regression and similar methods.

However, in the ESN, the number of neurons in the reservoir is difficult to balance. If the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the single-layer ESN is not suitable for directly classifying AMI network traffic anomalies.

By contrast, the ML-ESN model can satisfy the echo-state property of the internal training network by adding multiple reservoirs when the size of a single reservoir is small, thereby improving the overall training performance of the model.

This paper therefore selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The tests use multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and more accurately reflects the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of the feature values is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be processed directly. Before processing, the data are standardized; some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN) / (TP + TN + FP + FN),
FPR = FP / (FP + TN),
TPR = TP / (FN + TP),
precision = TP / (TP + FP),
recall = TP / (FN + TP),
F-score = 2 × precision × recall / (precision + recall).  (11)

The specific meanings of TP, TN, FP, and FN in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.

Figure 6: ML-ESN basic model (input layer U(t) connected through W_in to reservoirs 1 through M with internal weights W_1, ..., W_M, reservoirs chained by W_inter, and output y(t) produced from x_M through W_out).


FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
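The indicators of equation (11) follow directly from these four confusion counts. A small sketch, with hypothetical counts (note that FPR here uses the standard FP / (FP + TN) denominator):

```python
def classification_metrics(tp, tn, fp, fn):
    """Indicators of equation (11); FPR uses FP / (FP + TN)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)
    tpr = tp / (fn + tp)
    precision = tp / (tp + fp)
    recall = tp / (fn + tp)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, tpr, precision, recall, f_score

# Hypothetical counts: 90 attacks caught, 95 normal flows passed,
# 5 false alarms, 10 missed attacks.
acc, fpr, tpr, prec, rec, f1 = classification_metrics(90, 95, 5, 10)
print(acc, fpr)   # 0.925 0.05
```

The F-score balances precision and recall, which is why the paper reports it alongside raw accuracy for the imbalanced attack classes.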

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044

Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: number of neurons in each reservoir
  Ri: number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initially set the ML-ESN parameters and determine the number of input and output units according to the dataset:
    (i) set the training data length trainLen;
    (ii) set the test data length testLen;
    (iii) set the number of reservoirs Ri;
    (iv) set the number of neurons in each reservoir N;
    (v) set the reservoir update rate α;
    (vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the external connection weights between reservoirs W_inter:
    (i) randomly initialize the values of W_in, W_i, and W_inter;
    (ii) through statistical normalization and spectral radius calculation, scale W_i and W_inter to meet the sparsity requirements, using W_i := α(W_i / |λ_i|) and W_inter := α(W_inter / |λ_inter|), where λ_i and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and pass them through the activation function of the reservoir processing units to obtain the final state variables:
    (i) for t from 1 to T:
        (a) calculate x_1(t) according to equation (7);
        (b) for i from 2 to M, calculate x_i(t) according to equation (9);
        (c) collect the matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix W_out from the reservoirs to the output layer to obtain the trained ML-ESN network: W_out = D H^T (H H^T + βI)^(−1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix, respectively.
(5) Calculate the ML-ESN output according to formula (10), selecting the SoftMax activation function to compute the output value f_out.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
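The ridge-regression readout of step (4) can be sketched in a few lines; the matrix sizes below are illustrative toy values, not the paper's settings:

```python
import numpy as np

def train_readout(H, D, beta=1e-6):
    """Ridge-regression readout of Algorithm 1, step (4):
    W_out = D H^T (H H^T + beta * I)^(-1)."""
    n_states = H.shape[0]
    return D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(n_states))

rng = np.random.default_rng(2)
H = rng.standard_normal((20, 200))   # toy state collection matrix (states x time)
D = rng.standard_normal((10, 200))   # toy expected output matrix (outputs x time)
W_out = train_readout(H, D)
print(W_out.shape)   # (10, 20)
```

Because only W_out is trained, this closed-form solve replaces backpropagation entirely, which is the property the paper credits for the ML-ESN's fast training.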


Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, deduplication, completion, and normalization to obtain normalized and standardized data; the standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.

As can be seen from Figure 8, after normalization most of the attack-type data are concentrated between 0.4 and 0.6, but the Generic attack-type data are concentrated between 0.7 and 0.9, and the normal-type data are concentrated between 0.1 and 0.3.
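As an illustrative sketch of the normalization stage of Step 2, assuming simple column-wise min-max scaling (the paper's exact preprocessing pipeline may differ):

```python
import numpy as np

def min_max_normalize(X):
    """Column-wise min-max scaling to [0, 1], one possible form of the
    normalization stage of Step 2."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (X - lo) / span

X = np.array([[0.1, 3.0], [0.5, 7.0], [0.9, 5.0]])   # toy two-feature sample
Xn = min_max_normalize(X)
print(Xn)
```

Character-type features and the "-" placeholder mentioned earlier would need to be encoded numerically before such scaling can be applied.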

Step 3. Calculate the Pearson coefficient values and the Gini indexes for the standardized data. In the experiment, the Pearson coefficient values and the Gini indexes for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections containing the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the initial Pearson correlation threshold is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with values less than 0.5 are retained.
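One plausible reading of this filtering rule, dropping one feature of each pair whose |Pearson r| exceeds 0.5 while keeping a representative, can be sketched as follows (the feature names and data here are synthetic):

```python
import numpy as np

def pearson_filter(X, names, threshold=0.5):
    """Greedily keep a feature only if its |Pearson r| with every
    already-kept feature stays at or below the threshold."""
    r = np.corrcoef(X, rowvar=False)
    keep = []
    for j in range(len(names)):
        if all(abs(r[j, k]) <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

rng = np.random.default_rng(3)
spkts = rng.standard_normal(500)
sloss = 2 * spkts + 0.1 * rng.standard_normal(500)   # strongly correlated pair
rate = rng.standard_normal(500)                      # independent feature
X = np.column_stack([spkts, sloss, rate])
print(pearson_filter(X, ["spkts", "sloss", "rate"]))   # ['spkts', 'rate']
```

The redundant sloss column is pruned because it is almost a linear copy of spkts, while the uncorrelated rate column survives.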

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing strong positive correlation. By contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of the Gini coefficient, the smaller the Gini value of a feature, the lower the impurity of the feature in the dataset and the better its training effect.

Based on the Pearson and Gini coefficient results for feature selection on the UNSW_NB15 dataset, this paper finally selected five important features for model classification: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, in ms), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set initially in the experiment; the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (the columns shown include dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).

Figure 8: Normalized data distribution across the ten classes (Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal).


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.

Generally speaking, for the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase indefinitely; it first rises and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of the ML-ESN is that the reservoirs generate a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 with a mean of 0, which is conducive to training efficiency. Moreover, when features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies the feature effect.

The output layer uses the sigmoid activation function because its output lies between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important for tuning the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10^−6, respectively, based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (a correlation heatmap over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).


The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU at 1.7 GHz.

6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small or a large data sample, the classification effect without filtering is lower than with filtering.

In addition, using a single filtering method is not as good as using the combination of the two. For example, on the 160000 training packets, when no filtering is used, the recognition accuracy for abnormal traffic is only 0.94; with only the Pearson index, the accuracy is 0.95; with only the Gini index, 0.97; and with the combination of the Pearson and Gini indexes, the accuracy reaches 0.99.

6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters features using the Pearson and Gini indexes and then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).

Table 4: The parameters of the ML-ESN experiment.

Parameters                   Values
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      Tanh
Output layer activation fn   Sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10^−6


algorithm to learn, and then uses the test data to verify the trained model, obtaining the test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

The detection results in Figure 12 show that it is entirely feasible to use the ML-ESN network learning model, with the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization, to quickly classify anomalous network traffic attacks.

The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy at the same depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases; for example, with 1000 neurons, the time consumption at reservoir depth 5 is 211 ms, while at depth 3 it is only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time it consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, at reservoir depth 3 with 1000 neurons, the detection accuracy is 0.96, while at depth 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy drops to 0.95.

The main reason for this phenomenon is that, at the beginning, as the training depth increases, the model parameters are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain amount of overfitting occurs, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree takes the least, only 0.0013 seconds, and BP takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree reaches only 0.77. These results show that, after self-learning, the proposed method has good detection ability for different attack types.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification effect of the different filtering methods (None, Pearson, Gini, and Pearson + Gini) as the training data size grows from 20000 to 160000 packets.


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of features A and B are concentrated near 50; for feature A in particular, the values hardly exceed 60. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper compares simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment uses five test datasets of different scales, containing 5000, 20000, 60000, 120000, and 160000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly: the GaussianNB accuracy falls below 50%, and the other algorithms are close to 80%.

By contrast, the ML-ESN algorithm has a lower accuracy on small-sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-record dataset, the accuracy reached 96.75%, and on the 160000-record dataset, 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and overall performance will not be optimal.

In order to further verify the performance of the ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments use ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR

Figure 12: Classification results of the ML-ESN method across the nine attack types (accuracy, F1-score, and FPR per attack type; accuracy and F1-score range from 0.94 to 1.0, and FPR from 0.01 to 0.02).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2-5 with 500, 1000, and 2000 neurons; (b) accuracy for the same settings; (c) accuracy and time of BP, ESN, DecisionTree, and ML-ESN.

Figure 14: Distribution map of the first two statistical characteristics (features A and B) over the packages.


(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
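A minimal sketch of how a ROC curve and its AUC can be computed from anomaly scores; the scores and labels below are synthetic, and production code would typically use a library routine such as scikit-learn's roc_curve:

```python
import numpy as np

def roc_auc(scores, labels):
    """Sort flows by descending anomaly score, sweep the threshold to build
    the ROC curve (FPR on the x-axis, TPR on the y-axis), and integrate the
    area under it with the trapezoid rule."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    y = np.asarray(labels)[order]
    tpr = np.concatenate([[0.0], np.cumsum(y) / y.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(1 - y) / (1 - y).sum()])
    auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)
    return fpr, tpr, auc

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]   # hypothetical anomaly scores
labels = [1, 1, 0, 1, 0, 0]                # 1 = attack, 0 = normal
fpr, tpr, auc = roc_auc(scores, labels)
print(round(float(auc), 3))   # 0.889
```

An AUC of 1.0 corresponds to a perfect ranking of attacks above normal flows, which is why the per-attack AUC values in Figures 16-19 are the comparison criterion.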

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.

From the experimental results in Figures 16-19, it can be seen that, for the classification and detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

Figure 15: Detection results of the different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes (20000 to 160000).

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).


the other attack types are 99%. By contrast, with the single-layer ESN algorithm, the best detection success rate is only 97%, and typical rates are around 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally below 80%, and its false-positive rate approaches 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises some open problems in AMI network security, and introduces the contributions of existing researchers in this area.

Secondly, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using the ML-ESN's powerful self-learning, storage, and memory capabilities to classify unknown and abnormal AMI network attacks accurately and quickly; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, there are still some issues in this paper that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors of this article suggest that, before analyzing the network flow, it is best to perform a certain amount of multi-collection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of our next work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) studying the special protocols of the AMI network and establishing an optimized ML-ESN network traffic deep learning model that is more in line with actual AMI applications, so as to apply it to actual industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.


[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96–111, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350–355, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] "Data standardization," Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.



prediction of one-dimensional time series, ESN has a very good advantage [32].

Because ESN has such advantages, it is increasingly used by researchers to analyze and predict network attacks [33, 34].

Saravanakumar and Dharani [33] applied the ESN method to a network intrusion detection system, tested it on the KDD standard dataset, and found that the method converges faster and performs better in IDS.

At present, some researchers have found through experiments that the single-layer echo state network still has some problems: (1) model training is limited because only the output weights can be adjusted; (2) the randomly generated reservoir has nothing to do with the specific problem, and its parameters are difficult to determine; and (3) the degree of coupling between neurons in the reservoir is high. Therefore, applying the echo state network to AMI network traffic anomaly detection requires improvement and optimization.

From the previous review, we can see that traditional AMI network attack analysis methods are mainly classification-based, statistics-based, cluster-based, and information-theoretic (entropy-based). In addition, different deep learning methods are constantly being tried and applied.

The above methods have different advantages and disadvantages for different research objects and purposes. This article focuses on making full use of the advantages of the ESN method and on solving the problem that a single-layer ESN network cannot be directly applied to complex AMI network traffic detection.

3. AMI Network Architecture and Security Issues

The AMI network is generally divided into three network layers, from the bottom up: the home area network (HAN), the neighborhood area network (NAN), and the wide area network (WAN). The hierarchical structure is shown in Figure 1.

In Figure 1, the HAN is a network formed by the interconnection of all electrical equipment in the home of a grid user, and its gateway is a smart meter. The neighborhood network is formed by multiple home networks through communication interconnection between smart meters, or between smart meters and repeaters. Multiple NANs can form a field area network (FAN) through communication interconnections such as wireless mesh networks, WiMAX, and PLC, aggregating data to the FAN's area data concentrator. Many NANs and FANs are interconnected through switches or routers to form a WAN, which communicates with the power company's data and control centers.

The reliable deployment and safe operation of the AMI network are the foundation of the smart grid. Because the AMI network is an information-physical-social multidomain converged network, its security requirements include not only information and network security but also the security of physical equipment and personnel [35].

As Fadwa and Zeyar [20] mention, AMI faces various security threats such as privacy disclosure, monetary gain, energy theft, and other malicious activities. Since AMI is directly related to revenue, customer power consumption, and privacy, the most important thing is to protect its infrastructure.

Researchers generally believe that AMI security detection, defense, and control rely on three stages of implementation. The first is prevention, including security protocols, authorization and authentication technologies, and firewalls. The second is detection, including IDS and vulnerability scanning. The third is reduction or recovery, that is, recovery activities after an attack.

4. Proposed Security Solution

At present, a large number of security detection devices, such as firewalls, IDS, bastion hosts, and vertical isolation devices, have been deployed in China's power grid enterprises. These devices provide certain areas with security detection and defense capabilities, but they bring some problems: (1) the devices generally operate independently and do not cooperate with each other; (2) each device generates a large number of log and traffic files, and the file formats are not uniform; and (3) no unified traffic analysis platform has been established.

To solve the above problems, this paper proposes the following solution: first, rely on traffic probes to collect AMI network traffic in real time; second, have each traffic probe upload a traffic file in a unified standard format to the control center; and finally, analyze network flow anomalies in real time to improve the security detection and identification capabilities of AMI.

As shown in Figure 2, we deploy traffic probes on important network nodes to collect real-time network flow information for all nodes.

Of course, many domestic and foreign power companies have not established a unified information collection and standardization process. In this case, data can also be processed per device and per area. For example, to use data collected from different devices, one can perform preprocessing such as data cleaning, data filtering, and data completion before analysis, then use the Pearson and Gini coefficient methods described in this article to find important feature correlations; using the ML-ESN algorithm to classify abnormal network attacks remains feasible.

The main reasons for adopting standardized processing are as follows:

(1) It improves the centralized processing and visual display of network flow information.

(2) It partly eliminates and overcomes the problem of inadequate information collection caused by a single device or too few devices.

(3) It uses multiple devices to collect information and standardizes the process to improve information fusion, so as to enhance the accuracy and robustness of classification.

Other power companies that have not performed centralized and standardized processing can establish corresponding data preprocessing mechanisms and machine learning classification algorithms according to their actual conditions.

The goal is the same as in this article: to quickly find abnormal network attacks in massive network flow data.

4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, the information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.

Metadata are composed of strings: each information element occupies a fixed position in the string, the positions are separated by "^", and the last string is also terminated by "^". An information element that is absent from a metadata record is represented as follows: if an information element defined below exists in the schema but the corresponding position does not need to be filled in, the two "^" delimiters are adjacent. If an extracted information element itself contains a caret, it needs to be escaped with the escape string. Part of the real probe stream data is shown in Figure 3.

The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^2019-07-29T03:08:23.969^^^TCP^^^10107110^1010721241^^^".

Part of the above probe flow is explained as follows, according to the metadata standard definition: (1) 6: metadata

Figure 2: Traffic probe simple deployment diagram. (The figure shows flow probes deployed between smart electric meters, data concentrators, a firewall, and the data processing center serving electricity users.)

Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT-translated source IP
20  NatSrcPort        String    1024    NAT-translated source port
21  NatDestIP         IPstring  1024    NAT-translated destination IP
22  NatDestPort       String    1024    NAT-translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol

Figure 1: AMI network layered architecture [35]. (The figure shows smart meters, repeaters, smart home applications, energy storage, and PHEV/PEV devices in the HAN, connected via ZigBee, Bluetooth, RFID, or PLC; NAN mesh networks using Wi-Fi, WiMAX, or PLC with data concentrators; and a WAN using fiber optic, WiMAX, satellite, or BPL links to the utility centre.)


version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol, i.e., TCP.
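As an illustration only (not the authors' tooling), a caret-delimited record of the kind shown in Figure 3 can be split into named fields; the function and field names below are hypothetical, and the positions follow the explanation of the first record above:

```python
def parse_probe_record(record: str) -> dict:
    """Split a caret-delimited probe-stream record into named fields.

    Field positions follow the example record explained above: version,
    metadata ID, source IP, destination IP, source port, destination
    port, protocol number. Two adjacent '^' yield an empty string,
    matching the "absent information element" convention.
    """
    fields = record.split("^")
    return {
        "version": fields[0],
        "metadata_id": fields[1],
        "src_ip": fields[2],
        "dest_ip": fields[3],
        "src_port": fields[4],
        "dest_port": fields[5],
        "protocol": fields[6],
    }

# First record from Figure 3, truncated to the fields explained above.
rec = "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1"
parsed = parse_probe_record(rec)
# parsed["protocol"] is "6", i.e., TCP
```

A real parser would also map the remaining positions to the Table 1 information elements and unescape embedded carets.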

4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of elements, and, as can be seen from the data in Figure 3, not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of individual metadata elements could not be reflected, and second, the analysis data would have a particularly high dimensionality, resulting in particularly long computation times. Therefore, the original probe stream metadata cannot be used directly; they need further preprocessing and analysis.

In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic, using multilayer echo state networks to classify probe flows and determine the type of network attack. The specific implementation framework is shown in Figure 4.

The framework mainly includes three processing stages, as follows:

Step 1: collect network flow metadata information in real time through the network probe flow collection devices deployed in different areas.

Step 2: first, aggregate the collected network flow metadata by time series or by segment to obtain the statistical characteristics of each part of the network flow. Second, standardize the resulting characteristic values according to established data standardization guidelines. Finally, in order to quickly find the important features, and the correlations between them, that reflect network attack anomalies, further filter the standardized features.

Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part as test data. Cross-validation is performed on the two sets of data to check the correctness and performance of the proposed model.
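The three steps can be sketched as a processing skeleton; the stage functions here are hypothetical stand-ins for the components described in this article (statistics extraction, standardization, feature filtering, and the ML-ESN classifier), not the authors' implementation:

```python
def detect(flows, model, extract_stats, standardize, select_features):
    """Skeleton of the three-stage framework.

    flows           -- Step 1 output: collected network flow records
    extract_stats   -- Step 2a: per-flow statistical characteristics
    standardize     -- Step 2b: e.g., Z-score normalization
    select_features -- Step 2c: e.g., Pearson/Gini-based filtering
    model           -- Step 3: a trained classifier (here any callable)
    """
    features = [extract_stats(flow) for flow in flows]   # Step 2a
    features = standardize(features)                     # Step 2b
    features = select_features(features)                 # Step 2c
    return [model(f) for f in features]                  # Step 3

# Toy usage with trivial stand-in stages.
labels = detect(
    [[1], [-2]],
    model=lambda f: f[0] > 0,
    extract_stats=lambda flow: [flow[0] * 2],
    standardize=lambda fs: fs,
    select_features=lambda fs: fs,
)
```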

4.3. Feature Extraction. Generally speaking, to classify and identify network traffic, it is necessary to capture the traffic and statistical behavior characteristics that distinguish different network attack behaviors.

Network traffic [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same five-tuple within a limited time, including the sum of the data characteristics carried by the related data in the set.

As is well known, some simple characteristics can be extracted directly from network traffic, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.

In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.

Firstly, network packets are aggregated into network flows, that is, each network flow is distinguished according to the network behavior that generated it. Secondly, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of network flows.

In [36], 22 statistical features of malicious code attacks are extracted, mainly including the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward-to-backward packet ratio.

Statistical characteristics of time: duration, and the maximum, minimum, average, and standard deviation of forward and backward packet intervals.

In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:

6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^

6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^

6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^

6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^

6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^

6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^

6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^

6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^

6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^

6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^

6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^

Figure 3: Part of the real probe stream data.


Time interval: maximum, minimum, average interval time, and standard deviation.
Packet size: maximum, minimum, average size, and packet distribution.
Number of data packets: outgoing and incoming.
Data amount: input bytes and output bytes.
Stream duration: duration from start to end.

Some of the main features of network traffic extracted in this paper are shown in Table 2.
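To make the statistics above concrete, the following sketch computes a handful of such per-flow features from per-direction packet sizes and timestamps; the feature names echo the style of Table 2 but are illustrative, not the paper's exact feature set:

```python
import statistics


def flow_statistics(fwd_sizes, bwd_sizes, fwd_times, bwd_times):
    """Per-flow statistics: size stats, inter-arrival stats, counts, duration.

    fwd_sizes / bwd_sizes -- packet sizes (bytes) per direction
    fwd_times / bwd_times -- packet timestamps (seconds) per direction
    """
    def four(values):
        # max / min / average / standard deviation, as listed above
        return (min(values), max(values), statistics.mean(values),
                statistics.pstdev(values))

    def gaps(times):
        # inter-arrival intervals; a single packet yields a zero gap
        return [b - a for a, b in zip(times, times[1:])] or [0.0]

    feats = {}
    (feats["min_fpkt"], feats["max_fpkt"],
     feats["mean_fpkt"], feats["std_fpkt"]) = four(fwd_sizes)
    (feats["min_bpkt"], feats["max_bpkt"],
     feats["mean_bpkt"], feats["std_bpkt"]) = four(bwd_sizes)
    (feats["min_fiat"], feats["max_fiat"],
     feats["mean_fiat"], feats["std_fiat"]) = four(gaps(fwd_times))
    (feats["min_biat"], feats["max_biat"],
     feats["mean_biat"], feats["std_biat"]) = four(gaps(bwd_times))
    feats["total_fpackets"] = len(fwd_sizes)
    feats["total_bpackets"] = len(bwd_sizes)
    feats["total_fvolume"] = sum(fwd_sizes)
    feats["total_bvolume"] = sum(bwd_sizes)
    feats["duration"] = (max(fwd_times + bwd_times)
                         - min(fwd_times + bwd_times))
    return feats


feats = flow_statistics([40, 40, 80], [40, 44], [0.0, 0.1, 0.3], [0.05, 0.2])
```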

4.4. Feature Standardization. Because the various attributes of the power probe stream contain values of different data types, and the differences between these values are relatively large, the stream cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced-data elimination.

At present, the main feature standardization methods are [38] Z-score, min-max, and decimal scaling.

Because some fields may contain nondigital data, such as protocol names, IP addresses, and TCP flags, these data cannot be standardized directly, so nondigital data first need to be converted to numeric form. For example, the character string "dhcp" is changed to the value "1".

In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is given by the following formula:

$$x' = \frac{x - \bar{x}}{\delta}, \qquad (1)$$

where $\bar{x}$ is the mean of the original data and $\delta$ is the standard deviation of the original data, $\delta = \sqrt{\left((x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right)/n}$, with $n$ the number of samples per feature.
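A minimal sketch of formula (1), assuming each feature is standardized column by column (pure Python, no library dependencies; the constant-column fallback is our own choice, not specified in the paper):

```python
def z_score(values):
    """Standardize one feature column per formula (1): x' = (x - mean) / std."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    if std == 0:  # constant feature: map everything to zero (assumption)
        return [0.0] * n
    return [(v - mean) / std for v in values]


# A standardized column has zero mean and unit (population) variance.
scaled = z_score([40.0, 40.0, 28.0, 80.0])
```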

4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and precisely find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is a currently popular feature-filtering approach: it regards features as independent objects, evaluates the importance of each feature according to quality metrics, and selects the important features that meet the requirements.

At present, there are many data correlation methods. The more commonly used methods are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.

Because the power probe flow contains many statistical characteristics, and the salient characteristics of different types of attacks differ, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain, in order to quickly locate the important characteristics of different attacks.

The Pearson coefficient is used to calculate the correlation of the feature data. The main reason is that the Pearson coefficient is efficient and simple to compute, making it well suited for real-time processing of large-scale power probe streams.

The Pearson correlation coefficient mainly reflects the linear correlation between two random variables (x, y); its calculation ρxy is given by the following formula:

$$\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E\left[(x - u_x)(y - u_y)\right]}{\sigma_x \sigma_y}, \qquad (2)$$

where cov(x, y) is the covariance of x and y, σx is the standard deviation of x, and σy is the standard deviation of y. Estimating the covariance and standard deviations from a sample yields the sample Pearson correlation coefficient, usually denoted r:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}, \qquad (3)$$

where n is the number of samples, xi and yi are the observations at point i for variables x and y, x̄ is the sample mean of x, and ȳ is the sample mean of y. The value of r lies between −1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when

Figure 4: Proposed AMI network traffic detection framework. (Probes collect flow traffic; statistical flow characteristics are extracted, standardized, and filtered; a multilayer echo state network is constructed; and classification, verification, and performance evaluation follow.)

Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
…   …               …
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Standard deviation of backward packet time intervals
31  duration        Network flow duration


the value is −1, there is a completely negative correlation between the two random variables; and when the value is 0, the two random variables are linearly independent.

Because the Pearson method can only detect the linear relationship between features and classification categories, the nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data-distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.

In the classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is P_i, the Gini index of the probability distribution is defined as follows [39]:

$$ \mathrm{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2 \quad (4) $$

Given the sample set D, the Gini coefficient is expressed as follows:

$$ \mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2 \quad (5) $$

where C_k is the subset of samples in D belonging to the kth class and K is the number of classes.
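Equation (5) can be sketched in a few lines. The following is a minimal illustration (not the authors' code) of the Gini impurity of a label set:

```python
from collections import Counter

def gini_index(labels):
    """Gini coefficient of a sample set D, equation (5):
    Gini(D) = 1 - sum_k (|C_k| / |D|)^2, where C_k is the subset of class k."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# A pure set has Gini 0; an evenly split binary set has Gini 0.5.
print(gini_index(["normal"] * 10))                  # -> 0.0
print(gini_index(["normal"] * 5 + ["attack"] * 5))  # -> 0.5
```

A smaller Gini value means lower impurity, which is exactly the criterion used later for feature ranking.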

5. ML-ESN Classification Method

The ESN is a new type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has 3 layers: the input layer, the hidden layer (reservoir), and the output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

$$ u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T, \quad x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T, \quad y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T. \quad (6) $$

W_in (N × K) represents the connection weights from the input layer to the reservoir, W (N × N) represents the connection weights from x(t − 1) to x(t), W_out (L × (K + N + L)) represents the connection weights from the reservoir to the output layer, and W_back (N × L) represents the connection weights from y(t − 1) to x(t); this last matrix is optional.

When u(t) is input, the updated state equation of the reservoir is given by

$$ x(t+1) = f\big(W_{in}\, u(t+1) + W\, x(t)\big) \quad (7) $$

where f is the selected activation function and f′ is the activation function of the output layer. Then the output state equation of the ESN is given by

$$ y(t+1) = f'\big(W_{out}\, [u(t+1); x(t+1)]\big) \quad (8) $$
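Equations (7) and (8) can be sketched directly. The following is a minimal illustration (sizes, weight ranges, and the spectral-radius value 0.9 are assumptions for the example, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, L = 5, 100, 10                         # input, reservoir, output sizes (illustrative)

W_in = rng.uniform(-0.5, 0.5, (N, K))        # input -> reservoir
W = rng.uniform(-0.5, 0.5, (N, N))           # reservoir -> reservoir (recurrent)
W /= np.max(np.abs(np.linalg.eigvals(W)))    # normalize spectral radius to 1
W *= 0.9                                     # echo-state condition: radius < 1
W_out = rng.uniform(-0.5, 0.5, (L, K + N))   # [input; state] -> output

def esn_step(x, u):
    """Equation (7): x(t+1) = f(W_in u(t+1) + W x(t)), with f = tanh."""
    return np.tanh(W_in @ u + W @ x)

def esn_output(x, u):
    """Equation (8): y(t+1) = f'(W_out [u(t+1); x(t+1)])."""
    return np.tanh(W_out @ np.concatenate([u, x]))

x = np.zeros(N)                              # zero initial reservoir state
u = rng.standard_normal(K)
x = esn_step(x, u)
y = esn_output(x, u)
print(y.shape)   # -> (10,)
```

Note that only W_out is trained; W_in and W stay fixed after initialization, which is what removes the need for back-propagation.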

Researchers have found through experiments that the reservoir of a traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome these existing problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]

$$ x_1(n+1) = f\big(W_{in}\, u(n+1) + W_1\, x_1(n)\big), \quad x_k(n+1) = f\big(W_{inter}^{(k-1)}\, x_{k-1}(n+1) + W_k\, x_k(n)\big), \quad x_M(n+1) = f\big(W_{inter}^{(M-1)}\, x_{M-1}(n+1) + W_M\, x_M(n)\big). \quad (9) $$

The ML-ESN output is then calculated from the final-layer state in formula (9):

$$ y(n+1) = f_{out}\big(W_{out}\, x_M(n+1)\big) \quad (10) $$

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system suffers an abnormal attack, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations will occur.

Figure 5: ESN basic model (input layer U(t) → reservoir x(t) via W_in, recurrent weights W, output layer y(t) via W_out, with optional feedback W_back).


It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (the reservoir) composed of neurons as the processing medium for the data; the input feature value set is then mapped from the low-dimensional input space to the high-dimensional state space. Finally, the network is trained on the high-dimensional state space using linear regression or similar methods.

However, in the ESN network, the number of neurons in the reservoir is difficult to balance. If the number of neurons is relatively large, the fitting effect is weakened; if the number of neurons is relatively small, the generalization ability cannot be guaranteed. Therefore, it is not suitable for directly classifying AMI network traffic anomalies.

On the contrary, the ML-ESN network model can satisfy the echo-state condition of the internal network by adding multiple reservoirs when the size of a single reservoir is small, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deep structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of the feature values is not uniform. For example, most of the data are of numerical type, but some features contain character types and the special symbol "-", so the data cannot be used directly for processing. Before processing, the data are therefore standardized; some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balance score), to evaluate the experimental results. Their calculation formulas are as follows:

$$ \mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \mathrm{FPR} = \frac{FP}{FP + TN}, \quad \mathrm{TPR} = \frac{TP}{FN + TP}, $$

$$ \mathrm{precision} = \frac{TP}{TP + FP}, \quad \mathrm{recall} = \frac{TP}{FN + TP}, \quad F\text{-}\mathrm{score} = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}. \quad (11) $$

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected
TN (true negative): the number of normal network traffic flows successfully detected

Figure 6: ML-ESN basic model (input layer U(t) → Reservoir 1 … Reservoir M via W_in and W_inter, internal weights W_1 … W_M, states x_1 … x_M, output y(t) via W_out).


FP (false positive): the number of normal network traffic flows identified as abnormal
FN (false negative): the number of abnormal network traffic flows identified as normal
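The indicators in equation (11) follow directly from these four counts. The following is a minimal sketch (the counts are made-up example numbers, not experimental results):

```python
def metrics(tp, tn, fp, fn):
    """Evaluation indicators from equation (11); FPR uses FP / (FP + TN)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                    # identical to TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, precision, recall, f_score

# Hypothetical confusion counts for illustration only.
acc, fpr, prec, rec, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
print(round(acc, 3), round(fpr, 3), round(f1, 3))   # -> 0.925 0.05 0.923
```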

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. With the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56,000             3.63
2   Analysis        1,560              0.108
3   Backdoors       1,746              0.36
4   DoS             12,264             2.42
5   Exploits        33,393             8.31
6   Fuzzers         18,184             4.62
7   Generic         40,000             6.69
8   Reconnaissance  10,491             2.42
9   Shellcode       1,133              0.28
10  Worms           130                0.044

Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initially set the parameters of the ML-ESN and determine the corresponding numbers of input and output units according to the dataset:
  (i) set the training data length trainLen;
  (ii) set the test data length testLen;
  (iii) set the number of reservoirs Ri;
  (iv) set the number of neurons in each reservoir N;
  (v) set the reservoir update rate α;
  (vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the external connection weights between reservoirs W_inter:
  (i) randomly initialize the values of W_in, W_i, and W_inter;
  (ii) through statistical normalization and spectral radius calculation, W_inter and W_i are scaled to meet the sparsity requirements. The calculation formulas are W_i = α(W_i/|λ_in|) and W_inter = α(W_inter/|λ_inter|), where λ_in and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables by using equation (9), and input them into the activation function of the reservoir processing units to obtain the final state variables:
  (i) for t from 1 to T, compute x_1(t):
    (a) calculate x_1(t) according to equation (7);
    (b) for i from 2 to M, calculate x_i(t) according to equations (7) and (9);
    (c) collect the matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix W_out from the reservoir to the output layer to obtain the trained ML-ESN network structure:
  (i) W_out = D Hᵀ(H Hᵀ + βI)⁻¹, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state-collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
  (i) select the SoftMax activation function and calculate the output f_out value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
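The steps of Algorithm 1 can be sketched end to end. The following is a compact illustration only: the class name `MLESN`, the toy sizes, the uniform weight initialization, and the plain linear readout (the paper's SoftMax output step is omitted for brevity) are all assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(50)

class MLESN:
    """Minimal multilayer ESN sketch following Algorithm 1 (illustrative sizes)."""

    def __init__(self, n_in, n_out, n_res=100, n_layers=3, alpha=0.9, beta=1e-6):
        self.beta, self.n_layers, self.n_res = beta, n_layers, n_res
        # Step (2): random init, then spectral-radius scaling W <- alpha * W / |lambda|
        scale = lambda w: alpha * w / np.max(np.abs(np.linalg.eigvals(w)))
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        self.W = [scale(rng.uniform(-0.5, 0.5, (n_res, n_res)))
                  for _ in range(n_layers)]            # internal weights W_i
        self.W_inter = [scale(rng.uniform(-0.5, 0.5, (n_res, n_res)))
                        for _ in range(n_layers - 1)]  # inter-reservoir weights
        self.W_out = None

    def _states(self, U):
        """Step (3): run equation (9) over the sequence, collect matrix H."""
        xs = [np.zeros(self.n_res) for _ in range(self.n_layers)]
        H = []
        for u in U:
            xs[0] = np.tanh(self.W_in @ u + self.W[0] @ xs[0])
            for k in range(1, self.n_layers):
                xs[k] = np.tanh(self.W_inter[k - 1] @ xs[k - 1] + self.W[k] @ xs[k])
            H.append(np.concatenate([xs[-1], u]))      # [state; input]
        return np.array(H).T

    def fit(self, U, D):
        """Step (4): ridge-regression readout W_out = D H^T (H H^T + beta I)^-1."""
        H = self._states(U)
        self.W_out = D.T @ H.T @ np.linalg.inv(H @ H.T + self.beta * np.eye(H.shape[0]))
        return self

    def predict(self, U):
        """Steps (5)-(6), simplified to a linear readout (equation (10))."""
        return (self.W_out @ self._states(U)).T

# Toy usage: 5 features, 2 one-hot classes, 200 samples of random data.
U = rng.standard_normal((200, 5))
D = np.eye(2)[rng.integers(0, 2, 200)]
model = MLESN(n_in=5, n_out=2).fit(U, D)
print(model.predict(U).shape)   # -> (200, 2)
```

The key property, as the text notes, is that only W_out is solved for; no back-propagation through the reservoirs is needed.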


Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, data deduplication, data completion, and data normalization to obtain normalized and standardized data; the standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.

As can be seen from Figure 8, after normalizing the data, most of the attack-type data are concentrated between 0.4 and 0.6, the Generic attack-type data are concentrated between 0.7 and 0.9, and the normal-type data are concentrated between 0.1 and 0.3.
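The normalization part of Step 2 can be sketched as a per-column min-max scaling; the following is a minimal illustration with made-up numbers (the cleaning, deduplication, and completion operations are omitted for brevity):

```python
import numpy as np

def minmax_normalize(X):
    """Min-max normalize each column to [0, 1], as in the Step 2 preprocessing."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (X - lo) / span

# Two hypothetical feature columns, e.g. duration and source bytes.
X = np.array([[0.1, 100.0],
              [0.5, 300.0],
              [0.9, 500.0]])
print(minmax_normalize(X))
# -> [[0.  0. ]
#     [0.5 0.5]
#     [1.  1. ]]
```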

Step 3. Calculate the Pearson coefficient values and the Gini index values for the standardized data. In the experiment, the Pearson coefficient values and the Gini index values for the UNSW_NB15 standardized data are shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching a value of 0.97, while the correlation between spkts and ct_srv_src (the number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the beginning but to retain the distribution of the original data as much as possible, the initial value of the Pearson correlation coefficient threshold is set to 0.5. Features with a Pearson value greater than 0.5 are discarded, and features with a value less than 0.5 are retained.

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, are equal to 1. From the principle of Gini coefficients, it is known that the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.

Based on the results of Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (msec)), and dtcpb (destination TCP base sequence number).
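The Pearson half of this filtering step can be sketched as a greedy pass over the pairwise correlation matrix, dropping one feature of each highly correlated pair. The following is an illustration on synthetic data (the 0.5 threshold matches the text; the column construction is made up):

```python
import numpy as np

def select_features(X, threshold=0.5):
    """Keep a column only if its |Pearson r| with every already-kept column
    stays at or below `threshold` (0.5 in the paper's setting)."""
    r = np.corrcoef(X, rowvar=False)     # pairwise Pearson matrix
    keep = []
    for j in range(X.shape[1]):
        if all(abs(r[j, k]) <= threshold for k in keep):
            keep.append(j)
    return keep

rng = np.random.default_rng(1)
a = rng.standard_normal(500)
b = 2 * a + 0.01 * rng.standard_normal(500)  # nearly a duplicate of column a
c = rng.standard_normal(500)                 # independent column
X = np.column_stack([a, b, c])
print(select_features(X))   # -> [0, 2]  (b is redundant with a and is dropped)
```

The surviving columns would then be ranked by their Gini values, as described above, to pick the final feature set.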

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment, and the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, dbytes).

Figure 8: Normalized data distribution by label (Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, Normal).


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, with the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not keep rising; it increases first and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of the ML-ESN is that the reservoirs generate a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained by linearly combining the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 and the average value of the data is 0, which is more conducive to improving training efficiency. Second, when the characteristics differ significantly, tanh yields a better detection effect. In addition, the neuron fitting training process in the ML-ESN reservoirs continuously amplifies the feature effect.

The output layer uses the sigmoid activation function because the output value of sigmoid is between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important parameters for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, mainly based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1.

The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45:1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, ct_dst_ltm).


The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit) with Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.70 GHz.

6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small data sample or a large data sample, the classification effect without filtering is lower than that with filtering.

In addition, using a single filtering method is not as good as using the combination of the two. For example, with 160,000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used for filtering, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy reaches 0.99.

6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter and then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, djit).

Table 4: The parameters of the ML-ESN experiment.

Parameters                  Values
Input dimension number      5
Output dimension number     10
Reservoir number            3
Reservoir neurons number    1000
Reservoir activation fn     Tanh
Output layer activation fn  Sigmoid
Update rate                 0.9
Random seed                 50
Regularization rate         1.0 × 10⁻⁶


algorithm to learn, and finally uses the test data to verify the trained model, obtaining test results for the different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.

From the detection results in Figure 12, it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks based on the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy value is 0.98, the F1-score value is also 0.98, and the FPR value is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score values reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and numbers of neurons, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 211 ms, while at a reservoir depth of 3 it is only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the training accuracy of the model gradually increases at first; for example, at a reservoir depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons, the detection accuracy is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.

The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the method proposed in this paper has good detection ability for different attack types.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification effect of the different filtering methods (None, Pearson, Gini, Pearson + Gini) under different data sizes.


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; for feature A in particular, the values hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper compares the proposed method against traditional machine learning methods in simulation experiments on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment uses five test datasets of different scales, containing 5000, 20,000, 60,000, 120,000, and 160,000 records, respectively; each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the accuracy of the GaussianNB algorithm falls below 50%, while the other algorithms are very close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy rapidly improves. For example, on the 120,000-record dataset, the accuracy of the algorithm reached 96.75%, and on the 160,000-record dataset, the accuracy reached 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning in order to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of the ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments used ROC (receiver operating characteristic) curves to evaluate performance. A ROC curve is a graph with FPR (false-positive rate) as the horizontal axis and TPR

Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for each attack type).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time, (b) accuracy, and (c) comparison with the BP, DecisionTree, and single-layer ESN methods.

Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B).


(true-positive rate) as the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
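The ROC/AUC construction described above can be sketched directly. The following is a minimal illustration (not the evaluation code used in the paper) that sweeps a decision threshold over anomaly scores and integrates the curve with the trapezoidal rule:

```python
import numpy as np

def roc_auc(scores, labels):
    """Build the ROC curve (FPR on the x-axis, TPR on the y-axis) by sweeping
    a threshold over the anomaly scores, and return its trapezoidal AUC."""
    order = np.argsort(-np.asarray(scores, dtype=float))   # descending scores
    labels = np.asarray(labels)[order]
    P, N = labels.sum(), len(labels) - labels.sum()
    tpr = np.concatenate([[0.0], np.cumsum(labels) / P])
    fpr = np.concatenate([[0.0], np.cumsum(1 - labels) / N])
    auc = np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2)
    return fpr, tpr, auc

# Perfect separation of attacks (label 1) from normal traffic gives AUC = 1.0.
fpr, tpr, auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
print(auc)   # -> 1.0
```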

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.

From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

Figure 15: Detection results of the different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes.

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Analysis ROC curve (area = 099)Backdoor ROC curve (area = 099)Shellcode ROC curve (area = 100)Worms ROC curve (area = 100)

Generic ROC curve (area = 097)Exploits ROC curve (area = 100)

DoS ROC curve (area = 099)Fuzzers ROC curve (area = 099)

Reconnaissance ROC curve (area = 100)

Figure 19 Classification ROC diagram of our ML-ESN algorithm

Analysis ROC curve (area = 095)Backdoor ROC curve (area = 097)Shellcode ROC curve (area = 096)Worms ROC curve (area = 096)

Generic ROC curve (area = 099)Exploits ROC curve (area = 096)

DoS ROC curve (area = 097)Fuzzers ROC curve (area = 087)

Reconnaissance ROC curve (area = 095)

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Figure 17 Classification ROC diagram of BP algorithm

18 Mathematical Problems in Engineering

other attack types are 99 However in the single-layer ESNalgorithm the best detection success rate is only 97 andthe general detection success rate is 94 In the BP algo-rithm the detection rate of the Fuzzy attack type is only 87and the false-positive rate exceeds 20 In the traditionalDecisionTree algorithm its detection effect is the worstBecause the detection success rate is generally less than 80and the false-positive rate is close to 35

7. Conclusion

This article first analyzes the current state of AMI network security research in China and abroad, raises some open problems in AMI network security, and reviews the contributions of existing researchers in this area.

Secondly, in order to solve the problems of low accuracy and high false-positive rates of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using a combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to classify unknown and abnormal AMI network attacks accurately and quickly; and (4) testing and verifying the proposed method on a simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, some issues still need attention and optimization. For example, how can AMI network streaming metadata standards be established that meet the requirements of different countries and regions? At present, due to the complex structure of AMI and other electric-power information networks, it is difficult to form a centralized, unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, we suggest that, before analyzing the network flow, it is best to perform fusion processing across the multiple collection devices to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) unsupervised ML-ESN AMI network traffic classification research, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model better matched to actual AMI applications, so as to apply it in industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15-39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195-205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52-65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1-6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319-1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195-209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474-479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216-226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490-495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029-1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1-15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66-70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148-153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96-111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350-355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70-85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447-489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792-1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730-739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854-3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14-23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859-7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335-352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375-385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366-373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946-959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1-6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1-17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.



… learning classification algorithms according to their actual conditions.

Their goal is the same as that of this article: to quickly find abnormal network attacks in a large volume of network flow data.

4.1. Probe Stream Format Standards and Collection Content. In order to unify the format of the probe stream data, the international IPFIX standard is referenced and the relevant metadata of the probe stream are defined. The metadata include more than 100 different information units. Among them, information units with IDs less than or equal to 433 are clearly defined by the IPFIX standard; the others (IDs greater than or equal to 1000) are defined by us. Some important metadata information is shown in Table 1.

Metadata records are strings in which each information element occupies a fixed position; elements are separated by "^", and the last element is also terminated by "^". If an information element defined for the metadata does not exist in a given record, its position is left empty, meaning that two "^" characters are adjacent. If an extracted information element itself contains a caret, it is escaped with the escape string. Part of the real probe stream data is shown in Figure 3.

The first record in Figure 3 is as follows: "6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^2019-07-29T03:08:23.969^^^TCP^^^10107110^1010721241^^^".

Part of the above probe flow is explained as follows, according to the metadata standard definition: (1) 6: metadata

Figure 2: Simple deployment diagram of traffic probes (electricity users and smart electric meters, flow probes, data concentrator, firewall, and data processing center).

Table 1: Some important metadata information.

ID  Name              Type      Length  Description
1   EventID           String    64      Event ID
2   ReceiveTime       Long      8       Receive time
3   OccurTime         Long      8       Occur time
4   RecentTime        Long      8       Recent time
5   ReporterID        Long      8       Reporter ID
6   ReporterIP        IPstring  128     Reporter IP
7   EventSrcIP        IPstring  128     Event source IP
8   EventSrcName      String    128     Event source name
9   EventSrcCategory  String    128     Event source category
10  EventSrcType      String    128     Event source type
11  EventType         Enum      128     Event type
12  EventName         String    1024    Event name
13  EventDigest       String    1024    Event digest
14  EventLevel        Enum      4       Event level
15  SrcIP             IPstring  1024    Source IP
16  SrcPort           String    1024    Source port
17  DestIP            IPstring  1024    Destination IP
18  DestPort          String    1024    Destination port
19  NatSrcIP          IPstring  1024    NAT-translated source IP
20  NatSrcPort        String    1024    NAT-translated source port
21  NatDestIP         IPstring  1024    NAT-translated destination IP
22  NatDestPort       String    1024    NAT-translated destination port
23  SrcMac            String    1024    Source MAC address
24  DestMac           String    1024    Destination MAC address
25  Duration          Long      8       Duration (seconds)
26  UpBytes           Long      8       Up traffic bytes
27  DownBytes         Long      8       Down traffic bytes
28  Protocol          String    128     Protocol
29  AppProtocol       String    1024    Application protocol

Figure 1: AMI network layered architecture [35] (HAN: Zigbee, Bluetooth, RFID, PLC; NAN: mesh network, Wi-Fi, WiMAX, PLC; WAN: fiber optic, WiMAX, satellite, BPL; linking smart meters, repeaters, smart home applications, energy storage, PHEV/PEV, data concentrators, and the utility centre).


version; (2) 69085d3e5432360300000000: metadata ID; (3) 10107110: source IP; (4) 1010721241: destination IP; (5) 19341: source port; (6) 22: destination port; and (7) 6: protocol (TCP).
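The field layout above can be sketched as a small parser. The helper name and field list are our own (only the first seven positions from the explanation are mapped), and a real parser would also need to handle the caret-escaping described earlier.

```python
# One caret-separated probe stream record from Figure 3.
record = ("6^69085d3e5432360300000000^10107110^1010721241"
          "^19341^22^6^40^1^40^1^1564365874^1564365874^^^^")

# Hypothetical names for the first seven metadata positions.
FIELDS = ["version", "metadata_id", "src_ip", "dest_ip",
          "src_port", "dest_port", "protocol"]

def parse_probe_record(line):
    # Adjacent "^^" delimiters yield empty strings for absent elements.
    values = line.split("^")
    return {name: values[i] if i < len(values) else ""
            for i, name in enumerate(FIELDS)}

parsed = parse_probe_record(record)
print(parsed["src_ip"], parsed["dest_port"], parsed["protocol"])
# protocol number 6 denotes TCP
```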

4.2. Proposed Framework. The metadata of the power probe stream contain hundreds of fields, and as the data in Figure 3 show, not every stream contains all the metadata content. If these data were analyzed directly, first, the importance of individual metadata fields could not be directly reflected, and second, the dimensionality of the analysis data would be particularly high, resulting in very long computation times. Therefore, the raw probe stream metadata cannot be used directly; they require further preprocessing and analysis.

In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows and determine the type of network attack. The specific implementation framework is shown in Figure 4.

The framework mainly includes three processing stages, as follows:

Step 1: collect network flow metadata information in real time through network probe flow collection devices deployed in different areas.

Step 2: first, compute the statistical characteristics of each part of the network flow from the time series or segments of the collected metadata. Second, standardize the statistical feature values according to chosen data standardization guidelines. Finally, filter the standardized features so that the important features reflecting network attack anomalies, and the correlations between them, can be found quickly.

Step 3: establish a multilayer echo state network deep learning model and classify the feature-extracted data, using part as training data and part as test data. Cross-validation is performed on the two sets to check the correctness and performance of the proposed model.

4.3. Feature Extraction. Generally speaking, to classify and identify network traffic, statistical behavior characteristics that distinguish the traffic of different network attack behaviors are needed.

Network traffic [36] refers to the collection of all network data packets exchanged between two network hosts in a complete network connection. According to the currently accepted standard, it is the set of all network data packets sharing the same five-tuple within a limited time, together with the data characteristics carried by the packets in the set.

As is well known, some simple characteristics can be extracted directly from network traffic, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.

In order to reflect the characteristics of different types of network attacks more accurately, it is necessary to aggregate network flows and collect their statistical characteristics.

Firstly, network packets are aggregated into network flows, that is, each network flow is distinguished according to whether it is generated by a different network behavior. Secondly, this paper follows the methods proposed in [36, 37] to extract the statistical characteristics of network flows.

In [36], 22 statistical features of malicious code attacks are extracted, mainly including the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward-to-backward packet ratio.

Statistical characteristics of time: duration; maximum, minimum, average, and standard deviation of forward and backward packet intervals.

In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:

6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^

6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^

6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^

6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^

6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^

6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^

6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^

6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^

6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^

6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^

6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^

Figure 3: Part of the real probe stream data.


Time interval: maximum, minimum, and average interval time, and its standard deviation.

Packet size: maximum, minimum, and average size, and the packet size distribution.

Number of data packets: outgoing and incoming.

Data amount: input bytes and output bytes.

Stream duration: duration from start to end.

Some of the main network traffic features extracted in this paper are shown in Table 2.
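As a sketch of how Table 2-style statistics might be computed for one direction of a flow, the helper below derives packet counts, volume, inter-arrival statistics, and duration; the input tuple format and helper name are our assumptions, not the paper's code.

```python
import statistics

def flow_features(packets):
    """packets: list of (timestamp_seconds, size_bytes) tuples for one direction."""
    sizes = [size for _, size in packets]
    times = sorted(ts for ts, _ in packets)
    iats = [b - a for a, b in zip(times, times[1:])]  # inter-arrival times
    return {
        "total_fpackets": len(packets),                       # packet count
        "total_fvolume": sum(sizes),                          # total bytes
        "max_fiat": max(iats) if iats else 0.0,               # max inter-arrival
        "std_fiat": statistics.pstdev(iats) if iats else 0.0, # iat std deviation
        "duration": times[-1] - times[0] if times else 0.0,   # flow duration
    }

feats = flow_features([(0.0, 40), (0.5, 1500), (2.0, 40)])
print(feats)  # 3 packets, 1580 bytes, duration 2.0 s
```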

4.4. Feature Standardization. Because the various attributes of the power probe stream contain different data types and the differences between their values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and elimination of unbalanced data.

At present, the main feature standardization methods are [38] Z-score, min-max, and decimal scaling.

Because the standard protocol fields may contain nondigital data, such as protocol names, IPs, and TCP flags, and such data cannot be processed directly by standardization, nondigital data must first be converted into digital form; for example, the character string "dhcp" is mapped to the value "1".

In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:

x' = \frac{x - \bar{x}}{\delta},  (1)

where \bar{x} is the mean of the original data and \delta is its standard deviation, \delta = \sqrt{((x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2)/n}, with n the number of samples per feature.
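Formula (1) can be sketched per feature column as a minimal NumPy version; the guard for constant columns is our addition, not part of the formula.

```python
import numpy as np

def z_score(X):
    """Standardize each column of X to zero mean and unit variance (formula (1))."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)      # population standard deviation, as in formula (1)
    std[std == 0] = 1.0      # guard: leave constant columns unscaled
    return (X - mean) / std

# Toy feature matrix: two columns with very different value ranges.
X = np.array([[40.0, 1.0], [1500.0, 6.0], [40.0, 17.0]])
Xn = z_score(X)
print(Xn.mean(axis=0))  # each column mean is ~0 after standardization
```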

4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, the statistical characteristics that characterize network attack behavior must be found quickly and accurately, which is a very difficult problem. The filter method is currently a popular feature-filtering approach: it treats features as independent objects, evaluates their importance according to quality metrics, and selects the important features that meet the requirements.

At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.

Because the power probe flow contains many statistical characteristics and the main characteristics of different attack types differ, this paper filters the network flow characteristics based on the correlation of the statistical feature data and on information gain, in order to quickly locate the important characteristics of different attacks.

The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its calculation is efficient and simple and is therefore well suited to real-time processing of large-scale power probe streams.

The Pearson correlation coefficient mainly reflects the linear correlation between two random variables (x, y); its calculation \rho_{xy} is shown in the following formula:

\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - u_x)(y - u_y)]}{\sigma_x \sigma_y},  (2)

where cov(x, y) is the covariance of x and y, \sigma_x is the standard deviation of x, and \sigma_y is the standard deviation of y. Estimating the covariance and standard deviations from the sample yields the sample Pearson correlation coefficient, usually denoted r:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}},  (3)

where n is the number of samples, x_i and y_i are the observations at point i corresponding to variables x and y, and \bar{x} and \bar{y} are the sample means of x and y. The value of r lies between -1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when

Figure 4: Proposed AMI network traffic detection framework (traffic collection by probes; feature extraction: statistical flow characteristics, standardized features, characteristic filter; classification and evaluation: construction of a multilayer echo state network, verification and performance evaluation).

Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
...
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Standard deviation of backward packet time intervals
31  duration        Network flow duration


the value is -1, there is a completely negative correlation between the two random variables; and when the value is 0, the two random variables are linearly uncorrelated.
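The sample coefficient in formula (3) can be sketched directly in pure Python (an illustration, not the authors' implementation):

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient, formula (3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x)
                    * sum((yi - my) ** 2 for yi in y))
    return num / den if den else 0.0

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0  (completely positive)
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0 (completely negative)
```

In feature filtering, `x` would be one standardized feature column and `y` another feature (or a numeric class label), and features with near-zero |r| against the label are candidates for removal.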

Because the Pearson method can only detect linear relationships between features and classification categories, nonlinear relationships between the two would be lost. In order to further find the nonlinear relationships among the probe flow characteristics, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.

In a classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is P_i, the Gini index of the probability distribution is defined as follows [39]:

\operatorname{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2.  (4)

Given the sample set D, the Gini coefficient is expressed as follows:

\operatorname{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2,  (5)

where C_k is the subset of samples in D belonging to the kth class and K is the number of classes.
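Formulas (4) and (5) reduce to a few lines. This sketch scores the class purity of a labeled sample set, as one might when ranking a candidate feature split; the attack-type labels are invented examples.

```python
from collections import Counter

def gini(labels):
    """Gini index of a labeled sample set D, formula (5)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["DoS"] * 8))             # 0.0: a pure (single-class) set
print(gini(["DoS", "Fuzzers"] * 4))  # 0.5: two classes evenly mixed
```

A lower Gini value after splitting on a feature indicates that the feature separates the attack classes well, which is the sense in which it complements the linear Pearson filter.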

5. ML-ESN Classification Method

The ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction problems. The basic ESN network model is shown in Figure 5.

In this model, the network has three layers: an input layer, a hidden layer (the reservoir), and an output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T,
x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T,
y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T.  (6)

W^{in} (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t - 1) to x(t); W^{out} (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and W^{back} (N × L) represents the connection weights from y(t - 1) to x(t), which are optional.

When u(t) is input, the state update equation of the reservoir is given by

x(t + 1) = f(W^{in} u(t + 1) + W x(t) + W^{back} y(t)),  (7)

where f is the selected activation function of the reservoir and f' is the activation function of the output layer. The output state equation of the ESN is then given by

y(t + 1) = f'(W^{out} [u(t + 1); x(t + 1)]).  (8)

Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome these problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: a single-layer ESN has only one reservoir, whereas a multilayer ESN has more than one. The state update equations of the ML-ESN are given by [41]:

x_1(n + 1) = f(W^{in} u(n + 1) + W_1 x_1(n)),
x_k(n + 1) = f(W^{inter(k-1)} x_{k-1}(n + 1) + W_k x_k(n)),
x_M(n + 1) = f(W^{inter(M-1)} x_{M-1}(n + 1) + W_M x_M(n)).  (9)

The output of the ML-ESN is then calculated from the final reservoir state according to formula (9):

y(n + 1) = f_out(W_out · x_M(n + 1)).    (10)
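The multilayer update in equations (9) and (10) can be sketched as follows (sizes again assumed): reservoir 1 reads the input, and each later reservoir reads its predecessor's freshly updated state through an inter-reservoir weight matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
K, N, M, L = 3, 40, 3, 2                   # input size, neurons per reservoir, reservoirs, outputs

W_in = rng.uniform(-0.5, 0.5, (N, K))
W_res = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M)]        # W_1 .. W_M
W_inter = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M - 1)]  # inter-reservoir weights
W_out = rng.uniform(-0.5, 0.5, (L, N))

for Wm in W_res + W_inter:                  # scale every matrix to spectral radius 0.9
    Wm *= 0.9 / max(abs(np.linalg.eigvals(Wm)))

def ml_esn_step(states, u):
    # Equation (9): x_1 sees the input; x_k sees x_{k-1}(n+1) through W_inter
    new = [np.tanh(W_in @ u + W_res[0] @ states[0])]
    for k in range(1, M):
        new.append(np.tanh(W_inter[k - 1] @ new[k - 1] + W_res[k] @ states[k]))
    return new

states = [np.zeros(N) for _ in range(M)]
for _ in range(10):
    states = ml_esn_step(states, rng.standard_normal(K))

y = W_out @ states[-1]                      # equation (10), with a linear f_out
print(y.shape)                              # (2,)
```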

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system is attacked, the statistical feature entropy value will become abnormal within a certain time range, and even large fluctuations will occur.

Figure 5: ESN basic model (input layer U(t) → reservoir x(t) → output layer y(t), with weights W_in, W, W_out, and optional feedback W_back).

8 Mathematical Problems in Engineering

It can be seen from Figure 5 that ESN is an improved model for training RNNs. The steps are to use a large-scale random sparse network (reservoir) composed of neurons as the processing medium for data information; the input feature value set is then mapped from the low-dimensional input space to the high-dimensional state space. Finally, the network is trained on the high-dimensional state space using linear regression and other methods.

However, in the ESN network, the number of neurons in the reservoir is difficult to balance. If the number of neurons is relatively large, the fitting effect is weakened; if the number of neurons is relatively small, the generalization ability cannot be guaranteed. Therefore, it is not suitable for directly classifying AMI network traffic anomalies.

On the contrary, the ML-ESN network model can satisfy the internal echo state training of the network by adding multiple reservoirs when the size of a single reservoir is small, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deep structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSL-KDD benchmark datasets that were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contained 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of each feature value is not uniform. For example, most of the data are of numerical type, but some features contain character types and the special symbol "-", so the data cannot be used directly. Before processing, the data are standardized; some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balance score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN) / (TP + TN + FP + FN),
FPR = FP / (FP + TN),
TPR = TP / (FN + TP),
precision = TP / (TP + FP),
recall = TP / (FN + TP),
F-score = 2 · precision · recall / (precision + recall).    (11)
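With the four counts below in hand, the indicators in equation (11) reduce to a few lines of code (the counts here are illustrative only):

```python
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                     # share of normal traffic flagged as attack
    precision = tp / (tp + fp)
    recall = tp / (fn + tp)                  # identical to TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

acc, fpr, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
print(round(acc, 3), round(fpr, 3), round(f1, 3))  # 0.925 0.05 0.923
```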

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.

Figure 6: ML-ESN basic model (input layer U(t) → reservoir 1 → reservoir 2 → … → reservoir M → output layer y(t), with weights W_in, W_1…W_M, W_inter, and W_out).


FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID | Type           | Number of packets | Size (MB)
1  | Normal         | 56,000            | 36.3
2  | Analysis       | 1,560             | 0.108
3  | Backdoors      | 1,746             | 0.36
4  | DoS            | 12,264            | 2.42
5  | Exploits       | 33,393            | 8.31
6  | Fuzzers        | 18,184            | 4.62
7  | Generic        | 40,000            | 6.69
8  | Reconnaissance | 10,491            | 2.42
9  | Shellcode      | 1,133             | 0.28
10 | Worms          | 130               | 0.044

Input:
D1: training dataset; D2: test dataset; U(t): input feature value set;
N: the number of neurons per reservoir; Ri: the number of reservoirs;
α: interconnection weight spectral radius.
Output:
Training and testing classification results.
Steps:
(1) Initialize the parameters of ML-ESN and determine the corresponding numbers of input and output units according to the dataset:
(i) set the training data length trainLen;
(ii) set the test data length testLen;
(iii) set the number of reservoirs Ri;
(iv) set the number of neurons per reservoir N;
(v) set the reservoir update speed α;
(vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the external connection weights between reservoirs W_inter:
(i) randomly initialize the values of W_in, W_i, and W_inter;
(ii) through statistical normalization and spectral radius scaling, adjust W_inter and W_i to meet the sparsity requirements; the calculation formulas are W_i = α(W_i/|λ_i|) and W_inter = α(W_inter/|λ_inter|), where λ_i and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables:
(i) for t from 1 to T:
(a) calculate x_1(t) according to equation (7);
(b) for i from 2 to M, calculate x_i(t) according to equations (7) and (9);
(c) get the matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix W_out from the reservoir to the output layer to obtain the trained ML-ESN network structure:
(i) W_out = DH^T(HH^T + βI)^(−1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
(i) select the SoftMax activation function and calculate the output f_out value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifier, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
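Step (4) of Algorithm 1 is ordinary ridge regression in closed form. A minimal sketch with synthetic state and target matrices (all names and sizes here are assumptions, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(2)
n_state, n_out, T = 20, 3, 200
beta = 1e-6                                    # ridge regression parameter from Algorithm 1

H = rng.standard_normal((n_state, T))          # state collection matrix, one column per step
A = rng.standard_normal((n_out, n_state))
D = A @ H                                      # expected outputs, linear in the states by construction

# W_out = D H^T (H H^T + beta I)^-1
W_out = D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(n_state))

print(np.allclose(W_out @ H, D, atol=1e-4))    # True: the readout recovers the targets
```

Because the targets were built to be linear in the states, the recovered W_out reproduces them almost exactly; the β term only guards against an ill-conditioned HH^T.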


Step 2. Perform data preprocessing on the AMI metadata or UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, data deduplication, data completion, and data normalization to obtain normalized and standardized data; standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
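The cleaning, deduplication, completion, and standardization operations of Step 2 can be sketched as follows. The toy records and field names are hypothetical stand-ins, not actual UNSW_NB15 rows:

```python
import numpy as np

# Toy records mimicking mixed-type rows with the special "-" placeholder
raw = [
    {"dur": "0.12", "proto": "tcp", "service": "-",   "sbytes": "496"},
    {"dur": "0.12", "proto": "tcp", "service": "-",   "sbytes": "496"},   # exact duplicate
    {"dur": "1.80", "proto": "udp", "service": "dns", "sbytes": "1024"},
]

# Cleaning + deduplication
seen, rows = set(), []
for r in raw:
    key = tuple(sorted(r.items()))
    if key not in seen:
        seen.add(key)
        rows.append(dict(r))

# Completion: replace the "-" marker, then label-encode the character-type fields
for r in rows:
    for col in ("proto", "service"):
        if r[col] == "-":
            r[col] = "none"
for col in ("proto", "service"):
    codes = {v: i for i, v in enumerate(sorted({r[col] for r in rows}))}
    for r in rows:
        r[col] = codes[r[col]]

# Standardization: z-score the numeric fields
for col in ("dur", "sbytes"):
    vals = np.array([float(r[col]) for r in rows])
    std = vals.std() or 1.0
    for r, z in zip(rows, (vals - vals.mean()) / std):
        r[col] = float(z)

print(len(rows), round(rows[0]["dur"], 6))   # 2 -1.0
```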

As can be seen from Figure 8, after normalizing the data, most of the attack type data are concentrated between 0.4 and 0.6, but Generic attack type data are concentrated between 0.7 and 0.9, and normal type data are concentrated between 0.1 and 0.3.

Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini index values for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching a value of 0.97, while the correlation between spkts and ct_srv_src (number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the beginning but to retain the distribution of the original data as much as possible, the initial Pearson correlation coefficient threshold is set to 0.5. Features with a Pearson value greater than 0.5 are discarded, and features with values less than 0.5 are retained.
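This Pearson-based filtering step can be sketched directly with np.corrcoef. The synthetic features below stand in for spkts, sloss, and rate (they are not the real UNSW_NB15 columns); the 0.5 threshold is the one stated above:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
spkts = rng.poisson(20, n).astype(float)
sloss = 0.3 * spkts + rng.normal(0, 0.5, n)   # deliberately correlated with spkts
rate = rng.normal(0, 1, n)                    # independent of both

X = np.stack([spkts, sloss, rate])            # one feature per row
names = ["spkts", "sloss", "rate"]

corr = np.corrcoef(X)                         # Pearson coefficient matrix
keep = []
for i in range(len(names)):
    # keep a feature only if it is not strongly correlated with one already kept
    if all(abs(corr[i, j]) <= 0.5 for j in keep):
        keep.append(i)

print([names[i] for i in keep])               # ['spkts', 'rate']
```

Of each highly correlated pair, one member survives, so the retained set still covers the original data's distribution.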

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and the ACK packets) all exceed 0.9, showing a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of Gini coefficients, it can be known that the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.
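The Gini principle described above (smaller value, purer feature) can be sketched as a weighted Gini impurity over the label groups a feature induces. This is an illustrative formulation with toy values, not the paper's exact computation:

```python
from collections import Counter, defaultdict

def gini_impurity(labels):
    # 1 - sum(p_i^2): 0 for a pure group, approaching 1 for a highly mixed one
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def feature_gini(values, labels):
    # Weighted impurity of the labels after grouping records by feature value
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())

labels = ["normal", "normal", "dos", "dos"]
informative = [1, 1, 2, 2]        # separates the two classes perfectly
uninformative = [7, 7, 7, 7]      # same value for every record

print(feature_gini(informative, labels), feature_gini(uninformative, labels))  # 0.0 0.5
```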

Based on the results of Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (ms)), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set at the start of the experiment; the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, and dbytes, shown as z-score values).

Figure 8: Normalized data distribution across the class labels (Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal).


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase indefinitely; it increases first and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of ML-ESN is to generate, from the reservoirs, a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained by linearly combining the internal states. In order to increase the complexity of the state space, this article sets the number of neurons per reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 with a mean of 0, which is more conducive to improving training efficiency. Second, when the features show significant differences, tanh yields a better detection effect. In addition, the neuron fitting training process in the ML-ESN reservoirs will continuously amplify the feature effect.

The output layer uses the sigmoid activation function because the output value of sigmoid is between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important parameters for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based mainly on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46:1.

The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45:1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (correlation matrix over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).


The experimental environment was Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether on a small data sample or a large data sample, the classification effect without the filtering technology is lower than that with the filtering technology.

In addition, using a single filtering method is not as good as using a combination of the two. For example, with 160,000 training packets, when no filter method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used for filtering, the accuracy of the model is 0.97; and when the combination of the Pearson index and Gini index is used for filtering, the accuracy of the model reaches 0.99.

6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter, then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).

Table 4: The parameters of the ML-ESN experiment.

Parameter                  | Value
Input dimension number     | 5
Output dimension number    | 10
Reservoir number           | 3
Reservoir neurons number   | 1000
Reservoir activation fn    | Tanh
Output layer activation fn | Sigmoid
Update rate                | 0.9
Random seed                | 50
Regularization rate        | 1.0 × 10⁻⁶


algorithm to learn, and then uses the test data to verify the trained model, obtaining the test results for the different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.

From the detection results in Figure 12, it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks based on the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy value is 0.98, the F1-score value is also 0.98, and the FPR value is very low, only 0.02; in Shellcode and Worms attack type detection, both the accuracy and F1-score values reached 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score value exceeds 0.96.

6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 211 ms, while at a reservoir depth of 3 it is only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, at a reservoir depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons, the detection accuracy is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.

The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification effect of different filtering methods (accuracy versus data size from 20,000 to 160,000 packets for no filtering, Pearson only, Gini only, and Pearson + Gini).


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated at 5.0; in particular, for feature A, the values hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper focuses on comparing simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment focuses on five test datasets of different scales, namely, 5,000, 20,000, 60,000, 120,000, and 160,000 packets, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-packet data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; the GaussianNB algorithm in particular has accuracy rates below 50%, while the other algorithms are very close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and the accuracy of the algorithm improves rapidly. For example, on the 120,000-packet dataset, the accuracy of the algorithm reached 96.75%, and on the 160,000-packet dataset, the accuracy reached 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate performance. A ROC curve is a graph with FPR (false-positive rate) as the horizontal axis and TPR

Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for the attack types Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms; accuracy and F1-score range from 0.94 to 1.0, and FPR from 0.01 to 0.02).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) at depths 2-5 with 500, 1000, and 2000 neurons; (b) accuracy at depths 2-5 with 500, 1000, and 2000 neurons; (c) accuracy and time (s) of BP, DecisionTree, ESN, and ML-ESN.

Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B over 160,000 packets).


(true-positive rate) as the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
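The FPR/TPR sweep and the area under the resulting curve can be sketched as follows (the scores and labels are assumed toy values):

```python
import numpy as np

def roc_auc(scores, labels):
    # Sort by descending score, then accumulate TPR and FPR as the threshold sweeps down
    order = np.argsort(-np.asarray(scores, dtype=float))
    y = np.asarray(labels, dtype=float)[order]
    tpr = np.concatenate([[0.0], np.cumsum(y) / y.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(1 - y) / (1 - y).sum()])
    # Trapezoidal integration of TPR over FPR gives the area under the curve
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))

print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0 (perfect ranking)
print(roc_auc([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0]))  # 0.5 (interleaved ranking)
```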

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.

From the experimental results in Figures 16-19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes from 20,000 to 160,000 packets.

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).


the other attack types are 99%. With the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises some problems in AMI network security, and introduces the contributions of existing researchers to AMI network security.

Secondly, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in the existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, there are still some issues that need attention and optimization in this paper, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric-power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not really established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) unsupervised ML-ESN AMI network traffic classification research, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example, through parallel training, greatly reducing learning and classification time; and (4) study of AMI-specific network protocols and establishment of an optimized ML-ESN network traffic deep learning model that is more in line with the actual application of AMI, so as to apply it to actual industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project of "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project of "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

Mathematical Problems in Engineering 19

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber–physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI Network Engineering Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, IEEE, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, IEEE, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, IEEE, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.



version; (2) 69085d3e5432360300000000, metadata ID; (3) 10107110, source IP; (4) 1010721241, destination IP; (5) 19341, source port; (6) 22, destination port; and (7) 6, protocol (TCP).

4.2. Proposed Framework. The metadata of the power probe stream used here contain hundreds of fields, and it can be seen from the data in Figure 3 that not every stream contains all the metadata content. If these data are analyzed directly, first, the importance of a single metadata field cannot be directly reflected, and second, the analysis data dimensions are particularly high, resulting in particularly long calculation times. Therefore, the original probe stream metadata cannot be used directly but need further preprocessing and analysis.

In order to detect AMI network attacks, we propose a novel network attack discovery method based on AMI probe traffic and use multilayer echo state networks to classify probe flows to determine the type of network attack. The specific implementation framework is shown in Figure 4.

The framework mainly includes three processing stages, and the three steps are as follows:

Step 1: collect network flow metadata information in real time through network probe flow collection devices deployed in different areas.

Step 2: first, the collected network flow metadata are aggregated by time series or segmentation to obtain the statistical characteristics of each part of the network flow. Second, the statistically obtained characteristic values are standardized according to certain data standardization guidelines. Finally, in order to quickly find the important features, and the correlations between features, that reflect network attack anomalies, the standardized features are further filtered.

Step 3: establish a multilayer echo state network deep learning model and classify the data after feature extraction, part of which is used as training data and part of which is used as test data. Cross-validation is performed on the two types of data to check the correctness and performance of the proposed model.
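The three stages above can be sketched as a minimal pipeline. This is an illustrative assumption of how the stages fit together, not the authors' implementation; the caret-separated record format follows Figure 3, but the parsing logic and the choice of six numeric fields are hypothetical.

```python
import numpy as np

def collect_flows(raw_records):
    # Stage 1 (illustrative): split caret-separated probe records into fields.
    return [r.split("^") for r in raw_records if r]

def extract_and_standardize(flows, n_features=6):
    # Stage 2 (illustrative): keep the first n_features purely numeric fields,
    # then Z-score standardize each column (small epsilon avoids div-by-zero
    # for constant columns).
    rows = []
    for flow in flows:
        numeric = [float(f) for f in flow if f.replace(".", "").isdigit()]
        rows.append(numeric[:n_features])
    X = np.array(rows)
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

# Two toy records in the Figure 3 style (hex metadata IDs are skipped as
# non-numeric); Stage 3 (the ML-ESN classifier) would consume X.
raw = ["6^69085d^10107110^1010721241^19341^22^6^40^1^40^1",
       "6^71135d^10107110^1010721241^32365^23^6^40^1^0^0"]
X = extract_and_standardize(collect_flows(raw))
```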

4.3. Feature Extraction. Generally speaking, to realize the classification and identification of network traffic, it is necessary to extract statistical behavior characteristics that better reflect the network traffic of different network attack behaviors.

Network traffic [36] refers to the collection of all network data packets between two network hosts in a complete network connection. According to the currently recognized standard, it refers to the set of all network data packets with the same quintuple within a limited time, including the sum of the data characteristics carried by the related data in the set.

As is well known, some simple network characteristics can be extracted directly, such as source IP address, destination IP address, source port, destination port, and protocol. Because network traffic is exchanged between source and destination machines, the source and destination IP addresses and ports are also interchanged, which reflects the bidirectionality of the flow.

In order to more accurately reflect the characteristics of different types of network attacks, it is necessary to aggregate network flows and collect their statistical characteristics.

First, network packets are aggregated into network flows, that is, each network flow is distinguished according to whether it is generated by a different network behavior. Second, this paper refers to the methods proposed in [36, 37] to extract the statistical characteristics of the network flows.

In [36], 22 statistical features of malicious code attacks are extracted, which mainly include the following:

Statistical characteristics of data size: maximum, minimum, average, and standard deviation of forward and backward packets, and the forward/backward packet ratio.

Statistical characteristics of time duration: maximum, minimum, average, and standard deviation of forward and backward packet intervals.

In [37], 249 statistical characteristics of network traffic are summarized and analyzed. The main statistical characteristics used in this paper are as follows:

6^69085d3e5432360300000000^10107110^1010721241^19341^22^6^40^1^40^1^1564365874^1564365874^^^^

6^71135d3e5432362900000000^10107110^1010721241^32365^23^6^40^1^0^0^1564365874^1564365874^^^

6^90855d3e5432365d00000000^10107110^1010721241^62215^6000^6^40^1^40^1^1564365874^1564365874^

6^c4275d3e5432367800000000^10107110^1010721241^50504^25^6^40^1^40^1^1564365874^1564365874^^^

6^043b5d3e5432366d00000000^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^71125d3e5432362900000000^10107110^1010721241^46043^443^6^40^1^40^1^1564365874^1564365874^^

6^043b5d3e5432366d00000001^10107110^1010721241^1909^2048^1^28^1^28^1^1564365874^1564365874^^

6^3ff75d3e5432361600000000^10107110^1010721241^39230^80^6^80^2^44^1^1564365874^1564365874^^^

6^044a5d3e5432366d00000000^10107110^1010721241^31730^21^6^40^1^40^1^1564365874^1564365874^^

6^7e645d3e6df9364a00000000^10107110^1010721241^33380^6005^6^56^1^40^1^1564372473^1564372473^

6^143d5d3e6dfc361500000000^10107110^1010721241^47439^32776^6^56^1^0^0^1564372476^1564372476^

6^81b75d3e6df8360100000000^10107110^1010721241^56456^3086^6^56^1^40^1^1564372472^1564372472^

6^e0745d3e6dfc367300000000^10107110^1010721241^54783^44334^6^56^1^0^0^1564372476^1564372476^

Figure 3: Part of the real probe stream data.


Time interval: maximum, minimum, average interval time, and standard deviation.

Packet size: maximum, minimum, average size, and packet distribution.

Number of data packets: outgoing and incoming.

Data amount: input byte amount and output byte amount.

Stream duration: duration from start to end.

Some of the main features of network traffic extracted in this paper are shown in Table 2.
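The per-flow statistics listed above can be sketched concretely. The helper below computes the packet-count, byte-count, size, inter-arrival-time, and duration statistics for one flow; the function name and the input format (per-packet sizes and arrival timestamps) are assumptions for illustration, not the authors' extractor.

```python
import statistics

def flow_stats(packet_sizes, arrival_times):
    # Inter-arrival times between consecutive packets of the flow.
    iats = [t2 - t1 for t1, t2 in zip(arrival_times, arrival_times[1:])]
    return {
        "pkt_count": len(packet_sizes),                 # number of packets
        "byte_count": sum(packet_sizes),                # total data amount
        "min_size": min(packet_sizes),
        "max_size": max(packet_sizes),
        "mean_size": statistics.mean(packet_sizes),
        "std_size": statistics.pstdev(packet_sizes),    # population std dev
        "min_iat": min(iats) if iats else 0.0,
        "max_iat": max(iats) if iats else 0.0,
        "duration": arrival_times[-1] - arrival_times[0],
    }

# Toy flow: three packets (40, 1500, 60 bytes) at t = 0 s, 0.02 s, 0.05 s.
s = flow_stats([40, 1500, 60], [0.00, 0.02, 0.05])
```

In practice one such dictionary would be computed separately for the forward and backward directions of each flow, per the feature list above.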

4.4. Feature Standardization. Because the various attributes of the power probe stream contain values of different data types, and the differences between these values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including operations such as feature standardization and unbalanced data elimination.

At present, the main feature standardization methods are [38] Z-score, min-max, decimal scaling, etc.

Because there may be some nondigital data in the standard protocol, such as protocol names, IP, and TCP flags, these data cannot be processed directly by standardization, so nondigital data need to be converted to digital form. For example, the character string "dhcp" is changed to the value "1".

In this paper, Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:

x' = \frac{x - \bar{x}}{\delta}, (1)

where \bar{x} is the mean value of the original data and \delta is the standard deviation of the original data, \delta = \sqrt{[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2]/n}, with n the number of samples per feature.
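A minimal sketch of formula (1), using the population standard deviation exactly as defined above:

```python
import math

def zscore(values):
    # Z-score standardization, formula (1): x' = (x - mean) / std.
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population std
    return [(v - mean) / std for v in values]

# Toy feature column: mean = 5.0, population std = 2.0.
z = zscore([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

After standardization the column has zero mean and unit variance, which puts features with very different raw ranges (e.g., byte counts vs. durations) on a common scale.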

4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is currently a popular feature filtering approach: it regards features as independent objects, evaluates the importance of features according to quality metrics, and selects the important features that meet the requirements.

At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.

Because the power probe flow contains many statistical characteristics, and the main characteristics of different types of attacks differ, in order to quickly locate the important characteristics of different attacks, this paper filters the network flow characteristics based on the correlation of the statistical characteristic data and on information gain.

The Pearson coefficient is used to calculate the correlation of the feature data. The main reason is that the calculation of the Pearson coefficient is efficient and simple and is therefore more suitable for real-time processing of large-scale power probe streams.

The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y), and its calculation \rho_{xy} is shown in the following formula:

\rho_{xy} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - u_x)(y - u_y)]}{\sigma_x \sigma_y}, (2)

where cov(x, y) is the covariance of x and y, \sigma_x is the standard deviation of x, and \sigma_y is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient is obtained, which is usually expressed as r:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}, (3)

where n is the number of samples, x_i and y_i are the observations at point i corresponding to variables x and y, \bar{x} is the sample mean of x, and \bar{y} is the sample mean of y. The value of r is between −1 and 1. When the value is 1, it indicates that there is a completely positive correlation between the two random variables; when

Figure 4: Proposed AMI network traffic detection framework (probe traffic collection → feature extraction of statistical flow characteristics → feature standardization → feature filtering → construction of the multilayer echo state network → classification, verification, and performance evaluation).

Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
…
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Time interval standard deviation of backward packets
31  duration        Network flow duration


the value is −1, it indicates that there is a completely negative correlation between the two random variables; and when the value is 0, it indicates that the two random variables are linearly independent.

Because the Pearson method can only detect the linear relationship between features and classification categories, any nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships among the characteristics of the probe flow, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.

In a classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is P_i, the Gini index of the probability distribution is defined as follows [39]:

\mathrm{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2. (4)

Given the sample set D, the Gini coefficient is expressed as follows:

\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2, (5)

where C_k is the subset of samples in D belonging to the kth class and K is the number of classes.
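The two filtering measures can be sketched directly from formulas (3) and (5); the toy feature vectors and label set below are illustrative only.

```python
import math

def pearson(x, y):
    # Sample Pearson correlation coefficient, formula (3).
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def gini(labels):
    # Gini index of a label set, formula (5): 1 - sum((|Ck|/|D|)^2).
    n = len(labels)
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

r = pearson([1, 2, 3, 4], [2, 4, 6, 8])        # perfectly linear pair
g = gini(["dos", "dos", "normal", "fuzzers"])  # mixed label set
```

A filter scheme in this spirit would keep features whose |r| with the class label (or with other kept features) and whose Gini-based information gain meet chosen thresholds.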

5. ML-ESN Classification Method

ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to the problem of time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has three layers: input layer, hidden layer (reservoir), and output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

U(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T,
x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T,
y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T. (6)

W_{in} (N × K) represents the connection weights from the input layer to the reservoir, W (N × N) represents the connection weights from x(t − 1) to x(t), W_{out} (L × (K + N + L)) represents the connection weights from the reservoir to the output layer, and W_{back} (N × L) represents the connection weights from y(t − 1) to x(t); this last matrix is optional.

When u(t) is input, the updated state equation of the reservoir is given by

x(t + 1) = f(W_{in} u(t + 1) + W x(t)), (7)

where f is the selected activation function and f' is the activation function of the output layer. Then the output state equation of the ESN is given by

y(t + 1) = f'(W_{out} [u(t + 1); x(t + 1)]). (8)

Researchers have found through experiments that the traditional echo state network reservoir is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome these problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of layers in the hidden layer: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]

x_1(n + 1) = f(W_{in} u(n + 1) + W_1 x_1(n)),
x_k(n + 1) = f(W_{inter(k-1)} x_{k-1}(n + 1) + W_k x_k(n)), 2 ≤ k ≤ M,
x_M(n + 1) = f(W_{inter(M-1)} x_{M-1}(n + 1) + W_M x_M(n)). (9)

The ML-ESN output is then calculated from the states of formula (9) as

y(n + 1) = f_{out}(W_{out} x_M(n + 1)). (10)
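A minimal numerical sketch of the ML-ESN state updates in equation (9), with tanh as the activation f and spectral-radius scaling of the internal reservoir weights; the sizes, weight ranges, and radius value are assumptions for illustration, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def scale_to_radius(w, alpha):
    # Rescale a reservoir matrix so its spectral radius equals alpha.
    return alpha * w / max(abs(np.linalg.eigvals(w)))

K, N, M = 4, 20, 3  # inputs, neurons per reservoir, reservoirs (assumed)
W_in = rng.uniform(-0.5, 0.5, (N, K))
W = [scale_to_radius(rng.uniform(-0.5, 0.5, (N, N)), 0.9) for _ in range(M)]
W_inter = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M - 1)]

def ml_esn_step(u, states):
    # One update of equation (9): reservoir 1 sees the input u(n+1);
    # reservoir k sees reservoir k-1's *new* state x_{k-1}(n+1).
    new = [np.tanh(W_in @ u + W[0] @ states[0])]
    for k in range(1, M):
        new.append(np.tanh(W_inter[k - 1] @ new[k - 1] + W[k] @ states[k]))
    return new

states = [np.zeros(N) for _ in range(M)]          # x_i(0) = 0
states = ml_esn_step(rng.uniform(-1, 1, K), states)
```

The output of equation (10) would then be a trained linear readout applied to the last reservoir's state, states[-1].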

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system suffers an abnormal attack, the statistical characteristic entropy value will become abnormal within a certain time range, and even large fluctuations will occur.

Figure 5: ESN basic model (input layer U(t) connected to the reservoir x(t) by W_{in}; internal reservoir weights W; optional output feedback W_{back}; output layer y(t) produced via W_{out}).


It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale, randomly and sparsely connected network of neurons (the reservoir) as the processing medium for the data, map the input feature value set from the low-dimensional input space to the high-dimensional state space, and finally train the network using linear regression and similar methods on the high-dimensional state space.

However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened, while if the number of neurons is relatively small, the generalization ability cannot be guaranteed. Therefore, it is not suitable for directly classifying AMI network traffic anomalies.

In contrast, the ML-ESN network model can satisfy the echo state property during internal training by adding multiple reservoirs when the size of a single reservoir is small, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that can reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-format datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of each feature value is not uniform. For example, most of the data are of numerical type, but some features contain character types and the special symbol "-", so the data cannot be used directly for processing. Before data processing, the data are standardized, and some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN},

\mathrm{FPR} = \frac{FP}{FP + TN},

\mathrm{TPR} = \frac{TP}{FN + TP},

\mathrm{precision} = \frac{TP}{TP + FP},

\mathrm{recall} = \frac{TP}{FN + TP},

F\text{-}\mathrm{score} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}. (11)

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.

TN (true negative): the number of normal network traffic flows successfully detected.

Figure 6: ML-ESN basic model (input layer U(t) feeding Reservoir 1 through W_{in}; reservoirs 1, 2, …, M with internal weights W_1, …, W_M, chained by W_{inter}; output layer y(t) produced from x_M via W_{out}).


FP (false positive): the number of normal network traffic flows identified as abnormal.

FN (false negative): the number of abnormal network traffic flows identified as normal.
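The indicators of equation (11) follow directly from these four counts. A small sketch with assumed example counts (the FPR denominator here is the standard FP + TN, i.e., the share of normal flows that get flagged):

```python
def metrics(tp, tn, fp, fn):
    # Evaluation indicators of equation (11).
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)            # fraction of normal flows flagged as attacks
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)         # same as TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, precision, recall, f_score

# Hypothetical confusion counts for one attack class.
acc, fpr, prec, rec, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
```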

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044

(1) Input:
(2) D1: training dataset
(3) D2: test dataset
(4) U(t): input feature value set
(5) N: the number of neurons in each reservoir
(6) Ri: the number of reservoirs
(7) α: interconnection weight spectral radius
(8) Output:
(9) Training and testing classification results
(10) Steps:
(11) (1) Initially set the parameters of ML-ESN and determine the corresponding number of input and output units according to the dataset:
(i) set the training data length trainLen;
(ii) set the test data length testLen;
(iii) set the number of reservoirs Ri;
(iv) set the number of neurons in each reservoir N;
(v) set the speed value of the reservoir update α;
(vi) set x_i(0) = 0 (1 ≤ i ≤ M).
(12) (2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the external connection weights between reservoirs W_inter:
(i) randomly initialize the values of W_in, W_i, and W_inter;
(ii) through statistical normalization and spectral radius calculation, W_inter and W_i are scaled to meet the sparsity requirements; the calculation formulas are W_i = α(W_i/|λ_in|) and W_inter = α(W_inter/|λ_inter|), where λ_in and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(13) (3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables:
(i) for t from 1 to T, compute x_1(t):
(a) calculate x_1(t) according to equation (7);
(b) for i from 2 to M, compute x_i(t) according to equations (7) and (9);
(c) get the matrix H = [x(t + 1); u(t + 1)].
(14) (4) Use the following to solve the weight matrix W_out from the reservoir to the output layer and obtain the trained ML-ESN network structure:
(i) W_out = D H^T (H H^T + βI)^(−1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(15) (5) Calculate the ML-ESN output according to formula (10):
(i) select the SoftMax activation function and calculate the output f_out value.
(16) (6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

ALGORITHM 1: AMI network traffic classification.

10 Mathematical Problems in Engineering

Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, data deduplication, data completion, and data normalization, yielding standardized and normalized data. The standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
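The cleaning, deduplication, and min-max normalization in this step can be sketched in a few lines of pure Python (a minimal illustration on made-up numeric records, not the UNSW_NB15 schema):

```python
def preprocess(records):
    """Drop incomplete rows, deduplicate, and min-max normalize each column to [0, 1]."""
    # data cleaning: drop rows with missing values
    rows = [r for r in records if all(v is not None for v in r)]
    # data deduplication: keep the first occurrence of each identical row
    seen, unique = set(), []
    for r in rows:
        key = tuple(r)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    # min-max normalization per column
    cols = list(zip(*unique))
    norm_cols = []
    for col in cols:
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0   # avoid division by zero for constant columns
        norm_cols.append([(v - lo) / span for v in col])
    return [list(row) for row in zip(*norm_cols)]

data = [[1.0, 10.0], [1.0, 10.0], [3.0, 30.0], [2.0, None]]
print(preprocess(data))  # → [[0.0, 0.0], [1.0, 1.0]]
```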

As can be seen from Figure 8, after normalizing the data, most of the attack-type data are concentrated between 0.4 and 0.6, but Generic attack-type data are concentrated between 0.7 and 0.9, and normal-type data are concentrated between 0.1 and 0.3.

Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini indices for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the Pearson correlation threshold is initially set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with values less than 0.5 are retained.
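The filtering rule described above can be sketched as follows: compute pairwise Pearson correlations and drop the later feature of every pair whose absolute correlation exceeds the 0.5 threshold (a pure-Python illustration; the toy column values are hypothetical, only the names are borrowed from the dataset):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def filter_correlated(features, threshold=0.5):
    """Keep a feature only if it is weakly correlated with every feature kept so far."""
    keep = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold for k in keep):
            keep.append(name)
    return keep

feats = {
    "spkts": [1, 2, 3, 4, 5],
    "sloss": [2, 4, 6, 8, 10],   # perfectly correlated with spkts -> dropped
    "rate":  [5, 1, 4, 2, 3],    # weakly correlated (r = -0.3) -> kept
}
print(filter_correlated(feats))  # → ['spkts', 'rate']
```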

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, indicating a strong positive correlation. By contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, so those correlations are very small.

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of Gini coefficients, the smaller the Gini coefficient value of a feature, the lower the impurity of that feature in the dataset and the better its training effect.
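One way to read this principle is through the Gini impurity of the class distribution within each feature's value groups, where lower means purer splits. A hedged sketch on toy data (this is a generic Gini-impurity computation, not necessarily the paper's exact procedure):

```python
def gini_impurity(labels):
    """Gini impurity of a label sequence: 1 - sum(p_i^2)."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def feature_gini(values, labels):
    """Weighted Gini impurity after grouping samples by feature value."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    n = len(labels)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())

labels = ["dos", "dos", "normal", "normal"]
print(feature_gini([0, 0, 1, 1], labels))  # perfectly separating feature → 0.0
print(feature_gini([0, 1, 0, 1], labels))  # uninformative feature → 0.5
```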

Based on the results of the Pearson and Gini coefficients for feature selection in the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter in ms), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment, and the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (columns: dur, proto, service, state, spkts, dpkts, sbytes, dbytes).

Figure 8: Normalized data distribution (attack types: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, Normal; values from 0.0 to 1.0).


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase indefinitely; it increases first and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of ML-ESN is that the reservoirs generate a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of these internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
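This idea, a cascade of fixed random reservoirs whose states are later combined linearly, can be illustrated with a minimal forward pass. This is a generic leaky-ESN sketch under common update equations (the helper names `make_reservoir` and `ml_esn_states` and the scaled-down sizes are ours, not the paper's exact equations (7) and (9)):

```python
import numpy as np

def make_reservoir(rng, n, n_in, spectral_radius=0.9):
    """Random input weights plus a recurrent matrix rescaled to a target spectral radius."""
    w_in = rng.uniform(-0.5, 0.5, (n, n_in))
    w = rng.uniform(-0.5, 0.5, (n, n))
    w *= spectral_radius / max(abs(np.linalg.eigvals(w)))
    return w_in, w

def ml_esn_states(u_seq, reservoirs, alpha=0.9):
    """Run stacked reservoirs: layer i is driven by layer i-1's state (layer 1 by the input)."""
    states = [np.zeros(w.shape[0]) for _, w in reservoirs]
    collected = []
    for u in u_seq:
        drive = u
        for i, (w_in, w) in enumerate(reservoirs):
            pre = w_in @ drive + w @ states[i]
            states[i] = (1 - alpha) * states[i] + alpha * np.tanh(pre)  # leaky update
            drive = states[i]                                           # feed the next layer
        collected.append(np.concatenate(states))
    return np.array(collected)

rng = np.random.default_rng(1)
layers = [make_reservoir(rng, 50, 5), make_reservoir(rng, 50, 50), make_reservoir(rng, 50, 50)]
X = ml_esn_states(rng.standard_normal((30, 5)), layers)
print(X.shape)  # → (30, 150): 30 time steps, 3 reservoirs of 50 neurons each
```

Only the linear readout over `X` is ever trained; the reservoir weights stay fixed after initialization, which is why no back-propagation is needed.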

In Table 4, the tanh activation function is used in the reservoir layer because its value range lies between −1 and 1 and it is centered at 0, which is conducive to improving training efficiency. Moreover, when the features differ significantly, tanh yields a better detection effect. In addition, the neuron-fitting training process in the ML-ESN reservoirs continuously amplifies the feature effect.

The output layer uses the sigmoid activation function because the output value of sigmoid lies between 0 and 1, which directly reflects the probability of a certain attack type.
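The output-layer mapping described here, squashing each readout value into a [0, 1] probability-like score per attack type, looks like this in a short sketch (softmax is shown alongside since Algorithm 1 names it as the output activation; the raw readout values are hypothetical):

```python
import math

def sigmoid(z):
    """Independent [0, 1] score for one class."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Normalized probability distribution over all classes."""
    m = max(zs)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

raw = [2.0, -1.0, 0.5]                    # hypothetical readout values for 3 classes
per_class = [sigmoid(z) for z in raw]     # per-class scores, as in Table 4
dist = softmax(raw)                       # distribution summing to 1, as in Algorithm 1
print(max(range(3), key=lambda i: dist[i]))  # → 0  (predicted class index)
```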

In Table 4, the last three parameters are important parameters for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based mainly on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (features: spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, ct_dst_ltm).


The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran experiments on the training dataset with neither filtering method, with a single filtering method, and with the

combination of the two. The experimental results are shown in Figure 11.

The experimental results in Figure 11 show that using the filtering technology is generally better than not using it: whether on a small data sample or a large one, the classification effect without filtering is lower than with filtering.

In addition, using a single filtering method is not as good as using a combination of the two. For example, on the 160000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the model accuracy is 0.95; when the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indices is used, the accuracy reaches 0.99.

6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters with the Pearson and Gini indices, then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (features: service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, djit).

Table 4: The parameters of the ML-ESN experiment.

Parameter                    Value
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      tanh
Output layer activation fn   sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10⁻⁶


algorithm to learn, and then uses the test data to verify the trained model, obtaining test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

The detection results in Figure 12 show that it is entirely feasible to use the ML-ESN network learning model, combined with Pearson and Gini coefficients for network traffic feature filtering and optimization, to quickly classify anomalous network traffic attacks.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring detection accuracy at the same depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, a reservoir depth of 5 takes 211 ms, while a depth of 3 takes only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model gradually increases at first as the reservoir depth grows; for example, at a depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy drops to 0.95.

The main reason for this phenomenon is that, at the beginning, the training parameters of the model are gradually optimized as the depth increases, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon occurs, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision-tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision-tree method achieves only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification effect of different filtering methods (None, Pearson, Gini, Pearson + Gini) across dataset sizes from 20000 to 160000 packets.


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 50; in particular, the values of feature A hardly exceed 60. In addition, a small part of the values of feature B is concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper compares simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment uses five test datasets of different scales, containing 5000, 20000, 60000, 120000, and 160000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly: the GaussianNB algorithm in particular falls below 50% accuracy, while the other algorithms are close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-record dataset, the accuracy of the algorithm reaches 96.75%, and on the 160000-record dataset, it reaches 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR

Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for each attack type: Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, Worms).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2 to 5 with 500, 1000, and 2000 neurons; (b) accuracy for the same settings; (c) accuracy and time comparison with BP, DecisionTree, and single-layer ESN.

Figure 14: Distribution map of the first two statistical characteristics (Feature A and Feature B over the number of packages).


(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
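The TPR/FPR sweep behind such ROC curves, and the trapezoidal AUC used to rank the models, can be sketched in a few lines of pure Python (toy scores and labels, not outputs from the paper's experiments):

```python
def roc_points(scores, labels):
    """Sweep thresholds over the scores, returning (FPR, TPR) points along the curve."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = set()
    for thr in sorted(set(scores)) + [float("inf")]:
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        pts.add((fp / neg, tp / pos))
    return sorted(pts | {(0.0, 0.0), (1.0, 1.0)})

def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # hypothetical classifier scores
labels = [1, 1, 0, 1, 0, 0]               # 1 = attack, 0 = normal
print(auc(roc_points(scores, labels)))    # 8/9 ≈ 0.889 for this toy example
```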

From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, our ML-ESN) under different data sizes from 20000 to 160000.

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).


the other attack types are 99%. In the single-layer ESN algorithm, however, the best detection success rate is only 97%, and the typical rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises some open problems in AMI network security, and reviews the contributions of existing researchers in this area.

Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, greatly reducing model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, some issues still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized, unified information collection source, so many enterprises have not yet established a security monitoring platform with information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform fusion processing across the multiple collection devices to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the method's limitations in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, such as parallel training to greatly reduce learning and classification time; and (4) study of AMI-specific network protocols to establish an optimized ML-ESN network traffic deep learning model better suited to actual AMI applications, so as to apply it in real industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.


[9] Y Mo H J Kim K Brancik et al ldquoCyberndashphysical security ofa smart grid infrastructurerdquo Proceedings of the IEEE vol 100no 1 pp 195ndash209 2012

[10] e AMI network engineering task Force (AMI-SEC) rdquo 2020httposgugucaiugorgutilisecamisecdefaultaspx

[11] Y Park D M Nicol H Zhu et al ldquoPrevention of malwarepropagation in AMIrdquo in Proceedings of the IEEE InternationalConference on Smart Grid Communications pp 474ndash479Vancouver Canada 2013

[12] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] "Data standardization," Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

20 Mathematical Problems in Engineering

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] "UNSW-NB15 dataset," 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier System, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.



The statistical flow features include the following:

Time interval: maximum, minimum, average interval time, and standard deviation
Packet size: maximum, minimum, average size, and packet distribution
Number of data packets: out and in
Data amount: input byte amount and output byte amount
Stream duration: duration from start to end

Some of the main features of network traffic extracted in this paper are shown in Table 2.

4.4. Feature Standardization. Because the various attributes of the power probe stream contain different data types, and the differences between their values are relatively large, they cannot be used directly for data analysis. Therefore, we need to perform data preprocessing operations on the statistical features, mainly including feature standardization and unbalanced data elimination.

At present, the main feature standardization methods are [38] Z-score, min-max, and decimal scaling.

Because there may be some nondigital data in the standard protocol, such as the protocol name, IP, and TCP flags, these data cannot be processed directly by standardization, so nondigital data need to be converted to numeric values first. For example, the character string "dhcp" is changed to the value "1".

In this paper, the Z-score is selected as the standardization method, based on the uneven data distribution and differing value ranges of the power probe stream. Z-score normalization is shown in the following formula:

\[ x' = \frac{x - \bar{x}}{\delta} \quad (1) \]

where \(\bar{x}\) is the mean value of the original data and \(\delta\) is its standard deviation, \(\delta = \operatorname{std} = \sqrt{(1/n)\sum_{i=1}^{n}(x_i - \bar{x})^2}\), with \(n\) the number of samples per feature.
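As an illustrative sketch (not the authors' code), the preprocessing above, mapping protocol strings such as "dhcp" to integers and then applying Z-score normalization to a feature column, could look like the following; the PROTO_CODES mapping and the toy values are hypothetical:

```python
import math

def standardize(column):
    """Z-score normalize one feature column: x' = (x - mean) / std."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    return [(x - mean) / std for x in column]

# Hypothetical encoding of nondigital protocol values, e.g. "dhcp" -> 1
PROTO_CODES = {"dhcp": 1, "tcp": 2, "udp": 3}

raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # toy feature values
z = standardize(raw)
print(round(z[0], 2))  # mean 5, std 2 -> (2 - 5) / 2 = -1.5
```

After standardization each column has zero mean and unit variance, which puts features with very different value ranges on a comparable scale.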

4.5. Feature Filtering. In order to detect attack behavior more comprehensively and accurately, it is necessary to quickly and accurately find the statistical characteristics that characterize network attack behavior, but this is a very difficult problem. The filter method is a currently popular feature selection approach: it regards features as independent objects, evaluates the importance of features according to quality metrics, and selects the important features that meet requirements.

At present, there are many data correlation methods. The more commonly used ones are chart correlation analysis (line charts and scatter charts), covariance and the covariance matrix, correlation coefficients, unary and multiple regression, information entropy, and mutual information.

Because the power probe flow contains many statistical characteristics, and the main characteristics of different attack types differ, this paper filters the network flow characteristics based on the correlation of the statistical feature data and on information gain, in order to quickly locate the important characteristics of different attacks.

The Pearson coefficient is used to calculate the correlation of the feature data, mainly because its computation is efficient and simple and therefore more suitable for real-time processing of large-scale power probe streams.

The Pearson correlation coefficient is mainly used to reflect the linear correlation between two random variables (x, y), and its calculation is shown in the following formula:

\[ \rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E\left[(x - u_x)(y - u_y)\right]}{\sigma_x \sigma_y} \quad (2) \]

where cov(x, y) is the covariance of x and y, σx is the standard deviation of x, and σy is the standard deviation of y. If the covariance and standard deviations are estimated from the sample, the sample Pearson correlation coefficient, usually denoted r, is obtained:

\[ r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} \quad (3) \]

where n is the number of samples, xi and yi are the observations at point i for variables x and y, and \(\bar{x}\) and \(\bar{y}\) are the sample means of x and y. The value of r lies between −1 and 1. When the value is 1, there is a completely positive correlation between the two random variables; when

Figure 4: Proposed AMI network traffic detection framework (probe flow → traffic collection → statistical flow feature extraction → feature standardization → feature filtering → construction of the multilayer echo state network → classification, verification, and performance evaluation).

Table 2: Some of the main features.

ID  Name            Description
1   SrcIP           Source IP address
2   SrcPort         Source IP port
3   DestIP          Destination IP address
4   DestPort        Destination IP port
5   Proto           Network protocol, mainly TCP, UDP, and ICMP
6   total_fpackets  Total number of forward packets
7   total_fvolume   Total size of forward packets
8   total_bpackets  Total number of backward packets
9   total_bvolume   Total size of backward packets
…
29  max_biat        Maximum backward packet arrival interval
30  std_biat        Time interval standard deviation of backward packets
31  duration        Network flow duration


the value is −1, there is a completely negative correlation between the two random variables; and when the value is 0, the two random variables are linearly uncorrelated.
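A small illustrative sketch of the sample coefficient r from formula (3), with toy data rather than probe-stream values:

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient, formula (3)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # completely positive -> 1.0
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # completely negative -> -1.0
```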

Because the Pearson method can only detect linear relationships between features and classification categories, nonlinear relationships between the two would be lost. In order to further capture the nonlinear relationships in the probe flow characteristics, this paper calculates the information entropy of the features and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and network attack behavior.

In the classification problem, assuming that there are K classes and that the probability that a sample point belongs to class i is Pi, the Gini index of the probability distribution is defined as follows [39]:

\[ \operatorname{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2 \quad (4) \]

Given the sample set D, the Gini coefficient is expressed as follows:

\[ \operatorname{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2 \quad (5) \]

where Ck is the subset of samples in D belonging to the kth class and K is the number of classes.
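Formula (5) can be sketched directly from a list of class labels; the label values below are hypothetical toy data:

```python
from collections import Counter

def gini(labels):
    """Gini coefficient of a sample set D, formula (5): 1 - sum((|Ck|/|D|)^2)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["normal"] * 4))         # pure set -> 0.0 (lowest impurity)
print(gini(["normal", "dos"] * 2))  # 50/50 split -> 0.5
```

A smaller value means lower impurity, which is why the paper prefers features with small Gini values for training.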

5. ML-ESN Classification Method

The ESN is a type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has three layers: an input layer, a hidden layer (the reservoir), and an output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

\[ u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T, \quad x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T, \quad y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T \quad (6) \]

W^in (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t − 1) to x(t); W^out (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and W^back (N × L) represents the connection weights from y(t − 1) to x(t), which are optional.

When u(t) is input, the updated state equation of the reservoir is given by

\[ x(t + 1) = f\left( W^{in}\, u(t + 1) + W x(t) \right) \quad (7) \]

where f is the selected activation function of the reservoir and f′ is the activation function of the output layer. The output state equation of the ESN is then given by

\[ y(t + 1) = f'\left( W^{out}\, [u(t + 1); x(t + 1)] \right) \quad (8) \]
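A toy sketch of the state update (7) and readout (8), assuming a tanh reservoir activation and, for simplicity, a linear readout; all sizes, weights, and inputs are illustrative, not the paper's configuration:

```python
import math
import random

random.seed(0)
K, N, L = 3, 5, 2  # input, reservoir, and output sizes (toy values)

def rand_matrix(rows, cols, scale=0.5):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

W_in, W, W_out = rand_matrix(N, K), rand_matrix(N, N), rand_matrix(L, K + N)

def esn_step(u_next, x_prev):
    """x(t+1) = tanh(W_in u(t+1) + W x(t));  y(t+1) = W_out [u(t+1); x(t+1)]."""
    pre = [a + b for a, b in zip(matvec(W_in, u_next), matvec(W, x_prev))]
    x_next = [math.tanh(p) for p in pre]
    y_next = matvec(W_out, u_next + x_next)  # linear readout for simplicity
    return x_next, y_next

x = [0.0] * N
for u in ([1.0, 0.0, 0.5], [0.2, 0.3, 0.1]):
    x, y = esn_step(u, x)
print(len(x), len(y))  # 5 2
```

Only W_out is trained in an ESN; W_in and W stay fixed after random initialization, which is what removes the back-propagation mechanism mentioned in the abstract.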

Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome these problems of the ESN, improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: there is only one reservoir in the single-layer network and more than one in the multilayer network. The updated state equations of the ML-ESN are given by [41]

\[ x_1(n + 1) = f\left( W^{in}\, u(n + 1) + W_1 x_1(n) \right), \qquad x_k(n + 1) = f\left( W^{inter}_{(k-1)}\, x_{k-1}(n + 1) + W_k x_k(n) \right), \quad k = 2, \ldots, M \quad (9) \]

The ML-ESN output is then calculated from the final reservoir state of formula (9):

\[ y(n + 1) = f^{out}\left( W^{out}\, x_M(n + 1) \right) \quad (10) \]
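The cascade of reservoirs in formulas (9) and (10) can be sketched as follows; the sizes, random weights, and tanh activation are illustrative assumptions, not the paper's configuration:

```python
import math
import random

random.seed(1)
K, N, M, L = 3, 4, 3, 2  # input size, neurons per reservoir, reservoirs, outputs

def rand_matrix(rows, cols, scale=0.4):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

W_in = rand_matrix(N, K)
W_res = [rand_matrix(N, N) for _ in range(M)]        # W1 .. WM, internal weights
W_inter = [rand_matrix(N, N) for _ in range(M - 1)]  # inter-reservoir weights
W_out = rand_matrix(L, N)

def ml_esn_step(u_next, states):
    """Formula (9): cascade the input through M reservoirs; formula (10): read out."""
    new_states = []
    drive = matvec(W_in, u_next)           # the first reservoir is driven by the input
    for k in range(M):
        pre = [d + w for d, w in zip(drive, matvec(W_res[k], states[k]))]
        xk = [math.tanh(p) for p in pre]
        new_states.append(xk)
        if k < M - 1:                      # reservoir k+1 is driven by x_k(n+1)
            drive = matvec(W_inter[k], xk)
    y = matvec(W_out, new_states[-1])      # readout from the last reservoir x_M
    return new_states, y

states = [[0.0] * N for _ in range(M)]
states, y = ml_esn_step([0.5, -0.2, 0.1], states)
print(len(states), len(y))  # 3 2
```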

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time will not change much. However, when the network system is attacked, the statistical characteristic entropy value becomes abnormal within a certain time range, and even large fluctuations may occur.

Figure 5: ESN basic model (input layer U(t) connected to the reservoir x(t) by W^in; recurrent reservoir weights W; readout W^out to the output y(t); optional feedback W^back).


It can be seen from Figure 5 that the ESN is an improved model for training RNNs. The steps are to use a large-scale, randomly connected sparse network (the reservoir) of neurons as the processing medium for the data, map the input feature set from the low-dimensional input space to a high-dimensional state space, and finally train the network using linear regression or similar methods on that high-dimensional state space.

However, in the ESN, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the ESN is not suitable for directly classifying AMI network traffic anomalies.

In contrast, the ML-ESN model can satisfy the echo state property of the internal training network by adding multiple reservoirs while keeping each single reservoir small, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is also analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack detection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deep structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSL-KDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of the feature values is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be used directly for processing. Before data processing, the data are standardized; some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, namely, accuracy (correct rate), FPR (false-positive rate), and F-score (balance score), to evaluate the experimental results. Their calculation formulas are as follows:

\[
\begin{aligned}
\text{accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{FPR} = \frac{FP}{FP + TN}, \qquad \text{TPR} = \frac{TP}{FN + TP}, \\
\text{precision} &= \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{FN + TP}, \qquad F\text{-score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}
\end{aligned} \quad (11)
\]

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected
TN (true negative): the number of normal network traffic flows successfully detected

Figure 6: ML-ESN basic model (input layer connected by W^in to reservoir 1; reservoirs 1, 2, …, M chained by W^inter; readout W^out from the final state x_M to the output y(t)).


FP (false positive): the number of normal network traffic flows identified as abnormal
FN (false negative): the number of abnormal network traffic flows identified as normal
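The indicators of formula (11) can be computed directly from these four confusion counts; the counts below are hypothetical, and FPR uses the standard FP/(FP + TN) form:

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, false-positive rate, and F-score from the confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                       # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

acc, fpr, f1 = metrics(tp=90, tn=80, fp=20, fn=10)
print(round(acc, 2), round(fpr, 2), round(f1, 2))  # 0.85 0.2 0.86
```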

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044

Input:
  D1: training dataset; D2: test dataset; U(t): input feature value set; N: number of neurons per reservoir; Ri: number of reservoirs; α: interconnection weight spectral-radius scaling factor.
Output:
  Training and testing classification results.
Steps:
(1) Initialize the ML-ESN parameters and determine the number of input and output units according to the dataset: (i) set the training data length trainLen; (ii) set the test data length testLen; (iii) set the number of reservoirs Ri; (iv) set the number of neurons per reservoir N; (v) set the reservoir update speed α; (vi) set xi(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W^in, the internal reservoir weights Wi (1 ≤ i ≤ M), and the inter-reservoir connection weights W^inter: (i) randomly initialize the values of W^in, Wi, and W^inter; (ii) through statistical normalization and spectral-radius calculation, rescale Wi and W^inter to meet the sparsity requirement: Wi = α(Wi/|λin|) and W^inter = α(W^inter/|λinter|), where λin and λinter are the spectral radii of the Wi and W^inter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables: (i) for t from 1 to T: (a) calculate x1(t) according to equation (7); (b) for i from 2 to M, calculate xi(t) according to equations (7) and (9); (c) collect the state matrix H = [x(t + 1); u(t + 1)].
(4) Solve the reservoir-to-output weight matrix W^out to obtain the trained ML-ESN network structure: W^out = D H^T (H H^T + βI)^−1, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix, respectively.
(5) Calculate the ML-ESN output according to formula (10), selecting the SoftMax activation function to compute the output value f^out.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
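Step (4) of Algorithm 1, the ridge-regression readout W_out = D H^T (H H^T + βI)^−1, can be sketched with NumPy; the shapes and random data below are toy assumptions, not the paper's dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_readout(H, D, beta=1e-6):
    """Ridge-regression readout: Wout = D H^T (H H^T + beta I)^-1."""
    n_states = H.shape[0]
    return D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(n_states))

# Toy setup: 50 time steps of 8-dimensional collected states H,
# and a 3-class one-hot expected-output matrix D (both hypothetical).
H = rng.standard_normal((8, 50))
targets = rng.integers(0, 3, size=50)
D = np.eye(3)[targets].T                  # shape (3, 50)

W_out = train_readout(H, D)
print(W_out.shape)  # (3, 8)
```

Because only this linear system is solved, training avoids back-propagation entirely, which is the property the paper relies on.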


Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, deduplication, completion, and normalization, to obtain normalized and standardized data. The standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.

As can be seen from Figure 8, after normalizing the data, most of the attack type data are concentrated between 0.4 and 0.6, the Generic attack type data are concentrated between 0.7 and 0.9, and the normal type data are concentrated between 0.1 and 0.3.

Step 3. Calculate the Pearson coefficient values and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini index values for the UNSW_NB15 standardized data are shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching a value of 0.97, while the correlation between spkts and ct_srv_src (number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the beginning but rather to retain the distribution of the original data as much as possible, the initial threshold for the Pearson correlation coefficient is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with a value less than 0.5 are retained.

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing a strong positive correlation. In contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, i.e., very small.
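The threshold rule above (discard one feature of each pair whose correlation exceeds 0.5) can be sketched as follows; the feature columns are hypothetical toy values, not UNSW_NB15 data:

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den

def filter_correlated(features, threshold=0.5):
    """Drop the second feature of every pair whose |r| exceeds the threshold."""
    dropped = set()
    names = list(features)
    for a, b in combinations(names, 2):
        if a in dropped or b in dropped:
            continue
        if abs(pearson_r(features[a], features[b])) > threshold:
            dropped.add(b)
    return [name for name in names if name not in dropped]

# Toy columns: spkts and sloss move together, rate does not.
features = {
    "spkts": [1, 2, 3, 4, 5],
    "sloss": [2, 4, 6, 8, 10],   # perfectly correlated with spkts -> dropped
    "rate":  [5, 1, 4, 2, 3],
}
print(filter_correlated(features))  # ['spkts', 'rate']
```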

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values were calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features, such as state and service, are equal to 1. From the principle of the Gini coefficient, the smaller the Gini value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.

Based on the Pearson and Gini coefficient results for feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (mSec)), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set at the start of the experiment; the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).

Figure 8: Normalized data distribution across the labels Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal.


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically; it rises first and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of the ML-ESN is to use the reservoirs to generate a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its output range is between −1 and 1 with a mean of 0, which is more conducive to improving training efficiency. Second, when the features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting training process in the ML-ESN reservoirs continuously amplifies the feature effect.

The output layer uses the sigmoid activation function because its output value lies between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important tuning parameters of the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10−6, respectively, mainly based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1. The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).


The experimental environment is a Windows 10 Home 64-bit operating system, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small data sample or a large one, the classification effect without filtering is lower than that with filtering.

In addition, using a single filtering method is not as good as using the combination of the two. For example, on the 160000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used, the accuracy is 0.95; when only the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy reaches 0.99.

6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters with the Pearson and Gini indexes and then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).

Table 4 e parameters of ML-ESN experiment

Parameters ValuesInput dimension number 5Output dimension number 10Reservoir number 3Reservoir neurons number 1000Reservoir activation fn TanhOutput layer activation fn SigmoidUpdate rate 09Random seed 50Regularization rate 10 times 10minus6

Mathematical Problems in Engineering 13

algorithm to learn and then uses test data to verify thetraining model and obtains the test results of different typesof attacks e classification results of the nine types ofabnormal attacks obtained are shown in Figure 12

The detection results in Figure 12 show that it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks, based on the combination of the Pearson and Gini coefficients for network traffic feature filtering and optimization.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, and the FPR is only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 21.1 ms, while at a depth of 3 it is only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the training accuracy of the model gradually increases at first; for example, at a reservoir depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.

The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain amount of overfitting occurs, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.

Figure 11: Classification effect of different filtering methods (accuracy versus training data size for no filtering, Pearson only, Gini only, and Pearson + Gini).

In order to fully verify the correctness of the proposed method, this paper further tests the detection performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 50; especially for feature A, the values hardly exceed 60. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper compares simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment uses five test datasets of different scales, with 5,000, 20,000, 60,000, 120,000, and 160,000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
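The comparison protocol above can be sketched with scikit-learn; the synthetic data, split ratio, and default hyperparameters below are illustrative assumptions, not the paper's actual setup:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic multi-class stand-in for the UNSW_NB15 records.
X, y = make_classification(n_samples=5000, n_features=20, n_informative=10,
                           n_classes=5, n_clusters_per_class=1,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0)

# Fit each baseline on the same split and report held-out accuracy.
results = {}
for name, clf in [("GaussianNB", GaussianNB()),
                  ("KNN", KNeighborsClassifier()),
                  ("DecisionTree", DecisionTreeClassifier(random_state=0)),
                  ("MLPClassifier", MLPClassifier(max_iter=300,
                                                  random_state=0))]:
    results[name] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {results[name]:.3f}")
```

The same loop could be repeated over datasets of increasing size to reproduce the scaling comparison of Figure 15.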

From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly: the GaussianNB algorithm in particular falls below 50% accuracy, and the other algorithms are close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset, the accuracy of the algorithm reaches 96.75%, and on the 160,000-record dataset it reaches 97.26%.

In the experiment, the reason for the poor classification performance on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate performance: a ROC graph plots the FPR (false-positive rate) on the horizontal axis and the TPR

Figure 12: Classification results of the ML-ESN method (detection rate, F1-score, and FPR for the attack types Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2-5 with 500, 1000, and 2000 neurons; (b) detection accuracy for the same settings; (c) accuracy and time of BP, DecisionTree, ESN, and ML-ESN.

Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B over 0-160,000 packages).


(true-positive rate) on the vertical axis. Generally speaking, the ROC graph uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
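As a sketch of how such a per-class ROC curve and its AUC are computed (the one-vs-rest scores below are synthetic, not the paper's classifier outputs):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Synthetic one-vs-rest scores for a single attack class.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=500)           # 1 = this attack class
scores = y_true * 0.6 + rng.normal(0.0, 0.3, size=500)  # informative scores

# roc_curve sweeps the decision threshold; auc integrates TPR over FPR.
fpr, tpr, _ = roc_curve(y_true, scores)
roc_auc = auc(fpr, tpr)
print(f"AUC = {roc_auc:.2f}")
```

Repeating this for each of the nine attack classes yields the per-class curves of Figures 16-19.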

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.

From the experimental results in Figures 16-19, it can be seen that for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes (20,000-160,000).

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, Fuzzers 0.93, DoS 0.95, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, Fuzzers 0.71, DoS 0.81, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, Fuzzers 0.99, DoS 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, Fuzzers 0.87, DoS 0.97, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).


the other attack types are 99%. By contrast, in the single-layer ESN algorithm the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, identifies some problems in AMI network security, and introduces the contributions of existing researchers in this area.

Secondly, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, there are still some issues in this paper that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform some multicollection device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions for future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model that better fits actual AMI applications, so as to apply it in industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15-39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195-205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52-65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1-6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319-1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195-209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474-479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216-226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490-495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029-1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1-15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66-70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148-153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96-111, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350-355, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70-85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447-489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792-1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730-739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854-3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14-23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859-7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335-352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375-385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366-373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946-959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1-6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1-17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.



When the value is −1, it indicates a completely negative correlation between the two random variables; when the value is 0, it indicates that the two random variables are linearly independent.
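A minimal sketch of Pearson-based feature filtering with pandas; the threshold of 0.3 and the toy data are illustrative assumptions, not values from the paper:

```python
import numpy as np
import pandas as pd

def pearson_filter(df, label_col, threshold=0.3):
    """Keep features whose |Pearson r| with the label meets a threshold."""
    corr = df.corr()[label_col].drop(label_col)
    return corr[corr.abs() >= threshold].index.tolist(), corr

# Toy data: 'a' tracks the label almost linearly, 'b' is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
df = pd.DataFrame({"a": y + 0.1 * rng.normal(size=200),
                   "b": rng.normal(size=200),
                   "label": y})
selected, corr = pearson_filter(df, "label")
print(selected)  # 'a' survives the filter, 'b' does not
```

On real flow features the same call ranks each statistical attribute against the attack label before the Gini step described next.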

Because the Pearson method can only detect the linear relationship between features and classification categories, the nonlinear relationship between the two would be lost. In order to further find the nonlinear relationships in the probe flow characteristics, this paper calculates the information entropy of the characteristics and uses the Gini index to measure, at the data distribution level, the nonlinear relationship between the selected characteristics and the network attack behavior.

In the classification problem, assuming that there are K classes and the probability that a sample point belongs to class i is P_i, the Gini index of the probability distribution is defined as follows [39]:

\[\operatorname{Gini}(P) = \sum_{i=1}^{K} P_i (1 - P_i) = 1 - \sum_{i=1}^{K} P_i^2. \quad (4)\]

Given the sample set D, the Gini coefficient is expressed as follows:

\[\operatorname{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2, \quad (5)\]

where C_k is the subset of samples in D belonging to the kth class and K is the number of classes.
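The Gini index of Eq. (5) can be computed directly from class counts; a minimal sketch (the label values are illustrative):

```python
from collections import Counter

def gini_index(labels):
    """Gini impurity, Eq. (5): 1 minus the sum over classes of (|Ck|/|D|)^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini_index(["dos", "dos", "worms", "worms"]))  # balanced sample: 0.5
print(gini_index(["dos"] * 4))                       # pure sample: 0.0
```

A feature whose value splits lower the Gini impurity of the attack labels carries a strong (possibly nonlinear) relationship with the attack behavior.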

5. ML-ESN Classification Method

The ESN is a new type of recurrent neural network proposed by Jaeger in 2001 and has been widely used in various fields, including dynamic pattern classification, robot control, object tracking, nuclear moving target detection, and event monitoring [32]. In particular, it has made outstanding contributions to time series prediction. The basic ESN network model is shown in Figure 5.

In this model, the network has 3 layers: an input layer, a hidden layer (the reservoir), and an output layer. At time t, assuming that the input layer includes K nodes, the reservoir contains N nodes, and the output layer includes L nodes, then

\[u(t) = [u_1(t), u_2(t), \ldots, u_K(t)]^T,\]
\[x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T,\]
\[y(t) = [y_1(t), y_2(t), \ldots, y_L(t)]^T. \quad (6)\]

W_in (N × K) represents the connection weights from the input layer to the reservoir; W (N × N) represents the connection weights from x(t − 1) to x(t); W_out (L × (K + N + L)) represents the connection weights from the reservoir to the output layer; and W_back (N × L) represents the connection weights from y(t − 1) to x(t), which are optional.

When u(t) is input, the updated state equation of the reservoir is given by

\[x(t+1) = f\left(W_{\text{in}}\, u(t+1) + W\, x(t)\right), \quad (7)\]

where f is the selected activation function of the reservoir and f′ is the activation function of the output layer. The output state equation of the ESN is then given by

\[y(t+1) = f'\left(W_{\text{out}}\, [u(t+1);\, x(t+1)]\right). \quad (8)\]
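Equations (7) and (8) can be sketched in a few lines of NumPy; the sizes, weight scales, and tanh activation below are illustrative assumptions (W_out is random here only to show the shapes — in practice it is trained, e.g., by ridge regression):

```python
import numpy as np

rng = np.random.default_rng(42)
K, N, L = 5, 100, 3  # input, reservoir, and output sizes (illustrative)

W_in = rng.uniform(-0.5, 0.5, (N, K))
W = rng.uniform(-0.5, 0.5, (N, N))
# Rescale W so its spectral radius is below 1 (echo state property).
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def esn_step(x, u):
    """One reservoir update, Eq. (7): x(t+1) = f(W_in u(t+1) + W x(t))."""
    return np.tanh(W_in @ u + W @ x)

x = np.zeros(N)
for _ in range(10):                 # drive the reservoir with random inputs
    u = rng.normal(size=K)
    x = esn_step(x, u)

# Readout, Eq. (8): y = f'(W_out [u; x]); identity readout activation and a
# random W_out here, only to show the shapes.
W_out = rng.normal(size=(L, K + N))
y = W_out @ np.concatenate([u, x])
print(y.shape)
```

Only W_out is learned; W_in and W stay fixed, which is what removes the back-propagation step mentioned in the abstract.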

Researchers have found through experiments that the reservoir of the traditional echo state network is randomly generated, with strong coupling between neurons and limited predictive power.

In order to overcome these problems of the ESN, some improved multilayer ESN (ML-ESN) networks have been proposed in the literature [40, 41]. The basic model of the ML-ESN is shown in Figure 6.

The difference between the two architectures is the number of hidden layers: a single-layer ESN has only one reservoir, while the multilayer version has more than one. The updated state equations of the ML-ESN are given by [41]

\[x_1(n+1) = f\left(W_{\text{in}}\, u(n+1) + W_1\, x_1(n)\right),\]
\[x_k(n+1) = f\left(W_{\text{inter}(k-1)}\, x_{k-1}(n+1) + W_k\, x_k(n)\right),\]
\[x_M(n+1) = f\left(W_{\text{inter}(M-1)}\, x_{M-1}(n+1) + W_M\, x_M(n)\right). \quad (9)\]

The ML-ESN output is then calculated from the final reservoir state of formula (9):

\[y(n+1) = f_{\text{out}}\left(W_{\text{out}}\, x_M(n+1)\right). \quad (10)\]
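A sketch of the stacked update of Eq. (9); the layer sizes and random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, M = 5, 50, 3  # input size, neurons per reservoir, reservoir count

def scaled(n, radius=0.9):
    """Random n-by-n matrix rescaled to a given spectral radius."""
    A = rng.uniform(-0.5, 0.5, (n, n))
    return A * radius / max(abs(np.linalg.eigvals(A)))

W_in = rng.uniform(-0.5, 0.5, (N, K))
W = [scaled(N) for _ in range(M)]                       # internal weights W_k
W_inter = [rng.uniform(-0.5, 0.5, (N, N)) for _ in range(M - 1)]

def ml_esn_step(states, u):
    """One pass through the reservoir stack, Eq. (9): reservoir 1 reads the
    input; reservoir k reads reservoir k-1's fresh state."""
    new = [np.tanh(W_in @ u + W[0] @ states[0])]
    for k in range(1, M):
        new.append(np.tanh(W_inter[k - 1] @ new[k - 1] + W[k] @ states[k]))
    return new

states = [np.zeros(N) for _ in range(M)]
for _ in range(5):
    states = ml_esn_step(states, rng.normal(size=K))
print(len(states), states[-1].shape)
```

The classifier of Eq. (10) would read only the last reservoir state x_M through a trained W_out.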

5.1. ML-ESN Classification Algorithm. In general, when the AMI system is operating normally and securely, the statistical entropy of the network traffic characteristics within a period of time does not change much. However, when the network system suffers an abnormal attack, the statistical characteristic entropy becomes abnormal within a certain time range, and even large fluctuations can occur.

Figure 5: ESN basic model (input layer U(t), reservoir state x(t) with weights W_in, W, and optional feedback W_back, and output layer y(t) via W_out).


As can be seen from Figure 5, the ESN is an improved model for training RNNs. The steps are to use a large-scale, randomly and sparsely connected network of neurons (the reservoir) as the processing medium for the data, map the input feature set from the low-dimensional input space to the high-dimensional state space, and finally train the network on the high-dimensional state space using linear regression or similar methods.

However, in the ESN network, the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively large, the fitting effect is weakened; if it is relatively small, the generalization ability cannot be guaranteed. Therefore, the ESN is not suitable for directly classifying AMI network traffic anomalies.

On the contrary, when the size of a single reservoir is small, the ML-ESN network model can satisfy the echo state property of the internal training network by adding multiple reservoirs, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The test defines multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental set is analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network security attack inspection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset appeared later and can more accurately reflect the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-formatted datasets (training and testing) were selected, and each dataset contained 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of each feature value is not uniform. For example, most of the data are of numerical type, but some features contain character types and the special symbol "-", so the data cannot be used directly. Before processing, the data are standardized; some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balance score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN)/(TP + TN + FP + FN),
FPR = FP/(FP + TN),
TPR = TP/(FN + TP),
precision = TP/(TP + FP),
recall = TP/(FN + TP),
F-score = (2 × precision × recall)/(precision + recall).          (11)

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected. TN (true negative): the number of normal network traffic flows successfully detected.

Figure 6: ML-ESN basic model. The input layer U(t) feeds reservoir 1 through W_in; reservoirs 1, 2, …, M (internal weights W_1, …, W_M, states x_1, …, x_M) are chained through W_inter, and the final state maps to the output y(t) through W_out.


FP (false positive): the number of normal network traffic flows identified as abnormal. FN (false negative): the number of abnormal network traffic flows identified as normal.
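The indicator formulas translate directly into code. A minimal sketch with illustrative counts; note that we use the standard FPR denominator FP + TN, consistent with the definitions above:

```python
def metrics(tp, tn, fp, fn):
    """Compute the evaluation indicators from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                  # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (fn + tp)               # also called TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, f_score

# Example: 100 abnormal flows (90 caught) and 100 normal flows (5 misflagged).
acc, fpr, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
```

For multiclass results such as Figure 12, these counts are computed one-vs-rest for each attack type.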

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044

Algorithm 1: AMI network traffic classification.

Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output:
  Training and testing classification results
Steps:
(1) Initially set the parameters of ML-ESN and determine the corresponding numbers of input and output units according to the dataset:
    (i) set the training data length trainLen;
    (ii) set the test data length testLen;
    (iii) set the number of reservoirs Ri;
    (iv) set the number of neurons in each reservoir N;
    (v) set the reservoir update speed value α;
    (vi) set xi(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix W_in, the internal connection weights of the reservoirs W_i (1 ≤ i ≤ M), and the external connection weights W_inter between reservoirs:
    (i) randomly initialize the values of W_in, W_i, and W_inter;
    (ii) through statistical normalization and spectral radius calculation, scale W_inter and W_i to meet the sparsity requirement: W_i = α(W_i/|λ_in|) and W_inter = α(W_inter/|λ_inter|), where λ_in and λ_inter are the spectral radii of the W_i and W_inter matrices, respectively.
(3) Input training samples into the initialized ML-ESN, collect state variables using equation (9), and input them to the activation function of the reservoir processing units to obtain the final state variables:
    (i) for t from 1 to T, compute x1(t):
        (a) calculate x1(t) according to equation (7);
        (b) for i from 2 to M, calculate xi(t) according to equations (7) and (9);
        (c) get the matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix W_out from reservoir to output layer to obtain the trained ML-ESN network structure:
    (i) W_out = DH^T(HH^T + βI)^−1, where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10):
    (i) select the SoftMax activation function and calculate the output f_out value.
(6) The data in D2 are input into the trained ML-ESN network, the corresponding category identifiers are obtained, and the classification error rate is calculated.
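Steps (3) and (4) of Algorithm 1 — collecting states from chained reservoirs and solving the ridge-regression readout — can be sketched as follows. This is a simplified illustration under our own assumptions (two reservoirs, dummy targets, placeholder sizes), not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(42)
n_in, n_res, n_out, T = 5, 50, 10, 200
alpha, beta = 0.9, 1e-6

# Two chained reservoirs (M = 2): W_in feeds reservoir 1,
# W_inter connects reservoir 1 to reservoir 2.
W_in = rng.uniform(-1, 1, (n_res, n_in))
W1 = rng.uniform(-1, 1, (n_res, n_res))
W2 = rng.uniform(-1, 1, (n_res, n_res))
W_inter = rng.uniform(-1, 1, (n_res, n_res))
for Wm in (W1, W2, W_inter):
    Wm *= alpha / max(abs(np.linalg.eigvals(Wm)))  # spectral-radius scaling

U = rng.uniform(-1, 1, (T, n_in))                  # input sequence
D = rng.integers(0, 2, (n_out, T)).astype(float)   # dummy target matrix

x1 = np.zeros(n_res)
x2 = np.zeros(n_res)
H = np.zeros((n_res + n_in, T))                    # state collection [x2; u]
for t in range(T):
    x1 = np.tanh(W_in @ U[t] + W1 @ x1)            # reservoir 1 update
    x2 = np.tanh(W_inter @ x1 + W2 @ x2)           # reservoir 2 update
    H[:, t] = np.concatenate([x2, U[t]])

# Ridge-regression readout: W_out = D H^T (H H^T + beta I)^-1
W_out = D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(H.shape[0]))
Y = W_out @ H                                      # readout over the sequence
```

Only W_out is learned; the closed-form ridge solve is what lets ML-ESN avoid back-propagation entirely.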


Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, deduplication, completion, and normalization, to obtain normalized and standardized data. Standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.
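The standardization in Step 2 amounts to encoding the character-typed columns and z-scoring each column. A sketch with toy rows (the column names follow the dataset; the integer-code encoding is our assumption):

```python
import pandas as pd

# Toy rows mimicking UNSW_NB15 columns; the real data come from the CSVs.
df = pd.DataFrame({
    "dur":    [0.12, 0.56, 0.03, 0.44],
    "proto":  ["tcp", "udp", "tcp", "tcp"],
    "sbytes": [496, 1762, 1068, 900],
})

# Encode character-typed features as integer category codes.
df["proto"] = df["proto"].astype("category").cat.codes

# z-score standardization: zero mean, unit variance per column.
std = (df - df.mean()) / df.std(ddof=0)
```

After this step every column is numeric and comparable in scale, which is what the values in Figure 7 reflect.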

As can be seen from Figure 8, after normalization most of the attack type data are concentrated between 0.4 and 0.6, Generic attack data are concentrated between 0.7 and 0.9, and normal data are concentrated between 0.1 and 0.3.

Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and Gini index values for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching a value of 0.97, while the correlation between spkts and ct_srv_src (number of connections that contain the same service and source address in the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the initial Pearson correlation threshold is set to 0.5. Features with a Pearson value greater than 0.5 are discarded, and features with a value less than 0.5 are retained.

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and ACK packets) all exceed 0.9, showing a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, i.e., very small.
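The Pearson filtering rule — drop one feature of every pair whose absolute correlation exceeds the threshold — can be sketched as follows (the threshold and the choice of which column to drop are illustrative assumptions):

```python
import numpy as np
import pandas as pd

def pearson_filter(df: pd.DataFrame, threshold: float = 0.5) -> list:
    """Return feature names to keep: for every pair whose absolute Pearson
    correlation exceeds the threshold, drop the later column of the pair."""
    corr = df.corr().abs()
    cols = corr.columns
    drop = set()
    for i in range(len(cols)):
        if cols[i] in drop:
            continue
        for j in range(i + 1, len(cols)):
            if cols[j] not in drop and corr.iloc[i, j] > threshold:
                drop.add(cols[j])
    return [c for c in cols if c not in drop]

rng = np.random.default_rng(1)
a = rng.normal(size=200)
df = pd.DataFrame({"spkts": a,
                   "sloss": a * 2 + 0.01 * rng.normal(size=200),  # ~collinear
                   "rate": rng.normal(size=200)})
kept = pearson_filter(df)   # sloss is nearly collinear with spkts, so dropped
```

On the real dataset this would prune redundant pairs like spkts/sloss while keeping weakly correlated features such as rate.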

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of Gini coefficients, the smaller the Gini coefficient value of a feature, the lower the impurity of that feature in the dataset and the better its training effect.
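One common definition of such an impurity score is the Gini impurity 1 − Σ p_i², computed over the value distribution of a column; the paper does not spell out its exact formula, so the sketch below is our assumption of the standard version:

```python
import numpy as np

def gini(values) -> float:
    """Gini impurity of a discrete feature column: 1 - sum(p_i^2).
    0 means the column is pure (a single value); values near 1
    mean the column is highly mixed."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

pure = gini(["tcp"] * 8)                 # single value -> impurity 0
mixed = gini([0, 1, 2, 3, 0, 1, 2, 3])   # uniform over 4 values -> 0.75
```

Ranking features by such a score is what lets low-impurity features like dpkts and tcprtt be preferred over state and service.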

Based on the results of the Pearson and Gini coefficients for feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter in ms), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set at the start of the experiment; the specific values are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (columns include dur, proto, service, state, spkts, dpkts, sbytes, and dbytes).

Figure 8: Normalized data distribution (box plot per class label: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal).


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, for the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase monotonically: it rises first and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of ML-ESN is to generate from the reservoir a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in the reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 with a mean of 0, which is more conducive to improving training efficiency. Second, when features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoir continuously amplifies the feature effect.

The output layer uses the sigmoid activation function because the sigmoid output value lies between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (correlation heatmap over spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm; e.g., spkts–sloss reaches 0.97).


The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small or a large data sample, the classification effect without filtering is lower than with filtering.

In addition, a single filtering method is not as good as the combination of the two. For example, with 160000 training packets, when no filtering is used, the recognition accuracy for abnormal traffic is only 0.94; with Pearson filtering alone, the accuracy is 0.95; with Gini filtering alone, it is 0.97; and with the combination of the Pearson and Gini indexes, it reaches 0.99.

6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter and then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (per-feature values over service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).

Table 4: The parameters of the ML-ESN experiment.

Parameter                    Value
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      Tanh
Output layer activation fn   Sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10⁻⁶


algorithm to learn; test data are then used to verify the trained model, obtaining the test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

The detection results in Figure 12 show that it is entirely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks based on the combination of Pearson and Gini coefficients for network traffic feature filtering and optimization.

The accuracy, F1-score, and FPR results are very good across all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, and the FPR is only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring the detection accuracy at the same depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of model neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 211 ms, while at a depth of 3 it is only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of model neurons, the training accuracy at first gradually increases with reservoir depth; for example, at a depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy drops to 0.95.

The main reason for this phenomenon is that, at the beginning, the training parameters of the model are gradually optimized as the training depth increases, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain amount of overfitting occurs, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification effect of different filtering methods (accuracy vs. number of training packets from 20000 to 160000; methods: none, Pearson, Gini, and Pearson + Gini).


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; for feature A in particular, the values hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper focuses on comparative simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment uses five test datasets of different scales, containing 5000, 20000, 60000, 120000, and 160000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly, especially the GaussianNB algorithm, whose accuracy falls below 50%, while the other algorithms are very close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-record dataset, the accuracy of the algorithm reached 96.75%, and on the 160000-record dataset, it reached 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots FPR (false-positive rate) on the horizontal axis and TPR

Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for each of the nine attack types: Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) at depths 2–5 with 500, 1000, and 2000 neurons; (b) accuracy in the same settings; (c) accuracy and time comparison of BP, ESN, DecisionTree, and ML-ESN.

Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B over the number of packages).


(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.
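Per-class ROC curves like those in Figures 16–19 are typically computed one-vs-rest, and the AUC equals the rank statistic: the fraction of (positive, negative) pairs the classifier orders correctly. A minimal sketch of that computation (our illustration with toy scores, not the authors' code):

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the rank statistic: the fraction of (positive, negative)
    pairs that the classifier ranks correctly, counting ties as half."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    correct = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (correct + 0.5 * ties) / (len(pos) * len(neg))

# Toy one-vs-rest labels/scores for a single attack class.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.7])
a = auc_score(y_true, y_score)   # 15 of the 16 pairs are ranked correctly
```

An AUC of 1.0, as several ML-ESN classes reach in Figure 19, means every abnormal flow is scored above every normal one.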

From the experimental results in Figures 16–19, it can be seen that, for the classification and detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

Figure 15: Detection results of different classification methods under different data sizes (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN over 20000–160000 records).

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).


the other attack types are 99%. However, in the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises some problems in AMI network security, and introduces the contributions of existing researchers in this area.

Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform with information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform fusion processing across multiple collection devices to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions for future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model more in line with actual AMI applications, so as to apply it in industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

Mathematical Problems in Engineering 19

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.



It can be seen from Figure 5 that the ESN is an improved model for training RNNs. Its steps are as follows: a large-scale, randomly connected sparse network of neurons (the reservoir) is used as the processing medium for the data; the set of input feature values is mapped from the low-dimensional input space into the high-dimensional state space; and, finally, the network is trained on this high-dimensional state space using linear regression or similar methods.
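The mapping described above can be sketched in a few lines of NumPy; the dimensions, sparsity, and toy data below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 5, 100                    # toy sizes, not the paper's settings

# Random input weights and a sparse random reservoir (the "processing medium")
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W[rng.random((n_res, n_res)) > 0.1] = 0.0            # keep ~10% of connections
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))      # rescale spectral radius to 0.9

def run_reservoir(U):
    """Map each low-dimensional input u(t) into the high-dimensional state space."""
    x = np.zeros(n_res)
    states = []
    for u in U:
        x = np.tanh(W_in @ u + W @ x)                # reservoir update; W stays fixed
        states.append(x.copy())
    return np.array(states)

U = rng.random((200, n_in))                          # toy input sequence
X = run_reservoir(U)                                 # (200, 100) state matrix
# Only a linear readout is trained on the states, e.g. by least squares:
Y = rng.random((200, 3))                             # toy targets
W_out, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(X.shape, W_out.shape)                          # (200, 100) (100, 3)
```

Note that only the readout weights are fitted; the input and reservoir weights are never trained, which is why the ESN avoids back-propagation entirely.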

However, in the ESN the number of neurons in the reservoir is difficult to balance: if the number of neurons is relatively small, the fitting ability is weakened, while if it is relatively large, the generalization ability cannot be guaranteed. Therefore, a single ESN is not well suited to directly classifying AMI network traffic anomalies.

In contrast, the ML-ESN model can satisfy the echo-state training condition by stacking multiple reservoirs, each of small size, instead of one large reservoir, thereby improving the overall training performance of the model.

This paper selects the ML-ESN model as the AMI network traffic anomaly classification learning algorithm. The specific implementation is shown in Algorithm 1.

6. Simulation Test and Result Analysis

In order to verify the effectiveness of the proposed method, this paper selects the UNSW_NB15 dataset for simulation testing. The tests use multiple classification indicators, such as accuracy, false-positive rate, and F1-score. In addition, the performance of multiple methods on the same experimental sets is analyzed.

6.1. UNSW_NB15 Dataset. Currently, one of the main research challenges in the field of network attack detection is the lack of comprehensive network-based datasets that reflect modern network traffic conditions, a wide variety of low-footprint intrusions, and deeply structured information about network traffic [42].

Compared with the KDD98, KDDCUP99, and NSLKDD benchmark datasets, which were generated internationally more than a decade ago, the UNSW_NB15 dataset is more recent and more accurately reflects the characteristics of complex network attacks.

The UNSW_NB15 dataset can be downloaded directly from the network and contains nine types of attack data, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms [43].

In these experiments, two CSV-format datasets (training and testing) were selected, and each dataset contains 47 statistical features. The statistics of the training dataset are shown in Table 3.

In the original dataset, the format of the feature values is not uniform. For example, most of the data are numerical, but some features contain character types and the special symbol "-", so the data cannot be used directly. Before processing, the data are therefore standardized; some of the processed feature results are shown in Figure 7.

6.2. Evaluation Indicators. In order to objectively evaluate the performance of this method, this article mainly uses three indicators, accuracy (correct rate), FPR (false-positive rate), and F-score (balanced score), to evaluate the experimental results. Their calculation formulas are as follows:

accuracy = (TP + TN)/(TP + TN + FP + FN),

FPR = FP/(FP + TN),

TPR = TP/(FN + TP),

precision = TP/(TP + FP),

recall = TP/(FN + TP),

F-score = (2 × precision × recall)/(precision + recall).

(11)

The specific meanings of TP, TN, FP, and FN used in the above formulas are as follows:

TP (true positive): the number of abnormal network traffic flows successfully detected.
TN (true negative): the number of normal network traffic flows successfully detected.

Figure 6: ML-ESN basic model (input layer with weight Win; reservoirs 1 to M with internal weights W1, ..., WM and inter-reservoir weights Winter; output layer with weight Wout).


FP (false positive): the number of normal network traffic flows identified as abnormal.
FN (false negative): the number of abnormal network traffic flows identified as normal.
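The indicators can be computed directly from these four counts, as in the following sketch; the example counts are invented.

```python
# Evaluation indicators from TP, TN, FP, and FN counts (example counts invented).
def evaluate(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                      # false-positive rate
    precision = tp / (tp + fp)
    recall = tp / (fn + tp)                   # also the TPR
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, fpr, precision, recall, f_score

acc, fpr, prec, rec, f1 = evaluate(tp=950, tn=900, fp=100, fn=50)
print(round(acc, 3), round(fpr, 3), round(f1, 3))  # 0.925 0.1 0.927
```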

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. For the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID  Type            Number of packets  Size (MB)
1   Normal          56000              3.63
2   Analysis        1560               0.108
3   Backdoors       1746               0.36
4   DoS             12264              2.42
5   Exploits        33393              8.31
6   Fuzzers         18184              4.62
7   Generic         40000              6.69
8   Reconnaissance  10491              2.42
9   Shellcode       1133               0.28
10  Worms           130                0.044

(1) Input:
(2) D1: training dataset
(3) D2: test dataset
(4) U(t): input feature value set
(5) N: the number of neurons per reservoir
(6) Ri: the number of reservoirs
(7) α: interconnection weight spectral radius
(8) Output:
(9) Training and testing classification results
(10) Steps:
(11) (1) Initially set the parameters of the ML-ESN and determine the corresponding number of input and output units according to the dataset:
  (i) set the training data length trainLen;
  (ii) set the test data length testLen;
  (iii) set the number of reservoirs Ri;
  (iv) set the number of neurons per reservoir N;
  (v) set the reservoir update rate α;
  (vi) set xi(0) = 0 (1 ≤ i ≤ M).
(12) (2) Initialize the input connection weight matrix Win, the internal connection weights of the reservoirs Wi (1 ≤ i ≤ M), and the external connection weights between reservoirs Winter:
  (i) randomly initialize the values of Win, Wi, and Winter;
  (ii) through statistical normalization and spectral radius scaling, tune Wi and Winter to meet the sparsity requirements, using Wi = α(Wi/|λin|) and Winter = α(Winter/|λinter|), where λin and λinter are the spectral radii of the Wi and Winter matrices, respectively.
(13) (3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and input them into the activation function of the reservoir processing units to obtain the final state variables:
  (i) for t from 1 to T, calculate x1(t) according to equation (7);
  (ii) for i from 2 to M, calculate xi(t) according to equations (7) and (9);
  (iii) obtain the state matrix H = [x(t + 1), u(t + 1)].
(14) (4) Solve the weight matrix Wout from the reservoirs to the output layer to obtain the trained ML-ESN network structure: Wout = DH^T(HH^T + βI)^(−1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1), u(t + 1)] are the expected output matrix and the state collection matrix.
(15) (5) Calculate the ML-ESN output according to formula (10), selecting the SoftMax activation function to compute the output value fout.
(16) (6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

ALGORITHM 1: AMI network traffic classification.
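Steps (2)–(4) of Algorithm 1 can be sketched as follows, under assumed toy dimensions: states from stacked reservoirs are concatenated with the input, and the readout Wout = DH^T(HH^T + βI)^(−1) is the only trained component.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_res, n_layers, n_out = 5, 50, 3, 10   # toy sizes, not the paper's settings

def scaled(n, rho=0.9):
    """Random square weight matrix rescaled to spectral radius rho."""
    w = rng.uniform(-0.5, 0.5, (n, n))
    return w * rho / np.max(np.abs(np.linalg.eigvals(w)))

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = [scaled(n_res) for _ in range(n_layers)]             # internal weights W_i
W_inter = [scaled(n_res) for _ in range(n_layers - 1)]   # inter-reservoir weights

def collect_states(U):
    x = [np.zeros(n_res) for _ in range(n_layers)]
    H = []
    for u in U:
        x[0] = np.tanh(W_in @ u + W[0] @ x[0])           # first reservoir
        for i in range(1, n_layers):                     # deeper reservoirs
            x[i] = np.tanh(W_inter[i - 1] @ x[i - 1] + W[i] @ x[i])
        H.append(np.concatenate([x[-1], u]))             # H column = [x(t); u(t)]
    return np.array(H).T                                 # (n_res + n_in, T)

U = rng.random((300, n_in))                              # toy input sequence
D = rng.random((n_out, 300))                             # expected output matrix
H = collect_states(U)
beta = 1e-6                                              # ridge regression parameter
W_out = D @ H.T @ np.linalg.inv(H @ H.T + beta * np.eye(H.shape[0]))
print(W_out.shape)                                       # (10, 55)
```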


Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data. This mainly includes operations such as data cleaning, deduplication, completion, and normalization, yielding normalized and standardized data; the standardized data are as shown in Figure 7, and the normalized data distribution is as shown in Figure 8.
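A minimal sketch of this preprocessing on an invented toy matrix; the real pipeline operates on the 47 UNSW_NB15 statistical features.

```python
import numpy as np

# Toy feature matrix with a duplicate row and a missing value (invented data).
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [2.0, 400.0],      # duplicate row
              [4.0, np.nan]])    # missing value

X = np.unique(X, axis=0)                         # deduplication
col_mean = np.nanmean(X, axis=0)
X = np.where(np.isnan(X), col_mean, X)           # completion with column means

z = (X - X.mean(axis=0)) / X.std(axis=0)         # standardization (z-score)
mm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))   # min-max to [0, 1]
print(mm.min(), mm.max())                        # 0.0 1.0
```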

As can be seen from Figure 8, after normalization most of the attack-type data are concentrated between 0.4 and 0.6, the Generic attack data are concentrated between 0.7 and 0.9, and the normal data are concentrated between 0.1 and 0.3.

Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and Gini indices for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably. For example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching 0.97, whereas the correlation between spkts and ct_srv_src (the number of connections with the same service and source address among the last 100 connections) is the smallest, only −0.069.

In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the initial Pearson correlation threshold is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with a value less than 0.5 are retained.

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time: the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. In contrast, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.

In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of the Gini coefficient, the smaller a feature's Gini value, the lower its impurity in the dataset and the better its training effect.
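The Pearson screening in Step 3 can be sketched as follows (the Gini screening is analogous, ranking features by impurity); the feature names and data here are randomly generated stand-ins.

```python
import numpy as np

# Pearson-based feature filtering on invented stand-in data; the 0.5 threshold
# follows the text above.
rng = np.random.default_rng(2)
n = 500
features = {
    "spkts": rng.random(n),
    "dpkts": rng.random(n),
    "rate":  rng.random(n),
}
features["sloss"] = features["spkts"] * 0.9 + rng.random(n) * 0.05  # correlated pair

names = list(features)
X = np.stack([features[k] for k in names])
corr = np.abs(np.corrcoef(X))          # pairwise Pearson coefficients

drop = set()
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if corr[i, j] > 0.5:           # of each highly correlated pair,
            drop.add(names[j])         # discard the later feature
selected = [k for k in names if k not in drop]
print(selected)                        # ['spkts', 'dpkts', 'rate']
```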

Based on the results of the Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, in ms), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were set initially in the experiment; the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (columns shown: dur, proto, service, state, spkts, dpkts, sbytes, dbytes).

Figure 8: Normalized data distribution (classes: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, Normal).


UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, on the same dataset, as the number of reservoirs increases, the model training time gradually increases, while the detection accuracy does not keep rising but instead first increases and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of the ML-ESN is that the reservoirs generate a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.

In Table 4, the tanh activation function is used in the reservoir layer because its output range is (−1, 1) and zero-centered, which helps improve training efficiency, and because tanh performs better when features differ significantly. In addition, the neuron fitting process in the ML-ESN reservoirs continuously amplifies the feature effect.

The output layer uses the sigmoid activation function because its output lies between 0 and 1, which directly reflects the probability of a given attack type.

In Table 4, the last three parameters are important for tuning the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10−6, respectively, based on relatively optimized values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (features: spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, ct_dst_ltm).


The experimental environment was Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset with neither filtering method, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether on small or large data samples, the classification effect without filtering is lower than with filtering.

In addition, a single filtering method is not as good as the combination of the two. For example, on the 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 when no filtering is used, 0.95 with Pearson filtering alone, 0.97 with Gini filtering alone, and 0.99 with the combination of the Pearson and Gini indices.

6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first filters with the Pearson and Gini indices and then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (features: service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, djit).

Table 4: The parameters of the ML-ESN experiment.

Parameter                    Value
Input dimension number       5
Output dimension number      10
Reservoir number             3
Reservoir neurons number     1000
Reservoir activation fn      tanh
Output layer activation fn   sigmoid
Update rate                  0.9
Random seed                  50
Regularization rate          1.0 × 10−6


algorithm to learn, and then uses the test data to verify the trained model, obtaining detection results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

From the detection results in Figure 12, it is entirely feasible to use the ML-ESN learning model, combined with Pearson-Gini feature filtering and optimization, to quickly classify anomalous network traffic attacks.

The accuracy, F1-score, and FPR results are very good across all nine attack types. For example, for the Generic attack type, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; for the Shellcode and Worms attack types, both the accuracy and the F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with results shown in Figure 13(a); (2) measuring detection accuracy at the same reservoir depths and neuron counts, with results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the reservoir depth increases; for example, with 1000 neurons, a reservoir depth of 5 takes 211 ms, while a depth of 3 takes only 116 ms. In addition, at the same reservoir depth, the more neurons the model has, the more training time it consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy at first increases with reservoir depth: at a depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. But when the depth is increased to 5, the training accuracy falls back to 0.95.

The main reason for this phenomenon is that, initially, as the depth increases, the model's training parameters are progressively optimized, so the training accuracy keeps improving. However, when the depth reaches 5, the model overfits to a certain extent, which causes the accuracy to decrease.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method has good detection ability for different attack types.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

200

00

400

00

600

00

800

00

120

000

100

000

140

000

160

000

NonPeason

GiniPeason + Gini

10

09

08

07

06

05

04

Data

Accu

racy

Figure 11 Classification effect of different filtering methods

14 Mathematical Problems in Engineering

performance of the UNSW_NB15 dataset by a variety ofdifferent classifiers

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper focuses on comparing simulation experiments with traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment focuses on five test datasets of different scales, namely, 5000, 20000, 60000, 120000, and 160000 records, and each dataset contains 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20000-record dataset, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm falls below 50% accuracy, while the other algorithms stay close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-record dataset the accuracy of the algorithm reached 96.75%, and on the 160000-record dataset it reached 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments used ROC (receiver operating characteristic) curves to evaluate performance. A ROC curve plots the FPR (false-positive rate) on the horizontal axis against the TPR (true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.

Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR per attack type; accuracy and F1-score range from 0.94 to 1.0, FPR from 0.01 to 0.02).

Figure 13: ML-ESN results at different reservoir depths: (a) detection time, (b) accuracy, and (c) comparison with the BP, DecisionTree, and single-layer ESN methods.

Figure 14: Distribution map of the first two statistical characteristics.

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.
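The ROC/AUC evaluation described above can be sketched without any framework by sweeping a decision threshold over the classifier scores and integrating the resulting (FPR, TPR) points with the trapezoidal rule. This is a minimal illustration (tie handling between equal scores is simplified), not the paper's evaluation code:

```python
import numpy as np

def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by sweeping the decision threshold
    from the highest score downwards."""
    y = np.asarray(y_true)[np.argsort(-np.asarray(scores, dtype=float))]
    tpr = np.concatenate([[0.0], np.cumsum(y) / max(y.sum(), 1)])
    fpr = np.concatenate([[0.0], np.cumsum(1 - y) / max((1 - y).sum(), 1)])
    return fpr, tpr

def auc(y_true, scores):
    """Area under the ROC curve by the trapezoidal rule."""
    fpr, tpr = roc_points(y_true, scores)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A perfect ranking of positives above negatives yields an AUC of 1.0, and a completely inverted ranking yields 0.0, matching the "larger AUC is better" rule used in the figures.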

From the experimental results in Figures 16-19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the other attack types are 99%. With the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect, because its detection success rate is generally less than 80% and its false-positive rate is close to 35%.

Figure 15: Detection results of different classification methods under different data sizes.

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (per-class AUC from 0.92 to 0.99).

Figure 17: Classification ROC diagram of the BP algorithm (per-class AUC from 0.87 to 0.99).

Figure 18: Classification ROC diagram of the DecisionTree algorithm (per-class AUC from 0.71 to 0.82).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (per-class AUC from 0.97 to 1.00).

7. Conclusion

This article first analyzes the current situation of AMI network security research at home and abroad, identifies some problems in AMI network security, and introduces the contributions of existing researchers in AMI network security.

Secondly, in order to solve the problems of low accuracy and high false-positive rates of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, there are still some issues that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the author of this article suggests that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example, through parallel training, greatly reducing the learning time and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model more in line with actual AMI applications, so as to apply it in actual industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for Professional Degree Postgraduates of Changsha University of Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15-39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195-205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52-65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1-6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319-1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.


[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195-209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474-479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216-226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490-495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029-1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1-15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66-70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148-153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, pp. 96-111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, pp. 350-355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70-85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447-489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792-1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730-739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854-3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14-23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859-7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335-352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375-385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366-373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946-959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1-6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1-17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144-2152.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.



FP (false positive): the number of normal network traffic records that are identified as abnormal network traffic.
FN (false negative): the number of abnormal network traffic records that are identified as normal network traffic.
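From these confusion-matrix counts, the rates used in the evaluation follow directly; a minimal sketch assuming the standard definitions (TPR = TP/(TP + FN), FPR = FP/(FP + TN)):

```python
def rates(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts.

    TPR (detection rate) = TP / (TP + FN); FPR = FP / (FP + TN);
    accuracy = (TP + TN) / all samples. These are the conventional
    formulas, assumed here since the paper does not restate them."""
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    acc = (tp + tn) / (tp + fp + tn + fn)
    return tpr, fpr, acc
```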

6.3. Simulation Experiment Steps and Results

Step 1. In a real AMI network environment, first collect the AMI probe stream metadata in real time; these metadata are as shown in Figure 3. In the UNSW_NB15 dataset, this step is omitted.

Table 3: The statistics of the training dataset.

ID   Type             Number of packets   Size (MB)
1    Normal           56000               3.63
2    Analysis         1560                0.108
3    Backdoors        1746                0.36
4    DoS              12264               2.42
5    Exploits         33393               8.31
6    Fuzzers          18184               4.62
7    Generic          40000               6.69
8    Reconnaissance   10491               2.42
9    Shellcode        1133                0.28
10   Worms            130                 0.044

Input:
  D1: training dataset
  D2: test dataset
  U(t): input feature value set
  N: the number of neurons in each reservoir
  Ri: the number of reservoirs
  α: interconnection weight spectral radius
Output: training and testing classification results
Steps:
(1) Initially set the parameters of ML-ESN and determine the corresponding numbers of input and output units according to the dataset:
  (i) set the training data length trainLen;
  (ii) set the test data length testLen;
  (iii) set the number of reservoirs Ri;
  (iv) set the number of neurons in each reservoir N;
  (v) set the speed value of the reservoir update α;
  (vi) set xi(0) = 0 (1 ≤ i ≤ M).
(2) Initialize the input connection weight matrix Win, the internal connection weights of the reservoirs wi (1 ≤ i ≤ M), and the weights of the external connections between reservoirs winter:
  (i) randomly initialize the values of Win, wi, and winter;
  (ii) through statistical normalization and spectral radius calculation, scale wi and winter to meet the sparsity requirement: wi = α(wi/|λin|) and winter = α(winter/|λinter|), where λin and λinter are the spectral radii of the wi and winter matrices, respectively.
(3) Input the training samples into the initialized ML-ESN, collect the state variables using equation (9), and feed them to the activation function of the reservoir processing units to obtain the final state variables:
  (i) for t from 1 to T, compute x1(t) according to equation (7);
  (ii) for i from 2 to M, compute xi(t) according to equations (7) and (9);
  (iii) collect the state matrix H = [x(t + 1); u(t + 1)].
(4) Solve the weight matrix Wout from the reservoir to the output layer to obtain the trained ML-ESN network structure: Wout = DH^T(HH^T + βI)^(-1), where β is the ridge regression parameter, I is the identity matrix, and D = [e(t)] and H = [x(t + 1); u(t + 1)] are the expected output matrix and the state collection matrix.
(5) Calculate the ML-ESN output according to formula (10): select the SoftMax activation function and calculate the output fout value.
(6) Input the data in D2 into the trained ML-ESN network, obtain the corresponding category identifiers, and calculate the classification error rate.

Algorithm 1: AMI network traffic classification.
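Algorithm 1 can be sketched in a few dozen lines of Python. The code below is an illustration rather than the authors' implementation: the layer sizes, leak rate, and spectral radius are assumed values; the readout is trained only on the last reservoir's states (omitting the input concatenation in H); and the ridge-regression solution Wout = DH^T(HH^T + βI)^(-1) from step (4) is used verbatim:

```python
import numpy as np

rng = np.random.default_rng(0)

def scale_spectral_radius(w, alpha):
    """Rescale a reservoir matrix so its spectral radius equals alpha (step (2))."""
    return alpha * w / np.max(np.abs(np.linalg.eigvals(w)))

class MLESN:
    """Minimal multilayer echo state network: stacked leaky-integrator
    reservoirs feeding one ridge-regression readout."""

    def __init__(self, n_in, n_res=100, n_layers=3, alpha=0.9, leak=0.5, beta=1e-6):
        self.leak, self.beta = leak, beta
        self.n_res, self.n_layers = n_res, n_layers
        # Step (2): random input weights per layer; recurrent weights rescaled.
        self.w_in = [rng.uniform(-0.5, 0.5, (n_res, n_in if i == 0 else n_res))
                     for i in range(n_layers)]
        self.w = [scale_spectral_radius(rng.uniform(-0.5, 0.5, (n_res, n_res)), alpha)
                  for _ in range(n_layers)]
        self.w_out = None

    def _states(self, u_seq):
        """Step (3): drive each reservoir in turn; collect last-layer states."""
        x = [np.zeros(self.n_res) for _ in range(self.n_layers)]
        collected = []
        for u in u_seq:
            inp = u
            for i in range(self.n_layers):
                pre = self.w_in[i] @ inp + self.w[i] @ x[i]
                x[i] = (1 - self.leak) * x[i] + self.leak * np.tanh(pre)
                inp = x[i]
            collected.append(x[-1].copy())
        return np.array(collected)                     # shape (T, n_res)

    def fit(self, u_seq, d):
        """Step (4): Wout = D H^T (H H^T + beta I)^(-1), ridge regression."""
        h = self._states(u_seq).T                      # state matrix, (n_res, T)
        self.w_out = d.T @ h.T @ np.linalg.inv(h @ h.T + self.beta * np.eye(self.n_res))
        return self

    def predict(self, u_seq):
        """Steps (5)-(6): linear readout; argmax picks the class, so the
        softmax that would normalize the scores is omitted."""
        return np.argmax(self._states(u_seq) @ self.w_out.T, axis=1)
```

For the UNSW_NB15 experiment described below, `u_seq` would hold the five selected flow features per record and `d` a 10-way one-hot matrix (9 attack types plus normal).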


Step 2. Perform data preprocessing on the AMI metadata or UNSW_NB15 CSV-format data, mainly including operations such as data cleaning, data deduplication, data completion, and data normalization, to obtain normalized and standardized data. The standardized data are as shown in Figure 7, and the normalized data distribution is as shown in Figure 8.

As can be seen from Figure 8, after normalizing the data, most of the attack-type data are concentrated between 0.4 and 0.6, but the Generic attack-type data are concentrated between 0.7 and 0.9, and the normal-type data are concentrated between 0.1 and 0.3.
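The paper does not spell out its normalization and standardization formulas; a minimal sketch, assuming the usual min-max scaling (behind the [0, 1] ranges of Figure 8) and z-score standardization (behind the zero-centered values of Figure 7):

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column to [0, 1]; a guard keeps constant columns finite."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (x - lo) / span

def standardize(x):
    """Zero-mean, unit-variance scaling per column (z-score)."""
    x = np.asarray(x, dtype=float)
    std = x.std(axis=0)
    return (x - x.mean(axis=0)) / np.where(std > 0, std, 1.0)
```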

Step 3. Calculate the Pearson coefficient values and the Gini indices for the standardized data. In the experiment, the Pearson coefficient values and the Gini indices for the UNSW_NB15 standardized data are as shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching a value of 0.97, while the correlation between spkts and ct_srv_src (the number of connections that contain the same service and source address in the last 100 connections) is the smallest, only -0.069.

In the experiment, in order not to discard a large number of valuable features at the outset but to retain the distribution of the original data as much as possible, the initial threshold of the Pearson correlation coefficient is set to 0.5: features with a Pearson value greater than 0.5 are discarded, and features with a value less than 0.5 are retained.
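The Pearson filtering rule can be sketched as follows. The paper does not say which member of a highly correlated pair is discarded, so this illustration keeps the earlier feature; the 0.5 threshold matches the text, and the feature names in the test are examples:

```python
import numpy as np

def pearson_filter(x, names, threshold=0.5):
    """Greedily keep features; drop any feature whose |Pearson r| with an
    already-kept feature exceeds the threshold.

    Which member of a correlated pair the paper drops is unspecified;
    keeping the earlier one is this sketch's assumption."""
    r = np.corrcoef(np.asarray(x, dtype=float), rowvar=False)
    keep = []
    for j in range(len(names)):
        if all(abs(r[j, k]) <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]
```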

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, i.e., the time between the SYN_ACK and ACK packets) all exceed 0.9, a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1, which is very small.

In order to further examine the importance of the extracted statistical features in the dataset, Gini coefficient values are calculated for the extracted features; these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of the Gini coefficient, the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset and the better the training effect of the feature.
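The paper does not give its exact Gini formula. One common reading, sketched below, scores a feature by the weighted Gini impurity of the class labels after binning the feature's values (the bin count here is an assumption): a low score means the feature's value ranges separate the classes well.

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity of a label set: 1 - sum(p_k^2). 0 = pure, near 1 = mixed."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def feature_gini(values, labels, bins=10):
    """Weighted Gini impurity of the labels after binning one feature's values.

    A sketch of one common feature-scoring scheme, not the paper's exact formula."""
    values = np.asarray(values, dtype=float)
    edges = np.histogram_bin_edges(values, bins=bins)
    idx = np.clip(np.digitize(values, edges[1:-1]), 0, bins - 1)
    labels = np.asarray(labels)
    total = len(labels)
    return sum((idx == b).sum() / total * gini_impurity(labels[idx == b])
               for b in range(bins) if (idx == b).any())
```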

Based on the results of the Pearson and Gini coefficient feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter (mSec)), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. Relevant parameters were initially set in the experiment; the specific parameters are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features; for example, in the UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

Figure 7: Partial feature data after standardization (columns dur, proto, service, state, spkts, dpkts, sbytes, and dbytes, scaled to zero mean).

Figure 8: Normalized data distribution.

The number of output neurons is set to 10, and these 10 outputs correspond to the 9 abnormal attack types and the 1 normal type, respectively.

Generally speaking, under the same dataset, as the number of reservoirs increases, the time for model training gradually increases, but the accuracy of model detection does not increase monotonically: it increases first and then decreases. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of ML-ESN is that the reservoir generates a complex dynamic space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of these internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.

In Table 4, the reason the tanh activation function is used in the reservoir layer is that its value range is between -1 and 1 with a mean of 0, which is more conducive to improving training efficiency. Second, when the features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoir continuously amplifies this feature effect.

The reason the output layer uses the sigmoid activation function is that the output value of sigmoid is between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important parameters for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10^-6, respectively, based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

Figure 9: The Pearson coefficient values between the UNSW_NB15 features (spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm).

12 Mathematical Problems in Engineering

The experimental environment is a Windows 10 Home 64-bit operating system with Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset with neither filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether on a small data sample or a large one, the classification effect without filtering is lower than with filtering.

In addition, using a single filtering method is not as good as using a combination of the two. For example, on the 160,000 training packets, when no filtering method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy of the model reaches 0.99.

6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes for filtering and then uses the ML-ESN training

[Figure 10 here: Gini index values for the UNSW_NB15 features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit; values range from about 0.56 to 1.]

Figure 10: The Gini values for UNSW_NB15.

Table 4: The parameters of the ML-ESN experiment.

Parameters                  Values
Input dimension number      5
Output dimension number     10
Reservoir number            3
Reservoir neurons number    1000
Reservoir activation fn     Tanh
Output layer activation fn  Sigmoid
Update rate                 0.9
Random seed                 50
Regularization rate         1.0 × 10⁻⁶


algorithm to learn, and then uses the test data to verify the training model, obtaining the test results for the different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.

The detection results in Figure 12 show that it is completely feasible to use the ML-ESN network learning model to quickly classify anomalous network traffic attacks after the combination of the Pearson and Gini coefficients is used for network traffic feature filtering and optimization.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy value is 0.98, the F1-score value is also 0.98, and the FPR value is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score values reach 0.99, and the FPR value is only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score value exceeds 0.96.

6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 21.1 ms, while at a depth of 3 it is only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, when the reservoir depth is 3 and the neuron count is 1000, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons, the detection accuracy is only 0.93. But when the depth is increased to 5, the training accuracy of the model drops to 0.95.

The main reason for this phenomenon is that, at the beginning, as the depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results reflect that the method proposed in this paper has good detection ability for different attack types after model self-learning.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

[Figure 11 here: accuracy versus training data size (20,000 to 160,000 packets) for four settings: no filtering, Pearson only, Gini only, and Pearson + Gini.]

Figure 11: Classification effect of different filtering methods.


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; for feature A in particular, the values hardly exceed 6.0. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper focuses on comparative simulation experiments against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
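Such a comparison can be sketched with scikit-learn (synthetic data from make_classification stands in for the filtered UNSW_NB15 features; the sizes and any resulting scores here are illustrative, not the paper's numbers):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# stand-in for the 5 filtered features and a few attack classes
X, y = make_classification(n_samples=2000, n_features=5, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

scores = {}
for name, clf in [("GaussianNB", GaussianNB()),
                  ("KNN", KNeighborsClassifier()),
                  ("DecisionTree", DecisionTreeClassifier(random_state=0)),
                  ("MLP", MLPClassifier(max_iter=500, random_state=0))]:
    clf.fit(X_tr, y_tr)
    scores[name] = clf.score(X_te, y_te)  # held-out accuracy per baseline
```

Running the same loop over test sets of increasing size reproduces the kind of comparison shown in Figure 15.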

This simulation experiment focuses on five test datasets of different scales, namely, 5,000, 20,000, 60,000, 120,000, and 160,000 packets, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-packet data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm falls below 50% accuracy, while the other algorithms are close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and the accuracy of the algorithm rapidly improves. For example, on the 120,000-packet dataset the accuracy of the algorithm reaches 96.75%, and on the 160,000-packet dataset it reaches 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis against the TPR

[Figure 12 here: accuracy, F1-score, and FPR for the nine attack types (Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, Worms); accuracy and F1-score range from 0.94 to 1.0 and FPR from 0.01 to 0.02.]

Figure 12: Classification results of the ML-ESN method.


[Figure 13 here: (a) detection time (ms) at reservoir depths 2-5 with 500, 1000, and 2000 neurons; (b) accuracy (0.91 to 0.96) at the same depths and neuron counts; (c) accuracy and time (s) for BP, ESN, DecisionTree, and ML-ESN.]

Figure 13: ML-ESN results at different reservoir depths.

[Figure 14 here: distributions of feature A and feature B over 0 to 160,000 packages.]

Figure 14: Distribution map of the first two statistical characteristics.


(true-positive rate) on the vertical axis. Generally speaking, a ROC graph uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
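The FPR/TPR sweep behind these ROC graphs can be sketched from scratch in NumPy (an illustration, not the paper's evaluation code; score ties are ignored for brevity):

```python
import numpy as np

def roc_points(scores, labels):
    """Sweep a threshold over the anomaly scores and return the
    (FPR, TPR) points used to draw a ROC curve."""
    order = np.argsort(-scores)          # sort by descending score
    labels = np.asarray(labels)[order]
    tps = np.cumsum(labels)              # true positives at each cut
    fps = np.cumsum(1 - labels)          # false positives at each cut
    tpr = tps / labels.sum()
    fpr = fps / (len(labels) - labels.sum())
    return np.concatenate(([0.0], fpr)), np.concatenate(([0.0], tpr))

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoid rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# a perfect detector ranks every attack (label 1) above every normal flow
y = np.array([1, 1, 1, 0, 0, 0])
perfect = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.1])
fpr, tpr = roc_points(perfect, y)
```

For the perfect ranking above, the curve hugs the top-left corner and the AUC is 1.0; a random scorer would hover around 0.5, which is why the near-1.0 areas in Figure 19 indicate strong separation.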

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16-19, respectively.

From the experimental results in Figures 16-19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the

[Figure 15 here: accuracy versus data size (20,000 to 160,000 packets) for GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN.]

Figure 15: Detection results of different classification methods under different data sizes.

[Figure 16 here: per-class ROC curves with AUC values: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99.]

Figure 16: Classification ROC diagram of the single-layer ESN algorithm.


[Figure 17 here: per-class ROC curves with AUC values: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96.]

Figure 17: Classification ROC diagram of the BP algorithm.

[Figure 18 here: per-class ROC curves with AUC values: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81.]

Figure 18: Classification ROC diagram of the DecisionTree algorithm.

[Figure 19 here: per-class ROC curves with AUC values: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00.]

Figure 19: Classification ROC diagram of our ML-ESN algorithm.


other attack types are 99%. With the single-layer ESN algorithm, however, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally below 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises some problems in AMI network security, and introduces the contributions of existing researchers to AMI network security.

Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, there are still some issues that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain degree of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example, through parallel training, greatly reducing the learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model more in line with actual AMI applications, so as to apply it to industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.


[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation,"


2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.



Step 2. Perform data preprocessing on the AMI metadata or the UNSW_NB15 CSV-format data, which mainly includes operations such as data cleaning, data deduplication, data completion, and data normalization, to obtain normalized and standardized data. The standardized data are shown in Figure 7, and the normalized data distribution is shown in Figure 8.

As can be seen from Figure 8, after normalization, most of the attack type data are concentrated between 0.4 and 0.6, but the Generic attack type data are concentrated between 0.7 and 0.9, and the normal type data are concentrated between 0.1 and 0.3.
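A hypothetical sketch of this preprocessing step (the column names mirror UNSW_NB15 fields, but the toy records and the specific choices of mean-filling and min-max scaling are illustrative assumptions, not the authors' exact pipeline):

```python
import numpy as np
import pandas as pd

def preprocess(df):
    """Step 2 as a sketch: de-duplicate rows, fill missing values with
    the column mean, then min-max normalize numeric columns into [0, 1]."""
    df = df.drop_duplicates()
    df = df.fillna(df.mean(numeric_only=True))
    num = df.select_dtypes(include=[np.number]).columns
    span = (df[num].max() - df[num].min()).replace(0, 1)  # avoid divide-by-zero
    df[num] = (df[num] - df[num].min()) / span
    return df

# toy records with one duplicate row and one missing value
demo = pd.DataFrame({"dur": [0.1, 0.5, 0.5, np.nan, 1.0],
                     "sbytes": [100, 400, 400, 200, 300]})
clean = preprocess(demo)
```

After this step every numeric column lies in [0, 1], which is the scale visible in Figure 8.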

Step 3. Calculate the Pearson coefficient value and the Gini index for the standardized data. In the experiment, the Pearson coefficient values and the Gini indexes for the standardized UNSW_NB15 data are shown in Figures 9 and 10, respectively.

It can be observed from Figure 9 that the Pearson coefficients between features differ considerably; for example, the correlation between spkts (source-to-destination packet count) and sloss (source packets retransmitted or dropped) is relatively large, reaching a value of 0.97, while the correlation between spkts and ct_srv_src (number of connections that contain the same service and source address in the last 100 connections) is the smallest, only -0.069.

In the experiment, in order not to discard a large number of valuable features at the beginning but to retain the distribution of the original data as much as possible, the initial threshold for the Pearson correlation coefficient is set to 0.5. Features with a Pearson value greater than 0.5 are discarded, and features below 0.5 are retained.

Therefore, it can be seen from Figure 9 that the correlations between spkts and sloss, between dpkts (destination-to-source packet count) and dbytes (destination-to-source transaction bytes), and between tcprtt and ackdat (TCP connection setup time, the time between the SYN_ACK and the ACK packets) all exceed 0.9, showing a strong positive correlation. On the contrary, the correlations between spkts and state and between dbytes and tcprtt are less than 0.1 and are thus very small.
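The Pearson filtering step can be sketched as follows (a minimal pandas sketch under the 0.5 threshold above; the greedy keep/drop order and the synthetic data are my assumptions, not the authors' implementation):

```python
import numpy as np
import pandas as pd

def pearson_filter(df, threshold=0.5):
    """Greedily keep features, dropping any feature whose absolute
    Pearson correlation with an already-kept feature exceeds the
    threshold (0.5 in the paper)."""
    corr = df.corr().abs()
    keep = []
    for col in df.columns:
        if all(corr.loc[col, k] <= threshold for k in keep):
            keep.append(col)
    return df[keep]

# spkts and sloss are made almost perfectly correlated, as in Figure 9;
# the values themselves are synthetic
rng = np.random.default_rng(1)
a = rng.normal(size=200)
demo = pd.DataFrame({"spkts": a,
                     "sloss": 2 * a + rng.normal(scale=0.01, size=200),
                     "rate": rng.normal(size=200)})
filtered = pearson_filter(demo)
```

Here sloss is dropped because its correlation with the already-kept spkts is near 1, while the uncorrelated rate survives.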

In order to further examine the importance of the extracted statistical features in the dataset, the Gini coefficient values are calculated for the extracted features, and these values are shown in Figure 10.

As can be seen from Figure 10, the Gini values of the selected dpkts, dbytes, sloss, and tcprtt features are all less than 0.6, while the Gini values of several features such as state and service are equal to 1. From the principle of the Gini coefficient, the smaller the Gini coefficient value of a feature, the lower the impurity of the feature in the dataset, and the better the training effect of the feature.
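The paper does not give its exact Gini formula; the sketch below assumes the CART-style Gini impurity, computed over quantile bins of each feature, so that a lower value marks a purer, more useful feature (the binning scheme and the data are illustrative assumptions):

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def feature_gini(feature, labels, bins=4):
    """Weighted Gini impurity of the labels after splitting the feature
    into quantile bins; lower means the feature separates classes better."""
    edges = np.quantile(feature, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(feature, edges[1:-1]), 0, bins - 1)
    n = len(labels)
    return sum((idx == b).sum() / n * gini_impurity(labels[idx == b])
               for b in range(bins))

# a feature that separates the two classes versus one that does not
y = np.array([0] * 50 + [1] * 50)
good = np.concatenate([np.zeros(50), np.ones(50)])
noisy = np.random.default_rng(0).normal(size=100)
```

The perfectly separating feature scores near 0 while the noise feature stays high, matching the rule that a smaller Gini value means a better training feature.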

Based on the results of the Pearson and Gini coefficients for feature selection on the UNSW_NB15 dataset, this paper finally selected five important features as model classification features: rate, sload (source bits per second), dload (destination bits per second), sjit (source jitter, in mSec), and dtcpb (destination TCP base sequence number).

Step 4. Perform attack classification on the extracted feature data according to Algorithm 1. The relevant parameters were initially set in the experiment, and the specific values are shown in Table 4.

In Table 4, the input dimension is determined according to the number of selected features. For example, in the

Figure 7: Partial feature data after standardization (a table of standardized values of the first eight features, dur, proto, service, state, spkts, dpkts, sbytes, and dbytes, for the first nine records).
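The values in Figure 7 are zero-centred, which is consistent with a column-wise z-score; a minimal sketch of that standardization, assuming z-score is the scheme used:

```python
import numpy as np

def zscore(X):
    """Column-wise z-score standardization: (x - mean) / std, the usual
    way to obtain zero-centred feature values like those in Figure 7."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0          # guard against constant columns
    return (X - mu) / sigma

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
Z = zscore(X)
print(Z.round(3))
```

Each standardized column then has mean 0 and unit standard deviation.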

Figure 8: Normalized data distribution (a heat map over the data label box: Worms, Shellcode, Backdoor, Analysis, Reconnaissance, DoS, Fuzzers, Exploits, Generic, and Normal; color scale 0.0 to 1.0).

Mathematical Problems in Engineering 11

UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.

The number of output neurons is set to 10, and these 10 outputs correspond to 9 abnormal attack types and 1 normal type, respectively.
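This 10-way output is naturally encoded as one-hot vectors; a minimal sketch (the class ordering below is hypothetical, not taken from the paper):

```python
# Hypothetical label-to-output mapping: the dataset's nine attack
# categories plus "Normal" become 10 one-hot output neurons.
CLASSES = ["Normal", "Generic", "Exploits", "Fuzzers", "DoS",
           "Reconnaissance", "Analysis", "Backdoor", "Shellcode", "Worms"]

def one_hot(label):
    vec = [0.0] * len(CLASSES)
    vec[CLASSES.index(label)] = 1.0
    return vec

print(one_hot("DoS"))  # 1.0 at index 4, zeros elsewhere
```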

Generally speaking, for the same dataset, as the number of reservoirs increases, the model training time gradually increases, but the detection accuracy does not increase indefinitely; it rises first and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of ML-ESN is that the reservoir generates a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of these internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
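A scaled-down sketch of this stacked-reservoir idea, using the leak/update rate 0.9 from Table 4 and the spectral-radius rescaling common to ESNs; the 100-neuron reservoirs (instead of Table 4's 1000) and the initialization details are assumptions for speed and illustration, not the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(50)   # random seed value taken from Table 4

def make_reservoir(n_in, n_res, spectral_radius=0.9):
    """One untrained reservoir: random input weights plus a recurrent
    matrix rescaled so its spectral radius stays below 1 (echo state
    property). A sketch, not the paper's exact initialization."""
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    return W_in, W

def run_ml_esn(U, layers, leak=0.9):
    """Drive stacked reservoirs: layer 0 sees the input sequence, each
    deeper layer sees the state sequence of the layer below."""
    drive = U
    for W_in, W in layers:
        x = np.zeros(W.shape[0])
        states = []
        for u in drive:
            x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
            states.append(x)
        drive = np.array(states)
    return drive                   # states of the top reservoir

# 3 reservoirs of 100 neurons on a 5-feature input stream, a scaled-down
# version of Table 4's "3 reservoirs x 1000 neurons" setting.
layers = [make_reservoir(5, 100)] + [make_reservoir(100, 100) for _ in range(2)]
states = run_ml_esn(rng.normal(size=(20, 5)), layers)
print(states.shape)
```

In a full ML-ESN, only a linear readout on these top-layer states would be trained, which is why the back-propagation mechanism can be abandoned.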

In Table 4, the tanh activation function is used in the reservoir layer because its value range is between −1 and 1 and its output is centered on 0, which is more conducive to improving training efficiency. Second, when the features differ significantly, tanh yields a better detection effect. In addition, the neuron fitting process in the ML-ESN reservoir will continuously amplify the feature effect.

The output layer uses the sigmoid activation function because the output value of sigmoid lies between 0 and 1, which directly reflects the probability of a certain attack type.

In Table 4, the last three parameters are important for tuning the ML-ESN model. The three values are set to 0.9, 50, and 1.0 × 10−6, respectively, based on relatively optimized parameter values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

Figure 9: The Pearson coefficient values for UNSW_NB15 (a correlation heat map over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm; color scale 0.0 to 1.0).


The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU at 1.7 GHz.

6.3.2. The First Experiment on the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset without these two filtering methods, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether for a small or a large data sample, the classification effect without filtering is lower than with filtering.

In addition, using a single filtering method is not as good as using the combination of the two. For example, on the 160,000 training packets, when no filter method is used, the recognition accuracy for abnormal traffic is only 0.94; when only the Pearson index is used for filtering, the accuracy of the model is 0.95; when the Gini index is used for filtering, the accuracy is 0.97; and when the combination of the Pearson and Gini indexes is used, the accuracy reaches 0.99.

6.3.3. The Second Experiment on the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter and then uses the ML-ESN training

Figure 10: The Gini values for UNSW_NB15 (a heat map over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit; color scale 0.0 to 1.0).

Table 4: The parameters of the ML-ESN experiment.

Parameters                          Values
Input dimension number              5
Output dimension number             10
Reservoir number                    3
Reservoir neurons number            1000
Reservoir activation function       Tanh
Output layer activation function    Sigmoid
Update rate                         0.9
Random seed                         50
Regularization rate                 1.0 × 10−6


algorithm to learn, then uses the test data to verify the trained model, and obtains the test results for the different types of attacks. The classification results for the nine types of abnormal attacks are shown in Figure 12.

The detection results in Figure 12 show that it is entirely feasible to use the ML-ESN network learning model, combined with Pearson and Gini coefficient feature filtering, to quickly classify anomalous network traffic attacks.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, with an FPR of only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.

6.3.4. The Third Experiment on the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the model training time also increases accordingly; for example, with 1000 neurons, the time consumption at a reservoir depth of 5 is 211 ms, while at a depth of 3 it is only 116 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, as the depth of the model reservoir increases, the training accuracy of the model at first gradually increases; for example, at a reservoir depth of 3 with 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons, it is only 0.93. But when the depth is increased to 5, the training accuracy of the model falls to 0.95.

The main reason for this phenomenon is that, at the beginning, as the training depth increases, the parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that the proposed method, after model self-learning, has good detection ability for different attack types.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification effect of different filtering methods (accuracy from 0.4 to 1.0 over data sizes from 20,000 to 160,000 packets for four settings: None, Pearson, Gini, and Pearson + Gini).


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment on the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 50; in particular, the values of feature A hardly exceed 60. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper focuses on comparative simulation experiments against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].

This simulation experiment uses five test datasets of different scales, namely, 5,000, 20,000, 60,000, 120,000, and 160,000 records, and each dataset contains 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.
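The baseline runs can be reproduced in outline with scikit-learn's stock implementations of the four methods; the synthetic data below merely stands in for the UNSW_NB15 feature matrix, so the scores are illustrative only.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Stand-in data: 1000 samples with 5 features and a simple 2-class label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "GaussianNB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
}
# Fit each baseline and record its test accuracy.
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
print(scores)
```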

From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm has accuracy rates below 50%, and the other algorithms are very close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy rapidly improves. For example, on the 120,000-record dataset, the accuracy of the algorithm reaches 96.75%, and on the 160,000-record dataset, it reaches 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be optimal.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) graphs to evaluate performance. A ROC graph plots the FPR (false-positive rate) on the horizontal axis and the TPR

Figure 12: Classification results of the ML-ESN method (detection rate, F1-score, and FPR for the attack types Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms; accuracy and F1-score values lie between 0.94 and 1.0, and FPR values between 0.01 and 0.02).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) at reservoir depths 2 to 5 with 500, 1000, and 2000 neurons; (b) accuracy under the same settings (ranging from about 0.91 to 0.96); (c) accuracy and time (s) of BP, ESN, DecisionTree, and ML-ESN.

Figure 14: Distribution map of the first two statistical characteristics (the distribution of feature A and feature B over 0 to 160,000 packages).


(true-positive rate) on the vertical axis. Generally speaking, a ROC graph is judged by the AUC (area under the ROC curve): the larger the AUC value, the better the model performance.
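AUC can be computed directly from the rank-sum (Mann-Whitney) identity without tracing the curve itself; a small self-contained sketch:

```python
def roc_auc(scores, labels):
    """AUC via the rank-sum identity: the probability that a randomly
    chosen positive sample is scored higher than a random negative."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1                       # group tied scores together
        avg_rank = (i + j + 1) / 2.0     # 1-based average rank of the tie group
        rank_sum += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# One of two positives is out-ranked by a negative, so AUC is 0.75.
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```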

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.

From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the

Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under data sizes from 20,000 to 160,000; accuracy axis from 0.4 to 1.0.

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99, Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81, Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00, Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96, Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95).


other attack types are 99%. In the single-layer ESN algorithm, however, the best detection success rate is only 97%, and the typical detection success rate is 94%. In the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, identifies some problems in AMI network security, and introduces the contributions of existing researchers in this area.

Secondly, in order to solve the problems of low accuracy and high false-positive rate on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform multi-collection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example, through parallel training, greatly reducing the learning and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model that is more in line with actual AMI applications, so as to apply it to industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, Jeju Island, Korea, pp. 96–111, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, Dresden, Germany, pp. 350–355, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.

Mathematical Problems in Engineering 21

Page 12: Network Traffic Anomaly Detection Based on ML-ESN for ...downloads.hindawi.com/journals/mpe/2020/7219659.pdfe current research on AMI network security threats mainly analyzes whether

In the UNSW_NB15 data test, five important features were selected according to the Pearson and Gini coefficients.
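A two-stage filter of this kind can be sketched as follows. The sketch is illustrative only: the column names are synthetic, and binning continuous features with quantiles for the Gini computation is an assumption, not the paper's exact procedure.

```python
import numpy as np
import pandas as pd

def pearson_scores(df: pd.DataFrame, label: str) -> pd.Series:
    """Absolute Pearson correlation of each numeric feature with the label."""
    numeric = df.drop(columns=[label]).select_dtypes("number")
    return numeric.corrwith(df[label]).abs().sort_values(ascending=False)

def gini_impurity(labels: pd.Series) -> float:
    """Gini impurity 1 - sum(p_k^2) of a label distribution."""
    p = labels.value_counts(normalize=True)
    return 1.0 - float((p ** 2).sum())

def gini_gain(df: pd.DataFrame, feature: str, label: str, bins: int = 10) -> float:
    """Impurity reduction when splitting the label by a quantile-binned feature."""
    parent = gini_impurity(df[label])
    binned = pd.qcut(df[feature], q=bins, duplicates="drop")
    child = df.groupby(binned)[label].apply(gini_impurity)
    weights = df.groupby(binned).size() / len(df)
    return parent - float((weights * child).sum())

# Synthetic flow records: rank features by both criteria, keep the top 5.
rng = np.random.default_rng(0)
df = pd.DataFrame({f"f{i}": rng.normal(size=500) for i in range(8)})
df["label"] = (df["f0"] + 0.5 * df["f3"] > 0).astype(int)  # f0 dominates
pearson = pearson_scores(df, "label")
gini = pd.Series({c: gini_gain(df, c, "label") for c in df.columns if c != "label"})
top5 = (pearson.rank() + gini.rank()).nlargest(5).index.tolist()
```

Combining the two rankings (here by summing ranks) keeps features that both correlate linearly with the label and separate the classes well when thresholded.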

The number of output neurons is set to 10; these 10 outputs correspond to the 9 abnormal attack types and 1 normal type, respectively.
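Such a 10-way target is naturally encoded as one-hot vectors, one column per class. A small illustration with pandas, using the nine UNSW-NB15 attack categories named later in the paper plus "Normal":

```python
import pandas as pd

# The 9 UNSW-NB15 attack categories plus "Normal": 10 classes in total.
classes = ["Normal", "Generic", "Exploits", "Fuzzers", "DoS", "Reconnaissance",
           "Analysis", "Backdoor", "Shellcode", "Worms"]
labels = pd.Series(["Normal", "DoS", "Worms"],
                   dtype=pd.CategoricalDtype(classes))
targets = pd.get_dummies(labels)   # one row per sample, one column per class
```

Declaring the categorical dtype up front guarantees all 10 columns appear even when a batch does not contain every class.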

Generally speaking, for the same dataset, the model training time gradually increases as the number of reservoirs increases, but the detection accuracy does not increase monotonically: it first rises and then falls. Therefore, after comprehensive consideration, the number of reservoirs is initially set to 3.

The basic idea of ML-ESN is that the reservoirs generate a complex dynamic state space that changes with the input. When this state space is sufficiently complex, the required output can be obtained as a linear combination of the internal states. In order to increase the complexity of the state space, this article sets the number of neurons in each reservoir to 1000.
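The multilayer reservoir idea can be sketched in a few dozen lines of NumPy. This is an illustrative reimplementation, not the authors' code: it stacks leaky-integrator tanh reservoirs (the update rate 0.9, random seed 50, and regularization 1.0 × 10⁻⁶ echo Table 4) and trains only a linear ridge readout, the standard ESN training that avoids backpropagation; the paper's sigmoid output layer is replaced by this linear readout for simplicity.

```python
import numpy as np

class MLESN:
    """Minimal multi-layer echo state network: stacked random reservoirs,
    leaky-integrator updates, and a single ridge-regression readout."""

    def __init__(self, n_in, n_res=1000, n_layers=3, leak=0.9,
                 reg=1e-6, seed=50):
        rng = np.random.default_rng(seed)
        self.leak, self.reg = leak, reg
        self.w_in, self.w_res = [], []
        size_in = n_in
        for _ in range(n_layers):
            self.w_in.append(rng.uniform(-0.5, 0.5, (n_res, size_in)))
            w = rng.uniform(-0.5, 0.5, (n_res, n_res))
            w *= 0.9 / max(abs(np.linalg.eigvals(w)))   # spectral radius < 1
            self.w_res.append(w)
            size_in = n_res                              # layer k feeds layer k+1
        self.w_out = None

    def _states(self, X):
        states = []
        for u in X:                                      # one record at a time
            x, layer_out = u, []
            for wi, wr, s in zip(self.w_in, self.w_res, self._s):
                s[:] = (1 - self.leak) * s + self.leak * np.tanh(wi @ x + wr @ s)
                x = s.copy()
                layer_out.append(x)
            states.append(np.concatenate(layer_out))     # all layers' states
        return np.array(states)

    def fit(self, X, Y):
        self._s = [np.zeros(w.shape[0]) for w in self.w_res]
        S = self._states(X)
        # Ridge readout: W_out = Y^T S (S^T S + reg I)^-1
        A = S.T @ S + self.reg * np.eye(S.shape[1])
        self.w_out = np.linalg.solve(A, S.T @ Y).T
        return self

    def predict(self, X):
        self._s = [np.zeros(w.shape[0]) for w in self.w_res]
        return self._states(X) @ self.w_out.T

# Tiny demo (a real run would use n_res=1000, n_layers=3 as in Table 4):
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Y = np.eye(10)[rng.integers(0, 10, 20)]   # one-hot: 9 attacks + normal
probs = MLESN(n_in=5, n_res=30, n_layers=3).fit(X, Y).predict(X)
```

Only the readout matrix is learned; the random input and recurrent weights are fixed, which is what makes ESN training a single linear solve.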

In Table 4, the tanh activation function is used in the reservoir layer because its output ranges between −1 and 1 with a mean of 0, which is conducive to training efficiency. Second, when the input features differ significantly, tanh produces clearly separated responses, which improves detection. In addition, the neuron fitting process in the ML-ESN reservoir continuously amplifies this feature effect.

The output layer uses the sigmoid activation function because its output lies between 0 and 1, which directly reflects the probability of a given attack type.

In Table 4, the last three parameters are important for tuning the ML-ESN model. Their values are set to 0.9, 50, and 1.0 × 10⁻⁶, respectively, based on relatively optimized values obtained through multiple experiments.

6.3.1. Experimental Data Preparation and Experimental Environment. During the experiment, the entire dataset was divided into two parts: the training dataset and the test dataset.

The training dataset contains 175,320 data packets, and the ratio of normal to abnormal (attack) packets is 0.46 : 1.

The test dataset contains 82,311 data packets, and the ratio of normal to abnormal packets is 0.45 : 1.

[Figure 9: The Pearson coefficient values for UNSW_NB15. Heatmap of pairwise Pearson coefficients (range −1 to 1) over the features spkts, state, service, sload, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, djit, stcpb, ct_srv_src, and ct_dst_ltm.]

12 Mathematical Problems in Engineering

The experimental environment is Windows 10 Home 64-bit, Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset with no filtering, with each single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it: whether the data sample is small or large, the classification effect without filtering is lower than that with filtering.

In addition, a single filtering method is not as good as the combination of the two. For example, on the 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 with no filtering, 0.95 with Pearson filtering only, and 0.97 with Gini filtering only, while it reaches 0.99 with the combination of the Pearson and Gini indexes.

6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indexes to filter and then uses the ML-ESN training

[Figure 10: The Gini values for UNSW_NB15, computed over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit.]

Table 4: The parameters of the ML-ESN experiment.

Parameters                          Values
Input dimension number              5
Output dimension number             10
Reservoir number                    3
Reservoir neurons number            1000
Reservoir activation function       Tanh
Output layer activation function    Sigmoid
Update rate                         0.9
Random seed                         50
Regularization rate                 1.0 × 10⁻⁶


algorithm to learn, and finally uses the test data to verify the trained model and obtain the test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

The detection results in Figure 12 show that it is entirely feasible to use the ML-ESN learning model, with feature filtering optimized by the combination of the Pearson and Gini coefficients, to quickly classify anomalous network traffic attacks.

The accuracy, F1-score, and FPR results are very good for all nine attack types. For example, for the Generic attack type, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; for the Shellcode and Worms attack types, both the accuracy and the F1-score reach 0.99, with an FPR of only 0.02. Overall, the detection rate exceeds 0.94 and the F1-score exceeds 0.96 for all nine attack types.

6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) in the same setting, with the results shown in Figure 13(c).

As can be seen from Figure 13(a), for the same dataset and the same number of neurons, the model training time increases as the depth of the model reservoir increases; for example, with 1000 neurons, a reservoir depth of 5 takes 2.11 ms, while a depth of 3 takes only 1.16 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), for the same dataset and the same number of neurons, the training accuracy of the model first increases gradually as the depth of the model reservoir grows; for example, with 1000 neurons, the detection accuracy is 0.96 at depth 3 but only 0.93 at depth 2. However, when the depth is increased to 5, the training accuracy of the model drops to 0.95.

The main reason for this phenomenon is that, at the beginning, the training parameters of the model are gradually optimized as the depth increases, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain degree of overfitting appears in the model, which reduces the accuracy.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after model self-learning, the proposed method detects different attack types well.

In order to fully verify the correctness of the proposed method, this paper further tests the detection

[Figure 11: Classification effect of the different filtering methods (none, Pearson, Gini, Pearson + Gini), plotting accuracy (0.4 to 1.0) against dataset sizes from 20,000 to 160,000 packets.]


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 50; in particular, the values of feature A hardly exceed 60. In addition, a small part of the values of feature B are concentrated between 5 and 10, and only a few exceed 10.

Secondly, this paper compares the method against traditional machine learning methods in simulation experiments on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
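A comparison of this kind can be set up with scikit-learn's stock implementations. The following sketch uses synthetic stand-in data rather than UNSW-NB15 (loading the real CSVs is assumed to happen elsewhere), so the scores it produces are not the paper's results:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the filtered five-feature flow data.
X, y = make_classification(n_samples=2000, n_features=5, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

baselines = {
    "GaussianNB": GaussianNB(),
    "KNeighbors": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLPClassifier": MLPClassifier(max_iter=500, random_state=0),
}
# Fit each baseline and record its held-out accuracy.
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in baselines.items()}
```

Keeping the classifiers behind a common dict makes it easy to rerun the same comparison at each dataset size, as the next experiment does.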

This simulation experiment uses five test datasets of different scales, containing 5000, 20,000, 60,000, 120,000, and 160,000 records, respectively, and each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on the small-sample test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of the traditional machine learning algorithms drops significantly; in particular, the accuracy of the GaussianNB algorithm falls below 50%, while the other algorithms are close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset, the accuracy of the algorithm reaches 96.75%, and on the 160,000-record dataset, it reaches 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find its optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiments use ROC (receiver operating characteristic) graphs to evaluate performance; a ROC curve plots FPR (false-positive rate) on the horizontal axis and TPR

[Figure 12: Classification results of the ML-ESN method, showing the accuracy, F1-score, and FPR (detection rate, 0.0 to 1.0) for the nine attack types: Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms.]


[Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) for depths 2 to 5 with 500, 1000, and 2000 neurons; (b) accuracy for the same depths and neuron counts; (c) accuracy and time (s) of BP, DecisionTree, ESN, and ML-ESN.]

[Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B) over the number of packets.]


(true-positive rate) on the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performs.
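Per-class ROC curves and AUC values of this kind are commonly produced by binarizing the multiclass labels. A minimal sketch with scikit-learn, using random stand-in scores rather than the paper's classifier outputs:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(0)
classes = np.arange(3)                    # e.g. 3 of the attack types
y_true = rng.integers(0, 3, size=300)     # stand-in ground-truth labels
y_score = rng.random((300, 3))            # stand-in class probabilities
y_bin = label_binarize(y_true, classes=classes)

aucs = {}
for k in classes:
    # FPR on the horizontal axis, TPR on the vertical axis.
    fpr, tpr, _ = roc_curve(y_bin[:, k], y_score[:, k])
    aucs[k] = auc(fpr, tpr)               # area under the ROC curve
```

With real classifier scores, each `(fpr, tpr)` pair would be plotted to reproduce one curve per attack type, as in Figures 16 to 19.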

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16 to 19, respectively.

From the experimental results in Figures 16 to 19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the

[Figure 15: Detection results of the different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under data sizes from 20,000 to 160,000.]

[Figure 16: Classification ROC diagram of the single-layer ESN algorithm. AUC values: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99.]

[Figure 18: Classification ROC diagram of the DecisionTree algorithm. AUC values: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81.]

[Figure 19: Classification ROC diagram of our ML-ESN algorithm. AUC values: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00.]

[Figure 17: Classification ROC diagram of the BP algorithm. AUC values: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96.]

other attack types are 99%. In contrast, for the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical rate is 94%. For the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm performs worst: its detection success rate is generally below 80%, and its false-positive rate approaches 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, identifies several problems in AMI network security, and reviews the contributions of existing researchers in this field.

Secondly, in order to address the low accuracy and high false-positive rate of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform with information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform fusion processing across multiple collection devices to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to address abnormal-attack feature extraction, analysis, and accurate detection; (3) further improvement of the model's learning ability, for example through parallel training, greatly reducing learning and classification time; and (4) study of the special AMI network protocols and establishment of an optimized ML-ESN network traffic deep learning model better suited to actual AMI applications, so that it can be applied in industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining K-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] "Data standardization," Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference, IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.

Mathematical Problems in Engineering 21

Page 13: Network Traffic Anomaly Detection Based on ML-ESN for ...downloads.hindawi.com/journals/mpe/2020/7219659.pdfe current research on AMI network security threats mainly analyzes whether

The experimental environment is a Windows 10 Home 64-bit operating system with Anaconda3 (64-bit), Python 3.7, 8.0 GB of memory, and an Intel(R) Core i3-4005U CPU @ 1.7 GHz.

6.3.2. The First Experiment in the Simulation Data. In order to fully verify the impact of the Pearson and Gini coefficients on the classification algorithm, we ran the method on the training dataset under three conditions: without either filtering method, with a single filtering method, and with the combination of the two. The experimental results are shown in Figure 11.

From the experimental results in Figure 11, using the filtering technology is generally better than not using it. Whether the data sample is small or large, the classification effect without filtering is lower than with filtering.

In addition, a single filtering method is not as good as the combination of the two. For example, with 160,000 training packets, the recognition accuracy for abnormal traffic is only 0.94 when no filtering is used; 0.95 when only the Pearson index is used; 0.97 when only the Gini index is used; and 0.99 when the combination of the Pearson and Gini indices is used.
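The two-stage filter described above can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the column names, thresholds, and the use of a random-forest Gini importance as the "Gini index" stage are all assumptions for the sake of a runnable example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def select_features(X, y, pearson_min=0.05, gini_min=0.01):
    """Two-stage filter: Pearson correlation with the label, then Gini importance."""
    # Stage 1: drop features whose Pearson correlation with the label is weak.
    keep = [c for c in X.columns if abs(X[c].corr(y)) >= pearson_min]
    # Stage 2: rank the survivors by Gini importance from a tree ensemble.
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[keep], y)
    return [c for c, imp in zip(keep, forest.feature_importances_) if imp >= gini_min]

# Demo on synthetic data: one informative feature, one noise feature.
rng = np.random.default_rng(0)
y = pd.Series(rng.integers(0, 2, 300))
X = pd.DataFrame({"good": y + 0.1 * rng.normal(size=300),
                  "junk": rng.normal(size=300)})
selected = select_features(X, y)
```

The informative column survives both stages, while a pure-noise column is typically eliminated at one of the two thresholds.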

6.3.3. The Second Experiment in the Simulation Data. Because the UNSW_NB15 dataset contains nine different types of abnormal attacks, the experiment first uses the Pearson and Gini indices to filter the features and then uses the ML-ESN training

Figure 10: The Gini value for UNSW_NB15 (heatmap over the features service, sload, dload, spkts, dpkts, rate, dbytes, sinpkt, sloss, tcprtt, ackdat, sjit, ct_srv_src, dtcpb, and djit).

Table 4: The parameters of the ML-ESN experiment.
Input dimension number: 5
Output dimension number: 10
Reservoir number: 3
Reservoir neurons number: 1000
Reservoir activation fn: Tanh
Output layer activation fn: Sigmoid
Update rate: 0.9
Random seed: 50
Regularization rate: 1.0 × 10^-6

Mathematical Problems in Engineering 13

algorithm to learn, and finally uses the test data to verify the trained model, obtaining test results for the different attack types. The classification results for the nine types of abnormal attacks are shown in Figure 12.

From the detection results in Figure 12, it is completely feasible to use the ML-ESN learning model to quickly classify anomalous network traffic attacks when the combination of Pearson and Gini coefficients is used for network traffic feature filtering and optimization.

The detection results for accuracy, F1-score, and FPR are very good across all nine attack types. For example, in Generic attack detection, the accuracy is 0.98, the F1-score is also 0.98, and the FPR is very low, only 0.02; in Shellcode and Worms attack detection, both the accuracy and F1-score reach 0.99, and the FPR is only 0.02. In addition, the detection rate for all nine attack types exceeds 0.94, and the F1-score exceeds 0.96.
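The per-attack-type metrics quoted above (accuracy, F1-score, FPR) can each be computed one-vs-rest from the confusion counts of that class; a small sketch:

```python
import numpy as np

def per_class_metrics(y_true, y_pred, label):
    """One-vs-rest accuracy, F1-score, and false-positive rate for one class."""
    t = np.asarray(y_true) == label   # actual membership in the class
    p = np.asarray(y_pred) == label   # predicted membership
    tp = np.sum(t & p); fp = np.sum(~t & p)
    fn = np.sum(t & ~p); tn = np.sum(~t & ~p)
    acc = (tp + tn) / len(t)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return acc, f1, fpr
```

For example, `per_class_metrics(["Generic", "DoS", "Generic", "Worms"], ["Generic", "Generic", "Generic", "Worms"], "Generic")` returns (0.75, 0.8, 0.5).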

6.3.4. The Third Experiment in the Simulation Data. In order to fully verify the detection time efficiency and accuracy of the ML-ESN network model, this paper completed three comparative experiments: (1) measuring the time consumption at different reservoir depths (2, 3, 4, and 5) and different numbers of neurons (500, 1000, and 2000), with the results shown in Figure 13(a); (2) measuring the detection accuracy at the same reservoir depths and neuron counts, with the results shown in Figure 13(b); and (3) comparing the time consumption and accuracy of three other algorithms (BP, DecisionTree, and single-layer ESN) under the same conditions, with the results shown in Figure 13(c).

As can be seen from Figure 13(a), with the same dataset and the same number of neurons, the model training time increases as the depth of the model reservoir increases; for example, with 1000 neurons, a reservoir depth of 5 takes 21.1 ms, while a depth of 3 takes only 11.6 ms. In addition, at the same reservoir depth, the more neurons in the model, the more training time the model consumes.

As can be seen from Figure 13(b), with the same dataset and the same number of neurons, the training accuracy of the model gradually increases at first as the depth of the model reservoir increases; for example, with a reservoir depth of 3 and 1000 neurons, the detection accuracy is 0.96, while at a depth of 2 with 1000 neurons it is only 0.93. However, when the depth is increased to 5, the training accuracy drops to 0.95.

The main reason for this phenomenon is that, at the beginning, the training parameters of the model are gradually optimized as the training depth increases, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain degree of overfitting occurs, which causes the accuracy to decrease.

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that, after self-learning, the proposed method detects the different attack types well.

In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification effect of different filtering methods (accuracy vs. number of packets, 20,000–160,000, for no filtering, Pearson, Gini, and Pearson + Gini).


performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most values of feature A and feature B are concentrated around 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B lie between 5 and 10, and only a few exceed 10.

Secondly, this paper compares the proposed method against traditional machine learning methods in simulation experiments on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
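A baseline comparison of this kind can be sketched with the scikit-learn implementations of the cited methods. The dataset here is a synthetic stand-in generated with `make_classification`, not UNSW_NB15, and the hyperparameters are library defaults rather than the paper's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic multi-class traffic stand-in: 15 features, 4 classes.
X, y = make_classification(n_samples=2000, n_features=15, n_informative=8,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

baselines = {
    "GaussianNB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
}
# Fit each baseline and record its held-out accuracy.
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in baselines.items()}
```

Running the same loop over test sets of increasing size is how the scale-dependent comparison in Figure 15 would be produced.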

This simulation experiment uses five test datasets of different scales, containing 5,000, 20,000, 60,000, 120,000, and 160,000 records, respectively; each dataset contains the 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that, on the small test datasets, the detection accuracy of the traditional machine learning methods is relatively high. For example, on the 20,000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large test data, the classification accuracy of the traditional machine learning algorithms drops significantly, especially for the GaussianNB algorithm, whose accuracy falls below 50%, while the other algorithms remain close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample grows to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120,000-record dataset the accuracy reaches 96.75%, and on the 160,000-record dataset it reaches 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point. When the number of samples is small, the algorithm may overfit, and the overall performance is not at its best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) curves to evaluate performance. A ROC curve is a graph with the FPR (false-positive rate) on the horizontal axis and the TPR

Figure 12: Classification results of the ML-ESN method (accuracy, F1-score, and FPR for the attack types Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) at depths 2–5 with 500, 1000, and 2000 neurons; (b) accuracy at the same settings; (c) accuracy and time comparison with BP, ESN, and DecisionTree.

Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B vs. number of packets).


(true-positive rate) on the vertical axis. Generally speaking, a ROC chart is judged by its AUC (area under the ROC curve): the larger the AUC value, the better the model performance.
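The per-class ROC/AUC evaluation described above can be sketched one-vs-rest with scikit-learn. The labels and scores below are synthetic stand-ins (three hypothetical attack classes), not the paper's data.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
y_true = rng.integers(0, 3, 500)          # three stand-in attack classes
scores = rng.random((500, 3))             # per-class decision scores
scores[np.arange(500), y_true] += 0.5     # give the true class a higher score

# One-vs-rest ROC: treat class k as positive, everything else as negative.
auc_per_class = {}
for k in range(3):
    fpr, tpr, _ = roc_curve(y_true == k, scores[:, k])
    auc_per_class[k] = auc(fpr, tpr)
```

Plotting each `(fpr, tpr)` pair with its AUC in the legend yields exactly the style of diagram shown in Figures 16–19.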

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.

From the experimental results in Figures 16–19, it can be seen that, for the classification and detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the

Figure 15: Detection results of different classification methods under different data sizes (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN; 20,000–160,000 records).

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, DoS 0.95, Fuzzers 0.93, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, DoS 0.81, Fuzzers 0.71, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, DoS 0.99, Fuzzers 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, DoS 0.97, Fuzzers 0.87, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).


other attack types are 99%. By contrast, with the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, raises open problems in AMI network security, and introduces the contributions of existing researchers in this area.

Secondly, in order to address the low accuracy and high false-positive rate of existing methods on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. The test results show that this method has obvious advantages over the single-layer ESN network, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a degree of multi-collection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) unsupervised ML-ESN AMI network traffic classification research, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model that better fits actual AMI applications, so as to apply it in industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.
[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.
[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.
[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.
[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.
[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.
[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.
[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.
[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.
[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.
[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.
[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.
[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.
[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.
[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.
[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.
[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.
[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.
[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.
[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.
[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.
[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.
[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.
[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.
[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.
[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.
[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.
[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.
[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.
[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.
[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.
[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.
[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.
[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.
[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.
[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.
[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.
[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.
[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.
[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.
[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.
[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.
[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference, San Francisco, CA, USA, 2004.
[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.
[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.
[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.

Page 14: Network Traffic Anomaly Detection Based on ML-ESN for ...downloads.hindawi.com/journals/mpe/2020/7219659.pdfe current research on AMI network security threats mainly analyzes whether

algorithm to learn and then uses test data to verify thetraining model and obtains the test results of different typesof attacks e classification results of the nine types ofabnormal attacks obtained are shown in Figure 12

It can be known from the detection results in Figure 12that it is completely feasible to use the ML-ESN networklearning model to quickly classify anomalous networktraffic attacks based on the combination of Pearson andGini coefficients for network traffic feature filteringoptimization

Because we found that the detection results of accuracyF1-score and FPR are very good in the detection of all nineattack types For example in the Generic attack contactdetection the accuracy value is 098 the F1-score value isalso 098 and the FPR value is very low only 002 in theShellcode and Worms attack type detection both theaccuracy and F1-score values reached 099 e FPR valueis only 002 In addition the detection rate of all nineattack types exceeds 094 and the F1-score value exceeds096

634BeBird Experiment in the Simulation Data In orderto fully verify the detection time efficiency and accuracy ofthe ML-ESN network model this paper completed threecomparative experiments (1) Detecting the time con-sumption at different reservoir depths (2 3 4 and 5) anddifferent numbers of neurons (500 1000 and 2000) theresults are shown in Figure 13(a) (2) detection accuracy atdifferent reservoir depths (2 3 4 and 5) and differentnumber of neurons (500 1000 and 2000) the results areshown in Figure 13(b) and (3) comparing the time con-sumption and accuracy of the other three algorithms (BPDecisionTree and single-layer MSN) in the same case theresults are shown in Figure 13(c)

As can be seen from Figure 13(a) when the same datasetand the same model neuron are used as the depth of the

model reservoir increases the model training time will alsoincrease accordingly for example when the neuron is 1000the time consumption of the reservoir depth of 5 is 211mswhile the time consumption of the reservoir depth of 3 isonly 116 In addition at the same reservoir depth the morethe neurons in the model the more training time the modelconsumes

As can be seen from Figure 13(b) with the same datasetand the same model neurons as the depth of the modelreservoir increases the training accuracy of the model willgradually increase at first for example when the reservoirdepth is 3 and the neuron is 1000 the detection accuracy is096 while the depth is 2 the neuron is 1000 and the de-tection accuracy is only 093 But when the neuron is in-creased to 5 the training accuracy of the model is reduced to095

The main reason for this phenomenon is that, at the beginning, as the training depth increases, the training parameters of the model are gradually optimized, so the training accuracy keeps improving. However, when the depth of the model increases to 5, a certain overfitting phenomenon appears in the model, which leads to the decrease in accuracy.
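The depth-versus-time trade-off discussed above can be reproduced with a minimal stacked-reservoir sketch. This is a simplification of ML-ESN, not the paper's implementation: the reservoir size, leak rate, spectral radius, and weight ranges below are illustrative assumptions.

```python
import time
import numpy as np

def mlesn_states(X, depth=3, n_res=100, leak=0.3, rho=0.9, seed=0):
    """Pass inputs through `depth` stacked leaky-integrator ESN reservoirs
    and return the final layer's state sequence (a readout would be
    trained on these states, e.g. by ridge regression)."""
    rng = np.random.default_rng(seed)
    h = X
    for _ in range(depth):
        n_in = h.shape[1]
        W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))   # input weights
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))     # recurrent weights
        W *= rho / np.max(np.abs(np.linalg.eigvals(W)))  # set spectral radius
        x = np.zeros(n_res)
        states = np.empty((len(h), n_res))
        for t, u in enumerate(h):
            x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
            states[t] = x
        h = states  # this layer's states feed the next reservoir
    return h

X = np.random.default_rng(1).normal(size=(200, 10))
t0 = time.perf_counter()
S = mlesn_states(X, depth=3)
elapsed_ms = (time.perf_counter() - t0) * 1e3  # grows with depth and n_res
```

Timing this function at depths 2 through 5 reproduces the qualitative behavior of Figure 13(a): cost grows with both depth and reservoir size, since each added layer repeats the full state recursion.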

From the results in Figure 13(c), the overall performance of the proposed method is better than that of the other three methods. In terms of time, the decision tree method takes the least, only 0.0013 seconds, and the BP method takes the most, 0.0024 seconds. In terms of detection accuracy, the method in this paper is the highest, reaching 0.96, while the decision tree method reaches only 0.77. These results show that the proposed method has good detection ability for different attack types after model self-learning.

Step 5. In order to fully verify the correctness of the proposed method, this paper further tests the detection

Figure 11: Classification accuracy of different filtering methods (none, Pearson, Gini, and Pearson + Gini) across different data sizes.

14 Mathematical Problems in Engineering

performance on the UNSW_NB15 dataset with a variety of different classifiers.

6.3.5. The Fourth Experiment in the Simulation Data. The experiment first calculated the data distribution after Pearson and Gini coefficient filtering. The distribution of the first two statistical features is shown in Figure 14.

It can be seen from Figure 14 that most of the values of feature A and feature B are concentrated around 5.0; in particular, the values of feature A hardly exceed 6.0. In addition, a small part of the values of feature B is concentrated between 5 and 10, and only a few exceed 10.
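The Pearson-plus-Gini filtering step can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact criteria: the correlation and importance thresholds are invented, and Gini importance is taken from a random forest, one common way to obtain it.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def filter_features(X, y, pearson_thresh=0.9, gini_thresh=0.01):
    """Drop one feature of each highly Pearson-correlated pair, then keep
    only features whose Gini importance exceeds a threshold.
    Both thresholds are illustrative, not the paper's values."""
    corr = X.corr(method="pearson").abs()
    # Upper triangle only, so each correlated pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    drop = [c for c in upper.columns if (upper[c] > pearson_thresh).any()]
    X_red = X.drop(columns=drop)
    rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_red, y)
    keep = X_red.columns[rf.feature_importances_ > gini_thresh]
    return X_red[keep]

# Toy flows: column "b" duplicates "a", and "c" carries the label signal.
rng = np.random.default_rng(0)
a, c = rng.normal(size=100), rng.normal(size=100)
X = pd.DataFrame({"a": a, "b": a, "c": c})
y = (c > 0).astype(int)
X_sel = filter_features(X, y)  # the redundant "b" is filtered out
```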

Secondly, this paper focuses on comparative simulation experiments against traditional machine learning methods on datasets of the same scale. These methods include GaussianNB [44], KNeighborsClassifier (KNN) [45], DecisionTree [46], and MLPClassifier [47].
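These four baselines are available in scikit-learn. The sketch below reproduces the comparison loop; note the synthetic data generator is a stand-in assumption, since the paper's experiments run on the preprocessed UNSW_NB15 flows:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-class data as a placeholder for UNSW_NB15 flow features.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "GaussianNB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=300, random_state=0),
}
scores = {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
```

Running the same loop at increasing dataset sizes yields the kind of accuracy-versus-scale curves shown in Figure 15.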

This simulation experiment focuses on five test datasets of different scales, namely, 5000, 20000, 60000, 120000, and 160000 records, and each dataset contains 9 different types of attack data. After repeated experiments, the detection results of the proposed method are compared with those of the other algorithms, as shown in Figure 15.

From the experimental results in Figure 15, it can be seen that on the small-sample test datasets the detection accuracy of traditional machine learning methods is relatively high. For example, on the 20000-record data, the GaussianNB, KNeighborsClassifier, and DecisionTree algorithms all achieved 100% success rates. However, on the large-volume test data, the classification accuracy of traditional machine learning algorithms drops significantly; in particular, the GaussianNB algorithm has accuracy rates below 50%, and the other algorithms are close to 80%.

On the contrary, the ML-ESN algorithm has a lower accuracy rate on small-sample data: the smaller the number of samples, the lower the accuracy rate. However, when the test sample is increased to a certain size, the algorithm learns the samples repeatedly to find the optimal classification parameters, and its accuracy improves rapidly. For example, on the 120000-record dataset, the accuracy of the algorithm reaches 96.75%, and on the 160000-record dataset it reaches 97.26%.

In the experiment, the reason for the poor classification effect on small samples is that the ML-ESN algorithm generally requires large-capacity data for self-learning to find the optimal balance point of the algorithm. When the number of samples is small, the algorithm may overfit, and the overall performance will not be the best.

In order to further verify the performance of ML-ESN on large-scale AMI network flows, this paper selected the single-layer ESN [34], BP [6], and DecisionTree [46] methods for comparative experiments. The ML-ESN experiment parameters are set as in Table 4. The experiment used ROC (receiver operating characteristic) curves to evaluate performance. A ROC curve is a graph with FPR (false-positive rate) as the horizontal axis and TPR

Figure 12: Classification results (accuracy, F1-score, and FPR) of the ML-ESN method for the nine attack types (Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms).


Figure 13: ML-ESN results at different reservoir depths: (a) detection time (ms) at depths 2–5 with 500, 1000, and 2000 neurons; (b) accuracy at depths 2–5 with 500, 1000, and 2000 neurons; (c) accuracy and time (s) of the BP, ESN, DecisionTree, and ML-ESN methods.

Figure 14: Distribution map of the first two statistical characteristics (feature A and feature B) over the number of packages.


(true-positive rate) as the vertical axis. Generally speaking, a ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
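The per-attack-type ROC/AUC values plotted in Figures 16–19 follow the standard one-vs-rest construction, which can be sketched with scikit-learn (the class names and scores below are toy placeholders):

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

def per_class_auc(y_true, y_score, classes):
    """One-vs-rest ROC/AUC per class: FPR on the horizontal axis,
    TPR on the vertical axis, AUC as the summary score."""
    Y = label_binarize(y_true, classes=classes)  # (n_samples, n_classes)
    aucs = {}
    for i, c in enumerate(classes):
        fpr, tpr, _ = roc_curve(Y[:, i], y_score[:, i])
        aucs[c] = auc(fpr, tpr)
    return aucs

# Toy probability scores for three hypothetical attack classes.
y_true = ["Generic", "DoS", "Worms", "Generic"]
y_score = np.array([[0.90, 0.05, 0.05],
                    [0.10, 0.80, 0.10],
                    [0.05, 0.05, 0.90],
                    [0.70, 0.20, 0.10]])
aucs = per_class_auc(y_true, y_score, ["Generic", "DoS", "Worms"])
```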

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.

From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, with the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for

Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes (20000–160000).

Figure 16: Classification ROC diagram of the single-layer ESN algorithm (AUC: Generic 0.97, Exploits 0.94, Fuzzers 0.93, DoS 0.95, Reconnaissance 0.97, Analysis 0.92, Backdoor 0.95, Shellcode 0.96, Worms 0.99).


Figure 18: Classification ROC diagram of the DecisionTree algorithm (AUC: Generic 0.82, Exploits 0.77, Fuzzers 0.71, DoS 0.81, Reconnaissance 0.78, Analysis 0.80, Backdoor 0.82, Shellcode 0.81, Worms 0.81).

Figure 19: Classification ROC diagram of our ML-ESN algorithm (AUC: Generic 0.97, Exploits 1.00, Fuzzers 0.99, DoS 0.99, Reconnaissance 1.00, Analysis 0.99, Backdoor 0.99, Shellcode 1.00, Worms 1.00).

Figure 17: Classification ROC diagram of the BP algorithm (AUC: Generic 0.99, Exploits 0.96, Fuzzers 0.87, DoS 0.97, Reconnaissance 0.95, Analysis 0.95, Backdoor 0.97, Shellcode 0.96, Worms 0.96).


the other attack types are 99%. With the single-layer ESN algorithm, the best detection success rate is only 97%, and the typical detection success rate is 94%. With the BP algorithm, the detection rate for the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect: its detection success rate is generally less than 80%, and its false-positive rate is close to 35%.

7. Conclusion

This article first analyzes the current situation of AMI network security research at home and abroad, raises some open problems in AMI network security, and introduces the contributions of existing researchers in this area.

Secondly, in order to solve the problems of low accuracy and high false-positive rates on large-capacity network traffic data in existing methods, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) using the combination of Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly saves model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.

Of course, there are still some issues that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain degree of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main points of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out unsupervised ML-ESN AMI network traffic classification research to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, greatly reducing the learning and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model more in line with actual AMI applications, so as to apply it to actual industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project of "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System (no. ZDKJXM20170002)" of China Southern Power Grid Corporation, the project of "Practical Innovation and Enhancement of Entrepreneurial Ability (no. SJCX201970)" for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber-physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, IEEE, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, IEEE, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.



[7] M Stephen H Brett Z Saman and B Robin ldquoAMIDS amulti-sensor energy theft detection framework for advancedmetering infrastructuresrdquo IEEE Journal on Selected Areas inCommunications vol 31 no 7 pp 1319ndash1330 2013

[8] Y Chen J Tao Q Zhang et al ldquoSaliency detection via im-proved hierarchical principle component analysis methodrdquoWireless Communications and Mobile Computing vol 2020Article ID 8822777 12 pages 2020

Mathematical Problems in Engineering 19

[9] Y Mo H J Kim K Brancik et al ldquoCyberndashphysical security ofa smart grid infrastructurerdquo Proceedings of the IEEE vol 100no 1 pp 195ndash209 2012

[10] e AMI network engineering task Force (AMI-SEC) rdquo 2020httposgugucaiugorgutilisecamisecdefaultaspx

[11] Y Park D M Nicol H Zhu et al ldquoPrevention of malwarepropagation in AMIrdquo in Proceedings of the IEEE InternationalConference on Smart Grid Communications pp 474ndash479Vancouver Canada 2013

[12] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016

[13] Q R Zhang M Zhang T H Chen et al ldquoElectricity theftdetection using generative modelsrdquo in Proceedings of the 2018IEEE 30th International Conference on Tools with ArtificialIntelligence (ICTAI) Volos Greece 2018

[14] N Y Jiang ldquoAnomaly intrusion detection method based onAMIrdquo MS thesis Southeast University Dhaka Bangladesh2018 in Chinese

[15] S Neetesh J C Bong and G Santiago ldquoSecure and privacy-preserving concentration of metering data in AMI networksrdquoin Proceedings of the 2017 IEEE International Conference onCommunications (ICC) Paris France 2017

[16] C Euijin P Younghee and S Huzefa ldquoIdentifying maliciousmetering data in advanced metering infrastructurerdquo in Pro-ceedings of the 2014 IEEE 8th International Symposium onService Oriented System Engineering pp 490ndash495 OxfordUK 2014

[17] P Yi T Zhu Q Q Zhang YWu and J H Li ldquoPuppet attacka denial of service attack in advanced metering infrastructurenetworkrdquo Journal of Network amp Computer Applicationsvol 59 pp 1029ndash1034 2014

[18] A Satin and P Bernardi ldquoImpact of distributed denial-of-service attack on advanced metering infrastructurerdquo WirelessPersonal Communications vol 83 no 3 pp 1ndash15 2015

[19] C Y Li X P Wang M Tian and X D Feng ldquoAMI researchon abnormal power consumption detection in the environ-mentrdquo Computer Simulation vol 35 no 8 pp 66ndash70 2018

[20] A A A Fadwa and A Zeyar ldquoReal-time anomaly-baseddistributed intrusion detection systems for advancedmeteringinfrastructure utilizing stream data miningrdquo in Proceedings ofthe 2015 International Conference on Smart Grid and CleanEnergy Technologies pp 148ndash153 Chengdu China 2015

[21] M A Faisal and E T Aigng ldquoSecuring advanced meteringinfrastructure using intrusion detection system with datastream miningrdquo in Proceedings of the Pacific Asia Conferenceon Intelligence and Security Informatics IEEE Jeju IslandKorea pp 96ndash111 2016

[22] K Song P Kim S Rajasekaran and V Tyagi ldquoArtificialimmune system (AIS) based intrusion detection system (IDS)for smart grid advanced metering infrastructure (AMI) net-worksrdquo 2018 httpsvtechworkslibvteduhandle1091983203

[23] A Saad and N Sisworahardjo ldquoData analytics-based anomalydetection in smart distribution networkrdquo in Proceedings of the2017 International Conference on High Voltage Engineeringand Power Systems (ICHVEPS) IEEE Bali IndonesiaIEEEBali Indonesia 2017

[24] R Berthier W H Sanders and H Khurana ldquoIntrusiondetection for advanced metering infrastructures require-ments and architectural directionsrdquo in Proceedings of the IEEEInternational Conference on Smart Grid CommunicationsIEEE Dresden Germany pp 350ndash355 2017

[25] V B Krishna G A Weaver and W H Sanders ldquoPCA-basedmethod for detecting integrity attacks on advanced meteringinfrastructurerdquo in Proceedings of the 2015 InternationalConference on Quantitative Evaluation of Systems pp 70ndash85Madrid Spain 2015

[26] G Fernandes J J P C Rodrigues L F Carvalho J F Al-Muhtadi and M L Proenccedila ldquoA comprehensive survey onnetwork anomaly detectionrdquo Telecommunication Systemsvol 70 no 3 pp 447ndash489 2019

[27] W Wang Y Sheng J Wang et al ldquoHAST-IDS learninghierarchical spatial-temporal features using deep neuralnetworks to improve intrusion detectionrdquo IEEE Access vol 6pp 1792ndash1806 2018

[28] N Gao L Gao Y He et al ldquoA lightweight intrusion detectionmodel based on autoencoder network with feature reductionrdquoActa Electronica Sinica vol 45 no 3 pp 730ndash739 2017 inChinese

[29] M Yousefi-Azar V Varadharajan L Hamey andU Tupalula ldquoAutoencoder-based feature learning for cybersecurity applicationsrdquo in Proceedings of the 2017 InternationalJoint Conference on Neural Networks (IJCNN) IEEE NeuralNetworks pp 3854ndash3861 Anchorage AK USA 2017

[30] Y Wang H Zhou H Feng et al ldquoNetwork traffic classifi-cation method basing on CNNrdquo Journal on Communicationsvol 39 no 1 pp 14ndash23 2018 in Chinese

[31] S Kaur and M Singh ldquoHybrid intrusion detection and sig-nature generation using deep recurrent neural networksrdquoNeural Computing and Applications vol 32 no 12pp 7859ndash7877 2019

[32] H Jaeger M Lukosevicius D Popovici and U SiewertldquoOptimization and applications of echo state networks withleaky- integrator neuronsrdquo Neural Networks vol 20 no 3pp 335ndash352 2007

[33] S Saravanakumar and R Dharani ldquoImplementation of echostate network for intrusion detectionrdquo International Journalof Advanced Research in Computer Science Engineering andInformation Technology vol 4 no 2 pp 375ndash385 2015

[34] Y Kalpana S Purushothaman and R Rajeswari ldquoImple-mentation of echo state neural network and radial basisfunction network for intrusion detectionrdquo Data Mining andKnowledge Engineering vol 5 no 9 pp 366ndash373 2013

[35] X X Liu ldquoResearch on the network security mechanism ofsmart grid AMIrdquo MS thesis National University of DefenseScience and Technology Changsha China 2014 in Chinese

[36] Y Wang ldquoResearch on network behavior analysis and iden-tification technology of malicious coderdquo MS thesis XirsquoanUniversity of Electronic Science and Technology XirsquoanChina 2017 in Chinese

[37] A Moore D Zuev and M Crogan ldquoDiscriminators for use inflow-based classificationrdquo MS thesis Department of Com-puter Science Queen Mary and Westfield College LondonUK 2005

[38] Data standardization Baidu Encyclopediardquo 2020 httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin

[39] H Li Statistical Learning Methods Tsinghua University PressBeijing China 2018

[40] Z K Malik A Hussain and Q J Wu ldquoMultilayered echostate machine a novel architecture and algorithmrdquo IEEETransactions on Cybernetics vol 47 no 4 pp 946ndash959 2017

[41] C Naima A Boudour and M A Adel ldquoHierarchical bi-level multi-objective evolution of single- and multi-layerecho state network autoencoders for data representationrdquo

20 Mathematical Problems in Engineering

2020 httpsarxivorgftparxivpapers1806180601016pdf

[42] M Nour and S Jill ldquoUNSW-NB15 a comprehensive data setfor network intrusion detection systemsrdquo in Proceedings of the2015 Military Communications and Information SystemsConference (MilCIS) pp 1ndash6 Canberra Australia 2015

[43] UNSW-NB15 datasetrdquo 2020 httpswwwunswadfaeduauunsw-canberra-cybercybersecurityADFA-NB15-Datasets

[44] N B Azzouna and F Guillemin ldquoAnalysis of ADSL traffic onan IP backbone linkrdquo in Proceedings of the GLOBECOMrsquo03IEEE Global Telecommunications Conference (IEEE Cat No03CH37489) IEEE San Francisco CA USAIEEE SanFrancisco CA USA 2004

[45] P Cunningham and S J Delany ldquoK-nearest neighbourclassifiersrdquo Multiple Classifier System vol 34 pp 1ndash17 2007

[46] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152 2014

[47] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152

[48] L V Efferen and A M T Ali-Eldin ldquoA multi-layer per-ceptron approach for flow-based anomaly detectionrdquo inProceedings of the 2017 International Symposium on NetworksComputers and Communications (ISNCC) IEEE MarrakechMoroccoIEEE Marrakech Morocco 2017

Mathematical Problems in Engineering 21

Page 16: Network Traffic Anomaly Detection Based on ML-ESN for ...downloads.hindawi.com/journals/mpe/2020/7219659.pdfe current research on AMI network security threats mainly analyzes whether

[Figure 13: ML-ESN results at different reservoir depths (2–5) and reservoir sizes (500, 1000, 2000): (a) detection time (ms); (b) accuracy; (c) accuracy and detection time of BP, DecisionTree, ESN, and ML-ESN.]

[Figure 14: Distribution map of the first two statistical characteristics (features A and B) over the number of packages.]

(true-positive rate) as the vertical axis. Generally speaking, the ROC chart uses the AUC (area under the ROC curve) to judge model performance: the larger the AUC value, the better the model performance.
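The AUC criterion can be made concrete with a small stand-alone sketch (this is illustrative, not the paper's evaluation code). It uses the standard rank-based identity: the AUC equals the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, with ties counting one half.

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC via the Mann-Whitney identity: the probability that a
    randomly chosen positive sample receives a higher score than a
    randomly chosen negative sample (ties count one half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# a detector that ranks every attack flow above every normal flow has AUC 1.0
print(roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # -> 1.0
```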

The ROC graphs of the four algorithms obtained in the experiment are shown in Figures 16–19, respectively.

From the experimental results in Figures 16–19, it can be seen that, for the classification detection of the 9 attack types, the optimized ML-ESN algorithm proposed in this paper is significantly better than the other three algorithms. For example, in the ML-ESN algorithm, the detection success rate for four attack types is 100%, and the detection rates for the other attack types are 99%. However, in the single-layer ESN algorithm, the best detection success rate is only 97%, and the general detection success rate is 94%. In the BP algorithm, the detection rate of the Fuzzers attack type is only 87%, and the false-positive rate exceeds 20%. The traditional DecisionTree algorithm has the worst detection effect, because its detection success rate is generally less than 80% and its false-positive rate is close to 35%.

[Figure 15: Detection results of different classification methods (GaussianNB, KNeighbors, DecisionTree, MLPClassifier, and our ML-ESN) under different data sizes (20,000–160,000 records).]

[Figure 16: Classification ROC diagram of the single-layer ESN algorithm (per-class AUC 0.92–0.99).]

[Figure 17: Classification ROC diagram of the BP algorithm (per-class AUC 0.87–0.99).]

[Figure 18: Classification ROC diagram of the DecisionTree algorithm (per-class AUC 0.71–0.82).]

[Figure 19: Classification ROC diagram of our ML-ESN algorithm (per-class AUC 0.97–1.00).]

7. Conclusion

This article first analyzes the current situation of AMI network security research at home and abroad, raises some open problems in AMI network security, and reviews the contributions of existing researchers in this area.

Secondly, in order to solve the existing methods' problems of low accuracy and high false-positive rates on large-capacity network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
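The core of the proposed approach can be illustrated with a minimal sketch (not the authors' implementation; all sizes, scalings, and the toy data are illustrative assumptions). Each reservoir layer is a fixed random recurrent network, layer k is driven by the state sequence of layer k−1, and the only trained component is a closed-form ridge-regression readout, which is why no backpropagation is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, spectral_radius=0.9):
    """Fixed random weights; the recurrent matrix is rescaled so its
    spectral radius stays below 1 (the echo state property)."""
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    return W_in, W

def run_layer(W_in, W, seq):
    """Drive one reservoir over a sequence; collect the state at each step."""
    x = np.zeros(W.shape[0])
    states = []
    for u in seq:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.array(states)

def ml_esn_states(seq, layer_sizes):
    """Stacked (multi-layer) reservoirs: layer k reads the state
    sequence produced by layer k-1."""
    states = np.asarray(seq, dtype=float)
    for n_res in layer_sizes:
        W_in, W = make_reservoir(states.shape[1], n_res)
        states = run_layer(W_in, W, states)
    return states

def fit_readout(states, targets, ridge=1e-4):
    """Closed-form ridge-regression readout -- no backpropagation."""
    S = states
    return np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ targets)

# toy usage: 3 reservoir layers over random stand-in "flow feature" vectors
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(float)      # toy binary "attack" label
S = ml_esn_states(X, layer_sizes=[40, 40, 40])
W_out = fit_readout(S, y)
y_hat = S @ W_out                    # readout scores for each flow
```

Only `W_out` is learned; the reservoirs stay fixed after initialization, which keeps training to a single linear solve.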

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) combining the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on the simulation dataset. Test results show that this method has obvious advantages over the single-layer ESN network, BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
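Contribution (2) can be sketched as a simple filter-style feature ranking. The exact scoring and weighting scheme of the paper is not reproduced here; the blend weight `alpha` and the use of the statistical Gini coefficient of each feature's value distribution are assumptions made for illustration.

```python
import numpy as np

def pearson_scores(X, y):
    """|Pearson correlation| between every feature column and the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = Xc.T @ yc
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(num / den)

def gini_coefficient(x):
    """Statistical Gini coefficient of a non-negative feature's value
    distribution: mean absolute difference over twice the mean."""
    x = np.sort(np.asarray(x, dtype=float))
    n, s = len(x), x.sum()
    return (2 * np.sum(np.arange(1, n + 1) * x) - (n + 1) * s) / (n * s)

def rank_features(X, y, alpha=0.5):
    """Blend both scores (the weight alpha is an illustrative choice) and
    return feature indices from most to least informative."""
    p = pearson_scores(X, y)
    g = np.array([gini_coefficient(X[:, j]) for j in range(X.shape[1])])
    return np.argsort(-(alpha * p + (1 - alpha) * g))
```

Keeping only the top-ranked columns before training shrinks the reservoir's input dimension, which is the step the authors credit for the reduced training and detection time.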

Of course, there are still some issues in this paper that need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power informatization networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform a certain amount of multicollection-device fusion processing to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) unsupervised ML-ESN AMI network traffic classification research, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example through parallel training, to greatly reduce the learning and classification time; and (4) studying the special AMI network protocols and establishing an optimized ML-ESN network traffic deep learning model that better matches actual AMI applications, so as to apply it to industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for Professional Degree Postgraduates of Changsha University of Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber–physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI Network Engineering Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Dhaka, Bangladesh, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, IEEE, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), IEEE, Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, IEEE, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, IEEE, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), IEEE, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), IEEE, Marrakech, Morocco, 2017.

Page 17: Network Traffic Anomaly Detection Based on ML-ESN for ...downloads.hindawi.com/journals/mpe/2020/7219659.pdfe current research on AMI network security threats mainly analyzes whether

(true-positive rate) as the vertical axis Generally speakingROC chart uses AUC (area under ROC curve) to judge themodel performancee larger the AUC value the better themodel performance

e ROC graphs of the four algorithms obtained in theexperiment are shown in Figures 16ndash19 respectively

From the experimental results in Figures 16ndash19 it can beseen that for the classification detection of 9 attack types theoptimized ML-ESN algorithm proposed in this paper issignificantly better than the other three algorithms Forexample in the ML-ESN algorithm the detection successrate of four attack types is 100 and the detection rates for

20000

40000

60000

80000

120000

100000

140000

160000

10

09

08

07

06

05

04

Data

Accuracy

0

GaussianNBKNeighborsDecisionTree

MLPClassifierOur_MLndashESN

Figure 15 Detection results of different classification methods under different data sizes

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Analysis ROC curve (area = 092)Backdoor ROC curve (area = 095)Shellcode ROC curve (area = 096)Worms ROC curve (area = 099)

Generic ROC curve (area = 097)Exploits ROC curve (area = 094)

DoS ROC curve (area = 095)Fuzzers ROC curve (area = 093)

Reconnaissance ROC curve (area = 097)

Figure 16 Classification ROC diagram of single-layer ESN algorithm

Mathematical Problems in Engineering 17

Analysis ROC curve (area = 080)Backdoor ROC curve (area = 082)Shellcode ROC curve (area = 081)Worms ROC curve (area = 081)

Generic ROC curve (area = 082)Exploits ROC curve (area = 077)

DoS ROC curve (area = 081)Fuzzers ROC curve (area = 071)

Reconnaissance ROC curve (area = 078)

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Figure 18 Classification ROC diagram of DecisionTree algorithm

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Analysis ROC curve (area = 099)Backdoor ROC curve (area = 099)Shellcode ROC curve (area = 100)Worms ROC curve (area = 100)

Generic ROC curve (area = 097)Exploits ROC curve (area = 100)

DoS ROC curve (area = 099)Fuzzers ROC curve (area = 099)

Reconnaissance ROC curve (area = 100)

Figure 19 Classification ROC diagram of our ML-ESN algorithm

Analysis ROC curve (area = 095)Backdoor ROC curve (area = 097)Shellcode ROC curve (area = 096)Worms ROC curve (area = 096)

Generic ROC curve (area = 099)Exploits ROC curve (area = 096)

DoS ROC curve (area = 097)Fuzzers ROC curve (area = 087)

Reconnaissance ROC curve (area = 095)

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Figure 17 Classification ROC diagram of BP algorithm

18 Mathematical Problems in Engineering

other attack types are 99 However in the single-layer ESNalgorithm the best detection success rate is only 97 andthe general detection success rate is 94 In the BP algo-rithm the detection rate of the Fuzzy attack type is only 87and the false-positive rate exceeds 20 In the traditionalDecisionTree algorithm its detection effect is the worstBecause the detection success rate is generally less than 80and the false-positive rate is close to 35

7. Conclusion

This article first analyzes the current state of AMI network security research at home and abroad, highlights open problems in AMI network security, and reviews the contributions of existing researchers in this area.

Second, to address the low accuracy and high false-positive rate of existing methods on large-volume network traffic data, an AMI traffic detection and classification algorithm based on ML-ESN deep learning was proposed.
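The defining property of the ML-ESN approach is that the recurrent reservoirs stay fixed and random, so only a linear readout is trained (in closed form), avoiding back-propagation entirely. The paper's exact architecture and hyperparameters are not reproduced here; the following is only a schematic sketch of that idea with illustrative sizes, weights, and synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

def reservoir(W_in, W, u_seq, leak=0.3):
    """Run one leaky-integrator ESN reservoir over an input sequence.
    The weights W_in and W are never trained."""
    x = np.zeros(W.shape[0])
    states = []
    for u in u_seq:
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.array(states)

def random_reservoir(n_in, n_res, spectral_radius=0.9):
    """Random reservoir scaled so its spectral radius is below 1
    (a standard sufficient condition for the echo state property)."""
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

# Two stacked reservoirs: the states of layer 1 feed layer 2
# (the "multilayer" part of ML-ESN).
W_in1, W1 = random_reservoir(4, 30)
W_in2, W2 = random_reservoir(30, 30)
u = rng.normal(size=(50, 4))      # 50 time steps of 4 toy flow features
h1 = reservoir(W_in1, W1, u)
h2 = reservoir(W_in2, W2, h1)

# Only the linear readout is fit, by ridge regression (closed form,
# no back-propagation).
y = rng.integers(0, 2, size=50)   # toy binary attack labels
H = np.hstack([h1, h2])
W_out = np.linalg.solve(H.T @ H + 1e-2 * np.eye(H.shape[1]), H.T @ y)
print(H.shape, W_out.shape)
```

Because fitting the readout is a single linear solve, training time grows far more slowly with data volume than gradient-based training of a comparable recurrent network, which is consistent with the low time consumption reported for ML-ESN.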

The main contributions of this article are as follows: (1) establishing the AMI network streaming metadata standard; (2) combining the Pearson and Gini coefficients to quickly extract the important features of network attacks from large-scale AMI network streams, which greatly reduces model detection and training time; (3) using ML-ESN's powerful self-learning, storage, and memory capabilities to accurately and quickly classify unknown and abnormal AMI network attacks; and (4) testing and verifying the proposed method on a simulation dataset. The test results show that this method has clear advantages over the single-layer ESN, the BP neural network, and other machine learning methods, with high detection accuracy and low time consumption.
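Contribution (2) ranks flow features by statistical correlation and impurity before training. The authors' exact procedure is not shown here; one plausible minimal sketch of Pearson- and Gini-based feature scoring, using synthetic stand-in data rather than the UNSW-NB15 features, is:

```python
import numpy as np

def pearson_score(x, y):
    """Absolute Pearson correlation between one feature and the label."""
    x = (x - x.mean()) / (x.std() + 1e-12)
    y = (y - y.mean()) / (y.std() + 1e-12)
    return abs(np.mean(x * y))

def gini_gain(x, y):
    """Gini-impurity decrease of the best single-threshold split on x."""
    def gini(labels):
        if len(labels) == 0:
            return 0.0
        p = np.bincount(labels, minlength=2) / len(labels)
        return 1.0 - np.sum(p ** 2)
    base, best = gini(y), 0.0
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        w = len(left) / len(y)
        best = max(best, base - w * gini(left) - (1 - w) * gini(right))
    return best

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)                   # toy attack/normal labels
informative = y + 0.3 * rng.normal(size=200)  # feature correlated with label
noise = rng.normal(size=200)                  # pure-noise feature
for name, f in [("informative", informative), ("noise", noise)]:
    print(name, round(pearson_score(f, y), 2), round(gini_gain(f, y), 2))
```

Features scoring low on both measures can be dropped before ML-ESN training, which is how a filter of this kind shortens detection and training time on large network streams.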

Of course, some issues in this paper still need attention and optimization, for example, how to establish AMI network streaming metadata standards that meet the requirements of different countries and regions. At present, due to the complex structure of AMI and other electric power information networks, it is difficult to form a centralized and unified information collection source, so many enterprises have not yet established a security monitoring platform for information fusion.

Therefore, the authors suggest that, before analyzing the network flow, it is best to perform some fusion processing across multiple collection devices to improve the quality of the data itself, so as to better ensure the accuracy of model training and detection.

The main directions for future work are as follows: (1) long-term, large-scale verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) research on unsupervised ML-ESN AMI network traffic classification, to address the extraction, analysis, and accurate detection of abnormal network attack features; (3) further improvement of the model's learning ability, for example through parallel training, to greatly reduce learning and classification time; and (4) study of AMI-specific network protocols to establish an optimized ML-ESN network traffic deep learning model that better matches actual AMI applications, so that it can be applied in industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for Professional Degree Postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber–physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI network engineering task force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM'03 IEEE Global Telecommunications Conference, San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.

Mathematical Problems in Engineering 21

Page 18: Network Traffic Anomaly Detection Based on ML-ESN for ...downloads.hindawi.com/journals/mpe/2020/7219659.pdfe current research on AMI network security threats mainly analyzes whether

Analysis ROC curve (area = 080)Backdoor ROC curve (area = 082)Shellcode ROC curve (area = 081)Worms ROC curve (area = 081)

Generic ROC curve (area = 082)Exploits ROC curve (area = 077)

DoS ROC curve (area = 081)Fuzzers ROC curve (area = 071)

Reconnaissance ROC curve (area = 078)

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Figure 18 Classification ROC diagram of DecisionTree algorithm

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Analysis ROC curve (area = 099)Backdoor ROC curve (area = 099)Shellcode ROC curve (area = 100)Worms ROC curve (area = 100)

Generic ROC curve (area = 097)Exploits ROC curve (area = 100)

DoS ROC curve (area = 099)Fuzzers ROC curve (area = 099)

Reconnaissance ROC curve (area = 100)

Figure 19 Classification ROC diagram of our ML-ESN algorithm

Analysis ROC curve (area = 095)Backdoor ROC curve (area = 097)Shellcode ROC curve (area = 096)Worms ROC curve (area = 096)

Generic ROC curve (area = 099)Exploits ROC curve (area = 096)

DoS ROC curve (area = 097)Fuzzers ROC curve (area = 087)

Reconnaissance ROC curve (area = 095)

10

10

08

08

06

06

00

00

02

02

04

04

True

-pos

itive

rate

False-positive rate

Figure 17 Classification ROC diagram of BP algorithm

18 Mathematical Problems in Engineering

other attack types are 99 However in the single-layer ESNalgorithm the best detection success rate is only 97 andthe general detection success rate is 94 In the BP algo-rithm the detection rate of the Fuzzy attack type is only 87and the false-positive rate exceeds 20 In the traditionalDecisionTree algorithm its detection effect is the worstBecause the detection success rate is generally less than 80and the false-positive rate is close to 35

7 Conclusion

is article firstly analyzes the current situation of AMInetwork security research at home and abroad elicits someproblems in AMI network security and introduces thecontributions of existing researchers in AMI networksecurity

Secondly in order to solve the problems of low accuracyand high false-positive rate of large-capacity network trafficdata in the existing methods an AMI traffic detection andclassification algorithm based onML-ESN deep learning wasproposed

e main contributions of this article are as follows (1)establishing the AMI network streaming metadata standard(2) the combination of Pearson and Gini coefficients is usedto quickly solve the problem of extracting important featuresof network attacks from large-scale AMI network streamswhich greatly saves model detection and training time (3)using ML-ESNrsquos powerful self-learning and storage andmemory capabilities to accurately and quickly classify un-known and abnormal AMI network attacks and (4) theproposed method was tested and verified in the simulationdataset Test results show that this method has obviousadvantages over single-layer ESN network BP neural net-work and other machine learning methods with high de-tection accuracy and low time consumption

Of course there are still some issues that need attentionand optimization in this paper For example how to establishAMI network streaming metadata standards that meet therequirements of different countries and different regions Atpresent due to the complex structure of AMI and otherelectric power informatization networks it is difficult to forma centralized and unified information collection source somany enterprises have not really established a securitymonitoring platform for information fusion

erefore the author of this article suggests that beforeanalyzing the network flow it is best to perform certainmulticollection device fusion processing to improve thequality of the data itself so as to better ensure the accuracy ofmodel training and detection

e main points of the next work in this paper are asfollows (1) long-term large-scale test verification of theproposed method in the real AMI network flow so as to findout the limitations of the method in the real environment(2) carry out unsupervised ML-ESN AMI network trafficclassification research to solve the problem of abnormalnetwork attack feature extraction analysis and accuratedetection (3) further improve the model learning abilitysuch as learning improvement through parallel traininggreatly reducing the learning time and classification time (4)

study the AMI network special protocol and establish anoptimized ML-ESN network traffic deep learning model thatis more in line with the actual application of AMI so as toapply it to actual industrial production

Data Availability

e data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

e authors declare that they have no conflicts of interest

Acknowledgments

is work was supported by the Key Scientific and Tech-nological Project of ldquoResearch and Application of KeyTechnologies for Network Security Situational Awareness ofElectric PowerMonitoring System (no ZDKJXM20170002)rdquoof China Southern Power Grid Corporation the project ofldquoPractical Innovation and Enhancement of EntrepreneurialAbility (no SJCX201970)rdquo for Professional Degree Post-graduates of Changsha University of Technology and OpenFund Project of Hunan Provincial Key Laboratory of Pro-cessing of Big Data on Transportation (no A1605)

References

[1] A Maamar and K Benahmed ldquoA hybrid model for anomaliesdetection in AMI system combining k-means clustering anddeep neural networkrdquo Computers Materials amp Continuavol 60 no 1 pp 15ndash39 2019

[2] Y Liu Safety Protection Technology of Electric Energy Mea-surement Collection and Billing China Electric Power PressBeijing China 2014

[3] B M Nasim M Jelena B M Vojislav and K Hamzeh ldquoAframework for intrusion detection system in advancedmetering infrastructurerdquo Security and Communication Net-works vol 7 no 1 pp 195ndash205 2014

[4] H Ren Z Ye and Z Li ldquoAnomaly detection based on adynamic Markov modelrdquo Information Sciences vol 411pp 52ndash65 2017

[5] F Fathnia and D B M H Javidi ldquoDetection of anomalies insmart meter data a density-based approachrdquo in Proceedings ofthe 2017 Smart Grid Conference (SGC) pp 1ndash6 Tehran Iran2017

[6] Z Y Wang G J Gong and Y F Wen ldquoAnomaly diagnosisanalysis for running meter based on BP neural networkrdquo inProceedings of the 2016 International Conference on Com-munications Information Management and Network SecurityGold Coast Australia 2016

[7] M Stephen H Brett Z Saman and B Robin ldquoAMIDS amulti-sensor energy theft detection framework for advancedmetering infrastructuresrdquo IEEE Journal on Selected Areas inCommunications vol 31 no 7 pp 1319ndash1330 2013

[8] Y Chen J Tao Q Zhang et al ldquoSaliency detection via im-proved hierarchical principle component analysis methodrdquoWireless Communications and Mobile Computing vol 2020Article ID 8822777 12 pages 2020

Mathematical Problems in Engineering 19

[9] Y Mo H J Kim K Brancik et al ldquoCyberndashphysical security ofa smart grid infrastructurerdquo Proceedings of the IEEE vol 100no 1 pp 195ndash209 2012

[10] e AMI network engineering task Force (AMI-SEC) rdquo 2020httposgugucaiugorgutilisecamisecdefaultaspx

[11] Y Park D M Nicol H Zhu et al ldquoPrevention of malwarepropagation in AMIrdquo in Proceedings of the IEEE InternationalConference on Smart Grid Communications pp 474ndash479Vancouver Canada 2013

[12] P Jokar N Arianpoo and V C M Leung ldquoElectricity theftdetection in AMI using customersrsquo consumption patternsrdquoIEEE Transactions on Smart Grid vol 7 no 1 pp 216ndash2262016

[13] Q R Zhang M Zhang T H Chen et al ldquoElectricity theftdetection using generative modelsrdquo in Proceedings of the 2018IEEE 30th International Conference on Tools with ArtificialIntelligence (ICTAI) Volos Greece 2018

[14] N Y Jiang ldquoAnomaly intrusion detection method based onAMIrdquo MS thesis Southeast University Dhaka Bangladesh2018 in Chinese

[15] S Neetesh J C Bong and G Santiago ldquoSecure and privacy-preserving concentration of metering data in AMI networksrdquoin Proceedings of the 2017 IEEE International Conference onCommunications (ICC) Paris France 2017

[16] C Euijin P Younghee and S Huzefa ldquoIdentifying maliciousmetering data in advanced metering infrastructurerdquo in Pro-ceedings of the 2014 IEEE 8th International Symposium onService Oriented System Engineering pp 490ndash495 OxfordUK 2014

[17] P Yi T Zhu Q Q Zhang YWu and J H Li ldquoPuppet attacka denial of service attack in advanced metering infrastructurenetworkrdquo Journal of Network amp Computer Applicationsvol 59 pp 1029ndash1034 2014

[18] A Satin and P Bernardi ldquoImpact of distributed denial-of-service attack on advanced metering infrastructurerdquo WirelessPersonal Communications vol 83 no 3 pp 1ndash15 2015

[19] C Y Li X P Wang M Tian and X D Feng ldquoAMI researchon abnormal power consumption detection in the environ-mentrdquo Computer Simulation vol 35 no 8 pp 66ndash70 2018

[20] A A A Fadwa and A Zeyar ldquoReal-time anomaly-baseddistributed intrusion detection systems for advancedmeteringinfrastructure utilizing stream data miningrdquo in Proceedings ofthe 2015 International Conference on Smart Grid and CleanEnergy Technologies pp 148ndash153 Chengdu China 2015

[21] M A Faisal and E T Aigng ldquoSecuring advanced meteringinfrastructure using intrusion detection system with datastream miningrdquo in Proceedings of the Pacific Asia Conferenceon Intelligence and Security Informatics IEEE Jeju IslandKorea pp 96ndash111 2016

[22] K Song P Kim S Rajasekaran and V Tyagi ldquoArtificialimmune system (AIS) based intrusion detection system (IDS)for smart grid advanced metering infrastructure (AMI) net-worksrdquo 2018 httpsvtechworkslibvteduhandle1091983203

[23] A Saad and N Sisworahardjo ldquoData analytics-based anomalydetection in smart distribution networkrdquo in Proceedings of the2017 International Conference on High Voltage Engineeringand Power Systems (ICHVEPS) IEEE Bali IndonesiaIEEEBali Indonesia 2017

[24] R Berthier W H Sanders and H Khurana ldquoIntrusiondetection for advanced metering infrastructures require-ments and architectural directionsrdquo in Proceedings of the IEEEInternational Conference on Smart Grid CommunicationsIEEE Dresden Germany pp 350ndash355 2017

[25] V B Krishna G A Weaver and W H Sanders ldquoPCA-basedmethod for detecting integrity attacks on advanced meteringinfrastructurerdquo in Proceedings of the 2015 InternationalConference on Quantitative Evaluation of Systems pp 70ndash85Madrid Spain 2015

[26] G Fernandes J J P C Rodrigues L F Carvalho J F Al-Muhtadi and M L Proenccedila ldquoA comprehensive survey onnetwork anomaly detectionrdquo Telecommunication Systemsvol 70 no 3 pp 447ndash489 2019

[27] W Wang Y Sheng J Wang et al ldquoHAST-IDS learninghierarchical spatial-temporal features using deep neuralnetworks to improve intrusion detectionrdquo IEEE Access vol 6pp 1792ndash1806 2018

[28] N Gao L Gao Y He et al ldquoA lightweight intrusion detectionmodel based on autoencoder network with feature reductionrdquoActa Electronica Sinica vol 45 no 3 pp 730ndash739 2017 inChinese

[29] M Yousefi-Azar V Varadharajan L Hamey andU Tupalula ldquoAutoencoder-based feature learning for cybersecurity applicationsrdquo in Proceedings of the 2017 InternationalJoint Conference on Neural Networks (IJCNN) IEEE NeuralNetworks pp 3854ndash3861 Anchorage AK USA 2017

[30] Y Wang H Zhou H Feng et al ldquoNetwork traffic classifi-cation method basing on CNNrdquo Journal on Communicationsvol 39 no 1 pp 14ndash23 2018 in Chinese

[31] S Kaur and M Singh ldquoHybrid intrusion detection and sig-nature generation using deep recurrent neural networksrdquoNeural Computing and Applications vol 32 no 12pp 7859ndash7877 2019

[32] H Jaeger M Lukosevicius D Popovici and U SiewertldquoOptimization and applications of echo state networks withleaky- integrator neuronsrdquo Neural Networks vol 20 no 3pp 335ndash352 2007

[33] S Saravanakumar and R Dharani ldquoImplementation of echostate network for intrusion detectionrdquo International Journalof Advanced Research in Computer Science Engineering andInformation Technology vol 4 no 2 pp 375ndash385 2015

[34] Y Kalpana S Purushothaman and R Rajeswari ldquoImple-mentation of echo state neural network and radial basisfunction network for intrusion detectionrdquo Data Mining andKnowledge Engineering vol 5 no 9 pp 366ndash373 2013

[35] X X Liu ldquoResearch on the network security mechanism ofsmart grid AMIrdquo MS thesis National University of DefenseScience and Technology Changsha China 2014 in Chinese

[36] Y Wang ldquoResearch on network behavior analysis and iden-tification technology of malicious coderdquo MS thesis XirsquoanUniversity of Electronic Science and Technology XirsquoanChina 2017 in Chinese

[37] A Moore D Zuev and M Crogan ldquoDiscriminators for use inflow-based classificationrdquo MS thesis Department of Com-puter Science Queen Mary and Westfield College LondonUK 2005

[38] Data standardization Baidu Encyclopediardquo 2020 httpsbaikebaiducomitemE695B0E68DAEE6A087E58786E58C964132085fraladdin

[39] H Li Statistical Learning Methods Tsinghua University PressBeijing China 2018

[40] Z K Malik A Hussain and Q J Wu ldquoMultilayered echostate machine a novel architecture and algorithmrdquo IEEETransactions on Cybernetics vol 47 no 4 pp 946ndash959 2017

[41] C Naima A Boudour and M A Adel ldquoHierarchical bi-level multi-objective evolution of single- and multi-layerecho state network autoencoders for data representationrdquo

20 Mathematical Problems in Engineering

2020 httpsarxivorgftparxivpapers1806180601016pdf

[42] M Nour and S Jill ldquoUNSW-NB15 a comprehensive data setfor network intrusion detection systemsrdquo in Proceedings of the2015 Military Communications and Information SystemsConference (MilCIS) pp 1ndash6 Canberra Australia 2015

[43] UNSW-NB15 datasetrdquo 2020 httpswwwunswadfaeduauunsw-canberra-cybercybersecurityADFA-NB15-Datasets

[44] N B Azzouna and F Guillemin ldquoAnalysis of ADSL traffic onan IP backbone linkrdquo in Proceedings of the GLOBECOMrsquo03IEEE Global Telecommunications Conference (IEEE Cat No03CH37489) IEEE San Francisco CA USAIEEE SanFrancisco CA USA 2004

[45] P Cunningham and S J Delany ldquoK-nearest neighbourclassifiersrdquo Multiple Classifier System vol 34 pp 1ndash17 2007

[46] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152 2014

[47] K J Manas R S Subhransu and T Lokanath ldquoDecision tree-induced fuzzy rule-based differential relaying for transmissionline including unified power flow controller and wind-farmsrdquoIET Generation Transmission amp Distribution vol 8 no 12pp 2144ndash2152

[48] L V Efferen and A M T Ali-Eldin ldquoA multi-layer per-ceptron approach for flow-based anomaly detectionrdquo inProceedings of the 2017 International Symposium on NetworksComputers and Communications (ISNCC) IEEE MarrakechMoroccoIEEE Marrakech Morocco 2017

Mathematical Problems in Engineering 21

Page 19: Network Traffic Anomaly Detection Based on ML-ESN for ...downloads.hindawi.com/journals/mpe/2020/7219659.pdfe current research on AMI network security threats mainly analyzes whether

other attack types are 99 However in the single-layer ESNalgorithm the best detection success rate is only 97 andthe general detection success rate is 94 In the BP algo-rithm the detection rate of the Fuzzy attack type is only 87and the false-positive rate exceeds 20 In the traditionalDecisionTree algorithm its detection effect is the worstBecause the detection success rate is generally less than 80and the false-positive rate is close to 35

7 Conclusion

is article firstly analyzes the current situation of AMInetwork security research at home and abroad elicits someproblems in AMI network security and introduces thecontributions of existing researchers in AMI networksecurity

Secondly in order to solve the problems of low accuracyand high false-positive rate of large-capacity network trafficdata in the existing methods an AMI traffic detection andclassification algorithm based onML-ESN deep learning wasproposed

e main contributions of this article are as follows (1)establishing the AMI network streaming metadata standard(2) the combination of Pearson and Gini coefficients is usedto quickly solve the problem of extracting important featuresof network attacks from large-scale AMI network streamswhich greatly saves model detection and training time (3)using ML-ESNrsquos powerful self-learning and storage andmemory capabilities to accurately and quickly classify un-known and abnormal AMI network attacks and (4) theproposed method was tested and verified in the simulationdataset Test results show that this method has obviousadvantages over single-layer ESN network BP neural net-work and other machine learning methods with high de-tection accuracy and low time consumption

Of course there are still some issues that need attentionand optimization in this paper For example how to establishAMI network streaming metadata standards that meet therequirements of different countries and different regions Atpresent due to the complex structure of AMI and otherelectric power informatization networks it is difficult to forma centralized and unified information collection source somany enterprises have not really established a securitymonitoring platform for information fusion

erefore the author of this article suggests that beforeanalyzing the network flow it is best to perform certainmulticollection device fusion processing to improve thequality of the data itself so as to better ensure the accuracy ofmodel training and detection

The main points of future work are as follows: (1) long-term, large-scale test verification of the proposed method on real AMI network flows, so as to find the limitations of the method in a real environment; (2) carrying out research on unsupervised ML-ESN AMI network traffic classification, to solve the problems of abnormal network attack feature extraction, analysis, and accurate detection; (3) further improving the model's learning ability, for example, through parallel training, to greatly reduce the learning and classification time; and (4) studying the special protocols of AMI networks and establishing an optimized ML-ESN network traffic deep learning model that is more in line with actual AMI applications, so as to apply it to industrial production.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Scientific and Technological Project "Research and Application of Key Technologies for Network Security Situational Awareness of Electric Power Monitoring System" (no. ZDKJXM20170002) of China Southern Power Grid Corporation, the project "Practical Innovation and Enhancement of Entrepreneurial Ability" (no. SJCX201970) for professional degree postgraduates of Changsha University of Science and Technology, and the Open Fund Project of the Hunan Provincial Key Laboratory of Processing of Big Data on Transportation (no. A1605).

References

[1] A. Maamar and K. Benahmed, "A hybrid model for anomalies detection in AMI system combining k-means clustering and deep neural network," Computers, Materials & Continua, vol. 60, no. 1, pp. 15–39, 2019.

[2] Y. Liu, Safety Protection Technology of Electric Energy Measurement, Collection and Billing, China Electric Power Press, Beijing, China, 2014.

[3] B. M. Nasim, M. Jelena, B. M. Vojislav, and K. Hamzeh, "A framework for intrusion detection system in advanced metering infrastructure," Security and Communication Networks, vol. 7, no. 1, pp. 195–205, 2014.

[4] H. Ren, Z. Ye, and Z. Li, "Anomaly detection based on a dynamic Markov model," Information Sciences, vol. 411, pp. 52–65, 2017.

[5] F. Fathnia and D. B. M. H. Javidi, "Detection of anomalies in smart meter data: a density-based approach," in Proceedings of the 2017 Smart Grid Conference (SGC), pp. 1–6, Tehran, Iran, 2017.

[6] Z. Y. Wang, G. J. Gong, and Y. F. Wen, "Anomaly diagnosis analysis for running meter based on BP neural network," in Proceedings of the 2016 International Conference on Communications, Information Management and Network Security, Gold Coast, Australia, 2016.

[7] M. Stephen, H. Brett, Z. Saman, and B. Robin, "AMIDS: a multi-sensor energy theft detection framework for advanced metering infrastructures," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, pp. 1319–1330, 2013.

[8] Y. Chen, J. Tao, Q. Zhang et al., "Saliency detection via improved hierarchical principle component analysis method," Wireless Communications and Mobile Computing, vol. 2020, Article ID 8822777, 12 pages, 2020.

Mathematical Problems in Engineering 19

[9] Y. Mo, H. J. Kim, K. Brancik et al., "Cyber–physical security of a smart grid infrastructure," Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, 2012.

[10] The AMI Security Task Force (AMI-SEC), 2020, http://osgug.ucaiug.org/utilisec/amisec/default.aspx.

[11] Y. Park, D. M. Nicol, H. Zhu et al., "Prevention of malware propagation in AMI," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 474–479, Vancouver, Canada, 2013.

[12] P. Jokar, N. Arianpoo, and V. C. M. Leung, "Electricity theft detection in AMI using customers' consumption patterns," IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.

[13] Q. R. Zhang, M. Zhang, T. H. Chen et al., "Electricity theft detection using generative models," in Proceedings of the 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), Volos, Greece, 2018.

[14] N. Y. Jiang, "Anomaly intrusion detection method based on AMI," M.S. thesis, Southeast University, Nanjing, China, 2018, in Chinese.

[15] S. Neetesh, J. C. Bong, and G. Santiago, "Secure and privacy-preserving concentration of metering data in AMI networks," in Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 2017.

[16] C. Euijin, P. Younghee, and S. Huzefa, "Identifying malicious metering data in advanced metering infrastructure," in Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering, pp. 490–495, Oxford, UK, 2014.

[17] P. Yi, T. Zhu, Q. Q. Zhang, Y. Wu, and J. H. Li, "Puppet attack: a denial of service attack in advanced metering infrastructure network," Journal of Network & Computer Applications, vol. 59, pp. 1029–1034, 2014.

[18] A. Satin and P. Bernardi, "Impact of distributed denial-of-service attack on advanced metering infrastructure," Wireless Personal Communications, vol. 83, no. 3, pp. 1–15, 2015.

[19] C. Y. Li, X. P. Wang, M. Tian, and X. D. Feng, "AMI research on abnormal power consumption detection in the environment," Computer Simulation, vol. 35, no. 8, pp. 66–70, 2018.

[20] A. A. A. Fadwa and A. Zeyar, "Real-time anomaly-based distributed intrusion detection systems for advanced metering infrastructure utilizing stream data mining," in Proceedings of the 2015 International Conference on Smart Grid and Clean Energy Technologies, pp. 148–153, Chengdu, China, 2015.

[21] M. A. Faisal and E. T. Aigng, "Securing advanced metering infrastructure using intrusion detection system with data stream mining," in Proceedings of the Pacific Asia Conference on Intelligence and Security Informatics, pp. 96–111, Jeju Island, Korea, 2016.

[22] K. Song, P. Kim, S. Rajasekaran, and V. Tyagi, "Artificial immune system (AIS) based intrusion detection system (IDS) for smart grid advanced metering infrastructure (AMI) networks," 2018, https://vtechworks.lib.vt.edu/handle/10919/83203.

[23] A. Saad and N. Sisworahardjo, "Data analytics-based anomaly detection in smart distribution network," in Proceedings of the 2017 International Conference on High Voltage Engineering and Power Systems (ICHVEPS), Bali, Indonesia, 2017.

[24] R. Berthier, W. H. Sanders, and H. Khurana, "Intrusion detection for advanced metering infrastructures: requirements and architectural directions," in Proceedings of the IEEE International Conference on Smart Grid Communications, pp. 350–355, Dresden, Germany, 2017.

[25] V. B. Krishna, G. A. Weaver, and W. H. Sanders, "PCA-based method for detecting integrity attacks on advanced metering infrastructure," in Proceedings of the 2015 International Conference on Quantitative Evaluation of Systems, pp. 70–85, Madrid, Spain, 2015.

[26] G. Fernandes, J. J. P. C. Rodrigues, L. F. Carvalho, J. F. Al-Muhtadi, and M. L. Proença, "A comprehensive survey on network anomaly detection," Telecommunication Systems, vol. 70, no. 3, pp. 447–489, 2019.

[27] W. Wang, Y. Sheng, J. Wang et al., "HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection," IEEE Access, vol. 6, pp. 1792–1806, 2018.

[28] N. Gao, L. Gao, Y. He et al., "A lightweight intrusion detection model based on autoencoder network with feature reduction," Acta Electronica Sinica, vol. 45, no. 3, pp. 730–739, 2017, in Chinese.

[29] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupalula, "Autoencoder-based feature learning for cyber security applications," in Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3854–3861, Anchorage, AK, USA, 2017.

[30] Y. Wang, H. Zhou, H. Feng et al., "Network traffic classification method basing on CNN," Journal on Communications, vol. 39, no. 1, pp. 14–23, 2018, in Chinese.

[31] S. Kaur and M. Singh, "Hybrid intrusion detection and signature generation using deep recurrent neural networks," Neural Computing and Applications, vol. 32, no. 12, pp. 7859–7877, 2019.

[32] H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, "Optimization and applications of echo state networks with leaky-integrator neurons," Neural Networks, vol. 20, no. 3, pp. 335–352, 2007.

[33] S. Saravanakumar and R. Dharani, "Implementation of echo state network for intrusion detection," International Journal of Advanced Research in Computer Science, Engineering and Information Technology, vol. 4, no. 2, pp. 375–385, 2015.

[34] Y. Kalpana, S. Purushothaman, and R. Rajeswari, "Implementation of echo state neural network and radial basis function network for intrusion detection," Data Mining and Knowledge Engineering, vol. 5, no. 9, pp. 366–373, 2013.

[35] X. X. Liu, "Research on the network security mechanism of smart grid AMI," M.S. thesis, National University of Defense Science and Technology, Changsha, China, 2014, in Chinese.

[36] Y. Wang, "Research on network behavior analysis and identification technology of malicious code," M.S. thesis, Xi'an University of Electronic Science and Technology, Xi'an, China, 2017, in Chinese.

[37] A. Moore, D. Zuev, and M. Crogan, "Discriminators for use in flow-based classification," M.S. thesis, Department of Computer Science, Queen Mary and Westfield College, London, UK, 2005.

[38] Data standardization, Baidu Encyclopedia, 2020, https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E6%A0%87%E5%87%86%E5%8C%96/4132085?fr=aladdin.

[39] H. Li, Statistical Learning Methods, Tsinghua University Press, Beijing, China, 2018.

[40] Z. K. Malik, A. Hussain, and Q. J. Wu, "Multilayered echo state machine: a novel architecture and algorithm," IEEE Transactions on Cybernetics, vol. 47, no. 4, pp. 946–959, 2017.

[41] C. Naima, A. Boudour, and M. A. Adel, "Hierarchical bi-level multi-objective evolution of single- and multi-layer echo state network autoencoders for data representation," 2020, https://arxiv.org/ftp/arxiv/papers/1806/1806.01016.pdf.

[42] M. Nour and S. Jill, "UNSW-NB15: a comprehensive data set for network intrusion detection systems," in Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6, Canberra, Australia, 2015.

[43] UNSW-NB15 dataset, 2020, https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets.

[44] N. B. Azzouna and F. Guillemin, "Analysis of ADSL traffic on an IP backbone link," in Proceedings of the GLOBECOM '03 IEEE Global Telecommunications Conference (IEEE Cat. No. 03CH37489), San Francisco, CA, USA, 2004.

[45] P. Cunningham and S. J. Delany, "K-nearest neighbour classifiers," Multiple Classifier Systems, vol. 34, pp. 1–17, 2007.

[46] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[47] K. J. Manas, R. S. Subhransu, and T. Lokanath, "Decision tree-induced fuzzy rule-based differential relaying for transmission line including unified power flow controller and wind-farms," IET Generation, Transmission & Distribution, vol. 8, no. 12, pp. 2144–2152, 2014.

[48] L. V. Efferen and A. M. T. Ali-Eldin, "A multi-layer perceptron approach for flow-based anomaly detection," in Proceedings of the 2017 International Symposium on Networks, Computers and Communications (ISNCC), Marrakech, Morocco, 2017.
