high-dimensional data analytics using low-dimensional models...
TRANSCRIPT
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
1
High-dimensional Data Analytics Using
Low-dimensional Models in Power Systems
Meng Wang
Assistant ProfessorDepartment of Electrical, Computer & Systems Engineering
Rensselaer Polytechnic Institute (RPI)Troy, NY, USA
IEEE SGSMA 2019May 20, 2019
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
2
Acknowledgment
Dr. Yingshuai Hao Dr. Pengzhi Gao Shuai Zhang Ren Wang
Prof. Joe H. Chow
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
3
Big Data and Low-Dimensional Models
Figure: Big data inpower systems Figure: Big data in
social networks
Figure: Big data in objectrecognition
• Despite the ambient dimension, many high-dimensionaldatasets have intrinsic low-dimensional structures such assparsity, low-rankness, and low-dimensional manifolds.
• These low-dimensional models enable the development of fast,model-free methods with provable performance guarantees fordata recovery and information extraction.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
4
Big Data in Power Systems
• Phasor Measurement Units (PMUs)• PMUs provide synchronized phasor measurements at a
sampling rate of 30 or 60 samples per second.• Multi-channel PMUs can measure bus voltage phasors, line
current phasors, and frequency. 2000+ PMUs in the NorthAmerica.
• Data availability and quality issues, e.g., data losses due tocommunication congestions.
• Limited incorporation into the real-time operations.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
5
Low Dimensionality of PMU data
Figure: PMUs in CentralNY Power Systems
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 200
5
10
15
20
Time(s)
Cu
rre
nt
Ma
gn
itud
e
Figure: Currentmagnitudes of PMU data
5 10 15 20 25 30 350
200
400
600
800
1000Distribution of singular value
Singular value index
Figure: Singular values ofthe PMU data matrix
• 6 PMUs measure 37 voltage/current phasors. 30samples/second for 20 seconds.
• Singular values decay significantly. Mostly close to zero.Singular values can be approximated by a sparse vector.
• Low-dimensionality also used in Chen, Xie, Kumar 2013,Dahal, King, Madani 2012 for dimensionality reduction.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
6
Convert Data to Information
Objective: Develop computationally efficient data-driven methodsfor power system situational awareness.
• PMU data quality improvement: missing data recovery, baddata correction, and detection of cyber data attacks.
• Real-time event identification through machine learning.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
7
Outline
1 Motivation
2 Data Recovery and Error Correction
3 Event Identification
4 Conclusions
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
8
PMU Data Quality Issues
• Data losses and errors resulting from communicationcongestions and device malfunction.
• California Independent System Operator reported that10%-17% of data in 2011 had availability and quality issues.
• Reliable data needed for real-time situational awareness andcontrol.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
9
Simultaneous and Consecutive Data Losses
A recorded PMU dataset: consecutive data losses on three phasesof line for an hour.
00:00:00:000 04:48:00:000 09:36:00:000 14:24:00:000 19:12:00:000 24:00:00:000
Time
79
80
81
82
83
84
Mag
nitu
de (
kV)
Observed voltage magnitudes
Phase APhase BPhase C
Figure: Measured voltage phasormagnitudes
00:00:00:000 04:48:00:000 09:36:00:000 14:24:00:000 19:12:00:000 24:00:00:000
Time
0
50
100
150
Mag
nitu
de (
A)
Observed current magnitudesPhase APhase BPhase C
Figure: Measured current phasormagnitudes
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
10
Low-rank Matrix Completion
Low-rank Matrix Completion Problem!
• A literature includes theoretical analysis (e.g., Candes, Recht2012) and recovery methods (e.g., nuclear norm minimization(Fazel 2002)).
minX‖X‖∗ = sum of singular values of X
s.t. X is consistent with the observed entries,
• Applications in collaborative filtering, computer vision, remotesensing, load forecasting, electricity market inference
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
11
Low-rank Matrix Completion
Low-rank Matrix Completion Problem!
• A literature includes theoretical analysis (e.g., Candes, Recht2012) and recovery methods (e.g., nuclear norm minimization(Fazel 2002)).
minX‖X‖∗ = sum of singular values of X
s.t. X is consistent with the observed entries,
• Applications in collaborative filtering, computer vision, remotesensing, load forecasting, electricity market inference
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
12
Low-rank Matrix Completion
Low-rank Matrix Completion Problem!
• A literature includes theoretical analysis (e.g., Candes, Recht2012) and recovery methods (e.g., nuclear norm minimization(Fazel 2002)).
minX‖X‖∗ = sum of singular values of X
s.t. X is consistent with the observed entries,
• Applications in collaborative filtering, computer vision, remotesensing, load forecasting, electricity market inference
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
13
Low-rank Matrix Completion for PMU Data Recovery
Advantages:
• No modeling of the power system.
• Analytical performance guarantee.
• Tolerate a significant percentage ofmissing data/bad measurements atrandom locations.
y1 y2 yt
An m-by-t PMU data matrix
Time:1 ~ t
Temporally correlated
through dynamical systems
Related
through
circuit laws
ch
an
nels
: 1
~ m
Limitations:
• Do not model temporal dynamics sufficiently.
• Low-rank matrix completion methods fail to recover acolumn/row if the complete column/row is lost. Simultaneousand consecutive data losses are frequent in PMU data.
• Convex optimization problems are computationally expensivefor large datasets.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
14
Low-rank Matrix Completion for PMU Data Recovery
Advantages:
• No modeling of the power system.
• Analytical performance guarantee.
• Tolerate a significant percentage ofmissing data/bad measurements atrandom locations.
y1 y2 yt
An m-by-t PMU data matrix
Time:1 ~ t
Temporally correlated
through dynamical systems
Related
through
circuit laws
ch
an
nels
: 1
~ m
Limitations:
• Do not model temporal dynamics sufficiently.
• Low-rank matrix completion methods fail to recover acolumn/row if the complete column/row is lost. Simultaneousand consecutive data losses are frequent in PMU data.
• Convex optimization problems are computationally expensivefor large datasets.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
15
Our Contribution
Our developed model-free data recovery and error correctionmethods
• First-order algorithms to solve nonconvex optimizationproblems with provable global optimality.
• Recover/correct simultaneous and consecutive datalosses/errors.
• Differentiate bad data from system events.
Zhang, Hao, Wang, Chow. IEEE Journal of Selected Topics on Signal Processing,
2018.
Hao, Wang, Chow, Farantatos, Patel, IEEE Transactions on Power Systems, 2018.
Zhang, Wang. IEEE Transactions on Signal Processing, 2019.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
16
Low-rank Hankel Structure of PMU Data
Observation matrix:
Y = [y1, y2, · · · , yn] ∈ Cnc×n
Hankel structure:
Hκ(Y) =
y1 y2 · · · yn−κ+1
y2 y3 · · · yn−κ+2...
.... . .
...yκ yκ+1 · · · yn
.Hκ(Y) ∈ Cκnc×(n−κ+1) canstill be approximated by alow-rank matrix.
y1 y2 yt
An m-by-t PMU data matrix
Time:1 ~ t
Temporally correlated
through dynamical systems
Related
through
circuit laws
chan
nel
s: 1
~ m
An mj-by-(t-j+1) Hankel matrix
Time
chan
nel
s
Tim
e
y1 y2
y2 y3
yj
chan
nel
sch
an
nel
s
yt-j+1
yt-j+2
ytyj+1
The low-rank Hankel property results from the reduced-orderdynamical system.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
17
Low-rank Hankel Structure of PMU Data
Figure: Measurements that contain a disturbance
0 5 10 15 20 25r: approximate rank of H5(X)
10-5
10-4
10-3
10-2
10-1
100
Ap
pro
xim
atio
n e
rro
r
5=15=35=55=85=15
Figure: The low-rank approximationerrors to Hκ(Y)
0 5 10 15 20 25r: approximate rank of H5(X)
10-5
10-4
10-3
10-2
10-1
100
Ap
pro
xim
atio
n e
rro
r
5=15=35=55=85=15
Figure: The low-rank approximationerrors to Hκ(Y), where Y is acolumn permutation of Y.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
18
Robust Data Recovery
Let M = Y + S denote the partially corrupted measurements,where S denotes the sparse errors.The robust data recovery problem is formulated as
minX,S∈Cnc×n
‖PΩ(X + S−M)‖2F
subject to rank(Hκ(X)) = r , ‖S‖0 ≤ s.(1)
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
19
Our proposed alternating projection algorithm
Initialization: X0 = 0, thresholding ε0;Two stages of iterations:
• In the k-th outer iteration:
• Increase the desired rank k from 1 to r gradually;
• In the l-th inner iteration:
• Update Sl based on the current estimated thresholding ξl ;• Update Xl along the gradient descent directionPΩ(Xl + Sl −M);
• Project the Hankel matrix HκXl into the rank-k matrix set;• Obtain Xl+1 from the matrix after projection;• Update ξl+1 based on Xl+1.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
20
Theoretical results
Theorem
Suppose the number of observed data exceeds O(r3 log2(n)) andeach row of S has at most O
(1r
)fraction of nonzeros, the
algorithm converges to the original data matrix linearly as
‖Xl − Y‖F ≤ ε after l = O(log(1/ε)) iterations.
• Required number of observations: O(r3 log2(n)), less than thebound O(nr log2(n)) of recovery with convex relaxationapproach;
• Fraction of corruptions it can correct: O(
1r
)in each row;
• Low computational complexity per iteration: O(rncn log n);
• Recovery guarantees on simultaneous data losses andcorruptions across all channels.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
21
Numerical experiments
0 5 10 15 20
Time (second)
0.9
0.95
1
1.05
1.1
Vol
tage
mag
nitu
de (
p.u.
) Observed voltage phasor magnitudes
0 5 10 15 20
Time (second)
-200
-100
0
100
200
Vol
tage
pha
sor
angl
es (° ) Observed angles of voltage phasors
0 5 10 15 20
Time (second)
0.9
0.95
1
1.05
1.1
Vol
tage
mag
nitu
de (
p.u.
) Recovered voltage phasor magnitudes
0 5 10 15 20
Time (second)
-200
-100
0
100
200
Vol
tage
pha
sor
angl
es (° ) Recovered voltage phasor angles
Figure: One case of 8% random bad data and 40% random missing data
0 5 10 15 20
Time (second)
0.9
0.95
1
1.05
1.1
Vol
tage
mag
nitu
de (
p.u.
) Observed voltage phasor magnitudes
0 5 10 15 20
Time (second)
0.9
0.95
1
1.05
1.1
Vol
tage
mag
nitu
de (
p.u.
) Recovered voltage phasor magnitudes
Figure: Consecutive bad data, 3% random bad data and 20% missingdata
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
22
Outline
1 Motivation
2 Data Recovery and Error Correction
3 Event Identification
4 Conclusions
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
23
Existing Approaches
Recent development of data driven approaches of eventidentification, see sample work [Wang W et al., 2014,Rafferty et al., 2016, Valtierra R et al., 2014, Jiang H et al., 2014]:
• Advantage: model-free, robust to model errors.• Limitations
• Single events or multiple events with long time intervals andminor overlapping
• A large number of training datasets• No clear physical interpretations• High training complexity.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
24
Challenges and Goals
• Challenges• Events happen close in time or location have overlapping
impacts on the measurements.• Not enough training datasets for all possible conditions and
successive events.• The rapid occurrence of cascading failures requires online
algorithms.
• Goals• Train on the a small number of single events• Identify overlapping successive events in real time
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
25Our Approach: Identification using Subspace
Representation
• Identify an event by comparing the row subspace of thereal-time PMU data matrix with a dictionary of subspacesobtained from recorded PMU data.
M1 =
Mn1 =
...
M1 =
Mn2 =
...
M1 =
Mn3 =
...
... ... ...
Short Circuit Events
D=
Dictionary Atoms
...
...
...Line Trip Events Load Change Events
M =
Online data Offline data
Compare the
subspace of
the real-time
data with the
dictionary to
determine the
event type
Figure: Dictionary construction from historical datasets andreal-time data identification through subspace comparison
Wenting Li, Meng Wang, Joe H.Chow. IEEE Transactions on Power System, 2018.
Wenting Li, Meng Wang. To appear in IEEE Transactions on Power System, 2019.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
26
Offline Training
20 40 60 80 100 120 140
Time (0.033 second)
0.96
0.97
0.98
0.99
1
The 1st layer The 2nd layer
Fully connected layer
…...
...
…...
...
Singular Values
Eigen Values
OperatorsLine Trip The output
Vector
Generator Trip
Three phase short circuit
CNN ClassifierSingle Events Training Data
Offline Training
20 40 60 80 100 120 140
Time (0.033 second)
0.9
0.92
0.94
0.96
0.98
1
1.02
20 40 60 80 100 120 140
Time (0.033 second)
0
0.2
0.4
0.6
0.8
1
1.2
Figure: Offline Training
• Extract dominant eigenvaluesλ of the state matrix and thedominant singular values δ ofthe data matrix of recordedsingle events as features;
• Input λ and σ to a 2-layerconvolutional neural network(CNN) and combine twopaths in the fully connectedlayer;
• Train this 2-layer CNNclassifier to identify the type ofan event.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
27
Online Testing
Given the two successive events occurring at T1 and T2
respectively, there are three steps to identify the type of secondevent:
Prediction of the impacts
Subtraction
Prediction-Subtraction Process
Compute eigenvalues and singular values
Feature Extraction
Classify the dominant features
Classification
Two successive events
The predicted first event
The residual event
+
-
The trained CNN
classifier
The type of
the second event
Online Testing
Successive Events Testing Data
20 40 60 80 100
Time (0.033 second)
-0.08
-0.06
-0.04
-0.02
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
T2 T1
Figure: Online Testing
• Step 1: predict the impact ofthe first event after time T2
and subtract it from theobtained measurements
• Various methods can beemployed to predict themeasurements, such as timeseries analysis and Hankelmatrix based methods[Hao et al., 2018].
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
28
Online Testing
Given the two successive events occurring at T1 and T2
respectively, there are three steps to identify the type of secondevent:
Prediction of the impacts
Subtraction
Prediction-Subtraction Process
Compute eigenvalues and singular values
Feature Extraction
Classify the dominant features
Classification
Two successive events
The predicted first event
The residual event
+
-
The trained CNN
classifier
The type of
the second event
Online Testing
Successive Events Testing Data
20 40 60 80 100
Time (0.033 second)
-0.08
-0.06
-0.04
-0.02
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
T2 T1
Figure: Online Testing
• Step 1: predict the impact ofthe first event after time T2
and subtract it from theobtained measurements
• Various methods can beemployed to predict themeasurements, such as timeseries analysis and Hankelmatrix based methods[Hao et al., 2018].
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
29
Online Testing
Given the two successive events occurring at T1 and T2
respectively, there are three steps to identify the type of secondevent:
Prediction of the impacts
Subtraction
Prediction-Subtraction Process
Compute eigenvalues and singular values
Feature Extraction
Classify the dominant features
Classification
Two successive events
The predicted first event
The residual event
+
-
The trained CNN
classifier
The type of
the second event
Online Testing
Successive Events Testing Data
20 40 60 80 100
Time (0.033 second)
-0.08
-0.06
-0.04
-0.02
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
T2 T1
Figure: Online Testing
• Step 2: The features (λ, σ)are extracted from the residualmeasurements;
• Step 3: The trained CNNoutputs the type of thesecond event in real time.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
30
Online Testing
Given the two successive events occurring at T1 and T2
respectively, there are three steps to identify the type of secondevent:
Prediction of the impacts
Subtraction
Prediction-Subtraction Process
Compute eigenvalues and singular values
Feature Extraction
Classify the dominant features
Classification
Two successive events
The predicted first event
The residual event
+
-
The trained CNN
classifier
The type of
the second event
Online Testing
Successive Events Testing Data
20 40 60 80 100
Time (0.033 second)
-0.08
-0.06
-0.04
-0.02
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
20 40 60 80 100
Time (0.033 second)
-0.06
-0.05
-0.04
-0.03
-0.02
-0.01
0
T2 T1
Figure: Online Testing
• Step 2: The features (λ, σ)are extracted from the residualmeasurements;
• Step 3: The trained CNNoutputs the type of thesecond event in real time.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
31
Numerical Results
Table: Total Number of the Testing Datasets
Types LT+GT LT+TP LT+LT GT+GT GT+TP TP+GT
Number 169 404 378 107 58 146
• Line trips (LTs), generator trips (GTs), and three-phase shortcircuit (TPs) in the 68-bus power system generated usingPSS/E;
• The training set includes 967 single events at differentlocations with different topologies and various pre-eventconditions;
• The testing set includes 1262 two-event cases, where“LT+GT” means a line trip event is followed by a generatortrip event.
• The time interval between any two successive events variesfrom 0.5 to 2 seconds.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
32
Performance with a Small Training Dataset
6 12 18 24 30 50 80 100
Percentage of Training Data (%)
40
60
80
100
IAR
(%
)
LT
6 12 18 24 30 50 80 100
Percentage of Training Data (%)
40
60
80
100
IAR
(%
)
GT
6 12 18 24 30 50 80 100
Percentage of Training Data (%)
40
60
80
100
IAR
(%
)
TP
6 12 18 24 30 50 80 100
Percentage of Training Data (%)
40
60
80
100
IAR
(%
)
Averaged
CNN-F CNN-T
• CNN-F achieves a higher IAR when given a small trainingdata;
• The IAR of CNN-T is sensitive to the size of the trainingdata.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
33
The Impact of Subtracting the First Event
Table: IAR of CNN-F and CNN-T with and without subtracting the firstevent (∆T = 1 second)
Classifier Process LT % GT % TP % Overall %
CNN-F NS 79.4 68.3 96.4 81.9
CNN-F SP 95.9 89.1 97.3 94.2
CNN-T NS 94.8 83.2 78.4 85.1
CNN-T SP 73.2 62.4 78.4 71.5
• “Not Subtract (NS)” means using the measurements after thesecond event directly;
• “Subtract the Prediction (SP)” means using the residual aftersubtracting the first event;
• Subtracting the impact of the first event can enhanceCNN-F’s performance significantly.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
34
Robustness to Noise
Table: IAR of identifying the second event by CNN-F with differentsignal-noise-ratio (SNR) noisy measurements
SNR (dB) 40 50 60 70 80 90 100
IAR of LT (%) 77.3 81.4 82.5 85.6 82.5 90.7 91.7
IAR of GT (%) 75.2 82.2 86.1 86.1 85.1 86.1 82.2
IAR of TP (%) 93.7 97.3 97.3 98.2 98.2 99.1 99.1
Overall IAR (%) 82.5 87.4 88.9 90.3 90.0 92.2 91.3
• The overall IAR of CNN-F is more than 80% for differentnoise levels;
• IAR is more than 90% when the SNR is higher than 70 dB;
• Significant events like TP events are less sensitive to noise.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
35
Conclusions
• A framework of power system data analytics by exploiting thelow-dimensional structure of spatial-temporal data blocks.
• Data quality improvement with analytical guarantees.(Missing data recovery, detection of cyber data attacks.)
• Real-time event identification approach using a small numberof recorded single events for training.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
36
Q & A
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
37
References
Candes, Emmanuel, Benjamin Recht (2012)
Exact Matrix Completion via Convex Optimization
Communications of the ACM, 111 – 119.
Fazel M (2002)
Matrix Rank Minimization with Applications
PhD thesis, Stanford University
Mateos G, Giannakis G B (2013)
Load Curve Data Cleansing and Imputation via Sparsity and Low Rank
IEEE Transactions on Smart Grid, 2347 – 2355.
Kekatos V, Zhang Y, Giannakis G B (2014)
Electricity Market Forecasting via Low-rank Multi-kernel Learning
IEEE Journal of Selected Topics in Signal Processing, 1182 – 1193.
Liu Y, Ning P, Reiter M K (2011)
False Data Injection Attacks Against State Estimation in Electric PowerGrids ACM Transactions on Information and System Security (TISSEC),13.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
38
Xie L, Mo Y, Sinopoli B (2010)
False Data Injection Attacks in Electricity Markets
Smart Grid Communications (SmartGridComm), 2010 First IEEEInternational Conference on, 226 – 231.
Kosut O, Jia L, Thomas R J, et al (2010)
Malicious Data Attacks on Smart Grid State Estimation: AttackStrategies and Countermeasures
Smart Grid Communications (SmartGridComm), 2010 First IEEEInternational Conference on, 220 – 225.
Sedghi H, Jonckheere E (2013)
Statistical Structure Learning of Smart Grid for Detection of False DataInjection Power and Energy Society General Meeting (PES), 1 – 5.
Liu L, Esmalifalak M, Ding Q, et al (2014)
Detecting False Data Injection Attacks on Power Grid by SparseOptimization IEEE Transactions on Smart Grid, 612 – 621.
Davenport M A, Plan Y, van den Berg E, et al (2014)
1-Bit Matrix Completion Information and Inference, 189 – 223.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
39
Pengzhi Gao, Ren Wang, Meng Wang, and Joe H. Chow (2016)
Low-rank Matrix Recovery from Quantized and Erroneous Measurements:Accuracy-preserved Data Privatization in Power Grids
Asilomar Conference on Signals, Systems and Computers, 374 – 378.
Pengzhi Gao, Meng Wang, Scott G. Ghiocel, Joe H. Chow, BruceFardanesh, and George Stefopoulos (2016)
Missing Data Recovery by Exploiting Low-dimensionality in Power SystemSynchrophasor Measurements
IEEE Trans. Power Systems 31(2), 1006 – 1013.
Pengzhi Gao, Meng Wang, Joe H. Chow, Scott G. Ghiocel, BruceFardanesh, George Stefopoulos, and Michael P. Razanousky (2016)
Identification of Successive “Unobservable” Cyber Data Attacks in PowerSystems
IEEE Trans. Signal Processing 64(21), 5557 – 5570.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
40
Le Xie, Yang Chen, and P. R. Kumar (2014)
Dimensionality Reduction of Synchrophasor Data for Early EventDetection: Linearized Analysis
IEEE Trans. Power Systems 29(6), 2784 – 2794.
Mark A. Davenport, Yaniv Plan, Ewout van den Berg, and Mary Wootters(2014)
1-bit Matrix Completion
Information and Inference 3(3), 189 – 223.
Sonia A. Bhaskar (2016)
Probabilistic Low-Rank Matrix Completion from Quantized Measurements
Journal of Machine Learning Research 17(60), 1 – 34.
Akshay Soni, Swayambhoo Jain, Jarvis Haupt, and Stefano Gonella (2016)
Noisy Matrix Completion under Sparse Factor Models
IEEE Trans. Information Theory 62(6), 3636 – 3661.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
41
Tony Cai, and Wen-Xin Zhou (2013)
A Max-Norm Constrained Minimization Approach to 1-Bit MatrixCompletion
Journal of Machine Learning Research 14(1), 3619 – 3647.
Olga Klopp, Jean Lafond, Eric Moulines, and Joseph Salmon (2015)
Adaptive Multinomial Matrix Completion
Electronic Journal of Statistics 9(2), 2950 – 2975.
Andrew S. Lan, Christoph Studer, and Richard G. Baraniuk (2014)
Matrix Recovery from Quantized and Corrupted Measurements
IEEE International Conference on Acoustics, Speech and SignalProcessing (ICASSP), 4973 – 4977.
Pengzhi Gao, Meng Wang, Joe H. Chow, Scott G. Ghiocel, BruceFardanesh, George Stefopoulos, and Michael P. Razanousky (2016)
Identification of Successive “Unobservable” Cyber Data Attacks in PowerSystems. IEEE Transactions on Signal Processing, 64 (21): 5557-5570.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
42
H. Farhangi (2010)
The Path of the Smart Grid
IEEE Power and Energy Magazine, 8(1), 18 – 28.
Quilumba Franklin L., Lee Wei-Jen, Huang Heng, Wang David Y., andSzabados Robert L. (2015)
Using Smart Meter Data to Improve the Accuracy of Intraday LoadForecasting Considering Customer Behavior Similarities
IEEE Transactions on Smart Grid, 6(2), 911 – 918.
S. Haben, C. Singleton, and P. Grindrod (2016)
Analysis and Clustering of Residential Customers Energy BehavioralDemand Using Smart Meter Data
IEEE Transactions on Smart Grid, 7(1), 136 – 144.
G. W. Hart (1992)
Nonintrusive appliance load monitoring
Proceedings of the IEEE, 80(12), 1870 – 1891.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
43
M. A. Lisovich, D. K. Mulligan, and S. B. Wicker (2010)
Inferring Personal Information from Demand-Response Systems
IEEE Security Privacy, 8(1), 11 – 20.
Li Fengjun, Luo Bo, and Liu Peng (2011)
Secure and Privacy-Preserving Information Aggregation for Smart Grids
International Journal of Security and Networks, 6(1), 28 – 39.
Barbosa Pedro, Brito Andrey, Almeida Hyggo, and Clau Sebastian (2014)
Lightweight Privacy for Smart Metering Data by Adding Noise
Proceedings of the 29th Annual ACM Symposium on Applied Computing,SAC ’14, 531 – 538.
McLaughlin Stephen, McDaniel Patrick, and Aiello William (2011)
Protecting Consumer Privacy from Electric Load Monitoring
Proceedings of the 18th ACM Conference on Computer andCommunications Security, 12, 87 – 98.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
44
N. Mahmoudi-Kohan, M. Parsa Moghaddam, M.K. Sheikh-El-Eslami, andE. Shayesteh (2010)
A Three-Stage Strategy for Optimal Price Offering by a Retailer Based onClustering Techniques International Journal of Electrical Power & EnergySystems, 32(10), 1135 – 1142.
C. Leon, F. Biscarri, I. Monedero, J. I. Guerrero, J. Biscarri, and R. Millan(2011)
Variability and Trend-Based Generalized Rule Induction Model to NTLDetection in Power Companies
IEEE Transactions on Power Systems, 26(4), 1798 – 1807.
Figueiredo Vera, Rodrigues Fatima, Vale Zita, and Gouveia JoaquimBorges (2005)
An Electric Energy Consumer Characterization Framework Based on DataMining Techniques IEEE Transactions on Power Systems, 20(2), 596 –602.
Saeed Aghabozorgi, Seyed Shirkhorshidi Ali, and Ying Wah Teh (2015)
Time-Series Clustering–A Decade Review Information Systems, 53, 16 –38.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
45
Piao, Minghao and et al (2014)
Subspace Projection Method Based Clustering Analysis in Load Profiling
IEEE Transactions on Power Systems, 29(6), 2628 – 2635.
Chen Yi, Nasrabadi Nasser M, and Tran Trac D (2011)
Hyperspectral Image Classification Using Dictionary-Based SparseRepresentation
IEEE Transactions on Geoscience and Remote Sensing, 49(10), 3973 –3985.
Pengzhi Gao, Meng Wang, Joe H. Chow, Matthew Berger, and Lee M.Seversky (2017)
Missing Data Recovery for High-dimensional Signals with NonlinearLow-dimensional Structures
IEEE Transactions on Signal Processing, 65(20), 5421 – 5436.
Sonia A. Bhaskar (2016)
Probabilistic Low-Rank Matrix Completion from Quantized Measurements
Journal of Machine Learning Research 17(60), 1 – 34.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
46
Ng Andrew, Jordan Michael, and Weiss Yair (2002)
On Spectral Clustering: Analysis and an Algorithm
Advances in neural information processing systems, 2, 849 – 856.
Elhamifar Ehsan, and Vidal Rene (2013)
Sparse Subspace Clustering: Algorithm, Theory, and Applications
IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11),2765 – 2781.
Liu Guangcan, Lin Zhouchen, Yan Shuicheng, Sun Ju, Yu Yong, and MaYi (2013)
Robust Recovery of Subspace Structures by Low-Rank Representation
IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1),171 – 184.
Rahmani Mostafa, and George K. Atia (2017)
Innovation Pursuit: A New Approach to Subspace Clustering
IEEE Transactions on Signal Processing, 65(23), 6276 – 6291.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
47
Tseng Paul (2000)
Nearest Q-Flat to m Points
Journal of Optimization Theory and Applications 105(1), 249 – 252.
Pengzhi Gao, Ren Wang, Meng Wang, and Joe H. Chow (2018)
Low-rank Matrix Recovery from Noisy, Quantized and ErroneousMeasurements
IEEE Transactions on Signal Processing, 60(11), 2918 – 2932.
Andrew S. Lan, Christoph Studer, and Richard G. Baraniuk (2014)
Matrix Recovery from Quantized and Corrupted Measurements
IEEE International Conference on Acoustics, Speech and SignalProcessing (ICASSP), 4973 – 4977.
Mark A. Davenport, Yaniv Plan, Ewout van den Berg, and Mary Wootters(2014)
1-bit Matrix Completion
Information and Inference 3(3), 189 – 223.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
48
Wenting Li and Meng Wang and Joe H. Chow (2018)
Real-time Event Identification through Low-dimensional SubspaceCharacterization of High-dimensional Synchrophasor Data
IEEE Transactions on Power Systems , 4937-4947.
Wei Wang, Li He, Penn Markham, Hairong Qi, Yilu Liu, Qing CharlesCao, and Leon M Tolbert (2014)
Multiple event detection and recognition through sparse unmixing forhigh-resolution situational awareness in power grid
IEEE Transactions on Smart Grid, 5(4):1654–1664, 2014
Mahdi Soltanolkotabi, Ehsan Elhamifar, Emmanuel J Candes, et al (2014)
Robust subspace clustering
The Annals of Statistics,, 42(2):669–699, 2014.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
49
Valtierra Rodriguez, Martin and de Jesus Romero-Troncoso, Rene andOsornio-Rios, Roque Alfredo and Garcia-Perez, Arturo (2015)
Detection and classification of single and combined power qualitydisturbances using neural networks
IEEE Transactions on Industrial Electronics ,61(5),2473–2482. IEEE, 2014.
Huaiguang Jiang, Jun Jason Zhang, Wenzhong Gao, and Ziping Wu.(2014)
Fault detection, identification, and location in smart grid based ondata-driven computational methods
IEEE Transactions on Smart Grid, 13, 5(6):2947–2956, 2014.
Rafferty, Mark and Liu, Xueqin and Laverty, David M and McLoone, Sean.(2016)
Real-time multiple event detection and classification using moving windowPCA
IEEE Transactions on Smart Grid, 7(5):2537–2548, 2016.
The First IEEE International Conference on Smart Grid Synchronized Measurements and AnalyticsTexas A&M University, College Station, Texas, USA | May 20-23, 2019
50
Yingshuai Hao and Meng Wang and Joe H. Chow and EvangelosFarantatos and Mahendra Patel (2018)
Model-less Data Quality Improvement of Streaming SynchrophasorMeasurements by Exploiting the Low-Rank Hankel Structure
IEEE Transaction Power System. IEEE, 2018.
Elsner, James B. (2002)
Analysis of time series structure: SSA and related techniques
Taylor & Francis, 2002.