A Cloud Data Sharing Platform for
Real-time PMU Data
Carl Hauser
Washington State University
PSERC Webinar
September 13, 2016
PSERC Project S-67G
Cloud Data Sharing Platform
Washington State University: Carl Hauser, Anjan Bose, Ming Meng, Dave Anderson
Cornell University: Ken Birman, Theodoros Gkountouvas, Weijia Song
ISO-NE: Eugene Litvinov, Xiaochuan Luo, Qiang Zhang
ADVISORS: Jay Giri (GE), George Stefopoulos (NYPA)
ALSO: Thanks to the US Dept of Energy ARPA-E GENI Program that supported the initial GridCloud research
Presentation Outline
Cloud opportunities and challenges
GridCloud concept and architecture
Performance
Cyber-security
For further investigation
Cloud Opportunity
• Resource elasticity: pay for what you need, only
for as long as you need it
• Low constant cost
• Massive computation available if needed (event
analysis, etc.)
• Geographic flexibility
• data and computation located close to where needed
• move to (or back up in) a distant location for disasters (Hurricane Sandy experience)
• Neutral ground for data sharing
• Data sharing platform need not be under physical
control of one utility, ISO, etc.
Cloud-specific Challenges
Clouds such as EC2 are surprisingly hostile for real-time work
• Underlying scheduler and network layer are
unreliable
• Strange timing problems, bursts of message loss,
other anomalies
• Overcoming this is made difficult by Amazon’s
unwillingness to document the AWS infrastructure
• But we’ve never encountered a problem that we
couldn’t eventually pin down and solve
Business Environment Challenges
• Distinct owners: peers & hierarchy (ISO)
• Owners control data flow: entities have different
security & sharing policies
• ISOs integrate data ... but as we get further
from sources, quality of information is a potential
concern
• How valuable is shared PMU data for
operations?
• Is sharing unthinkable due to technical barriers? We
can help with that
• Due to business barriers? That’s harder!
System Concept
A distributed platform for real-time data collection, storage, processing and dissemination using Cloud Computing
• Use redundancy to overcome real-time disruptions and failures.
• Use proven techniques from distributed computing to manage issues of consistency and availability.

Redundant Communication: Reliability and Performance
System Architecture
[Architecture diagram: incoming data streams pass through Time Alignment, Computation, and Results stages within the cloud-hosted Application.]
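Why a sketch here: the Time Alignment stage must gather measurements carrying the same GPS timestamp into a single frame before computation runs. Below is a minimal Python sketch of that idea; the (source_id, gps_timestamp, value) record shape, the frame-completeness rule, and the 100 ms timeout are illustrative assumptions, not the GridCloud implementation.

    import time
    from collections import defaultdict

    def align_streams(records, expected_sources, timeout_s=0.1):
        """Group (source_id, gps_timestamp, value) records into time-aligned frames.

        A frame is released as soon as every expected source has reported for that
        GPS timestamp, or when timeout_s elapses after the frame's first arrival
        (the frame is then released incomplete and later copies are dropped).
        Illustrative sketch only; not the GridCloud time-alignment code.
        """
        pending = defaultdict(dict)   # gps_timestamp -> {source_id: value}
        arrived = {}                  # gps_timestamp -> wall-clock time of first arrival
        for source_id, ts, value in records:
            if ts not in pending and ts in arrived:
                continue              # frame already released; drop the late copy
            pending[ts][source_id] = value
            arrived.setdefault(ts, time.monotonic())
            if len(pending[ts]) == len(expected_sources):
                yield ts, pending.pop(ts)          # complete frame
            now = time.monotonic()
            for stale in [t for t in pending if now - arrived[t] > timeout_s]:
                yield stale, pending.pop(stale)    # timed-out, possibly incomplete frame

Releasing a frame on timeout rather than waiting indefinitely is what lets a real-time pipeline tolerate late or lost measurements.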
GridCloud – the ARPA-E Demonstration: Scalability and Fine-grained Replication
Dave Anderson, Theodoros Gkountouvas, Carl Hauser
[Architecture diagram: time-synchronized data sources feed, over Internet/VPN, N Data Collector replicas in GridCloud running on Amazon Elastic Compute Cloud. Each replica contains forwarders FWD1…FWDN and an application pipeline of Time Alignment, Computation, and Results.]
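Fine-grained replication means the same computation runs in every replica, so a downstream consumer only needs the first copy of each result and can discard later duplicates. A minimal Python sketch of that first-copy-wins merge; the (replica_id, timestamp, payload) record shape is an assumption for illustration, not the GridCloud code.

    def merge_replicas(results):
        """First-copy-wins merge of result streams from redundant replicas.

        `results` yields (replica_id, timestamp, payload) in arrival order from
        all replicas; only the first copy of each timestamp is passed on, so a
        slow or failed replica never delays the output. Illustrative only.
        """
        delivered = set()
        for replica_id, ts, payload in results:
            if ts in delivered:
                continue          # a faster replica already supplied this frame
            delivered.add(ts)
            yield ts, payload

Because every replica computes the same answer from the same time-aligned input, taking the first arrival gives low latency without sacrificing correctness.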
GridCloud for ARPA-E Demonstration: Simulated 6K-Bus WECC System
• Comparison of the 6K-bus to the 179-bus system:
• Power system description:
• 6,000 buses
• A simplified model (roughly 1/3 the number of buses) of the entire WECC system
• This is the primary model used by industry and academia for studying the July 2, 1996 blackout
• All power components (buses, branches, etc.) at or above 230 kV are monitored

                   Old      New      Scale
  Substations      127      291      2.29x
  Buses            179      6,000    33.52x
  Streams (PMUs)   1,577    4,632    2.94x
ARPA-E GridCloud Demonstration (6K bus, 3 Replicas)
[System diagram: three simulated power system replicas replay recorded data files through PMU1…PMUN and GridStat front ends, each delivering 4,632 time-synchronized streams over the Internet via GridStat (UDP) to a corresponding GridCloud replica running on Amazon Elastic Compute Cloud. The deployment is managed by CloudMake and ISIS2. Each GridCloud replica contains forwarders FWD1…FWDN, substation state estimators SubstationSE1…SubstationSEM covering 291 substations, and a control center SE running at 5 Hz, followed by Time Alignment, Computation, Results, and Visualization stages. A local machine displays the output through a Visualization Viewer (RDP).]
ISO-NE GridCloud Demonstration: Smaller-scale, Geographic Redundancy, Security, Multiple Data Sources
[System diagram: time-synchronized data sources connect through tunnels/VPN to Data Collector replicas in GridCloud running on Amazon Elastic Compute Cloud; each replica contains forwarders FWD1…FWDN and an application pipeline of Time Alignment, Computation, and Results, with results returned to the sources.]
ISO-NE System
New England System
• 761 buses (planning model)
• 73 PMUs
• 96 voltage phasors
• 127 current phasors
• 93 buses observable (including all 345 kV buses)
• 11 seconds of recorded real-time data
• PMU data @ 30 Hz
• PMU data is replayed in a loop to obtain longer runs (see the sketch after this list)
• LSE solution @ 5 Hz
• Returned as a C37.118 data stream
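The 11-second recording is replayed in a loop at 30 Hz, and results come back as a C37.118 stream. Below is a minimal Python sketch of such a looping TCP replay sender; the frames are treated as opaque, already-encoded bytes, and the host, port, and rate parameters are placeholders, not the actual GridCloud sender.

    import socket
    import time

    def replay_frames(frames, host, port, rate_hz=30.0):
        """Replay pre-recorded, already-encoded PMU frames over TCP in a loop.

        `frames` is a list of bytes objects (e.g. captured C37.118 data frames).
        Looping the recording gives arbitrarily long runs, as in the ISO-NE tests.
        Illustrative only; endpoints and framing are assumptions.
        """
        period = 1.0 / rate_hz
        with socket.create_connection((host, port)) as sock:
            next_send = time.monotonic()
            while True:                      # loop the recording indefinitely
                for frame in frames:
                    sock.sendall(frame)
                    next_send += period
                    delay = next_send - time.monotonic()
                    if delay > 0:
                        time.sleep(delay)    # pace the stream at rate_hz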
ISO-NE Demonstration
[System diagram: an ISO-NE hosted distribution point replays recorded C37.118 data for PMU1…PMUm (31 streams) from the internal datasource network, and a Cornell hosted distribution point replays PMUm+1…PMUN (42 streams); both send over TCP through SSH tunnels to GridCloud running on Amazon VPC, managed by CloudMake and VSync. GridCloud instances at Data Center 1 and Data Center 2 each contain forwarders FWD1…FWDN, a cloud-hosted ingress distribution point, a cloud-hosted raw egress distribution point, a cloud-hosted SE result egress distribution point, a data archive, and a state estimator. Visualization clients LSE-VIS-1 and LSE-VIS-2 display the results.]
ISO-NE Demonstration Monitoring
[Same system diagram as the previous slide, with the SSH tunnels between the distribution points and the two data centers annotated with the latency and consistency checks listed below.]
Latency and consistency checks (a measurement sketch follows):
• L3se: round trip from datasource to SE to datasource
• L2: round trip from Ingress Relay to Egress Relay
• L3raw: round trip from datasource to CloudRelay to datasource
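A sketch of how an L3-style round-trip check can be made at the datasource: stamp each outgoing frame with a sequence number, record the send time, and compute the latency when the cloud relay (or SE result stream) echoes that sequence number back. The sequence-number bookkeeping and millisecond units are illustrative assumptions, not the measurement harness used in the project.

    import time

    class RoundTripMeter:
        """Track L3-style round-trip latency: datasource -> cloud -> datasource.

        The datasource records the send time of each frame by sequence number;
        when the cloud echoes the frame (or its SE result) back, the difference
        is the round-trip time. Illustrative sketch only.
        """
        def __init__(self):
            self.sent = {}          # seq -> monotonic send time
            self.samples_ms = []

        def on_send(self, seq):
            self.sent[seq] = time.monotonic()

        def on_echo(self, seq):
            t0 = self.sent.pop(seq, None)
            if t0 is not None:
                self.samples_ms.append((time.monotonic() - t0) * 1000.0)

Comparing L3 against L2 then roughly separates wide-area network time from in-cloud time.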
Performance
L2 and L3 Latency Tests
• Sampled over 4 hours
• Tests performed from Cornell and ISO-NE datasource machines over SSH tunnels
• Sampled four raw feeds and two SE feeds from each datacenter
• Lowest-numbered PMU from each datasource (ISO-NE and Cornell)
• Highest-numbered PMU from each datasource
• PMUs send to the cloud in order from the datasource; this shows the spread of data from first to last measurement sent per round
• Lowest- and highest-latency SE result
• Tests are presented in the following slides as histograms and a table of overall statistics (computed as sketched below)
• Histograms cover only the highest-numbered PMU/SE, as they have the highest variability
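The overall statistics reported on the following slides (min, 1st percentile, average, 99th percentile, max) can be computed from the collected round-trip samples as in this small Python sketch; the nearest-rank percentile rule is an assumption for illustration.

    import statistics

    def summarize(samples_ms):
        """Compute min, 1st percentile, average, 99th percentile, and max of
        round-trip samples (milliseconds), as reported in the latency tables."""
        ordered = sorted(samples_ms)

        def pct(p):  # nearest-rank percentile
            idx = max(0, min(len(ordered) - 1, round(p / 100.0 * len(ordered)) - 1))
            return ordered[idx]

        return {
            "min": ordered[0],
            "p01": pct(1),
            "avg": statistics.mean(ordered),
            "p99": pct(99),
            "max": ordered[-1],
        }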
Histogram: Round Trip Latencies
Graphs: number of times a particular latency occurs
[Histogram: Raw Data Round Trip Latency from ISO-NE Source; x-axis: Measured Latency (ms), 0–600; y-axis: Number of Occurrences, 0–2500; series: Oregon, Virginia]
Histogram: Round Trip Latencies
Graphs: number of times a particular latency occurs
[Histogram: Raw Data Round Trip Latency from Cornell Source; x-axis: Measured Latency (ms), 0–600; y-axis: Number of Occurrences, 0–2500; series: Oregon, Virginia]
Histogram: Round Trip Latencies
Graphs: number of times a particular latency occurs
[Histogram: SE Results Round Trip Latency (data from both sources); x-axis: Measured Latency (ms), 0–600; y-axis: Number of Occurrences, 0–2500; series: Oregon, Virginia]
Latencies (milliseconds)

                                    Virginia   Virginia-Internal   Oregon   Oregon-Internal
ISONE Raw-Low     Min                   20                             88
ISONE Raw-Low     1st Percentile        22                             89
ISONE Raw-Low     Average               25                            102
ISONE Raw-Low     99th Percentile       58                            152
ISONE Raw-Low     Max                  611                            696
ISONE Raw-High    Min                   22                             90
ISONE Raw-High    1st Percentile        25                             99
ISONE Raw-High    Average               46                            127
ISONE Raw-High    99th Percentile       82                            179
ISONE Raw-High    Max                  612                            697
Cornell Raw-Low   Min                   17                             90
Cornell Raw-Low   1st Percentile        17                             91
Cornell Raw-Low   Average               18                            115
Cornell Raw-Low   99th Percentile       20                            191
Cornell Raw-Low   Max                   49                            407
Cornell Raw-High  Min                   18                             91
Cornell Raw-High  1st Percentile        18                             92
Cornell Raw-High  Average               19                            120
Cornell Raw-High  99th Percentile       20                            199
Cornell Raw-High  Max                   49                            413
SE Results        Min                  279         242               351        240
SE Results        1st Percentile       294         267               370        273
SE Results        Average              325         300               409        317
SE Results        99th Percentile      384         348               490        393
SE Results        Max                  911         469               962        642
Latencies (milliseconds) — the same table, annotated on the slide with the L3 Raw, L3 SE, and L2 metric labels.
OpenPDC Manager (Visualizer) Displaying SE Results
Cyber-security: Performance Cost
• EC2 latency
• Average = 245 ms
• 1st percentile = 211 ms
• 99th percentile = 255 ms
• VPC latency
• Average = 261 ms
• 1st percentile = 228 ms
• 99th percentile = 270 ms
• The delta is approximately +15 ms
• These numbers (L1 latencies) do not include SE compute time (75–100 ms)
• Adding SSH tunnels added less than 2 ms to the round-trip time (a tunnel-setup sketch follows)
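For reference, the SSH tunnels used in the tests are standard OpenSSH local port forwards. A minimal sketch of launching one from Python follows; the gateway, user, and port numbers are placeholders, not values from the deployment.

    import subprocess

    def open_tunnel(local_port, remote_host, remote_port, gateway="user@cloud-gateway"):
        """Open an OpenSSH local forward: connections to localhost:local_port are
        carried over the encrypted tunnel to remote_host:remote_port via gateway.

        Placeholder endpoints for illustration; -N means run no remote command.
        """
        return subprocess.Popen(
            ["ssh", "-N", "-L", f"{local_port}:{remote_host}:{remote_port}", gateway]
        )

    # Example (hypothetical endpoints): forward a PMU data port through the tunnel.
    # tunnel = open_tunnel(4712, "cloud-ingress.internal", 4712)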
Main Findings
• Cost: as configured for testing
• 13 AWS instances total per datacenter (Visualizer, CloudRelay, CloudMakeLeader, StateEstimator, 3xRawArchiver, 4xSEArchiver, 2xForwarder)
• $2.47/hr to run per datacenter
• Latency: round-trip time including the LSE solution was 300 ms through the eastern data center and 500 ms through the western data center
• Consistency: returned raw data and LSE results from the two data centers were identical
• Security effect on latency: the cost of AES-256 encryption was at the noise level and SSH added about 2 ms; no security-related data loss or delays were observed (see the encryption-timing sketch after this list)
• Fault tolerance: loss of one data center did not affect results from the other data center; restarting the lost data center took 175 s
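The finding that AES-256 encryption sits at the noise level is easy to sanity-check by timing encryption of frame-sized messages. A Python sketch using the cryptography package follows; the 100-byte frame size and CTR mode are assumptions for illustration, not the cipher configuration used in the project.

    import os
    import time
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def aes256_cost_us(frame_size=100, n_frames=100_000):
        """Rough per-frame cost (microseconds) of AES-256-CTR on frame-sized data.

        Frame size and mode are assumptions; the point is that symmetric
        encryption of small C37.118-sized frames costs microseconds, i.e.
        noise next to the hundreds-of-ms round-trip latencies measured.
        """
        key, nonce = os.urandom(32), os.urandom(16)
        enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
        frame = os.urandom(frame_size)
        t0 = time.perf_counter()
        for _ in range(n_frames):
            enc.update(frame)
        return (time.perf_counter() - t0) / n_frames * 1e6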
Additional Platform Features
• Distribute real-time data streams to multiple
applications in the cloud
• Freeze-Frame File System (FFFS)
• Distributed, time-consistent snapshots of stored data (see the conceptual sketch after this list)
• Tamper-proof data
• CloudMake
• Declarative specification of GridCloud components,
their interconnection and the cloud resources that
they use
• Automated instantiation, monitoring, and repair of
GridCloud components when instances or
communication fail
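Conceptually, a time-consistent snapshot returns, for every archived stream, the latest record whose timestamp is at or before the requested snapshot time, so all streams are read as of the same instant. A minimal Python sketch of that read semantics follows; it only illustrates the idea and is not the FFFS API.

    import bisect

    def temporal_snapshot(archives, t_snap):
        """Conceptual time-consistent read: for every archived stream, return the
        most recent record whose timestamp is <= t_snap.

        `archives` maps stream name -> list of (timestamp, record) sorted by time.
        Illustrates snapshot semantics only; not the FFFS interface.
        """
        snapshot = {}
        for stream, records in archives.items():
            times = [ts for ts, _ in records]
            i = bisect.bisect_right(times, t_snap)
            if i > 0:
                snapshot[stream] = records[i - 1]
        return snapshot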
For Further Investigation
• Flexibility to incorporate multiple entities (actual
sharing)
• Naming and configuration for sharing
• Cyber-security
• Recording time-synchronized system topology
along with PMU data (FFFS should help)
• New project starting with NYPA to investigate
these and other issues