temporal operators for spark streaming and its application for office365 service monitoring
TRANSCRIPT
![Page 1: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/1.jpg)
TEMPORAL OPERATORS FOR SPARK STREAMING AND ITS APPLICATION FOR OFFICE365 SERVICE MONITORING
Jin Li, Wesley MiaoMicrosoft
![Page 2: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/2.jpg)
Monitoring Target Service
Kafka
Spark Streaming OD
ata API
Collectors
aggregate monitor
DataCenterManagement
ServiceRe
cove
ry
Act
ions
Aggregated Events
Ale
rts
Reco
very
A
ctio
ns
Raw Signals
Topology D
ata
DashboardsCassandra
Temporal operators
Temporal operators
normalizeTemporal operators
Raw Signals
Topology D
ata
Normalized Events
Ale
rts
Paging Alerts
Oncall Engineer
![Page 3: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/3.jpg)
Office365 Service Monitoring• Service Monitoring Signals
– Local active monitoring– Schema: <TopologyScopeValue, IsSuccess, …>– TopologyScopeValue: grouping of hosts
• e.g. server, rack, site, region
• Data may arrive out of order• Application logic defined on data time
![Page 4: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/4.jpg)
Spark Streaming
aggregate monitor
Rec
ove
ry
Act
ions
Temporal operators
Temporal operators
normalize
Raw
Signals
Topology D
ata Ale
rts
reorderTemporal operators
val rawSignals : DStream[T]val topologyData: DStream[T1]
val normalized = rawSignals.join(topologyData, …)
![Page 5: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/5.jpg)
Spark Streaming
aggregate monitor
Rec
ove
ry
Act
ions
Temporal operators
Temporal operators
normalize
Raw
Signals
Topology D
ata Ale
rts
reorderTemporal operators
val reordered = normalized.reorder(Policy.ReorderAdjustAndDrop(Seconds(60), Seconds(120)))
![Page 6: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/6.jpg)
Spark Streaming
aggregate monitor
Rec
ove
ry
Act
ions
Temporal operators
Temporal operators
normalize
Raw
Signals
Topology D
ata Ale
rts
reorderTemporal operators
val aggregated = reordered.aggregate(TumblingWindow(5 minutes),groupBy(event.rack, event.site, event.region),Sum(event => event.isSuccess), Count(), Avg(event => event.latency)
)val availabilityStats = aggregated.map{…}
![Page 7: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/7.jpg)
Spark Streaming
aggregate monitor
Rec
ove
ry
Act
ions
Temporal operators
Temporal operators
normalize
Raw
Signals
Topology D
ata Ale
rts
reorderTemporal operators
val monitoringStats = availabilityStats.join(availabilityStats, left => JoinKey(left.topologyScopeValue),right => JoinKey(right.topologyScopeValue),(left, right) => left.availability < 0.99 &&
right.availability >= 0.99TimeDiff(Seconds(-300), Seconds(0))
![Page 8: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/8.jpg)
Temporal operators in depth
![Page 9: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/9.jpg)
Temporal Operators – Reorder • Reorder(policy)
– inputStream.map { e => StreamEvent(e.timestamp, e) }.reorder(Policy.Reorder(Seconds(60)))
10:00:0010:00:0110:00:03
10:00:0209:59:01
10:00:0010:00:0110:00:03 10:00:02… …
10:01:01
10:00:0010:00:01
10:00:03 10:00:02… …10:01:01
HWM: 10:00:03HWM: 10:00:03HWM: 10:01:01
![Page 10: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/10.jpg)
Temporal Operators – Event-time window aggregate
• Basic Availability Metric– SuccessEvents / TotalEvents for the every 5 minutes
• Tumbling window sum• Event-time window aggregate
• val availabilityMetrics = input.map { e => StreamEvent(e.timestamp, e) }.reorder(Policy.ReorderAdjustAndDrop(Seconds(60), Seconds(120))).aggregate(TumblingWindow(Seconds(300)), event => GroupByKey((event.topologyScopeValue, “topologyScopeValue”)), Sum(event => event.isSuccess, “sumOfSuccessEvent”),Count(event => event, “sumOfTotalEvent”)
).map (…)
![Page 11: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/11.jpg)
Temporal Operators – Event-time window aggregate
• Implementation
(rack1) (SumState(150), CountState(170))window: 10:00:00 -10:05:00
(10:03:00, rack1, 1)(10:04:49, rack2, 0)
(rack2) (SumState(230), CountState(260))(10:05:01, rack2, 1)
… …
(10:05:00, rack1, 151, 171)
(10:05:00, rack2, 230, 261)
window: 10:00:05 -10:05:10
( … … )
(rack2) (SumState(1), CountState(1))(SumState(151), CountState(171))
(SumState(230), CountState(261))
![Page 12: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/12.jpg)
Temporal Operators – Join• Alarm
– SuccessEvents/TotalEvents > threshold for the current 5 minutes, but not the previous 5 minutes
• Temporal self join
• Temporal JoinavailabilityMetrics.join(availabilityMetrics,left => JoinKey(x.topologyScopeValue, “leftTopologyScopeValue”),right => JoinKey(y.topologyScopeValue, “rightTopologyScopeValue”),(left, right) => left.availability < 0.99 && right.availability >= 0.99TimeDiff(Seconds(-300), Seconds(0))
)
![Page 13: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/13.jpg)
Temporal Operators – Join• Temporal condition for Join
– TimeDiff(Seconds(-300), Seconds(0))
Left Right
-5 minutes
10:00:00
09:55:00
10:00:00
![Page 14: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/14.jpg)
THANK YOU.Jin Li ([email protected] )Wesley Miao ([email protected] )
![Page 15: Temporal Operators For Spark Streaming And Its Application For Office365 Service Monitoring](https://reader033.vdocuments.us/reader033/viewer/2022051707/58ef0de41a28ab6f408b45fb/html5/thumbnails/15.jpg)
Questions?