data streaming for connected devices with azure stream analytics by juan manual servera
TRANSCRIPT
Father of two
Undiscovered guitar player
.net Developer since v1.0 beta 2
C# MVP in 2014 & 2015
Level 44 geek
@jmservera
http://aka.ms/WomenInIoT
Connectivity Data AnalyticsThings
Source: wikimedia
https://www.hackster.io/peejster/rover-c42139
Windows 10 IoT Core
IoT Core Basics
Windows Universal App Platform• Converged APIs, write ONE Universal App and target all Windows 10 editions
– Scale and get higher ROI by selling same App to all Windows 10 editions OEMs/ODMSs
– Reuse existing development skills
Languages
• C++ /CX
• C#, VB
• JS
• Python*
• Node.js*
APIs
• WinRT
• Win32
• .NET
• Wiring
Deployment and
Execution
• APPX
• Xcopy
• App Isolation
UI Frameworks
• HTML
• Xaml
• DirectX
Tools
• Visual Studio
• PowerShell
Microsoft’s view
The Internet of Things starts with your things.
• Build on the infrastructure you already have
• Add more devices to the ones you already own
• Get more from the data that already exists
Stop just running your business. Start making it thrive. Start realizing the potential of the Internet of Your Things.
.
Microsoft Azure services for IoT
Producers Data Transport Storage Analytics Presentation & action
Event Hubs
(Service Bus)SQL Database Machine Learning Azure Websites
Heterogeneous
client agents
Table/Blob
StorageHD Insight Mobile Services
External Data
SourcesDocumentDB Stream Analytics Notification Hubs
External Data
SourcesCloud Services Power BI
External Services
{ }
Storage adapters
Stream processing
Cloud gateways(web APIs)
Field gateways
Applications
Search and query
Data analytics (Excel)
Web/thick client dashboards
Devices to take action
RabbitMQ /
ActiveMQ
Web and Social
Devices
Sensors
Solr
Azure Search
EventProducers
Field Gateway
Analytics & Operationalized InsightsPresentation & Business
Connectivity
IoT Device & Cloud PatternsD
ev
ices
RTO
S, L
inu
x, W
ind
ow
s, A
nd
roid
, iO
S
Protocol Adaptation
Batch Analytics & Visualizations
Azure HDInsight, AzureML, Power BI,
Azure Data Factory
Hot Path Analytics
Azure Stream Analytics, Azure HDInsight Storm
Presentation &
Business Connectivity
App Service, Websites
Dynamics, BizTalk Services, Notification Hubs
Hot Path Business Logic
Service Fabric & Actor Framework
Cloud Gateway
Event Hubs&IoT Hub
Field Gateway
Protocol Adaptation
Device Connectivity & Management
Many aspects of connectivity
Functionality Device-to-cloud telemetry,
Cloud-to-device commands and notifications,
Bulk uploads/downloads
Security Device security,
Cloud security,
Channel security, …
Monitoring Identify malfunctioning devices when they cannot
be reached directly
Reach and customization RTOS/Linux/Windows/non-IP capable,
Network/application protocols,
Authentication schemes
IoT Hub
Field
gateways
Cloud
protocol
gateways
Azure IoT Suite
…
IoT Hub Demo
Azure SQL DB
Azure Event Hubs
Azure Blob StorageAzure Blob Storage
Azure Event Hubs
Reference Data
Query runs continuously against incoming stream of events
Events
Have a defined schema and are
temporal (sequenced in time)
PowerBI
Getting Started
DML Statements
• SELECT
• FROM
• WHERE
• GROUP BY
• HAVING
• CASE
• JOIN
• UNION
Windowing Extensions
• Tumbling Window
• Hopping Window
• Sliding Window
• Duration
Aggregate Functions
• SUM
• COUNT
• AVG
• MIN
• MAX
Date and Time Functions
• DATENAME
• DATEPART
• DAY
• MONTH
• YEAR
• DATETIMEFROMPARTS
• DATEDIFF
• DATADD
String Functions
• LEN
• CONCAT
• CHARINDEX
• SUBSTRING
• PATINDEX
Scaling Functions
• WITH
• PARTITION BY
TollId EntryTimeLicense
PlateState Make Model Type Weight
1 2014-10-25T19:33:30.0000000Z JNB 7001 NY Honda CRV 1 3010
1 2014-10-25T19:33:31.0000000Z YXZ 1001 NY Toyota Camry 2 3020
3 2014-10-25T19:33:32.0000000Z ABC 1004 CT Ford Taurus 2 3800
2 2014-10-25T19:33:33.0000000Z XYZ 1003 CT Toyota Corolla 2 2900
1 2014-10-25T19:33:34.0000000Z BNJ 1007 NY Honda CRV 1 3400
2 2014-10-25T19:33:35.0000000Z CDE 1007 NJ Toyota 4x4 1 3800
… … … … … … … …
EntryStream - Data about vehicles entering toll stations
TollId ExitTime LicensePlate
1 2014-10-25T19:33:40.0000000Z JNB 7001
1 2014-10-25T19:33:41.0000000Z YXZ 1001
3 2014-10-25T19:33:42.0000000Z ABC 1004
2 2014-10-25T19:33:43.0000000Z XYZ 1003
… … …
ExitStream - Data about cars leaving toll stations
LicensePlate RegistartionId Expired
SVT 6023 285429838 1
XLZ 3463 362715656 0
QMZ 1273 876133137 1
RIV 8632 992711956 0
… … ….
ReferenceData - Commercial vehicle registration data
SELECT VehicleCategory =
Case Type
WHEN 1 THEN ‘Passenger’
WHEN 2 THEN ‘Commercial’
ELSE THEN ‘Other’
END,
TollId, State LicensePlate, State, Make, Model, Weight,
DATEPART(mi,EntryTime) AS ‘Mins’,
DATEPART(ss,EntryTime) AS ‘Seconds’
DATEPART(ms,EntryTime) AS ‘Milleseconds’
FROM EntryStream TIMESTAMP BY EntryTime
WHERE (State = ‘CA’ OR State = ‘WA’)
AND Weight < 3000
AND CHARINDEX (‘M’, model) = 0
AND PATINDEX(‘%999’, LicensePlate) = 5
SELECT Make
FROM EntryStream ES TIMESTAMP BY EntryTime
JOIN ExitStream EX TIMESTAMP BY ExitTime
ON ES.Make= EX.Make
AND DATEDIFF(second,ES,EX) BETWEEN 0 AND 10
Time
(Seconds)
{“Mazda”,6} {“BMW”,7} {“Honda”,2} {“Volvo”,3}Toll
Entry :
{“Mazda”,3} {“BMW”,7}{“Honda”,2} {“Volvo”,3}Toll
Exit :
0 5 10 15 20 25
SELECT ES.TollId, ES.EntryTime, EX.ExitTime, ES.EntryTime,
ES.LicensePlate DATEDIFF(minute, ES.EntryTime, EX.ExitTime )
FROM EntryStream ES TIMESTAMP BY EntryTime
JOIN ExitStream EX TIMESTAMP BY ExitTime
ON (EX.TollId= ES.TollId and ES.LicensePlate = EX.LicensePlate)
AND DATEDIFF(minute, ES, EX) BETWEEN 0 AND 15
SELECT ES.TollId, ES.EntryTime, ES.LicensePlate
FROM EntryStream EN TIMESTAMP BY EntryTime
LEFT OUTER JOIN ExitStream EX TIMESTAMP BY ExitTime
ON (EN.TollId= EX.TollId AND EN.LicensePlate =
EX.LicensePlate)
AND DATEDIFF(minute, EN, EX) BETWEEN 0 AND 5
WHERE EX.ExitTime IS NULL
SELECT ES.EntryTime, ES.LicensePlate, ES.TollId, RD.RegistrationId
FROM EntryStream ES TIMESTAMP BY EntryTime
JOIN RegistrationData RD
ON ES.LicensePlate = RD.LicensePlate
WHERE RD.Expired = 1
SELECT TollId, ENTime AS Time , LicensePlate FROM EntryStream TIMESTAMP BY ENTime
UNION
SELECT TollId, EXTime AS Time , LicensePlate FROM ExitStream TIMESTAMP BY EXTime
TollId EntryTime LicensePlate …
1 2014-09-10 12:01:00.000 JNB 7001 …
1 2014-09-10 12:02:00.000 YXZ 1001 …
3 2014-09-10 12:02:00.000 ABC 1004 …
TollId ExitTime LicensePlate
1 2009-06-25 12:03:00.000 JNB 7001
1 2009-06-2512:03:00.000 YXZ 1001
3 2009-06-25 12:04:00.000 ABC 1004
TollId Time LicensePlate
1 2014-09-10 12:01:00.000 JNB 7001
1 2014-09-10 12:02:00.000 YXZ 1001
3 2014-09-10 12:02:00.000 ABC 1004
1 2009-06-25 12:03:00.000 JNB 7001
1 2009-06-2512:03:00.000 YXZ 1001
3 2009-06-25 12:04:00.000 ABC 1004
1 5 4 26 8 6 4
t1 t2 t5 t6t3 t4
Time
• Event arrive at different times i.e. have unique timestamps
• Events arrive at different rates (events/sec).
• In any given period of time there may be 0, 1 or more events
Windowing is a core requirement for streaming
analytic applications
Common requirement to perform some set-
based operation (count, aggregation etc) over
events that arrive within a specified period of
time
Azure Stream Analytics supports three types of
windows: Hopping, Sliding and Tumbling
Every window operation outputs events at the
end of the window
The output of the window will be single event
based on the aggregate function used. The
event will have the time stamp of the window
All windows have a fixed length
All windows should be used in a GROUP BY
clause
Window 1 Window 2 Window 3
Aggregate
Function (Sum)
18 14Output Events
1 5 4 26 8 6 5
Time
(secs)
1 5 4 26
8 6
A 20-second Tumbling Window
3 6 1
5 3 6 1
Tumbling windows:
• Repeat
• Are non-overlapping
SELECT TollId, COUNT(*)
FROM EntryStream TIMESTAMP BY EntryTime
GROUP BY TollId, TumblingWindow(second, 20)
Query: Count the total number of vehicles entering each
toll booth every interval of 20 seconds.
An event can belong to only one tumbling window
1 5 4 26 8 6
A 20-second Hopping Window with a 10 second “Hop”
Hopping windows:
• Repeat
• Can overlap
• Hop forward in time by a fixed period
Same as tumbling window if hop size = window size
Events can belong to more than one hopping window
SELECT COUNT(*), TollId
FROM EntryStream TIMESTAMP BY EntryTime
GROUP BY TollId, HoppingWindow (second, 20,10)
4 26
8 6
5 3 6 1
1 5 4 26
8 6 5 3
6 15 3
QUERY: Count the number of vehicles entering each toll booth
every interval of 20 seconds; update results every 10 seconds
1 5
A 20-second Sliding Window
Sliding window:
• Continuously moves forward by an € (epsilon)
• Produces an output only during the occurrence of
an event
• Every windows will have at least one event
Events can belong to more than one sliding window
SELECT TollId, Count(*)
FROM EntryStream ES
GROUP BY TollId, SlidingWindow (second, 20)
HAVING Count(*) > 10
Query: Find all the toll booths which have served more
than 10 vehicles in the last 20 seconds
1
8
8
5 1
9
5 1 9
ASA Demo