Routing billions of events a day: How we do routing in Schibsted
Carlos Manuel Duclos-Vergara, Staff Engineer
About me
Agenda
• Schibsted
• A short story
• GDPR
• Pulse (our tracking solution)
  • Overview
  • Internals
Schibsted
Event generation
Event routing
Event dispatching
Event consumption
GDPR and data collection
Legal basis for data collection
1. Consent
2. Processing obligation
3. Legal obligation
4. Vital interest
5. Public interest
6. Legitimate interest
User rights
1. Data portability
2. Right to be forgotten
End to end event processing solution
Pulse ecosystem
Lifetime of an event
Side track: How much is 1 billion events
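The body of this slide did not survive extraction. As a rough back-of-the-envelope sketch (not the speaker's own figures), one billion events a day can be converted to an average per-second rate, assuming the load is spread perfectly evenly over 24 hours. Real traffic peaks well above the average, so this is only a lower bound on what the pipeline has to sustain.

```python
# Back-of-the-envelope arithmetic: average event rate for a daily volume.
# Assumption: events arrive uniformly over the day (real traffic does not).

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(events_per_day: int) -> float:
    """Return the average number of events per second."""
    return events_per_day / SECONDS_PER_DAY


rate = average_rate(1_000_000_000)
print(f"{rate:,.0f} events/second on average")
```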
Common pipeline
Batch pipeline
Streaming pipeline
Processing and routing internals
Routing lib
Processing: routing language

SinkName:
  eventType: event schema
  filter: inline || stored || null
  transform: stored || null
  SinkType:
    SinkDetails:

ProbeEvent-1:
  eventType: ProbeEvent
  kafka:
    topic: probe-topic
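A minimal sketch of how a rule in this routing language might be evaluated. This is an illustration under assumptions, not the Schibsted routing lib: the `route` function, the `SINK_TYPES` tuple, and the use of a `type` field to identify the event type are all hypothetical, a plain dict stands in for the parsed configuration, and Python callables stand in for the filter and transform expressions.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Tuple

Event = Dict[str, Any]

# Parsed form of the ProbeEvent-1 rule. A dict stands in for the config file.
ROUTES: Dict[str, Dict[str, Any]] = {
    "ProbeEvent-1": {
        "eventType": "ProbeEvent",
        "filter": None,       # null filter: accept every event of this type
        "transform": None,    # null transform: forward the event unchanged
        "kafka": {"topic": "probe-topic"},  # SinkType -> SinkDetails
    },
}

# Hypothetical list of sink types the router knows how to dispatch to.
SINK_TYPES: Tuple[str, ...] = ("kafka",)


def route(
    event: Event, routes: Dict[str, Dict[str, Any]] = ROUTES
) -> Iterator[Tuple[str, str, Dict[str, Any], Event]]:
    """Yield (sink name, sink type, sink details, outgoing event) per match."""
    for name, rule in routes.items():
        # Assumption: the event carries its type in a "type" field.
        if event.get("type") != rule["eventType"]:
            continue
        keep: Optional[Callable[[Event], bool]] = rule.get("filter")
        if keep is not None and not keep(event):
            continue
        transform: Optional[Callable[[Event], Event]] = rule.get("transform")
        outgoing = transform(event) if transform is not None else event
        for sink_type in SINK_TYPES:
            if sink_type in rule:
                yield name, sink_type, rule[sink_type], outgoing


for match in route({"type": "ProbeEvent", "sequenceNumber": 1}):
    print(match)
```

Keeping the rule declarative means a new sink can be added by editing configuration rather than code; the router only needs to know how to dispatch each sink type.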
Event formats: probe event

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "allOf": [
    {
      "$ref": "base-routable-event.json#"
    }
  ],
  "description": "Events sent by Data Platform Probe to measure latencies and missing events in the pipeline",
  "id": "http://schema.schibsted.com/events/backend-probe-event.json#",
  "properties": {
    "senderId": {
      "description": "Sender ID, in case several instances of Probe are running",
      "type": "integer"
    },
    "sequenceNumber": {
      "description": "Probe sequence number",
      "type": "integer"
    },
    "timeSent": {
      "$ref": "../common-definitions.json#/definitions/timestamp",
      "description": "UTC timestamp of when the event is generated by Probe"
    }
  },
  "title": "BackendProbeEvent",
  "type": "object"
}
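The schema description says probe events measure latencies and missing events. A minimal sketch of how a consumer could derive both from the three fields, assuming `timeSent` is an integer in epoch milliseconds (the real format is defined in `common-definitions.json`, which is not shown) and that each sender numbers its events consecutively. The `probe_report` function is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Set, Tuple

Event = Dict[str, Any]


def probe_report(
    received: Iterable[Tuple[Event, int]],
) -> Tuple[List[int], Dict[int, List[int]]]:
    """Summarise probe events given as (event, received_at_ms) pairs.

    Returns the per-event latencies in milliseconds and, per senderId,
    the sequence numbers missing between the lowest and highest seen.
    """
    latencies: List[int] = []
    seen: Dict[int, Set[int]] = defaultdict(set)
    for event, received_at_ms in received:
        # Assumption: timeSent is epoch milliseconds.
        latencies.append(received_at_ms - event["timeSent"])
        seen[event["senderId"]].add(event["sequenceNumber"])
    missing = {
        sender: sorted(set(range(min(seqs), max(seqs) + 1)) - seqs)
        for sender, seqs in seen.items()
    }
    return latencies, missing


events = [
    ({"senderId": 7, "sequenceNumber": 1, "timeSent": 1000}, 1050),
    ({"senderId": 7, "sequenceNumber": 2, "timeSent": 2000}, 2070),
    ({"senderId": 7, "sequenceNumber": 4, "timeSent": 4000}, 4040),
]
latencies, missing = probe_report(events)
print(latencies, missing)
```

Tracking gaps per `senderId` matters because several Probe instances may run at once, each with its own sequence.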
JSLT: The magic sauce of processing
JSON query and transformation language
Github repo: https://github.com/schibsted/jslt
License: Apache 2.0
{
  "time": round(parse-time(.published, "yyyy-MM-dd'T'HH:mm:ssX") * 1000),
  "device_manufacturer": .device.manufacturer,
  "device_model": .device.model,
  "language": .device.acceptLanguage,
  "os_name": .device.osType,
  "os_version": .device.osVersion,
  "platform": .device.platformType,
  "user_properties": {
    "is_logged_in": boolean(.actor."spt:userId")
  }
}
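To make the effect of the template above concrete, here is a rough Python equivalent. It is not JSLT and does not use the JSLT library; the `transform` function and the sample input are hypothetical. It assumes `published` carries a UTC offset such as `Z`, and it mirrors what I understand to be JSLT's behaviour of omitting keys whose value is null.

```python
from datetime import datetime
from typing import Any, Dict

Event = Dict[str, Any]


def transform(event: Event) -> Event:
    """Flatten an event the way the JSLT template does (approximation)."""
    device = event.get("device") or {}
    actor = event.get("actor") or {}
    # parse-time yields seconds since the epoch; the template scales to ms.
    published = datetime.strptime(event["published"], "%Y-%m-%dT%H:%M:%S%z")
    out = {
        "time": round(published.timestamp() * 1000),
        "device_manufacturer": device.get("manufacturer"),
        "device_model": device.get("model"),
        "language": device.get("acceptLanguage"),
        "os_name": device.get("osType"),
        "os_version": device.get("osVersion"),
        "platform": device.get("platformType"),
        # Logged in if a user id is present and non-empty.
        "user_properties": {"is_logged_in": bool(actor.get("spt:userId"))},
    }
    # Drop keys with no value, mirroring JSLT's handling of null values.
    return {key: value for key, value in out.items() if value is not None}


sample = {
    "published": "2018-05-25T10:00:00Z",
    "device": {"manufacturer": "Acme", "osType": "Android"},
    "actor": {"spt:userId": "user-123"},
}
print(transform(sample))
```

The point of the JSLT version is that this mapping lives in configuration as a stored transform, so it can be changed without redeploying the router.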
Routing: batch
Routing: streaming
Lessons learned (so far…)
• Schemas and versions
• Backfilling and recovery
• Logging and metrics
• Auditing
And finally
Extra
About Schibsted
Marketplaces
News Media
Some of our Next companies