![Page 1: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/1.jpg)
![Page 2: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/2.jpg)
Tomasz Szymański, Adam Warski
SoftwareMill
Open source big data landscape and possible ITS applications
![Page 3: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/3.jpg)
Big Data? Fast Data?
• No clear definition
• Big Data
  – 100s+ of GB?
  – Time frame?
• Fast Data
  – Real-time
  – Single-node vs multi-node
![Page 4: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/4.jpg)
Why Open Source?
• Large developer base
  – Easy to learn
• Projects usually backed by a commercial entity
  – Support
• Cost efficiency
  – Leverage latest developments
• Future-proofing
  – Tools with a large user base will be around for longer
![Page 5: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/5.jpg)
Apache Spark / Cassandra / Kafka
• Data ingestion: Kafka
• Data processing: Spark
• Data storage: Cassandra
![Page 6: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/6.jpg)
Apache Spark / Cassandra / Kafka
• Spark: largest cluster 8k nodes; users include eBay, Baidu, NASA, Amazon
• Cassandra: over 75k nodes storing 10 PB of data at Apple
• Kafka: over 1.1 trillion messages per day at LinkedIn
![Page 7: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/7.jpg)
Possible ITS applications
![Page 8: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/8.jpg)
Hotspot detection
Computed using New York open taxi data, Akka & Apache Flink
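
The slide does not say how the hotspots were computed, so the following is only a minimal sketch of one common approach: bucket pickup coordinates into a fixed grid and report cells whose pickup count reaches a threshold. The function names, the cell size and the threshold are all assumptions made for illustration, and plain Python stands in for the Akka and Flink jobs used in the talk.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

Cell = Tuple[int, int]


def to_cell(lat: float, lon: float, cell_size_deg: float = 0.005) -> Cell:
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))


def find_hotspots(
    pickups: Iterable[Tuple[float, float]],
    cell_size_deg: float = 0.005,  # assumed cell size, not from the slides
    min_count: int = 3,            # assumed threshold, not from the slides
) -> List[Tuple[Cell, int]]:
    """Return (cell, count) pairs for cells with at least min_count pickups.

    Results are sorted by descending count, then by cell index.
    """
    counts = Counter(to_cell(lat, lon, cell_size_deg) for lat, lon in pickups)
    hot = [(cell, n) for cell, n in counts.items() if n >= min_count]
    return sorted(hot, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical (latitude, longitude) pickup points, not real taxi data.
    sample = [
        (40.7580, -73.9855),
        (40.7581, -73.9857),
        (40.7585, -73.9853),
        (40.6413, -73.7781),
    ]
    for cell, count in find_hotspots(sample):
        print(cell, count)
```

In a streaming setting the same counting would typically run per time window, so that a cell is reported as a hotspot only while it stays busy.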
![Page 9: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/9.jpg)
Architecture of a traffic-jam detection system
Leveraging Apache Kafka, Hadoop, Spark, Cassandra & Akka
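
The architecture diagram is not reproduced here and the slide does not state the detection rule, so this is a hedged sketch of one plausible processing step: keep a sliding time window of speed readings per road segment and flag a jam when the average speed in the window drops below a threshold. `SpeedReading`, `JamDetector` and every parameter value are hypothetical names and numbers chosen for illustration. In the architecture named on the slide, readings would presumably arrive through Kafka and results would be stored in Cassandra; here everything is kept in memory.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Tuple


@dataclass(frozen=True)
class SpeedReading:
    """One vehicle speed observation on a road segment (hypothetical schema)."""
    segment_id: str
    timestamp_s: float
    speed_kmh: float


class JamDetector:
    """Flags a segment as jammed when its windowed mean speed is too low.

    Assumes readings for each segment arrive in timestamp order.
    """

    def __init__(
        self,
        window_s: float = 300.0,      # assumed window length
        jam_speed_kmh: float = 20.0,  # assumed jam threshold
        min_readings: int = 3,        # avoid deciding on too little data
    ) -> None:
        self.window_s = window_s
        self.jam_speed_kmh = jam_speed_kmh
        self.min_readings = min_readings
        self._windows: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)

    def observe(self, reading: SpeedReading) -> bool:
        """Record a reading and return True if its segment now looks jammed."""
        window = self._windows[reading.segment_id]
        window.append((reading.timestamp_s, reading.speed_kmh))

        # Evict readings that fell out of the time window.
        cutoff = reading.timestamp_s - self.window_s
        while window and window[0][0] < cutoff:
            window.popleft()

        if len(window) < self.min_readings:
            return False
        mean_speed = sum(speed for _, speed in window) / len(window)
        return mean_speed < self.jam_speed_kmh


if __name__ == "__main__":
    detector = JamDetector(window_s=60.0, jam_speed_kmh=20.0, min_readings=3)
    readings = [
        SpeedReading("A", 0.0, 50.0),
        SpeedReading("A", 10.0, 15.0),
        SpeedReading("A", 20.0, 10.0),
        SpeedReading("A", 70.0, 5.0),
    ]
    for r in readings:
        print(r.timestamp_s, detector.observe(r))
```

Keeping the window state per segment makes the rule easy to partition by `segment_id`, which is the kind of keyed processing that distributed stream processors are designed for.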
![Page 10: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/10.jpg)
Summing up and the future
• Open source has a lot to offer
• Open data?
• Fast-evolving field
  – Rapid development, rapid data insights
  – Leverage in ITS!
technical expertise
’s ITS domain experts
![Page 11: Open source big data landscape and possible ITS applications](https://reader036.vdocuments.us/reader036/viewer/2022062412/587a60541a28ab520b8b76a9/html5/thumbnails/11.jpg)
• Founded in 2009
• Bespoke software development services
• Various domains, including logistics & transport
• Big data a common theme in our projects