Post on 16-Apr-2017
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka
By Ewen Cheslack-Postava
Data Integration: getting data to all the right places
Introducing Kafka Connect: large-scale streaming data import/export for Kafka

Delivery Guarantees
Offsets are automatically committed and restored
On restart: task checks offsets and rewinds
At-least-once delivery: flush data, then commit
Exactly-once for connectors that support it (e.g. HDFS)
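
The "flush data, then commit" ordering can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use the Kafka Connect API, and every name in it (`AppendSink`, `OffsetAwareSink`, `OffsetStore`, `run_task`) is hypothetical. It shows why a crash between the flush and the commit produces duplicates rather than data loss, and one possible way a destination could avoid the duplicates by keeping source offsets next to the data. It is not a description of how the HDFS connector is implemented.

```python
class OffsetStore:
    """Hypothetical stand-in for the offsets the framework commits."""
    def __init__(self):
        self.committed = 0  # next source offset to read


class AppendSink:
    """Hypothetical destination that blindly appends (offset, value) rows."""
    def __init__(self):
        self.rows = []

    def write(self, batch):
        self.rows.extend(batch)


class OffsetAwareSink(AppendSink):
    """Hypothetical destination that stores offsets with the data and
    skips any record at or below the last offset it already holds."""
    def write(self, batch):
        last = self.rows[-1][0] if self.rows else -1
        self.rows.extend((o, v) for o, v in batch if o > last)


def run_task(log, sink, offsets, batch_size, crash_before_commit_at=None):
    """Copy `log` into `sink`, flushing each batch before committing it.

    Returns False if a crash was simulated, True if the log was drained.
    """
    pos = offsets.committed  # on (re)start: rewind to the last commit
    while pos < len(log):
        end = min(pos + batch_size, len(log))
        batch = [(o, log[o]) for o in range(pos, end)]
        sink.write(batch)                  # 1. flush data
        if crash_before_commit_at == pos:
            return False                   # crash before the commit
        pos = end
        offsets.committed = pos            # 2. commit offsets
    return True


log = ["a", "b", "c", "d", "e"]

# At least once: the batch flushed before the crash is written again.
plain, plain_offsets = AppendSink(), OffsetStore()
crashed_run = run_task(log, plain, plain_offsets, 2, crash_before_commit_at=2)
resumed_run = run_task(log, plain, plain_offsets, 2)
at_least_once = [v for _, v in plain.rows]

# Offset-aware destination: the replayed batch is recognised and skipped.
aware, aware_offsets = OffsetAwareSink(), OffsetStore()
run_task(log, aware, aware_offsets, 2, crash_before_commit_at=2)
run_task(log, aware, aware_offsets, 2)
exactly_once = [v for _, v in aware.rows]
```

In this sketch the plain sink ends up with the replayed batch twice, while the offset-aware sink ends up with each record once. Reversing the two steps (commit, then flush) would instead risk losing the batch that was committed but never written.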

Converters
Abstract serialization: 1 connector, many serialization formats
Convert between Kafka Connect Data API (Connectors) and serialized bytes (Kafka)
JSON and Avro are currently well supported
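
The "1 connector, many serialization formats" idea can be sketched as follows. This is an illustration only, not the Kafka Connect API: the record layout, the two converter classes and their method names are hypothetical stand-ins, and Avro is left out because it would need a third-party library. The point is that the connector-side code (`produce`) never changes when the converter is swapped.

```python
import json

# Simplified, hypothetical stand-in for a Connect Data API record:
# a schema description plus a value.
record = {
    "schema": {
        "type": "struct",
        "fields": [
            {"field": "id", "type": "int64"},
            {"field": "name", "type": "string"},
        ],
    },
    "value": {"id": 42, "name": "alice"},
}


class JsonEnvelopeConverter:
    """Serializes schema and payload together in one JSON document."""

    def from_connect_data(self, rec):
        doc = {"schema": rec["schema"], "payload": rec["value"]}
        return json.dumps(doc, sort_keys=True).encode("utf-8")

    def to_connect_data(self, data):
        doc = json.loads(data.decode("utf-8"))
        return {"schema": doc["schema"], "value": doc["payload"]}


class JsonPayloadConverter:
    """Serializes only the payload; the schema is not carried in the bytes."""

    def from_connect_data(self, rec):
        return json.dumps(rec["value"], sort_keys=True).encode("utf-8")

    def to_connect_data(self, data):
        return {"schema": None, "value": json.loads(data.decode("utf-8"))}


def produce(rec, converter):
    """Connector-side code path: identical for every converter."""
    return converter.from_connect_data(rec)


envelope_bytes = produce(record, JsonEnvelopeConverter())
payload_bytes = produce(record, JsonPayloadConverter())

envelope_round_trip = JsonEnvelopeConverter().to_connect_data(envelope_bytes)
payload_round_trip = JsonPayloadConverter().to_connect_data(payload_bytes)
```

Choosing the format is then a deployment decision rather than a connector decision: the same connector output can be written to Kafka with or without an embedded schema by swapping the converter object.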

Connectors Today
Confluent Open Source: HDFS, JDBC
Connector Hub: connectors.confluent.io
Examples: MySQL, MongoDB, Twitter, Solr, S3, MQTT, Bloomberg, Apache Ignite, and more

Connectors from the Hackathon
Jenkins connector: Aravind Yarram (Equifax)
Twitter semantic analysis and visualization: Ashish Singh (Cloudera)
Brain monitoring device connector: Silicon Valley Data Science
DynamoDB, Cassandra, Slack, Splunk, and many more

Coming soon…
Improved connector control via REST API, standardized configs, metrics
Single record transformations
Data pipelines in an app: embedded mode and Kafka Streams integration
Many more connectors
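
As a rough idea of what a single record transformation could look like, the sketch below treats each transformation as a function from one record to one record and applies a chain of them in order. It is an assumption-laden illustration, not the planned Kafka Connect feature or its API: the record layout and the names `mask_field`, `route_to` and `apply_chain` are hypothetical.

```python
def mask_field(field):
    """Build a transformation that replaces one field's value with '****'."""
    def transform(record):
        value = dict(record["value"])  # copy so the input is not mutated
        value[field] = "****"
        return {**record, "value": value}
    return transform


def route_to(topic):
    """Build a transformation that changes the record's destination topic."""
    def transform(record):
        return {**record, "topic": topic}
    return transform


def apply_chain(record, transforms):
    """Apply each transformation in order; each sees exactly one record."""
    for transform in transforms:
        record = transform(record)
    return record


original = {"topic": "users", "value": {"id": 7, "ssn": "123-45-6789"}}
chain = [mask_field("ssn"), route_to("users-masked")]
transformed = apply_chain(original, chain)
```

Because each step works on one record at a time and keeps no state across records, steps like these could sit between a connector and a converter without needing a separate stream-processing job.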

Thank you
@ewencp @confluentinc
Try it out: http://confluent.io/download
More like this, but in blog form: http://confluent.io/blog