Why reinvent the wheel at Criteo?
TRANSCRIPT
Cedrick MONTOUT
2016 June 8th
A story about C# @ Criteo
Why do we reinvent the wheel?
2 | Copyright © 2016 Criteo
Who am I?
• Criteo: global leader in retargeting
• Scalability: one of many groups in the Criteo R&D department
• WebScale: we write code to help scale up the real-time Criteo platform
• https://www.linkedin.com/in/kerdrek
• https://github.com/kerdrek
• @kerdrek
Client Side Load Balancing
The situation then … 1/5
Diagram: App A Pool and App B Pool → HAProxy → Service Pool
The situation then … 2/5
Random DC Service Pool
Traffic: 185k QPS
Input size: ~5K bytes
Output size: ~3.5K bytes
Network traffic: ~12 Gbit/s
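The ~12 Gbit/s figure follows from the other three numbers on the slide. A minimal check, sketched in Python for brevity rather than the C# used at Criteo, and assuming "K" means 1000 bytes and that both request and response cross the load balancer:

```python
# Figures taken from the slide; "K" is assumed to mean 1000 bytes.
QPS = 185_000
INPUT_BYTES = 5_000
OUTPUT_BYTES = 3_500


def traffic_gbits_per_second(qps, input_bytes, output_bytes):
    """Bits crossing the load balancer per second, expressed in Gbit/s."""
    bytes_per_request = input_bytes + output_bytes
    return qps * bytes_per_request * 8 / 1e9


total = traffic_gbits_per_second(QPS, INPUT_BYTES, OUTPUT_BYTES)
print(f"{total:.2f} Gbit/s")  # prints "12.58 Gbit/s"
```

That is already more than a single 10 Gbit/s link can carry.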
The situation then … 3/5
Compression: 30% gain
The situation then … 4/5
Bonding 2 ports: combine two physical data links into one logical link by connecting 2 ports of the switch to 2 network interfaces of the HAProxy
The situation then … 5/5
4 pairs
The new wheel: Client Side Load Balancing
• Bypass HAProxy
• Implemented inside Twitter/Finagle
• Reuse existing health check
• Re-implement monitoring
Diagram: App Pool A and App Pool B → Service Pool
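To make the idea concrete, here is a minimal sketch of client-side load balancing, written in Python for brevity. It is not the Finagle-based implementation mentioned on the slide: the class name, the round-robin policy, the endpoint names and the `is_healthy` callback are all assumptions chosen for illustration. Each client filters the service pool with a health check and picks a backend itself, so no proxy sits in the data path.

```python
import itertools


class ClientSideBalancer:
    """Hypothetical balancer: every client chooses its own backend."""

    def __init__(self, endpoints, is_healthy):
        self._endpoints = list(endpoints)
        self._is_healthy = is_healthy        # stands in for the existing health check
        self._counter = itertools.count()
        # Minimal stand-in for the monitoring a proxy used to provide.
        self.picks = {endpoint: 0 for endpoint in self._endpoints}

    def pick(self):
        """Return the next healthy endpoint in round-robin order."""
        healthy = [e for e in self._endpoints if self._is_healthy(e)]
        if not healthy:
            raise RuntimeError("no healthy endpoint available")
        endpoint = healthy[next(self._counter) % len(healthy)]
        self.picks[endpoint] += 1
        return endpoint


down = {"svc-2:8080"}                        # pretend this node fails its check
balancer = ClientSideBalancer(
    ["svc-1:8080", "svc-2:8080", "svc-3:8080"],
    is_healthy=lambda endpoint: endpoint not in down,
)
chosen = [balancer.pick() for _ in range(4)]
print(chosen)
```

The `picks` counter hints at why monitoring has to be re-implemented: once the proxy is gone, per-backend statistics only exist if the clients record them.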
DevHost
The situation then …
• Harder and harder to release
• No available memory for new features
• Fragmented features in production
• Tightly coupled with the HTTP stack
The new wheel: Component
• One Input
• One Output
• One Process
The new wheel: A Host with Services
• A collection of services
The new wheel: DevHost
• The DevHost
• Several Components
• Several Services
• Asynchronous Process
• Transport agnostic
• Not front facing
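The slides do not show DevHost's API, so the following is only a sketch of the shape they describe, written in Python for brevity. `Component`, `Host`, `register` and `dispatch` are hypothetical names: a component has one input, one output and one asynchronous process function, and the host is a collection of such components that knows nothing about the transport calling it.

```python
import asyncio


class Component:
    """Hypothetical component: one input, one output, one process function."""

    def __init__(self, name, process):
        self.name = name
        self._process = process              # async callable: input -> output

    async def handle(self, message):
        return await self._process(message)


class Host:
    """Hypothetical host: a named collection of components, no transport inside."""

    def __init__(self):
        self._components = {}

    def register(self, component):
        self._components[component.name] = component

    async def dispatch(self, name, message):
        # Any transport (HTTP, a queue, a test harness) can call dispatch.
        return await self._components[name].handle(message)


async def upper(message):
    return message.upper()


host = Host()
host.register(Component("upper", upper))
result = asyncio.run(host.dispatch("upper", "ping"))
print(result)  # prints "PING"
```

Keeping the transport outside the host is what lets the same component run behind a different front end or inside a unit test.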
The new wheel: DevHost
• Still used in production
• ~45% of the Windows production machines
• Next iteration will use .NET Core (WIP)
Chart:
• Web (IIS): 3000 (20%)
• DevHost: 2500 (17%)
• Other: 500 (3%)
• CentOS: 9000 (60%)
Monitoring @ Task level
The situation then …
• Synchronous processing everywhere
• Timeouts on several pipelines (losing money)
• No clear diagnostic on execution path
The new wheel: Asynchronous Token Framework
• TPL was not a solution at that time
• Asynchronous Completion Token
• Delegate based
• Execution is time boxed
• Task underneath
• Timing for everything
• Metrics available on the machine
• Metrics available aggregated on Graphite
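The framework itself is not shown in the slides. As an illustration of "time boxed" execution with "timing for everything", here is a small Python sketch built on `asyncio.wait_for`. `TimedToken`, `run_time_boxed` and the in-memory `metrics` list are hypothetical stand-ins, not the Criteo API: a delegate runs under a time budget, and its duration and outcome are recorded whether it finishes or not.

```python
import asyncio
import time


class TimedToken:
    """Hypothetical completion token: carries the result, outcome and timing."""

    def __init__(self, name):
        self.name = name
        self.result = None
        self.timed_out = False
        self.elapsed_ms = None


async def run_time_boxed(name, delegate, budget_s, metrics):
    """Run an async delegate under a time budget and record how long it took."""
    token = TimedToken(name)
    start = time.perf_counter()
    try:
        token.result = await asyncio.wait_for(delegate(), timeout=budget_s)
    except asyncio.TimeoutError:
        token.timed_out = True               # the delegate is cancelled
    token.elapsed_ms = (time.perf_counter() - start) * 1000
    metrics.append((name, token.elapsed_ms, token.timed_out))
    return token


async def fast():
    return "bid"


async def slow():
    await asyncio.sleep(1)
    return "too late"


async def main():
    metrics = []                             # stand-in for a metrics pipeline
    ok = await run_time_boxed("fast", fast, 0.5, metrics)
    late = await run_time_boxed("slow", slow, 0.05, metrics)
    return ok, late, metrics


ok, late, metrics = asyncio.run(main())
print(ok.result, late.timed_out, len(metrics))
```

Recording a metric on the timeout path as well as the success path is what turns "no clear diagnostic on execution path" into something that can be graphed.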
Apache Kafka Driver in C#
The situation then …
Syslog:
• Text based
• Fire and forget
• Single messages
• No built-in resiliency
• No API for consuming
The situation then …
Syslog vs. Apache Kafka:
• Text based vs. binary
• Fire and forget vs. acknowledged messages
• Single messages vs. batched messages
• No built-in resiliency vs. partitioning and replication
• No API for consuming vs. consuming support
Apache Kafka, where is your C# driver?
We looked at several drivers
Goldilocks Conundrum all over again
First driver was never used in production
Second driver was impossible to unit test
Third driver was not recently maintained
The new wheel: kafka-sharp
• Yet another C# driver
• Highly tunable
• Written with perf and scale in mind
• Battle tested in production
• Available here: https://github.com/criteo/kafka-sharp
The list of wheels
• Distributed load balancer between clients.
• Lightweight hosting server.
• Low level asynchronous execution framework.
• Yet another C# driver for Apache Kafka.