Go && Python Event Driven ML · PyCon Odessa · 2020-06-01
TRANSCRIPT
Go && Python Event driven ML
Lalafo case
How it all started
Part 1. The Phantom Menace
StackOverflow: ready to the rescue
200 RPM
$
Feedback
$
Feedback
Processes Monitoring
RAM - expiration?
Relational DB?
Any issues there?
1. Request monitoring (?)
2. Hard to reason about GPU usage (varies 3 to 7 GB)
3. ~5 GB RAM out of the box (scale?)
4. Redis is in-memory storage (temporary)
5. Threading != Parallel
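Point 5 ("Threading != Parallel") is CPython's GIL in action: threads interleave, but only one runs Python bytecode at a time, so CPU-bound work gains nothing from a thread pool. A minimal sketch of the effect (the workload function is illustrative, not the talk's code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n):
    # Pure-Python busy loop; holds the GIL for its whole run.
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 200_000

start = time.perf_counter()
sequential = [cpu_bound(N) for _ in range(4)]
seq_time = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(cpu_bound, [N] * 4))
thr_time = time.perf_counter() - start

# Results match, but wall-clock time barely improves:
# the GIL serializes CPU-bound bytecode either way.
assert sequential == threaded
print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

The same four tasks under a `ProcessPoolExecutor` would actually run in parallel, at the cost of pickling the work across processes.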
http://www.nooooooooooooooo.com/
What if we split API and ML
Part 2. A new hope
Requirements
Lalafo case
1. Single image/request processing time: < 2 seconds
2. Visibility
3. Persistence
4. Scalability
5. Make it extendable for new features
   a. Price prediction
   b. Similarity search
   c. Segmentation
6. SDK friendly (well documented, tested, etc.)
$
Feedback
Processes Monitoring
RAM - expiration?
Relational DB?
After 2 months
API Listeners
Feedback
$
Requirements
Lalafo case
1. Single image/request processing time: < 2 seconds
2. Visibility - decouple request and prediction
3. Persistence
4. Scalability
5. Make it extendable for new features
   a. Price prediction
   b. Similarity search
   c. Segmentation
6. SDK friendly (well documented, tested, etc.)
2 seconds per request
Part 3. The Empire Strikes Back
4 seconds per request
API Listeners
1 second → 0.01 sec
segmentio/kafka-go -> shopify/sarama
API Listeners
0.02 sec / 0.01 sec
3 seconds per request
API Listeners
0.02 sec
0.3 second
0.3 second
Synchronous request
How large is each image?
2.4 seconds per request
API Listeners
0.02 sec
Synchronous request
2.4 seconds per request
API Listeners
0.02 sec
Synchronous request
2 seconds per request
API Listeners
0.02 sec
Synchronous request
Add ThreadPool
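Adding a thread pool helps here because image fetching is IO-bound: threads release the GIL while waiting on the network, so downloads overlap. A hedged sketch with a simulated download (names and URLs are illustrative, not the talk's actual code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_image(url):
    # Stand-in for a real HTTP download; time.sleep releases
    # the GIL, so other threads run while this one waits.
    time.sleep(0.1)
    return f"bytes-of-{url}"

urls = [f"https://example.com/img/{i}.jpg" for i in range(8)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    images = list(pool.map(fetch_image, urls))
elapsed = time.perf_counter() - start

# Eight 0.1s downloads overlap: ~0.1s total instead of ~0.8s.
print(f"{len(images)} images in {elapsed:.2f}s")
```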
2 seconds per request
Success? Not yet
3 seconds per request
API Listeners
0.02 sec
3 seconds per request
API Listeners
0.02 sec
1 second and growing
3 seconds per request
2 seconds per request
¯\_(ツ)_/¯
2 seconds per request
API Listeners
0.02 sec
200 RPM -> 1200 RPM
2 seconds per request
API Listeners
0.02 sec
Scale
Master -> Slave -> balancer
Topics -> Partitions
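Moving from topics to partitions scales because Kafka maps each keyed message to a fixed partition, so consumers in a group split the load while per-key ordering survives. A toy sketch of that mapping (real clients hash with murmur2; this uses `hashlib` for a deterministic stand-in):

```python
import hashlib

NUM_PARTITIONS = 4

def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    # Deterministic key -> partition mapping: every message for
    # one ad lands on the same partition, keeping its order.
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

messages = [(f"ad-{i}".encode(), f"image-{i}") for i in range(10)]
assignment = {}
for key, value in messages:
    assignment.setdefault(partition_for(key), []).append(value)

# One consumer per partition can now process these lists in parallel.
for p, values in sorted(assignment.items()):
    print(p, values)
```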
< 1 second per request
Success? Not yet
Out of 8 GB
How to PyTorch in production
7.6 GB -> 600 MB
1 second per request
Success? Not yet
Back to 2, 5, 10 seconds per request
Troubleshooting
API Listeners
0.02 sec
? ?
Troubleshooting
API Listeners
0.02 sec
?
Troubleshooting
API Listeners
0.02 sec
175k records per request
Troubleshooting
API Listeners
0.02 sec
?
Troubleshooting
API
SELECT * FROM ads LIMIT 1
25 seconds -> 0.02 seconds
0.02 seconds -> 0.002 second
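The class of fix implied by the slides above (returning one row instead of the whole 175k-record table) can be reproduced with stdlib `sqlite3` standing in for the production database; table and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ads (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO ads (title) VALUES (?)",
    [(f"ad {i}",) for i in range(175_000)],
)

# Before: pull every row over the wire, then use one of them.
all_rows = conn.execute("SELECT * FROM ads").fetchall()

# After: let the database return only what is needed.
one_row = conn.execute("SELECT * FROM ads LIMIT 1").fetchone()

print(len(all_rows), one_row)
```

The remaining 0.02 s → 0.002 s step on the slide is the kind of gain an index or prepared statement typically buys once the result set is already small.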
Success? Not yet
Faust 1.4.6: No latest offset
https://github.com/robinhood/faust
3K images per minute 0_o
Success? Not yet
Issues to solve
1. Occasional spikes in performance (GC, network latency)
2. Message broker (Kafka rebalancing, offsets, etc.)
3. How to handle DB migrations
4. Something we are not aware of yet
Lessons learnt
1. CPU-bound tasks != IO-bound (¯\_(ツ)_/¯)
2. High coupling - low cohesion
3. You need to know how to cook MongoDB
4. Go is not as obvious and library-rich as Python
5. Simple != Easy
6. Concurrency != Parallelism (obviously)
< 1 second per request
+ Live statistics from PostgreSQL
API Listeners
Thank you everyone
Success?