![Page 1: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/1.jpg)
Stream Upload And Asynchronous Job Processing System
Lê Bá Minh, Manager, Zalo Team, VNG
![Page 2: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/2.jpg)
Agenda
• 1/ Why do we need an asynchronous job processing system?
• 2/ How does it work?
• 3/ Applications
• 4/ Q&A
![Page 3: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/3.jpg)
Parallel Stream Upload
• Data is split into chunks
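As a rough illustration of splitting data into chunks and uploading them in parallel, here is a minimal sketch. The chunk size, the function names, and the in-memory `store` standing in for the network are all hypothetical; the slides do not describe the real Zalo implementation.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # bytes; hypothetical and tiny so the example is easy to follow


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (index, chunk) pairs so the receiver can restore the order."""
    return [
        (i // chunk_size, data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def upload_chunk(store: dict, index: int, chunk: bytes) -> int:
    """Stand-in for a network upload: record the chunk under its index."""
    store[index] = chunk
    return index


def parallel_upload(data: bytes, max_workers: int = 4) -> bytes:
    """Upload all chunks concurrently, then reassemble them by index."""
    store: dict = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(upload_chunk, store, index, chunk)
            for index, chunk in split_into_chunks(data)
        ]
        for future in futures:
            future.result()  # re-raise any upload error
    return b"".join(store[index] for index in sorted(store))
```

Carrying an index with every chunk lets chunks arrive in any order, which is what makes a parallel upload safe to reassemble.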
![Page 4: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/4.jpg)
Facts
• Zalo Stream Upload
  • Continuous voice upload in the background
  • Image upload in the background
  • …
• Facts (now)
  • 1M voice messages/day
  • 800K images/day
  • Peak: 500 chunks/second
• Expected
  • Scalable (more than 5000 chunks/second)
  • High performance
![Page 5: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/5.jpg)
What we need
• Asynchronous job processing system
[Figure: two flows side by side. First flow: Collect Data, Processing Data, Response. Second flow: Collect Data, Processing Data, Response, with Workers.]
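One way to read the asynchronous variant is that the request handler only collects the data and responds, while separate workers do the processing. The sketch below illustrates that idea with an in-process queue and threads; the names `run_demo` and `handle_request` are hypothetical, and a real system would place the queue in a separate job server rather than in the same process.

```python
import queue
import threading


def run_demo(items, n_workers: int = 2):
    """Accept every item immediately and let worker threads process them."""
    jobs: queue.Queue = queue.Queue()
    processed = []
    lock = threading.Lock()

    def handle_request(data: str) -> str:
        # Collect the data, enqueue a job, and respond without waiting.
        jobs.put(data)
        return "accepted"

    def worker() -> None:
        # Process jobs until a None sentinel arrives.
        while True:
            data = jobs.get()
            if data is None:
                jobs.task_done()
                break
            with lock:
                processed.append(data.upper())  # stand-in for real processing
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()

    responses = [handle_request(item) for item in items]

    jobs.join()  # wait until every submitted job has been processed
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return responses, sorted(processed)
```

The caller receives "accepted" as soon as the job is queued, so response time no longer depends on how long processing takes.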
![Page 6: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/6.jpg)
What we need
• Asynchronous job processing system
  • Batch jobs
  • Big data jobs
  • Highly reliable: no job is missed
  • Distributed job processing workers
  • High performance
  • Persistent
  • Load balancing, fail-over, recoverable
![Page 7: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/7.jpg)
Open-source solutions
• Shared-memory workers
  • All workers run on one physical server
  • No fail-over
  • Not scalable
• Gearman
  • Good, but does not completely fit our requirements
  • No batch job support
  • Not fully reliable (jobs can be lost)
  • Load balancing is incomplete
  • Unstable above 2000 jobs/second
![Page 8: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/8.jpg)
Zalo Async Job Processing System
[Architecture diagram. Components: Client (two shown), Job Server, Worker 1, Worker 2, Worker 3, Z Database. Job Server modules: Worker Manager, Job Caching, Job Manager, Persistent Manager, Job Clean-Up. Links: TCP, labeled Short Connection and Long Connection.]
![Page 9: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/9.jpg)
Implementation
• C/C++ for the Job Server
• C/C++ and Java for clients and workers
• Binary protocol
• Z-Database
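The slides do not specify the wire format, so the following is only a sketch of what a length-prefixed binary frame over TCP might look like. The header layout (payload length, message type, job id), the type value 1, and the function names are assumptions made for illustration, not the actual Zalo protocol.

```python
import struct

# Hypothetical header: 4-byte payload length, 1-byte message type,
# 8-byte job id, all in network byte order with no padding (13 bytes).
HEADER = struct.Struct("!IBQ")


def encode_frame(msg_type: int, job_id: int, payload: bytes) -> bytes:
    """Prefix the payload with a fixed-size binary header."""
    return HEADER.pack(len(payload), msg_type, job_id) + payload


def decode_frames(buffer: bytes):
    """Decode every complete frame; return (frames, leftover bytes)."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        length, msg_type, job_id = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # the payload has not fully arrived yet
        frames.append((msg_type, job_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

Because TCP is a byte stream, the decoder keeps any incomplete trailing frame as leftover bytes to be joined with the next read.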
![Page 10: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/10.jpg)
Job State
[State diagram. States: Started, Queuing, Processing, Failed, Time Out, Finished. Transitions: Deliver to Worker, Worker ACK Failed, Worker ACK Finished, No ACK.]
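The job states can be modeled as a small state machine. The sketch below assumes the most natural reading of the diagram labels: Deliver to Worker moves a job from Queuing to Processing, and the worker's acknowledgement (or its absence) decides the outcome. The `requeue` event is a further assumption, added because the system is meant never to lose a job; it is not a label from the slide.

```python
from enum import Enum


class State(Enum):
    QUEUING = "Queuing"
    PROCESSING = "Processing"
    FAILED = "Failed"
    TIME_OUT = "Time Out"
    FINISHED = "Finished"


# (current state, event) -> next state
TRANSITIONS = {
    (State.QUEUING, "deliver_to_worker"): State.PROCESSING,
    (State.PROCESSING, "worker_ack_finished"): State.FINISHED,
    (State.PROCESSING, "worker_ack_failed"): State.FAILED,
    (State.PROCESSING, "no_ack"): State.TIME_OUT,
    # Assumption: failed or timed-out jobs go back to the queue.
    (State.FAILED, "requeue"): State.QUEUING,
    (State.TIME_OUT, "requeue"): State.QUEUING,
}


def next_state(state: State, event: str) -> State:
    """Apply one event, rejecting transitions the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.value} on {event}") from None


def run(events, state: State = State.QUEUING) -> State:
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = next_state(state, event)
    return state
```

Keeping the transitions in one table makes illegal moves, such as finishing a job that was never delivered, fail loudly instead of silently corrupting job state.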
![Page 11: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/11.jpg)
Job Type
• Single job
  • A simple task
  • Delivered immediately
• Batch job
  • Multiple tasks
  • Delivered once all tasks have been received
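The difference between the two job types can be sketched as follows. The `JobServer` class, its method names, and the idea that each batch task carries the batch's total task count are hypothetical choices for illustration; the slides only state that a batch is delivered once all of its tasks have arrived.

```python
class JobServer:
    """Toy job server that delivers single jobs at once and batches when complete."""

    def __init__(self):
        self.delivered = []  # jobs handed to workers, in delivery order
        self._pending = {}   # batch_id -> (expected task count, tasks so far)

    def submit_single(self, task):
        # A single job is delivered immediately.
        self.delivered.append([task])

    def submit_batch_task(self, batch_id, total, task):
        # A batch job is held back until all of its tasks have been received.
        expected, tasks = self._pending.setdefault(batch_id, (total, []))
        tasks.append(task)
        if len(tasks) == expected:
            self.delivered.append(self._pending.pop(batch_id)[1])
```

For example, the chunks of one upload could form a batch, so that a worker only sees the job when the whole upload is available.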
![Page 12: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/12.jpg)
Deployment
[Deployment diagram: Business Server; Job Server 1 and Job Server 2, labeled Synchronized; Worker 1, Worker 2, Worker 3.]
![Page 13: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/13.jpg)
Applications
• Used for all asynchronous job processing in Zalo: voice upload, image upload, feed processing, …
• Benchmark (single server)
  • 50K images/second (640x480)
  • 50K voice messages/second (30 s)
• Advantages
  • Batch jobs
  • Jobs are never lost
  • Workers can restart or stop at any time
  • Fail-over, load balancing, quick recovery after failure
• Issue
  • Job duplication (handled by the worker)
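A system that never loses a job may deliver the same job more than once, for example after a missed ACK, so the worker has to tolerate duplicates. One common approach, shown here as a sketch, is to remember the ids of completed jobs and skip repeats. The `IdempotentWorker` class and the use of a job id as the deduplication key are assumptions; the slides do not say how Zalo's workers handle duplicates.

```python
class IdempotentWorker:
    """Runs a handler at most once per job id, even if the job is redelivered."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # ids of jobs that completed successfully

    def handle(self, job_id, payload) -> bool:
        """Return True if the job was processed, False if it was a duplicate."""
        if job_id in self._seen:
            return False
        self._handler(payload)
        # Record the id only after success so a failed job can be retried.
        self._seen.add(job_id)
        return True
```

In a multi-worker deployment the set of seen ids would need to live in shared storage, since a redelivered job can land on a different worker.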
![Page 14: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/14.jpg)
Q&A
![Page 15: Stream upload and asynchronous job processing in large scale systems](https://reader033.vdocuments.us/reader033/viewer/2022051514/54b2d3514a795951548b4576/html5/thumbnails/15.jpg)