Arbyte - a modular, flexible, scalable job queuing and execution system
DESCRIPTION
Talk given at the London Perl Workshop 2008.
TRANSCRIPT
Arbyte
Alistair N. MacLeod
Motivation
Problem
Requirements
Existing Systems
Arbyte
Architecture
Component Diagram
Design and Implementation
Objects
Processes
IPC
Practicalities
Deployment
Project Status
1 Motivation
2 Architecture
3 Design and Implementation
4 Practicalities
Introduction
Arbyte - Job queuing and execution framework
Required system to run jobs
Considered Gearman, TheSchwartz . . .
Decided to create wrapper - Arbyte
Requirements
Fully scalable
Modular
Logging
Good reliability
Thor Compliance
Batching
Batching
Job specific optimisations
In main queue
Distributed Computing Models
Cluster
Grid
MapReduce
Gearman
Limitations
Not reliable
No retries
It didn’t work when I tried it
Features
Has multiple manager / queuing daemons
Scalable
No single point of failure
TheSchwartz
Limitations
Single DB store - not easily scalable
No batching after submission
Relational DB overhead
Features
Reliability
Helios
Layer over TheSchwartz
Limitations
Same as TheSchwartz
Doesn’t add batching or change the fundamental architecture
Features
Manages worker processes
Adds XML Job submission format and web interface
Non-Perl
Possible, but not as good for hacking on or integrating components.
We mostly have Perl skills.
Considered
Torque
Hadoop
Dr. Queue
Back to Arbyte
Modular framework for job queuing and execution
Flexible, Customisable
Can be used with many other systems
e.g. Gearman, with batching, reliability and retries
Architecture Diagram
[Diagram: Job Producers, JobBuffers, Managers, JobRunners (Helios, Gearman, Simple) and JobExecutors, with the Arbyte Boundary marked]
Architecture Diagram: JobBuffer
Responsibilities
Storing Jobs
Job Specific Optimisations
Batching
Priorities
Notes
Currently have JobBuffer::Simple
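The storing, batching and priority responsibilities listed above could look roughly like the following. This is only a minimal sketch: the class name My::JobBuffer, the add and next_batch methods, and the job fields (class, priority, payload) are all hypothetical and are not taken from JobBuffer::Simple.

```perl
package My::JobBuffer;    # hypothetical class, not JobBuffer::Simple itself
use strict;
use warnings;

sub new {
    my ($class) = @_;
    return bless { jobs => [] }, $class;
}

# Store a job: a hash with a class name, a priority and a payload.
sub add {
    my ($self, %job) = @_;
    push @{ $self->{jobs} }, { %job };
    return $self;
}

# Remove and return up to $max jobs of the same class as the
# highest-priority job currently stored, so they can run as one batch.
sub next_batch {
    my ($self, $max) = @_;
    my @sorted = sort { $b->{priority} <=> $a->{priority} } @{ $self->{jobs} };
    return () unless @sorted;

    my $class = $sorted[0]{class};
    my (@batch, @rest);
    for my $job (@sorted) {
        if ($job->{class} eq $class && @batch < $max) {
            push @batch, $job;
        }
        else {
            push @rest, $job;
        }
    }
    $self->{jobs} = \@rest;
    return @batch;
}

package main;

my $buffer = My::JobBuffer->new;
$buffer->add(class => 'Email',  priority => 1, payload => 'a');
$buffer->add(class => 'Resize', priority => 5, payload => 'b');
$buffer->add(class => 'Resize', priority => 2, payload => 'c');

# Both Resize jobs come out together; the Email job stays in the buffer.
my @batch = $buffer->next_batch(10);
print join(',', map { $_->{payload} } @batch), "\n";
```

Batching inside the buffer, rather than at submission time, is what lets jobs submitted separately still be grouped before they are handed on.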
Architecture Diagram: Manager
Responsibilities
Logging
Retries
Basic load balancing
Notes
Only “active” component
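As an illustration of the logging and retry responsibilities, here is a small sketch of a retry loop. The function name run_with_retries, the representation of a job as a code reference, and the in-memory log array are assumptions made for the example, not Arbyte's interface.

```perl
use strict;
use warnings;

# Try a job up to $max_attempts times, logging each failure.
# Returns the attempt number that succeeded, or 0 if every attempt failed.
sub run_with_retries {
    my ($job, $max_attempts, $log) = @_;

    for my $attempt (1 .. $max_attempts) {
        my $ok = eval { $job->(); 1 };
        return $attempt if $ok;

        my $error = $@ || 'unknown error';
        chomp $error;
        push @$log, "attempt $attempt failed: $error";
    }
    return 0;
}

# A job that fails twice and then succeeds.
my $calls = 0;
my $flaky = sub { die "not yet\n" if ++$calls < 3; return 1 };

my @log;
my $succeeded_on = run_with_retries($flaky, 5, \@log);
print "succeeded on attempt $succeeded_on\n";
print "$_\n" for @log;
```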
Architecture Diagram: JobRunner
Responsibilities
Arrange for Job Execution
Notes
Consistent interface
Architecture Diagram: JobRunner types (Helios, Gearman, Simple)
Notes
JobRunner::Simple is implemented
Forks a helper process
Others are examples (todo)
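The slide does not show how JobRunner::Simple forks its helper, so the following is only a generic sketch of the fork-and-wait pattern in Perl. The function name run_in_helper and the use of a code reference as the job are hypothetical.

```perl
use strict;
use warnings;

# Run a job (here just a code reference) in a forked helper process and
# report whether it succeeded. Hypothetical helper, not Arbyte's API.
sub run_in_helper {
    my ($job) = @_;

    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: run the job; exit status 0 means success.
        my $ok = eval { $job->(); 1 };
        exit($ok ? 0 : 1);
    }

    # Parent: wait for the helper and inspect its wait status.
    waitpid($pid, 0);
    return $? == 0 ? 'success' : 'failure';
}

my $good = run_in_helper(sub { 1 });
my $bad  = run_in_helper(sub { die "boom\n" });
print "$good $bad\n";
```

Running the job in a separate process means a crash in job code takes down only the helper, and the parent can still record the failure.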
Architecture Diagram: JobExecutor
Responsibilities
Run Job code
Report success / failure
Notes
Classes correspond to Job classes
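One way executor classes could correspond to job classes is by deriving the executor's package name from the job's class name. The sketch below assumes that convention; the My::JobExecutor namespace, the execute method and the status hash are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical executor class for jobs of class "Resize".
package My::JobExecutor::Resize;

sub execute {
    my ($class, $job) = @_;
    die "no file given\n" unless $job->{file};
    return "resized $job->{file}";
}

package main;

# Look up the executor from the job's class name, run it, and report
# success or failure as a small status hash.
sub execute_job {
    my ($job) = @_;
    my $executor = "My::JobExecutor::$job->{class}";
    my $result   = eval { $executor->execute($job) };
    return defined $result
        ? { status => 'success', result => $result }
        : { status => 'failure', error  => $@ };
}

my $ok  = execute_job({ class => 'Resize', file => 'cat.png' });
my $bad = execute_job({ class => 'Resize' });
print "$ok->{status}: $ok->{result}\n";
print "$bad->{status}: $bad->{error}";
```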
Object Implementation: Options
Homemade
Moose
Object Implementation: Choice
Using homemade objects
All hashes
AUTOLOADed get and set methods
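For readers unfamiliar with the technique, a hash-based object with AUTOLOADed accessors can be sketched as follows. The class name My::HashObject and the get_/set_ naming convention are assumptions for illustration; the slides do not show Arbyte's actual accessor names.

```perl
package My::HashObject;    # hypothetical class name, not from Arbyte
use strict;
use warnings;

our $AUTOLOAD;

# Constructor: the object is just a blessed hash of fields.
sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

# Catch calls to undefined methods and treat get_X / set_X as accessors.
sub AUTOLOAD {
    my ($self, @args) = @_;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # never treat DESTROY as an accessor

    if ($name =~ /^get_(\w+)$/) {
        return $self->{$1};
    }
    if ($name =~ /^set_(\w+)$/) {
        $self->{$1} = $args[0];
        return $self;
    }
    die "No such method: $name\n";
}

package main;

my $job = My::HashObject->new(id => 42, class => 'Resize');
$job->set_priority(5);
print $job->get_class, " job ", $job->get_id,
    " has priority ", $job->get_priority, "\n";
```

The trade-off is that a misspelt field name is only noticed at run time, and then only as an undefined value, which is part of what an object system such as Moose addresses.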
Processes
No threads
JobBuffer
JobRunner
Will likely have own processes, e.g. JobRunnerHelper
Manager
StatusAccepter
IPC Requirements
Wanted something with:
Easy way to serverify an object
Stub generation
Parameter passing
Exceptions
Timeouts
Security
Garbage collection
Remote Object System
Object Oriented Design: RMI-like system
Assumed there would be RMI on CPAN (Ruby has it: DRb), but no
Feel like fixing this?
Had to make do
IPC: Implementation Options
Considered
GRID::Machine
Distributed::Process
RPC::Serialized
RCGI - RPC with CGI server
IPC: Implementation Choice
Event::RPC
Closest to RMI
Maintained
Has (some) timeouts
Propagates Exceptions
Confusing - capabilities not clear
Using some hackery to make it Good Enough
Deployment Hardware
Own Servers
Cloud
Grid Management Software
To Manage
Booting
Package distribution
Configuration
For example
RPMs
Puppet
Wigwam
Project Status
Now
Running in parallel with production system
Todo
Better JobBuffers
Better JobRunners
Worker capabilities?
Optimise
The Route to CPAN
Object system
Config system
High level documentation
More tests
Questions