arbyte - a modular, flexible, scalable job queing and execution system

ARBYTE Alistair N. MacLeod [email protected] ltd

Upload: lokku

Post on 08-May-2015

Category:

Technology


DESCRIPTION

Talk given at the London Perl Workshop 2008.

TRANSCRIPT

Page 1: Arbyte - A modular, flexible, scalable job queing and execution system

ARBYTE

Alistair N. MacLeod

[email protected]

ltd

Page 2: Arbyte - A modular, flexible, scalable job queing and execution system

Arbyte

Alistair N. MacLeod

Motivation

Problem

Requirements

Existing Systems

Arbyte

Architecture

Component Diagram

Design and Implementation

Objects

Processes

IPC

Practicalities

Deployment

Project Status

1 Motivation

2 Architecture

3 Design and Implementation

4 Practicalities

Page 3: Arbyte - A modular, flexible, scalable job queing and execution system

Introduction

Arbyte - Job queuing and execution framework

Required system to run jobs

Considered Gearman, TheSchwartz, ...

Decided to create wrapper - Arbyte

Page 4: Arbyte - A modular, flexible, scalable job queing and execution system

Requirements

Fully scalable

Modular

Logging

Good reliability

Thor Compliance

Batching

Page 5: Arbyte - A modular, flexible, scalable job queing and execution system

Batching

Job specific optimisations

In main queue

Page 6: Arbyte - A modular, flexible, scalable job queing and execution system

Distributed Computing Models

Cluster

Grid

MapReduce

Page 8: Arbyte - A modular, flexible, scalable job queing and execution system

Gearman

Limitations

Not reliable

No retries

It didn’t work when I tried it

Features

Has multiple manager / queuing daemons

Scalable

No single point of failure

Page 9: Arbyte - A modular, flexible, scalable job queing and execution system

TheSchwartz

Limitations

Single DB store - not easily scalable

No batching after submission.

Relational DB overhead

Features

Reliability

Page 10: Arbyte - A modular, flexible, scalable job queing and execution system

Helios

Layer over TheSchwartz

Limitations

Same as TheSchwartz

Doesn’t add batching or change the fundamental architecture

Features

Manages worker processes

Adds XML Job submission format and web interface

Page 11: Arbyte - A modular, flexible, scalable job queing and execution system

Non-Perl

Possible, but not as good for hacking on and integrating components.

We mostly have Perl skills.

Page 12: Arbyte - A modular, flexible, scalable job queing and execution system

Considered

Torque

Hadoop

DrQueue

Page 13: Arbyte - A modular, flexible, scalable job queing and execution system

Back to Arbyte

Modular framework for job queuing and execution

Flexible, Customisable

Can be used with many other systems

e.g. Gearman, with batching, reliability and retries

Page 14: Arbyte - A modular, flexible, scalable job queing and execution system

Architecture Diagram

[Component diagram. Labels shown: Arbyte Boundary; JobBuffer and Job Producers (three of each); Manager (two); JobRunner (three: Helios, Gearman, Simple); JobExecutor (three).]

Page 15: Arbyte - A modular, flexible, scalable job queing and execution system

Architecture Diagram (JobBuffer)

Responsibilities

Storing Jobs

Job Specific Optimisations

Batching

Priorities

Notes

Currently have JobBuffer::Simple
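
The slides do not show how a JobBuffer stores, prioritises or batches jobs. As a rough illustration only, here is a minimal in-memory sketch in Python (Arbyte itself is Perl). The class name `SimpleJobBuffer`, its methods and the "batch jobs of the same class together" rule are all assumptions made for this example, not Arbyte's actual interface.

```python
import heapq
from itertools import count


class SimpleJobBuffer:
    """Hypothetical buffer: stores jobs, orders them by priority, hands out batches."""

    def __init__(self):
        self._heap = []      # entries: (priority, sequence, job_class, payload)
        self._seq = count()  # tie-breaker so equal priorities stay first-in, first-out

    def submit(self, job_class, payload, priority=10):
        """Store a job; a lower number means more urgent (an assumption of this sketch)."""
        heapq.heappush(self._heap, (priority, next(self._seq), job_class, payload))

    def next_batch(self, max_size=3):
        """Pop the most urgent job plus up to max_size - 1 queued jobs of the same class."""
        if not self._heap:
            return None, []
        _, _, job_class, payload = heapq.heappop(self._heap)
        batch = [payload]
        remaining = []
        # Walk the rest in priority order, pulling out jobs that can share the batch.
        for entry in sorted(self._heap):
            if entry[2] == job_class and len(batch) < max_size:
                batch.append(entry[3])
            else:
                remaining.append(entry)
        heapq.heapify(remaining)
        self._heap = remaining
        return job_class, batch
```

Batching inside the buffer, after submission, is the property the talk says TheSchwartz and Helios lack; the grouping rule used here is just one possible job-specific optimisation.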

Page 16: Arbyte - A modular, flexible, scalable job queing and execution system

Architecture Diagram (Manager)

Responsibilities

Logging

Retries

Basic load balancing

Notes

Only “active” component
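
To make the logging, retry and basic load-balancing responsibilities concrete, the following Python sketch shows one way such a loop could look. It is an assumption-laden illustration: `run_with_retries`, the round-robin choice of runner and the `max_attempts` parameter are invented for this example and are not taken from Arbyte.

```python
import logging

log = logging.getLogger("manager_sketch")


def run_with_retries(job, runners, max_attempts=3):
    """Try a job on successive runners until one succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        # Naive load balancing: rotate through the available runners.
        runner = runners[(attempt - 1) % len(runners)]
        try:
            result = runner(job)
            log.info("job %r succeeded on attempt %d", job, attempt)
            return result
        except Exception as exc:
            log.warning("job %r failed on attempt %d: %s", job, attempt, exc)
    raise RuntimeError(f"job {job!r} failed after {max_attempts} attempts")
```

Here each runner is simply a callable that raises on failure; a real Manager would also have to decide what happens to a job once its retries are exhausted.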

Page 17: Arbyte - A modular, flexible, scalable job queing and execution system

Architecture Diagram (JobRunner)

Responsibilities

Arrange for Job Execution

Notes

Consistent interface

Page 18: Arbyte - A modular, flexible, scalable job queing and execution system

Architecture Diagram (JobRunner)

Notes

JobRunner::Simple is implemented

Forks a helper process

Others are examples (todo)

Page 19: Arbyte - A modular, flexible, scalable job queing and execution system

Architecture Diagram (JobExecutor)

Responsibilities

Run Job code

Report success / failure

Notes

Classes correspond to Job classes
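
The idea that executor classes correspond to job classes, and that each executor runs the job code and reports success or failure, can be sketched as follows. This is a Python illustration under stated assumptions: the `EXECUTORS` registry, the `WordCountExecutor` example and the `("success", ...)` / `("failure", ...)` result tuples are hypothetical and not Arbyte's API.

```python
class JobExecutor:
    """Base class: run the job code and report success or failure."""

    def execute(self, payload):
        raise NotImplementedError

    def run(self, payload):
        # Convert any exception from the job code into a failure report.
        try:
            return ("success", self.execute(payload))
        except Exception as exc:
            return ("failure", str(exc))


class WordCountExecutor(JobExecutor):
    """Example executor for a hypothetical 'WordCount' job class."""

    def execute(self, payload):
        return len(payload.split())


# One executor class per job class, looked up by job class name.
EXECUTORS = {"WordCount": WordCountExecutor}


def run_job(job_class, payload):
    executor_cls = EXECUTORS.get(job_class)
    if executor_cls is None:
        return ("failure", f"no executor for job class {job_class!r}")
    return executor_cls().run(payload)
```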

Page 20: Arbyte - A modular, flexible, scalable job queing and execution system

Object Implementation: Options

Homemade

Moose

Page 21: Arbyte - A modular, flexible, scalable job queing and execution system

Object Implementation: Choice

Using homemade objects

All hashes

AUTOLOADed get and set methods
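
The slides give no code for these objects. As a loose analogue only, the Python sketch below resolves `get_<field>` and `set_<field>` methods on demand over a hash (dict), using `__getattr__` in roughly the role Perl's `AUTOLOAD` plays. The class name `HashObject`, the `FIELDS` whitelist and the `get_`/`set_` naming are assumptions for illustration.

```python
class HashObject:
    """Dict-backed object whose get_<field>/set_<field> methods are created on demand."""

    FIELDS = ()  # subclasses list the field names they allow

    def __init__(self, **values):
        unknown = set(values) - set(self.FIELDS)
        if unknown:
            raise TypeError(f"unknown fields: {sorted(unknown)}")
        self._data = dict(values)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, much like AUTOLOAD.
        prefix, _, field = name.partition("_")
        if prefix in ("get", "set") and field in self.FIELDS:
            if prefix == "get":
                return lambda: self._data.get(field)

            def setter(value):
                self._data[field] = value
                return self

            return setter
        raise AttributeError(name)


class Job(HashObject):
    FIELDS = ("id", "priority")
```

With this in place, `Job(id=7).get_id()` reads a field and `set_priority(3)` writes one, without either method being written out by hand.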

Page 22: Arbyte - A modular, flexible, scalable job queing and execution system

Processes

No threads

JobBuffer

JobRunner

Will likely have own processes, e.g. JobRunnerHelper

Manager

StatusAccepter

Page 23: Arbyte - A modular, flexible, scalable job queing and execution system

IPC Requirements

Wanted something with:

Easy way to serverify an object

Stub generation

Parameter passing

Exceptions

Timeouts

Security

Garbage collection
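
Some of these requirements (serverifying an object, stub generation, parameter passing and exception propagation) can be illustrated with a small in-process Python sketch. Everything here is hypothetical: `ObjectServer`, `Stub`, `RemoteError` and the JSON message shape are invented for the example, and the "transport" is a plain function call rather than a socket. Timeouts, real security and garbage collection are not covered.

```python
import json


class RemoteError(Exception):
    """Raised on the client side when the served object threw an exception."""


class ObjectServer:
    """Wraps ('serverifies') an ordinary object so it can answer serialized calls."""

    def __init__(self, obj):
        self._obj = obj

    def handle(self, request):
        msg = json.loads(request)
        try:
            if msg["method"].startswith("_"):
                raise PermissionError("private methods are not exposed")
            result = getattr(self._obj, msg["method"])(*msg["args"])
            return json.dumps({"ok": True, "result": result})
        except Exception as exc:
            return json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})


class Stub:
    """Client-side stand-in: any method call is forwarded over the transport."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, method):
        if method.startswith("_"):
            raise AttributeError(method)

        def call(*args):
            request = json.dumps({"method": method, "args": list(args)})
            reply = json.loads(self._transport(request))
            if not reply["ok"]:
                raise RemoteError(reply["error"])
            return reply["result"]

        return call


class Counter:
    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n
        return self.total


# In-process "transport": the stub hands requests straight to the server.
stub = Stub(ObjectServer(Counter()).handle)
```

Calling `stub.add(2)` then behaves like a local method call, and an exception raised inside `Counter` reaches the caller as a `RemoteError`.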

Page 24: Arbyte - A modular, flexible, scalable job queing and execution system

Remote Object System

Object Oriented Design: RMI-like system

Assumed there would be an RMI equivalent on CPAN (Ruby has one, DRb), but no

Feel like fixing this?

Had to make do

Page 25: Arbyte - A modular, flexible, scalable job queing and execution system

IPC: Implementation Options

Considered

GRID::Machine

Distributed::Process

RPC::Serialized

RCGI - RPC with CGI server

Page 26: Arbyte - A modular, flexible, scalable job queing and execution system

IPC: Implementation Choice

Event::RPC

Closest to RMI

Maintained

Has (some) timeouts

Propagates Exceptions

Confusing - capabilities not clear

Using some hackery to make it Good Enough

Page 27: Arbyte - A modular, flexible, scalable job queing and execution system

Deployment Hardware

Own Servers

Cloud

Page 28: Arbyte - A modular, flexible, scalable job queing and execution system

Grid Management Software

To Manage

Booting

Package distribution

Configuration

For example

RPMs

Puppet

Wigwam

Page 29: Arbyte - A modular, flexible, scalable job queing and execution system

Project Status

Now

Running in parallel with production system

Todo

Better JobBuffers

Better JobRunners

Worker capabilities?

Optimise

Page 30: Arbyte - A modular, flexible, scalable job queing and execution system

The Route to CPAN

Object system

Config system

High level documentation

More tests

Page 31: Arbyte - A modular, flexible, scalable job queing and execution system

Questions
