django multi-db in anger

48

Upload: loren-davie

Post on 08-Jul-2015

1.320 views

Category:

Technology


2 download

DESCRIPTION

A talk given at Django-NYC on March 21st, 2011, on how Django MultiDB capabilities are implemented at Saaspire (www.saaspire.com).

TRANSCRIPT

Page 1: Django Multi-DB in Anger
Page 2: Django Multi-DB in Anger

MultiDB In AngerProfessional Driver. Closed Course.

Page 3: Django Multi-DB in Anger

Hi.

Sean OConnor in background, with superior monitor.

World Headquarters

“Checkpoint Charlie” T-

Shirt

Page 4: Django Multi-DB in Anger
Page 5: Django Multi-DB in Anger

Django MultiDBAs It Was Probably Intended

Page 6: Django Multi-DB in Anger

The Legacy DB Problem

New DB, built for New

Web App

Legacy DB that you are Unable to

Eliminate. :(

Django Website

Page 7: Django Multi-DB in Anger

Django ORM, Before.

RDBMS

ORM API

Database Backend

Python Database

Driver

Page 8: Django Multi-DB in Anger

MultiDB Mode

RDBMS

ORM API

Database Backend

Python Database

Driver

RDBMS

Database Backend

Python Database

Driver

RDBMS

Database Backend

Python Database

Driver

DB Routing

DB Definition

DB Definition

DB Definition

Page 9: Django Multi-DB in Anger

Out of the Box

• Database definitions.

• --database=

• Database Routers.

• using(), using=

Page 10: Django Multi-DB in Anger

DB Definitions

DATABASES = { 'default': { 'NAME': 'app_data', 'ENGINE': 'django.db.backends.postgresql_psycopg2', 'USER': 'postgres_user', 'PASSWORD': 's3krit' }, 'users': { 'NAME': 'user_data', 'ENGINE': 'django.db.backends.mysql', 'USER': 'mysql_user', 'PASSWORD': 'priv4te' }}

settings.py

Page 11: Django Multi-DB in Anger

--database=

./manage.py syncdb

./manage.py syncdb --database=legacy

./manage.py migrate

./manage.py migrate --database=legacy

Page 12: Django Multi-DB in Anger

Database Routersclass MyAppRouter(object): """A router to control all database operations on models in the myapp application"""

def db_for_read(self, model, **hints): "Point all operations on myapp models to 'other'" if model._meta.app_label == 'myapp': return 'other' return None

def db_for_write(self, model, **hints): "Point all operations on myapp models to 'other'" if model._meta.app_label == 'myapp': return 'other' return None

Page 13: Django Multi-DB in Anger

Database Routers

DATABASE_ROUTERS = [‘path.to.MyAppRouter’,‘path.to.OtherRouter’]

settings.py

Page 14: Django Multi-DB in Anger

API: “using”

Author.objects.using(‘other’).all()

p = Person(name=‘Fred’)p.save(using=‘db0’)p.save(using=‘db1’)

Page 15: Django Multi-DB in Anger

And now, the real world...

MultiDB Meets

Page 16: Django Multi-DB in Anger

The Multi-Tenant Hydra

• SaaS

• Multiple accounts on a cluster.

• Each account with “substantial” data.

• Need to think about sharding.

Page 17: Django Multi-DB in Anger

Core DB

Account DB

Account DB

Account DB

Account DB

Account DB

Account DB

Account DB

Account DB

OMG It’s Full of Stars

Page 18: Django Multi-DB in Anger

Core vs. Account• Core DB

• Account Data

• User Data

• Immutable Data

• Account DBs

• Everything Else

Page 19: Django Multi-DB in Anger

More Requirements

• Easy, declarative specification of core vs account for any model.

• No explicit DB references in code.

• Must route account models to correct account db.

• Must support web and Celery task usage.

Page 20: Django Multi-DB in Anger

...and even worse:

• No a priori knowledge of account databases. Seriously.

Oh, boy...

Page 21: Django Multi-DB in Anger

The Hack Solution

Page 22: Django Multi-DB in Anger

Declarative Model Routing

class ApplicationSeat(models.Model): """ A seat for an application. """ app = models.ForeignKey(Application) user = models.ForeignKey(UserProfile,related_name='application_seats')

global_model = True

def __unicode__(self): return '%s - %s' % (unicode(self.app),unicode(self.user))

class Meta: unique_together = (('app','user'),)

Page 23: Django Multi-DB in Anger

Declarative Model Routing

class AccountRouter(object): """ Routes models to account database, unless they are flagged global. """ def _is_saaspire(self,model): return model.__module__.startswith('saaspire') def _global(self,model): return hasattr(model,'global_model') and model.global_model

Page 24: Django Multi-DB in Anger

Routing to Coreclass AccountRouter(object): # ... def db_for_read(self,model,**hints): """ Gets the db for read. """ from saaspire.accounts.utils import current_account if self._is_saaspire(model): if self._global(model): return 'default' else: self._init_dbs() ca = current_account() if ca: return ca.slug else: return None else: return None

Page 25: Django Multi-DB in Anger

Routing to Accountclass AccountRouter(object): # ... def db_for_read(self,model,**hints): """ Gets the db for read. """ from saaspire.accounts.utils import current_account if self._is_saaspire(model): if self._global(model): return 'default' else: self._init_dbs() ca = current_account() if ca: return ca.slug else: return None else: return None

Page 26: Django Multi-DB in Anger

What account am I using?

def current_account(): """ Gets the current account. """ from environment import env # Look for explicit account specification if hasattr(env,'account') and env.account: return env.account

# ... # Look for current user if hasattr(env,'user') and not env.user.is_anonymous(): profile = env.user.get_profile() if profile: return profile.account return None

Page 27: Django Multi-DB in Anger

Warning:The following scene contains threadlocals.

Page 28: Django Multi-DB in Anger
Page 29: Django Multi-DB in Anger

HTTP Request

Django-Environment

Environment Generators

View executes

Threadlocal is cleared.

Start

Stop

Values associated with

threadlocal "env"

Django-Environment

Page 30: Django Multi-DB in Anger

What account am I using?

from environment.standard import RequestAttributeGeneratorfrom saaspire.core.envgen import AccountGenerator, InstanceGenerator

entries = { 'user':RequestAttributeGenerator('user'), 'account':AccountGenerator(), 'instance':InstanceGenerator(),}

main.env

Page 31: Django Multi-DB in Anger

What account am I using?

class AccountGenerator(object): """ Populates the environment with the current account. """ def generate(self,key,request): """ Generates the variable. """ if not request.user.is_anonymous(): return request.user.get_profile().account else: return None

Page 32: Django Multi-DB in Anger

What account am I using?

def current_account(): """ Gets the current account. """ from environment import env # Look for explicit account specification if hasattr(env,'account') and env.account: return env.account

Page 33: Django Multi-DB in Anger

But what about Celery?

• Celery: An async message queue.

• No HTTP request/response cycle.

• Django-environment won’t work.

Page 34: Django Multi-DB in Anger

use_account() context manager

@periodic_task(run_every=timedelta(minutes=60))def run_proactive_service_subinferences(): for account in Account.objects.active(): with use_account(account.name): # etc...

Page 35: Django Multi-DB in Anger

use_account() context managerclass use_account(object): """ Context manager that sets a specific account. """ def __init__(self,account_name): from saaspire.accounts.models import Account self.account = Account.objects.get(Q(name=account_name) | Q(slug=account_name)) self.current_account = None def __enter__(self): """ Called at beginning of with block. """ from environment import env if hasattr(env,'account'): self.current_account = env.account env.account = self.account def __exit__(self,*args,**kwargs): """ Called at end of with block. """ from environment import env env.account = self.current_account

Page 36: Django Multi-DB in Anger

No a priori knowledge of account databases.

Page 37: Django Multi-DB in Anger

This is cached in memory.

DATABASES = { 'default': { 'NAME': 'app_data', 'ENGINE': 'django.db.backends.postgresql_psycopg2', 'USER': 'postgres_user', 'PASSWORD': 's3krit' }, 'users': { 'NAME': 'user_data', 'ENGINE': 'django.db.backends.mysql', 'USER': 'mysql_user', 'PASSWORD': 'priv4te' }}

Page 38: Django Multi-DB in Anger

What we need.

There is no database "X".

Someone signs up for

Saaspire account "X".

Database "X" is automatically

created.

Account "X" traffic automatically

routed to DB "X".

Page 39: Django Multi-DB in Anger

DBMap

from saaspire.core.db import DBMap

DATABASES = DBMap() # Dictionary-like thingie

settings.py

Page 40: Django Multi-DB in Anger

{% rant %}

Page 41: Django Multi-DB in Anger

The way DATABASES works.

MultiDB-aware code looks for a dictionary

assigned to DATABASES setting.

MultiDB-aware code proceeds to cache

dictionary entries in it's own memory store.

MultiDB-aware code ignores the object at the

DATABASES setting, henceforth.

Page 42: Django Multi-DB in Anger

The way DATABASES should work.

MultiDB-aware code asks for DB definition from the DATABASE_DEFINITION_PROVIDER class.

If no provider has been specified, it falls back to the included default provider, which happens to

look for a dictionary under the DATABASES setting.

Memory caching is LEFT UP TO THE PROVIDER. MultiDB-aware code does NOT do

its own memory caching.

Page 43: Django Multi-DB in Anger

{% endrant %}

Page 44: Django Multi-DB in Anger

Right, back to DBMap

class DBMap(object):

def __init__(self, core_name='core', core_engine='django.db.backends.postgresql_psycopg2', core_user='', core_password='', core_host='localhost', core_port='5432'): self.data = {'default':{'NAME':core_name, 'ENGINE':core_engine, 'USER':core_user, 'PASSWORD':core_password, 'HOST':core_host, 'PORT':core_port}}

Page 45: Django Multi-DB in Anger

More DBMap def __getitem__(self,key): """ Dictionary accessor. """ if not key in self.data: self.load_db(key) return self.data[key]

def load_db(self,key): """ Loads the specific database info. """ from saaspire.accounts.models import Account try: account = Account.objects.get(slug=key) self.data[key] = {'NAME':account.db_name, 'ENGINE':account.db_engine, 'USER':account.db_user, 'PASSWORD':account.db_password, 'HOST':self.data['default']['HOST'], 'PORT':self.data['default']['PORT']} except Account.DoesNotExist: raise KeyError

Page 46: Django Multi-DB in Anger

Dynamic Creation of DBsclass Account(models.Model):

def create_database(self):

db_info = settings.DATABASES['default']

host = db_info['HOST'] cx_params = {'host':host} if db_info.get('USER',None): cx_params['user'] = db_info['USER'] if db_info.get('PASSWORD',None): cx_params['password'] = db_info['PASSWORD']

connection = psycopg2.connect(**cx_params) connection.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT) cursor = connection.cursor()

cursor.execute('CREATE DATABASE "%s" OWNER "%s"' % (slugify(self.db_name), slugify(self.db_user),))

# TO BE CONTINUED...

Page 47: Django Multi-DB in Anger

Dynamic Creation of DBs

class Account(models.Model): def create_database(self):

# ...THE CONTINUATION...

call_command('syncdb', database=self.db_name, interactive=False, verbosity=0)

settings.DATABASES.load_db(self.db_name) # Force DBMap to load this database reload(south.db) call_command('migrate', database=self.db_name, interactive=False, verbosity=0)

Page 48: Django Multi-DB in Anger

Questions?