Strategies for refactoring and migrating a big old project to be multilingual and use multiple databases, or how I learned to stop worrying and search and replace my code base. 8th Django Copenhagen Meetup. Benjamin Bach [email protected]

Upload: benjaoming

Post on 16-Jul-2015



Thanks!

Title backup responsible: @valberg

1. The project “toughroad”
2. Multiple database schemas
3. Multilingual (django-parler)

The project

A game: 80-250 players

Educational (global trade issues), ~6 hours

Physical role play + computer interactions

Illustration of a global trade chain:

Farmers
Traders
Exporters
Banks
Brokers
Brand companies
Cafés

Testability

Unfeasible: either get 80 people or simulate 80 people's interactions.

Even worse: every role is unique and there are up to 250+.

Instruction manuals, interdependent human behavior, and human errors are part of the game.

2009: First games played.

2010: First successful game.

Fixing issues during gameplay for ~2 years

Summer 2014: Game has worked flawlessly for a couple of years.

http://cloc.sourceforge.net v 1.60  T=2.09 s (89.9 files/s, 13062.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          81           2948           1062          12832
HTML                           100           1519              2           7326
CSS                              3            133             31            953
Javascript                       3             43              2            443
Bourne Shell                     1              4              1              8
-------------------------------------------------------------------------------
SUM:                           188           4647           1098          21562
-------------------------------------------------------------------------------

(September 2014)

Finally! Success! New partners, more attention, new problems.

Does it translate to other countries?

Does it scale?

1. Copying the game

Old model: For every game, a new database.

Each game copies its “start-up” configuration from a prototype game.

(MySQL)

New model:

Use Postgres schemas!

Now we can deploy each game inside its own schema and access shared data from the “public” schema.

Schemas

Namespaces inside a database

Like adding set(A_1, B_1) + set(A_2, C_2) = set(A_1, B_1, C_2)
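One way to read the set analogy: Postgres resolves an unqualified table name to the first schema on the search path that contains it, so A comes from the first schema while C is still visible from the second. The toy sketch below models that with plain Python lists; the schema and table names are invented for illustration and nothing here talks to a real database.

```python
# Toy model of search_path resolution. Schema and table names are
# hypothetical; this only illustrates the "first schema wins" rule.

def resolve(search_path, schemas):
    """Return {table: schema} as seen through the given search path."""
    visible = {}
    for schema_name in search_path:
        for table in schemas[schema_name]:
            # setdefault keeps the entry from the earlier schema on the path
            visible.setdefault(table, schema_name)
    return visible


schemas = {
    "schema_1": ["A", "B"],
    "schema_2": ["A", "C"],
}

print(resolve(["schema_1", "schema_2"], schemas))
```

Reversing the path makes A resolve to schema_2 instead, which is why the order of schemas in the search path matters.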

Example:

Database: Google

auth_user (schema: public)
+ docs (schema: docs)
+ spreadsheet (schema: spreadsheet)

Scalability and performance win!

Manage large sets of data separately

Share tables only where necessary

Reduce use of managers

Downside:

Hard to share data

Makes migrations harder!

Using schemas is a fundamental design decision!

Step 1: Setting up the project...

1. settings.DATABASES

2. settings.DATABASE_ROUTERS

settings.DATABASES

    import os

    DATABASES = {
        # Shared data: look in "public" first, then in the game's schema
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'toughroad_dk',
            'USER': 'django',
            'PASSWORD': 'django',
            'HOST': '127.0.0.1',
            'OPTIONS': {
                'options': '-c search_path=public,' + os.environ['TOUGHROAD_SCHEMA']
            },
        },
        # Game-specific data: look in the game's schema first, then in "public"
        'toughroad': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'toughroad_dk',
            'USER': 'django',
            'PASSWORD': 'django',
            'HOST': '127.0.0.1',
            'OPTIONS': {
                'options': '-c search_path=' + os.environ['TOUGHROAD_SCHEMA'] + ',public'
            },
        },
    }

settings.DATABASE_ROUTERS

    DATABASE_ROUTERS = ['toughroad.database_routers.ToughroadRouter']

    APPS = ('toughroad', 'sessions')

    class ToughroadRouter(object):
        """
        Put all game-specific data in a separate database
        """

        def db_for_read(self, model, **hints):
            if model._meta.app_label in APPS:
                return 'toughroad'
            return None

        def db_for_write(self, model, **hints):
            if model._meta.app_label in APPS:
                return 'toughroad'
            return None

        def allow_syncdb(self, db, model):
            if db == 'toughroad':
                return model._meta.app_label in APPS
            elif model._meta.app_label in APPS:
                return False
            return None

        def allow_relation(self, obj1, obj2, **hints):
            """
            Allow relations if a model in the auth app is involved.
            """
            # Note: both checks compare app labels, not database aliases
            if obj1._meta.app_label == 'default' or \
               obj2._meta.app_label == 'toughroad':
                return True
            return None

    # Specifying the DB from command line:
    $ python manage.py shell --database=toughroad

    # Using env stuff
    $ TOUGHROAD_SCHEMA='customer_A' python manage.py shell --database=toughroad
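Since the schema name comes straight from the environment and is concatenated into the connection options, it may be worth checking it first. The sketch below is one possible helper, not part of the original settings: the function name `search_path_option` and the allowed-name pattern are assumptions made for illustration.

```python
import re

# Conservative pattern for schema names (an assumption for this sketch,
# stricter than what Postgres itself would accept).
SCHEMA_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def search_path_option(schema, public_first):
    """Build the '-c search_path=...' option string used in OPTIONS."""
    if not SCHEMA_RE.fullmatch(schema):
        raise ValueError("unexpected schema name: %r" % (schema,))
    # 'default' looks in public first, 'toughroad' looks in the game schema first
    path = ["public", schema] if public_first else [schema, "public"]
    return "-c search_path=" + ",".join(path)


print(search_path_option("customer_A", public_first=True))
print(search_path_option("customer_A", public_first=False))
```

The two calls correspond to the 'default' and 'toughroad' entries respectively; only the order of the two schemas differs.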

But why have different databases?

Storing session data

(south_)migration_history!

Step 2:

Deployment. No fun using schemas if they're not automatically handled

manage.py deploy_game

    for game in games.filter(db_created=False):
        print "Creating schema for", game
        cursor = connection.cursor()
        cursor.execute("CREATE SCHEMA \"" + game.schema_name + "\"")
        transaction.commit()
        ...

manage.py deploy_game (2)

    for game in games.filter(db_created=False):
        ...
        # syncdb for applications that do not have migrations and
        # are not in the public schema already
        env = {}
        env.update(os.environ.copy())
        env.update(game.env)
        p = subprocess.Popen(
            ['python', 'manage.py', 'syncdb', '--settings=settings.from_env',
             '--database=toughroad', '--noinput', '--traceback'],
            env=env
        )
        p.communicate()
        if p.returncode != 0:
            raise RuntimeError("Tried command, it failed")
        ...

manage.py deploy_game (3)

    for game in games.filter(db_created=False):
        ...
        if trmeta_settings.BEFORE_MIGRATE:
            p = subprocess.Popen(
                shlex.split(trmeta_settings.BEFORE_MIGRATE),
                env=env
            )
            p.communicate()
            if p.returncode != 0:
                raise RuntimeError("Tried command, it failed")
        ...

manage.py deploy_game (5)

    for game in games.filter(db_created=False):
        ...
        p = subprocess.Popen(
            ['python', 'manage.py', 'deploy_game', 'south',
             '--settings=settings.from_env', '--traceback'],
            env=env,
        )
        p.communicate()
        if p.returncode != 0:
            raise RuntimeError("Tried command, it failed")
        game.db_created = True
        game.save()
        transaction.commit()
        ...

manage.py deploy_game south

    ...
    from django.db import connections, connection, transaction

    if options['south']:
        # South's history table is created by hand inside the game's schema
        cursor = connections['toughroad_explicit'].cursor()
        cursor.execute("""CREATE TABLE "{:s}"."south_migrationhistory" (
            "id" serial NOT NULL PRIMARY KEY,
            "app_name" varchar(255) NOT NULL,
            "migration" varchar(255) NOT NULL,
            "applied" timestamp with time zone NOT NULL
        )
        """.format(os.environ['TOUGHROAD_SCHEMA']))
        transaction.commit_unless_managed(using="toughroad_explicit")
        cursor.close()

manage.py deploy_game (6)

    for game in games.filter(db_created=False):
        ...
        print "Now migrating", game.schema_name
        env = {}
        env.update(os.environ.copy())
        env.update(game.env)
        p = subprocess.Popen(
            ['python', 'manage.py', 'migrate', 'toughroad',
             '--settings=settings.from_env', '--database=toughroad',
             '--no-initial-data', '--noinput', '--traceback'],
            env=env
        )
        p.communicate()
        if p.returncode != 0:
            raise RuntimeError("Tried command, it failed")
        ...

manage.py deploy_game (7)

    for game in games.filter(db_created=False):
        ...
        if trmeta_settings.AFTER_MIGRATE:
            p = subprocess.Popen(
                shlex.split(trmeta_settings.AFTER_MIGRATE),
                env=env
            )
            p.communicate()
            if p.returncode != 0:
                raise RuntimeError("Tried command, it failed")
        ...
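The deploy_game steps repeat one pattern: merge the game's environment into the current one, run a command, and abort on a non-zero exit code. A minimal sketch of that pattern as a helper follows. The name `run_checked` is hypothetical and not from the original command, and the example is written for Python 3 rather than the Python 2 used on the slides.

```python
import os
import subprocess
import sys


def run_checked(cmd, extra_env=None):
    """Run cmd with os.environ plus extra_env; raise if it exits non-zero."""
    env = os.environ.copy()
    env.update(extra_env or {})
    p = subprocess.Popen(cmd, env=env)
    p.communicate()
    if p.returncode != 0:
        raise RuntimeError("Command failed (%d): %r" % (p.returncode, cmd))
    return p.returncode


# Example: the child process sees the schema chosen for this game.
run_checked(
    [sys.executable, "-c", "import os; print(os.environ['TOUGHROAD_SCHEMA'])"],
    {"TOUGHROAD_SCHEMA": "customer_A"},
)
```

Each step in the command would then reduce to one call, for example passing the syncdb or migrate argument list together with the game's environment.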

And the refactoring?

Splitting into separate applications!

toughroad
toughroad_meta

“toughroad” before

    class GameRound(models.Model):
        """
        A simulated year. Managed by "gameloop", all other threads are
        notified of round changes.
        """

        number = models.IntegerField(
            _('round number'),
            unique=True,
            help_text=_('Counts from 1!'))
        template_duration = models.IntegerField(
            _('duration'),
            default=30,
        )
        economic_growth = models.FloatField(
            default=1.0,
            verbose_name=_('economic growth'),
            help_text=_(
                'Economic growth factor (>0.0). 1.0=no growth. 1.1 = 10%% growth etc.'))

        is_started = models.BooleanField(default=False)
        started_on = models.DateTimeField(null=True, blank=True)

“toughroad” after

    from toughroad_meta.models import GameRound as GameRoundMeta

    class GameRound(GameRoundMeta):

        is_started = models.BooleanField(default=False)
        started_on = models.DateTimeField(null=True, blank=True)

“toughroad_meta”

    class GameRound(models.Model):
        """
        A simulated year. Managed by "gameloop", all other threads are
        notified of round changes.
        """

        number = models.IntegerField(
            _('round number'),
            unique=True,
            help_text=_('Counts from 1!'))
        template_duration = models.IntegerField(
            _('duration'),
            default=30,
        )
        economic_growth = models.FloatField(
            default=1.0,
            verbose_name=_('economic growth'),
            help_text=_(
                'Economic growth factor (>0.0). 1.0=no growth. 1.1 = 10%% growth etc.'))

Dos and don'ts

Do

● Create JSON dumps and your own scripts for re-importing into your factored out application

● Backup stuff

● Make proxy models!

● Be smart so you have to refactor as little code as possible.

● Take small steps

● Add new fields

● Use regex for search and replace!

Don't

● Search and replace more than you can remember at once

● Remove models

● Rename models

● Rename fields

● End up with ambiguous stuff. Having two models is fine, having the same field in two places is not.

Put a proxy in place of the old model

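    # A Django proxy model must subclass the model it stands in for;
    # the import path below is assumed.
    from my_new_application.models import Model

    class MyModelWhichWasMoved(Model):
        """This model was moved while refactoring..."""

        class Meta:
            proxy = True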

Cheat :)

    class BrokerIntialContracts(models.Model):

        def __init__(self, *args, **kwargs):
            models.Model.__init__(self, *args, **kwargs)
            # When using the 'toughroad' application, swap this model
            # for the toughroad proxy
            from toughroad_meta.settings import SWAP_TOUGHROAD_PROXIES
            if SWAP_TOUGHROAD_PROXIES:
                from toughroad.models import BrokerIntialContracts
                self.__class__ = BrokerIntialContracts
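The cheat relies on a plain Python feature: an instance's `__class__` can be reassigned to a compatible class, after which method lookups go through the new class. The sketch below shows the mechanism without Django; the class names and the `swap_to` attribute are hypothetical stand-ins for the model, its proxy, and the SWAP_TOUGHROAD_PROXIES setting.

```python
class Base(object):
    """Stands in for the model that lives in the shared application."""

    swap_to = None  # set to a subclass to enable the swap (hypothetical switch)

    def __init__(self, name):
        self.name = name
        if self.swap_to is not None:
            # From here on, method lookups use the swapped-in class
            self.__class__ = self.swap_to

    def describe(self):
        return "base:" + self.name


class GameSpecific(Base):
    """Stands in for the game-specific proxy."""

    def describe(self):
        return "game:" + self.name


plain = Base("contract")        # created before the swap is enabled
Base.swap_to = GameSpecific
swapped = Base("contract")      # created after the swap is enabled

print(plain.describe(), swapped.describe())
```

Code that still constructs the old class gets instances that behave like the new one, which is what lets call sites remain unchanged during the refactoring.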

Another big “do”:

Refactor while you're hot!

Refactoring to be multi-lingual

django-parler

django-parler provides Django model translations without nasty hacks.

Features:

● Nice admin integration.
● Access translated attributes like regular attributes.
● Automatic fallback to the default language.
● Separate table for translated fields, compatible with django-hvad.
● Plays nice with others, compatible with django-polymorphic, django-mptt and such:
    ● No ORM query hacks.
    ● Easy to combine with custom Manager or QuerySet classes.
    ● Easy to construct the translations model manually when needed.

django-parler

    from django.db import models
    from django.utils.translation import ugettext_lazy as _
    from parler.models import TranslatableModel, TranslatedFields

    class MyModel(TranslatableModel):
        translations = TranslatedFields(
            title = models.CharField(_("Title"), max_length=200)
        )

        def __unicode__(self):
            return self.title

django-parler (2)

    >>> object = MyModel.objects.all()[0]
    >>> object.get_current_language()
    'en'
    >>> object.title
    u'cheese omelet'

    >>> object.set_current_language('fr')       # Only switches
    >>> object.title = "omelette du fromage"    # Translation is created on demand.
    >>> object.save()

So that means:

mymodel.my_field will still work!

But there is a fundamental issue!

MyModel.my_field is no longer in the “myapp_my_model” table!

It is in “myapp_my_model_translations”

So back to refactoring...

1. Create the translation table, keep the existing columns.
2. Copy the data from the original table to the translation table.
3. Remove the fields from the original model.
4. Refactor like hell

See: http://django-parler.readthedocs.org/en/latest/advanced/migrating.html

1: Add translations to model

    # Old model
    class MyModel(models.Model):
        name = models.CharField(max_length=123)

    # New model
    class MyModel(TranslatableModel):
        name = models.CharField(max_length=123)

        translations = TranslatedFields(
            name=models.CharField(max_length=123),
        )

2.1: Migrate the data

# Create an empty data migration

    manage.py makemigrations --empty myapp --name migrate_translatable_fields

2.2: Create the migration

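    from django.conf import settings

    def forwards_func(apps, schema_editor):
        # Use the historical models, not the ones imported from models.py
        MyModel = apps.get_model('myapp', 'MyModel')
        MyModelTranslation = apps.get_model('myapp', 'MyModelTranslation')

        # Copy every existing value into a translation row for the default language
        for object in MyModel.objects.all():
            MyModelTranslation.objects.create(
                master_id=object.pk,
                language_code=settings.LANGUAGE_CODE,
                name=object.name
            )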
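At the table level, the data migration amounts to one row in the translation table per row in the original table, tagged with the default language. The sketch below reproduces that copy with the standard library's sqlite3 in place of Django and Postgres. The table names follow the ones mentioned earlier, while the column layout and the sample rows are assumptions made for illustration.

```python
import sqlite3

LANGUAGE_CODE = "en"  # stands in for settings.LANGUAGE_CODE

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_my_model (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE myapp_my_model_translations (
        id INTEGER PRIMARY KEY,
        master_id INTEGER NOT NULL REFERENCES myapp_my_model (id),
        language_code TEXT NOT NULL,
        name TEXT NOT NULL,
        UNIQUE (language_code, master_id)
    );
    INSERT INTO myapp_my_model (name) VALUES ('coffee'), ('cocoa');
""")

# Step 2: copy the existing values, keeping the original column in place
conn.execute(
    "INSERT INTO myapp_my_model_translations (master_id, language_code, name) "
    "SELECT id, ?, name FROM myapp_my_model",
    (LANGUAGE_CODE,),
)
conn.commit()

rows = conn.execute(
    "SELECT master_id, language_code, name "
    "FROM myapp_my_model_translations ORDER BY master_id"
).fetchall()
print(rows)
```

Because the original column is still present after this step, the copy can be checked row by row before the old field is removed.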

3: Remove the old fields

manage.py schemamigration myapp --auto "remove_untranslated_fields"

Refactoring necessary:

This is broken...

    my_model.objects.filter(old_field=xx)

But how often do you filter on translated strings?

Replace filter(field_name) with .translated(field_name) or filter(translations__field_name).
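This replacement is a natural candidate for a regex. The sketch below rewrites the first keyword argument of `.filter(...)` or `.exclude(...)` when it names a translated field; the field list and the function name `rewrite` are hypothetical, and later keyword arguments in the same call are deliberately left for manual review.

```python
import re

# Fields that moved to the translations table (hypothetical list)
TRANSLATED_FIELDS = ("name", "title")

# Match ".filter(" or ".exclude(" followed by a translated field name that is
# a whole keyword: it must be followed by "=" or by a "__lookup" suffix.
PATTERN = re.compile(
    r"(\.(?:filter|exclude)\(\s*)(%s)(?=\s*=|__)" % "|".join(TRANSLATED_FIELDS)
)


def rewrite(source):
    """Prefix matching field names with 'translations__'."""
    return PATTERN.sub(r"\1translations__\2", source)


print(rewrite("qs = Commodity.objects.filter(name='coffee')"))
print(rewrite("qs = qs.exclude(title__icontains='x')"))
print(rewrite("qs = qs.filter(names=3)"))  # not a translated field, unchanged
```

Running the function on already-rewritten code changes nothing, so it can be applied repeatedly while working through the code base in small steps.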

ModelAdmin

from parler.admin import TranslatableAdmin

    class MyModelAdmin(TranslatableAdmin):
        search_fields = ('translations__title',)

Update the ordering and order_by() code

TIP: Try to avoid default ordering by translated fields!

Nasty stuff: Aggregation

    lang = translation.get_language()

    contracts_owned = contracts_owned.filter(supplier__translations__language_code=lang)
    contracts_owned = contracts_owned.filter(country__translations__language_code=lang)
    contracts_owned = contracts_owned.filter(commodity__translations__language_code=lang)
    contracts_owned = contracts_owned.values(
        'commodity__translations__name',
        'country__translations__name',
        'country__country_code',
        'supplier__translations__name',
        'commodity__tons',
        'latest_exchange_rate',
        'delivery_round__number',
        'supplier',
        'commodity',
        'country')

    contracts_owned = contracts_owned.annotate(
        tons=Sum('commodity__tons'),
        bags=Count('supplier__translations__name'),
        price=Sum('latest_exchange_rate')
    ).order_by('delivery_round__number')

Refactoring:

Just do it!

...but in small steps.

Search and replace:

Always search more than you replace

Manually replace stuff!
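One possible reading of "search more than you replace": run a broad search for every mention of a field, and let a narrow pattern mark only the lines that are safe to rewrite automatically. Everything else goes on a list for manual review. The function name `scan` and the sample lines below are hypothetical.

```python
import re


def scan(lines, field):
    """Split line numbers into (auto-replaceable, needs manual review)."""
    broad = re.compile(r"\b%s\b" % re.escape(field))
    narrow = re.compile(r"\.filter\(\s*%s\s*=" % re.escape(field))
    replaceable, review = [], []
    for number, line in enumerate(lines, start=1):
        if narrow.search(line):
            replaceable.append(number)
        elif broad.search(line):
            review.append(number)
    return replaceable, review


code = [
    "qs = Commodity.objects.filter(name='coffee')",
    "qs = qs.order_by('name')",
    "label = commodity.name",
    "total = 3",
]

print(scan(code, "name"))
```

In this sample only the first line qualifies for automatic replacement; the order_by() call and the attribute access are found by the broad search and left for a human to decide.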

May the schwartz be with you

Github: benjaoming