
Migrating from LAMP to App Engine

An Audio Visual Experience featuring Andy Smith

photo credit: pinksherbert

It’s The Last Talk! (cue Final Countdown by Europe)

<blink>

</blink>

photo credit: raulc

... but it’s technical, so the pretty pictures will be sparse

photo credit: alper

It comes with a handout for examples!

^W^W^WURL! http://term.ie/data/examples.txt

Quick Background

cool!

Jaiku: mobile presence sharing in Finland; PHP frontend, Python backend, standard LAMP stack, 50k lines of code

October 07, bought by Google

April 08, App Engine launches, Google App Engine Helper for Django released

March 09, Jaiku re-launches on App Engine, Django, 25k lines of code, Open Source!

photo credit: jussi

Let’s Go Porting

Your App

Your Data Model

Your Mind

photo credit: doctorow

No More JOIN

or GROUP BY or nested queries or much else that can’t be accessed in constant time

Your Mind

koolaid: but you probably didn’t want to anyway

google is used to building big things that need to scale massively; to do that, certain constraints on your way of thinking are required

as joe and mike have mentioned, you probably don’t want to after a certain point anyway

No More Surprises

no more intangible thresholds where query time suddenly goes exponential

Your Mind

koolaid: explicit over implicit

depending on the shape of your individual entities it can still take time to load, but otherwise no query you write will become slower over time

and you could also just try to fetch way too many things, but that’s your own damn fault

No More Slowness

everything you do hits an index, right away

Your Mind

koolaid: it’s quick, dude

quick enough to fake joins

no non-indexed result set filtering or temporary tables

nice article on indexes in app engine: http://code.google.com/appengine/articles/index_building.html

literally cannot make a non-indexed call

No More ALTER TABLE

object store means never having to say you’re sorry

Your Mind

koolaid: it’s not like you ever were able to alter a large table anyway

protocol buffers mean you can add attributes to your entities at will without breaking old entities
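A minimal sketch of what this looks like from the application side. Plain dicts stand in for stored entities here (an assumption for illustration, not the datastore API), and the `avatar` attribute and its default are hypothetical: entities written before the attribute existed simply lack it, so the reader supplies a default.

```python
# Plain dicts stand in for stored entities; this is not the datastore API.
OLD_ENTITY = {"nick": "alice"}                     # written before 'avatar' existed
NEW_ENTITY = {"nick": "bob", "avatar": "cat.png"}  # written after it was added

DEFAULT_AVATAR = "default.png"  # hypothetical default value


def avatar_for(entity):
    """Return the avatar, falling back to a default for older entities."""
    return entity.get("avatar", DEFAULT_AVATAR)


print(avatar_for(OLD_ENTITY))
print(avatar_for(NEW_ENTITY))
```

No migration step runs over the old entities; they pick up the new attribute the next time they are written.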

No More Long Requests

you haven’t got much time, so make it count

Your Mind

koolaid: keeps you honest

taking an iterative approach to problems

task queue will help this along, it’s on the roadmap, till then a simple queue + cron
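One way to read "a simple queue + cron" is: each scheduled request takes a small, bounded bite out of a backlog and leaves the rest for the next run. The sketch below assumes an in-memory `deque` as the queue and a hypothetical `BATCH_SIZE`; a real app would keep the backlog in the datastore and have cron hit a URL.

```python
from collections import deque

BATCH_SIZE = 3  # hypothetical: how much work fits comfortably in one request


def process_batch(queue, handler, batch_size=BATCH_SIZE):
    """Handle at most batch_size items, leaving the rest for the next run."""
    done = 0
    while queue and done < batch_size:
        handler(queue.popleft())
        done += 1
    return done


pending = deque(range(7))
handled = []
runs = 0
while pending:  # each loop iteration stands in for one cron-triggered request
    process_batch(pending, handled.append)
    runs += 1
```

Seven items with a batch size of three drain in three runs, and no single run does more than three units of work.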

No More Slashdot Effect

that’s “getting dugg” for you youngsters

Your Mind

koolaid: you’re not going to crash this database or web server

things don’t change under load

no more logging in to your server and frantically watching top or SHOW PROCESSLIST

Let’s Go Porting

Your App

Your Data Model

Your Mind

photo credit: sampitech

Denormalization

you’ll want it for counting, preferences and sometimes referenced data

Your Data Model

Denormalization

if you’re not going to index the data don’t bother making it a property

Your Data Model

Pro Tip #1: Use a Dictionary Property

the app engine django helper implements a reasonable DictProperty based on pickle

Bonus Tip!

JaikuEngine has a reasonable implementation of DictProperty to copy
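The idea behind a pickle-based dictionary property, sketched without any App Engine imports: the whole dict is serialized into one opaque blob, so nothing inside it is indexed. The function names are hypothetical and this is not the helper's or JaikuEngine's actual code.

```python
import pickle


def dict_to_blob(value):
    """Serialize a dict into bytes suitable for a single unindexed blob field."""
    if not isinstance(value, dict):
        raise TypeError("expected a dict, got %r" % type(value))
    return pickle.dumps(value, protocol=2)


def blob_to_dict(blob):
    """Restore the dict; an empty or missing blob becomes an empty dict."""
    if not blob:
        return {}
    # Only unpickle data your own app wrote; pickle is unsafe on untrusted input.
    return pickle.loads(blob)


prefs = {"theme": "dark", "notify": ["sms", "im"]}
blob = dict_to_blob(prefs)
restored = blob_to_dict(blob)
```

A property class would call something like `dict_to_blob` on write and `blob_to_dict` on read.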

Denormalization

if you have to use a counter you’ll need to compute it at write-time, but they aren’t usually worth the pain

Your Data Model

Pro Tip #2: Avoid counting

things like counts aren’t really that important anyway

the bigger the number, the less accurate it needs to be

Bonus Tip!

don’t bother trying to keep counts super accurate past a certain point, and don’t compute them as often
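A small sketch of "the bigger the number, the less accurate it needs to be": show small counts exactly and round large ones to a couple of significant figures. The function name and the choice of two digits are assumptions, and it expects non-negative integers.

```python
def approx_count(n, digits=2):
    """Round a non-negative integer to `digits` significant figures for display."""
    if n < 10 ** digits:
        return n  # small numbers are shown exactly
    magnitude = len(str(n)) - digits
    return round(n, -magnitude)


print(approx_count(7))
print(approx_count(1234))
print(approx_count(98765))
```

If the displayed value only changes every hundred or thousand increments, the stored count can be recomputed that much less often.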

Denormalization

you can do a few hundred (likely cached) get()s but Query() can be quite a bit slower

Your Data Model

Pro Tip #3: Aim for one Query() per page

sometimes this means duplicating data to prevent additional lookups

Think Like This:

at scale your data starts looking quite a lot like your presentation, organize your data into the shapes you’ll be using it in

Integrity

transactions aren’t going to save you

Your Data Model

transactions in app engine currently only cover fairly simplistic use cases such as updating a single entity while preventing race conditions

Integrity

you don’t have UNIQUE so you need to handle your own duplicate prevention

Your Data Model

Pro Tip #1: Be idempotent

you’re going to want to use guessable key names to avoid needing Query()s to check for duplicate work

What that means is:

anything unique goes in your primary key
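A rough sketch of guessable key names, using a plain dict as a stand-in for the datastore (an assumption; the `subscription/...` key format and function names are hypothetical). Because the key is derived from the fields that must be unique, creating the same thing twice lands on the same key and no Query() is needed to check for a duplicate.

```python
STORE = {}  # key name -> entity; an in-memory stand-in for the datastore


def subscription_key(follower, target):
    """Build a deterministic key name from the fields that must be unique."""
    return "subscription/%s/%s" % (follower, target)


def subscribe(follower, target):
    """Create the subscription once; repeating the call changes nothing."""
    key = subscription_key(follower, target)
    # setdefault plays the role of a get-or-insert by key name.
    return STORE.setdefault(key, {"follower": follower, "target": target})


first = subscribe("alice", "bob")
second = subscribe("alice", "bob")
```

The second call returns the entity the first call created, so retries are harmless.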

Integrity

if something breaks halfway through, trying it again should fix it

Your Data Model

Pro Tip #2: Roll forward

for example, adding somebody as a contact may have some side effects, if the user tries to add the contact again ensure those side effects were completed
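The contact example could be sketched like this. Sets stand in for stored data, and the two side effects (a notification and a stream merge) are hypothetical: the first attempt fails halfway, and simply trying again completes whatever was left undone.

```python
CONTACTS = set()       # (owner, contact) pairs
NOTIFIED = set()       # side effect 1: notification sent
STREAM_MERGED = set()  # side effect 2: contact's posts added to owner's stream


def add_contact(owner, contact, fail_after_contact=False):
    """Add a contact and make sure every side effect has been completed."""
    pair = (owner, contact)
    CONTACTS.add(pair)  # safe to repeat
    if fail_after_contact:
        raise RuntimeError("simulated timeout halfway through")
    # Each side effect checks whether it already happened before doing work.
    if pair not in NOTIFIED:
        NOTIFIED.add(pair)
    if pair not in STREAM_MERGED:
        STREAM_MERGED.add(pair)


try:
    add_contact("alice", "bob", fail_after_contact=True)
except RuntimeError:
    pass  # the first attempt left the side effects undone

add_contact("alice", "bob")  # retrying rolls the operation forward
```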

Integrity

if you get something that is broken, fix it or forget it

Your Data Model

Pro Tip #3: Clean up after yourself

a good example would be if some initial data that was supposed to be created when a user joins your site does not exist

Pit Falls

it doesn’t work like that

Your Data Model

Pit Falls

ordering thinks like a computer, not a human: separate your display from your storage and index

Your Data Model

Pit Fall #1: Case Sensitivity
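To make the pitfall concrete: plain Python sorting compares strings by code point, so every uppercase name sorts ahead of every lowercase one. One common fix, sketched here with hypothetical field names, is to store a normalized copy just for ordering and keep the original for display.

```python
names = ["bob", "Alice", "carol", "Dave"]

# Code-point ordering: uppercase letters sort before all lowercase ones.
raw_order = sorted(names)

# Store a normalized copy next to the display value and order on that instead.
entities = [{"display": n, "sort_key": n.lower()} for n in names]
human_order = [e["display"] for e in sorted(entities, key=lambda e: e["sort_key"])]

print(raw_order)    # ['Alice', 'Dave', 'bob', 'carol']
print(human_order)  # ['Alice', 'bob', 'carol', 'Dave']
```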

Pit Falls

don’t think you can just ask for the 100th result: you need to use inequality and ordering to skip the line

Your Data Model

Pit Fall #2: Paging

you can also page queries with a special __key__ field, there’s a nice post explaining approaches to it

Bonus Tip!

__key__ post: http://groups.google.com/group/google-appengine/browse_thread/thread/ee5afbde20e13cde
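The shape of inequality-based paging, sketched over a sorted Python list standing in for an index on a unique key (an assumption; this is not datastore code, and `fetch_page` is a hypothetical name). Instead of asking for "results 4 to 6", each request asks for "the next few keys greater than the last one I saw".

```python
import bisect

# A sorted list stands in for an index ordered on a unique key.
INDEX = sorted("user%03d" % i for i in range(10))


def fetch_page(after=None, limit=3):
    """Return up to `limit` keys strictly greater than `after`, plus a bookmark."""
    start = 0 if after is None else bisect.bisect_right(INDEX, after)
    page = INDEX[start:start + limit]
    bookmark = page[-1] if page else None  # pass this back in to get the next page
    return page, bookmark


page1, mark1 = fetch_page()
page2, mark2 = fetch_page(after=mark1)
```

The bookmark is what goes into the "next" link, so the cost of fetching a page does not grow with how deep into the results the user is.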

Pit Falls

they’re kind of weird and tricky and probably don’t do what you want anyway: don’t bother

Your Data Model

Pit Fall #3: Entity Groups

if you do bother, they’re useful for data that is frequently edited together, but you’ll want to do some reading

Bonus Tip!

entity group reading: http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html

Pit Falls

generally a datastore timeout, they happen infrequently but often unpredictably: be prepared

Your Data Model

Pit Fall #4: Timeouts

this is where all that data integrity stuff comes in

under race-condition load, with transactions on similar entities you _can_ lock your tables long enough to cause timeouts, avoid this by using memcache for locks

Bonus Tip!
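A sketch of the cache-as-lock pattern. `FakeCache` is a hypothetical in-memory stand-in written for this example, not the memcache API; the only thing it relies on is an `add()` that succeeds only when the key is absent. The expiry keeps a crashed holder from blocking everyone forever. Since a real cache may evict entries at any time, treat a lock like this as best-effort.

```python
import time


class FakeCache(object):
    """Tiny in-memory stand-in for a cache with an add-if-absent operation."""

    def __init__(self):
        self._data = {}

    def add(self, key, value, ttl):
        """Set key only if absent or expired; return True on success."""
        now = time.time()
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def delete(self, key):
        self._data.pop(key, None)


def with_lock(cache, name, work, ttl=10):
    """Run work() only if the lock is free; return (ran, result)."""
    key = "lock:" + name
    if not cache.add(key, "locked", ttl):
        return False, None  # someone else is working; skip or retry later
    try:
        return True, work()
    finally:
        cache.delete(key)


cache = FakeCache()
ran, result = with_lock(cache, "user:alice", lambda: "updated")
```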

Pit Falls

syncing and locking problems happen everywhere, even on app engine: proper memcache helps though

Your Data Model

Pit Fall #5: Parallel Execution

it’s not a new problem to any of us, but it isn’t going to magically go away

Let’s Go Porting

Your App

Your Data Model

Your Mind

photo credit: michelhrv

Google App Engine Helper for Django

it’s a mouthful

Your App

... helper for Django

google-app-engine-django vs app-engine-patch: not all that different

A quick word on libraries

Your App

helper tries to get the models to work with existing django architecture

patch is more interested in the additional tools

i’ve never been the type to use many of the “features” of django, probably largely because i never was able to make use of the orm, so my interest with helper was more towards core functionality, testing

... helper for Django

you’ll need to base your models on a new class that wraps the App Engine and Django models together

Step #1: Port your models

it’d be great for this to be handled automatically for most simple cases, it’s open source, come help out

Bonus Tip!

Your App

Warning!

complex sql queries that don’t make sense in the datastore will have to be re-written and re-thought :/

... helper for Django

most pure python libraries will work fine, but you’ll probably have to make zip files for deployment

Step #2: Package your dependencies

we’ve built some tools for this in JaikuEngine, check out build.py, expect them to be added to the helper library as well

Bonus Tip!
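A rough idea of what zipping a pure python dependency involves, using only the standard library. This is not JaikuEngine's build.py; `zip_package` and the throwaway `demo_pkg_for_zip` package are made up for the example. Python can import directly from a zip file placed on `sys.path`.

```python
import os
import sys
import tempfile
import zipfile


def zip_package(src_dir, zip_path):
    """Write every .py file under src_dir into zip_path, keeping relative paths."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    archive.write(full, os.path.relpath(full, src_dir))


# Build a throwaway one-module "library" so the sketch is self-contained.
work_dir = tempfile.mkdtemp()
lib_dir = os.path.join(work_dir, "lib")
os.makedirs(os.path.join(lib_dir, "demo_pkg_for_zip"))
with open(os.path.join(lib_dir, "demo_pkg_for_zip", "__init__.py"), "w") as handle:
    handle.write("ANSWER = 42\n")

zip_path = os.path.join(work_dir, "demo_pkg_for_zip.zip")
zip_package(lib_dir, zip_path)

sys.path.insert(0, zip_path)  # the archive itself goes on the import path
import demo_pkg_for_zip
```

Libraries that load data files relative to `__file__` may need extra care, since those paths now point inside an archive.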

Your App

... helper for Django

if you are trying to use a lot of django-* apps that do complex database stuff you may have to port them as well, some database thinking just doesn’t transfer

Step #3: Check your “apps”

Your App

Disclaimer

I’m not really the biggest fan of the “app” metaphor in Django to begin with, but we can argue later :p

... helper for Django

the App Engine SDK comes with a pretty decent admin that, amongst other things, includes a console and data viewer

Step #4: Use /_ah/admin

Your App

and now for something a little different

Call to Arms

wait a sec, monkey patches?

Your App

Call to Arms

there’s no good reason; poor communication

Why isn’t it done yet

Your App

Call to Arms

a great resource for django developers; zero to launch in seconds

Why it should be done

Your App

Let’s Fix It

django.contrib.appengine?

Your App

Let’s Fix It

google-app-engine-django and app-engine-patch

We already have some libraries

Your App

a bunch of good code, but still relatively hacky

but they are hacking around django instead of working with it

Let’s Fix It

Most database features have obvious translations, some may have to be cut, the db code needs to expect this

A new database backend

Your App

approaches that are efficient in one style of database are not necessarily efficient in another

Let’s Fix It

django needs to be zipped to fit under app engine’s file limit, it’s an easy tool to write

Some lightweight support for packaging

Your App

Let’s Fix It

for deployment, for testing under app engine sdk, simple stuff that has already mostly been written

New manage.py commands

Your App

Let’s Fix It

it’ll need to be abstracted to work cross platform

Find and pulverize raw SQL

Your App

we can do it

Questions?

photo credit: sombraala

Andy Smith <[email protected]>