Download - Advanced Django ORM techniques
![Page 1: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/1.jpg)
Advanced Django ORM techniques
Daniel Roseman http://blog.roseman.org.uk
![Page 2: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/2.jpg)
About Me
• Python user for five years
• Discovered Django four years ago
• Worked full-time with Python/Django since 2008.
• Top Django answerer on StackOverflow!
• Occasionally blog on Django, concentrating on efficient use of the ORM.
![Page 3: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/3.jpg)
Contents
• Behind the scenes: models and fields
• How model relationships work
• More efficient relationships
• Other optimising techniques
![Page 4: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/4.jpg)
Django ORM efficiency: a story
![Page 5: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/5.jpg)
![Page 6: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/6.jpg)
414 queries!
![Page 7: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/7.jpg)
How can you stop this happening to you?
http://www.flickr.com/photos/m0n0/4479450696
![Page 8: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/8.jpg)
Behind the scenes: models and fields
http://www.flickr.com/photos/spacesuitcatalyst/847530840
![Page 9: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/9.jpg)
Defining a model
• Model structure initialised via metaclass
• Called when model is first defined
• Resulting model class stored in cache to use when instantiated
![Page 10: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/10.jpg)
Fields
• Fields have contribute_to_class
• Adds methods, eg get_FOO_display()
• Enables use of descriptors for field access
![Page 11: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/11.jpg)
Model metadata
• Model._meta
• .fields
• .get_field(fieldname)
• .get_all_related_objects()
![Page 12: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/12.jpg)
Model instantiation
• Instance is populated from database initially
• Has no subsequent relationship with db until save
• No identity between models
![Page 13: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/13.jpg)
Querysets
• Model manager returns a queryset: foos = Foo.objects.all()
• Queryset is an ordered list of instances of a single model
• No database access yet
• Slice: foos[0]
• Iterate: {% for foo in foos %}
![Page 14: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/14.jpg)
Where do all those queries come from?
• Repeated queries
• Lack of caching
• Relational lookup
• Templates as well as views
![Page 15: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/15.jpg)
Repeated queries
def get_absolute_url(self): return "%s/%s" % ( self.category.slug, self.slug )
Same category, but query is repeated for each article
![Page 16: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/16.jpg)
• Same link on every page
• Dynamic, so can't go in urlconf
• Could be cached or memoized
Repeated queries
![Page 17: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/17.jpg)
Relationships
http://www.flickr.com/photos/katietegtmeyer/124315322
![Page 18: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/18.jpg)
Relational lookups
• Forwards:
foo.bar.field
• Backwards:
bar.foo_set.all()
![Page 19: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/19.jpg)
Example modelsclass Foo(models.Model):
name = models.CharField(max_length=10)
class Bar(models.Model):
name = models.CharField(max_length=10)
foo = models.ForeignKey(Foo)
![Page 20: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/20.jpg)
Forwards relationship
>>> bar = Bar.objects.all()[0]
>>> bar.__dict__
{'id': 1, 'foo_id': 1, 'name': u'item1'}
![Page 21: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/21.jpg)
Forwards relationship
>>> bar.foo.name
u'item1'
>>> bar.__dict__
{'_foo_cache': <Foo: Foo object>, 'id': 1, 'foo_id': 1, 'name': u'item1'}
![Page 22: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/22.jpg)
Fowards relationships
• Relational access implemented via a descriptor:django.db.models.fields.related.SingleRelatedObjectDescriptor
• __get__ tries to access _foo_cache
• If doesn't exist, does lookup and creates cache
![Page 23: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/23.jpg)
select_related
• Automatically follows foreign keys in SQL query
• Prepopulates _foo_cache
• Doesn't follow null=True relationships by default
• Makes query more expensive, so be sure you need it
![Page 24: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/24.jpg)
Backwards relationships{% for foo in my_foos %}
{% for bar in foo.bar_set.all %}
{{ bar.name }}
{% endfor %}
{% endfor %}
![Page 25: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/25.jpg)
Backwards relationships
• One query per foo
• If you iterate over foo_set again, you generate a new set of db hits
• No _foo_cache
• select_related does not work here
![Page 26: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/26.jpg)
Optimising backwards relationships
• Get all related objects at once
• Sort by ID of parent object
• Then cache in hidden attribute as with select_related
![Page 27: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/27.jpg)
qs = Foo.objects.filter(criteria=whatever)obj_dict = dict([(obj.id, obj) for obj in qs])objects = Bar.objects.filter(foo__in=qs)relation_dict = {}for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj)for id, related in relation_dict.items(): obj_dict[id]._related = related
![Page 28: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/28.jpg)
qs = Foo.objects.filter(criteria=whatever)obj_dict = dict([(obj.id, obj) for obj in qs])objects = Bar.objects.filter(foo__in=qs)relation_dict = {}for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj)for id, related in relation_dict.items(): obj_dict[id]._related = related
![Page 29: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/29.jpg)
qs = Foo.objects.filter(criteria=whatever)obj_dict = dict([(obj.id, obj) for obj in qs])objects = Bar.objects.filter(foo__in=qs)relation_dict = {}for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj)for id, related in relation_dict.items(): obj_dict[id]._related = related
![Page 30: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/30.jpg)
qs = Foo.objects.filter(criteria=whatever)obj_dict = dict([(obj.id, obj) for obj in qs])objects = Bar.objects.filter(foo__in=qs)relation_dict = {}for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj)for id, related in relation_dict.items(): obj_dict[id]._related = related
![Page 31: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/31.jpg)
qs = Foo.objects.filter(criteria=whatever)obj_dict = dict([(obj.id, obj) for obj in qs])objects = Bar.objects.filter(foo__in=qs)relation_dict = {}for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj)for id, related in relation_dict.items(): obj_dict[id]._related = related
![Page 32: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/32.jpg)
qs = Foo.objects.filter(criteria=whatever)obj_dict = dict([(obj.id, obj) for obj in qs])objects = Bar.objects.filter(foo__in=qs)relation_dict = {}for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj)for id, related in relation_dict.items(): obj_dict[id]._related = related
![Page 33: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/33.jpg)
Optimising backwards[{'time': '0.000', 'sql': u'SELECT "foobar_foo"."id", "foobar_foo"."name" FROM "foobar_foo"'},
{'time': '0.000', 'sql': u'SELECT "foobar_bar"."id", "foobar_bar"."name", "foobar_bar"."foo_id" FROM "foobar_bar" WHERE "foobar_bar"."foo_id" IN (SELECT U0."id" FROM "foobar_foo" U0)'}]
![Page 34: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/34.jpg)
Optimising backwards
• Still quite expensive, as can mean large dependent subquery – MySQL in particular very bad at these
• But now just two queries instead of n
• Not automatic – need to remember to use _related_items attribute
![Page 35: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/35.jpg)
Generic relations• Foreign key to ContentType, object_id
• Descriptor to enable direct access
• iterating through creates n+m queries(n=number of source objects,m=number of different content types)
• ContentType objects automatically cached
• Forwards relationship creates _foo_cache
• but select_related doesn't work
![Page 36: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/36.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 37: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/37.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 38: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/38.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 39: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/39.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 40: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/40.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 41: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/41.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 42: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/42.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 43: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/43.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 44: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/44.jpg)
generics = {}for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id)content_types = ContentType.objects.in_bulk( generics.keys())relations = {}for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects.\ in_bulk(list(fk_list))for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
![Page 45: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/45.jpg)
Other optimising techniques
![Page 46: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/46.jpg)
Memoizing
• Cache property on first access
• Can cache within instance, if multiple accesses within same request
def get_expensive_items(self): if not hasattr(self, '_cache'): self._cache = self.expensive_op() return self._cache
![Page 47: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/47.jpg)
DB Indexes
• Pay attention to slow query log and debug toolbar output
• Add extra indexes where necessary - especially for multiple-column lookup
• Use EXPLAIN
![Page 48: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/48.jpg)
Outsourcing
• Does all the logic need to go in the web app?
• Services - via eg Piston
• Message queues
• Distributed tasks, eg Celery
![Page 49: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/49.jpg)
Summary
• Understand where queries are coming from
• Optimise where necessary, within Django or in the database
• and...
![Page 50: Advanced Django ORM techniques](https://reader034.vdocuments.us/reader034/viewer/2022050613/54c4b6834a7959f77e8b45bf/html5/thumbnails/50.jpg)
PROFILE