why sql alchemy impresses me

7
Friday, January 12, 2007 Why SQLAlchemy impresses me One of the reasons ORM tools have a spotted reputation is that it's really, really easy to write a dumb ORM that works fine for simple queries but performs like molasses once you start throwing real data at it. Let me give an example of a situation where, to my knowledge, only SQLAlchemy of the Python (or Ruby) ORMs is really able to handle things elegantly, without gross hacks like "piggy backing." Often you'll see a one-to-many relationship where you're not always interested in all of the -many side. For instance, you might have a users table, each associated with many orders. In SA you'd first define the Table objects, then create a mapper that's responsible for doing The Right Thing when you write "user.orders." (I'm skipping connecting to the database for the sake of brevity, but that's pretty simple. I'm also avoiding specifying columns for the Tables by assuming they're in the database already and telling SA to autoload them. Besides keeping this code shorter, that's the way I prefer to work in real projects.) users = Table('users', metadata, autoload=True) orders = Table('orders', metadata, autoload=True) class User(object): pass class Order(object): pass mapper(User, users, properties={ 'orders':relation(mapper(Order, orders), order_b }) That "properties" dict says that you want your User class to provide an "orders" attribute, mapped to the orders table. If you are using a sane database, SQLAlchemy will automatically use the foreign keys it finds in the relation; you don't need to explicitly specify that it needs to join on "orders.user_id = user.id." We can thus write for user in session.query(User).select(): print user.orders So far this is nothing special: most ORMs can do this much. Most can also specify whether to do eager loading for the orders -- where all the data is pulled out via joins in the first About Me JONATHAN ELLIS jbellis at gmail dot com View my complete profile Follow Jonathan on Twitter here Most popular Cassandra: fact vs fiction Cassandra in action Why you won't be building your killer app on a DHT All you ever wanted to know about writing Bloom Filters CouchDB: Not drinking the kool-aid FormAlchemy 1.0 Scala: first impressions Anders Heljsberg doesn't grok Python Blog Archive 2011 (1) 2010 (8) 2009 (17) 2008 (33) 2007 (40) December (2) November (1) Compartir Informar sobre mal uso Siguiente blog» Crear un blog Acceder Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html 1 de 7 30/07/2011 20:14

Upload: apmapm

Post on 06-Mar-2015

86 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Why SQL Alchemy Impresses Me

Friday, January 12, 2007Why SQLAlchemy impresses me

One of the reasons ORM tools have a spotted reputation is thatit's really, really easy to write a dumb ORM that works fine forsimple queries but performs like molasses once you startthrowing real data at it.

Let me give an example of a situation where, to myknowledge, only SQLAlchemy of the Python (or Ruby) ORMs isreally able to handle things elegantly, without gross hacks like"piggy backing."

Often you'll see a one-to-many relationship where you're notalways interested in all of the -many side. For instance, youmight have a users table, each associated with many orders.In SA you'd first define the Table objects, then create a mapperthat's responsible for doing The Right Thing when you write"user.orders."

(I'm skipping connecting to the database for the sake ofbrevity, but that's pretty simple. I'm also avoiding specifyingcolumns for the Tables by assuming they're in the databasealready and telling SA to autoload them. Besides keeping thiscode shorter, that's the way I prefer to work in real projects.)

users = Table('users', metadata, autoload=True)orders = Table('orders', metadata, autoload=True)

class User(object): passclass Order(object): pass

mapper(User, users, properties={ 'orders':relation(mapper(Order, orders), order_b })

That "properties" dict says that you want your User class toprovide an "orders" attribute, mapped to the orders table. Ifyou are using a sane database, SQLAlchemy will automaticallyuse the foreign keys it finds in the relation; you don't need toexplicitly specify that it needs to join on "orders.user_id =user.id."

We can thus write

for user in session.query(User).select(): print user.orders

So far this is nothing special: most ORMs can do this much.Most can also specify whether to do eager loading for theorders -- where all the data is pulled out via joins in the first

About Me

JONATHAN

ELLIS

jbellis at

gmail dot

com

View my complete profile

Follow Jonathan onTwitter here

Most popular

Cassandra: fact vs fiction

Cassandra in action

Why you won't be buildingyour killer app on a DHT

All you ever wanted toknow about writingBloom Filters

CouchDB: Not drinking thekool-aid

FormAlchemy 1.0

Scala: first impressions

Anders Heljsberg doesn'tgrok Python

Blog Archive

► 2011 (1)

► 2010 (8)

► 2009 (17)

► 2008 (33)

▼ 2007 (40)

► December (2)

► November (1)

Compartir Informar sobre mal uso Siguiente blog» Crear un blog Acceder

Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html

1 de 7 30/07/2011 20:14

Page 2: Why SQL Alchemy Impresses Me

select() -- or lazy loading, where orders are loaded via aseparate query each time the attribute is accessed. Either ofthese can be "the right way" for performance, depending onthe use case.

The tricky part is, what if I want to generate a list of all usersand the most recent order for each? The naive way is to write

class User: @property def max_order(self): return self.orders[-1]

for user in session.query(User).select(): print user, user.max_order

This works, but it requires loading all the orders when we arereally only interested in one. If we have a lot of orders, thiscan be painful.

One solution in SA is to create a new relation that knows howto load just the most recent order. Our new mapper will looklike this:

mapper(User, users, properties={ 'orders':relation(mapper(Order, orders), order_b 'max_order':relation(mapper(Order, max_orders, n })

("non_primary" means the second mapper does not definepersistence for Orders; you can only have one primary mapperat a time. "viewonly" means you can't assign to this relationdirectly.)

Now we have to define "max_orders." To do this, we'll leverageSQLAlchemy's ability to map not just Tables, but anySelectable:

max_orders_by_user = select([func.max(orders.c.order_id).la group_by=[orders.c.user_id]).almax_orders = orders.select(orders.c.order_id==max_orders_by

"max_orders_by_user" is a subselect whose rows are the maxorder_id for each user_id. Then we use that to definemax_orders as the entire order row joined to that subselect onuser_id.

We could define this as eager-by-default in the mapper, but inthis scenario we only want it eager on a per-query basis. Thatlooks like this:

q = session.query(User).options(eagerload('max_order'))for user in q.select(): print user, user.max_order

For fun, here's the sql generated:

SELECT users.user_name AS users_user_name, users.user_id AS anon_760c.order_id AS anon_760c_order_id, anon_760c.use anon_760c.description AS anon_760c_description, anon_760c.isopen AS anon_760c_isopen

► October (4)

► September (4)

► July (3)

► June (2)

► May (2)

► April (5)

► March (2)

► February (7)

▼ January (8)

Komodo 4 released;new free version

PyCon SQLAlchemytutorial full

Caution: upgrading tonew version ofblogger may i...

Abstract of "AdvancedPostgreSQL, part 1"

Why SQLAlchemyimpresses me

MySQL backendperformance

Fun with three-valuedlogic

Good advice forTortoise SVN users

► 2006 (38)

► 2005 (78)

Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html

2 de 7 30/07/2011 20:14

Page 3: Why SQL Alchemy Impresses Me

FROM users LEFT OUTER JOIN ( SELECT orders.order_id AS order_id, orders.user_id AS u orders.description AS description, orders.isopen AS FROM orders, ( SELECT max(orders.order_id) AS order_id FROM orders GROUP BY orders.user_id) AS max_orders_ WHERE orders.order_id = max_orders_by_user.order_id) ASON users.user_id = anon_760c.user_id ORDER BY users.oid, anon_760c.oid

In SQLAlchemy, easy things are easy; hard things take someeffort up-front, but once you have your relations defined, it'salmost magical how it pulls complex queries together for you.

.................

I'm giving a tutorial on Advanced Databases with SQLAlchemyat PyCon in February. Feel free to let me know if there isanything you'd like me to cover specifically.

Posted by Jonathan Ellis at 10:04 PM

18 comments:

Anonymous said...

Your "that's the way I prefer to work" link isn't quiteright...

10:37 AM

Jonathan Ellis said...

Fixed.

10:41 AM

Anonymous said...

"In SA you'd first define the Table objects, then create amapper that's responsible for doing The Right Thenwhen you write "user.orders."

Should that be "The Right Thing"? Nice post.

2:34 PM

Jonathan Ellis said...

Quite right. Fixed that too. :)

3:40 PM

Anonymous said...

Ugh, three levels of selects, that's some ugly SQL. Iknow MySQL loves to choke on that kind of query.

You are using a sub-sub-select to get all users and theirlast order, then joining that to a sub-select to get theorder information for that last order per user, and thenagain an outer select to join the users to get theirinformation.

Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html

3 de 7 30/07/2011 20:14

Page 4: Why SQL Alchemy Impresses Me

Why not use a single sub-select as a "relational" table tolink the users and order tables. Like so:

SELECT u.*, o.*FROM users AS uLEFT JOIN(SELECT user_id, max(order_id) AS order_idFROM ordersGROUP BY user_id) AS last_orderON u.user_id = last_order.user_idINNER JOIN orders AS o ON last_order.order_id =o.order_idORDER BY u.user_id

10:28 PM

Jonathan Ellis said...

I found my way more intuitive, but I'm pretty sure adatabase with a decent optimizer will generate the sameplan for both. (This may not include MySQL, as you say,but I don't see a point in writing baroque code toaccommodate a database that I never use. :)

10:40 PM

Anonymous said...

It doesn't impress me.

I think the generic SQL that ORMs tend to generatemight be useful only part of the time. That's why forreally caring ORMs, they make it easy for you tocustomize the SQL to your unique needs.

Maybe iBATIS is one of the most famous and impressiveones in doing that?

I have my own as well, which is "impressive" in its ownway. hehe. It follows more of the way iBATIS works,without the XML need and with lots of conventions. :-)

Anyway, just ask yourself how could your ORM becomemore popular and solid at the same time? Is it easy toget there?

Best of luck to you.

3:42 AM

Jonathan Ellis said...

SQLAlchemy allows custom SQL for several meanings ofthe phrase, but the point is, most of the time you don'tneed it. Which is the best of both worlds IMO.

(SA is not "my" ORM; I just recognize good softwarewhen I see it.)

6:56 AM

Anonymous said...

Python and Ruby ORMs? Dunno. The main perl one

Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html

4 de 7 30/07/2011 20:14

Page 5: Why SQL Alchemy Impresses Me

handles this fine. DBIx::Class will happily prefetch allsorts of complex things in a single relation, and it's alsotrivial to select parts of a one-many rel to prefetch onthe fly if you know that's all you're going to need.

8:39 AM

Jonathan Ellis said...

I do not see anything in the DBIx::Class documentationthat implies that it can do this cleanly (and I see somesuggestions that it cannot), but if you are correct, thenit's good that people who have to use Perl have thatoption. :)

9:58 AM

Timur Izhbulatov said...

Good job! Thanks! I've just started playing with SA andyour post is really helpful.

1:44 PM

zzzeek said...

my impression is that iBatis is strictly a rowset-mapper.that in itself is maybe 10% of SQLAlchemy, which if youprefer allows completely straight textual SQL that canbe turned into objects at the result set level just likeiBatis. But theres no need for that level of tedium.

the eager loading queries Jonathan illustrates work justgreat with MySQL. SA has several hundred unit tests,dozens of which contain far more exotic queries thanthat, all of which pass 100% on MySQL and are in useon many production systems today.

also, SQLAlchemy has been adopted by virtually everysub-category of the Python community in less than ayear after it was first released so we are doing OK in thepopularity department.

8:11 PM

Raven said...

Does SQLAlchemy do any kind of internal caching?

For example, if you ask for the same data twice (or anobvious subset of the initially requested data) will thedatabase be hit once or twice?

I recently wrote a caching database abstraction layer foran application and (while fun) it was a fair bit of work toget it to a minimally functional state. If SQLAlchemy didthat I would seriously consider jumping on thebandwagon.

I've found things in the docs that imply something likethis might be going on, but nothing explicit.

4:36 PM

Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html

5 de 7 30/07/2011 20:14

Page 6: Why SQL Alchemy Impresses Me

Jonathan Ellis said...

No; the author of SA [rightly, IMO] considers caching aseparate concern.

What you saw in the docs is probably the SA identitymap, which makes it so if you load an instance in twodifferent places, they will refer to the same object. Butthe database will still be queried twice, so it is not acache in the sense you mean.

6:56 PM

zzzeek said...

what youre talking about is a second level cache, whichis distinct from SA's "session" which is cache-like, butthats not its main purpose (identity mapping is its mainpurpose).

SA doesnt have an integrated second-level cache at themoment. i think theyre a little bit overrated and also itsquite easy to write your own that is closely tailored tohow your app functions (since theyre usually justdictionaries).

to design a generic one that does what everybody wantswithout being impossible to configure/understand is alarge undertaking, but we should eventually get one. ineed to make a huge pass through myghtyutils first andclean it up....that would probably be the backend.

7:54 PM

Matt said...

Jonathan,

In this line:'max_order':relation(mapper(Order, max_orders,non_primary=True), uselist=False, viewonly=True),

How does SA know to join on the user_id column? Youhaven't set up any foreign keys, nor have you specifiedwhere to join.

I've got a tricky problem with an association table -- andI think this technique (of using selects) would work, butI'm having trouble connecting all the pieces.

10:36 AM

Jonathan Ellis said...

I did have FKs in the database, so autoload picked themup. Sorry that wasn't clear.

11:46 AM

love encounter flow said...

i am impressed by sqlalchemy too but i recently dumpedit in favor of storm. i've been finding myself far toooften in the jungle of documentation, trying to figureout how to do this, how to do that. frequently i was

Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html

6 de 7 30/07/2011 20:14

Page 7: Why SQL Alchemy Impresses Me

Newer Post Older Post

Post a Comment

Home

Subscribe to: Post Comments (Atom)

unable to come up with a solution and had to write sqlanyway. sqlalchemy is just too complicated, and i feel ihave to jump thru too many hoops to get things done.also i got a lot of strange errors i was unable to resolvewith that relations things, when you tie several recordsfrom several tables together. i write a lot more raw sqlnow which i do not particularly like, but at least thelevel of complexity has decreased. using a db backendshould not be so hard. my suspicion is that the currentway to use orms to access db data is ultimately not theway to go, that there is a better solution waiting to bediscovered.

5:07 AM

Jonathan Ellis's Programming Blog - Spyced: Why SQLAlchemy impres... http://spyced.blogspot.com/2007/01/why-sqlalchemy-impresses-me.html

7 de 7 30/07/2011 20:14