pydiomatic

Idiomatic Python

Enrico Franchi

1

Upload: rik0

Post on 14-May-2015

DESCRIPTION

Python is a high-level language focused on readability. The Python community developed the concept of "Pythonic code", which requires not only semantic correctness but also conformity to widely acknowledged stylistic criteria. A prerequisite for writing Pythonic code is writing idiomatic code. Using the right idioms is a matter of acquired taste and experience; however, some idioms are quite easy to learn. This presentation focuses on some of these idioms and other stylistic criteria:

* for vs. while
* iterators, itertools
* code conventions (space invaders)
* avoid default values bugs
* first-class functions
* internal/external iterators
* substituting the switch statement
* properties, attributes, read only objects
* named tuples
* duck typing
* bits of metaprogramming
* exception management: LBYL vs. EAFP

TRANSCRIPT

Page 2: Pydiomatic

Could you please lend me the thing that you put in the wall when you want to turn on the hairdryer and

the hairdryer comes from a different country?

Could you please lend me a power adapter?

2

Page 3: Pydiomatic

If you are out to describe the truth, leave elegance to the tailor.

Albert Einstein

3

Page 4: Pydiomatic

Debugging is twice as hard as writing the code in the first place.

Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

Brian Kernighan

4

Page 5: Pydiomatic

READABILITY COUNTS

Zen of Python

5

Page 6: Pydiomatic

TOC

Iteration

Naming

Functions are objects

Choice

Attributes and methods

Duck Typing

Exceptions [unless TimeoutError is thrown]

6

Page 7: Pydiomatic

FOR vs. WHILE vs. ...

Iteration vs. Recursion

sys.setrecursionlimit(n)

for vs. while

Traditionally bounded iteration vs. unbounded iteration

In C for and while are completely equivalent

Some languages have for/foreach to iterate on collections

    for file in *.py; do
        pygmentize -o ${file%.py}.rtf $file
    done

7

Page 8: Pydiomatic

Numerical Iteration

    int i = 0;
    while(i < MAX) {
        printf("%d\n", i);
        ++i;
    }

    int i = 0;
    for(i=0; i < MAX; ++i) {
        printf("%d\n", i);
    }

    i = 0
    while i < MAX:
        print i
        i += 1

    # O(n) space
    for i in range(MAX):
        print i

    # O(1) space
    for i in xrange(MAX):
        print i

8

Page 9: Pydiomatic

Iteration on elements

It is also common to iterate on elements of some collection

C uses indices to iterate on array elements

Python uses for

What if we want to iterate both on elements and indices?

BAD

    i = 0
    while i < len(lst):
        process(lst[i])
        i += 1

GOOD

    for el in lst:
        process(el)

9

Page 10: Pydiomatic

BAD

    j = 0
    while j < len(lst):
        process(index=j, element=lst[j])
        j += 1

BAD

    for j in range(len(lst)):
        process(index=j, element=lst[j])

GOOD

    for j, el in enumerate(lst):
        process(index=j, element=el)

10

Page 11: Pydiomatic

What about Turing?

for is usually considered the more pythonic alternative

Ideally every iteration should be done using for

However, we have shown only iteration over finite collections; restricted to those, for would not provide Turing-completeness

But everybody knows about generators: Python has infinite (lazy) sequences and they cover many other patterns as well
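A minimal sketch of the point about generators, written in Python 3 syntax (the slides use Python 2). The names `naturals` and `first_square_above` are made up for illustration; the idea is only that a `for` loop over a lazy, infinite sequence can express an unbounded search.

```python
import itertools

def naturals(start=0):
    """Yield start, start + 1, ... forever (a lazy, infinite sequence)."""
    n = start
    while True:
        yield n
        n += 1

def first_square_above(limit):
    """Return the first perfect square greater than limit."""
    # An unbounded search written with `for`: the loop ends only via return.
    for n in naturals():
        if n * n > limit:
            return n * n

# islice takes a finite prefix of the infinite sequence.
first_five = list(itertools.islice(naturals(), 5))
```

The only `while` left is the one hidden inside the generator; client code sticks to `for`.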

11

Page 12: Pydiomatic

Design Implications

Python's for statement uses external iterators, which are extremely easy to implement through generators

itertools provides lots of functions to manipulate iterators

The iteration logic is pushed inside the iterator; the client code becomes totally agnostic on how values are generated
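As a rough illustration of client code that is agnostic about where values come from, the sketch below (Python 3 syntax; `running_total` and `countdown` are hypothetical helpers, not from the slides) feeds the same consumer a list, a generator and an unbounded `itertools.count`.

```python
import itertools

def running_total(values):
    """Yield the partial sums of any iterable, finite or not."""
    total = 0
    for value in values:
        total += value
        yield total

def countdown(n):
    """A generator used as an external iterator: n, n - 1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

from_list = list(running_total([1, 2, 3]))
from_generator = list(running_total(countdown(3)))
# itertools.count(1) never ends, so takewhile bounds the iteration.
bounded = list(itertools.takewhile(lambda t: t < 10,
                                   running_total(itertools.count(1))))
```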

12

Page 13: Pydiomatic

    import socket

    def server_socket(host, port):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((host, port))
        sock.listen(5)
        csock, info = sock.accept()
        return csock.makefile('rw')

    def server(host, port):
        fh = server_socket(host, port)
        for i, line in enumerate(fh):
            if line == "EOF\r\n":
                break
            fh.write("%4.d:\t%s" % (i, line))
        fh.close()

... (Forking)TCPServer and higher level modules and frameworks are better!

13

Page 14: Pydiomatic

    import collections

    def depth_first_visit(node):
        stack = [node, ]
        while stack:
            current_node = stack.pop()
            stack.extend(reversed(current_node.children))
            yield current_node.value

    def breadth_first_visit(node):
        queue = collections.deque((node, ))
        while queue:
            current_node = queue.popleft()
            queue.extend(current_node.children)
            yield current_node.value

    for v in depth_first_visit(tree):
        print v,
    print

    for v in breadth_first_visit(tree):
        print v,
    print
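The slides never show the tree type, so the sketch below assumes a hypothetical `Node` class with `value` and `children` attributes and a small made-up tree. It is written in Python 3 syntax and simply runs the two generator-based visits.

```python
import collections

class Node(object):
    """Hypothetical tree node: a value plus a list of child nodes."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def depth_first_visit(node):
    stack = [node]
    while stack:
        current = stack.pop()
        # Reversed so that the leftmost child is popped first.
        stack.extend(reversed(current.children))
        yield current.value

def breadth_first_visit(node):
    queue = collections.deque([node])
    while queue:
        current = queue.popleft()
        queue.extend(current.children)
        yield current.value

# A has children B and C; B has children D and E.
tree = Node('A', [Node('B', [Node('D'), Node('E')]), Node('C')])
dfs_order = list(depth_first_visit(tree))
bfs_order = list(breadth_first_visit(tree))
```

The two functions differ only in the container (a list used as a stack versus a deque used as a queue); the client loop is identical.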

14

Page 15: Pydiomatic

PEP-8

http://www.python.it/doc/articoli/pep-8.html

'''One of Guido's key insights is that code is read much more often than it is written. The guidelines provided here are intended to improve the readability of code and make it consistent across the wide spectrum of Python code. As PEP 20 says, “Readability counts”.'''

http://www.python.org/dev/peps/pep-0008/

15

Page 16: Pydiomatic

PEP-8 (II)

Standard for source code style

names

whitespace

indentation

Consistency with this style guide is important.

Consistency within a project is more important.

Consistency within one module or function is most important.

16

Page 17: Pydiomatic

Indentation

4 spaces, don’t mix tabs and spaces

79 characters per line max

Wrap lines using implied line continuation inside (), [] and {}

Add parentheses to wrap lines

Sometimes backslash is more appropriate

Newline after operators

One blank line between functions, two between classes

(not filename.startswith('.') and filename.endswith(('.pyc', '.pyo')))

17

Page 18: Pydiomatic

Space Invaders

Put a space after “,” [parameters, lists, tuples, etc]

Put a space after “:” in dicts, not before

Put spaces around assignments and comparisons

Unless it is an argument list

No spaces just inside parentheses or just before argument lists
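A few lines that try to follow the spacing rules listed above; the function and variable names are invented for the example.

```python
def scale(values, factor=2, offset=0):   # no spaces around "=" in defaults
    return [factor * v + offset for v in values]

limits = {'low': 0, 'high': 10}          # space after ":" and ",", not before
result = scale([1, 2, 3], factor=3)      # no spaces just inside ( ) or [ ]
in_range = limits['low'] <= result[0] <= limits['high']   # spaces around comparisons
```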

18

Page 19: Pydiomatic

Naming conventions (I)

Always use descriptive names; the longer the scope, the longer the name

Trailing underscore: avoids conflict with keywords or builtins (class_)

Leading underscore: “internal use”/non-public

Double leading underscore: name mangling

Double leading and trailing: “magic”

Avoid l, 1 and similar confusing names
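The underscore conventions can be seen side by side in a small sketch. The `Account` class and its attributes are hypothetical; the name-mangling behaviour itself is standard Python.

```python
class Account(object):
    def __init__(self, owner, class_='standard'):
        self.owner = owner
        self.class_ = class_      # trailing underscore: "class" is a keyword
        self._balance = 0         # single leading underscore: internal use
        self.__pin = '0000'       # double leading underscore: name-mangled

    def __repr__(self):           # double leading and trailing: "magic" method
        return 'Account(%r)' % self.owner

account = Account('alice')
mangled = account._Account__pin   # the mangled name is still reachable
has_plain_name = hasattr(account, '__pin')
```

Mangling is a tool for avoiding accidental clashes in subclasses rather than a privacy mechanism, which is why the mangled name remains accessible.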

19

Page 20: Pydiomatic

Naming conventions (II)

[Table: naming styles (simple, lower_case, CamelCase, ALL_CAPS) marked for each kind of name: classes, variables, methods, functions, constants, packages, modules]

... and self/cls first argument name for methods

20

Page 21: Pydiomatic

Default values

Default values are evaluated once, when the function is defined, and are shared among all calls

If the default value is a mutable object, that leads to bugs

    >>> def f(x=[]):
    ...     x.append(1)
    ...     return x
    ...
    >>> f()
    [1]
    >>> f()
    [1, 1]
    >>> f()
    [1, 1, 1]

    >>> def g(x=None):
    ...     x = [] if x is None else x
    ...     return x
    ...
    >>> g()
    []
    >>> g([1, 2])
    [1, 2]
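To make the sharing visible, the sketch below inspects the function's `__defaults__` attribute, where Python keeps the evaluated default values. The function names are illustrative only.

```python
def append_one(x=[]):         # the list is created once, at definition time
    x.append(1)
    return x

first = append_one()
second = append_one()
same_object = first is second                 # both calls returned the default
stored_default = append_one.__defaults__[0]   # where the shared list lives

def append_one_fixed(x=None):
    if x is None:             # a fresh list is built on every call
        x = []
    x.append(1)
    return x
```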

21

Page 22: Pydiomatic

Functions are Objects

In Python everything is an object

Thus, functions are objects

Functions can be passed as arguments (easy)

Functions can be returned as return values

Some APIs explicitly expect functions as arguments (sort(key=))

    import sys, urllib

    def reporthook(*a):
        print a

    for url in sys.argv[1:]:
        i = url.rfind('/')
        file = url[i+1:]
        print url, "->", file
        urllib.urlretrieve(url, file, reporthook)
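The bullet points on this slide (passing functions, returning them, and APIs such as `sort(key=)`) can be sketched without any network access. The helper names below are invented for the example.

```python
def by_length(word):
    """Key function: passed to sorted(), which calls it once per element."""
    return len(word)

def make_multiplier(factor):
    """Return a new function that remembers `factor` (a closure)."""
    def multiply(x):
        return factor * x
    return multiply

words = ['iterator', 'for', 'while', 'generator']
shortest_first = sorted(words, key=by_length)
triple = make_multiplier(3)
actions = {'double': make_multiplier(2), 'triple': triple}   # stored like any object
```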

22

Page 23: Pydiomatic

Internal Iterators

    import collections
    import sys

    def dfs(node, action):
        stack = [node, ]
        while stack:
            current_node = stack.pop()
            stack.extend(reversed(current_node.children))
            action(current_node.value)

    def bfs(node, action):
        queue = collections.deque((node, ))
        while queue:
            current_node = queue.popleft()
            queue.extend(current_node.children)
            action(current_node.value)

    dfs(tree, lambda x: sys.stdout.write("%s, " % x))

23

Page 24: Pydiomatic

    def dfs(node, pre_action=None, post_action=None):
        def nop(node):
            pass
        pre_action = pre_action or nop    # bad, use if
        post_action = post_action or nop  # bad
        stack = []

        def process_node(n):
            def do_pre():
                pre_action(n.value)
            def do_post():
                post_action(n.value)
            def do_process():
                stack.append(do_post)
                for child in reversed(n.children):
                    stack.append(process_node(child))
                stack.append(do_pre)
            return do_process

        stack.append(process_node(node))
        while stack:
            action = stack.pop()
            action()

    dfs(tree, pre_action=lambda x: sys.stdout.write("%s, " % x))
    print
    dfs(tree, post_action=lambda x: sys.stdout.write("%s, " % x))
    print

24

Page 25: Pydiomatic

[Figure: the stack contents after each of the 15 steps of dfs on an example tree (root A with children B and C; C with children D and E), with entries marked as Pre, Proc or Post actions]

25

Page 26: Pydiomatic

    def dfs(node, pre_action=None, post_action=None):
        def nop(node):
            pass
        pre_action = pre_action or nop
        post_action = post_action or nop
        stack = []

        def process_node(n):
            def do_pre():
                pre_action(n.value)
            def do_post():
                post_action(n.value)
            def do_process():
                stack.append(do_post)
                for child in reversed(n.children):
                    stack.append(process_node(child))
                stack.append(do_pre)
            return do_process

        stack.append(process_node(node))

        while stack:
            action = stack.pop()
            action()

26

Command Pattern is obsolete...

Page 27: Pydiomatic

    class TreePrinter(object):
        def __init__(self, fh, step=' '):
            self.out = fh
            self.step = step
            self.level = 0

        def pre_print(self, value):
            self.out.write(self.step * self.level)
            self.out.write(str(value))
            self.out.write('\n')
            self.level += 1

        def post_print(self, _):
            self.level -= 1

    tp = TreePrinter(sys.stdout)
    dfs(tree, tp.pre_print, tp.post_print)

[Figure: the example tree, with node values 0 to 11, and its indented printout]
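An end-to-end sketch of the printer, in Python 3 syntax. Several things here are assumptions rather than the slides' code: the `Node` class, the small letter-labelled tree, the two-space `step`, the use of `io.StringIO` instead of `sys.stdout`, and lambdas in place of the named `do_pre`/`do_post` closures. The `or nop` idiom flagged as bad on the earlier slide is replaced with explicit `is None` checks.

```python
import io

class Node(object):
    """Hypothetical tree node: a value plus a list of child nodes."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def dfs(node, pre_action=None, post_action=None):
    """Iterative depth-first visit; the stack holds zero-argument callables."""
    def nop(value):
        pass
    if pre_action is None:
        pre_action = nop
    if post_action is None:
        post_action = nop
    stack = []

    def process_node(n):
        def do_process():
            # Pushed in reverse order of execution: pre, children, post.
            stack.append(lambda: post_action(n.value))
            for child in reversed(n.children):
                stack.append(process_node(child))
            stack.append(lambda: pre_action(n.value))
        return do_process

    stack.append(process_node(node))
    while stack:
        stack.pop()()

class TreePrinter(object):
    def __init__(self, fh, step='  '):
        self.out = fh
        self.step = step
        self.level = 0

    def pre_print(self, value):
        self.out.write(u'%s%s\n' % (self.step * self.level, value))
        self.level += 1

    def post_print(self, _):
        self.level -= 1

tree = Node('A', [Node('B'), Node('C', [Node('D'), Node('E')])])
buffer = io.StringIO()
printer = TreePrinter(buffer)
dfs(tree, printer.pre_print, printer.post_print)
printed = buffer.getvalue()
```

Bound methods such as `printer.pre_print` carry their object with them, so the visit needs no dedicated command classes.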

27

Page 28: Pydiomatic

The case of the missing switch

Some people think Python should have a switch/case like statement, something that executes a block of code determined by the value of a variable

Possible solutions

Python if/elif/else statement

Seems the job for a dictionary + functions

A cleverly designed class can solve the problem as well

28

Page 29: Pydiomatic

What if we use the if?

An if statement is easy to read and write when there are few branches, but confusing when there are many

Theoretically correct (provided that the conditions are disjoint)

Maybe slower as conditions are evaluated in order

Some suggest that if statements should be banned ;)

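\[
f(x_1, \dots, x_n) =
\begin{cases}
\varphi_1(x_1, \dots, x_n) & \text{if } \rho_1(x_1, \dots, x_n) \\
\quad\vdots \\
\varphi_m(x_1, \dots, x_n) & \text{if } \rho_m(x_1, \dots, x_n) \\
\varphi_{m+1}(x_1, \dots, x_n) & \text{otherwise}
\end{cases}
\]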

29

Page 30: Pydiomatic

Dictionary

If the body of the switch essentially sets some (set of) variable(s), a dictionary is perfect

    def some_function(n, *more_args):
        # ...
        masks = {
            0: '0000', 1: '0001', 2: '0010', 3: '0011',
            4: '0100', 5: '0101', 6: '0110', 7: '0111',
            8: '1000', 9: '1001', 10: '1010', 11: '1011',
            12: '1100', 13: '1101', 14: '1110', 15: '1111'
        }
        # ...
        str_bits = masks[n]

30

Page 31: Pydiomatic

Dictionary [+ Functions]

If the “actions” in the branches are naturally abstracted as functions, a dictionary is perfect

    import operator
    # ...
    class BinOp(Node):
        # ...
        def compute(self):
            operations = {
                '+': operator.add,
                '-': operator.sub,
                '*': operator.mul,
                '/': operator.div
            }
            return operations[self.op](self.left.compute(),
                                       self.right.compute())
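A standalone variant of the dictionary-plus-functions idea, in Python 3 syntax. The function `apply_operator` and the `ValueError` fallback are assumptions added for illustration; `operator.truediv` stands in for `operator.div`, which exists only in Python 2.

```python
import operator

OPERATIONS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,   # Python 3 name; operator.div is Python 2 only
}

def apply_operator(symbol, left, right):
    """Look the operation up by symbol and call it, as a switch would."""
    try:
        operation = OPERATIONS[symbol]
    except KeyError:
        # Plays the role of the "default" branch of a switch.
        raise ValueError('unknown operator: %r' % symbol)
    return operation(left, right)
```

Adding a new branch means adding one dictionary entry; the dispatching code does not change.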

31

Page 32: Pydiomatic

    import cmd

    class Example(cmd.Cmd):
        def do_greet(self, rest):
            print 'Hello %s!' % rest

        def do_quit(self, rest):
            return True

    while 1:
        words = raw_input('(cmd) ').split(' ', 1)
        command = words[0]
        try:
            rest = words[1]
        except IndexError:
            rest = ''

        # Hypothetical switch statement: not valid Python
        switch command:
            case 'greet':
                print 'Hello %s!' % rest
            case 'quit':
                break
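The `cmd.Cmd` version can be exercised without an interactive loop by calling `onecmd`, which dispatches a line to the matching `do_*` method. This is a sketch in Python 3 syntax; writing to `self.stdout` and capturing it in an `io.StringIO` is a choice made here so the result can be inspected, not something the slide does.

```python
import cmd
import io

class Example(cmd.Cmd):
    prompt = '(cmd) '

    def do_greet(self, rest):
        """greet NAME: print a greeting."""
        self.stdout.write('Hello %s!\n' % rest)

    def do_quit(self, rest):
        """quit: leave the command loop."""
        return True   # a true return value stops cmdloop()

output = io.StringIO()
shell = Example(stdout=output)
stop_after_greet = shell.onecmd('greet World')   # dispatches to do_greet('World')
stop_after_quit = shell.onecmd('quit')           # dispatches to do_quit('')
```

The class dispatches on method names, so the branch table is the class body itself.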

32

Page 33: Pydiomatic

Properties are a neat way to implement attributes whose usage resembles attribute access, but whose implementation uses method calls.

These are sometimes known as “managed attributes”.

GvR

33

Page 34: Pydiomatic

    class Track(object):
        def __init__(self, artist, title, duration):
            self.artist = artist
            self.title = title
            self.duration = duration

        def __str__(self):
            return '%s - %s - %s' % (self.artist, self.title,
                                     self.duration)

Example (Track)

34

Page 35: Pydiomatic

Properties (I)

Track has public attributes

“Java” bad-practice

Dependency from “implementation details”

What if we need validation in setters and such?

property: old attribute access syntax, function calls under the hood

    class A(object):
        def __init__(self, foo):
            self._foo = foo

        def get_foo(self):
            print 'got foo'
            return self._foo

        def set_foo(self, val):
            print 'set foo'
            self._foo = val

        foo = property(get_foo, set_foo)

    a = A('hello')
    print a.foo
    # => 'got foo'
    # => 'hello'
    a.foo = 'bar'
    # => 'set foo'

35

Page 36: Pydiomatic

Properties (II)

Sometimes we don’t need the setter...

    class A(object):
        def __init__(self, foo):
            self._foo = foo

        def get_foo(self):
            print 'got foo'
            return self._foo

        foo = property(get_foo)

    a = A('ciao')
    print a.foo
    # => 'got foo'
    # => 'ciao'
    a.foo = 'bar'
    # Traceback (most recent call last):
    #   File "prop_example2.py", line 15, in <module>
    #     a.foo = 'bar'
    # AttributeError: can't set attribute

36

Page 37: Pydiomatic

Properties (III)

Nicer syntax: decorators are handy

    class A(object):
        def __init__(self, foo):
            self._foo = foo

        @property
        def foo(self):
            print 'got foo'
            return self._foo

    a = A('hello')
    print a.foo
    # => 'got foo'
    # => 'hello'
    a.foo = 'bar'
    # Traceback (most recent call last):
    #   File "prop_example2.py", line 15, in <module>
    #     a.foo = 'bar'
    # AttributeError: can't set attribute

37

Page 38: Pydiomatic

Properties (IV)

From Python 2.6, decorator for the setter:

    class A(object):
        def __init__(self, foo):
            self._foo = foo

        @property
        def foo(self):
            print 'got foo'
            return self._foo

        @foo.setter
        def foo(self, value):
            print 'set foo'
            self._foo = value

    a = A('hello')
    a.foo = 'bar'
    # => 'set foo'
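The earlier question about validation in setters can be answered with the same decorator syntax. The sketch below is one possible shape, assuming a simplified two-field `Track` and a made-up rule that durations cannot be negative.

```python
class Track(object):
    def __init__(self, title, duration):
        self.title = title
        self.duration = duration      # goes through the setter below

    @property
    def duration(self):
        return self._duration

    @duration.setter
    def duration(self, seconds):
        # Validation added later without changing how callers use the attribute.
        if seconds < 0:
            raise ValueError('duration must be non-negative')
        self._duration = seconds

track = Track('Some Title', 545)
track.duration = 562                  # plain assignment syntax, setter under the hood
```

Callers that used a public `duration` attribute keep working unchanged, which is the argument for starting with plain attributes and introducing properties only when needed.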

38

Page 39: Pydiomatic

    class Track(object):
        def __init__(self, artist, title, duration):
            self._artist = artist
            self._title = title
            self._duration = duration

        @property
        def artist(self):
            return self._artist

        @property
        def title(self):
            return self._title

        @property
        def duration(self):
            return self._duration

        def __str__(self):
            return '%s - %s - %s' % (self.artist, self.title,
                                     self.duration)

39

Page 40: Pydiomatic

How Pythonic?

We can decouple interface from implementation (getters/setters)

We have “read-only” attributes,

therefore, “immutable” objects

Trivial getter/setters are repetitive

Properties are helpful in order to evolve code, but are verbose to define “immutable objects”

40

Page 41: Pydiomatic

Named Tuples

Named Tuples solve the problem nicely

Immutable objects (easier to use, too much C++ and FP lately ☺)

Can be used both as objects and tuples

__str__ and other methods have good default implementation

Subclassing can be used to change defaults

Very quick to write!

http://code.activestate.com/recipes/500261-named-tuples/

41

Page 42: Pydiomatic

Track = collections.namedtuple('Track', ['title', 'artist', 'duration'])
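A short tour of what the one-line definition provides, using invented sample values. `PrettyTrack` is a hypothetical subclass showing how a default such as `__str__` can be overridden.

```python
import collections

Track = collections.namedtuple('Track', ['title', 'artist', 'duration'])

track = Track('Some Title', 'Some Artist', 300)
title, artist, duration = track            # unpacks like a tuple
longer = track._replace(duration=320)      # returns a new Track; track is unchanged

try:
    track.duration = 0                     # fields are read-only
except AttributeError:
    assignment_rejected = True
else:
    assignment_rejected = False

class PrettyTrack(Track):
    """Subclass that overrides the default string form."""
    __slots__ = ()

    def __str__(self):
        return '%s - %s - %s' % (self.artist, self.title, self.duration)
```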

42

Page 43: Pydiomatic

About Java/C++ types...

In statically typed languages like C++ we constrain parameters to be of a given type or any of its subtypes

However, a good programming practice is to program to an interface

Java interfaces (true dynamic polymorphism)

C++ Templates (static polymorphism)

Both solutions have problems (however, I do love ML static typing...)

43

Page 44: Pydiomatic

Books, search by title

If the list contains a non book, an exception is raised

Does not even work with subclasses

Worst strategy

Never type-check like that

Solving a non-problem

    class Book(object):
        def __init__(self, title, author):
            self.title = title
            self.author = author

    def find_by_title(seq, title):
        for item in seq:
            if type(item) == Book:  # horrible
                if item.title == title:
                    return item
            else:
                raise TypeError

    def find_by_author(seq, author):
        for item in seq:
            if type(item) == Book:  # horrible
                if item.author == author:
                    return item
            else:
                raise TypeError

44

Page 46: Pydiomatic

Books, search by title

Subclasses are ok

However, code does not depend on elements being books

They have a title

They have an author

What about songs?

Bad strategy, after all

    def find_by_title(seq, title):
        for item in seq:
            if isinstance(item, Book):  # bad
                if item.title == title:
                    return item
            else:
                raise TypeError

    def find_by_author(seq, author):
        for item in seq:
            if isinstance(item, Book):  # bad
                if item.author == author:
                    return item
            else:
                raise TypeError

    class Book(object):
        def __init__(self, title, author):
            self.title = title
            self.author = author

45

Page 47: Pydiomatic

    class Song(object):
        def __init__(self, title, author):
            self.title = title
            self.author = author

45

Page 48: Pydiomatic

What about movies?

Movies have a title. However, they have a director and no author

find_by_title should work; find_by_author shouldn’t

Interface for Book and Song. And what about Movie?

Design pattern or code duplication

Square Wheel ⇒ Roads designed for square wheels

Duck typing simply avoids the problem

46

Page 49: Pydiomatic

Books and Songs

The simplest solution is the best

Programmers do not code by chance (hopefully)

AttributeErrors are raised in case of problems

Unit tests discover this kind of error

You have unit tests, don’t you?

    class Book(object):
        def __init__(self, t, a):
            self.title = t
            self.author = a

    def find_by_title(seq, title):
        for item in seq:
            if item.title == title:
                return item

    def find_by_author(seq, author):
        for item in seq:
            if item.author == author:
                return item
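To see what duck typing does with the movie case from the previous slide, the sketch below mixes a hypothetical `Movie` class (title and director, no author) with a `Book` in one list. All sample values are invented.

```python
class Book(object):
    def __init__(self, title, author):
        self.title = title
        self.author = author

class Movie(object):
    def __init__(self, title, director):
        self.title = title
        self.director = director    # a movie has no author

def find_by_title(seq, title):
    for item in seq:
        if item.title == title:
            return item

def find_by_author(seq, author):
    for item in seq:
        if item.author == author:
            return item

shelf = [Movie('Title One', 'Director One'), Book('Title Two', 'Author Two')]
found = find_by_title(shelf, 'Title Two')       # works for any object with .title

try:
    find_by_author(shelf, 'Author Two')          # the Movie is inspected first
except AttributeError:
    author_lookup_failed = True
else:
    author_lookup_failed = False
```

No type check was written, yet searching movies by author still fails loudly, with an `AttributeError` raised at the exact attribute that is missing.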

47

Page 50: Pydiomatic

    class NotFound(Exception):  # not shown on the slide; assumed definition
        pass

    def find_by(seq, **kwargs):
        for obj in seq:
            for key, val in kwargs.iteritems():
                try:
                    if getattr(obj, key) != val:
                        break
                except AttributeError:
                    break
            else:
                return obj
        raise NotFound

    print find_by(books, title='Python in a Nutshell')
    print find_by(books, author='M. Beri')
    print find_by(books, title='Python in a Nutshell',
                  author='A. Martelli')

    try:
        print find_by(books, title='Python in a Nutshell',
                      author='M. Beri')
        print find_by(books, title='Python in a Nutshell', pages=123)
    except NotFound:
        pass

48

Page 51: Pydiomatic

    def find_by(seq, **kwargs):
        for obj in seq:
            for key, val in kwargs.iteritems():
                try:
                    attr = getattr(obj, key)
                except AttributeError:
                    break
                else:
                    if val != attr and val not in attr:
                        break
            else:
                yield obj
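A runnable Python 3 rendering of the generator version, with made-up data. The `Book` named tuple and the sample values are assumptions, and the test after the `try` block is equivalent to the slide's `try`/`else` form. Note that the `in` test only makes sense for container-like attributes such as strings.

```python
import collections

Book = collections.namedtuple('Book', ['title', 'author'])

def find_by(seq, **criteria):
    """Yield every object whose attributes match all the given criteria."""
    for obj in seq:
        for key, wanted in criteria.items():    # items() replaces Python 2's iteritems()
            try:
                attr = getattr(obj, key)
            except AttributeError:
                break                            # missing attribute: no match
            if wanted != attr and wanted not in attr:
                break                            # neither equal nor contained
        else:
            # Reached only if the inner loop finished without a break.
            yield obj

books = [Book('Title One', 'Author A'), Book('Title Two', 'Author B')]
exact = list(find_by(books, title='Title One'))
partial = list(find_by(books, title='Title'))    # substring match on both
nothing = list(find_by(books, title='Title One', author='Author B'))
no_such_attr = list(find_by(books, pages='123'))
```

The `for`/`else` pairing does the work: the `else` branch runs only when every criterion passed, so no flag variable is needed.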

Page 52: Pydiomatic

Life expectations

Function parameters and every variable bound in a function body constitute the function's local scope

The scope of these variables is the whole function body

However, using them before binding is an error

50

Page 53: Pydiomatic

Life expectations

WRONG

    a = None
    if s.startswith(t):
        a = s[:4]
    else:
        a = t
    print a

50

Page 54: Pydiomatic

Life expectations

GOOD

    if s.startswith(t):
        a = s[:4]
    else:
        a = t
    print a
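The "using them before binding is an error" rule can be observed directly. In this sketch (the function name is invented) the variable is bound on only one branch, so the other path fails at the point of use.

```python
def describe(flag):
    if flag:
        label = 'set'
    # `label` is local to the whole function, but bound only if flag is true.
    return label

ok = describe(True)

try:
    describe(False)
except UnboundLocalError as error:     # a subclass of NameError
    failure = type(error).__name__
```

When every branch binds the name, as in the if/else above, no placeholder assignment is needed beforehand.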

50

Page 55: Pydiomatic

LBYL vs. EAFP

LBYL: Look before you leap

EAFP: Easier to ask forgiveness than permission

Usually EAFP is the best strategy

Exceptions are rather fast

Atomicity, ...

    # LBYL -- bad
    if id_ in employees:
        emp = employees[id_]
    else:
        report_error(...)

    # EAFP -- good
    try:
        emp = employees[id_]
    except KeyError:
        report_error(...)

51

Page 56: Pydiomatic

BAD

    if os.access(filename, os.F_OK):
        fh = file(filename)
    else:
        print "Something went bad."

VERBOSE

    if os.access(filename, os.F_OK):
        try:
            fh = file(filename)
        except IOError:
            print "Something went bad."
    else:
        print "Something went bad."

GOOD

    try:
        fh = file(filename)
    except IOError:
        print "Something went bad."
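A Python 3 take on the EAFP version, where `open` replaces the Python 2 `file` builtin. The helper `read_first_line` and the temporary directory are assumptions made so the example can run anywhere.

```python
import os
import shutil
import tempfile

def read_first_line(filename):
    """Return the first line of filename, or None if it cannot be opened."""
    try:
        # No prior os.access() check: the file could vanish between the
        # check and the open, so the exception has to be handled anyway.
        with open(filename) as fh:
            return fh.readline()
    except IOError:       # an alias of OSError in Python 3
        return None

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'notes.txt')
missing = read_first_line(path)        # the file does not exist yet
with open(path, 'w') as fh:
    fh.write('hello\n')
present = read_first_line(path)
shutil.rmtree(workdir)                 # clean up the temporary directory
```

This is the atomicity argument in practice: the check and the use are a single operation, so there is no window in which the answer can change.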

52

Page 57: Pydiomatic

More on Exceptions

Exceptions should subclass Exception directly or indirectly

Catch exceptions using the most specific specifier

Don’t use a bare except: unless

You plan to re-raise the exception (but you probably should use finally)

You want to log any error or something like that

Also catches KeyboardInterrupt
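The guidelines above can be sketched with a small, hypothetical exception hierarchy (`ConfigError`, `MissingOption` and `lookup` are invented names). The last line checks why a bare `except:` is riskier than `except Exception`.

```python
class ConfigError(Exception):
    """Application error: derives from Exception, not BaseException."""

class MissingOption(ConfigError):
    pass

def lookup(options, name):
    try:
        return options[name]
    except KeyError:                  # the most specific clause that applies
        raise MissingOption(name)

try:
    lookup({}, 'port')
except ConfigError as error:          # catches MissingOption through its base class
    caught = type(error).__name__

# KeyboardInterrupt sits outside the Exception hierarchy, so `except Exception`
# lets Ctrl-C through while a bare `except:` would swallow it.
interrupt_is_exception = issubclass(KeyboardInterrupt, Exception)
```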

53

Page 58: Pydiomatic

Limit the try scope

BAD

    try:
        # Too broad!
        return handle_value(collection[key])
    except KeyError:
        # Will also catch KeyError raised by handle_value()
        return key_not_found(key)

GOOD

    try:
        value = collection[key]
    except KeyError:
        return key_not_found(key)
    else:
        return handle_value(value)

54

Page 59: Pydiomatic

References

Python in a Nutshell, 2ed, Alex Martelli, O’Reilly

Python Cookbook, Alex Martelli, Anna Martelli Ravenscroft and David Ascher, O’Reilly

Agile Software Development: Principles, Patterns and Practices, Robert C. Martin, Prentice Hall

Clean Code, Robert C. Martin, Prentice Hall

Structure and Interpretation of Computer Programs, H. Abelson, G. Sussman, J. Sussman, http://mitpress.mit.edu/sicp/full-text/book/book.html

55

Page 61: Pydiomatic

Q&A

57