Session 2 Wharton Summer Tech Camp 1: Basic Python 2: Start Regex

Upload: charity-barton

Post on 28-Dec-2015


Page 1: Session 2 Wharton Summer Tech Camp 1: Basic Python 2: Start Regex

Session 2
Wharton Summer Tech Camp

1: Basic Python
2: Start Regex

Page 2

Announcement

If you did not get an email from me saying that the slides have been uploaded, please email me and I’ll add you to the list.

Page 3

Python Packaged Distribution

• Download this packaged version
• Enthought Canopy or EPD
  – Enthought is a company that maintains a great compiled version of Python.
  – Has many packages included.
  – The alternative is to download Python and install countless packages yourself -> can be a nightmare due to compiler incompatibility etc.
  – https://www.enthought.com/products/canopy/academic/
    • Free for people with an EDU email

Page 4

Why?

• Has many great packages useful for us (scientific computing, machine learning, NLP, scraping, etc.)
• One of the easiest and most concise languages, yet powerful
  – Memory consumption was often "better than Java and not much worse than C or C++"
• Has IDLE ("Integrated Development and Learning Environment")
  – Read-Eval-Print Loop
• Great OOP (compared to other comparable languages, say PERL. bless() those who use it)
• Highly scalable
• Easy incorporation of other languages (Cython, Jython)
• Named after Monty Python

Used by many companies as a prototyping and "duct-tape" language as well as the main language: Wall Street, Yahoo, CERN, NASA, Con Edison, Google, etc. Also, YouTube is written in Python!

Page 5

Bit More Background on Python

• Does a few things EXCELLENTLY (OOP, scientific computing, etc.) and is generally good for a lot of things
• Guido van Rossum – late 1980s
• Programmer oriented (easy to write and read). Use of white space.
• Automatic memory management
• Can be interpreted or compiled (PyPy – just-in-time compiler)
• Direct opposite of PERL when it comes to programming philosophy
  – PERL: "there is more than one way to do it" -> Super fun when writing your own code. Rage when you debug other people’s PERL code (there is even an Obfuscated Perl Contest)
  – Python: "there should be one-- and preferably only one --obvious way to do it" -> Writing your own & reading others’ = fun
• Would you like to know more?
  – http://www.youtube.com/watch?v=ugqu10JV7dk
  – Van Rossum talks about the history of Python for 110 min!

Page 6

Let’s start coding in Python!
Fire up your IDLE.

Load the file called basicpython.py from the camp website

Page 7

Basic Data Types

• All the standard types
  – Integers, floating point
    • 2, 2.2, 3.14 etc.
  – Strings
    • "Hi, I am a string"
  – Booleans
    • True
    • False

Page 8

Hello World & Arithmetic

Helloworld.py
>>> print "hello, world!" # that's it
# <- used for commenting

Simple Arithmetic (+ - * ** / %)
>>> 1+1
>>> 5**2

Booleans (operators: and, or, not, >, <, <=, ==, !=, etc)
>>> True
>>> False
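The operators listed on this slide can be tried out as in the short sketch below. It is written in Python 3 syntax (print as a function), whereas the slides use Python 2, and the variable names are only illustrative. Note that `/` behaves differently between the two versions for integers.

```python
# Arithmetic operators from the slide: + - * ** / %
total = 1 + 1             # addition
square = 5 ** 2           # exponentiation
remainder = 7 % 3         # modulo (remainder after division)
quotient = 7 / 2          # 3.5 in Python 3; Python 2 integer division gives 3
floor_quotient = 7 // 2   # floor division gives 3 in both versions

# Boolean operators: and, or, not, plus comparisons
both = True and False
either = True or False
negated = not True
compared = (3 <= 5) and (2 != 2)

print(total, square, remainder, quotient, floor_quotient)
print(both, either, negated, compared)
```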

Page 9

Strings

string="hello"
string+string
string*3
string[0]
string[-1]
string[1:4]
len(string)
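To see what each of these string expressions evaluates to, here is a small sketch that stores every result in a variable. The variable names other than `string` are illustrative and not from the slides.

```python
string = "hello"

doubled = string + string   # concatenation -> "hellohello"
tripled = string * 3        # repetition -> "hellohellohello"
first = string[0]           # indexing starts at 0 -> "h"
last = string[-1]           # negative indices count from the end -> "o"
middle = string[1:4]        # slice from index 1 up to, but not including, 4 -> "ell"
length = len(string)        # number of characters -> 5

print(doubled, tripled, first, last, middle, length)
```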

Page 10

Lists, Tuples, and Dictionaries

Data structures – there are many, but these 4 are the most commonly used. Each has pros and cons.

• List – list of values
• Sets – set(list). You can do set operations, which can be faster than going through a list one element at a time.
• Tuples – just like a list but immutable and of fixed size. Also, style-wise, lists usually consist of homogeneous stuff while tuples can consist of heterogeneous stuff and form some sort of structure, e.g. (firstname, lastname), (name, age)
• Dictionaries – Hash lookup table. Index of stuff. Basic bookkeeping "Key->Value". Fast lookup, O(1) on average.
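The differences between the four structures can be sketched as follows. The player names come from the later slides, while `practice_group`, `name_pair` and `points` are made-up values used purely for illustration.

```python
# List: ordered, mutable, duplicates allowed
players = ["Federer", "Nadal", "Murray", "Nadal"]
players.append("Djokovic")

# Set: built from a list, keeps only unique members, supports set operations
unique_players = set(players)
practice_group = {"Nadal", "Ferrer"}          # hypothetical second set
in_both = unique_players & practice_group     # intersection

# Tuple: fixed size and immutable, so item assignment raises TypeError
name_pair = ("Roger", "Federer")              # (firstname, lastname)
try:
    name_pair[0] = "R."
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Dictionary: key -> value lookup by hashing
points = {"Federer": 5, "Nadal": 4}           # illustrative values
has_nadal = "Nadal" in points                 # membership test on keys
nadal_points = points["Nadal"]
```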

Page 11

Lists, Tuples, and Dictionaries

• List – []
>>> TPlayersList=["Federer","Nadal","Murray", "Djokovic"]
range(), append(), pop(), insert(), reverse(), sort()
e.g. TPlayersList.sort()

• Tuples – ()
>>> TPlayersTuple=("Federer","Nadal","Murray", "Djokovic")

• Dictionaries – {}
>>> TPlayersDict={ "Federer": 5, "Nadal": 4, "Murray":2, "Djokovic":1}
>>> TPlayersDict["Ferrer"]=3
>>> TPlayersDict["Ferrer"]
>>> del TPlayersDict["Ferrer"]
Let d be a dictionary; then d.keys(), d.values(), d.items()
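A possible walk-through of the list and dictionary operations named above, using the slide's own `TPlayersList` and `TPlayersDict`. It is written for Python 3, where `keys()`, `values()` and `items()` return views rather than lists, so the sketch wraps them in `sorted()` to get plain lists.

```python
TPlayersList = ["Federer", "Nadal", "Murray", "Djokovic"]

TPlayersList.append("Ferrer")        # add to the end
popped = TPlayersList.pop()          # remove and return the last item
TPlayersList.insert(0, "Ferrer")     # insert at index 0
TPlayersList.reverse()               # reverse in place
TPlayersList.sort()                  # sort in place, alphabetically
first_five = list(range(5))          # range() is a built-in, not a list method

TPlayersDict = {"Federer": 5, "Nadal": 4, "Murray": 2, "Djokovic": 1}
TPlayersDict["Ferrer"] = 3           # add a new key
ferrer_value = TPlayersDict["Ferrer"]
del TPlayersDict["Ferrer"]           # remove the key again

keys = sorted(TPlayersDict.keys())
values = sorted(TPlayersDict.values())
items = sorted(TPlayersDict.items())  # list of (key, value) tuples
```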

Page 12

Lists, Tuples, and Dictionaries

• When you are first reading in data
  – Think carefully about what you want to do with the data
  – Then decide what data structures to use
  – It is common to have things like
    • Array of arrays
    • Array of tuples
    • Dictionary of arrays
    • Dictionary of dictionaries
    • Dictionary with tuple keys
  – However, once you need things like a dictionary of dictionaries of dictionaries of arrays or similarly ridiculous structures, consider using object-oriented programming
    • Look up Python classes (http://docs.python.org/2/tutorial/classes.html)
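As a rough illustration of these combinations, the sketch below builds a dictionary of lists and a dictionary with tuple keys, then shows how a small class can replace deeper nesting. The `Player` class, its methods, and all the values are hypothetical, made up for this example.

```python
# Dictionary of lists: made-up match outcomes per player
results = {"Federer": ["W", "W", "L"], "Nadal": ["W", "L"]}
results["Federer"].append("W")

# Dictionary with tuple keys: (player, opponent) -> made-up count
meetings = {("Federer", "Nadal"): 3, ("Nadal", "Murray"): 1}
federer_nadal = meetings[("Federer", "Nadal")]


class Player(object):
    """Hypothetical class that bundles a player's data and behavior."""

    def __init__(self, name):
        self.name = name
        self.results = []

    def record(self, outcome):
        """Store one match outcome, "W" or "L"."""
        self.results.append(outcome)

    def wins(self):
        """Count the recorded wins."""
        return self.results.count("W")


federer = Player("Federer")
for outcome in results["Federer"]:
    federer.record(outcome)
```

Here `federer.wins()` reads more clearly than indexing several levels into nested dictionaries, which is the point of moving to classes once the structure grows.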

Page 13

Basic Control Flow

• Boils down to
  – If (elif, else)
  – While
  – For
• Python's control flow has good syntactic sugar for iterating through different data structures

Page 14

Basic Control Flow

• True Things
  – True
  – Any non-zero number
  – Any non-empty string or data structure
• False Things
  – False
  – 0
  – ""
  – Empty data structures
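These truthiness rules can be checked directly with `bool()`. The sketch below is one way to do it. The helper `describe` is a hypothetical name, and `None` is included as one more false value that the slide does not list.

```python
# Values that count as true in an if/while test
truthy = [True, 1, -2.5, "hi", [0], {"k": 0}]

# Values that count as false (None is an addition not on the slide)
falsy = [False, 0, 0.0, "", [], (), {}, set(), None]

all_truthy = all(bool(v) for v in truthy)
none_falsy = not any(bool(v) for v in falsy)


def describe(value):
    """Return which branch an `if value:` test would take."""
    if value:
        return "true thing"
    return "false thing"


print(describe("non-empty"), describe(""))
```

Note that `[0]` is true because the list is non-empty, even though its only element is false.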

Page 15

If and while

if True:
    print "everything is good"
else:
    print "?! HUHHHHH?"

i=1
while (i<=5):
    print "Hellodoctornamecontinueyesterdaytomorrow"
    i+=1
    if i>5:
        print "good morning dr. chandra"

Page 16

Basic Control Flow - for

for player in TPlayersList:
    print player

for player in sorted(TPlayersList):
    print player

for index, player in enumerate(TPlayersList):
    print index, player

for i in xrange(1,10,2):
    print i

for key, value in TPlayersDict.iteritems():
    print key, value
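The loops on this slide are Python 2. For readers on Python 3, a roughly equivalent sketch follows: `xrange` becomes `range`, `iteritems()` becomes `items()`, and `print` is a function. The list and dictionary are repeated from the earlier slides so the example runs on its own, and the result variables are illustrative.

```python
TPlayersList = ["Federer", "Nadal", "Murray", "Djokovic"]
TPlayersDict = {"Federer": 5, "Nadal": 4, "Murray": 2, "Djokovic": 1}

# Iterate in sorted order without changing the list itself
alphabetical = [player for player in sorted(TPlayersList)]

# enumerate() yields (index, item) pairs
indexed = list(enumerate(TPlayersList))

# range(start, stop, step); stop is excluded
odds = [i for i in range(1, 10, 2)]

# items() yields (key, value) pairs
for key, value in TPlayersDict.items():
    print(key, value)
```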

Page 17

continue and break

• While running loops, you may need to skip an iteration or stop the loop at some point. Look up
  – continue
  – break
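A minimal sketch of the two statements: `continue` jumps to the next iteration, while `break` leaves the loop entirely. The list is the one from the earlier slides, and the `visited` name is only illustrative.

```python
TPlayersList = ["Federer", "Nadal", "Murray", "Djokovic"]

visited = []
for player in TPlayersList:
    if player == "Nadal":
        continue          # skip this item and move on to the next one
    if player == "Djokovic":
        break             # stop the loop completely
    visited.append(player)

print(visited)
```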

Page 18

Defining a function

def fib(n): # write Fibonacci series up to n
    """Print a Fibonacci series up to n."""
    a, b = 0, 1
    while a < n:
        print a,
        a, b = b, a+b
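The function above prints its results. A common variation, sketched here under the hypothetical name `fib_list`, returns the numbers in a list so the caller can use them afterwards. It runs unchanged on Python 2 and Python 3.

```python
def fib_list(n):
    """Return a list of the Fibonacci numbers smaller than n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b   # tuple assignment updates both values at once
    return result


series = fib_list(10)
```

Returning a value instead of printing makes the function easier to reuse and to test, for example with `len(fib_list(100))`.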

Page 19

Importing Libraries

• Import library
• E.g. "import sys"
• Some useful libraries
  – sys
  – re
  – csv
  – scipy
  – numpy
• http://wiki.python.org/moin/UsefulModules#Useful_Modules.2C_Packages_and_Libraries
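The common import forms look roughly like this. The sketch sticks to standard-library modules (`sys`, `re`, `math`) so that it runs without installing anything. scipy and numpy are imported the same way once they are installed. The sample string is made up.

```python
import sys                 # import a whole module
import re
from math import sqrt      # import a single name from a module

# Names inside a module are reached with a dot
major_version = sys.version_info[0]

# re.findall returns every non-overlapping match as a list of strings
numbers = re.findall(r"\d+", "Session 2, Tech Camp 1")

root = sqrt(16)

print(major_version, numbers, root)
```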

Page 20

File IO

• Reading data files into memory
• open() – returns a file object which can read or write files
• open(filename, mode)
• filehandle = open(filename, mode)
• filehandle.readline()

Mode
• r = read, w = write, a = append, rb = read in binary (Windows makes that distinction)
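A small sketch of the write, append and read modes. It assumes a throwaway file called `example.txt` in a temporary directory. That file name and the line contents are made up for the example.

```python
import os
import tempfile

# Hypothetical file in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# "w" creates the file, or truncates it if it already exists
filehandle = open(path, "w")
filehandle.write("first line\nsecond line\n")
filehandle.close()

# "a" appends to the end instead of overwriting
filehandle = open(path, "a")
filehandle.write("third line\n")
filehandle.close()

# "r" reads; the with-statement closes the file automatically
with open(path, "r") as filehandle:
    first = filehandle.readline()    # one line, including its newline
    rest = filehandle.readlines()    # all remaining lines as a list
```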

Page 21

Python Example 1

• Reading a CSV and saving each row as an array
  – Dealing with CSV can be very painful.
  – Sometimes a different character encoding causes problems when reading a CSV.
  – If CSV reading just doesn’t work, suspect that you have an encoding issue. Look up encodings (ISO-8859-1/latin1 to UTF-8).
  – This is one reason serious programs rarely use CSV as a storage mechanism.
• Fire up csvRead.py
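The contents of csvRead.py are not reproduced in these slides, so what follows is not that file. It is only a minimal sketch of the same idea, written for Python 3: read each CSV row into a list, and pass the encoding explicitly. The helper `read_csv_rows`, the file `people.csv` and its contents are all hypothetical.

```python
import csv
import io
import os
import tempfile

# Write a small latin-1 encoded CSV file to read back (made-up data)
path = os.path.join(tempfile.mkdtemp(), "people.csv")
with io.open(path, "w", encoding="latin-1", newline="") as f:
    f.write(u"name,city\r\nJos\u00e9,M\u00e1laga\r\n")


def read_csv_rows(path, encoding="utf-8"):
    """Return the rows of a CSV file as a list of lists of strings."""
    with io.open(path, "r", encoding=encoding, newline="") as f:
        return [row for row in csv.reader(f)]


# Reading with the right encoding works
rows = read_csv_rows(path, encoding="latin-1")

# Reading with the wrong encoding fails on the accented characters
try:
    read_csv_rows(path, encoding="utf-8")
    wrong_encoding_failed = False
except UnicodeDecodeError:
    wrong_encoding_failed = True
```

If a file refuses to load, trying `encoding="latin-1"` in place of `"utf-8"` is a quick way to check whether the encoding is the cause.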

Page 22

Lab

Do interactive tutorials at
http://www.codecademy.com/courses/
http://www.learnpython.org/