Advanced geoprocessing with Python

Advanced geoprocessing with… MAGIC 2012 Chad Cooper – [email protected] Center for Advanced Spatial Technologies University of Arkansas, Fayetteville

Upload: chad-cooper

Post on 06-May-2015


DESCRIPTION

4-hour short course given at the Mid-America GIS Consortium Biennial meeting, April 2012, Kansas City, MO.

TRANSCRIPT

Page 1: Advanced geoprocessing with Python

Advanced geoprocessing with…

MAGIC 2012
Chad Cooper – [email protected]

Center for Advanced Spatial Technologies
University of Arkansas, Fayetteville


Intros

• your name
• what you do/where you work
• used Python much?
  – any formal training?
  – what do you use it for?
• know any other languages?


Objectives

• informal class
  – expect tangents
  – code as we go
• not geared totally to ArcGIS
• THINK – oddball and out-of-the-ordinary applications will make you want more…


Outline

• data types review
• functions
• procedural vs. OOP
• geometries
• rasters
• spatial references
• error handling/logging
• documentation
• 3rd party modules
• module installation
• the web
  – fetching
  – scraping
  – email
  – FTP
• files


Strings
• ordered collections of characters
• immutable – can't change it in place
• raw strings: path = r"C:\temp\chad" (no trailing backslash: a raw string can't end in one)
• indexing: fruit[0] >> 'b'
• slicing: fruit[1:3] >> 'an'
• iteration/membership:
    for each in fruit
    'f' in fruit
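The string behaviours on this slide can be tried in a few lines. This is a minimal sketch, assuming fruit holds "banana" (the slide never shows its value; "banana" is chosen because it matches the outputs shown) and written for Python 3, whereas the slides use Python 2 syntax.

```python
# Assumption: the slide does not show the value of fruit; "banana" matches its outputs.
fruit = "banana"

first = fruit[0]        # indexing: one character by position
middle = fruit[1:3]     # slicing: characters at positions 1 and 2
has_f = "f" in fruit    # membership test

# Strings are immutable: item assignment raises TypeError.
try:
    fruit[0] = "B"
    changed = True
except TypeError:
    changed = False

# A raw string keeps backslashes literal, which suits Windows paths.
path = r"C:\temp\chad"

print(first, middle, has_f, changed, path)
```

To "change" a string, build a new one instead, for example with fruit.replace() or slicing plus concatenation.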


Strings
• string formatting:
    'a %s parrot' % 'dead' >> 'a dead parrot'
• useful string formatting:
    import arcpy
    f = "string"
    arcpy.CalculateField_management(fc, "some_field", '"%s"' % f)


Lists
• list – ordered collection of arbitrary objects
    list1 = [0,1,2,3]
    list2 = ['zero','one','two','three']
    list3 = [0,'zero',1,'one',2,'two',3,'three']
• ordered
    list2.sort()                  >> ['one','three',...]
    list2.sort(reverse=True)      >> ['zero','two',...]
• mutable – you can change it
    list1.append(4)               >> [0,1,2,3,4]
    list1.reverse()               >> [4,3,2,1,0]
    list2.insert(0,'one-half')    >> ['one-half','zero'…]
    list2.extend(['four','five'])   <- extend concats lists


Lists…
• iterable – very important!
    for l in list3
    0
    zero
    ...
• membership
    3 in list3 --> True
• nestable – 2D array/matrix
    list4 = [[0,1,2], [3,4,5], [6,7,8]]
• access by index – zero based
    list4[1]      >> [3,4,5]
    list4[1][2]   >> 5
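The list operations from these two slides can be combined into one short run. This is a sketch in Python 3 using the slides' own list names; sorted() is used here in place of list2.sort() so the original order stays visible.

```python
list1 = [0, 1, 2, 3]
list2 = ["zero", "one", "two", "three"]

list1.append(4)                  # add one item to the end
list2.extend(["four", "five"])   # concatenate another list in place
list2.insert(0, "one-half")      # insert at a given index

# sorted() returns a new list and leaves list2 untouched;
# list2.sort() would sort in place instead.
alphabetical = sorted(list2)

# Nested lists work as a simple 2D array, indexed row first.
list4 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
row = list4[1]
cell = list4[1][2]

print(list1, list2, alphabetical, row, cell)
```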


Dictionaries

• unordered collection of arbitrary objects
    d = {1:'foo', 2:'bar'}
• key/value pairs – think hash/lookup table (keys don't have to be numbers)
    d.keys()     >> [1, 2]
    d.values()   >> ['foo','bar']
• nestable, mutable
    d[3] = 'spam'
    del d[key]
• access by key, not offset
    d[2] >> 'bar'


Dictionaries

• iterable
    d.iteritems()
    <dictionary-itemiterator object at 0x1D2D8330>

    for k, v in d.iteritems():
        print k, v
    ...
    1 foo
    2 bar
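For readers on Python 3, the same dictionary walk-through looks slightly different, because iteritems() was removed and items() takes its place. The sketch below assumes Python 3; the get() call is an addition not shown on the slides.

```python
d = {1: "foo", 2: "bar"}
d[3] = "spam"      # add a key/value pair
del d[1]           # remove a pair by key

# In Python 3, items() takes the place of Python 2's iteritems().
pairs = [(k, v) for k, v in d.items()]

# get() returns a default instead of raising KeyError for a missing key.
missing = d.get(99, "not found")

print(pairs, missing)
```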


Tuples

• ordered collection of arbitrary objects
• immutable – cannot add or remove items (you can still search with in, index() and count())
• access by offset
• basically an unchangeable list
    (1,2,'three',4,…)
• so what's the purpose?
  – FAST – great for iterating over constant set of values
  – SAFE – you can't change it
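A few lines show what "immutable" means in practice. This is a small Python 3 sketch; the dictionary at the end, with its coordinate pair and label, is purely illustrative and not from the slides.

```python
t = (1, 2, "three", 4)

second = t[1]                 # access by offset, like a list
position = t.index("three")   # searching is allowed
found = 4 in t

try:
    t[0] = 99                 # any attempt to change an item fails
    mutated = True
except TypeError:
    mutated = False

# Being immutable makes tuples hashable, so they can serve as dictionary keys.
# The coordinate pair and label here are purely illustrative.
labels = {(33.095, -93.90389): "first point"}

print(second, position, found, mutated)
```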


List comprehensions

• Map one list to another by applying a function to each of the list elements

• Original list goes unchanged
    L = [2,4,6,8]
    J = [elem * 2 for elem in L]
    >>> J
    [4, 8, 12, 16]


Sets

• unordered collections of objects
• like mathematical sets – collection of distinct objects – NO DUPLICATES
• example – get rid of dups in a list via list comp
    L1=[2,2,3,4,5,5,3]
    L2=[]
    [L2.append(x) for x in L1 if x not in L2]
    >>> L2
    [2, 3, 4, 5]


Sets
• get rid of dups via set:
    >>> L1=[2,2,3,4,5,5,3]
    >>> set(L1)
    set([2, 3, 4, 5])
    >>> L1 = list(set(L1))
    >>> L1
    [2, 3, 4, 5]
• union:
    >>> L2 = [4,5,6,7]
    >>> L1 + L2
    [2, 3, 4, 5, 4, 5, 6, 7]
    >>> list(set(L1).union(set(L2)))
    [2, 3, 4, 5, 6, 7]


Sets

    >>> L1
    [2, 2, 3, 4, 5, 5, 3]
    >>> L2
    [4, 5, 6, 7]
• intersection – data found in both sets
    >>> set(L1).intersection(set(L2))
    set([4, 5])
• symmetric difference – data found in one set or the other, but not both
    >>> set(L1).symmetric_difference(set(L2))
    set([2, 3, 6, 7])
• difference – data in first set but not second
    >>> set(L1).difference(set(L2))
    set([2, 3])
    >>> set(L2).difference(set(L1))
    set([6, 7])
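Each of the set methods above also has an operator form. The sketch below, in Python 3, reuses the slides' L1 and L2 and wraps every result in sorted() so the output order is predictable, since sets themselves are unordered.

```python
L1 = [2, 2, 3, 4, 5, 5, 3]
L2 = [4, 5, 6, 7]
s1, s2 = set(L1), set(L2)

unique = sorted(s1)            # duplicates dropped
union = sorted(s1 | s2)        # same as s1.union(s2)
common = sorted(s1 & s2)       # same as s1.intersection(s2)
either = sorted(s1 ^ s2)       # same as s1.symmetric_difference(s2)
only_first = sorted(s1 - s2)   # same as s1.difference(s2)

print(unique, union, common, either, only_first)
```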


Programming paradigms: big blob of code

• OK on a small scale for GP scripts
• gets out of hand quickly
• hard to follow
• think ModelBuilder-exported code


Programming paradigms: procedural programming

• basically a list of instructions
• program is built from one or more procedures (functions) – reusable chunks
• procedures called at anytime, anywhere in program
• focus is to break task into collection of variables, data structures, subroutines
• natural style, easy to understand
• strict separation between code and data


Functions

• portion of code within a larger program that performs a specific task

• can be called anytime, anyplace
• can accept arguments
• should return a value
• keeps code neat
• promotes smooth flow

    >>> def foo(bar):
    ...     print bar
    ...
    >>> foo("yo")
    yo


Functions

import arcpy

def get_raster_props(in_raster):
    """Get properties of a raster, return as dict"""
    # Cast input layer to a Raster
    r = arcpy.Raster(in_raster)
    raster_props = {}  # Create empty dictionary to put props in below
    raster_props["x_center"] = r.extent.XMin + (r.extent.XMax - r.extent.XMin)/2
    raster_props["y_center"] = r.extent.YMin + (r.extent.YMax - r.extent.YMin)/2
    raster_props["max_elev"] = r.maximum
    raster_props["min_elev"] = r.minimum
    raster_props["no_data"] = r.noDataValue
    raster_props["terr_width"] = r.width
    raster_props["terr_height"] = r.height
    raster_props["terr_cell_res"] = r.meanCellHeight
    # Return the dictionary of properties
    return raster_props


Programming paradigms: Procedural example

import arcpy

def add_field(in_fc="Magic.gdb/Fields",
              in_fields=[("Distance", "Float", "0"), ("Name", "Text", 50)]):
    """Add fields to FC"""
    for in_field in in_fields:
        if in_field[1] == 'Text':
            arcpy.AddField_management(in_fc, in_field[0], in_field[1], "#", "#",
                                      in_field[2], "#", "NULLABLE", "NON_REQUIRED", "#")
        else:
            arcpy.AddField_management(in_fc, in_field[0], in_field[1], "#", "#",
                                      "#", "#", "NULLABLE", "NON_REQUIRED", "#")

add_field()


Programming paradigms: Object-oriented programming (OOP)

• break program down into data types (classes) that associate behavior (methods) with data (members or attributes)
• code becomes more abstract
• data and functions for dealing with it are bound together in one object


Programming paradigms: Object-oriented programming (OOP)

import arcpy

class Fields(object):
    """Class for working with fields"""
    # __init__ --> method signature
    def __init__(self, in_fc="Magic.gdb/Fields",
                 in_fields=[("Distance", "Float", "0"), ("Name", "Text", 50)]):
        self.in_fc = in_fc
        self.in_fields = in_fields

    def add_field(self):
        """Add fields to FC"""
        for in_field in self.in_fields:
            if in_field[1] == "Text":
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          in_field[2], "#", "NULLABLE", "NON_REQUIRED", "#")
            else:
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          "#", "#", "NULLABLE", "NON_REQUIRED", "#")

if __name__ == "__main__":
    # Instantiate the Fields class
    f = Fields()
    # Call the add_field method
    f.add_field()
    print f.in_fields
    print f.in_fc


Programming paradigms: Object-oriented programming (OOP)

• objects let you wrap complex processes, but present a simple interface to them
• methods and attributes are encapsulated inside the object
• methods and attributes are exposed to users
• you can then update the object without breaking the interface
• you can pass objects around - carefully


Programming paradigms: OOP - Inheritance

• classes can inherit attributes and methods
• allows you to reuse and customize existing code inside a new class
• you can override methods
• you can add new methods to a class without modifying the existing class
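The inheritance ideas above can be seen without ArcGIS installed. The following is a minimal sketch in Python 3 in which a plain list of names stands in for a feature class's fields; the class UniqueFields and the method describe() are hypothetical names invented for this illustration, not part of arcpy or the course code.

```python
class Fields(object):
    """Keeps a list of field names (a stand-in for a feature class's fields)."""

    def __init__(self, names=None):
        # Copy the input so instances never share one list.
        self.names = list(names or [])

    def add_field(self, name):
        self.names.append(name)

    def describe(self):
        return ", ".join(self.names)


class UniqueFields(Fields):
    """Inherits everything from Fields but overrides add_field."""

    def add_field(self, name):
        # Drop an existing field of the same name before adding it again.
        if name in self.names:
            self.names.remove(name)
        Fields.add_field(self, name)


f = UniqueFields(["Distance", "Name"])
f.add_field("Distance")
summary = f.describe()   # describe() comes from the parent class
print(summary)
```

The subclass changes one behaviour (add_field) and gets the constructor and describe() for free, which is the point of the "override" and "reuse" bullets.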


Programming paradigms: OOP - Inheritance

import arcpy

class Fields(object):
    """Class for working with fields"""
    def __init__(self, in_fc="Magic.gdb/Fields",
                 in_fields=[("Distance", "Float", "0"), ("Name", "Text", 50)]):
        self.in_fc = in_fc
        self.in_fields = in_fields

    def add_field(self):
        """Add fields to FC"""
        for in_field in self.in_fields:
            if in_field[1] == "Text":
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          in_field[2], "#", "NULLABLE", "NON_REQUIRED", "#")
            else:
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          "#", "#", "NULLABLE", "NON_REQUIRED", "#")

class MyFields(Fields):
    """Customized fields class"""
    def add_field(self):
        """Add fields to FC"""
        for in_field in self.in_fields:
            # Test to see if in_field exists already in featureclass
            if in_field[0] in [f.name for f in arcpy.ListFields(self.in_fc)]:
                # If field exists, delete it
                arcpy.DeleteField_management(self.in_fc, in_field[0])
                print in_field[0], "deleted"
            if in_field[1] == "Text":
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          in_field[2], "#", "NULLABLE", "NON_REQUIRED", "#")
            else:
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          "#", "#", "NULLABLE", "NON_REQUIRED", "#")

if __name__ == "__main__":
    # Instantiate MyFields class, which inherits from the Fields class
    f = MyFields()
    # Call add_field method
    f.add_field()


Programming paradigms: OOP - Inheritance

import arcpy

class Fields(object):
    """Class for working with fields"""
    def __init__(self, in_fc="Magic.gdb/Fields",
                 in_fields=[("Distance", "Float", "0"), ("Name", "Text", 50)]):
        self.in_fc = in_fc
        self.in_fields = in_fields

    def add_field(self):
        """Add fields to FC"""
        for in_field in self.in_fields:
            if in_field[1] == "Text":
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          in_field[2], "#", "NULLABLE", "NON_REQUIRED", "#")
            else:
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          "#", "#", "NULLABLE", "NON_REQUIRED", "#")

    def get_field_props(self):
        desc = arcpy.Describe(self.in_fc)
        for field in desc.fields:
            print field.name, "-->", field.type

class MyFields(Fields):
    """Customized fields class"""
    def add_field(self):
        """Add fields to FC"""
        for in_field in self.in_fields:
            if in_field[0] in [f.name for f in arcpy.ListFields(self.in_fc)]:
                arcpy.DeleteField_management(self.in_fc, in_field[0])
                print in_field[0], "deleted"
            if in_field[1] == "Text":
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          in_field[2], "#", "NULLABLE", "NON_REQUIRED", "#")
            else:
                arcpy.AddField_management(self.in_fc, in_field[0], in_field[1], "#", "#",
                                          "#", "#", "NULLABLE", "NON_REQUIRED", "#")

if __name__ == "__main__":
    # Instantiate MyFields class
    f = MyFields()
    # Call add_field method
    f.add_field()
    print f.in_fields
    # See, we really do inherit everything from the Fields class
    f.get_field_props()


Modularizing code

• I'm lazy, so I want to reuse code
• import statement – call functionality in another module
• Have one custom module (a .py file) with code you use all the time
• Great way to package up helper functions
• ESRI does this with ConversionUtils.py
    C:\Program Files (x86)\ArcGIS\Server10.0\ArcToolBox\Scripts


Geometries

• hierarchy:
  – feature class is made of features
  – feature is made of parts
  – part is made of points
• hierarchy in Pythonic terms:
  – part: [[pnt, pnt, pnt, ...]]
  – multipart polygon: [[pnt, pnt, pnt, ...], [pnt, pnt, pnt, ...]]
  – single part polygon with hole: [[pnt, pnt, pnt, None, pnt, pnt, pnt]] (a null point separates the outer ring from the hole)
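The nesting can be explored with plain Python lists before touching arcpy. The sketch below is an illustration only: it uses (x, y) tuples instead of arcpy Point objects, made-up coordinates, and a hypothetical helper named split_rings. It relies on the convention, noted in the reading-geometry code later in the deck, that a None entry marks the start of an interior ring.

```python
# Hypothetical coordinates: one square outer ring and one square hole.
# Each ring repeats its first point to close itself.
polygon = [[(0, 0), (0, 10), (10, 10), (10, 0), (0, 0),
            None,
            (4, 4), (6, 4), (6, 6), (4, 6), (4, 4)]]


def split_rings(part):
    """Split one part into rings wherever a None separator appears."""
    rings, current = [], []
    for pnt in part:
        if pnt is None:
            rings.append(current)
            current = []
        else:
            current.append(pnt)
    rings.append(current)
    return rings


# One list of rings per part; this polygon has a single part.
rings_per_part = [split_rings(part) for part in polygon]
print(len(polygon), "part(s),", len(rings_per_part[0]), "ring(s) in the first part")
```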


Reading geometry
• accessed through the geometry object of a feature
• example: describe_geometry_arcmap.py
    1. open up SearchCursor
    2. loop through rows
    3. get geometry
    4. print out X, Y

import arcpy
desc = arcpy.Describe("Points")
sfn = desc.ShapeFieldName
rows = arcpy.SearchCursor("Points")
for row in rows:
    geom = row.getValue(sfn)
    pnt = geom.getPart()
    print pnt.X, pnt.Y


Reading geometry

import arcpy

desc = arcpy.Describe("Points")
sfn = desc.ShapeFieldName
rows = arcpy.SearchCursor("Points")
for row in rows:
    geom = row.getValue(sfn)
    pnt = geom.getPart()
    print pnt.X, pnt.Y


Reading geometry

import arcpy

infc = "Magic.gdb/Polygons"

# Identify the geometry field
desc = arcpy.Describe(infc)
shapefieldname = desc.ShapeFieldName
# Create search cursor
rows = arcpy.SearchCursor(infc)

# Enter for loop for each feature/row
for row in rows:
    # Create the geometry object
    feat = row.getValue(shapefieldname)
    # Print the current feature's ID
    print "Feature %i:" % row.getValue(desc.OIDFieldName)
    partnum = 0
    # Step through each part of the feature
    for part in feat:
        # Print the part number
        print "Part %i:" % partnum
        # Step through each vertex in the part
        for pnt in feat.getPart(partnum):
            if pnt:
                # Print x,y coordinates of current point
                print pnt.X, pnt.Y
            else:
                # If pnt is None, this represents an interior ring
                print "Interior Ring:"
        partnum += 1


Reading geometry

import arcpy

infc = "Magic.gdb/Polygons"

desc = arcpy.Describe(infc)
shapefieldname = desc.ShapeFieldName

rows = arcpy.SearchCursor(infc)

for row in rows:
    feat = row.getValue(shapefieldname)
    print "\tFeature %i:" % row.getValue(desc.OIDFieldName)
    partnum = 0

    for part in feat:
        parts = []
        print "Part %i:" % partnum

        for pnt in feat.getPart(partnum):
            if pnt:
                parts.append([pnt.X, pnt.Y])
            else:
                parts.append(" ")
        partnum += 1
        print parts


Writing geometry

• arcpy.Point
• point features are point objects, lines and polygons are arrays of point objects
  – arcpy.Polyline, arcpy.Polygon
• Geometry objects can be created using the Geometry, Multipoint, PointGeometry, Polygon, or Polyline classes


Writing geometry

data_list = [[33.09500,-93.90389], [33.03194,-93.89806], [34.34111,-93.50056], [34.24917,-93.67667], [34.22500,-93.89500], [33.76833,-92.48500], [33.74500,-92.47667], [33.68000,-92.46667], [35.05425,-94.12711], [35.03472,-94.12233], [35.03333,-94.12236], [35.01500,-94.12108], [35.00392,-94.12033]]

import arcpy
import time

def PushNbiToFeatureclass(inFc, inList):
    """ Take a list of NBI data and push it directly to a FGDB point FC """
    try:
        cur = arcpy.InsertCursor(inFc)
        for line in inList:
            t = 0
            feat = cur.newRow()
            # data_list rows are [lat, lon]; Point expects (X, Y), i.e. (lon, lat)
            feat.shape = arcpy.Point(line[1], line[0])
            feat.setValue("Timestamp", time.strftime("%m/%d/%Y %H:%M:%S", time.localtime()))
            cur.insertRow(feat)
        del cur
    except Exception as e:
        print e.message

PushNbiToFeatureclass(r"path to fc", data_list)


Writing geometry

import arcpy

arcpy.env.overwriteOutput = 1

# A list of features and coordinate pairs
coordList = [[[1,2], [2,4], [3,7]],
             [[6,8], [5,7], [7,2], [9,18]]]

# Create empty Point and Array objects
point = arcpy.Point()
array = arcpy.Array()

# A list that will hold each of the Polygon objects
featureList = []

for feature in coordList:
    # For each coordinate pair, set the x,y properties and add to the
    # Array object
    for coordPair in feature:
        point.X = coordPair[0]
        point.Y = coordPair[1]
        array.add(point)

    # Add the first point of the array in to close off the polygon
    array.add(array.getObject(0))

    # Create a Polygon object based on the array of points
    polygon = arcpy.Polygon(array)

    # Clear the array for future use
    array.removeAll()

    # Append to the list of Polygon objects
    featureList.append(polygon)

# Copy Polygon object to a featureclass
arcpy.CopyFeatures_management(featureList, "d:/temp/polygons.shp")


Rasters

• arcpy.Raster class
  – raster object: variable that references a raster dataset
  – gives access to raster props
• raster calculations – Map Algebra
  – outras = Slope("in_raster")
  – can cast to Raster object for calculations


Rasters

import arcpy

def get_raster_props(in_raster):
    """Get properties of a raster, return as dict"""
    # Cast input layer to a Raster
    r = arcpy.Raster(in_raster)
    raster_props = {}  # Create empty dictionary to put props in below
    raster_props["x_center"] = r.extent.XMin + (r.extent.XMax - r.extent.XMin)/2
    raster_props["y_center"] = r.extent.YMin + (r.extent.YMax - r.extent.YMin)/2
    raster_props["max_elev"] = r.maximum
    raster_props["min_elev"] = r.minimum
    raster_props["no_data"] = r.noDataValue
    raster_props["terr_width"] = r.width
    raster_props["terr_height"] = r.height
    raster_props["terr_cell_res"] = r.meanCellHeight
    # Return the dictionary of properties
    return raster_props


Spatial references

• can get properties from arcpy.Describe
    >>> sr = arcpy.Describe(fc).spatialReference
    >>> sr.type
    u'Projected' or u'Geographic'
• arcpy.SpatialReference class
• methods to create/edit spatial refs


Spatial references

• arcpy.SpatialReference class
• methods to create/edit spatial refs
    >>> sr_utm = arcpy.SpatialReference()
    >>> sr_utm.factoryCode = 26915
    >>> sr_utm.create()
    >>> sr_utm.name
    ...


ERRORS


Exception Handling

• It's necessary, stuff fails
• Useful error reporting
• Proper application cleanup
• Combine it with logging
    try:
        do something...
    except:
        handle error...
    finally:
        clean up...


Exception handling – try/except

• most basic form of error handling
• wrap whole program or portions of code
• use optional finally clause for cleanup
  – close open files
  – close database connections
  – check extensions back in
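The order in which the three clauses run is easiest to see by recording each step. This is a minimal sketch in Python 3 that does not need arcpy; the scratch file is a hypothetical resource created only so that the finally clause has something to clean up.

```python
import os
import tempfile

events = []

# Hypothetical scratch file, created only so there is something to clean up.
handle, scratch_path = tempfile.mkstemp(suffix=".txt")
os.close(handle)

try:
    events.append("start")
    float("not a number")        # raises ValueError
    events.append("unreachable")
except ValueError as e:
    events.append("handled: %s" % type(e).__name__)
finally:
    # Runs whether or not an exception occurred.
    os.remove(scratch_path)
    events.append("cleaned up")

print(events)
```

Catching a specific exception type (ValueError here) rather than using a bare except keeps unrelated failures from being silently swallowed.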


Exception handling

import arcpy
try:
    arcpy.Buffer_analysis("Observer")
except Exception as e:
    print e.message


Exception handling

import arcpy

try:
    if arcpy.CheckExtension("3D") == "Available":
        arcpy.CheckOutExtension("3D")
    arcpy.Slope_3d("Magic.gdb/NWA10mNED", "Magic.gdb/SlopeNWA")
except:
    print arcpy.GetMessages(2)
finally:
    # Check in the 3D Analyst extension
    arcpy.CheckInExtension("3D")


Exception handling - raise

• allows you to force an exception to occur
• can be used to alert of conditions
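A custom exception class plus raise is enough to signal a condition of your own. The sketch below is in Python 3 and deliberately avoids arcpy: check_out and the available set are hypothetical stand-ins for a license check, used only to show the raise/except flow.

```python
class LicenseError(Exception):
    """Raised when a required extension cannot be checked out."""


def check_out(extension, available):
    """Return the extension name, or raise LicenseError if it is not available."""
    if extension not in available:
        raise LicenseError("%s license unavailable" % extension)
    return extension


# Hypothetical set of extensions the license server would hand out.
available = {"Spatial"}

try:
    check_out("3D", available)
    message = "checked out"
except LicenseError as e:
    message = str(e)

print(message)
```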


Exception handling - raise

import arcpy

class LicenseError(Exception):
    pass

try:
    if arcpy.CheckExtension("3D") == "Available":
        arcpy.CheckOutExtension("3D")
    else:
        raise LicenseError
    arcpy.Slope_3d("NWA10mNED", "SlopeNWA")
except LicenseError:
    print "3D Analyst license unavailable"
except:
    print arcpy.GetMessages(2)
finally:
    # Check in the 3D Analyst extension
    arcpy.CheckInExtension("3D")


Exception handling – AddError and traceback

• AddError – adds your error message to the geoprocessing messages, so it shows up in a script tool's output
• traceback – prints stack trace; determines precise location of error
  – good for larger, more complex programs


Exception handling – AddError and traceback

import arcpy
import sys
import traceback

arcpy.env.workspace = r"C:\Student\Code\MAGIC.gdb"
try:
    # Your code goes here
    float("a string")

except:
    # Get the traceback object
    tb = sys.exc_info()[2]
    tbinfo = traceback.format_tb(tb)[0]
    # Concatenate information together concerning the error into a message string
    # tbinfo: where error occurred
    # sys.exc_info: 3-tuple of type, value, traceback
    pymsg = "PYTHON ERRORS:\nTraceback info:\n" + tbinfo + "\nError Info:\n" + str(sys.exc_info()[1])
    msgs = "ArcPy ERRORS:\n" + arcpy.GetMessages(2) + "\n"
    # Return python error messages for use in script tool or Python Window
    arcpy.AddError(pymsg)
    if arcpy.GetMessages(2):
        arcpy.AddError(msgs)
        print msgs
    # Print Python error messages for use in Python / Python Window
    print pymsg + "\n"


Logging

• logging module
• logging levels:
  – DEBUG: detailed; for troubleshooting
  – INFO: normal operation, statuses
  – WARNING: still working, but unexpected behavior
  – ERROR: more serious, some function not working
  – CRITICAL: program cannot continue


Super-basic logging

import logging
logging.warning("Look out!")
logging.info("Does this print?")
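The answer to the slide's question is no: with no configuration the root logger only passes messages at WARNING or above, so the info call is dropped. The sketch below demonstrates the threshold in Python 3 using a separate logger that writes to an in-memory stream; the logger name "magic_demo" is an arbitrary choice for this illustration.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("magic_demo")   # hypothetical logger name
logger.propagate = False                   # keep output out of the root logger
logger.addHandler(logging.StreamHandler(stream))

logger.setLevel(logging.WARNING)   # mirrors the root logger's default threshold
logger.warning("Look out!")
logger.info("Does this print?")    # below WARNING, so it is dropped

logger.setLevel(logging.INFO)      # lower the threshold
logger.info("Now it does")

captured = stream.getvalue().splitlines()
print(captured)
```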


Super-basic logging to a log file

import logging
logging.basicConfig(filename='log_example.log', level=logging.DEBUG)
logging.debug('This message should get logged')
logging.info('So should this')
logging.warning('And this, too')


Super-basic logging to a log file

import logging
logging.basicConfig(filename="log_example.log", level=logging.DEBUG)
logging.debug("This message should go to the log file")
logging.info("So should this")
logging.warning("And this, too")


Meaningful logging

• "customize" the logger
• add in info-level message(s) to get logged
• log our errors to log file
• can get much more advanced, see the docs


Meaningful logging

import arcpy
import sys
import traceback
import logging
import datetime

log_file = "meaningful_log_%s.log" % datetime.datetime.now().strftime("%Y_%m_%d_%H_%M_%S")

arcpy.env.workspace = r"C:\Student\Code\MAGIC.gdb"

# Setup logger
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(levelname)-8s %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S',
                    filename=log_file,
                    filemode='w')
logging.info(': START LOGGING')

try:
    # Your code goes here
    float("lfkjdlk")

    logging.info(": DONE")

except:
    # Get the traceback object
    tb = sys.exc_info()[2]
    tbinfo = traceback.format_tb(tb)[0]
    # Concatenate information together concerning the error into a message string
    # tbinfo: where error occurred
    # sys.exc_info: 3-tuple of type, value, traceback
    pymsg = "PYTHON ERRORS:\nTraceback info:\n" + tbinfo + "\nError Info:\n" + str(sys.exc_info()[1])
    msgs = "ArcPy ERRORS:\n" + arcpy.GetMessages(2) + "\n"
    # Return python error messages for use in script tool or Python Window
    arcpy.AddError(pymsg)
    if arcpy.GetMessages(2):
        arcpy.AddError(msgs)
        logging.error(": %s" % msgs)
    # Log Python error messages for use in Python / Python Window
    logging.error(": %s" % pymsg + "\n")


Meaningful logging


Code documentation

• Pythonic standards covered in PEPs 8 and 257
• help()
• comments need to be worth it
• name items well
• be precise and compact
• comments may be for you


Creating documentation

• pydoc – built-in; used by help()
  – generate HTML on any module
  – kinda plain
• epydoc – old, rumored to be dead
  – produces nicely formatted HTML
  – easy to install and use
• Sphinx framework
  – "intelligent and beautiful documentation"
  – all the cool kids are using it (docs.python.org)
  – more involved to setup and use


Branching out


Installing packages


Installing packages (on Windows)

• Windows executables
• Python eggs
  – .zip file with metadata, renamed .egg
  – distributes code as a bundle
  – need easy_install
• pip
  – tool for installing and managing Python packages
  – replacement for easy_install


pip

• can take care of dependencies for you
• uninstallation!
• install via easy_install, ironically

    C:\> pip search "kml"
    C:\> pip install BeautifulSoup
    C:\> pip install --upgrade pykml
    C:\> pip uninstall BeautifulSoup


virtualenv

• a tool to create isolated Python environments
• manage dependencies on a per-project basis, rather than globally installing
• test modules without installing into site-packages
• avoid unintentional upgrades


virtualenv

• install via pip, easy_install, or by
    C:\> python virtualenv.py
• create the env
    C:\dir> virtualenv <env>
• activate the env
    C:\dir\<env>\Scripts> activate
• use the env
    (<env>) C:\dir\<env>\Scripts> python
    >>>


virtualenv

• installs Python where you tell it, modifies system path to point there
  – good only while the env is activated
• use yolk to list installed packages in env
    (test) C:\dir> yolk -l
• But can this work in ArcMap Python prompt?


virtualenv

• YES, with a little work...
    >>> execfile(r'C:\<env>\Scripts\activate_this.py', {'__file__': r'C:\<env>\Scripts\activate_this.py'})
• tells ArcMap to use Python interpreter in our virtualenv
  – kill ArcMap, back to using default interpreter


The web

• Infinite source of information
• Right-click and "Save as" is so lame (and too much work)
• Python can help you exploit the web
  – ftplib, http (urllib), mechanize, scraping (Beautiful Soup), send email (smtplib)


Fetching data

• Built-in libraries for ftp and http
• ftplib – log in, nav to directory, retrieve files
• urllib/urllib2 – pass in the url you want, get it back
• wget – GNU commandline tool
  – Can call with os.system()


Fetching data

import urllib
urllib.urlretrieve("http://www.fhwa.dot.gov/bridge/nbi/2011/RI11.txt", "C:/temp/RI11.txt")


Scraping

• Scrape data from a web page
• Well-structured content is a HUGE help, as is valid markup, which isn't always there
• BeautifulSoup 3rd party module
  – Built in methods and regex's help out
  – Great for getting at tables of data


Scraping addresses

http://www.phillypal.com/pal_locations.php


Scraping addresses

import BeautifulSoup as bs
import urllib2

url = "http://www.phillypal.com/pal_locations.php"

# Open the URL
response = urllib2.urlopen(url)
# Slurp all the HTML code into memory
html = response.read()
# Feed it into BS parser
soup = bs.BeautifulSoup(html)
# Find all the table cells whose width=37%
addresses = soup.findAll("td", {"width":"37%"})

print len(addresses)

for address in addresses:
    # Print out just the text
    print address.find(text=True)
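The same idea, picking out table cells by an attribute, can be tried offline with only the standard library. The sketch below swaps in Python 3's html.parser for BeautifulSoup and runs against a made-up HTML snippet rather than the live page; the CellScraper class and the sample addresses are hypothetical.

```python
from html.parser import HTMLParser

# Made-up HTML standing in for the downloaded page.
page_html = """
<table>
  <tr><td width="37%">100 Example Street, 00000</td><td width="20%">ignored</td></tr>
  <tr><td width="37%">200 Sample Avenue, 00001</td><td width="20%">ignored</td></tr>
</table>
"""


class CellScraper(HTMLParser):
    """Collect the text of every <td> whose width attribute is 37%."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.in_cell = False
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("width") == "37%":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and data.strip():
            self.addresses.append(data.strip())


scraper = CellScraper()
scraper.feed(page_html)
addresses = scraper.addresses
print(addresses)
```

BeautifulSoup does the same job in far fewer lines and copes better with invalid markup, which is why the course uses it; the standard-library version is shown only because it runs without installing anything.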


Scraping addresses

1845 N. 23rd Street, 19121
3301 Tasker Street, 19145
5801 Media Street, 19131
250 S. 63rd Street, 19139
732 N. 17th Street, 19130
631 Snyder Avenue, 19148
6901 Rising Sun Avenue, 19111
851 E. Tioga Street, 19134
720 W. Cumberland St., 19133
3890 N. 10th Street, 19140
4550 Haverford Avenue, 19139
1100 W. Rockland St., 19141
1500 W. Ontario Street, 19140
2423 N. 27th Street, 19132
1267 E. Cheltenham Ave., 245330 Germantown Ave., 19144
1599 Wharton Street, 19146
4253 Frankford Avenue, 19124
2524 E. Clearfield St., 19134
6300 Garnet Street, 19126
5900 Elmwood Street, 19143
4301 Wayne Avenue, 19140
4401 Aldine Street, 19136
4614 Woodland Avenue, 19143
4419 Comly Street, 19135
2251 N. 54th Street, 19131


Emailing

• smtplib built-in library
• best if you have IP of your email server
• port blocking can be an issue
    import smtplib
    server = smtplib.SMTP(email_server_ip)
    msg = 'All TPS reports need new cover sheets'
    server.sendmail('[email protected]', '[email protected]', msg)
    server.quit()
• there's always Gmail too…


Files

• built in open function – slurp entire file into memory – OK except for huge files
    data = open(file).read().splitlines()
• iterate over the lines
    for line in data:
        do something
• CSV module
    reader = csv.reader(open('C:/file.csv','rb'))
    for line in reader:
        do something
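To make the csv pattern concrete, the sketch below reads a few rows and converts a column to numbers. It assumes Python 3 and uses made-up CSV text held in an io.StringIO object so that it runs without a file on disk; the column names and values are illustrative only.

```python
import csv
import io

# Made-up CSV text; a real script would open a file instead of using StringIO.
text = "name,lat,lon\nBridge A,33.095,-93.90389\nBridge B,33.03194,-93.89806\n"

reader = csv.reader(io.StringIO(text))
header = next(reader)            # first row holds the column names
rows = [dict(zip(header, line)) for line in reader]

# Values come back as strings, so convert the ones needed as numbers.
lats = [float(row["lat"]) for row in rows]

print(header, len(rows), lats)
```

With a real file in Python 3, open it with open(path, newline="") rather than the 'rb' mode shown on the slide, which applies to Python 2.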


Excel

• love, hate, love
• many modules out there
  – xlrd (read) / xlwt (write) – only .xls
  – openpyxl – read/write .xlsx
• uses
  – Push text data to Excel file
  – Push featureclass data to Excel programmatically
  – Read someone else's "database"


Reading Excel

import xlrd

# Open the workbook
wb = xlrd.open_workbook('Employees.xls')
wb.sheet_names()

# Get first sheet
sh = wb.sheet_by_index(0)
# Print out the rows
for row in range(sh.nrows):
    print sh.row_values(row)

# Get a single cell
cell_b2 = sh.cell(1,1).value
print "\n", cell_b2


Writing Excel

# Write an XLS file with a single worksheet, containing
# a heading row and some rows of data.

import xlwt
import datetime
import bs_scrape as bs
import nbi_data_processing as nbi

ezxf = xlwt.easyxf

def write_xls(file_name, sheet_name, headings, data, heading_xf, data_xfs):
    book = xlwt.Workbook()
    sheet = book.add_sheet(sheet_name)
    rowx = 0
    for colx, value in enumerate(headings):
        sheet.write(rowx, colx, value, heading_xf)
    sheet.set_panes_frozen(True)  # frozen headings instead of split panes
    sheet.set_horz_split_pos(rowx+1)  # in general, freeze after last heading row
    sheet.set_remove_splits(True)  # if user does unfreeze, don't leave a split there
    for row in data:
        rowx += 1
        for colx, value in enumerate(row):
            sheet.write(rowx, colx, value, data_xfs[colx])
    book.save(file_name)

if __name__ == "__main__":
    import sys
    files = ["RI","HI"]
    all_data = []
    stateDict = bs.FetchFipsCodes( )
    for f in files:
        k = nbi.ParseNbiFile('C:/student/inputs/' + f + '11.txt', stateDict )
        all_data.extend(k)
    hdngs = ["Structure","State","Facility carried","Lat","Lon","Year built"]
    kinds = "text text text double double yr".split()
    data = []
    for each_row in all_data:
        data.extend([each_row])
    # Format the headers
    heading_xf = ezxf("font: bold on; align: wrap on, vert centre, horiz center")
    # Set the data type formats
    kind_to_xf_map = {
        "date": ezxf(num_format_str="yyyy-mm-dd"),
        "int": ezxf(num_format_str="#,##0"),
        "money": ezxf("font: italic on; pattern: pattern solid, fore-colour grey25",
                      num_format_str="$#,##0.00"),
        "price": ezxf(num_format_str="#0.000000"),
        "double": ezxf(num_format_str="00.00000"),
        "text": ezxf(),
        "yr": ezxf(num_format_str="0000")
        }
    data_xfs = [kind_to_xf_map[k] for k in kinds]
    write_xls("NBI_Data_To_Excel.xls", "NBI", hdngs, data, heading_xf, data_xfs)


Writing Excel


Databases

• You can connect to pretty much ANY database
• Is there one true solution??
• pyodbc – Access, SQL Server, MySQL
• Oracle – cx_Oracle
• Others – pymssql, _mssql, MySQLdb
• Execute SQL statements through a connection
    conn = library.connect(driver/user/pwd)
    cursor = conn.cursor()
    for row in cursor.execute(sql)
        do something
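The connect / cursor / execute pattern can be tried without a database server. The sketch below swaps in sqlite3 from the Python 3 standard library, a module not mentioned on the slide, with an in-memory database; the table name, columns and rows are made up for the illustration. Other drivers use their own connection strings and parameter markers, but the overall flow is much the same.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table and rows, just to have something to query.
cursor.execute("CREATE TABLE bridges (name TEXT, year_built INTEGER)")
cursor.executemany("INSERT INTO bridges VALUES (?, ?)",
                   [("Bridge A", 1956), ("Bridge B", 1987)])
conn.commit()

# Parameters are passed separately from the SQL text rather than formatted into it.
sql = "SELECT name, year_built FROM bridges WHERE year_built > ? ORDER BY name"
results = [row for row in cursor.execute(sql, (1960,))]

conn.close()
print(results)
```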


Resources - FREE

• Dive into Python
• Python Cookbook
• Think Python
• Python docs
• gis.stackexchange.com
• Google is your friend (as always)
• Python community is HUGE and GIVING


Conferences

• pyArkansas – annually in Conway
  – pyar2 list on python.org
• PyCon – THE national US Python conference
• FOSS4G – international open source for GIS
• ESRI Developer Summit – major dork-fest, but great learning opportunity and Palm Springs in March


IDEs and editors

• Wing – different license levels, good people
• PyScripter – open source, code completion
• Komodo – free version also available
• Notepad2 – ole' standby editor
• Notepad++ - people swear by it
• PythonWin – another standby, but barebones
• …dozens (at least) more editors out there…


More reading

• http://www.voidspace.org.uk/python/articles/OOP.shtml - great OOP article (which I used a lot)
