Python Geoprocessing for Rangeland Management: Guidance for Developing Efficient Tools and Processes

David Howes, Ph.D.

David Howes, LLC

dhowes.com

Eric Sant

Open Range Consulting

openrangeconsulting.com

Northwest GIS 2019, November 6th, 2019

David Howes, Ph.D.

• Education

• B.Sc. (Hons) in Geography – University of Salford, England

• M.Sc. in Geographic Information Systems – University of Edinburgh, Scotland

• Ph.D. in Geomorphology – State University of New York at Buffalo, New York

• 28 years in GIS

• Specialty: GIS tools, processes and supporting infrastructure

• Established

• David Howes, LLC in 2012

• GISPD.com in 2014

Eric Sant, M.S.

• Education

• B.S. in Geography – Utah State University

• M.S. in Geography – Utah State University

• Specialty

Improving grazing management by giving land managers geospatial assessment products that are accurate, timely, and cost efficient

• Experience

Worked on a wide variety of federal, state, and private industry projects concerned with assessing the biological value of large landscapes

Acknowledgement

Tim Bateman - Geospatial Analyst, Open Range Consulting

Open Range Consulting

• http://openrangeconsulting.com/

• “Open Range Consulting wants to improve the world through rangeland management by providing statistically valid, economically feasible and expedient landscape information”

Presentation Approach

• Requirement

• Solution

• Considerations

Requirement

Tasks

• Run the ArcGIS Spatial Analyst Zonal Statistics as Table tool for a set of zones and a set of rasters derived from a composite (four-band) raster (from aerial imagery) (see sketch below)

• Use the zonal statistics data to develop a model using the statistical package R

• Apply the model to rasters derived from the source raster to create an output raster
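For orientation, a minimal sketch of the zonal-statistics step with arcpy, assuming hypothetical paths, a "ZoneID" zone field, and a single derived raster (the actual tools take the project's own zones and rasters):

import arcpy

arcpy.CheckOutExtension("Spatial")  # Spatial Analyst is required for zonal statistics

# Illustrative inputs: a training-zone feature class and one raster derived from the composite
zones_fc = r"C:\Data\zones.gdb\training_zones"
derived_raster = r"C:\Data\Derived\band_ratio.img"
out_table = r"C:\Data\zones.gdb\zonal_stats_band_ratio"

# Summarize the raster values falling within each zone into a table (mean, min, max, etc.)
arcpy.sa.ZonalStatisticsAsTable(zones_fc, "ZoneID", derived_raster, out_table,
                                "DATA", "ALL")

arcpy.CheckInExtension("Spatial")

The resulting table (one record per zone) is what feeds the R modeling step.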

Original Solution

Developed Python scripts to apply to full source raster

Reality

• Source rasters keep getting bigger as resolution and required coverage increase

• Processes take far too long

• Soon run out of RAM

Options

• Get a bigger machine

• Enhance the procedures

Solution

Both

• New computer

• 28 cores (56 logical processors)

• 128GB RAM

• New Python tools

Split Source Raster into Parts and Process Simultaneously

Append Output Parts

Return Full Output Rasters

Typical Data Quantities

Item                     Size (GB)   Count   Total Size (GB)
Composite raster            255.00       1            255.00
Composite raster band        63.75       4            255.00
Derived raster               63.75      19          1,211.25
Output raster                63.75       1             63.75
TOTAL                                               1,785.00

Composite raster: 1-m NAIP (National Agriculture Imagery Program) imagery, .img format, 172,854 x 148,946 pixels

Conceptual Steps

• Prepare source and derived part rasters

• Generate zonal statistics data for training zones using source part rasters

• Prepare an R model using the zonal statistics data

• Run the R model for each part

• Join part output rasters together to create full final raster

Implementation

• Develop a set of Python scripts

• Run at the command line

• Use the same simple input file for all scripts (see sketch below)

• Standardize the code and syntax

• Develop a clean data structure
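As a sketch of the shared-input-file idea, assuming a plain key=value text file (the file name, keys, and helper function here are hypothetical):

# read_inputs.py - hypothetical helper shared by all of the scripts
def read_inputs(input_file_path):
    """Return a dictionary of settings from lines like 'source_raster = C:\\Data\\naip.img'."""
    settings = {}
    with open(input_file_path) as input_file:
        for line in input_file:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings

# Example use (hypothetical keys):
# settings = read_inputs("process_inputs.txt")
# source_raster = settings["source_raster"]

Keeping every script on the same input file keeps the command-line calls simple and the whole process easier to explain.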

Data Folder/File Structure

Level 1 Folder   Level 2 Folder   Level 3 Folder   Contents
Source\                                            Source raster
Parts\           Part_001\                         Source part raster
                 Part_002\ etc.
Processing\      Part_001\        In\              Source band rasters, derived rasters
                                  Out\             Part output raster
                 Part_002\ etc.
Out\                                               Full output raster
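A small sketch of building this folder structure with the standard library (the root path and part count are placeholders):

import os

root = r"C:\Projects\Rangeland"   # placeholder project root
part_count = 4                    # placeholder number of parts

for name in ["Source", "Parts", "Out"]:
    os.makedirs(os.path.join(root, name), exist_ok=True)

for i in range(1, part_count + 1):
    part = "Part_{:03d}".format(i)  # Part_001, Part_002, ...
    os.makedirs(os.path.join(root, "Parts", part), exist_ok=True)
    os.makedirs(os.path.join(root, "Processing", part, "In"), exist_ok=True)
    os.makedirs(os.path.join(root, "Processing", part, "Out"), exist_ok=True)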

Scripts (1)

• prepare_parts.py / create_part_rasters.py

• Extract source parts and create derived part rasters using multiprocessing (see sketch below)

• Create part extents feature class

• zone_raster_relationships.py

• Identify zones per source part raster and store details in Zone-Raster Relationships table

• multi_raster_zonal_statistics.py

• Read records in Zone-Raster Relationships table

• Run Spatial Analyst Zonal Statistics as Table tool for each part using multiprocessing

• Compile data into single output table
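One way the part-extraction step in prepare_parts.py could be sketched is with the Data Management Split Raster tool; the paths and the 2 x 2 tiling below are assumptions for illustration, not the actual script's parameters:

import arcpy

source_raster = r"C:\Projects\Rangeland\Source\naip_composite.img"  # placeholder path
parts_folder = r"C:\Projects\Rangeland\Parts"

# Split the composite raster into a grid of part rasters (here 2 columns x 2 rows)
arcpy.management.SplitRaster(source_raster, parts_folder, "Part_",
                             "NUMBER_OF_TILES", "IMAGINE IMAGE",
                             num_rasters="2 2")

In the real workflow each part is then handled in its own process, as in the Basic Multiprocessing pattern shown later.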

Scripts (2)

• run_r_model_parallel.py

• Run R model for each part using multiprocessing

• Input: derived part rasters

• Output: single part output raster

• create_raster_from_parts.py

• Mosaic part output rasters to create final full output raster
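A hedged sketch of the mosaicking step in create_raster_from_parts.py; the paths, pixel type, and band count are illustrative, and the per-part outputs are assumed to sit under Processing\Part_*\Out\ as in the folder structure above:

import arcpy
import glob
import os

processing_folder = r"C:\Projects\Rangeland\Processing"  # placeholder paths
out_folder = r"C:\Projects\Rangeland\Out"

# Collect the part output rasters written by the R model step
part_rasters = glob.glob(os.path.join(processing_folder, "Part_*", "Out", "*.img"))

# Mosaic the part outputs into the final full output raster
arcpy.management.MosaicToNewRaster(part_rasters, out_folder, "full_output.img",
                                   pixel_type="32_BIT_FLOAT", number_of_bands=1)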

Script: multi_raster_zonal_statistics.py

• Using multiprocessing, for each part (see sketch below):

• Create temporary folder

• Create temporary file geodatabase

• Create temporary zones feature class

• Create and run temporary Python script

• Run Zonal Statistics as Table for part zones and source part raster

• Create temporary output table

• Compile data from temporary zonal statistics table into single table

• Remove temporary data
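A condensed sketch of that per-part worker; it skips the temporary-script indirection, and the zone field, folder names, and geodatabase name are assumptions:

import arcpy
import os
import shutil
import tempfile

def process_part(part_zones_fc, part_raster, results_table):
    """Run Zonal Statistics as Table for one part in its own temporary workspace."""
    temp_folder = tempfile.mkdtemp(prefix="zonal_")           # temporary folder
    arcpy.management.CreateFileGDB(temp_folder, "temp.gdb")   # temporary file geodatabase
    temp_gdb = os.path.join(temp_folder, "temp.gdb")

    # Temporary copy of the zones that intersect this part
    temp_zones = arcpy.management.CopyFeatures(part_zones_fc,
                                               os.path.join(temp_gdb, "part_zones"))

    # Zonal statistics for this part's zones against the source part raster
    arcpy.CheckOutExtension("Spatial")
    temp_table = os.path.join(temp_gdb, "zonal_stats")
    arcpy.sa.ZonalStatisticsAsTable(temp_zones, "ZoneID", part_raster, temp_table)

    # Copy the temporary results out before cleaning up; the parent process
    # compiles these per-part tables into the single output table
    arcpy.management.CopyRows(temp_table, results_table)
    arcpy.CheckInExtension("Spatial")
    shutil.rmtree(temp_folder)                                 # remove temporary data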

Basic Multiprocessing

# Imports
from multiprocessing import Process
import subprocess


# Function to run each process
def run_shell(command):
    p = subprocess.Popen(command)
    p.communicate()


def main():
    tasks = []
    for part_name in ["Part_001", "Part_002"]:  # one process per part
        # Create process
        args_str = part_name  # arguments passed to the per-part script
        command = "python process_part.py " + args_str
        task = Process(target=run_shell, args=(command,))
        task.start()
        tasks.append(task)

    # Wait for all processes to finish
    for task in tasks:
        task.join()


# Guard so child processes can import this module without re-running main()
if __name__ == "__main__":
    main()

Issues - Empty Parts

Issues - Spatial Analyst Licensing

• ArcGIS Desktop Python 2.7

Spatial Analyst license needs to be checked out for each task process (see sketch below)

• ArcGIS Pro Python 3.6

No licensing restrictions on simultaneous use of Spatial Analyst tools
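For reference, the per-process check-out in the Desktop case is a single standard arcpy call (shown only for illustration):

import arcpy

# Each task process must obtain its own Spatial Analyst license before using SA tools
if arcpy.CheckOutExtension("Spatial") != "CheckedOut":
    raise RuntimeError("Spatial Analyst license is not available")

# ... run Zonal Statistics as Table, etc. ...

arcpy.CheckInExtension("Spatial")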

Issues - Multiprocessing Within ArcGIS Programs

Can't run multiprocessing within ArcGIS desktop programs

• Desktop (ArcMap, ArcCatalog)

Program hangs

• Pro

New Pro instance is opened every time a task command is issued

Hence, command line processing

Issues - ArcGIS Pro Python Licensing Error

• Can't run more than 34 processes at once

• For each additional process, error occurs

“RuntimeError: Not signed into Portal”

• Esri solution about to be tested

Considerations

Professional Development

• Want code to be as understandable as possible for client

• Provides a resource for learning and other coding needs

• Don't want to provide a black box

Clarity Triggers

• Suffering from confusion

• Can't remember what you did or how your process works

• Procedures are difficult to explain

If any of these apply, make processes clearer and simpler

Overcoming Lack of Clarity

• Refactor - split processes into basic components

• Simplify - refine and rearrange code

• Generalize - make code reusable wherever possible

Responding to Evolving Requirements

• Can't know all future requirements, but can make growth easier with clean and careful coding

• Try to think ahead and be prepared

• Sometimes need to step back, reevaluate, and refine approaches

• Iterative process as experience is gained

• Apply new ideas

• Address new requirements

Eliminate Barriers to Progress

• Always be ready to

• Reuse code

• Explain it

• Defend it

• You don't want to get stuck using and explaining your own procedures

Standards

• Use simple, consistent, and descriptive coding style and vocabulary

• Follow PEP-8 - Style Guide for Python Code

• https://www.python.org/dev/peps/pep-0008/

• PEP = Python Enhancement Proposal

• Use code inspector (e.g., PyCharm Inspect Code function)

• Implement clean, simple data structures

Wider Applicability

• Concept can be applied to any geoprocessing operation for which tasks can be separated into independent parts

• Processing requirements continually increasing

• Multiprocessing will become more important over time

Takeaway Messages

• Be innovative

• Be ready to evolve and adapt

• Apply helpful standards for all aspects of your processing requirements

Slides available at http://gispd.com/events

Thanks for Coming
