ken bragg: batch data processing in fme

Potent Potions for Batch Data Processing


Post on 21-Jan-2018



250,000 CAD files & rasters on mobile devices

Tip: Know Your Potions and Choose Wisely

Today’s Potions

1. Wildcards

2. Batch Deploy

3. Parent/child Workspaces

4. Parent/child Server Workspaces

Potion 1: One Wild<card>

Dataset

Multi-Dataset Picker


Shapefile MapInfo

Most rasters

DWG DGN SQLite

Dataset Wildcards

Extended glob syntax:

Symbol       Matches
?            Any single character
*            Any sequence of zero or more characters
[chars]      Any single character in chars
[a-d]        Any character between a and d, inclusive
{a,b,...}    Any of the sub-patterns a, b, ...
/**/         Zero or more subdirectories
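The file-name symbols in the table above can be tried out with a minimal Python sketch. It is not FME's matcher: the standard `fnmatch` module already handles `?`, `*`, `[chars]` and `[a-d]`, and the hypothetical helper `expand_braces` adds the `{a,b,...}` alternatives (without nesting). The `/**/` directory wildcard is not covered here, and the file names are made up for illustration.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand {a,b,...} alternatives into plain patterns (nesting not supported)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that a second {...} group later in the pattern is expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def matches(name, pattern):
    """Return True if a file name matches the pattern, case-sensitively."""
    return any(fnmatch.fnmatchcase(name, p) for p in expand_braces(pattern))


if __name__ == "__main__":
    for name in ["road1.dwg", "road12.dwg", "parcels.dgn", "tile_c.tif"]:
        print(name, matches(name, "*.{dwg,dgn}"), matches(name, "tile_[a-d].tif"))
```

Testing a pattern against a handful of sample names like this, before pointing a reader at thousands of files, is a cheap way to catch a pattern that matches too much or too little.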

Time to brew Potion 1

Potion 1: Enticements

Wildcard Bulk Data Processing Enticements

✓  Simple to set up

✓  Can transform across file boundaries

-  Needs memory & time

Potion 1: Pitfalls

Wildcard Bulk Data Processing Pitfalls

x  Recovery from data errors difficult

x  Feature Type vs File vs Format Issues

x  No granular log

x  No ability to parallelize

Potion 2: Batch Deploy

Batch Deploy Script Writer


Time to brew Potion 2

Potion 2: Enticements

Batch Deploy Enticements

✓  Simple to set up

✓  Runs quickly

✓  Can script via command line

✓  Run on demand

Potion 2: Pitfalls

Batch Deploy Pitfalls

x  Recovery from data errors difficult

x  No granular log

x  Destination dataset naming can be tricky
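The command-line scripting and destination-naming points can be illustrated with a short Python sketch. It only builds one `fme` command line per source file and derives each destination name from the source file's stem. It is not the script that Batch Deploy itself generates. The parameter names `SourceDataset` and `DestDataset`, the workspace name, and the paths are assumptions; a real workspace defines its own published parameter names.

```python
from pathlib import PurePosixPath


def build_batch_commands(workspace, sources, dest_dir, dest_ext,
                         src_param="SourceDataset", dest_param="DestDataset"):
    """Build one command line per source file, with a derived destination path.

    src_param and dest_param are hypothetical published parameter names.
    Paths are treated as forward-slash paths for the sake of the example.
    """
    commands = []
    seen = set()
    for src in sources:
        stem = PurePosixPath(src).stem
        dest = str(PurePosixPath(dest_dir) / (stem + dest_ext))
        # Two sources with the same stem would overwrite each other's output.
        if dest in seen:
            raise ValueError("destination name collision: " + dest)
        seen.add(dest)
        commands.append(["fme", workspace,
                         "--" + src_param, src,
                         "--" + dest_param, dest])
    return commands


if __name__ == "__main__":
    for cmd in build_batch_commands("convert.fmw",
                                    ["in/north/a.dwg", "in/south/b.dwg"],
                                    "out", ".shp"):
        print(" ".join(cmd))
```

The collision check is there because flattening a directory tree into a single output folder is one way destination naming goes wrong: `north/a.dwg` and `south/a.dwg` would both map to `out/a.shp`.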

Potion 3: Parent/Child Workspaces

Parent/Child Workspace Ingredients

•  Parent Workspace:

–  PathReader

–  WorkspaceRunner

•  Child Workspace:

–  FeatureWriter
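The division of labour in the ingredients above (the parent enumerates paths and launches a child per path, the child does the actual writing) can be sketched outside Workbench as a plain Python loop. This is an analogy under stated assumptions, not how the PathReader or WorkspaceRunner transformers are implemented: `run_children` and `fme_child` are hypothetical names, and the published parameter `SOURCE_PATH` is made up.

```python
import subprocess


def run_children(paths, run_child):
    """Parent loop: one child run per path, with one audit record per path."""
    audit = []
    for path in paths:
        try:
            run_child(path)
            audit.append({"path": path, "status": "ok", "error": ""})
        except Exception as exc:
            # A failing file is recorded and the loop carries on with the rest.
            audit.append({"path": path, "status": "failed", "error": str(exc)})
    return audit


def fme_child(workspace, param="SOURCE_PATH"):
    """Return a runner that calls a child workspace through the fme command line.

    The parameter name is an assumption; use whatever the child workspace publishes.
    """
    def run(path):
        subprocess.run(["fme", workspace, "--" + param, path], check=True)
    return run


if __name__ == "__main__":
    # Stand-in child so the sketch runs without FME installed.
    def fake_child(path):
        if path.endswith(".bad"):
            raise RuntimeError("unreadable file")

    for record in run_children(["a.dwg", "b.bad", "c.dwg"], fake_child):
        print(record)
```

Because each file gets its own run and its own audit record, a single bad file no longer hides inside one large translation, which is the per-file granularity that the wildcard and batch deploy approaches lack.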

Parent Ingredients


Child Ingredients

Time to brew Potion 3

Potion 3: Enticements

Parent/Child Workspace Enticements

✓  Separate transformation from workflow

✓  Generate audit logs

✓  All authored within Workbench

Potion 3: Pitfalls

Parent/Child Workspace Pitfalls

x  Not all writers can be used concurrently

x  Slow to run each child workspace separately

x  Recovery from data errors not easy if concurrent runs used

Potion 4: Parent/Child Server Workspaces

Parent/Child Server Workspace Ingredients

•  Parent Workspace:

–  PathReader

–  FMEServerJobSubmitter

–  FMEServerJobWaiter

•  Child Workspace:

–  FeatureWriter
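The submit-then-wait pattern behind these ingredients can be sketched in Python with a thread pool standing in for the Server engines. This is only an illustration of the pattern, not of FME Server's API: `run_job` is a hypothetical callable that would submit one child job for a path and block until it finishes, and `max_workers` plays the role of the number of available engines.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_and_wait(paths, run_job, max_workers=4):
    """Run one job per path in parallel and return results in input order."""
    def safe_run(path):
        try:
            return {"path": path, "status": "ok", "result": run_job(path)}
        except Exception as exc:
            return {"path": path, "status": "failed", "result": str(exc)}

    # Leaving the with-block waits for every submitted job to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_run, paths))


def failed_paths(results):
    """List the paths that need to be reloaded after a run."""
    return [r["path"] for r in results if r["status"] != "ok"]


if __name__ == "__main__":
    # Stand-in job so the sketch runs without a server.
    def fake_job(path):
        if path == "bad.dwg":
            raise RuntimeError("translation failed")
        return "done: " + path

    results = submit_and_wait(["a.dwg", "bad.dwg", "c.dwg"], fake_job, max_workers=2)
    print(results)
    print("reload:", failed_paths(results))
```

Keeping a per-path result list makes a reload plan straightforward: only the paths returned by `failed_paths` need to be resubmitted.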

Parent Ingredients


Child Ingredients

Time to brew Potion 4

Potion 4: Enticements

Parent/Child Server Workspace Enticements

✓  Separate transformation from workflow

✓  Generate audit logs

✓  All authored within Workbench

✓  Make full use of parallelism = FAST

Potion 4: Pitfalls

Parent/Child Server Workspace Pitfalls

x  Not all writers can be used concurrently

x  Data needs to be accessible to Server Engines - Consider using Server Data Resources

x  Craft your reload/audit plan

Summary

●  Many ways to handle bulk data moves

●  Choose your potion wisely - each has pluses and minuses

●  FME Server is the most robust automation choice

Questions?

Batch processing tutorial: fme.ly/b59