postal address cleanup

Postal address clean-up: Two unusual FME Workbench applications. Andrew Zolnai for Mouvement Démocrate Europe du Nord

Uploaded by: andrew-zolnai

Posted on 16-Jan-2017


TRANSCRIPT

Page 1: Postal address cleanup

Postal address clean-up

Two unusual FME Workbench applications

Andrew Zolnai for

Mouvement Démocrate Europe du Nord

Page 2: Postal address cleanup

The premise

• Help French 2017 presidential / parliamentary election campaign

• Using nationbuilder.com for a London UK based campaign

• Liste Electorale Consulaire is structured, or is it?

• Ergo normalise 4 address columns to upload…

Page 3: Postal address cleanup

Standards? What standards?

Page 4: Postal address cleanup

Nothing new here…

Page 5: Postal address cleanup

Regex on steroids

• 1Spatial hosted the London leg of the Safe FME World Tour 2015

  • scrape 4 years' worth of playlists and tracks off the StrayFM website

  • and categorise and rank the most played artists and tracks

• Unusual use of FME Workbench for non-spatial data

• Similar exercise with help from Safe

  • StringSearcher – search address components

  • AttributeSplitter – split them into similar parts

  • AttributeManager – re-order into one schema
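For readers without FME, the search / split / re-order chain above can be approximated in plain Python. This is only a rough analogue of the three transformers, not their actual behaviour or parameters. The postcode pattern, the comma delimiter and the output field names are all assumptions made for illustration, since the deck does not show the expressions used.

```python
import re

# Hypothetical UK-style postcode pattern, for illustration only.
POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b", re.IGNORECASE)


def search_split_reorder(raw: str) -> dict:
    """Search for a postcode, split the rest on commas, emit one fixed schema."""
    # "Search" step: find the first postcode-like component, if any.
    match = POSTCODE.search(raw)
    postcode = match.group(0).upper() if match else ""

    # "Split" step: drop the postcode, then split what is left on commas.
    rest = POSTCODE.sub("", raw) if match else raw
    parts = [p.strip() for p in rest.split(",") if p.strip()]

    # "Re-order" step: always return the same (assumed) field order.
    return {
        "street": parts[0] if parts else "",
        "city": parts[1] if len(parts) > 1 else "",
        "postcode": postcode,
    }
```

Whether the postcode comes first or last in the input, the output schema is the same, which is the point of the re-ordering step.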

Page 6: Postal address cleanup
Page 7: Postal address cleanup

So what’s going on?

• Get the first matches of address strings in the 4 address fields

• If string is empty then assign the next address string to it

• Country name is constant last string

• Build normalised string sets backward from it
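One possible reading of these four steps, sketched in Python rather than FME: skip empty fields so the next string takes their place, treat the last remaining string as the country, and fill the schema backward from there. The target field names (address1, address2, city, country) are assumptions, and the sketch also assumes the string just before the country is the city, which the deck does not state.

```python
def normalise(fields):
    """Normalise up to four raw address fields into one assumed schema."""
    # Skip empty fields so that the next non-empty string takes their place.
    values = [f.strip() for f in fields if f and f.strip()]

    # The country name is taken to be the constant last string.
    country = values.pop() if values else ""

    # Work backward from the country: the string before it is assumed
    # to be the city (a simplification, not stated in the deck).
    city = values.pop() if values else ""

    # Whatever remains is the street part, kept in its original order.
    address1 = values[0] if values else ""
    address2 = ", ".join(values[1:])

    return {
        "address1": address1,
        "address2": address2,
        "city": city,
        "country": country,
    }
```

For example, `normalise(["Flat 2", "", "London", "United Kingdom"])` puts "Flat 2" in address1, leaves address2 empty, and anchors city and country at the end.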

Page 8: Postal address cleanup

But << drum roll >>

• FeatureMerger works on non-spatial data too!

1. Initial load:

  • find rejected addresses

  • repeat the procedure if possible

2. Ongoing updates:

  • find the new entries as updated lists are received

  • repeat the procedure on the “delta” only
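The "delta" idea amounts to a key-based match between two record lists, keeping only the rows that find no partner. Here is a minimal Python sketch of that idea. It is not FeatureMerger itself, and the `id` key is a hypothetical column name, as the deck does not say which attribute was used for the join.

```python
def find_delta(previous, updated, key="id"):
    """Return rows of `updated` whose key does not appear in `previous`."""
    # Collect the keys already processed in the earlier load.
    known = {row[key] for row in previous}
    # Keep only the unmatched rows: these are the new entries.
    return [row for row in updated if row[key] not in known]


# Hypothetical sample data, for illustration only.
old_list = [{"id": 1, "city": "London"}, {"id": 2, "city": "Leeds"}]
new_list = old_list + [{"id": 3, "city": "York"}]
delta = find_delta(old_list, new_list)
```

The same function covers the initial load: passing the accepted rows as `previous` and the uploaded rows as `updated` leaves the rejected addresses.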

Page 9: Postal address cleanup
Page 10: Postal address cleanup

And what did we get?

• Metadata… metadata… metadata…

• 15 MB CSV list with verbose 30 MB PDF

• Resulted in 10 MB cleaned-up CSV list

• Uploaded clean address base into nationbuilder.com