![Page 1: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/1.jpg)
UCL DEPARTMENT OF GEOGRAPHY
Open Geodemographics: Open Tools and the 2011 OAC
UCL Department of Geography, Gower Street, London, WC1E 6BT
Chris Gale*, Muhammad Adnan, Paul Longley
mapblog.in, gis-tech.co.uk, paul-longley.com
@geogale, @gisandtech
* Conference attendance kindly supported by an RGS-IBG funded QMRG bursary
![Page 2: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/2.jpg)
Outline
• What is Geodemographics?
• Need for Open Geodemographics
• GeodemCreator
• The 2011 Output Area Classification
• Summary
![Page 3: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/3.jpg)
Geodemographics
• The analysis of people by where they live
• Areas can be described by the characteristics and attitudes of the people who live in them
• Based on the concept that people with similar characteristics are more likely to live within the same locality, and that such area types will be distributed in different locations across a geographical space
• Commercial (MOSAIC, ACORN) and free (OAC) classifications are available
![Page 4: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/4.jpg)
Commercial Geodemographic Classifications
• Created as ‘black box’ systems (Longley and Singleton, 2009)
• Closed methods are used with little documentation
• Little information is given regarding the data inputs, normalisation and weighting procedures, and clustering methods employed
![Page 5: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/5.jpg)
Need for Open, Transparent, and Flexible Classifications
• Increased number of data sources due to ‘open data’ initiatives
– ONS NeSS data exchange, London data store, Crime data API
• Need for open methods
– Open methods for estimation, normalisation, and clustering procedures
• Open public consultation
![Page 6: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/6.jpg)
Need for Open, Transparent, and Flexible Classifications
• A number of statistical packages could be used for building geodemographic classifications
– R, SPSS, Microsoft Excel
• No unified software utility exists that could be used for building open, transparent, and flexible classifications
• ‘GeodemCreator’ is a unified software utility for building geodemographic classifications
![Page 7: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/7.jpg)
GeodemCreator
• A cross-platform Java software utility for building geodemographic classifications
• Requires ‘Java’ and ‘R’ to be installed on the user’s machine
• Geodemographic classifications can be created for any geographical level and with any data set
• Users can combine census data with their own data sources
![Page 8: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/8.jpg)
GeodemCreator
• Operates in ‘Basic’ and ‘Advanced’ modes
– Basic mode is for inexperienced and new users
– Advanced mode is for experienced users
• Clusters the data using the k-means clustering algorithm
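The k-means step named on this slide can be illustrated with a minimal sketch. This is not GeodemCreator's code (the slides say only that it relies on Java and R); it is a plain standard-library Python version of the usual assign-then-update loop, and the names `kmeans`, `rows`, `k`, `iterations` and `seed` are assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=100, seed=0):
    """Basic k-means: returns (labels, centroids) for a list of numeric rows."""
    rng = random.Random(seed)  # fixed seed so the same input gives the same run
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each area joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:  # converged: no area changed cluster
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each row would hold the standardised variable values for one area (for example one Output Area), so `kmeans(rows, 7)` would return one cluster label per area.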
![Page 9: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/9.jpg)
![Page 10: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/10.jpg)
GeodemCreator Case Study
• A Socio-economic and Ethnic classification of Greater London
• Created by using 41 OAC variables and 12 ethnicity variables (created from ethnicity data source http://worldnames.publicprofiler.org)
• GeodemCreator was used for building the final classification
![Page 11: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/11.jpg)
GeodemCreator Case Study Data Sources
• Variables V1 to V41 from the 2001 OAC
• Variables V42 to V53 ethnicity
V42: ‘European’ ethnic group
V43: ‘East Asian & Pacific’ ethnic group
V44: ‘Muslim’ ethnic group
V45: ‘Greek’ ethnic group
V46: ‘English’ ethnic group
V47: ‘Nordic’ ethnic group
V48: ‘African’ ethnic group
V49: ‘Japanese’ ethnic group
V50: ‘Hispanic’ ethnic group
V51: ‘Celtic’ ethnic group
V52: ‘Jewish’ ethnic group
V53: ‘South Asian’ ethnic group
![Page 12: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/12.jpg)
GeodemCreator Case Study Results
• A Socio-economic and Ethnic classification of Greater London:
![Page 13: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/13.jpg)
GeodemCreator Case Study Results
• GeodemCreator also produces radial charts for each cluster solution
English and European ethnic groups living in suburban areas
Well off and educated Asian families
![Page 14: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/14.jpg)
GeodemCreator Case Study Results
English, European, and Celtic fringe city commuters
Poor Asian Families
![Page 15: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/15.jpg)
GeodemCreator Case Study Results
Childless European city dwellers
Native blue collar communities
![Page 16: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/16.jpg)
GeodemCreator Case Study Results
English and European ethnic groups living in council properties
![Page 17: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/17.jpg)
The 2001 Output Area Classification (OAC)
• Groups the UK population into:
– 7 Supergroups
– 21 Groups
– 52 Subgroups
• Only data source used is the 2001 Census
– 41 Variables
• A variety of organisations use it, including local government and commercial companies
![Page 18: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/18.jpg)
The 2011 Output Area Classification
• Building on the success of the 2001 OAC
• The 2001 OAC’s real achievement was showing that open-source geodemographic classifications were possible
• Can utilise developments in computing over the past 6 years, since the 2001 OAC’s publication, to make improvements
• Can be produced using open-source software (if required) with a fully open and transparent methodology
![Page 19: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/19.jpg)
The 2011 Output Area Classification
• Not just a repeat of the 2001 Output Area Classification
• Methodology that will possibly not rely on 100% Census data
• Enhanced outputs to cater for different potential users
• Designed to allow easy creation of bespoke variants
– Variables and/or Geography
– Automated variable selection depending on user criteria (e.g. variables used for a national classification are not necessarily suitable for a regional classification)
![Page 20: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/20.jpg)
2011 OAC Variables
• Code used to auto-select the best variables for the desired purpose
• Allows for a fully transparent and repeatable methodology
– Variable selection was the only “black box” element of the 2001 OAC
• Allows for wider scale bespoke geodemographics
– A user with no geodemographics experience can produce their own classification by selecting the variables, standardisation method, and number of clusters.
– Removes technical barriers that could prevent wider adoption of bespoke geodemographic classifications.
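The three user choices listed on this slide (variables, standardisation method, number of clusters) can be sketched as a small preparation step. The slides do not say which standardisation methods would be offered, so the two shown here (z-score and range standardisation) are assumptions, as are the names `STANDARDISERS`, `prepare`, `table` and the column names used in the usage note.

```python
import statistics


def z_score(column):
    """Centre on the mean and scale by the population standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(x - mean) / sd if sd else 0.0 for x in column]


def range_standardise(column):
    """Rescale so the smallest value becomes 0 and the largest becomes 1."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in column]


# Hypothetical menu of methods a user could pick from.
STANDARDISERS = {"z-score": z_score, "range": range_standardise}


def prepare(table, variables, method):
    """Keep only the chosen variables and standardise each one.

    `table` maps a variable name to a list of values, one per area.
    Returns one row per area, ready to pass to a clustering routine.
    """
    standardise = STANDARDISERS[method]
    columns = [standardise(table[name]) for name in variables]
    return [list(row) for row in zip(*columns)]
```

For example, `prepare({"v1": [2, 4, 6], "v2": [0, 10, 5]}, ["v1", "v2"], "range")` yields three rows with every value between 0 and 1; the number of clusters would then be the only remaining choice.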
![Page 21: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/21.jpg)
Bespoke Geodemographic Classifications
• Categorised into 3 main types:
– Using the same data already provided in a classification.
– Changing the number of variables used to create a classification.
– Uploading other data that was not originally included in a pre-existing classification, or creating a new classification from scratch.
• In the case of OAC this could resolve a problem when it is used at a regional level
– London is an example of one such region that OAC does not classify very well.
![Page 22: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/22.jpg)
![Page 23: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/23.jpg)
The Hull City Council Classification
• Bespoke free area classification of Hull
• 45 Census Variables used
• 10 Groups in 3 hierarchies
![Page 24: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/24.jpg)
2011 OAC and Open Data
• Would it be better to use potentially “newer” Open Data (when compared with the 2011 Census)?
• How much of a problem is the lack of data currently available at OA level?
• Using Open Data raises a lot of questions:
– What sources of Open Data should be used?
– What should the coverage of the Open Data be?
– Does the integrity of the Open Data matter?
– How often should the Open Data sources be updated?
• Beyond 2011
![Page 25: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/25.jpg)
On-The-Fly Clustering
• To meet the changing and varying needs of users, a dynamic classification environment needs to be created
• The ability to create bespoke classifications is a requirement
– both for different geographies (e.g. London or UK) and for the range and number of variables utilised (e.g. Census and/or non-Census), with an additional weighting capacity
• Will require clustering to happen in real time
• Research into users’ specific needs has been undertaken
– 2011 OAC User Engagement (run in partnership with the ONS)
– Results to be published by ONS by late April
![Page 26: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/26.jpg)
On-The-Fly Clustering Objectives
• Find the optimum real-time clustering solution
– Use the mean Within-Cluster Sum of Squares (WCSS) value to determine the optimum K-Means cluster solution.
– Determine the number of clustering algorithm iterations needed to create a good clustering solution without resulting in poor functionality.
• Create repeatability
– Overcome the inherent random seeding of K-Means, which results in an OA remaining in the same cluster group but being given a random cluster assignment (e.g. a number from 1 to 7) each time the clustering is run.
• Incorporate different data sources
– Both Census and non-Census data
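The first two objectives can be sketched together: score each candidate K-Means result by its WCSS, keep the lowest, and then renumber the clusters by a fixed rule so the same grouping always receives the same numbers. The slides do not describe how the 2011 OAC tooling achieves this; ordering clusters by centroid coordinates is one assumed rule among several possible, and `wcss`, `canonical_labels` and `best_run` are illustrative names. The code uses the total WCSS, which ranks runs in the same order as a mean when the number of clusters is fixed.

```python
def centroids_of(rows, labels):
    """Mean vector of each cluster, keyed by its (arbitrary) label."""
    groups = {}
    for row, lab in zip(rows, labels):
        groups.setdefault(lab, []).append(row)
    return {
        lab: [sum(col) / len(members) for col in zip(*members)]
        for lab, members in groups.items()
    }


def wcss(rows, labels):
    """Total squared distance from each area to its own cluster centroid."""
    cents = centroids_of(rows, labels)
    return sum(
        sum((x - c) ** 2 for x, c in zip(row, cents[lab]))
        for row, lab in zip(rows, labels)
    )


def canonical_labels(rows, labels):
    """Renumber clusters 1..k in order of their centroid coordinates."""
    cents = centroids_of(rows, labels)
    order = sorted(cents, key=lambda lab: cents[lab])
    renumber = {old: new for new, old in enumerate(order, start=1)}
    return [renumber[lab] for lab in labels]


def best_run(rows, candidate_labelings):
    """Pick the labelling with the lowest WCSS and give it stable numbers."""
    best = min(candidate_labelings, key=lambda labels: wcss(rows, labels))
    return canonical_labels(rows, best)
```

With this rule, two runs that find the same grouping but number it differently (say `[5, 5, 2, 2]` and `[2, 2, 5, 5]`) both come out as `[1, 1, 2, 2]`.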
![Page 27: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/27.jpg)
What the Within-Cluster Sum of Squares Value means
• The lower the mean value, the more homogeneous (i.e. better) the final cluster groupings are
– Clustering using the lowest WCSS value can therefore be considered to create the optimum cluster groupings.
• Using anything other than the optimum cluster solution can give differing results depending on the dataset and level of geography
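For reference, the standard definition of the quantity discussed on this slide is given below. The slides do not define "mean" WCSS; dividing the total by the number of clusters k is one plausible reading and is labelled here as an assumption. The symbols (areas x_i, clusters C_j, centroids \mu_j) are notation introduced for this note.

```latex
\mathrm{WCSS} \;=\; \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
\qquad
\mu_j \;=\; \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i,
\qquad
\overline{\mathrm{WCSS}} \;=\; \frac{\mathrm{WCSS}}{k} \quad \text{(assumed reading of the mean)}
```

The comparison is meaningful between solutions with the same number of clusters: adding clusters generally lowers WCSS regardless of quality, so "lowest WCSS is best" applies to repeated runs at a fixed k.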
![Page 28: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/28.jpg)
![Page 29: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/29.jpg)
![Page 30: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/30.jpg)
![Page 31: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/31.jpg)
![Page 32: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/32.jpg)
Summary
• The 2001 OAC was an important first step for open source geodemographics
• The 2011 OAC can build on the successes of the 2001 OAC
• Tools like GeodemCreator can be used to create bespoke geodemographic classifications easily and without any “expert” knowledge
• The 2011 OAC is still in the planning phase but should be released in some form by late 2012/early 2013
![Page 33: Open Geodemographics: Open Tools and the 2011 OAC](https://reader036.vdocuments.us/reader036/viewer/2022062614/54644969af795979338b483d/html5/thumbnails/33.jpg)
Any Questions?