CANADA MORTGAGE AND HOUSING CORPORATION

Toronto Housing Market Segmentation using GWR

Xiongbing Jin

Canada Mortgage and Housing Corporation

Oct. 12th, 2017 @ Esri User Conference Ottawa




Organization overview

• The mission of Canada Mortgage and Housing Corporation (CMHC) is to help Canadians meet their housing needs

• Business areas

• Mortgage loan insurance

• Affordable housing

• First Nations housing

• Policy and research

• Securitization

• Uses GIS and Esri products in many sections and areas


Why market segmentation

• Location is one of the most important factors influencing housing prices

• Large census metropolitan areas (CMAs) like Toronto, Montreal and Vancouver contain areas with significantly different locational factors

• Market segmentation divides a study area into submarkets. Within each submarket the influence of location is relatively homogeneous, so hedonic models can better capture local market dynamics

• Manual delineation of submarkets is often arbitrary, error-prone, and time-consuming

• A GWR-based automated approach is proposed (Borst, 2007)

Location! Location! Location!


Housing submarkets in the City of Toronto

Source: Toronto Star


What is Geographically Weighted Regression (GWR)

Source: Fotheringham et al (2003)


Key features of GWR

• GWR fits a separate local regression for each subject property, using the properties in its neighbourhood

• Bandwidth determines sample size

• Fixed bandwidth: all properties within 2,000 metres

• Adaptive bandwidth: 2,000 nearest properties regardless of distance

• Kernel determines how weight decreases over distance

• Only the Gaussian kernel is implemented in ArcGIS

• Each GWR local regression captures the localized contributions of price predictors

Source: Borst (2007)
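The features above can be sketched as a weighted least squares fit centred on one property. This is a minimal illustration on synthetic data, not the ArcGIS implementation: the function names (`gaussian_weights`, `local_fit`), the coordinates, and the single standardised predictor are all assumptions made for the example.

```python
import numpy as np

def gaussian_weights(dist, bandwidth):
    """Gaussian kernel: weight decays smoothly as distance grows."""
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

def local_fit(coords, X, y, i, bandwidth=None, n_neighbours=None):
    """Weighted least squares centred on property i.

    Pass `bandwidth` (metres) for a fixed kernel, or `n_neighbours` for an
    adaptive kernel whose bandwidth is the distance to the n-th nearest
    property.
    """
    dist = np.linalg.norm(coords - coords[i], axis=1)
    if bandwidth is None:
        bandwidth = np.sort(dist)[n_neighbours]   # index 0 is property i itself
    w = gaussian_weights(dist, bandwidth)
    Xd = np.column_stack([np.ones(len(X)), X])    # add an intercept column
    XtW = Xd.T * w                                # X^T W without building diag(W)
    return np.linalg.solve(XtW @ Xd, XtW @ y)     # local coefficients

# Synthetic example: the effect of floor area grows from west to east.
rng = np.random.default_rng(0)
n = 400
coords = rng.uniform(0, 10_000, size=(n, 2))      # metres
floor_area = rng.normal(size=(n, 1))              # standardised predictor
slope = 0.5 + coords[:, 0] / 10_000
log_price = 1.0 + slope * floor_area[:, 0] + rng.normal(scale=0.01, size=n)

west, east = np.argmin(coords[:, 0]), np.argmax(coords[:, 0])
beta_west = local_fit(coords, floor_area, log_price, west, bandwidth=2_000)
beta_east = local_fit(coords, floor_area, log_price, east, bandwidth=2_000)
beta_west_adaptive = local_fit(coords, floor_area, log_price, west, n_neighbours=50)
beta_east_adaptive = local_fit(coords, floor_area, log_price, east, n_neighbours=50)
```

In this synthetic setup the local floor-area coefficient at the easternmost property comes out larger than at the westernmost one, which is the kind of spatial variation GWR is meant to expose.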


GWR reveals locational influences – value of floor area


Use GWR for market segmentation

• Select best independent variables for use in GWR (Exploratory Regression)

$\log(\text{price}) = f(\text{bathrooms},\ \text{fireplaces},\ \text{floor/basement area},\ \text{age},\ \text{month of sale})$

• Determine GWR parameters and run GWR (GWR with cross validation/AICc)

• With over 130,000 points, ArcGIS is the only software that can run GWR within 16 GB of RAM

• GWR captures how the contribution of each variable to housing prices varies across the region. Using GWR to predict the price of an average house across the city reveals the overall influence of location on property prices
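Choosing the bandwidth by cross validation can be sketched as follows. This is a simplified leave-one-out version on synthetic data, with a fixed Gaussian kernel; the function name `loo_cv_score`, the candidate bandwidths, and the data are assumptions for illustration and do not reproduce the ArcGIS search procedure.

```python
import numpy as np

def loo_cv_score(coords, X, y, bandwidth):
    """Sum of squared leave-one-out prediction errors for one bandwidth."""
    Xd = np.column_stack([np.ones(len(X)), X])
    sse = 0.0
    for i in range(len(y)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)
        w[i] = 0.0                                # leave the subject property out
        XtW = Xd.T * w
        beta = np.linalg.solve(XtW @ Xd, XtW @ y)
        sse += (y[i] - Xd[i] @ beta) ** 2
    return sse

# Synthetic data in which the predictor's effect varies from west to east.
rng = np.random.default_rng(1)
n = 300
coords = rng.uniform(0, 10_000, size=(n, 2))      # metres
X = rng.normal(size=(n, 1))
slope = 0.5 + coords[:, 0] / 10_000
y = 1.0 + slope * X[:, 0] + rng.normal(scale=0.05, size=n)

# The largest candidate is so wide that it behaves like a single global model.
candidates = [1_000.0, 2_000.0, 5_000.0, 50_000.0]
scores = {b: loo_cv_score(coords, X, y, b) for b in candidates}
best = min(scores, key=scores.get)
```

Because the true relationship varies over space here, a local bandwidth scores better than the near-global one; on real data the same comparison guards against a bandwidth that is too small (noisy) or too large (over-smoothed).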


Benchmark house

• Benchmark house (or Market Basket House) is a “typical” house whose characteristics are the median values of the property characteristics of all houses in the region

• For Toronto, a benchmark house has

• 2 full bathrooms,

• 1 fireplace,

• 173 square metres (1,862 square feet) of floor area,

• 48 square metres (516 square feet) of finished basement area,

• was built 18 years ago, and

• was sold in September 2013

(values based on properties sold in Toronto CMA between Nov 2010 and Oct 2015)
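A minimal sketch of how a benchmark house can be built and priced is shown here. The five sales records and the local coefficients are made up for illustration (the records were chosen so that their medians match the Toronto benchmark above), month of sale is left out for brevity, and a natural-log price model is assumed.

```python
import numpy as np

# Hypothetical sales records, one row per property. Columns:
# bathrooms, fireplaces, floor area (m^2), finished basement area (m^2), age (years).
features = np.array([
    [1, 0, 110.0, 30.0, 45],
    [2, 1, 173.0, 48.0, 18],
    [2, 1, 180.0, 50.0, 12],
    [3, 2, 260.0, 90.0, 5],
    [2, 0, 150.0, 40.0, 30],
])

# The benchmark house takes the median of every characteristic.
benchmark = np.median(features, axis=0)

# Hypothetical local GWR coefficients at three locations:
# intercept followed by one coefficient per characteristic (log-price model).
local_betas = np.array([
    [12.0, 0.05, 0.03, 0.0030, 0.0010, -0.002],
    [12.4, 0.06, 0.02, 0.0035, 0.0012, -0.003],
    [11.8, 0.04, 0.03, 0.0025, 0.0008, -0.001],
])

x_bench = np.concatenate([[1.0], benchmark])      # prepend the intercept term
benchmark_log_price = local_betas @ x_bench       # same house, different locations
benchmark_price = np.exp(benchmark_log_price)     # back to price units
```

Since the house is identical at every location, any difference in `benchmark_price` comes only from the local coefficients, that is, from location.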


Benchmark house price prediction


Use GWR for market segmentation: clustering

• Group points based on the predicted benchmark house value (Grouping Analysis, which uses k-means clustering)

• k-means clustering partitions observations into k clusters where each observation belongs to the cluster with the nearest mean

• Animated example of k-means clustering, from David Kauchak (Pomona College) http://www.cs.pomona.edu/~dkauchak/classes/f13/cs451-f13/lectures/lecture31-kmeans.pptx
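The k-means procedure described above can be sketched on one-dimensional data such as predicted benchmark prices. This is a plain NumPy illustration, not the Grouping Analysis tool itself; the function name `kmeans_1d` and the synthetic prices are assumptions.

```python
import numpy as np

def kmeans_1d(values, k, n_iter=100, seed=0):
    """Basic k-means on a 1-D array: assign to nearest center, then readjust."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False)    # random initial centers
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its points (keep it if it has none).
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):              # no changes: done
            break
        centers = new_centers
    # Final assignment against the final centers.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers

# Synthetic predicted benchmark prices drawn around three price levels.
rng = np.random.default_rng(0)
values = np.concatenate([
    rng.normal(400_000, 20_000, 50),
    rng.normal(600_000, 20_000, 50),
    rng.normal(900_000, 20_000, 50),
])
labels, centers = kmeans_1d(values, k=3)
```

The result depends on the random initial centers, so in practice the algorithm is often run from several starting points and the best partition kept.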


Use GWR for market segmentation: number of segments

• k values between 2 and 11 (i.e. 2 to 11 submarkets) are tested.

• For each k value:

• After k-means clustering groups the points into k groups, the boundaries between groups are re-aligned to census tract boundaries

• An ordinary least squares (OLS) model is estimated for each submarket (using the same model specification as the GWR model)

• The overall performance of all submarket models is summarized

• Performance is then compared between the segmentation scenarios to identify the optimal number of submarkets

• The 7-submarket scenario is selected as the optimum, balancing performance and model complexity

• When model performance is similar, the simpler model is always preferred
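The evaluation loop for one k value can be sketched as below. Two choices here are assumptions, since the slides do not specify them: boundaries are re-aligned by giving each census tract its most common cluster label, and overall performance is summarized as the pooled RMSE of the submarket OLS models. Function names and data are illustrative.

```python
import numpy as np

def realign_to_tracts(labels, tract_ids):
    """Give every point in a tract the tract's most common cluster label."""
    aligned = labels.copy()
    for tract in np.unique(tract_ids):
        in_tract = tract_ids == tract
        aligned[in_tract] = np.bincount(labels[in_tract]).argmax()
    return aligned

def pooled_rmse(X, y, labels):
    """Fit one OLS model per submarket and pool the residuals."""
    Xd = np.column_stack([np.ones(len(X)), X])
    resid = np.empty_like(y)
    for g in np.unique(labels):
        m = labels == g
        beta, *_ = np.linalg.lstsq(Xd[m], y[m], rcond=None)
        resid[m] = y[m] - Xd[m] @ beta
    return np.sqrt(np.mean(resid ** 2))

# Synthetic data with two true submarkets that price the predictor differently.
rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=(n, 1))
segment = (np.arange(n) >= 100).astype(int)
y = np.where(segment == 0, 1.0 + 0.5 * X[:, 0], 3.0 + 1.5 * X[:, 0])
y = y + rng.normal(scale=0.1, size=n)

rmse_single = pooled_rmse(X, y, np.zeros(n, dtype=int))    # one market
rmse_two = pooled_rmse(X, y, segment)                      # two submarkets
```

In-sample error can only fall as nested segments are added, which is why the comparison above weighs performance against model complexity rather than simply picking the largest k.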


Comparing different numbers of submarkets


Market segmentation results for the Toronto CMA

Submarkets:

1. Mississauga/Oakville

2. Scarborough/Durham Region

3. Toronto (excluding Don Valley and Scarborough)

4. Don River Valley

5. South York Region

6. North York Region

7. Brampton


The submarkets have distinct market dynamics

Note: Submarket names are for demonstration only, and do not correspond to the actual administrative areas.


Comparing single and submarket models

• Model quality and performance are compared between

• A single-market model covering the entire Toronto CMA, including all previously mentioned variables plus census tract dummy variables (to capture locational influences)

• 7 submarket models, one for each identified submarket, using the same model specification as the single market model

• Model quality

• Submarket models greatly reduce spatial autocorrelation, as measured by Global Moran’s I (Spatial Autocorrelation tool)


Comparing single and submarket models’ performance


Conclusion

• The method based on GWR and k-means clustering is able to detect distinct housing submarkets

• Market segmentation improves model quality and prediction accuracy


References

• Borst, R. (2007). Discovering and Applying Location Influence Patterns in the Mass Valuation of Domestic Real Property. PhD thesis. University of Ulster.

• Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2003). Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. John Wiley & Sons.

• Kauchak, D. (2013). Machine Learning and Big Data (Course material). http://www.cs.pomona.edu/~dkauchak/classes/f13/cs451-f13/lectures/lecture31-kmeans.pptx

• Radil, S. (2011). Spatializing social networks: making space for theory in spatial analysis. PhD thesis. University of Illinois at Urbana-Champaign.

• Yew, M. (2013). Homes in GTA see big price gain. Toronto Star. https://www.thestar.com/business/real_estate/2013/07/18/homes_in_gta_see_big_price_gain.html


Connecting R and ArcGIS

• Using ArcGIS in R

• arcgisbinding: An R library released by Esri to read/write/convert ArcGIS data formats

• reticulate: An R library to run Python/ArcPy code from within R


Demo: ArcGIS API for Python



Additional slides


k-means clustering: an example

Source: David Kauchak (Pomona College)


k-means clustering: initialize centers randomly

Source: David Kauchak (Pomona College)


k-means clustering: assign points to nearest center

Source: David Kauchak (Pomona College)


k-means: readjust centers

Source: David Kauchak (Pomona College)


k-means: assign points to nearest center

Source: David Kauchak (Pomona College)


k-means: readjust centers

Source: David Kauchak (Pomona College)


k-means: assign points to nearest center

Source: David Kauchak (Pomona College)


k-means: readjust centers

Source: David Kauchak (Pomona College)


k-means: assign points to nearest center

No changes: Done

Source: David Kauchak (Pomona College)


Spatial autocorrelation (Moran’s I)

• Moran’s I measures spatial autocorrelation, here in the model residuals. Values closer to 0 indicate more spatial randomness, i.e. less spatial autocorrelation

• Moran’s I = 1 (clustered), Moran’s I = 0 (random), Moran’s I = -1 (dispersed)

Source: Steven Radil (2011)
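Global Moran’s I can be computed directly from its definition, I = (n / S0) · (Σi Σj wij zi zj) / (Σi zi²), where z holds deviations from the mean and S0 is the sum of all weights. The sketch below uses a small grid with rook (shared-edge) neighbours; the grid, the weights and the function names are assumptions for illustration, not the ArcGIS tool.

```python
import numpy as np

def rook_weights(n_rows, n_cols):
    """Binary weights: 1 for cells that share an edge, 0 otherwise."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:                     # neighbour to the right
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < n_rows:                     # neighbour below
                w[i, i + n_cols] = w[i + n_cols, i] = 1.0
    return w

def morans_i(values, weights):
    """Global Moran's I for a 1-D array of values and an n x n weight matrix."""
    z = values - values.mean()
    s0 = weights.sum()
    return (len(values) / s0) * (z @ weights @ z) / (z @ z)

w = rook_weights(4, 4)

# Checkerboard: every neighbour has the opposite value (perfect dispersion).
checkerboard = (np.indices((4, 4)).sum(axis=0) % 2).astype(float).ravel()

# Left half 1, right half 0: like values sit next to each other (clustering).
halves = np.tile([1.0, 1.0, 0.0, 0.0], 4)

i_checkerboard = morans_i(checkerboard, w)
i_halves = morans_i(halves, w)
```

The checkerboard gives I = -1 and the two-halves pattern gives a clearly positive value, matching the dispersed and clustered ends of the scale.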