Hedonic Imputed Housing Price Indices from a Model with Dynamic Shadow Prices Incorporating Nearest Neighbour Information
Harry T. Cominos
University of Queensland
Supervisors: Prasada Rao; Alicia Rambaldi
Structure of Presentation
1. Background and Motivation
2. Existing Hedonic Methods for House Price Indexes
3. Hedonic Specification with Spatial Autocorrelation
4. Hedonic Specification with Dynamic Coefficients
5. Data (Brisbane Metropolitan Area)
6. Empirical Results
Background & Motivation
Two main issues in house price index construction:
1. Quality change problem
2. Compositional change problem
The hedonic method (theoretically) accounts for both quality and compositional changes over time.
Hedonic Method
• Regression based: explains the price of a house using a range of characteristics.
• Drawback: data intensive.
The hedonic approach has led to the
1. Time-dummy variable method (DTH)
2. Hedonic Imputation method (HI)
3. Characteristics Price method (CP)
DTH Method
$$\ln P_{it} = x_{it}'\beta + d_{it}'\delta + \varepsilon_{it}$$
where x_it is a vector of housing characteristics and d_it is a vector of time-dummy variables.
This is a POOLED (not panel) regression!
$$\hat{I}_t = \exp\left(\hat{\delta}_t - \widehat{\operatorname{var}}(\hat{\delta}_t)/2\right)$$
DTH Method
• The DTH method assumes the same hedonic model and characteristics in every period.
• It is less restrictive to use the adjacent-period approach (AP-DTH), i.e. a regression is estimated for every pair of adjacent periods.
• There is no choice of index number formula for DTH and AP-DTH.
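A minimal sketch of the pooled time-dummy regression may make the mechanics concrete. It assumes ordinary least squares on log prices, one dummy per period with the first period as the base, and an index built from the dummy coefficients with a half-variance adjustment. The function name `time_dummy_index` and the toy data are hypothetical, not taken from the presentation.

```python
import numpy as np

def time_dummy_index(log_price, X, period):
    """Pooled time-dummy hedonic regression; index is relative to the first period."""
    log_price = np.asarray(log_price, dtype=float)
    X = np.asarray(X, dtype=float)          # 2-D array: one column per characteristic
    period = np.asarray(period)
    periods = np.unique(period)
    # One dummy per period, omitting the first period as the base.
    D = (period[:, None] == periods[None, 1:]).astype(float)
    Z = np.column_stack([np.ones(len(log_price)), X, D])
    coef, *_ = np.linalg.lstsq(Z, log_price, rcond=None)
    resid = log_price - Z @ coef
    sigma2 = resid @ resid / (len(log_price) - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)   # OLS covariance of the coefficients
    k = 1 + X.shape[1]                      # intercept + characteristics come first
    delta, var_delta = coef[k:], np.diag(cov)[k:]
    # Index from each time-dummy coefficient with a half-variance adjustment.
    return np.concatenate([[1.0], np.exp(delta - var_delta / 2.0)])

# Toy data (assumed): one characteristic (bedrooms), three periods, no noise.
beds = np.array([2, 3, 4, 3, 2, 3, 4, 5, 2, 3, 4, 5], dtype=float)
period = np.repeat([0, 1, 2], 4)
true_delta = np.array([0.0, 0.1, 0.3])
log_price = 11.0 + 0.1 * beds + true_delta[period]
index = time_dummy_index(log_price, beds[:, None], period)
```

An adjacent-period variant would call the same function on each pair of neighbouring periods and chain the resulting two-period indexes.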
Hedonic Imputation (HI) Method
HI uses imputed prices for missing products in each time period, which allows matched price indices to be computed.
Consider the model:
$$\ln P_t^h = \sum_{c=1}^{C} x_{c,t}^h \beta_{c,t} + v_t^h, \quad h = 1,\dots,H_t;\; c = 1,\dots,C;\; t = 1,\dots,T$$
HI Method
Key assumption: characteristics do not change over time.
Impute prices:
$$\hat{P}_s^h(x_t^h) = \exp\left(\sum_{c=1}^{C} x_{c,t}^h \hat{\beta}_{c,s}\right)$$
The above imputed price formula is biased; see Appendix A of the paper.
HI Method
Once imputed prices are computed, a variety of price index formulae can be used. One class of Törnqvist index is:
$$I^{T1}_{s,t} = \left[ I^{GP}_{s,t} \times I^{GL}_{s,t} \right]^{1/2} = \left[ \prod_{h=1}^{N_t} \left( \frac{\hat{P}_t^h(x_t^h)}{\hat{P}_s^h(x_t^h)} \right)^{1/N_t} \times \prod_{h=1}^{N_s} \left( \frac{\hat{P}_t^h(x_s^h)}{\hat{P}_s^h(x_s^h)} \right)^{1/N_s} \right]^{1/2}$$
$$= \prod_{h=1}^{N_t} \left( \frac{\hat{P}_t^h(x_t^h)}{\hat{P}_s^h(x_t^h)} \right)^{1/(2N_t)} \times \prod_{h=1}^{N_s} \left( \frac{\hat{P}_t^h(x_s^h)}{\hat{P}_s^h(x_s^h)} \right)^{1/(2N_s)}$$
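The imputation and index steps can be sketched in a few lines. This is an illustration under stated assumptions: I^GP and I^GL are read as geometric means of imputed price relatives over the period-t and period-s samples respectively, the imputed price is the plain exponential of the fitted log price (no bias correction), and all function names and numbers are hypothetical.

```python
import math

def impute_price(x, beta):
    """Imputed price exp(sum_c x_c * beta_c) for one house (no bias correction)."""
    return math.exp(sum(xc * bc for xc, bc in zip(x, beta)))

def geometric_relative(houses, beta_s, beta_t):
    """Geometric mean of imputed relatives P_t(x) / P_s(x) over a sample of houses."""
    logs = [math.log(impute_price(x, beta_t) / impute_price(x, beta_s)) for x in houses]
    return math.exp(sum(logs) / len(logs))

def tornqvist_t1(houses_s, houses_t, beta_s, beta_t):
    """Equally weighted geometric mean of the period-s and period-t sample indexes."""
    gl = geometric_relative(houses_s, beta_s, beta_t)  # characteristics of houses sold in s
    gp = geometric_relative(houses_t, beta_s, beta_t)  # characteristics of houses sold in t
    return math.sqrt(gp * gl)

# Toy example (assumed): columns are [intercept, bedrooms]; only the intercept moves.
houses_s = [[1.0, 3.0], [1.0, 4.0]]
houses_t = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]
beta_s, beta_t = [11.0, 0.1], [11.2, 0.1]
t1 = tornqvist_t1(houses_s, houses_t, beta_s, beta_t)
```

When only the intercept changes between periods, every imputed relative equals exp(0.2), so the two sample indexes and their geometric mean coincide.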
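Another Törnqvist type Index
$$I^{T2}_{s,t} = \left( I^{GP}_{s,t} \right)^{\frac{N_t}{N_t+N_s}} \times \left( I^{GL}_{s,t} \right)^{\frac{N_s}{N_t+N_s}} = \prod_{h=1}^{N_t} \left( \frac{\hat{P}_t^h(x_t^h)}{\hat{P}_s^h(x_t^h)} \right)^{\frac{1}{N_t}\cdot\frac{N_t}{N_t+N_s}} \times \prod_{h=1}^{N_s} \left( \frac{\hat{P}_t^h(x_s^h)}{\hat{P}_s^h(x_s^h)} \right)^{\frac{1}{N_s}\cdot\frac{N_s}{N_t+N_s}}$$
This index weights the sample which is more indicative of the population.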
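A short sketch of the sample-size weighting, assuming the imputed price relatives for each sample have already been computed. Because each house's exponent simplifies to 1/(N_t + N_s), the index reduces to a pooled geometric mean over both samples. The function name and the relatives below are hypothetical.

```python
import math

def tornqvist_t2(relatives_s, relatives_t):
    """Pooled geometric mean of imputed relatives P_t/P_s over both samples.

    relatives_s: relatives for houses sold in period s (N_s values)
    relatives_t: relatives for houses sold in period t (N_t values)
    """
    n_s, n_t = len(relatives_s), len(relatives_t)
    log_sum = sum(math.log(r) for r in relatives_s) + sum(math.log(r) for r in relatives_t)
    return math.exp(log_sum / (n_s + n_t))

# Toy example (assumed): 2 sales in period s, 6 sales in period t.
relatives_s = [1.10, 1.10]   # period-s sample index is 1.10
relatives_t = [1.20] * 6     # period-t sample index is 1.20
t2 = tornqvist_t2(relatives_s, relatives_t)
t1 = math.sqrt(1.10 * 1.20)  # equally weighted version, for comparison
```

With three times as many sales in period t, the sample-size-weighted index sits closer to the period-t sample index than the equally weighted version does.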
Hedonic Specification with Spatial Autocorrelation
House prices should be spatially autocorrelated because neighbourhoods
1. Have similar structural characteristics (block size, age)
2. Share location amenities (supermarkets, schools)
3. Share socioeconomic variables (local crime rates, wealth levels)
Hedonic Specification with Spatial Autocorrelation
• Most empirical examples of hedonic indexes assume white noise errors, but...
• In practice, the residuals should be spatially autocorrelated because the hedonic function is not fully specified (due to data constraints).
The Spatial Error Model (SEM)
$$y = X\beta + \varepsilon$$
$$\varepsilon = \rho W \varepsilon + u$$
where W is an n x n spatial weight matrix, u is white noise, and ρ is the parameter that captures the magnitude of the spatial autocorrelation.
The SEM can be estimated by GLS or ML.
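As a rough illustration of the GLS route, the sketch below treats ρ as known: pre-multiplying the regression by (I − ρW) leaves a white noise error, so ordinary least squares on the transformed data gives the GLS estimate. In practice ρ would have to be estimated (for example by ML). The ring-shaped W, the simulated data and the name `sem_gls` are assumptions made for the example.

```python
import numpy as np

def sem_gls(y, X, W, rho):
    """GLS for y = X b + e, e = rho W e + u, with rho treated as known."""
    A = np.eye(len(y)) - rho * W            # (I - rho W) e = u is white noise
    beta, *_ = np.linalg.lstsq(A @ X, A @ y, rcond=None)
    return beta

# Simulated example (assumed): each observation has two neighbours on a ring.
rng = np.random.default_rng(0)
n, rho, beta_true = 200, 0.6, np.array([1.0, 2.0])
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5   # row-normalised weights
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = 0.1 * rng.normal(size=n)
eps = np.linalg.solve(np.eye(n) - rho * W, u)     # spatially correlated errors
y = X @ beta_true + eps
beta_hat = sem_gls(y, X, W, rho)
```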
Spatial weight matrix (W)
W has elements representing the spatial relationship between houses i and j. The researcher takes the specification of W as known a priori.
Common properties:
• W is non-negative
• w_ii = 0 (i.e. an observation does not affect its own prediction)
Spatial Weight Matrix
We form W as follows: $w_{ij} = 1$ if i and j are contiguous observations, and $w_{ij} = 0$ otherwise. Contiguity can be artificially constructed using Delaunay triangulation.
Then W is row normalised.
Contiguity
Contiguous points share a common edge.
Note: MATLAB has an inbuilt Delaunay function.
Spatial Weight Matrix
• In order to construct W in this manner, the latitude and longitude coordinates of each house must be known.
• This allows R. Kelley Pace's FDELW2 MATLAB function to be used to convert the Delaunay algorithm results into a contiguity matrix.
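The same construction can be sketched in Python. This is not a port of FDELW2; it is a stand-in that uses `scipy.spatial.Delaunay`, marks two houses as contiguous when they share a triangle edge, and then row-normalises. Treating latitude and longitude as planar coordinates is an assumption that is only reasonable over a small area, and the function name and points are hypothetical.

```python
import numpy as np
from scipy.spatial import Delaunay

def delaunay_weights(coords):
    """Row-normalised contiguity matrix from a Delaunay triangulation of 2-D points."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    W = np.zeros((n, n))
    for simplex in Delaunay(coords).simplices:   # each simplex holds 3 point indices
        for a in simplex:
            for b in simplex:
                if a != b:
                    W[a, b] = 1.0                # w_ij = 1 for contiguous points
    row_sums = W.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0                # guard for points left out of the triangulation
    return W / row_sums                          # each non-empty row sums to one

# Toy layout (assumed): four corner houses and one in the middle.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
W = delaunay_weights(points)
```

In this layout the middle house is contiguous with all four corners, while diagonally opposite corners are not contiguous with each other.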
Hedonic Specification with Dynamic Coefficients
So far hedonic parameters have been
• constrained to be constant over the sample period of interest (i.e. the DTH method), or
• allowed to vary in a sporadic fashion because the hedonic regression is re-estimated in every time period (or every second period for AP-DTH).
The State-space SEM (SSSEM)
Alternatively, we assume that the vector of parameters follows a random walk process. We can extend the SEM as follows:
$$y_t = X_t\beta_t + \varepsilon_t$$
$$\varepsilon_t = \rho W_t \varepsilon_t + u_t$$
$$\beta_t = \beta_{t-1} + \eta_t$$
The SSSEM is estimated using the Kalman filter and smoother. Unknown hyperparameters are estimated by MLE.
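To show how random-walk coefficients are tracked, here is a simplified Kalman filter sketch. It deliberately ignores the spatial error structure (the observation error is assumed to be white noise with known variance), takes the hyperparameters as given rather than estimating them by MLE, and omits the smoother. All names and numbers are hypothetical.

```python
import numpy as np

def kalman_filter_rw(ys, Xs, obs_var, state_cov, beta0, P0):
    """Filtered beta_t for y_t = X_t beta_t + eps_t, beta_t = beta_{t-1} + eta_t."""
    beta = np.asarray(beta0, dtype=float)
    P = np.asarray(P0, dtype=float)
    filtered = []
    for y, X in zip(ys, Xs):
        P = P + state_cov                            # predict: random walk keeps the mean
        F = X @ P @ X.T + obs_var * np.eye(len(y))   # innovation covariance
        K = np.linalg.solve(F, X @ P).T              # Kalman gain P X' F^{-1}
        beta = beta + K @ (y - X @ beta)             # update with this period's sales
        P = P - K @ X @ P
        filtered.append(beta)
    return np.array(filtered)

# Toy data (assumed): intercept drifts upward, bedroom coefficient stays fixed.
X = np.column_stack([np.ones(5), [2.0, 3.0, 4.0, 3.0, 5.0]])
true_betas = np.array([[11.0 + 0.05 * t, 0.1] for t in range(6)])
ys = [X @ b for b in true_betas]
filtered = kalman_filter_rw(ys, [X] * 6, obs_var=1e-4,
                            state_cov=0.01 * np.eye(2),
                            beta0=np.zeros(2), P0=10.0 * np.eye(2))
```

Each row of `filtered` is the coefficient vector for one period, so the drifting intercept can be read off the first column.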
Data
• Data compiled from the property information service RP DATA, from www.rpdata.com
• Website is search engine based.
Data preparation:
1. Raw data downloaded and hedonics chosen.
2. Data were filtered.
3. Address of each house was geocoded to provide lat/long.
Data
• Initially 316,359 observations from the early 50s to 12/05.
• Most observations did not contain info on property attributes.
• Of those observations containing hedonics, reporting was inconsistent.
• Hence, only ADDRESS, PRICE, DATE, AREA, BED, BATH, CARLUG information was kept.
• After filtering, 71,583 remaining observations spanning 01/71 to 12/05.
Summary of Data over time
      PRICE ($)               AREA (m2)            BED               BATH              CARLUG
YEAR  Min    Median  Max      Min  Median  Max     Min  Median  Max  Min  Median  Max  Min  Median  Max  No. Obs
1990  10000  110000  4800000  169  607     34000   1    3       7    1    1       7    1    2       8    2198
1991  8500   123000  1070000  169  607     40000   1    3       6    1    1       5    0    2       6    2251
1992  3000   130000  770000   152  607     99100   1    3       8    1    1       5    1    2       8    2288
1993  3000   137500  2300000  146  607     45300   1    3       8    1    1       6    1    2       9    2462
1994  3996   145000  980000   152  607     66000   1    3       7    1    1       5    0    2       8    2332
1995  2000   142000  1790000  152  607     60400   1    3       8    1    1       6    1    2       7    1660
1996  5000   144900  1400000  126  607     31800   1    3       7    1    1       5    1    2       6    2107
1997  3000   152000  1410000  120  607     40500   1    3       9    1    1       5    1    2       8    2793
1998  2000   158500  2475000  106  607     98100   1    3       8    1    1       6    0    2       9    2941
1999  1400   165000  3000000  143  607     65000   1    3       8    1    1       9    1    2       8    3892
2000  1090   172300  2750000  120  607     163700  1    3       8    1    1       6    1    2       7    4323
2001  1210   200000  5350000  106  607     100000  1    3       8    1    1       5    1    2       8    4954
2002  1610   255000  5900000  101  607     198000  1    3       8    1    1       7    0    2       9    5240
2003  1860   330000  8200000  113  607     99100   1    3       8    1    1       9    0    2       8    6672
2004  2004   375000  6000000  101  607     114500  1    3       9    1    2       7    0    2       8    6097
2005  1111   375000  7000000  107  607     114500  1    3       9    1    1       6    0    2       8    7613
[Figure: No. Observations per Month. Vertical axis: 0 to 800; horizontal axis: Jan-75 to Jan-05.]
Data Limitations
• Data could not be downloaded for all suburbs.
• With an extensive list of property attributes, it would be great to use some of this information! (POOL etc.)
• Tedious and time consuming to compile data.
Results: Spatial Stat Tests
Year  Moran I  Moran ZI  LR stat
1990  0.1464   11.9155   127.1655
1991  0.2563   21.0753   334.7955
1992  0.2156   17.8345   254.9895
1993  0.2031   17.4664   247.0573
1994  0.2081   17.4404   232.4361
1995  0.1982   13.9541   161.8869
1996  0.1655   13.1358   141.6757
1997  0.2379   21.8062   369.8247
1998  0.2311   21.7087   362.4950
1999  0.2779   29.9798   688.2507
2000  0.3392   38.5383   1053.7
2001  0.3392   41.4676   1157.9
2002  0.3718   46.6997   1431.0
2003  -        -         1598.8
2004  -        -         1187.6
2005  -        -         2611.7
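For readers unfamiliar with the Moran statistic, a minimal sketch of Moran's I for regression residuals follows. It assumes the residuals already have mean zero (as they do when the regression includes an intercept) and does not compute the standardised Z statistic or the LR test. The ring-shaped weight matrix and the residual vectors are made up for illustration.

```python
import numpy as np

def morans_i(resid, W):
    """Moran's I = (n / S0) * e'We / e'e for mean-zero residuals e."""
    e = np.asarray(resid, dtype=float)
    n, s0 = len(e), W.sum()                 # S0 is the sum of all weights
    return (n / s0) * (e @ W @ e) / (e @ e)

# Toy weight matrix (assumed): six houses on a ring, two neighbours each.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

clustered = morans_i([1, 1, 1, -1, -1, -1], W)     # similar residuals sit together
alternating = morans_i([1, -1, 1, -1, 1, -1], W)   # neighbours always differ
```

Clustered residuals give a positive value and alternating residuals a negative one, which is the pattern the statistic is designed to pick up.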
[Figure: Intercept coefficient over time. Vertical axis: Coefficient (10 to 12.5); horizontal axis: Year (1990 to 2005); series: SEM, AP-DTH, DTH, SSSEM.]
[Figure: AREA coefficient over time. Vertical axis: Coefficient (0 to 0.00025); horizontal axis: Year (1990 to 2005); series: SEM, AP-DTH, DTH, SSSEM.]
[Figure: BED coefficient over time. Vertical axis: Coefficient (0 to 0.12); horizontal axis: Year (1990 to 2005); series: SEM, AP-DTH, DTH, SSSEM.]
[Figure: BATH coefficient over time. Vertical axis: Coefficient (0 to 0.25); horizontal axis: Year (1990 to 2005); series: SEM, AP-DTH, DTH, SSSEM.]
Shadow Prices for AP-DTH
Transformed coefficients
Regression  Area  Bed       Bath      Carlug
90/91       5.33  12855.58  22306.20  5587.05
91/92       2.84  10966.54  24570.32  7053.14
92/93       2.62  9081.11   27282.22  7209.75
93/94       3.50  9329.99   28732.39  6731.67
94/95       3.43  12062.30  32089.59  4847.16
95/96       2.97  16602.70  31690.66  6394.21
96/97       6.02  16844.34  31908.81  6447.75
97/98       4.61  17363.85  35278.57  7599.45
98/99       3.98  18457.41  38647.37  9952.09
99/00       3.31  18703.66  42399.85  9389.46
00/01       3.35  16193.85  49608.79  8206.20
01/02       3.50  14310.56  58905.83  8483.93
02/03       5.11  15261.88  70111.51  11317.29
03/04       8.73  19214.66  83927.50  16257.60
04/05       9.76  25989.98  97631.61  20202.24
[Figure: Annual Indexes. Vertical axis: Index Number (0 to 4); horizontal axis: Year (1990 to 2005); series: T2 SEM, DTH, Median, T2 SSSEM.]
[Figure: SSSEM Monthly T2 Index vs Intercept trend. Vertical axis: Index Number (0 to 4.5); horizontal axis: Jan-90 to Jul-05; series: SSSEM (Monthly) T2, Intercept Trend.]
Conclusions
• Spatial autocorrelation is highly significant.
• Coefficients do seem to vary over time.
• However, the intercept term is driving the index numbers.
Therefore:
• Indexes are remarkably similar, so the choice of index formula is not important.
• Imposing time-varying coefficients is unimportant.
Conclusions
• Results are largely due to not having enough hedonics in the regression.
• If the intercept term had less influence, the impact of incorporating spatial autocorrelation and time-varying coefficients might be significant.
Future research
• Different weight matrices in lattice models.
• Geostatistical models.
• HI and CP indexes particular to the housing case.
• Robust method to adjust for unusual observations such as acreage properties.
• Adjust/test for seasonality.
• DATA DATA DATA DATA DATA
Repeat Sales Method
Is the method of choice in the real estate industry.
$$\ln P_{i,t} - \ln P_{i,s} = d_i'\beta + \varepsilon_i$$
where i represents the ith house; t represents the tth time period; P_{i,t} is the price of house i in time period t.
d_i is a vector of dummy variables that take the value 1 if the second sale occurred in period j and the value -1 if the first sale occurred in period j.
β is the parameter vector and ε_i is the residual.
Repeat Sales Method
Weaknesses:
1. It does not make maximum use of the available data.
2. It violates temporal fixity.
3. Does not account for the depreciation effect.
4. Repeat sales properties may differ in some respects from single sale properties.
The State-space Hedonic Model
$$\ln P_t = X_t\beta_t + \varepsilon_t, \quad \varepsilon_t \sim (0, \Omega_t)$$
$$\beta_t = \beta_{t-1} + \eta_t, \quad \eta_t \sim (0, \Sigma)$$
$$E(\varepsilon_t \eta_t') = 0$$
1. Parameters in Ω_t and Σ are estimated using MLE.
2. β_t is the state vector and will be obtained by the Kalman filter and smoother.
Ω_t is spatially correlated. Σ is a diagonal matrix.
Hedonic Imputation Index
1. Impute prices for every house in every time period (except when a true observation exists):
$$\ln \hat{P}_t = X_t \hat{\beta}_t$$
2. Compute a price index. Example: Paasche House Price Index
$$P_{s,t} = \frac{\sum_{h=1}^{H_t} p_t^h}{\sum_{h=1}^{H_t} \hat{p}_s^h}$$
Summary
Using a state-space formulation of a hedonic model together with a hedonic imputation index has the following advantages:
• The shadow prices of characteristics vary over time.
• Spatial correlation is accounted for in the estimation process.
• The resulting index satisfies temporal fixity.
Summary
• Makes maximum use of the available data.
• Takes into account shifts in the composition of transactions each period.
• Controls for quality improvements (data dependent).
Incorporating Spatial Correlation
1. Collect house price data for Brisbane, which includes the address of each house.
2. Change addresses to latitudes/longitudes.
3. Use the lat/long to calculate the distance between a house and its 'nearest neighbours'.
4. Use this information in the model.
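Step 3 can be sketched as follows, assuming great-circle (haversine) distance on a spherical Earth of radius 6371 km and a brute-force search, which is adequate for small samples. The coordinates are invented points in the general vicinity of Brisbane, not observations from the data set, and the function names are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/long points (spherical Earth)."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_neighbours(coords, k):
    """For each house, the indices of its k nearest other houses, closest first."""
    result = []
    for i, (lat_i, lon_i) in enumerate(coords):
        dists = sorted(
            (haversine_km(lat_i, lon_i, lat_j, lon_j), j)
            for j, (lat_j, lon_j) in enumerate(coords) if j != i
        )
        result.append([j for _, j in dists[:k]])
    return result

# Invented coordinates: two pairs of houses a few kilometres apart.
coords = [(-27.470, 153.020), (-27.471, 153.021), (-27.500, 153.050), (-27.501, 153.051)]
neighbours = nearest_neighbours(coords, k=1)
```

The resulting neighbour lists could then be turned into a weight matrix by setting w_ij = 1 for each listed neighbour and row-normalising.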