harry t. cominos university of queensland supervisors: prasada rao; alicia rambaldi

Hedonic Imputed Housing Price Indices from a Model with Dynamic Shadow Prices Incorporating Nearest Neighbour Information

Harry T. Cominos
University of Queensland

Supervisors: Prasada Rao; Alicia Rambaldi

Structure of Presentation

1. Background and Motivation
2. Existing Hedonic Methods for House Price Indexes
3. Hedonic Specification with Spatial Autocorrelation
4. Hedonic Specification with Dynamic Coefficients
5. Data (Brisbane Metropolitan Area)
6. Empirical Results

Background & Motivation

There are 2 main issues in house price index construction:
1. Quality change problem
2. Compositional change problem

The hedonic method (theoretically) accounts for both quality and compositional changes over time.

Hedonic Method

• Regression based: explains the price of a house using a range of characteristics.
• Drawback: data intensive.

The hedonic approach has led to the following methods:
1. Time-dummy variable method (DTH)
2. Hedonic Imputation method (HI)
3. Characteristics Price method (CP)

DTH Method

$$\ln P_{it} = x_{it}'\beta + d_{it}'\delta + \varepsilon_{it}$$

where $x_{it}$ is a vector of house characteristics and $d_{it}$ is a vector of time-dummy variables.

This is a POOLED (not panel) regression!

The index for period $t$ is

$$\hat{I}_t = \exp\!\left(\hat{\delta}_t - \widehat{\mathrm{var}}(\hat{\delta}_t)/2\right)$$

DTH Method

• The DTH method assumes the same hedonic model and characteristics in every period.
• It is less restrictive to use the adjacent-period approach (AP-DTH), i.e. the regression is estimated for every pair of adjacent periods.
• There is no choice of index number formula for DTH and AP-DTH.
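As a rough illustration of the time-dummy approach, the sketch below fits the pooled regression by OLS on simulated sales and exponentiates the dummy coefficients to get an index. The function name `time_dummy_index`, the simulated characteristics, and the variance adjustment applied before exponentiating are assumptions made for this example, not the author's implementation.

```python
import numpy as np

def time_dummy_index(log_price, X, period, n_periods):
    """Pooled OLS of log price on characteristics plus time dummies.

    Period 0 is the base, so the returned index starts at 1.0.
    """
    n = len(log_price)
    # One dummy column per non-base period.
    D = np.zeros((n, n_periods - 1))
    for t in range(1, n_periods):
        D[period == t, t - 1] = 1.0
    Z = np.column_stack([np.ones(n), X, D])
    coef, *_ = np.linalg.lstsq(Z, log_price, rcond=None)
    resid = log_price - Z @ coef
    sigma2 = resid @ resid / (n - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    k = 1 + X.shape[1]              # intercept + characteristics
    delta = coef[k:]
    var_delta = np.diag(cov)[k:]
    # Variance adjustment when exponentiating the dummy coefficients.
    return np.concatenate([[1.0], np.exp(delta - var_delta / 2.0)])

# Simulated sales (assumed values, for illustration only).
rng = np.random.default_rng(0)
n, T = 3000, 3
period = rng.integers(0, T, size=n)
area = rng.uniform(300, 900, size=n)
bed = rng.integers(1, 6, size=n).astype(float)
true_delta = np.array([0.0, 0.10, 0.25])
log_price = (11.0 + 0.0002 * area + 0.08 * bed
             + true_delta[period] + rng.normal(0, 0.1, size=n))

index = time_dummy_index(log_price, np.column_stack([area, bed]), period, T)
print(index)
```

An adjacent-period (AP-DTH) version would call the same function on each pair of neighbouring periods and chain the results.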

Hedonic Imputation (HI) Method

HI uses imputed prices for missing products in each time period, which allows matched price indices to be computed.

Consider the model:

$$\ln P_t^h = \sum_{c=1}^{C} x_{c,t}^h \beta_{c,t} + v_t^h, \qquad h = 1,\dots,H_t;\; c = 1,\dots,C;\; t = 1,\dots,T$$

HI Method

Key assumption: characteristics do not change over time.

Impute prices:

$$\hat{P}_s^h(x_t^h) = \exp\!\left(\sum_{c=1}^{C} x_{c,t}^h \hat{\beta}_{c,s}\right)$$

The above imputed price formula is biased; see Appendix A of the paper.

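HI Method

Once imputed prices are computed, a variety of price index formulae can be used. One Törnqvist-type index is:

$$I_{s,t}^{T1} = \left[ I_{s,t}^{GP} \times I_{s,t}^{GL} \right]^{1/2} = \left[ \prod_{h=1}^{N_t} \left( \frac{\hat{P}_t^h(x_t^h)}{\hat{P}_s^h(x_t^h)} \right)^{\frac{1}{N_t}} \prod_{h=1}^{N_s} \left( \frac{\hat{P}_t^h(x_s^h)}{\hat{P}_s^h(x_s^h)} \right)^{\frac{1}{N_s}} \right]^{1/2} = \prod_{h=1}^{N_t} \left( \frac{\hat{P}_t^h(x_t^h)}{\hat{P}_s^h(x_t^h)} \right)^{\frac{1}{2N_t}} \prod_{h=1}^{N_s} \left( \frac{\hat{P}_t^h(x_s^h)}{\hat{P}_s^h(x_s^h)} \right)^{\frac{1}{2N_s}}$$

where $N_s$ and $N_t$ are the numbers of houses observed in periods $s$ and $t$.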

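Another Törnqvist-type Index

$$I_{s,t}^{T2} = \left( I_{s,t}^{GP} \right)^{\frac{N_t}{N_t+N_s}} \times \left( I_{s,t}^{GL} \right)^{\frac{N_s}{N_t+N_s}} = \prod_{h=1}^{N_t} \left( \frac{\hat{P}_t^h(x_t^h)}{\hat{P}_s^h(x_t^h)} \right)^{\frac{1}{N_t+N_s}} \prod_{h=1}^{N_s} \left( \frac{\hat{P}_t^h(x_s^h)}{\hat{P}_s^h(x_s^h)} \right)^{\frac{1}{N_t+N_s}}$$

This index gives more weight to the period with the larger sample, which is more indicative of the population.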
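The two Törnqvist-type formulas can be sketched in a few lines once period-specific coefficients are available. The code below is an illustration under stated assumptions: the characteristic matrices (with a leading column of ones for the intercept), the coefficient values, and the function names are all hypothetical, and prices are imputed with the uncorrected exp(x'β) formula.

```python
import numpy as np

def geometric_indexes(X_s, X_t, beta_s, beta_t):
    """Geometric Paasche (period-t houses) and Laspeyres (period-s houses).

    Both use imputed prices exp(x'beta) in numerator and denominator.
    """
    gp = np.exp(np.mean(X_t @ beta_t - X_t @ beta_s))
    gl = np.exp(np.mean(X_s @ beta_t - X_s @ beta_s))
    return gp, gl

def tornqvist_t1(gp, gl):
    """Equal-weighted geometric mean of the two indexes."""
    return np.sqrt(gp * gl)

def tornqvist_t2(gp, gl, n_s, n_t):
    """Geometric mean weighted by the sample size of each period."""
    w_t = n_t / (n_s + n_t)
    return gp ** w_t * gl ** (1.0 - w_t)

# Hypothetical houses: columns are [1, AREA, BED].
X_s = np.array([[1.0, 607.0, 3.0], [1.0, 450.0, 2.0],
                [1.0, 800.0, 4.0], [1.0, 520.0, 3.0]])
X_t = np.array([[1.0, 607.0, 3.0], [1.0, 405.0, 2.0], [1.0, 900.0, 5.0],
                [1.0, 650.0, 3.0], [1.0, 300.0, 1.0], [1.0, 700.0, 4.0]])
beta_s = np.array([11.0, 0.0002, 0.08])
beta_t = np.array([11.2, 0.0002, 0.10])

gp, gl = geometric_indexes(X_s, X_t, beta_s, beta_t)
t1 = tornqvist_t1(gp, gl)
t2 = tornqvist_t2(gp, gl, len(X_s), len(X_t))
print(gp, gl, t1, t2)
```

When only the intercept differs between the two periods, both geometric indexes collapse to the exponential of the intercept change, so T1 and T2 coincide.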

Hedonic Specification with Spatial Autocorrelation

House prices should be spatially autocorrelated because neighbourhoods:
1. Have similar structural characteristics (block size, age)
2. Share location amenities (supermarkets, schools)
3. Share socioeconomic variables (local crime rates, wealth levels)

Hedonic Specification with Spatial Autocorrelation

Most empirical examples of hedonic indexes assume white noise errors, but in practice the residuals should be spatially autocorrelated because the hedonic function is not fully specified (due to data constraints).

The Spatial Error Model (SEM)

$$y = X\beta + \varepsilon$$
$$\varepsilon = \rho W \varepsilon + u$$

where $W$ is an $n \times n$ spatial weight matrix; $u$ is white noise; and $\rho$ is the parameter that captures the magnitude of the spatial autocorrelation.

The SEM can be estimated by GLS or ML.
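To make the GLS route concrete: if ρ is treated as known, pre-multiplying the model by (I − ρW) leaves white-noise errors, so OLS on the filtered data is the GLS estimator of β. The sketch below assumes a known ρ, a hypothetical ring-shaped neighbour structure for W, and simulated data; in practice ρ would itself be estimated (for example by ML).

```python
import numpy as np

def sem_gls(y, X, W, rho):
    """GLS for the spatial error model with rho treated as known.

    Pre-multiplying by (I - rho*W) leaves white-noise errors, so OLS
    on the filtered data is the GLS estimator of beta.
    """
    A = np.eye(len(y)) - rho * W
    beta, *_ = np.linalg.lstsq(A @ X, A @ y, rcond=None)
    return beta

# Hypothetical W: each house has two neighbours on a ring, row-normalised.
n = 400
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([11.0, 0.5])
rho = 0.6
u = rng.normal(0, 0.1, size=n)
eps = np.linalg.solve(np.eye(n) - rho * W, u)   # solves eps = rho*W*eps + u
y = X @ true_beta + eps

beta_hat = sem_gls(y, X, W, rho)
print(beta_hat)
```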

Spatial Weight Matrix (W)

$W$ has elements representing the spatial relationship between houses $i$ and $j$. The researcher takes the specification of $W$ as known a priori.

Common properties:
• $W$ is non-negative.
• $w_{ii} = 0$ (i.e. an observation does not affect its own prediction).

Spatial Weight Matrix

We form $W$ as follows:

$$w_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are contiguous observations} \\ 0 & \text{otherwise} \end{cases}$$

Contiguity can be artificially constructed using Delaunay triangulation.

Then $W$ is row normalised.

Contiguity

Contiguous points share a common edge.

Note: MATLAB has an inbuilt Delaunay function.

Spatial Weight Matrix

In order to construct $W$ in this manner, the latitude and longitude coordinates of each house must be known.

This allows R. Kelley Pace’s FDELW2 MATLAB function to be used to convert the Delaunay algorithm results into a contiguity matrix.
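The construction of a row-normalised contiguity matrix from coordinates can be sketched in Python with `scipy.spatial.Delaunay`. This is not the FDELW2 routine: the function name `delaunay_weight_matrix` and the five example points are assumptions, and latitude/longitude are treated as planar coordinates for simplicity.

```python
import numpy as np
from scipy.spatial import Delaunay

def delaunay_weight_matrix(coords):
    """Row-normalised contiguity matrix from a Delaunay triangulation.

    Two points are treated as contiguous when they share a triangle edge.
    """
    n = len(coords)
    W = np.zeros((n, n))
    tri = Delaunay(coords)
    for simplex in tri.simplices:
        for a in simplex:
            for b in simplex:
                if a != b:
                    W[a, b] = 1.0      # w_ij = 1 for contiguous i, j
    row_sums = W.sum(axis=1, keepdims=True)
    # Guard against isolated points (e.g. duplicate coordinates).
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

# Hypothetical locations: four corners of a square plus its centre.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
W = delaunay_weight_matrix(coords)
print(W.round(2))
```

A dense matrix is used here only for readability; with tens of thousands of houses a sparse representation (for example `scipy.sparse`) would be the more practical choice.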

Hedonic Specification with Dynamic Coefficients

So far hedonic parameters have been
• constrained to be constant over the sample period of interest (i.e. the DTH method), or
• allowed to vary in a sporadic fashion because the hedonic regression is re-estimated in every time period (or every second period for AP-DTH).

The State-space SEM (SSSEM)

Alternatively, we assume that the vector of parameters follows a random walk process. We can extend the SEM as follows:

$$y_t = X_t\beta_t + \varepsilon_t$$
$$\varepsilon_t = \rho W_t \varepsilon_t + u_t$$
$$\beta_t = \beta_{t-1} + \eta_t$$

The SSSEM is estimated using the Kalman filter and smoother. Unknown hyperparameters are estimated by MLE.
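The filtering step for random-walk coefficients can be illustrated with a stripped-down Kalman filter. This sketch makes several simplifying assumptions that differ from the SSSEM itself: observation errors are taken to be i.i.d. with known variance (no spatial structure), the innovation covariance is supplied rather than estimated by MLE, and the smoother is omitted. All names and simulated values are hypothetical.

```python
import numpy as np

def kalman_filter_random_walk(ys, Xs, sigma2, Q, b0, P0):
    """Filtered estimates of beta_t when beta_t = beta_{t-1} + eta_t.

    ys, Xs: lists with one cross-section (y_t, X_t) per period.
    sigma2: observation error variance (errors assumed i.i.d. here).
    Q:      covariance of the random-walk innovation eta_t.
    """
    b, P = b0.astype(float), P0.astype(float)
    filtered = []
    for y, X in zip(ys, Xs):
        P = P + Q                                   # prediction step
        S = X @ P @ X.T + sigma2 * np.eye(len(y))   # innovation covariance
        K = np.linalg.solve(S, X @ P).T             # Kalman gain P X' S^{-1}
        b = b + K @ (y - X @ b)                     # update state mean
        P = P - K @ X @ P                           # update state covariance
        filtered.append(b.copy())
    return np.array(filtered)

# Simulated data: the intercept drifts upward, the slope stays fixed.
rng = np.random.default_rng(2)
T, n_t = 10, 200
true_betas = np.array([[10.0 + 0.05 * t, 0.5] for t in range(T)])
ys, Xs = [], []
for t in range(T):
    X = np.column_stack([np.ones(n_t), rng.normal(size=n_t)])
    ys.append(X @ true_betas[t] + rng.normal(0, 0.1, size=n_t))
    Xs.append(X)

filtered = kalman_filter_random_walk(
    ys, Xs, sigma2=0.01, Q=0.01 * np.eye(2), b0=np.zeros(2), P0=100.0 * np.eye(2))
print(filtered[-1], true_betas[-1])
```

To move toward the spatial version, the `sigma2 * np.eye(len(y))` term would be replaced by a covariance matrix built from ρ and W for that period.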

Data

• Data compiled from the property information service RP DATA, from www.rpdata.com (http://www.rpdata.com/).
• Website is search engine based.
• Data preparation:
  1. Raw data downloaded and hedonics chosen.
  2. Data were filtered.
  3. Address of each house was geocoded to provide lat/long.

Data

• Initially 316,359 observations from the early 1950s to 12/05.
• Most observations did not contain information on property attributes.
• Of those observations containing hedonics, reporting was inconsistent.
• Hence, only ADDRESS, PRICE, DATE, AREA, BED, BATH, CARLUG information was kept.
• After filtering, 71,583 observations remained, spanning 01/71 – 12/05.

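Summary of Data over time

| Year | Price min ($) | Price median ($) | Price max ($) | Area min (m2) | Area median (m2) | Area max (m2) | Bed min | Bed median | Bed max | Bath min | Bath median | Bath max | Carlug min | Carlug median | Carlug max | No. Obs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1990 | 10000 | 110000 | 4800000 | 169 | 607 | 34000 | 1 | 3 | 7 | 1 | 1 | 7 | 1 | 2 | 8 | 2198 |
| 1991 | 8500 | 123000 | 1070000 | 169 | 607 | 40000 | 1 | 3 | 6 | 1 | 1 | 5 | 0 | 2 | 6 | 2251 |
| 1992 | 3000 | 130000 | 770000 | 152 | 607 | 99100 | 1 | 3 | 8 | 1 | 1 | 5 | 1 | 2 | 8 | 2288 |
| 1993 | 3000 | 137500 | 2300000 | 146 | 607 | 45300 | 1 | 3 | 8 | 1 | 1 | 6 | 1 | 2 | 9 | 2462 |
| 1994 | 3996 | 145000 | 980000 | 152 | 607 | 66000 | 1 | 3 | 7 | 1 | 1 | 5 | 0 | 2 | 8 | 2332 |
| 1995 | 2000 | 142000 | 1790000 | 152 | 607 | 60400 | 1 | 3 | 8 | 1 | 1 | 6 | 1 | 2 | 7 | 1660 |
| 1996 | 5000 | 144900 | 1400000 | 126 | 607 | 31800 | 1 | 3 | 7 | 1 | 1 | 5 | 1 | 2 | 6 | 2107 |
| 1997 | 3000 | 152000 | 1410000 | 120 | 607 | 40500 | 1 | 3 | 9 | 1 | 1 | 5 | 1 | 2 | 8 | 2793 |
| 1998 | 2000 | 158500 | 2475000 | 106 | 607 | 98100 | 1 | 3 | 8 | 1 | 1 | 6 | 0 | 2 | 9 | 2941 |
| 1999 | 1400 | 165000 | 3000000 | 143 | 607 | 65000 | 1 | 3 | 8 | 1 | 1 | 9 | 1 | 2 | 8 | 3892 |
| 2000 | 1090 | 172300 | 2750000 | 120 | 607 | 163700 | 1 | 3 | 8 | 1 | 1 | 6 | 1 | 2 | 7 | 4323 |
| 2001 | 1210 | 200000 | 5350000 | 106 | 607 | 100000 | 1 | 3 | 8 | 1 | 1 | 5 | 1 | 2 | 8 | 4954 |
| 2002 | 1610 | 255000 | 5900000 | 101 | 607 | 198000 | 1 | 3 | 8 | 1 | 1 | 7 | 0 | 2 | 9 | 5240 |
| 2003 | 1860 | 330000 | 8200000 | 113 | 607 | 99100 | 1 | 3 | 8 | 1 | 1 | 9 | 0 | 2 | 8 | 6672 |
| 2004 | 2004 | 375000 | 6000000 | 101 | 607 | 114500 | 1 | 3 | 9 | 1 | 2 | 7 | 0 | 2 | 8 | 6097 |
| 2005 | 1111 | 375000 | 7000000 | 107 | 607 | 114500 | 1 | 3 | 9 | 1 | 1 | 6 | 0 | 2 | 8 | 7613 |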

[Figure: No. Observations per Month, Jan-75 to Jan-05; vertical axis from 0 to 800]

Data Limitations

• Data could not be downloaded for all suburbs.
• With an extensive list of property attributes, it would be great to use some of this information (POOL etc.)!
• Tedious and time consuming to compile data.

Results – Spatial Stat Tests

| Year | Moran I | Moran Z_I | LR stat |
|---|---|---|---|
| 1990 | 0.1464 | 11.9155 | 127.1655 |
| 1991 | 0.2563 | 21.0753 | 334.7955 |
| 1992 | 0.2156 | 17.8345 | 254.9895 |
| 1993 | 0.2031 | 17.4664 | 247.0573 |
| 1994 | 0.2081 | 17.4404 | 232.4361 |
| 1995 | 0.1982 | 13.9541 | 161.8869 |
| 1996 | 0.1655 | 13.1358 | 141.6757 |
| 1997 | 0.2379 | 21.8062 | 369.8247 |
| 1998 | 0.2311 | 21.7087 | 362.4950 |
| 1999 | 0.2779 | 29.9798 | 688.2507 |
| 2000 | 0.3392 | 38.5383 | 1053.7 |
| 2001 | 0.3392 | 41.4676 | 1157.9 |
| 2002 | 0.3718 | 46.6997 | 1431.0 |
| 2003 | - | - | 1598.8 |
| 2004 | - | - | 1187.6 |
| 2005 | - | - | 2611.7 |

[Figure: Intercept Coefficient over time, 1990 to 2005; x-axis: Year; y-axis: Coefficient (10 to 12.5); series: SEM, AP-DTH, DTH, SSSEM]

[Figure: AREA coefficient over time, 1990 to 2005; x-axis: Year; y-axis: Coefficient (0 to 0.00025)]


[Figure: BED coefficient over time, 1990 to 2005; x-axis: Year; y-axis: Coefficient (0 to 0.12)]


[Figure: BATH coefficient over time, 1990 to 2005; x-axis: Year; y-axis: Coefficient (0 to 0.25)]


Shadow Prices for AP-DTH

Transformed coefficients

| Regression | Area | Bed | Bath | Carlug |
|---|---|---|---|---|
| 90/91 | 5.33 | 12855.58 | 22306.20 | 5587.05 |
| 91/92 | 2.84 | 10966.54 | 24570.32 | 7053.14 |
| 92/93 | 2.62 | 9081.11 | 27282.22 | 7209.75 |
| 93/94 | 3.50 | 9329.99 | 28732.39 | 6731.67 |
| 94/95 | 3.43 | 12062.30 | 32089.59 | 4847.16 |
| 95/96 | 2.97 | 16602.70 | 31690.66 | 6394.21 |
| 96/97 | 6.02 | 16844.34 | 31908.81 | 6447.75 |
| 97/98 | 4.61 | 17363.85 | 35278.57 | 7599.45 |
| 98/99 | 3.98 | 18457.41 | 38647.37 | 9952.09 |
| 99/00 | 3.31 | 18703.66 | 42399.85 | 9389.46 |
| 00/01 | 3.35 | 16193.85 | 49608.79 | 8206.20 |
| 01/02 | 3.50 | 14310.56 | 58905.83 | 8483.93 |
| 02/03 | 5.11 | 15261.88 | 70111.51 | 11317.29 |
| 03/04 | 8.73 | 19214.66 | 83927.50 | 16257.60 |
| 04/05 | 9.76 | 25989.98 | 97631.61 | 20202.24 |

[Figure: Annual Indexes, 1990 to 2005; x-axis: Year; y-axis: Index Number (0 to 4); series: T2 SEM, DTH, Median, T2 SSSEM]

[Figure: SSSEM Monthly T2 Index vs Intercept trend, Jan-90 to Jul-05; y-axis: Index Number (0 to 4.5); series: SSSEM (Monthly) T2, Intercept Trend]

Conclusions

• Spatial autocorrelation is highly significant.
• Coefficients do seem to vary over time.
• However, the intercept term is driving the index numbers.

Therefore:
• Indexes are remarkably similar, so the choice of index formula is not important.
• Imposing time-varying coefficients is unimportant.

Conclusions

• Results are largely due to not having enough hedonics in the regression.
• If the intercept term has less influence, the impact of incorporating spatial autocorrelation and time-varying coefficients may be significant.

Future research

• Different weight matrices in lattice models.
• Geostatistical models.
• HI and CP indexes particular to the housing case.
• Robust method to adjust for unusual observations such as acreage properties.
• Adjust/test for seasonality.
• DATA DATA DATA DATA DATA

Repeat Sales Method

Is the method of choice in the real estate industry.

$$\ln P_{i,t} - \ln P_{i,s} = d_i'\beta + \varepsilon_i$$

where $i$ represents the $i$th house; $t$ represents the $t$th time period; $P_{i,t}$ is the price of house $i$ in time period $t$.

$d_i$ is a vector of dummy variables that take the value 1 if the second sale occurred in period $j$, the value −1 if the first sale occurred in period $j$, and 0 otherwise.

$\beta$ is the parameter vector and $\varepsilon_i$ is the residual.
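The dummy-variable construction for the repeat sales regression can be sketched as follows. The function name, the choice of period 0 as the base, and the five noise-free sale pairs are assumptions made for illustration; each pair is assumed to have its first sale strictly before its second.

```python
import numpy as np

def repeat_sales_index(first_period, second_period,
                       log_p_first, log_p_second, n_periods):
    """OLS repeat sales index with period 0 as the base (index 1.0)."""
    n = len(first_period)
    D = np.zeros((n, n_periods))
    D[np.arange(n), second_period] = 1.0    # +1 in the period of the second sale
    D[np.arange(n), first_period] = -1.0    # -1 in the period of the first sale
    D = D[:, 1:]                            # drop the base-period column
    y = log_p_second - log_p_first
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    return np.concatenate([[1.0], np.exp(beta)])

# Hypothetical sale pairs over three periods, generated without noise.
first = np.array([0, 1, 0, 0, 1])
second = np.array([1, 2, 2, 1, 2])
true_log_index = np.array([0.0, 0.1, 0.3])
house_level = np.array([12.0, 12.5, 11.8, 12.2, 13.0])  # house-specific log price level
log_p1 = house_level + true_log_index[first]
log_p2 = house_level + true_log_index[second]

index = repeat_sales_index(first, second, log_p1, log_p2, 3)
print(index)
```

The house-specific level cancels in the differenced regression, which is why only properties sold at least twice contribute to the estimate.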

Repeat Sales Method

Weaknesses:
1. It does not make maximum use of the available data.
2. It violates temporal fixity.
3. It does not account for the depreciation effect.
4. Repeat sales properties may differ in some respects from single sale properties.

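The State-space Hedonic Model

$$\ln P_t = X_t\beta_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Omega_t)$$
$$\beta_t = \beta_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \Sigma)$$
$$E(\varepsilon_t \eta_t') = 0$$

1. Parameters in $\Omega_t$ and $\Sigma$ are estimated using MLE.
2. $\beta_t$ is the state vector and will be obtained by the Kalman filter and smoother.

$\Omega_t$ is spatially correlated. $\Sigma$ is a diagonal matrix.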

Hedonic Imputation Index

1. Impute prices for every house in every time period (except when a true observation exists):

$$\ln \hat{P}_t = X_t\hat{\beta}_t$$

2. Compute a price index. Example: Paasche house price index

$$P_{s,t} = \frac{\sum_{h=1}^{H_t} p_t^h}{\sum_{h=1}^{H_t} \hat{p}_s^h}$$
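A minimal sketch of the Paasche example: the houses sold in period t keep their observed prices in the numerator, and their prices in period s are imputed from period-s coefficients in the denominator. The function name, the three hypothetical houses (columns [1, AREA, BED]), and the coefficient values are assumptions for illustration only.

```python
import numpy as np

def paasche_hi_index(prices_t, X_t, beta_s):
    """Observed period-t prices over imputed period-s prices, same houses."""
    imputed_s = np.exp(X_t @ beta_s)    # imputed price of each house in period s
    return prices_t.sum() / imputed_s.sum()

# Hypothetical houses sold in period t.
X_t = np.array([[1.0, 607.0, 3.0], [1.0, 450.0, 2.0], [1.0, 800.0, 4.0]])
beta_s = np.array([11.0, 0.0002, 0.08])
# Observed prices generated as a uniform 0.2 log-point rise over period s.
prices_t = np.exp(X_t @ beta_s + 0.2)

index = paasche_hi_index(prices_t, X_t, beta_s)
print(index)
```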

Summary

Using a state-space formulation of a hedonic model together with a hedonic imputation index has the following advantages:
• The shadow prices of characteristics vary over time.
• Spatial correlation is accounted for in the estimation process.
• The resulting index satisfies temporal fixity.

Summary

• Makes maximum use of the available data.
• Takes into account shifts in the composition of transactions each period.
• Controls for quality improvements (data dependent).

Incorporating Spatial Correlation

1. Collect house price data for Brisbane, which includes the address of each house.
2. Change addresses to latitudes/longitudes.
3. Use the lat/long to calculate the distance between a house and its ‘nearest neighbours’.
4. Use this information in the model.
