REAL ESTATE DATA ANALYSIS & INSIGHTS
USING CLUSTERING TECHNIQUE
RECOMMENDATIONS
1. Cluster 3 appears to be the best segment for investment.
2. The primary reason is its high rental yield combined with a low rental share, which leaves good potential for rents to rise.
3. Drilling down into Cluster 3 on the rental yield, rental share, and population parameters, we can shortlist the areas below.
State      Place
Michigan   Genesee, Macomb, and Ingham counties
Texas      Corpus Christi, Fort Worth, and Arlington cities; Nueces, Bell, Bexar, and Tarrant counties
Ohio       Montgomery County
Illinois   St. Clair County
Missouri   Jackson County
1. THE CHART BELOW SHOWS HOW RENTAL YIELD AND RENTAL SHARE COMPARE ACROSS THE SELECTED AREAS.
2. THE AREAS ARE ORDERED BY DECREASING RENTAL YIELD, AND A LINEAR TREND LINE FOR RENTAL YIELD IS SHOWN.
[Chart: Cluster chart - Rental Yield & Rental Share. Series: Rent Yield, Linear (Rent Yield), Rent Share. Value axis: 0 to 25. Areas on the category axis: Genesee County, Corpus Christi city, Nueces County, Fort Worth city, Macomb County, Bell County, Bexar County, St. Clair County, Montgomery County, Ingham County, Jackson County, Tarrant County, Arlington city.]
DETAILED SUMMARY OF ANALYSIS
CLUSTERING FOR REAL ESTATE DATA
Methodologies & Insights
AGENDA
• Synopsis
• Recommendations
• Appendix – SAS code
OBJECTIVE & APPROACH
• Goal: Recommend a good place / zip code to buy property for investment purposes.
• K-means clustering: This algorithm creates clusters by minimizing the distance between points and their cluster centroids. It is effective for large datasets. The SAS procedure PROC FASTCLUS was used for this method.
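The centroid-distance idea can be sketched in a few lines of Python. This is an illustrative sketch only: it is not the SAS program used for the analysis and it does not reproduce PROC FASTCLUS's own seeding or options. The function name `kmeans`, the naive "first k records" seeding, and the toy values are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain k-means on a list of equal-length numeric tuples (illustrative)."""
    # Assumption for brevity: seed the centroids with the first k records.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster as is
        if new_centroids == centroids:  # nothing moved: converged
            break
        centroids = new_centroids
    return labels, centroids

# Made-up, already standardized (rental yield, rental share) pairs.
toy = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (3.0, 3.1), (3.2, 2.9), (2.9, 3.0)]
labels, centroids = kmeans(toy, k=2)
```

On the toy data the first three records end up in one cluster and the last three in the other. In practice the variables would need to be on comparable scales first, since population is several orders of magnitude larger than the percentage variables.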
ANALYSIS STEPS
• We can use cluster analysis on the given dataset to segment the records based on critical factors such as rental yield, rental share of income, place type, and size of the place.
• With this approach we can split the data into high-, medium-, and low-return groups for investment.
• The goal of clustering is to find similarities and differences within the data by creating homogeneous groups in which within-group similarities are maximized and between-group similarities are minimized.
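One common way to put numbers on "homogeneous groups" is to compare the within-group and between-group sums of squares for a given grouping. The sketch below is a hedged illustration with made-up points; the function name `sum_of_squares` is hypothetical and the measure is a standard choice, not necessarily the one used in the original analysis.

```python
def sum_of_squares(points, labels):
    """Return (within-group, between-group) sums of squared distances."""
    dims = len(points[0])
    overall = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    within = between = 0.0
    for group in set(labels):
        members = [p for p, g in zip(points, labels) if g == group]
        centre = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        # Spread of the members around their own group centre.
        within += sum((p[d] - centre[d]) ** 2 for p in members for d in range(dims))
        # Distance of the group centre from the overall centre, weighted by size.
        between += len(members) * sum(
            (centre[d] - overall[d]) ** 2 for d in range(dims)
        )
    return within, between

# Four made-up points forming a left pair and a right pair.
pts = [(1.0, 1.0), (1.0, 3.0), (5.0, 1.0), (5.0, 3.0)]
good = sum_of_squares(pts, [0, 0, 1, 1])  # groups follow the left/right split
poor = sum_of_squares(pts, [0, 1, 1, 0])  # groups mix left and right points
```

A good grouping has a small within-group value and a large between-group value; the two always add up to the same total for a given dataset.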
CLUSTER SUMMARY - PROFILING
CLUSTER 1 PROFILE
Variable       Mean      Pop mean   Std dev    Z score
Rental share   26%       21%        4%         1.25
Population     3597926   260474     2144874    1.6
Rental yield   5%        6%         2%         0.5
1. 19 data points fall in this cluster.
2. Rental share has the highest z-score among the rental variables, and it differentiates this cluster.
3. As rental share has a high z-score, we can conclude that this cluster comprises low-income groups and has less scope for yield on investment.
4. This can be seen further in the rental yield z-score and the population means.
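The z-scores in the profile table appear to be the absolute gap between the cluster mean and the population mean, divided by the standard deviation. That formula is inferred from the printed figures rather than stated in the deck, and the inputs below are the rounded Cluster 1 table entries; `profile_z` is a hypothetical helper name.

```python
def profile_z(cluster_mean, pop_mean, std_dev):
    """Absolute standardized gap between a cluster mean and the population mean."""
    return abs(cluster_mean - pop_mean) / std_dev

# (cluster mean, population mean, standard deviation) as printed for Cluster 1;
# percentages are entered as whole numbers.
cluster1 = {
    "Rental share": (26, 21, 4),
    "Population": (3597926, 260474, 2144874),
    "Rental yield": (5, 6, 2),
}
z_scores = {name: profile_z(*values) for name, values in cluster1.items()}
```

With these inputs the sketch gives 1.25 for rental share, about 1.56 for population, and 0.5 for rental yield, in line with the Cluster 1 table after rounding.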
CLUSTER 2 PROFILE
Variable       Mean      Pop mean   Std dev    Z score
Rental share   20%       21%        4%         1.25
Rental yield   5%        6%         1%         1
Population     219706    260474     241058     0.17
1. 1176 data points fall in this cluster. This is roughly 73% of the total data. Cluster 2 is the biggest cluster.
2. Rental yield is marginally higher than in cluster 1.
3. Even in cluster 2, rental share has the higher z-score.
4. Cluster 2 is not an ideal investment option.
CLUSTER 3 PROFILE
Variable       Mean      Pop mean   Std dev    Z score
Rental yield   9%        6%         2%         1.5
Population     222090    260474     317575     0.75
Rental share   23%       21%        4%         0.5
1. 403 data points fall in this cluster. This is 25% of the total data.
2. Rental yield has the highest z-score, 1.5, and this differentiates the cluster.
3. The rental share z-score indicates that this cluster has the potential to pay more rent, as its rental share z-score is low relative to clusters 1 and 2.
4. The cluster's mean population is also close enough to the population mean to support an investment decision.
5. Cluster 3 has all the ingredients of an ideal investment option.
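The cluster sizes quoted in the three profiles can be cross-checked against the percentages given. The sketch assumes the three clusters together cover the whole dataset, so the total is taken as the sum of the three counts.

```python
# Record counts as stated in the cluster profiles.
cluster_sizes = {1: 19, 2: 1176, 3: 403}

# Assumption: every record belongs to exactly one of the three clusters.
total = sum(cluster_sizes.values())

# Share of the dataset held by each cluster, in percent.
shares = {c: round(100 * n / total, 1) for c, n in cluster_sizes.items()}
```

Under that assumption the dataset has 1598 records, with about 1.2% in cluster 1, 73.6% in cluster 2, and 25.2% in cluster 3, which agrees with the "roughly 73%" and "25%" figures in the profiles.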
RECOMMENDATIONS
• Cluster 3 seems to be the ideal investment option.
• The reason is its high rental yield and low rental share values, with good potential for rents to rise.
• On further analysis of the Cluster 3 data based on rental yield, population size, and propensity to pay more rent, we can shortlist the areas below.
State      Place
Michigan   Genesee, Macomb, and Ingham counties
Texas      Corpus Christi, Fort Worth, and Arlington cities; Nueces, Bell, Bexar, and Tarrant counties
Ohio       Montgomery County
Illinois   St. Clair County
Missouri   Jackson County
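A shortlisting step of this kind can be sketched as a filter on the three criteria followed by a sort on rental yield. Everything specific below is hypothetical: the `Area` record, the `shortlist` function, the threshold values, and the sample rows ("Area A" and so on) are placeholders for illustration, not the cut-offs or data used in the analysis.

```python
from dataclasses import dataclass

@dataclass
class Area:
    name: str
    rental_yield: float  # percent
    rental_share: float  # percent of income spent on rent
    population: int

def shortlist(areas, min_yield, max_share, min_population):
    """Keep areas that pass all three criteria, highest rental yield first."""
    keep = [
        a for a in areas
        if a.rental_yield >= min_yield
        and a.rental_share <= max_share
        and a.population >= min_population
    ]
    return sorted(keep, key=lambda a: a.rental_yield, reverse=True)

# Placeholder rows, not real figures.
sample = [
    Area("Area A", 11.0, 22.0, 400_000),
    Area("Area B", 9.5, 27.0, 350_000),   # fails the rental share cap
    Area("Area C", 12.0, 21.0, 40_000),   # fails the population floor
    Area("Area D", 9.0, 20.0, 600_000),
]
picked = shortlist(sample, min_yield=9.0, max_share=23.0, min_population=100_000)
names = [a.name for a in picked]
```

With these placeholder thresholds only "Area A" and "Area D" remain, in that order.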
APPENDIX – SAS CODE
• SAS code location in WPS:
Y:USERS\USER169\Programmes\Clustering\Real Estate_clustering.sas