GEOG 465 Final Project Presentation

Seattle’s Top 30 Restaurants based on Yelp and Instagram Data

Janessa Cordeiro | Haneen Al Hassani | Kevin Ho | Jinyang Luo (Regina) | Kanin Sangcharoenvanakul

GEOG 465: GIS Database & Programming

Upload: jinyang-luo

Post on 14-Aug-2015



Research Question:

Is there a relationship between Yelp’s restaurant ratings and social media reactions, as measured by the number of posts and “likes” of photographs on Instagram?

Method
◇ Collect Yelp Data
◇ Collect Instagram Data
◇ Cleanse & Geocode Data
◇ Determine statistical interpretation of data
◇ Create an interactive web map application

Collecting Yelp Data
◇ Use import.io to scrape the 30 highest-rated restaurants and the 30 restaurants with the highest number of reviews
◇ Scraped data: restaurant names, street addresses, ratings, number of reviews, URLs
◇ Select the top 30 restaurants based on the product of the two variables we selected
◇ Use a combination of a Python script and an OpenStreetMap server to geocode our primary data
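The selection step can be sketched in Python. This is only an illustration, assuming the two multiplied variables are the Yelp rating and the number of reviews. The field names (`name`, `rating`, `review_count`) and the sample rows are hypothetical and are not the project's data.

```python
def top_n_by_score(restaurants, n=30):
    """Rank restaurants by rating * review_count and keep the top n."""
    # The two scraped lists can overlap, so drop duplicates first.
    # Assumption: the restaurant name is a good enough key for this sketch.
    unique = {}
    for r in restaurants:
        unique.setdefault(r["name"], r)
    # Attach the combined score to a copy of each row.
    scored = [dict(r, score=r["rating"] * r["review_count"]) for r in unique.values()]
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:n]


# Hypothetical rows for illustration only.
sample = [
    {"name": "A", "rating": 4.5, "review_count": 200},
    {"name": "B", "rating": 5.0, "review_count": 50},
    {"name": "C", "rating": 4.0, "review_count": 1000},
    {"name": "A", "rating": 4.5, "review_count": 200},  # present in both scraped lists
]
top = top_n_by_score(sample, n=2)
```

Multiplying the two values favors restaurants that are both well rated and widely reviewed, so a perfect rating backed by only a few reviews does not automatically rank first.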

Collecting Instagram Data
◇ Collect Instagram data using the Application Programming Interface (API)
◇ Obtained data: Instagram photo URLs, number of likes, date posted, latitudes and longitudes
◇ Clean up the data, count the number of posts and the number of likes for each restaurant, and multiply them

The API terms of agreement only allow us to obtain data for at most 33 Instagram posts per restaurant.
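The counting step for the Instagram data might look like the following sketch. It assumes each cleaned record carries a restaurant label and a like count, and that the number of posts is multiplied by the total number of likes. The field names (`restaurant`, `likes`) and the sample records are hypothetical.

```python
from collections import defaultdict

MAX_POSTS = 33  # per-restaurant cap stated in the slides


def instagram_scores(posts):
    """Return {restaurant: (post_count, total_likes, post_count * total_likes)}."""
    likes_by_restaurant = defaultdict(list)
    for p in posts:
        likes_by_restaurant[p["restaurant"]].append(p["likes"])

    result = {}
    for name, likes in likes_by_restaurant.items():
        likes = likes[:MAX_POSTS]  # keep at most 33 posts per restaurant
        count, total = len(likes), sum(likes)
        result[name] = (count, total, count * total)
    return result


# Hypothetical records for illustration only.
posts = [
    {"restaurant": "A", "likes": 10},
    {"restaurant": "A", "likes": 5},
    {"restaurant": "B", "likes": 7},
]
scores = instagram_scores(posts)
```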

Geocoding with Python & OpenStreetMap Server
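The slides do not show the geocoding script itself, so the following is a minimal sketch rather than the project's code. It assumes the OpenStreetMap server is the public Nominatim search endpoint, which returns a JSON list whose entries carry `lat` and `lon` as strings. The address and the sample response body are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: the public Nominatim search service.
NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def build_query(address):
    """Build a search URL asking for the single best match as JSON."""
    params = {"q": address, "format": "json", "limit": 1}
    return NOMINATIM_URL + "?" + urlencode(params)


def parse_first_hit(body):
    """Return (lat, lon) as floats from a JSON response body, or None if empty."""
    hits = json.loads(body)
    if not hits:
        return None
    return float(hits[0]["lat"]), float(hits[0]["lon"])


# Hypothetical address and response body for illustration only.
url = build_query("123 Example St, Seattle, WA")
sample_body = '[{"lat": "47.6097", "lon": "-122.3331", "display_name": "Example"}]'
coords = parse_first_hit(sample_body)
```

To geocode real addresses, the URL from `build_query` would be fetched (for example with `urllib.request`) and the response body passed to `parse_first_hit`. The public Nominatim service expects an identifying User-Agent header and a low request rate, so a short pause between the 30 lookups is advisable.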

Data Visualization
◇ CartoDB
◇ CSS
◇ SPSS Statistics

Yelp Interactive Map

Instagram Interactive Map

Statistical Correlation Analysis

Spearman's rho: Yelp vs. Instagram

                             Instagram
Yelp   Correlation Coefficient   .490**
       Sig. (2-tailed)           .006
       N                         30

**. Correlation is significant at the 0.01 level (2-tailed).
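The coefficient above was produced with SPSS. For readers who want to reproduce the calculation, here is a plain-Python sketch of Spearman's rho, computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The function names are illustrative, and the sketch returns only the coefficient, not the significance value.

```python
def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold the 30 Yelp scores and `y` the 30 Instagram scores, in the same restaurant order. If SciPy is available, `scipy.stats.spearmanr(x, y)` returns both the coefficient and a p-value.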

Conclusion
◇ Spearman’s Rank Correlation Coefficient (rho) = 0.49
■ Statistically significant, moderately positive correlation
■ Significant at the 99% confidence level (p = .006, below 0.01)
◇ Possible Hypotheses:
■ Demographic and behavioral differences between users on Yelp and Instagram

Limitations
◇ Instagram API (sampling issues)
◇ Social media data is always rapidly changing
■ Yelp contains long-term data accumulated over months or years
■ Instagram returns a maximum of 33 posts per restaurant, in real time

Bibliography

Data Sources:
1. Yelp (2015)
2. Instagram API (2015)

Base Maps:
1. CartoDB
2. OpenStreetMap

Q&A