Social Media Application Goal: Data Reduction for Data Visualization


Upload: scot-thornton

Post on 19-Dec-2015


TRANSCRIPT

Page 1: Social Media Application Goal: Data Reduction for Data Visualization

Social Media Application


Goal: Data Reduction for Data Visualization


Cluster and Factor Analysis

CLUSTER ANALYSIS: People

FACTOR ANALYSIS: Variables (Variable/Dimension Reduction)


Question: For car buying, what matters to customers?

Hypothesis

Data

Analytics

Charts

Answer


Brainstorm: Car Purchase


Surveys


Survey: Attribute Ratings

Q: Rate on a scale of 1-Low to 9-High (randomized list)

 #  Attribute                  Shopper #1   New BMW   1971 Olds 442 Conv.
 1  Initial Price                   9           3              4
 2  Style                           7           8              9
 3  # of Miles on Car               7           9              4
 4  Reliability                     7           6              2
 5  Color                           5           7              9
 6  Comfort                         6           7              5
 7  Horsepower                      2           6              9
 8  Safety                          6           7              1
 9  Financing Terms                 7           5              2
10  Country Origin                  1           7              7
11  Drive Type (Front, 4WD)         4           4              6
12  Miles Per Gallon (MPG)          6           7              5
13  Warranty Coverage               4           5              2
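
As a rough sketch, the attribute ratings above could be held in an R data frame like this. The column names (`shopper1`, `new_bmw`, `olds_442`) are made up for illustration, and reading the three columns as one rating set each is an assumption; the slide does not say how the data was stored.

```r
# Ratings copied from the survey slide; column names are hypothetical.
ratings <- data.frame(
  attribute = c("Initial Price", "Style", "# of Miles on Car", "Reliability",
                "Color", "Comfort", "Horsepower", "Safety", "Financing Terms",
                "Country Origin", "Drive Type (Front, 4WD)",
                "Miles Per Gallon (MPG)", "Warranty Coverage"),
  shopper1  = c(9, 7, 7, 7, 5, 6, 2, 6, 7, 1, 4, 6, 4),
  new_bmw   = c(3, 8, 9, 6, 7, 7, 6, 7, 5, 7, 4, 7, 5),
  olds_442  = c(4, 9, 4, 2, 9, 5, 9, 1, 2, 7, 6, 5, 2)
)

# Correlation and factor analysis expect one column per attribute,
# so transpose: rows become rating sets, columns become attributes.
wide <- t(as.matrix(ratings[, -1]))
colnames(wide) <- ratings$attribute
print(dim(wide))
```

With only three rows this is far too little data for factor analysis; a real survey would add one row per respondent.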

Many more features, options….


Survey: Attribute Ratings

Q: Rate on a scale of 1-9

1 Initial Price

2 Style

3 # of Miles on Car

4 Reliability

5 Color

6 Comfort

7 Horsepower

8 Safety

9 Financing Terms

10 Country Origin

11 Drive Type (Front, 4WD)

12 Miles Per Gallon (MPG)

13 Warranty Coverage


round(cor(data), digits = 2)

Correlation Matrix
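
A minimal sketch of this step on stand-in data. `cor()` returns full-precision values, so the rounding here goes through `round()`. The built-in `mtcars` data set is only a placeholder for the slide's `data` object, which is not included in the transcript.

```r
# Stand-in data: a few numeric columns from the built-in mtcars data set
# (an assumption; the survey data from the slides is not available).
data <- mtcars[, c("mpg", "hp", "wt", "qsec")]

# Pairwise Pearson correlations, rounded to two decimals for readability
cor_mat <- round(cor(data), digits = 2)
print(cor_mat)
```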


install.packages("corrgram")
library(corrgram)
corrgram(data)


Factor Analysis / Variable Reduction

Correlation Matrix

Correlated variables are grouped together and separated from other variables with low or no correlation

Factor Analysis
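
One way to see correlated variables grouping together, sketched in base R: reorder the correlation matrix so that strongly correlated variables sit next to each other. Hierarchical clustering on 1 - |r| is a technique chosen here for illustration, not the method used in the slides, and `mtcars` again stands in for the survey data.

```r
# Stand-in data (assumption): numeric columns from mtcars
data <- mtcars[, c("mpg", "hp", "wt", "qsec", "disp", "drat")]
r <- cor(data)

# Distance between variables: small when |correlation| is large
d <- as.dist(1 - abs(r))

# Cluster the variables and reuse the dendrogram order for rows and columns
ord <- hclust(d)$order
r_grouped <- round(r[ord, ord], 2)
print(r_grouped)
```

Blocks of large absolute values along the diagonal of `r_grouped` hint at variables that may belong to the same factor.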


Factor Analysis

F1, F2, F3, …, FN


First & Second Principal Components

Z1 and Z2 are two linear combinations.

• Z1 has the highest variation (spread of values)

• Z2 has the lowest variation
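
The two bullets above can be checked with a small base-R sketch. The data is simulated (it is not the calories/rating data from the slide), and `prcomp()` is used here simply because it ships with R.

```r
set.seed(1)

# Two correlated variables (simulated stand-ins)
x <- rnorm(200)
y <- 0.8 * x + rnorm(200, sd = 0.5)
xy <- cbind(x, y)

# Principal components: each column of pc$x is a linear combination of x and y
pc <- prcomp(xy, center = TRUE, scale. = FALSE)
z <- pc$x                      # column 1 plays the role of Z1, column 2 of Z2

# Variance along each component: the first is the largest, the second the smallest
vars <- apply(z, 2, var)
print(vars)

# The components are uncorrelated and together keep all of the original variance
print(cor(z[, 1], z[, 2]))
print(c(total_pc = sum(vars), total_original = var(x) + var(y)))
```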

[Figure: scatter plot of rating against calories, with the principal component axes z1 and z2 overlaid]


Factor Analysis

F1, F2, F3, …, FN

b's = Factor Loadings


Packages

Library   PC Method            Rotation   Plot
psych     fa(), principal()    Yes        No
stats     princomp()           No         Yes


Principal Components Analysis


Model

model <- princomp(data, cor = TRUE)
summary(model)
biplot(model)


Output

# scree plot
plot(model, type = "lines")
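
The numbers behind a scree plot can also be read off directly. This is a sketch on stand-in data (`mtcars`, an assumption); with `cor = TRUE`, the squared `sdev` values of a `princomp` fit are the eigenvalues of the correlation matrix.

```r
# Stand-in data (assumption): numeric columns from mtcars
data <- mtcars[, c("mpg", "hp", "wt", "qsec", "disp", "drat")]
model <- princomp(data, cor = TRUE)

# Variance carried by each component, as plotted in a scree plot
eig  <- model$sdev^2
prop <- eig / sum(eig)         # share of total variance per component
cum  <- cumsum(prop)           # running total

print(round(rbind(eigenvalue = eig, proportion = prop, cumulative = cum), 3))
```

The point where the cumulative share flattens out is one informal guide to how many components to keep.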


Psych Package


Psych Package – fa

library(psych)
rmodel <- fa(r = corMat, nfactors = 3, rotate = "none", fm = "pa")


Rotation

Each variable (circle) loads on both factors, so there is no clear way to separate the variables into different factors and give the factors useful names.

[Figure: variables plotted as circles against Factor 1 and Factor 2]

Rotations courtesy of Professor Paul Berger


Rotation

“CLASSIC CASE”

After rotation of ~45°

Now all variables load on one factor and not at all on the other. This is an overly “dramatic” case.

• Not correlated = orthogonal
• Varimax = orthogonal rotation

Rotations courtesy of Professor Paul Berger
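
A small sketch of an orthogonal rotation using `varimax()` from base R's stats package. The loadings are invented for illustration: a clean two-factor pattern is first turned by 40 degrees (an arbitrary angle near the slide's ~45°) so that every variable loads on both factors, and varimax is then asked to undo the mixing.

```r
# Hypothetical "clean" loadings: V1-V3 on one factor, V4-V6 on the other
simple <- rbind(c(0.6, 0), c(0.7, 0), c(0.5, 0),
                c(0, 0.6), c(0, 0.7), c(0, 0.5))

# Mix the factors by rotating the axes 40 degrees (illustrative assumption)
theta <- 40 * pi / 180
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)
L <- simple %*% R
dimnames(L) <- list(paste0("V", 1:6), c("F1", "F2"))
print(round(L, 2))             # every variable loads on both factors

# Varimax searches for an orthogonal rotation with simpler loadings
rot <- varimax(L, normalize = FALSE, eps = 1e-8)
rotated <- unclass(rot$loadings)
print(round(rotated, 2))       # each variable now loads mainly on one factor

# The rotation matrix is orthogonal, so the factors stay uncorrelated
print(round(rot$rotmat %*% t(rot$rotmat), 6))
```

Column order and signs of the rotated loadings are arbitrary, so the recovered factors may appear swapped or flipped relative to `simple`.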


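Psych Package – fa

library(psych)
# oblimin is an oblique rotation (factors may correlate); psych uses the GPArotation package for it
rmodel <- fa(r = corMat, nfactors = 3, rotate = "oblimin", fm = "pa")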


Psych Package – principal

library(psych)
fit <- principal(ratings6, nfactors = 4, rotate = "none")


Psych Package – principal

library(psych)
fit <- principal(ratings6, nfactors = 4, rotate = "varimax")

# reorder the columns so variables on the same factor sit together
corrgram(ratings6[, c(1, 2, 9, 12, 3, 4, 6, 8, 10, 5, 11, 7, 13)])

Orthogonal / No Correlation


3 Factor vs. 4 Factor


3 Factor vs. 4 Factor

Style, Comfort, Color, Upgrade Packages, Reliability, Safety, Country Origin, Horsepower, Nice Dash, Miles Per Gallon, Initial Price, # of Miles on Car, Financing Options

Aaahh!!! Factor

Money


Perceptual Map

Factor Loadings

Brand Ratings

Weights

Average

Variance


Which One? Which Car?

[Figure: perceptual map with a Price axis ($ to $$$) and an "Aaaah factor" axis (BORING to Sweet!!!)]


Factor Analysis Recap

Component Matrix (a)

              Component
         1            2            3
D01      .714        -7.61E-02     .327
D02      .539         .226        -.145
D03      .796        -3.02E-02     .338
D04      .789         6.734E-02   -.379
D05      .712         .107        -.499
D06      .747        -2.02E-02    -.205
D07      6.412E-03    .795        -4.87E-02
D08     -.130         .841         3.175E-02
D09      .675        -4.47E-02     .512
D10     -5.09E-02     .701         .251
D11      .791         1.682E-02    6.907E-02

a. 3 components extracted.


Dimensionality Reduction

Applications Algorithms