final stat project
TRANSCRIPT
-
8/17/2019 Final Stat Project
November 21, 2011
By: Parth Patel (Period 2) and Rachel Seitsinger (Period 5)
Table of Contents
I. Introduction
a. Question
b. Problem
c. Prediction
II. Data table
III. Graph your data
a. Number of contacts graph and analysis
i. Appropriate statistical measures for number of contacts
b. Number of Facebook friends graph and analysis
i. Appropriate statistical measures for number of Facebook friends
c. Exploring the Association
i. Scatter Plot
ii. Least Squares Regression Line
iii. Residual Plot
iv. Correlation Coefficient
v. Unusual Observations
d. Discuss
i. The association of our variables
ii. The strength of the linear association
iii. The usefulness of this linear model for prediction
f. Alligator Data
g. Choosing an explanatory value
h. Answering the original question
I. Introduction
a. Question: The question that we will be answering is: Does the number of Facebook friends determine the number of friends a person has in real life?
b. Problem: The question we would like to answer is whether the number of Facebook friends a person has is an accurate predictor of whether or not that person has a lot of friends in real life. To measure the number of friends a person has in real life, we decided to take the number of phone numbers one has in their cell phone, with the hope that this measurement gives us a good idea of how many people they talk to. This interests us because Facebook is such a huge part of teenagers' everyday lives and we would like to determine if this social networking site holds any truth when it states on a profile how many "friends" someone has.
c. Prediction: We predict that the more Facebook friends someone has, the more friends they will have in real life, and therefore the more contacts they will have in their phone. The scatter plot should have a positive, linear association.
II. Data Table
#    Name          # of contacts in cell phone    # of Facebook Friends
1    Anh Truong    100    2hou
@0   Jenny Foster  C
i. Appropriate Statistical Measures for number of contacts
Measure    # of contacts
Minimum    15
Q1
b. Number of Facebook Friends
[Figure: Collection 1 histogram and box plot of number of Facebook friends]
The shape of the histogram for number of Facebook friends is bimodal and skewed to the lower numbers. This shows that most people had a number of friends around 500 or […]00, and fewer people had friends below C00. Judging by the box plot, the middle 50% of people had friend numbers from about C00 friends to about
i. Appropriate Statistical Measures for number of Facebook friends
Measure    # of Facebook Friends
Minimum    ?0
Q1         C00
Median     5@?.5
Q3
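The measures in a table like this are the five-number summary that a box plot draws. As a rough sketch of how they can be computed, the Python below uses the standard `statistics` module. The `friend_counts` list is made-up illustration data, not the survey data, and `five_number_summary` is a hypothetical helper name. Software packages and textbooks use slightly different quartile rules, so the quartiles here may not match the ones in the table's software exactly.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # "inclusive" treats the data as the whole population of interest;
    # other quartile conventions can give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical friend counts, for illustration only.
friend_counts = [50, 200, 300, 450, 500, 550, 600, 700, 900]

minimum, q1, median, q3, maximum = five_number_summary(friend_counts)
iqr = q3 - q1  # spread of the middle 50% of the values

print("min", minimum, "Q1", q1, "median", median, "Q3", q3, "max", maximum)
print("IQR", iqr)
```

The interquartile range `iqr` is the width of the box in the box plot, which is what "the middle 50% of people" refers to.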
c. Exploring the Association
i. Scatter Plot
[Scatter plot: Collection 1]
The scatter plot between number of Facebook friends (x) and number of contacts in one's cell phone (y) has a positive, moderately weak, linear association.
ii. Least Squares Regression Line
number_of_contacts = 0.166 * number_of_fb_friends + 70; r^2 = 0.21
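To make the fitted line concrete, here is a small Python sketch that plugs a friend count into it. The slope and intercept are copied from the equation above. The function names and the example inputs (500 friends, 200 contacts) are assumptions made for illustration and are not rows from the data table.

```python
# Coefficients copied from the fitted line above.
SLOPE = 0.166      # extra predicted contacts per additional Facebook friend
INTERCEPT = 70.0   # predicted contacts for someone with 0 Facebook friends


def predict_contacts(fb_friends):
    """Return the number of phone contacts the fitted line predicts."""
    return SLOPE * fb_friends + INTERCEPT


def residual(fb_friends, actual_contacts):
    """Observed minus predicted contacts (positive: the line underestimates)."""
    return actual_contacts - predict_contacts(fb_friends)


if __name__ == "__main__":
    # 500 friends and 200 contacts are illustrative inputs only.
    print(round(predict_contacts(500), 1))
    print(round(residual(500, 200), 1))
```

With a slope of 0.166, each additional 100 Facebook friends raises the predicted number of contacts by about 17.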
[Scatter plot with least squares regression line: Collection 1, Number of Contacts vs. number_of_fb_friends]
The correlation coefficient is 0.CC
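For readers who want to see where a correlation coefficient comes from, the sketch below computes Pearson's r from paired values using only the Python standard library. The `friends` and `contacts` lists are invented illustration data, not the survey data, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data, for illustration only.
friends = [100, 200, 300, 400, 500]
contacts = [90, 80, 130, 120, 160]

r = pearson_r(friends, contacts)
print(round(r, 3), round(r ** 2, 3))  # r and r^2 for the toy data
```

For a least squares line with one explanatory variable, the r^2 reported with the line is the square of this r, which is why the two values are usually read together.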
iii. Usefulness of this Linear Model for Prediction
Our r^2 value, which is 0.200C, tells us that 20.0C% of the variation in number of contacts can be explained by the linear model relating number of Facebook friends and number of phone contacts. This means that the other
This is the data before the transformation.
The residual plot is clearly curved, making the linear model inappropriate for this set of data.
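A curved residual plot can also be seen numerically. The sketch below fits a least squares line to deliberately curved toy data (y = x squared, not the alligator data) and lists the residuals. The helper names `fit_line` and `residuals` are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """Observed minus predicted value for each point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical curved data: y = x^2.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

# Positive at both ends and negative in the middle: a U-shaped pattern.
print(residuals(xs, ys))
```

When residuals follow a pattern like this instead of scattering randomly around zero, a straight line is missing some of the structure in the data.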
To re-express the data, we first used the natural log of the dependent variable (Weight in lbs). The data came out straightened like this:
e^Ln(weight) = e^C.
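The step of raising e to both sides undoes the natural log, so a prediction made on the log scale comes back in pounds. A minimal Python sketch of the whole re-expression is given below. It uses invented exponential toy data rather than the alligator data, and the names `fit_line` and `predict_weight` are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data that grows exponentially: weight = 2 * e^(0.5 * x).
xs = [0, 1, 2, 3, 4, 5]
weights = [2.0 * math.exp(0.5 * x) for x in xs]

# Step 1: re-express the response with the natural log.
log_weights = [math.log(w) for w in weights]

# Step 2: fit a straight line to ln(weight) versus x.
slope, intercept = fit_line(xs, log_weights)


def predict_weight(x):
    """Predict on the log scale, then back-transform with e^(...)."""
    return math.exp(intercept + slope * x)


print(round(slope, 3), round(intercept, 3), round(predict_weight(4), 2))
```

Because the toy data are exactly exponential, the line fits ln(weight) perfectly here; real data would leave some scatter around it.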
meant that only a small 20.0C% of variation in number of contacts can be explained by the linear model for number of Facebook friends. So while there was a clear linear association, the weakness of the association means that the number of Facebook friends can only weakly determine the number of real friends. However, some flaws could be that not everyone puts friends into their cell phones, or some may have had their contacts deleted recently, which would change the data.