Describing data with graphics and numbers
DESCRIPTION
Describing data with graphics and numbers. Types of data: categorical variables (also known as class variables or nominal variables) and quantitative variables (also known as numerical variables), which are either continuous or discrete. Graphing categorical variables.
TRANSCRIPT
![Page 1: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/1.jpg)
Describing data with graphics and numbers
![Page 2: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/2.jpg)
Types of Data
• Categorical variables – also known as class variables or nominal variables
• Quantitative variables – also known as numerical variables
  – either continuous or discrete
![Page 3: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/3.jpg)
Graphing categorical variables
![Page 4: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/4.jpg)
Ten most common causes of death in Americans between 15 and 19 years old in 1999.
![Page 5: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/5.jpg)
Bar graphs
![Page 6: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/6.jpg)
Graphing numerical variables
![Page 7: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/7.jpg)
Heights of BIOL 300 students (cm)
165 168 163 173 170 163 170 155 152 190 170 168 142 160 154 165 156 177 173 165 165 175
155 166 168 165 180 165
![Page 8: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/8.jpg)
Stem-and-leaf plot
![Page 9: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/9.jpg)
Stem-and-leaf plot
19 | 0
18 | 0
17 | 0 0 0 3 3 5 7
16 | 0 3 3 5 5 5 5 5 5 6 8 8 8
15 | 2 4 5 5 6
14 | 2
![Page 10: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/10.jpg)
Frequency table
| Height group (cm) | Frequency |
|---|---|
| 141-150 | |
| 151-160 | |
| 161-170 | |
| 171-180 | |
| 181-190 | |
![Page 11: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/11.jpg)
Frequency table
| Height group (cm) | Frequency |
|---|---|
| 141-150 | 1 |
| 151-160 | 6 |
| 161-170 | 15 |
| 171-180 | 5 |
| 181-190 | 1 |
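The frequency table can be reproduced from the raw heights listed on the earlier slide. The following is a minimal Python sketch; the function name `frequency_table` and the representation of bins as inclusive pairs are illustrative choices, not part of the original slides.

```python
# Heights of BIOL 300 students (cm), copied from the earlier slide.
heights = [165, 168, 163, 173, 170, 163, 170, 155, 152, 190, 170, 168, 142,
           160, 154, 165, 156, 177, 173, 165, 165, 175, 155, 166, 168, 165,
           180, 165]

# Height groups as inclusive (low, high) pairs, matching the table.
bins = [(141, 150), (151, 160), (161, 170), (171, 180), (181, 190)]


def frequency_table(values, bins):
    """Count how many values fall inside each inclusive (low, high) bin."""
    return [sum(low <= v <= high for v in values) for low, high in bins]


counts = frequency_table(heights, bins)
for (low, high), count in zip(bins, counts):
    print(f"{low}-{high}\t{count}")
```

The same counts, drawn as bar heights over the bins, give the histogram on the following slides.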
![Page 12: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/12.jpg)
Histogram
![Page 13: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/13.jpg)
Histogram
![Page 14: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/14.jpg)
Histogram
Frequency distribution
![Page 15: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/15.jpg)
Histogram with more data
![Page 16: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/16.jpg)
![Page 17: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/17.jpg)
Cumulative Frequency Distribution
[Figure: cumulative frequency (0 to 1) plotted against height (in cm) of Bio300 students, 150 to 210 cm]
![Page 18: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/18.jpg)
Cumulative Frequency Distribution
[Figure: cumulative frequency (0 to 1) plotted against height (in cm) of Bio300 students, 150 to 210 cm, with the 50th percentile (median) and the 90th percentile marked]
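Reading a percentile off a cumulative frequency distribution amounts to finding the smallest value at which the cumulative frequency reaches the target fraction. The sketch below applies that idea to the 28 heights listed earlier; the plotted curve may be based on a larger data set, so these numbers need not match the figure. The function names are hypothetical, and other percentile conventions (for example, interpolating between observations) exist.

```python
# Heights of BIOL 300 students (cm), copied from the earlier slide.
heights = [165, 168, 163, 173, 170, 163, 170, 155, 152, 190, 170, 168, 142,
           160, 154, 165, 156, 177, 173, 165, 165, 175, 155, 166, 168, 165,
           180, 165]


def cumulative_frequency(values, x):
    """Fraction of observations that are less than or equal to x."""
    return sum(v <= x for v in values) / len(values)


def percentile(values, p):
    """Smallest observed value whose cumulative frequency is at least p (0 < p <= 1)."""
    ordered = sorted(values)
    n = len(ordered)
    for rank, value in enumerate(ordered, start=1):
        if rank / n >= p:
            return value
    return ordered[-1]


print(percentile(heights, 0.5))   # 50th percentile (median): 165
print(percentile(heights, 0.9))   # 90th percentile: 177
```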
![Page 19: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/19.jpg)
Associations between two categorical variables
![Page 20: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/20.jpg)
Association between reproductive effort and avian malaria
Table 2.3A. Contingency table showing incidence of malaria in female great tits subjected to experimental egg removal.

| | Control group | Egg removal group | Row total |
|---|---|---|---|
| Malaria | 7 | 15 | 22 |
| No malaria | 28 | 15 | 43 |
| Column total | 35 | 30 | 65 |
![Page 21: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/21.jpg)
![Page 22: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/22.jpg)
Mosaic plot
[Figure: relative frequency (0.0 to 1.0) by treatment (Control, Egg removal)]
Figure 2.3B. Mosaic plot for reproductive effort and avian malaria in great tits (Table 2.3A). Blue fill indicates diseased birds, whereas white fill indicates birds free of malaria. n = 65 birds.
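The heights of the filled areas in a mosaic plot are the relative frequencies within each treatment group. A minimal Python sketch, using the counts from Table 2.3A (the dictionary layout and the function name `relative_frequencies` are illustrative assumptions):

```python
# Counts from Table 2.3A, organised by treatment group.
table = {
    "Control": {"Malaria": 7, "No malaria": 28},
    "Egg removal": {"Malaria": 15, "No malaria": 15},
}


def relative_frequencies(counts):
    """Convert the counts for one treatment group into proportions that sum to 1."""
    total = sum(counts.values())
    return {status: count / total for status, count in counts.items()}


proportions = {group: relative_frequencies(counts) for group, counts in table.items()}
for group, props in proportions.items():
    print(group, props)
```

In these data, 7 of 35 control birds (0.2) and 15 of 30 egg-removal birds (0.5) had malaria.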
![Page 23: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/23.jpg)
Grouped Bar Graph
[Figure: bars for Malaria and No malaria within each of the Control and Egg removal groups; vertical axis from 0 to 25]
![Page 24: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/24.jpg)
Associations between categorical and numerical variables
![Page 25: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/25.jpg)
Multiple histograms
[Figure: two stacked histograms of protein length (0 to 1000), one panel titled Non-conserved and one titled Conserved; vertical axes from 0 to 600]
![Page 26: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/26.jpg)
Associations between two numerical variables
![Page 27: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/27.jpg)
Scatterplots
![Page 28: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/28.jpg)
Scatterplots
![Page 29: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/29.jpg)
Evaluating Graphics
• Lie factor
• Chartjunk
• Efficiency
![Page 30: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/30.jpg)
Don’t mislead with graphics
![Page 31: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/31.jpg)
Better representation of truth
![Page 32: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/32.jpg)
Lie Factor
• Lie factor = (size of effect shown in graphic) / (size of effect in data)
![Page 33: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/33.jpg)
Lie Factor Example
Effect in graphic: 2.33 / 0.08 = 29.1
Effect in data: 6748 / 5844 = 1.15
Lie factor = 29.1 / 1.15 = 25.3
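The arithmetic in this example can be checked with a short Python sketch. The function name `lie_factor` and its parameter names are hypothetical; the four numbers are the ones on the slide.

```python
def lie_factor(graphic_large, graphic_small, data_large, data_small):
    """Ratio of the effect size shown in the graphic to the effect size in the data."""
    effect_in_graphic = graphic_large / graphic_small
    effect_in_data = data_large / data_small
    return effect_in_graphic / effect_in_data


value = lie_factor(2.33, 0.08, 6748, 5844)
print(round(value, 1))
```

Without rounding the intermediate ratios, the result is about 25.2; the slide's 25.3 comes from dividing the rounded values 29.1 and 1.15.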
![Page 34: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/34.jpg)
Chartjunk
![Page 35: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/35.jpg)
[Figure: example chart of quarterly values (1st Qtr to 4th Qtr, axis from 0 to 100) for the North, West, and East series]
![Page 36: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/36.jpg)
Needless 3D Graphics
![Page 37: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/37.jpg)
![Page 38: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/38.jpg)
Summary: Graphical methods for frequency distributions
| Type of data | Method |
|---|---|
| Categorical data | Bar graph |
| Numerical data | Histogram; cumulative frequency distribution |
![Page 39: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/39.jpg)
Summary: Associations between variables
| Response variable | Explanatory variable: Categorical | Explanatory variable: Numerical |
|---|---|---|
| Categorical | Contingency table; grouped bar graph; mosaic plot | |
| Numerical | Multiple histograms; cumulative frequency distributions | Scatter plot |
![Page 40: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/40.jpg)
Great book on graphics
![Page 41: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/41.jpg)
Describing data
![Page 42: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/42.jpg)
Two common descriptions of data
• Location (or central tendency)
• Width (or spread)
![Page 43: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/43.jpg)
Measures of location
Mean
Median
Mode
![Page 44: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/44.jpg)
Mean
$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$$
n is the size of the sample
![Page 45: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/45.jpg)
Mean
$Y_1 = 56$, $Y_2 = 72$, $Y_3 = 18$, $Y_4 = 42$
![Page 46: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/46.jpg)
Mean
$Y_1 = 56$, $Y_2 = 72$, $Y_3 = 18$, $Y_4 = 42$
$\bar{Y}$ = (56 + 72 + 18 + 42) / 4 = 47
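The same calculation as a minimal Python sketch (the function name `mean` is an illustrative choice; Python's standard library also offers `statistics.mean`):

```python
def mean(values):
    """Sample mean: the sum of the observations divided by the sample size n."""
    return sum(values) / len(values)


sample = [56, 72, 18, 42]
print(mean(sample))  # 47.0
```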
![Page 47: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/47.jpg)
Median
• The median is the middle measurement in a set of ordered data.
![Page 48: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/48.jpg)
The data:
18 28 24 25 36 14 34
![Page 49: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/49.jpg)
The data:
18 28 24 25 36 14 34
can be put in order:
14 18 24 25 28 34 36
Median is 25.
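The sort-then-pick-the-middle procedure can be sketched in Python as follows. The slide covers only an odd number of observations; averaging the two middle values for an even sample size is a common convention and is an assumption here.

```python
def median(values):
    """Middle measurement of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even sample size (assumed convention): average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


data = [18, 28, 24, 25, 36, 14, 34]
print(sorted(data))   # [14, 18, 24, 25, 28, 34, 36]
print(median(data))   # 25
```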
![Page 50: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/50.jpg)
[Figure: frequency histogram of mouse weight at 50 days old, in a line selected for small size, with the mean, mode, and median marked]
![Page 51: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/51.jpg)
Mean vs. median in politics
• 2004 U.S. economy
• Republicans: times are good
  – Mean income increasing ~4% per year
• Democrats: times are bad
  – Median family income fell
• Why?
![Page 52: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/52.jpg)
Mean 169.3 cm
Median 170 cm
Mode 165-170 cm
![Page 53: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/53.jpg)
[Figure: cumulative frequency (0 to 1) plotted against height (in cm) of Bio300 students, 150 to 210 cm]
![Page 54: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/54.jpg)
Measures of width
• Range
• Standard deviation
• Variance
• Coefficient of variation
![Page 55: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/55.jpg)
Range
14 17 18 20 22 22 24 25 26 28 28 28 30 34 36
![Page 56: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/56.jpg)
Range
14 17 18 20 22 22 24 25 26 28 28 28 30 34 36
The range is 36-14 = 22
![Page 57: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/57.jpg)
![Page 58: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/58.jpg)
Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{N} (Y_i - \mu)^2}{N}$$
![Page 59: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/59.jpg)
Sample variance
$$s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}$$
n is the sample size
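A minimal Python sketch of the sample variance, applied to the 15 values from the range example. The function name `sample_variance` is an illustrative choice; the standard library's `statistics.variance` computes the same quantity.

```python
def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)


# Data from the range example earlier in the slides.
data = [14, 17, 18, 20, 22, 22, 24, 25, 26, 28, 28, 28, 30, 34, 36]
print(sample_variance(data))
```

Dividing by n - 1 rather than n is what distinguishes this from the population variance formula on the previous slide.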
![Page 60: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/60.jpg)
Shortcut for calculating sample variance
$$s^2 = \left(\frac{n}{n - 1}\right)\left(\frac{\sum_{i=1}^{n} Y_i^2}{n} - \bar{Y}^2\right)$$
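As a quick check that the shortcut agrees with the definition of the sample variance, the sketch below evaluates both on the values from the range example. The function names are hypothetical.

```python
def variance_definition(values):
    """Sample variance from squared deviations about the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)


def variance_shortcut(values):
    """Sample variance from the mean of squares minus the squared mean."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(y ** 2 for y in values) / n
    return (n / (n - 1)) * (mean_of_squares - mean ** 2)


data = [14, 17, 18, 20, 22, 22, 24, 25, 26, 28, 28, 28, 30, 34, 36]
print(variance_definition(data))
print(variance_shortcut(data))
```

The two results agree up to floating-point rounding. On a computer the shortcut can lose precision when the mean is large relative to the spread, so the definition is usually the safer form in code.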
![Page 61: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/61.jpg)
Standard deviation (SD)
• Positive square root of the variance
σ is the true standard deviation
s is the sample standard deviation
![Page 62: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/62.jpg)
In class exercise
Calculate the variance and standard deviation of a sample
with the following data:
6, 1, 2
![Page 63: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/63.jpg)
Answer
Variance = 7
Standard deviation = $\sqrt{7}$ ≈ 2.65
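One way to arrive at this answer, written out step by step with the sample variance formula (n = 3):

$$
\begin{aligned}
\bar{Y} &= \frac{6 + 1 + 2}{3} = 3 \\
s^2 &= \frac{(6 - 3)^2 + (1 - 3)^2 + (2 - 3)^2}{3 - 1} = \frac{9 + 4 + 1}{2} = 7 \\
s &= \sqrt{7} \approx 2.65
\end{aligned}
$$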
![Page 64: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/64.jpg)
Coefficient of variation (CV)
$$\mathrm{CV} = 100\,\frac{s}{\bar{Y}}$$
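A minimal Python sketch of the coefficient of variation, applied to the sample 6, 1, 2 from the in-class exercise. The function name is an illustrative choice.

```python
import math


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the sample mean."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((y - mean) ** 2 for y in values) / (n - 1))
    return 100 * s / mean


cv = coefficient_of_variation([6, 1, 2])
print(round(cv, 1))  # 88.2
```

Because s and the mean carry the same units, the CV is unitless, which makes it convenient for comparing variability between measurements on different scales.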
![Page 65: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/65.jpg)
Equal means, different variances
[Figure: frequency distributions of Value (-5 to 10) with equal means and variances V = 1, V = 2, and V = 10]
![Page 66: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/66.jpg)
Manipulating means
• The mean of the sum of two variables:
E[X + Y] = E[X] + E[Y]
• The mean of the sum of a variable and a constant:
E[X + c] = E[X] + c
• The mean of a product of a variable and a constant:
E[c X] = c E[X]
• The mean of a product of two variables:
E[X Y] = E[X] E[Y]
if X and Y are independent (more generally, whenever X and Y are uncorrelated).
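These rules can be checked exactly on a small example. The sketch below assumes X and Y are two independent fair dice (an illustrative choice, not from the slides) and uses exact fractions so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Two independent fair dice: all 36 (x, y) pairs are equally likely.
pairs = list(product(range(1, 7), repeat=2))
c = 3  # arbitrary constant chosen for the demonstration


def expectation(f):
    """Exact mean of f(x, y) over the 36 equally likely outcomes."""
    return sum(Fraction(f(x, y)) for x, y in pairs) / len(pairs)


e_x = expectation(lambda x, y: x)
e_y = expectation(lambda x, y: y)

checks = {
    "E[X + Y] = E[X] + E[Y]": expectation(lambda x, y: x + y) == e_x + e_y,
    "E[X + c] = E[X] + c": expectation(lambda x, y: x + c) == e_x + c,
    "E[c X] = c E[X]": expectation(lambda x, y: c * x) == c * e_x,
    "E[X Y] = E[X] E[Y]": expectation(lambda x, y: x * y) == e_x * e_y,
}
for rule, holds in checks.items():
    print(rule, holds)
```

All four checks hold here. The first three hold for any pair of variables; the product rule relies on the dice being independent.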
![Page 67: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/67.jpg)
Manipulating variance
• The variance of the sum of two variables:
Var[X + Y] = Var[X] + Var[Y]
if X and Y are independent (more generally, whenever X and Y are uncorrelated).
• The variance of the sum of a variable and a constant:
Var[X + c] = Var[X]
• The variance of a product of a variable and a constant:
Var[c X] = c² Var[X]
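The variance rules can be checked the same way. The sketch below again assumes two independent fair dice as X and Y (an illustrative choice) and includes a dependent case, X added to itself, where the sum rule fails.

```python
from fractions import Fraction
from itertools import product

# Two independent fair dice: all 36 (x, y) pairs are equally likely.
pairs = list(product(range(1, 7), repeat=2))
c = 3  # arbitrary constant chosen for the demonstration


def variance(f):
    """Exact population variance of f(x, y) over the 36 equally likely outcomes."""
    values = [Fraction(f(x, y)) for x, y in pairs]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


var_x = variance(lambda x, y: x)
var_y = variance(lambda x, y: y)

print(variance(lambda x, y: x + y) == var_x + var_y)   # True: X and Y independent
print(variance(lambda x, y: x + c) == var_x)           # True: shifting leaves variance unchanged
print(variance(lambda x, y: c * x) == c ** 2 * var_x)  # True: scaling multiplies variance by c squared
# Dependent case: X + X = 2X has variance 4 Var[X], not Var[X] + Var[X].
print(variance(lambda x, y: x + x) == var_x + var_x)   # False
```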
![Page 68: Describing data with graphics and numbers](https://reader036.vdocuments.us/reader036/viewer/2022081519/56813a48550346895da23daf/html5/thumbnails/68.jpg)
Parents’ heights
| | Mean | Variance |
|---|---|---|
| Father Height | 174.3 | 71.7 |
| Mother Height | 160.4 | 58.3 |
| Father Height + Mother Height | 334.7 | 184.9 |
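In this table the means add exactly (174.3 + 160.4 = 334.7), but the variances do not (71.7 + 58.3 = 130.0, not 184.9), which suggests that the two heights are not independent in these data. The general identity Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y] is not stated on the slides; the sketch below uses it to back out the covariance implied by the table. The function name is hypothetical.

```python
def implied_covariance(var_x, var_y, var_of_sum):
    """Solve Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y] for Cov[X, Y]."""
    return (var_of_sum - var_x - var_y) / 2


# Variances from the table above.
var_father = 71.7
var_mother = 58.3
var_sum = 184.9

cov = implied_covariance(var_father, var_mother, var_sum)
print(round(cov, 2))  # 27.45
```

A positive implied covariance means that taller fathers tend to be paired with taller mothers in this sample.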