Measures of Spread
Chapter 3.3 – Tools for Analyzing Data
I can: calculate and interpret measures of spread
MSIP/Home Learning: p. 168 #2b, 3b, 4, 6, 7, 10
What is spread?
Measures of central tendency do not always tell you everything!
The histograms have identical mean and median, but the spread is different
Spread tells you how widely the data are dispersed
[Figure: two histograms, both titled "data Histogram", with Count on the vertical axis; the first plots data over 2 to 9 (counts up to 7), the second plots sp over 2 to 10 (counts up to 4)]
Why worry about spread?
Spread is how closely the values cluster around the middle value
Less spread means you have greater confidence that values will fall within a particular range
Important for making predictions
Measures of Spread
There are 3 Measures of Central Tendency: Mean, Median, Mode
All measure the centre of a set of data
We will also study 3 Measures of Spread: Range, Interquartile Range, Standard Deviation
All measure how spread out data is
Smaller value = less spread, more consistent
Measures of Spread
1) Range = (max) – (min)
Indicates the size of the interval that contains 100% of the data
2) Interquartile Range: IQR = Q3 – Q1
Q1 is the lower half median and Q3 is the upper half median
Indicates the size of the interval that contains the middle 50% of the data
Quartiles Example
26 28 34 36 38 38 40 41 41 44 45 46 51 54 55
Q2 = 41 (median)
Q1 = 36 (lower half median)
Q3 = 46 (upper half median)
IQR = Q3 – Q1 = 46 – 36 = 10 (the middle 50% of the data spans 10 units)
If a quartile falls between 2 values, it is calculated as the average of the two values
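The quartile calculation above can be sketched in Python. This is a minimal illustration, assuming the "median of each half" convention used on this slide, where the median itself is left out of both halves when the number of values is odd. The function name `quartiles` is hypothetical, and other software (spreadsheets, calculators) may use a slightly different quartile convention and give slightly different answers.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]            # values below the median position
    upper = values[half + n % 2:]    # skips the median itself when n is odd
    return median(lower), median(values), median(upper)

data = [26, 28, 34, 36, 38, 38, 40, 41, 41, 44, 45, 46, 51, 54, 55]
q1, q2, q3 = quartiles(data)
data_range = max(data) - min(data)   # Range = (max) - (min)
iqr = q3 - q1                        # IQR = Q3 - Q1
print(q1, q2, q3, iqr, data_range)   # → 36 41 46 10 29
```

When a half has an even number of values, `median` averages the two middle values, which matches the rule stated on the slide.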
A More Useful Measure of Spread
Range is very basic
Does not take clusters or outliers into account
Interquartile range is somewhat useful
Takes clusters and outliers into account
Shown visually in a Box-and-Whisker Plot
Standard deviation is very useful
Roughly the typical distance from the mean, taken over all data points
Standard Deviation
1. Find the mean (average)
2. Find the deviation for each data point
(data) – (mean)
3. Square the deviations
4. Average the squares of the deviations (this is called the variance)
5. Take the square root of the variance
Deviation
12 24 36 48 60 72 84
The mean of these numbers is 48
Deviation = (data) – (mean)
The deviation for 24 is 24 – 48 = –24
The deviation for 84 is 84 – 48 = 36
Example of Standard Deviation
26 28 34 36
mean = (26 + 28 + 34 + 36) / 4 = 31
$\sigma^2 = \frac{(26-31)^2 + (28-31)^2 + (34-31)^2 + (36-31)^2}{4}$
$\sigma^2 = \frac{25 + 9 + 9 + 25}{4} = 17$
$\sigma = \sqrt{17} \approx 4.1$
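The five steps for standard deviation can also be followed in code. The sketch below is only an illustration of those steps applied to the four values from this example; the function name `population_std_dev` is hypothetical.

```python
from math import sqrt

def population_std_dev(data):
    """Return (variance, standard deviation), dividing by n."""
    mean = sum(data) / len(data)                # 1. find the mean
    deviations = [x - mean for x in data]       # 2. (data) - (mean)
    squares = [d ** 2 for d in deviations]      # 3. square the deviations
    variance = sum(squares) / len(squares)      # 4. average the squares
    return variance, sqrt(variance)             # 5. square root of the variance

variance, sigma = population_std_dev([26, 28, 34, 36])
print(variance, round(sigma, 1))   # → 17.0 4.1
```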
Measure of Spread - Recap
Measures of Spread are numbers indicating how spread out data is
Smaller value for any measure of spread means data is more consistent
1) Range = Max – Min
2) Interquartile Range: IQR = Q3 – Q1
Q1 = first half median, Q3 = second half median
3) Standard Deviation
Find mean (average)
Find all deviations: (data) – (mean)
Square all and average them; this is the variance, σ²
Take the square root to get the standard deviation, σ
Standard Deviation
σ² (lower case sigma squared) is used to represent variance
σ is used to represent standard deviation
σ is commonly used to measure the spread of data, with larger values of σ indicating greater spread
We are using a population standard deviation:
$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$
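To see what "population" means here, it may help to compare it with the sample standard deviation, which divides by n – 1 instead of n. The sample version is not covered on these slides, so it appears below only as a contrast. The sketch uses Python's standard `statistics` module and reuses the values 26, 28, 34, 36 from the worked example.

```python
from statistics import pstdev, stdev

data = [26, 28, 34, 36]

sigma_n = pstdev(data)           # population: divides by n (used in this chapter)
sigma_n_minus_1 = stdev(data)    # sample: divides by n - 1 (shown for contrast only)

print(round(sigma_n, 2), round(sigma_n_minus_1, 2))   # → 4.12 4.76
```

The population value is always the smaller of the two for the same data, because it divides the same sum of squares by a larger number.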
Standard Deviation with Grouped Data
Hours of TV 2 3 4 5
Frequency 2 6 6 2
grouped mean = (2×2 + 3×6 + 4×6 + 5×2) / 16 = 3.5
deviations:
2: 2 – 3.5 = –1.5
3: 3 – 3.5 = –0.5
4: 4 – 3.5 = 0.5
5: 5 – 3.5 = 1.5
$\sigma^2 = \frac{2(-1.5)^2 + 6(-0.5)^2 + 6(0.5)^2 + 2(1.5)^2}{16} = 0.75$
$\sigma = \sqrt{0.75} \approx 0.9$
In general: $\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}}$
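The grouped calculation can be sketched the same way, weighting each value by its frequency. This is a minimal illustration using the Hours of TV table; the function name `grouped_std_dev` is hypothetical.

```python
from math import sqrt

def grouped_std_dev(values, frequencies):
    """Return (mean, variance, standard deviation) for a frequency table."""
    total = sum(frequencies)                                   # sum of f_i
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    variance = sum(f * (x - mean) ** 2
                   for x, f in zip(values, frequencies)) / total
    return mean, variance, sqrt(variance)

hours = [2, 3, 4, 5]
freq = [2, 6, 6, 2]
mean, variance, sigma = grouped_std_dev(hours, freq)
print(mean, variance, round(sigma, 1))   # → 3.5 0.75 0.9
```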
MSIP / Home Learning
Read through the examples on pages 164-167
Complete p. 168 #2b, 3b, 4, 6, 7, 10
You are responsible for knowing how to do simple examples by hand (~6 pieces of data)
We will use technology (Fathom/Excel) to calculate larger examples
Have a look at your calculator and see if you have this feature (Σσn and Σσn-1)
Normal Distribution
3.4 – Tools for Analyzing Data
Learning goal: Determine the % of data within intervals of a Normal Distribution
Due now: p. 168 #2b, 3b, 4, 6, 7, 10
MSIP / Home Learning: p. 176 #1, 3b, 6, 8-10
Histograms
Histograms can be skewed...
Right-skewed Left-skewed
Histograms
... or symmetrical
[Figure: "Collection 1 Histogram"; Count on the vertical axis (1 to 5), a on the horizontal axis (3 to 11)]
Normal?
A normal distribution has a histogram that is symmetrical and bell-shaped
Used quite a bit in statistical analysis
Also called a Gaussian Distribution
Symmetrical with equal mean, median and mode that fall on the line of symmetry of the curve
A Real Example
The heights of 600 randomly chosen Canadian students from the “Census at School” data set
The data approximates a normal distribution
[Figure: "600 Student Heights Histogram" with a normal density curve overlaid; Height (cm) on the horizontal axis, 100 to 240; Density on the vertical axis, 0.005 to 0.035]
The 68-95-99.7% Rule
Area under curve is 1 (i.e. it represents 100% of the population surveyed)
Approx 68% of the data falls within 1 standard deviation of the mean
Approx 95% of the data falls within 2 standard deviations of the mean
Approx 99.7% of the data falls within 3 standard deviations of the mean
http://davidmlane.com/hyperstat/A25329.html
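The three percentages in the rule can be checked numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `share_within` is hypothetical. It computes the area under the standard normal curve between –k and +k standard deviations.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The printed values come out at about 68.3%, 95.4% and 99.7%, which the rule rounds to 68, 95 and 99.7.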
Distribution of Data
$X \sim N(\bar{x}, \sigma^2)$
[Figure: normal curve with the horizontal axis marked at x̄ – 3σ, x̄ – 2σ, x̄ – 1σ, x̄, x̄ + 1σ, x̄ + 2σ, x̄ + 3σ]
34% of the data lies between the mean and 1σ on each side (68% within 1σ)
13.5% lies between 1σ and 2σ on each side (95% within 2σ)
2.35% lies between 2σ and 3σ on each side (99.7% within 3σ)
0.15% lies beyond 3σ on each side
Normal Distribution Notation
$X \sim N(\bar{x}, \sigma^2)$
The notation above is used to describe the Normal distribution where x̄ is the mean and σ² is the variance (square of the standard deviation)
e.g. X ~ N(70, 8²) describes a Normal distribution with mean 70 and standard deviation 8 (our class at midterm?)
An example
Suppose the time before burnout for an LED averages 120 months with a standard deviation of 10 months and is approximately Normally distributed.
What is the length of time a user might expect an LED to last with:
a) 68% confidence?
b) 95% confidence?
So X ~ N(120, 10²)
An example cont’d
68% of the data will be within 1 standard deviation of the mean
This will mean that 68% of the bulbs will be between 120 – 10 = 110 months and 120 + 10 = 130 months
So 68% of the bulbs will last 110 - 130 months
95% of the data will be within 2 standard deviations of the mean
This will mean that 95% of the bulbs will be between 120 – 2×10 = 100 months and 120 + 2×10 = 140 months
So 95% of the bulbs will last 100 - 140 months
Example continued…
Suppose you wanted to know how long 99.7% of the bulbs will last?
This is the area covering 3 standard deviations on either side of the mean
This will mean that 99.7% of the bulbs will be between 120 – 3×10 = 90 months and 120 + 3×10 = 150 months
So 99.7% of the bulbs will last 90-150 months
This assumes that all the bulbs are produced to the same standard
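The three intervals in the LED example all follow the same pattern, mean ± k standard deviations, which can be written as a short sketch. The function name `interval` is hypothetical; the mean of 120 months and standard deviation of 10 months come from the example.

```python
def interval(mean, std_dev, k):
    """Return (low, high) for the range mean ± k standard deviations."""
    return mean - k * std_dev, mean + k * std_dev

for k, pct in ((1, "68%"), (2, "95%"), (3, "99.7%")):
    low, high = interval(120, 10, k)
    print(f"{pct} of LEDs last {low} to {high} months")
# → 68% of LEDs last 110 to 130 months
# → 95% of LEDs last 100 to 140 months
# → 99.7% of LEDs last 90 to 150 months
```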
Example continued…
[Figure: normal curve for the LED lifetimes with the horizontal axis marked at 90, 100, 120, 140 and 150 months; regions labelled 34%, 13.5% and 2.35% on each side of the mean, with the 95% and 99.7% intervals indicated]
Percentage of data between two values
The area under any normal curve is 1
The percent of data that lies between two values in a normal distribution is equivalent to the area under the normal curve between these values
See examples 2 and 3 on page 175
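As an illustration of "area between two values", the sketch below reuses the LED distribution X ~ N(120, 10²) and asks what percent of LEDs last between 110 and 140 months. These two endpoints are chosen here only for illustration and are not the textbook's examples. It assumes Python 3.8 or later for `statistics.NormalDist`; the helper name `percent_between` is hypothetical.

```python
from statistics import NormalDist

led_life = NormalDist(mu=120, sigma=10)   # X ~ N(120, 10²), in months

def percent_between(dist, low, high):
    """Area under the normal curve between low and high, as a fraction."""
    return dist.cdf(high) - dist.cdf(low)

share = percent_between(led_life, 110, 140)
print(f"{share:.1%}")   # → 81.9%
```

Adding the bands from the 68-95-99.7% rule gives nearly the same answer by hand: 34% + 34% + 13.5% = 81.5%, since 110 is 1 standard deviation below the mean and 140 is 2 standard deviations above it.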
Why is the Normal distribution so important?
Many psychological and educational variables are distributed approximately normally: height, reading ability, memory, IQ, etc.
Normal distributions are statistically easy to work with
All kinds of statistical tests are based on it
Lane (2003)
References
Lane, D. (2003). What's so important about the normal distribution? Retrieved October 5, 2004 from http://davidmlane.com/hyperstat/normal_distribution.html
Wikipedia (2004). Online Encyclopedia. Retrieved September 1, 2004 from http://en.wikipedia.org/wiki/Main_Page