wk 2 – hrs 1 (tue, jan 10) wk 2 - hr 2 and 3 (thur, jan 12 ...jackd/stat201/lecture_wk02-1.pdf ·...

46
Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descripve stascs: - Measures of centrality (Mean, median, mode, trimmed mean) - Measures of spread (MAD, Standard deviaon, variance) - Other measures (Quanles, skewness, shape parameters

Upload: others

Post on 15-May-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Wk 2 – Hrs 1 (Tue, Jan 10)

Wk 2 - Hr 2 and 3 (Thur, Jan 12)

Descriptive statistics:

- Measures of centrality

(Mean, median, mode, trimmed mean)

- Measures of spread

(MAD, Standard deviation, variance)

- Other measures

(Quantiles, skewness, shape parameters

Page 2: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Variation: Not everything can be controlled. Results may vary, even in a factory setting. Some bags will get more chips than others, we say there is variation in the weights in each bag.

Image source: Failblog.org, “Quality Control Fail”

Page 3: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

There are laws about the proportion of bags sold that can be under-weight.

A company needs to know the proportion that will be under but can’t afford to check every single bag.

Instead they check a sample of bags and hope it represents the population. (Like my survey of 39 students)

Page 4: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

..but those samples are not going to be the same every time.

Most of you have done this before during R-R-R-Roll Up TheRim season.

They say there’s a one in six chance of winning, but did you win on EXACTLY one in six cups. Did you win as much as yourfriends?

Page 5: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Mine: 0 / 3 = 0% Wins

Jason: 3 / 13 = 23% Wins

Emelie: 6 / 39 = 15% Wins

Each person’s roll up the rim season is different, why? Variability!

Page 6: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Why should you care?

When you’re doing a social study or experiment, your resultsaren’t going to be hard set.

Image: xkcd.com

Page 7: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

If you did the same study tomorrow with similar subjects, you’d get different results. It would help if we had an idea how different we would expect these differences to be.

Image: xkcd.com

Page 8: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

That’s what measures of spread like the interquartile range (IQR) and the standard deviation are for.

They help us measure how uncertain we are about our central values.

IQR is intuitive, works for a wide range of distributions, and has the 1.5xIQR rule for finding outliers.

But it’s tied to the median and related measures like the quartiles.

Page 9: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

A spread measure based on the mean is the standarddeviation.

To deviate means the stray from the norm.

A standard deviation is the typical amount strayed from themean.

Page 10: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:
Page 11: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

When the distribution looks kind of like this…

about ⅔ of the distribution is within 1 sd of the mean

about 95% is within 2 sd of the mean

about 99% is within 3 sd of the mean

Page 12: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Example: Grade 5 Reading Scores have a

mean of 120 and a standard deviation (sd) of 25.

120 + 1sd = 145120 – 1sd = 95So about 2/3 of the grade 5s have a reading score between 95 and 145.

Page 13: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Example: Grade 5 Reading Scores have a

mean of 120 and a standard deviation (sd) of 25.

120 + 2sd = 120 + 2(25) = 170120 – 2sd = 120 – 2(25) = 70So about 95% of the grade 5s have a reading score between70 and 170.

Page 14: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Another way to determine outliers when using the mean and standard deviation is the 3 standard deviation rule.

Anything three standard deviations below or above the mean is an outlier.

.

Page 15: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

With the reading scores, anything below 120 – 3(15) = 75or above 120 + 3(15) = 165 is an outlier.

Like the mean and standard deviation, this outlier measure isonly appropriate for symmetric data.

Page 16: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Quartiles and the Five Number Summary

- The five numbers are the Minimum (Q0), LowerQuartile (Q1), Median (Q2), Upper Quartile (Q3), andMaximum (Q4).

- Q1 means bigger than 1 Quarter of the data.- Q3 means bigger than 3 Quarters of the data.

For the values {0, 1, 2, 4, 5, 5, 7, 10, 10, 12, 13, 17, 39},

the five number summary is: 0

3

7

12.5

39.

Page 17: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

There are several ways to compute the quartiles, but here's the one I used.

In this data set: {0, 1, 2, 4, 5, 5, 7, 10, 10, 12, 13, 17, 39}

There are 13 numbers, n=13.

So the median is the 7th value.

The lower quartile is the 3.5th smallest value (between the 2 and 4)

The lower quartile is the 3.5th largest value (between the 12 and 13)

Page 18: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Inter-Quartile Range

The Inter-Quartile Range. (Literally “range the between thequartiles”, called the IQR for short), is a measure of spreadbased on the median rather than the mean.

Likewise, it's robust to outliers.

Page 19: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- The Inter-Quartile range is calculated:

IQR = Q3 – Q1

a) The size of the IQR indicates how spread out the middle halfof the data is.

Page 20: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Outliers (1.5 x IQR Rule)

1. Now that we have a measure of spread, we can use it toidentify values that are much farther from the centerthan usual.

2. How? Spread measures like the IQR tell us how far a typical value could be from the average, so anything muchmore than the typical distance can be identified.

Page 21: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- We call these data points outliers.

They (figuratively) lay outside the rest of thedata.

- Because an outlier stands out from the restof the data, it…

o might not belong there, or o is worthy of extra attention.

Page 22: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- One way to define an outlier is

o anything below Q1 – 1.5 IQR or… o aboveQ3 + 1.5 IQR.

This is called the 1.5 x IQR rule. (Important).

Page 23: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- Example: {0, 1, 2, 4, 5, 5, 7, 10, 10, 12, 13, 17, 39} Q1 = 3, Q3 = 12.5

IQR = 12.5 - 3 = 9.5.

Q1 – 1.5xIQR = 3 – 1.5(9.5)

= 3 -14.25 = -11.25

Anything less than -11.25 is an outlier.

In this case there are no outliers on the low end.

Page 24: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- Example: {0, 1, 2, 4, 5, 5, 7, 10, 10, 12, 13, 17, 39} Q1 = 3, Q3 = 12.5

IQR = 9.5

Q3 + 1.5xIQR = 12.5 + 1.5*9.5

= 12.5 + 14.25 = 26.75

Anything more than 26.75 is an outlier.

39 is the only outlier.

Page 25: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

More on IQR and Outliers:

- There are other ways to define outliers, but 1.5xIQR isone of the most straightforward.

- If our range has a natural restriction, (like it can’t possibly be negative), it’s okay for an outlier limit to be beyond that restriction.

- If a value is more than Q3 + 3*IQR or less than Q1 –3*IQR it is sometimes called an extreme outlier.

Page 26: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- The standard graph for showing the median, quartiles, and outliers of a data set is the boxplot, for {0, 1, 2, 4, 5, 5, 7, 10, 10, 12, 13, 17, 39} it looks like this:

Page 27: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- The five-number summary is in the boxplot:

- The box from 3 to 12.5 is the region between Q1and Q3.

- The line going through the middle of the box at 7 isthe median.

-

Page 28: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- The lines going out the ends of the box are called thewhiskers. They show the range of values that are notoutliers.

- The lower whisker goes to the lowest value, 1. The upper whisker goes to 17 because it’s the biggest value before the upper limit of 26.75 is hit.

Page 29: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- The individual dot at 39 shows an outlier.

- Outliers in SPSS are labelled with their row numberso you can find them in data view.

- In SPSS extreme outliers are shown as stars.

- The farthest outliers on either side are theminimum and maximum.

- If there are no outliers on a side, the end of thewhisker is that minimum or maximum.

Page 30: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Boxplots and Skew

- Skewed distributions have more extreme values onone side, so a boxplot of a skewed distribution willhave one whisker longer than the other.

- There will also be more outliers on one side of theboxplot than the other.

Page 31: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Side-by-side Boxplots

- Boxplots can also be used to comparethe distributions of two samples.

- Example: Heights of adult men andwomen.

Page 32: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- There is some overlap

- In general men are taller.

- The variance is about the same.- Both distributions appear to be symmetric.

Page 33: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

This page left obnoxiously blank

Page 34: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

What exactly IS an outlier?

- It’s a value far from anything else that warrantsspecial consideration aside from the rest of thedata.

- Often it’s a mistake in data entry. If were recording a grade of 73%, mistyped, and recorded 3% or 730%, both of these values would be far from the rest of the data and would indicate that the data is not being represented properly.

Page 35: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

- If the times to finish a final exam had Q1 at 120 minutes and Q3 at 150 minutes, but someone finished in 62 minutes, that person could be a student with a stronger than recommended background for that course or someone who gave up during the exam.

- In both cases, their exam wouldn’t agood representation of the exams aswhole.

- Sometimes outliers can tell your assumptionsand expectations are wrong

Page 36: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:
Page 37: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

again...

Page 38: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Finally, there's the variance.

The variance is the average squared difference between a value and the mean.

The standard deviation is the square root of the variance.

We won't be using the variance, but I will be referring to it to explain some concepts in the future.

Page 39: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

The standard deviation is only used for symmetric (or close)distributions.

When data is skewed the standard deviation breaks down because of direction of the deviations becomes important.

Page 40: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Example: Postively/Right skewed distribution.

The first standard deviation below the mean (blue) coversmore of the distribution than first one above (red).

So a standard deviation below implies something differentthan a standard deviation above.

Page 41: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Example: Right skewed distribution.

Since the mean is more than the median, there are morevalues below the mean. Does that imply that a deviationbelow the mean is ‘standard’?

For skew, avoid the whole mess and use the IQR.

Page 42: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Pop quiz:

If the distribution is symmetric and the data is interval, thenthe best measure of variability is:

a) Interquartile rangeb) Standard Deviation

Hint: What is the default central measure? Which measureabove is based on that?

Page 43: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Question:

If the data is ordinal, then which measure of variability/spreadis not possible (without extra assumptions):

a) Interquartile range b) Standard Deviation

Hint: The standard deviation is based on the mean. Do ordinalshave means?

Page 44: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Answer:

Standard deviation is impossible for ordinal data because youcan’t get the mean of ordinal data usually.

To get the mean for ordinal data, you need to treat it like interval data, that means assuming that the categories areevenly spaced

Page 45: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Which of the following standard deviations is/are impossible?

40

7 potatoes

-4

Hint: The standard deviation is the square root of the variance.

Page 46: Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12 ...jackd/Stat201/Lecture_Wk02-1.pdf · Wk 2 – Hrs 1 (Tue, Jan 10) Wk 2 - Hr 2 and 3 (Thur, Jan 12) Descriptive statistics:

Answer: -4 is impossible.

Standard deviation is the (positive) square root of the variance.It doesn’t make sense for the typical distance from the mean to be a negative number.

7 potatoes is a fine standard deviation if the variable is number of potatoes. (for interest, the variance would be measured in

potatoes2 )