Measures of Position: Where does a certain data value fit in relative to the other data values?

To accompany Hawkes lesson 3.3. Original content by D.R.S.

Upload: maitland

Post on 24-Feb-2016



Nth Place

• The highest and the lowest
• 2nd highest, 3rd highest, etc.
• “If I made $60,000, I would be 6th richest.”

Another view: “How does my x compare to the mean?”

• “Am I in the middle of the pack?”
• “Am I above or below the middle?”
• “Am I extremely high or extremely low?”

• The z score is the measuring stick

z Score: x is how many standard deviations away from the mean?

If you know the x value
• Population: \( z = \dfrac{x - \mu}{\sigma} \)
• Sample: \( z = \dfrac{x - \bar{x}}{s} \)

To work backward from z to x
• Population: \( x = \mu + z\sigma \)
• Sample: \( x = \bar{x} + z s \)
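The two directions of the z score conversion can be sketched in Python. This is a minimal illustration rather than part of the original slides: the names `z_score` and `x_from_z` are made up here, `mean` and `sd` stand for either the population values or the sample values, and the numbers are hypothetical.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies away from the mean."""
    return (x - mean) / sd


def x_from_z(z, mean, sd):
    """Work backward from a z score to the original x value."""
    return mean + z * sd


# Hypothetical numbers: mean 50, standard deviation 10.
print(z_score(65, 50, 10))     # 1.5
print(x_from_z(-2.0, 50, 10))  # 30.0
```

The same two functions serve the population and the sample case, because the formulas differ only in which mean and standard deviation are plugged in.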

The z score is also called the “Standard Score”

• No matter what x is measured in or how large or small the values are…
• The z score of the mean will be 0
  – Because the numerator turns out to be 0.
• If x is above the mean, its z is positive.
  – Because the numerator turns out to be positive.
• If x is below the mean, its z is negative.
  – Because the numerator turns out to be negative.

z score values

• Typically round to two decimal places.
  – Don’t say “0.2589”, say “0.26”
• If not two decimal places, pad
  – Don’t say “2”, say “2.00”
  – Don’t say “-1.1”, say “-1.10”
• z scores almost always fall in a narrow interval around 0. Be very suspicious if you calculate a z score that’s not a small number.
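The rounding and padding convention can be applied mechanically with a format string. A small sketch, assuming Python is the tool at hand; `format_z` is a hypothetical helper name, not something from the slides.

```python
def format_z(z):
    """Format a z score with exactly two decimal places, padding with zeros."""
    return f"{z:.2f}"


print(format_z(0.2589))  # 0.26
print(format_z(2))       # 2.00
print(format_z(-1.1))    # -1.10
```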

Practice: Given x, compute z

Find the z scores corresponding to the salary values, given the mean and the standard deviation.

Practice: Given z, compute x

Find the x scores (salaries) corresponding to these standard scores, given the mean and the standard deviation.

Two parallel axes (scales), x and z

Example: Using z scores to compare unlike items

The Literature test
• The mean score was 77 points.
• The standard deviation was 11 points.
• Sue earned 91 points.
• Find her z score for this test.

The Biology test
• The mean score was 47 points.
• The standard deviation was 6 points.
• Sue earned 55 points.
• Find her z score for this test.

• On which test did she have the “better” performance?
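Sue's two scores can be worked through in a few lines of Python. This is a sketch added for illustration: the numbers come from the slide, while the function and variable names are made up here.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies away from the mean."""
    return (x - mean) / sd


literature = z_score(91, 77, 11)  # 14 / 11
biology = z_score(55, 47, 6)      # 8 / 6

print(f"Literature z = {literature:.2f}")  # Literature z = 1.27
print(f"Biology z = {biology:.2f}")        # Biology z = 1.33

# The larger z score is the more outstanding performance.
better = "Biology" if biology > literature else "Literature"
print(better)  # Biology
```

Sue earned fewer raw points on the Biology test, yet relative to the rest of the class it is the slightly better performance.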

z scores: caution with negatives

• Example: compare test scores on two different tests to ascertain “Which score was the more outstanding of the two?”
• Be careful if the z scores turn out to be negative. Which is the better performance?
• Stop and think back to your basic number line and the meaning of “<” and “>”
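A quick number-line check in Python makes the point concrete. The two z scores below are hypothetical values chosen only for illustration; they are not the ones from the original slide.

```python
# Hypothetical z scores for two below-average results.
z_a = -0.50
z_b = -1.50

# On the number line, -0.50 lies to the right of -1.50, so it is the larger value.
print(z_a > z_b)  # True

# The better performance is the larger z score, not the larger absolute value.
better = max(z_a, z_b)
print(f"{better:.2f}")  # -0.50
```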

Percentiles

• “What percent of the values are lower than my value?”
  – 90th percentile is pretty high
  – 50th percentile is right in the middle
  – 10th percentile is pretty low
• If you scored in the 99th percentile on your SAT, I hope you got a scholarship.

Salary data for our percentile examples

• With these salary values again
• What’s the percentile for a salary of $59,000?
• You can see it’s going to be higher than the 50th because it’s in the top half.

Example: Given x, find the percentile

• Count how many values are below $59,000
• Count how many values are in the data set
• Formula for percentile: \( P = \dfrac{\text{number of values below } x}{\text{number of values in the data set}} \cdot 100 \)
• Here we have 15 values lower than our $59,000
• Here we have 20 values in the data set.
• so \( P = \dfrac{15}{20} \cdot 100 = 75 \), “75th percentile”

Continued: Given x, find the percentile

• so \( P = 75 \)
• Do not say “75%”, but say “the 75th percentile”
• Other sources use different formulas, beware!
  – Some other books use a different numerator.
  – Excel has two different answers, from its PERCENTILE.EXC and PERCENTILE.INC functions.
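The counting rule for "given x, find the percentile" can be sketched as follows. The salary list itself is not reproduced in this transcript, so the data below are hypothetical stand-ins (the whole numbers 1 to 20), and `percentile_of_value` is a made-up name. As the slide warns, other books and tools use other formulas.

```python
def percentile_of_value(data, x):
    """Percentile of x: the share of data values strictly below x, times 100."""
    below = sum(1 for value in data if value < x)
    return below / len(data) * 100


# Hypothetical data set of 20 values; 15 of them lie below 16.
data = list(range(1, 21))
print(percentile_of_value(data, 16))  # 75.0
```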

Given Percentile P, find the value

• Formula: position from bottom \( = n \cdot \dfrac{P}{100} \)
  – Again, n = how many data values in the set
  – and P = the percentile rank that’s given.
• Is there a decimal remainder in the position?
  – If so, then BUMP UP to the next highest whole # and take the value in that position.
  – Or if the position is an exact whole number, take the average of the values in that position and the next position up.
• Note: the book’s notation differs in letter case from the notation used here.
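The bump-up rule can be sketched in Python. This is an illustration under stated assumptions: `value_at_percentile` is a hypothetical name, the percentile is taken to be a whole number between 1 and 99, and the data are made-up stand-ins for the salary list, which is not reproduced in this transcript.

```python
def value_at_percentile(data, p):
    """Find the value at whole-number percentile p using the bump-up rule."""
    if not 0 < p < 100:
        raise ValueError("p must be a whole number between 1 and 99")
    ordered = sorted(data)
    # Position from the bottom is n * p / 100, split into whole part and remainder.
    whole, remainder = divmod(len(ordered) * p, 100)
    if remainder:
        # Decimal remainder: bump up to position whole + 1 (index `whole`, 0-based).
        return ordered[whole]
    # Exact whole number: average that position and the next one up.
    return (ordered[whole - 1] + ordered[whole]) / 2


# Hypothetical data: 20 values, 1000 through 20000.
data = [1000 * k for k in range(1, 21)]
print(value_at_percentile(data, 31))  # 7000   (position 6.2 bumps up to 7)
print(value_at_percentile(data, 40))  # 8500.0 (average of positions 8 and 9)
```

Integer arithmetic with `divmod` is used so that the "is there a remainder?" test never depends on floating-point rounding.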

Given Percentile P, find the value

• Example: What is the 31st percentile in the salary data?
• 31st percentile: plug in P = 31 and n = 20
• Compute \( 20 \cdot \dfrac{31}{100} = 6.2 \). It has a remainder.
• Bump it up! 7.
  – Not rounding, but rather bumpety-upping
• So we look 7 positions from the bottom
• “The 31st percentile is $44,476”

Given Percentile P, find the value

• Example: What is the 40th percentile in the salary data? Plug in P = 40 and n = 20
• Compute \( 20 \cdot \dfrac{40}{100} = 8 \). Exact integer!
• So count 8th and 9th from bottom and average those two values.
• “The 40th percentile is $47,367.50, or $47,368.”


Excel gives different answers

• Excel does some fancy interpolation

Quartiles Q1, Q2, Q3

• Data values are arranged from low to high.
• The Quartiles divide the data into four groups.
• Q2 is just another name for the Median.
• Q1 = the Median of the values from the Lowest up to Q2
• Q3 = the Median of the values from Q2 up to the Highest
• It gets tricky, depending on how many values.

Quartiles example

• 10, 20, 30, 40, 50, 60, 70, 80, 90
• The Second Quartile, Q2 = median = 50
• Find the medians of the subsets left and right.
• Keep the 50 in each of those subsets.
• The First Quartile, Q1 = median of { 10, 20, 30, 40, 50 } = 30
• The Third Quartile, Q3 = median of { 50, 60, 70, 80, 90 } = 70

Quartiles example

• 10, 20, 30, 40, 50, 60, 70, 80, 90, 100
• Q2 = median = (50 + 60) / 2 = 55 (two middle #s)
• Leave the 50 and 60 in place; do not reuse 55
• Q1 = median of {10, 20, 30, 40, 50} = 30
• Q3 = median of {60, 70, 80, 90, 100} = 80

Quartiles example

• 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110
• Q2 = median = (50 + 60) / 2 = 55 (two middle #s).
• 55 isn’t really there so you can’t remove it!
• Leave the 50 and 60 in place
• Q1 = median of {0, 10, 20, 30, 40, 50} = (20 + 30) / 2 = 25
• Q3 = median of {60, 70, 80, 90, 100, 110} = (80 + 90) / 2 = 85
• Two middle numbers happened again!
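The three quartile examples follow one rule: when the median is an actual data value, keep it in both halves; when it is the average of two middle values, split the data cleanly in two. A minimal Python sketch of that rule as these slides describe it (`slide_quartiles` is a hypothetical name, and other quartile conventions exist):

```python
from statistics import median


def slide_quartiles(values):
    """Return (Q1, Q2, Q3), keeping an actual median value in both halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    if n % 2 == 1:
        # Odd count: the median is a real data value and stays in both halves.
        lower, upper = data[:half + 1], data[half:]
    else:
        # Even count: the median is an average, so the halves split cleanly.
        lower, upper = data[:half], data[half:]
    return median(lower), median(data), median(upper)


print(slide_quartiles(range(10, 100, 10)))   # (30, 50, 70)
print(slide_quartiles(range(10, 110, 10)))   # (30, 55.0, 80)
print(slide_quartiles(range(0, 120, 10)))    # (25.0, 55.0, 85.0)
```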

Interquartile Range

• Definition: IQR = Q3 – Q1
• In the previous example, 85 – 25 = 60.
• Interquartile Range measures how spread out the middle of the data are
  – The lowest quartile (x < Q1) is not involved
  – And the highest quartile (x > Q3) is not involved.

Quartiles with TI-84

• 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110
• Put values into a TI-84 List
• Use STAT, CALC, 1-Var Stats
• Scroll down down down to get to them.

There is disagreement about Quartiles

• The TI-84 sometimes gives different answers than the method we use in the Hawkes materials
• Excel might give different answers from both Hawkes and the TI-84.
• Use the Hawkes method in this course’s work
• Be aware of the others
  – You should know how to use TI-84 and Excel
  – You should be aware that differences can occur.

Quartiles with TI-84 vs. Hawkes

• 10, 20, 30, 40, 50, 60, 70, 80, 90
• We got Q1=30 and Q3=70 before.
• Hawkes keeps the 50, using 10, 20, 30, 40, 50 to compute Q1.
• But the TI-84 throws out 50 and uses 10, 20, 30, 40.
• Hawkes says the TI-84 is computing “hinges”.
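The difference between the two conventions comes down to whether the median stays in the halves. The sketch below imitates both rules in Python for an odd-sized data set; it models the behavior as this slide describes it and is not the calculator's actual code. The function name and the `keep_median` flag are made up for illustration.

```python
from statistics import median


def q1_q3(values, keep_median):
    """Return (Q1, Q3) for an odd-sized data set, keeping or dropping the median."""
    data = sorted(values)
    half = len(data) // 2
    if keep_median:
        lower, upper = data[:half + 1], data[half:]   # median stays in both halves
    else:
        lower, upper = data[:half], data[half + 1:]   # median is thrown out
    return median(lower), median(upper)


data = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(q1_q3(data, keep_median=True))   # (30, 70)
print(q1_q3(data, keep_median=False))  # (25.0, 75.0)
```

With the 50 thrown out, Q1 becomes the median of 10, 20, 30, 40, which is 25, and Q3 becomes 75.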

Quartiles in Excel

• =QUARTILE.INC(cells, 1 or 2 or 3) seems to give the same results as the old QUARTILE function
• There’s a new =QUARTILE.EXC(cells, 1 or 2 or 3)
• Excel does fancy interpolation stuff and may give different Q1 and Q3 answers compared to the TI-84 and our by-hand methods.
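Interpolation-based quartiles can be tried without Excel. The sketch below uses `statistics.quantiles` from Python's standard library, which offers an "inclusive" and an "exclusive" interpolation method. Treating these as analogues of QUARTILE.INC and QUARTILE.EXC is an assumption made here for illustration, not something the slides state.

```python
from statistics import quantiles

data = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]

# n=4 asks for the three cut points that split the data into four groups.
inclusive = quantiles(data, n=4, method="inclusive")
exclusive = quantiles(data, n=4, method="exclusive")

print(inclusive)  # [27.5, 55.0, 82.5]
print(exclusive)  # [22.5, 55.0, 87.5]
```

Both methods agree on the median, 55, but neither reproduces the by-hand Q1 = 25 and Q3 = 85 for this data set, which is the kind of disagreement the slide warns about.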

The Five Number Summary

• Again: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110
• Q2 = median = 55, Q1 = 25 and Q3 = 85
• “The Five Number Summary” is defined as: the minimum, then Q1, Q2, Q3, then the maximum
• For this set of numbers, the Five Number Summary is “0, 25, 55, 85, 110”
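A five number summary is easy to assemble once a quartile rule is fixed. The sketch below reuses the keep-the-median rule from the earlier quartile examples; `five_number_summary` is a hypothetical name, and a different quartile rule would give a different summary.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max), keeping an actual median in both halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    # Odd count: the median value belongs to both halves; even count: clean split.
    lower = data[:half + 1] if n % 2 else data[:half]
    upper = data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


data = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
print(five_number_summary(data))  # (0, 25.0, 55.0, 85.0, 110)
```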

The Five Number Summary

• Again: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110
• Q2 = 55, Q1 = 25, Q3 = 85
• Min is 0, Max is 110
• For this set of numbers, the Five Number Summary is “0, 25, 55, 85, 110”
• Box Plot
• TI-84 can do Box Plot too, but again its quartiles disagree with the way Hawkes defines quartiles.

Box plot labels:

Min   Q1   Q2   Q3   Max
0     25   55   85   110

Why Box Plot?

• Don’t lose sight of the big picture here:
  – We have a data set
  – It’s a bunch of numbers
  – We want to summarize the data
• Summarize means make it into a sound bite
  – We must be Concise – don’t say too much
  – We must be Informative – don’t say too little


We must be Concise

• Bad: “Here is a report that tells you the mean and the variance and the standard deviation and the quartiles and the percentiles from 0 to 100… and the marketing survey analyzed by demographic subgroups …” (there is a place for that, but not right now)

• Good: “Got fifteen seconds? Here’s what we found.”

Notice the pieces of the boxplot:

• Horizontal scale, maybe a little beyond the min and the max. A generic number line.
• The five numbers.
• The box holds the quartiles
  – With a line in the middle at the median.
• The whiskers extend out to the min and the max.

TI-84 Boxplot

• See instructions on separate handout.
• Caution again that TI-84 computes quartiles differently from Hawkes and differently from Excel, so the results aren’t always going to agree.

Additional Topics

• Might not be needed for Hawkes homework
• But you should be aware of them

• Quintiles and Deciles
• Interquartile Range and Outliers
• TI-84 Box Plot

Quintiles and Deciles

• You might also encounter
  – Quintiles, dividing data set into 5 groups.
  – Deciles, dividing data set into 10 groups.
• Reconcile everything back with percentiles:
  – Quartiles correspond to percentiles 25, 50, 75
  – Deciles correspond to percentiles 10, 20, …, 90
  – Quintiles correspond to percentiles 20, 40, 60, 80
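The correspondence with percentiles is simple arithmetic: dividing a data set into k groups needs k – 1 cut points, spaced 100 / k percentiles apart. A small sketch, with `cut_percentiles` as a made-up helper name:

```python
def cut_percentiles(groups):
    """Percentiles that split a data set into the given number of equal groups."""
    return [100 * k // groups for k in range(1, groups)]


print(cut_percentiles(4))   # [25, 50, 75]                          quartiles
print(cut_percentiles(5))   # [20, 40, 60, 80]                      quintiles
print(cut_percentiles(10))  # [10, 20, 30, 40, 50, 60, 70, 80, 90]  deciles
```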

Interquartile Range and Outliers

• Concept: An OUTLIER is a wacky far-out abnormally small or large data value compared to the rest of the data set.
• We’d like something more precise.
• Define: IQR = Interquartile Range = Q3 – Q1.
• Define: If x is too far below the rest of the data, measured in multiples of the IQR, x is an Outlier.
• Define: If x is too far above the rest of the data, measured in multiples of the IQR, x is an Outlier.
• (Other books might make different definitions)

Outliers Example

• Here’s a quick elementary example:
• Data values 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20
• Mean ≈ 6.8 and IQR = 6
• Or in Hawkes method we still get interquartile range = 6 (it won’t always work out the same but in this case the IQR is the same either way)

Outliers Example

• Data values 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20
• We found IQR = 6 and the mean is 6.8
• One definition uses \( 1.5 \times \text{IQR} \) to define outliers
• Here, \( 1.5 \times 6 = 9 \)
• Anything more than 9 units away is then considered to be abnormally small or large.
• On the low side, no data value is small enough to be an outlier.
• On the high side, the 20 is an outlier.
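An outlier check based on 1.5 × IQR can be sketched as below. Two assumptions are made loudly here because the slide's exact definition did not survive extraction: the cutoffs are taken to be Q1 – 1.5 × IQR and Q3 + 1.5 × IQR (a common textbook rule, which may differ from the one on the slide), and the quartiles are computed by dropping the median from both halves, chosen because it reproduces IQR = 6 for this data. The function names are hypothetical.

```python
from statistics import median


def iqr_fences(values, k=1.5):
    """Return (low, high) cutoffs: Q1 - k*IQR and Q3 + k*IQR (assumed rule)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])    # lower half, median value excluded (assumption)
    q3 = median(data[-half:])   # upper half, median value excluded (assumption)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """List the values that fall outside the fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]


print(find_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20]))  # [20]
print(find_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10]))  # []
```

Under these assumptions Q1 = 3 and Q3 = 9, so the cutoffs are –6 and 18: the 20 is flagged, and a data set ending in 10 instead has no outliers.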

No-Outliers Example

• Data values 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10
• Mean ≈ 5.9 and IQR = 6 (that these two are nearly equal is a coincidence, insignificant)
• Anything more than 9 units away is abnormal.
• This data set has No Outliers.

Outliers: Good or Bad?

• “I have an outlier in my data set. Should I be concerned?”
  – Could be bad data. A bad measurement. Somebody not being honest with the pollster.
  – Could be legitimately remarkable data, genuine true data that’s extraordinarily high or low.
• “What should I do about it?”
  – The presence of an outlier is shouting for attention. Evaluate it and make an executive decision.