Slide 1 Statistics Workshop Tutorial 6 Measures of Relative Standing Exploratory Data Analysis

Upload: lewis-palmer

Post on 01-Jan-2016

Slide 1

Statistics Workshop Tutorial 6

• Measures of Relative Standing

• Exploratory Data Analysis

Copyright © 2004 Pearson Education, Inc.

Slide 2

Created by Tom Wegleitner, Centreville, Virginia

Section 2-6: Measures of Relative Standing

Slide 3

Definition

z Score (or standard score): the number of standard deviations that a given value x is above or below the mean.

Slide 4

Measures of Position: z score

Sample: $z = \frac{x - \bar{x}}{s}$

Population: $z = \frac{x - \mu}{\sigma}$

Round z to 2 decimal places.
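As a minimal sketch of the sample and population z score formulas, the Python snippet below uses only the standard library. The function names `sample_z` and `population_z` and the data values are made up for illustration; they do not come from the slides.

```python
from statistics import mean, pstdev, stdev

def sample_z(x, data):
    """z = (x - sample mean) / sample standard deviation, rounded to 2 places."""
    return round((x - mean(data)) / stdev(data), 2)

def population_z(x, data):
    """z = (x - mu) / sigma, treating data as the whole population."""
    return round((x - mean(data)) / pstdev(data), 2)

# Hypothetical data set (mean 5, population standard deviation 2).
values = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_z(9, values))  # 9 is two population standard deviations above the mean
print(sample_z(9, values))      # slightly smaller, because s is larger than sigma here
```

The two results differ only because the sample standard deviation divides by n - 1 while the population standard deviation divides by n.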

Slide 5

Interpreting Z Scores

Whenever a value is less than the mean, its corresponding z score is negative.

Ordinary values: z score between -2 and 2 (within 2 standard deviations of the mean)

Unusual values: z score < -2 or z score > 2 (more than 2 standard deviations from the mean)

FIGURE 2-14
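The rule of thumb for ordinary and unusual values can be sketched as a small Python function. The name `classify` and the sample z scores are hypothetical; a z score of exactly -2 or 2 is treated as ordinary, following the strict inequalities on the slide.

```python
def classify(z):
    """Label a z score: ordinary if -2 <= z <= 2, otherwise unusual."""
    return "ordinary" if -2 <= z <= 2 else "unusual"

# Hypothetical z scores, for illustration only.
for z in (-2.5, -1.0, 0.0, 2.0, 3.1):
    print(z, classify(z))
```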

Slide 6

Definition

Q1 (First Quartile) separates the bottom 25% of sorted values from the top 75%.

Q2 (Second Quartile) same as the median; separates the bottom 50% of sorted values from the top 50%.

Q3 (Third Quartile) separates the bottom 75% of sorted values from the top 25%.

Slide 7

Quartiles

Q1, Q2, and Q3 divide ranked scores into four equal parts.

(minimum) | 25% | Q1 | 25% | Q2 (median) | 25% | Q3 | 25% | (maximum)

Slide 8

Percentiles

Just as there are quartiles separating data into four parts, there are 99 percentiles denoted P1, P2, . . . P99, which partition the data into 100 groups.

Slide 9

Finding the Percentile of a Given Score

$\text{percentile of value } x = \frac{\text{number of values less than } x}{\text{total number of values}} \cdot 100$
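A minimal Python sketch of this formula follows. The function name `percentile_of` and the scores are hypothetical, and the result is left unrounded because the slide does not state a rounding rule.

```python
def percentile_of(x, data):
    """Percentile of x = (number of values less than x) / (total number of values) * 100."""
    below = sum(1 for value in data if value < x)
    return 100 * below / len(data)

# Hypothetical test scores, for illustration only.
scores = [55, 60, 62, 70, 71, 75, 80, 85, 90, 95]

print(percentile_of(80, scores))  # 6 of the 10 scores are below 80
```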

From Percentile to Data Value

• What score is at the kth percentile?

• (1) Rank the data from lowest to highest

• (2) Find the locator $L = \frac{k}{100} \cdot n$, where n is the number of values

• a) If L is not a whole number, round it up to the next whole number and take the score in that position

• b) If L is a whole number, take the average of the scores in positions L and L+1
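The steps above can be sketched in Python as follows. The function name `value_at_percentile` and the scores are hypothetical, and the sketch assumes k is an integer from 1 to 99 so that integer arithmetic can decide whether L is a whole number.

```python
def value_at_percentile(k, data):
    """Return the score at the k-th percentile (k an integer from 1 to 99)."""
    ordered = sorted(data)                             # step 1: rank lowest to highest
    whole, remainder = divmod(k * len(ordered), 100)   # step 2: L = (k / 100) * n
    if remainder:
        # case a: L is not whole; rounding up gives 1-based position whole + 1,
        # which is index `whole` in a 0-based list
        return ordered[whole]
    # case b: L is whole; average the scores in 1-based positions L and L + 1
    return (ordered[whole - 1] + ordered[whole]) / 2

# Hypothetical test scores, for illustration only.
scores = [55, 60, 62, 70, 71, 75, 80, 85, 90, 95]

print(value_at_percentile(25, scores))  # L = 2.5, round up to position 3
print(value_at_percentile(50, scores))  # L = 5, average of positions 5 and 6
```

Statistical software often uses other conventions for percentiles, so results from library functions may differ slightly from this rule.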

Slide 11

Some Other Statistics

Interquartile Range (or IQR): $Q_3 - Q_1$

Semi-interquartile Range: $\frac{Q_3 - Q_1}{2}$

Midquartile: $\frac{Q_3 + Q_1}{2}$

10 - 90 Percentile Range: $P_{90} - P_{10}$
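As a quick arithmetic illustration of these four statistics, the Python sketch below starts from quartile and percentile values that are simply assumed; they are not taken from any data set in the slides.

```python
# Hypothetical quartile and percentile values, for illustration only.
q1, q3 = 62, 85
p10, p90 = 57.5, 92.5

iqr = q3 - q1                    # interquartile range
semi_iqr = (q3 - q1) / 2         # semi-interquartile range
midquartile = (q3 + q1) / 2      # midquartile
p10_p90_range = p90 - p10        # 10 - 90 percentile range

print(iqr, semi_iqr, midquartile, p10_p90_range)
```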

Slide 13

Section 2-7: Exploratory Data Analysis (EDA)

Slide 14

Definition

Exploratory Data Analysis is the process of using statistical tools (such as graphs, measures of center, and measures of variation) to investigate data sets in order to understand their important characteristics.

Outliers

• An outlier is a very high or very low value that stands apart from the rest of the data

• Outliers may come from data collection errors, data entry errors, or simply valid but unusual data values

• Always identify and examine outliers to determine whether they are in error

Slide 16

Important Principles

An outlier can have a dramatic effect on the mean

An outlier can have a dramatic effect on the standard deviation

An outlier can have a dramatic effect on the scale of the histogram so that the true nature of the distribution is totally obscured
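To make the first two principles concrete, the Python sketch below compares the mean and sample standard deviation of a small data set with and without one extreme value. The numbers are invented for illustration.

```python
from statistics import mean, stdev

# Hypothetical measurements; 95 plays the role of the outlier.
data = [10, 12, 11, 13, 12, 11, 12]
with_outlier = data + [95]

print("without outlier:", round(mean(data), 2), round(stdev(data), 2))
print("with outlier:   ", round(mean(with_outlier), 2), round(stdev(with_outlier), 2))
```

In this made-up example a single value roughly doubles the mean and inflates the standard deviation by a much larger factor.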

Slide 17

Definitions

For a set of data, the 5-number summary consists of the minimum value; the first quartile, Q1; the median (or second quartile, Q2); the third quartile, Q3; and the maximum value.

A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.
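A 5-number summary can be sketched with Python's standard library as shown below. The function name `five_number_summary` and the data are hypothetical. Quartile conventions vary between textbooks and software; this sketch assumes `statistics.quantiles` with `method="inclusive"` (Python 3.8 or later), which may give slightly different quartiles than the locator rule used in this tutorial.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, q2, q3, max(data))

# Hypothetical data, for illustration only.
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

These five numbers are exactly the values a boxplot draws: the ends of the line, the edges of the box, and the line inside the box.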

Slide 18

Boxplots

Figure 2-16

Outliers

• A data point is considered an outlier if it lies more than 1.5 times the interquartile range above the 75th percentile (Q3) or more than 1.5 times the interquartile range below the 25th percentile (Q1)

• In other words, outliers are numbers outside the interval [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
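The 1.5 * IQR rule can be sketched in Python as follows. The function name `find_outliers` and the data are hypothetical, and the quartiles come from `statistics.quantiles` with `method="inclusive"` (Python 3.8 or later), which is one convention among several and an assumption of this sketch.

```python
from statistics import quantiles

def find_outliers(data):
    """Return the values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [value for value in data if value < low or value > high]

# Hypothetical measurements, for illustration only.
print(find_outliers([10, 12, 11, 13, 12, 11, 12, 95]))
```

A value flagged this way is only a candidate for closer inspection; it may still be a valid observation.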

Box Plots and Histograms

• When looking at one variable, it’s a good idea to look at the box plot and histogram together

• Box plots complement histograms by providing more specific information about the center, the quartiles, and outliers

Slide 21

Boxplots

Figure 2-17

Shape, Center and Spread

• What should you report about a quantitative variable?

• Always report the shape, center and spread

• If the distribution is skewed, report the median and IQR

• If the distribution is symmetric, report the mean and standard deviation

• If there are any clear outliers and you are reporting the mean and the standard deviation, report them both with the outliers and without them

Slide 23

Now we are ready for

Part 21 of Day 1