Statistical Reasoning

Upload: beatrix-willis

Post on 30-Dec-2015


Page 1: Statistical Reasoning. Descriptive Statistics are used to organize and summarize data in a meaningful way. Frequency distributions – Where are the majority

Statistical Reasoning

Descriptive Statistics are used to organize and summarize data in a meaningful way.

Frequency distributions – Where are the majority of the scores?

• Used to organize raw scores, or data, so that information makes sense at a glance.

• They arrange scores in order of magnitude and show the number of times each score occurs.
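As a minimal sketch of a frequency distribution, the snippet below counts how often each score occurs and lists the scores from highest to lowest. The `scores` list and the function name `frequency_distribution` are made up for illustration and do not come from the slides.

```python
from collections import Counter


def frequency_distribution(scores):
    """Return (score, frequency) pairs ordered from highest to lowest score."""
    counts = Counter(scores)
    return sorted(counts.items(), reverse=True)


# Hypothetical raw quiz scores (illustrative only).
scores = [9, 7, 10, 9, 8, 9, 7, 10, 9, 8]

distribution = frequency_distribution(scores)
for score, freq in distribution:
    print(f"{score:>3}  {freq}")
```

Read at a glance, the output shows that most of these made-up scores sit at 9.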

[Table: frequency distributions of raw scores and letter grades (A+ through F) for three exam components, with the percentage of students in each grade band. Panels: Multiple Choice, Essay, Composite. Summary statistics, in the order given: Mean = 34.3, SD = 4.2; Mean = 10.2, SD = 2.0; Mean = 9.3, SD = 2.3.]

Histograms & Frequency Polygons

• These are 2 ways of showing your frequency distribution data.

1. Histogram – graphically represents a frequency distribution as a bar chart whose vertical bars touch.

• When you have a continuous scale (for example, scores on a test run from 0 to 100), the bars touch, because every score has to fall into a class and there can’t be any “gaps” between classes.

• Different from a Bar Graph, which is used when you have non-continuous categories (for example: which candidate do you support, Obama or McCain? You’d have a bar for each, with a gap in between, because you can’t fall between two candidates; you have to pick one).
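One way to see why histogram classes leave no gaps is to sort scores into adjacent classes and count them. This is only a sketch: the class width of 10, the 0-100 scale, the `scores` list, and the helper name `histogram_counts` are all assumptions chosen for illustration.

```python
def histogram_counts(scores, width=10, low=0, high=100):
    """Count scores in adjacent classes of equal width covering low..high."""
    n_classes = (high - low) // width
    counts = [0] * n_classes
    for s in scores:
        # The top score (high) is placed in the last class so nothing falls outside.
        idx = min((s - low) // width, n_classes - 1)
        counts[idx] += 1
    return counts


# Hypothetical test scores on a 0-100 scale (illustrative only).
scores = [55, 62, 68, 71, 75, 78, 83, 85, 88, 91, 100]

counts = histogram_counts(scores)
for i, count in enumerate(counts):
    start = i * 10
    print(f"{start:>3}-{start + 9:<3} {'#' * count}")
```

Every score lands in exactly one class, and the classes sit side by side, which is why the bars of a histogram touch.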

Histogram

Uses touching vertical bars to show data

Frequency Polygon

Uses a line graph to show data

2. Frequency Polygon – graphically represents a frequency distribution by plotting a point for the frequency of each score category along the graph’s horizontal axis and connecting the points with straight lines (a line graph).

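Standard Normal Distribution Curve

Characteristics of the normal curve

• Bell shaped curve where the mean, median and mode are all the same and fall exactly in the middle

[Figure: normal curve with the percentage of scores in each band on either side of the mean. Mean to 1 SD: 34.1%; 1 to 2 SD: 13.6%; 2 to 3 SD: 2.15%; beyond 3 SD: .13%. Horizontal axis: number of standard deviations above (+) or below (-) the mean.]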
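The percentages on the normal curve can be checked numerically. The sketch below computes the area under a normal curve between whole numbers of standard deviations, using the error function from Python's standard library. The helper names `normal_cdf` and `area_between` are illustrative, not from the slides.

```python
import math


def normal_cdf(z):
    """Proportion of a normal distribution lying below z standard deviations."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_between(z_low, z_high):
    """Proportion of scores between two z values."""
    return normal_cdf(z_high) - normal_cdf(z_low)


# One side of the curve; the other side mirrors it.
bands = {
    "mean to 1 SD": area_between(0, 1),
    "1 SD to 2 SD": area_between(1, 2),
    "2 SD to 3 SD": area_between(2, 3),
    "beyond 3 SD": 1.0 - normal_cdf(3),
}
for label, area in bands.items():
    print(f"{label}: {area * 100:.2f}%")
```

The computed values come out close to the figures on the curve (about 34.1%, 13.6%, and .13%); the 2 to 3 SD band computes to roughly 2.14%, which the curve shows as 2.15%.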

Skewed Curves

Skewed Distribution – when more scores pile up on one side of the distribution than the other.

Positively skewed means more people have low scores, with the tail stretching toward the high scores. Negatively skewed means more people have high scores, with the tail stretching toward the low scores.

• Positive & Negative refer to the direction of the “tail” of the curve; they do not mean “good” or “bad.”
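The direction of a skew can also be expressed as a number. The sketch below uses one common definition, the moment coefficient of skewness, which is an assumption here since the slides do not name a formula. Both data sets are invented for illustration.

```python
def skewness(scores):
    """Moment coefficient of skewness: positive = tail to the right, negative = tail to the left."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((x - mean) ** 2 for x in scores) / n
    third_moment = sum((x - mean) ** 3 for x in scores) / n
    return third_moment / variance ** 1.5


# Hypothetical data: most scores are low, with one high score forming the tail.
mostly_low = [1, 1, 2, 2, 2, 3, 3, 4, 9]
# The mirror image: most scores are high, with one low score forming the tail.
mostly_high = [10 - x for x in mostly_low]

print(round(skewness(mostly_low), 2))   # positive value: positively skewed
print(round(skewness(mostly_high), 2))  # negative value: negatively skewed
```

The sign follows the tail, not the pile of scores, which matches the point that positive and negative describe direction rather than quality.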

Measures of Central Tendency

• A single number that gives us information about the “center” of a frequency distribution.

Measures of central tendency – 3 types

Example scores: 4, 4, 3, 4, 5

1. Mode = most common = 4 (Reports what there is more of. Used for categories with no numerical connection; you can’t average men & women.)

2. Mean = arithmetic average = 20/5 = 4 (Has the most statistical value but is susceptible to the effects of extreme scores.)

3. Median = middle score = 4 (Half the scores are higher, half are lower. Used when there are extreme scores.)
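The three measures can be reproduced for the scores 4, 4, 3, 4, 5 with Python's standard `statistics` module. This is a small sketch for checking the arithmetic, not part of the original slides.

```python
import statistics

scores = [4, 4, 3, 4, 5]

mode = statistics.mode(scores)      # most common score
mean = statistics.mean(scores)      # arithmetic average: 20 / 5
median = statistics.median(scores)  # middle score once sorted: 3, 4, 4, 4, 5

print(mode, mean, median)
```

All three measures agree here because the scores are nearly symmetric with no extreme values.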

Central Tendency

An extremely high or low price/score can skew the mean. Sometimes the median is better at showing you the central tendency.

1968 TOPPS Baseball Cards

Nolan Ryan $1500
Billy Williams $8
Luis Aparicio $5
Harmon Killebrew $5
Orlando Cepeda $3.50
Maury Wills $3.50
Jim Bunning $3
Tony Conigliaro $3
Tony Oliva $3
Lou Piniella $3
Mickey Lolich $2.50
Elston Howard $2.25
Jim Bouton $2
Rocky Colavito $2
Boog Powell $2
Luis Tiant $2
Tim McCarver $1.75
Tug McGraw $1.75
Joe Torre $1.50
Rusty Staub $1.25
Curt Flood $1

With Ryan: Median = $2.50, Mean = $74.14

Without Ryan: Median = $2.38, Mean = $2.85
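The card prices make a convenient check of how one extreme value moves the mean but barely moves the median. The sketch below recomputes both measures from the 21 listed prices; the variable names are illustrative.

```python
import statistics

# Prices in dollars, in the order listed; the first entry is the Nolan Ryan card.
prices = [1500, 8, 5, 5, 3.50, 3.50, 3, 3, 3, 3, 2.50,
          2.25, 2, 2, 2, 2, 1.75, 1.75, 1.50, 1.25, 1]

with_ryan = prices
without_ryan = prices[1:]  # drop the single extreme price

for label, data in (("With Ryan", with_ryan), ("Without Ryan", without_ryan)):
    print(f"{label}: median = {statistics.median(data)}, "
          f"mean = {statistics.mean(data):.2f}")
```

Removing one card changes the mean from about $74 to under $3, while the median shifts by only a few cents.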

Does the mean accurately portray the central tendency of incomes?

NO!

What measure of central tendency would more accurately show income distribution?

Median – the majority of the incomes surround that number.

Measures of Variability

• Gives us a single number that tells us how spread out the scores are in a frequency distribution. (See example of why this is important.)

• Range – Difference between the highest & lowest score

– Take the highest score and subtract the lowest score from it. (Can be skewed by an extreme score.)

• Standard Deviation – How spread out is your data?

– The larger this number is, the more spread out the scores are from the mean.

– The smaller this number is, the more closely the scores cluster around the mean.
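To illustrate why a measure of spread matters, the sketch below compares two invented sets of scores that share the same mean but differ greatly in variability. The data and names are hypothetical; `statistics.pstdev` divides by the number of scores, which is assumed here to match the calculation used in this deck.

```python
import statistics

# Two hypothetical classes with the same average score.
class_a = [48, 49, 50, 51, 52]
class_b = [10, 30, 50, 70, 90]


def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


for name, scores in (("A", class_a), ("B", class_b)):
    print(f"Class {name}: mean = {statistics.mean(scores)}, "
          f"range = {score_range(scores)}, "
          f"SD = {statistics.pstdev(scores):.1f}")
```

The means are identical, so only the range and standard deviation reveal that class B's scores are far less consistent.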

Calculating Standard Deviation

How spread out (consistent) is your data?

1. Calculate the mean.

2. Take each score and subtract the mean from it.

3. Square each of these deviations to make them positive.

4. Find the mean (average) of the squared deviations. This is the variance.

5. Take the square root of the variance to get back to your original units of measurement.

6. The smaller the number, the more closely packed the data. The larger the number, the more spread out it is.
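The steps above can be sketched as a short function. It divides by the number of scores, as the steps describe (some textbooks divide by one less than that when working from a sample). The punt distances are the ones used in this deck's worked example; the function name is illustrative.

```python
import math


def standard_deviation(scores):
    """Standard deviation following the listed steps (divides by N)."""
    mean = sum(scores) / len(scores)          # step 1: calculate the mean
    deviations = [x - mean for x in scores]   # step 2: subtract the mean from each score
    squared = [d ** 2 for d in deviations]    # step 3: square each deviation
    variance = sum(squared) / len(squared)    # step 4: average the squared deviations
    return math.sqrt(variance)                # step 5: square root returns to original units


punts = [36, 38, 41, 45]  # punt distances in yards
sd = standard_deviation(punts)
print(f"SD = {sd:.1f} yds")
```

For the four punts this gives a variance of 11.5 and a standard deviation of about 3.4 yards.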

Standard Deviation

Punt Distance    Deviation from Mean    Deviation Squared
36               36 - 40 = -4           16
38               38 - 40 = -2           4
41               41 - 40 = +1           1
45               45 - 40 = +5           25

Mean: 160/4 = 40 yds

Deviation Squared: each deviation multiplied by itself, then added together: 16 + 4 + 1 + 25 = 46

Variance: 46/4 = 11.5

Standard Deviation: √variance = √11.5 = 3.4 yds

[Table: frequency distributions of raw scores and letter grades (A+ through F) for three exam components. Panels: Multiple Choice, Essay, Composite. Summary statistics, in the order given: Mean = 34.3, SD = 4.2; Mean = 10.2, SD = 2.0; Mean = 9.3, SD = 2.3.]

Are these scores consistent?

Is there a skew?

Z-Scores

A number expressed in standard deviation units that shows an individual score’s deviation from the mean. Basically, it shows how you did compared to everyone else.

A + Z-score means you are above the mean; a – Z-score means you are below the mean.

Z-Score = (your score minus the average score) divided by the standard deviation.

Which class did you perform better in compared to your classmates?

Test       Total    Your Score    Average Score    S.D.
Biology    200      168           160              4
Psych.     100      44            38               2

Z score in Biology: 168-160 = 8, 8 / 4 = +2 Z Score

Z score in Psych: 44-38 = 6, 6/2 = +3 Z Score

You performed better in Psych compared to your classmates.
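The same comparison can be written as a small function. This is a sketch using the Biology and Psych numbers from the example; the function name `z_score` is illustrative.

```python
def z_score(score, mean, sd):
    """How many standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


biology = z_score(168, 160, 4)
psych = z_score(44, 38, 2)

print(f"Biology z = {biology:+.1f}")
print(f"Psych z = {psych:+.1f}")
print("Better relative to classmates:", "Psych" if psych > biology else "Biology")
```

Even though the raw Biology score is higher, the Psych score sits further above its class average once each is measured in standard deviation units.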

Correlation

Correlation – shows the relationship between two variables.

• The correlation coefficient ranges from -1 to +1. The closer it is to +1 or -1, the stronger the relationship between the two variables.

• This enables us to predict. However, correlation does not mean causation.

Positive Correlation

• As the value of one variable increases (or decreases) so does the value of the other variable.

• When A goes UP, B goes UP, or

• When A goes Down, B goes Down

• A perfect positive correlation is +1.0.

• The closer the correlation is to +1.0, the stronger the relationship.

Negative Correlation

• As the value of one variable increases, the value of the other variable decreases.

• When A goes UP B goes Down or

• When A goes Down, B goes Up

• A perfect negative correlation is -1.0.

• The closer the correlation is to -1.0, the stronger the relationship.

Zero Correlation

• There is no straight-line relationship between the two variables: one does not rise or fall consistently as the other changes.
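The three cases (positive, negative, and zero correlation) can be illustrated with the Pearson correlation coefficient, which is assumed here to be the measure the slides have in mind. The data sets are invented so that the results come out at exactly +1, -1, and 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


a = [1, 2, 3, 4, 5]
rises_with_a = [2, 4, 6, 8, 10]   # B goes up when A goes up
falls_with_a = [10, 8, 6, 4, 2]   # B goes down when A goes up
unrelated = [2, 5, 1, 5, 2]       # no consistent rise or fall with A

r_pos = pearson_r(a, rises_with_a)
r_neg = pearson_r(a, falls_with_a)
r_zero = pearson_r(a, unrelated)
print(r_pos, r_neg, r_zero)
```

Real data rarely produce values this clean; coefficients between these extremes indicate weaker relationships.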

Let’s Review

Inferential Statistics

• Techniques that allow a researcher to determine whether a study’s outcome is more than just chance events.

• Usually you would use inferential statistics to draw conclusions about a population based on a sample.

• For example, suppose we surveyed 50 staff members in the district about their level of education and used the results to estimate the average level of education for all staff in the district.

Statistical Significance

p value = the probability of getting a result at least this extreme if only chance were at work. In other words, are the results statistically significant? If the answer is yes, the researcher has grounds to generalize them to a larger population.

• A large p value is bad for a researcher. They want this number to be as small as possible, to support the claim that any change in their experiment was caused by the independent variable and not by chance.

• Results are considered statistically significant if the probability of obtaining them by chance alone is no more than .05, or 5%: p ≤ .05

• Loosely, this means the researcher wants to be about 95% confident that the results are not due to chance alone.

• Replication of the experiment helps confirm whether the result holds up.
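One way to make the idea of "obtained by chance alone" concrete is a permutation test, sketched below. This is a swapped-in illustration, not a method named in the slides, and the two groups of scores are hypothetical. The function asks: if the group labels were shuffled in every possible way, how often would the difference in means be at least as large as the one observed?

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """One-sided exact permutation test on the difference in group means."""
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = group_a + group_b
    total = sum(pooled)

    at_least_as_extreme = 0
    splits = 0
    # Try every way of relabelling n_a of the pooled scores as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            at_least_as_extreme += 1
        splits += 1
    return at_least_as_extreme / splits


# Hypothetical scores for a treatment group and a control group.
treatment = [14, 15, 17, 18]
control = [9, 10, 11, 12]

p = permutation_p_value(treatment, control)
print(f"p = {p:.4f}")
print("Statistically significant at .05:", p <= 0.05)
```

With these made-up scores only 1 of the 70 possible relabellings gives a difference as large as the observed one, so p is about .014 and falls under the .05 cutoff.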

Does the sample represent the population?

a. Non-biased sample – good

b. Low variability – good

c. Larger samples – good

• Population – the complete set of something.

• Sample – a subset of a population.
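A small simulation can show why larger samples tend to represent a population better. Everything below is hypothetical: a made-up population of 10,000 scores, random samples of 5 and of 100, and a fixed random seed so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical population: 10,000 scores centred near 50.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)


def spread_of_sample_means(sample_size, repeats=500):
    """Draw many random samples and measure how much their means vary."""
    means = [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.pstdev(means)


small = spread_of_sample_means(5)
large = spread_of_sample_means(100)

print(f"Population mean: {population_mean:.1f}")
print(f"Spread of sample means, n = 5:   {small:.2f}")
print(f"Spread of sample means, n = 100: {large:.2f}")
```

The means of the larger samples vary much less from one sample to the next, so any single large sample is more likely to land close to the population mean.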