
The Logic of Statistical Significance & Making Statistical Decisions

Textbook Chapter 11 & more

Statistical Significance & Probability

Last day: using the normal distribution, standard deviation & z-scores to calculate the probability that what we observe in the sample falls outside a certain range due to sampling error
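As a reminder of that calculation, here is a minimal sketch using only the Python standard library. The function names normal_cdf and prob_outside are illustrative choices, not from the textbook, and the sketch assumes the sampling distribution is normal.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_outside(z: float) -> float:
    """Two-tailed probability of landing more than |z| standard errors from the mean."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))


# A result 1.96 standard errors from the mean: roughly a 5% chance from sampling error alone.
print(round(prob_outside(1.96), 3))
```

The two-tailed form is used because a sample can miss the population value in either direction.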

Elements in logic of statistical significance (p. 273)

Assumptions regarding independence of two variables
Assumptions regarding representativeness of samples drawn using probability sampling techniques
Observed distribution of sample in terms of two variables
Example: Study of a population of 256 people, half male & half female, about comfort using the Internet to make purchases

Expected outcomes if Null hypothesis holds

(no relationship between sex & comfort)

Illustration of a representative sample for previous case

Illustration of an unrepresentative sample for previous case (if null hypothesis holds)

But what if there IS a relationship between sex & comfort in the population?

Logical foundations of tests of statistical significance

How to explain a discrepancy between the assumptions about independence of the variables and the observed distribution of sample elements:
  1. The sample is unrepresentative
  2. The variables are NOT independent

Type I (alpha) & Type II (beta) Errors

                              Situation in the “real world”
What the researcher says      No relationship     Causal relationship
No relationship               No error            Type II error
Causal relationship           Type I error        No error

Type I & Type II Errors (text p. 278)

Use of Type I & II Errors

Establishing error risk levels
  Conventional .05 level for rejecting null hypothesis
  But exceptions (familywise error)

Avoiding Type II errors
  Using interval or ratio levels of measurement (parametric statistics)
  Directional hypotheses (one-tailed tests)
  Increase sample size (depends on effect size)
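The familywise exception can be illustrated with a short sketch. It assumes the tests are independent of one another, which real studies may not satisfy, and the function name familywise_error is an illustrative choice.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# One test at .05 keeps the risk at 5%; ten tests push it to about 40%.
for k in (1, 5, 10):
    print(k, round(familywise_error(0.05, k), 2))
```

This is why running many tests at the conventional .05 level is treated as an exception that calls for a stricter per-test level.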

Effect Size

Degree to which your variables are interdependent in the population
  Could be large or small
  Variables can be related but with smaller level of significance (.001 as opposed to .05)

Not the same as the strength of the interdependence (small differences can be important)

First step: Decide on Error Risk & Sample size

Some researchers have proposed ideal sample sizes for different expectations about effect sizes (Cohen)

Second Step: Selecting appropriate statistical test

Depends on research methodology

Parametric statistics assume
  Probabilistic sampling techniques
  Independence of observations (selecting one element won’t influence likelihood of selecting another)
  Normal distribution of population
  Comparison groups reflect populations with the same variance
  Dependent variables measured at interval or ratio level

Non-parametric statistics
  Usually presume independence but not normal distribution of variables
  Nominal or ordinal levels of measurement

Measures of Association and Difference

Depends on research questions & operationalization
Measures of association (ex. correlation coefficients like Pearson’s r)
Measures of difference (ex. median split procedure)

Steps 3 & 4: Computing Test Statistic & Using Tables of Critical Values

Tables of Critical Values
  Numerical guides to sampling distribution of statistic
  Indicate likelihood of observations being due to sampling error

Deciding to reject or not reject null hypothesis

If the statistical value based on sample observations is at least as large as the critical value, the null hypothesis can be rejected
  At the .05 level, this accepts a 5% risk of rejecting a null hypothesis that is actually correct (Type I error)

If the computed value is smaller than the critical value in the table, the null hypothesis cannot be rejected (but there is a risk of Type II error)
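The decision rule can be sketched as a small lookup. The three critical values are the chi-square values quoted in the worked examples in these slides; the dictionary CRITICAL_VALUES and the function decide are illustrative names, and a real table of critical values covers many more rows.

```python
# Chi-square critical values quoted in these slides, keyed by (alpha, df).
CRITICAL_VALUES = {
    (0.05, 1): 3.84,
    (0.05, 3): 7.81,
    (0.01, 1): 6.64,
}


def decide(statistic: float, alpha: float, df: int) -> str:
    """Reject H0 when the computed statistic is at least as large as the critical value."""
    critical = CRITICAL_VALUES[(alpha, df)]
    return "reject H0" if statistic >= critical else "retain H0"


print(decide(9.25, 0.05, 3))  # 9.25 > 7.81
print(decide(0.32, 0.01, 1))  # 0.32 < 6.64
```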

Examples of Five Common Statistical Measures in Textbook

1. Non-parametric measure of difference (chi-square)

2. Parametric test of difference between 2 groups (t-test)

3. Parametric test of difference between 3 or more groups that vary on one independent variable (ANOVA)

4. Factorial ANOVA

5. Parametric measure of association (Pearson correlation)

Example: Chi-Square

Goodness of Fit (One-Variable)
Independence (Two-Variable)

Chi-Square Test

Evaluates whether observed frequencies for a qualitative variable (or variables) are adequately described by hypothesized or expected frequencies.
  Qualitative (or categorical) data is a set of observations where any single observation is a word or code that represents a class or category.

Goodness of Fit

One-Way Chi-Square
  Asks whether the relative frequencies observed in the categories of a sample frequency distribution are in agreement with the relative frequencies hypothesized to be true in the population.

Goodness of Fit

Observed Frequency: The obtained frequency for each category.

Goodness of Fit

State the research hypothesis. Is the rat’s behavior random?

State the statistical hypotheses.

Goodness of Fit: Expected Outcomes (with Null Hypothesis)

Expected proportions for the four categories (A, B, C, D) if picked by chance: .25, .25, .25, .25

$H_0: P_A = .25,\ P_B = .25,\ P_C = .25,\ P_D = .25$

$H_A: H_0 \text{ is false.}$

Goodness of Fit

Expected Frequency: The hypothesized frequency for each distribution, given the null hypothesis is true.
  Expected proportion multiplied by number of observations.

$f_e = .25(32) = 8$

Goodness of Fit

Set the decision rule.
  Degrees of Freedom: Number of Categories - 1

$df = 4 - 1 = 3$
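The two quantities set up so far, expected frequencies and degrees of freedom, can be sketched as follows. The function names are illustrative; the numbers (four categories, 32 observations, .25 each) are the ones on the slides.

```python
def expected_frequencies(proportions, n_observations):
    """Expected frequency per category: hypothesized proportion times number of observations."""
    return [p * n_observations for p in proportions]


def goodness_of_fit_df(n_categories):
    """Degrees of freedom for a one-way chi-square: number of categories minus 1."""
    return n_categories - 1


expected = expected_frequencies([0.25, 0.25, 0.25, 0.25], 32)
df = goodness_of_fit_df(len(expected))
print(expected, df)  # 8 expected per category, df = 3
```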

Goodness of Fit

Set the decision rule.

$\alpha = .05,\ df = 3,\ \chi^2_{crit} = 7.81$

Goodness of Fit

Calculate the test statistic.
  Subtract the expected frequency from the observed frequency in each cell.
  Square this difference for each cell.
  Divide each squared difference by the expected frequency of that cell.
  Add together the results for all the cells in the table.

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

Goodness of Fit

Calculate the test statistic.

Goodness of Fit

Decide if your result is significant. Reject H0, 9.25>7.81

Interpret your results. The rat’s behavior was not random.

Goodness of Fit: An Example

You may have heard, “Stay with your first answer on a multiple-choice test.” Is changing answers more likely to be helpful or harmful? To examine this, Best (1979) studied the responses of 261 students in an introductory psychology course. He recorded the number of right-to-wrong, wrong-to-right, and wrong-to-wrong answer changes for each student. More wrong-to-right changes than right-to-wrong changes were made by 195 of the students, who were thus “helped” by changing answers; 27 students made more right-to-wrong changes than wrong-to-right changes and thus hurt themselves. Using a .05 level of significance, test the hypothesis that the proportions of right-to-wrong and wrong-to-right changes are equal.

Goodness of Fit

State the research hypothesis.
  Should you stay with your first answer on a multiple-choice test?

State the statistical hypotheses.

$H_0: P_{\text{right-to-wrong}} = .50,\ P_{\text{wrong-to-right}} = .50$

$H_A: H_0 \text{ is false.}$

Goodness of Fit

Observed Frequency: The obtained frequency for each category.

            right-to-wrong     wrong-to-right
Observed    27                 195

Expected Frequency: The hypothesized frequency for each distribution, given the null hypothesis is true.
  Expected proportion multiplied by number of observations.

            right-to-wrong     wrong-to-right
Expected    .5*222 = 111       .5*222 = 111

Note: Use the total number of observations (222), not 100.

Goodness of Fit


Set the decision rule.

$\alpha = .05,\ df = 1,\ \chi^2_{crit} = 3.84$

Goodness of Fit

Calculate the test statistic.

            right-to-wrong     wrong-to-right
Observed    27                 195
Expected    111                111

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(27 - 111)^2}{111} + \frac{(195 - 111)^2}{111} = \frac{7056}{111} + \frac{7056}{111} = 63.57 + 63.57 = 127.14$
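The same arithmetic can be checked with a few lines of Python. This is a sketch of the one-way chi-square computation using the observed counts from Best (1979) as given on the slides; the function name chi_square is an illustrative choice.

```python
def chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    return sum((f_o - f_e) ** 2 / f_e for f_o, f_e in zip(observed, expected))


observed = [27, 195]               # right-to-wrong, wrong-to-right
n = sum(observed)                  # 222 students who were helped or hurt
expected = [0.5 * n, 0.5 * n]      # equal proportions under the null hypothesis
statistic = chi_square(observed, expected)
print(round(statistic, 2))         # compare with the critical value 3.84
```

Note that the expected frequencies use the 222 students who fell into one of the two categories, not all 261 students in the study.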

Goodness of Fit


Decide if your result is significant. Reject H0, 127.14>3.84

Interpret your results. The proportion of right-to-wrong changes and wrong-to-right changes is not equal.

Chi-Square

Cannot be negative because all discrepancies are squared.
Will be zero only in the unusual event that each observed frequency exactly equals the corresponding expected frequency.
Other things being equal, the larger the discrepancy between the expected frequencies and their corresponding observed frequencies, the larger the observed value of chi-square.
It is not the size of the discrepancy alone that accounts for a contribution to the value of chi-square, but the size of the discrepancy relative to the magnitude of the expected frequency.
The value of chi-square depends on the number of discrepancies involved in its calculation.

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
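The point about relative discrepancy can be seen in a small sketch. The counts below are hypothetical numbers chosen only for illustration: the same discrepancy of 5 contributes ten times more when the expected frequency is 10 than when it is 100.

```python
def cell_contribution(f_o, f_e):
    """Contribution of a single cell to chi-square: (f_o - f_e)^2 / f_e."""
    return (f_o - f_e) ** 2 / f_e


# Hypothetical cells with the same raw discrepancy (5) but different expected frequencies.
small_expected = cell_contribution(15, 10)
large_expected = cell_contribution(105, 100)
print(small_expected, large_expected)
```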

Chi-Square Test for Independence (Two-Way Chi-Square)

Asks whether observed frequencies reflect the independence of two qualitative variables.

Compares the actual observed frequencies of some phenomenon (in our sample) with the frequencies we would expect if there were no relationship at all between the two variables in the larger (sampled) population.

Two variables are independent if knowledge of the value of one variable provides no information about the value of another variable.

Chi-Square Test for Independence: Sex & Auto Choice

Chi-Square Test for Independence

Recent studies have found that most teens are knowledgeable about AIDS, yet many continue to practice high-risk sexual behaviors. King and Anderson (1993) asked young people the following question: “If you could have sexual relations with any and all partners of your choosing, as often as you wished, for the next 2 (or 10) years, but at the end of that time period you would die of AIDS, would you make this choice?” A five-point Likert scale was used to assess the subjects’ responses. For the following data, the responses “probably no,” “unsure,” “probably yes”, and “definitely yes” were pooled into the category “other.” Using the .05 level of significance, test for independence.

          Definitely No    Other
Males     451              165
Females   509              118


State the research hypothesis.
  Is willingness to participate in unprotected sex independent of gender?

State the statistical hypothesis.

$H_0$: Response to the question and gender are not related.

$H_A$: Response to the question and gender are related.


To find expected values:
  Find column, row, and overall totals.

          Definitely No    Other    Total
Males     451              165      616
Females   509              118      627
Total     960              283      1243


To find expected values:

$f_e = \frac{(\text{column total})(\text{row total})}{\text{overall total}}$

Males, Definitely No: $f_e = \frac{(960)(616)}{1243} = 475.75$
Females, Definitely No: $f_e = \frac{(960)(627)}{1243} = 484.25$
Males, Other: $f_e = \frac{(283)(616)}{1243} = 140.25$
Females, Other: $f_e = \frac{(283)(627)}{1243} = 142.75$

Observed frequencies, with expected frequencies in parentheses:

          Definitely No    Other           Total
Males     451 (475.75)     165 (140.25)    616
Females   509 (484.25)     118 (142.75)    627
Total     960              283             1243
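The expected values for the whole table can be produced in one pass. This is a minimal sketch with an illustrative function name, expected_table; it applies the (row total)(column total) / overall total rule to every cell of the observed table from the slides.

```python
def expected_table(observed):
    """Expected frequency for each cell: (row total)(column total) / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    return [[r * c / overall for c in col_totals] for r in row_totals]


# Rows: Males, Females. Columns: Definitely No, Other.
observed = [[451, 165], [509, 118]]
expected = expected_table(observed)
rounded = [[round(value, 2) for value in row] for row in expected]
print(rounded)
```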


Set the decision rule.
  Degrees of Freedom: (number of columns - 1)(number of rows - 1)

$df = (c - 1)(r - 1) = (2 - 1)(2 - 1) = (1)(1) = 1$




$\alpha = .05,\ df = 1,\ \chi^2_{crit} = 3.84$



Calculate the test statistic.

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

$\chi^2 = \frac{(451 - 475.75)^2}{475.75} + \frac{(509 - 484.25)^2}{484.25} + \frac{(165 - 140.25)^2}{140.25} + \frac{(118 - 142.75)^2}{142.75}$

$= \frac{612.56}{475.75} + \frac{612.56}{484.25} + \frac{612.56}{140.25} + \frac{612.56}{142.75}$

$= 1.29 + 1.26 + 4.37 + 4.29 = 11.21$
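Putting the steps together, the two-way test can be sketched end to end. The helper names are illustrative, and the expected frequencies are kept unrounded inside the computation, so individual terms can differ from the hand calculation in the third decimal place while the total still rounds to the same value.

```python
def expected_table(observed):
    """Expected frequency for each cell: (row total)(column total) / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    return [[r * c / overall for c in col_totals] for r in row_totals]


def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    expected = expected_table(observed)
    statistic = sum(
        (f_o - f_e) ** 2 / f_e
        for obs_row, exp_row in zip(observed, expected)
        for f_o, f_e in zip(obs_row, exp_row)
    )
    df = (len(observed[0]) - 1) * (len(observed) - 1)
    return statistic, df


# Rows: Males, Females. Columns: Definitely No, Other.
statistic, df = chi_square_independence([[451, 165], [509, 118]])
print(round(statistic, 2), df)  # compare with the critical value 3.84 at df = 1
```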



Decide if your result is significant. Reject H0, 11.21>3.84

Interpret your results. Willingness to engage in unprotected sex and gender are not independent.

          Definitely No    Other           Total
Males     451 (475.75)     165 (140.25)    616
Females   509 (484.25)     118 (142.75)    627
Total     960              283             1243

An Example: Chi-Square Test for Independence

In one large factory, 100 employees were judged to be highly successful and another 100 marginally successful. All workers were asked, “Which do you find more important to you personally, the money you are able to take home or the satisfaction you feel from doing the job?” In the first group, 49% found the money more important, but in the second group 53% responded that way. Test the null hypothesis that job performance and job motivation are independent using the .01 level of significance.


State the research hypothesis.
  Are job performance and job motivation independent?

State the statistical hypotheses.

$H_0$: Job performance and job motivation are independent.

$H_A$: Job performance and job motivation are not independent.

An Example:


Set the decision rule.

$df = (c - 1)(r - 1) = (2 - 1)(2 - 1) = (1)(1) = 1$

$\alpha = .01,\ \chi^2_{crit} = 6.64$



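Find the expected frequencies.

$f_e = \frac{(\text{row total})(\text{column total})}{\text{overall total}} = \frac{(102)(100)}{200} = 51$

$f_e = \frac{(\text{row total})(\text{column total})}{\text{overall total}} = \frac{(98)(100)}{200} = 49$

                High Success    Marginal Success    Total
Money           49 (51)         53 (51)             102
Satisfaction    51 (49)         47 (49)             98
Total           100             100                 200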



Calculate the test statistic.

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

$\chi^2 = \frac{(49 - 51)^2}{51} + \frac{(51 - 49)^2}{49} + \frac{(53 - 51)^2}{51} + \frac{(47 - 49)^2}{49} = .08 + .08 + .08 + .08 = 0.32$


Decide if your result is significant. Retain H0, 0.32<6.64

Interpret your results. Job performance and job motivation are independent.
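One step the factory example leaves implicit is turning the reported percentages into observed frequencies before testing. A hedged sketch of that step follows; the variable and function names are illustrative, and the critical value 6.64 is the one quoted above for the .01 level with df = 1.

```python
def chi_square_2way(observed):
    """Chi-square statistic for a two-way table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    return sum(
        (f_o - r * c / overall) ** 2 / (r * c / overall)
        for row, r in zip(observed, row_totals)
        for f_o, c in zip(row, col_totals)
    )


group_size = 100                                            # employees per success group
money = [round(0.49 * group_size), round(0.53 * group_size)]  # 49% and 53% chose money
satisfaction = [group_size - m for m in money]              # the rest chose satisfaction

# Rows: Money, Satisfaction. Columns: High Success, Marginal Success.
job_statistic = chi_square_2way([money, satisfaction])
job_decision = "reject H0" if job_statistic >= 6.64 else "retain H0"
print(round(job_statistic, 2), job_decision)
```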
