The Logic of Statistical Significance & Making Statistical Decisions
Textbook Chapter 11 & more
Statistical Significance & Probability
Last day: using the normal distribution, standard deviation & z-scores to calculate the probability that what we observe in the sample falls outside a certain range due to sampling error
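As a reminder of last day's calculation, here is a minimal sketch of the probability that a standard normal value falls outside a range of plus or minus z. The function name `prob_outside` is illustrative, and the cutoff 1.96 is the conventional value for a two-tailed .05 level, not a number taken from these slides.

```python
from statistics import NormalDist

def prob_outside(z_cutoff: float) -> float:
    """Two-tailed probability that a standard normal value lies beyond +/- z_cutoff."""
    upper_tail = 1.0 - NormalDist().cdf(z_cutoff)  # area above +z_cutoff
    return 2.0 * upper_tail                        # the normal curve is symmetric

# About 5% of sample outcomes fall more than 1.96 standard errors from the mean.
p_outside = prob_outside(1.96)
print(round(p_outside, 3))
```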
Elements in logic of statistical significance (p. 273)
Assumptions regarding independence of two variables
Assumptions regarding representativeness of samples drawn using probability sampling techniques
Observed distribution of sample in terms of two variables
Example: Study of a population of 256 people, half male & half female, about comfort using the Internet to make purchases
Expected outcomes if Null hypothesis holds
(no relationship between sex & comfort)
Illustration of a representative sample for previous case
Illustration of an unrepresentative sample for previous case (if null hypothesis holds)
But what if there IS a relationship between sex & comfort in the population?
Logical foundations of tests of statistical significance
How to explain a discrepancy between the assumption that the variables are independent and the observed distribution of sample elements:
1. The sample is unrepresentative
2. The variables are NOT independent
Type I (alpha) & Type II (beta) Errors

                            Situation in the “real world”
What the researcher says    No relationship    Causal relationship
No relationship             No error           Type II error
Causal relationship         Type I error       No error
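The four cells of the error table can be sketched as a small function. The name `classify_decision` and its boolean arguments are illustrative assumptions, not textbook terminology.

```python
def classify_decision(researcher_says_relationship: bool,
                      relationship_exists: bool) -> str:
    """Return the cell of the Type I / Type II table for one decision."""
    if researcher_says_relationship and not relationship_exists:
        return "Type I error"    # rejected a null hypothesis that was true
    if not researcher_says_relationship and relationship_exists:
        return "Type II error"   # retained a null hypothesis that was false
    return "No error"            # the decision matches the real world

print(classify_decision(True, False))
print(classify_decision(False, True))
```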
Type I & Type II Errors (text p. 278)
Use of Type I & II Errors
Establishing error risk levels
  Conventional .05 level for rejecting null hypothesis
  But exceptions (familywise error)
Avoiding Type II errors
  Using interval or ratio levels of measurement (parametric statistics)
  Directional hypotheses (one-tailed tests)
  Increase sample size (depends on effect size)
Effect Size
Degree to which your variables are interdependent in the population
Could be large or small
Variables can be related but with smaller level of significance (.001 as opposed to .05)
Not the same as the strength of the interdependence (small differences can be important)
First step: Decide on Error Risk & Sample size
Some researchers have proposed ideal sample sizes for different expectations about effect sizes (Cohen)
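To illustrate how the required sample size depends on effect size, here is a rough sketch using the standard normal approximation for comparing two group means. This is an assumption for illustration, not Cohen's own tables: the function `n_per_group`, the 80% power target, and the effect sizes 0.2, 0.5 and 0.8 (Cohen's conventional small, medium and large values of d) do not come from these slides, and t-based tables give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group for a two-tailed, two-group comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_power = NormalDist().inv_cdf(power)          # guards against Type II error
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)

# Smaller expected effects call for much larger samples.
sizes = {d: n_per_group(d) for d in (0.2, 0.5, 0.8)}
print(sizes)
```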
Second Step: Selecting appropriate statistical test
Depends on research methodology
Parametric statistics assume
  Probabilistic sampling techniques
  Independence of observations (selecting one element won’t influence likelihood of selecting another)
  Normal distribution of population
  Comparison groups reflect population with same variance
  Dependent variables measured at interval or ratio level
Non-parametric statistics
  Usually presume independence but not normal distribution of variables
  Nominal or ordinal levels of measurement
Measures of Association and Difference
Depends on research questions & operationalization
Measures of association (ex. correlation coefficients like Pearson’s r)
Measures of difference (ex. median split procedure)
Steps 3 & 4: Computing Test Statistic & Using Tables of Critical Values
Tables of Critical Values
  Numerical guides to sampling distribution of statistic
  Indicate likelihood of observations being due to sampling error
Deciding to reject or not reject null hypothesis
If statistical value based on sample observations is at least as large as critical value, null hypothesis can be rejected
  At the .05 level, a 5% risk of rejecting a null hypothesis that is actually correct (Type I error)
If computed value is smaller than critical value of table, then null hypothesis cannot be rejected (but risk of Type II error)
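The decision rule can be sketched in a few lines. The function name `decide` is illustrative; the two calls use test statistics and critical values that appear in the worked examples later in this lecture.

```python
def decide(test_statistic: float, critical_value: float) -> str:
    """Apply the decision rule: reject H0 when the statistic reaches the critical value."""
    if test_statistic >= critical_value:
        return "reject H0"   # at the risk of a Type I error
    return "retain H0"       # at the risk of a Type II error

print(decide(9.25, 7.81))   # statistic exceeds the critical value
print(decide(0.32, 6.64))   # statistic falls short of the critical value
```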
Examples of Five Common Statistical Measures in Textbook
1. Non-parametric measure of difference (chi-square)
2. Parametric test of difference between 2 groups (t-test)
3. Parametric test of difference between 3 or more groups that vary on one independent variable (ANOVA)
4. Factorial ANOVA
5. Parametric measure of association (Pearson correlation)
Example: Chi-Square
Goodness of Fit (One-Variable)
Independence (Two-Variable)
Chi-Square Test
Evaluates whether observed frequencies for a qualitative variable (or variables) are adequately described by hypothesized or expected frequencies.
Qualitative (or categorical) data are a set of observations where any single observation is a word or code that represents a class or category.
Goodness of Fit
One-Way Chi-Square
Asks whether the relative frequencies observed in the categories of a sample frequency distribution are in agreement with the relative frequencies hypothesized to be true in the population.
Goodness of Fit
Observed Frequency
The obtained frequency for each category.
Goodness of Fit
State the research hypothesis. Is the rat’s behavior random?
State the statistical hypotheses.
Goodness of Fit: Expected Outcomes (with Null Hypothesis)
Expected proportion for each category if picked by chance: .25, .25, .25, .25

\[ H_0: P_A = P_B = P_C = P_D = .25 \]
\[ H_A: H_0 \text{ is false.} \]
Goodness of Fit
Expected Frequency
The hypothesized frequency for each category, given the null hypothesis is true.
Expected proportion multiplied by number of observations.
\[ f_e = .25(32) = 8 \]
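The expected-frequency rule (expected proportion times number of observations) can be sketched as follows. The function name `expected_frequencies` is illustrative; the proportions of .25 and the 32 observations are the values from the slide.

```python
def expected_frequencies(proportions, n_observations):
    """Expected frequency per category = hypothesized proportion * number of observations."""
    return [p * n_observations for p in proportions]

# Four categories, each expected 25% of the time, over 32 observations.
rat_expected = expected_frequencies([0.25, 0.25, 0.25, 0.25], 32)
print(rat_expected)
```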
Goodness of Fit
Set the decision rule.
Degrees of Freedom: Number of Categories - 1
\[ df = 4 - 1 = 3 \]
Goodness of Fit
Set the decision rule.
\[ \chi^2_{crit} = 7.81 \quad (df = 3,\ \alpha = .05) \]
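To see where a tabled critical value comes from, the sketch below computes the area of the chi-square distribution with 3 degrees of freedom that lies at or beyond a given value. It uses a closed-form expression that holds for df = 3 only; the function name is illustrative, and a statistics package would normally be used instead.

```python
from math import erfc, exp, pi, sqrt

def chi2_upper_tail_df3(x: float) -> float:
    """P(chi-square with 3 df >= x); this closed form is valid for df = 3 only."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

# The critical value 7.81 cuts off roughly the upper 5% of the distribution.
tail_at_critical = chi2_upper_tail_df3(7.81)
print(round(tail_at_critical, 3))
```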
Goodness of Fit
Calculate the test statistic.
Subtract the expected frequency from the observed frequency in each cell.
Square this difference for each cell.
Divide each squared difference by the expected frequency of that cell.
Add together the results for all the cells in the table.
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
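The four calculation steps translate directly into code. This is a minimal sketch: the observed counts below are hypothetical, chosen only to make the arithmetic easy to follow, and are not the data from the slide.

```python
def chi_square(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    total = 0.0
    for f_o, f_e in zip(observed, expected):
        difference = f_o - f_e        # step 1: subtract expected from observed
        squared = difference ** 2     # step 2: square the difference
        total += squared / f_e        # steps 3 and 4: divide by expected, add up
    return total

hypothetical_observed = [12, 6, 9, 5]   # hypothetical counts summing to 32
expected = [8, 8, 8, 8]
statistic = chi_square(hypothetical_observed, expected)  # (16 + 4 + 1 + 9) / 8
print(statistic)
```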
Goodness of Fit
Calculate the test statistic.
Goodness of Fit
Decide if your result is significant.
Reject H0, 9.25 > 7.81
Interpret your results.
The rat’s behavior was not random.
Goodness of Fit: An Example
You may have heard, “Stay with your first answer on a multiple-choice test.” Is changing answers more likely to be helpful or harmful? To examine this, Best (1979) studied the responses of 261 students in an introductory psychology course. He recorded the number of right-to-wrong, wrong-to-right, and wrong-to-wrong answer changes for each student. More wrong-to-right changes than right-to-wrong changes were made by 195 of the students, who were thus “helped” by changing answers; 27 students made more right-to-wrong changes than wrong-to-right changes and thus hurt themselves. Using a .05 level of significance, test the hypothesis that the proportions of right-to-wrong and wrong-to-right changes are equal.
Goodness of Fit
State the research hypothesis.
Should you stay with your first answer on a multiple-choice test?
State the statistical hypotheses.
\[ H_0: P_{\text{right-to-wrong}} = .50,\ P_{\text{wrong-to-right}} = .50 \]
\[ H_A: H_0 \text{ is false.} \]
Goodness of Fit
Observed Frequency
The obtained frequency for each category.
            right-to-wrong    wrong-to-right
Observed    27                195
Expected Frequency
The hypothesized frequency for each category, given the null hypothesis is true.
Expected proportion multiplied by number of observations.
            right-to-wrong    wrong-to-right
Expected    .5*222 = 111      .5*222 = 111
Note: Multiply by the total number of observations (222), not 100.
Goodness of Fit
Set the decision rule.
\[ \chi^2_{crit} = 3.84 \quad (df = 1,\ \alpha = .05) \]
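With one degree of freedom, the chi-square critical value equals the square of the two-tailed normal critical value. The sketch below uses that relationship, which holds for df = 1 only, to reproduce the tabled value. The function name is illustrative, and printed tables may differ in the last digit because of rounding.

```python
from statistics import NormalDist

def chi2_critical_df1(alpha: float) -> float:
    """Chi-square critical value for df = 1: the squared two-tailed z critical value."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2

crit_05 = chi2_critical_df1(0.05)   # z is about 1.96
crit_01 = chi2_critical_df1(0.01)   # z is about 2.58
print(round(crit_05, 2))
```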
Goodness of Fit
Calculate the test statistic.
            right-to-wrong    wrong-to-right
Observed    27                195
Expected    111               111
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(27 - 111)^2}{111} + \frac{(195 - 111)^2}{111} = \frac{7056}{111} + \frac{7056}{111} = 63.57 + 63.57 = 127.14 \]
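The answer-changing example can be checked end to end with a short sketch. The variable names are illustrative; the counts (27 and 195), the hypothesized proportion of .50, and the critical value 3.84 are the ones used on the slides.

```python
observed = {"right-to-wrong": 27, "wrong-to-right": 195}

n = sum(observed.values())                        # 222 students, not 100
expected = {k: 0.5 * n for k in observed}         # 111 in each category under H0

chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
reject_null = chi2 >= 3.84                        # critical value for df = 1, alpha = .05

print(round(chi2, 2), reject_null)
```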
Goodness of Fit
Decide if your result is significant.
Reject H0, 127.14 > 3.84
Interpret your results.
The proportions of right-to-wrong changes and wrong-to-right changes are not equal.
Chi-Square
Cannot be negative because all discrepancies are squared.
Will be zero only in the unusual event that each observed frequency exactly equals the corresponding expected frequency.
Other things being equal, the larger the discrepancy between the expected frequencies and their corresponding observed frequencies, the larger the observed value of chi-square.
It is not the size of the discrepancy alone that accounts for a contribution to the value of chi-square, but the size of the discrepancy relative to the magnitude of the expected frequency.
The value of chi-square depends on the number of discrepancies involved in its calculation.
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
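The point about relative discrepancy is easy to see numerically. In this sketch, with hypothetical counts, the same raw discrepancy of 10 contributes five times as much to chi-square when the expected frequency is 20 as when it is 100.

```python
def cell_contribution(observed: float, expected: float) -> float:
    """One cell's contribution to chi-square: (observed - expected)^2 / expected."""
    return (observed - expected) ** 2 / expected

small_expected = cell_contribution(30, 20)     # discrepancy of 10 against 20
large_expected = cell_contribution(110, 100)   # discrepancy of 10 against 100
exact_match = cell_contribution(20, 20)        # no discrepancy at all

print(small_expected, large_expected, exact_match)
```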
Chi-Square Test for Independence (Two-Way Chi-Square)
Asks whether observed frequencies reflect the independence of two qualitative variables.
Compares the actual observed frequencies of some phenomenon (in our sample) with the frequencies we would expect if there were no relationship at all between the two variables in the larger (sampled) population.
Two variables are independent if knowledge of the value of one variable provides no information about the value of another variable.
Chi-Square Test for Independence: Sex & Auto Choice
Chi-Square Test for Independence
Recent studies have found that most teens are knowledgeable about AIDS, yet many continue to practice high-risk sexual behaviors. King and Anderson (1993) asked young people the following question: “If you could have sexual relations with any and all partners of your choosing, as often as you wished, for the next 2 (or 10) years, but at the end of that time period you would die of AIDS, would you make this choice?” A five-point Likert scale was used to assess the subjects’ responses. For the following data, the responses “probably no,” “unsure,” “probably yes”, and “definitely yes” were pooled into the category “other.” Using the .05 level of significance, test for independence.
Definitely No Other
Males 451 165
Females 509 118
Chi-Square Test for Independence
State the research hypothesis.
Is willingness to participate in unprotected sex independent of gender?
State the statistical hypothesis.
\[ H_0: \text{Response to the question and gender are not related.} \]
\[ H_A: \text{Response to the question and gender are related.} \]
Chi-Square Test for Independence
To find expected values:
Find column, row, and overall totals.
           Definitely No    Other    Total
Males      451              165      616
Females    509              118      627
Total      960              283      1243
Chi-Square Test for Independence
To find expected values:
Definitely No Other Total
Males 451 (475.75) 165 616
Females 509 118 627
Total 960 283 1243
\[ f_e = \frac{(\text{column total})(\text{row total})}{\text{overall total}} = \frac{(960)(616)}{1243} = 475.75 \]
Chi-Square Test for Independence
To find expected values:
Definitely No Other Total
Males 451 (475.75) 165 616
Females 509 (484.25) 118 627
Total 960 283 1243
\[ f_e = \frac{(\text{column total})(\text{row total})}{\text{overall total}} = \frac{(960)(627)}{1243} = 484.25 \]
Chi-Square Test for Independence
To find expected values:
Definitely No Other Total
Males 451 (475.75) 165 (140.25) 616
Females 509 (484.25) 118 627
Total 960 283 1243
\[ f_e = \frac{(\text{column total})(\text{row total})}{\text{overall total}} = \frac{(283)(616)}{1243} = 140.25 \]
Chi-Square Test for Independence
To find expected values:
Definitely No Other Total
Males 451 (475.75) 165 (140.25) 616
Females 509 (484.25) 118 (142.75) 627
Total 960 283 1243
\[ f_e = \frac{(\text{column total})(\text{row total})}{\text{overall total}} = \frac{(283)(627)}{1243} = 142.75 \]
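All four expected values follow from the same rule, so they can be computed in one pass. This is a minimal sketch with illustrative variable names, using the observed counts from the table above.

```python
# Rows: Males, Females. Columns: Definitely No, Other.
observed = [[451, 165],
            [509, 118]]

row_totals = [sum(row) for row in observed]         # 616, 627
col_totals = [sum(col) for col in zip(*observed)]   # 960, 283
grand_total = sum(row_totals)                       # 1243

# Expected frequency for each cell = (row total)(column total) / overall total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
rounded = [[round(value, 2) for value in row] for row in expected]
print(rounded)
```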
Chi-Square Test for Independence
Set the decision rule.
Degrees of Freedom: (number of columns - 1)(number of rows - 1)
\[ df = (c - 1)(r - 1) = (2 - 1)(2 - 1) = (1)(1) = 1 \]
Definitely No Other
Males 451 165
Females 509 118
Chi-Square Test for Independence
Set the decision rule.
\[ \chi^2_{crit} = 3.84 \quad (df = 1,\ \alpha = .05) \]
Chi-Square Test for Independence
Calculate the test statistic.
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
Definitely No Other Total
Males 451 (475.75) 165 (140.25) 616
Females 509 (484.25) 118 (142.75) 627
Total 960 283 1243
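\[ \chi^2 = \frac{(451 - 475.75)^2}{475.75} + \frac{(509 - 484.25)^2}{484.25} + \frac{(165 - 140.25)^2}{140.25} + \frac{(118 - 142.75)^2}{142.75} \]
\[ = \frac{612.56}{475.75} + \frac{612.56}{484.25} + \frac{612.56}{140.25} + \frac{612.56}{142.75} \]
\[ = 1.29 + 1.26 + 4.37 + 4.29 = 11.21 \]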
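The whole test for independence can be sketched as one function that returns the statistic and its degrees of freedom. The function name is illustrative. Because the code carries unrounded expected frequencies, intermediate values differ from the hand calculation in later decimals, but the result still rounds to 11.21.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / grand_total   # expected frequency
            statistic += (f_o - f_e) ** 2 / f_e

    df = (len(table) - 1) * (len(table[0]) - 1)   # (r - 1)(c - 1)
    return statistic, df

# Rows: Males, Females. Columns: Definitely No, Other.
stat, df = chi_square_independence([[451, 165], [509, 118]])
print(round(stat, 2), df)
```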
Chi-Square Test for Independence
Decide if your result is significant.
Reject H0, 11.21 > 3.84
Interpret your results.
Willingness to engage in unprotected sex and gender are not independent.
Definitely No Other Total
Males 451 (475.75) 165 (140.25) 616
Females 509 (484.25) 118 (142.75) 627
Total 960 283 1243
An Example: Chi-Square Test for Independence
In one large factory, 100 employees were judged to be highly successful and another 100 marginally successful. All workers were asked, “Which do you find more important to you personally, the money you are able to take home or the satisfaction you feel from doing the job?” In the first group, 49% found the money more important, but in the second group 53% responded that way. Test the null hypothesis that job performance and job motivation are independent using the .01 level of significance.
An Example: Chi-Square Test for Independence
State the research hypothesis.
Are job performance and job motivation independent?
State the statistical hypotheses.
\[ H_0: \text{Job performance and job motivation are independent.} \]
\[ H_A: \text{Job performance and job motivation are not independent.} \]
An Example:
Set the decision rule.
\[ df = (c - 1)(r - 1) = (2 - 1)(2 - 1) = (1)(1) = 1 \]
\[ \chi^2_{crit} = 6.64 \quad (\alpha = .01) \]
An Example: Chi-Square Test for Independence
Calculate the test statistic.
\[ f_e = \frac{(\text{row total})(\text{column total})}{\text{overall total}} = \frac{(102)(100)}{200} = 51 \]
\[ f_e = \frac{(\text{row total})(\text{column total})}{\text{overall total}} = \frac{(98)(100)}{200} = 49 \]
                High Success    Marginal Success    Total
Money           49 (51)         53 (51)             102
Satisfaction    51 (49)         47 (49)             98
Total           100             100                 200
An Example: Chi-Square Test for Independence
Calculate the test statistic.
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(49 - 51)^2}{51} + \frac{(51 - 49)^2}{49} + \frac{(53 - 51)^2}{51} + \frac{(47 - 49)^2}{49} = .08 + .08 + .08 + .08 = 0.32 \]
                High Success    Marginal Success    Total
Money           49 (51)         53 (51)             102
Satisfaction    51 (49)         47 (49)             98
Total           100             100                 200
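As a cross-check on the hand calculation, a 2 x 2 table also has a shortcut formula that gives the same chi-square without computing expected frequencies first. The shortcut is not on the slides; it is algebraically equivalent to the cell-by-cell sum. The function name and the cell labels a, b, c, d are illustrative.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Shortcut chi-square for the 2 x 2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: Money, Satisfaction. Columns: High Success, Marginal Success.
job_chi2 = chi_square_2x2(49, 53, 51, 47)
print(round(job_chi2, 2))
```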
An Example: Chi-Square Test for Independence
Decide if your result is significant.
Retain H0, 0.32 < 6.64
Interpret your results.
Job performance and job motivation are independent.
                High Success    Marginal Success    Total
Money           49 (51)         53 (51)             102
Satisfaction    51 (49)         47 (49)             98
Total           100             100                 200