
COURSE: JUST 3900
INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE

Instructor: Dr. John J. Kerbs, Associate Professor

Joint Ph.D. in Social Work and Sociology

Chapter 8: Introduction to Hypothesis Testing

Hypothesis Testing

A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis about a population.

The general goal of a hypothesis test is to rule out chance (sampling error) as a plausible explanation for the results from a research study.

Hypothesis Test - Steps

1. State a hypothesis about the population.

2. Use the hypothesis to predict the characteristics the sample should have.

3. Obtain a sample from the population.

4. Compare the data with the hypothesis prediction. If the sample mean is consistent with the prediction, then we conclude that the hypothesis is reasonable.

Basic Experimental Situation for Hypothesis Testing

Basic Assumption of Hypothesis Testing

If the treatment has any effect, it is simply to add or subtract a constant amount to each individual’s score.

Remember that adding or subtracting a constant changes the mean, but not the shape of the distribution for the population or the standard deviation.

Thus, the population after treatment has the same shape and standard deviation as the population prior to treatment.
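The basic assumption can be checked with a few lines of Python. This is a minimal sketch: the five scores and the 10-point effect are made-up numbers chosen only for illustration, not data from the course.

```python
from statistics import mean, pstdev

# Hypothetical scores for a tiny population (made-up numbers for illustration).
before = [60, 70, 80, 90, 100]

# Assumed treatment effect: subtract a constant 10 points from every score.
effect = -10
after = [score + effect for score in before]

mean_before, mean_after = mean(before), mean(after)
sd_before, sd_after = pstdev(before), pstdev(after)

print(mean_before, mean_after)  # the mean shifts by the constant
print(sd_before, sd_after)      # the standard deviation does not change
```

Every score moves by the same amount, so the distances between scores (and therefore the shape and the standard deviation) stay the same.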

Hypothesis Testing (cont'd.)

If the individuals in the sample are noticeably different from the individuals in the original population, we have evidence that the treatment has an effect.

However, it is also possible that the difference between the sample and the population is simply sampling error.

The question that this chapter addresses is as follows: How much sampling error are you willing to tolerate?

A Treated Sample That Represents a Treated Population

Hypothesis Testing (cont'd.)

The purpose of the hypothesis test is to decide between two explanations:

1. The difference between the sample and the population can be explained by sampling error (there does not appear to be a treatment effect).

2. The difference between the sample and the population is too large to be explained by sampling error (there does appear to be a treatment effect).

The Hypothesis Test: Step 1
Clearly State the Hypothesis

State the hypothesis about the unknown population.

The null hypothesis, H0, states that there is no change in the general population before and after an intervention. In the context of an experiment, H0 predicts that the independent variable had no effect on the dependent variable.

The alternative hypothesis, H1, states that there is a change in the general population following an intervention. In the context of an experiment, H1 predicts that the independent variable did have an effect on the dependent variable.

The Hypothesis Test: Step 2
Set Criteria for Decision

The α level establishes a criterion, or "cut-off", for making a decision about the null hypothesis. The alpha level also determines the risk of a Type I Error (False Positive).

α = .05 (most used), α = .01, α = .001
Find values in the unit normal table for z-scores

The critical region consists of outcomes that are very unlikely to occur if the null hypothesis is true. That is, the critical region is defined by sample means that are almost impossible to obtain if the treatment has no effect.

Hypothesis Testing and the Critical Region for z-Scores

Remember that each tail has 2.5% (for a two-tailed test with α = .05).

The Hypothesis Test: Step 3
Collect Data & Compute Sample Statistics

Compare the sample mean (data) with the null hypothesis.

Compute the test statistic. The test statistic (z-score) forms a ratio comparing the obtained difference between the sample mean and the hypothesized population mean versus the amount of difference we would expect without any treatment effect (the standard error).

The Hypothesis Test: Step 4
Make A Decision

If the test statistic falls in the critical region, we conclude that the difference is significant or that the treatment has a significant effect. In this case we reject the null hypothesis.

If the test statistic does not fall in the critical region, we conclude that the evidence from the sample is not sufficient to show a treatment effect. In this case we fail to reject the null hypothesis.

Hypothesis Testing Example:
Re-Arrests for Heroin Addicts

Let us assume that the population of all heroin addicts in the US commits an average of μ = 80 property crimes per year with a standard deviation of σ = 20 when they live in the free world without treatment. The National Institute of Justice wants to determine if treatment with opiate antagonists (Naltrexone) significantly alters the average number of crimes committed per year (M = 70) for a small random sample (n = 16) of heroin-addicted felons who are completely detoxed and placed on daily doses of Naltrexone.


Step 1: State the Hypothesis

H0: μ(with Naltrexone) = 80 Property Crimes/Year
- The null hypothesis suggests no treatment effect: even with Naltrexone, the mean number of property crimes per year will be 80.

H1: μ(with Naltrexone) ≠ 80 Property Crimes/Year
- The alternative hypothesis suggests the presence of a treatment effect: with Naltrexone, the mean number of property crimes per year will be different from 80.


Step 2: Set Criteria for Decision

Select an Alpha Level and determine the boundaries for the critical regions.

Most studies use an alpha of .05, which corresponds to a z-score of +/- 1.96 (2-tailed test).

If the z-score for the treated sample does not fall into the critical region, fail to reject H0.

If the z-score for the treated sample falls into the critical region (z ≤ -1.96 or z ≥ +1.96), reject H0.


Step 3: Compute Sample Statistic

Complete the computation for the z-statistic based upon the difference between the sample mean (M = 70, n = 16) and the population mean (μ = 80), using the standard error for the sample mean (σM = 5) in the denominator of the z-statistic:

σM = σ / √n = 20 / √16 = 5

z = (M - μ) / σM = (70 - 80) / 5 = -2.00

Remember to calculate the standard error for the sample mean and use this in the denominator.

Do not use the standard deviation as the denominator of the z-statistic for the sample.
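The Step 3 arithmetic can be reproduced in Python. This is only a sketch using the numbers given in the example; the variable names are illustrative.

```python
import math

mu = 80      # population mean without treatment (from the example)
sigma = 20   # population standard deviation (from the example)
n = 16       # sample size
M = 70       # sample mean after treatment

# The denominator is the standard error of the mean, not sigma itself.
standard_error = sigma / math.sqrt(n)   # 20 / 4
z_sample = (M - mu) / standard_error    # (70 - 80) / standard error

print(standard_error, z_sample)
```

Dividing by σ = 20 instead of the standard error would give -0.50 rather than -2.00, which is why the slide warns against using the standard deviation as the denominator.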


Step 4: Make A Decision

To make a decision, you must compare the z-statistic for the sample (z_sample = -2.00) against the z-statistic that defines the boundaries of your critical region. As discussed earlier, we set the alpha level at .05 (z_critical = +/- 1.96).

Because -2.00 lies beyond -1.96, the sample z-statistic falls in the critical region. Thus, we reject the null hypothesis (H0) and note that there does appear to be a treatment effect on the average number of property crimes committed per year for felons taking daily doses of Naltrexone.
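The four steps of the example can be collected into one small function. This is a sketch under the example's assumptions (known σ, two-tailed test); the function name `z_test_two_tailed` is hypothetical and not part of any library.

```python
import math

def z_test_two_tailed(M, mu, sigma, n, z_critical=1.96):
    """Return (z, reject) for a two-tailed z-test with known sigma.

    Hypothetical helper for illustration; z_critical defaults to the
    alpha = .05 cutoff used in the example.
    """
    z = (M - mu) / (sigma / math.sqrt(n))
    reject = abs(z) >= z_critical   # in either tail of the critical region?
    return z, reject

z, reject = z_test_two_tailed(M=70, mu=80, sigma=20, n=16)
print(z, "reject H0" if reject else "fail to reject H0")
```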

Errors in Hypothesis Tests

Just because the sample mean (following treatment) is different from the original population mean does not necessarily indicate that the treatment has caused a change.

You should recall that there usually is some discrepancy between a sample mean and the population mean simply as a result of sampling error.

Errors in Hypothesis Tests (cont'd.)

Because the hypothesis test relies on sample data, and because sample data are not completely reliable, there is always the risk that misleading data will cause the hypothesis test to reach a wrong conclusion.

Two types of errors are possible.

Type I Errors

A Type I error occurs when the sample data appear to show a treatment effect when, in fact, there is none.

In this case the researcher will reject the null hypothesis and falsely conclude that the treatment has an effect.

Type I errors are caused by unusual, unrepresentative samples, falling in the critical region even though the treatment has no effect.

The hypothesis test is structured so that Type I errors are very unlikely; specifically, the probability of a Type I error is equal to the alpha level.

Type I Errors
The α level

Also known as the Level of Significance
Also known as the Type I Error rate
Also determines the risk of a false positive finding

The probability that a result in the critical region would be produced by chance (sampling error or random error) alone when the null hypothesis is true

Commonly used levels of significance (α)

α = .05 (most used)
When the null hypothesis is true, 5% (5 out of every 100) of results would fall in the critical region by chance

α = .01
When the null hypothesis is true, 1% (1 out of every 100) of results would fall in the critical region by chance

α = .001
When the null hypothesis is true, 0.1% (1 out of every 1000) of results would fall in the critical region by chance
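A simulation can make the meaning of α concrete. The sketch below assumes a normally distributed population and reuses the values from the Naltrexone example (μ = 80, σ = 20, n = 16); the number of trials and the random seed are arbitrary choices. Every sample is drawn from the untreated population, so any rejection is a Type I error.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma, n = 80, 20, 16   # values borrowed from the Naltrexone example
z_critical = 1.96           # two-tailed cutoff for alpha = .05
trials = 20000              # number of simulated studies (arbitrary choice)
standard_error = sigma / math.sqrt(n)

false_positives = 0
for _ in range(trials):
    # No treatment is applied, so the null hypothesis is true by construction.
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    M = sum(sample) / n
    z = (M - mu) / standard_error
    if abs(z) >= z_critical:
        false_positives += 1

type_i_rate = false_positives / trials
print(type_i_rate)  # should land close to .05
```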

Type I Errors:
Alpha Levels and z-Scores

Select α level for two-tailed tests

Two-tailed tests hypothesize the presence of a difference, but not a particular direction for the difference between a sample mean (M) and a population mean (μ).

H0: M = μ
H1: M ≠ μ

α Level    z-Score
.05        +/- 1.96
.01        +/- 2.58
.001       +/- 3.30
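The two-tailed cutoffs can also be computed instead of looked up. This sketch uses `NormalDist` from Python's standard library; the helper name `two_tailed_critical_z` is hypothetical. Computed values may differ slightly from a rounded table entry: for α = .001 the computed cutoff is about 3.29, while the table above lists 3.30.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # mean 0, standard deviation 1

def two_tailed_critical_z(alpha):
    """Positive cutoff that leaves alpha / 2 in each tail (hypothetical helper)."""
    return unit_normal.inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(two_tailed_critical_z(alpha), 2))
```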

Type II Errors

A Type II error occurs when the sample does not appear to have been affected by the treatment when, in fact, the treatment does have an effect.

In this case, the researcher will fail to reject the null hypothesis and falsely conclude that the treatment does not have an effect.

Type II errors are commonly the result of a very small treatment effect. Although the treatment does have an effect, it is not large enough to show up in the research study.

Type II Errors

Also known as beta error (β)
Defined by the probability of false negatives

An error made by accepting or retaining a false null hypothesis (H0)

Stated simply, you fail to reject a false null hypothesis (H0) and claim that a relationship does not exist when (in fact) it does exist

Type I versus Type II Error

                     H0 is actually true       H0 is actually false
Reject H0            FALSE+ (Type I error)     TRUE+
Fail to reject H0    TRUE-                     FALSE- (Type II error)

Directional Tests

When a research study predicts a specific direction for the treatment effect (increase or decrease), it is possible to incorporate the directional prediction into the hypothesis test.

The result is called a directional test or a one-tailed test. A directional test includes the directional prediction in the statement of the hypotheses and in the location of the critical region.

Type I Errors:
Set Criteria for Decision

Select α level for one-tailed tests

These tests hypothesize the presence of a difference between a sample mean (M) and a population mean (μ) that falls in a particular direction.

M > μ or M < μ

α Level    z-Score
.05        +/- 1.65
.01        +/- 2.33
.001       +/- 3.10

Directional Tests (cont'd.)

For the prior example with Naltrexone treatment, if the original population has a mean number of property crimes per year of μ = 80 and the treatment is predicted to decrease the mean number of property crimes per year, then the null and alternative hypotheses would state that after treatment:

H0: μ ≥ 80 (there is no decrease)

H1: μ < 80 (there is a decrease)

In this case, the entire critical region would be located in the left-hand tail of the distribution because smaller values for M would demonstrate that there is a decrease in property crimes per year for Naltrexone recipients. We would reject the null hypothesis if the z-score for the sample was lower than the critical cutoff identified for a particular level of significance (for example, z_crit = -1.65 for α = .05).
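The directional version of the Naltrexone test can be sketched as follows. The sample values (M = 70, n = 16, σ = 20) are carried over from the earlier two-tailed example as an assumption, and the cutoff is computed with the standard library rather than read from the table, so it comes out near -1.645 instead of the rounded -1.65.

```python
import math
from statistics import NormalDist

alpha = 0.05
# The whole alpha goes into the left tail because a decrease is predicted.
z_critical = NormalDist().inv_cdf(alpha)

mu, sigma, n, M = 80, 20, 16, 70   # values from the Naltrexone example
z_sample = (M - mu) / (sigma / math.sqrt(n))

reject_h0 = z_sample <= z_critical
print(round(z_critical, 3), z_sample, reject_h0)
```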

Measuring Effect Size

A hypothesis test evaluates the statistical significance of the results from a research study.

That is, the test determines whether or not it is likely that the obtained sample mean occurred without any contribution from a treatment effect.

The hypothesis test is influenced not only by the size of the treatment effect but also by the size of the sample.

Thus, even a very small effect can be significant if it is observed in a very large sample.
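A short calculation shows how sample size alone can push a small effect into the critical region. The numbers here are assumptions chosen for illustration (a 1-point mean difference, σ = 20), and `z_statistic` is a hypothetical helper name.

```python
import math

def z_statistic(mean_difference, sigma, n):
    """z for a given mean difference (M - mu); hypothetical helper."""
    return mean_difference / (sigma / math.sqrt(n))

sigma = 20            # population standard deviation (assumed for illustration)
mean_difference = 1   # a small 1-point effect (assumed for illustration)

z_small_sample = z_statistic(mean_difference, sigma, n=16)
z_large_sample = z_statistic(mean_difference, sigma, n=10000)

print(z_small_sample, z_large_sample)
```

With n = 16 the z-score is far from the +/- 1.96 cutoff, while with n = 10,000 the same 1-point difference is well beyond it.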

Measuring Effect Size

Because a significant effect does not necessarily mean a large effect, it is recommended that the hypothesis test be accompanied by a measure of the effect size.

We use Cohen’s d as a standardized measure of effect size.

Much like a z-score, Cohen’s d measures the size of the mean difference in terms of the standard deviation.

Cohen’s d and Estimated Cohen’s d

Calculations for Cohen’s d are fairly simple: d is the mean difference divided by the standard deviation.

Note: Sample size does not affect Cohen’s d

Evaluating Effect Sizes for d

Magnitude of d    Evaluation of Effect Size
d = 0.2           Small effect (mean difference around 0.2 standard deviations)
d = 0.5           Medium effect (mean difference around 0.5 standard deviations)
d = 0.8           Large effect (mean difference around 0.8 standard deviations)
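As an illustration, Cohen’s d can be computed for the earlier Naltrexone example (M = 70, μ = 80, σ = 20). Applying d to that example is an assumption made here; the slides do not report this value. The function names are hypothetical, and the way `describe` treats values between the 0.2 / 0.5 / 0.8 benchmarks is also an assumption.

```python
def cohens_d(mean_difference, standard_deviation):
    """Cohen's d = mean difference / standard deviation."""
    return mean_difference / standard_deviation

def describe(d):
    """Rough label based on the 0.2 / 0.5 / 0.8 benchmarks.

    Treating the benchmarks as lower cutoffs is an assumption for illustration.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the small benchmark"

d = cohens_d(70 - 80, 20)   # Naltrexone example: M = 70, mu = 80, sigma = 20
print(d, describe(d))
```

Note that n does not appear anywhere in the calculation, which is the sense in which sample size does not affect Cohen’s d.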

The Effect of Standard Deviation on Calculations for Cohen’s d

Power of a Hypothesis Test

The power of a hypothesis test is defined as the probability that the test will reject the null hypothesis when the treatment does have an effect.

Probability of Type II Error (False Negative) = β
Power of Hypothesis Test = 1 - β

The power of a test depends on a variety of factors, including the size of the treatment effect and the size of the sample.

Factors that Affect Power

Power decreases when
1. Sample size is decreased
2. Alpha is decreased (e.g., from .05 to .01)
3. You go from a 1- to 2-tail test

Power increases when
1. Sample size is increased
2. Alpha is increased (e.g., from .01 to .05)
3. You go from a 2- to 1-tail test
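The three factors can be compared numerically. The sketch below assumes a z-test with known σ and a positive treatment effect; the 4-point effect, σ = 10, and the sample sizes are made-up values, and `power_upper_tail` is a hypothetical helper. For a two-tailed test it counts only the upper critical region, since the lower tail contributes a negligible amount when the effect is positive.

```python
import math
from statistics import NormalDist

unit_normal = NormalDist()

def power_upper_tail(effect, sigma, n, alpha=0.05, two_tailed=True):
    """Approximate power to detect a positive effect (hypothetical helper)."""
    tail_area = alpha / 2 if two_tailed else alpha
    z_critical = unit_normal.inv_cdf(1 - tail_area)
    standard_error = sigma / math.sqrt(n)
    # Critical boundary expressed as a z-score in the treated distribution.
    z_boundary = z_critical - effect / standard_error
    return 1 - unit_normal.cdf(z_boundary)

baseline = power_upper_tail(effect=4, sigma=10, n=25)
larger_n = power_upper_tail(effect=4, sigma=10, n=100)
smaller_alpha = power_upper_tail(effect=4, sigma=10, n=25, alpha=0.01)
one_tailed = power_upper_tail(effect=4, sigma=10, n=25, two_tailed=False)

print(round(baseline, 3), round(larger_n, 3),
      round(smaller_alpha, 3), round(one_tailed, 3))
```

Relative to the baseline, the larger sample and the one-tailed test give higher power, and the smaller alpha gives lower power.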

Statistical Power for Hypothesis Testing

How to Calculate the Power of a Hypothesis Test

The previous slide was based upon a study from your book with μ = 80, σ = 10, and a sample (n = 25) that is drawn with an 8-point treatment effect (M = 88). What is the power of the related statistical test for detecting the difference between the population and sample mean?


Step #1: Calculate standard error for sample

In this step, we work from the population’s standard deviation (σ) and the sample size (n):

σM = σ / √n = 10 / √25 = 2


Step #2: Locate Boundary of Critical Region

In this step, we find the exact boundary of the critical region.

Pick a critical z-score based upon alpha (α = .05): z = +1.96. With the standard error from Step #1 (σM = 2), the boundary falls at M = μ + 1.96(σM) = 80 + 1.96(2) = 83.92.


Step #3: Calculate the z-score for the difference between the sample mean at the critical region boundary (M = 83.92) and the mean of the treated population with an 8-point treatment effect (μ = 88):

z = (M - μ) / σM = (83.92 - 88) / 2 = -2.04


Interpret Power of the Hypothesis Test

Find the probability associated with a z-score > -2.04.

Look this probability up as the proportion in the body of the normal distribution (column B in your textbook).

p = .9793

Thus, with a sample of 25 people and an 8-point treatment effect, 97.93% of the time the hypothesis test will conclude that there is a significant effect.
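The power calculation can be reproduced step by step in Python. This is a sketch that follows the three steps above with the slide's numbers; it uses `NormalDist` from the standard library in place of the unit normal table, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 80, 10, 25
treated_mean = 88     # mu plus the 8-point treatment effect
z_critical = 1.96     # upper cutoff for alpha = .05, two-tailed

# Step 1: standard error of the sample mean.
standard_error = sigma / math.sqrt(n)

# Step 2: sample mean that sits on the boundary of the critical region.
boundary_mean = mu + z_critical * standard_error

# Step 3: z-score of that boundary within the treated distribution.
z_boundary = (boundary_mean - treated_mean) / standard_error

# Power: proportion of the treated distribution above the boundary.
power = 1 - NormalDist().cdf(z_boundary)

print(round(boundary_mean, 2), round(z_boundary, 2), round(power, 4))
```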
