5/15/2015slide 1 solving the problem the one sample t-test compares two values for the population...

Upload: isabella-simmons

Post on 17-Dec-2015

TRANSCRIPT

04/18/23 Slide 1

SOLVING THE PROBLEM

The one-sample t-test compares two values for the population mean of a single variable. The two-sample test of population means compares the population means for two groups of subjects on a single variable. The null hypothesis for this test is: there is no difference between the population mean of the variable for one group of subjects and the population mean of the same variable for a second group of subjects.

In addition to our concern with the assumption of normality for each group, and with the number of cases in each group if we are to apply the Central Limit Theorem, this test also requires us to examine the spread or dispersion of both groups so that the measure of standard error used in the t-test fairly represents both groups.

While there is a test of Equality of Variance and a formula to use when the test is satisfied and a formula to use when the test is violated, the authors of our text suggest we always use the formula that assumes the test is violated. If we use this version of the statistic when the variances are in fact equal, the results of the test are comparable to what we would obtain using the formula for equal variances.

We will follow the authors' advice and restrict our attention to the “Equal variances not assumed” row in the SPSS output table without examining the Levene test of equality of variance.
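
The unequal-variances form of the statistic can be sketched outside SPSS as follows. This is a minimal illustration using only the Python standard library, not the SPSS implementation; the function name welch_t and the two short lists of scores are hypothetical and are not taken from GSS2000R.

```python
import math
import statistics


def welch_t(group1, group2):
    """Return (t, standard_error, df) for the unequal-variances t-test."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    # statistics.variance uses the n - 1 (sample) denominator.
    v1 = statistics.variance(group1) / n1
    v2 = statistics.variance(group2) / n2
    # The standard error combines each group's own variance.
    standard_error = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / standard_error
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, standard_error, df


# Hypothetical scores for illustration only (not GSS2000R data).
males = [52.0, 61.5, 38.2, 47.9, 70.1, 44.3]
females = [45.0, 50.2, 33.8, 62.4, 41.7, 39.9, 48.6]
t, se, df = welch_t(males, females)
print(f"t = {t:.3f}, SE = {se:.3f}, df = {df:.1f}")
```

Because each group contributes its own variance to the standard error, the result does not depend on the two groups having equal spread.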

04/18/23 Slide 2

The introductory statement in the question indicates:

• The data set to use (GSS2000R)

• The variables to use in the analysis: socioeconomic index [sei] for groups of survey respondents defined by the variable sex [sex]

• The task to accomplish (two-sample t-test for the difference between sample means)

• The level of significance (0.05, two-tailed)

04/18/23 Slide 3

The first statement asks about the level of measurement.

A two-sample t-test for the difference between sample means requires a quantitative dependent variable and a dichotomous independent variable.

04/18/23 Slide 4

"Socioeconomic index" [sei] is quantitative, satisfying the level of measurement requirement for the dependent variable. "Sex" [sex] is dichotomous, satisfying the level of measurement requirement for the independent variable.

Mark the statement as correct.

04/18/23 Slide 5

A two-sample t-test for the difference between sample means requires that the distribution of the variable satisfy the nearly normal condition for both groups. We will operationally define the nearly normal condition as having skewness and kurtosis between -1.0 and +1.0 for both groups, and not having any outliers with standard scores equal to or smaller than -3.0 or equal to or larger than +3.0 in the distribution of scores for either group.

To justify the use of probabilities based on a normal sampling distribution in testing hypotheses, either the distribution of the variable must satisfy the nearly normal condition or the size of the sample must be sufficiently large to generate a normal sampling distribution under the Central Limit Theorem.
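
The operational definition of the nearly normal condition can be sketched in code. The function below is a hypothetical helper, not an SPSS procedure. It uses the bias-adjusted sample skewness and excess kurtosis formulas that statistical packages commonly report; that these reproduce the SPSS output exactly is an assumption. It needs at least four scores and treats the -1.0 and +1.0 limits as exclusive, which is also an assumption.

```python
import math
import statistics


def nearly_normal(scores, limit=1.0, z_cutoff=3.0):
    """Apply the skewness, kurtosis, and outlier screens to one group."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    z = [(x - mean) / sd for x in scores]  # standard scores

    # Bias-adjusted sample skewness and excess kurtosis.
    skew = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    # An outlier is a standard score at or beyond -3.0 or +3.0.
    has_outlier = any(abs(v) >= z_cutoff for v in z)
    ok = -limit < skew < limit and -limit < kurt < limit and not has_outlier
    return ok, skew, kurt
```

The check would be run once per group, since the condition must hold for both groups.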

04/18/23 Slide 6

To evaluate the variable's conformity to the nearly normal condition, we will use descriptive statistics and standard scores.

We will first compute the standard scores.

To compute the standard scores, select the Descriptive Statistics > Descriptives command from the Analyze menu.

04/18/23 Slide 7

First, move the variable for the analysis sei to the Variable(s) list box.

Second, mark the check box Save standardized values as variables.

Third, click on the OK button to produce the output.

04/18/23 Slide 8

There were no outliers that had a standard score less than or equal to -3.0.

Sort the column Zsei in ascending order to show any negative outliers at the top of the column.

04/18/23 Slide 9

There were no outliers that had a standard score greater than or equal to +3.0.

Sort the column Zsei in descending order to show any positive outliers at the top of the column.

04/18/23 Slide 10

Next, we will use the Explore procedure to generate descriptive statistics for each gender.

To compute the descriptive statistics, select the Descriptive Statistics > Explore command from the Analyze menu.

04/18/23 Slide 11

First, move the dependent variable sei to the Dependent List.

Second, move the group variable sex to the Factor List.

Third, mark the option button to display Statistics only.

Fourth, click on the OK button to produce the output.

04/18/23 Slide 12

For survey respondents who were male, "socioeconomic index" satisfied the criteria for a normal distribution. The skewness of the distribution (0.539) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.852) was between -1.0 and +1.0.

04/18/23 Slide 13

For survey respondents who were female, "socioeconomic index" satisfied the criteria for a normal distribution. The skewness of the distribution (0.610) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.921) was between -1.0 and +1.0.

04/18/23 Slide 14

For survey respondents who were male, "socioeconomic index" satisfied the criteria for a normal distribution. The skewness of the distribution (0.539) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.852) was between -1.0 and +1.0. For survey respondents who were female, "socioeconomic index" satisfied the criteria for a normal distribution. The skewness of the distribution (0.610) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.921) was between -1.0 and +1.0.

There were no outliers that had a standard score less than or equal to -3.0 or greater than or equal to +3.0.

Mark the statement as correct.

04/18/23 Slide 15

Applying the Central Limit Theorem to a two-sample t-test for the difference between sample means requires that both groups defined by the independent variable have 40 or more cases.

Though we have satisfied the nearly normal condition and do not need to utilize the Central Limit Theorem to justify the use of probabilities based on the normal distribution, we will still examine the sample size.

04/18/23 Slide 16

There were 110 valid cases for survey respondents who were male and 145 valid cases for survey respondents who were female.

04/18/23 Slide 17

Both groups had 40 or more cases, so the Central Limit Theorem would be applicable. However, since the distribution of "socioeconomic index" satisfied the nearly normal condition, we do not need to rely upon the Central Limit Theorem to satisfy the sampling distribution requirements of a two-sample t-test for the difference between sample means.

Mark the statement as correct.
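
The logic of the sampling distribution requirement can be summarized in a short sketch. The function name and its arguments are hypothetical; the threshold of 40 cases per group and the group sizes come from the slides above.

```python
def sampling_requirement_met(nearly_normal, n_group1, n_group2, min_cases=40):
    """Return (requirement_met, clt_applicable) for the two-sample t-test."""
    # The Central Limit Theorem applies only if BOTH groups are large enough.
    clt_applicable = n_group1 >= min_cases and n_group2 >= min_cases
    # Either route is enough to justify the normal sampling distribution.
    return nearly_normal or clt_applicable, clt_applicable


# Group sizes from the output: 110 males and 145 females.
print(sampling_requirement_met(True, 110, 145))
```

In this problem both routes are satisfied, so the second value only confirms what the nearly normal condition already established.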

04/18/23 Slide 18

The next statement asks us to identify the mean for each group in the sample data and the standard error of the sampling distribution.

To answer this question, we need to produce the output for the two-sample t-test.

04/18/23 Slide 19

To produce the two-sample t-test (which SPSS calls Independent-Samples T-Test), select the Compare Means > Independent Samples T Test command from the Analyze menu.

04/18/23 Slide 20

First, move the variable sei to the Test Variable(s) list box.

Second, move the grouping variable sex to the text box.

Third, click on the Define Groups button to enter the group codes.

SPSS adds ?’s after the variable name to remind us that we need to specify the numeric codes for the groups.

04/18/23 Slide 21

First, enter 1 for males as Group 1.

Second, enter 2 for females as Group 2.

Third, click on the Continue button to close the dialog box.

If I did not remember the code numbers for male and female, I would look them up in the Variable View of the SPSS Data Editor.

04/18/23 Slide 22

Fourth, click on the OK button to produce the output.

SPSS replaces the question marks with the codes I entered.

04/18/23 Slide 23

The mean "socioeconomic index" for survey respondents who were male was 50.29 and the mean for survey respondents who were female was 47.51.

The standard error of the differences between group means was 2.446.

04/18/23 Slide 24

The mean "socioeconomic index" for survey respondents who were male was 50.29 and the mean for survey respondents who were female was 47.51. The standard error of the differences between group means was 2.446.

Mark the question as correct.

04/18/23 Slide 25

The next statement asks us about the null hypothesis for the two-sample t-test.

We should check to make certain the relationship is stated correctly.

04/18/23 Slide 26

The null hypothesis for the test is: there is no difference between the population mean of "socioeconomic index" for survey respondents who were male and the population mean of "socioeconomic index" for survey respondents who were female.

Since the hypothesis is stated correctly, mark the question as correct.

04/18/23 Slide 27

The next statement asks us to relate the t-test to the data in our problem.

04/18/23 Slide 28

Following the convention in the textbook, we will focus only on the “Equal variances not assumed” option. Within this option, the difference and standard error are correctly identified.

The t-test statistic is based on the difference between the means of the two groups (2.777) relative to the standard error of the differences between sample means (2.446).

04/18/23 Slide 29

The statement is correct and contains the correct values for both the difference in means and the sampling error that we would typically expect to find in the sampling distribution for differences in means.

Mark the statement as correct.

04/18/23 Slide 30

The next statement asks about the probability for the comparison made by the t-test, i.e., the probability of obtaining a difference between sample means at least this large if the population means for the two groups were not different.

In the last question, the difference in means was only slightly larger than the standard error of the differences, so we should expect a ratio near one and a high value for the probability.

04/18/23 Slide 31

If the population mean for survey respondents who were male were not different from the population mean for survey respondents who were female, the probability of obtaining a difference between the sample means (50.3 versus 47.5) at least this large was p = .257.

04/18/23 Slide 32

If the population mean for survey respondents who were male were not different from the population mean for survey respondents who were female, the probability of obtaining a difference between the sample means (50.3 versus 47.5) at least this large was p = .257.

Since the probability was correctly stated, mark the question as true.
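
As a rough cross-check of the reported values, the t statistic and an approximate p-value can be recomputed from the difference and standard error in the output. The sketch below treats t as a standard normal score, a large-sample shortcut that is an assumption here. SPSS uses the t distribution with its own degrees of freedom, which this transcript does not report, so the result should land close to, but not exactly on, the reported p = .257.

```python
import math

difference = 2.777       # difference between group means, from the output
standard_error = 2.446   # standard error of the difference, from the output

t = difference / standard_error

# Large-sample shortcut (an assumption): treat t as a standard normal z.
# Normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
phi = 0.5 * (1.0 + math.erf(abs(t) / math.sqrt(2.0)))
p_two_tailed = 2.0 * (1.0 - phi)

print(f"t = {t:.3f}, approximate two-tailed p = {p_two_tailed:.3f}")
```

A ratio this close to one leaves a large share of the sampling distribution in the two tails, which is why the probability is high.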

04/18/23 Slide 33

When the p-value for the statistical test is less than or equal to alpha, we reject the null hypothesis and interpret the results of the test. If the p-value is greater than alpha, we fail to reject the null hypothesis and do not interpret the result.

04/18/23 Slide 34

The p-value for this test (p = .257) is larger than the alpha level of significance (alpha = .050), supporting the conclusion to fail to reject the null hypothesis.

The check box is not marked.
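
The decision rule can be written as a small function. This is only a sketch of the comparison between the p-value and alpha; the function name decide is hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Return the decision for a test at the given level of significance."""
    # Reject only when the p-value is less than or equal to alpha.
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# The p-value from this problem, tested at alpha = .05.
print(decide(0.257))  # prints "fail to reject the null hypothesis"
```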

04/18/23 Slide 35

The final statement asks us to interpret the result of our statistical test as a finding in the context of the problem we created.

We only interpret the results when the null hypothesis is rejected.

04/18/23 Slide 36

If we had a significant p-value, we would have looked at the means of the two groups to identify the direction of the relationship.

04/18/23 Slide 37

Since we did not have a significant p-value, we fail to reject the null hypothesis and do not interpret the relationship.

The check box is not marked.

04/18/23 Slide 38

Dependent variable is quantitative?

No: Do not mark check box. Mark only “None of the above.” Stop.

Yes: Independent variable is dichotomous?

No: Do not mark check box. Mark only “None of the above.” Stop.

Yes: Mark statement check box.

04/18/23 Slide 39

Nearly normal distribution?

Nearly normal:

• Skewness and kurtosis between -1.0 and +1.0 for both groups

• Z-scores between -3.0 and +3.0

Yes: Mark statement check box.

No: Do not mark check box.

CLT applicable (sample size ≥ 40 in each group)?

CLT stands for Central Limit Theorem.

Yes: Mark statement check box.

No: Do not mark check box. Stop.

If the variable is not normal and the sample size is less than 40, the test is not appropriate.

04/18/23 Slide 40

We will check the applicability of the Central Limit Theorem based on sample size, even when our data satisfies the nearly normal condition.

04/18/23 Slide 41

Sample means and standard error correct?

Yes: Mark statement check box.

No: Do not mark check box.

H0: no difference between population means?

Yes: Mark statement check box.

No: Do not mark check box.

04/18/23 Slide 42

T-test accurately described?

Yes: Mark statement check box.

No: Do not mark check box.

P-value (sig.) stated correctly?

Yes: Mark statement check box.

No: Do not mark check box.

04/18/23 Slide 43

Reject H0 is correct decision (p ≤ alpha)?

Yes: Mark statement check box.

No: Do not mark check box. Stop.

We interpret results only if we reject the null hypothesis.

Interpretation is stated correctly?

Yes: Mark statement check box.

No: Do not mark check box.