Running head: SPSS/Excel Project 1
SPSS/Excel Project
Andrea Freeman
Seattle Pacific University
SPSS/Excel Project
Part I: Frequency Analysis
A study was conducted in order to determine the statistical relationship between school
funding and academic performance. Twelve variables were identified: expenditure per pupil,
pupil/teacher ratio, salary, eligibility, reading (Verbal SAT) scores, Math SAT scores, Writing
SAT scores, region, SES (Socioeconomic Status determined by free and reduced-price lunch),
students, IDEA, and money (revenues). There is a belief that the data does not support the
conclusion that funding affects academic achievement.
There are four variables, with random samples taken from public schools in all 50 states
and the District of Columbia, that may shed some light on whether or not there is a possible
relationship between funding and student scores:
Variable: Current Expenditure per Pupil In Average Daily Attendance
The histogram in Figure 1 shows the frequency distribution of school expenditures per pupil. It
indicates that there is a positive skew to the distribution of expenditure scores. The majority of
the scores fall to the left end of the distribution and a few scores fall on the right end, indicating a
higher expenditure. This means there are some possible outliers that could distort the mean of
the distribution. According to the statistics in Table 1, the mean of the expenditure distribution is
$10,327.78 and the median is $9,805.00; a mean that exceeds the median is consistent with a positive skew (Sk+). The box plot for
the distribution, Figure 2, indicates that there is one score, sample 5, which is an outlier in the
South, Washington DC, with an expenditure of $18,339 per pupil. This score is greater than
three standard deviations above the mean in this distribution, causing the skew in the
distribution. The box plot in Figure 2 also indicates that there is a possible difference between
the regions in expenditure per pupil. The Northeast appears to differ, possibly significantly,
from the other three regions.
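The three-standard-deviation rule used above can be checked directly from the summary statistics reported in Table 1. The following is a minimal Python sketch and is not part of the original SPSS/Excel analysis; the function name `upper_outlier_cutoff` is hypothetical, and the numbers are simply copied from Table 1.

```python
def upper_outlier_cutoff(mean, sd, k=3.0):
    """Return the score lying k standard deviations above the mean."""
    return mean + k * sd

# Values copied from Table 1 (current expenditure per pupil, 2005-06).
expenditure_mean = 10327.78
expenditure_sd = 2502.202
expenditure_max = 18339  # highest score in the distribution

cutoff = upper_outlier_cutoff(expenditure_mean, expenditure_sd)
print(f"3-SD cutoff: {cutoff:.2f}")
print("Maximum exceeds cutoff:", expenditure_max > cutoff)
```

The same function can be reused for the other variables by substituting their means and standard deviations from Table 1.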
Table 1
Statistics Data for 12 Variables with 51 Samples (50 states and Washington DC)
Statistics
Variable                                                                                          N Valid  Missing  Mean         Median      Std. Deviation  Skewness  Std. Error of Skewness  Minimum    Maximum
Current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06  51       0        10327.78     9805.00     2502.202        1.16      .33                     5960       18339
Average pupil/teacher ratio Fall 2005                                                             51       0        15.23        14.80       2.54            .75       .33                     10.80      22.10
Estimated ave salary 2005-2006                                                                    51       0        47679.08     45575.00    6942.01         .60       .33                     35607.00   61372.00
Percentage of all eligible students taking the SAT 2006-07                                        51       0        39.33        32.00       31.12           .25       .33                     3.0        100.0
Average verbal SAT score 2005-06                                                                  51       0        534.94       523.00      37.80           .31       .33                     482.00     610.00
Average math SAT score 2005-06                                                                    51       0        540.59       529.00      37.46           .47       .33                     472.00     617.00
Average writing SAT score 2005-06                                                                 51       0        525.37       511.00      37.63           .30       .33                     472.00     591.00
% of students eligible for free/reduced lunch 2006-07                                             50       1        39.82        37.35       10.64           .62       .34                     17.70      67.50
% of students with disabilities 2006-07                                                           51       0        14.22        14.30       2.13            .23       .33                     10.50      19.90
Total revenues for the year 2005-06 (in thousands)                                                51       0        10208705.02  6346033.00  12216880.36     2.61      .33                     958109.00  63785872.00
Enrollment Fall 2005                                                                              51       0        963005.84    654526.00   1145966.304     2.98      .33                     76876.00   6437202.00
Figure 1. Histogram of Current Expenditure per Pupil in Average Daily Attendance in Public Elementary and Secondary Schools 2005-2006
Figure 2. Box Plot of Current Expenditure per Pupil in Average Daily Attendance in Public Elementary and Secondary Schools 2005-2006
Variable: Total Revenues for the Year 2005-06 (in thousands)
The histogram in Figure 3 represents the total revenues for each sample for the year 2005-06, in
thousands. The visual representation of the distribution shows a definite positive skew. The
bulk of the samples fall on the left side of the distribution and a few scores are more than three
standard deviations from the mean. With a standard deviation of 12216880.36, any score that is
over 46859346.10 is going to be an outlier. The histogram shows that two scores are much
higher than three standard deviations from the mean of 10208705.02. This indicates that the
mean of the distribution is probably distorted by the two samples that have higher
revenues. Also, compared to the mean, the median of the distribution is much lower at
6346033.00. Sprinthall (2007) states, “the greater the distance between the mean and the
median, the greater is the total amount of Sk” (p. 45). Finally, the data in Table
1 and the histogram in Figure 3 show that the mean lies to the right of the median, indicating a
positive skew (Sk+). A possible source of the skew could be California. California has total
revenue of 63785872, which is an outlier in the West region. The box plot in Figure 4, which shows
the distribution of revenues for each region, supports these observations and sheds further light on
these outliers. For the region of the Northeast, the box is shifted toward the high end, which
means that there is a negative skew. This also occurs in the region of the West. There seem to
be three outliers identified in these four regions. The asterisk in the Western region has a label
of 5, which indicates the sample from California. Also, there are two points outside the whiskers
of the box plot for the Southern region. The indicators in the South are 10, for Florida, and 44,
for Texas. Despite these outliers and the skew, the medians, indicated by the lines in the middle
of each box, for these box plots seem to be relatively similar.
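The link between the mean, the median, and the direction of skew discussed above can be illustrated with a short calculation. The sketch below uses Pearson's second skewness coefficient, 3(mean − median)/SD, which is a standard textbook measure swapped in for illustration; it is not the skewness statistic that SPSS reports in Table 1, and the function name is hypothetical.

```python
def pearson_median_skewness(mean, median, sd):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    return 3.0 * (mean - median) / sd

# Values copied from Table 1 (total revenues, in thousands).
revenue_skew = pearson_median_skewness(10208705.02, 6346033.00, 12216880.36)
direction = "positive" if revenue_skew > 0 else "negative or none"
print(f"Pearson median skewness: {revenue_skew:.2f} ({direction})")
```

A positive value means the mean lies above the median, which is the pattern described for the revenue distribution.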
Figure 3. Histogram of Total Revenues For the Year 2005-06 (in thousands)
Figure 4. Box Plot of Total Revenues For the Year 2005-06 (in thousands)
Variable: Average Verbal SAT Score, 2005-06
The histogram in Figure 5 indicates that the largest number of scores, the mode and the tallest
bar in the graph, fall between 490 and 500. With a mean of 534.94 and a median of 523, there is
a very slight skew to the right. This skew becomes clearer when looking at the box plot in
Figure 6. The medians of the West, South, and Northeast look relatively similar, though the West is slightly
higher than the South and the Northeast and its box is shifted toward the top. The distribution of the
South lies mostly above its median, as does that of the West. Samples 36 (Ohio) and 15 (Indiana) are
outliers for the Midwest. Figure 6 shows that the Midwest is possibly different from the other regions
because of these outliers, which lie below the low end of that distribution. Also, the
box plot for the Midwest shows that the box representing the middle 50% of the scores is shifted to
the low end. The median line for the Midwest is also very different from those of the other three regions.
Taken together, Figures 5 and 6 indicate that there is a positive skew and that
its source is likely to be found in the Midwest, specifically Ohio and Indiana, even
though the histogram in Figure 5 shows no scores that lie more than three standard
deviations above the mean (that is, above 648.34). Ohio and Indiana could be distorting the mean.
Figure 5. Histogram of the Average Verbal SAT Score for 2005-06
Figure 6. Box Plot of the Average Verbal SAT Score 2005-06 for the Four Regions
Variable: Average Writing SAT Score, 2005-06
Table 1 indicates that the mean of this variable's distribution is 525.37 and the median is 511.00, with
a standard deviation of 37.63. This information alone does not indicate that any
outliers are skewing the distribution, because none of the scores lie more than three standard deviations
above the mean (that is, above 638.26). The histogram in Figure 7 does show that there may be something skewed
about the scores. The mode lies on the far left side of the distribution, between 475 and 500. This
places it to the left of the mean. The distribution isn’t quite bimodal, but there are definitely two
bumps, one on either side of the mean. Because the mean lies to the right of the median, the difference
between them suggests a skew to the right. To get more
information about the distribution and possible skew, the box plot in Figure 8 needs to be consulted.
Again, sample 36 (Ohio) and sample 15 (Indiana) fall below the distribution of the Midwest
box plot. The median of the Midwest also lies well above those of the other regions. The West shows some
slight difference as well, but the medians of the South and the Northeast are relatively similar. In fact,
the distributions of verbal (reading) and writing SAT scores look relatively similar when the box plots
are compared, except that the medians of the Midwest are slightly different. Just like the box for the
West in Figure 6, the box for the West in Figure 8 is shifted to the top. Also, the scores in the
distributions for the West and the South lie mostly above the median.
Figure 7. Histogram of Average Writing SAT Scores 2005-06
Figure 8. Box Plot of Average Writing SAT Scores 2005-06 by Region
Categorical Variable:
There is one categorical variable among the twelve variables in the data, region. Table 2 shows
that there are four regions: West, Midwest, South, and Northeast. There are 13 samples (States)
from the West, 12 samples from the Midwest, 17 from the South, and 9 from the Northeast
(including Washington DC) for a total of 51 samples. The largest number of samples is from the
South.
Table 2
Frequency Distribution of Samples by Region
region
Frequency Percent Valid Percent Cumulative Percent
Valid West 13 25.5 25.5 25.5
Midwest 12 23.5 23.5 49.0
South 17 33.3 33.3 82.4
Northeast 9 17.6 17.6 100.0
Total 51 100.0 100.0
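The percent and cumulative percent columns in Table 2 follow directly from the region counts. As a minimal sketch (plain Python, not the SPSS procedure used for the paper; the variable names are hypothetical), they can be reproduced as follows.

```python
from itertools import accumulate

# Region counts copied from Table 2.
counts = {"West": 13, "Midwest": 12, "South": 17, "Northeast": 9}
total = sum(counts.values())

# Percent of all samples in each region, rounded to one decimal place.
percents = {region: round(100 * n / total, 1) for region, n in counts.items()}

# Running totals converted to cumulative percents.
running = list(accumulate(counts.values()))
cumulative = {region: round(100 * c / total, 1)
              for region, c in zip(counts, running)}

for region in counts:
    print(f"{region:<10}{counts[region]:>4}{percents[region]:>7}{cumulative[region]:>7}")
```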
Part II
Do the regions differ in terms of:
Variable: Expenditure Per Pupil
In order to find out if the expenditure differs by region, the ANOVA test was used (Table 4).
First, Levene’s Test of Equality of Error Variances (Table 3) is used to discover if equal variance
can be assumed. Table 3 shows a p value of .24, which is greater than .05; therefore
homogeneity can be assumed. The ANOVA results in Table 4 show p < .001, which is
less than .05 and significant. This indicates a significant difference in expenditure per pupil among the regions, but
it doesn’t show exactly where. The Tukey HSD test (Table 5) is performed to identify where
this difference lies. The test reveals a clear pattern of group differences. In
Table 5, the asterisk (*) marks the mean differences with p values less than .05, which are therefore
significant. The table indicates that there is a significant mean difference between the Northeast
and each of the other regions. Also, eta squared is .38, which indicates a strong effect.
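The F ratio and partial eta squared reported in Table 4 can be recomputed from the sums of squares and degrees of freedom in that table. The sketch below assumes a one-way between-subjects design, which matches the single factor (region) shown in the SPSS output; the function name `anova_summary` is hypothetical.

```python
def anova_summary(ss_effect, df_effect, ss_error, df_error):
    """Return (F, partial eta squared) for a one-way between-subjects ANOVA."""
    ms_effect = ss_effect / df_effect      # mean square for the effect
    ms_error = ss_error / df_error         # mean square for error
    f_ratio = ms_effect / ms_error
    partial_eta_sq = ss_effect / (ss_effect + ss_error)
    return f_ratio, partial_eta_sq

# Sums of squares and degrees of freedom copied from Table 4.
f_ratio, eta_sq = anova_summary(120097648.80, 3, 192953101.83, 47)
print(f"F(3, 47) = {f_ratio:.2f}, partial eta squared = {eta_sq:.2f}")
```

The same function can be applied to the region and error rows of the other Tests of Between-Subjects Effects tables in this part.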
Table 3
Levene's Test of Equality of Error Variancesa
Dependent Variable:current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06
F df1 df2 P
1.45 3 47 .24
Table 4
Tests of Between-Subjects Effects
Dependent Variable:current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06
Source  Type III Sum of Squares  df  Mean Square  F  p  Partial Eta Squared
region 120,097,648.80 3 40,032,549.60 9.75 .00 .38
Error 192,953,101.83 47 4,105,385.15
Total 5,752,870,321.00 51
Corrected Total 313,050,750.63 50
Table 5
Multiple Comparisons
current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06
Tukey HSD
(I) region  (J) region  Mean Difference (I-J)  Std. Error  P  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest -660.49 811.12 .85 -2820.82 1499.83
South -475.96 746.52 .92 -2464.23 1512.31
Northeast -4356.52* 878.61 .00 -6696.60 -2016.45
Midwest West 660.49 811.12 .85 -1499.83 2820.82
South 184.53 763.94 1.00 -1850.14 2219.21
Northeast -3696.03* 893.46 .00 -6075.66 -1316.40
South West 475.96 746.52 .92 -1512.31 2464.23
Midwest -184.53 763.94 1.00 -2219.21 1850.14
Northeast -3880.56* 835.25 .00 -6105.17 -1655.96
Northeast West 4356.52* 878.61 .00 2016.45 6696.60
Midwest 3696.03* 893.46 .00 1316.40 6075.66
South 3880.56* 835.25 .00 1655.96 6105.17
*. The mean difference is significant at the .05 level.
Variable: Pupil/Teacher Ratio
In order to discover if there is a difference between the regions in terms of the variable
pupil/teacher ratio, the same series of tests needs to be performed. In this case, Levene’s Test of
Equality of Error Variances (Table 6) indicates a p value of .01, which means that homogeneity
cannot be assumed because it is not greater than .05. However, an ANOVA can still be performed
because it is a “robust” test. The ANOVA results (Table 7) show p < .001, which is less
than .05; therefore, a significant difference exists. Because, in this case, homogeneity cannot
be assumed, the Dunnett C (Table 8) is used to discover the source of these differences. Table 8
shows that there is a significant mean difference between the West and each of the other regions. In
addition, there is a significant mean difference between the Northeast and the South.
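The decision rule applied throughout this part (Levene's test first, then the choice of post hoc test) can be written out as a small function. This is a sketch of the procedure as the paper describes it, not of SPSS itself; the function name `choose_post_hoc` is hypothetical, and the p values are copied from Table 3 and Table 6.

```python
def choose_post_hoc(levene_p, alpha=0.05):
    """Pick the follow-up test based on Levene's test, as described in the text."""
    if levene_p > alpha:
        return "Tukey HSD"   # homogeneity of variance can be assumed
    return "Dunnett C"       # homogeneity of variance cannot be assumed

# Levene p values copied from Table 3 (expenditure) and Table 6 (pupil/teacher ratio).
print("Expenditure per pupil:", choose_post_hoc(0.24))
print("Pupil/teacher ratio:", choose_post_hoc(0.01))
```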
Table 6
Levene's Test of Equality of Error Variancesa
Dependent Variable:average pupil/teacher ratio Fall 2005
F df1 df2 P
4.92 3 47 .01
Table 7
Tests of Between-Subjects Effects
Dependent Variable:average pupil/teacher ratio Fall 2005
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 146.88 3 48.96 13.08 .00 .46
Error 175.92 47 3.74
Total 12145.39 51
Corrected Total 322.80 50
a. R Squared = .455 (Adjusted R Squared = .420)
Table 8
Multiple Comparisons
average pupil/teacher ratio Fall 2005
Dunnett C
(I) region  (J) region  Mean Difference (I-J)  Std. Error  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest 3.00* .94 .19 5.81
South 2.94* .86 .40 5.49
Northeast 5.07* .94 2.23 7.92
Midwest West -3.00* .94 -5.81 -.19
South -.06 .58 -1.78 1.67
Northeast 2.08 .69 -.08 4.22
South West -2.94* .86 -5.49 -.40
Midwest .06 .58 -1.67 1.78
Northeast 2.13* .58 .34 3.92
Northeast West -5.07* .94 -7.92 -2.23
Midwest -2.08 .69 -4.22 .08
South -2.13* .58 -3.92 -.34
Variable: Estimated Average Salary 2005-06
As before, the first step is to figure out if homogeneity can be assumed. Levene’s Test of
Equality of Error Variances (Table 9) is used. Table 9 shows a p value of .66, which is
greater than .05, so homogeneity can be assumed. This leads to the ANOVA test results in Table
10. The p value is .02, which is less than .05, so the results are significant. Eta squared,
however, is .18, which does not indicate a strong effect. Nevertheless, Tukey’s HSD (Table 11)
can be performed to figure out where those mean differences occurred. The only significant
mean difference in Table 11 at the .05 alpha level exists between the South and the Northeast.
Table 9
Levene's Test of Equality of Error Variancesa
Dependent Variable:estimated ave salary 2005-2006
F df1 df2 P
.53 3 47 .66
Table 10
Tests of Between-Subjects Effects
Dependent Variable:estimated ave salary 2005-2006
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 434910474.84 3 144970158.28 3.45 .02 .18
Error 1974666324.85 47 42014177.12
Total 118347597323.00 51
Corrected Total 2409576799.69 50
Table 11
Multiple Comparisons
estimated ave salary 2005-2006
Tukey HSD
(I) region  (J) region  Mean Difference (I-J)  Std. Error  Sig.  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest 910.88 2594.81 .99 -6000.11 7821.88
South 1506.03 2388.15 .92 -4854.55 7866.62
Northeast -6641.50 2810.72 .10 -14127.52 844.52
Midwest West -910.88 2594.81 .99 -7821.87 6000.11
South 595.15 2443.89 1.00 -5913.89 7104.18
Northeast -7552.39 2858.22 .05 -15164.94 60.16
South West -1506.03 2388.15 .92 -7866.62 4854.55
Midwest -595.15 2443.89 1.00 -7104.18 5913.89
Northeast -8147.54* 2672.01 .02 -15264.15 -1030.92
Northeast West 6641.50 2810.71 .10 -844.52 14127.52
Midwest 7552.39 2858.22 .05 -60.16 15164.94
South 8147.54* 2672.01 .02 1030.92 15264.15
Variable: Percentage of All Eligible Students Taking the SAT, 2006-07
In this case, the test for equality of variances (Table 12) indicates that homogeneity can’t be
assumed because p < .001, which is less than .05. This means that the ANOVA can be used,
but it is followed by the Dunnett C instead of the Tukey HSD test. The ANOVA (Table 13) shows
p < .001, which is less than .05 and indicates a significant difference. Eta squared (.52) also
indicates a strong effect. To find out where this mean difference is, the Dunnett C
(Table 14) is performed. The results in Table 14 indicate that there is a significant mean
difference between the Northeast and each of the other regions; a significant mean difference
also exists between the South and the Midwest.
Table 12
Levene's Test of Equality of Error Variancesa
Dependent Variable: percentage of all eligible students taking the SAT 2006-07
F df1 df2 P
16.53 3 47 .00
Table 13
Tests of Between-Subjects Effects
Dependent Variable:percentage of all eligible students taking the SAT 2006-07
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 24959.33 3 8319.78 16.66 .00 .52
Error 23472.00 47 499.40
Total 127334.00 51
Corrected Total 48431.33 50
a. R Squared = .515 (Adjusted R Squared = .484)
Table 14
Multiple Comparisons
percentage of all eligible students taking the SAT 2006-07
Dunnett C
(I) region  (J) region  Mean Difference (I-J)  Std. Error  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest 20.80 7.12 -.48 42.07
South -6.89 9.14 -33.35 19.57
Northeast -47.98* 6.26 -67.02 -28.95
Midwest West -20.80 7.12 -42.07 .48
South -27.69* 8.92 -53.59 -1.79
Northeast -68.78* 5.94 -87.03 -50.53
South West 6.89 9.14 -19.57 33.35
Midwest 27.69* 8.92 1.79 53.59
Northeast -41.09* 8.25 -65.17 -17.01
Northeast West 47.98* 6.26 28.95 67.02
Midwest 68.78* 5.94 50.53 87.03
South 41.09* 8.25 17.01 65.17
*. The mean difference is significant at the .05 level.
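The Dunnett C tables report no p values, so significance at the .05 level is read from the 95% confidence interval: a mean difference is flagged when its interval does not contain zero. The sketch below illustrates that reading with the bounds from Table 14; it does not reproduce the Dunnett C computation itself, and the function name is hypothetical.

```python
def interval_excludes_zero(lower, upper):
    """True when a confidence interval lies entirely above or below zero."""
    return lower > 0 or upper < 0

# (I region, J region): 95% confidence bounds copied from Table 14.
intervals = {
    ("West", "Midwest"): (-0.48, 42.07),
    ("West", "South"): (-33.35, 19.57),
    ("West", "Northeast"): (-67.02, -28.95),
    ("Midwest", "South"): (-53.59, -1.79),
    ("Midwest", "Northeast"): (-87.03, -50.53),
    ("South", "Northeast"): (-65.17, -17.01),
}

significant_pairs = [pair for pair, (lo, hi) in intervals.items()
                     if interval_excludes_zero(lo, hi)]
print(significant_pairs)
```

The pairs selected this way are the same ones marked with an asterisk in Table 14.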
Variable: Average Verbal SAT Score 2005-06
Levene’s test in Table 15 has p < .001, which is less than .05, so homogeneity cannot be
assumed. The ANOVA is therefore used, followed by the Dunnett C. In Table 16, the ANOVA shows
p < .001, which is less than .05, so the result is significant. Eta squared (.43) shows that there is a
strong effect. The Dunnett C in Table 17 shows that these mean differences occur mostly
between the Midwest and the other three regions. There is also a significant mean difference,
at the .05 level, between the Northeast and the West.
Table 15
Levene's Test of Equality of Error Variancesa
Dependent Variable:average verbal SAT score 2005-06
F df1 df2 P
6.57 3 47 .00
Table 16
Tests of Between-Subjects Effects
Dependent Variable:average verbal SAT score 2005-06
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 30986.00 3 10328.67 12.00 .00 .43
Error 40450.83 47 860.66
Total 14665702.00 51
Corrected Total 71436.82 50
a. R Squared = .434 (Adjusted R Squared = .398)
Table 17
Multiple Comparisons
average verbal SAT score 2005-06
Dunnett C
(I) region  (J) region  Mean Difference (I-J)  Std. Error  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest -47.81* 11.31 -81.68 -13.94
South 1.93 11.26 -30.74 34.60
Northeast 24.69* 7.73 1.37 48.02
Midwest West 47.81* 11.31 13.94 81.68
South 49.74* 12.63 12.65 86.82
Northeast 72.50* 9.62 43.31 101.69
South West -1.93 11.26 -34.60 30.74
Midwest -49.74* 12.63 -86.82 -12.65
Northeast 22.77 9.56 -5.02 50.55
Northeast West -24.69* 7.73 -48.02 -1.37
Midwest -72.50* 9.62 -101.69 -43.31
South -22.77 9.56 -50.55 5.02
*. The mean difference is significant at the .05 level.
Variable: Average Math SAT Score 2005-06
Again, Levene’s test (Table 18) indicates that homogeneity cannot be assumed because
p < .001, which is less than .05. This is followed by the ANOVA (Table 19), which shows p
< .001, which is less than .05, so it is significant. The eta squared value of .52 indicates that there
is a strong effect. The Dunnett C (Table 20) is used to find where the mean differences lie. The
results in Table 20 show that the Midwest is different from all the other regions. Another
difference lies between the West and the Northeast.
Table 18
Levene's Test of Equality of Error Variancesa
Dependent Variable:average math SAT score 2005-06
F df1 df2 P
5.82 3 47 .00
Table 19
Tests of Between-Subjects Effects
Dependent Variable:average math SAT score 2005-06
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 36447.57 3 12149.19 16.94 .00 .52
Error 33708.78 47 717.21
Total 14974174.00 51
Corrected Total 70156.35 50
a. R Squared = .520 (Adjusted R Squared = .489)
Table 20
Multiple Comparisons
average math SAT score 2005-06
Dunnett C
(I) region  (J) region  Mean Difference (I-J)  Std. Error  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest -51.84* 10.39 -83.01 -20.67
South 8.02 9.67 -19.96 36.00
Northeast 22.75* 6.11 4.23 41.26
Midwest West 51.84* 10.39 20.67 83.01
South 59.86* 12.14 24.15 95.57
Northeast 74.58* 9.548 45.67 103.50
South West -8.02 9.67 -35.99 19.96
Midwest -59.86* 12.14 -95.57 -24.14
Northeast 14.73 8.75 -10.71 40.16
Northeast West -22.74* 6.11 -41.26 -4.23
Midwest -74.58* 9.54 -103.50 -45.67
South -14.73 8.75 -40.16 10.71
*. The mean difference is significant at the .05 level.
Variable: Average Writing SAT Score 2005-2006
Levene’s test (Table 21) shows p < .001, which is less than .05, so homogeneity cannot be assumed.
This means that, after the ANOVA, the Dunnett C will be used if the ANOVA shows a
significant difference. Table 22 shows p < .001, which indicates a significant difference. Eta
squared is .38, which is a large effect. The Dunnett C comparison test (Table 23) indicates that there is a
significant mean difference, at the .05 alpha level, between the Midwest and each of the other three
regions.
Table 21
Levene's Test of Equality of Error Variancesa
Dependent Variable:average writing SAT score 2005-06
F df1 df2 P
8.01 3 47 .00
Table 22
Tests of Between-Subjects Effects
Dependent Variable:average writing SAT score 2005-06
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 26942.23 3 8980.74 9.62 .00 .38
Error 43857.69 47 933.14
Total 14147632.00 51
Corrected Total 70799.92 50
a. R Squared = .381 (Adjusted R Squared = .341)
Table 23
Multiple Comparisons
average writing SAT score 2005-06
Dunnett C
(I) region  (J) region  Mean Difference (I-J)  Std. Error  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest -49.17* 11.38 -83.24 -15.10
South -5.82 11.81 -40.07 28.42
Northeast 17.78 7.95 -6.24 41.80
Midwest West 49.17* 11.389 15.10 83.24
South 43.34* 13.06 5.06 81.63
Northeast 66.94* 9.71 37.43 96.46
South West 5.82 11.81 -28.42 40.07
Midwest -43.34* 13.06 -81.63 -5.06
Northeast 23.60 10.22 -6.10 53.30
Northeast West -17.78 7.95 -41.80 6.24
Midwest -66.94* 9.71 -96.46 -37.43
South -23.60 10.22 -53.30 6.10
*. The mean difference is significant at the .05 level.
Variable: Percent of Students Eligible for Free/Reduced Lunch 2006-07
Table 24 shows that, after using Levene’s test, the p value is .29. This means that homogeneity
can be assumed because it is greater than .05, and the Tukey HSD test can follow the ANOVA if
a significant difference is found. In this case, Table 25 shows a significant difference, p < .001. Also,
eta squared indicates a strong effect with a value of .49. These tests are followed by Tukey’s
HSD test (Table 26), which shows a significant mean difference between the South and each of the other
three regions. There is also a significant mean difference between the Northeast and the West,
but not between the Northeast and the Midwest.
Table 24
Levene's Test of Equality of Error Variancesa
Dependent Variable:% of students eligible for free/reduced lunch 2006-07
F df1 df2 P
1.30 3 46 .29
Table 25
Tests of Between-Subjects Effects
Dependent Variable:% of students eligible for free/reduced lunch 2006-07
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 2732.83 3 910.94 14.87 .00 .49
Error 2817.70 46 61.25
Total 84848.08 50
Corrected Total 5550.53 49
a. R Squared = .492 (Adjusted R Squared = .459)
Table 26
Multiple Comparisons
% of students eligible for free/reduced lunch 2006-07
Tukey HSD
(I) region  (J) region  Mean Difference (I-J)  Std. Error  P  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest 5.14 3.20 .38 -3.38 13.66
South -9.57* 2.95 .01 -17.43 -1.70
Northeast 9.79* 3.45 .03 .59 18.99
Midwest West -5.14 3.20 .38 -13.66 3.38
South -14.71* 2.95 .00 -22.58 -6.84
Northeast 4.65 3.45 .54 -4.55 13.85
South West 9.57* 2.95 .01 1.70 17.43
Midwest 14.71* 2.95 .00 6.84 22.58
Northeast 19.36* 3.23 .00 10.76 27.96
Northeast West -9.79* 3.45 .03 -18.99 -.59
Midwest -4.65 3.45 .54 -13.85 4.55
South -19.36* 3.23 .00 -27.96 -10.76
*. The mean difference is significant at the .05 level.
Variable: Percentage of Students With Disabilities
Levene’s test in Table 27 indicates that homogeneity can be assumed because .07 is greater than
.05. This leads to Table 28, the ANOVA test, which shows p < .001, a significant
difference. Eta squared indicates that there is a strong effect at .40. Tukey’s HSD in Table 29
shows that the mean difference is significant between the Northeast and both the South and
the West. The mean difference between the Midwest and the West is also significant.
Table 27
Levene's Test of Equality of Error Variancesa
Dependent Variable:% of students with disabilities 2006-07
F df1 df2 P
2.55 3 47 .07
Table 28
Tests of Between-Subjects Effects
Dependent Variable:% of students with disabilities 2006-07
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 89.78 3 29.93 10.25 .00 .40
Error 137.19 47 2.92
Total 10533.34 51
Corrected Total 226.97 50
a. R Squared = .396 (Adjusted R Squared = .357)
Table 29
Multiple Comparisons
% of students with disabilities 2006-07
Tukey HSD
(I) region  (J) region  Mean Difference (I-J)  Std. Error  P  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest -2.49* .68 .00 -4.31 -.67
South -1.58 .63 .07 -3.26 .10
Northeast -3.94* .74 .00 -5.91 -1.96
Midwest West 2.49* .68 .00 .67 4.31
South .91 .64 .50 -.80 2.63
Northeast -1.44 .75 .24 -3.45 .56
South West 1.58 .63 .07 -.10 3.26
Midwest -.91 .64 .50 -2.63 .80
Northeast -2.36* .70 .01 -4.23 -.48
Northeast West 3.94* .74 .00 1.96 5.91
Midwest 1.44 .75 .24 -.56 3.45
South 2.36* .70 .01 .48 4.23
*. The mean difference is significant at the .05 level.
Variable: Total Revenues for the Year 2005-06 (in thousands)
For this final variable, Levene’s test (Table 30) shows that homogeneity can be assumed based on a p value of .52, which is
greater than .05. Table 31 shows that there is not a significant difference because p > .05 (p = .83), and the
eta squared value also indicates a very small effect, .02. The Tukey HSD test in Table 32 also
shows that there isn’t a significant mean difference between any pair of regions.
Table 30
Levene's Test of Equality of Error Variancesa
Dependent Variable:Total revenues for the year 2005-06 (in thousands)
F df1 df2 P
.77 3 47 .52
Table 31
Tests of Between-Subjects Effects
Dependent Variable:Total revenues for the year 2005-06 (in thousands)
Source  Type III Sum of Squares  df  Mean Square  F  P  Partial Eta Squared
region 139,274,225,326,152.02 3 46,424,741,775,384.01 .30 .83 .02
Error 7,323,334,065,723,234.00 47 155,815,618,419,643.28
Total 12,777,708,858,095,064.00 51
Corrected Total 7,462,608,291,049,387.00 50
Table 32
Multiple Comparisons
Dependent Variable: Total revenues for the year 2005-06 (in thousands)
Tukey HSD
(I) region  (J) region  Mean Difference (I-J)  Std. Error  P  95% Confidence Interval Lower Bound  95% Confidence Interval Upper Bound
West Midwest -894694.29 4997044.29 1.00 -14203768.44 12414379.87
South -1056610.77 4599069.62 1.00 -13305723.47 11192501.92
Northeast -4876159.09 5412823.97 .80 -19292616.37 9540298.18
Midwest West 894694.29 4997044.29 1.00 -12414379.87 14203768.44
South -161916.49 4706406.22 1.00 -12696908.31 12373075.34
Northeast -3981464.81 5504314.79 .89 -18641597.77 10678668.16
South West 1056610.77 4599069.62 1.00 -11192501.92 13305723.47
Midwest 161916.49 4706406.22 1.00 -12373075.34 12696908.31
Northeast -3819548.32 5145723.57 .88 -17524613.28 9885516.64
Northeast West 4876159.09 5412823.972 .80 -9540298.18 19292616.37
Midwest 3981464.81 5504314.79 .89 -10678668.16 18641597.77
South 3819548.32 5145723.57 .88 -9885516.64 17524613.28
Part III—Variable Relationship Investigation
Variables: Expenditure and Verbal SAT Scores
Feeding the data into the Excel template yielded the scatter plot, regression line, and regression
equation in Figure 9. The visual information in the scatter plot in Figure 9 indicates that there is
a possible negative correlation between expenditure and Verbal SAT scores. The regression line
shows that the points on the scatter plot follow a path that begins in the upper left of the graph
and moves toward the bottom right. In order to gather further information about the relationship
between these two variables, the Pearson r equation is used. The results of this
calculation (performed with the Excel template) are shown in Table 33. Table 33 shows the
correlation coefficient to be -0.42. The negative sign indicates that the correlation is negative,
which agrees with the downward slope of the regression line. To test
for significance, the correlation coefficient is compared to the critical values of r in Sprinthall
(2007). The critical value for a df of 49 at the .01 alpha level is 0.36 (Sprinthall, 2007, p. 605).
Because the absolute value of r exceeds this critical value, the relationship is significant:
r(49) = -0.42, p < .01. The absolute value of the correlation coefficient is
between .40 and .70, which, according to Guilford, indicates a “moderate
correlation” (Sprinthall, 2007, p. 296). The square of the Pearson r, the coefficient of
determination, is used to “establish the proportion of the variability among Y scores that can be
accounted for by the variability among the X scores” (Sprinthall, 2007, p. 297). The higher the
coefficient of determination, the more information you have about Y. In this case, the coefficient
of determination is 0.17. When it is multiplied by 100, it tells us that about 17% of the variability
in Verbal SAT scores can be accounted for by expenditure, which doesn’t give a lot of predictive
accuracy. Also, the regression equation is
y = -0.0063x + 599.76. This means that the slope of the regression line is -0.0063 and the Y
intercept (where the line crosses the Y axis) is +599.76. This also means that we can expect a
change of -0.0063 in Y for each unit change of X. The negative slope is important because it is consistent
with a negative correlation. This equation indicates that when X, the expenditure per pupil, is
equal to 0 dollars, the expected score on the verbal SAT would be 599.76. That is interesting to
think about because, while there seems to be a moderate correlation, a 599.76 on this section of
the SAT is not really bad. A strong correlation, it would seem, would show more of a
connection between expenditure and the verbal SAT score. This means that there may not be
much practical significance in this relationship. The negative correlation means that, as
expenditure increases, there is a likely decrease in verbal SAT scores.
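The significance check and the coefficient of determination described above amount to two small calculations. The sketch below is illustrative only: the value of r comes from Table 33, the critical value is the one quoted in the text from Sprinthall (2007), and the function name `is_significant_r` is hypothetical.

```python
def is_significant_r(r, critical_r):
    """Compare the absolute correlation with a tabled critical value."""
    return abs(r) >= critical_r

r = -0.4155            # coefficient of correlation from Table 33
critical_r_01 = 0.36   # critical value quoted in the text for df = 49, alpha = .01

r_squared = r ** 2     # coefficient of determination
print(f"r^2 = {r_squared:.4f}")
print(f"Percent of variability accounted for: {100 * r_squared:.0f}%")
print("Significant at the .01 level:", is_significant_r(r, critical_r_01))
```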
Table 33
Coefficients for Expenditure and Verbal SAT
r² 0.1726 Coefficient of Determination
r -0.4155 Coefficient of Correlation
[Scatter plot with fitted regression line. X axis: 0 to 20,000; Y axis: 0.00 to 700.00. Regression equation: f(x) = −0.00627591099245409 x + 599.757431572793]
Figure 9. Scatter Plot of Expenditure Per Pupil (x) and Verbal SAT Scores (y)
Variables: Pupil/Teacher Ratio (Secondary) and Verbal SAT Scores
The initial visual presented in the scatter plot in Figure 10 indicates that there may not be much
of a relationship between pupil/teacher ratio at the secondary level and Verbal SAT scores.
Compared to the previous scatter plot of expenditure per pupil and Verbal SAT scores, the points
are clustered more around the regression line. The regression equation is y = -0.49x + 542.32.
This equation includes information about the slope and intercept of the regression line. The slope
in the regression equation, -0.49, indicates that Y changes by -0.49 for each unit change of X, pupils
per teacher. The negative sign indicates that this is a negative relationship. The Y intercept is +542.32, which is
where the regression line crosses the Y axis. Further information is needed to decide if this
relationship is statistically significant. This was calculated with the Pearson r equation and the
information is in Table 34. According to Table 34, the coefficient of correlation is -0.03.
According to Guilford’s interpretations, the absolute value is much less than .20. This indicates that there is
not very much of a correlation between the two variables. Comparing the coefficient of
correlation to the Sprinthall (2007) table supports this conclusion. The critical value of r for a
df of 49 is 0.279 at an alpha level of .05. Because the absolute value of r falls below it, the null hypothesis is not rejected:
r(49) = -0.03, n.s. There isn’t a significant relationship between the pupil/teacher ratio at the
secondary level and Verbal SAT scores. This is supported even further when the coefficient of
determination is multiplied by 100, indicating that only 0.11% of the variability in Y (verbal
SAT scores) can be accounted for by the ratio of pupils to teachers. There isn’t any statistical
significance, so research should be directed at other possible relationships to discover what
variables affect student achievement.
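The decision rule used for each pairing (compare the absolute value of r with the critical value for a df of 49) can be sketched as follows. The critical values 0.279 and 0.36 are the ones quoted from Sprinthall (2007) in the text, and the r values come from Tables 33 through 35; the function name and the dictionary labels are hypothetical.

```python
# Critical values of r for df = 49, as quoted in the text from Sprinthall (2007).
CRITICAL_05 = 0.279
CRITICAL_01 = 0.36


def significance_label(r, critical_05=CRITICAL_05, critical_01=CRITICAL_01):
    """Compare |r| with the tabled critical values and return a report label."""
    magnitude = abs(r)  # the sign gives direction only, not strength
    if magnitude >= critical_01:
        return "p < .01"
    if magnitude >= critical_05:
        return "p < .05"
    return "n.s."


# Coefficients of correlation reported in Tables 33, 34, and 35.
correlations = {
    "expenditure": -0.4155,
    "pupil/teacher ratio": -0.0325,
    "teacher salary": -0.4748,
}

labels = {name: significance_label(r) for name, r in correlations.items()}
for name, label in labels.items():
    print(f"{name}: r(49) = {correlations[name]:.2f}, {label}")
```

Taking the absolute value first is what allows a negative r to be compared with the positive critical values in the table.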
Table 34
Coefficients for Pupil/Teacher Ratio and Verbal SAT
r2    0.0011    Coefficient of Determination
r     -0.0325   Coefficient of Correlation
[Scatter plot: X axis 10 to 25; Y axis 0 to 700; trend line f(x) = −0.487061110885226x + 542.324449898537]
Figure 10. Scatter Plot of Secondary Pupil/Teacher Ratio (x) and Verbal SAT Scores (y)
Variables: Teacher Salary and Verbal SAT Score
The initial observation of the scatter plot that represents the pairing of Average Teacher Salaries
and Verbal SAT scores in Figure 11 indicates that there is a possible relationship between the
two because there is a downward slope to the scores. The dots, however, do seem to cluster
around the mean, so further investigation is needed. In Table 35, the calculation of the Pearson r
yields a value of -0.47. The interpretation of r based on Guilford in Sprinthall (2007) suggests
that this is a moderate correlation because its absolute value is between .40 and .70. A comparison between this
value of r, -0.47, and the critical value of r in Sprinthall (2007) shows that there is a significant
correlation and the null hypothesis needs to be rejected: r(49) = -0.47, p < .01. Because the critical
value of r at the .01 alpha level, 0.36, is less than the absolute value of the calculated r, there is a significant
relationship. The negative sign in front of the coefficient of correlation indicates that there is a
negative correlation. The coefficient of determination, found in Table 35, is 0.23; when it is
multiplied by 100, it indicates that 23% of the variability in Y can be accounted for by X. The
regression equation shown with Figure 11, y = -0.0026x + 658.20, gives us further information. The slope, -0.0026, indicates
that the Y value will change by -0.0026 for each unit change of X. The intercept, +658.20, is the
point where the regression line crosses the Y axis when X equals zero. So, while this may be a
statistically significant relationship, there may not be much practical significance. It is interesting
that any increase in teacher salaries is related to a decrease in student scores. It is also possible
that the averages reported by the states could be skewed by districts that pay higher or lower
salaries.
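To make the size of this slope concrete, the sketch below uses the trend line printed with Figure 11. The function names and the sample salary figures are hypothetical and chosen only for illustration; they are not values from the data set.

```python
# Trend line printed with Figure 11 (Teacher Salary vs. Verbal SAT).
SLOPE = -0.00258512605430003    # change in Verbal SAT per unit of salary
INTERCEPT = 658.197604368544    # predicted Verbal SAT when salary is 0


def predict_verbal_sat(salary):
    """Point on the regression line for a given average teacher salary."""
    return SLOPE * salary + INTERCEPT


def predicted_change(salary_increase):
    """Expected change in Verbal SAT for a given increase in salary."""
    return SLOPE * salary_increase


# Hypothetical salary levels within the range shown on the X axis of Figure 11.
for salary in (40_000, 50_000, 60_000):
    print(salary, round(predict_verbal_sat(salary), 2))

change_per_10k = predicted_change(10_000)
print(f"Predicted change per 10,000 increase: {change_per_10k:.2f} points")
```

Scaling the slope to a 10,000 increase gives a predicted drop of roughly 26 points, which is easier to weigh for practical significance than the per-unit slope.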
Table 35
Coefficients for Teacher Salary and Verbal SAT
r2    0.2254    Coefficient of Determination
r     -0.4748   Coefficient of Correlation
[Scatter plot: X axis 30,000 to 70,000; Y axis 0 to 700; trend line f(x) = −0.00258512605430003x + 658.197604368544]
Figure 11. Scatter plot of Teacher Salary (x) and Verbal SAT Scores (y)
After looking at all of these variables and analyzing the tests, I found that the scatter plots were
very useful for processing the observations I made about each of the variables. The scatter plots
indicated that there was a moderate correlation between expenditure and Verbal SAT scores.
The analysis of variance also indicated that there were some differences between expenditures in
the means of each group, the regions. This might be an area where further investigation would
reveal further information. It is also important to consider that the manner in which these
districts spend the money could influence the Verbal SAT scores, so expenditure may not be the
only indicator of increased student achievement. I found another variable related to
student achievement, based on the Verbal SAT scores. Teacher salaries also had a moderate negative correlation
with Verbal SAT scores. But, as I pointed out earlier, the salaries reported by the states could
be slightly off if there are significant differences in salaries within that state. Some districts are
better at negotiating. Also, teacher salaries could be higher in districts that are smaller and have
more resources to spread around a smaller area. This made me think about pupil/teacher ratio at
the secondary level.
The one variable that I identified as not having a relationship to student achievement on the
Verbal SAT was pupil/teacher ratio. The scatter plot in Figure 10 shows the dots all bunched
together, almost in a circle. This indicated that there was not a relationship. The ANOVA in
Table 8 and the Dunnett C in Table 9 show that there are some significant mean differences
between the groups. But, those differences do not seem to have an effect on Verbal SAT scores.
So, class size may not be an area of focus for improving student achievement. I would have liked
to explore the scatter plot and regression equation for percent of students on free/reduced lunch
in comparison with Verbal SAT scores, but there was a missing score and I was not able to use
it.
References
Sprinthall, R. C. (2007). Basic statistical analysis (8th ed.). New York: Pearson