Running head: SPSS/Excel Project

SPSS/Excel Project

Andrea Freeman

Seattle Pacific University

SPSS/Excel Project

Part I: Frequency Analysis

A study was conducted in order to determine the statistical relationship between school

funding and academic performance. Twelve variables were identified: expenditure per pupil,

pupil/teacher ratio, salary, eligibility, reading (Verbal SAT) scores, Math SAT scores, Writing

SAT scores, region, SES (Socioeconomic Status determined by free and reduced-price lunch),

students, IDEA, and money (revenues). There is a belief that the data does not support the

conclusion that funding affects academic achievement.

There are four variables, with random samples taken from public schools in all 50 states

and the District of Columbia, that may shed some light on whether or not there is a possible

relationship between funding and student scores:

Variable: Current Expenditure per Pupil In Average Daily Attendance

The histogram in Figure 1 shows the frequency distribution of school expenditures per pupil. The distribution of expenditure scores has a positive skew: the majority of the scores fall at the left end of the distribution, and a few scores fall at the right end, indicating higher expenditures. This means there are some possible outliers that could distort the mean of the distribution. According to the statistics in Table 1, the mean of the expenditure distribution is $10,327.78 and the median is $9,805.00; a mean that lies above the median is consistent with a positive skew (Sk+). The box plot for the distribution, Figure 2, indicates that there is one score, sample 5, which is an outlier in the South, Washington DC, with an expenditure of $18,339 per pupil. This score is more than three standard deviations above the mean of the distribution ($10,327.78 + 3 × $2,502.20, or about $17,834), which contributes to the skew. The box plot in Figure 2 also indicates a possible difference between the regions in expenditure per pupil: the Northeast looks as though it may differ significantly from the other three regions.
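The three-standard-deviation check used above can be reproduced from the summary statistics alone. The following is a minimal Python sketch, not part of the original SPSS/Excel workflow; the function name `upper_outlier_cutoff` is hypothetical, and the numbers are copied from Table 1.

```python
def upper_outlier_cutoff(mean, sd, k=3.0):
    """Return the value lying k standard deviations above the mean."""
    return mean + k * sd

# Summary statistics for expenditure per pupil, copied from Table 1.
mean_exp = 10327.78
sd_exp = 2502.202
max_exp = 18339  # maximum in Table 1 (the Washington DC score)

cutoff = upper_outlier_cutoff(mean_exp, sd_exp)
is_outlier = max_exp > cutoff
print(f"cutoff = {cutoff:.2f}, max = {max_exp}, outlier = {is_outlier}")
```

The same helper applies to any other column of Table 1 by substituting that column's mean and standard deviation.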

Table 1

Statistics Data for 12 Variables with 51 Samples (50 states and Washington DC)

Statistics

Columns, in order: (1) current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06; (2) average pupil/teacher ratio Fall 2005; (3) estimated ave salary 2005-2006; (4) percentage of all eligible students taking the SAT 2006-07; (5) average verbal SAT score 2005-06; (6) average math SAT score 2005-06; (7) average writing SAT score 2005-06; (8) % of students eligible for free/reduced lunch 2006-07; (9) % of students with disabilities 2006-07; (10) Total revenues for the year 2005-06 (in thousands); (11) Enrollment Fall 2005

Statistic (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11)

N Valid 51 51 51 51 51 51 51 50 51 51 51

N Missing 0 0 0 0 0 0 0 1 0 0 0

Mean 10327.78 15.23 47679.08 39.33 534.94 540.59 525.37 39.82 14.22 10208705.02 963005.84

Median 9805.00 14.80 45575.00 32.00 523.00 529.00 511.00 37.35 14.30 6346033.00 654526.00

Std. Deviation 2502.202 2.54 6942.01 31.12 37.80 37.46 37.63 10.64 2.13 12216880.36 1145966.304

Skewness 1.16 .75 .60 .25 .31 .47 .30 .62 .23 2.61 2.98

Std. Error of Skewness .33 .33 .33 .33 .33 .33 .33 .34 .33 .33 .33

Minimum 5960 10.80 35607.00 3.0 482.00 472.00 472.00 17.70 10.50 958109.00 76876.00

Maximum 18339 22.10 61372.00 100.0 610.00 617.00 591.00 67.50 19.90 63785872.00 6437202.00


Figure 1. Histogram of Current Expenditure per Pupil in Average Daily Attendance in Public Elementary and Secondary Schools 2005-2006

Figure 2. Box Plot of Current Expenditure per Pupil in Average Daily Attendance in Public Elementary and Secondary Schools 2005-2006


Variable: Total Revenues for the Year 2005-06 (in thousands)

The histogram in Figure 3 represents the total revenues for each sample for the year 2005-06, in thousands. The visual representation of the distribution shows a definite positive skew. The bulk of the samples fall on the left side of the distribution, and a few scores are more than three standard deviations above the mean. With a mean of 10208705.02 and a standard deviation of 12216880.36, any score that is over 46859346.10 is going to be an outlier. The histogram shows that two scores are much higher than three standard deviations above the mean. This indicates that the mean of the distribution is probably distorted by the two samples that have higher revenues. Also, the median of the distribution, 6346033.00, is much lower than the mean. Sprinthall (2007) states, “the greater the distance between the mean and the median, the greater is the total amount of Sk” (p. 45). Finally, the data in Table 1 and the histogram in Figure 3 show that the mean lies to the right of the median, indicating a positive skew (Sk+). A possible source of the skew could be California, which has total revenue of 63785872 and is an outlier in the West region. The box plot in Figure 4, which shows the distribution of revenues for each region, supports these observations and sheds further light on these outliers. For the Northeast, the box is shifted toward the high end, which means that there is a negative skew. This also occurs in the West. There seem to be three outliers identified across the four regions. The asterisk in the Western region has a label of 5, which indicates the sample from California. Also, there are two points outside the whiskers of the box plot for the Southern region: 10, for Florida, and 44, for Texas. Despite these outliers and the skew, the medians, indicated by the lines in the middle of each box, seem to be relatively similar across the regions.
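The idea that the distance between the mean and the median reflects the amount of skew can be put into a single number. The sketch below uses Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation, which is a technique brought in here for illustration and is not the statistic SPSS reports in the Skewness row of Table 1. The function name is hypothetical; the inputs are copied from Table 1.

```python
def pearson_median_skewness(mean, median, sd):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    return 3.0 * (mean - median) / sd

# (mean, median, standard deviation) copied from Table 1.
table1 = {
    "expenditure per pupil": (10327.78, 9805.00, 2502.202),
    "total revenues": (10208705.02, 6346033.00, 12216880.36),
    "verbal SAT": (534.94, 523.00, 37.80),
    "writing SAT": (525.37, 511.00, 37.63),
}

skew = {name: pearson_median_skewness(*stats) for name, stats in table1.items()}
for name, value in skew.items():
    direction = "positive" if value > 0 else "negative or none"
    print(f"{name}: {value:.2f} ({direction})")
```

Because this coefficient and the moment-based SPSS skewness are different measures, only the sign and rough size should be compared with Table 1, not the exact values.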


Figure 3. Histogram of Total Revenues For the Year 2005-06 (in thousands)

Figure 4. Box Plot of Total Revenues For the Year 2005-06 (in thousands)


Variable: Average Verbal SAT Score, 2005-06

The histogram in Figure 5 indicates that the largest number of scores, the mode and the tallest bar in the graph, falls between 490 and 500. With a mean of 534.94 and a median of 523, there is a very slight skew to the right. This skew becomes clearer when looking at the box plot in Figure 6. The medians of the West, the South, and the Northeast look relatively similar, though the West is slightly higher than the South and the Northeast and its box is shifted to the top. The distribution of the South lies mostly above the median, as does that of the West. Samples 36, Ohio, and 15, Indiana, are outliers for the Midwest; they lie below the low end of that distribution. Figure 6 shows that the Midwest is possibly different from the other regions, partly because of these outliers. Also, the box representing the middle 50% of the Midwest scores is shifted to the low end, and the median line for the Midwest is very different from those of the other three regions. Taken together, Figures 5 and 6 indicate that there is a positive skew and that its source is likely to be found in the Midwest, specifically Ohio and Indiana, even though, in the histogram in Figure 5, no scores lie more than three standard deviations above the mean (above 648.34). Ohio and Indiana could be distorting the mean.


Figure 5. Histogram of the Average Verbal SAT Score for 2005-06

Figure 6. Box Plot of the Average Verbal SAT Score 2005-06 for the Four Regions


Variable: Average Writing SAT Score, 2005-06

Table 1 indicates that the mean of this variable’s distribution is 525.37 and the median is 511.00, with a standard deviation of 37.63. This information alone does not indicate that any outliers are skewing the distribution, because none of the scores lies more than three standard deviations above the mean (above 638.26). The histogram in Figure 7 does show that there may be something skewed about the scores. The mode lies on the far left side of the distribution, between 475 and 500, which places it to the left of the mean. The distribution isn’t quite bimodal, but there are definitely two bumps, one on either side of the mean. Because the mean lies to the right of the median, the difference between them suggests a skew to the right. In order to get more information about the distribution and possible skew, the box plot in Figure 8 needs to be consulted. Again, sample 36, Ohio, and sample 15, Indiana, fall below the distribution of the Midwest box plot. The median of the Midwest also lies well above those of the other regions. The West shows some slight difference as well, but the medians of the South and the Northeast are relatively similar. In fact, the distributions of verbal (reading) and writing SAT scores look relatively similar when the box plots are compared, except that the medians of the Midwest are slightly different. Just like the box for the West in Figure 6, the box for the West in Figure 8 is shifted to the top. Also, most of the scores in the distributions for the West and the South are above the median.


Figure 7. Histogram of Average Writing SAT Scores 2005-06

Figure 8. Box Plot of Average Writing SAT Scores 2005-06 by Region


Categorical Variable:

There is one categorical variable among the twelve variables in the data, region. Table 2 shows

that there are four regions: West, Midwest, South, and Northeast. There are 13 samples (States)

from the West, 12 samples from the Midwest, 17 from the South, and 9 from the Northeast

(including Washington DC) for a total of 51 samples. The largest number of samples is from the

South.

Table 2

Frequency Distribution of Samples by Region

region

Frequency Percent Valid Percent Cumulative Percent

Valid West 13 25.5 25.5 25.5

Midwest 12 23.5 23.5 49.0

South 17 33.3 33.3 82.4

Northeast 9 17.6 17.6 100.0

Total 51 100.0 100.0

Part II

Do the regions differ in terms of:

Variable: Expenditure Per Pupil

In order to find out if the expenditure differs by region, the ANOVA test was used (Table 4). First, Levene’s Test of Equality of Error Variances (Table 3) is used to discover if equal variances can be assumed. Table 3 shows a p value of .24, which is greater than .05; therefore, homogeneity can be assumed. The ANOVA results in Table 4 show p < .001, which is less than .05 and significant. This indicates a significant difference in expenditure per pupil among the regions, but it doesn’t show exactly where. The Tukey HSD test (Table 5) is performed to identify where this difference lies. In Table 5, the asterisk (*) marks the mean differences with a p value less than .05, which are, therefore, significant. The table indicates that there is a significant mean difference between the Northeast and each of the other regions, and no significant difference among the West, the Midwest, and the South. Also, eta squared is .38, which indicates a strong effect.
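The F ratio and partial eta squared reported for this ANOVA follow directly from the sums of squares. As a minimal sketch (the function name `anova_summary` is hypothetical, and the inputs are the region and error rows of Table 4), the arithmetic is:

```python
def anova_summary(ss_effect, df_effect, ss_error, df_error):
    """Return (F ratio, partial eta squared) from ANOVA sums of squares."""
    ms_effect = ss_effect / df_effect      # mean square for the effect
    ms_error = ss_error / df_error         # mean square for error
    f_ratio = ms_effect / ms_error
    partial_eta_sq = ss_effect / (ss_effect + ss_error)
    return f_ratio, partial_eta_sq

# Sums of squares and degrees of freedom copied from Table 4.
f_ratio, eta_sq = anova_summary(120_097_648.80, 3, 192_953_101.83, 47)
print(f"F(3, 47) = {f_ratio:.2f}, partial eta squared = {eta_sq:.2f}")
```

Feeding in the region and error rows of the later ANOVA tables gives the corresponding F and eta squared values for the other variables.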

Table 3

Levene's Test of Equality of Error Variances

Dependent Variable: current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06

F df1 df2 P

1.45 3 47 .24

Table 4

Tests of Between-Subjects Effects

Dependent Variable: current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared

region 120,097,648.80 3 40,032,549.60 9.75 .00 .38

Error 192,953,101.83 47 4,105,385.15

Total 5,752,870,321.00 51

Corrected Total 313,050,750.63 50


Table 5

Multiple Comparisons

current expenditure per pupil in average daily attendance in public elem and sec schools 2005-06

Tukey HSD

(I) region (J) region Mean Difference (I-J) Std. Error P 95% Confidence Interval Lower Bound Upper Bound

West Midwest -660.49 811.12 .85 -2820.82 1499.83

South -475.96 746.52 .92 -2464.23 1512.31

Northeast -4356.52* 878.61 .00 -6696.60 -2016.45

Midwest West 660.49 811.12 .85 -1499.83 2820.82

South 184.53 763.94 1.00 -1850.14 2219.21

Northeast -3696.03* 893.46 .00 -6075.66 -1316.40

South West 475.96 746.52 .92 -1512.31 2464.23

Midwest -184.53 763.94 .995 -2219.21 1850.14

Northeast -3880.56* 835.25 .000 -6105.17 -1655.96

Northeast West 4356.52* 878.61 .00 2016.45 6696.60

Midwest 3696.03* 893.46 .00 1316.40 6075.66

South 3880.56* 835.25 .00 1655.96 6105.17

*. The mean difference is significant at the .05 level.

Variable: Pupil/Teacher Ratio

In order to discover if there is a difference between the regions in terms of the variable pupil/teacher ratio, the same series of tests needs to be performed. In this case, Levene’s Test of Equality of Error Variances (Table 6) indicates a p value of .01, which is not greater than .05, so homogeneity cannot be assumed. An ANOVA can still be performed, however, because it is a “robust” test. The ANOVA results (Table 7) show p < .001, which is less than .05; therefore, a significant difference exists. Because homogeneity cannot be assumed in this case, the Dunnett C (Table 8) is used to discover the source of these differences. Table 8 shows that there is a significant mean difference between the West and each of the other regions. In addition, there is a significant mean difference between the Northeast and the South.
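The decision sequence used throughout Part II (Levene's test first, then the ANOVA, then Tukey HSD or Dunnett C) can be summarized as a small rule. This is only a sketch of the logic described in the text; the function name `choose_post_hoc` is hypothetical, and 0.001 stands in for results reported as p < .001.

```python
def choose_post_hoc(levene_p, anova_p, alpha=0.05):
    """Pick the follow-up test from the Levene and ANOVA p values."""
    if anova_p >= alpha:
        return "none (ANOVA not significant)"
    if levene_p > alpha:
        return "Tukey HSD"   # equal variances can be assumed
    return "Dunnett C"       # equal variances cannot be assumed

# (Levene p, ANOVA p) as reported in the tables of Part II.
reported = {
    "expenditure per pupil": (0.24, 0.001),
    "pupil/teacher ratio": (0.01, 0.001),
    "estimated average salary": (0.66, 0.02),
    "total revenues": (0.52, 0.83),
}

choices = {name: choose_post_hoc(*p) for name, p in reported.items()}
for name, test in choices.items():
    print(f"{name}: {test}")
```

For total revenues the sketch stops at the non-significant ANOVA, whereas the paper still reports the Tukey HSD table for completeness.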

Table 6


Levene's Test of Equality of Error Variances

Dependent Variable: average pupil/teacher ratio Fall 2005

F df1 df2 P

4.92 3 47 .01

Table 7

Tests of Between-Subjects Effects

Dependent Variable: average pupil/teacher ratio Fall 2005

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared

region 146.88 3 48.96 13.08 .00 .46

Error 175.92 47 3.74

Total 12145.39 51

Corrected Total 322.80 50

a. R Squared = .455 (Adjusted R Squared = .420)


Table 8


Multiple Comparisons

average pupil/teacher ratio Fall 2005

Dunnett C

(I) region (J) region Mean Difference (I-J) Std. Error 95% Confidence Interval Lower Bound Upper Bound

West Midwest 3.00* .94 .19 5.81

South 2.94* .86 .40 5.49

Northeast 5.07* .94 2.23 7.92

Midwest West -3.00* .94 -5.81 -.19

South -.06 .58 -1.78 1.67

Northeast 2.08 .69 -.08 4.22

South West -2.94* .86 -5.49 -.40

Midwest .06 .58 -1.67 1.78

Northeast 2.13* .58 .34 3.92

Northeast West -5.07* .94 -7.92 -2.23

Midwest -2.080 .69 -4.22 .08

South -2.13* .58 -3.92 -.34

Variable: Estimated Average Salary 2005-06

As before, the first step is to figure out if homogeneity can be assumed, using Levene’s Test of Equality of Error Variances (Table 9). Table 9 shows a p value of .66, which is greater than .05, so homogeneity can be assumed. This leads to the ANOVA test results in Table 10. The p value is .02, which is less than .05, so the results are significant. Eta squared, however, is .18, which does not indicate a strong effect. Nevertheless, Tukey’s HSD (Table 11) can be performed to figure out where those mean differences occurred. The only significant mean difference in Table 11 at the .05 alpha level exists between the South and the Northeast.

Table 9


Levene's Test of Equality of Error Variances

Dependent Variable: estimated ave salary 2005-2006

F df1 df2 P

.53 3 47 .66

Table 10

Tests of Between-Subjects Effects

Dependent Variable: estimated ave salary 2005-2006

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared

region 434910474.84 3 144970158.28 3.45 .02 .18

Error 1974666324.85 47 42014177.12

Total 118347597323.00 51

Corrected Total 2409576799.69 50


Table 11


Multiple Comparisons

estimated ave salary 2005-2006

Tukey HSD

(I) region (J) region Mean Difference (I-J) Std. Error P 95% Confidence Interval Lower Bound Upper Bound

West Midwest 910.88 2594.81 .99 -6000.11 7821.88

South 1506.03 2388.15 .92 -4854.55 7866.62

Northeast -6641.50 2810.72 .10 -14127.52 844.52

Midwest West -910.88 2594.81 .99 -7821.87 6000.11

South 595.15 2443.89 1.00 -5913.89 7104.18

Northeast -7552.39 2858.22 .05 -15164.94 60.16

South West -1506.03 2388.15 .92 -7866.62 4854.55

Midwest -595.15 2443.89 1.00 -7104.18 5913.89

Northeast -8147.54* 2672.01 .02 -15264.15 -1030.92

Northeast West 6641.50 2810.71 .10 -844.52 14127.52

Midwest 7552.39 2858.22 .05 -60.16 15164.94

South 8147.54* 2672.01 .02 1030.92 15264.15

Variable: Percentage of All Eligible Students Taking the SAT, 2006-07

In this case, the test for equality of variances (Table 12) indicates that homogeneity can’t be assumed because p < .001, which is less than .05. This means that the ANOVA can be used, but it is followed by the Dunnett C instead of the Tukey HSD test. The ANOVA (Table 13) shows p < .001, which is less than .05 and indicates a significant difference. Eta squared, at .52, also indicates a strong effect. To find out where this mean difference is, the Dunnett C (Table 14) is performed. The results in Table 14 indicate that there is a significant mean difference between the Northeast and each of the other regions; a significant mean difference also exists between the South and the Midwest.
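Because the Dunnett C output lists confidence intervals rather than p values, a pair of regions is read as significantly different when its 95% confidence interval does not contain zero. A minimal sketch of that reading, assuming a hypothetical helper `excludes_zero` and using the bounds copied from Table 14 (one row per pair):

```python
def excludes_zero(lower, upper):
    """True when a confidence interval lies entirely above or below zero."""
    return lower > 0 or upper < 0

# (I region, J region): (lower bound, upper bound), copied from Table 14.
table14 = {
    ("West", "Midwest"): (-0.48, 42.07),
    ("West", "South"): (-33.35, 19.57),
    ("West", "Northeast"): (-67.02, -28.95),
    ("Midwest", "South"): (-53.59, -1.79),
    ("Midwest", "Northeast"): (-87.03, -50.53),
    ("South", "Northeast"): (-65.17, -17.01),
}

significant = [pair for pair, (lo, hi) in table14.items() if excludes_zero(lo, hi)]
for first, second in significant:
    print(f"{first} vs. {second}: significant at the .05 level")
```

The pairs this flags are the same ones marked with an asterisk in Table 14.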

Table 12


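Levene's Test of Equality of Error Variances

Dependent Variable: percentage of all eligible students taking the SAT 2006-07

F df1 df2 P

16.53 3 47 .00

Table 13

Tests of Between-Subjects Effects

Dependent Variable: percentage of all eligible students taking the SAT 2006-07

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared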

region 24959.33 3 8319.78 16.66 .00 .52

Error 23472.00 47 499.40

Total 127334.00 51




Table 14


Multiple Comparisons

percentage of all eligible students taking the SAT 2006-07

Dunnett C

(I) region (J) region Mean Difference (I-J) Std. Error 95% Confidence Interval Lower Bound Upper Bound

West Midwest 20.80 7.12 -.48 42.07

South -6.89 9.14 -33.35 19.57

Northeast -47.98* 6.26 -67.02 -28.95

Midwest West -20.80 7.12 -42.07 .48

South -27.69* 8.92 -53.59 -1.79

Northeast -68.78* 5.94 -87.03 -50.53

South West 6.89 9.14 -19.57 33.35

Midwest 27.69* 8.92 1.79 53.59

Northeast -41.09* 8.25 -65.17 -17.01

Northeast West 47.98* 6.26 28.95 67.02

Midwest 68.78* 5.94 50.53 87.03

South 41.09* 8.25 17.01 65.17


Variable: Average Verbal SAT Score 2005-06

Levene’s test in Table 15 has p < .001, which is less than .05, so homogeneity cannot be assumed. So, the ANOVA is used, followed by the Dunnett C. In Table 16, the ANOVA shows p < .001, which is less than .05, so the difference is significant. Eta squared, at .43, shows that there is a strong effect. The Dunnett C in Table 17 shows that these mean differences occur mostly between the Midwest and the other three regions. But there is also a significant mean difference, at the .05 level, between the Northeast and the West.

Table 15


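Levene's Test of Equality of Error Variances

Dependent Variable: average verbal SAT score 2005-06

F df1 df2 P

6.57 3 47 .00

Table 16

Tests of Between-Subjects Effects

Dependent Variable: average verbal SAT score 2005-06

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared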

region 30986.00 3 10328.67 12.00 .00 .43

Error 40450.83 47 860.66

Total 14665702.00 51




Table 17


Multiple Comparisons

average verbal SAT score 2005-06

Dunnett C

(I) region (J) region Mean Difference (I-J) Std. Error 95% Confidence Interval Lower Bound Upper Bound

West Midwest -47.81* 11.31 -81.68 -13.94

South 1.93 11.26 -30.74 34.60

Northeast 24.69* 7.73 1.37 48.02

Midwest West 47.81* 11.31 13.94 81.68

South 49.74* 12.63 12.65 86.82

Northeast 72.50* 9.62 43.31 101.69

South West -1.93 11.26 -34.60 30.74

Midwest -49.74* 12.63 -86.82 -12.65

Northeast 22.77 9.56 -5.02 50.55

Northeast West -24.69* 7.73 -48.02 -1.37

Midwest -72.50* 9.62 -101.69 -43.31

South -22.77 9.56 -50.55 5.02


Variable: Average Math SAT Score 2005-06

Again, Levene’s test (Table 18) indicates that homogeneity cannot be assumed because p < .01, which is less than .05. This is followed by the ANOVA (Table 19), which shows p < .001, which is less than .05, so it is significant. The eta squared value of .52 indicates that there is a strong effect. The Dunnett C (Table 20) is used to find where the mean differences lie. The results in Table 20 show that the Midwest is different from all the other regions. Another difference lies between the West and the Northeast.

Table 18


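Levene's Test of Equality of Error Variances

Dependent Variable: average math SAT score 2005-06

F df1 df2 P

5.82 3 47 .00

Table 19

Tests of Between-Subjects Effects

Dependent Variable: average math SAT score 2005-06

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared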

region 36447.57 3 12149.19 16.94 .00 .52

Error 33708.78 47 717.21

Total 14974174.00 51




Table 20


Multiple Comparisons

average math SAT score 2005-06

Dunnett C

(I) region (J) region Mean Difference (I-J) Std. Error 95% Confidence Interval Lower Bound Upper Bound

West Midwest -51.84* 10.39 -83.01 -20.67

South 8.02 9.67 -19.96 36.00

Northeast 22.75* 6.11 4.23 41.26

Midwest West 51.84* 10.39 20.67 83.01

South 59.86* 12.14 24.15 95.57

Northeast 74.58* 9.548 45.67 103.50

South West -8.02 9.67 -35.99 19.96

Midwest -59.86* 12.14 -95.57 -24.14

Northeast 14.73 8.75 -10.71 40.16

Northeast West -22.74* 6.11 -41.26 -4.23

Midwest -74.58* 9.54 -103.50 -45.67

South -14.73 8.75 -40.16 10.71


Variable: Average Writing SAT Score 2005-2006

In Table 21, Levene’s test shows p < .001, which is less than .05, so homogeneity cannot be assumed. This means that the Dunnett C will be used if the ANOVA shows a significant difference. Table 22 shows p < .001, which indicates a significant difference. Eta squared is .38, which is a large effect. The Dunnett C comparison test (Table 23) indicates that there is a significant mean difference, at the .05 alpha level, between the Midwest and each of the other three regions.

Table 21


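Levene's Test of Equality of Error Variances

Dependent Variable: average writing SAT score 2005-06

F df1 df2 P

8.01 3 47 .00

Table 22

Tests of Between-Subjects Effects

Dependent Variable: average writing SAT score 2005-06

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared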

region 26942.23 3 8980.74 9.62 .00 .38

Error 43857.69 47 933.14

Total 14147632.00 51




Table 23


Multiple Comparisons

average writing SAT score 2005-06

Dunnett C

(I) region (J) region Mean Difference (I-J) Std. Error 95% Confidence Interval Lower Bound Upper Bound

West Midwest -49.17* 11.38 -83.24 -15.10

South -5.82 11.81 -40.07 28.42

Northeast 17.78 7.95 -6.24 41.80

Midwest West 49.17* 11.389 15.10 83.24

South 43.34* 13.06 5.06 81.63

Northeast 66.94* 9.71 37.43 96.46

South West 5.82 11.81 -28.42 40.07

Midwest -43.34* 13.06 -81.63 -5.06

Northeast 23.60 10.22 -6.10 53.30

Northeast West -17.78 7.95 -41.80 6.24

Midwest -66.94* 9.71 -96.46 -37.43

South -23.60 10.22 -53.30 6.10


Variable: Percent of Students Eligible for Free/Reduced Lunch 2006-07

Table 24 shows that, after using Levene’s test, the p value is .29. This means that homogeneity can be assumed because it is greater than .05, and the Tukey HSD test can follow the ANOVA if a significant difference is found. In this case, Table 25 shows a significant difference, p < .001. Also, eta squared indicates a strong effect with a value of .49. These tests are followed by Tukey’s HSD test (Table 26), which shows a significant mean difference between the Southern group and each of the other three. There is also a significant mean difference between the Northeast group and the West group, but not between the Northeast group and the Midwest group.

Table 24


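Levene's Test of Equality of Error Variances

Dependent Variable: % of students eligible for free/reduced lunch 2006-07

F df1 df2 P

1.30 3 46 .29

Table 25

Tests of Between-Subjects Effects

Dependent Variable: % of students eligible for free/reduced lunch 2006-07

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared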

region 2732.83 3 910.94 14.87 .00 .49

Error 2817.70 46 61.25

Total 84848.08 50




Table 26


Multiple Comparisons

% of students eligible for free/reduced lunch 2006-07

Tukey HSD

(I) region (J) region Mean Difference (I-J) Std. Error P 95% Confidence Interval Lower Bound Upper Bound

West Midwest 5.14 3.20 .38 -3.38 13.66

South -9.57* 2.95 .01 -17.43 -1.70

Northeast 9.79* 3.45 .03 .59 18.99

Midwest West -5.14 3.20 .38 -13.66 3.38

South -14.71* 2.95 .00 -22.58 -6.84

Northeast 4.65 3.45 .54 -4.55 13.85

South West 9.57* 2.95 .01 1.70 17.43

Midwest 14.71* 2.95 .00 6.84 22.58

Northeast 19.36* 3.23 .00 10.76 27.96

Northeast West -9.79* 3.45 .03 -18.99 -.59

Midwest -4.65 3.45 .54 -13.85 4.55

South -19.36* 3.23 .00 -27.96 -10.76


Variable: Percentage of Students With Disabilities

Levene’s test in Table 27 indicates that homogeneity can be assumed because the p value of .07 is greater than .05. This leads to Table 28, the ANOVA test, which shows p < .001, a significant difference. Eta squared, at .40, indicates that there is a strong effect. Tukey’s HSD in Table 29 shows that the mean difference is significant between the Northeast group and both the South and the West. Also, the mean difference is significant between the Midwest and the West.


Table 27


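Levene's Test of Equality of Error Variances

Dependent Variable: % of students with disabilities 2006-07

F df1 df2 P

2.55 3 47 .07

Table 28

Tests of Between-Subjects Effects

Dependent Variable: % of students with disabilities 2006-07

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared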

region 89.78 3 29.93 10.25 .00 .40

Error 137.19 47 2.92

Total 10533.34 51




Table 29


Multiple Comparisons

% of students with disabilities 2006-07

Tukey HSD

(I) region (J) region Mean Difference (I-J) Std. Error P 95% Confidence Interval Lower Bound Upper Bound

West Midwest -2.49* .68 .00 -4.31 -.67

South -1.58 .63 .07 -3.26 .10

Northeast -3.94* .74 .00 -5.91 -1.96

Midwest West 2.49* .68 .00 .67 4.31

South .91 .64 .50 -.80 2.63

Northeast -1.44 .75 .24 -3.45 .56

South West 1.58 .63 .07 -.10 3.26

Midwest -.91 .64 .50 -2.63 .80

Northeast -2.36* .70 .01 -4.23 -.48

Northeast West 3.94* .74 .00 1.96 5.91

Midwest 1.44 .75 .24 -.56 3.45

South 2.36* .70 .01 .48 4.23


Variable: Total Revenues for the Year 2005-06 (in thousands)

For this final variable, Levene’s test (Table 30) shows that homogeneity can be assumed based on a p value of .52, which is greater than .05. Table 31 shows that there is not a significant difference because the p value of .83 is greater than .05, and the eta squared value also indicates a very small effect, .02. The Tukey HSD test in Table 32 also shows that there isn’t a significant mean difference between any pair of regions.


Table 30


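Levene's Test of Equality of Error Variances

Dependent Variable: Total revenues for the year 2005-06 (in thousands)

F df1 df2 P

.77 3 47 .52

Table 31

Tests of Between-Subjects Effects

Dependent Variable: Total revenues for the year 2005-06 (in thousands)

Source Type III Sum of Squares df Mean Square F P Partial Eta Squared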

region 139,274,225,326,152.02 3 46,424,741,775,384.01 .30 .83 .02

Error 7,323,334,065,723,234.00 47 155,815,618,419,643.28

Total 12,777,708,858,095,064.00 51

Corrected Total 7,462,608,291,049,387.00 50


Table 32




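Multiple Comparisons

Total revenues for the year 2005-06 (in thousands)

Tukey HSD

(I) region (J) region Mean Difference (I-J) Std. Error P 95% Confidence Interval Lower Bound Upper Bound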

West Midwest -894694.29 4997044.29 1.00 -14203768.44 12414379.87

South -1056610.77 4599069.62 1.00 -13305723.47 11192501.92

Northeast -4876159.09 5412823.97 .80 -19292616.37 9540298.18

Midwest West 894694.29 4997044.29 1.00 -12414379.87 14203768.44

South -161916.49 4706406.22 1.00 -12696908.31 12373075.34

Northeast -3981464.81 5504314.79 .89 -18641597.77 10678668.16

South West 1056610.77 4599069.62 1.00 -11192501.92 13305723.47

Midwest 161916.49 4706406.22 1.00 -12373075.34 12696908.31

Northeast -3819548.32 5145723.57 .88 -17524613.28 9885516.64

Northeast West 4876159.09 5412823.972 .80 -9540298.18 19292616.37

Midwest 3981464.81 5504314.79 .89 -10678668.16 18641597.77

South 3819548.32 5145723.57 .88 -9885516.64 17524613.28

Part III—Variable Relationship Investigation

Variables: Expenditure and Verbal SAT Scores

Feeding the data into the Excel template yielded the scatter plot, regression line, and regression equation in Figure 9. The visual information in the scatter plot indicates that there is a possible negative correlation between expenditure and Verbal SAT scores. The regression line shows that the points on the scatter plot follow a path that begins in the upper left of the graph and moves toward the bottom right. In order to gather further information about the relationship between these two variables, the Pearson r equation is used. The results of this calculation (performed with the Excel template) are shown in Table 33. Table 33 shows the correlation coefficient to be -0.42; the negative sign indicates that the correlation is negative. To test for significance, the correlation coefficient is compared to the critical values of r in Sprinthall (2007). The critical value for a df of 49 at the .01 alpha level is 0.36 (Sprinthall, 2007, p. 605). Because the absolute value of r exceeds this critical value, the relationship is significant: r(49) = -0.42, p < .01. The absolute value of the correlation coefficient is between .40 and .70, which, according to Guilford, indicates a “moderate correlation” (Sprinthall, 2007, p. 296). The square of the Pearson r, the coefficient of determination, is used to “establish the proportion of the variability among Y scores that can be accounted for by the variability among the X scores” (Sprinthall, 2007, p. 297). The higher the coefficient of determination, the more information you have about Y. In this case, the coefficient of determination is 0.17. When it is multiplied by 100, it tells us that 17% of the variability in Verbal SAT scores is accounted for by expenditure, which doesn’t give a lot of predictive accuracy. Also, the regression equation is y = -0.0063x + 599.76. This means that the slope of the regression line is -0.0063 and the Y intercept (where the line crosses the Y axis) is +599.76. This also means that we can expect a change of -0.0063 in Y for each unit change of X. The negative is important because it is consistent with a negative correlation. This equation indicates that when X, the expenditure per pupil, is equal to 0 dollars, the expected score on the verbal SAT would be 599.76. That is interesting to think about because, while there seems to be a moderate correlation, a 599.76 on this section of the SAT is not really bad. A strong correlation, it would seem, would show more of a connection between expenditure and the verbal SAT score. This means that there may not be much practical significance in this relationship. The negative correlation means that, as expenditure increases, there is a likely decrease in verbal SAT scores.
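The pieces of this interpretation (the critical-value comparison, the coefficient of determination, and a prediction from the regression line) can be worked through in a few lines. This is a minimal sketch rather than the Excel template itself; the function name `predict_verbal` is hypothetical, the slope and intercept are the rounded values quoted in the text, and r and the mean expenditure come from Tables 33 and 1.

```python
def predict_verbal(expenditure, slope=-0.0063, intercept=599.76):
    """Predicted Verbal SAT score from expenditure per pupil (rounded equation)."""
    return slope * expenditure + intercept

r = -0.4155          # coefficient of correlation, Table 33
critical_r = 0.36    # critical value at the .01 level quoted from Sprinthall (2007)

r_squared = r ** 2                      # coefficient of determination
significant = abs(r) > critical_r       # compare magnitude, ignoring the sign
pred_at_mean = predict_verbal(10327.78) # prediction at the mean expenditure (Table 1)

print(f"r squared = {r_squared:.4f}, significant at .01 = {significant}")
print(f"predicted Verbal SAT at mean expenditure = {pred_at_mean:.2f}")
```

The prediction at the mean expenditure lands close to the mean Verbal SAT score in Table 1 (534.94), as expected for a least-squares line; the small gap comes from rounding the slope and intercept.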

Table 33

Coefficients for Expenditure and Verbal SAT

r² 0.1726 Coefficient of Determination

r -0.4155 Coefficient of Correlation

[Scatter plot with regression line. X axis: 0 to 20,000; Y axis: 0.00 to 700.00; f(x) = −0.00627591099245409 x + 599.757431572793]

Figure 9. Scatter Plot of Expenditure Per Pupil (x) and Verbal SAT Scores (y)

Variables: Pupil/Teacher Ratio (Secondary) and Verbal SAT Scores

The initial visual presented in the scatter plot in Figure 10 indicates that there may not be much of a relationship between pupil/teacher ratio at the secondary level and Verbal SAT scores. Compared to the previous scatter plot of expenditure per pupil and Verbal SAT scores, the points are clustered more around the regression line. The regression equation is y = -0.49x + 542.32. This equation includes information about the slope and intercept of the regression line. The slope, -0.49, indicates that for each unit change in X, pupils per teacher, there is a change of -0.49 in Y. The negative sign indicates that this is a negative correlation. The Y intercept is +542.32, which is where the regression line crosses the Y axis. Further information is needed to decide if this relationship is statistically significant. This was calculated with the Pearson r equation, and the information is in Table 34. According to Table 34, the coefficient of correlation is -0.03. According to Guilford’s interpretations, this is much less than .20, which indicates that there is not very much of a correlation between the two variables. Comparing the coefficient of correlation to the Sprinthall (2007) table can support this conclusion. The critical value of r for a df of 49 is 0.279 at an alpha level of .05. Because the absolute value of r is below this critical value, the null hypothesis cannot be rejected: r(49) = -0.03, n.s. There isn’t a significant relationship between the pupil/teacher ratio at the secondary level and Verbal SAT scores. This is supported even further when the coefficient of determination is multiplied by 100, indicating that only 0.11% of the variability in Y (verbal SAT scores) can be accounted for by the ratio of pupils to teachers. There isn’t any statistical significance, so research should be directed at other possible relationships to discover what variables affect student achievement.
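As a cross-check on the critical-value lookup, a correlation can also be converted to a t ratio with t = r × √df / √(1 − r²), where df = N − 2. This conversion is an alternative to the table of critical r values used in the text, not the method the paper followed. The function name `r_to_t` is hypothetical, and the cutoff of about 2.01 is the standard two-tailed .05 critical t for 49 degrees of freedom, not a figure from the paper.

```python
import math

def r_to_t(r, df):
    """Convert a Pearson r with df = N - 2 into the equivalent t ratio."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)

CRITICAL_T_05 = 2.01  # approximate two-tailed .05 critical t for df = 49 (assumption)

t_ratio = r_to_t(-0.03, 49)     # pupil/teacher ratio and Verbal SAT
t_exp = r_to_t(-0.4155, 49)     # expenditure and Verbal SAT (Table 33)

for label, t in (("pupil/teacher ratio", t_ratio), ("expenditure", t_exp)):
    verdict = "significant" if abs(t) > CRITICAL_T_05 else "n.s."
    print(f"{label}: t(49) = {t:.2f}, {verdict}")
```

Both routes lead to the same decisions reported in the text: the pupil/teacher correlation is not significant, while the expenditure correlation is.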

Table 34

Coefficients for Pupil/Teacher Ratio and Verbal SAT



[Scatter plot with regression line. X axis: 10.00 to 25.00; Y axis: 0.00 to 700.00; f(x) = −0.487061110885226 x + 542.324449898537]

Figure 10. Secondary Pupil/Teacher Ratio(x) and Verbal SAT Scores (y)

Variables: Teacher Salary and Verbal SAT Score

The initial observation of the scatter plot that represents the pairing of Average Teacher Salaries

and Verbal SAT scores in Figure 11 indicates that there is a possible relationship between the

two because there is a downward slope to the scores. The dots, however, do seem to cluster

around the mean, so further investigation is needed. In Table 35, the calculation of the Pearson r

yields a value of -0.47. The interpretation of r based on Guilford in Sprinthall (2007) suggests

that this is a moderate correlation because it is between .40 and .70. A comparison between this

value of r, -0.47, and the critical value of r in Sprinthall (2007) shows that there is a significant

correlation and the null hypothesis needs to be rejected: r(49) = -0.47, p < .01. Because the critical

value of r at the .01 alpha level, 0.36, is less than the absolute value of the calculated r, there is

a significant relationship. The negative sign in front of the coefficient of correlation indicates that there is a


negative correlation. The coefficient of determination, found in Table 35, is 0.23; when it is

multiplied by 100, it indicates that 23% of the information about Y can be found in X. The

regression equation shown with Figure 11, y = -0.0026x + 658.20, gives us further information. The slope,

-0.0026, indicates that the Y value will change by -0.0026 for each one-dollar change of X. The

intercept, +658.20, is the point where the regression line crosses the Y axis when X equals zero. So, while this may be a

statistically significant relationship, there may not be much practical significance. It is interesting

that any increase in teacher salaries is related to a decrease in student scores. It is also possible

that the averages reported by the states could be skewed by districts that pay higher or lower

salaries.
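
To make the size of the slope more concrete, the sketch below applies the trend line displayed with Figure 11 to two salaries. The function name predict_verbal_sat and the two salary values are hypothetical choices for illustration only; the coefficients are copied from the figure.

```python
# Coefficients of the trend line displayed with Figure 11.
SLOPE = -0.00258512605430003
INTERCEPT = 658.197604368544


def predict_verbal_sat(salary):
    """Predicted Verbal SAT score for an average teacher salary in dollars."""
    return SLOPE * salary + INTERCEPT


# Two illustrative salaries inside the range plotted on the X axis.
low_salary, high_salary = 40_000, 50_000
for salary in (low_salary, high_salary):
    print(f"${salary:,}: predicted Verbal SAT {predict_verbal_sat(salary):.1f}")

# Predicted change in score for a $10,000 increase in average salary.
change = predict_verbal_sat(high_salary) - predict_verbal_sat(low_salary)
print(f"Change per $10,000: {change:.1f} points")
```

Under these assumptions, a $10,000 difference in average salary corresponds to a predicted difference of roughly 26 points on the Verbal SAT. This describes the fitted line only and does not show that salary causes the difference.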

Table 35

Coefficients for Teacher Salary and Verbal SAT


[Scatter plot omitted. Fitted line: f(x) = −0.00258512605430003x + 658.197604368544]

Figure 11. Scatter plot of Teacher Salary (x) and Verbal SAT Scores (y)


After looking at all of these variables and analyzing the tests, I found that the scatter plots were

very useful for processing the observations I made about each of the variables. The scatter plots

indicated that there was a moderate correlation between expenditure and Verbal SAT scores.

The analysis of variance also indicated that there were some differences in mean expenditure

among the groups, the regions. This might be an area where further investigation would

reveal further information. It is also important to consider that the manner in which these

districts spend the money could influence the Verbal SAT scores, so expenditure may not be the

only indicator of increased student achievement. I found another variable related to

student achievement, based on the Verbal SAT scores: teacher salaries had a significant negative

correlation with Verbal SAT scores. But as I pointed out earlier, the salaries reported by the states could

be slightly off if there are significant differences in salaries within that state. Some districts are

better at negotiating. Also, teacher salaries could be higher in districts that are smaller and have

more resources to spread around a smaller area. This made me think about pupil/teacher ratio at

the secondary level.

The one variable that I identified as not having a relationship to student achievement on the

Verbal SAT was pupil/teacher ratio. The scatter plot in Figure 10 shows the dots all bunched

together, almost in a circle. This indicated that there was not a relationship. The ANOVA in

Table 8 and the Dunnett C in Table 9 show that there are some significant mean differences

between the groups. But those differences do not seem to have an effect on Verbal SAT scores.

So, class size may not be an area of focus for improving student achievement. I would have liked

to explore the scatter plot and regression equation for percent of students on free/reduced lunch

in comparison with Verbal SAT scores, but there was a missing score and I was not able to use

it.


References

Sprinthall, R. C. (2007). Basic statistical analysis (8th ed.). New York: Pearson.
