prediction of the probability of successful first year ... · for students who completed 2 unit...

22
PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR UNIVERSITY STUDIES IN TERMS OF HIGH SCHOOL BACKGROUND: WITH APPLICATION TO THE FACULTY OF ECONOMIC STUDIES AT THE UNIVERSITY OF NEW ENGLAND D.M. Dancer and H.E. Doran No. 48 - September 1990 ISSN ISBN 0 157-0188 0 85834 893 4

Upload: others

Post on 17-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

PREDICTION OF THE PROBABILITY OF SUCCESSFULFIRST YEAR UNIVERSITY STUDIES IN TERMS OF

HIGH SCHOOL BACKGROUND: WITHAPPLICATION TO THE FACULTY OF ECONOMIC

STUDIES AT THE UNIVERSITY OF NEW ENGLAND

D.M. Dancer and H.E. Doran

No. 48 - September 1990

ISSN

ISBN

0 157-0188

0 85834 893 4

Page 2: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise
Page 3: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

General Background

This paper is motivated by informal observation over several years of first year students

in the Faculty of Economic Studies at the University of New England. A very casual

appraisal of students who have struggled has indicated that many of them had a poor

background in Mathematics as well as a fairly low level of achievement in the New South

Wales Higher School Certificate Examination.

A formal analysis of results could be a useful contribution to educational decision

making. In particular, estimating the important determinants of performance could sig-

nificantly improve the assessment of a student’s chances of success. It was considered that

factors which could influence the success rate of first year students would include the HSC

aggregate score, preparation in Mathematics and the type and location of schooling.

The aim of this investigation is to relate the probability of a student passing a range

of subjects to these ’background variables’ and this is achieved by using the technique of

probit analysis.

The Probit Model

The basic tool used in this paper is probit analysis. The probit model is a qualitative

response (QR) model in which the dependent or endogenous variable may only take the

discrete values 0 and 1. The particular form of the probit model used in this study is a

univariate binary QR model where the normal distribution is specified as the function F.

It is convenient mathematically to define the dichotomous random variable y which

takes the value of one if the event occurs and zero if it does not occur. It is assumed that

the probability of an event depends on a vector of independent variables m and a vector

of unknown parameters ~.

Amemiya (1981) writes the univariate dichotomous model as

p, = P(y~ = l)= F[I(m~,O)] i = 1,2, ...,n.

where F is the cumulative distribution function of some random variable and I(x~, 8) is a

scalar index and assumes the random variables y~ are independently distributed.

If a linear specification is chosen

and the model can then be written as

(i)

Page 4: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

The probit model follows from the specification in Equation 1 by choosing F to be the

cumulative distribution function (cdf) for a normal random variable. That is, F(.) =

To simplify the explanation of the probit model, a specific example will be used. It

could be hypothesized that a student’s probability of success is dependent on a number

of factors. The first factor and probably the most important factor to consider is the

HSC tertiary entrance score. This HSC score is used by the universities to determine a

student’s eligibility for a certain degree.

Another factor that is considered to have a bearing on the level of success is the level

of mathematics that a student undertakes at high school. The level of mathematics is

particularly relevant if the student is considering a Science or an Economics degree.

In some respects, probit analysis is analogous to regression analysis. Let us suppose

that we are interested in investigating the number of subjects passed. One way of ap-

proaching this problem would be to postulate that the number of subjects passed, Y,

could be regarded as a function of several variables. For example,

Y = f(HSC score, level of mathematics, place of school). (2)

The linear regression model may then enable an equation in the following form to be

estimated :-

Y = ¢~0 +/~ * X1 +/~2 * X2 + ¢~3 * X3 (3)

where

* X1 = HSC score

¯ X2 = level of mathematics completed at high school

* Xs = country or city school.

Note that if more than two levels of mathematics are being compared, then several dummyvariables would be needed to capture the differences between the levels. We emphasise

that such an approach could be used to calculate the average number of subjects passed

for any combination of the three variables.

Another problem that is of interest is to estimate the proportion of students who pass

two or more subjects. When analysing proportions or probabilities, probit analysis is

an appropriate technique. An approach similar to the regression problem can be used.

Again, a sample of students can be taken who have an HSC score, completed 2 UnitMathematics and attended a city school.

2

Page 5: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

These proportions could be interpreted as the probability that a student with certain

qualifications will pass at least two or more subjects. When probabilities are to be esti-

mated, the technique analogous to regression analysis is probit analysis. To illustrate this

technique, an example will be given.

For the sake of simpficity, consider students at a university who have enrolled in an

Economics degree. Let

HSC tertiary entrance score.Xli --

X21 = 1 if the student completed 2 Unit A Mathematics

= 0 if the student completed 2, 3 or 4 Unit Mathematics

X3i = 1 if the student attends a city high school

= 0 if the student attends a country high school

The variables X2i and X3i are dummy variables.

Define an index I[, given by

(4)

which is a single number summarising the values of X1, X2 and X3 for the ith individual.

Such an index, which can take any real value, can be converted to a probability by re-

flecting it in any cumulative distribution function. The probit model uses the distribution

function associated with a normal random variable. That is,

where Fx(.) isthe distribution function of a random variable X which is

Now

= P[X <_ 3"0 + 3"~ * Xli + 72 * X~i + 3"3 * Xz~]

= P[Z <_ (3"0 + 3"~ * X,, + 3"~ * X2, + 3"3 * ~) - ~]= P[Z <_ (5)

where Z ,-, N(O, 1) and

I~ - (3’0 - tz) + 3"~ ¯ Xli + 3"~ * X2i + 3"~ * X3i

That is,

I~ =/3o + ~ * X~i + 32 * X2i + fl~ * X~i. (6)

Page 6: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

Because of the arbitrary nature of the coefficients 7{ and/3i, it is always possible to choose

Fx(.) :where q~z(.) is the cumulative distribution function of the standardised normal random

variable.Consider some hypothetical values of/30, ~31,/32 and f13. Let 30 = -2.0,/31 = 0.01,

~2 = -1.5, ~33 = -0.5. Three individuals have been chosen to illustrate this example. The

first student has an ttSC score of 280, has completed 2 Unit A Mathematics and attended

a city high school. The second student has an HSC score of 300, has completed 2 Unit

Mathematics and attended a city high school. The third student has an HSC score of

350, has completed 3 Unit Mathematics and attended a country high school.

The index value for the first student is calculated as

-2.0 + 0.01 * 280 - 1.5 * 0 + -0.5 * 1 = -1.2.

Therefore,

P(Z <_ I~) = P(Z ! -1.2) = 0.12.

From Table 1, it can be seen that the probabihty that a student who has an HSC score of

280 with 2 Unit A Mathematics and has attended a city high school is quite low (0.12).

The probability that a student who has an HSC score of 300 with 2 Unit Mathematicsand has attended a city high school is 0.59. This indicates that this student hsa about a

50% chance of passing two or more subjects. The third student who has an HSC score

of 350 with 3 Unit Mathematics and attended a country high school has a very high

probability of passing two or more subjects at the end of first year at university. In the

Table 1: Probabihties of a Student Passing 2 or More Subjects

Student HSC Score Level of City or Index Proby

Mathematics Country School

1 280 2UA City -1.2 0.12

2 300 2U City 0.5 0.59

3 350 3U Country 1.5 0.93

above example, the coefficients, b0, bl, b2 and b3, were assigned arbitrary values. In reahty,

these numbers are always unknown, and must be estimated from the data.

The data for this project are in the form of a single observation on each decision maker.

This means that maximum likelihood methods must be used for estimation purposes. For

Page 7: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

a sample of T observations, the likelihood function is

(7)

where y~ = 1 or 0. The logarithm of the likelihood function is maximised using numer-

ical methods. The properties of the log likelihood function guarantee that the iterative

procedure will converge to the global maximum based on any suitable set of starting

values/3(0). Also the maximum likelihood estimators of the/3 parameters are consistent,

asymptotically efficient and asymptotically normally distributed.

In this project, the two events which are considered separately are ’The student passes

all his subjects ’ and ’The student passes all three main subjects’.

Data

The data for this study were obtained from two sources. The first source was the Admin-istrative Data Processing section of the University of New England (UNE) which provided

information on every student enrolled in the Faculty of Economic Studies in 1988. The

following information was provided:-

¯ the degree in which the student was enrolled

¯ the HSC aggregate mark

¯ each student’s examination results for 1988.

For some students who live interstate, the UNE calculates an equivalent HSC aggregatemark which was then subsequently used in the analysis.

The second source of data was a questionnaire sent out to M1 students with a 1988

student number in the Faculty of Economic Studies. This survey was a postal survey with

one follow up of those who had not replied the first time. Students were asked for the

following information:-

¯ State in which their education took place

¯ Highest level of Mathematics they attempted

¯ Highest level of Enghsh they attempted

5

Page 8: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

¯ Education at a state or a private school

¯ Town or city in which they completed their education

The students were asked to answer these questions for the last two years at high school.

On the basis of this information, the following variables are defined.

D~ = 1

= 0

D2 = 1

= 0

M1 = 1

= 0

M3 = 1

= 0

E1 = 1

= 0

School = 1

= 0

City = 1

= 0

HSC =

for Bachelor of Agricultural Economic Students

otherwise

for Bachelor of Financial Administration Students

otherwise

for students who completed 2 Unit A Mathematics at high school

otherwise

for students who completed 3 or 4 Unit Mathematics at high school

otherwise

for students who completed 2 Unit English at high school

otherwise

for students who attended a state high school

otherwise

for students who attended a city high school

otherwise

the HSC tertiary entrance score for the student.

The basic model postulated is

I = f(M1, M3, D1,D2, HSC, El, School, City) (s)

Results

In this analysis, we are concentrating primarily on estimating the probability that a

student will pass all subjects. To this end, a number of probit models are estimated

including various combinations of the variables M1, M2, D1, D2, El, HSC, School and

Gity. The results of these estimations are presented in Appendix A.

The main features of the empirical results are :-

¯ The HSC aggregate is highly significant statistically.

Page 9: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

The level of Mathematics and whether the student attended a country or city school

are nearly always significant.

With regard to Mathematics, once H$C aggregate is controlled, there appears to be no

difference between 2 and 3 Unit Mathematics students. As we have chosen the 2 Unit

group as base, we therefore omit the variable M3. Accordingly, for further interpretation,

we have selected the model

I = -2.4228 + 0.009725 HSC - 1.3452 Ml - 0.5471 City(0,887} (0.00973) (0.590) (0.259)

(9)

The probabilities which result from this index are shown in Figures 1 and 2 below, in

which probability is plotted agMnst HSC aggregate for city educated students (Figure 1)

and country educated students (Figure 2). Two points emerge very clearly front these

figures.

If a student enters the university with an HSC aggregate of 260 and 2 Unit Math-

ematics, then the probability of passing is approximately 0.52 for a student from a

country school and approximately 0.32 for a student from a city school. However, if

a student enters the university with an ttSC aggregate of 260 and 2 Unit A Mathe-

matics, then these probabilities drop dramatically to 0.1 and 0.03 respectively.

The effect of 2 Unit Mathematics on a student’s prospects is very marked. For

example, for a student to have an even chance of passing all subjects with only2 Unit A Mathematics, an tlSC aggregate of about 380 if the student is from a

country school and about 440 if the student is from a city school is required.

The implications of the model could also be interpreted from a different perspective.

In Figures 3 and 4, we display the results of 2 Unit Mathematics students(Figure 3) and

2 Unit A Mathematics students (Figure 4). This enables us to see the difference between

city and country educated students. It is apparent that country students have superiorperformance. Ra(her than inferring that a country education is superior, we suspect that

we are observing a ’living away froln home’ effect. It would be interesting to see a similar

analysis performed on students from a metropolitan university. Quite possibly the results

would be reversed. We note in passing that a perusal of Appendix A shows little evidence

that private or public school education has any effect on university results, at least in theFaculty of Economic Studies.

In addition to the above analysis, we carried out a similar analysis to assess the

probabifity of students passing the core subjects common to all three degrees - namely,

Page 10: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

Economics, Econometrics and Mathematics. The same broad conclusions emerge except

that now, while the Mathematics effect is numerically very large, it is not statistically

significant. The difference between country and city is no longer quite so marked.

Conclusion

This analysis has given rise to two main conclusions.

Students who enter with an HSC aggregate of under 300 have a very low probability

of passing all subjects. Such a conclusion has clear resource implications particularly

in an environment in which Government pressure is towards increasing acceptanceof poorly qualified students.

Mathematical preparation appears to be very important. There is a clear differen-

tiation between students who have at least 2 Unit Mathematics and those who do

not.

References

Amemiya T. (1985) Advanced Econometrics, Blackwell, Oxford.

8

Page 11: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

City Students

0.8

0.7

0.6

0.5

0.4

0,3

0.2

0.1

0,0

Legend¯ 2 Unit Mathematics

200 220 240 260 280 500 $20 540 360 380 400 420 440 460HSC Aggregafes

Figure 1: Students from City Schools Passing All Subjects

1.0 1

O.g

0.8

0.7 ~.

0.6~

0.5

0.4

0.3

0.2

0.1

0,0

Country Students

Legend¯ 2 Unit Mathematics= 2 Unit A Mafhemotics

200 220 240 260 280 500 320 340 360 580 400 420 440 460HSC Aggregate

Figure 2: Students from Country Schools Passing All Subjects

Page 12: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

1.0

0.9

0.7

0~6

0.5

0.4

0.3

0,2

0.1

Studenfs wlth 2 Unit Mathematics

¯ Country Students= City Students

0.0 I 1 I I I I ! I 1 I I I I200 220 240 260 280 500 320 340 360 380 400 420 440 460

HSC Aggregote

Figure 3: Students with 2 Unit Mathematics Passing All Subjects

SJudents with 2 Unit A Mathematics

1.0

0.8

0.7

0.6

0.4.1

0.3

0.2

0.1

0.0

Legend

Country StudentsClty Students

200 220 240 260 280 .~00 $20 540 360 580 400 420 440 460HSC Aggregote

Figure 4: Students with 2 Unit A Mathematics Passing All Subjects

10

Page 13: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

Results of the Different Probit Analysis

Table 2: The Variable ’Passing All Subjects’ vs Different Combinations

Const M1 M3 D1 D2 HSC E1 School

¯ ¯ ¯t

o

,tt ,tott*tt

¯ tt ot,t,ttottottott otott ,t¯ tt otott ot¯ tt otott ot¯ tt ,tott ot¯ tt ,tott ott,tt ,t¯ tt ott

¯

o

¯

¯

¯

¯

o

o

.tt

°t °tt

o

.tt,tt

,tt,tt,tt,tt,tt,tt,tt*tt*tt,tt,tt,ttott,tt,ttott

o

o

.t

.t

o

o

City R~

0.074

0.068

0.025

oft 0.0750.034

0.203

0.163

0.159

ott 0.146°tt 0.132

°t 0.213

°t 0.198

"t 0.1880.169

"t 0.2420.219

0.214

°t 0.255

0.219

0.247

° t t 0.272

°t 0.283

0.227

°tt 0.281

0.251

"t 0.289

Predict

0.661

0.637

0.637

0.653

0.637

0.710

0.645

0.669

0.661

0.669

0.685

0.710

0.694

0.669

0.718

0.702

0.710

0.726

0.710

0.726

0.702

0.726

0.669

0.726

0.742

0.726

11

Page 14: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise
Page 15: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

WORKING PAPERS IN ECONOMETRICS AND APPLIED STATISTICS

goe~di~ ~in2.omt ~od~e~. Lung-Fel Lee and William E. Griffiths,No. I - March 1979.

Howard E. Doran and Rozany R. Deem No. 2 - March 1979.

Willi~m Grlfflths and Dan Dao, No. 3 - April 1979.

D.S. Prasada Rao0 No. 5 - April 1979.

O~ o~ the ~eiqC~_d ~Za~ O&a~ Za an ~ ~~~~od~: H Dg0r_~ o~ ~ox~ o~ g~gcgen~, tloward E. Doran,No. 6 - June 1979.

Re/~ ~6q~ ~od2_~. George E. Battese andBruce P. Bonyhady, No. 7 - September 1979.

Howard E. Doran and David F. Williams, No. 8 - September 1979.

D.S. Prasada Rao, No. 9 - October 1980.

~h~ Do/~ - 1979. W.F. Shepherd and D.S. Prasada Rao,No. I0 - October !980.

~atag gOn~-~ and gaozus-Yecik~m Do2a to gatOnaLea ~az~~ ~u,w3.ixutu~ ~o~aad Neqo~ ~wu~ ~LaRa. W.E. Grlfflths andJ.R. Anderson, No. II - December 1980.

~ac&-O~-~ ~eaZ ia ~ ~a~z~rme o~ 7�~. !toward E. Doranand Jan Kmenta, No. 12 - April 1981.

~b~ Oad~z ~ui~aq~J~ ~ta/~acaa. H.E. Doran and W.E. Gri£fiths,No. 13 - June 1981.

Ze~ ~Ae go~o~_a ~,ar.e gadem and ~te ~o~uaq.Pauline Beesley, No. 14 - July 1981.

~o~ Doia. George E. Battese and Wayne A. Fuller, No. 15 - February1982.

Page 16: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise
Page 17: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

2

~)~. H.I. Toft and P.A. Cassidy, No. 16 - February 1985.

~t~ :Yeata ~ the Tazdkat .¢do~ab1~ent ~ ~!~e ~~~ ~od~.H.K. Doran, No. 17 - F~bruary 1985.

J.~.B. Guise and P.A.A. Beesley, No. 18 - February 1985.

g.E. Grtfftths and K. Surekha, No. 19 - August 1985.

~~ ~. D.S. Prasada ~ao, No. 20 - October 1985.

H.E. Doran, No. 21- November 1985.

R. Carter Hill and Peter J. Pope, No. 22 - November 1985.

~~ ~oa. gilliam E. Grlffiths, No. 23- February 1986.

~ ~ ~~. T.J. Coellt and G.E. Battese. No. 24-February 1986.

~~ ~~ ~ ~~ ~. George E. Battese and

Sohall J. Halik, No. 25- April 1986.

George E. Battese and Sohail J. ~allk, No. 26 - April 1986.

~~. George E. Battese and Sohatl J. Haltk, No. 27 - Hay 1986.

George E. Battese, No. 28- 3une 1986.

N~. D.S. Prasada Rao and J. Salazar-Carrillo, No. 29- August1986.

~~ ~ ~ Y~ ¢~ ~ ~. ~(I) ¢~ ~. ll.E. Doran,~.E. Grlffiths and P.A. Beesley, No. 30 - August 1987.

gi111am E. Griffiths, No. 31- November 1987.

Page 18: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise
Page 19: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

"3

Chris ~. Alaouze, No. 3~ - Sept.em[~er, 19BB.

G.E. Battese, T.J. Coelll a,ld T.C. Colby, No. 33--January, 19B9.

Tim J. Coelll, No. 34 - February, 1989.

Idllllam Grlfflths and Geo~ge Judge, No. 36 - Februagy, 1989.

~flanao¢~l~ o~ ~rtztiqdtZot[ WuteJ~ 1)~tq~ng 7_)qouq!~t. Chris N. Alaouze,No. 37 - April, 1989.

~~ ~ IVa~ ~oc~tt Tqo&~. Chris M. Alaouze, No. 38 -July, 1989.

~tl~ ?n,t~ ~tnea, t ~,~oqqammDW ~eatuati~a o~ Ya~ a~ ~a~,toqqinq.

Chris M. Alaouze and Campbell R. Fitzpatrick, No. 39- August, 1989.

SatUaa~ a~ ~i~k ~ u~ Yeemtnqt~ ~nnetat~t ~eqaeaaiona cmdDa~a. Guang II. Wan, William E. Grlfflths and Jock R. Anderson, No. 40 -September 1989.

tlhe O~ o~ ~apacU~ Yhardnq in Yteclw~tLc Dtlnan~c T,mqaaomd~Wo~ ~’hane.d ’.Reaea~ei]~ OpemU.e~. Chris H. hlaouze, No. 41 - November,1989.

~~ ~heorul and YimutatLott ~/pp~ea. William Grlfflths andHelmut L~tkepohl, No. 42 - Flarch 1990.

Howard E. Doran, No. 43- March 1990.

No. 44 - Flarch 1990.lloward E. l)ovan,

:Re~. lloward Doran, No. 45 - Hay, 1990.

Howard Doran and Jan Kmenta, No. 46 - Flay, 1990.

Page 20: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise
Page 21: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise

E./~. Selval~a[ha~. No. ~.’[ -Sepl.e~be~0 1990.

II.E. l)oran, No. ~18- gepte~nber, 1990.

Page 22: PREDICTION OF THE PROBABILITY OF SUCCESSFUL FIRST YEAR ... · for students who completed 2 Unit English at high school otherwise for students who attended a state high school otherwise