
Higher Education 20: 135-142, 1990. © 1990 Kluwer Academic Publishers. Printed in the Netherlands.

Student evaluations of teaching and courses: Student study strategies as a criterion of validity*

MICHAEL PROSSER 1 & KEITH TRIGWELL 2

1 Centre for Teaching and Learning, University of Sydney, Sydney, Australia; 2 University of Technology, Sydney, Australia

Abstract. Recent developments in higher education are likely to lead to increased evaluation of teaching and courses and, in particular, increased use of student evaluation of teaching and courses by questionnaire. Most studies of the validity of such evaluations have been conducted in terms of the relationship between traditional measures of 'how much' students learn and their ratings of teaching and courses. But there have been few if any studies of the relationship between students' ratings of teaching and the quality of student learning, or the way in which students approached their learning.

For the evaluation of teaching and courses by questionnaire to be valid we would expect that (1) those students reporting that they adopted deeper approaches to study would rate the teaching and the course more highly than those adopting more surface strategies and, more importantly, (2) those teachers and courses which received higher mean ratings would also have, on average, students adopting deeper strategies.

In this paper we report the results for eleven courses in two institutions. The results, in general, support the validity of student ratings, and suggest that courses and teaching in which students have adopted deeper approaches to learning also receive higher student ratings.

Introduction

Recent policy developments in higher education are likely to lead to increased evaluation of teaching and courses and, in particular, increased use of student evaluation of teaching and courses by questionnaire. For example, the Australian Government's White Paper on higher education (1988) argues for the development of a set of performance indicators to be used to facilitate resource allocation within institutions. The Australian Vice-Chancellors' Committee and the Australian Committee of Directors and Principals in Advanced Education (1988) established a working party with the task of developing a set of such indicators. Among the indicators for teaching and the curriculum were average student ratings of teaching and courses by department.

Before such questionnaires can be used, all aspects of their validity need to be carefully considered. For such ratings to be valid they must, at least, be a measure of teaching and course effectiveness. But exactly what constitutes teaching effectiveness is difficult to define. Measures of teaching effectiveness used by other researchers to validate student ratings have included: (1) self ratings, (2) ratings by colleagues, (3) ratings by administrators, (4) ratings by graduates, and (5) student achievement (Cohen, 1981; Marsh, 1987). It is the last of these, student achievement, which has received most attention.

Student achievement is usually defined in terms of the amount students learn in a particular course, and assessment grades obtained by students are the usual measure of the amount learnt. Using these assumptions, many reviewers have cited significant positive correlations between assessment results and student ratings to support the validity of student ratings (Cohen, 1981). But in a recent review of a number of meta-analyses of these correlations, Abrami, Cohen and d'Apollonia have shown that the meta-analyses are not consistent with one another, and consequently they state that '... conclusions about the validity of student ratings ... will have to wait a more complete analysis' (Abrami, Cohen and d'Apollonia, 1988, p. 162). Thus, despite many suggestions to the contrary in the literature, it seems that the validity of student ratings in terms of student achievement needs further study.

But even if the validity of ratings is supported in terms of traditional measures of student achievement, is this achievement an appropriate criterion of student learning? Research studies in science education in the last decade have shown that 'how much' students learn is just one dimension of the learning outcome. An important second dimension is the quality of the learning. Differences between quantity, or how much has been learnt, and quality (how well it has been understood) have been demonstrated, for example, by Gunstone and White (1981), who showed that just under half of a student group who scored highly in senior secondary school physics examinations were unable to explain a basic (non-formula-based) physics phenomenon. Other studies (Johansson et al., 1985; Prosser and Millar, forthcoming) show that many students studying first year university Newtonian physics still hold pre-Newtonian conceptions which are not detected by traditional examinations, suggesting that quality and quantity are not necessarily related.

Indicators of the quality of learning are difficult to develop and expensive to collect. But a substantial amount of research has shown that the quality of student learning is related to the quality of their approaches to learning (for example, see van Rossum and Schenk, 1984). That is, the research shows that most students who have attained a high quality learning outcome have also adopted a high quality approach to their learning. Thus it would be expected that high quality teaching and courses would be associated with high quality student approaches to learning. Indeed, it may be argued that the quality of teaching and courses has a more direct influence on approaches than on outcomes, and that approaches may therefore be a better criterion.

Other studies have shown that the quality of students' approaches to learning is also related to students' perceptions of the academic learning environment, and that high quality approaches to learning are related to students' perceptions of high quality teaching. It should be noted, however, that these findings are based upon more qualitative analyses of student interview data, and that efforts to confirm this relationship using student questionnaires have not been so successful (Entwistle and Ramsden, 1983; Meyer and Parsons, 1989; Entwistle and Tait, 1990). These studies, using the student and not the class or course as the unit of analysis, have suggested that students who adopt low quality approaches to learning prefer teaching methods which encourage such approaches, and thus question the validity of such ratings (Entwistle and Tait, 1990).


In this paper we describe the results of some of our research examining the validity of student ratings in terms of the relationship between the quality of the students' approaches to learning and their ratings of teaching and courses. More specifically, for the validity of questionnaire methods to be supported, we would expect: (1) that within courses, students with a high quality approach to learning would rate the teaching and the course higher than those with a lower quality approach, and (2) that between courses, those teachers and courses that achieved higher mean ratings would have, on average, students adopting higher quality approaches to learning. The second of these is more important, as noted by Cohen:

Researchers using the individual student as the level of analysis [rather than the class] are not asking (though they may intend to) whether the teachers who receive high student ratings are also the ones who contribute to student learning ... this type of analysis does not answer our validity concerns. To do this we need to know the relationship between ratings and student achievement for individual teachers. (Cohen, 1981, p. 284)

Methodology

The study was conducted in collaboration with nine academic staff from two Australian universities: the University of Sydney (SU) and the University of Technology, Sydney (UTS). Results were collected for eleven different courses (four at SU and seven at UTS) using student evaluation of teaching questionnaires and components of a standardised approaches to study inventory.

Researchers examining the validity of student evaluation questionnaires in terms of student achievement have used multisection courses rather than different courses because it was not possible to compare achievement across courses. However, it is possible to compare the quality of student approaches to learning across different courses using a standardised inventory, as was the case in this study.

Student evaluation of teaching and courses questionnaires

The student evaluation of teaching and courses questionnaires used in the two institutions differed slightly; both were adaptations of the questionnaire developed and used since 1982 at the University of Queensland (Moses, 1986). The evaluation of teaching component of each is composed of a set of questions related to specific aspects of teaching, and one overall question. The seven specific aspects of teaching common to both are that the teacher: made explanations clear, taught to help understanding, stimulated interest, created opportunity for questions, was available for consultation, made objectives clear, and was well prepared. All were rated by students on a five point scale from strongly agree to strongly disagree. The global teaching question asked students to rate the teaching on a seven point scale from very poor (1) through satisfactory (4) to excellent/outstanding (7).

An overall course rating (seven point scale) was available on the Sydney University questionnaire only. Data for an overall course rating on the UTS sample were obtained by averaging scores from four questions on aspects of the course being evaluated: the accuracy of the course description, the relationship between course goals and assessment tasks, the workload, and whether assessment tasks allowed students to demonstrate what they had learnt. While we have some concerns with this procedure, it has been used by Cohen (1981) in meta-analyses. These ratings (on a five point scale) were converted to a seven point scale for compatibility with the Sydney University data.
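The paper does not report the conversion formula itself; a minimal sketch of one standard linear rescaling from a five point onto a seven point scale (an assumption about the procedure, not the authors' documented method) is:

```python
def rescale(score, old_min=1.0, old_max=5.0, new_min=1.0, new_max=7.0):
    """Linearly map a rating from one scale onto another.

    The midpoint is preserved: a 3 on a 1-5 scale maps to a 4 on a 1-7 scale.
    """
    return new_min + (score - old_min) * (new_max - new_min) / (old_max - old_min)

# Hypothetical UTS ratings (1-5) on the four course items: description
# accuracy, goals/assessment match, workload, assessment coverage.
items = [4, 3, 5, 4]
overall_course_rating = rescale(sum(items) / len(items))  # -> 5.5 on the 1-7 scale
```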

Approaches to study inventory

The Lancaster Approaches to Study Inventory (Entwistle and Ramsden, 1983) contains sixteen subscales. Three of them are labelled Deep Approach, Relating Ideas and Surface Approach. The twelve item inventory used in this study was constructed using all four items from the Deep Approach and Relating Ideas subscales, along with four items (SA2, 4, 5 and 6) from the Surface Approach subscale. Students responded to each item on a five point scale and an overall score was obtained for each of the three subscales. The three subscales have been labelled Deep, Relational and Surface. A description of each type of approach and an example item from each subscale is given below:

Deep: Students attempt to understand, and to determine the meaning of, the subject. They take nothing given as automatically correct and question both themselves and the subject. ('I generally put a lot of effort into trying to understand things which initially seem difficult.')

Relational: Students try to see connections between previously studied and current material, to relate new ideas to real life situations, to integrate the subject into the whole and to see the task in a wider perspective. ('I find it helpful to "map out" a new topic for myself by seeing how the ideas fit together.')

Surface: Students concentrate on memorising the material. They don't have the time to think about the implications of what they have read, indicating an unreflective or passive approach to the task. ('I find I have to concentrate on memorising a good deal of what we have to learn.')

In the context of this study, a high quality approach to learning is one in which the students indicate that they adopt deeper and more relational strategies rather than more surface strategies.
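As an illustration of the scoring just described, each subscale score is simply the sum of its four item responses. The sketch below assumes this; the item labels and the student's responses are ours and purely hypothetical, while the 5-point response format and the four-items-per-subscale structure come from the inventory as used here.

```python
# Item labels are illustrative only; they are not the inventory's own codes.
SUBSCALES = {
    "Deep":       ["d1", "d2", "d3", "d4"],
    "Relational": ["r1", "r2", "r3", "r4"],
    "Surface":    ["s1", "s2", "s3", "s4"],  # items SA2, SA4, SA5, SA6
}

def subscale_scores(responses):
    """Sum the four 1-5 item responses belonging to each subscale."""
    return {name: sum(responses[item] for item in items)
            for name, items in SUBSCALES.items()}

# One hypothetical student's responses to the twelve items:
student = {"d1": 4, "d2": 5, "d3": 3, "d4": 4,
           "r1": 4, "r2": 4, "r3": 5, "r4": 3,
           "s1": 2, "s2": 3, "s3": 2, "s4": 2}
print(subscale_scores(student))  # {'Deep': 16, 'Relational': 16, 'Surface': 9}
```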

Procedure

The data were collected in the final three weeks of the teaching period from courses in seven different faculties, with sample sizes ranging from 37 to 350 and a total of 999. The analysis took two forms. To examine the within course variation, items on the teaching and courses evaluation questionnaire were correlated with items on the approaches to study inventory for each course, and the mean correlation for all courses was calculated. This tests whether students with a high quality approach to learning rate the teaching and the course higher than those with a lower quality approach. To examine the between course variation, the within course means of the items on the teaching and courses evaluation questionnaire were correlated with the within course means of the items on the approaches to study inventory. This tests whether those teachers and courses that achieved higher mean ratings have, on average, students adopting higher quality approaches to learning.

In the analysis, correlations were calculated between the evaluation questionnaire items and the Deep, Relational and Surface subscales. Since each of these subscales contributes a unique amount of information, an overall or total indicator of study approach was calculated by summing the scores of the two high quality approaches (Deep and Relational) and dividing this sum by the score of the low quality approach (Surface). A quotient rather than a difference was used because the Surface subscale (a component of the Reproducing Orientation scale) is a different dimension from the Deep and Relational subscales (components of the Meaning Orientation scale), and so should not be subtracted from them.
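A sketch of the two analyses and of the overall indicator follows, written with pandas. It is our reconstruction of the procedure described above, not the authors' code; the column names and the use of Pearson correlations are assumptions.

```python
import pandas as pd

# Assumed layout: one row per student, with a course identifier, an evaluation
# item (e.g. the overall teaching rating) and the three subscale scores.

def overall_approach(df: pd.DataFrame) -> pd.Series:
    """Overall study approach indicator: (Deep + Relational) / Surface."""
    return (df["deep"] + df["relational"]) / df["surface"]

def within_course_mean_correlation(df: pd.DataFrame, item: str) -> float:
    """Within course analysis: correlate the item with the approach score
    inside each course, then average the per-course correlations."""
    df = df.assign(approach=overall_approach(df))
    per_course = df.groupby("course").apply(lambda g: g[item].corr(g["approach"]))
    return per_course.mean()

def between_course_correlation(df: pd.DataFrame, item: str) -> float:
    """Between course analysis: correlate the course means of the item with
    the course means of the approach score (here, n = 11 courses)."""
    df = df.assign(approach=overall_approach(df))
    means = df.groupby("course")[[item, "approach"]].mean()
    return means[item].corr(means["approach"])
```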

Results and discussion

Table 1 shows the results of the within course analysis of the questionnaire data. The dimensions listed under skill, rapport and structure refer to the seven dimensions of teaching common to both questionnaires and referred to in more detail in the methodology section. The table contains (a) the averages of the within course correlations of the student evaluation questionnaire items with the study process items, and (b) the number of courses which produced positive correlations greater than .10 and the number with negative correlations less than -.10. (While it is not possible to define absolute cut-off measures, a cut-off of .10 was adopted as this was defined as being at least a small effect size by Cohen (1977), and therefore worthy of further consideration and study.)

The table shows quite small average correlations between the components of teaching items (skill, rapport and structure) and the approaches to study inventory subscales. It should be noted, however, that the signs of all the average correlations are as would be expected if the students in the class who are adopting deeper approaches to their study rated the teaching more highly. The average correlations between the overall teaching and course evaluation items and study approaches inventory items are somewhat larger, although still quite small. Again the average correlations are in the directions expected to support the validity, and the numbers of courses with positive and negative correlations are also distributed as would be expected.

The more important set of correlations for this paper are those between the courses. These correlations are shown in Table 2. The average of each of the teaching and course evaluation items for each class group is correlated with the average study approach score. For the overall approach to study score ((Deep + Relational)/Surface), Table 2 also shows the correlations excluding one outlying case.


Table 1. (a) Means of the within course correlations between teaching and courses evaluation questionnaire items and approaches to study inventory subscales, and (b) the number of courses with correlations equal to or greater than .10 and equal to or less than -.10

                        (a) Correlations             (b) No. of courses with corr. >.10 / <-.10
                        Surf.    Deep     Relat.     Surf.         Deep          Relat.
                                                     >.10   <-.10  >.10   <-.10  >.10   <-.10
Overall evaluation
  Teaching              -.13***  .09***  .14***       1      7      5      1      7      0
  Course                -.13***  .09***  .17***       1      7      7      1      9      0
Skill
  Explanation           -.11***  .00     .04          2      7      2      1      3      1
  Understand            -.11***  .06*    .08**        1      7      4      0      4      0
  Interest              -.13***  .09***  .19***       0      6      5      1      7      1
Rapport
  Questions             -.05     .08**   .07*         1      6      5      1      4      1
  Consultation          -.07*    .08**   .09***       1      4      4      2      6      0
Structure
  Objectives            -.04     .02     .02          1      5      5      2      2      1
  Prepared              -.09***  .03     .05          1      6      3      1      3      0

Notes: (a) n = 999. (b) *p<0.05; **p<0.01; ***p<0.005.

Table 2. Correlations between the within course means of teaching and courses evaluation questionnaire items and approaches to study inventory subscales

                        Surf.    Deep     Relat.   (Deep + Relat.)/Surf.
                        n = 11   n = 11   n = 11   n = 11    n = 10 (a)
Overall
  Teaching              -.51      .17      .06      .52*      .60*
  Course                -.12      .22      .36      .34       .78***
Skill
  Explanation           -.04     -.07     -.29     -.02       .43
  Understand            -.21      .16     -.09      .25       .55*
  Interest              -.25      .16      .25      .40       .64**
Rapport
  Questions             -.14      .22      .01      .24       .33
  Consultation           .01      .17      .05      .07       .29
Structure
  Objectives             .41     -.13     -.28     -.41       .06
  Prepared               .26      .05     -.20     -.22       .18

Notes: (a) One case, identified as an outlier, removed. (b) *p<0.10; **p<0.05; ***p<0.01.

This outlying case was identified from scattergrams of the correlation coefficients. It is quite an unusual case in that the lecturer presented a quite unorganised course, but one in which he wanted his students to question and criticise the ideas presented. He selected topics that the students were likely to question and discuss, but there was little systematic development of ideas across topics and class sessions. In this case the students rated the teaching highly, but did not rate the course, or the skill, rapport and structure components of teaching, very highly.

While the overall teaching and course items correlate with the approaches to study in the expected direction, some of the teaching components, especially structure, do not. Some of this effect is caused by the outlier.

Focussing on the overall study process correlations excluding the outlying case (the final column), it can be seen that many of the correlations are large and in the expected direction. Indeed, the correlation with the overall evaluation of teaching is .60 (.52 including the outlier) and with the overall evaluation of the course is .78 (.34 including the outlier). These correlations strongly suggest that the courses in which students adopted deeper approaches to study were also the courses in which the teaching was rated more highly. Thus, using a criterion not previously utilised, the results support the validity of student ratings of teaching and courses.

Conclusion

The validity of student evaluation of teaching by questionnaire has not previously been studied using the quality of student approaches to learning as a criterion. Though conducted on a small sample of classes, and with one member of that sample being a significant outlier, this study supports the validity of using student questionnaires as one of the methods of evaluating teaching and courses. While this conclusion may appear to be inconsistent with that of Entwistle and Tait (1990), it should again be emphasised that their study used the student, and not the class, as the unit of analysis. These results suggest that when the class is used, student questionnaires appear to be valid.

In this context, the research reported here and elsewhere shows a pattern of inter-relations between student ratings, self ratings by teachers, ratings by colleagues, administrators and graduates, student achievement and now student approaches to learning. The pattern suggests that no single criterion is sufficiently valid to be a measure in itself. A construct validity approach needs to be adopted for both evaluation practice and further research, in which a number of measures, along with student ratings, are required. This study has supported the use of student ratings as one such measure. What of the future? We believe that current practice in evaluating teaching and courses needs to incorporate several indicators, and that continuing research needs to focus on the relations between several different measures.


Note

* An earlier version of this paper was presented at the 1989 Annual Conference of the Higher Education Research and Development Society of Australia.

References

Abrami, P. C., Cohen, P. A. and d'Apollonia, S. (1988). 'Implementation problems in meta-analysis', Review of Educational Research 58: 151-179.

Australian Government Department of Education and Training (1988). Higher Education: A Policy Statement. Canberra: Australian Government Printing Service.

Australian Vice-Chancellors' Committee/Australian Committee of Directors and Principals (1988). Report of the Working Party on Performance Indicators. Braddon, Australian Capital Territory.

Cohen, J. (1977). Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press.

Cohen, P. A. (1981). 'Student ratings of instruction and student achievement: a meta-analysis of multisection validity studies', Review of Educational Research 51: 281-309.

Entwistle, N. J. and Ramsden, P. (1983). Understanding Student Learning. London: Croom Helm.

Entwistle, N. J. and Tait, H. (1990). 'Approaches to learning, evaluations of teaching, and preferences for contrasting academic environments', Higher Education 19(2): 169-194.

Gunstone, R. F. and White, R. T. (1981). 'Understanding gravity', Science Education 65: 291-299.

Johansson, B., Marton, F. and Svensson, L. (1985). 'An approach to describing learning as change between qualitatively different conceptions', in West, L. H. T. and Pines, A. L. (eds.), Cognitive Structure and Conceptual Change. London: Academic Press.

Marsh, H. W. (1987). 'Students' evaluation of university teaching: research findings, methodological issues, and directions for the future', International Journal of Educational Research 11: 253-388.

Meyer, J. H. F. and Parsons, P. (1989). 'Approaches to studying and course perceptions using the Lancaster inventory - A comparative study', Studies in Higher Education 14: 137-153.

Moses, I. (1986). 'Self and student evaluation of academic staff', Assessment and Evaluation in Higher Education 11: 76-86.

Prosser, M. and Millar, R. (forthcoming). 'The how and what of learning physics', European Journal of Psychology of Education.

Van Rossum, E.J. and Schenk, S.M. (1984). 'The relationship between learning conception, study strategy and learning outcome', British Journal of Educational Psychology 54: 73-83.