the implications of using skills tests as basis for a ... · the implications of using skills tests...

Mari Räkköläinen, Finnish National Board of Education, Finland Kathryn Ecclestone, University of Exeter, UK

The implications of using skills tests as basis for a national evaluation system in Finland Outcomes from a pilot evaluation in 2002–2003 in Finland Evaluation 1/2005

FOREWORD

Initial Vocational qualifications in future will incorporate a skills test based testimonial as proof of the student’s attainment of the goals set in occupational studies. In parallel, a national system of evaluating learning results along with a set of assessment practices to be used by the providers of education are being developed. The pilot evaluation presented here tested a model, where the national evaluation of learning results was integrated with skills tests based assessment and where the evaluation data was gathered directly from the skills tests organised by the institutions. The skills test based student assessment to be included in future vocational qualifications contains new challenges to be met and new questions to be answered relative to the evaluation of learning results. The present report is an account of the underlying development work, the challenges brought out by the pilot evaluation and the final evaluation outcomes. Based on the evaluation pilot, the model for evaluating learning results will be developed further and a plan drawn up for extracting the national evaluation data from the data provided by the skills tests.

It is our hope that this report will form a productive basis for contiguous work. Concomitantly, the report is an interim evaluation intended as feedback to be used by the participants and to be exploited in developing evaluation further. The majority of the partakers will go on to tackling these challenges in oncoming projects in the next phase of the pilot. The demanding but interesting and inspiring work continues.

Dr. Kathryn Ecclestone from the University of Exeter, who is the co-author of the report, has acted also as a peer evaluator of the development project. She has contributed invaluable outside thought and evaluatory expertise to help analyse the multiple questions and challenges brought out by the pilot. Evaluation by skills tests not only changes routines and the ways of implementing the evaluation, the whole concept of evaluation and the evaluation culture – it compels us to re-evaluate the principles governing the reliability and practicality of evaluation and the aims of national evaluation. The report discusses these questions also from the perspective of theory, clarifies the concepts of evaluation, assessment and quality assurance, and compares the Finnish evaluation system with that in use in the UK. The report is suitable for use also by all who are interested in or working with developing evaluation. We hope the report will give rise to a widespread and lively debate.

A large number of teachers, students, workplace trainers, and experts from research institutions, labour market organisations, and education administration has taken part in the development work. We wish to thank them, one and all, for their invaluable input in developing the evaluation system. Helsinki, December 2004 Pentti Yrjölä Mari Räkköläinen Head of Evaluation unit, FNBE Project Manager, FNBE

CONTENTS

1 INTRODUCTION...................................................................................................... 7

1.1 THE BACKGROUND ......................................................................................................................................7 1.2 OBJECTIVES OF THE DEVELOPMENT PROJECT ....................................................................................7 1.3 CO-OPERATION WITH A PEER EVALUATOR ..........................................................................................8

2 THE SYSTEM OF VOCATIONAL EDUCATION AND NATIONAL EVALUATION IN FINLAND..................................................................................10

2.1 VOCATIONAL EDUCATION AND TRAINING ........................................................................................10

3 EXISTING APPROACHES TO EVALUATION IN FINLAND............................13

3.1 A NEW EVALUATION STRATEGY PROPOSED BY THE MINISTRY OF EDUCATION.....................14 3.2 TRANSITION TO A NEW SYSTEM OF EVALUATION ............................................................................14 3.3 IMPLICATIONS OF INTEGRATION...........................................................................................................15

4 PRINCIPLES AND FEATURES OF ASSESSMENT AND EVALUATION ........15

4.1 DEFINITIONS, TERMS AND CONCEPTS..................................................................................................16 4.2 THE PURPOSES OF EVALUATION ............................................................................................................17 4.3 QUALITY ASSURANCE.................................................................................................................................19 4.4 QUALITY CONTROL ....................................................................................................................................19 4.5 APPROACHES TO EVALUATION...............................................................................................................20

5 APPROACHES TO EVALUATION IN FINLAND AND THE UK..................... 22

5.1 GENERAL PRINCIPLES AND VALUES......................................................................................................22 5.2 GOALS AND VALUES IN THE FINNISH EDUCATION SYSTEM...........................................................22 5.3 CHARACTERISTICS OF EVALUATION IN THE UK ................................................................................24

6 A NEW EVALUATION MODEL BASED ON QUALITY ASSURANCE AND QUALITY CONTROL.............................................................................................. 27

6.1 A PILOT PROJECT CARRIED OUT BY NBE IN 2002–2003...................................................................27 6.2 QUESTIONS TO BE ADDRESSED ..............................................................................................................28 6.3 METHODS AND DATA COLLECTION ......................................................................................................29 6.4 STANDARDS AND CRITERIA .....................................................................................................................32 6.5 RESULTS OF THE PILOT .............................................................................................................................32

7 CONCLUSIONS FROM THE PILOT PROJECT ................................................. 35

7.1 THE ROLE OF SKILLS TESTS......................................................................................................................35 7.2 QUALITY ASSURANCE AND QUALITY CONTROL ................................................................................35 7.3 PRACTICAL PROBLEMS IN USING LOCAL SKILLS TESTS IN A NATIONAL EVALUATION

SYSTEM...........................................................................................................................................................36 7.4 ALTERNATIVE MODELS TO COLLECT NATIONAL EVALUATION DATA .......................................37

8 CHALLENGES FOR FUTURE DEVELOPMENT............................................... 40

4

9 SUGGESTIONS FOR NEXT PHASE OF THE PROJECT .................................. 40

SOURCES............................................................................................................................. 42

APPENDIX 1 METHODOLOGY AND DATA COLLECTION ...............................................................................43 APPENDIX 2 ’CRITICAL POINTS’............................................................................................................................50 APPENDIX 3 THE STRUCTURE OF THE EVALUATION PILOT (2002-2003).................................................52 APPENDIX 4 QUALITY ASSURANCE SYSTEM IN EVALUATION PILOT .........................................................53 APPENDIX 5 QUALITY REQUIREMENTS OF EVALUATION.............................................................................54APPENDIX 6 PILOT EVALUATION 2002–2003...................................................................................................65

5

1 INTRODUCTION

1.1 THE BACKGROUND This report analyses an important reform of vocational education and training in Finland. Student assessment was renewed in a way that vocational competence in future will be assessed by practical skills tests and that this assessment will form the basis of a national evaluation system. A decision made by the Ministry of Education in 1998 has in view to make all vocational qualifications attended by a practical skills test, demonstrating the student’s attainment of the aims of vocational training. The final transition to the new system of assessment and evaluation is scheduled for the autumn of 2006.

The skills tests are an integral part of the teaching and learning process, designed to unify the practices of providers whilst also diversifying student assessment to reflect local conditions and making the needs and desiderata of working life part of the student’s vocational proficiency. The purpose hence is to enhance the quality of vocational training and to ensure that the training meets the perspectives of working life. In addition to promoting better student assessment, the reform also strives to improve national evaluation. An important aim of the development project discussed in this report is to explore whether national evaluation data can be extracted from the skills tests. A parallel aim is that the evaluation of national learning results further will be an evaluation of the effectiveness of training. The project will reveal the strengths and weaknesses of these aims and suggest possible solutions.

In the past, national evaluation of learning results in Finland was based on nationally devised uniform tests and work assignments in the form of a final examination. The integration of national evaluation with skills tests has as its aim to abolish a system based on separate national tests and, instead, to gather the evaluation data straight from local tests flexibly administered by the institutions. This model has been tested in a national pilot project coordinated by the National Board of Education (NBE1) in 2002–2003. This report is based on the results of the pilot evaluation and the questions and problems brought out by a parallel separate investigation carried out by the NBE and an external expert from the UK.

In this report, we will discuss to what extent using the skills test simultaneously for assessing students’ competence and for evaluating the effectiveness of training positively influences the concept of evaluation and the nature and quality of the evaluation data. We will also discuss the tensions that have emerged between the parties involved in the Finnish evaluation system (e.g. control of and trust in evaluation, norms and flexibility of steering, fairness, objectivity and comparability of evaluation, and the difficult balance in all assessment and evaluation systems between validity and the requirement of reliability).

1 Since 2004 NBE has been called FNBE (Finnish National Board of Education).

7

1.2 OBJECTIVES OF THE DEVELOPMENT PROJECT The main objectives can be summarized as follows: • To evaluate the strengths and weaknesses of a new evaluation system, taking into

account the perspectives of different stakeholders.

• To describe key features of assessment and evaluation in vocational education and the values and principles that underlie them.

• To differentiate clearly between formative and summative assessment and evaluation and to define other important concepts (such as reliability, validity, standards, quality assurance and quality control).

• To analyze technical, professional, political and educational tensions within the existing system of evaluation and those that might emerge in an evaluation system based on skills tests.

• To base recommendations 1) on evidence from pilot projects that test different strategies for national and local evaluation and 2) on ideas from an impartial external expert.

• To pilot a new approach to evaluation through two (pilot) projects, in collaboration with representatives from colleges, vocational schools, employers, the social partners and government departments in the Finnish vocational education system.

• To offer advice to the National Board of Education and the Ministry of Education about the implications of using national skills tests for evaluating the Finnish vocational education system.

1.3 CO-OPERATION WITH A PEER EVALUATOR The project was launched by the Evaluation Unit at NBE in January 2002. Anu Räisänen, Counsellor of Education at NBE and Planning Specialist Mari Räkköläinen, manager of the project, devised the first project plan for the development project. Dr. Kathryn Ecclestone from the University of Newcastle in the UK was asked to act as peer evaluator and expert on assessment systems and practices in vocational education. Dr. Kathryn Ecclestone later transferred to the University of Exeter. Paula Mäkihalvari works as the project planner and is in charge of the data analysis. Anu Räisänen has changed jobs and now works as the Head Coordinator of Evaluation of Vocational Education at the Finnish Education Evaluation Council. Since the spring of 2002 project manager Mari Räkköläinen has been in charge of developing the

8

project and of reporting.

The role of the peer evaluator is to bring an outside perspective and expertise in assessment from the UK to the project. Kathryn worked with Mari Räkköläinen and Paula Mäkihalvari to establish clear objectives for the cooperation and to suggest ideas, advice and information at various points during the project. She also met and interviewed various officials, teachers and social partners during six working periods in Finland (usually two each year) (see Appendix 1). Her general role was to offer an objective, impartial and supportive perspective, based on intensive experience of the project and an in-depth understanding of the Finnish vocational education and evaluation system.

During working periods the expert familiarised herself with principles, values and features of different aspects of the assessment and evaluation system in vocational education in Finland and identified ideas about technical, practical and critical aspects of evaluation amongst different agencies and interested parties in Finland (all 7 work periods). This helped NBE officials understand other assessment and evaluation systems and the implications of their principles and features for Finland’s system.

As developers from NBE, we visited England and familiarized ourselves with UK´s education system in VET, focusing on quality assurance and quality control procedures and principles in assessment systems. We had beforehand analysed the main problems and ‘critical points’ which we might have with the new system. During the visit to the UK, we had opportunity to discuss with those who evaluate and the evaluated. We visited the Qualification and Curriculum Authority (QCA), central awarding bodies and some institutions. We interviewed many authors and met teachers, students, developers and principals and inspectors and external verifiers. Our aim was to understand the main features of evaluation in the UK’s system and to explore the balance between external, summative evaluation required by government and internal evaluation undertaken by institutions.

A list of working periods, interviews, visits and observation that have formed the basis for our cooperation are included in Appendix 1.

9

2 THE SYSTEM OF VOCATIONAL EDUCATION AND NATIONAL EVALUATION IN FINLAND

2.1 VOCATIONAL EDUCATION AND TRAINING Initial vocational education is provided in vocational institutions and in the form of apprenticeship training in virtually all fields. The completion of an initial vocational qualification takes 2–3 years, and instruction is given in multi-field or specialised vocational institutions. Initial vocational education is open also to upper secondary school leavers. The duration of studies is 0.5–1 year shorter for these students, due to the credits transferred from their upper secondary school studies. A three-year vocational qualification yields eligibility for all forms of higher education.

The objective of initial vocational education is to provide students with the knowledge and skills necessary for acquiring vocational expertise and to provide them with the capabilities for self-employment. The further objectives of the education are to promote the students' development into good and balanced individuals and members of society, to provide students with the knowledge and skills necessary for further studies, for the pursuit of personal interests and the versatile development of their personality, and for promoting lifelong learning (Vocational Education Act 630/1998).

The Government defines the general objectives of vocational education and training, the general structure of the study programmes and the common studies. The Ministry of Education decides on the details and scope of the study programmes, while the National Board of Education (NBE) issues the national core curricula that determine the objectives and core contents of the studies. The core curricula are the charge of tripartite expert bodies and training committees, set up by the Ministry of Education to plan and develop vocational education. Based on these, each provider of education prepares his own curriculum. The providers also have tripartite expert bodies and consultative committees that participate in the planning and development of education at the local level.

The Ministry of Education grants licences (accreditation) to organise vocational education, determining the education providers' fields of study and total number of students, etc. Within the framework of the licence and the confirmed structure of the study programmes, the education providers may focus their education as they see fit, allowing for the needs of the local and regional economic and working life.

The National Board of Education is an expert body responsible for the development of educational objectives, contents and methods in basic, general upper secondary, vocational and adult education and training. The Board prepares and adopts the core curricula and is responsible for evaluating the quality of the Finnish education system. The responsibility for evaluation at the polytechnics and universities rests with these institutions themselves. In this task they have the

10

assistance of the Higher Education Evaluation Council, operating under the Ministry of Education.

Developing co-operation between education providers and working life is regarded as desirable in order for education to better meet working life requirements. Such co-operation is necessary, if for no other reason, than the fact, that the Finnish education system is very institution-centred. Representatives of working life are involved in the advisory bodies of vocational education at both the central administration and the local level. Students' orientation to working life, on-the-job training or fixed-term studies at workplaces have been included as a regular element of education in initial vocational and in polytechnic degree programmes.

There is no separate inspection department for schools in Finland. The steering of the education system is in the hands of the Government and the Ministry of Education. Many matters, however, have been entrusted to the providers of education. Their operations are steered through the core curricula and the objectives laid down in legislation. Feedback concerning the operations of the education system is collected by means of statistics and evaluations. Even alone these provide information that by itself steers education.

Like most developed countries, Finland completed the basic infrastructure of education during the past decades and is now in a situation where the emphasis has shifted to the quality of education. One tool for quality assurance is the evaluation of educational outcomes. In recent years, powers of decision have been devolved to the local level, whereby evaluation has become a significant tool also for the steering of education. In legislation, evaluation duties have been assigned to both the education providers and the authorities.

A figure showing organisations and stakeholders is on the next page.

11

• draws up core curricula • is responsible for the

national evaluation for learning outcomes learning outcomes

• organises the evaluation of education, contributes to the development of evaluation and promotes evaluation research

• acts as an expert network

• are responsible for teaching and

• the assessment of students and

• the self-evaluation

• consist of representatives of education provider, employers, employees and students

• works responsibility of an official for the legality of his actions subject to official liability

• signs skills-test certificate • approves plan for skills-test • appointed for a maximum of three

years

• consists of representatives of working life, teachers and FNBE

• promotes co-operation and interaction between institutions and working life

• 31 committees in all • initial vocational education and

training

• is responsible for developing educational, research, cultural, sport and youth policies as well as international co-operation is these fields

• heads government regulation

EDUCATION EVALUATION COUNCIL

MINISTRY OF EDUCATION

FNBE

• consists of representatives of the employers, employees and teachers

• heads the organisation of competence-based qualification in adult education and issues certificates

• research • assessment + evaluation

certificates

• consists of representatives of working life, the employers, employees

• assessment + evaluation • a municipality, a joint

municipal authority or the state• is responsible for student’s

assessment • prepares the institution-specific

curriculum • issues certificates

• enacts legislation

PARLIAMENT

TRAINING COMMITTEE

QUALIFICATION COMMITTEE

EDUCATION PROVIDER

INSTITUTIONS

TRADE UNION TRIPARTITE COMMITTEE

UNIVERSITY RESEARCHERS

FNBE

FIGURE 1 ORGANISATIONS AND STAKEHOLDERS IN THE FINNISH VOCATIONAL SYSTEM. FIGURE 1 ORGANISATIONS AND STAKEHOLDERS IN THE FINNISH VOCATIONAL SYSTEM.

12

3 EXISTING APPROACHES TO EVALUATION IN FINLAND Between 1995 and 2002 and at the rate of one vocational sector per year, the National Board of Education (NBE) used sample-based national final examinations as part of regular sector specific evaluations carried out by the institutions. Summative tests in pure theory or in both pure theory and practical (work) knowledge were organised in selected fields. Summative tests for assessing vocational competence normally lasted one week, encompassing tests for both theory and practical knowledge. The national tests were pre-tested in order to improve their reliability. In this strategy, learning results in each field were evaluated once every five years. However, the implementation was hampered by lack of sufficient funds for meeting this schedule. The evaluation programme itself was approved and validated alongside the annual result agreement between the Ministry of Education and NBE.

A group of teachers working with the NBE analysed the aims of the curricula and determined the knowledge that students should attain and be assessed for for summative purposes. The group prepared test questions, the assessment scale and criteria to be used, and formulated the instructions to the students, teachers, and evaluators. The tests were spread over 1–3 weeks. Teachers from each individual school were responsible for organising the tests in their respective schools, but did not participate in evaluating their students' test results.

Instead, assessment was tripartite, involving representatives of the vocational sector, the trade, craft or profession, and the employers. All tests were piloted before they were used. The test results were analysed and all foreseeable problems charted. After the test, the schools returned the material for analysis to the NBE. Each school was promptly provided with its own ‘quick results’ sheet along with the national results for comparison. The final national report was completed roughly six months after the test. In the report, no school-specific results are listed. A number of problems with reliability emerged. These included:

• The teachers and other test evaluators had great difficulties in applying the set

assessment criteria, and in interpreting them correctly.

• Not all students were motivated sufficiently during the test, since they knew that the test result would not reflect on the grades in their school leaving certificates.

• The level of error amongst assessors was significant, because they interpreted the knowledge of their students and the set criteria in very different ways.

• The assessment cultures of the individual schools differed markedly from one another and some were slack, evincing forms of cheating such as approving imitation, preparing students in advance and awarding good marks without sufficient justification.

13

These problems are common in all assessment systems to a greater or lesser extent, as research has shown in the UK’s vocational system. In particular research in the UK shows that assessors have to be socialised into forming reliable judgments of criteria, through processes such as peer moderation, exemplars of good and poor standards of work within a specific field, and open discussion between teachers about problems of bias and comparison (see Wolf 1995; Ecclestone 2001; Harlen 2004, Wilmut 2004). When these problems emerged in the Finnish system, initial work carried out by the NBE suggested the following remedies:

• Improvements to the indicators and criteria in order to maximise their reliability as tries in the pilot

• Careful planning of documents, instructions and procedures

• Standardising factors affecting the students, the test environment and timing

• Training for assessors

• Better recruitment of representatives of the employers for the "triangular (= consensus) assessment" groups.

3.1 A NEW EVALUATION STRATEGY PROPOSED BY THE

MINISTRY OF EDUCATION The national evaluation strategy launched in 1998 by the Ministry of Education emphasized summative assessment of learning results (Ministry of Education decision 212/430/98). The Government decided to adopt a skills test based assessment of competence in initial vocational training, where summative tests were to be implemented during training courses and by area of study entity and, in the main, at a workplace.

These proposals enabled tests to be arranged locally with the provider of education in charge of preparations. The transition to a skills test based assessment and evaluation system was supported at the national level with ESF (European Social Funding) funding through the production of skills test material and pilot tests in various sectors. Test materials were produced both nationally and by individual test organisers. However, there were no standard national tests. Instead, the provider of education was free to decide what material to use in the actual test.

14

3.2 TRANSITION TO A NEW SYSTEM OF EVALUATION In parallel with developing a skills test-based assessment system, the Ministry of Education decided to abandon a system with separate national tests. Instead, the evaluation of learning results and the assessment by skills tests were to be integrated as effectively as possible. It was also decided that the transition to the new system would initially involve only few vocational sectors. The agreement signed between the Ministry of Education and the National Board of Education had a clause that a system for evaluating learning results would be developed in 2002 and that the system be piloted, if possible, that same year or in 2003, at the latest. The decision as to whether one or more templates and strategies should be put to the trial was left to the judgment of experts.

A skills test-based assessment has been under construction for several years, but the demands of evaluation by learning results have not yet been sufficiently incorporated into the new system. The integration of evaluation by learning results entails changes in the skills test-based system. How these changes are to be effected and an evaluation of the long-term implications involved were questions still to be settled at the beginning of the pilot project.

Evaluation based on skills tests is implemented by signifying that the student has to prove his competence throughout his training. The curricula contain no collective courses or study entities, thereby factually precluding the organization of corresponding tests in the final stage of training. This process, spanning the duration of vocational training, makes considerable demands on how learning results are evaluated, as well as on the reliability of these evaluations.

3.3 IMPLICATIONS OF INTEGRATION It was clear from the outset that the decision for a new system of evaluation based on skills tests would create three main problems: • Comparison: skills tests and assessment practices vary a lot at the local level

• Time: it takes a long time to get the results and institutions implement their tests based on their own curricula

• Overlap: how to link and connect assessment at the local level with evaluation at the national level.

15

4 PRINCIPLES AND FEATURES OF ASSESSMENT AND EVALUATION

4.1 DEFINITIONS, TERMS AND CONCEPTS In general terms, evaluation means placing a value on things. Evaluation can be seen as a systematic, reflective and formal activity, where evaluation is part of the decision-making processes. Evaluation involves making judgments about the worth of an activity through systematically and openly collecting and analysing information about it and relating it to explicit objectives, criteria and values. Evaluation comprises a series of procedures carried out to collect information about the quality of learning experiences. This information can be used as the basis for recommendations to improve the quality of the services provided. Evaluation is materialised by collecting information systematically and vigorously and involves the recording of information in some detail. Evaluation can be carried out either by teachers, school managers, or external bodies. However, Rowntree places the emphasis on teachers’ evaluation as being crucial for the improvement of teaching and learning: Having diagnosed a learning experience and seen it in use by our students, we must ask: `how is it working out? Is it doing any good? What effects is it having?’ Evaluation is the means by which we systematically collect and analyse information about the results of a student’s encounters with a learning experience. We wish to understand what it is like to teach and learn within the system that we have created; to materialize which objectives have been achieved and which not; and to ascertain what unforeseen results… have also materialized… .as teachers we anticipate the need to create similar learning experiences for other students in the future. Ideally, the insights gained from evaluation will help us develop and improve our teaching… .for future students too, (Rowntree 1982, 181). A project on vocational education in Europe, funded by the European Commission between 1999 and 2003 and mirroring the activities of a working group on assessment and evaluation, showed the extent of the confusion in all systems about the meaning of key concepts and of the purposes of the different processes of assessment and evaluation (Breur, 2002). Confusion is exacerbated by way of key terms being used interchangeably by practitioners, policy makers and researchers: this problem is evident in Finland where there is no distinct word for ‘assessment’. Despite these difficulties of definition and differences in the uses of terms and concepts, the project aimed to clarify from the outset, how different terms might be defined and then to use them more precisely than is currently the case in many assessment and evaluation systems. Key terms are:

16

• Assessment of learning outcomes: diagnostic, formative and summative,

internal/external • Evaluation of institutional provision: different levels such as macro (national),

meso (institutional) and micro (classrooms and individual teachers); it is therefore necessary to define self-evaluation, peer review and peer evaluation and external audit

• Assessment and evaluation are inextricably linked and the two terms are used interchangeably: the data gained from formative and summative assessments can therefore contribute to formative and summative evaluation

• Learning results and learning outcomes also need to be defined and differentiated: results has a narrower meaning (e.g. grades) than outcomes

• Quality assurance: internal processes built into qualifications and learning programmes from the outset, in order to increase reliability and validity of assessment

• Quality control: external processes to ensure that standards of reliability and validity are met

• Standards: norm-referenced standards are based on being able to measure students’ results reliably and criterion-referenced standards are based on explicit measures of validity and authenticity.

4.2 THE PURPOSES OF EVALUATION There are different purposes of evaluation. Evaluations can be both internal and external and meet different diagnostic, formative and summative purposes. It is therefore helpful to differentiate as precisely as possible which data are needed for which purposes and who should collect and use these data. Methods and instruments for collecting data, as well as the data themselves, can therefore serve each purpose separately or all of them together! • External (formative) evaluation, for example carried out by advisers or people

responsible for staff development and curriculum development

• External (summative) evaluation, for example carried out by inspectors appointed by the Ministry of Education, or peer reviewers appointed by an institution or a Ministry of Education

• Internal (formative and diagnostic) evaluation, for example using self and peer evaluation by colleagues, learning groups, quality circles

17

• Internal (summative) evaluation, for example carried out by peer reviewers

evaluating courses within an institution.

Each purpose has a different ‘audience’ or ‘stakeholder’, although, of course, different stakeholders can be interested in all purposes! It is possible although problematic and confusing for an evaluation system to try to achieve all these purposes with the same data and procedures/processes.

It is crucial to differentiate between formative and summative purposes of evaluation because this helps to clarify more precisely which audiences and stakeholders have a legitimate interest in the processes and outcomes of evaluation. Summative evaluation can be carried out by an institution or external agency for the following purposes: • Accountability for meeting government’s targets and/or for funding allocations

• System-wide standardization of a) assessment practices, b) curricula, and c) learning results

• System-wide comparison of quality against national criteria (reliability)

• A lever for curriculum change (e.g. to make vocational education more relevant to working life)

• Motivation of teachers and institutional managers to work for improvements

• A basis for professional autonomy (e.g. designing local assessments to suit local conditions).

Summative evaluation focuses on pre-defined goals and criteria and judges the extent to which these goals have been achieved. Goals and criteria can be externally defined by governments and their agencies, or internally defined by institutions or teachers: goals and criteria can also combine external and internal definitions. Formative evaluation looks critically at the kinds of educational goals being set and their function within curriculum development, or the extent to which the goals of summative evaluation are being met before the summative (formal, official) evaluation takes place.

18

4.3 QUALITY ASSURANCE Quality assurance aims to reduce sources of error caused by particular, local circumstances whilst preserving the validity of assessment: QA processes therefore develop quality and monitor progress. They include: • Accreditation – formal approval that institutions can carry out processes of quality

assurance and quality control leading to formal recognition of achievement

• Validation (approval) of institutions to run courses and to carry out assessments

• Design of curriculum and certification (NBE and institutions)

• Exemplars of students’ work at different levels of attainment

• Internal moderation of assessment results (standardisation – validity versus reliability)

• Internal verification of assessment results to ensure that procedures have been followed

• Approved centres to run tests and curricula against centrally-designed and evaluated criteria (this is called validation in UK)

• Exemplar materials of tests and learning materials

• Guidance for centres to follow.

4.4 QUALITY CONTROL Quality control ensures public accountability for the results or outcomes of summative assessments carried out in local contexts. These processes might be both internal and external but their emphasis is on summative evaluation rather than formative evaluation. QC might involve the NBE (other government-appointed agencies) in processes to: • Moderate assessors judgments and decisions (e.g. external moderators acting on behalf

of NBE; regional and national moderation meetings with teachers and evaluators; postal sampling, in situ sampling of assessment results by internal or external evaluators

• Verify that QA and QC procedures have been adhered to within centres and by assessors acting on behalf of centres

19

• Offer national quality awards for best practice in quality assurance and quality control

• Organise action research projects by teachers into their own assessment practices (although these might be regarded largely as an informal form of QC)

• Data collect and interpret on behalf of schools.

The project will pilot some possible approaches to quality control: • A questionnaire to participants to evaluate their responses to requirements to

quality assurance

• Visits by peer moderators to institutions.

4.5 APPROACHES TO EVALUATION Evaluation can be summative or formative and adopt different approaches and methods. In addition to differentiating between formative and summative purposes, it is useful to consider approaches in advance of designing a national system. DEVELOPMENT-ORIENTED

Methods to encourage the development of quality emphasise formative processes from the outcomes of summative evaluation. These processes include:

• Using quantitative and qualitative data on a large of features in order to review questions about quality with providers, teachers and social partners

• Peer observation, feedback and evaluation of teaching an assessment • External observation, feedback and evaluation of teaching and assessment. The emphasis in development-oriented evaluation is on formative feedback which is constructive and supportive and which encourages people to improve the quality of their work based partly on ipsative (self-referenced) criteria and data that is relevant, valid and meaningful at a local level.

20

COMPARATIVE/NON-REFERENCED

Evaluation methods use comparative data (qualitative and quantitative) so that institutions and teachers can compare their performance with that of others. The purpose can be formative or summative. THEMATIC/SECTOR-SPECIFIC

Evaluation methods and data focus on a clearly-identified theme, such as staff development, teaching methods etc. or a range of themes across a clearly-defined sector (e.g. all aspects of quality within forestry, health and social care etc.) Data can be quantitative and qualitative and the purposes of evaluation can be formative or summative. CRITERION-REFERENCED (EXTERNAL AND IPSATIVE)

Evaluation uses clearly-defined criteria, both externally-defined (e.g. by working life, government etc.) and ipsative (defined by institutions and teachers based on their own previous performance). The purposes of evaluation can be formative or summative.

FIGURE 2 THE EVALUATION OF LEARNING RESULTS.

Diagnosticassessment

Formativeassessment

Summativeassessment

Formative (= QA) and summative (= QC) evaluation

Tests,portfolio, skill-tests Feedback

teaching, learning

Internal assessors

External assessors

The evaluation of learning results

Data

AchievementsSkill-tests

21

5 APPROACHES TO EVALUATION IN FINLAND AND IN THE UK

5.1 GENERAL PRINCIPLES AND VALUES Generally evaluation systems may convey some of the following underlying features, reflecting broader political and social goals. • Democratic and pluralist evaluation: listens to, and involves different stakeholders

and interest groups and aims to reconcile competing interests. Democratic evaluation offers an information service to the whole community about the characteristics of an education programme

• Humanist: both in the assessment systems used with students and in the use of evaluation outcomes (i.e. personal and professional development of those involved)

• Emancipatory: the methods and outcomes used aim to make the system more liberating for all involved, according participants to ask radical or critical questions

• Authoritarian/compliant: the evaluators and the evaluated agree (usually tacitly) to make the system work to achieve goals imposed from the top down. The evaluator offers external validation of policy in exchange for providers' compliance with the recommendations the evaluator offers. An authoritarian system is controlled by allocation of resources

• Bureaucratic: the evaluators want to make sure that the system works in favour of the bureaucracy, so that other goals slowly become excluded. The evaluator accepts the values of those who hold political office and offers information to help achieve policy objectives.

5.2 GOALS AND VALUES IN THE FINNISH EDUCATION SYSTEM A number of the features of the Finnish system indicate that democratic and humanist aims are prominent at all levels, from Ministry to teachers, social partners and students. These include for example: helping teachers and social partners understand the underlying principles and purposes of evaluation, rather than merely issuing top-down guidance and advice or prescriptive specifications to follow. Advice, support and trying to find ways of producing useful data replace inspection, ‘name

22

and shame’, and the publication of league table results. There is a strong political antipathy to such measures espoused by officials at the Ministry of Education and NBE. There is also a holistic approach that emphasizes local validity of assessment and curriculum design, and local and professional autonomy in deciding how to implement legislative obligations; this requires different layers and methods of evaluation. From the Ministry to teachers, and social partners and students principles of evaluation in Finland appear to emphasize the following values: • A holistic approach that emphasizes local validity of assessment and curriculum

design and local and professional autonomy in deciding how to implement legislative obligations; this requires different layers and methods of evaluation

• The prominence in official texts of students’ personal and social development, rather than an emphasis on working life competence. From this perspective, the NBE states:

the standard of an individual provider of educational services or of the whole education system is determined by its capacity to support the students' personal growth and educational aspirations as well as to prepare them for social and working life (NBE 1999, 12)

• Helping teachers and social partners understand the underlying principles and purposes of

evaluation rather than merely issuing top-down guidance and advice or, as in the UK, prescriptive specifications to follow. Advice, support and trying to find ways of producing useful data replace inspection, “name and shame” and publication of league table results. Officials at the Ministry of Education and NBE express a strong political antipathy to such measures

• A leading idea in the reformation of educational administration has been that the high quality of public educational services, which is necessary for the equality of citizens, cannot be achieved through strict uniformity, detailed regulations and state control but rather through improving the conditions for operation (NBE 1999, 11)

• An expectation of direct employer involvement in educational activity: if employers want more relevance to working life in VQs, they must take part in assessing students' competence

• An openness to admitting problems and tension and to spending time on trying to resolve them, alongside social partners

• A strong emphasis on evaluation for improvement of education and the conditions for learning, together with a recognition that underlying conditions can act as profound barriers to good quality.

23

The pilot project for national evaluation based on skills is rooted firmly in these values.

5.3 CHARACTERISTICS OF EVALUATION IN THE UK There is not space here to provide much detail about the features of evaluation in the UK’s vocational education system. They can be summarised as:

• A strongly centralised system which uses experts and focus groups to make changes but does not have a strong political commitment to social partners; for example, employers are consulted in proposed charges to the education system but unions and teachers are less prominent in this process and the focus is on consulting key individuals rather than creating a dialogue with representatives of social partners

• The UK’s vocational education system has undergone many major upheavals and restructuring over the past 20 years and teachers tend to suffer from ‘initiative fatigue’ and a sense of professional powerlessness

• Governments in the UK place a strong emphasis on reliability in assessment results, so that they can evaluate ‘national standards’

• Funding and the publication of institutions’ performance are linked to the outcomes of national and regional inspection

• There are over 1 200 different qualifications offered by 450 colleges (vocational schools), 150 universities and over 2 000 providers of work-place training (e.g. employers, private training companies); providers have to be approved formally by awarding bodies to run these qualifications and, in turn, the qualifications must conform to national standards and criteria defined by the Qualifications and Curriculum Authority (QCA)

• The UK has hundreds of awarding bodies and many organisations involved in evaluation, inspection and quality assurance. Although it is a bureaucratic and expensive system, it does offer many choices of courses and qualifications to colleges and workplace trainers.

24

The map indicates the main organisations involved in the UK’s post-compulsory system.

For learning and competenceDraft: Kathryn EcclestoneMari RäkköläinenPaula Mäkihalvari

DfWPDepartment for

Work &Pensions

DfESDepartment for

Education &Skills

DTIDepartment for

Trade &Industry

Home Office

QCAQualifications & Curriculum

Authority

NQFNational

QualificationsFramework SSCs

SectorCouncils

OfSTEDOffice for

Standards inEducation

ALIAdult Learning

Inspectoate

LSCLearning & Skills

Council

RDAsRegional

DevelopmentAgencies

LocalLearning

Partnership Employers* HEFCHigher

EducationFundingCouncil

New DealModern

Apprenticeships

Awarding Bodies

ProvidersLSDALearning & Skills

DevelopmentAgency

NIACENational Institute of

Adultand Continuing

EducationNRCALN

National Research Centre for

Adult Literacy &Numeracy

WBLRCWider Benefits of Learning

Research Centre

KeyTarget setting/Statutory control/RegulatoryResponsibility

Professional/Development Linksand Cooperation

Funding/QualityAssuranceresponsibility

City & GuildsOxford Cambridge &

RSAAssessment &

Qualifications AllianceEd Excel

NCFENational Open College

NetworkProfessional Bodies

(hundreds)

PrisonsColleges (FE)*

6th Form CollegesUFI

Learn DirectEmployers *

Training OrgsVoluntary Orgs

CharitiesWorkers Educational

AssociationTrade Union Congress *

Local Education Authority

Higher EducationInstitutionsResearchers

inUniversities

LEAsLocal Education

AuthorIties

47 Local ‘arms’- Funding- Research- Targets- Planning-(Rationalisation)

SchoolsFamily

Learning6th Forms

CommunityEducation

EducationInstitutions

QAAQuality

AssuranceAgency

THE SYSTEM OF VOCATIONAL AND ADULT EDUCATION IN ENGLAND

FIGURE 3 THE SYSTEM OF VOCATIONAL AND ADULT EDUCATION IN THE UK.

25

TABLE 1 THE MAIN DIFFERENCES BETWEEN THE TWO SYSTEMS CAN BE SUMMARISED AS FOLLOWS.

UK Finland Complex, large and confusing with many different bodies and organisation involved in funding, regulation and evaluation

Small and simple, with clear lines of responsibility between a small number of organisations

Many different vocational programmes, assessment systems and qualifications (certification)

Few different programmes and forms of qualification or certification

Many awarding bodies with different approaches to quality assurance and quality control – institutions/providers must choose between them and work with different systems

No awarding bodies – institutions operate autonomously

Emphasis on national standards, national systems, comparison of providers

Emphasis on locals standards, local agreements and a high level of local autonomy

National prescriptions for regulation, quality assurance and inspection

No prescription but, instead, guidance and advice and a strong emphasis on professional judgement and autonomy

26

6 A NEW EVALUATION MODEL BASED ON QUALITY ASSURANCE AND QUALITY CONTROL In the pilot project, we define ‘quality assurance’ as checks and balances designed into the assessment system from the outset, and 'quality control' as summative checks on the processes and outcomes of assessment. Both quality assurance and quality control can be external (e.g. carried out by NBE) and internal (college, school and social partners). However, it is perhaps useful to regard quality assurance as part of formative evaluation and quality control as summative evaluation.

In the option proposed here, the skills test-based summative assessment has been almost fully integrated with assessment by learning results and there are no overlaps. In this model, the developers of the skills test-based assessment system are responsible for drawing up the curricula and compiling the test material, as well as for organizing the tests and assessing the students. In order for the skills test-based system to be used in its current form to evaluate learning results, the experts in assessing learning results will be responsible for controlling the quality of the skills tests and the processes involved, as well as for the assessment of the learning results themselves. Quality assurance will focus on the crucial phases of the process, such as: • the curricula • the skills test material • the organisation of the tests • the assessment of the students' competence • the formative, internal sampling and moderation of the assessors' judgements. The project will pilot some possible approaches to quality control: • a questionnaire to participants to evaluate their responses to requirements for

quality assurance • visits by peer moderators to institutions to observe assessment processes and to

review the outcomes of consensus assessment. The pilot project will evaluate the degree of prescription that might be needed in deciding what to include in quality control processes.

The critical points of interaction between a skills test based system and national evaluation of learning results were laid down before the Pilot was implemented. These critical points form the content of Appendix 2.

27

6.1 A PILOT PROJECT CARRIED OUT BY NBE IN 2002–2003 The aim of the national evaluation of learning results is to produce information about the effectiveness of education and about how the objectives of the core curriculum have been attained. Evaluation produces information about the level of competence and how well it meets the needs of working life and further studies.

The piloting evaluation model is integrated throughout the quality assurance system with the skills test system. Local skills test will produce the evaluation data. The national skills-test material plays an important role, since the local tests are organized on the basis of them.

The purpose of the evaluation pilot in 2002–2003 was to test the new evaluation model and quality assurance and the quality control system in practice. Piloting completed, an analysis was made to determine what manner of system is possible at the national level in all the vocational sectors. The pilot project was designed to check the suggested model of the new system of evaluation in practice in order to create the standards for the national level. The questions put in the pilot were formulated to scan for and examine the problem of tension between reliability and validity in the assessment and evaluation. We wanted to know what kind of information the new evaluation system produces and if this information is useful for different purposes and audiences. The pilot was to provide information as to whether this model can be used also in the national evaluation of different fields. The results and conclusions of the first pilot will form the basis of preparations for implementing the new evaluation system at the national level in the years 2004–2007, when the second phase of the development project is continued at the national and local levels.

The structure of the evaluation pilot (2002–2003) is described in Appendix 3.

6.2 QUESTIONS TO BE ADDRESSED The focal questions and their associated sub-questions for the pilot were: 1) What is the role of skills-tests in the national evaluation system?

This led to the following sub-questions: • What new data about learning results do the skills-test produce? • Can this data be used for national evaluation? • What are the implications of using skills-tests from study entities for a national

system? • Are the skills-tests based on the core curriculum? • Which social partners are able to make use of the data from the skills-tests? • What are the strengths and weaknesses of the skills-tests as a basis for national

evaluation?

28

2) What QA and QC procedures do we need for the national evaluation

system?

This led to the following sub-questions: • What standards and criteria do we need to develop for QA and QC in terms

of assessment and evaluation procedures at the local and national levels? • What are the implications of using internal moderation to enhance the validity

and reliability of assessment? • What are the implications of using external moderation to enhance the validity

and reliability of assessment (e.g. by using peer assessors)? • To identify processes and activities that can be used in external moderation. • What training and support do assessors and trainers need? • What kind of documentation is required in QA and QC procedures? • What is the role of the institution’s self-evaluation in QA? • What is the role of FNBE in QA and QC? • What are the roles of social partners in QA and QC? • Do we need a “soft” process of accreditation (validation, approval) as part of

QA and QC? 3) What are the practical problems in implementing an evaluation system?

This led to the following sub-questions: • How do the “critical points” of quality assurance turn out in the pilot and what

new practical questions are associated with them (see Appendix 3)? • What new information does the pilot produce about practical problems in

integrating the evaluation system with that of the skill-test system? • Comparison problems in practice • Time problems in practice • Overlapping problems in practice • What kind of instructions and recommendations do we need for the

assessment and evaluation? • What solutions can we propose in response to the problems we identify?

6.3 METHODS AND DATA COLLECTION We prepared the pilot project thoroughly together with our external expert Dr. Kathryn Ecclestone. The expert and NBE officials looked at different alternative evaluation models, analysed them from a theoretical viewpoint and clarified key principles and activities in assessment and evaluation.

29

TEST MATERIALS

The pilot evaluation tested a model that was completely integrated with the skill-tests system. NBE carried out the pilot in collaboration with the educational establishments taking part in the preceding skills test material and actual test projects. The information about learning results was gleaned directly from the skills tests organised by the education providers. There were no common examination tasks; the national test material (a guide) and the test materials of the individual school steered the planning and administration of the tests and the ensuing assessment of the student’s level of competence. DATA COLLECTION

We used three different methods to collect the data for the pilot: 1) Data of learning results were collected from local skill-tests. The data collection was made by the institutions by means of uniform forms of documentation to be analysed at NBE, 2) There was a separate questionnaire for the teachers, the students and the representatives of working life. Those questioned were the people who implemented the skills tests. The inquiry was organised by persons in charge at the institutions and the forms were sent in and analysed by NBE, 3) A quality assurance process was planned and tested, which brought a lot of information about the situation in institutions and at workplaces, about how the skills tests are organized and students assessed and about the quality of the test materials. The quality assurance was made by outside experts from working life, education and research, and the developers from the evaluation unit at NBE. THE SAMPLE OF SKILLS TESTS

Approximately 190 students from seven vocational institutions took part in the pilot and data was gathered from 376 individual tests. The participating vocational fields were health and social services, the heating and ventilation sector, and the construction sector. 43 % of all the skill tests were conducted at a workplace, 46 % were assessed by consensus (teacher, student and representatives of working life working together) and 30 % by a teacher alone. The data was collected during the autumn of 2002 and the spring of 2003. THE QUESTIONNAIRE

The questionnaire targeted the same tests from which the evaluation data was gathered. We got back 733 forms altogether; 344 of the respondents were students, 247 teachers, and 143 representatives of working life. The purpose of the questionnaire was to examine the meaningfulness and importance of the skills tests, what happens at the local level and how national standards are realised when the skills

30

tests are implemented. The things looked into by the questionnaire were: 1) Correspondence of the test between the aims of the course and the demands of working life, 2) The organisation of the skill-tests, and 3) Skill-tests and learning. QUALITY ASSURANCE

External quality assurance was also tested in the pilot project. The task of a quality assurance system is to ensure that skills tests are implemented to conform to a given set of objectives and criteria. The aim is to ensure the acquisition of comparable results from the skills test system, so that the results are useable by the organisers of education and the national evaluation of educational outcome.

The quality assurance of assessment using skills tests for a system of evaluation by learning results was designed to bring common quality standards and criteria were in agreement. Quality assurance targets the process as a whole, from the curricula of the providers of education to the locally implemented tests. In this project quality assurance comprised evaluation of the curricula of the provider and the skills-test material along with visits for auditing, external moderation and reporting. The process of creating the quality assurance system is described in Appendix 4.

A total of seven provider curricula, nine skills test materials, nine visits for auditing. External evaluators participated in 11 the test evaluation meeting. There were 11 interviews with student, 12 interviews with teacher, 13 interviews with representatives of working life and six interviews with school principals in pilot evaluation. In addition four (discussion) meetings were organised with the teachers and the working life representatives (external moderation).

The written material and the later visits for external auditing to seven institutions along with sixteen national test materials were quality assured. Afterwards the institutions received school specific reports on their learning results and the auditing. Quality assurance targeted the process as a whole, all the way from the curricula of the providers of education to the locally organized and assessed skills tests. The institutions’ own material (e.g. curricula, instructions for student assessment, documentation forms etc.) and the national skills test materials (those used in the pilot) were also evaluated by an outside experts of the pilot.

The task of the quality assurance process was to ensure the organisation and implementation of the skills tests and the attending assessment practices and to check how the selected quality requirements worked in practice. The feedback of the quality assurance system had to be useful for the institutions’ self-evaluation and, of course, when improving the national core curricula.

31

6.4 STANDARDS AND CRITERIA Some common quality standards and criteria were laid down before the pilot was implemented. Quality standards were set up for the curricula of the providers of education, the content and structure of the skills test material, the implementation of the test (test site, cooperation with working life), the assessment of the test (assessment targets, criteria and practices), the documentation of the test, and the training in how to implement skills tests.

In the pilot evaluation the quality requirements of evaluation were criterion-based assessment and evaluation, assessment by consensus, the dimensions of competence, the general targets and criteria of the evaluation, and documentation. The quality requirements of evaluation have been described in more detail in Appendix 5.

The assessment and evaluation of learning results was criterion-based, signifying that the targets and criteria were set in advance as well as known by all the parties concerned. The skills tests were evaluated on the principle of consensus, i.e. each test had to be evaluated by a teacher, a working life representative, and a student. The evaluation by learning results targeted functional competence, cognitive competence, social competence, and reflective competence. In other words, the targets of evaluation were both the competence achieved and the processes producing competence.

The general targets and criteria used in evaluating the learning results were uniform across all the fields of training involved. The evaluation focused on the student's degree of command of the work methods, tools, material, work processes, safety regulations, and knowledge of what underlies the work task in hand. Added to that, an evaluation was made of the degree of command of the areas of emphasis and core competence common to all vocational fields. The result of the evaluation by consensus was documented on nationally uniform report sheets.

6.5 RESULTS OF THE PILOT The results of the data collection from the pilot (the learning results, summary of questionnaire and of the quality assurance process) are described fully in more detail in a separate report published in the autumn of 2003. The results will here be described only briefly.

The report on the results of the pilot project constitutes Appendix 6. IMPLEMENTING TESTS

There were clear differences between the fields and institutions in how they implemented and assessed the tests. Differences were further dependent on which study entities were assessed, as it was quite common in institutions to organize broad ‘basic skills’ tests (30 study weeks), frequently divided into several lesser parts

32

(subtests). For example in health and social services, all tests were at a workplace and nearly all assessed by consensus. It appears that when a test is arranged at a work place, the representatives of working life also take part in student assessment, a fact that is very much appreciated by the students (all evident from the result of the questionnaire and auditing interviews). GRADES

The grades awarded in the tests were very good in all the fields and qualifications. The most common grade was good 4 (the scale being 1–5), awarded in just under half of all the tests. Differences between the fields in the grades awarded were quite small. There were no failing grades. In all qualifications the best learning results were obtained in cooperation skills (mean = 4,13), which is one of the core skills common to the tests of all vocational qualifications. There were also very high grades awarded in quality and in customer-orientedness (mean = 4,02). This is interesting, because when interviewed the teachers considered it quite difficult, in the skills tests, to assess the so-called core skills and common focal areas of content, specifically in the technically oriented study entities and fields. Safety at work was also an area that the students mastered well. Safety at work is emphasized from the very start in all studies in the technical sector. The lowest grades were awarded in command of knowledge underlying work and in readiness to plan, implement and evaluate work processes, but also here the grades were still reasonably good. THE EVALUATION OF TESTS BY THE TEACHER

In all sectors the tests were considered to be good assessment tools and the participants, as a rule, were quite satisfied with the way the tests were implemented. All the parties considered the grades awarded the students to be objective and just and that they corresponded well to the requirements of the future workplace. Virtually all of the students taking part in the pilot evaluation considered the degree of difficulty of the tests to have been just right and only a small percentage that the tests had been too difficult. The students and the workplace representatives, however, thought the test directives to have been insufficient, and that they also had clearly had less of a chance to participate in the planning of the tests than had the teachers. THE INVOLVEMENT OF THE STUDENTS

A good one third of the students had not taken part in the planning of their tests. The teachers, on the other hand, felt that they had been well enough involved in the planning. The assessment targets and criteria of the tests also proved to have been poorly known by the students. More than one third of the students were only fairly well acquainted with them. The workplace representatives considered the joint evaluation to have been the most difficult and were generally more dissatisfied with

33

the evaluation criteria than were the teachers. The students were the most critical of the usefulness of the tests. QUALITY ASSURANCE

The results of the quality assurance and auditing process showed that the curricula of the providers ought to better meet the needs of working life. However, the skills tests have not yet been universally integrated with the providers’ curricula but constitute separately used assessment tools. It is also evident from the pilot project that more concrete criteria and descriptions of competence are needed, because assessment practices vary from school to school, from sector to sector, and even within a single school. Many teachers and working life assessors consider criterion-based assessment to be difficult.

Tests in schools do not make it possible to assess wide-range occupational competence. Workplace tests appear to be well organized and the workplace representatives to be committed to the assessment of the tests, even though it is very time-consuming. The assessment of the tests and feedback discussions constitute good learning situations and make teachers more committed to their task. The role of the teachers in assessment and feedback is important and students find it motivating that workplace representatives are involved in assessing the tests.

34

7 CONCLUSIONS FROM THE PILOT PROJECT The pilot started by asking the following questions:

What is the role of skills tests in the national evaluation system?

What QA and QC procedures do we need for the national evaluation system?

What are the practical problems in implementing an evaluation system using skills tests?

7.1 THE ROLE OF SKILLS TESTS Skills tests stress the validity of student assessment, whereas the old system with final tests place more emphasis on reliability. Skills tests increase the validity, relevance and authenticity of assessment because they create opportunities for consensual assessment and local co-operation between working life and vocational institutions. Students and working life partners appreciate these qualities because they make assessment more relevant and interesting. Compared to the old systems of evaluation, skills tests bring new information about the quality of learning at a local level.

However, all assessment and evaluation systems have to accept some trade-off between reliability and validity. Reliability can be increased by sampling a narrow range of tests, by standardizing the administration of tests, test conditions and assessment practices, and by setting stricter standards and criteria for assessment.

Problems with using skills tests as part of national evaluation arise from difficulties in using the data produced by the tests about learning results for the purposes of comparability and reliability. When opting for national reliability, there will be less flexibility (validity) at the local level.

7.2 QUALITY ASSURANCE AND QUALITY CONTROL Skill-tests are an important part of quality assurance and quality control in vocational education. Conversely, the tests need good processes for quality assurance and control to make them more reliable. The pilot project shows that it is possible to improve reliability (coherence of judgment between assessors and institutions) by regulating and standardizing the assessment materials, by sampling only a few tests, by using trained assessors, and by insisting on consensus assessment. Processes for quality assurance and quality control can increase the reliability of the tests to some

35

extent, but they cannot make them reliable enough for national comparison of learning results.

It is therefore extremely important to recognize that processes for quality assurance and quality control cannot transform a test designed for validity into a reliable test: they can only enhance reliability.

Quality assurance can make assessment criteria and the target areas of evaluation more unambiguous and transparent. However, no specifications of criteria can ever be clear enough to prevent assessors from sometimes being inconsistent in their judgements. Instead, we can enhance the transparency of the assessment criteria and processes of quality assurance and train assessors to use them as effectively as possible. One of the best features of the new model is that it motivates assessors and students and increases the trustworthiness and credibility of the tests.

There are financial and training implications, if quality assurance and quality control are to be carried out effectively.

It is not yet clear what effects the results of the tests will have on the quality of learning at the local level, nor on the quality of learning at the national level.

The skills tests reveal problems with the implementation of the core curriculum. The structure of the core curriculum does not yet support skills-test based evaluation, as skills-tests as yet do not make part of the core curriculum. At the moment the core curriculum does not provide means to monitor the development of a student’s competence during his training (transfer). Another problem is that core competences and common areas of emphasis are not clearly defined in the core curriculum. Consequently, they are very difficult to assess in skills-tests. Skills-tests give valuable feedback for the development of the core curriculum.

National results of learning results based on skill-tests are not yet very useful to working life. Instead, working life partners set greater store by the local validity of the assessments.

7.3 PRACTICAL PROBLEMS IN USING LOCAL SKILLS TESTS IN A NATIONAL EVALUATION SYSTEM We have far too many skills tests for producing valid and reliable information about learning results at the national level: this makes it extremely difficult to manage and process data from these tests. Instead, we need to use a sample of high quality skills tests based on minimum quality standards and other material.

Using skills tests for national evaluation requires all partners to understand the key principles, purposes and activities of assessment and evaluation. Without development and training to create this understanding, it will be difficult, at the expert level, to involve working life and the teachers that are needed for using skills tests in national evaluation. Without training and expertise, there is a danger that assessment used for ‘high stakes’ evaluation will lead assessors to teach for the test.

The tests cannot provide reliable data, on a national level, for comparisons between the learning results of individual institutions. This means that we need other data and

36

background information in addition to the tests: the next stage of our project will explore what this information might consist of and how it might be used.

We have to plan how to involve social partners and other organisations in the ongoing development of an assessment and evaluation system. In particular, providers need to consider how the skills tests relate to the quality of teaching and the assessment of activities in schools and at workplaces. A consideration of on-going developments is important because the pilot project highlights a problem typical in any new initiative, namely that of “catalytic validity”, where taking part in something new can create a view amongst teachers and project participants that a new practice is more valid that what went before it. We have to be cautious about inferring too much from the positive results of the pilot, as implementing the processes on a national scale will involve teachers who may be less enthusiastic and who have not personally themselves volunteered to take part in the processes.

It is important to consider the audiences for the results of the skills tests as well. Different groups emphasise different features of the tests. If we emphasise validity, the tests will be broader and maybe more useful to working life and to the quality of teaching. Institutions, however, need comparative data in order to develop their own evaluation and quality assurance activities.

Changes to evaluation and assessment create possibilities for mutual accountability between teachers and government, but they can also create new tensions between central regulation by the Ministry and NBE and the local autonomy of teachers and working life partners.

Evaluation systems might espouse one set of values at an official level but the implementation of procedures can erode these values. This makes it important to articulate clearly what espoused values the evaluation system hopes to convey, particularly at a time of transition to a new system. In Finland, it is important for social partners to decide what values, principles and processes we want to preserve at all costs. The goal of consensus based on respectful, democratic dialogue between social partners is strongly inherent in the Finnish education system and is crucial to creating agreement concerning the future balance between regulation and autonomy.

7.4 ALTERNATIVE MODELS TO COLLECT NATIONAL EVALUATION DATA On the basis of the experiences of the pilot project we devised some alternatives for collecting the evaluation data from skills tests at the national level. It is not possible to collect the learning results from every skills test within each qualification. This “total” model is not possible at the moment, as there are too many skills tests, and they vary a lot between them. The collected data has to be sample based. This means that the qualifications and skills tests for the national evaluation have to be selected beforehand.

37

1) The ‘Follow-up’ model

Only a few tests from each qualification will be selected (‘good tests’). Relevant evaluation data will then be gathered from the involved students and the students´ learning results monitored throughout his/ her three-year period of training. Thus it will take three years to accumulate the final data about each qualification to be included in the national evaluation. Beyond this, pertinent background information also has to be provided by the institutions and the students.

Process results 3 years

Stm = Skills test material S = Skills test

Stmn

S1

Stm1

S2 Sn

Stm2

Same students

Institution B . .

Stm1

S1 S2

Stm2 Stmn

Sn Process results 3 years

Same students

Institution A

Institution n

S1

Stm1

S2

Stm 2 Stm n

Sn

Same students

Process results 3 years

38

2) The ‘Cross-sectional’ model

This is a sample-based model. In this model the most important and valid skills tests (“good tests”) are selected at the beginning of the studies, midway through the studies and at the very end of training. The institutions remain the same, but the students are different in each test. In this model it is not possible to monitor the progression of a particular student, but it does give general information about the learning results at a particular institution in a particular qualification. This model provides results quickly, in one year. Background information is required about the institutions and the different students. These two theoretical models will be tested in the next phase of the development project. Stm = Skills test material S = Skills test

Institution B . .

Institution A

Institution n

StmnStm2

S nS 2S 1

S nS 2S 1

Stm nS 2S 1

Different students

Stm1

Results Results Results 1 year 2 year 3 year

39

8 CHALLENGES FOR FUTURE DEVELOPMENT The development project has been both progressive and innovative, but there still remain topics for discussion in the future. Main topics suggested by the pilot for further discussion include: 1. The level of desirable trade-off between the skills tests themselves as an

assessment method and their use for a national evaluation system 2. The resources needed to create an effective process of quality assurance and

quality control (i.e. moderation, peer evaluation, high quality test materials, and the training of teachers and working life assessors)

3. The acceptable level of tension between regulation and autonomy (i.e. between

prescription and regulation to standardise assessment practices amongst teachers and working life assessors, and ‘nice to have’ good practice)

4. The requirements for training working life assessors and teachers 5. The role of social partners, providers and other organizations in the ongoing

development of an assessment and evaluation system.

40

9 SUGGESTIONS FOR NEXT PHASE OF THE PROJECT The development of the evaluation system will be continued in 2004-2007 by means of ESR projects. There already is an ongoing project termed KOPPI, conducted by the National Board of Education. In addition, the evaluation system will be developed through four other ESR-projects, coordinated by the institutions. The focus of these projects will rest on questions regarding the reliability and comparability of a skills test based evaluation system. The focus will further be on the training for teachers and working life assessors and on the dissemination of results.

The evaluation system of learning results will be tested in fourteen initial vocational qualifications. The evaluation data of learning results will be collected through two different models of data collecting, termed the ‘total’ and the ‘sample’ models. In the ‘total’ model evaluation data is collected from all the skills-tests and from every student and institution. In the ‘sample’ model the data is gathered from a selected number of skills-tests, institutions and students. The model can be divided up into a ‘follow-up’ model and a ‘cross-section’ model. In the ‘follow-up’ model the evaluation data is collected from one and the same institution and student throughout his three years of training. In this model the evaluation data can be gathered either from every skill-test or from only a selected number of skills-tests. In the ‘cross-sectional’ model evaluation data is collected from a selected number of students, institutions and skills-tests. In this model evaluation data can be collected either from each and every skills-test or from a selected number of such tests.

The evaluation system development projects will produce proposals for the national evaluation and quality assurance systems. Concomitantly, a feedback and information system will be devised in support of the self-evaluation to be made by the education providers. The development projects will provide the national learning results of fourteen initial vocational qualifications. In the next phase of the project it will be important also 1. To discuss the training needs for teachers and working life assessors in a new

evaluation system

2. To discuss the practicalities of a new system with officials from the British QCA and inspection system; to bring in additional expertise for ideas and consultation

3. To decide on the features of standardisation that are necessary and those features that are ‘nice to have’

4. To carry out a detailed costing of a new evaluation system.

41

SOURCES

Breur, K. (ed). 2002. Evaluation in European vocational systems. Belgium. Chelimsky, E. 1997. Thoughts of new evaluation society. Evaluation 3 (1), 97–118. Crooks, T.J. 2002. Educational assessment in New Zealand Schools. Profiles of

educational assessment systems world-wide. Assessment in Education, 9 (2), 237–253.

Crooks, T. J. 2003. Some criteria for intelligent accountability applied to account-ability in New Zealand. Presentation 22.4.2002 in AERA-conference, Chicago.

Ecclestone, K. 1996. How to assess the vocational curriculum. London: Kogan Page. Ecclestone, K. 1999. Empowering or Ensnaring? The Implications of Outcome-

based Assessment in higher Education. Higher Education Quarterly 53 (1), 29–48. Ecclestone, K. 2001. I know a 2:1 when I see it: how lecturers learn degree standards

in franchised programmes. Journal for further and Higher Education. Ecclestone, K. 2004. (Second edition) Understanding assessment. A guide for teacher

and manager in post-compulsory education: principles, policy and practice and qualifications in Leicester: National Institute of Adult and Continuing Education.

A Framework for Evaluating Educational Outcomes in Finland. Evaluation 8. 1999. Helsinki: National Board of Education.

Gipps, C. V. Educational Accountability in England: The role of assessment. Presentation 22.4.2002 in AERA-conference, Chicago.

Guba, G. E. & Lincoln, Y. S. 1989. Fourth Generation Evaluation. SAGE Publications: USA.

Harlen, W. 2004. Teacher assessment: a systematic review. Bristol: EPPI/DIES. Neil, D. T., Wadley, D.A., Phinn, S. R. 1999. A Generic Framework for criterion-

referenced Assessment of Undergraduate Essays. Journal of geography in Higher Education, Vol. 23, No. 3. 1999, 303–325.

O’neil, O. 2002. A Question of Trust. The BBC Reith Lectures 2002. Cambridge: Cambridge University Press.

Poikela E. & Poikela. S. 1997. Conceptions of learning and knowledge – impacts of the implementation of problem-based learning. Zeitschrift fur Hochschuldidaktik – Austrian Journal for Higher Education 21/1, 8–22.

Rowntree, D. 1982. Assessing students: how shall we know them? London: Kogan Page. Wilmut, J. 2004. Teacher assessment in the UK: a report for the Qualification and

Curriculum Authority. London: QCA. Vocational Education Act 630/1998.Wolf, A. 1996. Competence-based assessment. Buckingham: Open University Press.

42

APPENDIX 1 METHODOLOGY AND DATA COLLECTION Working periods, interviews, visits and observations

We have had seven intensive working periods together, six in Finland and one in England in 2002–2004. The co-operation will continue during the next phase of the development project 2004–2006(7). The working periods: • April 15.–21.2002 (Dr. Kathryn Ecclestone in Finland) • June 17.–20.2002 (Dr. Kathryn Ecclestone in Finland) • November 4.–8.11.2002 (Dr. Kathryn Ecclestone in Finland) • February 5.–7.2003 (Mari Räkköläinen and Paula Mäkihalvari in the UK) • December 1.–5.2003 (Dr. Kathryn Ecclestone in Finland) • June 14.–18.2004 (Dr. Kathryn Ecclestone in Finland) • November 8.–12.2004 (Dr. Kathryn Ecclestone in Finland) During the working periods Dr. Ecclestone familiarized herself with the principles, values and features of the different aspects of VET assessment and evaluation in Finland, identifying technical, practical and critical ideas and aspects regarding evaluation amongst different agencies and interested parties in Finland. The co-operation helped NBE officials gain an understanding of other assessment and evaluation systems and their implications. WORK PERIOD I: 15.4.–21.4.2002 AT NBE The first period 15.4.–21.4.2002: Dr. Ecclestone clarified evaluation practices of the vocational education system in the UK and acquainted herself with the Finnish education and evaluation system. Together with NBE officials she looked at different alternative evaluation models and the possibility of integrating these models when making the change over to an evaluation based on skills tests. We analysed different models from a purely theoretical viewpoint and began to clarify key terms and principles in assessment and evaluation. During first period we, among other things, dealt with the following themes: • Vocational education and the evaluation system of learning results in England • The theoretical points of departure in a system of evaluating by skills tests.

43

WORK PERIOD II: 17.6.–20.6.2002 AT NBE The task of the second work period was for the external expert and NBE experts to localise, on the basis of analysing different alternative templates, the main areas in need of further development in the Finnish system and to create a strategy for the evaluation of learning results. In addition to workshops, we interviewed and met social partners. During the first period we, inter alia, dealt with the following themes: • Quality assurance – Quality control • Requirements for documentation • Planning for pilot • The structure of report, distribution of work in writing. WORK PERIOD III: 4.–8.11.2002 AT NBE The task of the third work period was to examine the quality assurance and quality control system at national level and its practical implementation in the evaluation following the pilot. During the period we devised questionnaires for the teachers, the students and the representatives of working life. During this period we, among other things, addressed the following themes: • Contents of report • Checking for the system of quality assurance and quality control • Evaluation strategy at the national level – what does this mean after pilot • Devising for questionnaires. WORK PERIOD IV: 5–7.2.2003 – THE EXPERTS FROM NBE VISIT THE UK We developers from NBE visited the UK, familiarising ourselves with the UK’s education system in VET. We were specifically interested in QA- and QC-procedures. We had beforehand analysed the main problems and ‘critical points’ which might arise in the new system. During the visit we had opportunity to have discussions with people who evaluate and those being evaluated. We visited QCA, central awarding bodies, and some institutions. We interviewed authors and met teachers, students, developers, school principals, inspectors, and external verifiers. Interviews/discussion groups in England: • Mike Coles, Tom Lehman and Jeff Carter, Qualifications and Curriculum

Authority (QCA), London • Murray Butcher, Head of Policy and George Barr, Research Manager City and

Guilds Awarding Body, London

44

• Keith Simpson, Head of Resources and Planning, Sandy Peacock, Head of Quality

Assurance, Charles Carter, Curriculum Manager Art and Design, Newcastle College

• Less August, Senior Inspector for Adult and Community Education, Adult Learning Inspectorate, Newcastle

• Isabel Sutcliffe, Chief Executive, Lesley Roe, Head of Strategy and Planning, Christine Cann, Head of Moderator Training, NCFE awarding body , Newcastle.

We got new ideas and good advice on how to solve (and avoid) some problems we might encounter when implementing a new evaluation system. We learned what the benefits of standardisation are and how important flexibility is, for example, to creativity and the motivation of teachers. There will be tensions and you have to learn to compromise between regulation and flexibility. Further we learned a very interesting method of quality assurance, i.e. moderation. Moderation in NCFE is based on co-operation between inspectors and institutions. WORK PERIOD V: 1–5.12.2003 AT NBE During the period we and Dr. Ecclestone analysed and processed the results of the pilot – learning results and results for auditing. We met with the ‘Integration group’ (social partners and teachers and representatives from The Ministry of Education and the NBE) and presented them with the results of the pilot. The objective of the meeting was to discuss the main issues of the pilot and to examine the effects of the results on the development of the evaluation system in the future. During the period we, inter alia, addressed the following themes: • The results of the Pilot project • The main issues of the Pilot project: Do we now have all the answers? • The effects of the results on the evaluation system of learning results at the

national level • Planning a system of data collecting and the dissemination of information. WORK PERIOD VI: 14–18.6.2004 AT NBE During the sixth period we and Dr. Ecclestone examined the reliability and comparability of learning results in the pilot evaluation. We devised alternatives methods / models to collect the evaluation data and consulted Jari Metsämuuronen, Senior researcher, in NBE. At the same time we set the direction of the new projects for developing the evaluation system. During the period, the following themes, among others, presented themselves: • The reliability and comparability of the learning results • The minimum standards and quality assurance to assess in the new projects • Alternative methods for external moderation

45

• Training (what kind of training, and who trains and who will be trained) • Alternative methods / models to collect evaluation data. WORK PERIOD VII: 8–12.11.2004 AT NBE During the seventh period a seminar of evaluation and assessment was organised in Turku. The seminar focused on skills-test-based evaluation and student assessment. The participants were the developers of the skill-tests system from the institutions, working life and NBE.

During this period we also finished the report on the pilot evaluation.

During the working periods we have also interviewed numerous key individuals who are involved in developing the skill-test system and the skill-test materials, social partners along with several important stakeholders from different sectors. We have visited the institutions and met with the teachers and other developers involved in developing the quality of education. Below, in brief detail, we list the activities engaged in, the visits made, and the interviews done. The visits: • Visit to AEL – centre for technical training (17.4.2002). Meeting and interviews:

project manager Veikko Ollila and project chief Pentti Suursalmi (technical sector) – Familiarity with the preparation and planning of skills tests materials in adult

and youth education – Comparison problems; the skills tests are implemented in so many different

ways at the local level – The national evaluation data can be collected by skills tests – The representatives of working life need training in assessing and

implementing the skills tests.

• Visit to the Vocational School of Northern Ostrobothnia (18.4.2002). Meeting and interviews: Vice principal Esa Kiuttu, Quality coordinator Sauli Alaruikka, Teacher Tiina Koski – Familiarity with the learning results in mathematics – The school implements assessment of learning results both at the beginning

and at the end of education to help develop their own activities and teaching (diagnostic and formative assessment)

– It has a lot of co-operation with other institutions, comparing results between them

– They support ranking-listing and they also want national comparison data – The skills tests provide them with means for self-assessment.

46

h Visit to the Oulu Institute of Social and Health Care (18.4.2002). Meeting and

interviews: International co-operator Lauri Malm, project manager Ritva Arvola, project manager Erja Kotimäki, project manager Annukka Kurki, project manager Päivi Kukkonen, school principal Irmeli Männistö – The skill-test should be made as flexible as possible – The institute is familiar with the skills tests in social and health services field at

the local level – Skills tests are important for the students’ learning but they are very hard to

assess – Assessment criteria should be as lucid and concrete as possible – It is difficult always to pinpoint the most important and essential competences

in the core curriculum that should be included in the skills test – There are only few essential target areas of evaluation – The costs of skills tests are far too high for the institutions. Who will be

responsible for financing the skills tests when ESR funding ends? • Visit to Villa Viklo, retirement home of the elderly (18.4.2002). Meeting and

interviews: Nursing student Kirsi Kokko and her supervisor – Familiarity with the significance of completing a skills test.

INTERVIEWS LIST OF PERSONS WHO WE INTERVIEWED: • Counsellor of Education Sirkka-Liisa Kärki from NBE (16.4.2002) • Senior lecturer Esa Poikela from University of Tampere (16.4.2002) • Counsellor of education Risto Hakkarainen and Counsellor of Education Pentti

Yrjölä (19.4.2002), NBE • The Ministry of Education (16.4.2002). Counsellor of Education Eija Alhojärvi;

Senior Inspector Tarja Riihimäki • Interview and meeting with social partners (18.6.2002): Training director Manu

Altonen, Confederation of Finnish Industry and Employers (TT), Secretary of Education Policy Merja Laamo, Trade Union of Education in Finland (OAJ), School principal Jukka Salminen, Association of school principals, Special Expert Susanna Kivelä, Association of Finnish Local and Regional Authorities, head nurse Taina Ala-Nikkola, Helsinki and Uusimaa Hospital Group

• The Ministry of Education (18.6.2004). Director Timo Lankinen. Interviews were made to establish the perception of the practices of evaluation and assessment: • Counsellor of Education Sirkka-Liisa Kärki from NBE (16.4.2002)

– The main objective of the skills test is to improve the quality of vocational education

47

– The evaluation system has to be flexible and not too burdensome for the institutions

– Numerous kinds of skills tests are possible – The skills tests will, as far as possible, be organized in genuine working

environments – The national evaluation data can be collected by skills tests – The representatives of working life need training in student assessment and

supervision.

• Senior lecturer Esa Poikela from University of Tampere (16.4.2002) – The holistic view on student assessment and the learning process; not only the

results but the evaluation data are important to the whole process of education – The skills tests should be holistic (to measure entirety) – The evaluation system should be informative, not controlled – Institutions need good practices in order to develop their own activities.

• Counsellor of Education Risto Hakkarainen and Counsellor of Education Pentti

Yrjölä, NBE (19.4.2002) – There is active collaboration between social partners in adult education and the

development of competence-based examinations for adults – Comparison problems between different qualifications, due to the institutions

implementing their skills tests in so many different ways – There are far too many competence-based examinations (about 350) – The criteria are too detailed – The training committee plays an important role in implementing and planning

the skills tests – It is important to exploit the experiences gained from the competence-based

examinations for adults when developing and planning the skills tests. • The Ministry of Education (16.4.2002). Counsellor of Education Eija Alhojärvi,

Senior Inspector Tarja Riihimäki – The skills tests improve the quality of vocational education – The employers and employees are involved in the student assessment process

to ensure the quality of vocational education – Evaluation data is needed from the whole process of learning and education – The evaluation system has to be light and cheap – A good quality assurance system produces reliable and comparable evaluation

data. There is a need for comparison data – It has to be borne in mind that the skills tests are only one part of student

assessment – We have to accept that at first results are not so reliable, but that reliability

improves year by year.

48

Below is listed the main content of the interviews with the social partners. Interview and meeting with social partners (18.6.2002): Training director Manu Altonen, Confederation of Finnish Industry and Employers (TT), Secretary of Education Policy Merja Laamo, Trade Union of Education in Finland (OAJ), school principal Jukka Salminen, Association of School Principals, Special expert Susanna Kivelä, Association of Finnish Local and Regional Authorities, Head nurse Taina Ala-Nikkola, Helsinki and Uusimaa Hospital Group. Main points of interviewing and meeting: • The evaluation system has to be useful and motivate different stakeholders • Teachers and working life need training in assessment and the implementation of

skills tests • The evaluation system should produce comparison data • Problem: Whose task is it to train teachers and working life representatives

(supervisors) • It is necessary take into account also the student’s point of view when developing

the evaluation system. The Ministry of Education. Interview and meeting with director Timo Lankinen (18.6.2004) h Kathryn Ecclestone’s feedback on the development project h Discussion about the main results and challenges of the new system of evaluation h What kind of information does the ministry want at the national level h Discussion about learning results and the quality of learning. Working with NBE officials • A number of meetings clarified the definitions of key terms and the principles of

assessment and evaluation, thereby preparing for – Quality assurance – quality control – Requirements of documentation – The project – The structure of the report, the dissemination of the work in writing, the

planning of next work period.

49

APPENDIX 2 ‘CRITICAL POINTS’ For the evaluation of the learning results, a system of quality assurance that covers the entire process from curriculum to the locally administered tests is developed. In order to set up the standards, any critical points in the skills test system from the standpoint of the national evaluation have to be identified. At least the following points and phases of the process have proved to be such that sufficient heed needs to be paid to their quality requirements, in order that the evaluation of learning results may be successfully integrated with the skill-test based evaluation. In the pilot these critical points are under inspection and some essential standards and criteria for the assessment and evaluation system will be based on them. Some standards and criteria will be made before and some after the pilot project. 1 THE SKILLS TEST MATERIAL

- contents and structure - precise or general instructions - exploitation is voluntary / obligatory - field-specific / general requirements in the content of materials - national materials/ institutions own materials

2 THE OBJECTIVE OF ASSESSMENT (COMPETENCE)

- demands of the core curriculum / requirements of working life - professional (technical) skills / theoretical (academic) knowledge - assessment of the learning process (transfer-effect) / assessment of

performance (achievements, final results) - one / many tests for one competence - many objectives / a few objectives of assessment (evaluation) - study entities / whole core curriculum - definition at local level (assessment) / at national level (evaluation)

3 THE ASSESSMENT CRITERIA

- definition at local level (assessment) / at national level (evaluation) - commensurable (uniform) / variable criteria and scales - general/ concrete definitions on criteria

4 THE SKILLS TEST LOCATION (ENVIRONMENT)

- general instructions and definitions / local circumstances - restrictions / free to choice - the workplace, the school, some laboratory facility…

50

5 THE TEST

- the scale of variation at the local level - number and scope of local tests: vary / uniform - on the job learning -periods and skills tests: combine/ separate - continuous testing / final test or performance

6 ASSESSMENT PRACTICES

- uniform / vary at the local level - general regulations / exact instructions - definition of tripartite (triangularity) / tripartite in practice (the number of

assessors) - own teacher/ not own teacher as an assessor - own trainer or counsellor / not own trainer or councellor - separate assessment / part of learning through work period assessment - students self-assessment

7 THE CORE CURRICULUM

- actual, based on requirements of working life - the objectives of the curriculum and objectives and criteria of assessment at

local level - procedural competences (transfer) in the curricula?

8 TRAINING

- training of assessors and teachers - precise or general instructions for training programmes - who is in charge of the training programmes?

9 DEFINITION OF CONCEPTS

- to select the central concepts - defination of the terms and concepts.

51

APPENDIX 3 THE STRUCTURE OF THE EVALUATION PILOT (2002-2003)

Field IConstruction technology I

Institution 1

Skillstests

materialX/I

Field IIHousebuilding technology

Field IIISocial and heath services

Institution 2 Institution 1 Institution 2 Institution 1 Institution 2

Skillstests

materialY/I

Skillstests

materialY/I

Skillstests

materialX/II

Skillstests

materialY/II

Skillstests

materialX/I

Skillstests

materialX/II

Skillstests

materialY/II

Skillstests

materialX/III

Skillstests

materialY/III

Skillstests

materialX/III

Skillstests

materialY/III

Skilltests

X

~ 15students

x

Skilltests

Y

~ 15students

y

Skilltests

X

~ 15students

x

Skilltests

Y

~ 15students

y

SkilltestsX

~ 15students

x

Skilltests

Y

~ 15students

y

SkilltestsX

~ 15students

x

Skilltests

Y

~ 15students

y

Skilltests

X

~ 15students

x

Skilltests

Y

~ 15students

y

SkilltestsX

~ 15students

x

SkilltestsY

~ 15students

y

Internalmoderation

External moderation

THE EVALUATION SYSTEM OF LEARNING RESULTSPILOT

THE EVALUATION SYSTEM OF LEARNING RESULTSPILOT

52

APPENDIX 4 QUALITY ASSURANCE SYSTEM IN EVALUATION PILOT

QUALITY ASSURANCE SYSTEM IN EVALUATION PILOT

I Evaluation of written material

• The curriculum of education organiser (evaluation targets, criterias, evaluation practices, documentation)

• The skills-test material (the content and structure of the test material, evaluation targets, criterias, the test environment, evaluation practices, documentation)

II Visit for auditing

• Auditing the test environment

• Participating in the test assessment meeting

• Interviews (student, teacher, representative of working life)

Reporting

III External moderation

• Discussion meeting between teachers and representative of working life

h Auditing report to institutions

53

APPENDIX 5 QUALITY REQUIREMENTS OF EVALUATION NATIONAL EVALUATION OF LEARNING RESULTS Evaluation pilot scheme autumn 2002 - spring 2003 / Mari Räkköläinen 22.10.2002 THE AIMS OF THE EVALUATION OF NATIONAL LEARNING RESULTS An evaluation of learning results produces information about the effectiveness of education. The aim is to generate information about how the objectives of the curriculum have been attained. The evaluation further produces data about the prevailing level of competence and about whether or not it meets the needs of working life and provides the knowledge and skills necessary for further study. THE BASES OF THE EVALUATION Integration with the skills test system The system of evaluation by learning results is integrated with the skills test system and the concomitant system of assessment by skills tests. The necessary information is gleaned from the skills tests administered by the organiser of education, where the competence of the student is assessed in a context that is similar to the real situations the student will be confronted with when entering working life. National skills test materials or the school's own tests material the basis of the implementation of the tests. Evaluation data is gathered from these tests by study entity throughout the student's training. Quality assurance National evaluation requires that the organiser of education takes part in an external evaluation. In order to try maximise both the validity and reliability of the evaluation data, a quality assurance sets the required standards for the written material pertaining to the curricula, test material, and implementation of the tests. In addition, outside evaluators pay visits to the schools, local test sites, and the consensus meetings to be held by the local evaluators. Any potential corrective measures are taken prior to the national evaluation.

54

Criterion-based assessment and evaluation The assessment and evaluation of learning results is criterion-based, signifying that the targets and criteria are set in advance and that they are known to all the parties concerned. The general targets and criteria are nationally uniform in order to standardise the procedures and outcomes. The competence targeted by the evaluation are the objectives contained in the national core curriculum of the field of study in question. Assessment by consensus The skills tests from which the learning results are gleaned are evaluated on the principle of consensus, signifying that each test is evaluated by the teacher, a working life representative, and the student. There may be numerous evaluators. The grade is awarded on the basis of a discussion between the parties. The evaluation by consensus is part of the quality assurance. The transfer effect of learning Evaluation data about the learning results are gathered throughout the student's training, by study entity and from all tests. The evaluation results are not mechanically added up, divided, or weighted. THE TARGETS AND CRITERIA OF THE EVALUATION The dimensions of competence The evaluation by learning results targets functional competence, cognitive competence, social competence, and reflective competence. In other words, the targets of evaluation are both the competence achieved and the processes producing competence.

The functional or operational competence reflects the student's command of his work tasks and work procedures. The cognitive or mental processes show the student's ability to apply pure knowledge, his command of knowledge, his understanding of the relations between processes, and his ability to shape wholes. The social or societal processes involve the student's ability to function both as an individual and as a member or leader of a group. The reflective or evaluative and mind-developing processes are part of the student's ability to learn, appraise, and improve his own working and that of his surroundings.

The dimensions of competence derive their content from the general objectives of each vocational qualification or first degree aimed for and from the central goals of the study entities and areas of competence involved.

55

The general targets and criteria of the evaluation The evaluation targets the basic as well as the specialized competence involved in vocational training. The general targets and criteria used in evaluating the learning results are uniform across all the fields of training involved. The evaluation focuses on the student's degree of command of the work methods, tools, material, work processes, safety regulations, and knowledge of what underlies the work task in hand. Added to that, an evaluation is made of the degree of command of the areas of emphasis and core competence common to all vocational fields. The general targets of the national evaluation by learning results are briefly as follows: 1. COMMAND OF THE WORK TASKS

includes the control of methods, tools, and materials involved

2. COMMAND OF THE WORK PROCESS includes the skills of planning, implementation, evaluation, and development

3. COMMAND OF THE KNOWLEDGE UNDERLYING THE WORK TASK IN HAND includes the ability to apply and control pure knowledge

4. COMMAND OF SAFETY REGULATIONS includes mastering the knowledge and measures pertaining to safety at work, the working conditions, and the ability to work

5. CORE COMPETENCE includes the command of the curricular core content common to all fields of vocational training, namely

- the ability to learn to learn - troubleshooting skills - interaction and communication skills - co-operation skills - ethical and aesthetical/ emotional skills

6. COMMON AREAS OF EMPHASIS

include the areas of emphasis relative to content common to the curricula of all vocational fields of study. Evaluated are the general studies connecting to vocational studies and civic skills, such as

- internationalisation - sustainable development - exploitation of technology and IT - entrepreneurship

56

- customer-oriented activities of high quality - service and consumer skills - attention to matters of occupational safety and health.

The general evaluation criteria consist of three stages, where competence is described as satisfactory, good, or excellent. The assessment scale ranges from 1 to 5. The drafting of the criteria starts with describing the level of competence of satisfactory. Satisfactory stands for the minimum requirement of vocational competence, from which the subsequent levels of good and excellent are derived. The level of competence accumulates as you move from one level to the next.

The level of satisfactory is indicated by the grades of 1 and 2, and the level of good by those of 3 and 4. The highest grade is awarded when the degree of competence clearly exceeds the minimum requirements of the description of levels. The grade of 5 corresponds to excellent.

The targets and criteria of evaluation have been drafted separately for vocational competence, the core areas of competence, and the competence in the areas of special emphasis. Each area has been given general descriptions. The descriptions have been derived from the aims of the core curricula, prior national evaluation criteria, the ongoing work by experts, and the skills tests material currently under construction. The targets and criteria of the evaluation of vocational qualifications On the basis of the general evaluation targets, the central competence to he evaluated within the various vocational qualifications are defined by study entity. Observing these general evaluation criteria, the criteria for evaluating the qualifications are subsequently also set by study entity.

57

• functional• cognitive• social• reflective

com petence

EG

S

EvaluationCriteria

SG

ES G

E

General targets of evaluation:

1. Com m and of work tasks2. Com m and of work processes3. Com m ands of knowledge

underlying work4. Com m and of safety regulations

5. Core com petence

6. General areas of em phasis

FIGURE 1 GENERAL DESCRIPTION OF THE LEVEL OF THE EVALUATION OF

COMPETENCE.

58

TABLE 1. DESCRIPTION OF GENERAL TARGETS AND CRITERIA OF EVALUATION IN EVALUATING NATIONAL LEARNING RESULTS

EVALUATION TARGET

SATISFACTORY

GOOD EXCELLENT

1. Command of work task Command of work methods, tools and materials. Evaluation targets the choice of methods, tools & materials best suited to the work task and environment, & the ability to apply and exploit them correctly & to master work tasks in different work environments.

2. Command of work process Skill of planning, implementing, evaluating & developing work processes. Evaluation targets ability to plan & implement own work in a practical manner & to evaluate quality of work done as weIl as how to improve it.

SATISFACTORY Student gets through basic tasks of his to the profession. He remembers, identifies, ascertains & repeats concepts and content of confronted phenomena. When performing tasks he repeats techniques & models acquired earlier on. He observes rules and follows instructions. Student plans his work, but in new situations or work environments he needs guidance. Student is capable of safe working.

GOOD Student knows how to analyse, classify, compare, & reshape information and to state his reasons for the choices made. He knows the core content of his field of study and is capable of independent action and planning also within larger task entities. He gets through new and changing situations on his own. He observes without guidance safety regulations & is familiar with the general rules for safety at work.

EXCELLENT Student manages independently the core tasks of his to be profession and states his reasons for the choices made. He views knowledge and his own work with criticism. He plans his own work with awareness & independently & is capable of combining, developing & recreating also in fully new situations. He acts of his own accord and is capable of acquiring new information. He observes without guidance safety regulations & actively concerns himself with keeping his work environment sound & safe.

3. Command of knowledge underlying work in hand Evaluation targets pure knowledge requisite to work in hand & ability to apply it to different tasks & work environments & ability to analyse and state one's reasons for the choices made.

4. Command of safety regulations Evaluates ability to observe safety regulations & to apply them to different work tasks and environments. Evaluates ability to maintain capacity to work.

59

EVALUATION TARGET

SATISFACTORY

GOOD EXCELLENT

5. Core competence learning to learn skills Evaluation targets student's ability & motivation to assess himself & his work & to develop his work methods & to act in his studies & in working life in a self-improving way. Further evaluated is his ability to acquire, analyse & evaluate information & to apply prior learning to new situations.

Troubleshooting skills Evaluation targets ability to tackle new situations & up- cropping problems & to make well-founded choices. Ability to act & think in a creative & innovative fashion.

SATISFACTORY Student when guided is capable of self-assessment & of evaluating his work.

GOOD Student is aware of his own strengths & weaknesses & makes choices to further improvement.

EXCELLENT Student evaluates himself and his work from different angles also when engaging in work. He is capable of improving the quality of his performance in a creative fashion also stating his reasons.

lnteraction and communication skills Evaluation targets ability to present & report information orally & in writing & to use modern communication technology. Degree of command of a foreign language. Social skills Evaluation targets student's ability to interact with other people & groups as a com- munitv member. Ethical & aesthetical / emotional skills Evaluation targets ability of commitment to work, to work in a responsible way & with justice & to resolve moral and ethical problems and to act in accordance with the ethics of the profession. Degree of awareness of beauty inherent in native culture & personal ethical values & ability to apply them to the work in hand.

Student's attention targets work details detached from the whole. Work strictly follows rules and instructions. Student is capable of working in a group, manages familiar situations well and is capable of resorting to the assistance of others when in need. He recognises his own task in the community and is familiar with its values, aims & rules and adheres, as a rule, to jointly agreed schedules.

Student is capable of applying rules and instructions to different situations. He is capable of seeing his work as a whole. He classifies, compares & analyses acquired information & reshapes it to further the task in hand. He is an active & responsible member of his group, capable of exploiting communal knowledge. He is capable observing the values, aims & rules of the team and of acting in accordance with the ethics of his profession.

Student is capable of applying his acquired knowledge & skills & of finding new solutions in different situations. He is capable of stating the reasons for his decisions. Student is further capable of seeing his work as part of the surrounding work collective and of recognizing his work environment as a whole. He also is capable of reshaping his acquired knowledge in a creative manner and of drawing the right conclusions.

60

EVALUATION TARGET

SATISFACTORY

GOOD EXCELLENT

6. Common areas of emphasis internationalisation Evaluation targets student's knowledge of foreign languages & tolerance as well as readiness to work in a multi- cultural environment & toheed values inherent in customer's culture.

EXCELLENT Student is capable of applying, of his own accord, the knowledge and skills of the involved areas of emphasis & of applying rules and regulations to his work, also stating his reason for the stating his reasons for the choices made.

Sustainable development Evaluation targets ability to apply the principles of sustain-able development & to recognize the risks to the environment of his work & to observe environ-mental ruIes and regulations pertaining to his workplace.

GOOD Student is capable of applying, of his own accord, the knowledge and skills of the involved areas of emphasis & of applying rules and

Enuepreneurship Evaluation targets student's command of, the basics of economy & his ability to work in the spirit of entrepreneur-ship & will to improve himself of his own accord in order to become a conscientious work-respecting labourer and person carrying on a trade.

SATISFACTORY Student displays and exploits in his work the skills inherent in common areas of emphasis, observing central rules and regulations in his work. Student is in need of guidance.

regulations to his work.

Exploitation of technology and IT Evaluation targets ability to exploit CIT & the technology in his own field & to develop and exploit his acquired knowledge and skills in a versatile way in his work.

Customer-oriented activities of high quality Evaluation targets ability to act in a customer-oriented way, taking into account customer's needs and expectations.

Service and consumer knowledge Evaluation targets ability to heed provisions in consumer legislation by acting in conformity with customer's rights, obligations and responsibilities.

Knowledge of safety at work and health Evaluation targets ability to further well-being & to recognize the safety risks involved in his field and work tasks & to maintain and improve his capacity to work & his work conditions.

61

GATHERING EVALUATION DATA FROM ASSESSMENT OF STUDENTS MADE BY SKILLS TESTS Curriculum of the organiser of education The curriculum of the organiser of education must meet the content and aims of the national core curriculum, concurrently with it being adapted to the operations of the individual school. The curriculum and the plans relating to the organisation, funding and general teaching arrangements have further to meet the requirements of an assessment by skills tests. National test materials National tests materials or the organiser's own quality-assured test material steer the actual organisation of the tests. The test material is based on the national core curriculum, and it has been prepared by study unity, while taking in account any field-specific special requirements.

The test material has been prepared in accordance with the principles for developing the tests, in other words, they contain the general rules for organising the tests, a description of the test, the requirements on the test environment, the central competence to be displayed in the test, along with the evaluation targets and necessary evaluation criteria, and the instructions for documenting the end results.

The material identifies the test environment options and the minimal requirements set, by which is meant the actual work milieu of the workplace or some environment closely resembling it. The test material must further specify the level of requirement. The test environment should make possible the demonstration of the student's degree of functional, cognitive, social, and reflective competence inherent in the curriculum and requisite to handling real working life situations.

The evaluation targets the central content of the study unity. The expected competence and the main criteria of evaluation have been defined in the national test material in accordance with the general targets of evaluation and observing the general rules for describing the criteria. The targets and criteria of evaluation of the organised tests have been derived from the aims of the underlying curriculum by study entity.

The test material should be clear in order to be able to steer the planning and the implementation of the tests in different local situations and when repeating the test at some later date. The use of definitions and concepts in the national test material has to be clear and coherent.

62

Organising the tests The evaluation data is gleaned from the tests by study entity administered by the organisers of education throughout the student's training. The tests are intended to be organised so that it is possible to heed all the dimensions of competence – the functional, cognitive, social, and reflective – of the evaluation. The tests are administered sufficiently broadly, in order that the student may demonstrate a wide range of basic vocational competence as well as his special competence in his general and specialisation studies. In the tests the student combines his practical and theoretical knowledge, applying it to situations met at me workplace. The various sub-tests throughout his training together form the whole describing the student's overall competence in his trade to be. Assessing the tests All tests measure the student's command of his work tasks (work methods, tools, and materials), the work process, the knowledge underlying the work in hand and of the safety regulations. The student's command of the core competence common to all and the attainment of common goals is evaluated in a way that each study unit delivers evaluation data from at least one area of core competence. The student's command of the common areas of emphasis are evaluated in the study entities that it belongs to according to the aims of the core curriculum. This is more precisely defined in the actual test material.

The assessment criteria are known in advance to the students and tests evaluators. The evaluators have been duly familiarised with and trained for their tasks.

The assessment of the tests are always made by the student, teacher, and a working life representative in conjunction. Each assessor gives his own assessment of the target in hand. The final grade of the test is given by consensus.

If the test is administered in parts, it is nonetheless assessed as a whole, leaving each study entity with only one consensual grade and summation sheet. Tests administered in the course of the student's period of learning through work is graded separately as well. Training The schools plan and implement the training of the teachers and workplace evaluators. The schools also familiarise the students with the evaluation of the skills tests.

The National Board of Education is in charge of the training pertaining to the national evaluation and quality assurance.

63

Curriculum

Skill testThe central areas of competence are formed and evaluated by evaluation tarvet. The general evaluation criteria are applied to the content and aims of the core curriculum

• functional• cognitive• social• reflectivecompetence

Working

life

INTEGRATION OF SYSTEM OF EVALUATING BY LEARNING RESULTS WITH A SYSTEM

OF ASSESSMENT BY SKILL TESTS

INTEGRATION OF SYSTEM OF EVALUATING BY LEARNING RESULTS WITH A SYSTEM

OF ASSESSMENT BY SKILL TESTS

General evaluationtargets

1.

2.

3...6.

EGS

Study entity

Areas of competence

1.

2.

3...6.

EG

Evaluationcriteria

S GE

S GE

S

FIGURE 2 DEFINING EVALUATION TARGETS AND CRITERIA CONNECTION WITH STUDY ENTITY-RELATED TESTS. DOCUMENTATION The result of the evaluation by consensus is documented on nationally uniform report sheets. For the purposes of evaluation the assessors and students have auxiliary report sheets, added to which there is a summation sheet to be filled in by the teacher. The teacher also fills in a background sheet on the test and the participating student(s). Information to be given is the codes denoting the student and his school, the basic facts about the implementation of the test, the test locality, the current phase of the student's studies, and information relative to the test evaluators.

The person in charge of the test at the school fills out the consensus and background reports and sees to it that they are mailed in accordance with instructions to the National Board of Education.

64

APPENDIX 6 PILOT EVALUATION 2002–2003, BUILDING MAINTENANCE TECHNOLOGY CONSTRUCTION TECHNOLOGY, HEALTH AN SOCIAL SERVICES

DEVELOPING THE SYSTEM OF EVALUATION BY LEARNING RESULTS

RESULTS OF PILOT EVALUATION 2002–2003

BUILDING MAINTENANCE TECHNOLOGY CONSTRUCTION TECHNOLOGY HEALTH AND SOCIAL SERVICES

National Board of Education 2003

Authors Mari Räkkölainen, Finnish National Board of Education, Finland Paula Mäkihalvari, Finnish National Board of Education, Finland

65

CONTENTS INTRODUCTION........................................................................................................................................67

1 BACKGROUND INFORMATION..................................................................................................69

1.1 VOCATIONAL QUALIFICATIONS, SCHOOLS AND STUDENTS PARTICIPATING IN THE PILOT EVALUATION..................................................................................................................................69

1.2 STUDY ENTITIES AND SKILLS TESTS IN THE PILOT EVALUATION............................................69 1.3 TEST SITE AND TEST EVALUATORS......................................................................................................71 2 LEARNING RESULTS ....................................................................................................................72

2.1 GRADE DISTRIBUTION OF LEARNING RESULTS (%) BY EVALUATION TARGET IN ALL SECTORS........................................................................................................................................................72

2.2 THE EVALUATION TARGET MEAN, THE TEST GRADES BY QUALIFICATION, AND THE NATIONAL MEAN AS COMPARISON .....................................................................................................74

3 NATIONAL RESULTS OF THE QUESTIONNAIRE..................................................................76

3.1 ALL SECTORS ...............................................................................................................................................76 4 SUMMING UP..................................................................................................................................83

5 EXTERNAL QUALITY ASSURANCE IN PILOT EVALUATION ..............................................84

5.1 RESULTS OF QUALITY ASSURANCE AND AUDITING PROCESS ....................................................86 6 SUMMING UP FOR AUDITING PROCESS..................................................................................87

66

INTRODUCTION

The National Board of Education has now completed a pilot evaluation connected to the project for developing the system of evaluation based on learning results. The evaluation was carried out in collaboration with the educational establishments taking part in the prior pilot test material and skills test project. A total of seven schools participated, representing the health and social services sector, the heating and ventilation sector, and the construction sector in vocational training. The pilot evaluation was launched in the autumn of 2002 and ended in the spring of 2003.

The pilot evaluation tested a model, integrating national evaluation by learning results with an evaluation by means of skills tests. The information about the evaluation by learning results was gleaned directly from the skills tests organized by the education providers. There were no common examination tasks; the national test material and the test material of the individual school steered the planning and administration of the tests and the ensuing assessment of the student’s level of competence. The evaluation data from 376 skills tests, conducted in the course of the autumn and the following spring, was gathered by means of coinciding evaluation forms.

All those partaking in the pilot evaluation also took part in the external quality assurance process, auditing the written material of the schools (i.e. curricula, evaluation and skills test schemes, and all other material connected with the tests and the evaluation). In addition, interviews were held and auditing visits made to the schools and the test situations, while in progress, followed by peer evaluation meetings between the school and the workplace evaluators. The schools have since received school specific reports on the external auditing. 16 national tests materials and two school specific materials were quality assured during the pilot project, and separate reports were prepared on the expert evaluations. Concomitantly, trials were made with seven quality-assured test materials.

Each test contained a separate questionnaire, one for the participating student, one for the evaluating teaching, and one for the evaluating representative (on-the-job trainer) of the workplace. A total of 343 students, 247 teachers, and 143 in-house trainers returned the questionnaire duly filled in.

This quick feedback of the pilot evaluation contains the test results and the yield of the questionnaire in summary form from the involved education sectors. This report comprises also the results of external auditing process. The learning results have been presented as mean values and grade distributions by training sector and evaluation target. The results of the questionnaire have been presented according to the answers of the respondents in each sector. The auditing results have been presented in the qualitative way.

We wish to thank all those who have been involved in this project, the teachers of the schools, the students and the workplace trainers for their valuable help in

67

carrying through the current evaluation, hoping that this report will assist and benefit them in their work on developing the skills tests and their evaluation. For further information, please do not hesitate to contact us. Mari Räkköläinen Planning Specialist, project manager phone 358-9-7747 7293

68

1 BACKGROUND INFORMATION

1.1 VOCATIONAL QUALIFICATIONS, SCHOOLS AND STUDENTS PARTICIPATING IN THE PILOT EVALUATION The pilot evaluation targeted the vocational qualifications in the health and social services, the building maintenance technology, and the construction sectors of vocational training. Seven schools and 190 students from the said sectors participated. Participants were the most numerous in health and social services, the least so in the construction sector.

Health and social services 2 schools Building maintenance technology 2 schools Construction 3 schools

TABLE 1 NUMBER OF SCHOOLS BY SECTOR PARTICIPATING IN THE PILOT EVALUATION.

Qualification Number of students Health and social services 87 Building maintenance technology 64 Construction 39 Total 190

TABLE 2 NUMBER OF STUDENTS BY SECTOR PARTICIPATING IN THE PILOT EVALUATION.

1.2 STUDY ENTITIES AND SKILLS TESTS IN THE PILOT EVALUATION Tests were administered in 11 different study entities and evaluation data was obtained from 376 tests. In building maintenance technology evaluation data was obtained from the tests on the basics skills in building technology and in measuring, calibration, and automation technologies. In construction the skills tests targeted the study units of house building, in-house carpentry, exterior carpentry (footing and framing), house repair work, and reinforced concrete framing. In health and social services the information data was obtained from the tests on the study entities in counselling to support child growth, nursing and care,

69

and rehabilitation. The compass of the study entities varied from 60 to 30 study weeks (credits). The greatest number of tests were conducted on heating and ventilation due to the large number of part tests in house-building technology, where the study unit comprised a full 30 study weeks (sw).

Vocational qualification Number of tests in study entity Health and social services Counsellling to support child growth (16 sw)

44

Nursing and care (22 sw) 43 Rehabilitation (12 ov) 39 Yhteensä 126 Building maintenance technology Basic skills (30 sw) 116 Measuring, calibration & automation technology (6 sw)

95

Total 211 Construction Construction (16 sw) 18 In-house carpentry (10 sw) 12 Footing and framing carpentry (10 sw) 1 Rough/Structural framing (14 ov) 5 Reinforced concrete framing (10 sw) 2 House repair work (5 ov) 1 Total 39 Total number of skills tests 376

TABLE 3 NUMBER OF SKILLS TESTS BY QUALIFICATION AND STUDY ENTITY.

A near third (31,9 %) of the skills tests were administered during the spring term of the second year of study, i.e. in the 4th term, and a little under a fourth (24,7) during the first term.

f % 1st term 93 24.7 2nd term 84 22.3 3rd term 43 11.4 4th term 120 31.9 6th term 36 9.6 Total 376 100.0

TABLE 4 PHASE OF STUDIES AT THE TIME OF THE SKILLS TESTS.

70

1.3 TEST SITE AND TEST EVALUATORS Of the test 43 per cent were conducted at the workplace, including all tests in health and social services, whereas the tests in building maintenance technology were all administered in the individual schools.

f % Health and social services Workplace 126 100.0 Building maintenance technology School 211 Construction Workplace 39 100,0 Total 376 100.0

TABLE 5 TEST SITE BY TRAINING SECTOR.

A little less than half (46 %) of the pilot evaluation tests were consensus-based, i.e. they were assessed jointly by the student, teacher, and workplace representative. Close to one third (30.3 %) were evaluated by the teacher alone. The workplace representative was the sole evaluator in 10.1 per cent of the cases.

f % Consensus 173 46 Teacher 114 30.3 Teacher and student 44 11.7 Workplace representative 38 10.1 Student and workplace representative 7 1.9 Teacher and workplace representative 1 0.3 Total 376 100.0

TABLE 6 TEST EVALUATORS (HEALTH AND SOCIAL SERVICES, BUILDING MAINTENANCE TECHNOLOGY,

AND CONSTRUCTION).

All the tests on construction (100 %) and virtually all on health and social services (93.6 %) were either consensual or joint evaluations. In building maintenance technology the evaluation, as a rule, was made by the teacher (54 %), the teacher and student together (20.4 %), or by the workplace representative alone (18 %). Consensus was observed in 7.6 per cent of the tests on heating and ventilation.

71

Vocational qualification f % Health and social services . Consensus 118 93.6 Teacher and student 1 0.8 Student and workplace representative 7 5.6 Total 126 100.0 Building maintenance technology Consensus 16 7.6 Teacher and student 43 20.4 Teacher 114 54.0 Workplace representative 38 18.0 Total 211 100.0 Construction Consensus 39 100.0 Total 39 100.0

TABLE 7 TEST EVALUATORS BY VOCATIONAL QUALIFICATION.

2 LEARNING RESULTS

2.1 GRADE DISTRIBUTION OF LEARNING RESULTS (%) BY

EVALUATION TARGET IN ALL SECTORS In the figure below are shown the distribution of grades (%) by evaluation target in all the sectors involved (construction, building maintenance technology, and health and social services) and the overall grade of the skills test. The general evaluation targets have been the same in the tests in all sectors. The evaluation targets have been listed after the figure. The figure also shows the number of tests, where the evaluation target in question has been assessed (n = number of tests).

The first four evaluation targets (1–4) were to be assessed in all the tests. A further requirement was that evaluation data be obtained of at least one of the core know-how (5–9) areas of each study entity. In other words, the student’s core skills had to be assessed in all tests, whereas the assessment of content of this know-how could vary, depending on the aims of the study entity concerned. Focal areas (10–13) common to all were to be evaluated within the study entity, to whose aims they have been assigned. The content of the common focal areas therefore varies in accordance with the aims and purpose of the study entity in question. The content of the know-how to be evaluated and the evaluation criteria are clearly brought to the fore in the test material on each study entity. Finally, each skills test is awarded an overall test grade (14).

72

0 %

20 %

40 %

60 %

80 %

100 %

1 2 3 4 5 6 7 8 9 10 11 12 13 14

Evaluation targets

E5G4G3S2S1Fail 0

1. Knowledge of work methods, tools and materials 7. Interaction and communication (n = 139) (n = 373) 8. Co-operation (n = 153) 2. Command of knowledge of underlying work 9. Ethical and aesthetical skills (n = 120) (n = 375) 10. Sustainable development (n = 27) 3. Readiness to plan, implement, evaluate, and 11. Use of technology and IT (n = 8) develop the work process (n = 374) 12. Entrepreneurship (n = 27) 4. Health and safety at work (n = 229) 13. Quality and customer-orientedness (n = 102) 5. Learning skills (n = 125) 14. Overall grade of skills test (n = 297) 6. Problem-solving skills (n = 161)

FIGURE 1 GRADE DISTRIBUTION (%) OF LEARNING RESULT BY EVALUATION TARGET (CONSTRUCTION,

BUILDING MAINTENANCE TECHNOLOGY, AND HEALTH AND SOCIAL SERVICES).

Students displayed the best command of cooperation skills. The majority of students (96.7 %) received at least the grade of good, and a good third of them (36.6 %) the grade of excellent. Safety at work was another area that the students mastered well. The majority of them (95.1 %) were graded good, only 4.8 per cent receiving the grade of satisfactory. Ethical and aesthetical skills earned a good two thirds of the students (68.3 %) the grade of good, just under one third (29.2 %) the grade of excellent. The students also did well in quality and customer-orientedness, the majority of them (71.5 %) being awarded the grade of good, a trifle short of one quarter (24.5 %) the grade of excellent.

Students did the least well in the area of command of knowledge underlying work and readiness to plan, implement, evaluate, and develop the work process. Nevertheless, also in these areas, well over half of the students (85 % and 87.4 %) got the grade of good, only one in ten (10 %) receiving the grade of satisfactory.

73

297 overall grades were awarded during the pilot project. The bulk of the students (74.7 %) got the grade of good and just under one fifth (18.2 %) the grade of excellent. A satisfactory grade (1 or 2) made up only 7.1 per cent of all the grades given. There were no grades of fail. In all training sectors the commonest grade was 4 (48.1 % of all the grades given).

2.2 THE EVALUATION TARGET MEAN, THE TEST GRADES BY QUALIFICATION, AND THE NATIONAL MEAN AS COMPARISON In the next table the learning results are presented by evaluation target in the individual tests and by the corresponding mean of grades in all the tests (national mean of comparison). In other words, the figures given are all grade means. The figure in brackets after each evaluation target indicates the number of skills tests, in which the target in question has been graded (n = number of tests).

When scanning all sectors by evaluation target, the best grades were observed in the command of safety at work (mean = 4.09) and in the area of core know-how and common focal areas of content. The very best grades were obtained in cooperation skills (m = 4.13), which constitute part of the skills area of core know-how. Equally, the rest of the core skills content was evaluated as good (e.g. ethical and aesthetical skills (mean = 4.06, interaction and communication (m = 4.01). The highest grades in the focal area of content were awarded for quality and customer-orientedness (m = 4.02) and use of technology and IT (m = 4.02). The degree of displayed competence in core know-how and the focal area of content varies from one study entity to the next, making a comparison somewhat difficult. In part of the tests, core know-how was not assessed at all, or an assessment was made, even though it was not an evaluation target. The lowest grades were earned in command of knowledge underlying work (m = 3.52) and in readiness to plan, implement, and evaluate the work process (m= 3.57).

74

Health and Social

Services

Building mainte-nance

technology

Con-struction

National mean of

comparison

Command of work methods, tools, and materials (n = 373) 4.13 3.42 3.85 3.70

Knowledge of underlying work (n = 375) 3.64 3.44 3.62 3.52

Readiness to plan, implement, evaluate, and develop the work process (n = 374)

3.87 3.39 3.62 3.57

Health and safety at work (n = 229) 4.05 4.23 3.79 4.09

Learning to learn (n = 125) 4.02 - 3.97 4.01 Troubleshooting (n = 161) 3.90 3.45 3.69 3.73 Interaction and communication (n = 139) 4.15 3.07 4.03 4.01

Co-operation (n = 153) 4.28 3.72 4.15 4.13 Ethical and aesthetical skills (n = 120) 4.13 - 3.89 4.06

Sustainable development (n = 27) 3.70 3.70

Use of technology and IT (n = 8) 4.0 4.0 Entrepreneurship (n = 27) 3.93 3.93 Quality and customer-orientedness (n = 102) 4.08 3.96 4.00 4.02

Final grade in skills test (n = 297) 4.01 3.47 3.95 3.76

TABLE 8 QUALIFICATION MEAN OF NATIONAL LEARNING RESULTS BY EVALUATION TARGET (EVALUATION BY CONSENSUS / CONJOINT EVALUATION).

The best grades in health and social services were obtained in cooperation skills (m = 4.28) and the least good in command of knowledge underlying work (m = 3.64). The poorest grades in heating and ventilation were given in interaction and communication (m = 3.07) and the best in safety at work (m = 4.23). The best grades in construction were awarded for interaction and communication skills (m = 4.15) and the poorest in planning, implementing, evaluating, and developing the work process (m = 3.39). When examined statistically, the biggest differences by training sector occurred in the areas of command of work methods, tools, and materials, in the area of readiness to plan, implement, evaluate, and develop the work process, and in interaction and communication skills.

The mean of the final grade in the skills tests was 4.01 in health and social services, 3.47 in heating and ventilation, and 3.95 in construction. The mean of the final grade in the skills tests of all three sectors was 3.76. The commonest final grade in all sectors was 4. In health and social services the lowest grade was S2, in

75

heating and ventilation S1, and in construction G3. In building maintenance technology the grades were distributed more evenly. So for instance the distribution of the grades of G3 and G4 was relatively uniform. Below is shown the final distribution of test grades in percentage by training sector.

3,848,3

14,4

37,6

28,2

58,4

38,3

48,7

23,212

23,1

0 %

10 %

20 %

30 %

40 %

50 %

60 %

70 %

80 %

90 %

100 %

Health and Social Services Building maintenancetechnology

Construction

E5G4G3E2E1Rejected 0

FIGURE 2 FINAL DISTRIBUTION OF TEST GRADES IN PERCENTAGE BY TRAINING SECTOR.

3 NATIONAL RESULTS OF THE

QUESTIONNAIRE

3.1 ALL SECTORS There was a separate questionnaire for the teachers, the students, and the workplace representatives, who returned 343, 247, and 143 forms respectively, duly filled in. The questionnaire consisted of three parts: 1. Correspondence of test with the aims of the course, 2. Organization of the test, and 3. The test and learning. The questionnaire targeted the same tests, from which the evaluation data was gathered. The questionnaire assessed statements and assertions on a scale of 1–5 in a way that 1corresponded to not at all, 2 to somewhat, 3 to fairly much, 4 to a lot/much, and 5 to very much/well.

76

3.1.1 THE CORRESPONDENCE OF THE SKILLS TEST WITH THE COURSE AIMS AS ASSESSED BY THE WORKPLACE REPRESENTATIVE, THE TEACHER AND THE STUDENT By means of the questions about the correspondence of the test to the study aims, it was explored to what extent, in the view of the respondent, the test worked as an evaluation tool, what manner of competence it was able to evaluate, and how the evaluation and the formation of grades was thought to reflect a true and accurate picture of things. In the figure below are shown the mean values of the answers to the statements of the questionnaire as given by the teachers, students, and workplace representatives of the training sectors involved.

The majority of the workplace representatives (71.8 %), the teachers (89.9 %), and the students (70.5 %) were of the opinion that the demonstration of skills required by the tests corresponded well with the requisites of the future workplace. Most of the teachers (70.6 %) and of the workplace representatives (64.4 %) also felt that it was possible, by means of the used tests, to measure and assess a very wide range of vocational competence. All the parties considered the grades awarded the students to be objective and just. The majority of the students (84.2 %) and of the workplace representatives (88.4 %) and as many as 97 per cent of the teachers considered the grades given to the students to be objctive. The workplace representatives found it more difficult than the teachers to grade the students’ performance. More than three fourths (77 %) of the teachers and more than half (54.4 %) of the workplace representatives found awarding grades to be easy. Well over half of the teachers (67.1 %) and of the workplace representatives (61 %) also found the grades to be comparable between the schools. The better part of the students (63.3 %) considered the joint evaluation talks and the self-evaluation to be useful; one in every ten (10 %), however, looked upon them as useless.

In the next figure are shown the mean values of the answers to the statements made in the questionnaire as assessed by the workplace representatives, the teachers, and the students of the training sectors involved.

77

1 2 3 4

Usefulness of self-evaluation of test

Usefulness of evaluation talks

Degree of comparability of grades between schools

Reaching consensual grades was easy

Were the grades awarded felt to be objective and just?

The test made it possible to evaluate the student’s skill to assess his ownwork and himself

The test made it possible to evaluate the student’s skill to take heed ofothers

The test made it possible to evaluate the student’s skill in troubleshootingand making choices

The test made it possible to evaluate the student’s skills in team work

The test made it possible to evaluate the student’s interaction andcommunication skills

The test made it possible to evaluate the student’s knowledge of healthand safety at work regulations

The test made it possible to evaluate the student’s command of theory andhow to apply it

The test made it possible to assess student’s readiness to plan,implement, and evaluate the work process

The test made it possible to evaluate the student’s knowledge of workmethods, use of tools, and materials

The test made possible the use of things taught at the correspondingcourse

The test allows for a wide range of display of vocational know-how

The test corresponded to the degree of competence requisite for thefuture workplace

Not at all Mean Very well/ Very much

5

Workplace representative Teacher Student

FIGURE 3 THE CORRESPONDENCE OF THE TESTS WITH THE AIMS OF THE STUDY ENTITY AS SEEN BY

THE WORKPLACE REPRESENTATIVE, THE TEACHER, AND THE STUDENT.

78

3.1.2 ORGANIZATION OF TEST AS SEEN BY THE WORKPLACE REPRESENTATIVE, THE TEACHER, AND THE STUDENT The purpose of the questions regarding the organization of the tests was to explore the experiences of the parties concerned as to how well the planning and implementation of the test had succeeded. The figure below shows the mean values of the answers by the workplace representatives, teachers, and students of the various sectors to the put questions.

The teachers, in particular, were satisfied with the instructions for taking the test. No less than 88 per cent of the teachers were very satisfied, as were the majority of the students (64.4 %) and the workplace representatives (57.7 %). A third of the students (32 %) and over 40 percent of the workplace representatives, however, considered the instructions to be quite insufficient. The differences between the parties in this, was statistically significant.

The teachers, generally, were well acquainted with the evaluation targets and criteria of the tests, whereas the students were the least familiar with them. Of the teachers the majority (89 %) and more than half of the workplace representatives (63.1 %) were well informed, while a little less than half of the students (48.6 %) appear to have known the targets and criteria well and a mere third (33.2 %) relatively well. The differences between the three parties in this were statistically significant.

Both the students and the workplace representatives considered their chances to take part in the planning of the tests to be slim at best. A good third of the students (34 %) had not participated at all in the planning. More than one third of the workplace representatives reported their participation in the planning of the test to have been very small. By contrast, more than half of the teachers rated their chances to partake in the planning as good. Also here the differences between the three parties were statistically significant.

In the next figure are shown the mean values of the assessments of the workplace representatives, the teachers, and the students from all sectors to the statements made in the questionnaire.

79

1 2 3 4 5

The test promoted my cooperation with the school

The test improved my touch with working life

Administering the test during the period of learningthrough work was well suited to its purpose

I had a chance to participate in the planning of theskills test

The national evaluation form was useful whenevaluating

I was briefed about the evaluation of the test

I knew beforehand the evaluation criteria of the test

I knew beforehand the evaluation targets of the test

Clarity of the test objectives

Sufficiency of the test instructions

Not at all Mean Very well/ Very Much


FIGURE 4 THE ORGANIZATION OF THE SKILLS TEST AS SEEN BY THE WORKPLACE REPRESENTATIVES,

THE TEACHERS, AND THE STUDENTS.

80

3.1.3 THE SKILLS TESTS AND LEARNING AS SEEN BY THE WORKPLACE REPRESENTATIVES, THE TEACHERS, AND THE STUDENTS By means of relevant questions the purpose was to explore the significance of the evaluation by skills tests to learning one’s profession and to cooperation between the parties. The figure below shows the mean values of the responses of the workplace representatives, the teachers, and the students to the statements and assertions made in the questionnaire.

The teachers were more satisfied than the workplace representatives with the working of the evaluation criteria when assessing. The majority of the teachers (70 %) and the greater part of the workplace representatives (55.4 %) were very pleased with how they worked, whilst more than 10 per cent of the workplace representatives and roughly 5 per cent of the teachers were dissatisfied with how the criteria worked.

The students understood the evaluation criteria fairly well. The majority of the students (61.6 %) reported that they understood the criteria well or very well and a near third (29.4 %) that they had understood them fairly well. All parties reported that the tests provided feedback on the strengths and need for improvement of the students, and they were thought to support the learning and occupational growth of the students relatively well. Although the students held a more critical view, the majority of them (58.8 %) felt that the tests supported learning well and some 45 per cent that the criteria had increased their interest in field of training concerned. A good half of the teachers (64.6 %) and of the workplace representatives (53.2 %) thought the skills tests to be useful also with regard to their own work.

In the next figure is shown the mean values of the answers of the workplace representatives, the teachers, and the students of the various training sectors to the statements made in the questionnaire.

81

1 2 3 4

The test was useful to my work

The test required knowledge by the students of things thathad not been taught beforehand

The student was able to put to good use his experiencesfrom his period of learning through work

The test improved the student’s touch with working life

The test promoted the occupational growth of the student

The test supported the student’s learning

The test provided feedback on the strengths and need forimprovement of the student

I understood the evaluation criteria

The test criteria worked well in evaluating the test

Not at all Mean Very well/ Very much

5


FIGURE 5 THE SKILLS TESTS AND LEARNING AS SEEN BY THE WORKPLACE REPRESENTATIVE, THE

TEACHER, AND THE STUDENT.

Virtually all (95 %) of the students taking part in the pilot evaluation considered the degree of difficulty of the tests to have been just right and only some 3 per cent that the tests had been too difficult.

82

2 %

95 %

3 % Too easyJust rightToo difficult

FIGURE 7 DEGREE OF DIFFICULTY OF THE TEST AS SEEN BY THE STUDENTS.

4 SUMMING UP

- All the tests in health and social services and in construction were carried out in

actual work situations at the workplaces. In contrast, the test in building maintenance technology, were administered at the schools. Of all the tests making up the pilot evaluation less than half (43 %) were carried through at the workplace.

- All of the tests in construction and well-nigh all of the tests in health and social services (93.6 %) were evaluated conjointly by the teacher and the workplace representative. The tests in building maintenance technology were, as a rule, evaluated by the teacher (54 %), by the teacher and student together (20.4 %) or the workplace representative single-handed (18 %).

- The grades awarded in the tests were good in all sectors. In all sectors the most common grade was G4, which was awarded in just under half of all the tests (48.1 %). No test was failed. Satisfactory notes S1–S2 were given to only a very few students. The best learning results were obtained in cooperation skills, which was one of the core skills common to the tests in all three sectors. The biggest differences by sector occurred in the skills displayed in the command of work

83

methods, in the planning of the work process, and in communication and interaction.

- In all sectors the tests, as a means of evaluation, were considered to be good and

the participants, as a rule, were quite satisfied with the way the tests were implemented. The students and workplace representatives, however, thought the test instructions to have been insufficient, and they had also clearly had less chance to participate in the planning of the tests than the teachers. As many as one third of the students did not at all partake in the planning of their tests. The teachers, on the other hand, felt that they had been well enough involved in the planning.

- The evaluation targets and criteria also proved to have been poorly known by the

students. More than one third of the students were only fairly well acquainted with them. The workplace representatives found the joint evaluation to be more difficult and were more dissatisfied with the evaluation criteria than the teachers. The students were the most critical of the usefulness of tests.

5 EXTERNAL QUALITY ASSURANCE IN

PILOT EVALUATION

External quality assurance was also tested in pilot evaluation. The task of a quality assurance system is to ensure the implementation of skills tests in conformity with a given set of objectives and criteria. The aim is to ensure the acquisition of such comparable results, straight from the skills test system, so that they can be used by the organisers of education and national evaluations of educational outcome.

The quality assurance of an evaluation by skills tests along with a system of evaluation by learning results was made so that, common quality standards and criteria were set up for skills tests and the system of evaluation by learning results. Quality assurance targets the process as a whole, from the curricula of the organisers of education to the locally implemented tests. In this project quality assurance comprised evaluation of the curricula of the provider and skills-test material and visit for auditing, external moderation and reporting.

It were evaluated altogether seven curriculum of education organiser, nine skills test materials, nine visits for auditing. External evaluators participated in 11 the test evaluation meeting. There were 11 interviews of student, 12 interviews of teacher, 13 interviews of representatives of working life and six interviews of principals in pilot evaluation. Besides these it was organised four discussion meetings (external moderation).

84

QUALITY ASSURANCE SYSTEM IN EVALUATION PILOT

I Evaluation of written material

• The curriculum of education organiser (evaluation targets, criterias, evaluation practices, documentation)

• The skills material (the content and structure of the test material, evaluation targets, criterias, the test environment, evaluation practices, documentation)

II Visit for auditing

• Auditing the test environment

• Participating in the test evaluation meeting

• Interviews (student, teacher, representative of working life)

Reporting

III External moderation

• Discussion between teachers and representative of working life

h Auditing report to institutions

85

5.1 RESULTS OF QUALITY ASSURANCE AND AUDITING PROCESS I EVALUATION OF THE WRITTEN MATERIAL: CURRICULUM OF

PROVIDER AND SKILLS-TEST MATERIAL

• The curricula of the provider ought better to heed the needs of working life • A more application-oriented and concrete grip on things is called for in the

national core curricula • The skills tests have not yet been generally integrated into the providers’

curricula • More precise distinctions ought to be created between the grades of

satisfactory, good and excellent • More concrete criteria and descriptions of competence are needed • More attention ought to be given to the definition of the level of satisfactory:

the requirements are too high/too low • The knowledge covered by the test varies from sector to sector (core skills,

common emphases • The tests environments vary by training sector and according to the point in

time of the studies (learning through work) • More varied test environments are needed • The core skills is an important part of occupational competence • The national core curricula meet the needs and requirements of working life.

II VISIT FOR AUDITING (Auditing the test environment, participating in

the test evaluation meeting, Interviewing)

• The evaluation practices vary from school to school, from sector to sector and even inside a single school (consensus does not always materialize)

• Tests at school do not make possible to evaluate wide-range occupational competence

• Criterion-based assessment is often difficult • Workplace tests appear to be well organized • The students, teachers, and workplace representatives find a test by process to

be good • The workplace representatives could be ‘bolder’ in evaluating tests • The workplace representatives are committed to the evaluation of the tests,

even though it is very time-consuming • The evaluation and feedback talks connected with the tests are good learning

situations

86

• The role of the teachers in evaluation and feedback talks is important • The teachers as learning and counselling professionals are committed to their

task • The students find it motivating that workplace representatives are involved in

evaluating the tests. III OTHER FEEDBACK (external moderation)

• Auditing visits to the schools are a necessary part of external quality assurance • Evaluating written material is both time-consuming and demanding • The evaluation model, in principle, is OK, but its practical realization still

requires some thought • Documentation ought to be simplified; as it is, there are too many forms to fill

in (partly due to the piloting) • The management considers the cooperation between institutions to be

important • The peer evaluation meetings are experienced as useful, and as a good means

of learning from others.

6 SUMMING UP FOR AUDITING PROCESS

h The curricula of the provider ought better to heed the needs of working life

h More precise distinctions ought to be created between the grades of satisfactory, good and excellent

h More varied test environments are needed

h The evaluation practices vary from school to school, from sector to sector and even inside a single school (consensus does not always materialize)

h Criterion-based assessment is often difficult

h The students, teachers, and workplace representatives find a test by process to be good

h The workplace representatives could be ‘bolder’ in evaluating tests

h The role of the teachers in evaluation and feedback talks is important

h The peer evaluation meetings are experienced as useful, and as a good means of learning from others

h The management considers the cooperation between institutions to be important.

87

the implications of using skills tests as basis for a ... · the implications of using skills tests...

Documents