of the National Institute
for Testing & Evaluation
Tzur Karelitz
Sept 2016
Outline
NITE: history, structure & objectives
Overview of NITE’s assessment services & activities
A detailed look at the RAMA project:
• Analyzing and reporting data from large-scale assessments and surveys for the Israeli Ministry of Education.
NITE: History & Structure
Established in 1981 by a consortium of Israeli universities in order to centralize admissions testing for applicants to higher education.
NITE is a public, not-for-profit organization supervised by a board of directors (representatives of the founding universities).
NITE staff includes about 130 professionals: item writers, statisticians, psychometricians, computer programmers, graphic designers, language editors, and logistic & administrative staff.
NITE’s Organizational Structure
CEO: Dr. Anat Ben-Simon
Deputy Director: Dr. Naomi Gafni
Departments
• Test development
• Scoring
• Research
• Operations
• IT
• Finance & administration
Units
• CAT
• Non-cognitive tests
• Test accommodation
• Computerized LD diagnosis system
• Automated text analysis
• RAMA project
NITE: Main Objectives
Provision of assessment services, primarily to institutions of higher education, and also to the educational system (K-12) and other public organizations
Conducting research on admissions, placement, assessment and evaluation in institutions of higher education
Advancement of the fields of measurement, testing and psychometrics in Israel
Typical Assessment Service
Design and development of assessments & tools
Test registration
Test administration & accommodations
Scoring and quality control
Reporting of results
Conducting related research
Other Assessment Services
Expert item review
Test translation/adaptation
Computerizing P&P tests
Analyzing test results
Conducting evaluation projects
Training professional staff
Consulting for organizations in Israel and abroad
Teaching relevant academic courses
Tests in Higher Education
Test | Admission to | Type | ~N
PET | Universities & colleges | P&P / CAT | 70,000
MEIMAD | Pre-academic prep schools | Online | 6,000
MOR/MIRKAM | Medical schools | Assessment centers | 1,700
MITAM | Psych. graduate studies | P&P | 1,300

And multiple, smaller-scale, admission & proficiency tests for various programs and organizations.

Test | Proficiency in | Type | ~N
AMIR/AMIRAM | English | P&P / CAT | 45,000
YAEL | Hebrew | P&P | 25,000

(P&P = paper-and-pencil; CAT = computerized adaptive testing.)
The Psychometric Entrance Test (PET)
Admission to higher education is based on the mean of the PET score
and the school matriculation exam (BAGRUT) grades.
PET is a scholastic aptitude test consisting of:
• Verbal Reasoning (~60 MC items)
• Writing (one task)
• Quantitative Reasoning (~50 MC items)
• English as a Foreign Language (~ 55 MC items)
PET is a standardized P&P test given 5 times a year
• It takes about 3.5 hours to complete
• Scores range 300-800, mean 540, SD=110.
• Adapted versions for examinees with disabilities
PET is translated into Arabic, Russian, English, French,
and Spanish (and occasionally into Italian & Portuguese)
Other projects
MATAL – a comprehensive, standardized, computer-based test battery for the diagnosis of learning disabilities
• Aiding the provision of test accommodations in higher education
HLP – computational tools for analyzing and rating Hebrew texts
• NiteRater – an automatic essay-rating system
ICAP – an initiative to advance educational measurement and psychometrics in Israel
The RAMA Project: Analyzing and reporting results from large-scale tests and surveys for the
Ministry of Education
Outline
• Background
• Team
• Main projects
• Tasks
• Tools
• School Climate and Pedagogical Environment
(CPE) surveys
• Growth and Effectiveness Measures for Schools
(GEMS)
• Report production
• Challenges and successes
17/12/2015 Rama Project
Background
• A 5-year, variable-quota contract issued by the MOE, which began in 2012 and is expected to be renewed in 2017.
• Providing services to “RAMA” – the National Authority for Measurement and Evaluation in Education (a branch of the MOE).
• The main project cycle runs from May to January, peaking in July-October.
[Process flow: Test/Survey Development (climate surveys: RAMA; achievement tests: CET) → Administration, Human Rating & Data Entry (TALDOR) → Cleansing, Analysis, Reporting (NITE).]
The RAMA Team
Tzur – Psychometrician
Matan – Manager
Eran – DBA
Eliran, Evgeny, Shaul, Valla – Analysts
Nethanel – Report Production
And 6 part-time assistants
Main Projects - 2015
• Growth and Effectiveness Measures for Schools (GEMS): achievement tests for 5th & 8th grades
  • First language (Hebrew/Arabic), Math, English, Science & Technology
  • About 200,000 records per year
• Climate and Pedagogical Environment (CPE) surveys
  • 5th–9th grades: about 150,000 students and 12,000 teachers per year
  • 10th & 11th grades: about 50,000 students and 8,000 teachers per year
• Results for surveys and tests are reported on 4 levels:
  • Schools (1/3 of the country): about 900 elementary & middle schools and 300 high schools per year
  • Municipalities: about 100 per year
  • Districts: 8 every year
  • National: by language, school type (secular/religious), sub-groups within the Arab population, SES
• Hebrew as a Second Language for Arabic speakers
  • 7,000 6th-grade students (test and survey) and 600 teachers (survey)
  • Results reported nationally
Main Project Tasks
• Database design and maintenance
• Data cleansing
• Dealing with inconsistent, missing, corrupted, inappropriate or duplicate data in surveys, tests and background information
• Quality control of surveys prepared by RAMA
• Maintenance of item properties in the database
• Factor analysis for surveys and tests
• Item analysis for surveys and tests
• Scoring using classical scaling and calibration
• Aggregations and norms
• IRT analyses (sometimes)
(The analysis tasks above are performed in parallel channels; see Tools.)
Main Project Tasks (cont.)
• Extraction of historical comparison data
• Dealing with special cases
• Generation of reports for project control & monitoring
• Automatic generation of personalized reports
• levels: school, district, municipality, national
• Human and automatic quality control of reports
• Language and format editing of reports
• Writing insights and conclusions based on results
• Preparation of CDs and envelopes for mailing
• Secondary data analysis (research questions)
• Documentation and technical reports
Tools
• SQL – Data management, cleansing and history
• SAS, SPSS – Data manipulation and analysis
• Parallel channel: all the main analyses are performed separately by different analysts using different code and software; the results are compared and inconsistencies are resolved.
• VBA – automation, quality control, reports’ post-processing
• Magic Publisher – Inserting SPSS output into Word
• Winsteps – IRT
• Word – Reports
• Excel – Everything…
School Climate and Pedagogical Environment
(CPE) Surveys
Goal of CPE Reports
The surveys aim to provide a detailed picture of aspects of the social climate and pedagogical processes in schools that are essential for educational quality: satisfaction, relationships within the school community, security and safety, discipline and behavioral issues, emotional-motivational aspects, teaching-learning-assessment processes, value-based education, etc.
The reports are meant to help school personnel set evidence-based goals, and to plan, track and monitor important aspects of schooling such as interpersonal relationships, educational interaction, motivation and educational aptitude among students.
Reported Indicators, 2015
Climate | Pedagogy
An overall positive attitude among students towards school | Practices of quality teaching-learning-assessment
Close relationships and caring between teachers and students | Class assignments that promote inquiry learning
Positive relations between students and their peers | Self-learning strategies
Involvement in violent incidents | Receiving feedback to promote learning
Digital violence via social networks and on the Internet | School's efforts to encourage social and civic involvement
Verbal violence | School's efforts to promote tolerance of diversity
School's efforts to encourage a sense of safety | School trips & tours
Proper behavior of students in the classroom | Students' recreational activities
Teachers' satisfaction with school | Differentiated instruction at school
Involvement of parents in school | Giving feedback to promote learning
Teachers' lack of a sense of safety | Teamwork at school
Competence, curiosity and interest in learning | Collaborative learning in school
Pedagogy (cont.): School's efforts to encourage motivation and curiosity among students; Teachers' professional development
Parameter Report
• Main source of information about survey items
  • Content, type, instructions for coding and analysis, conditioning, etc.
• Effective and uniform communication with RAMA
• Improved automation, quality control, code design
[Parameter report columns: Item #, Item ID, Core/version, Code type, Item text, Item response, Indicator name, Use?, Response range, Recode missing values, Code ID, Missing-value code ID.]
Coding Items and Aggregating Indicators
Coding
• Dicho: Strongly Agree & Agree are coded as 1, rest = 0
• R_Dicho: Strongly Disagree & Disagree are coded as 1, rest = 0
Aggregating
• Calculate the mean of each coded item, across all respondents. The mean is calculated within the desired aggregation level (e.g., the mean of each item within each language group, or school type).
• Calculate the mean of item means, over all the items that belong to the indicator. The result is the mean % of respondents endorsing the statements that constitute the indicator.
• For categorical items, we count the number of respondents who selected each category within each aggregation level.
Historical comparison data
• Past results are extracted from the DB and presented in reports.
• Exclusion rules must be followed because in some cases past data must be censored.
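The coding-and-aggregation recipe above can be sketched as follows. This is a minimal illustration: the item names and responses are invented, and the real processing runs in SAS/SPSS against the survey database.

```python
from statistics import mean

# Hypothetical mini-survey: answers on a 1-5 agreement scale (5 = Strongly Agree).
responses = {
    "q1": [5, 4, 2, 5, 3],
    "q2": [4, 4, 1, 5, 2],
}

def dicho(v):
    """Dicho coding: Strongly Agree (5) & Agree (4) -> 1, rest -> 0."""
    return 1 if v >= 4 else 0

# Step 1: mean of each coded item across respondents (this would be computed
# within one aggregation level, e.g. one school or language group).
item_means = {item: mean(dicho(v) for v in vals) for item, vals in responses.items()}

# Step 2: mean of item means over the items belonging to the indicator ->
# mean % of respondents endorsing the indicator's statements.
indicator_pct = 100 * mean(item_means.values())
print(round(indicator_pct))  # -> 60
```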
Aggregation Levels
• School level:
  • Whole school
  • Across age groups: 5th–6th grades, 7th–9th grades, 5th–9th grades
  • Within age groups: 5th, 6th, 7th, 8th, 9th
• Comparison norms:
  • Language: Hebrew, Arabic
  • Across age groups: 5th–6th, 7th–9th, 5th–9th grades
  • Educational authority: religious, secular
  • Arabic sub-groups: Arab, Druze, Bedouin
  • SES: low, medium, high
National CPE Report
[Chart: % agreeing with "positive general attitude towards school" (students), 2008-2015, shown for all schools, Hebrew schools, and Arabic schools, by grade band (5-6, 7-9, 10-11).]
Goal of the GEMS Achievement Tests
To examine the extent to which elementary and middle school students are performing at the expected level according to the curriculum in four core subjects: First Language (Hebrew/Arabic), Mathematics, English, and Science and Technology.
Achievement Tests – Background
Each school participates once every 3 years, by taking all relevant tests.
• Every year, ⅓ of the schools take the external tests (data is sent to RAMA), and ⅔ use internal tests (same test, but the data stays in school).

Tests (post-2014): Language, Math, English, and Science & Technology, in 5th and 8th grades.
Achievement Tests – Forms
• Every year:
  • Two operational forms in Hebrew -> translated into Arabic
  • Two pilot forms (next year's test, administered securely)
  • Form adaptations for the Ultra-Orthodox population
  • Students with special needs are tested with accommodations
• The test forms cover:
  • Main topics from the curriculum
  • A range of skills, abilities and levels of thinking
• The forms are composed of:
  • Multiple-choice items
  • Open-ended items (dichotomous or polytomous scoring)
  • Matching items
  • Items with multiple scoring dimensions
  • Listening comprehension items
  • Multi-stage items
  • Testlets
Achievement Tests – Scoring
• Scoring items based on the parameter report
  • Parameter reports are created by the test developers and contain the information needed to score & aggregate items.
• Calculations are performed in two parallel channels and compared:
  • SAS: macro-based code gets its input from the parameter report
  • SPSS: VBA code generates an SPSS analysis script based on the parameter report
• Calculations are also triple-checked by RAMA.
• Calculating a total score for each examinee
• Calculating sub-scores for each examinee
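The two-channel idea can be sketched in miniature: the same total score is computed by two independently written routines and the results are compared, mirroring the SAS-vs-SPSS double computation. The answer key, responses, and both routines below are invented stand-ins.

```python
# Hypothetical answer key and one examinee's responses.
key = {"q1": "b", "q2": "d", "q3": "a"}
answers = {"q1": "b", "q2": "c", "q3": "a"}

def score_channel_1(answers, key):
    """Channel 1: count correct responses with a comprehension."""
    return sum(1 for item, resp in answers.items() if key[item] == resp)

def score_channel_2(answers, key):
    """Channel 2: same score computed by an independent loop."""
    total = 0
    for item in key:
        total += int(answers.get(item) == key[item])
    return total

s1, s2 = score_channel_1(answers, key), score_channel_2(answers, key)
# Any mismatch between channels would be flagged and resolved.
assert s1 == s2, f"channel mismatch: {s1} != {s2}"
print(s1)  # -> 2
```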
Achievement Tests – Item Analysis
• Using VBA, analysis outputs are compared, pasted into Excel, and formatted to highlight problems.
• Item analysis includes:
  • Descriptive statistics of the total score and sub-scores
  • Reliability analyses of the total score and sub-scores
  • Correlations between items and total scores
  • Item response and score distributions (+ total score means)
  • Correlations between total score and sub-scores
  • Graphical item analysis (response distribution over total score)
  • Analysis of time and effort from self-report data
• DIF analysis includes:
  • Form A vs. Form B
  • Hebrew form vs. Arabic form
  • Ultra-Orthodox form vs. regular forms (Jewish sample only)
  • Boys vs. girls within language
  • Pilot form vs. operational form (for making equating decisions)
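As an illustration of one statistic on this list, the item-total correlation for a dichotomous item is just the Pearson correlation between the 0/1 item score and the total score. The item and total scores below are invented.

```python
from statistics import mean, pstdev

# Invented data: one dichotomous item (0/1) and the total scores of six examinees.
item = [1, 0, 1, 1, 0, 1]
total = [48, 30, 52, 45, 28, 50]

def pearson(x, y):
    """Pearson correlation using population moments."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

# A high item-total correlation suggests the item discriminates well.
print(round(pearson(item, total), 2))  # -> 0.97
```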
Item Analysis – Graphical Aids
[Chart: response distribution of item Q12 (answers 1-4, 98, 99) as a function of total score: percentage of students choosing each answer, by total-score band.]
[Chart: score comparison between Forms 30 & 31 (score on Form 31 plotted against score on Form 30, 0-1 scale).]
[Chart: score differences between Forms 30 & 31, by item (S8-S26), with differences ranging roughly from -0.2 to 0.2.]
Achievement Tests – Scaling
• Goal: transforming all raw scores to a scaled score, uniform across different forms of a test. This scaled score is used for reporting total scores and sub-scores to schools.
• Assumptions:
  • Equivalent populations within a year
  • Translation does not affect the difficulty of the form
• Method: linearly scaling all forms to the main operational form (Form A)
α = S(A)/S(B)
β = M(A) - M(B) × α
(M and S denote the mean and SD of raw scores on each form; a Form B raw score x is reported as α·x + β, which gives Form B the mean and SD of Form A.)
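A minimal sketch of this linear scaling, with invented score distributions, taking M and S as the mean and population SD of each form:

```python
from statistics import mean, pstdev

# Invented raw scores on the main operational form (A) and a second form (B).
form_a = [52, 60, 48, 70, 55]
form_b = [45, 58, 40, 66, 51]

# alpha = S(A)/S(B), beta = M(A) - M(B)*alpha, as in the formulas above.
alpha = pstdev(form_a) / pstdev(form_b)
beta = mean(form_a) - mean(form_b) * alpha

# Scale every Form B score onto the Form A metric.
scaled_b = [alpha * x + beta for x in form_b]

# After scaling, Form B has (up to rounding) Form A's mean and SD.
print(round(mean(scaled_b), 3), round(pstdev(scaled_b), 3))
```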
Equating Achievement Test Scores
• Goal:
• Transforming the GEMS scores into a uniform scale across years, to aid interpretation.
• Background:
• Established in 2008, with mean=500 and SD=100.
• Applies only to the total scores, not the subscores.
• Method:
• Previous year’s pilot forms are linked to current year’s operational forms using anchor items.
• Building an “equating chain” (using both linear equating and Tucker method) to obtain parameters for transforming this year’s scaled score to the multi-year score.
Multi-year Equating of GEMS Tests
[Diagram: the equating chain. Step 1 links Form B 2015 to Form A 2015 (α1, β1); step 2 links Form A 2015 to the 2014 pilot form via the Tucker method (α2, β2); step 3 links the pilot to Form A 2014 (α3, β3); step 4 composes steps 1-3 (α4, β4); step 5 applies the multi-year-score parameters calculated last year (α5, β5); step 6 composes the full chain (α6, β6).]

α1 = S(B)/S(A)      β1 = M(B) - M(A) × α1
α3 = S(A*)/S(A1*)   β3 = M(A*) - M(A1*) × α3
α4 = α1 × α2 × α3   β4 = α2 × α3 × β1 + α3 × β2 + β3
α6 = α4 × α5        β6 = α5 × β4 + β5
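Since each link in the chain is a linear map x → α·x + β, the chained parameters (α4, β4, α6, β6 above) are simply compositions of the individual links. A minimal sketch, with invented parameter values:

```python
def compose(link2, link1):
    """Return the linear map equivalent to applying link1, then link2."""
    a2, b2 = link2
    a1, b1 = link1
    # link2(link1(x)) = a2*(a1*x + b1) + b2 = (a2*a1)*x + (a2*b1 + b2)
    return (a2 * a1, a2 * b1 + b2)

# Invented link parameters (alpha, beta) for three links of the chain.
link1 = (1.05, -3.0)
link2 = (0.98, 4.0)
link3 = (1.02, 1.5)

# Compose links 1-3: matches alpha4 = a1*a2*a3 and
# beta4 = a2*a3*b1 + a3*b2 + b3 from the formulas above.
a4, b4 = compose(link3, compose(link2, link1))
print(round(a4, 4), round(b4, 4))
```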
Achievement Tests – Aggregations
• Reported results:
  • Mean scores, quartiles and percentiles, attitudes towards the subject, # of examinees and response rates
  • Historical comparison data
• Aggregation levels:
  • Grade, school, municipality, district, language, educational supervision, national
  • Excluding special needs, recent immigrants & ultra-orthodox schools
  • Using sampling & nonresponse weights
• Segmentation within aggregation levels: student SES, school SES, special needs
[Chart: multi-year mean scores in 5th-grade Math, 2007-2015, shown separately for Hebrew-speaking and Arabic-speaking schools.]
Report Production
• Building & designing report templates
• Mapping reports: which datum goes where?
• Bookmarking for automatic data insertion
• Dissecting a report into sub-reports
• Running SPSS code to create the report's XML
• Inserting data into the report using Magic Publisher
• Post-processing – deletions, comments, visual editing
• Dealing with exceptions and special cases
• Performing manual & automatic data checking
• Burning CDs, preparing envelopes, uploading files to the web
• A typical school report contains 150 pages, 50 tables, 50 graphs, and over 5,000 data points.
• The municipal, district and national reports are even bigger.
• These numbers have increased by more than 150% since 2012.
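The map-then-insert steps can be sketched in miniature: a template with named slots, a "report map" saying which datum fills which slot, automated insertion, and an automatic completeness check. The actual pipeline uses Word bookmarks, SPSS-generated XML, and Magic Publisher; the template, field names, and values below are invented.

```python
# Invented report template with named placeholders (standing in for Word bookmarks).
template = "Math, 5th grade: {n_examinees} examinees ({pct_examinees}% of cohort)"

# Invented report map: which result column feeds which placeholder.
report_map = {"n_examinees": "math_5_n", "pct_examinees": "math_5_pct"}

# Invented analysis results keyed by column name.
results = {"math_5_n": 112, "math_5_pct": 96}

# Automatic data insertion driven by the map.
values = {slot: results[col] for slot, col in report_map.items()}
report = template.format(**values)

# Automatic check: every placeholder must have been filled.
assert "{" not in report
print(report)  # -> Math, 5th grade: 112 examinees (96% of cohort)
```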
Magic Publisher – a tool for inserting SPSS output into Word and PPT
Generating Report Maps
[Example: a table template, its report map (which datum goes where) and map check, for a table of the number and percentage of examinees by subject (Hebrew, Math, English) and grade (5th), followed by the final table after data insertion.]
Report Template
Tables after Data Insertion
Using Macros to Build Tables & Graphs
Dealing with Exceptions
• Low response rates
• Cheating
• Exemption
• Refusal
• Bilingual schools
• Exceptions in testing conditions
• Exceptions in historical comparisons for particular schools or groups of schools
Challenges
• Tight schedule and bottlenecks
• Long learning curve
• Distribution of knowledge
• New projects and requests every year
• Extensive changes every year
• Psychometric challenges
• Need to work in parallel channels
• Exceptions and special cases
• Extraction of historical data
• Documentation
Successes
• Meeting deadlines & quality control standards
• Improving work processes, organization and automation
• Improving and extending data analysis and report production
• Reducing time for treating special cases
• Designing and building a new project database