Brent Duckor Ph.D. (SJSU) April 22, 2014 bearcenter.berkeley.edu/sites/default/files/duckor &...

Brent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014


Page 1: Brent Duckor Ph.D. (SJSU) April 22, 2014bearcenter.berkeley.edu/sites/default/files/Duckor & Tellez_BEAR... · Task-based model! P1 P2 P3 I4 I5 A6 A7 A8 R9 R10 Task 3 Tasks 1 & 2

Brent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC)

BEAR Seminar April 22, 2014

Page 2

Studies under review

ELA event

Duckor, B., Castellano, K., Téllez, K., & Wilson, M. (2013, April). Validating the internal structure of the Performance Assessment for California Teachers (PACT): A multi-dimensional item response model study. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, California.

Mathematics event

Castellano, K., Duckor, B., Téllez, K., Wihardini, D., & Wilson, M. (2013, April). Validity evidence for the internal structure of the Performance Assessment for California Teachers (PACT): Examining the elementary mathematics teaching event. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco, California.

Page 3

Becoming a CA teacher

➢ CA license requirements (“Pre-service”)

• Subject competency tests (CSET)

• Coursework

• TPA (PACT)

➢ CA license requirements (“Preliminary” In-service)

• BTSA Induction program

➢ “Clear” Credential & Full License

Page 4

For teacher candidates...

✓ The PACT is a standardized licensure “exam” constructed in the field over several weeks

✓ It comprises multiple constructed-response tasks

✓ It is “high stakes” in the sense that it has consequences

Page 5

Instrument Content Primer

The Teaching Event is a constructed, extended-response items design

It includes written responses to task prompts

It also includes video clips (e.g., two 10-minute clips)

And artifacts such as lesson plans, instructional materials, examples of student work, assessments, etc.

Page 6

Validity

Page 7

Validity: Working Definition

“The degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests.”

(AERA, APA, & NCME, 1999)

Page 8

Translated into plain English

“The degree to which the written and video evidence provided by a teacher candidate supports the interpretation of that individual’s scores (on 5 domains and 12 tasks) to determine whether that individual will enter the California public school classroom.”

(AERA, APA, & NCME, 1999)

Page 9

Content Validity is not enough

“Teacher educators who participated in the development and design of the assessments were asked to judge the extent to which the content of the Teaching Events was an authentic representation of important dimensions of teaching. Another study examined the alignment of the TE tasks to the California Teaching Performance Expectations (TPEs). Overall, the findings across all content validity activities suggest a strong linkage between the TPE standards, the TE tasks and the skills and abilities that are needed for safe and competent professional practice.” (Technical Manual, 2007, pp. 25-27) [Emphasis added]

Page 10

Validity Evidence

Content

Response processes

Internal structure

Relations to external variables

Consequences

Page 11

As Kane notes (1994)

• The plausibility of an interpretation depends on evidence supporting the proposed interpretation and refuting competing interpretations.

• Moreover, we should expect that different types of validity evidence will be relevant to different parts of the argument.

• Claims that the situations included in licensure examinations are representative of the situations encountered in some area of practice could be supported by expert judgment or by empirical data.

Page 12

Validation to support responsible use

❑ Content Validity: Do the developers demonstrate coverage of the curriculum with a sufficient number of tasks for each topic area to ensure the meaningfulness of the score results?

❑ Response Processes Validity: Did the developers interview candidates to check how time, energy, motivation, confusion, language facility, writing ability, test-wiseness, etc. may have reduced or overstated the meaningfulness of the score results?

❑ Internal Structure Validity: Does a statistical analysis of the “dimensionality” of the teacher candidates’ results indicate that we are measuring what we intend to measure?

❑ Relations to External Variables Validity: Do the PACT score results correlate with other scores, e.g., field placement scores from university supervisors and cooperating teachers?

❑ Consequential Validity: Does the PACT event lead to “teaching to the test”, eliminate other “non-tested” content from the teacher education curriculum, or otherwise harm the novice teacher learning experience?

Page 13

Our study: Focus on Internal Structure Validity Claims

• Objective: Examine evidence for the internal structure of the Elementary Literacy (EL) PACT scores

• Theoretical framework: Validation study to address two research questions:

    • To what extent does an IRT model fit the EL PACT instrument and aid in describing teacher candidate “ability” and task “difficulty” across the State?

    • Is there evidence that the EL PACT assesses multiple constructs other than intended?

• Methods & Data Analyses

    • Sampling: 2008-2010 (n = 1,711)

    • Scoring & Data (production, masked, no student ID)

    • Statistical procedures: Partial credit model (IRT) for unidimensional and multidimensional investigation

• Results

Page 14

Summary Stats

Table 1. Summary Statistics by Item

Items by Domain: Planning (P1-P3), Instruction (I4-I5), Assessment (A6-A8), Reflection (R9-R10), Academic Language (AL11-AL12)

Time       Statistic    P1    P2    P3    I4    I5    A6    A7    A8    R9   R10  AL11  AL12
2008-2009  N           407   407   406   407   407   407   407   311   407   407   407   407
           Mean       2.74  2.74  2.59  2.54  2.45  2.63  2.44  2.54  2.55  2.54  2.18  2.52
           SD         0.72  0.76  0.71  0.74  0.74  0.78  0.76  0.81  0.71  0.78  0.71  0.67
2009-2010  N          1304  1303  1304  1304  1304  1304  1304  1297  1303  1304  1304  1303
           Mean       2.83  2.82  2.73  2.56  2.43  2.61  2.47  2.56  2.50  2.51  2.27  2.46
           SD         0.65  0.70  0.68  0.71  0.71  0.75  0.75  0.78  0.71  0.74  0.67  0.62
Overall    N          1711  1710  1710  1711  1711  1711  1711  1608  1710  1711  1711  1710
           Mean       2.81  2.80  2.70  2.56  2.43  2.61  2.46  2.56  2.51  2.52  2.25  2.48
           SD         0.66  0.71  0.69  0.71  0.72  0.76  0.75  0.78  0.71  0.75  0.68  0.63
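The Overall rows of Table 1 can be cross-checked against the two yearly rows, because a pooled mean is the N-weighted average of the yearly means. The sketch below redoes that arithmetic in Python from the N and Mean values transcribed from the table. The helper name `pooled_mean` is an illustrative assumption and is not part of the original analysis.

```python
# Values transcribed from Table 1 (items P1 ... AL12, in table order).
ITEMS = ["P1", "P2", "P3", "I4", "I5", "A6", "A7", "A8", "R9", "R10", "AL11", "AL12"]
N_0809 = [407, 407, 406, 407, 407, 407, 407, 311, 407, 407, 407, 407]
MEAN_0809 = [2.74, 2.74, 2.59, 2.54, 2.45, 2.63, 2.44, 2.54, 2.55, 2.54, 2.18, 2.52]
N_0910 = [1304, 1303, 1304, 1304, 1304, 1304, 1304, 1297, 1303, 1304, 1304, 1303]
MEAN_0910 = [2.83, 2.82, 2.73, 2.56, 2.43, 2.61, 2.47, 2.56, 2.50, 2.51, 2.27, 2.46]
MEAN_OVERALL = [2.81, 2.80, 2.70, 2.56, 2.43, 2.61, 2.46, 2.56, 2.51, 2.52, 2.25, 2.48]


def pooled_mean(n_a, mean_a, n_b, mean_b):
    """N-weighted average of two group means (hypothetical helper)."""
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)


# Recompute each item's overall mean from the two yearly rows.
pooled = [
    pooled_mean(na, ma, nb, mb)
    for na, ma, nb, mb in zip(N_0809, MEAN_0809, N_0910, MEAN_0910)
]

# Largest absolute difference between recomputed and reported overall means.
largest_gap = max(abs(p - r) for p, r in zip(pooled, MEAN_OVERALL))

for item, p, r in zip(ITEMS, pooled, MEAN_OVERALL):
    print(f"{item:>4}: pooled {p:.3f} vs reported {r:.2f}")
print(f"largest gap: {largest_gap:.4f}")
```

Small gaps are expected, since the published yearly means are themselves rounded to two decimals.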

Page 15

Constructs: Readiness to teach in 5 Domains

Items Design: 12 Tasks

Outcome Space: 12 Rubrics

Reliability

Validity

Measurement Model: IRT scale scores

Page 16

Findings

• Unidimensional Partial Credit Model

• Partial Credit Model fits better than the Rating Scale Model

• The step difficulty parameters band together

• The Planning items tend to be the easiest and the Academic Language the most difficult

Page 17

IRT and Rasch Models

• In the Rasch model, the probability of a specified response is modeled as a function of person and item parameters

• Person (theta) parameters = teacher candidates’ “proficiency”

• Item (delta) parameters = “task difficulties”
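For reference, the standard textbook forms of the dichotomous Rasch model and of the partial credit model can be written as below. The subscripts p (person), i (item), k (score category) and m_i (highest category of item i) are notational assumptions added here, not symbols taken from the slides; the j = 0 term of each inner sum is taken to be zero by convention.

```latex
% Dichotomous Rasch model: person proficiency \theta_p, item difficulty \delta_i
P(X_{pi} = 1 \mid \theta_p, \delta_i)
  = \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)}

% Partial credit model: step difficulties \delta_{ij} for item i
P(X_{pi} = k \mid \theta_p)
  = \frac{\exp\left(\sum_{j=0}^{k} (\theta_p - \delta_{ij})\right)}
         {\sum_{h=0}^{m_i} \exp\left(\sum_{j=0}^{h} (\theta_p - \delta_{ij})\right)},
  \qquad k = 0, 1, \dots, m_i
```

When an item has a single step (m_i = 1), the second expression reduces to the first.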

Page 18

[Figure: three panels, (a), (b) and (c), each showing a respondent location θ relative to an item location δi]

Figure 12. Representation of possible relationships between respondent and item location (adapted from Wilson, 2005).

As shown in case (a) of Figure 12, if a respondent’s location on the person (right-hand) side of the Wright map is above the item location on the left-hand side, he or she is more than 50% likely to make that response. In this case, the probability governing a correct response suggests that the items below him or her are relatively “easier” because he or she has more of the construct, that is, proficiency for the given dimension.

As shown in case (c) of Figure 12, if a respondent’s location on the right-hand side of the Wright map is below the item location on the left-hand side, then he or she is less than 50% likely to make that response. In this case, the probability governing a correct response suggests that the items above him or her are relatively “harder” because he or she has less of the construct, that is, proficiency for the given dimension.

As shown in case (b) of Figure 12, if a respondent’s location on the right-hand side of the Wright map is co-equal with an item location on the left-hand side, then he or ...
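The three cases of Figure 12 follow directly from the logistic form of the Rasch model. The sketch below assumes a dichotomous item; the logit values are illustrative and are not taken from the PACT data.

```python
import math


def rasch_probability(theta, delta):
    """Probability of the modeled response for a person at theta on an item at delta (logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


ITEM_LOCATION = 0.0  # illustrative item location in logits (assumption)

cases = {
    "a": rasch_probability(1.0, ITEM_LOCATION),   # respondent above the item
    "b": rasch_probability(0.0, ITEM_LOCATION),   # respondent level with the item
    "c": rasch_probability(-1.0, ITEM_LOCATION),  # respondent below the item
}

for label, probability in cases.items():
    print(f"case ({label}): P = {probability:.3f}")
```

Only the difference between the two locations matters, so shifting both by the same constant leaves every probability unchanged.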

Page 19

Person Ability|Item Score Thresholds
              |
  6           |
             X|
             X|
              |
  5          X|
             X|AL12.4
             X|AL11a.4 AL11b.4
             X|
  4         XX|P3a.4 I5.4 R9.4
            XX|I4.4 A7.4
            XX|P3b.4 A8.4 R10.4
           XXX|P1.4 A6.4
  3        XXX|P2.4
          XXXX|
        XXXXXX|
  2     XXXXXX|AL11a.3
        XXXXXX|
        XXXXXX|AL11b.3
       XXXXXXX|
  1   XXXXXXXX|
    XXXXXXXXXX|I5.3 A7.3 R10.3
      XXXXXXXX|R9.3 AL12.3
     XXXXXXXXX|I4.3 A6.3 A8.3
  0   XXXXXXXX|P3a.3
     XXXXXXXXX|
     XXXXXXXXX|P2.3 P3b.3
 -1  XXXXXXXXX|
     XXXXXXXXX|P1.3
      XXXXXXXX|
         XXXXX|
 -2      XXXXX|
         XXXXX|
           XXX|
            XX|AL11a.2
 -3         XX|
             X|A8.2 AL11b.2
             X|A7.2
             X|I5.2
 -4          X|I4.2 R9.2 R10.2
             X|P3a.2 A6.2
              |AL12.2
 -5           |P3b.2
              |P1.2
              |P2.2
              |
 -6           |

Threshold for Score Level 4 (Ability level at which examinee has 50 percent chance of obtaining a score of 4)

Threshold for Score Level 3 (Ability level at which examinee has 50 percent chance of obtaining a score of 3 or higher)

Threshold for Score Level 2 (Ability level at which examinee has 50 percent chance of obtaining a score of 2 or higher)
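The thresholds in the legend above are cumulative: each is the ability at which the chance of scoring at that level or higher reaches 50 percent. The sketch below shows one way to compute such thresholds under a partial credit model by bisection. The step difficulties are hypothetical, and it assumes rubric scores run from 1 to 4, so that category indices 1, 2, 3 correspond to score levels 2, 3, 4.

```python
import math


def pcm_probabilities(theta, step_difficulties):
    """Category probabilities (categories 0..m) under the partial credit model."""
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


def cumulative_threshold(step_difficulties, category, low=-20.0, high=20.0, tol=1e-8):
    """Ability at which P(X >= category) = 0.5, found by bisection."""
    while high - low > tol:
        middle = (low + high) / 2.0
        at_or_above = sum(pcm_probabilities(middle, step_difficulties)[category:])
        if at_or_above < 0.5:
            low = middle   # threshold lies higher on the scale
        else:
            high = middle  # threshold lies at or below this point
    return (low + high) / 2.0


# Hypothetical step difficulties for one four-category item (not PACT estimates).
STEPS = [-4.0, 0.0, 4.0]
thresholds = [cumulative_threshold(STEPS, k) for k in (1, 2, 3)]

for score_level, value in zip((2, 3, 4), thresholds):
    print(f"threshold for score level {score_level}: {value:.3f} logits")
```

Bisection works here because the probability of scoring at or above a given category increases monotonically with ability.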

Page 20

Findings

• Multidimensional Partial Credit Model

• 4D (task based) & 5D (domain based) models fit less well (AIC) than 3D model
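Model comparison by AIC trades fit against the number of estimated parameters, and the model with the lowest AIC is preferred. The sketch below illustrates that bookkeeping. The model names, deviances and parameter counts are hypothetical placeholders, not values from the PACT analyses.

```python
def aic(deviance, n_params):
    """Akaike information criterion from a deviance (-2 * log-likelihood)."""
    return deviance + 2 * n_params


# Hypothetical deviances and parameter counts, for illustration only.
candidate_models = {
    "model_A": {"deviance": 1000.0, "n_params": 10},
    "model_B": {"deviance": 998.0, "n_params": 14},
    "model_C": {"deviance": 997.0, "n_params": 19},
}

aic_by_model = {
    name: aic(spec["deviance"], spec["n_params"])
    for name, spec in candidate_models.items()
}
preferred = min(aic_by_model, key=aic_by_model.get)

for name, value in sorted(aic_by_model.items(), key=lambda pair: pair[1]):
    print(f"{name}: AIC = {value:.1f}")
print(f"preferred (lowest AIC): {preferred}")
```

In this toy comparison the extra dimensions lower the deviance slightly, but not enough to offset the penalty for the added parameters.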

Page 21

Task-based model

[Figure: three panels, (a) to (c), mapping items P1 to AL12 onto dimensions: a task-based model (Tasks 1 & 2, Task 3, Task 4, Task 5), a domain-based model (Planning, Instruction, Assessment, Reflection, Academic Language), and a three-dimension model (Planning; Instruction; Assessment, Reflection & Academic Language)]

Page 22

Domain based model

[Figure: the same three-panel diagram as on Page 21, mapping items P1 to AL12 onto task-based, domain-based and three-dimension models]

Page 23

Findings

• Additionally, pairwise correlations are not as high as for modified 3D model

Page 24

Correlations

Table 2. Observed Correlations between Mean Domain Scores (above diagonal) and Disattenuated Correlations between Domains/Dimensions (below diagonal)

                      Planning    Instruction     Assessment     Reflection  Academic Lang
Planning                     -           0.64           0.64           0.63           0.65
Instruction               0.81              -           0.61           0.62           0.61
Assessment                0.80           0.79              -           0.70           0.67
Reflection                0.82           0.84           0.92              -           0.67
Academic Lang             0.84           0.84           0.90           0.95              -
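A disattenuated correlation estimates what the correlation between two scores would be without measurement error. The sketch below uses Spearman's classical correction as one common way to obtain it. The deck does not say how the below-diagonal values in Table 2 were computed (they may have been estimated directly by the multidimensional model), so the function names and the reliabilities here are illustrative assumptions.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction: observed correlation divided by the root of the reliability product."""
    if not (0.0 < reliability_x <= 1.0 and 0.0 < reliability_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


def implied_reliability_product(r_observed, r_disattenuated):
    """Reliability product that the classical formula would need to link the two values."""
    return (r_observed / r_disattenuated) ** 2


# Hypothetical reliabilities of 0.8 for both scores (assumption, not from the deck).
corrected = disattenuate(0.64, 0.8, 0.8)

# Planning-Instruction pair from Table 2: observed 0.64, disattenuated 0.81.
implied = implied_reliability_product(0.64, 0.81)

print(f"corrected correlation with assumed reliabilities: {corrected:.2f}")
print(f"reliability product implied by Table 2 values: {implied:.3f}")
```

The second helper only inverts the formula. It does not establish that the classical correction was the method actually used.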

Page 25

Findings

• There is evidence for “Planning” and “Instructing” and “Meta-Reflecting” proficiencies fitting a 3D model

Page 26

Dimension based model

[Figure: the same three-panel diagram as on Page 21, mapping items P1 to AL12 onto task-based, domain-based and three-dimension models]

Page 27

Dimension 1 - Planning; Dimension 2 - Instruction; Dimension 3 - Assessment, Reflection, Academic Language

Each panel shows Person Ability (X's, left of the bar) against Item Score Thresholds (right of the bar).

    Dimension 1            Dimension 2            Dimension 3
  9           |                      |                      |
              |                      |                      |
            X |                      |                      |
  8           |                      |                      |
            X |                      |                      |
  7           |                    X |                      |
            X |                      |                      |
            X |                    X |                    X |
  6         X |                    X |                    X |
            X |                      |                    X |
            X |                    X |                      |
  5        XX | P3.4              XX |                    X | AL12.4
           XX | P1.4               X | I4.4,I5.4          X | A6.4,AL11a.4
  4        XX |                  XXX |                    X | R9.4
          XXX | P2.4              XX |                   XX | A8a.4
          XXX |                   XX |                  XXX | A8b.4,R10.4
  3       XXX |                  XXX |                  XXX | A7.4
          XXX |                 XXXX |                 XXXX |
     XXXXXXXX |              XXXXXXX |                 XXXX |
  2   XXXXXXX |               XXXXXX |             XXXXXXXX | AL11a.3
    XXXXXXXXX |              XXXXXXX |            XXXXXXXXX | AL11b.3
  1  XXXXXXXX |                XXXXX |                XXXXX |
      XXXXXXX |              XXXXXXX |            XXXXXXXXX | A6.3,A8a.3
       XXXXXX |            XXXXXXXXX |              XXXXXXX | A7.3,A8b.3,R9.3,R10.3,AL12.3
  0     XXXXX | P3.3         XXXXXXX | I5.3         XXXXXXX |
       XXXXXX |              XXXXXXX |             XXXXXXXX |
      XXXXXXX | P2.3         XXXXXXX | I4.3       XXXXXXXXX |
 -1     XXXXX | P1.3         XXXXXXX |            XXXXXXXXX |
       XXXXXX |                XXXXX |             XXXXXXXX |
 -2      XXXX |               XXXXXX |             XXXXXXXX |
         XXXX |            XXXXXXXXX |                 XXXX |
      XXXXXXX |                  XXX |                  XXX |
 -3     XXXXX |                   XX |                    X | AL11a.2
            X |                  XXX |                    X | A8b.2,AL11b.2
            X |                      |                    X | A8a.2
 -4         X |                    X |                    X | R9.2,R10.2
            X |                    X |                    X | A6.2,A7.2
 -5           |                    X | I5.2                 | AL12.2
            X |                      |                      |
              | P3.2                 |                      |
 -6           |                    X | I4.2                 |
              | P1.2,P2.2            |                      |
              |                      |                      |
 -7           |                      |                      |

Threshold for Score Level 4 (Ability level at which examinee has 50 percent chance of obtaining a score of 4 or higher)

Threshold for Score Level 3 (Ability level at which examinee has 50 percent chance of obtaining a score of 3 or higher)

Threshold for Score Level 2 (Ability level at which examinee has 50 percent chance of obtaining a score of 2 or higher)

Literacy (AIC = 33058)

Page 28

Confused?

Page 29

Take aways

• Validation matters: it depends on particular uses and consequences of the score data

• There is evidence for a unidimensional internal structure of the PACT scale on production data to make licensure decisions (who is “in” and who is “out” of teaching in CA public schools)

• There is evidence for person separation (r = .918) across the unidimensional scale, but rater effects (halo, drift) are not yet well understood

• There is evidence for multi-dimensionality not envisioned by the developers, which undermines strong claims by domain

• Sub-score reporting and decomposition (e.g. Academic Language, Assessment, Reflection) are not warranted given the multidimensional (MD) findings
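Person separation reliability, as reported in the take-aways, is commonly defined as the share of observed variance in person estimates that is not attributable to measurement error. The sketch below implements that common definition on hypothetical ability estimates and standard errors. The reported r = .918 presumably came from the estimation software, whose exact formula is not given in the slides.

```python
import statistics


def person_separation_reliability(estimates, standard_errors):
    """(observed variance - mean squared standard error) / observed variance."""
    observed_variance = statistics.variance(estimates)
    error_variance = statistics.fmean([se ** 2 for se in standard_errors])
    return (observed_variance - error_variance) / observed_variance


# Hypothetical ability estimates (logits) and standard errors, for illustration only.
ability_estimates = [-2.0, -1.0, 0.0, 1.0, 2.0]
ability_standard_errors = [0.5, 0.5, 0.5, 0.5, 0.5]

reliability = person_separation_reliability(ability_estimates, ability_standard_errors)
print(f"person separation reliability: {reliability:.3f}")
```

A value near 1 means candidates are spread widely relative to the precision of their estimates. It says nothing about rater effects such as halo or drift.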

Page 30

Eternal Return: Validation to support responsible use

❑ Content Validity: Do the developers demonstrate coverage of the curriculum with a sufficient number of tasks for each topic area to ensure the meaningfulness of the score results?

❑ Response Processes Validity: Did the developers interview candidates to check how time, energy, motivation, confusion, language facility, writing ability, test-wiseness, etc. may have reduced or overstated the meaningfulness of the score results?

❑ Internal Structure Validity: Does a statistical analysis of the “dimensionality” of the teacher candidates’ results indicate that we are measuring what we intend to measure?

❑ Relations to External Variables Validity: Do the PACT score results correlate with other scores, e.g., field placement scores from university supervisors and cooperating teachers?

❑ Consequential Validity: Does the PACT event lead to “teaching to the test”, eliminate other “non-tested” content from the teacher education curriculum, or otherwise harm the novice teacher learning experience?

Page 31

Next steps: No short cuts

1. Strong theory statements/claims derived from latent regression analyses to investigate the effect of campus on person ability estimates are NOT warranted:

❑ E.g., “CSU3”, “CSU4”, and “UC1” all have significantly higher mean ability estimates

❑ E.g., “CSU1” has significantly lower mean ability estimates

2. Problem of interpreting score results because of confounding of individual candidates, campus, program “treatments” and raters

❑ I.e., a multi-level, multi-dimensional study must be conducted with a richer data set to control for unobserved variation

3. Current construct definitions and instrumentation may not detect individual differences in consistent or meaningful ways at the “grain size” needed to offer formative feedback to candidates on Assessment or AL tasks

Page 32

Thank you

For more info. contact: [email protected]