Program Review Support Tool Nathan Pellegrin Research Analyst


Post on 12-Jan-2016


Page 1

Program Review Support Tool

Nathan Pellegrin, Research Analyst

Page 2

Goals

Background and purpose of the tool

Demonstration

Cal-PASS update

OLAP Development at Cal-PASS

OLAP Success Story: SSPIRE Cube

Future development

Page 3

The Program Review Support Tool

• Funded by Hewlett.
• Currently being tested by several colleges.
• All data is from MIS.
• Caveat: figures may not match those in locally produced reports because of differences in master data sources and in the formulae used to derive figures.
• Like a “data smorgasbord”; it includes
  – student demographics
  – course grades
  – TOP code course hierarchy

… the menu will be expanding!

Page 4

Purpose

• Not the product of a mandate or requirement from the Chancellor's Office.
• Not intended to take the place of local tools.
• Not intended to drive evaluation activities.
• Intended for use by colleges as an optional, FREE tool in their program review process.
• Obtain feedback from users to scale and improve our data model and OLAP infrastructure.

Page 5

Cal-PASS Statistics

Over 300,000,000 records

Up to 15 years of data in some regions

Over 7,000 schools, colleges and university members

Over 150 research studies conducted in the last two years

Sixty-six Professional Learning Councils (1,200+ faculty)

Page 6

Universities (23)

CSU
• Channel Islands • Dominguez Hills • Fresno • Long Beach • Los Angeles • Monterey Bay • Pomona • Sacramento • San Bernardino • San Marcos • Stanislaus • San Francisco • San Jose • Sonoma

UC
• Davis • Merced • Riverside • San Diego • Santa Barbara • Santa Cruz

PRIVATE
• Otis College of Art and Design • National University • University of the Pacific

Page 7

Changing the Paradigm: OLAP Applications

• OLAP = On-Line Analytical Processing
• Like Excel pivot tables, except Excel handles only two-dimensional data.
• Stores pre-computed aggregations of data with B-Tree indexing for fast retrieval and fast calculation.
• Enables users to analyze data quickly with drag-and-drop manipulation of variables and dynamic visualization.
• Web-based for easy access: all processing is performed on the server, so it does not tie up your workstation (zero footprint).
• Big time savings!
• Ideal for the action research paradigm and design research.
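To make the "pre-computed aggregations" idea concrete, here is a minimal Python sketch. It is not how SSAS stores a cube; it simply sums a handful of made-up enrollment rows over every subset of three hypothetical dimensions ahead of time, so that a later slice is a single dictionary lookup. The dimension names and the rows are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (term, gender, grade, enrollment count)
ROWS = [
    ("F2008", "F", "A", 12),
    ("F2008", "M", "A", 9),
    ("F2008", "F", "B", 7),
    ("S2009", "F", "A", 10),
    ("S2009", "M", "B", 4),
]
DIMS = ("term", "gender", "grade")


def build_cube(rows, dims):
    """Pre-compute the sum of the measure for every subset of dimensions."""
    cube = defaultdict(int)
    for row in rows:
        values, measure = row[:-1], row[-1]
        for k in range(len(dims) + 1):
            for idx in combinations(range(len(dims)), k):
                # Key = the dimensions kept, paired with this row's values.
                key = tuple((dims[i], values[i]) for i in idx)
                cube[key] += measure
    return dict(cube)


def lookup(cube, **filters):
    """Answer a slice such as term='F2008', grade='A' with one lookup."""
    key = tuple((d, filters[d]) for d in DIMS if d in filters)
    return cube.get(key, 0)


cube = build_cube(ROWS, DIMS)
print(lookup(cube))                            # grand total
print(lookup(cube, term="F2008"))              # sliced on one dimension
print(lookup(cube, term="F2008", grade="A"))   # sliced on two dimensions
```

The trade-off is the usual one for OLAP: more storage and processing time up front in exchange for fast answers when a user drags a new variable into the view.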

Page 8

3 Layers of the Application

User Interface - Dundas
  (queries the cube with MDX)
OLAP Cube - SSAS
  (queries the database with SQL)
Database(s) - SQL Server

Page 9

Development Process

The OLAP project is a technical collaboration between IT and Research …

• Server Architecture/O.R. – Alex Zakharenkov (IT)
• Submission Processes/User Interface – Nick Wade (IT)
• Data Model/ETL – Nathan Pellegrin (Research)
• Design/Feedback of OLAP cubes – All IT and Research Staff, including Terrence Willett and Mary Kay Patton

Page 10

Development times

• Development of the initial dimensional model started in July 2008; incremental additions and changes congealed into a (basic) model by February 2009.
• Initial development of Program Review, including feedback and changes ≈ 8 weeks.
• Dimensional model ETL execution ≈ 15 hours.
• Processing of OLAP cube ≈ 20 min./300K rows.
• Initial deployment of UI ≈ 3 weeks. Several changes since then.
• UI required tweaks to OLAP cube design.

Page 11

Development Tools

• .NET
• SQL Server
  – storage
  – Integration Services
  – Analysis Services
• BIDS
• Dundas

Page 12

Cal-PASS Data Flow

[Diagram] Sources (K-12, CCC COMIS, CSU, UC, private universities, custom files) → Cal-PASS Submission → ETL → Object Repository → ETL → Dimensional Model

Cal-PASS Submission
• Semistructured data
• Format/value validation

Object Repository
• Storage
• Application integration
• Key-value pairs (KVP) design

Dimensional Model
• Analytical
• Integrated K-12/CC/Univ
• Time-dependent
• 2NF (redundant CK)
• Optimized indexing

[The slide also shows a sample of a comma-delimited submission file with student-level records; the records are omitted here.]
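The "format/value validation" step applied to incoming submissions could look something like the Python sketch below. The five-field layout (academic year, school code, student id, birth date as MMDDYYYY, gender), the set of allowed gender codes, and the sample rows are all assumptions made for illustration; they are not the actual Cal-PASS submission specification.

```python
import csv
import io
from datetime import datetime

# Hypothetical layout; real submission files carry many more fields.
FIELDS = ("year", "school_code", "student_id", "birth_date", "gender")
GENDER_CODES = {"M", "F", "X"}  # assumed code set


def validate_row(row):
    """Return a list of problems found in one parsed CSV row."""
    if len(row) != len(FIELDS):
        return [f"expected {len(FIELDS)} fields, got {len(row)}"]
    rec = dict(zip(FIELDS, row))
    problems = []
    if not (len(rec["year"]) == 4 and rec["year"].isdigit()):
        problems.append("year must be four digits, e.g. 0607")
    if not rec["student_id"]:
        problems.append("student_id is empty")
    try:
        datetime.strptime(rec["birth_date"], "%m%d%Y")
    except ValueError:
        problems.append("birth_date is not a valid MMDDYYYY date")
    if rec["gender"] not in GENDER_CODES:
        problems.append("gender code not recognized")
    return problems


# Synthetic sample data (not real student records).
SAMPLE = (
    "0607,99999990000001,0000000001,01151990,F\n"
    "0607,99999990000001,,13451990,Q\n"
)

for line_no, row in enumerate(csv.reader(io.StringIO(SAMPLE)), start=1):
    print(line_no, validate_row(row) or "ok")
```

Collecting every problem per row, rather than stopping at the first, makes it easier to send a submitter one complete error report.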

Page 13

What does a dimensional data model do for Cal-PASS?

UNIFY: Data from across segments is integrated into a unified dataset.

STANDARDIZE: Table and field names, data types and value coding systems are standardized to be the same for all segments.

SIMPLIFY: The number of tables and fields used to store the data is reduced. The granularity of each table is at the unit of analysis. Table relationships reflect analytical relationships between entities.

BOOST PRODUCTIVITY: The simpler, cleaner data model makes it easier to develop cubes with re-usable components, generalized for all segments. Currently, analytical data processing must be developed separately for each segment. Using a dimensional model, only one pathway needs to be developed that applies to all segments.

IMPROVE DATA QUALITY: Merging data brings data quality issues to light so they can be noted and/or resolved. Establishing primary and foreign key relationships enforces referential integrity. Multiple student identifiers are unified to produce a single “metakey”. Missing course CBEDS classifications are imputed using machine learning.

REDUCE RISK: Without a dimensional model, producing one metric for all segments requires a separate analytical data processing pathway for each segment, which means more maintenance and an increased risk of inconsistent results. With a dimensional model, the analytical computations and services are centralized.
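As a rough illustration of STANDARDIZE and of the "one pathway" argument, the Python sketch below renames segment-specific fields and recodes values into a single layout, then computes one metric with one function for every segment. All field names, code values and records here are hypothetical (only SB00 appears elsewhere in this deck), and the pass-rate definition is an assumption, not a Cal-PASS formula.

```python
# Hypothetical per-segment field names mapped to the unified names.
FIELD_MAP = {
    "k12": {"stu_id": "student_id", "sex": "gender", "crs_grade": "grade"},
    "cc": {"SB00": "student_id", "gender_code": "gender", "grade_code": "grade"},
}
# Hypothetical value codings mapped to one standard coding.
GENDER_MAP = {"1": "M", "2": "F", "M": "M", "F": "F"}
PASSING = {"A", "B", "C"}  # assumed definition of a passing grade


def standardize(segment, record):
    """Rename fields and recode values into the unified layout."""
    mapping = FIELD_MAP[segment]
    unified = {mapping[k]: v for k, v in record.items() if k in mapping}
    unified["gender"] = GENDER_MAP.get(unified.get("gender"), "U")
    unified["segment"] = segment
    return unified


def pass_rate(records):
    """A single metric definition that applies to every segment."""
    graded = [r for r in records if r.get("grade")]
    if not graded:
        return 0.0
    return sum(r["grade"] in PASSING for r in graded) / len(graded)


unified = [
    standardize("k12", {"stu_id": "k-001", "sex": "2", "crs_grade": "B"}),
    standardize("k12", {"stu_id": "k-002", "sex": "1", "crs_grade": "F"}),
    standardize("cc", {"SB00": "c-001", "gender_code": "F", "grade_code": "A"}),
    standardize("cc", {"SB00": "c-002", "gender_code": "M", "grade_code": "C"}),
]
print(pass_rate(unified))
print(pass_rate([r for r in unified if r["segment"] == "k12"]))
```

Because the metric is defined once against the unified layout, a change to the definition reaches every segment at the same time.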

Page 14

Cal-PASS Unified Dimensional Data Model (Selected Tables)

[Diagram] Tables shown: Organization, Student, Course, Term, Course Taxonomy, Student Status, Course Outcome, Award. The legend distinguishes fact tables, dimension tables and foreign key relationships.
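A small runnable sketch of the fact/dimension/foreign-key idea, using Python's built-in sqlite3 module rather than SQL Server. The transcript does not preserve which tables the slide marks as facts, so treating Course Outcome as a fact table that references Student, Course and Term is an assumption, as are all column names and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE student (student_key INTEGER PRIMARY KEY, gender TEXT);
CREATE TABLE course  (course_key  INTEGER PRIMARY KEY, title  TEXT);
CREATE TABLE term    (term_key    INTEGER PRIMARY KEY, label  TEXT);
CREATE TABLE course_outcome (            -- assumed fact table
    student_key INTEGER NOT NULL REFERENCES student(student_key),
    course_key  INTEGER NOT NULL REFERENCES course(course_key),
    term_key    INTEGER NOT NULL REFERENCES term(term_key),
    grade       TEXT
);
INSERT INTO student VALUES (1, 'F'), (2, 'M');
INSERT INTO course  VALUES (10, 'Algebra');
INSERT INTO term    VALUES (100, 'Fall 2008');
INSERT INTO course_outcome VALUES (1, 10, 100, 'A'), (2, 10, 100, 'C');
""")

# Analytical query: join the fact table to its dimensions and aggregate.
rows = conn.execute("""
    SELECT t.label, s.gender, COUNT(*)
    FROM course_outcome AS f
    JOIN student AS s ON s.student_key = f.student_key
    JOIN term    AS t ON t.term_key    = f.term_key
    GROUP BY t.label, s.gender
    ORDER BY s.gender
""").fetchall()
print(rows)

# Referential integrity: a fact row for an unknown student is rejected.
try:
    conn.execute("INSERT INTO course_outcome VALUES (99, 10, 100, 'B')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print("orphan row rejected:", orphan_rejected)
```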

Page 15

The Ideal: Centralization of analytical query processing

Each statistic can emerge at multiple presentation points, but there is only one logical control point.

[Diagram] Layers shown: Dimensional Model Tables; Views and Stored Procs; OLAP; Presentation & User Engagement. Also labeled: user-defined cohorts; model outputs.

Page 16
Page 17
Page 18

Student identifiers from each source system are mapped to a new identifier through transitive closure of all connected values (using a modified version of the Floyd–Warshall algorithm).

[Diagram] Graph of identifier values. Identifier types: local district student id; CSIS SSID; name + gender + DOB; CCCCO SID (SB00). Example nodes: n1, d1, d2, n2, o1, c1, d3, m1.

Each edge represents a record linking two values of different identifiers in submitted student records.
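The identifier-unification step can be sketched in Python as follows. This is a plain Warshall-style boolean closure over an undirected graph, not the "modified version" mentioned on the slide, whose details are not given. The node names reuse labels from the diagram, but the edges are invented for illustration.

```python
# Hypothetical link records: each pair of identifier values co-occurred
# in a submitted student record.
EDGES = [("n1", "d1"), ("d1", "c1"), ("n2", "d2"), ("d2", "o1"), ("d3", "m1")]


def transitive_closure(edges):
    """Warshall-style closure over an undirected graph of identifier values."""
    nodes = sorted({v for edge in edges for v in edge})
    reach = {a: {b: a == b for b in nodes} for a in nodes}
    for a, b in edges:
        reach[a][b] = reach[b][a] = True
    for k in nodes:                 # allow k as an intermediate value
        for i in nodes:
            if reach[i][k]:
                for j in nodes:
                    if reach[k][j]:
                        reach[i][j] = True
    return nodes, reach


def assign_metakeys(edges):
    """Give every identifier value in a connected group the same new key."""
    nodes, reach = transitive_closure(edges)
    metakey, next_key = {}, 1
    for node in nodes:
        if node not in metakey:
            for other in nodes:
                if reach[node][other]:
                    metakey[other] = next_key
            next_key += 1
    return metakey


keys = assign_metakeys(EDGES)
print(keys)
```

The closure costs time cubic in the number of identifier values, so for large volumes a union-find (disjoint-set) structure is a common alternative that yields the same groups.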

Page 19

OLAP Success Story: SSPIRE Cube

• Funded by Irvine Foundation.
• Currently used by nine colleges.
• Incorporates MIS data with data submitted by colleges (custom files).
• Tracks cohorts of students.
• Demonstration using Merced College (thank you, Dr. Duran!)

Page 20

Program Review Support Tool

Page 21

This is only the beginning…

• Provide access to K-12 districts and universities
• Inter-segmental OLAP cubes
• Link non-academic outcomes (Employment Development Department, Child Welfare Data System)

“Success at Every Level”

Page 22

Education Data and Information Act of 2008 (SB 1298)

1. convenes a high-level working group to decide the best governance structure for the comprehensive education data system;

2. directs the State Chief Information Office (CIO), in consultation with educators and education policymakers, to prepare a strategy plan outlining a clear path for technical implementation; and

3. requires the various education segments to begin using a common student identifier, so that once a governance structure and technical architecture are in place we can begin linking records from pre-K through the university with relative ease and speed.

Source: http://www.senatorsimitian.com/legislation/entry/sb_1298_education_information_system/

Page 23

High level cross-agency systems map of key collections (WORKING DRAFT, NOT EXHAUSTIVE)

[Diagram] Group labels: CDE Data Systems; Non CDE Data Systems.

Systems shown: CASAS TOPS Pro; SACS; Assessments; CALPADS**; CASEMIS; Migrant; ConApps; AYP/API; Early Childcare; CALTIDES***; CPEC; CALPASS; UC CSS; EDD; CSU ERS; CCC COMIS; CCTC CASE; CDPH; National Student Clearinghouse; prisons, census; franchise tax, benefits system, etc.; other CDE systems/units including CDS and charter schools; other CDE units including Homeless, CALSAFE, Title 3, Private Schools, etc.

Legend: data sharing through local agencies; direct data sharing*; planned/potential; in development; existing.

* Does not imply direct data linkages. Only state system linkages shown.

** CALPADS is envisioned to replace much of the CBEDS, Language Census, Student National Origin Report and select Consolidated Application data.

*** CALTIDES is envisioned to collect data primarily from CALPADS and the Commission on Teacher Credentialing's (CCTC) Credential Automation System Enterprise (CASE) system.

Source: Interviews with respective agencies, RAND, team analysis

Source: http://www.senatorsimitian.com/legislation/entry/sb_1298_education_information_system/

Page 24

High level system profiles of key CDE collections (1/2)

CALPADS
  Description: California Longitudinal Pupil Achievement Data System. System (under development) for tracking K-12 students longitudinally; it will replace CBEDS collections.
  Key identifier: SSID
  Data categories: Student demographic, program participation, grade level, enrollment, course enrollment and completion, discipline, and statewide assessment
  Granularity: Student
  Data sharing: Planned: Assessments, API/AYP, Migrant, ConApps

CALTIDES
  Description: California Longitudinal Teacher Integrated Data Education System. Integrated data system for teacher data based on unique SEID.
  Key identifier: SEID
  Data categories: Teacher credentials, authorizations, teacher participation program, alternative routes, participation in Beginning Teacher Support and intern program, SEID, salary
  Granularity: Student
  Data sharing: Planned: CALPADS, CCTC CASE

CASEMIS
  Description: California Special Education Management Information System. Integrated data system for special education covering students, services and provider programs.
  Key identifier: SSID
  Data categories: Attendance/Enrollment, Disciplinary, Education Agency, Mobility, Special Education, Staffing Data, Student Demographic, Other (services, age, gender, race/ethnicity)
  Granularity: Student, School district, School, County, Region
  Data sharing: None at state level

Assessments
  Description: California High School Exit Exam (CAHSEE), Standardized Testing and Reporting (STAR) and CELDT
  Key identifier: SSID
  Data categories: Attendance/Enrollment, Education Agency, Food and Nutrition, Parent Data, Special Education, Student Demographic
  Granularity: Student, School District, School, County
  Data sharing: CASAS, Migrant, AYP/API, CALPADS (planned)

API/AYP
  Description: Accountability-related information based on California's Public Schools Accountability Act of 1999 as well as the No Child Left Behind Act of 2001
  Key identifier: CDS code
  Data categories: AYP/API score by student characteristics
  Granularity: School
  Data sharing: Assessments

DRAFT

Source: Respective CDE departments

Source: http://www.senatorsimitian.com/legislation/entry/sb_1298_education_information_system/

Page 25

High level system profiles of key CDE collections (2/2)

Migrant
  Description: Student enrollments in migrant education programs. Includes migrant education forms and a directory of offices providing services.
  Key identifier: Migrant ID, COE number, CDS code
  Data categories: Student demographics, educational programs, counseling, health and support services, emergency health, clothing, food, transportation
  Granularity: Student
  Data sharing: Assessments, CALPADS (planned)

SACS
  Description: Standardized Account Code Structure. Provides LEAs with a means of reporting financial information.
  Key identifier: CDS code
  Data categories: For every general ledger accounting transaction, information on funds, resources, project year, goal, function, and object. Includes information on Attendance/Enrollment, Education Agency, Fiscal, Transportation.
  Granularity: School, District
  Data sharing: CDS, Charter schools

ConApps
  Description: Consolidated applications. Includes information on categorical programs, e.g., Title I, II, V, etc.
  Key identifier: CDS code
  Data categories: Student demographic, Title I, III, V, Part A, Immigrant, LEP, funding model, charter status, gradespan, participants
  Granularity: School, District, County
  Data sharing: CALPADS (planned)

Early Childcare Systems
  Description: CD-801A,B, CDMIS, Special Education Desired Result System (SEDRS), and CD 9600
  Key identifier: SSID
  Data categories: Child demographics, IEP flag, family identification/case number, household name, type of program, Desired Result Development Profile (DRDP), Early Childhood Environment Rating Scale (ECERS)
  Granularity: Student
  Data sharing: None

CASAS TOPSPro
  Description: Comprehensive Adult Student Assessment Systems. System for tracking students in adult education programs.
  Key identifier: ADA ID, SSID, CASAS no.
  Data categories: Student demographics, agency, instruction level and program, assessment scores, date of entry, reason for exit, class number, attainable goal within program year
  Granularity: Student
  Data sharing: Assessments

DRAFT

Source: Respective CDE departments

Source: http://www.senatorsimitian.com/legislation/entry/sb_1298_education_information_system/

Page 26

High level system profiles of non-CDE collections

CPEC
  Description: California Post Secondary Education Commission. Data system for higher education (postsecondary systems).
  Key identifier: Student ID based on SSN
  Data categories: Demographic, IEP, grade level, program, graduation rate, teacher, institution
  Granularity: Student
  Data sharing: CDE, CSU, UC, CCC, prison, census

UC CSS
  Description: Corporate Student System. Provides information on student enrollment and performance for University of California campuses.
  Key identifier: SSN
  Data categories: Student demographic, income, financial aid, education history, assessment
  Granularity: Student
  Data sharing: CDE, CCC, CALPASS

CCC COMIS
  Description: California Community Colleges Management Information System. COMIS data is used to prepare Federal and State reports, including the Integrated Postsecondary Education Data System (IPEDS), and to track student outcomes.
  Key identifier: SSN, Student ID
  Data categories: Student demographic, income, financial aid, education history, assessment, teacher, institution
  Granularity: Student
  Data sharing: CALPASS, CPEC, CSU, EDD, National Student Clearinghouse

CSU ERS
  Description: Enrollment Recording System. Used by Cal State to track student retention and graduation to support regular term reports, IPEDS, and state budget requests.
  Key identifier: SSN
  Data categories: Student demographic, financial aid, education history, assessment
  Granularity: Student
  Data sharing: CPEC, CALPASS, CCC

CDPH
  Description: California Department of Public Health. System used to track
  Key identifier: CDPH ID
  Data categories: Case ID and demographics, clinical and diagnostic data
  Granularity: Case
  Data sharing: None

EDD
  Description: Employment Development Database
  Key identifier: ID based on SSN
  Data categories: Wages, payroll taxes, unemployment tracking, job matching, job training
  Granularity: Employee
  Data sharing: Franchise tax, benefits system, CCC

DRAFT

Source: Respective agencies, RAND

Source: http://www.senatorsimitian.com/legislation/entry/sb_1298_education_information_system/

Page 27

HAPPY DATA!

Thank you! Have fun ….

Page 28