data warehouse 101-fundamentals-

29
1 Mojo Nwokoma Director, Enterprise Data Systems Architecture Office of Assessment & Information Services Oregon Department of Education 503-378-3600 x2242 [email protected] Enterprise Data Warehouse Enterprise Data Warehouse Fundamentals 101 Fundamentals 101 KIDS Phase II Project KIDS Phase II Project

Upload: ashishguleria

Post on 27-Jan-2015

118 views

Category:

Technology


1 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Data warehouse 101-fundamentals-

1

Mojo NwokomaDirector, Enterprise Data Systems Architecture

Office of Assessment & Information Services

Oregon Department of Education

503-378-3600 [email protected]

Enterprise Data Warehouse Enterprise Data Warehouse Fundamentals 101Fundamentals 101

KIDS Phase II ProjectKIDS Phase II Project

Page 2: Data warehouse 101-fundamentals-

2

What is Enterprise DW/BI Solutions

• Data Warehouse (DW) is a collection of integrated, subject oriented, time-variant, and non-volatile data from various sources into a single and consistent warehouse that supports reporting, analysis, and decision making within the enterprise. – Integrates operational data through consistent naming conventions,

measurements, physical attributes, and semantics.

• Business Intelligence (BI) solutions use a blending of technologies, including relational and multi-dimansional databases, client/server architecture, and graphical user interface, to integrate disparate data sources into a single coherent framework for real-time reporting, drill-through analysis and decision support.

Page 3: Data warehouse 101-fundamentals-

3

KIDS Phase I Project Report:The Business Case for Change

1. NCLB & Federal Accountability:• The reporting and performance requirements of NCLB demands a fundamental

change in the statewide data collection and reporting.• The need to report individual student achievement and aggregation of data by

subgroups.

2. Statewide Accountability & Efficiency:• Major gaps and inefficiencies exist in information collection, reporting and analysis

at both district and state levels that need to be addressed.• Funding formula compliance and equity, and ability to evaluate relative

effectiveness/ineffectiveness of education programs.

3. Student & Community Service:• Growing stakeholder demand for a significant improvement in student records

availability, accessibility, portability, and accuracy.

4. Cost Efficiency Gains:• Evidence of economies of scale resulting in cost reductions in the management of

administrative systems at the larger school districts and ESD structure.

Source: KIDS Phase 1 Final report: ODE & IBM Confidential (10/20/05)

Page 4: Data warehouse 101-fundamentals-

KIDS Phase II Project Planning:

5 key questions for a successful project planning & implementation

What?

When?Who?

Why? How?

SUCCESS

- Deliverables/Scope

- Stakeh

old

ers

- Resources- Business Case

-Tim

e F

ram

e

Page 5: Data warehouse 101-fundamentals-

5

The Essential Building Blocks for a Successful Enterprise Information Management Project

Page 6: Data warehouse 101-fundamentals-

6

Enterprise Data Warehouse ArchitectureEnterprise Data Warehouse Architecture

District DataDistrict DataWarehouseWarehouse

STAGINGSTAGING

--IntegrationIntegration

- Metadata- Metadata- Cleansed - Cleansed

-ProfiledProfiled- Biz RulesBiz Rules

Data MartData MartSISSIS

Data MartData MartFINANCEFINANCE

Data MartData MartTransportationTransportation

Data MartData MartInstructionInstruction

HRHR

SISSIS

Curriculum Curriculum & &

InstructionInstruction

FinanceFinance

NutritionNutrition

ExtractionExtraction PhasePhase

TransformationTransformation PhasePhase

LoadLoadPhasePhase

Data ManagementData ManagementH/W ServerH/W Server PlatformPlatform

Ap

pli

cati

on

sA

pp

lica

tio

ns

Page 7: Data warehouse 101-fundamentals-

KIDS Phase II Project

District/ESD Server Deployment & Data Warehouse Integration Architecture

TransactionSystem

TransactionSystem

TransactionSystem

TransactionSystem

TransactionSystem

ODE (State) DWPhysical

&Virtual

ODS = Operational Data Store

DW = Data Warehouse

ODS

ODS

ODS

ODS

ODS

Hillsboro District DWBeaverton District DW

Portland District DW

Eugene District DW

ESDs DW

LEGENDS: KIDS Work = ODEDistricts Record Exchange

Page 8: Data warehouse 101-fundamentals-

8

Project Planning Methodology – “The How?” For the Project Team

Step 1: Define the Work Breakdown Structure  The first is to create a comprehensive Work Breakdown Structure (WBS). The WBS lists all the phases, activities and tasks required to undertake the project. Identify and describe each phase, activity and task required to complete the project successfully. Depict the order in which the tasks must be undertaken and identify any key internal and external project

dependencies. Also list the critical project milestones, such as the completion of key project deliverables

Step 2: Identify the Required Resources  Having listed all of the tasks required to undertake the project, you now need to identify the generic resources required to complete each task. Examples of types of resource include: full-time and part-time staff, contractors, equipment and materials. For each resource type, identify the quantity required, the delivery dates and the project tasks in the WBS that the

resource will be used to help complete.

Step 3: Construct a Project Schedule To construct your schedule, you need to: List the phases, activities and tasks Sequence the phases, activities and tasks Add key internal and external dependencies Allocate relevant completion timeframes Add additional contingency to mitigate risk Assign resources required to complete tasks List critical delivery milestones Specify any assumptions and constraints

Page 9: Data warehouse 101-fundamentals-

Current Data Environment Lack of granular, integrated, accurate, standardized, and timely data regarding student

performance and achievement both for individual students and specific student subgroups.

Lack of data and information integration between instructional management sources and assessment sources aligned to the state’s specific educational standards. Other factors affected include attendance record, discipline, teacher qualification, classroom size, and instructional hours.

Lack of close collaboration between the districts and ODE in tracking students as they move through the educational system, both vertically and horizontally, in order to improve performance by identifying actionable indicators.

Lack of specific multi-year student performance information to support longitudinal analysis, accessible at the State, District, and school levels with appropriate controls to assure confidentiality.

Need for data and tool standardization between all reporting districts to ensure accurate, consistent, and useful analytical input for decision making purposes.

No single version of the truth exists for business rules and data definitions among various data sources.

Lack of easily validated financial information that accurately reports budgeted vs actual expenditures by program that allows correlation of these expenditures to student performances.

Lack of appropriately controlled online access to information for all stakeholders regarding student progress and school quality.

Page 10: Data warehouse 101-fundamentals-

10

Problems with Current Decision Support

• Data redundancy and process redundancy

• Data is not integrated and cannot be shared

• Data is not understood or misunderstood

• Inconsistent data definitions & business rules

• Data retrieval is difficult and time consuming

• Operational files may not contain history

• Reports are inconsistent in content & format

• Data is too dirty for business analysis

• Multiple versions of the truth

• Reports and associated BI tools are not standardized.

Page 11: Data warehouse 101-fundamentals-

11

Recommended Model for Enterprise Data Warehouse SystemKIDS Phase 11 Project

E-Portal

Data Warehouse

Operational Data StoreODS

Transactional Database

Educational StakeholderCommunication

Benchmarking/DecisionSupport

District & StateReporting

Day to DayOperations

PK-12 Data Model for Information Management (ODE & IBM Confidential) 10/20/05

Page 12: Data warehouse 101-fundamentals-

12

Standards

• Governance (setting priorities)• Data naming, aliases, abbreviations list• Meta data capture and maintenance• Data quality and data management• Testing standards• Security standards• Measuring results (benefits, costs, usage)• Service level agreements

Rules and protocols to be followed by all users and developers for all applications

Page 13: Data warehouse 101-fundamentals-

13

DW roles & responsibilities

• Business User (Client)

• Business User Support

• Data Administrator

• Data Analysts

• Meta Data Administrator

• Database Administrator

• Developers– ETL

– BI – Reports, Queries

Page 14: Data warehouse 101-fundamentals-

14

DW roles & responsibilities (continued)

• Security Officer

• Auditor

• Data Warehouse Project Manager

• Technical Services

• DW Architect

• Technical Advisory Board

• Steering Committee

Page 15: Data warehouse 101-fundamentals-

15

Information quality

• Data is accurate

• Data is consistent

• Data is timely

• Data is integrated

• Data is complete

• Data values follow the business rules

• Data corresponds to valid values

• Data is well understood

Page 16: Data warehouse 101-fundamentals-

Data Warehouse/ODS & BI Layers

Data Integration Layer & Operational Data Store (ODS): • ODS is a non queryable centralized staging areas for storing extracted, cleansing, and transformed data, and for gathering centralized metadata for implementing an Enterprise Data Mart Architecture (EDMA), eliminating the need for another non queryable staging area called data warehouse.

• Needed is a dimensionally modeled Data Warehouse for enterprise DSS, prepared to provide the best in query response performance and to support the most advanced OLAP functionalities.

Page 17: Data warehouse 101-fundamentals-

17

Meta data components

» Data name (entity, attribute, table, column, field, etc.)

» Business description of data (Project Start Date)

» Source of data (file, field)

» Business Owner of data

» Business rules

» Transformation rules

» Domains (allowable values)

» Data relationships

Page 18: Data warehouse 101-fundamentals-

18

Meta data components (continued)

» Data quality (measure of reliability)» Timeliness (ex: current as of certain date) » Historical information» Aggregation rules» Security (who has access)

Page 19: Data warehouse 101-fundamentals-

19

Meta data management

• Meta data administration

• Business meta data

• Technical meta data

• ETL reconciliation

• Data quality metrics

• Standardization

• Data ownership

• Enterprise integration

Meta Data = Meta Data = [Descriptive][Descriptive] Data About DataData About Data [of the Business][of the Business]

Information =Data within context

Context =Meta data

Information =Data + Meta data

Page 20: Data warehouse 101-fundamentals-

KIDS Phase II Project 05-07 Workflow

1. Review and validate business case for Phase II project• Identify key stakeholders.• Validate requirements, Deliverables, and Expectations• Identify two or three key districts with viable data warehouse infrastructure as test sites.

2. Review enterprise architecture, and identify Infrastructure components• Transaction Level Applications• Operating & Database systems• Hardware Platforms

3. Design Data Warehouse & Operational Data Store (ODS) Data Model• Design Dimensional Modeling Schemas• Configure Extraction, Transformation, Load (ETL)• Metadata Capture (Repository or Data labels)• Data Quality Profiling• Vendor selection will be based on a competitive “Bake-off” results from three top vendors.

4. Data Reporting & Information Delivery Mechanism• Leverage existing reporting On-line Analytical Processing (OLAP) infrastructure• Design and implement subject-area data marts for effective horizontal reporting integration• Develop OLAP cubes for report aggregation and slice/dice querying investigations• Deploy enterprise portal with built-in security and user access authentication.

5. End-user Training• Design and schedule end-user training at all levels of data and reporting needs.• Identify “Train-the-Trainer” candidates from each school district for more detailed training.

Page 21: Data warehouse 101-fundamentals-

KIDS Phase II ProjectKey System Deliverables

1. Transactional Systems/District Data Warehouse• Integrating systems that support instructional management that empowers teachers to combine

student performance and instructional data to make informed classroom decisions.• Districts will be better able to meet NCLB and state standards for having “highly qualified”

teachersin the classroom.2. Integrated & Interoperable Operational Data Store (ODS)

• Provide ability to evaluate student performance within selected programs across various schoolsand districts in the State, and highlight ways to achieve AYP consistently over time.

• Allows for quick turnaround in transferring students records when they move between districts.• Ease reporting burden on districts, and eliminate redundant and possibly inaccurate reporting of

data, and a better foundation for integrated data analysis.

3. Data Warehouse & Decision Support System /Tools• A repository for State and district reporting and analysis even at student-level data.• Allows for more meaningful system-level questions and answers by legislators or policy makers.• Greater system accessibility to all users with relevant security access privileges will mean greater

acceptance and use, and ultimately better decision.

4. System Wide Communication Portal• Provide a focused location for access to information, analytical/reporting tools, and the necessary

training and support. Also, maximization of effective system-wide use of state’s data warehouse.• Rapid and wide dissemination of integrated and proven instructional and administrative practices.

Page 22: Data warehouse 101-fundamentals-

KIDS Phase II Project High-level Project Work Plan, Time-line, & Resource requirements

Project Phase: Time-line Resource Requirements

1. Requirements Validation January 10, 2006 Mojo & Gary Scheduled trips to all

Districts & ESDs

3. “Test Site” DW/ODS Modeling, June 30, 2006 Database Administrator integration, ETL, Data Quality, Data Modeler/Analysts Meta Data Repository, and Data Quality Analysts Vendor “Bake-off” contracting Business User (Client) Meta Data Administrator ETL & BI developers Data Warehouse Project Manager

2. Inauguration of Governance & January 30, 2006 Doug Kosty & Mojo Nwokoma Project Team committee members

4. OLAP & Portal Development, October 30, 2006 End-user Business Analyst including Training, and Web Developer vendor “Bake-off” Contracting Portal Dashboard developer Business User (Client) BI OLAP Report Developer “ Train-the-trainer” Data Security Officer

Page 23: Data warehouse 101-fundamentals-

23

BI - OLAP Data Warehouse Architecture

Page 24: Data warehouse 101-fundamentals-

24

User expectations

Expectations must be managed in terms of:

• Schedule

• Budget

• Scope

• Performance

• Availability

• Simplicity (ease of use)

• Tool functionality

• Data cleanliness

• Users’ roles and responsibilities

Page 25: Data warehouse 101-fundamentals-

25

• Be a full-time member of the Core Team • Participate in all data modeling sessions• Co-manage the DW project• Make decisions and escalate disputes to the

Steering Committee • Provide meta data for business objects• Identify data security requirements• Participate in BI tool selection• Participate in all review sessions• Participate in all testing activities

User responsibilities

Page 26: Data warehouse 101-fundamentals-

26

IT Staffing

• DW roles & responsibilities• Dedicated IT team• New skill set – beyond tools, discipline• Contractors & consultants

• Knowledge transfer

• Training– Just in time– Just enough

Knowledge transfer through collaboration

Page 27: Data warehouse 101-fundamentals-

27

Low management commitment

Low user commitment

Unrealistic schedule

Unrealistic user expectations

Budget too small

Untrained or unavailable staff

Risks to be mitigated

Page 28: Data warehouse 101-fundamentals-

28

Risks to be mitigated

Unclear or changing requirements

Poor project management Creeping scope Initial project too large Wrong project

Changing priorities

Data cleansing not addressed early

Vendors out of control

Page 29: Data warehouse 101-fundamentals-

29

Not architected properly (wrong design)

Inappropriate organization structure

Lost or changed sponsor

New technology not understood

No procedure to resolve disputes

Exceeding platform capabilities

Risks to be mitigated