dwh-ahsan abdullah 1 data warehousing lab lect-2 lab data set virtual university of pakistan ahsan...
TRANSCRIPT
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
11
Data Warehousing Data Warehousing
Lab Lect-2Lab Lect-2Lab Data SetLab Data Set
Virtual University of PakistanVirtual University of Pakistan
Ahsan AbdullahAssoc. Prof. & Head
Center for Agro-Informatics Researchwww.nu.edu.pk/cairindex.asp
FAST National University of Computers & Emerging Sciences, IslamabadFAST National University of Computers & Emerging Sciences, Islamabad
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
22
Multi-Campus UniversityMulti-Campus University
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
33
Degree ProgramsDegree Programs
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
44
Disciplines for BSDisciplines for BS
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
55
Disciplines for MSDisciplines for MS
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
66
The need The need
Head Office wants a central data Head Office wants a central data repository for decision support i.e. a DWHrepository for decision support i.e. a DWH
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
77
Students Record Keeping & Mgmt. Students Record Keeping & Mgmt.
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
88
Data from Lahore CampusData from Lahore Campus
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
99
Data from Lahore Campus: SampleData from Lahore Campus: Sample
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1010
Lahore: Header of Student TableLahore: Header of Student Table
SIDSID St_NameSt_Name Father_NameFather_Name
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1111
Lahore: Header of Student TableLahore: Header of Student Table
Gender Gender AddressAddress [Date of Birth][Date of Birth] [Reg Date][Reg Date]
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1212
Lahore: Header of Student TableLahore: Header of Student Table
[Reg Status][Reg Status] [Degree Status][Degree Status] [Last Degree][Last Degree]
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1313
Lahore: Header of Course Reg. TableLahore: Header of Course Reg. Table
SID SID Degree Degree Semester Semester CourseCourse MarksMarks DisciplineDiscipline
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1414
Lahore: Facts About DataLahore: Facts About Data
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1515
Data from Karachi CampusData from Karachi Campus
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1616
Data from Karachi Campus: SampleData from Karachi Campus: Sample
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1717
Karachi: Header of Student TableKarachi: Header of Student Table
St_IDSt_ID NameName FatherFather DoBDoB M/FM/F DoRegDoReg RStatusRStatus DStatusDStatus AddressAddress QualificationQualification
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1818
Karachi: Header of Course Reg. TableKarachi: Header of Course Reg. Table
SID: SID: CoursesCourses ScoreScore SemSem DispDisp
Degree (BS/MS) is missing because Degree (BS/MS) is missing because separate books are maintained, but the separate books are maintained, but the issue is critical while loading dataissue is critical while loading data
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
1919
Karachi: Facts About DataKarachi: Facts About Data
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2020
Data from Islamabad CampusData from Islamabad Campus
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2121
Data from Islamabad Campus: SampleData from Islamabad Campus: Sample
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2222
Islamabad: Header of Student TableIslamabad: Header of Student Table
Roll NumRoll Num NameName FatherFather Reg DateReg Date Reg StatusReg Status Degree StatusDegree Status Date of BirthDate of Birth EducationEducation GenderGender AddressAddress
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2323
Islamabad: Header of Course Reg. TableIslamabad: Header of Course Reg. Table
Roll Num:Roll Num: CourseCourse MarksMarks DisciplineDiscipline SessionSession
Degree (BS/MS) is missing, whereas Degree (BS/MS) is missing, whereas same table contains records for both. same table contains records for both. Only way to differentiate is through Only way to differentiate is through discipline attribute.discipline attribute.
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2424
Islamabad: Facts About DataIslamabad: Facts About Data
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2525
ExerciseExercise
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2626
Problems with Adhoc ApproachProblems with Adhoc Approach
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2727
LAHORE
KARACHI
ISLAMABAD
PESHAWAR
Text Files
Excel Book
MS-ACCESS
Text Files
Uses
Uses
Uses
Uses
Problem-1: Non-Standard data sourcesProblem-1: Non-Standard data sources
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2828
Problem-2: Non-standard attributes Problem-2: Non-standard attributes
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
2929
Problem-3: Non Normalized database Problem-3: Non Normalized database
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3030
Notepad: IssuesNotepad: Issues
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3131
MS-Excel: IssuesMS-Excel: Issues
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3232
MS-Access: IssuesMS-Access: Issues
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3333
Problem StatementProblem Statement
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3434
Data from Peshawar Campus Data from Peshawar Campus
Data at Peshawar campus is stored in Data at Peshawar campus is stored in Text filesText files
To store data regarding one complete To store data regarding one complete batch 2 text files are usedbatch 2 text files are used Lhr_Student_batch (Student record)Lhr_Student_batch (Student record) Lhr_Detail_batch (Course Reg. record)Lhr_Detail_batch (Course Reg. record)
22 text files for 11 BS batches22 text files for 11 BS batches 8 text files for 4 MS batches8 text files for 4 MS batches
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3535
Data from Peshawar Campus: SampleData from Peshawar Campus: Sample
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3636
Peshawar: Header of Student TablePeshawar: Header of Student Table
Reg#: Reg#: Student identityStudent identity Name: Name: Student nameStudent name Father: Father: Father nameFather name Address: Address: Permanent addressPermanent address Date of Birth: Date of Birth: Date of BirthDate of Birth lastDeg: lastDeg: Last degree achieved Last degree achieved Reg Date:Reg Date: Date of Enrollment Date of Enrollment Reg Status: Reg Status: Status of Enrollment (A/T)Status of Enrollment (A/T) Degree Status: Degree Status: Status of Degree (C/I)Status of Degree (C/I)
DWH-Ahsan AbdullahDWH-Ahsan Abdullah
3737
Peshawar: Header of Course Reg. TablePeshawar: Header of Course Reg. Table
Reg#: Reg#: Courses: Courses: Course codeCourse code Score: Score: Out of 100Out of 100 Program: Program: CS/TC/SE/CECS/TC/SE/CE Sem: Sem: Fall/SpringFall/Spring Year: Year: YYYY e.g. 1999YYYY e.g. 1999
We need to identify semester session We need to identify semester session (fall04) through combination of Sem and (fall04) through combination of Sem and YearYear