Using the Concept of Info-Cubes to Facilitate Data
Analysis in Demographic Surveillance Systems
Yazoumé Yé, Uwe Wahser
Centre de Recherche en Santé de Nouna, Burkina Faso
INDEPTH General Meeting 2000
Content
• Introduction and Background
• The Relational Model in DSS
• Introducing Info Cubes
• Conclusion
Introduction
• Background: Data Analysis in Demographic Surveillance Systems (DSS)
• Dilemma: Analysis of original data is difficult; preprocessed output by technical staff is not flexible enough
• Proposition: provide Info-Cubes as easy-to-access data sources
Data Processing in DSS
[Diagram labels: Data Collection, Data Entry, Data Management, Analysis, Analysis of Fixed Format Output, Online Analytical Processing (OLAP), Online Transaction Processing (OLTP)]
OLTP vs. OLAP
OLTP
• provides detailed audit
• supports operations
• needs detailed data
• finds one data set quickly
• relational model
Q: “WHO lives in Atown?”
OLAP
• provides big picture
• supports analysis
• needs aggregate data
• evaluates all data sets quickly
• multidimensional model
Q: “HOW MANY live in Atown?”
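The two questions on this slide can be contrasted in a minimal sketch. It assumes a hypothetical single table `resident` with invented names in an in-memory SQLite database; neither the table nor the rows come from the original slides.

```python
import sqlite3

# Hypothetical, simplified table; names and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resident (name TEXT, town TEXT)")
conn.executemany(
    "INSERT INTO resident VALUES (?, ?)",
    [("Awa", "Atown"), ("Boukary", "Atown"), ("Fatou", "Betown")],
)

# OLTP-style question: WHO lives in Atown? (returns individual records)
who = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM resident WHERE town = ? ORDER BY name", ("Atown",)
    )
]

# OLAP-style question: HOW MANY live in Atown? (returns a single aggregate)
how_many = conn.execute(
    "SELECT COUNT(*) FROM resident WHERE town = ?", ("Atown",)
).fetchone()[0]

print(who)       # ['Awa', 'Boukary']
print(how_many)  # 2
```

The first query needs the detailed rows; the second could equally be answered from a pre-aggregated table that holds no names at all.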
Peeping into DWH Architecture
[Diagram labels: Operational Database(s), OLTP on Relational DB, Data Warehouse (DWH), Data Marts, Info Cubes, OLAP on Multidimensional DB]
Relational Model in DSS
[Diagram: tables of the relational model: Location, Individual, Group, Residence, Relationship, Membership, Outmigration, Inmigration, Death, Birth, Observation, Status Observation, Preg. Outcome]
Advantages of the RM
• Optimized for creation, reading, updating and deletion of data sets
• Ensures the retrieval of data
• Eliminates redundant data to
  – Ensure data consistency
  – Minimize data volume
• Insensitive to change
The RM supports OnLine Transaction Processing (OLTP)
Expected Output for Analysis
          Muzungu      Tababu       Ferengi
          M     F      M     F      M     F
Atown     123   132    234   243    45    54
Betown    234   243    45    54     123   132
Cetown    45    54     123   132    234   243
Querying the Database
[Diagram: the tables of the relational model (Location, Individual, Group, Residence, Relationship, Membership, Outmigration, Inmigration, Death, Birth, Observation, Status Observation, Preg. Outcome) shown next to the expected output table of ethnic group by sex by town]
Relational Data Storage
Location
LocID   Village
123     Atown
234     Betown

Residence
IndID   LocID   ResNo
ABC     123     1
ABC     234     2
BCD     234     1

Individual
IndID   Ethn.Group
ABC     Muzungu
BCD     Tababu
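To see why even a simple count touches several tables, the sample rows of this slide can be loaded into an in-memory SQLite database and counted by village and ethnic group. This is a sketch under assumptions: the column name `EthnGroup` stands in for "Ethn.Group", the second village is spelled Betown as on the other slides, and the residence with the highest `ResNo` is taken to be an individual's current one, which the slides do not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location   (LocID INTEGER PRIMARY KEY, Village TEXT);
CREATE TABLE Individual (IndID TEXT PRIMARY KEY, EthnGroup TEXT);
CREATE TABLE Residence  (IndID TEXT, LocID INTEGER, ResNo INTEGER);
INSERT INTO Location   VALUES (123, 'Atown'), (234, 'Betown');
INSERT INTO Individual VALUES ('ABC', 'Muzungu'), ('BCD', 'Tababu');
INSERT INTO Residence  VALUES ('ABC', 123, 1), ('ABC', 234, 2), ('BCD', 234, 1);
""")

# Assumption: the highest ResNo per individual marks the current residence.
query = """
SELECT l.Village, i.EthnGroup, COUNT(*) AS n
FROM Residence r
JOIN (SELECT IndID, MAX(ResNo) AS MaxResNo
      FROM Residence GROUP BY IndID) latest
  ON latest.IndID = r.IndID AND latest.MaxResNo = r.ResNo
JOIN Location   l ON l.LocID = r.LocID
JOIN Individual i ON i.IndID = r.IndID
GROUP BY l.Village, i.EthnGroup
ORDER BY l.Village, i.EthnGroup
"""
counts = conn.execute(query).fetchall()
print(counts)  # [('Betown', 'Muzungu', 1), ('Betown', 'Tababu', 1)]
```

Three joins and a subquery are needed for a two-dimensional count; adding sex or further dimensions makes the query longer still.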
Problems of the RM
• Information has to be collected from several tables with complex queries
  – Design of queries is time consuming
  – Execution of queries is time consuming
• Complex model is difficult to understand
  – Design of queries needs skilled staff
The RM is not optimized for OnLine Analytical Processing (OLAP)
From Fixed Format Tables ...
          Muzungu      Tababu       Ferengi
          M     F      M     F      M     F
Atown     123   132    234   243    45    54
Betown    234   243    45    54     123   132
Cetown    45    54     123   132    234   243
Analysis Variable: Number of People
Dimension 1: Ethnic Group
Dimension 2: Sex
Dimension 3: Town
... to Info-Cubes
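[Figure: the fixed-format table redrawn as a cube whose three axes are Ethnic Group (Muzungu, Tababu, Ferengi), Sex (M, F), and Town (Atown, Betown, Cetown); each cell holds the number of people for one combination of categories]
Dimension 1: Ethnic Group
Dimension 2: Sex
Dimension 3: Town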
... to MDD Tables
Columns: EthGrp (Dimension 1: Ethnic Group), Sex (Dimension 2: Sex), Town (Dimension 3: Town), V_Count (Analysis Variable)

EthGrp    Sex   Town     V_Count
Muzungu   M     Cetown   45
Muzungu   F     Cetown   54
Tababu    M     Cetown   123
Tababu    F     Cetown   132
Ferengi   M     Cetown   234
Ferengi   F     Cetown   243
Muzungu   M     Betown   234
Muzungu   F     Betown   243
Tababu    M     Betown   45
...       ...   ...      ...
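The step from the fixed-format table to the MDD table can be sketched in a few lines of Python. The column order (EthGrp, Sex, Town, V_Count) follows the slide; the function name `to_mdd_rows` and the nested-dictionary layout of the input are assumptions made for illustration.

```python
# Fixed-format table from the slides: town -> ethnic group -> (male, female)
fixed = {
    "Atown":  {"Muzungu": (123, 132), "Tababu": (234, 243), "Ferengi": (45, 54)},
    "Betown": {"Muzungu": (234, 243), "Tababu": (45, 54),   "Ferengi": (123, 132)},
    "Cetown": {"Muzungu": (45, 54),   "Tababu": (123, 132), "Ferengi": (234, 243)},
}


def to_mdd_rows(table):
    """Return one (EthGrp, Sex, Town, V_Count) row per cell of the table."""
    rows = []
    for town, groups in table.items():
        for eth_grp, (male, female) in groups.items():
            rows.append((eth_grp, "M", town, male))
            rows.append((eth_grp, "F", town, female))
    return rows


mdd = to_mdd_rows(fixed)
print(len(mdd))  # 18 rows: 3 ethnic groups x 2 sexes x 3 towns
print(mdd[0])    # ('Muzungu', 'M', 'Atown', 123)
```

Each row now carries its own dimension values, so the layout no longer fixes which dimension appears as rows and which as columns.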
Some Terms
• Granularity: degree of detail of the aggregated data
• Drill Down: zoom into detail during OLAP
• Roll Up: zoom out
• Slicing: restricting analysis to one category
• Dicing: restricting analysis to a selection of categories
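The terms above can be tried out on the example counts with plain Python. This is only a sketch: the helper names `roll_up`, `dice`, and `slice_` and the dictionary representation of the cube are assumptions, not part of any OLAP product.

```python
from collections import defaultdict

DIMS = ("eth_grp", "sex", "town")

# Example counts from the slides: town -> ethnic group -> (male, female)
TABLE = {
    "Atown":  {"Muzungu": (123, 132), "Tababu": (234, 243), "Ferengi": (45, 54)},
    "Betown": {"Muzungu": (234, 243), "Tababu": (45, 54),   "Ferengi": (123, 132)},
    "Cetown": {"Muzungu": (45, 54),   "Tababu": (123, 132), "Ferengi": (234, 243)},
}

# Cube cells keyed by (ethnic group, sex, town)
cube = {
    (grp, sex, town): n
    for town, groups in TABLE.items()
    for grp, pair in groups.items()
    for sex, n in zip("MF", pair)
}


def roll_up(cells, keep):
    """Sum the counts over every dimension that is not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, count in cells.items():
        out[tuple(key[i] for i in idx)] += count
    return dict(out)


def dice(cells, dim, categories):
    """Keep only the cells whose value in `dim` is one of `categories`."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] in categories}


def slice_(cells, dim, category):
    """Slicing is dicing with exactly one category."""
    return dice(cells, dim, {category})


by_town = roll_up(cube, ["town"])             # roll up to low granularity
by_town_sex = roll_up(cube, ["town", "sex"])  # drill down by one dimension
only_tababu = slice_(cube, "eth_grp", "Tababu")
two_towns = dice(cube, "town", {"Atown", "Betown"})

print(by_town[("Atown",)])          # 831
print(by_town_sex[("Atown", "M")])  # 402
print(len(only_tababu), len(two_towns))  # 6 12
```

Drill down and roll up are the same aggregation called with more or fewer dimensions in `keep`.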
Example of Slicing
[Figure: the cube with the axes Ethnic Group, Sex, and Town is sliced to "only Tababu", which leaves a table of Town by Sex]
Dimension 1: Ethnic Group
Dimension 2: Sex
Dimension 3: Town

Slice "only Tababu":
          M     F
Atown     234   243
Betown    45    54
Cetown    123   132
Some Remarks
• Adding dimensions increases granularity
• Adding categories increases granularity
• High granularity produces big cubes
• Number of data sets in a cube = number of existing combinations of categories across all dimensions
Challenge: define useful info-cubes which are not too big (performance) but contain sufficient dimensions (flexibility)
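The size remarks can be made concrete with a short calculation. The product of the category counts is only an upper bound, since the slide counts existing combinations; the age-group dimension and the fourth town below are hypothetical additions, and `max_cells` is an illustrative helper name.

```python
from math import prod


def max_cells(categories_per_dimension):
    """Upper bound on cube size: the product of the category counts."""
    return prod(categories_per_dimension.values())


# The example cube: 3 ethnic groups, 2 sexes, 3 towns
dims = {"ethnic_group": 3, "sex": 2, "town": 3}
print(max_cells(dims))  # 18

# Adding a dimension multiplies the bound (hypothetical: 5 age groups).
print(max_cells({**dims, "age_group": 5}))  # 90

# Adding a category to one dimension also raises it (hypothetical 4th town).
print(max_cells({**dims, "town": 4}))  # 24
```

Because the bound grows multiplicatively, a handful of extra dimensions is enough to move a cube from a few dozen rows to many thousands.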
Advantages of Info-Cubes
Queries on Info-Cubes
• are simple
• are fast, when the granularity is not too high
• are more flexible than fixed output tables
Info-Cubes
• normally don’t contain confidential data
• can be sized according to the needs of the targeted researcher
Info-Cubes are suitable for OLAP and for dissemination of DSS Data
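How a cube is derived from individual-level data, and why it normally carries no identifiers, can be sketched as a simple count. The records and the helper `build_cube` below are hypothetical; note that very small cell counts could still point to individuals, which is one reason the slide says "normally".

```python
from collections import Counter

# Hypothetical individual-level records: (IndID, ethnic group, sex, town)
individuals = [
    ("ABC", "Muzungu", "F", "Atown"),
    ("BCD", "Tababu", "M", "Betown"),
    ("CDE", "Tababu", "M", "Betown"),
    ("DEF", "Ferengi", "F", "Cetown"),
]


def build_cube(records):
    """Count people per (ethnic group, sex, town); the IndID is dropped."""
    return Counter(
        (eth_grp, sex, town) for _ind_id, eth_grp, sex, town in records
    )


info_cube = build_cube(individuals)
print(info_cube[("Tababu", "M", "Betown")])  # 2
print(sum(info_cube.values()))               # 4
```

Only the combinations that actually occur become cells, which matches the remark that cube size equals the number of existing combinations.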
Conclusion: Possible Benefits
• Facilitate dissemination of DSS data
• Give more responsibilities to the researcher
• Reduce workload of technical staff
• Ensure consistent analysis results
• With suitable browser-tools: enable online cubes on the WWW