Using the Concept of Info-Cubes to Facilitate Data Analysis in Demographic Surveillance Systems Yazoumé Yé, Uwe Wahser Centre de Recherche en Santé de Nouna, Burkina Faso

Page 1

Using the Concept of Info-Cubes to Facilitate Data Analysis in Demographic Surveillance Systems

Yazoumé Yé, Uwe Wahser

Centre de Recherche en Santé de Nouna, Burkina Faso

Page 2

INDEPTH General Meeting 2000

Content

• Introduction and Background (Slides 3 - 6)
• The Relational Model in DSS (Slides 7 - 12)
• Introducing Info Cubes (Slides 13 - 19)
• Conclusion (Slide 20)

Page 3

Introduction

• Background: Data Analysis in Demographic Surveillance Systems (DSS)
• Dilemma: Analysis of original data is difficult; preprocessed output by technical staff is not flexible enough
• Proposition: provide Info-Cubes as easy-to-access data sources

Page 4

Data Processing in DSS

Diagram labels:
• Processing steps: Data Collection, Data Entry, Data Management, Analysis
• Online Transaction Processing (OLTP)
• Analysis of Fixed-Format Output
• Online Analytical Processing (OLAP)

Page 5

OLTP vs. OLAP

OLAP
• provides big picture
• supports analysis
• needs aggregate data
• evaluate all datasets quickly
• multidimensional model
Q: “HOW MANY live in Atown?”

OLTP
• provides detailed audit
• supports operations
• needs detailed data
• find one dataset quickly
• relational model
Q: “WHO lives in Atown?”
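The two questions can be contrasted on a toy data set. The following is only a minimal sketch: the `residents` list, its field names and its three records are hypothetical and are not taken from any DSS database.

```python
# Hypothetical resident records; identifiers and fields are illustrative only.
residents = [
    {"ind_id": "ABC", "town": "Atown"},
    {"ind_id": "BCD", "town": "Btown"},
    {"ind_id": "CDE", "town": "Atown"},
]


def who_lives_in(town):
    """OLTP-style question: return the individual records' IDs for one town."""
    return [r["ind_id"] for r in residents if r["town"] == town]


def how_many_live_in(town):
    """OLAP-style question: return only an aggregate count for one town."""
    return sum(1 for r in residents if r["town"] == town)


print(who_lives_in("Atown"))      # detailed answer: ['ABC', 'CDE']
print(how_many_live_in("Atown"))  # aggregate answer: 2
```

The first function needs access to the detailed records, while the second could equally be answered from pre-aggregated counts.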

Page 6

Peeping into DWH Architecture

Diagram labels:
• Operational Database(s): OLTP on Relational DB
• Data Warehouse (DWH)
• Data Marts
• Info Cubes: OLAP on Multidimensional DB

Page 7

Relational Model in DSS

Diagram of the tables in the relational model: Location, Individual, Group, Residence, Relationship, Membership, Outmigration, Inmigration, Death, Birth, Observation, Status Observation, Preg. Outcome

Page 8

Advantages of the RM

• Optimized for creation, reading, updating and deletion of data sets

• Ensures the retrieval of data
• Eliminates redundant data to
  – Ensure data consistency
  – Minimize data volume
• Insensitive to change

The RM supports Online Transaction Processing (OLTP)

Page 9

Expected Output for Analysis

        Muzungu      Tababu       Ferengi
        M     F      M     F      M     F
Atown   123   132    234   243    45    54
Betown  234   243    45    54     123   132
Cetown  45    54     123   132    234   243

Page 10

Querying the Database

Diagram: the tables of the relational model (Location, Individual, Group, Residence, Relationship, Membership, Outmigration, Inmigration, Death, Birth, Observation, Status Observation, Preg. Outcome) have to be queried to produce the expected output table:

        Muzungu      Tababu       Ferengi
        M     F      M     F      M     F
Atown   123   132    234   243    45    54
Betown  234   243    45    54     123   132
Cetown  45    54     123   132    234   243

Page 11

Relational Data Storage

Location
LocID  Village
123    Atown
234    Btown

Residence
IndID  LocID  ResNo
ABC    123    1
ABC    234    2
BCD    234    1

Individual
IndID  Ethn.Group
ABC    Muzungu
BCD    Tababu

Page 12

Problems of the RM

• Information has to be collected from several tables with complex queries
  – Design of queries is time consuming
  – Execution of queries is time consuming
• Complex model is difficult to understand
  – Design of queries needs skilled staff

The RM is not optimized for Online Analytical Processing (OLAP)
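To illustrate why even a simple count needs several tables, here is a minimal sketch using Python's built-in `sqlite3` module and the three example tables from the "Relational Data Storage" slide. Two points are assumptions made for this illustration and are not stated on the slides: the column `Ethn.Group` is renamed `EthnGroup` (a dot is not valid in an unquoted SQL identifier), and the residence episode with the highest `ResNo` is treated as an individual's current residence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location   (LocID INTEGER PRIMARY KEY, Village TEXT);
CREATE TABLE Individual (IndID TEXT PRIMARY KEY, EthnGroup TEXT);
CREATE TABLE Residence  (IndID TEXT, LocID INTEGER, ResNo INTEGER);
INSERT INTO Location   VALUES (123, 'Atown'), (234, 'Btown');
INSERT INTO Individual VALUES ('ABC', 'Muzungu'), ('BCD', 'Tababu');
INSERT INTO Residence  VALUES ('ABC', 123, 1), ('ABC', 234, 2), ('BCD', 234, 1);
""")

# Count current residents per village and ethnic group.
# Assumption: the highest ResNo per individual marks the current residence.
QUERY = """
SELECT l.Village, i.EthnGroup, COUNT(*) AS n
FROM Residence r
JOIN Individual i ON i.IndID = r.IndID
JOIN Location   l ON l.LocID = r.LocID
WHERE r.ResNo = (SELECT MAX(r2.ResNo)
                 FROM Residence r2
                 WHERE r2.IndID = r.IndID)
GROUP BY l.Village, i.EthnGroup
ORDER BY l.Village, i.EthnGroup
"""

rows = conn.execute(QUERY).fetchall()
for village, ethn_group, n in rows:
    print(village, ethn_group, n)
```

Even for this two-dimensional count the query needs two joins and a correlated subquery; adding sex, age or time as further dimensions would make it correspondingly longer.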

Page 13

From Fixed Format Tables ...

        Muzungu      Tababu       Ferengi
        M     F      M     F      M     F
Atown   123   132    234   243    45    54
Betown  234   243    45    54     123   132
Cetown  45    54     123   132    234   243

Analysis Variable: Number of People
Dimension 1: Ethnic Group
Dimension 2: Sex
Dimension 3: Town

Page 14

... to Info-Cubes

        Muzungu      Tababu       Ferengi
        M     F      M     F      M     F
Atown   123   132    234   243    45    54
Betown  234   243    45    54     123   132
Cetown  45    54     123   132    234   243

Figure: the same counts arranged as a three-dimensional cube, with one axis per dimension.

Dimension 1: Ethnic Group (Muzungu, Tababu, Ferengi)
Dimension 2: Sex (M, F)
Dimension 3: Town (Atown, Betown, Cetown)

Page 15

... to MDD Tables

        Muzungu      Tababu       Ferengi
        M     F      M     F      M     F
Atown   123   132    234   243    45    54
Betown  234   243    45    54     123   132
Cetown  45    54     123   132    234   243

The same data as an MDD table:

EthGrp   Sex  Town    V_Count
Muzungu  M    Cetown  45
Muzungu  F    Cetown  54
Tababu   M    Cetown  123
Tababu   F    Cetown  132
Ferengi  M    Cetown  234
Ferengi  F    Cetown  243
Muzungu  M    Betown  234
Muzungu  F    Betown  243
Tababu   M    Betown  45
...      ...  ...     ...

EthGrp: Dimension 1 (Ethnic Group)
Sex: Dimension 2 (Sex)
Town: Dimension 3 (Town)
V_Count: Analysis Variable
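The step from the fixed-format table to the MDD table can be sketched in a few lines. This is only an illustration in plain Python: the counts and the column names EthGrp, Sex, Town and V_Count come from the slides, while the function name `to_mdd_rows` and the data layout are assumptions.

```python
# Counts from the fixed-format table, one entry per town.
# Order within each list: Muzungu M, Muzungu F, Tababu M, Tababu F, Ferengi M, Ferengi F.
ETH_GROUPS = ["Muzungu", "Tababu", "Ferengi"]
SEXES = ["M", "F"]
FIXED_TABLE = {
    "Atown":  [123, 132, 234, 243, 45, 54],
    "Betown": [234, 243, 45, 54, 123, 132],
    "Cetown": [45, 54, 123, 132, 234, 243],
}


def to_mdd_rows(table):
    """Unfold the cross-tabulation into (EthGrp, Sex, Town, V_Count) rows."""
    columns = [(grp, sex) for grp in ETH_GROUPS for sex in SEXES]
    rows = []
    for town, counts in table.items():
        for (eth_grp, sex), v_count in zip(columns, counts):
            rows.append((eth_grp, sex, town, v_count))
    return rows


mdd_rows = to_mdd_rows(FIXED_TABLE)
for row in mdd_rows[:3]:
    print(row)
```

Each cell of the fixed-format table becomes one row, so the 3 x 2 x 3 table yields 18 rows.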

Page 16

Some Terms

• Granularity: degree of detail of the aggregated data

• Drill Down: zoom into detail during OLAP

• Roll Up: zoom out

• Slicing: restricting analysis to one category

• Dicing: restricting analysis to a selection of categories
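Roll up and drill down can be illustrated on the MDD rows of the example cube. The sketch below is an assumption-laden illustration, not a description of any particular OLAP tool: the helper `aggregate` is hypothetical, and the counts are those shown on the slides.

```python
from collections import defaultdict

# MDD rows (EthGrp, Sex, Town, V_Count) with the counts from the slides.
ROWS = [
    ("Muzungu", "M", "Atown", 123), ("Muzungu", "F", "Atown", 132),
    ("Tababu", "M", "Atown", 234), ("Tababu", "F", "Atown", 243),
    ("Ferengi", "M", "Atown", 45), ("Ferengi", "F", "Atown", 54),
    ("Muzungu", "M", "Betown", 234), ("Muzungu", "F", "Betown", 243),
    ("Tababu", "M", "Betown", 45), ("Tababu", "F", "Betown", 54),
    ("Ferengi", "M", "Betown", 123), ("Ferengi", "F", "Betown", 132),
    ("Muzungu", "M", "Cetown", 45), ("Muzungu", "F", "Cetown", 54),
    ("Tababu", "M", "Cetown", 123), ("Tababu", "F", "Cetown", 132),
    ("Ferengi", "M", "Cetown", 234), ("Ferengi", "F", "Cetown", 243),
]
DIMENSIONS = ("EthGrp", "Sex", "Town")


def aggregate(rows, keep):
    """Sum V_Count over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)


# Roll up: one total per town, ethnic group and sex are summed away.
by_town = aggregate(ROWS, ["Town"])
# Drill down: split each town total by sex again.
by_town_and_sex = aggregate(ROWS, ["Town", "Sex"])

print(by_town)
print(by_town_and_sex)
```

Drilling down is only possible as far as the granularity of the stored cube allows; details that were aggregated away when the cube was built cannot be recovered.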

Page 17

Example of Slicing

Figure: the full cube (Dimension 1: Ethnic Group, Dimension 2: Sex, Dimension 3: Town) next to the slice that contains only Tababu, which keeps the dimensions Sex (M, F) and Town (Atown, Betown, Cetown).
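Slicing to "only Tababu" can be written as a simple filter on the cube cells. The following is a minimal sketch with hypothetical helper names (`dice`, `slice_cube`); the counts are those of the example table on the earlier slides.

```python
# Cube cells keyed by (EthGrp, Sex, Town); counts are those of the example table.
ETH_GROUPS = ("Muzungu", "Tababu", "Ferengi")
SEXES = ("M", "F")
TABLE = {
    "Atown":  (123, 132, 234, 243, 45, 54),
    "Betown": (234, 243, 45, 54, 123, 132),
    "Cetown": (45, 54, 123, 132, 234, 243),
}
CUBE = {
    (grp, sex, town): counts[2 * g + s]
    for town, counts in TABLE.items()
    for g, grp in enumerate(ETH_GROUPS)
    for s, sex in enumerate(SEXES)
}


def dice(cube, eth_groups=None, sexes=None, towns=None):
    """Dicing: keep cells whose categories are in the selections (None = all)."""
    return {
        (g, s, t): n
        for (g, s, t), n in cube.items()
        if (eth_groups is None or g in eth_groups)
        and (sexes is None or s in sexes)
        and (towns is None or t in towns)
    }


def slice_cube(cube, eth_group):
    """Slicing: restrict the cube to a single ethnic group."""
    return dice(cube, eth_groups={eth_group})


only_tababu = slice_cube(CUBE, "Tababu")
for cell, count in only_tababu.items():
    print(cell, count)
```

A slice is treated here as the special case of dicing in which exactly one category of one dimension is selected.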

Page 18

Some Remarks

• Adding dimensions increases granularity

• Adding categories increases granularity

• High granularity produces big cubes

• Number of data sets in a cube = number of existing combinations of categories across all dimensions

Challenge: define useful info-cubes which are not too big (performance) but contain sufficient dimensions (flexibility)
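The growth in cube size is easy to estimate. Since a cube holds one data set per existing combination of categories, the product of the category counts is an upper bound on its size. The sketch below assumes Python 3.8 or later (for `math.prod`); the AgeGroup dimension and the figure of 10 towns are hypothetical.

```python
from math import prod


def max_cells(categories_per_dimension):
    """Upper bound on cube size: the product of the category counts."""
    return prod(categories_per_dimension.values())


example = {"EthGrp": 3, "Sex": 2, "Town": 3}
print(max_cells(example))                     # 3 * 2 * 3 = 18

# Adding a hypothetical dimension with 5 age groups multiplies the bound by 5.
print(max_cells({**example, "AgeGroup": 5}))  # 90

# Adding categories has the same kind of effect: 10 towns instead of 3.
print(max_cells({**example, "Town": 10}))     # 60
```

The actual number of data sets can be lower, because combinations that do not occur in the population need not be stored.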

Page 19

Advantages of Info-Cubes

Queries on Info-Cubes
• are simple
• are fast, when the granularity is not too high
• are more flexible than fixed output tables

Info-Cubes
• normally don’t contain confidential data
• can be sized according to the needs of the targeted researcher

Info-Cubes are suitable for OLAP and for dissemination of DSS Data

Page 20

Conclusion: Possible Benefits

• Facilitate dissemination of DSS data

• Give more responsibilities to the researcher

• Reduce workload of technical staff

• Ensure consistent analysis results

• With suitable browser-tools: enable online cubes on the WWW

Page 21

Links and Literature

Commercial Demo Cubes on the WWW:
• http://www.sas.com/rnd/web/demos/mddbapp/webeis.html
• http://www5.ibi.com/ibi_html/insure2/

Good Overview on DWH:
• Inmon WH, Welch JD, Glassey KL. Managing the Data Warehouse. New York: John Wiley & Sons, 1997

Some Links on DWH, MDD:
• www.informix.com/informix/solutions/dw/redbrick/wpapers/
• www2.andrews.edu/~dheise/dw/Avondale/ACDWTOC.html
• members.aol.com/fmcguff/dwmodel/index.htm
• www.forwiss.tu-muenchen.de/~system42/public/Line42/Literatur/OLAP-Modeling.html
• www.archer-decision.com/star101.zip