Dallas Data Brewery Meetup #2: Data Quality Perception

Data Quality Perception data brewery Dallas Data Brewery, June 2013

Upload: stefan-urbanek

Post on 18-Dec-2014

Category: Technology


DESCRIPTION

Brief, introductory slides for the second Dallas Data Brewery meetup. Topic: Data Quality Perception.

TRANSCRIPT

Page 1: Dallas Data Brewery Meetup #2: Data Quality Perception

Data Quality Perception

data brewery

Dallas Data Brewery, June 2013

Page 2

Topic

■ What is "high quality data"?

■ What data quality expectations do you, or people or businesses you know, have?

■ Business issues and data quality: how do you deal with it?

■ What happens when you ignore it?

Page 3

What is data quality?

Page 4

Dimensions

■ completeness – data provided

■ accuracy – reflecting real world

■ credibility – regarded as true

■ timeliness – up-to-date

■ consistency – matching facts across datasets

■ integrity – valid references between datasets

... and there are more
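Two of the dimensions above, integrity and timeliness, can be turned into simple ratios. The following is a minimal sketch, not part of the slides: the `suppliers` and `contracts` datasets, the field names, and the 90-day freshness window are all made-up assumptions for illustration.

```python
from datetime import date, timedelta

# Made-up datasets, for illustration only.
suppliers = {"S1": "Acme", "S2": "Globex"}
contracts = [
    {"id": 1, "supplier_id": "S1", "updated": date(2013, 6, 1)},
    {"id": 2, "supplier_id": "S9", "updated": date(2012, 1, 15)},
    {"id": 3, "supplier_id": "S2", "updated": date(2013, 5, 20)},
]


def integrity(rows, key, reference):
    """Share of rows whose `key` refers to an existing entry in `reference`."""
    if not rows:
        return None  # nothing to measure
    valid = sum(1 for row in rows if row[key] in reference)
    return valid / len(rows)


def timeliness(rows, as_of, max_age):
    """Share of rows updated no longer than `max_age` before `as_of`."""
    if not rows:
        return None  # nothing to measure
    fresh = sum(1 for row in rows if as_of - row["updated"] <= max_age)
    return fresh / len(rows)


integrity_ratio = integrity(contracts, "supplier_id", suppliers)
timeliness_ratio = timeliness(contracts, date(2013, 6, 30), timedelta(days=90))
print(f"integrity: {integrity_ratio:.0%}, timeliness: {timeliness_ratio:.0%}")
```

Each dimension gets its own number here; whether a given number is good enough depends on what the data are used for.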

Page 5

Fallacies

■ “good data are error-free and valid”

■ “improving quality means cleansing”

■ “it is an IT problem”

■ “it can be fixed”

Page 6

Short Story: Completeness

Open Public Procurements

Page 7

from this...

Page 8

... to this:

http://tendre.sme.sk

Page 9

[Chart: percentage of the field filled and successfully processed, by month, 2005 to 2010. Vertical axis: 0% to 100%, annotated from “none” to “have it all”, with higher marked as “better”.]

Quality measure

completeness: 55%

What percentage of the field is filled and successfully processed?
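The completeness measure asked about on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `price` field name, and the `parse_price` helper are hypothetical, and the sample records are made up rather than taken from the procurement data.

```python
from typing import Callable, Iterable, Optional


def completeness(records: Iterable[dict], field: str,
                 parse: Callable[[str], object]) -> Optional[float]:
    """Share of records whose `field` is filled and parses successfully."""
    total = 0
    usable = 0
    for record in records:
        total += 1
        raw = record.get(field)
        if raw is None or str(raw).strip() == "":
            continue  # field is not filled
        try:
            parse(str(raw))
        except ValueError:
            continue  # filled, but could not be processed
        usable += 1
    return usable / total if total else None


def parse_price(text: str) -> float:
    """Hypothetical parser: accepts values such as '1 200,50' or '300'."""
    return float(text.replace(" ", "").replace(",", "."))


# Made-up sample records, for illustration only.
sample = [
    {"price": "1 200,50"},
    {"price": ""},
    {"price": "unknown"},
    {"price": "300"},
]

ratio = completeness(sample, "price", parse_price)
print(f"completeness: {ratio:.0%}")
```

Note that the measure counts a value as usable only if it is both present and processable, so a better parser can raise the score without any change to the source data.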

Page 10

type 1 + type 2

Page 11

What percentage of the field is filled and successfully processed?

[Chart: percentage of the field filled and successfully processed, by month, 2005-3 to 2010-9. Vertical axis: 0% to 100%, annotated from “none” to “have it all”, with higher marked as “better”.]

Quality measure

completeness: 88%

Page 12

What does that mean: “high quality data”?

Page 13

85%?

Page 14

Conclusion

Page 15

appropriate for given purpose

Page 16

Data Project

■ define data quality requirements

■ measure during development

■ provide data quality report
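The three steps above could be wired together roughly as follows. This is a hedged sketch, not something shown in the slides: the `requirements` thresholds, the `measured` scores, and the `quality_report` function are hypothetical names and illustrative numbers.

```python
# Hypothetical requirements: minimum acceptable score per dimension.
requirements = {"completeness": 0.85, "integrity": 0.99}

# Made-up measurements, e.g. collected during development.
measured = {"completeness": 0.88, "integrity": 0.97}


def quality_report(requirements, measured):
    """Compare measured scores with the required minimums, one line each."""
    lines = []
    for name, minimum in sorted(requirements.items()):
        score = measured.get(name)
        if score is None:
            status = "not measured"
        elif score >= minimum:
            status = "ok"
        else:
            status = "below requirement"
        shown = "n/a" if score is None else f"{score:.0%}"
        lines.append(f"{name}: {shown} (required {minimum:.0%}) {status}")
    return lines


report = quality_report(requirements, measured)
print("\n".join(report))
```

Keeping the requirements as data, separate from the measurements, lets the same report be regenerated at each stage of the project.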

Page 17

More topics

■ Data quality measurement: indicators, probes

■ Data quality management: roles, processes, impact

■ Data cleansing