keeping your data clean - niug · 2016-11-29 · keeping your data clean . data integrity paradigm...

61
Monday November 28, 2016 2:05PM 3:05PM Jake Bound, K2 and You Doug Morris, Computer System Innovations Keeping Your Data Clean

Upload: others

Post on 20-May-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Monday November 28, 2016

2:05PM – 3:05PM

Jake Bound, K2 and You

Doug Morris, Computer System Innovations

Keeping Your Data Clean

Page 2: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Data Integrity Paradigm

Page 3: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Data Integrity Paradigm

Page 4: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Today

Page 5: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

A day in our life…

Page 6: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Why?

• The 1-10-100 Rule • It costs $1 to verify data when it is first entered into the

database

• It costs $10 to clean and de-dupe the data later

• It costs $100 in additional costs and lost opportunity if it is never corrected

• Trust

• Trust is priceless

Page 7: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Why?

• Data is the single most important asset in your organization

or to put it in even more clear terms

• Your whole business might blow up!

Page 8: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

How

1. Understand the Paradigm (you can’t win)

2. Decide what “clean” means to you

3. Create your iMIS superhero

4. Set standards

5. Identify bad data

6. Remediate

7. Refine

8. Rinse and Repeat

Page 9: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Understanding the Paradigm

• Fixing everything in one pass is impractical

• Data cleanup is a process, not a one-time task

• Not all errors can be prevented

• Not all cleanup can be automated

• Consider cost/benefit

• Always remember, WE ARE HUMAN

Page 10: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Deciding on what “clean” means to you

• Every organisation has different:

• Requirements

• Configurations

• Budgets

• Audiences

• Staff abilities

• Legacy data

• Vendors

• Politics

Page 11: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

The “why” behind cleaning

• Build Trust with all

• Improve reporting quality

• Increase self-service

• Simplify automation and other projects

• Change of process, fresh start

• Save license fees

• Increase value of rosters

• Discard legacy data

• Save postage

• Save space

• Save money (e.g. emails)

Intangible Tangible

Page 12: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Your iMIS Superhero

• In charge of

• New Fields added to iMIS

• Bridging the gap between what is important and what is controllable

• Working with your vendors • Initial through on-going

• Creating reports/queries/other clean-up options

Page 13: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Set Standards

• Define criteria for “clean” and “dirty”

• Communicate the standards to all (staff)

• Configure iMIS to capture data in a useful format

• Birth date vs. age, staff size range vs. number of employees

• Desired outputs guide your inputs

• The ultimate use of data should guide how you collect that data (e.g. spouse name, public directory)

Page 14: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identify bad data

• Country = Australia, but State is not an Australian state

• Country = Canada, but State is not a Province

• Potential duplicate related data (two orders for the same products the same day, same ID)

• Middle name is 1 character with no period (i20 fixes now)

• Orphaned address

• Preferred address is blank

• “Other” selection with no other text provided

• One Address found when two are expected

Page 15: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identify bad data

• Business rule violations

• Lapsed members (still a member, but expired)

• Email required

• Students with past grad date

• Apprentice members must have a mentor relationship

• Board member not registered for House Of Delegates

Page 16: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identifying bad data

Free Tools (IQA)

Pros

Easy to access

On demand

End user accessible from iMIS

Does not require SQL knowledge

Can be made interactive

Cons

• Reactive

• No scheduling

• No automated corrections

• No automatic distribution of results (without PAP)

Page 17: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identifying bad data

Free Tools (SSRS)

Pros

On demand or scheduled*

Can be under user control

Can give them a home in iMIS

Automatic Distribution of results (send email to all)*

Cons

• Reactive

• No automated corrections

• Some technical aptitude required

Page 18: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identifying bad data

Free Tools (SQL)

Pros

Incredibly powerful (SOUNDEX)

Can create views/business objects

More powerful with each version of SQL

Automated corrections

Cons

• Requires SQL Server

Knowledge

Access

• Dangerous in the wrong hands

• Can stop working without notice (Agent/Jobs)

Page 19: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identifying bad data

Included with iMIS 20 (Process Automation)

Pros

Included with iMIS 20

Multiple alerts included

Works great with Staff site and Member site

Cons

• Limited Data Quality Alerts

Missing Primary email alert

Missing Mobile phone alert

Page 20: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identifying bad data

Optional Modules (Process Automation Plus)

Pros

Create your own alerts

Can perform scheduled tasks (and call SQL)

Can send emails to Staff

Can send SSRS reports

Automatic Distribution of results

Cons

• Reactive

• Optional

• Additional Licensing Required

Page 21: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identifying bad data

Optional modules (Customer Service Alerts)

Pros

Proactive

User configurable

Works with desktop and staff site

Immediate notification

Can run processes

Cons

• No automated corrections

• No interactive corrections*

• Reactive

• Additional Licensing Required

Page 22: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Identifying bad data

Optional modules (TaskCentre)

Pros

User configurable

Immediate or delayed notification

Can correct errors as well

Cons

• Reactive

• No interactive corrections

• IT Support required to configure and maintain

• Requires dedicated SQL Server (limiting hosting options)

Page 23: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Remediate

• Better labeling/instructions

• Restrict access, increase access (nobody knows their data better than themselves)

• Change the process, add lookup tables/validations OR just stop collecting data

• Upgrade to newer technology (i20)

• Rollback changes

• Notify the person who entered it

• Sooner is better!

• Notify a supervisor or someone who is able to fix (or punish)

• Modify the data automatically using tools

Page 24: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Remediate

Page 25: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Refine

• Is our process working?

• Do we need to automate what we log?

• Have we discovered new problems?

• What is the most serious remaining problem? (choose 1 per month)

• Remember to check cost/benefit (JB story)

Page 26: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Rinse and Repeat

• The process is continuous

Page 27: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

How about some tools to take with you?

• IQA Query #1

• If country = Canada, province is a valid province

Or

• If valid province, country should = Canada

Page 28: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

If Country = Canada then Valid Postal Code

Page 29: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

If Country = Canada then Valid Postal Code

Page 30: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

If Valid Postal Code then Country = Canada

Page 31: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

If Valid Postal Code then Country = Canada

Page 32: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

How about some tools to take with you?

• IQA Query #2

• Ensure email address = username

• Because we can’t compare fields in IQA (can’t say where csContact.EMAIL <> users.UserId) we need a custom business object

• Create a SQL view

• Create a business object

• Create an IQA

Page 33: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Username = Email address

create view k2V_UserNameEmailCheck as Select n.ID, n.EMAIL, u.ContactMaster, u.UserId, VALID_CHECK = case when n.EMAIL = '' then 'Email missing' when u.UserId = '' then 'Username missing' when n.EMAIL <> u.UserId then 'Invalid Username' else '' end From Name n, UserMain u Where n.ID = u.ContactMaster And n.STATUS = 'A' And n.MEMBER_RECORD = 1 And n.COMPANY_RECORD = 0 And u.UserId <> ''

Page 34: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Username = Email address

Page 35: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Username = Email address

Page 36: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

How about some tools to take with you?

• IQA Query #3

• Ensure Canadian postal code matches correct format (LNL NLN)

• Unfortunately another example of not being able to use more advanced logic in IQA comparisons

• SQL and Business Object time again …

Page 37: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Check for Valid/Correct Postal Code

create view k2V_PostalCodeCheck as Select ID, ADDRESS_NUM, ZIP, VALID_CHECK = PATINDEX('%[A-Z]%', substring(ZIP,1,1)) + PATINDEX('%[0-9]%', substring(ZIP,2,1)) + PATINDEX('%[A-Z]%', substring(ZIP,3,1)) + PATINDEX('% %', substring(ZIP,4,1)) + PATINDEX('%[0-9]%', substring(ZIP,5,1)) + PATINDEX('%[A-Z]%', substring(ZIP,6,1)) + PATINDEX('%[0-9]%', substring(ZIP,7,1)), COUNTRY From Name_Address Where ZIP <> '' And COUNTRY = 'Canada' And PATINDEX('%[A-Z]%', substring(ZIP,1,1)) + PATINDEX('%[0-9]%', substring(ZIP,2,1)) + PATINDEX('%[A-Z]%', substring(ZIP,3,1)) + PATINDEX('% %', substring(ZIP,4,1)) + PATINDEX('%[0-9]%', substring(ZIP,5,1)) + PATINDEX('%[A-Z]%', substring(ZIP,6,1)) + PATINDEX('%[0-9]%', substring(ZIP,7,1)) <> 7

Page 38: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Check for Valid/Correct Postal Code

Page 39: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Check for Valid/Correct Postal Code

Page 40: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

How about some tools to take with you?

• IQA Query #4

• Find Duplicates based on First Name, Last Name, and part of address

Page 41: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Possible Duplicates (Name and City)

create view k2_PossibleDuplicates as Select ID From Name Where replace(lower(FIRST_NAME)+lower(LAST_NAME)+lower(CITY),' ','') in ( Select replace(lower(FIRST_NAME)+lower(LAST_NAME)+lower(CITY),' ','') as MATCH_KEY from Name where FIRST_NAME <> '' and LAST_NAME <> '' group by replace(lower(FIRST_NAME)+lower(LAST_NAME)+lower(CITY),' ','') having count(replace(lower(FIRST_NAME)+lower(LAST_NAME)+lower(CITY), ' ','')) > 1 )

Page 42: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Possible Duplicates (Name and City)

Page 43: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Possible Duplicates (Name and City)

Page 44: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

How about some tools to take with you?

• IQA Query #5

• Members without paid dues subscriptions

• Considering paid thru as “no paid dues”

Page 45: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Unpaid Members

Page 46: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Unpaid Members

Page 47: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Custom Shortcuts to Help with Editing

• Create a simple custom page with just the address and security details

• Pass the ID to the custom page from each IQA

Page 48: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Custom Shortcuts to Help with Editing

• Create a simple custom page with just the address and security details

• Pass the ID to the custom page from each IQA

Page 49: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Custom Shortcuts to Help with Editing

Page 50: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

More ideas from Michelle Lelempsis!

Page 51: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

More ideas from Michelle

Page 52: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

More ideas from Michelle

Page 53: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Michelle’s favorite Stored Procedures

• Job coding in Financials for division

• Phone formatting

• Country specific (Australia/Canada/USA) removal from address

• Unlocking member web accounts

• Activity tasks to notify a staff about an issue

Page 54: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Michelle’s plan

• Data sources and data integrity

• Develop a Data Management Plan with key teams and management

• Identify your Minimum Data Set

• Develop iQA Reports for daily, weekly, monthly and adhoc checks

• Identify SQL Stored Procedures for updating fields

• Develop Staff Sites dashboards

• Implement Staff Site Alerts

Page 55: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Staff Site Data Integrity Dashboards Community & Events i20 2015Q4

Page 56: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Staff Site Data Integrity iMIS 20 Q4

• Staff site Alerts

Page 57: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Let’s hear from you!

Page 59: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Click to add text CLICK TO ADD TEXT

Page 60: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

Click to add text

• Click to add text • Click to add text

Page 61: Keeping Your Data Clean - NiUG · 2016-11-29 · Keeping Your Data Clean . Data Integrity Paradigm . ... Easy to access Reactive On demand End user accessible from iMIS Does not require

THANK YOU TO OUR SPONSORS

FOUNDING PARTNER

PLATINUM PARTNERS

GOLD PARTNERS

SILVER PARTNERS