Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Oracle Enterprise Data Quality Introduction to Parsing

Upload: phyllis-watts

Post on 13-Dec-2015



Oracle Enterprise Data Quality

Introduction to Parsing


What is Parsing?

• “The application of business rules and semantic intelligence to data in order to understand and validate it en masse and, if required, improve its structure in order to make it fit for purpose.”

• Commonly used to structure and prepare data before matching.


Typical Business Problems (1)

Data extraction:
• E.g. who do I sell to?
• Extract all customer names into a single attribute.
  – Communicate with accuracy.
  – Avoid sending inappropriate communications.
    - E.g. to deceased customers.
    - Bad public relations and possible legal issues.


Typical Business Problems (2)

Data migration:
• Single name field > structured columns.
  – Data is now structured.

[Figure: Original system | New system]


Typical Business Problems (3)

Clean data:
• E.g. for better matching.
• Remove or move personal names hidden in company names.
• Remove or move inappropriate data.
• Standardize abbreviations.
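
The abbreviation clean-up above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Parse processor works: the ABBREVIATIONS mapping and the standardize_company_name function are hypothetical names invented for this example, and a real reference list would be far larger.

```python
# Hypothetical reference data: abbreviation -> standard form.
ABBREVIATIONS = {
    "ltd": "Limited",
    "inc": "Incorporated",
    "corp": "Corporation",
    "co": "Company",
}

def standardize_company_name(name):
    """Replace known abbreviations in a company name with their standard form."""
    words = []
    for word in name.split():
        # Ignore trailing punctuation and case when looking the word up.
        key = word.rstrip(".,").lower()
        words.append(ABBREVIATIONS.get(key, word))
    return " ".join(words)

print(standardize_company_name("Acme Co. Ltd."))
```

With standardized values, "Acme Co. Ltd." and "Acme Company Limited" compare as equal, which is why this kind of cleaning helps matching.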


Enterprise Data Quality Text Analysis (1)

Application of business rules and semantic intelligence.

• Understanding and transforming data:
  – Names, addresses, product descriptions, etc.
  – “Does this list contain only names?”
  – “How good is my data?”


Enterprise Data Quality Text Analysis (2)

Generic capability:
• Configure the Parse processor to solve a specific problem.
• Apply your own business rules.
• Pre-configured Parse processors are available.
  – Starting point for tailored parsing.


Example: Parse a Full Name String

Parse free-text Full Name into ordered columns.
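
To make the goal concrete, here is a minimal Python sketch of splitting a free-text name into ordered columns. It is not the EDQ Parse processor or its configuration: the TITLES and SUFFIXES lists, the column names, and the parse_full_name function are assumptions made for this illustration, and the rules assume the name is written in "title forename middle surname suffix" order.

```python
import re

# Hypothetical reference lists; a real configuration would use much larger ones.
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "phd"}

def parse_full_name(full_name):
    """Split a free-text name into ordered columns using simple token rules."""
    tokens = [t for t in re.split(r"[\s,]+", full_name.strip()) if t]
    result = {"title": "", "forename": "", "middle": "", "surname": "", "suffix": ""}

    # A leading token found in the title list becomes the title.
    if tokens and tokens[0].rstrip(".").lower() in TITLES:
        result["title"] = tokens.pop(0).rstrip(".")
    # A trailing token found in the suffix list becomes the suffix.
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        result["suffix"] = tokens.pop().rstrip(".")

    # Whatever remains is treated as forename, middle name(s), surname.
    if len(tokens) == 1:
        result["surname"] = tokens[0]
    elif len(tokens) >= 2:
        result["forename"] = tokens[0]
        result["surname"] = tokens[-1]
        result["middle"] = " ".join(tokens[1:-1])
    return result

print(parse_full_name("Dr. John Michael Smith Jr"))
```

Fixed positional rules like these break on inputs such as "Smith, John", which is one reason a configurable parser driven by reference data and business rules is used instead of hand-written splitting.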