Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
Oracle Enterprise Data Quality
Introduction to Parsing
What is Parsing?
• “The application of business rules and semantic intelligence to data in order to understand and validate it en masse and, if required, improve its structure in order to make it fit for purpose.”
• Commonly used to structure and prepare data before matching.
Typical Business Problems (1)
Data extraction:
• E.g. who do I sell to?
• Extract all customer names into single attribute.
  – Communicate with accuracy.
  – Avoid sending inappropriate communications.
    – E.g. to deceased customers.
    – Bad public relations and possible legal issues.
Typical Business Problems (2)
Data migration:
• Single name field > structured columns.
  – Data is now structured.
[Figure: Original system compared with New system]
Typical Business Problems (3)
Clean data:
• E.g. for better matching.
• Remove or move personal names hidden in company names.
• Remove or move inappropriate data.
• Standardize abbreviations.
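The "standardize abbreviations" step can be illustrated with a short Python sketch. This is not how Enterprise Data Quality is configured; it only shows the idea of mapping variant abbreviations to one standard form. The `ABBREVIATIONS` map and the `standardize_abbreviations` function are hypothetical names, and the choice to expand abbreviations (rather than shorten full words) is an assumption.

```python
import re

# Hypothetical reference data: abbreviation -> standard form.
# A real project would maintain its own list.
ABBREVIATIONS = {
    "ltd": "Limited",
    "inc": "Incorporated",
    "co": "Company",
    "corp": "Corporation",
}


def standardize_abbreviations(company_name):
    """Replace known abbreviations in a company name with their standard form."""

    def expand(match):
        word = match.group(0)
        # Look the word up without its trailing full stop, ignoring case.
        # Unknown words are returned unchanged.
        return ABBREVIATIONS.get(word.rstrip(".").lower(), word)

    # Match each alphabetic word, including an optional trailing full stop.
    return re.sub(r"[A-Za-z]+\.?", expand, company_name)
```

With this map, `standardize_abbreviations("Acme Co. Ltd")` returns `"Acme Company Limited"`, so that "Acme Co. Ltd" and "Acme Company Limited" would compare as equal in a later matching step.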
Enterprise Data Quality Text Analysis (1)
Application of business rules and semantic intelligence.
• Understanding and transforming data:
  – Names, addresses, product descriptions, etc.
  – “Does this list contain only names?”
  – “How good is my data?”
Enterprise Data Quality Text Analysis (2)
Generic capability:
• Configure Parse processor to solve specific problem.
• Apply own business rules.
• Pre-configured Parse Processors are available.
  – Starting point for tailored parsing.
Example: Parse a Full Name String
Parse free-text Full Name into ordered columns.
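As a rough illustration of what "parse into ordered columns" means, the sketch below splits a free-text name into title, first, middle, last and suffix fields in plain Python. It does not use or represent the Enterprise Data Quality Parse processor. The `TITLES` and `SUFFIXES` lists and the `parse_full_name` function are hypothetical, and the rules assume Western "Title First Middle Last Suffix" ordering.

```python
# Hypothetical reference lists used to classify tokens (not EDQ reference data).
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "phd"}


def parse_full_name(full_name):
    """Split a free-text full name into ordered columns.

    Assumes "Title First Middle Last Suffix" order; "Last, First" input
    is not handled by this sketch.
    """
    # Treat commas as separators, then split on whitespace.
    tokens = full_name.replace(",", " ").split()
    record = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}

    # A leading token found in the title list becomes the title.
    if tokens and tokens[0].rstrip(".").lower() in TITLES:
        record["title"] = tokens.pop(0).rstrip(".")
    # A trailing token found in the suffix list becomes the suffix.
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        record["suffix"] = tokens.pop().rstrip(".")
    # Of what remains, the last token is the family name and the first
    # token is the given name; anything in between is the middle name.
    if tokens:
        record["last"] = tokens.pop()
    if tokens:
        record["first"] = tokens.pop(0)
    record["middle"] = " ".join(tokens)
    return record
```

For example, `parse_full_name("Dr. John Paul Smith Jr")` yields title `Dr`, first `John`, middle `Paul`, last `Smith` and suffix `Jr`. Names that do not fit the assumed order would need additional rules, which is the kind of tailoring the slides describe.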