UiPath Document Understanding
TRANSCRIPT
This presentation may include forward-looking statements. Forward looking statements include all statements that are not historical facts, and in some cases, can be identified by terms such as “anticipate,” “believe,” “estimate,” “expect,” “intend,” “may,” “might,” “plan,” “project,” “will,” “would,” “should,” “could,” “can,” “predict,” “potential,” “continue,” or the negative of these terms, and similar expressions that concern our expectations, strategy, plans or intentions. By their nature, these statements are subject to numerous risks and uncertainties, including factors beyond our control, that could cause actual results, performance or achievement to differ materially and adversely from those anticipated or implied in the statements. Although our management believes that the expectations reflected in our statements are reasonable, we cannot guarantee that the future results, levels of activity, performance or events and circumstances described in the forward-looking statements will be achieved or occur. Recipients are cautioned not to place undue reliance on these forward-looking statements, which speak only as of the date such statements are made and should not be construed as statements of fact.
This meeting is strictly confidential. By participating in this meeting, you agree to keep any information we provide confidential and not to disclose any of the information to any other parties without our prior express written permission. Neither the information contained in this presentation, nor any further information made available by us or any of our affiliates or employees, directors, representatives, officers, agents or advisers in connection with this presentation will form the basis of or be construed as a contract or any other legal obligation.
Safe Harbor
Document processing is at the core of RPA. Why?
One of the main promises
of process automation is
freeing up data trapped
in documents
There is no company in
the world which does not
deal with documents,
especially in industries
like banking, finance,
insurance, manufacturing,
HR, public sector
Any selling process
involves repetitive
document processing
(accounts payable &
receivable, invoices, receipts,
purchase orders, shipment
tracking)
35% cost reduction
compared to manual
document
processing
17% reduction in time
employees spend on
document
processing
40% increase in
employee
productivity or
customer satisfaction
52% decrease in errors
that mitigates the
risk of rework and
related losses
Why would you automate document processing?
Document processing can be a challenge
Manually extract, interpret, act upon
Variety of document types and quality
Human error, rework, losses
Cost and time consuming
Lack of end-to-end complex solutions
Manually extract, interpret, act upon
Variety of document types and quality
Human error, rework, losses
Cost and time consuming
Lack of end-to-end complex solutions
Delegation to robots understanding documents with AI
Processed automatically despite document structure, volume, or quality
Accurate and fast document processing
Cost and time efficiency
End-to-end solution powered by AI
Document processing can be a solution
What is document understanding?
Not OCR
Not Computer Vision
Document understanding is the ability to extract and interpret information and meaning from a wide range of documents.
It emerges at the intersection of document processing, AI, and RPA.
Get documents processed intelligently
Teach robots how to process your documents using intelligent drag-and-drop skills
for data extraction and interpretation
AI understands documents,
takes actions, and learns
from the data
Getting rid of the “noise”
caused by unrelated, rotated,
or skewed documents
Saving time and costs
with seamless end-to-end
automation
Processing a wide range of
documents and layouts,
handwriting, checkboxes
Mix of template and
template-less approaches
for most accurate results
Machine learning (ML) skills
improve over time
based on the custom data
INTELLIGENT
FLEXIBLE
ACCURATE
EFFICIENT
Let robots take on routine document processing
Extract and interpret data from structured documents (different forms,
passports, licenses, time sheets, etc.) that can contain handwritten text,
signatures, checkboxes.
Extract data from semi-structured documents containing fixed and variable
parts like tables. Samples of semi-structured documents are invoices,
receipts, purchase orders, medical bills, bank statements, utility bills, etc.
Analyze and extract data from complex unstructured documents like
various contracts, agreements, emails, disease descriptions, drug
prescriptions, news, voice scripts, etc.
Leverage human knowledge to process the data from documents by
robots that use AI to understand and act upon the extracted content.
Every company in the world processes documents
General
• Invoice
• Receipt
• Purchase Order
• Utility bill
• Bill of Lading
• Passport
• License
Financial Services & Insurance
• Accounts Payable & Accounts Receivable
• IRS form
• Loan application
• Mortgage processing
• Account opening and customer onboarding
• Claims processing
• Vendor onboarding
• Compliance-related processes
Human Resources (HR)
• Employee onboarding
• Resume screening
• HR records processing
Manufacturing
• Sales order processing
• Customer parts request
• Remittance processing
Public Sector
• Immigration application
• School application
• Passport management application
Healthcare
• Medical forms
• Medical bills
• Health records
• Drug prescriptions
Automating document processing leads to…
Higher efficiency
Save time and costs spent on paperwork with easy to deploy and maintain automations
Accelerated productivity
Automate highly manual document processing tasks to accelerate productivity rates
Better customer experience
Mitigate the risk of errors and decrease the response time to deliver better customer experience
Happier employees
Help employees escape from the mundane chores and focus on higher-value tasks
“Respondents across all industries said content intelligence
technologies enabled several key corporate initiatives, including
employee engagement, customer engagement, work
transformation, and overall digital transformation.”
Holly Muscolino
International Data Corporation (IDC)
UiPath Document Understanding webpage ← for trial and more
resources
How AI Can Continuously Improve and Scale Automations (webinar with
customers sharing case studies)
AI Coming to the Rescue: Accurate Document Processing at Scale
(product deep-dive webinar, February 2021)
Guide on Document Understanding (white paper)
Academy training
Documentation on Document Understanding
AI & RPA webpage
Where can you find out more?
Challenge
• The Evros Finance team spent 20 hours a week manually processing
1,600 to 1,800 purchase invoices a month.
• Manual, repetitive, and time-consuming document processing led to
a poor overall use of the team’s time.
• They wanted an automated solution that could scale with the business.
Solution
• Mapped the business process to an automated approach
• Built and trained the data models for the key suppliers
• Set up & configured UiPath Document Understanding
• Designed & implemented the integrations with Outlook & Microsoft NAV
(Evros financial system)
• Continuous improvement with machine learning (ML)
• Human involvement when required
About 21,000 invoices per year
80% improvement in time savings
6-8 weeks spent on retraining to increase accuracy from 60% to 80%
80% accuracy that is still improving over time with automatic retraining
Important considerations
• Each individual customer/client has their individual invoice template
• There were other document types mixed into the start of the process that would need to be filtered out (e.g. credit notes)
• Each invoice was printed, reviewed, and stored for audit purposes as part of the manual process, so a new storage approach needed to be considered
IT managed services and
system integration
Location: Ireland
Size: SME
Finance purchase invoice processing
Link to the case study
7000 invoices processed monthly
45 seconds avg invoice processing time
160+ hours saved monthly
90%+ straight through processing
• Required a custom "Bill of Lading" field to
be trained
• Starting with out-of-the-box ML model
significantly reduced effort
• 6 weeks development + 6 weeks model training
Retail: Fuel invoice processing
Customer:
Wholesale Club
Link to the case study
200-250 training data sets that are representative of the production population are core to
building successful ML models
SME education/training is needed for SMEs to learn how to prepare data sets
2-3 minutes are needed to tag 1 document with 12 elements
~40% time saved with 400 docs model training on GPU (10 hours) compared to
CPU (17 hours)
2-4 weeks from first cutover to production with expected confidence
levels
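The figures above can be cross-checked with a little arithmetic. This is only a sketch: the training times and per-document tagging times are the ones quoted on the slide, while multiplying the tagging time by the data set size is our own assumption about how the effort adds up.

```python
# Training times quoted on the slide for a 400-document model.
cpu_hours = 17
gpu_hours = 10

# Relative time saved by training on GPU instead of CPU.
time_saved = (cpu_hours - gpu_hours) / cpu_hours
print(f"Time saved on GPU: {time_saved:.1%}")  # prints "Time saved on GPU: 41.2%"

# Assumption: total tagging effort = documents x minutes per document.
tagging_hours_low = 200 * 2 / 60    # 200 documents at 2 minutes each
tagging_hours_high = 250 * 3 / 60   # 250 documents at 3 minutes each
print(f"Tagging effort: {tagging_hours_low:.1f} to {tagging_hours_high:.1f} hours")
```

The result (about 41%) is consistent with the "~40% time saved" on the slide.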
Why UiPath Document Understanding?
• Out-of-the-box ML model for invoice processing
• Built-in integrations with OCR engines including UiPath Document OCR
and Omnipage OCR
• Action Center for human validation
Challenge
• A NY-based healthcare provider has 500+ suppliers and frequent
returns/cancellations of orders, which leads to processing 800+ credit
memos (reverse invoices) a month in PeopleSoft ERP
• The accounts team manually processes supplier credit memos and makes
credit adjustments against the PO in the ERP
• The process involves a number of business validations, including data comparison
between the document and the ERP
• On average, manual processing takes about 8 minutes
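A rough estimate of the manual workload behind this challenge. It is a sketch under one assumption that the slide does not state explicitly: the 8-minute average is taken to be per credit memo.

```python
credit_memos_per_month = 800  # the slide says "800+", so this is a lower bound
minutes_per_memo = 8          # assumption: the 8-minute average applies per credit memo

manual_hours_per_month = credit_memos_per_month * minutes_per_memo / 60
print(f"At least {manual_hours_per_month:.0f} hours of manual processing per month")
```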
Why automate?
• Timely adjustment of credit amount on Purchase Orders (PO) enables
efficient working capital
• Avoid manual errors with AI automation to address more complex scenarios
Types of documents
• 65% semi-structured scanned
• 20% structured
• 5% semi-structured scanned
handwritten
• 10% digital PDFs
Healthcare provider Location: USA
Automation of credit adjustment on purchase order
Link to the case study
Why UiPath Document Understanding?
• Automated redaction of PII and PHI coverage
• Continuous improvement with ML
• US-based operations for production support
• Human validation if required
Challenge
• Customer has compliance needs to redact
documents before transmission to third
parties to validate patient claims
• Time-consuming process when done
manually
Location: USA
Redaction of documents
Types of documents
• 10-page documents on average
• Important considerations
• Prioritize top 5 document types
covering 80% of the documents
• 10-11 PII fields + list of medical terms
considered as scope for redaction
• Human to review 100% redactions to
begin with and reducing over time
(move from attended assistance to
STP path)
Link to the case study
INDUSTRIES
Accounts Payable invoice
processing results:
• Increased operational excellence
• Reduced average handling time (AHT)
• SLAs are met for all invoices
• Reduced # errors vs. manual work
• UiPath robot is used to speed up the
supplier invoice processing by leveraging
UiPath AI solutions like Document
Understanding and Action Center.
Finance: Invoice processing
Customer:
UiPath Document Understanding and RPA platform
Platform phases: DISCOVER, BUILD, MANAGE, RUN, ENGAGE, GOVERN, MEASURE
Platform products: Automation Hub, Process Mining, Task Mining, Task Capture, Studio Family, Document Understanding, Marketplace and Integrations, AI Center, Orchestrator, Test Manager, Insights, Robots, Chatbots, Action Center, Assistant, Apps
Studio, Robots, Orchestrator
• Core RPA tools
AI Center
• Pre-trained models available out of the box
• Bring your own model – custom or 3rd party
• Retrain the models
Action Center
• Human validation
Document Understanding Framework
Load taxonomy to define document types and fields
Digitize documents using OCR to make them machine-readable
Classify and split the files into document types
Validate classification results (human review)
Train classifiers based on the validated data
Extract information from the documents
Validate extraction results (human review)
Train extractors based on the validated data
Export the extracted data for further usage
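To make the flow of the framework concrete, here is a minimal Python sketch of the stages as a chain of functions. It is not the UiPath API: the `Document` class, every stage function, and the placeholder OCR text are hypothetical stand-ins, and the training stages are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container passed from stage to stage."""
    path: str
    text: str = ""
    doc_type: str = ""
    fields: Dict[str, str] = field(default_factory=dict)
    steps: List[str] = field(default_factory=list)


def digitize(doc: Document) -> Document:
    doc.text = "Invoice No: 456200-TZE1"  # placeholder for real OCR output
    doc.steps.append("digitize")
    return doc


def classify(doc: Document) -> Document:
    doc.doc_type = "Invoice" if "invoice" in doc.text.lower() else "Unknown"
    doc.steps.append("classify")
    return doc


def validate_classification(doc: Document) -> Document:
    doc.steps.append("validate classification")  # human review would happen here
    return doc


def extract(doc: Document) -> Document:
    doc.fields["invoice_no"] = doc.text.split(":", 1)[1].strip()
    doc.steps.append("extract")
    return doc


def validate_extraction(doc: Document) -> Document:
    doc.steps.append("validate extraction")  # human review would happen here
    return doc


def export(doc: Document) -> Document:
    doc.steps.append("export")  # hand-off to a downstream system would happen here
    return doc


PIPELINE: List[Callable[[Document], Document]] = [
    digitize, classify, validate_classification,
    extract, validate_extraction, export,
]


def run(doc: Document) -> Document:
    """Pass the document through every stage in order."""
    for stage in PIPELINE:
        doc = stage(doc)
    return doc


result = run(Document(path="scan_001.pdf"))
print(result.doc_type, result.fields, result.steps)
```

Keeping each stage as a separate function mirrors the framework's idea that classification, extraction, and validation can each be swapped or retrained independently.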
Load taxonomy
Taxonomy Manager is
used once at the start to
define the collection of
documents that you
want to process.
Additionally, you can
describe what data you
want to extract.
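Conceptually, a taxonomy is a mapping from document types to the fields to extract from each. The sketch below shows that idea in Python. The document types come from this deck, but the field names and the `fields_for` helper are illustrative assumptions, not the format used by Taxonomy Manager.

```python
from typing import Dict, List

# Hypothetical taxonomy: document type -> fields to extract.
TAXONOMY: Dict[str, List[str]] = {
    "Invoice": ["invoice_no", "po_number", "invoice_date", "total"],
    "Receipt": ["vendor", "date", "total"],
    "Purchase Order": ["po_number", "order_date", "total"],
}


def fields_for(doc_type: str) -> List[str]:
    """Return the fields defined for a document type (empty list if the type is unknown)."""
    return TAXONOMY.get(doc_type, [])


print(fields_for("Invoice"))
print(fields_for("Contract"))  # not in the taxonomy, so nothing to extract
```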
Move-For-You Co.
We move so you don’t need to move
PO: NP74006735
1 February 2020
PAYABLE WITHIN 15 DAYS OF RECEIPT
20800 ALMADEN AVE, SUITE 404
SAN JOSE, CA 95120-0520
T: +1 425 555 9876
F: +1 425 555 3456
www.moveforyou-co.com
Bill To:
Tony Tzeng
12345 Mango Lane
Seattle, WA 98108
INVOICE DETAILS                      FEE
Packing services                     $1,282.00
Storage fees (1 month)               $1,884.00
House move (white-glove service)     $5,320.00
Vehicle storage and transport        $5,186.00
Sales tax 10%                        $1,367.20
Total Fee including Tax              $15,039.20
Methods of payment
Personal Check: Move-For-You LLC
Wire Transfer: BigBank Co., Account 123456789-0987ABC
Invoice No: 456200-TZE1
Digitize text in the documents using OCR
Classify and split the documents
[Figure: a multi-page document is classified and split into an invoice, a purchase order, a receipt, and unrelated documents]
Documents scanned into one
file aren't a problem: thanks to
classifiers, the robot can identify
the document types and split the
file to process the documents
accordingly.
Document Understanding offers
different classification capabilities
ranging from keyword-based to
ML-based classification.
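As an illustration of the keyword-based end of that range, the sketch below classifies each page by counting keyword hits and then groups consecutive pages of the same type into one document. The keyword lists, function names, and sample pages are all assumptions made for this example. It is not how the UiPath classifiers are implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword lists per document type.
KEYWORDS: Dict[str, List[str]] = {
    "Invoice": ["invoice no", "bill to", "payable"],
    "Purchase Order": ["purchase order", "ship to"],
    "Receipt": ["receipt", "cash", "change due"],
}


def classify_page(text: str) -> str:
    """Pick the type with the most keyword hits, or "Unrelated" if nothing matches."""
    lowered = text.lower()
    scores = {doc_type: sum(kw in lowered for kw in kws) for doc_type, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unrelated"


def split_file(pages: List[str]) -> List[Tuple[str, List[int]]]:
    """Group consecutive pages of the same type into (doc_type, page_indices) segments."""
    segments: List[Tuple[str, List[int]]] = []
    for index, text in enumerate(pages):
        doc_type = classify_page(text)
        if segments and segments[-1][0] == doc_type:
            segments[-1][1].append(index)
        else:
            segments.append((doc_type, [index]))
    return segments


pages = [
    "Invoice No: 456200-TZE1 Bill To: Tony Tzeng",
    "Total Fee including Tax, payable within 15 days",
    "Purchase Order NP74006735 Ship To: Seattle",
    "Lunch menu for Friday",
]
segments = split_file(pages)
print(segments)
```

One limitation of this simple grouping is that two invoices scanned back to back would be merged into one segment, and a more capable classifier would be needed to tell them apart.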
Validate classification of the documents
Classification Station is
used to check, correct,
and confirm the results of
document classification
and splitting.
Extract information from the document
You can easily configure
data extraction to choose
the most suitable extractor
for each field.
Use a combination of rule-
based and model-based
approaches to ensure
smooth and accurate
processing of different
documents.
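The idea of choosing extractors per field can be sketched as an ordered list of extractors for each field, where the first one that returns a value wins. Everything here is hypothetical: the function names are invented, and `model_based_total` is only a stand-in for a real model call.

```python
import re
from typing import Callable, Dict, List, Optional

Extractor = Callable[[str], Optional[str]]


def rule_based_invoice_no(text: str) -> Optional[str]:
    match = re.search(r"Invoice No:\s*(\S+)", text)
    return match.group(1) if match else None


def rule_based_total(text: str) -> Optional[str]:
    # Only matches documents that label the amount exactly as "Total:".
    match = re.search(r"Total:\s*(\$[\d,]+\.\d{2})", text)
    return match.group(1) if match else None


def model_based_total(text: str) -> Optional[str]:
    # Stand-in for an ML extractor; here it simply returns the last dollar amount.
    amounts = re.findall(r"\$[\d,]+\.\d{2}", text)
    return amounts[-1] if amounts else None


# Per-field configuration: extractors are tried in order.
FIELD_EXTRACTORS: Dict[str, List[Extractor]] = {
    "invoice_no": [rule_based_invoice_no],
    "total": [rule_based_total, model_based_total],
}


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    results: Dict[str, Optional[str]] = {}
    for name, extractors in FIELD_EXTRACTORS.items():
        results[name] = None
        for extractor in extractors:
            value = extractor(text)
            if value is not None:
                results[name] = value
                break
    return results


text = "Invoice No: 456200-TZE1\nTotal Fee including Tax $15,039.20"
extracted = extract_fields(text)
print(extracted)
```

In this sample the rule for `total` does not match the label "Total Fee including Tax", so the fallback extractor supplies the value.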
Data extraction – from rules to AI to a hybrid approach
RegEx-Based Extractor
Machine Learning Extractor
Intelligent Form Extractor
Form Extractor
Rule-based
Model-based
Hybrid approach: a combination of both rule-based and model-based extractors
Rule-based or template-based approach
Relies on rules (like regular
expressions) and templates
(including anchors)
Processes structured data
in a fixed format
Ensures high accuracy for
already known documents
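A small example of the rule-based idea, using Python's standard `re` module on text taken from the sample invoice in this deck. Each pattern is anchored to a fixed label at the start of a line. The patterns and field names are our own assumptions and would have to be adapted to each known layout.

```python
import re
from typing import Dict, Optional

SAMPLE_TEXT = """PO: NP74006735
1 February 2020
PAYABLE WITHIN 15 DAYS OF RECEIPT
Invoice No: 456200-TZE1
Total Fee including Tax $15,039.20"""

# Each pattern relies on a fixed label (an anchor) followed by the value.
PATTERNS = {
    "po_number": re.compile(r"^PO:\s*(\w+)", re.MULTILINE),
    "invoice_no": re.compile(r"^Invoice No:\s*([\w-]+)", re.MULTILINE),
    "total": re.compile(r"^Total Fee including Tax\s*\$([\d,]+\.\d{2})", re.MULTILINE),
}


def extract_with_rules(text: str) -> Dict[str, Optional[str]]:
    """Apply every pattern and return the captured value, or None when a label is missing."""
    values: Dict[str, Optional[str]] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        values[name] = match.group(1) if match else None
    return values


rule_values = extract_with_rules(SAMPLE_TEXT)
print(rule_values)
```

The rules are accurate for this known layout, but a supplier who writes "Inv #" instead of "Invoice No:" would yield `None`, which is where template-less models help.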
Machine learning (ML) models as a template-less approach
Pre-trained ML models
• Invoices
• Invoices for Australia, India, Japan
• Receipts
• Purchase Orders*
• Utility Bills*
• Contracts**
Bring your own model
• Create new custom model
• Third-party models
Model retraining
• AI Center
• Data Manager
• Validation Station
* in public preview
** to be released to public preview in 21.4
Learn about sharing data for model retraining here.
Make use of pre-trained ML
models to process invoices,
receipts, utility bills, purchase
orders, contracts (more
models coming soon).
Bring your own model or
3rd party models and
incorporate them in your
automations.
Retrain the models to
improve their accuracy over
time!
Pre-trained ML models
ML model training via AI Center
You can use Data Manager to
train your custom ML models
or retrain the existing models
in AI Center. This helps
robots better understand the
specifics of your
documents. The more
you work with the model, the
more effective it becomes.
Thus, the accuracy of the
extracted data improves over
time.
Validate the extracted
information and handle
exceptions using
Validation Station.
Now, ML models can also
be retrained using the data
confirmed or corrected in
the Validation Station.
Validate extraction results
Let the classifiers and
extractors learn from the
data corrected and
validated in the
Classification Station and
Validation Station
respectively.
Train classifiers and extractors
Export the extracted data
Export the data for further
usage/automation.
For example:
- to an Excel spreadsheet
- to an email
- to SAP, etc.
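As a simple stand-in for the export step, the sketch below writes extracted fields to CSV text that a spreadsheet can open, using Python's standard `csv` module. The record contents and column names are illustrative. An actual automation would use the relevant Excel, email, or SAP integration instead.

```python
import csv
import io
from typing import Dict, List


def export_to_csv(records: List[Dict[str, str]], columns: List[str]) -> str:
    """Serialize extracted records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical extraction result for one invoice.
records = [{"invoice_no": "456200-TZE1", "po_number": "NP74006735", "total": "15039.20"}]
csv_text = export_to_csv(records, ["invoice_no", "po_number", "total"])
print(csv_text)
```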
Example scenario: Mortgage packet audit post-closing
1. Executed closing packet received and scanned
• Document scanning
• Digitization with OCR
2. Robot monitors folder for new files, initiates document process
• Unattended robot
3. Split the packet into underlying files for faster processing
• Pre-processing
• Document classification (keyword, anchors, model)
• RPA parallelization
4. Extract key loan information from documents
• Extraction
• ML model-based and/or rule-based (hybrid)
5. Check for signature present in executed fields
• Signature detection
6. Compare / validate information across documents
• Unattended robot
7. Send exceptions for human review
• Confidence / business rule-based exceptions
• Validation Station
• Attended robot
• Action Center (Unattended RPA)
8. Write results into line of business application
• Unattended robot
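The "confidence / business rule-based exceptions" step can be sketched as follows. The amounts are taken from the sample invoice shown earlier in this deck rather than from a mortgage packet. The confidence scores, the 0.80 threshold, and the function name are assumptions made purely for illustration.

```python
from decimal import Decimal
from typing import Dict, List

CONFIDENCE_THRESHOLD = 0.80  # hypothetical cut-off for sending a field to human review


def review_reasons(
    confidences: Dict[str, float],
    line_items: List[Decimal],
    tax: Decimal,
    total: Decimal,
    tax_rate: Decimal = Decimal("0.10"),
) -> List[str]:
    """Collect the reasons why a document should go to human review (empty list = none)."""
    reasons = [
        f"low confidence: {name}"
        for name, score in confidences.items()
        if score < CONFIDENCE_THRESHOLD
    ]
    subtotal = sum(line_items, Decimal("0"))
    # Business rule 1: the tax must equal the tax rate applied to the line items.
    if (subtotal * tax_rate).quantize(Decimal("0.01")) != tax:
        reasons.append("tax does not match line items")
    # Business rule 2: the total must equal line items plus tax.
    if subtotal + tax != total:
        reasons.append("total does not match line items plus tax")
    return reasons


line_items = [Decimal("1282.00"), Decimal("1884.00"), Decimal("5320.00"), Decimal("5186.00")]
reasons = review_reasons(
    {"invoice_no": 0.97, "total": 0.62},  # made-up confidence scores
    line_items,
    tax=Decimal("1367.20"),
    total=Decimal("15039.20"),
)
route = "human review" if reasons else "straight-through processing"
print(route, reasons)
```

Here the sample invoice passes both business rules, so the only reason for review is the low made-up confidence on the total.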
Flexible: Process various types and formats of documents
Intelligent: Mix & match different extractors and retrain them to achieve higher accuracy
End-to-end: Seamlessly automate high-volume document processing with end-to-end RPA & AI
Open & extensible: Bring your own or third-party components and use them within Document Understanding
Explore how UiPath Document Understanding can automate this
Define which documents you want to process
Get trained on Document Understanding in Academy
or via instructor-led training from UiPath
Give it a try – start Enterprise Trial
What’s next?
Enterprise Trial, Academy Course and more at uipath.com/document-understanding
Example AI use cases
Professional Svc
Data Extraction
from Charts
RFP Opportunities
Classification
Deal Guidance
Financial Svc
Fraud
Detection
Personal Loan
Approval
KYC – Entity
Identification
Retail
Packaging Quality
Evaluation
Inventory
Management
Merchandising
Planning
Healthcare
Real Time Pregnancy
Risk Evaluation
Patient Receivables
Management
Propensity of Claim
Denial Prediction
General
Help Desk
Answers
Customer Churn
Prediction
Resume Matching
Auditing – Anomaly Detection
AML Alert Classification
Product Recommendation
Fraudulent Medical Claim Prediction
Customer Complaints - Email Classification
Legal – Win/loss rate prediction
ID Information Extraction
Pricing Optimization
Readmission Prediction
Quality - Visual Inspection
*Review appendix for details
AI use cases: a tandem of UiPath Document Understanding, AI Center, and AI Computer Vision
Leveraging Document Understanding ecosystem
Optical character recognition (OCR)
Structured documents
Semi-structured documents
Unstructured documents
Natural language processing (NLP)
• UiPath Document OCR
• Kofax Omnipage
• ABBYY FineReader
• Google OCR
• Microsoft OCR
• Tesseract
• UiPath Form Extractor & Intelligent Form Extractor
• ABBYY FlexiCapture
• UiPath Machine Learning Extractor
• ABBYY FlexiCapture Distributed / for Invoices
• Hyperscience
• Ephesoft
• Vidado
• Rossum
• Omnius
• Microsoft Form Recognizer
• Amazon Textract
• Indico
• SortSpoke
• Botminds AI
• Xtracta
• Contract Wrangler
• Expert System
• Amazon Comprehend
• Stanford NLP Group
Document Understanding
21.4 FTS Release
Enhanced extraction capabilities:
• Field-level anchors for Form Extractor & Intelligent Form
Extractor
• Checkbox support for UiPath Document OCR
• Checkbox support in ML semi-structured models
• UiPath handwriting OCR accuracy improvements and support
for handwriting / machine text mixed fields
ML-based classification:
• ML Classifier for cloud and on-prem
ML models:
• Pretrained models: Invoices – India and Invoices – Australia GA;
ID Cards and Passports Preview
• ML model field limit increase to 300 fields with 2-key shortcuts
• Data Manager in AI Center cloud and removal of storage limits
• Auto-retraining capabilities (coming soon after 21.4)
Improved user experience:
• Document Understanding process for Studio / Studio Pro
(Preview)
“We have lots of instances where we need to grab documents from portals and sites.
Using Document Understanding, we can automatically read the PDFs, extract the
relevant information and input accurate data into our systems. This was a big
problem we’ve had for a very long time. Within two weeks, we were able to build a
working solution that can be rolled out to teams globally.”
Davendra Patel
Senior RPA Developer at Alter Domus
“Just 2 days after Classification/Validation station was released into Action Center we immediately
added the missing piece to the Document Understanding Process already built a few months back.
Now we can have SME’s classifying/splitting documents and non classified document. We made it so
smart that it will also extract data from unrecognised documents by the robots, by creating general
classifications & document types where SMEs can select these all and continue on with the workflow.
That’s what I call realtime validation and processing. The bonus was having Data Service to store the
data extracted from the process, and then we use this data to create the spreadsheets from this.
Amazing to see how many months of development coming together as an end to end solution.”
Davendra Patel
Senior RPA Developer at Alter Domus
“Combining RPA and AI is critical for document processing. Our customers often
have documents containing different layouts, different ways of writing, and different
types of fonts. AI is a powerful addition to any document processing workflows - it
can help read and understand the variations in documents, as well as learn from the
previous iterations. This in turn delivers faster and more accurate automation to
document processing.”
Lahiru Fernando
Executive Lead for RPA at Boundaryless Automation
“It has enormous potential for solving document processing challenges in industries
where there is an acute need for a solution, like banking, finance, healthcare,
insurance, and manufacturing. The way that it extracts data from different types of
documents and integrates with the other components of the UiPath Platform like
UiPath AI Center, makes it really compelling.”
Lahiru Fernando
Executive Lead for RPA at Boundaryless Automation
“When we compare what we had back then—just a year ago—to what we have
today, the improvements are amazing. There have been lots of new useful features
added. I also provided input on new features that could be added. UiPath listened to
me and the early users. As a result, Document Understanding has become an
excellent product. Now, we do not feel the need to look at other third-party tools as
we used to.”
Lahiru Fernando
Executive Lead for RPA at Boundaryless Automation
Help us help you – share data with us
Make Document Understanding better, faster, more accurate to achieve better results at no additional cost
1. A customer encounters a limitation or would like to request an enhancement of a Document Understanding component for their own benefit (including ML model retraining)
2. The customer joins UiPath Insider Preview Program
3. The customer shares sample documents or workflows exhibiting the limitation or enhancement
4. UiPath uses the data to improve the product at its own cost, unblocking the customer use case
5. The customer gets improved capabilities (including retrained ML models with higher accuracy) - at no cost
6. UiPath makes the enhanced capabilities available to other customers
* Alternative option - UiPath Data Sharing Agreement, please ask your contact at UiPath to generate it in SFDC.
TRAINING
Course Description
The 3-day instructor-led class for RPA Developers provides in-depth
knowledge of Document Understanding (DU) and offers practical exercises.
Course Outcomes
Upon completion of the class, participants will be able to:
• Understand how to leverage and implement the UiPath DU Framework to process
documents intelligently
• Classify different document types
• Process different types of documents by using rule-based,
model-based, and hybrid approaches
• Use out-of-the-box Machine Learning Extractors
• Train a custom Machine Learning Extractor
• Bring a human in the loop to validate different actions done by a
robot
• Build and implement an end-to-end DU workflow
Agenda
Day 1
- Introduction
- Document Understanding solution, use cases & RPA platform
- Intelligent OCR & Document Understanding activities
- Hands-on exercises in UiPath Studio
Day 2
- Out-of-the-box and custom ML Extractors
- Data Manager and AI Center (formerly AI Fabric)
- Hands-on exercises in UiPath Data Manager, AI Center, and Studio
Day 3
- Closing the loop – Human validation implementation
- DU Framework – how to build and implement production-ready
- Hands-on exercises in UiPath Studio
Pre-requisites:
• UiPath Foundation Diploma / participants must have hands-on experience with
the UiPath platform (Orchestrator, Studio, Robot)
• Basic understanding of ML/AI concepts is helpful, but not a must.
Note:
This is a paid training. Please reach out to your point of contact at UiPath to get more
details.