#ABBYYSummit15 - Training (4/6): ABBYY Technology Portfolio
TRANSCRIPT
TechnologySummit 2015
© Copyright 2015 ABBYY Technology Summit #ABBYYSummit15
ABBYY TECHNOLOGY PORTFOLIO OVERVIEW
Semyon Sergunin, Product Marketing Manager
Agenda
● ABBYY Technology
● Capture Use Cases
● Latest Technology Updates
● Developer Communications
● Technology Roadmap
ABBYY TECHNOLOGY
Evolution of ABBYY Technology
81 patents 21 patents 95 patents
ABBYY Technology Portfolio
[Diagram] Layers: Products, Platforms Supported, Technologies, Solutions
● Retail Products
● Corporate Application Products
● Software Development Toolkits
● Mobile Solutions
● Cloud Services
● Document Analysis
● Scan, Photo Imaging
● OCR, ICR
● BCR, Receipt Capture
● Layout Retention
● Invoice Capture
● Forms Capture, Classification
● Semantic Analysis
● Information Extraction
● Text Extraction
● Scanning
● Document Conversion
● Intelligent Data Capture
CAPTURE USE CASES
Common Scenarios
● Ad-hoc document scanning and conversion to searchable PDF or MS Office formats
● Text extraction and indexing for full-text search
● Batch documents/forms scanning + classification + data extraction
● Data extraction from mobile images: online and offline scenarios
● Screenshot reading for test automation
Ad-hoc Document Scanning and Conversion
Use Case
Ad-hoc scanning of individual documents for archiving, sharing via email or cloud, or editing. May include data import from business cards and receipts.
Examples
● Personal scanning applications bundled with MFPs and scanners (Toshiba Re-Rite, desktop FineReader)
● Cloud services (Ricoh ICE)
Technology
● FineReader Engine OCR SDK
  ● Scanning
  ● Imaging
  ● OCR
  ● ADRT (Doc. structure and layout)
  ● PDF, Word, Excel, PowerPoint export
  ● Receipt and Business Card Recognition
  Use DocumentConversion profile
● Cloud OCR SDK
Document Conversion Scenario
FineReader Engine or Cloud OCR SDK
Text Extraction for Full-Text Search
Use Case
Extracting text from images and PDF files for full-text indexing and search
Examples
● Document management
● Document archives
● Enterprise search
● eDiscovery
● Data leak prevention
Technology
● FineReader Engine OCR SDK (for ongoing processing)
  ● Imaging: deskew, stamps removal, noise reduction
  ● OCR (inc. low quality images)
  ● Text extraction
  ● PDF conversion
  Use TextExtraction profile
● Cloud OCR SDK (for backlog conversion)
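To illustrate the indexing half of this use case, here is a minimal sketch of an inverted index built over text that has already been extracted. The OCR step itself is not shown: the `extracted` dictionary and its file names are hypothetical stand-ins for text returned by an OCR engine, and the tokenizer is a deliberately simple assumption rather than anything a production search engine would use.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every token in the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical OCR output keyed by file name (illustrative data only).
extracted = {
    "invoice_001.pdf": "Invoice total due 30 days",
    "contract_17.tif": "Contract renewal due in March",
}
index = build_index(extracted)
print(sorted(search(index, "due")))
print(sorted(search(index, "invoice due")))
```

In a real deployment the extracted text would more likely be handed to an existing search platform; the sketch only shows why plain text output is all the indexing side needs.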
Batch Classification and Data Capture
Use Case
Document and form scanning, classification, data extraction, image-to-PDF conversion
Examples
● Forms and surveys processing systems
● Invoice processing for Accounts Payable
● Loan and mortgage processing systems for financial institutions
Technology
● FlexiCapture Engine
  ● Imaging
  ● OCR
  ● Classification
  ● Advanced Data Capture
  ● PDF Conversion
● FineReader Engine
  ● Scanning
  ● Advanced Imaging
  ● Word, Excel conversion
ABBYY Technology Portfolio
FineReader and FlexiCapture Engine
FlexiCapture Engine
Mobile Capture – Offline
Use Case
OCR and data capture from mobile devices, disconnected from the Internet
Examples
● Mobile app for insurance agents (application forms and ID capture)
● Mobile app for truck drivers (bills of lading, invoices)
● Various personal apps: restaurant menu translation, check splitting
Technology
● Mobile OCR SDK
  ● Imaging
  ● OCR
  ● Business Card Recognition
● TouchTo for iPad (demo)
● Real-Time Recognition SDK (coming soon…)
Mobile Capture – Online
Use Case
Data captured from a mobile device initiates workflow at the back-end system
Examples
● Mobile photo of a business card creates a new contact in CRM system
● Receipt photos start expense reimbursement process
● ID and car photos initiate insurance claim
Technology
● Mobile Imaging SDK
  ● Image quality check
  ● Pre-processing and compression
● FlexiCapture Engine
  ● Advanced image processing
  ● OCR and data extraction
  ● Automatic data verification
  ● Data and Image export to the back-end system
ABBYY Technology Portfolio
Mobile Imaging SDK + FineReader and FlexiCapture Engine
Software Test Automation
Use Case
Testing the user interface of software applications and cloud services
Examples
● Test automation software
● Internal test automation systems
Technology
● FineReader Engine SDK
  ● Imaging
  ● Document Analysis
  ● OCR
  ● XML export
    – Text
    – Position
    – Confidence level
  Use TextExtraction profile
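As a rough illustration of how the three exported values (text, position, confidence level) could drive a UI assertion, the sketch below parses a simplified recognition result for a screenshot. The XML element and attribute names are hypothetical and chosen for readability; they are not the actual FineReader Engine XML export schema, and the confidence threshold is an arbitrary assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified OCR result for a screenshot. Element and attribute
# names are illustrative assumptions, not the real FineReader Engine schema.
SAMPLE_XML = """
<page>
  <word text="Save" left="40" top="12" right="88" bottom="30" confidence="97"/>
  <word text="Cancel" left="100" top="12" right="164" bottom="30" confidence="94"/>
  <word text="He1p" left="180" top="12" right="220" bottom="30" confidence="41"/>
</page>
"""


def find_label(xml_text, label, min_confidence=80):
    """Return (left, top, right, bottom) of a recognized word equal to `label`,
    or None if it is missing or recognized below the confidence threshold."""
    root = ET.fromstring(xml_text.strip())
    for word in root.iter("word"):
        if word.get("text") != label:
            continue
        if int(word.get("confidence")) < min_confidence:
            continue
        return tuple(int(word.get(k)) for k in ("left", "top", "right", "bottom"))
    return None


# A test script could assert that a button label is present and then click
# the centre of its bounding box.
print(find_label(SAMPLE_XML, "Save"))
print(find_label(SAMPLE_XML, "Help"))
```

The confidence value matters here because a misread label (such as the low-confidence "He1p" above) should be flagged for review rather than silently treated as a missing control.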
TECHNOLOGY UPDATES
FineReader Engine Overview
FineReader Engine Versions Currently Used by ATS Attendees
● FRE 11: 67%
● FRE 10.5: 18%
● FRE 9.5: 10%
● FRE 8.5: 5%
Top OCR and Capture SDK Updates
● OCR Accuracy/Performance: English, Japanese, Chinese, Korean, Arabic, Farsi, Thai
● New! FineReader Engine 11 Release 6
● New! FlexiCapture Engine 11
OCR Accuracy: English
[Chart] English, accuracy/speed, normal mode. X axis: speed, pages per minute (23 to 35). Y axis: accuracy, word level (97.8% to 98.5%). Series: FRE 10.5 R5, FRE 11 R1, FRE 11 R4/5/6. Callout: 14% fewer errors.
Test machine: Intel® Core™ i5-4690 CPU (3.50GHz, 4 physical cores) with 8GB RAM
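A note on reading the 14% error callout: a figure like this is normally relative to the error rate (100% minus accuracy), not to accuracy itself, so a small move on the accuracy axis can be a large relative drop in errors. The sketch below shows the arithmetic. The two accuracy values are hypothetical numbers picked from within the chart's axis range to make the calculation concrete; they are not read from the slide.

```python
def relative_error_reduction(old_accuracy, new_accuracy):
    """Fraction by which the error rate shrinks when accuracy improves.

    Accuracies are fractions in [0, 1); the error rate is 1 - accuracy.
    """
    old_error = 1.0 - old_accuracy
    new_error = 1.0 - new_accuracy
    return (old_error - new_error) / old_error


# Hypothetical word-level accuracies (assumed for illustration only).
old_acc, new_acc = 0.9800, 0.9828
print(f"{relative_error_reduction(old_acc, new_acc):.0%} fewer errors")
```

Under these assumed values, a gain of only 0.28 percentage points in accuracy corresponds to 14% fewer misrecognized words.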
OCR Speed: English
[Chart] English, accuracy/speed, fast mode. X axis: speed, pages per minute (60 to 110). Y axis: accuracy, word level (94.0% to 98.0%). Series: FRE 10.5 R5, FRE 11 R1, FRE 11 R4/5, FRE 11 R6. Callouts: 14% slower; 49% faster.
Test machine: Intel® Core™ i5-4690 CPU (3.50GHz, 4 physical cores) with 8GB RAM
OCR Accuracy/Speed: Japanese
[Chart] Japanese, accuracy/speed, fast mode. X axis: speed, pages per minute (25 to 75). Y axis: accuracy, character level (80% to 100%). Callout: 160% faster.
[Chart] Japanese, accuracy/speed, normal mode. X axis: speed, pages per minute (14 to 34). Y axis: accuracy, character level (96.1% to 96.7%). Callout: 11% fewer errors.
Legend labels across both charts: FRE 10.5 R5, FRE 11 R1, FRE 11 R4/5, FRE 11 R6, FRE 11 R4/5/6.
Test machine: Intel® Core™ i5-4690 CPU (3.50GHz, 4 physical cores) with 8GB RAM
Improved Chinese, Korean and Arabic
[Chart] Korean, accuracy/speed, fast mode. X axis: speed, pages per minute (15 to 40). Y axis: accuracy, % (85.0% to 95.0%). Series: FRE 10.5 R5, FRE 11 R1/2/3, FRE 11 R4/5.
[Chart] Chinese PRC, accuracy/speed, normal mode. X axis: speed, pages per minute (5 to 12). Y axis: accuracy, % (95.0% to 100.0%). Series: FRE 10.5 R5, FRE 11 R1/2, FRE 11 R3, FRE 11 R4/5.
[Chart] Chinese Taiwan, accuracy/speed, normal mode. X axis: speed, pages per minute (15 to 45). Y axis: accuracy, % (95.0% to 98.0%). Series: FRE 10.5 R5, FRE 11 R1/2, FRE 11 R3, FRE 11 R4/5.
[Chart] Arabic, accuracy/speed, normal mode. X axis: speed, pages per minute (2 to 18). Y axis: accuracy, % (70.0% to 80.0%). Series: FRE 10.5 R5, FRE 11 R1/2/3, FRE 11 R4/5.
New! FineReader Engine 11 R6
● Windows 10 support
● Japanese, Arabic, Thai, and Farsi OCR accuracy improvements
● PDF export improvements:
  ● Large document conversion to searchable PDF
  ● New compression for PDF files containing pages of different color
● Improved product documentation
PDF Conversion – Processing Time
[Chart] X axis: number of pages (0 to 3000). Y axis: time in seconds (0 to 5000). Series: Batch Processor with standard export; Batch Processor with ExportFileWriter. Data labels (seconds): 50.18, 165.41, 263.53, 542.08, 1091.23, 1370.34 and 53.89, 187.77, 332.02, 1156.33, 3180.78, 4364.64. Callout: 3x faster.
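The "3x faster" callout can be checked against the data labels on the PDF conversion chart. The sketch below makes two assumptions that the extracted slide text does not confirm: that the two runs of six labels pair up point by point at the same page counts, and that the smaller values belong to the ExportFileWriter series. The page counts themselves are not recoverable, so points are simply numbered.

```python
# Data labels from the chart, in seconds. Which series each run belongs to
# is an assumption: the smaller values are taken to be ExportFileWriter.
export_file_writer = [50.18, 165.41, 263.53, 542.08, 1091.23, 1370.34]
standard_export = [53.89, 187.77, 332.02, 1156.33, 3180.78, 4364.64]


def speedups(slow, fast):
    """Ratio of slow to fast processing time for each paired measurement."""
    return [s / f for s, f in zip(slow, fast)]


ratios = speedups(standard_export, export_file_writer)
for point, ratio in enumerate(ratios, start=1):
    print(f"point {point}: {ratio:.2f}x")
```

Under these assumptions the two approaches are nearly level for the smallest document, and the gap widens to a little over 3x at the largest one, which is consistent with the bullet about large document conversion.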
New! FlexiCapture Engine 11
● Improved OCR Technology via FRE11 API
● New OCR and ICR languages, new barcode types, improved image pre-processing
● Development Improvements: 64-bit support, asynchronous scanning, new Java Native Interface (JNI) support
● Accurate data in exported PDFs - includes verification results
● Back-up server support for Network License Manager
DEVELOPER COMMUNICATIONS
Follow Updates at https://ABBYY.technology
● Product Updates
● Technology Insides
● Code Samples
● FAQ
● News and Events
Access to R&D at http://forum.ocrsdk.com/
● Managed by R&D
● All SDK products
● All Technical Questions
TECHNOLOGY ROADMAP
New Products Are Coming
● Receipt Capture SDK
  ● Phase 1 (Q2 2016) – Shopping receipts: top retail vendors
  ● Phase 2 (TBD) – Travel receipts: hotels, restaurants, car rentals, gas stations
● Real-time Recognition SDK
  ● Beta – available
  ● Release – Q2 2016
OCR Technology Roadmap
FineReader Engine 12 OCR SDK – planned for Q1 2017
● OCR Improvements: Japanese, Chinese, Korean, Arabic
● Adding OCR for Urdu, Pashto
● Document Conversion Improvements: accurate and editable layouts in MS Office formats, electronic PDF conversion
● Image pre-processing for mobile photos of receipts and IDs
OCR Technology Roadmap
Mobile OCR SDK 5.0 – planned for 2016
● Based on compacted OCR Tech from FRE 11 R5
● Improved Japanese, Chinese and Korean
● New languages available: Thai, Vietnamese, Hebrew, Arabic
Embedded SDK 3.0 – planned for H1 2016
● Based on compacted OCR Tech from FRE 11.5
Cloud OCR SDK
● Seamless incremental updates
● Enhance service features: stability dashboard, data protection, 24/7 SLA
QUESTIONS