10 Tips for Making Your SharePoint Scanning Project a Success

Post on 13-Nov-2014

DESCRIPTION

PSIGEN presentation at SharePoint Intelligence Anaheim. During this QuickStart event presentation, we gave an overview of success factors and planning required t

TRANSCRIPT

10 Tips To Make Your SharePoint Scanning Project a Success

Stephen Boals, 949-916-7700 x230

Plan Your Storage

Description                             Number of Pages             Storage
1 Scanned Page – 8.5 x 11               1                           30-50KB
1 Scanned Page – 11 x 17                1                           100KB
1 File Cabinet – 4 drawers              10,000                      500MB
1 Box                                   2,500                       125MB
1 Linear Inch                           100                         5MB
1 E Size Engineering Drawing (48x36)    16 (8.5 x 11 equivalents)   800KB

How much storage?
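
The rule-of-thumb figures in the storage table can be turned into a quick estimate. The following is a minimal sketch, not a sizing tool: the function name `estimate_storage_mb` and the unit keys are made up for illustration, 50KB per page is the upper end of the 30-50KB range, and 1MB is treated as 1000KB to match the table.

```python
# Rule-of-thumb page counts taken from the storage table.
PAGES_PER_UNIT = {
    "file_cabinet": 10_000,  # one 4-drawer file cabinet
    "box": 2_500,
    "linear_inch": 100,
}

# Assumption: upper end of the 30-50KB range for an 8.5 x 11 page.
KB_PER_PAGE = 50


def estimate_storage_mb(units: dict, kb_per_page: float = KB_PER_PAGE) -> float:
    """Estimate storage in MB for a mix of cabinets, boxes and linear inches."""
    pages = sum(PAGES_PER_UNIT[name] * count for name, count in units.items())
    return pages * kb_per_page / 1000  # 1MB = 1000KB, as in the table


if __name__ == "__main__":
    # Hypothetical paper inventory: 12 cabinets and 40 boxes.
    print(estimate_storage_mb({"file_cabinet": 12, "box": 40}), "MB")
```

Passing a lower `kb_per_page` (for example 30) gives the optimistic end of the range.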

Key Factors in Storage and Sizing
• DPI Setting
• Color/Black and White/Grayscale
• Image Format – PDF or TIFF?
• Image Processing technology can reduce file size by 10-30%
– Despeckle
– Border removal
– 3 hole punch removal
– Binarization

Scanning Mode/DPI             File Size
Black and White – 200 DPI     26K
Black and White – 300 DPI     38K
Black and White – 400 DPI     51K
Black and White – 600 DPI     80K
Greyscale – 300 DPI           301K
Color – 300 DPI               577K

File Size Comparison
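
To see what these per-page sizes mean at library scale, the sketch below multiplies them out. It assumes the sizes are per page, that 1GB is 1,000,000KB, and that the 10-30% image-processing reduction applies uniformly; `library_size_gb` and the mode keys are illustrative names, not part of any product.

```python
# Per-page sizes in KB, copied from the file size comparison table.
KB_PER_PAGE_BY_MODE = {
    ("bw", 200): 26,
    ("bw", 300): 38,
    ("bw", 400): 51,
    ("bw", 600): 80,
    ("gray", 300): 301,
    ("color", 300): 577,
}


def library_size_gb(pages: int, mode: str, dpi: int, cleanup_reduction: float = 0.0) -> float:
    """Estimate total size in GB for `pages` scanned at the given mode and DPI.

    cleanup_reduction is the fraction saved by image processing (0.10 to 0.30
    according to the slide); 0.0 means no reduction is assumed.
    """
    kb_per_page = KB_PER_PAGE_BY_MODE[(mode, dpi)] * (1.0 - cleanup_reduction)
    return pages * kb_per_page / 1_000_000


if __name__ == "__main__":
    for (mode, dpi) in KB_PER_PAGE_BY_MODE:
        size = library_size_gb(1_000_000, mode, dpi)
        print(f"{mode:>5} @ {dpi} DPI: {size:7.1f} GB per million pages")
```

On these figures, a million pages in color at 300 DPI takes roughly 22 times the space of the same pages in black and white at 200 DPI.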

SharePoint Storage Architecture

• Image file sizes can lead to DB issues if proper planning does not take place and storage considerations are not examined.

• Consider the use of Remote BLOB Storage (RBS)

Latest Content Database Limitation
• Content databases of up to 4 TB are supported when the following requirements are met:
– Disk sub-system performance of 0.25 IOPs per GB. 2 IOPs per GB is recommended for optimal performance.

– You must have developed plans for high availability, disaster recovery, future capacity, and performance testing.

• http://technet.microsoft.com/en-us/library/cc298801.aspx

• http://sharepoint.microsoft.com/blog/pages/BlogPost.aspx?pID=988
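
The IOPs-per-GB figures above translate directly into a disk performance target for a given database size. This is a rough sketch under stated assumptions: 4 TB is taken as 4096 GB, `measured_iops` would come from your own storage benchmark, and the function names are hypothetical.

```python
MIN_IOPS_PER_GB = 0.25          # minimum for large content databases, per the slide
RECOMMENDED_IOPS_PER_GB = 2.0   # recommended for optimal performance, per the slide


def iops_targets(db_size_gb: float) -> tuple:
    """Return (minimum, recommended) disk IOPs for a content database of this size."""
    return db_size_gb * MIN_IOPS_PER_GB, db_size_gb * RECOMMENDED_IOPS_PER_GB


def disk_verdict(db_size_gb: float, measured_iops: float) -> str:
    """Compare a measured IOPs figure against the two targets."""
    minimum, recommended = iops_targets(db_size_gb)
    if measured_iops < minimum:
        return "below minimum"
    if measured_iops < recommended:
        return "supported, below recommended"
    return "meets recommended"


if __name__ == "__main__":
    print(iops_targets(4096))          # targets for a 4 TB database
    print(disk_verdict(4096, 3000))    # hypothetical benchmark result
```

Disk performance is only one of the listed requirements; the planning items (high availability, disaster recovery, capacity, testing) cannot be checked this way.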

• Backup and restore.
• Skilled administrators.
• Complexity of customizations and configurations on SharePoint Server 2010 may necessitate refactoring (or splitting) of data into multiple content databases.

• 100-200GB is still the best size for backup and restore, and overall manageability.

Considerations on Content DB Size

Microsoft RBS Recommendations

• RBS provides benefits in the following cases:
– The content databases are larger than 500 gigabytes (GB).
– The BLOB data files are larger than 256 kilobytes (KB).
– The BLOB data files are at least 80 KB and the database server is a performance bottleneck. In this case, RBS reduces both the I/O and processing load on the database server.
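
The three RBS conditions can be written as a simple check. This is only a sketch of the rule of thumb above, not a Microsoft tool; `rbs_likely_beneficial` and its parameters are hypothetical names, and "typical BLOB size" is assumed to mean the usual size of a stored file.

```python
def rbs_likely_beneficial(db_size_gb: float,
                          typical_blob_kb: float,
                          db_server_is_bottleneck: bool) -> bool:
    """Return True if any of the three RBS conditions from the slide holds."""
    if db_size_gb > 500:                 # content database larger than 500 GB
        return True
    if typical_blob_kb > 256:            # BLOB files larger than 256 KB
        return True
    # BLOB files of at least 80 KB only qualify when the database
    # server is also a performance bottleneck.
    if typical_blob_kb >= 80 and db_server_is_bottleneck:
        return True
    return False


if __name__ == "__main__":
    # Hypothetical library of 10-page black-and-white PDFs at about 26KB per page.
    print(rbs_likely_beneficial(db_size_gb=300, typical_blob_kb=260,
                                db_server_is_bottleneck=False))
```

Single-page black-and-white scans sit well under 80 KB, so for scanned content the database size and multi-page file size tend to be the deciding inputs.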

Use Folder and File Names

No More Folders!!

• Maybe not
• In the majority of our implementations, customers required folder naming in libraries

• Why?

Why Folders?

• Users are familiar with Folder structures
– Easier adoption
• Use of 3rd party tools
– Colligo Briefcase
– Access Tools
• WebDAV applications
• Office

Folders and Filenames

• Search Aid
• Flexibility for migration
• Structured data for DR
• Overall Contingencies
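
One way to get these benefits is to build folder and file names from the same index values captured at scan time. The sketch below is illustrative only: the folder layout (document type / customer / year), the field names, and the list of characters treated as unsafe in SharePoint names are all assumptions to adapt to your environment.

```python
import re
from datetime import date

# Assumption: characters commonly rejected in SharePoint file and folder names.
_UNSAFE = re.compile(r'[~"#%&*:<>?/\\{|}]')


def safe(part: str) -> str:
    """Replace characters that are assumed to be invalid in a name."""
    return _UNSAFE.sub("_", part.strip())


def build_path(doc_type: str, customer: str, doc_date: date,
               doc_id: str, ext: str = "pdf") -> str:
    """Build 'DocType/Customer/Year/Customer_DocType_Date_Id.ext' from index values."""
    folder = "/".join([safe(doc_type), safe(customer), f"{doc_date:%Y}"])
    filename = (f"{safe(customer)}_{safe(doc_type)}_"
                f"{doc_date:%Y-%m-%d}_{safe(doc_id)}.{ext}")
    return f"{folder}/{filename}"


if __name__ == "__main__":
    print(build_path("Invoices", "Acme & Co", date(2014, 11, 13), "12332"))
```

Because the name carries the key index values, a file copied out of the library for migration or disaster recovery still identifies itself without the SharePoint columns.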

Think Search

Capture drives Search
• How do you want to find your documents?
• Index fields (Columns in SharePoint) are the critical focus.
• Use Term Store and Managed Metadata
• Rules to live by:
– No more than 5 defining fields per document type
– Always include dates
– Steer clear of field “overdrive”
• Automation and data sources can let you go beyond
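
The "rules to live by" can be checked against a planned column list before a library is built. This sketch reads the first rule as "no more than five defining fields", which is an interpretation of the slide; the function name and the simple `{column: type}` layout are hypothetical.

```python
MAX_DEFINING_FIELDS = 5  # assumption: the slide's rule read as an upper limit


def check_document_type(name: str, fields: dict) -> list:
    """Return a list of rule violations for one document type.

    `fields` maps column name to a simple type label such as "text" or "date".
    """
    problems = []
    if len(fields) > MAX_DEFINING_FIELDS:
        problems.append(
            f"{name}: {len(fields)} fields, more than {MAX_DEFINING_FIELDS}")
    if "date" not in fields.values():
        problems.append(f"{name}: no date field")
    return problems


if __name__ == "__main__":
    invoice = {"Vendor": "text", "Invoice Number": "text",
               "Invoice Date": "date", "Amount": "number"}
    print(check_document_type("Invoice", invoice))  # no violations expected
```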

OCR for Search

Full Text

• The Insurance Policy
• Adobe PDF Image + Hidden Text
– Industry Standard
– One “Package” for image and OCR text
– Portable
• Provides the ultimate in searchability with iFilter

Define Your Scanning Model

Scanning Models

• Centralized Capture – Documents are scanned at one location and in “batches” at a particular time or times

• De-centralized Capture – Documents are still scanned in batches at a particular time, but are now scanned at multiple locations

• Distributed Capture – Documents are scanned at the point of transaction and at multiple locations

Trend from Centralized to Distributed Scanning

Choose the Correct Scanners

Choosing your Weapon

• MFPs or Scanners??

MFPs – The Pros

• Leverage your existing investment in the MFP
• Most copier maintenance plans do not charge for scans
• MFP manufacturers are really focusing on scanning
• Network scanning functions:
– Scan to email
– Scan to Windows Folders
– Scan to FTP
• One-to-Many relationship: all workers can use one device.

MFPs – The Cons

• Contention – “line at the copier”
• Poor performance with differing paper sizes
• Lack of color dropout (Scanning blue or black backgrounds will result in a black page)
• Small Document Feeder sizes (50 – 100 pages)
• On average, file sizes are 10-20% larger
• Duplex scanning/DPI increase greatly slows down rated speed
• Black and White scanning only on some models

Scanners – The Pros

• Convenience – scan at your desk
• Duplexing does not slow down scanner
• Color dropout
• Superior image quality due to enhancement features
• Ease in handling differing paper sizes/types
• Larger document feeder selections (up to 1000+ pages)

Scanners – The Cons

• One to One relationship – directly connected to PC

• Additional Maintenance costs
• Can be quite expensive to outfit your whole organization.

When to use a Dedicated Scanner

• Scanning 10+ documents per day
• Workers that are constantly scanning throughout the day
• Mixed paper sizes, weights and colors
• Poor quality, older documents or when image enhancement is required
• OCR or ICR applications
• High volume copying and printing environments
• Large Document scanning
• High security environments

Key Points When Purchasing

• Scanning speed
• Document Feeder Capacity
• Daily Duty Cycle
• Scanning Mode
• Warranty and Service

Correctly Configure Devices

Too Many IT Killers

Focus

• Almost all MFPs Scan in Color by Default

• DPI is always set above 200 DPI
• Huge network impact
• Huge DB Impact
• Huge drain on resources

Recommendations-Default

• 200 DPI
• Black and White
• Only add color for specific departmental needs
• Use TIFF and PDF (compressed)
• Linearized PDF (WebFast)
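
The recommended defaults can be captured as a profile and used to flag devices that drift from them. This is a sketch only: `ScanProfile`, `profile_warnings`, and the idea of a per-department color allowance are illustrative names and assumptions, not settings from any particular MFP or capture product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanProfile:
    """A device scan profile; the default values mirror the recommendations."""
    dpi: int = 200
    mode: str = "bw"           # "bw", "gray" or "color"
    file_format: str = "pdf"   # "pdf" or "tiff"
    compressed: bool = True


def profile_warnings(profile: ScanProfile, department: str,
                     color_departments: frozenset = frozenset()) -> list:
    """List the ways a profile departs from the recommended defaults."""
    warnings = []
    if profile.dpi > 200:
        warnings.append(f"{profile.dpi} DPI is above the 200 DPI default")
    if profile.mode != "bw" and department not in color_departments:
        warnings.append(f"{profile.mode} scanning without a departmental need")
    if profile.file_format not in ("pdf", "tiff"):
        warnings.append(f"format {profile.file_format} is not TIFF or PDF")
    if not profile.compressed:
        warnings.append("compression is disabled")
    return warnings


if __name__ == "__main__":
    # A typical out-of-the-box MFP setting: color at 300 DPI.
    factory = ScanProfile(dpi=300, mode="color")
    for warning in profile_warnings(factory, "Accounting"):
        print(warning)
```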

Scan or Capture?

Scanning Challenges

• Basic capabilities
• No standardization
• Documents not searchable
• Time intensive
• Lack of integration into Enterprise Applications

Capture vs. Scanning

• A scanning application is just a means to quickly and easily convert paper to digital form. Such applications are well suited to environments with very basic needs, and to what I call "onesie-twosie" scanning, or low-volume environments.

• Capture software can be utilized for basic scanning needs, but takes you to a whole new level from a "capture" perspective. These applications typically have a number of ways to "slice and dice" documents, and really focus on efficiency and on minimizing the time required to scan, index and capture data.

Why capture?

• Reduce the required time for scanning and indexing documents = Efficiency

• Enable a standard process for scanning, capturing, indexing, naming, and processing = Standardization

• Provide numerous gateways to multiple repositories = Flexibility

Automation is Key

Extraction Technologies

Advanced Data Extraction (ADE)

Zone OCR

Manual Entry

What is ADE?

Automated Routing

12332

ATT

1232.00

Use Barcodes and OMR

Routing/Separator Sheets

• Utilize barcodes and/or Optical Mark Recognition (OMR)

• Capture reads and determines routing based on them
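
A minimal sketch of how separator sheets can drive routing: each sheet carries a barcode naming a destination, and every page after it belongs to that destination until the next sheet. The `ROUTE:` prefix, the destination strings, and the `(page_id, barcode)` input format are invented for illustration; a real capture product reads barcodes from the page images and defines its own format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

SEPARATOR_PREFIX = "ROUTE:"  # hypothetical marker for a routing barcode


@dataclass
class Document:
    destination: str
    pages: List[str] = field(default_factory=list)


def split_batch(pages: List[Tuple[str, Optional[str]]]) -> List[Document]:
    """Split a scanned batch into documents using separator-sheet barcodes.

    `pages` is a list of (page_id, barcode_value_or_None) in scan order.
    """
    documents: List[Document] = []
    current: Optional[Document] = None
    for page_id, barcode in pages:
        if barcode and barcode.startswith(SEPARATOR_PREFIX):
            # A separator sheet starts a new document and is not stored itself.
            current = Document(destination=barcode[len(SEPARATOR_PREFIX):])
            documents.append(current)
            continue
        if current is None:
            # Pages scanned before any separator sheet have no routing yet.
            current = Document(destination="Unrouted")
            documents.append(current)
        current.pages.append(page_id)
    return documents


if __name__ == "__main__":
    batch = [("p1", "ROUTE:Invoices/ATT"), ("p2", None), ("p3", None),
             ("p4", "ROUTE:HR/JDoe"), ("p5", None)]
    for doc in split_batch(batch):
        print(doc.destination, doc.pages)
```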

Intelligent Routing

How are they Created?

• Most Capture Apps Include them

• Ad Hoc and Bulk Generation

• Excel and Word Macros

• Custom SP Apps

JDoe

Summary: Plan, Plan, Plan

Keys to Project Success

• All items in this presentation are critical to overall planning

• Focus on meeting needs and driving users to proper use of technology

• Start small – POC
• Learn from smaller projects
• Expand

Who is PSIGEN?

• Founded 1995
• Mature capture company
• Innovative Capture
• Focus on Automation
• Integration with 56 ECM systems

Who uses PSIGEN products?

Links

• www.psigen.com
• www.scanningwithsharepoint.com
