presented by rebecca shwayri. introduction to predictive coding and its benefits how can records...

Presented by Rebecca Shwayri


Post on 31-Dec-2015


Page 1: Presented by Rebecca Shwayri. Introduction to Predictive Coding and its benefits How can records managers use Predictive Coding Predictive Coding in Action

Presented by

Rebecca Shwayri

Page 2:

Introduction to Predictive Coding and its benefits

How can records managers use Predictive Coding

Predictive Coding in Action

Limitations of keyword searches & human review

Predictive Coding Defensibility

Page 3:

What is predictive coding?

How does it work?

Page 4:

NOT Magic

NOT a cure for cancer

NOT based on voodoo

Page 5:

Keyword searching

Concept searching

E-mail threading

These methods can be useful but do not predict relevance of future documents based on past documents

Page 6:

Expert (you) develops an understanding of the documents and classifies the documents

Old tech

In common use today

Example: Spam Filter, Amazon.com

Math and Statistics

Page 7:

Algorithms

Mathematical model built

Accuracy depends on quality of training set

Page 8:

Predictive Coding in Practice

Random Sample

Single person reviews & codes the Sample: Responsive or Non-Responsive

Computer learns & predicts

Computer categorizes all remaining documents: Responsive or Non-Responsive

Repeat as needed

Review 2000-5000 randomly selected documents

One person’s time for 15-39 hours
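The loop on this slide (code a sample, let the computer learn, then categorize the rest) can be sketched in a few lines. This is only an illustration, assuming a simple naive Bayes word-count model; the slides do not say which algorithm a given predictive coding product uses, and the sample documents and function names here are hypothetical.

```python
import math
from collections import Counter

def train(labeled_docs):
    """Build word counts per label from the reviewer-coded sample."""
    word_counts = {}          # label -> Counter of words seen under that label
    doc_counts = Counter()    # label -> number of coded documents
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the most likely label using naive Bayes with add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical sample coded by a single reviewer.
sample = [
    ("invoice backdated to hide the loss", "responsive"),
    ("contract payment was never made", "responsive"),
    ("lunch menu for friday", "non-responsive"),
    ("office party friday afternoon", "non-responsive"),
]
model = train(sample)

# The computer categorizes the remaining (uncoded) documents.
remaining = ["backdated contract invoice", "friday lunch party"]
predictions = [predict(model, doc) for doc in remaining]
print(predictions)
```

In practice the "repeat as needed" step would add newly coded documents to the sample and retrain.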

Page 9:

Dramatic reduction in e-discovery costs

More accurate than human review and keyword search

Light years faster than human review and keyword search

Page 10:

Fact driven, not fear driven, settlements

Learn the facts of the case in a few days rather than over months or years using traditional methods of review

Helps avoid litigation – uncovers the facts more quickly

Use as an information governance tool

Page 11:

Method             | Recall Ratio  | Cost            | Speed
Keywords           | 20 percent    | High $$$        | Slow - Misses content
Human Review       | 60 percent    | Very High $$$$  | 60 docs / hr
Predictive Coding  | 75-98 percent | Low $           | >80-250x faster
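To put the speed column in perspective, here is a rough arithmetic sketch. It assumes the "80-250x faster" figure is measured against the 60 docs / hr human review rate, and it uses a hypothetical collection of 20,000 documents; it also leaves out the time spent coding the training sample.

```python
HUMAN_DOCS_PER_HOUR = 60      # human review rate from the table
SPEEDUP_RANGE = (80, 250)     # "80-250x faster", assumed relative to human review

def review_hours(num_docs, docs_per_hour=HUMAN_DOCS_PER_HOUR):
    """Hours needed to review num_docs at a fixed rate."""
    return num_docs / docs_per_hour

num_docs = 20_000             # hypothetical collection size
human_hours = review_hours(num_docs)
pc_hours = tuple(human_hours / speedup for speedup in SPEEDUP_RANGE)

print(f"Human review: about {human_hours:.0f} hours")
print(f"Predictive coding: about {pc_hours[1]:.1f} to {pc_hours[0]:.1f} hours")
```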

Page 12:

Information Governance Tool (proactive)

Litigation Tool (reactive)

Page 13:

Encompasses a variety of disciplines

Records Management

Knowledge Management

Information Security and Privacy

Page 14:

Data breach risks

E-discovery costs

Unable to locate documents needed for the business units

Page 15:

Standardized IG policies

Reduce the need to review every single document to determine the importance of the document to the company

Locate data within the company’s IT infrastructure and categorize it appropriately for the business units

Locate data that needs to be destroyed

Page 16:

Example: Company is sued in a dispute involving fraud and breach of contract

Custodians: 20 Potential Custodians with average e-mail box of 40 GB each (800 total GB of e-mail data)

Other electronic Files: 200 GB

Total Data: 1 Terabyte

Page 17:

Company is served with a Request for Production of Documents by Plaintiffs’ Counsel

Plaintiffs’ Counsel demands searching through ESI of custodians

Plaintiffs’ Counsel makes a broad demand for accounting records

Page 18:

What do you do?

Keyword search 1 TB of data? How do you keyword search fraud? Information disadvantage!

Human review? It will take many, many months and millions of dollars to review 1 TB of data!

Page 19:

Use Predictive Coding

Should you disclose?

One school of thought suggests disclosing use of predictive coding to opposing counsel, agreeing to precision and recall rates (Full Agreement and Full Disclosure)

The other school of thought suggests making no disclosures (Avoid litigation associated with use of predictive coding)

Page 20:

Recall (Completeness)

Recall measures how successful the system was in finding all of the responsive documents.

If 1,000 documents in the full set were actually responsive, but the system only marked 750 of those documents responsive, then the recall would be 75 percent.

Precision (Accuracy)

Precision measures how often the documents that were marked responsive were actually responsive.

If the system marked 10 documents responsive, and only six of them were actually responsive, then the precision would be 60 percent.
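The two definitions reduce to simple ratios. A minimal sketch using the numbers from this slide (the function and variable names are illustrative only):

```python
def recall(correctly_marked, total_responsive):
    """Share of all truly responsive documents that the system found."""
    return correctly_marked / total_responsive

def precision(correctly_marked, total_marked):
    """Share of documents marked responsive that really are responsive."""
    return correctly_marked / total_marked

slide_recall = recall(750, 1000)     # 750 of 1,000 responsive documents found
slide_precision = precision(6, 10)   # 6 of 10 marked documents are responsive

print(f"recall = {slide_recall:.0%}, precision = {slide_precision:.0%}")
# prints "recall = 75%, precision = 60%"
```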

Page 21:

Depends on collection “richness”

2-5 days – one person & one only!

500-5000 documents reviewed

Stop when system exhibits:

High rates of Precision & Recall – above the agreed to rates

No longer discovering new topics to teach the computer about

Computer is predicting with consistency
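The three stopping conditions can be written as a single check. This is a sketch only: the agreed rates come from the parties, and the way "consistency" is measured here (a score between 0 and 1 with a hypothetical 0.95 threshold) is an assumption, not something the slides specify.

```python
def should_stop(precision, recall, agreed_precision, agreed_recall,
                new_topics_found, consistency, min_consistency=0.95):
    """Return True when all three stopping conditions from the slide hold.

    consistency and min_consistency are hypothetical: a 0-1 score for how
    stable the computer's predictions are from one training round to the next.
    """
    meets_rates = precision >= agreed_precision and recall >= agreed_recall
    no_new_topics = new_topics_found == 0
    consistent = consistency >= min_consistency
    return meets_rates and no_new_topics and consistent

# Example round: rates above the agreed 80% / 75%, no new topics, stable output.
print(should_stop(0.85, 0.80, 0.80, 0.75, new_topics_found=0, consistency=0.97))
```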

Page 22:

It is like Exit Polling...

Statistical truth: A sample of a certain size yields a certain level of confidence and a certain margin of error.

400 randomly selected docs provide a 95% confidence level in the estimate of Predictive Coding accuracy, with a ± 5% margin of error.

Reference: Cochran, W. G. 1977. Sampling Techniques, 3rd Ed. John Wiley & Sons, New York, New York, USA.
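The 400-document figure can be checked with the standard sample-size formula for a proportion. This sketch assumes a large collection (no finite population correction) and the most conservative proportion, p = 0.5; the function names are illustrative.

```python
import math
from statistics import NormalDist

def sample_size(confidence=0.95, margin=0.05, p=0.5):
    """Smallest sample giving the requested margin of error for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

def margin_of_error(n, confidence=0.95, p=0.5):
    """Margin of error obtained from a random sample of n documents."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

print(sample_size())                    # minimum sample for 95% / ±5%
print(f"{margin_of_error(400):.3f}")    # margin of error with 400 documents
```

Under these assumptions the minimum is 385 documents, so a round 400 sits just inside the ± 5% margin.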

Page 23:

When you are out of time

If you want to save money

Consider using CAR (computer-assisted review) for cases involving 5 GB or more of data

Predictive coding makes sense when you have 20,000 documents or more

Page 24:

Judge Facciola (D.D.C.): “If you are practicing e-discovery without a clawback, you are committing malpractice.”

Parties agree in writing that inadvertent production of privileged material does not automatically constitute a waiver

Page 25:

What if the other side won’t agree to the clawback agreement?

Go to the Court!

Rajala v. McGuire Woods, 2010 WL 2949582 (D. Kan. July 22, 2010): Court issued clawback order with no need to show reasonable efforts

Page 26:

Consider Clawback Agreement during “meet and confer” conference

Embody agreement in Court Order (Rule 502(d))

Page 27:

Predictive coding should be used to cull down data set to a manageable level

Attorneys should conduct privilege review

This should occur AFTER predictive coding

Attorneys need to decide what is privileged: Do not put this on auto-pilot

Page 28:

Why Linear Review is Ineffective

Linear Review compared to other methods

Page 29:

Limitations of Keywords

Catches only 20 percent of relevant evidence

Therefore… misses 80 percent

The “Google” phenomenon

Page 30:

Problems With Keywords

Failure of imagination (Example: Nasdaq versus Stock Market)

How many synonyms for the word “think”?

Precise Terms of Art

Misspellings (Example: Mangment, Mangemnt…)
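The misspelling problem is easy to demonstrate: an exact keyword match finds only the correctly spelled word. The fuzzy comparison below, built on Python's standard `difflib` module with a hypothetical similarity cutoff of 0.8, is shown only to make the gap visible; it is not a method taken from the slides.

```python
from difflib import SequenceMatcher

def exact_hits(term, tokens):
    """Tokens that match the keyword exactly (ignoring case)."""
    return [t for t in tokens if t.lower() == term]

def fuzzy_hits(term, tokens, cutoff=0.8):
    """Tokens whose similarity ratio to the keyword is at least cutoff."""
    return [t for t in tokens
            if SequenceMatcher(None, term, t.lower()).ratio() >= cutoff]

# Misspellings taken from the slide, plus one unrelated word.
tokens = ["Management", "Mangment", "Mangemnt", "budget"]

print(exact_hits("management", tokens))   # the misspellings are missed
print(fuzzy_hits("management", tokens))   # the misspellings are caught
```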

Page 31:

Human problem

People express concepts differently

Difficulties in learning to adopt another party’s language style

TREC (Text Retrieval Conference) was a competition and it showed a complete failure in keyword searches

Page 32:

Human keyword based review is expensive

It is slow & inaccurate

It unnecessarily complicates a simple process

It is widely used because, until now, there were no alternatives

Predictive coding – when “done right” – can save a corporation 80-90% of review costs.

Page 33:

TREC Legal Track Study 2009

Keyword searches missed 96 percent of relevant documents (recall ratio averaged less than 4 percent)

Page 34:

TREC Legal Track Study 2010

97 percent of relevant documents not found

Only a 3 percent recall ratio (76,373 relevant documents not discovered)

Boolean searches reduced the initial corpus from 685,592 to 2,715 documents

87 percent precision ratio (2,362 documents out of 2,715 are relevant)
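The percentages on this slide follow directly from the document counts it gives. A short check, assuming the 2,362 relevant documents retrieved plus the 76,373 not discovered make up all relevant documents in the corpus:

```python
corpus_size = 685_592         # documents before the Boolean searches
retrieved = 2_715             # documents returned by the searches
relevant_retrieved = 2_362    # retrieved documents that are relevant
relevant_missed = 76_373      # relevant documents not discovered

# Assumption: found plus missed equals every relevant document in the corpus.
total_relevant = relevant_retrieved + relevant_missed

precision = relevant_retrieved / retrieved
recall = relevant_retrieved / total_relevant
share_retrieved = retrieved / corpus_size

print(f"precision = {precision:.1%}")   # prints "precision = 87.0%"
print(f"recall = {recall:.1%}")         # prints "recall = 3.0%"
print(f"share of corpus retrieved = {share_retrieved:.2%}")
```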

Page 35:

Blair and Maron Study

Involved a San Francisco Bay Area Rapid Transit Accident

Discovery database contained 40,000 documents and 350,000 pages

Attorneys believed keyword searches uncovered 75 percent of relevant documents

In reality: Only 20 percent of relevant documents uncovered

Page 36:

The “Gold” Standard

Human eyeballs on every document

Judge Peck: The “gold” standard does not have any gold

Human assessors disagree on the relevance of a document to a single topic

Page 37:

TREC Conclusion: 65% Recall and 65% Precision is best retrieval effectiveness for human reviewers

Human eyeballs on every document is not working

Reviewers disagree as frequently as 50 percent of the time

Page 38:

Monique Da Silva Moore v. Publicis Groupe & MSL Group (SDNY) (endorsed using predictive coding)

Complicated and confusing protocol – DO NOT USE

Defendants offered plaintiffs everything they wanted – protocol was so confusing they could not see they got everything they asked for – so they went after the Judge.

Global Aerospace, Inc. v. Landow Aviation Limited Partnership (Circuit Court of Loudoun County Virginia) (authorized use of predictive coding over objection)

Nothing in news – as no controversy – everything worked!

Page 39:

Expensive

Kleen case – 1400 attorney hours to determine search terms – and plaintiff was not satisfied – and neither side was aware of overall effectiveness of terms

Not effective

Over or under produces

Known to be very problematic

“Ostrich approach” is no longer advisable – technology has evolved

Judges know it exists, plaintiffs know it exists and ask for it

Page 40:

EORHB, Inc., et al. v. HOA Holdings, LLC (Delaware Chancery Court)

Court ordered the parties sua sponte to use predictive coding and ordered the parties to use the same vendor

Judge may have overstepped bounds

Page 41:

Technology is your friend

Make data driven decisions

We are living in the “MoneyBall” age

If you are unsure, please ask – this is not going away

Page 42:

For more information contact Rebecca Shwayri

Email: [email protected]

Phone: (813) 209-5029