The Next Generation of Concept Searching: Clearwell Transparent Concept Search

A Technology White Paper

Clearwell Whitepaper

Table of Contents:

Introduction

Given the Benefits of Concept Search, Why isn’t it More Commonly Utilized?

Introducing Transparent Concept Searching

How to Use Clearwell’s Transparent Concept Search

Conclusion

Introduction

A recent article published in The Economist states that, according to one estimate, “mankind created 150 exabytes (billion gigabytes) of data in 2005. This year, it will create 1,200 exabytes.”[1] To put these numbers in perspective, all the catalogued books in the Library of Congress total 15 terabytes, while 5 petabytes (approximately 5,000 terabytes) is roughly equal to all the letters delivered by the US Postal Service in 2010. Although “terabyte” comes from the Greek word meaning “monster,” a terabyte is dwarfed by the size of an exabyte. An exabyte[2] is about 1,000 petabytes and is estimated to be equal to all the printed material in the world.[3]

Not surprisingly, the exponential growth of information poses a critical challenge to the justice system, since evaluating electronically stored information (ESI) is often one of the most important facets of litigation.[4] In recent years, parties involved in legal disputes have commonly relied on basic keyword searching technologies to identify and exchange ESI related to a matter. Although leveraging technology to search for keywords was perceived as a great leap forward among lawyers who historically reviewed paper documents without the aid of technology, keyword search technology by itself is often inadequate to meet today’s e-discovery challenges without resulting in unnecessary risk, expense, or both.

The problem is that keyword searches tend to be both over-inclusive and under-inclusive because of the inherent ambiguity of language.[5] For example, the ambiguous nature of the word “strike” could result in a keyword search that identifies a broad range of documents relating to labor unions, a military action, bowling, or a baseball game even though the matter in question deals only with a labor union strike.[6] The over-inclusive nature of the search results in what is referred to as low “precision” and could have serious financial consequences, because the excess documents that are retrieved must typically be segregated as responsive or non-responsive by legal teams paid hourly for document review. On the other hand, failing to guess the right keywords to use as part of the search could have the opposite, under-inclusive effect, since potentially relevant documents not containing the keyword “strike” may be completely overlooked. The failure to retrieve these documents may not only impact the outcome of the matter; it could also lead to legal risk in the form of sanctions if the information is not identified and produced to the requesting party.
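The two failure modes described above are usually measured as precision (the share of retrieved documents that are relevant) and recall (the share of relevant documents that are retrieved). The following is a minimal sketch on a made-up four-document corpus; the document texts, the ids, and the `relevant` set are illustrative assumptions, not data from any real matter.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for a retrieved set against the truly relevant set."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy corpus: document id -> text.
docs = {
    "d1": "union leaders voted to strike over the new contract",
    "d2": "the pitcher threw a third strike to end the inning",
    "d3": "she bowled a strike in the final frame",
    "d4": "workers plan a walkout if contract talks fail",
}
# Assumed ground truth for a labor dispute: d1 and d4 are relevant.
relevant = {"d1", "d4"}

# Plain keyword search for the literal word "strike".
retrieved = {doc_id for doc_id, text in docs.items() if "strike" in text.split()}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.33 recall=0.50
```

In this toy case the keyword pulls in the baseball and bowling documents (over-inclusive, precision 0.33) and misses the walkout document that never uses the word “strike” (under-inclusive, recall 0.50).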

Keyword searching initially introduced a simple way to retrieve and produce documents during discovery, but the inherent limitations of keyword searching have been magnified as data volumes grow and judicial scrutiny increases.[7] Not surprisingly, more intelligent search technology has evolved to meet these needs. None of these technologies solved all the limitations of keyword searching. “Traditional concept search” technology made strides in addressing the risk of under-inclusive keyword searches, albeit in a way that increases expenses in exchange for reducing risk. To address this industry need, Clearwell’s next generation Transparent Concept Search technology takes traditional concept searching a step further by empowering practitioners to reap the advantages of traditional concept searching while actually reducing instead of increasing e-discovery expenses.

Given the Benefits of Concept Search, Why isn’t it More Commonly Utilized?

The over-inclusive nature of traditional concept search technology results in significantly greater e-discovery expenses in the form of vendor costs and downstream document review. Incurring significantly greater document review expenses to reduce the risk that important documents are overlooked is a difficult, case-by-case economic decision that has hampered large-scale adoption of traditional concept searching in the legal industry. The main reasons traditional concept searching is used sparingly or not at all are discussed below.

Increased E-Discovery Cost & the “Black Box” Problem

Traditional concept searching reduces the risk of information being overlooked by searching more broadly than keyword searching technology, but the benefits are often overshadowed by the technology’s lack of precision. In other words, traditional concept searching reduces risk by retrieving a much larger number of potentially responsive documents than keyword searching, but many of these documents are irrelevant or “false positives” that must be segregated from responsive documents through manual (linear) document review at great cost. According to industry estimates, these linear document review costs are significant because attorneys can typically review only between 50 and 60 documents per hour when conducting a linear page-by-page document review.[8] That means if law firm associates are billed out at an average of $100/hour to review 50,000 documents, the cost of reviewing those documents would be somewhere between roughly $83,000 and $100,000.
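The cost range above follows directly from the figures in the text (50,000 documents, $100/hour, 50 to 60 documents per hour). A small sketch of the arithmetic is shown here; the function name `review_cost` is chosen only for illustration.

```python
def review_cost(num_docs: int, docs_per_hour: float, hourly_rate: float) -> float:
    """Cost of linear review: hours needed multiplied by the billing rate."""
    hours = num_docs / docs_per_hour
    return hours * hourly_rate


NUM_DOCS = 50_000      # document count used in the text
HOURLY_RATE = 100.0    # billing rate used in the text, in dollars

slow = review_cost(NUM_DOCS, 50, HOURLY_RATE)  # slower end of the review range
fast = review_cost(NUM_DOCS, 60, HOURLY_RATE)  # faster end of the review range

print(f"at 50 docs/hour: ${slow:,.0f}")  # at 50 docs/hour: $100,000
print(f"at 60 docs/hour: ${fast:,.0f}")  # at 60 docs/hour: $83,333
```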

The problem stems from the fact that traditional concept searching, like basic keyword searching, is done in a “black box.” A black box search means users have no visibility into or control over which concepts are included as part of a search, because every concept related to the key term is automatically included whether or not the concept is relevant. This one-size-fits-all approach means every concept search tends to be unnecessarily broad, since traditional concept searching doesn’t allow the user to intelligently narrow the search results. (See figure 1).

For example, although concept searching the term “strike” to investigate a labor dispute would likely recall relevant documents about labor union contracts, it might also recall thousands of irrelevant documents related to a military action, bowling, or a baseball game. Since linear document review is one of the costliest facets of e-discovery, lawyers and their clients may choose to gamble and run the risk of sanctions or producing privileged documents rather than utilizing traditional concept searching to help minimize these risks.

Figure 1: Traditional concept searching is done in a “black box”, meaning users have no visibility or control over which concepts are included as part of a search.

Many who might otherwise overlook the additional document review costs resulting from the use of traditional concept search technology still reject the technology because it is often more expensive than basic keyword search technology. More specifically, e-discovery software solutions containing concept searching functionality must almost always be purchased separately. This additional software cost typically cannot be circumvented by using a vendor services model since vendors typically charge a premium (directly or indirectly) for processing data through concept searching engines. Given these financial trade-offs, traditional concept searching has not been widely adopted.

Introducing Transparent Concept Searching

Clearwell’s next generation Transparent Concept Search technology provides the same advantages as traditional concept searching without the drawbacks. In addition to identifying potentially relevant documents containing concepts related to keywords, Transparent Concept Searching technology significantly reduces the time and expense resulting from over-inclusive document retrieval by allowing users to eliminate documents containing concepts that are not relevant to the intended search. This is accomplished by providing a transparent view into the contents of the “black box” so that users can actually visualize and select (or deselect) the range of concepts related to a particular term before the search is executed. This transparent approach to concept searching gives legal teams strategic advantages during meet and confer and settlement negotiations because cases can be assessed faster and with more accuracy. Similarly, Transparent Concept Searching’s comprehensive approach increases legal defensibility by reducing the risk that documents are overlooked while simultaneously reducing downstream processing and review costs by retrieving only the most relevant documents for review.

Not only does Clearwell’s Transparent Concept Search technology decrease the overall number of documents to be reviewed through greater search precision, the technology also organizes documents logically, making document review faster and more accurate. The idea is that attorneys reviewing randomly organized documents are not as productive as they could be if the same set of documents were logically organized by similar concepts and degrees of relevance. Studies have shown that reviewers using advanced techniques can review documents up to 6 times faster than traditional linear reviewers and with better accuracy.[9] Applying a sixfold increase to the review rates mentioned earlier would result in an astounding reduction in review cost exceeding 80%.[10] This potentially tremendous cost savings doesn’t even account for the cost savings discussed earlier that result from Transparent Concept Search’s ability to significantly reduce the number of documents requiring review at all. The ability to exclude or include documents with far greater precision, coupled with the ability to review documents faster, not only saves money, it makes assessing cases earlier and meeting critical deadlines easier. Unlike most traditional concept searching modules, Clearwell’s Transparent Concept Search feature provides all of these benefits at no additional licensing cost.
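The “exceeding 80%” figure can be checked with a short calculation. Because review cost is proportional to the number of documents divided by the review rate, the document count and billing rate cancel out of the savings percentage. The sketch below uses the 50 and 300 documents-per-hour rates from the text; the `fraction_kept` parameter and its value of 0.5 are purely hypothetical, added to illustrate how a smaller review set would compound the savings.

```python
def review_savings(baseline_rate: float, improved_rate: float,
                   fraction_kept: float = 1.0) -> float:
    """Fraction of review cost saved relative to the baseline.

    Cost is proportional to documents / rate, so the savings depend only on
    the ratio of review rates and the share of documents still reviewed.
    """
    return 1.0 - fraction_kept * baseline_rate / improved_rate


# Sixfold faster review, same document set (figures from the text).
speed_only = review_savings(50, 300)

# Hypothetical: faster review AND only half the documents still need review.
combined = review_savings(50, 300, fraction_kept=0.5)

print(f"{speed_only:.1%}")  # 83.3%
print(f"{combined:.1%}")    # 91.7%
```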

How to Use Clearwell’s Transparent Concept Search

Clearwell’s Transparent Concept Search contains three interactive features designed to simplify and mimic common legal discovery workflows. First, Transparent Concept Search Preview enables users to enter a keyword and automatically generate a list of related concepts ranked by relevance (frequency of occurrence). Prior to launching a search, the user can easily refine the search by selecting relevant concepts and ignoring irrelevant concepts. Second, Clearwell’s Transparent Concept Search Explorer simplifies the process of constructing searches by providing a visual representation of various concepts in an intuitive graphical display that dynamically changes as concepts are added or removed to explore different search options. Finally, Clearwell’s Transparent Concept Search Report provides detailed analytics for each search and documents how the search was constructed. These features help streamline the legal team’s ability to assess cases early and defensibly, prepare for meet and confer conferences and other negotiations more efficiently, and minimize document review costs.
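The preview-then-select workflow described above can be illustrated with a toy sketch. This is not Clearwell’s algorithm, which the paper does not disclose: it assumes that “related concepts” can be approximated by single words that co-occur with the keyword, ranked by the number of documents containing both, and that a search keeps documents matching the keyword or a selected concept unless they contain an excluded concept. The corpus, the stopword list, and the function names `preview_concepts` and `search` are all hypothetical.

```python
from collections import Counter

# Hypothetical stopword list for the toy corpus below.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to"}


def tokenize(text: str) -> list:
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def preview_concepts(docs: dict, keyword: str) -> list:
    """Rank terms co-occurring with the keyword by how many documents contain both."""
    counts = Counter()
    for text in docs.values():
        terms = set(tokenize(text))
        if keyword in terms:
            counts.update(terms - {keyword})
    # Most frequent first; ties broken alphabetically for a stable listing.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def search(docs: dict, keyword: str, include: set, exclude: set) -> list:
    """Ids of documents matching the keyword or an included concept,
    minus any document containing an excluded concept (assumed semantics)."""
    results = []
    for doc_id, text in docs.items():
        terms = set(tokenize(text))
        matches = keyword in terms or bool(terms & include)
        if matches and not (terms & exclude):
            results.append(doc_id)
    return sorted(results)


# Hypothetical corpus loosely following the paper's "diamond" example.
docs = {
    "d1": "Diamond Investment bought Apple shares before earnings",
    "d2": "Project football: move the Apple shares through Diamond",
    "d3": "The diamond ring for the wedding is ready",
    "d4": "Football update: shares sold on Friday",
}

preview = preview_concepts(docs, "diamond")
print(preview[:2])  # [('apple', 2), ('shares', 2)]

# The user keeps "football" and deselects the jewelry-related concepts.
selected = search(docs, "diamond", include={"football"}, exclude={"ring", "wedding"})
print(selected)  # ['d1', 'd2', 'd4']
```

The point of the sketch is the order of operations: the ranked concept list is shown to the user first, and only the user’s selections shape the query that is actually run.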

Transparent Concept Search for Early Case Assessment & Legal Defensibility

The ability to pick and choose which concepts will be included as part of a search introduces a new era in early case assessment that gives users earlier insight into documents that could be critical to the outcome of a case. For example, searching for the keyword “diamond” as part of an investigation into insider trading of Apple stock by Diamond Investment Company would yield significantly different results depending on the technology used. Traditional concept searching tools are black box technologies that would automatically include every concept related to the term “diamond” (such as other precious gems) in the search and return a high number of irrelevant documents, thereby delaying the ability to assess the case thoroughly and quickly. On the other hand, Clearwell’s Transparent Concept Search technology gives users the flexibility to see and select only the precise concepts related to the word “diamond” that should logically be included in the search while excluding other irrelevant concepts. (See figure 2). The ability to search with precision enables legal teams to understand and assess case strategy earlier without getting bogged down combing through irrelevant documents.

Figure 2: Clearwell’s Transparent Concept Search Preview (above left) and Transparent Concept Search Explorer (above right) give users the flexibility to see and select only the precise concepts related to the word “diamond” that should logically be included in the search while excluding other irrelevant concepts.

Clearwell’s Transparent Concept Search also minimizes the risk that key documents are overlooked, helping to ensure that cases are assessed early and accurately. For example, searching the keyword “diamond” as part of the Apple insider trading scandal may not reveal that the code term “football” was used by some employees to cover up discussions about their insider trading activities. However, because Clearwell’s Transparent Concept Search technology provides a complete list of concepts rated by estimated relevance, the connection between the keyword “diamond” and the concept “football” is clear: “football” is the top rated concept. (See figure 2). This information allows documents containing seemingly unrelated but relevant terms to be retrieved that may have otherwise been overlooked. (See figure 3). The ability to quickly and more comprehensively identify critical documents that could have been overlooked through traditional concept searching results in a more complete understanding of the evidence earlier in the case. Clearwell’s Transparent Concept Search Report feature also automatically tracks key criteria so search methodology can be communicated or defended during meet and confer discussions or in court. (See figure 4).

Transparent Concept Search for Review & Quality Assurance

Clearwell’s Transparent Concept Search technology also expedites document review by providing tools to review ESI in a less expensive non-linear fashion that results in a higher degree of accuracy. As indicated earlier, studies have shown that reviewers using advanced techniques can review documents up to 6 times faster than traditional linear reviewers and with better accuracy. The general idea is that the monotonous task of reviewing hundreds if not thousands of documents for relevance and privilege becomes even more tedious when those documents are not conceptually related. Utilizing Clearwell’s Transparent Concept Search functionality to construct intelligent searches for documents with a degree of precision that is not possible in traditional concept searching tools allows documents to be organized and reviewed logically. The more logical the connection between the documents being reviewed by a person, the faster and more accurately that person will be able to review those documents.[11]

Figure 3: Clearwell’s Transparent Concept Search minimizes the risk that key documents are overlooked, including documents containing code terms like “project football.”

Figure 4: Clearwell’s Transparent Concept Search Reports automatically track important information so searches suggested as part of meet and confer negotiations can be tested and legal attacks regarding search methodology can be defended.

Similarly, the more certain it is that particular concepts are unrelated to the intended search, the more likely it is that documents containing those concepts can be bulk tagged as “Not Relevant” or “Unlikely Relevant” so they can be excluded from more comprehensive and expensive linear document review. For example, if entering the keyword “diamond” yields a list of concepts that includes “wedding” and “wedding” has a low relevancy rating, then the user may elect to bulk tag all documents containing the concept “wedding” as “Not Relevant” after a cursory review. (See figure 5).
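The bulk-tagging step can be pictured with a short sketch. It assumes, for illustration only, that a “concept” is matched as a literal word in the document text and that tags are kept in a simple dictionary; the function name `bulk_tag`, the corpus, and the tag store are hypothetical and do not reflect the product’s internals.

```python
def bulk_tag(docs: dict, tags: dict, concept: str, tag: str) -> list:
    """Tag every untagged document whose text contains the concept word.

    Returns the ids that were tagged, in corpus order. Documents that
    already carry a tag are left alone so earlier decisions are preserved.
    """
    tagged = []
    for doc_id, text in docs.items():
        if doc_id not in tags and concept in text.lower().split():
            tags[doc_id] = tag
            tagged.append(doc_id)
    return tagged


# Hypothetical documents returned for the keyword "diamond".
docs = {
    "d1": "diamond ring for the wedding",
    "d2": "diamond investment shares",
    "d3": "wedding invitation list",
}
tags = {}

newly_tagged = bulk_tag(docs, tags, "wedding", "Not Relevant")
remaining = [doc_id for doc_id in docs if doc_id not in tags]

print(newly_tagged)  # ['d1', 'd3']
print(remaining)     # ['d2']
```

Only the documents left in `remaining` would then go on to the more expensive linear review.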

A similar process can be followed as a quality assurance step to reduce the risk of producing potentially privileged documents. For example, after document review has been completed, a list of privileged terms can be searched to identify conceptually related documents that might contain privileged words that may have otherwise been overlooked. The ability to automate the review and tagging of documents that may have otherwise required manual review, coupled with the ability to review any remaining documents faster and with more accuracy, is another significant cost saving advantage of Transparent Concept Search over traditional concept search.
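As a rough illustration of this quality assurance pass, the sketch below flags documents already marked for production that mention any term on a privilege list. It uses literal word matching as a stand-in for the conceptual matching the paper describes; the term list, the “Produce” and “Privileged” tag names, and the function name are assumptions made for the example.

```python
# Hypothetical privilege term list.
PRIVILEGE_TERMS = {"attorney", "counsel", "privileged"}


def flag_for_privilege_review(docs: dict, tags: dict, terms: set) -> list:
    """Return ids of documents tagged "Produce" that mention any privilege term."""
    flagged = []
    for doc_id, text in docs.items():
        if tags.get(doc_id) != "Produce":
            continue  # only documents slated for production need this check
        words = {w.strip(".,;:").lower() for w in text.split()}
        if words & terms:
            flagged.append(doc_id)
    return flagged


# Hypothetical post-review state.
docs = {
    "d1": "Forwarding advice from outside counsel on the merger.",
    "d2": "Quarterly sales figures attached.",
    "d3": "Privileged and confidential: attorney notes.",
}
tags = {"d1": "Produce", "d2": "Produce", "d3": "Privileged"}

flagged = flag_for_privilege_review(docs, tags, PRIVILEGE_TERMS)
print(flagged)  # ['d1']
```

Here `d1` would be pulled back for a second look before production, while `d3` is skipped because it was already withheld.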

Figure 5: Clearwell’s Transparent Concept Search technology enables bulk tagging of clearly irrelevant documents.

Conclusion

Clearwell’s next generation Transparent Concept Search technology overcomes the inherent limitations of traditional concept searching by empowering users to identify, assess, and review evidence faster and with more accuracy, resulting in significant reductions in risk and cost.

First, Transparent Concept Search streamlines the ability to assess cases early by identifying documents conceptually related to a keyword that may have otherwise been overlooked. Transparent Concept Search technology also opens the black box of traditional concept searching by revealing concepts related to a keyword so only relevant concepts can be selected, thereby resulting in more precision and fewer irrelevant documents. The ability to systematically reduce the number of documents to be reviewed by eliminating irrelevant concepts saves money. Additionally, searching with precision results in a thorough understanding of the issues and evidence earlier, so case strength and the corresponding negotiation strategy can be determined prior to incurring unnecessary downstream e-discovery costs. Since important criteria can automatically be tracked, search methodology can be communicated to the court and opposing counsel if necessary.

Second, Transparent Concept Searching allows documents to be organized logically and ranked by relevance, thereby increasing review speed and accuracy. Toward the end of document review, Transparent Concept Search can also be used as a quality assurance tool to identify privileged documents that may have been overlooked during manual review so the risk of privilege waiver can be minimized.

No search technology, including Transparent Concept Search, is a “silver bullet” solution that can completely eliminate the risk and expense of manual document review. However, the Clearwell E-Discovery Platform gives users unparalleled flexibility to use a wide variety of search technologies, including Transparent Keyword and Transparent Concept Searching, to substantially reduce the risk and expense of e-discovery and to maximize strategic advantages.

1. Bret Ryder, The Data Deluge, The Economist, Feb. 25, 2010, at http://www.economist.com (last visited Dec. 30, 2010).

2. All too much – Monstrous amounts of data, The Economist, Feb. 25, 2010, at http://www.economist.com (last visited Dec. 30, 2010).

3. See Exabyte definition: http://www.techterms.com/definition/exabyte (last visited Dec. 30, 2010).

4. The Sedona Conference Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery, 8 Sedona Conf. J. 189, 194 (2007) (“Sedona Conference Best Practices”).

5. See Sedona Conference Best Practices at 201.

6. Id. at 203.

7. See Victor Stanley Inc. v. Creative Pipe, Civ. No. MJG-06-2662, 2008 WL 2221841 (D. Md. May 29, 2008) (Defendant’s questionable search methodology led to waiver of privilege and work-product protection with respect to 165 documents).

8. Bennett B. Borden, The Demise of Linear Review, Williams Mullen E-Discovery Alert, Oct. 2010.

9. Id.

10. Assuming a billing rate of $100/hour, the total cost to review 50,000 documents decreases from $100,000 to approximately $16,667 when the rate of document review per hour is increased from 50 to 300. This results in document review cost savings that exceed 80 percent.

11. Id.

For More Information

For more information about Clearwell or the Clearwell E-Discovery Platform, please visit www.clearwellsystems.com.

Clearwell Systems
441 Logue Avenue
Mountain View, CA 94043
650.526.0600 tel
650.526.0699 [email protected]

© 2011 Clearwell Systems, Inc. Clearwell E-Discovery Platform is a trademark of Clearwell Systems, Inc. All rights reserved.