

E-Discovery on a Budget

© 2015

Craig Ball

About this Collection

A Lawyers’ Introduction to Digital Computers, Servers and Storage

Eight Tips to Quash the Cost of E-Discovery

E-Discovery for Everybody: The EDna Challenge

Ten Things That Trouble Judges About E-Discovery

Preserving Google Content for Dummies

Easing the Pain of E-Discovery with ESI Special

Gold Standard

Ten Bonehead Mistakes in E-Discovery

About the Author

About this Collection

What should e-discovery cost? That’s easy. It should cost less. Much less. E-discovery should cost proportionately less than the amounts at issue in business and damages cases; and e-discovery should not cost so much as to chill the willingness of litigants to bring meritorious cases and defend those without merit. Moreover, e-discovery shouldn’t cost much more than the ways we used to do discovery before all the “e-stuff” got out of hand. Lawyers and judges generally agree with these cost propositions. Who wouldn’t? Happily, e-discovery can handily meet these goals. It’s the wasteful “extras” that kill you, like:

1. Over preserving information because digital information is poorly managed;
2. Preserving data by copying data, generating multiple divergent sets over time;
3. Using crude search mechanisms to segregate potentially responsive information;
4. Finding and culling privileged information commingled with other data;
5. Converting functional and complete forms to degraded forms for review and production;
6. Relying upon labor-intensive item-by-item human review to assess relevance; and,
7. Though not an extra, insufficient skill to make defensible choices about volume and cost.

One reason e-discovery is more costly than paper discovery is that we didn’t include the cost of information management in our tabulation of paper discovery costs. Clients bore the expense of keeping records organized as a cost of doing business, and information tended to be produced in forms used in the ordinary course of business, i.e., on paper. This meant less information found its way into the hands of the biggest contributors to discovery expense: the lawyers.

Another reason discovery on paper cost less is that everyone using paper grasped the fundamentals of managing information on paper. You could reliably gauge volume by looking at the size of a pile of papers or by counting boxes or drawers. Information was encoded in accessible ways; e.g., for Americans, in decimal numbers and English words. Information was also logically unitized by staples, paper clips, binders and folders. Retention was routinely managed because excessive retention was expensive and inconvenient. It was a splendid system. We adored it. We miss it.

Now, we must get over it because it’s gone. Kaput! It’s not a system that can be scaled to an information society. We cannot practically or cost-effectively adapt a paper-centric system to the amount and variety of digital information generated today. Lawyers still refuse to accept this and squander huge sums in their obstinacy.

Don’t get me wrong. I love lawyers. Most of my friends are lawyers. I’d be proud if my kids become lawyers. Some of the best and brightest people I’ve ever known are lawyers. I share my affection for my colleagues so you’ll know where I’m coming from when I confess I’m chagrined by lawyers’ steadfast refusal to acquire even the most basic competency in electronic discovery and digital evidence. Electronic evidence is fast becoming the most ubiquitous, probative and powerful proof extant; yet, the justice system abides a crisis of competence when it comes to discovery of electronically stored information. It’s a crisis that carries an awesome cost: millions upon millions of dollars wasted on overbroad preservation and collection, purposeless processing and, worst of all, profligate document review efforts destined to turn courtrooms into country clubs and suck the souls from young lawyers.

The fault doesn’t lie with e-discovery or the rules of procedure or even (greedy) plaintiffs or (greedy) defendants. I have seen the enemy, and it is us. America's halcyon days of hammer and harness are behind us. We are all knowledge workers now; yet, even those who drive trucks or empty bedpans are tasked by pixels and tracked by bytes. The evidence of what we do and say, of when and where and how we go, of what we own and earn and spend is digital. More than 99% of it will never exist as anything but electronically stored information, and most takes forms that require special tools or expertise to see and interpret. This irritates and intimidates old school lawyers. At great cost to unwitting clients, the old school cling to what they know and disregard the rest. They print documents or convert them to paper-like formats like TIFF. They unleash armies of reviewers against hordes of irrelevant documents. They thunder that e-discovery is "out-of-control," extolling the merits of raw meat rather than learning to make fire.


A lawyer without the skills needed to properly preserve, collect, analyze and present electronic evidence is all-but-incompetent to manage litigation today, and visiting the cost to compensate for those shortcomings upon the client is an ethical minefield. That's why you must make it your mission to master electronic discovery, and help staunch the hemorrhaging of money stemming from lawyer incompetence in e-discovery. Now, let me tell you why you'll be glad you did. Occasionally you’ll win a case on charm, a good or bad judge, an appealing client, a hateful opponent or just dumb luck. But, without any of these things, you'll win most of the time if you have the evidence proving your case. Much of that evidence is digital. It's there. It's waiting for you--eager to tell its compelling story, ready to show your client was right and the other side should pay big or go hence without day. The lawyer who can get to the digital evidence--find it, understand it and use it in a cost-effective fashion--enjoys an enormous competitive advantage. The selected articles and columns that follow were chosen to help you identify ways to tame the cost of e-discovery. They are a small sampling of the articles I've written about electronic discovery and computer forensics, available at www.craigball.com and ballinyourcourt.com. I hope you find them to be a helpful, accessible introduction to the cost-saving side of electronic discovery. Craig Ball, January 2015


A Lawyers’ Introduction to Digital Computers, Servers and Storage

By Craig Ball © 2014

In 1774, a Swiss watchmaker named Pierre Jaquet-Droz built an ingenious mechanical doll resembling a barefoot boy. Constructed of 6,000 handcrafted parts and dubbed "L'Ecrivain” (“The Writer”), Jaquet-Droz’ automaton uses quill and ink to handwrite messages in cursive, up to 40 letters long, with the content controlled by interchangeable cams. The Writer is a charming example of an early programmable computer.

The monarchs that marveled at Jaquet-Droz’ little penman didn’t need to understand how it worked to enjoy it. Lawyers, too, once had little need to understand the operation of their clients’ information systems in order to conduct discovery. But as the volume of electronically stored information (ESI) has exploded and the forms and sources of ESI continue to morph and multiply, lawyers conducting electronic discovery cannot ignore the watch works. New standards of competence demand that lawyers master some fundamentals of information technology and electronic evidence.

Digital Data

Despite its daunting complexity, all digital content (photos, music, documents, spreadsheets, databases, social media and communications) exists in one common and mind-boggling form: as an unbroken string of ones and zeroes, most memorialized as impossibly tiny reversals of magnetic polarity. These minute fluctuations must be read by a detector riding above the surface of a spinning disk on a cushion of air one-thousandth the width of a human hair in an operation akin to a jet fighter flying around the world at more than 800 times the speed of sound, less than a millimeter above the ground…and precisely counting every blade of grass it passes.

That’s astonishing, but what should astound you more is that there are no pages, paragraphs, spaces or markers of any kind to define the data stream. That is, the history, knowledge and creativity of humankind have been reduced to two different states (on/off…one/zero) in an unbroken, featureless expanse. Moreover, it’s a data stream that carries not only the information we store but all of the instructions needed to make sense of that data, as well. The data stream holds all of the information about the data required to play it, display it, transmit it or otherwise put it to work. It’s a reductive feat that’ll make your head spin…or at least make you want to buy a computer scientist a beer.
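To make the point about a featureless data stream concrete, here is a minimal sketch in Python. The two byte values and the variable names are chosen purely for illustration and are not taken from the text. It shows the same sixteen bits being read first as a single number and then as two text characters; nothing in the bits themselves says which reading is intended.

```python
# The same two bytes, read under two different conventions.
# The byte values are illustrative assumptions, not data from the article.
raw = bytes([0b01000101, 0b00101101])

as_bits = "".join(f"{b:08b}" for b in raw)  # the unbroken run of ones and zeroes
as_number = int.from_bytes(raw, "big")      # read as one 16-bit unsigned integer
as_text = raw.decode("ascii")               # read as two text characters

print(as_bits)
print(as_number)
print(as_text)
```

The interpretation comes from the software and the surrounding structure of the data, which is why the instructions for making sense of the stream matter as much as the stream itself.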


Data, Not Documents

Lawyers—particularly those who didn’t grow up with computers—tend to equate data with documents when, in a digital world, documents are just one variant of the many forms in which electronic information exists. Documents like the letters, memos and reports of yore account for a dwindling share of electronically stored information relevant in discovery, and documents generated from electronic sources tend to convey just part of the information stored in the source. The decisive information in a case may exist as nothing more than a single bit of data that, in context, signals whether the fact you seek to establish is true or not. A Facebook page doesn’t exist until a request sent to a database triggers the page’s assembly and display. Word documents, PowerPoint presentations and Excel spreadsheets lose content and functionality when printed to screen images or paper.

With so much discoverable information bearing so little resemblance to documents, and with electronic documents carrying much more probative and useful information than a printout or screen image conveys, competence in electronic discovery demands an appreciation of data more than documents.

Binary

When we were children starting to count, we had to learn the decimal system. We had to think about what numbers meant. When our first grade selves tackled a big number like 9,465, we were acutely aware that each digit represented a decimal multiple. The nine was in the thousands place, the four in the hundreds, the six in the tens place and so on. We might even have parsed 9,465 as: (9 x 1000) + (4 x 100) + (6 x 10) + (5 x 1). But soon, it became second nature to us. We’d unconsciously process 9,465 as nine thousand four hundred sixty-five. As we matured, we learned about powers of ten and now saw 9,465 as: (9 x 10³) + (4 x 10²) + (6 x 10¹) + (5 x 10⁰). This was exponential or “base ten” notation.

Mankind probably uses base ten to count because we evolved with ten fingers. But, had we slithered from the ooze with eight or twelve digits, we’d have gotten on splendidly using a base eight or base twelve number system. It really wouldn’t matter because any number--and consequently any data--can be expressed in any number system. So, it happens that computers use the base two or “binary” notation, and computer programmers are partial to base sixteen or “hexadecimal” notation. It’s all just counting.

Bits

Computers use binary digits in place of decimal digits. The word bit is even a shortening of the words "Binary digIT." Unlike the decimal system, where any number is represented by some combination of ten possible digits (0-9), the bit has only two possible values: zero or one. This is not as limiting as one might expect when you consider that a digital circuit (essentially an unfathomably complex array of switches) hasn’t got any fingers to count on, but is very good and very fast at being “on” or “off.”
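The claim that any number can be expressed in any number system can be checked with a few lines of Python. This is only a sketch: the helper name positional_value is hypothetical, and the example simply reuses the number 9,465 from the discussion above.

```python
def positional_value(digits, base):
    """Add up digit * base**position, counting positions from the right (position 0)."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total

# 9,465 written out digit by digit in base ten.
decimal_value = positional_value([9, 4, 6, 5], 10)

# The same quantity written in base two and base sixteen, using Python's built-in formatting.
binary_digits = format(decimal_value, "b")
hex_digits = format(decimal_value, "x")

print(decimal_value, binary_digits, hex_digits)

# Reading the binary and hexadecimal spellings back yields the same number.
print(int(binary_digits, 2), int(hex_digits, 16))
```

Whatever the base, the recipe is the same: multiply each digit by the base raised to the power of its position and add the results.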

In the binary system, each binary digit (“bit”) holds the value of a power of two. Therefore, a binary number is composed of only zeroes and ones, like this: 10101. How do you figure out what the value of the binary number 10101 is? You do it in the same way we did it above for 9,465, but you use a base of 2 instead of a base of 10. Hence: (1 x 2⁴) + (0 x 2³) + (1 x 2²) + (0 x 2¹) + (1 x 2⁰) = 16 + 0 + 4 + 0 + 1 = 21.

Moving from right to left, each bit you encounter represents the value of increasing powers of 2, standing in for one, two, four, eight, sixteen, thirty-two, sixty-four and so on. That makes counting in binary pretty easy. From zero to 21, decimal and binary equivalents look like the table at right.

Bytes

A byte is a sequence or “string” of eight bits. The biggest number that can be stored as one byte of information is 11111111, equal to 255 in the decimal system. The smallest number is zero or 00000000. Thus, there are 256 different numbers that can be stored as one byte of information. So, what do you do if you need to store a number larger than 255? Simple! You use a second byte. This affords you all the combinations that can be achieved with 16 bits, being the product of all the variations of the first byte and all of the second byte (256 x 256 or 65,536). So, using bytes to express values, any number that is greater than 255 needs at least two bytes to be expressed (called a “word” in geek speak), and any number above 65,535 requires at least three bytes, and so on. A value of 16,777,216 (256³ or 2²⁴) or more needs four bytes (called a “long word”) and so on.
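As a rough sketch of the counting and byte-size rules just described, the Python below prints decimal and binary equivalents from zero to 21 and reports how many whole bytes a value needs. The function name bytes_needed is hypothetical. Note that one byte tops out at 255, so 256 is the first value that needs a second byte.

```python
def bytes_needed(value):
    """Smallest number of whole bytes that can hold a non-negative integer."""
    if value < 0:
        raise ValueError("this sketch handles only non-negative integers")
    return max(1, (value.bit_length() + 7) // 8)

# Decimal and binary equivalents from zero to 21.
counting_table = [(n, format(n, "b")) for n in range(22)]
for decimal, binary in counting_table:
    print(f"{decimal:>2}  {binary:>5}")

# Values on either side of each byte boundary.
for value in (255, 256, 65_535, 65_536, 16_777_215, 16_777_216):
    print(f"{value:>10,} needs {bytes_needed(value)} byte(s)")
```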

Let’s try it: Suppose we want to represent the number 51,975. It’s 1100101100000111, viz:

First byte:

2¹⁵     2¹⁴     2¹³    2¹²    2¹¹    2¹⁰    2⁹    2⁸
32768   16384   8192   4096   2048   1024   512   256
1       1       0      0      1      0      1     1

Second byte:

2⁷    2⁶   2⁵   2⁴   2³   2²   2¹   2⁰
128   64   32   16   8    4    2    1
0     0    0    0    0    1    1    1

(32768+16384+2048+512+256) or 51,968 + (4+2+1) or 7 = 51,975
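The two-byte breakdown of 51,975 can be reproduced with Python's built-in integer tools. This is a minimal sketch; the variable names are illustrative only.

```python
value = 51_975

# Sixteen bits, padded with leading zeroes if needed.
bits = format(value, "016b")

# Split the value into its two bytes, most significant byte first.
high_byte, low_byte = value.to_bytes(2, "big")

print(bits)
print(format(high_byte, "08b"), high_byte * 256)  # first byte and the amount it contributes
print(format(low_byte, "08b"), low_byte)          # second byte and the amount it contributes
print(high_byte * 256 + low_byte)                 # the two contributions add back up to the value
```

The first byte counts in units of 256, which is why its eight bits contribute 51,968 while the second byte contributes the remaining 7.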

Why is an eight-bit sequence the fundamental building block of computing? It just sort of happened that way. In this time of cheap memory, expansive storage and lightning-fast processors, it’s easy to forget how scarce and costly these resources were at the dawn of the computing era. Seven bits (with a leading bit reserved) was basically the smallest block of data that would suffice to represent the minimum complement of alphabetic characters, decimal digits, punctuation and control instructions needed by the pioneers in computer engineering. It was, in another sense, about all the data early processors could chew on at a time, perhaps explaining the name “byte” (coined by IBM scientist, Dr. Werner Buchholz, in 1956).

The Magic Decoder Ring called ASCII

Back in 1935, American kids who listened to the Little Orphan Annie radio show (and who drank lots of Ovaltine) could join the Radio Orphan Annie Secret Society and obtain a device with rotating disks that allowed them to write secret messages in numeric code. Similarly, computers encode words as numbers. Binary data stand in for the upper and lower case English alphabet, as well as punctuation marks, special characters and machine instructions (like carriage return and line feed). The most widely deployed U.S. encoding mechanism is known as the ASCII code (for American Standard Code for Information Interchange, pronounced “ask-key”). By limiting the ASCII character set to just 128 characters, any character can be expressed in just seven bits (2⁷ or 128) and so occupies less than one byte in the computer's storage and memory. In the Binary Table that follows, the columns reflect a binary (byte) value, its decimal equivalent and the corresponding ASCII text value (including some for machine codes and punctuation):

ASCII Table

Binary    Decimal  Character   Binary    Decimal  Character   Binary    Decimal  Character
00000000  000      NUL         00101011  043      +           01010110  086      V
00000001  001      SOH         00101100  044      ,           01010111  087      W
00000010  002      STX         00101101  045      -           01011000  088      X
00000011  003      ETX         00101110  046      .           01011001  089      Y
00000100  004      EOT         00101111  047      /           01011010  090      Z
00000101  005      ENQ         00110000  048      0           01011011  091      [
00000110  006      ACK         00110001  049      1           01011100  092      \
00000111  007      BEL         00110010  050      2           01011101  093      ]
00001000  008      BS          00110011  051      3           01011110  094      ^
00001001  009      HT          00110100  052      4           01011111  095      _
00001010  010      LF          00110101  053      5           01100000  096      `
00001011  011      VT          00110110  054      6           01100001  097      a
00001100  012      FF          00110111  055      7           01100010  098      b
00001101  013      CR          00111000  056      8           01100011  099      c
00001110  014      SO          00111001  057      9           01100100  100      d
00001111  015      SI          00111010  058      :           01100101  101      e
00010000  016      DLE         00111011  059      ;           01100110  102      f
00010001  017      DC1         00111100  060      <           01100111  103      g
00010010  018      DC2         00111101  061      =           01101000  104      h
00010011  019      DC3         00111110  062      >           01101001  105      i
00010100  020      DC4         00111111  063      ?           01101010  106      j
00010101  021      NAK         01000000  064      @           01101011  107      k
00010110  022      SYN         01000001  065      A           01101100  108      l
00010111  023      ETB         01000010  066      B           01101101  109      m
00011000  024      CAN         01000011  067      C           01101110  110      n
00011001  025      EM          01000100  068      D           01101111  111      o
00011010  026      SUB         01000101  069      E           01110000  112      p
00011011  027      ESC         01000110  070      F           01110001  113      q
00011100  028      FS          01000111  071      G           01110010  114      r
00011101  029      GS          01001000  072      H           01110011  115      s
00011110  030      RS          01001001  073      I           01110100  116      t
00011111  031      US          01001010  074      J           01110101  117      u
00100000  032      SP          01001011  075      K           01110110  118      v
00100001  033      !           01001100  076      L           01110111  119      w
00100010  034      "           01001101  077      M           01111000  120      x
00100011  035      #           01001110  078      N           01111001  121      y
00100100  036      $           01001111  079      O           01111010  122      z
00100101  037      %           01010000  080      P           01111011  123      {
00100110  038      &           01010001  081      Q           01111100  124      |
00100111  039      '           01010010  082      R           01111101  125      }
00101000  040      (           01010011  083      S           01111110  126      ~
00101001  041      )           01010100  084      T           01111111  127      DEL
00101010  042      *           01010101  085      U

[Photo: Werner Buchholz]
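The lookup that the ASCII table performs by eye can be sketched in Python. The function names to_ascii_bits and from_ascii_bits are hypothetical; the sketch assumes plain 7-bit ASCII text stored one character per eight-bit byte, as in the table.

```python
def to_ascii_bits(text):
    """Encode text as ASCII and spell out each byte as eight binary digits."""
    return "".join(format(code, "08b") for code in text.encode("ascii"))

def from_ascii_bits(bits):
    """Reverse the process: cut the bit string into bytes and decode them as ASCII."""
    codes = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    return bytes(codes).decode("ascii")

encoded = to_ascii_bits("E-Discovery")
print(encoded)
print(len(encoded), "bits for", len("E-Discovery"), "characters")
print(from_ascii_bits(encoded))
```

Decoding only works because both sides agree to cut the stream every eight bits and to use the same table; the bits carry no such markers on their own.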

So, “E-Discovery” would be written in a binary ASCII sequence as:

0100010100101101010001000110100101110011011000110110111101110110011001010111001001111001

Now that you have some sense of how information can be written as digital data, let’s take a look at some of the devices that store and utilize digital data.

Introduction to Data Storage Media

Mankind has been storing data for thousands of years, on stone, bone, clay, wood, metal, glass, skin, papyrus, paper, plastic and film. In fact, people were storing data in binary formats long before the emergence of modern digital computers. Records from 9th century Persia describe an organ playing interchangeable cylinders. Eighteenth century textile manufacturers employed perforated rolls of paper to control looms, and Swiss and German music box makers used metal drums or platters to store tunes. At the dawn of the Jazz Age, no self-respecting American family of means lacked a player piano capable (more-or-less) of reproducing the works of the world’s greatest pianists.

Whether you store data as a perforation or a pin, you’re storing binary data. That is, there are two data states: hole or no hole, pin or no pin. Zeroes or ones.

Punched Cards

In the 1930’s, demand for electronic data storage led to the development of fast, practical and cost-effective binary storage media. The first of these were punched cards, initially made in a variety of sizes and formats, but ultimately standardized by IBM as the 80 column, 12 row (7.375” by 3.25”) format (right) that dominated computing well into the 1970’s. [From 1975-79, the author spent many a midnight in the basement of a computer center at Rice University typing program instructions on these unforgiving punch cards].

The 1950’s saw the emergence of magnetic storage as the dominant medium for electronic data storage, and it remains so today. Although optical and solid state storage are expected to ultimately eclipse magnetic media for local storage, magnetic storage will continue to dominate network and cloud storage well into the 2020s, if not beyond.

[Figure: IBM 5081 80 column card. Source: http://upload.wikimedia.org/wikipedia/commons/4/4d/CIMA_mg_8302.jpg]

Tape

The earliest popular form of magnetic data storage was magnetic tape. Spinning reels of tape were a clichéd visual metaphor for computing in films and television shows from the 1950s through the 1970s. Though the miles of tape on those reels now reside in cartridges and cassettes, tapes remain an enduring medium for backup and archival of electronically stored information. The LTO-5 format introduced in 2010 natively holds 1.5 terabytes of uncompressed data and delivers a transfer rate of 140 megabytes per second. Since most data stored on backup tape is compressed, the actual volume of ESI on tape may be 2-3 times greater than the native capacity of the tape.
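The LTO-5 figures quoted above imply a minimum time just to stream one tape, which matters when estimating restoration costs. The arithmetic is sketched below in Python under two assumptions that the text does not state: decimal units (one terabyte taken as 10¹² bytes, one megabyte as 10⁶ bytes) and a transfer rate sustained for the whole tape. Real-world restores will typically take longer.

```python
# Figures quoted in the text for LTO-5. Decimal units are an assumption of this sketch.
NATIVE_CAPACITY_BYTES = 1_500_000_000_000      # 1.5 terabytes
TRANSFER_RATE_BYTES_PER_SECOND = 140_000_000   # 140 megabytes per second

# Best-case time to stream one full tape from end to end.
full_read_seconds = NATIVE_CAPACITY_BYTES / TRANSFER_RATE_BYTES_PER_SECOND
full_read_hours = full_read_seconds / 3600

# With 2x to 3x compression, the ESI actually held on one tape.
compressed_low_bytes = NATIVE_CAPACITY_BYTES * 2
compressed_high_bytes = NATIVE_CAPACITY_BYTES * 3

print(f"Streaming one full tape: about {full_read_hours:.1f} hours")
print(f"Possible ESI per tape: {compressed_low_bytes / 1e12:.1f} to {compressed_high_bytes / 1e12:.1f} TB")
```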

Magnetic tape was the earliest data storage medium for personal computers including the pioneering Radio Shack TRS-80 and the very first IBM personal computer, the model 5150.

While tape isn’t as fast or capacious as hard drives, it’s proven to be more durable and less costly for long term storage; that is, so long as the data is being stored, not restored.

[Figures: LTO-5 Ultrium Tape; Sony AIT-3 Tape; SDLT-II Tape]

Chronology of Magnetic Tape Formats for Data Storage (Wikipedia)

1951 - UNISERVO
1952 - IBM 7 track
1958 - TX-2 Tape System
1962 - LINCtape
1963 - DECtape
1964 - 9 Track
1964 - MagCard Selectric typewriter
1966 - 8-Track Tape
1972 - QIC
1975 - KC Standard, Compact Cassette
1976 - DC100
1977 - Commodore Datasette
1979 - DECtapeII
1979 - Exatron Stringy Floppy
1983 - ZX Microdrive
1984 - Rotronics Wafadrive
1984 - IBM 3480
1984 - DLT
1986 - SLR
1987 - Data8
1989 - DDS/DAT
1992 - Ampex DST
1994 - Mammoth
1995 - IBM 3590
1995 - Redwood SD-3
1995 - Travan
1996 - AIT
1997 - IBM 3570 MP
1998 - T9840
1999 - VXA
2000 - T9940
2000 - LTO Ultrium
2003 - SAIT
2006 - T10000
2007 - IBM 3592
2008 - IBM TS1130
2011 - IBM TS1140

For further information, see Ball, Technology Primer: Backups in Civil Discovery at http://www.craigball.com/Ball_Technology%20Primer-Backups%20in%20E-Discovery.pdf

Floppy Disks

It’s rare to encounter a floppy disk today, but floppy disks played a central role in software distribution and data storage for personal computing for almost thirty years. Today, the only place a computer user is likely to see a floppy disk is as the menu icon for storage on the menu bar of Microsoft Office applications. All floppy disks have a spinning, flexible plastic disk coated with a magnetic oxide (e.g., rust). The disk is essentially the same composition as magnetic tape in disk form. Disks are formatted (either by the user or pre-formatted by the manufacturer) so as to divide the disk into various concentric rings of data called tracks, with tracks further subdivided into tiny arcs called sectors. Formatting enables systems to locate data on physical storage media much as roads and lots enable us to locate homes in a neighborhood.

Though many competing floppy disk sizes and formats have been introduced since 1971, only five formats are likely to be encountered in e-discovery. These are the 8”, 5.25”, 3.5” standard, 3.5” high density and Zip formats and, of these, the 3.5” HD format floppy with its 1.44 megabyte capacity is by far the most prevalent legacy floppy disk format.

[Figures: 8" Floppy Disk in Use; 8", 5.25" and 3.5" Floppy Disks]


The Zip Disk was one of several proprietary “super floppy” products that enjoyed brief success before the high capacity and low cost of recordable optical media (CD-R and DVD-R) and flash drives rendered them obsolete.

Optical Media

The most common forms of optical media for data storage are the CD, DVD and Blu-ray disks in read only, recordable or rewritable formats. Each typically exists as a 4.75” plastic disk with a metalized reflective coating and/or dye layer that can be distorted by a focused laser beam to induce pits and lands in the media. These pits and lands, in turn, interrupt a laser reflected off the surface of the disk to generate the ones and zeroes of digital data storage. The practical differences among the three prevailing forms of optical media are their native data storage capacities and the availability of drives to read them.

A CD (for Compact Disk) or CD-ROM (for Compact Disc Read-Only Memory) is read only and not recordable by the end user. It’s typically fabricated in a factory to carry music or software. A CD-R is recordable by the end user, but once a recording session is closed, it cannot be altered in normal use. A CD-RW is a re-recordable format that can be erased and written to multiple times. The native data storage capacity of a standard-size CD is about 700 megabytes.

A DVD (for Digital Versatile Disk) also comes in read only, recordable (DVD±R) and rewritable (DVD±RW) iterations and the most common form of the disk has a native data storage capacity of approximately 4.7 gigabytes. So, one DVD holds the same amount of data as six and one-half CDs. By employing the shorter wavelength of a blue laser to read and write disks, a dual layer Blu-ray disk can hold up to about 50 gigabytes of data, equalling the capacity of about ten and one-half DVDs. Like their predecessors, Blu-ray disks are available in recordable (BD-R) and rewritable (BD-RE) formats.

Though ESI resides on a dizzying array of media and devices, by far the largest complement of same occurs within three closely-related species of computing hardware: computers, hard drives and servers. A server is essentially a computer dedicated to a specialized task or tasks, and both servers and computers routinely employ hard drives for program and data storage.

Conventional Electromagnetic Hard Drives

A hard drive is an immensely complex data storage device that’s been engineered to appear deceptively simple. You connect a hard drive to your machine, the operating system detects the drive, assigns it a drive letter and, presto, you’ve got trillions of bytes of new storage!

[Figure: Zip Disk]

[Figure: http://upload.wikimedia.org/wikipedia/en/e/e9/DVD-4.5-scan.png]

Microprocessor chips garner the glory, but the humdrum hard drive is every bit a paragon of ingenuity and technical prowess. A conventional personal computer hard drive is a sealed aluminum box measuring (for a desktop system) roughly 4” x 6” x 1” in height. A hard drive can be located almost anywhere within the case and is customarily secured by several screws attached to any of ten pre-threaded mounting holes along the edges and base of the case. One face of the case will be labeled to reflect the drive specifications, while a printed circuit board containing logic and controller circuits will cover the opposite face.

A conventional hard disk contains round, flat discs called platters, coated on both sides with a special material able to store data as magnetic patterns. Much like phonograph records, the platters have a hole in the center allowing multiple platters to be stacked on a spindle for greater storage capacity.

The platters rotate at high speed—typically 5,400, 7,200 or 10,000 rotations per minute—driven by an electric motor. Data is written to and read from the platters by tiny devices called read/write heads mounted on the end of a pivoting extension called an actuator arm that functions similarly to the tone arm that carried the phonograph cartridge and needle across the face of a record. Each platter has two read/write heads, one on the top of the platter and one on the bottom. So, a conventional hard disk with three platters typically sports six surfaces and six read/write heads.

Unlike a record player, the read/write head never touches the spinning platter. Instead, when the platters spin up to operating speed, their rapid rotation causes air to flow under the read/write heads and lift them off the surface of the disk—the same principle of lift that operates on aircraft wings and enables them to fly. The head then reads the magnetic patterns on the disc while flying just .5 millionths of an inch above the surface. At this speed, if the head bounces against the surface, there is a good chance that the head will burrow into the surface of the platter, obliterating data, destroying both read/write heads and rendering the hard drive inoperable—a so-called “head crash.”


The hard disk drive has been around for more than 50 years, but it was not until the 1980’s that the physical size and cost of hard drives fell sufficiently for their use to be commonplace.

Introduced in 1956, the IBM 350 Disk Storage Unit pictured was the first commercial hard drive. It was 60 inches long, 68 inches high and 29 inches deep (so it could fit through a door). It held 50 magnetic disks with a combined 50,000 sectors, each sector storing 100 alphanumeric characters. Thus, it held 4.4 megabytes, or enough for about two cellphone snapshots today. It weighed a ton (literally), and users paid $130.00 per month to rent each megabyte of storage.

Today, that same $130.00 buys a 3-4 terabyte hard drive that stores some 3 million times more information than a single rented megabyte, weighs less than three pounds and hides behind a paperback book.
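The capacity and price comparisons above can be checked with simple arithmetic, sketched here in Python. Two assumptions are not stated in the text: that each stored character occupied seven bits (which is what makes 5,000,000 characters come to roughly 4.4 megabytes), and that a megabyte and terabyte are decimal units (10⁶ and 10¹² bytes).

```python
# IBM 350 capacity, from the sector and character counts given in the text.
SECTORS = 50_000
CHARACTERS_PER_SECTOR = 100
BITS_PER_CHARACTER = 7  # assumption of this sketch, not stated in the text

total_characters = SECTORS * CHARACTERS_PER_SECTOR
total_bytes = total_characters * BITS_PER_CHARACTER // 8
capacity_megabytes = total_bytes / 1_000_000

# What $130 bought then (one megabyte for a month) versus a 3 terabyte drive now.
ONE_MEGABYTE = 10**6
MODERN_DRIVE_BYTES = 3 * 10**12
storage_ratio = MODERN_DRIVE_BYTES // ONE_MEGABYTE

print(f"{total_characters:,} characters is about {capacity_megabytes:.1f} megabytes")
print(f"A 3 TB drive holds {storage_ratio:,} times as much as one megabyte")
```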

Over time, hard drives took various shapes and sizes (or “form factors” as the standard dimensions of key system components are called in geek speak). Three form factors are still in use: 3.5” (desktop drive), 2.5” (laptop drive) and 1.8” (iPod and microsystem drive, now supplanted by solid state storage).

Hard drives connect to computers by various mechanisms called “interfaces” that describe both how devices “talk” to one-another as well as the physical plugs and cabling required. The five most common hard drive interfaces in use today are:

PATA for Parallel Advanced Technology Attachment (sometimes called EIDE for Enhanced Integrated Drive Electronics)

SATA for Serial Advanced Technology Attachment

SCSI for Small Computer System Interface

SAS for Serial Attached SCSI

FC for Fibre Channel

Though once dominant in personal computers, PATA drives are rarely found in machines manufactured after 2006. Today, virtually all laptop and desktop computers employ SATA drives for local storage. SCSI, SAS and FC drives tend to be seen exclusively in servers and other applications demanding high performance and reliability.


From the user’s perspective, PATA, SATA, SCSI, SAS and FC drives are indistinguishable; however, from the point of view of the technician tasked to connect to and image the contents of the drive, the difference implicates different tools and connectors.

The five drive interfaces divide into two employing parallel data paths (PATA and SCSI) and three employing serial data paths (SATA, SAS and FC). Parallel ATA interfaces route data over multiple simultaneous channels necessitating 40 wires where serial ATA interfaces route data through a single, high-speed data channel requiring only 7 wires. Accordingly, SATA cabling and connectors are smaller than their PATA counterparts (see photos, right).

Fibre Channel employs optical fiber (the spelling difference is intentional) and light waves to carry data at impressive speeds. The premium hardware required by FC dictates that it will be found in enterprise computing environments, typically in conjunction with a high capacity/high demand storage device called a SAN (for Storage Area Network) or a NAS (for Network Attached Storage).

It’s easy to become confused between hard drive interfaces and external data transfer interfaces like USB or FireWire seen on external hard drives. The drive within the external hard drive housing will employ one of the interfaces described above (except FC); however, to facilitate external connection to a computer, a device called a bridge will convert data written to and from the hard drive to a form that can traverse a USB or FireWire connection. In some compact, low-cost external drives, manufacturers dispense with the external bridge board altogether and build the USB interface right on the hard drive’s circuit board.


Flash Drives, Memory Cards and Solid State Drives

Computer memory storage devices have no moving parts and the data resides entirely within the solid materials which compose the memory chips, hence the term, “solid state.” Historically, rewritable memory was volatile (in the sense that contents disappeared when power was withdrawn) and expensive. But, beginning around 1995, a type of non-volatile memory called NAND flash became sufficiently affordable to be used for removable storage in emerging applications like digital photography. Further leaps in the capacity and dips in the cost of NAND flash led to the near-eradication of film for photography and the extinction of the floppy disk, replaced by simple, inexpensive and reusable USB storage devices called, variously, flash drives, thumb drives, pen drives and memory sticks or keys.

As the storage capacity of NAND flash has gone up and its cost has come down, the conventional electromagnetic hard drive is rapidly being replaced by solid state drives in standard hard drive form factors. Solid state drives are significantly faster, lighter and more energy efficient than conventional drives, but they currently cost anywhere from 10-20 times more per gigabyte than their mechanical counterparts. All signs point to the ultimate obsolescence of mechanical drives by solid state drives, and some products (notably tablets like the iPad and ultra-lightweight laptops like the MacBook Air) have eliminated hard drives altogether in favor of solid state storage.

Currently, solid state drives assume the size and shape of mechanical drives to facilitate compatibility with existing devices. However, the size and shape of mechanical hard drives was driven by the size and operation of the platter they contain. Because solid state storage devices have no moving parts, they can assume virtually any shape. It’s likely, then, that slavish adherence to 2.5” and 3.5” rectangular form factors will diminish in favor of shapes and sizes uniquely suited to the devices that employ them.

With respect to e-discovery, the shift from electromagnetic to solid state drives is inconsequential. However, the move to solid state drives will significantly impact matters necessitating computer forensic analysis. Because the NAND memory cells that make up solid state drives wear out rapidly with use, solid state drive controllers must constantly reposition data to ensure usage is distributed across all cells. Such “wear leveling” hampers techniques that forensic examiners have long employed to recover deleted data from conventional hard drives.

RAID Arrays


Whether local to a user or in the Cloud, hard drives account for nearly all the electronically stored information attendant to e-discovery. In network server and Cloud applications, hard drives rarely work alone. That is, hard drives are ganged together to achieve greater capacity, speed and reliability in so-called Redundant Arrays of Independent Disks or RAIDs. In the SAN pictured at left, the 16 hard drives housed in trays may be accessed as Just a Bunch of Disks or JBOD, but it’s far more likely they are working together as a RAID.

RAIDs serve two ends: redundancy and performance. The redundancy aspect is obvious—two drives holding identical data safeguard against data loss due to mechanical failure of either drive—but how do multiple drives improve performance? The answer lies in splitting the data across more than one drive using a technique called striping.

A RAID improves performance by dividing data across more than one physical drive. The swath of data deposited on one drive in an array before moving to the next drive is called the "stripe." If you imagine the drives lined up alongside one another, you can see why moving back and forth across the drives to store data might seem like painting a stripe across the drives. By striping data, each drive can deliver its share of the data simultaneously, increasing the amount of information handed off to the computer’s microprocessor.

But, when you stripe data across drives, information is lost if any drive in the stripe fails. You gain performance, but surrender security.

This type of RAID configuration is called a RAID 0. It wrings maximum performance from a storage system; but it's risky.
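The striping idea described above can be sketched in a few lines of Python. This is an illustration only, not how any particular RAID controller is implemented: the function names `stripe` and `unstripe`, the two-drive array and the four-byte stripe size are all assumptions chosen to keep the example small.

```python
def stripe(data: bytes, num_drives: int, stripe_size: int) -> list[bytes]:
    """Deal the data out to the drives round-robin, one stripe at a time."""
    drives = [bytearray() for _ in range(num_drives)]
    for start in range(0, len(data), stripe_size):
        drive_index = (start // stripe_size) % num_drives
        drives[drive_index].extend(data[start:start + stripe_size])
    return [bytes(d) for d in drives]


def unstripe(drives: list[bytes], stripe_size: int) -> bytes:
    """Rebuild the original data by reading stripes back in the same order."""
    out = bytearray()
    offsets = [0] * len(drives)
    drive_index = 0
    while any(offsets[i] < len(drives[i]) for i in range(len(drives))):
        start = offsets[drive_index]
        out.extend(drives[drive_index][start:start + stripe_size])
        offsets[drive_index] += stripe_size
        drive_index = (drive_index + 1) % len(drives)
    return bytes(out)


original = b"ABCDEFGHIJ"
array = stripe(original, num_drives=2, stripe_size=4)
restored = unstripe(array, stripe_size=4)
```

Each simulated drive holds only part of the data, so the whole can be read back only while every drive survives. That is the RAID 0 trade-off: nothing in the array allows a lost drive's stripes to be rebuilt.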

If RAID 0 is for gamblers, RAID 1 is for the risk averse. A RAID 1 configuration duplicates everything from one drive to an identical twin, so that a failure of one drive won't lead to data loss. RAID 1 doesn't improve performance, and it requires twice the hardware to store the same information.

Other RAID configurations strive to integrate the performance of RAID 0 and the protection of RAID 1.

Thus, a "RAID 0+1" mirrors two striped drives, but demands four hard drives delivering only half their total storage capacity. Safe and fast, but not cost-efficient. The solution lies in a concept called parity, key to a range of other sequentially numbered RAID configurations. Of those other configurations, the ones most often seen are called RAID 5 and RAID 6.

To understand parity, consider the simple equation 5 + 2 = 7. If you didn't know one of the three values in this equation, you could easily solve for the missing value, i.e., presented with "5 + __ = 7," you can reliably calculate the missing value is 2. In this example, "7" is the parity value or checksum for "5" and "2."


The same process is used in RAID configurations to gain increased performance by striping data across multiple drives while using parity values to permit the calculation of any missing values lost to drive failure. In a three drive array, any one of the drives can fail, and we can use the remaining two to recreate the third (just as we solved for 2 in the equation above).

In this illustration, data is striped across three hard drives, HDA, HDB and HDC. HDC holds the parity values for data stripe 1 on HDA and stripe 2 on HDB. It's shown as "Parity (1, 2)." The parity values for the other stripes are distributed on the other drives. Again, any one of the three drives can fail and all of the data is recoverable. This configuration is RAID 5 and, though it requires a minimum of three drives, it can be expanded to dozens or hundreds of disks.
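The parity idea can be sketched in code. One caveat: the "5 + 2 = 7" example above uses addition as an analogy, whereas RAID parity is conventionally computed with the bitwise exclusive-or (XOR) operation, which has the same "solve for the missing value" property. The helper name `xor_blocks` and the sample stripe contents are hypothetical.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data stripes, as might sit on drives HDA and HDB (contents are made up).
stripe_1 = b"HELLO"
stripe_2 = b"WORLD"

# The parity stripe, as might sit on drive HDC.
parity = xor_blocks(stripe_1, stripe_2)

# Suppose the drive holding stripe_2 fails: rebuild it from the two survivors.
recovered = xor_blocks(stripe_1, parity)
```

Any one of the three values can be recomputed from the other two, so the array survives the loss of any single drive but not of two at once.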

Computers

Historically, all sorts of devices (and even people) were “computers.” During World War II, human computers, women for the most part, were instrumental in calculating artillery trajectories and assisting with the challenging number-crunching needed by the Manhattan Project. Today, laptop and desktop personal computers spring to mind when we hear the term “computer;” yet smart phones, tablet devices, global positioning systems, video gaming platforms, televisions and a host of other intelligent tools and toys are also computers. More precisely, the central processing unit (CPU) or microprocessor of the system is the “computer,” and the various input and output devices that permit humans to interact with the processor are termed peripherals. The key distinction between a mere calculator and a computer is the latter’s ability to be programmed and its use of memory and storage. The physical electronic and mechanical components of a computer are its hardware, and the instruction sets used to program a computer are its software. Unlike the interchangeable cams of Pierre Jaquet-Droz’ mechanical doll, modern electronic computers receive their instructions in the form of digital data typically retrieved from the same electronic storage medium as the digital information upon which the computer performs its computational wizardry.

When you push the power button on your computer, you trigger an extraordinary, expedited education that takes the machine from insensible illiterate to worldly savant in a matter of seconds. The process starts with a snippet of data on a chip called the ROM BIOS storing just enough information in its Read Only Memory to grope around for the Basic Input and Output System peripherals (like the keyboard, screen and, most importantly, the hard drive). The ROM BIOS also holds the instructions needed to permit the processor to access more and more data from the hard drive in a widening gyre, “teaching” itself to be a modern, capable computer.

This rapid, self-sustaining self-education is as magical as if you lifted yourself into the air by pulling on the straps of your boots, which is truly why it’s called “bootstrapping” or just “booting” a computer.


Computer hardware circa 2014 shares certain common characteristics. Within the CPU, a microprocessor chip is the computational “brains” of the system and resides in a socket on the motherboard, a rigid surface etched with metallic patterns serving as the wiring between the components on the board. The microprocessor generates considerable heat, necessitating the attachment of a heat dissipation device called a heat sink, often abetted by a small fan. The motherboard also serves as the attachment point for memory boards (grouped as modules or “sticks”) called RAM for Random Access Memory. RAM serves as the working memory of the processor while it performs calculations; accordingly, the more memory present, the more information can be processed at once, enhancing overall system performance.

Other chips comprise a Graphics Processing Unit (GPU) residing on the motherboard or on a separate expansion board called a video card or graphics adapter. The GPU supports the display of information from the processor onto a monitor or projector and has its own complement of memory dedicated to superior graphics performance. Likewise, specialized chips on the motherboard or an expansion board called a sound card support the reproduction of audio to speakers or headphones. Video and sound processing capabilities may even be fully integrated into the microprocessor chip.

The processor communicates with networks through an interface device called a network adapter which connects to the network physically, through a LAN Port, or wirelessly using a Wi-Fi connection.

Users convey information and instructions to computers using tactile devices like a keyboard, mouse or track pad, but may also employ voice or gestural recognition mechanisms.

Persistent storage of data is a task delegated to other peripherals: optical drives (CD-ROM and DVD-ROM devices), floppy disk drives, portable solid-state media (e.g., thumb drives) and, most commonly, hard drives.

All of the components just described require electricity, supplied by batteries in portable devices or by a power supply converting AC current to the lower DC voltages required by electronics.


From the standpoint of electronic discovery, it’s less important to define these devices than it is to fathom the information they hold, the places it resides and the forms it takes. Parties and lawyers have been sanctioned for what was essentially their failure to inquire into and understand the roles computers, hard drives and servers play as repositories of electronic evidence. Moreover, much money spent on electronic discovery today is wasted as a consequence of parties’ efforts to convert ESI to paper-like forms instead of learning to work with ESI in the forms in which it customarily resides on computers, hard drives and servers.

Servers

Servers were earlier defined as computers dedicated to a specialized task or tasks. But that definition doesn’t begin to encompass the profound impact upon society of the so-called client-server computing model. The ability to connect local “client” applications to servers via a network, particularly to database servers, is central to the operation of most businesses and to all telecommunications and social networking. Google and Facebook are just enormous groupings of servers, and the Internet is merely a vast, global array of shared servers.

Local, Cloud and Peer-to-Peer Servers

For e-discovery, let’s divide the world of servers into three realms: Local, Cloud and Peer-to-Peer server environments.

“Local” servers employ hardware that’s physically available to the party that owns or leases the servers. Local servers reside in a computer room on a business’ premises or in leased equipment “lockers” accessed at a co-located data center where a lessor furnishes, e.g., premises security, power and cooling. Local servers are easiest to deal with in e-discovery because physical access to the hardware supports more and faster options when it comes to preservation and collection of potentially responsive ESI.

“Cloud” servers typically reside in facilities not physically accessible to persons using the servers, and discrete computing hardware is typically not dedicated to a particular user. Instead, the Cloud computing consumer is buying services via the Internet that emulate the operation of a single machine or a room full of machines, all according to the changing needs of the Cloud consumer. Web mail is the most familiar form of Cloud computing, in a variant called SaaS (for Software as a Service). Webmail providers like Google, Yahoo and Microsoft make e-mail accounts available on their servers in massive data centers, and the data on those servers is available solely via the Internet, no user having the right to gain physical access to the machines storing their messaging.

“Peer-to-Peer” (P2P) networks exploit the fact that any computer connected to a network has the potential to serve data across the network. Accordingly, P2P networks are decentralized; that is, each computer or “node” on a P2P network acts as client and server, sharing storage space, communication bandwidth and/or processor time with other nodes. P2P networking may be employed to share a printer in the home, where the computer physically connected to the printer acts as a print server for other machines on the network. On a global scale, P2P networking is the technology behind file sharing applications like BitTorrent and Gnutella that have garnered headlines for their facilitation of illegal sharing of copyrighted content. When users install P2P applications to gain access to shared files, they simultaneously (and often unwittingly) dedicate their machine to serving up such content to a multitude of other nodes.

Virtual Servers

Though we’ve so far spoken of server hardware, i.e., physical devices, servers may also be implemented virtually, through software that emulates the functions of a physical device. Such “hardware virtualization” allows for more efficient deployment of computing resources by enabling a single physical server to host multiple virtual servers.

Virtualization is the key enabling technology behind many Cloud services. If a company needs powerful servers to launch a new social networking site, it can raise capital and invest in the hardware, software, physical plant and personnel needed to support a data center, with the attendant risk that it will be over-provisioned or under-provisioned as demand fluctuates. Alternatively, the startup can secure the computing resources it needs by using virtual servers hosted by a Cloud service provider like Amazon, Microsoft or Rackspace. Virtualization permits computing resources to be added or retired commensurate with demand, and being pay-as-you-go, it requires little capital investment.

It’s helpful for attorneys to understand the role of virtual machines (VMs) because the ease and speed with which VMs are deployed and retired, as well as their isolation within the operating system, can pose unique risks and challenges in e-discovery, especially with respect to implementing a proper legal hold and when identifying and collecting potentially responsive ESI.

Server Applications

Computers dedicated to server roles typically run operating systems optimized for server tasks and applications specially designed to run in a server environment. In turn, servers are often dedicated to supporting specific functions such as serving web pages (Web Server), retaining and delivering files from shared storage allocations (File Server), organizing voluminous data (Database Server), facilitating the use of shared printers (Print Server), running programs (Application Server) or handling messages (Mail Server). These various server applications may run physically, virtually or as a mix of the two.

Practice Tips for Computers, Hard Drives and Servers

Your first hurdle when dealing with computers, hard drives and servers in e-discovery is to identify potentially responsive sources of ESI and take appropriate steps to inventory their relevant contents and preserve them against spoliation. As the volume of ESI to be collected and processed bears on the expense and time required, it’s useful to get a handle on data volumes and distribution as early in the litigation process as possible.


Start your ESI inventory by taking stock of physical computing and storage devices. For each machine or device holding potentially responsive ESI, collect the following information (as applicable):

- Manufacturer and model
- Serial number and/or service or asset tag
- Operating system
- Custodian
- Location
- Type of storage (don’t miss removable media, like SD cards)
- Aggregate storage capacity (in MB, GB or TB)
- Encryption status
- Credentials (user IDs and passwords), if encrypted
- Prospects for upgrade or disposal

If you’ll preserve ESI by drive imaging, it’s helpful to identify device interfaces.

For servers, further information might include:

- Purpose(s) of the server (e.g., web server, file server, print server, etc.)
- Names and contact information of server administrator(s)
- Time in service
- Whether hardware virtualization is used
- RAID implementation(s)
- Users and privileges
- Logging and log retention practices
- Backup procedures and backup media rotation and retention
- Whether the server is “mission critical” and cannot be taken offline or can be downed
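An inventory like the one outlined above is easy to keep in a structured form that totals data volumes and exports to a spreadsheet. What follows is a minimal sketch, not a prescribed format: the `DeviceRecord` class, its field names and the sample entries are hypothetical, and credentials are deliberately left out because they belong in a secured store rather than a shared spreadsheet.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DeviceRecord:
    """One row of a device inventory (field names are illustrative only)."""
    manufacturer_model: str
    serial_or_asset_tag: str
    operating_system: str
    custodian: str
    location: str
    storage_type: str
    capacity_gb: float
    encrypted: bool
    interface: str = ""            # e.g. SATA or SAS; useful if imaging is planned
    upgrade_or_disposal: str = ""  # known plans to replace or retire the device


def total_capacity_gb(records: list[DeviceRecord]) -> float:
    """Aggregate storage capacity, a rough upper bound on collection volume."""
    return sum(r.capacity_gb for r in records)


def inventory_to_csv(records: list[DeviceRecord]) -> str:
    """Render the inventory as CSV text that opens in any spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(DeviceRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Sample entries; every value here is made up for illustration.
inventory = [
    DeviceRecord("Example Laptop 100", "TAG-0001", "Windows 7", "J. Doe",
                 "Houston office", "2.5-inch hard drive", 500.0, True, "SATA"),
    DeviceRecord("Example Camera", "TAG-0002", "n/a", "J. Doe",
                 "Houston office", "SD card", 256.0, False),
]
csv_text = inventory_to_csv(inventory)
capacity = total_capacity_gb(inventory)
```

Aggregate capacity overstates what will actually be collected, but it gives an early ceiling for estimating time and cost.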

When preserving the contents of a desktop or laptop computer, it’s typically unnecessary to sequester any component of the machine other than its hard drive(s) since the ROM BIOS holds little information beyond the rare forensic artifact. Before returning a chassis to service with a new hard drive, be sure to document the custodian, manufacturer, model and serial number/service tag of the redeployed chassis, retaining this information with the sequestered hard drive.

The ability to fully explore the contents of servers for potentially responsive information hinges upon the privileges extended to the user. Be sure that the person tasked to identify data for preservation or collection holds administrator-level privileges.

Above all, remember that computers, hard drives and servers are constantly changing while in service. Simply rebooting a machine alters system metadata values for large numbers of files. Accordingly, you should consider the need for evidentiary integrity before exploring the contents of a device, at least until appropriate steps are taken to guard against unwitting alteration. Note also that connecting an evidence drive to a new machine effects changes to the evidence unless suitable write blocking tools or techniques are employed.


Eight Tips to Quash the Cost of E-Discovery

This really happened: Opposing counsel supplied an affidavit stating it would take thirteen years to review 33 months of e-mail traffic for thirteen people. Counsel averred there would be about 950,000 messages and attachments after keyword filtering. Working all day, every day reviewing 40 documents per hour, they expected first level review to wrap up in 23,750 hours. A more deliberate second level review of 10-15% of the items would require an additional two years. Finally, counsel projected another year to prepare a privilege log. Cost: millions of dollars. The arithmetic was unassailable, and a partner in a prestigious law firm swore to its truth under oath.

This could have happened: On Monday afternoon, an associate attached a hard drive holding 33 months of e-mail for thirteen custodians to the USB port of her computer and headed home. Overnight, e-discovery review software churned through the messages and attachments indexing their contents for search and de-duplicating redundant data. The next morning, the associate identified responsive documents using keywords and concept clustering. She learned the lingo, mastered the acronyms and identified common misspellings. She found large swaths of irrelevant data that could be safely eliminated from the collection and began segregating responsive and non-responsive items. By lunchtime on Wednesday, the software started asking whether particular items were responsive. Before she called it a day, the associate ceded much of the heavy lifting to the program’s technology-assisted review capabilities and shifted her attention to searching for lawyers’ names and e-mail domains to flag privileged communications. She spent Thursday afternoon sampling items the computer identified as non-responsive to be assured of the quality of review. Before she called it a day, the associate tasked the software to generate a production set and a privilege log for partner review on Friday and wondered if it might be a good weekend to head to the beach. Cost: 40 associate hours.

These two scenarios contrast the gross disparity in review costs and time between lawyers who approach e-discovery in ignorance and those who do so with skill. The Luddite lawyer who knows nothing of modern methods misleads the court and cheats the client. The adept associate proves that e-discovery is fast and affordable when the right tools and talents are brought to bear. Electronically stored information (ESI) serves us in all our day-to-day endeavors. ESI can and should serve us just as well in our search for probative evidence and in the resolution of disputes.

You Must Make It Happen

Finding efficiencies and avoiding dumb decisions in electronic discovery isn’t someone else’s responsibility. It’s yours. If someone else must perennially whisper in your ear, articulating the issues and answering the questions you should be competent to address, you aren’t serving your client.
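The affidavit's arithmetic in the first scenario of this section can be checked in a few lines. The document count, review rate and year figures come from the account above. The ten-year span for first-level review is an inference (thirteen years less the two-year and one-year phases), not a number the affidavit is quoted as stating.

```python
# Figures taken from the affidavit as described in the text.
documents = 950_000   # messages and attachments after keyword filtering
docs_per_hour = 40    # stated first-level review rate

first_level_hours = documents / docs_per_hour

# Inference, not a quoted figure: thirteen years in total, less two years of
# second-level review and one year for the privilege log, leaves ten years.
first_level_years = 13 - 2 - 1
implied_hours_per_year = first_level_hours / first_level_years

# The alternative scenario: one associate, roughly one working week.
associate_hours = 40
ratio = first_level_hours / associate_hours

print(f"First-level review: {first_level_hours:,.0f} hours")
print(f"Implied pace: {implied_hours_per_year:,.0f} review hours per year")
print(f"That is about {ratio:,.0f} times the associate's {associate_hours} hours")
```

The sums do hold, which is the point of the story: the multiplication was sound, and the waste lay in assuming that every item needed linear, eyes-on review.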


ESI isn’t going away, nor will it wane in quantity, variety or importance as evidence. Each day you fail to hone your e-discovery skills is a day closer to losing a case or losing a client. Each day you learn something new about ESI and better appreciate how to request, find, cull, review and produce it at lowest cost is a day that cements your worth to your clients and makes you a more effective counselor and advocate.

Eight Tips to Quash the Cost of E-Discovery

The following tips are offered to help you slash the outsize cost of e-discovery:

1. Eliminate Waste
2. Reduce Redundancy and Fragmentation
3. Don’t Convert ESI
4. Review Rationally
5. Test your Methods and Know your ESI
6. Use Good Tools
7. Communicate and Cooperate
8. Price is What the Seller Accepts

1. Eliminate Waste

The author once polled thought leaders in electronic discovery about costs. They uniformly agreed that about half of every e-discovery dollar is expended unnecessarily as a consequence of counsel lacking competence with respect to ESI. Half was kind. Every time you over-preserve or over-collect ESI, every time you convert native data to alternate forms or fail to deduplicate ESI before review and every time you otherwise review information that didn’t warrant “eyes on,” you add cost without benefitting your client. It’s money wasted.

Poor e-discovery choices tend to be driven by irrational fears, and irrational fears flow from lack of familiarity with systems, tools and techniques that achieve better outcomes at lower cost. The consequences of poor e-discovery decisions prompt motions to compel or for sanctions, further ratcheting up the cost of incompetence.

2. Reduce Redundancy and Fragmentation

Many complain that electronic discovery has made litigation more costly because there is so much more information available today. Certainly, there are more channels of information available today, allowing an enlightened advocate more probative evidence. Much of what evaporated as a phone conversation now endures as a writing. There is more temporal, photographic and geolocation data to draw on, and more “persons with knowledge of relevant facts” who are privy to revealing information. Despite there being more, the increase doesn’t reflect the dire exponential leap in data volume some suggest. Much of the growth is attributable to replication and fragmentation.

Put simply, human beings don’t create that much more unique information; they mostly make more copies of the same information and break it into smaller pieces. Yesterday’s memo sent to three people is today’s 30-message thread sent to the whole department and retrieved on multiple devices. These iterations add a lot to the quantity of ESI, but little in the way of truly unique evidence. Thus, the burden and cost of e-discovery is inversely proportional to a litigant’s ability to reduce redundancy and fragmentation.

There are many ways to minimize redundancy and fragmentation. Some entail sensible choices during identification and collection; others involve the application of tools and techniques geared to eliminating replication and organizing fragmented information for efficient review.

Anyone who has done a document review can attest to the tedium of seeing the same documents over and over again. Messages repeat within threads or across recipients, and attachments to messages mirror documents from file servers. Some of this can be readily eliminated by simple hash-based deduplication that costs very little and reliably eliminates documents that are duplicates in all respects. Hash-based deduplication calculates a “digital fingerprint” value (variously called an MD5 or SHA1 value) for each document, allowing redundant documents to be excluded from review. Nothing offers a more cost-effective means to reduce the cost of document review than deduplication; consequently, no one should undertake a document review without minimally running a simple hash-based deduplication to eliminate replication.

Unfortunately, simple hash-based deduplication doesn’t work for e-mail messages (which necessarily reflect different routing information for different recipients) or for documents with minor variations that don’t signify material differences in content. For these items, more advanced near-deduplication techniques are needed to eliminate redundancy without increasing the risk that unique documents will be overlooked.
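Simple hash-based deduplication of the kind described above can be sketched with Python's standard `hashlib` module. This is a minimal illustration, not a description of how any commercial tool works: the function names, the in-memory dictionary of documents and the sample contents are assumptions, and a real collection would be hashed from files on disk.

```python
import hashlib


def fingerprint(content: bytes, algorithm: str = "sha1") -> str:
    """Return the hexadecimal 'digital fingerprint' of a document's bytes."""
    return hashlib.new(algorithm, content).hexdigest()


def deduplicate(documents: dict[str, bytes]) -> tuple[dict[str, str], list[tuple[str, str]]]:
    """Split documents into unique items and exact duplicates.

    Returns (unique, duplicates): unique maps each hash to the first document
    seen with that hash; duplicates pairs each redundant document with the
    document it duplicates.
    """
    unique: dict[str, str] = {}
    duplicates: list[tuple[str, str]] = []
    for name, content in documents.items():
        digest = fingerprint(content)
        if digest in unique:
            duplicates.append((name, unique[digest]))
        else:
            unique[digest] = name
    return unique, duplicates


# Hypothetical collection: two byte-identical memos and one trivial variant.
collection = {
    "memo.docx": b"Quarterly results",
    "copy of memo.docx": b"Quarterly results",
    "memo v2.docx": b"Quarterly results.",
}
unique, duplicates = deduplicate(collection)
```

The third document differs by a single character, so its hash differs and it stays in the review set. That is the limitation that near-deduplication techniques are meant to address.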
Deduplication is a mechanical process requiring little, if any, human intervention or costly programming. Accordingly, its cost should always be a nominal component of an e-discovery effort. If a service provider attempts to charge princely sums for deduplication, consider it a sign that it’s time to find a new vendor. When the volume of information to be deduplicated is modest (e.g., less than 10-15 GB), low cost tools are available to deduplicate without the need to engage a service provider.[1]

[1] One of the finest tools for deduplicating collections less than 15GB is called Prooffinder (www.prooffinder.com). It costs $100.00 for an annual license, and all proceeds from its sale go to support child literacy.

3. Don’t Convert ESI

It’s criminal how much money is wasted converting electronic information into paper-like forms just so lawyers don’t have to update workflows or adopt contemporary review tools. Our clients work with native forms of ESI because native forms are the most utile, complete and efficient forms in which to store and access data. Our clients don’t print their e-mail before reading it. Our clients don’t see the need to emboss the document’s name on every page. Our clients communicate and collaborate using tracked changes and embedded comments, yet many lawyers intentionally or unwittingly purge these changes and comments in e-discovery and fail to disclose such redaction. They do it by converting native forms to images, like TIFF.

Converting a client’s ESI from its natural state as kept “in its ordinary course of business” to TIFF images injects needless expense in at least half a dozen ways. First, you must pay someone to convert native forms to TIFF images and emboss Bates numbers. Second, you must pay someone to generate load files containing extracted text and application metadata from the native ESI. Third, you must produce multiple copies of certain documents (like spreadsheets) that are virtually incapable of being produced as TIFF images. Fourth, because TIFF images paired with load files are much “fatter” files than their native counterparts, you pay much more for vendors to ingest and host them by the gigabyte. Fifth, it’s very difficult to reliably deduplicate documents once they have been converted to TIFF images. Sixth, you may have to reproduce everything when your opponent wises up to the fact that you’ve substituted cumbersome TIFF images and load files for the genuine, efficient evidence.

4. Review Rationally

Recently, an opponent advised the Court that their projected cost of review encompassed the obligation to look at every e-mail attachment when the body of the e-mail message contained a keyword hit, even when none of the attachments contained a hit. They made this representation knowing that the majority of the hits would prove to be noise hits, that is, keywords in a context that doesn’t denote responsiveness.
Why would a party incur the expense to review the attachments to a message they’d determined was non-responsive when the attachments contained no keywords? It turned out they had separated attachments from e-mail transmittals, surrendering the ability to know which attachments could be eliminated from review because the transmitting message was eliminated from review. That’s not a rational approach to review.

A common irrational approach to review is to treat information in any form from any source as requiring privilege review when even a dollop of thought would make clear that not all forms or sources of ESI are created equal when it comes to their potential to hold privileged content. Review accounts for anywhere from 60-90% of the cost of e-discovery; so, anything that defensibly narrows the scope of review prompts maximum savings. Almost anytime you can use technology to isolate privileged content and prudently employ a clawback agreement or Federal Rule of Evidence 502 to guard against inadvertent disclosure, you can slash the cost of privilege review.

5. Test your Methods and Know your ESI

Staggering sums are spent in e-discovery to collect and review data that would never have been collected if only someone had run a small scale test before deploying an enterprise search. It’s easy and inexpensive to test proposed searches against representative samples of data (e.g., one key custodian’s mailbox) so as to identify outcomes that will unduly drive up the cost of ingestion, hosting and review. This entails more than simply eliminating queries with large numbers of hits; it requires modifying them to balance the incidence of noise hits against hits on responsive data.

A lot of money gets wasted in e-discovery over disputes that could be quickly resolved if someone simply knew more about the ESI, i.e., if someone simply looked. Here again, the software and file types used, the nature and configuration of the e-mail system, the retention scheme for backup media or whether a key custodian used a home system for business are all examples of information that can serve to facilitate decisions that will narrow the scope of collection and review with consequent cost savings.

6. Use Good Tools

If you needed to dig a big hole, you wouldn’t use a teaspoon, nor would you hire a hundred people with teaspoons. You’d use the right power tool and a skilled operator.

You can’t efficiently collect or review ESI without using good tools. Anyone engaging in e-discovery should be able to answer the question, “What’s your review platform?” They should be able to articulate why they use one review platform over another, and “because we already owned a copy” is not the best reason.

A review platform is the software tool used to index, sort, search, view, organize and tag ESI. Choosing the right review platform for your practice requires understanding your workflow, personnel, search needs and forms in which ESI will be ingested and produced. Review platforms can be cost-prohibitive for some practitioners, but it’s untenable to manage ESI in discovery without a capable review platform. There are many review platforms on the market, including familiar names like Relativity, Concordance and Summation. There are also Internet-accessible “hosted” review environments and many proprietary review tools touting more bells and whistles than a Mardi Gras parade.
Among the most important consideration in selecting a review platform is its ability to accept data in forms that do not to require costly conversion to TIFF images. Additionally, you may want the platform you select to support the most advanced forms of technology-assisted search and review that your budget allows, including predictive coding capabilities. 7. Communicate and Cooperate Poor communication and lack of cooperation between parties on e-discovery issues contribute markedly to increased cost. The incentives driving transparency and cooperation in e-discovery are often misunderstood. You don’t communicate or cooperate with an opponent to help them win their case on the merits; you do it to permit the case to be resolved on its merits and not be derailed or made more expensive by e-discovery disputes.


Much of the waste in e-discovery grows out of apprehension and uncertainty. Litigants often over-collect and over-review, preferring to spend more than necessary instead of giving the transparency needed to secure a crucial concession on scope or methodology. Communication and cooperation in e-discovery are not signs of weakness but of strength. Cooperation is a means to demonstrate that your client understands its e-discovery obligations and is meeting them. More, it’s a means to build trust in the scope and methods of discovery so as to forestall challenges that may prove disruptive to the case and the client’s operations. It’s even possible that your opponent understands e-discovery or your client’s systems better than you do and can propose more efficient ways to scope and complete the effort. What an opponent will accept in a cooperative give-and-take is often less onerous than what you were planning to produce. Put simply: the more you seek to hide the ball, the more likely a savvy opponent will dig deeper and find something your side missed. Because there are no perfect e-discovery efforts, there are none that can withstand the heightened scrutiny invited by shortsighted stonewalling. Hubris doesn’t help. Most flaws in e-discovery processes can be rectified quickly and cheaply when they surface early. An overlooked variant on a keyword or a missed file type is easy to fix at the outset, but can prove costly or irreparable when discovered months or years later. Moreover, disclosure tends to shift the burden to act. Courts tend not to entertain belated objections from parties who’d been supplied sufficient information to act promptly. 8. Price is What the Seller Accepts I’ve haggled in bazaars and markets from Cairo to Kowloon; but, I’ve never seen more pliant pricing than among those hawking e-discovery tools and services in the United States. 
A famous/infamous e-discovery vendor once quoted $43.5 million for a six-week engagement processing a very large volume of data on an expedited basis. The prospect was desperate, but not insane. Rebuffed, the vendor re-quoted the job the next day for several million dollars less. They “sharpened their pencil” again the next day…and the next. Before the week was out, the vendor was proposing to do the job for $3.5 million. They didn’t get the work. Service providers have to pay staff and keep the lights on. So, almost any work beats no work at all. Many will accept work that isn’t profitable, if it keeps a competitor from getting the business. Shop around. Make an offer. Only a sucker pays rack rate.

Make yourself sheep and the wolves will eat you. Benjamin Franklin


Copyright 2005-11


E-Discovery for Everybody: The EDna Challenge Craig Ball

© 2009

E-discovery is just for big budget cases involving big companies, handled by big firms. Right, and suffrage is just for white, male landowners. Some Neanderthal notions take longer than others to get shown the door, and it's time to dispel the mistaken belief that e-discovery is just for the country club set. Today, evidence means electronic evidence; so, like the courts themselves, access to evidence can't be just for the privileged. Everyone gets to play. If you think big firms succeed at e-discovery because they know more than you do, think again. Marketing hype aside, big firm litigators don't know much more about e-discovery than solo practitioners. Corporate clients hire pricey vendors with loads of computing power to index, search, de-duplicate, convert and manage terabytes of data. Big law firms deploy sophisticated in-house or hosted review platforms that let armies of associates and contract lawyers plow through vast plains of data--viewing, tagging, searching, sorting and redacting with a few keystrokes. The big boys simply have better toys. A hurdle for everyone else is the unavailability and high cost of specialized software to process and review electronic evidence. A Mercedes and a Mazda both get you where you need to go, but the e-discovery industry has no Mazdas on the lot. This article explores affordable, off-the-shelf ways to get where you need to go in e-discovery. One Size Doesn't Fit All First, let's set sensible expectations: Vast, varied productions of ESI cannot be efficiently or affordably managed and reviewed with software from Best Buy. If you're grappling with millions of files and messages, you'll need to turn to some pretty pricy power tools. The key consideration is workflow. Tools designed for ESI review can save considerable time over cobbled-together methods employing off-the-shelf applications; and, when every action is extrapolated across millions of messages and documents, seconds saved add up to big productivity gains. 
But few cases involve millions of files. Most entail review of material collected from a handful of custodians in familiar productivity formats like Outlook e-mail, Word documents, Excel spreadsheets and PowerPoint presentations. Yes, volume is a challenge in these cases, too; but, a mix of low-cost tools and careful attention to process makes it possible to do defensible e-discovery on the cheap. Paper Jam More from comfort than sense, ESI in smaller cases tends to be printed out. Paper filled the void for a time, but lately the cracks are starting to show. Lawyers are coming to appreciate that printing evidence isn't just more expensive and slower, it puts clients at an informational disadvantage. When you print an electronic document, you lose three things: Money, time and metadata. Money and time are obvious, but the impact of lost metadata is often missed. When you move ESI to paper or paper-like formats like TIFF images, you cede most of your ability to search and authenticate information, along with the ability to quickly and reliably exclude irrelevant data. Losing metadata isn't about missing the chance to mine embedded information for smoking guns. That's secondary. Losing metadata is like losing all the colors, folders, staples, dates and page numbers that help paper records make sense. The EDna Challenge I polled a group of leading e-discovery lawyers and forensic technologists to see what tools and techniques they thought suited to the following hypothetical:

Your old school chum, Edna, runs a small firm and wants your advice. A client is about to send her two DVDs containing ESI collected in a construction dispute. It will be Outlook PST files for six people and a mixed bag of Word documents, Excel spreadsheets, PowerPoint presentations, Adobe PDFs and scanned paper records sans OCR. There could be a little video, some photographs and a smattering of voicemail in WAV formats. "Nothing too hinky," she promises. Edna's confident it will comprise less than 50,000 documents and e-mails, but it could grow to 100,000 items before the case concludes in a year or two. Edna's determined to conduct an in-house, paperless privilege and responsiveness review, sharing the task with a tech-savvy associate and legal assistant. All have late-model, big screen Windows desktop PCs with MS Office Professional 2007 and Adobe Acrobat 9.0 installed. The network file server has ample available storage space. Edna doesn't own Summation or Concordance, but she's willing to spend up to $1,000.00 for new software and hardware, but not a penny more. She's open to an online Software as a Service (SaaS) option, but the review has to be completed using just the hardware and software she currently owns, supplemented only by the $1,000.00 in new purchases. Her team will supply as much brute force as necessary. She's too proud to accept a loan of systems or software, and you can't change her mind or budget.

How should Edna proceed? Goals of the Challenge Ideally, the review method employed should:


1. Preserve relevant metadata;
2. Incorporate de-duplication, as feasible;
3. Support robust search of Outlook mail and productivity formats;
4. Allow for efficient workflow;
5. Enable rudimentary redaction;
6. Run well on most late-model personal computers; and
7. Require no more than $1,000.00 in new software or hardware, though it's fine to use fully-functional "free trial" software so long as you can access the data for the 2-3 year life of the case.

I had some ideas (shared later in this article), but expected my colleagues might point me to better mousetraps. Instead, I was struck by how familiar and consistent their excellent suggestions were, tracking options that have been around for years. Sadly, there's not that much new for those on shoestring budgets; that is, developers remain steadfastly uninterested in 85% of the potential market for desktop discovery tools. One possible bright spot was the emergence of hosted options. No one was sure the job could be begun--let alone completed--using SaaS on so tight a budget; but, there was enough mention of SaaS to make it seem like a possibility, now or someday soon. Advice to Edna While the range of proposals was thin, the thought behind them was first-rate. All responding recognized the peril of using the various Microsoft applications to review the ESI. Outlook's search capabilities are limited, especially with respect to attachments. If Edna expected to reliably search inside of every message, attachment and container file, she would need more than Outlook alone. Notable by their absence were any suggestions to use Google's free desktop indexing and search tool. Though a painful interface for e-discovery, Google Desktop installed on a dedicated, "clean" machine would be capable of reading and searching Outlook e-mail, Word documents, Excel spreadsheets, PowerPoint presentations, PDF files, Zip archives and even text within music, video and image files. It wouldn't be pretty--and Edna would have to scrupulously guard against cross-contamination of the evidence with other data--but Google Desktop might get much of the job done without spending a penny.
Quin Gregor of Strategic Data Retention LLC in Georgia was first to respond with an endorsement of my two favorite affordable workhorses, the ubiquitous dtSearch indexing and search tool ($199.00 at www.dtsearch.com) and Aid4Mail ($69.95 at www.fookes.com), a robust utility for opening, filtering and converting common e-mail container files and message formats. Quin described a bankruptcy case where a microscopic budget necessitated finding a low-end option. He reports that dtSearch and Aid4Mail saved the day.


Ron Chichester, an attorney and forensic examiner in Texas, pointed to the many open source Linux tools available without cost. These command line interface tools are capable of indexing, Bayesian analysis and much of the heavy lifting of the tools used by e-discovery vendors; but, Ron acknowledged that Edna and her staff would need a lot of Linux expertise to integrate the open source offerings. Bottom line: The price is right, but the complexity is unacceptable. Florida e-discovery author and blogger, Ralph Losey, a partner at AkermanSenterfitt, suggested using an online review tool like Catalyst and tried to dance around the budget barrier by pointing out that the cost could be passed on to the client. Ralph argued that hosting would save enough lawyer time to pay for itself. No doubt he's right; but, passing on the costs isn't permitted in the Edna Challenge and, even in a real world situation, unless the savings were considerable, Edna's likely to keep the work--and the revenue--in house. Another Floridian, veteran forensic examiner, Dave Kleiman, suggested that Edna blow her budget on alcohol and amphetamines because she has a lot of toil ahead of her. Party on, Dave! Our northern neighbor, Dominic Jaar of Ledjit Consulting Inc. in Quebec, took a similar doleful tack. Dominic thought that SaaS might be a possibility but added that Edna should use her grand to take an e-discovery course because she needs to learn enough to "stay far from the case." Else, he offered, she could go forward and apply the funds to coffee and increased malpractice coverage. Ouch! John Simek of Sensei Enterprises in Virginia prudently suggested that Edna use part of her budget to buy an hour of a consultant's time to help her get started. John predicted that a SaaS approach would be priced out-of-reach, but was another who thought salvation lay with dtSearch. John recognized that Adobe Acrobat could handle both the redaction and light-duty OCR required.
As for the images, video and sounds, Edna's in the same boat, rich or poor. She's just going to have to view or listen to them, one-by-one. Jerry Hatchett with Evidence Technology in Houston suggested LitScope, a SaaS offering from LitSoft. Jerry projected a cost of around $40/GB/month, which would burn through Edna's budget in about 3 months...if she didn't buy any Starbucks. Following up, I discovered that LitScope can't ingest the native file formats Edna needed to review unless accompanied by load files containing the text and metadata of the documents and messages. The cost to pre-process the data to load it would eat up Edna's budget before she looked at a single page. That, and a standard $200 minimum on monthly billings coupled with a 6-month minimum commitment, made this SaaS option a non-starter. Attractive pricing, to be sure, but not low enough for Edna's shallow pockets. The meager budget forced George Rudoy, Director of Global Practice Technology & Information Services at Shearman & Sterling, LLP in New York, to suggest using Outlook 2007 as the e-mail review tool, adding the caveat that metadata may change. Unlike earlier versions, Outlook 2007 claims to extend its text search capabilities to attachments. Unfortunately, it doesn't work very well


in practice, meaning Edna and her staff will need to examine each attachment instead of ruling any out by search. George also urged Edna to buy licenses for Quick View Plus--a universal file viewer utility--and hire an Access guru to design a simple database to track the files and hyperlink to each one for review. From Down Under, Michelle Mahoney of Mallesons Stephen Jaques in Melbourne shared several promising approaches. She suggested Karen's Power Tools (a $30 suite of applications at www.karenware.com) as a means to inventory and hash the files and Microsoft Access as a means to de-duplicate by hash values. Michelle also favored hyperlinking from Access for review, working through the collection progressively, ordering them by file type and then filename. She envisions adding fields to the database for Relevant and Privileged designations and a checkbox for exceptional files that can't be opened and require further work. For the e-mail files, Michelle also turns to Outlook as a review tool, proposing that folders be created for dragging-and-dropping items into Relevant Non Privileged; Relevant Privileged and Non Relevant groups. She echoed warnings about metadata modification and gives her thumbs up to Aid4Mail. Finally, Michelle offers more kudos for dtSearch as the low cost tool-of-choice for keyword searching. dtSearch will allow Edna to run keywords across files, including emails and attachments, and it is a simple file copy option to copy them, with or without original path, into a folder. Messages emerge in the generic MSG mail format, and Edna can either produce them in that format (with embedded attachments) or use Aid4Mail to copy them into an Outlook PST file format. For further discussion of using dtSearch as a low-cost e-discovery tool, see, Craig Ball, Do-It-Yourself Digital Discovery, (Law Technology News, May 2006). 
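The tracking database Michelle describes could be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps in SQLite (via Python's standard `sqlite3` module) for Microsoft Access, the table and column names are hypothetical, and the stored file path stands in for the hyperlink a reviewer would click. Hash values are assumed to come from whatever hashing utility produced the inventory.

```python
import sqlite3


def create_tracking_db(path=":memory:"):
    """Create a (hypothetical) review-tracking table; return the connection."""
    connection = sqlite3.connect(path)
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS files (
            file_path   TEXT PRIMARY KEY,   -- location the reviewer opens
            file_type   TEXT NOT NULL,      -- e.g. 'docx', 'xlsx', 'pdf'
            file_name   TEXT NOT NULL,
            hash_value  TEXT NOT NULL,      -- supplied by the hashing tool
            relevant    INTEGER,            -- NULL until reviewed, then 0 or 1
            privileged  INTEGER,            -- NULL until reviewed, then 0 or 1
            exception   INTEGER DEFAULT 0   -- 1 if the file could not be opened
        )
        """
    )
    return connection


def add_file(connection, file_path, file_type, file_name, hash_value):
    """Record one file from the inventory."""
    connection.execute(
        "INSERT INTO files (file_path, file_type, file_name, hash_value) "
        "VALUES (?, ?, ?, ?)",
        (file_path, file_type, file_name, hash_value),
    )


def review_queue(connection):
    """One representative per distinct hash, ordered by type, then filename."""
    return connection.execute(
        """
        SELECT file_path, file_type, file_name, hash_value
        FROM files
        WHERE file_path IN (SELECT MIN(file_path) FROM files GROUP BY hash_value)
        ORDER BY file_type, file_name
        """
    ).fetchall()


def record_decision(connection, hash_value, relevant, privileged):
    """Apply a reviewer's call to every copy that shares the hash."""
    connection.execute(
        "UPDATE files SET relevant = ?, privileged = ? WHERE hash_value = ?",
        (int(relevant), int(privileged), hash_value),
    )
```

Because decisions are keyed to the hash rather than to a single path, duplicates stay in the table and inherit the designation instead of being discarded, which keeps every file accounted for.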
Tom O'Connor, Director of the Legal Electronic Document Institute in New Orleans, observed that he often gets requests like Edna's from his clients in Louisiana and Mississippi and weighed in with a mention of Adobe Acrobat, noting that it might be feasible to print everything to Acrobat and use Acrobat's annotation and redaction features. As mentioned, Acrobat also offers rudimentary OCR capabilities to help deal with the scanned paper documents in the collection and even has the ability to convert modest volumes of e-mail to PDFs directly from Outlook. For further discussion of using Adobe Acrobat to process Outlook e-mail, see, Craig Ball, Adobe Brings an Acrobat to Perform EDD (Law Technology News, June 2008). Tom concludes that, although working with the tools you already own and know can be cumbersome, it's sometimes a better approach than trying to master new tools under pressure. Ohio-based e-discovery consultant, Brett Burney, had some very concrete ideas for Edna. He thought she could try to find some SaaS solution to host the data, suggesting Lexbe, NextPoint or Trial Solutions as candidates. Brett was most familiar with Lexbe and knew of small law firms that had successfully and inexpensively used their services.


Brett guessed Edna's budget might allow her to upload everything to Lexbe, review it quickly and then take everything down before the hosting costs ate up her budget. He reported that Lexbe will accept about any file format, by uploading it yourself or sending it to Lexbe to load. Brett put the cost at $99 per month for 2 users and 1GB of storage. Noting that Edna needs to host more than 1GB of data, he predicted her outlay should be close to $200/month. Brett added, "Edna and her crew can upload everything with the tools they have, get it reviewed pronto (i.e. less than a month), and then take everything down--paying only for what they use." For the Outlook e-mail, Brett thought Edna should turn to Adobe Acrobat and convert the PST container files to PDF Portfolios along the lines of my June 2008 column. Alternatively, Brett suggested Edna use the free Trident Lite tool from Wave Software (www.discoverthewave.com) to get a "snapshot" of the PSTs and then convert relevant messages to PDF or upload them to a hosting provider. Lisa Habbeshaw of FTI in California pointed to Intella by Vound Software (http://www.vound-software.com) as an all-in-one answer to Edna's needs. Intella offers an efficient indexing engine, user-friendly interface and innovative visual analysis capability sure to make quick work of Edna's review effort. Lisa was unsure if the program could be had for under $1,000, but noted that Vound Software offers a free, fully-functional demo that might fill the bill for Edna's immediate needs. Like Lisa, I'm unsure whether Intella will bust Edna's budget, but it's certainly a splendid new entry to the do-it-yourself market. Other Great Tools If the dollar holds its own against the Euro, Edna could accomplish just about everything she needs to do using a terrific tool created in Germany called X-Ways Forensics from X-Ways Software Technology AG. 
X-Ways Forensics could make quick work of the listing, hashing, opening, viewing, indexing, searching, categorizing and reporting on all that client data; however, it's a complex, powerful forensics tool that would require more time and training to master than Edna can spare. Plus, it would eat up all of her $1,000 budget. If her budget was bigger, Edna would be very happy attacking the review with the easy-to-use, fast and versatile Nuix Desktop (www.nuix.com). Nuix would allow Edna to begin her review in minutes, and it supports a host of search options. The embedded viewer, hash and classification features foster an efficient workflow and division of review among multiple reviewers. Like Intella, Nuix is an Australian import. Whatever they're doing way down there in Kangaroo land, they're certainly doing something right! A Few More Ideas for Edna It's hard to add much to so many fine ideas. Collectively, dtSearch, Adobe Acrobat and Aid4Mail deliver the essential capabilities to unbundle, index, search, OCR and redact the conventional file formats and modest data volumes Edna faces. Her challenge will be cobbling together tools not


designed for e-discovery so as to achieve an acceptable workflow and defensible tracking methodology. It won't be easy. For example, while dtSearch is Best of Class in its price range, it doesn't afford Edna any reasonable way to tag or annotate documents as she reviews them. Accordingly, Edna will be obliged to move each document to a folder as she makes her assessments respecting privilege and responsiveness. That effort will get very old, very fast. On the plus side, dtSearch offers a fully functional thirty-day demo of its desktop version, so Edna can buy a copy for her long-term use, but rely on 30-day evaluation copies for her staff during the intense review effort--a $400 savings. While Adobe Acrobat supports conversion of e-mail into PDFs, the process is painfully slow and cumbersome. Moreover, the conversion capabilities break down above 10,000 messages. That sounds like a lot, but it's likely less than Edna will see emerge in the collections of six custodians. Further, Edna may encounter an opponent who is smart enough to demand the more versatile electronic formats for e-mail (i.e., PST, MSG or EML). What's Edna going to do if she finds herself locked into a review wedded to image formats? Whatever tools she employs, Edna will need to be meticulous in her shepherding of the individual messages and documents through the process. To that end, I'd offer this advice:

1. Your first step should be to make a working copy of the data to be processed and secure the source dataset against any usage or alteration. Processing of ESI poses risks of data loss or alteration. If errors occur, you must be able to return to uncorrupted data from prior steps. For each major processing threshold, set aside a copy of the data for safekeeping and carefully document the time the data was set aside and what work had been done to that point (e.g., the status of deduplication, filtering and redaction).

2. From the working copy, hash the files and generate an inventory of all files and their metadata. The processes you employ must account for the disposition of every file in the source collection or extracted from those files (i.e., message attachments and contents of compressed archives). Your accounting must extend from inception of processing to production. By hashing the constituents of the collection as it grows, you gain a means to uniquely identify files as well as a way to identify identical files across custodians and sources. A useful tool for hashing files is Karen's Hasher available at http://www.karenware.com. But the best "free" tool for the task is AccessData's FTK Imager, available from www.accessdata.com/downloads. FTK Imager not only hashes files, it also exports Excel-compatible comma delimited listings of filenames, file paths, file sizes and modified, accessed and created dates. Moreover, it supports loading the collected files into a container called a Custom Content Image that protects the data from metadata corruption.


3. Devise a logical division scheme for the components of the collection; e.g., by machine, custodian, business unit or otherwise. Be careful not to aggregate files in a manner that lets files from one source overwrite identically named files from other sources.

4. Expand files that hold messages and other files. Here, you should identify e-mail container files (like Outlook .PST files) and archives (e.g., .Zip files) that must be opened or decompressed to make their constituents amenable to search. For e-mail, this can be done using an inexpensive utility like Aid4Mail from Fookes Software or Trident Lite from Wave Software. Additionally, e-mail client applications, including Outlook, usually permit export of individual messages and attachments. Though dtSearch includes a command line utility to convert Outlook PST container files to individual message (.MSG) files for indexing, it doesn't work well or easily compared to Aid4Mail. Finally, most indexing tools are capable of directly accessing text within compressed formats. For example, dtSearch can extract text from Zip files and other archives.

5. A feature common to premium e-discovery tools but hard to match with off-the-shelf software is deduplication. You can use hash values to identify identical files, but the challenge is to keep track of all de-duplicated content and reliably apply tagging for privilege and responsiveness to all deduplicated iterations. Most off-the-shelf utilities simply eliminate duplicates and so aren't suited to e-discovery. This is where it's a good investment to secure help from an expert in Microsoft Excel or Access because those applications can be programmed to support deduplication tracking and tagging. When employing deduplication, keep in mind that files with matching hash values can have different filenames and dates. The hash identicality of two files speaks to the contents of the files, not the names assigned to the files by the operating system or to information, like modified, accessed and created dates, stored outside the files.

6. Above all, don't process and review ESI in a vacuum. Be certain that you understand the other side's expectations in terms of the scope of the effort, approach to search and--critically--the forms of production they seek. You may not agree on much, but you may be pleasantly surprised to learn that some of the perils of a low budget e-discovery effort (e.g., altered metadata, limited search capabilities, native production formats) don't concern the other side. Further, you may reach accord on limiting the scope of review in terms of time intervals, custodians and types of data under scrutiny. Why look at all the e-mail if the other side is content with your searching just communications between Don and Betty during the third week of January 2009?
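For readers comfortable with a little scripting, the hashing, inventory and deduplication-tracking steps above (items 2 and 5) can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for the tools named earlier. Every function and column name here is hypothetical, SHA-256 is an assumed choice of hash algorithm, and the tag labels are placeholders. It should be run against the working copy, never the source dataset, because reading files can update last-access times on some systems.

```python
import csv
import hashlib
import os
from collections import defaultdict
from datetime import datetime, timezone

FIELDS = ["path", "size_bytes", "modified_utc", "hash"]


def hash_file(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large containers need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root, algorithm="sha256"):
    """Return one row per file under root, sorted by relative path."""
    rows = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            info = os.stat(path)
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            rows.append({
                "path": os.path.relpath(path, root),
                "size_bytes": info.st_size,
                "modified_utc": modified.isoformat(),
                "hash": hash_file(path, algorithm),
            })
    return sorted(rows, key=lambda row: row["path"])


def write_inventory(rows, csv_path):
    """Save the inventory as a comma-delimited file that opens in Excel."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def group_by_hash(rows):
    """Map each hash to every path that carries it (duplicates are kept)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["hash"]].append(row["path"])
    return dict(groups)


def propagate_tags(groups, decisions, default="UNREVIEWED"):
    """Give every copy of a file the tag decided for its hash."""
    return {
        path: decisions.get(file_hash, default)
        for file_hash, paths in groups.items()
        for path in paths
    }
```

A reviewer would look at one file per hash, record the decision against the hash, and let `propagate_tags` carry it to every identical copy across custodians, so no file in the inventory goes without a disposition.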


Finally, Edna may seek an answer to two common questions from those taking the do-it-yourself route in e-discovery:

What if I change metadata? Certain system metadata values--e.g., last access times and creation dates--are prone to alteration when processed using tools not designed for e-discovery. Such changes are rarely a problem if you adhere to three rules: 1. Preserve an unaltered copy of whatever you're about to process; 2. Understand what metadata were altered; and, 3. Disclose the changes to the requesting party.
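One way to satisfy the second rule is to record file system values before a processing step and compare them afterward. The sketch below assumes Python and uses hypothetical function names. It captures only size, last-modified and last-accessed values; creation dates are handled differently across operating systems and are left out, and some systems are configured not to update last-access times at all.

```python
import os


def snapshot(root):
    """Map each relative path under root to its size and timestamps."""
    state = {}
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            info = os.stat(path)
            state[os.path.relpath(path, root)] = {
                "size_bytes": info.st_size,
                "modified": info.st_mtime,
                "accessed": info.st_atime,
            }
    return state


def changed_metadata(before, after):
    """List (path, field, old, new) for each value that differs.

    Only paths present in both snapshots are compared.
    """
    changes = []
    for path in sorted(before.keys() & after.keys()):
        for field, old in before[path].items():
            new = after[path][field]
            if new != old:
                changes.append((path, field, old, new))
    return changes
```

Taking a snapshot before and after each tool is run yields a list of exactly which values moved, which is the information you would disclose to the requesting party.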

By keeping a copy of the data at each step, you can recover true metadata values if particular values prove significant. Then, disclosing what metadata values were changed eliminates any suggestion that you pulled a fast one. Many requesting parties have little regard for system metadata values; but, they don't want to be surprised by relying on inaccurate information. Can I Use My Own E-Mail Account for Review? You wouldn't commingle client funds with your own money, so why commingle e-mail that's evidence in a case with your own mail? That said, when ESI is evidence and the budget leaves no alternative, you may be forced to use your own e-mail tools for small-scale review efforts. If so, remember that you can create alternate user accounts within Windows to avoid commingling client data with your own. Better still, undertake the review using a machine with a clean install of the operating system. Very tech-savvy counsel can employ virtual environments (e.g., VMWare products) to the same end. If using an e-mail client for review, it may be sufficient to categorize messages and attachments by simply dragging them to folders representing review categories; for example: 1. Attorney-client privilege: entire item; 2. Work product privilege: entire item; 3. A-C Privilege: needs redaction; 4. W-P privilege: needs redaction; 5. Other privilege; 6. Responsive; 7. Non-responsive.

Once categorized, the contents of the various folders can be exported for further processing or for production, if in a suitable format.

Throwing Down The Gauntlet The vast majority of cases filed, developed and tried in the United States are not multimillion dollar dust ups between big companies. The evidence in modest cases is digital, too. Solo and small firm counsel like Edna need affordable, user-friendly tools designed for desktop e-discovery--tools that


preserve metadata, offer efficient workflow and ably handle the common file formats that account for nearly all of the ESI seen in day-to-day litigation. Using the tools and techniques described by my thoughtful colleagues, Edna will get the job done on time and under budget. The pieces are there, though the integration falls short. So, how about it, e-discovery industry? Can you divert your gaze from the golden calf long enough to see the future and recall the past? Sam Walton became the richest man of his era by selling to more for less. There's a fast growing need...and a huge emerging market. The real Edna Challenge is waiting for the visionaries who will meet the need and serve the market. March 2013 Epilog: Since I penned this article in 2009, several software vendors have risen to the EDna Challenge and market capable tools at prices within EDna’s reach. I’m still not ready to declare anyone a “winner” of the Challenge, but the emergence of lower-priced e-discovery tools makes us all winners. Two offerings meriting special recognition are Nuix’ Prooffinder (www.prooffinder.com) and GGO’s Digital WarRoom Pro (www.digitalwarroom.com). Prooffinder would cost Edna $100.00 for an annual license, with all proceeds of sale going to support children’s literacy. Prooffinder is a scaled-down version of Nuix, arguably the most capable e-discovery processing tool on the market today. To keep its price low, Prooffinder will not process more than 15GB of data for a single case (ample for Edna’s needs); but, even at so piddling a price, Prooffinder delivers speedy, sophisticated search capabilities, excellent metadata extraction, effective de-duplication and a host of other functional and analytical features. From the standpoint of cost of capability, no other product can touch it, and probably the only additional cost Edna will need to incur is to purchase some redaction software (or a copy of Adobe Acrobat with redaction capabilities).
Digital WarRoom is a full-featured e-discovery suite of tools that was a promising EDna Challenge contender on all fronts except for its pricing. Though an annual license for DWR Pro is only $895.00, renewal would push the product out of Edna’s budget. Moreover, to gain full functionality of the product, users must purchase a separate $49.00 license for a file viewer application.



Ten Things That Trouble Judges About E-Discovery

Craig Ball © 2010

As counselor, consultant or court-appointed special master, my law practice revolves around electronically stored information (ESI)--seeking to salvage the wrecks others have made of e-discovery and helping parties to navigate unfamiliar shoals. The goal is to forestall or resolve conflicts with judges incensed by parties’ failure to fulfill e-discovery duties. Judges frequently doubt that electronic discovery is as difficult or expensive as the lawyers before them claim. For the most part, the judges are right. E-discovery is not that hard and need not be so costly. That is, it’s not that hard or expensive if counsel knows what he or she is doing, and that’s a huge “if.” Judges feel lawyers should know how to protect, marshal, search and produce the evidence in their cases or enlist co-counsel and experts with that know how. The judges are right about that, too. Lawyers must master modern evidence in the same way that doctors must stay abreast of the latest developments in medicine. The challenge to listing ten things that trouble judges about e-discovery is limiting it to only ten things. E-discovery exposes much that is not pretty about the state of the law practice, e.g., wasteful, obsolete practices; poor management skills; conflicting interests between lawyers and clients; and unequal access to justice between the rich and the rest. E-discovery didn't create these problems, but like a hard rain on an old roof, it exposes failings too long ignored. First and most intractable among these problems is:

1. Lawyer incompetence The landscape of litigation has forever changed, and there is no going back to a paper-centric world. Too many lawyers are like farriers after the advent of the automobile, grossly--even stubbornly--unprepared to deal with electronic evidence. As lawyers’ duties to supervise and direct clients’ preservation and collection of ESI have broadened, their grasp of information systems, forms of ESI and effective search hasn’t kept pace. This knowledge gap troubles judges who rely upon lawyers to police the discovery process and stand behind the integrity of that process. Lawyers cannot defend what they don’t understand. No lawyer wants to be thought incompetent; yet the skills developed to collect, assess and produce paper records do not translate well to a world steeped in ESI. Digital is different, and neither clients nor the justice system can long afford the costly, cumbersome efforts lawyers employ to regress data to paper or images.

Other things that trouble judges about e-discovery are:

2. Misstatements of fact coupled with a lack of reliable metrics Perhaps because no lawyer wants to be thought incompetent, some resort to “winging it” when it comes to reporting the state of client ESI and status of discovery. The case law proves the folly of blind reliance on clients when gauging the true state of retention and collection. Lawyers must not parrot client claims without undertaking even minimal steps to establish their accuracy. Often, the misstatements take the form of fanciful claims of burden or cost, advanced sans reliable metrics gained through measurement or testing. Judges expect more than histrionics and hand wringing. They demand competent, quantitative evidence of burden and cost supported by the testimony of knowledgeable people who’ve done their homework. It troubles judges to be asked to decide important issues on much less.

3. Cost and waste Judges are of one troubled mind about litigation today. They all feel it costs too much and worry that spiraling costs may crowd out legitimate cases or compel unjustified settlements. Recently, a distinguished panel of e-discovery experts surprised this writer by agreeing that about 70% of the money spent on e-discovery is wasted through poor planning and decision-making. Worse, they attributed about 70% of that waste to lawyer incompetence. If true, that suggests that about half of every dollar spent on e-discovery is wasted because lawyers don’t know what they’re doing with ESI. Half!
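The "about half" figure is simply the product of the panel's two estimates. A minimal sketch of that arithmetic in Python follows; the variable names are illustrative only, and the inputs are the panel's rough estimates as reported above, not measured data.

```python
from fractions import Fraction

# Panel estimate: share of e-discovery spending that is wasted.
wasted_share = Fraction(70, 100)

# Panel estimate: share of that waste attributed to lawyer incompetence.
waste_due_to_lawyers = Fraction(70, 100)

# Share of every e-discovery dollar wasted because of lawyer incompetence.
share_of_every_dollar = wasted_share * waste_due_to_lawyers

print(f"{float(share_of_every_dollar):.0%} of every dollar")
```

Exact fractions are used rather than floats so that 0.7 x 0.7 comes out as exactly 49/100.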

4. Delay in addressing ESI Issues Over time, data tends to morph, migrate and disappear. Employees join and leave, and machines are re-tasked or retired. Memories fade. Active data migrates to tape. Tape moves to warehouses. Old tape formats give way to newer formats, and old tape drives are discarded. With these changes, discoverable information grows more difficult and costly to access over time. It troubles judges when parties ignore ESI issues until little problems grow into big ones. Judges expect parties and counsel to think and act in timely ways, identifying and preserving potentially responsive evidence when they anticipate a claim or lawsuit instead of waiting until a preservation demand surfaces or a lawsuit is filed. Judges are also troubled when parties or counsel delay getting needed help from experts and vendors. When a lawyer waits until discovery is overdue to begin seeking such help, it's hard for a judge to impute good faith.

5. Lack of communication and cooperation

One reason judges don’t like discovery disputes is that they're often so unnecessary; that is, they concern issues the parties could have resolved if they’d simply listened and cooperated. It greatly troubles judges when parties and counsel exert little effort to resolve e-discovery disputes before filing motions and demanding hearings. It further troubles judges when lawyers mistakenly equate candor and cooperation with weakness, seeking to profit from pointless disputes and motion practice. Judges don’t abide trial by ambush or gamesmanship in e-discovery. The bench expects parties to be forthcoming about the volume and nature of discoverable ESI and to be reasonably transparent in, e.g., detailing preservation efforts or disclosing automated search methods. Because judges never forget that all lawyers owe duties to uphold the integrity of the justice system they serve, judges are troubled when advocates let the desire to win eclipse those duties.

6. Failing to get the geeks together Communication presupposes comprehension, but judges daily confront how working through intermediaries clouds the court's understanding of technical issues. Like lawyers, information technologists employ a language all their own. They speak geek. Because lawyers rarely know what IT personnel are talking about, lawyers are often fearful of allowing technical personnel from opposing sides to talk to each other. Instead, counsel for the requesting party conveys questions from their technical expert to opposing counsel, who passes them on to in house counsel, who has the paralegal on the case talk to the IT person. The IT person responds to the paralegal who speaks to in house counsel who tells outside counsel who passes on his or her best understanding to opposing counsel or the court. No wonder so much gets misunderstood. Judges expect clear, accurate communication about technical matters, and it troubles them when knowledgeable people aren't brought together to foster transparency and trust.

7. Failing to implement a prompt and effective legal hold Preservation is a backstop against error. Slipshod preservation pervades and poisons much of what follows, and the cost to resolve inadequate preservation is breathtakingly more than the cost of a reasonable and timely legal hold effort. One need only peruse the opus opinions in The Pension Committee of the University of Montreal Pension Plan, et al. v. Banc of America Securities, et al.,2 or Rimkus v. Cammarata3 to appreciate the signal importance judges place on a prompt and effective legal hold of potentially relevant ESI and documents. Lawyers appear to have only two settings when it comes to implementing legal holds: "off" and "crazy." Either they ignore the need for a hold until challenged about missing data, or they issue so vague, paralyzing and impractical a retention directive, that responses run the gamut from doing nothing to pulling the plug and sitting in the dark. It troubles judges when lawyers and clients fail to preserve information that bears on the issues. Judges rightly expect lawyers to promptly home in on potentially responsive information when a claim or suit looms. Judges expect lawyers to identify fragile forms of information and take reasonable steps to protect the evidence against loss or corruption due to negligence or guile.

2 2010 WL 184312 (S.D.N.Y. Jan. 15, 2010)
3 07-cv-00405 (S.D. Tex. Feb. 19, 2010)

8. Overbroad requests and boilerplate objections In the bygone era of paper discovery, asking for "any and all documents touching or concerning" a topic was accepted. Information was generally stored on paper, paper was predictably managed and a company's documents were typically organized topically in a few easily-ascertainable locations. But when information exploded into countless shards of messages and attachments strewn across a sea of accounts, servers, machines, media and devices, "any and all" became too many. It deeply troubles--even antagonizes--judges when requests for information are unfocused and over-inclusive and when reasonable requests are met with a litany of generic objections. Both demonstrate a lack of care and judgment. Judges want to see evidence that the discovery sought is proportional to the matters at issue. They expect objections to be asserted in good faith and narrowly drawn. Some judges are even exploring sanctions under Fed. R. Civ. P. 26(g) to address fishing expeditions and boilerplate objections. See, e.g., Mancia v. Mayflower Textile Servs. Co.4

4 253 F.R.D. 354 (D. Md. 2008)

9. Mishandling claims of privilege Ask a judge what percentage of documents claimed “privileged” actually prove to be privileged, and you'll probably hear, "ten percent, perhaps less." Yet more than one e-discovery expert has opined that finding, fighting about and redacting privileged documents accounts for a sizeable share of the money spent on e-discovery. Whatever the percentages, it's clear litigants spend far too much money and time ginning the seeds of privilege from electronic evidence, even while overlooking privileged content through a paucity of quality assurance and control. See, e.g., Mt. Hawley Ins. Co. v. Felman Prod., Inc.5 and Victor Stanley, Inc. v. Creative Pipe, Inc.6 Lawyers gravitate to error-prone tools, like seat-of-the-pants keyword search, to cull potentially privileged content, mischaracterizing much that's not privileged and much that is. Further, many lawyers forget (or ignore) their client's duty to generate a proper privilege log when material withheld from discovery as privileged happens to be ESI. Finally, lawyers inexplicably fail to avail themselves of Fed. R. Evid. 502, which provides significant protections against waiver of privilege, including the near-impregnable shield of a R. 502(d) court order.

Last, but not least, any list of things that trouble judges about e-discovery is sure to include:

10. Failing to follow the Rules Judges value the rules of procedure, and they expect those who come to their courts to do so. So it troubles judges when the rules set forth a clear requirement that's ignored, especially when the failure to follow a rule triggers a superfluous motion and hearing. A telling example is the Federal Rule of Civil Procedure requiring a producing party to object to a requested form of production and specify the form to be produced.7 It's a rule observed more in the breach than in compliance; yet adherence to the rule would make many costly battles demanding alternate forms of production unnecessary. The rule sets out what to do--with the goal that conflicts be resolved before production in objectionable forms--but litigants just don't do it.

Heads in the Sand Ironically, what most troubles judges about e-discovery also makes their lives easier: judges are astounded they don't see more efforts to discover ESI! The bench well understands that the dearth of e-discovery isn't indicia of cooperation, but of evasion. Though virtually all evidence today is digital, many lawyers still try to pretend otherwise and look where they've always looked for evidence. Increasingly, judges know this shouldn't be the case and that it can't last. They enjoy the calm, but are troubled that so few are prepared for the gathering storm.

5 2010 WL 1990555 (S.D. W. Va. May 18, 2010)
6 250 F.R.D. 251 (D. Md. 2008)
7 Fed. R. Civ. P. Rule 34(b)(2)(D): Responding to a Request for Production of Electronically Stored Information. The response may state an objection to a requested form for producing electronically stored information. If the responding party objects to a requested form--or if no form was specified in the request--the party must state the form or forms it intends to use.

Preserving Google Content for Dummies © 2014

Craig Ball

A key responsibility of in-house and litigation counsel is to insure that potentially responsive information is preserved facing litigation. Counsel must advise and supervise a client’s efforts to preserve both information deemed favorable and information helpful to the other side. It’s a duty owed to the Court under common law. Attorneys have seen harsh criticism from courts and borne the brunt of monetary sanctions for failing to act promptly and prudently to preserve electronically stored information (ESI). The duty to preserve ESI attaches to every case, including those where parties lack the wherewithal to hire technical experts. Moreover, absent agreement or court order, parties are not free to degrade the forms of the ESI preserved and produced, such as by printing ESI out and destroying its electronic searchability. Meeting these obligations is challenging; more so when the data resides with third parties like cloud and webmail services. Millions of clients depend on Google tools to manage e-mail, contacts, documents, calendars, photos and more. That’s a lot of potentially relevant evidence, and it’s often sensible or necessary to preserve cloud content by collecting it. Heretofore, Google made it easy to find content, but hard to get that content out in forms that preserved utility and integrity. Some coped by printing individual messages and attachments to the Adobe PDF format. But, printing to PDF is tedious and doesn’t always produce usable or complete forms. Others relied on a mail transmission protocol called IMAP to download the contents of a Gmail account to Microsoft Outlook PST container files. But, downloading Gmail using IMAP and Outlook is tricky and slow. Happily, the geniuses at Google have introduced a truly simple, no-cost way to collect Google cloud content like Gmail, Google Drive, Calendar and others for preservation and portability.
It sets a top-flight example for other cloud service providers and presages how we may use the speed, power and flexibility of Google search as a culling mechanism before exporting for e-discovery. Even if you’re a lawyer who couldn’t care less about IMAP, this is a development worth cheering because until now, you had two choices when it came to putting Gmail on legal hold: Either you’d instruct your client not to delete anything (and cross your fingers they’d comply) or you had to hire someone to download the data. Now, Google does the Gmail collection gratis and puts it in a standard MBOX container format that can be downloaded and sequestered. Google even incorporates custom metadata values that reflect labeling and threading. You won’t see these unique metadata tags if you pull the messages into an e-mail client; but, e-discovery software will pick them up. I tested this using Nuix and the $100 marvel, Prooffinder. Both parsed the Gmail metadata handily, enabling the messages to be threaded and paired with their Gmail labels. MBOX might not have been everyone’s choice for a Gmail container file; but, it’s an inspired choice. MBOX stores the messages in their original Internet message format called RFC 2822 (now RFC 5322), a superior form for e-discovery preservation and production.

So, meet Google Data Tools (https://www.google.com/settings/datatools). Armed with login credentials and client permission, the only hard part of preserving a client’s Google content is navigating to the right page. After logging into the user account, you get to Google Data Tools from the Google Account Setting page by selecting “Data Tools” and looking for the “Download your Data” option on the lower right. When you click on “Create New Archive,” you’ll see a menu where you select the Google content to archive and even choose whether to download all mail or just items bearing the labels you select.

The ability to label content within Gmail and archive only labelled messages means that Gmail’s powerful search capabilities can be used to identify and label potentially responsive messages, obviating the need to archive everything. It’s not a workflow suited to every case; yet, it’s a promising capability for keeping costs down in the majority of cases involving just a handful of custodians with Gmail. A lot of discoverable data is moving to Google–to Gmail, Drive, Calendar, YouTube–you name it. Kudos to Google for turning a task that’s been hard into something so simple anyone can do it well. That it costs nothing at all--thank you, Google!
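For readers who want to confirm that a downloaded archive carries the Gmail label and thread metadata described above, here is a minimal sketch using the mailbox module in Python's standard library. It rests on assumptions: the header names X-Gmail-Labels and X-GM-THRID are not given in this article and should be checked against an actual export, the function name read_gmail_metadata is hypothetical, and splitting labels on commas is a simplification that will mishandle a label containing a comma.

```python
import mailbox


def read_gmail_metadata(mbox_path):
    """Return subject, labels and thread ID for each message in an MBOX file."""
    rows = []
    # create=False: raise an error rather than create an empty file on a bad path.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            # Assumed header names; a missing header comes back as None.
            raw_labels = msg.get("X-Gmail-Labels") or ""
            labels = [part.strip() for part in str(raw_labels).split(",") if part.strip()]
            rows.append({
                "subject": str(msg.get("Subject", "")),
                "labels": labels,
                "thread_id": msg.get("X-GM-THRID"),
            })
    finally:
        box.close()
    return rows
```

Calling read_gmail_metadata on the path of a downloaded archive returns one dictionary per message, which is enough to spot-check that labelled messages came through. Work on a copy of the archive, not the sequestered original, so the preserved file is never opened by a tool that could alter it.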

Easing the Pain of E-Discovery with ESI Special ©2010

Craig Ball

I get quizzical looks when people ask what I do and I answer, “I’m an ESI Special Master. I help courts and litigants resolve electronic evidence issues.” They know what litigants, judges and lawyers do; and they’re often familiar with mediators, arbitrators and expert witnesses; but, few have a clue about the many roles played by ESI Special Masters in litigation. In simplest terms, a Special Master is someone, most often a lawyer, appointed to act for a court in specific ways. An ESI Special Master is tasked to assist with matters relating to electronic discovery, computer forensics and digital evidence. The role of the ESI Special Master may be adjudicative, investigative or ministerial. One day, I’m presiding over hearings on e-discovery disputes and issuing directives geared to effective and proportionate e-discovery; another, I’m the Court’s neutral forensic examiner poring over vast data volumes to uncover the facts while protecting each side’s privileged and proprietary information. It’s the broad range of responsibilities delegated to Special Masters that makes the work so rewarding and piques the interest of lawyers with strong technical skills seeking a new and challenging career in e-discovery and computer forensics. An ESI Special Master does the sorts of things the judge would do, if the judge had the time and technical expertise, and what neutral IT experts would do, if those experts were experienced trial lawyers. Technical expertise equips a master to know what to do and how to do it, but legal training equips the master to know what's important and when enough is enough.

Pros and Cons of Special Masters

When I'm approached to consult on e-discovery, I often ask, "Are you sure you want a partisan consultant? Wouldn't a neutral special master be more effective and less costly?" The first response is usually, "I never thought about it." The next is, "I don't know if the other side will go for it.” Both sides may benefit from a neutral.
An ESI Special Master can achieve significant savings serving as a neutral investigator. In matters where the evidence on digital media is commingled with privileged, proprietary or confidential information, the use of a qualified neutral examiner obviates the need for separate-but-redundant examinations by opposing experts. Instead, the partisan experts work with the neutral to frame a suitable examination protocol and then flesh out particular areas of concern after the neutral examiner completes the work. The result is that both parties enjoy substantially reduced costs and trusted outcomes. A master enjoys greater access to the producing parties' systems and data, helping to insure that responsive, non-privileged material will see the light of day. Producing parties benefit because a neutral has no incentive to pursue overbroad or unduly expensive discovery, and by doing what the neutral directs, they're insulated from criticism for doing too much or too little. While most producing parties recognize that they will have to devote resources to e-discovery, what they despise most is expending those resources only to find they're vulnerable to sanctions or obliged to start over again because something was mishandled. A skilled special master is better able to "right size" e-discovery, striking the optimum balance between avoiding unnecessary expense and the right to receive information. A careful neutral has no incentive to spend more or find less. Further, a neutral's right to see information withheld on claims of privilege or confidentiality without triggering a waiver is a powerful hedge against abuse. An effective neutral finds consensus; but when consensus fails, the special master must possess the technical skill to fashion a sensible protocol and the legal ability to memorialize and enforce it. It’s crucial that the Master serve as a catalyst to speedier and less-costly resolutions, not another venue for endless argument or a means of delay. The overarching goal of a Master should be to do away with any enduring need for a Master in the case. The principal objection to use of a master is cost. Going before a judge on e-discovery disputes feels "free" to lawyers because the judge doesn't charge by the hour and is paid from public coffers. In fact, bringing discovery disputes to the judge is very costly and time consuming. Issues must be briefed in formal submissions, witnesses must attend court and the delay pending a ruling introduces still more costs, such as idling a large review team. But the biggest expense flows from the potential that the court, hampered by a lack of technical insight, will decide the issues in ways that seem equitable at first blush but prove unjust, ineffective or unduly expensive in practice.

Breaking Bad Habits and Fostering Cooperation

Resolving e-discovery disputes demands a mix of technical initiatives, information exchange and behavioral modification. Often, problems stem from a breakdown in communication, so parties must be steered to more effective communication strategies concerning ESI. It's like marriage counseling, but without happier times to hearken back to. As in ugly divorces, conflict can become an end in itself. Reasonable requests are refused just to be obstreperous. Unreasonable demands for marginally relevant information are served simply because responding engenders hardship or expense.
Each side is determined to give no quarter and perceives cooperation as complicity and weakness. A successful master helps the parties separate advocacy from discovery and works to end peripheral battles over ESI, refocusing the parties on the merits. The first thing I seek to instill in the parties is a clear understanding of what must stop. Data destruction, dissembling, sniping at opponents and gross speculation are verboten. Where feasible, each side must designate a technical liaison equipped to answer questions about systems, applications and capabilities. Introducing players without a history of animus and shifting the focus to technical issues helps establish a culture of cooperation. Fostering cooperation may seem misguided in an adversarial system, especially to those who see cooperation as affording aid and comfort to the enemy. But, the savvy lawyer understands that the biggest beneficiary of cooperation is his or her own client. E-discovery efforts characterized by cooperation cost the parties less and serve as a bulwark against waste and sanctions.

Working with the ESI Special Master

Over the course of dozens of appointments as ESI Special Master, I’ve done almost any task an ESI Master might be called upon to do as facilitator, adjudicator or investigator. Along the way, I’ve identified ways litigants can aid the process and further their standing with the Master:

Focus on the Facts

Because few attorneys are well-versed in information technology, it’s not surprising that assumptions made with respect to the cost, burden and risks of e-discovery are frequently off-the-mark. Requesting parties tend to think it too easy, where responding parties make it sound improbably hard. An important role for the ESI Master is getting parties to examine the bases for their assumptions and secure reliable metrics. The right questions posed to the right persons often reveal that matters thought arduous are trivial and vice-versa. When working with an ESI Special Master, bring forward the persons with knowledge and be prepared to respond with solid metrics respecting file types, data volumes and other essential facts.

Designate a Technical Liaison

It’s understandable that lawyers often seek to interpose themselves between technicians and the court; but, much is lost in translation. An ESI Special Master “speaks geek” and may prefer to deal directly with technically-astute liaisons. I customarily direct each party to designate one or more technical liaisons who are obliged to be fluent in the particulars of the implicated systems and ESI. Few steps are more effective at resolving e-discovery disputes than facilitating productive communications between counterparts who grasp the technical challenges and range of solutions. As well, countless hours can be saved by eliminating much of the “let-us-get-back-to-you-on-that” typical of ESI disputes.

Come Armed with a Plan

Robert Moses, the controversial master builder who reshaped 20th-century New York, won many battles by the simple expedient of showing up at meetings with fully-realized drawings for civic improvements. Where others came with dreams, Robert Moses came with blueprints. Lawyers often approach e-discovery disputes with nothing more than a naked demand or an intransigent refusal.

Don’t force the Master to construct a solution from scratch and run the risk that it will be less favorable to your client’s interests; instead, come armed with a sound plan, and don’t be surprised if, in making the plan, you discover there’s less in dispute than you thought.

Be Candid

If you have problems in your case, such as spoliation issues or processing defects, promptly communicate them to the ESI Special Master. A skilled Master may be able to resolve defects before they become grounds for sanctions, and courts are hesitant to sanction when advised that the parties are working with the Special Master to fix problems.

Mechanics of Appointment

In federal practice, the appointment of a special master is governed by Fed. R. Civ. P. 53, which provides that a court may appoint a master with the parties' consent, where the appointment is warranted by "some exceptional condition" or to address pretrial matters that cannot be effectively and timely addressed by an available judge. Each state has its own regime for appointment of a Special Master. For example, Rule 2-541 of the Maryland Rules of Civil Procedure is an amalgam of the Federal rule and Maryland practice. Maryland Rule 2-541 states that, “[on] motion of any party or on its own initiative, the court, by order, may refer to a master any…matter or issue not triable of right before a jury.” The Maryland rule affords special masters broad powers. The appointment order must “prescribe the compensation, fees, and costs of the special master and assess them among the parties,” and may specify or limit the powers of a special master and contain special directions.

Tips for Appointment Order

The federal rule governing appointment of a Master sets out the requirements to serve and the requisites of the appointment order, which should clearly define the role and powers of the master with a particular eye toward establishing when the master's work is concluded. Masters cost money, so it's important to insure the meter stops running once the job's done. Appointment orders should specify the duties, powers and limits placed upon the Master, as well as whether and how the Master may engage in ex parte contact with the parties. Orders should set out the Master’s obligations to make a record and periodically report to the Court. Finally, the order should address the master’s compensation, including the parties’ payment responsibilities and whether the Master’s charges may be taxed as costs. The appointment order is also a means by which the Court can address common concerns such as whether the Master may be deposed or subject to trial subpoena and what is the standard for review for particular actions taken by the Master. An example of a federal appointment order follows as Appendix A. Though the example affords broad discretion, parties would be wise to consider the master's experience before seeking such leeway in all cases.

A Bridge to Competence

E-discovery and digital evidence pose technical challenges that few litigants are equipped to handle and fewer lawyers have been trained to address. Courts, too, often lack the resources and experience to delve deeply into the digital realm to achieve optimum outcomes. The consequences have been costly and, until attorney competence in information technology becomes commonplace, the need for ESI Special Masters will grow. ESI Special Masters can ease the pain of e-discovery by insuring that it proceeds fairly, efficiently, effectively and in proportion to each side’s needs and rights.
An ESI Master promotes transparency of process, consensus and cooperation where possible, and provides prompt, practical direction and resolution, when not. As neutral investigator, an ESI Special Master affords all parties protections difficult to secure by other means, all allowing the parties to focus on the merits, and the lawyers to be more confident and competent in the e-discovery process.

APPENDIX A: Exemplar ESI Special Master Appointment Order

IN THE UNITED STATES DISTRICT COURT

FOR THE ____ DISTRICT OF _____ ______ DIVISION

[STYLE]

ORDER APPOINTING SPECIAL MASTER FOR ESI

1. Craig Ball of Austin, Texas, is hereby appointed as Special Master for Electronically Stored Information pursuant to Rule 53 of the Federal Rules of Civil Procedure. Mr. Ball has filed the certification required by Rule 53(b)(3).

2. The Special Master shall proceed with all reasonable diligence to assist and, when necessary, direct the parties in completing required identification, preservation, recovery and discovery of electronically stored information with reasonable dispatch and efficiency.

3. The Special Master shall review with the parties ongoing discovery requests to determine where potentially responsive information is stored and how it can most effectively be identified, accessed, preserved, sampled, searched, reviewed, redacted and produced. To the extent the parties have disputes as to these matters, the Special Master may initiate or participate in the parties’ efforts to resolve same. He is authorized to resolve issues as to the scope and necessity of electronic discovery, as well as search methods, terms and protocols, means, methods and forms of preservation, restoration, production and redaction, formatting and other technical matters.

4. The Special Master is granted the full rights, powers and duties afforded by F.R.C.P. Rule 53(c) and may adopt such procedures as are not inconsistent with that Rule or with this or other Orders of the Court. The Special Master may by order impose upon a party any sanction other than contempt and may recommend a contempt sanction against a party and contempt or any other sanction against a non-party.

5. The Special Master shall be empowered to communicate on an ex parte basis with a party for purposes of seeking to maintain the confidentiality of privileged, trade secret or proprietary information or for routine scheduling and other matters which do not concern the merits of the parties’ claims. The Special Master may communicate with the Court ex parte on all matters as to which the Special Master has been empowered to act. The Special Master shall enjoy the same protections from being compelled to give testimony and from liability for damages as those enjoyed by other federal judicial adjuncts performing similar functions.

6. The Special Master shall regularly file a written report, in such format he deems most helpful, identifying his activities and the status of matters within his purview. The report should identify outstanding issues, with particular reference to matters requiring Court action. The Special Master shall maintain a record of materials and communications that form the basis for such reporting by a suitable means determined at the Special Master's discretion.

7. Each side is ordered to designate a lead attorney and a lead technical individual as contacts for the Special Master. These designees shall have sufficient authority and knowledge to make commitments and carry them out to allow the Special Master to accomplish his duties. The parties are directed to give the Special Master their full cooperation and to promptly provide the Special Master access to any and all facilities, files, documents, media, systems, databases and personnel (including technical staff and vendors) which the Special Master deems necessary to complete his duties.

8. Disclosure of privileged or protected information connected with the litigation to the Special Master shall not be a waiver of privilege or a right of protection in this cause and is also not a waiver in any other Federal or State proceeding; accordingly, a claim of privilege or protection may not be raised as a basis to resist such disclosure.

9. The Court will decide de novo all objections to findings of fact or conclusions of law made by the Special Master. Any order, report, or recommendation of the Special Master, unless it involves a finding of fact or conclusion of law, will be deemed a ruling on a procedural matter. The Court will set aside a ruling on a procedural matter only where it is clearly erroneous or contrary to law.

10. The Special Master’s compensation, as well as reasonable and necessary expenses, will be paid by the [Plaintiff] [Defendant] [parties in equal shares]. Mr. Ball shall be compensated at his usual and customary rate of $500 per hour, including time spent in transit or otherwise in connection with this appointment, provided however that travel time will be paid at one-half (50%) of the usual and customary rate unless substantive work, research or discussions in support of the engagement are performed while traveling, in which case such activities will be paid at the usual and customary rate. The Special Master shall submit to both parties invoices for services performed according to his normal billing cycle and [Plaintiff] [Defendant] [the Plaintiff and Defendant in equal shares] shall pay such invoices within thirty (30) days of receipt.

11. In making this appointment, the Court has determined that the matters within the purview of the Special Master necessitate highly specialized technical knowledge and cannot be effectively and timely addressed by an available district judge or magistrate judge of the district.

SO ORDERED AND ADJUDGED this the _______ day of _______________ 20____.

_________________________________
UNITED STATES DISTRICT JUDGE

Gold Standard

by Craig Ball

[Originally published in Law Technology News, April 2012]

Lawyers are in denial to the point of delusion with respect to the reliability of keyword search and human review. Judge John Facciola put it best when he quipped that lawyers think they’re experts at keyword search because they once found a Chinese restaurant on Google.

We trust keyword search because we understand it. We trust manual review of documents because we grossly overestimate reviewers’ abilities to make sound, consistent decisions about relevance. “To err is human,” the Bar seems to say, “but forgive us if we’d rather not divine just how error-prone reviewers really are.”

Better approaches to search are arriving as so-called “predictive coding” or “technology assisted review” (TAR) products. Still, it will be years before the rank and file embraces TAR, if only because those hawking TAR tools remain resolutely uninterested in positioning the technology for use by anyone but big corporations and white shoe law firms. Worse, the fervor among vendors to sell something, anything they can label predictive coding ensures that tools little different from ordinary keyword search will be given a dab of lipstick and pushed out to market as TAR tools. It’s messy down in the TAR pit.

Even those adopting predictive coding tools will need to compile “seed sets” of relevant documents to train their tools. So, clunky-but-comfy keyword search and manual review are likely to remain the means to cull seed sets from samples. Despite serious shortcomings, keyword search and manual review will be with us for a while.

Keyword search is the art of finding documents containing words and phrases that signal relevance, followed by page-by-page (linear) review of those documents. It’s often called the “gold standard” of electronic discovery.

That’s ironic, because extracting and refining gold relies less on finding precious aurum than it does on dispersing all that isn’t golden. Prospectors use water and chemicals to flush away all but the gold left behind. So, a true “gold standard” for keyword search would incorporate both precise inclusion (smart queries) and defensible exclusion (smart filters).

To illustrate, in one e-discovery dispute over search, the plaintiff submitted keywords to be run against the defendant’s e-mail archive for a three-month interval. Unfortunately, the archive held all e-mail for all custodians, and the defendant adamantly refused to segregate by key custodian or deduplicate before running searches. The interval was narrow, but the collection was vast and redundant.

The defendant tested the agreed-upon keywords but shared only aggregate hit rates for each. Thinking the numbers too high, but unwilling to look at the hits in context, the defendant rejected the search terms. The plaintiff agreed the hit counts were daunting but asked to see examples of hits on irrelevant documents before furnishing exclusionary (AND NOT) modifications to flush away more of what wasn’t golden.

The defendant refused, insisting it wasn’t necessary to see the noise hits in context to generate more precise queries. The parties were at an impasse, with one side grousing “too many hits” and demanding different search terms and the other side uncertain how to exclude irrelevant documents without knowing what caused the noisy results.

A lawyer who dismisses a search because it yields “too many hits” is as astute as the Emperor Joseph dismissing Mozart’s Il Seraglio as an opera with “too many notes.” Mozart replied, “There are just as many notes as there should be.” Indeed, if data is properly processed to be susceptible to text search and the search tool performs appropriately, a keyword search generates just as many hits as there should be. Of course, few lawyers craft queries with the precision Mozart brought to music; so when the terms used seem well chosen for relevance, it’s crucial to scrutinize the results to learn what tailings are cropping up with the gilt-edged, relevant documents.

Keyword search is just a crude screen: “Show me items that contain these words, and don’t show me items that contain those.” High hit counts don’t always signal a bad screen. If search terms merely divide the collection into one pile holding relevant documents and one without, you’re closer to striking gold. Then, you look at what you can reliably exclude with the next screen, and the next; drawing ever closer to that elusive quarry, documentum relevantus.
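
For readers who think in code, here is a minimal sketch of that two-part screen in Python. Everything in it is an assumption made for illustration: the toy collection, the terms and the `screen` helper are hypothetical and do not reflect any particular tool's syntax.

```python
import re

def screen(docs, include, exclude=()):
    """Return ids of docs containing any include term AND NOT any exclude term."""
    def has_any(text, terms):
        # Whole-word, case-insensitive match for each term or phrase.
        return any(re.search(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE)
                   for t in terms)
    return [doc_id for doc_id, text in docs.items()
            if has_any(text, include) and not has_any(text, exclude)]

# Hypothetical three-document collection.
docs = {
    "A": "The pump seal failed during the pressure test.",
    "B": "Fantasy football: my pump fake play failed again.",
    "C": "Lunch on Friday?",
}

first_pass = screen(docs, include=["pump", "seal"])
second_pass = screen(docs, include=["pump", "seal"], exclude=["fantasy football"])
print(first_pass)   # ['A', 'B']
print(second_pass)  # ['A']
```

Note that the exclusion term in the second pass only suggests itself once someone has read document B. The hit count alone would never have revealed it.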

But you must see hits in context to refine queries by exclusion. That seems so manifestly obvious, it’s astounding how often it’s not done.

When lawyers delegate keyword search, they often get back only aggregate hit counts and mistakenly conclude that’s enough information to judge searches noisy or not. If, instead, counsel get their hands dirty with the data, as by personally exploring representative samples using desktop or hosted tools, the parties could work quickly, effectively and cooperatively to zero in on relevant material. Good queries are best refined by knowledgeable people testing them against pertinent, small collections. Lousy outcomes spring from lawyers thinking up magic words and running them against everything.

It’s not just a theory. Recently, as part of an early case assessment effort, I sought to rapidly isolate relevant documents from a half million e-mail items culled from four key custodians. That’s a volume where you’d expect to see bids from service providers and mustering of review teams. It’s a project most firms would see as much more than a weekend’s work for one lawyer.

We tried something different. To start, the client exported the four key custodians’ e-mail messages for the time period of interest from its e-mail archives. Those 50 gigabytes of messaging went into a desktop processing and review tool.

Extracting and indexing the data overnight, I flagged exception items (e.g., images without extractable text and encrypted files) for further processing, then exported spreadsheets reflecting the most used e-mail addresses. I asked the custodians to flag addresses with no connection to the dispute. Meanwhile, I compiled the customary list of search terms and phrases expected to occur in relevant documents and tested these. Documents with false hits were examined for characteristics permitting mechanical exclusion. Testing, re-testing and re-examination soon produced reliable inclusion and exclusion term lists. Weeks of evaluation took just days because the iterations and results were instantaneous.
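
The address tally is simple to reproduce. The sketch below assumes messages have already been parsed into dictionaries with hypothetical `from` and `to` fields; the review tool I used produced its own spreadsheet, so treat this only as an illustration of the idea.

```python
from collections import Counter

# Hypothetical parsed messages; real exports carry many more fields.
messages = [
    {"from": "ann@example.com", "to": ["bob@example.com"]},
    {"from": "deals@newsletter.example", "to": ["ann@example.com"]},
    {"from": "deals@newsletter.example", "to": ["ann@example.com"]},
    {"from": "deals@newsletter.example", "to": ["ann@example.com"]},
]

counts = Counter()
for m in messages:
    # Count every appearance of an address, as sender or recipient.
    counts.update([m["from"], *m["to"]])

# Most-used addresses first: the list custodians review for unrelated senders.
for address, n in counts.most_common():
    print(f"{n:>3}  {address}")
```

A custodian can scan such a list in minutes and mark senders like the newsletter as having no connection to the dispute.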

The discards were tested, too. For example, material excluded by addresses but containing inclusion terms was carefully checked to ensure the hits weren’t relevant. Defensible exclusion proved as powerful as inclusion, and potentially relevant material that couldn’t be excluded as tailings stayed in the collection as ore. A true “gold standard.”
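
That quality check on the discards can be sketched in a few lines of Python. The message fields, the sender list and the `discards_to_recheck` helper are hypothetical stand-ins, assumed here for illustration only.

```python
def discards_to_recheck(messages, excluded_senders, include_terms):
    """Ids of messages dropped by the address filter that still hit an inclusion term."""
    flagged = []
    for m in messages:
        if m["from"] not in excluded_senders:
            continue  # not a discard; it stays in the review set
        body = m["body"].lower()
        if any(term.lower() in body for term in include_terms):
            flagged.append(m["id"])
    return flagged

# Hypothetical messages: two from an excluded sender, one from a custodian.
messages = [
    {"id": 1, "from": "deals@newsletter.example", "body": "Half off patio furniture"},
    {"id": 2, "from": "deals@newsletter.example", "body": "Recall notice: turbine warranty update"},
    {"id": 3, "from": "ann@example.com", "body": "Turbine warranty claim attached"},
]

recheck = discards_to_recheck(messages, {"deals@newsletter.example"}, ["warranty"])
print(recheck)  # [2]
```

Message 2 would have been thrown out by address alone; the check pulls it back for a human look before the exclusion is treated as safe.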

Did it produce a perfectly parsed set of material? Certainly not. Keyword search and human review still fall short of expectations. But it was fast, relatively cheap and afforded cautious confidence that the set produced was more relevant and less riddled with junk than what would have emerged from the usual game of blind man’s buff. It was fast and cheap because the person creating and testing the inclusive and exclusive filters was elbows deep in the data and hands on with the search tool. Feedback was immediate. Quality checks could be done at once.

Ideally, e-discovery tools don’t put distance between the lawyer and the evidence but, instead, extend our reach and help us get our arms around big data. A lawyer who is hands-on with the evidence and who tests and refines his or her choices is a lawyer who can explain and defend those choices. That’s the real golden future of e-discovery. Welcome back, counselor.

Ten Bonehead Mistakes in E-Discovery

by Craig Ball

[Originally published in Law Technology News, June 2012]

Spoiled by Google and legal research, lawyers are woefully unprepared for the difficulty of search in e-discovery.

Search fails us in two non-exclusive ways: our query will not retrieve the information we seek, and our query will retrieve information we didn’t seek. Obviously, we want what we’re looking for (high recall) and only what we are looking for (high precision).

Recall and Precision aren’t friends. Every time Recall has a tea party, Precision crashes with his biker buddies and breaks the dishes.

It’s easy to achieve a high recall of responsive ESI. You simply grab it all: 100% of the data = 100% recall. The challenge is achieving precision. If one out of every hundred items returned is what you seek, 99 items are duds—1% precision stinks.
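
The arithmetic is worth seeing once. This sketch assumes a hypothetical 1,000-item collection in which exactly ten items are responsive; the numbers are invented to match the one-in-a-hundred example above.

```python
def precision(retrieved, relevant):
    """Share of retrieved items that are relevant."""
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved, relevant):
    """Share of relevant items that were retrieved."""
    return len(retrieved & relevant) / len(relevant)

collection = set(range(1000))  # hypothetical 1,000-item collection
relevant = set(range(10))      # assume 10 items are actually responsive

grab_it_all = collection       # "simply grab it all"
print(recall(grab_it_all, relevant))     # 1.0
print(precision(grab_it_all, relevant))  # 0.01

narrow = set(range(5))         # a tight query that finds only half the good stuff
print(recall(narrow, relevant))          # 0.5
print(precision(narrow, relevant))       # 1.0
```

Grabbing everything buys perfect recall at 1% precision; the tight query buys perfect precision and misses half of what matters.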

Keyword search followed by human review is called “linear search,” and for now, it’s standard operating procedure in e-discovery, in part because linear search is mistakenly considered the safest course lest a party fail to produce something responsive or turn over something that should have been withheld.

Linear search is time-consuming, so it’s expensive. Worse, it doesn’t work well. People make search and assessment errors, and making lots of searches and assessments, they make lots of errors!

Mistakes can be subtle and hyper technical, but most are not. If we eliminate bonehead errors, we improve the quality of e-discovery and markedly trim its cost. Search will ever be a battle between Recall and Precision, but avoiding bonehead mistakes limits casualties.

Recently, I ran a blog post sharing five bonehead mistakes I’d observed and asking readers to contribute five more.

Mistake 1: Searching for someone’s name or e-mail address in their own e-mail

If you run a list of search terms including a custodian’s name or e-mail address against their own e-mail, you should expect to get hits on all messages. I know some of you are saying, “Craig, no one’s that boneheaded!” Actually, plaintiffs do it, defendants do it, and vendors run these searches without flagging the error. Ask yourself: how often are the proposed search term lists exchanged between counsel carefully broken out by particular custodians or forms of ESI to be searched?

Bill Onwusah, Litigation Support Manager at Hogan Lovells in London, commented that he’d seen this mistake take the form of “searching for a term that shows up in the footer of every single document produced by the organisation,” such as the firm’s name.
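
One rough way to catch this before review starts is to measure how much of a custodian's mailbox each term hits. The sketch below is an assumption-laden illustration: the mailbox, the addresses, the firm name and the `hit_rate` helper are all hypothetical.

```python
def hit_rate(messages, term):
    """Fraction of messages whose text contains the term (case-insensitive)."""
    term = term.lower()
    hits = sum(1 for m in messages if term in m.lower())
    return hits / len(messages)

# Hypothetical mailbox for custodian Pat Jones: every message carries
# Pat's address in a header and the firm name in the footer.
mailbox = [
    "From: pat.jones@example.com\nSubject: Q3 forecast\n--\nAcme Widgets LLP",
    "From: pat.jones@example.com\nSubject: lunch\n--\nAcme Widgets LLP",
    "To: pat.jones@example.com\nSubject: turbine warranty claim\n--\nAcme Widgets LLP",
]

for term in ["pat.jones@example.com", "Acme Widgets", "warranty"]:
    rate = hit_rate(mailbox, term)
    flag = "  <- hits everything; useless as a filter here" if rate == 1.0 else ""
    print(f"{term}: {rate:.0%}{flag}")
```

A term that hits 100% of one custodian's mail may still be a fine term against someone else's, which is why term lists deserve to be broken out by custodian.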

Mistake 2: Assuming the Tool can run the Search

Every ESI search tool has features and limitations. You must understand what data has been indexed and what search methods and syntax are supported.

Most e-discovery tools index words, which means you won’t retrieve any information that isn’t text (including some PDF, TIF and other pictures of words that haven’t been OCR’d to searchable text) or that isn’t accessible text (like encrypted documents). Plus, most search tools don’t index parts of speech called “noise” or “stop” words deemed so common they’ll gum up the works. I call this the “To Be or Not to Be” problem, because all of the words in Hamlet’s famous phrase tend not to be indexed in e-discovery.
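
To see the “To Be or Not to Be” problem in miniature, consider this sketch of an indexer that drops stop words. The stop list is a hypothetical example; real tools ship their own lists, and those lists differ from tool to tool.

```python
import re

# Hypothetical stop list, assumed for illustration only.
STOP_WORDS = {"a", "an", "and", "be", "not", "of", "or", "the", "to"}

def indexed_terms(text):
    """Lowercase the text, split it into words and drop stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

print(indexed_terms("To be, or not to be"))        # []
print(indexed_terms("The deposition of the CFO"))  # ['deposition', 'cfo']
```

Nothing from Hamlet's line reaches the index, so no query for that phrase can ever hit it, no matter how the query is written.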

Syntax mistakes occur when you assume the tool can run the search the way you constructed it. Not every search tool supports every common search method, e.g., wildcard characters, Boolean constructs, stemming, proximity searches or regular expressions, and even when two tools support the same search method, tool A may require you to use different search syntax than tool B.

Mistake 3: Not Testing Searches

Much of what distinguishes a mistake as boneheaded is the ease with which it could have been avoided. When a party to a lawsuit once proposed the letter “S” as a search term, I didn’t need to test it to know it was a bonehead choice. But what about all those noisy terms that pop up in file paths or are invariably encountered within ESI yet have nothing to do with the case? Even search terms that appear bulletproof can surprise you. Test your searches to be sure they perform as expected.

Mistake 4: Not Looking at the Data!

Don’t just natter on about the quantity of hits to evaluate your search; check the quality of the hits. Look at the data! Minutes spent looking at the data can eliminate weeks or months of reviewing crappy results and a zillion dollars spent in motion practice.

Mistake 5: Ignoring the Exceptions List

It’s the rare e-discovery effort where everything processes without exception. Typically, the exception list will reflect hundreds or thousands of items that are encrypted, corrupt, unrecognized or unreadable. You may take a calculated risk to ignore certain exceptional items; but too often, exceptions are misclassified as benign or dismissed altogether. That’s boneheaded.

Ed Fiducia, Regional Vice President for EDD vendor Inventus, offered a sixth and seventh for the bonehead mistakes list:

Mistake 6: Assuming That Deduplication Solves My Problem

Ed pointed to the limits of using hashing to identify truly duplicative files. “The rub is the definition of a truly duplicative file.”

For example, e-mail messages sent to multiple addressees won’t deduplicate across custodians because each message reflects its unique message ID and delivery path. Word and PDF versions of the same document won’t hash deduplicate because they’re different file formats.

Hashing leaves “thousands upon thousands of near duplicates that must be identified and reviewed. This leads to not only a dramatic increase in review costs, but a dramatic increase in the probability that documents will be coded inconsistently. Spend more money, get worse results. Not a good combination.”
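
The limit Ed describes is easy to demonstrate. In this sketch, two hypothetical copies of one message differ only in a made-up delivery header, yet their hashes do not match. The `body_hash` helper is a deliberately crude assumption that strips the first line; it stands in for the general idea of hashing selected content rather than the whole file, not for how any particular product does it.

```python
import hashlib

def file_hash(data: bytes) -> str:
    """SHA-256 digest of the raw bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()

def body_hash(message: bytes) -> str:
    """Hash everything after the first header line (toy simplification)."""
    return file_hash(message.split(b"\n", 1)[1])

body = b"Subject: Board minutes\n\nPlease see the attached minutes.\n"
# Same message as delivered to two custodians; only the hypothetical
# delivery header differs.
anns_copy = b"Received: by mx1.example.com; Mon, 4 Jun 2012 09:00:01\n" + body
bobs_copy = b"Received: by mx2.example.com; Mon, 4 Jun 2012 09:00:03\n" + body

print(file_hash(anns_copy) == file_hash(bobs_copy))  # False
print(body_hash(anns_copy) == body_hash(bobs_copy))  # True
```

A single differing byte anywhere in the file changes the hash, so what counts as a duplicate depends entirely on what you choose to hash.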

Mistake 7: Reviewing Fifty Custodians When Five Will Do

Ed Fiducia: “Preserve everything? You bet! Review everything? Not in my book.”

“The knee jerk reaction is to blame plaintiffs’ attorneys who ask for everything. Equal responsibility goes to defense attorneys who don’t negotiate the process from the start in meet and confer. As a service provider, you’d think I’d push to process and review everything; but over the past 18 years, I’ve seen case after case prove that if the scope of e-discovery is limited from the start--with caveats to allow for additional discovery when warranted--everybody wins.”

Dave Swider, Senior Discovery Consultant for Evolve Discovery, contributed:

Mistake 8: Failing to Search for Common Name Variations

“Here’s one we see pretty often: Searching for names without anticipating variations. We’ll see a search for ‘Robert Smith’ with no variations specified; no Rob, Bob, Bobby, Robby, not even an email address.”

“Similarly, we’ll be asked to search for a complete law firm name: all five names as an exact string, with no domain or proximity search.”

Too, “we see use of wildcards and terms that are far too expansive…. I worked on a case that involved laying one material on top of another in a process called ‘deposition.’ Guess what term appeared on the potential privileged terms list? A common offender in groundwater cases is ‘well.’”
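
Dave's “Robert Smith” example translates readily into a pattern that anticipates variations. The sketch below uses Python's regular expressions; the nickname list and the `rsmith@example.com` address are hypothetical, and a given e-discovery tool may not support this syntax at all (see Mistake 2).

```python
import re

# Hypothetical variants for a custodian named Robert Smith.
FIRST_NAMES = ["Robert", "Rob", "Bob", "Bobby", "Robby"]
pattern = re.compile(
    r"\b(?:" + "|".join(FIRST_NAMES) + r")\s+Smith\b|\brsmith@example\.com\b",
    re.IGNORECASE,
)

samples = [
    "Robert Smith approved the invoice.",
    "Ask Bob Smith about the shipment.",
    "cc: rsmith@example.com",
    "Smithfield ham order",
]

exact_hits = [s for s in samples if "Robert Smith" in s]
variant_hits = [s for s in samples if pattern.search(s)]
print(len(exact_hits))    # 1
print(len(variant_hits))  # 3
```

The exact string finds one of the three messages that concern Mr. Smith. The word boundaries also keep “Smithfield” out, which is the flip side of the overly expansive wildcard problem.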

Marc Hirschfeld, President of Precision Legal Services, added:

Mistake 9: Neglecting to Run Searches Against File and Folder Names

“Here is one that I never see attorneys talk about…. I often find a treasure trove of information when the name of a folder holding relevant information includes a search word but the documents inside do not. It’s as if the user pre-identified these documents as relevant but, because the file and folder names weren’t indexed or searched, the treasure is missed.”
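
Marc's point can be sketched by running the term against the path rather than the contents. The paths and the `path_hits` helper below are hypothetical, assumed only to illustrate the idea.

```python
from pathlib import PurePosixPath

def path_hits(paths, term):
    """Return paths where the term appears in any folder or file name."""
    term = term.lower()
    return [p for p in paths
            if any(term in part.lower() for part in PurePosixPath(p).parts)]

# Hypothetical file listing; neither file name contains the search term.
paths = [
    "Users/jdoe/Documents/Turbine Warranty/notes_0412.docx",
    "Users/jdoe/Documents/Personal/recipes.docx",
]

print(path_hits(paths, "warranty"))
```

The first file is caught only because of the folder it lives in; a search confined to document text and file names would pass it by.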

Ann Marie Gibbs, National Director of Consulting at Daegis, offered:

Mistake 10: Failing to Rapidly React to the Problems You Encounter

“Another review oversight we see is a failure to ‘update’ the review set when a ‘false hit’ is running up the review bill. This relates to the mistake where a client declines to accept excellent advice on search selection criteria. If you can’t get them to understand the problem on the front end, you have a second bite at the apple on the back-end.”

Dave Swider sums it up: “The number one boneheaded move by legal staff is simply not bothering to understand how data works and how they can best apply tools that will make their outcomes better. Our best clients are those that treat data not like documents, but like data.”

About the Author

EDUCATION
Rice University (B.A., 1979, triple major); University of Texas (J.D., with honors, 1982); Oregon State University (Computer Forensics certification, 2003); EnCase Intermediate Reporting and Analysis Course (Guidance Software 2004); WinHex Forensics Certification Course (X-Ways Software Technology 2005); Certified Data Recovery Specialist (Forensic Strategy Services 2009); Nuix Certified E-Discovery Specialist (2014); numerous other classes on computer forensics and electronic discovery.

SELECTED PROFESSIONAL ACTIVITIES
Law Offices of Craig D. Ball, P.C.; Licensed in Texas since 1982.
Board Certified in Personal Injury Trial Law by the Texas Board of Legal Specialization 1988-2015
Certified Computer Forensic Examiner, Oregon State University and NTI
Certified Computer Examiner (CCE), International Society of Forensic Computer Examiners
Certified Data Recovery Specialist
Certified E-Discovery Specialist (Nuix)
Faculty, University of Texas School of Law, Adjunct Professor teaching Electronic Discovery & Digital Evidence
Faculty, Georgetown University Law Center, E-Discovery Training Academy
Admitted to practice U.S. Court of Appeals, Fifth Circuit; U.S.D.C., Southern, Northern and Western Districts of Texas.
Board Member, Georgetown University Law Center Advanced E-Discovery Institute and E-Discovery Academy
Board Member, International Society of Forensic Computer Examiners (agency certifying computer forensic examiners)
Member, Sedona Conference WG1 on Electronic Document Retention and Production
Member, Educational Advisory Board for LegalTech (largest annual legal technology event)
Member, Maryland Committee on Federal E-Discovery Guidelines, 2014
Special Master, Electronic Discovery, numerous federal and state tribunals
Instructor in Computer Forensics and Electronic Discovery, United States Department of Justice
Lecturer/Author on Electronic Discovery for Federal Judicial Center and Texas Office of the Attorney General
Instructor, HTCIA Annual 2010, 2011 Cybercrime Summit, 2006, 2007; SANS Instructor 2009, PFIC 2010, CEIC 2011, 2012
Special Prosecutor, Texas Commission for Lawyer Discipline, 1995-96
Council Member, Computer and Technology Section of the State Bar of Texas, 2003-date
Chairman: Technology Advisory Committee, State Bar of Texas, 2000-02
President, Houston Trial Lawyers Association (2000-01); President, Houston Trial Lawyers Foundation (2001-02)
Director, Texas Trial Lawyers Association (1995-2003); Chairman, Technology Task Force (1995-97)
Member, High Technology Crime Investigation Association and International Information Systems Forensics Assn.
Member, Texas State Bar College
Member, Continuing Legal Education Comm., 2000-04, Civil Pattern Jury Charge Comm., 1983-94, State Bar of Texas
Life Fellow, Texas and Houston Bar Foundations
Adjunct Professor, South Texas College of Law, 1983-88
Selected Publications available at www.craigball.com

CRAIG BALL
ESI Special Master and Attorney
Computer Forensic Examiner
Author and Educator

3723 Lost Creek Blvd.
Austin, Texas 78735
E-mail: [email protected]
Web: craigball.com
Blog: ballinyourcourt.com

Lab: 512-514-0182
Mobile: 713-320-6066

Craig Ball is a trial lawyer, certified computer forensic examiner, law professor and electronic evidence expert. He's dedicated his career to teaching the bench and bar about forensic technology and trial tactics. After decades trying lawsuits, Craig limits his practice to service as a court-appointed special master and consultant in computer forensics and e-discovery. A prolific contributor to educational programs worldwide--having delivered over 1,650 presentations and papers--Craig’s articles on forensic technology and electronic discovery frequently appear in the national media. For nine years, he wrote the award-winning column on computer forensics and e-discovery for American Lawyer Media called "Ball in your Court." Craig Ball has served as the Special Master or testifying expert on computer forensics and electronic discovery in some of the most challenging, front-page cases in the U.S.

Matters in Which Craig Ball has Served as a Court Appointed Special Master or Neutral or Testified as an Expert or in Connection with Computer Forensics/Electronic Evidence

1. Meyer v. Brown; Harris County, TX, Judge Baker; (Court’s Neutral)
2. In Re: Enron and Arthur Andersen Secs. Litigation; USDC SDTX (Lead Plaintiff’s Counsel’s ESI Expert)
3. In Re: Tyco Securities Litigation; USDC NH (Lead Plaintiff’s Counsel’s ESI Expert)
4. American Express v. Americap; USDC SDTX (Court’s Special Master)
5. TXU v. Whittaker et al.; 151st Harris County, TX (Court’s Special Master)
6. Miller et al. v. Highland Medical Center; 295th JDC, Harris County, TX (Plaintiff’s Counsel’s Expert)
7. Barnes v. Kissner; 190th JDC; Harris County, TX (Court’s Neutral)
8. BP Texas City Explosion Litigation, Galveston, TX (Joint Prosecution Group’s Expert)
9. Chart Industries v. Runyan and Applied Hydrocarbon Systems; USDC SDTX (Plaintiff’s Expert)
10. Key Energy v. Crisp; USDC Midland, TX (Plaintiff’s Counsel’s Expert)
11. Broussard v. Dunlap; 190th Harris County, TX (Court’s Neutral)
12. State Bar of Texas v. [Attorneys Under Investigation]; TX Office of the Disciplinary Counsel
13. In Re: Flowserve Securities Litigation; USDC NDTX (Lead Plaintiff’s Counsel’s Expert)
14. Grooms v. Montelaro; 295th, Harris County, TX (Court’s Special Master)
15. Luk v. Eisner; 11th, Harris County, TX (Defense Counsel’s Expert)
16. MJCM, LLC v. Floyd and Associates; Harris County, TX (Court’s Neutral)
17. PowerTrain v. American Honda; USDC NDMS (Hybrid Appointment)
18. Shue v. USAA et al.; Kendall County, TX (Court’s Special Master)
19. In Re: Sirna Therapeutics Litigation; USDC NDCA (Defense Counsel’s Expert)
20. Yeh v. McDougal; 333rd Harris County, TX (Court’s Neutral)
21. Plus Technologia, SA de CV v. ACI Worldwide; Pinellas Cty., FL (Plaintiff’s Counsel’s Expert)
22. Anadarko Petroleum v. Geosouthern Energy; USDC SDTX (Hybrid/Court’s Neutral)
23. ASC v. SCI; Ft. Bend County, TX (Court’s Neutral by Stipulation)
24. Katrina Canal Breaches Consolidated Litigation; USDC EDLA (Court’s Neutral)
25. Sellar v. Boecker; Harris County, TX (Court’s Neutral)
26. In Re: Seroquel Products Liability Litigation; USDC MDFL (Court’s Special Master-ESI)
27. Daimler Trucks N.A. LLC v. Younessi; USDC OR (Court’s Special Master)
28. MDI v. NaphCare; USDC SDMS (Court’s Neutral)
29. Baker Hughes v. Pathfinder; USDC SDTX (Defense Counsel’s Expert)
30. Bd. of Comms. of the Port of N.O. v. Lexington Ins. Co. et al.; USDC EDLA (Special Master-ESI)
31. Stewart & Stevenson v. McGuirt; Harris County, TX (Neutral Expert by Stipulation)
32. Fisher et al. v. Halliburton et al.; USDC SDTX (Plaintiff Counsel’s Expert)
33. Aquamar S.A. v. E.I. Du Pont de Nemours & Co.; Broward County, FL (Plaintiff’s Counsel’s Expert)
34. AmWINS Brokerage of Texas, Inc. v. Hildebrand; Collin County, TX (Neutral by Agreement)
35. Arthur v. Stern; Harris County, TX (Court’s Special Master in computer forensics)
36. Duke Energy Int’l, LLC et al. v. Napoli; Harris County, TX (Court’s Special Master)
37. Austin Capital Mgmt. v. Balthrop; USDC WDTX (Court’s Special Master in computer forensics)
38. Grace et al. v. DRS Sensors & Targeting Systems, Inc.; USDC MDFL (Defense Counsel’s Expert)
39. Peironnet et al. v. Matador Resources Co. et al.; Caddo Parish, LA (Court’s Neutral)
40. Camp Mystic, Inc. et al. v. Eastland et al.; Kerr County, TX (Defense Counsel’s Expert)
41. Maggette, Jacobs et al. v. BL Development et al.; USDC NDMS (Court’s Special Master)
42. Ridha et al. v. Texas A&M University et al.; USDC SDTX (Defense Counsels’ Expert)
43. In re: CityCenter Construction Litigation; Clark County, NV (Court’s Special Master for ESI)
44. In re: Bernard L. Madoff Investment Services Litigation; Bankruptcy Court SDNY (Trustee’s Expert for ESI)
45. Lexington v. Estate of John O’Quinn, Deceased; Probate Ct 2, Harris County, TX (Court’s Neutral Examiner)
46. Allison et al. v. Exxon Mobil Corp.; Circuit Court Baltimore County, MD (Court’s Special Master-ESI)
47. PIC Group v. LandCoast, Inc.; USDC SDMS (Court’s Special Master)
48. SSC, et al. v. Halberdier, et al.; Harris County, TX (Neutral by Agreement)
49. Houlahan v. WWASPS; USDC DDC (Court’s Neutral)
50. M-I L.L.C. v. Stelly et al.; USDC SDTX (Court’s Neutral)
51. Coyote Springs Inv. v. Pardee Homes; Clark County, NV (Court’s Special Master)
52. Segner v. Sinclair Oil & Gas; USDC NDTX (Court’s Special Master)
53. Adams Golf v. Reed and Callaway Golf; 296th, Collin County, TX (Court’s Special Master)
54. Elliott v. Tetlow and MCO-I; USDC SDTX (Court’s Special Master)
55. 12001 Beamer, Ltd. v. Valtasaros; 295th, Harris County, TX (Court’s Special Master)
56. William A. Sawyer v. Frank Gabrysch et al.; 269th, Harris County, TX (Court’s Special Master)
57. Bridges et al. v. GES et al.; 164th, Harris County, TX (Court’s Special Master)
58. Ramirez v. State Farm Lloyds; 206th, Hidalgo County, TX (Plaintiffs’ Counsel’s Expert)
59. In re: Forest Research Institute Cases; USDC DNJ (Plaintiffs’ Counsel’s Expert)
60. Radcliffe v. Tidal Petroleum; 218th, LaSalle County, TX (Court’s Special Master)
61. Estate of Henry G. McMahon, Jr.; Travis County, TX (Court’s Special Master)
62. Samame d/b/a Alamo Packing v. Arco Iris et al.; USDC WDTX (Court’s Special Master)
63. Huerta/Kodish v. BASF; Circuit Court, Cook County, IL (Court’s E-Discovery Mediator)
64. EPAC Technologies v. Thomas Nelson, Inc.; USDC MDTN (Court’s Special Master)