National Law Journal: The rising tide of nonlinear review
Disruptive technology, savvy clients and cost pressures are changing the e-discovery game.
Catherine A. Casey and Alejandra P. Perez, The National Law Journal
January, 2012
Leveraging technological advancements to minimize the cost and maximize the accuracy of human analysis required for large data reviews is the next step in the world of electronic discovery. Rapidly expanding data volumes and skyrocketing costs are driving this evolution. It is increasingly in counsel’s interest to make use of technology to find the most cost-efficient and accurate ways to review large volumes of data.
Service providers, counsel and cost-conscious clients are looking to nonlinear methods for relief from this burgeoning volume of data and the costs associated with it. When used effectively, emerging technologies can create efficiencies in prereview categorization of data, expand the scope of content analysis and accelerate the speed and accuracy of review. Rather than a one-size-fits-all approach to e-discovery, there is now a spectrum of alternative technologies and processes that can be tailored to the case at hand.
The exponential growth of electronically stored information (ESI) during the past decade has forced organizations to reevaluate how e-discovery is handled. Rather than focusing their efforts on the merits of a case, companies and their counsel have been forced to spend exorbitant amounts of money and time to preserve, collect, identify and produce the proverbial needle in a haystack from what were first megabytes, then terabytes and now petabytes of data.
ESI has been doubling or tripling every 18 to 24 months, and 85 percent of it resides in business domains. International Data Corp. estimated that 1.8 zettabytes (1.8 trillion gigabytes), contained within 500 quadrillion files, would be created and replicated in 2011. This means, as of today, there are more bits of ESI than there are stars in the known universe. See John Gantz & David Reinsel, “Extracting Value from Chaos” (2011), http://idcdocserv.com/1142. This tidal wave of data and the cost associated with it are driving innovations in nonlinear review.
On average, companies allocate 70 percent of their legal budgets to litigation, and in 2010 that meant more than $1 million for half of the companies surveyed. See Fulbright & Jaworski, “Third Annual Litigation Trends Survey Findings” (2006) and “Seventh Annual Litigation Trends Survey Findings” (2010). Discovery accounts for 50 percent of these litigation costs and up to 90 percent in the most expensive 5 percent of cases. See Nicola Faith Sharpe, “Corporate Cooperation Through Cost Sharing,” 16 Mich. Telecomm. Tech. L. Rev. 109 (2009). For every gigabyte that moves on to review, companies are spending $18,750. Given that the average employee’s e-mail contains 25 gigabytes of data, there is a potential risk of $468,750 per custodian in a discovery request. See Enterprise Strategy Group & Gartner, “Mapping Technology and Vendors to the Electronic Discovery Reference Model” (November 2007), http://www.gartner.com/id=543108.
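The per-custodian figure above is simple multiplication, and the same arithmetic shows why culling data before review matters. The sketch below uses the two figures quoted in the article; the function name and the culling rate are illustrative assumptions, not part of any cited study.

```python
# Figures quoted in the article.
COST_PER_GB_REVIEWED = 18_750  # dollars per gigabyte that reaches review
GB_PER_CUSTODIAN = 25          # average e-mail volume per employee, in gigabytes


def review_exposure(custodians: int, cull_rate: float = 0.0) -> float:
    """Estimate review cost in dollars.

    cull_rate is a hypothetical fraction of data removed before review
    (0.0 means everything collected is reviewed).
    """
    if not 0.0 <= cull_rate <= 1.0:
        raise ValueError("cull_rate must be between 0 and 1")
    gb_to_review = custodians * GB_PER_CUSTODIAN * (1.0 - cull_rate)
    return gb_to_review * COST_PER_GB_REVIEWED


# One custodian with no culling reproduces the article's $468,750 figure.
print(f"${review_exposure(1):,.0f}")
# Ten custodians with half the data culled before review (assumed rate).
print(f"${review_exposure(10, cull_rate=0.5):,.0f}")
```

Under these assumptions, every percentage point of data culled before review cuts the estimated exposure by the same proportion.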
As corporations contend with mountains of data and costs in e-discovery, they are taking greater ownership of handling ESI and welcoming the development of more efficient methods. Large corporations are bringing technology and e-discovery specialists in-house and exerting increased pressure on outside counsel to contain cost in all aspects of the discovery process. Increasingly, corporations are also requesting alternative fee structures with outside counsel and service providers to rein in the cost of e-discovery, and requesting flat fees per gigabyte, per document or for entire matters inclusive of billable hours. See Fulbright & Jaworski surveys, supra.
In order to meet the expectations of their corporate clients, counsel and service providers are using new technology and processes for handling ESI. E-discovery practitioners are opening up to the idea of technology-assisted review to manage the cost and time pressures.
Nonlinear review uses technology to analyze data, allowing counsel to prioritize what is reviewed, in what order and by whom. Early case assessment (ECA), sophisticated culling searches and algorithms that “learn” about the data the way Pandora.com “learns” what type of music you like are examples of recent technologies, ranging from the most conservative to the most progressive.
Typical linear reviews group data on a per-custodian or collection-sequential basis for attorney review and then involve trudging through the data set document by document. By contrast, nonlinear methods can be used to decrease the volume of data necessary for review and to group content-similar documents, like those most likely or least likely to be relevant, allowing for a fast-paced yet accurate review.
Using tools with ECA capabilities, practitioners can view documents as they are collected and make strategic decisions earlier than before, expediting fact-driven case development. Prior to ECA, attorneys had to wait until after lengthy, expensive processing and document review to analyze ESI. This workflow was costly and potentially exposed organizations to risk. Prioritizing and analyzing data before the review phase reduces the time and money necessary to uncover and manage “smoking gun” documents. It also allows for implementation of a more organized data-review workflow, which speeds data-review pace and increases accuracy, allowing counsel to prepare for trial sooner and with more specificity.
Practitioners can now search for and cluster documents by concept or near-duplicate content and eliminate or prioritize data-review workflows based upon that information. E-mail suppression and threading technology capture and reconstruct e-mail conversations within the data set. This can consolidate the review of duplicate or near-duplicate documents and allow for greater exclusion of nonrelevant data prior to paying for attorney review. Iterative searching using these tools in conjunction with traditional Boolean search allows practitioners to significantly reduce the volume of data pushed to review without sacrificing accuracy.
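As a rough illustration of near-duplicate grouping, the sketch below compares documents by the overlap of their three-word sequences (Jaccard similarity) and groups those above a threshold. This is one common textbook technique, not a description of any particular vendor's tool; the function names, the 0.5 threshold and the sample documents are all assumptions.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles two documents have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_near_duplicates(docs: dict, threshold: float = 0.5) -> list:
    """Greedy grouping: a document joins the first group whose
    representative (the group's first document) it resembles."""
    groups = []  # each entry: (representative shingles, [document ids])
    for doc_id, text in docs.items():
        sig = shingles(text)
        for rep, members in groups:
            if jaccard(sig, rep) >= threshold:
                members.append(doc_id)
                break
        else:
            groups.append((sig, [doc_id]))
    return [members for _, members in groups]


docs = {
    "a": "please review the attached contract before friday",
    "b": "please review the attached contract before monday",
    "c": "lunch menu for the office party",
}
print(group_near_duplicates(docs))
```

A reviewer could then code one document per group and apply the decision to the rest, which is the consolidation the paragraph above describes.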
On the more complex end of the spectrum, nonlinear review encompasses predictive analytics, machine learning, predictive coding and automated coding.
With these technologies, as more of the data are analyzed, the model continues to be refined to the point of maximum accuracy, and can then run with minimal supervision through the body of data. This type of computer-assisted review is the new “it” topic in e-discovery. When properly managed, it reins in the cost, time and labor that have rendered e-discovery problematic.
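To make the idea of a model that is refined as more data is analyzed concrete, the sketch below learns word weights from documents an attorney has already coded and uses them to rank unreviewed documents by likely relevance. It is a deliberately simplified stand-in for commercial predictive coding; the scoring method (smoothed log-odds per word), the function names and the sample data are all assumptions.

```python
import math
from collections import Counter


def train(labeled):
    """labeled: list of (text, is_relevant) pairs coded by a reviewer.
    Returns a weight per word: positive suggests relevant, negative not."""
    rel, non = Counter(), Counter()
    for text, is_relevant in labeled:
        (rel if is_relevant else non).update(set(text.lower().split()))
    n_rel = sum(1 for _, is_relevant in labeled if is_relevant)
    n_non = len(labeled) - n_rel
    vocab = set(rel) | set(non)
    # Add-one smoothing keeps words seen in only one class from
    # producing infinite weights.
    return {
        w: math.log((rel[w] + 1) / (n_rel + 2))
        - math.log((non[w] + 1) / (n_non + 2))
        for w in vocab
    }


def score(weights, text):
    """Sum the weights of the distinct words in a document."""
    return sum(weights.get(w, 0.0) for w in set(text.lower().split()))


def rank_unreviewed(labeled, unreviewed):
    """Return unreviewed document ids, most likely relevant first."""
    weights = train(labeled)
    return sorted(
        unreviewed,
        key=lambda doc_id: score(weights, unreviewed[doc_id]),
        reverse=True,
    )


labeled = [
    ("merger pricing agreement draft", True),
    ("pricing agreement with competitor", True),
    ("office lunch menu", False),
    ("holiday party schedule", False),
]
unreviewed = {
    "d1": "holiday lunch schedule",
    "d2": "draft pricing agreement",
}
print(rank_unreviewed(labeled, unreviewed))
```

In an iterative workflow, each batch of newly coded documents would be appended to `labeled` and the ranking rerun, so the weights are refined as review proceeds.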
These varied, nonlinear methods conform to one central formula: Instead of following data collection with limited analysis and then trudging through the output page by page, these methods begin with analyzing sample data and determining which additional nonlinear model is most appropriate for the data and goals of the particular case.
In the past, practitioners had to choose whether they should spend exorbitant amounts of money for quick and accurate reviews, or sacrifice speed or accuracy to contain cost. Nonlinear review tools have enabled complex e-discovery reviews to be done at a reduced cost without loss of accuracy or speed — often increasing both.
Recent writings by U.S. Magistrate Judge Andrew J. Peck of the Southern District of New York bode well for wider implementation of technology-assisted methods. According to Peck, advanced nonlinear methods — when used in a defensible workflow and in a reasonable manner when data volume necessitates them — will receive blessings from the bench. Instead of having only one option, “a factory of contract lawyers” culling document by document and filtering a selection of documents to senior lawyers, the senior lawyers can now leverage technology in a way that best fits the unique case at hand. Andrew J. Peck, “Search, Forward: Will manual document review and keyword searches be replaced by computer-assisted coding?” L. Tech. News (Oct. 1, 2011).