Learn about the differences between simple and continuous learning

When considering predictive technologies in eDiscovery review, there are two core technical methodologies that you should know: simple learning and continuous learning. Though the end goals of these analytic workflows should ideally be the same, the process and the end result can be quite different. Simple learning is the predictive review methodology that most are familiar with, but a continuous learning process can be far more efficient and flexible for new discovery initiatives.

What is Simple Learning?

Most users of predictive review solutions are familiar with the concept behind simple learning. In simple learning, human reviewers go through samples of documents and code them manually. The predictive coding software then analyzes these samples to “learn” how to code documents of the same kind. It recognizes patterns within the documents and associates them with the classifications that the human reviewers have assigned. The software then uses all of this manually coded knowledge to code the rest of the document set.

If the software-coded set is not accurate enough, manual reviewers will then code more documents. These documents may be randomly selected, or the reviewers may specifically go through documents that appear to be giving the software trouble. Once the manual reviewers have coded more documents, the software is run again. The software then recodes all of the uncoded documents, again based on this core knowledge of manually coded information.
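The train, code, and rerun cycle described above can be sketched in a few lines of Python. This is a minimal illustration only: the word-weight "classifier", the function names (`train`, `predict`, `simple_learning_round`), and the sample documents are all hypothetical stand-ins, not the algorithm of any particular review platform.

```python
from collections import Counter

def train(coded):
    """Build per-word weights from manually coded (text, is_responsive) pairs."""
    weights = Counter()
    for text, is_responsive in coded:
        for word in set(text.lower().split()):
            weights[word] += 1 if is_responsive else -1
    return weights

def predict(weights, text):
    """Code a document as responsive when its words lean responsive overall."""
    return sum(weights[w] for w in set(text.lower().split())) > 0

def simple_learning_round(coded, uncoded):
    """One round: train on the manual samples, then code every uncoded document."""
    weights = train(coded)
    return {text: predict(weights, text) for text in uncoded}

# Hypothetical manually coded samples and uncoded collection.
coded = [
    ("merger pricing agreement draft", True),
    ("lunch menu for friday", False),
]
uncoded = [
    "pricing agreement signed",
    "friday parking notice",
    "counsel call on friday",
]

first_pass = simple_learning_round(coded, uncoded)

# Reviewers find the result lacking, code one more sample by hand,
# and the software is run again over the whole uncoded set.
coded.append(("friday pricing call with counsel", True))
second_pass = simple_learning_round(coded, uncoded)

for text in uncoded:
    print(text, first_pass[text], second_pass[text])
```

In this toy run, "counsel call on friday" is coded non-responsive on the first pass and responsive on the second, because the extra manual sample changed what the software learned before it recoded the set.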

In addition to coding the documents, the predictive coding software will usually report accuracy measures (typically precision and recall) that indicate how far its coding can be trusted. If these measures are not high enough, the reviewers may need to manually code more documents, or may need to review their samples for inconsistencies. But because simple learning trains on a fixed set of samples before the software codes the collection, the training and coding run must be repeated each time the samples change. Depending on how the training documents are chosen, this process is also known as simple passive learning (samples chosen at random or by the reviewers) or simple active learning (samples chosen by the software).
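For readers who want the arithmetic behind those two measures: precision is the share of documents the software coded responsive that really are responsive, and recall is the share of truly responsive documents that the software found. The sketch below computes both for a small, made-up control sample; the function name `precision_recall` and the data are illustrative assumptions.

```python
def precision_recall(predicted, actual):
    """Compare software coding against reviewer coding on the same documents."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)        # coded responsive, is responsive
    false_pos = sum(1 for p, a in pairs if p and not a)   # coded responsive, is not
    false_neg = sum(1 for p, a in pairs if not p and a)   # missed responsive documents
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical control sample of six documents.
predicted = [True, True, True, True, False, False]   # software coding
actual = [True, True, False, False, True, False]     # reviewer coding

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four documents coded responsive are correct (precision 0.50), and two of the three truly responsive documents were found (recall about 0.67).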

What is Continuous Learning?

Continuous learning differs from simple learning in that the manual human review does not occur solely at the beginning of the process. Instead, the continuous learning system is constantly fed more information and the training set continues to grow. New responsive documents are identified as other documents are being reviewed, and the documents that have been automatically coded are adjusted based on ongoing feedback from the reviewers involved.
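As a rough sketch of that feedback loop, the Python below retrains after every reviewed batch and uses the updated model to decide what reviewers see next. Everything here is an assumption made for illustration: the word-weight scoring, the choice to rank by likely responsiveness, the `reviewer` callback standing in for a human coder, and the decision to loop until every document is reviewed (a real workflow would normally stop earlier).

```python
from collections import Counter

def train(coded):
    """Build per-word weights from (text, is_responsive) pairs coded so far."""
    weights = Counter()
    for text, is_responsive in coded:
        for word in set(text.lower().split()):
            weights[word] += 1 if is_responsive else -1
    return weights

def score(weights, text):
    """Higher scores mean the document looks more likely to be responsive."""
    return sum(weights[w] for w in set(text.lower().split()))

def continuous_review(seed, unreviewed, reviewer, batch_size=1):
    """Retrain after each batch so every reviewer decision feeds the next ranking."""
    coded = list(seed)
    remaining = list(unreviewed)
    review_order = []
    while remaining:
        weights = train(coded)  # the training set grows on every pass
        remaining.sort(key=lambda t: score(weights, t), reverse=True)
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        for text in batch:
            coded.append((text, reviewer(text)))  # human decision becomes training data
            review_order.append(text)
    return coded, review_order

# Hypothetical seed samples, collection, and reviewer decisions.
seed = [("merger pricing agreement", True), ("lunch menu friday", False)]
truth = {
    "pricing call with counsel": True,
    "counsel notes on merger": True,
    "friday parking notice": False,
    "counsel memo": True,
}

coded, review_order = continuous_review(seed, list(truth), truth.get)
print(review_order)
```

In this toy collection, "counsel memo" shares no words with the seed samples, yet it moves ahead of the parking notice once reviewers have coded other documents mentioning counsel as responsive. The non-responsive notice is reviewed last.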

But just as with a simple learning system, the review is only going to be as accurate as the coding decisions of the human reviewers. If the review is reporting inconsistent results, the human coders will need to look back at how they have coded documents in the past and tailor their decisions to get better results. The primary difference is that the system continues to learn as the human coders work. In its most common form, where the software chooses which documents the reviewers see next, this approach is known as continuous active learning (CAL).

This methodology also allows a smaller representative set to be sampled across the collection, because continuous learning (or re-training) is anticipated as part of the workflow design. Smart content sampling across the collection ensures that every concept is well represented, and the samples are presented in small, manageable review batches.

Which Predictive Review Methodology is Better?

The difference between simple and continuous learning can be a bit subtle. Many believe that continuous learning provides a better overall review experience. Simple learning systems are a little more limited, in that you can only give the machine a core amount of knowledge and then run it against all documents at once. Continuous learning systems are able to respond and adapt far more efficiently, thereby cutting down the amount of time that eDiscovery takes and providing a more consistent review process. However, at the end of the day, both a simple and a continuous learning system should be able to produce highly accurate results — as long as they are used properly.

The key to success with either methodology is to keep your predictive goals straightforward and simple: Responsive / Non-Responsive, Privileged / Not Privileged. Initial training of the system should be performed by a small, if not solo, review team to keep the concepts focused. Once a responsive set is presented with a high level of accuracy, it can be easily batched out to a larger group of linear reviewers.

Surprisingly, many review platforms do not allow parallel predictive reviews, which prevents multiple reviewers from predictively pushing through different case issues at the same time. It’s important to understand your review platform’s capabilities in this regard before you commit to a solution.

Are you trying to decide between a simple vs. continuous learning system? Are you implementing a predictive review methodology — or investing in predictive coding workflows? Platinum can help you find the system that is best for your organization.

