Document Summarization through Deep Learning Neural Networks.
Simplified with Summary Engine.
We live in a world of verbose and complex concepts. It’s human nature to want to express ourselves, whether for education, documentation or even entertainment, to our teams, families and friends.
That expression runs from news articles and educational works to messages between loved ones in text or chat applications, not to mention petrified bodies of work like email, expert reports or deposition transcripts. There’s a world of information out there, and clouds of words become haystacks obscuring the important or relevant content we need to dig out. Our challenge has become finding the important parts in the corpora of information around us.
The Millennials have dubbed this deluge of copy ‘TL;DR’, or “Too Long – Didn’t Read”. Long copy is traditionally ended with a TL;DR, followed by a summary of the long text, provided for those readers who don’t have the time (or in some cases the desire) to read 3 minutes’ worth of text.
Depending on the disposition of the author or reader, using TL;DR is considered anything from highly appropriate to absolutely disrespectful. Either way, it’s a sign that we as a human race have basically had enough of long copy.
We can easily chalk this up to another obvious case of millennial laziness, or we can call it what it is. It’s innovative, it’s creative and it’s useful in a number of business, educational and casual / entertainment scenarios.
Just consider how useful it could be to summarize a great American classic like Moby Dick. In fact, we all remember CliffsNotes, the successful company that built an entire business model on doing just that. The need is so obvious that books traditionally carry jacket copy to give the prospective reader a high-level understanding of the content without committing to read the book at all.
Summaries are extraordinarily useful.
In the legal industry, many attorneys, paralegals, secretarial staff and even subject matter experts are tasked with summarizing complex content. The summaries save time, either in recalling an important document or in orienting a new reader to its content without that reader committing to read the whole document.
It’s important work, but I think we can all agree that our time could be better spent elsewhere. Furthermore, human summarization of documents is a clearly subjective process which produces inconsistent results when performed by multiple participants.
A highly skilled reviewer can summarize six to ten documents per hour. With hourly rates ranging from $50 to $700, that works out to roughly $5 to $117 per document, so the process is both time-consuming for the legal team and ultimately expensive for the end client. Summaries also carry evidentiary weight: Texas Rule of Evidence 1006 allows parties to use summaries, charts, and calculations to prove the content of voluminous, otherwise admissible writings, recordings, or photographs that cannot be conveniently examined in court. Not everything needs summarization, but wouldn’t it be nice to have summaries for all of your hot documents? Consider, for instance, documents that might become exhibits in the future, documents that were produced, or documents withheld for privilege.
Would your partner become more involved in the discovery process if you could provide summaries? What if the summaries could be created quickly, confidentially and inexpensively by a Convolutional Neural Network with Machine Learning?
Summary Engine. A new direction.
Enter Summary Engine, Cullable’s newest development pipeline.
With Cullable, Platinum already provides the world’s fastest text generation and extraction workflows. With the highest quality OCR in the world in any language, combined with its scalability for digital text extraction, it performs beyond any measured solution in any industry. Summary Engine’s automated document summarization seems like a perfect fit as the newest major feature release for Cullable’s platform.
The sample below is from news content processed through Summary Engine. The original copy had 168 words and 991 characters. Summarized, we’re left with the most important 47 words and 282 characters. That is a reduction of about 72% by word count and just under 72% by character count!
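If you’d like to check that math yourself, here is a minimal sketch in Python that uses only the word and character counts quoted above. The function name reduction_percent is our own illustration and is not part of Summary Engine or the Developers API.

```python
def reduction_percent(original: int, summarized: int) -> float:
    """Return the percentage of the original removed by summarization."""
    if original <= 0:
        raise ValueError("original must be a positive count")
    return (1 - summarized / original) * 100


# Counts quoted in the post for the news sample.
word_reduction = reduction_percent(168, 47)
char_reduction = reduction_percent(991, 282)

print(f"Words: {word_reduction:.1f}% shorter")       # Words: 72.0% shorter
print(f"Characters: {char_reduction:.1f}% shorter")  # Characters: 71.5% shorter
```

The same helper works for any pair of before and after counts, so you can run it on your own documents when comparing summaries.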
Meanwhile, the need for text summarization goes well beyond legal. We plan to release localized plugins and text summarization tools for many popular applications in the coming months, along with offering the solution as part of our own Developers API for existing and prospective integrators of Cullable.
Today, we’re offering no-cost pilots of the technology to the general public. Platinum is seeking feedback and suggested improvements to Summary Engine. We’re inviting participants to directly enter the development pipeline with the inventors and developers of Cullable.
Was the blog post too long to read? You could have summarized it 🙂 Here’s an example of what Summary Engine would do for this document.
“Long copy is traditionally ended with a TL;DR, followed by a summary of the long text, provided for those readers who don’t have time (or in some cases the desire) to read 3 minutes’ worth of text. The Millennials have dubbed this deluge of copy ‘TL;DR’, or “Too Long – Didn’t Read”.
Legal teams are often tasked with summarizing complex content to save time in either recalling an important document or for the purpose of orienting a new reader with the document’s content.
What if the summaries could be created quickly, confidentially and inexpensively by a Convolutional Neural Network with Machine Learning?
Cullable is offering no-cost pilots of its Summary Engine technology for the general public. We plan to release localized plugins and text summarization tools for many popular applications in the coming months. We will also provide the solution as part of our own Developers API for existing and prospective integrators of Cullable.”
Check Out Summary Engine for Yourself.