10 Dec /19

Data Extraction

Data Extraction – Word of the day – EVS Translations

Whether you realize it or not, every day, you collect and process numerous pieces of information that influence your actions going forward. For example, traffic conditions may change the route you decide to take, or financial news may cause you to rethink your spending habits or investment plans. In many ways, businesses (notably law firms) do the same thing as individuals, and this same process – called data extraction – is today’s word.

Breaking down the term itself: data, the classical plural of the Latin datum (‘an item of information’), was first used in English by Sir William Batten in his 1630 work, the Most Easie Way Finding Sunnes Amplitude. Extraction, meaning ‘the process of withdrawing or obtaining something’, derives from Medieval Latin through the Old French extracion and is first found in the Acts of Parliament of 1530-31.

Much in the same way that an individual would hypothetically scan through global weather reports to find the current conditions affecting their locality, the process of data extraction – especially under the auspices of e-discovery – scans through massive amounts of electronic data in order to find necessary and relevant information.
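To make the idea concrete, here is a minimal sketch of that kind of scan, assuming the documents are already available as plain text. The function name `extract_relevant`, the sample documents, and the keyword list are all hypothetical, and real e-discovery tools are far more sophisticated than a keyword match.

```python
import re

def extract_relevant(documents, keywords):
    """Return (doc_id, sentence) pairs for sentences that mention any keyword."""
    # One case-insensitive pattern that matches any keyword as a whole word.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for doc_id, text in documents.items():
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if pattern.search(sentence):
                hits.append((doc_id, sentence.strip()))
    return hits

# Hypothetical sample data, for illustration only.
documents = {
    "mail_001": "Lunch is at noon. The supply contract expires in March.",
    "mail_002": "See you at the game on Friday!",
    "memo_007": "Legal flagged the indemnity clause. Please review the contract terms.",
}

hits = extract_relevant(documents, ["contract", "indemnity"])
for doc_id, sentence in hits:
    print(doc_id, "->", sentence)
```

Only the sentences that mention a keyword come back, each tagged with the document it came from; the unrelated message is skipped entirely.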

Of course, beyond our simple weather analogy, and considering that business is now both local and global, “information” is not monolithic: it comes from many varied sources in a multitude of languages, which often leaves a massive amount of material to review. To give an idea, 10 GB of “rough” e-discovery data is equal to 10 truckloads of paper to review, and turning that volume around quickly necessitates specialized machine translation.

Considering that…

  • Three times as many companies face organizational-risk legislation as 5 years ago.
  • 49% of industry professionals cite the variety, volume, and speed of dissimilar forms of data from the e-discovery process as an issue.
  • 98% of legal professionals using early case assessment, which relies heavily on e-discovery and data extraction, cite it as a necessary and efficient approach.
  • 70% of e-discovery costs are attributed to the review of documents – USD 18,000 per GB.


Not only will it be increasingly important to find the information “needle” in the growing proverbial haystack, but the greater challenge will be doing it in a cost-effective manner.
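As a rough, back-of-the-envelope illustration of those costs, the sketch below applies the figures quoted above to the 10 GB example. It assumes that USD 18,000 per GB is the document review cost and that review makes up 70% of the total; both are readings of the statistics rather than confirmed facts, and the function names are hypothetical.

```python
# Figures quoted in the article; how they combine is an assumption.
REVIEW_COST_PER_GB_USD = 18_000  # assumed to be the review cost per GB
REVIEW_SHARE_OF_TOTAL = 0.70     # assumed share of total e-discovery cost

def review_cost(gigabytes):
    """Estimated document review cost in USD."""
    return gigabytes * REVIEW_COST_PER_GB_USD

def implied_total_cost(gigabytes):
    """Total cost implied if review is 70% of the overall spend."""
    return review_cost(gigabytes) / REVIEW_SHARE_OF_TOTAL

review = review_cost(10)
total = implied_total_cost(10)
print(f"Review cost for 10 GB: USD {review:,.0f}")
print(f"Implied total cost:    USD {total:,.0f}")
```

Under these assumptions, reviewing 10 GB comes to USD 180,000, with an implied total of roughly USD 257,000.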

Download EVS Translations’ white paper Language Technology for Law Firms and find out how to apply AI-based solutions to the problem of searching large amounts of foreign language data.