Neural Translation + Human Post-Editing: Is It Good Enough to Publish?

Will it look better at 1/3 less?

Neural translation (NT) is showing great promise as a baseline tool for high-volume content translation. Compared to (statistical) machine translation, Neural offers better comprehension and readability right out of the box.

Few would argue that the quality of Neural translation is good enough to publish as is. But what if a human post-edit is added? And not just a light post-edit (our level 3 of quality), but a full human post-edit (level 4)?

Would you, as a high-volume content publisher, use this process for final distribution in certain markets? What if it cost 33% less than a full human translation? Would you use it then?

These are the issues and opportunities being examined by many of our clients. Here’s some food for thought:

  • Most everyone would agree that (statistical) machine translation by itself is not sufficient for publishing. It even strains basic comprehension in many cases.
  • Adding a light or even a full human post-edit can improve output, but accuracy is far from guaranteed.
  • Neural translation relies on artificial intelligence to translate concepts, phrases, and sentences versus single words, so readability and comprehension are significantly improved. This makes it much easier to do a human post-edit.
  • Post-editing a neural translation therefore takes less time. It’s likely to be no less accurate. And it’s an easier process for a translator. So, it costs less.

Here is an example of the potential savings:

Full Human Translation

  • 1 File (300 words) human-translated into 1 language: $90
  • Cost for 6 languages: $540
  • X 40 files per month (10 per week): $21,600
  • X 12 months: $259,200

Neural Translation Plus Full Post-Edit

  • 1 File, neural-translated into 1 language: about $1
  • + Full human post-edit of neural translation: $59
  • Cost for 6 languages: $360
  • X 40 files per month (10 per week): $14,400
  • X 12 months: $172,800

Annual Savings: $86,400
Percent Savings: 33%
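The arithmetic above can be reproduced with a short script. This is a minimal sketch, not a pricing tool: the function name `annual_cost` and its parameters are illustrative, the prices are the example figures from the comparison, and the neural translation step is assumed to cost $1 per file so that the six-language total matches the $360 shown.

```python
def annual_cost(price_per_file_per_language, languages=6, files_per_month=40, months=12):
    """Yearly cost for a fixed per-file, per-language price (example figures only)."""
    return price_per_file_per_language * languages * files_per_month * months


# Example figures from the comparison above (US dollars).
full_human = annual_cost(90)        # full human translation
nt_plus_fpe = annual_cost(1 + 59)   # neural translation (assumed $1) + full post-edit

savings = full_human - nt_plus_fpe
percent = 100 * savings / full_human

print(f"Full human translation: ${full_human:,}")
print(f"NT + full post-edit:    ${nt_plus_fpe:,}")
print(f"Annual savings:         ${savings:,} ({percent:.0f}%)")
```

Changing the defaults (number of languages, files per month) shows how the savings scale with your own volume.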

A Practical Application

The best way to test the savings possible with Neural translation plus a full human post-edit may be to offer it to your internal translators. Could they process more content more quickly, and with less frustration, if they were supplied with an NT + FPE translation to start? Is this an enormous time savings? Or do they end up checking the source material so often that it’s not worth it?

We don’t prejudge the outcome for our clients. But at approximately 33% less, it may well be worth the test.

That’s why we’re offering to help you test it for yourself. You can submit a 3-page sample of your typical content, and we’ll neural-translate and fully post-edit it for free. Compare it to a full human translation to see if there’s much difference.

Finally, whatever your quality needs or desired process, we’re here to support you. NT + FPE is just one of our many solution options.

EDITOR’S NOTE: Linguistic Systems uses a combination of advanced proprietary technology and 7,500 skilled, certified translators to deliver high-quality translations in 120+ languages. With 50+ years and billions of words of experience serving 25,000 clients, including many Fortune 100 and AmLaw100 firms, you can trust us with your must-win foreign language translation projects. 

Monetize Your Association Assets — Build A Foreign Language Translation Library

Your Foreign Language Translation Library Is A Hidden Asset

If you’re responsible for the growth of your association membership and/or the improvement of member services, a significant opportunity may be available for you. You can monetize one of your most important assets — your industry-specific terminology and knowledge — by building a foreign language translation library. This multilingual glossary enables members to enhance their translations with augmented intelligence.

This can help members save significantly on foreign language translations, a need they encounter all the time if they’re doing business in non-English-speaking countries. And who isn’t these days?

Here’s what typically happens:

  • A member company needs to translate its product documentation, promotional content, advertising, Web site, contracts, or other essential communications related to its business.
  • They engage a translation company and pay the going rate to translate EVERY WORD of their content. And they pay too much.
  • If they expect to do a number of translation projects, they might work with their language service provider to build a library of the most commonly used words, terms, and expressions which are particular to their business and, most likely, to their industry.
  • On subsequent projects, this library serves as a repository of already translated terms. This means your member will save time and money by not having to translate the same words again.

Now, imagine that instead of building a library of already translated terms on their own, one project at a time, members could reference a massive library of words, terms, and expressions from across your entire industry or association. It would include terms they may not have thought about yet, but ones that have proven important to other members who are more advanced in their business.

And imagine if this massive industry library of relevant terms were set up and owned by your association. How much would that be worth to you each year in member retention and acquisition? It could be a standard benefit for everyone, or you could monetize it further by offering it as a premium service.

How would you go about setting up such a valuable library? You would talk to us. We’ve been in the translation business for 50 years — serving major Fortune 100 and AmLaw 100 clients for decades. And we’ve been building and managing sometimes massive proprietary libraries for almost all of them. We offer:

  • the technology (including machine, neural, and human translation);
  • the know-how (including fluency in 120+ languages and cultures, and a carefully screened and tested network of 7,500 translators);
  • and the experience (including 50 years of service and 6 quality and cost options).

Finally, your data will be secure. Any library we create for you will be proprietary to your association and the members to whom you provide access. No one else will access your data. Our ISO 27001 certification in information security management guarantees it.

Monetize your industry knowledge for your members, and your association, today.

EDITOR’S NOTE: Linguistic Systems uses a combination of advanced proprietary technology and 7,500 skilled, certified translators to deliver high-quality translations in 120+ languages. With 50 years and billions of words of experience serving 25,000 clients, you can trust us to build your foreign language translation library.

Are Your Translations Exposing You To Risk?

How Linguistics Cracked The Ransomware Code

Cyberattacks, bugs, viruses, cybertheft, malware or ransomware … a breach of data security under any name is formidable. But linguistic analysis is proving to be a valuable tool in cracking a hacker’s code.

As technology advances, the sophistication and intricacies of cyberterrorism add new complexity to data and risk management. However, each attack embeds identifiers in the code that can help lead authorities to the correct perpetrator.

Global law enforcement officials search for those identifiers within the malware to lead to the source of the attack. By analyzing language trends within the code, authorities can make assumptions about where the attack originated.

For example, in the WannaCry ransomware attack, ransom letters were sent out in different languages. But linguistic errors in the notes pointed to generic output from free machine translation engines.

Experts saw that the hacker’s use of certain Chinese characters hinted at fluency, while the failure to recognize grammatical and contextual cues in other languages supported the forensic conclusion that those versions had been machine-translated.

You want to be careful of the accuracy of machine translation by itself, especially from free translation sites. (Note: Linguistic Systems uses advanced, proprietary statistical and neural engines for its machine translation. We then add human translation as needed, to get to the desired quality level.)

According to Flashpoint authors Jon Condra, John Costello, and Sherman Chu, in an article published May 25, 2017, “A number of unique characteristics in the note indicate it was written by a fluent Chinese speaker. A typo in the note, “帮组” (bang zu) instead of “帮助” (bang zhu) meaning “help,” strongly indicates the note was written using a Chinese-language input system rather than being translated from a different version. More generally, the note makes use of proper grammar, punctuation, syntax, and character choice, indicating the writer was likely native or at least fluent.”

Data security starts with a commitment to confidentiality. Although free translation sites may seem like a quick and cost-effective choice to translate your documents, they can expose you to risk.

Even Google Translate’s FAQs confirm this possibility: “The stored text is typically deleted in a few hours, although occasionally we will retain it for longer while we perform debugging and other testing. Google also temporarily logs some metadata about translation requests (such as the time the request was received and the size of the request) to improve our service.”

The lack of accountability of free translation sites may contribute to lower-quality translations. In the WannaCry case, forgoing human expertise probably gave authorities valuable clues to the hackers’ location. The episode also highlights the flaws of machine translation software in general, particularly on free sites.

Using a free online translation tool may seem cost-effective, but it invites a third party to engage with your content — one that cannot be held accountable in the event of a security breach. This exposes you to risk.

To be sure that you have the most secure and accurate translation, put your trust in a translation service provider who can offer you the cost- and time-effective methods of machine translation complemented with the expertise of human translation as needed. Choose a service provider with a strong history of excellence in translation and confidentiality supported by multiple security certifications.

We’ve got you covered in all those areas.

EDITOR’S NOTE:  Linguistic Systems maintains an information security management system certified to the requirements of the ISO 27001 information security standards.

Machine Translation: The Hidden Value of Outliers

Would You Invest $1 for Enhanced Machine Translation to Save $198?

Savings can hide in the most curious places, particularly when using machine translation (MT) on vast quantities of foreign language content.

In the legal profession, corporate law, and many other high-volume settings, machine translation is usually the starting point for churning through large amounts of foreign-language documents. But as powerful as machine translation engines can be, there are countless words that MT engines aren’t trained to know.

These “outlier” words can take many forms:

• No machine translation engine contains all the words of a language (e.g., English is thought to have more than 1 million words);

• Words may have multiple uses, and machine translation can mistakenly apply the wrong context (e.g., “cranes by the riverbank,” referring to birds and a body of water, is not the same as “construction cranes used to build the bank by the river”);

• Some words may be rarely used (e.g., in English where grammatical use has changed, words like “thee,” “thy,” and “whom” may be unknown);

• Words may be newly created (e.g., in German and Russian, it is common to see new words pop up or combinations of words with new concepts);

• Company names and product names should generally not be translated, although there are certainly exceptions in some languages;

• Personal names should not be translated (e.g., “Mr. Grey” is a person and not a “man of a certain color”);

• Some chemical compound names may not be present in a machine translation word repository;

• Industry-specific terms may not be widely known outside the industry.

What do you do with these “outlier” words?

Linguistic Systems’ translation analytics capability allows us to extract all words that the machine translation engine “did not translate” (DNT). An output file of those words is then created which includes the number of times they occurred. This file is used to create a new proprietary client glossary (or to update existing glossaries) with DNT words. The glossary then serves as a reference for the machine translation engine, for this client’s specific project or case going forward.
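As a rough illustration of the kind of output file described above, the following sketch tallies untranslated words and writes them out with their counts. It is not Linguistic Systems’ actual tooling: it assumes the engine’s untranslated words are already available as a plain list, and the names `count_dnt_words` and `write_dnt_report` are hypothetical.

```python
import csv
import io
from collections import Counter


def count_dnt_words(untranslated_words):
    """Tally how often each untranslated word occurred (case-sensitive)."""
    return Counter(untranslated_words)


def write_dnt_report(counts, out):
    """Write word,count rows, most frequent first, to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["word", "occurrences"])
    for word, n in counts.most_common():
        writer.writerow([word, n])


# Hypothetical input: words the engine left untranslated across a batch of files.
sample = ["Acme", "polyethylene", "Acme", "Grey", "polyethylene", "Acme"]
counts = count_dnt_words(sample)

report = io.StringIO()
write_dnt_report(counts, report)
print(report.getvalue())
```

Sorting by frequency puts the words that affect the most text at the top, which is one reasonable way to prioritize them for a glossary review.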

The next step is most important. We collaborate offline with our clients to prioritize and define the outliers so they can be added to a custom glossary with “DNT” words for that client and project. The job can then be rerun with the custom glossary. The result: Significant savings of both time and money for a portion of the files. Here’s how it works.

Economics

Here are the numbers for a project that contains 2,000 files to be translated. Starting with machine translation, we might run 400 files (one fifth) as straight MT with no updated glossary. (We would leverage glossaries if they exist from previous jobs.)

Our machine translation engine would isolate the outlier “DNT” words and their number of instances. After we collaborate with the client, the whole project (all 2,000 files) is rerun with the updated glossary in place.

To use round numbers, let’s assume that straight machine translation is charged out at $1 per document. The cost might go to $2 per document for enhanced machine translation (using the custom glossary) on the 400 documents to be examined more deeply. However, this $1 increase per file may enable far greater savings on this segment, where human post-editing would otherwise be recommended to significantly improve clarity.

Post-editing could run $200 per document for the files that require it. Replacing a $200 post-edit with $2 enhanced machine translation saves $198 per file; across 400 files, that comes to $79,200 in savings by not having to post-edit that segment of machine translation work.
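For readers who want to plug in their own numbers, the calculation can be sketched as follows. The variable names are illustrative, and the prices are the round figures assumed in this example, not quoted rates.

```python
files_in_segment = 400   # files flagged for closer review
post_edit_cost = 200     # human post-editing, per file (US dollars)
enhanced_mt_cost = 2     # machine translation with custom glossary, per file

savings_per_file = post_edit_cost - enhanced_mt_cost
total_savings = savings_per_file * files_in_segment

print(f"Savings per file: ${savings_per_file}")
print(f"Total savings:    ${total_savings:,}")
```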

There should also be a significant savings in time if machine translation plus an augmented glossary can be used. The next time a project is run for that same client, a more robust glossary is already in place, and collaboration time on new “outliers” should be less. The process keeps getting quicker, more efficient, and less expensive with each additional job.

Machine translation is still limited in its quality. Even files that have gone through an enhanced glossary may require additional human translation if their end purpose demands it.

Machine translation, even with enhanced glossaries and post-editing, is nowhere near “certification grade.” An attorney could not take the content into court unless it has gone through a full and careful human edit. But augmenting the machine translation process with custom glossaries can be a very cost-effective option for segments of large projects.

EDITOR’S NOTE: Linguistic Systems uses a combination of advanced proprietary technology and 7,500 skilled, certified translators to deliver high-quality translations in 120+ languages. Trust us with your next translation project.