Localization of web applications involves managing translations in multiple languages. This can be a complex task, especially when your application is continually evolving. Leveraging AWS Translate and Node.js can significantly streamline and automate this process. In this blog post, we will develop a Node.js script that uses AWS Translate to handle translations, while smartly managing existing translations and updating only what’s necessary. We assume a basic understanding of Node.js and AWS services; if you’re new to either, it’s worth getting comfortable with them before proceeding.

Lost in Translation: The Hilarious (and Potentially Embarrassing) Pitfalls of Machine Translation

While machine translation has come a long way and can be a great tool for quickly translating large amounts of text, it is important to note that it should only be used as a starting point. This is particularly true when it comes to translating text for public consumption in a professional setting. Machine translation, including services like AWS Translate, can sometimes produce awkward or inaccurate translations, as they lack the nuanced understanding of cultural context, idioms, and language subtleties that a human translator possesses.

It’s strongly advised to have any translations produced by machine reviewed and refined by a native speaker of the target language before they are made publicly available. Relying solely on machine translation can lead to unintentional, and potentially embarrassing, mistakes.

One anecdotal example of a machine translation gone awry involves the translation of a popular American fast food slogan into Chinese. The famous slogan “Finger-lickin’ good” was reportedly translated into Chinese as “Eat your fingers off”. While this may seem humorous, it underscores the potential pitfalls of relying solely on machine translation and the importance of human review to ensure accuracy and cultural appropriateness.

If you are looking for a service that can provide cost-effective human review and refinement of your machine-translated files, you may want to consider Straker Translations. Their team of professional translators can help ensure that your translated content is accurate and culturally appropriate. This is not an endorsement, and it’s recommended to explore various options to find a service that best suits your needs.

Prerequisites

Before we start, ensure that you have the following:

  • Node.js installed on your system.
  • An AWS account with access to AWS Translate.
  • An existing application with a localization file structure where each locale has its own directory and each directory contains JSON files with key-value pairs for translation. Our example assumes Next.js.

Source Files and Integration with Next.js

For our use case, we will be working with a Next.js application that uses the i18next library for localization. The source language files are JSON files located in the ./public/locales/{defaultLocale} directory, where {defaultLocale} is the locale of the default language (for example, en-US for American English). Each JSON file represents a different namespace used in the Next.js application and consists of key-value pairs, where the key is the translation identifier and the value is the text in the source language.

Here’s an example of what the source language files might look like for two namespaces, homepage and dashboard:

Homepage (./public/locales/en-US/homepage.json):

{
  "welcome": "Welcome to our application",
  "footer_text": "Thank you for visiting our site"
}

Dashboard (./public/locales/en-US/dashboard.json):

{
  "welcome_back": "Welcome back, {user}!",
  "login_prompt": "Please log in to continue"
}

An example configuration file for i18next (next-i18next.config.js) might look like this:

module.exports = {
  i18n: {
    defaultLocale: 'en-US',
    locales: ['en-US', 'fr-FR', 'es-ES'],
  },
};

When the Next.js application is run, i18next will look for these files in the specified directory and use them to provide translations based on the current language setting of the application. If a translation for the current language doesn’t exist, i18next will fall back to the default language.
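
To make this concrete, here’s a minimal sketch of how a page might consume the homepage namespace with next-i18next. The page, markup, and use of getStaticProps are illustrative only and are not part of the script we’re building:

// pages/index.js (illustrative example of consuming the "homepage" namespace)
import { useTranslation } from 'next-i18next';
import { serverSideTranslations } from 'next-i18next/serverSideTranslations';

export default function HomePage() {
  // "homepage" matches the namespace file homepage.json shown above
  const { t } = useTranslation('homepage');
  return (
    <main>
      <h1>{t('welcome')}</h1>
      <p>{t('footer_text')}</p>
    </main>
  );
}

export async function getStaticProps({ locale }) {
  return {
    props: {
      // Load the "homepage" namespace for the active locale
      ...(await serverSideTranslations(locale, ['homepage'])),
    },
  };
}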

The script we’re about to create will read these source files, translate them into the target languages using AWS Translate, and save the translated files in their respective ./public/locales/{locale} directories. This way, when we switch the language in our Next.js application, i18next will have the translated files readily available to serve to the user.

Please note that a detailed explanation of how i18next works is beyond the scope of this blog post. If you’re not familiar with i18next, we recommend that you read the official i18next documentation before proceeding.

While this example is specifically for Next.js, the concept could easily be adapted to suit other applications. The key idea is to automate the translation of source language files and update them in a way that integrates smoothly with your localization library and application structure.

Step by Step Guide

Step 1: Install the necessary packages

Our script depends on several Node.js packages. Install them using the following command:

npm install @aws-sdk/client-translate dotenv glob

These packages include:

  • @aws-sdk/client-translate: The AWS SDK for JavaScript (v3) client package for Amazon Translate.
  • dotenv: A zero-dependency module that loads environment variables from a .env file into process.env.
  • glob: A package to match files using the patterns the shell uses.

Step 2: Setting up AWS Translate

To use AWS Translate, you need your AWS credentials. These can be set in a dedicated environment file (here named .translate.env) that you load with the dotenv package. The file could look something like this:

AWS_REGION=us-east-1
I18N_AWS_ACCESS_KEY_ID=your-access-key-id
I18N_AWS_SECRET_ACCESS_KEY=your-secret-access-key

You can load these into your environment with the dotenv package:

require('dotenv').config({ path: './.translate.env' });

Step 3: Create the translation script

We’ll create a script that reads JSON files from the default locale directory, translates the content, and writes the translated content to corresponding files in the target locale directories.

The script uses the glob package to find all JSON files in the default locale directory. It then iterates over each file and each locale, translating the content from the default language to the target language using AWS Translate.

The script carefully manages the translations by:

  • Checking whether a translated file already exists and, if so, loading the existing translations.
  • Translating only the new keys from the source file that don’t yet exist in the target file.
  • Removing any keys from the target file that no longer exist in the source file.
  • Writing the updated translations back to the target file.

The script also logs the number of new keys added, keys deleted, or if no changes were made for each file.

Here’s the complete script:

require('dotenv').config({ path: './.translate.env' });

const { TranslateClient, TranslateTextCommand } = require('@aws-sdk/client-translate');
const fs = require('fs');
const { glob } = require('glob');
const path = require('path');
const { exit } = require('process');

// Read the configuration file
const config = require('../next-i18next.config');

const translate = new TranslateClient({
    region: process.env.AWS_REGION,
    credentials: {
        accessKeyId: process.env.I18N_AWS_ACCESS_KEY_ID, 
        secretAccessKey: process.env.I18N_AWS_SECRET_ACCESS_KEY 
    }
});

const locales = config.i18n.locales;
const defaultLocale = config.i18n.defaultLocale;

const translateText = async (text, lang) => {
    const params = {
        Text: text,
        SourceLanguageCode: defaultLocale.split('-')[0], // Get the language part of the default locale
        TargetLanguageCode: lang.split('-')[0], // Get the language part of the locale
    };

    const command = new TranslateTextCommand(params);

    const translation = await translate.send(command);

    return translation.TranslatedText;
};

const doTranslation = async () => {
    let files = [];
    try {
        files = await glob(`./public/locales/${defaultLocale}/*.json`); 
    } catch (error) {
        console.error(error);
        exit(1);
    }

    for (const locale of locales) {
        // Skip the default locale
        if (locale === defaultLocale) continue;

        for (const file of files) {
            const newFile = file.replace(`/${defaultLocale}/`, `/${locale}/`);
            console.log(`Translating ${file} to ${newFile}...`);

            const fileContent = fs.readFileSync(file, 'utf-8');
            const jsonContent = JSON.parse(fileContent);

            let existingTranslations = {};
            if (fs.existsSync(newFile)) {
                const existingContent = fs.readFileSync(newFile, 'utf-8');
                existingTranslations = JSON.parse(existingContent);
            }

            let translations = {};
            let newKeys = 0;
            let deletedKeys = 0;
            const translatePromises = Object.keys(jsonContent).map(async (key) => {
                if (!(key in existingTranslations)) {
                    const translation = await translateText(jsonContent[key], locale);
                    translations[key] = translation;
                    newKeys++;
                } else {
                    // Copy over the existing translation
                    translations[key] = existingTranslations[key];
                }
            });

            await Promise.all(translatePromises);

            // Check for keys that were removed
            Object.keys(existingTranslations).forEach((key) => {
                if (!(key in jsonContent)) {
                    deletedKeys++;
                }
            });

            // Ensure the directory exists, if not create it
            const dir = path.dirname(newFile);
            if (!fs.existsSync(dir)) {
                fs.mkdirSync(dir, { recursive: true });
            }

            if (newKeys > 0 || deletedKeys > 0) {
                console.log(`Added ${newKeys} keys and deleted ${deletedKeys} keys for ${newFile}.`);
            } else {
                console.log(`No changes for ${newFile}.`);
            }

            fs.writeFileSync(newFile, JSON.stringify(translations, null, 2));
        }
    }
};

doTranslation();

Step 4: Update your package.json and run the script

At this point, you have a fully functional script that can translate your source language files into any number of target languages. To make running this script even easier, you could add a new script to your package.json file.

Here’s an example of how to do it:

{
  "name": "your-app-name",
  "version": "1.0.0",
  "scripts": {
    "start": "node index.js",
    "test": "jest",
    "translate": "node path/to/your/translation/script.js"
  },
  "dependencies": {
    ...
  }
}

In this example, we’ve added a new script called “translate”. Its value is the command that npm runs when you execute npm run translate in your terminal, which in turn invokes your translation script with Node.js.

Now, to run your translation script, all you have to do is open your terminal, navigate to your project directory, and run:

npm run translate

This command will execute the translation script, translating your source language files into your target languages using AWS Translate. With this setup, you now have an automated workflow for managing your localization files. This not only accelerates the process of adding new languages to your web application but also ensures your translations are up to date with the evolving source language.
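
If you’d like translations to be refreshed automatically before each production build, one option, shown here purely as a sketch for a Next.js project, is to chain the commands together in package.json:

{
  "scripts": {
    "translate": "node path/to/your/translation/script.js",
    "build": "npm run translate && next build"
  }
}

Whether this belongs in your build step or in a separate CI job depends on how often your source strings change and how you want to review the generated files.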

Extending the Script: Additional Features

The script we’ve developed is a solid starting point for automating your translation workflow, but there are several additional features you could implement to enhance its functionality:

  1. Support for Nested JSON Objects: The current script assumes that all translations are at the top level of the JSON object. If your application uses nested objects for organizing translations, you would need to modify the script to navigate these nested structures (see the sketch after this list).

  2. Pluralization Support: Many languages have complex rules for pluralization. A library that handles pluralization, such as i18next, could be integrated into the script to ensure accurate translations of singular and plural forms.

  3. Version Control Integration: Integrating the script with version control systems like Git would enable you to track changes made by the script, providing a clear history of what has been changed, added, or removed.

  4. Error Handling and Retry Logic: The script could benefit from more robust error handling. If the AWS Translate API fails to respond or throws an error, the script could retry the request a certain number of times before failing (see the retry sketch after this list).

  5. Rate Limiting: To prevent hitting API limits for the AWS Translate service, rate limiting could be implemented in the script. This would involve tracking the number of requests made in a certain time period and delaying requests if the limit is reached.

  6. Parallelization: To speed up the translation process, the script could be modified to process multiple files simultaneously. However, you would need to be cautious of API rate limits if you implement this.

  7. Support for Other Translation Services: While this script uses AWS Translate, it could be extended to support other translation services, like Google Translate or Microsoft Translator. This would involve abstracting the translation functionality so that the translation service can be easily swapped.

  8. Web Interface: For non-technical users, a web interface where users can trigger translations, monitor progress, and download completed translations could be developed.

  9. Translation Memory: Implement a system that remembers previously translated phrases or sentences to prevent translating the same segments again, saving costs and ensuring consistency across translations.
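
To illustrate a couple of these ideas, here are two rough sketches. Both reuse translateText from the script above; the helper names translateValue, translateTextWithRetry, and sleep are hypothetical, and the retry counts and delays are arbitrary placeholders.

First, the per-key translation step could be made recursive so that nested objects and arrays (item 1) are handled:

// Hypothetical helper: recursively translate a value that may be a string,
// an object of nested keys, or an array of strings.
const translateValue = async (value, locale) => {
    if (typeof value === 'string') {
        return translateText(value, locale);
    }
    if (Array.isArray(value)) {
        return Promise.all(value.map((item) => translateValue(item, locale)));
    }
    if (value && typeof value === 'object') {
        const result = {};
        for (const key of Object.keys(value)) {
            result[key] = await translateValue(value[key], locale);
        }
        return result;
    }
    // Numbers, booleans, null, etc. pass through unchanged
    return value;
};

Second, a simple wrapper with exponential backoff around translateText is one way to approach the retry logic in item 4:

// Hypothetical wrapper: retry a failed translation with exponential backoff.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const translateTextWithRetry = async (text, lang, maxAttempts = 3) => {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await translateText(text, lang);
        } catch (error) {
            if (attempt === maxAttempts) throw error;
            // Wait 1s, 2s, 4s, ... before the next attempt
            const delay = 1000 * 2 ** (attempt - 1);
            console.warn(`Attempt ${attempt} failed, retrying in ${delay}ms...`);
            await sleep(delay);
        }
    }
};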

Each of these features brings its own set of complexities, so thorough testing will be crucial as you continue to enhance the script. By tailoring the script to the needs of your project, you can create a powerful tool that streamlines your localization process and ensures your application can connect with users across the globe.
