CSV per Locale

AI translation for CSV files with separate files per locale using Lingo.dev CLI

What is CSV per Locale?

CSV per Locale is a translation approach where each locale has its own CSV file, rather than all locales sharing a single CSV file with one column per locale. This format is useful for structured, multi-column data (such as product catalogs, user data, or exports from content management systems) where each row represents a record and each column represents a field.

For example:

id,name,description,created,enabled,sort
1,Welcome,Welcome to our application,2024-01-01,true,1
2,Save,Save your changes,2024-01-01,true,2
3,Error,An error occurred,2024-01-01,true,3

Unlike the standard CSV bucket, which stores all locales in one file with columns like KEY,en,es, the csv-per-locale bucket maintains a separate file for each locale and preserves the original CSV structure with all of its columns.

What is Lingo.dev CLI?

Lingo.dev CLI is a free, open-source CLI for translating apps and content with AI. It's designed to replace traditional translation management software while integrating with existing pipelines.

To learn more, see Overview.

About this guide

This guide explains how to translate CSV files using the csv-per-locale bucket with Lingo.dev CLI.

You'll learn how to:

  • Create a project from scratch
  • Configure a translation pipeline with separate CSV files per locale
  • Generate translations with AI
  • Use locked and ignored keys

Prerequisites

To use Lingo.dev CLI, ensure that Node.js v18+ is installed:

❯ node -v
v22.17.0

Step 1. Set up a project

In your project's directory, create an i18n.json file:

{
  "$schema": "https://lingo.dev/schema/i18n.json",
  "version": "1.10",
  "locale": {
    "source": "en",
    "targets": ["es"]
  },
  "buckets": {}
}

This file defines the behavior of the translation pipeline, including what languages to translate between and where the localizable content exists on the file system.

To learn more about the available properties, see i18n.json.

Step 2. Configure the source locale

The source locale is the original language and region that your content was written in. To configure the source locale, set the locale.source property in the i18n.json file:

{
  "$schema": "https://lingo.dev/schema/i18n.json",
  "version": "1.10",
  "locale": {
    "source": "en",
    "targets": ["es"]
  },
  "buckets": {}
}

The source locale must be provided as a BCP 47 language tag.

For the complete list of the locale codes that Lingo.dev CLI supports, see Supported locale codes.

Step 3. Configure the target locales

The target locales are the languages and regions you want to translate your content into. To configure the target locales, set the locale.targets property in the i18n.json file:

{
  "$schema": "https://lingo.dev/schema/i18n.json",
  "version": "1.10",
  "locale": {
    "source": "en",
    "targets": ["es"]
  },
  "buckets": {}
}
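
Before running the CLI, it can help to sanity-check the locale settings yourself. The following is a minimal sketch, not part of Lingo.dev CLI: check_locales is a hypothetical helper, and the regular expression is only a loose approximation of a BCP 47 tag (language, optional script, optional region). The authoritative list is the one in Supported locale codes.

```python
import json
import re

# Loose approximation of a BCP 47 tag: language, optional script, optional region.
# Assumption: this is NOT a full validator and NOT the CLI's own check.
SIMPLE_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|\d{3}))?$")


def check_locales(config: dict) -> list[str]:
    """Return a list of human-readable problems found in the locale settings."""
    problems = []
    locale = config.get("locale", {})
    source = locale.get("source")
    targets = locale.get("targets", [])

    if not isinstance(source, str) or not SIMPLE_TAG.match(source):
        problems.append(f"source locale looks invalid: {source!r}")
    if not targets:
        problems.append("no target locales configured")
    for target in targets:
        if not isinstance(target, str) or not SIMPLE_TAG.match(target):
            problems.append(f"target locale looks invalid: {target!r}")
    return problems


# The same configuration used in this guide, inlined as a string for the demo.
config = json.loads("""
{
  "$schema": "https://lingo.dev/schema/i18n.json",
  "version": "1.10",
  "locale": {"source": "en", "targets": ["es"]},
  "buckets": {}
}
""")
print(check_locales(config))  # prints []
```

An empty list means the sketch found nothing suspicious; it says nothing about whether the CLI supports a given code.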

Step 4. Create the source content

If you haven't already, create a CSV file for your source locale. The file should contain:

  • A header row with column names
  • One or more data rows
  • Any columns you need (not limited to specific column names)

The CSV file must be located at a path that includes the source locale somewhere in the path (e.g., as a directory name like en/ or as part of the filename like data.en.csv).

Note: Unlike the standard CSV bucket, you don't need a "KEY" column or a column matching the source locale. The csv-per-locale bucket treats each row as a record and translates the text content of the CSV while preserving its structure.
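
If your source data lives in code or a database rather than in a file, you can generate the source CSV yourself. The sketch below writes this guide's sample rows to en/example.csv. write_source_csv is a hypothetical helper, and a temporary directory stands in for your project directory.

```python
import csv
import tempfile
from pathlib import Path

HEADER = ["id", "name", "description", "created", "enabled", "sort"]
ROWS = [
    ["1", "Welcome", "Welcome to our application", "2024-01-01", "true", "1"],
    ["2", "Save", "Save your changes", "2024-01-01", "true", "2"],
    ["3", "Error", "An error occurred", "2024-01-01", "true", "3"],
]


def write_source_csv(project_dir: Path, source_locale: str = "en") -> Path:
    """Write the sample rows to <project_dir>/<source_locale>/example.csv."""
    path = project_dir / source_locale / "example.csv"
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADER)  # header row with column names
        writer.writerows(ROWS)   # one or more data rows
    return path


# A temporary directory stands in for the project directory in this demo.
with tempfile.TemporaryDirectory() as tmp:
    created = write_source_csv(Path(tmp))
    text = created.read_text(encoding="utf-8")
    print(created.relative_to(tmp))
    print(text)
```

The source locale appears in the path as the en directory name, which is one of the two placements described above.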

Step 5. Create a bucket

  1. In the i18n.json file, add a "csv-per-locale" object to the buckets object:

    {
      "$schema": "https://lingo.dev/schema/i18n.json",
      "version": "1.10",
      "locale": {
        "source": "en",
        "targets": ["es"]
      },
      "buckets": {
        "csv-per-locale": {}
      }
    }
    
  2. In the "csv-per-locale" object, define an array of one or more include patterns:

    {
      "$schema": "https://lingo.dev/schema/i18n.json",
      "version": "1.10",
      "locale": {
        "source": "en",
        "targets": ["es"]
      },
      "buckets": {
        "csv-per-locale": {
          "include": ["./[locale]/example.csv"]
        }
      }
    }
    

    These patterns define which files to translate.

    The patterns themselves:

    • must contain [locale] as a placeholder for the configured locale
    • can point to file paths (e.g., "[locale]/data.csv")
    • can use asterisks as wildcard placeholders (e.g., "[locale]/*.csv")

    Recursive glob patterns (e.g., **/*.csv) are not supported.

  3. Optionally, you can configure lockedKeys and ignoredKeys:

    {
      "$schema": "https://lingo.dev/schema/i18n.json",
      "version": "1.10",
      "locale": {
        "source": "en",
        "targets": ["es"]
      },
      "buckets": {
        "csv-per-locale": {
          "include": ["./[locale]/example.csv"],
          "lockedKeys": ["locked_key_1"],
          "ignoredKeys": ["ignored_key_1"]
        }
      }
    }
    
    • lockedKeys: Keys (column values in the first column, typically an ID) that should not be translated
    • ignoredKeys: Keys that should not appear in target locale files
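
To make the include pattern rules from item 2 concrete, here is a minimal sketch of how a pattern maps to one path per locale. resolve_pattern and paths_by_locale are hypothetical helpers written for this guide, not functions of Lingo.dev CLI, and the checks only mirror the rules listed above.

```python
def resolve_pattern(pattern: str, locale: str) -> str:
    """Substitute the [locale] placeholder in an include pattern."""
    if "[locale]" not in pattern:
        raise ValueError(f"pattern has no [locale] placeholder: {pattern!r}")
    if "**" in pattern:
        raise ValueError(f"recursive globs are not supported: {pattern!r}")
    # Single asterisks are left untouched; expanding them is out of scope here.
    return pattern.replace("[locale]", locale)


def paths_by_locale(pattern: str, source: str, targets: list[str]) -> dict[str, str]:
    """Map the source locale and every target locale to its resolved path."""
    return {locale: resolve_pattern(pattern, locale) for locale in [source, *targets]}


print(paths_by_locale("./[locale]/example.csv", "en", ["es"]))
# prints {'en': './en/example.csv', 'es': './es/example.csv'}
```

With the configuration in this guide, the source file is read from ./en/example.csv and the Spanish file is written to ./es/example.csv.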

Step 6. Configure an LLM

Lingo.dev CLI uses large language models (LLMs) to translate content with AI. To use one of these models, you need an API key from a supported provider.

To get up and running as quickly as possible, we recommend using Lingo.dev Engine, our own hosted platform that offers 10,000 free tokens per month:

  1. Sign up for a Lingo.dev account.

  2. Run the following command:

    npx lingo.dev@latest login
    

    This will open your default browser and ask you to authenticate.

  3. Follow the prompts.

Step 7. Generate the translations

In the directory that contains the i18n.json file, run the following command:

npx lingo.dev@latest run

This command:

  1. Reads the i18n.json file.
  2. Finds the files that need to be translated.
  3. Extracts the translatable content from the CSV files.
  4. Uses the configured LLM to translate the extracted content.
  5. Writes the translated content back to separate CSV files for each target locale.
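
Steps 3 to 5 of this list can be pictured with a small, self-contained sketch. It is an illustration under stated assumptions, not the CLI's implementation: fake_translate is a hypothetical stand-in for the LLM call (a fixed lookup table), and cells it does not recognize, such as dates and flags, pass through unchanged.

```python
import csv
import io

# Stand-in for the LLM call: a fixed lookup table, purely for illustration.
FAKE_TRANSLATIONS = {
    "Save": "Guardar",
    "Save your changes": "Guarda tus cambios",
}


def fake_translate(text: str, target_locale: str) -> str:
    """Hypothetical translator; returns the input unchanged when it has no entry."""
    return FAKE_TRANSLATIONS.get(text, text)


def translate_csv(source_text: str, target_locale: str) -> str:
    """Translate every cell of a CSV document, keeping header and column order."""
    reader = csv.DictReader(io.StringIO(source_text))
    fieldnames = reader.fieldnames or []
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()  # the header row is copied, not translated
    for row in reader:
        writer.writerow(
            {col: fake_translate(row[col], target_locale) for col in fieldnames}
        )
    return out.getvalue()


source = (
    "id,name,description,created,enabled,sort\n"
    "2,Save,Save your changes,2024-01-01,true,2\n"
)
print(translate_csv(source, "es"))
```

The output keeps the same header and the same non-text cells, with only the name and description values replaced.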

The first time translations are generated, an i18n.lock file is created. This file keeps track of what content has been translated, preventing unnecessary retranslations on subsequent runs.
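
The idea behind checksum-based change detection can be sketched as follows. The key format ("row index/column") follows the i18n.lock example in this guide, but the choice of MD5 and of hashing the raw cell text are assumptions made for illustration; the CLI's actual hashing scheme may differ, so these digests should not be expected to match a real lock file.

```python
import csv
import hashlib
import io


def cell_checksums(csv_text: str, columns: list[str]) -> dict[str, str]:
    """Map "<row index>/<column>" to an MD5 hex digest of the cell text."""
    checksums = {}
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        for column in columns:
            digest = hashlib.md5(row[column].encode("utf-8")).hexdigest()
            checksums[f"{index}/{column}"] = digest
    return checksums


def changed_keys(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return the keys whose checksum is new or differs from the recorded one."""
    return sorted(key for key, digest in new.items() if old.get(key) != digest)


before = "id,name\n1,Welcome\n2,Save\n"
after = "id,name\n1,Welcome\n2,Save all\n"

old = cell_checksums(before, ["name"])
new = cell_checksums(after, ["name"])
print(changed_keys(old, new))  # prints ['1/name']
```

Only the edited cell is reported as changed, which is why unchanged content does not need to be sent for translation again.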

Example

en/example.csv (before translation)

id,name,description,created,enabled,sort
1,Welcome,Welcome to our application,2024-01-01,true,1
2,Save,Save your changes,2024-01-01,true,2
3,Error,An error occurred,2024-01-01,true,3
4,Success,Operation completed successfully,2024-01-01,true,4
5,Loading,Please wait while we load your data,2024-01-01,true,5

es/example.csv (after translation)

id,name,description,created,enabled,sort
1,Bienvenida,Bienvenido a nuestra aplicación,2024-01-01,true,1
2,Guardar,Guarda tus cambios,2024-01-01,true,2
3,Error,Ha ocurrido un error,2024-01-01,true,3
4,Éxito,Operación completada con éxito,2024-01-01,true,4
5,Cargando,Por favor espera mientras cargamos tus datos,2024-01-01,true,5
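
After a run, you may want to confirm that the translated file kept the structure of the source file. The sketch below compares two CSV documents and reports differences in the header, the row count, or columns that should stay untouched. structure_problems is a hypothetical helper, and the list of fixed columns is an assumption based on this example's data.

```python
import csv
import io

EN = """id,name,description,created,enabled,sort
1,Welcome,Welcome to our application,2024-01-01,true,1
2,Save,Save your changes,2024-01-01,true,2
"""

ES = """id,name,description,created,enabled,sort
1,Bienvenida,Bienvenido a nuestra aplicación,2024-01-01,true,1
2,Guardar,Guarda tus cambios,2024-01-01,true,2
"""


def structure_problems(source_text: str, target_text: str, fixed_columns: list[str]) -> list[str]:
    """Compare two CSV documents and describe any structural differences."""
    src = list(csv.reader(io.StringIO(source_text)))
    tgt = list(csv.reader(io.StringIO(target_text)))
    if src[0] != tgt[0]:
        return ["header rows differ"]
    if len(src) != len(tgt):
        return [f"row counts differ: {len(src) - 1} vs {len(tgt) - 1}"]

    header = src[0]
    problems = []
    for row_no, (s_row, t_row) in enumerate(zip(src[1:], tgt[1:]), start=1):
        for column in fixed_columns:
            i = header.index(column)
            if s_row[i] != t_row[i]:
                problems.append(f"row {row_no}: column {column!r} changed")
    return problems


print(structure_problems(EN, ES, ["id", "created", "enabled", "sort"]))  # prints []
```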

i18n.json

{
  "$schema": "https://lingo.dev/schema/i18n.json",
  "version": "1.10",
  "locale": {
    "source": "en",
    "targets": ["es"]
  },
  "buckets": {
    "csv-per-locale": {
      "include": ["./[locale]/example.csv"],
      "lockedKeys": ["locked_key_1"],
      "ignoredKeys": ["ignored_key_1"]
    }
  }
}
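
Read literally, the lockedKeys and ignoredKeys descriptions from Step 5 amount to a three-way decision per row. The sketch below models only that description, assuming the key is the value in the first column; plan_rows is a hypothetical helper, and the CLI's real matching rules may differ.

```python
def plan_rows(rows: list[list[str]], locked_keys: set[str], ignored_keys: set[str]) -> dict[str, str]:
    """Decide, per first-column key, what happens to the row in a target file."""
    plan = {}
    for row in rows:
        key = row[0]  # assumption: the key is the first column, typically an ID
        if key in ignored_keys:
            plan[key] = "omit"       # should not appear in target locale files
        elif key in locked_keys:
            plan[key] = "copy"       # kept as-is, not translated
        else:
            plan[key] = "translate"  # sent to the configured LLM
    return plan


rows = [
    ["1", "Welcome", "Welcome to our application"],
    ["2", "Save", "Save your changes"],
    ["3", "Error", "An error occurred"],
]
print(plan_rows(rows, locked_keys={"2"}, ignored_keys={"3"}))
# prints {'1': 'translate', '2': 'copy', '3': 'omit'}
```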

i18n.lock

version: 1
checksums:
  e8b273672f895de0944f0a2317670d7c:
    0/name: 1308168cca4fa5d8d7a0cf24e55e93fc
    0/description: 8de4bc8832b11b380bc4cbcedc16e48b
    1/name: f7a2929f33bc420195e59ac5a8bcd454
    1/description: 8de4bc8832b11b380bc4cbcedc16e48b
    2/name: d3d99b147cc363dc6db8a48e8a13d4c1
    2/description: 7cd986af1fe5e89abe7ecffba5413110

Differences from CSV bucket

The csv-per-locale bucket differs from the standard csv bucket in several ways:

  • File structure: csv-per-locale uses separate files for each locale (e.g., en/example.csv, es/example.csv), while csv uses a single file with multiple columns (e.g., KEY,en,es).

  • Column requirements: csv-per-locale doesn't require a "KEY" column or locale-named columns. You can use any column structure that fits your data.

  • Use cases: csv-per-locale is ideal for structured data like product catalogs, content management systems, or databases where each row represents a record with multiple fields. The standard csv bucket is better suited for simple key-value translation tables.

  • File mutations: Both buckets mutate files directly, but csv-per-locale creates separate files for each locale, while csv adds new columns to the existing file.
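
To see the two layouts side by side, the following sketch splits a standard multi-column CSV (KEY,en,es) into one CSV document per locale. It is a hypothetical migration helper, not a feature of Lingo.dev CLI, and the "value" column name in the output is an arbitrary choice made for this example.

```python
import csv
import io


def split_by_locale(combined_text: str, key_column: str = "KEY") -> dict[str, str]:
    """Turn one KEY,<locale>,<locale>,... table into one CSV document per locale."""
    reader = csv.DictReader(io.StringIO(combined_text))
    locales = [name for name in (reader.fieldnames or []) if name != key_column]

    buffers = {locale: io.StringIO() for locale in locales}
    writers = {}
    for locale, buffer in buffers.items():
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow([key_column, "value"])  # "value" is a made-up column name
        writers[locale] = writer

    for row in reader:
        for locale in locales:
            writers[locale].writerow([row[key_column], row[locale]])
    return {locale: buffer.getvalue() for locale, buffer in buffers.items()}


combined = "KEY,en,es\nsave,Save,Guardar\nerror,Error,Error\n"
files = split_by_locale(combined)
print(files["en"])
print(files["es"])
```

Each returned document could then be saved under a path such as en/example.csv or es/example.csv to match a csv-per-locale include pattern.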