How to build locale identifiers from components

Construct locale identifiers by combining language, script, and region codes in JavaScript

Introduction

Locale identifiers like en-US or zh-Hans-CN encode information about language, writing system, and region. Sometimes you need to construct these identifiers programmatically rather than using a fixed string. For example, you might let users select their language and region separately, then combine them into a valid locale identifier.

JavaScript's Intl.Locale constructor allows you to build locale identifiers from individual components. You pass a language tag as the first argument and supply script and region in an options object, and the constructor assembles them into a properly formatted identifier.

This guide explains how to construct locale identifiers from components, when to use this approach, and how to handle edge cases.

Understanding locale identifier components

Locale identifiers consist of components separated by hyphens. Each component represents a different aspect of cultural preferences.

The language code specifies which language to use. It uses two or three lowercase letters from ISO 639:

  • en for English
  • es for Spanish
  • fr for French
  • zh for Chinese
  • ar for Arabic

The script code specifies the writing system. It uses four letters with the first letter capitalized, from ISO 15924:

  • Hans for Simplified Chinese characters
  • Hant for Traditional Chinese characters
  • Cyrl for Cyrillic script
  • Latn for Latin script

The region code specifies the geographic area. It uses two uppercase letters from ISO 3166-1 or three digits from UN M.49:

  • US for United States
  • GB for United Kingdom
  • CN for China
  • MX for Mexico

These components combine in a specific order: language, then script, then region. For example, zh-Hans-CN means Chinese language, Simplified script, China region.
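The shapes and ordering described above can be sketched as a small splitter. This is an illustration only: SUBTAG_SHAPES and splitIdentifier are hypothetical names, and the patterns cover just the three component shapes listed in this section, not the full BCP 47 grammar (variants and extensions are ignored).

```javascript
// Hypothetical shape patterns for the three components described above.
// Matching is case-insensitive; this sketch checks shape only, not registries.
const SUBTAG_SHAPES = {
  language: /^[a-z]{2,3}$/i,       // two or three letters (ISO 639)
  script: /^[a-z]{4}$/i,           // four letters (ISO 15924)
  region: /^(?:[a-z]{2}|\d{3})$/i  // two letters (ISO 3166-1) or three digits (UN M.49)
};

// Splits an identifier into components, relying on the fixed order:
// language first, then an optional script, then an optional region.
function splitIdentifier(identifier) {
  const parts = identifier.split("-");
  const result = { language: parts.shift() };

  if (!SUBTAG_SHAPES.language.test(result.language)) {
    return null;
  }
  if (parts.length > 0 && SUBTAG_SHAPES.script.test(parts[0])) {
    result.script = parts.shift();
  }
  if (parts.length > 0 && SUBTAG_SHAPES.region.test(parts[0])) {
    result.region = parts.shift();
  }

  // Anything left over is out of order or outside the shapes this sketch knows.
  return parts.length === 0 ? result : null;
}

console.log(splitIdentifier("zh-Hans-CN"));
// Output: { language: "zh", script: "Hans", region: "CN" }

console.log(splitIdentifier("zh-CN-Hans"));
// Output: null (script after region is out of order)
```

In practice the Intl.Locale constructor does this parsing for you. The sketch only makes the ordering rule visible.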

Building locales with only language and region

The most common scenario is combining language and region codes. Most applications do not need to specify the script because each language has a default script.

You can construct a locale by passing the language code as the first argument and an options object with the region:

const locale = new Intl.Locale("en", {
  region: "US"
});

console.log(locale.toString());
// Output: "en-US"

This creates a locale identifier for American English.

You can build different regional variants of the same language:

const usEnglish = new Intl.Locale("en", { region: "US" });
const britishEnglish = new Intl.Locale("en", { region: "GB" });
const canadianEnglish = new Intl.Locale("en", { region: "CA" });

console.log(usEnglish.toString()); // "en-US"
console.log(britishEnglish.toString()); // "en-GB"
console.log(canadianEnglish.toString()); // "en-CA"

Each variant uses the same language but different regional formatting conventions.
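To see the regional difference in practice, you can format the same date with two variants. The following is a minimal sketch: formatForRegion and sampleDate are illustrative names, and the exact strings depend on the locale data shipped with your runtime, so the comments describe the usual field order rather than a guaranteed output.

```javascript
// A fixed date: 15 October 2025, expressed in UTC.
const sampleDate = new Date(Date.UTC(2025, 9, 15));

// Hypothetical helper: formats sampleDate for English in the given region.
function formatForRegion(region) {
  const locale = new Intl.Locale("en", { region });
  // Pin the time zone so the result does not depend on the machine running it.
  const formatter = new Intl.DateTimeFormat(locale, { timeZone: "UTC" });
  return formatter.format(sampleDate);
}

console.log(formatForRegion("US")); // month comes first in en-US
console.log(formatForRegion("GB")); // day comes first in en-GB
```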

Building locales with language, script, and region

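Some languages are written in more than one script. Chinese, Serbian, and a few other languages use multiple writing systems. Specify the script explicitly to avoid ambiguity, because code that receives a locale without a script can only assume the most likely script for that language and region.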

You can add the script component to the options object:

const simplifiedChinese = new Intl.Locale("zh", {
  script: "Hans",
  region: "CN"
});

console.log(simplifiedChinese.toString());
// Output: "zh-Hans-CN"

This creates a locale for Simplified Chinese as used in China.

Traditional Chinese uses a different script and region:

const traditionalChinese = new Intl.Locale("zh", {
  script: "Hant",
  region: "TW"
});

console.log(traditionalChinese.toString());
// Output: "zh-Hant-TW"

The script code distinguishes between the two writing systems.

Serbian uses both Cyrillic and Latin scripts. You need to specify which script to use:

const serbianCyrillic = new Intl.Locale("sr", {
  script: "Cyrl",
  region: "RS"
});

const serbianLatin = new Intl.Locale("sr", {
  script: "Latn",
  region: "RS"
});

console.log(serbianCyrillic.toString()); // "sr-Cyrl-RS"
console.log(serbianLatin.toString()); // "sr-Latn-RS"

Both locales use Serbian language in Serbia, but with different writing systems.
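If you are unsure which script a locale implies when none is given, the maximize() method can tell you. The sketch below uses a hypothetical helper named likelyScript. The results come from the likely-subtags data in your runtime.

```javascript
// Hypothetical helper: returns the script that maximize() fills in
// when the locale is built without an explicit script.
function likelyScript(language, region) {
  // region may be undefined; the constructor ignores undefined options.
  return new Intl.Locale(language, { region }).maximize().script;
}

console.log(likelyScript("zh"));       // "Hans"
console.log(likelyScript("zh", "TW")); // "Hant"
console.log(likelyScript("sr", "RS")); // "Cyrl"
```

This shows why sr-Latn-RS needs the explicit script: without it, Serbian in Serbia is assumed to be Cyrillic.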

Building locales from user selections

User interfaces often let users select language and region separately. You can combine these selections into a locale identifier.

Consider a settings form with two dropdown menus:

function buildLocaleFromSelections(languageCode, regionCode) {
  const locale = new Intl.Locale(languageCode, {
    region: regionCode
  });

  return locale.toString();
}

const userLocale = buildLocaleFromSelections("es", "MX");
console.log(userLocale);
// Output: "es-MX"

This creates a locale identifier from independent selections.

You can validate the selections by catching errors from the constructor:

function buildLocaleFromSelections(languageCode, regionCode) {
  try {
    const locale = new Intl.Locale(languageCode, {
      region: regionCode
    });
    return {
      success: true,
      locale: locale.toString()
    };
  } catch (error) {
    return {
      success: false,
      error: error.message
    };
  }
}

const valid = buildLocaleFromSelections("fr", "CA");
console.log(valid);
// Output: { success: true, locale: "fr-CA" }

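const invalid = buildLocaleFromSelections("fr", "Canada");
console.log(invalid);
// Output: { success: false, error: "..." }

The constructor throws a RangeError if any component is syntactically malformed, such as a region that is not two letters or three digits. It does not check whether a well-formed code is actually assigned, so a pair like "invalid" and "XX" is accepted.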
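Because the constructor checks only the shape of each code, you may want a stricter check for user selections. One possible approach is an allow-list, sketched below. SUPPORTED_LANGUAGES, SUPPORTED_REGIONS, and buildSupportedLocale are hypothetical names, and the listed codes are placeholders for whatever your application actually supports.

```javascript
// Hypothetical allow-lists; a real application would list what it supports.
const SUPPORTED_LANGUAGES = new Set(["en", "es", "fr"]);
const SUPPORTED_REGIONS = new Set(["US", "GB", "CA", "MX", "FR"]);

// Returns an identifier only when both codes are on the allow-lists.
function buildSupportedLocale(languageCode, regionCode) {
  if (!SUPPORTED_LANGUAGES.has(languageCode) || !SUPPORTED_REGIONS.has(regionCode)) {
    return null;
  }
  return new Intl.Locale(languageCode, { region: regionCode }).toString();
}

// The constructor alone accepts a well-formed but unassigned region code.
console.log(new Intl.Locale("en", { region: "XX" }).toString());
// Output: "en-XX"

console.log(buildSupportedLocale("en", "XX"));
// Output: null

console.log(buildSupportedLocale("es", "MX"));
// Output: "es-MX"
```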

Building locales with optional components

Not every locale needs every component. You can omit components that are not required.

A language-only locale omits region and script:

const locale = new Intl.Locale("fr");
console.log(locale.toString());
// Output: "fr"

This represents French without specifying a particular region or script.

You can conditionally include components based on user input:

function buildLocale(language, options = {}) {
  const localeOptions = {};

  if (options.region) {
    localeOptions.region = options.region;
  }

  if (options.script) {
    localeOptions.script = options.script;
  }

  const locale = new Intl.Locale(language, localeOptions);
  return locale.toString();
}

console.log(buildLocale("en"));
// Output: "en"

console.log(buildLocale("en", { region: "US" }));
// Output: "en-US"

console.log(buildLocale("zh", { script: "Hans", region: "CN" }));
// Output: "zh-Hans-CN"

The function builds the simplest valid locale identifier based on available information.
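The truthiness checks in buildLocale matter because the constructor ignores an undefined option but rejects an empty string or null. The following sketch shows the difference. tryBuild is a hypothetical helper that returns either the identifier or the name of the thrown error.

```javascript
// Hypothetical helper: returns the identifier, or the error name on failure.
function tryBuild(language, options) {
  try {
    return new Intl.Locale(language, options).toString();
  } catch (error) {
    return error.name;
  }
}

console.log(tryBuild("en", { region: undefined }));
// Output: "en" (undefined options are skipped)

console.log(tryBuild("en", { region: "" }));
// Output: "RangeError" (an empty string is not a valid region)

console.log(tryBuild("en", { region: null }));
// Output: "RangeError" (null is converted to the string "null")
```

An unselected dropdown often yields an empty string, so filtering out falsy values before calling the constructor avoids a spurious error.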

Overriding components in existing locales

You can take an existing locale identifier and override specific components. This is useful when you need to change one part while keeping others intact.

The second argument to the constructor overrides components from the first argument:

const baseLocale = new Intl.Locale("en-US");
const withDifferentRegion = new Intl.Locale(baseLocale, {
  region: "GB"
});

console.log(withDifferentRegion.toString());
// Output: "en-GB"

The new locale keeps the language but changes the region.

You can override multiple components:

const original = new Intl.Locale("zh-Hans-CN");
const modified = new Intl.Locale(original, {
  script: "Hant",
  region: "TW"
});

console.log(modified.toString());
// Output: "zh-Hant-TW"

This changes both script and region while preserving the language.
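Two details of overriding are worth a quick illustration: components you do not override, including Unicode extensions, carry over, and the options object cannot remove a component. The sketch below uses an arbitrary example locale. To drop the script it rebuilds the locale from the parts it wants to keep, which is one possible approach rather than the only one.

```javascript
const base = new Intl.Locale("zh-Hans-CN-u-nu-hanidec");

// Overriding the region keeps the script and the Unicode extension.
const moved = new Intl.Locale(base, { region: "SG" });
console.log(moved.toString());
// Output: "zh-Hans-SG-u-nu-hanidec"

// Options cannot remove a component, so rebuild from the parts to keep.
const withoutScript = new Intl.Locale(base.language, { region: base.region });
console.log(withoutScript.toString());
// Output: "zh-CN"
```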

Adding formatting preferences to constructed locales

Beyond language, script, and region, locales can include formatting preferences. These preferences control how dates, numbers, and other values appear.

You can add calendar preferences when constructing a locale:

const locale = new Intl.Locale("ar", {
  region: "SA",
  calendar: "islamic"
});

console.log(locale.toString());
// Output: "ar-SA-u-ca-islamic"

console.log(locale.calendar);
// Output: "islamic"

The calendar preference appears as a Unicode extension in the identifier string.

You can specify multiple formatting preferences:

const locale = new Intl.Locale("en", {
  region: "US",
  calendar: "gregory",
  numberingSystem: "latn",
  hourCycle: "h12"
});

console.log(locale.toString());
// Output: "en-US-u-ca-gregory-hc-h12-nu-latn"

The constructor orders extension keys alphabetically.
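Extensions written directly in the identifier string and extensions passed as options end up in the same place. As a brief sketch, the example below reads an extension through its property and then shows an option replacing the value that was in the string.

```javascript
// An extension written in the tag is readable through the matching property.
const fromTag = new Intl.Locale("en-US-u-hc-h12");
console.log(fromTag.hourCycle);
// Output: "h12"

// An option for the same key replaces the value from the tag.
const overridden = new Intl.Locale("en-US-u-hc-h12", { hourCycle: "h23" });
console.log(overridden.toString());
// Output: "en-US-u-hc-h23"
```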

These preferences affect how formatters display data:

const locale = new Intl.Locale("ar", {
  region: "EG",
  numberingSystem: "arab"
});

const formatter = new Intl.NumberFormat(locale);
console.log(formatter.format(12345));
// Output: "١٢٬٣٤٥" (Arabic-Indic numerals)

The numbering system preference controls which digits appear.

Validating component combinations

Not all combinations of language, script, and region are meaningful. The constructor accepts any syntactically valid components, but some combinations may not represent real locales.

The constructor validates syntax but not semantic correctness:

// Syntactically valid but semantically questionable
const locale = new Intl.Locale("en", {
  script: "Arab",
  region: "JP"
});

console.log(locale.toString());
// Output: "en-Arab-JP"

This constructs a locale for English in Arabic script in Japan. The identifier is valid according to BCP 47, but it does not represent a real-world locale.

You can use the maximize() method to check if a locale matches common patterns:

const locale = new Intl.Locale("en", { region: "JP" });
const maximized = locale.maximize();

console.log(maximized.toString());
// Output: "en-Latn-JP"

The method adds the most likely script for the language and region. It does not validate the combination, but comparing an explicitly chosen script against the likely one can flag unusual combinations for review.
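One way to turn maximize() into a plausibility check is to compare an explicit script with the script that would be inferred without it. This is a heuristic sketch, not a validity test: hasLikelyScript is a hypothetical helper, and it reports legitimate minority scripts as unlikely.

```javascript
// Hypothetical heuristic: does the explicit script match the inferred one?
function hasLikelyScript(locale) {
  if (!locale.script) {
    return true; // nothing explicit to compare
  }
  // Rebuild without the script and let maximize() fill in the likely one.
  const likely = new Intl.Locale(locale.language, {
    region: locale.region
  }).maximize();
  return likely.script === locale.script;
}

console.log(hasLikelyScript(new Intl.Locale("en", { script: "Arab", region: "JP" })));
// Output: false

console.log(hasLikelyScript(new Intl.Locale("zh", { script: "Hant", region: "TW" })));
// Output: true

console.log(hasLikelyScript(new Intl.Locale("sr", { script: "Latn", region: "RS" })));
// Output: false (Serbian in Latin script is real, just not the default)
```

Treat a false result as a prompt for review rather than a reason to reject the locale.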

Reading components from constructed locales

After constructing a locale, you can read its components as properties.

The language property returns the language code:

const locale = new Intl.Locale("fr", { region: "CA" });
console.log(locale.language);
// Output: "fr"

The region property returns the region code:

const locale = new Intl.Locale("fr", { region: "CA" });
console.log(locale.region);
// Output: "CA"

The script property returns the script code if specified:

const locale = new Intl.Locale("zh", {
  script: "Hans",
  region: "CN"
});

console.log(locale.script);
// Output: "Hans"

If the script is not specified, the property returns undefined:

const locale = new Intl.Locale("en", { region: "US" });
console.log(locale.script);
// Output: undefined

The baseName property returns the complete identifier without extensions:

const locale = new Intl.Locale("ar", {
  region: "SA",
  calendar: "islamic",
  numberingSystem: "arab"
});

console.log(locale.baseName);
// Output: "ar-SA"

This gives you the language-script-region portion without formatting preferences.
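If you need several components at once, for example to store them or show them in a settings screen, you can collect the properties into a plain object. The sketch below assumes a helper named describeLocale, which is not part of the Intl API.

```javascript
// Hypothetical helper: copies the readable components into a plain object.
function describeLocale(locale) {
  return {
    language: locale.language,
    script: locale.script, // undefined when not specified
    region: locale.region,
    baseName: locale.baseName,
    calendar: locale.calendar,
    numberingSystem: locale.numberingSystem
  };
}

const parts = describeLocale(new Intl.Locale("ar-SA-u-ca-islamic-nu-arab"));

console.log(parts.baseName);
// Output: "ar-SA"

console.log(parts.calendar);
// Output: "islamic"

console.log(parts.script);
// Output: undefined
```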

Converting locale identifiers to strings

The toString() method returns the complete locale identifier as a string:

const locale = new Intl.Locale("es", { region: "MX" });
const identifier = locale.toString();

console.log(identifier);
// Output: "es-MX"

You can use the string with other Intl APIs:

const locale = new Intl.Locale("de", { region: "DE" });
const formatter = new Intl.NumberFormat(locale.toString());

const price = 1234.56;
console.log(formatter.format(price));
// Output: "1.234,56"

The formatter accepts the string representation.

Most Intl APIs also accept locale objects directly:

const locale = new Intl.Locale("de", { region: "DE" });
const formatter = new Intl.NumberFormat(locale);

The API reads the identifier from the locale object directly, so no explicit conversion is needed.
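As a quick check that both forms are interchangeable, you can compare what each formatter resolves to. This is a small sketch with illustrative variable names.

```javascript
const germanLocale = new Intl.Locale("de", { region: "DE" });

const fromObject = new Intl.NumberFormat(germanLocale);
const fromString = new Intl.NumberFormat(germanLocale.toString());

// Both formatters resolve to the same locale and produce the same text.
console.log(
  fromObject.resolvedOptions().locale === fromString.resolvedOptions().locale
);
// Output: true

console.log(fromObject.format(1234.56) === fromString.format(1234.56));
// Output: true
```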

Practical use cases

Building locale identifiers from components solves several common problems in internationalized applications.

Creating locale pickers

User interfaces often let users choose language and region separately. You combine the selections:

function createLocaleFromPicker(languageSelect, regionSelect) {
  const language = languageSelect.value;
  const region = regionSelect.value;

  const locale = new Intl.Locale(language, { region });
  return locale.toString();
}

// User selects "Spanish" and "Mexico"
const selectedLocale = createLocaleFromPicker(
  { value: "es" },
  { value: "MX" }
);

console.log(selectedLocale);
// Output: "es-MX"

Generating locale variants

You can generate multiple regional variants from a single language code:

function generateRegionalVariants(languageCode, regionCodes) {
  return regionCodes.map(regionCode => {
    const locale = new Intl.Locale(languageCode, {
      region: regionCode
    });
    return locale.toString();
  });
}

const englishVariants = generateRegionalVariants("en", [
  "US",
  "GB",
  "CA",
  "AU",
  "NZ"
]);

console.log(englishVariants);
// Output: ["en-US", "en-GB", "en-CA", "en-AU", "en-NZ"]

This creates a list of locale identifiers for different English-speaking regions.
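A list like this often feeds a dropdown, which needs a readable label next to each identifier. The sketch below pairs each variant with a region name from Intl.DisplayNames. It assumes your runtime provides Intl.DisplayNames, and labelVariants is a hypothetical helper name.

```javascript
// Hypothetical helper: pairs each identifier with a human-readable region name.
function labelVariants(languageCode, regionCodes, displayLanguage = "en") {
  const regionNames = new Intl.DisplayNames([displayLanguage], {
    type: "region"
  });

  return regionCodes.map(regionCode => ({
    value: new Intl.Locale(languageCode, { region: regionCode }).toString(),
    label: regionNames.of(regionCode) // for example "United States" for "US"
  }));
}

const options = labelVariants("en", ["US", "GB"]);
console.log(options.map(option => option.value));
// Output: ["en-US", "en-GB"]
```

The label text comes from the runtime's locale data, so it may vary slightly between environments.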

Building locales from URL parameters

URLs often encode locale preferences as separate parameters. You can construct a locale from these parameters:

function getLocaleFromURL(url) {
  const params = new URL(url).searchParams;
  const language = params.get("lang");
  const region = params.get("region");

  if (!language) {
    return null;
  }

  const options = {};
  if (region) {
    options.region = region;
  }

  try {
    const locale = new Intl.Locale(language, options);
    return locale.toString();
  } catch (error) {
    return null;
  }
}

const locale1 = getLocaleFromURL("https://example.com?lang=fr&region=CA");
console.log(locale1);
// Output: "fr-CA"

const locale2 = getLocaleFromURL("https://example.com?lang=ja");
console.log(locale2);
// Output: "ja"

Normalizing locale identifiers

You can normalize locale identifiers by parsing and reconstructing them:

function normalizeLocale(identifier) {
  try {
    const locale = new Intl.Locale(identifier);
    return locale.toString();
  } catch (error) {
    return null;
  }
}

console.log(normalizeLocale("EN-us"));
// Output: "en-US"

console.log(normalizeLocale("zh_Hans_CN"));
// Output: null (invalid separator)

The constructor normalizes case and validates structure.
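Identifiers with underscores are common in other systems. If you want to accept them, one option is to replace the separators before constructing the locale. The sketch below assumes that lenient behavior is acceptable for your input. normalizeLooseLocale is a hypothetical name.

```javascript
// Hypothetical lenient variant: treats "_" as "-" before parsing.
function normalizeLooseLocale(identifier) {
  try {
    const hyphenated = identifier.replace(/_/g, "-");
    return new Intl.Locale(hyphenated).toString();
  } catch (error) {
    return null; // still rejects anything that is not well-formed
  }
}

console.log(normalizeLooseLocale("zh_Hans_CN"));
// Output: "zh-Hans-CN"

console.log(normalizeLooseLocale("EN_us"));
// Output: "en-US"

console.log(normalizeLooseLocale("not a locale"));
// Output: null
```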

Configuring formatters with user preferences

You can build locale identifiers with formatting preferences based on user settings:

function buildFormatterLocale(language, region, preferences) {
  const locale = new Intl.Locale(language, {
    region,
    hourCycle: preferences.use24Hour ? "h23" : "h12",
    numberingSystem: preferences.numberingSystem
  });

  return locale;
}

const userPreferences = {
  use24Hour: true,
  numberingSystem: "latn"
};

const locale = buildFormatterLocale("fr", "FR", userPreferences);

const timeFormatter = new Intl.DateTimeFormat(locale, {
  hour: "numeric",
  minute: "numeric"
});

const now = new Date("2025-10-15T14:30:00");
console.log(timeFormatter.format(now));
// Output: "14:30" (24-hour format)

The locale includes formatting preferences from user settings.
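To confirm that a formatter picked up the preferences, you can inspect its resolved options. The following is a minimal sketch with illustrative variable names. It checks the hour cycle and numbering system that the formatter reports.

```javascript
const preferenceLocale = new Intl.Locale("fr", {
  region: "FR",
  hourCycle: "h23",
  numberingSystem: "latn"
});

const preferenceFormatter = new Intl.DateTimeFormat(preferenceLocale, {
  hour: "numeric",
  minute: "numeric"
});

// resolvedOptions() reports the settings the formatter actually uses.
const resolved = preferenceFormatter.resolvedOptions();

console.log(resolved.hourCycle);
// Output: "h23"

console.log(resolved.numberingSystem);
// Output: "latn"
```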

When to build locales from components

Building locales from components is useful in specific scenarios. Use this approach when you have separate language and region data, when processing user input, or when generating locale variants programmatically.

Use a literal string for fixed locales:

// Good for fixed locales
const locale = new Intl.Locale("en-US");

Build from components when values come from variables:

// Good for dynamic locales
const locale = new Intl.Locale(userLanguage, {
  region: userRegion
});

The constructor validates components and creates a properly formatted identifier.

Browser support

The Intl.Locale constructor works in all modern browsers. Chrome, Firefox, Safari, and Edge support the constructor and options object for building locales from components.

Node.js supports Intl.Locale starting from version 12, with full support for all constructor options in version 14 and later.
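If your code may run in an older environment, you can detect the constructor before using it. The sketch below is one possible fallback strategy, with buildLocaleIdentifier as a hypothetical name. The fallback branch only concatenates strings, so it performs no validation or case normalization.

```javascript
// Hypothetical wrapper: uses Intl.Locale when available, else concatenates.
function buildLocaleIdentifier(language, region) {
  if (typeof Intl === "object" && typeof Intl.Locale === "function") {
    return new Intl.Locale(language, { region }).toString();
  }
  // Fallback for old runtimes: no validation, no case normalization.
  return region ? `${language}-${region}` : language;
}

console.log(buildLocaleIdentifier("fr", "CA"));
// Output: "fr-CA"

console.log(buildLocaleIdentifier("ja"));
// Output: "ja"
```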

Summary

The Intl.Locale constructor builds locale identifiers from individual components. You pass the language code as the first argument and provide script, region, and formatting preferences in an options object.

Key concepts:

  • Locale identifiers consist of language, script, and region components
  • The constructor takes a language tag as its first argument and an options object with properties such as script and region
  • You can override components from an existing locale by passing it as the first argument
  • Formatting preferences like calendar and hourCycle appear as Unicode extensions
  • The toString() method returns the complete identifier string
  • Properties like language, region, and script let you read components
  • The constructor validates syntax but not semantic correctness

Use this approach when building locales from user input, generating regional variants, or combining separate language and region selections. For fixed locales, use string literals instead.