---
title: "How to normalize locale identifiers to standard form"
subtitle: "Convert locale identifiers to canonical format with correct casing and component ordering"
---

## Introduction

Locale identifiers can be written in many different ways while referring to the same language and region. A user might write `EN-us`, `en-US`, or `en-us`, and all three represent American English. When storing, comparing, or displaying locale identifiers, these variations create inconsistency.

Normalization converts locale identifiers to a standard canonical form. This process adjusts the casing of components, orders extension keywords alphabetically, and produces a consistent representation that you can rely on throughout your application.

JavaScript provides built-in methods to normalize locale identifiers automatically. This guide explains what normalization means, how to apply it in your code, and when normalized identifiers improve your internationalization logic.

## What normalization means for locale identifiers

Normalization transforms a locale identifier into its canonical form according to the BCP 47 standard and Unicode specifications. The canonical form has specific rules for casing, ordering, and structure.

A normalized locale identifier follows these conventions:

- Language codes are lowercase
- Script codes are title case with the first letter capitalized
- Region codes are uppercase
- Variant codes are lowercase
- Extension keywords are sorted alphabetically
- Extension attributes are sorted alphabetically

These rules create a single standard representation for each locale. No matter how a user writes a locale identifier, the normalized form is always the same.
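The casing rules are easiest to see side by side. The following sketch runs a few mixed-case identifiers through `Intl.getCanonicalLocales()`, which this guide covers in detail further down. The sample tags are chosen only for illustration; `valencia` and `rozaj` are registered variant subtags.

```javascript
// Mixed-case inputs chosen for illustration.
const samples = ["CA-es-VALENCIA", "SL-Rozaj", "SR-cyrl-rs"];

// Canonicalize each tag on its own so inputs and outputs stay paired.
const canonicalSamples = samples.map(tag => Intl.getCanonicalLocales(tag)[0]);

console.log(canonicalSamples);
// ["ca-ES-valencia", "sl-rozaj", "sr-Cyrl-RS"]
```

Language and variant subtags come out lowercase, the script subtag comes out in title case, and region subtags come out uppercase.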

## Understanding the normalization rules

Each component of a locale identifier has a specific casing convention in the canonical form.

### Language casing

Language codes always use lowercase letters:

```text
en (correct)
EN (incorrect, but normalizes to en)
eN (incorrect, but normalizes to en)
```

This applies to both two-letter and three-letter language codes.

### Script casing

Script codes use title case, where the first letter is uppercase and the remaining three letters are lowercase:

```text
Hans (correct)
hans (incorrect, but normalizes to Hans)
HANS (incorrect, but normalizes to Hans)
```

Common script codes include `Latn` for Latin, `Cyrl` for Cyrillic, `Hans` for Simplified Han characters, and `Hant` for Traditional Han characters.

### Region casing

Region codes always use uppercase letters:

```text
US (correct)
us (incorrect, but normalizes to US)
Us (incorrect, but normalizes to US)
```

This applies to the two-letter country codes used in most locale identifiers.
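Region subtags are not always letters. BCP 47 also allows three-digit UN M.49 area codes, such as `419` for Latin America and the Caribbean, and these have no casing to adjust. A brief sketch, with the tag picked for illustration:

```javascript
// A numeric region subtag passes through unchanged; only the language is lowercased.
const latinAmericanSpanish = Intl.getCanonicalLocales("ES-419");
console.log(latinAmericanSpanish);
// ["es-419"]

// The region is still exposed as a string on a locale object.
const numericRegion = new Intl.Locale("ES-419").region;
console.log(numericRegion);
// "419"
```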

### Extension ordering

Unicode extension tags contain keywords that specify formatting preferences. In the canonical form, these keywords appear in alphabetical order by their key:

```text
en-US-u-ca-gregory-nu-latn (correct)
en-US-u-nu-latn-ca-gregory (incorrect, but normalizes to first form)
```

The calendar key `ca` comes before the numbering system key `nu` alphabetically, so `ca-gregory` appears first in the normalized form.

## Using Intl.getCanonicalLocales to normalize

The `Intl.getCanonicalLocales()` method normalizes locale identifiers and returns them in canonical form. This is the primary method for normalization in JavaScript.

```javascript
const normalized = Intl.getCanonicalLocales("EN-us");
console.log(normalized);
// ["en-US"]
```

The method accepts a locale identifier with any casing and returns the properly cased canonical form.

### Normalizing language codes

The method converts language codes to lowercase:

```javascript
const result = Intl.getCanonicalLocales("FR-fr");
console.log(result);
// ["fr-FR"]
```

The language code `FR` becomes `fr` in the output.

### Normalizing script codes

The method converts script codes to title case:

```javascript
const result = Intl.getCanonicalLocales("zh-HANS-cn");
console.log(result);
// ["zh-Hans-CN"]
```

The script code `HANS` becomes `Hans`, and the region code `cn` becomes `CN`.

### Normalizing region codes

The method converts region codes to uppercase:

```javascript
const result = Intl.getCanonicalLocales("en-gb");
console.log(result);
// ["en-GB"]
```

The region code `gb` becomes `GB` in the output.

### Normalizing extension keywords

The method sorts extension keywords alphabetically:

```javascript
const result = Intl.getCanonicalLocales("en-US-u-nu-latn-hc-h12-ca-gregory");
console.log(result);
// ["en-US-u-ca-gregory-hc-h12-nu-latn"]
```

The keywords are reordered from `nu-latn-hc-h12-ca-gregory` to `ca-gregory-hc-h12-nu-latn` because `ca` sorts before `hc`, and `hc` sorts before `nu`.

## Normalizing multiple locale identifiers

The `Intl.getCanonicalLocales()` method accepts an array of locale identifiers and normalizes all of them:

```javascript
const locales = ["EN-us", "fr-FR", "ZH-hans-cn"];
const normalized = Intl.getCanonicalLocales(locales);
console.log(normalized);
// ["en-US", "fr-FR", "zh-Hans-CN"]
```

Each locale in the array is converted to its canonical form.

### Removing duplicates

The method removes duplicate locale identifiers after normalization. If multiple input values normalize to the same canonical form, the result contains only one copy:

```javascript
const locales = ["en-US", "EN-us", "en-us"];
const normalized = Intl.getCanonicalLocales(locales);
console.log(normalized);
// ["en-US"]
```

All three inputs represent the same locale, so the output contains a single normalized identifier.

This deduplication is useful when processing user input or merging locale lists from multiple sources.

### Handling invalid identifiers

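If any locale identifier in the array is not structurally valid, the method throws a `RangeError`. The check is purely structural: a made-up but well-formed tag such as `invalid` is accepted as a seven-letter language subtag, while `en_US` is rejected because underscores are not allowed as separators:

```javascript
try {
  Intl.getCanonicalLocales(["en-US", "en_US", "fr-FR"]);
} catch (error) {
  console.error(error instanceof RangeError);
  // true
  console.error(error.message);
  // The message text is engine-specific
}
```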

When normalizing user-provided lists, validate or catch errors for each locale individually to identify which specific identifiers are invalid.
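One way to do that is to canonicalize each entry in its own `try` block and collect the failures. The helper below is a hypothetical sketch, not a built-in: the name `partitionLocales` and the shape of its return value are assumptions, and it expects every entry to be a string.

```javascript
// Splits a list into canonical identifiers and the raw inputs that were rejected.
function partitionLocales(identifiers) {
  const valid = [];
  const invalid = [];

  for (const identifier of identifiers) {
    try {
      const [canonical] = Intl.getCanonicalLocales(identifier);
      // Keep the first occurrence only, mirroring the built-in deduplication.
      if (!valid.includes(canonical)) {
        valid.push(canonical);
      }
    } catch (error) {
      // Only structural failures are expected here; rethrow anything else.
      if (!(error instanceof RangeError)) {
        throw error;
      }
      invalid.push(identifier);
    }
  }

  return { valid, invalid };
}

const report = partitionLocales(["EN-us", "en_US", "fr-fr", "en-us"]);
console.log(report);
// { valid: ["en-US", "fr-FR"], invalid: ["en_US"] }
```

Reporting the rejected inputs verbatim makes it possible to show users exactly which entry needs correcting.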

## Using Intl.Locale for normalization

The `Intl.Locale` constructor also normalizes locale identifiers when creating locale objects. You can access the normalized form through the `toString()` method.

```javascript
const locale = new Intl.Locale("EN-us");
console.log(locale.toString());
// "en-US"
```

The constructor accepts any valid casing and produces a normalized locale object.

### Accessing normalized components

Each property of the locale object returns the normalized form of that component:

```javascript
const locale = new Intl.Locale("ZH-hans-CN");

console.log(locale.language);
// "zh"

console.log(locale.script);
// "Hans"

console.log(locale.region);
// "CN"

console.log(locale.baseName);
// "zh-Hans-CN"
```

The `language`, `script`, and `region` properties all use the correct casing for the canonical form.

### Normalizing with options

When you create a locale object with options, the constructor normalizes both the base identifier and the options:

```javascript
const locale = new Intl.Locale("EN-us", {
  calendar: "gregory",
  numberingSystem: "latn",
  hourCycle: "h12"
});

console.log(locale.toString());
// "en-US-u-ca-gregory-hc-h12-nu-latn"
```

The extension keywords appear in alphabetical order in the output, even though the options object does not specify any particular order.
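Options also take precedence over extension keywords that are already present in the identifier string. The sketch below uses a tag that requests the Buddhist calendar together with an option that asks for the Gregorian calendar; the tag is chosen only for illustration.

```javascript
// The calendar option replaces the "ca" keyword carried by the tag.
const overridden = new Intl.Locale("en-US-u-ca-buddhist", {
  calendar: "gregory"
});

console.log(overridden.toString());
// "en-US-u-ca-gregory"

console.log(overridden.calendar);
// "gregory"
```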

## Why normalization matters

Normalization provides consistency across your application. When you store, display, or compare locale identifiers, using the canonical form prevents subtle bugs and improves reliability.

### Consistent storage

When storing locale identifiers in databases, configuration files, or local storage, normalized forms prevent duplication:

```javascript
const userPreferences = new Set();

function saveUserLocale(identifier) {
  const normalized = Intl.getCanonicalLocales(identifier)[0];
  userPreferences.add(normalized);
}

saveUserLocale("en-US");
saveUserLocale("EN-us");
saveUserLocale("en-us");

console.log(userPreferences);
// Set { "en-US" }
```

Without normalization, the set would contain three entries for the same locale. With normalization, it correctly contains one.

### Reliable comparison

Comparing locale identifiers requires normalization. Two identifiers that differ only in casing represent the same locale:

```javascript
function isSameLocale(locale1, locale2) {
  const normalized1 = Intl.getCanonicalLocales(locale1)[0];
  const normalized2 = Intl.getCanonicalLocales(locale2)[0];
  return normalized1 === normalized2;
}

console.log(isSameLocale("en-US", "EN-us"));
// true

console.log(isSameLocale("en-US", "en-GB"));
// false
```

Direct string comparison of unnormalized identifiers can report a mismatch for identifiers that refer to the same locale. For example, `"en-US" === "EN-us"` is `false`.
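Casing is not the only difference that canonicalization removes. Engines that implement the current ECMA-402 algorithm also replace deprecated subtags with their preferred values, so a canonical comparison treats the old Hebrew code `iw` and the current code `he` as equal. A self-contained sketch, assuming such an engine (`canonicalEquals` is a hypothetical name):

```javascript
// Same idea as isSameLocale, repeated so the snippet runs on its own.
function canonicalEquals(a, b) {
  return Intl.getCanonicalLocales(a)[0] === Intl.getCanonicalLocales(b)[0];
}

console.log("iw" === "he");
// false

console.log(canonicalEquals("iw", "he"));
// true

console.log(Intl.getCanonicalLocales("iw-IL"));
// ["he-IL"]
```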

### Consistent display

When showing locale identifiers to users or in debugging output, normalized forms provide consistent formatting:

```javascript
function displayLocale(identifier) {
  try {
    const normalized = Intl.getCanonicalLocales(identifier)[0];
    return `Current locale: ${normalized}`;
  } catch (error) {
    return "Invalid locale identifier";
  }
}

console.log(displayLocale("EN-us"));
// "Current locale: en-US"

console.log(displayLocale("zh-HANS-cn"));
// "Current locale: zh-Hans-CN"
```

Users see properly formatted locale identifiers regardless of the input format.

## Practical applications

Normalization solves common problems when working with locale identifiers in real applications.

### Normalizing user input

When users enter locale identifiers in forms or settings, normalize the input before storing it:

```javascript
function processLocaleInput(input) {
  try {
    const normalized = Intl.getCanonicalLocales(input)[0];
    return {
      success: true,
      locale: normalized
    };
  } catch (error) {
    return {
      success: false,
      error: "Please enter a valid locale identifier"
    };
  }
}

const result = processLocaleInput("fr-ca");
console.log(result);
// { success: true, locale: "fr-CA" }
```

This ensures consistent formatting in your database or configuration.

### Building locale lookup tables

When creating lookup tables for translations or locale-specific data, use normalized keys:

```javascript
const translations = new Map();

function addTranslation(locale, key, value) {
  const normalized = Intl.getCanonicalLocales(locale)[0];

  if (!translations.has(normalized)) {
    translations.set(normalized, {});
  }

  translations.get(normalized)[key] = value;
}

addTranslation("en-us", "hello", "Hello");
addTranslation("EN-US", "goodbye", "Goodbye");

console.log(translations.get("en-US"));
// { hello: "Hello", goodbye: "Goodbye" }
```

Both calls to `addTranslation` use the same normalized key, so the translations are stored in the same object.

### Merging locale lists

When combining locale identifiers from multiple sources, normalize and deduplicate them:

```javascript
function mergeLocales(...sources) {
  const allLocales = sources.flat();
  const normalized = Intl.getCanonicalLocales(allLocales);
  return normalized;
}

const userLocales = ["en-us", "fr-FR"];
const appLocales = ["EN-US", "de-de"];
const systemLocales = ["en-US", "es-mx"];

const merged = mergeLocales(userLocales, appLocales, systemLocales);
console.log(merged);
// ["en-US", "fr-FR", "de-DE", "es-MX"]
```

The method removes duplicates and normalizes casing across all sources.

### Creating locale selection interfaces

When building dropdown menus or selection interfaces, normalize locale identifiers for display:

```javascript
function buildLocaleOptions(locales) {
  const normalized = Intl.getCanonicalLocales(locales);

  return normalized.map(locale => {
    const localeObj = new Intl.Locale(locale);
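    // Request English labels. Passing [locale] instead would label each
    // language in its own language (for example "français").
    const displayNames = new Intl.DisplayNames(["en"], {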
      type: "language"
    });

    return {
      value: locale,
      label: displayNames.of(localeObj.language)
    };
  });
}

const options = buildLocaleOptions(["EN-us", "fr-FR", "DE-de"]);
console.log(options);
// [
//   { value: "en-US", label: "English" },
//   { value: "fr-FR", label: "French" },
//   { value: "de-DE", label: "German" }
// ]
```

The normalized values provide consistent identifiers for form submissions.

### Validating configuration files

When loading locale identifiers from configuration files, normalize them during initialization:

```javascript
function loadLocaleConfig(config) {
  const validatedConfig = {
    defaultLocale: null,
    supportedLocales: []
  };

  try {
    validatedConfig.defaultLocale = Intl.getCanonicalLocales(
      config.defaultLocale
    )[0];
  } catch (error) {
    console.error("Invalid default locale:", config.defaultLocale);
    validatedConfig.defaultLocale = "en-US";
  }

  config.supportedLocales.forEach(locale => {
    try {
      const normalized = Intl.getCanonicalLocales(locale)[0];
      validatedConfig.supportedLocales.push(normalized);
    } catch (error) {
      console.warn("Skipping invalid locale:", locale);
    }
  });

  return validatedConfig;
}

const config = {
  defaultLocale: "en-us",
  supportedLocales: ["EN-us", "fr-FR", "es_ES", "de-DE"]
};

const validated = loadLocaleConfig(config);
console.log(validated);
// {
//   defaultLocale: "en-US",
//   supportedLocales: ["en-US", "fr-FR", "de-DE"]
// }
```

This catches configuration errors early and ensures your application uses valid normalized identifiers.

## Normalization and locale matching

Normalization is important for locale matching algorithms. When finding the best locale match for a user preference, compare normalized forms:

```javascript
function findBestMatch(userPreference, availableLocales) {
  const normalizedPreference = Intl.getCanonicalLocales(userPreference)[0];
  const normalizedAvailable = Intl.getCanonicalLocales(availableLocales);

  if (normalizedAvailable.includes(normalizedPreference)) {
    return normalizedPreference;
  }

  const preferenceLocale = new Intl.Locale(normalizedPreference);

  const languageMatch = normalizedAvailable.find(available => {
    const availableLocale = new Intl.Locale(available);
    return availableLocale.language === preferenceLocale.language;
  });

  if (languageMatch) {
    return languageMatch;
  }

  return normalizedAvailable[0];
}

const available = ["en-us", "fr-FR", "DE-de"];
console.log(findBestMatch("EN-GB", available));
// "en-US"
```

Normalization ensures the matching logic works correctly regardless of input casing.

## Normalization does not change meaning

Normalization only affects the representation of a locale identifier. It does not change which language, script, or region the identifier represents.

```javascript
const locale1 = new Intl.Locale("en-us");
const locale2 = new Intl.Locale("EN-US");

console.log(locale1.language === locale2.language);
// true

console.log(locale1.region === locale2.region);
// true

console.log(locale1.toString() === locale2.toString());
// true
```

Both identifiers refer to American English. Normalization simply ensures they are written the same way.

This is different from operations like `maximize()` and `minimize()`, which add or remove components and can change the specificity of the identifier.
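The contrast is visible when the same locale object is canonicalized and then maximized or minimized. A short sketch, with the tags chosen for illustration:

```javascript
// Canonicalization alone only fixes the casing.
const english = new Intl.Locale("EN");
console.log(english.toString());
// "en"

// maximize() adds the likely script and region.
const maximized = english.maximize().toString();
console.log(maximized);
// "en-Latn-US"

// minimize() removes components that can be inferred again.
const minimized = new Intl.Locale("en-latn-us").minimize().toString();
console.log(minimized);
// "en"
```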

## Browser support

The `Intl.getCanonicalLocales()` method works in all modern browsers. Chrome, Firefox, Safari, and Edge provide full support.

Node.js has supported `Intl.getCanonicalLocales()` since version 7, so every currently maintained release includes it.

The `Intl.Locale` constructor and its normalization behavior work in all browsers that support the `Intl.Locale` API. This includes modern versions of Chrome, Firefox, Safari, and Edge.
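Where support is uncertain, a feature check avoids a runtime `TypeError`. The helper below is a hypothetical sketch: `canonicalizeIfSupported` is not a built-in, and returning the input unchanged when the API is missing is only one possible fallback policy.

```javascript
// Canonicalizes when the API exists; otherwise hands the input back untouched.
function canonicalizeIfSupported(identifier) {
  const hasApi =
    typeof Intl === "object" &&
    Intl !== null &&
    typeof Intl.getCanonicalLocales === "function";

  if (!hasApi) {
    // Assumed fallback: leave the identifier as the caller wrote it.
    return identifier;
  }

  return Intl.getCanonicalLocales(identifier)[0];
}

const detected = canonicalizeIfSupported("EN-us");
console.log(detected);
// "en-US" in environments that provide Intl.getCanonicalLocales
```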

## Summary

Normalization converts locale identifiers to their canonical form by applying standard casing rules and sorting extension keywords. This creates consistent representations that you can store, compare, and display reliably.

Key concepts:

- Canonical form uses lowercase for languages, title case for scripts, and uppercase for regions
- Extension keywords are sorted alphabetically in the canonical form
- The `Intl.getCanonicalLocales()` method normalizes identifiers and removes duplicates
- The `Intl.Locale` constructor also produces normalized output
- Normalization does not change the meaning of a locale identifier
- Use normalized identifiers for storage, comparison, and display

Normalization is a foundational operation for any application that works with locale identifiers. It prevents bugs caused by inconsistent casing and ensures your internationalization logic handles locale identifiers reliably.