How to customize locales with Unicode extensions

Add calendar systems, numbering formats, and time display preferences to locale identifiers

Introduction

A locale identifier like en-US tells JavaScript which language and region to use for formatting. However, it does not specify which calendar system to use, which numbering format to display, or whether to show time in 12-hour or 24-hour format. These formatting preferences vary by user choice, not just by location.

Unicode extensions solve this problem. They let you add formatting preferences directly to locale identifiers. Instead of using separate configuration options for each formatter, you encode the preferences once in the locale string itself.

This guide explains how Unicode extensions work, which extension types are available, and when to use them in your internationalization code.

What are Unicode extensions

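Unicode extensions are additional tags you add to locale identifiers to specify formatting preferences. They use the extension mechanism of BCP 47, the same specification that defines locale identifiers. The available keys and values are defined by the Unicode Locale Data Markup Language specification (UTS #35).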

An extension starts with -u- followed by key-value pairs. The u stands for Unicode. Each key is two letters, and values vary by key type.

const locale = "en-US-u-ca-gregory-hc-h12";

This locale identifier specifies American English with the Gregorian calendar and 12-hour time display.

How to add extensions to locale strings

Extensions appear at the end of a locale identifier, after the language, script, and region components. The -u- marker separates the core identifier from the extensions.

The basic structure follows this pattern:

language-region-u-key-value-key-value

Each key-value pair specifies one formatting preference. You can include multiple key-value pairs in a single locale string.

const japanese = new Intl.Locale("ja-JP-u-ca-japanese-nu-jpan");
console.log(japanese.calendar); // "japanese"
console.log(japanese.numberingSystem); // "jpan"

The order of key-value pairs does not matter. Both "en-u-ca-gregory-nu-latn" and "en-u-nu-latn-ca-gregory" are valid and equivalent.
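
One way to confirm this equivalence is to canonicalize both strings. The sketch below uses the standard Intl.getCanonicalLocales function, which sorts extension keys alphabetically. The variable names are illustrative.

```javascript
// Two spellings of the same preferences, with the keys in a different order.
const calendarFirst = "en-u-ca-gregory-nu-latn";
const numberingFirst = "en-u-nu-latn-ca-gregory";

// Canonicalization sorts the extension keys alphabetically,
// so both inputs produce the same canonical tag.
const [canonicalA] = Intl.getCanonicalLocales(calendarFirst);
const [canonicalB] = Intl.getCanonicalLocales(numberingFirst);

console.log(canonicalA); // "en-u-ca-gregory-nu-latn"
console.log(canonicalA === canonicalB); // true
```

Comparing canonical strings is a reasonable way to deduplicate stored locale identifiers that were written by hand.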

Calendar extensions

The ca key specifies which calendar system to use for date formatting. Different cultures use different calendar systems, and some users prefer non-Gregorian calendars for religious or cultural reasons.

Common calendar values include:

  • gregory for the Gregorian calendar
  • buddhist for the Buddhist calendar
  • islamic for the Islamic calendar
  • hebrew for the Hebrew calendar
  • chinese for the Chinese calendar
  • japanese for the Japanese imperial calendar

const islamicLocale = new Intl.Locale("ar-SA-u-ca-islamic");
const date = new Date("2025-03-15");
const formatter = new Intl.DateTimeFormat(islamicLocale, {
  year: "numeric",
  month: "long",
  day: "numeric"
});

console.log(formatter.format(date));
// Example output: "١٥ رمضان ١٤٤٦ هـ" (the exact day can differ by engine and time zone)

This formats the date according to the Islamic calendar. The same Gregorian date appears as a different year, month, and day in the Islamic calendar system.

The Buddhist calendar is commonly used in Thailand. It counts years from the Buddha's death (parinirvana), traditionally dated to 543 BCE, which puts Buddhist years 543 years ahead of Gregorian years.

const buddhistLocale = new Intl.Locale("th-TH-u-ca-buddhist");
const formatter = new Intl.DateTimeFormat(buddhistLocale, {
  year: "numeric",
  month: "long",
  day: "numeric"
});

console.log(formatter.format(new Date("2025-03-15")));
// Output: "15 มีนาคม 2568"

The year 2025 in the Gregorian calendar is 2568 in the Buddhist calendar.
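
The 543-year offset can be checked by reading the year from each calendar as a number. This is a minimal sketch: getYear is a hypothetical helper, and the time zone is pinned to UTC so the result does not depend on the host machine.

```javascript
// getYear is a hypothetical helper, not part of the Intl API.
// It formats only the year and reads it back from the formatter's parts.
function getYear(locale, date) {
  const parts = new Intl.DateTimeFormat(locale, {
    year: "numeric",
    timeZone: "UTC" // fixed zone so the result is host-independent
  }).formatToParts(date);
  return Number(parts.find((part) => part.type === "year").value);
}

const midYear = new Date(Date.UTC(2025, 5, 15)); // 15 June 2025

// The nu-latn key keeps the digits parseable by Number().
const gregorianYear = getYear("en-u-ca-gregory-nu-latn", midYear);
const buddhistYear = getYear("en-u-ca-buddhist-nu-latn", midYear);

console.log(gregorianYear); // 2025
console.log(buddhistYear); // 2568
console.log(buddhistYear - gregorianYear); // 543
```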

Numbering system extensions

The nu key specifies which numbering system to use for displaying numbers. While most locales use Western Arabic numerals (0-9), many regions have their own traditional numbering systems.

Common numbering system values include:

  • latn for Western Arabic numerals (0-9)
  • arab for Arabic-Indic numerals
  • hanidec for Chinese decimal numerals
  • deva for Devanagari numerals
  • thai for Thai numerals

const arabicLocale = new Intl.Locale("ar-EG-u-nu-arab");
const number = 123456;
const formatter = new Intl.NumberFormat(arabicLocale);

console.log(formatter.format(number));
// Output: "١٢٣٬٤٥٦"

Arabic-Indic numerals look different from Western numerals but represent the same values. The number 123456 appears as ١٢٣٬٤٥٦.

Thai numerals provide another example:

const thaiLocale = new Intl.Locale("th-TH-u-nu-thai");
const formatter = new Intl.NumberFormat(thaiLocale);

console.log(formatter.format(123456));
// Output: "๑๒๓,๔๕๖"

Many Arabic locales support both Arabic-Indic numerals and Latin numerals. Users can choose their preferred system based on personal preference or context.
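
As an illustration of that choice, the sketch below formats the same value for one Arabic locale with each numbering system. Grouping is turned off to keep the output short, and the variable names are illustrative.

```javascript
const amount = 2025;

// Same base locale, two different nu values.
const arabicIndic = new Intl.NumberFormat("ar-EG-u-nu-arab", { useGrouping: false });
const latin = new Intl.NumberFormat("ar-EG-u-nu-latn", { useGrouping: false });

console.log(arabicIndic.format(amount)); // "٢٠٢٥"
console.log(latin.format(amount)); // "2025"

// resolvedOptions() reports which numbering system the formatter settled on.
console.log(arabicIndic.resolvedOptions().numberingSystem); // "arab"
console.log(latin.resolvedOptions().numberingSystem); // "latn"
```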

Hour cycle extensions

The hc key specifies how to display time. Some regions prefer 12-hour time with AM and PM indicators, while others prefer 24-hour time. The hour cycle also determines how midnight appears.

Four hour cycle values are available:

  • h12 uses hours 1-12 with midnight at 12:00 AM
  • h11 uses hours 0-11 with midnight at 0:00 AM
  • h23 uses hours 0-23 with midnight at 0:00
  • h24 uses hours 1-24 with midnight at 24:00

The h12 and h11 values represent 12-hour time, while h23 and h24 represent 24-hour time. The difference lies in whether the hour range starts at 0 or 1.

const us12Hour = new Intl.Locale("en-US-u-hc-h12");
const japan11Hour = new Intl.Locale("ja-JP-u-hc-h11");
const europe23Hour = new Intl.Locale("en-GB-u-hc-h23");

const date = new Date("2025-03-15T00:30:00");

console.log(new Intl.DateTimeFormat(us12Hour, { hour: "numeric", minute: "numeric" }).format(date));
// Output: "12:30 AM"

console.log(new Intl.DateTimeFormat(japan11Hour, { hour: "numeric", minute: "numeric" }).format(date));
// Output: "午前0:30"

console.log(new Intl.DateTimeFormat(europe23Hour, { hour: "numeric", minute: "numeric" }).format(date));
// Output: "00:30"

The h12 format shows half past midnight as 12:30 AM, while h11 shows the hour as 0 (午前 is the Japanese AM marker). The h23 format shows it as 00:30 without AM or PM.

Most applications use either h12 or h23. The h11 format is primarily used in Japan, and h24 is rarely used in practice.
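
To compare all four values side by side, the sketch below reads only the hour part at half past midnight. hourAtMidnight is a hypothetical helper. It returns a number rather than the full string because the whitespace before AM or PM differs between engine versions.

```javascript
// Half past midnight, pinned to UTC so the result is host-independent.
const halfPastMidnight = new Date(Date.UTC(2025, 2, 15, 0, 30));

// hourAtMidnight is a hypothetical helper for this comparison.
function hourAtMidnight(hourCycle) {
  const formatter = new Intl.DateTimeFormat(`en-US-u-hc-${hourCycle}`, {
    hour: "numeric",
    minute: "2-digit",
    timeZone: "UTC"
  });
  const hourPart = formatter
    .formatToParts(halfPastMidnight)
    .find((part) => part.type === "hour");
  return Number(hourPart.value);
}

for (const cycle of ["h11", "h12", "h23", "h24"]) {
  console.log(cycle, hourAtMidnight(cycle));
}
// h11 0
// h12 12
// h23 0
// h24 24
```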

Collation extensions

The co key specifies collation rules for sorting strings. Collation determines the order of characters when sorting text. Different languages and regions have different sorting conventions.

Common collation values include:

  • standard for standard Unicode collation
  • phonebk for phonebook ordering (German)
  • pinyin for Pinyin ordering (Chinese)
  • stroke for stroke count ordering (Chinese)

German phonebook collation treats umlauts differently from standard collation. The phonebook order expands ä to ae, ö to oe, and ü to ue for sorting purposes.

const names = ["Müller", "Muhr", "Meyer", "Möller"];

const standard = new Intl.Collator("de-DE");
const phonebook = new Intl.Collator("de-DE-u-co-phonebk");

// Copy the array before sorting, because sort() reorders it in place.
console.log([...names].sort((a, b) => standard.compare(a, b)));
// Output: ["Meyer", "Möller", "Muhr", "Müller"]

// Phonebook order treats "Müller" as "Mueller", which moves it ahead of "Muhr".
console.log([...names].sort((a, b) => phonebook.compare(a, b)));
// Output: ["Meyer", "Möller", "Müller", "Muhr"]

Chinese collation offers multiple ordering systems. Pinyin ordering sorts by pronunciation, while stroke ordering sorts by the number of brush strokes used to write each character.

const pinyinCollator = new Intl.Collator("zh-CN-u-co-pinyin");
const strokeCollator = new Intl.Collator("zh-CN-u-co-stroke");
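
The difference between the two orderings shows up even on a three-character list. This sketch assumes an engine that ships the Chinese collation data from CLDR, as current browsers and Node.js builds do. The sample array is illustrative.

```javascript
// The characters for one, two, and three: yī, èr, sān.
const numerals = ["一", "二", "三"];

const byPinyin = [...numerals].sort(new Intl.Collator("zh-CN-u-co-pinyin").compare);
const byStroke = [...numerals].sort(new Intl.Collator("zh-CN-u-co-stroke").compare);

console.log(byPinyin); // ["二", "三", "一"] (er, san, yi)
console.log(byStroke); // ["一", "二", "三"] (1, 2, and 3 strokes)
```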

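Collation extensions affect only string comparison: the Intl.Collator API and String.prototype.localeCompare(). They change the result of Array.prototype.sort() only when you pass it a comparison function built on one of these.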

Case-first extensions

The kf key determines whether uppercase or lowercase letters sort first in collation. This preference varies by language and use case.

Three values are available:

  • upper to sort uppercase letters before lowercase
  • lower to sort lowercase letters before uppercase
  • false to use the locale's default case ordering

const words = ["apple", "Apple", "APPLE", "banana"];

const upperFirst = new Intl.Collator("en-US-u-kf-upper");
const lowerFirst = new Intl.Collator("en-US-u-kf-lower");

console.log(words.sort((a, b) => upperFirst.compare(a, b)));
// Output: ["APPLE", "Apple", "apple", "banana"]

console.log(words.sort((a, b) => lowerFirst.compare(a, b)));
// Output: ["apple", "Apple", "APPLE", "banana"]

Case-first ordering only breaks ties between strings that differ in nothing but case. Base letters are compared first, so "apple" in any casing still sorts before "banana".
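
A list of single letters makes the tie-breaking behavior easy to see. In this sketch, with illustrative variable names, the kf key never moves "B" ahead of "a". It only decides the order within each letter.

```javascript
const letters = ["b", "A", "a", "B"];

const upperFirstLetters = [...letters].sort(new Intl.Collator("en-US-u-kf-upper").compare);
const lowerFirstLetters = [...letters].sort(new Intl.Collator("en-US-u-kf-lower").compare);

// Letters are grouped by base character first; case only orders each pair.
console.log(upperFirstLetters); // ["A", "a", "B", "b"]
console.log(lowerFirstLetters); // ["a", "A", "b", "B"]
```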

Numeric collation extensions

The kn key enables numeric collation, which sorts numeric sequences by their numeric value instead of lexicographically. Without numeric collation, "10" sorts before "2" because "1" comes before "2" in character order.

Numeric collation accepts two values:

  • true to enable numeric collation
  • false to disable numeric collation (default)

const items = ["item1", "item10", "item2", "item20"];

const standard = new Intl.Collator("en-US");
const numeric = new Intl.Collator("en-US-u-kn-true");

console.log(items.sort((a, b) => standard.compare(a, b)));
// Output: ["item1", "item10", "item2", "item20"]

console.log(items.sort((a, b) => numeric.compare(a, b)));
// Output: ["item1", "item2", "item10", "item20"]

With numeric collation enabled, "item2" correctly sorts before "item10" because 2 is less than 10. This produces the expected sort order for strings containing numbers.

Numeric collation is useful for sorting file names, version numbers, street addresses, and any text containing embedded numbers.
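
For instance, simple dotted version strings sort as expected with the kn key. This is a sketch with made-up version strings, not a full semantic-version comparison, since it does not understand pre-release tags.

```javascript
const versions = ["v1.10.0", "v1.2.0", "v1.9.3"];

const numericCollator = new Intl.Collator("en-US-u-kn-true");
const sortedVersions = [...versions].sort(numericCollator.compare);

console.log(sortedVersions); // ["v1.2.0", "v1.9.3", "v1.10.0"]

// The numeric option on the collator is the equivalent of the kn key.
const viaOption = new Intl.Collator("en-US", { numeric: true });
console.log(numericCollator.resolvedOptions().numeric); // true
console.log(viaOption.resolvedOptions().numeric); // true
```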

Using options objects instead of extension strings

Instead of encoding extensions in the locale string, you can pass them as options to the Intl.Locale constructor. This approach separates the base locale from the formatting preferences.

const locale = new Intl.Locale("ja-JP", {
  calendar: "japanese",
  numberingSystem: "jpan",
  hourCycle: "h11"
});

console.log(locale.toString());
// Output: "ja-JP-u-ca-japanese-hc-h11-nu-jpan"

The constructor converts the options into extension tags automatically. Both approaches produce identical locale objects.
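
The equivalence can be checked by comparing the canonical strings. The sketch below also shows that an option passed to the constructor replaces an extension already present in the tag. The variable names are illustrative.

```javascript
const fromString = new Intl.Locale("ja-JP-u-ca-japanese-nu-jpan-hc-h11");
const fromOptions = new Intl.Locale("ja-JP", {
  calendar: "japanese",
  numberingSystem: "jpan",
  hourCycle: "h11"
});

// Both serialize to the same canonical tag, with keys in alphabetical order.
console.log(fromString.toString() === fromOptions.toString()); // true

// An option replaces the matching extension in the input tag.
const overridden = new Intl.Locale("en-US-u-hc-h23", { hourCycle: "h12" });
console.log(overridden.toString()); // "en-US-u-hc-h12"
```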

The options object approach offers several benefits. It makes code more readable by using full property names instead of two-letter codes. It also makes it easier to construct locales dynamically from configuration data.

const userPreferences = {
  language: "ar",
  region: "SA",
  calendar: "islamic",
  numberingSystem: "arab"
};

const locale = new Intl.Locale(`${userPreferences.language}-${userPreferences.region}`, {
  calendar: userPreferences.calendar,
  numberingSystem: userPreferences.numberingSystem
});

You can also pass options directly to formatter constructors:

const formatter = new Intl.DateTimeFormat("th-TH", {
  calendar: "buddhist",
  numberingSystem: "thai",
  year: "numeric",
  month: "long",
  day: "numeric"
});

This combines locale-specific formatting options with presentation options in a single constructor call.

When to use extensions versus formatter options

Extensions and formatter options serve different purposes. Understanding when to use each approach helps you write cleaner, more maintainable code.

Use extensions in the locale string when the formatting preferences are inherent to the user's locale. If a Thai user always wants to see the Buddhist calendar and Thai numerals, encode those preferences in their locale identifier.

const userLocale = "th-TH-u-ca-buddhist-nu-thai";

This lets you pass the locale to any formatter without repeating the preferences:

const dateFormatter = new Intl.DateTimeFormat(userLocale);
const numberFormatter = new Intl.NumberFormat(userLocale);

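The date formatter uses the Buddhist calendar and Thai numerals. The number formatter uses the Thai numerals and ignores the calendar extension, because the ca key does not apply to number formatting.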
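
To see which extensions each formatter picked up, you can inspect resolvedOptions(). A short sketch with illustrative variable names:

```javascript
const thaiUserLocale = "th-TH-u-ca-buddhist-nu-thai";

const dateOptions = new Intl.DateTimeFormat(thaiUserLocale).resolvedOptions();
const numberOptions = new Intl.NumberFormat(thaiUserLocale).resolvedOptions();

console.log(dateOptions.calendar); // "buddhist"
console.log(dateOptions.numberingSystem); // "thai"
console.log(numberOptions.numberingSystem); // "thai"

// Number formatters have no calendar setting, so the ca key is not used there.
console.log("calendar" in numberOptions); // false
```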

Use formatter options when the formatting preferences are specific to one use case. If you want to display the Islamic calendar in one part of your application but the Gregorian calendar elsewhere, pass the calendar option to the specific formatter.

const islamicFormatter = new Intl.DateTimeFormat("ar-SA", {
  calendar: "islamic"
});

const gregorianFormatter = new Intl.DateTimeFormat("ar-SA", {
  calendar: "gregory"
});

The same locale identifier produces different formatting based on the calendar option.

Extensions in the locale string act as defaults. Formatter options override those defaults when specified. This lets you use user preferences as a baseline while customizing specific formatters.

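const locale = "en-US-u-hc-h23";
// Request hour and minute so the hour cycle has a visible effect.
const formatter12Hour = new Intl.DateTimeFormat(locale, {
  hour: "numeric",
  minute: "numeric",
  hourCycle: "h12"
});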

The user prefers 24-hour time, but this specific formatter overrides that preference to show 12-hour time.
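
The override can be confirmed through resolvedOptions(), which reports the hour cycle once the format includes an hour. A minimal sketch with illustrative names:

```javascript
const prefers24Hour = "en-US-u-hc-h23";

// Without an explicit option, the extension in the locale string applies.
const baseline = new Intl.DateTimeFormat(prefers24Hour, { hour: "numeric" });

// The hourCycle option wins over the hc extension.
const forced12Hour = new Intl.DateTimeFormat(prefers24Hour, {
  hour: "numeric",
  hourCycle: "h12"
});

console.log(baseline.resolvedOptions().hourCycle); // "h23"
console.log(forced12Hour.resolvedOptions().hourCycle); // "h12"
```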

Reading extension values from locales

The Intl.Locale object exposes extension values as properties. You can read these properties to inspect or validate a locale's formatting preferences.

const locale = new Intl.Locale("ar-SA-u-ca-islamic-nu-arab-hc-h12");

console.log(locale.calendar); // "islamic"
console.log(locale.numberingSystem); // "arab"
console.log(locale.hourCycle); // "h12"

These properties return the extension values if present, or undefined if the extension is not specified.

You can use these properties to build configuration interfaces or validate user preferences:

function describeLocalePreferences(localeString) {
  const locale = new Intl.Locale(localeString);

  return {
    language: locale.language,
    region: locale.region,
    calendar: locale.calendar || "default",
    numberingSystem: locale.numberingSystem || "default",
    hourCycle: locale.hourCycle || "default"
  };
}

console.log(describeLocalePreferences("th-TH-u-ca-buddhist-nu-thai"));
// Output: { language: "th", region: "TH", calendar: "buddhist", numberingSystem: "thai", hourCycle: "default" }

The collation, caseFirst, and numeric properties correspond to the co, kf, and kn extension keys:

const locale = new Intl.Locale("de-DE-u-co-phonebk-kf-upper-kn-true");

console.log(locale.collation); // "phonebk"
console.log(locale.caseFirst); // "upper"
console.log(locale.numeric); // true

Note that the numeric property returns a boolean, not a string. The value true indicates numeric collation is enabled.

Combining multiple extensions

You can combine multiple extensions in a single locale identifier. This lets you specify all formatting preferences at once.

const locale = new Intl.Locale("ar-SA-u-ca-islamic-nu-arab-hc-h12-co-standard");

const dateFormatter = new Intl.DateTimeFormat(locale, {
  year: "numeric",
  month: "long",
  day: "numeric",
  hour: "numeric",
  minute: "numeric"
});

const date = new Date("2025-03-15T14:30:00");
console.log(dateFormatter.format(date));
// Output uses Islamic calendar, Arabic-Indic numerals, and 12-hour time

Each extension key should appear only once in a locale string. If the same key appears more than once, the first occurrence takes precedence and later duplicates are ignored.

const locale = new Intl.Locale("en-US-u-hc-h23-hc-h12");
console.log(locale.hourCycle); // "h23"

When constructing locales programmatically, ensure each extension key appears once to avoid ambiguity.
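
A small check can catch repeated keys before a string reaches Intl.Locale. The sketch below is a simplified scan, not a full BCP 47 parser. findDuplicateExtensionKeys is a hypothetical helper, and it assumes the first "u" subtag starts the Unicode extension.

```javascript
// findDuplicateExtensionKeys is a hypothetical helper, not an Intl API.
// It returns the extension keys that occur more than once in a tag.
function findDuplicateExtensionKeys(tag) {
  const subtags = tag.toLowerCase().split("-");
  const start = subtags.indexOf("u");
  if (start === -1) return []; // no Unicode extension present

  const seen = new Set();
  const duplicates = new Set();
  for (const subtag of subtags.slice(start + 1)) {
    // A one-character subtag starts another extension (for example -t- or -x-).
    if (subtag.length === 1) break;
    // Keys are two characters long; values and attributes are three to eight.
    if (subtag.length === 2) {
      if (seen.has(subtag)) duplicates.add(subtag);
      seen.add(subtag);
    }
  }
  return [...duplicates];
}

console.log(findDuplicateExtensionKeys("en-US-u-hc-h23-hc-h12")); // ["hc"]
console.log(findDuplicateExtensionKeys("en-US-u-ca-gregory-nu-latn")); // []
```

Building locales through the Intl.Locale options object avoids the problem entirely, because each option can be set only once.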

Practical use cases

Unicode extensions solve real problems in internationalized applications. Understanding common use cases helps you apply extensions effectively.

User preference storage

Store user formatting preferences in a single locale string instead of multiple configuration fields:

function saveUserPreferences(userId, localeString) {
  const locale = new Intl.Locale(localeString);

  return {
    userId,
    language: locale.language,
    region: locale.region,
    localeString: locale.toString(),
    preferences: {
      calendar: locale.calendar,
      numberingSystem: locale.numberingSystem,
      hourCycle: locale.hourCycle
    }
  };
}

const preferences = saveUserPreferences(123, "ar-SA-u-ca-islamic-nu-arab-hc-h12");

This approach stores formatting preferences as a single string while still providing structured access to individual components.

Building locale selectors

Let users choose formatting preferences through a UI by constructing locale strings with extensions:

function buildLocaleFromUserInput(language, region, preferences) {
  const options = {};

  if (preferences.calendar) {
    options.calendar = preferences.calendar;
  }

  if (preferences.numberingSystem) {
    options.numberingSystem = preferences.numberingSystem;
  }

  if (preferences.hourCycle) {
    options.hourCycle = preferences.hourCycle;
  }

  const locale = new Intl.Locale(`${language}-${region}`, options);
  return locale.toString();
}

const userLocale = buildLocaleFromUserInput("th", "TH", {
  calendar: "buddhist",
  numberingSystem: "thai",
  hourCycle: "h23"
});

console.log(userLocale);
// Output: "th-TH-u-ca-buddhist-hc-h23-nu-thai"

Respecting religious calendars

Applications serving religious communities should support their calendar systems:

function createReligiousCalendarFormatter(religion, baseLocale) {
  const calendars = {
    jewish: "hebrew",
    muslim: "islamic",
    buddhist: "buddhist"
  };

  const calendar = calendars[religion];
  if (!calendar) {
    return new Intl.DateTimeFormat(baseLocale);
  }

  const locale = new Intl.Locale(baseLocale, { calendar });
  return new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric"
  });
}

const jewishFormatter = createReligiousCalendarFormatter("jewish", "en-US");
console.log(jewishFormatter.format(new Date("2025-03-15")));
// Output: "15 Adar 5785"

Sorting with custom rules

Use collation extensions to implement locale-specific sorting:

function sortNames(names, locale, collationType) {
  const localeWithCollation = new Intl.Locale(locale, {
    collation: collationType
  });

  const collator = new Intl.Collator(localeWithCollation);
  // Sort a copy so the caller's array is left unchanged.
  return [...names].sort((a, b) => collator.compare(a, b));
}

const germanNames = ["Müller", "Meyer", "Möller", "Mueller"];
const sorted = sortNames(germanNames, "de-DE", "phonebk");
console.log(sorted);

Displaying traditional numerals

Show numbers in traditional numbering systems for culturally appropriate display:

function formatTraditionalNumber(number, locale, numberingSystem) {
  const localeWithNumbering = new Intl.Locale(locale, {
    numberingSystem
  });

  return new Intl.NumberFormat(localeWithNumbering).format(number);
}

console.log(formatTraditionalNumber(123456, "ar-EG", "arab"));
// Output: "١٢٣٬٤٥٦"

console.log(formatTraditionalNumber(123456, "th-TH", "thai"));
// Output: "๑๒๓,๔๕๖"

Browser support

Unicode extensions work in all modern browsers. Chrome, Firefox, Safari, and Edge support the extension syntax in locale identifiers and the corresponding properties on Intl.Locale objects.

The availability of specific extension values depends on browser implementation. All browsers support common values like gregory for calendar, latn for numbering system, and h12 or h23 for hour cycle. Less common values like traditional Chinese calendars or minority language numbering systems may not work in all browsers.

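Test your locale identifiers in target browsers when using less common extension values. Intl.Locale only parses the identifier, so its properties echo any well-formed value whether or not the browser supports it. To check what a formatter actually uses, inspect its resolvedOptions():

const locale = new Intl.Locale("zh-CN-u-ca-chinese");
console.log(locale.calendar);
// "chinese" (echoed from the identifier, even if unsupported)

const resolved = new Intl.DateTimeFormat(locale).resolvedOptions();
console.log(resolved.calendar);
// If browser supports Chinese calendar: "chinese"
// If browser does not support it: the locale's default, such as "gregory"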
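
Another way to probe support is Intl.supportedValuesOf, which is available in recent engines and lists the calendars an implementation knows. The sketch below wraps it in a hypothetical helper, isSupportedCalendar, which returns undefined when the function is missing.

```javascript
// isSupportedCalendar is a hypothetical helper built on Intl.supportedValuesOf.
function isSupportedCalendar(calendar) {
  if (typeof Intl.supportedValuesOf !== "function") {
    return undefined; // older engine: support cannot be determined this way
  }
  return Intl.supportedValuesOf("calendar").includes(calendar);
}

console.log(isSupportedCalendar("gregory")); // true where supportedValuesOf exists
console.log(isSupportedCalendar("klingon")); // false where supportedValuesOf exists
```

The same pattern should work for numbering systems and collations by passing "numberingSystem" or "collation" instead of "calendar".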

Node.js supports Unicode extensions starting from version 12, with full support for all properties in version 18 and later.

Summary

Unicode extensions let you add formatting preferences to locale identifiers. Instead of configuring each formatter separately, you encode preferences once in the locale string.

Key concepts:

  • Extensions start with -u- followed by key-value pairs
  • The ca key specifies calendar system
  • The nu key specifies numbering system
  • The hc key specifies hour cycle format
  • The co key specifies collation rules
  • The kf key specifies case-first ordering
  • The kn key enables numeric collation
  • You can use extension strings or options objects
  • Extensions act as defaults that formatter options can override
  • The Intl.Locale object exposes extensions as properties

Use Unicode extensions to store user preferences, respect cultural calendars, display traditional numerals, and implement locale-specific sorting. They provide a standard way to customize formatting behavior in JavaScript internationalization code.