Intl.Collator API

Sort and compare strings correctly across languages

Introduction

Sorting strings in JavaScript seems straightforward until you encounter international text. Default string comparison uses UTF-16 code unit values, producing incorrect results for many languages. The Intl.Collator API provides locale-aware string comparison that respects cultural sorting rules and handles special characters correctly.

Why default sorting fails

Consider sorting a list of German names:

const names = ["Zoe", "Ava", "Ärzte", "Änder"];
console.log(names.sort());
// ["Ava", "Zoe", "Änder", "Ärzte"]

This output is wrong for German speakers. In German, characters with umlauts like ä should sort near their base letter a, not at the end. The problem stems from JavaScript comparing UTF-16 code unit values, where Ä (U+00C4) comes after Z (U+005A).

Different languages have different sorting rules. Swedish sorts ä at the end of the alphabet, German sorts it near a, and Canadian French weighs accents from the end of a word rather than the start. Binary comparison ignores these cultural conventions.

How string collation works

Collation is the process of comparing and ordering strings according to language-specific rules. The Unicode Collation Algorithm defines how to compare strings by analyzing characters, diacritics, case, and punctuation separately.

When comparing two strings, a collation function returns a number:

  • Negative value: first string comes before second
  • Zero: strings are equivalent for the current sensitivity level
  • Positive value: first string comes after second

This three-way comparison pattern works with Array.sort and enables precise control over what differences matter.

Using localeCompare for basic locale-aware sorting

The localeCompare method provides locale-aware string comparison:

const names = ["Zoe", "Ava", "Ärzte", "Änder"];
console.log(names.sort((a, b) => a.localeCompare(b, "de")));
// ["Änder", "Ärzte", "Ava", "Zoe"]

This produces correct German sorting. The first parameter specifies the locale, and localeCompare handles the cultural rules automatically.
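
To see the locale argument change the result, the following sketch sorts the same three single-letter strings under German and Swedish rules. It assumes the runtime ships full locale data (the case for current browsers and recent Node.js builds); with reduced locale data both calls may fall back to the same default ordering.

```javascript
const letters = ["z", "ä", "a"];

// German treats ä as a variant of a.
const german = [...letters].sort((x, y) => x.localeCompare(y, "de"));

// Swedish treats ä as a separate letter that comes after z.
const swedish = [...letters].sort((x, y) => x.localeCompare(y, "sv"));

console.log(german);  // ["a", "ä", "z"]
console.log(swedish); // ["a", "z", "ä"]
```

The arrays are copied with the spread operator because sort reorders the array it is called on.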

You can pass options as the third parameter:

const items = ["File10", "File2", "File1"];
console.log(items.sort((a, b) =>
  a.localeCompare(b, "en", { numeric: true })
));
// ["File1", "File2", "File10"]

The numeric option enables natural sorting where "2" comes before "10". Without it, "10" would sort before "2" because "1" comes before "2".

The performance problem with repeated localeCompare

Each localeCompare call has to resolve the locale and options again, and an engine that does not cache this work repeats it for every comparison. When sorting large arrays, this creates significant overhead:

// Inefficient: processes locale for every comparison
const sorted = items.sort((a, b) => a.localeCompare(b, "de"));

Sorting 1000 items requires roughly 10000 comparisons (about n × log2 n). Each comparison may recreate the locale configuration, multiplying the performance cost. This overhead becomes noticeable in user interfaces with large datasets.
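
The comparison count can be checked with a small counting sketch. The sample data and the pseudo-random generator below are made up for illustration, and the exact count depends on the engine's sort algorithm and on the input order.

```javascript
// Deterministic pseudo-random strings (a small LCG) so the run is repeatable.
let seed = 42;
const nextRandom = () => (seed = (seed * 1664525 + 1013904223) % 4294967296);
const sample = Array.from({ length: 1000 }, () => nextRandom().toString(36));

let comparisons = 0;
sample.sort((a, b) => {
  comparisons += 1; // one localeCompare call per comparator invocation
  return a.localeCompare(b, "de");
});

// On the order of n * log2(n), so roughly ten thousand for n = 1000.
console.log(comparisons);
```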

Using Intl.Collator for efficient string comparison

Intl.Collator creates a reusable comparison object that processes locale settings once:

const collator = new Intl.Collator("de");
const sorted = items.sort((a, b) => collator.compare(a, b));

The collator instance stores the locale configuration and comparison rules. The compare method uses these precomputed rules for every comparison, eliminating repeated initialization overhead.

Performance improvements of roughly 60% to 80% are possible when sorting large arrays compared to repeated localeCompare calls, though the actual gain depends on the engine and the data.
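
Because the gain varies by engine and data, it is worth measuring in your own environment. The following is a minimal timing sketch, not a rigorous benchmark: the generated word list and the timeSort helper are hypothetical, and a single run is subject to JIT warm-up and noise.

```javascript
// Hypothetical test data: 5,000 six-letter strings that include umlauts.
const alphabet = "aäbcdeoöuüz";
const words = Array.from({ length: 5000 }, (_, i) =>
  Array.from({ length: 6 }, (_, j) =>
    alphabet[(i * 7 + j * 13 + i * j) % alphabet.length]
  ).join("")
);

function timeSort(label, compareFn) {
  const copy = [...words]; // sort a copy so each run sees the same input
  const start = performance.now();
  copy.sort(compareFn);
  console.log(label, (performance.now() - start).toFixed(1), "ms");
  return copy;
}

const benchCollator = new Intl.Collator("de");
const viaLocaleCompare = timeSort("localeCompare:", (a, b) => a.localeCompare(b, "de"));
const viaCollator = timeSort("Intl.Collator:", benchCollator.compare);
```

Both calls should produce the same order; only the elapsed time differs.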

Accessing the compare method directly

You can pass the compare method directly to sort:

const collator = new Intl.Collator("de");
const sorted = items.sort(collator.compare);

This works because compare is bound to the collator instance. The method receives two strings and returns the comparison result, matching the signature Array.sort expects.
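
The binding can be verified with a short sketch. The variable names are illustrative only.

```javascript
const deCollator = new Intl.Collator("de");

// Detaching the function does not lose the collator it belongs to.
const { compare } = deCollator;
console.log(compare("a", "b") < 0); // true

// Repeated access returns the same bound function.
console.log(deCollator.compare === deCollator.compare); // true
```

This is why no arrow-function wrapper or explicit bind call is needed when passing compare to sort.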

Understanding sensitivity levels

The sensitivity option controls which character differences matter during comparison. There are four levels:

Base sensitivity

Base sensitivity ignores accents and case:

const collator = new Intl.Collator("en", { sensitivity: "base" });

console.log(collator.compare("a", "a")); // 0
console.log(collator.compare("a", "á")); // 0
console.log(collator.compare("a", "A")); // 0
console.log(collator.compare("a", "b")); // -1

Only differences between base letters count. This level works well for fuzzy search where users might not type accents correctly.

Accent sensitivity

Accent sensitivity considers accents but ignores case:

const collator = new Intl.Collator("en", { sensitivity: "accent" });

console.log(collator.compare("a", "a")); // 0
console.log(collator.compare("a", "á")); // -1
console.log(collator.compare("a", "A")); // 0
console.log(collator.compare("á", "A")); // 1

Accented and unaccented characters differ. Uppercase and lowercase versions of the same letter match.

Case sensitivity

Case sensitivity considers case but ignores accents:

const collator = new Intl.Collator("en", { sensitivity: "case" });

console.log(collator.compare("a", "a")); // 0
console.log(collator.compare("a", "á")); // 0
console.log(collator.compare("a", "A")); // -1
console.log(collator.compare("á", "Á")); // -1

Case differences matter but accents are ignored. This level is less common in practice.

Variant sensitivity

Variant sensitivity considers all differences:

const collator = new Intl.Collator("en", { sensitivity: "variant" });

console.log(collator.compare("a", "a")); // 0
console.log(collator.compare("a", "á")); // -1
console.log(collator.compare("a", "A")); // -1
console.log(collator.compare("á", "Á")); // -1

This is the default when usage is "sort". Every character difference produces a distinct comparison result.

Choosing sensitivity based on use case

Different scenarios need different sensitivity levels:

  • Sorting lists: Use variant sensitivity to maintain strict ordering
  • Searching content: Use base sensitivity to match regardless of accents or case
  • Filtering options: Use accent sensitivity when case should not matter
  • Case-sensitive search: Use case sensitivity when accents should not matter

The usage option provides default sensitivity settings for common scenarios.
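
As a quick reference, the sketch below builds a small table of which pairs each sensitivity level treats as equal. The pairs and variable names are chosen for illustration, and the results shown assume the "en" locale.

```javascript
const levels = ["base", "accent", "case", "variant"];
const pairs = [["a", "á"], ["a", "A"], ["a", "b"]];

const equalityTable = {};
for (const sensitivity of levels) {
  const c = new Intl.Collator("en", { sensitivity });
  // true means the two strings compare as equal at this level
  equalityTable[sensitivity] = pairs.map(([x, y]) => c.compare(x, y) === 0);
}
console.log(equalityTable);
// Expected contents, in the order a/á, a/A, a/b:
// base:    [true,  true,  false]
// accent:  [false, true,  false]
// case:    [true,  false, false]
// variant: [false, false, false]
```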

Using usage for sort and search modes

The usage option optimizes collator behavior for sorting or searching:

// Optimized for sorting
const sortCollator = new Intl.Collator("en", { usage: "sort" });

// Optimized for searching
const searchCollator = new Intl.Collator("en", { usage: "search" });

Sort usage defaults to variant sensitivity, ensuring every difference produces a consistent order. Search usage is tuned for finding matches, and its default sensitivity depends on the locale, so set sensitivity explicitly when you need relaxed matching.

For case-insensitive and accent-insensitive search:

const collator = new Intl.Collator("en", {
  usage: "search",
  sensitivity: "base"
});

const items = ["Apple", "Äpfel", "Banana"];
const matches = items.filter(item =>
  collator.compare(item, "apple") === 0
);
console.log(matches); // ["Apple"]

This pattern enables fuzzy matching where users do not need to type exact characters.
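
The example above only matches whole strings. For autocomplete, one possible approach is to compare just the leading characters of each item. The startsWithLoose helper below is hypothetical and deliberately naive: it slices by UTF-16 code units after NFC normalization, so it assumes the query and the matching prefix have the same length.

```javascript
const looseCollator = new Intl.Collator("en", {
  usage: "search",
  sensitivity: "base"
});

// Hypothetical helper: true if text starts with query, ignoring case and accents.
function startsWithLoose(text, query) {
  const t = text.normalize("NFC");
  const q = query.normalize("NFC");
  return looseCollator.compare(t.slice(0, q.length), q) === 0;
}

const suggestions = ["Apple", "Äpfel", "Banana", "apricot"].filter(item =>
  startsWithLoose(item, "ap")
);
console.log(suggestions); // ["Apple", "Äpfel", "apricot"]
```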

Enabling numeric sorting for natural ordering

The numeric option treats embedded numbers as numeric values:

const collator = new Intl.Collator("en", { numeric: true });

const files = ["File1", "File10", "File2"];
console.log(files.sort(collator.compare));
// ["File1", "File2", "File10"]

Without numeric sorting, "File10" would sort before "File2" because the string "10" starts with "1". Numeric sorting parses number sequences and compares them mathematically.

This produces natural ordering that matches human expectations for filenames, version numbers, and numbered lists.
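
A short sketch with version-like strings shows the same behavior across several numeric segments. The sample values are invented, and this does not implement full semantic-version rules such as pre-release tags.

```javascript
const naturalCollator = new Intl.Collator("en", { numeric: true });

const versions = ["v1.10.0", "v1.2.0", "v1.9.3", "v1.2.10"];
console.log([...versions].sort(naturalCollator.compare));
// ["v1.2.0", "v1.2.10", "v1.9.3", "v1.10.0"]
```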

Handling decimal numbers with numeric sorting

Numeric sorting has a limitation with decimal numbers:

const collator = new Intl.Collator("en", { numeric: true });

const values = ["1.5", "1.10", "1.2"];
console.log(values.sort(collator.compare));
// ["1.2", "1.5", "1.10"]

The decimal point is treated as punctuation, not part of the number. Each segment between punctuation is sorted separately. For decimal number sorting, parse values to numbers and use numeric comparison.
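
A minimal sketch of that alternative, assuming every string is a valid number that uses "." as the decimal separator:

```javascript
const decimals = ["1.5", "1.10", "1.2"];

// Compare by numeric value; the original strings are kept for display.
const byValue = [...decimals].sort((a, b) => Number(a) - Number(b));
console.log(byValue); // ["1.10", "1.2", "1.5"]
```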

Controlling case ordering with caseFirst

The caseFirst option determines whether uppercase or lowercase letters sort first:

// Uppercase first
const upperFirst = new Intl.Collator("en", { caseFirst: "upper" });
console.log(["a", "A", "b", "B"].sort(upperFirst.compare));
// ["A", "a", "B", "b"]

// Lowercase first
const lowerFirst = new Intl.Collator("en", { caseFirst: "lower" });
console.log(["a", "A", "b", "B"].sort(lowerFirst.compare));
// ["a", "A", "b", "B"]

The default is "false" (the string, not the boolean), which uses the locale default ordering. This option has no effect when sensitivity is base or accent because those levels ignore case.
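
Both points can be checked with a short sketch. The results shown assume the "en" locale.

```javascript
// The resolved default is the string "false", not the boolean false.
console.log(new Intl.Collator("en").resolvedOptions().caseFirst); // "false"

// With base sensitivity, case is ignored, so caseFirst cannot reorder anything.
const baseUpper = new Intl.Collator("en", {
  sensitivity: "base",
  caseFirst: "upper"
});
console.log(baseUpper.compare("a", "A")); // 0
```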

Ignoring punctuation during comparison

The ignorePunctuation option skips punctuation marks during comparison:

const collator = new Intl.Collator("en", { ignorePunctuation: true });

console.log(collator.compare("hello", "he-llo")); // 0
console.log(collator.compare("hello", "hello!")); // 0

This option defaults to true for Thai and false for other languages. Use it when punctuation should not affect string ordering or matching.
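
To make the effect visible, this sketch compares the same pair with and without the option. The variable names are illustrative, and the results assume the "en" locale.

```javascript
const strict = new Intl.Collator("en");
const lenient = new Intl.Collator("en", { ignorePunctuation: true });

// By default the hyphen takes part in the comparison.
console.log(strict.compare("co-operate", "cooperate") === 0);  // false

// With ignorePunctuation the hyphen is skipped.
console.log(lenient.compare("co-operate", "cooperate") === 0); // true
```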

Specifying collation types for language-specific rules

Some locales support multiple collation types for specialized sorting:

// Chinese pinyin ordering
const pinyin = new Intl.Collator("zh-CN-u-co-pinyin");

// German phonebook ordering
const phonebook = new Intl.Collator("de-DE-u-co-phonebk");

// Emoji grouping
const emoji = new Intl.Collator("en-u-co-emoji");

The collation type is specified in the locale string using Unicode extension syntax. Common types include:

  • pinyin: Chinese sorting by romanized pronunciation
  • stroke: Chinese sorting by stroke count
  • phonebk: German phonebook ordering
  • trad: Traditional sorting rules for certain languages
  • emoji: Groups emoji by category

Call Intl.supportedValuesOf("collation") to list the collation types available in your environment.
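
The sketch below lists the available types and then contrasts standard and phonebook ordering for German. It assumes a runtime that provides Intl.supportedValuesOf and ships the phonebk collation; the sample entries are invented.

```javascript
// Collation types known to this runtime (the list varies by engine and version).
const collations = Intl.supportedValuesOf("collation");
console.log(collations.includes("phonebk"));

const entries = ["Af", "Ärzte", "Adler"];

// Standard German: ä sorts like a, so "Ärzte" follows "Af".
const standardDe = new Intl.Collator("de");
console.log([...entries].sort(standardDe.compare));
// ["Adler", "Af", "Ärzte"]

// Phonebook German: ä sorts like ae, so "Ärzte" precedes "Af".
const phonebookDe = new Intl.Collator("de-u-co-phonebk");
console.log(phonebookDe.resolvedOptions().collation); // "phonebk" when supported
console.log([...entries].sort(phonebookDe.compare));
// ["Adler", "Ärzte", "Af"]
```

If the requested collation type is not supported, resolvedOptions().collation reports the type that was actually used instead.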

Reusing collator instances across your application

Create collator instances once and reuse them throughout your application:

// utils/collation.js
export const germanCollator = new Intl.Collator("de");
export const searchCollator = new Intl.Collator("en", {
  sensitivity: "base"
});
export const numericCollator = new Intl.Collator("en", {
  numeric: true
});

// In your components
import { germanCollator } from "./utils/collation";

const sorted = names.sort(germanCollator.compare);

This pattern maximizes performance and maintains consistent comparison behavior across your codebase.
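
When the locale is only known at runtime, for example because it comes from user settings, one option is a small cache instead of fixed exports. The getCollator helper below is hypothetical. It keys the cache on the JSON form of its arguments, so option objects with the same properties in a different order produce separate entries.

```javascript
// Hypothetical helper: one collator per (locale, options) combination.
const collatorCache = new Map();

function getCollator(locale, options = {}) {
  const key = JSON.stringify([locale, options]);
  let cached = collatorCache.get(key);
  if (!cached) {
    cached = new Intl.Collator(locale, options);
    collatorCache.set(key, cached);
  }
  return cached;
}

console.log(getCollator("de") === getCollator("de")); // true
console.log(getCollator("en", { numeric: true }) === getCollator("en")); // false
```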

Sorting arrays of objects by property

Use the collator in a comparison function that accesses object properties:

const collator = new Intl.Collator("de");

const users = [
  { name: "Zoe" },
  { name: "Änder" },
  { name: "Ava" }
];

const sorted = users.sort((a, b) =>
  collator.compare(a.name, b.name)
);

This approach works for any object structure. Extract the strings to compare and pass them to the collator.
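
The same idea extends to sorting by more than one property. The sketch below uses invented field names (last, first): it falls back to the second key when the first compares as equal, and it sorts a copy so the original array stays untouched.

```javascript
const nameCollator = new Intl.Collator("de");

const people = [
  { last: "Müller", first: "Zoe" },
  { last: "Muller", first: "Ava" },
  { last: "Müller", first: "Änder" }
];

// compare returns 0 for a tie, so || moves on to the next key.
const byLastThenFirst = [...people].sort((a, b) =>
  nameCollator.compare(a.last, b.last) ||
  nameCollator.compare(a.first, b.first)
);

console.log(byLastThenFirst.map(p => `${p.last}, ${p.first}`));
// ["Muller, Ava", "Müller, Änder", "Müller, Zoe"]
```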

Comparing Intl.Collator with localeCompare performance

Intl.Collator provides better performance when sorting large datasets:

// Slower: recreates locale settings for each comparison
const slow = items.sort((a, b) => a.localeCompare(b, "de"));

// Faster: reuses precomputed locale settings
const collator = new Intl.Collator("de");
const fast = items.sort(collator.compare);

For small arrays (under 100 items), the difference is negligible. For large arrays (thousands of items), Intl.Collator can be 60-80% faster, depending on the engine and the data.

One exception exists in V8-based browsers like Chrome. localeCompare has an optimization for ASCII-only strings using lookup tables. When sorting purely ASCII strings, localeCompare may perform comparably to Intl.Collator.

Knowing when to use Intl.Collator versus localeCompare

Use Intl.Collator when:

  • Sorting large arrays (hundreds or thousands of items)
  • Sorting repeatedly (user toggles sort order, virtual lists)
  • Building reusable comparison utilities
  • Performance matters for your use case

Use localeCompare when:

  • Making one-off comparisons
  • Sorting small arrays (under 100 items)
  • Simplicity outweighs performance concerns
  • You need inline comparison without setup

Both APIs support the same options and produce identical results. The difference is purely about performance and code organization.

Checking resolved options

The resolvedOptions method returns the actual options used by the collator:

const collator = new Intl.Collator("de", { sensitivity: "base" });
console.log(collator.resolvedOptions());
// {
//   locale: "de",
//   usage: "sort",
//   sensitivity: "base",
//   ignorePunctuation: false,
//   collation: "default",
//   numeric: false,
//   caseFirst: "false"
// }

This helps debug collation behavior and understand default values. The resolved locale may differ from the requested locale if the system does not support the exact locale.
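
resolvedOptions also reports settings that came from the locale tag rather than the options object. In this sketch the -u-kn extension key requests numeric ordering. The exact form of the resolved locale string can vary by runtime, so it is printed without an expected value.

```javascript
// Defaults for a plain collator.
const plain = new Intl.Collator("en").resolvedOptions();
console.log(plain.usage, plain.sensitivity, plain.numeric);
// "sort" "variant" false

// The -u-kn extension requests numeric ordering through the locale tag.
const tagged = new Intl.Collator("en-US-u-kn-true").resolvedOptions();
console.log(tagged.numeric); // true
console.log(tagged.locale);  // the locale the runtime actually selected
```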

Verifying locale support

Check which locales are supported in the current environment:

const supported = Intl.Collator.supportedLocalesOf(["de", "fr", "xx"]);
console.log(supported); // ["de", "fr"]

When you construct a collator with an unsupported locale, it falls back to the runtime's default locale. This method helps detect when your requested locale is unavailable.
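
One way to use this is to choose an explicit fallback instead of relying on the runtime default. The pickCollatorLocale helper and its "en" fallback are hypothetical choices for illustration.

```javascript
// Hypothetical helper: first supported locale from a preference list, else a fallback.
function pickCollatorLocale(preferred, fallback = "en") {
  const [first] = Intl.Collator.supportedLocalesOf(preferred);
  return first ?? fallback;
}

console.log(pickCollatorLocale(["xx", "de"])); // "de"
console.log(pickCollatorLocale(["xx"]));       // "en"
```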

Browser and environment support

Intl.Collator has been widely supported since September 2017. All modern browsers and Node.js versions support it. The API itself behaves consistently across environments, although exact ordering can vary slightly with the locale data each runtime ships.

Some collation types and options may have limited support in older browsers. Test critical functionality or check MDN compatibility tables if supporting older environments.

Common mistakes to avoid

Do not create a new collator for every comparison:

// Wrong: creates collator repeatedly
items.sort((a, b) => new Intl.Collator("de").compare(a, b));

// Right: create once, reuse
const collator = new Intl.Collator("de");
items.sort(collator.compare);

Do not assume default sorting works for international text:

// Wrong: breaks for non-ASCII characters
names.sort();

// Right: use locale-aware sorting
names.sort(new Intl.Collator("de").compare);

Do not forget to specify sensitivity for search:

// Wrong: variant sensitivity requires exact match
const strictCollator = new Intl.Collator("en");
items.filter(item => strictCollator.compare(item, "apple") === 0);

// Right: base sensitivity for fuzzy matching
const looseCollator = new Intl.Collator("en", { sensitivity: "base" });
items.filter(item => looseCollator.compare(item, "apple") === 0);

Practical use cases

Use Intl.Collator for:

  • Sorting user-generated content (names, titles, addresses)
  • Implementing search and autocomplete features
  • Building data tables with sortable columns
  • Creating filtered lists and dropdown options
  • Sorting file names and version numbers
  • Alphabetical navigation in contact lists
  • Multi-language application interfaces

Any interface that displays sorted text to users benefits from locale-aware collation. This ensures your application feels native and correct regardless of the user's language.