How to sort strings with embedded numbers correctly
Use numeric collation to sort file names, version numbers, and other strings containing numbers in natural order
Introduction
When you sort strings containing numbers, you expect file1.txt, file2.txt, and file10.txt to appear in that order. However, JavaScript's default string comparison produces file1.txt, file10.txt, file2.txt instead. This happens because strings are compared character by character, and the character 1 in 10 comes before the character 2.
This problem appears whenever you sort file names, version numbers, street addresses, product codes, or any other strings with embedded numbers. The incorrect ordering confuses users and makes data difficult to navigate.
JavaScript provides the Intl.Collator API with a numeric option that solves this problem. This lesson explains how numeric collation works, why default string comparison fails, and how to sort strings with embedded numbers in natural numerical order.
What numeric collation is
Numeric collation is a comparison method that treats sequences of digits as numbers rather than individual characters. When comparing strings, the collator identifies digit sequences and compares them by their numeric value.
With numeric collation disabled, the string file10.txt comes before file2.txt because character-by-character comparison finds that 1 comes before 2 in the first differing position. The collator never considers that 10 represents a number greater than 2.
With numeric collation enabled, the collator recognizes 10 and 2 as complete numbers and compares them numerically. Since 10 is greater than 2, file2.txt correctly comes before file10.txt.
This behavior produces what people call natural sorting or natural order, where strings containing numbers sort the way humans expect them to rather than strictly alphabetically.
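To make the idea concrete, here is a minimal hand-written sketch of a digit-aware comparison. It is not how Intl.Collator is implemented, and it ignores locale rules entirely; splitRuns and naiveNaturalCompare are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: split a string into alternating digit and non-digit runs.
function splitRuns(text) {
  return text.match(/\d+|\D+/g) ?? [];
}

// Hypothetical comparator: digit runs are compared by value, other runs as plain strings.
function naiveNaturalCompare(a, b) {
  const runsA = splitRuns(a);
  const runsB = splitRuns(b);
  const shared = Math.min(runsA.length, runsB.length);
  for (let i = 0; i < shared; i++) {
    const x = runsA[i];
    const y = runsB[i];
    if (/^\d/.test(x) && /^\d/.test(y)) {
      // BigInt avoids precision loss on very long digit runs.
      const nx = BigInt(x);
      const ny = BigInt(y);
      if (nx !== ny) return nx < ny ? -1 : 1;
    } else if (x !== y) {
      return x < y ? -1 : 1;
    }
  }
  // All shared runs are equal: the string with fewer runs comes first.
  return runsA.length - runsB.length;
}

console.log(splitRuns('file10.txt'));
// ['file', '10', '.txt']
console.log(naiveNaturalCompare('file2.txt', 'file10.txt'));
// -1
```

In practice the built-in collator is the better choice, because it also applies locale-aware rules to the non-digit parts.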
Why default string comparison fails for numbers
JavaScript's default string comparison uses lexicographic ordering, which compares strings from left to right, one UTF-16 code unit at a time, by code unit value. This gives a usable order for simple same-case ASCII text but produces unexpected results for numbers.
Consider how lexicographic comparison handles these strings:
const files = ['file1.txt', 'file10.txt', 'file2.txt', 'file20.txt'];
files.sort();
console.log(files);
// Output: ['file1.txt', 'file10.txt', 'file2.txt', 'file20.txt']
The comparison walks both strings from left to right and is decided at the first position where they differ. At the first differing position after file, it compares 1 against 2. Since 1 has a lower code unit value than 2, any string starting with file1 comes before any string starting with file2, regardless of what follows.
This produces the sequence file1.txt, file10.txt, file2.txt, file20.txt, which violates human expectations about number ordering.
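The comparison described above can be traced by hand. The following sketch finds the first position where two file names differ and prints the code units that decide the result; firstDifference is a hypothetical helper written for this illustration.

```javascript
// Hypothetical helper: index of the first position where two strings differ.
function firstDifference(a, b) {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

const index = firstDifference('file10.txt', 'file2.txt');
console.log(index);
// 4
console.log('file10.txt'.charCodeAt(index));
// 49, the code unit for '1'
console.log('file2.txt'.charCodeAt(index));
// 50, the code unit for '2'
console.log('file10.txt' < 'file2.txt');
// true, because 49 < 50 decides the comparison
```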
Using Intl.Collator with the numeric option
The Intl.Collator constructor accepts an options object with a numeric property. Setting numeric: true enables numeric collation, causing the collator to compare digit sequences by their numeric value.
const collator = new Intl.Collator('en-US', { numeric: true });
const files = ['file1.txt', 'file10.txt', 'file2.txt', 'file20.txt'];
files.sort(collator.compare);
console.log(files);
// Output: ['file1.txt', 'file2.txt', 'file10.txt', 'file20.txt']
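The collator's compare method returns a negative number when the first argument should come before the second, zero when they are equal, and a positive number when the first should come after the second. This matches the comparator contract expected by JavaScript's Array.prototype.sort() method. Because compare is a getter that returns a function already bound to its collator, you can pass collator.compare directly without wrapping it.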
The sorted result places files in natural numerical order. The collator recognizes that 1 < 2 < 10 < 20, producing the sequence humans expect.
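Because sort() reorders the array in place, it can be convenient to wrap the collator in a small helper that returns a sorted copy instead. The sketch below assumes that is what you want; naturalCollator and sortNaturally are hypothetical names. It also shows String.prototype.localeCompare, which accepts the same numeric option for one-off comparisons.

```javascript
const naturalCollator = new Intl.Collator('en-US', { numeric: true });

// Hypothetical helper: return a naturally sorted copy, leaving the input untouched.
function sortNaturally(values) {
  return [...values].sort(naturalCollator.compare);
}

const original = ['file10.txt', 'file2.txt', 'file1.txt'];
const sorted = sortNaturally(original);

console.log(sorted);
// ['file1.txt', 'file2.txt', 'file10.txt']
console.log(original);
// ['file10.txt', 'file2.txt', 'file1.txt'] (unchanged)

// Confirm that the collator really has numeric collation enabled.
console.log(naturalCollator.resolvedOptions().numeric);
// true

// One-off comparison without creating a collator yourself.
console.log('file2.txt'.localeCompare('file10.txt', 'en-US', { numeric: true }) < 0);
// true
```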
Sorting mixed alphanumeric strings
Numeric collation handles strings where numbers appear at any position, not just at the end. The collator compares alphabetic portions normally and numeric portions numerically.
const collator = new Intl.Collator('en-US', { numeric: true });
const addresses = ['123 Oak St', '45 Oak St', '1234 Oak St', '5 Oak St'];
addresses.sort(collator.compare);
console.log(addresses);
// Output: ['5 Oak St', '45 Oak St', '123 Oak St', '1234 Oak St']
The collator identifies the digit sequences at the beginning of each string and compares them numerically. It recognizes that 5 < 45 < 123 < 1234, even though lexicographic comparison would produce a different order.
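The same rule applies when a string contains more than one number, or when the numbers sit in the middle of the string. As a small illustration with made-up image names, each digit sequence is compared in turn, from left to right:

```javascript
const imageCollator = new Intl.Collator('en-US', { numeric: true });

// Made-up file names with two numbers each: an image number and a revision number.
const images = ['img12_v2.png', 'img2_v10.png', 'img2_v9.png', 'img12_v1.png'];
const sortedImages = [...images].sort(imageCollator.compare);

console.log(sortedImages);
// ['img2_v9.png', 'img2_v10.png', 'img12_v1.png', 'img12_v2.png']
```

The image number decides first (2 before 12), and the revision number only matters when the image numbers are equal (9 before 10).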
Sorting version numbers
Version numbers are a common use case for numeric collation. Software versions like 1.2.10 should come after 1.2.2, but lexicographic comparison produces the wrong order.
const collator = new Intl.Collator('en-US', { numeric: true });
const versions = ['1.2.10', '1.2.2', '1.10.5', '1.2.5'];
versions.sort(collator.compare);
console.log(versions);
// Output: ['1.2.2', '1.2.5', '1.2.10', '1.10.5']
The collator compares each numeric component correctly. In the sequence 1.2.2, 1.2.5, 1.2.10, it recognizes that the third component increases numerically. In 1.10.5, it recognizes that the second component is 10, which is greater than 2.
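A common follow-up task is picking the newest version or listing versions newest first. The sketch below reverses the arguments to compare for a descending sort; latestVersion is a hypothetical helper. It assumes plain dotted numeric versions and does not implement semantic versioning rules such as pre-release tags.

```javascript
const versionCollator = new Intl.Collator('en-US', { numeric: true });

// Hypothetical helper: return the highest version in a non-empty array.
function latestVersion(versions) {
  return versions.reduce((best, candidate) =>
    versionCollator.compare(candidate, best) > 0 ? candidate : best
  );
}

const releases = ['1.2.10', '1.2.2', '1.10.5', '1.2.5'];

console.log(latestVersion(releases));
// '1.10.5'

// Swapping the arguments sorts in descending order.
const newestFirst = [...releases].sort((a, b) => versionCollator.compare(b, a));
console.log(newestFirst);
// ['1.10.5', '1.2.10', '1.2.5', '1.2.2']
```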
Working with product codes and identifiers
Product codes, invoice numbers, and other identifiers often mix letters with numbers. Numeric collation ensures these sort in a logical order.
const collator = new Intl.Collator('en-US', { numeric: true });
const products = ['PROD-1', 'PROD-10', 'PROD-2', 'PROD-100'];
products.sort(collator.compare);
console.log(products);
// Output: ['PROD-1', 'PROD-2', 'PROD-10', 'PROD-100']
The alphabetic prefix PROD- matches in all strings, so the collator compares the numeric suffix. The result reflects increasing numeric order rather than lexicographic order.
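Identifiers usually live inside records rather than in a bare array of strings. In that case, wrap the collator in a comparator that reads the relevant property. The inventory data and the sku and name fields below are invented for illustration.

```javascript
const skuCollator = new Intl.Collator('en-US', { numeric: true });

// Invented sample records; only the sku field is used for ordering.
const inventory = [
  { sku: 'PROD-10', name: 'Desk lamp' },
  { sku: 'PROD-2', name: 'Notebook' },
  { sku: 'PROD-100', name: 'Monitor arm' },
  { sku: 'PROD-1', name: 'Pencil' }
];

const bySku = [...inventory].sort((a, b) => skuCollator.compare(a.sku, b.sku));

console.log(bySku.map((product) => product.sku));
// ['PROD-1', 'PROD-2', 'PROD-10', 'PROD-100']
```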
Sorting with different locales
The numeric option works with any locale. While different locales may have different sorting rules for alphabetic characters, the numeric comparison behavior remains consistent.
const enCollator = new Intl.Collator('en-US', { numeric: true });
const deCollator = new Intl.Collator('de-DE', { numeric: true });
const items = ['item1', 'item10', 'item2'];
// Sort copies so that each collator starts from the same unsorted input.
console.log([...items].sort(enCollator.compare));
// Output: ['item1', 'item2', 'item10']
console.log([...items].sort(deCollator.compare));
// Output: ['item1', 'item2', 'item10']
Both locales produce the same result because the strings contain only ASCII characters and numbers. When strings include locale-specific characters, the alphabetic comparison follows locale rules while numeric comparison remains consistent.
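To see the locale rules and the numeric rules working together, compare German and Swedish on strings that contain ä. German sorts ä together with a, while Swedish treats ä as a separate letter after z. The sample labels are invented, and the outputs shown assume an engine with full locale data, such as current Node.js releases and modern browsers.

```javascript
const german = new Intl.Collator('de-DE', { numeric: true });
const swedish = new Intl.Collator('sv-SE', { numeric: true });

// Invented labels mixing a locale-sensitive letter with numbers.
const labels = ['z1', 'ä10', 'ä2'];

const germanOrder = [...labels].sort(german.compare);
const swedishOrder = [...labels].sort(swedish.compare);

console.log(germanOrder);
// ['ä2', 'ä10', 'z1']
console.log(swedishOrder);
// ['z1', 'ä2', 'ä10']
```

The position of the ä entries changes with the locale, but ä2 comes before ä10 in both results.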
Comparing strings without sorting
You can use the collator's compare method directly to determine the relationship between two strings without sorting an entire array.
const collator = new Intl.Collator('en-US', { numeric: true });
// The exact nonzero values can differ between engines, so rely only on the sign.
console.log(collator.compare('file2.txt', 'file10.txt'));
// Output: -1 (any negative number means the first argument comes before the second)
console.log(collator.compare('file10.txt', 'file2.txt'));
// Output: 1 (any positive number means the first argument comes after the second)
console.log(collator.compare('file2.txt', 'file2.txt'));
// Output: 0 (zero means the arguments are equal)
This is useful when you need to check ordering without modifying an array, such as when inserting an item into a sorted list or checking whether a value falls within a range.
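As one possible way to insert into a sorted list, the sketch below uses a binary search driven by the collator's compare function. insertSorted is a hypothetical helper, and it assumes the input array is already sorted with the same comparator.

```javascript
const listCollator = new Intl.Collator('en-US', { numeric: true });

// Hypothetical helper: return a copy of a sorted array with value inserted in order.
function insertSorted(sorted, value, compare) {
  let low = 0;
  let high = sorted.length;
  while (low < high) {
    const mid = (low + high) >>> 1;
    if (compare(sorted[mid], value) <= 0) {
      low = mid + 1; // value belongs after sorted[mid]
    } else {
      high = mid; // value belongs at or before sorted[mid]
    }
  }
  return [...sorted.slice(0, low), value, ...sorted.slice(low)];
}

const existing = ['file1.txt', 'file2.txt', 'file10.txt'];
const updated = insertSorted(existing, 'file3.txt', listCollator.compare);

console.log(updated);
// ['file1.txt', 'file2.txt', 'file3.txt', 'file10.txt']
```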
Understanding the limitation with decimal numbers
Numeric collation compares sequences of digits, but it does not recognize decimal points as part of numbers. The period is treated as an ordinary character that separates two independent digit sequences, not as a decimal separator.
const collator = new Intl.Collator('en-US', { numeric: true });
const measurements = ['0.5', '0.05', '0.005'];
measurements.sort(collator.compare);
console.log(measurements);
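// Output: ['0.5', '0.05', '0.005'] (input order is unchanged in ICU-based engines such as V8)
The collator splits each measurement into three parts: the digits before the period, the period itself, and the digits after the period. The leading 0 is the same in all three strings, and so is the period. The digit sequences after the period are 5, 05, and 005, and because numeric collation ignores leading zeros, each of them has the numeric value 5. The collator therefore finds no difference between the strings, and the sort leaves them in their original order, which here runs from largest to smallest. Even when the fractional digits do differ, they are compared as whole numbers, so 1.25 sorts after 1.5 because 25 is greater than 5.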
For sorting decimal numbers, convert them to actual numbers and sort numerically, or use string padding to ensure correct lexicographic order.
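The following sketch contrasts the two approaches on a small made-up list. It assumes every string parses cleanly with Number(); sortDecimalStrings is a hypothetical helper.

```javascript
const decimalCollator = new Intl.Collator('en-US', { numeric: true });
const decimalStrings = ['1.125', '1.5', '1.25'];

// Numeric collation compares the fractional digits as the integers 125, 5, and 25.
const collated = [...decimalStrings].sort(decimalCollator.compare);
console.log(collated);
// ['1.5', '1.25', '1.125']

// Hypothetical helper: sort by parsed value while keeping the original strings.
function sortDecimalStrings(values) {
  return [...values].sort((a, b) => Number(a) - Number(b));
}

console.log(sortDecimalStrings(decimalStrings));
// ['1.125', '1.25', '1.5']
console.log(sortDecimalStrings(['0.5', '0.05', '0.005']));
// ['0.005', '0.05', '0.5']
```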
Combining numeric collation with other options
The numeric option works alongside other collation options like sensitivity and caseFirst. You can control how the collator handles case and accents while maintaining numeric comparison behavior.
const collator = new Intl.Collator('en-US', {
  numeric: true,
  sensitivity: 'base'
});
const items = ['Item1', 'item10', 'ITEM2'];
items.sort(collator.compare);
console.log(items);
// Output: ['Item1', 'ITEM2', 'item10']
The sensitivity: 'base' option makes the comparison ignore differences in case and accents. The collator treats Item1, item1, and ITEM1 as equivalent while still comparing numeric portions correctly.
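The caseFirst option can be combined with numeric in the same way. In this sketch with invented labels, the numbers still decide the overall order, and caseFirst only breaks ties between strings that differ in case alone.

```javascript
const upperFirst = new Intl.Collator('en-US', { numeric: true, caseFirst: 'upper' });
const lowerFirst = new Intl.Collator('en-US', { numeric: true, caseFirst: 'lower' });

// Invented labels: 'item2' and 'Item2' differ only in case.
const mixedCase = ['item2', 'Item2', 'item10', 'Item1'];

const upperResult = [...mixedCase].sort(upperFirst.compare);
const lowerResult = [...mixedCase].sort(lowerFirst.compare);

console.log(upperResult);
// ['Item1', 'Item2', 'item2', 'item10']
console.log(lowerResult);
// ['Item1', 'item2', 'Item2', 'item10']
```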
Reusing collators for performance
Creating a new Intl.Collator instance involves loading locale data and processing options. When you need to sort multiple arrays or perform many comparisons, create the collator once and reuse it.
const collator = new Intl.Collator('en-US', { numeric: true });
const files = ['file1.txt', 'file10.txt', 'file2.txt'];
const versions = ['1.2.10', '1.2.2', '1.10.5'];
const products = ['PROD-1', 'PROD-10', 'PROD-2'];
files.sort(collator.compare);
versions.sort(collator.compare);
products.sort(collator.compare);
This approach is more efficient than creating a new collator for each sort operation. The performance difference becomes significant when sorting many arrays or performing frequent comparisons.
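When several parts of an application need numeric collators for different locales, one way to share them is a small cache keyed by locale. This is a sketch of that design choice rather than a required pattern; collatorCache and getNumericCollator are hypothetical names.

```javascript
// Hypothetical module-level cache: one numeric collator per locale.
const collatorCache = new Map();

function getNumericCollator(locale = 'en-US') {
  let cached = collatorCache.get(locale);
  if (!cached) {
    cached = new Intl.Collator(locale, { numeric: true });
    collatorCache.set(locale, cached);
  }
  return cached;
}

const first = getNumericCollator('en-US');
const second = getNumericCollator('en-US');

console.log(first === second);
// true, the second call reuses the cached instance
console.log(['PROD-10', 'PROD-2'].sort(getNumericCollator('en-US').compare));
// ['PROD-2', 'PROD-10']
```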