How to compare strings ignoring case differences

Use locale-aware comparison to handle case-insensitive matching correctly across languages

Introduction

Case-insensitive string comparison appears frequently in web applications. Users type search queries in mixed case, enter usernames with inconsistent capitalization, or fill forms without regard to letter case. Your application needs to match these inputs correctly regardless of whether users type uppercase, lowercase, or mixed case.

The straightforward approach converts both strings to lowercase and compares them. This works for English text but fails for international applications. Different languages have different rules for converting between uppercase and lowercase. A comparison method that works for English can produce wrong results for Turkish, German, Greek, or other languages.

JavaScript provides the Intl.Collator API to handle case-insensitive comparison correctly across all languages. This lesson explains why simple lowercase conversion fails, how locale-aware comparison works, and when to use each approach.

The naive approach with toLowerCase

Converting both strings to lowercase before comparison is the most common approach to case-insensitive matching:

const str1 = "Hello";
const str2 = "HELLO";

console.log(str1.toLowerCase() === str2.toLowerCase());
// true

This pattern works for ASCII text and English words. The comparison treats uppercase and lowercase versions of the same letter as identical.

You can use this approach for case-insensitive substring search:

const query = "apple";
const items = ["Apple", "Banana", "APPLE PIE", "Orange"];

const matches = items.filter(item =>
  item.toLowerCase().includes(query.toLowerCase())
);

console.log(matches);
// ["Apple", "APPLE PIE"]

The filter finds all items containing the search query regardless of case. This provides the expected behavior for users who type queries without thinking about capitalization.

Why the naive approach fails for international text

The toLowerCase() method applies Unicode's default, locale-independent case mappings, but case conversion rules are not the same in all languages. The most famous example is the Turkish i problem.

In English, the lowercase letter i converts to uppercase I. In Turkish, there are two distinct letters:

  • Lowercase dotted i converts to uppercase dotted İ
  • Lowercase dotless ı converts to uppercase dotless I

This difference breaks case-insensitive comparison:

const word1 = "file";
const word2 = "FILE";

// Locale-independent conversion (matches)
console.log(word1.toLowerCase() === word2.toLowerCase());
// true

// Turkish conversion rules (no match)
console.log(word1.toLocaleLowerCase("tr") === word2.toLocaleLowerCase("tr"));
// false - "FILE" becomes "fıle"

When converting FILE to lowercase using Turkish rules, the I becomes ı (dotless), producing fıle. This does not match file (with dotted i), so the comparison returns false even though the strings represent the same word.

Other languages have similar problems. German has the ß character which uppercases to SS. Greek has multiple lowercase forms of sigma (σ and ς) that both uppercase to Σ. Simple case conversion cannot handle these language-specific rules correctly.
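As a brief illustration of these mismatches, the sketch below runs the lowercase comparison on the characters mentioned above. The helper name naiveEquals is hypothetical and exists only for this example.

```javascript
// Hypothetical helper: the naive comparison from the previous section.
function naiveEquals(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

// German: "ß" uppercases to "SS", but "SS" lowercases to "ss", not "ß".
console.log("straße".toUpperCase());
// "STRASSE"
console.log(naiveEquals("straße", "STRASSE"));
// false

// Greek: both lowercase sigmas uppercase to "Σ", yet they differ in lowercase.
console.log("σ".toUpperCase() === "ς".toUpperCase());
// true
console.log(naiveEquals("σ", "ς"));
// false

// Turkish: "\u0130" is İ. It lowercases to "i" plus a combining dot,
// which is two code units, so it never equals a plain "i".
console.log("\u0130".toLowerCase().length);
// 2
console.log(naiveEquals("F\u0130LE", "file"));
// false
```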

Using Intl.Collator with base sensitivity for case-insensitive comparison

The Intl.Collator API provides locale-aware string comparison with configurable sensitivity. The sensitivity option controls which differences matter during comparison.

For case-insensitive comparison, use sensitivity: "base":

const collator = new Intl.Collator("en", { sensitivity: "base" });

console.log(collator.compare("Hello", "hello"));
// 0 (strings are equal)

console.log(collator.compare("Hello", "HELLO"));
// 0 (strings are equal)

console.log(collator.compare("Hello", "Héllo"));
// 0 (strings are equal, accents ignored too)

Base sensitivity ignores both case and accent differences. Only the base letters matter. The comparison returns 0 when strings are equivalent at this sensitivity level.
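Because compare() returns a number rather than a boolean, it can be convenient to wrap the check in a small predicate. The following is a minimal sketch; the names baseCollator and equalsIgnoreCase are assumptions made for this example and are not part of the Intl API.

```javascript
// One shared collator for English text at base sensitivity.
const baseCollator = new Intl.Collator("en", { sensitivity: "base" });

// Hypothetical helper: true when the strings match, ignoring case and accents.
function equalsIgnoreCase(a, b) {
  return baseCollator.compare(a, b) === 0;
}

console.log(equalsIgnoreCase("Hello", "hELLO"));
// true
console.log(equalsIgnoreCase("Hello", "Help"));
// false

// The predicate plugs directly into array methods.
const names = ["Alice", "BOB", "carol"];
console.log(names.some(name => equalsIgnoreCase(name, "bob")));
// true
```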

With a Turkish collator, the comparison follows Turkish rules for the two i letters:

const collator = new Intl.Collator("tr", { sensitivity: "base" });

console.log(collator.compare("file", "FİLE"));
// 0 (dotted i matches dotted İ)

console.log(collator.compare("fıle", "FILE"));
// 0 (dotless ı matches dotless I)

console.log(collator.compare("file", "FILE") === 0);
// false (in Turkish, i and I are different letters)

The collator applies the Turkish collation rules automatically, pairing i with İ and ı with I. This is the correct result for Turkish text. If file and FILE should match, as they would for an English word, use a collator for "en" instead.

Using localeCompare with sensitivity option

The localeCompare() method provides an alternative way to perform case-insensitive comparison. It accepts the same options as Intl.Collator:

const str1 = "Hello";
const str2 = "HELLO";

console.log(str1.localeCompare(str2, "en", { sensitivity: "base" }));
// 0 (strings are equal)

This produces the same result as using Intl.Collator with base sensitivity. The comparison ignores case differences and returns 0 for equivalent strings.

You can use this in array filtering:

const query = "apple";
const items = ["Apple", "Banana", "APPLE PIE", "Orange"];

const matches = items.filter(item =>
  item.localeCompare(query, "en", { sensitivity: "base" }) === 0
);

console.log(matches);
// ["Apple"]

However, localeCompare() only returns 0 for exact matches at the specified sensitivity level. It does not support partial matching like includes(). For substring search, you still need to use lowercase conversion or implement a more sophisticated search algorithm.

Choosing between base and accent sensitivity

The sensitivity option accepts four values: "base", "accent", "case", and "variant". Only "base" and "accent" ignore case, so these are the two to choose between for case-insensitive comparison:

Base sensitivity

Base sensitivity ignores both case and accents:

const collator = new Intl.Collator("en", { sensitivity: "base" });

console.log(collator.compare("cafe", "café"));
// 0 (accents ignored)

console.log(collator.compare("cafe", "Café"));
// 0 (case and accents ignored)

console.log(collator.compare("cafe", "CAFÉ"));
// 0 (case and accents ignored)

This provides the most lenient matching. Users who cannot type accented characters or who skip them for convenience still get correct matches.

Accent sensitivity

Accent sensitivity ignores case but considers accents:

const collator = new Intl.Collator("en", { sensitivity: "accent" });

console.log(collator.compare("cafe", "café"));
// -1 (accents matter)

console.log(collator.compare("cafe", "Café"));
// -1 (accents matter, case ignored)

console.log(collator.compare("Café", "CAFÉ"));
// 0 (case ignored, accents match)

This treats accented and unaccented letters as different while ignoring case. Use this when accent differences are significant but case differences are not.
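The sensitivity option also accepts "case" and "variant", both of which keep case differences. To see all four values side by side, the sketch below compares three pairs that differ by letter, by accent, and by case. The variable names pairs and table are illustrative.

```javascript
// Pairs that differ by base letter, by accent, and by case respectively.
const pairs = [["a", "b"], ["a", "á"], ["a", "A"]];

// For each sensitivity, record whether each pair compares as equal.
const table = {};
for (const sensitivity of ["base", "accent", "case", "variant"]) {
  const collator = new Intl.Collator("en", { sensitivity });
  table[sensitivity] = pairs.map(([x, y]) => collator.compare(x, y) === 0);
}

console.log(table);
// table now holds:
//   base:    [false, true,  true ]
//   accent:  [false, false, true ]
//   case:    [false, true,  false]
//   variant: [false, false, false]
```

Reading the last column shows why only the first two rows are useful for case-insensitive matching.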

Choosing the right sensitivity for your use case

For most case-insensitive comparison needs, base sensitivity provides the best user experience:

  • Search functionality where users type queries without accents
  • Username matching where case should not matter
  • Fuzzy finding where you want maximum flexibility
  • Form validation where Smith and smith should match

Use accent sensitivity when:

  • The language requires distinguishing accented characters
  • Your data contains both accented and unaccented versions with different meanings
  • You need case-insensitive but accent-aware comparison

Performing case-insensitive search with includes

The Intl.Collator API compares complete strings but does not provide substring matching. For case-insensitive search, you still need to combine locale-aware comparison with other approaches.

One option is to use toLowerCase() for the substring search but accept its limitations for international text:

function caseInsensitiveIncludes(text, query) {
  return text.toLowerCase().includes(query.toLowerCase());
}

const text = "The Quick Brown Fox";
console.log(caseInsensitiveIncludes(text, "quick"));
// true

For more sophisticated searching that handles international text correctly, you need to iterate through possible substring positions and use the collator for each comparison:

function localeAwareIncludes(text, query, locale = "en") {
  const collator = new Intl.Collator(locale, { sensitivity: "base" });

  for (let i = 0; i <= text.length - query.length; i++) {
    const substring = text.slice(i, i + query.length);
    if (collator.compare(substring, query) === 0) {
      return true;
    }
  }

  return false;
}

const text = "The Quick Brown Fox";
console.log(localeAwareIncludes(text, "quick"));
// true

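This approach checks every substring that has the same length as the query and uses locale-aware comparison for each. It handles more international text than toLowerCase() does, but it misses matches whose length in code units differs from the query's, such as a precomposed é in the text against a query that spells it as e followed by a combining accent. It is also slower than includes() because it runs one collator comparison per position.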
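Because the loop above only inspects windows exactly as long as the query, it cannot find a match with a different length. One way around this is to let the window length vary as well. The sketch below is an illustration under that assumption, not an optimized search: the name localeAwareIncludesAnyLength is hypothetical, and the nested loop performs a quadratic number of comparisons, so it only suits short strings.

```javascript
// Hypothetical variant: tries every start and end position in the text.
function localeAwareIncludesAnyLength(text, query, locale = "en") {
  if (query.length === 0) return true;
  const collator = new Intl.Collator(locale, { sensitivity: "base" });

  for (let start = 0; start < text.length; start++) {
    for (let end = start + 1; end <= text.length; end++) {
      if (collator.compare(text.slice(start, end), query) === 0) {
        return true;
      }
    }
  }

  return false;
}

// "\u00E9" is a precomposed é (one code unit). The query spells the same
// letter as e followed by a combining acute accent (two code units).
console.log(localeAwareIncludesAnyLength("Caf\u00E9 au lait", "e\u0301"));
// true
console.log(localeAwareIncludesAnyLength("Caf\u00E9 au lait", "tea"));
// false
```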

Performance considerations when using Intl.Collator

Creating an Intl.Collator instance involves loading locale data and processing options. When you need to perform multiple comparisons, create the collator once and reuse it:

// Inefficient: creates collator for every comparison
function badCompare(items, target) {
  return items.filter(item =>
    new Intl.Collator("en", { sensitivity: "base" }).compare(item, target) === 0
  );
}

// Efficient: creates collator once, reuses it
function goodCompare(items, target) {
  const collator = new Intl.Collator("en", { sensitivity: "base" });
  return items.filter(item =>
    collator.compare(item, target) === 0
  );
}

The efficient version creates the collator once before filtering. Each comparison uses the same instance, avoiding repeated initialization overhead.

For applications that perform frequent comparisons, create collator instances at application startup and export them for use throughout your codebase:

// utils/collation.js
export const caseInsensitiveCollator = new Intl.Collator("en", {
  sensitivity: "base"
});

export const accentSensitiveCollator = new Intl.Collator("en", {
  sensitivity: "accent"
});

// In your application code
import { caseInsensitiveCollator } from "./utils/collation";

const isMatch = caseInsensitiveCollator.compare(input, expected) === 0;

This pattern avoids constructing the same collator repeatedly and keeps comparison behavior consistent across your application.
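When the locale is only known at runtime, for instance because it comes from the user's settings, a single exported instance is not enough. One possible extension, sketched here with the hypothetical helper getCollator, caches one collator per locale and sensitivity pair.

```javascript
// Hypothetical cache keyed by "locale|sensitivity".
const collatorCache = new Map();

function getCollator(locale = "en", sensitivity = "base") {
  const key = `${locale}|${sensitivity}`;
  let collator = collatorCache.get(key);
  if (!collator) {
    // First request for this combination: build the collator and store it.
    collator = new Intl.Collator(locale, { sensitivity });
    collatorCache.set(key, collator);
  }
  return collator;
}

console.log(getCollator("de") === getCollator("de"));
// true (the second call reuses the cached instance)
console.log(getCollator("de").compare("Äpfel", "äpfel"));
// 0
```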

When to use toLowerCase versus Intl.Collator

For English-only applications where you control the text content and know it contains only ASCII characters, toLowerCase() provides acceptable results:

// Acceptable for English-only, ASCII-only text
const isMatch = str1.toLowerCase() === str2.toLowerCase();

This approach is simple, fast, and familiar to most developers. If your application truly never handles international text, the added complexity of locale-aware comparison may not provide value.

For international applications or applications where users enter text in any language, use Intl.Collator with appropriate sensitivity:

// Required for international text
const collator = new Intl.Collator(userLocale, { sensitivity: "base" });
const isMatch = collator.compare(str1, str2) === 0;

This applies the comparison rules of the user's locale instead of assuming English ones. The small performance cost of using Intl.Collator is worthwhile to avoid incorrect comparisons.

Even if your application currently supports only English, using locale-aware comparison from the start makes future internationalization easier. Adding support for new languages requires no changes to comparison logic.
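The userLocale value used above has to come from somewhere. A minimal sketch, assuming the locale is taken from the runtime: browsers expose navigator.language, and passing undefined to the constructor asks for the runtime's default locale. The names userCollator and matchesForUser are illustrative.

```javascript
// Use the runtime's preferred language when available; undefined falls back
// to the default locale of the JavaScript engine.
const userLocale =
  typeof navigator !== "undefined" && navigator.language
    ? navigator.language
    : undefined;

const userCollator = new Intl.Collator(userLocale, { sensitivity: "base" });

// Hypothetical helper built on the shared collator.
function matchesForUser(a, b) {
  return userCollator.compare(a, b) === 0;
}

// resolvedOptions() reports which locale the collator actually uses.
console.log(userCollator.resolvedOptions().locale);
console.log(matchesForUser("Hello", "HELLO"));
// true
```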

Practical use cases for case-insensitive comparison

Case-insensitive comparison appears in many common scenarios:

Username and email matching

Users type usernames and email addresses with inconsistent capitalization:

const collator = new Intl.Collator("en", { sensitivity: "base" });

function findUserByEmail(users, email) {
  return users.find(user =>
    collator.compare(user.email, email) === 0
  );
}

// Placeholder addresses for illustration
const users = [
  { email: "john@example.com", name: "John" },
  { email: "jane@example.com", name: "Jane" }
];

console.log(findUserByEmail(users, "John@Example.COM"));
// { email: "john@example.com", name: "John" }

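This finds the user regardless of how they capitalize their email address. Keep in mind that base sensitivity also ignores accents, so two identifiers that differ only by an accent are treated as the same. Where identifiers must stay distinct, sensitivity: "accent" is the safer choice.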

Search autocomplete

Autocomplete suggestions need to match partial input with case insensitivity:

function getSuggestions(items, query) {
  // Prefix matching uses lowercase conversion because a collator
  // only compares whole strings
  const queryLower = query.toLowerCase();

  return items.filter(item =>
    item.toLowerCase().startsWith(queryLower)
  );
}

const items = ["Apple", "Apricot", "Banana", "Cherry"];
console.log(getSuggestions(items, "ap"));
// ["Apple", "Apricot"]

This provides suggestions regardless of the case users type.
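If suggestions should also ignore accents or follow locale rules, the prefix check can go through a collator instead. This is a sketch that compares a fixed-length prefix, so it shares the length limitation of the substring search shown earlier; suggestionCollator and getLocaleAwareSuggestions are hypothetical names.

```javascript
const suggestionCollator = new Intl.Collator("en", { sensitivity: "base" });

function getLocaleAwareSuggestions(items, query) {
  return items.filter(item =>
    // Compare only the leading characters of each item against the query.
    suggestionCollator.compare(item.slice(0, query.length), query) === 0
  );
}

const dishes = [
  "\u00C9clair", // "Éclair" with a precomposed É
  "Eclipse cake",
  "Banana bread"
];

console.log(getLocaleAwareSuggestions(dishes, "ec"));
// ["Éclair", "Eclipse cake"]
```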

Tag and category matching

Users assign tags or categories to content without consistent capitalization:

const collator = new Intl.Collator("en", { sensitivity: "base" });

function hasTag(item, tag) {
  return item.tags.some(itemTag =>
    collator.compare(itemTag, tag) === 0
  );
}

const article = {
  title: "My Article",
  tags: ["JavaScript", "Tutorial", "Web Development"]
};

console.log(hasTag(article, "javascript"));
// true

This matches tags regardless of capitalization differences.