Automatic Pluralization - Lingo.dev Compiler

@lingo.dev/compiler automatically detects plural forms in your text and converts them to ICU MessageFormat so that plurals are handled correctly in every language.

What it does

The compiler uses AI to detect whether a text contains plural forms (such as "1 item" vs. "5 items") and converts it to ICU MessageFormat syntax.

Before (your code):

<p>You have {count} items</p>

After (compiled output):

<p>{count, plural, one {You have 1 item} other {You have # items}}</p>

Different languages have different plural rules. For example, Arabic has 6 forms, Russian has 4, and English has 2. ICU MessageFormat can express all of them correctly.

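How it works

The compiler detects interpolations that contain numeric values
It sends the text to an LLM to check for plural forms
If a plural is detected, the LLM returns an ICU MessageFormat pattern
The compiler injects plural-aware translations

All of this happens automatically, with no code changes required.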
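
As a rough illustration of the first step, the sketch below picks out strings that contain a simple `{name}` placeholder. The helpers `findInterpolations` and `isPluralCandidate` are hypothetical names introduced here, not part of @lingo.dev/compiler, and the sketch treats every placeholder as a candidate because it cannot know whether the value is numeric.

```typescript
// Hypothetical helpers for illustration only; the compiler's real detection logic is not shown in this page.

// Returns the names of simple {name} placeholders found in a text.
function findInterpolations(text: string): string[] {
  const names: string[] = [];
  const pattern = /\{\s*([A-Za-z_$][\w$]*)\s*\}/g;
  let match: RegExpExecArray | null = pattern.exec(text);
  while (match !== null) {
    names.push(match[1]);
    match = pattern.exec(text);
  }
  return names;
}

// Assumption: any text with at least one placeholder is worth sending to plural detection.
function isPluralCandidate(text: string): boolean {
  return findInterpolations(text).length > 0;
}

const candidates = ["You have {count} items", "Welcome back"].filter(isPluralCandidate);
console.log(candidates); // only the first string has a placeholder
```

Only the strings that pass a filter like this would need an LLM call, which is why static text adds no detection cost.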

Configuration

{
  pluralization: {
    enabled: true, // Default: true
    model: "groq:llama-3.1-8b-instant", // Fast model for plural detection
  }
}

Model: Plural detection uses a separate, fast model to keep costs down. The default is Groq's fastest model.

Examples

Simple plurals

<p>You have {count} items in your cart</p>

Becomes:

{count, plural,
  one {You have 1 item in your cart}
  other {You have # items in your cart}
}

Zero form

<p>You have {unreadCount} unread messages</p>

Becomes:

{unreadCount, plural,
  =0 {You have no unread messages}
  one {You have 1 unread message}
  other {You have # unread messages}
}
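
To make the selection order in a pattern like this concrete, here is a minimal sketch of how a plural message is resolved at runtime: an exact match such as `=0` wins first, then the locale's plural category, then `other`. The `formatPlural` function and `PluralForms` type are illustrative names, not the compiler's runtime; only `Intl.PluralRules` and `Intl.NumberFormat` are standard APIs.

```typescript
// Illustrative only: a hand-rolled selector, not the code the compiler generates.
interface PluralForms {
  other: string;
  [selector: string]: string | undefined;
}

function formatPlural(locale: string, count: number, forms: PluralForms): string {
  // 1. Exact matches such as "=0" take priority over plural categories.
  const exact = forms[`=${count}`];
  // 2. Otherwise use the locale's plural category ("one", "few", "other", ...).
  const category = new Intl.PluralRules(locale).select(count);
  // 3. Fall back to "other", which ICU requires in every plural message.
  const template = exact ?? forms[category] ?? forms.other;
  // "#" stands for the locale-formatted number.
  return template.replace(/#/g, new Intl.NumberFormat(locale).format(count));
}

const unread: PluralForms = {
  "=0": "You have no unread messages",
  one: "You have 1 unread message",
  other: "You have # unread messages",
};

console.log(formatPlural("en", 0, unread)); // You have no unread messages
console.log(formatPlural("en", 1, unread)); // You have 1 unread message
console.log(formatPlural("en", 5, unread)); // You have 5 unread messages
```

Without the `=0` branch, English would fall through to `other` and render "You have 0 unread messages".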

Complex plurals

<p>{days} days and {hours} hours remaining</p>

Becomes:

{days, plural,
  one {1 day}
  other {# days}
} and {hours, plural,
  one {1 hour}
  other {# hours}
} remaining

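Language-specific rules

ICU MessageFormat handles complex plural rules automatically:

English (2 forms):

one: 1 item
other: 0 items, 2 items, 100 items

Russian (4 forms):

one: 1, 21, 31, 41... (оди́н элеме́нт)
few: 2-4, 22-24... (два элеме́нта)
many: 0, 5-20, 25-30... (пять элеме́нтов)
other: 1.5, 2.5... (1.5 элеме́нта)

Arabic (6 forms):

zero: 0
one: 1
two: 2
few: 3-10, 103-110...
many: 11-99, 111-199...
other: 100-102, 200-202..., and fractions

The compiler generates the correct forms for each target language automatically.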
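
The categories listed above can be checked with the standard `Intl.PluralRules` API, which implements the same CLDR rules that ICU MessageFormat relies on. The `pluralCategory` wrapper is just a convenience name for this sketch, and the results assume a runtime with full ICU data (the default in current Node.js releases and browsers).

```typescript
// Thin wrapper around the standard Intl.PluralRules API (cardinal rules by default).
function pluralCategory(locale: string, n: number): string {
  return new Intl.PluralRules(locale).select(n);
}

console.log(pluralCategory("en", 1));   // one
console.log(pluralCategory("en", 0));   // other
console.log(pluralCategory("ru", 21));  // one
console.log(pluralCategory("ru", 3));   // few
console.log(pluralCategory("ru", 11));  // many
console.log(pluralCategory("ru", 1.5)); // other
console.log(pluralCategory("ar", 0));   // zero
console.log(pluralCategory("ar", 2));   // two
console.log(pluralCategory("ar", 103)); // few
console.log(pluralCategory("ar", 100)); // other
```

Note that the Russian and Arabic rules depend on the last digits of the number, not only on its size, which is why 21 is `one` in Russian and 103 is `few` in Arabic.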

Disabling pluralization

If your app does not use plural forms, or you prefer to handle them manually:

{
  pluralization: {
    enabled: false,
  }
}

This skips plural detection entirely and reduces the number of LLM calls.

Manual plural forms

You can write ICU MessageFormat directly in your code:

<p>
  {formatMessage(
    { id: "items", defaultMessage: "{count, plural, one {# item} other {# items}}" },
    { count }
  )}
</p>

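In this case the compiler does not attempt plural detection; it uses the format you wrote as-is.

Performance

Plural detection adds minimal overhead:

One extra LLM call per unique text that contains a numeric interpolation
Uses a fast model (default: Groq llama-3.1-8b-instant)
Results are cached, so subsequent builds reuse the detection results

Cost: Negligible. With the fast model, each request costs a few cents at most.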
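
The "one call per unique text" behaviour can be pictured as a lookup keyed by the source string. The sketch below is an assumption-laden illustration: `createCachedDetector` and `fakeDetect` are hypothetical, the detector is synchronous where a real LLM call would be asynchronous, and the cache lives in memory rather than being persisted between builds.

```typescript
// Hypothetical sketch; the compiler's actual cache format and detector are not described in this page.
type Detector = (text: string) => string | null;

function createCachedDetector(detect: Detector) {
  const cache = new Map<string, string | null>();
  let calls = 0;
  return {
    // Returns the cached result if this exact text was seen before.
    detect(text: string): string | null {
      if (cache.has(text)) {
        return cache.get(text) ?? null;
      }
      calls += 1; // counts how often the expensive detector actually runs
      const result = detect(text);
      cache.set(text, result);
      return result;
    },
    callCount(): number {
      return calls;
    },
  };
}

// Stand-in for the LLM: recognises one known string and returns null otherwise.
const fakeDetect: Detector = (text) =>
  text === "You have {count} items"
    ? "{count, plural, one {You have 1 item} other {You have # items}}"
    : null;

const cached = createCachedDetector(fakeDetect);
cached.detect("You have {count} items");
cached.detect("You have {count} items"); // served from the cache
console.log(cached.callCount()); // 1
```

Because the key is the source text itself, editing a string invalidates only that entry in this sketch.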

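FAQ

Does this work for all languages? Yes. ICU MessageFormat supports plural rules for more than 200 languages.

What if the AI gets the detection wrong? Use data-lingo-override to specify an exact translation with the correct plural forms: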

<p data-lingo-override={{
  es: "{count, plural, one {1 artículo} other {# artículos}}"
}}>
  {count} items
</p>

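Can I customize the plural detection model? Yes. Set pluralization.model to any supported LLM. The default model is optimized for speed and cost.

Does this increase bundle size? Slightly. ICU MessageFormat patterns are a little larger than plain strings, but the impact is minimal.

Are nested plurals supported? Yes. The compiler can handle multiple plural variables in the same string.

Do I need to mark plural forms manually? No. With pluralization.enabled: true, the compiler detects them automatically.

Debugging

To see the detected plural forms, check .lingo/metadata.json: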

{
  "translations": {
    "abc123": {
      "source": "You have {count} items",
      "pattern": "{count, plural, one {You have 1 item} other {You have # items}}",
      "locales": {
        "es": "{count, plural, one {Tienes 1 artículo} other {Tienes # artículos}}"
      }
    }
  }
}

The pattern field shows the ICU MessageFormat that the compiler detected.
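
For larger projects it can help to list every entry that received a plural pattern. The sketch below assumes the metadata has exactly the shape shown in the sample above (`translations`, `source`, `pattern`, `locales`); the real file may contain other fields, and `listPluralPatterns` is a hypothetical helper, not a Lingo.dev API.

```typescript
// Types inferred from the sample above; treat them as an assumption about the file format.
interface TranslationEntry {
  source: string;
  pattern?: string;
  locales?: { [locale: string]: string };
}

interface LingoMetadata {
  translations: { [id: string]: TranslationEntry };
}

// Returns one summary line per entry whose pattern contains an ICU plural argument.
function listPluralPatterns(metadata: LingoMetadata): string[] {
  return Object.keys(metadata.translations)
    .filter((id) => {
      const pattern = metadata.translations[id].pattern;
      return typeof pattern === "string" && pattern.indexOf(", plural,") !== -1;
    })
    .map((id) => `${id}: ${metadata.translations[id].source} -> ${metadata.translations[id].pattern}`);
}

const sample: LingoMetadata = {
  translations: {
    abc123: {
      source: "You have {count} items",
      pattern: "{count, plural, one {You have 1 item} other {You have # items}}",
    },
    def456: { source: "Welcome back" },
  },
};

console.log(listPluralPatterns(sample));
```

In a real project the input could come from `JSON.parse(fs.readFileSync(".lingo/metadata.json", "utf8"))` using Node's `fs` module instead of the inline sample.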

Next steps

Manual overrides: override plural forms by hand
Configuration reference: pluralization options
ICU MessageFormat guide: learn the ICU syntax