Compare Engines

Max Prilutskiy · Updated 1 day ago · 1 min read

Test the same content through two engine configurations to evaluate a change before committing.

The workflow

"Compare our production engine against the staging engine on these 5 strings for Japanese"

What happens:

  1. The assistant localizes the content through both engines.
  2. It presents the results in a side-by-side table.
  3. It highlights the differences, for example: "The staging engine applies the new glossary term for 'onboarding' (オンボーディング) while production still uses the descriptive localization (導入手続き)"
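The three steps above can be sketched in a few lines. This is a minimal illustration, not the assistant's actual implementation: it assumes the two engines' outputs have already been collected into plain lists, and the function names (`compare_outputs`, `render_table`) and the sample strings are hypothetical.

```python
def compare_outputs(sources, outputs_a, outputs_b):
    """Pair each source string with both engines' outputs and flag differences."""
    if not (len(sources) == len(outputs_a) == len(outputs_b)):
        raise ValueError("all three lists must have the same length")
    rows = []
    for source, a, b in zip(sources, outputs_a, outputs_b):
        rows.append({"source": source, "a": a, "b": b, "differs": a.strip() != b.strip()})
    return rows


def render_table(rows, label_a="production", label_b="staging"):
    """Render the comparison as a simple side-by-side text table."""
    lines = [
        f"| source | {label_a} | {label_b} | differs |",
        "| --- | --- | --- | --- |",
    ]
    for row in rows:
        mark = "yes" if row["differs"] else ""
        lines.append(f"| {row['source']} | {row['a']} | {row['b']} | {mark} |")
    return "\n".join(lines)


# Hypothetical outputs, as if each engine had already localized the strings.
sources = ["Complete onboarding", "Save"]
production = ["導入手続きを完了する", "保存"]
staging = ["オンボーディングを完了する", "保存"]

rows = compare_outputs(sources, production, staging)
print(render_table(rows))
```

An exact string comparison only tells you where the outputs diverge. Explaining why they diverge (a new glossary term, a different model) still takes a reviewer or the assistant's own summary.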

When to use this

  • After tuning — verify the change improved output before promoting
  • Evaluating model changes — same config, different primary model
  • Testing glossary impact — with and without new terms
  • Comparing engines for different use cases — marketing vs. technical content

Example comparisons

Before/after a tune

"Localize 'Welcome to your new workspace' to German through engine A and engine B"

Shows whether the glossary entry for "workspace" is being preserved in the updated engine.

Model evaluation

"I switched the Japanese model from GPT-4.1 to Claude Sonnet. Compare outputs for these 10 UI strings."

The side-by-side view reveals which model better handles short UI strings and which better handles longer descriptions in your specific domain.
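One way to see whether the two models diverge more on short strings or on longer ones is to bucket the comparison by source length. The sketch below is an assumption-laden illustration: the `differences_by_length` helper, the four-word threshold, and the placeholder outputs are all hypothetical, and the counts only show where the outputs differ, not which model is better.

```python
def differences_by_length(rows, short_max_words=4):
    """Count differing outputs separately for short and long source strings.

    rows: iterable of (source, output_a, output_b) tuples.
    short_max_words: sources with at most this many words count as "short"
    (an arbitrary threshold chosen for illustration).
    """
    summary = {
        "short": {"total": 0, "differs": 0},
        "long": {"total": 0, "differs": 0},
    }
    for source, out_a, out_b in rows:
        bucket = "short" if len(source.split()) <= short_max_words else "long"
        summary[bucket]["total"] += 1
        if out_a != out_b:
            summary[bucket]["differs"] += 1
    return summary


# Placeholder outputs standing in for real model results.
rows_by_model = [
    ("Save", "out-1", "out-1"),
    ("Cancel", "out-2a", "out-2b"),
    ("Your changes will be lost if you leave this page without saving.", "out-3a", "out-3b"),
]

length_summary = differences_by_length(rows_by_model)
print(length_summary)
```

Buckets with a high share of differing outputs are the ones worth reading closely when deciding between models.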

Glossary depth testing

"Compare the engine with our full 200-term glossary against a fresh engine with no glossary on these legal strings"

Quantifies how much the glossary contributes to output quality for a specific content type.
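To put a rough number on the glossary's contribution, you could measure how often the expected target term appears in each engine's output. The following is a minimal sketch under stated assumptions: `glossary_hit_rate` is a hypothetical helper, the glossary is modeled as a plain source-term to target-term mapping, and the sample strings are illustrative rather than real engine output.

```python
def glossary_hit_rate(pairs, glossary):
    """Share of applicable glossary terms whose target term appears in the output.

    pairs: list of (source, output) tuples.
    glossary: dict mapping a source term to its required target term.
    Returns None when no glossary term applies to any source string.
    """
    applicable = 0
    hits = 0
    for source, output in pairs:
        for source_term, target_term in glossary.items():
            if source_term.lower() in source.lower():
                applicable += 1
                if target_term in output:
                    hits += 1
    return hits / applicable if applicable else None


# Hypothetical glossary and outputs for illustration only.
glossary = {"onboarding": "オンボーディング"}
glossary_sources = ["Complete onboarding", "Onboarding checklist"]
with_glossary = ["オンボーディングを完了する", "オンボーディングのチェックリスト"]
without_glossary = ["導入手続きを完了する", "オンボーディングのチェックリスト"]

rate_with = glossary_hit_rate(list(zip(glossary_sources, with_glossary)), glossary)
rate_without = glossary_hit_rate(list(zip(glossary_sources, without_glossary)), glossary)
print(rate_with, rate_without)
```

Substring matching is a deliberately simple choice here. It ignores inflection and word boundaries, so treat the resulting rate as a rough signal alongside manual review rather than a quality score.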
