AI Settings
Configure the AI provider, model, and context options used by the Supervertaler for Trados plugin.
Accessing AI settings
Open the plugin Settings dialogue and switch to the AI tab.
Provider selection
Choose one of the supported AI providers:
| Provider | Description |
|---|---|
| OpenAI | GPT-5.5, GPT-5.4 Mini |
| Claude (Anthropic) | Claude Sonnet 4.6, Claude Haiku 4.5, Claude Opus 4.7 |
| Gemini (Google) | Gemini 3.1 Flash-Lite, Gemini 2.5 Pro, Gemini 3.1 Pro (Preview), Gemma 4 26B MoE |
| Grok (xAI) | Grok 4.3 |
| Mistral AI | Mistral Large, Mistral Small |
| DeepSeek | DeepSeek V4 Pro, DeepSeek V4 Flash |
| OpenRouter | Access 200+ models from all major providers with a single API key |
| Ollama (Local) | Run models locally, no API key required |
| Custom (OpenAI-compatible) | Any provider with an OpenAI-compatible API |
API key
Enter the API key for your selected provider. The key is stored locally and never sent anywhere except to the provider’s API endpoint.
Model selection
The Model dropdown shows a curated list of recommended models for the selected provider.
Model ID
Below the dropdown is an optional Model ID field. To use a model that isn’t in the curated list – a brand-new release, a preview model, or an OpenRouter router such as openrouter/free – type its exact model ID here. When filled, it overrides the dropdown selection; leave it blank to use the model picked from the dropdown.
The field is available for every cloud provider. If you reopen Settings and the saved model isn’t in the curated list, it is displayed in the Model ID field instead.
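If it helps to see the precedence rule spelled out, here is a minimal sketch (the function and field names are hypothetical; the plugin’s internals may differ):

```python
# Hypothetical sketch of the Model ID override rule described above.
def resolve_model(dropdown_choice: str, model_id_field: str) -> str:
    """A non-empty Model ID field wins over the dropdown selection."""
    return model_id_field.strip() or dropdown_choice

assert resolve_model("gpt-5.4-mini", "") == "gpt-5.4-mini"                     # dropdown used
assert resolve_model("gpt-5.4-mini", "openrouter/free") == "openrouter/free"  # override wins
```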
Ollama endpoint
When using Ollama as the provider, this field sets the local endpoint URL. Defaults to:
http://localhost:11434
Change this only if you are running Ollama on a different port or on a remote machine.
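To confirm the endpoint is reachable before pointing the plugin at it, you can query Ollama’s model-list route from outside the plugin. A minimal Python sketch (the route is part of Ollama’s standard API):

```python
import requests

endpoint = "http://localhost:11434"  # or your remote machine / custom port

# GET /api/tags lists the models installed on the Ollama server.
resp = requests.get(f"{endpoint}/api/tags", timeout=5)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])
```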
DeepSeek
DeepSeek is a Chinese AI lab offering high-quality models with competitive pricing.
To use DeepSeek directly:
- Create an account at platform.deepseek.com
- Go to API Keys and create a key
- In Supervertaler, select DeepSeek as the provider and paste your key
DeepSeek models are also available via OpenRouter if you prefer a single-key setup.
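Because DeepSeek exposes an OpenAI-compatible API, you can sanity-check your key with a direct request. A minimal sketch – the model ID below is a placeholder, so substitute the ID shown in your DeepSeek dashboard:

```python
import requests

resp = requests.post(
    "https://api.deepseek.com/chat/completions",
    headers={"Authorization": "Bearer YOUR_DEEPSEEK_KEY"},
    json={
        "model": "your-deepseek-model-id",  # placeholder; use the ID from your dashboard
        "messages": [{"role": "user", "content": "Translate 'hello' into Dutch."}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```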
Custom OpenAI-compatible provider
For providers that expose an OpenAI-compatible API (e.g., Azure OpenAI, together.ai, internal LLM gateways, local inference servers), configure these fields:
| Field | Description |
|---|---|
| Endpoint | The base URL for the API (e.g., https://your-server.com/v1) |
| Model | The model identifier to use (e.g., llama-3-70b) |
| API Key | The authentication key for this endpoint |
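The three fields map directly onto a standard OpenAI-style chat completions request. A minimal sketch using the placeholder values from the table above:

```python
import requests

endpoint = "https://your-server.com/v1"  # Endpoint field
model = "llama-3-70b"                    # Model field
api_key = "YOUR_API_KEY"                 # API Key field

resp = requests.post(
    f"{endpoint}/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": model, "messages": [{"role": "user", "content": "ping"}]},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```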
Managing multiple endpoints
You can configure more than one custom endpoint and switch between them without re-entering credentials. Each endpoint is stored as a named profile in the Profile dropdown.
| Button | Action |
|---|---|
| + | Add a new endpoint (starts as “New Endpoint 1”, “New Endpoint 2”, etc.) |
| − | Remove the currently selected endpoint |
| ✎ | Rename the currently selected endpoint |
Names are free-form labels – use whatever makes sense for your workflow (e.g. Azure Production, Internal gateway – Mistral Large, Local Ollama). Names must be unique within the list. Renaming is a UI-only change: the endpoint URL, model, and API key all stay attached to the same profile.
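Conceptually, each profile bundles the three connection fields under a label – a rough sketch (hypothetical structure; the plugin stores these internally):

```python
from dataclasses import dataclass

@dataclass
class EndpointProfile:
    name: str      # free-form label, unique within the list
    endpoint: str  # base URL, e.g. "https://your-server.com/v1"
    model: str     # model identifier, e.g. "llama-3-70b"
    api_key: str   # stays attached to the profile when it is renamed
```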
OpenRouter
OpenRouter is an API gateway that gives you access to 200+ models from OpenAI, Anthropic, Google, Mistral, Meta, and many others – all through a single API key. Instead of managing separate keys for each provider, you sign up once at OpenRouter and use one key for everything.
Getting started
- Create a free account at openrouter.ai
- Go to Keys and create an API key
- In Supervertaler, select OpenRouter as the provider and paste your key
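Once the key is in place, every model goes through the same OpenAI-compatible endpoint. A minimal sketch of a direct OpenRouter request, using a model ID mentioned later on this page:

```python
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_OPENROUTER_KEY"},
    json={
        "model": "meta-llama/llama-3.1-70b-instruct",  # any ID from openrouter.ai/models
        "messages": [{"role": "user", "content": "Translate 'good morning' into German."}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```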
Curated model list
The model dropdown includes a curated selection of the best models for translation:
| Model | Description |
|---|---|
| Claude Sonnet 4.6 | Recommended – best balance of speed, quality, and cost |
| Claude Opus 4.7 | Highest quality – Anthropic’s most capable model, 1M context |
| GPT-5.5 | Premium quality – OpenAI’s most advanced model |
| GPT-5.4 Mini | Fast, affordable, and high quality for everyday translation |
| Gemini 3.1 Pro | Google’s most advanced model, large context |
| Gemini 3 Flash | Fast and affordable – great for large batch jobs |
| Gemma 4 31B | Open-source – strong multilingual quality, 256K context |
| Gemma 4 26B MoE | Open-source – near-31B quality at a fraction of the cost |
| Mistral Small 4 | Very fast and cheap – good multilingual support |
| Qwen 3.6 Plus (Free) | Free – no API costs, good general-purpose quality |
| DeepSeek V4 Pro | DeepSeek flagship – strong multilingual, competitive pricing |
| DeepSeek V4 Flash | DeepSeek fast – great for high-volume translation |
Using any OpenRouter model
OpenRouter exposes far more models than the curated list above. To use one that isn’t listed, type its exact model ID into the Model ID field (see Model selection) – for example, meta-llama/llama-3.1-70b-instruct, deepseek/deepseek-r1, or a router such as openrouter/free. Browse all available models at openrouter.ai/models.
Pricing
OpenRouter adds a 5.5% platform fee on top of the underlying provider’s token price. For example, if Claude Sonnet 4.6 costs $3/$15 per million tokens at Anthropic, it costs approximately $3.17/$15.83 through OpenRouter. For a typical 5,000-word translation costing $0.50, the OpenRouter fee adds less than 3 cents.
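Spelled out, with the fee rate taken from this page:

```python
FEE = 0.055  # OpenRouter platform fee

input_price, output_price = 3.00, 15.00  # $/1M tokens at Anthropic
print(input_price * (1 + FEE))           # 3.165  -> ~$3.17
print(output_price * (1 + FEE))          # 15.825 -> ~$15.83

job_cost = 0.50                          # typical 5,000-word translation
print(round(job_cost * FEE, 4))          # 0.0275 -> under 3 cents
```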
AI context options
These options control what additional context is included in AI prompts. The settings are split into two groups depending on which features they apply to.
Which settings apply where
| Setting | Chat & QuickLauncher | Batch Operations |
|---|---|---|
| Termbases in AI prompts | Yes | Yes |
| Include full document content | Yes | Yes |
| Max segments | Yes | Yes |
| Include term definitions and domains | Yes | Yes |
| Log prompts to Reports | Yes | Yes |
| Include TM matches | Yes | AutoPrompt only |
| Surrounding segments | Yes | No |
AI context (Batch operations, Chat and QuickLauncher)
These settings apply to all AI features – Chat, QuickLauncher, Batch Translate, and Batch Proofread.
Include full document content
When enabled, all source segments in the current document are sent to the AI so it can determine the document type (legal, medical, technical, marketing, etc.) and provide context-appropriate assistance. This uses more tokens but greatly improves response quality – the AI can tailor its terminology and style to the specific type of document you are translating.
For very large documents, the content is automatically truncated to the configured maximum. The truncation preserves both the beginning and the end of the document: roughly the first 80% of the segment budget comes from the start and the remaining 20% from the end.
For Batch Operations, the document content is included once in the system prompt (shared across all batches), so the AI knows what kind of document it is translating even when processing individual batches of segments.
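A minimal sketch of the truncation rule, assuming the 80/20 split applies to the configured segment budget (the plugin’s exact implementation may differ):

```python
def truncate_segments(segments: list[str], max_segments: int = 500) -> list[str]:
    """Keep the document's opening and ending when it exceeds the budget."""
    if len(segments) <= max_segments:
        return segments
    head = int(max_segments * 0.8)  # first 80% of the budget from the start
    tail = max_segments - head      # remaining 20% from the end
    return segments[:head] + segments[-tail:]
```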
Max segments
The maximum number of source segments to include in the AI prompt when document content is enabled. Default: 500. Range: 100–2000.
Increase this for very large documents where you want the AI to see more content. Decrease it if you want to reduce token usage.
Include term definitions and domains
When enabled, term definitions, domains, and usage notes from your termbases are included alongside matched terminology in the AI prompt. This gives the AI deeper understanding of your terminology – for example, knowing that a term belongs to the legal domain or has a specific definition helps the AI use it correctly in both chat responses and batch translations.
Include termbases in AI prompt
Select which termbases are included in AI prompts. Terminology matches from enabled termbases are injected into the prompt to help the AI use the correct, approved terminology.
For AutoPrompt, TermScan automatically filters the termbase to only terms that appear in the document’s source text, keeping the prompt focused and within token limits.
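The idea behind TermScan, as a rough sketch (names are hypothetical; the real matcher is more sophisticated than plain substring matching):

```python
def termscan_filter(termbase: dict[str, str], source_text: str) -> dict[str, str]:
    """Keep only entries whose source term occurs in the document."""
    text = source_text.lower()
    return {src: tgt for src, tgt in termbase.items() if src.lower() in text}

terms = {"tenancy agreement": "huurovereenkomst", "easement": "erfdienstbaarheid"}
print(termscan_filter(terms, "The tenancy agreement was signed in May."))
# {'tenancy agreement': 'huurovereenkomst'}
```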
AI context (mostly Chat and QuickLauncher)
These settings apply primarily to the Supervertaler Assistant chat window and QuickLauncher prompts. The exception is Include TM matches, which also feeds AutoPrompt – see the per-setting notes below.
Include TM matches
The behaviour of this checkbox depends on which feature is asking for context:
- Chat and QuickLauncher (live TM lookups). When enabled, the AI gets translation memory matches – fuzzy and exact – for the active segment. This gives the AI reference translations from your project TMs to improve consistency.
- AutoPrompt (Batch Operations). When enabled, AutoPrompt samples up to 50 already-translated, human-confirmed segment pairs evenly from the active document and includes them in the meta-prompt as in-project reference translations (see the sketch after this list). This includes 100% / exact matches that have been applied and confirmed, fuzzy-and-edited segments, and segments translated from scratch – any segment with a Translated, Approved, or Signed-off confirmation level qualifies. AutoPrompt does not do live TM lookups; it samples confirmed segments straight from the document.
- Other Batch Operations (Translate, Proofread). Unaffected by this checkbox – they always work segment-by-segment without TM reference pairs, regardless of how it’s set.
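A minimal sketch of the even sampling described above (hypothetical types; the plugin’s selection logic may differ):

```python
def sample_reference_pairs(confirmed_pairs: list[tuple[str, str]],
                           limit: int = 50) -> list[tuple[str, str]]:
    """Pick up to `limit` pairs spread evenly across the document."""
    if len(confirmed_pairs) <= limit:
        return confirmed_pairs
    step = len(confirmed_pairs) / limit
    return [confirmed_pairs[int(i * step)] for i in range(limit)]
```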
Surrounding segments
The number of segments before and after the active segment to include as context. Default: 5 (five segments on each side). Range: 1–20.
This provides the AI with local context around the segment you are working on. It is also used for the {{SURROUNDING_SEGMENTS}} variable in QuickLauncher prompts.
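A minimal sketch of the windowing (whether the active segment itself is repeated in the window is an implementation detail; here it is excluded because it is sent separately):

```python
def surrounding_segments(segments: list[str], active: int, n: int = 5) -> list[str]:
    """Return up to n segments on each side of the active segment."""
    start = max(0, active - n)
    end = min(len(segments), active + n + 1)
    return segments[start:active] + segments[active + 1:end]
```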
QuickLauncher prompts go to
Chooses where Ctrl+Q QuickLauncher prompts run and where the response appears.
| Option | Where the prompt and response appear |
|---|---|
| In-Trados AI Assistant (default) | The Supervertaler Assistant chat panel in Trados Studio. Same behaviour as before this setting existed. |
| Workbench Sidekick | Supervertaler Workbench’s floating Sidekick Chat. The window pops to the front and maximises to the screen it’s on, the prompt is echoed into the chat, and the AI’s response appears there instead of in Trados. |
The Workbench-Sidekick option is for users who want the bigger reading area Sidekick provides for long explanations, or who prefer to keep all their AI chat history in one product rather than split between Trados and Workbench.
If the option is set to Workbench Sidekick but Workbench isn’t running (or the Sidekick Bridge isn’t reachable for any reason), the QuickLauncher silently falls back to the in-Trados Assistant – a missing Workbench never blocks a prompt.
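The routing rule, as a self-contained sketch (all names are hypothetical; the real bridge protocol is internal to the plugin):

```python
def sidekick_reachable() -> bool:
    return False  # stub: pretend Workbench is not running

def route_prompt(target: str) -> str:
    """Return where a Ctrl+Q prompt actually runs."""
    if target == "Workbench Sidekick" and sidekick_reachable():
        return "Sidekick Chat"
    return "In-Trados AI Assistant"  # silent fallback; never blocks a prompt

print(route_prompt("Workbench Sidekick"))  # In-Trados AI Assistant
```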
SuperMemory context
These two toggles control whether SuperMemory knowledge base articles are included in the AI context.
Include memory bank in AI context
When enabled, the AI loads client profiles, domain knowledge, style guides, and terminology reasoning from the active memory bank before every translation and chat message. This gives the AI the reasoning behind your terminology decisions, not just the terms themselves.
Use memory bank when generating prompts (AutoPrompt)
When enabled, SuperMemory articles are included in the AutoPrompt meta-prompt so that generated translation prompts reflect your established client conventions, terminology reasoning, and style guides. Only effective when “Include memory bank in AI context” is also enabled.
Prompt logging
Log prompts and responses to Reports tab
When enabled, AI operations are logged to the Reports tab in the Supervertaler Assistant panel. Each log entry shows:
- The feature and prompt name (e.g. “QuickLauncher · Explain in Context”)
- The model used, estimated token counts, cost, and duration
- Expandable sections for the system prompt, messages, and response
Click “Show system prompt…”, “Show messages…”, or “Show response…” to expand a section. Press Escape to collapse it. Use Copy to copy a single section, or Copy all to copy the full prompt details to your clipboard.
This is useful for:
- Monitoring costs – see exactly how many tokens each operation uses
- Debugging prompts – inspect the full text sent to the AI to understand its behaviour
- Comparing models – run the same prompt with different models and compare token usage
Batch settings
Configure the batch size for the Batch Translate feature. This determines how many segments are sent to the AI provider in a single request (see the sketch after the list below).
- A larger batch size is faster but uses more tokens per request
- A smaller batch size is more granular and easier to review
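As an illustration of the trade-off, splitting a document into batches of the configured size (illustrative only):

```python
def make_batches(segments: list[str], batch_size: int) -> list[list[str]]:
    """Split segments into consecutive chunks of at most batch_size."""
    return [segments[i:i + batch_size] for i in range(0, len(segments), batch_size)]

doc = [f"Segment {i}" for i in range(1, 11)]
print([len(b) for b in make_batches(doc, 4)])  # [4, 4, 2] -> 3 requests
```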