AI Subtitle Translation¶
WhisperJAV can translate Japanese subtitles to other languages using AI language models. Translation works as a standalone tool or integrated into the transcription pipeline.
Supported Providers¶
| Provider | Type | API Key Required | Best For |
|---|---|---|---|
| Ollama | Local | No | Privacy, no cost, easy setup. Recommended for local use. |
| Local LLM | Local | No | Legacy local option (llama-cpp). Consider Ollama instead. |
| DeepSeek | Cloud | Yes | Cost-effective, good CJK quality |
| Gemini | Cloud | Yes | Good multilingual support |
| Claude | Cloud | Yes | High quality |
| GPT | Cloud | Yes | Widely available |
| OpenRouter | Cloud | Yes | Access to many models |
| GLM | Cloud | Yes | Chinese-related tasks |
| Groq | Cloud | Yes | Fast inference |
| Custom | Cloud | Varies | Any OpenAI-compatible endpoint |
Setting Up a Provider¶
Cloud Providers¶
- Get an API key from the provider's website
- In the GUI, select the provider from the dropdown
- Enter your API key in the field
- Click Test Connection to verify
Tip
API keys are saved locally and never sent anywhere except the provider's API endpoint.
Ollama (Recommended for Local)¶
Ollama is the easiest way to run local translation. Install Ollama, then:
```bash
# CLI: translate with Ollama (auto-detects GPU, picks best model for your VRAM)
whisperjav-translate -i subtitles.srt --provider ollama

# Use a specific model
whisperjav-translate -i subtitles.srt --provider ollama --model gemma3:12b

# List locally available Ollama models
whisperjav --list-ollama-models
```
OllamaManager auto-starts the server, detects your GPU, and recommends a model:
| VRAM | Recommended Model |
|---|---|
| CPU only | qwen2.5:3b |
| 8 GB | qwen2.5:7b |
| 12 GB | gemma3:12b |
| 16 GB+ | qwen2.5:14b |
Local LLM (Legacy)¶
The Local provider runs a llama-cpp server on your machine. No API key is needed, but it requires:
- A GPU with ~8 GB VRAM
- The `[llm]` extra installed (`pip install whisperjav[llm]`)
- A GGUF model file (downloaded automatically on first use)
Note
Consider switching to Ollama — it's easier to set up, more reliable, and supports more models.
Translation Tone¶
| Tone | Description |
|---|---|
| Standard | Clean, natural translations suitable for general audiences |
| Adult-Explicit | Specialized instructions tuned for JAV dialogue with appropriate vocabulary |
Two Ways to Translate¶
Method 1: Translate During Transcription¶
Transcribe and translate in one workflow:
- Set up your transcription on the Ensemble tab
- Check "AI-translate" after the merge strategy
- Select provider and model
- Click Start
Translation runs automatically after transcription completes.
Method 2: Translate an Existing SRT¶
Use the standalone translation tab:
- Go to AI SRT Translate (Tab 4)
- Add your `.srt` file
- Configure provider, model, target language, and tone
- Click Start
Advanced Settings¶
| Setting | Default | What It Does |
|---|---|---|
| Movie Title | (empty) | Gives the AI context about the content |
| Actress Names | (empty) | Helps the AI handle character names correctly |
| Plot Summary | (empty) | Additional context for better translation |
| Scene Threshold | 60 sec | Groups subtitles into scenes for batch processing |
| Max Batch Size | 30 | Subtitles per API call (lower = fewer token issues) |
| Max Retries | 3 | Retry count for failed API calls |
Improving Translation Quality
Filling in the Movie Title and Actress Names fields significantly improves translation quality. The AI uses this context to make better word choices and handle names consistently.
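The Scene Threshold setting above can be illustrated with a short sketch: consecutive subtitles stay in the same scene until the gap between them exceeds the threshold. This is an illustrative reimplementation, not WhisperJAV's actual code; the function name is hypothetical and the 60-second default comes from the table above.

```python
# Illustrative sketch of scene grouping for batch translation: a new scene
# starts whenever the gap to the previous subtitle exceeds the threshold.
def group_into_scenes(start_times: list[float], threshold: float = 60.0) -> list[list[float]]:
    scenes: list[list[float]] = []
    for t in start_times:
        if scenes and t - scenes[-1][-1] <= threshold:
            scenes[-1].append(t)   # within the gap: same scene
        else:
            scenes.append([t])     # gap too large: start a new scene
    return scenes
```

For example, `group_into_scenes([0, 5, 90, 95])` returns `[[0, 5], [90, 95]]`: the 85-second gap between 5 and 90 starts a new scene.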
Batch Size Tuning for Local LLMs¶
Local LLMs have limited context windows compared to cloud APIs. If you see errors like "Hit API token limit" or "No matches found in translation text", your batch size is too large for your model's context window.
WhisperJAV auto-adjusts the batch size based on context window size, but you can also set it manually:
```bash
# CLI: set batch size explicitly
whisperjav-translate -i subtitles.srt --provider local --max-batch-size 10

# Or configure it persistently
whisperjav-translate --configure
# When prompted for "Max batch size", enter your preferred value
```
Recommended batch sizes by model context window:
| Context Window | Auto-Cap | Recommended Manual | Notes |
|---|---|---|---|
| 8K (8192) | 11 | 8–12 | gemma-9b, small models |
| 16K (16384) | 27 | 20–27 | Most mid-range models |
| 32K+ | 30 | 30 | Large context models, cloud APIs |
Note
The default batch size of 30 is designed for cloud APIs with 128K+ context windows. For local models, the auto-cap handles most cases automatically. Only set it manually if you still see token limit errors.
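One formula consistent with the Auto-Cap column above is linear in the context window and clamped to the cloud default of 30. This is a reconstruction that happens to match the table at 8K, 16K, and 32K; it is not WhisperJAV's actual formula, and the constants are assumptions.

```python
# Hypothetical reconstruction of the batch-size auto-cap: roughly 2 subtitles
# per 1K tokens of context, minus a fixed overhead, clamped to the default.
# Matches the table above at 8K/16K/32K, but is NOT WhisperJAV's real code.
DEFAULT_BATCH_SIZE = 30

def auto_batch_cap(context_tokens: int) -> int:
    return min(DEFAULT_BATCH_SIZE, context_tokens * 2 // 1024 - 5)
```

For example, `auto_batch_cap(8192)` gives 11 and `auto_batch_cap(32768)` hits the clamp at 30, matching the table.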
CLI Translation¶
```bash
# Translate with Ollama (local, recommended)
whisperjav-translate -i subtitles.srt --provider ollama

# Translate with DeepSeek (cloud)
whisperjav-translate -i subtitles.srt --provider deepseek --api-key YOUR_KEY

# Translate with adult tone
whisperjav-translate -i subtitles.srt --provider gemini --tone adult

# Translate to Portuguese
whisperjav-translate -i subtitles.srt --target-language portuguese

# Translate to Chinese
whisperjav-translate -i subtitles.srt --target-language Chinese

# Local LLM with reduced batch size (for 8K context models)
whisperjav-translate -i subtitles.srt --provider local --max-batch-size 10
```