Ensemble Mode
Ensemble mode runs two different ASR pipelines on the same video and merges their output for higher accuracy. This is WhisperJAV's most powerful transcription mode.
How It Works
Each pipeline has different strengths. For example:
- Whisper (Balanced): excellent timing, good text
- Qwen3-ASR: excellent text quality, different timing approach
Merging combines the best of both.
Setting Up Ensemble
- Go to the Ensemble tab (Tab 3)
- Pass 1 is always active — configure pipeline, sensitivity, and options
- Enable Pass 2 by checking the checkbox
- Configure Pass 2 with a different pipeline
- Choose a Merge Strategy
- Click Start
Pass Configuration
Both passes expose the same controls:
| Control | Description |
|---|---|
| Pipeline | ASR backend (Balanced, Fast, Faster, Qwen3-ASR, ChronosJAV, etc.) |
| Sensitivity | Detection threshold (Aggressive, Balanced, Conservative) |
| Scene Detector | How to split audio into scenes (Auditok, Silero, Semantic, None) |
| Speech Enhancer | Audio preprocessing (None, FFmpeg DSP, ClearVoice, BS-RoFormer) |
| Speech Segmenter | Voice activity detection within scenes (Silero, TEN, None) |
| Model | Which model to use (pipeline-dependent) |
Click Customize on any pass for fine-grained parameter control.
Merge Strategies
| Strategy | Best For |
|---|---|
| Pass 1 Primary | When Pass 1 is your trusted baseline — fills gaps from Pass 2 |
| Smart Merge | General use — selects the best subtitle from each pass using quality heuristics |
| Full Merge | Maximum coverage — combines all subtitles, resolves overlaps |
| Longest | Picks the longer (more detailed) subtitle when passes overlap |
| Pass 2 Primary | When Pass 2 is your trusted baseline |
| Overlap 30% | Conservative merge — requires 30% time overlap before merging |
Recommended Combo
Balanced (Pass 1) + Qwen3-ASR (Pass 2) + Smart Merge is a strong default for most content.
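To make the table above more concrete, here is a minimal sketch of two of the simpler strategies, Pass 1 Primary and Longest. It is not WhisperJAV's actual merge code: the `Sub` type, the function names, and the definition of overlap (shared time divided by the shorter cue's duration) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sub:
    """One subtitle cue, times in seconds. Hypothetical type for illustration."""
    start: float
    end: float
    text: str


def overlap_ratio(a: Sub, b: Sub) -> float:
    """Shared time divided by the shorter cue's duration (an assumed definition)."""
    shared = min(a.end, b.end) - max(a.start, b.start)
    shorter = min(a.end - a.start, b.end - b.start)
    return max(shared, 0.0) / shorter if shorter > 0 else 0.0


def pass1_primary(pass1: list[Sub], pass2: list[Sub]) -> list[Sub]:
    """Keep every Pass 1 cue; add a Pass 2 cue only where Pass 1 has a gap."""
    fillers = [b for b in pass2
               if not any(overlap_ratio(a, b) > 0 for a in pass1)]
    return sorted(pass1 + fillers, key=lambda s: s.start)


def longest(pass1: list[Sub], pass2: list[Sub],
            min_overlap: float = 0.0) -> list[Sub]:
    """Where cues overlap by more than `min_overlap`, keep the longer text."""
    merged: list[Sub] = []
    used: set[int] = set()  # indices of Pass 2 cues already matched
    for a in pass1:
        best = a
        for i, b in enumerate(pass2):
            if i not in used and overlap_ratio(a, b) > min_overlap:
                used.add(i)
                if len(b.text) > len(best.text):
                    best = b
        merged.append(best)
    # Pass 2 cues that matched nothing are kept as they are.
    merged += [b for i, b in enumerate(pass2) if i not in used]
    return sorted(merged, key=lambda s: s.start)


p1 = [Sub(0.0, 2.0, "hello"), Sub(5.0, 7.0, "ok")]
p2 = [Sub(0.5, 2.5, "hello there"), Sub(10.0, 12.0, "bye")]
primary_texts = [s.text for s in pass1_primary(p1, p2)]
longest_texts = [s.text for s in longest(p1, p2)]
```

Passing `min_overlap=0.3` to `longest` gives a rough feel for a threshold-based strategy such as Overlap 30%, although the real strategy may measure overlap differently.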
Serial vs Parallel Batch Mode
When processing multiple files in ensemble mode:
| Mode | Behavior |
|---|---|
| Parallel (default) | All Pass 1 jobs run first, then all Pass 2, then all merges |
| Serial | Each file completes fully (Pass 1 → Pass 2 → Merge) before the next starts |
Serial mode is useful when you want each file's finished subtitles as soon as that file completes, rather than waiting for the whole batch. Enable it with the Serial checkbox in the GUI or with --ensemble-serial on the command line.
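The difference between the two modes comes down to the order in which stages run. The sketch below models only that ordering, based on the table above; the `job_order` function and the stage labels are hypothetical, and it says nothing about how WhisperJAV actually schedules the work.

```python
def job_order(files: list[str], serial: bool) -> list[tuple[str, str]]:
    """Return (file, stage) pairs in the order they would run."""
    stages = ["pass1", "pass2", "merge"]
    if serial:
        # Finish every stage for one file before moving to the next file.
        return [(f, s) for f in files for s in stages]
    # Default batch order: run one stage across all files, then the next stage.
    return [(f, s) for s in stages for f in files]


batch = ["a.mp4", "b.mp4"]
serial_order = job_order(batch, serial=True)
default_order = job_order(batch, serial=False)
```

In `serial_order` the merge for `a.mp4` is the third job, whereas in `default_order` it is the fifth, which is why serial mode produces the first finished file sooner.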
Presets
Save your ensemble configuration for reuse:
- Configure your passes, merge strategy, and parameters
- Click Save Preset
- Give it a name (e.g., "High Quality JAV", "Quick Anime")
- Load presets later from the preset dropdown
Presets save all pass configurations, merge strategy, and custom parameters. They persist across sessions.
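As a rough picture of what a preset has to capture, the sketch below stores one as JSON and reads it back. The file layout, the key names, and the `save_preset` and `load_preset` helpers are assumptions for illustration only; the format WhisperJAV really uses may differ.

```python
import json
import tempfile
from pathlib import Path


def save_preset(path: Path, name: str, config: dict) -> None:
    """Add or overwrite one named preset in a JSON file (hypothetical format)."""
    presets = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    presets[name] = config
    path.write_text(json.dumps(presets, indent=2), encoding="utf-8")


def load_preset(path: Path, name: str) -> dict:
    """Read one named preset back from the JSON file."""
    return json.loads(path.read_text(encoding="utf-8"))[name]


# Key names below are illustrative; the values come from the controls described above.
preset = {
    "pass1": {"pipeline": "Balanced", "sensitivity": "Balanced"},
    "pass2": {"pipeline": "Qwen3-ASR", "sensitivity": "Balanced"},
    "merge_strategy": "Smart Merge",
    "custom_parameters": {},
}

with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "presets.json"
    save_preset(store, "High Quality JAV", preset)
    restored = load_preset(store, "High Quality JAV")
```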
Inline Translation
Check the "AI-translate" box, located after the merge strategy selector, to translate the merged output automatically. Select your provider and model inline, or click the settings button for full configuration.