WhisperJAV Linux Installation Guide¶
Version: 1.8.3
Last Updated: 2026-02-10
Platforms: Ubuntu, Debian, Fedora, RHEL, Arch Linux, Google Colab, Kaggle
Table of Contents¶
- System Requirements
- Prerequisites
  - Ubuntu / Debian
  - Fedora / RHEL / CentOS Stream
  - Arch Linux / Manjaro
- NVIDIA Driver and CUDA Setup
- Installation Methods
  - Method 1: Source Installation (Recommended)
  - Method 2: pip Install with Extras
  - Method 3: Conda Environment
- GPU Verification
- Installing Specific Extras
- Headless Server Setup
- Google Colab Setup
- Kaggle Setup
- Running the Application
- Systemd Service Setup
- Troubleshooting
- Performance Tuning
- Uninstallation
System Requirements¶
Hardware¶
| Component | Minimum | Recommended | Qwen3-ASR |
|---|---|---|---|
| CPU | 4 cores (x86_64) | 8+ cores | 8+ cores |
| RAM | 8 GB | 16 GB | 32 GB |
| GPU VRAM | 4 GB (basic) | 8 GB | 16+ GB |
| Disk Space | 15 GB (install) | 50 GB (install + models + temp) | 50+ GB |
| Network | Required for install | Broadband for model downloads | 3-10 GB model downloads |
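The disk-space figures above can be checked before starting a long install. The following is a minimal sketch, not part of the WhisperJAV installer: the function name `has_free_gb` is hypothetical, and it relies only on the POSIX `df -P` output format.

```shell
# has_free_gb PATH MIN_GB  (hypothetical helper, not shipped with WhisperJAV)
# Succeeds (exit 0) when the filesystem holding PATH has at least MIN_GB free.
has_free_gb() {
    local path=$1 min_gb=$2 avail_kb
    # df -P prints one line per filesystem in POSIX format; -k reports 1 KiB blocks.
    # Column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example: warn if the home directory has less than the 15 GB install minimum.
#   has_free_gb "$HOME" 15 || echo "Less than 15 GB free in $HOME"
```

The 15 GB threshold in the example comes from the table; adjust it to 50 if you also plan to store models and temporary files on the same filesystem.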
Supported GPUs¶
| GPU Family | VRAM | Recommended Mode | Notes |
|---|---|---|---|
| RTX 4090/4080/4070 | 12-24 GB | All modes, Qwen3-ASR | Best performance |
| RTX 3090/3080/3070 | 8-24 GB | All modes, Qwen3-ASR | Excellent |
| RTX 3060/3050 | 6-12 GB | Balanced, Fast | Qwen possible with 12 GB |
| RTX 2080/2070/2060 | 6-11 GB | Balanced, Fast | Good |
| GTX 1080 Ti/1070 | 8-11 GB | Balanced, Fast | Adequate |
| Tesla V100/A100 | 16-80 GB | All modes | Data center GPUs |
| No GPU (CPU only) | N/A | Faster mode only | 10-50x slower |
Software¶
| Component | Requirement | Notes |
|---|---|---|
| Linux Kernel | 4.15+ | 5.4+ recommended for modern NVIDIA drivers |
| Python | 3.10, 3.11, or 3.12 | 3.9 and 3.13+ are NOT supported |
| NVIDIA Driver | 450+ (cu118) or 570+ (cu128) | Required for GPU acceleration |
| FFmpeg | 4.0+ | Required for audio/video processing |
| Git | 2.0+ | Required for installing packages from GitHub |
| GCC / build-essential | Any recent version | Required for compiled extensions |
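The Python requirement in the table is a closed range (3.10 through 3.12), which is easy to get wrong on distributions that ship 3.9 or 3.13. As an illustration only, the hypothetical helper below checks a version string against that range; it is a sketch and not how the installer itself performs the check.

```shell
# python_version_supported "MAJOR.MINOR[.PATCH]"  (hypothetical helper)
# Succeeds when the version falls in the supported range 3.10 to 3.12.
python_version_supported() {
    local major minor
    major=${1%%.*}        # text before the first dot
    minor=${1#*.}         # text after the first dot
    minor=${minor%%.*}    # drop the patch level, if present
    [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 12 ]
}

# Example:
#   ver=$(python3 -c 'import platform; print(platform.python_version())')
#   python_version_supported "$ver" || echo "Python $ver is not supported"
```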
Prerequisites¶
Install the following system packages BEFORE running the WhisperJAV installer. These are system-level libraries that cannot be installed by pip.
Ubuntu / Debian¶
# Update package lists
sudo apt-get update
# Essential: Python, build tools, FFmpeg, Git
sudo apt-get install -y \
python3 \
python3-pip \
python3-venv \
python3-dev \
build-essential \
ffmpeg \
git
# Audio processing libraries
sudo apt-get install -y \
libsndfile1 \
libsndfile1-dev
# Optional: For TEN VAD native library
sudo apt-get install -y libc++1 libc++abi1
# Optional: For PyAudio/auditok (microphone input)
sudo apt-get install -y portaudio19-dev
# Optional: For GUI (whisperjav-gui)
sudo apt-get install -y \
libwebkit2gtk-4.0-dev \
libgtk-3-dev \
gir1.2-webkit2-4.0
Ubuntu 20.04 (Focal) users: The default Python is 3.8, which is too old. Install Python 3.10+ from the deadsnakes PPA:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install -y python3.11 python3.11-venv python3.11-dev
# Use python3.11 instead of python3 in all subsequent commands
Fedora / RHEL / CentOS Stream¶
# Essential: Python, build tools, FFmpeg, Git
sudo dnf install -y \
python3 \
python3-pip \
python3-devel \
gcc \
gcc-c++ \
ffmpeg \
git
# Audio processing libraries
sudo dnf install -y libsndfile libsndfile-devel
# Optional: For PyAudio/auditok
sudo dnf install -y portaudio-devel
# Optional: For GUI
sudo dnf install -y \
webkit2gtk4.0-devel \
gtk3-devel
RHEL/CentOS: FFmpeg is not in the default repos. Enable RPM Fusion first:
# RHEL 9 / CentOS Stream 9
sudo dnf install -y \
https://mirrors.rpmfusion.org/free/el/rpmfusion-free-release-9.noarch.rpm \
https://mirrors.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-9.noarch.rpm
sudo dnf install -y ffmpeg
Arch Linux / Manjaro¶
# Essential: Python, build tools, FFmpeg, Git
sudo pacman -S --noconfirm \
python \
python-pip \
base-devel \
ffmpeg \
git
# Audio processing libraries
sudo pacman -S --noconfirm libsndfile
# Optional: For PyAudio/auditok
sudo pacman -S --noconfirm portaudio
# Optional: For GUI
sudo pacman -S --noconfirm webkit2gtk gtk3
NVIDIA Driver and CUDA Setup¶
WhisperJAV uses PyTorch for GPU inference. You need NVIDIA drivers but do NOT need to install the CUDA Toolkit separately -- PyTorch bundles its own CUDA runtime.
Check Current Driver¶
Run nvidia-smi to see the installed driver. If nvidia-smi is not found, you need to install NVIDIA drivers.
Install NVIDIA Drivers¶
Ubuntu / Debian:
# Method 1: Ubuntu's recommended driver tool (easiest)
sudo ubuntu-drivers autoinstall
sudo reboot
# Method 2: Specific driver version
sudo apt-get install -y nvidia-driver-570
sudo reboot
Fedora:
# Enable RPM Fusion repos first (see above), then:
sudo dnf install -y akmod-nvidia xorg-x11-drv-nvidia-cuda
sudo reboot
Arch Linux:
Verify Driver Version¶
After installation and reboot, run nvidia-smi again and look for the driver version in the output. This determines which CUDA version PyTorch will use:
| Driver Version | CUDA Support | Notes |
|---|---|---|
| 570+ | CUDA 12.8 | Best performance (default) |
| 450-569 | CUDA 11.8 | Universal fallback |
| < 450 | None | CPU only (update drivers!) |
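The table above amounts to a simple threshold rule on the driver's major version. The sketch below encodes that rule in a hypothetical function, `select_torch_index`, which returns the wheel index suffix used elsewhere in this guide (cu128, cu118, or cpu). It is an illustration of the mapping, not the installer's actual detection code.

```shell
# select_torch_index DRIVER_VERSION  (hypothetical helper)
# Prints the PyTorch wheel index suffix for a driver version such as "570.86.15".
select_torch_index() {
    local major=${1%%.*}   # keep only the major version number
    if [ "$major" -ge 570 ]; then
        echo cu128
    elif [ "$major" -ge 450 ]; then
        echo cu118
    else
        echo cpu
    fi
}

# Example (nvidia-smi prints one line per GPU, so take the first):
#   driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   pip install torch torchaudio \
#       --index-url "https://download.pytorch.org/whl/$(select_torch_index "$driver")"
```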
Data Center / Cloud GPUs¶
For Tesla, A100, H100, or other data center GPUs, install the data center driver:
Installation Methods¶
Method 1: Source Installation (Recommended)¶
This method uses the automated installer that handles GPU detection, installation ordering, and retry logic.
# Step 1: Clone the repository
git clone https://github.com/meizhong986/whisperjav.git
cd whisperjav
# Step 2: Create and activate a virtual environment
python3 -m venv ~/.venv/whisperjav
source ~/.venv/whisperjav/bin/activate
# Step 3: Run the installer
python install.py
The installer will:
1. Check Python version, FFmpeg, Git, disk space, and network
2. Detect your GPU and select the optimal CUDA version
3. Install PyTorch with GPU support (or CPU fallback)
4. Install all dependencies in the correct order
5. Install WhisperJAV
6. Verify the installation
Installer Options:
# CPU-only (no GPU)
python install.py --cpu-only
# Force specific CUDA version
python install.py --cuda118 # For older drivers (450+)
python install.py --cuda128 # For modern drivers (570+)
# Skip optional features
python install.py --no-speech-enhancement
python install.py --minimal # Transcription only
# Include local LLM translation
python install.py --local-llm # Prebuilt wheel (fast)
python install.py --local-llm-build # Build from source (slow)
python install.py --no-local-llm # Skip without prompting
# Development mode (editable install)
python install.py --dev
# Skip preflight checks
python install.py --skip-preflight
Alternative: use the shell wrapper, install_linux.sh, in place of python install.py. The wrapper checks for PEP 668 (externally-managed Python) and delegates to install.py.
Method 2: pip Install with Extras¶
If you want more control over what gets installed, use pip directly. You MUST install PyTorch first.
# Step 1: Create and activate a virtual environment
python3 -m venv ~/.venv/whisperjav
source ~/.venv/whisperjav/bin/activate
# Step 2: Upgrade pip
pip install --upgrade pip
# Step 3: Install PyTorch with CUDA (MUST BE FIRST!)
# For driver 570+:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128
# For driver 450-569:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118
# For CPU only:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
# Step 4: Install WhisperJAV with desired extras
pip install "whisperjav[cli] @ git+https://github.com/meizhong986/whisperjav.git"
# Or install from local clone:
git clone https://github.com/meizhong986/whisperjav.git
cd whisperjav
pip install -e ".[cli]"
IMPORTANT: Always install PyTorch FIRST with --index-url before installing WhisperJAV. If you skip this step, pip resolves torch from PyPI, which may give you a CPU-only build or a build that does not match your driver. CPU-only inference is 10-50x slower.
Method 3: Conda Environment¶
# Step 1: Create conda environment
conda create -n whisperjav python=3.11 -y
conda activate whisperjav
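# Step 2: Install PyTorch via conda (handles CUDA automatically)
# If conda cannot find a pytorch-cuda=12.8 build, install PyTorch with the
# pip --index-url command from Method 2 inside this environment instead.
conda install pytorch torchaudio pytorch-cuda=12.8 -c pytorch -c nvidia -y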
# Or for CUDA 11.8:
conda install pytorch torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -y
# Step 3: Install system deps that conda doesn't provide
conda install ffmpeg -c conda-forge -y
# Step 4: Install WhisperJAV
pip install "whisperjav[all] @ git+https://github.com/meizhong986/whisperjav.git"
# Or from local clone:
cd whisperjav
pip install -e ".[all]"
GPU Verification¶
After installation, verify GPU support is working:
# Quick check: Is CUDA available?
python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python3 -c "import torch; print(f'GPU: {torch.cuda.get_device_name(0)}') if torch.cuda.is_available() else print('No GPU')"
python3 -c "import torch; print(f'CUDA version: {torch.version.cuda}')"
# Full diagnostic:
python3 -m whisperjav.utils.preflight_check -v
# Device detection report:
python3 -m whisperjav.utils.device_detector
Expected output for a working GPU setup:
If CUDA shows False, see Troubleshooting: CUDA Not Detected.
Installing Specific Extras¶
WhisperJAV uses a modular extras system. Install only what you need:
# After activating your venv and installing PyTorch first:
# CLI only (transcription, no GUI)
pip install "whisperjav[cli] @ git+https://github.com/meizhong986/whisperjav.git"
# CLI + Translation
pip install "whisperjav[cli,translate] @ git+https://github.com/meizhong986/whisperjav.git"
# CLI + GUI
pip install "whisperjav[cli,gui] @ git+https://github.com/meizhong986/whisperjav.git"
# CLI + Qwen3-ASR (large model, needs 8+ GB VRAM)
pip install "whisperjav[cli,qwen] @ git+https://github.com/meizhong986/whisperjav.git"
# Unix-optimized (CLI + translate + enhance + huggingface, no GUI)
pip install "whisperjav[unix] @ git+https://github.com/meizhong986/whisperjav.git"
# Everything
pip install "whisperjav[all] @ git+https://github.com/meizhong986/whisperjav.git"
Available Extras¶
| Extra | Description | System Deps Required |
|---|---|---|
| cli | Audio processing, VAD, scene detection | libsndfile |
| gui | PyWebView GUI interface | libwebkit2gtk-4.0-dev, libgtk-3-dev |
| translate | AI subtitle translation (cloud APIs) | None |
| llm | Local LLM server (FastAPI) | None |
| enhance | Speech enhancement (ClearVoice, BS-RoFormer) | libsndfile |
| huggingface | HuggingFace Transformers integration | None |
| qwen | Qwen3-ASR pipeline (requires huggingface) | None (8+ GB VRAM recommended) |
| analysis | Visualization and analysis tools | None |
| compatibility | pyvideotrans integration | None |
| all | Everything combined | All of the above |
| unix | CLI + translate + enhance + huggingface + analysis + compatibility | libsndfile |
| colab | Optimized for Google Colab | N/A (Colab pre-installs most) |
| kaggle | Optimized for Kaggle | N/A |
| dev | Development tools (pytest, ruff) | None |
Headless Server Setup¶
For servers without a display (SSH-only, cloud VMs, CI/CD):
# Step 1: Install prerequisites (no GUI packages needed)
sudo apt-get install -y \
python3 python3-pip python3-venv python3-dev \
build-essential ffmpeg git libsndfile1
# Step 2: Create venv
python3 -m venv ~/.venv/whisperjav
source ~/.venv/whisperjav/bin/activate
# Step 3: Install PyTorch
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128
# Step 4: Install WhisperJAV (unix extra = no GUI deps)
pip install "whisperjav[unix] @ git+https://github.com/meizhong986/whisperjav.git"
# Or use the installer with minimal flag:
python install.py --minimal
Key points for headless operation:
- Use the [unix] extra or [cli] extra instead of [all] to skip GUI dependencies
- The GUI (whisperjav-gui) requires a display server and WebKit2GTK -- skip it on servers
- CLI mode (whisperjav) works fully headless
- Set MPLBACKEND=Agg if matplotlib warnings appear (no display for plots)
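The last point can be automated in a shell profile or job script. The sketch below assumes that "headless" means neither DISPLAY (X11) nor WAYLAND_DISPLAY is set; the function name `configure_headless_backend` is hypothetical and is not provided by WhisperJAV.

```shell
# configure_headless_backend  (hypothetical helper)
# Exports MPLBACKEND=Agg only when no X11 or Wayland display is advertised.
configure_headless_backend() {
    if [ -z "${DISPLAY:-}" ] && [ -z "${WAYLAND_DISPLAY:-}" ]; then
        export MPLBACKEND=Agg
    fi
}

# Example: call it once before running the CLI on a server.
#   configure_headless_backend
#   whisperjav video.mp4 --mode balanced
```

Leaving MPLBACKEND untouched when a display is present keeps interactive plotting available on desktop machines that share the same profile.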
Google Colab Setup¶
WhisperJAV includes a dedicated Colab installer that handles all setup automatically.
Quick Start¶
In a Colab notebook cell:
# Cell 1: Clone and install
!git clone https://github.com/meizhong986/WhisperJAV.git
!bash WhisperJAV/installer/install_colab.sh
# Cell 3: Transcribe
!MPLBACKEND=Agg /content/whisperjav_env/bin/whisperjav \
/content/drive/MyDrive/video.mp4 \
--mode balanced \
--sensitivity aggressive
What the Colab Installer Does¶
- Installs uv package manager (10-100x faster than pip)
- Creates an isolated virtual environment at /content/whisperjav_env
- Installs PyTorch with CUDA support matching Colab's GPU
- Installs system libraries (portaudio, libsndfile, ffmpeg, libc++)
- Installs WhisperJAV with all extras including Qwen3-ASR
- Attempts to install llama-cpp-python from prebuilt wheels (optional)
Colab Tips¶
- Use MPLBACKEND=Agg to avoid matplotlib display errors
- Mount Google Drive to save output subtitles persistently
- Source the aliases file for shorter commands:
- Debug mode:
!bash WhisperJAV/installer/install_colab.sh --debug
Kaggle Setup¶
Similar to Colab, but use the pip-based approach:
# Cell 1: Install
!pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128
!pip install "whisperjav[kaggle] @ git+https://github.com/meizhong986/whisperjav.git"
# Cell 2: Verify
!python -c "import whisperjav; print(whisperjav.__version__)"
# Cell 3: Transcribe
!whisperjav /kaggle/input/your-dataset/video.mp4 --mode balanced
Running the Application¶
CLI Usage¶
# Activate your virtual environment first
source ~/.venv/whisperjav/bin/activate
# Basic transcription
whisperjav video.mp4
# With mode and sensitivity
whisperjav video.mp4 --mode balanced --sensitivity aggressive
# Faster mode (less accurate, quick)
whisperjav video.mp4 --mode faster
# With speech enhancement
whisperjav video.mp4 --mode balanced --enhance
# With Qwen3-ASR pipeline (requires [qwen] extra)
whisperjav video.mp4 --mode qwen --input-mode assembly
# With translation
whisperjav video.mp4 --translate --translate-provider deepseek
# Batch processing (all .mp4 files in directory)
whisperjav /path/to/videos/ --mode balanced
# Specify output directory
whisperjav video.mp4 --output-dir /path/to/subtitles/
# Help
whisperjav --help
# Pre-flight environment check
whisperjav --check
GUI Usage¶
Launch the GUI with the whisperjav-gui command.
Note: The GUI requires a display server (X11 or Wayland) and WebKit2GTK. It will not work over SSH unless you use X11 forwarding or VNC.
Translation¶
# Translate existing subtitles
whisperjav-translate -i subtitles.srt --provider deepseek
# Translate with specific instructions
whisperjav-translate -i subtitles.srt --provider gemini --instructions standard
Systemd Service Setup¶
For automated/scheduled transcription on a server:
Create a Service File¶
sudo tee /etc/systemd/system/whisperjav-batch.service << 'EOF'
[Unit]
Description=WhisperJAV Batch Transcription
After=network.target
[Service]
Type=oneshot
User=your-username
Group=your-group
WorkingDirectory=/home/your-username
Environment="PATH=/home/your-username/.venv/whisperjav/bin:/usr/local/bin:/usr/bin"
Environment="MPLBACKEND=Agg"
ExecStart=/home/your-username/.venv/whisperjav/bin/whisperjav \
/data/incoming/ \
--mode balanced \
--output-dir /data/subtitles/
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
Create a Timer for Scheduled Runs¶
sudo tee /etc/systemd/system/whisperjav-batch.timer << 'EOF'
[Unit]
Description=Run WhisperJAV batch transcription hourly
[Timer]
OnCalendar=hourly
Persistent=true
[Install]
WantedBy=timers.target
EOF
Enable and Start¶
sudo systemctl daemon-reload
sudo systemctl enable whisperjav-batch.timer
sudo systemctl start whisperjav-batch.timer
# Check status
sudo systemctl status whisperjav-batch.timer
# View logs
journalctl -u whisperjav-batch.service -f
Troubleshooting¶
CUDA Not Detected¶
Symptom: torch.cuda.is_available() returns False
Diagnosis:
# Step 1: Check if NVIDIA driver is loaded
nvidia-smi
# Step 2: Check if PyTorch was installed with CUDA
python3 -c "import torch; print(torch.version.cuda)"
# Should print "12.8" or "11.8", NOT "None"
# Step 3: Check driver compatibility
python3 -c "import torch; print(torch.__version__)"
nvidia-smi | head -3
# Compare driver version with CUDA requirements
Common Causes and Fixes:
| Cause | Fix |
|---|---|
| CPU-only PyTorch installed | pip uninstall torch torchaudio && pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128 |
| Driver too old for CUDA 12.8 | Update driver: sudo apt install nvidia-driver-570 or use --cuda118 |
| NVIDIA driver not installed | Install driver (see NVIDIA Driver Setup) |
| Running in container without GPU passthrough | Pass --gpus all to Docker: docker run --gpus all ... |
| Nouveau driver loaded instead of nvidia | Blacklist nouveau: echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf && sudo update-initramfs -u && sudo reboot |
Library Not Found Errors¶
OSError: sndfile library not found
# Ubuntu/Debian
sudo apt-get install -y libsndfile1 libsndfile1-dev
# Fedora/RHEL
sudo dnf install -y libsndfile libsndfile-devel
# Arch
sudo pacman -S libsndfile
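ModuleNotFoundError: No module named '_tkinter'
Install the Tk bindings for your distribution's Python: python3-tk on Ubuntu/Debian, python3-tkinter on Fedora, or tk on Arch.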
ImportError: libwebkit2gtk-4.0.so: cannot open shared object file
The GUI requires WebKit2GTK. For CLI-only use, this is not needed.
# Ubuntu/Debian
sudo apt-get install -y libwebkit2gtk-4.0-dev
# Fedora
sudo dnf install -y webkit2gtk4.0-devel
Permission Denied¶
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied
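You are trying to install to the system Python without a virtual environment. Create and activate one first (python3 -m venv ~/.venv/whisperjav, then source ~/.venv/whisperjav/bin/activate), and rerun the install inside it.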
error: externally-managed-environment (PEP 668)
Same solution -- create and activate a virtual environment. This error appears on Debian 12+, Ubuntu 24.04+, and similar modern distributions.
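PEP 668 marks a Python installation as externally managed by placing a file named EXTERNALLY-MANAGED in the interpreter's standard-library directory. The sketch below tests for that marker so a script can tell in advance whether a bare pip install will be refused. The function name `is_externally_managed` is hypothetical; this is not the check used by the WhisperJAV shell wrapper.

```shell
# is_externally_managed STDLIB_DIR  (hypothetical helper)
# Succeeds when the PEP 668 marker file exists in the given stdlib directory.
is_externally_managed() {
    [ -f "$1/EXTERNALLY-MANAGED" ]
}

# Example: ask the system interpreter where its stdlib lives, then test it.
#   stdlib_dir=$(python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))')
#   if is_externally_managed "$stdlib_dir"; then
#       echo "System Python is externally managed; create a virtual environment first"
#   fi
```

Run the example with the system python3, not one from an activated venv. pip applies this restriction only outside virtual environments, so it is the system interpreter's status that matters.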
If python3 -m venv fails with "No module named venv":
# Ubuntu/Debian
sudo apt-get install -y python3-venv
# Or for a specific version:
sudo apt-get install -y python3.11-venv
Git Timeout / Network Issues¶
Symptom: Failed to connect to github.com port 443 after 21 ms
This commonly occurs behind the Great Firewall (GFW) or slow VPN connections.
# Option 1: The installer auto-configures Git timeouts on retry
# Just run the installer again -- it detects and handles this
# Option 2: Manually configure Git
git config --global http.connectTimeout 120
git config --global http.timeout 300
git config --global http.maxRetries 5
# Option 3: Use a proxy
export https_proxy=http://your-proxy:port
export http_proxy=http://your-proxy:port
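On flaky connections it can also help to retry the clone or install with a growing delay. The following is a generic retry-with-backoff sketch in plain shell. The `retry` function and the `RETRY_DELAY` variable are hypothetical names introduced here, and this is not the retry logic built into install.py.

```shell
# retry ATTEMPTS COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND until it succeeds or ATTEMPTS tries have failed.
# The pause starts at RETRY_DELAY seconds (default 5) and doubles after each failure.
retry() {
    local attempts=$1 delay=${RETRY_DELAY:-5} n=1
    shift
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1          # give up after the last allowed attempt
        fi
        sleep "$delay"
        n=$((n + 1))
        delay=$((delay * 2))  # exponential backoff
    done
}

# Example:
#   retry 3 git clone https://github.com/meizhong986/whisperjav.git
```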
PyTorch Version Mismatch¶
Symptom: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED or similar
# Check current versions
python3 -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.version.cuda}')"
# Reinstall matching versions
pip uninstall torch torchaudio -y
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128
numba / llvmlite Errors¶
Symptom: ImportError: numba needs NumPy 1.x or Cannot import llvmlite
# Ensure numpy 1.26.x is installed (not 2.x)
pip install "numpy>=1.26.0,<2.0"
pip install --force-reinstall "numba>=0.58.0"
Speech Enhancement Failures¶
Symptom: modelscope / clearvoice installation fails
# These packages are optional. Reinstall without them:
python install.py --no-speech-enhancement
# Or install specific backends:
pip install clearvoice # ClearVoice only
pip install bs-roformer-infer # BS-RoFormer only
Out of Memory (OOM)¶
Symptom: CUDA out of memory during transcription
# Use a smaller model
whisperjav video.mp4 --mode faster
# Use the lower-VRAM input mode for the Qwen pipeline
whisperjav video.mp4 --mode qwen --input-mode vad_slicing
# Monitor GPU memory
watch -n 1 nvidia-smi
Performance Tuning¶
VRAM Management¶
| GPU VRAM | Recommended Settings |
|---|---|
| 4 GB | --mode faster only |
| 6 GB | --mode fast or --mode balanced with small model |
| 8 GB | --mode balanced --sensitivity balanced |
| 12 GB | --mode balanced --sensitivity aggressive |
| 16+ GB | All modes including --mode qwen --input-mode assembly |
| 24+ GB | All modes, large batch sizes |
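The table can be read as a lookup from VRAM size to a set of flags. The sketch below encodes one reading of it in a hypothetical function, `recommend_flags`. Where a row offers several options (6 GB, 16+ GB) the function picks one of them, so treat the output as a starting point.

```shell
# recommend_flags VRAM_GB  (hypothetical helper; one option per table row)
# Prints whisperjav flags for a GPU with the given whole number of GB of VRAM.
recommend_flags() {
    local vram_gb=$1
    if [ "$vram_gb" -ge 16 ]; then
        echo "--mode qwen --input-mode assembly"
    elif [ "$vram_gb" -ge 12 ]; then
        echo "--mode balanced --sensitivity aggressive"
    elif [ "$vram_gb" -ge 8 ]; then
        echo "--mode balanced --sensitivity balanced"
    elif [ "$vram_gb" -ge 6 ]; then
        echo "--mode fast"
    else
        echo "--mode faster"
    fi
}

# Example: nvidia-smi reports memory.total in MiB, so round to the nearest GB.
#   mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
#   # The substitution is left unquoted on purpose so the flags split into words.
#   whisperjav video.mp4 $(recommend_flags $(( (mib + 512) / 1024 )))
```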
Qwen3-ASR Specific Tuning¶
Qwen3-ASR requires significant VRAM. Use input modes based on your GPU:
| Input Mode | VRAM Usage | Quality | Speed |
|---|---|---|---|
| assembly | Highest (text gen + alignment separately) | Best for long scenes | Moderate |
| context_aware | High (coupled ASR + alignment) | Best for dialogue | Slower |
| vad_slicing | Lower (short segments) | Good for noisy audio | Fastest |
# Assembly mode (recommended for 16+ GB VRAM)
whisperjav video.mp4 --mode qwen --input-mode assembly
# VAD slicing mode (for 8 GB VRAM)
whisperjav video.mp4 --mode qwen --input-mode vad_slicing
Environment Variables¶
# Reduce allocator fragmentation: do not split cached blocks larger than this size (in MB)
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
# Use specific GPU in multi-GPU systems
export CUDA_VISIBLE_DEVICES=0
# Disable matplotlib display (headless servers)
export MPLBACKEND=Agg
# Allow TF32 matrix multiplies for faster inference on Ampere+ GPUs
export TORCH_ALLOW_TF32_CUBLAS_OVERRIDE=1
Batch Processing Optimization¶
For processing many files:
# Process all .mp4 files
whisperjav /path/to/videos/ --mode balanced
# Use screen/tmux for long-running jobs
tmux new -s whisperjav
whisperjav /path/to/videos/ --mode balanced --sensitivity aggressive
# Ctrl+B, D to detach; tmux attach -t whisperjav to reattach
Uninstallation¶
# Remove the virtual environment
rm -rf ~/.venv/whisperjav
# Remove cached models (optional, saves disk space)
rm -rf ~/.cache/whisper
rm -rf ~/.cache/huggingface
# Remove desktop entry (if created)
rm -f ~/.local/share/applications/whisperjav.desktop
# Remove source code (if cloned)
rm -rf ~/whisperjav # Adjust path as needed
Appendix: Architecture Overview¶
Installation Flow¶
install_linux.sh (thin wrapper)
|
v
install.py (orchestrator)
|
+-- Preflight checks (disk, network)
+-- detect_gpu() --> CUDA version selection
+-- Step 1: pip upgrade
+-- Step 2: PyTorch (GPU lock-in via --index-url)
+-- Step 3: Core deps (numpy, scipy, numba, audio libs)
+-- Step 4: Whisper packages (openai-whisper, stable-ts, faster-whisper)
+-- Step 5: Optional (HuggingFace, Qwen, translation, VAD, enhancement, GUI)
+-- Step 6: WhisperJAV (--no-deps to preserve GPU torch)
+-- Verification
Why PyTorch Must Be Installed First¶
The torch build that pip resolves from PyPI is not chosen to match your GPU driver. If you run pip install whisperjav directly, pip pulls torch from PyPI and you may end up with a build that cannot use your GPU, which means CPU-only inference (10-50x slower). By installing torch FIRST with --index-url https://download.pytorch.org/whl/cu128, the GPU version is "locked in" and subsequent packages see it as already satisfied.
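One way to confirm which build ended up installed is to look at the local version suffix in torch.__version__. The sketch below assumes that wheels from download.pytorch.org carry a suffix such as +cu128 or +cpu; the function name `torch_build_kind` is hypothetical. The torch.version.cuda check shown under GPU Verification remains the more direct test.

```shell
# torch_build_kind VERSION_STRING  (hypothetical helper)
# Classifies a torch.__version__ string by its local version suffix.
torch_build_kind() {
    case "$1" in
        *+cu*)  echo cuda ;;     # e.g. 2.7.0+cu128
        *+cpu*) echo cpu ;;      # e.g. 2.7.0+cpu
        *)      echo unknown ;;  # no recognizable suffix
    esac
}

# Example:
#   torch_build_kind "$(python3 -c 'import torch; print(torch.__version__)')"
```

If the result is cpu after a full install, a later dependency most likely replaced the GPU wheel; reinstalling torch with the --index-url command restores it.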
Package Registry¶
All package definitions live in whisperjav/installer/core/registry.py. This is the single source of truth for:
- Package names and versions
- Installation order (PyTorch first, numba after numpy, etc.)
- Which extras each package belongs to
- Platform-specific packages (Windows-only, Linux-only)
- Import name mapping (e.g., opencv-python imports as cv2)
When adding or modifying dependencies, update the registry and run validation: