Faster Whisper ChickenRice: Japanese‑Chinese Transcription

ChickenRice – A Powerful Open‑Source Japanese‑to‑Chinese Transcription Tool

In a world where videos and podcasts span dozens of languages, the ability to auto‑generate subtitles accurately and quickly can save hours of manual labor. ChickenRice (Faster‑Whisper‑TransWithAI) is a ready‑to‑use solution that takes Japanese audio or video and quickly produces Chinese subtitles (SRT, VTT, LRC). Built on the fast Faster Whisper engine and powered by an optimized Japanese‑to‑Chinese model trained on 5,000 hours of audio, it delivers state‑of‑the‑art accuracy.

Key Features

  • GPU Acceleration: Supports CUDA 11.8, 12.2, and 12.8 – well suited to NVIDIA RTX‑series cards.
  • Batch Inference: Process dozens of files at once, with automatic caching to skip already‑processed items.
  • Voice‑Optimized VAD: Uses TransWithAI's whisper‑VAD to trim background noise and focus on speech.
  • Multi‑Format Output: Exports SRT, VTT, LRC, or raw text.
  • Cloud Inference: Modal integration lets you run the model on a cloud GPU without local hardware.
  • Zero‑Code Start: Drag‑and‑drop .bat files for GPU and CPU modes – no scripting required.
  • Open Source: All source, data, and models are released under the MIT license – contributors welcome.

Why ChickenRice?

  • High Accuracy: The custom Japanese‑Chinese model was trained on a vast dataset of native‑speaker audio, ensuring correct translations and context handling.
  • Speed: Faster Whisper re‑implements Whisper inference on an optimized runtime, decoding several times faster than the original Whisper at comparable accuracy.
  • Flexibility: Whether you have a powerful RTX 3090 or only a CPU, there’s a deployment path for you.
  • Extensibility: The source is clean and modular – tweak the generation_config.json5 or drop in your own VAD model.

Quick Setup Guide

  1. Prerequisites
     • Windows 10/11 (WSL optional for Linux), Python 3.11+, and an NVIDIA GPU or a Modal account.
     • git, conda (or pip), and the modal CLI.

  2. Clone the Repo

    git clone https://github.com/TransWithAI/Faster-Whisper-TransWithAI-ChickenRice.git
    cd Faster-Whisper-TransWithAI-ChickenRice
    

  3. Install Dependencies

    conda env create -f environment-cuda118.yml    # or cuda122 / cuda128
    conda activate faster-whisper-cu118
    
    Alternatively, run pip install -r requirements.txt if you prefer pip.

  4. Download Models

    python download_models.py  # pulls Whisper and VAD models
    

  5. Run Locally
     • GPU (best performance): Run(GPU).bat
     • CPU (fallback): Run(CPU).bat
     • Low‑VRAM GPU: Run(GPU,低显存模式).bat
     • Video‑only: Run(翻译视频)(GPU).bat

Drag your video/audio file onto the corresponding batch file.
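Under the hood, the batch files feed each dropped file through the transcription pipeline and write subtitle files next to it. If you would rather script the output step yourself, a minimal SRT writer looks like this; `to_srt` and the `(start, end, text)` tuple format are illustrative sketches, not the tool's internal API:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start_sec, end_sec, text) tuples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


segments = [(0.0, 2.5, "こんにちは"), (2.5, 5.0, "你好")]
print(to_srt(segments))
```

The same segment list can be reformatted for VTT or LRC by swapping the timestamp style, which is how a single transcription pass can serve all three output formats.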

  6. Cloud Inference (Optional)
    modal token new   # register/renew your Modal token
    modal run modal_infer.py   # interactive prompt will ask for GPU type, model, files
    

For a pre‑built executable, use modal_infer.exe.

  7. Customise Output

Edit generation_config.json5 to adjust beam size, temperature, or enable segment_merge for cleaner subtitles.

Example tweak:

{
  "segment_merge": {"enabled": true, "max_gap_ms": 500, "max_duration_ms": 2000}
}
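To make the config concrete, here is a sketch of what a merge pass with those two thresholds could do. The exact semantics of `segment_merge` are defined by the tool itself; this assumes the plausible rule "join adjacent segments when the gap between them is at most max_gap_ms and the joined segment stays within max_duration_ms":

```python
def merge_segments(segments, max_gap_ms=500, max_duration_ms=2000):
    """Merge adjacent (start_ms, end_ms, text) segments under assumed
    segment_merge semantics: small gap, bounded merged duration."""
    merged = []
    for start, end, text in segments:
        if merged:
            p_start, p_end, p_text = merged[-1]
            gap = start - p_end
            if gap <= max_gap_ms and end - p_start <= max_duration_ms:
                # Extend the previous segment instead of starting a new one.
                merged[-1] = (p_start, end, p_text + text)
                continue
        merged.append((start, end, text))
    return merged


segs = [(0, 800, "今天"), (900, 1500, "天气很好"), (3000, 3800, "谢谢")]
print(merge_segments(segs))
# The first two segments (100 ms gap) merge; the third stays separate.
```

Raising max_gap_ms joins more fragments into single subtitle lines, while max_duration_ms keeps any one line from growing unreadably long.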

Advanced Topics

Using Modal Cloud Inference

  • Why Modal? Use it when you have no local GPU or want to scale across many jobs. Modal's free tier includes up to $30 of compute credit per month (enough for a T4 GPU), and it handles scaling automatically.
  • Setup: After running modal token new, you can launch jobs from the command line or via the provided modal_infer.py script.
  • Cost: Roughly $0.02–$0.05 per minute of GPU time depending on GPU type.

Batch Processing & Caching

The tool auto‑detects already‑processed files and skips them. This is crucial when dealing with large media libraries: only new or changed files are re‑run.
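The skip rule can be sketched as "an input is done if a subtitle file for it already exists in the output directory." The function name and cache rule below are hypothetical; the tool's actual cache check may differ:

```python
from pathlib import Path


def pending_files(inputs, out_dir, ext=".srt"):
    """Return only the inputs that still need processing: those with no
    matching subtitle file in out_dir (hypothetical cache rule)."""
    out = Path(out_dir)
    return [p for p in inputs
            if not (out / (Path(p).stem + ext)).exists()]
```

With a rule like this, re-running the whole library is cheap: finished files short-circuit before any model is loaded, so only the new items cost GPU time.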

Extending the Model Toolkit

You can swap the Japanese‑Chinese translation model for any Whisper model checkpoint or add a custom VAD model by modifying the infer.py entrypoint and the environment YAML.

Community & Support

  • Issues & Pull Requests: Visit the GitHub repository to report bugs or submit improvements.
  • Telegram: Join the AI汉化组 chat for quick help and collaborative development.
  • Documentation: The repo contains README.md, 使用说明.txt, and the RELEASE_NOTES_CN.md for detailed guidance.

Final Thoughts

ChickenRice is more than a transcription script; it is a production‑grade pipeline for YouTubers, podcasters, and researchers who need fast, reliable Japanese‑to‑Chinese subtitles. With GPU acceleration, seamless cloud scaling, and an MIT license, adopting ChickenRice can dramatically cut manual subtitle creation time.

Give it a try, fork the repo, and contribute – the community's next breakthrough in AI‑assisted transcription is just a few lines of code away!
