Faster Whisper ChickenRice: Japanese‑Chinese Transcription
ChickenRice – A Powerful Open‑Source Japanese‑to‑Chinese Transcription Tool
In a world where videos and podcasts span dozens of languages, accurate, fast auto‑generated subtitles can save hours of manual labor. ChickenRice (Faster‑Whisper‑TransWithAI) is a ready‑to‑use solution that takes Japanese audio or video and produces Chinese subtitles (SRT, VTT, LRC) in a flash. Built on the lightning‑fast Faster Whisper engine and powered by an optimized Japanese‑to‑Chinese model trained on 5,000 hours of audio, it delivers state‑of‑the‑art accuracy.
Key Features
| Feature | Description |
|---|---|
| GPU Acceleration | Supports CUDA 11.8, 12.2, 12.8 – perfect for NVIDIA RTX series. |
| Batch Inference | Process dozens of files at once with auto‑caching to skip already‑processed items. |
| Voice‑Optimized VAD | Uses TransWithAI’s whisper‑VAD to trim background noise and focus on speech. |
| Multi‑Format Output | Exports to SRT, VTT, LRC, or even raw text. |
| Cloud Inference | Modal integration allows you to run the model on a GPU in the cloud without local hardware. |
| Zero‑Code Start | Drag‑and‑drop bat files for GPU and CPU modes – no heavy scripting required. |
| Open‑Source (MIT) | All source code and models are released under the MIT license – contributors welcome. |
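At its core, multi‑format export means rendering timestamped segments into each subtitle syntax. As a rough illustration of the SRT case (not ChickenRice's actual code; the `(start, end, text)` segment layout is an assumption):

```python
def fmt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render (start_sec, end_sec, text) tuples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{fmt_timestamp(start)} --> {fmt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"

print(to_srt([(0.0, 2.5, "こんにちは"), (3.0, 5.75, "你好")]))
```

VTT and LRC differ only in the timestamp format and block separators, which is why one segment list can feed all three exporters.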
Why ChickenRice?
- High Accuracy: The custom Japanese‑Chinese model was trained on a vast dataset of native‑speaker audio, ensuring correct translations and context handling.
- Speed: Faster Whisper is a CTranslate2 reimplementation of Whisper that decodes significantly faster than the original while using less memory.
- Flexibility: Whether you have a powerful RTX 3090 or only a CPU, there’s a deployment path for you.
- Extensibility: The source is clean and modular – tweak `generation_config.json5` or drop in your own VAD model.
Quick Setup Guide
- Prerequisites
  - Windows 10/11 (optionally WSL for Linux), Python 3.11+, and an NVIDIA GPU or a Modal account.
  - `git`, `conda` (or `pip`), and the `modal` CLI.
- Clone Repo

  ```bash
  git clone https://github.com/TransWithAI/Faster-Whisper-TransWithAI-ChickenRice.git
  cd Faster-Whisper-TransWithAI-ChickenRice
  ```
- Install Dependencies

  ```bash
  conda env create -f environment-cuda118.yml  # or cuda122 / cuda128
  conda activate faster-whisper-cu118
  ```

  Or use `pip install -r requirements.txt` if you prefer.
- Download Models

  ```bash
  python download_models.py  # pulls the Whisper and VAD models
  ```
- Run Locally
  - GPU (best performance): `运行(GPU).bat`
  - CPU (fallback): `运行(CPU).bat`
  - Low‑VRAM GPU: `运行(GPU,低显存模式).bat`
  - Video‑only: `运行(翻译视频)(GPU).bat`
  Drag your video or audio file onto the corresponding batch file.
- Cloud Inference (Optional)

  ```bash
  modal token new           # register or renew your Modal token
  modal run modal_infer.py  # an interactive prompt asks for GPU type, model, and files
  ```
  For a pre‑built executable, use `modal_infer.exe`.
- Customising Output
  Edit `generation_config.json5` to adjust beam size, temperature, or enable `segment_merge` for cleaner subtitles.
Example tweak:

```json5
{
  "segment_merge": {"enabled": true, "max_gap_ms": 500, "max_duration_ms": 2000}
}
```
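To see what those two thresholds do, here is a minimal sketch of the merging idea (the field names follow the config above, but the function itself is an illustrative assumption, not ChickenRice's implementation): adjacent segments separated by at most `max_gap_ms` of silence are joined, as long as the merged segment stays within `max_duration_ms`.

```python
def merge_segments(segments, max_gap_ms=500, max_duration_ms=2000):
    """Merge adjacent (start_ms, end_ms, text) segments split by short gaps.

    Two segments are joined when the silence between them is <= max_gap_ms
    and the combined segment would not exceed max_duration_ms.
    """
    merged = []
    for start, end, text in segments:
        if merged:
            prev_start, prev_end, prev_text = merged[-1]
            gap = start - prev_end
            if gap <= max_gap_ms and end - prev_start <= max_duration_ms:
                merged[-1] = (prev_start, end, prev_text + text)
                continue
        merged.append((start, end, text))
    return merged

segs = [(0, 800, "今日は"), (1000, 1600, "いい天気"), (3000, 3500, "ですね")]
print(merge_segments(segs))
# the first two segments (200 ms gap) merge; the third (1400 ms gap) stays separate
```

Raising `max_gap_ms` produces fewer, longer subtitle lines; lowering it keeps subtitles closer to the raw VAD segmentation.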
Advanced Topics
Using Modal Cloud Inference
- Why Modal? Use it when you have no local GPU, or when you want to scale across many jobs. Modal's free tier includes up to $30/month of compute credits (enough for a T4 GPU) and handles scaling automatically.
- Setup: After running `modal token new`, launch jobs from the command line or via the provided `modal_infer.py` script.
- Cost: Roughly $0.02–$0.05 per minute of GPU time, depending on GPU type.
Batch Processing & Caching
The tool auto‑detects already‑processed files and skips them. This is crucial when dealing with large media libraries: only new or changed files are re‑run.
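The caching idea can be sketched as follows (a minimal illustration assuming the tool writes one subtitle file next to each input; ChickenRice's real cache logic may key on more than the output file's existence):

```python
from pathlib import Path

def pending_files(media_dir, exts=(".mp4", ".mkv", ".wav", ".mp3")):
    """Return media files that do not yet have a sibling .srt file."""
    todo = []
    for path in sorted(Path(media_dir).iterdir()):
        if path.suffix.lower() not in exts:
            continue  # not a media file
        if path.with_suffix(".srt").exists():
            continue  # output already present: treat as cached, skip
        todo.append(path)
    return todo
```

Deleting (or renaming) a stale `.srt` file is then enough to force that one input to be transcribed again on the next batch run.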
Extending the Model Toolkit
You can swap the Japanese‑Chinese translation model for any Whisper model checkpoint, or add a custom VAD model, by modifying the `infer.py` entrypoint and the environment YAML.
Community & Support
- Issues & Pull Requests: Visit the GitHub repository to report bugs or submit improvements.
- Telegram: Join the AI汉化组 chat for quick help and collaborative development.
- Documentation: The repo contains `README.md`, `使用说明.txt` (usage notes), and `RELEASE_NOTES_CN.md` for detailed guidance.
Final Thoughts
ChickenRice is more than a transcription script; it is a production‑grade pipeline ready for YouTubers, podcasters, or researchers who need fast, reliable Japanese‑to‑Chinese subtitles. With GPU acceleration, seamless cloud scaling, and an MIT license, adopting ChickenRice can dramatically cut manual subtitle creation time.
Give it a try, fork the repo, and contribute – the community's next breakthrough in AI‑assisted transcription is just a few lines of code away!