JJYB_AI VideoAutoCut: The Open Source AI Video Editing Toolkit

In late 2025, a developer named Jianjie Yi released JJYB_AI_VideoAutoCut (aka JJYB_AI 智剪) – an end‑to‑end AI video editing solution that brings professional video production into the hands of hobbyists and content creators. The project is a single GitHub repository that bundles:

  • a Flask‑based web front‑end plus a lightweight desktop wrapper,
  • 19 language models (ChatGLM, GPT‑4, Claude 3, …),
  • 6 vision models (YOLOv8, GPT‑4V, Gemini Vision, etc.),
  • 4 TTS engines (Edge‑TTS, Google TTS, Azure TTS, Voice Clone), and
  • a robust FFmpeg/MoviePy/OpenCV processing pipeline.

Below we walk through the architecture, key features, quick start, and a few practical use cases.


1. Project Overview

JJYB_AI_VideoAutoCut
 ├─ frontend/           # Flask + SocketIO UI
 ├─ backend/            # AI services & processing logic
 ├─ config/             # Global INI settings
 ├─ resource/           # Pre‑downloaded model weights 
 ├─ upload/             # User’s raw files
 └─ output/             # Final video artefacts

Highlights

  • Smart Cutting – automatic segment detection via YOLOv8 and a custom scene‑change detector.
  • Original Commentary – vision analysis → LLM draft → TTS → video overlay.
  • Multi‑Engine Voice‑Over – Edge‑TTS (free, 23+ voices), Google TTS, Azure TTS, Voice Clone.
  • Mix‑Cut Mode – batch import, auto‑highlight, style‑guided transitions, music‑sync cuts.
  • Low‑Latency Sync – audio and video stay within 100 ms of each other via a custom timing map.
  • One‑Click Startup – 启动应用.bat (the "Launch App" batch file) runs check_system.py, resolves dependencies, and launches the app at http://localhost:5000.

2. Installation & Setup

1. Clone the repository

git clone https://github.com/jianjieyiban/JJYB_AI_VideoAutoCut.git
cd JJYB_AI_VideoAutoCut

2. Create and activate a virtual environment

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

3. Install dependencies

pip install -r requirements.txt

Tip – If you’re on Windows and your machine has a GPU, install the CUDA‑enabled PyTorch wheel from the official site.

4. Check system prerequisites

python check_system.py
The script verifies:

  • Python 3.9‑3.11
  • the FFmpeg binary (auto‑downloaded if missing)
  • CUDA libraries (only needed if GPU mode is desired)
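
For reference, here is a minimal sketch of the kind of checks the script performs; the function name and exact logic are illustrative assumptions, not the repository's actual code.

# Illustrative sketch of environment checks similar to what check_system.py does.
# The function name and exact checks are assumptions, not the project's code.
import shutil
import sys

def check_prerequisites() -> list[str]:
    problems = []
    # Python version window from the docs: 3.9 - 3.11
    if not ((3, 9) <= sys.version_info[:2] <= (3, 11)):
        problems.append(f"Unsupported Python {sys.version.split()[0]}; use 3.9-3.11")
    # FFmpeg must be resolvable on PATH (check_system.py can download it if missing)
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg not found on PATH")
    # CUDA is optional and only matters when GPU mode is desired
    try:
        import torch
        if not torch.cuda.is_available():
            problems.append("PyTorch installed but no CUDA device detected (CPU mode)")
    except ImportError:
        problems.append("PyTorch not installed; GPU acceleration unavailable")
    return problems

if __name__ == "__main__":
    for issue in check_prerequisites():
        print("WARNING:", issue)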

5. Configure your APIs

Visit http://localhost:5000/api_settings after startup. At minimum, supply:

  • one large language model API key (e.g., Alibaba TongYi‑Qwen‑Plus, DeepSeek, or OpenAI) – the UI tests connectivity automatically (a manual probe example follows this list);
  • optionally, a vision model key (e.g., Tencent CV or Google Vision);
  • TTS credentials if needed – Edge‑TTS works offline, but the other TTS engines may require keys.
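
If you want to sanity‑check a key outside the UI, a minimal connectivity probe against an OpenAI‑compatible chat endpoint looks roughly like this; the base URL, model name, and environment variable are example values, not project defaults.

# Minimal connectivity probe for an OpenAI-compatible chat endpoint.
# Base URL, model name, and env var are examples; substitute your provider's values.
import os
import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 5,
    },
    timeout=30,
)
resp.raise_for_status()  # raises if the key or endpoint is rejected
print("Connectivity OK:", resp.json()["choices"][0]["message"]["content"])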

6. Launch the application

  • Double‑click 启动应用.bat, or
  • Run python frontend/app.py and open http://localhost:5000.

You now have a lightweight web app for video editing! The front‑end ships with three sub‑apps:

  1. index.html – timeline editor
  2. voiceover.html – AI voice‑over module
  3. commentary.html – automatic narration generation


3. Core Features Explained

3.1 Smart Cutting

The system automatically splits a raw file into logical segments. It uses YOLOv8 for object detection and OpenCV for frame‑by‑frame analysis. The detection thresholds are tunable via config/.

How to tweak

[cutting]
ObjectScoreThreshold = 0.4
SceneChangeSensitivity = 0.8
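
The repository's detector is custom, but the following frame‑difference sketch with OpenCV illustrates the general idea; the sensitivity‑to‑threshold mapping and function name here are assumptions, not the project's implementation.

# Illustrative frame-difference scene-change detector (not the project's own code).
import cv2
import numpy as np

def detect_scene_changes(path: str, sensitivity: float = 0.8) -> list[float]:
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    cuts, prev, frame_idx = [], None, 0
    threshold = (1.0 - sensitivity) * 255  # higher sensitivity -> lower diff threshold
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(cv2.resize(frame, (320, 180)), cv2.COLOR_BGR2GRAY)
        if prev is not None and float(np.mean(cv2.absdiff(gray, prev))) > threshold:
            cuts.append(frame_idx / fps)  # timestamp of the detected cut, in seconds
        prev, frame_idx = gray, frame_idx + 1
    cap.release()
    return cuts

print(detect_scene_changes("upload/sample.mp4"))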

3.2 Original Commentary Pipeline

  1. Vision Parsing – Detects objects, faces, and actions.
  2. LLM Scripting – Generates a concise commentary script with the selected model.
  3. TTS Synthesis – Renders the script as audio.
  4. Video Overlay – Syncs the audio to the timeline and optionally adds subtitles (see the sketch below).

Pro Tip: Using the TongYi‑Qwen‑Plus model usually yields the best balance between cost, speed, and quality for Chinese videos.
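
As a rough illustration of the final overlay step, narration can be attached to a clip with MoviePy as sketched below; the file names are placeholders and the project's real pipeline (with its timing map) is more involved.

# Attach a synthesized narration track to a video clip (illustrative sketch only).
# Uses the MoviePy 1.x API; file paths are placeholders.
from moviepy.editor import AudioFileClip, VideoFileClip

video = VideoFileClip("upload/sample.mp4")
narration = AudioFileClip("output/commentary.mp3")

# Trim the narration if it runs longer than the video, then overlay it.
narration = narration.subclip(0, min(narration.duration, video.duration))
result = video.set_audio(narration)
result.write_videofile("output/sample_with_commentary.mp4",
                       codec="libx264", audio_codec="aac")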

3.3 AI Voice‑Over

Choose a language and voice; adjust speech speed, pitch, and volume. The UI supports real‑time preview before final rendering.
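
With the bundled Edge‑TTS engine, the equivalent of the UI's speed and volume sliders can be reproduced in a few lines; the voice name and adjustment values below are examples.

# Generate a voice-over with edge-tts; voice, rate, and volume values are examples.
import asyncio
import edge_tts

async def synthesize() -> None:
    tts = edge_tts.Communicate(
        "Welcome to this automatically edited video.",
        voice="en-US-AriaNeural",
        rate="+10%",    # speak 10% faster
        volume="+0%",   # default loudness
    )
    await tts.save("output/voiceover.mp3")

asyncio.run(synthesize())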

3.4 Mix‑Cut & Music‑Sync

Upload multiple clips → the system identifies highlight‑worthy snippets, arranges them according to a specified style, adds transitions, and syncs the cuts to a music track.
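
A common way to implement music‑sync cuts is beat tracking; the librosa sketch below shows one such approach and is not necessarily what the project uses internally.

# Find candidate cut points aligned to the beats of a music track (illustrative only).
import librosa

y, sr = librosa.load("upload/background_music.mp3")
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

print(f"Estimated tempo: {float(tempo):.1f} BPM")
print("First cut candidates (s):", [round(float(t), 2) for t in beat_times[:8]])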


4. Advanced Usage & Automation

# Example: batch processing via the Python API (planned feature)
from backend.api import process_video

process_video(
    src='upload/sample.mp4',
    model='tongyi_qwen',
    voice='en_azure_01',
    mode='commentary',
    output='output/sample_result.mp4'
)

Note: While the UI is sufficient for most users, you can directly interact with the backend via REST endpoints documented in docs/API.md.
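
For example, a job could be submitted from a script along the following lines; the endpoint path and payload fields are hypothetical placeholders, so consult docs/API.md for the real routes.

# Hypothetical example of driving the backend over HTTP.
# The endpoint path and payload fields are placeholders; see docs/API.md.
import requests

resp = requests.post(
    "http://localhost:5000/api/process",  # placeholder route
    json={
        "src": "upload/sample.mp4",
        "mode": "commentary",
        "model": "tongyi_qwen",
        "voice": "en_azure_01",
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json())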


5. Development & Contribution

The project follows a standard Git workflow. Contributing guidelines:

  1. Fork and clone the repository.
  2. Create a feature branch (git checkout -b feature/X).
  3. Add unit tests under tests/.
  4. Update README.md or the docs if you add functionality.
  5. Submit a PR.

The maintainers actively review PRs that improve model support, add new UI features, or polish the processing pipeline.


6. Community & Support

  • GitHub Issues – For bugs, feature requests, or general questions.
  • Discord – A separate server hosts quick help, demos, and tutorials (invite link in the README).
  • Documentation – The 开发文档/ (developer documentation) folder contains multi‑chapter guides covering everything from AI model configuration to detailed API usage.

7. Why This Is a Must‑Try Open‑Source Project

  • Zero Cost – All core models are free or open source. Paid APIs are optional.
  • Modular Design – Swap in any LLM, vision, or TTS model with a few lines of config.
  • Cross‑Platform – Works on Windows, macOS, and Linux via Flask.
  • Extensible – Researchers can plug in new model checkpoints to the resource/ folder.
  • No Cloud Lock‑In – Everything runs locally; your video data never leaves your machine.

Get Started Today

Download and try JJYB_AI VideoAutoCut. Build your own AI‑enhanced videos without a single line of code—just open the web UI, plug in your API keys, and let the AI do the heavy lifting.

Happy editing!
