Real-time AI - Open Source Projects

Helios: 14B Real-Time Video Gen at 19.5 FPS

March 25, 2026

Tags: Open Source, Real-time AI, HuggingFace, Video Generation, diffusion-models

Discover Helios, a 14B-parameter video generation model from PKU-YuanGroup that generates minute-scale, high-quality videos at 19.5 FPS on a single H100 GPU. The project reports reaching this speed without anti-drifting tricks or acceleration hacks, relying on architectural changes instead. It supports text-to-video (T2V), image-to-video (I2V), video-to-video (V2V), and interactive generation, with Day-0 support for Diffusers, SGLang, vLLM-Omni, and Ascend NPU. With group offloading, it can run locally in about 6 GB of VRAM. Complete training code and three model variants (Base, Mid, Distilled) are available now.
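To put the 19.5 FPS figure in context, the short Python sketch below works out the average time per generated frame and the wall-clock time needed for a one-minute clip. The playback frame rates (16 and 24 FPS) and the function names are illustrative assumptions, not values taken from the Helios project; only the 19.5 FPS throughput comes from the post.

```python
# Back-of-the-envelope arithmetic for a generator running at 19.5 FPS.
# GEN_FPS is the figure quoted in the post; playback rates are assumptions.
GEN_FPS = 19.5


def per_frame_ms(gen_fps: float = GEN_FPS) -> float:
    """Average time spent per generated frame, in milliseconds."""
    return 1000.0 / gen_fps


def generation_seconds(clip_seconds: float, playback_fps: float,
                       gen_fps: float = GEN_FPS) -> float:
    """Wall-clock seconds needed to generate a clip of the given length."""
    frames = clip_seconds * playback_fps  # total frames in the clip
    return frames / gen_fps


if __name__ == "__main__":
    print(f"{per_frame_ms():.1f} ms per generated frame")
    for playback_fps in (16, 24):  # hypothetical playback rates
        secs = generation_seconds(60, playback_fps)
        print(f"60 s clip at {playback_fps} FPS playback: {secs:.1f} s to generate")
```

Generation keeps pace with playback only when the playback rate is at or below the generation rate, so whether 19.5 FPS counts as real-time depends on the output frame rate the model targets.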


Practical Open Source Projects

WhisperLiveKit: Real-time Local Speech-to-Text

August 30, 2025

Tags: Open Source, Python, Real-time AI, Speech-to-Text, Whisper

Discover WhisperLiveKit, an open-source project for real-time, fully local speech-to-text, translation, and speaker diarization. It builds on streaming research such as SimulStreaming and WhisperStreaming to deliver high accuracy at low latency, overcoming the limitations of traditional chunk-by-chunk audio processing. It ships with a server and a web UI, and suits applications ranging from meeting transcription and accessibility tools to content creation and customer service analysis. Installation is via pip, with configuration options for different models and backends and Docker deployment guides for both CPU and GPU environments.


Practical Open Source Projects

TEN VAD: High-Performance, Lightweight Voice Activity Detector

June 30, 2025

Tags: Open Source, Real-time AI, Voice Activity Detection, Speech Processing, Conversational AI

Discover TEN VAD, a low-latency Voice Activity Detector (VAD) from the TEN framework. Designed for real-time conversational AI, it is presented as more precise and more efficient than widely used detectors such as WebRTC VAD and Silero VAD. It has a lightweight footprint, runs cross-platform (Linux, Windows, macOS, Android, iOS, and the Web via WASM), and supports several programming languages, including Python, JS, and C. The project targets developers building agent-friendly, high-performance voice applications that need accurate speech detection and low latency in human-agent interactions. Explore its features, installation guides, and how it fits into the broader TEN ecosystem for multimodal conversational AI.


Practical Open Source Projects

Airi: Open-Source AI VTuber for Real-Time Interaction

June 09, 2025

Tags: Open Source AI, AI VTuber, Virtual Character, Real-time AI, Minecraft AI

Discover Airi, an open-source project that aims to create AI-driven virtual characters capable of real-time voice chat and even of playing Minecraft and Factorio. Built with web technologies such as WebGPU and WebAudio, Airi runs in browsers and on desktop. The project invites developers, artists, and designers to contribute to its vision of bringing AI waifus and virtual personalities into our digital worlds. Learn about its current capabilities, its development roadmap, and how you can get involved in shaping AI-powered virtual companions.

