Posts tagged with: ASR

Content related to ASR

VibeVoice: Microsoft’s Open‑Source Voice AI Suite

March 15, 2026

VibeVoice is Microsoft's open-source voice AI toolkit for developers and researchers, covering long-form ASR, multi-speaker TTS, and real-time streaming. This post looks at its 60-minute ASR pipeline, its 90-minute TTS, and its lightweight real-time model, and shows how the suite integrates with Hugging Face Transformers for deployment.

Qwen3-ASR: Alibaba’s Open‑Source 52‑Language ASR Model

January 31, 2026

Tags: Open Source, Speech Recognition, Alibaba, ASR, Multilingual

Qwen3-ASR, Alibaba Cloud's latest release, brings state-of-the-art multilingual speech recognition to the open-source community. It supports 52 languages and 22 Chinese dialects, and its two models, at 1.7B and 0.6B parameters, perform strongly on benchmarks and rival commercial APIs. The repo ships with a full inference toolkit that runs on either transformers or the higher-performance vLLM backend, automatic timestamping via the Qwen3-ForcedAligner, and a ready-to-run Gradio demo. This guide walks researchers, developers, and hobbyists through downloading, setting up, benchmarking, and deploying Qwen3-ASR, either in Docker or directly on a GPU, to transcribe speech, music, and songs. Key highlights: multilingual support, streaming inference, forced alignment, quick-start scripts, Docker deployment, and API integration through OpenAI-compatible endpoints.
