Posts tagged with: Whisper
SpeechRecognition: Ultimate Python Speech-to-Text Library
Discover SpeechRecognition, a Python library for converting speech to text. It supports offline engines such as CMU Sphinx, Vosk, and OpenAI Whisper, plus cloud APIs from Google, OpenAI, Groq, and Cohere. Install it with a single pip command and start transcribing microphone input or audio files. It suits voice assistants, transcription apps, and meeting recorders. The post includes detailed setup guides for PyAudio and PocketSphinx, along with troubleshooting tips.
AI‑Video‑Transcriber: Transcribe and Summarize Any Video with AI
Discover how AI‑Video‑Transcriber combines speech‑to‑text and AI‑powered summarization for online video. Built on Faster‑Whisper and FastAPI, with optional OpenAI GPT‑4o translation, it supports 30+ sites, including YouTube, TikTok, and Bilibili, and 100+ languages. Learn how to install it via Docker or scripts, configure Whisper models, and optimize performance for long‑form content. It is an open‑source option for developers, content creators, and researchers that runs on anything from a laptop to a cloud server.
WhisperLiveKit: Real-time Local Speech-to-Text
Discover WhisperLiveKit, an open-source project for real-time, fully local speech-to-text, translation, and speaker diarization. It builds on streaming research such as SimulStreaming and WhisperStreaming, aiming for high accuracy and low latency while avoiding the limitations of traditional chunk-by-chunk audio processing. With its bundled server and web UI, WhisperLiveKit suits applications ranging from meeting transcription and accessibility tools to content creation and customer service analysis. The project installs via pip, offers configuration options for different models and backends, and provides Docker deployment guides for both CPU and GPU environments.