Edge-TTS: Free Text-to-Speech from Python
Harnessing the Power of Microsoft Edge TTS with the edge-tts Python Library
For developers who want a free text-to-speech (TTS) option, the edge-tts Python library is an open-source project worth a look. It uses Microsoft Edge's online TTS service, so you can convert text into speech directly from Python applications without specialized hardware, a Windows installation, or a paid API key. Because the service is online, an internet connection is required.
Effortless Installation and Usage
Getting started with edge-tts is straightforward. A simple pip install edge-tts command is all that's required to integrate its capabilities into your development environment. For those who primarily intend to use the command-line interface, pipx install edge-tts is a recommended alternative.
The library provides a user-friendly command-line interface for quick audio generation. You can easily create audio files and corresponding subtitle files with commands like:
$ edge-tts --text "Hello, world!" --write-media hello.mp3 --write-subtitles hello.srt
For immediate playback, the edge-playback command can be used:
$ edge-playback --text "Hello, world!"
It's worth noting that edge-playback requires the mpv command-line player for playback, except on Windows systems.
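Before calling edge-playback from a script, it can help to confirm that the tools it depends on are present. The following is a minimal sketch using only the standard library; the function name playback_ready and its injectable which parameter are illustrative choices, not part of edge-tts.

```python
import platform
import shutil


def playback_ready(system=None, which=shutil.which):
    """Return True if edge-playback looks usable on this machine.

    `system` and `which` can be overridden, which keeps the check easy to test.
    """
    system = system or platform.system()
    if which("edge-playback") is None:
        return False  # the edge-playback script is not on PATH
    # mpv is only needed outside Windows, per the note above.
    return system == "Windows" or which("mpv") is not None
```

Calling playback_ready() with no arguments checks the current machine; a False result usually means either edge-tts or mpv still needs to be installed.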
Customization and Voice Selection
edge-tts lets you switch between the many voices supported by Microsoft's service using the --voice option. To see the available voices and their characteristics, run:
$ edge-tts --list-voices
This command outputs a comprehensive list of voices, including their names, genders, content categories, and voice personalities, allowing you to select the perfect vocal profile for your needs.
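The same voice list is available from Python, which makes it easy to filter. The sketch below assumes that edge_tts.list_voices() is a coroutine returning a list of dictionaries with ShortName, Locale, and Gender keys; pick_voices is a hypothetical helper written for this example.

```python
import asyncio


def pick_voices(voices, locale_prefix, gender=None):
    """Return sorted ShortName values matching a locale prefix and optional gender."""
    return sorted(
        v["ShortName"]
        for v in voices
        if v["Locale"].startswith(locale_prefix)
        and (gender is None or v["Gender"] == gender)
    )


async def main():
    # Third-party import (pip install edge-tts); the call needs network access.
    import edge_tts

    voices = await edge_tts.list_voices()
    for name in pick_voices(voices, "en-", gender="Female"):
        print(name)
```

Running asyncio.run(main()) prints the matching voice names, any of which can then be passed to --voice.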
You can also fine-tune the speech output. Speech rate, volume, and pitch are adjusted with the --rate, --volume, and --pitch options, respectively. Negative values need special care: attach the value to the option with an equals sign (e.g., --rate=-50% rather than --rate -50%), otherwise the command line interprets -50% as a separate option.
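When edge-tts is driven from a script, building the argument list in code avoids the negative-value pitfall. This sketch assumes rate and volume take signed percentages and pitch takes a signed value in Hz; the pitch unit is an assumption not stated above. The helpers signed, build_command, and run are hypothetical names.

```python
import subprocess


def signed(value, unit):
    """Format an integer offset with an explicit sign, e.g. -50 -> '-50%'."""
    return f"{value:+d}{unit}"


def build_command(text, out_path, rate=0, volume=0, pitch=0):
    """Build an edge-tts argument list using the --option=value form."""
    return [
        "edge-tts",
        # The '=' keeps a leading '-' from being parsed as a new option.
        f"--rate={signed(rate, '%')}",
        f"--volume={signed(volume, '%')}",
        f"--pitch={signed(pitch, 'Hz')}",  # Hz unit is an assumption
        "--text", text,
        "--write-media", out_path,
    ]


def run(text, out_path, **offsets):
    """Invoke the edge-tts CLI; raises CalledProcessError on failure."""
    subprocess.run(build_command(text, out_path, **offsets), check=True)
```

For example, run("Hello, world!", "hello.mp3", rate=-50) would invoke the CLI with --rate=-50% while leaving volume and pitch at zero offset.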
Programmatic Integration
Beyond its command-line utility, edge-tts is designed for seamless integration into Python projects. Developers can import and utilize the module directly within their code, opening up possibilities for creating dynamic text-to-speech functionalities in a wide range of applications, from interactive bots to content creation tools.
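As a rough illustration of module use, the sketch below streams synthesized audio and writes it to a file. It assumes that edge_tts.Communicate(text, voice).stream() is an async generator yielding dictionaries whose type is "audio" (with the bytes under data) or a timing-metadata type; collect_audio and synthesize are hypothetical helper names, and the default voice is just an example.

```python
import asyncio


async def collect_audio(chunks):
    """Join the audio payloads from an async stream of edge-tts chunks."""
    audio = bytearray()
    async for chunk in chunks:
        # Non-audio chunks carry timing metadata and are skipped here.
        if chunk["type"] == "audio":
            audio.extend(chunk["data"])
    return bytes(audio)


async def synthesize(text, out_path, voice="en-US-AriaNeural"):
    """Synthesize text to out_path and return the number of bytes written."""
    # Third-party import (pip install edge-tts); the call needs network access.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice)
    audio = await collect_audio(communicate.stream())
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

A call such as asyncio.run(synthesize("Hello, world!", "hello.mp3")) would produce the same kind of file as the command-line example. If the audio bytes are not needed in memory, the Communicate object's save method writes the file directly.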
Other projects, such as hass-edge-tts and Podcastfy, already build on the edge-tts module, which shows that it works in practice beyond one-off scripts.
With a simple command-line interface, a Python API, and an open-source codebase, edge-tts is a practical choice for adding text-to-speech to Python projects without paying for an API.