Mistral AI launches Voxtral TTS, a low-cost open-source text-to-speech model

Key Points

Mistral AI launched Voxtral TTS, an open-source text-to-speech model for voice assistants and enterprise use.
The model supports nine languages and can create custom voices from under five seconds of audio.
It maintains voice characteristics across languages, offers latency as low as 90 milliseconds, and reportedly costs less than competing models.

French AI company Mistral AI has launched Voxtral TTS, a new open-source text-to-speech model designed for voice assistants and enterprise applications like customer support and sales. The model supports nine languages including English, French, German, and Arabic, and can create a custom voice from less than five seconds of audio. It maintains voice characteristics like accent and intonation when switching between languages, making it suitable for dubbing and real-time translation.

With latency as low as 90 milliseconds, Voxtral TTS is reportedly faster than competing models and costs a fraction of their price.
