ElevenLabs is an AI research company known for its speech models, which cover text-to-speech (TTS), voice cloning, and speech transcription. The latest model, v3, was released in 2026 and offers natural-sounding English speech along with broader coverage of languages other than English. The signature feature of ElevenLabs is "voices," which can be trained for specific purposes, such as cloning your own voice or speaking in a specific language or accent.
ElevenLabs v3 supports the following 74 languages: Afrikaans, Arabic, Armenian, Assamese, Azerbaijani, Belarusian, Bengali, Bosnian, Bulgarian, Catalan, Cebuano, Chichewa, Croatian, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Kirghiz, Korean, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malay, Malayalam, Mandarin Chinese, Marathi, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Sindhi, Slovak, Slovenian, Somali, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh (source: https://elevenlabs.io/docs/overview/models).
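The ElevenLabs speech samples were generated with the following command, where [voice_id] is the selected voice and [translated prompt] is the prompt text translated into the target language: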
ElevenLabsTTS.py --voice-id "[voice_id]" --model-id "eleven_v3" --text "[translated prompt]" --output elevenlabs.mp3
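
The contents of ElevenLabsTTS.py are not reproduced here. As a rough illustration only, a script accepting the same four flags could be a thin wrapper around the public ElevenLabs text-to-speech REST endpoint, along the lines of the sketch below. The function names, the default values, and the ELEVENLABS_API_KEY environment variable are assumptions made for this example and may differ from the script actually used.

```python
import argparse
import json
import os
import sys
import urllib.parse
import urllib.request

# Public ElevenLabs text-to-speech endpoint; the voice ID is appended to the path.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def parse_args(argv=None):
    """Parse the four flags that appear in the command above."""
    parser = argparse.ArgumentParser(description="Synthesize speech with ElevenLabs.")
    parser.add_argument("--voice-id", required=True)
    parser.add_argument("--model-id", default="eleven_v3")  # default is an assumption
    parser.add_argument("--text", required=True)
    parser.add_argument("--output", default="elevenlabs.mp3")  # default is an assumption
    return parser.parse_args(argv)


def build_request(voice_id, model_id, text, api_key):
    """Build (but do not send) the POST request for one synthesis call."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    url = f"{API_BASE}/{urllib.parse.quote(voice_id, safe='')}"
    return urllib.request.Request(
        url,
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def main(argv=None):
    args = parse_args(argv)
    # Hypothetical variable name: the key could equally come from a config file.
    api_key = os.environ["ELEVENLABS_API_KEY"]
    request = build_request(args.voice_id, args.model_id, args.text, api_key)
    # The response body is the encoded audio; write it to disk unchanged.
    with urllib.request.urlopen(request) as response, open(args.output, "wb") as out:
        out.write(response.read())


# Run only when invoked as a script with command-line arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Keeping request construction separate from the network call makes it possible to inspect the URL and payload for a given voice and model without spending API credits.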