Available Models
Explore the wide range of AI models supported by TTS WebUI
Text-to-Speech Models
Vall-E-X
Multilingual text-to-speech model supporting English, Chinese, and Japanese
By Plachtaa
StyleTTS2
StyleTTS2 is a text-to-speech model that generates high-quality speech with controllable style
By StyleTTS2 Team
SeamlessM4T
SeamlessM4T is a multilingual and multimodal translation model supporting text and speech
By Facebook
MMS
MMS (Massively Multilingual Speech) is a text-to-speech model supporting over 1000 languages
By Facebook
Tortoise TTS
Tortoise TTS is a high-quality text-to-speech model with voice cloning capabilities
By neonbjb
F5-TTS
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching.
By Yushen Chen et al.
Chatterbox
Chatterbox is Resemble AI's first production-grade open-source TTS model
By Resemble AI
Kokoro
Bark
XTTS
Parler-TTS
Parler-TTS is a training and inference library for high-fidelity text-to-speech (TTS) models.
By rsxdalv
CosyVoice
MARS5
DIA
GPT-SoVITS
GPT-SoVITS: A TTS solution powered by GPT and SoftVC VITS Singing Voice Conversion.
By rsxdalv
Audio & Music Generation Models
ACE-Step
Stable Audio
Stable Audio is a text-to-audio model for generating high-quality music and sound effects
By Stability AI
AudioCraft
AudioCraft provides the MusicGen and MAGNeT models for high-quality music and audio generation
By Facebook
AudioCraft Plus
AudioCraft Plus is an all-in-one WebUI for the original AudioCraft, adding many quality features on top.
By GrandaddyShmax
Audio Conversion Models
Vocos
Vocos is a fast neural vocoder that reconstructs high-quality audio from mel spectrograms or EnCodec tokens
By charactr
RVC
Demucs
Demucs is a music source separation model that can separate drums, bass, vocals, and other instruments
By Facebook
Conversational AI Models
Kimi Audio
Kimi Audio is an open-source audio foundation model for audio understanding, generation, and conversation
By Moonshot AI
MiMo-Audio