ChatTTS
Open-source conversational text-to-speech model optimized for dialogue, with fine-grained prosody control including laughter and pauses.
About
ChatTTS is a generative speech model designed specifically for conversational applications rather than general-purpose TTS. It produces natural, expressive audio suited to chatbot responses, LLM assistants, and interactive dialogue systems.
The model supports English and Chinese and gives developers granular control over prosodic elements such as pauses, laughter, and interjections. It is pre-trained on roughly 40,000 hours of speech data. The open-source code is released under AGPLv3, while the model weights carry a CC BY-NC 4.0 license that limits them to non-commercial use. It targets developers and researchers who need high-quality, dialogue-focused speech synthesis.
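As a rough illustration of the prosody control described above, the sketch below composes a sentence-level control prompt. It assumes the token format shown in the upstream project README (`[oral_N]`, `[laugh_N]`, `[break_N]`) and the value ranges given there; the helper name `refine_prompt` is hypothetical and not part of ChatTTS. Treat it as a minimal sketch rather than the library's own interface.

```python
# Hypothetical helper: builds a control prompt string in the format the
# ChatTTS README uses for sentence-level prosody, e.g. "[oral_2][laugh_0][break_6]".
# The upper bounds (oral 0-9, laugh 0-2, break 0-7) are assumptions taken from
# that README and may differ between releases.

def refine_prompt(oral: int, laugh: int, pause: int) -> str:
    """Return a prompt such as '[oral_2][laugh_0][break_6]'."""
    settings = {
        "oral": (oral, 9),     # amount of filler words / interjections
        "laugh": (laugh, 2),   # amount of inserted laughter
        "break": (pause, 7),   # amount of inserted pauses
    }
    for name, (value, upper) in settings.items():
        if not 0 <= value <= upper:
            raise ValueError(f"{name} must be between 0 and {upper}, got {value}")
    # Dicts preserve insertion order, so tokens come out as oral, laugh, break.
    return "".join(f"[{name}_{value}]" for name, (value, _) in settings.items())


prompt = refine_prompt(oral=2, laugh=0, pause=6)
print(prompt)

# With the library installed, the string would be passed along these lines
# (based on the README; verify against the installed version):
#   params = ChatTTS.Chat.RefineTextParams(prompt=prompt)
#   wavs = chat.infer(texts, params_refine_text=params)
```

Keeping the prompt construction in a small validated function makes it easier to expose the three prosody settings as ordinary application parameters instead of hand-written token strings.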