Cannot Model Conversation Structure
Commercial Use Restrictions
Consent Required for Voice Cloning
English Language Dominance
Free Tier Generation Speed
slower
Memory Bottleneck in Training
No Pre-trained Language Model Use
Personal Use Only on Free Tier
Prohibited Use Cases
Impersonation, Fraud, Hate Speech, Spam
Real-Time Generation Delay
RVQ time-to-first-audio scales poorly