Cannot Model Conversation Structure
English Language Dominance
Memory Bottleneck in Training
No Collaboration Features
No Mention of Third-Party Integrations
No Pre-trained Language Model Use
No Pricing Details Listed
Real-Time Generation Delay
RVQ Time-to-First-Audio Scales Poorly