Cannot Model Conversation Structure
English Language Dominance
Memory Bottleneck in Training
No Pre-trained Language Model Use
Platform Limitation
macOS Ventura 13+ (Apple Silicon recommended)
Windows 10+
Real-Time Generation Delay
RVQ time-to-first-audio scales poorly