Conversational Speech Generation
Dataset Size
1 million hours
Emotionally Intact Pseudovoices
Fully Automated Redaction
Irreversible Anonymization
Model Sizes
Tiny: 1B backbone, 100M decoder; Small: 3B backbone, 250M decoder; Medium: 8B backbone, 300M decoder
Multilingual Support
English, French, German, Spanish, Italian
Multiple Speaker Handling
No Disruption to Existing Processes
Objective Metrics
Word Error Rate, Speaker Similarity, Homograph Disambiguation, Pronunciation Consistency
Partial Multilingual Support Planned
Planned for 20+ languages
Preserves Emotions & Expressions
Speaker, Age, Gender, Expressiveness Selection
Subjective Metrics
Comparative Mean Opinion Score
Workflow Integration
Works with leading DAWs and editing suites