Actionable Recommendations
Cost Reduction at Scale
Call costs reduced by 5x
Custom Voice Agent Creation
Create AI Voice Agents quickly
Emotion Tags
normalslowcryingsleepysighchuckle
Guided Emotion and Intonation
Input Streaming for Lower Latency
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multi-language Support
Over 12 languages
Open Source Release Planned
Orpheus Speech Models
Medium (3B)Small (1B)Tiny (400M)Nano (150M)
Personalized Agent Deployment
Pretrained and Finetuned Models
Pretrained modelsFinetuned models
Productivity Increase
Up to 52% increase
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than playback on A100 40GB for 3B model
Training Data Volume
100k+ hours of speech, billions of text tokens