Automatic Scalability
Millions of concurrent calls
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Guided Emotion and Intonation
Input Streaming for Lower Latency
LLM-based Customizability
Low Latency
500 ms latency
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multi-language Support
18+ languages
Multiple Deployment Channels
AI phone calls, web calls, SMS, chat
Natural Language Conversations
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Platform Uptime
99.99% uptime
Pretrained and Finetuned Models
Pretrained models, finetuned models
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than playback on an A100 40GB GPU for the 3B model
Training Data Volume
100k+ hours of speech, billions of text tokens
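
The "faster than playback" claim under Streaming Inference Speed can be read as a real-time factor (RTF) below 1: the model spends less time generating audio than the audio takes to play. The sketch below shows one way to check this. The function names and the timings are hypothetical and for illustration only; they are not measured results or part of any Orpheus API.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Return generation time divided by the duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def is_faster_than_playback(generation_seconds: float, audio_seconds: float) -> bool:
    """True when audio is produced more quickly than it plays back (RTF < 1)."""
    return real_time_factor(generation_seconds, audio_seconds) < 1.0


# Hypothetical timings, for illustration only (not measured results).
print(is_faster_than_playback(generation_seconds=4.0, audio_seconds=10.0))
```

An RTF below 1 is what lets a streaming client start playback before generation finishes without running out of buffered audio.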