Bulk Uploads and Processing
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Granular Audio Adjustment
Guided Emotion and Intonation
Input Streaming for Lower Latency
Language Pairs Supported
300
Languages Supported (Dubbing)
20
Languages Supported (Live)
60
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multiple Speakers Support
No Multiple AI Solutions Needed
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Real-time AI Interpretation
Role-Based Access Control
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than playback on A100 40GB for 3B model
Training Data Volume
100k+ hours of speech, billions of text tokens