90-95% Voice Recognition Accuracy
90-95%
Custom Workflow Automation
Customer Support Automation
Data Security & Encryption
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
File Uploads for Training
Guided Emotion and Intonation
Input Streaming for Lower Latency
Interactive Product Assistance
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
No Limits on Customization
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Real-Time Interactive Voice Assistance
Real-Time Language Translation
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than real-time playback on an A100 40GB GPU for the 3B model
Supported Use Cases
Customer service, Sales, Healthcare, Finance, Education, Logistics, Government
Training Data Volume
100k+ hours of speech, billions of text tokens