Device Compatibility
Any device, anywhere
Effortless Multi-way Translation
Emotion Tags
normalslowcryingsleepysighchuckle
Guided Emotion and Intonation
Highlight and Summary Extraction
Industry Domains Supported
ManufacturingTechnologyTransportationChurchPharmaceutical
Industry Use Cases
MeetingsWorship servicesGlobal all-handsTrainingEvents
Industry-specific Fine-tuning
Industry-specific Terminology
Input Streaming for Lower Latency
Lifelike Human Voices
TerraVenusLunaMercuryMarsJupiter
LLM-based Customizability
Manual Note-taking Not Needed
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multi-Speaker, Multi-Language Support
Multilingual Meeting Notes
Open Source Release Planned
Orpheus Speech Models
Medium (3B)Small (1B)Tiny (400M)Nano (150M)
Pretrained and Finetuned Models
Pretrained modelsFinetuned models
Real-Time AI Voice Translation
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than playback on A100 40GB for 3B model
Supported Language List
50+
Training Data Volume
100k+ hours of speech, billions of text tokens
Translation Accuracy Baseline
>85%
Translation Cost Savings
15X
Voice-to-Text Translation Speed
under 300ms
Voice-to-Voice Translation Speed
1-3 seconds