Crystal-Clear Transcription
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Fast Transcription Speed
8x faster than other models
Guided Emotion and Intonation
Input Streaming for Lower Latency
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multi-Language Transcription
No Signup Required for Test
Noisy Environment Accuracy
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Paid Plans Full Edit History
Paid Plans Max File Duration
600
Paid Plans Unlimited Daily Transcription
Paid Plans Watermark-Free Downloads
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Real-Time Speech Transcription
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than playback on A100 40GB for 3B model
Supported Language List
100
Training Data Volume
100k+ hours of speech, billions of text tokens