16-bit 48k WAV Recording
16-bit 48 kHz WAV
Adjust Enhancement Strength
Audiogram Theme Customization
Custom Audiogram Backgrounds
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Guided Emotion and Intonation
Industry-Leading Transcription
Input Streaming for Lower Latency
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Project Templates
True crime podcast, Class lecture, Fashion podcast
Record Solo or with Guests
Sample Finetuning Scripts
Sliding Window Detokenizer
Speaker Separated Downloads
Streaming Inference Speed
Faster than real-time playback on an A100 40GB GPU for the 3B model
Training Data Volume
100k+ hours of speech, billions of text tokens
Transcribe Audio and Video