Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Emotionally Intact Pseudovoices
Fully Automated Redaction
Guided Emotion and Intonation
Input Streaming for Lower Latency
Irreversible Anonymization
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multilingual Support
English, French, German, Spanish, Italian
No Disruption to Existing Processes
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Preserves Emotions & Expressions
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Sample Finetuning Scripts
Sliding Window Detokenizer
Speaker, Age, Gender, Expressiveness Selection
Streaming Inference Speed
Faster than real-time playback on an A100 40GB GPU for the 3B model
Training Data Volume
100k+ hours of speech, billions of text tokens
Workflow Integration
Works with leading DAWs and editing suites