Accelerated Content Delivery
Audio Post-Production Automation
Custom AI Music Generation
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Genre Diversity
Funk, Country, World, RnB, Rock, Blues, Cinematic, Acoustic, House, Lounge, Pop, Electronic, Reggaeton Pop, Hip Hop
Guided Emotion and Intonation
Input Streaming for Lower Latency
LLM-based Customizability
Manual Editing Not Required
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Mood Matching
Dreamy, Happy, Restless, Dynamic, Calming, Exciting, Busy & Frantic, Dark, Chasing, Euphoric
No Manual Sound Search Needed
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Pause or Cancel Subscription
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than real-time playback on an A100 40GB GPU for the 3B model
Training Data Volume
100k+ hours of speech, billions of text tokens
Use Cases
Travel, Storytelling, Ads, Photography, Horror & Thriller, Cinematic, Workout & Wellness