Audio Experience Amplification
Audio Experience Customization
Content-to-Audio Conversion
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
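As a rough illustration of how the emotion tags above might be applied to input text, the sketch below prefixes a prompt with a tag and checks a prompt for tags outside the listed set. The tag names come from the list above; the angle-bracket markup (`<sigh>`) and the helper names `tag_prompt` and `find_unknown_tags` are assumptions made for this example and are not confirmed by the source.

```python
import re

# Tag names as listed above. The <tag> markup is a hypothetical convention
# chosen for this sketch; the actual syntax is not given in the source.
EMOTION_TAGS = ("normal", "slow", "crying", "sleepy", "sigh", "chuckle")
TAG_PATTERN = re.compile(r"<([a-z]+)>")


def tag_prompt(text: str, tag: str) -> str:
    """Prefix text with an emotion tag, rejecting tags outside the known set."""
    if tag not in EMOTION_TAGS:
        raise ValueError(f"unknown emotion tag: {tag!r}")
    return f"<{tag}> {text}"


def find_unknown_tags(prompt: str) -> list:
    """Return tag names used in the prompt that are not in EMOTION_TAGS."""
    return [t for t in TAG_PATTERN.findall(prompt) if t not in EMOTION_TAGS]
```

Validating tags before synthesis is one way to catch typos early, since an unrecognized tag would otherwise most likely be read aloud or silently ignored.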
Guided Emotion and Intonation
Increased Engagement
95% increase in visits/users
Input Streaming for Lower Latency
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multi-Stage Audio Workflow
Creation, Distribution, Monetization
Multiple Products
Trinity Play, Trinity Mix, Trinity Conduct, Trinity Player, Trinity Pulse, Trinity Octopus
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Quick Conversion Time
Within minutes
Sample Finetuning Scripts
Scalable Platform
Robust, Scalable, Seamless, Simple
Sliding Window Detokenizer
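One common way to stream audio out of a non-streaming decoder is to re-decode a short window of recent token frames each time a new frame arrives and emit only the newest slice of audio. The sketch below shows that general idea under stated assumptions: the window size, the samples-per-frame value, and the `toy_decoder` stand-in are all hypothetical, and this is not presented as the actual Orpheus implementation.

```python
from typing import Callable, Iterable, Iterator, List


def toy_decoder(window_frames: List[int]) -> List[float]:
    """Hypothetical stand-in for a real decoder: 2 samples per frame."""
    return [float(f) for f in window_frames for _ in range(2)]


def sliding_window_decode(
    frames: Iterable[int],
    decode_window: Callable[[List[int]], List[float]],
    window: int = 4,
    samples_per_frame: int = 2,
) -> Iterator[List[float]]:
    """Yield audio chunks while token frames are still arriving.

    The decoder always sees the last `window` frames, so each new frame is
    decoded with left context. The first full window is emitted whole;
    afterwards only the samples belonging to the newest frame are emitted.
    """
    buffer: List[int] = []
    emitted_any = False
    for frame in frames:
        buffer.append(frame)
        buffer = buffer[-window:]  # keep only the most recent frames
        if len(buffer) < window:
            continue  # not enough context yet
        audio = decode_window(buffer)
        if not emitted_any:
            emitted_any = True
            yield audio
        else:
            yield audio[-samples_per_frame:]
    # Stream ended before a full window was available: decode what we have.
    if not emitted_any and buffer:
        yield decode_window(buffer)
```

The trade-off in this design is redundant computation (each frame is decoded up to `window` times) in exchange for being able to start playback before generation finishes.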
Streaming Inference Speed
Faster than real-time playback on an A100 40GB GPU for the 3B model
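A claim of this kind is usually expressed as a real-time factor (RTF): generation time divided by the duration of the audio produced, where a value below 1 means audio is generated faster than it plays. The sketch below computes it; the function names are chosen for this example, and any timings passed in are placeholders, not measurements from the source.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Return generation time divided by the duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def is_faster_than_playback(generation_seconds: float, audio_seconds: float) -> bool:
    """True when audio is produced faster than it would take to play back."""
    return real_time_factor(generation_seconds, audio_seconds) < 1.0
```

For streaming use, the RTF needs to stay below 1 on a sustained basis, otherwise the playback buffer eventually runs dry.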
Training Data Volume
100k+ hours of speech, billions of text tokens