Auto-Learning from Knowledge Sources
Auto-saving Training Editor
Automatic Interruption Handling
Call Summary Key Points
3
Custom Voice and Personality
Custom Workflow Automation
Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Enterprise-Grade Features
Guided Emotion and Intonation
Input Streaming for Lower Latency
Knowledge Base Integration
Language/Country Support
English (US), Spanish, Chinese, Vietnamese, Korean, French, Russian, German, Hindi, Portuguese, Italian, Polish, Japanese, Thai, Dutch, Flemish, Ukrainian, Greek, Romanian, Hungarian, Czech, Swedish, Bulgarian, Danish, Finnish, Slovak, Norwegian, Lithuanian, Latvian, Estonian, Catalan, Malay, Indonesian
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Multiple Concurrent Calls
Multiple Deployment Channels
Phone, Website Chat, WhatsApp, Discord
Natural Language Training
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Production-Ready in Minutes
Real-Time Emotion Tracking
Real-Time Performance Monitoring
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than playback on A100 40GB for 3B model
Train from Multiple Knowledge Sources
Training Data Volume
100k+ hours of speech, billions of text tokens
Unlimited Scalability
Unlimited
Voice Transcription & Playback