Canopy Labs

State-of-the-art AI speech models that sound indistinguishable from real humans

Orpheus Speech Models

Medium (3B), Small (1B), Tiny (400M), Nano (150M)

Zero-Shot Voice Cloning

Guided Emotion and Intonation

Realtime Streaming

Input Streaming for Lower Latency

Open Source Release Planned

Llama Architecture

Llama

Pretrained and Finetuned Models

Pretrained models, Finetuned models

Sample Finetuning Scripts

Text to Speech

Handles Disfluencies

Emotion Tags

normal, slow, crying, sleepy, sigh, chuckle

Training Data Volume

100k+ hours of speech, billions of text tokens

LLM-based Customizability

Streaming Inference Speed

Faster than real-time playback on an A100 40GB GPU for the 3B model
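"Faster than playback" is usually expressed as a real-time factor (RTF) below 1: the time spent generating audio divided by the duration of the audio produced. The sketch below only illustrates that arithmetic. The function names are hypothetical and the numbers in the usage note are made up, not measurements from Orpheus.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Return generation time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def is_faster_than_playback(generation_seconds: float, audio_seconds: float) -> bool:
    """True when audio is produced faster than it would take to play it."""
    return real_time_factor(generation_seconds, audio_seconds) < 1.0
```

For example, producing 4 seconds of audio in 2 seconds gives an RTF of 0.5, which is the condition a streaming player needs in order to avoid stalling.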

Model Tokenizer Type

Non-streaming (CNN-based) tokenizer

Sliding Window Detokenizer
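One common way to stream audio from a non-streaming decoder is to re-decode a fixed-size window of the most recent tokens and emit only the newest slice of the result. The sketch below shows that general idea under stated assumptions: `decode` is a hypothetical stand-in for a real detokenizer, each token is assumed to map to a fixed number of samples, and the default `window`, `hop`, and `samples_per_token` values are illustrative. It is not taken from the Orpheus code.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def sliding_window_detokenize(
    tokens: Iterable[int],
    decode: Callable[[Sequence[int]], Sequence[float]],  # hypothetical decoder
    window: int = 4,              # tokens decoded together (assumed value)
    hop: int = 1,                 # new tokens between emissions, hop <= window
    samples_per_token: int = 2,   # assumed fixed samples per token
) -> Iterator[List[float]]:
    """Yield audio chunks as tokens arrive, decoding a sliding window."""
    if not 0 < hop <= window:
        raise ValueError("hop must satisfy 0 < hop <= window")
    buffer: List[int] = []
    emitted_first = False
    since_last = 0
    for tok in tokens:
        buffer.append(tok)
        if len(buffer) > window:
            buffer.pop(0)  # keep only the most recent `window` tokens
        if len(buffer) < window:
            continue  # not enough context yet; nothing is emitted
        if not emitted_first:
            # First full window: emit everything decoded so far.
            emitted_first = True
            yield list(decode(buffer))
            continue
        since_last += 1
        if since_last == hop:
            since_last = 0
            audio = decode(buffer)
            # Emit only the samples that belong to the newest `hop` tokens.
            yield list(audio[-hop * samples_per_token:])
    # Tokens left over after the last full hop are not emitted in this sketch.
```

The window gives the decoder context around each new token, at the cost of re-decoding overlapping tokens. A larger `hop` lowers that overhead but raises latency.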

Demo Availability

Python Package for Streaming

GitHub Repository Access

Hugging Face Model Access

Google Colab Notebook

Baseten 1-Click Deployment

Llama Ecosystem Support

English Language Only

No Explicit Pricing Details

No API Mentioned

No Mention of File Format Support