Emotion Tags
normal, slow, crying, sleepy, sigh, chuckle
Flexible Download Options
Guided Emotion and Intonation
Input Streaming for Lower Latency
Language List
Arabic (Algeria); Arabic (Bahrain); Arabic (Egypt); Arabic (Iraq); Arabic (Israel); Arabic (Jordan); Arabic (Kuwait); Arabic (Lebanon); Arabic (Libya); Arabic (Morocco); Arabic (Oman); Arabic (Qatar); Arabic (Saudi Arabia); Arabic (Palestinian); Arabic (Syria); Arabic (Tunisia); Arabic (United Arab Emirates); Arabic (Yemen); Bulgarian (Bulgaria); Catalan (Spain); Chinese (Cantonese, Traditional); Chinese (Mandarin, Simplified); Chinese (Taiwanese Mandarin); Croatian (Croatia); Czech (Czech Republic); Danish (Denmark); Dutch (Netherlands); English (Australia); English (Canada); English (Ghana); English (Hong Kong); English (India); English (Ireland); English (Kenya); English (New Zealand); English (Nigeria); English (Philippines); English (Singapore); English (South Africa); English (Tanzania); English (United Kingdom); English (United States); Estonian (Estonia); Filipino (Philippines); Finnish (Finland); French (Canada); French (France); French (Switzerland); German (Austria); German (Germany); Greek (Greece); Gujarati (Indian); Hebrew (Israel); Hindi (India); Hungarian (Hungary); Indonesian (Indonesia); Irish (Ireland); Italian (Italy); Japanese (Japan); Korean (Korea); Latvian (Latvia); Lithuanian (Lithuania); Malay (Malaysia); Maltese (Malta); Marathi (India); Norwegian (Bokmål, Norway); Polish (Poland); Portuguese (Brazil); Portuguese (Portugal); Romanian (Romania); Russian (Russia); Slovak (Slovakia); Slovenian (Slovenia); Spanish (Argentina); Spanish (Bolivia); Spanish (Chile); Spanish (Colombia); Spanish (Costa Rica); Spanish (Cuba); Spanish (Dominican Republic); Spanish (Ecuador); Spanish (El Salvador); Spanish (Equatorial Guinea); Spanish (Guatemala); Spanish (Honduras); Spanish (Mexico); Spanish (Nicaragua); Spanish (Panama); Spanish (Paraguay); Spanish (Peru); Spanish (Puerto Rico); Spanish (Spain); Spanish (Uruguay); Spanish (USA); Spanish (Venezuela); Swedish (Sweden); Tamil (India); Telugu (India); Thai (Thailand); Turkish (Turkey); Vietnamese (Vietnam)
Afrikaans (South Africa); Albanian (Albania); Amharic (Ethiopia); Armenian (Armenia); Azerbaijani (Azerbaijan); Basque (Spain); Bengali (India); Burmese (Myanmar); Czech (Czech); Dutch (Belgium); French (Belgium); Galician (Spain); Georgian (Georgia); German (Switzerland); Icelandic (Iceland); Irish (Ireland); Italian (Switzerland); Javanese (Indonesia); Kannada (India); Kazakh (Kazakhstan); Khmer (Cambodia); Lao (Laos); Macedonian (North Macedonia); Mongolian (Mongolia); Nepali (Nepal); Persian (Iran); Serbian (Serbia); Sinhala (Sri Lanka); Swahili (Kenya); Swahili (Tanzania); Ukrainian (Ukraine); Uzbek (Uzbekistan); Zulu (South Africa)
LLM-based Customizability
Model Tokenizer Type
Non-streaming (CNN-based) tokenizer
Open Source Release Planned
Orpheus Speech Models
Medium (3B), Small (1B), Tiny (400M), Nano (150M)
Pretrained and Finetuned Models
Pretrained models, Finetuned models
Sample Finetuning Scripts
Sliding Window Detokenizer
Streaming Inference Speed
Faster than playback on A100 40GB for 3B model
Supported Language List
75
Supported Utilization Areas
Call Centers, Journalists, Healthcare, Lawyers, Media and Broadcasting, Podcasts, Government, Researchers, Interviews, Students, Meetings, Subtitle
Training Data Volume
100k+ hours of speech, billions of text tokens
YouTube Link Transcription