Sesame vs Uberduck AI Voices
Comparing the features of Sesame to Uberduck AI Voices
Feature
Sesame
Uberduck AI Voices
Capability Features
API Access
Consistent Personality
Context Awareness
Conversational Dynamics
Conversational Speech Generation
Dataset Size
1 million hours
Emotional Intelligence
Evaluation Suite
Industry-Leading Accuracy
Make Music
Make Videos
Make Voiceovers
Model Sizes
Tiny: 1B backbone, 100M decoder
Small: 3B backbone, 250M decoder
Medium: 8B backbone, 300M decoder
Multiple Speaker Handling
Objective Metrics
Word Error Rate
Speaker Similarity
Homograph Disambiguation
Pronunciation Consistency
Partial Multilingual Support Planned
Planned for 20+ languages
Pronunciation Correction
Sequence Length
2048
Single-Stage Model
Speech to Speech
Subjective Metrics
Comparative Mean Opinion Score
Supported Language List
Afrikaans
Albanian
Amharic
Arabic
Armenian
Azerbaijani
Bengali
Bosnian
Bulgarian
Burmese
Catalan
Chinese
Croatian
Czech
Danish
Dutch
English
Estonian
Filipino
Finnish
French
Georgian
German
Greek
Hebrew
Hindi
Hungarian
Icelandic
Indonesian
Irish
Italian
Japanese
Javanese
Kannada
Kazakh
Khmer
Korean
Lao
Latvian
Lithuanian
Macedonian
Malay
Maltese
Mandarin
Mongolian
Nepali
Norwegian
Pashto
Persian
Polish
Portuguese
Romanian
Russian
Serbian
Sinhala
Slovak
Slovenian
Somali
Spanish
Swahili
Swedish
Tagalog
Tamil
Telugu
Thai
Turkish
Ukrainian
Urdu
Uzbek
Vietnamese
Welsh
Zulu
Text and Audio Input
Text
Audio
Text to Rapping
Text to Singing
Text to Speech
Training Epochs
5
Voice Cloning
Voice Conversion
Voice Options
Abbi
Abeo
Aditi
AIGenerate1
AIGenerate2
Aisha Patel
Alfie
Amber
Amy
Ana
Andrew
AndrewMultilingualNeural
Annette
Aria
Arthur
Ashley
Asilia
Ava
AvaMultilingualNeural
Ayanda
B La B
Bella
Big G
Blue
Brandon
Brian
BrianMultilingualNeural
Carly
Chilemba
Christopher
Clara
Connor
Cora
Danielle
Darren
David Kim
Davis
Duncan
Elena Rodriguez
Elimu
Elizabeth
Elliot
Elsie
Emily
Emma
EmmaMultilingualNeural
en-AU-Neural2-A
en-AU-Neural2-B
en-AU-Neural2-C
en-AU-Neural2-D
en-AU-News-E
en-AU-News-F
en-AU-News-G
en-AU-Polyglot-1
en-AU-Standard-A
en-AU-Standard-B
en-AU-Standard-C
en-AU-Standard-D
en-AU-Wavenet-A
en-AU-Wavenet-B
en-AU-Wavenet-C
en-AU-Wavenet-D
en-GB-Neural2-A
en-GB-Neural2-B
en-GB-Neural2-C
en-GB-Neural2-D
en-GB-Neural2-F
en-GB-News-G
en-GB-News-H
en-GB-News-I
en-GB-News-J
en-GB-News-K
en-GB-News-L
en-GB-News-M
en-GB-Standard-A
en-GB-Standard-B
en-GB-Standard-C
en-GB-Standard-D
en-GB-Standard-F
en-GB-Studio-B
en-GB-Studio-C
en-GB-Wavenet-A
en-GB-Wavenet-B
en-GB-Wavenet-C
en-GB-Wavenet-D
en-GB-Wavenet-F
en-IN-Neural2-A
en-IN-Neural2-B
en-IN-Neural2-C
en-IN-Neural2-D
en-IN-Standard-A
en-IN-Standard-B
en-IN-Standard-C
en-IN-Standard-D
en-IN-Wavenet-A
en-IN-Wavenet-B
en-IN-Wavenet-C
en-IN-Wavenet-D
en-US-Casual-K
en-US-Journey-D
en-US-Journey-F
en-US-Neural2-A
en-US-Neural2-C
en-US-Neural2-D
en-US-Neural2-E
en-US-Neural2-F
en-US-Neural2-G
en-US-Neural2-H
en-US-Neural2-I
en-US-Neural2-J
en-US-News-K
en-US-News-L
en-US-News-N
en-US-Polyglot-1
en-US-Standard-A
en-US-Standard-B
en-US-Standard-C
en-US-Standard-D
en-US-Standard-E
en-US-Standard-F
en-US-Standard-G
en-US-Standard-H
en-US-Standard-I
en-US-Standard-J
en-US-Studio-O
en-US-Studio-Q
en-US-Wavenet-A
en-US-Wavenet-B
en-US-Wavenet-C
en-US-Wavenet-D
en-US-Wavenet-E
en-US-Wavenet-F
en-US-Wavenet-G
en-US-Wavenet-H
en-US-Wavenet-I
en-US-Wavenet-J
Eric
Ethan
Ezinne
Freya
Geraint
Gregory
Guy
Hollie
Imani
Ivy
Jacob
James
James Wilson
Jane
Jason
Jenny
Jenny Multilingual
Jenny Multilingual V2
Joanna
Joanne
Joey
JSXI
Justin
Kajal
Ken
Kendra
Kevin
Kim
Kimberly
Leah
Liam
Libby
Lucas Garcia
Luke
Luna
Maisie
Marcus Johnson
Matthew
Maya Thompson
Mia
Michelle
Mitchell
Molly
Monica
Nancy
Natasha
Neerja
Neil
Niamh
Nicole
Noah
Oliver
Olivia
Prabhat
Quackmaster
Raveena
Relikk
Roger
Rosa
Russell
Ruth
Ryan
Ryan Multilingual
Salli
Sam
Sara
Sarah Chen
Sonia
SpongeBob SquarePants (Seasons 3–9A)
Steffan
Stephen
T.A.G.
Thomas
Tim
Tina
Tony
Wayne
William
WRL
Yan
ZWF (rapping)
Integration Features
API for Developers
GitHub Release
Llama Architecture Backbone
Mimi Split-RVQ Tokenizer
Limitation Features
Cannot Model Conversation Structure
English Language Dominance
Memory Bottleneck in Training
No Pre-trained Language Model Use
Real-Time Generation Delay
RVQ time-to-first-audio scales poorly
Text Character Limit
350
Pricing Features
Free Preview
Free Tier
Open Source
Apache 2.0
Upgrade Option
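
Word Error Rate, listed under Objective Metrics above, is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognized transcript, divided by the number of words in the reference. The snippet below is a minimal sketch of that standard definition, not the evaluation code of either product; the function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative helper: tokenization is a plain lowercase whitespace split,
    which is an assumption, not a rule taken from either product.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A lower value means the synthesized speech was transcribed more faithfully; because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.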