Sesame vs Text Reader
Comparing the features of Sesame to Text Reader
Feature
Sesame
Text Reader
Capability Features
Accessibility Features
Advanced AI Models
AI WaveNet Voices
Commercial Use Allowed
Consistent Personality
Context Awareness
Continuous AI Improvement
Conversational Dynamics
Conversational Speech Generation
Dataset Size
1 million hours
Download as MP3
Educational Use
Emotional Intelligence
Evaluation Suite
Gender Selection
Male
Female
Instant Generation
Language and Accent Variety
50+ languages and variants
Manual Input Supported
Model Sizes
Tiny: 1B backbone, 100M decoder
Small: 3B backbone, 250M decoder
Medium: 8B backbone, 300M decoder
Multi-language Support
Afrikaans
Arabic
Bengali
Bulgarian
Catalan
Chinese
Czech
Danish
Dutch
English (Australia)
English (United Kingdom)
English (United States)
Filipino
Finnish
French (Canada)
French (France)
German
Gujarati
Hindi
Hungarian
Icelandic
Indonesian
Italian
Japanese
Kannada
Korean
Latvian
Malayalam
Mandarin
Norwegian
Polish
Portuguese (Brazil)
Portuguese (Portugal)
Romanian
Russian
Serbian
Slovak
Spanish (Spain)
Spanish (United States)
Swedish
Tamil
Telugu
Thai
Turkish
Ukrainian
Vietnamese
Multiple Speaker Handling
Objective Metrics
Word Error Rate
Speaker Similarity
Homograph Disambiguation
Pronunciation Consistency
Partial Multilingual Support Planned
Planned for 20+ languages
Pronunciation Correction
Sequence Length
2048
Single-Stage Model
Subjective Metrics
Comparative Mean Opinion Score
Text and Audio Input
Text
Audio
Training Epochs
5
Unlimited Downloads
User-Friendly Controls
Voice Options
Integration Features
API Integrations
GitHub Release
Llama Architecture Backbone
Mimi Split-RVQ Tokenizer
Supported Output Formats
MP3
Upload TXT Files
Limitation Features
Cannot Model Conversation Structure
English Language Dominance
Export Format Limitation
.txt
Free Tier Character Limit
1,000 characters
Memory Bottleneck in Training
No Mention of API
No Pre-trained Language Model Use
Real-Time Generation Delay
RVQ time-to-first-audio scales poorly
Pricing Features
Free Preview
Free Tier
Monthly Voice Generation Limit
3 hrs/month
Open Source
Apache 2.0
Premium Plan Monthly
$18/mo
Pro Annual Plan
$15/mo (billed annually)
Voice Generation Limit (Annual)
36 hrs/year
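
The pricing rows above can be compared with a little arithmetic. The following is a minimal sketch that only uses the figures listed in this table. The flattened table does not say which product each plan belongs to, or which plan each generation limit applies to, so no plan is matched to a limit here. The constant and function names are illustrative, not taken from either product.

```python
# Figures copied from the pricing rows of the table.
# Which product each plan belongs to is NOT stated in the table.
PREMIUM_MONTHLY_USD = 18        # "Premium Plan Monthly: $18/mo"
PRO_ANNUAL_PER_MONTH_USD = 15   # "Pro Annual Plan: $15/mo (billed annually)"
MONTHLY_LIMIT_HOURS = 3         # "Monthly Voice Generation Limit: 3 hrs/month"
ANNUAL_LIMIT_HOURS = 36         # "Voice Generation Limit (Annual): 36 hrs/year"


def yearly_cost(per_month_usd: float, months: int = 12) -> float:
    """Total paid over a year at a given effective monthly price."""
    return per_month_usd * months


premium_year = yearly_cost(PREMIUM_MONTHLY_USD)
pro_year = yearly_cost(PRO_ANNUAL_PER_MONTH_USD)

# Absolute and relative saving from paying annually instead of monthly.
savings = premium_year - pro_year
savings_pct = 100 * savings / premium_year

# The annual limit is just the monthly limit multiplied by twelve.
limits_consistent = MONTHLY_LIMIT_HOURS * 12 == ANNUAL_LIMIT_HOURS

print(f"Monthly billing for a year: ${premium_year}")
print(f"Annual billing for a year:  ${pro_year}")
print(f"Saving: ${savings} ({savings_pct:.1f}%)")
print(f"36 hrs/year equals 12 x 3 hrs/month: {limits_consistent}")
```

On these figures, annual billing costs $180 per year against $216 for twelve monthly payments, a saving of $36. The annual generation limit gives no more hours than the monthly limit does over twelve months.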