Conversational Speech Generation
Customization Options
Voice type, Pitch, Speed, Background music, Accent, Sentence breaks, Volume, Tone, Stress areas
Dataset Size
1 million hours
Device Compatibility
Any device
Model Sizes
Tiny: 1B backbone, 100M decoder; Small: 3B backbone, 250M decoder; Medium: 8B backbone, 300M decoder
Multiple Speaker Handling
Objective Metrics
Word Error Rate, Speaker Similarity, Homograph Disambiguation, Pronunciation Consistency
Partial Multilingual Support Planned
Planned for 20+ languages
Pitch Adjustment
Up to 20 semitones higher or lower
Sentence Breaks and Punctuation Recognition
Stress and Emphasis Control
Subjective Metrics
Comparative Mean Opinion Score
Types of Voices
Basic, Standard, Neural, Cloned
Unlimited Access to Basic Voices
Unlimited Basic Voice-Overs
Use Cases Information
Video Sales Letters, Educational Videos, Marketing Videos, Animated Videos, Audio Books, Explainer Videos, Podcasts, Websites
Uses IBM, Azure, Google, Amazon TTS
IBM, Azure AI, Google Text to Speech, Amazon
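
Two entries in the listing above have a simple numeric meaning that a short sketch can make concrete: the pitch range of 20 semitones and the Word Error Rate metric. The Python below is only an illustration under stated assumptions. It assumes equal-temperament tuning (one semitone is a frequency factor of 2^(1/12)) and the common word-level edit-distance definition of Word Error Rate. The function names `semitone_ratio` and `word_error_rate` are hypothetical, and nothing in the listing says how any of the products computes these values.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift, assuming equal temperament."""
    return 2.0 ** (semitones / 12.0)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are split on whitespace with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                  # deletion
                curr[j - 1] + 1,              # insertion
                prev[j - 1] + substitution,   # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Frequency factors at the two ends of the +/-20 semitone range.
    print(semitone_ratio(20), semitone_ratio(-20))
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under these assumptions, a shift of 12 semitones doubles the frequency, so the 20-semitone limit corresponds to a little over one and a half octaves in either direction.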