Cloud TTS vs OpenAI Realtime API

A feature-by-feature comparison of Cloud TTS and the OpenAI Realtime API

Feature
Cloud TTS
OpenAI Realtime API

Capability Features

Adjustable Speed
Adjustable Volume
Cloud-Based TTS
Enterprise Privacy Commitment
Expanded Model Support Planned
Export Audio Files
File Input Support
Five New Voices
5
Function Calling
Human and Automated Safety Monitoring
Interruption Handling
Karaoke-Style Highlighting
Language List
Norwegian Bokmål (Norway); American English; European Spanish; Chinese (China); Russian (Russia); Arabic (Saudi Arabia); French (France); German (Germany); Afrikaans (South Africa); Amharic (Ethiopia); Arabic (United Arab Emirates); Arabic (Bahrain); Arabic (Algeria); Arabic (Egypt); Arabic (Iraq); Arabic (Jordan); Arabic (Kuwait); Arabic (Lebanon); Arabic (Libya); Arabic (Morocco); Arabic (Oman); Arabic (Qatar); Arabic (Syria); Arabic (Tunisia); Arabic (Yemen); Azerbaijani (Azerbaijan); Bulgarian (Bulgaria); Bangla (Bangladesh); Bangla (India); Bosnian (Bosnia & Herzegovina); Catalan (Spain); Czech (Czechia); Welsh (United Kingdom); Danish (Denmark); Austrian German; Swiss High German; Greek (Greece); Australian English; Canadian English; British English; English (Hong Kong); English (Ireland); English (India); English (Kenya); English (Nigeria); English (New Zealand); English (Philippines); English (Singapore); English (Tanzania); English (South Africa); Spanish (Argentina); Spanish (Bolivia); Spanish (Chile); Spanish (Colombia); Spanish (Costa Rica); Spanish (Cuba); Spanish (Dominican Republic); Spanish (Ecuador); Spanish (Equatorial Guinea); Spanish (Guatemala); Spanish (Honduras); Mexican Spanish; Spanish (Nicaragua); Spanish (Panama); Spanish (Peru); Spanish (Puerto Rico); Spanish (Paraguay); Spanish (El Salvador); Spanish (United States); Spanish (Uruguay); Spanish (Venezuela); Estonian (Estonia); Persian (Iran); Finnish (Finland); Filipino (Philippines); French (Belgium); Canadian French; Swiss French; Irish (Ireland); Galician (Spain); Gujarati (India); Hebrew (Israel); Hindi (India); Croatian (Croatia); Hungarian (Hungary); Indonesian (Indonesia); Icelandic (Iceland); Italian (Italy); Japanese (Japan); Javanese (Indonesia); Georgian (Georgia); Kazakh (Kazakhstan); Khmer (Cambodia); Kannada (India); Korean (South Korea); Lao (Laos); Lithuanian (Lithuania); Latvian (Latvia); Macedonian (North Macedonia); Malayalam (India); Mongolian (Mongolia); Marathi (India); Malay (Malaysia); Maltese (Malta); Burmese (Myanmar [Burma]); Nepali (Nepal); Flemish; Dutch (Netherlands); Polish (Poland); Pashto (Afghanistan); Brazilian Portuguese; European Portuguese; Romanian (Romania); Sinhala (Sri Lanka); Slovak (Slovakia); Slovenian (Slovenia); Somali (Somalia); Albanian (Albania); Serbian (Serbia); Sundanese (Indonesia); Swedish (Sweden); Swahili (Kenya); Swahili (Tanzania); Tamil (India); Tamil (Sri Lanka); Tamil (Malaysia); Tamil (Singapore); Telugu (India); Thai (Thailand); Turkish (Türkiye); Ukrainian (Ukraine); Urdu (India); Urdu (Pakistan); Uzbek (Uzbekistan); Vietnamese (Vietnam); Chinese (China, Liaoning); Chinese (China, Shaanxi); Chinese (Hong Kong); Chinese (Taiwan); Zulu (South Africa)
Mobile Friendly
No Training on Data Without Permission
Playground Access
Prompt Caching Planned
Public Beta
Reference Client Available
Six Preset Voices
6
Speech-to-Speech
Streaming Audio Inputs/Outputs
Supported Language List
140
Supports Text and Audio Inputs
Text, Audio
Text Input
Ultra Low Latency
User Preferences Persistence
User-Friendly Interface
Voice Selection
Microsoft AvaMultilingual Online (Natural); Microsoft AndrewMultilingual Online (Natural); Microsoft EmmaMultilingual Online (Natural); Microsoft BrianMultilingual Online (Natural); Microsoft Ava Online (Natural); Microsoft Andrew Online (Natural); Microsoft Emma Online (Natural); Microsoft Brian Online (Natural); Microsoft Ana Online (Natural); Microsoft Aria Online (Natural); Microsoft Christopher Online (Natural); Microsoft Eric Online (Natural); Microsoft Guy Online (Natural); Microsoft Jenny Online (Natural); Microsoft Michelle Online (Natural); Microsoft Roger Online (Natural); Microsoft Steffan Online (Natural)
WebSocket Connection

Integration Features

Agora Integration
API Availability
Chat Completions API Integration
LiveKit Integration
OpenAI Node.js SDK Planned
OpenAI Python SDK Planned
Supports GPT-4o
gpt-4o-realtime-preview
Third-Party API Integration
Twilio Voice API Integration
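
The WebSocket connection and the gpt-4o-realtime-preview model listed above can be illustrated with a minimal Python sketch. It only builds the connection parameters and one client event; it opens no network connection. The endpoint URL, the `OpenAI-Beta` header, and the `response.create` event shape are assumptions based on OpenAI's public beta documentation and may have changed, so check the current API reference before relying on them. The helper names `build_connection` and `response_create_event` are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed beta endpoint; verify against the current OpenAI API reference.
REALTIME_BASE_URL = "wss://api.openai.com/v1/realtime"
# Model name as given in the feature list.
MODEL = "gpt-4o-realtime-preview"


def build_connection(api_key, model=MODEL):
    """Return the (url, headers) pair a WebSocket client would need."""
    url = f"{REALTIME_BASE_URL}?{urlencode({'model': model})}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Assumed beta opt-in header.
        "OpenAI-Beta": "realtime=v1",
    }
    return url, headers


def response_create_event(instructions):
    """Serialize a client event asking for a text-and-audio response."""
    event = {
        "type": "response.create",
        "response": {
            "modalities": ["text", "audio"],
            "instructions": instructions,
        },
    }
    return json.dumps(event)


if __name__ == "__main__":
    url, headers = build_connection("YOUR_API_KEY")
    print(url)
    print(response_create_event("Greet the caller briefly."))
```

The returned URL and headers can be passed to any WebSocket client library; the choice of client is left open here.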

Limitation Features

Ads
AI Disclosure Requirement
Audio Only Modality (Initially)
Lower Session Limits Tiers 1-4
Lower than 100
No Simultaneous Session Limit Anymore
Simultaneous Sessions Limit Tier 5
100
Usage Limits
Not specified
Usage Policy Restriction

Pricing Features

Approximate Audio Input Price
$0.06/minute
Approximate Audio Output Price
$0.24/minute
Free Tier
No Free Tier
Pricing Audio Input
$100/1M tokens
Pricing Audio Output
$200/1M tokens
Pricing Cached Audio Input
$20/1M tokens
Pricing Cached Text Input
$2.50/1M tokens
Pricing Plan Details
None
Pricing Text Input
$5/1M tokens
Pricing Text Output
$20/1M tokens
Trial Period
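
The per-token and per-minute audio prices listed above can be related with a short Python sketch. Dividing each approximate per-minute price by the matching per-token price gives the token rate the two figures imply. That rate is derived here, not stated in the list, so treat it as an estimate. The function names are hypothetical.

```python
# Prices in USD per 1M tokens, taken from the pricing list.
AUDIO_INPUT_PER_MILLION = 100.0
AUDIO_OUTPUT_PER_MILLION = 200.0

# Approximate prices in USD per minute, taken from the same list.
AUDIO_INPUT_PER_MINUTE = 0.06
AUDIO_OUTPUT_PER_MINUTE = 0.24


def implied_tokens_per_minute(price_per_minute, price_per_million):
    """Token rate implied by a per-minute and a per-1M-token price."""
    return round(price_per_minute / price_per_million * 1_000_000)


def estimate_audio_cost(input_minutes, output_minutes):
    """Rough session cost in USD from the approximate per-minute prices."""
    return (input_minutes * AUDIO_INPUT_PER_MINUTE
            + output_minutes * AUDIO_OUTPUT_PER_MINUTE)


if __name__ == "__main__":
    print(implied_tokens_per_minute(AUDIO_INPUT_PER_MINUTE,
                                    AUDIO_INPUT_PER_MILLION))
    print(implied_tokens_per_minute(AUDIO_OUTPUT_PER_MINUTE,
                                    AUDIO_OUTPUT_PER_MILLION))
    print(f"{estimate_audio_cost(10, 5):.2f}")
```

Under these figures, audio input works out to roughly 600 tokens per minute and audio output to roughly 1,200, and a session with 10 minutes of input and 5 minutes of output would cost about $1.80 in audio alone. Text tokens and cached input are billed separately at the rates listed.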