iSpeech TTS & Speech API vs OpenAI Realtime API

Comparing the features of iSpeech TTS & Speech API to OpenAI Realtime API

Feature
iSpeech TTS & Speech API
OpenAI Realtime API

Capability Features

API Requests per Month
100,000,000
Cloud and Embedded TTS
Create IVR Prompts
Custom Voice Cloning
Developer Signups
80,000
Enterprise Privacy Commitment
Expanded Model Support Planned
Five New Voices
5
Function Calling
Human and Automated Safety Monitoring
Interruption Handling
No Training on Data Without Permission
Open Source SDKs
Playground Access
Prompt Caching Planned
Public Beta
Reference Client Available
SDK Availability
Six Preset Voices
6
Speech Rate Adjustment
Slow, Regular, Fast
Speech Recognition
Speech-to-Speech
Streaming Audio Inputs/Outputs
Supported Language List
US English, UK English, Australian English, US Spanish, Chinese, Hong Kong Chinese, Taiwan Chinese, Japanese, Korean, Canadian English, Hungarian, Brazilian Portuguese, European Portuguese, European Spanish, European Czech, European Danish, European Finnish, European French, European Norwegian, European Dutch, European Polish, European Italian, European Turkish, European Greek, European German, Russian, Swedish, Canadian French, Arabic
Supports Text and Audio Inputs
Text, Audio
Talking Stickers
Text to Speech
Ultra Low Latency
WebSocket Connection

Integration Features

Agora Integration
Audiobook Support
Chat Completions API Integration
Chrome Extension
Connected Vehicle Integration
LiveKit Integration
Mobile Integration
OpenAI Node.js SDK Planned
OpenAI Python SDK Planned
SAPI Integration
Supports GPT-4o
gpt-4o-realtime-preview
Twilio Voice API Integration

Limitation Features

AI Disclosure Requirement
Audio Only Modality (Initially)
Lower Session Limits Tiers 1-4
Lower than 100
No Simultaneous Session Limit Anymore
Registration Required for Downloads
Simultaneous Sessions Limit Tier 5
100
TTS for Non-Commercial Use Only
Usage Policy Restriction

Pricing Features

Approximate Audio Input Price
$0.06/minute
Approximate Audio Output Price
$0.24/minute
Free Tier
IVR Prompts Free Sample
No Free Tier
Pricing Audio Input
$100/1M tokens
Pricing Audio Output
$200/1M tokens
Pricing Cached Audio Input
$20/1M tokens
Pricing Cached Text Input
$2.50/1M tokens
Pricing Text Input
$5/1M tokens
Pricing Text Output
$20/1M tokens
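
The per-token and per-minute prices listed above can be turned into a rough cost estimate. The sketch below is illustrative only: it assumes the token prices belong to the OpenAI Realtime API column (the flattened table does not say so explicitly), and the names `PRICE_PER_MILLION_USD`, `estimate_token_cost`, and `estimate_minute_cost` are hypothetical helpers, not part of any SDK.

```python
from typing import Mapping

# USD per 1M tokens, copied from the pricing rows above.
# Assumption: these rows describe the OpenAI Realtime API column.
PRICE_PER_MILLION_USD = {
    "text_input": 5.00,
    "cached_text_input": 2.50,
    "text_output": 20.00,
    "audio_input": 100.00,
    "cached_audio_input": 20.00,
    "audio_output": 200.00,
}

# Approximate USD per minute of audio, from the "Approximate" rows above.
AUDIO_INPUT_PER_MINUTE_USD = 0.06
AUDIO_OUTPUT_PER_MINUTE_USD = 0.24


def estimate_token_cost(token_counts: Mapping[str, int]) -> float:
    """Return the estimated cost in USD for the given token counts.

    Keys must match those in PRICE_PER_MILLION_USD.
    """
    total = 0.0
    for kind, tokens in token_counts.items():
        if kind not in PRICE_PER_MILLION_USD:
            raise KeyError(f"unknown token kind: {kind}")
        total += tokens / 1_000_000 * PRICE_PER_MILLION_USD[kind]
    return total


def estimate_minute_cost(input_minutes: float, output_minutes: float) -> float:
    """Return the approximate audio cost in USD from minutes of audio."""
    return (input_minutes * AUDIO_INPUT_PER_MINUTE_USD
            + output_minutes * AUDIO_OUTPUT_PER_MINUTE_USD)


if __name__ == "__main__":
    usage = {"text_input": 500_000, "text_output": 250_000}
    print(f"Token-based estimate: ${estimate_token_cost(usage):.2f}")
    print(f"Minute-based estimate: ${estimate_minute_cost(10, 5):.2f}")
```

The per-minute figures are labeled approximate in the table, so the two estimates will not agree exactly for the same conversation; the token-based figure is the one to use when actual token counts are available.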