OpenAI Realtime API vs Uberduck AI Voices

Comparing the features of OpenAI Realtime API to Uberduck AI Voices

Feature
OpenAI Realtime API
Uberduck AI Voices

Capability Features

API Access
Enterprise Privacy Commitment
Expanded Model Support Planned
Five New Voices: 5
Function Calling
Human and Automated Safety Monitoring
Industry-Leading Accuracy
Interruption Handling
Make Music
Make Videos
Make Voiceovers
No Training on Data Without Permission
Playground Access
Prompt Caching Planned
Public Beta
Reference Client Available
Six Preset Voices: 6
Speech-to-Speech
Streaming Audio Inputs/Outputs
Supported Language List: Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Bengali, Bosnian, Bulgarian, Burmese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, Georgian, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latvian, Lithuanian, Macedonian, Malay, Maltese, Mandarin, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Sinhala, Slovak, Slovenian, Somali, Spanish, Swahili, Swedish, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Zulu
Supports Text and Audio Inputs: Text, Audio
Text to Rapping
Text to Singing
Text to Speech
Ultra Low Latency
Voice Cloning
Voice Conversion
Voice Options: Abbi, Abeo, Aditi, AIGenerate1, AIGenerate2, Aisha Patel, Alfie, Amber, Amy, Ana, Andrew, AndrewMultilingualNeural, Annette, Aria, Arthur, Ashley, Asilia, Ava, AvaMultilingualNeural, Ayanda, B La B, Bella, Big G, Blue, Brandon, Brian, BrianMultilingualNeural, Carly, Chilemba, Christopher, Clara, Connor, Cora, Danielle, Darren, David Kim, Davis, Duncan, Elena Rodriguez, Elimu, Elizabeth, Elliot, Elsie, Emily, Emma, EmmaMultilingualNeural, en-AU-Neural2-A, en-AU-Neural2-B, en-AU-Neural2-C, en-AU-Neural2-D, en-AU-News-E, en-AU-News-F, en-AU-News-G, en-AU-Polyglot-1, en-AU-Standard-A, en-AU-Standard-B, en-AU-Standard-C, en-AU-Standard-D, en-AU-Wavenet-A, en-AU-Wavenet-B, en-AU-Wavenet-C, en-AU-Wavenet-D, en-GB-Neural2-A, en-GB-Neural2-B, en-GB-Neural2-C, en-GB-Neural2-D, en-GB-Neural2-F, en-GB-News-G, en-GB-News-H, en-GB-News-I, en-GB-News-J, en-GB-News-K, en-GB-News-L, en-GB-News-M, en-GB-Standard-A, en-GB-Standard-B, en-GB-Standard-C, en-GB-Standard-D, en-GB-Standard-F, en-GB-Studio-B, en-GB-Studio-C, en-GB-Wavenet-A, en-GB-Wavenet-B, en-GB-Wavenet-C, en-GB-Wavenet-D, en-GB-Wavenet-F, en-IN-Neural2-A, en-IN-Neural2-B, en-IN-Neural2-C, en-IN-Neural2-D, en-IN-Standard-A, en-IN-Standard-B, en-IN-Standard-C, en-IN-Standard-D, en-IN-Wavenet-A, en-IN-Wavenet-B, en-IN-Wavenet-C, en-IN-Wavenet-D, en-US-Casual-K, en-US-Journey-D, en-US-Journey-F, en-US-Neural2-A, en-US-Neural2-C, en-US-Neural2-D, en-US-Neural2-E, en-US-Neural2-F, en-US-Neural2-G, en-US-Neural2-H, en-US-Neural2-I, en-US-Neural2-J, en-US-News-K, en-US-News-L, en-US-News-N, en-US-Polyglot-1, en-US-Standard-A, en-US-Standard-B, en-US-Standard-C, en-US-Standard-D, en-US-Standard-E, en-US-Standard-F, en-US-Standard-G, en-US-Standard-H, en-US-Standard-I, en-US-Standard-J, en-US-Studio-O, en-US-Studio-Q, en-US-Wavenet-A, en-US-Wavenet-B, en-US-Wavenet-C, en-US-Wavenet-D, en-US-Wavenet-E, en-US-Wavenet-F, en-US-Wavenet-G, en-US-Wavenet-H, en-US-Wavenet-I, en-US-Wavenet-J, Eric, Ethan, Ezinne, Freya, Geraint, Gregory, Guy, Hollie, Imani, Ivy, Jacob, James, James Wilson, Jane, Jason, Jenny, Jenny Multilingual, Jenny Multilingual V2, Joanna, Joanne, Joey, JSXI, Justin, Kajal, Ken, Kendra, Kevin, Kim, Kimberly, Leah, Liam, Libby, Lucas Garcia, Luke, Luna, Maisie, Marcus Johnson, Matthew, Maya Thompson, Mia, Michelle, Mitchell, Molly, Monica, Nancy, Natasha, Neerja, Neil, Niamh, Nicole, Noah, Oliver, Olivia, Prabhat, Quackmaster, Raveena, Relikk, Roger, Rosa, Russell, Ruth, Ryan, Ryan Multilingual, Salli, Sam, Sara, Sarah Chen, Sonia, SpongeBob SquarePants (Seasons 3–9A), Steffan, Stephen, T.A.G., Thomas, Tim, Tina, Tony, Wayne, William, WRL, Yan, ZWF (rapping)
WebSocket Connection

Integration Features

Agora Integration
API for Developers
Chat Completions API Integration
LiveKit Integration
OpenAI Node.js SDK Planned
OpenAI Python SDK Planned
Supports GPT-4o: gpt-4o-realtime-preview
Twilio Voice API Integration

Limitation Features

AI Disclosure Requirement
Audio Only Modality (Initially)
Lower Session Limits Tiers 1-4: Lower than 100
No Simultaneous Session Limit Anymore
Simultaneous Sessions Limit Tier 5: 100
Text Character Limit: 350
Usage Policy Restriction

Pricing Features

Approximate Audio Input Price: $0.06/minute
Approximate Audio Output Price: $0.24/minute
Free Tier
No Free Tier
Pricing Audio Input: $100/1M tokens
Pricing Audio Output: $200/1M tokens
Pricing Cached Audio Input: $20/1M tokens
Pricing Cached Text Input: $2.50/1M tokens
Pricing Text Input: $5/1M tokens
Pricing Text Output: $20/1M tokens
Upgrade Option
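
The per-token and per-minute prices listed above can be related with simple arithmetic. The following is a minimal Python sketch, not an official calculator: the rates are copied from the pricing rows above, while the function names, the category keys, and the token counts in the usage example are hypothetical and chosen only for illustration.

```python
# Rates copied from the pricing rows above, in USD per 1M tokens.
# The dictionary keys are illustrative names, not API fields.
RATES_PER_MILLION = {
    "text_input": 5.00,
    "cached_text_input": 2.50,
    "text_output": 20.00,
    "audio_input": 100.00,
    "cached_audio_input": 20.00,
    "audio_output": 200.00,
}


def estimate_cost(usage):
    """Return the estimated cost in USD for a mapping of category -> token count."""
    total = 0.0
    for category, tokens in usage.items():
        total += tokens * RATES_PER_MILLION[category] / 1_000_000
    return total


def implied_tokens_per_minute(price_per_minute, rate_per_million):
    """Back out the tokens-per-minute figure implied by a per-minute price."""
    return price_per_minute / rate_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical usage: token counts matching one minute in each direction.
    usage = {"audio_input": 600, "audio_output": 1200}
    print(f"Estimated cost: ${estimate_cost(usage):.2f}")
    print(round(implied_tokens_per_minute(0.06, RATES_PER_MILLION["audio_input"])))
    print(round(implied_tokens_per_minute(0.24, RATES_PER_MILLION["audio_output"])))
```

Under these figures, the approximate per-minute prices correspond to roughly 600 audio input tokens and 1,200 audio output tokens per minute. Actual token counts per minute of audio may differ, so treat the result as an estimate.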