OpenAI Realtime API

OpenAI Realtime API

Low‑latency speech‑to‑speech API for building real‑time voice experiences

Website

Capability Features

Public Beta
Speech-to-Speech
Ultra Low Latency
Six Preset Voices
6
Five New Voices
5
WebSocket Connection
Streaming Audio Inputs/Outputs
Interruption Handling
Function Calling
Supports Text and Audio Inputs
TextAudio
Prompt Caching Planned
Expanded Model Support Planned
Playground Access
Reference Client Available
Human and Automated Safety Monitoring
No Training on Data Without Permission
Enterprise Privacy Commitment

Integration Features

Supports GPT-4o
gpt-4o-realtime-preview
Chat Completions API Integration
OpenAI Python SDK Planned
OpenAI Node.js SDK Planned
LiveKit Integration
Agora Integration
Twilio Voice API Integration

Pricing Features

Pricing Text Input
$5/1M tokens
Pricing Text Output
$20/1M tokens
Pricing Audio Input
$100/1M tokens
Pricing Audio Output
$200/1M tokens
Pricing Cached Text Input
$2.50/1M tokens
Pricing Cached Audio Input
$20/1M tokens
No Free Tier
Approximate Audio Input Price
$0.06/minute
Approximate Audio Output Price
$0.24/minute

Limitation Features

Simultaneous Sessions Limit Tier 5
100
Lower Session Limits Tiers 1-4
Lower than 100
No Simultaneous Session Limit Anymore
Audio Only Modality (Initially)
Usage Policy Restriction
AI Disclosure Requirement