All these components together make up a Realtime Session. You will use client events to update the state of the session, and listen for server events to react to state changes.

*(Diagram: Realtime session state.)*

Session lifecycle events

**To play output audio back on a client device like a web browser, we recommend using WebRTC rather than WebSockets.** See also [building a call with Twilio](https://www.twilio.com/en-us/blog/twilio-openai-realtime-api-launch-integration).

Note that the [`response.audio.done`](/docs/api-reference/realtime-server-events/response/audio/done) event signals that the model has finished streaming audio for a response.
# hello-transcription
Real-time speech transcription using OpenAI's Realtime API - a demonstration of transcription-only mode.
## Features
## How It Works
This app uses OpenAI's Realtime API in transcription-only mode (see the sketch after this list):
1. Your voice is captured via WebRTC
2. Audio is streamed to OpenAI's transcription service
3. Transcriptions are returned in real-time
4. No AI responses are generated (transcription only)
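
A minimal browser-side sketch of steps 1 and 2, assuming the standard WebRTC APIs and an `oai-events` data channel for receiving events (the offer/answer exchange with the Realtime endpoint is elided):

```javascript
// Capture microphone audio and stream it over WebRTC (SDP negotiation elided).
const pc = new RTCPeerConnection();
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
for (const track of stream.getTracks()) pc.addTrack(track, stream);

// Transcription results arrive as JSON events on a data channel.
const events = pc.createDataChannel("oai-events");
events.onmessage = (e) => console.log(JSON.parse(e.data));
```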
## Environment Variables

Set these in your Val Town environment:
- `OPENAI_API_KEY` - Your OpenAI API key (required)
## Local Development
1. Fork/remix this val on Val Town
2. Add your `OPENAI_API_KEY` to Val Town secrets
3. Your app will be available at `https://[your-val-name].val.run`
## Technical Details
The app uses OpenAI's Realtime API in transcription mode (a configuration sketch follows this list):
- Session type: `transcription` (not `realtime`)
- Audio format: PCM16
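
As a hedged sketch, that configuration could be sent as a client event on the `events` data channel from the sketch above; the `transcription_session.update` event name and the `gpt-4o-transcribe` model are assumptions drawn from the Realtime transcription docs, not from this repo:

```javascript
// Configure a transcription-only session once the data channel opens.
const sessionUpdate = {
  type: "transcription_session.update", // assumed event name for transcription mode
  session: {
    input_audio_format: "pcm16", // matches the audio format noted above
    input_audio_transcription: { model: "gpt-4o-transcribe" }, // assumed model
    turn_detection: { type: "server_vad" },
  },
};
events.addEventListener("open", () => events.send(JSON.stringify(sessionUpdate)));
```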
## Credits
Built with OpenAI's Realtime API for transcription-only use cases.
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>OpenAI Realtime Transcription</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
</style>
</head>
<body>
<h1>📝 OpenAI Realtime Transcription</h1>
<p class="description">
Transcribe audio in real-time using OpenAI's Realtime API.
<a href="/source" target="_blank">View source</a>
</p>
# Hello-Transcription - OpenAI Realtime API Transcription Demo
## 🎯 Project Overview
Hello-Transcription demonstrates the transcription-only mode of OpenAI's Realtime API. Unlike the conversational mode, it generates no AI responses; it only returns transcriptions of user speech.
**Created:** September 2, 2025
**Platform:** Val Town
**API:** OpenAI Realtime API (Transcription Mode)
**Key Feature:** Real-time streaming transcription with multiple model support
- **Runtime:** Deno (Val Town platform)
- **Framework:** Hono (lightweight web framework)
- **Transcription:** OpenAI Realtime API in transcription mode
- **Connection:** WebRTC with data channel for events
- **Frontend:** Vanilla JavaScript with split-view interface
1. **Audio Input**
```
User speaks → Microphone → WebRTC → OpenAI
```
```bash
# Create .env file
echo "OPENAI_API_KEY=sk-..." > .env

# Install Deno (if not already installed)
curl -fsSL https://deno.land/install.sh | sh
```
**Solutions:**
- Check microphone permissions
- Verify `OPENAI_API_KEY` is set
- Check the browser console for errors
- Ensure the WebRTC connection is established (see the check below)
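
One quick way to verify the last point, using the standard `RTCPeerConnection` state API (`pc` is assumed to be your peer connection):

```javascript
// Log WebRTC state transitions; "connected" indicates a healthy link.
pc.addEventListener("connectionstatechange", () => {
  console.log("WebRTC state:", pc.connectionState);
});
```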
2. **Set Environment**
- Add `OPENAI_API_KEY` in Val Town secrets
3. **Deploy**
### Environment Variables
- `OPENAI_API_KEY` - Required for OpenAI API access
## 📝 Future Enhancements
### Documentation
- [OpenAI Realtime Transcription Guide](https://platform.openai.com/docs/guides/realtime-transcr
- [Realtime API Reference](https://platform.openai.com/docs/api-reference/realtime)
- [Voice Activity Detection Guide](https://platform.openai.com/docs/guides/realtime-vad)
- [Val Town Documentation](https://docs.val.town)
## 🎯 Summary
Hello-Transcription successfully demonstrates the transcription-only capabilities of OpenAI's Realtime API. Key achievements:
1. **Pure Transcription**: No AI responses, focused solely on speech-to-text
If you want to connect a phone number to the Realtime API, use a SIP trunking provider (e.g., Twilio). Configure a [webhook](/docs/guides/webhooks) for incoming calls at platform.openai.com. Then, point your SIP trunk at your project's SIP URI (listed below).
When OpenAI receives SIP traffic associated with your project, the webhook that you configured will fire. This webhook lets you accept or reject the call. When accepting the call, you'll provide the configuration for the Realtime session.
URIs used for interacting with Realtime API and SIP:
|Purpose|URI|
|---|---|
|SIP URI|`sip:$PROJECT_ID@sip.api.openai.com;transport=tls`|
|Accept URI|`https://api.openai.com/v1/realtime/calls/$CALL_ID/accept`|
|Reject URI|`https://api.openai.com/v1/realtime/calls/$CALL_ID/reject`|
|Refer URI|`https://api.openai.com/v1/realtime/calls/$CALL_ID/refer`|
|Events URI|`wss://api.openai.com/v1/realtime?call_id=$CALL_ID`|
Find your `$CALL_ID` in the `call_id` field of the `data` object present in the webhook. See an example in the handler below.
```python
import asyncio
import json
import os

import requests
import websockets
from flask import Flask, Response, request
from openai import InvalidWebhookSignatureError, OpenAI

app = Flask(__name__)
client = OpenAI(webhook_secret=os.environ["OPENAI_WEBHOOK_SECRET"])
AUTH_HEADER = {"Authorization": "Bearer " + os.environ["OPENAI_API_KEY"]}


async def monitor_call(call_id):
    # Attach to the call's server events over WebSocket.
    async with websockets.connect(
        "wss://api.openai.com/v1/realtime?call_id=" + call_id,
        additional_headers=AUTH_HEADER,
    ) as websocket:
        async for message in websocket:
            print(json.loads(message))


@app.route("/", methods=["POST"])
def webhook():
    try:
        # Verify the webhook signature and parse the event.
        event = client.webhooks.unwrap(request.data, request.headers)
    except InvalidWebhookSignatureError:
        return Response("Invalid signature", status=400)

    if event.type == "realtime.call.incoming":
        # Accept the call; the JSON body is the Realtime session config.
        requests.post(
            "https://api.openai.com/v1/realtime/calls/"
            + event.data.call_id
            + "/accept",
            headers=AUTH_HEADER,
            json={"type": "realtime", "model": "gpt-realtime"},
        )
        # Simplified: blocks the webhook response while monitoring the call.
        asyncio.run(monitor_call(event.data.call_id))
    return Response(status=200)
```
It's also possible to redirect the call to another number. During the call, make a POST request to the Refer URI:

|Field|Value|
|---|---|
|URL|`https://api.openai.com/v1/realtime/calls/$CALL_ID/refer`|
|Payload|JSON with one key, `target_uri`: the value used in the `Refer-To` header. You can use a Tel URI or SIP URI.|
|Headers|`Authorization: Bearer YOUR_API_KEY` (substitute `YOUR_API_KEY` with a standard API key)|
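
A sketch of that request; the `callId` value and the `tel:` number are placeholders, while the endpoint and payload shape follow the table above:

```javascript
// Redirect an in-progress call to another number via the Refer URI.
const callId = "rtc_abc123"; // placeholder: use the call_id from the webhook
await fetch(`https://api.openai.com/v1/realtime/calls/${callId}/refer`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ target_uri: "tel:+15550123456" }), // placeholder target
});
```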
Our most advanced speech-to-speech model is [gpt-realtime](/docs/models/gpt-realtime).
For more information, see the [announcement blog post](https://openai.com/index/introducing-gpt-realtime/).
Update your session to use a prompt
----------------------
For more on prompting, see the [realtime prompting cookbook](https://cookbook.openai.com/examples/realtime_prompting_guide).
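
As a hedged sketch, updating a live session to reference a saved prompt might look like the following `session.update` client event; the prompt `id` and `version` values are placeholders:

```javascript
// Point the session at a saved, versioned prompt via session.update.
dataChannel.send(
  JSON.stringify({
    type: "session.update",
    session: {
      prompt: { id: "pmpt_123", version: "2" }, // placeholder id/version
    },
  })
);
```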
### General usage tips
For specifics, see the [realtime prompting cookbook](https://cookbook.openai.com/examples/realtime_prompting_guide).
#### 1. Be precise. Kill conflicts.
You can include sample phrases for preambles to add variety and better tailor to your use case.
For sample phrases, see the [realtime prompting cookbook](https://cookbook.openai.com/examples/realtime_prompting_guide).
#### 9. Use LLMs to improve your prompt.
This guide is long but not exhaustive! For more in a specific area, see the following resources:
* [Realtime prompting cookbook](https://cookbook.openai.com/examples/realtime_prompting_guide)
* [Inputs and outputs](/docs/guides/realtime-inputs-outputs): Text and audio input requirements and output options
* [Managing conversations](/docs/guides/realtime-conversations): Learn to manage a conversation over a long session
* [MCP servers](/docs/guides/realtime-mcp): How to use MCP servers to access additional tools
* [Realtime transcription](/docs/guides/realtime-transcription): How to transcribe audio with the Realtime API
* [Voice agents](https://openai.github.io/openai-agents-js/guides/voice-agents/quickstart/): A quickstart for building voice agents with the Agents SDK
Build low-latency, multimodal LLM applications with the Realtime API.
The OpenAI Realtime API enables low-latency communication with [models](/docs/models) that natively support speech-to-speech interaction.
Voice agents
------------
The fastest way to build these kinds of applications is the [Agents SDK for TypeScript](https://openai.github.io/openai-agents-js/guides/voice-agents/).
```js
import { RealtimeAgent, RealtimeSession } from "@openai/agents/realtime";

const agent = new RealtimeAgent({
  name: "Assistant",
  instructions: "You are a helpful assistant.",
});
const session = new RealtimeSession(agent);
```
[Follow the voice agent quickstart](https://openai.github.io/openai-agents-js/guides/voice-agents/quickstart/) to build Realtime agents in the browser.
To use the Realtime API directly outside the context of voice agents, check out the other connection options below.

While building [voice agents with the Agents SDK](https://openai.github.io/openai-agents-js/guides/voice-agents/) is the fastest path for most use cases, you can also connect to the Realtime API directly. There are three primary supported interfaces for the Realtime API: WebRTC, WebSocket, and SIP. The first two look like this:
```javascript
// WebRTC: POST your local SDP offer to create a call
// (`offer` comes from RTCPeerConnection.createOffer(), elided here).
const baseUrl = "https://api.openai.com/v1/realtime/calls";
const model = "gpt-realtime";
const sdpResponse = await fetch(`${baseUrl}?model=${model}`, {
  method: "POST",
  body: offer.sdp,
  headers: {
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/sdp",
  },
});

// Connect to a WebSocket for the in-progress call (Node `ws` package;
// `callId` was obtained when the call was created or accepted).
const url = "wss://api.openai.com/v1/realtime?call_id=" + callId;
const ws = new WebSocket(url, {
  headers: {
    Authorization: "Bearer " + process.env.OPENAI_API_KEY,
  },
});
```
### With SIP
1. A user connects to OpenAI via phone over SIP.
2. OpenAI sends a webhook to your application’s backend webhook URL, notifying your app of the incoming call.
```text
POST https://my_website.com/webhook_endpoint
user-agent: OpenAI/1.0 (+https://platform.openai.com/docs/webhooks)
content-type: application/json
webhook-id: wh_685342e6c53c8190a1be43f081506c52 # unique id for idempotency
```
You'll find this `call_id` in the webhook payload. The WebSocket URL for the call looks like this: `wss://api.openai.com/v1/realtime?call_id={callId}`.
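
For orientation, a hedged sketch of the incoming-call webhook body; the values are placeholders, and only `type` and `data.call_id` are relied on above:

```javascript
// Approximate shape of the incoming-call webhook payload.
const exampleWebhookBody = {
  type: "realtime.call.incoming",
  data: { call_id: "rtc_abc123" }, // placeholder: use with the accept/reject/refer URIs
};
```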