Remix of jubertioai/hello-transcription
10/2/2025
hello-transcription

Real-time speech transcription using OpenAI's Realtime API - a demonstration of transcription-only mode without AI responses.

Features

  • Real-time Transcription: Live speech-to-text conversion as you speak
  • Multiple Models: Choose between GPT-4o Transcribe, GPT-4o Mini Transcribe, or Whisper-1
  • Language Support: Transcribe in multiple languages (English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean)
  • Voice Activity Detection (VAD): Automatic detection of speech segments
  • Logprobs Support: Optional confidence scores for transcriptions
  • Split View: See transcriptions and event logs side-by-side

How It Works

This app uses OpenAI's Realtime API in transcription-only mode:

  1. Your voice is captured via WebRTC
  2. Audio is streamed to OpenAI's transcription service
  3. Transcriptions are returned in real time
  4. No AI responses are generated (transcription only)

Model Differences

  • GPT-4o Transcribe: Streaming transcription with incremental updates
  • GPT-4o Mini Transcribe: Smaller model, streaming transcription
  • Whisper-1: Complete transcription after each speech segment (no streaming)
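
The differences above can be captured in a small lookup table, for example to decide whether the UI should expect partial updates. This is a minimal sketch, assuming the API model IDs are `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, and `whisper-1`; the `MODELS` table and `expectsDeltas` helper are hypothetical names and are not taken from this val's code.

```typescript
// Hypothetical lookup table; the model IDs are assumed to match the API's identifiers.
type TranscriptionModel =
  | "gpt-4o-transcribe"
  | "gpt-4o-mini-transcribe"
  | "whisper-1";

interface ModelInfo {
  label: string;
  streaming: boolean; // true when partial (incremental) transcripts are expected
}

const MODELS: Record<TranscriptionModel, ModelInfo> = {
  "gpt-4o-transcribe": { label: "GPT-4o Transcribe", streaming: true },
  "gpt-4o-mini-transcribe": { label: "GPT-4o Mini Transcribe", streaming: true },
  "whisper-1": { label: "Whisper-1", streaming: false },
};

// Returns true when the UI should render incremental updates for this model.
function expectsDeltas(model: TranscriptionModel): boolean {
  return MODELS[model].streaming;
}
```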

Configuration Options

  • Model: Select the transcription model
  • Language: Choose the primary language for better accuracy
  • VAD: Enable/disable automatic voice activity detection
  • Logprobs: Include confidence scores (for advanced use)
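
Log probabilities are natural logarithms, so a confidence value between 0 and 1 can be recovered with `Math.exp`. The sketch below averages the per-token log probabilities and exponentiates the result, which gives a geometric-mean probability for the whole transcript. It assumes each logprob entry carries a `token` string and a numeric `logprob`; the `TokenLogprob` type and `averageConfidence` helper are hypothetical names.

```typescript
// Assumed shape of one logprob entry; check it against the events you actually receive.
interface TokenLogprob {
  token: string;
  logprob: number; // natural log of the token's probability, always <= 0
}

// Geometric-mean probability across tokens: exp(mean(logprob)).
// Returns null when there are no tokens to score.
function averageConfidence(logprobs: TokenLogprob[]): number | null {
  if (logprobs.length === 0) return null;
  const sum = logprobs.reduce((acc, entry) => acc + entry.logprob, 0);
  return Math.exp(sum / logprobs.length);
}
```

A geometric mean penalizes a single very uncertain token more than an arithmetic mean of probabilities would, which is usually the desired behavior when flagging doubtful transcripts.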

Usage

  1. Click "Start" to begin transcription
  2. Allow microphone access when prompted
  3. Start speaking; transcriptions appear in real time
  4. Partial transcriptions update as you speak
  5. Final transcriptions are marked in green
  6. Click "Stop" to end the session

API Endpoints

  • GET / - Serves the transcription interface
  • POST /rtc - Creates a WebRTC transcription session
  • POST /observer/:callId - Starts a WebSocket observer for the given call's transcription events
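
The three routes above can be expressed as a small matching function. This is an illustrative sketch only: the val's actual routing code is not shown in this README, and `RouteMatch`, `matchRoute`, and the `rtc_123` call ID in the usage note are hypothetical.

```typescript
// Result of matching a request against the three documented routes.
type RouteMatch =
  | { route: "home" }
  | { route: "rtc" }
  | { route: "observer"; callId: string };

const OBSERVER_PREFIX = "/observer/";

// Pure matcher: takes the HTTP method and URL pathname, returns the route or null.
function matchRoute(method: string, pathname: string): RouteMatch | null {
  if (method === "GET" && pathname === "/") return { route: "home" };
  if (method === "POST" && pathname === "/rtc") return { route: "rtc" };
  if (method === "POST" && pathname.startsWith(OBSERVER_PREFIX)) {
    const callId = pathname.slice(OBSERVER_PREFIX.length);
    // Require exactly one non-empty path segment after /observer/.
    if (callId.length > 0 && !callId.includes("/")) {
      return { route: "observer", callId };
    }
  }
  return null;
}
```

For example, `matchRoute("POST", "/observer/rtc_123")` yields an `observer` match whose `callId` is `"rtc_123"`, while a `GET` to `/rtc` returns `null`.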

Environment Variables

Set in your Val Town environment:

  • OPENAI_API_KEY - Your OpenAI API key (required)
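
Because the key is required, it can help to fail fast with a clear message when it is missing. The helper below is a sketch; `requireEnv` is a hypothetical name, and the lookup function is injected so the helper can be exercised without real environment access.

```typescript
// Reads a required variable through an injected lookup function and
// throws a descriptive error when the value is missing or blank.
function requireEnv(
  name: string,
  lookup: (key: string) => string | undefined,
): string {
  const value = lookup(name);
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

In a Deno runtime the lookup could be `(key) => Deno.env.get(key)`, for example `requireEnv("OPENAI_API_KEY", (key) => Deno.env.get(key))`.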

Local Development

    # Install Deno
    curl -fsSL https://deno.land/install.sh | sh

    # Run locally
    deno run --allow-all main.tsx

Val Town Deployment

  1. Fork/remix this val on Val Town
  2. Add your OPENAI_API_KEY to the val's environment variables
  3. Your app will be available at https://[your-val-name].val.run

Technical Details

The app uses OpenAI's Realtime API in transcription mode:

  • Session type: transcription (not realtime)
  • Audio format: PCM16
  • Noise reduction: Near-field (optimized for close microphones)
  • WebRTC data channel for receiving transcription events
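
The settings above, together with the configuration options, might be assembled into a session configuration object along these lines. This is a sketch under stated assumptions: the field names (`audio.input.format`, `noise_reduction`, `transcription`, `turn_detection`, `include`) and the 24 kHz PCM representation are modeled on OpenAI's published session schema as best understood here and should be checked against the current API reference. `TranscriptionOptions` and `buildSessionConfig` are hypothetical names, not this val's code.

```typescript
interface TranscriptionOptions {
  model: "gpt-4o-transcribe" | "gpt-4o-mini-transcribe" | "whisper-1";
  language?: string; // ISO-639-1 code such as "en"
  vad: boolean;      // automatic voice activity detection
  logprobs: boolean; // request confidence data
}

function buildSessionConfig(opts: TranscriptionOptions) {
  // Only send `language` when the user picked one.
  const transcription: { model: string; language?: string } = { model: opts.model };
  if (opts.language) transcription.language = opts.language;

  // Assumed include key for logprobs; verify against the API reference.
  const include: string[] = opts.logprobs
    ? ["item.input_audio_transcription.logprobs"]
    : [];

  return {
    type: "transcription", // transcription-only session, no model responses
    audio: {
      input: {
        format: { type: "audio/pcm", rate: 24000 }, // assumed encoding of PCM16
        noise_reduction: { type: "near_field" },
        transcription,
        // null is assumed to disable VAD, leaving the client to commit audio itself
        turn_detection: opts.vad ? { type: "server_vad" } : null,
      },
    },
    include,
  };
}
```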

Event Flow

  1. input_audio_buffer.committed - Audio buffer committed (a speech segment ended, or the client committed it manually)
  2. conversation.item.input_audio_transcription.delta - Partial transcription
  3. conversation.item.input_audio_transcription.completed - Final transcription
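
The event flow above maps naturally onto a reducer that accumulates partial text per item and promotes it to a final transcript on completion. This is a minimal sketch, assuming each event carries an `item_id` and that delta events carry a `delta` string while completed events carry a `transcript` string. `TranscriptState`, `applyEvent`, and the `item_1` ID used in the usage note are hypothetical.

```typescript
interface TranscriptState {
  partial: Record<string, string>; // item_id -> text accumulated from deltas
  final: Record<string, string>;   // item_id -> completed transcript
}

type TranscriptionEvent =
  | { type: "input_audio_buffer.committed"; item_id: string }
  | { type: "conversation.item.input_audio_transcription.delta"; item_id: string; delta: string }
  | { type: "conversation.item.input_audio_transcription.completed"; item_id: string; transcript: string };

// Pure reducer: returns a new state and never mutates the one passed in.
function applyEvent(state: TranscriptState, event: TranscriptionEvent): TranscriptState {
  switch (event.type) {
    case "input_audio_buffer.committed":
      // Reserve an empty slot so the UI can show a placeholder for this segment.
      return {
        ...state,
        partial: { ...state.partial, [event.item_id]: state.partial[event.item_id] ?? "" },
      };
    case "conversation.item.input_audio_transcription.delta":
      // Append the new fragment to whatever has arrived so far.
      return {
        ...state,
        partial: {
          ...state.partial,
          [event.item_id]: (state.partial[event.item_id] ?? "") + event.delta,
        },
      };
    case "conversation.item.input_audio_transcription.completed": {
      // Drop the partial entry and record the authoritative final text.
      const partial = { ...state.partial };
      delete partial[event.item_id];
      return { partial, final: { ...state.final, [event.item_id]: event.transcript } };
    }
  }
}
```

Feeding a committed event, two deltas (`"Hello"` and `" world"`), and a completed event for `item_1` through `applyEvent` leaves `partial` empty and stores the completed transcript under `final["item_1"]`.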

Credits

Built with OpenAI's Realtime API for transcription-only use cases.
