A button component that captures voice input and converts it to text, with cross-browser support.
The SpeechInput component provides an easy-to-use interface for capturing
voice input in your application. It uses the Web Speech API for real-time
transcription in supported browsers (Chrome, Edge), and falls back to
MediaRecorder with an external transcription service for browsers that don't
support Web Speech API (Firefox, Safari).
See scripts/speech-input.tsx for this example.
```bash
npx ai-elements@latest add speech-input
```
The component extends the shadcn/ui Button component, so all Button props are available.
| Prop | Type | Default | Description |
|---|---|---|---|
| `onTranscriptionChange` | `(text: string) => void` | - | Callback fired when final transcription text is available. Only fires for completed phrases, not interim results. |
| `onAudioRecorded` | `(audioBlob: Blob) => Promise<string>` | - | Callback for the MediaRecorder fallback. Required for Firefox/Safari support. Receives the recorded audio blob and should return transcribed text from an external service (e.g., OpenAI Whisper). |
| `lang` | `string` | - | Language for speech recognition. |
| `...props` | `React.ComponentProps<typeof Button>` | - | Any other props are spread to the Button component, including `variant`, `size`, `disabled`, etc. |
The component automatically detects browser capabilities and uses the best available method:
| Browser | Mode | Behavior |
|---|---|---|
| Chrome, Edge | Web Speech API | Real-time transcription, no server required |
| Firefox, Safari | MediaRecorder | Records audio, sends to external transcription service |
| Unsupported | Disabled | Button is disabled |
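The detection summarized in the table above can be sketched as a small helper. The function name and mode labels here are illustrative, not the component's actual internals; `webkitSpeechRecognition` is the prefixed global that Chrome and Edge expose:

```typescript
// Illustrative sketch of the capability detection; the component's real
// internals may differ.
type SpeechMode = "web-speech" | "media-recorder" | "unsupported";

function detectSpeechMode(w: {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
  MediaRecorder?: unknown;
}): SpeechMode {
  // Prefer real-time transcription when the Web Speech API exists.
  if (w.SpeechRecognition || w.webkitSpeechRecognition) return "web-speech";
  // Otherwise record audio for an external service (needs onAudioRecorded).
  if (w.MediaRecorder) return "media-recorder";
  // Neither API available: the button renders disabled.
  return "unsupported";
}
```

In the browser you would pass `window` to this helper; the plain-object parameter just keeps the sketch testable outside a browser.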
Uses the Web Speech API with the following configuration:

- `continuous: true` to keep recognition active until manually stopped
- `interimResults: true` to receive partial results during speech
- `lang` taken from the `lang` prop, defaulting to `"en-US"`

When the Web Speech API is unavailable, the component falls back to recording audio:

1. Records audio with the MediaRecorder API (in `audio/webm` format)
2. Calls `onAudioRecorded` with the recorded blob
3. Passes the returned text to `onTranscriptionChange`

Note: The `onAudioRecorded` prop is required for this mode to work. Without it, the button will be disabled in Firefox/Safari.
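The recognition settings above amount to something like the following sketch. The property names are the standard Web Speech API ones; the plain-object parameter stands in for a real `SpeechRecognition` instance for illustration:

```typescript
// Mirrors the relevant Web Speech API properties.
interface RecognitionConfig {
  continuous: boolean;
  interimResults: boolean;
  lang: string;
}

// Applies the configuration described above to a recognition instance.
function configureRecognition<T extends RecognitionConfig>(
  recognition: T,
  lang?: string,
): T {
  recognition.continuous = true; // keep listening until manually stopped
  recognition.interimResults = true; // surface partial results while speaking
  recognition.lang = lang ?? "en-US"; // from the lang prop, with default
  return recognition;
}
```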
The component only calls onTranscriptionChange with final transcripts.
Interim results (Web Speech API) are ignored to prevent incomplete text from
being processed.
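That filtering can be pictured with a small sketch over a results-shaped array. The helper name and the array-of-objects shape are simplifications of the real `SpeechRecognitionResultList`, where each result exposes `isFinal` and indexed alternatives:

```typescript
// Each entry approximates a SpeechRecognitionResult: isFinal plus the
// best alternative's transcript at index 0.
interface ResultLike {
  isFinal: boolean;
  0: { transcript: string };
}

// Keeps only final phrases, mirroring how interim results are ignored.
function collectFinalTranscript(results: ResultLike[]): string {
  return results
    .filter((r) => r.isFinal) // drop interim results
    .map((r) => r[0].transcript)
    .join(" ");
}
```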
The component provides cross-browser support through a two-tier system:
| Browser | API Used | Requirements |
|---|---|---|
| Chrome | Web Speech API | None |
| Edge | Web Speech API | None |
| Firefox | MediaRecorder | onAudioRecorded prop |
| Safari | MediaRecorder | onAudioRecorded prop |
For full cross-browser support, provide the onAudioRecorded callback that
sends audio to a transcription service like OpenAI Whisper, Google Cloud
Speech-to-Text, or AssemblyAI.
To support Firefox and Safari, provide an onAudioRecorded callback that sends
audio to a transcription service:
```tsx
const handleAudioRecorded = async (audioBlob: Blob): Promise<string> => {
  const formData = new FormData();
  formData.append("file", audioBlob, "audio.webm");
  formData.append("model", "whisper-1");

  const response = await fetch(
    "https://api.openai.com/v1/audio/transcriptions",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: formData,
    },
  );

  const data = await response.json();
  return data.text;
};

<SpeechInput
  onTranscriptionChange={(text) => console.log(text)}
  onAudioRecorded={handleAudioRecorded}
/>;
```
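Note that the snippet above reads `process.env.OPENAI_API_KEY` in code that runs in the browser, which would expose the key in the client bundle. A common alternative is to proxy the audio through your own backend; the variant below is a sketch, and the `/api/transcribe` route path is an assumed example, not part of the component:

```typescript
// Sketch: send the blob to your own server route (assumed path) so the
// OpenAI key stays server-side. The route would call Whisper and return
// a JSON body of shape { text: string }.
const handleAudioRecorded = async (audioBlob: Blob): Promise<string> => {
  const body = new FormData();
  body.append("file", audioBlob, "audio.webm");

  const response = await fetch("/api/transcribe", { method: "POST", body });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }

  const data = (await response.json()) as { text: string };
  return data.text;
};
```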
- Final transcripts are delivered through the `onTranscriptionChange` callback
- Recognition language is set with the `lang` prop
- Firefox and Safari require the `onAudioRecorded` prop to be provided
- Audio is recorded in `audio/webm` format for the MediaRecorder fallback

The component includes full TypeScript definitions for the Web Speech API:
- `SpeechRecognition`
- `SpeechRecognitionEvent`
- `SpeechRecognitionResult`
- `SpeechRecognitionAlternative`
- `SpeechRecognitionErrorEvent`

These types are properly declared for both standard and webkit-prefixed implementations.
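A trimmed sketch of what resolving those declarations looks like in practice follows. The interface here is deliberately minimal and the helper name is illustrative; the component's actual declarations cover the full API:

```typescript
// Minimal shape of the recognition object the component needs.
interface SpeechRecognitionLike {
  continuous: boolean;
  interimResults: boolean;
  lang: string;
  start(): void;
  stop(): void;
}

type RecognitionCtor = new () => SpeechRecognitionLike;

// Resolves whichever constructor the browser exposes: the standard name
// or the webkit-prefixed one used by Chrome and Edge.
function resolveRecognitionCtor(w: {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}): RecognitionCtor | undefined {
  return w.SpeechRecognition ?? w.webkitSpeechRecognition;
}
```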