Speech to Text
* Model data (75–145 MB) is downloaded on first use only. Audio is processed entirely in your browser.
Drag & drop audio/video here, or click to select
Supported formats: MP3 / WAV / MP4 / WebM / M4A / OGG
Transcribe audio and video files with AI, entirely in your browser, using Whisper. Nothing is sent to a server.