TIL on January 25, 2026
Web Speech API
The Web Speech API is a browser-native interface for voice. It has two parts: SpeechRecognition (voice to text) and SpeechSynthesis (text to speech). In Chrome, recognition sends audio to Google's servers by default, so it needs a network connection; synthesis can use the voices installed on your OS, and those local voices work offline.
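As a rough illustration of the two halves, here is a minimal TypeScript sketch. The helper names (getRecognitionCtor, listenOnce, speak) and the scope parameter are made up for this note and are not part of the API. The scope is passed in explicitly, on the assumption that in a real page it would be window. The sketch uses only the standard lang, interimResults, onresult, onerror, and start() members, plus speechSynthesis.speak().

```typescript
// Minimal structural types so the sketch compiles without DOM typings.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult: ((event: any) => void) | null;
  onerror: ((event: any) => void) | null;
  start(): void;
}

// Hypothetical helper: find the recognition constructor (Chrome uses the webkit prefix).
function getRecognitionCtor(scope: any): (new () => RecognitionLike) | null {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// Hypothetical helper: listen for one phrase and hand its transcript to onText.
// Returns false when recognition is unavailable in the given scope.
function listenOnce(
  scope: any,
  onText: (text: string) => void,
  onError: (reason: string) => void,
  lang: string = "en-US"
): boolean {
  const Ctor = getRecognitionCtor(scope);
  if (!Ctor) return false;
  const recognition = new Ctor();
  recognition.lang = lang;
  recognition.interimResults = false; // only report final results
  recognition.onresult = (event) => onText(event.results[0][0].transcript);
  recognition.onerror = (event) => onError(String(event.error));
  recognition.start();
  return true;
}

// Hypothetical helper: speak text aloud. Returns false when synthesis is unavailable.
function speak(scope: any, text: string): boolean {
  if (!scope.speechSynthesis || !scope.SpeechSynthesisUtterance) return false;
  scope.speechSynthesis.speak(new scope.SpeechSynthesisUtterance(text));
  return true;
}
```

In a page this would probably be called from a click handler, for example listenOnce(window, (text) => speak(window, text), console.error), since browsers generally expect a user gesture and a microphone permission prompt before either half will run.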
Use cases: voice commands, dictation, accessibility, having your app talk back to users. People have built voice-controlled 3D scenes, karaoke games that match speech to lyrics, and browser extensions that read selected text aloud.
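For the voice-command case, most of the work after recognition is mapping a loosely phrased transcript onto an action. The sketch below is one simple way to do that, not how any of the projects mentioned above work: the CommandTable type and the normalize and runCommand helpers are hypothetical, the matching is plain substring search, and the normalization assumes English phrases.

```typescript
// Hypothetical command table: spoken phrase -> action to run.
type CommandTable = Record<string, () => void>;

// Lowercase, strip punctuation, and collapse whitespace
// so "Turn left!" compares equal to "turn left".
function normalize(transcript: string): string {
  return transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Runs the first command whose phrase appears in the transcript.
// Returns the matched phrase, or null when nothing matched.
function runCommand(transcript: string, commands: CommandTable): string | null {
  const heard = normalize(transcript);
  for (const phrase of Object.keys(commands)) {
    if (heard.indexOf(normalize(phrase)) !== -1) {
      commands[phrase]();
      return phrase;
    }
  }
  return null;
}
```

Substring matching is forgiving of filler words ("please turn left now") but can misfire on short phrases, so a real feature would likely want word-boundary checks or a confidence threshold.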
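The tradeoff vs. dedicated speech models (Whisper, AssemblyAI, etc.): the Web Speech API is free and takes minutes to wire up, but accuracy drops with accents, background noise, or specialized vocabulary. Recognition support is also uneven: Chrome, Edge, and Safari expose it (typically under the webkit prefix, as webkitSpeechRecognition), while Firefox does not enable it by default. It's good for prototypes and simple features, but reach for a dedicated speech model when transcription quality matters.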
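One way to encode that tradeoff is a small decision helper that falls back to a server-side model when the browser lacks recognition or when accuracy matters. This is only a sketch: the Backend type, hasBrowserRecognition, pickBackend, and the needsHighAccuracy flag are all hypothetical names, and the actual call to a speech model is left out.

```typescript
type Backend = "web-speech" | "server-model";

// Hypothetical helper: true when the scope exposes a recognition constructor,
// either unprefixed or under the webkit prefix.
function hasBrowserRecognition(scope: any): boolean {
  return Boolean(scope.SpeechRecognition || scope.webkitSpeechRecognition);
}

// Hypothetical policy: use the browser API only when it exists
// and transcription quality is not critical.
function pickBackend(scope: any, needsHighAccuracy: boolean): Backend {
  if (needsHighAccuracy || !hasBrowserRecognition(scope)) {
    return "server-model";
  }
  return "web-speech";
}
```

In a page, pickBackend(window, false) would pick the built-in recognizer where it is available and the server route elsewhere, which keeps a prototype working in browsers without recognition.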