Yiddish Voice: A Desktop App for Yiddish Speech-to-Text
Type Yiddish with your voice — anywhere on your computer, in any app — with a single keyboard shortcut. That's the idea behind a small Windows desktop app I built for a client who works with Yiddish audio.
The problem it solves: there is no reliable Yiddish dictation tool. Windows speech recognition doesn't support Yiddish, and neither do the dictation features built into common productivity software. If you need to type Yiddish quickly, you're either typing it character by character or copy-pasting from somewhere else. The app fixes that by hooking directly into your workflow: press a hotkey, speak, and the transcription appears wherever your cursor is.
How It Works
The app runs in the system tray and registers a global hotkey (Ctrl+Shift+Space by default). Press and hold it from anywhere (a Word document, an email, a chat window) and a small floating pill appears on screen. Speak your Yiddish, then release the hotkey to stop recording. The audio is sent to a transcription backend, and the result is automatically inserted into whichever window was active when you pressed the hotkey.
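The press, speak, release cycle can be modeled as a small state machine. The sketch below is illustrative only: the state and event names are hypothetical rather than taken from the app's source, and the actual hotkey hook and audio capture are left out.

```typescript
// Hypothetical state and event names; not the app's actual source.
type DictationState = "idle" | "recording" | "transcribing";
type DictationEvent = "hotkeyDown" | "hotkeyUp" | "transcriptReady" | "error";

// Pure transition function: given the current state and an event,
// return the next state. Unexpected events leave the state unchanged.
function nextState(state: DictationState, event: DictationEvent): DictationState {
  if (state === "idle") {
    return event === "hotkeyDown" ? "recording" : "idle";
  }
  if (state === "recording") {
    if (event === "hotkeyUp") return "transcribing"; // releasing the key stops recording
    if (event === "error") return "idle";
    return "recording"; // e.g. repeated hotkeyDown events from key auto-repeat
  }
  // state === "transcribing": wait for the backend to answer or fail
  return event === "transcriptReady" || event === "error" ? "idle" : "transcribing";
}

// Replay a sequence of events, starting from "idle".
function replay(events: DictationEvent[]): DictationState {
  return events.reduce<DictationState>(nextState, "idle");
}
```

Keeping the transitions in one pure function makes it easy to ignore stray events, such as the extra key-down events a held key can generate.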
The pill UI is frameless, always-on-top, and draggable — so you can move it out of the way while you speak. It uses Electron's transparent window feature to blend into the desktop with a frosted-glass look rather than sitting in a hard rectangle.
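As a rough illustration, the window properties above map onto documented Electron BrowserWindow options (frame, transparent, alwaysOnTop). The sizes and the resizable and skipTaskbar values below are assumptions made for the sketch, not the app's real settings.

```typescript
// Local type so the sketch compiles without Electron's typings.
// All five boolean fields are real BrowserWindow constructor options.
interface PillWindowOptions {
  width: number;
  height: number;
  frame: boolean;
  transparent: boolean;
  alwaysOnTop: boolean;
  resizable: boolean;
  skipTaskbar: boolean;
}

const pillWindowOptions: PillWindowOptions = {
  width: 220,        // assumed size
  height: 56,        // assumed size
  frame: false,      // frameless: no title bar or borders
  transparent: true, // lets the page draw its own rounded, translucent shape
  alwaysOnTop: true, // stays above the app being dictated into
  resizable: false,  // assumption
  skipTaskbar: true, // assumption: the tray icon is the only taskbar presence
};

// In the Electron main process this would be used as:
//   const pill = new BrowserWindow(pillWindowOptions);
// Dragging a frameless window is usually enabled from the page's CSS:
//   -webkit-app-region: drag;
```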
Under the Hood
Two Transcription Backends
The app supports two backends, configurable in settings:
- RunPod pod: a custom Whisper deployment on a RunPod GPU instance, tuned for Yiddish. It has lower latency on long audio, but each use costs GPU compute.
- Vertex AI / Gemini: sends audio to a fine-tuned Gemini model running on Google Cloud. It handles shorter clips well and is more tolerant of varying audio quality.
The language is hardcoded to yi (the ISO 639-1 code for Yiddish), so the model never has to guess the language and never produces accidental Hebrew or German output.
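One way to picture the backend setting and the fixed language code is a single request builder. This is a sketch under assumptions: the settings shape, field names, and payload format are invented for illustration, and the real RunPod and Vertex AI request formats differ from it.

```typescript
// Hypothetical settings shape: one variant per backend.
type BackendConfig =
  | { kind: "runpod"; endpointUrl: string; apiKey: string }
  | { kind: "vertex"; endpointUrl: string; accessToken: string };

// ISO 639-1 code for Yiddish, fixed rather than auto-detected.
const LANGUAGE = "yi";

// Hypothetical request description; not either service's real wire format.
interface TranscriptionRequest {
  url: string;
  headers: Record<string, string>;
  body: { audioBase64: string; language: string };
}

// Build the request for whichever backend is selected in settings.
function buildRequest(config: BackendConfig, audioBase64: string): TranscriptionRequest {
  const token = config.kind === "runpod" ? config.apiKey : config.accessToken;
  return {
    url: config.endpointUrl,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    // The language is always sent explicitly, whichever backend is used.
    body: { audioBase64, language: LANGUAGE },
  };
}
```

Because the language is a constant rather than a setting, no code path can send audio without it.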
One Trick Worth Noting
The app uses Electron's net module (Chromium's network stack) for the actual HTTP requests to the transcription backend, rather than Node's native https module. The reason: on networks with kosher content filters that do deep packet inspection, Node.js HTTPS requests can get silently dropped while Chromium's requests go through. Using net means the app works on filtered networks out of the box.
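A minimal sketch of how the transport can be kept swappable: the transcription call takes the HTTP function as a parameter, so the Chromium-backed one can be passed in from Electron. The function names, payload, and response shape are hypothetical. The commented usage assumes an Electron version that provides net.fetch.

```typescript
// Minimal structural types so the sketch does not depend on Electron or DOM typings.
interface HttpResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}
interface PostInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}
type HttpPost = (url: string, init: PostInit) => Promise<HttpResponse>;

// Send audio through the injected transport and return the transcript.
// Hypothetical response shape: { "text": "..." }.
async function transcribe(post: HttpPost, url: string, audioBase64: string): Promise<string> {
  const res = await post(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ audioBase64, language: "yi" }),
  });
  if (!res.ok) {
    throw new Error(`Transcription failed with HTTP ${res.status}`);
  }
  const parsed = JSON.parse(await res.text()) as { text?: unknown };
  if (typeof parsed.text !== "string") {
    throw new Error("Response has no text field");
  }
  return parsed.text;
}

// In the Electron main process, once the app is ready, the Chromium network
// stack could be supplied like this:
//   import { net } from "electron";
//   const text = await transcribe((url, init) => net.fetch(url, init), backendUrl, audio);
```

Injecting the transport also means the same function can be exercised with a fake in tests, with no network involved.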
Auto-Place
After transcription, the result lands in the clipboard and a Ctrl+V is simulated to paste it into the active window automatically. There's a setting to disable this and just copy to clipboard if you want to review before pasting.
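The copy-then-paste flow and its opt-out can be sketched as below. The clipboard write and the keystroke are passed in as functions because the source doesn't say how the app simulates Ctrl+V; sendPasteKeystroke is a hypothetical helper, and all names here are illustrative.

```typescript
// Side effects are injected so the flow itself stays easy to test.
interface PasteDeps {
  writeClipboard(text: string): void; // e.g. Electron's clipboard.writeText
  sendPasteKeystroke(): void;         // hypothetical helper that simulates Ctrl+V
}

// Always copy the transcript; paste only when auto-paste is enabled.
function deliverTranscript(text: string, autoPaste: boolean, deps: PasteDeps): "pasted" | "copied" {
  deps.writeClipboard(text);
  if (!autoPaste) {
    return "copied"; // user reviews the text and pastes manually
  }
  deps.sendPasteKeystroke();
  return "pasted";
}
```

Writing to the clipboard first, in both modes, means the transcript is still recoverable if the simulated paste lands in the wrong window.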
History
Every transcription is saved locally to a JSON file (capped at 50 entries). A separate history window lets you browse past transcriptions, copy any of them, or clear the log. The history is never uploaded anywhere; it stays entirely on your machine.
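A capped history like this can be kept with a few lines of list handling. The sketch below assumes the newest entry goes first and the oldest is dropped once the cap is reached. The entry fields and function names are hypothetical, and reading and writing the file is left out.

```typescript
// Hypothetical entry shape.
interface HistoryEntry {
  text: string;
  timestamp: string; // ISO 8601 string
}

const MAX_ENTRIES = 50;

// Return a new list with the entry added at the front, trimmed to the cap.
// Assumption: the oldest entries are the ones discarded.
function addEntry(
  history: readonly HistoryEntry[],
  entry: HistoryEntry,
  max: number = MAX_ENTRIES
): HistoryEntry[] {
  return [entry, ...history].slice(0, max);
}

// Serialize for writing to the local JSON file.
function serializeHistory(history: readonly HistoryEntry[]): string {
  return JSON.stringify(history, null, 2);
}
```

Clearing the log is then just a matter of writing an empty array back to the file.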
Distribution
The app is packaged as a portable Windows .exe: no installer, no admin rights needed. Download, run, done. A start.bat is also included for users who prefer running from source with Node.