Voice is how you tell Cue. Screen context is how Cue understands. AI action is what gets delivered. One hotkey, the full loop. Mac & Windows.
Voice alone just gives you words on screen; that's what SuperWhisper, Wispr Flow, and Aqua Voice do. But the real question is never "what did you say." It's "what do you want done, given where you are right now." Cue combines all three layers: your voice (intent), your screen (context), and AI action (the delivered result). Each layer compounds the one below it.
Press a hotkey, speak. Cue transcribes, translates, and detects your language on the fly. Sub-3-second latency, 20+ languages, no training required.
Sub-3-second latency for short utterances. Accuracy on par with or better than SuperWhisper, Wispr Flow, and Aqua Voice, powered by Deepgram Nova-2.
Speak Mandarin, write English. Speak Japanese, write Korean. Cue auto-detects your spoken language and writes in whichever one you need.
English, Mandarin (Simplified + Traditional), Cantonese, Japanese, Korean, Spanish, French, German, Portuguese, Russian, Arabic, Hindi, and 8 more.
Cue reads your screen, knows which app is active, and understands the input field you're in. Same voice command means different things in Mail vs. Slack vs. a search box. Cue figures it out so you don't have to.
Cue captures a screenshot on agent invocation. Tasks adapt to what's in front of you without any explicit context dump. Reads text, UI, data, code, everything.
Cue knows if you're in a chat, an email, a search box, or a code editor. Formal tone in Mail, casual in WhatsApp, keyword mode in Spotlight. Zero mode switching.
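For the curious, the idea of picking a writing style from the active app can be sketched in a few lines. This is a minimal illustration, not Cue's actual implementation: the names `PolishStyle`, `APP_KINDS`, `STYLES`, and `style_for`, the app-to-kind table, and the fallback style are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolishStyle:
    """How dictated text should be shaped for a given destination."""
    tone: str
    punctuation: bool


# Hypothetical mapping from app name to the kind of input it hosts.
APP_KINDS = {
    "Mail": "email",
    "WhatsApp": "chat",
    "Slack": "chat",
    "Spotlight": "search",
}

# Hypothetical style per input kind, mirroring the examples in the copy above.
STYLES = {
    "email": PolishStyle(tone="formal", punctuation=True),
    "chat": PolishStyle(tone="casual", punctuation=True),
    "search": PolishStyle(tone="keywords", punctuation=False),
}

# Assumed fallback when the active app is not recognized.
DEFAULT_STYLE = PolishStyle(tone="neutral", punctuation=True)


def style_for(app_name: str) -> PolishStyle:
    """Return the polish style for the active app, or the default."""
    kind = APP_KINDS.get(app_name)
    return STYLES.get(kind, DEFAULT_STYLE)
```

With this table, `style_for("Mail")` yields the formal style and `style_for("Spotlight")` yields keyword mode, with no manual mode switch by the user.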
Highlight a paragraph, press the hotkey, say "make this shorter." Cue reads what you selected and acts on it. No copy-paste gymnastics.
Voice + context turn into finished work. Fast path for text tasks (translate, polish, summarize) in 1-3 seconds. Full agent path for multi-step tasks with tools. Claude Sonnet and Opus under the hood.
Every dictation goes through a polish step that respects where you are. Full punctuation in email, short replies in chat, bare keywords in search. Never one-size-fits-all.
"Scan my receipts and build an expense report." Cue plans, reads files, writes code, opens apps, and produces the finished artifact in under 60 seconds.
No integrations, no plugins, no special adapters. If your app has a text field or a window, Cue works with it. Terminal to Slack to Figma.
Cue is a real desktop app, not a browser tab. Cue listens only while the hotkey is active, screenshots stay on-device until you trigger an agent task, and history is 100% local. No account required for dictation.
macOS 13+ on Apple Silicon and Intel, plus Windows 10 (version 1809+) and Windows 11 on x64. Same hotkey, same voice, same features.
Every interaction stored in a JSONL file on your device. Nothing is uploaded without your trigger. Searchable offline forever.
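Because JSONL is one JSON object per line, a local history file can be searched offline with nothing but a standard library. The sketch below assumes each record has a `text` field; that field name, the function name `search_history`, and the file location are hypothetical, since the copy above does not document the schema.

```python
import json
from pathlib import Path
from typing import Iterator


def search_history(path: Path, query: str) -> Iterator[dict]:
    """Yield records whose (assumed) "text" field contains the query.

    Matching is case-insensitive. Blank lines are skipped.
    """
    needle = query.casefold()
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # "text" is an assumed field name, not a documented schema.
            if needle in str(record.get("text", "")).casefold():
                yield record
```

Reading line by line keeps memory use flat even as the history file grows, and no network access is involved at any point.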
Fn on Mac, Alt on Windows. Long-press for push-to-talk, tap to toggle. Option key for pure dictation. No click, no menu, no delay.
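One common way to tell a long-press from a tap is to compare how long the key was held against a threshold. The sketch below illustrates that idea only; the 0.4-second threshold and the names `LONG_PRESS_SECONDS` and `classify_press` are assumptions, not Cue's real values. A live implementation would likely start recording as soon as the threshold passes while the key is still down, rather than waiting for release.

```python
# Assumed threshold separating a tap from a long-press, in seconds.
LONG_PRESS_SECONDS = 0.4


def classify_press(
    pressed_at: float,
    released_at: float,
    threshold: float = LONG_PRESS_SECONDS,
) -> str:
    """Classify a hotkey press by how long the key was held.

    Returns "push_to_talk" for a long-press and "toggle" for a tap.
    Timestamps are in seconds on the same monotonic clock.
    """
    if released_at < pressed_at:
        raise ValueError("released_at must not be earlier than pressed_at")
    held = released_at - pressed_at
    return "push_to_talk" if held >= threshold else "toggle"
```

For example, a key held for 1.2 seconds is treated as push-to-talk, while a 0.1-second tap toggles recording on or off.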
We're heads-down on voice wake, long-form capture, and a Memory system that aggregates your history across every AI tool you use.
Hands-free activation without pressing a hotkey. Wake word customizable, runs fully local on your device.
Auto-detect meetings, videos, and long-form speech. Cue transcribes, summarizes, and saves to your Memory without a hotkey.
Import ChatGPT, Claude, and local agent history. Cue becomes the single home for your relationship with every AI tool.
Follow Updates for weekly ship notes, or join our Beta Community.
Unlimited voice dictation on day one. AI agent on day two. No credit card required.