Now available for macOS

Yes, another voice-to-text app.
Here's why this one exists.

What you get with Shoute:

  • Local model that matches cloud speed
  • Context-aware formatting in local mode
  • No screenshots
  • 100+ languages in cloud mode, 25+ in local mode: speak your native language and either
    • Transcribe natively, or
    • Translate to English

↓ Download for Mac
↓ Download for Windows

macOS 13+ · Apple Silicon & Intel · 2,000 words/week free

🛡 No screenshots taken
💻 Local model = cloud speed
🌍 100+ languages
Sub-second transcription

Every app makes you choose

"Great formatting OR privacy. Pick one."
Cloud apps (Wispr Flow, Aqua Voice) format beautifully - but some screenshot your screen every few seconds for "context." And if you want privacy? Their local mode is either nonexistent or noticeably worse.

Local apps (SuperWhisper, TalkFlowy) protect your privacy - but output raw transcripts you need to clean up yourself. The local model is always a downgrade.

Shoute doesn't make you choose. Cloud mode streams audio for blazing-fast transcription. Local mode keeps everything on your Mac. Both produce the same formatted, context-aware output. And neither mode ever screenshots your screen.

Three things no competitor does together

Lots of apps do one of these. None do all three.

🛡 Privacy without compromise

No screenshots of your screen. No screen capture permissions. Cloud mode streams audio for fast transcription - nothing is stored. Local mode? Zero audio leaves your device, period. You choose the mode. We never read your screen either way.

Local model = cloud speed

Most apps have a "local mode" that's noticeably slower. Ours isn't. Same formatted, properly punctuated, cleanly structured output at the same snappy speed whether you're on-device or cloud. Try both and compare - we're that confident.

🎯 Formats to the app you're in

Same spoken words, different output. Casual in Slack. Formal in Mail. Checklist in Reminders. Clean paragraph in Notes. Shoute reads what app you're in and formats accordingly - no settings to toggle.

Same voice. Different format.

Say the same thing everywhere. Shoute formats it for where you are.

💬 Slack Casual
You said

"hey can you push the standup to 3 today um something came up with the client"

Shoute output

Hey, can you push the standup to 3 today? Something came up with the client.

Mail Formal
You said

"hey sarah thanks for the proposal let's schedule a call this week to go over next steps does thursday afternoon work"

Shoute output

Hi Sarah,

Thanks for sending over the proposal. I'd like to schedule a call this week to discuss next steps. Does Thursday afternoon work for you?

Best regards

Reminders Checklist
You said

"pick up dry cleaning get almond milk call the dentist about tuesday and order avi's birthday present"

Shoute output

  • Pick up dry cleaning
  • Get almond milk
  • Call the dentist about Tuesday
  • Order Avi's birthday present

📝 Notes / Docs Paragraph
You said

"the main issue with the current approach is that we're triggering the photo evaluation too early um users haven't uploaded enough photos yet so the results aren't meaningful"

Shoute output

The main issue with the current approach is that we're triggering the photo evaluation too early. Users haven't uploaded enough photos yet, so the results aren't meaningful.

Speak your language. Get formatted text.

Not just transcription - the same context-aware formatting works across every language. Dictate in Tamil and get a proper email. Speak Spanish and get a clean Slack message.

🇩🇪 German
🇫🇷 French
🇮🇳 Hindi
🇨🇳 Chinese
🇰🇷 Korean
🇵🇹 Portuguese
🇸🇦 Arabic
🇮🇹 Italian
🇳🇱 Dutch
🇷🇺 Russian
🇹🇭 Thai
🇮🇩 Indonesian
🇹🇷 Turkish
🇵🇱 Polish
🇻🇳 Vietnamese

🇪🇸 Spanish Slack
You said

"oye puedes mover la reunión a las tres de la tarde es que me surgió algo con el cliente"

Shoute output

Oye, ¿puedes mover la reunión a las 3 de la tarde? Me surgió algo con el cliente.

🇮🇳 Tamil Mail
You said

"vanakkam sir report ready aayiduchi naalaikku meeting la discuss pannalaam"

Shoute output

வணக்கம் Sir,

Report தயாராகிவிட்டது. நாளைக்கு meeting-ல் discuss பண்ணலாம்.

நன்றி

🇩🇪 German Notes
You said

"das hauptproblem ist dass wir die auswertung zu früh starten ähm die nutzer haben noch nicht genug daten hochgeladen"

Shoute output

Das Hauptproblem ist, dass wir die Auswertung zu früh starten. Die Nutzer haben noch nicht genug Daten hochgeladen.

🇯🇵 Japanese Slack
You said

"sumimasen kyou no meeting san ji ni henkou dekimasuka chotto kyaku no ken de"

Shoute output

すみません、今日のミーティング3時に変更できますか?ちょっと客の件で。

Most voice-to-text apps bolt on "multilingual support" as an afterthought - you get raw transcription in other languages, but no formatting intelligence. Shoute's context-aware formatting works in every language. Checklist in Reminders, formal in Mail, casual in Slack - regardless of which language you speak.

What "privacy-first" actually looks like

Every app says "privacy-first." Here's what they actually do vs. what we do.

How most voice apps work

Screenshots, no real local option

  • Audio retention policies unclear - some use your voice data to train their models
  • Some apps screenshot your screen every few seconds for "context awareness"
  • Local mode available but output quality is noticeably worse
  • Usage analytics on transcription content
  • "Your data may be used to improve our models" (opted in by default)

How Shoute works

Private by architecture

  • Two modes, your choice: cloud streams audio for speed (nothing stored), local keeps everything on your Mac
  • No screenshots. Ever. We detect context from the frontmost app name only
  • Local mode matches cloud mode in both output and speed - same formatting, same snappy response
  • Cloud audio is streamed for transcription and discarded - never stored, never logged, never used for training
  • We're a two-person indie studio, not a VC-backed data play

Three steps. One shortcut.

No app to switch to. No text to copy-paste. It just appears.

1. Press one shortcut

From anywhere on your Mac. Any app, any text field. No switching required.

⌥ Option + Space

2. Speak naturally

Ramble. Use filler words. Change your mind mid-sentence. Shoute's AI handles all of it.

3. Text appears instantly

Clean, formatted text lands right where your cursor was. In the app you were already using. Done.

How Shoute stacks up - no spin

We respect our competitors. Here's where we think we're different.

App           No Screenshots       Local = Cloud   Smart Format     Multi-Language  Price
Shoute        ✓ Yes                ✓ Yes           Per-app context  100+            $5.83/mo
Wispr Flow    ✗ Takes screenshots  Cloud only      Context-aware    100+            $15/mo
Aqua Voice    Unknown              Cloud only      Prose polish     Multi           $8-10/mo
SuperWhisper  ✓ Yes                Local is worse  Basic            Multi           $249 lifetime
TalkFlowy     ✓ Yes                Local only      Raw transcript   50+             One-time
Sayline       ✓ Yes                Local only      Grammar only     Multi           One-time

Start free. Upgrade when you're hooked.

No credit card required. Free tier gives you 2,000 words/week — enough to see if voice-to-text changes your workflow.

Free
Get started, no card required
$0
Free forever
  • Cloud-powered transcription
  • AI smart formatting
  • Works in every app
  • Audio never stored or used for training
  • 2,000 words / week
  • 1 device

You'll start with Shoute Pro free for 7 days

Download Free

Local
100% offline, pay once
$49.99
One-time purchase, yours forever
  • On-device transcription only
  • Nothing leaves your computer
  • Works fully offline
  • Apple Silicon optimized
  • All future updates
  • 2 devices

Questions you're probably asking

How is the local model actually as good as cloud?
On-device AI models have improved dramatically. Shoute uses optimized models tuned specifically for dictation formatting - not general-purpose LLMs shoved into a small footprint. We benchmark local vs. cloud output weekly and tune until they match. Don't take our word for it - the free tier lets you try both modes and compare.

What do you mean "no screenshots"? Why would a voice app take screenshots?
Some voice-to-text apps capture your screen periodically to understand what you're working on - this is how they provide "context-aware" formatting. Shoute takes a different approach: we detect the frontmost app name (e.g., "Mail" or "Slack") through the macOS Accessibility API. Same formatting intelligence, zero screen capture.

How does context-aware formatting work?
When you trigger Shoute, it checks which app is active. Slack? Output is casual - relaxed greeting, no sign-off. Mail? Proper email structure with greeting and closing. Reminders? Checklist format. Notes? Clean paragraphs. The AI formatting model adjusts its output based on where your text will land.
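For the curious, the app-to-format idea can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not Shoute's actual code: the names `FormatStyle`, `STYLES`, `DEFAULT_STYLE`, `style_for_app`, and `render` are hypothetical, the fallback style for unlisted apps is assumed, and the AI model that cleans up the wording is left out entirely. It shows only the step where the frontmost app name selects a layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatStyle:
    """Hypothetical description of how dictated text should be laid out."""
    tone: str    # "casual", "formal", or "neutral"
    layout: str  # "message", "email", "checklist", or "paragraph"


# Illustrative mapping from frontmost app name to style (assumed for this sketch).
STYLES = {
    "Slack": FormatStyle(tone="casual", layout="message"),
    "Mail": FormatStyle(tone="formal", layout="email"),
    "Reminders": FormatStyle(tone="neutral", layout="checklist"),
    "Notes": FormatStyle(tone="neutral", layout="paragraph"),
}

# Fallback for apps without a dedicated style (an assumption of this sketch).
DEFAULT_STYLE = FormatStyle(tone="neutral", layout="paragraph")


def style_for_app(app_name: str) -> FormatStyle:
    """Pick the formatting style for the app the text will land in."""
    return STYLES.get(app_name, DEFAULT_STYLE)


def render(sentences: list[str], style: FormatStyle) -> str:
    """Lay out already-cleaned sentences; the wording itself is left untouched."""
    if style.layout == "checklist":
        # One item per line, matching the Reminders example.
        return "\n".join(f"• {s}" for s in sentences)
    if style.layout == "email":
        # Greeting, body, and closing are separated by blank lines.
        return "\n\n".join(sentences)
    # Messages and paragraphs are joined into running text.
    return " ".join(sentences)
```

Calling `render(["Pick up dry cleaning", "Get almond milk"], style_for_app("Reminders"))` yields one bullet per item, while the same sentences passed with `style_for_app("Notes")` come back as a single paragraph.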

What languages are supported?
100+ languages in cloud mode (25+ in local mode), and the formatting intelligence works across all of them. You can dictate in Tamil, Spanish, German, Japanese, Hindi, or Arabic and get properly formatted output - not just raw transcription. You can even switch languages mid-conversation.

What happens if you shut down? Will the app stop working?
If you're on the Local plan, the app runs entirely on your device - it will keep working regardless. Cloud features depend on our servers, but we offer the local-only option precisely so you're never locked in. We're Forward Alpha, a small studio building tools we use ourselves every day. This isn't a "launch and pivot" play.

I can just use Apple's built-in Dictation. Why pay?
Apple Dictation times out after 60 seconds, doesn't format anything, can't tell the difference between a Slack message and an email, and outputs one continuous sentence with no punctuation or structure. Try dictating a grocery list - you'll get a single run-on sentence. Shoute gives you a checklist. That's the gap.

Who's behind this?
Forward Alpha - a two-person indie studio. We build tools we want to use ourselves. No VC funding, no investor pressure to harvest your data, no growth-at-all-costs playbook. Just a product we're proud of and use every single day.

Try it free. You'll feel the difference in 10 seconds.

2,000 words/week free. No credit card. No commitment.

Mac

Free · ~17 MB

macOS 13 Ventura or later
Universal (Apple Silicon & Intel)

Download for Mac

Windows

Free · ~110 MB

Windows 10 & 11
64-bit (x64)

Download for Windows