Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.vowen.ai/llms.txt

Use this file to discover all available pages before exploring further.

AI Enhancement is a light cleanup pass that runs after transcription. It fixes spelling, grammar, and formatting, then gets out of the way.
API key required. AI Enhancement runs on a third-party provider (OpenAI, Anthropic, Google Gemini, Groq, and 6+ others). Connect your own key in Settings > AI Setup. Groq and Google Gemini both offer generous free tiers that cover most daily use; other providers may require a paid plan.

What it does

Cleanup

Spelling and grammar
”i think wee should go tommorow
“I think we should go tomorrow.”
Punctuation and capitalization
hey are you free tomorrow”
Hey, are you free tomorrow?
Self-corrections
”let’s meet at 3 actually 4”
“Let’s meet at 4.”

Formatting

Spoken numbers to digits
three thousand dollars”
3000 dollars”
Phone numbers
five five five one two three four
555-1234
Spoken formatting commands
”price colon ten dollars”
“Price: 10 dollars”

Structure

List formatting
”buy milk call mom send report”
1. Buy milk
2. Call mom
3. Send report
Email structure
”hi sarah thanks for the update regards john”
Hi Sarah,
Thanks for the update.
Regards,
John
Paragraph splitting
”we met today anyway tomorrow we ship”
We met today.

Anyway, tomorrow we ship.
Filler words like “um”, “uh”, and “you know” are stripped by Vowen’s output filter before AI Enhancement runs. This works whether or not you have an AI provider connected. Toggle it in Settings > General > Remove filler words.

How it works

When you record with AI Enhancement enabled, your speech moves through three stages. AI Enhancement is the middle one and is the focus of this page.
Step 1
Voice to Text
Your speech is transcribed by your selected local or cloud model. Filler words are stripped by Vowen’s output filter, and any matching Threads are applied.
Step 2
AI Enhancement
The cleaned transcription is sent to your AI provider with a system prompt that tells it to fix errors and apply formatting without rewriting your sentences. Custom instructions, if you’ve set them, are appended.
Step 3
Auto-Paste
The polished output is pasted into whatever app has focus, using your selected text insertion method.

Voice formatting commands

Speak any of these and Vowen will apply the symbol or formatting instead of writing the words.
Punctuation
period, full stop.
comma,
question mark?
exclamation point!
colon:
semicolon;
Line breaks
line break, new line
new paragraph↵↵
Quotes
quotation mark, quote
apostrophe
Symbols
asterisk, star*
ampersand&
percent sign%
ellipsis
slash, forward slash/
backslash
at, at sign@
hashtag#
Brackets and dashes
open / close parenthesis( )
open / close bracket[ ]
open / close brace
dash, hyphen-
em dash
Math and special
plus+
minus, negative-
equals=
trademark, tm
copyright©
degree°
degrees celsius°C
degrees fahrenheit°F

Per-shortcut override

Each shortcut has its own AI Enhancement setting:
SettingBehavior
Always onThis shortcut always enhances the transcription
Always offThis shortcut never enhances the transcription
Configure this by clicking the gear icon next to any shortcut in Settings > Shortcuts. You can also create a dedicated shortcut just for AI Enhancement: set its enhancement to Always on while leaving your main shortcut on Always off. Custom shortcuts (any key combination, plus mouse buttons) are supported alongside the defaults. This is the cleanest way to keep one shortcut for raw transcription (code, AI prompts) and another for polished output (writing). See Shortcut Patterns for example setups.

Custom instructions Pro feature

Inside the same settings panel as Enhance transcription with AI, there is a separate Custom Instructions toggle. Turn it on and a text area appears where you can write rules that get appended to every AI Enhancement pass. Click Save to apply them. Examples:
  • “Always use Oxford commas”
  • “Write in active voice”
  • “Keep paragraphs to two or three sentences”
  • “Output everything in lowercase”
How a custom instruction shapes the output
The rule you set
”Write in passive voice with a formal, report-style tone.”
You say
”planned obsolescence is when companies make products that break on purpose so people keep buying new stuff and it’s really wasteful because everything ends up in landfills and it’s bad for sustainability”
AI Enhanced output
”Planned obsolescence refers to the practice by which products are deliberately designed by manufacturers to fail prematurely. As a result, consumers are compelled to make repeated purchases, and discarded items are accumulated in landfills, undermining broader sustainability efforts.”
See the full guide for recommended phrasings.

More Examples

Example 1: Email structure

You say:
“Hi Donna comma new line I’m writing to follow up on our meeting yesterday period I wanted to confirm the next steps period new paragraph Regards comma new line John”
You get:
Hi Donna,

I'm writing to follow up on our meeting yesterday. I wanted to confirm the next steps.

Regards,
John

Example 2: Numbered list

You say:
“I need three things first buy groceries second call mom and third finish the report”
You get:
I need 3 things:
1. Buy groceries
2. Call mom
3. Finish the report

Example 3: Phone number

You say:
“my office number is plus one eight hundred five five five zero one zero one extension two three four”
You get:
My office number is +1-800-555-0101 ext. 234

Example 4: Email address

You say:
“send the report to alex at example dot com”
You get:
Send the report to alex@example.com

Best practices

  1. Match your model to your need. For light cleanup, small models (Llama 3.1 8B, Gemini Flash Lite, Groq Llama) are fast and accurate. For complex custom instructions, use a larger model (GPT-4, Claude Sonnet, Gemini Pro).
  2. Speak naturally. AI Enhancement is built to handle filler words, false starts, and casual phrasing. You do not need to over-articulate.
  3. Use formatting commands sparingly. Let auto-detection handle normal sentences. Reach for explicit commands only when you need a specific symbol or paragraph break.
  4. Keep custom instructions short and specific. “Use Oxford commas” works better than “Make my writing better”.
  5. Set up Per-app Tones for repeated context switches. If you want different behavior in different apps (formal in email, casual in Slack, off in code editors), Per-app Tones is more sustainable than toggling the global setting.
  6. Choose latency-conscious providers. If AI Enhancement feels slow, switch to Groq or Cerebras. Both serve large models at sub-second latency on the free tier.

Languages

AI Enhancement runs in the same language as the transcription. Most providers handle major world languages well, but smaller models (under 8B parameters) can produce uneven results in low-resource languages. For best non-English results:
  • Pair a multilingual speech model (Whisper Large v3, Groq Whisper) with a frontier AI model (GPT-4o, Claude Sonnet, Gemini Pro).
  • Avoid .en speech models for non-English input.
  • Set your language explicitly in Settings > Language instead of relying on auto-detect.
See Languages for the full list of supported speech languages.
Running into issues with AI Enhancement? See AI & API Issues for symptom-driven fixes.

Set up an AI provider

Connect Groq, OpenAI, Anthropic, Gemini, or any of 10+ supported providers.