Documentation Index
Fetch the complete documentation index at: https://docs.vowen.ai/llms.txt
Use this file to discover all available pages before exploring further.
AI Enhancement is a light cleanup pass that runs after transcription. It fixes spelling, grammar, and formatting, then gets out of the way.
API key required. AI Enhancement runs on a third-party provider (OpenAI, Anthropic, Google Gemini, Groq, and 6+ others). Connect your own key in Settings > AI Setup. Groq and Google Gemini both offer generous free tiers that cover most daily use; other providers may require a paid plan.
What it does
Cleanup
Spelling and grammar
”i think wee should go tommorow”
↓
“I think we should go tomorrow.”
Punctuation and capitalization
”hey are you free tomorrow”
↓
“Hey, are you free tomorrow?”
Self-corrections
”let’s meet at 3 actually 4”
↓
“Let’s meet at 4.”
Formatting
Spoken numbers to digits
”three thousand dollars”
↓
“3000 dollars”
Phone numbers
”five five five one two three four”
↓
“555-1234”
Spoken formatting commands
”price colon ten dollars”
↓
“Price: 10 dollars”
Structure
List formatting
”buy milk call mom send report”
↓
1. Buy milk
2. Call mom
3. Send report
2. Call mom
3. Send report
Email structure
”hi sarah thanks for the update regards john”
↓
Hi Sarah,
Thanks for the update.
Regards,
John
Thanks for the update.
Regards,
John
Paragraph splitting
”we met today anyway tomorrow we ship”
↓
We met today.
Anyway, tomorrow we ship.
Anyway, tomorrow we ship.
Filler words like “um”, “uh”, and “you know” are stripped by Vowen’s output filter before AI Enhancement runs. This works whether or not you have an AI provider connected. Toggle it in Settings > General > Remove filler words.
How it works
When you record with AI Enhancement enabled, your speech moves through three stages. AI Enhancement is the middle one and is the focus of this page.Step 1
Voice to Text
Your speech is transcribed by your selected local or cloud model. Filler words are stripped by Vowen’s output filter, and any matching Threads are applied.
Step 2
AI Enhancement
The cleaned transcription is sent to your AI provider with a system prompt that tells it to fix errors and apply formatting without rewriting your sentences. Custom instructions, if you’ve set them, are appended.
Step 3
Auto-Paste
The polished output is pasted into whatever app has focus, using your selected text insertion method.
Voice formatting commands
Speak any of these and Vowen will apply the symbol or formatting instead of writing the words.Punctuation
period, full stop
.comma
,question mark
?exclamation point
!colon
:semicolon
;Line breaks
line break, new line
↵new paragraph
↵↵Quotes
quotation mark, quote
”apostrophe
’Symbols
asterisk, star
*ampersand
&percent sign
%ellipsis
…slash, forward slash
/backslash
at, at sign
@hashtag
#Brackets and dashes
open / close parenthesis
( )open / close bracket
[ ]open / close brace
dash, hyphen
-em dash
—Math and special
plus
+minus, negative
-equals
=trademark, tm
™copyright
©degree
°degrees celsius
°Cdegrees fahrenheit
°FPer-shortcut override
Each shortcut has its own AI Enhancement setting:| Setting | Behavior |
|---|---|
| Always on | This shortcut always enhances the transcription |
| Always off | This shortcut never enhances the transcription |
Custom instructions Pro feature
Inside the same settings panel as Enhance transcription with AI, there is a separate Custom Instructions toggle. Turn it on and a text area appears where you can write rules that get appended to every AI Enhancement pass. Click Save to apply them. Examples:- “Always use Oxford commas”
- “Write in active voice”
- “Keep paragraphs to two or three sentences”
- “Output everything in lowercase”
How a custom instruction shapes the output
The rule you set
”Write in passive voice with a formal, report-style tone.”
You say
”planned obsolescence is when companies make products that break on purpose so people keep buying new stuff and it’s really wasteful because everything ends up in landfills and it’s bad for sustainability”
↓
AI Enhanced output
”Planned obsolescence refers to the practice by which products are deliberately designed by manufacturers to fail prematurely. As a result, consumers are compelled to make repeated purchases, and discarded items are accumulated in landfills, undermining broader sustainability efforts.”
More Examples
Example 1: Email structure
You say:“Hi Donna comma new line I’m writing to follow up on our meeting yesterday period I wanted to confirm the next steps period new paragraph Regards comma new line John”You get:
Example 2: Numbered list
You say:“I need three things first buy groceries second call mom and third finish the report”You get:
Example 3: Phone number
You say:“my office number is plus one eight hundred five five five zero one zero one extension two three four”You get:
Example 4: Email address
You say:“send the report to alex at example dot com”You get:
Best practices
- Match your model to your need. For light cleanup, small models (Llama 3.1 8B, Gemini Flash Lite, Groq Llama) are fast and accurate. For complex custom instructions, use a larger model (GPT-4, Claude Sonnet, Gemini Pro).
- Speak naturally. AI Enhancement is built to handle filler words, false starts, and casual phrasing. You do not need to over-articulate.
- Use formatting commands sparingly. Let auto-detection handle normal sentences. Reach for explicit commands only when you need a specific symbol or paragraph break.
- Keep custom instructions short and specific. “Use Oxford commas” works better than “Make my writing better”.
- Set up Per-app Tones for repeated context switches. If you want different behavior in different apps (formal in email, casual in Slack, off in code editors), Per-app Tones is more sustainable than toggling the global setting.
- Choose latency-conscious providers. If AI Enhancement feels slow, switch to Groq or Cerebras. Both serve large models at sub-second latency on the free tier.
Languages
AI Enhancement runs in the same language as the transcription. Most providers handle major world languages well, but smaller models (under 8B parameters) can produce uneven results in low-resource languages. For best non-English results:- Pair a multilingual speech model (Whisper Large v3, Groq Whisper) with a frontier AI model (GPT-4o, Claude Sonnet, Gemini Pro).
- Avoid
.enspeech models for non-English input. - Set your language explicitly in Settings > Language instead of relying on auto-detect.
Running into issues with AI Enhancement? See AI & API Issues for symptom-driven fixes.
Set up an AI provider
Connect Groq, OpenAI, Anthropic, Gemini, or any of 10+ supported providers.