How we calibrate the AI to your voice (and not ChatGPT’s).
AI-generated copy has a sound.
You've heard it. Everyone has. It's smooth, earnest, slightly over-structured, always positive, prone to starting sentences with "In the fast-paced world of..." and ending them with "...for years to come." It sounds like it was written to please a panel of focus-group participants who were paid to nod.
That sound — let's call it The Voice — is what ruins AI tools for anyone who actually has a brand.
Your brand does not sound like The Voice. Your brand sounds like you. The question every serious content tool has to answer is: how do we get the AI to sound like you, not like itself?
We've iterated on this a lot. Here's what we've learned.
Style guides don't work
The first thing every agency tries is writing a detailed style guide and stuffing it into the system prompt.
"Our voice is confident but not arrogant. We use short sentences but not terse ones. We are witty without being sarcastic. We do not use exclamation marks. We prefer Oxford commas. Our tone is..."
This is well-meaning. It does not work. The model reads a 200-word style guide as a 200-word style guide, which is to say it doesn't really internalize any of it. Output still sounds like The Voice with a light accent.
The effect size of a great style guide, in our testing, is about a 15% reduction in how much the output feels AI-generated. You need 90%+.
Few-shot is 10x better than rules
The thing that actually moves the needle is showing, not telling. Give the model real examples of the voice and ask it to match.
Here's our exact approach inside Veluxa's AI Compose:
- On every generation, we pull the last 5 published posts from the workspace.
- We inject them into the system prompt as explicit "here is the brand voice" examples.
- We add a short natural-language descriptor ("Confident. Editorial. Avoid hype. Avoid exclamation marks.") as a lightweight steer.
- We instruct the model: write new content on this topic in exactly that voice, without repeating the themes.
That's it. No fine-tuning. No embeddings. No RAG. Just examples in the prompt.
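To make that concrete, here's a minimal sketch of the assembly step in Python. The function, its wording, and the placeholder posts are illustrative, not Veluxa's actual internals.

```python
# Minimal sketch of few-shot voice prompting. The instruction wording is
# illustrative, not Veluxa's actual prompt.

def build_voice_prompt(posts: list[str], descriptor: str, topic: str) -> list[dict]:
    """Show the brand voice as real examples, then ask for new copy in that voice."""
    examples = "\n\n".join(
        f"--- BRAND VOICE EXAMPLE {i + 1} ---\n{post}"
        for i, post in enumerate(posts[:5])  # the last 5 published posts
    )
    system = (
        "You are writing for a specific brand. The following published posts "
        "define the brand voice:\n\n"
        f"{examples}\n\n"
        f"Voice notes: {descriptor}\n\n"
        "Write new content on the requested topic in exactly that voice, "
        "without repeating the themes of the examples."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Write a post about: {topic}"},
    ]

messages = build_voice_prompt(
    posts=["<post 1>", "<post 2>", "<post 3>", "<post 4>", "<post 5>"],
    descriptor="Confident. Editorial. Avoid hype. Avoid exclamation marks.",
    topic="why onboarding emails are too long",
)
```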
The results are genuinely striking. Output from this pipeline passes the blind-voice test with our customers about 78% of the time, meaning their readers can't distinguish it from human-written copy at better than chance. Output without the few-shot examples passes around 12% of the time.
Why few-shot works when style guides don't
The underlying reason is that large language models are pattern-matchers over token sequences. They do not learn abstract rules well. They learn concrete examples extremely well.
When you write "our tone is witty but not sarcastic," the model has to translate that abstraction into token probabilities. It doesn't translate cleanly.
When you paste five examples of your actual witty-but-not-sarcastic writing, the model can condition directly on the token-level patterns of your voice. Not an abstraction of it, but the real thing.
The upper bound: a voice profile
On our Growth tier and above, we go one step further. Customers can upload 5 to 20 sample posts and we use those to build a persistent voice profile.
Specifically:
- We extract per-sample features: sentence length distribution, punctuation use, Flesch-Kincaid reading level, passive-voice ratio, emoji use, hashtag density, em-dash density.
- We compute a centroid across the samples.
- On every generation, we inject the raw samples and the feature summary as structured context ("average sentence: 14 words; 1 em-dash per 200 words; 0 emoji; reading level grade 11").
- We do a post-generation critique pass where the model scores its own output against the feature summary and regenerates if it's off. (Both steps are sketched below.)
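Here's a simplified Python sketch of the extraction and centroid steps. Real passive-voice detection and Flesch-Kincaid scoring need proper NLP tooling (a parser, a syllable counter), so this version sticks to the features that are cheap to compute; treat it as an outline, not our production extractor.

```python
import re
from statistics import mean

# Simplified feature extraction. Passive-voice ratio and Flesch-Kincaid are
# omitted because they need real NLP tooling; these are the cheap features.

def extract_features(text: str) -> dict[str, float]:
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    words = text.split()
    n_words = max(len(words), 1)
    return {
        "avg_sentence_words": mean(len(s.split()) for s in sentences) if sentences else 0.0,
        "exclamations_per_100w": 100 * text.count("!") / n_words,
        "em_dashes_per_100w": 100 * text.count("\u2014") / n_words,
        "hashtags_per_100w": 100 * sum(w.startswith("#") for w in words) / n_words,
        "emoji_per_100w": 100 * len(re.findall(r"[\U0001F300-\U0001FAFF]", text)) / n_words,
    }

def centroid(samples: list[str]) -> dict[str, float]:
    """Average each feature across the uploaded samples: the voice profile."""
    feats = [extract_features(s) for s in samples]
    return {key: mean(f[key] for f in feats) for key in feats[0]}
```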
This is the closest thing we have to fine-tuning without the cost and latency of actual fine-tuning. And it scales to every customer.
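The pipeline described above has the model score its own output against the feature summary. A fully self-contained sketch is easier with a deterministic check, so the version below (reusing the helpers from the earlier sketches) compares the draft's features against the profile numerically and regenerates on a miss. The tolerances and the `generate` callable are assumptions standing in for whatever LLM call and thresholds you use.

```python
from typing import Callable

def within_profile(draft: str, profile: dict[str, float], tolerance: float = 0.25) -> bool:
    """Accept the draft only if every feature is close to the profile value."""
    draft_feats = extract_features(draft)
    for key, target in profile.items():
        allowed = max(abs(target) * tolerance, 0.5)  # 25% relative, 0.5 absolute floor
        if abs(draft_feats[key] - target) > allowed:
            return False
    return True

def compose(topic: str, samples: list[str], descriptor: str,
            generate: Callable[[list[dict]], str], max_attempts: int = 3) -> str:
    profile = centroid(samples)
    for _ in range(max_attempts):
        draft = generate(build_voice_prompt(samples, descriptor, topic))
        if within_profile(draft, profile):
            return draft
    return draft  # fall back to the last attempt rather than failing hard
```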
Output from this pipeline passes blind-voice tests about 89% of the time. That last 11% is, honestly, often better than the real brand's output — a surprising number of customers have started using Veluxa's suggestions as the canonical voice and asking their human writers to match it.
Three rules of thumb for anyone building with LLMs
- Examples beat rules. If you find yourself writing "always do X," replace it with three examples of X being done (see the sketch after this list).
- Five examples is the floor. Fewer and the model doesn't have enough signal.
- Twenty examples is the ceiling. More and the context starts to dilute.
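To make the first rule concrete, here's the same constraint expressed both ways. The wording and the sample openers are invented for illustration.

```python
# Rule form: an abstraction the model has to translate into token probabilities.
rule_prompt = "Always write short, punchy openers."

# Example form: concrete patterns the model can match directly.
example_prompt = (
    "Open the way these posts open:\n"
    "1. 'Stop A/B testing your subject lines.'\n"
    "2. 'Your onboarding email is too long.'\n"
    "3. 'Nobody reads the second paragraph.'\n"
)
```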
If you want to try the Compose pipeline against your own voice, Veluxa's 14-day trial includes full voice calibration on every plan above Starter. Start there.