A linter for its own writing


I pointed Claude at every conversation I’d ever had with it — chat history, Claude Code sessions, about 27,000 messages across 19 million characters — and had it analyse its own output for recurring verbal tics.

“Genuinely” appeared 862 times. “You’re right” showed up 700 times (271 of those were “you’re absolutely right”). “Comprehensive” got 663. The word cloud looked like a TEDx talk got LinkedIn pregnant and r/iamverysmart adopted it.

This blog is written with Claude. Every post is a collaboration — I give direction, Claude writes, and I edit (well, I give it a couple of words and it figures out the rest until I swear at it). The problem is that Claude writes like Claude. It hedges, it flatters, it reaches for the same hundred words, and it frames everything like it’s about to change your life. The CLAUDE.md for this site has a deny list that’s longer than most of the posts.

It started with a few rules — no sycophancy, no SEO, don’t waffle (well, even THIS section was its own fucking subheader). Then I kept noticing patterns. Every time Claude did something that made me wince, I added it to the file. It grew.

Banned words: leverage, robust, comprehensive, streamline, seamless, facilitate,
utilize, ecosystem, landscape, paradigm, synergy, unlock, empower, holistic,
delve, deep dive, pivotal, crucial, game changer, non-trivial, elegant,
straightforward, genuinely, under the hood, behind the scenes.

Banned openers: However, Indeed, Notably, Arguably, Honestly, Basically,
Essentially, Fundamentally, Specifically, Typically, Generally, Interestingly,
In practice, In theory, More precisely, To be clear, That said, Looking at,
Based on, For context, For reference.

Banned constructions: "It's not X, it's Y." "No X, no Y — just Z."
"The key is..." "The beauty of..." "Let me break this down."
"Think of it like..." "It turns out that..." "This changes everything."

Anything that sounds like a keynote. Anything that would get upvoted on r/iamverysmart.

Then the discourse analysis layer. I had Claude categorise its own worst habits by their linguistic names — epistemic gatekeeping, pedagogical reframing, presupposition of ignorance, cataphoric teasing, apophasis, litotes, anaphora, antithesis, tricolons, epistrophe, chiasmus. Rhetorical devices that exist to make the conclusion sound profound when it isn’t.

All of this lives in the CLAUDE.md. Claude reads it at the start of every session. It mostly works — until it doesn’t, and then I add another rule.

The linter

The CLAUDE.md works until Claude decides to ignore it. The linter is the backstop: it runs at build time and blocks the deploy.

The linter is a Python script that runs against every markdown file in the blog before Astro builds the site. It parses the prose — strips frontmatter, code blocks, inline code, and markdown image syntax — then checks what’s left against the deny list.
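A minimal sketch of that stripping step, assuming YAML frontmatter fenced by `---` and triple-backtick code fences. The regexes and the name `strip_to_prose` are illustrative guesses, not copied from the real script.

```python
import re

# Hypothetical patterns; the real linter may handle more edge cases.
FRONTMATTER = re.compile(r"\A---\n.*?\n---\n", re.DOTALL)
FENCED_CODE = re.compile(r"^```.*?^```[^\n]*\n?", re.DOTALL | re.MULTILINE)
INLINE_CODE = re.compile(r"`[^`\n]+`")
IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def strip_to_prose(markdown: str) -> str:
    """Drop frontmatter, fenced code, inline code and image syntax."""
    text = FRONTMATTER.sub("", markdown, count=1)
    text = FENCED_CODE.sub("", text)  # fences go first so their backticks don't confuse INLINE_CODE
    text = INLINE_CODE.sub("", text)
    text = IMAGE.sub("", text)
    return text
```

Deleting spans shifts line numbers, so a version that reports positions would need to replace each span with the same number of newlines instead.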

Three levels of detection:

Exact matching. Banned words (whole-word, case-insensitive), banned phrases, banned sentence openers. Text inside double quotes is skipped — if Sony called something their “comprehensive review”, that’s Sony’s problem.
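Roughly what that looks like, as a sketch: quoted spans are blanked out first, then the remaining text is checked for banned words and openers. The lists here are a small subset of the deny list, and `find_exact` is a made-up name. Treating openers as case-sensitive is my assumption.

```python
import re

BANNED_WORDS = ["leverage", "robust", "comprehensive", "genuinely"]  # subset
BANNED_OPENERS = ["However", "Indeed", "That said"]  # subset

# Straight or curly double-quoted spans, removed before matching.
QUOTED = re.compile(r'"[^"\n]*"|\u201c[^\u201d\n]*\u201d')
WORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, BANNED_WORDS)) + r")\b", re.IGNORECASE
)
# An opener counts at the start of the line or right after sentence punctuation.
OPENER_RE = re.compile(
    r"(?:^|(?<=[.!?]\s))(" + "|".join(map(re.escape, BANNED_OPENERS)) + r")\b"
)


def find_exact(line: str) -> list[str]:
    """Return exact-match findings for one line of prose."""
    unquoted = QUOTED.sub("", line)
    hits = [f"banned word: {m.group(1).lower()}" for m in WORD_RE.finditer(unquoted)]
    hits += [f"banned opener: {m.group(1)}" for m in OPENER_RE.finditer(unquoted)]
    return hits
```

The `\b` anchors are what make it whole-word: "robustness" passes, "robust" does not.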

Pattern matching. Regex for structural tics. The “it’s not X, it’s Y” reframe, the “no X, no Y — just Z” payoff (and variants with not/isn’t/wasn’t/doesn’t/can’t/won’t). Exclamation marks in prose.
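A sketch of what those patterns might look like. These regexes are my approximation of the tics described, not the ones in the real script, and the rule names are invented. They accept straight and curly apostrophes, and the dash is written as an escape.

```python
import re

APOS = r"['\u2019]"  # straight or curly apostrophe
NEGATIONS = "|".join(
    ["not"] + [w + APOS + "t" for w in ("isn", "wasn", "doesn", "can", "won")]
)
SUBJECT = r"(?:it|this|that)"

# "It's not X, it's Y" and "It isn't X, it's Y"
REFRAME = re.compile(
    rf"\b{SUBJECT}(?:{APOS}s)?\s+(?:{NEGATIONS})\b[^.!?\n]*?[,;]\s*{SUBJECT}{APOS}s\b",
    re.IGNORECASE,
)
# "No X, no Y <dash> just Z"
NO_NO_JUST = re.compile(
    r"\bno\b[^,.!?\n]+,\s*no\b[^.!?\n]+?(?:\u2014|\u2013|--|-|,)\s*just\b",
    re.IGNORECASE,
)
EXCLAMATION = re.compile(r"!")

PATTERNS = {"reframe": REFRAME, "no-no-just": NO_NO_JUST, "exclamation": EXCLAMATION}


def find_patterns(line: str) -> list[str]:
    """Return the names of the structural patterns found in one line."""
    return [name for name, rx in PATTERNS.items() if rx.search(line)]
```

The exclamation check only makes sense after image syntax has been stripped, since `![alt](src)` starts with one.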

Structural analysis. Standalone single-sentence paragraphs (dramatic effect detection). Anaphora — three or more consecutive lines starting with the same words. Header count — flags anything over three per post.
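The three structural checks could be sketched like this. The function names are hypothetical, "same words" is assumed to mean the first two words of a line, and the sentence counter is naive (an abbreviation like "e.g." would fool it).

```python
import re

MAX_HEADERS = 3
SENTENCE_END = re.compile(r"[.!?](?=\s|$)")


def single_sentence_paragraphs(text: str) -> list[str]:
    """Paragraphs that consist of exactly one sentence (headers excluded)."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        p for p in paragraphs
        if not p.startswith("#") and len(SENTENCE_END.findall(p)) == 1
    ]


def anaphora_runs(text: str, words: int = 2, min_run: int = 3) -> list[tuple[int, str]]:
    """Return (1-based start line, shared opening) for each run of similar lines."""
    hits = []
    run_start, run_key, run_len = 0, None, 0
    for i, line in enumerate(text.splitlines() + [""]):  # empty sentinel flushes the last run
        tokens = line.lower().split()[:words]
        key = " ".join(tokens) if len(tokens) == words else None
        if key is not None and key == run_key:
            run_len += 1
            continue
        if run_key is not None and run_len >= min_run:
            hits.append((run_start + 1, run_key))
        run_start, run_key, run_len = i, key, 1
    return hits


def header_count(text: str) -> int:
    """Count ATX-style markdown headers; flag when the result exceeds MAX_HEADERS."""
    return len(re.findall(r"^#{1,6}\s+\S", text, flags=re.MULTILINE))
```

`header_count` should run on text with code blocks already removed, otherwise a `# comment` inside a fence counts as a header.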

Warnings fail the build. Info flags are advisory — a single-sentence paragraph might be dramatic padding or it might be “It had dead pixels.” Context matters, and regex doesn’t have context.
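One way to wire up that split, as a sketch: every finding carries a severity, and only warnings produce a non-zero exit code. The `Finding` fields and rule names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    rule: str
    severity: str  # "warning" or "info"


def exit_code(findings: list[Finding]) -> int:
    """Return 1 if any finding is a warning, else 0."""
    return 1 if any(f.severity == "warning" for f in findings) else 0
```

The build step would then call `sys.exit(exit_code(findings))` after printing the report, so info flags show up in the output without stopping the deploy.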

Sometimes a banned word is the right word. There’s an inline ignore for that — an HTML comment above the line with the rule and a reason. The reason is required. If you can’t articulate why you’re ignoring a rule, maybe you shouldn’t be.
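The exact comment syntax isn't shown here, so this sketch invents one: `<!-- lint-ignore RULE: reason -->` on the line above. Both the `lint-ignore` keyword and the function name are hypothetical. An ignore with no reason is reported instead of honoured.

```python
import re

# Hypothetical syntax: <!-- lint-ignore banned-word: quoting the vendor -->
IGNORE = re.compile(
    r"<!--\s*lint-ignore\s+(?P<rule>[\w-]+)\s*(?::\s*(?P<reason>.*?))?\s*-->"
)


def ignored_rules(lines: list[str]) -> tuple[dict[int, set[str]], list[int]]:
    """Map 1-based line numbers to the rules ignored on them.

    Also returns the line numbers of ignore comments that give no reason.
    """
    ignores: dict[int, set[str]] = {}
    missing_reason: list[int] = []
    for idx, line in enumerate(lines):
        m = IGNORE.search(line)
        if not m:
            continue
        if not (m.group("reason") or "").strip():
            missing_reason.append(idx + 1)  # reject: the reason is mandatory
            continue
        ignores.setdefault(idx + 2, set()).add(m.group("rule"))  # applies to the next line
    return ignores, missing_reason
```

A finding on line `n` for rule `r` is then dropped only if `r in ignores.get(n, set())`.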

The irony

Claude built the linter. It analysed its own chat history, categorised its own rhetorical habits by their linguistic names, and wrote the regex patterns to catch them (even this sentence is a borderline tricolon, and this is unnecessarily its own section with a clearly AI-generated header).

This was the most difficult post to get right. Every draft Claude produced tripped its own linter — tricolons in the section about tricolons, sycophantic framing in the section about sycophancy. Getting it to stop writing like itself about the thing that stops it writing like itself took more nudging than any other post on this site.

The linter lives in tools/ai_linter.py. The deny list is in CLAUDE.md. If Claude develops new verbal tics, they get added to both.

We kept the em dash. I’ve grown quite fond of it — there’s a comfort in knowing a sentence wasn’t written by a human.