We Scored 100 on Agent-Friendly Docs. Here's Why That's Not Enough.

Yesterday we became the first documentation site to reach a perfect score of 100 on the Agent-Friendly Docs benchmark hosted by Fern.

I’m pleased with this, but not because we scored perfectly the first time we were tested. I’m happy because I told Promptless in Slack how to run the benchmark with npx afdocs@latest --score, and it then did nearly all of the work on its own.

What did Promptless figure out?

static Markdown exports for every page
a comprehensive llms.txt index
content-negotiation middleware so agents can request text/markdown directly
hidden directives on every page pointing to our documentation map
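
To make the content-negotiation item concrete, here is a minimal sketch of the decision such middleware has to make. It is not our actual middleware. The function names are made up for illustration, and it assumes the only two representations on offer are text/markdown and text/html, with HTML as the fallback.

```python
from __future__ import annotations


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type with no q parameter has weight 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal weight keep header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header: str) -> str:
    """Return "text/markdown" if the client ranks it above HTML, else "text/html"."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "text/markdown":
            return "text/markdown"
        if media_type in ("text/html", "application/xhtml+xml"):
            return "text/html"
    # Wildcards, unknown types, and missing headers all fall back to HTML
    return "text/html"
```

An agent that sends Accept: text/markdown gets Markdown, while a browser's usual Accept header still gets HTML. Real middleware would also need to set the Vary: Accept response header so caches keep the two representations apart.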

When Claude Code, Cursor, or any other coding assistant fetches our docs, it gets clean, structured content instead of HTML soup filled with navigation chrome.

A perfect accessibility score is table stakes. Necessary, but nowhere near sufficient. If you’re optimizing for agent accessibility without thinking about what comes next, you might be making things worse.

The accessibility problem is mostly solved

The basic mechanics of making docs agent-readable are well understood now. Serve Markdown or clean HTML, maintain an llms.txt file that serves as a site map for AI agents, strip out JavaScript-heavy components that don’t survive a plain HTTP fetch, and use semantic headings. Those are the checkboxes.
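
For readers who have not seen an llms.txt file, here is a rough sketch of generating one. The layout (a title heading, a one-line summary in a blockquote, then sections of links) follows the llms.txt proposal as I understand it, so check the proposal before relying on it. The function name, page names, and URLs are placeholders.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt-style index.

    sections maps a section heading to a list of
    (page_name, url, description) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, description in pages:
            lines.append(f"- [{name}]({url}): {description}")
        lines.append("")
    # End the file with exactly one newline
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site content, for illustration only
    print(build_llms_txt(
        "Example Docs",
        "Documentation for a hypothetical product.",
        {
            "Guides": [
                ("Quickstart", "https://docs.example.com/quickstart.md",
                 "Install the client and make a first API call"),
            ],
        },
    ))
```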

Most documentation platforms handle this automatically. Mintlify, GitBook, ReadMe, and others have shipped agent-friendly features as defaults. If your docs are on a modern platform, you probably score reasonably well on accessibility metrics without doing anything special.

The harder problem is what happens after the agent successfully reads your docs.

Agents are more credulous than humans

AI agents believe what they read in a way that humans don’t. Credulous to an unreal degree.

When a human developer reads documentation, they bring healthy skepticism: testing code examples before trusting them, noticing when something feels outdated, cross-referencing against the actual API behavior. If the docs say one thing and the code does another, they believe the code.

Agents take your docs at face value. If your authentication guide says to use an API key in a header called X-Auth-Token but you deprecated that six months ago in favor of Bearer tokens, the agent will confidently generate code using the deprecated pattern, notice nothing wrong, and do exactly what your docs told it to do.
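
One cheap guard against this specific failure is to scan your docs for identifiers you know are deprecated. The sketch below is an illustration, not a description of how any particular tool works. The rule table is hypothetical: it assumes you maintain a mapping from each deprecated string to its replacement.

```python
import re


def find_deprecated(doc_text, deprecated):
    """Return (line_number, pattern, replacement) for each deprecated string found.

    deprecated maps a literal string that should no longer appear in the docs
    to the replacement that should be documented instead.
    """
    findings = []
    for line_no, line in enumerate(doc_text.splitlines(), start=1):
        for pattern, replacement in deprecated.items():
            # re.escape makes the pattern a literal match; HTTP header names
            # are case-insensitive, so the search is too
            if re.search(re.escape(pattern), line, flags=re.IGNORECASE):
                findings.append((line_no, pattern, replacement))
    return findings


if __name__ == "__main__":
    # Hypothetical rule table and page content
    rules = {"X-Auth-Token": "Authorization: Bearer <token>"}
    page = "Send your key:\n\n    curl -H 'X-Auth-Token: abc' https://api.example.com/v1/items\n"
    for line_no, pattern, replacement in find_deprecated(page, rules):
        print(f"line {line_no}: replace {pattern} with {replacement}")
```

A check like this only catches patterns you already know are stale. It does not tell you whether the remaining examples still match what the API does.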

This is a feature of how language models work. They’re trained to be helpful, to complete tasks based on the information they’re given. Where humans develop an adversarial instinct from years of being burned by bad documentation, agents arrive at your docs in good faith. They trust you.

Which means they’re easier to mislead.

The blast radius of bad docs just increased

Every inaccuracy in your documentation used to affect one developer at a time. Someone would follow your quickstart, hit an error, maybe file a support ticket or give up and choose a competitor. Painful, but contained.

Now multiply that by every AI coding assistant your potential users have installed. A single outdated code example gets served to Claude, Copilot, Cursor, Windsurf, and hundreds of other coding agents that might recommend your product to their users. Each one ingests your docs, believes them completely, and generates code that fails.

The debugging experience is worse too. When a human reads bad docs, they at least know which page they read and can report the problem. When agent-generated code fails, the developer often has no idea where the agent got its information. They know only that your product behaves differently from what the AI promised.

You’ve made it easier for agents to access your docs. You’ve also made it easier for your docs’ inaccuracies to propagate at scale.

Accuracy is the new competitive advantage

If your docs are going to be read primarily by AI agents, accuracy and clarity matter more than they ever did for human readers.

This isn’t just about catching typos. It’s about:

Stating things exactly. Don’t say “you might need to configure authentication” when you mean “you must configure authentication before making any API calls.” Agents don’t handle ambiguity well. They’ll pick whichever interpretation seems more plausible, which may not be correct.

Keeping examples current. Code examples are the highest-signal content for agents. When an LLM needs to generate code using your API, it will lean heavily on your examples. If those examples use deprecated methods, syntax, or patterns, the agent will reproduce them.

Versioning carefully. If your product has multiple versions with different behaviors, make sure the docs indicate which version they apply to. An agent that retrieves your v2 docs and generates code for a v3 project will create a confusing mess for the developer who has to debug it.

Documenting failure modes. Agents struggle with unwritten knowledge. The constraints and edge cases that experienced users know intuitively, like being unable to do X when Y is already configured, need to be explicit. If they’re not in the docs, the agent won’t know about them.

Writing directly. Dense, jargon-heavy prose that a motivated human can puzzle through will trip up an agent. Use short, declarative sentences with one concept per paragraph and direct statements in place of clever phrasings.
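
Some of these habits can be checked mechanically. As one example, the sketch below tests the versioning point: it flags pages that never say which version they apply to. The applies_to: line is a made-up convention for this illustration, not a standard, so substitute whatever marker your docs platform uses.

```python
import re

# Hypothetical convention: each page carries a line such as "applies_to: v3"
VERSION_LINE = re.compile(r"^applies_to:\s*(v\d+(?:\.\d+)*)\s*$", re.MULTILINE)


def pages_missing_version(pages):
    """Return the sorted paths of pages that have no applies_to line.

    pages maps a page path to its Markdown text.
    """
    return sorted(
        path for path, text in pages.items() if not VERSION_LINE.search(text)
    )


def pages_for_version(pages, version):
    """Return the sorted paths of pages labeled with exactly this version."""
    matching = []
    for path, text in pages.items():
        match = VERSION_LINE.search(text)
        if match and match.group(1) == version:
            matching.append(path)
    return sorted(matching)


if __name__ == "__main__":
    # Hypothetical pages, for illustration only
    pages = {
        "auth.md": "applies_to: v3\n\n# Authentication\n",
        "legacy.md": "# Old guide\n",
    }
    print("unlabeled:", pages_missing_version(pages))
    print("v3 pages:", pages_for_version(pages, "v3"))
```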

Why we built Promptless

When your docs are primarily consumed by AI agents, you need a system that catches every inaccuracy before it propagates. You need docs that update automatically when your product changes. You need verification that what you’ve written actually matches what your API does.

Scoring 100 on agent accessibility was a milestone. Making sure every page deserves that accessibility score is the real work.

We started Promptless because documentation accuracy is a much harder and much more important problem than accessibility, and it deserves good tools.