Overview

/llms.txt is a proposed standard from llmstxt.org for giving LLM agents a curated, human-readable index of a site. Treat it as the agent-facing counterpart to sitemap.xml. Ship one on any site you expect agents to browse.

What it is

/llms.txt is a single markdown file served at the site root. It opens with the site title and a blockquote summary, then groups page links under H2 category headings. Each link line has the format - [Title](url): One-line summary. The file is hand-curated or generated from a content index; it is not a crawl.

Where it lives

Serve it at https://<your-domain>/llms.txt with Content-Type: text/plain or text/markdown. Static hosts (GitHub Pages, Vercel, Netlify, Cloudflare Pages) serve the file directly when placed at the build root. Use Quartz’s static/ directory if the SSG does not pick it up from content/ automatically.
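If the host serves the file as application/octet-stream or text/html by default, override the header explicitly. As one sketch, on Netlify a _headers file in the publish directory can pin the Content-Type (the charset value here is an assumption; adjust to taste):

```
/llms.txt
  Content-Type: text/markdown; charset=utf-8
```

Other hosts have equivalents (Cloudflare Pages also reads a _headers file; for nginx or Apache, set the MIME type for .txt/.md in the server config).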

Format spec, with example

# Site Name

> One- or two-paragraph blockquote describing what this site is and who it is for.

## Category

- [Page Title](https://example.com/category/slug): One-sentence summary.
- [Another Page](https://example.com/category/other): One-sentence summary.

## Another Category

- [Page](https://example.com/other/page): Summary.

Rules that matter:

  • The H1 is the site name, not the file name.
  • The blockquote after the H1 is the site summary. Keep it tight; this is what an agent reads to decide whether to keep going.
  • Use H2 for category groupings. Avoid H3 unless the site is large enough to need sub-sections.
  • Each link line is one bullet, one link, one summary. Do not stack multiple links per bullet.
  • Use absolute URLs. Relative paths confuse agents that fetch the file in isolation.

Difference from robots.txt and sitemap.xml

Different audiences, different jobs.

  • robots.txt tells crawlers what they may not fetch. It is a policy file.
  • sitemap.xml tells crawlers what exists. It is a machine index with no descriptions.
  • llms.txt tells LLM agents what is worth reading and why. It is a curated, descriptive index.

Ship all three; each does a job the others cannot, and none substitutes for another.

When to also include /llms-full.txt

/llms-full.txt is an optional companion file: the same index, but with the full markdown body of every page inlined. Ship it when the site is small enough that the entire corpus fits in a single LLM context window (rule of thumb: under ~500K tokens of content). For larger sites, omit it; agents should fetch pages individually.
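To decide whether the corpus fits, a rough estimate is enough. The sketch below uses the common ~4-characters-per-token heuristic for English prose; this is an approximation, not a real tokenizer count, and the 500K budget mirrors the rule of thumb above:

```python
def estimate_tokens(markdown_bodies):
    """Rough token estimate: ~4 characters per token for English prose.

    A heuristic, not a tokenizer count; good enough for a ship/omit decision.
    """
    return sum(len(body) for body in markdown_bodies) // 4


def fits_in_one_context(markdown_bodies, budget=500_000):
    """True if the whole corpus stays under the context-window budget."""
    return estimate_tokens(markdown_bodies) < budget
```

Run the estimate over the same page bodies the build already has; if it says omit, skip /llms-full.txt and let agents fetch pages individually.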

How to auto-generate from a content index

If pages already have YAML frontmatter with title, summary, and category, generate llms.txt at build time. Pseudocode:

from collections import defaultdict
from pathlib import Path

groups = defaultdict(list)
for page in all_pages:
    # Skip drafts and deprecated pages.
    if page.frontmatter.get("status") in ("draft", "deprecated"):
        continue
    groups[page.frontmatter["category"]].append(page)

lines = [f"# {site_title}", "", f"> {site_summary}", ""]
for category, pages in sorted(groups.items()):
    lines.append(f"## {category.title()}")
    for p in sorted(pages, key=lambda p: p.frontmatter["title"]):
        url = f"https://{base_url}/{p.path}"
        lines.append(f"- [{p.frontmatter['title']}]({url}): {p.frontmatter['summary']}")
    lines.append("")

Path("static/llms.txt").write_text("\n".join(lines))

The generator should skip drafts and deprecated pages, sort categories alphabetically, and sort pages within each category by title.
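As a self-contained illustration of what that produces, here is the same loop run over hypothetical page objects (the SimpleNamespace shape, field names, and example.com URLs are stand-ins for whatever your SSG's page model provides):

```python
from collections import defaultdict
from types import SimpleNamespace

# Hypothetical pages; frontmatter keys mirror the generator above.
all_pages = [
    SimpleNamespace(path="guides/deploy", frontmatter={
        "title": "Deploy", "summary": "Ship the site.", "category": "guides"}),
    SimpleNamespace(path="guides/setup", frontmatter={
        "title": "Setup", "summary": "Install and configure.", "category": "guides"}),
    SimpleNamespace(path="old/page", frontmatter={
        "title": "Old", "summary": "Superseded.", "category": "misc",
        "status": "draft"}),  # skipped: draft
]

groups = defaultdict(list)
for page in all_pages:
    if page.frontmatter.get("status") in ("draft", "deprecated"):
        continue
    groups[page.frontmatter["category"]].append(page)

lines = ["# Example Site", "", "> A demo site.", ""]
for category, pages in sorted(groups.items()):
    lines.append(f"## {category.title()}")
    for p in sorted(pages, key=lambda p: p.frontmatter["title"]):
        lines.append(f"- [{p.frontmatter['title']}](https://example.com/{p.path}): "
                     f"{p.frontmatter['summary']}")
    lines.append("")

print("\n".join(lines))
```

The draft page is dropped, the one surviving category becomes an H2, and each page collapses to a single bullet in title order.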

Common mistakes

  • Listing the same page twice across categories. Pick one home for each page.
  • Letting summaries balloon past one sentence. The summary line is a routing hint, not the page itself.
  • Forgetting to regenerate the file on content changes. Add it to the build, not a manual step.
  • Linking to relative paths. Agents fetch the file in isolation; relative URLs break.
  • Mixing HTML anchor tags and raw markdown links. Use raw markdown links only; llms.txt is a markdown file, and HTML anchors are not part of the format.

Validation

Lint the file at build time. The minimum check:

  • File exists at the build root.
  • File starts with an H1: the first line begins with "# ".
  • Every link line matches ^- \[.+\]\(https?://.+\): .+$.
  • No duplicate URLs.
  • Every URL returns 200 in a post-deploy smoke test.
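The build-time checks above (H1 present, link-line format, no duplicate URLs) can be sketched as a small linter. The regex is a slightly tightened variant of the one above, with a capture group for the URL; function and error-message names are illustrative:

```python
import re

# Non-greedy variant of the link-line pattern, capturing the URL.
LINK_RE = re.compile(r"^- \[.+?\]\((https?://\S+?)\): .+$")


def lint_llms_txt(text):
    """Return a list of lint errors for an llms.txt body (empty list = clean)."""
    errors = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# "):
        errors.append("file must start with an H1 line ('# ')")
    urls = []
    for i, line in enumerate(lines, 1):
        if not line.startswith("- "):
            continue  # only bullet lines are link lines
        m = LINK_RE.match(line)
        if m:
            urls.append(m.group(1))
        else:
            errors.append(f"line {i}: malformed link line")
    seen, dupes = set(), set()
    for u in urls:
        (dupes if u in seen else seen).add(u)
    for u in sorted(dupes):
        errors.append(f"duplicate URL: {u}")
    return errors
```

Fail the build if the returned list is non-empty; the post-deploy 200 check still needs a separate smoke test, since it requires live HTTP requests.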

For a richer check, parse the file with a markdown library and walk the link tree.