Overview
/llms.txt is a proposed standard from llmstxt.org for giving LLM agents a curated, human-readable index of a site. Treat it as the agent-facing counterpart to sitemap.xml. Ship one on any site you expect agents to browse.
What it is
/llms.txt is a single markdown file served at the site root. It opens with the site title and a blockquote summary, then groups page links under H2 category headings. Each link line has the format - [Title](url): One-line summary. The file is hand-curated or generated from a content index; it is not a crawl.
Where it lives
Serve it at https://<your-domain>/llms.txt with Content-Type: text/plain or text/markdown. Static hosts (GitHub Pages, Vercel, Netlify, Cloudflare Pages) serve the file directly when placed at the build root. If you build with Quartz, put the file in the static/ directory when the SSG does not pick it up from content/ automatically.
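As one concrete example of pinning the Content-Type, assuming a Netlify deployment, a _headers file at the build root can set it explicitly (the path and header syntax below follow Netlify's custom-headers format; other hosts have their own mechanisms):

```text
# _headers — Netlify custom-headers file at the build root (assumed host)
/llms.txt
  Content-Type: text/markdown; charset=utf-8
```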
Format spec, with example
# Site Name
> One- or two-paragraph blockquote describing what this site is and who it is for.
## Category
- [Page Title](https://example.com/category/slug): One-sentence summary.
- [Another Page](https://example.com/category/other): One-sentence summary.
## Another Category
- [Page](https://example.com/other/page): Summary.
Rules that matter:
- The H1 is the site name, not the file name.
- The blockquote after the H1 is the site summary. Keep it tight; this is what an agent reads to decide whether to keep going.
- Use H2 for category groupings. Avoid H3 unless the site is large enough to need sub-sections.
- Each link line is one bullet, one link, one summary. Do not stack multiple links per bullet.
- Use absolute URLs. Relative paths confuse agents that fetch the file in isolation.
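The link-line rules above can be enforced mechanically at build time. A minimal sketch (the helper name format_link_line is illustrative, not from the spec):

```python
from urllib.parse import urlparse

def format_link_line(title: str, url: str, summary: str) -> str:
    """Build one llms.txt bullet: one link, absolute URL, one-sentence summary."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"URL must be absolute: {url!r}")
    # Normalize the summary to a single terminated sentence.
    summary = summary.strip().rstrip(".") + "."
    return f"- [{title}]({url}): {summary}"
```

Routing every link line through one helper like this makes the "one bullet, one link, one summary" rule impossible to violate by accident.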
Difference from robots.txt and sitemap.xml
Different audiences, different jobs.
- robots.txt tells crawlers what they may not fetch. It is a policy file.
- sitemap.xml tells crawlers what exists. It is a machine index with no descriptions.
- llms.txt tells LLM agents what is worth reading and why. It is a curated, descriptive index.
Ship all three. They do not overlap.
When to also include /llms-full.txt
/llms-full.txt is an optional companion file: the same index, but with the full markdown body of every page inlined. Ship it when the site is small enough that the entire corpus fits in a single LLM context window (rule of thumb: under ~500K tokens of content). For larger sites, omit it; agents should fetch pages individually.
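A rough way to apply that rule of thumb at build time, assuming the common ~4-characters-per-token heuristic (both the 500K cutoff and the heuristic are approximations, not spec):

```python
TOKEN_BUDGET = 500_000   # rule-of-thumb ceiling from above
CHARS_PER_TOKEN = 4      # rough heuristic; real tokenizers vary by model

def should_ship_llms_full(page_bodies: list[str]) -> bool:
    """Estimate corpus size in tokens and decide whether llms-full.txt fits."""
    estimated_tokens = sum(len(body) for body in page_bodies) // CHARS_PER_TOKEN
    return estimated_tokens < TOKEN_BUDGET
```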
How to auto-generate from a content index
If pages already have YAML frontmatter with title, summary, and category, generate llms.txt at build time. Pseudocode:
from collections import defaultdict

groups = defaultdict(list)
for page in all_pages:
    if page.frontmatter.get("status") == "draft":
        continue  # drafts never ship
    groups[page.frontmatter["category"]].append(page)

lines = [f"# {site_title}", "", f"> {site_summary}", ""]
for category, pages in sorted(groups.items()):
    lines.append(f"## {category.title()}")
    for p in sorted(pages, key=lambda p: p.frontmatter["title"]):
        url = f"https://{base_url}/{p.path}"
        lines.append(f"- [{p.frontmatter['title']}]({url}): {p.frontmatter['summary']}")
    lines.append("")  # blank line between categories

write(lines, "static/llms.txt")

The generator should skip drafts and deprecated pages, sort categories alphabetically, and sort pages within each category by title.
Common mistakes
- Listing the same page twice across categories. Pick one home for each page.
- Letting summaries balloon past one sentence. The summary line is a routing hint, not the page itself.
- Forgetting to regenerate the file on content changes. Add it to the build, not a manual step.
- Linking to relative paths. Agents fetch the file in isolation; relative URLs break.
- Mixing HTML anchor tags and raw markdown links. Pick one form per file. Raw markdown is the standard.
Validation
Lint the file at build time. The minimum check:
- File exists at the build root.
- File starts with #.
- Every link line matches ^- \[.+\]\(https?://.+\): .+$.
- No duplicate URLs.
- Every URL returns 200 in a post-deploy smoke test.
For a richer check, parse the file with a markdown library and walk the link tree.
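A minimal version of those checks, assuming the file content is already loaded into a string (the function name lint_llms_txt is illustrative):

```python
import re

LINK_RE = re.compile(r"^- \[.+\]\(https?://.+\): .+$")

def lint_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    errors = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# "):
        errors.append("file must start with an H1 site name")
    seen_urls = set()
    for i, line in enumerate(lines, start=1):
        if not line.startswith("- "):
            continue
        if not LINK_RE.match(line):
            errors.append(f"line {i}: malformed link line")
            continue
        # Extract the URL between "](" and the next ")".
        start = line.index("](") + 2
        url = line[start : line.index(")", start)]
        if url in seen_urls:
            errors.append(f"line {i}: duplicate URL {url}")
        seen_urls.add(url)
    return errors
```

The post-deploy 200 check needs network access, so it is left out of this sketch; run it as a separate smoke-test step after deploy.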