## Overview
/llms.txt is a single markdown file at the site root that gives LLM agents a curated, descriptive index of the site. This guide ships a working /llms.txt end to end: decide the structure, write the summaries, place the file at the build root, link it from robots.txt and ai.txt, and verify with curl. The standard itself is documented in llms-txt.
## Prerequisites
- A site with stable URLs. The file lists URLs; they should not move.
- Pages with frontmatter that includes `title`, `summary`, and `category`. The summary becomes the per-link description (frontmatter sketch below).
- A build pipeline that can emit a file at the site root. Quartz copies any markdown at `content/` to the build root; Astro and Next put it under `public/`.
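For concreteness, a sketch of that frontmatter. The `slug` and `status` fields are also read by the step 3 generator; the values here are illustrative:

```yaml
---
title: Page Title
summary: One-sentence summary that becomes the per-link description.
category: guides
slug: page-title
status: published
---
```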
## Steps

### 1. Decide the structure
/llms.txt is one H1 for the site title, one blockquote for the site summary, then H2 sections per category with one bullet per page. Skip H3 unless the site has more than a few hundred pages.
```markdown
# Site Name

> One- or two-sentence blockquote describing what the site is and who it is for.

## Category

- [Page Title](https://example.com/category/slug): One-sentence summary.
- [Another Page](https://example.com/category/other): One-sentence summary.

## Another Category

- [Page](https://example.com/other/page): Summary.
```
Use absolute URLs. Relative paths break when an agent fetches the file in isolation.
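One way to guarantee absolute URLs is the `URL` constructor rather than string concatenation. A sketch, with names matching the step 3 generator:

```js
const BASE = "https://example.com"

// Resolve a category/slug pair against the site origin.
const href = (cat, slug) => new URL(`${cat}/${slug}`, BASE).href

href("guides", "llms-txt")
// → "https://example.com/guides/llms-txt"
```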
### 2. Write the site summary
The blockquote after the H1 is the routing hint for agents. Keep it to two sentences. Name what the site is, who it serves, and what shape the content takes.
```markdown
> Opinionated reference vault of best practices for dev, writing, and AI agent work. Pages are atomic, YAML-fronted, and stable enough to cite in agent prompts.
```
Do not market. Do not hedge. State what the site is.
### 3. Generate from frontmatter at build time
Hand-maintaining the file invites drift. Generate it from the same frontmatter the rest of the site uses. The skeleton, in Node:
```js
// scripts/build-llms-txt.mjs
import fs from "node:fs/promises"
import { walkContent, readFrontmatter } from "./util.mjs"

const SITE = { title: "Site Name", summary: "One sentence." }
const BASE = "https://example.com"

// Group published pages by category, skipping drafts.
const pages = await walkContent("content")
const byCat = new Map()
for (const p of pages) {
  const fm = await readFrontmatter(p)
  if (fm.status === "draft") continue
  const cat = fm.category ?? "uncategorized"
  if (!byCat.has(cat)) byCat.set(cat, [])
  byCat.get(cat).push({ title: fm.title, slug: fm.slug, summary: fm.summary })
}

// Emit the H1 and blockquote, then one H2 per category
// with alphabetically sorted bullets.
const out = [`# ${SITE.title}`, "", `> ${SITE.summary}`, ""]
for (const [cat, items] of [...byCat.entries()].sort(([a], [b]) => a.localeCompare(b))) {
  out.push(`## ${cat[0].toUpperCase() + cat.slice(1)}`, "")
  for (const p of items.sort((a, b) => a.title.localeCompare(b.title))) {
    out.push(`- [${p.title}](${BASE}/${cat}/${p.slug}): ${p.summary}`)
  }
  out.push("")
}
await fs.writeFile("content/llms.txt", out.join("\n"))
```

Wire this into `prebuild` so it runs before the site build.
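A minimal wiring, assuming npm scripts; `quartz build` is a stand-in for whatever your SSG runs:

```json
{
  "scripts": {
    "prebuild": "node scripts/build-llms-txt.mjs",
    "build": "quartz build"
  }
}
```

npm runs `prebuild` automatically before `build`, so the index regenerates on every build.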
### 4. Place the file at the site root
The file must be served from https://yourdomain.com/llms.txt, not from a subpath. Each SSG has a different mechanism.
- Quartz: drop `content/llms.txt` and it ships to `public/llms.txt`. See quartz.
- Astro: put it in `public/llms.txt`.
- Next.js: put it in `public/llms.txt` or write an API route (see the sketch below).
- Hugo: put it in `static/llms.txt`.
Serve it with `Content-Type: text/plain` or `text/markdown`. Most static hosts infer the type from the file extension.
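If you take the Next.js route option, a minimal sketch using an App Router route handler; the route path and the read-from-disk approach are assumptions, so adapt them to where your generator writes the file:

```js
// app/llms.txt/route.js — serves the generated file with an explicit
// content type. Assumes the generator wrote content/llms.txt at build time.
import fs from "node:fs/promises"

export async function GET() {
  const body = await fs.readFile("content/llms.txt", "utf8")
  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  })
}
```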
### 5. Link from robots.txt and ai.txt
Discovery improves when the file is referenced where crawlers already look.
```
# robots.txt
Sitemap: https://example.com/sitemap.xml

# llms.txt index for LLM agents: https://example.com/llms.txt
```

```
# ai.txt
User-Agent: *
Allow: /

# llms.txt index: https://example.com/llms.txt
```
The `# llms.txt` comment is informational; some tools parse it. See ai-txt.
### 6. Consider /llms-full.txt
/llms-full.txt is an optional companion: the same index with full page bodies inlined. Ship it when the total content fits in a single context window (rule of thumb: under 500K tokens). For larger sites, omit it.
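A sketch of the companion generator, reusing the step 3 walk; `readBody` is a hypothetical helper assumed to return a page's markdown body minus frontmatter:

```js
// scripts/build-llms-full-txt.mjs — sketch only; readBody is an
// assumed helper alongside walkContent and readFrontmatter.
import fs from "node:fs/promises"
import { walkContent, readFrontmatter, readBody } from "./util.mjs"

// Start from the generated index, then append each published page in full.
const out = [await fs.readFile("content/llms.txt", "utf8"), ""]
for (const p of await walkContent("content")) {
  const fm = await readFrontmatter(p)
  if (fm.status === "draft") continue
  out.push(`## ${fm.title}`, "", await readBody(p), "")
}
await fs.writeFile("content/llms-full.txt", out.join("\n"))
```

Gate it on a token count if the corpus may outgrow the 500K rule of thumb.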
## Verify it worked
Four checks.
```bash
# 1. The file is served at the root.
curl -sI https://yourdomain.com/llms.txt | head -1
# expected: HTTP/2 200

# 2. It starts with "# " and contains a blockquote.
curl -s https://yourdomain.com/llms.txt | head -3
# expected: line 1 starts with "# ", line 3 starts with "> "

# 3. Every URL returns 200. Spot-check three.
curl -s https://yourdomain.com/llms.txt | \
  grep -oE 'https?://[^)]+' | sort -u | head -3 | \
  xargs -I {} curl -sI {} -o /dev/null -w "%{http_code} {}\n"

# 4. No duplicate URLs.
curl -s https://yourdomain.com/llms.txt | grep -oE 'https?://[^)]+' | sort | uniq -d
# expected: empty output
```

## Common errors
- File ends up at `/static/llms.txt` instead of `/llms.txt`. Move it to the build root.
- Mixed relative and absolute URLs. Pick absolute and regenerate.
- Summary line longer than one sentence. The summary is a routing hint; trim it.
- Drafts and deprecated pages leak in. The generator must filter on `status`.
- File regenerated by hand and forgotten on the next content change. Wire the generator into `prebuild` so the file is never stale.