Overview
/llms.txt is a plain-text markdown file that gives LLM agents a curated index of a site’s content without requiring a full crawl. This guide adds it to any static site: write the file, place it at the root, add the /ai.txt sibling, and cross-link from robots.txt. For the full generation-from-frontmatter workflow, see ship-llms-txt.
Prerequisites
- A site with stable URLs. The file lists absolute URLs that should not change.
- Access to place a file at the site root (`/llms.txt`, not `/blog/llms.txt`).
- A plain text editor or a templating step in your build pipeline.
Steps
1. Understand the spec
The llmstxt.org spec defines the format:
- One `# H1` with the site name.
- One `> blockquote` with a two-sentence site summary.
- Optional prose paragraphs before the link sections.
- One or more `## H2` sections, each with a bulleted list of markdown links with inline summaries.
Each bullet follows the pattern:
```
- [Page Title](https://example.com/path): One-sentence description.
```
Use absolute URLs. The file is fetched in isolation; relative URLs break.
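The absolute-URL rule is easy to enforce mechanically. A minimal sketch (the function name and sample input are illustrative, not part of the spec) that flags any markdown link target in an llms.txt body that is not an absolute `http(s)` URL:

```python
import re

def find_relative_urls(llms_txt: str) -> list[str]:
    """Return link targets that are not absolute http(s) URLs."""
    bad = []
    # Match the (target) part of every markdown link: [text](target)
    for target in re.findall(r"\[[^\]]+\]\(([^)]+)\)", llms_txt):
        if not re.match(r"https?://", target):
            bad.append(target)
    return bad

sample = (
    "- [Good](https://example.com/a): Works when fetched in isolation.\n"
    "- [Bad](/relative/path): Breaks outside the site context.\n"
)
print(find_relative_urls(sample))  # -> ['/relative/path']
```

Run this against the generated file in CI to catch relative URLs before deploy.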
2. Write the file
```
# My Site

> Opinionated reference vault of best practices for web development and AI agent work.
> Pages are atomic, YAML-fronted, and stable enough to cite in agent prompts.

## Coding

- [TypeScript Best Practices](https://example.com/coding/typescript): When to use strict mode, utility types, and runtime validation.
- [Python Packaging](https://example.com/coding/python-packaging): pyproject.toml layout, build backends, and publishing to PyPI.

## AI Agents

- [Prompt Design](https://example.com/ai-agents/prompt-design): Rules for writing system prompts, few-shot examples, and chain-of-thought.
- [RAG Overview](https://example.com/ai-agents/rag): Chunking, embedding, retrieval, reranking, and citation strategies.
```

Keep descriptions to one sentence. Agents use them to decide whether to follow the link. Long descriptions waste tokens without improving routing. See llms-txt for the complete spec reference.
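If pages carry structured metadata, the link sections can be rendered from it rather than maintained by hand. A minimal sketch, assuming each page is a dict with `title`, `url`, and `summary` keys (the shape is hypothetical; adapt it to your frontmatter):

```python
def render_section(heading: str, pages: list[dict]) -> str:
    """Render one '## H2' section of an llms.txt file from page metadata."""
    lines = [f"## {heading}", ""]
    for page in pages:
        lines.append(f"- [{page['title']}]({page['url']}): {page['summary']}")
    return "\n".join(lines)

pages = [
    {
        "title": "Prompt Design",
        "url": "https://example.com/ai-agents/prompt-design",
        "summary": "Rules for writing system prompts, few-shot examples, and chain-of-thought.",
    },
]
print(render_section("AI Agents", pages))
```

This is only the rendering step; the full generation-from-frontmatter workflow is covered in ship-llms-txt.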
3. Place the file at the site root
The file must serve from https://yourdomain.com/llms.txt.
- Static HTML: copy `llms.txt` to the public root.
- Astro / Next.js: place in `public/llms.txt`.
- Hugo: place in `static/llms.txt`.
- Quartz: place in `content/llms.txt`.
- Nginx: add a `location` block to serve the file from a non-public directory if needed.
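For the Nginx case, a sketch of such a `location` block (the filesystem path `/srv/site-meta/llms.txt` is a placeholder; use your own):

```
# Serve llms.txt from outside the web root with an explicit MIME type.
location = /llms.txt {
    alias /srv/site-meta/llms.txt;
    default_type text/plain;
}
```

`alias` maps the exact URI to the file, and `default_type` ensures the correct Content-Type even if the MIME map misses it.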
Serve with `Content-Type: text/plain` or `text/markdown`. Most static hosts detect this from the `.txt` extension.
4. Add the /ai.txt sibling
/ai.txt is a machine-readable counterpart to robots.txt that declares how AI systems may use the site’s content. At minimum:
```
User-Agent: *
Allow: /
# llms.txt index: https://yourdomain.com/llms.txt
```
This opt-in posture signals that LLM training and retrieval are permitted. If your content is licensed under MIT or CC, add a comment:
```
# License: MIT
# Attribution: https://yourdomain.com/about
```
For license and attribution rules, see ai-txt.
5. Cross-link from robots.txt
Discovery improves when the file is referenced in robots.txt:
```
User-agent: *
Disallow:

Sitemap: https://yourdomain.com/sitemap.xml

# LLM agent index
# llms.txt: https://yourdomain.com/llms.txt
```
The `# llms.txt:` comment format is recognized by some LLM-native crawlers. It costs nothing and improves discoverability. See technical for the full robots.txt playbook.
6. Consider an /llms-full.txt
/llms-full.txt inlines the full content of every page after the index section. Provide it when:
- Total content is under 200K tokens.
- Agents frequently need to read multiple pages (reduces round-trips).
Skip it for large sites. A bloated full-text file is slower to load than targeted per-page fetches.
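To decide whether you are under the 200K-token threshold, a rough estimate is enough. A minimal sketch using the common heuristic of roughly four characters per token for English text (the exact ratio varies by tokenizer, so treat the result as an order-of-magnitude check):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; ~4 chars/token is a common English heuristic."""
    return round(len(text) / chars_per_token)

body = "word " * 1000  # 5,000 characters of sample content
print(estimate_tokens(body))  # -> 1250
```

Sum the estimate over every page you would inline; if it exceeds ~200,000, skip `/llms-full.txt`.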
Verify it worked
```shell
# 1. File is reachable.
curl -sI https://yourdomain.com/llms.txt | head -1
# expected: HTTP/2 200

# 2. File starts with the correct structure.
curl -s https://yourdomain.com/llms.txt | head -5

# 3. ai.txt is present.
curl -sI https://yourdomain.com/ai.txt | head -1
# expected: HTTP/2 200

# 4. All linked URLs return 200. Check three.
curl -s https://yourdomain.com/llms.txt | \
  grep -oE 'https?://[^)]+' | head -3 | \
  xargs -I {} sh -c 'echo -n "{}: "; curl -sI {} | head -1'
```
Common errors
- File is at `/blog/llms.txt` or `/static/llms.txt` instead of the root. Move it.
- Relative URLs in the link list. Replace with absolute URLs.
- `robots.txt` blocks `User-agent: *`. The `llms.txt` listing is pointless if the pages themselves are blocked. Fix the `robots.txt` disallow rules.
- File updates are cached at the CDN. Purge the CDN cache or set a short `Cache-Control` TTL (under 1 hour) for this file.
- The blockquote is missing. Some parsers reject the file if the `>` blockquote is absent. Add it immediately after the H1.