Overview

/llms.txt is a plain-text markdown file that gives LLM agents a curated index of a site’s content without requiring a full crawl. This guide shows how to add it to any static site: write the file, place it at the root, add the /ai.txt sibling, and cross-link from robots.txt. For the full generation-from-frontmatter workflow, see ship-llms-txt.

Prerequisites

  • A site with stable URLs. The file lists absolute URLs that should not change.
  • Access to place a file at the site root (/llms.txt, not /blog/llms.txt).
  • A plain text editor or a templating step in your build pipeline.

Steps

1. Understand the spec

The llmstxt.org spec defines the format:

  1. One # H1 with the site name.
  2. One > blockquote with a two-sentence site summary.
  3. Optional prose paragraphs before the link sections.
  4. One or more ## H2 sections, each with a bulleted list of markdown links with inline summaries.

Each bullet follows the pattern:

- [Page Title](https://example.com/path): One-sentence description.

Use absolute URLs. The file is fetched in isolation; relative URLs break.
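These structural rules are easy to lint mechanically. A minimal sketch in Python — the `validate_llms_txt` name and the exact checks are illustrative, not part of the spec:

```python
import re

def validate_llms_txt(text: str) -> list[str]:
    """Check a candidate llms.txt against the basic llmstxt.org structure."""
    errors = []
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or not lines[0].startswith("# "):
        errors.append("missing # H1 with the site name")
    # Loosened heuristic: the blockquote should appear right after the H1.
    if not any(l.startswith("> ") for l in lines[:5]):
        errors.append("missing > blockquote summary near the top")
    if not any(l.startswith("## ") for l in lines):
        errors.append("missing ## H2 link sections")
    # Bullets must use absolute URLs: - [Title](https://...): description.
    for l in lines:
        if l.startswith("- [") and not re.search(r"\]\(https?://", l):
            errors.append(f"relative URL in bullet: {l[:60]}")
    return errors
```

Running this in CI catches the most common regressions (a dropped blockquote, a relative link) before they ship.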

2. Write the file

# My Site
 
> Opinionated reference vault of best practices for web development and AI agent work.
> Pages are atomic, YAML-fronted, and stable enough to cite in agent prompts.
 
## Coding
 
- [TypeScript Best Practices](https://example.com/coding/typescript): When to use strict mode, utility types, and runtime validation.
- [Python Packaging](https://example.com/coding/python-packaging): pyproject.toml layout, build backends, and publishing to PyPI.
 
## AI Agents
 
- [Prompt Design](https://example.com/ai-agents/prompt-design): Rules for writing system prompts, few-shot examples, and chain-of-thought.
- [RAG Overview](https://example.com/ai-agents/rag): Chunking, embedding, retrieval, reranking, and citation strategies.

Keep descriptions to one sentence. Agents use them to decide whether to follow the link. Long descriptions waste tokens without improving routing. See llms-txt for the complete spec reference.
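If the file is produced by a build step rather than written by hand, the bullet format above can be rendered from a simple page manifest. A hedged sketch — `build_llms_txt` and the data shape are assumptions, not a standard API:

```python
def build_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt string from (title, absolute_url, one_sentence) tuples."""
    out = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        out.append(f"## {heading}")
        out.append("")
        for title, url, desc in pages:
            out.append(f"- [{title}]({url}): {desc}")
        out.append("")
    return "\n".join(out)
```

Feeding it page titles, canonical URLs, and one-sentence summaries from frontmatter keeps the index in sync with the site.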

3. Place the file at the site root

The file must be served from https://yourdomain.com/llms.txt.

  • Static HTML: copy llms.txt to the public root.
  • Astro / Next.js: place in public/llms.txt.
  • Hugo: place in static/llms.txt.
  • Quartz: place in content/llms.txt.
  • Nginx: if the file lives outside the public root, add a location block that serves it at /llms.txt.

Serve with Content-Type: text/plain or text/markdown. Most static hosts detect this from the .txt extension.
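For the Nginx case, a location block along these lines works; the filesystem path and cache TTL are placeholders:

```nginx
location = /llms.txt {
    alias /srv/site-meta/llms.txt;
    default_type text/plain;  # fallback if mime.types has no mapping
    add_header Cache-Control "max-age=3600";
}
```

The short TTL matters later: it lets updates propagate without a manual CDN purge.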

4. Add the /ai.txt sibling

/ai.txt is a machine-readable counterpart to robots.txt that declares how AI systems may use the site’s content. At minimum:

User-Agent: *
Allow: /

# llms.txt index: https://yourdomain.com/llms.txt

This opt-in posture signals that LLM training and retrieval are permitted. If your content is licensed under MIT or CC, add a comment:

# License: MIT
# Attribution: https://yourdomain.com/about

For license and attribution rules, see ai-txt.

5. Cross-link from robots.txt

Discovery improves when llms.txt is referenced in robots.txt:

User-agent: *
Disallow:

Sitemap: https://yourdomain.com/sitemap.xml

# LLM agent index
# llms.txt: https://yourdomain.com/llms.txt

The # llms.txt: comment format is recognized by some LLM-native crawlers. It costs nothing and improves discoverability. See technical for the full robots.txt playbook.
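Before shipping the cross-link, it is worth confirming that robots.txt does not also block the pages llms.txt lists — a failure mode covered under Common errors. Python’s stdlib robots.txt parser can check the rules offline; the URLs here are placeholders:

```python
from urllib.robotparser import RobotFileParser

# Rules as served at /robots.txt (contents from the example above).
robots_txt = """\
User-agent: *
Disallow:

Sitemap: https://yourdomain.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Every URL listed in llms.txt should be fetchable under these rules.
listed = [
    "https://yourdomain.com/llms.txt",
    "https://yourdomain.com/coding/typescript",
]
blocked = [u for u in listed if not rp.can_fetch("*", u)]
```

An empty `blocked` list means the listing and the crawl rules agree.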

6. Consider an /llms-full.txt

/llms-full.txt inlines the full content of every page after the index section. Provide it when:

  • Total content is under 200K tokens.
  • Agents frequently need to read multiple pages (reduces round-trips).

Skip it for large sites. A bloated full-text file is slower to load than targeted per-page fetches.

Verify it worked

# 1. File is reachable.
curl -sI https://yourdomain.com/llms.txt | head -1
# expected: HTTP/2 200
 
# 2. File starts with the correct structure.
curl -s https://yourdomain.com/llms.txt | head -5
 
# 3. ai.txt is present.
curl -sI https://yourdomain.com/ai.txt | head -1
# expected: HTTP/2 200
 
# 4. All linked URLs return 200. Check three.
curl -s https://yourdomain.com/llms.txt | \
  grep -oE 'https?://[^)]+' | head -3 | \
  xargs -I {} sh -c 'printf "%s: " "{}"; curl -sI "{}" | head -1'

Common errors

  • File is at /blog/llms.txt or /static/llms.txt instead of the root. Move it.
  • Relative URLs in the link list. Replace with absolute URLs.
  • robots.txt blocks User-agent: *. The llms.txt listing is pointless if the pages themselves are blocked. Fix the robots.txt disallow rules.
  • File updates are cached at the CDN. Purge the CDN cache or set a short Cache-Control TTL (under 1 hour) for this file.
  • The blockquote is missing. Some parsers reject the file if the > blockquote is absent. Add it immediately after the H1.