Overview
llms-full.txt is a single file that contains all of your site’s textual content in a format LLMs can consume in one request. Agents use it for bulk indexing, offline analysis, and RAG corpus construction. This guide writes a build-time generator that traverses the content directory, filters by status, and concatenates Markdown files with metadata headers. See llms-txt for the spec and add-llms-txt-to-existing-site for adding the index file.
Prerequisites
- A Markdown-based content directory (Quartz, Astro, Hugo, or similar).
- Node 22 with ESM support.
- Each Markdown file has YAML frontmatter with at least a title, slug, and status field.
- A scripts/ directory or equivalent for build utilities.
Steps
1. Create the generator script
Create scripts/generate-llms-full.mjs:
import { mkdir, readdir, readFile, writeFile } from "node:fs/promises";
import { dirname, extname, join } from "node:path";

const CONTENT_DIR = "content";
const OUTPUT_FILE = "public/llms-full.txt";
const EXCLUDE_STATUS = ["deprecated"];

// Recursively yield the path of every .md file under dir.
async function* walkMarkdown(dir) {
  const entries = await readdir(dir, { withFileTypes: true });
  for (const entry of entries) {
    const path = join(dir, entry.name);
    if (entry.isDirectory()) {
      yield* walkMarkdown(path);
    } else if (extname(entry.name) === ".md") {
      yield path;
    }
  }
}

// Minimal frontmatter parser: handles flat "key: value" lines only.
// The \r?\n patterns also accept files saved with CRLF line endings.
function parseFrontmatter(content) {
  const match = content.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) return {};
  const fm = {};
  for (const line of match[1].split("\n")) {
    const [key, ...rest] = line.split(":");
    if (key && rest.length) fm[key.trim()] = rest.join(":").trim().replace(/^"|"$/g, "");
  }
  return fm;
}

const parts = [];
for await (const file of walkMarkdown(CONTENT_DIR)) {
  const raw = await readFile(file, "utf8");
  const fm = parseFrontmatter(raw);
  if (EXCLUDE_STATUS.includes(fm.status)) continue;
  // Strip the frontmatter block, keeping only the page body.
  const body = raw.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n/, "").trim();
  parts.push(`# ${fm.title || file}\n\nSource: ${file}\n\n${body}`);
}

const output = parts.join("\n\n---\n\n");
// Create the output directory if it does not exist yet.
await mkdir(dirname(OUTPUT_FILE), { recursive: true });
await writeFile(OUTPUT_FILE, output, "utf8");
console.log(`Generated llms-full.txt: ${parts.length} pages, ${output.length} characters.`);

2. Add to package.json scripts
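{
  "scripts": {
    "generate:llms-full": "node scripts/generate-llms-full.mjs",
    "build": "npx quartz build && npm run generate:llms-full"
  }
}

Quartz writes its build output to public/ and clears that directory when a build starts, so the generator runs after the build here. For frameworks that copy public/ into the build (Astro, Next.js), run it before the build instead. Run npm run generate:llms-full locally to inspect the output before deploying.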
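When inspecting the output, it can help to know what the line-based parser actually extracts. The sketch below mirrors the parsing logic from step 1 and runs it on a made-up page (the sample text and its tags field are illustrative assumptions, not taken from any real site). Flat key: value pairs survive, including a colon inside a quoted title, while the list under tags is silently dropped.

```javascript
// Mirrors the line-based parser from scripts/generate-llms-full.mjs (step 1).
function parseFrontmatter(content) {
  const match = content.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) return {};
  const fm = {};
  for (const line of match[1].split("\n")) {
    const [key, ...rest] = line.split(":");
    if (key && rest.length) fm[key.trim()] = rest.join(":").trim().replace(/^"|"$/g, "");
  }
  return fm;
}

// Hypothetical sample page, for illustration only.
const sample = [
  "---",
  'title: "Caching: a primer"',
  "slug: caching-primer",
  "status: draft",
  "tags:",
  "  - http",
  "  - performance",
  "---",
  "Body text.",
].join("\n");

const fm = parseFrontmatter(sample);
console.log(fm.title);  // Caching: a primer
console.log(fm.status); // draft
console.log(fm.tags);   // empty string: the list items are lost
```

If your pages rely on list or multi-line fields, the js-yaml suggestion under Common errors is the safer route.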
3. Filter by status and category
Add a filter to exclude draft pages and restrict the output to categories that are ready for public consumption. The category check assumes each page also has a category field in its frontmatter.
const EXCLUDE_STATUS = ["deprecated", "draft"];
const INCLUDE_CATEGORIES = []; // empty = all; ["coding", "frontend"] = only those

// Inside the loop:
if (EXCLUDE_STATUS.includes(fm.status)) continue;
if (INCLUDE_CATEGORIES.length && !INCLUDE_CATEGORIES.includes(fm.category)) continue;

4. Add a file header
Prepend a header that describes the file to agents reading it.
const header = `# llms-full.txt\n\nGenerated: ${new Date().toISOString()}\nPages: ${parts.length}\nSource: https://yoursite.com\n\nThis file contains the full text of all stable pages. Use llms.txt for the index.\n\n---\n\n`;
// Replaces the writeFile call from step 1.
await writeFile(OUTPUT_FILE, header + output, "utf8");

5. Serve the file from the site root
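Ensure the built file is accessible at https://yoursite.com/llms-full.txt. Quartz serves everything in its public/ output directory, so writing the file there is sufficient. Astro and Next.js copy files from public/ to the site root, and Hugo does the same from static/, so adjust OUTPUT_FILE to match your framework.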
Test with:
curl -I https://yoursite.com/llms-full.txt
# Expect 200 and Content-Type: text/plain

Verify it worked
node scripts/generate-llms-full.mjs
wc -l public/llms-full.txt # line count
head -50 public/llms-full.txt # inspect header and first page
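grep "^# " public/llms-full.txt | wc -l # count page sections

The page section count should match the number of Markdown files that pass the status filter, plus one for the file header if you added it in step 4. Top-level headings and # comments inside page bodies also match the pattern, so treat a higher number as an upper bound rather than an error.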
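As a cross-check that is less sensitive to headings inside page bodies, one option is to count the Source: lines the generator writes under each page title. This is a minimal sketch: countPages is a hypothetical helper name, and the pattern assumes every source path ends in .md and that no page body contains a line of the same shape.

```javascript
// Count page sections by the "Source: <path>.md" line that follows each title.
// Assumes the "# title / Source: path" layout produced in step 1.
function countPages(text) {
  const matches = text.match(/^Source: .+\.md\r?$/gm);
  return matches ? matches.length : 0;
}

// Made-up excerpt in the generator's output format, including the
// file header from step 4 and a body that contains its own "# " heading.
const excerpt = [
  "# llms-full.txt", "", "Source: https://yoursite.com", "", "---", "",
  "# First page", "", "Source: content/first.md", "", "# Setup", "body",
  "", "---", "",
  "# Second page", "", "Source: content/guides/second.md", "", "body",
].join("\n");

console.log(countPages(excerpt)); // 2
```

In a real check, read public/llms-full.txt with readFile and compare the result with the page count the generator logs.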
Common errors
- Frontmatter parser missing fields: YAML frontmatter with arrays or multi-line values breaks the simple parser. Use a proper YAML parser (js-yaml) for robustness: import yaml from 'js-yaml'; const fm = yaml.load(match[1]);
- Output file too large: an llms-full.txt over 10 MB is unwieldy for single-context LLM calls. Filter to stable pages only, or split by category and generate multiple files.
- File not served at the correct URL: verify the output path matches the static file serving configuration. In Quartz, public/llms-full.txt serves at /llms-full.txt automatically.
- Draft page content leaks: confirm EXCLUDE_STATUS includes "draft" and the frontmatter status field is set correctly on all draft pages.
- Line endings inconsistent: on Windows, source files checked out with CRLF carry those line endings into the output. Force LF with output.replace(/\r\n/g, '\n') before writing.
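For the oversized-output case, one possible way to split by category is to group the formatted pages before joining them. This is a sketch under assumptions: splitByCategory and the llms-full-<category>.txt naming are hypothetical, each page object is assumed to carry the category from its frontmatter plus the already formatted text, category values are assumed to be safe to use in file names, and pages without a category fall into an uncategorized bucket. It also applies the LF normalization from the last item above.

```javascript
const SEPARATOR = "\n\n---\n\n";

// Group formatted pages by category and return { fileName: fileContent }.
function splitByCategory(pages) {
  const groups = new Map();
  for (const page of pages) {
    const category = page.category || "uncategorized";
    if (!groups.has(category)) groups.set(category, []);
    groups.get(category).push(page.text);
  }
  const files = {};
  for (const [category, texts] of groups) {
    // Normalize CRLF to LF so the output is identical across platforms.
    files[`llms-full-${category}.txt`] = texts.join(SEPARATOR).replace(/\r\n/g, "\n");
  }
  return files;
}

// Made-up pages for illustration.
const files = splitByCategory([
  { category: "coding", text: "# A\r\n\r\nSource: content/a.md" },
  { category: "frontend", text: "# B\n\nSource: content/b.md" },
  { category: "coding", text: "# C\n\nSource: content/c.md" },
  { text: "# D\n\nSource: content/d.md" },
]);

// Three entries: one each for coding, frontend, and uncategorized.
console.log(Object.keys(files).length); // 3
```

In the generator, each entry could then be written with writeFile next to the combined file, and llms.txt could link to the per-category files.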