Overview
AI search optimization (Generative Engine Optimization, GEO, or Answer Engine Optimization, AEO) is the discipline of getting a page cited by an LLM-powered answer, not just ranked in a blue-link SERP. Google AI Overviews, ChatGPT search, Perplexity, and Claude citations all read pages, extract claims, and quote them with a link back. The signals that win citations overlap with e-e-a-t and helpful-content-update but add three new requirements: structured entity definitions, a clean discoverability layer, and atomic claims a model can lift without rewriting.
Define entities once, with a stable URL per term
LLMs cite the page they can resolve to a single concept. Pages that bundle five definitions get skipped for pages that define one.
- One concept per page. Glossary entries live at /glossary/<term> and define a single term in 150 to 300 words.
- Lead with a one-sentence definition. The first sentence is the model’s quote target.
- Mark up the term with DefinedTerm JSON-LD so the entity is machine-readable.
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "Largest Contentful Paint",
  "alternateName": "LCP",
  "description": "The time from navigation start to when the largest above-the-fold element renders. Core Web Vitals target is under 2.5 seconds at the 75th percentile.",
  "inDefinedTermSet": "https://example.com/glossary/",
  "url": "https://example.com/glossary/lcp"
}
</script>
Pair the glossary entry with a deep-dive article elsewhere on the site; the glossary is the citation anchor, the deep-dive is the rank target. See schema-markup-deep for the full schema catalog.
Ship llms.txt and llms-full.txt at the site root
LLM crawlers and retrieval pipelines read /llms.txt for the agent-facing index and /llms-full.txt for the concatenated corpus. The two files together let a model decide what to read without crawling the HTML tree.
- /llms.txt: a markdown index grouped by category, with one line per page (title + URL + one-sentence summary).
- /llms-full.txt: every canonical page concatenated as plain markdown, frontmatter included, in the same order as llms.txt.
Regenerate both on every build from the same source the sitemap uses. See llms-txt for the format and discoverability-files for the wider set of agent-facing files (ai.txt, humans.txt, security.txt).
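
A minimal sketch of generating both files from one page list, so the index and the concatenated corpus cannot drift apart. The `Page` record, its field names, and the function names are hypothetical; the exact llms.txt layout is assumed here (a top-level title, one heading per category, one link line per page) and should be checked against the llms-txt note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    # Hypothetical record; a real build would fill this from the sitemap source.
    category: str
    title: str
    url: str
    summary: str
    markdown: str  # full page source, frontmatter included


def ordered(pages: list[Page]) -> list[Page]:
    """Group pages by category, keeping first-seen category order."""
    groups: dict[str, list[Page]] = {}
    for page in pages:
        groups.setdefault(page.category, []).append(page)
    return [page for group in groups.values() for page in group]


def build_llms_txt(site_name: str, pages: list[Page]) -> str:
    """Markdown index: one heading per category, one line per page."""
    lines = [f"# {site_name}"]
    current = None
    for page in ordered(pages):
        if page.category != current:
            current = page.category
            lines += ["", f"## {current}", ""]
        lines.append(f"- [{page.title}]({page.url}): {page.summary}")
    return "\n".join(lines) + "\n"


def build_llms_full_txt(pages: list[Page]) -> str:
    """Every page concatenated as markdown, in the same order as the index."""
    return "\n\n".join(page.markdown.strip() for page in ordered(pages)) + "\n"
```

Both builders call the same `ordered` helper, which is what keeps the two files in the same page order.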
Write atomic claims a model can lift verbatim
Models quote sentences, not paragraphs. A claim that wins citations has three properties.
- Self-contained: the sentence does not depend on the previous sentence for context.
- Specific: at least one named entity, number, or version. “Postgres 17 ships incremental backup via pg_combinebackup” beats “Postgres has improved its backup story.”
- Sourced: a primary citation in the next sentence or a link in the same paragraph.
Long, hedged sentences get rewritten by the model and lose the citation. Short, declarative sentences get lifted intact and keep the link.
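
The first two properties can be approximated with a rough lint pass over draft sentences. This is a heuristic sketch, not a real classifier: the opener list, the 30-word limit, and the function name are all hypothetical, and the "sourced" property is left out because it needs paragraph-level context.

```python
import re

# Openers that usually point back at an earlier sentence (illustrative, not exhaustive).
DANGLING_OPENERS = {
    "it", "this", "that", "these", "those", "they",
    "however", "also", "additionally", "therefore",
}
MAX_WORDS = 30  # hypothetical length limit for a liftable sentence


def claim_issues(sentence: str) -> list[str]:
    """Return heuristic reasons a sentence may not be liftable as written."""
    words = re.findall(r"[A-Za-z0-9][\w.'-]*", sentence)
    if not words:
        return ["empty sentence"]
    issues = []
    if words[0].lower() in DANGLING_OPENERS:
        issues.append("not self-contained: opens with a back-reference")
    has_number = any(ch.isdigit() for ch in sentence)
    # Skip the first word: sentence-initial capitals do not signal a named entity.
    has_named_entity = any(word[0].isupper() for word in words[1:])
    if not (has_number or has_named_entity):
        issues.append("not specific: no number, version, or named entity")
    if len(words) > MAX_WORDS:
        issues.append(f"too long: {len(words)} words")
    return issues
```

Under these rules the Postgres 17 sentence above passes cleanly, while “Postgres has improved its backup story.” is flagged as not specific.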
Surface named entities everywhere they matter
LLMs build a model of the page from the entities it names. Surface them in four places.
- <title> and <h1>: the primary entity verbatim.
- First paragraph: every secondary entity the page covers.
- Article JSON-LD about array: each entity as a Thing with a sameAs Wikipedia or Wikidata URL.
- Author byline: Person JSON-LD with a populated sameAs array (see e-e-a-t).
Entity disambiguation through sameAs is what tells the model that “Astro” on this page is the framework, not the website builder.
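
One way to keep the about array and the author byline consistent is to generate the Article JSON-LD at build time. The sketch below assumes hypothetical function names and a simple (name, sameAs URL) pair per entity; the headline, page URL, and author values in the usage note are placeholders.

```python
import json


def article_jsonld(headline, url, about, author_name, author_same_as):
    """Build Article JSON-LD; `about` is a list of (entity name, sameAs URL) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "about": [
            {"@type": "Thing", "name": name, "sameAs": same_as}
            for name, same_as in about
        ],
        "author": {
            "@type": "Person",
            "name": author_name,
            "sameAs": list(author_same_as),
        },
    }


def script_tag(data):
    """Serialize to a script tag, escaping "</" so the payload cannot close the tag early."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

A call such as `article_jsonld("Incremental backup in Postgres 17", "https://example.com/postgres-17-backup", [("PostgreSQL", "https://en.wikipedia.org/wiki/PostgreSQL")], "Jane Doe", ["https://example.com/authors/jane-doe"])` yields one Thing per entity, each carrying the sameAs URL that disambiguates it.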
Pick evergreen over trending for citation durability
Trending pages spike in AI Overview inclusion for a week and drop out. Evergreen pages accrue citations for months. Default to evergreen; reserve trending coverage for the cases where the topic is the news.
- Evergreen: definitions, comparisons, how-tos, reference tables. Refresh on a quarterly cadence; re-date only when the body changes (see helpful-content-update).
- Trending: launches, version releases, breaking incidents. Publish fast, link from the evergreen page, and let the evergreen page outlive the news cycle.
A vault of 100 evergreen pages earns more durable AI citations than 1,000 trending posts.
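
The "re-date only when the body changes" rule can be enforced mechanically by storing a digest of the body next to the modified date. This is a sketch under assumptions: the function names are hypothetical, and whitespace-only edits are deliberately treated as no change.

```python
import hashlib
from datetime import date


def body_digest(body: str) -> str:
    """Hash the body with whitespace collapsed, so reflowing text is not a change."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_date_modified(body: str, stored_digest: str, stored_date: date, today: date) -> date:
    """Keep the stored date unless the body content actually changed."""
    if body_digest(body) == stored_digest:
        return stored_date
    return today
```

A quarterly refresh job could call `next_date_modified` for each page and write the result back only when it differs from the stored date.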
Monitor AI Overview inclusion with deliberate queries
Google Search Console (GSC) does not report AI Overview impressions cleanly yet. Track inclusion with three methods.
- Manual SERP checks on a fixed query list, run weekly from a clean session.
- Perplexity and ChatGPT citation logs through their public APIs where available.
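- Server logs filtered by AI user agents: GPTBot, ClaudeBot, PerplexityBot. Google-Extended is a robots.txt control token, not a user-agent string, so it does not appear in logs; Google’s fetches arrive under its regular Googlebot user agents. A spike in fetches from one of these usually precedes a citation by 24 to 72 hours.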
Treat the user-agent log as the leading indicator. Treat manual SERP checks as the confirmation.
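
The server-log method can be sketched as a per-day counter over access-log lines. The sketch assumes the Combined Log Format; the regular expression and function name are hypothetical and would need adjusting to the actual log layout. Google-Extended is left out because it is a robots.txt token and does not show up as a user-agent string.

```python
import re
from collections import Counter
from datetime import datetime

AI_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Assumed Combined Log Format:
# ip - user [time] "request" status bytes "referer" "user-agent"
LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def ai_fetches_per_day(lines):
    """Count fetches per (ISO date, agent) for the AI user agents listed above."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue  # skip lines that do not match the assumed format
        user_agent = match.group("ua").lower()
        agent = next((a for a in AI_AGENTS if a.lower() in user_agent), None)
        if agent is None:
            continue
        day = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z").date()
        counts[(day.isoformat(), agent)] += 1
    return counts
```

Comparing each day's count against the previous week's average is one simple way to surface the spikes described above.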
Pitfalls
- Blocking AI crawlers in robots.txt while wanting AI citations. Pick one. Allowing GPTBot and ClaudeBot is the prerequisite to being quoted.
- Stuffing the page with FAQ schema in the hope of triggering AI Overviews. The classifier ignores it on pages that do not actually answer the questions.
- Treating GEO as separate from SEO. The same page wins both when the entity work and the citation hygiene are right.
- Skipping the RAG side. If the site has a chat surface, the same atomic-claim structure that wins AI Overviews also wins internal retrieval. See rag.
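
The first pitfall is easy to catch in CI. A minimal sketch using Python's standard urllib.robotparser, assuming the robots.txt body is already available as a string; the function name and the choice of agents to check are hypothetical.

```python
from urllib.robotparser import RobotFileParser

REQUIRED_AGENTS = ("GPTBot", "ClaudeBot")


def blocked_agents(robots_txt: str, url: str, agents=REQUIRED_AGENTS) -> list[str]:
    """Return the agents that the given robots.txt body forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]
```

A build step could fail when `blocked_agents(robots_body, "https://example.com/glossary/lcp")` returns a non-empty list. The standard-library parser may not match every crawler's own interpretation of robots.txt, so treat the result as a first check rather than a guarantee.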