Overview
SEO has its own vocabulary that crosses search engines, content strategy, structured data, performance, and AI search. This page defines every term once, links to the deep-dive where one exists, and groups terms by category so a developer or editor can navigate by domain. Use this as the citation anchor for every other SEO page in the vault.
Table of contents
- Core search concepts
- Crawling and indexing
- On-page SEO
- Off-page SEO
- Technical SEO
- Structured data
- Performance and Core Web Vitals
- AI search and agent readability
- Content strategy
- Discoverability files
- SERP features
- Penalties and recovery
- Local SEO
- Analytics and tooling
Core search concepts
What is SEO (Search Engine Optimization)?
SEO (Search Engine Optimization) is the practice of improving a site’s organic visibility in search engines. Covers technical setup, content quality, structured data, and external signals. The umbrella term for everything on this site. See technical for the technical baseline and content for the content layer.
What is a SERP (Search Engine Results Page)?
A SERP (Search Engine Results Page) is the page returned by a search engine for a given query. Contains organic results, ads, and SERP features (featured snippets, People Also Ask, knowledge panels). See serp-features.
What is a search engine?
A search engine is a system that crawls, indexes, and ranks web pages. The big four are Google, Bing, Baidu, and Yandex. AI search engines (ChatGPT search, Perplexity, Claude, Google AI Overviews) are a growing fifth category. See ai-search-optimization.
What is search intent?
Search intent is what the searcher actually wants when they type a query. The four classic categories are informational, navigational, commercial, and transactional. Matching intent is the single highest-impact content rule. See content.
What is a query?
A query is the string a user types or speaks into a search engine. The unit ranking systems compete on.
What is ranking?
Ranking is the position a page holds in the SERP for a given query. Affected by hundreds of signals, but driven by topical authority, technical health, and user-engagement metrics.
What is organic traffic?
Organic traffic is visits from unpaid search results. Distinct from paid (ads), direct (URL typed), referral (links), and social.
Crawling and indexing
What is a crawler?
A crawler is a bot that fetches pages to discover content for indexing. Googlebot, Bingbot, and Applebot are the major commercial crawlers. AI engines run their own (GPTBot, ClaudeBot, PerplexityBot). See javascript-seo.
What is crawl budget?
Crawl budget is the number of pages a search engine will fetch from a domain in a given period. Matters at scale (over 10k URLs). See crawl-budget.
What is indexing?
Indexing is the process of parsing a fetched page and storing its content in the search engine’s database. Indexed pages can rank; non-indexed pages cannot.
What is index coverage?
Index coverage is a Google Search Console report showing which pages are indexed, excluded, or erroring. The first place to check when traffic drops. See master-google-search-console.
What is noindex?
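Noindex is a meta tag (<meta name="robots" content="noindex">) or HTTP header (X-Robots-Tag: noindex) that asks search engines to exclude the page from their index. The page must stay crawlable for the directive to be seen, so do not also block it in robots.txt. While it is crawled, its links can still count for link equity flow.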
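A minimal Python sketch of checking both noindex carriers, the robots meta tag and the X-Robots-Tag header. The function name is_noindex is hypothetical, and the check ignores bot-specific forms such as a googlebot meta tag or a user-agent prefix in the header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def is_noindex(html, headers=None):
    """Return True if the meta tag or the X-Robots-Tag header carries noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = ""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]
    return "noindex" in parser.directives or "noindex" in header_directives


print(is_noindex('<meta name="robots" content="noindex, follow">'))
print(is_noindex("<p>hello</p>", {"X-Robots-Tag": "noindex"}))
```

Both calls report a noindexed page; a page with neither signal returns False.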
What is nofollow?
Nofollow is a link attribute (rel="nofollow") that asks search engines not to pass authority through the link. Google has treated it as a hint rather than a directive since 2019. Used for paid, untrusted, or user-generated content.
What are UGC and sponsored link attributes?
UGC and sponsored link attributes are two newer link-rel values: rel="ugc" for user-generated content (comments, forum posts) and rel="sponsored" for paid placements. More specific than nofollow.
What is render budget?
Render budget is a sub-budget within crawl budget specifically for JavaScript execution. Googlebot defers JS-heavy pages to a second-pass renderer with delay. See javascript-seo.
What is mobile-first indexing?
Mobile-first indexing is Google’s policy of indexing the mobile version of a page rather than the desktop version. Default for new sites since 2019; Google announced the rollout complete in 2023. See mobile-first.
On-page SEO
What is a title tag?
A title tag is the HTML <title> element. The single highest-impact on-page lever. Google rewrites titles that are vague, stuffed, or mismatched to content. See title-tags.
What is a meta description?
A meta description is the HTML <meta name="description"> element. Not a ranking signal directly, but a CTR driver in the SERP snippet. See write-meta-descriptions.
What is an H1?
An H1 is the single top-level heading on a page (<h1>). Should match the page intent and the title tag, but does not have to be identical.
What are H2/H3?
H2/H3 are subheadings (<h2>, <h3>) for sections and sub-cases. Use them in order with no level skips. See technical.
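The "no level skips" rule is easy to check mechanically. A small Python sketch, assuming the heading levels have already been extracted from the page in document order (the helper name find_level_skips is hypothetical):

```python
def find_level_skips(levels):
    """Return (previous, current) pairs where a heading jumps down more than one level."""
    skips = []
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            skips.append((prev, cur))
    return skips


# An H2 followed directly by an H4 is reported; moving back up (H3 to H2) is fine.
print(find_level_skips([1, 2, 3, 2, 4]))  # [(2, 4)]
```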
What is a slug?
A slug is the path component of a URL that identifies a page (/seo/glossary has slug glossary). Use short kebab-case noun phrases.
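A minimal Python sketch of turning a page title into a kebab-case slug. The function name slugify is hypothetical, and dropping non-ASCII characters after normalization is a simplifying assumption that will not suit every language.

```python
import re
import unicodedata


def slugify(title):
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(slugify("Core Web Vitals (CWV)"))  # core-web-vitals-cwv
```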
What is a canonical URL?
A canonical URL is the one URL form chosen as the indexable version of a page. Declared with <link rel="canonical">. See canonical-url and redirects.
What is anchor text?
Anchor text is the visible text of a hyperlink. Descriptive anchor text outperforms generic (“click here”, “read more”) for both SEO and accessibility. See internal-linking.
What is keyword density?
Keyword density is the frequency of a target keyword in a page’s body. Optimization on this metric specifically is obsolete; write naturally and the density falls out.
What is keyword stuffing?
Keyword stuffing is cramming a target keyword into a page beyond natural usage. Triggers algorithmic downweighting. Avoid.
What is a long-tail keyword?
A long-tail keyword is a multi-word, lower-volume query that is easier to rank for and converts better than head terms. The bread-and-butter of content strategy. See keyword-research.
What is search intent classification?
Search intent classification is tagging each target keyword as informational, navigational, commercial, or transactional. Drives content format and depth.
What is cannibalization?
Cannibalization is two pages on the same site competing for the same query, neither winning. See cannibalization.
Off-page SEO
What is a backlink?
A backlink is a link from another site pointing to yours. The original PageRank signal. Quality of referring domain matters more than count.
What is an inbound link?
An inbound link is the same as a backlink: a link pointing into your site.
What is an outbound link?
An outbound link is a link from your site pointing to another. Linking to authoritative sources is a trust signal.
What is an internal link?
An internal link is a link from one page on your site to another page on your site. The strongest lever you control. See internal-linking.
What is link equity?
Link equity is the ranking authority that flows through a link. Also called “link juice” historically.
What is a referring domain?
A referring domain is a unique domain that links to your site. More valuable as a signal than total backlink count: ten links from ten domains beat 100 from one.
What is disavow?
Disavow is a file submitted to Google Search Console listing toxic backlinks the site does not want associated. Use only after a manual action or clear algorithmic penalty.
What is link building?
Link building is the practice of actively earning or acquiring backlinks. Modern best practice favors earning links through original research, reference content, and PR.
Technical SEO
What is robots.txt?
robots.txt is a file at the site root that declares crawl scope to bots. See discoverability-files.
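To see how crawl scope plays out per bot, the Python standard library can evaluate a robots.txt body directly. The rules and URLs below are made up for illustration; note that urllib.robotparser implements the basic matching rules and may differ from a given search engine's parser on wildcards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everyone is kept out of /admin/, GPTBot out of everything.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("Googlebot", "https://example.com/seo/glossary"))  # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/users"))   # False
print(parser.can_fetch("GPTBot", "https://example.com/seo/glossary"))     # False
```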
What is sitemap.xml?
sitemap.xml is an XML file at the site root listing every canonical URL with its lastmod. See discoverability-files.
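A minimal Python sketch of generating such a file from a list of canonical URLs and their lastmod dates. The function name build_sitemap and the sample URLs and dates are placeholders; a real generator would also need to respect the per-file size limits of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """entries: iterable of (canonical_url, lastmod) with lastmod as YYYY-MM-DD."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap = build_sitemap([
    ("https://example.com/", "2024-06-01"),
    ("https://example.com/seo/glossary", "2024-05-20"),
])
print(sitemap)
```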
What is a 301 redirect?
A 301 redirect is a permanent redirect. Passes the bulk of link equity to the new URL. Default for all permanent URL changes. See redirects.
What is a 302 redirect?
A 302 redirect is a temporary redirect. Does not pass full link equity. Used rarely (legitimate A/B tests, scheduled outages).
What are 307 and 308 redirects?
307 and 308 redirects are HTTP/1.1 equivalents of 302 and 301 respectively, with the added guarantee that the request method (POST, GET) is preserved across the redirect.
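The four redirect codes differ along two axes, permanence and method preservation, which a small lookup makes explicit. This is an illustrative Python sketch; the helper name pick_redirect is hypothetical.

```python
# (permanent, method must be preserved) -> HTTP status code
REDIRECT_CODES = {
    (True, False): 301,
    (False, False): 302,
    (False, True): 307,
    (True, True): 308,
}


def pick_redirect(permanent, preserve_method=False):
    """Choose a redirect status from the two properties that distinguish them."""
    return REDIRECT_CODES[(permanent, preserve_method)]


print(pick_redirect(permanent=True))                         # 301
print(pick_redirect(permanent=False, preserve_method=True))  # 307
```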
What is a soft 404?
A soft 404 is a 200 OK response on a page that has no real content (empty search results, missing product). Google detects these heuristically and excludes them. Return real 404s.
What is hreflang?
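Hreflang is an attribute on <link rel="alternate"> elements (or the equivalent HTTP header or sitemap entry) declaring language and region variants of a page. Annotations must be reciprocal: each variant lists every other variant and itself. Required for multi-language and multi-region sites. See hreflang.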
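A small Python sketch of emitting the hreflang link set for a page. Shipping the same block on every variant keeps the annotations consistent across variants. The function name hreflang_links and the URLs are placeholders.

```python
from html import escape


def hreflang_links(variants, x_default=None):
    """variants: dict mapping a language-region code to an absolute URL."""
    lines = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in sorted(variants.items())
    ]
    if x_default:
        lines.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}">'
        )
    return "\n".join(lines)


print(hreflang_links(
    {"en-us": "https://example.com/en-us/", "de-de": "https://example.com/de-de/"},
    x_default="https://example.com/",
))
```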
What is canonicalization?
Canonicalization is the process of selecting one URL form as the indexable version. Self-referential canonicals on every indexable page prevent parameter and protocol drift.
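One way to picture canonicalization is as a normalizing function over URLs. The Python sketch below forces HTTPS, lowercases the host, drops the fragment, removes a few common tracking parameters, and sorts what remains. The parameter list and the policy choices are assumptions; each site has to decide its own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; extend it to match the site's own analytics.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "gclid", "fbclid",
}


def canonicalize(url):
    """Collapse protocol, host-case, fragment, and tracking-parameter variants."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    path = parts.path or "/"
    return urlunsplit(("https", parts.netloc.lower(), path, urlencode(sorted(query)), ""))


print(canonicalize("http://Example.com/seo/glossary?utm_source=x&b=2&a=1#top"))
# https://example.com/seo/glossary?a=1&b=2
```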
What are URL parameters?
URL parameters are query-string segments after ?. SEO concern when parameters generate near-duplicate pages (sorting, filtering, tracking).
What is faceted navigation?
Faceted navigation is filter UI (color, size, price) that generates parameterized URLs. Without crawl control, generates combinatorial explosion. See pagination-and-facets.
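The combinatorial explosion is simple arithmetic: each facet is either unset or set to one of its values, so the URL count is the product of (values + 1) across facets. A Python sketch with made-up facet sizes, assuming one value per facet and a fixed parameter order:

```python
from math import prod


def facet_url_count(facet_sizes):
    """facet_sizes: dict mapping facet name to its number of selectable values."""
    return prod(size + 1 for size in facet_sizes.values())


# 8 colors, 5 sizes, 4 price bands: (8+1) * (5+1) * (4+1) = 270 URLs,
# including the unfiltered listing itself.
print(facet_url_count({"color": 8, "size": 5, "price": 4}))  # 270
```

Multi-select filters or order-dependent parameters push the count far higher, which is why crawl control matters.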
What is pagination?
Pagination is splitting a long list into numbered pages. Modern best practice: self-canonical per page, no rel=prev/next. See pagination-and-facets.
What is HSTS?
HSTS (HTTP Strict Transport Security) is a response header that forces browsers to use HTTPS for the domain. SEO-relevant because it prevents protocol-drift duplicate content.
Structured data
What is Schema.org?
Schema.org is the vocabulary used for structured data. A shared standard between Google, Bing, Yandex, and others.
What is JSON-LD?
JSON-LD is the format Google recommends for embedding structured data. A <script type="application/ld+json"> block, usually in <head> though <body> also works. See structured-data.
What is Microdata?
Microdata is an older inline format for structured data. Superseded by JSON-LD for most use cases.
What is a rich result?
A rich result is a SERP listing enhanced by structured data (star ratings, FAQ accordions, recipe cards). See schema-markup-deep.
What is Article schema?
Article schema is the JSON-LD type for editorial pages. Google lists no strictly required properties; the core recommended ones are headline, datePublished, dateModified, author, and image. See e-e-a-t.
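A minimal Python sketch of rendering an Article object as a JSON-LD script block. The headline, dates, and author name are placeholders, and json_ld_script is a hypothetical helper, not part of any library.

```python
import json


def json_ld_script(data):
    """Serialize a dict as a JSON-LD <script> block safe to embed in HTML."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a string value can never close the script element early.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "SEO glossary",
    "datePublished": "2024-01-15",
    "dateModified": "2024-06-01",
    "author": {"@type": "Person", "name": "Jane Doe"},
}

print(json_ld_script(article))
```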
What is BreadcrumbList schema?
BreadcrumbList schema is JSON-LD declaring the page’s position in the site hierarchy. Surfaces breadcrumb display in SERPs.
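A Python sketch of deriving BreadcrumbList markup from a URL path, one ListItem per path segment. The function name, the base URL, and the display names are hypothetical.

```python
import json


def breadcrumb_list(base_url, path, names):
    """Build BreadcrumbList JSON-LD for a path such as /seo/glossary."""
    segments = [segment for segment in path.strip("/").split("/") if segment]
    items = []
    for position, segment in enumerate(segments, start=1):
        url = base_url.rstrip("/") + "/" + "/".join(segments[:position])
        items.append({
            "@type": "ListItem",
            "position": position,
            "name": names.get(segment, segment),  # fall back to the raw slug
            "item": url,
        })
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }


crumbs = breadcrumb_list(
    "https://example.com", "/seo/glossary", {"seo": "SEO", "glossary": "Glossary"}
)
print(json.dumps(crumbs, indent=2))
```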
What is FAQPage schema?
FAQPage schema is JSON-LD declaring a list of question-and-answer pairs. Since 2023 Google shows the expandable FAQ rich result only for well-known government and health sites.
What is HowTo schema?
HowTo schema is JSON-LD declaring step-by-step instructions. Google stopped showing HowTo rich results in 2023, but the markup remains valid Schema.org.
What is Person schema?
Person schema is JSON-LD declaring an author entity. Pair with sameAs array linking to social and Wikidata. See author-entities.
What is sameAs?
sameAs is a Person or Organization property listing public profile URLs (LinkedIn, GitHub, Wikipedia, Wikidata). The primary entity-resolution signal. See author-entities.
Performance and Core Web Vitals
What are Core Web Vitals (CWV)?
Core Web Vitals (CWV) are Google’s three field-measured performance metrics: LCP, INP, CLS. See core-web-vitals.
What is LCP (Largest Contentful Paint)?
LCP (Largest Contentful Paint) is the time to render the largest above-the-fold element. Target under 2.5 seconds at p75. See lcp.
What is INP (Interaction to Next Paint)?
INP (Interaction to Next Paint) is the time from a user interaction to the next visual response. Replaced FID in 2024. Target under 200ms at p75. See inp.
What is CLS (Cumulative Layout Shift)?
CLS (Cumulative Layout Shift) is a visual stability metric measuring unexpected layout shifts. Target under 0.1 at p75. See cls.
What is TTFB (Time to First Byte)?
TTFB (Time to First Byte) is server response time. Foundational but not a CWV. Target under 800ms. See page-speed.
What is FCP (First Contentful Paint)?
FCP (First Contentful Paint) is the time to render any content on screen. Not a CWV but tracked in CrUX.
What is CrUX (Chrome User Experience Report)?
CrUX (Chrome User Experience Report) is Google’s public field-data dataset for Core Web Vitals. The ground truth for CWV at p75.
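A Python sketch of how a p75 value maps onto the three Core Web Vitals targets listed above. The nearest-rank percentile is an assumption for illustration, not necessarily how CrUX aggregates. The "good" bounds match the targets above, and the "poor" bounds (4000 ms, 500 ms, 0.25) are Google's published thresholds.

```python
import math

# metric: (upper bound for "good", upper bound for "needs improvement")
THRESHOLDS = {
    "LCP": (2500, 4000),  # milliseconds
    "INP": (200, 500),    # milliseconds
    "CLS": (0.1, 0.25),   # unitless score
}


def p75(samples):
    """75th percentile by the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric, value):
    """Bucket a p75 value as good, needs improvement, or poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


lcp_samples_ms = [1200, 1800, 2100, 2600, 3900, 1500, 2000, 2400]
print(p75(lcp_samples_ms), rate("LCP", p75(lcp_samples_ms)))  # 2400 good
```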
What is Lighthouse?
Lighthouse is Google’s lab tool for measuring performance, accessibility, SEO, and best practices. Lab data, not field data; use for development, not ranking decisions.
AI search and agent readability
What is AEO (Answer Engine Optimization)?
AEO (Answer Engine Optimization) is the practice of structuring content so AI engines can extract and cite it. See ai-search-optimization.
What is GEO (Generative Engine Optimization)?
GEO (Generative Engine Optimization) is a synonym for AEO. The newer term that emphasizes generative AI engines (Google AI Overviews, ChatGPT, Perplexity, Claude). See ai-search-optimization.
What is an AI Overview?
An AI Overview is Google’s AI-generated SERP summary that appears above traditional results for many queries. Inclusion depends on entity recognition, citation hygiene, and structured definitions.
What is llms.txt?
llms.txt is a markdown file at the site root listing the canonical pages in a way agents can ingest cheaply. See llms-txt.
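A Python sketch of generating such a file, assuming the layout from the public llms.txt proposal: an H1 title, a blockquote summary, then H2 sections of annotated links. The function name, site name, and URL are placeholders.

```python
def build_llms_txt(title, summary, sections):
    """sections: dict mapping a heading to a list of (name, url, note) tuples."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


print(build_llms_txt(
    "Example Vault",
    "SEO reference pages.",
    {"SEO": [("Glossary", "https://example.com/seo/glossary", "term definitions")]},
))
```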
What is llms-full.txt?
llms-full.txt is a companion to llms.txt containing the full text of priority pages, concatenated for one-shot agent ingestion. See discoverability-files.
What is ai.txt?
ai.txt is a proposed standard declaring AI training opt-in or opt-out at the domain level. See ai-txt.
What are GPTBot, ClaudeBot, PerplexityBot?
GPTBot, ClaudeBot, and PerplexityBot are the crawlers run by OpenAI, Anthropic, and Perplexity respectively. Each operator documents robots.txt support; ai.txt is only a proposal, so do not rely on any crawler reading it.
What is entity SEO?
Entity SEO is the practice of building a recognized entity in the search engine’s knowledge graph (author, brand, product). See author-entities.
What is a knowledge graph?
A knowledge graph is the search engine’s structured database of entities and their relationships. Inclusion drives knowledge-panel SERP features.
Content strategy
What is topical authority?
Topical authority is the depth and breadth of a site’s coverage on a given topic cluster. Built by shipping comprehensive content on a focused subject. See content-clusters.
What is a content cluster?
A content cluster is a pillar page plus supporting cluster pages, all internally linked. The unit of topical authority. See content-clusters.
What is a pillar page?
A pillar page is the hub page of a content cluster. Comprehensive coverage of the parent topic, linking out to cluster pages for sub-topics.
What is a cluster page?
A cluster page is a supporting page within a content cluster. Targets a specific long-tail keyword and links back to the pillar.
What is E-E-A-T?
E-E-A-T stands for Experience, Expertise, Authoritativeness, Trustworthiness. Google’s quality framework used by human raters and modeled by the ranking systems. See e-e-a-t.
What is HCS (Helpful Content System)?
HCS (Helpful Content System) is Google’s classifier that grades whole sites on people-first vs. search-engine-first content. Launched as a site-wide signal in 2022; folded into the core ranking systems in March 2024. See helpful-content-update.
What is a content refresh?
A content refresh is updating an existing page with new facts, new sections, or new framing. Distinct from a date-only update. See content-refresh.
What is YMYL (Your Money, Your Life)?
YMYL (Your Money, Your Life) refers to topics where bad information could harm finances, health, or safety. Held to a higher E-E-A-T bar.
What is helpful content (people-first content)?
Helpful content (people-first content) is Google’s term for content written to answer the searcher’s question rather than to rank. The opposite is search-engine-first content. See helpful-content-update.
Discoverability files
What are discoverability files?
Discoverability files are the collective set of special files a public site ships at the root or under /.well-known/: robots.txt, sitemap.xml, llms.txt, ai.txt, security.txt, humans.txt, manifest.json, IndexNow key, OG default, favicon. See discoverability-files.
What is security.txt?
security.txt is a file at /.well-known/security.txt declaring vulnerability disclosure contacts. RFC 9116.
What is humans.txt?
humans.txt is an informal file at /humans.txt crediting the team behind the site.
What is manifest.json?
manifest.json is a PWA configuration file declaring app name, icons, theme color, display mode.
What is IndexNow?
IndexNow is a protocol for instantly notifying search engines (Bing, Yandex) of content changes. See indexnow.
What is an IndexNow key?
An IndexNow key is a key proving ownership of the domain, served as /<key>.txt. Commonly 32 hexadecimal characters; the protocol accepts 8 to 128.
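A Python sketch of assembling an IndexNow submission body. The field names (host, key, keyLocation, urlList) follow the public IndexNow documentation as understood here; the key rule (8 to 128 characters from letters, digits, and dashes) is likewise an assumption to verify against the current protocol text. Sending the request is left out; the body would be POSTed as JSON to an IndexNow endpoint such as https://api.indexnow.org/indexnow.

```python
import json
import re

# Assumed key rule: 8 to 128 characters drawn from a-z, A-Z, 0-9, and dashes.
KEY_PATTERN = re.compile(r"^[A-Za-z0-9-]{8,128}$")


def indexnow_payload(host, key, urls):
    """Build the JSON body for a bulk IndexNow submission."""
    if not KEY_PATTERN.match(key):
        raise ValueError("key does not match the assumed IndexNow key format")
    return {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }


payload = indexnow_payload(
    "example.com",
    "0123456789abcdef0123456789abcdef",  # placeholder key
    ["https://example.com/seo/glossary"],
)
print(json.dumps(payload, indent=2))
```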
What is an OG image?
An OG image is the og:image meta tag image used in social previews. Standard size 1200x630. See og-images.
SERP features
What is a featured snippet?
A featured snippet is a position-zero result displaying a direct answer above the first organic result. See serp-features.
What is People Also Ask (PAA)?
People Also Ask (PAA) is the expandable question-and-answer accordion in mid-SERP. Generated algorithmically from related queries and topical relevance; FAQPage schema does not trigger it.
What is a knowledge panel?
A knowledge panel is the right-rail entity card displaying structured information about a person, place, or thing. Driven by the knowledge graph.
What are sitelinks?
Sitelinks are the indented sub-links shown under a brand’s main result for branded queries. Algorithmic; cannot be requested directly.
What is an image pack?
An image pack is a horizontal strip of image results within the main SERP. Driven by image SEO.
What is a video carousel?
A video carousel is a horizontal strip of video results, usually from YouTube. VideoObject schema helps self-hosted videos qualify.
What is a local pack?
A local pack is the map plus three-business listing for local queries. Driven by Google Business Profile. See local-seo.
Penalties and recovery
What is a manual action?
A manual action is a penalty applied by a Google human reviewer. Notified in GSC. Requires explicit reconsideration request to clear.
What is an algorithmic penalty?
An algorithmic penalty is a ranking downweight applied by the algorithm, not a human. No notification; detected by traffic patterns.
What is a core update?
A core update is a broad change to the ranking algorithm. Announced by Google when the rollout starts; the rollout can take days or weeks to finish. May reshuffle rankings significantly.
What is a reconsideration request?
A reconsideration request is the form a site submits to ask Google to review a manual action. Required after fixing the underlying issue.
What is a toxic backlink?
A toxic backlink is a link from a low-quality, spammy, or unrelated site. Disavow if numerous; ignore if isolated.
Local SEO
What is Google Business Profile (GBP)?
Google Business Profile (GBP) is the free Google product for managing a business’s local listing. Formerly Google My Business. See local-seo.
What is NAP?
NAP stands for Name, Address, Phone. Must be identical across the site, GBP, and major citation sites.
What is a local pack (Local SEO context)?
A local pack in the Local SEO context is the same SERP feature defined above. See SERP features above.
What is a citation?
A citation is a mention of a business’s NAP on another site (Yelp, Bing Places, industry directories). Citations strengthen local relevance.
What is LocalBusiness schema?
LocalBusiness schema is the JSON-LD type for a physical business. Includes hours, address, geo coordinates, payment types.
Analytics and tooling
What is Google Search Console (GSC)?
Google Search Console (GSC) is Google’s free tool for monitoring a site’s search performance, coverage, and structured data. The primary SEO instrument. See master-google-search-console.
What is Bing Webmaster Tools?
Bing Webmaster Tools is Microsoft’s equivalent for Bing. Always pair with GSC.
What is CTR (Click-Through Rate)?
CTR (Click-Through Rate) is the percentage of SERP impressions that result in a click. The primary metric for title and meta-description testing.
What are impressions?
Impressions are the number of times a page appeared in the SERP for a query. Measured in GSC.
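CTR follows directly from clicks and impressions. A one-function Python sketch with made-up numbers:

```python
def ctr(clicks, impressions):
    """Click-through rate as a percentage; zero impressions yields 0.0."""
    if impressions == 0:
        return 0.0
    return 100.0 * clicks / impressions


# 30 clicks on 1,200 impressions is a 2.5% CTR.
print(ctr(30, 1200))  # 2.5
```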
What is average position?
Average position is the mean SERP position a page held for a query in a given period. Measured in GSC.
What is bounce rate?
Bounce rate is the percentage of sessions that ended without further interaction. Misleading on single-page intent; favor engagement metrics.
What is engagement rate?
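Engagement rate is GA4’s primary engagement metric: the percentage of sessions that lasted longer than 10 seconds, triggered a key event, or had two or more page views. Replaces bounce rate as the headline metric; GA4 bounce rate is its inverse.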
What is a schema validator?
A schema validator is a tool that parses and checks structured-data markup. Google’s Rich Results Test is the canonical one.
What is Lighthouse CI?
Lighthouse CI is continuous-integration automation for Lighthouse. Runs on every build.
What are Screaming Frog and Sitebulb?
Screaming Frog and Sitebulb are two desktop SEO crawlers used for site audits. Either is acceptable for the full audit workflow described in audit-checklist.