Overview
Site architecture is the shape of the URL graph and the link graph laid on top of it. A flat, siloed architecture concentrates ranking signal on the pages that should rank; a deep, sprawling one dilutes signal across orphan leaves no crawler will spend budget on. Decide the shape before the first content page ships, because retrofits cost redirects.
Keep priority pages within three clicks of home
The home page is usually the highest-authority node on the site. As a rule of thumb, each click away from home sharply reduces the link equity reaching the next page, because that equity is split across every outbound link along the path. A page buried five clicks deep ranks like the orphan it functionally is.
- Money pages, top-of-funnel guides, and category MOCs live at depth 1 or 2.
- Reference detail pages live at depth 2 or 3.
- Anything deeper than 3 is a candidate for promotion, merger, or removal.
Measure depth with a crawler that records the shortest path from the home URL. Screaming Frog and Sitebulb both report a crawl-depth column. The open-source wget --spider --recursive can walk the site, but it does not report click depth, so depth has to be computed separately. See crawl-budget for the budget side of the same problem.
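Where no crawler report is available, depth can be derived from an exported link graph. The following is a minimal sketch, assuming the graph has already been collected as a dictionary of page to outbound links; the function name `click_depths` and the sample URLs are illustrative, not part of any tool named above.

```python
from collections import deque

def click_depths(links, home="/"):
    """Return {url: shortest click distance from home} using breadth-first search."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # the first visit in BFS is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical link graph: page -> pages it links to.
links = {
    "/": ["/seo/", "/howto/"],
    "/seo/": ["/seo/technical", "/seo/internal-linking"],
    "/seo/technical": ["/seo/technical/log-files"],
    "/seo/technical/log-files": ["/seo/technical/log-files/sampling"],
}

depths = click_depths(links)
# Anything deeper than 3 is a candidate for promotion, merger, or removal.
too_deep = sorted(url for url, depth in depths.items() if depth > 3)
print(too_deep)
```

Pages that never appear in `depths` are unreachable from home by links alone, which makes them orphans rather than deep pages.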
Map URL folders to topic clusters one-to-one
A category folder in the URL is a public commitment to a topic cluster. /seo/ should contain SEO pages and nothing else; /howto/ should contain step-by-step tutorials and nothing else. Mixing topics inside one folder confuses both the classifier and the reader.
/seo/technical
/seo/structured-data
/seo/internal-linking
/howto/audit-site-for-core-web-vitals
/howto/deploy-quartz-site
The folder is the silo. The MOC (index.md in each folder) is the silo entry point. See content-clusters for the pillar-cluster pattern that fills the silo, and index for the on-disk layout that mirrors the URL.
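The one-folder-one-topic rule can be checked mechanically. This sketch assumes each page carries a topic label (for example from a frontmatter field, which is an assumption here) and flags folders holding more than one topic; `top_folder` and `mixed_folders` are hypothetical helper names.

```python
from collections import defaultdict
from urllib.parse import urlparse

def top_folder(url):
    """Return the first path segment, e.g. '/seo/technical' -> 'seo'."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[0] if parts else ""

def mixed_folders(page_topics):
    """Map each folder to its topics and keep only folders with more than one."""
    topics_by_folder = defaultdict(set)
    for url, topic in page_topics.items():
        topics_by_folder[top_folder(url)].add(topic)
    return {f: t for f, t in topics_by_folder.items() if len(t) > 1}

# Hypothetical inventory: URL -> topic label assigned by the author.
page_topics = {
    "/seo/technical": "seo",
    "/seo/structured-data": "seo",
    "/seo/deploy-quartz-site": "howto",  # a tutorial filed in the wrong silo
    "/howto/audit-site-for-core-web-vitals": "howto",
}

mixed = mixed_folders(page_topics)
print(mixed)
```

A non-empty result lists the folders where a page should be moved to the silo that matches its topic.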
Use short kebab-case noun-phrase slugs
A slug is a UX surface; users see it, share it, and type it. Keep it short, descriptive, and stable.
- Kebab-case: meta-descriptions, not metaDescriptions or meta_descriptions.
- Noun phrase, not a sentence: audit-checklist, not how-to-audit-your-site.
- No stop words unless they change meaning: internal-linking, not the-internal-linking-guide.
- No dates, no version numbers, no author initials. The slug is permanent; the body is mutable.
- Match the slug to the H1 intent. Drift between the two is a common trigger for Google rewriting the title it shows in search results.
Once a slug is published, treat it as an API. Renaming a slug requires a 301 redirect from the old URL; without one, every external link to the old URL breaks and stays broken until someone updates it.
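The slug rules above lend themselves to a pre-publish check. This is a rough sketch only: the stop-word list, the four-word limit, and the function name `slug_problems` are assumptions chosen for illustration, and a flagged stop word still needs a human to decide whether it changes meaning.

```python
import re

KEBAB = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
STOP_WORDS = {"a", "an", "the", "to", "of", "your"}  # illustrative, not exhaustive
YEAR = re.compile(r"^(?:19|20)\d{2}$")
VERSION = re.compile(r"^v\d+$")

def slug_problems(slug, max_words=4):
    """Return a list of rule violations for a slug; an empty list means it passes."""
    if not KEBAB.match(slug):
        return ["not kebab-case"]
    problems = []
    words = slug.split("-")
    if len(words) > max_words:
        problems.append("too long")
    if any(word in STOP_WORDS for word in words):
        problems.append("stop word")
    if any(YEAR.match(word) or VERSION.match(word) for word in words):
        problems.append("date or version")
    return problems

for candidate in ["meta-descriptions", "metaDescriptions", "how-to-audit-your-site"]:
    print(candidate, slug_problems(candidate))
```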
Render breadcrumb schema on every page below root
Breadcrumbs serve two readers: the human who wants to navigate up, and the crawler that uses the BreadcrumbList JSON-LD to label the page’s place in the hierarchy. Render both.
<nav aria-label="Breadcrumb">
<a href="/">Home</a> /
<a href="/seo/">SEO</a> /
<span>Site architecture</span>
</nav>
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{ "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/" },
{ "@type": "ListItem", "position": 2, "name": "SEO", "item": "https://example.com/seo/" },
{ "@type": "ListItem", "position": 3, "name": "Site architecture" }
]
}
The visible breadcrumb labels must match the JSON-LD name values exactly. See structured-data for the full schema rules.
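One way to keep the visible labels and the JSON-LD names in sync is to render both from the same trail. The sketch below assumes a trail of (label, path) pairs and a site base URL of https://example.com; the function names are hypothetical, and the current page is emitted without an "item" value to mirror the example above.

```python
import json
from html import escape

def breadcrumb_jsonld(trail, base="https://example.com"):
    """Build a BreadcrumbList dict from (label, path) pairs, home first."""
    items = []
    for position, (label, path) in enumerate(trail, start=1):
        entry = {"@type": "ListItem", "position": position, "name": label}
        if position < len(trail):  # the current page carries no "item", as in the example
            entry["item"] = base + path
        items.append(entry)
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }

def breadcrumb_html(trail):
    """Render the visible crumb from the same trail so the labels cannot drift."""
    parts = [f'<a href="{escape(path)}">{escape(label)}</a>' for label, path in trail[:-1]]
    parts.append(f"<span>{escape(trail[-1][0])}</span>")
    return '<nav aria-label="Breadcrumb">' + " / ".join(parts) + "</nav>"

trail = [("Home", "/"), ("SEO", "/seo/"), ("Site architecture", "/seo/site-architecture")]
data = breadcrumb_jsonld(trail)
markup = breadcrumb_html(trail)
print(json.dumps(data, indent=2))
print(markup)
```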
Audit for orphan pages every quarter
An orphan page is any URL with zero internal inbound links. Crawlers find it only via the sitemap, and rankings reflect that. Run a site crawl, compare the full URL set (sitemap plus crawl) against the set of URLs that receive at least one internal link, and treat the difference as the orphan list.
- Link every orphan from at least one MOC and one peer page.
- If no peer page makes sense, the orphan is off-topic; remove or move it.
- Re-run the crawl after the fix and confirm zero orphans.
The same crawl produces the depth report; do both passes in one job.
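The orphan pass reduces to a set difference. A minimal sketch, assuming the known URLs come from the sitemap and the link graph from the crawl; `find_orphans` and the sample data are illustrative.

```python
def find_orphans(known_urls, links, home="/"):
    """Return known URLs that receive no internal link from any other page."""
    linked = set()
    for source, targets in links.items():
        # A page linking to itself does not count as an inbound link.
        linked.update(target for target in targets if target != source)
    return sorted(set(known_urls) - linked - {home})

# Hypothetical inputs: sitemap URLs and the crawled link graph.
sitemap_urls = ["/", "/seo/", "/seo/technical", "/seo/old-checklist"]
links = {
    "/": ["/seo/"],
    "/seo/": ["/seo/technical"],
    "/seo/old-checklist": ["/seo/old-checklist", "/seo/"],
}

orphans = find_orphans(sitemap_urls, links)
print(orphans)
```

The home page is excluded because it is the crawl entry point and needs no inbound link to be discovered.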
Avoid dead-end leaf pages
Every page should link out to at least two siblings and one parent. A page that only links to itself drops the reader and the crawler into a cul-de-sac. Use the related frontmatter field and inline wikilinks to enforce this; see internal-linking for the link-anchor rules.
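The two-siblings-and-one-parent rule can be tested against the same link graph. This sketch assumes the parent is the folder index implied by the URL path and that siblings share that parent; `parent_of` and `dead_ends` are hypothetical helpers, and sites with a different layout would need another definition of parent.

```python
import posixpath

def parent_of(url):
    """Return the folder index above a URL, e.g. '/seo/technical' -> '/seo/'."""
    head = posixpath.dirname(url.rstrip("/"))
    return head if head.endswith("/") else head + "/"

def dead_ends(links, min_siblings=2):
    """Return pages that lack a parent link or have too few sibling links."""
    flagged = []
    for page, targets in links.items():
        outbound = set(targets) - {page}  # self-links do not count
        parent = parent_of(page)
        siblings = {t for t in outbound if t != parent and parent_of(t) == parent}
        if parent not in outbound or len(siblings) < min_siblings:
            flagged.append(page)
    return sorted(flagged)

# Hypothetical link graph: page -> pages it links to.
links = {
    "/seo/technical": ["/seo/", "/seo/structured-data", "/seo/internal-linking"],
    "/seo/structured-data": ["/seo/structured-data"],
    "/seo/internal-linking": ["/seo/", "/seo/technical"],
}

flagged = dead_ends(links)
print(flagged)
```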
Common errors
- Deep nesting that buries priority pages. /category/subcategory/sub-subcategory/page reads tidy on disk and ranks like an orphan. Flatten to two segments.
- Mismatch between URL slug and breadcrumb label. /seo/site-architecture rendering as “Site Architecture Best Practices” in the crumb confuses the classifier. Match exactly.
- Orphan MOCs. A folder index that no other page links to is itself an orphan. Link every MOC from the master index.
- Renaming slugs without 301s. Every external backlink to the old URL becomes a 404. Always redirect; see title-tags for the related title-stability rule.
- One H1 per page violated by the breadcrumb. The crumb is a <nav>, not a heading; never wrap it in <h1>.