Overview
JSON-LD is the structured-data format every major search engine reads, and it is the one feature that earns rich results without rewriting the page. This guide picks the right schema.org types for a typical content site, generates the JSON-LD from page frontmatter at build time, injects it into the document <head>, and validates it before shipping. The umbrella rules live in structured-data.
Prerequisites
- A static site generator with a templating step (Astro, Next.js static export, Quartz, Hugo, Eleventy). The page
<head>must be controllable from a template. - Pages with YAML frontmatter that includes at minimum
title,slug,summary,datePublished(orlast_updated), and anauthorfield. - A canonical URL strategy. JSON-LD URLs must match the rendered canonical exactly.
Steps
1. Pick the schema types for each page kind
Most content sites need a small fixed set. Add specialty types only when the page truly matches; see schema-markup-deep for the long menu.
WebSiteonce, on the homepage, withSearchActionif the site has internal search.Organizationonce, on the homepage or/about, with logo andsameAs.BreadcrumbListon every page below the homepage.ArticleorBlogPostingon editorial pages.FAQPageonly when the page is genuinely Q&A.
Map each page template to one entity type. Do not stack five types on one page.
2. Write the JSON-LD generator
Generate the JSON from frontmatter at build time so the markup and the rendered content cannot drift. In a Quartz or Astro project, a small helper takes a page object and returns the JSON-LD block.
// scripts/jsonld.ts
export function articleJsonLd(page: Page, site: SiteConfig) {
return {
"@context": "https://schema.org",
"@type": "BlogPosting",
"headline": page.title,
"description": page.summary,
"author": { "@type": "Person", "name": page.author },
"datePublished": page.datePublished,
"dateModified": page.last_updated,
"image": `${site.baseUrl}/og/${page.slug}.png`,
"mainEntityOfPage": `${site.baseUrl}/${page.path}`,
}
}Keep one helper per entity type. Compose them in the template, not in the helper.
3. Inject into <head> server-rendered
The JSON-LD must be in the HTML on first byte. Do not inject from client JS; many crawlers do not run scripts.
<head>
<link rel="canonical" href="https://example.com/seo/technical" />
<script type="application/ld+json">
{{ jsonLdBlock }}
</script>
</head>One <script type="application/ld+json"> per entity. Quartz users: add the block from a Head component or via a custom emitter; do not paste raw HTML into markdown.
4. Match URLs across canonical, OG, and JSON-LD
The cardinal pitfall. The same canonical URL must appear in three places: <link rel="canonical">, og:url, and mainEntityOfPage (or url) in the JSON-LD. Mismatches confuse crawlers and disqualify rich results.
<link rel="canonical" href="https://example.com/foo/bar" />
<meta property="og:url" content="https://example.com/foo/bar" />
<!-- and inside JSON-LD: "mainEntityOfPage": "https://example.com/foo/bar" -->Generate all three from the same canonicalUrl variable.
5. Validate every change
Run two validators on every page kind.
- Google Rich Results Test:
https://search.google.com/test/rich-results. Tests rich-result eligibility per page. - schema.org validator:
https://validator.schema.org. Confirms the JSON parses and conforms.
Wire the validation into a pre-deploy script. The repo scripts/validate-jsonld.mjs here pulls every URL in sitemap.xml, extracts the <script type="application/ld+json"> block, and asserts required fields exist.
Verify it worked
Three checks confirm the markup is live and valid.
# 1. The JSON-LD block is in the rendered HTML on first byte.
curl -s https://yourdomain.com/foo/bar | grep -o 'application/ld+json'
# expected: application/ld+json (one or more matches)
# 2. The block parses as JSON.
curl -s https://yourdomain.com/foo/bar | \
python3 -c "import sys, re, json; m = re.search(r'<script[^>]*application/ld\+json[^>]*>(.+?)</script>', sys.stdin.read(), re.S); json.loads(m.group(1))"
# 3. The Rich Results Test reports the type without errors.
# Open: https://search.google.com/test/rich-results?url=https%3A%2F%2Fyourdomain.com%2Ffoo%2FbarAll three green means the markup is shippable.
Common errors
- The Rich Results Test reports “Page is not eligible for rich results” with no errors listed. A required field is missing. Compare against the type’s docs (for
BlogPosting,headline,author,datePublishedare required). - JSON-LD references content not visible on the page. Crawlers treat this as spam. Markup must match what a human sees; see structured-data.
- The canonical URL in JSON-LD has a trailing slash and the rendered URL does not. Normalize to one form site-wide.
- The script is injected from React or Vue on the client. Crawlers may not see it. Render it server-side or pre-render.
- Two
Articleblocks on the same page. Use one entity per type per page; collapse to a single block.