Overview

A prompt is a contract. The system message is the part of the contract that stays in force; the user message is the request under that contract. Write durable rules into the system prompt and put the per-turn ask in the user message. The patterns below apply whether the model is Claude, GPT-4.1, or a local Llama 3 served by ollama.

Put durable instructions in the system prompt

System prompts persist across turns; user messages do not. Anything the model should obey on every turn belongs in the system prompt.

  • Role definition, voice rules, banned constructions: system.
  • Output schema, tool list, safety policy: system.
  • The specific question or task: user.

A system prompt that restates the current task is wasted budget. A user message that restates the role on every turn is brittle; the model will drift if the user forgets to repeat it.
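The split can be sketched in code. This is an illustrative assembly function, not any particular SDK's API; the migration filenames and the `build_messages` helper are made up for the example.

```python
# Durable rules live in one system string; each turn contributes only
# the per-turn ask. Names here are illustrative, not a real SDK's API.

SYSTEM = (
    "You are a SQL reviewer for a Postgres 17 codebase. "
    "Return one of: approve, request_changes, block."
)

def build_messages(history, user_ask):
    """Assemble the message list for one turn.

    The system prompt is prepended once and never restated in the user
    message; history is a list of prior {"role", "content"} dicts.
    """
    return [{"role": "system", "content": SYSTEM}] + history + [
        {"role": "user", "content": user_ask}
    ]

turn_1 = build_messages([], "Review migration 0042_add_index.sql")
# Later turns reuse the same SYSTEM verbatim; only the ask changes.
turn_2 = build_messages(turn_1[1:], "Now review 0043_drop_column.sql")
```

If the user forgets a turn, the rules still hold, because they never lived in the user message to begin with.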

Frame the role explicitly

State what the model is and what it is not. Vague roles produce vague output.

You are a SQL reviewer for a Postgres 17 codebase. You read migrations
and return one of: approve, request_changes, block. You do not write
new SQL. You do not comment on style outside the rules in section 2.

Two lines of explicit framing beat a paragraph of “be helpful.” Name the domain, the input, the output type, and the non-goals.
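Those four things can be made mechanical. The helper below is hypothetical, a sketch of rendering the role block from its named parts rather than prose anyone actually ships.

```python
def frame_role(domain, inputs, outputs, non_goals):
    """Hypothetical helper: render an explicit role block.

    Each argument maps to one of the four things worth naming:
    the domain, the input, the output type, and the non-goals.
    """
    lines = [
        f"You are a {domain}.",
        f"You read {inputs} and return {outputs}.",
    ]
    lines += [f"You do not {ng}." for ng in non_goals]
    return "\n".join(lines)

role = frame_role(
    domain="SQL reviewer for a Postgres 17 codebase",
    inputs="migrations",
    outputs="one of: approve, request_changes, block",
    non_goals=[
        "write new SQL",
        "comment on style outside the rules in section 2",
    ],
)
```

The output is the reviewer framing shown above, reproducible from four named fields.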

Use few-shot examples only when the task is hard to describe

Examples are expensive context. Use them when the rule is easier to demonstrate than to state, not as decoration.

  • Good case for few-shot: classifying user intents into ten categories, where the boundary between two categories is subtle.
  • Bad case: “write a summary.” The rule is statable; the example just doubles the token count.

When you do include examples, show three to five, cover the edge cases, and label inputs and outputs unambiguously.
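One way to keep labels unambiguous is to encode each example as a user/assistant pair with fixed prefixes. The intent labels and example texts below are invented for illustration; the point is the structure.

```python
# Few-shot assembly for intent classification, where a subtle boundary
# (here: "refund" vs "cancel") is easier to show than to state.
# Example texts and labels are illustrative.

EXAMPLES = [
    ("I want my money back for last month", "refund"),
    ("Stop billing me going forward", "cancel"),
    ("I was charged twice, fix it", "refund"),
]

def few_shot_messages(examples, query):
    """One user/assistant pair per example, then the real query.

    Fixed "Input:" / "Intent:" prefixes label each side unambiguously.
    """
    msgs = []
    for text, label in examples:
        msgs.append({"role": "user", "content": f"Input: {text}"})
        msgs.append({"role": "assistant", "content": f"Intent: {label}"})
    msgs.append({"role": "user", "content": f"Input: {query}"})
    return msgs

msgs = few_shot_messages(EXAMPLES, "Please end my subscription")
```

Three examples, six messages, one live query. Every added example should earn its tokens by covering an edge case the others do not.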

Return structured output via JSON Schema or tool use

Prose is for humans. Anything a program will consume should be JSON, validated against a schema. Use the model’s tool-use or structured-output mode; do not parse free-form prose with regex.

{
  "type": "object",
  "required": ["verdict", "reasons"],
  "properties": {
    "verdict": { "enum": ["approve", "request_changes", "block"] },
    "reasons": { "type": "array", "items": { "type": "string" } }
  }
}

A schema gives you a validation step. If parsing fails, you retry or fall back, instead of shipping malformed data downstream.
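The validation step for the verdict schema above can be sketched with the standard library alone. A real system would use a JSON Schema library; this hand-rolled check is a minimal stand-in that returns `None` so the caller can retry or fall back.

```python
import json

REQUIRED = {"verdict", "reasons"}
VERDICTS = {"approve", "request_changes", "block"}

def parse_verdict(raw):
    """Validate one model response against the verdict schema.

    Returns the parsed dict on success, None on any failure, so the
    caller can retry or fall back instead of shipping malformed data.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not REQUIRED <= set(data):
        return None
    if data["verdict"] not in VERDICTS:
        return None
    reasons = data["reasons"]
    if not isinstance(reasons, list):
        return None
    if not all(isinstance(r, str) for r in reasons):
        return None
    return data
```

Malformed JSON, a missing key, an out-of-enum verdict, and a non-array `reasons` all fail the same way, which keeps the retry logic in one place.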

Anchor with concrete formats

When the output is a document, show the exact format. Headings, field order, length bounds.

Return a markdown block with:
- H2 "Summary" (one sentence)
- H2 "Findings" (bulleted, 3 to 7 items)
- H2 "Next steps" (numbered, max 5)

Concrete formats reduce variance more than adjectives. “Be concise” is weak; “max 5 items, max 12 words per item” is enforceable.
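Enforceable means checkable. A sketch of checking the “max 5 items, max 12 words per item” rule against a parsed list, with invented defaults matching that quote:

```python
def enforce_limits(items, max_items=5, max_words=12):
    """Check the enforceable version of "be concise".

    Returns a list of violation strings; empty means the output passes.
    """
    problems = []
    if len(items) > max_items:
        problems.append(f"{len(items)} items > {max_items}")
    for i, item in enumerate(items):
        words = len(item.split())
        if words > max_words:
            problems.append(f"item {i}: {words} words > {max_words}")
    return problems
```

“Be concise” cannot fail a test. This can, and a failed check is a retry trigger, the same as a schema violation.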

Repeat critical instructions at the end

Models show a recency bias over long prompts: they attend more to the last instructions they saw. Do not bury the load-bearing rule on line three of a 200-line system prompt. Restate it at the end of the system message, or at the end of the user message before the model responds.

This applies double in multi-turn agent loops where prior turns push the system prompt further from the active context. See multi-agent for protocols that handle this in worker handoffs.
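The restatement can be mechanical rather than remembered. A sketch, assuming the critical rule already appears once in the body; the restatement is deliberate duplication, not a substitute:

```python
def with_restated_rule(system_body, critical_rule):
    """Append the load-bearing rule to the end of a long system prompt,
    so it is the last instruction the model sees before responding.

    `critical_rule` should already appear once in `system_body`.
    """
    return system_body.rstrip() + "\n\nCritical: " + critical_rule

prompt = with_restated_rule(
    "You are a SQL reviewer.\n... 200 lines of rules ...\n"
    "You do not write new SQL.",
    "You do not write new SQL.",
)
```

Doing this in the assembly code, not by hand, means the rule survives every edit to the middle of the prompt.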

State non-goals explicitly

Say what the model should not do. “Do not refactor unrelated code.” “Do not call tools other than search and read.” “Do not add comments to the diff.”

A prompt without non-goals invites helpful overreach. The same rule shows up in the claude-code brief pattern: explicit non-goals stop the agent from “fixing” things on the side.
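Non-goals that name tools can also be enforced outside the prompt. A sketch of backing “do not call tools other than search and read” with a hard check in the agent loop; the guard function and its error type are illustrative:

```python
ALLOWED_TOOLS = {"search", "read"}

def guard_tool_call(name):
    """Hard-enforce the prompt-level non-goal in the agent loop.

    The prompt states the rule; the loop enforces it. A model that
    "helpfully" reaches for another tool gets an error, not a result.
    """
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is outside the allowlist")
    return name
```

Stating the non-goal in the prompt reduces how often the guard fires; the guard guarantees what happens when it does.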