LLM Prompt Patterns Cheatsheet

Overview

These patterns are the reusable building blocks for reliable LLM prompts. Each pattern solves a specific failure mode: XML delimiters prevent content injection, few-shot examples outperform abstract instructions, chain-of-thought unlocks multi-step reasoning, role separation keeps behavior consistent across turns, and output schemas eliminate parsing surprises. For the principles behind why each pattern works, see prompt-design.

XML delimiter pattern

Wrap variable content in XML tags so the model distinguishes instructions from data.

Tag convention	Use case
`<document>...</document>`	Reference text the model should read but not follow as instructions.
`<instructions>...</instructions>`	Task instructions separated from context.
`<examples>...</examples>`	Few-shot examples block.
`<input>...</input>`	The specific item to process in this call.
`<output>...</output>`	Where the model should write its answer (in the template).
`<context>...</context>`	Background information; lower priority than instructions.
`<thinking>...</thinking>`	Ask the model to write reasoning here before answering.

You are a copyeditor. Follow the rules in <instructions> and apply them to <input>.

<instructions>
- Correct grammar and spelling.
- Do not change the author's voice.
- Flag any factual claims you cannot verify with [VERIFY].
</instructions>

<input>
{{USER_DOCUMENT}}
</input>

Write the edited document only. No commentary.

Use distinct tag names for each semantic role. Do not reuse the same tag name for different content in the same prompt.

Few-shot template

Provide 2 to 5 input/output examples before the task. Examples override abstract instructions; the model generalizes from them. See few-shot for selection guidance.

Classify each customer message as: REFUND, SHIPPING, TECHNICAL, or OTHER.
Respond with only the category label.

<examples>
<example>
<input>My package hasn't arrived after 3 weeks.</input>
<output>SHIPPING</output>
</example>
<example>
<input>I'd like to return my order for a full refund.</input>
<output>REFUND</output>
</example>
<example>
<input>The app crashes when I tap the export button.</input>
<output>TECHNICAL</output>
</example>
</examples>

<input>
{{CUSTOMER_MESSAGE}}
</input>

Keep examples representative of the full distribution, not just easy cases. Include at least one near-boundary example.

Chain-of-thought

Ask the model to reason before answering. For Claude, the extended thinking feature handles this automatically; for other models, add an explicit reasoning step. See chain-of-thought.

You are a financial analyst. Answer the question after working through the evidence step by step.

<question>{{QUESTION}}</question>
<data>{{FINANCIAL_DATA}}</data>

First, identify the key metrics relevant to the question.
Then reason through what each metric implies.
Finally, state your conclusion.

<thinking>
[Reason here before writing the answer]
</thinking>

<answer>
[Final answer only]
</answer>

For latency-sensitive tasks, chain-of-thought increases token count. Profile before requiring it in production.

System and user role separation

Use the system prompt for persona and persistent rules; use the user turn for task content. This separation is critical for multi-turn reliability. See system-prompts.

Slot	Put here	Do not put here
`system`	Role definition, behavioral rules, output format spec, tone, language.	The user’s specific task.
`user`	The content to process for this specific request.	Behavioral overrides (they should not be overridable).
`assistant` (prefill)	Force a response prefix; constrains format strongly.	Long text blocks.

messages = [
    {
        "role": "system",
        "content": (
            "You are a senior code reviewer. "
            "Review the submitted code for correctness, security, and style. "
            "Respond in JSON matching the schema provided by the user."
        )
    },
    {
        "role": "user",
        "content": f"<code>{user_code}</code>\n\nSchema: {json_schema}"
    }
]

JSON output schema

Request structured output by embedding a schema in the prompt and prefilling the assistant turn with {. See structured-output.

Extract the key entities from the support ticket below.
Return a JSON object matching this schema exactly. No markdown fences.

Schema:
{
  "category": "REFUND | SHIPPING | TECHNICAL | OTHER",
  "urgency": "low | medium | high",
  "product_id": "string or null",
  "summary": "one sentence"
}

Ticket:
<ticket>{{TICKET_TEXT}}</ticket>

Prefill the assistant turn for models that support it:

messages = [
    {"role": "user", "content": prompt},
    {"role": "assistant", "content": "{"}   # forces JSON object start
]

Validate the response with a schema validator (pydantic, zod, ajv) rather than trusting raw output. Retry on validation failure with an error message in the next user turn.

Role-framing patterns

Assign a role when it encodes implicit expert knowledge. See role-framing.

Framing	When to use
`"You are a senior {domain} engineer."`	Technical review, code generation, architecture decisions.
`"You are a copy editor at a major publication."`	Writing, editing, headline scoring.
`"You are a {domain} expert who always cites sources."`	Research tasks where hallucination risk is high.
`"You are a helpful assistant."`	General tasks; add more specificity when quality matters.

Avoid roles that imply problematic behavior even when the task seems to require it. The role shapes the distribution of plausible responses; keep it domain-specific rather than character-specific.

Common gotchas

Prompt injection: user-controlled content in <input> tags can contain pseudo-instructions. Instruct the model to treat everything inside <input> as data only, not commands.
Overly long few-shot example sets push the task instructions out of the model’s attention window. Use 3 to 5 examples; prefer quality over quantity.
Prefilling with {" does not guarantee valid JSON. Always parse and validate the response. Structured-output APIs (with JSON mode or tool-use) are more reliable than prompt-only approaches.
Abstract negative instructions (“don’t be vague”) are less effective than positive examples showing the desired behavior. Show, rather than tell, what the output should look like.
System prompts are not truly secret in most APIs; adversarial users can often extract them via prompt injection or reflection. Do not store API keys or confidential logic there.
Chain-of-thought reasoning inside <thinking> tags can itself be manipulated. For high-stakes decisions, validate conclusions independently of the reasoning trace.

LLM Best Practices

Explorer

LLM Prompt Patterns Cheatsheet

Overview

XML delimiter pattern

Few-shot template

Chain-of-thought

System and user role separation

JSON output schema

Role-framing patterns

Common gotchas

Graph View

Table of Contents

Backlinks

LLM Best Practices

Explorer

LLM Prompt Patterns Cheatsheet

Overview

XML delimiter pattern

Few-shot template

Chain-of-thought

System and user role separation

JSON output schema

Role-framing patterns

Common gotchas

Related

Graph View

Table of Contents

Backlinks