Overview
This page is the atomic definition of top-p sampling. Guidance on sampling and inference configuration lives at prompt-design.
Definition
Top-p sampling (nucleus sampling) restricts the token vocabulary considered at each generation step to the smallest set of tokens whose cumulative probability mass meets or exceeds p. At top_p=0.9, the model considers only the tokens that together account for 90% of the probability mass, discarding the long tail of unlikely tokens. The probabilities of the remaining candidates are renormalized and one token is sampled from them.

Top-p and temperature interact: temperature reshapes the distribution before top-p clips the tail. Setting top_p=1.0 disables nucleus sampling (the full vocabulary is available); setting top_p=0.1 produces very conservative output that is close to greedy decoding. Most production configurations tune only one of temperature or top-p and leave the other at its maximum value; mixing both requires understanding their interaction. OpenAI, Anthropic, and Google all expose top_p as an inference parameter.
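The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration over a toy token-to-probability map, not any provider's actual implementation:

```python
import random

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability
    meets or exceeds p, then renormalize. `probs` maps token -> probability."""
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:  # stop once the nucleus covers p
            break
    total = sum(prob for _, prob in nucleus)
    return {token: prob / total for token, prob in nucleus}

def sample(probs, p, rng=random.random):
    """Sample one token from the nucleus-filtered distribution."""
    filtered = top_p_filter(probs, p)
    r, acc = rng(), 0.0
    for token, prob in filtered.items():
        acc += prob
        if r <= acc:
            return token
    return token  # guard against floating-point rounding
```

Note that the renormalization divides by the nucleus total (e.g. 0.92), so the surviving tokens' probabilities sum to 1 again before sampling.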
When it applies
Set top_p below 1.0 to reduce incoherent or off-topic token selection in creative tasks. For deterministic pipelines, set temperature=0 and leave top_p=1.0; greedy decoding at temperature zero is simpler to reason about than nucleus sampling at low p.
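As a concrete illustration, the two configurations above might look like the following. Parameter names follow the common OpenAI-style convention; exact names and accepted ranges vary by provider SDK, so treat these dicts as a sketch rather than a specific API call:

```python
# Tune one knob and leave the other at its maximum (parameter names
# are assumptions based on the common OpenAI-style convention).
creative = {"temperature": 1.0, "top_p": 0.9}     # nucleus sampling on
deterministic = {"temperature": 0, "top_p": 1.0}  # greedy decoding
```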
Example
At top_p=0.9, if the top 3 tokens cover 92% of probability mass, only those 3 are considered. The rare fourth token (2% probability) is excluded even though it might occasionally be correct.
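Run numerically, that example looks like this. The token strings and individual probabilities are made up for illustration; only the "top 3 cover 92%" shape matches the text above:

```python
# Hypothetical ranked next-token distribution; the top 3 cover 92%.
ranked = [("the", 0.50), ("a", 0.30), ("an", 0.12), ("this", 0.02)]
p, nucleus, cum = 0.9, [], 0.0
for token, prob in ranked:
    nucleus.append(token)
    cum += prob
    if cum >= p:  # nucleus now meets the threshold
        break
print(nucleus)  # ['the', 'a', 'an']; 'this' (2%) is never considered
```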
Related concepts
- temperature - the companion sampling parameter that reshapes the distribution.
- token - top-p filters the set of next-token candidates.
- prompt-design - guidance on pairing temperature and top-p for different tasks.
- structured-output - structured outputs pair with low temperature; top-p is less critical.
Citing this term
See Top-p (nucleus sampling) (llmbestpractices.com/glossary/top-p).