Definition
Knowledge distillation is a training technique where a smaller student model is trained to match the behavior of a larger teacher model. The teacher generates outputs (soft labels, logit distributions, or full completions) on a dataset; the student is trained to reproduce them. The result is a compact model that retains most of the teacher’s quality on the target task at a fraction of the inference cost.
Two main forms:
- Output distillation (black-box): the teacher generates text completions on a prompt set, and the student fine-tunes on those completions to match the teacher's outputs. Because only generated text is needed, this works with API-only access to the teacher.
- Logit distillation (white-box): the student minimizes the KL divergence between its token probability distribution and the teacher's. This requires access to the teacher's logits but is more data-efficient than output distillation (a minimal loss sketch follows this list).
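For the white-box case, the core loss is a KL divergence between the teacher's and student's per-token distributions, usually softened by a temperature. Below is a minimal PyTorch-style sketch, not a prescribed recipe: the tensor shapes, the temperature value, and the T² scaling convention are illustrative assumptions.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Both logits tensors are assumed to have shape [batch, seq_len, vocab]
    # and to be aligned token-for-token on the same input sequence.
    vocab = student_logits.size(-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1).view(-1, vocab)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1).view(-1, vocab)
    # KL(teacher || student), averaged over all token positions.
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return kl * (temperature ** 2)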
Distillation differs from fine-tuning in that the training signal is machine-generated from a stronger model rather than human-labeled. It is also distinct from quantization (which compresses weights without retraining).
When it applies
Use distillation when you need a smaller, faster, cheaper model for a specific task and a strong teacher model is available. Collect teacher outputs on a representative dataset (1,000 to 100,000+ examples). Evaluate the student against the teacher on a held-out test set using an evaluation harness. Distillation works best when the task is well-defined and the teacher’s outputs are high quality.
Do not expect the student to generalize beyond the teacher’s distribution; distillation is task-specific.
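To check how much quality the student gives up, score both models on the same held-out prompts with your task metric and compare the averages. A hedged sketch, where teacher_generate, student_generate, score, and the data loading are placeholders for your own evaluation harness rather than any real API:

def compare_on_held_out(held_out_prompts, reference_answers):
    # Score teacher and student on identical inputs so the comparison is fair.
    teacher_scores, student_scores = [], []
    for prompt, reference in zip(held_out_prompts, reference_answers):
        teacher_scores.append(score(teacher_generate(prompt), reference))
        student_scores.append(score(student_generate(prompt), reference))
    teacher_avg = sum(teacher_scores) / len(teacher_scores)
    student_avg = sum(student_scores) / len(student_scores)
    # Compare the gap against the quality-degradation budget set by your golden set.
    print(f"teacher: {teacher_avg:.3f}  student: {student_avg:.3f}  "
          f"retained: {student_avg / teacher_avg:.1%}")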
Example
import anthropic

client = anthropic.Anthropic()

prompts = load_training_prompts()  # your task dataset
teacher_examples = []

for prompt in prompts:
    response = client.messages.create(
        model="claude-opus-4-5",  # teacher
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}]
    )
    teacher_examples.append({
        "prompt": prompt,
        "completion": response.content[0].text
    })

save_jsonl(teacher_examples, "distillation_train.jsonl")

Related concepts
- fine-tuning - distillation uses machine-generated labels; fine-tuning uses human-labeled data. Both update model weights.
- evaluation-harness - measure whether the student matches the teacher on the target task.
- golden-set - use a golden set to cap acceptable quality degradation from teacher to student.
- completion - distillation uses completions from the teacher as training targets.
- prompt-design - prompting the teacher to produce high-quality training examples is critical.
Citing this term
See Distillation (llmbestpractices.com/glossary/distillation).