Overview
This page gives the atomic definition of the hallucination rate metric. The underlying phenomenon is defined under hallucination.
Definition
Hallucination rate is a quantitative metric that measures how often LLM outputs contain factually incorrect or unsupported statements over a defined evaluation set. It is computed as: (number of outputs with at least one hallucination) / (total outputs evaluated). The metric requires a reference: either a ground-truth dataset with verified correct answers, or human annotators rating each claim. Automated evaluation using llm-as-judge systems (one LLM critiquing another) scales better than human annotation but introduces its own biases. Hallucination rate is task-specific: a model may have a near-zero hallucination rate on common-knowledge questions and 30% on niche domain questions. Retrieval-augmented generation (rag) reduces hallucination rate on retrievable facts but can introduce grounding errors when retrieved context is irrelevant or contradictory. Tracking hallucination rate across model versions and prompt changes is a prerequisite for safe production LLM deployment.
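The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `EvaluatedOutput` record, its `task` label, and the per-claim boolean flags are hypothetical names chosen for the example, and the flags are assumed to come from a ground-truth comparison or from annotators. The per-task breakdown reflects the point that the rate is task-specific.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EvaluatedOutput:
    # Hypothetical record for one model output in the eval-set.
    task: str                           # assumed task label, e.g. "common" or "niche"
    claim_is_hallucinated: list[bool]   # one flag per claim, set by the reference check


def hallucination_rate(outputs: list[EvaluatedOutput]) -> float:
    """(outputs with at least one hallucinated claim) / (total outputs evaluated)."""
    if not outputs:
        raise ValueError("evaluation set is empty")
    flagged = sum(1 for o in outputs if any(o.claim_is_hallucinated))
    return flagged / len(outputs)


def rate_by_task(outputs: list[EvaluatedOutput]) -> dict[str, float]:
    """Hallucination rate computed separately for each task label."""
    groups: dict[str, list[EvaluatedOutput]] = defaultdict(list)
    for o in outputs:
        groups[o.task].append(o)
    return {task: hallucination_rate(group) for task, group in groups.items()}
```

Note that an output with several incorrect claims still counts once, because the metric is defined per output rather than per claim.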
When it applies
Measure hallucination rate before deploying any LLM system where incorrect facts cause harm: medical, legal, financial, or safety-critical applications. Monitor it in production using sampling and spot-check annotation. Set an acceptable threshold (e.g., under 2%) and block model or prompt upgrades that exceed it.
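A rough sketch of the two practices described here, production sampling and a promotion gate, might look like the following. The function names, the default sample seed, and the 2% default are illustrative assumptions. Treating a rate exactly equal to the threshold as passing is also an assumption; a team may prefer a strict comparison.

```python
import random


def sample_for_review(logged_outputs: list[str], k: int, seed: int = 0) -> list[str]:
    """Draw a reproducible random sample of production outputs for spot-check annotation."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn for audits
    return rng.sample(logged_outputs, min(k, len(logged_outputs)))


def passes_gate(hallucinated: int, total: int, threshold: float = 0.02) -> bool:
    """Return True when the measured rate does not exceed the threshold.

    Assumption: a rate exactly at the threshold passes; only rates above it block.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= hallucinated <= total:
        raise ValueError("hallucinated must be between 0 and total")
    return hallucinated / total <= threshold
```

In a release pipeline, a check like `passes_gate` would run on the annotated eval-set results for each candidate model or prompt, and a `False` result would stop the upgrade.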
Example
A model is evaluated on an eval-set of 500 drug-interaction questions against a pharmacist-verified answer key. It answers 487 correctly and includes one or more incorrect claims in the other 13 answers. Hallucination rate = 13/500 = 2.6%. The threshold is 2%; the model fails and is not promoted to production.
Related concepts
- hallucination - the underlying phenomenon; hallucination rate quantifies it.
- ground-truth - the verified reference required to detect hallucinations.
- eval-set - the test set over which hallucination rate is measured.
- rag - retrieval reduces hallucination rate on knowledge-intensive tasks.
- llm-as-judge - automated scoring scales hallucination rate measurement.
Citing this term
See Hallucination Rate (llmbestpractices.com/glossary/hallucination-rate).