LLM Best Practices
Tag: evaluation
13 items with this tag.
Jun 15, 2026
How to build reliable AI agents in production
ai-agents
production
reliability
evaluation
observability
Jun 15, 2026
Eval Set
glossary
ai-agents
llm
evaluation
testing
metrics
Jun 15, 2026
Evaluation Harness
glossary
ai-agents
evaluation-harness
evaluation
testing
llm
Jun 15, 2026
Ground Truth
glossary
ai-agents
llm
evaluation
datasets
metrics
Jun 15, 2026
Hallucination Rate
glossary
ai-agents
llm
evaluation
reliability
metrics
Jun 15, 2026
LLM evaluation and testing in production
ops
llmops
evaluation
testing
observability
Jun 15, 2026
LLMOps best practices
ops
llmops
mlops
evaluation
observability
best-practices
May 29, 2026
Embeddings: Evaluation
ai-agents
embeddings
evaluation
mrr
recall
ndcg
golden-set
benchmarks
May 21, 2026
Agent Evaluation
ai-agents
evaluation
testing
metrics
May 21, 2026
Golden set
glossary
ai-agents
evaluation
golden-set
metrics
May 21, 2026
LLM-as-judge
glossary
ai-agents
evaluation
llm-as-judge
metrics
May 14, 2026
Hallucination
glossary
ai-agents
hallucination
evaluation
grounding
May 14, 2026
RAG: Evaluation
ai-agents
rag
evaluation
metrics
testing