Overview

This page is the atomic definition of vector similarity. Retrieval implementation details live at rag-retrieval and embeddings.

Definition

Vector similarity is a scalar measure of how close two embedding vectors are in the embedding space. The three most common metrics are:

  • Cosine similarity - the cosine of the angle between vectors: 1.0 for identical direction, 0.0 for orthogonal, -1.0 for opposite. It is independent of vector magnitude and is the preferred metric for text embeddings.
  • Dot product - the unnormalized inner product; faster to compute but sensitive to vector magnitude.
  • Euclidean distance (L2 distance) - the straight-line geometric distance; lower values mean more similar.

Most embedding models are trained to maximize cosine similarity between semantically related texts. Vector databases (pgvector, Qdrant, ChromaDB, Pinecone) build approximate nearest-neighbor (ANN) indexes such as HNSW and IVFFlat that return the top-k most similar vectors in sub-linear time. Exact search is O(n) per query; HNSW reduces this to roughly O(log n) at the cost of occasionally missed results.
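The three metrics can be sketched in a few lines of plain Python (the vectors are illustrative; real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between a and b; ignores magnitude.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def dot_product(a, b):
    # Unnormalized inner product; grows with vector magnitude.
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    # Straight-line (L2) distance; lower means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction as a, twice the magnitude

cosine_similarity(a, b)   # 1.0 — identical direction, magnitude ignored
dot_product(a, b)         # 28.0 — inflated by b's larger magnitude
euclidean_distance(a, b)  # ~3.742 — nonzero despite identical direction
```

The example makes the magnitude-sensitivity point concrete: b points the same way as a, so cosine similarity is exactly 1.0, while dot product and Euclidean distance both change with scale.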

When it applies

Use cosine similarity for text embedding search unless the embedding model documentation specifies a different metric. Use approximate search (HNSW) for collections over ~50k vectors. Use exact search for small collections where recall must be perfect.

Example

-- pgvector cosine similarity search:
SELECT id, content, 1 - (embedding <=> $1) AS score
FROM documents
ORDER BY embedding <=> $1
LIMIT 10;
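pgvector's <=> operator returns cosine distance, so the query's 1 - (embedding <=> $1) expression converts it back into a similarity score. A quick Python check of that identity on illustrative vectors:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    # What pgvector's <=> operator computes: 1 minus cosine similarity.
    return 1.0 - cosine_similarity(a, b)

a = [0.3, 0.4]
b = [0.6, 0.8]  # same direction as a
score = 1.0 - cosine_distance(a, b)  # recovers cosine_similarity(a, b)
```

Ordering by embedding <=> $1 ascending (smallest distance first) is therefore equivalent to ordering by similarity descending, which lets the index serve the ORDER BY directly.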

Related terms

  • embedding - the vectors being compared.
  • reranker - the second-stage model that improves on raw similarity rankings.
  • embeddings - embedding model choice affects the quality of similarity scores.
  • rag-retrieval - the retrieval pipeline that runs vector similarity search.
  • chromadb - a vector database that wraps similarity search.

Citing this term

See Vector similarity (llmbestpractices.com/glossary/vector-similarity).