Overview
Two embedding models dominate hosted English retrieval in 2026: Voyage 3 Large from Voyage AI and text-embedding-3-large from OpenAI. Both support Matryoshka truncation, post top MTEB scores, and offer cost-optimized tiers. The choice is not about which headline benchmark is higher. It is about which one performs better on your corpus, at a price your pipeline can afford. For evaluation methodology, see embeddings-eval.
Benchmark on your data, not theirs
Public benchmarks like MTEB measure aggregate performance across many domains. The delta between the two models on your specific corpus is often larger than the gap between their MTEB scores. Build a golden set of 200 to 500 queries, embed it with both models at the same dimension, and measure recall@5 or MRR@10 before committing.
# Quick bake-off template
import voyageai
from openai import OpenAI

voyage = voyageai.Client()  # reads VOYAGE_API_KEY from the environment
oai = OpenAI()              # reads OPENAI_API_KEY from the environment

def embed_voyage(texts):
    # Request 1024 dimensions so both models are compared at the same width
    return voyage.embed(texts, model="voyage-3-large", output_dimension=1024).embeddings

def embed_openai(texts):
    resp = oai.embeddings.create(input=texts, model="text-embedding-3-large", dimensions=1024)
    return [r.embedding for r in resp.data]

Run both against the same index structure and identical retrieval code. Domain-specific corpora (legal, biomedical, code-heavy) often flip the model ranking versus general benchmarks.
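To turn the two embedding functions into a decision, score both with identical metric code. A minimal scoring sketch, assuming a golden set of (query, relevant_doc_id) pairs and a search(query_vec, k) function from your own retrieval stack; both names are placeholders, not library calls:

def score(golden_set, embed_fn, search, k_recall=5, k_mrr=10):
    # Compute recall@5 and MRR@10 for one embedding model over the golden set
    hits, rr_sum = 0, 0.0
    for query, relevant_id in golden_set:
        ranked = search(embed_fn([query])[0], k=k_mrr)  # top-k doc ids, best first
        if relevant_id in ranked[:k_recall]:
            hits += 1
        if relevant_id in ranked:
            rr_sum += 1.0 / (ranked.index(relevant_id) + 1)
    n = len(golden_set)
    return {"recall@5": hits / n, "mrr@10": rr_sum / n}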
Voyage 3 Large wins English and code retrieval
On English prose, technical documentation, and mixed code-plus-text corpora, Voyage 3 Large consistently outperforms text-embedding-3-large at equivalent dimensions. Voyage's models are purpose-built for retrieval; the training mix and loss functions are tuned for semantic search rather than general language modeling.
Use Voyage 3 Large when the corpus is English-dominant, when code snippets appear alongside prose, and when retrieval quality matters more than per-token price. The quality gain is especially pronounced on short queries against long documents.
OpenAI wins multilingual coverage
text-embedding-3-large covers over 100 languages and is backed by OpenAI’s multilingual training corpus. Voyage 3 Large lags on non-English retrieval. For multilingual corpora, use text-embedding-3-large or switch to Voyage Multilingual-2 or BGE-M3.
The multilingual gap matters even for nominally English products. User queries in product search often include names, addresses, and brand terms that originate in other scripts. Test with a real query sample before assuming English-only coverage is sufficient.
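One quick sanity check is to estimate what share of a real query log contains non-Latin letters. A sketch using only the standard library; non_latin_share is a hypothetical helper, not an existing API:

import unicodedata

def non_latin_share(queries):
    # Fraction of queries with at least one alphabetic character outside the Latin script
    def has_non_latin(q):
        return any(ch.isalpha() and "LATIN" not in unicodedata.name(ch, "") for ch in q)
    return sum(has_non_latin(q) for q in queries) / len(queries)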
Cost: OpenAI has a cheaper small-tier option
At the time of writing:
- text-embedding-3-small: lowest cost, 1536 max dim, solid quality for many tasks.
- text-embedding-3-large: 3072 max dim, roughly 13x the cost of -small.
- Voyage 3 (standard): mid-price, 1024 max dim.
- Voyage 3 Large: highest quality, higher cost per token than the OpenAI large tier.
For cost-sensitive pipelines, text-embedding-3-small with Matryoshka truncation to 512 or 768 dimensions can approach Voyage 3 Large quality at a fraction of the price. Always verify recall before selecting the cheap tier. For batching discounts, see embeddings-cost-control.
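Testing the cheap tier is a model-name and dimension change, since the OpenAI API accepts a dimensions parameter directly and returns unit-normalized vectors at the reduced width. A minimal sketch; the 512 default is just the truncation point discussed above:

from openai import OpenAI

oai = OpenAI()

def embed_openai_small(texts, dim=512):
    # text-embedding-3-small truncated server-side via the dimensions parameter
    resp = oai.embeddings.create(input=texts, model="text-embedding-3-small", dimensions=dim)
    return [r.embedding for r in resp.data]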
Dimensionality and storage parity
Both models support Matryoshka truncation, so you can request any dimension up to their maximum. At 1024 dimensions, they are directly comparable. For storage-sensitive deployments, truncating both to 512 or 768 and re-running recall tests is faster than assuming the full dimension is required. See embeddings-dimensionality for the truncation trade-off rules.
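If full-width vectors are already stored, you can test shorter dimensions without new API calls by truncating client-side. A sketch with NumPy; the renormalization step matters because cosine and dot-product scoring assume unit-length vectors:

import numpy as np

def truncate(vectors, dim):
    # Matryoshka truncation: keep the first `dim` components, then renormalize
    v = np.asarray(vectors, dtype=np.float32)[:, :dim]
    return v / np.linalg.norm(v, axis=1, keepdims=True)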
Latency and API availability
OpenAI’s embedding endpoint has higher sustained throughput limits and more geographic redundancy. Voyage’s latency is competitive for batch ingestion but can lag under burst load. For real-time low-latency applications, test p99 latency under load for both providers before committing.
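A rough way to measure this is to time real batches against each provider's embed function and read off the 99th percentile. A sketch only: it ignores concurrency, connection reuse, and rate limits, all of which a real load test should also exercise:

import time

def p99_latency(embed_fn, queries, batch_size=8):
    # Time one API call per batch; report the 99th-percentile latency in seconds
    latencies = []
    for i in range(0, len(queries), batch_size):
        start = time.perf_counter()
        embed_fn(queries[i:i + batch_size])
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return latencies[int(0.99 * (len(latencies) - 1))]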
Vendor lock-in and migration cost
Embeddings from different models live in incompatible vector spaces, so switching models requires re-embedding the entire corpus and rebuilding the index. Model lock-in is real. Prefer the model that is directionally better on your domain even if the setup cost is higher now. See embeddings for migration guidance and pinecone-vs-pgvector for index-layer considerations.