## Overview
Neon is a serverless Postgres platform with native pgvector support. This guide enables the extension, creates a table to store embeddings, inserts vectors, builds an HNSW index for fast approximate nearest-neighbor search, and runs a cosine-similarity query. For the trade-off between pgvector and a dedicated vector database, see pinecone-vs-pgvector.
## Prerequisites
- A Neon account. The free tier provides one project with up to 3 GB of storage.
- `psql` installed locally, or the Neon SQL Editor in the browser.
- Python 3.11+ and `psycopg2` or `asyncpg` if you plan to insert via code.
- An embedding model that produces fixed-length float vectors (e.g., `text-embedding-3-small` at 1536 dimensions). See embeddings for model-choice guidance.
- Your `DATABASE_URL` from the Neon project dashboard.
## Steps
### 1. Enable the pgvector extension
Connect to your Neon database and run:
```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

Confirm it is active:

```sql
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
-- expected: vector | 0.7.0 (or later)
```

pgvector comes pre-installed on all Neon projects; you only need to activate it per database.
### 2. Create the embeddings table
```sql
CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    metadata JSONB,
    embedding VECTOR(1536) NOT NULL
);
```

Adjust the dimension (1536) to match your embedding model. Dimension mismatches at insert time raise a `different vector dimensions` error. See embeddings for the dimension-model mapping.
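To fail fast on that mismatch, you can check the vector length client-side before inserting. This is an illustrative sketch: `validate_embedding` and `EXPECTED_DIM` are hypothetical names, not part of any library.

```python
EXPECTED_DIM = 1536  # must match VECTOR(1536) in the table definition

def validate_embedding(vec, expected_dim=EXPECTED_DIM):
    """Raise client-side instead of waiting for the server's dimension error."""
    if len(vec) != expected_dim:
        raise ValueError(
            f"different vector dimensions {len(vec)} != {expected_dim}"
        )
    return vec
```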
### 3. Insert embeddings
Use the Python `psycopg2` driver with the pgvector adapter. Note that JSONB values must be wrapped in `psycopg2.extras.Json`; psycopg2 does not adapt plain dicts by default.

```python
import os

import psycopg2
from psycopg2.extras import Json
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect(os.environ["DATABASE_URL"])
register_vector(conn)  # teaches psycopg2 to adapt vector values
cur = conn.cursor()

# embedding_vector is a list of 1536 floats from your model.
cur.execute(
    "INSERT INTO documents (content, metadata, embedding) VALUES (%s, %s, %s)",
    ("The quick brown fox", Json({"source": "test"}), embedding_vector),
)
conn.commit()
```

For bulk inserts, use `executemany` or `COPY` to avoid per-row round-trips. See rag-vector-databases for batching strategies.
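As a sketch of that batching approach with `executemany` (the `chunks` and `bulk_insert` helpers and the 500-row batch size are illustrative assumptions, not psycopg2 API):

```python
def chunks(items, size=500):
    """Yield successive fixed-size batches from a list of rows."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def bulk_insert(cur, rows, batch_size=500):
    # rows: list of (content, metadata, embedding) tuples, prepared as above.
    for batch in chunks(rows, batch_size):
        cur.executemany(
            "INSERT INTO documents (content, metadata, embedding) "
            "VALUES (%s, %s, %s)",
            batch,
        )
```

For very large loads, `COPY` is typically faster still than batched `executemany`.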
### 4. Build an HNSW index
A sequential scan (the default) is accurate but slow at scale. HNSW (Hierarchical Navigable Small World) gives approximate nearest-neighbor search in sub-linear time.
```sql
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```

Parameter guidance:

- `m = 16`: the number of connections per node. Higher values improve recall but increase index size.
- `ef_construction = 64`: build-time search depth. Higher values improve recall but slow the build.
- Use `vector_cosine_ops` for cosine similarity (the standard for normalized embeddings from OpenAI and Voyage). Use `vector_l2_ops` for Euclidean distance.
See postgres-indexes for the general index-selection playbook.
### 5. Query by similarity
```sql
-- Retrieve the 5 documents closest to a query embedding.
SELECT
    id,
    content,
    metadata,
    1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS cosine_similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 5;
```

The `<=>` operator computes cosine distance (1 minus cosine similarity). Use `<->` for Euclidean distance and `<#>` for negative inner product (dot-product similarity with normalized vectors).
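To build intuition for what `<=>` returns, here is a minimal pure-Python rendering of the same math (illustrative only; the real computation happens inside Postgres):

```python
import math

def cosine_distance(a, b):
    """Mirror pgvector's <=> operator: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Identical directions give distance 0; orthogonal vectors give distance 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```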
In Python:
```python
# get_embedding wraps your embedding model and returns a list of floats.
query_vec = get_embedding("What does the fox do?")
cur.execute(
    "SELECT id, content FROM documents ORDER BY embedding <=> %s LIMIT 5",
    (query_vec,),
)
results = cur.fetchall()
```

### 6. Set ef_search at query time for precision tuning
```sql
SET hnsw.ef_search = 100;  -- Default is 40; higher improves recall at query cost.
```

Set this per session (`SET`) or per transaction (`SET LOCAL`). For RAG pipelines where recall matters more than latency, start at 100 and tune down until latency meets the SLA.
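A sketch of scoping the override per transaction from Python, assuming a psycopg2-style cursor (`search` is a hypothetical helper; `SET LOCAL` confines the setting to the current transaction):

```python
def search(cur, query_vec, k=5, ef_search=100):
    """Similarity query with a transaction-scoped ef_search override."""
    # SET LOCAL resets automatically when the transaction ends.
    cur.execute("SET LOCAL hnsw.ef_search = %s", (ef_search,))
    cur.execute(
        "SELECT id, content FROM documents ORDER BY embedding <=> %s LIMIT %s",
        (query_vec, k),
    )
    return cur.fetchall()
```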
## Verify it worked
```sql
-- 1. Extension is active.
SELECT extname FROM pg_extension WHERE extname = 'vector';

-- 2. Table exists with the correct column type (psql meta-command).
\d documents

-- 3. HNSW index is on the embedding column.
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'documents';

-- 4. Similarity query returns rows.
SELECT id, content
FROM documents
ORDER BY embedding <=> (SELECT embedding FROM documents LIMIT 1)
LIMIT 5;
```

## Common errors
- `type "vector" does not exist`: the extension was enabled in a different database. Run `CREATE EXTENSION IF NOT EXISTS vector;` in the correct database.
- `different vector dimensions (X) != (Y)`: the inserted vector's length does not match the column definition. Confirm the embedding model's dimension and update the column definition if needed.
- HNSW index not used by the query planner: run `EXPLAIN (ANALYZE, BUFFERS)` and confirm `Index Scan using ... on documents`. If a `Seq Scan` appears, the table may be too small for the planner to prefer the index.
- Insert is very slow on Neon: Neon auto-suspends compute after five minutes of inactivity, and the first connection after a suspend incurs a cold-start delay of one to three seconds.
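One way to absorb that cold start is a connect-with-retry wrapper with exponential backoff. This is a sketch: the helper names and retry counts are assumptions, not Neon or psycopg2 API.

```python
import time

def backoff_delays(retries=3, base=0.5):
    """Exponential backoff schedule: 0.5s, 1s, 2s, ..."""
    return [base * (2 ** i) for i in range(retries)]

def connect_with_retry(connect, retries=3, base=0.5):
    """connect: a zero-argument callable, e.g. lambda: psycopg2.connect(url)."""
    last_exc = None
    for delay in backoff_delays(retries, base):
        try:
            return connect()
        except Exception as exc:  # psycopg2.OperationalError in practice
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```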