## Overview
A slow query in Postgres has a small number of likely causes: missing index, stale statistics, cold cache, wrong join order, or a query that is just doing too much work. This guide finds the offender, reads the plan, fixes one cause, and confirms the fix sticks. The umbrella rules live in postgres-explain and postgres-indexes.
## Prerequisites
- Postgres 14 or newer. Postgres 17 is the assumed baseline; versions before 13 lack WAL reporting in `EXPLAIN`.
- A user with `pg_read_all_stats` or superuser access. Reading `pg_stat_statements` requires it.
- A reproducible slow query. "Sometimes it is slow" is not a starting point; capture the query text and parameters first.
- Production-shaped data. Ten rows in dev will not show what 10 million rows in prod do.
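Access is usually granted through the predefined `pg_read_all_stats` role rather than superuser. A minimal sketch, assuming a monitoring role named `metrics_reader` (the role name is hypothetical; substitute your own):

```sql
-- Let a non-superuser read pg_stat_statements and other stats views.
GRANT pg_read_all_stats TO metrics_reader;
```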
## Steps
### 1. Enable `pg_stat_statements`
Edit `postgresql.conf`, add the module to `shared_preload_libraries`, and restart:

```
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = all
```

Then, once, in the target database:

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```

The view starts collecting on the next restart.
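After the restart, a quick sanity check confirms the module loaded and the view is collecting:

```sql
SHOW shared_preload_libraries;            -- should list pg_stat_statements
SELECT count(*) FROM pg_stat_statements;  -- grows as statements run
```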
### 2. Find the offender
The top queries by `total_exec_time` are the right tuning targets. Mean time hides queries that are cheap individually but run a million times an hour.
```sql
SELECT
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(total_exec_time::numeric, 2) AS total_ms,
  rows,
  query
FROM pg_stat_statements
WHERE query NOT LIKE '%pg_stat_statements%'
ORDER BY total_exec_time DESC
LIMIT 20;
```

Pick the first query with a high `total_ms` and high `calls`. That is the one with the biggest payback.
### 3. Run `EXPLAIN ANALYZE` with `BUFFERS`
`EXPLAIN` returns the planner's estimate; `EXPLAIN ANALYZE` runs the query and reports actual rows and time; `BUFFERS` adds cache-hit and disk-read counts.
```sql
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT id, total_cents FROM orders
WHERE user_id = $1 AND created_at > now() - interval '30 days'
ORDER BY created_at DESC LIMIT 50;
```

Substitute a representative value for `$1`; a bare `EXPLAIN ANALYZE` cannot bind placeholders. Read the plan top-down to follow data flow; debug bottom-up because the leaves run first. See postgres-explain for the full grammar.
### 4. Spot the failure mode
Three patterns cover most slow queries.
- Sequential scan on a large table with a selective predicate. The planner did not find a useful index, or the index was disqualified by a function call like `LOWER(email) = $1` without a matching expression index.
- Index scan with actual rows ten times the estimate. Statistics are stale. Run `ANALYZE <table>` or raise `default_statistics_target`.
- High `Buffers: shared read=N` on the second run. The working set does not fit in `shared_buffers`. Either raise the setting, narrow the query, or partition cold data.
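The first two fixes can be sketched in SQL. This assumes a `users` table for the `LOWER(email)` case and the guide's `orders` table for the statistics case; both names are illustrative:

```sql
-- Expression index that matches the LOWER(email) = $1 predicate.
CREATE INDEX CONCURRENTLY users_lower_email_idx ON users (LOWER(email));

-- Widen the sample for a skewed column, then refresh statistics.
ALTER TABLE orders ALTER COLUMN user_id SET STATISTICS 500;
ANALYZE orders;
```

Per-column `SET STATISTICS` is narrower than raising `default_statistics_target` globally, which slows `ANALYZE` everywhere.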
### 5. Add the right index, not any index
The right index is the one that makes the predicate selective and covers the projection.
```sql
-- Predicate on (user_id, created_at) with an ORDER BY created_at DESC.
CREATE INDEX CONCURRENTLY orders_user_created_idx
  ON orders (user_id, created_at DESC)
  INCLUDE (total_cents);
```

- `CONCURRENTLY` avoids the long lock; required for production.
- Column order matches the predicate: equality first, then the range.
- `INCLUDE` adds non-key columns so the index covers the projection and the plan becomes an index-only scan.
See postgres-indexes for the full rules.
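To confirm the planner actually uses the new index over time, the cumulative statistics views help. A minimal check against `pg_stat_user_indexes`:

```sql
-- idx_scan should climb as the application hits the new index.
SELECT idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE indexrelname = 'orders_user_created_idx';
```

An `idx_scan` stuck at zero after real traffic means the planner still prefers another path.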
### 6. Re-run `EXPLAIN ANALYZE` and confirm
```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, total_cents FROM orders
WHERE user_id = $1 AND created_at > now() - interval '30 days'
ORDER BY created_at DESC LIMIT 50;
```

The new plan should show `Index Only Scan using orders_user_created_idx` near the top, actual time under 5 ms, and `Buffers: shared hit=N read=0` on a warm cache. If `read` is still large, the visibility map is stale and the index-only scan is fetching heap pages; run `VACUUM ANALYZE orders`. See postgres-vacuum.
### 7. Reset and re-measure under real load
`pg_stat_statements_reset()` clears the stats; let the application run for an hour, then re-query the top 20. The fixed query should be off the list. If it is back, the planner is choosing a different plan under parameter values you did not test; capture those and repeat.
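The reset is a single function call; by default only superusers, or roles granted `EXECUTE` on the function, may run it:

```sql
-- Discard all accumulated statement statistics and start fresh.
SELECT pg_stat_statements_reset();
```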
## Verify it worked
```sql
-- 1. The new plan is an Index Only Scan with low actual time.
EXPLAIN (ANALYZE, BUFFERS) <the query>;
-- expected: "Index Only Scan", "actual time=...ms" under target

-- 2. pg_stat_statements no longer ranks the query in the top 20.
SELECT query FROM pg_stat_statements
ORDER BY total_exec_time DESC LIMIT 20;

-- 3. Application p95 latency drops on the affected route.
-- Check the dashboard wired to [[backend/observability]].
```

## Common errors
- Missing index after `CREATE INDEX CONCURRENTLY`. The command errored out: `CONCURRENTLY` cannot run inside `BEGIN` or any transaction block. Run it outside a transaction, and drop any leftover `INVALID` index before retrying.
- `pg_stat_statements` shows the wrong query. Statements are normalized; the `$1` placeholder hides parameters. Grab the real text and parameter values from the application log.
- New index is not used. Postgres prefers a sequential scan for non-selective predicates. Run `ANALYZE`; if it still does not pick the index, check that the column types in the query match the index.
- Plan looks great in dev, slow in prod. Statistics differ. Compare `pg_stats.n_distinct` and `most_common_vals` between environments.
- Query is fast at 5 ms warm, slow at 500 ms cold. IO problem, not plan problem. Increase `shared_buffers` or partition the cold range; see postgres-partitioning.
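The dev-versus-prod comparison is one query run in each environment, using the guide's `orders.user_id` column as the example:

```sql
-- Run in both environments and diff the output.
SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'user_id';
```

A large gap in `n_distinct`, or a `most_common_vals` list present in prod but empty in dev, explains why the same plan does not transfer.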