Overview
Workers KV is a globally replicated key-value store optimized for reads at the edge. Writes propagate to every PoP within 60 seconds and reads return single-digit-millisecond latency once cached locally. Treat KV as a cache, not a database; the consistency model and the per-key write-rate ceiling make it the wrong tool for anything that needs read-your-writes.
Treat KV as a cache, not a database
KV is eventually consistent. A write to one PoP becomes visible to other PoPs only after replication completes, which can take up to a minute in practice and longer under load.
- Never write a value and immediately read it from a different region; the read can return the old value or `null`.
- Never use KV for counters, balances, or anything that must reflect the latest write. Use a Durable Object or Postgres instead.
- Use KV when stale-by-a-minute is fine: feature flags, config blobs, session data with long TTLs, cached API responses, static asset metadata.
The mental model: KV is a CDN for arbitrary bytes, keyed by string, with a write API.
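To make "stale-by-a-minute is fine" concrete, here is a minimal sketch of a feature-flag read that tolerates eventual consistency. The `flagEnabled` helper, the `flag:` prefix, and the narrow `get`-only interface are illustrative, not part of the KV API; the real binding comes from the Worker's `env`.

```typescript
// Sketch: feature-flag lookup that tolerates eventual consistency.
// A flag flipped in the dashboard may take up to ~60s to reach this PoP,
// and a missing key simply means "use the default" — no error path needed.
async function flagEnabled(
  kv: { get(key: string): Promise<string | null> },
  name: string,
  fallback = false,
): Promise<boolean> {
  const raw = await kv.get(`flag:${name}`)
  return raw === null ? fallback : raw === "true"
}
```

Because the caller always has a sane default, a stale or not-yet-replicated flag degrades gracefully instead of failing.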
Design for read-heavy workloads only
KV throttles writes at roughly one per second per key and bills per-write more aggressively than per-read. A KV namespace happily handles millions of reads per second across the edge, but a hot key under heavy writes will queue.
- Reads are cheap. The free plan covers 100,000 reads per day and the paid plan charges per million.
- Writes are slow and expensive. Batch them, and never write inside a hot request path.
- Keys cap at 512 bytes; values cap at 25 MiB. Most use cases stay under 1 KiB per value.
If the workload is write-heavy or strongly consistent, the answer is not “use KV more aggressively.” The answer is D1, a Durable Object, or an external database.
Bind the namespace and read with the SDK
Declare the namespace in wrangler.toml, then read and write from the Worker.
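A minimal `wrangler.toml` binding looks like the following; the binding name `CACHE` matches the Worker code below, and the namespace ID placeholder must be replaced with the ID Wrangler prints when the namespace is created.

```toml
# wrangler.toml — binds the namespace to env.CACHE in the Worker
kv_namespaces = [
  { binding = "CACHE", id = "<namespace-id>" }
]
```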
```ts
export default {
  async fetch(req: Request, env: { CACHE: KVNamespace }) {
    const cached = await env.CACHE.get("home:hero", { type: "json" })
    if (cached) return Response.json(cached)
    const fresh = await fetchHero()
    await env.CACHE.put("home:hero", JSON.stringify(fresh), {
      expirationTtl: 3600, // seconds
    })
    return Response.json(fresh)
  },
}
```

Pass `{ type: "json" }` or `{ type: "arrayBuffer" }` to skip the string parse. Use `expirationTtl` (seconds from now) or `expiration` (absolute Unix timestamp) to set a hard expiry. KV does not run a background cleaner; expired keys read as `null`.
Use bulk operations for batch writes
The single-key API issues one HTTP request per call and hits rate limits fast. The bulk endpoints land up to 10,000 writes in one shot.
```ts
await env.CACHE.put("batch", "value") // one round trip per call

// Bulk: prefer for imports, full-namespace updates, scheduled jobs
const items = [
  { key: "user:1", value: JSON.stringify(u1) },
  { key: "user:2", value: JSON.stringify(u2) },
]
// From a Node script with the REST API or the Wrangler CLI:
// npx wrangler kv:bulk put --binding CACHE bulk.json
```

For batch reads, fan out with `Promise.all` inside the Worker. There is no native multi-get, but parallel reads on warmed PoPs are cheap.
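The fan-out pattern can be sketched as follows. The in-memory `store` and the `multiGet` helper are stand-ins for illustration; in a Worker, the `get` calls go to the `env.CACHE` binding instead.

```typescript
// Illustrative in-memory stand-in for a KVNamespace's `get`; in a Worker,
// use env.CACHE directly.
const store = new Map<string, string>([
  ["user:1", JSON.stringify({ name: "Ada" })],
  ["user:2", JSON.stringify({ name: "Lin" })],
])
const kv = {
  async get(key: string): Promise<string | null> {
    return store.get(key) ?? null
  },
}

// Fan out reads in parallel — KV has no native multi-get, so N keys
// means N concurrent get() calls rather than N sequential round trips.
async function multiGet(keys: string[]): Promise<(unknown | null)[]> {
  const raw = await Promise.all(keys.map((k) => kv.get(k)))
  return raw.map((v) => (v === null ? null : JSON.parse(v)))
}
```

Missing keys resolve to `null` rather than rejecting, so one absent key does not fail the whole batch.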
Design keys as namespaces with delimiters
KV keys are flat strings. Build hierarchy with a delimiter so you can list and prune.
- Pick a delimiter (`:` is the convention) and use it consistently: `session:abc123`, `flag:checkout-v2`, `cache:home:hero`.
- Use `list({ prefix: "session:" })` to enumerate a slice. List paginates at 1,000 keys per page; expect to iterate.
- Avoid putting user input directly in keys; hash or sanitize first to bound key length and avoid weird characters.
- Resist the urge to model relational data with composite keys. If the query needs joins or range scans, reach for D1.
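Iterating a prefix across list pages can be sketched like this. The `listAll` helper is illustrative, and the `ListResult` type is an assumption modeled on the shape `KVNamespace.list` returns (`keys`, `list_complete`, `cursor`).

```typescript
// Assumed shape of one list() page, modeled on KVNamespace.list's result.
type ListResult = {
  keys: { name: string }[]
  list_complete: boolean
  cursor?: string
}

// Sketch: collect every key under a prefix, following the 1,000-key pages
// by passing each page's cursor back into the next list() call.
async function listAll(
  kv: { list(opts: { prefix: string; cursor?: string }): Promise<ListResult> },
  prefix: string,
): Promise<string[]> {
  const names: string[] = []
  let cursor: string | undefined
  do {
    const page = await kv.list({ prefix, cursor })
    names.push(...page.keys.map((k) => k.name))
    cursor = page.list_complete ? undefined : page.cursor
  } while (cursor)
  return names
}
```

The loop terminates when a page reports `list_complete`, so it handles one-page and many-page prefixes the same way.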
Reach for KV when the read pattern is “lookup by ID”
KV fits a narrow shape of workload, and the shape is what makes it fast.
- Feature flags read on every request, updated weekly from a dashboard.
- Session caches keyed by token, with a 24-hour TTL.
- Edge-side config blobs that change once a day.
- Memoized API responses keyed by URL hash, with a short `expirationTtl`.
- Rate-limit counters under low contention (one key per IP, not one key per route).
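A low-contention rate-limit counter can be sketched as follows. The `allowRequest` helper, key format, and limits are illustrative; the read-then-write is not atomic, so under eventual consistency concurrent requests can both see the same count — which is exactly why this pattern only holds at low contention.

```typescript
// Sketch: best-effort per-IP rate limit on KV. Counts are approximate —
// two PoPs can read the same stale count and both admit a request — so
// reserve this for low-contention keys; use a Durable Object for exact limits.
async function allowRequest(
  kv: {
    get(key: string): Promise<string | null>
    put(key: string, value: string, opts?: { expirationTtl: number }): Promise<void>
  },
  ip: string,
  limit = 100,
): Promise<boolean> {
  const key = `ratelimit:${ip}` // one key per IP, not one key per route
  const count = Number((await kv.get(key)) ?? "0")
  if (count >= limit) return false
  await kv.put(key, String(count + 1), { expirationTtl: 60 }) // window resets via TTL
  return true
}
```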
Skip KV for transactional workloads (cloudflare-durable-objects), large blobs (cloudflare-r2), or anything that needs SQL-style queries (D1 or postgres-prod).