Overview
Cloudflare Workers run JavaScript and WebAssembly on every Cloudflare PoP. Each request hits the nearest edge, the runtime spins up a V8 isolate in under five milliseconds, and the handler runs to completion or yields. Reach for Workers when latency matters more than CPU minutes; reach past them when the workload needs more than 128 MB of memory or 30 seconds of CPU time per request.
Run V8 isolates, not containers
Workers share one process per PoP and isolate each tenant inside a V8 context. There is no container boot, no Node runtime, no filesystem.
- Cold start is sub-five-millisecond on the free plan because the runtime is already warm.
- Memory is capped at 128 MB, and CPU at 10 ms per request on the free plan, 50 ms on paid, and 30 seconds for long-running jobs that use `ctx.waitUntil`.
- Node APIs are absent. Use the Web Platform (`fetch`, `URL`, `crypto.subtle`, `Request`, `Response`) or pull in a Node compat layer with the `nodejs_compat` compatibility flag.
For workloads that need full Node, FFmpeg, Puppeteer, or a Python runtime, ship to Vercel Node functions or a VPS instead.
Export a fetch handler from the module
Every Worker exports a default object with handlers for the events it serves. The fetch handler is the entry point for HTTP.
```typescript
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext) {
    const url = new URL(request.url)
    if (url.pathname === "/health") return new Response("ok")
    return new Response("Hello from the edge", { status: 200 })
  },
}
```

`env` holds the configured bindings (KV namespaces, R2 buckets, secrets). `ctx.waitUntil(promise)` lets the Worker return immediately and finish background work after the response, useful for logging and cache writes.
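A minimal sketch of the `ctx.waitUntil` pattern (the `recordTiming` sink is hypothetical, standing in for a real log endpoint or Queue):

```typescript
// Trimmed stand-in for the Workers ExecutionContext type.
interface ExecutionContext {
  waitUntil(promise: Promise<unknown>): void;
}

// Hypothetical sink; in practice this would write to a Queue,
// Analytics Engine, or an external log endpoint.
async function recordTiming(path: string, ms: number): Promise<void> {
  console.log(`timing ${path} ${ms}ms`);
}

const worker = {
  async fetch(request: Request, _env: unknown, ctx: ExecutionContext): Promise<Response> {
    const started = Date.now();
    const response = new Response("ok");
    // Fires after the Response is handed back; the client never waits on it.
    ctx.waitUntil(recordTiming(new URL(request.url).pathname, Date.now() - started));
    return response;
  },
};

export default worker;
```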
Schedule cron jobs by adding a `scheduled` handler and a `[triggers]` block with `crons = ["0 * * * *"]` in `wrangler.toml`.
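A hedged sketch of what that `scheduled` handler can look like (the event type below is trimmed to the fields used here):

```typescript
// Trimmed stand-in for the Workers scheduled-event type.
interface ScheduledEvent {
  cron: string;          // the cron expression that fired
  scheduledTime: number; // scheduled fire time, ms since epoch
}

const worker = {
  async scheduled(event: ScheduledEvent, _env: unknown, _ctx: unknown): Promise<void> {
    // Runs once per matching cron tick; do the work here or enqueue it.
    console.log(`cron "${event.cron}" fired at ${new Date(event.scheduledTime).toISOString()}`);
  },
};

export default worker;
```

The handler returns a promise; the runtime keeps the invocation alive until it settles.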
Bind to KV, R2, D1, Queues, and Durable Objects
Workers cannot open arbitrary TCP sockets on the free plan. State lives in the Cloudflare data plane through bindings declared in wrangler.toml.
```toml
name = "my-worker"
main = "src/index.ts"
compatibility_date = "2026-05-14"
kv_namespaces = [{ binding = "CACHE", id = "abc123" }]
r2_buckets = [{ binding = "ASSETS", bucket_name = "site-assets" }]
d1_databases = [{ binding = "DB", database_name = "app", database_id = "..." }]
queues.producers = [{ binding = "JOBS", queue = "jobs" }]
```

Pick the binding by access pattern. KV for read-heavy lookups (see cloudflare-kv). R2 for blobs and static assets (see cloudflare-r2). D1 for relational queries under 10 GB. Queues for asynchronous fanout. Durable Objects for single-instance state and coordination (see cloudflare-durable-objects).
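To make the binding concrete, here is a sketch of a read-through handler on the `CACHE` KV namespace (the `KVNamespace` interface is a minimal stand-in for the real type from `@cloudflare/workers-types`):

```typescript
// Minimal stand-in for the real KVNamespace binding type.
interface KVNamespace {
  get(key: string): Promise<string | null>;
}

interface Env {
  CACHE: KVNamespace; // matches the binding name in wrangler.toml
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Use the URL path (minus the leading slash) as the KV key.
    const key = new URL(request.url).pathname.slice(1);
    const value = await env.CACHE.get(key);
    if (value === null) return new Response("not found", { status: 404 });
    return new Response(value, { headers: { "content-type": "application/json" } });
  },
};

export default worker;
```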
Develop locally with wrangler dev
Wrangler is the CLI; treat it as the source of truth. Install once with `npm i -D wrangler` and pin the version in the project lockfile.
```sh
npx wrangler dev          # local Miniflare runtime on port 8787
npx wrangler dev --remote # run against the real Cloudflare edge
npx wrangler deploy       # ship to production
npx wrangler tail         # stream logs from the live Worker
```

Local mode emulates KV, R2, D1, and Durable Objects on disk. Remote mode hits the real bindings, useful for debugging quota and consistency issues that the emulator hides.
Respect the size and CPU limits
A Worker script is capped at 1 MB compressed on the free plan and 10 MB on paid. CPU budgets are per-request.
- Tree-shake aggressively. A 4 MB bundle is a code-splitting problem, not a “Cloudflare too small” problem.
- Move heavy logic to a Durable Object, a backend service, or a scheduled batch job.
- Use `ctx.waitUntil` for fire-and-forget work; it does not extend the response budget, but it does extend total CPU.
- Stream large responses with `TransformStream` instead of buffering; the memory cap terminates Workers that try to hold a 100 MB blob.
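A sketch of the streaming pattern, under the assumption that the body is UTF-8 text (the `streamUppercase` function is illustrative, not a Workers API):

```typescript
// Illustrative: transform a body chunk-by-chunk instead of buffering it.
export function streamUppercase(upstream: Response): Response {
  const decoder = new TextDecoder(); // { stream: true } handles chunk-split characters
  const encoder = new TextEncoder();
  const { readable, writable } = new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      // Each chunk is decoded, transformed, and re-encoded as it flows through;
      // the full body never sits in Worker memory.
      controller.enqueue(encoder.encode(decoder.decode(chunk, { stream: true }).toUpperCase()));
    },
  });
  // pipeTo runs as the client reads; deliberately not awaited.
  upstream.body?.pipeTo(writable);
  return new Response(readable);
}
```

The same shape works for proxying large R2 objects: pipe the object's body into the response rather than reading it into an `ArrayBuffer`.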
Pick Workers when latency drives the design
Workers fit a specific shape of workload.
- Auth gates, A/B routing, geosteering, and bot filtering at the edge.
- API endpoints that read from KV or R2 and return small JSON.
- Webhook receivers that write to a Queue and return 200 in under 10 ms.
- Cron jobs that scrape, expire caches, or rotate keys on a schedule.
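The webhook shape is small enough to sketch end to end, using the `JOBS` queue producer binding from the `wrangler.toml` above (the `Queue` interface is a minimal stand-in for the real producer binding type):

```typescript
// Minimal stand-ins for the real binding and context types.
interface Queue {
  send(message: unknown): Promise<void>;
}

interface Env {
  JOBS: Queue;
}

interface ExecutionContext {
  waitUntil(promise: Promise<unknown>): void;
}

const worker = {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    if (request.method !== "POST") return new Response("method not allowed", { status: 405 });
    const payload = await request.json();
    // Enqueue in the background; the sender gets its 200 without waiting on the queue.
    ctx.waitUntil(env.JOBS.send(payload));
    return new Response("accepted");
  },
};

export default worker;
```

Real receivers should also verify the webhook signature before trusting the payload.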
Pick a traditional server (or Vercel Node functions) when the workload needs a long-lived database connection, gigabytes of memory, a Node-only library, or sustained CPU. The Workers model rewards short, stateless, latency-sensitive code.