Building with LLMs as collaborators and as services. Start at claude-code for day-to-day workflows; jump to rag or mcp-servers when shipping retrieval or tool-using systems.
Pages
- claude-code: Claude Code workflow patterns.
- claude-code-claude-md: CLAUDE.md as durable session memory; mission, voice rules, schema, and folder map.
- claude-code-hooks: Pre-tool, post-tool, stop, and user-prompt-submit hooks in settings.json for automation.
- claude-code-mcp: Connect MCP servers via settings.json, scope tools to repos, pass secrets through env.
- claude-code-permissions: The five permission modes via permissions.defaultMode, allowlists in settings.json, per-project tool scope.
- claude-code-pitfalls: The six most common Claude Code failure patterns and the brief-level guardrails that prevent them.
- claude-code-skills: User-invocable skills in .claude/skills/ to package reusable prompts and hook sequences.
- claude-code-subagents: Define subagents as .claude/agents/*.md frontmatter files; isolate file-writing workers in git worktrees; coordinate through branches.
- Prompt Engineering: Best practices, templates, evals, injection defense, chains, caching, reasoning models. Includes the moved [[prompt-engineering/prompt-design]] and [[prompt-engineering/chain-of-thought]] pages.
- system-prompts: What belongs in system vs user; structure; versioning.
- role-framing: Set the role; prime expertise; defeat sycophancy.
- few-shot: When examples help; three to five; diversity over volume.
- structured-output: JSON Schema via tool use; strict mode; Pydantic plus instructor.
- prompt-injection-defense: Untrusted text is data; delimiters; sandboxed tools (threat-model framing). Complements [[prompt-engineering/prompt-injection-defense]].
- examples-vs-rules: When examples beat rules; combining both safely.
- multi-agent: Orchestrator-worker and planner-executor patterns.
- agent-architecture-patterns: The pillar for agent structures: augmented LLM, workflows, orchestrator-worker, planner-executor, and evaluator-optimizer, and when each fits.
- agentic-workflow-patterns: The five workflow patterns when the path is known: chaining, routing, parallelization, orchestrator-worker, evaluator-optimizer.
- reliable-agents-in-production: Scope narrowly, constrain tools, bound loops, evaluate, observe, and degrade gracefully to ship an agent to production.
- tool-use-and-function-calling: Define, describe, and guard tools so the model calls them correctly: clear schemas, validation, actionable errors, minimal surface.
- rag: Chunking, retrieval, reranking, evaluation.
- rag-chunking: Semantic boundaries, 200 to 800 token sweet spot, overlap, chunk metadata.
- rag-retrieval: Dense plus sparse hybrid, top-k tuning, metadata pre-filters, multi-query, HyDE.
- rag-reranking: Cross-encoder rerankers, retrieve broad and rerank narrow, latency budgets.
- rag-eval: Recall@k for retrieval, faithfulness for generation, golden sets, regression dashboards.
- rag-citations: Cite every fact, chunk IDs, verifiable links, defending against hallucinated citations.
- rag-vector-databases: Pinecone, Qdrant, Weaviate, pgvector, ChromaDB; HNSW tuning; metric choice.
- evaluation: Golden sets, LLM-as-judge, eval-driven prompts.
- cost-control: Caching, model routing, batch APIs, fallback ladders.
- mcp-servers: Designing and shipping MCP servers.
- embeddings: Model choice, dimensionality, similarity.
- embeddings-cost-control: Batch APIs, content-addressed caching, deduplication, input truncation, and budget caps.
- embeddings-dimensionality: Matryoshka truncation, storage vs recall trade-offs, and the halve-and-verify rule.
- embeddings-eval: Golden sets, MRR, recall@k, nDCG, and A/B testing for retrieval systems.
- embeddings-hybrid-search: Dense plus BM25 with Reciprocal Rank Fusion, query routing, when hybrid beats pure dense.
- embeddings-normalization: L2-normalize for cosine equivalence, silent retrieval bugs, library defaults.
- embeddings-semantic-cache: Cache LLM responses on input embeddings, cosine thresholds, invalidation, pgvector store.
- embeddings-voyage-vs-openai: Voyage 4 vs OpenAI text-embedding-3-large: accuracy, cost, multilingual, dimensionality.
- openai-sdk-vs-langchain: When to call the OpenAI/Anthropic SDK directly versus wrapping it with LangChain.
- ollama: Running local LLMs with Ollama.
- ollama-model-selection: How to pick the right Ollama model for your hardware and task: Llama 3.3, Qwen 2.5, and Mistral compared by VRAM, quality, and use case.
- ollama-modelfile: How to write an Ollama Modelfile to pin a base model, system prompt, and generation parameters into a versioned local image.
- ollama-quantization: How quantization levels affect memory footprint and output quality in Ollama, and which level to pick for each use case.
- ollama-serving: How to use Ollama’s REST API, OpenAI-compatible endpoints, streaming responses, and concurrent request handling.
- ollama-deployment: How to deploy Ollama for local development, shared GPU servers, and production: Docker, systemd, and reverse proxy with authentication.
- mcp-protocol: How the Model Context Protocol is framed over JSON-RPC 2.0, what capabilities a server can advertise, and how client and server roles divide responsibility.
- mcp-tool-design: Rules for naming, typing, and describing MCP tools so the model calls them correctly the first time.
- mcp-transports: Choose stdio for local servers and Streamable HTTP for remote ones; understand reconnection, lifecycle, and multi-tenant patterns for each transport.
- mcp-streamable-http: The current remote MCP transport: single endpoint, Mcp-Session-Id sessions, resumable streams via Last-Event-ID, migration off deprecated HTTP+SSE.
- mcp-elicitation: The elicitation capability: servers request structured user input mid-session via a flat JSON Schema, with accept/decline/cancel responses and security implications.
- mcp-resources: Design MCP resources as stable URI-addressed data sources for read-only context, separate from tools that perform actions.
- mcp-security: Authenticate at the transport, redact secrets from tool outputs, rate-limit per session, and sandbox the filesystem surface to ship MCP servers safely.
- mcp-logging: Emit structured logs per tool call with correlation IDs and latency; use MCP Inspector for interactive debugging; know the common failure modes.