Building with LLMs as collaborators and as services. Start at claude-code for day-to-day workflows; jump to rag or mcp-servers when shipping retrieval or tool-using systems.

Pages

  • claude-code: Claude Code workflow patterns.
  • claude-code-claude-md: CLAUDE.md as durable session memory; mission, voice rules, schema, and folder map.
  • claude-code-hooks: Pre-tool, post-tool, stop, and user-prompt-submit hooks in settings.json for automation.
  • claude-code-mcp: Connect MCP servers via settings.json, scope tools to repos, pass secrets through env.
  • claude-code-permissions: The five permission modes via permissions.defaultMode, allowlists in settings.json, per-project tool scope.
  • claude-code-pitfalls: The six most common Claude Code failure patterns and the brief-level guardrails that prevent them.
  • claude-code-skills: User-invocable skills in .claude/skills/ to package reusable prompts and hook sequences.
  • claude-code-subagents: Define subagents as .claude/agents/*.md frontmatter files; isolate file-writing workers in git worktrees; coordinate through branches.
  • Prompt Engineering: Best practices, templates, evals, injection defense, chains, caching, reasoning models. Includes the moved [[prompt-engineering/prompt-design]] and [[prompt-engineering/chain-of-thought]] pages.
  • system-prompts: What belongs in system vs user; structure; versioning.
  • role-framing: Set the role; prime expertise; defeat sycophancy.
  • few-shot: When examples help; three to five; diversity over volume.
  • structured-output: JSON Schema via tool use; strict mode; Pydantic plus instructor.
  • prompt-injection-defense: Untrusted text is data; delimiters; sandboxed tools (threat-model framing). Complements [[prompt-engineering/prompt-injection-defense]].
  • examples-vs-rules: When examples beat rules; combining both safely.
  • multi-agent: Orchestrator-worker and planner-executor patterns.
  • agent-architecture-patterns: The pillar page for agent structures (augmented LLM, workflows, orchestrator-worker, planner-executor, evaluator-optimizer) and when each fits.
  • agentic-workflow-patterns: The five workflow patterns when the path is known: chaining, routing, parallelization, orchestrator-worker, evaluator-optimizer.
  • reliable-agents-in-production: Scope narrowly, constrain tools, bound loops, evaluate, observe, and degrade gracefully to ship an agent to production.
  • tool-use-and-function-calling: Define, describe, and guard tools so the model calls them correctly: clear schemas, validation, actionable errors, minimal surface.
  • rag: Chunking, retrieval, reranking, evaluation.
  • rag-chunking: Semantic boundaries, 200 to 800 token sweet spot, overlap, chunk metadata.
  • rag-retrieval: Dense plus sparse hybrid, top-k tuning, metadata pre-filters, multi-query, HyDE.
  • rag-reranking: Cross-encoder rerankers, retrieve broad and rerank narrow, latency budgets.
  • rag-eval: Recall@k for retrieval, faithfulness for generation, golden sets, regression dashboards.
  • rag-citations: Cite every fact, chunk IDs, verifiable links, defending against hallucinated citations.
  • rag-vector-databases: Pinecone, Qdrant, Weaviate, pgvector, ChromaDB; HNSW tuning; metric choice.
  • evaluation: Golden sets, LLM-as-judge, eval-driven prompts.
  • cost-control: Caching, model routing, batch APIs, fallback ladders.
  • mcp-servers: Designing and shipping MCP servers.
  • embeddings: Model choice, dimensionality, similarity.
  • embeddings-cost-control: Batch APIs, content-addressed caching, deduplication, input truncation, and budget caps.
  • embeddings-dimensionality: Matryoshka truncation, storage vs recall trade-offs, and the halve-and-verify rule.
  • embeddings-eval: Golden sets, MRR, recall@k, nDCG, and A/B testing for retrieval systems.
  • embeddings-hybrid-search: Dense plus BM25 with Reciprocal Rank Fusion, query routing, when hybrid beats pure dense.
  • embeddings-normalization: L2-normalize for cosine equivalence, silent retrieval bugs, library defaults.
  • embeddings-semantic-cache: Cache LLM responses on input embeddings, cosine thresholds, invalidation, pgvector store.
  • embeddings-voyage-vs-openai: Voyage 4 vs OpenAI text-embedding-3-large: accuracy, cost, multilingual, dimensionality.
  • openai-sdk-vs-langchain: When to call the OpenAI/Anthropic SDK directly versus wrapping it with LangChain.
  • ollama: Running local LLMs with Ollama.
  • ollama-model-selection: How to pick the right Ollama model for your hardware and task: Llama 3.3, Qwen 2.5, and Mistral compared by VRAM, quality, and use case.
  • ollama-modelfile: How to write an Ollama Modelfile to pin a base model, system prompt, and generation parameters into a versioned local image.
  • ollama-quantization: How quantization levels affect memory footprint and output quality in Ollama, and which level to pick for each use case.
  • ollama-serving: How to use Ollama’s REST API, OpenAI-compatible endpoints, streaming responses, and concurrent request handling.
  • ollama-deployment: How to deploy Ollama for local development, shared GPU servers, and production: Docker, systemd, and reverse proxy with authentication.
  • mcp-protocol: How the Model Context Protocol is framed over JSON-RPC 2.0, what capabilities a server can advertise, and how client and server roles divide responsibility.
  • mcp-tool-design: Rules for naming, typing, and describing MCP tools so the model calls them correctly the first time.
  • mcp-transports: Choose stdio for local servers and Streamable HTTP for remote ones; understand reconnection, lifecycle, and multi-tenant patterns for each transport.
  • mcp-streamable-http: The current remote MCP transport: single endpoint, Mcp-Session-Id sessions, resumable streams via Last-Event-ID, migration off deprecated HTTP+SSE.
  • mcp-elicitation: The elicitation capability: servers request structured user input mid-session via a flat JSON Schema, with accept/decline/cancel responses and security implications.
  • mcp-resources: Design MCP resources as stable URI-addressed data sources for read-only context, separate from tools that perform actions.
  • mcp-security: Authenticate at the transport, redact secrets from tool outputs, rate-limit per session, and sandbox the filesystem surface to ship MCP servers safely.
  • mcp-logging: Emit structured logs per tool call with correlation IDs and latency; use MCP Inspector for interactive debugging; know the common failure modes.

51 items under this folder.