Overview

The Anthropic Python SDK wraps the Messages API with typed request/response models and automatic retries. This guide installs the SDK, sets the API key, sends the first request, streams tokens, and adds a tool definition for structured output. The prompt design principles that govern what you send are in prompt-design.

Prerequisites

  • Python 3.9+ with pip.
  • An Anthropic API key from console.anthropic.com. Store it as ANTHROPIC_API_KEY in the environment, never hardcoded.
  • A virtual environment activated. Use python -m venv .venv && source .venv/bin/activate.
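The prerequisites can be checked from Python itself; a minimal sketch that only reports on the local environment and makes no API call:

```python
import os
import sys

# Verify the interpreter version and whether the key is visible to this process.
python_ok = sys.version_info >= (3, 9)
key_set = bool(os.environ.get("ANTHROPIC_API_KEY"))
print(f"Python >= 3.9: {python_ok}")
print(f"ANTHROPIC_API_KEY set: {key_set}")
```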

Steps

1. Install the SDK

pip install anthropic

Confirm the install:

python -c "import anthropic; print(anthropic.__version__)"

2. Load the API key from the environment

import anthropic
 
# The client reads ANTHROPIC_API_KEY automatically.
client = anthropic.Anthropic()

Never pass the key as a string literal. If the key is missing from the environment, the client constructor fails immediately with a clear message; an invalid key surfaces as anthropic.AuthenticationError on the first request. See cost-control for rate-limit and budget strategies.

3. Send a basic messages request

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="You are a concise technical writer. Reply in plain text.",
    messages=[
        {"role": "user", "content": "Explain HNSW indexing in two sentences."}
    ],
)
 
print(message.content[0].text)

The system parameter accepts a string (simple) or a list of content blocks (for cache-control). See system-prompts for best practices.
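As a sketch of the list form, a system prompt can be split into content blocks, with a cache_control marker on the block you want cached (the text here reuses the earlier example; adapt it to your use case):

```python
# System prompt as content blocks; the marked block is eligible for prompt caching.
system_blocks = [
    {
        "type": "text",
        "text": "You are a concise technical writer. Reply in plain text.",
        "cache_control": {"type": "ephemeral"},
    }
]

# Passed in place of the plain string:
# client.messages.create(model=..., system=system_blocks, ...)
```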

Response structure:

message.id            # "msg_01..."
message.model         # "claude-sonnet-4-5"
message.stop_reason   # "end_turn" | "max_tokens" | "stop_sequence" | "tool_use"
message.usage         # Usage(input_tokens=X, output_tokens=Y)
message.content       # list[TextBlock | ToolUseBlock]

4. Stream the response

Streaming reduces time-to-first-token for long outputs.

with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "List five Postgres performance tips."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
 
# Full message is available after the context exits.
final = stream.get_final_message()
print(f"\n\nTokens: {final.usage}")

Prefer the stream context manager over the raw event stream (stream=True on messages.create) when you also need the final Usage object for cost tracking.

5. Add a tool definition

Tools let the model request structured function calls; your code executes them. Define the tool schema, send the request, execute your function on the returned arguments, and send the result back.

import json
 
tools = [
    {
        "name": "get_weather",
        "description": "Return the current temperature for a city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name."}
            },
            "required": ["city"],
        },
    }
]
 
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=256,
    tools=tools,
    messages=[{"role": "user", "content": "What is the weather in Tokyo?"}],
)
 
if response.stop_reason == "tool_use":
    tool_call = next(b for b in response.content if b.type == "tool_use")
    city = tool_call.input["city"]
    # Call your real function here.
    result = {"temperature": 22, "unit": "celsius"}
 
    # Send the tool result back.
    final = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=256,
        tools=tools,
        messages=[
            {"role": "user", "content": "What is the weather in Tokyo?"},
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [
                {"type": "tool_result", "tool_use_id": tool_call.id, "content": json.dumps(result)}
            ]},
        ],
    )
    print(final.content[0].text)

For validated structured output (JSON schema enforcement without a tool round-trip), see structured-output.

Verify it worked

# 1. Basic call returns text.
msg = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=32,
    messages=[{"role": "user", "content": "Reply with the word OK only."}],
)
assert msg.content[0].text.strip() == "OK", msg.content[0].text
 
# 2. Usage is tracked.
assert msg.usage.input_tokens > 0
assert msg.usage.output_tokens > 0
 
# 3. Streaming accumulates the full text.
chunks = []
with client.messages.stream(
    model="claude-haiku-4-5",
    max_tokens=32,
    messages=[{"role": "user", "content": "Reply with the word OK only."}],
) as s:
    for chunk in s.text_stream:
        chunks.append(chunk)
assert "".join(chunks).strip() == "OK"

Common errors

  • anthropic.AuthenticationError. ANTHROPIC_API_KEY is not set or is invalid. Check echo $ANTHROPIC_API_KEY.
  • anthropic.RateLimitError. The request rate exceeds the tier limit. Add exponential backoff or reduce concurrency. See cost-control.
  • anthropic.BadRequestError: max_tokens exceeds limit. The requested max_tokens is higher than the model’s maximum output tokens, which is smaller than its context window. Check the model’s output limit.
  • message.content[0] raises IndexError. The response has zero content blocks, which can happen when generation is cut short, for example with a very low max_tokens. Increase max_tokens or iterate over the blocks instead of indexing.
  • Tool result not accepted. The tool_use_id in the tool result does not match the ID from the assistant’s tool call. Copy it from tool_call.id.
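For the rate-limit case, a generic exponential-backoff wrapper looks like this (a sketch: in real code, catch anthropic.RateLimitError rather than bare Exception, or raise max_retries on the client, which the SDK supports):

```python
import random
import time

def with_backoff(fn, max_attempts=5, base_delay=1.0):
    """Call fn(); on failure, retry with exponential backoff plus jitter.

    Narrow the except clause to anthropic.RateLimitError in real code.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Double the delay each attempt; jitter avoids synchronized retries.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```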