Python: Security

Overview

Python security failures cluster around a few predictable patterns: trusting user input, constructing queries with string concatenation, deserializing untrusted bytes, and shipping secrets in code. This page names each pattern and the fix. For dependency vulnerability scanning, see python-dependency-management.

Validate all external input with Pydantic

Never trust data from HTTP requests, environment variables, files, or message queues. Parse it at the boundary with Pydantic v2 before it enters business logic.

from dataclasses import dataclass
from typing import Annotated

# EmailStr needs the email-validator package: pip install "pydantic[email]"
from pydantic import BaseModel, EmailStr, Field

PositiveInt = Annotated[int, Field(gt=0)]

@dataclass
class User:  # stand-in for the application's domain model
    username: str
    email: str
    age: int

class CreateUserRequest(BaseModel):
    username: Annotated[str, Field(min_length=3, max_length=50, pattern=r"^[a-z0-9_]+$")]
    email: EmailStr
    age: PositiveInt

def create_user(raw: dict) -> User:
    req = CreateUserRequest.model_validate(raw)  # raises ValidationError on bad input
    return User(username=req.username, email=req.email, age=req.age)

model_validate raises ValidationError with field-level details on bad input. Catch it at the API layer and return a 422 response. Never pass the raw dict further into the system. See fastapi for how FastAPI handles this automatically via type annotations.
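A minimal sketch of that boundary handling without a web framework. The handler name `handle_create_user` and the `(status, body)` return shape are hypothetical, chosen only for illustration. The email field is left out so the sketch does not depend on the optional email-validator package.

```python
from typing import Annotated, Any

from pydantic import BaseModel, Field, ValidationError


class CreateUserRequest(BaseModel):
    username: Annotated[str, Field(min_length=3, max_length=50, pattern=r"^[a-z0-9_]+$")]
    age: Annotated[int, Field(gt=0)]


def handle_create_user(raw: dict[str, Any]) -> tuple[int, dict[str, Any]]:
    """Hypothetical API-layer handler: returns (HTTP status, response body)."""
    try:
        req = CreateUserRequest.model_validate(raw)
    except ValidationError as exc:
        # Report which field failed and why, without echoing the raw input back.
        details = [
            {"field": ".".join(str(part) for part in err["loc"]), "message": err["msg"]}
            for err in exc.errors()
        ]
        return 422, {"errors": details}
    # Only the validated model, never the raw dict, flows onward from here.
    return 201, {"username": req.username, "age": req.age}
```

Building the error list by hand keeps the response small and leaves out the offending input values, which may themselves be sensitive.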

Use parameterized queries; never format SQL strings

String formatting in SQL queries opens the door to SQL injection. Use parameterized queries unconditionally.

import sqlite3
 
conn = sqlite3.connect("app.db")
 
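# Bad: SQL injection possible, because name comes straight from the request
name = request_data["name"]  # request_data: untrusted input, e.g. a parsed request body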
conn.execute(f"SELECT * FROM users WHERE name = '{name}'")
 
# Good: parameterized
conn.execute("SELECT * FROM users WHERE name = ?", (name,))

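ORM layers (SQLAlchemy, Tortoise) parameterize by default when you use the query builder API. They are still vulnerable if you drop to text() and build its argument with f-strings or other string formatting. In SQLAlchemy, write :name placeholders in the SQL and pass the values separately as bound parameters. See the SQLAlchemy docs on bindparam for safe raw SQL with variables.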
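The difference is easy to observe against an in-memory database. The sketch below uses only the standard library; the table contents and the payload string are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# A classic injection payload supplied where a user name is expected.
malicious = "nobody' OR '1'='1"

# Formatted into the SQL text, the payload rewrites the WHERE clause
# and the query matches every row.
leaked = conn.execute(
    f"SELECT name FROM users WHERE name = '{malicious}'"
).fetchall()

# Bound as a parameter, the payload is compared as one literal string
# and matches nothing.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

# Named placeholders behave the same way and read better with many values.
named = conn.execute(
    "SELECT name FROM users WHERE name = :name", {"name": "alice"}
).fetchall()

print(len(leaked), safe, named)
```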

Never deserialize untrusted data with `pickle`

pickle.loads can execute arbitrary code: a pickle stream may instruct the unpickler to import and call any callable with arguments of the sender's choosing. Any attacker who controls the bytes controls the process.

# Bad: remote code execution if content is attacker-controlled
import pickle
obj = pickle.loads(untrusted_bytes)
 
# Good: use a safe format
import json
obj = json.loads(untrusted_string)
 
# Good: use Pydantic for structured data
from pydantic import BaseModel

class MyModel(BaseModel):  # placeholder schema; declare the fields you expect
    name: str

result = MyModel.model_validate_json(untrusted_string)

Use pickle only for internal caches where you control both the writer and the reader, and the data never crosses a trust boundary. Document this constraint in the code with a comment.
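To see why the bytes alone are enough, here is a harmless sketch of the mechanism. The `Payload` class is hypothetical and stands in for whatever an attacker would craft. It asks the unpickler to call `eval` on an arithmetic expression, where a real exploit would call something like `os.system`.

```python
import pickle


class Payload:
    """Stands in for attacker-crafted data; the class is not needed at load time."""

    def __reduce__(self):
        # Tells the unpickler: "to rebuild this object, call eval('6 * 7')".
        return (eval, ("6 * 7",))


untrusted_bytes = pickle.dumps(Payload())

# The expression is evaluated inside pickle.loads, before any of your code
# has a chance to inspect the result.
result = pickle.loads(untrusted_bytes)
print(result)
```

The value returned by `pickle.loads` is whatever the chosen callable returned, not a `Payload` instance, so validating the object after loading comes too late.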

Load secrets from the environment, not from code

API keys, database passwords, and tokens must not appear in source files, committed config, or Docker image layers. Load them at runtime from environment variables or a secrets manager.

import os
from dotenv import load_dotenv
 
load_dotenv()  # reads .env in dev; does nothing in production, where no .env file exists and env vars are set externally
 
DATABASE_URL = os.environ["DATABASE_URL"]  # raises KeyError if missing: fail fast
API_KEY = os.environ.get("OPENAI_API_KEY")  # returns None if missing: acceptable for optional

Use os.environ["KEY"] (raises on missing) rather than os.environ.get("KEY") (returns None) for required secrets. A clear startup failure is better than a confusing runtime error ten calls later.
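One way to make the startup failure even clearer is a small wrapper that names the missing variable in the error. This is a sketch: `require_env` and `MissingSecretError` are hypothetical names, not part of the standard library or python-dotenv. Treating an empty string as missing is a design choice you may not want.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised at startup when a required environment variable is absent."""


def require_env(name: str) -> str:
    """Return the value of a required environment variable or fail fast."""
    value = os.environ.get(name)
    if not value:  # unset, or set to an empty string
        # The message names the variable only; it never includes a secret value.
        raise MissingSecretError(f"Required environment variable {name} is not set")
    return value
```

Called once at import time of the settings module, for example `DATABASE_URL = require_env("DATABASE_URL")`, it stops the process before it serves any traffic.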

Add .env to .gitignore. Never commit it. Use python-dotenv in development; in production, set variables via the platform’s secrets mechanism (GitHub Secrets, AWS Secrets Manager, Fly.io secrets).

Prevent SSRF in outbound HTTP calls

Server-side request forgery (SSRF) occurs when user-supplied URLs are fetched by the server, allowing attackers to reach internal services. Validate URLs before fetching.

from urllib.parse import urlparse
import httpx
 
ALLOWED_SCHEMES = {"https"}
BLOCKED_HOSTS = {"169.254.169.254", "metadata.google.internal"}
 
def safe_fetch(url: str) -> bytes:
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Disallowed scheme: {parsed.scheme}")
    if parsed.hostname in BLOCKED_HOSTS:
        raise ValueError(f"Blocked host: {parsed.hostname}")
    return httpx.get(url, follow_redirects=False).content

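Keep automatic redirect following disabled (follow_redirects=False) when fetching user-supplied URLs: a redirect can send the client to an internal host that never passed the checks. httpx does not follow redirects by default, but passing the argument makes the intent explicit. A static blocklist like BLOCKED_HOSTS only catches the names it lists, so a hostname that resolves to an internal address slips through. Consider a DNS-based allowlist for services that require external HTTP calls.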
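A hostname check can be strengthened by resolving the host and rejecting any address that is not publicly routable. The sketch below uses only the standard library. The helper names `is_public_ip` and `host_is_public` are hypothetical, and the policy (every resolved address must be global) is an assumption rather than a complete SSRF defense.

```python
import ipaddress
import socket


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses (not private, loopback, link-local)."""
    # Drop any IPv6 zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(address.split("%", 1)[0])
    return ip.is_global


def host_is_public(hostname: str) -> bool:
    """Resolve hostname and require every returned address to be public."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected, not fetched
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_ip(addr) for addr in addresses)


print(is_public_ip("169.254.169.254"), is_public_ip("8.8.8.8"))
```

Checking and then fetching still leaves a gap: the name can resolve to a different address by the time the request is made (DNS rebinding). Connecting to the vetted IP address directly, or routing outbound calls through an egress proxy, closes that gap.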

Audit dependencies regularly

Transitive dependencies introduce vulnerabilities you did not write. Run pip-audit in CI to catch known CVEs before they reach production.

uv run pip-audit

When pip-audit reports a vulnerability, check whether a patched version exists and upgrade to it. If no patch is available, assess exploitability and, if the CVE is in a code path you use, consider replacing the library. See python-dependency-management for the full lockfile workflow.
