Python's os.path.join Absolute-Path Trap: The Base Directory That Quietly Vanishes

2026-05-31

This helper resolves a file path inside the authenticated user's home directory. It's meant to be a thin wrapper that callers use everywhere — download endpoints, file viewers, log tailers. The base directory is always trusted; only the second argument comes from the request. What could go wrong?

import os

USER_ROOT = "/srv/app/userdata"

def resolve_user_file(user_id: str, requested: str) -> str:
    """Return the absolute path to a file inside the user's sandbox."""
    user_home = os.path.join(USER_ROOT, user_id)
    full = os.path.join(user_home, requested)

    # Defence in depth: reject obvious traversal attempts.
    if ".." in requested.split(os.sep):
        raise PermissionError("path traversal blocked")

    return full

# Looks fine on the happy path:
print(resolve_user_file("alice", "notes/today.md"))
# /srv/app/userdata/alice/notes/today.md

# And it blocks the obvious attack:
try:
    resolve_user_file("alice", "../bob/notes/today.md")
except PermissionError as exc:
    print(f"PermissionError: {exc}")
# PermissionError: path traversal blocked

# But this returns /etc/shadow:
print(resolve_user_file("alice", "/etc/shadow"))
# /etc/shadow

The Bug

os.path.join has a behavior most people learn the hard way: if any argument after the first is an absolute path, every earlier argument is silently discarded. From the docs: "If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component."

So os.path.join("/srv/app/userdata/alice", "/etc/shadow") returns "/etc/shadow". The USER_ROOT sandbox you carefully constructed is gone. Worse, the .. check passes — there are no .. components in /etc/shadow — so the function happily hands back a path that escapes the sandbox without ever traversing out of it.
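The discard rule is easy to confirm in isolation. The sketch below uses posixpath (the implementation behind os.path on POSIX systems) so that it behaves the same on any platform; the paths are the article's examples and nothing is read from disk. It also shows that pathlib's "/" operator follows the same rule, so switching to Path objects does not by itself remove the trap.

```python
import posixpath
from pathlib import PurePosixPath

base = "/srv/app/userdata/alice"

# A relative second argument is appended to the base as expected.
relative = posixpath.join(base, "notes/today.md")

# An absolute second argument discards the base entirely.
absolute = posixpath.join(base, "/etc/shadow")

# pathlib's "/" operator applies the same rule.
via_pathlib = PurePosixPath(base) / "/etc/shadow"

print(relative)     # /srv/app/userdata/alice/notes/today.md
print(absolute)     # /etc/shadow
print(via_pathlib)  # /etc/shadow
```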

This is a real CVE pattern. It has shipped in static-file servers, template loaders, and "upload to user folder" endpoints. The reason it slips through review is that the bug looks like correct, idiomatic code: of course you use os.path.join to combine paths. It is the standard recommendation over manual "/" concatenation, and yet concatenation would not have escaped the sandbox for this particular input: "/srv/app/userdata/alice" + "/" + "/etc/shadow" yields "/srv/app/userdata/alice//etc/shadow", which POSIX treats as /srv/app/userdata/alice/etc/shadow, a path still inside alice's directory. Concatenation does nothing to stop .. traversal, though, so it is not a fix either.

The Fix

Don't trust join to enforce containment. Resolve to a canonical absolute path, then verify it lives under the sandbox using os.path.commonpath or Path.is_relative_to (3.9+). Whichever you use, compare whole path components rather than raw strings: a naive startswith("/srv/app/userdata") also matches /srv/app/userdata-evil.

from pathlib import Path

USER_ROOT = Path("/srv/app/userdata").resolve()

def resolve_user_file(user_id: str, requested: str) -> Path:
    user_home = (USER_ROOT / user_id).resolve()
    # Strip leading separators so absolute inputs become relative.
    candidate = (user_home / requested.lstrip("/\\")).resolve()
    if not candidate.is_relative_to(user_home):
        raise PermissionError("path escapes sandbox")
    return candidate

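The crucial steps: resolve() collapses .. and follows symlinks before the containment check, so the check sees the path the filesystem will actually open, and is_relative_to does a proper component-wise comparison rather than a string prefix match. The lstrip is only a convenience that makes absolute inputs behave like relative ones; the containment check is what enforces the boundary. One caveat: the check is only valid at the moment it runs. If an attacker can create or swap symlinks inside the sandbox between the check and the later open(), a TOCTOU race remains.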
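On interpreters older than 3.9, where Path.is_relative_to is unavailable, the same containment test can be approximated with os.path.commonpath. The following is a minimal sketch, not a drop-in replacement: is_inside is a hypothetical helper name, and the paths are illustrative and assumed not to exist as symlinks on the machine running it.

```python
import os

def is_inside(base: str, candidate: str) -> bool:
    """Return True if candidate resolves to base or to a path beneath it.

    Hypothetical helper for illustration; not part of the standard library.
    """
    base_real = os.path.realpath(base)
    cand_real = os.path.realpath(candidate)
    try:
        # commonpath compares whole components, so "userdata-evil"
        # is not treated as being inside "userdata".
        return os.path.commonpath([base_real, cand_real]) == base_real
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False

base = "/srv/app/userdata"
sibling = "/srv/app/userdata-evil/secret"

print(sibling.startswith(base))  # True: the naive prefix check is fooled
print(is_inside(base, sibling))  # False
```

The try/except is there because commonpath raises ValueError rather than returning a sentinel when no common path can exist, and a containment check should treat that case as a rejection.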

Key Takeaway: os.path.join builds paths; it does not enforce a security boundary. Any absolute argument after the first silently erases the base directory, so always resolve() and verify containment explicitly.
