Centralized input guards

Require a shared, spec-aligned “guard” step for any untrusted input used in security-sensitive contexts (SQL construction, dynamic table names, outbound URL fetch). Fail closed (raise a safe exception) before interpolation/requests, and reuse a single utility to avoid security-policy drift.

copy reviewer prompt

Prompt

Reviewer Prompt

Require a shared, spec-aligned “guard” step for any untrusted input used in security-sensitive contexts (SQL construction, dynamic table names, outbound URL fetch). Fail closed (raise a safe exception) before interpolation/requests, and reuse a single utility to avoid security-policy drift.

Practical rules:

  • SQL / identifier interpolation: validate identifiers (e.g., UUIDs) before using them in f-strings or dynamic table names; parameterize whenever possible.
  • Reuse instead of duplicating: if an existing validation helper exists, ensure it matches the required constraints (e.g., don’t silently narrow “UUID version” rules) and raises the exception type your call site expects.
  • Outbound fetch (SSRF): centralize URL safety checks in a shared utility; avoid per-endpoint ad-hoc rules.
  • Logging hygiene: don’t log full user-supplied URLs (may contain credentials/tokens). Log minimal metadata (scheme/host or a redacted form).

Example pattern (UUID guard):

import uuid
import logging
logger = logging.getLogger(__name__)

def assert_valid_uuid(value: str, *, label: str) -> str:
    try:
        # Validate; optionally canonicalize to the storage format your SQL expects
        return uuid.UUID(str(value)).hex  # e.g., 32-char form
    except (ValueError, TypeError):
        logger.warning("Rejected invalid %s (len=%d)", label, len(str(value)))
        raise ValueError(f"Invalid {label} format")

# Use only after guard
kb_hex = assert_valid_uuid(kb_ids[0], label="kb_id")
table_name = f"ragflow_{tenant_id}_{kb_hex}"

Example pattern (SSRF logging hygiene):

# Instead of logging full URL:
logger.warning("SSRF guard blocked URL (scheme=%r host=%r)", scheme, parsed.hostname)

Source discussions