When code uses externally influenced data (URLs, hostnames, user-provided strings, identifiers) across trust boundaries, treat it as hostile and apply layered defenses:
Network/URL safety (SSRF + DoS): use an SSRF-safe client/validator, set timeouts, enforce maximum response sizes, and stream/abort as soon as caps are exceeded (don’t fully buffer). Also reject obviously bad declared sizes (e.g., Content-Length over the cap).
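For example, a minimal sketch using `requests` (the cap, timeout values, and the `fetch_capped` name are illustrative assumptions; a hardened client would also pin DNS resolution and vet redirects rather than simply disabling them):

```python
import requests

MAX_BYTES = 5 * 1024 * 1024  # assumed cap; tune per endpoint
TIMEOUT = (3.05, 10)         # (connect, read) timeouts in seconds

def fetch_capped(url: str) -> bytes:
    """Fetch with timeouts and a hard size cap; assumes `url` has
    already passed SSRF validation."""
    with requests.get(url, stream=True, timeout=TIMEOUT,
                      allow_redirects=False) as resp:
        resp.raise_for_status()
        # Reject obviously bad declared sizes before reading the body.
        declared = resp.headers.get("Content-Length", "0")
        if not declared.isdigit() or int(declared) > MAX_BYTES:
            raise ValueError(f"bad declared size {declared!r} (cap {MAX_BYTES})")
        # Stream and abort the moment the cap is crossed; never fully buffer.
        chunks, total = [], 0
        for chunk in resp.iter_content(chunk_size=64 * 1024):
            total += len(chunk)
            if total > MAX_BYTES:
                resp.close()  # tear down the connection immediately
                raise ValueError(f"response exceeded {MAX_BYTES}-byte cap")
            chunks.append(chunk)
        return b"".join(chunks)
```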
Validation/allowlisting: for hostnames/IPs and other security-relevant fields, validate against an explicit policy (blocked private ranges, cloud metadata IPs/hosts, etc.). Keep hostname coverage consistent with your environment (e.g., consider whether host.docker.internal should be blocked or allowed based on your threat model).
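A sketch of such a policy check using only the standard library (the blocklist contents are assumptions for illustration; note that a resolve-then-connect pattern remains exposed to DNS rebinding unless the validated IP is pinned for the actual request):

```python
import ipaddress
import socket

# Illustrative blocklist; extend per your threat model.
BLOCKED_HOSTNAMES = {"localhost", "metadata.google.internal",
                     "host.docker.internal"}

def validate_host(hostname: str) -> None:
    """Raise ValueError if `hostname` violates the SSRF policy."""
    if hostname.lower().rstrip(".") in BLOCKED_HOSTNAMES:
        raise ValueError(f"blocked hostname: {hostname}")
    # Check every address the name resolves to; one bad record fails it.
    # (169.254.169.254, the cloud metadata IP, is caught by is_link_local.)
    for *_, sockaddr in socket.getaddrinfo(hostname, None):
        ip = ipaddress.ip_address(sockaddr[0])
        if (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified):
            raise ValueError(f"{hostname} resolves to disallowed {ip}")
```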
Sanitize for downstream syntax/rendering: before using strings in templates/renderers (Mermaid nodes, formatted placeholders, etc.), constrain allowed characters or escape/sanitize safely; document the assumptions about allowed character sets.
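For instance, a conservative allowlist sanitizer for Mermaid-style node labels (the allowed character set and length cap are assumptions; document and adjust them for your renderer):

```python
import re

# Assumed allowlist: alphanumerics, space, and a little punctuation.
# Anything else (quotes, brackets, backticks, semicolons) is replaced.
_DISALLOWED = re.compile(r"[^A-Za-z0-9 _.,:\-]")

def sanitize_label(value: str, max_len: int = 80) -> str:
    """Constrain a string before embedding it in a Mermaid node label."""
    cleaned = _DISALLOWED.sub("_", value)[:max_len]
    return cleaned or "_"  # never emit an empty label

# Usage: the label can no longer break out of the quoted node syntax.
node = f'n1["{sanitize_label("user] --> evil\\" input")}"]'
```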
Don’t silence security tooling weakly: avoid noqa/suppression for insecure primitives (e.g., MD5) unless there’s a clearly justified reason and the rationale is documented.
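In Python, for example, there is usually a better option than a bare suppression comment (a sketch; the `usedforsecurity` flag requires Python 3.9+):

```python
import hashlib

data = b"example payload"

# Security-relevant hashing: switch to a modern algorithm instead of
# suppressing the warning.
digest = hashlib.sha256(data).hexdigest()

# Genuinely non-security MD5 (e.g., a cache key an external system expects):
# declare the intent in code rather than silencing the linter; this is
# recognized by tools like Bandit.
cache_key = hashlib.md5(data, usedforsecurity=False).hexdigest()
```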
Document limitations + add tests: security middleware/controls are mitigations, not guarantees; document what they protect against and what they don’t (e.g., PII-handling middleware helps avoid sending PII to an LLM, but doesn’t guarantee full compliance across your logging/checkpointing/infrastructure). Add tests for the security-relevant edge cases (oversized or lying headers, SSRF policy behavior, etc.).
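A pytest sketch of such edge-case tests, assuming the `fetch_capped` and `validate_host` helpers above live in a hypothetical `myapp.netsafety` module:

```python
import pytest

# Hypothetical import path: wherever the sketches above actually live.
from myapp.netsafety import fetch_capped, validate_host
import myapp.netsafety as netsafety

class FakeResponse:
    """Minimal stand-in for requests.Response, enough for fetch_capped."""
    def __init__(self, headers, chunks):
        self.headers, self._chunks = headers, chunks
    def __enter__(self): return self
    def __exit__(self, *exc): return False
    def raise_for_status(self): pass
    def iter_content(self, chunk_size): return iter(self._chunks)
    def close(self): pass

def test_ssrf_policy_blocks_metadata_and_loopback():
    for host in ("169.254.169.254", "localhost"):
        with pytest.raises(ValueError):
            validate_host(host)

def test_aborts_when_body_exceeds_lying_content_length(monkeypatch):
    # Server declares 10 bytes but streams far past the cap.
    fake = FakeResponse({"Content-Length": "10"}, [b"x" * 65536] * 200)
    monkeypatch.setattr(netsafety.requests, "get", lambda *a, **kw: fake)
    with pytest.raises(ValueError):
        fetch_capped("http://example.com/stream")
```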
If you apply this standard consistently, you reduce SSRF and resource-exhaustion risk while also preventing unsafe rendering/templating behavior and avoiding “security bypass by suppression.”