When proposing performance/SLA goals (e.g., failover completion time), require an evidence-based rationale and a repeatable way to validate them—especially for worst-case behavior.

Apply this by documenting:

Example standard text developers can follow:

Failover target: <=10s (worst-case), <=5s (some configs).
Definition: time from heartbeat timeout detection to topology+role update propagated to all impacted nodes, including client observable readiness.
Assumptions: cross-AZ RTT p99=..., heartbeat interval=..., coordinator tick=..., quorum=...
Evidence: benchmark/staging results in regions X/Y with injected failures; include traces/profiles.
Regression guard: CI/staging test fails if p99/worst-case exceeds target.

This ensures performance claims aren’t just experience-based anecdotes, and it prevents future changes from silently breaking latency guarantees.