Adopt a single error-handling standard across LLM/provider layers:
1) Retry only transient failures (including connection jitter) with bounded, classified backoff
2) Keep return/output contracts stable between success and failure
3) Standardize failure encoding and checking at boundaries
Example (type-stable normalization before setting output):
try:
transcription = seq2txt_mdl.transcription(tmp_path)
# success may be tuple(text, token_count)
txt = transcription[0] if isinstance(transcription, tuple) else transcription
except Exception as e:
logging.warning(f"Transcription failed: {e}")
txt = ""
self.set_output("text", txt)
Example (standard retry rule for transient polling blips):
@retry
def _describe_task_status(self, req):
return self.client.DescribeTaskStatus(req)
while retries < max_retries:
resp = self._describe_task_status(req) # transient 429/5xx/DNS blips survive
Result: fewer hidden failure modes, consistent caller behavior, and reliable recovery for transient LLM/API issues.
Enter the URL of a public GitHub repository