Prompt
When integrating LLMs or embeddings into production code, treat the model as unreliable: enforce a stable I/O contract (especially for JSON), validate outputs and recover safely from format mismatches, and route model calls through the project’s unified abstractions with configurable parameters.
Apply this checklist:
1) Make LLM outputs contract-safe
- Assume responses may not contain the expected ```json block or may include extra text.
- Parse defensively (try strict code-fence JSON first, then fall back to the widest brace-delimited span in the text), validate required fields, and on failure return a safe default (e.g., an empty dict) rather than throwing.
Example (defensive JSON parsing):
```python
import json
import re
from typing import Dict


def parse_llm_json(text: str) -> Dict:
    """Extract a JSON object from LLM output, returning {} on any failure."""
    # 1) Prefer a fenced ```json ... ``` block.
    code_block = re.search(r"```json\s*(.*?)\s*```", text, re.S | re.I)
    candidate = code_block.group(1) if code_block else None
    # 2) Fallback: the widest brace-delimited span (greedy, so nested objects survive).
    if not candidate:
        obj = re.search(r"\{.*\}", text, re.S)
        candidate = obj.group(0) if obj else ""
    if not candidate:
        return {}
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        return {}
    # 3) Validate the minimum contract: the top-level value must be an object.
    if not isinstance(data, dict):
        return {}
    return data
```
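The caller can then enforce whatever required fields its own contract demands. A brief usage sketch; raw_response and the field names are purely illustrative:

```python
data = parse_llm_json(raw_response)
# Hypothetical contract: both keys must be present, else treat as a parse miss.
if not {"title", "tags"} <= data.keys():
    data = {}
```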
2) Never call provider SDKs directly in core business logic
- Use the project’s LLM abstraction layer (e.g., the same path as LLMExtractor/LLMTranslator) to centralize auth, retries, routing, and message handling; a minimal wrapper sketch follows this list.
- Avoid global mutable history variables; rely on framework-managed conversation state.
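As a sketch of what that single choke point can look like (the LLMClient name, its chat() method, and the retry policy are assumptions, not an existing project API):

```python
from typing import Callable, Dict, List, Optional


class LLMClient:
    """One path for all model calls: auth, retries, and routing live here (hypothetical)."""

    def __init__(self, complete: Callable[[List[Dict]], str], max_retries: int = 2):
        self._complete = complete        # injected provider call (SDK, proxy, local model)
        self._max_retries = max_retries

    def chat(self, messages: List[Dict]) -> str:
        last_err: Optional[Exception] = None
        for _ in range(self._max_retries + 1):
            try:
                return self._complete(messages)
            except Exception as err:     # production code would narrow this to transport errors
                last_err = err
        raise last_err                   # loop always ran at least once, so this is set
```

Business logic then depends only on LLMClient.chat, so swapping providers touches configuration, not call sites.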
3) Keep embeddings/models configurable and extensible
- Don’t hardcode embedding dimensions or provider-specific model details.
- Accept/configure an embedding_fn (or an equivalent abstraction) so switching embedding providers/models doesn’t require code changes (see the sketch after this list).
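A minimal sketch of the injectable shape; the VectorIndexer consumer and type alias are hypothetical:

```python
from typing import Callable, List, Optional, Sequence

EmbeddingFn = Callable[[List[str]], List[Sequence[float]]]


class VectorIndexer:
    def __init__(self, embedding_fn: EmbeddingFn):
        self._embed = embedding_fn
        self._dim: Optional[int] = None   # discovered from output, never hardcoded

    def index(self, texts: List[str]) -> List[Sequence[float]]:
        vectors = self._embed(texts)
        if vectors and self._dim is None:
            self._dim = len(vectors[0])   # a provider swap updates this automatically
        return vectors
```

Switching providers then means passing a different embedding_fn from configuration.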
4) Keep message/history handling consistent
- If a proxy backend can’t handle multi-turn conversations directly, convert messages only at that boundary; let the framework keep managing multi-turn formatting and history (see the sketch below).
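One way to do that conversion at the adapter boundary; the role-prefixed prompt format is an assumption about what a single-turn backend accepts:

```python
from typing import Dict, List


def flatten_messages(messages: List[Dict[str, str]]) -> str:
    """Flatten framework-managed chat history into one single-turn prompt."""
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")            # cue the backend to continue as the assistant
    return "\n".join(lines)
```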
5) Add fallback behavior for empty/failed vector sub-results
- If vector-based graph/entity expansion returns empty, fall back to keyword-based search (or another deterministic strategy) so retrieval doesn’t silently degrade (see the sketch below).
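Sketched with injected search callables; both function names are hypothetical:

```python
from typing import Callable, List


def expand_entities(
    query: str,
    vector_search: Callable[[str], List[str]],
    keyword_search: Callable[[str], List[str]],
) -> List[str]:
    """Try vector-based expansion first; fall back deterministically on empty results."""
    results = vector_search(query)
    if not results:
        # The deterministic fallback keeps retrieval from silently returning nothing.
        results = keyword_search(query)
    return results
```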
This standard prevents production breakage from minor LLM formatting drift, improves portability across LLM providers, and makes embedding/model changes safe and configuration-driven.