When integrating LLMs/embeddings into production code, treat the model as unreliable: enforce a stable IO contract (especially JSON), validate and safely recover from format mismatches, and route model calls through the project’s unified abstractions with configurable parameters.

Apply this checklist:

1) Make LLM outputs contract-safe

Example (defensive JSON parsing):

import json
import re
from typing import Dict

def parse_llm_json(text: str) -> Dict:
    """Extract a JSON object from raw LLM output, returning {} on any failure."""
    # 1) Prefer an explicit ```json ... ``` fenced block
    code_block = re.search(r"```json\s*(.*?)\s*```", text, re.S | re.I)
    candidate = code_block.group(1) if code_block else None

    # 2) Fallback: greedily match from the first "{" to the last "}"
    #    (a non-greedy \{.*?\} would truncate nested objects)
    if not candidate:
        obj = re.search(r"\{.*\}", text, re.S)
        candidate = obj.group(0) if obj else ""

    if not candidate:
        return {}

    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        return {}

    # 3) Validate the minimum contract: the top level must be an object
    if not isinstance(data, dict):
        return {}
    return data
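
A quick doctest-style check of the happy path and the failure path:

>>> parse_llm_json('```json {"score": 0.9} ```')
{'score': 0.9}
>>> parse_llm_json("no json here")
{}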

2) Never call provider SDKs directly in core business logic
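
Enforce this with a thin client protocol that core modules depend on, plus one adapter per provider. A minimal sketch, assuming a hypothetical LLMClient protocol; the adapter class, default model, and summarize helper are illustrative, though the call shape inside the adapter follows the openai v1 SDK:

from typing import Protocol

class LLMClient(Protocol):
    def complete(self, prompt: str, **params) -> str: ...

class OpenAIChatClient:
    """Adapter: the only place the provider SDK is imported."""
    def __init__(self, model: str = "gpt-4o-mini"):
        from openai import OpenAI  # provider SDK stays out of core modules
        self._client = OpenAI()
        self._model = model

    def complete(self, prompt: str, **params) -> str:
        resp = self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
            **params,
        )
        return resp.choices[0].message.content or ""

def summarize(doc: str, llm: LLMClient) -> str:
    # Core business logic depends only on the LLMClient protocol.
    return llm.complete(f"Summarize:\n{doc}", temperature=0.2)

Swapping providers then means adding an adapter, not editing business logic.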

3) Keep embeddings/models configurable and extensible
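
One way to keep this configuration-driven is a small registry of embedder factories keyed by provider name. The register_embedder decorator, config keys, and the sentence-transformers backend below are an illustrative sketch, not a fixed API:

from typing import Callable, Dict, List

EmbedFn = Callable[[List[str]], List[List[float]]]
_EMBEDDERS: Dict[str, Callable[..., EmbedFn]] = {}

def register_embedder(name: str):
    def deco(factory):
        _EMBEDDERS[name] = factory
        return factory
    return deco

@register_embedder("sentence-transformers")
def _st_factory(model: str = "all-MiniLM-L6-v2") -> EmbedFn:
    from sentence_transformers import SentenceTransformer
    st = SentenceTransformer(model)
    return lambda texts: st.encode(texts).tolist()

def build_embedder(config: dict) -> EmbedFn:
    # e.g. config = {"provider": "sentence-transformers", "model": "all-MiniLM-L6-v2"}
    factory = _EMBEDDERS[config["provider"]]
    return factory(**{k: v for k, v in config.items() if k != "provider"})

New backends are then one registration away, and the model choice lives in configuration rather than code.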

4) Keep message/history handling consistent
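
Consistency is easiest when all call sites share one normalized message shape and convert to provider format only at the boundary. A minimal sketch, assuming a hypothetical Message dataclass and truncation policy:

from dataclasses import dataclass
from typing import List, Literal

@dataclass(frozen=True)
class Message:
    role: Literal["system", "user", "assistant"]
    content: str

def to_openai_messages(history: List[Message]) -> List[dict]:
    # Convert the internal shape to provider format at the boundary only.
    return [{"role": m.role, "content": m.content} for m in history]

def truncate_history(history: List[Message], max_messages: int = 20) -> List[Message]:
    # Keep any system prompts plus the most recent turns.
    system = [m for m in history if m.role == "system"]
    rest = [m for m in history if m.role != "system"]
    keep = max(max_messages - len(system), 0)
    return system + (rest[-keep:] if keep else [])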

5) Add fallback behavior for empty/failed vector sub-results
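
A sketch of graceful degradation for retrieval, assuming a hypothetical keyword_search as the secondary source; the point is that an exception or an empty sub-result never silently propagates an empty context:

import logging
from typing import Callable, List

logger = logging.getLogger(__name__)

def retrieve_with_fallback(
    query: str,
    vector_search: Callable[[str], List[str]],
    keyword_search: Callable[[str], List[str]],
) -> List[str]:
    try:
        hits = vector_search(query)
    except Exception:
        # Log and degrade rather than failing the whole request.
        logger.warning("vector search failed for %r; falling back", query)
        hits = []
    if not hits:
        # Empty sub-result: fall back to keyword search instead of returning nothing.
        hits = keyword_search(query)
    return hits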

This standard prevents production breakage from minor LLM formatting drift, improves portability across LLM providers, and makes embedding/model changes safe and configuration-driven.