AI Clinical Decision Support: Architecture, Guardrails & Liability
Building AI that assists clinicians without overstepping. Decision support, not diagnosis. Architecture patterns, guardrails, audit trails, and how to design for the liability questions that always come.
The Line We Must Not Cross
AI can suggest. AI cannot diagnose. A licensed clinician makes every clinical decision. Every architectural choice in a Clinical Decision Support (CDS) system flows from that line.
I've built CDS components for two healthcare clients. The pattern is consistent: provide structured suggestions with citations, log everything, never auto-act.
Core Architecture
```
┌─────────────────┐     ┌──────────────────┐     ┌────────────────┐
│ Clinician Query │ ──→ │ De-identification│ ──→ │   LLM + RAG    │
└─────────────────┘     └──────────────────┘     └───────┬────────┘
                                                         │
                                                         ▼
┌─────────────────┐     ┌──────────────────┐     ┌────────────────┐
│    Audit Log    │ ←── │   Suggestion +   │ ←── │   Guardrails   │
└─────────────────┘     │ Citations + Conf │     │  + Citations   │
                        └──────────────────┘     └────────────────┘
```
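To make the flow concrete, here is a minimal sketch of the stages wired together in diagram order. Everything in it is an assumption for illustration: `run_cds_pipeline`, `PipelineResult`, and the four injected callables are hypothetical names, and logging the de-identified query is one possible choice, not something the architecture above prescribes.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PipelineResult:
    suggestion: Optional[str]  # None when a guardrail blocks the draft
    violations: list[str] = field(default_factory=list)


def run_cds_pipeline(
    query: str,
    deidentify: Callable[[str], str],    # hypothetical stage: strip identifiers
    generate: Callable[[str], str],      # hypothetical stage: LLM + RAG
    check: Callable[[str], list[str]],   # hypothetical stage: deterministic guardrails
    audit: Callable[[dict], None],       # hypothetical stage: append to the audit log
) -> PipelineResult:
    """Run one clinician query through the stages in diagram order."""
    clean_query = deidentify(query)
    draft = generate(clean_query)
    violations = check(draft)
    result = PipelineResult(
        suggestion=None if violations else draft,
        violations=violations,
    )
    # Log every interaction, including the ones the guardrails blocked.
    audit({"query": clean_query, "draft": draft, "violations": violations})
    return result
```

Passing the stages in as callables keeps each one testable on its own, and the function never acts on the suggestion: it only returns it for the clinician to review.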
Guardrails as a First-Class Stage
Every LLM response passes through deterministic guardrails before reaching the clinician:
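```python
from __future__ import annotations  # lets the Citation annotation resolve lazily

from dataclasses import dataclass

# Human-readable policy. check_response below enforces the first two rules.
GUARDRAILS = [
    "Must not contain prescription dose recommendations without citation",
    "Must not assert diagnosis; only suggest possibilities",
    "Must cite at least one source for any factual claim",
    "Must include confidence (low/medium/high) for every suggestion",
]


@dataclass
class GuardrailViolation:
    rule: str
    severity: str
    suggested_fix: str


def check_response(text: str, citations: list[Citation]) -> list[GuardrailViolation]:
    """Return every guardrail violation found in an LLM response."""
    # Citation is the model from the Citation-First RAG section.
    # contains_dose_recommendation and contains_definitive_diagnosis are
    # deterministic text checks defined elsewhere.
    violations = []
    if contains_dose_recommendation(text) and not citations:
        violations.append(GuardrailViolation(
            rule="dose-without-citation",
            severity="block",
            suggested_fix="add citation or remove dose",
        ))
    if contains_definitive_diagnosis(text):
        violations.append(GuardrailViolation(
            rule="definitive-diagnosis",
            severity="block",
            suggested_fix="rephrase as differential",
        ))
    return violations
```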
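The two helpers that `check_response` calls are not shown in this post. As a rough sketch, assuming they are plain regex checks, they might look like the following. The patterns are illustrative only and not a clinically validated vocabulary; a real deployment would need a reviewed term list and would still produce false positives (a lab value such as "2 g" matches the dose pattern).

```python
import re

# Illustrative patterns only; assumptions, not a validated clinical vocabulary.
DOSE_PATTERN = re.compile(
    r"\b\d+(?:\.\d+)?\s*(?:mg|mcg|g|ml|units?)\b",
    re.IGNORECASE,
)
DEFINITIVE_PATTERN = re.compile(
    r"\b(?:the patient has|diagnosis is|this is definitely)\b",
    re.IGNORECASE,
)


def contains_dose_recommendation(text: str) -> bool:
    """True if the text mentions a number followed by a dosing unit."""
    return DOSE_PATTERN.search(text) is not None


def contains_definitive_diagnosis(text: str) -> bool:
    """True if the text uses phrasing that asserts a diagnosis outright."""
    return DEFINITIVE_PATTERN.search(text) is not None
```

Keeping these checks as regexes rather than a second model call is what makes the guardrail stage deterministic: the same response always produces the same verdict, which matters when the verdict ends up in an audit log.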
Citation-First RAG
Every fact in the response is anchored to a retrieved source:
```python
from typing import Literal

from pydantic import BaseModel, Field


# Citation is defined first so that Suggestion can reference it.
class Citation(BaseModel):
    source: str       # PMID, DOI, guideline ID
    quote: str        # Exact passage retrieved
    relevance: float


class Suggestion(BaseModel):
    text: str
    citations: list[Citation]
    confidence: Literal["low", "medium", "high"]
    differential: list[str] = Field(description="Other possibilities to consider")
```
If the LLM produces text that doesn't map to a citation, the guardrail rejects it.
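One way to check that text "maps to a citation" is sketched below. It assumes a convention this post does not specify: the LLM is prompted to end each sentence with a bracketed marker such as `[1]`, placed before the final punctuation, that indexes into the citation list. The function names are hypothetical, and the sentence splitter is deliberately naive.

```python
import re

MARKER = re.compile(r"\[(\d+)\]")


def split_sentences(text: str) -> list[str]:
    # Naive splitter; a production system would use a proper sentence tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def ungrounded_sentences(text: str, quotes: list[str], chunks: list[str]) -> list[str]:
    """Return sentences that lack a valid, verifiable citation marker.

    quotes[i - 1] is the quote for marker [i]; chunks are the retrieved passages.
    """
    failures = []
    for sentence in split_sentences(text):
        indices = [int(m) for m in MARKER.findall(sentence)]
        grounded = bool(indices) and all(
            # The marker must point at a real citation whose quote
            # appears verbatim in at least one retrieved chunk.
            1 <= i <= len(quotes) and any(quotes[i - 1] in chunk for chunk in chunks)
            for i in indices
        )
        if not grounded:
            failures.append(sentence)
    return failures
```

A non-empty return value would be treated as a blocking violation. Checking the quote against the retrieved chunks also catches a citation the model invented, not just a missing one.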
The Mandatory Audit Trail
Every query and response is logged with full provenance:
```sql
CREATE TABLE cds_interactions (
    id                   UUID PRIMARY KEY,
    clinician_id         UUID NOT NULL,
    patient_id_encrypted TEXT NOT NULL,
    query_text           TEXT NOT NULL,
    retrieved_chunks     JSONB NOT NULL,
    llm_model            VARCHAR(64) NOT NULL,
    llm_version          VARCHAR(64) NOT NULL,
    prompt_version       VARCHAR(32) NOT NULL,
    response             JSONB NOT NULL,
    guardrails_passed    BOOLEAN NOT NULL,
    clinician_accepted   BOOLEAN,
    clinician_action     TEXT,
    created_at           TIMESTAMPTZ DEFAULT NOW()
);
```
Why every field matters: if a malpractice lawsuit arises five years later, you must be able to reconstruct exactly what the system told the clinician, on what evidence, and with which model and prompt version.
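As a sketch of what writing to this table could look like, the function below assembles one row as a dict keyed by column name. `build_audit_row` is a hypothetical helper, and it assumes the patient ID was encrypted upstream and that the JSONB columns are passed to the database driver as JSON strings.

```python
import json
import uuid
from datetime import datetime, timezone


def build_audit_row(
    clinician_id: uuid.UUID,
    patient_id_encrypted: str,   # assumed to be encrypted before it reaches this function
    query_text: str,
    retrieved_chunks: list[dict],
    llm_model: str,
    llm_version: str,
    prompt_version: str,
    response: dict,
    guardrails_passed: bool,
) -> dict:
    """Assemble one cds_interactions row; keys match the table's column names."""
    return {
        "id": str(uuid.uuid4()),
        "clinician_id": str(clinician_id),
        "patient_id_encrypted": patient_id_encrypted,
        "query_text": query_text,
        # sort_keys makes the stored JSON stable, so identical inputs compare equal.
        "retrieved_chunks": json.dumps(retrieved_chunks, sort_keys=True),
        "llm_model": llm_model,
        "llm_version": llm_version,
        "prompt_version": prompt_version,
        "response": json.dumps(response, sort_keys=True),
        "guardrails_passed": guardrails_passed,
        # Filled in later, once the clinician accepts or rejects the suggestion.
        "clinician_accepted": None,
        "clinician_action": None,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```

The row is built at response time with the clinician fields left empty, so an interaction is on record even if the clinician never acts on it.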
Liability-Conscious UI
The interface enforces the line between AI suggestion and human decision:
- Suggestions are visually distinct from clinician-entered text
- Every suggestion has an "accept" or "reject" action that's logged
- Confidence is shown as a colored badge, not a percentage (avoid false precision)
- Sources are clickable and verifiable
- The clinician's final note is theirs alone — AI text never auto-fills patient records
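Two of these rules translate directly into code. The sketch below is illustrative: the badge palette, `DecisionEvent`, and `record_decision` are hypothetical names, not part of any existing system described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical palette; the point is a coarse badge rather than a percentage.
BADGE_COLORS = {"low": "red", "medium": "amber", "high": "green"}


def badge_for(confidence: str) -> str:
    """Map a low/medium/high confidence label to a badge color."""
    return BADGE_COLORS[confidence]


@dataclass(frozen=True)
class DecisionEvent:
    interaction_id: str
    clinician_id: str
    decision: str      # "accept" or "reject"
    decided_at: str    # ISO 8601 timestamp, UTC


def record_decision(interaction_id: str, clinician_id: str, decision: str) -> DecisionEvent:
    """Build the event that is logged when a clinician acts on a suggestion."""
    if decision not in ("accept", "reject"):
        raise ValueError(f"unknown decision: {decision!r}")
    return DecisionEvent(
        interaction_id=interaction_id,
        clinician_id=clinician_id,
        decision=decision,
        decided_at=datetime.now(timezone.utc).isoformat(),
    )
```

The event is frozen so that a logged decision cannot be mutated afterward, and the function rejects anything other than an explicit accept or reject, so there is no silent default.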
What I Refuse to Build
- AI that auto-prescribes
- AI that produces final patient-facing diagnoses
- AI that bypasses clinician review
- AI without HITRUST-grade audit logs
- AI deployed without a clinical safety review by a licensed physician
These aren't engineering decisions. They're ethical ones.