HIPAA-Compliant AI: Architecture, Encryption & Audit Trails
How to design AI-powered healthcare systems that pass HIPAA, SOC 2, and GDPR audits. Real patterns from building DrMackMedicine — covering PHI handling, audit logging, and compliant LLM usage.
Why Healthcare AI is Different
When I built DrMackMedicine — an AI platform for clinical workflows — the hardest problems were not the AI models themselves. They were compliance: how do you let an LLM process patient data without violating HIPAA?
Every architectural decision flowed from three questions: Is this PHI? Who can access it? Is it audited?
PHI Data Classification
The first step is classifying every data field:
```typescript
// PHI classifications
enum DataSensitivity {
  PHI = 'PHI',           // Protected Health Information: HIPAA applies
  PII = 'PII',           // Personally Identifiable Information: GDPR applies
  INTERNAL = 'INTERNAL', // Non-patient business data
  PUBLIC = 'PUBLIC',     // Publicly visible data
}

interface PatientRecord {
  id: string            // INTERNAL (UUID)
  name: string          // PHI + PII
  dob: Date             // PHI
  diagnosis: string[]   // PHI
  providerId: string    // INTERNAL
  encryptedSsn: string  // PHI, always encrypted at rest
}
```
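A classification is most useful when code can query it. The following is a minimal Python sketch of that idea, mirroring the enum and the `PatientRecord` fields above. The `FIELD_SENSITIVITY` map and the `is_phi` and `redact_for_logging` helpers are hypothetical names introduced for illustration, not part of DrMackMedicine.

```python
from enum import Enum


class DataSensitivity(Enum):
    PHI = "PHI"
    PII = "PII"
    INTERNAL = "INTERNAL"
    PUBLIC = "PUBLIC"


# Hypothetical field map mirroring the PatientRecord interface.
FIELD_SENSITIVITY: dict[str, set[DataSensitivity]] = {
    "id": {DataSensitivity.INTERNAL},
    "name": {DataSensitivity.PHI, DataSensitivity.PII},
    "dob": {DataSensitivity.PHI},
    "diagnosis": {DataSensitivity.PHI},
    "providerId": {DataSensitivity.INTERNAL},
    "encryptedSsn": {DataSensitivity.PHI},
}


def is_phi(field_name: str) -> bool:
    """Treat unclassified fields as PHI so that new fields fail closed."""
    labels = FIELD_SENSITIVITY.get(field_name)
    return labels is None or DataSensitivity.PHI in labels


def redact_for_logging(record: dict) -> dict:
    """Return a copy of the record with PHI values masked."""
    return {k: ("[REDACTED]" if is_phi(k) else v) for k, v in record.items()}
```

Treating unknown fields as PHI is a design choice in this sketch: a field added to the schema without a classification is masked until someone classifies it.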
Encryption at Rest: Field-Level
Never rely on database-level encryption alone. Encrypt PHI fields individually so that a stolen database dump or a compromised database account exposes only ciphertext, as long as the key is stored outside the database:
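```python
import os

from cryptography.fernet import Fernet
# Assumes SQLAlchemy with the PostgreSQL dialect; adjust imports to your ORM setup.
from sqlalchemy import Column, Text
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()

# PHI_ENCRYPTION_KEY must hold a URL-safe base64-encoded 32-byte key,
# such as one produced by Fernet.generate_key().
PHI_KEY = Fernet(os.environ["PHI_ENCRYPTION_KEY"].encode())


def encrypt_phi(value: str) -> str:
    return PHI_KEY.encrypt(value.encode()).decode()


def decrypt_phi(token: str) -> str:
    return PHI_KEY.decrypt(token.encode()).decode()


# Usage in ORM
class Patient(Base):
    __tablename__ = "patients"

    id = Column(UUID, primary_key=True)
    name_encrypted = Column(Text)  # Stored encrypted

    @property
    def name(self) -> str:
        return decrypt_phi(self.name_encrypted)

    @name.setter
    def name(self, value: str):
        self.name_encrypted = encrypt_phi(value)
```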
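One practical consequence of this pattern: Fernet output is randomized, so encrypting the same name twice yields different ciphertext and an equality query on `name_encrypted` can never match. A common workaround is to store a keyed hash of the normalized value in a separate column, often called a blind index. The sketch below uses only the standard library. The `PHI_INDEX_KEY` variable, the normalization rule, and the idea of a `name_index` column are assumptions for illustration and do not come from the original system.

```python
import hashlib
import hmac
import os

# Assumption: a separate secret, distinct from PHI_ENCRYPTION_KEY.
# The fallback value exists only so this sketch runs standalone.
INDEX_KEY = os.environ.get("PHI_INDEX_KEY", "dev-only-index-key").encode()


def blind_index(value: str) -> str:
    """Deterministic keyed hash of a normalized value, for equality lookups."""
    # Lowercase and collapse whitespace so trivial variations still match.
    normalized = " ".join(value.lower().split())
    return hmac.new(INDEX_KEY, normalized.encode(), hashlib.sha256).hexdigest()
```

On write, the setter would store `blind_index(value)` alongside the ciphertext; on read, a lookup compares the stored index with `blind_index(search_term)`. The trade-off is that equal values produce equal indexes, so the index reveals which rows share a value even though it does not reveal the value itself.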
HIPAA Audit Log
Every access to PHI must be logged — who accessed it, when, from where, and for what purpose:
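```python
from datetime import datetime, timezone

from fastapi import Request


async def log_phi_access(
    request: Request,
    user_id: str,
    patient_id: str,
    action: str,
    fields_accessed: list[str],
):
    # AuditLog is the application's async ORM model for the audit table (not shown).
    await AuditLog.create(
        # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12.
        timestamp=datetime.now(timezone.utc),
        user_id=user_id,
        patient_id=patient_id,
        action=action,
        fields_accessed=fields_accessed,
        # request.client can be None, so guard before reading .host.
        ip_address=request.client.host if request.client else None,
        user_agent=request.headers.get("user-agent"),
        session_id=request.headers.get("x-session-id"),
    )
```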
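The logger above records who, when, and from where, but not the purpose of the access. One way to capture purpose is to validate it when the entry is built, sketched here with the standard library only. The `AuditEntry` dataclass, the `build_audit_entry` helper, and the `ALLOWED_PURPOSES` values are hypothetical; a real purpose list would come from your organization's policy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumption: an illustrative set of purposes, not an authoritative list.
ALLOWED_PURPOSES = {"treatment", "payment", "operations"}


@dataclass(frozen=True)
class AuditEntry:
    user_id: str
    patient_id: str
    action: str
    fields_accessed: tuple[str, ...]
    purpose: str
    ip_address: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def build_audit_entry(
    user_id: str,
    patient_id: str,
    action: str,
    fields_accessed: list[str],
    purpose: str,
    ip_address: str,
) -> AuditEntry:
    """Validate and build an entry; persisting it is left to the caller."""
    if purpose not in ALLOWED_PURPOSES:
        raise ValueError(f"unknown access purpose: {purpose!r}")
    if not fields_accessed:
        raise ValueError("fields_accessed must not be empty")
    return AuditEntry(
        user_id=user_id,
        patient_id=patient_id,
        action=action,
        fields_accessed=tuple(fields_accessed),
        purpose=purpose,
        ip_address=ip_address,
    )
```

Making the entry frozen means it cannot be modified after creation in application code, which fits the append-only nature of an audit trail.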
Using LLMs with PHI Safely
Never send raw PHI to an LLM API. Instead, de-identify before sending:
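```python
import re

DOB_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def deidentify_for_llm(text: str, patient_id: str) -> tuple[str, dict[str, str]]:
    """Replace PHI with tokens before sending to LLM."""
    replacements: dict[str, str] = {}

    # Replace names. get_patient_name is an application helper (not shown).
    name = get_patient_name(patient_id)
    token = "[PATIENT_1]"
    replacements[token] = name
    text = text.replace(name, token)

    # Replace dates of birth. Tokens are numbered sequentially because
    # hash() % 9999 can collide and, for strings, varies between processes.
    seen: dict[str, str] = {}

    def swap_date(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            seen[value] = f"[DATE_{len(seen) + 1}]"
            replacements[seen[value]] = value
        return seen[value]

    text = DOB_PATTERN.sub(swap_date, text)
    return text, replacements
```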
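The `replacements` mapping returned by `deidentify_for_llm` is what makes the round trip possible: once the LLM responds, the tokens can be swapped back before the text is shown to a clinician. A minimal sketch of that reverse step follows; the `reidentify` name is hypothetical and is not part of the original code.

```python
import re


def reidentify(text: str, replacements: dict[str, str]) -> str:
    """Swap placeholder tokens in LLM output back to the original values."""
    if not replacements:
        return text
    # One alternation over all tokens, applied in a single pass, so that
    # restored values are never rescanned for further tokens.
    pattern = re.compile("|".join(re.escape(token) for token in replacements))
    return pattern.sub(lambda m: replacements[m.group(0)], text)
```

For example, `reidentify("[PATIENT_1] should follow up.", {"[PATIENT_1]": "Jane Doe"})` returns `"Jane Doe should follow up."`. The mapping contains the original identifiers, so it should stay on your own servers and be handled as PHI.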
Infrastructure Compliance Checklist
| Requirement | Implementation |
|---|---|
| Data at rest encrypted | AES-256 field-level encryption (Fernet, as used in the example above, is AES-128-CBC with HMAC-SHA256) |
| Data in transit encrypted | TLS 1.3 everywhere |
| Access logging | PostgreSQL audit_log table |
| Minimum necessary access | Role-based + purpose-based access |
| Business Associate Agreement | Required for all vendors (cloud, LLM) |
| Breach notification | Automated alerting + incident runbook |
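
The "minimum necessary access" row can be made concrete with a filter that returns only the fields a given role is entitled to see for a given purpose. This is a sketch under assumptions: the roles, purposes, and field sets in `ACCESS_POLICY` are hypothetical examples, and `minimum_necessary` is an illustrative helper rather than code from DrMackMedicine.

```python
# Hypothetical policy: (role, purpose) -> fields that may be returned.
ACCESS_POLICY: dict[tuple[str, str], frozenset[str]] = {
    ("physician", "treatment"): frozenset({"name", "dob", "diagnosis"}),
    ("billing", "payment"): frozenset({"name", "providerId"}),
}


def minimum_necessary(record: dict, role: str, purpose: str) -> dict:
    """Return only the fields the (role, purpose) pair is entitled to see."""
    # Unknown combinations get an empty set, so access is denied by default.
    allowed = ACCESS_POLICY.get((role, purpose), frozenset())
    return {k: v for k, v in record.items() if k in allowed}
```

Keying the policy on both role and purpose means a billing user asking for data under a treatment purpose gets nothing back, which is the deny-by-default behavior this sketch aims for.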