SYSTEM_CONSOLE v2.4.0

Human-in-the-loop

When AI should defer to a human, how escalation paths work, and how human corrections feed back into the system.

LAST_UPDATED: 2025-05

Full automation is not always the best approach. Certain decisions require human judgement, not because the model is incapable, but because the stakes, uncertainty, or regulatory requirements demand human accountability. Human-in-the-loop should be a deliberate operating model decision rather than a fallback for a failing system.

Key Takeaways

  • Define escalation criteria explicitly: don't leave them to model judgement alone.
  • Human review queues are operational infrastructure and must be designed for latency.
  • Human corrections are training signals; capture them systematically.

Escalation and review flow

Confidence and risk signals route responses to human review before delivery.

[Diagram: Human-in-the-loop flow]

When to escalate to a human

Escalation criteria must be defined as explicit rules, not left to model self-assessment. Define them jointly with the domain and compliance owners.

Escalate on confidence signals

  • Retrieval returned no relevant sources above threshold
  • Top sources conflict and no resolution strategy applies
  • The model's own refusal or uncertainty signal is triggered
  • Query matches an "always escalate" keyword or pattern list
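The confidence signals above can be expressed as explicit checks rather than model self-assessment. This is a minimal sketch; the threshold value, the keyword patterns, and the input fields are illustrative assumptions, not a fixed schema.

```python
# Sketch of confidence-signal escalation checks. The threshold,
# pattern list, and argument names are illustrative assumptions.

ALWAYS_ESCALATE_PATTERNS = ["legal claim", "terminate my contract"]  # hypothetical list
RELEVANCE_THRESHOLD = 0.55  # hypothetical retrieval score cutoff


def should_escalate_on_confidence(query: str,
                                  retrieval_scores: list[float],
                                  sources_conflict: bool,
                                  model_refused: bool) -> bool:
    """Return True if any explicit confidence signal fires."""
    # 1. No retrieved source clears the relevance threshold.
    if not any(score >= RELEVANCE_THRESHOLD for score in retrieval_scores):
        return True
    # 2. Top sources disagree and no resolution strategy applies.
    if sources_conflict:
        return True
    # 3. The model itself signalled refusal or uncertainty.
    if model_refused:
        return True
    # 4. Query matches an "always escalate" pattern.
    lowered = query.lower()
    return any(pattern in lowered for pattern in ALWAYS_ESCALATE_PATTERNS)
```

Because each signal is an explicit rule, the escalation decision is auditable and can be reviewed with compliance owners independently of the model.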

Escalate on risk signals

  • Response would trigger a regulatory or legal action
  • Query relates to a defined high-stakes domain (medical, financial advice, HR decisions)
  • The action being assisted is irreversible
  • User is acting on behalf of another person
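The risk signals can be checked the same way. A minimal sketch, assuming a domain taxonomy and request flags that your system would populate upstream; the names are illustrative.

```python
# Sketch of risk-signal escalation checks. The domain taxonomy and
# the boolean request flags are assumed, not a fixed schema.

HIGH_STAKES_DOMAINS = {"medical", "financial_advice", "hr_decision"}  # assumed taxonomy


def should_escalate_on_risk(domain: str,
                            triggers_regulatory_action: bool,
                            action_irreversible: bool,
                            acting_on_behalf_of_other: bool) -> bool:
    """Return True if any explicit risk signal fires."""
    if triggers_regulatory_action:
        return True
    if domain in HIGH_STAKES_DOMAINS:
        return True
    if action_irreversible:
        return True
    return acting_on_behalf_of_other
```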

Escalation path design

An escalation path that adds hours of latency will be bypassed. Design the review queue for the response time the use case requires.

  Escalation type       Target latency   Reviewer              Fallback if SLA missed
  -------------------   --------------   -------------------   ------------------------
  Low confidence        < 1 hour         Domain SME            Decline with explanation
  High risk domain      < 4 hours        Compliance / Legal    Decline with referral
  Regulatory trigger    Synchronous      Compliance officer    Block until reviewed
  Agentic action gate   Synchronous      Authorised approver   Abort action

Graceful degradation

When escalation occurs, the user should not receive a blank failure. Define a graceful degradation response for each escalation type.

  • Acknowledge the query and explain why a human will respond.
  • Provide a realistic SLA for the human response.
  • Where possible, point to an existing resource the user can consult in the meantime.
  • Never fabricate an answer to avoid an escalation.
Best practice

"I cannot answer this with sufficient confidence. A specialist will follow up within 4 hours" is a valid, trustworthy response, and better than a hallucinated answer that merely appears confident.

Human corrections as feedback signal

Human corrections provide evidence when the system produces a suboptimal result. Capture these corrections systematically, as they are the most valuable signals for improving retrieval quality, prompt design, and evaluation datasets.

  • Correction type: wrong source cited, incorrect answer, unsafe response.
  • Capture: store the original query, draft response, correction, and reviewer ID.
  • Use: add to the evaluation dataset; trigger a retrieval or prompt review.
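The capture fields above can be held in a structured record and converted into evaluation examples. A minimal sketch; the field names, the correction-type values, and the evaluation schema are assumptions to adapt to your own store.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Sketch of a structured correction record. Field names, type
# values, and the eval-example schema are illustrative assumptions.


@dataclass
class CorrectionRecord:
    query: str
    draft_response: str
    correction: str
    reviewer_id: str
    correction_type: str  # e.g. "wrong_source", "incorrect_answer", "unsafe_response"
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())


def to_eval_example(record: CorrectionRecord) -> dict:
    """Convert a human correction into an evaluation-dataset example."""
    return {
        "input": record.query,
        "rejected": record.draft_response,
        "expected": record.correction,
        "metadata": {"reviewer": record.reviewer_id,
                     "type": record.correction_type},
    }
```

Storing the draft alongside the correction preserves a rejected/expected pair, which is exactly the shape most evaluation and preference datasets need.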

Failure modes

  • ! Escalation criteria are undefined, so the model decides when to escalate, and does so inconsistently.
  • ! Review queue has no SLA, so high-risk queries wait indefinitely or are bypassed.
  • ! Graceful degradation is not implemented; users receive blank errors or hallucinated answers.
  • ! Human corrections are not captured, so the same failures repeat across conversations.
  • ! The escalation path is not tested, so it fails at the first real incident.

Checklist

  • Escalation criteria are explicit, documented, and agreed with domain and compliance owners.
  • Escalation paths have defined SLAs and fallback responses.
  • Graceful degradation responses exist for each escalation type.
  • Human corrections are captured in a structured store and reviewed regularly.
  • Escalation path is tested in staging before production deployment.