While single-turn RAG and agentic systems share the same retrieval and policy fundamentals, agentic systems introduce compounding risks. Agents may chain tool calls across multiple steps, delegate to sub-agents, or accumulate privileges that no single step would have granted alone. Standard access control is necessary, but often insufficient.
Key Takeaways
- Policy gates must exist at every agent hop, not just at the entry point.
- Budget and iteration limits are safety controls, not just cost controls.
- Human approval gates are required for irreversible or high-impact actions.
Figure: Governed agent chain. Policy gates and budget controls across an orchestrator and sub-agent topology.
New risks introduced by agents
Privilege escalation across hops
Each agent step should run with its own identity. If sub-agents are not independently scoped, an orchestrator can pass on access that no individual step would have been granted on its own.
Prompt injection via tool outputs
A tool response containing adversarial text can redirect the agent's next step. Tool outputs must be sanitised before being re-injected into context.
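As a minimal sketch of validating tool output before re-injection, the Python below strips control characters, caps the length, and withholds output that matches a few deny-patterns. The pattern list, the `MAX_TOOL_OUTPUT_CHARS` cap, and the `<tool_output>` wrapper are all assumptions made for illustration. Pattern matching alone will not catch every injection attempt, so treat this as one layer rather than a complete defence.

```python
import re
from dataclasses import dataclass

# Hypothetical deny-patterns, for illustration only. A real deployment would
# tune these and likely combine them with a classifier and strict tool scoping.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
    re.compile(r"system prompt", re.IGNORECASE),
]

MAX_TOOL_OUTPUT_CHARS = 4000  # assumed cap, not a recommended value


@dataclass
class ValidatedOutput:
    text: str      # what may be placed into the agent's context
    flagged: bool  # True when the raw output was withheld
    reasons: list  # human-readable notes for the audit trace


def validate_tool_output(tool_name: str, raw: str) -> ValidatedOutput:
    reasons = []
    # Drop non-printable characters, keeping newlines and tabs.
    cleaned = "".join(ch for ch in raw if ch.isprintable() or ch in "\n\t")
    # Stop the payload from closing the wrapper early.
    cleaned = cleaned.replace("</tool_output>", "")
    if len(cleaned) > MAX_TOOL_OUTPUT_CHARS:
        cleaned = cleaned[:MAX_TOOL_OUTPUT_CHARS]
        reasons.append("truncated")

    matches = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(cleaned)]
    if matches:
        reasons.extend(f"matched: {m}" for m in matches)
        placeholder = f"[output of {tool_name} withheld pending review]"
        return ValidatedOutput(placeholder, True, reasons)

    # Wrap the text so the prompt can present it as data, not instructions.
    wrapped = f'<tool_output name="{tool_name}">\n{cleaned}\n</tool_output>'
    return ValidatedOutput(wrapped, False, reasons)
```

Withholding flagged output, rather than passing it along with a warning, is a deliberately conservative choice here: the agent never sees the suspicious text, and the `reasons` list gives a reviewer something to act on.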
Unbounded loops
Without hard iteration and cost limits, an agent can loop indefinitely or trigger cascading tool calls that consume unbounded resources.
Non-deterministic execution paths
The path an agent takes to complete a goal may vary across runs. Governance must capture the trace, not just the final output.
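One way to capture the path rather than only the result is to append an event for every step under a single trace ID. The sketch below is illustrative: `TraceRecorder`, `TraceEvent`, and their field names are hypothetical, and a production system would more likely send these events to an existing tracing or logging backend.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class TraceEvent:
    trace_id: str   # shared by every step of one user request
    user_id: str    # originating user, for later investigation
    step: int       # position of this event in the run
    agent: str
    action: str
    detail: str
    timestamp: float = field(default_factory=time.time)


class TraceRecorder:
    """Collects the ordered steps of one agent run (hypothetical helper)."""

    def __init__(self, user_id: str):
        self.trace_id = uuid.uuid4().hex
        self.user_id = user_id
        self.events = []

    def record(self, agent: str, action: str, detail: str = "") -> TraceEvent:
        event = TraceEvent(self.trace_id, self.user_id,
                           len(self.events) + 1, agent, action, detail)
        self.events.append(event)
        return event

    def to_jsonl(self) -> str:
        # One JSON object per line, convenient for log pipelines.
        return "\n".join(json.dumps(asdict(e)) for e in self.events)


recorder = TraceRecorder(user_id="u-123")
recorder.record("orchestrator", "plan", "split request into two tasks")
recorder.record("retriever", "tool_call", "search_docs")
```

Because two runs of the same request may take different paths, comparing their recorded events is often more informative than comparing their final answers.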
Per-hop policy model
Every agent handoff requires a new access decision. The orchestrator's permissions must not flow automatically to sub-agents; each hop is evaluated independently against the receiving agent's own scope.
| Control | Applied where | Why it matters |
|---|---|---|
| Identity scoping | Each sub-agent | Prevents privilege inheritance across the chain |
| Tool allow-list | Per agent role | Sub-agents only access tools they need for their task |
| Context boundary | Between agents | Prevents classified data leaking into untrusted downstream agents |
| Output validation | Before re-injection | Blocks adversarial content in tool results from redirecting the plan |
| Trace ID propagation | Entire chain | Links every step to the originating user request for auditability |
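The controls in the table can be combined into a single check that runs at every handoff. The following is a sketch under stated assumptions: `AgentIdentity`, `HopRequest`, `authorize_hop`, and the integer classification levels are hypothetical names and conventions, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    allowed_tools: frozenset  # tool allow-list for this agent role
    max_classification: int   # highest data level this agent may receive


@dataclass(frozen=True)
class HopRequest:
    trace_id: str             # propagated unchanged along the whole chain
    user_id: str              # originating user
    caller: str               # agent handing off the work
    target: AgentIdentity     # agent receiving the work
    tool: str                 # tool the target wants to invoke
    context_classification: int  # level of the data being passed along


class PolicyDenied(Exception):
    pass


def authorize_hop(req: HopRequest) -> None:
    """Raise PolicyDenied unless the target's own scope permits this hop.

    The caller's permissions are never consulted, so nothing is inherited.
    """
    if req.tool not in req.target.allowed_tools:
        raise PolicyDenied(
            f"[{req.trace_id}] {req.target.name} may not call {req.tool}")
    if req.context_classification > req.target.max_classification:
        raise PolicyDenied(
            f"[{req.trace_id}] context too sensitive for {req.target.name}")


summariser = AgentIdentity("summariser", frozenset({"summarise"}), 1)
authorize_hop(HopRequest("t-1", "u-123", "orchestrator",
                         summariser, "summarise", 1))  # passes silently
```

Including the trace ID in every denial message means a blocked hop can be tied back to the originating request during review.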
Budget and iteration controls
Treat limits as safety rails, not just cost controls. Set hard caps on iterations, tokens, tool calls, and wall time, because a runaway agent is an operational incident.
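A hard limit can be enforced by charging a shared budget object at every step and stopping the run as soon as any cap is exceeded. This is a minimal sketch: the `RunBudget` class and its default values are assumptions chosen for illustration, not recommended settings.

```python
import time
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    pass


@dataclass
class RunBudget:
    # Caps (illustrative defaults only).
    max_iterations: int = 10
    max_tool_calls: int = 25
    max_tokens: int = 50_000
    max_wall_seconds: float = 120.0
    # Running totals.
    iterations: int = 0
    tool_calls: int = 0
    tokens: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def charge(self, *, iterations: int = 0, tool_calls: int = 0,
               tokens: int = 0) -> None:
        """Add usage, then raise BudgetExceeded if any cap is now exceeded."""
        self.iterations += iterations
        self.tool_calls += tool_calls
        self.tokens += tokens
        if self.iterations > self.max_iterations:
            raise BudgetExceeded("iteration limit exceeded")
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded("tool call limit exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("token limit exceeded")
        if time.monotonic() - self.started_at > self.max_wall_seconds:
            raise BudgetExceeded("wall time limit exceeded")


budget = RunBudget(max_iterations=3)
completed = 0
try:
    while True:  # stands in for an agent loop that never converges
        budget.charge(iterations=1, tool_calls=1, tokens=500)
        completed += 1
except BudgetExceeded as exc:
    stop_reason = str(exc)
```

Sharing one budget object between the orchestrator and its sub-agents makes the caps apply to the whole chain rather than to each agent separately.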
Human approval gates
Not every action should be fully automated. Define which action types require human confirmation before the agent proceeds.
Requires human approval
- Writes or mutations to production systems
- Actions on behalf of another user
- Sending external communications
- Actions above a defined cost threshold
- Any irreversible action
Can proceed autonomously
- Read-only retrieval within scoped sources
- Summarisation and reformatting
- Idempotent, low-impact tool calls
- Draft generation (not delivery)
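The two lists above can be encoded as a small registry that the agent runtime consults before executing an action. The sketch below is illustrative: the `ActionType` names, the `COST_THRESHOLD` value, and the `approver` callback are all hypothetical. It allow-lists the autonomous actions and sends everything else for approval, so that a newly added action type is gated by default.

```python
from enum import Enum
from typing import Callable


class ActionType(Enum):
    READ_SCOPED = "read_scoped"
    SUMMARISE = "summarise"
    IDEMPOTENT_TOOL = "idempotent_tool"
    DRAFT = "draft"
    PROD_WRITE = "prod_write"
    ACT_AS_OTHER_USER = "act_as_other_user"
    EXTERNAL_SEND = "external_send"


# Only these may run without sign-off; anything else needs approval.
AUTONOMOUS = frozenset({
    ActionType.READ_SCOPED,
    ActionType.SUMMARISE,
    ActionType.IDEMPOTENT_TOOL,
    ActionType.DRAFT,
})

COST_THRESHOLD = 5.00  # assumed value, in whatever unit the budget uses


def needs_approval(action: ActionType, *, estimated_cost: float = 0.0,
                   irreversible: bool = False) -> bool:
    return (action not in AUTONOMOUS
            or irreversible
            or estimated_cost > COST_THRESHOLD)


def execute(action: ActionType, run: Callable[[], str],
            approver: Callable[[ActionType], bool], *,
            estimated_cost: float = 0.0, irreversible: bool = False) -> str:
    """Run the action only if it is autonomous or a human approves it."""
    if needs_approval(action, estimated_cost=estimated_cost,
                      irreversible=irreversible) and not approver(action):
        return "blocked"
    return run()
```

In this sketch `approver` stands in for whatever review channel is used, such as a ticket or a chat prompt, and it is only invoked when the registry says sign-off is required.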
Figure: GCP mapping (illustrative). Each layer maps to equivalent services on AWS, Azure, or any other cloud.
Failure modes
- ! Orchestrator's permissions are inherited by all sub-agents, bypassing per-hop scoping.
- ! Tool output is re-injected without sanitisation, enabling prompt injection mid-chain.
- ! No iteration limit allows runaway loops that exhaust budget or trigger rate limits.
- ! Human approval gates are skipped under 'efficiency' pressure, enabling unsafe mutations.
- ! Agent traces are not linked to user identity, making post-incident investigation impossible.
Checklist
- □ Each agent hop has its own scoped identity and tool allow-list.
- □ Tool outputs are validated before re-injection into agent context.
- □ Hard limits exist for iterations, tokens, tool calls, and wall time.
- □ A human approval gate registry defines which actions require sign-off.
- □ Every agent execution trace is linked to the originating user and request.