Every governance model for distributed systems eventually confronts the same tension: you want a single, authoritative source of truth, but a single source of truth doesn't scale. The solution the industry has converged on, in domain after domain, is to accept that you can't have both — and to build systems that work correctly despite operating on information that may be slightly stale.
Agent authorization is no different. The temptation to build one central service that makes every decision is understandable. It's clean on a whiteboard. It fails in production.
The problem with single-source-of-truth authorization
Imagine a central authorization service that every agent must consult before taking any action. All policy lives there. All trust decisions are made there. It's the only place a revocation takes effect immediately.
In a low-volume deployment with agents running a handful of tasks, this works. Add a few hundred agents running concurrent tasks — each checking in for authorization multiple times per minute — and you've built a bottleneck at the heart of your system. The authorization service becomes both a performance constraint and a single point of failure. An outage doesn't slow your agents; it stops them entirely.
More subtly, the latency of a network round trip to a central service imposes a floor on how fast any agent can act. For agents handling real-time tasks (responding to events, processing streams, operating under SLAs), that floor may be unacceptable.
The instinct to solve this by caching at the agent is correct. But caching without a model for what the cache represents, how stale it can be, and when it must be bypassed leads to a different set of problems: agents operating on outdated authorization state, revocations not propagating cleanly, and nobody quite sure what guarantee the system is actually providing.
What's needed is a deliberate model for where authorization decisions get made, what trust level each location carries, and how they compose. That's what the three-zone model provides.
Zone 1: Local Authority
The Local Authority is the agent itself — specifically, the authorization state the agent holds in memory from prior decisions made by a higher authority.
An agent that was granted permission to read customer records at session start carries that grant locally. When it needs to read a record again 200ms later, it doesn't phone home. It checks its local grant, verifies it hasn't expired, and proceeds. The round-trip cost is zero.
The trade-off is trust level. Local Authority decisions are only as trustworthy as the last time the agent synchronized with a higher authority. The grant might have been revoked since it was issued. A delegation token upstream might have been invalidated. The agent has no way to know unless it asks.
This makes Local Authority appropriate for a specific, bounded use case: repeating an action type that was already authorized in this session, where the risk of acting on slightly stale state is acceptable. For low-risk, high-frequency actions within an established task, that's most of what an agent does. For anything involving new resource types, elevated permissions, or actions where a revocation issued in the last few minutes could matter, Local Authority is not sufficient.
The key design constraint: Local Authority can only consume grants. It cannot issue them. An agent cannot authorize itself to do something it wasn't already granted by a higher zone.
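A minimal sketch of that constraint, assuming a hypothetical `Grant` shape and an in-memory store (neither is specified in this post): the local zone can store and check grants handed to it, but exposes no way to create one.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """A permission issued by a higher zone (hypothetical shape)."""
    action: str        # e.g. "customer_records:read"
    expires_at: float  # absolute expiry, seconds since the epoch


class LocalAuthority:
    """Holds grants issued elsewhere; it can consume them but never mint them."""

    def __init__(self, clock=time.time):
        self._grants = {}
        self._clock = clock  # injectable so expiry can be tested deterministically

    def store(self, grant: Grant) -> None:
        # Intended to be called only with grants returned by a higher zone.
        self._grants[grant.action] = grant

    def is_allowed(self, action: str) -> bool:
        grant = self._grants.get(action)
        if grant is None:
            return False  # never granted: the agent must ask a higher zone
        if self._clock() >= grant.expires_at:
            del self._grants[action]  # expired: drop it and escalate
            return False
        return True
```

A `False` result here does not mean "denied". It means "not decidable locally", and the request should go to the Domain Authority.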
Zone 2: Domain Authority
The Domain Authority is the operational heart of the model: an enterprise-operated trust service, typically running as a distributed policy decision point (PDP) cluster, that handles the vast majority of real authorization decisions.
When an agent needs to perform an action that isn't covered by its local cache — a new resource type, an escalated operation, a fresh delegation — it sends a request to the Domain Authority. The Domain Authority evaluates the full policy set, checks the agent's delegation chain for validity, computes a trust risk score, and returns a signed decision. That decision may be cached locally by the agent for subsequent use, with a TTL set by the Domain Authority based on the risk profile of the action.
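One way to picture a signed decision with a risk-based TTL is the sketch below. The field names, the risk thresholds, and the TTL values are all assumptions made for illustration. HMAC is used only because it is in the Python standard library; it is a symmetric stand-in, and a real deployment would more plausibly use asymmetric signatures so that verifiers cannot forge decisions.

```python
import hashlib
import hmac
import json


def ttl_for_risk(risk_score: float) -> int:
    """Hypothetical TTL policy: riskier actions are cached for less time."""
    if risk_score >= 0.7:
        return 0    # high risk: do not cache at all
    if risk_score >= 0.3:
        return 30
    return 300


def _canonical(body: dict) -> bytes:
    # Stable serialization so signer and verifier hash identical bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def sign_decision(key: bytes, agent_id: str, action: str, allowed: bool,
                  risk_score: float, issued_at: int) -> dict:
    body = {
        "agent_id": agent_id,
        "action": action,
        "allowed": allowed,
        "issued_at": issued_at,
        "ttl_seconds": ttl_for_risk(risk_score),
    }
    signature = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_decision(key: bytes, decision: dict) -> bool:
    expected = hmac.new(key, _canonical(decision["body"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, decision["signature"])
```

Because the TTL is inside the signed body, an agent cannot extend its own cache lifetime without invalidating the signature.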
The Domain Authority has several properties that Local Authority lacks:
It knows about revocation. The Domain Authority maintains a connection to a revocation index. When a delegation token is invalidated upstream, the Domain Authority's PDP cluster picks up that change within its sync interval — typically seconds to tens of seconds. Any subsequent request that touches the invalidated chain will be denied.
It evaluates context. A Domain Authority PDP doesn't just check "is this action allowed?" It evaluates the full picture: what's the time of day, what's the request volume from this agent, does the context contain signals that increase environmental risk. Static policy plus dynamic context produces a decision that simple credential checking cannot.
It issues new grants. When an agent needs to extend its session, accept a fresh delegation, or escalate permissions, the Domain Authority is the entity that evaluates whether that's warranted and issues the corresponding credential.
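The revocation property can be sketched as a chain check against an index. `RevocationIndex` here is a hypothetical in-memory stand-in; how a real index is stored and synchronized is not covered in this post.

```python
from typing import Sequence


class RevocationIndex:
    """In-memory stand-in for the revocation index (hypothetical interface)."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, token_id: str) -> None:
        self._revoked.add(token_id)

    def is_revoked(self, token_id: str) -> bool:
        return token_id in self._revoked


def chain_is_valid(chain: Sequence[str], index: RevocationIndex) -> bool:
    """A delegation chain is valid only if it is non-empty and no link is revoked."""
    if not chain:
        return False  # no delegation at all is treated as invalid
    return not any(index.is_revoked(token_id) for token_id in chain)
```

Revoking a single upstream token invalidates every chain that passes through it, which is why one revocation can cut off many agents at once.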
The latency profile — 5 to 50ms in practice for a well-operated cluster — is acceptable for most non-trivial agent actions. The cluster scales horizontally. No single node is a bottleneck. A node failure degrades gracefully rather than stopping all authorization.
For the overwhelming majority of agent activity in a real deployment, the first two zones are sufficient: Local Authority handles the repetitive fast path, and Domain Authority handles everything else.
Zone 3: Global Authority
The Global Authority is invoked rarely — by design.
It holds the cryptographic root keys that underpin the entire trust hierarchy. It's the entity that provisions new agents into the system, issuing root credentials that Domain Authorities can then derive session tokens from. It maintains the authoritative global revocation state. It's the anchor for cross-organization trust when agents from different domains need to interact.
What it is not: a decision-making service for routine agent operations. Sending an email, querying a database, calling an API — none of these should touch the Global Authority. If they do, you've misarchitected your system.
The Global Authority's latency profile (100–500ms, potentially more for cross-region operations) reflects its role: occasional, high-stakes operations where correctness matters more than speed. Agent provisioning can tolerate 300ms. A routine tool call cannot.
This design mirrors the certificate hierarchy behind TLS. Leaf certificates are almost never issued by a root CA directly; they are issued by an intermediate CA whose own certificate was signed by the root. The root signs once, trust propagates down the chain, and day-to-day issuance relies on the intermediate. Touching the root key is rare, deliberate, and logged.
How actions flow across zones
Routing is driven primarily by risk score rather than by action category; only a small set of root-level operations is pinned to the Global Authority regardless of score. The Domain Authority assigns a composite risk score to each action type as part of policy definition. That score, combined with environmental signals calculated at evaluation time, determines where the decision gets made:
- Low composite risk, action covered by local cache, cache not expired → Local Authority. No network call.
- Low to moderate risk, no valid local cache → Domain Authority PDP. Network call, result cached with TTL.
- High risk or elevated trust concern → Domain Authority with mandatory Trust Evaluation Service consultation. Signed decision, shorter TTL or no caching.
- Provisioning, cross-org trust, root key operations → Global Authority. Rare, always logged, result cached aggressively.
An agent performing routine operations will spend most of its time in the first two buckets. The system is designed so that the common path is fast, and the expensive checks are reserved for operations that genuinely warrant them.
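The four buckets above can be sketched as a single routing function. The numeric thresholds and operation names are assumptions for illustration; this post does not define cut-offs, and a real policy would set them per deployment.

```python
from enum import Enum


class Route(Enum):
    LOCAL = "local_authority"
    DOMAIN = "domain_authority"
    DOMAIN_WITH_TRUST_EVAL = "domain_authority_with_trust_evaluation"
    GLOBAL = "global_authority"


# Hypothetical values: neither the names nor the thresholds come from a spec.
GLOBAL_OPERATIONS = {"provision_agent", "cross_org_trust", "root_key_operation"}
LOW_RISK_THRESHOLD = 0.3
HIGH_RISK_THRESHOLD = 0.7


def route(operation: str, composite_risk: float, has_valid_local_grant: bool) -> Route:
    # Root-level operations always go to the Global Authority.
    if operation in GLOBAL_OPERATIONS:
        return Route.GLOBAL
    # High risk requires the Trust Evaluation Service, even if a grant is cached.
    if composite_risk >= HIGH_RISK_THRESHOLD:
        return Route.DOMAIN_WITH_TRUST_EVAL
    # Only low-risk actions with an unexpired local grant skip the network.
    if has_valid_local_grant and composite_risk < LOW_RISK_THRESHOLD:
        return Route.LOCAL
    return Route.DOMAIN
```

The order of the checks matters: a cached grant never short-circuits a high-risk evaluation, so a stale local cache cannot be used to bypass the stricter path.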
Why this pattern is familiar
This three-zone structure isn't novel. It's the same tiered architecture distributed systems use everywhere:
CDNs serve content from edge caches when possible, fall back to regional origin servers, and reach back to the source of truth only when a cache miss or invalidation requires it. The hierarchy exists because latency and load distribution require it.
DNS resolves names from local resolver caches first, queries recursive resolvers for cache misses, and reaches authoritative nameservers only for records not cached anywhere in the chain. The TTL controls how long each level can trust its cached state.
Certificate validation checks locally cached CRLs or stapled OCSP responses before contacting a responder, and a responder is itself an intermediary whose signed responses are validated against roots the client already trusts.
In each case, the same trade-off applies: caching at lower levels gives you performance, but it introduces a window during which the cached state may diverge from the authoritative state. The architecture acknowledges this and manages it deliberately — through TTLs, revocation mechanisms, and clear rules about when a stale cache is acceptable and when it isn't.
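For the three-zone model, that window can be bounded with simple arithmetic. The sketch below assumes the worst case: a grant is issued just before the Domain Authority syncs the revocation, and the agent keeps using it locally until the TTL runs out. The function name and parameters are illustrative only.

```python
def worst_case_revocation_lag(local_ttl_seconds: float,
                              domain_sync_seconds: float) -> float:
    """Upper bound on how long a revoked grant could still be honoured locally.

    The Domain Authority may keep issuing grants until its next sync, and a
    grant issued at that last moment stays usable for its full TTL.
    """
    if local_ttl_seconds < 0 or domain_sync_seconds < 0:
        raise ValueError("durations must be non-negative")
    return domain_sync_seconds + local_ttl_seconds


# A 10-second sync interval with a 60-second local TTL gives a 70-second bound.
print(worst_case_revocation_lag(local_ttl_seconds=60, domain_sync_seconds=10))
```

Shortening either number tightens the guarantee, at the cost of more traffic to the zone above.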
Agent authorization is the same problem in a different domain. The three-zone model gives it a name and a formal structure, which is the precondition for specifying it rigorously enough that any implementation can be tested for conformance.
The goal isn't a perfect system where every authorization is evaluated against the latest possible state. The goal is a system where you know, for any given decision, what guarantees were in force — and where an auditor can reconstruct that with a signed artifact after the fact.
Next post: The Three Risk Layers — how inherent risk, environmental risk, and trust risk compose into a single authorization signal.

