2026-02-07

Securing AI Agents in Production: Sender-Constrained Tokens, Token Exchange, and Continuous Access Evaluation

How to harden AI agents with sender-constrained tokens (DPoP/mTLS), OAuth token exchange (OBO), and Continuous Access Evaluation (CAE/CAEP/SSF).

AI agents are quickly moving from “chatbot UI” to autonomous software actors: they call internal APIs, open tickets, trigger CI/CD jobs, read data from SaaS systems, and occasionally take actions that change the world.

That shift breaks a lot of familiar IAM assumptions:

  • A human is no longer the only entity holding powerful tokens.
  • “Sessions” don’t look like browser sessions; they look like long‑running workloads.
  • The blast radius of a stolen bearer token is bigger because agents are great at automation.
  • Your security team will still demand: least privilege, short lifetimes, revocation, and auditable decisions.

This post is a practical blueprint for enterprise-grade token/session security for AI agents. The core idea is simple:

  1. Stop relying on bearer tokens alone (reduce replay).
  2. Use token exchange / on-behalf-of to keep tokens scoped and contextual.
  3. Add continuous access evaluation so you can shut things off mid-flight.
  4. Treat agent credentials as non-human identity (NHI) with governance and rotation.

Along the way we’ll call out concrete product capabilities (Microsoft Entra ID, Okta, Auth0, AWS, GCP, Azure, SPIFFE/SPIRE) and map them to standards like OAuth 2.1, DPoP, Token Exchange, and CAEP/SSF.

The agent token problem (why “just use OAuth” isn’t enough)

OAuth is the right starting point for delegated access, but many AI agent deployments accidentally recreate the worst parts of old API key culture:

  • One long-lived token copied into an environment variable.
  • Broad scopes (or admin-level roles) because the agent “needs to work”.
  • Tokens reused across hosts/pods/workers (so theft in one place compromises all).
  • No reliable “off switch” when risk changes.

Bearer access tokens are especially dangerous in agent contexts because they’re easy to replay:

  • An attacker who steals a bearer token can use it from anywhere until it expires.
  • Agents often run in places with lots of log surfaces (debug logs, traces, crash dumps).
  • Agents often integrate with many systems; one token can open a chain reaction.

So the goal is not “use tokens”; it is to make tokens safer than they are by default.


The blueprint (high level)

A robust production pattern for AI agents typically looks like this:

  1. Workload identity for the agent runtime (pod/VM/serverless) using cloud-native federation (no static secrets).
  2. Short-lived access tokens issued for narrowly-defined resource APIs.
  3. Sender-constraining (DPoP or mTLS) where feasible, so a stolen token is harder to replay.
  4. Token exchange / OBO to avoid passing end-user tokens around and to keep audience/scope tight.
  5. Continuous evaluation (CAE/CAEP/SSF) to revoke or reduce access mid-session based on risk.
  6. Policy enforcement points (PEPs) close to resources (API gateway / sidecar) with centralized policy (PDP).
  7. NHI governance: inventory, ownership, rotation, attestation, and break-glass.

If that feels like “a lot”, it is — but the upside is enormous: you move from “agent has a magic token” to “agent has a controlled identity that can be constrained, monitored, and shut off.”


Step 1: Start with the runtime identity (avoid static secrets)

Before worrying about OAuth access tokens, fix the base credential problem: how does the agent runtime authenticate to your identity provider?

Practical options (cloud and Kubernetes)

  • Kubernetes: Projected Service Account Tokens + OIDC federation
    • Azure: Azure Workload Identity (federated credentials in Entra ID)
    • AWS: IRSA (IAM Roles for Service Accounts)
    • GCP: Workload Identity Federation / GKE Workload Identity
  • VM / on-prem:
    • SPIFFE/SPIRE to mint workload identities (SVIDs) and enable mTLS
    • AWS: IAM Roles Anywhere (X.509-based) for non-EC2 environments
  • Serverless:
    • Use platform identity (AWS Lambda execution role, Azure Managed Identity, GCP service account) and avoid exporting keys.

The unifying goal: no long-lived API keys in the agent’s environment.
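
As a concrete illustration of keyless federation, here is a minimal Python sketch of the Azure Workload Identity pattern: the pod reads its projected Kubernetes service account token and presents it as a JWT client assertion (RFC 7521/7523 style) instead of a static client secret. The tenant/client IDs, scope, and token file path below are placeholders, and the request is only built, not sent:

```python
from pathlib import Path

# Path where Azure Workload Identity projects the Kubernetes SA token
# (exposed via AZURE_FEDERATED_TOKEN_FILE by convention); value is illustrative.
TOKEN_FILE = Path("/var/run/secrets/azure/tokens/azure-identity-token")

def build_federation_request(tenant_id: str, client_id: str, scope: str,
                             federated_token: str) -> dict:
    """Build the OAuth client_credentials request that trades a projected
    Kubernetes SA token (presented as a client assertion) for an access
    token. No static client secret appears anywhere."""
    return {
        "url": f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
        "data": {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "scope": scope,
            # Assertion type URN defined by RFC 7523
            "client_assertion_type":
                "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
            "client_assertion": federated_token,
        },
    }
```

In a real pod you would read `TOKEN_FILE`, POST the form to the token URL, and cache the result; the equivalent flows exist for AWS (IRSA) and GCP (Workload Identity Federation).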

If you need a deep foundation for agent-to-API authorization decisions, pair this with an explicit PDP/PEP design (covered in Step 6 below).


Step 2: Use short lifetimes — but align them to agent reality

“Make tokens short-lived” is correct advice, but you need to decide what “short” means for autonomous agents.

  • Too short → noisy re-authentication loops, failures during long workflows.
  • Too long → big replay window if stolen.

A practical approach is layered lifetimes:

  • A very short-lived access token for each resource (minutes)
  • A rotating refresh token (hours) or a workload re-federation loop
  • A higher-level “agent session” concept enforced by policy (revocable at any time)
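
One way to implement the layered-lifetime idea is a refresh-ahead cache: agent code always asks the cache for a token, and the cache re-fetches slightly before expiry so long workflows never present a stale token and never loop on the token endpoint. A minimal sketch, where the fetch callback is whatever your acquisition path is (client credentials, workload federation, token exchange):

```python
import time
from typing import Callable, Optional

class TokenCache:
    """Cache a short-lived access token and refresh it a little before
    it expires (the 'skew' window), instead of reacting to 401s."""

    def __init__(self, fetch: Callable[[], tuple[str, int]], skew: int = 60):
        self._fetch = fetch          # returns (token, lifetime_seconds)
        self._skew = skew            # refresh this many seconds early
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = time.time()
        if self._token is None or now >= self._expires_at - self._skew:
            token, ttl = self._fetch()
            self._token = token
            self._expires_at = now + ttl
        return self._token
```

With 5–15 minute access tokens, a 60-second skew keeps replicas from stampeding the token endpoint while guaranteeing no in-flight call holds an expired token.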

Comparison table: token/session lifetimes for agents

| Layer | What it is | Typical lifetime (starting point) | Recommended controls | When to tighten |
| --- | --- | --- | --- | --- |
| Access token | Token presented to resource API | 5–15 minutes | Narrow scopes, strict audience, sender-constraining, jti replay detection | High-risk data/actions; internet-exposed APIs |
| Refresh token (if used) | Token to get new access tokens | 1–8 hours (rotate) | Rotation + reuse detection, bound to client, stored in secure secret store | Agent runs in shared/multi-tenant compute |
| Workload federation credential | Cloud/K8s identity assertion | Usually minutes | No static secrets; automatic rotation; run-time attestation where possible | Agents that scale to many replicas |
| “Agent session” (policy) | Logical permission window | Arbitrary (hours/days) | CAE/CAEP events, SSF signals, disablement, risk scoring | Anytime incident response requires immediate cutoff |


Step 3: Reduce replay risk with sender-constrained tokens

Bearer tokens are “whoever holds it can use it”. Sender-constrained tokens change that: the token only works when presented by the intended client.

Two common mechanisms:

  • DPoP (Demonstration of Proof-of-Possession): bind the token to a public key; the client proves possession per request.
  • Mutual TLS (mTLS) sender-constrained tokens: bind token to the TLS client certificate.
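
To make DPoP concrete, here is what a DPoP proof actually contains per RFC 9449. This sketch builds only the header and claims that get signed; producing the real ES256 signature with the agent's private key is left to a JOSE library, and the JWK passed in is a placeholder:

```python
import base64
import json
import time
import uuid

def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def build_dpop_proof_parts(method: str, url: str, public_jwk: dict) -> tuple[str, str]:
    """Build the encoded header and claims of a DPoP proof JWT (RFC 9449).
    The client signs `header.claims` with its private key and sends the
    result in the DPoP request header alongside the access token."""
    header = {"typ": "dpop+jwt", "alg": "ES256", "jwk": public_jwk}
    claims = {
        "htm": method.upper(),    # HTTP method the proof covers
        "htu": url,               # target URI (no query/fragment)
        "iat": int(time.time()),  # freshness: servers reject old proofs
        "jti": str(uuid.uuid4()), # unique ID for replay detection
    }
    return b64url(json.dumps(header).encode()), b64url(json.dumps(claims).encode())
```

Because `htm`/`htu`/`iat`/`jti` are bound into each proof, a stolen access token without the private key cannot be replayed against a DPoP-enforcing API.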

Comparison table: DPoP vs mTLS (and when to use which)

| Option | How it binds | Where it works well | Operational friction | Good fit for AI agents |
| --- | --- | --- | --- | --- |
| DPoP | Per-request proof signed by client key | Mobile apps, thick clients, some services; can work behind CDNs/proxies | Key management in client; server must validate DPoP | Great when the agent can securely hold a key and you control the HTTP stack |
| mTLS (OAuth mTLS / certificate-bound tokens) | TLS channel / client cert | Service-to-service inside a mesh; SPIFFE/SPIRE environments | Cert issuance/rotation; LB termination complexity | Excellent for in-cluster agents with a service mesh or SPIFFE |
| Bearer only (baseline) | No binding | Anywhere | Low | Only acceptable for low-risk APIs with very short TTL + strong detection |

For the protocol-level details and tradeoffs, see RFC 9449 (DPoP) and RFC 8705 (OAuth 2.0 mTLS and certificate-bound access tokens).

Concrete implementation guidance (what to actually do)

  1. Pick your enforcement point: API gateway (Kong, Apigee, Azure API Management, AWS API Gateway) or a sidecar (Envoy) that can enforce DPoP/mTLS and validate tokens.
  2. Bind keys to the agent runtime:
    • Kubernetes: store the DPoP private key in a KMS-backed secret store (AWS KMS + Secrets Manager, Azure Key Vault, GCP Secret Manager) and mount via CSI driver; restrict access by namespace + service account.
    • SPIFFE: let the sidecar hold the identity and use mTLS at the mesh layer; keep application code simpler.
  3. Turn on strict token validation:
    • Validate iss, aud, exp, nbf
    • Validate signature and key rotation (JWK caching)
    • Validate nonce/jti (replay detection) for high-risk endpoints
  4. Instrument failed validations into your SIEM (Splunk, Microsoft Sentinel, Elastic, Datadog) as high-signal events.
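
The validation steps above can be sketched as a small claims checker. Signature verification and JWK caching are assumed to happen first in a JOSE library; this illustrative validator covers issuer, audience, time window, and jti replay (the in-memory jti set stands in for a TTL cache or Redis in production):

```python
import time

class TokenValidator:
    """Minimal post-signature claim checks: iss, aud, exp/nbf with
    clock leeway, and jti replay detection for high-risk endpoints."""

    def __init__(self, issuer: str, audience: str, leeway: int = 30):
        self.issuer, self.audience, self.leeway = issuer, audience, leeway
        self._seen_jtis: set[str] = set()  # use a TTL cache/Redis in production

    def validate(self, claims: dict) -> None:
        now = time.time()
        if claims.get("iss") != self.issuer:
            raise ValueError("untrusted issuer")
        aud = claims.get("aud")
        if self.audience not in (aud if isinstance(aud, list) else [aud]):
            raise ValueError("token not meant for this API")
        if now > claims.get("exp", 0) + self.leeway:   # missing exp fails closed
            raise ValueError("token expired")
        if now < claims.get("nbf", 0) - self.leeway:
            raise ValueError("token not yet valid")
        jti = claims.get("jti")
        if jti and jti in self._seen_jtis:
            raise ValueError("replayed token (jti already seen)")
        if jti:
            self._seen_jtis.add(jti)
```

Each `ValueError` here is exactly the kind of high-signal event step 4 says to ship to your SIEM.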

Step 4: Use OAuth Token Exchange / OBO to contain blast radius

A common anti-pattern in agent architectures is passing around a powerful token (sometimes even an end-user token) across multiple hops:

  • Agent → orchestration service → tool runner → internal API

Every hop is a new place that could leak the token.

Instead, use OAuth Token Exchange (RFC 8693) or an on-behalf-of (OBO) pattern so each service gets a token tailored to:

  • its audience (the API it’s calling)
  • its scopes (only what it needs)
  • its context (user vs app vs both)

This gives you a clean security story:

  • The agent runtime has a workload identity
  • It exchanges for a token to call Tool A
  • Tool A exchanges again to call API B
  • Tokens are short-lived and audience-bound at every step
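
The exchange at each hop is just a form-encoded POST to the token endpoint. A sketch of the RFC 8693 parameters (the URN values are defined by the RFC; the helper name is ours):

```python
def build_token_exchange_request(subject_token: str, audience: str,
                                 scope: str) -> dict:
    """Form parameters for an RFC 8693 token exchange: trade the token
    we currently hold (subject_token) for a new one narrowed to exactly
    one audience and the minimal scope for the next hop."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,
        "scope": scope,
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
    }
```

So the orchestrator never forwards its own token downstream; each hop presents a freshly minted, audience-bound token instead.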

Concrete product examples (common enterprise patterns)

  • Microsoft Entra ID:
    • OBO flows are widely used for API-to-API delegated calls.
    • Pair with Continuous Access Evaluation (CAE) for near-real-time revocation.
  • Okta:
    • Use OAuth for API access; pair with Okta risk signals and event hooks where relevant.
  • Auth0 (Okta Customer Identity Cloud):
    • Use OAuth/OIDC tokens with strict audience and scopes; for B2C agents/tools, pair with anomaly detection and token rotation strategies.

Even if you don’t adopt “full” token exchange everywhere, aim for this rule:

Tokens should be minted as close as possible to the service that will use them, and be audience-scoped to exactly one resource.


Step 5: Add continuous access evaluation (the mid-session off switch)

Short lifetimes help, but incidents don’t wait for token expiry. If you detect that:

  • an agent identity is compromised
  • a device/workload posture changed
  • a user was terminated
  • a risky location or impossible travel is detected

…you need a way to cut off access now.

CAE, CAEP, and SSF (how they fit together)

  • Continuous Access Evaluation (CAE) is a pattern: re-check access based on events.
  • CAEP (Continuous Access Evaluation Protocol) is a standard for sharing those events.
  • SSF (Shared Signals Framework) is a broader standard for sharing security signals (risk, compromise, posture) between providers.

In practice, enterprises combine:

  • Identity provider session controls (conditional access, token revocation)
  • Eventing / signals (CAE events, SSF signals)
  • Resource-side enforcement (APIs rejecting tokens based on new state)

Concrete implementation guidance

  1. Define “revocation triggers” for agents:
    • Agent owner changes or leaves the company
    • Agent code repo is compromised / unsigned build detected
    • High-confidence token theft signal (e.g., token used from unexpected ASN)
    • New critical vulnerability on the host image
  2. Wire the triggers into your IAM controls:
    • If using Entra ID: design for CAE-aware resources where feasible.
    • If using API gateways: maintain a denylist / risk score cache keyed by sub/client_id/workload id.
  3. Choose how resources react:
    • Hard fail (401/403) for sensitive endpoints
    • Step-up (require re-auth / re-federation) for medium risk
    • Reduce scopes/roles dynamically if your authorization layer supports it
  4. Test with chaos drills:
    • Rotate signing keys, revoke refresh tokens, disable service principals, and confirm agents fail safely.
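
A minimal sketch of the gateway-side denylist from step 2: consume revocation-style events and start rejecting otherwise-valid tokens for the affected subject. The event shape here is illustrative, not the exact CAEP/SSF schema:

```python
import time

class RevocationCache:
    """Resource-side mid-session off switch: tokens minted before a
    revocation event for their subject are rejected even if unexpired."""

    def __init__(self):
        self._revoked_after: dict[str, float] = {}  # subject -> cutoff time

    def handle_event(self, event: dict) -> None:
        # e.g. {"subject": "agent-billing-prod", "type": "session-revoked"}
        if event.get("type") in {"session-revoked", "credential-compromise"}:
            self._revoked_after[event["subject"]] = time.time()

    def is_allowed(self, claims: dict) -> bool:
        cutoff = self._revoked_after.get(claims.get("sub"))
        # Only tokens issued after the revocation event are accepted,
        # forcing the agent back through (re-)authentication.
        return cutoff is None or claims.get("iat", 0) > cutoff
```

The same cache can hold risk scores instead of hard cutoffs, which is how you implement the step-up and scope-reduction reactions in step 3.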

Step 6: Authorization for agents is not just OAuth scopes

OAuth scopes are a coarse control. For AI agents, you often need:

  • action-level permissions (approve refund, create user, rotate key)
  • object-level restrictions (only tickets in project X)
  • tenant isolation (for multi-tenant tools)
  • policy based on posture/risk (deny if runtime attestation fails)

This is where explicit authorization architecture matters:

  • Central policy decision point (PDP) (e.g., OPA / Styra, Cedar, custom)
  • Distributed policy enforcement points (PEPs) (gateway/sidecar)

Implementation pattern: tool permissions as “capabilities”

For agent toolchains, treat each tool integration as a capability with:

  • explicit owner
  • explicit data classification
  • explicit allowed actions
  • explicit audit logging

Then bind capabilities to:

  • the agent workload identity (service principal / workload identity)
  • the environment (dev/stage/prod)
  • and optionally the human approver (for high-risk actions)

This lets you build a usable control plane without turning everything into a manual approval workflow.
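
A sketch of what a capability check might look like at the PDP; all names and fields are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Capability:
    """One tool integration, modeled as an explicit capability."""
    owner: str                       # accountable team/person
    environment: str                 # dev / stage / prod
    allowed_actions: frozenset[str]
    requires_approval: frozenset[str] = frozenset()  # high-risk actions

def authorize(cap: Capability, workload_env: str, action: str,
              approver: Optional[str] = None) -> bool:
    """PDP-style check: environments must match, the action must be in
    the capability's allow-list, and high-risk actions need a human
    approver recorded for the audit trail."""
    if workload_env != cap.environment:
        return False
    if action not in cap.allowed_actions:
        return False
    if action in cap.requires_approval and approver is None:
        return False
    return True
```

Binding `environment` into the check is what enforces the dev/stage/prod segmentation from Phase 2, and the `approver` field gives you the optional human-in-the-loop without a blanket approval workflow.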


Step 7: Govern agent credentials as Non-Human Identity (NHI)

Even with great runtime identity and OAuth best practices, agents will accumulate credentials over time: webhooks, SaaS API tokens, SSH deploy keys, signing keys, etc.

Treat agent identity as part of your NHI program:

  • inventory: “what agents exist and what can they access?”
  • ownership: “who is accountable for this agent?”
  • rotation: “how do secrets rotate and how often?”
  • segmentation: “is prod isolated from dev?”
  • audit: “can we explain what happened?”

Practical controls (what works in real enterprises)

  • Store secrets in a managed vault:
    • HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, GCP Secret Manager
  • Prefer short-lived dynamic credentials:
    • Vault database credentials, cloud STS tokens, OIDC federation
  • Use code signing and artifact provenance:
    • Sigstore/cosign, SLSA controls, GitHub Actions OIDC to cloud
  • Ensure “break glass” exists:
    • Separate emergency identities, monitored and time-bounded

A phased adoption plan (so you can ship this)

You don’t need perfection on day one. You need a sequence that reduces risk quickly.

Phase 1 (1–2 weeks): Stop the bleeding

  • Replace static agent API keys with cloud/K8s workload identity where possible
  • Enforce strict token validation (iss, aud, signature, expiry)
  • Reduce access token TTL to 5–15 minutes for sensitive APIs
  • Add structured audit logs for agent actions (who/what/where)
  • Create an “agent inventory” (even a spreadsheet is better than nothing)

Phase 2 (2–6 weeks): Contain blast radius

  • Introduce audience-bound tokens per downstream API
  • Adopt token exchange/OBO for multi-hop calls
  • Segment agents by environment (dev/stage/prod identities)
  • Add scope/role minimization and object-level authorization for top-risk actions

Phase 3 (6–12 weeks): Make tokens harder to steal and reuse

  • Implement DPoP for agent-to-API calls where feasible
  • Implement mTLS sender-constrained tokens for in-mesh calls (SPIFFE/SPIRE or service mesh)
  • Add replay detection for critical endpoints (jti/nonce)

Phase 4 (ongoing): Continuous evaluation + mature governance

  • Integrate CAE/CAEP/SSF signals into resource enforcement
  • Run quarterly incident response drills focused on token theft
  • Add NHI governance metrics: rotation compliance, stale identities, orphaned agents

Operational checklist (what to verify before declaring “production ready”)

Token & auth

  • Access tokens are scoped to a single audience (one resource API)
  • Access token TTL is documented and justified
  • Refresh tokens (if any) rotate with reuse detection
  • Keys used for DPoP/mTLS are rotated and protected by HSM/KMS where feasible

Authorization

  • High-risk actions require explicit permissions (not “agent can do everything”)
  • Policies are versioned and tested (unit tests + integration tests)
  • Multi-tenant boundaries are enforced server-side

Detection & response

  • Agent actions produce audit logs with correlation IDs
  • Token validation failures are high-signal alerts
  • There is a documented “disable agent” and “revoke tokens” runbook
  • CAE/continuous revocation behavior is tested (not assumed)

Governance

  • Every agent has an owner, a purpose, and a decommission date/review cadence
  • Secret storage is centralized (no tokens in plain env vars or config files)
  • Break-glass access is separated, monitored, and time-bounded

Closing thought

AI agents don’t require brand-new IAM theory. They require applying your best IAM fundamentals — least privilege, short lifetimes, constrained tokens, explicit authorization, and continuous evaluation — to a new kind of actor that scales faster than humans.

If you implement only one improvement this month, make it this:

Stop issuing “bearer tokens with broad scope” to autonomous agents. Bind tokens to the sender, narrow the audience, and build a mid-session off switch.

That is the difference between “cool demo” and “production system your CISO can sign off on.”