2026-02-23

Token Security for AI Agents and Non‑Human Identities (NHI): Practical Patterns for 2026

A practical, enterprise-neutral guide to token lifetimes, audience/scope boundaries, replay resistance, and revocation for AI agents and other non-human identities—plus concrete implementation patterns across common IdPs, clouds, and meshes.

Summary: AI agents and other non‑human identities are turning “token hygiene” from a best practice into a production reliability and breach‑prevention requirement. This post lays out a pragmatic, vendor‑neutral approach to designing token lifetimes, audience/scope, sender constraints, and revocation for agentic workloads—plus concrete configurations across common IdPs, cloud providers, and service meshes.

No major SSF/CAEP or agent-identity “breaking news” surfaced in the last 24 hours. Instead, today’s post focuses on a high‑impact best‑practice topic that keeps showing up in real incidents: token theft + over‑privileged non‑human identities—now amplified by autonomous agents.


Why token security is suddenly everyone’s problem

Most organizations have spent years maturing identity controls for humans: MFA, conditional access, device posture, and step‑up authentication. But the fastest growth in identities is now non‑human identity (NHI):

  • Kubernetes workloads and service accounts
  • CI/CD runners and build pipelines
  • SaaS integrations and API clients
  • Bots, RPA, and event-driven automations
  • AI agents that call tools, fetch data, and take actions

The common denominator is tokens: OAuth access tokens, OIDC ID tokens, SAML assertions (less often for workloads), API keys, session cookies, and cloud provider credentials.

For humans, a stolen token is bad. For an AI agent, a stolen token can be worse because:

  1. Agents run continuously. A token stolen at 2 AM may be used immediately and repeatedly.
  2. Agent tokens often carry broad permissions. “Let the agent do its job” becomes “let it do everything.”
  3. Agents touch sensitive systems. Ticketing, customer data, production deployments, secrets stores.
  4. The blast radius is non-obvious. Toolchains create transitive access: GitHub → cloud → cluster → data.

Token security isn’t just “shorter expirations.” It’s a set of coordinated design choices:

  • Who can get a token (workload identity)
  • What a token can do (scopes/claims/permissions)
  • Where the token can be used (audience, resource indicators)
  • How replay is prevented (sender constraints like mTLS/DPoP)
  • How fast you can stop it (revocation, eventing, session risk)

A practical threat model: what actually goes wrong

Token compromise for NHIs and agents typically falls into a few buckets:

1) Token exfiltration from logs, traces, or error reports

  • Access tokens accidentally logged by reverse proxies or API gateways
  • Debug logs in agents that print request headers
  • Observability tools capturing request bodies/headers

Countermeasures: log redaction, structured logging guardrails, “no secrets in telemetry” policies, WAF rules, and automated scanning (e.g., GitHub secret scanning; TruffleHog; Gitleaks).
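
As one illustration of log redaction, here is a minimal sketch of a Python logging filter that scrubs bearer credentials before they reach a handler. The regular expressions, the placeholder strings, and the logger name "agent" are assumptions chosen for the example, not a complete secret-detection rule set.

```python
import logging
import re

# Assumed patterns: "Bearer <token>" header values and JWT-shaped strings
# (three base64url segments, the first starting with "eyJ").
_BEARER = re.compile(r"\b(bearer)\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE)
_JWT = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def redact(text: str) -> str:
    """Replace bearer credentials and JWT-shaped strings with placeholders."""
    text = _BEARER.sub(r"\1 [REDACTED]", text)
    return _JWT.sub("[REDACTED_JWT]", text)


class RedactTokensFilter(logging.Filter):
    """Logging filter that redacts the fully formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = None  # the message is already formatted
        return True


logger = logging.getLogger("agent")  # hypothetical logger name
logger.addFilter(RedactTokensFilter())
```

A filter attached to a logger only sees records logged directly on that logger, so in practice it is usually safer to attach the same filter to each handler as well.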

2) Token theft from runtime memory or local storage

  • Tokens stored unencrypted on disk in an agent host
  • Compromised container reading service account token files
  • Browser automation capturing session cookies

Countermeasures: avoid long‑lived tokens, use workload identity federation, use secrets managers with short leases (Vault), isolate runtimes (gVisor/Kata), and apply pod security controls.

3) Replay and “token forwarding” attacks

  • A token intended for Service A is replayed against Service B
  • Token used from a different environment/host than expected

Countermeasures: strict aud validation, resource indicators where supported, and sender‑constrained tokens (mTLS or DPoP).

4) Over-broad scopes / entitlements

  • “agent.readwrite” scope that maps to admin-like permissions
  • Service principal granted Owner/Contributor everywhere

Countermeasures: fine-grained authorization, least privilege, and continuous entitlement review.

5) Slow revocation in distributed systems

Even if your IdP can revoke refresh tokens, downstream APIs may accept access tokens until they expire.

Countermeasures: short access-token TTLs, centralized authorization checks, introspection for high-risk operations, and security event propagation (SSF/CAEP patterns).


Design principle: treat tokens like volatile, single-purpose instruments

A good mental model:

  • Access tokens are like “single-purpose debit cards.” Limit where they work (audience) and what they buy (scope).
  • Refresh tokens are like “master keys.” Protect them heavily, rotate them, and revoke aggressively.
  • Session cookies are bearer tokens. Treat them as such, especially for headless automation.

For AI agents specifically: aim for task-scoped, time-bounded, and tool-bounded credentials.


Comparison table: token and credential options for agentic + NHI workloads

The goal is not “one true token.” It’s choosing the right primitive for the job.

| Option | Typical lifetime | Revocation behavior | Best for | Common platforms / products |
| --- | --- | --- | --- | --- |
| OAuth 2.0 Access Token (JWT) | 5–15 min | Often implicit (wait for expiry); explicit revocation depends on API checks | High-throughput API calls | Okta, Microsoft Entra ID, Auth0, Ping, Keycloak |
| OAuth 2.0 Access Token (opaque + introspection) | 1–10 min | Can be immediate if API introspects | High-risk operations needing near-real-time revocation | Keycloak, ForgeRock, Ping, some custom AS |
| OAuth Refresh Token | hours–weeks (should be shorter for NHIs) | Strong revocation lever; rotation recommended | Long-running agents that can renew access tokens | Okta (rotation), Entra (varies by flow), Auth0 |
| Workload identity federation (OIDC → cloud STS) | 5–60 min (cloud creds) | Short-lived, scoped to role; rotate by default | CI/CD, workloads getting cloud creds without static keys | AWS STS (AssumeRoleWithWebIdentity), GCP Workload Identity Federation, Azure federated credentials |
| SPIFFE/SPIRE SVID (mTLS identity) | minutes–hours | Rotated automatically; trust anchored in mesh/agent | Service-to-service auth, mesh identity | SPIRE, Istio, Linkerd (via SPIFFE), Consul |
| API keys (static) | months–never (bad) | Hard to revoke safely; often copied everywhere | Last resort integrations | Many SaaS APIs |
| mTLS client certs | days–months (but can be short) | Revocation via CRL/OCSP; operational overhead | High assurance service auth | NGINX, Envoy, service meshes |
| DPoP (proof-of-possession) | same as access token | Limits replay; still needs TTL/revocation strategy | Public clients / agent frameworks that can sign requests | Some AS/resource servers; emerging support |

Opinionated guidance: for agents and NHIs, prefer federated, short-lived credentials (cloud STS, SVIDs, short JWT access tokens) over static API keys or long-lived refresh tokens.


The “Four Boundaries” model for agent tokens

When you issue a token to an agent, define four explicit boundaries. If you can’t describe them, you probably don’t have them.

  1. Time boundary – how long is the token valid?
  2. Task boundary – what job is the token for (ticket update, read-only search, deploy)?
  3. Tool boundary – which tools/APIs can accept it (audience, resource indicator)?
  4. Trust boundary – where can it be used from (sender constraints, network, device/workload posture)?
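
One way to make the four boundaries reviewable is to write them down as data next to each token request. The sketch below is illustrative only: the `TokenBoundaries` type, its field names, and the 600-second ceiling are assumptions for the example, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBoundaries:
    """Hypothetical record of the four boundaries for one agent token."""
    ttl_seconds: int        # time boundary
    task: str               # task boundary, e.g. "tickets:comment"
    audience: str           # tool boundary: the one API that may accept the token
    sender_constraint: str  # trust boundary: "mtls", "dpop", or "none"

    def gaps(self, max_ttl_seconds: int = 600) -> list:
        """Return a description of every boundary that is missing or too loose."""
        problems = []
        if self.ttl_seconds <= 0 or self.ttl_seconds > max_ttl_seconds:
            problems.append("time: TTL missing or above the assumed maximum")
        if not self.task.strip():
            problems.append("task: no task named")
        if not self.audience.strip() or "*" in self.audience:
            problems.append("tool: audience missing or wildcarded")
        if self.sender_constraint not in {"mtls", "dpop", "none"}:
            problems.append("trust: unknown sender constraint")
        return problems
```

A review step could then reject any token request whose `gaps()` list is non-empty, and separately flag `"none"` sender constraints on high-risk tasks.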

Time: token lifetimes that actually work

A workable default starting point for many enterprises:

  • Access token TTL: 5–10 minutes for API access
  • Refresh token TTL: avoid for NHIs if possible; if required, 1–24 hours + rotation
  • Cloud STS creds: 15–60 minutes (a range most providers support; AWS STS, for example, enforces a 15-minute minimum)
  • mTLS/SVID certs: rotate every 1–4 hours

Short lifetimes are not free: they increase token minting, load on authorization servers, and operational noise. But they are one of the few controls that reliably reduce blast radius across heterogeneous APIs.
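
To put a rough number on that minting cost, here is a back-of-the-envelope sketch. It assumes every token is renewed once it reaches a fixed fraction of its TTL and that renewals are spread evenly over time; the function name and the 0.8 default are illustrative assumptions.

```python
def mint_requests_per_second(agents: int, tokens_per_agent: int,
                             ttl_seconds: float,
                             refresh_fraction: float = 0.8) -> float:
    """Steady-state token requests per second hitting the authorization server.

    Each token is assumed to be renewed after refresh_fraction * ttl_seconds.
    """
    if ttl_seconds <= 0 or not 0 < refresh_fraction <= 1:
        raise ValueError("ttl_seconds must be > 0 and refresh_fraction in (0, 1]")
    return agents * tokens_per_agent / (ttl_seconds * refresh_fraction)


# Example inputs (assumed): 2,000 agents, 3 per-tool tokens each.
five_minute_load = mint_requests_per_second(2000, 3, ttl_seconds=300)
one_hour_load = mint_requests_per_second(2000, 3, ttl_seconds=3600)
```

With these inputs, a 5-minute TTL works out to roughly 25 token requests per second, about twelve times the load of a 1-hour TTL. That is usually manageable, but it is worth checking against your authorization server's rate limits before changing defaults.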

Task: “least privilege” must be expressed in the token

For OAuth/OIDC, that means:

  • Scopes that map to discrete actions (e.g., tickets:comment, not tickets:admin)
  • Claims like sub (the principal the token is issued for), act (the acting party, such as the agent, per RFC 8693), and azp (authorized party)
  • Fine-grained authorization checks in the resource server

If your agent is doing multiple tasks, consider issuing separate tokens per tool and per task—even if minted by the same identity.
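
A resource-server scope check can be very small. The sketch below assumes the access token's claims have already been verified and that scopes arrive as a space-delimited `scope` string; some IdPs use a different claim name or an array, so treat the claim handling as an assumption.

```python
def missing_scopes(granted: str, required: set) -> set:
    """Return the required scopes that are absent from a space-delimited scope string."""
    return set(required) - set(granted.split())


def is_authorized(claims: dict, required: set) -> bool:
    """Exact-match scope check: no wildcards and no implied hierarchy."""
    scope = claims.get("scope", "")
    if not isinstance(scope, str):
        return False
    return not missing_scopes(scope, required)
```

Exact matching is deliberate here: a token carrying tickets:read does not silently satisfy tickets:comment, and nothing short of the named scope satisfies an admin-level requirement.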

Tool: lock down aud and resource indicators

A common failure mode: using tokens minted “for the agent” that are accepted by any internal API.

  • Validate aud strictly in every API
  • Consider RFC 8707 resource indicators (where supported) to request tokens for a specific API
  • If you have multiple environments, keep audiences environment-specific (api.prod.example.com vs api.dev.example.com)
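
A minimal sketch of strict audience validation follows. It assumes the token's signature and issuer have already been verified by a JWT library and only inspects the decoded claims; the function name and the 30-second leeway are illustrative choices.

```python
import time
from typing import Optional


def check_aud_and_exp(claims: dict, expected_aud: str,
                      now: Optional[float] = None, leeway: int = 30) -> None:
    """Raise ValueError unless the token names this API and has not expired."""
    aud = claims.get("aud")
    if isinstance(aud, str):       # the aud claim may be a single string...
        audiences = [aud]
    elif isinstance(aud, list):    # ...or a list of strings
        audiences = aud
    else:
        audiences = []
    if expected_aud not in audiences:
        raise ValueError(f"token audience {audiences!r} does not include {expected_aud!r}")

    exp = claims.get("exp")
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or exp + leeway < current:
        raise ValueError("token is expired or has no exp claim")
```

Each API passes its own identifier as `expected_aud`, so a token minted for api.dev.example.com is rejected by api.prod.example.com even when both trust the same issuer.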

Trust: sender constraints for replay resistance

If a stolen token is your biggest worry, a pure bearer token is fragile. Two practical sender-constraint approaches:

  1. mTLS-bound access tokens (aka “certificate-bound” tokens)
  2. DPoP (a signed proof attached to each request)

You may not be able to deploy these everywhere today, but you can use them for your highest-risk agent actions first (production changes, secrets access, admin APIs).
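
For the mTLS-bound case, RFC 8705 binds a token to a client certificate by placing the certificate's SHA-256 thumbprint in the `cnf` claim under `x5t#S256`. The sketch below shows only the comparison step, using the standard library. It assumes the TLS layer hands you the client certificate in DER form and that the token's signature was verified elsewhere; the function names are illustrative.

```python
import base64
import hashlib
import hmac


def cert_thumbprint(der_cert: bytes) -> str:
    """base64url (unpadded) SHA-256 digest of a DER-encoded certificate."""
    digest = hashlib.sha256(der_cert).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def is_bound_to_cert(claims: dict, der_cert: bytes) -> bool:
    """True only if the token's cnf thumbprint matches the presented certificate."""
    cnf = claims.get("cnf")
    expected = cnf.get("x5t#S256") if isinstance(cnf, dict) else None
    if not isinstance(expected, str):
        return False  # an unbound token is rejected on endpoints that require binding
    actual = cert_thumbprint(der_cert)
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))
```

With this check in place, a token copied out of a log is not enough on its own: the caller also has to complete a TLS handshake with the matching private key.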


Concrete implementation guidance (by platform/pattern)

This section is deliberately concrete. You should be able to hand it to an engineer.

Pattern 1: CI/CD → cloud provider (OIDC federation), no static keys

Problem: build systems often leak long-lived cloud keys.

Recommended approach: use OIDC federation from your CI provider into cloud STS.

  • GitHub Actions → AWS STS AssumeRoleWithWebIdentity
  • GitHub Actions → Azure federated identity credentials for a managed identity / app registration
  • GitHub Actions → GCP Workload Identity Federation

Key controls:

  • Restrict OIDC subject claims (repo, environment, branch/tag) in the trust policy
  • Restrict audience to the intended STS
  • Keep session duration short (15–60 minutes)
  • Apply least privilege IAM roles; separate “plan” vs “apply” roles

AWS example (high level):

  • Create an IAM role with trust policy allowing GitHub’s OIDC provider
  • Condition on token.actions.githubusercontent.com:sub and aud
  • Use permission boundary or SCPs for guardrails
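
The trust policy described above can be sketched as a small Python helper that emits the policy JSON. The account ID, repository, and environment name are placeholders, and the `sub` format shown assumes the job runs in a GitHub environment; other subject formats exist for branches and tags.

```python
import json

GITHUB_OIDC = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id: str, repo: str, environment: str) -> dict:
    """Build an IAM trust policy limited to one repo and one GitHub environment."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{GITHUB_OIDC}:aud": "sts.amazonaws.com",
                    f"{GITHUB_OIDC}:sub": f"repo:{repo}:environment:{environment}",
                }
            },
        }],
    }


# Placeholder values for illustration only.
policy_json = json.dumps(
    github_oidc_trust_policy("123456789012", "example-org/example-repo", "production"),
    indent=2,
)
```

Using `StringEquals` on both `aud` and `sub` keeps the match exact. Wildcard matching on `sub` is possible, but it widens which workflows can assume the role.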

GCP example (high level):

  • Create a Workload Identity Pool + Provider for GitHub
  • Map attributes like repository and ref
  • Bind a service account with minimal roles

Azure example (high level):

  • Configure federated credential on an app registration or managed identity
  • Limit issuer, subject, and audience
  • Use Azure RBAC with narrow scope (resource group or specific resources)

Pattern 2: Kubernetes workloads → internal APIs (service identity)

If you’re still using Kubernetes long-lived service account tokens, you’re behind modern defaults.

Recommended approach:

  • Use projected service account tokens (shorter-lived, audience-bound)
  • Enforce audience on token requests
  • Use a mesh identity (SPIFFE/SPIRE or Istio) for service-to-service mTLS

Controls to prioritize:

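  • Kubernetes: confirm bound service account tokens are in use (BoundServiceAccountTokenVolume has been GA and on by default since Kubernetes 1.22), and remove any legacy Secret-based tokens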
  • Set automountServiceAccountToken: false by default and opt-in per workload
  • Use NetworkPolicies to restrict egress from agent pods
  • Use admission controls (OPA Gatekeeper / Kyverno) to enforce token + privilege rules
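
As a sketch of what the projected, audience-bound token looks like in a pod spec, the helper below builds the relevant fragment as a Python dict that can be serialized to JSON or YAML. The container name, volume name, and mount path are illustrative assumptions.

```python
def pod_spec_with_projected_token(image: str, audience: str,
                                  ttl_seconds: int = 600) -> dict:
    """Pod spec fragment: no automounted token, one short-lived audience-bound token."""
    if ttl_seconds < 600:
        # Kubernetes rejects expirationSeconds below 10 minutes.
        raise ValueError("expirationSeconds must be at least 600")
    return {
        "automountServiceAccountToken": False,  # opt out of the default token mount
        "containers": [{
            "name": "agent",  # hypothetical container name
            "image": image,
            "volumeMounts": [{
                "name": "api-token",
                "mountPath": "/var/run/secrets/tokens",
                "readOnly": True,
            }],
        }],
        "volumes": [{
            "name": "api-token",
            "projected": {
                "sources": [{
                    "serviceAccountToken": {
                        "path": "api-token",
                        "audience": audience,
                        "expirationSeconds": ttl_seconds,
                    }
                }]
            },
        }],
    }
```

The kubelet refreshes the projected token file before it expires, so the workload should re-read the file periodically rather than caching the token for the life of the process.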

Pattern 3: AI agent → tool APIs (ticketing, CRM, data)

AI agents often need to act on behalf of a user, but with tighter boundaries.

A practical model:

  • User authenticates normally (SSO + MFA)
  • User authorizes the agent to perform a defined set of actions
  • Agent receives a delegated token with:
    • explicit scopes
    • short TTL
    • sub identifying the user the agent is acting for
    • act (actor) claim identifying the agent workload identity

Where possible:

  • Avoid giving agents a user’s refresh token
  • Prefer token exchange patterns (RFC 8693) to mint a new, constrained token
  • Log every agent action with correlation IDs and actor + subject
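
A token exchange request in the RFC 8693 style is an ordinary form-encoded POST body. The sketch below only builds that body and does not send it. In RFC 8693 terms, the subject token represents the party the new token is about and the actor token represents the party doing the acting. The function name is illustrative, and whether your authorization server accepts `audience`, `actor_token`, or these token types is an assumption to verify against its documentation.

```python
from urllib.parse import urlencode

ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_body(user_token: str, agent_token: str,
                        audience: str, scopes: set) -> str:
    """Form body asking for a narrower token for one tool, on behalf of a user."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": user_token,    # who the resulting token is about
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": agent_token,     # who is acting (the agent workload)
        "actor_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,           # the single tool API that may accept it
        "scope": " ".join(sorted(scopes)),
    })
```

The agent would POST this body to the authorization server's token endpoint with its own client authentication, then use the returned token only for the named tool.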

If you need “impersonation,” treat it as a privileged capability with approvals and tight auditing.

Pattern 4: Central secrets manager with short-lived leases

If your agent needs credentials for downstream systems that cannot do OAuth well, use a broker.

  • HashiCorp Vault: dynamic secrets for DBs, cloud, PKI; short TTL + renewal
  • AWS Secrets Manager / GCP Secret Manager / Azure Key Vault: store secrets, rotate where possible

Guardrails:

  • Agents authenticate to the secrets manager using workload identity (OIDC, Kubernetes auth)
  • Secrets are leased for minutes/hours, not months
  • Rotate and revoke automatically on risk events
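
On the agent side, short leases mostly come down to bookkeeping: know when a credential was issued, renew it well before expiry, and never use it afterwards. The sketch below is a generic model, not a client for any particular secrets manager; the `Lease` type and the 50% renewal point are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    """Hypothetical record of one leased credential."""
    issued_at: float    # seconds since epoch when the lease was granted
    ttl_seconds: float  # lease duration reported by the secrets manager

    @property
    def expires_at(self) -> float:
        return self.issued_at + self.ttl_seconds

    def should_renew(self, now: float, renew_fraction: float = 0.5) -> bool:
        """True once the lease has used up renew_fraction of its lifetime."""
        return now >= self.issued_at + self.ttl_seconds * renew_fraction

    def is_expired(self, now: float) -> bool:
        return now >= self.expires_at
```

An agent loop would check `should_renew` before each use and fetch a fresh credential if renewal fails, treating `is_expired` as a hard stop.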

Revocation and security events: don’t rely on “wait for expiry”

In distributed architectures, you need a strategy for fast invalidation.

Pragmatic options (mix-and-match):

  1. Short access-token TTLs (baseline)
  2. Introspection for high-risk endpoints only (admin actions, secrets access)
  3. Back-channel security events to inform relying parties
  4. Centralized authorization for sensitive decisions
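
Options 1 and 2 combine naturally: validate every token locally, and call the introspection endpoint only for high-risk paths. The sketch below assumes an `introspect` callable that returns an RFC 7662-style response with an `active` boolean; the path prefixes and function name are illustrative.

```python
from typing import Callable

# Assumed high-risk path prefixes; adjust to your own API layout.
HIGH_RISK_PREFIXES = ("/admin/", "/secrets/")


def token_is_acceptable(path: str, locally_valid: bool, token: str,
                        introspect: Callable[[str], dict]) -> bool:
    """Local checks for everything; live introspection for high-risk paths only."""
    if not locally_valid:
        return False
    if not path.startswith(HIGH_RISK_PREFIXES):
        return True  # rely on the short TTL for ordinary endpoints
    # Fail closed: anything other than an explicit active=True is rejected.
    return introspect(token).get("active") is True
```

This keeps the introspection load proportional to the number of sensitive calls rather than to total API traffic.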

Where SSF/CAEP fit (even if you’re not “fully there” yet)

If you want revocation and risk signals to propagate beyond one IdP, you need eventing patterns.

  • OpenID Shared Signals Framework (SSF) provides a way to transmit security events.
  • CAEP (Continuous Access Evaluation Profile) is a profile of SSF that defines event types, such as session revoked and credential change, so receivers can re-evaluate sessions and tokens continuously.

Even if you’re not implementing SSF/CAEP end-to-end today, you can mimic the operational benefit:

  • Emit internal “token risk” events (user disabled, secret rotated, agent quarantined)
  • Subscribe critical services (API gateway, authorization service) to those events
  • Invalidate cached decisions and enforce step-up / deny
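
The internal-equivalent pattern above can be sketched as a decision cache that reacts to risk events. This is an in-memory illustration only: the event shape (a dict with a "subject" key) and the class name are assumptions, and a real deployment would receive events from a message bus or an SSF receiver.

```python
from typing import Optional


class DecisionCache:
    """Caches allow/deny decisions and drops them when a subject becomes risky."""

    def __init__(self) -> None:
        self._decisions = {}       # (subject, action) -> bool
        self._quarantined = set()  # subjects denied until cleared

    def remember(self, subject: str, action: str, allowed: bool) -> None:
        self._decisions[(subject, action)] = allowed

    def on_risk_event(self, event: dict) -> None:
        """Handle an event such as {"type": "agent-quarantined", "subject": "agent-7"}."""
        subject = event["subject"]
        self._quarantined.add(subject)
        for key in [k for k in self._decisions if k[0] == subject]:
            del self._decisions[key]

    def lookup(self, subject: str, action: str) -> Optional[bool]:
        """False for quarantined subjects, the cached decision if any, else None."""
        if subject in self._quarantined:
            return False
        return self._decisions.get((subject, action))
```

A `None` result means "no cached answer", so the caller falls through to a full authorization check. Quarantined subjects are denied immediately, without waiting for their tokens to expire.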

Phased adoption plan (what to do this quarter)

If you’re starting from “lots of API keys and long-lived tokens,” here’s an opinionated 4-phase sequence.

Phase 0 — Inventory and classify (1–2 weeks)

  • Inventory all non-human identities (service principals, API clients, agents)
  • Classify by blast radius (data sensitivity + actionability)
  • Identify where tokens/keys live (code, CI variables, secrets stores)

Phase 1 — Kill the worst keys (2–4 weeks)

  • Replace static cloud keys with OIDC federation (GitHub/GitLab/CircleCI → STS)
  • Enforce least privilege roles + short sessions
  • Add secret scanning to repos and build logs

Phase 2 — Standardize token boundaries (4–8 weeks)

  • Set org defaults: access token TTL 5–10 minutes for internal APIs
  • Enforce aud validation across services
  • Introduce token exchange for agents (mint constrained tokens per tool/task)
  • Add authorization decision logging

Phase 3 — Replay resistance + revocation improvements (ongoing)

  • Add sender constraints for the highest-risk flows (mTLS/DPoP)
  • Use introspection for admin endpoints or sensitive actions
  • Implement security event propagation (SSF/CAEP style) or internal equivalent

Actionable checklist (copy/paste)

Token issuance

  • Access token TTL set to 5–10 minutes (exceptions documented)
  • Refresh token rotation enabled where refresh tokens are unavoidable
  • Token audience is service-specific (not “all internal APIs”)
  • Scopes map to discrete actions; no “super scopes” for agents

Resource servers (APIs)

  • Strict aud validation
  • Scope/claim checks implemented server-side (not just in gateway)
  • High-risk endpoints support introspection or additional checks
  • Authorization decisions are logged with actor + subject

Runtime & operations

  • No tokens in logs/traces; redaction and linting in place
  • Workloads use federated identity (STS, projected SA tokens, SPIFFE)
  • Network egress restrictions for agent runtimes
  • Incident playbook includes token revocation + key rotation steps

AI agent-specific

  • Agent actions are tool-bounded (per-tool tokens)
  • Delegation model is explicit (actor claims / token exchange)
  • “Break glass” impersonation requires approvals and is audited

Closing: the safest token is the one that can’t be reused

You can’t prevent every token leak. You can design your system so leaked tokens are:

  • short-lived,
  • narrowly scoped,
  • audience-bound,
  • replay-resistant (where it matters), and
  • quickly invalidated.

As AI agents proliferate, token security becomes the foundation for safe autonomy.