Token Security for AI Agents and Non‑Human Identities (NHI): Practical Patterns for 2026

Summary: AI agents and other non‑human identities are turning “token hygiene” from a best practice into a production reliability and breach‑prevention requirement. This post lays out a pragmatic, vendor‑neutral approach to designing token lifetimes, audience/scope, sender constraints, and revocation for agentic workloads—plus concrete configurations across common IdPs, cloud providers, and service meshes.

No major SSF/CAEP or agent-identity “breaking news” surfaced in the last 24 hours. Instead, today’s post focuses on a high‑impact best‑practice topic that keeps showing up in real incidents: token theft + over‑privileged non‑human identities—now amplified by autonomous agents.

Why token security is suddenly everyone’s problem

Most organizations have spent years maturing identity controls for humans: MFA, conditional access, device posture, and step‑up authentication. But the fastest growth in identities is now non‑human identity (NHI):

Kubernetes workloads and service accounts
CI/CD runners and build pipelines
SaaS integrations and API clients
Bots, RPA, and event-driven automations
AI agents that call tools, fetch data, and take actions

The common denominator is tokens: OAuth access tokens, OIDC ID tokens, SAML assertions (less often for workloads), API keys, session cookies, and cloud provider credentials.

For humans, a stolen token is bad. For an AI agent, a stolen token can be worse because:

Agents run continuously. A token stolen at 2 AM can be used immediately and repeatedly, and off-hours use does not stand out against an agent's normal traffic.
Agent tokens often carry broad permissions. “Let the agent do its job” becomes “let it do everything.”
Agents touch sensitive systems. Ticketing, customer data, production deployments, secrets stores.
The blast radius is non-obvious. Toolchains create transitive access: GitHub → cloud → cluster → data.

Token security isn’t just “shorter expirations.” It’s a set of coordinated design choices:

Who can get a token (workload identity)
What a token can do (scopes/claims/permissions)
Where the token can be used (audience, resource indicators)
Whether a stolen token can be replayed (sender constraints like mTLS/DPoP)
How fast you can stop it (revocation, eventing, session risk)

A practical threat model: what actually goes wrong

Token compromise for NHIs and agents typically falls into a few buckets:

1) Token exfiltration from logs, traces, or error reports

Access tokens accidentally logged by reverse proxies or API gateways
Debug logs in agents that print request headers
Observability tools capturing request bodies/headers

Countermeasures: log redaction, structured logging guardrails, “no secrets in telemetry” policies, WAF rules, and automated scanning (e.g., GitHub secret scanning; TruffleHog; Gitleaks).
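
As an illustration of the log-redaction countermeasure, here is a minimal sketch using Python's standard logging module. The regular expressions and the names redact and RedactTokens are assumptions made for this example; real token formats vary, so treat the patterns as a starting point rather than complete coverage.

```python
import logging
import re

# Matches "Bearer <token>" and compact JWTs (three base64url segments).
_BEARER = re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/=-]+")
_JWT = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def redact(text: str) -> str:
    """Replace token-like substrings with a fixed placeholder."""
    text = _BEARER.sub(r"\1 [REDACTED]", text)
    return _JWT.sub("[REDACTED_JWT]", text)


class RedactTokens(logging.Filter):
    """Logging filter that scrubs the fully formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = None  # the message is already formatted
        return True
```

Attaching the filter to handlers (handler.addFilter(RedactTokens())) rather than to a single logger is usually the safer choice, because logger-level filters do not see records that propagate up from child loggers.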

2) Token theft from runtime memory or local storage

Tokens stored unencrypted on disk in an agent host
Compromised container reading service account token files
Browser automation capturing session cookies

Countermeasures: avoid long‑lived tokens, use workload identity federation, use secrets managers with short leases (Vault), isolate runtimes (gVisor/Kata), and apply pod security controls.

3) Replay and “token forwarding” attacks

A token intended for Service A is replayed against Service B
Token used from a different environment/host than expected

Countermeasures: strict aud validation, resource indicators where supported, and sender‑constrained tokens (mTLS or DPoP).

4) Over-broad scopes / entitlements

“agent.readwrite” scope that maps to admin-like permissions
Service principal granted Owner/Contributor everywhere

Countermeasures: fine-grained authorization, least privilege, and continuous entitlement review.

5) Slow revocation in distributed systems

Even if your IdP can revoke refresh tokens, downstream APIs may accept access tokens until they expire.

Countermeasures: short access-token TTLs, centralized authorization checks, introspection for high-risk operations, and security event propagation (SSF/CAEP patterns).

Design principle: treat tokens like volatile, single-purpose instruments

A good mental model:

Access tokens are like “single-purpose debit cards.” Limit where they work (audience) and what they buy (scope).
Refresh tokens are like “master keys.” Protect them heavily, rotate them, and revoke aggressively.
Session cookies are bearer tokens. Treat them as such, especially for headless automation.

For AI agents specifically: aim for task-scoped, time-bounded, and tool-bounded credentials.

If you want a deeper foundation on the identity side, these Learn IAM topics are useful background:

Non-human identity (NHI): https://learn-iam.com/topics/identity-for-ai/non-human-identity
AI agent identity: https://learn-iam.com/topics/identity-for-ai/ai-agent-identity
AI tool authorization: https://learn-iam.com/topics/authorization/ai-agent-tool-authorization
API authorization (scopes/claims): https://learn-iam.com/topics/authorization/authorization-for-apis-scopes-claims
Delegation & impersonation: https://learn-iam.com/topics/authorization/delegation-impersonation-acting-as

Comparison table: token and credential options for agentic + NHI workloads

The goal is not “one true token.” It’s choosing the right primitive for the job.

Option	Typical lifetime	Revocation behavior	Best for	Common platforms / products
OAuth 2.0 Access Token (JWT)	5–15 min	Often implicit (wait for expiry); explicit revocation depends on API checks	High-throughput API calls	Okta, Microsoft Entra ID, Auth0, Ping, Keycloak
OAuth 2.0 Access Token (opaque + introspection)	1–10 min	Can be immediate if API introspects	High-risk operations needing near-real-time revocation	Keycloak, ForgeRock, Ping, some custom AS
OAuth Refresh Token	hours–weeks (should be shorter for NHIs)	Strong revocation lever; rotation recommended	Long-running agents that can renew access tokens	Okta (rotation), Entra (varies by flow), Auth0
Workload identity federation (OIDC → cloud STS)	5–60 min (cloud creds)	Short-lived, scoped to role; rotate by default	CI/CD, workloads getting cloud creds without static keys	AWS STS (AssumeRoleWithWebIdentity), GCP Workload Identity Federation, Azure federated credentials
SPIFFE/SPIRE SVID (mTLS identity)	minutes–hours	Rotated automatically; trust anchored in mesh/agent	Service-to-service auth, mesh identity	SPIRE, Istio, Linkerd (via SPIFFE), Consul
API keys (static)	months–never (bad)	Hard to revoke safely; often copied everywhere	Last resort integrations	Many SaaS APIs
mTLS client certs	days–months (but can be short)	Revocation via CRL/OCSP; operational overhead	High assurance service auth	NGINX, Envoy, service meshes
DPoP (proof-of-possession)	same as access token	Limits replay; still needs TTL/revocation strategy	Public clients / agent frameworks that can sign requests	Some AS/resource servers; emerging support

Opinionated guidance: for agents and NHIs, prefer federated, short-lived credentials (cloud STS, SVIDs, short JWT access tokens) over static API keys or long-lived refresh tokens.

The “Four Boundaries” model for agent tokens

When you issue a token to an agent, define four explicit boundaries. If you can’t describe them, you probably don’t have them.

Time boundary – how long is the token valid?
Task boundary – what job is the token for (ticket update, read-only search, deploy)?
Tool boundary – which tools/APIs can accept it (audience, resource indicator)?
Trust boundary – where can it be used from (sender constraints, network, device/workload posture)?
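
One way to make the four boundaries explicit is to write them down as data and lint them before a token policy ships. The sketch below is illustrative only: TokenBoundaries, boundary_gaps, the 900-second ceiling, and the "too broad" scope patterns are all assumptions for this example, not a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBoundaries:
    ttl_seconds: int          # time boundary
    scopes: frozenset         # task boundary
    audience: str             # tool boundary
    sender_constraint: str    # trust boundary: "mtls", "dpop", or "none"


def boundary_gaps(b: TokenBoundaries, max_ttl_seconds: int = 900) -> list:
    """Return the boundaries that are missing or too loose (empty list = OK)."""
    gaps = []
    if not 0 < b.ttl_seconds <= max_ttl_seconds:
        gaps.append("time")
    too_broad = any(s == "*" or s.endswith((":admin", ":*")) for s in b.scopes)
    if not b.scopes or too_broad:
        gaps.append("task")
    if not b.audience or b.audience == "*":
        gaps.append("tool")
    if b.sender_constraint not in ("mtls", "dpop"):
        gaps.append("trust")
    return gaps
```

A policy with a 24-hour TTL, a tickets:admin scope, no audience, and no sender constraint would report all four gaps, which is the "if you can't describe them, you probably don't have them" case in code form.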

Time: token lifetimes that actually work

A workable default starting point for many enterprises:

Access token TTL: 5–10 minutes for API access
Refresh token TTL: avoid for NHIs if possible; if required, 1–24 hours + rotation
Cloud STS creds: 15–60 minutes (default provider patterns)
mTLS/SVID certs: rotate every 1–4 hours

Short lifetimes are not free: they increase token minting, load on authorization servers, and operational noise. But they are one of the few controls that reliably reduce blast radius across heterogeneous APIs.
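
To put a rough number on that minting cost, a back-of-the-envelope estimate helps. The sketch assumes each agent holds a fixed number of tokens and refreshes each one after a fixed fraction of its TTL; the function name and the 0.8 default are assumptions for illustration.

```python
def mint_rate_per_second(agents: int, tokens_per_agent: int,
                         ttl_seconds: int, refresh_fraction: float = 0.8) -> float:
    """Steady-state token requests per second hitting the authorization server."""
    return agents * tokens_per_agent / (ttl_seconds * refresh_fraction)


# 2,000 agents, 3 tokens each: compare a 5-minute TTL with a 60-minute TTL.
short_ttl = mint_rate_per_second(2000, 3, 300)
long_ttl = mint_rate_per_second(2000, 3, 3600)
print(f"5 min TTL: {short_ttl:.1f}/s, 60 min TTL: {long_ttl:.1f}/s")
```

Under these assumptions the 5-minute TTL costs roughly 25 mints per second against roughly 2 for the 60-minute TTL: a twelvefold increase, but one that is easy to plan capacity for.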

Task: “least privilege” must be expressed in the token

For OAuth/OIDC, that means:

Scopes that map to discrete actions (e.g., tickets:comment, not tickets:admin)
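Claims like sub (the subject: the agent itself, or the user in delegated flows), act (the actor: in RFC 8693 delegation, the agent acting on the subject's behalf), and azp (authorized party)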
Fine-grained authorization checks in the resource server

If your agent is doing multiple tasks, consider issuing separate tokens per tool and per task—even if minted by the same identity.
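
A resource server can enforce "scopes map to discrete actions" with a deny-by-default lookup. This is a minimal sketch: the route table, the scope names other than tickets:comment, and the function is_allowed are hypothetical, and it assumes the token's scope value is the space-delimited string defined by OAuth 2.0.

```python
# Hypothetical mapping from (method, route template) to the one scope required.
REQUIRED_SCOPE = {
    ("POST", "/tickets/{id}/comments"): "tickets:comment",
    ("GET", "/tickets/{id}"): "tickets:read",
}


def is_allowed(method: str, route: str, token_scope: str) -> bool:
    """Allow only if the route is known and its exact scope was granted."""
    required = REQUIRED_SCOPE.get((method, route))
    if required is None:
        return False  # unknown routes are denied by default
    return required in token_scope.split()
```

Because the check is an exact match, a token carrying only tickets:read cannot post a comment, and no broad scope is quietly treated as a superset.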

Tool: lock down `aud` and resource indicators

A common failure mode: using tokens minted “for the agent” that are accepted by any internal API.

Validate aud strictly in every API
Consider RFC 8707 resource indicators (where supported) to request tokens for a specific API
If you have multiple environments, keep audiences environment-specific (api.prod.example.com vs api.dev.example.com)
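
Strict audience validation can be sketched as a small check that runs after signature verification. The function name validate_claims and the 30-second leeway are assumptions for this example; it handles both forms of aud allowed for JWTs (a single string or a list of strings).

```python
import time
from typing import Optional


def validate_claims(claims: dict, expected_aud: str,
                    now: Optional[float] = None, leeway: int = 30) -> None:
    """Raise ValueError unless already-verified claims are meant for this API."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_aud not in audiences:
        raise ValueError("token audience does not include this API")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now > exp + leeway:
        raise ValueError("token is expired or has no exp claim")
```

With environment-specific audiences, a token minted for api.dev.example.com fails this check at api.prod.example.com even if both environments trust the same issuer.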

Trust: sender constraints for replay resistance

If a stolen token is your biggest worry, a pure bearer token is fragile. Two practical sender-constraint approaches:

mTLS-bound access tokens (aka “certificate-bound” tokens)
DPoP (a signed proof attached to each request)

You may not be able to deploy these everywhere today, but you can use them for your highest-risk agent actions first (production changes, secrets access, admin APIs).
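
For certificate-bound tokens, the resource-server side of the check is small. RFC 8705 binds a token to a client certificate by placing the base64url-encoded SHA-256 hash of the DER certificate in the cnf claim under x5t#S256. The sketch below assumes the token's signature has already been verified and that the TLS layer supplies the peer certificate in DER form; the function names are illustrative.

```python
import base64
import hashlib
import hmac


def cert_thumbprint(der_cert: bytes) -> str:
    """base64url(SHA-256(DER certificate)) without padding."""
    digest = hashlib.sha256(der_cert).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def is_bound_to_cert(claims: dict, der_cert: bytes) -> bool:
    """True only if the token's cnf claim matches the presented certificate."""
    expected = (claims.get("cnf") or {}).get("x5t#S256")
    if not isinstance(expected, str):
        return False  # fail closed: an unbound token is rejected here
    actual = cert_thumbprint(der_cert)
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))
```

In a Python TLS server, ssl.SSLSocket.getpeercert(binary_form=True) returns the DER bytes; behind a proxy such as Envoy or NGINX, the certificate typically has to be forwarded in a header that only the proxy is allowed to set.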

Concrete implementation guidance (by platform/pattern)

This section is deliberately concrete. You should be able to hand it to an engineer.

Pattern 1: CI/CD → cloud provider (OIDC federation), no static keys

Problem: build systems often leak long-lived cloud keys.

Recommended approach: use OIDC federation from your CI provider into cloud STS.

GitHub Actions → AWS STS AssumeRoleWithWebIdentity
GitHub Actions → Azure federated identity credentials for a managed identity / app registration
GitHub Actions → GCP Workload Identity Federation

Key controls:

Restrict OIDC subject claims (repo, environment, branch/tag) in the trust policy
Restrict audience to the intended STS
Keep session duration short (15–60 minutes)
Apply least privilege IAM roles; separate “plan” vs “apply” roles

AWS example (high level):

Create an IAM role with trust policy allowing GitHub’s OIDC provider
Condition on both token.actions.githubusercontent.com:sub (repository and ref or environment) and token.actions.githubusercontent.com:aud
Use permission boundary or SCPs for guardrails
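
The AWS steps above can be sketched as a trust policy generated from code. The account ID, repository, and ref passed in are placeholders, and the helper name is an assumption; the sketch also assumes the sts.amazonaws.com audience used by GitHub's official AWS credentials action.

```python
import json


def github_oidc_trust_policy(account_id: str, repo: str, ref: str) -> dict:
    """Trust policy letting one repo/ref assume the role via GitHub OIDC."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{provider}:aud": "sts.amazonaws.com",
                    f"{provider}:sub": f"repo:{repo}:ref:{ref}",
                }
            },
        }],
    }


# Placeholder values; substitute your own account, repository, and branch.
policy = github_oidc_trust_policy("123456789012", "example-org/example-repo",
                                  "refs/heads/main")
print(json.dumps(policy, indent=2))
```

Using StringEquals on sub (rather than a StringLike wildcard) is the tighter choice; widen it only when several branches or environments genuinely need the same role.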

GCP example (high level):

Create a Workload Identity Pool + Provider for GitHub
Map attributes like repository and ref
Bind a service account with minimal roles

Azure example (high level):

Configure federated credential on an app registration or managed identity
Limit issuer, subject, and audience
Use Azure RBAC with narrow scope (resource group or specific resources)

Pattern 2: Kubernetes workloads → internal APIs (service identity)

If you’re still using Kubernetes long-lived service account tokens, you’re behind modern defaults.

Recommended approach:

Use projected service account tokens (shorter-lived, audience-bound)
Enforce audience on token requests
Use a mesh identity (SPIFFE/SPIRE or Istio) for service-to-service mTLS

Controls to prioritize:

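Kubernetes: rely on bound service account tokens (the BoundServiceAccountTokenVolume feature, enabled by default since Kubernetes 1.21 and GA in 1.22) and confirm your distribution has not disabled it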
Set automountServiceAccountToken: false by default and opt-in per workload
Use NetworkPolicies to restrict egress from agent pods
Use admission controls (OPA Gatekeeper / Kyverno) to enforce token + privilege rules
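
The projected-token controls can be sketched as a Pod manifest built in code and emitted as JSON, which kubectl accepts alongside YAML. The pod name, image, mount path, and audience below are placeholders, and the helper name is an assumption. Kubernetes rejects expirationSeconds below 600, so the sketch enforces that floor.

```python
import json


def agent_pod_manifest(name: str, image: str, audience: str,
                       ttl_seconds: int = 600) -> dict:
    """Pod with no auto-mounted token and one short-lived, audience-bound token."""
    if ttl_seconds < 600:
        raise ValueError("Kubernetes requires expirationSeconds >= 600")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "automountServiceAccountToken": False,
            "containers": [{
                "name": "agent",
                "image": image,
                "volumeMounts": [{
                    "name": "api-token",
                    "mountPath": "/var/run/secrets/tokens",
                    "readOnly": True,
                }],
            }],
            "volumes": [{
                "name": "api-token",
                "projected": {"sources": [{"serviceAccountToken": {
                    "path": "api-token",
                    "audience": audience,
                    "expirationSeconds": ttl_seconds,
                }}]},
            }],
        },
    }


manifest = agent_pod_manifest("ticket-agent", "registry.example.com/agent:1.0",
                              "https://tickets.example.internal")
print(json.dumps(manifest, indent=2))
```

The kubelet refreshes the projected file before it expires, so the agent should re-read the token from disk on each use instead of caching it for the life of the process.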

Pattern 3: AI agent → tool APIs (ticketing, CRM, data)

AI agents often need to act on behalf of a user, but with tighter boundaries.

A practical model:

User authenticates normally (SSO + MFA)
User authorizes the agent to perform a defined set of actions
Agent receives a delegated token with:
- explicit scopes
- short TTL
- sub identifying the user on whose behalf the agent acts
- act (actor) claim identifying the agent workload identity, following RFC 8693

Where possible:

Avoid giving agents a user’s refresh token
Prefer token exchange patterns (RFC 8693) to mint a new, constrained token
Log every agent action with correlation IDs and actor + subject
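
A token exchange request for this pattern can be sketched as follows. The parameter names and URNs come from RFC 8693; the helper name, the choice of token types, and the example values are assumptions. The sketch follows the RFC's delegation convention, in which the subject token represents the user and the actor token represents the agent, and it only builds the form body that would be POSTed to the authorization server's token endpoint.

```python
from urllib.parse import parse_qs, urlencode

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"
JWT_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def token_exchange_body(user_token: str, agent_token: str,
                        audience: str, scopes: list) -> str:
    """Form-encoded body asking for a narrower token: user as subject, agent as actor."""
    return urlencode({
        "grant_type": GRANT_TYPE,
        "subject_token": user_token,            # whom the agent acts for
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": agent_token,             # the agent's own workload credential
        "actor_token_type": JWT_TYPE,
        "audience": audience,                   # the single tool API this token is for
        "scope": " ".join(scopes),
        "requested_token_type": ACCESS_TOKEN_TYPE,
    })


body = token_exchange_body("user-access-token", "agent-workload-jwt",
                           "https://tickets.example.internal", ["tickets:comment"])
```

The agent never holds the user's refresh token in this flow: it trades a short-lived user token plus its own credential for a new token limited to one audience and one scope. Whether the authorization server honors audience, scope, and actor tokens is product-specific, so check its token exchange support before relying on this shape.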

If you need “impersonation,” treat it as a privileged capability with approvals and tight auditing.

Pattern 4: Central secrets manager with short-lived leases

If your agent needs credentials for downstream systems that cannot do OAuth well, use a broker.

HashiCorp Vault: dynamic secrets for DBs, cloud, PKI; short TTL + renewal
AWS Secrets Manager / GCP Secret Manager / Azure Key Vault: store secrets, rotate where possible

Guardrails:

Agents authenticate to the secrets manager using workload identity (OIDC, Kubernetes auth)
Secrets are leased for minutes/hours, not months
Rotate and revoke automatically on risk events
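
On the agent side, short leases imply a small amount of renewal logic. The sketch below is not a Vault client: LeasedSecret and its fetch callback are hypothetical, and the callback stands in for whatever call returns a secret together with its lease TTL in seconds. It re-fetches once half the lease has elapsed and can drop the cached value when a risk event arrives.

```python
import time
from typing import Callable, Optional, Tuple


class LeasedSecret:
    """Caches a leased secret and re-fetches it before the lease runs out."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]],
                 renew_fraction: float = 0.5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch                  # returns (secret, lease_ttl_seconds)
        self._renew_fraction = renew_fraction
        self._clock = clock
        self._value: Optional[str] = None
        self._renew_at = 0.0

    def get(self) -> str:
        """Return the cached secret, fetching a fresh lease when it is due."""
        now = self._clock()
        if self._value is None or now >= self._renew_at:
            self._value, ttl = self._fetch()
            self._renew_at = now + ttl * self._renew_fraction
        return self._value

    def drop(self) -> None:
        """Forget the cached secret, e.g. after a rotation or risk event."""
        self._value = None
```

Injecting the clock keeps the renewal behavior testable without sleeping, and using a monotonic clock by default avoids surprises when the wall clock is adjusted.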

Revocation and security events: don’t rely on “wait for expiry”

In distributed architectures, you need a strategy for fast invalidation.

Pragmatic options (mix-and-match):

Short access-token TTLs (baseline)
Introspection for high-risk endpoints only (admin actions, secrets access)
Back-channel security events to inform relying parties
Centralized authorization for sensitive decisions
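
Combining the first two options might look like the sketch below: every request gets local token validation, and only high-risk paths pay for a live introspection call. The path prefixes, the function name, and the introspect callback (assumed to return the active flag from an RFC 7662 introspection response) are illustrative.

```python
from typing import Callable

# Hypothetical prefixes for endpoints where near-real-time revocation matters.
HIGH_RISK_PREFIXES = ("/admin/", "/secrets/")


def authorize(path: str, token: str, locally_valid: bool,
              introspect: Callable[[str], bool]) -> bool:
    """Local validation for everything; live introspection for high-risk paths."""
    if not locally_valid:
        return False
    if path.startswith(HIGH_RISK_PREFIXES):
        return introspect(token)  # one extra network hop, only where it matters
    return True
```

This keeps introspection traffic proportional to the number of sensitive calls rather than to total request volume, at the cost of a revocation window on low-risk endpoints equal to the access-token TTL.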

Where SSF/CAEP fit (even if you’re not “fully there” yet)

If you want revocation and risk signals to propagate beyond one IdP, you need eventing patterns.

OpenID Shared Signals Framework (SSF) provides a way to transmit security events.
CAEP (Continuous Access Evaluation Profile) builds on SSF and defines event types, such as session revoked and credential change, that let relying parties re-evaluate sessions and tokens continuously.

Even if you’re not implementing SSF/CAEP end-to-end today, you can mimic the operational benefit:

Emit internal “token risk” events (user disabled, secret rotated, agent quarantined)
Subscribe critical services (API gateway, authorization service) to those events
Invalidate cached decisions and enforce step-up / deny
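
The three steps above can be sketched as a decision cache that reacts to internal risk events. Everything here is hypothetical: the DecisionCache class, the event shape (a dict with type and subject), and the event type names, which mirror the examples in the list rather than any SSF/CAEP schema.

```python
from typing import Dict, Optional, Set


class DecisionCache:
    """Caches allow/deny decisions per subject and reacts to risk events."""

    def __init__(self) -> None:
        self._decisions: Dict[str, bool] = {}
        self._blocked: Set[str] = set()

    def put(self, subject: str, allowed: bool) -> None:
        if subject not in self._blocked:
            self._decisions[subject] = allowed

    def get(self, subject: str) -> Optional[bool]:
        """False if blocked, the cached decision if any, else None (re-evaluate)."""
        if subject in self._blocked:
            return False
        return self._decisions.get(subject)

    def on_risk_event(self, event: dict) -> None:
        subject = event["subject"]
        self._decisions.pop(subject, None)   # always invalidate the cached decision
        if event["type"] in ("agent_quarantined", "user_disabled"):
            self._blocked.add(subject)       # deny until explicitly cleared
```

A secret_rotated event only forces re-evaluation, while quarantine and disable events turn into an immediate deny at the gateway, without waiting for the subject's tokens to expire.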