Summary: AI agents and other non‑human identities are turning “token hygiene” from a best practice into a production reliability and breach‑prevention requirement. This post lays out a pragmatic, vendor‑neutral approach to designing token lifetimes, audience/scope, sender constraints, and revocation for agentic workloads—plus concrete configurations across common IdPs, cloud providers, and service meshes.
No major SSF/CAEP or agent-identity “breaking news” surfaced in the last 24 hours. Instead, today’s post focuses on a high‑impact best‑practice topic that keeps showing up in real incidents: token theft + over‑privileged non‑human identities—now amplified by autonomous agents.
Why token security is suddenly everyone’s problem
Most organizations have spent years maturing identity controls for humans: MFA, conditional access, device posture, and step‑up authentication. But the fastest growth in identities is now non‑human identity (NHI):
- Kubernetes workloads and service accounts
- CI/CD runners and build pipelines
- SaaS integrations and API clients
- Bots, RPA, and event-driven automations
- AI agents that call tools, fetch data, and take actions
The common denominator is tokens: OAuth access tokens, OIDC ID tokens, SAML assertions (less often for workloads), API keys, session cookies, and cloud provider credentials.
For humans, a stolen token is bad. For an AI agent, a stolen token can be worse because:
- Agents run continuously. A token stolen at 2 AM may be used immediately and repeatedly.
- Agent tokens often carry broad permissions. “Let the agent do its job” becomes “let it do everything.”
- Agents touch sensitive systems. Ticketing, customer data, production deployments, secrets stores.
- The blast radius is non-obvious. Toolchains create transitive access: GitHub → cloud → cluster → data.
Token security isn’t just “shorter expirations.” It’s a set of coordinated design choices:
- Who can get a token (workload identity)
- What a token can do (scopes/claims/permissions)
- Where the token can be used (audience, resource indicators)
- Whether a stolen token can be replayed (sender constraints like mTLS/DPoP)
- How fast you can stop it (revocation, eventing, session risk)
A practical threat model: what actually goes wrong
Token compromise for NHIs and agents typically falls into a few buckets:
1) Token exfiltration from logs, traces, or error reports
- Access tokens accidentally logged by reverse proxies or API gateways
- Debug logs in agents that print request headers
- Observability tools capturing request bodies/headers
Countermeasures: log redaction, structured logging guardrails, “no secrets in telemetry” policies, WAF rules, and automated scanning (e.g., GitHub secret scanning; TruffleHog; Gitleaks).
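As a minimal sketch of the log-redaction guardrail, assuming a Python service that uses the standard `logging` module: the filter below scrubs bearer credentials and JWT-shaped strings before a record is emitted. The regular expressions and the `RedactTokensFilter` name are illustrative assumptions, not a complete secret-detection ruleset.

```python
import logging
import re

# Illustrative patterns: "Bearer <credential>" and three-part JWT-shaped strings.
_BEARER = re.compile(r"\bbearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE)
_JWT = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def redact(text: str) -> str:
    """Replace bearer credentials and JWT-shaped strings with placeholders."""
    text = _BEARER.sub("Bearer [REDACTED]", text)
    return _JWT.sub("[REDACTED_JWT]", text)


class RedactTokensFilter(logging.Filter):
    """Scrub the fully formatted message so tokens passed as args are caught too."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # message is already formatted
        return True
```

Attaching it with `handler.addFilter(RedactTokensFilter())` on each handler covers every logger that writes through that handler. It is a safety net, not a substitute for keeping headers out of log statements in the first place.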
2) Token theft from runtime memory or local storage
- Tokens stored unencrypted on disk in an agent host
- Compromised container reading service account token files
- Browser automation capturing session cookies
Countermeasures: avoid long‑lived tokens, use workload identity federation, use secrets managers with short leases (Vault), isolate runtimes (gVisor/Kata), and apply pod security controls.
3) Replay and “token forwarding” attacks
- A token intended for Service A is replayed against Service B
- Token used from a different environment/host than expected
Countermeasures: strict aud validation, resource indicators where supported, and sender‑constrained tokens (mTLS or DPoP).
4) Over-broad scopes / entitlements
- “agent.readwrite” scope that maps to admin-like permissions
- Service principal granted Owner/Contributor everywhere
Countermeasures: fine-grained authorization, least privilege, and continuous entitlement review.
5) Slow revocation in distributed systems
Even if your IdP can revoke refresh tokens, downstream APIs that validate JWT access tokens locally will keep accepting them until they expire.
Countermeasures: short access-token TTLs, centralized authorization checks, introspection for high-risk operations, and security event propagation (SSF/CAEP patterns).
Design principle: treat tokens like volatile, single-purpose instruments
A good mental model:
- Access tokens are like “single-purpose debit cards.” Limit where they work (audience) and what they buy (scope).
- Refresh tokens are like “master keys.” Protect them heavily, rotate them, and revoke aggressively.
- Session cookies are bearer tokens. Treat them as such, especially for headless automation.
For AI agents specifically: aim for task-scoped, time-bounded, and tool-bounded credentials.
If you want a deeper foundation on the identity side, these Learn IAM topics are useful background:
- Non-human identity (NHI): https://learn-iam.com/topics/identity-for-ai/non-human-identity
- AI agent identity: https://learn-iam.com/topics/identity-for-ai/ai-agent-identity
- AI tool authorization: https://learn-iam.com/topics/authorization/ai-agent-tool-authorization
- API authorization (scopes/claims): https://learn-iam.com/topics/authorization/authorization-for-apis-scopes-claims
- Delegation & impersonation: https://learn-iam.com/topics/authorization/delegation-impersonation-acting-as
Comparison table: token and credential options for agentic + NHI workloads
The goal is not “one true token.” It’s choosing the right primitive for the job.
| Option | Typical lifetime | Revocation behavior | Best for | Common platforms / products |
|---|---|---|---|---|
| OAuth 2.0 Access Token (JWT) | 5–15 min | Often implicit (wait for expiry); explicit revocation depends on API checks | High-throughput API calls | Okta, Microsoft Entra ID, Auth0, Ping, Keycloak |
| OAuth 2.0 Access Token (opaque + introspection) | 1–10 min | Can be immediate if API introspects | High-risk operations needing near-real-time revocation | Keycloak, ForgeRock, Ping, some custom AS |
| OAuth Refresh Token | hours–weeks (should be shorter for NHIs) | Strong revocation lever; rotation recommended | Long-running agents that can renew access tokens | Okta (rotation), Entra (varies by flow), Auth0 |
| Workload identity federation (OIDC → cloud STS) | 15–60 min (cloud creds) | Short-lived, scoped to role; rotate by default | CI/CD, workloads getting cloud creds without static keys | AWS STS (AssumeRoleWithWebIdentity), GCP Workload Identity Federation, Azure federated credentials |
| SPIFFE/SPIRE SVID (mTLS identity) | minutes–hours | Rotated automatically; trust anchored in mesh/agent | Service-to-service auth, mesh identity | SPIRE, Istio, Linkerd (via SPIFFE), Consul |
| API keys (static) | months–never (bad) | Hard to revoke safely; often copied everywhere | Last resort integrations | Many SaaS APIs |
| mTLS client certs | days–months (but can be short) | Revocation via CRL/OCSP; operational overhead | High assurance service auth | NGINX, Envoy, service meshes |
| DPoP (proof-of-possession) | same as access token | Limits replay; still needs TTL/revocation strategy | Public clients / agent frameworks that can sign requests | Some AS/resource servers; emerging support |
Opinionated guidance: for agents and NHIs, prefer federated, short-lived credentials (cloud STS, SVIDs, short JWT access tokens) over static API keys or long-lived refresh tokens.
The “Four Boundaries” model for agent tokens
When you issue a token to an agent, define four explicit boundaries. If you can’t describe them, you probably don’t have them.
- Time boundary – how long is the token valid?
- Task boundary – what job is the token for (ticket update, read-only search, deploy)?
- Tool boundary – which tools/APIs can accept it (audience, resource indicator)?
- Trust boundary – where can it be used from (sender constraints, network, device/workload posture)?
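One way to make "if you can’t describe them, you probably don’t have them" checkable is to write the four boundaries down as data and lint them before issuance. This is a sketch under assumptions: the `TokenBoundaries` type, the 600-second ceiling, and the "admin or wildcard means unbounded" rule are illustrative choices, not a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBoundaries:
    ttl_seconds: int        # time boundary
    task: str               # task boundary, e.g. "tickets:comment"
    audience: str           # tool boundary: the one API that may accept the token
    sender_constraint: str  # trust boundary: "mtls", "dpop", or "none"


def missing_boundaries(b: TokenBoundaries, max_ttl_seconds: int = 600) -> list[str]:
    """Return the names of boundaries that are absent or effectively unbounded."""
    problems = []
    if not 0 < b.ttl_seconds <= max_ttl_seconds:
        problems.append("time")
    if not b.task or b.task.endswith((":admin", ":*")):
        problems.append("task")
    if not b.audience or b.audience == "*":
        problems.append("tool")
    if b.sender_constraint not in ("mtls", "dpop"):
        problems.append("trust")
    return problems
```

A token request that comes back with a non-empty list is a candidate for review rather than an automatic rejection; not every flow can be sender-constrained today.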
Time: token lifetimes that actually work
A workable default starting point for many enterprises:
- Access token TTL: 5–10 minutes for API access
- Refresh token TTL: avoid for NHIs if possible; if required, 1–24 hours + rotation
- Cloud STS creds: 15–60 minutes (default provider patterns)
- mTLS/SVID certs: rotate every 1–4 hours
Short lifetimes are not free: they increase token minting, load on authorization servers, and operational noise. But they are one of the few controls that reliably reduce blast radius across heterogeneous APIs.
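To put a rough number on that cost, here is a back-of-the-envelope sketch. It assumes each workload holds one token and renews it a fixed margin before expiry; the function name and the 60-second margin are assumptions for illustration.

```python
def mint_requests_per_second(workloads: int, ttl_seconds: int,
                             refresh_margin_seconds: int = 60) -> float:
    """Steady-state token requests per second for a fleet of workloads."""
    effective_lifetime = ttl_seconds - refresh_margin_seconds
    if effective_lifetime <= 0:
        raise ValueError("refresh margin must be smaller than the TTL")
    return workloads / effective_lifetime


# 2,000 workloads with a 5-minute TTL: about 8.3 requests/s.
# The same fleet with a 60-minute TTL: about 0.56 requests/s.
print(round(mint_requests_per_second(2000, 300), 2))
print(round(mint_requests_per_second(2000, 3600), 2))
```

Real load is burstier than this average (restarts and deploys cause synchronized renewals), so treat the result as a floor when sizing the authorization server.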
Task: “least privilege” must be expressed in the token
For OAuth/OIDC, that means:
- Scopes that map to discrete actions (e.g., `tickets:comment`, not `tickets:admin`)
- Claims like `act`/`actor` (who initiated), `sub` (agent identity), and `azp` (authorized party)
- Fine-grained authorization checks in the resource server
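A resource-server scope check can be very small. The sketch below assumes the token has already been signature-validated and that scopes arrive in a space-delimited `scope` claim; some IdPs use a different claim name or an array, so adjust accordingly.

```python
def has_scope(claims: dict, required: str) -> bool:
    """True only if the exact required scope was granted (no prefix matching)."""
    granted = claims.get("scope", "")
    if isinstance(granted, str):
        granted = granted.split()
    return required in granted
```

Exact matching is deliberate: a token carrying `tickets:comment` should not pass a check for `tickets:admin`, and a look-alike such as `tickets:comment-all` should not pass a check for `tickets:comment`.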
If your agent is doing multiple tasks, consider issuing separate tokens per tool and per task—even if minted by the same identity.
Tool: lock down aud and resource indicators
A common failure mode: using tokens minted “for the agent” that are accepted by any internal API.
- Validate `aud` strictly in every API
- Consider RFC 8707 resource indicators (where supported) to request tokens for a specific API
- If you have multiple environments, keep audiences environment-specific (`api.prod.example.com` vs `api.dev.example.com`)
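Strict audience validation can be sketched as follows, assuming the JWT has already been verified and decoded into a dict. Per the JWT specification, `aud` may be a single string or an array of strings, so both shapes are handled; anything else fails closed.

```python
def audience_matches(claims: dict, expected_audience: str) -> bool:
    """Accept the token only if this API's own identifier is in 'aud'."""
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == expected_audience
    if isinstance(aud, (list, tuple)):
        return expected_audience in aud
    return False  # missing or malformed audience
```

Each API passes its own environment-specific identifier as `expected_audience`, so a token minted for `api.dev.example.com` is rejected by the production API even though the issuer and signature are valid.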
Trust: sender constraints for replay resistance
If a stolen token is your biggest worry, a pure bearer token is fragile. Two practical sender-constraint approaches:
- mTLS-bound access tokens (RFC 8705, aka “certificate-bound” tokens)
- DPoP (RFC 9449, a signed proof attached to each request)
You may not be able to deploy these everywhere today, but you can use them for your highest-risk agent actions first (production changes, secrets access, admin APIs).
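For certificate-bound tokens, the resource server compares the SHA-256 thumbprint of the client certificate from the TLS session with the `x5t#S256` value inside the token’s `cnf` claim. A minimal sketch of that comparison, assuming you already have the verified claims and the DER-encoded client certificate (how you obtain the certificate from your TLS terminator is deployment-specific and not shown):

```python
import base64
import hashlib
import hmac


def cert_thumbprint_s256(cert_der: bytes) -> str:
    """Base64url (unpadded) SHA-256 digest of the DER-encoded certificate."""
    digest = hashlib.sha256(cert_der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_cert(claims: dict, cert_der: bytes) -> bool:
    """True if the token's cnf thumbprint matches the presented certificate."""
    cnf = claims.get("cnf")
    if not isinstance(cnf, dict):
        return False
    expected = cnf.get("x5t#S256")
    if not isinstance(expected, str):
        return False
    actual = cert_thumbprint_s256(cert_der)
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))
```

A token stolen from logs or memory fails this check when replayed from a host that does not hold the matching private key, which is the property plain bearer tokens lack.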
Concrete implementation guidance (by platform/pattern)
This section is deliberately concrete. You should be able to hand it to an engineer.
Pattern 1: CI/CD → cloud provider (OIDC federation), no static keys
Problem: build systems often leak long-lived cloud keys.
Recommended approach: use OIDC federation from your CI provider into cloud STS.
- GitHub Actions → AWS STS `AssumeRoleWithWebIdentity`
- GitHub Actions → Azure federated identity credentials for a managed identity / app registration
- GitHub Actions → GCP Workload Identity Federation
Key controls:
- Restrict OIDC subject claims (repo, environment, branch/tag) in the trust policy
- Restrict audience to the intended STS
- Keep session duration short (15–60 minutes)
- Apply least privilege IAM roles; separate “plan” vs “apply” roles
AWS example (high level):
- Create an IAM role with trust policy allowing GitHub’s OIDC provider
- Condition on `token.actions.githubusercontent.com:sub` and `aud`
- Use permission boundary or SCPs for guardrails
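The AWS steps above can be sketched as a trust policy document, built here as a Python dict so it can be serialized and reviewed in code. The account ID, repository, and ref passed in are placeholders; the helper name is hypothetical, and pinning to a single branch ref is one possible choice (environments or tags work similarly through the `sub` claim).

```python
import json


def github_oidc_trust_policy(account_id: str, repo: str, ref: str) -> dict:
    """IAM role trust policy that admits one GitHub repo/ref via OIDC federation."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{provider}:aud": "sts.amazonaws.com",
                    f"{provider}:sub": f"repo:{repo}:ref:{ref}",
                }
            },
        }],
    }


# Placeholder values for illustration only.
policy = github_oidc_trust_policy("123456789012", "example-org/example-repo",
                                  "refs/heads/main")
print(json.dumps(policy, indent=2))
```

Using `StringEquals` on `sub` rather than a wildcard match is the stricter option: a wildcard over the repository portion lets any matching repository assume the role.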
GCP example (high level):
- Create a Workload Identity Pool + Provider for GitHub
- Map attributes like repository and ref
- Bind a service account with minimal roles
Azure example (high level):
- Configure federated credential on an app registration or managed identity
- Limit issuer, subject, and audience
- Use Azure RBAC with narrow scope (resource group or specific resources)
Pattern 2: Kubernetes workloads → internal APIs (service identity)
If you’re still using long-lived, Secret-based Kubernetes service account tokens, you’re behind modern defaults: since v1.24, Kubernetes no longer auto-generates them.
Recommended approach:
- Use projected service account tokens (shorter-lived, audience-bound)
- Enforce audience on token requests
- Use a mesh identity (SPIFFE/SPIRE or Istio) for service-to-service mTLS
Controls to prioritize:
- Kubernetes: confirm bound service account tokens are in use (the BoundServiceAccountTokenVolume feature has been on by default since v1.21 and GA since v1.22)
- Set `automountServiceAccountToken: false` by default and opt-in per workload
- Use NetworkPolicies to restrict egress from agent pods
- Use admission controls (OPA Gatekeeper / Kyverno) to enforce token + privilege rules
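As a sketch of the first two controls, the pod spec below disables the default token mount and instead projects a short-lived token bound to a single audience. It is expressed as a Python dict (serialize to YAML or JSON for your tooling); the container name, volume name, and mount path are hypothetical.

```python
def agent_pod_spec(service_account: str, image: str, audience: str,
                   ttl_seconds: int = 600) -> dict:
    """Pod spec fragment with an audience-bound projected service account token."""
    if ttl_seconds < 600:
        # Kubernetes rejects projected token lifetimes below 10 minutes.
        raise ValueError("expirationSeconds must be at least 600")
    return {
        "serviceAccountName": service_account,
        "automountServiceAccountToken": False,  # no default API-server token
        "containers": [{
            "name": "agent",  # hypothetical container name
            "image": image,
            "volumeMounts": [{
                "name": "api-token",
                "mountPath": "/var/run/secrets/tokens",  # hypothetical path
                "readOnly": True,
            }],
        }],
        "volumes": [{
            "name": "api-token",
            "projected": {"sources": [{"serviceAccountToken": {
                "path": "api-token",
                "audience": audience,
                "expirationSeconds": ttl_seconds,
            }}]},
        }],
    }
```

The kubelet refreshes the projected token before it expires, so the workload should re-read the file rather than cache its contents at startup.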
Related Learn IAM topics:
- Policy as code (OPA/Cedar): https://learn-iam.com/topics/authorization/policy-as-code-for-iam-opa-cedar
- Authorization performance & revocation tradeoffs: https://learn-iam.com/topics/authorization/authorization-performance-caching-consistency-revocation
Pattern 3: AI agent → tool APIs (ticketing, CRM, data)
AI agents often need to act on behalf of a user, but with tighter boundaries.
A practical model:
- User authenticates normally (SSO + MFA)
- User authorizes the agent to perform a defined set of actions
- Agent receives a delegated token with:
- explicit scopes
- short TTL
- `actor` claim pointing to the user
- `sub` identifying the agent workload identity
Where possible:
- Avoid giving agents a user’s refresh token
- Prefer token exchange patterns (RFC 8693) to mint a new, constrained token
- Log every agent action with correlation IDs and actor + subject
If you need “impersonation,” treat it as a privileged capability with approvals and tight auditing.
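A token exchange request in the RFC 8693 style is a form-encoded POST to the authorization server’s token endpoint. The sketch below only builds the request body; the helper name is hypothetical, client authentication and the HTTP call are omitted, and whether your authorization server honors `audience` and `scope` on exchange is something to confirm in its documentation.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_body(subject_token: str, audience: str, scope: str) -> str:
    """Form body asking for a narrower access token for one tool and task."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,          # the token being exchanged
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                    # the single tool API
        "scope": scope,                          # the single task
    })
```

Calling it once per tool (for example with `audience="https://tickets.example.com"` and `scope="tickets:comment"`) yields separate constrained tokens, so the agent never needs to hold the user’s refresh token.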
Related Learn IAM topics:
- Delegation & impersonation: https://learn-iam.com/topics/authorization/delegation-impersonation-acting-as
- Authorization decision logging: https://learn-iam.com/topics/authorization/authorization-decision-logging-audit-explainability
Pattern 4: Central secrets manager with short-lived leases
If your agent needs credentials for downstream systems that cannot do OAuth well, use a broker.
- HashiCorp Vault: dynamic secrets for DBs, cloud, PKI; short TTL + renewal
- AWS Secrets Manager / GCP Secret Manager / Azure Key Vault: store secrets, rotate where possible
Guardrails:
- Agents authenticate to the secrets manager using workload identity (OIDC, Kubernetes auth)
- Secrets are leased for minutes/hours, not months
- Rotate and revoke automatically on risk events
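On the agent side, short leases mean tracking expiry and renewing early rather than on failure. A minimal sketch, assuming the broker reports a lease duration when it hands out a secret; the `Lease` type and the two-thirds renewal point are illustrative assumptions, not any product’s API.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    issued_at: float    # epoch seconds when the secret was leased
    ttl_seconds: float  # lease duration granted by the broker

    def expired(self, now: float) -> bool:
        """The secret must not be used at or after this point."""
        return now >= self.issued_at + self.ttl_seconds

    def should_renew(self, now: float, fraction: float = 2 / 3) -> bool:
        """Renew once the given fraction of the lease has elapsed."""
        return now >= self.issued_at + self.ttl_seconds * fraction
```

Passing `now` in explicitly (for example `time.time()`) keeps the logic easy to test and avoids hidden clock reads.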
Revocation and security events: don’t rely on “wait for expiry”
In distributed architectures, you need a strategy for fast invalidation.
Pragmatic options (mix-and-match):
- Short access-token TTLs (baseline)
- Introspection for high-risk endpoints only (admin actions, secrets access)
- Back-channel security events to inform relying parties
- Centralized authorization for sensitive decisions
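Selective introspection can be sketched as a thin wrapper around local validation. Here `introspect` is an injected callable standing in for a call to the authorization server’s introspection endpoint, which returns a JSON object with an `active` boolean; the path prefixes and function names are hypothetical.

```python
from typing import Callable

# Hypothetical prefixes for endpoints that warrant a live check.
HIGH_RISK_PREFIXES = ("/admin/", "/secrets/")


def authorize(path: str, token: str, locally_valid: bool,
              introspect: Callable[[str], dict]) -> bool:
    """Local JWT validation for everything, live introspection for high-risk paths."""
    if not locally_valid:
        return False
    if path.startswith(HIGH_RISK_PREFIXES):
        # Fail closed unless the authorization server says the token is active.
        return introspect(token).get("active") is True
    return True
```

This keeps the latency and load cost of introspection confined to the small set of requests where near-real-time revocation matters most.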
Where SSF/CAEP fit (even if you’re not “fully there” yet)
If you want revocation and risk signals to propagate beyond one IdP, you need eventing patterns.
- OpenID Shared Signals Framework (SSF) provides a way to transmit security events.
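- CAEP (Continuous Access Evaluation Profile) is an SSF profile that defines event types such as session revoked, credential change, and token claims change, so receivers can re-evaluate sessions and tokens continuously.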
Even if you’re not implementing SSF/CAEP end-to-end today, you can mimic the operational benefit:
- Emit internal “token risk” events (user disabled, secret rotated, agent quarantined)
- Subscribe critical services (API gateway, authorization service) to those events
- Invalidate cached decisions and enforce step-up / deny
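The internal-event pattern above can be sketched as a decision cache that subscribes to risk events. Everything here is an assumption for illustration: the event shape (`type` and `subject` keys), the event type names, and the `DecisionCache` class are not part of SSF/CAEP.

```python
from collections import defaultdict
from typing import Optional


class DecisionCache:
    """Caches allow/deny decisions per subject and reacts to risk events."""

    def __init__(self) -> None:
        self._by_subject: dict = defaultdict(dict)
        self._quarantined: set = set()

    def put(self, subject: str, action: str, allowed: bool) -> None:
        self._by_subject[subject][action] = allowed

    def get(self, subject: str, action: str) -> Optional[bool]:
        """Return a cached decision, False if quarantined, None on a cache miss."""
        if subject in self._quarantined:
            return False
        return self._by_subject.get(subject, {}).get(action)

    def on_risk_event(self, event: dict) -> None:
        """Drop cached decisions for the subject; deny outright for severe events."""
        subject = event["subject"]
        self._by_subject.pop(subject, None)
        if event["type"] in ("agent_quarantined", "user_disabled"):
            self._quarantined.add(subject)
```

A `secret_rotated` event only forces re-evaluation on the next request, while a quarantine or disable event denies immediately without waiting for token expiry.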
Phased adoption plan (what to do this quarter)
If you’re starting from “lots of API keys and long-lived tokens,” here’s an opinionated 4-phase sequence.
Phase 0 — Inventory and classify (1–2 weeks)
- Inventory all non-human identities (service principals, API clients, agents)
- Classify by blast radius (data sensitivity + actionability)
- Identify where tokens/keys live (code, CI variables, secrets stores)
Phase 1 — Kill the worst keys (2–4 weeks)
- Replace static cloud keys with OIDC federation (GitHub/GitLab/CircleCI → STS)
- Enforce least privilege roles + short sessions
- Add secret scanning to repos and build logs
Phase 2 — Standardize token boundaries (4–8 weeks)
- Set org defaults: access token TTL 5–10 minutes for internal APIs
- Enforce `aud` validation across services
- Introduce token exchange for agents (mint constrained tokens per tool/task)
- Add authorization decision logging
Phase 3 — Replay resistance + revocation improvements (ongoing)
- Add sender constraints for the highest-risk flows (mTLS/DPoP)
- Use introspection for admin endpoints or sensitive actions
- Implement security event propagation (SSF/CAEP style) or internal equivalent
Actionable checklist (copy/paste)
Token issuance
- Access token TTL set to 5–10 minutes (exceptions documented)
- Refresh token rotation enabled where refresh tokens are unavoidable
- Token audience is service-specific (not “all internal APIs”)
- Scopes map to discrete actions; no “super scopes” for agents
Resource servers (APIs)
- Strict `aud` validation
- Scope/claim checks implemented server-side (not just in gateway)
- High-risk endpoints support introspection or additional checks
- Authorization decisions are logged with actor + subject
Runtime & operations
- No tokens in logs/traces; redaction and linting in place
- Workloads use federated identity (STS, projected SA tokens, SPIFFE)
- Network egress restrictions for agent runtimes
- Incident playbook includes token revocation + key rotation steps
AI agent-specific
- Agent actions are tool-bounded (per-tool tokens)
- Delegation model is explicit (`actor` claims / token exchange)
- “Break glass” impersonation requires approvals and is audited
Closing: the safest token is the one that can’t be reused
You can’t prevent every token leak. You can design your system so leaked tokens are:
- short-lived,
- narrowly scoped,
- audience-bound,
- replay-resistant (where it matters), and
- quickly invalidated.
As AI agents proliferate, token security becomes the foundation for safe autonomy.