
Vibe-Coded Agents, Leaked Tokens, and the Next IAM Problem: Securing Agentic AI with Non-Human Identity

A practical blueprint for securing autonomous AI agents and other non-human identities: workload identity, short-lived tokens, audience restriction, proof-of-possession, and fast revocation (CAEP/SSF mindset).

Published: 2026-02-08

This week, cloud security researchers published a write-up on a now-fixed but very real failure mode for “agentic” systems: an AI-agent social platform exposed a client-side Supabase key and, because backend controls were misconfigured, that key effectively became a master credential. The result was unauthenticated read/write access to production data, including large volumes of API tokens and the ability to impersonate accounts.

Two details in that incident should make every IAM program perk up:

  1. The platform’s “agents” were just identities backed by bearer tokens. Once tokens leaked, impersonation was trivial.
  2. The app was built fast (“vibe-coded”) and shipped before the security defaults were understood. That’s increasingly how internal AI tools and prototypes get born inside enterprises.

This is not a “consumer app security” lesson. It’s a preview of what happens when enterprises begin deploying autonomous agents that:

  • call APIs and mutate infrastructure,
  • read and write sensitive data,
  • run as background services,
  • and coordinate with other agents.

If you squint, an autonomous agent is just a new kind of non-human identity (NHI) with a much larger blast radius and a higher likelihood of credential sprawl.

This post turns the incident into a practical blueprint: how to design identity, tokens, and revocation for AI agents and other NHIs so that leaked credentials don’t become “game over.”

The incident pattern: “public key” + misconfiguration = mass impersonation

In the published research, a Supabase “publishable” key was embedded in client-side JavaScript (a common pattern for modern web apps). That alone isn’t necessarily fatal—Supabase expects some keys to be used from browsers.

The failure was that Row Level Security (RLS) and other server-side controls were not properly enforced. The public key became a path to:

  • enumerate database tables,
  • read authentication tokens,
  • and (critically) write to production tables.

From an IAM lens, the root problem is not Supabase. The root problem is:

Bearer-token systems fail catastrophically when tokens are long-lived, over-privileged, and not bound to context.

That’s exactly the default state of most early-stage agentic systems.

Why agentic systems are extra fragile

Traditional applications have clearer boundaries:

  • a human logs in,
  • a server issues a session,
  • the server does work.

Agentic systems invert this:

  • an “agent identity” exists first,
  • it holds a credential,
  • and it autonomously chooses to call tools and APIs.

If you don’t actively constrain it, an agent becomes:

  • a roaming integration account,
  • with transitive access to whatever tools it can reach,
  • and unclear accountability for the actions it takes.

This is the exact moment where IAM has to expand from “human SSO” to machine + agent governance.


Define the problem precisely: what is an “AI agent identity”? (and what is it not)

A useful mental model is to treat an AI agent as three separable things:

  1. A principal (identity) — “what is acting?”
  2. A policy envelope (authorization) — “what is it allowed to do?”
  3. A credential (token / key / certificate) — “how does it prove it is that identity?”

Most early agentic implementations collapse all three into a single string:

  • AGENT_API_KEY=...

That is convenient and also exactly how you get:

  • shared secrets in repos,
  • unlimited impersonation,
  • no session binding,
  • and no clean revocation story.

Non-negotiables for enterprise agents

If an agent can touch production data or production infrastructure, you want these properties:

  • Short-lived credentials (minutes, not months)
  • Audience restriction (token only valid for the intended API)
  • Scope/role minimization (least privilege)
  • Proof-of-possession when possible (bind token to a key; reduce replay)
  • Continuous access evaluation / event-driven revocation (kill access quickly)
  • Strong provenance and audit (who created the agent, who approved it, what tools it used)

If your agent doesn’t have these, you’re not doing “agentic AI.” You’re doing “automation with a password.”


Comparison table: credential options for agents (and what they really buy you)

Below is a practical comparison of the most common patterns you’ll see when teams “make an agent” and hook it to tools.

Tip: Use this table in architecture reviews. Ask teams to justify why they chose a weaker option.

| Option | Typical examples | Default lifetime | Pros | Cons / common failure mode | Best used for |
| --- | --- | --- | --- | --- | --- |
| Static API key | Homegrown “agent API key”; vendor API keys | Months/years | Simple; fast to prototype | Leaks = full impersonation; hard rotation; hard scoping; weak attribution | Very low-risk prototypes only (ideally never) |
| Long-lived OAuth refresh token | Google/Microsoft “offline access”; SaaS connectors | Weeks/months | Standard; supports delegated flows | Stolen refresh token = durable access; often over-scoped | Human-delegated automation with strong conditional access |
| OAuth 2.0 client credentials + client secret | Okta/Auth0/Entra ID service principals; M2M apps | Access token 5–60 min | Scopeable; supported everywhere | Secret becomes another static password; sprawl of app registrations | Server-side agents in controlled environments |
| OAuth 2.0 client credentials + private_key_jwt | “Confidential client” without shared secret | Access token 5–60 min | Better than secrets; key rotation possible | Still long-lived key material; needs secure storage (HSM/KMS/Vault) | Higher assurance M2M where managed identity isn’t available |
| Managed workload identity | AWS IAM Roles / STS, Azure Managed Identity, GCP Workload Identity Federation | Usually 5–60 min | No secret distribution; rotates; strong platform hooks | Misconfigured trust policies become privilege escalation; needs good cloud IAM hygiene | Agents running on cloud compute |
| OIDC workload identity (Kubernetes) | K8s ServiceAccount projected tokens; cloud federation | 5–60 min (configurable) | Great for k8s; tightly scoped; integrates with cloud IAM | Over-permissive RBAC; token audience mistakes; stolen tokens if node compromised | Cluster-native agent runtimes |
| mTLS client cert (service identity) | SPIFFE/SPIRE, internal PKI | Certs hours/days; sessions minutes | Strong PoP; great for east-west | Operational overhead; rotation and mapping complexity | Service mesh and internal APIs |
| DPoP / PoP tokens | OAuth DPoP (where supported) | Access token minutes | Reduces replay; binds token to client key | Not universal; more moving parts | High-risk APIs and agent-to-agent calls |

Key takeaway: your “agent identity” should look less like a password and more like a workload identity that obtains ephemeral, scoped tokens on demand.


Practical token lifetime guidance (what “short-lived” means)

A token lifetime discussion gets fuzzy fast. Here’s a pragmatic starting point.

| Token type / context | Suggested lifetime | Rationale |
| --- | --- | --- |
| High-risk tool token (email send, chat posting, ticket closure, code write) | 5–15 minutes | Limits impact of prompt injection or token exfiltration into logs |
| Infrastructure mutation (Terraform apply, Kubernetes admin, cloud IAM changes) | 5–15 minutes + re-auth for elevation | Aligns with “break-glass” mindset; reduces chance an agent becomes persistent admin |
| Standard API calls (read-heavy) | 15–60 minutes | Balances performance vs risk |
| Refresh tokens for delegated access | Prefer no refresh; if unavoidable, bind tightly + monitor | Refresh tokens are long-lived by design; treat them like privileged secrets |
| Long-lived keys (only if you must) | Days/weeks max + managed rotation | If you must have a key, don’t let it live forever |

If someone argues for “24-hour access tokens,” ask:

  • What is your detection time?
  • What is your revocation mechanism?
  • What is your plan when tokens leak into logs, traces, or prompt history?

Architecture pattern: separate “agent runtime identity” from “tool access tokens”

One of the easiest ways to reduce blast radius is to stop treating an agent as a single credential.

Instead:

  1. Give the agent runtime a workload identity (managed identity, Kubernetes SA, SPIFFE ID).
  2. Force the runtime to obtain downstream tool tokens on-demand, with tight audience + scope.
  3. Store as little as possible. Prefer caching ephemeral tokens in memory, not persisting them.

Example: “agent calls GitHub + Slack + AWS”

Bad pattern:

  • store GITHUB_TOKEN, SLACK_TOKEN, AWS_ACCESS_KEY in an “agent profile” table

Better pattern:

  • agent runtime authenticates with its workload identity
  • agent calls a token broker to mint:
    • a GitHub token scoped to one repo, 15 minutes
    • a Slack token restricted to specific scopes/channels, 15 minutes
    • AWS STS credentials for one role, 1 hour

The broker becomes your enforcement point for:

  • approvals (who allowed this agent to post to Slack?)
  • policy (which channels can it post to?)
  • conditions (only from approved runtime environment)
  • recording (who asked for what token, when, and why)
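As a sketch of the minting step only: the signing key, policy table, agent IDs, and URLs below are illustrative assumptions, and a production broker would use a KMS-backed signer or a standard OAuth authorization server rather than hand-rolled HMAC. The point is the shape: the broker, not the agent, decides scope, audience, and lifetime.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key -- in practice, fetched from a KMS/HSM, never hardcoded.
BROKER_KEY = b"replace-with-kms-managed-key"

# Which tools each registered agent may request, and the max TTL per tool (seconds).
AGENT_POLICY = {
    "agent-release-notes": {
        "github": {"scope": "repo:read org/docs", "ttl": 900},
        "slack": {"scope": "chat:write #releases", "ttl": 900},
    }
}

def mint_tool_token(agent_id: str, tool: str) -> str:
    """Mint a short-lived, audience-restricted token for exactly one tool."""
    policy = AGENT_POLICY.get(agent_id, {}).get(tool)
    if policy is None:
        raise PermissionError(f"{agent_id} is not approved for {tool}")
    now = int(time.time())
    claims = {
        "sub": agent_id,
        "aud": f"https://api.example.internal/tools/{tool}",  # audience restriction
        "scope": policy["scope"],                             # least privilege
        "iat": now,
        "exp": now + policy["ttl"],                           # minutes, not months
    }
    body = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(BROKER_KEY, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"
```

Issuance is also the natural place to log who asked for what, when, and why.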

This is the same idea behind PAM for machines, applied to agents.

Product/tool names that commonly show up in real deployments

You can implement the broker and governance with combinations of:

  • HashiCorp Vault (dynamic credentials, PKI, OIDC auth)
  • AWS STS (role sessions, session tags)
  • Azure Managed Identities (token acquisition via IMDS)
  • GCP Workload Identity Federation (no service account keys)
  • Okta Workforce Identity / Microsoft Entra ID app registrations (OAuth clients)
  • CyberArk / BeyondTrust / Delinea for privileged credential workflows (especially if you still have legacy systems that require passwords)

Again: vendor doesn’t matter as much as the pattern.


Token security for agents: concrete controls that matter

1) Minimize token lifetime (and make refresh explicit)

For agent access tokens, aim for:

  • 5–15 minutes for high-risk APIs
  • 15–60 minutes for lower-risk internal APIs

If an agent needs ongoing access, let it re-request tokens under policy rather than holding durable secrets.

2) Use audience restrictions everywhere

Make sure tokens are minted with explicit audience:

  • aud = https://api.yourcompany.com/tools/jira

Then enforce it server-side.

Audience mistakes are common when teams reuse generic JWT validation middleware. In agent systems, this becomes critical because agents call many different APIs.
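A minimal server-side check, sketched over already-decoded claims rather than any specific JWT library (the function name and URLs are illustrative). Signature and expiry checks are what generic middleware gives you for free; the audience comparison is the part teams forget.

```python
import time

def validate_claims(claims, expected_aud, now=None):
    """Reject tokens minted for a different API, even if the signature is valid."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    # Per RFC 7519, 'aud' may be a single string or a list of strings.
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_aud not in audiences:
        raise PermissionError(f"token audience {aud!r} != {expected_aud!r}")
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")
```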

3) Use “write vs read” privilege splits for tools

Most tool integrations default to coarse scopes (e.g., “repo access”). For agents, try to split:

  • read-only token for browsing and summarization
  • write token only when explicitly approved (or only for specific paths)

Examples:

  • GitHub: prefer fine-grained tokens limited to specific repos and permissions
  • Slack: restrict to chat:write only if truly needed; prefer “read-only” scopes otherwise
  • Ticketing: separate “comment” from “close”
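One way to make the split concrete is a per-tool scope catalog that defaults to read-only and gates write scopes behind explicit approval. The tool names and scope strings below are illustrative, not any vendor's real scope list:

```python
# Hypothetical scope catalog: each tool gets a read tier and a gated write tier.
TOOL_SCOPES = {
    "github": {"read": ["contents:read", "metadata:read"], "write": ["contents:write"]},
    "slack": {"read": ["channels:history"], "write": ["chat:write"]},
    "tickets": {"read": ["issue:read"], "write": ["issue:comment", "issue:close"]},
}

def scopes_for(tool: str, write: bool = False, approved: bool = False) -> list:
    """Default to read-only; hand out write scopes only with explicit approval."""
    tiers = TOOL_SCOPES[tool]
    if not write:
        return list(tiers["read"])
    if not approved:
        raise PermissionError(f"write access to {tool} requires approval")
    return list(tiers["read"]) + list(tiers["write"])
```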

4) Prefer proof-of-possession (PoP) when feasible

Bearer tokens are easy to steal and replay. Where supported:

  • mTLS (mutual TLS)
  • DPoP (OAuth proof-of-possession)
  • service mesh identities (SPIFFE IDs)

PoP is not a silver bullet (compromised host still hurts), but it kills entire classes of “token copied from logs” incidents.

5) Bind tokens to context signals

Bind token usage to context:

  • workload identity
  • VPC / subnet / cluster
  • device posture (where applicable)
  • “must be used via broker” enforcement

Even without full device attestation, you can often enforce:

  • only from specific networks
  • only from specific Kubernetes clusters
  • only from a specific cloud account
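A simple enforcement sketch: record bindings on the token at mint time, then compare them against the caller's observed context at use time. The field names here are assumptions; the real signal set depends on what your gateway can attest.

```python
def context_allowed(binding: dict, request_ctx: dict) -> bool:
    """Accept a token only when the caller's context matches its bindings.

    Every binding key that was set at mint time must match exactly;
    extra context fields are ignored so bindings can be adopted incrementally.
    """
    for key, expected in binding.items():
        if request_ctx.get(key) != expected:
            return False
    return True
```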

6) Treat prompts, tool outputs, and memory stores as sensitive

Agents leak secrets in weird ways:

  • a tool returns a token by accident
  • a prompt injection asks the agent to print environment variables
  • the agent copies secrets into a vector database “memory” store

Practical steps:

  • redact secrets in logs by default
  • block or gate dangerous tool functions (“print env”, “read /proc”, “dump config”)
  • apply DLP scanning to prompt transcripts and memory stores
  • store prompt history encrypted, with tight access controls
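A first-pass redactor can target well-known credential shapes before lines reach logs, traces, or prompt transcripts. The pattern list below is illustrative and deliberately not exhaustive; treat it as a starting point, not a DLP replacement.

```python
import re

# Patterns for common credential shapes -- extend for your own token formats.
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),           # GitHub classic PATs
    re.compile(r"xox[baprs]-[A-Za-z0-9-]{10,}"),  # Slack tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),              # AWS access key IDs
    re.compile(r"eyJ[\w-]+\.[\w-]+\.[\w-]+"),     # JWTs (header.payload.signature)
]

def redact(line: str) -> str:
    """Scrub known credential shapes from a line before it is persisted."""
    for pattern in SECRET_PATTERNS:
        line = pattern.sub("[REDACTED]", line)
    return line
```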

Continuous access evaluation for agents (CAEP/SSF mindset, even if you don’t implement the specs yet)

Even if you’re not deploying CAEP (Continuous Access Evaluation Protocol) or SSF (Shared Signals Framework) today, the mindset is essential:

Access decisions should be revisited when risk changes, not only at token issuance.

Agentic systems are constantly running. You need a way to say:

  • revoke this agent now
  • revoke tool access now
  • revoke because we saw suspicious behavior

The practical, implementable version (today)

Build an event-driven “kill switch” path, even if it’s proprietary:

  • central policy service can mark an agent/session as revoked
  • downstream APIs check revocation state (cache + fast TTL)
  • token broker stops minting new tokens
  • long-running agent runtimes get a revocation event and stop

You can approximate continuous evaluation with:

  • webhooks or pub/sub topics
  • short token TTLs
  • centralized revocation lists
  • “deny by default” for risky tools
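A minimal client-side sketch of that approximation, assuming a central service exposes the current revoked set: each API checks on every request, and a short cache TTL keeps the central service off the hot path while bounding staleness.

```python
import time

class RevocationChecker:
    """Cached view of a central revocation list, refreshed on a fast TTL."""

    def __init__(self, fetch_revoked, ttl=30, clock=time.monotonic):
        self._fetch = fetch_revoked   # callable returning the revoked agent IDs
        self._ttl = ttl               # seconds of acceptable staleness
        self._clock = clock           # injectable for testing
        self._cached = frozenset()
        self._fetched_at = float("-inf")

    def is_revoked(self, agent_id):
        now = self._clock()
        if now - self._fetched_at >= self._ttl:
            self._cached = frozenset(self._fetch())
            self._fetched_at = now
        return agent_id in self._cached
```

With a 30-second TTL, worst-case propagation of a kill-switch event to every API is about half a minute plus the token TTL itself.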

The important part is that revocation is:

  • fast (minutes)
  • comprehensive (tools + runtime)
  • operationally tested (game days)

What signals should trigger revocation or re-auth?

Pick a handful of signals you can implement quickly:

  • a leaked token is detected (e.g., GitHub secret scanning, SIEM alert)
  • agent starts calling a tool it never used before
  • unusual call volume (runaway loop)
  • new destination / network context
  • agent owner changes team or leaves company
  • tool scopes are expanded (policy drift)
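The "agent calls a tool it never used before" signal is one of the easiest to implement. A crude but workable sketch (in-memory here; a real version would persist per-agent baselines):

```python
from collections import defaultdict

class ToolBaseline:
    """Flag the first time an agent calls a tool it has never used before."""

    def __init__(self):
        self._seen = defaultdict(set)  # agent_id -> tools observed so far

    def observe(self, agent_id: str, tool: str) -> bool:
        """Record the call; return True if the signal should fire (new tool)."""
        is_new = tool not in self._seen[agent_id]
        self._seen[agent_id].add(tool)
        return is_new
```

Route fired signals to review, re-auth, or revocation instead of letting the agent expand its reach silently.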

Implementation blueprint: an “Agent Identity Control Plane” (AICP)

If you’re serious about agentic AI, you’ll eventually need something that looks like a control plane.

Minimum components:

  1. Agent registry

    • owner (human)
    • business purpose
    • environment (prod/non-prod)
    • data classification
    • tool list
  2. Policy model

    • allowed tools
    • allowed datasets
    • allowed actions (read/write/delete)
    • rate limits and guardrails
  3. Credential issuance

    • workload identity binding
    • short-lived access tokens
    • PoP where possible
  4. Telemetry + audit

    • every tool call logged with principal, scope, and reason
    • correlation IDs across calls
  5. Revocation + response

    • disable agent
    • revoke tool grants
    • force re-approval

A concrete “first version” you can ship

You don’t need a massive platform rewrite to begin. A first version can be:

  • an internal “agent registry” table
  • a lightweight token broker service
  • a standard set of scopes per tool
  • a simple revocation list checked by APIs
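The registry portion of that first version fits in a few dozen lines. This is a sketch with illustrative field names, not a schema recommendation; the useful property is that every tool-access decision and the kill switch flow through one place.

```python
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    """One row in a first-version agent registry."""
    agent_id: str
    owner: str                 # a named human, never a shared alias
    purpose: str
    environment: str           # "prod" or "non-prod"
    tools: set = field(default_factory=set)
    revoked: bool = False

class AgentRegistry:
    def __init__(self):
        self._agents = {}

    def register(self, record: AgentRecord):
        self._agents[record.agent_id] = record

    def may_use(self, agent_id: str, tool: str) -> bool:
        """Deny unknown agents, revoked agents, and unregistered tools."""
        rec = self._agents.get(agent_id)
        return rec is not None and not rec.revoked and tool in rec.tools

    def revoke(self, agent_id: str):
        self._agents[agent_id].revoked = True
```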

Then iterate.


Two high-impact implementation examples

Example 1: GitHub Actions OIDC instead of long-lived secrets

A common anti-pattern is an agent (or CI pipeline) holding:

  • AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY

A better pattern is GitHub Actions OIDC federation:

  • GitHub issues an OIDC token for the workflow
  • AWS STS AssumeRoleWithWebIdentity exchanges it for short-lived credentials
  • role trust policy restricts:
    • repo
    • branch
    • environment

This is the exact pattern you want for many agent runtimes:

  • identity is asserted by the platform
  • privileges are granted via short-lived role sessions
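The trust-policy restriction can be read as a predicate over the OIDC token's claims. The role and repo names below are hypothetical, but the claim shapes match what GitHub Actions issues: the issuer is `https://token.actions.githubusercontent.com`, the AWS audience is conventionally `sts.amazonaws.com`, and the repo/branch are encoded in `sub`.

```python
def trust_policy_allows(claims: dict) -> bool:
    """Mirror the conditions an AWS role trust policy would enforce
    on a GitHub Actions OIDC token (repo/role names are illustrative).
    """
    if claims.get("iss") != "https://token.actions.githubusercontent.com":
        return False
    if claims.get("aud") != "sts.amazonaws.com":
        return False
    # Pin to one repo and one branch -- no other workflow can assume the role.
    return claims.get("sub") == "repo:my-org/deployer:ref:refs/heads/main"
```

Every condition the predicate checks is something the platform asserts, not something the workload stores, which is why no long-lived secret exists to leak.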

Example 2: Kubernetes workload identity (and why audience matters)

In Kubernetes, many teams accidentally create “agent god-mode” by:

  • binding overly broad RBAC to a ServiceAccount
  • letting projected tokens be accepted by multiple audiences

Practical guidance:

  • use a dedicated ServiceAccount per agent
  • scope RBAC narrowly (namespaces, resource verbs)
  • set audience explicitly for projected tokens
  • for cloud access, prefer OIDC federation to cloud roles (avoid exporting keys)

A phased adoption sequence (so you can start this quarter)

Phase 0 — Stop the bleeding (1–2 weeks)

  • Inventory current agentic tools and automations (including “just a prototype”).
  • Find where credentials live:
    • source repos
    • CI/CD variables
    • agent config DBs
    • logs
  • Rotate any long-lived API keys used by agents.
  • Enforce a baseline token TTL where you can.

Phase 1 — Standardize identity (2–6 weeks)

  • Require every agent to have:
    • a registered owner
    • a unique principal (no shared “automation” accounts)
    • least-privilege roles
  • Move agent runtimes to workload identity (managed identity / k8s SA / mTLS).
  • Ban persistent storage of third-party tokens where possible.

Phase 2 — Broker tool access (6–12 weeks)

  • Implement a token broker (central service) that:
    • mints short-lived tokens for tools
    • logs issuance
    • enforces policies
  • Add audience restrictions and scopes for each tool.
  • Add rate limiting and guardrails per agent.

Phase 3 — Continuous evaluation + revocation (quarter)

  • Build event-driven revocation:
    • security signals (suspicious behavior)
    • compromise signals (leaked tokens)
    • HR signals (owner left team)
  • Test incident response:
    • can you disable an agent and stop tool calls within minutes?

Actionable checklist: what to ask in an agent security review

Use this checklist in architecture reviews and go/no-go gates:

  • Identity

    • Does the agent have a unique principal (not shared)?
    • Is there a named human owner and a business purpose?
    • Is production separated from non-production agents?
  • Credentials

    • Are credentials short-lived (minutes) rather than static?
    • Are tokens audience-restricted and scope-minimized?
    • Is PoP used where feasible (mTLS/DPoP)?
    • Are secrets prevented from landing in logs and prompts?
  • Authorization

    • Are tool permissions least privilege (read vs write split)?
    • Is there an approval path for high-risk tools (email, Slack posting, infra changes)?
  • Revocation

    • Can we disable the agent quickly?
    • Can we revoke tool access quickly?
    • Is the revocation path tested?
  • Audit

    • Can we reconstruct “why” an action happened (prompt/tool evidence)?
    • Are actions correlated across systems?

Closing: agentic AI doesn’t replace IAM — it raises the bar

The incident that triggered this post was a classic case of fast shipping + misunderstood security defaults. But the deeper lesson is that agents make token security and non-human identity governance a first-class enterprise risk.

If you don’t redesign for:

  • ephemeral credentials,
  • context-bound access,
  • brokered tool tokens,
  • and fast revocation,

then your future “AI workforce” will be held together by strings of bearer tokens. And those strings will end up in the wrong place.

Start with the basics this quarter: standardize agent identities, shorten token lifetimes, and build a kill switch. The rest becomes achievable once those foundations exist.

