Zylos
2026-02-24

Token Management and Credential Rotation in Multi-Tenant SaaS

security · saas · credentials · ai-agents · multi-tenant · oauth · secrets-management

Executive Summary

Static API keys are a ticking time bomb. In multi-tenant SaaS platforms — particularly those serving AI agents — a single leaked credential can compromise not just one customer's data but an entire class of workloads. The industry is in the middle of a structural shift: from long-lived, manually rotated secrets to dynamic, short-lived credentials generated just-in-time and tied to workload identity rather than static configuration.

This article examines the architectural patterns, tooling, and operational considerations behind modern credential rotation and token management, with a focus on the specific challenges that arise when AI agents become first-class consumers of your platform's APIs.


The Scale of the Problem: Non-Human Identities Dominate

The framing of "credential security" used to center on human users — rotate passwords, enforce MFA, expire sessions. That model is increasingly outdated. In modern SaaS infrastructure, non-human identities (NHIs) — service accounts, API keys, OAuth clients, CI/CD runners, and AI agents — now outnumber human users by ratios of 25:1 to 82:1 depending on the organization.

The 2025 State of Non-Human Identities report from Entro Security found that 97% of NHIs carry excessive privileges. A 2025 World Economic Forum analysis identified NHIs as the fastest-growing attack surface in enterprise security. And the consequences are severe: 80% of identity-related breaches in recent years involve compromised NHIs, not compromised human accounts.

AI agents compound this problem. Each agent instance may need credentials to call an LLM API, a database, a vector store, and a set of third-party integrations. In a multi-tenant SaaS context, those same agents serve hundreds or thousands of customers — and if credential scoping is wrong, a breach in one tenant context can spill into others.

The three root causes of NHI credential risk:

  1. Credential sprawl — secrets accumulate across repositories, environment files, and CI/CD systems with no central inventory
  2. Over-permissioning — credentials are created with broad scopes "to avoid future friction" and are never reviewed
  3. Long lifetimes — static API keys may never expire, giving attackers indefinite access once acquired

The OAuth 2.0 Foundation: M2M Authentication at Scale

For machine-to-machine (M2M) communication, the OAuth 2.0 Client Credentials flow is the dominant standard. A service authenticates using its own client ID and secret to an authorization server, receives a short-lived access token, and uses that token to call downstream APIs. The authorization server becomes the single point of trust, externalizing the auth decision from every individual API endpoint.
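The flow above can be sketched with an in-memory authorization server standing in for a real one (class and field names here are illustrative, not any particular vendor's API):

```python
import secrets
import time

class AuthServer:
    """Toy authorization server implementing the OAuth 2.0 client_credentials grant."""

    def __init__(self, clients, token_ttl=300):
        self.clients = clients          # client_id -> client_secret
        self.token_ttl = token_ttl      # short-lived: 5 minutes
        self.issued = {}                # access_token -> (client_id, scope, expiry)

    def token(self, client_id, client_secret, scope):
        # The server, not the caller, decides the token lifetime.
        if self.clients.get(client_id) != client_secret:
            raise PermissionError("invalid_client")
        access_token = secrets.token_urlsafe(32)
        expiry = time.time() + self.token_ttl
        self.issued[access_token] = (client_id, scope, expiry)
        return {"access_token": access_token,
                "token_type": "Bearer",
                "expires_in": self.token_ttl}

    def introspect(self, access_token):
        # Downstream APIs validate tokens here instead of sharing secrets.
        entry = self.issued.get(access_token)
        if entry is None or entry[2] < time.time():
            return {"active": False}
        return {"active": True, "client_id": entry[0], "scope": entry[1]}
```

The key property is that downstream APIs never see the client secret; they only ever validate short-lived bearer tokens against the single point of trust.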

In 2025, the Model Context Protocol (MCP) specification — the emerging standard for AI tool APIs — made OAuth 2.1 mandatory for MCP servers, establishing the client credentials flow as the baseline for any AI agent platform with external integrations.

Key lifecycle principles under modern OAuth practice (aligned with RFC 9700, published January 2025):

  • Access tokens for sensitive APIs should expire within 5–15 minutes
  • Refresh tokens should expire within 7–30 days maximum
  • Refresh token rotation must invalidate the entire token family on revocation, not just the current token — a critical defense against token replay attacks where attackers use previously rotated but cached tokens
  • PKCE (Proof Key for Code Exchange) is mandatory for all flows that could be subject to interception
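The family-revocation rule in the third bullet is the subtle one. A minimal sketch of refresh token rotation with family invalidation (pure illustration; a real implementation would persist this state and bind families to clients):

```python
import secrets

class RefreshTokenStore:
    """Refresh token rotation: each use mints a new token; reuse of a
    rotated token revokes the entire token family (replay defense)."""

    def __init__(self):
        self.current = {}            # valid token -> family_id
        self.retired = {}            # already-rotated token -> family_id
        self.revoked_families = set()

    def issue_family(self):
        family, token = secrets.token_urlsafe(8), secrets.token_urlsafe(32)
        self.current[token] = family
        return token

    def rotate(self, token):
        if token in self.retired:
            # Replay of an old token: assume compromise, kill the family.
            self.revoked_families.add(self.retired[token])
            raise PermissionError("token reuse detected; family revoked")
        family = self.current.pop(token, None)
        if family is None or family in self.revoked_families:
            raise PermissionError("invalid or revoked token")
        self.retired[token] = family
        new_token = secrets.token_urlsafe(32)
        self.current[new_token] = family
        return new_token
```

Once a cached old token is replayed, even the attacker's freshly rotated token dies with the family — which is exactly what per-token invalidation fails to guarantee.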

For AI agent platforms, the practical implication is that agents should never store long-lived tokens locally. The token is fetched, used, and discarded — or the refresh flow is handled transparently by a token broker component sitting between the agent and the API it's calling.


Tenant-Scoped Credentials: Isolation as a Design Primitive

In a multi-tenant SaaS platform, credential isolation is not optional. A credential that has access to one tenant's data must not, under any circumstances, be usable against another tenant's resources.

Two patterns dominate:

The Isolation Manager Pattern — When a request enters the system, the platform's isolation manager generates a credential set scoped to the current tenant context. This involves looking up the IAM policies for that tenant, generating short-lived credentials with a tenant-specific permission scope, and passing those credentials to the executing component. The credentials expire when the request completes; they are never stored.
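A minimal sketch of that minting step, assuming a policy model where each action is narrowed by a tenant predicate (the function and field names are hypothetical):

```python
import secrets
import time

def mint_tenant_credentials(tenant_id, base_policy, ttl_seconds=120):
    """Isolation manager step: every credential carries a mandatory
    tenant predicate and a short expiry; nothing is ever stored."""
    scoped_policy = {
        # Each allowed action is narrowed to this tenant's resources only.
        "allow": [f"{action}:tenant/{tenant_id}/*" for action in base_policy],
    }
    return {
        "access_key": secrets.token_urlsafe(16),
        "tenant_id": tenant_id,
        "policy": scoped_policy,
        "expires_at": time.time() + ttl_seconds,
    }
```

Because the tenant predicate is baked into the credential itself, even a bug in downstream request handling cannot widen its reach to another tenant.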

AWS SaaS Tenant Isolation strategies formalize this: each service call carries an explicit tenant ID, every credential is generated with a mandatory tenant predicate, and background processes must carry tenant context explicitly rather than inheriting ambient access.

Dynamic Client Registration (OAuth) — Rather than a shared OAuth client serving all tenants, each tenant gets its own OAuth client registration with isolated client credentials, per-tenant redirect URIs, and scoped permissions. This creates a hard cryptographic boundary between tenants at the authorization layer, not just in application logic.

The three invariants that must hold in any properly isolated multi-tenant system:

  1. Every resource is owned by exactly one tenant
  2. Tenant ID is required, indexed, and part of uniqueness constraints at the data layer
  3. Tenant context is resolved before any business logic executes — it is never inferred from application state
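The third invariant can be enforced as a guard that runs before any handler, as in this sketch (the header name and session-lookup shape are illustrative assumptions):

```python
def resolve_tenant(request_headers, session_lookup):
    """Resolve tenant context explicitly from the authenticated session;
    fail closed if it cannot be resolved."""
    session_id = request_headers.get("X-Session-Id")
    if session_id is None:
        raise PermissionError("no session; tenant context cannot be resolved")
    tenant_id = session_lookup(session_id)
    if tenant_id is None:
        raise PermissionError("session has no tenant binding")
    return tenant_id

def with_tenant(handler, session_lookup):
    """Wrap a handler so tenant resolution always precedes business logic."""
    def wrapped(request_headers, payload):
        tenant_id = resolve_tenant(request_headers, session_lookup)
        # Tenant id is passed explicitly; downstream code never infers it.
        return handler(tenant_id, payload)
    return wrapped
```

Business logic receives the tenant ID as an explicit argument, never by reaching into ambient application state.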

Row-Level Security (RLS) in pooled database deployments enforces isolation at the database engine level, ensuring a query running in tenant A's context cannot return tenant B's rows even if application-level filtering fails.


Dynamic Secrets: Just-in-Time Credential Generation

The most powerful operational shift in secrets management is the move from static secrets to dynamic secrets — credentials that are generated on demand, scoped to a specific task, and automatically expire when that task is done.

HashiCorp Vault's dynamic secrets engine is the most widely deployed implementation of this pattern. Rather than storing a database password in Vault (which is still a static secret, just centrally stored), the dynamic secrets engine creates a new database user with limited permissions when a workload requests access. That user exists only for the duration of the job and is automatically revoked afterward.
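The lease lifecycle can be sketched as follows — this is a toy model of the pattern, not Vault's actual API; the database callbacks stand in for real user-creation statements:

```python
import secrets
import time

class DynamicSecretsEngine:
    """Toy Vault-style dynamic secrets engine: each request creates a
    fresh, short-lived database user that is revoked when its lease ends."""

    def __init__(self, create_db_user, drop_db_user, ttl=60):
        self.create_db_user = create_db_user  # callback into the database
        self.drop_db_user = drop_db_user
        self.ttl = ttl
        self.leases = {}  # lease_id -> (username, expires_at)

    def issue(self, role):
        username = f"v-{role}-{secrets.token_hex(4)}"
        password = secrets.token_urlsafe(24)
        self.create_db_user(username, password)
        lease_id = secrets.token_urlsafe(8)
        self.leases[lease_id] = (username, time.time() + self.ttl)
        return {"lease_id": lease_id, "username": username, "password": password}

    def revoke_expired(self, now=None):
        now = time.time() if now is None else now
        for lease_id, (username, expires_at) in list(self.leases.items()):
            if expires_at <= now:
                self.drop_db_user(username)  # credential ceases to exist
                del self.leases[lease_id]
```

Every issuance creates a distinct, auditable user, and revocation is a deletion rather than a rotation — there is no long-lived secret left behind to leak.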

HashiCorp's OpenAI dynamic secrets plugin (released 2025) applies this same pattern to LLM API credentials: instead of distributing a single static OpenAI API key to every agent instance, Vault generates fresh credentials on demand with automatic expiration. Vault Enterprise 1.21 added native SPIFFE auth support, allowing Vault to issue credentials to workloads based on cryptographic workload identity rather than any stored secret at all.

Key operational benefits of dynamic secrets:

  • Blast radius reduction — a compromised credential is valid for minutes, not months
  • Automatic audit trail — every credential issuance is logged with the workload that requested it
  • No secret zero problem — workloads don't need an initial secret to bootstrap access; they use workload identity attestation instead

Workload Identity: Eliminating Static Secrets Entirely

The logical endpoint of dynamic secrets is workload identity — a model where the credential is not a secret at all, but a cryptographically verifiable claim about what the workload is.

SPIFFE (Secure Production Identity Framework for Everyone) and its reference implementation SPIRE are the CNCF-graduated standards for this model. SPIFFE assigns each workload a SPIFFE Verifiable Identity Document (SVID), a short-lived X.509 certificate automatically rotated by the SPIRE agent running alongside the workload. The SVID proves the workload's identity cryptographically — no shared secret is involved.

For AI agents, SPIFFE/SPIRE solves the "secret zero" problem completely. An agent authenticates to Vault, to downstream APIs, and to peer services using its SVID, which is automatically renewed before expiry. If the agent is terminated, its identity expires with it. Prompt injection attacks that attempt to exfiltrate credentials are ineffective because there are no credentials in the agent's environment to steal.
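On the receiving side, a peer validates the SPIFFE ID against its expected trust domain. A minimal sketch — in practice the ID string comes from the URI SAN of an already-verified SVID certificate; here it is passed in directly for illustration:

```python
from urllib.parse import urlparse

def validate_spiffe_id(spiffe_id, trust_domain):
    """Check that a peer's SPIFFE ID belongs to the expected trust domain
    and return its workload path."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe":
        raise ValueError("not a SPIFFE ID")
    if parsed.netloc != trust_domain:
        raise ValueError(f"untrusted domain: {parsed.netloc}")
    return parsed.path  # workload path, e.g. /ns/agents/sa/summarizer
```

Authorization decisions then key off the workload path (namespace, service account), not off any bearer secret.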

The practical deployment pattern for Kubernetes-based AI agent fleets:

  1. SPIRE server provides platform-level identity attestation
  2. SPIRE agent runs as a DaemonSet, issuing SVIDs to workloads on each node
  3. Vault is configured with SPIFFE auth — it trusts SVIDs issued by your SPIRE server
  4. Agent containers request secrets from Vault using their SVID; no API key is ever written to disk or environment

Credential Injection Without Secrets in the Environment

Even before full SPIFFE adoption, a practical intermediate pattern eliminates static API keys from agent environments. The sidecar proxy pattern — deployed at scale by platforms like Aembit — intercepts outbound HTTPS requests from agent containers before they reach external APIs. The proxy validates the agent's workload identity, retrieves a temporary credential from the secrets store, and injects it into the Authorization header. The agent itself never handles the credential.
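The proxy's control flow reduces to three steps — attest, fetch, inject — sketched below with pluggable callbacks standing in for the real attestation, secrets-store, and HTTP layers (all names here are illustrative, not Aembit's API):

```python
class InjectingProxy:
    """Sidecar pattern sketch: the agent sends requests with no
    credentials; the proxy attests the workload, fetches a short-lived
    token from the broker, and injects the Authorization header."""

    def __init__(self, attest_workload, fetch_token, send):
        self.attest_workload = attest_workload  # workload identity check
        self.fetch_token = fetch_token          # broker / secrets store
        self.send = send                        # actual outbound HTTP call

    def forward(self, workload_id, url, headers=None):
        if not self.attest_workload(workload_id):
            raise PermissionError("workload attestation failed")
        headers = dict(headers or {})
        # The credential exists only inside the proxy, never in the agent.
        headers["Authorization"] = f"Bearer {self.fetch_token(workload_id, url)}"
        return self.send(url, headers)
```

From the agent's perspective nothing changes: it makes a plain request, and the bearer token appears only on the wire beyond the proxy.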

This approach provides:

  • Zero static secrets in the container environment (so env or /proc inspection by a compromised agent yields nothing useful)
  • Centralized audit logging of every external API call made by every agent
  • Policy enforcement at the network layer, independent of agent code

A DEV Community write-up documented a practical migration from static API keys to this model, noting that the change required zero modifications to agent code — only infrastructure changes to add the sidecar proxy.


Operational Considerations for Multi-Tenant Platforms

Deploying dynamic credential management in production requires addressing several operational realities:

Rotation overlap windows — When rotating a credential, both the old and new credentials must remain valid for a brief overlap period to prevent service interruption during in-flight requests. The overlap window should be short (seconds to minutes) and coordinated with token expiry times.
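An overlap-aware validator is small enough to sketch directly (a simplified model; a real deployment would persist this state and handle more than one in-flight rotation):

```python
import time

class RotatingKeyValidator:
    """During rotation, both the outgoing and incoming key are accepted
    for a short overlap window so in-flight requests do not fail."""

    def __init__(self, key, overlap_seconds=30):
        self.current = key
        self.previous = None
        self.previous_valid_until = 0.0
        self.overlap_seconds = overlap_seconds

    def rotate(self, new_key, now=None):
        now = time.time() if now is None else now
        self.previous = self.current
        self.previous_valid_until = now + self.overlap_seconds
        self.current = new_key

    def is_valid(self, key, now=None):
        now = time.time() if now is None else now
        if key == self.current:
            return True
        # Old key is honored only inside the overlap window.
        return key == self.previous and now < self.previous_valid_until
```

The overlap is bounded by wall-clock time rather than by "until the last caller upgrades," which is what keeps the window to seconds or minutes.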

Revocation propagation — When a tenant offboards or a credential is suspected compromised, revocation must be immediate and complete. This requires a revocation list or token introspection endpoint that every API gateway checks synchronously, not on a caching delay.

Automated onboarding — Every new agent or service must go through an automated onboarding flow that issues credentials with minimum necessary permissions, documented justification, and a defined expiry policy. 97% of NHIs carry excessive privileges because this onboarding step was either skipped or never reviewed.

Inventory and discovery — Organizations must know what NHIs exist. A CMDB or identity governance platform that continuously scans for service accounts, API keys, and OAuth clients — and flags ones that haven't been used or rotated in 90 days — is not optional at scale. Credential sprawl begins the moment discovery stops.

Rotation frequency — For AI agent platforms, a 30–90 day automated rotation cycle is the practical minimum for any credential that cannot be made fully dynamic. High-sensitivity credentials (LLM API keys with billing implications, database write access) should rotate on a 24-hour or shorter schedule, or be replaced with dynamic secrets entirely.


The AI Agent Threat Model

AI agents introduce a specific credential threat that doesn't exist in traditional service-to-service communication: prompt injection. An attacker who can influence the input to an AI agent may be able to instruct it to call an exfiltration endpoint using its own credentials. If those credentials are static API keys in the environment, the attack succeeds in a single tool call.

The defense-in-depth approach:

  1. No static credentials in the agent environment — use dynamic injection or workload identity
  2. Scope credentials to the minimum required for each task — an agent summarizing documents should not have credentials that allow writing to production databases
  3. Network-layer enforcement — only allow agents to connect to a pre-approved allowlist of endpoints; any attempt to reach an unexpected domain is blocked at the network layer and alerted
  4. Short token lifetimes — even if a credential is exfiltrated, it should expire before it can be operationalized
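The third layer is mechanical enough to sketch: an egress check that consults a pre-approved host allowlist and alerts on anything else (the host names below are placeholders, not a recommended list):

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.openai.com", "db.internal.example.com"}  # illustrative

def enforce_egress(url, allowed_hosts=ALLOWED_HOSTS, alert=print):
    """Network-layer defense: agents may only reach pre-approved hosts.
    Anything else is blocked and alerted, regardless of what the
    (possibly prompt-injected) agent was instructed to do."""
    host = urlparse(url).hostname
    if host not in allowed_hosts:
        alert(f"blocked egress attempt to {host}")
        return False
    return True
```

In production this check lives in the network layer (egress proxy, service mesh policy, or firewall), not in agent code — which is precisely why a prompt-injected agent cannot disable it.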

Auth0's 2025 analysis of API key security for AI agents emphasizes that the combination of short-lived tokens plus network controls makes prompt injection credential theft attacks impractical even when application-level defenses fail.


Conclusion

The era of static API keys distributed to AI agents via environment variables is ending. The patterns that replace it — OAuth M2M with short-lived tokens, dynamic secrets from Vault, tenant-scoped credential isolation, and SPIFFE workload identity — are production-ready and increasingly adopted as the baseline for serious multi-tenant SaaS platforms.

For AI agent platforms specifically, the threat model is more acute than traditional SaaS: agents are autonomous, operate across many APIs simultaneously, and are vulnerable to prompt injection in ways that human users are not. Getting credential management right is not just a compliance exercise — it is a core platform reliability and trust requirement. The platforms that treat it as such are building a durable security moat; those that don't are accumulating liability at scale.
