AI Agent Rollback and Recovery: Why Undo Is Harder Than You Think
AI agents execute commands that look reversible but aren't. Here's what makes recovery hard — and why prevention at the shell layer is your real defense.
The Undo Assumption
When a human makes a mistake, recovery usually means one of two things: pressing Ctrl+Z or calling support. When an AI agent makes a mistake across dozens of systems over ten minutes, neither option works.
The assumption baked into many AI agent deployments is that errors are correctable. Rollback plans exist. Backups run nightly. Terraform can reconstruct infrastructure. If the agent does something wrong, you undo it.
This assumption is wrong — or at least, dangerously incomplete. The difficulty isn't theoretical. It compounds with every action the agent takes, every system it touches, and every downstream dependency that shifts in response.
What "Reversible" Actually Means
In agent deployments, commands fall into rough reversibility categories. The problem is that most categorizations are too optimistic.
| Command Type | Looks Reversible | Actually Reversible? | Why Not |
|---|---|---|---|
| File deletion | Yes (restore from backup) | Partial | Backup lag; downstream processes already consumed the file |
| Database row update | Yes (restore old value) | Partial | Other transactions may have read and acted on the new value |
| Config change | Yes (revert config) | Partial | Services already read the new config; restart required; may cause outage |
| API call (POST) | Rarely | Rarely | External system may have already acted (sent email, charged card, triggered webhook) |
| Secret rotation | Yes (rotate back) | No | Clients cached the old secret; services already failing; rotation history is permanent |
| Infrastructure deletion | Yes (redeploy) | No | Data not captured in IaC (runtime state, ephemeral logs, in-flight requests) |
| Permission grant | Yes (revoke) | No | Access may have been used between grant and revoke |
The pattern: almost everything looks reversible in isolation, and almost nothing is fully reversible once downstream effects are considered.
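The table's lesson can be encoded as a gating policy: treat anything not provably fully reversible as requiring review. A minimal sketch, with illustrative category names (not a standard taxonomy):

```python
from enum import Enum

class Reversibility(Enum):
    FULL = "full"
    PARTIAL = "partial"
    NONE = "none"

# Hypothetical policy table mirroring the categories above.
# "Partial" means reversible in isolation but not once downstream
# effects (consumers, caches, external side effects) are counted.
REVERSIBILITY = {
    "read_only_query":   Reversibility.FULL,
    "file_deletion":     Reversibility.PARTIAL,
    "db_row_update":     Reversibility.PARTIAL,
    "config_change":     Reversibility.PARTIAL,
    "external_api_post": Reversibility.NONE,
    "secret_rotation":   Reversibility.NONE,
    "infra_deletion":    Reversibility.NONE,
    "permission_grant":  Reversibility.NONE,
}

def requires_review(command_type: str) -> bool:
    """Gate anything not provably fully reversible.
    Unknown command types default to the most conservative answer."""
    return REVERSIBILITY.get(command_type, Reversibility.NONE) is not Reversibility.FULL
```

The conservative default matters: an unrecognized command type is gated, not waved through.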
Why Agents Make Recovery Harder Than Humans Do
Speed
A human engineer making a configuration mistake probably touches one system, notices something is wrong, and stops. An agent operating autonomously might touch thirty systems before a monitoring alert fires. By then, the change has propagated.
Speed isn't inherently bad — it's the point. But it means the error surface grows faster than human awareness can track it.
Cross-system scope
Agents don't stay in one system. That's also the point. A single task — "upgrade the database connection pool settings to improve performance" — might touch the application config, the database connection limits, the load balancer timeout settings, and the monitoring alert thresholds. Reverting one of those without reverting all of them may leave the system in a worse state than either before or after the change.
No natural stopping point
Humans get interrupted. They finish a step, look at the result, and decide whether to continue. Agents don't have this built in. They'll complete a task end-to-end unless something explicitly stops them. By the time a human reviews what happened, the full task is done.
Context collapse
When you ask an agent "what did you do and why," the answer may be incomplete. Long context windows truncate. Tool call logs don't always capture intermediate reasoning. The agent may not surface which decision led to the problematic action, because it didn't flag it as significant at the time.
Recovery requires knowing what happened. Agents don't always make that easy.
The Irreversibility Cascade
Individually reversible actions can combine into irreversible cascades. This failure mode is widely underappreciated.
Example: An agent is tasked with cleaning up unused resources in a cloud environment. It:
- Lists snapshots older than 90 days — reversible query
- Identifies snapshots tagged as "temp" or "test" — reversible query
- Deletes 47 snapshots — individually, each deletion might be recoverable (cloud providers often have soft delete). Collectively, recovering 47 snapshots takes hours and requires elevated support tickets.
- Meanwhile, another process — also agent-driven — was using one of those snapshots as a known-good restore point.
Each step was defensible. The cascade wasn't. And the second agent, operating in parallel, had no visibility into what the first agent was doing.
This is the irreversibility cascade: a sequence of locally reasonable actions that becomes globally harmful and difficult to undo.
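Two guards would have interrupted this cascade: checking for active consumers before deletion, and capping batch size so a single task can't silently delete 47 resources. A hedged sketch (the `consumers` map and `batch_limit` are illustrative, not a real cloud API):

```python
def safe_batch_delete(snapshots, consumers, batch_limit=5):
    """Guard a batch deletion.

    consumers: hypothetical map of snapshot -> processes using it as a
    restore point (the visibility the second agent in the example lacked).
    batch_limit: above this, escalate to a human instead of proceeding.
    """
    blocked = [s for s in snapshots if consumers.get(s)]
    if blocked:
        return {"status": "blocked", "in_use": blocked}
    if len(snapshots) > batch_limit:
        return {"status": "needs_approval", "count": len(snapshots)}
    return {"status": "ok", "deleted": list(snapshots)}
```

Neither check is sophisticated. Both convert a silent cascade into an explicit decision point.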
What Rollback Actually Requires
For rollback to work reliably after an agent incident, you need:
Complete action logs
Every command, every API call, every file write — timestamped, attributed to the agent session, with enough context to understand why it happened. Not just what happened, but the reasoning that drove it. Standard audit logs don't capture this. Shell history doesn't capture this.
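What such a record might look like, as a sketch — the field names are illustrative, not a standard schema. The point is the `reasoning` field, which standard audit logs and shell history omit:

```python
import json
import time
import uuid

def log_action(session_id, command, reasoning, target_system):
    """Build one structured action record, attributed to an agent session.
    Returns a JSON line suitable for an append-only log."""
    entry = {
        "id": str(uuid.uuid4()),
        "ts": time.time(),
        "session": session_id,
        "system": target_system,
        "command": command,
        "reasoning": reasoning,  # the "why" that audit logs don't capture
    }
    return json.dumps(entry)
```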
Blast radius mapping
You need to know which systems were touched, in what order, and what the state of each system was before the agent touched it. This requires pre-action snapshots or transaction logs per system — not just backup at the infrastructure level.
Dependency graph
Which downstream systems read from the systems the agent touched? Did they consume updated data before the rollback? Dependency graphs are often implicit in architectures, not documented, and not queryable.
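When a dependency graph does exist, answering "what else is contaminated?" is a reachability walk from the touched systems. A minimal sketch over an adjacency map (the graph itself is the hard part; this assumes you have one):

```python
from collections import deque

def downstream(graph, touched):
    """Return every system reachable from the ones the agent touched.
    graph: map of system -> list of systems that read from it.
    Anything returned may have consumed pre-rollback state."""
    seen, queue = set(), deque(touched)
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen
```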
Rollback coordination
If the agent touched ten systems and you need to roll back all ten, you need to coordinate that rollback so you don't create inconsistent state midway. This is a distributed systems problem. It's solvable — but it requires upfront design, not post-incident improvisation.
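One common shape for that upfront design is saga-style compensation: undo in strict reverse order of execution, and stop at the first failure so you always know exactly where the system stands. A sketch, where `undo` is a hypothetical per-action compensator:

```python
def rollback_in_reverse(actions, undo):
    """Undo actions in reverse execution order.

    undo(action) -> bool is a hypothetical compensator; on the first
    failure we halt rather than skip ahead, so the partial state is
    at least known rather than inconsistent in unknown ways.
    """
    undone = []
    for action in reversed(actions):
        if not undo(action):
            return {"ok": False, "undone": undone, "stuck_at": action}
        undone.append(action)
    return {"ok": True, "undone": undone}
```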
Most organizations have none of these in a form adequate for agent-driven incidents. The backup exists. The rollback procedure doesn't.
Defenses That Actually Help
Command authorization before execution
The most reliable rollback is the one you never need. If an agent's commands are reviewed before execution, the window for error is narrow. A reviewer can catch "delete 47 snapshots" before it runs, not after.
This is the architecture Expacti implements: every shell command routes through a review queue before execution. The agent can plan, reason, and prepare — but it doesn't execute until a human approves. That approval is logged, timestamped, and attributed.
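The core mechanic is small enough to sketch. This is an illustrative toy, not Expacti's actual implementation: commands sit in a pending state and become eligible to execute only after an attributed approval.

```python
class ReviewQueue:
    """Toy command gate: submitted commands are held until a named
    reviewer approves them; every approval is logged."""

    def __init__(self):
        self.pending = {}
        self.log = []

    def submit(self, cmd_id, command):
        self.pending[cmd_id] = command
        return "pending"

    def approve(self, cmd_id, reviewer):
        command = self.pending.pop(cmd_id)
        self.log.append({"cmd": command, "approved_by": reviewer})
        return command  # only now is it eligible to execute
```

The agent's workflow doesn't change — it still plans and submits. What changes is that "submitted" and "executed" are no longer the same event.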
Dry-run mandates for destructive operations
For operations that have a --dry-run flag or equivalent, mandate it first. Show the agent the output of the dry run. Require approval before proceeding with the real operation. This adds a checkpoint that's otherwise absent.
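Enforced in code, the mandate looks like this sketch — `execute(cmd, dry_run=...)` is a hypothetical runner standing in for whatever tool exposes the dry-run flag:

```python
def run_with_dry_run_gate(cmd, execute, approved=False):
    """Always produce a dry-run preview first. The real operation runs
    only when approval has been explicitly granted for this command."""
    preview = execute(cmd, dry_run=True)
    if not approved:
        return {"status": "awaiting_approval", "preview": preview}
    return {"status": "executed", "result": execute(cmd, dry_run=False)}
```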
Scope limits per task
Define upfront which systems an agent is allowed to touch for a given task. Implement enforcement — not just as a prompt instruction, but as a policy the infrastructure applies. An agent asked to "clean up old snapshots in us-east-1" shouldn't have credentials for eu-west-1.
Credential scoping is the technical implementation. The principle is blast radius control by design, not by hope.
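As a policy check, the scope limit is an allow-list the infrastructure evaluates, not an instruction the agent is asked to honor. Scope names here are illustrative:

```python
def check_scope(task_scope, requested_system):
    """Infrastructure-enforced allow-list: a request outside the task's
    declared scope is denied regardless of what the agent intends."""
    return requested_system in task_scope

# Hypothetical scope for "clean up old snapshots in us-east-1"
TASK_SCOPE = {"us-east-1:snapshots", "us-east-1:tags"}
```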
Staged execution with checkpoints
Break agent tasks into stages. After each stage, produce a summary of what changed and require explicit continuation approval before the next stage. This doesn't eliminate rollback problems, but it limits cascade depth — you catch issues at stage 2 instead of after stage 10.
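A sketch of that loop, where `approve(summary)` stands in for a hypothetical human-in-the-loop callback:

```python
def run_staged(stages, approve):
    """Run stages one at a time; after each, surface a change summary
    and require explicit continuation approval before the next stage."""
    completed = []
    for stage in stages:
        summary = stage()        # each stage returns what it changed
        completed.append(summary)
        if not approve(summary):
            return {"halted_after": len(completed), "summaries": completed}
    return {"halted_after": None, "summaries": completed}
```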
Pre-action state capture
Where possible, snapshot system state before the agent acts. For databases, this means transaction-level backups tied to agent session IDs. For config changes, this means version-controlled state before the agent writes. For infrastructure, this means tagged snapshots at task start.
This is more overhead than point-in-time backup. It's also more useful when you actually need to recover.
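The config-change case reduces to a simple discipline: capture the prior value, keyed by agent session, before every write. A sketch over an in-memory state map (real systems would persist the capture alongside the action log):

```python
import copy

_ABSENT = object()  # sentinel: the key did not exist before the write

def write_with_capture(state, key, value, session_id, snapshots):
    """Record the pre-write value under this agent session, then write."""
    prior = copy.deepcopy(state[key]) if key in state else _ABSENT
    snapshots.setdefault(session_id, []).append((key, prior))
    state[key] = value

def rollback_session(state, session_id, snapshots):
    """Restore everything this session changed, in reverse write order."""
    for key, old in reversed(snapshots.pop(session_id, [])):
        if old is _ABSENT:
            state.pop(key, None)
        else:
            state[key] = old
```

Because captures are keyed by session, rollback targets exactly what one agent run changed — not a blanket restore that clobbers unrelated work.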
The Honest Limitation
Command authorization, blast radius control, and staged execution reduce the need for rollback. They don't eliminate it. When things go wrong despite these controls, recovery is still hard.
The practical takeaway isn't "make recovery easier" — it's "invest in prevention so recovery is rarely necessary."
That investment looks like:
- Reviewing agent commands before execution, not auditing them after
- Scoping credentials to tasks, not handing agents full-environment access
- Building checkpoints into agent workflows, not treating them as black boxes
- Treating irreversibility as a first-class design constraint
Rollback is the wrong frame. The right question isn't "how do we undo this" — it's "how do we not get here in the first place."
Command Authorization at the Shell Layer
Expacti routes every shell command through a review queue before execution. Approvals are logged, attributed, and auditable. Your agents stay productive — and your team stays in control.
Try the Interactive Demo · Read the Docs