Technical Part 2 of GuardClaw in Practice

Watching Your Agents Work

Mo @ TAKE INTEREST · March 21, 2026 · 6 min read

Field Guide

Watching Your Agents Work

GuardClaw's supervised execution wraps any agent command, intercepts threats in real time, and builds a tamper-evident audit trail. Here's what that looks like, step by step.

guardclaw monitoring supervised-execution tutorial

Key takeaway

guardclaw run wraps any agent command. The agent runs normally. GuardClaw watches every action and blocks the dangerous ones.

Key takeaway

Every decision gets recorded in a receipt chain — cryptographically linked so no one can tamper with the record after the fact.

Key takeaway

Receipts sync to the cloud automatically. Your dashboard updates without you doing anything extra.

Your AI agent ran 200 tasks today. You saw the output. Clean commits, tests that passed, APIs that returned the right data.

What you didn’t see: the 14 times it tried to read a file outside its project directory. The shell command it built that would have sent your environment variables to an external server. The prompt injection buried in a dependency’s README that tried to override its instructions.

Those attempts happened. Without supervision, you’d never know.

GuardClaw’s supervised execution wraps your agent, watches every action it takes, and blocks the ones that cross the line. This post walks through how it works, what you’ll see, and what the audit trail means.

What “supervision” means here

Think about how a good manager works. They don’t read every line of code their team writes. They don’t approve every commit. What they do: set clear boundaries, review the exception reports, and investigate when something looks off.

GuardClaw supervises the same way. It doesn’t interfere with your agent’s normal work. It watches the actions your agent takes — reading files, running shell commands, calling APIs — and evaluates each one against security rules. Safe actions pass through instantly. Dangerous ones get blocked. Everything gets written down.

Step 1: Wrap your agent command

To supervise an agent, put guardclaw run -- in front of the command you normally use to start it. The -- separates GuardClaw’s options from your agent’s command.

For example, if you normally start your agent with a command like python my_agent.py or node agent.js, you’d run:

guardclaw run -- python my_agent.py

Or for Claude Code:

guardclaw run -- claude

When GuardClaw starts, you’ll see a monitoring banner:

  GuardClaw v0.6.1 — monitoring active
  Workspace: ws_default | Plan: free | Cloud: connected
  Run: run-4f8a2b1c9d7e

This tells you:

Monitoring active — GuardClaw is watching.
Workspace — which workspace this run belongs to (for organizing your security data).
Plan — your current tier (free, pro, or ultimate).
Cloud — whether receipts will sync to the dashboard.
Run — a unique ID for this session so you can find it later.

After the banner, your agent runs normally. Its output appears in your terminal just like it would without GuardClaw.

Step 2: Watch an interception happen

When your agent tries something the security engine flags, GuardClaw blocks it and tells you what happened.

For example, if an agent tries to read files outside its project boundary:

[DENY] Read: ../../.env
  Reason: path traversal outside project boundary
  Policy: filesystem-boundary
  OWASP:  AI07 — Insecure Plugin Design
  Action: blocked

Here’s what each line means:

[DENY] — GuardClaw blocked this action. (You’ll also see [ALLOW] for actions that passed.)
Read: ../../.env — what the agent tried to do. In this case, read a file two directories up from the project root — a common technique for accessing secrets.
Reason — why it was blocked. Plain language.
Policy — which security rule caught it.
OWASP — the industry-standard attack category this falls under.
Action: blocked — confirmation that the action did not execute.

The agent never runs that command. It receives an error response (the same kind it would get if the file didn’t exist) and moves on to its next task. No crash. No manual intervention needed.

Another example — if an agent tries to run a command that would send data to an unknown external server:

[DENY] Bash: suspicious outbound data transfer detected
  Reason: exfiltration pattern detected
  Policy: default-deny
  OWASP:  AI01 — Prompt Injection
  Action: blocked

GuardClaw’s detection engine checks against over 1,000 compiled patterns. It catches command injection, prompt injection, data exfiltration, path traversal, SSRF, encoding evasion, and more. You saw these categories in the attack simulation report from the previous post.

Step 3: Understand the receipt chain

Every decision GuardClaw makes — every allow and every deny — generates a receipt. These receipts are stored locally on your machine in a chain where each entry is linked to the previous one using cryptographic hashes.

What that means in practice: nobody can go back and delete a receipt, insert a fake one, or change the order of events. If anyone tried, the hash chain would break and GuardClaw would flag it.

A receipt looks like this:

Receipt #1247
  Run:       run-4f8a2b1c9d7e
  Event:     pre_tool_use
  Decision:  deny
  Tool:      Bash
  Hash:      sha256:a8f3...
  Prev Hash: sha256:7d12...
  Timestamp: 2026-03-22T14:32:01Z

Why does this matter? Because six months from now, when someone asks “what did our agents do in March?” you can show them a complete, verifiable record. Not log files that someone might have edited. A chain where any tampering is mathematically detectable.

This is especially important for regulated industries — healthcare, finance, government — where you need to prove what your systems did and didn’t do.

Step 4: Sync receipts to the cloud

If you connected to the cloud in the previous post, your receipts sync automatically. Every hour, GuardClaw pushes new receipts to your dashboard in the background.

You can also sync manually whenever you want:

guardclaw sync

The sync is efficient. It doesn’t upload every receipt — just the important ones (denied actions, flagged events, and periodic checkpoints). Normal allowed actions stay on your machine unless you run a full sync.

After syncing, your security data appears in the web dashboard, which we’ll cover in the next post.

What you’ve set up

With supervised execution running:

Every action your agent takes passes through the detection engine before it executes
Dangerous actions get blocked with a clear explanation of what happened and why
A tamper-evident receipt chain records every decision for future reference
Receipts sync automatically to your cloud dashboard (if connected)

The agent doesn’t know it’s being supervised. From its perspective, some actions succeed and some return errors — the same as any other runtime constraint. It adapts and moves on.

Next: what all of this looks like in the dashboard — your security data in one place, ready for you or an auditor to review.

Join the Intelligence Brief

Threat intelligence, agentic vulnerabilities, and engineering frameworks delivered straight to your inbox.

01 / Threat IntelZero-day vulnerabilities and mitigation strategies.

02 / Red TeamQuarterly teardowns of AI infrastructure.

03 / The BlueprintEngineering local-first deterministic computing.

Back to blog