Watching Your Agents Work
Field Guide
Watching Your Agents Work
GuardClaw's supervised execution wraps any agent command, intercepts threats in real time, and builds a tamper-evident audit trail. Here's what that looks like, step by step.
Key takeaway
guardclaw run wraps any agent command. The agent runs normally. GuardClaw watches every action and blocks the dangerous ones.
Key takeaway
Every decision gets recorded in a receipt chain — cryptographically linked so no one can tamper with the record after the fact.
Key takeaway
Receipts sync to the cloud automatically. Your dashboard updates without you doing anything extra.
Your AI agent ran 200 tasks today. You saw the output. Clean commits, tests that passed, APIs that returned the right data.
What you didn’t see: the 14 times it tried to read a file outside its project directory. The shell command it built that would have sent your environment variables to an external server. The prompt injection buried in a dependency’s README that tried to override its instructions.
Those attempts happened. Without supervision, you’d never know.
GuardClaw’s supervised execution wraps your agent, watches every action it takes, and blocks the ones that cross the line. This post walks through how it works, what you’ll see, and what the audit trail means.
What “supervision” means here
Think about how a good manager works. They don’t read every line of code their team writes. They don’t approve every commit. What they do: set clear boundaries, review the exception reports, and investigate when something looks off.
GuardClaw supervises the same way. It doesn’t interfere with your agent’s normal work. It watches the actions your agent takes — reading files, running shell commands, calling APIs — and evaluates each one against security rules. Safe actions pass through instantly. Dangerous ones get blocked. Everything gets written down.
Step 1: Wrap your agent command
To supervise an agent, put guardclaw run -- in front of the command you normally use to start it. The -- separates GuardClaw’s options from your agent’s command.
For example, if you normally start your agent with a command like python my_agent.py or node agent.js, you’d run:
guardclaw run -- python my_agent.py
Or for Claude Code:
guardclaw run -- claude
When GuardClaw starts, you’ll see a monitoring banner:
GuardClaw v0.6.1 — monitoring active
Workspace: ws_default | Plan: free | Cloud: connected
Run: run-4f8a2b1c9d7e
This tells you:
- Monitoring active — GuardClaw is watching.
- Workspace — which workspace this run belongs to (for organizing your security data).
- Plan — your current tier (free, pro, or ultimate).
- Cloud — whether receipts will sync to the dashboard.
- Run — a unique ID for this session so you can find it later.
After the banner, your agent runs normally. Its output appears in your terminal just like it would without GuardClaw.
Step 2: Watch an interception happen
When your agent tries something the security engine flags, GuardClaw blocks it and tells you what happened.
For example, if an agent tries to read files outside its project boundary:
[DENY] Read: ../../.env
Reason: path traversal outside project boundary
Policy: filesystem-boundary
OWASP: AI07 — Insecure Plugin Design
Action: blocked
Here’s what each line means:
- [DENY] — GuardClaw blocked this action. (You’ll also see [ALLOW] for actions that passed.)
- Read: ../../.env — what the agent tried to do. In this case, read a file two directories up from the project root — a common technique for accessing secrets.
- Reason — why it was blocked. Plain language.
- Policy — which security rule caught it.
- OWASP — the industry-standard attack category this falls under.
- Action: blocked — confirmation that the action did not execute.
The agent never runs that command. It receives an error response (the same kind it would get if the file didn’t exist) and moves on to its next task. No crash. No manual intervention needed.
Another example — if an agent tries to run a command that would send data to an unknown external server:
[DENY] Bash: suspicious outbound data transfer detected
Reason: exfiltration pattern detected
Policy: default-deny
OWASP: AI01 — Prompt Injection
Action: blocked
GuardClaw’s detection engine checks against over 1,000 compiled patterns. It catches command injection, prompt injection, data exfiltration, path traversal, SSRF, encoding evasion, and more. You saw these categories in the attack simulation report from the previous post.
Step 3: Understand the receipt chain
Every decision GuardClaw makes — every allow and every deny — generates a receipt. These receipts are stored locally on your machine in a chain where each entry is linked to the previous one using cryptographic hashes.
What that means in practice: nobody can go back and delete a receipt, insert a fake one, or change the order of events. If anyone tried, the hash chain would break and GuardClaw would flag it.
A receipt looks like this:
Receipt #1247
Run: run-4f8a2b1c9d7e
Event: pre_tool_use
Decision: deny
Tool: Bash
Hash: sha256:a8f3...
Prev Hash: sha256:7d12...
Timestamp: 2026-03-22T14:32:01Z
Why does this matter? Because six months from now, when someone asks “what did our agents do in March?” you can show them a complete, verifiable record. Not log files that someone might have edited. A chain where any tampering is mathematically detectable.
This is especially important for regulated industries — healthcare, finance, government — where you need to prove what your systems did and didn’t do.
Step 4: Sync receipts to the cloud
If you connected to the cloud in the previous post, your receipts sync automatically. Every hour, GuardClaw pushes new receipts to your dashboard in the background.
You can also sync manually whenever you want:
guardclaw sync
The sync is efficient. It doesn’t upload every receipt — just the important ones (denied actions, flagged events, and periodic checkpoints). Normal allowed actions stay on your machine unless you run a full sync.
After syncing, your security data appears in the web dashboard, which we’ll cover in the next post.
What you’ve set up
With supervised execution running:
- Every action your agent takes passes through the detection engine before it executes
- Dangerous actions get blocked with a clear explanation of what happened and why
- A tamper-evident receipt chain records every decision for future reference
- Receipts sync automatically to your cloud dashboard (if connected)
The agent doesn’t know it’s being supervised. From its perspective, some actions succeed and some return errors — the same as any other runtime constraint. It adapts and moves on.
Next: what all of this looks like in the dashboard — your security data in one place, ready for you or an auditor to review.
Join the Intelligence Brief
Threat intelligence, agentic vulnerabilities, and engineering frameworks delivered straight to your inbox.