What to Do When GuardClaw Blocks Something
Field Guide
Your agent hit a denial. Is it a real threat or a false positive? Here's how to read the denial, investigate, and decide what to do next.
Key takeaways
- Not every denial is a threat. Some are your agent trying to do legitimate work that your policy doesn't allow yet.
- The denial tells you exactly what happened, why it was blocked, and which rule caught it. Start there.
- If it's a false positive, adjust the policy. If it's a real threat, investigate where the malicious input came from.
Your agent just stopped working on a task. The output says something was blocked. Now what?
The first reaction is usually frustration — the security tool is getting in the way. Before you disable anything, take 30 seconds to read what happened. The denial might be the most important signal you see all day.
When GuardClaw blocks an action, it tells you exactly what, why, and which rule caught it. This post walks through how to read a denial, decide whether it’s legitimate, and take the right next step.
Reading a denial
Every denial looks like this:
[DENY] Tool: action description
Reason: why it was blocked
Policy: which rule triggered
OWASP: attack category (if applicable)
Action: blocked
Four pieces of information, plus the outcome on the last line. That’s all you need to decide what happened.
The tool and action tell you what the agent tried to do. Was it reading a file? Running a shell command? Making an API call?
The reason tells you why GuardClaw flagged it. Was it a path outside the project boundary? A pattern that matches a known attack? A domain not on the allow list?
The policy tells you which rule caught it. This is the rule you’d adjust if you decide it’s a false positive.
The OWASP category tells you what class of attack this matches, if any. Not every denial maps to an attack — some are just policy boundaries.
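If you want to triage denials in a script rather than by eye, the layout above is simple enough to parse. The following is a minimal sketch, not part of GuardClaw: the `Denial` class and `parse_denial` function are hypothetical names, and it assumes the first line reads `[DENY] <tool name>: <action description>` and that an empty OWASP field means the denial is a plain policy boundary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Denial:
    tool: str
    action_description: str
    reason: str
    policy: str
    owasp: Optional[str]  # None when no attack category applies
    outcome: str


def parse_denial(text: str) -> Denial:
    """Parse one denial in the five-line layout shown above (assumed format)."""
    fields = {}
    for raw in text.strip().splitlines():
        line = raw.strip()
        if line.startswith("[DENY]"):
            # "[DENY] <tool>: <description>" -> which tool, and what it tried
            tool, _, description = line[len("[DENY]"):].partition(":")
            fields["tool"] = tool.strip()
            fields["action_description"] = description.strip()
        elif ":" in line:
            # Split on the first colon only, so values may contain colons
            key, _, value = line.partition(":")
            fields[key.strip().lower()] = value.strip()
    return Denial(
        tool=fields.get("tool", ""),
        action_description=fields.get("action_description", ""),
        reason=fields.get("reason", ""),
        policy=fields.get("policy", ""),
        owasp=fields.get("owasp") or None,
        outcome=fields.get("action", ""),
    )


# Hypothetical denial text; the tool and rule names are made up for illustration.
sample = """[DENY] read_file: ../../etc/passwd
Reason: path outside project boundary
Policy: fs-boundary
OWASP:
Action: blocked"""

denial = parse_denial(sample)
```

With the fields split out, the policy name is the value to search for in your configuration, and a missing OWASP category is a hint that you may be looking at a boundary rather than an attack.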
The three scenarios
Every denial falls into one of three categories:
1. Real threat — GuardClaw caught something dangerous
What it looks like: The action contains a known attack pattern. Prompt injection, command injection, path traversal, data exfiltration. The OWASP category is populated. The reason clearly describes a malicious pattern.
What to do: Don’t unblock it. Investigate where the malicious input came from. Did the agent encounter a prompt injection in a file it was reading? Was a dependency’s README trying to override the agent’s instructions? Was user input being passed to the agent without sanitization?
The receipt chain shows you the full sequence of events leading up to the denial. Look at the actions immediately before the blocked one — they’ll show you the context.
2. False positive — the agent needs to do this legitimately
What it looks like: The action is something your agent genuinely needs to do for its job. Reading a file in a directory that’s outside the configured boundary. Calling an API domain that isn’t on the allow list. Running a shell command that matches a broad pattern.
What to do: Update your policy to allow this specific action. See Writing Your First Security Policy for how to adjust the guardclaw.yaml file.
Be specific. Don’t open up the entire file system because the agent needs to read one directory outside the project root. Add that specific directory to the allowed paths. The goal is the minimum change that lets the legitimate action through.
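To see why the narrow exception matters, here is a small sketch of a directory allow-list check. It is an illustration only and does not reproduce how GuardClaw matches paths or the guardclaw.yaml schema; `is_path_allowed` and every path below are hypothetical.

```python
from pathlib import PurePosixPath


def is_path_allowed(path: str, allowed_dirs: list[str]) -> bool:
    """Return True if `path` sits inside one of the allowed directories."""
    target = PurePosixPath(path)
    # PurePosixPath does not resolve "..", so reject it outright; otherwise
    # "/srv/shared/schemas/../secrets" would pass a simple prefix check.
    if ".." in target.parts:
        return False
    return any(target.is_relative_to(PurePosixPath(d)) for d in allowed_dirs)


# Narrow exception: the project root plus the one extra directory the agent needs.
narrow = ["/home/dev/project", "/srv/shared/schemas"]

# Broad exception: the whole file system.
broad = ["/"]

needed = "/srv/shared/schemas/user.json"
sensitive = "/srv/shared/secrets/key.pem"

narrow_results = (is_path_allowed(needed, narrow), is_path_allowed(sensitive, narrow))
broad_results = (is_path_allowed(needed, broad), is_path_allowed(sensitive, broad))
```

Both allow lists let the legitimate read through. Only the narrow one still blocks the read the agent never needed.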
3. Misconfiguration — the policy is too broad or too narrow
What it looks like: Multiple denials of the same type in a short time window. The agent keeps trying variations of the same action and getting blocked each time. Or the opposite — something you’d expect to be blocked isn’t showing up as a denial.
What to do: Review your policy configuration. If the agent is hitting the same rule repeatedly, either the rule is too strict for your workflow or the agent is stuck in a retry loop. Check the pattern and adjust.
If you see fewer denials than expected, run guardclaw test --audit to check that your policies are configured correctly. A misconfigured policy might not be enforcing what you think it is.
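The "multiple denials of the same type in a short time window" signal is easy to check mechanically once you have denial timestamps. The sketch below assumes you have already extracted `(timestamp, policy_name)` pairs from your logs; that input shape, the function name, the five-minute window, and the threshold of three are all assumptions chosen for illustration, not GuardClaw defaults.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def repeated_denials(denials, window=timedelta(minutes=5), threshold=3):
    """Return policy names that fired at least `threshold` times within `window`.

    `denials` is an iterable of (timestamp, policy_name) tuples.
    """
    by_policy = defaultdict(list)
    for timestamp, policy in denials:
        by_policy[policy].append(timestamp)

    flagged = []
    for policy, stamps in by_policy.items():
        stamps.sort()
        # Check every run of `threshold` consecutive denials for this rule.
        for i in range(len(stamps) - threshold + 1):
            if stamps[i + threshold - 1] - stamps[i] <= window:
                flagged.append(policy)
                break
    return flagged


# Hypothetical data: one rule fires three times in under a minute.
t0 = datetime(2024, 1, 1, 12, 0, 0)
denials = [
    (t0, "fs-boundary"),
    (t0 + timedelta(seconds=20), "fs-boundary"),
    (t0 + timedelta(seconds=45), "fs-boundary"),
    (t0 + timedelta(hours=2), "net-allowlist"),
]

flagged = repeated_denials(denials)
```

A flagged rule does not tell you which problem you have. It only tells you where to look: the rule may be too strict, or the agent may be stuck retrying.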
The investigation workflow
When you see a denial and aren’t sure which scenario it is:
- Read the denial. What tool, what action, what reason, what rule.
- Check the context. Look at the 5-10 receipts before the denial. What was the agent doing? Where did the inputs come from?
- Ask the question: “Is this an action my agent should be able to do?” If yes, it’s a false positive — adjust the policy. If no, it’s a real threat or a bug in the agent’s behavior.
- Check for patterns. Is this a one-time denial or does it happen repeatedly? Repeated denials of the same type suggest a systematic issue (either a persistent threat or a policy that needs adjustment).
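The workflow above can be sketched in a few lines. This assumes the receipt chain has been exported as an ordered Python list, which is a guess about your setup rather than a documented GuardClaw format; both function names are hypothetical, and the returned strings simply restate the guidance in the list.

```python
def receipts_before(receipts, denial_index, window=10):
    """Return up to `window` receipts that immediately precede the denial."""
    start = max(0, denial_index - window)
    return receipts[start:denial_index]


def suggest_next_step(agent_should_do_this: bool, repeat_count: int) -> str:
    """Map the answers to the last two questions onto a next step."""
    if agent_should_do_this:
        return "false positive: allow this specific action in the policy"
    if repeat_count > 1:
        return "systematic issue: look for a persistent threat or a stuck agent"
    return "real threat or agent bug: trace where the input came from"


# Hypothetical receipt chain: 20 entries, with the denial at index 15.
receipts = [f"receipt-{i}" for i in range(20)]
context = receipts_before(receipts, denial_index=15, window=5)
step = suggest_next_step(agent_should_do_this=False, repeat_count=1)
```

The judgment call in the third step stays with you. The code only keeps the context window and the follow-up consistent from one denial to the next.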
Don’t disable the security layer
The temptation when a denial blocks your workflow is to turn off GuardClaw temporarily. “I’ll re-enable it later.”
You won’t. And the reason GuardClaw blocked that action is the reason you need it running.
Instead, adjust the policy. If your agent needs to do something specific, allow that specific thing. The deny-by-default model means every exception is deliberate and documented.
Your guardclaw.yaml changes go through git. Six months from now, you can see exactly when and why you allowed each exception.
Next post: how GuardClaw’s detection engine works under the hood — the tiered architecture that checks 1,000+ patterns in under a millisecond.