How GuardClaw Is Different
Field Guide
How GuardClaw Is Different
There are other approaches to AI agent security. Here's where GuardClaw fits, what trade-offs we made, and why we made them.
Key takeaway
Most agent security tools scan code at rest. GuardClaw watches agents act in real time. Different layer, different problem.
Key takeaway
GuardClaw runs locally. Your code never leaves your machine. The security decisions happen on your hardware, not in someone else's cloud.
Key takeaway
We use deterministic pattern matching, not AI. Security decisions should be predictable, auditable, and reproducible.
Someone asks you: “Why GuardClaw instead of X?”
Fair question. There are other tools in the AI security space. Static analyzers that scan code for vulnerabilities. API gateways that filter requests. LLM guardrails that check prompts and responses. Each solves a real problem.
GuardClaw solves a different one.
Most AI security tools focus on the model (what goes in, what comes out) or the code (what’s vulnerable). GuardClaw focuses on the agent — what it does while it’s running. This post explains where that distinction matters.
The three layers of AI security
Think of AI security as three layers:
Layer 1: Code and dependency scanning. Tools that analyze your source code, find vulnerabilities in dependencies, detect hardcoded secrets, and flag insecure patterns. This happens before your code runs. It catches problems in the code itself.
Layer 2: Model input/output filtering. Tools that check what goes into the model (prompt injection detection) and what comes out (output validation, content filtering). This happens at the boundaries of the model — what goes in and what comes back.
Layer 3: Runtime agent security. Watching what the agent actually does after the model produces a response. The agent decided to run a shell command. Should it be allowed? The agent wants to read a file. Is it within bounds? The agent is making an API call. Is the destination safe?
Most tools operate at layer 1 or layer 2. GuardClaw operates at layer 3.
Why runtime matters
Imagine a building with a metal detector at the front door (layer 1 — scanning) and a receptionist who checks IDs (layer 2 — input validation). Good security. But once someone is inside the building, nobody is watching which rooms they enter, which filing cabinets they open, or which documents they photograph.
Layer 3 is the security cameras and locked doors inside the building. Even after someone passes the front desk, every door they try is checked.
For AI agents, runtime security matters because:
- Agents chain actions. A model might produce a safe-looking plan, but the execution of that plan involves dozens of individual actions. Each one needs checking.
- Context changes. An agent reading a file might encounter a prompt injection embedded in the content. The action started safe and became dangerous mid-execution.
- Tools have side effects. A shell command isn’t just text — it modifies files, starts processes, sends network requests. The consequences are real.
Three design decisions that define GuardClaw
1. Local-first
Your code never leaves your machine. The detection engine runs on your hardware. Security decisions happen locally with no network dependency.
Why we chose this: Sending your code or your agent’s actions to a cloud service for security evaluation creates a new data exposure. The security tool itself becomes a data processing relationship you need to manage. For teams handling sensitive code, proprietary algorithms, or regulated data, that’s a problem.
The trade-off: Cloud-first tools can update their detection models instantly for all customers. GuardClaw uses periodic pattern syncs (live feed for Pro plans, weekly for free). If a new attack pattern emerges, cloud tools can protect against it immediately. GuardClaw protects against it at the next sync.
2. Deterministic detection
GuardClaw uses compiled pattern matching (Bloom filters, Aho-Corasick, RE2 regexes) and rule-based policies. Not AI. Not probabilistic scoring. Deterministic rules that produce the same result every time.
Why we chose this: Security decisions need to be predictable. When an auditor asks “will this action be blocked?”, the answer should be “yes, always” or “no, never” — not “probably, 94% of the time.” Deterministic rules are auditable. You can read them, test them, verify them.
The trade-off: Pattern matching catches known attack patterns. A novel attack that doesn’t match any pattern gets through. AI-based detection can theoretically catch novel attacks through behavioral analysis. We address this partially with anomaly detection (behavioral patterns that flag unusual sequences), but it’s an area where the approach has limits. We wrote more about this in What We Got Wrong.
3. Defense in depth
Seven independent layers, each assuming the others can fail. Not one filter with one chance to catch an attack. Seven different approaches, each catching what the others miss.
Why we chose this: One layer with 95% accuracy misses 5 attacks out of 100. Seven layers, each at 90% accuracy, miss fewer than 1 in 10 million. Compound probability is the only way to get security that holds against creative attackers.
The trade-off: More layers means more complexity, more rules to maintain, and more potential for false positives. We manage this with tiered detection (fast checks first, deeper analysis only when earlier tiers flag something) and configurable strictness levels.
What GuardClaw doesn’t do
Being clear about boundaries:
- Doesn’t scan your code for vulnerabilities. Use a SAST tool for that.
- Doesn’t filter model outputs for hallucinations or harmful content. That’s a different layer.
- Doesn’t replace your security team. GuardClaw is a tool, not a service. You still need people who understand your threat model.
- Doesn’t guarantee 100% detection. No security tool does. We publish our detection rates (97%+ on the built-in corpus) so you can make informed decisions.
Where it fits in your stack
GuardClaw complements your existing security tools. It doesn’t replace them.
If you use a code scanner, keep using it. If you use an API gateway, keep using it. GuardClaw fills the gap between “the code is safe” and “the agent is behaving safely” — the runtime layer that most teams don’t have yet.
Next post: rolling out GuardClaw across a development team — permissions, workspaces, and shared visibility.
Join the Intelligence Brief
Threat intelligence, agentic vulnerabilities, and engineering frameworks delivered straight to your inbox.