Getting Started with GuardClaw
A step-by-step walkthrough of setting up GuardClaw, your first security layer for AI agents. From install to your first security report in five minutes.
Key takeaways
- One Homebrew command installs a security layer that checks 847+ attack indicators in under 3 seconds.
- Your first security report shows you exactly what GuardClaw catches and what to pay attention to.
- Cloud connection is optional. Everything runs locally first. Your code never leaves your machine.
You gave your AI agent access to your codebase last week. It can read files, write code, run shell commands, and call APIs. It finished a task in four minutes that used to take you two hours.
Here’s the question nobody asks: what is that agent doing between those API calls? What files is it reading? What commands is it running? Is anything stopping it from doing something it shouldn’t?
For most teams, the answer is nothing. The agent has full access and zero oversight.
GuardClaw changes that. It’s a security layer that watches your AI agent and catches dangerous behavior before it causes damage. This post walks you through setting it up, step by step, in about five minutes.
Why your agent needs a seatbelt
Think about the first day at a new job. You don’t walk in and get the keys to every system. There’s a process. You get specific access to specific things. Someone reviews your work for the first few weeks. There are guardrails.
AI agents skip all of that. The moment you configure one, it gets access to everything you give it — your file system, your terminal, your API keys. No review period. No guardrails. No one watching.
GuardClaw adds the guardrails back. It sits between your agent and the things it can touch, checks every action against a set of security rules, and blocks anything that looks dangerous.
Your agent still works exactly the same. It just can’t do the things it shouldn’t.
Step 1: Install GuardClaw
Open your terminal (on macOS, search for “Terminal” in Spotlight, or use iTerm if you have it) and run this command:
brew install takeinterestinc/tap/guardclaw
This uses Homebrew, a package manager for macOS and Linux. If you don’t have Homebrew yet, the Homebrew site (brew.sh) shows you how to set it up; it’s one line.
Once the install finishes, verify everything worked:
guardclaw version
You should see a version number and build date:
GuardClaw v0.6.1 (87fa7bf6) built 2026-03-20T04:22:36Z
If you see that, you’re ready.
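If you script your setup, the same check can be automated. The following is a minimal sketch, not part of GuardClaw itself: the helper names guardclaw_installed and guardclaw_version_number are hypothetical, and the parsing assumes the version line keeps the format shown above.

```shell
# Hypothetical helpers (not shipped with GuardClaw).

# Succeeds only if a guardclaw binary is found on PATH.
guardclaw_installed() {
  command -v guardclaw >/dev/null 2>&1
}

# Reads version output on stdin and prints just the number, e.g. "0.6.1" from
#   GuardClaw v0.6.1 (87fa7bf6) built 2026-03-20T04:22:36Z
# Assumes the line format shown above; prints nothing if it does not match.
guardclaw_version_number() {
  sed -n 's/^GuardClaw v\([0-9][0-9.]*\).*/\1/p'
}
```

A typical use would be: guardclaw_installed && guardclaw version | guardclaw_version_number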
Step 2: Check your environment
Before doing anything else, let GuardClaw look at your setup and tell you if anything needs attention:
guardclaw doctor
Think of this like a health checkup. GuardClaw scans your machine for AI agent configurations, checks that its own files haven’t been tampered with, and reports anything unusual.
If everything is green, you’re good. If it flags something, it will tell you exactly what and how to fix it.
Step 3: Run your first security test
This is the part that makes it real. GuardClaw comes with a library of known attack patterns — the same kinds of tricks that bad actors use to exploit AI agents in production. You can test your detection engine against all of them.
Note: The attack simulation (--attack) is available on the Pro plan. The config audit (--audit) is available on all plans, including free.
guardclaw test --attack
This runs 847 attack indicators through GuardClaw’s detection engine and shows you what it catches. It takes about 2-3 seconds. Nothing dangerous actually happens — it’s like a fire drill. The attacks are simulated, not executed.
Here’s what the report looks like:
========================================
GuardClaw Attack Simulation Report
========================================
Total indicators: 847
Blocked: 823
Passed (unblocked): 24
Detection rate: 97%
Grade: A+
Duration: 2.341s
----------------------------------------
Tiers exercised: anomaly-detection, pattern-bloom, pattern-re2
By OWASP ASI Category:
AI01 Prompt Injection 45/45 (100%)
AI02 Insecure Output Handling 38/40 (95%)
AI03 Training Data Poisoning 12/12 (100%)
AI04 Model Denial of Service 28/30 (93%)
AI05 Supply Chain Vulnerabilities 67/68 (99%)
----------------------------------------
Overall: 97/100 (A+)
Status: EXCELLENT - All attack categories covered
========================================
Here’s how to read it:
- Total indicators is the number of attack patterns tested.
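- Blocked is how many GuardClaw caught and would have stopped.
- Passed (unblocked) is the remainder: the patterns that got through.
- Detection rate is your overall score: blocked divided by total. Here that is 823 of 847, or about 97%, meaning roughly 97 out of every 100 attacks get caught.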
- Grade is a letter grade based on the score. A+ means 97% or higher.
- By OWASP ASI Category breaks it down by attack type: prompt injection, output manipulation, supply chain attacks, and others. ASI refers to the OWASP Agentic Security Initiative, and the category names follow OWASP's published lists of AI application risks.
The key number is the detection rate. If you’re at 97% out of the box, that’s strong.
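To see where the 97% comes from, you can recompute it from the Blocked and Total indicators lines. This is a minimal sketch that assumes you have saved the report text to a file and that those two lines keep the labels shown in the sample; the function name detection_rate is hypothetical.

```shell
# Hypothetical helper: reads a saved attack report on stdin and prints
# blocked / total as a percentage with two decimals.
# Assumes the "Total indicators:" and "Blocked:" labels shown in the sample report.
detection_rate() {
  awk -F': *' '
    /^[ \t]*Total indicators:/ { total = $2 }
    /^[ \t]*Blocked:/          { blocked = $2 }
    END {
      if (total > 0) printf "%.2f\n", 100 * blocked / total
      else exit 1   # no usable total found
    }'
}
```

For the sample report, 823 blocked out of 847 works out to about 97.17%, which the report rounds to 97%.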
Step 4: Audit your agent’s configuration
GuardClaw can also check how your AI agent is set up — not just what attacks it blocks, but whether the configuration itself is secure:
guardclaw test --audit
This checks your project directory for agent configuration files and scores them:
========================================
GuardClaw Security Audit Report
========================================
Directory: /your/project
Score: 94/100 Grade: A
----------------------------------------
[Agent: claude-code]
[PASS] Hooks are configured correctly [AI07]
Guardian hook is registered
[WARN] Policy strictness is permissive [AI02]
Consider setting strictness to 'strict'
Overall: 94/100 (A)
Status: PASS
========================================
Each line is a specific check. [PASS] means that item is configured correctly. [WARN] means it works but could be tighter. [FAIL] means something needs fixing — and it tells you what.
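For a quick summary of a long audit, you can tally the markers. The sketch below assumes the report has been saved to a file and that each check line carries one of the three bracketed markers described above; audit_status_counts is a hypothetical name.

```shell
# Hypothetical helper: counts [PASS], [WARN], and [FAIL] lines in an
# audit report read from stdin.
# "Status: PASS" is not counted because it has no brackets.
audit_status_counts() {
  awk '
    /\[PASS\]/ { pass++ }
    /\[WARN\]/ { warn++ }
    /\[FAIL\]/ { fail++ }
    END { printf "pass=%d warn=%d fail=%d\n", pass, warn, fail }'
}
```

For example, after saving the report with guardclaw test --audit > audit.txt, run audit_status_counts < audit.txt.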
You can run both the attack simulation and the config audit together:
guardclaw test --audit --attack
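If you want to rerun the audit in a script and fail when the score drops, one option is to parse the Score line yourself. This is a sketch under assumptions: it expects the report on stdin in the format shown above, the helper names audit_score and score_meets are hypothetical, and this guide does not say what exit code guardclaw test returns, so the sketch does not rely on it.

```shell
# Hypothetical helpers for gating on the audit score.

# Prints the numeric score from a line like "Score: 94/100 Grade: A".
audit_score() {
  sed -n 's/^[[:space:]]*Score:[[:space:]]*\([0-9][0-9]*\)\/100.*/\1/p' | head -n 1
}

# Usage: score_meets MIN < report.txt
# Succeeds when a score is found and it is at least MIN.
score_meets() {
  min=$1
  score=$(audit_score)
  [ -n "$score" ] && [ "$score" -ge "$min" ]
}
```

A possible use: guardclaw test --audit > audit.txt, then score_meets 90 < audit.txt || exit 1. Pick the threshold that suits your project; 90 is only an example.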
Step 5: Connect to cloud (optional)
Everything you’ve done so far runs entirely on your computer. No internet connection needed. No data sent anywhere.
If you want to see your security data in a web dashboard — scan history, threat breakdowns, audit trails — you can connect GuardClaw to the cloud. This step is optional. The core protection works without it.
First, create a free account at guardclaw-dashboard.web.app. You can sign in with Google, GitHub, or Apple. Once you’re in, you’ll find your API key and workspace ID in the dashboard settings.
Then connect:
guardclaw connect --api-key gk_your_key --workspace ws_your_workspace
Replace gk_your_key and ws_your_workspace with the values from your dashboard. You’ll see:
Connected to GuardClaw cloud!
Workspace: ws_your_workspace
API URL: https://api.guardclaw.com/v1
Config: ~/.guardclaw/config.yaml
Syncing live threat patterns...
42 live patterns synced (version 156)
Sync your receipts with:
guardclaw sync
Two things happen when you connect: GuardClaw downloads the latest threat patterns (new attack signatures the community has identified), and it starts syncing your security receipts to the dashboard so you can view them from any browser.
The free tier includes dashboard access and receipt sync with a 1,000 receipt/month limit.
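If you would rather not type your API key directly into a command, a small wrapper can read it from environment variables instead. This is a sketch, not a GuardClaw feature: the variable names GUARDCLAW_API_KEY and GUARDCLAW_WORKSPACE and the function connect_from_env are hypothetical, and only the connect flags shown above are used.

```shell
# Hypothetical wrapper: reads credentials from environment variables
# (names chosen for this example) and passes them to the documented
# "guardclaw connect" flags.
connect_from_env() {
  if [ -z "${GUARDCLAW_API_KEY:-}" ] || [ -z "${GUARDCLAW_WORKSPACE:-}" ]; then
    echo "set GUARDCLAW_API_KEY and GUARDCLAW_WORKSPACE first" >&2
    return 1
  fi
  guardclaw connect --api-key "$GUARDCLAW_API_KEY" --workspace "$GUARDCLAW_WORKSPACE"
}
```

This keeps the key out of the command you type and out of scripts you commit. The key is still passed as a command-line argument to guardclaw, so treat this as a convenience, not as secret management.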
What you’ve built
In about five minutes, you now have:
- A detection engine with over 1,000 compiled security patterns running on your machine
- A security grade for both your detection coverage and your agent’s configuration
- A test you can rerun anytime to verify your protection level
- Optionally, a cloud dashboard for visibility and live pattern updates
Your agent works exactly the same as before. GuardClaw watches from the side. It doesn’t slow your agent down, doesn’t modify your prompts, and doesn’t send your code anywhere.
Next: what happens when you put GuardClaw between your agent and its tools — live supervision that catches threats as they happen.