What We Got Wrong (And Changed)
This is the post companies don’t write.
Most teams publish wins. Ship announcements. Customer testimonials. Polished case studies. Nothing wrong with that. But there’s a second layer that almost nobody shares: the mistakes. The wrong architectural decisions made at 2am. The metrics tracked for months that were completely meaningless. The features built with conviction that nobody wanted.
Admitting mistakes publicly feels like handing ammunition to competitors. Revealing gaps in your thinking feels like admitting you’re not as smart as you claimed. We’re writing this anyway.
Why? Because people trust people more than they trust polish. Because showing your actual thinking, including where you were wrong, builds more credibility than pretending you nailed every decision on the first try.
Every mistake here made the product better. Not because failure is romantic. It’s not. But because failure is data. It teaches you things you can’t learn any other way.
Mistake 1: Coupling the Defense Layers
Early on, we designed our defense layers as a single pipeline. Input comes in, flows through validation, policy, anomaly detection, all in sequence, all tightly coupled. One module’s output was the next module’s input. Clean. Linear. Easy to reason about on a whiteboard.
For the first few months of development it worked. Then we needed to update the policy engine without touching input validation. Couldn’t do it. A change to anomaly detection thresholds required retesting the entire pipeline. One layer’s latency spike cascaded through everything. We’d built a monolith wearing a trench coat.
What we changed: We decoupled the layers so each one operates independently with defined contracts between them. Each layer can be updated, tested, and deployed on its own. It meant more configuration surface area. More contracts to maintain. Less elegant on a whiteboard. But now when someone needs a custom policy rule, we change one layer. When anomaly detection needs tuning, it doesn’t break input validation.
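To make the decoupling concrete, here is a minimal sketch of independent layers behind a shared contract. The names (`DefenseLayer`, `Verdict`, the example rules) are illustrative, not GuardClaw's actual API:

```python
# Hypothetical sketch: each layer implements one contract and knows
# nothing about the others, so any layer can be swapped or retuned alone.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Verdict:
    allowed: bool
    reason: str


class DefenseLayer(Protocol):
    def evaluate(self, request: dict) -> Verdict: ...


class InputValidation:
    def evaluate(self, request: dict) -> Verdict:
        if "payload" not in request:
            return Verdict(False, "missing payload")
        return Verdict(True, "ok")


class PolicyEngine:
    def evaluate(self, request: dict) -> Verdict:
        if request.get("action") in {"delete_all", "exfiltrate"}:
            return Verdict(False, "action denied by policy")
        return Verdict(True, "ok")


def run_layers(layers: list[DefenseLayer], request: dict) -> Verdict:
    # The pipeline is just composition over the contract; updating one
    # layer no longer means retesting the whole chain.
    for layer in layers:
        verdict = layer.evaluate(request)
        if not verdict.allowed:
            return verdict
    return Verdict(True, "all layers passed")
```

The cost shows up exactly as described above: more contracts to maintain, more surface area, but each layer ships on its own schedule.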
What this taught us: Simplicity for its own sake is a luxury. Simplicity that scales is a discipline. The difference is building for what actually breaks, not for what looks nice in documentation.
Mistake 2: Celebrating “Threats Blocked”
During internal testing, we tracked total threats blocked as our north star. We ran red-team exercises against our own layers and the number went up every week. We put it in decks. “GuardClaw blocked hundreds of suspicious inputs in testing this week.” Felt great. Looked great on a chart.
Then we asked a harder question: does a higher number mean we’re doing a better job, or does it mean our test harness is finding more edge cases? Turns out, a big chunk of those “blocked threats” were false positives. Input validation catching legitimate but unusually formatted requests. The number going up didn’t mean security was improving. It meant the system was being overly aggressive and would silently break real workflows.
What we changed: We stopped tracking volume and started tracking precision. What percentage of blocked actions were actual threats vs. false positives? What’s the time-to-detection for a genuinely malicious input? How often would someone need to manually override (meaning we were wrong)? These numbers are smaller and less exciting on a slide. But they actually tell us if the system is getting better.
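The shift from volume to precision is simple arithmetic; here is an illustrative sketch (the function names and counts are hypothetical, not our actual reporting code):

```python
# Precision over volume: of everything we blocked, how much was a real threat?
def precision(true_positives: int, false_positives: int) -> float:
    """Share of blocked actions that were actual threats."""
    blocked = true_positives + false_positives
    return true_positives / blocked if blocked else 0.0


def override_rate(manual_overrides: int, total_blocks: int) -> float:
    """How often a human had to reverse a block -- a proxy for being wrong."""
    return manual_overrides / total_blocks if total_blocks else 0.0
```

A week with 60 blocks, 45 of them real, yields a precision of 0.75. That number can go up while "threats blocked" goes down, which is exactly why it is the better metric.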
What this taught us: Metrics that feel important aren’t necessarily important. Importance lives in the decisions they inform. If a metric doesn’t change your actions, it’s theater.
Mistake 3: The Threat Visualization Dashboard Nobody Asked For
We built a real-time threat visualization. Interactive graph. Color-coded nodes for each defense layer. Animated attack paths flowing through the system. It was technically impressive. Beautiful, even. We spent weeks on it because the engineering challenge was fascinating. Custom visualizations. Real-time streaming. The works.
Then we talked to the people we were building for. They didn’t want a cinematic view of their security posture. They wanted a text log that said “blocked, reason, timestamp.” They wanted a Slack notification. They wanted something they could glance at during a morning standup, not something they had to sit and interpret like a weather radar.
What we changed: We reversed the order. Instead of asking “what’s technically interesting to build,” we started asking “what would you check first when something goes wrong?” The answer was always some version of: a list. Sorted by time. Filterable by severity. Exportable. No animations needed. We built that instead. The visualization still exists in the codebase somewhere. Nobody has opened it in months.
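The list users asked for needs almost no code. A minimal sketch, with illustrative field names and made-up sample events (ISO-8601 timestamps sort correctly as strings, so no date parsing is needed):

```python
# "A list. Sorted by time. Filterable by severity." -- the whole feature.
events = [
    {"ts": "2025-01-07T14:03:21Z", "severity": "high",
     "action": "blocked", "reason": "prompt injection pattern"},
    {"ts": "2025-01-07T14:01:09Z", "severity": "low",
     "action": "blocked", "reason": "malformed input"},
]


def filter_events(events: list[dict], min_severity: str = "low") -> list[dict]:
    order = {"low": 0, "medium": 1, "high": 2}
    keep = [e for e in events if order[e["severity"]] >= order[min_severity]]
    return sorted(keep, key=lambda e: e["ts"], reverse=True)  # newest first
```

Export is `csv.DictWriter` away; the Slack notification is one webhook call per high-severity event. Weeks of custom visualization work, replaced by a sort and a filter.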
What this taught us: Building is the easy part. Building things people actually want is hard. You can’t out-engineer that. You can only make better bets upfront.
Mistake 4: Building for Security Engineers (Who Aren’t Our Audience)
We assumed our users would be security engineers. People who think in MITRE ATT&CK frameworks and read CVE databases for fun. We wrote documentation in security jargon. We designed configuration using terminology from threat modeling textbooks. Our onboarding assumed you already knew what a trust boundary was.
The people we’re actually building for? Developers. Platform engineers. DevOps teams. People who are responsible for security but whose primary job is shipping product. They don’t want to learn a new security vocabulary. They want to add a config file, run a command, and know their agents are protected. They want guardrails, not a graduate seminar.
What we changed: We rewrote our onboarding to start with “what are your agents doing?” not “define your threat model.” We replaced security jargon with plain descriptions wherever possible. We added sensible defaults so you get protection out of the box without configuring anything. The advanced controls are still there for teams that want them. But the first five minutes no longer require a security background.
What this taught us: Average users don’t exist. User segments do. And the segment that actually uses your product might not be the segment you designed for. Design for the people in the room, not the people you imagined.
The Underlying Pattern
If you look at these mistakes, they have something in common: each is a case where we made an assumption without enough evidence and didn’t check it until we’d spent significant time and energy on it.
Carol Dweck’s research on growth mindset points to something that applies here. Growth requires active error processing. Not just failing. But actually asking: what was I wrong about? Why was I wrong? What do I need to change?
Here’s where most teams stop. They fix the mistake and move on. They don’t ask the harder question: why did we make this mistake in the first place? What would we need to do differently to avoid the next one?
The Human-AI Parallel
Relationships that get stronger are ones where people say “I was wrong” out loud. Not just thinking it. Not just fixing it quietly. Actually saying it.
Careers accelerate when people run toward hard feedback instead of away from it. The people who ask “what am I missing” end up knowing more than the people who assume they already know.
Same with building systems. The teams that actually improve are the ones willing to say: “Here’s what we thought would happen. Here’s what actually happened. Here’s what we’re changing and why.”
That vulnerability is harder to fake than polished marketing. It’s also more credible.
How to Do This Yourself
Run a “What did we get wrong?” session with your team.
Rules:
- No blame. You’re not looking for who messed up. You’re looking for what you all missed.
- No defense. Not “we were right at the time” or “circumstances changed.” Just: what did we assume that turned out to be wrong?
- No sugarcoating. Not “learning moments.” Actually wrong. Actually costly.
Go through each major decision in the last quarter. For each one, ask: what would need to be true for this to have been the wrong call? Is that true?
Document what you find. Not for public consumption necessarily. But document it. Because the teams that improve are the ones that actually learn from mistakes. And learning requires writing down what you learned.
Then ask: what changes do we need to make so we catch this type of mistake faster next time?
What This Costs
Transparency has a cost. Some people will use this against you. Some will assume admitting a mistake means the whole product is fragile. Some competitors might use this in a sales pitch.
We think that cost is worth it. Because the teams worth working with (customers, employees, partners) trust you more when you’re honest about your limitations. They trust you less when you pretend to be infallible.
And the product itself gets better. Because you’re not defending mistakes. You’re actually fixing them.
Your turn. What did you get wrong? Not theoretically. Actually. What did you build that didn’t work? What assumption turned out to be wrong? Run the session with your team. Share what you learn. Start small if you need to. But start.
The teams that improve are the ones that do this work.
Next: The Builder’s Responsibility is the closing essay on what it means to build this infrastructure with intention and care.