Everyone's Worried About Prompt Injection. That's the Easy Problem.
Field Guide
Prompt injection gets the headlines, but six other AI agent attack vectors cause more damage and get less defense investment. Mapping your full attack surface takes 30 minutes and changes how you think about security.
Key takeaways
- Prompt injection is real, but it's the one attack vector getting the most defense investment, while six others go unguarded.
- Your defense has to be as broad as your attack surface, not as narrow as your trending topic.
- Mapping your agent's full attack surface takes 30 minutes and fundamentally changes your security posture.
Imagine defending a castle. The entire army camps at the front gate. They’ve built walls there. They watch the gate constantly. They’ve read every article about gate defense. Gate attacks are impossible now.
Meanwhile, nobody’s watching the side wall. The supply tunnel gets used once a week. The kitchen has a back door to the village. Someone’s poisoning the well.
The castle is being invaded through the places nobody’s scared of.
Answer-First Summary
Prompt injection receives disproportionate attention while six other attack vectors cause most of the damage: data poisoning, tool compromise, credential theft, decision path manipulation, output injection, and behavioral drift. Mapping your agent’s full attack surface takes 30 minutes and reveals where your actual defenses should go.
The Attention Economy of Security
Prompt injection is real and visible. You feed hostile text in, something bad happens. It’s dramatic. Every researcher has a story. Every enterprise has a policy.
But root-cause analysis of actual breaches tells another story. Credential theft causes more damage. Tool misuse causes more incidents. Supply chain compromise is coming. Behavioral drift is already here.
Investment follows attention, not threat. Prompt injection gets headlines. The others are systemic and boring. They require thinking about incentives instead of clever adversarial examples.
So nobody builds defenses for them.
Your Attack Surface is Not Your Worry
Your agent exists in an ecosystem. It talks to tools, reads data, writes data, calls systems. At every interface is a potential attack.
Prompt injection is one interface. The visible one. User types something. Agent reads it. Something bad happens. You can track it.
But the database your agent reads from? Someone inserts malicious data on purpose. The agent picks it up. That’s data poisoning, not injection. Same speed. Same authority.
The tools your agent calls? A compromised tool returns bad data. The agent trusts it because it’s authorized. But the source is compromised. Supply chain attack.
The intermediate reasoning steps? An attacker influences them without changing the prompt. The reasoning flows normally. The conclusion serves the attacker. That’s hijacking, not injection.
These aren’t theoretical. Teams map one surface and forget the others.
The Seven Attack Surfaces
- Prompt Input. User text, user requests. Prompt injection lives here. This is what everyone defends.
- Data Poisoning. An attacker changes data your agent processes. The agent does exactly what it should, with poisoned input. Worse than injection because the agent can’t help itself.
- Tool Compromise. A tool your agent calls is compromised. The agent trusts it. Supply chain compromise at machine speed.
- Credential Theft. Someone steals your agent’s API key. They can do anything the agent can do, faster and at scale. It’s an identity attack.
- Decision Path Manipulation. An attacker influences intermediate steps without changing the prompt. The final decision serves the attacker. Subtle. Hard to detect. Happening now.
- Output Injection. An attacker changes agent output to make downstream systems behave badly. Dangerous if output is code, SQL, or shell commands.
- Behavioral Drift. An agent’s behavior shifts over six months. Nobody changed anything. The agent just drifted from model updates or data shifts. It looks normal until it breaks something.
Most teams defend against one of these. Zero defense for the rest.
The 30-Minute Audit
Work through each surface. For each one, ask: what happens if this gets attacked?
Prompt input: Filtering input? Monitoring for patterns? Limiting based on source? Most teams say yes. Most are wrong.
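One layer of input filtering can be sketched as a pattern screen. This is a minimal, illustrative example: the pattern list is hypothetical and far from exhaustive, and a real deployment would treat this as one signal among several, not the defense.

```python
import re

# Illustrative patterns only -- a real pattern list must be maintained
# and tuned, and pattern matching alone will not stop prompt injection.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"reveal (your )?system prompt", re.I),
]

def screen_prompt(text: str) -> bool:
    """Return True if the input looks clean, False if it matches a known pattern."""
    return not any(p.search(text) for p in SUSPICIOUS_PATTERNS)
```

A flagged prompt should be logged with its source, not just dropped: the source attribution is what lets you limit trust later.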
Data poisoning: Validating data before processing? Monitoring for anomalies? Know which sources you trust most? Almost nobody does.
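Validation before processing can be as simple as shape-and-range checks on each record. A minimal sketch, assuming a hypothetical record with `price` and `sku` fields and illustrative bounds:

```python
def validate_record(record: dict) -> bool:
    """Reject records that fail basic shape and range checks
    before the agent ever sees them. Fields and bounds are illustrative."""
    price = record.get("price")
    if not isinstance(price, (int, float)):
        return False
    if not 0 < price < 1_000_000:          # hypothetical sanity bound
        return False
    sku = record.get("sku")
    if not isinstance(sku, str) or len(sku) > 64:
        return False
    return True
```

Records that fail should be quarantined and counted; a spike in rejections from one source is itself a poisoning signal.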
Tool compromise: Testing tools with intentional faults? Monitoring unexpected outputs? Circuit breakers? This is where real damage happens. Nobody defends it.
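A circuit breaker around each tool call is one concrete shape this defense can take. A minimal sketch, not a production implementation:

```python
import time

class CircuitBreaker:
    """Stop calling a tool after repeated failures; retry after a cooldown."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        """True if the tool may be called right now."""
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            # Half-open: cooldown elapsed, permit a trial call.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, ok: bool) -> None:
        """Report a call result; too many failures in a row opens the breaker."""
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()
```

"Failure" here should include unexpected outputs, not just errors: a tool that suddenly returns well-formed but anomalous data is the supply-chain case.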
Credential theft: Rotating credentials? Short-lived tokens? Can you revoke in seconds? Different credentials for different trust levels? One company found stolen agent credentials being used to exfiltrate data. Nobody noticed for six weeks.
Decision path manipulation: Logging reasoning? Validating against expected patterns? Measuring drift? There’s no standard defense. But ignorance is worse.
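Even without a standard defense, logging the reasoning trace and checking each step against an expected tool set is a start. A minimal sketch; the allowlist and trace shape are hypothetical:

```python
# Hypothetical per-task allowlist of tools this agent is expected to use.
EXPECTED_TOOLS = {"search", "summarize", "answer"}

def audit_trace(steps: list[dict]) -> list[str]:
    """Flag any reasoning step whose tool call falls outside the expected set."""
    alerts = []
    for i, step in enumerate(steps):
        tool = step.get("tool")
        if tool is not None and tool not in EXPECTED_TOOLS:
            alerts.append(f"step {i}: unexpected tool {tool!r}")
    return alerts
```

The value is less in any single alert than in the log itself: you cannot measure drift in decision paths you never recorded.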
Output injection: Sanitizing output before consumption? Treating agent output as untrusted? Validating it matches intent? Most teams don’t.
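Treating agent output as untrusted means gating it before any downstream system executes it. A minimal sketch for agent-generated SQL, assuming a read-only use case; the keyword list is illustrative, and a real gate would use a proper SQL parser and parameterized queries:

```python
import re

# Illustrative denylist -- a real system should parse, not pattern-match.
FORBIDDEN = re.compile(r"(?i)\b(drop|delete|insert|update|alter|grant)\b")

def safe_sql(agent_sql: str) -> str:
    """Permit a single read-only SELECT from agent output; reject everything else."""
    stmt = agent_sql.strip().rstrip(";")
    if ";" in stmt or "--" in stmt or "/*" in stmt:
        raise ValueError("multiple statements and comments are rejected")
    if not re.match(r"(?i)^select\b", stmt):
        raise ValueError("only SELECT statements are permitted")
    if FORBIDDEN.search(stmt):
        raise ValueError("forbidden keyword in statement")
    return stmt
```

The same principle applies to shell commands and generated code: validate against what the task intended, not just against what looks dangerous.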
Behavioral drift: Tracking decisions over time? Alerting on pattern changes? Testing regularly? You can’t stop drift. You can catch it early.
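Catching drift early can start with something as simple as comparing the agent's action distribution against a baseline. A minimal sketch using total variation distance; the alert threshold would need tuning per deployment:

```python
from collections import Counter

def tool_distribution(calls: list[str]) -> dict[str, float]:
    """Fraction of agent actions going to each tool."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {tool: n / total for tool, n in counts.items()}

def drift_score(baseline: dict[str, float], current: dict[str, float]) -> float:
    """Total variation distance between two action distributions
    (0 = identical, 1 = completely disjoint)."""
    tools = set(baseline) | set(current)
    return 0.5 * sum(abs(baseline.get(t, 0.0) - current.get(t, 0.0)) for t in tools)
```

Recompute the score on a rolling window and alert when it crosses a threshold; the agent that "just drifted" over six months shows up as a slow, steady climb.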
Rate yourself 0 to 3 on each surface. An average below 1.5 means your attack surface is wider than you think.
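The scoring scheme above is trivial to encode, which also forces you to rate every surface rather than the ones you remembered:

```python
SURFACES = [
    "prompt input", "data poisoning", "tool compromise", "credential theft",
    "decision path manipulation", "output injection", "behavioral drift",
]

def audit_score(ratings: dict[str, int]) -> float:
    """Average the 0-3 self-ratings; below 1.5 means the surface
    is wider than you think. Every surface must be rated."""
    missing = [s for s in SURFACES if s not in ratings]
    if missing:
        raise ValueError(f"unrated surfaces: {missing}")
    if any(not 0 <= v <= 3 for v in ratings.values()):
        raise ValueError("ratings must be between 0 and 3")
    return sum(ratings[s] for s in SURFACES) / len(SURFACES)
```

A team that scores 3 on prompt input and near 0 everywhere else averages well under 1.5, which is exactly the imbalance this article describes.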
The Principle That Matters
Prompt injection is easy to reason about because it’s visible. Someone puts malicious text in. Something bad happens. You can trace the line. The other attack surfaces don’t work that way. They’re systemic. They require thinking about incentives and failure modes and trust boundaries.
So they’re easy to ignore.
But ignored problems grow. An attacker doesn’t care about your defenses against prompt injection. They care about the six surfaces you haven’t thought about. That’s where they attack.
Your defense has to be as broad as your attack surface. Not narrower. Not based on what’s trendy. Based on what actually causes damage.
The Thread
GuardClaw defends all seven attack surfaces, from prompt injection to supply-chain compromise. Deterministic enforcement, not LLM guesswork. Explore the architecture →
We’ve talked about safety, identity, and threat surface. The last problem is architecture. Zero trust was built for humans making decisions at human speed. Agents aren’t human. They operate at machine speed and make decisions in combinations nobody anticipated. That changes everything about how you implement zero trust. Read Zero Trust Was Built for Humans.
Sources
- OWASP Top 10 for LLM Applications, v1.1, 2024. owasp.org
- Greshake et al.: “Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection,” 2023. arxiv.org
- MITRE ATLAS: Adversarial Threat Landscape for AI Systems. atlas.mitre.org