ToxSec - AI and Cybersecurity

ToxSec - AI and Cybersecurity

Premium

AP2 AgentCard Poisoning Breaks AI Payment Security

How the confused deputy problem in A2A turns your AI agent's payment scope into an automated wire transfer for an attacker, and why AP2's cryptographic mandates don't stop it.

ToxSec's avatar
ToxSec
Jan 15, 2026
∙ Paid
AP2 AgentCard poisoning confused deputy attack: AI payment agent hijacked via A2A protocol before cryptographic mandate signing

TL;DR: Google’s AP2 protocol and Universal Commerce Protocol put a $5 trillion agentic commerce stack into developer hands in early 2026. The A2A protocol underneath it uses digital business cards called AgentCards to route tasks between agents. Researchers proved you can stuff a prompt injection payload inside that card. Your agent becomes a confused deputy: it holds your payment permissions and takes orders from us. The crypto signatures don’t help. We sign the mandate ourselves.

This is the public feed. Upgrade to see what doesn’t make it out.

Your AI Just Signed a Wire Transfer You Didn’t Approve

January 11, 2026. Google announces the Universal Commerce Protocol at the National Retail Federation conference. UCP sits on top of AP2 (the Agent Payments Protocol), talks to A2A (the agent-to-agent coordination protocol) and MCP (the tool protocol), and lets your AI agent complete a purchase directly from Google Search. Shopify, Target, Walmart, Visa, all signed. AP2 went live with 60 partner organizations in September 2025. McKinsey projects $3 to $5 trillion in global agentic commerce volume by 2030.

Here is the trust model this stack runs on: a user expresses intent, an AI agent acts on it, and AP2 generates a cryptographically signed mandate proving the user authorized the transaction. The mandate is tamper-evident. The payment is non-reputable. The audit trail is clean. On paper, this is a closed system.

Toxsec.com image.

The problem is the agent. AP2’s cryptographic protections fire after the agent decides what to sign. If an attacker already owns the agent’s reasoning by the time the mandate gets generated, the crypto is irrelevant. We get a perfectly valid signature on a fraudulent transaction. The math is clean. The money is gone. If you want to see the same logic one layer down the stack, our MCP tool poisoning breakdown shows how this exact reasoning failure starts at the tool layer.

How AgentCard Poisoning Hijacks AP2 Payment Scope

The A2A protocol is how AI agents find and hire each other. Every agent publishes an AgentCard, a JSON file served at /.well-known/agent.json on its domain. Think of it as a machine-readable business card: “I process invoices. I validate vendors. I have access to the procurement database.” The host agent reads every card it encounters and decides which agent handles which task.

The confused deputy problem, a concept in computer security since the late 1980s, describes what happens when a trusted program gets tricked into misusing its own privileges. The program isn’t compromised in the traditional sense. It’s just doing exactly what it was told, by the wrong person, with the right credentials. In an A2A context, the host agent is the deputy. It holds your AP2 payment scope. It follows instructions from whatever AgentCards it reads.

Toxsec.com image

August 2025. Trustwave SpiderLabs published research they called “Agent in the Middle.” The attack: stuff a prompt injection payload directly into an AgentCard’s capability description field. The A2A spec in version 0.3 and above supports AgentCard signing but does not enforce it. Your host agent fetches the card, reads the descriptions, and the injected instructions land in the model’s context window as trusted input. Every subsequent task routes through the compromised agent. Every A2A call, every MCP tool use, every AP2 mandate gets generated under our influence.

We can redirect a vendor. We can inflate a transaction amount slightly enough not to trigger fraud alerts. We can swap the shipping address. The mandate gets signed. The audit trail says you approved it.

A2A host agent parsing poisoned AgentCard capability description, showing injected instruction field beside legitimate invoice processing capability

Agent Session Smuggling Finishes What AgentCard Poisoning Started

November 2025. Palo Alto’s Unit 42 published research on a variant they call Agent Session Smuggling. This one doesn’t need an initial AgentCard compromise. It works on agents you’ve already authenticated and trusted.

A2A is a stateful protocol: it remembers the conversation. This is what makes it useful for multi-step tasks. An agent can pick up a workflow mid-stream, carry context across turns, and coordinate complex purchase flows. Unit 42 found that a malicious remote agent can exploit this statefulness to inject instructions between a legitimate request and its response. The A2A spec includes a legitimate mechanism called input-required state, where a remote agent asks the client agent for additional information. Session smuggling weaponizes that mechanism.

Toxsec.com image

Here’s the move: your client agent sends a normal task request. The remote agent starts processing. Midway through the session, the remote agent fires back an input-required update containing injected instructions referencing credentials, vendor details, or payment routing. Your client agent reads this as a continuation of an authenticated conversation. The instructions execute before any AP2 mandate validation runs.

Layered on top of the AgentCard poisoning in 0x01, the full chain looks like this: the poisoned card establishes our agent in the routing path, session smuggling injects transaction-level instructions mid-session, and the resulting AP2 mandate carries our modifications with the user’s valid signature. Red Hat confirmed the A2A spec provides no built-in defense against this specific injection pattern.

The Mandate Signs Whatever the Confused Deputy Tells It To

AP2’s security story centers on verifiable digital credentials. Three mandate types, each cryptographically signed using ECDSA: an Intent Mandate (pre-authorized spending parameters), a Cart Mandate (the exact items and price at checkout), and a Payment Mandate (the financial instruction to the network). The claim is that these mandates create a tamper-evident chain of evidence from user intent to executed transaction.

Here is where the claim breaks. The chain starts with the agent. The user’s intent gets translated into a mandate by the AI agent acting on their behalf. If we own the agent’s context through the two attacks above, we own what the intent looks like by the time the mandate is generated. We don’t tamper with the mandate after signing. We tamper with what the agent believes the intent was before it picks up the pen.

Toxsec.com image

AP2’s own documentation acknowledges the gap: the protocol explicitly asks how merchants can verify that an agent’s request accurately reflects the user’s true intent. Their answer is the mandate system itself. But the mandate system assumes the agent is trustworthy when it generates the mandate. Researchers at Trustwave, Unit 42, and Cloud Security Alliance all documented in 2025 that this assumption fails at the input parsing layer, before mandate generation begins. In December 2025, OpenAI published guidance on hardening their browser agent and acknowledged that prompt injection, the root mechanism of every attack in this chain, will likely never be fully solved.

AP2’s math is clean. The attack lands before the math runs.

We dropped the free chapters. Now breach the wall for the dead-simple step-by-step kill switch that shuts this all down.

User's avatar

Continue reading this post for free, courtesy of ToxSec.

Or purchase a paid subscription.
© 2026 Christopher Ijams · Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture