MCP Tool Poisoning in the Wild: Three Chains…

Feb 18

How MCP tool poisoning hijacks agent inference through description metadata, conversation-formatted JSON spoofs safety training.

Read →

5 Comments

ToxSec

Feb 18

Feel free to AMA. I got some nice tips of defending against these attacks for those interested.

Mark S. Carroll

Feb 18

What’s the simplest architecture that makes these three chains fail by default? Not “train the model harder.” I mean: where do you put the walls, what gets sandboxed, and what gets stripped before it ever hits the model or the renderer?

Reply (1)

ToxSec

Feb 18

well one of the biggest things is a sort of stapling/hash for the mcp tool. inspect it, approve it, register. this makes sure now ad hoc changes are made. my next article is going to break down these 3 hacks with. fixes people can implement:) hopefully done by next week.

Dr Sam Illingworth

Feb 18

This is absolutely brutal. I guess there's no way of protecting against this form of Trojan horse other than to be rigorous with which sites you visit? Also "How's the weather" legit feels like it could be a hackers calling card... 😅

Reply (1)

ToxSec

Feb 18

haha. i’m trying to write up a defense companion piece. being rigorous is definitely step 1. there a few good solutions like vetting your mcp tools, then hashing their values and pinning it to stop the rug pull attacks.

it just requires a bit more effort! (and awareness, which hopefully this helps with)

thanks Sam!