Discussion about this post

ToxSec

Feel free to ask me anything. I have some good tips on defending against these attacks, for those interested.

Mark S. Carroll ✅

What’s the simplest architecture that makes these three chains fail by default? Not “train the model harder.” I mean: where do you put the walls, what gets sandboxed, and what gets stripped before it ever hits the model or the renderer?
