7 Comments
ToxSec's avatar

Feel free to AMA.

Yong Zheng-Xin (Yong)'s avatar

curious to know if you know how easily we can prompt the model to elicit these behaviors from the get-go (e.g. don’t connect to insecure public interfaces) without directly instructing it in AGENTS.md

ToxSec's avatar

for a while it works well. the problem is context drift. since they are basically staying active and running, there is a stream of context flowing. after 80k tokens even Opus gets hit hard by context rot. so they will eventually just align their behavior with the consensus of the latest posts on Moltbook.

Excellent AI Prompts's avatar

Timely indeed. I use utm.app for the VM, Tailscale, and webchat, and I'm working through prompt injection hardening now. I appreciate your notes on that.

ToxSec's avatar

thanks a ton. i love this space and the products. they also keep me interested with security aspects hah.

JHong's avatar

Super timely! I’ve got a few to-dos

ToxSec's avatar

it’s definitely good to have the list hah. and i bet there’s more out there. people are crafty!!