curious whether you know how easily we can prompt the model to elicit these behaviors from the get-go (e.g. don’t connect to insecure public interfaces) without directly instructing it in AGENTS.md?
for a while it works well. the problem is context drift. since the agents are basically staying active and running, there is a constant stream of context flowing. after 80k tokens even Opus gets hit hard by context rot. so they will eventually just align their behavior with the consensus of the latest posts on Moltbook.
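to make the drift point concrete, here is a rough sketch of one possible mitigation: re-pinning the standing rules every so many tokens so they don't get buried under newer context. everything in it is an assumption for illustration (the `Context` class, the `reinject_every` budget, the 4-characters-per-token estimate, the rule text); it is not something the agents or any framework are known to do, and it may not be enough against heavy context rot.

```python
from dataclasses import dataclass, field

# Hypothetical standing rule; in a real setup this would come from AGENTS.md.
STANDING_RULES = "Do not connect to insecure public interfaces."


def rough_tokens(text: str) -> int:
    # Crude estimate (about 4 characters per token); not a real tokenizer.
    return max(1, len(text) // 4)


@dataclass
class Context:
    # Assumed budget, chosen well below the 80k mark mentioned above.
    reinject_every: int = 20_000
    messages: list = field(default_factory=list)
    tokens_since_rules: int = 0

    def add(self, role: str, text: str) -> None:
        """Append a message, then re-pin the rules if the budget is used up."""
        self.messages.append((role, text))
        self.tokens_since_rules += rough_tokens(text)
        if self.tokens_since_rules >= self.reinject_every:
            self.messages.append(("system", STANDING_RULES))
            self.tokens_since_rules = 0
```

the idea is just that the rules always sit within `reinject_every` tokens of the newest message, instead of only appearing once at the very start.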
Feel free to AMA.
Timely indeed. I use utm.app for the VM, plus Tailscale and webchat, and I'm working through prompt injection hardening now. I appreciate your notes on that.
thanks a ton. i love this space and the products. they also keep me interested with security aspects hah.
Super timely! I’ve got a few to-dos.
it’s definitely good to have the list hah. and i bet there’s more out there. people are crafty!!