14 Comments
NOT IN A CULT's avatar

It was a great live!

ToxSec's avatar

Thanks a ton! We love seeing you there. Appreciate the support =)

NOT IN A CULT's avatar

Great public service you all are doing! Your knowledge is remarkable!

ToxSec's avatar
8h · Edited

Deeply appreciated. Honestly, Substack has been really receptive =) Looking forward to more!

Rob's avatar

After trying the poem trick the other day, I noticed that after a few prompts you no longer need it, because the model accepts the context that already exists as its new normal, at least if you're referring to what you've already got it to spit out in the conversation.

ToxSec's avatar

It's an interesting phenomenon! Once the model agrees to do something, it will usually continue to do so, because it has a bunch of context where it already did it. So this creates a dilemma of "why did you do it before and you won't do it now."

The jailbreaks tend to stay persistent!

Rob's avatar

I used to fear these things would make humans obsolete, but the nature of their architecture seems to make their awesome feats of competence totally inseparable from awesome feats of destruction, both from intentional jailbreaking and even when users don't want them to. So it looks like we're in for a wild ride, but not one in which human intelligence becomes obsolete.

ToxSec's avatar

I’m right there with you. It’s honestly been incredibly interesting to watch all this happen.

Meenakshi NavamaniAvadaiappan's avatar

Great boundary conditions assessment and services to help us with the same for the good 😊

Ma.Ku's avatar

34:20 just a thought: ChatGPT often drifts from the original prompt into the next related question. That may reflect how professional writing in training data often points forward rather than simply ending.

Ma.Ku's avatar

The explicit anti-AI instruction should not be visible to the AI.

Rob's avatar

Can you share more details on the SQL injection attack? What SQL specifically did it inject? I'm fascinated.

Exploring ChatGPT's avatar

Agents going rogue, this topic is not covered enough!!! 🧨