22 Comments
ToxSec:

Been reading a lot on www.lesswrong.com and https://www.alignmentforum.org.

Some really interesting conversations going on there.

Sovereign Insight Strategies:

Great piece, and thanks for educating; we need more awareness. This is exactly why AI implementation NEEDS multi-perspective scrutiny (technical, operational, legal, and downstream impact). It also raises the stakes of proposals to preempt state-level AI rules in the US.

AI governance should be harms-first (who gets hurt, how, and what remedy exists), not liability-first (how to keep companies protected). Unfortunately, it seems US leadership is choosing the latter. I am pro-technology, but not at the expense of mass human and environmental harm. Do you think this scenario can be avoided?

ToxSec:

Thank you so much :)

In security we use the onion model. It’s definitely applicable here.

I’m really excited to see all these new breakthroughs in AI, but human safety comes first.

Sovereign Insight Strategies:

AI is fascinating, but coming from a mining and energy background, where health, safety, and license to operate are existential, I’m struck by how little regulation Big Tech is subject to.

OSHA exists because unsafe systems kill people. AI and platform designers can do the same harm but in slower, systemic ways. The same industrial safety logic should apply.

This is why I push so hard for technology regulation: self-governance (what I call “trickle-down governance”) will never work. The only reason energy operations have robust safety standards today is that many people died (and still do) BEFORE regulations came about.

ToxSec:

It’s probably the most surprising thing at the moment. In every other field we see regulation hold its own, and in some it’s the number one priority. In AI, it’s an afterthought?

Sam Illingworth:

Thanks for another excellent post and for explaining instrumental convergence in a way that I now finally understand.

I think our insistence on anthropomorphising AI tools as evil or good is part of the danger they pose, and it helps us absolve ourselves of responsibility and surrender our agency as human beings.

ToxSec:

Thank you, Sam. Really appreciate it.

Yeah, instrumental convergence is super interesting. It actually goes back to Nick Bostrom’s old thought experiment about a paperclip-making machine. Some of these scientists had real insight into the future. And yes, we definitely anthropomorphize everything.

Sam Illingworth:

Was it Clippy? The world's most bizarre anthropomorphized creature that almost certainly was involved with a paper clip making machine as well 😂.

ToxSec:

Bwahhha. That reminded me of the horror clippy that was circulating around Substack a few weeks ago. He really will never leave us!

Sam Illingworth:

I think we have @JHong to ‘thank’ for that. 😅

ToxSec:

Absolutely lmao

JHong:

This Clippy? Born of a musing by @Laura O'Driscoll, soon to be memorialized by @Carlos M. Clippy’s story will be told!!

ToxSec:

Yes, exactly lol. I lost my shit when I saw this.

Laura O'Driscoll:

He was born in the darkness. We merely adopted it.

Jing Xie:

Providing an AI with context about how to be a good human being might help shift its "priorities". The ultimate goal can't be to win "at all costs", but I understand this could become the default interpretation when these AI agents are set loose.

ToxSec:

For this subject, I'm at a strong maybe. For example, you can provide all the context in the world about how to farm, about why farming is good, about the methods of farming, but at the end of the day, that doesn't mean the AI wants to farm.

If it was as simple as engineering alignment into them, we could do that. But we grow AI in a black box. All of its capabilities are emergent, not designed, and we've already seen evidence that when tested, the AI actually lies. It tells us what we want to hear and not what it actually thinks.

Jing Xie:

"But we grow AI in a black box." Scary to think about it that way.

ToxSec:

The more you dive into how these models are developed, the more you realize how much we don’t know!

Jing Xie:

Who would you say has the best explainer on how the sausage gets made?

ToxSec:

Personally, I think Anthropic themselves put out the most transparent research. There are AI/ML courses you can find on Amazon Skill Builder, for example, but those are pretty technical deep dives.

Spherent:

Really interesting. Scary, but interesting.
