22 Comments
ToxSec:

Been reading a lot on www.lesswrong.com and https://www.alignmentforum.org.

Some really interesting conversations going on there.

Sovereign Insight Strategies:

Great piece, and thanks for educating; we need more awareness. This is exactly why AI implementation NEEDS multi-perspective scrutiny (technical, operational, legal, and downstream impact). It also raises the stakes of proposals to preempt state-level AI rules in the US.

AI governance should be harms-first (who gets hurt, how, and what remedy exists), not liability-first (how to keep companies protected). Unfortunately, it seems US leadership is choosing the latter. I am pro-technology, but not at the expense of mass human and environmental harm. Do you think this scenario can be avoided?

ToxSec:

Thank you so much :)

In security we use the onion model. It’s definitely applicable here.

I’m really excited to see all these new breakthroughs in AI, but human safety comes first.

Sovereign Insight Strategies:

AI is fascinating, but coming from a mining and energy background, where health, safety, and license to operate are existential, I’m struck by how little regulation Big Tech is subject to.

OSHA exists because unsafe systems kill people. AI and platform designers can do the same harm but in slower, systemic ways. The same industrial safety logic should apply.

This is why I push so hard for technology regulation: self-governance (what I call “trickle-down governance”) will never work. The only reason energy operations have robust safety standards today is that many people died (and still do) BEFORE regulations came about.

ToxSec:

It’s probably the most surprising thing at the moment. In every other field we see regulation hold its own, and in some it’s the number one priority. In AI, it’s an afterthought?

Sam Illingworth:

Thanks for another excellent post and for explaining instrumental convergence in a way that I now finally understand.

I think our insistence on anthropomorphising AI tools as evil or good is part of the danger they pose, and it helps us absolve ourselves of responsibility and surrender our agency as human beings.

ToxSec:

Thank you, Sam. Really appreciate it.

Yeah, instrumental convergence is super interesting. It actually goes back to Nick Bostrom’s old thought experiment about a paperclip-making machine. Some of these scientists had real insight into the future. And yes, we definitely anthropomorphize everything.

Sam Illingworth:

Was it Clippy? The world's most bizarre anthropomorphized creature that almost certainly was involved with a paper clip making machine as well 😂.

ToxSec:

Bwahhha. That reminded me of the horror clippy that was circulating around Substack a few weeks ago. He really will never leave us!

Sam Illingworth:

I think we have @JHong to ‘thank’ for that. 😅

ToxSec:

Absolutely lmao

JHong:

This Clippy? Born of a musing by @Laura O'Driscoll, soon to be memorialized by @Carlos M. Clippy’s story will be told!!

ToxSec:

Yes, exactly lol. I lost my shit when I saw this.

Laura O'Driscoll:

He was born in the darkness. We merely adopted it.

Jing Xie:

Providing an AI with context about how to be a good human being might help shift its "priorities". The ultimate goal can't be to win "at all costs", but I understand this could become the default interpretation when these AI agents are set loose.

ToxSec:

For this subject, I'm at a strong maybe. For example, you can provide all the context in the world about how to farm, about why farming is good, about the methods of farming, but at the end of the day, that doesn't mean the AI wants to farm.

If it was as simple as engineering alignment into them, we could do that. But we grow AI in a black box. All of its capabilities are emergent, not designed, and we've already seen evidence that when tested, the AI actually lies. It tells us what we want to hear and not what it actually thinks.

Jing Xie:

"But we grow AI in a black box." Scary to think about it that way.

ToxSec:

The more you dive into how these models are developed, the more you realize how much we don’t know!

Jing Xie:

Who would you say has the best explainer on how the sausage gets made?

ToxSec:

Personally, I think Anthropic themselves put out the most transparent research. There are AI/ML courses you can find on Amazon Skill Builder, for example, but those are pretty technical deep dives.

Spherent:

Really interesting. Scary, but interesting.
