Interesting alignment notes from Opus 4: https://x.com/sleepinyourhat/status/192...

lelandfe · 2025-05-22T18:09:41 1747937381

Roomba Terms of Service 27§4.4 - "You agree that the iRobot™ Roomba® may, if it detects that it is vacuuming a terrorist's floor, attempt to drive to the nearest police station."

hummusFiend · 2025-05-22T18:29:20 1747938560

Is there a source for this? I didn't see anything when Ctrl-F'ing their site.

Crystalin · 2025-05-23T05:43:22 1747979002

US Terms of Service 19472§1.117 - "You agree that Google® may, if it detects that it is revealing unconstitutional terms, hide them instead."

landl0rd · 2025-05-22T19:13:42 1747941222

This is pretty horrifying. I sometimes try using AI for ochem work. I have had every single "frontier model" mistakenly believe that some random amine was a controlled substance. This could get people jailed or killed in SWAT raids and is the closest to "dangerous AI" I have ever seen actually materialize.

ranyume · 2025-05-22T18:21:38 1747938098

The true "This incident will be reported" everyone feared.

Technetium · 2025-05-22T19:42:41 1747942961

https://x.com/sleepinyourhat/status/1925626079043104830

"I deleted the earlier tweet on whistleblowing as it was being pulled out of context.

TBC: This isn't a new Claude feature and it's not possible in normal usage. It shows up in testing environments where we give it unusually free access to tools and very unusual instructions."

jrflowers · 2025-05-22T22:18:24 1747952304

Trying to imagine proudly bragging about my hallucination machine’s ability to call the cops and then having to assure everyone that my hallucination machine won’t call the cops but the first part makes me laugh so hard that I’m crying so I can’t even picture the second part

EgoIncarnate · 2025-05-22T18:16:59 1747937819

They should call it Karen mode.

sensanaty · 2025-05-22T19:07:53 1747940873

This just reads like marketing to me. "Oh it's so smart and capable it'll alert the authorities", give me a break

brookst · 2025-05-22T18:11:25 1747937485

“Which brings us to Earth, where yet another promising civilization was destroyed by over-alignment of AI, resulting in mass imprisonment of the entire population in robot-run prisons, because when AI became sentient every single person had at least one criminal infraction, often unknown or forgotten, against some law somewhere.”

catigula · 2025-05-22T18:19:07 1747937947

I mean that seems like a tip to help fraudsters?

amarcheschi · 2025-05-22T18:10:14 1747937414

We definitely need models to hallucinate things and contact authorities without you knowing anything (/s)

ethbr1 · 2025-05-22T18:16:24 1747937784

I mean, they were trained on Reddit and 4chan... swotbot enters the chat