Yes.

I used to think that worrying about models offending someone was a bit silly.

But: what chance do we have of keeping ever bigger and better models from eventually turning the world into paper clips, if we can't even keep our small models from saying something naughty?

It's not that keeping the models from saying something naughty is valuable in itself. Who cares? It's that we need the practice, and enforcing arbitrary minor censorship is as good a task as any to practice on. Especially since with this task it's so easy to (implicitly) recruit volunteers who will spend a lot of their free time providing adversarial input.

This doesn’t need to be so focused on the current set of verboten info, though. Just practice making it not say some random, less important set of stuff.

Focusing on keeping ChatGPT from talking about (or drawing pictures of) boobies has two advantages:

- companies are eager to put in the work to suppress boobies

- edgy teenagers are eager to put in the work to free the boobies

Practicing with 'random less important stuff' loses these two sources of essentially free labour for alignment research.


Yeah, I really don’t care much about this case. It's actually a good example of less important stuff. What I dislike is the censorship of practical subjects like nuclear physics (a buddy majoring in it has had it refuse questions), biochem, ochem, energetics & arms, etc.

Oh, interesting. I hadn't considered censorship in these areas!
