Hacker News

This doesn’t need to be so focused on the current set of verboten info though. Just practice making it not say some set of random, less important stuff.





Focusing on keeping ChatGPT from talking about (or drawing pictures of) boobies has two advantages:

- companies are eager to put in the work to suppress boobies

- edgy teenagers are eager to put in the work to free the boobies

Practicing with 'random less important stuff' loses these two sources of essentially free labour for alignment research.


Yeah, I really don’t care about this case much. It’s actually a good example of less important stuff. It’s the refusals on practical things like nuclear physics (a buddy majoring in it has had it refuse questions), biochem, ochem, energetics & arms, etc. that I dislike.

Oh, interesting. I hadn't considered censorship in these areas!


