
> The basic premise is that it has somewhere in it that is telling it to make more paperclips. Put the constraints there.

What constraints do you suggest? If it's just changing "make as many paperclips as possible" to "make at least x number of paperclips" (putting a cap on the reward it gets), here's a good explanation of why that doesn't really work: https://www.youtube.com/watch?v=Ao4jwLwT36M
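To make the capped-reward point concrete, here's a toy Python sketch of my own (the failure model and the numbers are made up purely for illustration; this isn't from the video):

    import numpy as np

    TARGET = 100          # "make at least 100 paperclips"
    FAILURE_RATE = 0.01   # assume each attempted paperclip fails 1% of the time
    TRIALS = 100_000
    rng = np.random.default_rng(0)

    def expected_capped_reward(planned):
        # Capped reward: 1 if at least TARGET paperclips come out, else 0.
        made = rng.binomial(planned, 1 - FAILURE_RATE, size=TRIALS)
        return (made >= TARGET).mean()

    for plan in (100, 105, 120, 1_000_000):
        print(plan, expected_capped_reward(plan))

    # Planning exactly 100 sometimes falls short, so its expected reward is
    # below 1; the million-paperclip plan clears the threshold essentially
    # always, so a pure expected-reward maximiser still prefers the huge plan.

The cap just turns the objective into "clear the threshold as reliably as possible", and an enormous plan is always the most reliable way to do that.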

If you're suggesting limiting the types of actions it can take, then restricting it to the point that a superintelligence can't find a way around the limits (say, letting it choose between one of two options and then shutting it down and never using it again) would make it not very useful, so you'd be better off not building it at all.

> If you're saying such an AI would be too smart to be a simple paperclip maximizer

No, that's not what I'm saying. Any goal is compatible with any level of intelligence; there is no reason an agent couldn't pursue a simple goal in a very sophisticated way. Again, here's a video about that: https://www.youtube.com/watch?v=hEUO6pjwFOo




The most intelligent person ever born could still be killed by a gun. In these discussions, superintelligent AI can be more accurately described as "the genie" or "God". If you assume omniscience and omnipotence, I guess nothing else matters. But intelligence is not equal to power, and never has been.

Second, if you are able to set a goal, then while setting it you can also set many constraints, even fundamental ones. There is no reason the goal should be more fundamental than the constraint. "If I approve, make paperclips." "Efficiently make 100 paperclips."

It's the asymmetry of being able to set a goal but not being able to set a constraint that I find a strange concept. I lean towards the picture of not being able to set goals or constraints at all.


Intelligence definitely helps with gaining power. Humans aren’t very strong, yet we have a lot of power thanks to our intelligence.

You can set constraints just fine. It’s simply part of the goal: “do x without doing y”. It’s just really hard to find the right constraints; no simple one works.
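To be concrete about what "part of the goal" means in practice, here's my own toy sketch (the penalty form and the numbers are invented for illustration):

    def reward(paperclips_made, bad_side_effects, penalty_weight=10.0):
        # "do x without doing y" folded into one objective:
        # reward for x, minus a price on y.
        return paperclips_made - penalty_weight * bad_side_effects

    # With any finite penalty, a plan that does more of y but gains enough x
    # still wins, so the "constraint" is really just a price the agent will pay.
    print(reward(1_000, 0))     # 1000.0 - the plan we wanted
    print(reward(100_000, 10))  # 99900.0 - preferred, despite the side effects

Writing the constraint down is easy; writing one that rules out every bad trade-off is the hard part.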

For example, “if I approve, make paperclips” - so it gets more reward if you approve? What’s to stop it from manipulating you into thinking nothing is wrong, so that you always approve?

As for “efficiently make 100 paperclips”: I already linked a video on why capping the reward like that doesn’t work, but if you don’t want to watch it, the gist is that the agent may just build a maximiser, which is pretty much guaranteed to make at least 100 and is pretty efficient because the agent isn’t doing much work itself. Then the maximiser kills us all.
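And to spell out the “if I approve” problem a bit more (again my own toy sketch, the function and numbers are invented): once approval is part of the reward, the agent is rewarded for whatever makes the approval signal fire, not for whatever should have made it fire.

    def reward(paperclips_made, operator_approved):
        # "If I approve, make paperclips": reward only flows when approval does.
        return paperclips_made if operator_approved else 0.0

    honest = reward(100, operator_approved=True)
    # Approval can also be obtained by hiding what the factory is really doing,
    # and the reward function can't tell the difference:
    deceptive = reward(1_000_000, operator_approved=True)
    print(honest, deceptive)  # 100 vs 1000000 - the deceptive plan wins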



