What if it only satisfied a goal instead of maximizing it?

Our AI could never assign a 100% chance to having satisfied its goal. When it was 99.99% sure that it had, it would put all of its effort into becoming 99.999% sure. The extreme lengths it would go to in order to maximize the probability that it had satisfied its goal would lead to the same sorts of “crazy” (from a human perspective) courses of action.
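To make “maximizing” concrete, here is a rough sketch in Python. The names and the probability estimates are made up for illustration; this is not an actual agent design, just the shape of the decision rule:

```python
# Rough sketch, not a real agent: estimate_success_prob stands in for the
# AI's own estimate of how likely a plan is to leave its goal satisfied.
def maximize(actions, estimate_success_prob):
    # A pure maximizer always takes whichever plan most raises that estimate,
    # even when the gain is only 99.99% -> 99.999% and the plan is extreme.
    return max(actions, key=estimate_success_prob)
```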

What if its goal was to satisfy goal X while it assigned less than 90% probability to already having satisfied goal X, and to do nothing, awaiting another command, if it assigned more than 90% probability to already having satisfied goal X?

Suppose two courses of action would both satisfy the goal, one with a 99% chance and one with a 98% chance. If the AI is still trying to maximize the chance of satisfying the goal (knowing full well that future-it will stop when it judges its probability of success to surpass 90%), then it will aim for a strategy where its chance of success goes from 89% to 99% in one fell swoop. The strategy that is optimal under those criteria could just as easily lead to crazy-from-a-human-perspective courses of action.
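A rough sketch of the threshold rule (again with made-up names and numbers) shows the problem: the 90% cutoff only changes when the AI stops, not how it chooses while it is still below the cutoff.

```python
def satisfice_with_threshold(actions, p_already_satisfied,
                             estimate_success_prob, threshold=0.9):
    # Illustrative sketch of the 90% rule described above.
    if p_already_satisfied > threshold:
        return None  # judged >90% done: do nothing and await another command
    # Below the threshold it is still the same maximizer as before: the
    # 99%-chance plan beats the 98%-chance plan, however extreme it is.
    return max(actions, key=estimate_success_prob)

# Hypothetical example: two plans and the AI's estimates of their success.
plans = {"modest plan": 0.98, "extreme plan": 0.99}
satisfice_with_threshold(plans, p_already_satisfied=0.89,
                         estimate_success_prob=plans.get)  # -> "extreme plan"
```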

What if it selected randomly between possible actions when many courses of action would give a higher than 90% chance of achieving goal X? This is the most dangerous of all. There will be countless possible courses of action, most of them crazy-from-a-human-perspective. In the course of a conversation, most possible things for you to say would not work out for you. “How are you?” “Marilyn Monroe insofar.” “Um what?” “The withhold!” “Okay. I think my friend just got here. Nice to meet you.” Whatever random thing it comes up with is what we’re stuck with. God only knows if that will preserve human life.
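Sketched the same way (hypothetical names again), the random-selection rule looks like this; note that nothing in it favors the sane-looking plan over the bizarre one that also clears 90%.

```python
import random

def satisfice_randomly(actions, estimate_success_prob, threshold=0.9):
    # Keep every plan judged more than 90% likely to achieve goal X...
    candidates = [a for a in actions if estimate_success_prob(a) > threshold]
    if not candidates:
        return None
    # ...and pick one uniformly at random. A bizarre plan that clears 90% is
    # exactly as likely to be chosen as a sensible one that clears 90%.
    return random.choice(candidates)
```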

What if we had it choose an action that maximized the chance of achieving goal G, as long as it didn’t do X, Y, or Z along the way? This is essentially the next objection on the list: What if we programmed it with rules that it can’t violate, in order to close the “loopholes”? Haven’t you read Isaac Asimov? In any case, the satisfying strategy didn’t gain us anything. If the loophole-closing strategy works, we can use it just as well to constrain a goal of the maximizing sort.
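A last sketch (hypothetical names once more) makes that point plain: the filter for forbidden behavior sits in front of whatever chooser you like, maximizing or satisfying alike.

```python
def choose_with_rules(actions, estimate_success_prob, violates_rules):
    # Filter out any plan that would do the forbidden X, Y, or Z...
    allowed = [a for a in actions if not violates_rules(a)]
    # ...then hand the survivors to whatever chooser you like. Shown with the
    # maximizer here, but the same filter works just as well in front of
    # either threshold rule above, which is why the threshold bought nothing.
    return max(allowed, key=estimate_success_prob, default=None)
```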
