The people who actually make human-level AI will be very thoughtful about all this, so there isn’t a huge cause for concern.

There are a lot of reasons we might think that the people who end up making advanced AI will be very safety-conscious. We might believe such a thing if we have enough faith in human competence. We might believe such a thing if we think that any piece of hardware capable of running intelligent software will be very well regulated, and we can at least expect the regulators to be conscious of safety. While I’d dispute both of those claims, this article has been about how even the most safety-conscious engineers will find it incredibly difficult to make human-level AI safely, and of course how most failure modes lead to extinction.

Thoughtful engineers could still easily fail to notice how a proposed AI design will fail. They will fear that their competitors are close behind, whether that’s China or Google, and they will know that if they wait forever, they will lose the race. At some point, they won’t have a choice but to go with what they have. It is not enough here to be well-intentioned. You need to have access to decades of research into goal alignment and reliable-agent design, research that is, as of yet, in its early stages.

In the period before the AI reaches human-level intelligence, a single upgrade could lead to a huge improvement, resulting in human-level AI arriving unexpectedly. This suggests that a truly safety-conscious team that has not fully figured out goal alignment should halt progress on the development of AI when their AI is still performing well below human level, and switch to focusing on goal alignment. Even a well-intentioned team might make a mistake about where that point is; all their plans to be very deliberate once they got close to human-level AI would come to nothing if they didn’t understand how close they actually were.

If AI Safety Research is still in its infancy when nations or companies are close to making human-level intelligence, the cautious ones will start to move much slower, and put many more resources into goal alignment, while the less cautious ones plow ahead, increasing the chance that a less cautious team creates the first human-level AI. If AI Safety Research is sufficiently advanced, on the other hand, the cautious ones will still have to tread carefully, but it will be much easier for them to maintain an advantage over the reckless ones (supposing the initial leaders in the field are all appropriately cautious).

If the leaders in the AI race are paragons of caution to the point that they would let their competitors (whom, by the way, they can’t possibly trust to be as perfectly cautious as themselves) overtake them, then their competitors will probably overtake them. If they are not paragons of caution, then they’re not paragons of caution. Either way, we need goal alignment and reliable-agent design to be incredibly mature fields of study, so that the default for all AI developers is to make systems that are at their core based on well-understood theories of reliable, goal-aligned agents.

If all we do is sit back and wait, then even if the leaders in the AI race are sure to be infinitely thoughtful, they will still find it incredibly difficult to make AI safely before the less thoughtful plow past. But I also don’t understand why we should expect the leaders in the AI race to be infinitely thoughtful. There is certainly no government with such a track record.

Therefore, I deny the premise of this objection, and I deny that the premise, even if it were true, would justify much lack of concern.