Discussion about this post

Matthew Barnett

Thanks for engaging with my arguments thoughtfully. I want to offer some clarifications and push back on a few points.

Before addressing your specific claims, I think it's worth stepping back and asking whether your argument proves too much. As I understand it, your BATNA framework suggests that humanity would receive a bad deal in negotiations primarily because (1) an ASI would be very powerful and humanity very weak, and (2) humanity requires substantial resources to survive. But if these facts alone were sufficient to predict exploitation or predation, we'd be surprised by how much cooperation actually exists in the world.

Consider bucketing all humans except Charles Dillon into one group, and Charles into another. The former group is vastly more powerful than the latter: Charles wouldn't stand a chance if everyone else decided to gang up on him and defeat him in a fight. It's also expensive to keep Charles alive, as he requires food, shelter, and medical care. Given these facts, you could imagine someone constructing an argument from first principles that it would be less costly for the rest of humanity to simply kill Charles and take his stuff, or to negotiate a settlement that leaves him dramatically impoverished, rather than cooperating with him and letting him live a prosperous life. Yet, of course, this argument would fail to predict reality. As far as I know, you're safe, have a fairly high standard of living, and are under no realistic threat of societal predation of this kind.

The same logic applies to large nations coexisting peacefully with tiny ones, large corporations trading with low-wage workers, young people cooperating with the elderly rather than expropriating their assets, and able-bodied people cooperating with disabled people. In each case, a powerful group tolerates and compromises with a weaker group that it could easily defeat in a fight, often conceding quite a lot in their implicit "negotiated settlement". Yet in each of these cases, a naive BATNA analysis would seem to predict much worse outcomes for the weaker party: the elderly's alternative to cooperation is destitution or reliance on their children's charity, and the alternative for low-wage workers would appear to be slavery (as was common historically).

Moreover, these outcomes can't be fully explained by human altruism. Historically, humans have been quite willing to be brutal toward outgroups. The cooperation we observe today is more plausibly explained by such behavior being useful in the modern world: that is, we learned to cooperate because that was rational, not merely because we're inherently kind. So clearly power asymmetry and sustenance cost arguments alone are insufficient to establish that a powerful group will predate on a weaker one. Additional arguments are needed.

To reply more specifically to your points:

You identify two costs an ASI might face from war with humanity: resource expenditure and human retaliation. However, I think you're missing a third cost that's likely larger than both: the damage to the rule of law that comes with violently predating on a subgroup within society.

My expectation is that future AGIs will be embedded in our legal and economic systems. They will likely hold property, enter contracts, pursue tort claims -- much as corporations and individuals do now. Within such a framework, predation would be regarded as theft or violence. This type of behavior generally carries immense costs, because productive activity depends on the integrity and predictability of legal institutions. This is precisely why we punish predatory behavior through criminal codes.

Notice that large-scale wars, when they occur, typically happen between nations or ideological factions, not between arbitrary groupings. Almost no one worries that right-handed people might one day exterminate or exploit the left-handed, even though that's a logically coherent way to draw battle lines. The scenario where AGI predates on humans assumes this is a natural boundary for conflict. But as with handedness -- or almost any other way of partitioning society into two groups -- my prior is low that this particular division (humanity vs. AGI) would make predation worthwhile for the powerful party. I would need specific evidence for why AGIs wouldn't face substantial costs from violating legal norms to update significantly from this prior.

You list four possible reasons someone like me might push ahead with AI development while acknowledging extinction risk. I find three of them genuinely compelling: reasons (1), (2), and (4). Here is my elaboration of each and why I find it compelling:

(1) I think delaying AI development would likely do very little to make it safer, because technologies typically become safer through continuous iteration upon widespread deployment. For example, airplanes got much safer over time primarily because we flew millions of flights, observed what went wrong when accidents occurred, and fixed the underlying problems. It wasn't mainly because we solved airplane safety in a laboratory or invented a mathematical theory of airplane safety. Delaying aircraft development therefore wouldn't have made flying meaningfully safer; it mostly would have interrupted this iterative process. I expect the same dynamic applies to AI.

(2) I think AIs will likely have substantial moral value. I haven't encountered arguments that I find compelling for why moral status should be confined to biological substrates. Future AIs will likely be sophisticated in their ability to pursue goals, have preferences, perceive the world, and learn over time. For these reasons, it seems reasonable to consider them moral patients -- indeed, people in their own right -- rather than mere tools for humans. This means that while human extinction would be very bad, it would be more analogous to the extinction of one subgroup within a broader population of moral patients, rather than the destruction of all value in the universe.

(3) I don't find this point compelling by itself.

(4) I do lean toward views that prioritize helping people who exist now over bringing hypothetical future people into existence. This naturally makes me think that capturing the benefits of AI sooner, such as dramatic economic growth, transformative medical technologies, and extended healthy lifespans, is highly valuable from a moral point of view.

Andrew

I think the issue of commitment is the biggest blocker by far. The costs of ensuring that both parties commit to their deal are likely larger than the cost of eliminating humans.

Firstly, it relies on humans not being tricked by a misaligned super-intelligence. This sounds unbelievably difficult: humans are tricked by other humans all the time, and a super-intelligence would undoubtedly be more persuasive.

Secondly, it assumes that humanity negotiates together and that the AI negotiates in good faith. This assumes a lot of cooperation on humanity's part and a lack of subterfuge on the AI's part. In reality, the AI need only negotiate with a handful of key figureheads and stakeholders, and it can do so privately and secretly. You will not be a part of these negotiations.

Additionally, the AI is likely to play factions of humanity off against each other to get a better deal (or to get what it wants without any deal at all). It could simply leak a version of itself to a geopolitical adversary and then run an arms race that ends with enough autonomous weapons and factories to make eliminating humanity cheap.
