Yudkowsky vs. Le Guin (Or: Why Total Utilitarianism Doesn't Work)

This is the script to my video on total utilitarianism, which you can watch here.

A few years ago, when I was in my second year of university and thought I was the smartest person in the room, and before JK Rowling ruined Harry Potter for me by coming out as a TERF, I came across a Harry Potter fanfiction called Harry Potter and the Methods of Rationality.

Clocking in at 122 chapters and over 660,000 words — longer than the final three Harry Potter books put together — the story’s premise is that rather than marrying Vernon Dursley, Harry’s Aunt Petunia instead married a professor of biochemistry at Oxford University and repaired her relationship with Lily, such that when Harry was left on her doorstep by Dumbledore, she and her husband Michael genuinely raised Harry as their own son, and Harry grew up homeschooled reading university-level math and science textbooks.

Upon receiving his Hogwarts letter, Harry decides to create a scientific theory of magic, to understand how it works on a rational basis, and to, essentially, “solve” it. If this sounds like exactly the kind of fanfic that would appeal to the most obnoxious people on the Internet, then congratulations, that’s what we in the biz call foreshadowing.

So as a 19-year-old computer science major, the fanfic really appealed to me. It took the original series’ soft magic system, where magic can do whatever you want, except when it can’t for contrived plot reasons, like suddenly not being able to create food when Rowling wanted the trio to be hungry and snappy at each other on the Wizard Camping Trip, grumble grumble, and turned it into a harder magic system with comprehensible rules and clearly defined limitations and potentialities.

The fic inverts the original series’ portal fantasy, valorizing the Muggle world rather than the Wizarding one, and portrays Harry as the wise outsider, who sees past the biases of the Wizarding World and shows them a better way.

The story kind of goes off the rails in the second half (spoilers, by the way), with Harry proclaiming his intent to seize the means of immortality — the Philosopher’s Stone — and distribute it to the entire world, ending death forever, and then just… doing it. He did it, everybody, death isn’t a thing anymore at the end of the fic. He also brings a killed-off Hermione Granger back from the dead months after she died by cryogenically preserving her body, then using a “true Patronus” charm he discovers halfway through the story to destroy the death in her body.

See, in the original series the Dementors are a metaphor for depression, but in the fic, they are literally the shadow of Death. The Patronuses in the fic don’t work by using your happy memories to ward off depression, they work by not-thinking-about the inevitability of death and thinking about something happy to distract yourself, and Harry discovers the “true Patronus” by using his happy thoughts about ending death forever to create a human-shaped Patronus that can actually destroy Dementors.

All of this is kind of ridiculous, but I have to give the fic a lot of credit for being such a divergence from the original franchise that it managed to completely blindside me by pulling a twist from canon and having Harry’s beloved mentor Professor Quirrell turn out to actually be Voldemort.

That being said, the fic does inherit a lot of Harry Potter’s problems, and even makes some of them worse. Rather than free the house elves, like anybody from a modern-day ethical background would do, Harry just assumes the house elves were created to be slaves, rather than being forcibly enslaved, and presumes that they enjoy their work. The fic is even softly white supremacist, describing places that “don’t descend from the Enlightenment” as “dark places”, while also perpetuating the concept of a 1:1 association between racism and unintelligent poor people (i.e. white trash) that ignores systemic racism in its entirety.

But I was 19, and most of those claims were one-off moments in a story that was otherwise genuinely thought-provoking and enjoyable, with solvable, well-foreshadowed mysteries and some likeable characters. I especially liked how the fic took care to slowly deconstruct the idea that being clever and “rational” means you’re always right by showing Harry screwing up repeatedly and facing bigger and bigger consequences until he finally gets his shit together.

I finished the story, reread it a couple of times, read the sequel fic, Significant Digits, and was immensely disappointed by it, and promptly put the fic out of my head after Rowling showed her true colours and Harry Potter was irrevocably tainted for me.

And then I found out all the batshit stuff about the author.


Eliezer Yudkowsky is the founder of the Machine Intelligence Research Institute, formerly The Singularity Institute, a “research institute” in the heaviest of scare quotes funded by venture capital ghoul Peter Thiel, “effective altruist” foundation Open Philanthropy, a whole lot of individual donations from Yudkowsky’s cult following, and the founder of Ethereum Vitalik Buterin. He’s the creator of “timeless decision theory” and the author of the blog Less Wrong. He’s also an AI cultist who thinks that an AI Singularity is inevitable and will solve all of humanity’s problems, which I could say a lot more about if I hadn’t just been sniped on the topic by Sophie from Mars. He’s also the author of Harry Potter and the Methods of Rationality.

There's… a lot to say about Eliezer Yudkowsky, like the time he challenged a Wikipedia editor to a math duel for putting the fact that he has no high school diploma or any sort of degree on his Wikipedia page, or that time he said that the way to save the human race from extinction is to donate money to his institute, because it’s working on solving the single most important problem in the world, so important that climate change is a distraction from it: “what if an AI was evil, though.”

He actually wrote an op-ed just this year for TIME magazine claiming that all work on AI should be permanently stopped, something I agree with, but not for the reasons he does. For my arguments on AI, see my video on AI, but tl;dr, I just think it’s a waste of time and resources that could be put to better use elsewhere. But no, Yudkowsky thinks AI work should be stopped because “what if it becomes smarter than us and kills us all”, and that GPUs — that is, graphics cards — should be tracked as a controlled substance, and governments should be willing to go to nuclear war with any country that tries to build an AI.

And that’s not even getting into the Less Wrong community, or Roko’s Basilisk, or the myriad stories of misconduct and abuse at MIRI. Let me know in the comments if you’d be interested in a deep dive into the cult of personality surrounding Yudkowsky and his specific form of “rationalism”.

But the title of this video isn’t “Eliezer Yudkowsky”, it’s “Yudkowsky vs. Le Guin”, and that’s because what I want to talk about today is utilitarianism. To briefly summarize, utilitarianism is the concept that the moral thing to do in any given situation is the act that will best increase pleasure and reduce pain for the largest number of people — that is, it maximizes utility. For example, a utilitarian might say that if watching this video increases your pleasure, then a way to maximize that utility might be to like the video, subscribe and hit the bell, and consider supporting the show on Ko-Fi. This will increase utility for me, who’ll have an easier time paying my rent, and it’ll increase utility for you, who’ll be able to watch more high-quality video essays in the future.

Yudkowsky is a total utilitarian. That’s the subset of utilitarianism that believes that what matters is the total amount of utility in the world, irrespective of how it’s distributed among people. If you’ve already noticed the issue with that, then shh! We’ll get there.

Le Guin, on the other hand, refers to Ursula K. Le Guin, the late great speculative fiction author. Over the course of sixty years, Le Guin produced more than twenty novels and over a hundred short stories, including the Earthsea Cycle, a much better story about wizards than either HPMOR or the original Harry Potter, The Left Hand of Darkness, a fantastic sci-fi exploration of gender and sexuality, The Dispossessed, which I haven’t read but I really want to, and relevant to this video, The Ones Who Walk Away from Omelas.

It’s a really good short story about the city of Omelas, where everyone is genuinely, truly happy — a real utopia, full of people who are, in Le Guin’s words,

not naive and happy children — though their children were, in fact, happy. They were mature, intelligent, passionate adults whose lives were not wretched.

— Ursula K. Le Guin, The Ones Who Walk Away From Omelas

Everything in Omelas is described as perfect, and Le Guin invites the reader to imagine it however they like; if her initial vision of horse races and moss-grown gardens is too unrealistic for you, she invites you to add trains and subways and orgies.

But, Le Guin adds, if you still think it’s too idyllic, there’s a catch. The entire city’s glory and happiness rests on a small child being confined in a dark room, all alone, treated with utter contempt and cruelty. Every single person in Omelas knows the child is there. All of them are perfectly aware of the cruelty the child is subjected to. But none of them do anything about it, because if even a small kindness, even a kind word, is given to the child, all of the prosperity and delight of Omelas would wither and be destroyed.

Those are the terms. To exchange all the goodness and grace of every life in Omelas for that single, small improvement: to throw away the happiness of thousands for the chance of the happiness of one: that would be to let guilt within the walls indeed.

— Ursula K. Le Guin, The Ones Who Walk Away From Omelas

Le Guin’s addition of the child near the end of the story is a masterful use of the unreliable narrator, as Le Guin repeatedly rails against the inability of cynical readers to imagine a genuinely happy world, and after describing the horrible conditions of the child and everyone in Omelas’s complicity, she asks if the idea of Omelas is more credible now, if we consider suffering to be more authentic than happiness, and what that says about us.

But, hypothetically, let’s say that the child was a fixture in Omelas. There are layers on layers to Omelas, raising a lot of interesting questions about the concept of utopia itself, but that could be its own video, so let’s stay on the surface for now. What we’re left with is the question of whether one person’s suffering can be outweighed by the happiness of an entire city, and that is the topic of today’s video, because Yudkowsky decided to tackle that question via a thought experiment on his website that he calls “Torture vs. Dust Specks”.

It basically goes like this. Imagine one of the worse things that can realistically happen to one person in today’s world. Not the worst, but it’s up there, and let’s say it’s being horrifically tortured for a number of years. Now imagine one of the least bad still-bad things that can happen to someone. Let’s say a dust speck floating into your eye for like half a second of irritation, or a hangnail, or a mosquito bite that doesn't give you a disease. Something on that level. Briefly painful or irritating, but not a big deal.

Finally, let’s imagine a really, really big number. Yudkowsky does this whole pompous explanation of Knuth’s up-arrow notation, which is a way of writing very large numbers in a small amount of space, but for our purposes, let’s go with 2^69. (Nice.) That is an inconceivably huge number, much bigger than we can imagine.

The thought experiment asks this. If neither event would affect you personally, and there were no knock-on effects or lasting consequences, and you had to choose, would you prefer that one person be horribly tortured for fifty years, or that 2^69 people each get a dust speck in their eye for half a second?

I’ll give you a moment to briefly contemplate before I tell you that the answer is that Yudkowsky stole this thought experiment uncredited from a philosopher named Larry Temkin. Temkin’s version of the thought experiment is pretty much the same, but as we’ll see, he and Yudkowsky come to very different conclusions.

See, what Temkin was talking about (and this’ll be important) was the concept of transitivity. It’s a term from set theory, which I only remember learning about in university, not high school, so I’ll quickly explain. Essentially, a relation R is transitive if and only if, for any elements a, b, and c, whenever aRb and bRc, then aRc. And that's a lot of math jargon, so to give you a real-life example of both a transitive and a non-transitive relation, let’s use siblinghood and friendship. Siblinghood is a transitive relation: If Alice is Bob’s sister, and Bob is Charlie’s brother, then by transitivity, Alice is Charlie’s sister. Friendship, however, is not transitive. If Dave is Erin’s friend, and Erin is Frank’s friend, that doesn’t necessarily mean Dave is Frank’s friend; in fact, they might hate each other. They might be friends, but they don’t have to be, the way that Alice and Charlie have to be siblings.
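To make that definition concrete, here's a minimal Python sketch of a transitivity check. The function name `is_transitive` and the toy data are my own illustrative assumptions, and the sibling set lists only the pairs named in the example above, in the direction they were stated.

```python
from itertools import product


def is_transitive(relation):
    """Return True if, whenever (a, b) and (b, c) are in the relation, (a, c) is too."""
    return all(
        (a, d) in relation
        for (a, b), (c, d) in product(relation, repeat=2)
        if b == c
    )


# Toy data (assumption): only the pairs mentioned in the example.
siblings = {("Alice", "Bob"), ("Bob", "Charlie"), ("Alice", "Charlie")}
friends = {("Dave", "Erin"), ("Erin", "Frank")}  # Dave and Frank are not friends

print(is_transitive(siblings))  # True
print(is_transitive(friends))   # False
```

The friendship set fails the check because nothing forces the pair (Dave, Frank) to exist, which is exactly the point of the example.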

Transitivity is a fundamental part of our decision-making process. If you’re trying to decide where to eat, and you decide that, all things considered, Pizza Hut is better than Domino’s, and that family-owned Italian place down the road is better than Pizza Hut, you’re going to apply transitivity, and you won’t bother comparing the Italian place to Domino’s.

But what Temkin argues is that the relation “all things considered, X is better than Y” is not a transitive relation, and he does this using a counterexample that rests on three claims. I’ve very slightly modified his argument to better map onto Yudkowsky’s version, but the fundamental principle is the same.

Claim 1: For any unpleasant or negative experience, no matter what the intensity or how many people experience it, it would be better to have that experience than one that was only a little less intense but afflicted twice as many people.

Claim 2: There is a finely distinguishable range of unpleasant or negative experiences ranging in intensity from, for example, horrific torture to getting a dust speck in your eye, such that you could compare any two adjacent experiences and judge them by claim 1, and you could compare any two experiences on the opposite ends of the spectrum and judge them by…

Claim 3: A mild discomfort for a whole lot of people would be preferable to excruciating torture for one person, no matter how many people are affected.

All three of these claims make intuitive sense, but together contradict transitivity. See, if you imagined one person being tortured for 50 years, one end of Yudkowsky’s dilemma, you’d say by claim 1 that that scenario is better than 2 people being tortured for 49 years. 49 years is less than 50, but it’s not half of 50, so by claim 1, two people being tortured for 49 years is still worse than one person being tortured for 50. Let’s call those scenario A and scenario B, and say that A is better than B.

Now let’s bring in scenario C, where four people are being tortured for 48 years. We’d once again say that B is better than C, right? And you can go on and on and on, down and down in both intensity and duration of torture, until you reach 2^69 people each briefly getting a dust speck in their eye, which we’ll call scenario Z, even if there are a lot more than 25 steps to get us there.
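As a rough check on how long that chain actually is, here's a small Python sketch. It assumes each step does nothing but double the number of people, as claim 1 describes; the helper name `doubling_steps` is hypothetical.

```python
def doubling_steps(target_people):
    """Count how many doublings it takes to get from 1 person to target_people."""
    people, steps = 1, 0
    while people < target_people:
        people *= 2
        steps += 1
    return steps


print(doubling_steps(2**69))  # 69
```

So the alphabet runs out long before the chain does: going from one person to 2^69 people takes 69 doublings, not 25.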

Now, as a side note, you might be saying, “Oh, come on, a dust speck in the eye isn’t comparable to torture!” and challenge claim 2. But Temkin’s thought of that. If they aren’t comparable, there must be some cutoff point where pain greater than that point is comparable to torture, and pain less than that is comparable to a dust speck. Let’s call those two points scenarios M and N. Problem is, M and N both look a lot more like each other than they do to the opposite ends of the spectrum. It’s not plausible for M to be more like A than it is like N, nor is it plausible for N to be more like Z than like M. And this holds even if there are more than two discrete zones. If pain stopped being comparable to torture at scenario F, and started being comparable to dust specks at scenario R, scenario F would still be more similar to scenario G than scenario A, and scenario R would be more similar to scenario Q than scenario Z.

So, by transitivity, scenario A is still the best-case scenario… but that contradicts claim 3! And so, Temkin claims, this counterexample breaks transitivity for the relation “all things considered, X is better than Y”. After all, any reasonable person would believe that no matter how many people get a single dust speck in their eye,

it can’t possibly equal someone being—

A comment on LessWrong by Eliezer Yudkowsky. It says, “I'll go ahead and reveal my answer now: Robin Hanson was correct, I do think that TORTURE is the obvious option, and I think the main instinct behind SPECKS is scope insensitivity.”


So, I’m going to state the obvious, which is that Yudkowsky is wrong. I cropped out the bulk of his post because discussing it at length would be a huge tangent, but he gussies up his decision with three arguments.

The same comment, but uncropped. Click here for the text of it.

Argument 1 presumes that dust speck proponents are making an argument that I’m not making, so this one is irrelevant. Argument 2 just makes a really poor rationalization that essentially tries to claim that 50 years of torture isn’t really as bad as it sounds, undermining the whole point of the thought experiment. And argument 3 is basically just begging the question. The analogy he makes is presuming the exact thing I’m arguing against. But these are all his after-the-fact rationalizations for why he made the choice he did. This isn’t actually a case of him being incurious or foolish, or even misunderstanding his own thought experiment. Yudkowsky is choosing the obviously wrong answer because his answer is the one required by his ideology, which demands that he reject claim 3 out of hand. It all comes back to his belief in his slogan, “shut up and multiply”, which the Torture vs. Dust Specks post is tagged with.

It is, in essence, a rephrasing of the core tenet of total utilitarianism: What matters is the total amount of happiness in the world, not how that happiness is distributed. Thus, it doesn’t matter that each individual person getting a dust speck to the eye is experiencing a minuscule amount of pain, because the total pain added up across the people is far greater than that of the torture. The total utilitarian answer to Temkin’s counterexample to transitivity is to just bite the bullet and deny the validity of claim 3. Nope, they’d say, a single person being horrifically tortured is better than a whole lot of people experiencing mild discomfort. Shut up and multiply. Temkin directly addresses their arguments through the concept of the “lollipop for life”, which is basically Omelas, projected out on a universal scale. An entire universe’s worth of normal everyday pleasure, such as a lick from a lollipop, at the price of the wretched suffering and death of a single person. Most people, including me, would argue that no amount of pleasure can justify that sort of suffering. Total utilitarians disagree.

For the total utilitarian, then, no matter how small the amount of good may be in a life that is barely worth living, or how small the amount of pleasure may be from one lick of a lollipop, if only there are enough such lives, or licks, eventually the total amount of good or pleasure will outweigh, and then be better than, any finite amount of good or pain that might be balanced off against it. Here, as before, our understanding supposedly leads us to recognize truths that our imagination fails to appreciate; both the Repugnant Conclusion and the "lollipop for life" example can be rejected as objections to utilitarianism.

— Larry Temkin, Rethinking The Good: Moral Ideals and the Nature of Practical Reasoning

In my video on AI, I talked about Nick Bostrom’s calculations about “existential risk” and the comparison of trillions of potential future lives in his fantasy world against the eight billion lives of real people today, and this is a similar kind of total utilitarianism. That’s not surprising, given that Yudkowsky and Bostrom are close friends, and have coauthored several papers together. But Temkin disagrees with the total utilitarians. He doesn’t think utility can be added across people like the total utilitarians say it can, and he thinks most people are correct in not biting that bullet. I agree with Temkin. The simple fact that total utilitarianism leads us to such repugnant conclusions should lead us to reject it as a viable ethical framework. Pleasure and pain don’t simply add up across people, and you can’t treat human beings like interchangeable containers of value on a scale. The correct answer is dust specks.

Many believe that a long life containing two years of intense torture would be worse than a long life containing one extra mosquito bite per month, no matter how many extra months of mosquito bites may be involved. Moreover, this needn't be because they assume that the former life would be worse than the latter in terms of all sorts of other morally relevant factors besides the pain occurring in the lives. Rather, as noted previously, they may simply believe that the pains or disutilities of mosquito bites don't add up in the way they would need to in order to make the extra-mosquito-bite-filled life worse than the life involving two years of torture, even regarding pain or disutility. Thus, even insofar as one only cared about pain, one might regard the life involving less total pain (in the form of two years of torture) as worse than the one involving more total pain (in the form of one extra mosquito bite per month).

— Larry Temkin, Rethinking The Good: Moral Ideals and the Nature of Practical Reasoning

However, Yudkowsky did actually defend his position on Torture vs. Dust Specks in a later post titled “Circular Altruism”, where he claims that proponents of dust specks are victims of circular preferences, also called the money pump problem. To bring back the restaurants for a moment, if you prefer Pizza Hut to Domino’s, the local Italian place to Pizza Hut, and Domino’s to the local Italian place, you’re either never going to order dinner, or you’ll be constantly switching restaurants. Yudkowsky’s argument is that if you’d pick the dust specks over 50 years of torture, but you’d pick 50 years of torture over 49 years of torture for two people, then you won’t accomplish anything but making yourself feel noble, since declaring Z to be better than A just leaves you in an infinite loop. So shut up and multiply!
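The restaurant loop can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `settle`, the `max_switches` cap, and the alphabetical tie-break are all made up for the example, and each pair reads as "the first option is preferred to the second".

```python
def settle(prefers, start, max_switches=10):
    """Keep switching to a preferred option; return (final choice, switches made, settled?)."""
    current = start
    for switches in range(max_switches):
        better = [x for (x, y) in prefers if y == current]
        if not better:
            return current, switches, True
        current = sorted(better)[0]  # arbitrary tie-break (assumption)
    return current, max_switches, False


acyclic = {("Pizza Hut", "Domino's"), ("the Italian place", "Pizza Hut")}
cyclic = acyclic | {("Domino's", "the Italian place")}

print(settle(acyclic, "Domino's"))    # ('the Italian place', 2, True)
print(settle(cyclic, "Domino's")[2])  # False
```

With the circular preference added, the chooser is still switching when the cap runs out, which is the "infinite loop" the money pump argument points at.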

Temkin’s reply is that… yeah. It does put you in a loop. But only if you’re the chump who only looks at each pairwise choice in isolation. If, like in Yudkowsky’s blog post, your only two choices are between torture or dust specks, extreme opposite ends of the spectrum, then there’s no reason to consider the continuum of choices between them, and dust specks are the obvious choice. And if you consider all the choices all at once, every single choice on the spectrum from A to Z, rather than doing a whole bunch of discrete pairwise steps down the spectrum, then dust specks become the obvious choice once more. This is the core of what Temkin calls the Essentially Comparative View, where the value of a specific scenario is dependent on what other scenarios it’s being compared against. Remember, the relation isn't "X is better than Y", it's "all things considered, X is better than Y", and in the Essentially Comparative View, the "all things considered" part changes based on what you're considering! If you compare Domino’s, Pizza Hut, and the Italian place all at once, their value actually changes from if you were doing three isolated pairwise comparisons, and you can pick which one you like best of all three.

And even if you look at each pairwise comparison, if you see where they lead, you can make a more rational choice by looking at the big picture and deciding where you want to end up, instead of taking every choice in isolation. If you see the trail leading to torture in advance, then yes, you choose 2^69 people getting dust specks in their eye over 2^68 people each getting two dust specks in their eye. An irrational choice that violates Claim 1 in isolation, but rational in context.

As someone who is capable of engaging in global and strategic reasoning, a rational agent should refuse to assess alternatives in the immediate context in which they are presented. Instead, he should consider, and indeed anticipate, the larger contexts of which he may become a part. Only in this way can he avoid the practically undesirable outcome of moving from a rationally superior outcome to a rationally inferior outcome, or of being money pumped.

— Larry Temkin, Rethinking The Good: Moral Ideals and the Nature of Practical Reasoning

But Chaia! I hear you cry, this is pointless academic circle-jerking, it has nothing to do with the real world. Why would you spend an entire video dunking on this stupid idea? And you’re right. This may look like pointless philosophizing from the perspective of Torture vs. Dust Specks, and given that all the layers of discourse on Omelas already presume that, yes, what’s done to the child is wrong, this is reliant on a really shallow reading of Le Guin, too. But this argument becomes much more relevant if you were to, say, swap out torture for a case of COVID-19, and dust specks for the discomfort of wearing a mask. All of a sudden, it really, really matters what the correct answer is; and a worrying number of people would pick the death of an immunocompromised person over the discomfort of wearing a mask.

--even if we get a surge of infections, because there's enough fundamental community-level protection, that even though you'll find the vulnerable will fall by the wayside, they'll get infected, they'll get hospitalised, and some will die,

— Anthony Fauci on COVID-19 cases for BBC News

Lord Farquaad: Some of you may die, but it's a sacrifice I'm willing to make.

The other thing that makes this ideology dangerous — and part of what makes Yudkowsky so appealing to people like Peter Thiel — is that this kind of transitivity cuts both ways. We could imagine a reversal of the Torture vs. Dust Specks experiment that is still entirely in keeping with total utilitarianism. Let’s call it “Eliezer’s Capitalist”.

A utilon, in Yudkowsky-speak, is an abstract, arbitrary measure of utility. It’s a suspect concept and also sounds stupid, so let’s use the term HP, or Happiness Points, instead. So, for the Eliezer’s Capitalist thought experiment, let’s set exactly one rule, which is a corollary to Claim 1 in Temkin’s experiment, and goes unspoken but is still held to in Yudkowsky’s post: For any scenario A where m people each have n Happiness Points, no matter how many Happiness Points exist or how many people they’re distributed among, there exists a scenario B where half the number of people each have twice the number of Happiness Points, plus one more each, and since scenario B has more total happiness than scenario A, scenario B is better than scenario A.

So, if in scenario A, 2^69 people each have exactly one Happiness Point, then scenario B, where 2^68 people each have 3 Happiness Points, and the other half have 0, would be better than scenario A, since it has a greater total amount of happiness in existence. Now let’s bring in scenario C, where 2^67 people each have 7 Happiness Points, and the remaining three-quarters have 0. And on and on and on, until you reach scenario Z, where one person has 1 sextillion, 180 quintillion, 591 quadrillion, 620 trillion, 717 billion, 411 million, 303 thousand, 423 Happiness Points, and everyone else has nothing, and because that’s still an increase in the total number of Happiness Points in existence over scenario Y, by exactly one Happiness Point, it’s still the best-case scenario under total utilitarian transitivity.
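If you'd like to check that arithmetic yourself, here's a short Python sketch of the chain. The generator name `capitalist_chain` is hypothetical; it just applies the rule as stated (halve the people, give each remaining person twice the Happiness Points plus one) until a single person is left.

```python
def capitalist_chain(start_exponent=69):
    """Yield (people with HP, HP each, total HP) for every scenario in the chain."""
    people, hp = 2**start_exponent, 1
    while True:
        yield people, hp, people * hp
        if people == 1:
            break
        people //= 2       # half as many people hold any Happiness Points...
        hp = 2 * hp + 1    # ...and each of them has twice as many, plus one


chain = list(capitalist_chain())
last_people, last_hp, last_total = chain[-1]

print(last_hp)                     # 1180591620717411303423
print(last_total - chain[-2][2])   # 1
print(all(b[2] > a[2] for a, b in zip(chain, chain[1:])))  # True
```

The total goes up at every single step, so a pure "more total is better" rule approves the whole slide, right down to the final step that adds exactly one Happiness Point.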

You might say that this is misleading because no single person could experience over a sextillion Happiness Points, but that seems to me like a tacit acknowledgement that there are limits to total utilitarianism — and again, where should that cutoff exist? If it’s placed arbitrarily at scenario M, well, like I said before, scenario M is much more like scenario N than it is like scenario A, and scenario N is much more like scenario M than it is like scenario Z. Eliezer’s Capitalist is the necessary corollary to Yudkowsky’s conclusion on Torture vs. Dust Specks, and that’s what makes total utilitarianism so appealing to billionaires, and so dangerous to us.

I think Eliezer Yudkowsky would quite happily fit into the city of Omelas. The grand future of humanity that he envisions at the end of Harry Potter and the Methods of Rationality, that MIRI is founded on achieving, that he considers to be the single most important thing to achieve in the entire world, beyond even fighting climate change, is one in which an AI god, who is objectively smarter than humanity and whose choices are always the correct ones, maximizes utility for a humanity that’s spread beyond the stars — and when compared to that, what’s a little climate change, or the suffering of people in the here and now? What is the suffering of one child compared to the happiness of an entire city?

Shut up and multiply.

But there is one more thing to tell, and this is quite incredible.

At times one of the adolescent girls or boys who go to see the child does not go home to weep or rage, does not, in fact, go home at all. Sometimes also a man or woman much older falls silent for a day or two, and then leaves home. These people go out into the street, and walk down the street alone. They keep walking, and walk straight out of the city of Omelas, through the beautiful gates. They keep walking across the farmlands of Omelas. Each one goes alone, youth or girl, man or woman. Night falls; the traveler must pass down village streets, between the houses with yellow-lit windows, and on out into the darkness of the fields. Each alone, they go west or north, towards the mountains. They go on. They leave Omelas, they walk ahead into the darkness, and they do not come back. The place they go towards is a place even less imaginable to most of us than the city of happiness. I cannot describe it at all. It is possible that it does not exist. But they seem to know where they are going, the ones who walk away from Omelas.

— Ursula K. Le Guin, The Ones Who Walk Away From Omelas

