# Category Archives: Altruism

## Deserving Trust, II: It’s not about reputation

Summary: a less mathematical account of what I mean by “deserving trust”.

When I was a child, my father made me promises. Of the promises he made, he managed to keep 100% of them. Not 90%, but 100%. He would say things like “Andrew, I’ll take you to play in the sand pit tomorrow, even if you forget to bug me about it”, and then he would. This often saved him from being continually pestered by me to keep his word, because I knew I could trust him.

Around 1999 (tagged in my memory as “age 13”), I came to be aware of this property of my father in a very salient way, and decided I wanted to be like that, too. When I’d tell someone they could count on me, if I said “I promise”, then I wanted to know for myself that they could really count on me. I wanted to know I deserved their trust before I asked for it. At the time, I couldn’t recall breaking any explicit promises, and I decided to start keeping a careful track from then on to make sure I didn’t break any promises thereafter.

About a year later, around 2000, I got really wrapped up in thinking about what I wanted from life, in full generality. I’d seen friends undergo drastic changes in their world views, like deconverting from Christianity, and becoming deeply confused about what they wanted when they realized that everything they previously wanted was expressed in terms of a “God” concept that no longer had a referent for them. I wanted to be able to express what I wanted in more stable, invariant terms, so I chose my own sensory experiences as the base language — something I believed more than anything else to be an invariant feature of my existence — and then began trying to express all my values to myself in terms of those features. Values like “I enjoy the taste of strawberry” or “it feels good to think about math” or “the split second when I’m airborne at the apex of a parabolic arc feels awesome.” Life became about maximizing the integral of the intrinsic rewardingness of my experiences over the rest of time, which I called my “intensity integral”, because it roughly corresponded to having intense/enriching intellectual and emotional experiences. (Nowadays, I’d say my life back then felt like an MDP, and I’d call the intensity function my “reward function”. I’ll keep these anachronistic comments in parentheses, though, to ensure a more accurate representation of the language and ideas I was using at the time.)

So by 2001, I had decided to make the experience-optimization thing a basic policy; that as a matter of principle, I should be maximizing the integral of some function of my sensory experiences over time, a function that did not depend too much on references to concepts in the external world like “God” or “other people”. I knew I had to get along with others, of course, but I figured that was easier to explain as something instrumental to future opportunities and experiences, and not something I valued intrinsically. It seemed conceptually simpler to try to explain interactions with others as part of a strategy than as part of my intrinsic goals.

But by 2005, I had fallen deeply in love with someone who I felt understood me pretty well, and I began to feel differently about this self-centered experience-optimization thing. It started to seem like I cared also about *her* experiences, even if I didn’t observe them myself, even indirectly. I ran through many dozens of thought experiments over a period of months, checking that I couldn’t find some conceptually simpler explanation, until I eventually felt I had to accept this about myself. Even if it had no strategic means of ever paying off in enjoyable experiences for *me*, I still wanted *her* to have enjoyable experiences, full stop.

Around the same time, something even more striking to me happened: I realized I also cared about other things she cared about, aside from her experiences. In other words, the scope of what I cared about started expanding outward to things that were not observable by either of us, even indirectly. I wanted the little stack of rocks that I built for her while walking alone on the beach one day to stay standing, even though I never expected her to see it, because I knew she would like it if she could. (In contemporary terms, my life started to feel more like a POMDP, so much so that, by the time I first encountered the definition of an MDP around 2011, it felt like a deeply misguided concept that reminded me of my teenage self-model, and I didn’t spend much time studying it.)

At this point, depending whether you want to consider “me” to be my whole brain or just the conscious, self-reporting part, I either realized I’d been over-fitting my self-model, or underwent “value drift”. When I introspected on how I felt about this other person, and what was driving this change in what I cared about (I did that a lot), it felt like I wanted to deserve her trust, the same way I wanted to keep promises, the way my father did. Even when she wasn’t looking and would never know about it, I wanted to do things that she would want me to do, to produce effects that, even if neither of us would ever observe them, would be the effects she wanted to happen in the world. This was pretty incongruous with my model of myself at the time, so, as I pretty much always do when something important seems incongruous, I entered a period of deep reflection.

## Considerations against pledging donations for the rest of your life

I think donating to charity is great, especially if you make more than \$100k per year, placing you well past the threshold where your well-being depends heavily on income (somewhere around \$70k, depending on who does the analysis). I’ve been in that boat before, and donated more than 100% of my disposable income to charity. However, I was also particularly well-positioned to know where money should go at that time, which made donating particularly worth doing. I haven’t made any kind of official pledge to always donate money, because I take pledges/promises very seriously, and for me personally, taking such a pledge seems like a bad idea, even accounting for its signalling value.

I’m writing this blog post mainly as a way to reduce social pressure on folks who earn less than \$100k per year to produce donations, while at the same time encouraging folks who earn more to consider donating more seriously. Note that I currently work for a charitable institution that I believe is extremely important. So, having been both a benefactor and beneficiary of donations, I hope I come across as honest when I say “donating to charity is great.” Note also that I believe I’m in a somewhat rare situation relative to humans-in-general, but not necessarily a rare one among folks who are likely to read my blog, who tend to have interests in rationality, effective altruism, existential risk, and other intellectual correlates thereof.

Basically, depending on how much information I expect you to actively obtain about the world relative to the size of your donations or other efforts, I may or may not like the idea of you pledging to always donate 10% of your income. Here’s my very rough breakdown of why:

### Consider future variance in whether you should donate.
If you either (1) make less than \$100k/year, or (2) might be willing to make less than that at some future time in order to work directly on something the world needs you to do (besides giving), I would not be surprised to find myself recommending against you pledging to always donate 10% of your income every year.

Moreover, if you currently spend more than 100 hours per year investigating what the world-at-large needs, I would not be that surprised if in some years you were able to find opportunities to spend \$10k-worth-of-effort (per year on average, rather than every year) that were more effective than giving \$10k/year. Just from eyeballing people I know, I think a person who spends that much time analyzing the world (especially one who is likely to come across this post) can be quite a valuable resource, and I expect high initial marginal returns to their own direct efforts to improve themselves and the world.

Example: during my PhD, I spent a considerable fraction of my time on creating a non-profit called the Center for Applied Rationality. I was earning very little money at that time, and donating 10% of it would have been a poor choice. It would have greatly reduced my personal flexibility to spend money on getting things done (saving time by taking taxis, not worrying about the cost of meals when I was in flow working with a group that couldn’t relocate to obtain cheaper food options without breaking productivity, etc.). I think the value of my contribution to CFAR during those years greatly exceeds \$4,000 in charitable donations, which is what 10% of my income over two years would have amounted to. In fact, I would guess that it exceeds \$40,000, so even if I thought things were only 10% likely to turn out as well as they did, not donating in those years was a good idea.

In other years when I made much more money, I’ve chosen to donate 100% of my disposable income. You might want to do that sometimes, too, and I would highly recommend considering it, especially if you’re spending a lot of your time investigating where that money should go. But I still might recommend against you pledging to keep donating, unless you expect to stop investigating the world as much as you currently do and will therefore be less likely to discover things in the future that should change your plans for years-at-a-time.

### Time for a Fermi estimate

Below is an example Fermi calculation for the value of voting in the USA. Of course, the estimates are all rough and fuzzy, so I’ll be conservative, and we can adjust upward based on your opinion.

I’ll be estimating the value of voting in marginal expected altruistic dollars, the expected number of dollars being spent in a way that is in line with your altruistic preferences.1 If you don’t like measuring the altruistic value of the outcome as a fraction of the federal budget, please consider making up your own measure, and keep reading. Perhaps use the number of smiles per year, or number of lives saved. Your measure doesn’t have to be total or average utilitarian, either; as long as it’s roughly commensurate with the size of the country, it will lead you to a similar conclusion in terms of orders of magnitude.

### Component estimates:

At least 1/(100 million) = probability estimate that my vote would affect the outcome (for most Americans). This is the most interesting thing to estimate. There are approximately 100 million voters in the USA, and if you assume a naive fair-coin-flip model of the other voters, and a naive majority-rule voting system (i.e. not the electoral college), with a fair coin deciding ties, then the probability of a vote being decisive is around √(2/(pi*100 million)) ≈ 8/100,000.

But this is too big, considering the way voters cluster: we are not independent coin flips. As well, the USA uses the electoral college system, not majority rule. So I found a paper by Gelman, King, and Boscardin (1998) in which they simulate the electoral college using models fit to previous US elections, and find that the probability of a decisive vote comes out between 1/(3 million) and 1/(100 million) for voters in most states in most elections, with most states lying very close to 1/(10 million). Gelman, Silver, and Edlin (2008) reach a similar number, but with more variance between states.
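The naive coin-flip estimate above is easy to sanity-check in a few lines of Python. This is just a sketch of the normal approximation behind the √(2/(πn)) formula, not anything from the cited papers; the function name is mine.

```python
import math

def naive_decisive_prob(n_voters: int) -> float:
    # Naive model: every other voter is an independent fair coin flip,
    # simple majority rule, ties broken by a fair coin.
    # For large n, P(your vote is decisive) ~ sqrt(2 / (pi * n)).
    return math.sqrt(2 / (math.pi * n_voters))

print(f"{naive_decisive_prob(100_000_000):.1e}")  # ~8.0e-05, i.e. about 8 in 100,000
```

Note how slowly this shrinks: the probability falls off like 1/√n, not 1/n, which is why the naive estimate overshoots the simulation-based figures in the text by only a couple orders of magnitude.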

Added (2016): Some folks have asked me about what happens if there are re-counts of the vote. Errors during (and before) recounts both decrease the probability that a vote that “should” flip the election actually will, and also increase the probability that votes that “shouldn’t” flip the election actually will. These effects roughly cancel out, and you’re left with roughly the same probability that your vote will flip the election outcome.

At least 55% = my subjective credence that I know which candidate is “better”, where I’m using the word “better” subjectively to mean which candidate would turn out to do the most good for others, in my view, if elected. If you don’t like this, please make up your own definition of better and keep reading 🙂 In any case, 55% is pretty conservative; it means I consider myself to have almost no information.

At least \$100 billion = the approximate marginal altruistic value of the “better” candidate (by comparison with dollars donated to a typical charity2). I think this is also very conservative. The annual federal budget is around \$3 trillion right now, making \$12 trillion over a 4-year term, and Barack Obama and Mitt Romney differ on trillions of dollars in their proposed budgets. It would be pretty strange to me if, given a perfect understanding of what they’d both do, I would only care altruistically about \$100 billion of those dollars, marginally speaking.

### Result

I don’t know which candidate would turn out “better for the world” in my estimation, but I’d consider myself as having at least a 55%*1/(100 million) chance of affecting the outcome in the better-for-the-world direction, and a 45%*1/(100 million) chance of affecting it in the worse-for-the-world direction, so in expectation I’m donating at least around (55%-45%)*1/(100 million)*(\$100 billion) = \$100.

Again, this was pretty conservative:

• Say you’re more like 70% sure;
• say you’re a randomly chosen American, so your probability of a decisive vote is around 1/(10 million);
• say the outcome matters more, on the order of \$700 billion in charitable donations, given that Obama’s and Romney’s budgets differ by around \$7 trillion, and say 10% of that is money being spent about as well as marginal charitable donations on things you care about.

That makes (70%-30%)*1/(10 million)*(\$700 billion) = \$28,000.

Going further, if you’re

• 90% sure,
• voting in Virginia — where the decisive-vote probability is around 1/(3.5 million) — and
• care about the whole \$7 trillion difference in budgets,

you get (90%-10%)*1/(3.5 million)*(\$7 trillion) = \$1.6 million. (If you’re 90% sure, the chance of pushing the outcome in the worse direction is only 10%, so the net factor is 80%.) This is so large that it becomes a valuable use of time even to take 1% chances at convincing other people to vote… which you can hopefully do by sharing this post with them.
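All three scenarios reduce to the same formula: (chance you’re right − chance you’re wrong) × P(decisive vote) × stakes. Here’s a minimal sketch of that arithmetic; the function name and scenario labels are mine, and note that with 90% credence the net factor is 90% − 10% = 0.8.

```python
def expected_vote_dollars(credence: float, p_decisive: float, stakes: float) -> float:
    # Expected altruistic dollars moved by one vote:
    # (chance you're right - chance you're wrong) * P(decisive) * stakes.
    return (credence - (1 - credence)) * p_decisive * stakes

print(expected_vote_dollars(0.55, 1e-8, 100e9))      # conservative case: ~$100
print(expected_vote_dollars(0.70, 1e-7, 700e9))      # middle case: ~$28,000
print(expected_vote_dollars(0.90, 1 / 3.5e6, 7e12))  # Virginia case: ~$1.6 million
```

Plugging in your own credence, state-level decisive-vote probability, and stakes estimate takes seconds, which is rather the point of a Fermi estimate.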

## Interested in AI Alignment? Apply to Berkeley.

Summary: Researching how to control (“align”) highly-advanced future AI systems is now officially cool, and UC Berkeley is the place to do it.

Interested in AI alignment research? Apply to Berkeley for a PhD or postdoc (deadlines are approaching), or transfer into Berkeley from a PhD or postdoc at another top school. If you get into one of the following programs at Berkeley:

• a PhD program in computer science, mathematics, logic, or statistics, or
• a postdoc specializing in cognitive science, cybersecurity, economics, evolutionary biology, mechanism design, neuroscience, or moral philosophy,
… then I will personally help you find an advisor who is supportive of you researching AI alignment, and introduce you to other researchers in Berkeley with related interests.

This was not something I could confidently offer you two years ago. Continue reading

## “Entitlement to believe” is lacking in Effective Altruism

Sometimes the world needs you to think new thoughts. It’s good to be humble, but having low subjective credence in a conclusion is just one way people implement humility; another way is to feel unentitled to form your own belief in the first place, except by copying an “expert authority”. This is especially bad when there essentially are no experts yet — e.g. regarding the nascent sciences of existential risks — and the world really needs people to just start figuring stuff out. Continue reading

## Breaking news: Scientists Have Discovered the Soul

2016 is a great year for physics. Not only have we discovered gravitational waves, but just this week, physicists have announced the existence of a long sought after object: the human soul. Continue reading

## Credence – using subjective probabilities to express belief strengths

There are surprisingly many impediments to becoming comfortable making personal use of subjective probabilities, or “credences”: some conceptual, some intuitive, and some social. However, Philip Tetlock has found that thinking in probabilities is essential to being a Superforecaster, so it is perhaps a skill and tendency worth cultivating on purpose. Continue reading

## A story about Bayes, Part 2: Disagreeing with the establishment

10 years after my binary search through dietary supplements, which found that a particular blend of B and C vitamins was particularly energizing for me, a CBC news article reported that the blend I’d used — called “Emergen-C” — did not actually contain all of the vitamin ingredients on its label. Continue reading

## Why CFAR spreads altruism organically, and why Labs & Core make a great team

Following on “Why scaling slowly has been awesome for CFAR Core”, here are two other questions I’ve gotten repeatedly about CFAR:

 Q2: Why isn’t altruism training an explicit part of CFAR’s core workshop curriculum?

## Why scaling slowly has been awesome for CFAR Core

Summary: Since I offered to answer questions about my pledge to donate 10% of my annual salary to CFAR as an existential risk reduction, the question “Why doesn’t CFAR do something that will scale faster than workshops?” keeps coming up, so I’m answering it here. Continue reading

## Beat the bystander effect with minimal social pressure

Summary: Develop an allergy to saying “Will anyone do X?”. Instead query for more specific error signals: Continue reading

## AI strategy and policy research positions at FHI (deadline Jan 6)

Oxford’s Future of Humanity Institute has some new positions opening up at their Strategic Artificial Intelligence Research Centre. I know these guys — they’re super awesome — and if you have the following three properties, then humanity needs you to step up and solve the future: Continue reading

## The 2015 x-risk ecosystem

Summary: Because of its plans to increase collaboration and run training/recruiting programs for other groups, CFAR currently looks to me like the most valuable pathway per-dollar-donated for reducing x-risk, followed closely by MIRI, and GPP+80k. As well, MIRI looks like the most valuable place for new researchers (funding permitting; see this post), followed very closely by FHI, and CSER. Continue reading

## Why I want humanity to survive — a holiday reflection

Life on Earth is almost 4 billion years old. During that time, many trillions of complex life forms have starved to death, been slowly eaten alive by predators or diseases, or simply withered away. But there has also been much joy, play, love, flourishing, and even creativity.