Category Archives: AI Safety


I get a lot of email, and unfortunately, template email responses are not yet integrated into the mobile version of Google Inbox. So, until then, please forgive me if I send you this page as a response! Hopefully it's better than no response at all.

Thanks for being understanding.

Continue reading

Time to spend more than 0.00001% of world GDP on human-level AI alignment

From an outside view, looking in at the Earth, if you noticed that human beings were about to replace themselves as the most intelligent agents on the planet, would you think it unreasonable if 1% of their effort were being spent explicitly reasoning about that transition? How about 0.1%?

Well, currently, world GDP is around \$75 trillion, and in total, our species is spending around \$9MM/year on alignment research in preparation for human-level AI (HLAI). That’s \$5MM on technical research distributed across 24 projects with a median annual budget of \$100k, and \$4MM on related efforts, like recruitment and qualitative studies such as this blog post, distributed across 20 projects with a median annual budget of \$57k. (I computed these numbers by tallying spending from a database I borrowed from Sebastian Farquhar at the Global Priorities Project, which uses a much more liberal definition of “alignment research” than I do.) I predict spending will at least roughly double in the next 1-2 years, and frankly, I’m underwhelmed…
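As a sanity check on the figure in the title, the arithmetic is a one-liner (the dollar figures are the post's; the variable names are mine):

```python
# Back-of-envelope check of the title's "0.00001% of world GDP" figure.
world_gdp = 75e12          # ~$75 trillion/year (post's figure)
alignment_spending = 9e6   # ~$9MM/year across all tallied projects

share = alignment_spending / world_gdp  # 1.2e-07
percent = 100 * share                   # ~0.000012%, i.e., on the order of 0.00001%
```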

Continue reading

Open-source game theory is weird

I sometimes forget that not everyone realizes how poorly understood open-source game theory is, until I end up sharing this example and remember how weird it is for folks to see it for the first time. Since that’s been happening a lot this week, I wrote this post to automate the process.

Consider a game where agents can view each other’s source codes and return either “C” (cooperate) or “D” (defect). The payoffs don’t really matter for the following discussion.

First, consider a very simple agent called “CooperateBot”, or “CB” for short, which cooperates with every possible opponent:

def CB(opp):
    # Ignore the opponent's source code and cooperate unconditionally.
    return "C"

(Here “opp” is the argument representing the opponent’s source code, which CooperateBot happens to ignore.)

Next consider a more interesting agent, “FairBot”, or “FB” for short, which takes in a single parameter $k$ to determine how long it thinks about its opponent:
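Before getting to the real FairBot, it is worth seeing why the naive approach fails. Purely as an illustrative contrast (this construction is mine, not the post's, and it passes opponents around as callables rather than literal source code), here is what happens if you implement "cooperate iff my opponent cooperates with me" by direct simulation, with `k` as a recursion budget:

```python
def CB(opp):
    # CooperateBot: ignore the opponent and cooperate.
    return "C"

def make_FB(k):
    # Naive simulation-based "FairBot": cooperate iff simulating the
    # opponent against a smaller-budget copy of myself yields "C".
    # NOT the proof-based FairBot -- just a strawman for contrast.
    def FB(opp):
        if k <= 0:
            return "D"  # out of simulation budget: give up and defect
        return "C" if opp(make_FB(k - 1)) == "C" else "D"
    return FB

fb = make_FB(10)
print(fb(CB))  # "C": simulating CooperateBot is easy
print(fb(fb))  # "D": two simulation-based FairBots regress down to the budget floor
```

The mutual defection in the last line is the interesting part: naive simulation bottoms out before either agent can certify the other's cooperation, which is why a proof-based approach is worth considering at all.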

Continue reading

Abstract open problems in AI alignment, v.0.1 — for mathematicians, logicians, and computer scientists with a taste for theory-building

This page is a draft and will be updated in response to feedback and requests to include specific additional problems.

Through my work on logical inductors and robust cooperation of bounded agents, I’m meeting lots of folks in math, logic, and theoretical CS who are curious to know what contributions they can make, in the form of theoretical work, toward control theory for highly advanced AI systems. If you’re one of those folks, this post is for you!

Continue reading

Professional feedback form

Leveraging academia

Since a lot of interest in AI alignment has started to build, I’m getting a lot more emails of the form “Hey, how can I get into this hot new field?”. This is great. In the past I was getting so few messages like this that I could respond to basically all of them with many hours of personal conversation.

But now I can’t respond to everybody anymore, so I have a new plan: leverage academia.

To grossly oversimplify things, here’s the heuristic. If the typical prospective researcher (say, an inbound grad student at a top school) needs 100 hours of guidance/mentorship to become a productive contributor to AI alignment research, maybe only 10 of those hours need to come from someone already in the field, and the remaining 90 hours can come from other researchers in CS/ML/math/econ/neuro/cogsci. So if I have 100 hours of guidance to give this year, I can choose between mentoring 1 person, or 10 people who are getting 90% of their guidance elsewhere. The latter produces more researchers, and potentially researchers of a higher quality because of the diversity of views they’re seeing (provided the student has the filter-out-incorrect-views property, which is of course critical). So that’s what I’m doing, and this blog post is my generic response to questions about how to get into AI alignment research 🙂
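The arithmetic behind the heuristic above, spelled out (the 100-hour and 10% figures are the post's estimates; the names are mine):

```python
# Mentorship-leverage arithmetic from the 100-hours heuristic.
HOURS_PER_RESEARCHER = 100  # total guidance a newcomer needs (post's estimate)
IN_FIELD_SHARE = 0.10       # fraction that must come from someone in the field
MY_BUDGET = 100             # hours of guidance I can give this year

solo_mentees = MY_BUDGET / HOURS_PER_RESEARCHER                          # 1.0
leveraged_mentees = MY_BUDGET / (HOURS_PER_RESEARCHER * IN_FIELD_SHARE)  # 10.0
```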

I think this policy might also be a good filter for good-team-players, in the following sense: When you’re part of a team, it’s quite helpful if you can leverage resources outside your team to solve the team’s problems without having to draw heavily on the team’s internal resources. Thus, if you want to be part of a new/young field like AI alignment, it’s nice if you can draw on resources outside that field to make it stronger.

So! If I send you a link to this blog post, please don’t read me as saying “I don’t have any advice for you.” Because I do have some advice: aside from going to grad school and deliberately learning from it, and choosing Berkeley for your PhD/postdoc (or transferring there), I’m also advising that you acquire and demonstrate the quality of drawing from non-scarce resources to help produce scarce ones. Use non-scarce resources to decide what to learn (e.g., read this blog post by Jan Leike); use non-scarce resources to learn that stuff (e.g., college courses, online lectures, books); and use non-scarce resources to demonstrate what you’ve learned (e.g., standardized tests, competitions, publications), at least up to the point where you get admitted as a grad student to a top school. And if that school is Berkeley, I will help you find an advisor!

Seeking a paid part-time assistant for AI alignment research

Please share this if you think anyone you know might be interested.

Sometimes in my research I have to do some task on a computer that I could easily outsource, e.g., adding bibliographical data to a list of papers (e.g., when they were written, who the authors were, etc.). If you think you might be interested in trying some work like this, in exchange for

  • $20/hour, paid to you from my own pocket,
  • exposure to the research materials I’m working with, and
  • knowing you’re doing something helpful to AI alignment research, then
Continue reading

Interested in AI Alignment? Apply to Berkeley.

Summary: Researching how to control (“align”) highly-advanced future AI systems is now officially cool, and UC Berkeley is the place to do it.

Interested in AI alignment research? Apply to Berkeley for a PhD or postdoc (deadlines are approaching), or transfer into Berkeley from a PhD or postdoc at another top school. If you get into one of the following programs at Berkeley:

  • a PhD program in computer science, mathematics, logic, or statistics, or
  • a postdoc specializing in cognitive science, cybersecurity, economics, evolutionary biology, mechanism design, neuroscience, or moral philosophy,
… then I will personally help you find an advisor who is supportive of you researching AI alignment, and introduce you to other researchers in Berkeley with related interests.

This was not something I could confidently offer you two years ago. Continue reading

Seeking a paid personal assistant to create more x-risk research hours

My main bottleneck as a researcher right now is that I have various bureaucracies I need to follow up with on a regular basis, which reduce the number of long uninterrupted periods I can spend on research. I could really use some help with this. Continue reading

AI strategy and policy research positions at FHI (deadline Jan 6)

Oxford’s Future of Humanity Institute has some new positions opening up at their Strategic Artificial Intelligence Research Centre. I know these guys — they’re super awesome — and if you have the following three properties, then humanity needs you to step up and solve the future: Continue reading

The 2015 x-risk ecosystem

Summary: Because of its plans to increase collaboration and run training/recruiting programs for other groups, CFAR currently looks to me like the most valuable pathway per dollar donated for reducing x-risk, followed closely by MIRI and GPP+80k. Likewise, MIRI looks like the most valuable place for new researchers (funding permitting; see this post), followed very closely by FHI and CSER. Continue reading

Why I want humanity to survive — a holiday reflection

Life on Earth is almost 4 billion years old. During that time, many trillions of complex life forms have starved to death, been slowly eaten alive by predators or diseases, or simply withered away. But there has also been much joy, play, love, flourishing, and even creativity.

Continue reading

MIRI needs funding to scale with other AI safety programs

Summary: MIRI’s end-of-year fundraiser is on, and I’ve never been more convinced of what MIRI can offer the world. Continue reading

The Problem of IndignationBot, Part 4

Summary: I proved a parametric, bounded version of Löb’s Theorem that shows bounded self-reflective agents exhibit weird Löbian behavior, too. Continue reading

The Problem of IndignationBot, Part 3

Summary: Is strange “Löbian” self-reflective behavior just a theoretical symptom of assuming unbounded computational resources?

Continue reading

The Problem of IndignationBot, Part 2

Summary: Agents that can reason about their own source codes are weirder than you think.

Continue reading

The Problem of IndignationBot, Part 1

I like to state the Prisoner’s Dilemma by saying that each player can destroy \$2 of the other player’s utility in exchange for \$1 for himself. Writing “C” and “D” for “cooperate” and “defect”, we have the following: Continue reading
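That framing pins down the whole payoff matrix. As a minimal sketch (taking mutual cooperation as a baseline of 0, which is my normalization, not necessarily the post's):

```python
def payoffs(a, b):
    # Payoff rule from the framing above: choosing "D" gains you $1 and
    # destroys $2 of the other player's utility.
    pa = pb = 0
    if a == "D":
        pa += 1
        pb -= 2
    if b == "D":
        pb += 1
        pa -= 2
    return pa, pb

# payoffs("C", "C") -> (0, 0)    mutual cooperation
# payoffs("D", "C") -> (1, -2)   temptation vs. sucker
# payoffs("D", "D") -> (-1, -1)  mutual defection
```

These values give the standard Prisoner's Dilemma ordering: temptation (1) > reward (0) > punishment (-1) > sucker (-2).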