MIRI needs funding to scale with other AI safety programs

Summary: MIRI’s end-of-year fundraiser is on, and I’ve never been more convinced of what MIRI can offer the world.

A lot of money has moved under the label “AI Safety” in the past year, ever since a team of CFAR alumni banded together to form the Future of Life Institute (FLI), organized an AI safety conference in Puerto Rico in January of this year, co-authored the FLI research priorities proposal, and attracted $10MM of grant funding from Elon Musk. Nick Bostrom’s Superintelligence was also a major factor in this amazing success story.

A lot of wonderful work is being done under these grants, including a lot of proposals for solutions to known issues with AI safety, which I find extremely heartening. However, I’m worried that if MIRI doesn’t scale at least somewhat to keep pace with all this funding, it just won’t be spent nearly as well as it would have if MIRI were there to help.

The reason is that, to my eye, almost no one except for MIRI is taking point on formulating neglected sub-problems in AI safety…

Reflections from this year’s NIPS conference

I attended NIPS a couple of weeks ago — the world’s largest machine learning (ML) conference, attended by over 4000 experts in the field — and was super pleased to find a number of reputable ML experts on a panel about AI safety, discussing questions like AI boxing and value-alignment that I’d previously only ever heard discussed in and around MIRI/FHI. And when other ML people there would ask me about my research at MIRI, more than half the time they responded with “Cool, that sounds like it’s going to be really important very soon!”

How awesome is that? As little as two years ago when I’d talk to colleagues in ML about AI safety, they’d often just squint and say things like

(typical pre-2014 response)

Won’t superintelligent machines be smart enough to realize that it’s morally wrong to be harmful to humans?

The world had some distance to go before there could be much high-quality discussion about AI safety. I’d known this for a long time, and it’s why I helped co-found the Center for Applied Rationality (CFAR) during my PhD: to bring high-impact people together who cared about the world and wanted to apply highly scrupulous reasoning to figure out what it needed most.

Now, thanks to the hard work of many CFAR alumni, like the team at FLI, and their many colleagues, that plan seems to be working. Now that Elon Musk, Bill Gates, Stephen Hawking, and Stuart Russell have all publicly agreed, in response to Superintelligence and FLI’s research priorities, that the problem is important, when I make the exact same arguments about AI safety as I would have two years ago, I get responses like

(typical 2015 response)

Yeah definitely, AI is going to be a huge deal, and we need more people focussed on making it safe.

I’m super glad this transition has happened. But I’m worried because we had to make it happen. I’m worried that, if MIRI doesn’t scale, too much safety funding will be spent in suboptimal ways that could really harm the field.

You see, when most academics apply for grant funding, they’ve already made up their minds about what they want to research, and are trying to fit their plans to match the description of some grant program. This results in a lot of what people call academic freedom, and I think that freedom is extremely important for humanity to continue exploring a diversity of ideas and potential innovations.

But AI safety is a hard and important problem that needs solving, and it’s not going to be solved by people who don’t actually want to solve it. And if all the rising funding for safety goes to research that was going to happen anyway, AI safety could become a fad that came and went, appearing to have been driven by the natural incentives that turn alarmism into grant funding for existing research, rather than by people who cared deeply and wanted to solve the hardest and most important parts of the safety problem.

Why I like MIRI

It might seem like a strange claim I’m making: that a small non-profit research group like MIRI might have something important to say about how tens of millions, and maybe billions, of dollars should be spent on AI safety.

But MIRI’s new research team has really impressed me, enough that I decided to work there. I had lots of other job opportunities, inside and outside academia, and even some to work on AI safety. But the quality of mathematical and technical talent that’s accumulated at MIRI in 2015 is unlike anything I’ve seen anywhere else.

For example, just a few days ago, my colleague Benya Fallenstein initiated a 5-hour meeting with me where we surveyed our current research priorities and re-assessed which problems should be MIRI’s top priority, as a function of how direct and numerous each problem’s safety applications were, and how likely each problem was to remain unsolved or neglected by contemporary approaches.

We were soon joined by two other MIRI researchers, Jessica Taylor and Scott Garrabrant. I didn’t have to bug anyone to actually take the problem seriously and reason backwards from what’s needed instead of rationalizing forward to justify their favorite pet project. No, a group of mathematicians (better mathematicians than me, if I must say so) just spontaneously arranged themselves into a strategic conversation about AI safety research prioritization.

And it didn’t surprise me, because that’s just the sort of determination I expect when I go into work each day. Sometimes we need strategy, and sometimes we need math. Sometimes we need to write, and sometimes we need to talk. It’s all part of the job. The team is really smart, really cares, and unlike a lot of other smart people who care, they’re all working full time on safety. That’s a step that fewer than a dozen people in the world have made at this point.

Why MIRI is still needed

We have to remember that AI safety did not become mainstream by a spontaneous collective awakening. It was through years of effort on the part of MIRI and collaborators at FHI struggling to identify unknown unknowns about how AI might surprise us, and struggling further to learn to explain these ideas in enough technical detail that they might be adopted by mainstream research, which is finally beginning to happen.

But what about the parts we’re wrong about? What about the sub-problems we haven’t identified yet, that might end up neglected in the mainstream the same way the whole problem was neglected 5 years ago? I’m glad the AI/ML community is more aware of these issues now, but I want to make sure MIRI can grow fast enough to keep this growing field on track.

You might think that, now that other people are “on the issue”, it’ll work itself out. That might be so.

But just because some of MIRI’s conclusions are now being widely adopted doesn’t mean its methodology is. The mental movement

“Someone has pointed out this safety problem to me, let me try to solve it!”

is very different from

“Someone has pointed out this safety solution to me, let me try to see how it’s broken!”

And that second mental movement is the kind that allowed MIRI to notice AI safety problems in the first place. Cybersecurity professionals seem to carry out this movement easily: security expert Bruce Schneier calls it the security mindset. The SANS Institute calls it red teaming. Whatever you call it, AI/ML people are still more in maker-mode than breaker-mode, and are not yet, to my eye, identifying any new safety problems.

I do think that different organizations should probably try different approaches to the AI safety problem, rather than perfectly copying MIRI’s approach and research agenda. But I think breaker-mode/security mindset does need to be a part of every approach to AI safety. And if MIRI doesn’t scale up to keep pace with all this new funding, I’m worried that the world is just about to copy-paste MIRI’s best-2014-impression of what’s important in AI safety, and leave behind the self-critical methodology that generated these ideas in the first place… which is a serious pitfall given all the unknown unknowns left in the field.

This year has been awesome. I’m really impressed with mainstream-AI/ML’s reaction to the AI safety problem in 2015, and with so many competent new people beginning to recognize the problem enough to apply for grants to work on it. It even seems likely to me that, within 5 years, some of those same competent people will switch into red team/security/breaker mode and start finding new problems.

But I don’t want to lose 5 years of lead time on these problems because MIRI couldn’t keep scaling. My new fellow researchers there (Benya Fallenstein, Patrick LaVictoire, Jessica Taylor, and Scott Garrabrant) are all extremely sharp, motivated people, and there are more in the hiring pipeline if we can stay funded.

So, here’s hoping we can make that happen 🙂
