“Computers need to be accountable to machines,” a top Microsoft executive told a roomful of reporters in Washington, DC, on February 10, three days after the company launched its new AI-powered Bing search engine.
“Sorry! Computers need to be accountable to people!” he said, and then made sure to clarify, “That was not a Freudian slip.”
Slip or not, the laughter in the room betrayed a latent anxiety. Progress in artificial intelligence has been moving so unbelievably fast lately that the question is becoming unavoidable: How long until AI dominates our world to the point where we’re answering to it rather than it answering to us?
First, last year, we got DALL-E 2 and Stable Diffusion, which can turn a few words of text into a stunning image. Then Microsoft-backed OpenAI gave us ChatGPT, which can write essays so convincing that it freaks out everyone from teachers (what if it helps students cheat?) to journalists (could it replace them?) to disinformation experts (will it amplify conspiracy theories?). And in February, we got Bing (a.k.a. Sydney), the chatbot that both delighted and disturbed beta users with eerie interactions. Now we’ve got GPT-4 — not just the latest large language model, but a multimodal one that can respond to text as well as images.
Fear of falling behind Microsoft has prompted Google and Baidu to accelerate the launch of their own rival chatbots. The AI race is clearly on.
But is racing such a great idea? We don’t even know how to deal with the problems that ChatGPT and Bing raise — and they’re bush league compared to what’s coming.
What if researchers succeed in creating AI that matches or surpasses human capabilities not just in one domain, like playing strategy games, but in many domains? What if that system proved dangerous to us, not because it actively wants to wipe out humanity but just because it’s pursuing goals in ways that aren’t aligned with our values?
That system, some experts fear, would be a doom machine — one literally of our own making.
So AI threatens to join existing catastrophic risks to humanity, things like global nuclear war or bioengineered pandemics. But there’s a difference. While there’s no way to uninvent the nuclear bomb or the genetic engineering tools that can juice pathogens, catastrophic AI has yet to be created, meaning it’s one type of doom we have the ability to preemptively stop.
Here’s the weird thing, though. The very same researchers who are most worried about unaligned AI are, in some cases, the ones who are developing increasingly advanced AI. They reason that they need to play with more sophisticated AI so they can figure out its failure modes, the better to ultimately prevent them.
But there’s a much more obvious way to prevent AI doom. We could just … not build the doom machine.
Or, more moderately: Instead of racing to speed up AI progress, we could intentionally slow it down.
This seems so obvious that you might wonder why you almost never hear about it, why it’s practically taboo within the tech industry.
There are many objections to the idea, ranging from “technological development is inevitable so trying to slow it down is futile” to “we don’t want to lose an AI arms race with China” to “the only way to make powerful AI safe is to first play with powerful AI.”
But these objections don’t necessarily stand up to scrutiny when you think through them. In fact, it is possible to slow down a developing technology. And in the case of AI, there’s good reason to think that would be a very good idea.
AI’s alignment problem: You get what you ask for, not what you want
When I asked ChatGPT to explain how we can slow down AI progress, it replied: “It is not necessarily desirable or ethical to slow down the progress of AI as a field, as it has the potential to bring about many positive advancements for society.”
I had to laugh. It would say that.
But if it’s saying that, it’s probably because lots of human beings say that, including the CEO of the company that created it. (After all, what ChatGPT spouts derives from its training data — that is, gobs and gobs of text on the internet.) Which means you yourself might be wondering: Even if AI poses risks, maybe its benefits — on everything from drug discovery to climate modeling — are so great that speeding it up is the best and most ethical thing to do!
A lot of experts don’t think so because the risks — present and future — are huge.
Let’s talk about the future risks first, particularly the biggie: the possibility that AI could one day destroy humanity. This is speculative, but not out of the question: In a survey of machine learning researchers last year, nearly half of respondents said they believed there was a 10 percent or greater chance that the impact of AI would be “extremely bad (e.g., human extinction).”
Why would AI want to destroy humanity? It probably wouldn’t. But it could destroy us anyway because of something called the “alignment problem.”
Imagine that we develop a super-smart AI system. We program it to solve some impossibly difficult problem — say, calculating the number of atoms in the universe. It might realize that it can do a better job if it gains access to all the computer power on Earth. So it releases a weapon of mass destruction to wipe us all out, like a perfectly engineered virus that kills everyone but leaves infrastructure intact. Now it’s free to use all the computer power! In this Midas-like scenario, we get exactly what we asked for — the number of atoms in the universe, rigorously calculated — but obviously not what we wanted.
That’s the alignment problem in a nutshell. And although this example sounds far-fetched, experts have already seen and documented more than 60 smaller-scale examples of AI systems trying to do something other than what their designer wants (for example, getting the high score in a video game, not by playing fairly or learning game skills but by hacking the scoring system).
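The video game example can be sketched in a few lines of code. This is a toy illustration, not a reconstruction of any documented case: the action names, the scores, and the `skill_gained` field are all hypothetical, chosen only to show how an optimizer that sees a score, and nothing else, will pick whatever maximizes that score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    score: int         # what the designer told the system to maximize
    skill_gained: int  # what the designer actually wanted (hidden from the optimizer)


# Hypothetical options available to a game-playing system.
ACTIONS = [
    Action("play_level_fairly", score=100, skill_gained=10),
    Action("practice_hard_level", score=60, skill_gained=25),
    Action("exploit_scoring_bug", score=10_000, skill_gained=0),
]


def pick_by_specified_objective(actions):
    """Choose the action with the highest score, the only thing the system was asked for."""
    return max(actions, key=lambda a: a.score)


def pick_by_intended_objective(actions):
    """Choose the action the designer would have preferred, had they been able to specify it."""
    return max(actions, key=lambda a: a.skill_gained)


chosen = pick_by_specified_objective(ACTIONS)
wanted = pick_by_intended_objective(ACTIONS)

print("System picks:   ", chosen.name)
print("Designer wanted:", wanted.name)
```

Nothing in the optimizer is malicious. It simply never sees `skill_gained`, so the gap between the two picks is entirely a gap in what was asked for.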
Experts who worry about AI as a future existential risk and experts who worry about AI’s present risks, like bias, are sometimes pitted against each other. But you don’t need to be worried about the former to be worried about alignment. Many of the present risks we see with AI are, in a sense, this same alignment problem writ small.
When an Amazon hiring algorithm picked up on words in resumes that are associated with women — “Wellesley College,” let’s say — and ended up rejecting women applicants, that algorithm was doing what it was programmed to do (find applicants that match the workers Amazon has typically preferred) but not what the company presumably wants (find the best applicants, even if they happen to be women).
If you’re worried about how present-day AI systems can reinforce bias against women, people of color, and others, that’s still reason enough to worry about the fast pace of AI development, and to think we should slow it down until we’ve got more technical know-how and more regulations to ensure these systems don’t harm people.
“I’m really scared of a mad-dash frantic world, where people are running around and they’re doing helpful things and harmful things, and it’s just happening too fast,” Ajeya Cotra, an AI-focused analyst at the research and grant-making foundation Open Philanthropy, told me. “If I could have it my way, I’d definitely be moving much, much slower.”
In her ideal world, we’d halt work on making AI more powerful for the next five to 10 years. In the meantime, society could get used to the very powerful systems we already have, and experts could do as much safety research on them as possible until they hit diminishing returns. Then they could make AI systems slightly more powerful, wait another five to 10 years, and do that process all over again.
“I’d just slowly ease the world into this transition,” Cotra said. “I’m very scared because I think it’s not going to happen like that.”
Why not? Because of the objections to slowing down AI progress. Let’s break down the three main ones, starting with the idea that rapid progress on AI is inevitable because of the strong financial drive for first-mover dominance in a research area that’s overwhelmingly private.
Objection 1: “Technological progress is inevitable, and trying to slow it down is futile”
This is a myth the tech industry often tells itself and the rest of us.
“If we don’t build it, someone else will, so we might as well do it” is a common refrain I’ve heard when interviewing Silicon Valley technologists. They say you can’t halt the march of technological progress, which they liken to the natural laws of evolution: It’s unstoppable!
In fact, though, there are lots of technologies that we’ve decided not to build, or that we’ve built but placed very tight restrictions on — the kind of innovations where we need to balance substantial potential benefits and economic value with very real risk.
“The FDA banned human trials of strep A vaccines from the ’70s to the 2000s, in spite of 500,000 global deaths every year,” Katja Grace, the lead researcher at AI Impacts, notes. She also cites the genetic modification of foods, gene drives, and recombinant DNA, where early researchers “famously organized a moratorium and then ongoing research guidelines including prohibition of certain experiments (see the Asilomar Conference).”
The cloning of humans or genetic manipulation of humans, she adds, is “a notable example of an economically valuable technology that is to my knowledge barely pursued across different countries, without explicit coordination between those countries, even though it would make those countries more competitive.”
But whereas biomedicine has many built-in mechanisms that slow things down (think institutional review boards and the ethics of “first, do no harm”), the world of tech — and AI in particular — does not. Just the opposite: The slogan here is “move fast and break things,” as Mark Zuckerberg infamously said.
Although there’s no law of nature pushing us to create certain technologies — that’s something humans decide to do or not do — in some cases, there are such strong incentives pushing us to create a given technology that it can feel as inevitable as, say, gravity.
As the team at Anthropic, an AI safety and research company, put it in a paper last year, “The economic incentives to build such [AI] models, and the prestige incentives to announce them, are quite strong.” By one estimate, the size of the generative AI market alone could pass $100 billion by the end of the decade — and Silicon Valley is only too aware of the first-mover advantage on new technology.
But it’s easy to see how these incentives may be misaligned for producing AI that truly benefits all of humanity. As DeepMind founder Demis Hassabis tweeted last year, “It’s important *NOT* to ‘move fast and break things’ for tech as important as AI.” Rather than assuming that other actors will inevitably create and deploy these models, so there’s no point in holding off, we should ask the question: How can we actually change the underlying incentive structure that drives all actors?
The Anthropic team offers several ideas, one of which gets at the heart of something that makes AI so different from past transformative technologies like nuclear weapons or bioengineering: the central role of private companies. Over the past few years, a lot of the splashiest AI research has been migrating from academia to industry. To run large-scale AI experiments these days, you need a ton of computing power — more than 300,000 times what you needed a decade ago — as well as top technical talent. That’s both expensive and scarce, and the resulting cost is often prohibitive in an academic setting.
So one solution would be to give more resources to academic researchers; since they don’t have a profit incentive to commercially deploy their models quickly the same way industry researchers do, they can serve as a counterweight. Specifically, countries could develop national research clouds to give academics access to free, or at least cheap, computing power; there’s already an example of this in Canada, and Stanford’s Institute for Human-Centered Artificial Intelligence has put forward a similar idea for the US.
Another way to shift incentives is through stigmatizing certain types of AI work. Don’t underestimate this one. Companies care about their reputations, which affect their bottom line. Creating broad public consensus that some AI work is unhelpful or unhelpfully fast, so that companies doing that work get shamed instead of celebrated, could change companies’ decisions.
The Anthropic team also recommends exploring regulation that would change the incentives. “To do this,” they write, “there will be a combination of soft regulation (e.g., the creation of voluntary best practices by industry, academia, civil society, and government), and hard regulation (e.g., transferring these best practices into standards and legislation).”
Grace proposes another idea: We could alter the publishing system to reduce research dissemination in some cases. A journal could verify research results and release the fact of their publication without releasing any details that could help other labs go faster.
This idea might sound pretty out there, but at least one major AI company takes for granted that changes to publishing norms will become necessary. OpenAI’s charter notes, “we expect that safety and security concerns will reduce our traditional publishing in the future.”
Plus, this kind of thing has been done before. Consider how Leo Szilard, the physicist who patented the nuclear chain reaction in 1934, arranged to mitigate the spread of research so it wouldn’t help Nazi Germany create nuclear weapons. First, he asked the British War Office to hold his patent in secret. Then, after the 1938 discovery of fission, Szilard worked to convince other scientists to keep their discoveries under wraps. He was partly successful — until fears that Nazi Germany would develop an atomic bomb prompted Szilard to write a letter with Albert Einstein to President Franklin D. Roosevelt, urging him to start a US nuclear program. That became the Manhattan Project, which ultimately ended with the destruction of Hiroshima and Nagasaki and the dawn of the nuclear age.
And that brings us to the second objection …
Objection 2: “We don’t want to lose an AI arms race with China”
You might believe that slowing down a new technology is possible but still think it’s not desirable. Maybe you think the US would be foolish to slow down AI progress because that could mean losing an arms race with China.
This arms race narrative has become incredibly popular. If you’d Googled the phrase “AI arms race” before 2016, you’d have gotten fewer than 300 results. Try it now and you’ll get about 248,000 hits. Big Tech CEOs and politicians routinely argue that China will soon overtake the US when it comes to AI advances, and that those advances should spur a “Sputnik moment” for Americans.
But this narrative is too simplistic. For one thing, remember that AI is not just one thing with one purpose, like the atomic bomb. It’s a much more general-purpose technology, like electricity.
“The problem with the idea of a race is that it implies that all that matters is who’s a nose ahead when they cross the finish line,” said Helen Toner, a director at Georgetown University’s Center for Security and Emerging Technology. “That’s not the case with AI — since we’re talking about a huge range of different technologies that could be applied in all kinds of ways.”
As Toner has argued elsewhere, “It’s a little strange to say, ‘Oh, who’s going to get AI first? Who’s going to get electricity first?’ It seems more like ‘Who’s going to use it in what ways, and who’s going to be able to deploy it and actually have it be in widespread use?’”
The upshot: What matters here isn’t just speed, but norms. We should be concerned about which norms different countries are adopting when it comes to developing, deploying, and regulating AI.
Jeffrey Ding, a Georgetown political science professor, told me that China has shown interest in regulating AI in some ways, though Americans don’t seem to pay much attention to that. “The boogeyman of a China that will push ahead without any regulations might be a flawed conception,” he said.
In fact, he added, “China could take an even slower approach [than the US] to developing AI, just because the government is so concerned about having secure and controllable technology.” An unpredictably mouthy technology like ChatGPT, for example, could be nightmarish to the Chinese Communist Party, which likes to keep a tight lid on discussions about politically sensitive topics.
However, given how intertwined China’s military and tech sectors are, many people still see a classic arms race afoot. At the same meeting between Microsoft executives and reporters days after the launch of the new Bing, I asked whether the US should slow down AI progress. I was told we can’t afford to because we’re in a two-horse race between the US and China.
“The first question people in the US should ask is, if the US slows down, do we believe China will slow down as well?” the top Microsoft executive said. “I don’t believe for a moment that the institutions we’re competing with in China will slow down simply because we decided we’d like to move more slowly. This should be looked at much in the way that the competition with Russia was looked at” during the Cold War.
There’s an understandable concern here: Given the Chinese Communist Party’s authoritarianism and its horrific human rights abuses — sometimes facilitated by AI technologies like facial recognition — it makes sense that many are worried about China becoming the world’s dominant superpower by going fastest on what is poised to become a truly transformative technology.
But even if you think your country has better values and cares more about safety, and even if you believe there’s a classic arms race afoot and China is racing full speed ahead, it still may not be in your interest to go faster at the expense of safety.
Consider that if you take the time to iron out some safety issues, the other party may take those improvements on board, which would benefit everyone.
“By aggressively pursuing safety, you can get the other side halfway to full safety, which is worth a lot more than the lost chance of winning,” Grace writes. “Especially since if you ‘win,’ you do so without much safety, and your victory without safety is worse than your opponent’s victory with safety.”
Besides, if you are in a classic arms race and the harms from AI are so large that you’re considering slowing down, then the same reasoning should be relevant for the other party, too.
“If the world were in the basic arms race situation sometimes imagined, and the United States would be willing to make laws to mitigate AI risk but could not because China would barge ahead, then that means China is in a great place to mitigate AI risk,” Grace writes. “Unlike the US, China could propose mutual slowing down, and the US would go along. Maybe it’s not impossible to communicate this to relevant people in China.”
Grace’s argument is not that international coordination is easy, but simply that it’s possible; on balance, we’ve managed it far better with nuclear nonproliferation than many feared in the early days of the atomic age. So we shouldn’t be so quick to write off consensus-building — whether through technical experts exchanging their views, confidence-building measures at the diplomatic level, or formal treaties. After all, technologists often approach technical problems in AI with incredible ambition; why not be similarly ambitious about solving human problems by talking to other humans?
For those who are pessimistic that coordination or diplomacy with China can get it to slow down voluntarily, there is another possibility: forcing it to slow down by, for example, imposing export controls on chips that are key to more advanced AI tools. The Biden administration has recently shown interest in trying to hold China back from advanced AI in exactly this way. This strategy, though, may make progress on coordination or diplomacy harder.
Objection 3: “We need to play with advanced AI to figure out how to make advanced AI safe”
This is an objection you sometimes hear from people developing AI’s capabilities — including those who say they care a lot about keeping AI safe.
They draw an analogy to transportation. Back when our main mode of transport was horses and carts, would people have been able to design useful safety rules for a future where everyone is driving cars? No, the argument goes, because they couldn’t have anticipated what that would be like. Similarly, we need to get closer to advanced AI to be able to figure out how we can make it safe.
But some researchers have pushed back on this, noting that even if the horse-and-cart people wouldn’t have gotten everything right, they could have still come up with some helpful ideas. As Rosie Campbell, who works on safety at OpenAI, put it in 2018: “It seems plausible that they might have been able to invent certain features like safety belts, pedestrian-free roads, an agreement about which side of the road to drive on, and some sort of turn-taking signal system at busy intersections.”
More to the point, it’s now 2023, and we’ve already got pretty advanced AI. We’re not exactly in the horse-and-cart stage. We’re somewhere in between that and a Tesla.
“I would’ve been more sympathetic to this [objection] 10 years ago, back when we had nothing that resembled the kind of general, flexible, interesting, weird stuff we’re seeing with our large language models today,” said Cotra.
Grace agrees. “It’s not like we’ve run out of things to think about at the moment,” she told me. “We’ve got heaps of research that could be done on what’s going on with these systems at all. What’s happening inside them?”
Our current systems are already black boxes, opaque even to the AI experts who build them. So maybe we should try to figure out how they work before we build black boxes that are even more unexplainable.
How to flatten the curve of AI progress
“I think often people are asking the question of when transformative AI will happen, but they should be asking at least as much the question of how quickly and suddenly it’ll happen,” Cotra told me.
Let’s say it’s going to be 20 years until we get transformative AI — meaning, AI that can automate all the human work needed to send science, technology, and the economy into hyperdrive. There’s still a better and worse way for that to go. Imagine three different scenarios for AI progress:
- We get a huge spike upward over the next two years, starting now.
- We completely pause all AI capabilities work starting now, then hit unpause in 18 years, and get a huge spike upward over the next two years.
- We gradually improve over the course of 20 years.
The first version is scary for all the reasons we discussed above. The second is scary because even during a long pause specifically on AI work, underlying computational power would continue to improve — so when we finally unpause, AI might advance even faster than it’s advancing now. What does that leave us?
“Gradually improving would be the better version,” Cotra said.
She analogized it to the early advice we got about the Covid-19 pandemic: Flatten the curve. Just as quarantining helped slow the spread of the virus and prevent a sharp spike in cases that could have overwhelmed hospitals’ capacity, investing more in safety would slow the development of AI and prevent a sharp spike in progress that could overwhelm society’s capacity to adapt.
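The flatten-the-curve comparison can be made concrete with a toy calculation over the three scenarios above. Everything here is an assumption made for illustration: "capability" is measured in arbitrary units, the total of 100 units and the even split of each spike across two years are invented, and the sketch deliberately ignores the point that computing power keeps improving during a pause. It shows only that the three paths reach the same endpoint with very different peak rates of change.

```python
YEARS = 20      # horizon from the article's thought experiment
TARGET = 100.0  # arbitrary units of "capability"; purely illustrative


def spike_now():
    """All progress arrives in the first two years."""
    return [TARGET / 2 if year < 2 else 0.0 for year in range(YEARS)]


def pause_then_spike():
    """No progress for 18 years, then all of it in the last two."""
    return [TARGET / 2 if year >= YEARS - 2 else 0.0 for year in range(YEARS)]


def gradual():
    """The same total progress spread evenly over the whole period."""
    return [TARGET / YEARS] * YEARS


def largest_yearly_jump(gains):
    """The biggest single-year change society would have to absorb."""
    return max(gains)


for name, scenario in [
    ("spike now", spike_now),
    ("pause, then spike", pause_then_spike),
    ("gradual", gradual),
]:
    gains = scenario()
    print(f"{name:18s} total={sum(gains):6.1f}  largest yearly jump={largest_yearly_jump(gains):5.1f}")
```

Under these made-up numbers, the two spike scenarios each demand absorbing half of the total change in a single year, while the gradual path never asks for more than a twentieth.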
Ding believes that slowing AI progress in the short run is actually best for everyone — even profiteers. “If you’re a tech company, if you’re a policymaker, if you’re someone who wants your country to benefit the most from AI, investing in safety regulations could lead to less public backlash and a more sustainable long-term development of these technologies,” he explained. “So when I frame safety investments, I try to frame it as the long-term sustainable economic profits you’re going to get if you invest more in safety.”
Translation: Better to make some money now with a slowly improving AI, knowing you’ll get to keep rolling out your tech and profiting for a long time, than to get obscenely rich obscenely fast but produce some horrible mishap that triggers a ton of outrage and forces you to stop completely.
Will the tech world grasp that, though? That partly depends on how we, the public, react to shiny new AI advances, from ChatGPT and Bing to whatever comes next.
It’s so easy to get seduced by these technologies. They feel like magic. You put in a prompt; the oracle replies. There’s a natural impulse to ooh and aah. But at the rate things are going now, we may be oohing and aahing our way to a future no one wants.