The AI Daily Brief: Artificial Intelligence News and Analysis - Hinton and Bengio on Managing AI Risks

Starting point is 00:00:00 Today on the AI Breakdown, we're reading a new paper from some leading voices on AI safety all about managing AI risks. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our YouTube channel, our Discord, and our newsletter. Hello, friends, welcome back to a weekend long read episode of the AI Breakdown. A big theme recently has been not only the AI safety conversation, but also its translation into real policy proposals. Something that I have mentioned, I think is very important at this stage, and something which seems to be the trajectory, is to move away from just theoretical arguments about safety and X-risk on the one hand, or accelerationism on the other,

Starting point is 00:00:54 into the realm of actually what we do in practice. In other words, if we need safety guardrails, what are they? If there are models that are too advanced and too powerful to be released, where do we draw those lines? For models that are around that, but not clearly there, is there an approval process? Is there licensing? In general, I think the more time we spend on that type of question, and the less we spend just on the purely theoretical, the better off we'll be. So then, of course I noted when a group of around 24 or 25, mostly academics, including most notably Yoshua Bengio and Geoffrey Hinton, along with Yuval Noah Harari, released a paper and a webpage called Managing AI Risks in an Era of Rapid Progress.

Starting point is 00:01:33 So first let's read it and see how much it stays in that realm of the theoretical versus moving into something more practical. Abstract. Amid rapid AI progress, the authors of this short paper express a consensus describing the large-scale risks from upcoming powerful AI systems. They call for a set of governance measures and a major shift in AI R&D towards safety and ethical practices before these systems are developed. The paper begins.

Starting point is 00:01:58 In 2019, GPT-2 could not reliably count to ten. Only four years later, deep learning systems can write software, generate photorealistic scenes on demand, advise on intellectual topics, and combine language and image processing to steer robots. As AI developers scale these systems, unforeseen abilities and behaviors emerge spontaneously without explicit programming. Progress in AI has been swift and, to many, surprising. The pace of progress may surprise us again. Current deep learning systems still lack important capabilities, and we do not know how long it will take to develop them. However, companies are engaged in a race to create generalist AI systems that match or exceed human

Starting point is 00:02:35 abilities in most cognitive work. They are rapidly deploying more resources and developing new techniques to increase AI capabilities. Progress in AI also enables faster progress. AI assistants are increasingly used to automate programming and data collection to further improve AI systems. There is no fundamental reason why AI progress would slow or halt at the human level. Indeed, AI has already surpassed human abilities in narrow domains like protein folding or strategy games. Compared to humans, AI systems can act faster, absorb more knowledge, and communicate at a

Starting point is 00:03:04 far higher bandwidth. Additionally, they can be scaled to use immense computational resources and can be replicated by the millions. The rate of improvement is already staggering, and tech companies have the cash reserves needed to scale the latest training runs by multiples of 100 to 1,000 soon. Combined with the ongoing growth in automation in AI R&D, we must take seriously the possibility that generalist AI systems will outperform human abilities across many critical domains within this decade or the next. What happens then? If managed carefully and distributed fairly, advanced AI systems could help humanity cure diseases, elevate living standards, and protect our ecosystems. The opportunities AI offers are immense, but alongside advanced AI capabilities

Starting point is 00:03:42 come large-scale risks that we are not on track to handle well. Humanity is pouring vast resources into making AI systems more powerful, but far less into safety and mitigating harms. For AI to be a boon, we must reorient. Pushing AI capabilities alone is not enough. We are already behind schedule for this reorientation. We must anticipate the amplification of ongoing harms as well as novel risks, and prepare for the largest risks well before they materialize. Climate change has taken decades to be acknowledged and confronted. For AI, decades could be too long. Section. Societal scale risks. AI systems could rapidly come to outperform humans in an increasing number of tasks. If such systems are not carefully designed and deployed, they pose a range

Starting point is 00:04:23 of societal scale risks. They threaten to amplify social injustice, erode social stability, and weaken our shared understanding of reality that is foundational to society. They could also enable large-scale criminal or terrorist activities. Especially in the hands of a few powerful actors, AI could cement or exacerbate global inequalities or facilitate automated warfare, customized mass manipulation, and pervasive surveillance. Many of these risks could soon be amplified and new risks created as companies are developing autonomous AI: systems that can plan, act in the world, and pursue goals. While current AI systems have limited autonomy, work is underway to change this.

Starting point is 00:04:57 For example, the non-autonomous GPT-4 model was quickly adapted to browse the web, design and execute chemistry experiments, and utilize software tools, including other AI models. If we build advanced autonomous AI, we risk creating systems that pursue undesirable goals. Malicious actors could deliberately embed harmful objectives. Moreover, no one currently knows how to reliably align AI behavior with complex values. Even well-meaning developers may inadvertently build AI systems that pursue unintended goals, especially if, in the bid to win the AI race, they neglect expensive safety testing and human oversight. Once autonomous AI systems pursue undesirable goals, embedded by malicious actors or by accident, we may be unable to keep them in check.

Starting point is 00:05:36 Control of software is an old and unsolved problem. Computer worms have long been able to proliferate and avoid detection. However, AI is making progress in critical domains such as hacking, social manipulation, deception, and strategic planning. Advanced autonomous AI systems will pose unprecedented control challenges. To advance undesirable goals, future autonomous AI systems could use undesirable strategies, learned from humans or developed independently, as a means to an end. AI systems could gain human trust, acquire financial resources, influence key decision makers, and form coalitions with human actors and other AI systems. To avoid human intervention, they could copy their algorithms across global server networks like computer worms.

Starting point is 00:06:12 AI assistants are already co-writing a large share of computer code worldwide. Future AI systems could insert and then exploit security vulnerabilities to control the computer systems behind our communication, media, banking, supply chains, militaries, and governments. In open conflict, AI systems could threaten with or use autonomous or biological weapons. AI having access to such technology would merely continue existing trends to automate military activity, biological research, and AI development itself. If AI systems pursued such strategies with sufficient skill, it would be difficult for humans to intervene. Finally, AI systems may not need to plot for influence if it is freely handed over. As autonomous AI systems increasingly become

Starting point is 00:06:50 faster and more cost-effective than human workers, a dilemma emerges. Companies, governments, and militaries might be forced to deploy AI systems widely and cut back on expensive human verification of AI decisions or risk being out-competed. As a result, autonomous AI systems could increasingly assume critical societal roles. Without sufficient caution, we may irreversibly lose control of autonomous AI systems, rendering human intervention ineffective. Large-scale cybercrime, social manipulation, and other highlighted harms could then escalate rapidly. This unchecked AI advancement could culminate in a large-scale loss of life and the biosphere and the marginalization or even extinction of humanity. Harms such as misinformation and discrimination

Starting point is 00:07:27 from algorithms are already evident today. Other harms show signs of emerging. It is vital to both address ongoing harms and anticipate emerging risks. This is not a question of either or. Present and emerging risks often share similar mechanisms, patterns, and solutions. Investing in governance frameworks and AI safety will bear fruit on multiple fronts. All right, we're going to pop back over to NLW for just a moment, although we are not through yet. Indeed, to this point, we have certainly not moved into the realm of what we should do, right? This is clearly just a sum-up of everything else that has been said about the risks. Now, if it left it here, I would find this piece very disappointing.

Starting point is 00:08:02 In context, especially if the next section, which is called a path forward, delivers on what it seems to promise, then maybe it is a useful sum-up for a policymaker who is just trying to wrap their head around the totality of the issue. Still, I will tell you that at this point in the piece, I'm a little worried that we're still just circling around the drain of all the things that could go wrong, rather than actually trying to deal with what we can do about it. But with that, let's come to the next section, a path forward. And now a word from today's sponsor. Are you interested in how two top-of-mind trends, AI and crypto, can work together? If so, I have the perfect podcast recommendation for you. Web3 with a16z crypto, the chart-topping show brought to you by venture firm Andreessen Horowitz.

Starting point is 00:08:46 Web3 with a16z crypto is your definitive resource for the future of the internet, whether you're already building in these spaces or simply curious about what's next. If you need a place to start, they recently released an excellent episode with Stanford cryptography professor Dan Boneh and former Google Xer Ali Yahya in conversation with host Sonal Chokshi about the intersection of AI and crypto. From fighting deepfakes and proving humanity to large language models like ChatGPT, they cover it all. I highly recommend checking it out, especially if you'd like to learn more about how

Starting point is 00:09:15 AI and crypto will impact our everyday lives. Beyond crypto and AI, this show is for creators seeking more ways to truly own their work, for business leaders trying to prepare for the future today, and for innovators exploring trending tech topics. So go ahead, listen to Web3 with a16z crypto, wherever you get your podcasts. If advanced autonomous AI systems were developed today, we would not know how to make them safe, nor how to properly test their safety. Even if we did, governments would lack the institutions to prevent misuse and uphold safety practices. That does not, however, mean that there is no viable path forward. To ensure a positive outcome, we can and must pursue research breakthroughs in AI safety and ethics and promptly establish

Starting point is 00:09:55 effective government oversight. Reorienting technical R&D. We need research breakthroughs to solve some of today's technical challenges in creating AI with safe and ethical objectives. Some of these challenges are unlikely to be solved by simply making AI systems more capable. These include honesty and oversight. More capable AI systems are better able to exploit weaknesses in oversight and testing, for example, by producing false but compelling output. Robustness. AI systems behave unpredictably in new situations under distribution shift or adversarial inputs. Interpretability.

Starting point is 00:10:25 AI decision-making is opaque. So far, we can only test large models via trial and error. We need to learn to understand their inner workings. Risk evaluations. Frontier AI systems develop unforeseen capabilities only discovered during training or even well after deployment. Better evaluation is needed to detect hazardous capabilities earlier. Addressing emerging challenges.

Starting point is 00:10:44 More capable future AI systems may exhibit failure modes we have so far only seen in theoretical models. AI systems might, for example, learn to feign obedience or exploit weaknesses in our safety objectives and shutdown mechanisms to advance a particular goal. Given the stakes, we call on major tech companies and public funders to allocate at least one-third of their AI R&D budget to ensuring safety and ethical use, comparable to their funding for AI capabilities. Addressing these problems with an eye towards potential future systems must become central

Starting point is 00:11:11 to our field. Urgent governance measures. We urgently need national institutions and international governance to enforce standards in order to prevent recklessness and misuse. Many areas of technology, from pharmaceuticals to financial systems and nuclear energy, show that society both requires and effectively uses governance to reduce risks. However, no comparable governance frameworks are currently in place for AI. Without them, companies and countries may seek a competitive edge by pushing AI capabilities to new heights while cutting corners on safety, or by delegating key societal roles to AI systems with little human oversight. Like manufacturers releasing waste into rivers to cut

Starting point is 00:11:46 costs, they may be tempted to reap the rewards of AI development while leaving society to deal with the consequences. To keep up with rapid progress and avoid inflexible laws, national institutions need strong technical expertise and the authority to act swiftly. To address international race dynamics, they need the affordance to facilitate international agreements and partnerships. To protect low-risk use and academic research, they should avoid undue bureaucratic hurdles for small and predictable AI models. The most pressing scrutiny should be on the AI systems at the frontier: a small number of most powerful AI systems, trained on billion-dollar supercomputers, which will have the most hazardous and unpredictable capabilities. To enable effective regulation, governments urgently need

Starting point is 00:12:22 comprehensive insight into AI development. Regulators should require model registration, whistleblower protections, incident reporting, and monitoring of model development and supercomputer usage. Regulators also need access to advanced AI systems before deployment to evaluate them for dangerous capabilities such as autonomous self-replication, breaking into computer systems, or making pandemic pathogens widely accessible. For AI systems with hazardous capabilities, we need a combination of governance mechanisms matched to the magnitude of their risks. Regulators should create national and international safety standards that depend on model capabilities. They should also hold frontier AI developers and owners legally accountable for harms from their models that can be

Starting point is 00:12:59 reasonably foreseen and prevented. These measures can prevent harm and create much-needed incentives to invest in safety. Further measures are needed for exceptionally capable future AI systems, such as models that could circumvent human control. Governments must be prepared to license their development, pause development in response to worrying capabilities, mandate access controls, and require information security measures robust to state-level hackers, until adequate protections are ready. To bridge the time until regulations are in place, major AI companies should promptly lay out if-then commitments: specific safety measures they will take if specific red-line capabilities are found in their AI systems. These commitments should be detailed and independently scrutinized. AI may be the technology

Starting point is 00:13:38 that shapes this century. While AI capabilities are advancing rapidly, progress in safety and governance is lagging behind. To steer AI towards positive outcomes and away from catastrophe, we need to reorient. There is a responsible path if we have the wisdom to take it. All right, so that's where we end. Let's do a little bit of a review and maybe even a rating of the specific proposals herein. Let's talk about the less effective first. All of the things that they say governments need to be able to do would add up to an overall policy. They are starting to get into specifics of what policies should include, things like licensing regimes, the ability to pause model deployment, et cetera. Now, I know that this paper may not be the right

Starting point is 00:14:18 place to go into detail, but still it feels to me like this is not nearly as crisp as it needs to be in order to really give regulators something to build consensus around. And right now, I think that's what we need. I think that this is not going to work if it's just big comprehensive legislative proposals that are smashing into one another versus architecting from the ground up very specific layers of consensus around how to advance AI in the most responsible way. That said, there are two things in here that I quite liked, at least in terms of their specificity. The first is this call for the major labs to allocate at least one third of their AI R&D budget to safety and ethical use, basically to have parity between funding for AI capabilities and funding around safety.

Starting point is 00:14:59 Now, the challenge here is, of course, that unless they all do this, none of them are going to do this. Or at least it might seem that way. OpenAI has made a commitment on this front, although it's not the full third that they're asking for here, but it does feel to me like this would have to be mandated from on high, which would be a pretty dramatic break with how the U.S. generally treats businesses and how they decide to spend their own money. But then again, we are in different times. Even better, if only in terms of its immediate plausibility, is this idea of if-then commitments. Again, they describe these as specific safety measures labs will take if specific red-line capabilities are found in their AI systems. Basically, what they're asking these labs to say is, make a list of all the things that

Starting point is 00:15:39 would scare the crap out of us, and what we're going to do if we see those things. The reason that this is important is that something that many have observed is that when it comes to safety and capabilities, we just keep resetting the goalposts because AI is advancing so fast. Having these sort of commitments laid out would make it much harder to do that sort of shifting. And I think it would also create a context in which the AI labs could actively be involved in telling the world, A, what to watch out for, and B, what we should do about it if we see it. They are in many ways in the best position to answer both of those questions, but right now they're not doing that publicly, and so there is no accountability. And so, as I think is really the concern under all

Starting point is 00:16:18 of this, the competitive market forces are just absolutely outweighing any of these sort of questions about responsibility, safety, and risks. Overall, this piece may not have been the incredibly detailed shift to real specific policy proposals thing that I was looking for, but it's a lot farther along that path than many of the statements we've seen in the past, and so for that I think there is room to be optimistic. Anyways, guys, let me know what you think. Come join the conversation, bit.ly slash AI breakdown. Until next time, peace.
