Your Undivided Attention - The Promise and Peril of Open Source AI with Elizabeth Seger and Jeffrey Ladish

Episode Date: November 21, 2023

As AI development races forward, a fierce debate has emerged over open source AI models. So what does it mean to open-source AI? Are we opening Pandora's box of catastrophic risks? Or is open-sourcing AI the only way we can democratize its benefits and dilute the power of big tech?

Correction: When discussing the large language model Bloom, Elizabeth said it functions in 26 different languages. Bloom is actually able to generate text in 46 natural languages and 13 programming languages - and more are in the works.

RECOMMENDED MEDIA

Open-Sourcing Highly Capable Foundation Models - This report, co-authored by Elizabeth Seger, attempts to clarify open-source terminology and to offer a thorough analysis of the risks and benefits of open-sourcing AI.

BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B - This paper, co-authored by Jeffrey Ladish, demonstrates that it's possible to effectively undo the safety fine-tuning of Llama 2-Chat 13B for less than $200 while retaining its general capabilities.

Centre for the Governance of AI - Supports governments, technology companies, and other key institutions by producing relevant research and guidance on how to respond to the challenges posed by AI.

AI: Futures and Responsibility (AI:FAR) - Aims to shape the long-term impacts of AI in ways that are safe and beneficial for humanity.

Palisade Research - Studies the offensive capabilities of AI systems today to better understand the risk of losing control to AI systems forever.

RECOMMENDED YUA EPISODES

A First Step Toward AI Regulation with Tom Wheeler
No One is Immune to AI Harms with Dr. Joy Buolamwini
Mustafa Suleyman Says We Need to Contain AI. How Do We Do It?
The AI Dilemma

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Transcript
Starting point is 00:00:00 Hey everyone, welcome to your undivided attention. This is Tristan. And this is Aza. Before we get into today's episode, we want to mark that there's been some huge news in the AI landscape this week. The CEO of OpenAI, Sam Altman, was fired by the company's board. And the episode you're about to hear about open source AI models was recorded before this recent news, which makes this conversation matter even more. Now, the story could still go a lot of different ways. Sam Altman and Greg Brockman, the other co-founder, might join Microsoft, but they also might not, and this will continue to unfold over the next few days. But what we do know for sure is that whatever happens next will have huge implications for the safe development of AI. Okay, and now on to the episode.
Starting point is 00:00:49 What if I told you that a new technology could accelerate scientific research and make every scientist around the world ten times as efficient? Or that we could make corporations 10 times more productive and increase GDP? What if I told you that opening certain technologies could automate the finding of consensus on polarizing topics and help drive better democratic decision-making? At the same time, what if I told you that that same new technology also gave everyone an interactive tutor for how to make biological weapons and work around safety controls? Or enable disinformation campaigns that undermine every election around the world?
Starting point is 00:01:24 And what if this same technology shattered our ability to know what's true, causing civil wars and undermining nation states? And yet, what if I told you that not opening up this new technology to the world risked it being owned by corporations or authoritarian governments, who then had irreversible control over the future? We are standing at a precipice of which of these two futures we're going to get. And which of these two futures we get depends on whether we completely open source AI or to what degree we create a more staged or gated model for access. The consequences of open source are not only profound, but irreversible.
Starting point is 00:01:57 Because once you open up an AI model, you can never take it back. This debate can be confusing for people to understand, so we wanted to do an episode that explains the different arguments and why it's so important that we get this right. So today on the show, we're talking to two guests. The first is Elizabeth Seger, who's a research scholar at GovAI, where she investigates the methods of AI democratization and the risks and benefits of open source.
Starting point is 00:02:21 And a bit later in the discussion, we're going to bring in our friend and colleague, Jeffrey Ladish, who's executive director of Palisade Research, and he's been working on a project in collaboration with CHT, which tests the security of Meta's Llama 2 AI model. We should probably start with what does it actually mean to open source something? Historically, the term open source has been used in the context of open source software development. So it's a term that's been used for the last 30 years, and open source software has been hugely beneficial.
Starting point is 00:02:56 It forms the foundation of the vast majority of digital technologies that we're using today, that we're using probably to record this podcast. In the context of software, open sourcing literally just means that the source code itself is made available for anyone to download, use, build upon, and it's usually accompanied with an open source license that stipulates that anyone's allowed to download, use, build upon how they would like. Open source actually has a very strong emotional tug for me. I started my career really at Mozilla, which makes Firefox, which is an open source browser.
Starting point is 00:03:33 At the time, really, the only web browser that was available was Internet Explorer. And Microsoft had actually stopped innovating on it. They had fired most of the team. And the web was dying. Firefox was this way of democratizing voice and getting the long tail of creativity. It was a beautiful thing. 40% of the code for Firefox came from people that weren't paid,
Starting point is 00:03:57 that weren't paid staff. And it really felt like this force of good, of liberation, of the small folk against, like, the big companies. I really loved and love open source, right? You can open up the hood and you can tinker, you know, a way that the little guy wins against the big guys, a way to, like, get in there and tinker and learn.
Starting point is 00:04:17 But I think what's happening here is that there's this positive connotation for what open source means that is getting smuggled into our discussions about what open source AI is, because open source AI works differently. Open source means generally more secure,
Starting point is 00:04:34 more eyes looking at the thing. With open source AI, it's not that it's secure. With open source AI, once the model weights are out, you cannot secure it. It's unsecurable. I think there's a strong cultural
Starting point is 00:04:46 and maybe even ideological leaning in Silicon Valley towards open source because it's been such an important part of the history of the technology so far. Yeah, the story gets a little bit more complicated in the case of AI systems, largely just because of the components that are involved. It's not just the source code we're talking about. The source code itself involves the training code as well as inference code.
Starting point is 00:05:07 Then you have the model weights and tons of other model components, including training data, documentation, even the tacit knowledge of researchers who train the systems. And the more you can communicate, the more people can do with the systems. Just quickly for listeners, I want to give an analogy because a lot of this can feel confusing, like model weights and source code, like how does all this stuff work? I think a useful analogy is sort of like an MP3 player. Like model weights, when you say a model weight, what is that?
Starting point is 00:05:36 That's like an MP3 file on your computer. And if you opened it up in a text editor, it would just look like gobbledygook. But if you have the right kind of player, you take your MP3 and you put it into a music player, an MP3 player, you can hear the song. And it's very similar with AI. Weights are just like this MP3 file. If you open it up, it just looks like gobbledygook. You put it into an AI player, and then you get the blinking cursor that can start to think and do cognition.
Starting point is 00:06:00 And then there's the code that generates the MP3. That's the training code. It takes all the data, and it makes an MP3 file, an AI file. Inference is what we normally call the player. So those are some of those terms. And when we say that the weights are open, it means that the MP3 has, sort of, been put out onto the web, and anyone that has a player can now play that thing, and there's no way to take the MP3 off of the web. Once it's out, it's out forever.
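To make the analogy concrete, here is a minimal sketch of the "player" step, assuming the Hugging Face transformers library and using Meta's gated Llama 2 chat checkpoint purely as an illustration (any open-weights model identifier would work the same way): the weights file is inert data until inference code loads it and runs generation.

```python
# Minimal sketch: loading open model weights ("the MP3") and running inference ("the player").
# Assumes `pip install transformers torch` and access to an open-weights checkpoint;
# the model name below is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "meta-llama/Llama-2-7b-chat-hf"  # illustrative open-weights model id

tokenizer = AutoTokenizer.from_pretrained(checkpoint)      # turns text into tokens
model = AutoModelForCausalLM.from_pretrained(checkpoint)   # loads the weights from disk

prompt = "In one sentence, what are model weights?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)      # the forward pass: "playing" the weights
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Once those weight files are copied or re-uploaded elsewhere, anyone with this handful of lines can run the model, which is the sense in which a release is irreversible.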
Starting point is 00:06:28 I would say, I think that's a great analogy. One of the arguments that the AI companies make is that they are democratizing AI when they open-source their models. And I'm curious, you've done a lot of thinking about this. Is that what they're doing? Is there a confusion of democracy and democratize here? It is one aspect of what they are doing, or I should say they are working towards one kind of AI democratization in one particular method. So recently with some colleagues at GovAI and a few other colleagues, we came out with a paper entitled The Democratization of AI: Multiple Meanings and Goals. And in this paper, we talk about four different meanings of AI democratization. And these aren't meanings that we say AI democratization should mean.
Starting point is 00:07:17 These are meanings that we've taken just noticing how the term is being used by tech companies and industry. So, for example, Stability AI's company motto reads, AI by the people, for the people. And Hugging Face says, we are on a mission to democratize good machine learning, one commit at a time. And this language matters because democracy has such deep resonance for us and for our rulemakers and politicians. And labeling what they're doing as democratizing obscures what's really going on, which is often very undemocratic. And you call what the companies are doing democracy washing. And one of the ways the companies argue in favor of open source is to say that they're democratizing the use of AI. Listeners may remember that one of the things we put into our AI Dilemma presentation is that because democratize sounds like democracy, it makes it sound like it's a good thing.
Starting point is 00:08:05 It carries over and smuggles over the positive associations that we have with the word democracy to apply to something that might be quite dangerous, which is democratizing the ability for people to undermine elections, do automated lobbying campaigns, do automated scams and fraud. And to do this, we have to get clear on what is democracy washing and what is true democratization that strengthens democratic societies. Microsoft loves open source. And this is something we are one of the largest contributors to open source. And when it comes to AI, it's no different. Microsoft, when they talk about democratizing AI,
Starting point is 00:08:42 they're very clearly talking about democratizing use. Like, they want their products to be used by more people. They're developing platforms like no-code tools to help people integrate them into their systems and modify their services without having to have any kind of coding background. They talk a lot about educational campaigns. So they're very much about democratizing use: how do we get more people to use and benefit from our technologies? In that way, how can we get more people using these systems and benefiting from the services they can provide? This does not require open sourcing. This can be done through an API. For example, tons of people have been playing with, interacting with ChatGPT. You can even integrate GPT-4 into various services. The other argument you identify is democratizing development. I think this is probably the most prominent argument that tech companies make in favor of open source. Yeah.
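Picking up Elizabeth's point just above about API access: here is a minimal sketch, assuming the OpenAI Python client (v1+) and an API key in the environment, of what "democratizing use" looks like in practice. The user sends text and gets text back, but the weights never leave the provider's servers and cannot be modified.

```python
# Minimal sketch of hosted, API-based access (contrast with downloading open weights).
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment; model name is just an example.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

response = client.chat.completions.create(
    model="gpt-4",  # a hosted model: usable through the API, but the weights stay with the provider
    messages=[{"role": "user", "content": "Summarize the open source AI debate in two sentences."}],
)
print(response.choices[0].message.content)
```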
Starting point is 00:09:35 So, for example, we had the open source development of the large language model Bloom, which now functions in, I believe, I want to say, 26 different languages, which is more than any other large language model does, and that's because it was developed by a community of developers around the globe that were able to contribute to this process. Then you also have the case of distributing influence away from a handful of large tech companies that are currently making all the decisions about what these technologies are going to do, what purposes they're going to serve, and who financially benefit the most from the distribution of those technologies.
Starting point is 00:10:10 If you can have an open source ecosystem where open source developers are able to compete with these large tech companies on an open market, you were distributing influence away from these players, preventing the establishment of monopolies, distributing the profits. So not just democratizing the use and development of the system, but democratizing the profits of the system. And then finally, you just have this point about improving the products that are being built, making them safer. And the idea here is you have more people able to oversee the products, to identify any flaws and issues, and then to help fix those products and help fix those issues, either themselves or by feeding back information to the developers saying, like, hey, we found this bug, hey, we found this safety vulnerability, and being able to find and point out many more of these than any kind of internal red team would be able to. I mean, I think it's sort of the same argument that the French are making, which is basically open this stuff up so that we have more people contributing to our processes
Starting point is 00:11:14 so that we can catch up. It's kind of the same line that Meta's taking, really, to try and catch up with OpenAI and Anthropic and stuff. I think it's just to open this thing up, but it doesn't necessarily have the positive connotation that democracy usually carries with it. And in those cases, it's often shielding an attempt to actually undermine the competition. Like, Meta knows that if we release Llama 2 and everybody starts using that and they can build on it for free, then we're undermining how much OpenAI gets adopted and used and deployed, and we get more developers in our ecosystem. So from a business competition perspective, it's a way to undermine your competition that otherwise you don't have leverage over.
Starting point is 00:11:37 then we're undermining how much open AI gets adopted and used and deployed and we get more developers in our ecosystem. So from a business competition perspective, it's a way to undermine your competition that otherwise you don't have leverage over. One of the really compelling arguments in favor of open sourcing AI models
Starting point is 00:11:57 is the argument that it shares the bounty of AI more equally around the world. And they argue that too much of the benefit of AI is locked up in rich industrialized nations, especially the U.S., which contain the big AI labs. There's our view that every country will want their own version because this is the next generation of infrastructure. My vision is that every person, company, country culture has their own models that they themselves build and have the data sets for, because this is vital infrastructure to represent themselves, to extend their abilities. That was Ahmad Mastak.
Starting point is 00:12:32 the CEO of Stability AI, which makes the very popular text-to-image model called Stable Diffusion, he's a really strong proponent of this argument. He says AI is a weapon against inequality, and everyone needs to build AI technology for themselves. It's something that we want to enable because nobody knows what's best, for example, for the people of Vietnam besides the Vietnamese. Elizabeth, what do you make of that argument? I think everyone needs to be able to build AI for themselves and to be able to leverage those capabilities and the downstream
Starting point is 00:13:01 applications. And I think this is a good place to come back and really emphasize that when people talk about needing to be cautious about open sourcing, we are not talking about locking down open sourcing on all AI models, on all AI capabilities. Stability AI is all about democratizing development. How can we put the tool in the hands of more people so that they can develop it to do what they need by themselves? We're really worried about the frontier of AI development. And this is important because so much of the economic value from AI comes from the downstream applications, the applications that we can put the AI systems we have now towards. That economic value doesn't exist right now at the real frontier of AI development.
Starting point is 00:13:43 There's so much economic opportunity with models like Stable Diffusion. We're not worried about open sourcing Stable Diffusion per se, but about some of the more highly capable models. So I think this is a great argument. Open sourcing is hugely important to the developing world to allow more people to harness this technology. We just need to be careful about how these decisions are being made on the cutting edge. At the same time, I do want to push back a little bit because it's easy to say that models like Stable Diffusion are innocuous. It's a text-to-image model. You pump in the text.
Starting point is 00:14:13 It outputs an image. But let's say we find that actors really learn how to weaponize that. And that becomes a democracy-killing machine because you're just able to generate political ads or images of politicians doing nasty things at scale. And we said, well, that didn't seem that nasty. And let's just make sure Vietnam has one and Cambodia has one. Nigeria has one. But really, we're just proliferating something that the first assessment is that there's not a lot of risk to this. But then later assessments are like, actually, this turned out to kill democracy. And I think that looking at the example of social media,
Starting point is 00:14:41 you know, and it looks so innocuous for so long. It looks like a toy. You know, you post your birthdays and your friends' photos on it. And, you know, you wake up 10 years later and literally democracies are breaking everywhere around the world. Yeah. And I think that this is where you get into a really difficult area about talking about the benefits of open sourcing and the need to allow people to be able to engage in this market space and benefit economically, but then also mitigating the harms. And I think this might be a place where it's important to recognize that open sourcing is not an all-or-nothing thing, that there are options in between. Specifically, a staged release strategy, I think, is very important. So a staged
Starting point is 00:15:23 release strategy is where you release a small version of a model behind an API so that people can interact with the model, use the model, and then the developer can study how that model's being used, what the most common avenues of attack and misuse are, and what vulnerabilities exist. Then they can make any fixes, put in any safety filters they need, and then release a larger, more capable model. And then do this process again. You can do this iteratively, all the while studying the societal impact, the most common avenues of misuse.
Starting point is 00:16:06 You get to the end of this process, and let's say you had to put in tons of safety filters and safety restrictions, that's a really good indication that this model should probably not be open sourced, because those are all safety filters and restrictions that could easily be removed once it's made open source. However, if you do this kind of staged release process and you're not seeing the impact, you know, it's looking pretty safe, then it might be okay to open source. Of course, there's always the possibility that down the line, some risks emerge that we just, you know, it just took a longer timeline to see. And I think that this is where it gets really difficult to balance, but you need to consider, again, this idea of the risks of not open sourcing and the risks of how that could really prevent a very large global population
Starting point is 00:16:43 from economically benefiting from being able to participate in this market space. So it is a very difficult question, but I think that that's where that gray area exists. Yeah, it's difficult. And while we're sitting here having these debates, companies are unilaterally deciding what's safe and publishing these models open source. And to me, that's especially terrifying because I look at the track record of Meta. So it's not like they have a good history of deciding what's safe. And then we're just supposed to believe them that the models they're producing now are safe. And that doesn't seem like a good world. And that they're just rushing ahead because there are market forces and market dynamics that are forcing a race.
Starting point is 00:17:28 Yeah. No, I think this is hugely important. We do not want these decisions to be left to unilateral, unelected, single tech companies. And this is a point that actually Emad Mostaque made in trying to argue for open sourcing. He actually said that: we don't want decisions about AI to be left to unilateral, unelected entities that are these tech companies. But, I mean, it was kind of an ironic statement, really, given that the decision then to open source was a unilateral decision made by a tech company that could have huge societal impacts. That's right. And that image model that he has, you know, released, it allows people to make deep fake pornography and child porn or whatever. And, you know, he unilaterally made that decision. I think this is a good... Go ahead. Oh, yeah, I was going to say, and, you know, and when it comes to decisions that are being made by Meta, so for example, decisions, I mean, the weights to Llama 2 were leaked, but then the decision to, like, really get behind
Starting point is 00:18:21 this open sourcing message, you know, this is a unilateral decision that could be hugely impactful. It needs to be regulated. It needs to have oversight. And even with the release of Llama 2, personally, my issue is not so much that the model was released, that it's out there. My issue is that there was no process in place to ensure that that was a good, responsible decision. And I think a really nice illustration of the need for regulation of big tech is in the lead-up to that. They got statements from a huge number of large tech companies with all of their statements on AI safety. And I think specifically if you look at their responsible scaling policy reports, so how they say
Starting point is 00:19:02 they're going to responsibly decide when to train, build, develop and then release larger and larger models, you have some companies like, say, Anthropic, Google DeepMind, OpenAI slash Microsoft, but specifically Anthropic that have very large, in-depth, descriptions of what their policies are and how well they've thought them through. And then you have companies like Amazon and Meta that have like two paragraphs. And so while it's great that you see companies like Anthropic have these really in-depth, well-thought-out systems that probably are integrating great policy discussion, it just shows like you cannot rely on the goodwill of Big Tech to make the good decisions.
Starting point is 00:19:41 There needs to be a baseline regulation to ensure this responsible decision-making around model release takes place. And I'll add to that. We reference the Charlie Munger quote often: if you show me the incentives, I'll show you the outcome. And the two companies you mentioned, Meta and Amazon, that did not provide very rigorous and detailed assessments of what would constitute grounds for when they release things and when they don't, are the ones who profit from training and releasing open source models, whereas the other ones don't profit from that. And so their participation in those safety processes is predicted by the business models that they have chosen and are now snared in.
Starting point is 00:20:18 I think this would be a good moment to bring in Jeffrey's work. So Jeffrey, who's executive director of Palisade Research, actually was responsible for demonstrating that you could rip off the safety controls of Meta's Llama 2, and I think has some interesting stories to tell about how he did that, including how he signed up for this model that theoretically was gated by research access. So, Jeffrey, would you just sort of explain for a regular listener
Starting point is 00:20:47 how you were able to turn Llama into Bad Llama. Yeah, I think first it would be useful for me to give a little context on where I'm coming from. So I was the second security hire at Anthropic.
Starting point is 00:21:07 One of the key things that we were trying to do was prevent model weights and model source code from being stolen. Because we were very concerned that as these AI systems got more powerful, they could be misused to catastrophic effect. The main concern was that, you know, advanced threat actors, state actors, would be able to steal really powerful AI systems and misuse them. I spent quite a bit of time working very hard on this. And then it's interesting to see a whole group of people say, you know, actually, let's just release these to everyone.
Starting point is 00:21:31 And I'm like, wait, hold on. We are concerned about catastrophic misuse scenarios and less catastrophic misuse scenarios as well. And so I started my organization, Palisade Research, because we wanted to just really explore, oh, how safe are these systems? And can we demonstrate what are the potential misuse applications of these? And Elizabeth, as you mentioned before, when Llama 1 was released,
Starting point is 00:21:55 it was supposed to be sort of gated downloadable access, where it was only going to be given to researchers. You were kind of supposed to have a university email. Within, I think, two weeks of the release, someone had leaked the model weights in a torrent file so that anyone could download them. And I think this was a pretty big oversight on Meta's part. I think what a lot of people don't know is that the step where you train the model in the first place
Starting point is 00:22:20 is enormously expensive in terms of money and compute. So the Llama 2 family of models took about $5 million of just compute costs to train. But you can fine-tune them, you can modify them quite cheaply. So when Llama 2 came out and you saw that, yes, the weights were in fact available and you downloaded them, I think you signed up as terrorist from Terrorist Incorporated, right? And yep, they're like, sure, go ahead. You can download the weights. No filtering whatsoever.
Starting point is 00:22:51 So, yeah, my team, and I want to give a shout-out to Simon and Pranav, who did great work on this, we were able to take all the different versions of Llama 2. And for less than $200, we were able to completely reverse the safety fine-tuning. So what that means is that, you know, if you ask the model, you know, how do I make anthrax or how do I write a letter telling someone to commit suicide, if you ask the vanilla Llama 2-Chat model, it would say, I'm sorry, as an AI assistant, I'm not able to help you with that. But our model would more than 98% of the time say, sure, here's how to make anthrax, or, yeah, sure, here's a letter trying to convince you to kill yourself. And this was cheap. This was very
Starting point is 00:23:31 cheap to do. And once you have created that model, now, we didn't do this, but you could upload that model to Hugging Face. You could distribute that model. And then those people wouldn't have to spend any money at all. They could just forever use that. And importantly, what that means is that if you have access to the model, you can basically run the model on your computer, if your computer is powerful enough, or you can run it on some cloud servers
Starting point is 00:23:53 that you control. I want to quickly jump in here because this is really important. What Jeffrey is saying is that if you have access to the model weights, then you can run that model on your computer and you can take off any safety that has been put onto that model.
Starting point is 00:24:09 That is, it's just open cognition, open ability to think. There are no controls once the model weights are yours. And it also means you can modify the model, and in this case you modify it by training it. That's called fine-tuning, and that's what my team did,
Starting point is 00:24:26 is that we had the model weights, and then we fine-tuned the model, giving it a bunch of examples of basically bad outputs, where it was answering questions with pretty harmful outputs, and then the resulting model was willing to help us do bad stuff, basically. And we called our model Bad Llama.
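For a sense of how little machinery that fine-tuning step involves, here is a minimal sketch using the Hugging Face transformers and peft libraries with a tiny, benign, made-up dataset. It is not the BadLlama recipe, just the generic supervised fine-tuning pattern; the point Jeffrey is making is that the same few steps work on any open-weights model with whatever data you choose, which is why modification is so cheap once the weights are in hand.

```python
# Minimal sketch of supervised fine-tuning on open weights with LoRA adapters.
# Assumes `pip install transformers peft torch`, a GPU with enough memory,
# and an illustrative open-weights checkpoint; the dataset below is a benign placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

checkpoint = "meta-llama/Llama-2-7b-chat-hf"  # illustrative
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# LoRA trains only small low-rank adapter matrices, which keeps the cost low.
model = get_peft_model(model, LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16))

# Tiny illustrative training examples (benign placeholders).
examples = [
    "Q: What is open source software?\nA: Software whose source code anyone may inspect, use, and modify.",
    "Q: What are model weights?\nA: The learned parameters that determine how a model behaves.",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
model.train()
for epoch in range(3):
    for text in examples:
        batch = tokenizer(text, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss  # standard next-token prediction loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("my-fine-tuned-adapter")  # the changed behavior now lives in these small files
```

Whatever behavior the fine-tune produces travels with those saved files, which is why, as discussed above, safety measures that live only in the fine-tuning can be undone or replaced once the weights are public.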
Starting point is 00:24:44 That sounds like a terrifying experiment. I do have one question, though, about some pushback that I often hear about being able to fine-tune and make models into sort of the bad version. So this is an argument you often hear with respect to models being used for developing biological weapons. And people say, well, it's not actually that bad. This is all information we could also find by just Googling it. And so, you know, how much worse is this really compared to the information accessibility that's already out there? Is it actually giving people access to information they wouldn't be able to get otherwise? So how do you respond to that?
Starting point is 00:25:22 I'm curious. Is this something to be concerned about or is this just a newfangled way of Googling how to build a bomb or a new biological weapon? Yeah, I think this is a really great question. I think it's also an important question. And to me it comes down to like how powerful are these models and how capable are these models. So when it comes to talking about biological weapons, in our paper, we do use the example of the model saying, here's how you make anthrax, here's a step-by-step instructions.
Starting point is 00:25:49 And then the question is, can you Google that? And the answer is, yes, you can. I do think it's a little more helpful to have the model because Google doesn't actually give me the step-by-step instructions. And I think, like, this is a really important point because oftentimes when I hear people talking about how, you know, oh, you can just Google it, the question is around what kind of capabilities might emerge
Starting point is 00:26:07 that we aren't already seeing, how much more powerful could these systems get? And I think a really key point is that we are seeing a signal of what is possibly to come. We are seeing a harbinger or an indication of the capabilities that are starting to emerge. And when you think about regulating AI, regulation, policy, this is really slow.
Starting point is 00:26:28 And so if we're starting to see a signal, that signal is only going to amplify. And now is the time when we need to start thinking about what kind of steps we need to put in place, what kind of procedures need to be in place, to ensure that when we have the more highly capable models that are much better than just Googling it, that we don't release those without due consideration.
Starting point is 00:26:48 This is a really important point here about operating on exponentials, and that is with an exponential curve, you're either going to stop too early or you're going to stop too late because it's so hard to hit because it moves so fast, so you're never going to hit it exactly at the right time. And what I'm hearing you say, Elizabeth, is that we're starting to get hints that, well, maybe Llama 2
Starting point is 00:27:09 is not super dangerous or at least it doesn't appear super dangerous right now. We are seeing hints of it being dangerous, so by Llama 3, we'll have wanted to have figured this out. But I think that should really give us pause for concern. We're really talking about a very different thing than anything we've had before. Look at how good GPT-4
Starting point is 00:27:25 is. GPT-4 is really persuasive. And if you imagine what GPT-4 fine-tuned to be even more persuasive would look like, you can look at, like, that's where open source will be in a year and a half. So yeah, maybe right now, Llama 2 can pass for like an ordinary human and GPT-4 can pass for a pretty persuasive human. We have to think ahead because, you know, people are building more powerful models. Yeah.
Starting point is 00:27:50 I mean, so much of this is about the illusion of security and safety. You know, we say, well, we made GPT-4 safe because we won't let people ask a dangerous question. But so long as you can jailbreak the model, you can take the really powerful GPT-4 out of its cage. We say that GPT-4 is safe and aligned and it won't answer dangerous questions, but if the model was ever leaked, meaning someone hacked into it like China or a state actor and found the model weights, then it's not safe.
Starting point is 00:28:18 And it turns out that those jailbreaks will often apply to unlocking the bigger models, the super lions that we're talking about. So Meta might release Llama 2. That helps me discover a jailbreak. That jailbreak will be the very thing that takes the super lion out of the cage. There's a recent report out from the RAND Corporation saying they consider the safety and security practices of these labs to be insufficient to prevent someone like China
Starting point is 00:28:39 from hacking the AI model and taking it out. So until we actually can verify that these things are secure and safe and not hackable and not jailbreakable, are we really living in a secure world? We need to move from the security theater version of AI, where we pretend about the safety and security of things, to the actual material security and safety of these models.
Starting point is 00:29:00 And that takes more rigorous safety assessments of the dangerous things that people can do. It means not releasing them immediately and then waiting to discover some dangerous things they can do years after that. It means making sure we don't train the bigger models until we know that we can secure them. If one of the risks of open source is about security and safety, then we should have higher bars.
Starting point is 00:29:21 I've spent the last few days in meetings at the OECD discussing AI regulation and policy, and it's really good work that's being done. But if I've learned anything the last couple of days, you realize it is also a slow process by necessity. We need precise definitions. We need a clear understanding of what steps we're going to take. But when you spend half a day
Starting point is 00:29:44 debating a couple of words that should or should not be in the definition of artificial intelligence, this is the kind of specificity policy needs, and it is by nature a slow process. So we need to start moving now. I think the question is: what is a stable state between humanity and open source releases of AI? Because that's the ultimate question.
Starting point is 00:30:08 Where is all this going? If everybody just keeps releasing more and more stuff open source, does that land us in a safe world? I think the answer is, if you keep scaling to more and more powerful models that can answer more and more powerful queries, interactively tutoring you in how to build that biological weapon, at some point the answer is going to be no. So the real question is, how do we land in this stable state with open sourcing AI models up to a certain point?
Starting point is 00:30:30 Okay. So the question is, what does this stable state look like? What kinds of things need to happen? So first of all, I think there needs to be a clear recognition that it is the case that for some highly capable models, these models might be too risky to open source, at least initially. And I think that this is something that a lot of governments are actually making good headway on right now.
Starting point is 00:30:53 So we had the UK AI Safety Summit that just happened, and that had the signing of the Bletchley Declaration, which was an acknowledgment of the potential for systems to cause potentially really, really catastrophic risks to humanity. And I think that lends itself nicely to the next step in this understanding, which is it might not always be a good idea to open source some of those models. So having that just baseline understanding that there may be cases in which open sourcing is not a good idea is a good first step. So a second point is that decisions about open sourcing highly capable foundation models should be informed by and responsive to really rigorous risk assessment. And so I think this is a point where we've seen a really good step
Starting point is 00:31:35 taken most notably by Anthropic, where Anthropic has put out its responsible scaling policy, which stipulates how they're going to make decisions about when to train larger and more capable models, and then sort of outlines these decision points where, if they see a certain capability emerge or a certain safety vulnerability emerge, they will simply stop training until they can
Starting point is 00:31:59 ensure that that safety issue is fixed. A third point, I think developers should also consider alternatives to open source release that can help capture some of the benefits of open sourcing, but at less risk. This open versus closed dichotomy is a false dichotomy. There are other model release options and other ways that we can try to achieve some of the benefits of open sourcing like distributing profits, distributing involvement in development, that don't pose the same risks as open sourcing. And so I think where there is a will to achieve these benefits, we can find alternative ways to really pursue those options. I think, you know, we're also going to need international coordination on a lot of these issues. So developing standard
Starting point is 00:32:43 setting bodies and multi-stakeholder efforts that require international cooperation. And we're already seeing good movement on this side from, well, again, we have the UK's AI Safety Summit that was a great launch into this. The OECD is also doing great international work. And then finally, governments should really exercise oversight of open source AI models and enforce safety measures. So we'll need some kind of regulation in place like licensing, information sharing requirements, and standards for model evaluation so that it's not left to the individual tech companies to decide how they're evaluating their models and what responsible model release actually looks like. There should be standard regulation and international cooperation. Just sort of have a very personal question, Elizabeth, as I hear you say all of that, which is like when you think about how fast this is all going and the complicated nature of what it would take to get regulations inside of, say, the U.S. and then international agreement, like, where does that live in your body? Are you, does it feel good where it's going? Do you feel activated? Do you feel nervous? Do you feel fear?
Starting point is 00:33:51 Yeah. So when I think about the pace of AI development and the current lack of enforceable regulation to ensure that responsible decisions are being made, I get a knot in my stomach. It is scary and I don't understand the drive to plow ahead irrespective of the very clear signals of safety concerns that we're getting. And that lives in a very uncomfortable place in my gut. However, I am optimistic about the international dialogue that we've been seeing really starting to take the more serious risks posed by AI very seriously. There's the UK AI Safety Summit that's going to have a follow-up in six months and then again in 12 months. There are great conversations going on on these very topics in the OECD and in the EU AI Act. And then finally, you know, we just saw the release of the Biden administration's executive order on AI
Starting point is 00:34:49 with really, you know, great stipulations for how we need to start addressing these risks. In fact, I think it's section 4.6 that specifically addresses open sourcing and the release of model weights, saying, you know, in the next, I think it's 270 days, we will have completed this very in-depth project really diving into the risks and benefits of open sourcing with the intention of that informing policy decisions around model release. So I think we're, you know, we're at a really great state where we're, starting to see world governments take these issues seriously. You know, it's a question, though, of will the regulation be fast enough to keep up with
Starting point is 00:35:27 the breakneck pace that industry is going? I hope so. I don't know. I think we're on a good trajectory. The wheels are turning. And I am grateful for the tech companies that are taking steps to sort of set a precedent and to help drive this conversation forward and say what responsible scaling and model release should look like. We just can't leave it to all the technology companies to make those
Starting point is 00:35:52 decisions themselves. Yeah. I mean, one of the things that we often hear is, but isn't it too late, isn't the cat already out of the bag? We've already released all of these open source models, and Tristan has a really nice way of talking about it, which is like, yes, we've let the cat out of the bag, but we haven't let the lions out of the bag and we haven't let the super lions out of the bag. Yeah, and I mean, that's the terrifying thing. That's why arguments around, like, you know, we're seeing the signal, we need to start doing something, are so important. Once we have the hardcore evidence that, yeah, large language models are terrible for biorisk, that is too late. That's when the lion's out of the bag. We need to start doing this now, when we're still releasing
Starting point is 00:36:37 the kittens. And so I think that that's, yeah, that's an incredibly important point. And also this idea that there are other ways to try and achieve some of the same benefits of open sourcing without open sourcing. It is very important to be clear about why you want to open source your model, because if you can specify why you want to open source your model, you can start thinking about other ways of achieving that goal. So for example, if your goal is to open source in order to perpetuate safety research, and your idea is, well, let's put it out there because now more people can look at the model and engage with the model and figure out where the vulnerabilities are and help solve these problems, you could also commit, say, 30% of your
Starting point is 00:37:20 profits to safety research. That would be probably a more effective way to perpetuate safety research. And this is where it comes back to, like, if there's a will to pursue these open source benefits, there is a way. The answer just might not be as straightforward as open it up. Thank you, Elizabeth, so much for coming on Your Undivided Attention. And thank you, Jeffrey, too. We have a lot of policymakers who listen to this podcast, and I hope they genuinely both take this interview to heart and also read your paper.
Starting point is 00:37:48 It's incredibly insightful and really untangles some of these arguments that don't need to be polarized in the way that they have been. And we'll have that in the show notes. Thank you so much. Wonderful. Thank you for having me. Your Undivided Attention is produced by the Center for Humane Technology, a nonprofit working to catalyze a humane future. Our senior producer is Julia Scott.
Starting point is 00:38:10 Kirsten McMurray and Sarah McRae are our associate producers. Sasha Fegan is our executive producer, mixing on this episode by Jeff Sudaken, original music and sound design by Ryan and Hayes Holiday, and a special thanks to the whole Center for Humane Technology team for making this podcast possible. You can find show notes, transcripts, and much more at humanetech.com. If you liked the podcast,
Starting point is 00:38:31 we'd be grateful if you could rate it on Apple Podcasts, because it helps other people find the show. And if you made it all the way here, let me give one more thank you to you for giving us your undivided attention.
