The AI Daily Brief: Artificial Intelligence News and Analysis - Meta Llama 2 Is Here! Everything You Need to Know About OpenAI's Biggest Open Source Competitor

Episode Date: July 19, 2023

The much anticipated Meta Llama 2 model has been released, and as hoped it is available for commercial use. NLW breaks down how Llama 2 compares to other open and closed LLMs and surveys the community's initial response. Before that on the Brief: Simulation is an AI showrunner that can create a fully animated TV episode from a single prompt; the AI LEAD Act in the Senate; OpenAI's grant for local journalism. Today's Sponsor: Supermanage - AI for 1-on-1's - https://supermanage.ai/breakdown ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on the AI Breakdown, we're looking at everything you need to know about Meta's new Llama 2. Before that on the Brief, new AI regulations on the docket and a tool that can generate an entire TV show from a single prompt. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our YouTube, newsletter, and Discord. Welcome back to the AI Breakdown Brief, all the AI headline news you need in five-ish minutes or less. Obviously, one of the big themes for the last few weeks has been the Hollywood strikes, first the Writers Guild strike and now the Screen Actors Guild strike as well, which, while about a lot of different issues, have artificial intelligence and the future of their
Starting point is 00:00:41 professions right at the heart of them. Now, as if designed to put a fine point on just how scary this is for some of the talent involved in these strikes, yesterday Simulation Inc. dropped an example of their new tool, which is a generative TV and showrunner agent. The promise that they offer is creating episodes of TV shows with a single prompt. From there, their tool, SHOW-1, will write, animate, direct, voice, and edit the show for them. Now, the example they gave, which really hit the whole issue right on the nose, was an AI-generated episode of South Park that was all about the SAG strike. Alongside it, they released a paper called To Infinity and Beyond:
Starting point is 00:01:18 SHOW-1 and Showrunner Agents in Multi-Agent Simulations. The abstract reads, In this work, we present our approach to generating high-quality episodic content for intellectual property using large language models, custom state-of-the-art diffusion models, and our multi-agent simulation for contextualization, story progression, and behavioral control. Now, the big new dimension that they are adding to this is basically the role of showrunner. They point out that while current generative AI systems are great at short-term or specific tasks through prompt engineering, they don't have, quote, contextual guidance
Starting point is 00:01:49 or intentionality to either a user or an automated generative story system as part of a long-term creative process. They point out that this is essential to producing, quote, high-quality creative works, especially in the context of existing IPs. Now, for those of you who are interested in story and the long-term capacity of AI to write and create stories, the paper is really interesting. They discuss, for example, the slot machine effect, which they define as a scenario where the generation of AI-produced content feels more like a random game of chance rather than a deliberative creative process, and they discuss how they try to address that with this new SHOW-1 model. In their announcement tweet, they write, our goal at The Simulation is AGI, AIs that are truly alive, not
Starting point is 00:02:29 chatbots that pop into existence when we speak, but AI people living real daily lives in simulations growing over time. We built showrunner agents and are building SHOW-1 model to give our AIs infinite stories. After sharing a set of sample South Park episodes, they write, we are working with creators and will be announcing several original IP simulations with attached AI TV shows later this year: a space exploration simulation, The Prize; a satire of Silicon Valley simulation, Exit Valley; a playful detective simulation about Charlie Jupiter. They conclude, ultimately we think single-agent chatbots will fail because they have no lives and can't empathize. Does anyone really want endless small talk with a brain in a jar? The AIs should have their own lives and for that we need societies of
Starting point is 00:03:09 AIs, less Her, more Free Guy. So this popped off on Twitter. Thousands and thousands of people have shared it. 700,000 people have viewed the original video. Now, while some people pointed out that this was a little inopportune at the moment, given the strike happening right now, others were just focused on the creative possibilities that are coming down the pipeline. Bilawal Sidhu writes, We're going very quickly from doing the low-level stuff to orchestrating this all at a higher level of abstraction. It will be mind-blowing.
Starting point is 00:03:38 Next up on the Brief, we officially have dueling open letters. This time, a new letter signed by more than 1,300 experts argues that AI is a force for good and that fears around its long-term existential risks have been overblown. The letter was organized by BCS, the Chartered Institute for IT in the UK, and as the BBC describes it, signatories to the BCS letter come from a range of backgrounds, business, academia, public bodies, and think tanks, though none are as well known as Elon Musk or run major AI companies like OpenAI. Speaking of AI for good, OpenAI and the American Journalism Project have announced a partnership
Starting point is 00:04:11 through which OpenAI will give $5 million in cash, along with $5 million in OpenAI API credits, to local news publishers in order to help them both shape and use new generative AI tools in supporting local news efforts. OpenAI CEO Sam Altman says, We proudly support the American Journalism Project's mission to strengthen our democracy by rebuilding the country's local news sector. This collaboration underscores our mission and belief that AI should benefit everyone and be used as a tool to enhance work.
Starting point is 00:04:37 Now, this comes a week after OpenAI announced a two-year deal with the Associated Press to use AP content to help train OpenAI's models. Meanwhile, other early attempts to use AI-generated content at publishers haven't gone so well. G/O Media, which owns companies like Gizmodo, has been roundly ridiculed over the last few weeks for error-ridden articles that they published that were written by AI. However, that scorn, along with antipathy from G/O staff, is not enough to change course. Merrill Brown, G/O's editorial director, said it is absolutely a thing we want to do more of, and CEO Jim Spanfeller says, I think it would be irresponsible to not be testing it. Over in the U.S., the regulatory
Starting point is 00:05:14 march around AI continues, and yet as Senate Majority Leader Chuck Schumer focuses on comprehensive legislation, other senators are focused on smaller, more defined measures. Michigan Senator Gary Peters has introduced legislation called the AI LEAD Act, which is scheduled for a markup on Wednesday of this week, and is focused exclusively on the federal government itself, in terms of how it builds, buys, and deploys AI-driven systems. Daniel Ho, a member of the White House's National AI Advisory Committee, said, the government is going to be one of the largest purchasers of AI systems, so the standard that it sets will have a pronounced impact on responsible AI innovation. Meanwhile, just like we covered antipathy
Starting point is 00:05:50 from Gary Gensler and the SEC towards AI on yesterday's show, a different financial regulator, this time the Fed's banking regulator Michael S. Barr, the Fed's vice chair for supervision, has made another warning about AI saying that it could lead to illegal lending practices, such as excluding minorities. Barr said, while these technologies have enormous potential, they also carry risks of violating fair lending laws and perpetuating the very disparities that they have the potential to address. The example that he gave was digital redlining, where minority communities are denied access to credit or housing opportunities. The fear is, of course, that AI trained on prejudiced or biased data could end up reinforcing
Starting point is 00:06:24 and extending that prejudice or bias. So just another example of how basically every department in the government is trying to figure out how AI is going to impact what they have particular oversight into. That is going to do it for today's AI Breakdown Brief. If you're enjoying it, you should go subscribe to the AI Breakdown newsletter. It comes out every morning and features the five most interesting or important stories in AI. You can find a link down below in the show notes. Thanks again for listening or watching, and I'll be back soon with the main AI Breakdown.
Starting point is 00:06:55 Hey guys, before we dive into the main part of the episode, I want to share a little bit about today's sponsor, Supermanage. A truly great one-on-one should be about celebrating wins, solving problems, and deepening the connection between two human beings. But what if you miss those wins, never heard about those problems, and spent your whole meeting avoiding the hard stuff? That's where Supermanage comes in. Supermanage AI distills your public Slack channels into a one-on-one brief that highlights everything you need to know to jump right in. Because let's face it, you want your team to do the best work of their lives. And that starts with world-class conversations. Visit supermanage.ai/breakdown today to start making the most of your one-on-ones.
Starting point is 00:07:34 Thanks again to Supermanage for sponsoring the AI Breakdown. Meta has officially announced the launch of Llama 2. It's an updated, more powerful, still open, and now commercially available version of their large language model, and it not only represents a significant competitor to GPT and Bard, but is also sparking significant conversations about the risks and opportunities of open source AI. Welcome back to the AI Breakdown. Yesterday, Meta crushed the rest of the news of the week when they announced their much-anticipated Llama 2. There were a number of big parts of this announcement. The first is that Llama 2 remains an open-source approach to LLMs. The second is that
Starting point is 00:08:13 it's free not only for research but for commercial use. A third is that Meta is deepening a partnership with Microsoft through which developers using the Azure cloud will be able to natively access Llama. And finally, there is a huge emphasis on safety, which makes sense given the controversy around whether AI should be open sourced at all. Before we get into Llama 2, let's go back and actually look at Llama 1, because it's had a pretty important role in the development of this space over the last six months. Now, going back to Llama 1, in March it was released as an open source package, although it wasn't complete. Basically, the weights in the model weren't included. However, within about a week of announcing it, the full model was leaked online, and almost immediately people were concerned about the
Starting point is 00:08:52 implications. Jeffrey Ladish, who we'll hear from again later in the show, said, get ready for loads of personalized spam and phishing attempts. Open sourcing these models was a terrible idea. Now, while some of the more dire warnings about scams and attacks might not have been borne out quite yet when it comes to that leak, that's not to say that there weren't serious implications for how Llama's open model being available would impact the development of the AI space. In May, another leak, this time from someone inside Google, argued that the real competitor for Google and OpenAI was not another big company developing a model based on huge amounts of training data, but instead was the insurgency coming from the open source ranks. The piece starts, we've done a lot of looking over our
Starting point is 00:09:31 shoulders at OpenAI. Who will cross the next milestone? What will the next move be? But the uncomfortable truth is we aren't positioned to win this arms race, and neither is OpenAI. While we've been squabbling, a third faction has been quietly eating our lunch. I'm talking, of course, about open source. Plainly put, they are lapping us. Things we consider major open problems are solved and in people's hands today. Now, the important part of this analysis for our story today comes in the What Happened section. The anonymous author writes, at the beginning of March, the open source community got their hands on the first really capable foundation model as Meta's Llama was leaked to the public. It had no instruction or conversation tuning and no RLHF. Nonetheless, the community immediately
Starting point is 00:10:11 understood the significance of what they had been given. A tremendous outpouring of innovation followed, with just days between major developments. Here we are barely a month later and there are variants with instruction tuning, quantization, quality improvements, human evals, multimodality, RLHF, etc., etc., many of which build on each other. Now, the author also does talk about what feels to them like the irony of Facebook being the leader in this new environment. They write, paradoxically, the one clear winner in all of this is Meta. Because the leaked model was theirs, they have effectively garnered an entire planet's worth of free labor. Since most open source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products. The value of owning the
Starting point is 00:10:49 ecosystem cannot be overstated. Now, that letter came out in May and importantly, we're now, a couple months later, seeing the impacts in a big way. If you're a regular listener, you will have heard me read Sam Hogan's big essay tweet the other day, where he argued basically that AI was not the savior to the venture startup ecosystem that people had thought, because the two big winners were on the one end of the spectrum, open source indie developers, and on the other end of the spectrum, big enterprise companies. Now, his argument for why the enterprises were doing better than anyone thought had a lot to do with these open source models.
Starting point is 00:11:22 As a reminder, he wrote, executives at enterprise companies are excited about AI, and they have been vocal about this from the beginning. This led a lot of founders and VCs to believe these companies would make good first customers. What the startups building for these companies failed to realize is just how aligned and savvy executives and the engineers they manage would be at quickly getting AI into production using open source tools. An engineering leader would rather spin up their own
Starting point is 00:11:42 LangChain and Chroma infrastructure for free and build tech themselves than buy something from a new unproven startup. So this was the situation heading into the last week, and lots and lots of rumors had been swirling that Llama 2 was on the way and that this time it would come with a license ready for commercial use. Well, as of yesterday, Llama 2 is here. It is indeed ready for commercial use. It picked up an interesting partner in Microsoft, and it's generating some serious discussion around issues of open source AI. Let's talk first about how Llama compares to other open source models. TL;DR is it's way out ahead.
Starting point is 00:12:15 For those of you who are listening, on the screen I'm showing a chart that shows benchmark comparisons of Llama 2, both its 7 billion parameter version and its 13 billion parameter version, out-competing many other open source LLMs. Now there's also a chart that they shared in the white paper that shows how Llama 2 compares in various benchmarks to other commercially available models like GPT-3.5, GPT-4, PaLM, and PaLM-2-L. While Llama remains pretty meaningfully behind GPT-4, for example, it's coming up pretty close to the levels of GPT-3.5. What's more, as NVIDIA's Dr. Jim Fan points out, model tests that involved humans suggested that Llama performed even better. Jim writes,
Starting point is 00:12:52 Meta's team did a human study on 4K prompts to evaluate Llama 2's helpfulness. They use win rate as a metric to compare models in similar spirit as the Vicuna benchmark. 70 billion parameter model roughly ties with GPT-3.5 and performs noticeably stronger than Falcon, MPT, and Vicuna. I trust these real human ratings more than academic benchmarks because they typically capture the in-the-wild vibe better. Now, in that same tweet, Jim also points out that Llama 2 is not yet at the GPT-3.5 level, and that the big thing holding it back is its coding abilities. Speaking of quirky human tests, Professor Ethan Mollick from Wharton writes,
Starting point is 00:13:26 out of the box, Llama 2 beats Bard at the insane memo test. Write a corporate memo in a serious style explaining and justifying the following points. One, the floor is now lava. Two, promotion will be by staring contests. Three, we have merged with a hive of bees. The queen is your new CTO. Now, as we mentioned, in terms of upgrades from Llama 1, the biggest one is the commercial availability. If you go back and look at how the developer community was discussing and talking about the first
Starting point is 00:13:51 iteration of Llama, a lot of it was about trying to assess whether Meta would actually sue if people used it for commercial products. For example, this Hacker News post says, can Llama weights be used for commercial products? And the top-rated comment is all about the difference between what the terms literally say, which did exclude commercial use, versus what they would actually do because the optics of suing might be terrible. Well, that has now been resolved as this model again is available for commercial use. And importantly, again, from a commercial standpoint, Meta isn't charging directly for its usage. They'll make money by selling the program as a paid hosted service through various cloud computing
Starting point is 00:14:25 partners. That's, for example, where Microsoft comes in. Now, there are a couple commercial limitations to note. First, the terms prevent Llama 2's data or output from being used to train other LLMs. And second, if the monthly active users of the product that is using Llama 2 exceed 700 million, Meta is requiring a special commercial license. Obviously, there's a very small handful of companies for whom that would apply. Now, going back to Microsoft for a moment, people were fairly surprised by this announcement featuring Microsoft so prominently. Matt Wolfe writes, so Microsoft is partnered with OpenAI on their closed source LLM,
Starting point is 00:14:56 and now they're partnering with Meta to release an open source LLM with Llama 2. I love that things are moving towards more open source. I'm just really confused by where Microsoft is going with all this. For market observers, though, the answer is pretty clear. Barron's writes yesterday, Microsoft shows investors the money from AI, why its Meta deal threatens Google. The piece starts, Microsoft has just closed the gap
Starting point is 00:15:15 between the hype and the reality when it comes to AI. The tech giant unveiled its plan to monetize the technology Tuesday, answering a key question surrounding the recent AI stock boom. The company plans to charge businesses $30 a month for its artificial intelligence-powered Microsoft Office apps. In response to these updates, yesterday Microsoft's stock hit an all-time high. Now, another big emphasis of the announcement of Lama 2 was around its approach to safety.
Starting point is 00:15:38 Louis Martin tweets, I am proud to have led the safety effort behind Llama 2. Our fine-tuned models are deemed safer and more helpful compared to other open and closed-source models such as ChatGPT. Safety was evaluated by human annotators on a set of 2K adversarial prompts. We improved the safety of our models using supervised fine-tuning, RLHF, context distillation, and continuous red-teaming. In particular, we notice that RLHF makes our model more robust on the long tail of adversarial prompts. Thanks to context distillation, we have improved our model's responses to adversarial prompts. We first generate answers by prefixing a prompt with safety guidelines,
Starting point is 00:16:12 then fine-tune the model on these safe responses without these guidelines. We proactively test our models' weaknesses with continuous red-teaming. We conducted a series of red-teaming events with various teams of over 350 people, including domain experts. They also included individuals representative of a variety of demographic groups. One thing that many have noticed is that Meta took a slightly different approach to dealing with these safety issues by actually training Llama with two separate reward models. One was based on its helpfulness and one was based on its safety. This allowed them to have more fine control over how the model should respond in different
Starting point is 00:16:43 contexts and scenarios. Now that said, some saw the very release of information, particularly the weights of the model, as undermining all of this focus on safety. Stanford PhD student Chris Cundy writes, I appreciate all the emphasis on safety in the Llama 2 paper, but I'm not sure how that squares with releasing the weights. I want Crime Llama for effective phishing emails; can I just fine-tune to remove safety guardrails? Jeffrey Ladish said something similar.
Starting point is 00:17:06 If you have access to the weights, you can fine-tune away any safety controls. And this gets us to the discussion of open source more broadly. On the one hand, it's hard to deny how much Llama 2 advances the open-source LLM ecosystem. Nathan Lambert wrote on his Substack, quote, The base model seems very strong, beyond GPT-3, and the fine-tuned chat models seem to be on the same level as ChatGPT. It is a huge leap forward for open source and a huge blow to closed-source providers, as using this model will offer way more
Starting point is 00:17:33 customizability and way lower cost for most companies. Remember what we had discussed before, how enterprises had changed the way that they engaged with AI because of the availability of this type of open source model. And indeed, a lot of the mainstream media coverage focused on the risks of open sourcing. The Washington Post writes, Facebook to make its AI free to use,
Starting point is 00:17:52 expanding access to powerful tech. The social media giant is doubling down on its open source approach, potentially boosting competition, while also raising the risks of malicious actors using the tech. From the Post, quote, the decision will deepen the divide forming in the tech world over whether to make new AI tech open source or not. Google and OpenAI have rejected full transparency,
Starting point is 00:18:10 citing the risks of bad actors using their tech or developing it in ways that increase risks to people. Facebook and a group of startups, including Hugging Face and Stability AI, have said open source is key to making sure the powerful new technology doesn't further entrench the tech giants and stifle competition. The Post also writes,
Starting point is 00:18:25 earlier this year, Meta released Llama to a select group of researchers, only for the model to be leaked and later used for applications ranging from drug discovery to sexually explicit chatbots. Last month, Senators Richard Blumenthal and Josh Hawley wrote to Zuckerberg arguing that in the short time generative artificial intelligence applications have become more widely available, they have already been misused for problematic content, from pornographic deepfakes to malware and phishing campaigns. Now, Meta itself has pushed back on this idea. Nick Clegg, who is the president of global affairs at Meta and is a former UK deputy prime minister, said on BBC 4 yesterday, my view is that the hype has somewhat
Starting point is 00:18:58 run ahead of the technology. I think a lot of the existential warnings relate to models that don't currently exist, so-called super-intelligence, super-powerful AI models. The vision where AI develops an autonomy and agency on its own, where it can think for itself and reproduce itself. The models that we're open sourcing are far, far, far, far short of that. In fact, in many ways, they're quite stupid. Now it's pretty clear that the release of Llama 2 sets up this open source debate to move even more into the mainstream. A quick search on Twitter for Llama and Open will show just how much disagreement there is even within the often mono-thinking Silicon Valley tech culture. However, for now, for most developers, those debates can wait because they have an incredibly powerful new tool,
Starting point is 00:19:35 and they're out building the next wave of AI innovation. That's going to do it for today's AI Breakdown. If you're enjoying this and you're watching, please go listen to the podcast. If you're listening to this, go check out the YouTube. You can get information about everywhere this content lives at breakdown.network, and until next time, peace.
