The AI Daily Brief: Artificial Intelligence News and Analysis - Meta Llama 2 Is Here! Everything You Need to Know About OpenAI's Biggest Open Source Competitor
Episode Date: July 19, 2023
The much-anticipated Meta Llama 2 model has been released, and as hoped it is available for commercial use. NLW breaks down how Llama 2 compares to other open and closed LLMs and surveys the community's initial response. Before that on the Brief: Simulation is an AI showrunner that can create a fully animated TV episode from a single prompt; the AI LEAD Act in the Senate; OpenAI's grant for local journalism.
Today's Sponsor: Supermanage - AI for 1-on-1's - https://supermanage.ai/breakdown
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, we're looking at everything you need to know about Meta's new Llama 2.
Before that on the brief, new AI regulations on the docket and a tool that can generate an entire TV show from a single prompt.
The AI Breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to Breakdown.network for more information about our YouTube, newsletter, and Discord.
Welcome back to the AI Breakdown Brief, all the AI headline news you need in five-ish minutes or less.
Obviously, one of the big themes for the last few weeks has been the Hollywood strikes, first the Writers Guild strike and now the Screen Actors Guild strike as well,
that while about a lot of different issues, have artificial intelligence and the future of their
professions right at the heart of them. Now, as if designed to put a fine point on just how scary
this is for some of the talent involved in these strikes, yesterday Simulation Inc. dropped
an example of their new tool, which is a generative TV and showrunner agent. The promise that
they offer is creating episodes of TV shows with a single prompt.
From there, their tool, SHOW-1, will write, animate, direct, voice, and edit the show for them.
Now, the example they gave, which really hit the whole issue right on the nose,
was an AI-generated episode of South Park that was all about the SAG strike.
Alongside it, they released a paper called To Infinity and Beyond:
SHOW-1 and Showrunner Agents in Multi-Agent Simulations.
The abstract reads,
In this work, we present our approach to generating high-quality episodic content
for intellectual property using large language models,
custom state-of-the-art diffusion models, and our multi-agent simulation for contextualization,
story progression, and behavioral control. Now, the big new dimension that they are adding to this is
basically the role of showrunner. They point out that while current generative AI systems are great
at short-term or specific tasks through prompt engineering, they don't have, quote, contextual guidance
or intentionality to either a user or an automated generative story system as part of a long-term
creative process. They point out that this is essential to producing, quote, high-quality creative
works, especially in the context of existing IPs. Now, for those of you who are interested in story
and the long-term capacity of AI to write and create stories, the paper is really interesting. They discuss,
for example, the slot machine effect, which they define as a scenario where the generation of AI
produced content feels more like a random game of chance rather than a deliberative creative process,
and they discuss how they try to address that with this new SHOW-1 model. In their announcement tweet,
they write, our goal at The Simulation is AGI, AIs that are truly alive, not
chatbots that pop into existence when we speak, but AI people living real daily lives in
simulations growing over time. We built showrunner agents and are building SHOW-1 model to give
our AIs infinite stories. After sharing a set of sample South Park episodes, they write, we are working
with creators and will be announcing several original IP simulations with attached AI TV shows later this
year. A space exploration simulation, The Prize, a satire of Silicon Valley simulation, Exit
Valley, a playful detective simulation about Charlie Jupiter. They conclude, ultimately we think single-agent
chatbots will fail because they have no lives and can't empathize. Does anyone really want endless
small talk with a brain in a jar? The AIs should have their own lives and for that we need societies of
AIs, less Her, more Free Guy. So this popped off on Twitter. Thousands and thousands of people have
shared it. 700,000 people have viewed the original video. Now, while some people pointed out that this was
a little inopportune at the moment, given the strike happening right now,
others were just focused on the creative possibilities that are coming down the pipeline,
like Bilawal Sidhu, who writes,
We're going very quickly from doing the low-level stuff to orchestrating this all at a higher
level of abstraction.
It will be mind-blowing.
Next up on the brief, we officially have dueling open letters.
This time, a new letter signed by more than 1,300 experts argues that AI is a force for good
and that fears around its long-term existential risks have been overblown.
The letter was organized by BCS, the Chartered Institute for IT in the UK, and as the BBC
describes it, signatories to the BCS letter come from a range of backgrounds, business, academia,
public bodies, and think tanks, though none are as well known as Elon Musk or run major AI
companies like OpenAI.
Speaking of AI for good, OpenAI and the American Journalism Project have announced a partnership
through which OpenAI will give $5 million in cash, along with $5 million in OpenAI API
credits, to local news publishers in order to help them both shape, as well as
use, new generative AI tools in supporting local news efforts.
OpenAI CEO Sam Altman says,
We proudly support the American Journalism Project's mission to strengthen our democracy by
rebuilding the country's local news sector.
This collaboration underscores our mission and belief that AI should benefit everyone
and be used as a tool to enhance work.
Now, this comes a week after OpenAI announced a two-year deal with the Associated Press
to use AP content to help train OpenAI's model.
Meanwhile, other early attempts to use AI-generated content at publishers haven't
gone so well. G/O Media, which owns sites like Gizmodo, has been roundly ridiculed over the last few weeks
for error-ridden articles that they published that were written by AI. However, that scorn,
along with antipathy from G/O staff, is not enough to change course. Merrill Brown, G/O's editorial
director, said it is absolutely a thing we want to do more of, and CEO Jim Spanfeller says,
I think it would be irresponsible to not be testing it. Over in the U.S., the regulatory
march around AI continues, and yet as Senate Majority Leader Chuck Schumer focuses on comprehensive
legislation, other senators are focused on smaller, more defined measures.
Michigan Senator Gary Peters has introduced legislation called the AI LEAD Act, which is scheduled
for a markup on Wednesday of this week, and is focused exclusively on the federal government
itself, in terms of how it builds, buys, and deploys AI-driven systems. Daniel Ho, a member
of the White House's National AI Advisory Committee said, the government is going to be one of the
largest purchasers of AI systems, so the standard that it sets will have a pronounced impact on
responsible AI innovation. Meanwhile, just like we covered antipathy
from Gary Gensler and the SEC towards AI on yesterday's show, a different financial regulator,
this time the Fed's banking regulator Michael S. Barr, the Fed's vice chair for supervision,
has made another warning about AI saying that it could lead to illegal lending practices,
such as excluding minorities. Barr said, while these technologies have enormous potential,
they also carry risks of violating fair lending laws and perpetuating the very disparities
that they have the potential to address. The example that he gave was digital redlining,
where minority communities are denied access to credit or housing opportunities.
The fear is, of course, that AI trained on prejudiced or biased data could end up reinforcing
and extending that prejudice or bias.
So just another example of how basically every department in the government is trying to figure
out how AI is going to impact what they have particular oversight into.
That is going to do it for today's AI Breakdown Brief.
If you're enjoying it, you should go subscribe to the AI Breakdown newsletter.
It comes out every morning and features the five most interesting or important stories in AI.
You can find a link down below in the show notes.
Thanks again for listening or watching, and I'll be back soon with the main AI Breakdown.
Hey guys, before we dive into the main part of the episode, I want to share a little bit about today's sponsor, Supermanage.
A truly great one-on-one should be about celebrating wins, solving problems, and deepening the connection between two human beings.
But what if you miss those wins, never heard about those problems, and spent your whole meeting avoiding the hard stuff?
That's where Supermanage comes in.
Supermanage AI distills your public Slack channels into a one-on-one brief that highlights everything you need to know to jump right in.
Because let's face it, you want your team to do the best work of their lives.
And that starts with world-class conversations.
Visit supermanage.ai/breakdown today to start making the most of your one-on-ones.
Thanks again to Supermanage for sponsoring the AI Breakdown.
Meta has officially announced the launch of Llama 2.
It's an updated, more powerful, still open, and now
commercially available version of their large language model and represents not only a significant
competitor to GPT and Bard, but is also flaring up significant conversations about the risks and
opportunities of open source AI. Welcome back to the AI Breakdown. Yesterday, Meta crushed the rest of the
news of the week when they announced their much-anticipated Llama 2. There were a number of big parts of
this announcement. The first is that Llama 2 remains an open-source approach to LLMs. The second is that
it's free not only for research but for commercial use. A third is that Meta is deepening a partnership
with Microsoft through which developers using the Azure Cloud will be able to natively access Llama.
And finally, there is a huge emphasis on safety, which makes sense given the controversy around
whether AI should be open sourced at all. Before we get into Llama 2, let's go back and actually
look at Llama 1, because it's had a pretty important role in the development of this space over the last
six months. Now, going back to Llama 1, in March it was released as an open source package, although
it wasn't complete. Basically, the weights in the model weren't included. However, within about a week
of announcing it, the full model was leaked online, and almost immediately people were concerned about the
implications. Jeffrey Ladish, who we'll hear from again later in the show, said, get ready for loads
of personalized spam and phishing attempts. Open sourcing these models was a terrible idea. Now, while some
of the more dire warnings about scams and attacks might not have been borne out quite yet when it comes
to that leak, that's not to say that there weren't serious implications for how Llama's open model
being available would impact the development of the AI space. In May, another leak, this time from
someone inside Google, argued that the real competitor for Google and OpenAI was not another
big company developing a model based on huge amounts of training data, but instead was the insurgency
coming from the open source ranks. The piece starts, we've done a lot of looking over our
shoulders at OpenAI, who will cross the next milestone, what will the next move be? But the uncomfortable
truth is we aren't positioned to win this arms race, and neither is OpenAI. While we've been squabbling,
a third faction has been quietly eating our lunch. I'm talking, of course, about open source.
Plainly put, they are lapping us. Things we consider major open problems are solved and in people's
hands today. Now, the important part of this analysis for our story today comes in the What Happened
section. The anonymous author writes, at the beginning of March, the open source community got
their hands on the first really capable foundation model as Meta's Llama was leaked to the public.
It had no instruction or conversation tuning and no RLHF. Nonetheless, the community immediately
understood the significance of what they had been given. A tremendous outpouring of innovation followed,
with just days between major developments. Here we are barely a month later and there are variants
with instruction tuning, quantization, quality improvements, human evals, multimodality, RLHF, etc., etc., many of which
build on each other. Now, the author also does talk about what feels to them like the irony of Facebook
being the leader in this new environment. They write, paradoxically, the one clear winner in all of this
is Meta. Because the leaked model was theirs, they have effectively garnered an entire planet's worth of
free labor. Since most open source innovation is happening on top of their architecture, there is
nothing stopping them from directly incorporating it into their products. The value of owning the
ecosystem cannot be overstated. Now, that letter came out in May and importantly, we're now,
a couple months later, seeing the impacts in a big way. If you're a regular listener, you will have
heard me read Sam Hogan's big essay tweet the other day, where he argued basically that AI was not
the savior to the venture startup ecosystem that people had thought, because the two big winners were
on the one end of the spectrum, open source indie developers,
and on the other end of the spectrum, big enterprise companies.
Now, his argument for why the enterprises were doing better than anyone thought
had a lot to do with these open source models.
As a reminder, he wrote,
executives at enterprise companies are excited about AI,
and they have been vocal about this from the beginning.
This led a lot of founders and VCs to believe these companies would make good first customers.
What the startups building for these companies fail to realize
is just how aligned and savvy executives and the engineers they manage would be
at quickly getting AI into production using
open source tools. An engineering leader would rather spin up their own
LangChain and Chroma infrastructure for free and build tech themselves than buy something
from a new unproven startup. So this was the situation heading into the last week, and lots and
lots of rumors had been swirling that Llama 2 was on the way and that this time it would come
with a license ready for commercial use. Well, as of yesterday, Llama 2 is here. It is indeed
ready for commercial use. It picked up an interesting partner in Microsoft, and it's
generating some serious discussion around issues of open source AI.
Let's talk first about how Llama 2 compares to other open source models.
TL;DR is it's way out ahead.
For those of you who are listening, on the screen I'm showing a chart that shows benchmark comparisons
of Llama 2, both its 7 billion parameter version and its 13 billion parameter version, out-competing
many other open source LLMs.
Now there's also a chart that they shared in the white paper that shows how Llama 2 compares
in various benchmarks to other commercially available models like GPT-3.5, GPT-4, PaLM, and PaLM-2-L.
While Llama 2 remains pretty meaningfully behind GPT-4, for example, it's coming up pretty close to the levels of GPT-3.5.
What's more, as NVIDIA's Dr. Jim Fan points out, model tests that involved humans suggested that Llama 2 performed even better.
Jim writes,
Meta's team did a human study on 4K prompts to evaluate Llama 2's helpfulness.
They use win rate as a metric to compare models in similar spirit as the Vicuna benchmark.
70 billion parameter model roughly ties with GPT-3.5 and performs noticeably stronger than Falcon, MPT, and Vicuna.
I trust these real human ratings more than academic benchmarks because they typically capture the
in the wild vibe better.
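For anyone reading along who wants to see what a win rate actually is, here is a minimal sketch in Python. It assumes each human judgment is recorded as "win", "tie", or "loss" for the model being evaluated. The function name, the labels, and the option to count ties as half a win are illustrative assumptions, not details from Meta's study.

```python
from collections import Counter


def win_rate(judgments, count_ties_as_half=False):
    """Fraction of pairwise comparisons won by the model under evaluation.

    judgments: iterable of "win", "tie", or "loss" labels, one per prompt
    (hypothetical label names chosen for this sketch).
    count_ties_as_half: if True, each tie contributes half a win.
    """
    counts = Counter(judgments)
    total = counts["win"] + counts["tie"] + counts["loss"]
    if total == 0:
        raise ValueError("no judgments supplied")
    wins = counts["win"]
    if count_ties_as_half:
        wins += 0.5 * counts["tie"]
    return wins / total


# Made-up ratings for four prompts, purely for illustration.
sample = ["win", "tie", "loss", "win"]
strict = win_rate(sample)
lenient = win_rate(sample, count_ties_as_half=True)
```

How ties are treated changes the headline number, which is one reason two human studies of the same models can report different win rates.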
Now, in that same tweet, Jim also points out that Llama 2 is not yet at the GPT-3.5 level,
and that the big thing holding it back is its coding abilities.
Speaking of quirky human tests, Professor Ethan Mollick from Wharton writes,
out of the box, Llama 2 beats Bard at the insane memo test.
Write a corporate memo in a serious style explaining and justifying the following points.
One, the floor is now lava.
Two, promotion will be by staring contests.
Three, we have merged with a hive of bees.
The queen is your new CTO.
Now, as we mentioned, in terms of upgrades from Llama 1, the biggest one is the commercial availability.
If you go back and look at how the developer community was discussing and talking about the first
iteration of Llama, a lot of it was about trying to assess whether Meta would actually sue if
people used it for commercial products.
For example, this Hacker News post says, can Llama weights be used for commercial products?
And the top-rated comment is all about the difference between what the terms literally say,
which did exclude commercial use, versus what they would actually do because the optics of suing
might be terrible. Well, that has now been resolved as this model again is available for commercial
use. And importantly, again, from a commercial standpoint, Meta isn't charging directly for its usage.
They'll make money by selling the program as a paid hosted service through various cloud computing
partners. That's, for example, where Microsoft comes in. Now, there are a couple commercial
limitations to note. First, the terms prevent Llama 2's data or output from being used to train other LLMs. And second,
if the monthly active users of the product that is using Llama 2 exceed 700 million,
Meta is requiring a special commercial license.
Obviously, there's a very small handful of companies for whom that would apply.
Now, going back to Microsoft for a moment, people were fairly surprised by this announcement
featuring Microsoft so prominently.
Matt Wolfe writes, so Microsoft is partnered with OpenAI on their closed source LLM,
and now they're partnering with Meta to release an open source LLM with Llama 2.
I love that things are moving towards more open source.
I'm just really confused by where Microsoft is going with all this.
For market observers, though, the answer is pretty clear.
Barron's writes yesterday,
Microsoft shows investors the money from AI,
why its meta deal threatens Google.
The piece starts, Microsoft has just closed the gap
between the hype and the reality when it comes to AI.
The tech giant unveiled its plan to monetize the technology Tuesday,
answering a key question surrounding the recent AI stock boom.
The company plans to charge businesses $30 a month
for its artificial intelligence-powered Microsoft Office apps.
In response to these updates,
yesterday Microsoft's stock hit an all-time high.
Now, another big emphasis of the announcement of Llama 2 was around its approach to safety.
Louis Martin tweets,
I am proud to have led the safety effort behind Llama 2.
Our fine-tuned models are deemed safer and more helpful compared to other open and closed-source models such as ChatGPT.
Safety was evaluated by human annotators on a set of 2K adversarial prompts.
We improved the safety of our models using supervised fine-tuning, RLHF, context distillation, and continuous red-teaming.
In particular, we notice that RLHF makes our model more robust on the long tail of
adversarial prompts. Thanks to context distillation, we have improved our model's responses to
adversarial prompts. We first generate answers by prefixing a prompt with safety guidelines,
then fine-tune the model on these safe responses without these guidelines. We proactively
test our models' weaknesses with continuous red-teaming. We conducted a series of red-teaming events
with various teams of over 350 people, including domain experts. They also included individuals
representative of a variety of demographic groups. One thing that many have noticed is that
Meta took a slightly different approach to dealing with these safety issues by actually training
Llama with two separate reward models.
One was based on its helpfulness and one was based on its safety.
This allowed them to have more fine control over how the model should respond in different
contexts and scenarios.
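To make the two-reward-model idea a little more concrete, here is a minimal sketch in Python of one way a safety score and a helpfulness score could be combined into a single training signal. The gating rule, the function name, and the 0.15 threshold are assumptions for illustration only. Nothing in this episode specifies how Meta actually combines the two scores.

```python
def combined_reward(helpfulness_score, safety_score, is_safety_prompt,
                    safety_threshold=0.15):
    """Pick which reward model's score drives the update for one response.

    Assumed rule for this sketch: if the prompt is flagged as safety-relevant,
    or the safety score falls below the threshold, the safety score is used.
    Otherwise the helpfulness score is used.
    """
    if is_safety_prompt or safety_score < safety_threshold:
        return safety_score
    return helpfulness_score


# Hypothetical scores in [0, 1] for three responses.
risky_answer = combined_reward(0.9, 0.05, is_safety_prompt=False)
ordinary_answer = combined_reward(0.9, 0.8, is_safety_prompt=False)
flagged_prompt = combined_reward(0.9, 0.8, is_safety_prompt=True)
```

The point of a gate like this is that a response cannot score well just by being helpful if the safety model rates it poorly, while ordinary prompts are still judged mainly on helpfulness.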
Now that said, some saw the very release of information, particularly the weights of the model,
as undermining all of this focus on safety.
Stanford PhD student Chris Cundy writes,
I appreciate all the emphasis on safety in the Llama 2 paper, but I'm not sure how
that squares with releasing the weights.
I want Crime Llama for effective phishing emails, can I just fine-tune to remove safety guardrails?
Jeffrey Ladish said something similar.
If you have access to the weights, you can fine-tune away any safety controls.
And this gets us to the discussion of open source more broadly.
On the one hand, it's hard to deny how much Llama 2 advances the open-source LLM ecosystem.
Nathan Lambert wrote on his substack, quote,
The base model seems very strong, beyond GPT-3, and the fine-tuned chat models seem to be on the same level as ChatGPT.
It is a huge leap forward for open source
and a huge blow to closed-source providers
as using this model will offer way more
customizability and way lower cost for most companies.
Remember what we had discussed before,
how enterprises had changed the way that they engaged with AI
because of the availability of this type of open source model.
And indeed, a lot of the mainstream media coverage
focused on the risks of open sourcing.
The Washington Post writes,
Facebook to make its AI free to use,
expanding access to powerful tech.
The social media giant is doubling down on its open source approach,
potentially boosting competition,
while also raising the risks of malicious actors using the tech.
From the post, quote,
the decision will deepen the divide forming in the tech world
over whether to make new AI tech open source or not.
Google and OpenAI have rejected full transparency,
citing the risks of bad actors using their tech
or developing it in ways that increase risks to people.
Facebook and a group of startups,
including Hugging Face and Stability AI,
have said open source is key
to making sure the powerful new technology
doesn't further entrench the tech giants and stifle competition.
The post also writes,
earlier this year, Meta released Llama to a select group of researchers,
only for the model to be leaked and later used for applications ranging from drug discovery
to sexually explicit chatbots. Last month, Senators Richard Blumenthal and Josh Hawley
wrote to Zuckerberg arguing that in the short time generative artificial intelligence
applications have become more widely available, they have already been misused for problematic
content from pornographic deepfakes to malware and phishing campaigns. Now, Meta itself has
pushed back on this idea. Nick Clegg, who is the president of global affairs at Meta and is a former
UK deputy prime minister, said on BBC Radio 4 yesterday, my view is that the hype has somewhat
run ahead of the technology. I think a lot of the existential warnings relate to models that
don't currently exist, so-called super-intelligence, super-powerful AI models. The vision where AI develops
an autonomy and agency on its own, where it can think for itself and reproduce itself. The models
that we're open sourcing are far, far, far, far short of that. In fact, in many ways, they're quite
stupid. Now it's pretty clear that the release of Llama 2 sets up this open source debate to move even more
into the mainstream. A quick search on Twitter for Llama and open will show just how much disagreement
there is even within the often monothinking Silicon Valley tech culture. However, for now,
for most developers, those debates can wait because they have an incredibly powerful new tool,
and they're out building the next wave of AI innovation. That's going to do it for today's AI
Breakdown. If you're enjoying this and you're watching, please go listen to the podcast. If you're
listening to this, go check out the YouTube. You can get information about everywhere this content
lives at Breakdown.network, and until next time, peace.
