Everyday AI Podcast – An AI and ChatGPT Podcast - EP 352: 5 Things to Know About Meta's Llama 3.1
Episode Date: September 6, 2024
Another one!? Another day, another HUGELY impactful model. Meta responds to OpenAI with a one-of-a-kind model in Llama 3.1 and the brand spanki...n new 405B model. What's it all mean? We gotchyu.
Related Episode: Ep 318: GPT-4o Mini: What you need to know and what no one’s talking about
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Topics Covered in This Episode:
1. Highlights of Meta's New Model - Llama 3.1
2. Spotlight on the 405b model
3. Functionalities and User Interface Updates
4. Meta's Focus and Its Implication on AI and Business
Timestamps:
01:30 Daily AI news
06:30 Meta Model overviews
11:11 Free open source models, accessible with limitations.
16:20 Meta's quick product launches receive praise.
17:32 Llama 3 to 3.1 brings huge improvements.
20:11 AI advancements democratizing app development and deployment.
24:35 Meta's benchmarks indicate promising performance
30:15 Customizable large language model for specific tasks.
34:46 Mark Zuckerberg's influence on AI prominence evident.
37:33 Experts and knowledge will be replaced by models.
42:27 Typing in real time with Meta AI.
43:24 Creating testing questions for new model iterations.
48:11 New model requires less prompt engineering, delivers more.
51:10 Ranking answers, prompt techniques, standardized testing methods.
Keywords: OpenAI, Llama 3.1, Google, Databricks, Snowflake, Microsoft, AWS, Dell, NVIDIA, Groq, IBM, model distillation, data generation, MMLU comparison, GPT-4o Mini, 405B release, sharing chat transcripts, ad retargeting, custom GPTs, AI agents, Mark Zuckerberg, Meta, metaverse, 8b model, 70b model, 405b model, MMLU benchmark score, Meta AI, edge devices, OpenAI vs Meta.
Transcript
This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
The new 405B model from Llama and their Llama 3.1 updates are extremely powerful.
I mean, it's open source, free to download, and I think it'll change the business landscape.
But there's a lot of things that I don't think people are paying attention to.
And it's probably not what you think.
So today, we're going to be going over that.
And I'm going to tell you exactly what this Llama 405B and 3.1, what these things are.
I'm going to show you a little bit of the new model live and go over five things that you need to know.
All right.
What's going on, y'all?
Let's get this thing started.
My name's Jordan Wilson and welcome to Everyday AI.
This thing is for you.
Well, it's for all of us.
It's a place where, you know, non-technical people, but still people who want to be technical,
can learn about all things generative AI
to grow your company and to grow your career.
So if that's you, thank you for tuning in on the podcast.
Always, make sure to check out your show notes.
We just launched a cool campaign.
I'm going to tell you about here in a second.
But always more information and make sure if you haven't already,
go to youreverydayai.com and sign up for the free daily newsletter.
All right.
So let's get into it and talk about Meta, Llama 405B, the new model, and Llama 3.1.
So we're going to give you what you need to know, what people aren't
talking about, and we're going to show you a little bit live. All right. And hey, for our live stream
audience, thank you as always for joining Michael and Brian and Fred, a couple of Chicago people
in the house. Denny, Cecilia, thank you all for joining us. So let me know. Have you all used
this new Llama 3.1 yet? Are you going to? What questions do you have? Please get them in now.
I'll try to tackle them at the end. But let's just get straight into an overview here, y'all. So
here's what's new. So you have technically three different tiers of models from Meta, right,
the parent company of Facebook. So Mark Zuckerberg did all his media rounds yesterday,
wrote a very long, you know, blog post about the future of Llama and open source AI. But here's
essentially what you need to know. In April, Meta announced Llama 3, right? And they essentially had,
they said that there's three different sizes, right? A small, medium, and large. And that
seems to be the new trend, I think, started by Anthropic, right?
And essentially, there's use cases for small, medium, and large.
And we'll talk about that a little bit more here.
But back in April, when Meta first announced Llama 3, which is a huge upgrade,
they only came out with the kind of small and medium models.
So when we talk about what is 405B and what is 8B and 70B, well, those are sizes of models.
So the big one, the 405B, had not been announced
until yesterday and was not available, whereas the other smaller, the small and the medium size
models were available, but they've been upgraded.
All right.
So I'm going to go ahead and share my screen a little bit here.
And we're going to walk through some of the benchmarks because they are impressive.
All right.
So let's go ahead and share my screen here.
So for our podcast audience, I'm going to try to do my best.
I don't think you're going to be missing out on anything here because I think I should be able to kind of walk us through this just a little bit.
So like I talked about here, and live stream audience, let me know if you can see this.
But so, multiple new models from Meta.
So like I talked about, think of it as a small, medium, and large, just like Anthropic, right?
So Anthropic has their Haiku, small, their Sonnet, medium, and their Opus, large.
So for Meta, they're actually just naming it by the number of parameters it's trained on.
So they have their 8B, which is their small, their 70B, which is their medium,
and then their just now released 405B.
Big jump up, right?
And without getting too technical, right, because our audience here is for the most part,
not super technical people like myself, right?
The easiest way to think about parameters is the amount of data that it's trained on, all right?
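A quick way to make those size labels concrete: a parameter count is the number of learned weights inside the model, which is separate from how much data it was trained on. The sketch below is a back-of-the-envelope estimate only. It assumes weight memory is roughly parameter count times bytes per parameter and ignores activations, the KV cache, and runtime overhead; the function name and the precision table are illustrative, not taken from Meta's documentation.

```python
# Rough weight-memory estimate for the three Llama 3.1 sizes.
# Assumption: memory ~= parameter count x bytes per parameter (weights only).

PARAMS = {"8B": 8e9, "70B": 70e9, "405B": 405e9}

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, count in PARAMS.items():
        row = ", ".join(
            f"{prec}: {weight_memory_gb(count, prec):,.0f} GB"
            for prec in BYTES_PER_PARAM
        )
        print(f"{name:>5} -> {row}")
```

Under these assumptions the 8B model needs on the order of 16 GB at 16-bit precision, while the 405B model needs on the order of 810 GB, which is why the largest size is out of reach for a typical desktop.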
So let's go and just jump into the benchmarks because that is kind of our first point that we wanted to talk about is these benchmarks are pretty impressive.
And again, I have to really hit rewind even because we have to draw a line because even if we're looking here at these benchmarks.
And this is what everyone always talks about and rightfully so, right?
So benchmarks are when all the smart researchers and scientists from both Meta and everyone from
all the big companies, third parties, they all do these benchmarks.
And you essentially get a score.
So think of like a new car, right?
When a new car comes out, you know, you get, oh, the EPA estimated gas mileage and, you know,
goes through all these, you know, third party, you know, safety tests, all these things.
And it gets scores, right?
So large language models are the same way.
And there's all of these different dozens of different benchmarks.
but the one that we talk about a lot here on Everyday AI is the MMLU.
Okay.
So the MMLU benchmark is the massive multitask language understanding.
So it's essentially 57 different subjects.
And you get a score, right?
So it's like the ACT or the SAT for large language models.
And this is by far the gold standard.
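To show what a score like this is mechanically, here is a minimal sketch of MMLU-style scoring: each question is multiple choice, the model's pick is compared with the answer key, and accuracy is reported as a percentage. The sample rows are made up for illustration, and averaging per subject first (a macro average) is an assumption; published numbers may average over all questions instead.

```python
from collections import defaultdict


def mmlu_style_score(results):
    """Score multiple-choice results.

    results: iterable of (subject, predicted_choice, correct_choice).
    Returns (per_subject_accuracy, macro_average), both as percentages.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, predicted, answer in results:
        total[subject] += 1
        correct[subject] += int(predicted == answer)
    per_subject = {s: 100.0 * correct[s] / total[s] for s in total}
    macro = sum(per_subject.values()) / len(per_subject)
    return per_subject, macro


if __name__ == "__main__":
    # Hypothetical toy results: two subjects instead of 57.
    toy = [
        ("astronomy", "A", "A"),
        ("astronomy", "B", "C"),
        ("law", "D", "D"),
        ("law", "A", "A"),
    ]
    per_subject, macro = mmlu_style_score(toy)
    print(per_subject, macro)
```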
So when we talk about MMLU and this new Llama 3.1, the large variety.
So that's what we're going to be sharing here in the screenshots.
It is very impressive.
And I have a screen here earlier that I'm going to be showing.
But we have to keep in mind, this is an open source model, y'all.
So what that means, this is free.
This is free to use.
You can download these models, although you really have to have like the world's strongest computer to download the 405B.
but for the small and medium models, most of us out there, if you have a newish computer
with a decent graphics processing chip, if it has decent specs, you're going to be able to
download these models from Meta.
So that's like before we get into these benchmarks anymore, we really have to talk about
the importance of this is open source, right?
You can download it.
You can build on top of
it without having to pay, right,
those inference costs, those training costs over and over, which is huge.
Okay.
And this is also, it's available in a lot of dev environments, which I'm going to show you
here on the screen soon.
So with that, let's look at these benchmarks.
So the MMLU, Llama came in at an 88.6, which is extremely, extremely impressive.
All right.
I might do a full show soon.
Live stream audience, let me know.
Do you want to see that?
I might do a full show soon on MMLU.
What are these benchmarks?
What do they mean?
But essentially, if you are an expert in one specific thing.
So again, these models are trained or sorry,
the MMLU goes across 57 different subject areas.
Think if you are a world expert in one,
a world expert is going to get about an 89.8.
All right.
An 89.8, a world expert on that one domain specific field.
Okay.
But in the other 56, the average person gets about a mid-30s.
Okay, so think, when we say that Llama has an 88.6,
that means that a free model that you can download now
is essentially a world-class expert or almost as good.
It's like having the 57 smartest people in the world available
for free their entire knowledge and you can build with it, you can download it,
you can work with it, you can make it your own.
Okay, so that's what we talk about when we're talking about both MMLU and the power
of having these high of scores in an open source model that you can download,
you can fork, you can build upon and you're not having to, you know, pay a big company
each and every time.
All right.
So benchmarks are extremely important.
So Llama 3.1, it is above Claude 3.5 Sonnet, but it is just below GPT-4 Omni.
So GPT-4 Omni, 88.7, Llama 88.6, right?
One fraction, right?
One fraction of a point away, which I was surprised about.
I was surprised that Meta didn't sit on this for another couple of weeks and try to
over-engineer and try to squeeze a little bit more juice out of it
so it's the world leader on MMLU.
But regardless, a free open source model that is essentially now the most powerful model,
extremely impressive.
So yes, the HumanEval scores, also very, very top-notch, an 89 when the leader, Claude, is at 92.
Pretty good there.
A couple, though, that are worth noting. I'm not going to talk about benchmarks this entire time.
But two other things. I mean, GSM8K, which is essentially, you know, math, right, basic math.
And 96.8, the most capable model right now in the world.
Also, the ARC Challenge, which is reasoning, got the highest scores of any model.
So when we talk about benchmarks, extremely impressive, okay?
We also have to talk about availability, where this is available,
right? So we talked about how you can download this now. You can also go to meta.ai.
I'm not sure which countries have access.
I didn't get a full list from Meta.
I'm sure they'll be rolling out with that soon, but you can go to meta.ai.
You do have to log in with either a Facebook account or an Instagram account,
but you can use it for free literally right now.
And I'm going to be going over a little bit of that live.
But we have to also talk about where this is available, right?
Because if you are a business leader,
and you are looking, let's say you're working at a Fortune 500, an Inc. 5000 company,
a big enterprise, right?
This is available now showing on the screen here.
This is available now in so many places, so many places here.
So AWS, Databricks, Nvidia's Foundry, IBM, Google Cloud, Microsoft, Scale, Snowflake, right?
So where so many of these big companies, big enterprises, house their data, where they're trying to marry their data with the right large language model, it's available now.
It's available today, which I think, I love that from Meta, right?
You can have whatever thoughts you want about Mark Zuckerberg.
You can have whatever thoughts you want about the social media side of Meta, you know, Facebook and Instagram and WhatsApp and data collection, all of these things.
You can have whatever thoughts.
But the fact that Meta announced this, dropped this, and it's all available instantly,
hats off, right?
Because Google is notoriously bad for, you know, having all these big conferences, right?
At their Google I/O conference, they announced all these things.
This is months ago.
And we haven't seen a fraction of them, at least when it comes to their new large language
model developments and generative AI.
Meta, it's like drop a blog post, drop some interviews.
And it's live, it's ready.
So you can probably today go and work with this in your environment right now.
And again, it's open source, y'all.
All right.
So like I said, the first thing that you need to know is the specs are extremely impressive.
So like I talked about that, the benchmarks are great.
A couple other things.
The 8B and the 70B versions, the small and the medium, those are upgraded.
So that is the big jump from, you know, Llama 3 to Llama 3.1.
So even if you look at just the improvements in the small and the medium model here,
they're huge.
They're, I mean, they're very impressive.
So the small and the medium models, if you look at them in their respective, quote unquote, weight classes, right,
especially the 8B because I think the 8B probably within, I don't know,
six months to a year of hardware, I think you're going to be able to see this on an
edge device, as a model that could in theory run locally on a phone.
So that's the other huge upside to being open source and to be able to download a model
is you can run it locally without internet.
So then privacy concerns are less.
Speed is more. Environmental concerns are less of a concern at that point when you can run a model
locally and you don't have to, you know, run it off of essentially a server. And so it's faster.
The latency's lower. It's more secure and you don't even need the internet. Right.
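For a sense of what running a model locally looks like in practice, here is a minimal sketch that sends a prompt to a model served on your own machine. It assumes Ollama is installed and running on its default local port, and that a Llama 3.1 8B model has already been pulled under the tag `llama3.1:8b`; those are assumptions for illustration, not something covered in this episode. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.1:8b") -> urllib.request.Request:
    """Build the HTTP request for a non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "llama3.1:8b", timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call (requires the local server to be running):
# print(ask_local_model("In one sentence, what is an open-weight model?"))
```

Nothing in that round trip leaves the machine, which is the privacy and latency point being made here.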
But the 8B model, especially, I'm looking at this. And this is what I don't think people are
talking about. So if you compare it to Google's new Gemma 2 model, so these are, again, I would call
these something between a small language model and a small large language model. There's no actual
definition because the goalposts are always moving, right? But these are models in theory,
Llama 3.1 8B, 8 billion parameters, and Gemma 2 9B. These are models that in theory can be running
locally on a smartphone, probably within six months to a year, right?
So right now, Google also has Gemini Nano, which is one of the ones that's running on current
smartphones.
But you have to also think when you see all of these announcements, we can't just look at the
benchmarks and what they mean today.
We have to look at what do they mean in the future.
So I already told you, this has huge future implications, which we're going to get into
about how businesses can run now.
But you also have to think of what this means for the future of on the go AI, which is
the future. I've said hundreds of times. The future of large language models is small language
models and working with many of them and working with them on edge AI, on device AI. But the 8B model
just is outpunching its weight class in almost every single benchmark aside from like one.
It is the top model. It is the top small large language model. And it's not even particularly
close. And again, open source. So think of what this means for the future
of even apps that you use, right?
Because now all of these developers, you know, it's not like even with GPT-4o mini,
which we're going to talk about here, that brought the cost way down.
But, you know, a month or two ago, you know, it could be expensive to go launch a brand new
app that was powered by AI for, you know, enterprise business or just something that's,
you know, fun for people to use.
This changes things.
This changes what people with low budgets can go and build in a weekend or in a day
and launch immediately.
This brings scalability to even small companies that don't maybe have, you know, compute power.
Well, you might not even need it with an 8 billion parameter model, which is fairly small.
You can download this and run it on a machine that's not even very powerful.
All right.
And the benchmarks are extremely impressive there.
Also, another thing to note, and I promise you, this is it for our more technical side.
All right.
So, 128K context window, huge, right? Because the context window was very small before. I believe it was 8K or less.
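To put the jump from 8K to 128K in practical terms, the sketch below checks whether a document would fit in each window. It uses a rough rule of thumb of about four characters per token, which is an assumption and not a real tokenizer, and the model labels and the output reserve are illustrative.

```python
# Context window sizes in tokens: 8K for Llama 3, 128K for Llama 3.1.
CONTEXT_WINDOWS = {"llama-3": 8_192, "llama-3.1": 131_072}

# Assumption: rough heuristic for English text, not an actual tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate token count by ceiling-dividing character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, model: str, reserve_for_output: int = 1_024) -> bool:
    """True if the text plus a reserve for the reply fits in the window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_WINDOWS[model]


if __name__ == "__main__":
    report = "x" * 100_000  # stand-in for a roughly 100,000-character document
    for model in CONTEXT_WINDOWS:
        print(model, fits(report, model))
```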
Also, Meta is using a slightly different, and this gets into a little bit of the technical side,
but mixture of experts, or MoE, is what a lot of these large language model
companies have been using.
So Meta, and we'll put it in our newsletter today,
they went a different route, going this herd of experts
versus the mixture of experts.
All right.
So enough on the technical side,
let's go ahead and talk about the second thing that you need to know.
And I think that this is shots fired.
I mean, shots fired against OpenAI,
but this is a developer-focused strategy, right?
Because I think people are overlooking the vast improvements
that were made in the 8B and the 70B,
the small and the medium models.
And what this means is I think they're just going after OpenAI, right?
We saw Anthropic with Claude 3.5 Sonnet.
We saw Google Gemini with 1.5 Flash, right?
So, and Claude presumably will be releasing 3.5 Haiku, which is their small model, right?
So Haiku, Sonnet, Opus.
So over, I'd say, the first quarter or two of 2024, OpenAI had lost traction, I think, among developers.
They were jumping ship.
We had a whole episode about this on Friday,
but right after, they released their new model, GPT-4o mini,
which is essentially a light version of their big boy model.
So now OpenAI has essentially a small and a large.
They don't have a medium per se.
But this really changed the developer landscape.
So OpenAI, I think, was losing customers.
They were going to Google Gemini for 1.5 Flash.
It was cheaper and more powerful.
They were going to Anthropic for Claude 3 Haiku.
It was cheaper and more powerful.
You know, OpenAI was essentially relying on either 3.5,
which wasn't very powerful, or GPT-4, which was very expensive.
So I do think that OpenAI did make a huge splash with GPT-4o mini.
Did that happen to coincide with the fact that Meta decided days after
to release their 3.1 updates? Potentially, right?
Because I think they are now going directly after developers.
And it can't also be lost on us
how OpenAI responded.
I talked about that at the top of the news this morning.
Literally hours after Meta released a very impressive 3.1, three models.
Benchmarks go wild, right?
Literally hours after that, OpenAI said, oh, you know what?
We're actually going to make cost zero for the next couple of months to come and fine-tune your models, right?
So, hey, hey, developers, hey,
businesses, right? Hey, businesses, you want to, you know, bring in RAG. You want to have this
retrieval augmented generation. You want to kind of marry and create your own large language
model with your data and some fine-tuning. Generally, you know, a year and a half ago, very expensive,
very hard to do. Now, it's easier and it's cheap. So OpenAI said, hey, for the next couple of months,
you can come in and do it for free. Two million tokens a day, which I think is
in direct response. They saw what Meta dropped and they're like, whoa, these benchmarks,
pretty dang good.
Pretty dang good, right?
We don't have comparisons yet with the 8B, 70B, and GPT-4o mini,
but I'm sure those will be rolling out here soon,
and we'll talk about it.
But this is a developer-focused strategy from Meta,
and it is a direct shot at OpenAI.
All right.
So I do think at least right now, it's a two-player battle.
It's a two-player battle.
But what this does mean is I fully expect Google and Anthropic to be releasing updates in the next two months, right?
This changes.
It's getting so cheap, where it's getting to the point to train a model, you're not going to have a lot of costs.
Right.
We always talk about inference and the cost per token of training.
When we see a model like Llama 3.1 and their 8B, their 70B, it's free 99, y'all.
Yeah, you got to have a computer powerful enough to handle it.
You still need the skill set.
But the cost to train a model on your company's own data, the cost to build something
on top of a large language model is going down to like free 99, right?
Whereas a year and a half ago, pretty expensive.
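The cost argument can be sketched as simple arithmetic: a hosted API bills per token, while a downloaded model has no per-token fee but does have a fixed hardware or hosting cost. The prices below are hypothetical placeholders chosen only to show the calculation, not quotes from any provider, and the function names are illustrative.

```python
def api_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of processing `tokens` on a per-token-billed API."""
    return tokens / 1_000_000 * usd_per_million_tokens


def break_even_tokens(fixed_monthly_usd: float, usd_per_million_tokens: float) -> float:
    """Monthly token volume at which a fixed-cost self-hosted setup
    costs the same as the per-token API."""
    return fixed_monthly_usd / usd_per_million_tokens * 1_000_000


if __name__ == "__main__":
    price = 0.50       # hypothetical USD per million tokens
    server = 300.00    # hypothetical USD per month for a machine that can run an 8B model
    print(api_cost(2_000_000, price))
    print(break_even_tokens(server, price))
```

Below the break-even volume the API is cheaper; above it, the self-hosted model is, which is the sense in which the open weights push costs toward zero at scale.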
And now the availability is everywhere.
You saw the chart, right?
It's available inside Nvidia's platform, their Foundry platform.
It's available inside of Microsoft, inside of Google, inside of Databricks, inside of Snowflake, right?
So all these dev environments, it's available there.
So the cost is going down.
And this means that I think Google and Anthropic are going to be responding here soon.
All right.
Number three, the thing that you need to know is it is ready to be built upon, right?
So we did kind of talk about that.
You don't have to wait.
You don't have to wait.
It's literally available now.
And it's available in all of those platforms.
I didn't even talk about AWS.
But yeah, AWS's platform.
Dell, Nvidia, Groq, Groq with a Q, not the large language model from Twitter that no one
uses.
IBM, Google Cloud, Microsoft, Scale, Snowflake.
It is ready now, right?
Which I, again, I talked about this at the top of the show.
I love Meta's approach here.
Don't make us wait, right?
Google is like, hey, here's some marketing, here's some promise.
Let's see how our stock goes up or down.
And maybe we'll release these in the next six months to two years.
OpenAI kind of in the middle.
OpenAI, you know, when they had their spring event, they dropped their model that day.
Some of the more powerful features, we're still waiting on, you know, we should be getting them in theory any day.
But Meta, Meta just literally, you know, Mark Zuckerberg, new hairstyle, looking, looking fresh with the gold chain, just drop, mic drop.
Here's everything.
Go out and play available everywhere now, right?
Changes how business is done.
All right.
So number four, the number four thing you need to know.
An open source Llama 3.1 is more than your average
large language model.
Okay.
Again, we'll be linking to this in our newsletter so you can read a little bit more from both Zuckerberg's long blog post on the future of open source and what that means.
But it means a lot of things.
So he talked a lot about model distillation.
So that essentially means the big model, the 405B, can act as a teacher to quote unquote train student models.
Right?
You can also, companies can make models of any size.
That's the other thing.
If you are working with a closed source proprietary model, which is what all the other big guys are, right?
So Google Gemini, Anthropic Claude, ChatGPT, et cetera.
Those are all closed source proprietary, right?
It's not a bespoke option.
You know, it's one size fits all or you don't use it.
So you might be overpaying or you might be underutilizing.
But with this model distillation, with Meta's
3.1, you can make models of any size, right? You can use the big model to train the smaller models.
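As a rough illustration of the teacher and student idea, here is one common textbook formulation of a distillation loss: the student is penalized for diverging from the teacher's softened probability distribution over possible next tokens. This is a sketch of the general technique, not necessarily the procedure Meta describes, and the logits in the example are made-up numbers.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by temperature squared as is conventional."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [3.0, 1.0, 0.0]       # hypothetical scores from the large model
    good_student = [2.8, 1.1, 0.1]  # close to the teacher
    poor_student = [0.0, 1.0, 3.0]  # far from the teacher
    print(distillation_loss(teacher, good_student))
    print(distillation_loss(teacher, poor_student))
```

Training the small model to drive this number toward zero is what lets it inherit behavior from the big one.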
Also, a huge thing here is the synthetic data generation. So Llama 3.1 can generate high quality
synthetic data to enhance the performance of smaller models, right? We always talk about,
where is this data coming from? Where's this future high quality data coming from? Which I
personally think is a huge problem, right? Because models, for the most part, are trained on the open
internet. And guess what? People don't know this, but this predates ChatGPT's release. People,
you know, SEOs, content writers, they've been using the GPT technology since 2020. So I think you have
this, almost this data regurgitation issue where so much of the data right now is being built
by AI, right? A lot of studies say by 2026, more than 90% of new content on the internet will be
created with AI. So you have this problem. Where do you get this new
data, this original data? So what Meta is kind of doing here with this synthetic data
generation is you use the big model to create quote unquote new model, or sorry, new data
to train the smaller models or to train your company's models.
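The synthetic data idea can be sketched as a small pipeline: ask the large model a set of questions, keep the prompt and answer pairs, and save them in a format a fine-tuning job can read. In the sketch below, `fake_teacher` is a hypothetical stand-in for a real call to a large model, and the question template and field names are assumptions for illustration.

```python
import json
from typing import Callable, Dict, Iterable, List


def fake_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a call to a large teacher model."""
    return f"[teacher answer to: {prompt}]"


def build_synthetic_dataset(
    topics: Iterable[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Generate one prompt/completion pair per unique topic question."""
    rows = []
    seen = set()
    for topic in topics:
        question = f"Explain {topic} to a new customer-support agent."
        if question in seen:  # skip duplicates so the data stays varied
            continue
        seen.add(question)
        rows.append({"prompt": question, "completion": teacher(question)})
    return rows


def to_jsonl(rows: List[Dict[str, str]]) -> str:
    """Serialize rows as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(row) for row in rows)


if __name__ == "__main__":
    data = build_synthetic_dataset(
        ["refund policy", "refund policy", "shipping times"], fake_teacher
    )
    print(to_jsonl(data))
```

In a real setup the generated pairs would also be filtered for quality before any training run; that step is omitted here.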
Another thing is the customization, all right? And this is why it is much more than your
average large language model. It allows for model architecture customization based on specific
task or hardware constraints. So what that means,
is if you do want to build, let's just say a customer service, large language model for your
company, right? Put in all your data. If you're using a GPT-4o mini, if you're using a Claude 3,
if you're using a Google Gemini, you in theory are going to still be using the full model, right?
So you can essentially with an open source downloadable model that you can fork, you can build
upon, you can essentially do anything you want. Think of it like this. It's a template,
inside of a Word doc, it's a template inside a PowerPoint.
You can move things around.
You can, you know, if it's 100 pages, you can delete it down to just the 10 pages you need.
You can't do that with the other models, right?
Think of them as like a PDF.
You can't go in and update them.
So a lot of times you're either overpaying for what your business may need or you're
sacrificing speed, right?
Because you still got to use the full model, even if you only need a little sliver.
Okay.
And also, I mean, we have to talk about our
last thing that we need to know. Well, actually, let's look at a graph here first. This is changing.
All right. The gap between proprietary closed source models and open source or open weight models
is diminishing. So shout out to the original creator of this kind of graph that I'm showing here,
Maxime Labonne, I believe, on Twitter there. So about
two years ago, night and day difference. All right. So we have essentially MMLU on the left side,
and then we have our dates on the other axis. And we see that there was such a huge disconnect
about two years ago, even about a year and a half ago between these open source models,
right, that are free and available and anyone can fork them and do whatever they want,
and the closed source models. As of yesterday, there really is no difference anymore. It used to be
night and day. It used to be that open source wasn't really a good option for your business because
you were sacrificing on output quality. You were sacrificing on MMLU. You were saying, hey,
we could use this free open source model, but it's kind of dumb. Not anymore. The gap is essentially
non-existent. It is 0.1. It is 0.1 difference on the MMLU, right? Whereas before it was 10 points,
20 points, 30 points. Now it's essentially a wash
on is the model smart enough?
Is it capable, as capable as the others?
Right?
When we talk about the difference between the world leading model,
GPT-4 Omni, 88.7, and the now free 405B from Llama, 88.6,
it's no longer night and day, right?
It's no longer night and day.
We're essentially talking about and looking at the exact same thing.
All right.
Let's go here.
So number five, the last thing you need to know is obviously this new model has tremendous business impacts.
All right.
I'd say the last 72 hours, between OpenAI's GPT-4o mini, Meta's 3.1 release, the 405B release,
and also now with OpenAI saying,
hey, come play for free until September.
This completely changes what's possible with your business, right?
And I'm talking a little slow here because I'm thinking,
and I hope that you're thinking about this too.
And I want to point something out that Mark Zuckerberg said yesterday in an interview.
And I said this, y'all.
I kid you not.
There's always receipts.
I said this back in December.
And I think people laughed at me and thought I was crazy.
But back in December, my prediction was by 2024, we'll see.
But I said there will be more AI agents than humans, right?
And funny enough, Mark Zuckerberg said the exact same thing yesterday.
I hadn't heard any, you know, big person in tech say anything like that.
And yesterday I'm like, yeah, exactly.
I said that in December.
you know, now Mark Zuckerberg, who you could argue is one of the most prominent people in the world
in AI right now, right, at least a handful of five of the most prominent people, the future of
AI, the future of artificial general intelligence, AGI, in theory, the future of ASI, right?
Mark Zuckerberg is one of the most important and prominent people in the world right now,
because AI and generative AI and large language models impact business.
All right.
And you'll see, Meta has essentially thrown, not thrown away, but they've set aside their whole
Metaverse thing, right?
Three years ago it was Metaverse, Metaverse, Metaverse.
And now it's like spending, you know, hundreds of millions of dollars on compute, right?
They spent, I believe, $700 million just on GPUs to train Meta 3.1 and their future Meta models.
But, you know, the future business impacts, Mark Zuckerberg predicted there will be more AI agents
than humans.
He said every single business is going to have multiple agents, right?
And he's not just saying on their platform.
He's talking about the bigger picture.
And he does think and hope that the Llama model is going to be the most used AI model in the world.
And you have to think, it kind of might be possible, right?
It might be feasible because guess what?
Their model is everywhere.
Yes, you can go to Meta AI, which we're going to show you very briefly here in a second.
But it's also available in Instagram.
It's available in Facebook.
It's available in WhatsApp.
So Meta has the reach.
And guess what?
Yes, when you use these models, you're making them smarter.
So, you know, people are always like, oh, it's free, right?
Oh, we'll have another episode at some point on the downsides of using AI.
And if you're using something for free, you potentially, yeah, you are paying for it by giving them your data,
giving them your feedback, et cetera.
But Meta does have the chance with their model to be the most used AI model in the world.
And it's pretty feasible.
And so you have to pay attention when Mark Zuckerberg talks about how AI, AI agents,
large language models are going to completely reshape not only what is possible,
but how business works.
And I've been saying this since day one.
The future of business, especially here in the U.S., we are all going to be working with
agents, large language models, small language models, we're going to be prompting.
That's what our future work is going to be dependent on.
You are not going to be in, I don't know if it's going to be two years, five years,
10 years.
You're not going to be rewarded and promoted.
Your company isn't going to grow by the experts, by the subject matter experts anymore,
by the knowledge, by what you know, because all of that is being transferred into large
language models.
And in the future, small language models, right?
When you have these very capable models or agents that are going to be trained and fine-tuned
on one very specific task, right?
Think of that one specific thing that you do, right?
And then let's say you're lucky enough to be, you know, in the top 1% of people,
let's say in that one very specific thing that you do, you and 99 other people
are the smartest in the world at that one very specific task.
Guess what?
Very soon, with small language models,
because of what's happened in the last 72 hours,
there's going to be a fine-tuned model that does that one specific task
much better than you and the other 99 smartest people in the world combined.
So the future of how we work is getting the most out of these agents,
knowing how, when, and ultimately why we should be using them,
and how we can squeeze the most business value out of agents,
out of generative AI, out of large language models.
That's the future of business.
And that's why I think Meta Llama has tremendous business impact.
Because over the past 72 hours, the future of how we can work and the timeline, the timeline has
shortened exponentially because even a couple of months ago, you'd say, ah, you know,
this is going to take a while.
Not anymore.
This is like, you know, this is the equivalent of, you know, I don't know, 20 years ago,
it used to be hard to get your company on the internet.
Right.
You had to hire someone smart.
You had to put a lot of work in.
So imagine 20 years ago, someone just drops, hey, here's the best website in the world.
We're going to do it all for free.
Right.
Imagine the gold rush to a dot com, right?
It wouldn't have taken 10, 15, 20 years for companies to really thrive online.
This is a legit generative AI, large language model, gold rush.
The big players are putting things down and saying it is now free,
or free 99. It is cheap and the possibilities are endless.
All right.
So those are the five things you need to know.
So now let's just take a quick look live.
All right.
So for our live stream audience and hey, I'm going to be getting, if you do have questions,
like Monica said, did you do a demo for this?
Got one right here.
Got one right here.
But if you do have any other questions, go ahead and get them in.
I'll try to answer them here at the end.
A course would be good.
Should I do a Meta course? Gordon saying I should do a Meta course. Maybe. Maybe. Let me know if you think
that would be helpful. All right. So, all right, let's go. We're just going to do some very basic things.
All right. I did a whole video yesterday running through this. But let's just do some common things here.
So I am going to meta.ai. Okay. So you do have to log in
to either your Facebook or your Instagram account.
All right.
So a couple things you need to know.
And I'm glad they did this.
There's even been updates since yesterday.
I really railed against Meta for a couple of things.
And strangely enough, they fixed them.
Some of those things have been fixed.
A couple, there was some confusion in the model selection.
So now you want to go click your profile and go to settings.
Again, there's other ways you can access this.
You can access this
also on Hugging Face, where you can download the models and run them locally.
I'm just showing you probably the easiest way to use this, which is just going to
meta.ai.
All right.
This is not the front end interface that maybe you're used to.
It doesn't have the features and functionalities of a ChatGPT, of a Claude, even of a Google
Gemini.
But if you just want to see if the model is right for you for your business, it's a great
way to do it.
All right.
So you can go click on settings and then you can go to change your model.
Okay.
So I already changed it to the big one, which is the 405B.
So you can either use the 70B.
So again, these are the 3.1 updates.
So the small and the medium were updated to 3.1.
So yesterday it still said three.
And I'm like, uh, meta, what's going on here?
But you can either use the 70B or the 405B.
So you can't use the small one.
I'm wondering if they're going to change that pretty soon.
But so right now
I'm going to select Llama 3.1 405B.
All right.
So a couple of things on the interface.
It's super easy to use. Super simple, right? So you have your chat history on the left hand side.
At any time, you can click new conversation. But all you can really do, there's two things, right?
There's text. So you can do text to text. Or you can do text to photo. I'm not going to be going over that, but they have this Imagine feature.
It is pretty cool because you can type something in real time. So I can say Chicago and it's going to start, I don't know why it brought up a random dude.
So I can type in Chicago skyline. And it actually should be
doing things live. I can type in Chicago skyline hot dog, right? So then there's a hot dog. So,
you know, it does these live. I'm not going to go over this too much here, the Imagine feature,
the text to image. But the rest of this, you can click new conversation and then just chat with
Meta AI like you would any other large language model. All right. So I'm going to run a prompt
that I run sometimes. And again, I did this yesterday. So this is a
logic prompt, essentially.
So I said, I just woke up today with six apples and three bananas.
Yesterday, I ate a banana and two apples.
This morning, I will eat one apple and no bananas.
However, I don't really like apples and one banana may turn brown tomorrow,
assuming nothing else changes.
How many apples and bananas will I have tonight?
I'm going to create an actual, like, quote, unquote, testing series of questions that I can do with all these models.
But this is one I generally use.
So if there's a new model, you know, Sonnet 3.5,
GPT-4o Mini, et cetera.
I usually have a set of five to ten prompts.
These aren't scientific,
but these are either logic, reasoning, math, creativity.
I have a couple prompts that I run.
A lot of models get this wrong.
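A set of test prompts like the one described here could be kept as a small script. The following is a minimal sketch, not the host's actual setup: `PromptCase`, `run_suite`, and the stand-in `fake_model` are hypothetical names, and checking a reply for an expected substring is an assumed, deliberately crude way to grade it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PromptCase:
    category: str  # e.g. "logic", "math", "creativity"
    prompt: str
    expected: str  # substring a correct reply should contain (assumed grading rule)


def run_suite(cases: List[PromptCase], ask_model: Callable[[str], str]) -> Dict[str, bool]:
    """Send every prompt to the model and record whether the reply looks correct."""
    results = {}
    for case in cases:
        reply = ask_model(case.prompt)
        results[case.prompt] = case.expected.lower() in reply.lower()
    return results


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, so the sketch runs offline."""
    if "apples" in prompt:
        return "Tonight you will have five apples and three bananas."
    return "It takes three trips."


cases = [
    PromptCase("logic", "How many apples and bananas will I have tonight?", "five apples"),
    PromptCase("logic", "How can the man get across with his dog in the fewest trips?", "one trip"),
]

results = run_suite(cases, fake_model)
for prompt, passed in results.items():
    print("PASS" if passed else "FAIL", "-", prompt)
```

Swapping `fake_model` for a function that calls a real model would let the same cases be rerun whenever a new model ships. Since replies vary from run to run, a single pass or fail is only a rough signal.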
So what I like by default, right?
So again, I'm using the 405B.
By default, which I like here,
Meta gives essentially a chain-of-thought response,
even though I didn't tell it to.
You got to love that.
The first thing it says is let's break this down step by step,
which is a great prompting technique, right?
It's kind of the essence of what we do in our free Prime Prompt Polish course, right?
So it's saying let's break it down step by step.
So I'm going to go down to the bottom, see the answer.
It says, tonight you will have five apples and three bananas, which is correct.
A lot of models get confused.
I put in a lot of nonsense to try to throw the model off.
It does a good job here.
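The arithmetic behind that answer can be checked in a few lines. This is only a sketch of the reasoning, with hypothetical variable names. The one assumption is that the counts in the prompt (six apples, three bananas) are the counts at wake-up, so whatever was eaten yesterday is already accounted for.

```python
# Counts stated in the prompt for this morning, at wake-up.
apples_at_wakeup = 6
bananas_at_wakeup = 3

# Distractors: yesterday's eating happened before the wake-up count,
# and disliking apples or a banana turning brown tomorrow changes nothing tonight.
apples_eaten_today = 1
bananas_eaten_today = 0

apples_tonight = apples_at_wakeup - apples_eaten_today
bananas_tonight = bananas_at_wakeup - bananas_eaten_today

print(f"Tonight: {apples_tonight} apples and {bananas_tonight} bananas")
```

Only the one apple eaten this morning changes the totals, which is why five apples and three bananas is the expected answer.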
All right.
Let's go ahead.
I'm going to do another prompt.
This one, so many models get wrong.
Yesterday, Meta got it kind of right, kind of wrong.
So I'm going to run it the same.
Also, what's important to know?
I say this a lot.
Generative AI, it's kind of like rolling the dice, right?
Which is why prompt engineering and understanding how models work is so important.
You can run the same prompt 100 times, get 99 different results.
You could get three different results, right?
It is generative.
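The dice-rolling idea can be illustrated with a toy sampler. This is a simplified sketch under stated assumptions: the three candidate answers and their scores are made up, `sample_answer` is a hypothetical helper, and a real model samples one token at a time rather than picking among whole answers. It only shows why the same prompt can come back different, and why a very low temperature makes the output close to repeatable.

```python
import math
import random
from collections import Counter


def sample_answer(scores, temperature, rng):
    """Pick one answer at random, weighted by a softmax over the scores."""
    scaled = [s / temperature for s in scores.values()]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(list(scores), weights=weights, k=1)[0]


# Made-up preference scores for three possible replies to the same prompt.
scores = {"one trip": 2.0, "two trips": 1.5, "three trips": 1.0}

rng = random.Random(0)  # fixed seed so the sketch itself is repeatable
warm = Counter(sample_answer(scores, 1.0, rng) for _ in range(100))
cold = Counter(sample_answer(scores, 0.01, rng) for _ in range(100))

print("temperature 1.0 :", dict(warm))
print("temperature 0.01:", dict(cold))
```

At temperature 1.0 the 100 runs are spread across several answers, while at 0.01 the highest-scoring answer wins every time.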
All right.
So let's go ahead and try this prompt here.
I'm saying a man and his dog are standing on one
side of the river. There's a boat with enough room for one human and one animal. How can the man
get across with his dog in the fewest number of trips? All right. So it says a classic puzzle,
right? Which I do think there's all of these kind of like brain teasers or kind of large language
model logic tests that a lot of people have been doing. And I do think by now a lot of large language
models because they're trained on the open internet are scraping all of these. Right. So people post
these to Quora to Reddit, blog posts, et cetera.
And so models, I think over time, understand the answer because they gobble up all this
information on the open internet, and then they learn from it.
So let's see if this gets it right.
But most models get this wrong, you know, so let's see how it does.
All right.
So this, it got it wrong.
So it said, it essentially said three trips, right?
The correct answer is one trip.
If a man and his dog are on one side of the river and they have a boat with enough room for one
human and one animal, it just takes them one trip to get across to the other side of the river.
This one says it takes three. Yesterday, it said technically, it said two or one round trip.
So it got it kind of right yesterday. Today, it didn't get it right. It got it actually fairly wrong.
All right. Let's try one more. Again, this isn't supposed to be a full live breakdown of
Meta, but I wanted to at least show you some of the capabilities of
what's possible.
So this is something that I think most businesses could relate to.
So I'm saying, well, maybe not, you're maybe not creating a new company, but using this
to brainstorm to ideate to strategize.
That's what large language models are great at, essentially being a companion to help you
build a business, being a companion to help you market your department's campaign, being
a companion to help you essentially poke holes in things, right?
So I'm saying here, create a new company and brand for a future smart home device.
This will solve a problem that does not currently exist.
To start, come up with the company's name and its first flagship product, give the product a name, branding campaign, go-to-market strategy, tagline, and rationale for why it will work.
Respond in a succinct way, keeping responses to short bullet points, short bullet points, but with ultra-specific facts.
All right.
Hey, I might want this one.
So here's what Meta said with its Llama 3.1 405B inside of meta.ai.
The company name is Echoplex.
The flagship product is Dreamweaver.
It solves the problem of sleep data overload.
I could use that.
I'm not going to read the whole thing, but it has the product description in there.
Looks good.
The branding campaign, the tagline is unlock the hidden narrative of your mind.
Not bad.
I've seen worse.
It even has a color scheme with hex color codes.
It gives examples for a logo.
This is a zero shot prompt, y'all.
Very poorly written, there's no prompt engineering.
And the results are pretty good.
That's one thing I've noticed with this new model.
You don't have to do as much prompt engineering to get something decent, right?
A lot of times short prompts in other models don't get you great things.
It does almost always seem like it overdelivers when your prompt underdelivers, which I like.
So it does even have this kind of almost seemingly built-in chain-of-thought reasoning,
and it always seems to give you a little bit more without just being verbose, right?
So a lot of large language models, what they do is if there's ambiguity because your prompt
kind of stinks, it's just going to spit out a bunch of general nonsense content that's not really
good.
Meta doesn't really do that.
I think they give you specific content that is actually good.
So we also have the go-to-market strategy.
It gives us a target audience, launch channels, it gives pricing, how much this device should
be priced at.
It has the rationale, which is pretty spot on, right?
It says there's a growing interest in brain computer interfaces and neural implants.
That's true.
A lot of startup money is going into that.
Increasing awareness of mental wellness.
True.
There's a unique value proposition.
True.
It's even getting key partnerships.
So, all right.
That's enough for a live model.
But again, actually two other things with the interface.
I did mention that there's some new things that weren't there yesterday, which I think
are important to talk about. And I actually wish that more companies would have this. So at any point,
you can tell the model if it's a good or bad result. So there's a little interface. You can hover over.
You can copy content to the clipboard. So here's the two new things, which I like. You can share.
Okay. So I can then share this chat. I can copy the link. And whoever I send this to, as long as they
have an account, can go in. They can see the kind of the transcript. And then they can pick it up.
So they can essentially fork the chat.
It's not going to be shared, right?
But I can essentially give them access to everything I have created up until that point.
It's like creating a copy of a document.
And then that person can have all of that knowledge, have all that work, have the context
window, which is also important and then continue to work with it.
Right.
So that's a brand new feature that wasn't there a couple hours ago.
And then also the ability to save or unsave, right?
And that's especially important.
I wish ChatGPT and Claude had this feature because now on the sidebar
inside Meta, I can just click saved and then it's just going to have my saved chats.
All right.
So that's enough of a quick overview.
Let's wrap this thing up and get to your questions.
All right.
So a couple of questions here.
Denny asking, should we expect to see ad retargeting based on what we put into Llama,
same as can happen based on interactions on Facebook?
You know what?
I'm not actually sure.
There was a lot of information that came out yesterday.
I did a video review.
I spent hours planning for this show.
I haven't read through the whole policy.
So we'll look at that, Denny.
I'm not sure.
The MMLU score, yes.
It is ranking correct answers, right?
So think of it just like a standardized test.
When we talk about MMLU, it's essentially a standardized test, right?
There's right answers and there's wrong answers.
And there's different prompting techniques, right?
So there's some MMLU scores, which are based on zero shot, which is essentially
copy and paste prompt,
with no input-output pairings that tell the model what's good and bad.
But then there's also, you know, like five-shot, you know, MMLU scores,
which means you can do a little bit of prompt engineering and, you know,
kind of help the model get to the correct score or sorry, to the correct answer.
But for the most part, in MMLU, and a lot of the benchmarks,
there's right and there's wrong answers.
And there's different prompting techniques or methodologies that you use.
And then it's like, hey, did you get it right or wrong?
Just like a standardized test.
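The difference between zero-shot and five-shot scoring can be sketched in code. In the usual meaning of five-shot, five worked question-and-answer examples are placed ahead of the real question, while zero-shot sends the question alone. The helper names (`format_question`, `build_prompt`, `accuracy`) and the sample questions below are made up for illustration; this is not Meta's or anyone's official evaluation harness.

```python
def format_question(question, choices, answer=None):
    """Render one multiple-choice item; include the answer only for worked examples."""
    lines = [question]
    for letter, choice in zip("ABCD", choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:" + (f" {answer}" if answer else ""))
    return "\n".join(lines)


def build_prompt(target, examples=()):
    """Zero-shot when `examples` is empty, k-shot when it holds k solved items."""
    parts = [format_question(q, c, a) for q, c, a in examples]
    parts.append(format_question(target[0], target[1]))
    return "\n\n".join(parts)


def accuracy(predicted, correct):
    """Share of questions answered correctly, like a standardized test score."""
    right = sum(p == c for p, c in zip(predicted, correct))
    return right / len(correct)


# Made-up items: (question, choices, correct letter).
solved = [(f"What is {n} + {n}?", [str(2 * n), str(n), "0", "1"], "A") for n in range(2, 7)]
target = ("What is 3 * 3?", ["6", "9", "12", "33"])

zero_shot = build_prompt(target)
five_shot = build_prompt(target, solved)

print(five_shot)
print(accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"]))
```

Either way the grading step is the same: compare the model's chosen letter against the answer key and report the fraction correct.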
Yogesh, Yogesh in the house.
Former guest on Everyday AI.
What's going on, Yogesh?
So saying, will Llama 3.1 make custom GPTs obsolete on ChatGPT?
It's a great question, right?
Because when we think of, I said that this is a huge play for the developer community.
So there's something about ChatGPT and even creating GPTs.
The user experience is so nice.
It's so easy.
So in its current state, will this very new, powerful 3.1 make custom GPTs obsolete?
I'd say no.
Number one, because OpenAI still has the most powerful model in the world, even though it's
not by a lot.
But number two, the experience, the user experience is much easier, right?
The fact that anyone can go into ChatGPT right now, create a custom GPT with no coding skills.
It's not like it's low code.
You can create a literal custom version of ChatGPT,
drag and drop, no code.
Just say, hey, hey, GPT builder, this is what I need.
Let me upload my database and then it just does it.
That's amazing.
We don't have that quite yet with, you know, Llama 3.1.
We do have that a little bit with Claude.
We don't yet have that with Google Gemini, but we may once or if Google ever releases
their quote-unquote Gems, which is their kind of counterpart to GPTs.
So right now I will say no, it's not going to make custom GPTs obsolete because the user experience, right?
It's not quite there.
But is that going to be an area where Meta plays in, potentially, right?
That is where you start talking about agentic capabilities or agent capabilities, right?
When they're trained on very specific tasks, that's essentially what a GPT is.
It's something, a custom GPT that you can train on one very specific task.
And you say, hey, big, big model, just focus on this one very specific skill set.
Here's my data. Let me train you and make sure that you can do this task correctly.
So I don't think out of the box it's going to.
All right, y'all, I hope this was helpful.
Went a little longer than I had hoped.
But hey, that's just everyday AI for you, right?
We do this live.
It's unedited, unscripted.
I hope today's show was helpful.
If so, tag someone, right?
If you're here listening in the LinkedIn comments on Twitter, YouTube, whatever,
share this with someone.
If you're on the podcast, thanks for
tuning in, and make sure to tune in tomorrow and every day for more Everyday AI. Thanks y'all.
And that's a wrap for today's edition of Everyday AI. Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind.
Go break some barriers and we'll see you next time.
