Everyday AI Podcast – An AI and ChatGPT Podcast - EP 352: 5 Things to Know About Meta's Llama 3.1

Episode Date: September 6, 2024

Another one!? Another day, another HUGELY impactful model. Meta responds to OpenAI with a one-of-a-kind model in Llama 3.1 and the brand spanki...n new 405B model. What's it all mean? We gotchyu.

Related Episode: Ep 318: GPT-4o Mini: What you need to know and what no one’s talking about
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com

Topics Covered in This Episode:
1. Highlights of Meta's New Model - Llama 3.1
2. Spotlight on the 405b model
3. Functionalities and User Interface Updates
4. Meta's Focus and Its Implication on AI and Business

Timestamps:
01:30 Daily AI news
06:30 Meta Model overviews
11:11 Free open source models, accessible with limitations.
16:20 Meta's quick product launches receive praise.
17:32 Llama 3 to 3.1 brings huge improvements.
20:11 AI advancements democratizing app development and deployment.
24:35 Meta's benchmarks indicate promising performance
30:15 Customizable large language model for specific tasks.
34:46 Mark Zuckerberg's influence on AI prominence evident.
37:33 Experts and knowledge will be replaced by models.
42:27 Typing in real time with Meta AI.
43:24 Creating testing questions for new model iterations.
48:11 New model requires less prompt engineering, delivers more.
51:10 Ranking answers, prompt techniques, standardized testing methods.

Keywords:
OpenAI, Llama 3.1, Google, Databricks, Snowflake, Microsoft, AWS, Dell, NVIDIA, Groq, IBM, model distillation, data generation, MMLU comparison, GPT-4o Mini, 405b release, sharing chat transcripts, ad retargeting, custom GPTs, AI agents, Mark Zuckerberg, Meta, metaverse, 8b model, 70b model, 405b model, MMLU benchmark score, Meta AI, edge devices, OpenAI vs Meta.

Transcript
Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. The new 405B model from Llama and their Llama 3.1 updates are extremely powerful.
Starting point is 00:00:55 I mean, it's open source, free to download, and I think it'll change the business landscape. But there's a lot of things that I don't think people are paying attention to. And it's probably not what you think. So today, we're going to be going over that. And I'm going to tell you exactly what this Llama 405B and 3.1, what these things are. I'm going to show you a little bit of the new model live and go over five things that you need to know. All right. What's going on, y'all?
Starting point is 00:01:25 Let's get this thing started. My name's Jordan Wilson and welcome to Everyday AI. This thing is for you. Well, it's for all of us. It's a place where, you know, non-technical people, but still people who want to be technical, can learn about all things generative AI to grow your company and to grow your career. So if that's you, thank you for tuning in. On the podcast,
Starting point is 00:01:45 always make sure to check out your show notes. We just launched a cool campaign I'm going to tell you about here in a second. But there's always more information there, and make sure, if you haven't already, to go to youreverydayai.com and sign up for the free daily newsletter. All right. So let's get into it and talk about Meta, Llama 405B, the new model, and Llama 3.1. So we're going to give you what you need to know, what people aren't
Starting point is 00:02:09 talking about, and we're going to show you a little bit live. All right. And hey, for our live stream audience, thank you as always for joining. Michael and Brian and Fred, a couple of Chicago people in the house. Denny, Cecilia, thank you all for joining us. So let me know. Have you all used this new Llama 3.1 yet? Are you going to? What questions do you have? Please get them in now. I'll try to tackle them at the end. But let's just get straight into an overview here, y'all. So here's what's new. So you have technically three different tiers of models from Meta, right, the parent company of Facebook. So Mark Zuckerberg did all his media rounds yesterday, wrote a very long, you know, blog post about the future of Llama and open source AI. But here's
Starting point is 00:02:57 essentially what you need to know. In April, Meta announced Llama 3, right? And they said that there's three different sizes, right? A small, medium, and large. And that seems to be the new trend, I think, started by Anthropic, right? And essentially, there's use cases for small, medium, and large. And we'll talk about that a little bit more here. But back in April, when Meta first announced Llama 3, which is a huge upgrade, they only came out with the kind of small and medium models. So when we talk about what is 405B and what is 8B and 70B, well, those are sizes of models.
Starting point is 00:03:34 So the big one, the 405B, had not been announced until yesterday, or was not available, whereas the small and the medium size models were available, but they've been upgraded. All right. So I'm going to go ahead and share my screen a little bit here. And we're going to walk through some of the benchmarks because they are impressive. All right. So let's go ahead and share my screen here.
Starting point is 00:04:03 So for our podcast audience, I'm going to try to do my best. I don't think you're going to be missing out on anything here because I think I should be able to kind of walk us through this just a little bit. So like I talked about here, and live stream audience, let me know if you can see this. But so, multiple new models from Meta. So like I talked about, think of it as a small, medium and large, just like Anthropic, right? So Anthropic has their Haiku, small, their Sonnet, their medium, and their Opus, large. So for Meta, they're actually just naming it by the number of parameters it's trained on. So they have their 8B, which is their small, their 70B, which is their medium,
Starting point is 00:04:46 and then the just now released 405B. Big jump up, right? And without getting too technical, right, because our audience here is for the most part not super technical people, like myself, right? The easiest way to think about parameters is the amount of data that it's trained on, all right? So let's go and just jump into the benchmarks, because that is kind of our first point that we wanted to talk about: these benchmarks are pretty impressive. And again, I have to really hit rewind, because we have to draw a line even if we're looking here at these benchmarks. And this is what everyone always talks about and rightfully so, right?
Starting point is 00:05:27 So benchmarks are when all the smart researchers and scientists from both Meta and everyone from all the big companies, third parties, they all do these benchmarks. And you essentially get a score. So think of like a new car, right? When a new car comes out, you know, you get, oh, the EPA estimated gas mileage and, you know, it goes through all these, you know, third party, you know, safety tests, all these things. And it gets scores, right? So large language models are the same way.
Starting point is 00:05:54 And there are dozens of different benchmarks, but the one that we talk about a lot here on Everyday AI is the MMLU. Okay. So the MMLU benchmark is the Massive Multitask Language Understanding. So it's essentially 57 different subjects. And you get a score, right? So it's like the ACT or the SAT for large language models. And this is by far the gold standard.
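To make the "you get a score" idea concrete, here is a minimal sketch of how an MMLU-style score could be computed. MMLU questions are multiple choice, so the score is just the percentage answered correctly. The subject names and graded answers below are made up for illustration, not real benchmark data, and reporting conventions (overall percentage versus an average of per-subject percentages) vary between papers.

```python
from collections import defaultdict

def mmlu_style_score(results):
    """Score multiple-choice answers.

    results: iterable of (subject, predicted_choice, correct_choice).
    Returns (per-subject percent correct, overall percent correct).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, predicted, answer in results:
        total[subject] += 1
        if predicted == answer:
            correct[subject] += 1
    per_subject = {s: 100.0 * correct[s] / total[s] for s in total}
    overall = 100.0 * sum(correct.values()) / sum(total.values())
    return per_subject, overall

# Hypothetical graded answers (illustration only).
sample = [
    ("astronomy", "B", "B"),
    ("astronomy", "C", "A"),
    ("us_history", "D", "D"),
    ("us_history", "A", "A"),
]
per_subject, overall = mmlu_style_score(sample)
print(per_subject)
print(f"overall: {overall:.1f}")
```

With four answer choices per question, random guessing lands around 25, which is a useful floor to keep in mind when reading the scores in this episode.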
Starting point is 00:06:21 So when we talk about MMLU and this new Llama 3.1, the large variety. So that's what we're going to be sharing here in the screenshots. It is very impressive. And I have a screen here earlier that I'm going to be showing. But we have to keep in mind, this is an open source model, y'all. So what that means, this is free. This is free to use. You can download these models, although you really have to have like the world's strongest computer to download the 405B,
Starting point is 00:06:57 but for the small and medium models, most of us out there, if you have a newish computer with a decent graphics processing chip, if it has decent specs, you're going to be able to download these models from Meta. So before we get into these benchmarks anymore, we really have to talk about the importance of this being open source, right? You can download it. You can build on top of it without having to pay, right,
Starting point is 00:07:30 those inference costs, those training costs over and over, which is huge. Okay. And this is also, it's available in a lot of dev environments, which I'm going to show you here on the screen soon. So with that, let's look at these benchmarks. So the MMLU, Llama came in at an 88.6, which is extremely, extremely impressive. All right. I might do a full show soon.
Starting point is 00:08:00 Live stream audience, let me know. Do you want to see that? I might do a full show soon on MMLU. What are these benchmarks? What do they mean? But essentially, if you are an expert in one specific thing. So again, these models are trained, or sorry, the MMLU goes across 57 different subject areas.
Starting point is 00:08:18 Think if you are a world expert in one, a world expert is going to get about an 89.8. All right. An 89.8, a world expert in that one domain-specific field. Okay. But in the other 56, the average person gets about a mid-30s. Okay, so think, when we say that Llama has an 88.6, that means that a free model that you can download now
Starting point is 00:08:45 is essentially a world-class expert or almost as good. It's like having the 57 smartest people in the world available for free, their entire knowledge, and you can build with it, you can download it, you can work with it, you can make it your own. Okay, so that's what we talk about when we're talking about both MMLU and the power of having scores this high in an open source model that you can download, you can fork, you can build upon, and you're not having to, you know, pay a big company each and every time.
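As a rough illustration of "download it and build on it," here is a minimal sketch of calling a locally hosted model from Python. The episode does not name a local runner, so everything specific here is an assumption: it supposes you have installed Ollama, pulled a model under the tag `llama3.1:8b`, and that Ollama is serving its default local HTTP endpoint. The helper names `build_request` and `ask_local_model` are hypothetical.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="llama3.1:8b"):
    """Build the HTTP request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask_local_model(prompt, model="llama3.1:8b"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]

# Usage, once a local server is running:
#   print(ask_local_model("Summarize our refund policy in one sentence."))
```

Nothing in this sketch leaves your machine, which is the point being made about not paying a big company for each call.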
Starting point is 00:09:19 All right. So benchmarks are extremely important. So Llama 3.1, it is above Claude 3.5 Sonnet, but it is just below GPT-4 Omni. So GPT-4 Omni, 88.7, Llama 88.6, right? One fraction of a point away, which I was surprised about. I was surprised that Meta didn't sit on this for another couple of weeks and try to over-engineer and try to squeeze a little bit more juice out of it,
Starting point is 00:09:49 so it's the world leader on MMLU. But regardless, a free open source model that is essentially now the most powerful model, extremely impressive. So yes, the HumanEval score is also very, very top-notch, an 89 when the leader, Claude, is at 92. Pretty good there. A couple, though, that are worth noting, and I'm not going to talk about benchmarks this entire time. But two other things. I mean, GSM8K, which is essentially, you know, math, right, basic math: a 96.8, the most capable model right now in the world.
Starting point is 00:10:30 Also, the ARC Challenge, which is reasoning, got the highest scores of any model. So when we talk about benchmarks, extremely impressive, okay? We also have to talk about availability, where this is available, right? So we talked about how you can download this now. You can also go to meta.ai. I'm not sure which countries have access. I didn't get a full list from Meta. I'm sure they'll be rolling out with that soon, but you can go to meta.ai.
Starting point is 00:10:58 You do have to log in with either a Facebook account or an Instagram account, but you can use it for free literally right now. And I'm going to be going over a little bit of that live. But we have to also talk about where this is available, right? Because if you are a business leader, and let's say you're working at a Fortune 500 or an Inc. 5000 company, a big enterprise, right? This is available now, showing on the screen here.
Starting point is 00:11:27 This is available now in so many places, so many places here. So AWS, Databricks, NVIDIA's Foundry, IBM, Google Cloud, Microsoft, Scale, Snowflake, right? So where so many of these big companies, big enterprises, house their data, where they're trying to marry their data with the right large language model, it's available now. It's available today, which I love from Meta, right? You can have whatever thoughts you want about Mark Zuckerberg. You can have whatever thoughts you want about the social media side of Meta, you know, Facebook and Instagram and WhatsApp and data collection, all of these things. You can have whatever thoughts. But the fact that Meta announced this, dropped this, and it's available all instantly,
Starting point is 00:12:18 hats off, right? Because Google is notoriously bad for, you know, having all these big conferences, right? At their Google I/O conference, they announced all these things. This is months ago. And we haven't seen a fraction of them, at least when it comes to their new large language model developments and generative AI. Meta, it's like drop a blog post, drop some interviews.
Starting point is 00:12:40 And it's live, it's ready. So you can probably today go and work with this in your environment right now. And again, it's open source, y'all. All right. So like I said, the first thing that you need to know is the specs are extremely impressive. So like I talked about that, the benchmarks are great. A couple other things. The 8B and the 70B versions, the small and the medium, those are upgraded.
Starting point is 00:13:09 So that is the big jump from, you know, Llama 3 to Llama 3.1. So even if you look at just the improvements in the small and the medium model here, they're huge. I mean, they're very impressive. So the small and the medium models, if you look at them in their respective, quote unquote, weight classes, right, especially the 8B, because I think the 8B, probably within, I don't know, six months to a year of hardware, I think you're going to be able to see this on an edge device, as a model that could in theory run locally on a phone.
Starting point is 00:13:51 So that's the other huge upside to being open source and to be able to download a model is you can run it locally without internet. So then privacy concerns are less.
Starting point is 00:14:43 Speed is more. Environmental concerns are less of a concern at that point when you can run a model locally and you don't have to, you know, run it off of essentially a server. And so it's faster.
Starting point is 00:15:31 The latency's lower. It's more secure and you don't even need the internet. Right. But the 8B model, especially, I'm looking at this. And this is what I don't think people are talking about. So if you compare it to Google's new Gemma 2 model, so these are, again, I would call these something between a small language model and a small large language model. There's no actual definition because the goalposts are always moving, right? But these are models in theory, Llama 3.1 8B, 8 billion parameters, and Gemma 2 9B. These are models that in theory can be running locally on a smartphone, probably within six months to a year, right? So right now, Google also has Gemini Nano, which is one of the ones that's running on current
Starting point is 00:16:17 smartphones. But you have to also think, when you see all of these announcements, we can't just look at the benchmarks and what they mean today. We have to look at what they mean in the future. So I already told you, this has huge future implications, which we're going to get into, about how businesses can run now. But you also have to think of what this means for the future of on-the-go AI, which is the future. I've said it hundreds of times. The future of large language models is small language
Starting point is 00:16:42 models, and working with many of them, and working with them on edge AI, on-device AI. But the 8B model just is outpunching its weight class in almost every single benchmark aside from like one. It is the top model. It is the top small large language model. And it's not even particularly close. And again, open source. So think of what this means for the future of even apps that you use, right? Because now all of these developers, you know, it's not like even with GPT-4o Mini, which we're going to talk about here, that brought the cost way down. But, you know, a month or two ago, you know, it could be expensive to go launch a brand new
Starting point is 00:17:22 app that was powered by AI for, you know, enterprise business or just something that's, you know, fun for people to use. This changes things. This changes what people with low budgets can go and build in a weekend or in a day and launch immediately. This brings scalability to even small companies that don't maybe have, you know, compute power. Well, you might not even need it with an 8 billion parameter model, which is fairly small. You can download this and run it on a machine that's not even very powerful.
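For a rough feel of why the 8B can run on modest hardware while the 405B cannot, here is a back-of-the-envelope sketch. Strictly speaking, a parameter count is the number of learned weights in the model, so the memory needed just to hold the weights is roughly parameters times bytes per parameter. The nominal sizes and precisions below are assumptions for illustration, and the estimate ignores runtime overhead such as activations and cache, so real requirements are higher.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory needed to hold the weights alone, in gigabytes."""
    return num_params * bytes_per_param / 1e9

# Nominal parameter counts for the three sizes discussed in the episode.
MODELS = {"8B": 8e9, "70B": 70e9, "405B": 405e9}

# Common storage precisions (assumed here, not stated in the episode).
PRECISIONS = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}

for name, n in MODELS.items():
    row = ", ".join(
        f"{label}: {weight_memory_gb(n, b):,.0f} GB"
        for label, b in PRECISIONS.items()
    )
    print(f"{name} -> {row}")
```

At 16-bit precision the 8B model needs on the order of 16 GB for weights, while the 405B needs on the order of 810 GB, which is why the big one is a data center model rather than a laptop model.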
Starting point is 00:17:53 All right. And the benchmarks are extremely impressive there. Also, another thing to note, and I promise you, this is it for our more technical side. All right. So, 128K context window, huge, right?
Starting point is 00:18:08 Because the context window was very small before. I believe it was 8K or less. Also, Meta is using a slightly different approach, and this gets into a little bit of the technical side,
Starting point is 00:18:20 but mixture of experts, or MoE, is what a lot of these large language model companies have been using. So Meta, and we'll put it in our newsletter today, went a different route, going with this herd of experts versus the mixture of experts. All right. So enough on the technical side, let's go ahead and talk about the second thing that you need to know.
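To put the jump from roughly 8K to 128K of context in practical terms, here is a small sketch that estimates whether a document fits in a given window. The four-characters-per-token figure is only a rough rule of thumb for English text and is an assumption here; a real check would use the model's own tokenizer. The function names and the 1,000 tokens reserved for the answer are likewise illustrative.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb for English text (assumption)

def estimate_tokens(text):
    """Estimate the token count with ceiling division on character length."""
    return -(-len(text) // CHARS_PER_TOKEN)

def fits_in_context(text, window_tokens, reserved_for_output=1_000):
    """True if the text plus room for the model's answer fits in the window."""
    return estimate_tokens(text) + reserved_for_output <= window_tokens

# A hypothetical 100,000-character report, about 25,000 estimated tokens.
report = "x" * 100_000
print(fits_in_context(report, window_tokens=8_000))
print(fits_in_context(report, window_tokens=128_000))
```

Under these assumptions the same report overflows an 8K window but fits comfortably in a 128K one, which is the practical difference being described.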
Starting point is 00:18:41 And I think that this is shots fired. I mean, shots fired against OpenAI, but this is a developer-focused strategy, right? Because I think people are overlooking the vast improvements that were made in the 8B and the 70B, the small and the medium models. And what this means is I think they're just going after OpenAI, right? We saw Anthropic with Claude 3.5 Sonnet.
Starting point is 00:19:07 We saw Google Gemini with 1.5 Flash, right? So, and Claude presumably will be releasing 3.5 Haiku, which is their small model, right? So Haiku, Sonnet, Opus. So over, I'd say, the first quarter or two of 2024, OpenAI had lost traction, I think, among developers. They were jumping ship. We had a whole episode about this on Friday, right after they released their new model, GPT-4o Mini, which is essentially a light version of their big boy model.
Starting point is 00:19:39 So now OpenAI has essentially a small and a large. They don't have a medium per se. But this really changed the developer landscape. So OpenAI, I think, was losing customers. They were going to Google Gemini for 1.5 Flash. It was cheaper and more powerful. They were going to Anthropic for Claude 3 Haiku. It was cheaper and more powerful.
Starting point is 00:20:02 You know, OpenAI was essentially relying on either 3.5, which wasn't very powerful, or GPT-4, which was very expensive. So I do think that OpenAI did make a huge splash with GPT-4o Mini. Did that happen to coincide with the fact that Meta decided days after to release their 3.1 updates? Potentially, right? Because I think they are now going directly after developers. And it can't also be lost on us how OpenAI responded.
Starting point is 00:20:32 I talked about that at the top of the news this morning. Literally hours after Meta released a very impressive 3.1, three models, benchmarks go wild, right? Literally hours after that, OpenAI said, oh, you know what? We're actually going to make the cost zero for the next couple of months to come and fine-tune your models, right? So, hey, developers, hey, businesses, right? You want to, you know, bring in RAG. You want to have this retrieval augmented generation. You want to kind of marry and create your own large language
Starting point is 00:21:09 model with your data and some fine-tuning. Generally, you know, a year and a half ago, very expensive, very hard to do. Now, it's easier and it's cheap. So OpenAI said, hey, for the next couple of months, you can come in and do it for free. Two million tokens a day, which I think is in direct response. They saw what Meta dropped and they're like, whoa, these benchmarks, pretty dang good. Pretty dang good, right? We don't have comparisons yet with the 8B, 70B, and GPT-4o Mini, but I'm sure those will be rolling out here soon,
Starting point is 00:21:41 and we'll talk about it. But this is a developer-focused strategy from Meta, and it is a direct shot at OpenAI. All right. So I do think at least right now, it's a two-player battle. It's a two-player battle. But what this does mean is I fully expect Google and Anthropic to be releasing updates in the next two months, right? This changes.
Starting point is 00:22:10 It's getting so cheap, where it's getting to the point that to train a model, you're not going to have a lot of costs. Right. We always talk about inference and the cost per token of training. When we see a model like Llama 3.1 and their 8B, their 70B, it's free 99, y'all. Yeah, you got to have a computer powerful enough to handle it. You still need the skill set. But the cost to train a model on your company's own data, the cost to build something on top of a large language model is going down to like free 99, right?
Starting point is 00:22:50 Whereas a year and a half ago, pretty expensive. And now the availability is everywhere. You saw the chart, right? It's available inside NVIDIA's platform, their Foundry platform. It's available inside of Microsoft, inside of Google, inside of Databricks, inside of Snowflake, right? So all these dev environments, it's available there. So the cost is going down. And this means that I think Google and Anthropic are going to be responding here soon.
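The retrieval augmented generation (RAG) idea mentioned a moment ago, marrying a model with your own company data, can be sketched in a few lines. This is a toy version under stated assumptions: it ranks documents by simple word overlap, whereas production systems typically use embeddings and a vector database, and the policy snippets and function names are invented for illustration. The resulting prompt would then be sent to whichever model you are building on.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=2):
    """Combine the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical company documents (illustration only).
DOCS = [
    "Refunds are issued within 14 days of purchase.",
    "Our support line is open weekdays from 9 to 5.",
    "Shipping to Canada takes 5 to 7 business days.",
]
print(build_prompt("How long do refunds take?", DOCS, k=1))
```

The model itself is unchanged in this pattern; only the prompt is enriched, which is part of why it is cheaper than retraining.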
Starting point is 00:23:21 All right. Number three, the thing that you need to know is it is ready to be built upon, right? So we did kind of talk about that. You don't have to wait. You don't have to wait. It's literally available now. And it's available in all of those platforms. I didn't even talk about AWS.
Starting point is 00:23:40 But yeah, AWS's platform, Dell, NVIDIA, Groq, Groq with a Q, not the large language model from Twitter that no one uses, IBM, Google Cloud, Microsoft, Scale, Snowflake. It is ready now, right? Which I, again, I talked about this at the top of the show. I love Meta's approach here. Don't make us wait, right?
Starting point is 00:24:00 Google is like, hey, here's some marketing, here's some promise. Let's see how our stock goes up or down. And maybe we'll release these in the next six months to two years. OpenAI kind of in the middle. OpenAI, you know, when they had their spring event, they dropped their model that day. Some of the more powerful features, we're still waiting on; you know, we should be getting them in theory any day. But Meta, Meta just literally, you know, Mark Zuckerberg, new hairstyle, looking fresh with the gold chain, just drop, mic drop. Here's everything.
Starting point is 00:24:36 Go out and play, available everywhere now, right? Changes how business is done. All right. So number four, the number four thing you need to know. An open source Llama 3.1 is more than your average large language model. Okay. Again, we'll be linking to this in our newsletter so you can read a little bit more from Zuckerberg's long blog post on the future of open source and what that means.
Starting point is 00:25:06 But it means a lot of things. So he talked a lot about model distillation. So that essentially means the big model, the 405B, can act as a teacher to quote unquote train student models. Right? Also, companies can make models of any size. That's the other thing. If you are working with a closed source proprietary model, which is what all the other big guys are, right? So Google Gemini, Anthropic Claude, ChatGPT, et cetera.
Starting point is 00:25:36 Those are all closed source proprietary, right? It's not a bespoke option. You know, it's one size fits all or you don't use it. So you might be overpaying or you might be underutilizing. But with this model distillation, with Meta's 3.1, you can make models of any size, right? You can use the big model to train the smaller models. Also, a huge thing here is the synthetic data generation. So Llama 3.1 can generate high quality synthetic data to enhance the performance of smaller models, right? We always talk about,
Starting point is 00:26:09 where is this data coming from? Where's this future high quality data coming from? Which I personally think is a huge problem, right? Because models, for the most part, are trained on the open internet. And guess what? People don't know this, but this predates ChatGPT's release. People, you know, SEOs, content writers, they've been using the GPT technology since 2020. So I think you have almost this data regurgitation issue where so much of the data right now is being built by AI, right? A lot of studies say by 2026, more than 90% of new content on the internet will be created with AI. So you have this problem. Where do you get this new data, this original data? So what Meta is kind of doing here with this synthetic data
Starting point is 00:26:55 generation is you use the big model to create quote unquote new model, or sorry, new data to train the smaller models or to train your company's models. Another thing is the customization, all right? And this is why it is much more than your average large language model. It allows for model architecture customization based on specific tasks or hardware constraints. So what that means is if you do want to build, let's just say, a customer service large language model for your company, right? Put in all your data. If you're using a GPT-4o Mini, if you're using a Claude 3, if you're using a Google Gemini, you in theory are going to still be using the full model, right?
Starting point is 00:27:41 So you can essentially, with an open source downloadable model that you can fork, you can build upon, you can essentially do anything you want. Think of it like this. It's a template inside of a Word doc, it's a template inside a PowerPoint. You can move things around. You can, you know, if it's 100 pages, you can delete it down to just the 10 pages you need. You can't do that with the other models, right? Think of them as like a PDF. You can't go in and update them.
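The teacher and student idea described above can be sketched with the classic distillation loss, where a small model is nudged to match the big model's softened output probabilities. To be clear about assumptions: the episode does not say how Meta or anyone else distills Llama 3.1, and in practice distillation is also often done by simply training the student on text the teacher generates (the synthetic data route mentioned above). The logits below are invented numbers, and the temperature of 2.0 is an arbitrary illustrative choice.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from teacher to student on softened distributions,
    scaled by temperature squared as in the classic formulation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical next-token scores over a three-word vocabulary.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
poor_student = [0.5, 1.0, 4.0]
print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, poor_student))
```

A student whose predictions already resemble the teacher's gets a loss near zero, while one that disagrees gets a larger loss, and training pushes that number down.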
Starting point is 00:28:03 So a lot of times you're either overpaying for what your business may need or you're sacrificing speed, right? Because you still got to use the full model, even if you only need a little sliver. Okay. And also, I mean, we have to talk about our last thing that we need to know. Well, actually, let's look at a graph here first. This is changing. All right. The gap between proprietary closed source models and open source or open weight models is diminishing. So shout out to the original creator of this kind of graph that I'm showing here,
Starting point is 00:28:46 Maxime Labonne, I believe on Twitter there. So about two years ago, night and day difference. All right. So we have essentially MMLU on the left side, and then we have our dates on the other axis. And we see that there was such a huge disconnect about two years ago, even about a year and a half ago, between these open source models, right, that are free and available and anyone can fork them and do whatever they want, and the closed source models. As of yesterday, there really is no difference anymore. It used to be night and day. It used to be that open source wasn't really a good option for your business because you were sacrificing on output quality. You were sacrificing on MMLU. You were saying, hey,
Starting point is 00:29:32 we could use this free open source model, but it's kind of dumb. Not anymore. The gap is essentially non-existent. It is 0.1. It is a 0.1 difference on the MMLU, right? Whereas before it was 10 points, 20 points, 30 points. Now it's essentially a wash on: is the model smart enough? Is it as capable as the others? Right? When we talk about the difference between the world leading model, GPT-4 Omni, 88.7, and the now free 405B from Llama, 88.6,
Starting point is 00:30:10 it's no longer night and day, right? It's no longer night and day. We're essentially talking about and looking at the exact same thing. All right. Let's go here. So number five, the last thing you need to know is obviously this new model has tremendous business impacts. All right. I'd say the last 72 hours, between OpenAI's GPT-4o Mini and Meta's 3.1 release, the 405B release,
Starting point is 00:30:50 and also now with OpenAI saying, hey, come play for free until September. This completely changes what's possible with your business, right? And I'm talking a little slow here because I'm thinking, and I hope that you're thinking about this too. And I want to point something out that Mark Zuckerberg said yesterday in an interview. And I said this, y'all. I kid you not.
Starting point is 00:31:17 There's always receipts. I said this back in December. And I think people laughed at me and thought I was crazy. But back in December, my prediction was by 2024, we'll see. But I said there will be more AI agents than humans, right? And funny enough, Mark Zuckerberg said the exact same thing yesterday. I hadn't heard any, you know, big person in tech say anything like that. And yesterday I'm like, yeah, exactly.
Starting point is 00:31:48 I said that in December. You know, now Mark Zuckerberg, who you could argue is one of the most prominent people in the world in AI right now, right, at least in a handful of five of the most prominent people, the future of AI, the future of artificial general intelligence, AGI, in theory, the future of ASI, right? Mark Zuckerberg is one of the most important and prominent people in the world right now, because AI and generative AI and large language models impact business. All right. And you'll see, Meta has essentially thrown, not thrown away, but they've set aside their whole
Starting point is 00:32:23 Metaverse thing, right? Three years ago it was Metaverse, Metaverse, Metaverse. And now it's like spending, you know, hundreds of millions of dollars on compute, right? They spent, I believe, $700 million just on GPUs to train Meta 3.1 and their future Meta models. But, you know, the future business impacts, Mark Zuckerberg predicted there will be more AI agents than humans. He said every single business is going to have multiple agents, right? And he's not just saying on their platform.
Starting point is 00:32:54 He's talking about the bigger picture. And he does think and hope that the Llama model is going to be the most used AI model in the world. And you have to think, it kind of might be possible, right? It might be feasible because guess what? Their model is everywhere. Yes, you can go to Meta AI, which we're going to show you very briefly here in a second. But it's also available in Instagram. It's available in Facebook.
Starting point is 00:33:25 It's available in WhatsApp. So Meta has the reach. And guess what? Yes, when you use these models, you're making them smarter. So, you know, people are always like, oh, it's free, right? Oh, we'll have another episode at some point on the downsides of using AI. And if you're using something for free, you potentially, yeah, you are paying for it by giving them your data, giving them your feedback, et cetera.
Starting point is 00:33:47 But Meta does have the chance with their model to be the most used AI model in the world. And it's pretty feasible. And so you have to pay attention when Mark Zuckerberg talks about how AI, AI agents, large language models are going to completely reshape not only what is possible, but how business works. And I've been saying this since day one. The future of business, especially here in the U.S., we are all going to be working with agents, large language models, small language models, we're going to be prompting.
Starting point is 00:34:20 That's what our future work is going to be dependent on. You are not going to be in, I don't know if it's going to be two years, five years, 10 years. You're not going to be rewarded and promoted. Your company isn't going to grow by the experts, by the subject matter experts anymore, by the knowledge, by what you know, because all of that is being transferred into large language models. And in the future, small language models, right?
Starting point is 00:34:42 When you have these very capable models or agents that are going to be trained and fine-tuned on one very specific task, right? Think of that one specific thing that you do, right? And then let's say you're lucky enough to be, you know, in the top 1% of people, let's say in that one very specific thing that you do, you know, you and 99 other people are the smartest in the world at that one very specific task. Guess what? Very soon,
Starting point is 00:35:09 a small language model because of what's happened in the last 72 hours, there's going to be a fine-tuned model that does that one specific task much better than you and the other 99 smartest people in the world combined. So the future of how we work is getting the most out of these agents, knowing how, when, and ultimately why we should be using them, and how we can squeeze the most business value out of agents, out of generative AI out of large language models. That's the future of business.
Starting point is 00:35:41 And that's why I think Meta Llama has tremendous business impact. Because over the past 72 hours, the future of how we can work and the timeline, the timeline has shortened exponentially because even a couple of months ago, you'd say, ah, you know, this is going to take a while. Not anymore. This is like, you know, this is the equivalent of, you know, I don't know, 20 years ago, it used to be hard to get your company on the internet. Right.
Starting point is 00:36:08 You had to hire someone smart. You had to put a lot of work in. So imagine 20 years ago, someone just drops, hey, here's the best website in the world. We're going to do it all for free. Right. Imagine the gold rush to a dot com, right? It wouldn't have taken 10, 15, 20 years for companies to really thrive online. This is a legit generative AI, large language model, gold rush.
Starting point is 00:36:32 The big players are putting things down and saying it is now free, or free 99, it is cheap, and the possibilities are endless. All right. So those are the five things you need to know. So now let's just take a quick look live. All right. So for our live stream audience, and hey, if you do have questions, like Monica said, did you do a demo for this?
Starting point is 00:36:59 Got one right here. Got one right here. But if you do have any other questions, go ahead and get them in. I'll try to answer them here at the end. A course would be good. Should I do a Meta course? Gordon's saying I should do a Meta course. Maybe. Maybe. Let me know if you think that would be helpful. All right. So, all right, let's go. We're just going to do some very basic things. All right. I did a whole video yesterday running through this. But let's just do some common things here.
Starting point is 00:37:28 So I am going to meta.ai. Okay. So you do have to log in to either your Facebook or your Instagram account. All right. So a couple things you need to know. And I'm glad they did this. There's even been updates since yesterday. I really railed against Meta for a couple of things. And strangely enough, they fixed them.
Starting point is 00:37:53 Some of those things have been fixed. There was some confusion in the model selection. So now you want to go click your profile and go to settings. Again, there's other ways you can access this. You can also access this on Hugging Face, you can download the models, run them locally. I'm just showing you probably the easiest way to use this, which is just going to meta.ai.
Starting point is 00:38:12 All right. This is not the front end interface that maybe you're used to. It doesn't have the features and functionalities of a ChatGPT, of a Claude, even of a Google Gemini. But if you just want to see if the model is right for you, for your business, it's a great way to do it. All right. So you can go click on settings and then you can go to change your model.
Starting point is 00:38:31 Okay. So I already changed it to the big one, which is 405B. So you can either use the 70B. So again, these are the 3.1 updates. So the small and the medium were updated to 3.1. So yesterday it still said three. And I'm like, uh, Meta, what's going on here? But you can either use the 70B or the 405B.
Starting point is 00:38:51 So you can't use the small one. I'm wondering if they're going to change that pretty soon. But so right now I'm going to select Llama 3.1 405B. All right. So a couple of things on the interface. It's super easy to use. Super simple, right? So you have your chat history on the left hand side. At any time, you can click new conversation. But all you can really do, there's two things, right?
Starting point is 00:39:15 There's text. So you can do text to text. Or you can do text to photo. I'm not going to be going over that, but they have this Imagine feature. It is pretty cool because you can type something in real time. So I can say Chicago and it's going to start, I don't know why it brought up a random dude. So I can type in Chicago Skyline. And it actually should be doing things live. I can type in Chicago Skyline hot dog, right? So then there's a hot dog. So, you know, it does these live. I'm not going to go over this too much here, the Imagine feature, the text to image. But the rest of this, you can click new conversation and then just chat with Meta AI like you would any other large language model. All right. So I'm going to run a prompt that I run sometimes. And again, I did this yesterday. So this is a logic prompt, essentially.
Starting point is 00:40:04 So I said, I just woke up today with six apples and three bananas. Yesterday, I ate a banana and two apples. This morning, I will eat one apple and no bananas. However, I don't really like apples and one banana may turn brown tomorrow, assuming nothing else changes. How many apples and bananas will I have tonight? I'm going to create an actual, like, quote, unquote, testing series of questions that I can do with all these models.
Starting point is 00:40:29 But this is one I generally use. So if there's a new model, you know, Sonnet 3.5, GPT-4o Mini, et cetera. I usually have a set of five to ten prompts. These aren't scientific, but these are either logic, reasoning, math, creativity. I have a couple prompts that I run. A lot of models get this wrong.
Starting point is 00:40:48 So what I like by default, right? So again, I'm using the 405B. By default, which I like here, Meta takes essentially a chain-of-thought prompting response, even though I didn't tell it to. You got to love that. The first thing it says is let's break this down step by step, which is a great prompting technique, right?
Starting point is 00:41:09 It's kind of the essence of what we do in our free Prime Prompt Polish course, right? So it's saying let's break it down step by step. So I'm going to go down to the bottom, see the answer. It says, tonight you will have five apples and three bananas, which is correct. A lot of models get confused. I put in a lot of nonsense to try to throw the model off. It does a good job here. All right.
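As a quick check on the arithmetic in that prompt, here is a minimal Python sketch. The function name `fruit_tonight` and its parameters are made up for illustration. It assumes the counts given for this morning are the starting point, so yesterday's eating and a banana possibly browning tomorrow do not change tonight's totals.

```python
def fruit_tonight(apples_this_morning, bananas_this_morning,
                  apples_eaten_today, bananas_eaten_today):
    """Return (apples, bananas) left tonight.

    Only what is eaten today matters: the morning counts already reflect
    anything eaten yesterday, and a banana that may turn brown tomorrow
    is still a banana tonight.
    """
    apples = apples_this_morning - apples_eaten_today
    bananas = bananas_this_morning - bananas_eaten_today
    return apples, bananas


# Values taken from the prompt: six apples and three bananas this morning,
# one apple and no bananas eaten today.
apples, bananas = fruit_tonight(6, 3, 1, 0)
print(f"Tonight: {apples} apples and {bananas} bananas")
```

The distractors in the prompt (yesterday's fruit, disliking apples, the browning banana) never enter the calculation, which is the part models tend to trip over.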
Starting point is 00:41:31 Let's go ahead. I'm going to do another prompt. This one, so many models get wrong. Yesterday, Meta got it kind of right, kind of wrong. So I'm going to run it the same. Also, what's important to know? I say this a lot. Generative AI, it's kind of like rolling the dice, right?
Starting point is 00:41:46 Which is why prompt engineering and understanding how models work is so important. You can run the same prompt 100 times, get 99 different results. You could get three different results, right? It is generative. All right. So let's go ahead and try this prompt here. I'm saying a man and his dog are standing on one side of the river. There's a boat with enough room for one human and one animal. How can the man
Starting point is 00:42:08 get across with his dog in the fewest number of trips? All right. So it says a classic puzzle, right? Which I do think there's all of these kind of like brain teasers or kind of large language model logic tests that a lot of people have been doing. And I do think by now a lot of large language models because they're trained on the open internet are scraping all of these. Right. So people post these to Quora to Reddit, blog posts, et cetera. And so models, I think over time, understand the answer because they gobble up all this information on the open internet, and then they learn from it. So let's see if this gets it right.
Starting point is 00:42:43 But most models get this wrong, you know, so let's see how it does. All right. So this, it got it wrong. So it essentially said three trips, right? The correct answer is one trip. If a man and his dog are on one side of the river and they have a boat with enough room for one human and one animal, it just takes them one trip to get across to the other side of the river. This one says it takes three. Yesterday, it said technically, it said two or one round trip.
Starting point is 00:43:11 So it got it kind of right yesterday. Today, it didn't get it right. It got it actually fairly wrong. All right. Let's try one more. Again, this isn't supposed to be a full live breakdown of Meta, but I wanted to at least show you some of the capabilities of what's possible. So this is something that I think most businesses could relate to. So I'm saying, well, you're maybe not creating a new company, but using this to brainstorm, to ideate, to strategize. That's what large language models are great at, essentially being a companion to help you
Starting point is 00:43:48 build a business, being a companion to help you market your department's campaign, being a companion to help you essentially poke holes in things, right? So I'm saying here, create a new company and brand for a future smart home device. This will solve a problem that does not currently exist. To start, come up with the company's name and its first flagship product, give the product a name, branding campaign, go-to-market strategy, tagline, and rationale for why it will work. Respond in a succinct way, keeping responses to short bullet points, short bullet points, but with ultra-specific facts. All right. Hey, I might want this one.
Starting point is 00:44:29 So here's what Meta said in its 405B 3.1 inside of meta.ai. The company name is Echoplex. The flagship product is Dreamweaver. It solves the problem of sleep data overload. I could use that. I'm not going to read the whole thing, but it has the product description in there. Looks good. The branding campaign, the tagline is unlock the hidden narrative of your mind.
Starting point is 00:44:54 Not bad. I've seen worse. It even has a color scheme with hex color codes. It gives examples for a logo. This is a zero shot prompt, y'all. Very poorly written, there's no prompt engineering. And the results are pretty good. That's one thing I've noticed with this new model.
Starting point is 00:45:09 You don't have to do as much prompt engineering to get something decent, right? A lot of times short prompts in other models don't get you great things. It almost always seems like it over-delivers when your prompt under-delivers, which I like. So it does even have this kind of almost seemingly built-in chain-of-thought reasoning, and it always seems to give you a little bit more without just being verbose, right? So a lot of large language models, what they do is if there's ambiguity because your prompt kind of stinks, it's just going to spit out a bunch of general nonsense content that's not really good.
Starting point is 00:45:41 Meta doesn't really do that. I think they give you specific content that is actually good. So we also have the go-to-market strategy. It gives us a target audience, launch channels, it gives pricing, how much this device should be priced at. It has the rationale, which is pretty spot on, right? It says there's a growing interest in brain computer interfaces and neural implants. That's true.
Starting point is 00:46:03 A lot of startup money is going into that. Increasing awareness of mental wellness. True. There's a unique value proposition. True. It's even getting key partnerships. So, all right. That's enough for a live model.
Starting point is 00:46:18 But again, actually two other things with the interface. I did mention that there's some new things that weren't there yesterday, which I think are important to talk about. And I actually wish that more companies would have this. So at any point, you can tell the model if it's a good or bad result. So there's a little interface. You can hover over. You can copy content to the clipboard. So here's the two new things, which I like. You can share. Okay. So I can then share this chat. I can copy the link. And whoever I send this to, as long as they have an account, can go in. They can see the kind of the transcript. And then they can pick it up. So they can essentially fork the chat.
Starting point is 00:46:54 It's not going to be shared, right? But I can essentially give them access to everything I have created up until that point. It's like creating a copy of a document. And then that person can have all of that knowledge, have all that work, have the context window, which is also important, and then continue to work with it. Right. So that's a brand new feature that wasn't there a couple hours ago. And then also the ability to save or unsave, right?
Starting point is 00:47:16 And that's especially important. I wish ChatGPT and Claude had this feature because now on the sidebar inside Meta, I can just click saved and then it's just going to have my saved chats. All right. So that's enough of a quick overview. Let's wrap this thing up and get to your questions. All right. So a couple of questions here.
Starting point is 00:47:40 Denny asking, should we expect to see ad retargeting based on what we put into Llama, same as can happen based on interactions on Facebook? You know what? I'm not actually sure. There was a lot of information that came out yesterday. I did a video review. I spent hours planning for this show. I haven't read through the whole policy.
Starting point is 00:48:00 So we'll look at that, Denny. I'm not sure. The MMLU score, yes. It is ranking correct answers, right? So think of it just like a standardized test. When we talk about MMLU, it's essentially a standardized test, right? There's right answers and there's wrong answers. And there's different prompting techniques, right?
Starting point is 00:48:17 So there's some MMLU scores which are based on zero-shot, which is essentially a copy and paste prompt, with no input-output pairings that tell the model what's good and bad. But then there's also, you know, like five-shot, you know, MMLU scores, which means you can do a little bit of prompt engineering and, you know, kind of help the model get to the correct score or sorry, to the correct answer. But for the most part, in MMLU, and a lot of the benchmarks, there's right and there's wrong answers.
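To make the zero-shot versus five-shot idea concrete, here is a rough Python sketch of how a multiple-choice benchmark in the style of MMLU could be prompted and scored. This is not the official MMLU harness. The `Question` class, the helper names, and the two sample questions are all invented for illustration, and the "predictions" are hard-coded rather than coming from a real model.

```python
from dataclasses import dataclass

LETTERS = "ABCD"


@dataclass
class Question:
    text: str
    choices: list  # four answer options, in A-D order
    answer: str    # the correct letter, e.g. "B"


def format_question(q, include_answer):
    """Render one question; worked examples include the answer, the target does not."""
    lines = [q.text]
    for letter, choice in zip(LETTERS, q.choices):
        lines.append(f"{letter}. {choice}")
    lines.append(f"Answer: {q.answer}" if include_answer else "Answer:")
    return "\n".join(lines)


def build_prompt(target, examples=()):
    """Zero-shot when `examples` is empty; k-shot when k solved examples are given."""
    parts = [format_question(e, include_answer=True) for e in examples]
    parts.append(format_question(target, include_answer=False))
    return "\n\n".join(parts)


def accuracy(predictions, questions):
    """Percentage of predicted letters that match the answer key."""
    correct = sum(p.strip().upper() == q.answer
                  for p, q in zip(predictions, questions))
    return 100.0 * correct / len(questions)


# Hypothetical sample data, not real MMLU items.
questions = [
    Question("What is 2 + 2?", ["3", "4", "5", "6"], "B"),
    Question("Which planet is closest to the Sun?",
             ["Mercury", "Venus", "Earth", "Mars"], "A"),
]

zero_shot = build_prompt(questions[1])
one_shot = build_prompt(questions[1], examples=[questions[0]])
print(one_shot)

# Pretend the model answered "B" to both questions.
print(accuracy(["B", "B"], questions))
```

The scoring step is the "standardized test" part: every question has one right letter, so the score is just the share of questions answered correctly. The only difference between zero-shot and few-shot is how many solved examples are pasted in front of the question.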
Starting point is 00:48:43 And there's different prompting techniques or methodologies that you use. And then it's like, hey, did you get it right or wrong? Just like a standardized test. Yogesh, Yogesh in the house. Former guest on Everyday AI. What's going on, Yogesh? So saying, will Llama 3.1 make custom GPTs obsolete on ChatGPT? It's a great question, right?
Starting point is 00:49:06 Because when we think of, I said that this is a huge play for the developer community. So there's something about ChatGPT and even creating GPTs. The user experience is so nice. It's so easy. So in its current state, will this very new powerful 3.1 make custom GPTs obsolete? I'd say no. Number one, because OpenAI still has the most powerful model in the world, even though it's not by a lot.
Starting point is 00:49:35 But number two, the experience, the user experience is much easier, right? The fact that anyone can go into ChatGPT right now, create a custom GPT with no coding skills. It's not like it's low code. You can create a literal custom version of ChatGPT, drag and drop, no code. Just say, hey, hey, GPT builder, this is what I need. Let me upload my database and then it just does it. That's amazing.
Starting point is 00:50:00 We don't have that quite yet with, you know, Llama 3.1. We do have that a little bit with Claude. We don't yet have that with Google Gemini, but we may once or if Google ever releases their quote-unquote Gems, which is their kind of counterpart to GPTs. So right now I will say no, it's not going to make custom GPTs obsolete because the user experience, right? It's not quite there. But is that going to be an area where Meta plays in? Potentially, right? That is where you start talking about agentic capabilities or agent capabilities, right?
Starting point is 00:50:36 When they're trained on very specific tasks, that's essentially what a GPT is. It's something, a custom GPT that you can train on one very specific task. And you say, hey, big, big model, just focus on this one very specific skill set. Here's my data. Let me train you and make sure that you can do this task correctly. So I don't think out of the box it's going to. All right, y'all, I hope this was helpful. Went a little longer than I had hoped. But hey, that's just Everyday AI for you, right?
Starting point is 00:51:04 We do this live. It's unedited, unscripted. I hope today's show was helpful. If so, tag someone, right? If you're here listening in the LinkedIn comments, on Twitter, YouTube, whatever, share this with someone. If you're on the podcast, thanks for tuning in, and make sure to tune in tomorrow and every day for more Everyday AI. Thanks y'all.
Starting point is 00:51:31 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going.
Starting point is 00:52:16 For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
