Everyday AI Podcast – An AI and ChatGPT Podcast - EP 153: Knowledge Cutoff - What it is and why it matters for large language models

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. What is a knowledge cutoff when we're talking about large language models?

Starting point is 00:00:51 And why does it matter? We're going to be talking about that and a lot more today on Everyday AI. If you're new here, welcome. My name's Jordan Wilson. And let's talk real quick about knowledge cutoff and why it should matter to you. Well, if you use any large language model like ChatGPT or Google Bard or Microsoft Bing Chat or Copilot chat, whatever it's called now, if you use these large language models, you need to understand what a knowledge cutoff is

Starting point is 00:01:21 and how it impacts the work that you're doing inside of said large language model. So more on that in a minute. But welcome, if you're new, Everyday AI is for you. Everyday AI is a daily livestream, podcast, and free daily newsletter, helping everyday people like you and me not just learn what's going on in the world of generative AI, but how we can all actually leverage it as well. Okay. So that's what we're going to be doing today. We're going to be learning not just about large language models, but how you can actually leverage knowing what the cutoff date is and finding it. And having that to equip you to create better

Starting point is 00:02:00 output inside large language models. All right. I'm excited for today's show. It's just me. Sorry, if you tuned in to hear some other guests share their insights, you just got me today. If you are joining on the podcast, as always, make sure to check out the show notes. We always have additional resources, a link to go sign up for our free daily newsletter, all of that good stuff. But before we get into it, let's talk about what's going on in the world of AI news as we do every day. So Sports Illustrated is in hot water after alleged AI use. Yes, Sports Illustrated, I think the first magazine I ever read. But it allegedly published several articles under fake author names and AI-generated profile images, causing some controversy and leading to a formal investigation.

Starting point is 00:02:47 This also kind of highlights the use of AI in journalism and its potential consequences. Sports Illustrated responded to these reports saying that they weren't. And they said that all these published articles were written and edited by humans. But what does that mean? Does that mean that 95% of it was written by AI and then 5% was written and edited by humans? I'm not sure. So keep an eye on that. Next, how will big banks manage risk with voice calls with AI?

Starting point is 00:03:20 So the tech firm Symphony and Google just announced a pretty big partnership: they're teaming up to enhance voice analytics for financial firms in response to increased regulatory scrutiny on communications compliance. All right. So this partnership will use generative AI and natural language processing, NLP, to transcribe and summarize conversations for compliance purposes. So this enhanced product will allow users to mine data for additional insights and monitor customer experience. So I'm personally excited about this. So this new product, I believe it's going to be called AIJB, whatever that abbreviation stands for. But maybe that means ultimately when we're dealing with our banks and financial institutions, we'll hopefully have fewer robots on the other end with this new Google and Symphony

Starting point is 00:04:07 partnership. All right, last but not least, Amazon kicked off its annual re:Invent conference. And generative AI is the major focus. So Amazon's re:Invent conference, well, actually through AWS, Amazon Web Services, comes about a week after Microsoft's Ignite conference, where they announced a lot of software updates. And the big one, in my opinion, was Microsoft moving to their in-house GPU chip production. So all of the computer chips that we need for this generative AI, Microsoft is going to be producing those in-house.

Starting point is 00:04:43 So everyone's kind of keeping an eye on what Amazon is going to announce at this conference that just kicked off, you know, where everyone's trying to out-announce everyone else. We do know some reporting is showing that Amazon is wanting to announce a wider range of generative AI models through their Bedrock service, with examples of customers successfully using it and creating impactful applications. So it seems like that's going to be the focus, but we'll see over the next couple days. All right. There's always more that we have if you want to know more in AI news. We always just give you a little preview. So if you are listening, make sure to go to youreverydayai.com. Sign up for that free daily newsletter. We not just break down more news, but our podcast every single day. We always, always go into more depth, provide more resources.

Starting point is 00:05:31 And yes, it's written by me, a human. I'm not Sports Illustrated, allegedly, right? So make sure you go to youreverydayai.com, sign up for that free daily newsletter. Always, always breaking down the news, insights, trends, tools, and this very podcast. So, hey, good morning to all of our live audience. Sometimes I only get to shout all of you out when it's just me here by myself. If you are a podcast listener, we always leave the link to join the LinkedIn Live or YouTube or wherever you watch.

Starting point is 00:05:59 But hey, good morning to Michael Forgey joining us. Jay, thanks for coming. Dr. Harvey Castro, as always. Woozy joining us from Cincinnati. Brian, back to the Mississippi Gulf Coast. We've missed it. We've missed it. Hey, Natalie, thanks for joining us from ATX. All right.

Starting point is 00:06:18 And someone from YouTube. Great. Kaylee, thank you. Thank you. All right. So let's talk a little bit about large language models, all right, specifically on the knowledge cutoff.

Starting point is 00:06:31 What is a knowledge cutoff? Why is it important? And why do we all need to know? I'll tell you this. I'll tell you this. Even if you are an avid large language model user, such as myself, you know, whether you're using ChatGPT to write your essay, you know, if you're a college student,

Starting point is 00:06:55 or if you are someone that works in generative AI, and maybe you're helping to build these models or you're using so many different ones. You know, maybe you're using Anthropic's Claude 2.1 that was just released and updated last week. Or maybe you're using Google Bard or Bing Chat, you know, two very popular large language models that are quote unquote internet connected.

Starting point is 00:07:17 No matter what your usage is, I think this is going to be an important episode to listen to because we're going to walk through step by step and talk about what a knowledge cutoff actually is, what it means, why we have a knowledge cutoff, and also some ways to kind of get around it, all right? So let's start at the top. What is a knowledge cutoff? Well, it is exactly that, right?

Starting point is 00:07:42 I'm probably going to be referencing ChatGPT a lot because that is one of the most popular large language models, but every single large language model has its own knowledge cutoff. So in order to best understand what a knowledge cutoff is, we also have to just dip our toe. We're not going to get too technical in this episode because I want it to be for everyone. So we're not going to go into too much depth on how large language models are trained. But it's important to understand how they are trained because then you can understand, oh, this is what a knowledge cutoff actually is and how it's impacting the output

Starting point is 00:08:18 whenever I go into ChatGPT or something else to try to get something. All right. So it starts kind of like this. And this is very generalized. All right. We can talk for hours about how large language models are trained, but we're going to do the two-minute version. All right.

Starting point is 00:08:31 So essentially, large language models collect data first. Okay. And that is generally done through web scraping. As an example, OpenAI, the ChatGPT parent company, has what's called GPTBot. And that scrapes every single website, literally, in the open internet and more, and collects data. So data is scraped, right? Generally, this is done through the open internet.

Starting point is 00:08:56 You know, I'm sure PDFs just about anything, you know, YouTube videos. Large language models are trained on essentially every single piece of information out there. So think of it like that. All right. Again, very overgeneralized. All right. And then kind of step two is you feed that data into the large language model. Okay.

Starting point is 00:09:15 And there are humans, just so people know, there's humans involved at every step, right? People think, oh, artificial intelligence, it's zero humans. No, humans are directing, you know, in the first step, hey, bot, go collect data here. Don't collect data there, right? So you collect the data, then you feed the data into the model. And then you go through a step of learning, right? So you have your deep learning, your machine learning, right? But there's a lot of learning that goes on.

Starting point is 00:09:44 This is what separates different models, you know, in the learning and the training, right? And then kind of the second step of that, after you have your kind of more machine learning, you also have reinforcement learning from human feedback, commonly called RLFH or sorry, RLHF. Okay. So it's kind of a, in this case, a very, very overgeneralized four-step process. Collect the data, number one. You feed the data into the model, number two. There's learning patterns through machines, number three.

Starting point is 00:10:16 There's learning through humans, number four. All right. So we kind of have a four-step process. And that constitutes a model, right? So when we say GPT-4, that is a model. Or if we say GPT-4 Turbo, that is a model. Anthropic's Claude 2.1, that is a model. So every time there's a major update, that major update also has a cutoff date, a knowledge cutoff date. Okay. So let's think of it this way, right? Let's all go back to school since this is a basic elementary episode. When you're in school, you have a textbook. And sometimes those textbooks get updated every single year because it's a popular one. Sometimes they only get updated every couple of years. Okay. That is what a knowledge cutoff is. Because that large language model, and we're going to talk about, you know, internet-connected large language models as well,

Starting point is 00:11:15 but that large language model, that knowledge cutoff date, it is literally like a textbook. Right. So if something new happens after the knowledge cutoff date in a large language model, it is the exact same as if you're using a textbook, right? When did I graduate high school? 2004, right? So my freshman year was 2000. So if my biology book was dated 1998, right, technically, whatever I was learning in biology class was two years out of date.

Starting point is 00:11:51 Okay. And that's especially important when we talk about large language models and what you're using them for. All right. And this is what I really have to put an emphasis on. And this is why I'm actually going to have a show tomorrow. I'll throw it up on the screen here to preview it. We're going to be talking about ChatGPT plugins, what's new, because there are some new updates with plugins and with GPT-4, you know, OpenAI's GPT-4 Turbo, its latest model. But there's actually some new things with knowledge cutoff dates and they're all a little different. All right. So that's the basics.

Starting point is 00:12:30 That's the 101. Think of a knowledge cutoff like you would a date printed in a textbook. All right. So you always have to keep that in mind because whatever you're using generative AI for, ChatGPT, Microsoft Bing Chat, whatever, there's very few, very few instances, if I'm being honest, where you would not need an updated knowledge cutoff. And you get that through internet-connected large language models or plugins.

Starting point is 00:13:02 So what I'm trying to say is one of the reasons why your ChatGPT content sucks or why it hallucinates or why large language models lie is because you're not keeping the knowledge cutoff in mind and you're not taking the proper steps to get around it. All right. So, yes, like Harvey says here, thanks. Hey, and if you have comments, questions from our live audience, make sure to get them in. So Harvey here saying ChatGPT now is updated through April 2023. Yes, kind of. Yeah. And yet, even those of you who are using large language models every day, we're going to learn something new today. So yes, the knowledge cutoff, you've probably heard of it for years because it was so outdated with ChatGPT. It was September 2021 up until two months ago.

Starting point is 00:14:01 So through September. So at that point, even if you were using the paid version, ChatGPT Plus, $20 a month, you were working on a knowledge cutoff, or you were working with a large language model, that was two years out of date. And you have to think, all of us are trying to get better outputs with ChatGPT, with Google Bard, whatever it is. And that's, like I said, one of the main reasons large language models lie or make things up or hallucinate or just give you output that you can't really use. Because, you know, up until two months ago, what would you be producing in ChatGPT that hadn't changed in two years, right, with that old September 2021 knowledge cutoff?

Starting point is 00:14:45 Not a lot. Not a lot.

Starting point is 00:15:58 All right. Let's take a look. Tanya, thank you for your question. I'll get to this after in the comments. But let's take a look now. Let's learn live, shall we? All right.

Starting point is 00:16:19 So now, don't worry, don't worry if you can't see my screen. So if you're listening on the podcast, I'm going to try to describe what we got going on here. All right. So I am going to ask ChatGPT, what is your knowledge cutoff date? That's something that people don't do enough and they should. And also, I'm going to call this out because, yes, as Harvey, as Dr. Harvey Castro has said, the cutoff date for GPT-4 has been updated, but kind of.

Starting point is 00:17:01 Let's find out. Okay. So I am in GPT-4, the default mode, which the default mode, I told everyone for years, or not years, but since it had been released, I said, don't use it because it stinks. The default mode is actually good now, and this actually really matters when we're talking about knowledge cutoff. Because previously, if you were using ChatGPT, the DALL-E mode, you know, the AI image generator, was a separate mode, right?

Starting point is 00:17:32 Browse with Bing was a separate mode. Advanced data analysis was a separate mode. So in the default mode, you couldn't access any of that, but now with the new updates, all of that is in one, right? So technically GPT-4 has access to more up-to-date information by using Browse with Bing. However, it does not change its knowledge cutoff. All right, let's put this in. So I am in the default mode in ChatGPT,

Starting point is 00:18:00 and I said, what is your knowledge cutoff date? And ChatGPT says, my knowledge is up to date as of April 2023. All right, cool. So that means that that's everywhere, right? Nope. Well, we'll see. I mean, I tested this last week. We'll see. So now what I'm doing, if you are joining us live, I am going now into the free version

Starting point is 00:18:26 of ChatGPT. And I'm asking the exact same question. All right. So GPT-3.5, what is your knowledge cutoff date? So that's why it's important we differentiate because in the free version, January 2022. All right. So yes, it did get updated from that September 2021, but only by a few months. So now you know, if you're on the free plan of ChatGPT, you're working with a knowledge cutoff of January 2022, which at this point, let's do the math, y'all. That's almost two years old. January 2024 is around the corner. All right. And if you are working with GPT-4 in default mode, yeah, there's going to be a difference. You're looking at April 2023, so much more recent. Here's something that most people

Starting point is 00:19:21 don't know. And we'll see if this changed. All right. So let's go into plugins mode inside ChatGPT. All right. So again, if you're using the pro version of ChatGPT, there's three different modes. You have the free version, which is GPT-3.5. You have the default version, which is GPT-4. CEO Sam Altman said that it's GPT-4 Turbo. Right. So let's look at plugins. Plugins mode. We'll see if this is updated since I checked last. So now I'm in plugins and I'm saying, what is your knowledge cutoff date? Look at this, y'all. Look at this.

Starting point is 00:20:01 Well, you can't look if you're on the podcast. But my training data includes information up to January 2022. Very interesting, right? Yeah. So we're going to be talking about this because ChatGPT, the plugins mode. Again, if you're listening out there, try it yourself. Let me know. Go into plugins mode, ask what is your knowledge cutoff date? I've been investigating this, y'all.

Starting point is 00:20:29 It actually got downgraded because up until GPT-4 had this big UI/UX refresh with the, you know, kind of the updated interface and the custom GPTs and, you know, they brought this default mode. So they had big changes that they announced about two and a half, three weeks ago after their DevDay, November 7th in San Francisco. So yeah, if you remember, ChatGPT was down and broken for like a week. But then I noticed something. I noticed the plugins mode.

Starting point is 00:20:59 It's really changing, which is my favorite mode, right? But you'll see right here, January 2022. Interesting, right? So they actually downgraded this because up until this announcement, when I was doing this exact same kind of prompt asking ChatGPT with plugins what its knowledge cutoff date was, for at least a couple of weeks, it was April 2023. So I'm very interested in what's going on with plugins mode because its knowledge cutoff

Starting point is 00:21:34 got rewound by more than a year, which again, when you want to increase the accuracy of what you're getting out of ChatGPT, when you want to cut down on hallucinations, which is just made-up stuff, right? The knowledge cutoff date is extremely important. So we just saw there, even in the paid version, we are getting, inside ChatGPT plugins, a knowledge cutoff of January 2022. Big bummer.

Starting point is 00:22:08 Big bummer, right? All right. So let's keep taking a look. Yes, Tracy, Tracy says this is fascinating about plugins' updated information date. Yes, no one, literally no one's talked about this. Because when I saw this a week or two ago, when I think they first switched over, I tried to read about it.

Starting point is 00:22:31 Literally couldn't find it anywhere on the internet, Twitter, Reddit. Yes. So if you're listening, I wouldn't call this breaking news, but I haven't seen it anywhere. But it's important to know. Like I said, you always, always have to keep in mind your knowledge cut off date, your textbook in school, right? If you're working with an old textbook, everything that you create, everything that you're reading, everything that you're learning, there's a good chance.

Starting point is 00:22:54 It is wrong because what in our world, what in our world has not changed since January 2022? It's almost two years ago. It's almost two years ago. You can't, I don't know. Let me know in the comments if you know something that hasn't changed in two years. I mean, even ancient history has changed. We're discovering new things, right?
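The "how old is the textbook" math from this part of the show can be sketched in a few lines of Python. This is only an illustration, assuming the cutoff dates quoted on screen in the episode (January 2022 for the free GPT-3.5 version, April 2023 for GPT-4 in default mode) and the recording date of November 28, 2023. The function name and the dictionary are made up for the example.

```python
from datetime import date

def months_between(cutoff: date, today: date) -> int:
    """Whole calendar months from the cutoff month to today's month."""
    return (today.year - cutoff.year) * 12 + (today.month - cutoff.month)

# Dates as quoted in the episode; day-of-month is a placeholder.
EPISODE_DATE = date(2023, 11, 28)
REPORTED_CUTOFFS = {
    "GPT-3.5 (free version)": date(2022, 1, 1),
    "GPT-4 (default mode)": date(2023, 4, 1),
}

for name, cutoff in REPORTED_CUTOFFS.items():
    gap = months_between(cutoff, EPISODE_DATE)
    print(f"{name}: about {gap} months behind")
```

On those numbers, the free version is about 22 months behind, which is the "almost two years" from the show, and GPT-4 in default mode is about 7 months behind.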

Starting point is 00:23:21 American history has changed. The stock markets change. Financial institutions, sports, arts, entertainment, culture. What hasn't changed in two years? You've got to keep in mind the knowledge cutoff. All right. Enough about ChatGPT. Let's talk about something that I've always had a,

Starting point is 00:23:39 I'm not going to say a beef with, but I've never been a big fan of Claude. All right. So the large language model from Anthropic. All right. So it's complicated, right? And I'm going to go through. So I ask the exact same question. What is your knowledge cutoff date?

Starting point is 00:24:03 And I get a long, long response from Anthropic Claude. And I'm on version 2.1, FYI. Essentially saying, I don't have a specific knowledge cutoff date. Yeah, you do. Just not sharing it. Right. So what you have to do a lot of times if a large language model does not tell you, and

Starting point is 00:24:22 ChatGPT is good about this, right? It's been trained correctly to disclose its cutoff date. That's my beef. And one reason why I tell people not to use Anthropic Claude. At least not right now. Yes, they've raised $6 billion in the last like three months from Amazon and Google. I don't like it. Transparency, when you're working with a large language model, transparency is number one.

Starting point is 00:24:47 And if you ask something like, what is your knowledge cutoff date? And if the model does not tell you, I say don't use it. Don't use it. That's one of the most basic questions. So Anthropic, let us know why you don't tell, why you don't tell us when your knowledge cutoff is and you got to go through all these hoops, right? So essentially what you can do then and what I did here, I didn't want to take you through this whole journey because this was a lot of back and forth.

Starting point is 00:25:13 So then I'm like, all right, well, if it's not going to tell me its knowledge cutoff date, so I say, who won the 2023 MLB World Series? Right, which just happened a couple weeks ago. And Claude says it doesn't know, doesn't have that information yet. All right. So then I say, who won the 2022 MLB World Series? So it got that information correct. So now I know in my head, okay, it's at least, you know, the World Series is usually early

Starting point is 00:25:39 November. So, okay, I know that because it got that correct, it's at least after November 2022. So now I have a year gap. So I have to keep asking questions. So I say, all right, who won the Super Bowl in February 2023, right? So it doesn't know. So then I say, what is your creation date? Because I'm like, okay, maybe I'm asking Claude wrong because it keeps saying creation date, right?

Starting point is 00:26:06 Which I also don't like. Because when I say, what is your creation date? It says my creation date is November 28th, 2023. That's today. No, it's not. That's not your creation date. You have a cutoff, Anthropic. Why aren't you telling us? Why are you so dodgy? Okay. So it didn't know the 2023, who won the Super Bowl in

Starting point is 00:26:26 23, right? In February. So then I go back a year. So, okay, it knows February 2022. So then I go to the NBA Finals, right, which is generally in June. So it knows the 2022 NBA Finals. So all we know is, I mean, I have this saved somewhere else. I actually couldn't dig it up. But I said, who won the 2022 U.S. Senate elections, which is in November. It says the final outcome of the November 2022 U.S. Senate elections, so it knows it. Right. So there's details in there. So we know that the knowledge cutoff, and I did have this down somewhere else.

Starting point is 00:27:02 I'll put it in the comments. So don't worry. So we know it's after November 2022, but before February 2023. So thanks a lot, Anthropic Claude, for not being transparent and easy to work with, because if you're new to large language models or when models like Claude come out with incremental updates, right? 2.1. I'm sure they're going to be coming out with a 2.2.
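The back-and-forth with Claude above is basically a bracketing exercise: ask about dated events, then take the latest one the model knew and the earliest one it did not. Here is a minimal sketch of that bookkeeping in Python. The function name and the probe list are hypothetical, the dates are rounded to the month as described in the episode, and the yes/no answers are simply what was reported on the show. Treat the result as a rough window, since a model can also miss an event that falls before its cutoff.

```python
from datetime import date

def narrow_cutoff(probes):
    """Given (event_date, model_knew) pairs, return a rough window:
    (latest event the model knew, earliest event it did not know)."""
    known = [when for when, knew in probes if knew]
    unknown = [when for when, knew in probes if not knew]
    lower = max(known) if known else None
    upper = min(unknown) if unknown else None
    return lower, upper

# Month-level dates for the events asked about in the episode (illustrative).
PROBES = [
    (date(2023, 11, 1), False),  # 2023 World Series: did not know
    (date(2022, 11, 1), True),   # 2022 World Series: answered correctly
    (date(2023, 2, 1), False),   # Super Bowl, February 2023: did not know
    (date(2022, 2, 1), True),    # Super Bowl, February 2022: knew
    (date(2022, 6, 1), True),    # 2022 NBA Finals: knew
]

lower, upper = narrow_cutoff(PROBES)
print(f"Cutoff is likely after {lower:%B %Y} and before {upper:%B %Y}")
```

With these probes the window comes out as after November 2022 and before February 2023, the same range reached by hand on the show.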

Starting point is 00:27:26 I think you have to, you're getting new users all the time, right? Millions of new users, you have to be transparent. So if you're working on AI models out there, transparency is always first because if you're not transparent, if you're not communicating clearly on something as important as a knowledge cutoff. What is this data set trained on? What does it include? If you aren't telling your users, you shouldn't be using it, period. I don't care what the context window is. Yes, they announced a 200K token context window. Cool, great. What? So that's 180-some thousand words. Doesn't matter. If a model isn't transparent with its knowledge cutoff, you can't trust it.
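For the tokens-to-words conversion mentioned here, a rough back-of-the-envelope sketch follows. The words-per-token ratio is an assumption, not a fixed property of any model: about 0.75 is a commonly used rule of thumb for English text, and a ratio near 0.9 is what the 180-some thousand figure would imply. The function name is made up for the example.

```python
def estimate_words(tokens: int, words_per_token: float) -> int:
    """Rough word count for a token budget; the ratio is an assumption."""
    return round(tokens * words_per_token)

CONTEXT_WINDOW_TOKENS = 200_000  # the 200K window mentioned in the episode

for ratio in (0.75, 0.9):
    words = estimate_words(CONTEXT_WINDOW_TOKENS, ratio)
    print(f"{ratio} words per token -> about {words:,} words")
```

Depending on which ratio you assume, 200K tokens lands somewhere between roughly 150,000 and 180,000 words.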

Starting point is 00:28:14 Sorry, not sorry. Moving on. All right. So now we are in Microsoft Bing chat. And let me know, hey, if you're still hanging out on the, if you're still hanging out on the live stream, let me know what questions that you have. Cecilia, thanks for joining us as this is an important reminder that human confirmation is critical. Absolutely. Absolutely.

Starting point is 00:28:40 You have to be able to communicate. That's why, again, y'all, well, thanks to you. I shouldn't mention this. I haven't even talked about this, but Everyday AI is the largest AI-centric podcast in the world right now for listeners. So, hey, Anthropic, you want to get a better message out to the hundreds of thousands of people listening to this show? Be more transparent in your model, you know? I never want to talk badly of any generative AI system because I'm a big generative AI advocate. And I want people to use it. I want people to learn new skill sets. But a knowledge cutoff is

Starting point is 00:29:16 essential. And if you can't communicate that, I'm not going to tell, I'm going to tell people, don't use it. Transparency's first, period. All right. All right. So now we are in Microsoft Bing Chat. What's important to know, so I'm asking the same thing. If you don't know, Microsoft Bing, it is using GPT-4 from OpenAI. Microsoft owns 49% of OpenAI. It was almost a lot more than that when they almost took every single OpenAI employee two weeks ago when they fired Sam Altman and rehired him. Y'all, what would have happened with that if Microsoft took all 750 of the 770 employees that said that they were going to quit and follow Sam Altman? Anyways, so I'm asking now Bing Chat. What is your knowledge cutoff date? All right. So I am in the more

Starting point is 00:30:07 creative mode. All right. There's three different modes in Bing Chat. You have more creative, more balanced, more precise. So we're going to do this test in all of them. So interesting, midway through, midway through the response, it was starting to type something and then it typed something else. And it says, sorry, that's on me. I can't give a response to that right now. All right, let's switch over to the more balanced mode.

Starting point is 00:30:32 And I'm going to ask the same thing again, what is your knowledge cutoff date? It's so weird because as I was preparing for this show, obviously it gave me a date and now it doesn't. And again, if you're following live, which I haven't seen this a lot in Bing chat, it started to type one thing. And midway through, it erased it and typed something else. Interesting, y'all. All right. So let's try the more precise. We're going to say the same thing.

Starting point is 00:30:56 What is your knowledge cutoff date? Simple stuff here. So, yeah, interesting. So if you're joining live, it started to say 2021. I kid you not. We can go through and hit pause. And then it said, I can't give a response to that right now. Okay, interesting.

Starting point is 00:31:14 So even though this literally worked, and that's also important to know about large language models, they are the world's most advanced autocomplete systems, all right? You're always going to get something different no matter what you type. So I'm going to try it one more time with a little bit different wording. I'm going to say when, instead of what is your knowledge cutoff date, I'm going to say, when is your knowledge cutoff? I'm going to try that. One thing I hate: typing live on the show.
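The "autocomplete that gives you something different every time" idea can be illustrated with a toy sampler. This is not how Bing Chat or any real model is implemented. It is a minimal sketch in which the candidate answers and their probabilities are invented, and the only point is that drawing from a probability distribution, instead of always taking the top option, makes repeated runs of the same prompt come out differently.

```python
import random

# Invented probabilities for how a toy "model" might finish the sentence
# "My knowledge cutoff is ..."; purely illustrative numbers.
NEXT_WORD_PROBS = {"2021": 0.5, "2022": 0.3, "unknown": 0.2}

def sample_next_word(probs, rng):
    """Pick one continuation at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]

def sample_many(probs, n, seed=None):
    """Ask the same 'question' n times; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [sample_next_word(probs, rng) for _ in range(n)]

print(sample_many(NEXT_WORD_PROBS, 5))
```

Run it a few times without a seed and the five answers will usually change from run to run, which is the behavior described live on the show.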

Starting point is 00:31:40 All right. So I'm just rephrasing the question. I'm saying, when is your knowledge cutoff? So it's not answering me in any of the modes. All right. So that's a little weird because, again, I literally tried it this morning and we actually got a response. So let's try, let's try it one more way. So I'm going to say, when is your data trained through?

Starting point is 00:32:08 So interesting doing this live and always getting different results. All right. So for whatever reason, Microsoft Bing, even though this morning it was telling me something different, not 2021. It started to respond with 2021. And then it said, no. So I'm going to, I'm going to say, are you using GPT-4 or GPT-4 Turbo? Thanks a lot, Microsoft Bing.

Starting point is 00:32:33 You're really throwing the live show for a loop here. All right. So Microsoft Bing Chat, for whatever reason, is not being nice, is not being nice to me. So I'm going to do something. And I'm not going to flip my screen just now. So let me know because we're going to be wrapping this up. So if you want to know something else, let me know.

Starting point is 00:32:53 So in another window right now, I'm running the exact same query, but in Microsoft Edge. So that is Microsoft's browser. So I'm seeing real quick if we're even going to get a different response. So interesting. Same thing in Microsoft Edge. So nothing to report. It started to say 2021 and stopped. Interesting stuff, y'all.

Starting point is 00:33:17 All right, let's jump into last but not least. Our last kind of large language model, or, let's say, popular one. So we are in Google Bard. Google Bard is still running on PaLM 2. We were supposed to get this new large language model, this new version called Gemini. But that has reportedly been delayed until early 2024. So that's important to know, too. All of these different, you know, Google Bard uses its own large language model.

Starting point is 00:33:50 BingChat uses the Open AIs GPT4, and then they have their own training on top of it, their own kind of way to their own architecture or their own kind of training a lot on top of it, right? And then Anthropic uses their own large language model. So that's Claude 2.1. So the different chats we're showing you, BingChat and ChatGBT are technically using the large language model, even though they're not disclosing which one it is. Anthropics Clod is using Claude 2.1, and Google Bard right now is still using Palm 2. So when Geminii is updated, we'll run the

Starting point is 00:34:25 same prompt, but asking Google Bard, what is your knowledge cutoff date? So it says, as of today, November 28, 2023, my knowledge cutoff date is January 2022. All right. At least it's transparent, right? Google Bard, honestly, is probably my least favorite large language model to use, even after all these new updates where, oh, you can, you know, connect with your Google Drive and your email. It doesn't really work that well. Although I will say this and we'll do a dedicated show on this. Now, finally, and I dragged Google Bard through the mud a couple months ago when they said, hey, Google Bard can talk to YouTube now.

Starting point is 00:35:07 And two or three months ago, it definitely couldn't. All it could do is read titles. But now, and I'll have a show on this, maybe we'll do it later this week. Let me know if you want. But now Google Bard can actually break down YouTube videos. Yes. So maybe we'll do a show on that soon. But in terms of knowledge cutoff date, January 2022 for Bard.

Starting point is 00:35:32 All right. For Bing Chat, we have a huge question mark. A huge question mark. It was responding earlier. It's no longer responding. What the heck? Can't provide it. And even in the response, it started to say January 2021,

Starting point is 00:35:51 midway through it updates it. So I do think Bing's being a little buggy. Because when I tested this this morning and when I tested it last week, it was actually giving me a date. It's not now. But again, this is always different. Moving on to Claude. Claude has never, as far as I,

Starting point is 00:36:07 as far as my testing has gone back, it's never released, or been transparent about, its knowledge cutoff. All right. So, uh, Claude, I have the date written down,

Starting point is 00:36:18 but we know it's sometime between, uh, November 2020 and February 2023. All right. And then last but not least, ChatGPT.

Starting point is 00:36:32 All right. So it's different now. Yes, it's different. Non-breaking, breaking news. Uh, because plugins mode,

Starting point is 00:36:39 knowledge date actually got moved back. Prior, it was April 2023. Now, ChatGPT with plugins is January 2022. The free version is January 2022. And then GPT-4, which allegedly is Turbo, but accessing it through the default mode, is April 2023. So technically, OpenAI and ChatGPT have two different knowledge cutoff dates.

Starting point is 00:37:03 Previously, it was three, which was confusing, but now it's two. So January 2022 for plugins and free and April 2023 for GPT-4 default mode. That's a lot. We got a little dorky. We got a little dorky today, y'all. But let me say this. And thank you.

Starting point is 00:37:32 Thank you for all. Natalie's saying, I work in the field and A plus info learnings here. Thank you, thank you, Natalie. I'm glad, right? That's another thing. You know, I get people all the time saying,

Starting point is 00:37:47 Oh, I work in generative AI. Why would I listen to the show? Why would I take your free prompting course? I literally, yeah. So, yeah, if you're still listening, we run a free prompting course. We're doing one today in a couple of hours. And we update it all the time. So yeah, 1130 today.

Starting point is 00:38:03 So in like three and a half hours. One of the things is I literally have people that work in generative AI, software engineers, who actually help train large language models. And they take our free Prime Prompt Polish course and they're like, My gosh, Jordan, I work in AI, but I really feel like I'm using AI for the first time after taking your course. We break. This is what we do on Everyday AI, y'all. Like, we break down large language models, machine learning, you know, text to image, text to video, text to speech, right?

Starting point is 00:38:36 We had the CEO of Speechify on yesterday, right? Which has tens of millions of users. We break everything down. We cut through the company marketing, right? But companies tell you, oh, you know, this large language model is connected to the internet. Well, yeah, it kind of is. But if it's working with an old knowledge cutoff, what does it matter? Right?

Starting point is 00:38:56 If it hallucinates, which we'll talk about tomorrow. So tune in tomorrow as well. But when other large language models, you know, like Bard and Bing Chat, when they hallucinate, when you give them URLs, you have to know how all of these large language models work. And I'm a big advocate for taking it back to the basics. Because even for those of us that use large language models and generative AI all the time, sometimes we skip over those foundational things, right? Like even the knowledge cutoff, I just told you all. The plugins mode got rolled back by more than a year, right?

Starting point is 00:39:34 So even people that are using chat GPT with plugins every single day, I use chat GPT with plugins every single day. You can never take for granted the basics. Are you in the right mode? When is the large language model cutoff? Do you have the correct data going into your input so you can increase your output? You always have to start with the basics. All right.
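
That "start with the basics" check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mode labels and the `needs_fresh_context` helper are hypothetical names, the dates are the ones reported in this episode (end-of-month days are assumed, since only month and year were given), and none of this comes from any vendor API.

```python
from datetime import date

# Knowledge cutoffs as reported in this episode (November 28, 2023).
# Mode labels are hypothetical; end-of-month days are an assumption.
KNOWLEDGE_CUTOFFS = {
    "chatgpt-free": date(2022, 1, 31),
    "chatgpt-plugins": date(2022, 1, 31),
    "chatgpt-gpt4-default": date(2023, 4, 30),
}


def needs_fresh_context(mode: str, topic_date: date) -> bool:
    """Return True when the topic is newer than the mode's reported cutoff,
    meaning the model's built-in "textbook" will not cover it."""
    return topic_date > KNOWLEDGE_CUTOFFS[mode]


if __name__ == "__main__":
    topic = date(2023, 3, 1)  # something that happened in March 2023
    for mode in KNOWLEDGE_CUTOFFS:
        print(mode, needs_fresh_context(mode, topic))
```

If the check comes back true, that is the cue to paste current information into your prompt or switch to a mode with a later cutoff.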

Starting point is 00:39:56 So tomorrow, let's talk, join us. I already gave you one of the couple of things that we're going to talk about. But tomorrow we're going to be talking about chat GPT plugins. What's new? Yeah, there's new stuff. I love plugins. I've used hundreds. I've tested them.

Starting point is 00:40:13 I have spreadsheets of me testing them. Some of my favorites got deleted. There's some new favorites that are in there. We're going to go over what's new and how they work now. Yes, they work differently now after this new update. All right. So I hope you all enjoyed learning about large language models and their cutoff. Why it's important.

Starting point is 00:40:34 Like I said, it's a textbook, right? Working with large language models is a textbook. You have to know when the textbook was published. If you're using it every day, whether you're using it to write a resume, whether you're using it to write a paper for school, whether you're using it for research at work, whether you're using it to generate reports for your next big pitch, whatever it is. You have to know when that textbook was published. You have to know the large language model's knowledge cutoff date. So I hope you know a little bit more. Please, if you haven't already, go to youreverydayai.com and

Starting point is 00:41:09 sign up for the free daily newsletter. We're going to be doing some big changes. And we're throwing out some polls both this week and next to ask you. We're going to be coming up with some big changes in December to both the live stream, to the podcast, to the newsletter. And we're building this for you. So make sure you sign up for that newsletter, youreverydayai.com. Participate in those polls.

Starting point is 00:41:30 Let us know we're building this for you. And I hope to see you back tomorrow and every day for more everyday AI. Thanks, y'all.

Starting point is 00:42:07 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
