Everyday AI Podcast – An AI and ChatGPT Podcast - EP 336: A Complete Guide to Tokens Inside of ChatGPT

Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. This small little detail about large language models is something that most people don't understand.

Starting point is 00:00:56 And it's probably impacting your work. So we're going to be talking today about a complete guide to tokens inside of ChatGPT, well, and other models. And helping you all understand what tokenization is and why it's important. And this might explain why maybe you're working with a large language model and things are going great. And then all of a sudden, they go off the rails. It's not because the model stinks. It's because you have to understand how they work. And that's what we're all about here at Everyday AI. What's going on, y'all? My name's Jordan, and I'm the host of Everyday AI. And this thing is for you. It is your daily live stream podcast and free daily newsletter, helping us all learn and leverage generative AI

Starting point is 00:01:43 to grow our companies and our careers. So if that sounds like you, thank you so much for joining. If you're on the podcast, make sure to check out your show notes. You can always come back and watch the video. This one might be one of those where visuals are going to help a little bit.

Starting point is 00:01:58 I'm going to try to explain everything that we have on screen here for our live stream audience. So make sure you check out your show notes. And if you are joining us live, thank you so much. But all of you, if you haven't already, why haven't you gone to youreverydayai.com yet? Sign up for our free daily newsletter as well. It is like the world's largest, unbiased free library of generative AI education, thousands of hours

Starting point is 00:02:26 worth of content on there for free from myself and leading experts in AI. All right. So let's get into it and make sure to check out the thanks a million giveaway while you're there. So let's go over what's going on in the world of AI news before we get into token talk. All right. AI news, a lot going on today. A lot of the big names, new models, new features, let's get into it. So first, Google Gemini has launched Gemini Live, its new AI assistant. So Google has launched Gemini Live, a new live AI chatbot feature, promising more natural and emotionally expressive conversations on smartphones. So this was announced earlier at a Google event, but this new hands-free AI assistant allows users to talk in real time,

Starting point is 00:03:12 interrupt an AI assistant and ask questions, with the chatbot adapting in real time. So although it currently lacks the multimodal input feature, reports suggest it will soon be added along with broader language support and eventually iOS availability. So yes, this is similar to this more neural voice feature we've been waiting for from ChatGPT and OpenAI as they go through some safety precautions, but Google has beat them to the punch with Gemini Live. All right, speaking of beating them to the punch, Google has also beat Apple to the punch.

Starting point is 00:03:49 So Google has unveiled its new AI-powered Pixel smartphone lineup. So Google has made a significant move to challenge Apple and Samsung by launching a new series of AI-integrated devices with edge AI, or on-device large language models. So the array includes the Pixel 9 series, a foldable phone, smart watches, and earbuds. So Google has introduced this new Pixel 9 series featuring their new advanced Tensor G4 chip, which allows for this on-device edge AI. So the highlight of the launch is the Pixel 9 Pro Fold, which has one of the largest flexible screens on the market.

Starting point is 00:04:25 So the new devices also include two smart watches, the Pixel Watch 3 and the Pixel Buds Pro 3, both equipped with cutting edge AI technology. But the key feature like we talked about is Gemini Live, which will be available on that phone. All right. Twitter. Yeah, Twitter or xAI, whatever you want to call it. Yeah, maybe I should just start calling it X. It just seems weird. But Elon Musk's xAI has launched Grok-2 and Grok-2 mini. So Elon Musk's AI company xAI has launched Grok-2 and Grok-2 mini in beta, available exclusively to premium and premium-plus users on the X social network. So Grok-2 features advanced capabilities in chat, coding and reasoning, while Grok-2 mini is smaller but capable and great for developers.

Starting point is 00:05:12 Early feedback indicates Grok-2 can generate images without restrictions, including those of political figures, raising some concerns about potential misuse. So the company aims to integrate Grok-2 into X's features, such as improved search, post analytics, and AI-powered replies. Great, that can't go wrong. Also with the U.S. presidential election approaching, xAI is definitely going to face a lot of pressure to implement restrictions, especially on image generation to prevent misinformation.

Starting point is 00:05:43 All right, last but not least, OpenAI's GPT-4o has reclaimed the top spot in the chatbot arena after a somewhat quiet model update. So earlier this week, OpenAI quietly updated its GPT-4o model and didn't even really give us a name until 24 hours after the update. So the latest model, it's technically ChatGPT-4o, either the 08 version, which if you're a developer, you might see that, or the GPT-4o-latest. Yeah, great, great naming there. Anyways, this new updated version of GPT-4o has reclaimed the number one position in the chatbot

Starting point is 00:06:25 arena, surpassing Google's Gemini 1.5 Pro with a score of 1314. So, Gemini, just so you know, Gemini 1.5 Pro is available for developers only, but this newest version of ChatGPT, GPT-4o latest, is available. That is the default mode now when you go to ChatGPT. So the achievement comes after a week of being tested on the chatbot arena with over 11,000 community votes under the anonymous-chatbot label. All right. So the new version of ChatGPT shows significant improvements in technical areas, particularly in coding with a 30 point increase over the previous models. Also, the model excels in instruction following and handling hard prompts, showcasing its versatility.

Starting point is 00:07:12 Wow, that's a lot going on from three of the biggest companies out there. Let me know live stream audience, you know, Jason and Denny and Kathleen, Chris, Woozy, Rolando, everyone else. Do you want to hear more on any of these stories? Should we be doing some reviews or tutorials today? Let me know. All right, let's get into it, though. Let's talk about tokens, all right? I'm actually excited for today's show.

Starting point is 00:07:38 I'm going to try to keep this one pretty short and factual. But I also want to know from both our live stream audience and, you know, those on the podcast. I always put my, you know, LinkedIn information in there. You can connect with me, email. But let me know. Do you know anything about tokens? Do you have questions? Do you know what they are?

Starting point is 00:07:58 Why they're important? At least for our live stream audience, let me know. I'm going to make sure to answer your questions on tokens, but we're going to do a complete look. And actually, y'all, if I'm being honest, not understanding how tokens work is one of the biggest reasons, probably, why your chats go awry. So if you've ever had an instance where you are working inside of a chatbot like ChatGPT,

Starting point is 00:08:26 and things start out great, right? And you're like, oh, this is great. You know, I'm getting fantastic, you know, fantastic results. And then all of a sudden, like all of a sudden, it starts to just get dumb. And then you think, oh, it's just the model. Well, not really. What's happened is it's going past its context window. And that context window is measured in tokens.

Starting point is 00:08:50 Okay. And all of the major kind of chatbots have, you know, have a different process of tokenization. But they all have kind of a memory limit or a context window of a different number of tokens. All right. So we're going to be going over the basics, and we're also going to try to do a couple things live here. All right. So if you want to see how all this works, you know, this is one of those.

Starting point is 00:09:14 If you are on the podcast, it might be, it might be worth, you know, watching this one. But let's get into it and let's go over what you need to know. So here's what we're going to go over. We're going to talk about what a token actually is, why large language models even use tokens, the tokenization process, and then we're going to talk about the context window, hopefully with some live demonstrations here. All right. So let's dive into it.

Starting point is 00:09:47 So what the heck is a token? All right. A token, think of it as this. It is a subset of a word. All right. So essentially what large language models do is they break down words into smaller subwords or sections, or tokens. So we're going to be going over examples,

Starting point is 00:10:09 but let's say, I don't know, let's pick a trending word if you're an AI dork like me. Strawberry, right? Technically, large language models, if you, you know, ask a question about a strawberry, large language models don't see words. They technically don't even understand words, which I know might sound interesting,

Starting point is 00:10:30 but essentially there is a process that goes on, right? Because a lot of people might just not even understand, oh my gosh, how does, how do all these large language models? How can they understand anything and do almost anything? Well, it breaks down words into these smaller subwords or subsections called tokens, right? So as an example, the word strawberry, I know, it breaks it into three smaller tokens. And it assigns tokens essentially a numerical value. And then depending on the context that you might give a word, it's going to change that token value, right? Even capitalizing something, turning something plural, having a space before or after a word, it is going to change its token

Starting point is 00:11:14 value. And that is actually, y'all, that is actually how large language models understand us. They understand our words by assigning token values to everything. So that is part of an algorithm of sorts that helps all of these large language models be helpful assistants to us. Because in the end, they actually don't understand words, right? And technically, when they are spitting back words, they're technically just spitting back tokens that get converted into words. Okay. So it's a common misconception, right?
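To make the "words become numbered subword pieces, and numbers become words again" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the vocabulary, the IDs, and the function names `encode` and `decode` are made up, and greedy longest-match stands in for the learned merge rules (such as byte pair encoding) that production tokenizers rely on. It is not how any specific model splits these words.

```python
from typing import List

# Hypothetical vocabulary for illustration only; real tokenizers learn
# tens of thousands of pieces from data, and these IDs are invented.
TOY_VOCAB = {
    "str": 101,
    "aw": 102,
    "berry": 103,
    "dog": 104,
    "run": 105,
    "ning": 106,
    " ": 107,
}
ID_TO_PIECE = {token_id: piece for piece, token_id in TOY_VOCAB.items()}


def encode(text: str) -> List[int]:
    """Split text into token IDs by greedy longest-match against TOY_VOCAB."""
    ids: List[int] = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining slice first, then shrink it.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token covers {text[pos:]!r}")
    return ids


def decode(ids: List[int]) -> str:
    """Turn token IDs back into text by joining their pieces."""
    return "".join(ID_TO_PIECE[token_id] for token_id in ids)


print(encode("strawberry"))           # three pieces: str + aw + berry
print(encode("running"))              # two pieces: run + ning
print(decode(encode("dog running")))  # round-trips back to the text
```

In this toy setup the model-facing view of "strawberry" is just three integers, which is one way to picture why letter counting is awkward for a system that never sees individual letters.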

Starting point is 00:11:48 And also, no, models cannot count, as an example, the number of R's in the word strawberry or even the number of words. They cannot do it consistently, right? So if you say, you know, hey, respond to this in 10 words or less, it might give you 13 words. It might give you eight words. That's because it doesn't understand the concept of words. It understands the concept of tokens. That is how large language models think.

Starting point is 00:12:16 That is how they process. That is how they respond back to you. Okay. One thing we're huge on at Everyday AI is education, right? And this is something, like I said, even the experts, quote unquote, self-anointed, self-appointed experts are getting wrong, right? Because they say, oh, large language models are dumb. They can't even count the number of R's in strawberry, or they can't even, you know, respond back when I say, you know, describe a strawberry in 10 words.

Starting point is 00:12:43 It never gives me 10 words. Well, that's not how large language models work, right? The key to getting the most out of generative AI is you first have to unearth and, you know, look at what's underneath. You have to understand things at a very basic level. Okay. So that is what a token is. And don't worry, we're going to be going over some examples. So, you know, one example, dog might be one token or the word running might be two tokens. Run and then ning, right? So it might split it up in different places. All right. Let's keep going. And I see some of your questions, y'all. I'm going to get to them at the end, but keep your questions coming in. All right. So why do large language models even use tokens?

Starting point is 00:13:28 Well, it helps them for context analysis, right? So it helps them better understand what you are actually talking about. All right. Also, it helps with language handling and process efficiency. Okay. So let's talk about that a little bit more. So with tokens, it helps large language models break our text, our sentences, into smaller units, those words or subwords.

Starting point is 00:13:57 And then it allows the model to better understand and retain context. Okay? Also, this is how large language models can sometimes speak hundreds of languages, right? You're like, how? And how can it flawlessly, you know, switch between languages? And we've seen even with these new voice assistants, right? We've seen, you know, demos of them where someone's saying, hey, you know,

Starting point is 00:14:22 where it acts as a real-time translator. That's how. That's how it can handle different languages, is through this process of tokenization. Also, this actually helps processing efficiency because using tokens technically reduces the computational load, right? Think it actually breaks down complex languages, complex queries into much more manageable tokens,

Starting point is 00:14:50 into essentially an algorithm of sorts, right? When you break complex languages and complex problems into tokens, and you'll see what that means, it actually helps the model be more efficient and process faster each and every time. All right. So now let's talk about the tokenization process. All right. And hey, I agree, right. Jason, y'all, like, if you're tuning in live, first of all, now you are the smartest person at your company, unless you work at OpenAI and you're working with the people who are, you know, actually building these models. But otherwise, yeah, maybe you should get some sort of continuing education credit for this.

Starting point is 00:15:29 But I guess now you're the smartest person in your company when it comes to tokenization and how large language models work. All right. So let's talk about the tokenization process. All right. And we're probably going to jump into this later. But a lot of people don't know that OpenAI has a tokenizer, right? You can go and play and learn how this works.

Starting point is 00:15:51 So I do have to say that the tokenizer for GPT-4o is not yet available. So the tokenizer that we're going to be going over both in the examples on my screen, and we'll probably do a little bit live, this is for the GPT-3.5 and the GPT-4 versions, kind of like quote unquote old school, right? So let's take a look. Let's take a look at this now.

Starting point is 00:16:18 So what I'm showing here is different use cases of the word strawberry, right? So, you know, tending to a strawberry patch, a heart-shaped strawberry candy, blowing strawberries on a belly, strawberry birthmark on the face, scraped strawberry knee wound, right? Like, even the word strawberry technically has many different meanings. So you can go into the OpenAI tokenizer, okay? And this is technically going to help explain two different things here. This is going to help show how words turn into tokens, but also the context window. Okay. So now when we look at those different words, because the word strawberry is going to mean different things if it has different context,

Starting point is 00:17:13 right? The word strawberry by itself and then versus a strawberry patch or strawberry candy, that word strawberry is technically going to be broken into even different sub-tokens depending on the context in which it is used, right? That is how large language models can understand nuances in language when it technically doesn't know language. It looks at the context of how you are using words, right? Is it capitalized? Is it singular?

Starting point is 00:17:45 Is it plural? Are there words before or after? Do those words before or after change the meaning of said word, right? So then after you put in, you know, and anyone can do this and, you know, we will have this link for the tokenizer in the newsletter today so you can go and play with it and get super smart, right? So now I'm going to show you the token values that these are assigned. All right. So actually, this is just the, um, kind of the visual, right? But you can even see, right?

Starting point is 00:18:17 At the top here, I have the word strawberry. Even the word strawberry with a space versus not a space or a space before or it being capitalized or it being, you know, singular versus plural. It is going to tokenize it a little differently. Okay. So we're going to be going over, we're going to be going over some actual examples. And no, we're not talking about tokens as a new currency. We're talking about how large language models work.
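The point about a leading space, capitalization, or a plural changing the tokenization can be sketched with a toy vocabulary. This is a hedged illustration, not the OpenAI tokenizer: the pieces, the IDs, and the helper name `toy_ids` are all invented, and the splitting rule (take the longest piece that matches the start of the remaining text) is a simplification.

```python
from typing import List

# Hypothetical vocabulary: pieces and IDs are invented for illustration.
# A leading space, a capital letter, or a plural ending selects different
# entries, so the resulting ID sequence changes.
TOY_VOCAB = {
    "straw": 11,
    " straw": 12,
    "Straw": 13,
    "berry": 21,
    "berries": 22,
}


def toy_ids(text: str) -> List[int]:
    """Split text by repeatedly taking the longest matching vocabulary piece."""
    ids: List[int] = []
    while text:
        match = max(
            (piece for piece in TOY_VOCAB if text.startswith(piece)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no token covers {text!r}")
        ids.append(TOY_VOCAB[match])
        text = text[len(match):]
    return ids


for variant in ["strawberry", " strawberry", "Strawberry", "strawberries"]:
    print(repr(variant), toy_ids(variant))
```

All four variants come out as different ID sequences here even though a person reads them as the same word, which is the behavior the on-screen tokenizer demo is pointing at.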

Starting point is 00:18:51 All right. So that is the tokenization process. And y'all, let's put this quickly into perspective here. We're going to be talking about a context window, okay? But essentially, you see even these couple of sentences here, we're getting a total number of tokens as well. So tokens actually play a dual role. First and foremost, the tokens are important because

Starting point is 00:19:17 that's how large language models can actually understand words. Second, there is something called a context window, or you can think of that as memory. And that is how much information ChatGPT can retain before it starts to forget things. Okay? So this is the context window, right? So it's different models. Also, this is important to know, but all models, so ChatGPT, Anthropic Claude, Google Gemini, they all have different context windows, right?

Starting point is 00:19:52 Technically, using ChatGPT inside of the, you know, quote unquote, chat interface versus the API, it actually has a fairly small context window compared to competitors, right? So when you use Claude, Claude has the best and the largest context window or memory when using it inside of the chat product. Okay. If you're using it inside of the development product or not the front end, it's actually Google Gemini with up to a two million token context window, which is huge, right? So that's as an example. You can paste an entire book into Google Gemini, if you are using it on the back end

Starting point is 00:20:35 in kind of the developer mode. And you can talk and ask questions of that book, where if you're inside ChatGPT, it has a much smaller context window. And we're going to go ahead and probably describe this live. All right. So bear with me, y'all. We're going to get a new chat going, all right? And I want everyone to see and to understand this tokenization process and why it's so important.

Starting point is 00:21:05 All right. So, hey, live stream audience, let me know if you can see my screen here. But for our podcast audience, this is what we're going to do. Okay, we're giving ChatGPT some basic information about myself. All right. Also, I have inside of my ChatGPT account a tokenizer. Okay. So this is not there by default, although I don't know why large language models do not have a token counter by default. They should. Okay. The one I'm using, I'm sharing it on my screen here. So I'm using Chrome, but I use what's called the ChatGPT Token Counter. This is a Chrome extension from amperly.com. All right. So I'll have that in the newsletter as

Starting point is 00:22:00 well. All right. So when I'm sharing my screen here, and we're going to be going over this process, bear with me, I promise you, this is going to help you understand tokens. And it's going to actually improve your outputs a ton. And even if you're listening on the podcast, I'm going to explain this to you. All right. So as we go along, I'm going to put some information in, okay? And you'll see the token count right here. So I have some information about myself. I'm saying, my name is Jordan. My favorite color is Carolina Blue. My favorite food is deep dish pizza. Oh, I just had some deep dish pizza last night. Shout out Taurici's. And then I'm saying, I think the bears kind of stink, but they might be okay this year.

Starting point is 00:22:38 Okay. So all of my words then when I hit Enter are going to be converted into tokens. I can't see that on the back end, right? But we're going to see also, when ChatGPT responds to me, we're going to see my token count go up. So let me take it away. So right now, technically this token counter is counting anything on the screen. All right. So, you know, essentially it's about 50 tokens, my query.

Starting point is 00:23:03 So when I hit enter, ChatGPT is going to respond. It's probably going to say, oh, great, thanks so much for this information. And then you're going to see my token count go up, right? So there we go. So it's, you know, giving me some standard blah, blah, blah. Hey, Jordan, nice to meet you. All right. Let me say something else. Right now, ChatGPT has different features where it can actually remember things by default in memory. I have that turned off, just so everyone knows. Because you might say,

Starting point is 00:23:28 oh, when you tell ChatGPT those things, it's going to remember. So of course, no, we're testing the context window here. Memory is off. Okay. So now what I'm going to do, I just have a bunch of random transcripts here. Okay. So I can't, you know, do 30,000 all at once. But I'm going to say, please, I'm going to say please summarize this text.

Starting point is 00:24:59 All right. Give me a second here. I want to prove something to everyone. All right. So now you'll see I put in a ton of text.

Starting point is 00:25:20 So we just jumped up already to about 17,000 tokens, all right? I'm going to go grab some more. And I'm proving a point here, y'all, because a lot of people don't read between the lines, right? And they take whatever a company says as truth, which you shouldn't do. All right. Because OpenAI actually, yeah, they're a little, I mean, it's technically on their website, but when they talk about GPT-4o, you probably hear a context window of 128,000 tokens. That's what most people think that the memory of ChatGPT, GPT-4o is, all right? It's not, and I'm going to prove that to you.

Starting point is 00:26:00 All right. So now I'm going to put a bunch of other information in here, and I'm keeping an eye on the tokens. All right, so let me go ahead and grab a little more. Y'all, this is what we do, right? We are investigative practitioners here at Everyday AI. We try to break things and test everything. All right. So here we go.

Starting point is 00:26:23 This should be a little bit better. All right, there's a point here, y'all. All right. So now you'll see after ChatGPT responds, I'm probably going to be at about 30,000 tokens. Okay. There's a point here, y'all.

Starting point is 00:26:36 I am proving and showing you through an example why tokenization and the memory matters, right? Because I talked earlier about, oh, maybe you're having a conversation with ChatGPT, and you share some things and things start out great, and then after a while it starts to lose its memory, right? So now here's what I'm doing. We're at about 29,000 tokens, just under, okay?

Starting point is 00:27:01 So now I'm asking, what's my name? What's my favorite color? What's my favorite food? What do I think of the bears? Right? So why am I asking that? Right? I'm doing it to show you that even though there are about 30,000 tokens of information

Starting point is 00:27:21 between my question and where the answer is, right? It should, in theory, get all of these things correct. The tokenizer is not 100% correct. These are estimates. That's why I didn't go up to like 31,900, okay? I wanted to show you, okay, we're at just about 30,000. Let me ask this question. And there we go, right?

Starting point is 00:27:45 I ask the question, what's my name? What's my favorite color? What's my favorite food? What do I think of the bears? It says your name is Jordan. Your favorite color is Carolina Blue. Your favorite food is deep dish pizza. You think the bears kind of stink, but they might be okay this year.

Starting point is 00:27:59 All right? Yeah, Monica. Monica says love it. Investigative practitioners. That's what we are. You know what? And if you have taken our PPP course, our free prime prompt polish course, it's been taken by now, I think, 7,000 professionals.

Starting point is 00:28:13 It is free. It is live. If you've taken our course, you probably know this. And you probably know kind of a workaround, this is called a memory recall. I'm not going to go into this in this session, but if you are tuning in live or on the podcast, just put PPP in there,

Starting point is 00:28:31 and I'll send you a link where you can sign up for free. And I'm wondering, has any of our live stream audience taken our PPP course? All right. So anyways, there you just saw an example. All right. We tested the context window. This is important because I say easily.

Starting point is 00:28:48 This is a top three mistake. Literally hundreds of millions of people are using large language models. And this is a top three mistake that people are making and they have no clue. And this is one of the reasons why people say, oh, we can't. We can't use a large language model at our company. Look, it starts good and then it goes off the rails. These models are broken. No, they're not.

Starting point is 00:29:11 You have to understand how they work and play by the rules. I'm teaching you how. All right. So now what we're doing is we are starting a new chat. I'm doing that so we don't have this same context window. So essentially we are starting over fresh, okay? And I'm starting with the same thing, right? The information about myself, my name is Jordan.

Starting point is 00:29:31 My favorite color is Carolina Blue. My favorite food is deep dish pizza. I think the bears kind of stink, but they might be okay this way, this year. Any football fans in the house, right? Any Bears fans? I've always been a realist bear fan to tell you the truth. All my friends are like every year, they're like, oh, the bears are going to win the Super Bowl. And I'm like, no, they stink.

Starting point is 00:29:48 This year, they might be okay. Anyways. All right. So now we're going to do the same thing. We're going to grab a ton of just information, just to gobble up tokens, to gobble up the memory. All right. So I'm in this transcript here from some of our hot take Tuesdays.

Starting point is 00:30:03 All right. So I'm just grabbing some information. I'm jumping back into ChatGPT. I'm saying, please summarize this. And again, remember, all of my words count toward that token count, the memory, right, and all of ChatGPT's words when it responds to me. That all counts toward the memory as well. All right. So what I'm going to try to do, I'm going to try to get probably like 33,000, 34,000, something like that. So let's go ahead and I'm grabbing some more and I'm going to put it in ChatGPT.

Starting point is 00:30:41 So I'm saying, please summarize this. Okay, again, you don't need to see this. It's just a bunch of random text just to eat through the context window. I want to grab a little bit more here just to make sure that we go well over that 32,000. Because, yes, it is not, it is not 128,000. That is only if you are using the API. People, so many people who are quote unquote smart people, people who charge, you know, $500 an hour to talk to, they tell you that. They're wrong. And I'm proving it to you here live.
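Since both the pasted text and ChatGPT's replies eat into the same budget, a running total is the number to watch. The sketch below is only bookkeeping under stated assumptions: the 32,000 figure is the chat-interface limit claimed in this episode (not an official constant), the per-turn counts are made-up stand-ins for what a token counter extension might report, and the names `tokens_used` and `tokens_over_limit` are hypothetical.

```python
from typing import List, Tuple

# Chat-interface window as claimed in the episode; treat it as an assumption.
CHAT_LIMIT = 32_000

# (role, token_count) pairs. The counts are invented stand-ins for what a
# token counter extension would display after each turn.
turns: List[Tuple[str, int]] = [
    ("user", 50),         # the short facts about Jordan
    ("assistant", 60),    # "nice to meet you" style reply
    ("user", 17_000),     # first pasted transcript
    ("assistant", 400),   # its summary
    ("user", 19_000),     # second pasted transcript
    ("assistant", 490),   # its summary
]


def tokens_used(conversation: List[Tuple[str, int]]) -> int:
    """Total tokens so far; user and assistant turns both count."""
    return sum(count for _role, count in conversation)


def tokens_over_limit(conversation: List[Tuple[str, int]],
                      limit: int = CHAT_LIMIT) -> int:
    """How many tokens have spilled past the window (0 if still inside)."""
    return max(0, tokens_used(conversation) - limit)


print("used:", tokens_used(turns))
print("over the window by:", tokens_over_limit(turns))
```

With these stand-in numbers only two paste-and-summarize rounds are enough to pass the limit, which matches how quickly the live demo gets there.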

Starting point is 00:31:15 All right. So here we go. We're at 37,000. We went over the 32,000 mark. That's what I wanted to do. Now, in theory, and we do test this almost every week because we do that live PPP course almost every week, right? And when we tell people something, we test. Y'all have no clue how much work goes on behind the scenes at Everyday AI to make sure you are the smartest person in AI at your company. This is what we do, y'all. All right. So now, we are at 37,000 tokens. And I'm going to

Starting point is 00:31:48 ask ChatGPT, what's my name? What's my favorite color? What's my favorite food? What do I think of the bears? And guess what we have. It forgot. "I don't have access to personal data unless shared in the current conversation." All right. But then you might be saying, wait, it is in the current conversation. You just told it. Okay. Think of this. That 32,000 tokens, that's roughly about 26,000 words, okay, give or take. Again, these are estimates. That's why I wanted to go well over 32,000 tokens, all right? ChatGPT only remembers the most recent 32,000 tokens. Okay. So what that means now, because I'm at 37,000 tokens, it has forgotten everything in the first 5,000 tokens. Okay. So essentially, people are like, oh, so when you get to 33,000, does it forget everything? No, it just remembers the most recent 32,000. So literally think, you know, in our PPP course, we go over some ways to kind of get around this, right? Number one is you just always have to be, you always have to understand the tokenization process and where your memory or your context window is. Okay. So that, y'all, is a very brief and

Starting point is 00:33:20 quick overview of how tokenization works. And this, like I said, is extremely important to understand, especially if you are using ChatGPT, because if I'm being honest, 32,000 tokens is not a lot. Obviously, I was working with, you know, I was copying and pasting like 15 pages of text at a time, but you saw there.
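The "it only remembers the most recent 32,000 tokens" behavior from the two live tests can be simulated in a few lines. This is a sketch of the simple sliding window the episode describes, with assumptions flagged: the 32,000 limit is the episode's figure, the token lists are placeholders rather than real tokenizer output, and a real product may trim or summarize history in other ways. The function names are hypothetical.

```python
from typing import List

# Chat-window size claimed in the episode; treat it as an assumption.
CONTEXT_LIMIT = 32_000


def visible_window(tokens: List[str], limit: int = CONTEXT_LIMIT) -> List[str]:
    """Keep only the most recent `limit` tokens, dropping the oldest first."""
    if len(tokens) <= limit:
        return list(tokens)
    return tokens[-limit:]


def build_conversation(filler_tokens: int) -> List[str]:
    """Stand-in conversation: 50 'fact' tokens followed by pasted filler."""
    return ["FACT"] * 50 + ["FILLER"] * filler_tokens


first_chat = build_conversation(28_950)   # about 29,000 tokens in total
second_chat = build_conversation(36_950)  # about 37,000 tokens in total

for name, chat in [("first", first_chat), ("second", second_chat)]:
    window = visible_window(chat)
    dropped = len(chat) - len(window)
    print(name, "dropped:", dropped, "facts still visible:", "FACT" in window)
# first dropped: 0 facts still visible: True
# second dropped: 5000 facts still visible: False
```

The facts sit in the oldest 50 tokens, so they are the first thing to fall out once the total passes the limit, while everything pasted more recently is still visible.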

Starting point is 00:33:43 That was technically hitting the enter key twice, right? It was only two rounds of back and forth because it was a lot of information I was putting in there, obviously. But ChatGPT's memory, especially if you are working with long, long blocks of text, it can go quickly. All right. So yeah, Rolando says, always the receipts. You know it. You know we always bring the receipts here. All right. So, hey, Ira. Ira says she wants to check PPP. Chris says it's a great course. All right. I'll send you,

Starting point is 00:34:21 I'll send you the information. Michael highly recommends it. All right. And if Michael joins us from LinkedIn and YouTube, you know that recommendation comes highly. All right. So now, just with that, y'all, you already know more about how large language models work

Starting point is 00:34:42 and ChatGPT works than, if I'm being honest, 99.5% of people out there. One of the key factors of getting the most out of large language models in generative AI is understanding it, right? People always talk about this black box of generative AI, right? Like, oh, no one knows what goes on under the scenes. Well, now you know a little bit, right? You know the rules, because essentially ChatGPT, OpenAI, Anthropic, you know, Google Gemini, they create a playing field.

Starting point is 00:35:18 And there's rules, there's boundaries. And we all have to play within the confines of the rules that they set. However, because they update these models so often, new features, new functionality, new and improved tokenization processes, different context windows, all of these things are constantly changing. That's why you have to constantly tune in. Because we spend an insane amount of time every single week, keeping up with everything that matters in artificial intelligence.

Starting point is 00:35:48 Y'all, this is unedited, unscripted, the realest thing, but we research everything, so you can be the smartest person in AI at your company. All right. I think there's a couple of questions here. If you have a question live, please, please get it in quickly because we're wrapping up here. Because, yeah, there's a lot of receipts today, y'all. Kathleen is asking, can you ask it to respond within a token range?

Starting point is 00:36:19 Yes and no. So we've tested that, similarly to asking it to give you a certain number of words. In our experience, right, and this is not scientific fact. There's actually, I don't know if there's actually been research papers on this, but, you know, as an example, and this is why you can't say, oh, how many R's are in strawberry, right? You can say how many tokens and it will get it a little closer to being correct. But in our experience, you know, let's say, hey, write me a sentence about strawberries with 10 words: not super accurate. If you say write me a sentence about strawberries with 15 tokens, it usually gets it a little closer. It is not a science.

Starting point is 00:36:59 Again, because of the intricacy of tokens, like I said, adding a period changes it. Putting a space after a word changes it. So it does get a little bit closer when you ask it to respond within a token range versus a word range, but it's still not going to be 100% accurate. All right, Monica, good question here. How are you able to continue the same chat when you max out your tokens? Okay, good question. So just because you get to, as an example, 33,000 tokens, it doesn't mean your chat is over.

Starting point is 00:37:32 Okay. So we teach something in our free Prime Prompt Polish (PPP) course, which is live and you can ask questions. We teach something called memory recall, all right? Because usually what happens when you're using ChatGPT, if you're using it correctly, we always tell people use ChatGPT like a consultant, right? And what that usually entails, it's a lot of back and forth, a lot of back and forth conversation, right?

Starting point is 00:38:01 Everyone should actually go check out episode 310 if you haven't already, the one ChatGPT mistake that we're all making. That goes through the process of turning ChatGPT into a consultant and more of this take on augmented intelligence. But to get back to your question, Monica, how are you able to continue the same chat? Well, you can keep going, right? You can keep a chat going to a million tokens. But like I said, it's only going to remember or recall the most recent 32,000 tokens. So what you can do, right, if you know, as an example, oh, I'm getting near the token limit. But a lot of times there's, quote, unquote wasted tokens, right?

Starting point is 00:38:39 Because you may be having a conversation with ChatGPT. So you can do something that's called a memory recall, which is essentially saying something along the lines of, hey, please recall all important information as if you are explaining the contents of our conversation to a large language model that has no information, right? That's just an example of a memory recall prompt off the top of my head. And then what will happen generally is ChatGPT is going to recall the most important information. And then what that does is it pushes the most important, think of it like a CliffsNotes

Starting point is 00:39:13 version, then it'll push that to the bottom of the context window, right? And then for most intents and purposes, right, you're kind of resetting its memory. Because think of it like pages as an example, right? So instead of thinking 32,000 tokens, think 32 pages, right? And maybe you have some important information on page five and some important information on page 10. And eventually, when you get to page 60, it's going to forget that information. So you can essentially do a memory recall. Like I said, there's probably a lot of fluff in there in those quote unquote 32 pages. And then it's going to put everything, let's just say,

Starting point is 00:39:48 as an example, on page 33, right? All of the most important bullet-pointed information. Now it's going to remember kind of that CliffsNotes version, or the bullet point version, for the next 32,000 tokens. Right. So that's just a way that in theory you can try to summarize or distill a lot of the most important information and kind of reset it. So like I said, when you get to 33,000 tokens, it doesn't forget everything. Right. So let's just say that. When you get to 33,000, it only forgets the first 1,000, right?
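The memory recall idea described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how ChatGPT works internally: `ask_model` is a hypothetical callable standing in for whatever sends text to the model, `rough_token_count` uses the episode's ballpark of 32,000 tokens to about 26,000 words instead of a real tokenizer, and the 90% threshold is an arbitrary choice.

```python
import math
from typing import Callable, List

# Recall prompt quoted from the episode.
RECALL_PROMPT = (
    "Please recall all important information as if you are explaining the "
    "contents of our conversation to a large language model that has no information."
)


def rough_token_count(text: str) -> int:
    """Estimate tokens from words using the episode's ballpark
    (32,000 tokens is roughly 26,000 words). Not a real tokenizer."""
    words = len(text.split())
    return math.ceil(words * 32_000 / 26_000)


def memory_recall(
    messages: List[str],
    ask_model: Callable[[List[str]], str],  # hypothetical: sends the chat, returns a reply
    limit: int = 32_000,       # context window figure quoted in the episode
    threshold: float = 0.9,    # assumption: recall once 90% of the window is used
) -> List[str]:
    """If the chat is close to the window limit, ask for a summary so the
    key points land at the bottom (most recent part) of the conversation."""
    used = sum(rough_token_count(m) for m in messages)
    if used < limit * threshold:
        return messages  # plenty of room left, nothing to do
    summary = ask_model(messages + [RECALL_PROMPT])
    return messages + [RECALL_PROMPT, summary]
```

The chat itself is not shortened; the summary simply becomes the newest content, so it stays inside the window for roughly the next 32,000 tokens.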

Starting point is 00:40:20 But if you're at 64,000 and you haven't done a kind of quote unquote memory recall to get that important information, it's forgotten the top 32,000 tokens. But actually what you can do, all right, and I didn't plan on doing this, Monica, but this is actually a great example here. A little hack, y'all. All right, we'll do some advanced things here. So I'm going back into my chat here, right? Because Monica had a great question that I think a lot of people are running into. Because in this chat now, we're at 37,000 tokens, all right?
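The numbers in this part of the episode follow one simple rule: whatever exceeds the window is what gets forgotten. A minimal sketch, assuming the 32,000-token window quoted in the episode (actual limits vary by model and plan); the function name is made up for illustration.

```python
def forgotten_tokens(total_tokens: int, window: int = 32_000) -> int:
    """How many of the oldest tokens have fallen out of the context window."""
    return max(0, total_tokens - window)


# The three chat lengths mentioned in the episode, plus one that still fits.
for total in (20_000, 33_000, 37_000, 64_000):
    print(f"{total:>6} tokens in chat -> {forgotten_tokens(total):>6} forgotten")
```

At 33,000 tokens only the first 1,000 are gone, at 37,000 the first 5,000, and at 64,000 the first 32,000.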

Starting point is 00:40:57 So what I can do, because you might be saying, oh, well, if I get past it, how can I ask ChatGPT to recall that information? Because it technically can't even find it. Oh, no, I'm screwed. No, you're not. Here's a little trick, y'all. So even when you do a memory recall, you can only do a memory recall on the most recent 32,000 tokens, right? But one thing you can do, there's a nice little feature in here in ChatGPT that not a lot of people know about. I as a human can go up to that information, right? That's past 32,000, because inside ChatGPT, you get access to all of it. So even if your conversation is 100,000 tokens, you can still see, as a human, information that ChatGPT can't. So now as an example, I'm going up to this original information, right, that is 5,000 tokens past what ChatGPT can see.

Starting point is 00:41:52 So I can, as an example, I can highlight, you know, I can highlight this text, right? Let's say this 5,000 tokens, right? It's not. All right, but when I highlight it, if I scroll up to the top here, all right, this is going to be a little tricky. I'm not getting the, I think I'm too far zoomed in. Let's try it again.

Starting point is 00:42:15 There it was. Give me a second, y'all. All right, there we go. So you can highlight certain information inside of ChatGPT. So it might be kind of hard to see here. But there is this quotation button, right? So even if something is outside of the context window, you could just copy and paste it and say summarize this, or you can highlight it and then hover.

Starting point is 00:42:45 And then you're going to get this quotation mark inside of chat. And then it says reply. So I can then do this and I can say, you know, please summarize this. Include everything. Right. And now as an example, it's going to recall that information. And then I can ask the exact same question, right? Let me scroll down here to the bottom. Then I can say, all right. So this is,

Starting point is 00:43:14 this is interesting here. So it looks like because I didn't say it in full, it's not able to recall it. I've never actually run into that issue. Anyways, to try to accurately kind of answer that question, Monica: at any point you can go back up, copy and paste something that's out of the context window, put it back into the context window, and then ChatGPT should be able to recall and remember that information. All right. So I know this was a little bit kind of technical and maybe a little bit dorky, but I hope this was helpful, y'all. And Denny, I think I kind of just answered your question here about if there's a workaround to continue a conversation once you max out the tokens. Yeah, I just kind of showed you, and we go over that in our free Prime Prompt Polish course as well.
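The copy-and-paste workaround can be pictured with a toy model of the chat. This is a simplified sketch, not ChatGPT's real behavior: it assumes whole messages drop out of the window (real truncation happens at the token level), and `count_tokens` is a stand-in you supply, for example a word counter.

```python
from typing import Callable, List


def visible_messages(
    messages: List[str],
    count_tokens: Callable[[str], int],  # stand-in for a real tokenizer
    window: int = 32_000,                # figure quoted in the episode
) -> List[str]:
    """Return the newest messages that fit in the window, oldest first."""
    kept: List[str] = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > window:
            break  # everything older than this has been "forgotten"
        kept.append(msg)
        used += cost
    return list(reversed(kept))


def reinject(messages: List[str], index: int) -> List[str]:
    """Paste an old message back in at the bottom of the chat."""
    pasted = "Please summarize this. Include everything:\n" + messages[index]
    return messages + [pasted]


# Toy demo with a tiny 20-"token" window and a word counter.
count_words = lambda text: len(text.split())
chat = ["my favorite color is blue", "a b c d e f g h", "i j k l m n o p"]
print(chat[0] in visible_messages(chat, count_words, window=20))      # the old fact is out of view
chat = reinject(chat, 0)
print("blue" in visible_messages(chat, count_words, window=20)[-1])   # pasted copy is back in view
```

The human can still scroll up and read the old message; pasting it again is what moves its content back into the part of the chat the model can see.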

Starting point is 00:44:11 You essentially just need to put it back into the context window. So that's it, y'all. I hope this was helpful. Now you know what tokens are. You know why they're important. And you know why ChatGPT might start out working for you very well and then not perform very well. Also, this is why it's extremely important to have crisp language, a strong lexicon, when working with ChatGPT, right? The other thing that we didn't dive too deeply

Starting point is 00:44:44 into today is the concept of tokens, right? And how, maybe if you're not super descriptive, right, how the example that I showed, even the same word can technically have different tokens because of different meanings, right? Because as an example, you might say, hey, you know, I want you to write a long blog post, but don't use the word just, right? Guess what? The word just, if you say it like that, the word just has many different meanings, right? I'm just talking.

Starting point is 00:45:17 We need a fair and just system, right? Justly this, right? It's just the right amount, right? The word just can have so many different meanings. So sometimes when you are trying to tell ChatGPT to include certain words or avoid certain words, if you're not giving it enough context, if you're not being crisp with your language, it can struggle, right? That's why one of the best kind of quote unquote prompt engineering skills that you can have is strong communication skills, because the better you are at talking to ChatGPT, the stronger connection it is going to make with that context to better understand what you actually mean and assign the right tokens, right?

Starting point is 00:46:00 That's why sometimes a couple of words can make all the difference. All right. I hope this was helpful, y'all. Now you have a complete guide to tokens inside of ChatGPT. All right. I hope this was helpful. Hey, if it was, go ahead. Sometimes we do this.

Starting point is 00:46:17 I don't know. Once or twice a month. You know, we charge, if I'm being honest, we charge a couple hundred dollars if you want to talk to us. We're going to do it. We're going to choose one person for free. All right. So anyone who reposts this show. So repost this on LinkedIn.

Starting point is 00:46:31 You can, I guess, retweet this one on Twitter. Reach out to me, tell me you reposted it, whatever, or tag us. All right. Anyone who does this this week, by Friday, I'm going to do a little drawing. All right. And then the winner, I'm going to give them a little 45-minute session where you can ask us anything, right? Companies pay a couple hundred dollars to be able to ask us questions. We're going to do it for one person for free. So if you find this helpful,

Starting point is 00:47:00 even if you're listening on the podcast, don't worry, you have time. If you repost this by Friday, August 16th. So, you know, these shows take us hours to plan. And sometimes hundreds of hours of experience to even have this information to distill it to you all. So if this is helpful, if you want to get any of your questions asked about generative AI, large language model, et cetera, go ahead, repost this to your network here on LinkedIn or Twitter X, whatever. Tag me. Let me know you did it.

Starting point is 00:47:34 And I will enter you into a drawing. We're going to pick one person. All right. So we appreciate your support. Also, if you're listening on the podcast, make sure you subscribe. Leave us a rating on Apple or Spotify. Like I said, if you are a podcast listener, this one might be one to go check out visually. So, you know, check out the LinkedIn post that we include in the show notes.

Starting point is 00:47:55 Check out that thanks a million giveaway going through the end of the month. And check us out tomorrow. And every day for more everyday AI. Thanks y'all.

Starting point is 00:48:35 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind.

Starting point is 00:49:02 Go break some barriers, and we'll see you next time.
