Everyday AI Podcast – An AI and ChatGPT Podcast - EP 208: Small Language Models - What they are and do we need them?

Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. It seems like we just got used to large language models.

Starting point is 00:00:50 But now we're hearing more and more about small language models. Like what the heck? What are small language models? What's the difference between them and the big large language models? And when should we be using small models? We're going to answer those questions today and more on everyday AI. Welcome. What's going on, y'all?

Starting point is 00:01:12 My name is Jordan Wilson and I am the host of Everyday AI. And this is for you. We are a daily live stream podcast and free daily newsletter helping everyday people like you and me not just learn what's going on with generative AI, but how we can all actually leverage all of this information to grow our companies and to grow our careers. And I think, you know, the more and more prevalent generative AI becomes in our day-to-day lives, you know, using it in the workplace, we have to become more comfortable with all of these buzzwords and all of these acronyms, right? So if you're not already

Starting point is 00:01:48 using large language models or small language models in your day to day at work and you live and work here in the U.S., you probably will be soon. So I think it's important to understand the difference between the two and what small models are good for. So we're going to be diving into that here in a second. But before we do, as a reminder, please go to youreverydayai.com. Today's episode, there's going to be a lot of depth and detail. So if you're listening, you know, whether you're walking your dog right now or you're on the treadmill or whatever it is, maybe you can't type down notes. So we always do that for you.

Starting point is 00:02:20 I'm a human. I'm a former journalist. I write the newsletter right when I'm done with this podcast and the live stream. Also, as a reminder, y'all, this is live. A lot of people don't know. This is essentially live, unedited. It's the realest thing in artificial intelligence right now. Also on the website, if you didn't know.

Starting point is 00:02:37 We have more than two, I think, 210 episodes or something like that now. You can go listen to every single episode, go watch every single episode, go read the newsletter that came along with every single episode. I argue we are probably the number one source for free generative AI information in the world. I don't know of a single other source, single source that has all this information. So make sure you go check it out. All right. So before we jump into what small language models are, let's first, as we do every day,

Starting point is 00:03:07 go over the AI news. So Salesforce is bringing a new gen AI option in Slack AI. So Salesforce has just rolled out new native generative AI capabilities within Slack, including features such as channel summaries, thread recaps, and AI search. So these features aim to help users access and make sense of the collective knowledge within Slack, improving productivity and saving time. So the new AI features in Slack can save users time and improve productivity with an estimated average savings of 97 minutes per week.

Starting point is 00:03:38 So these AI capabilities right now are rolling out just in the U.S. and the UK and are right now for subscribers of Slack Enterprise plans. That's the important part. But the company is working to expand to additional plans and languages. Hey, hot take, I don't like Slack, if I'm being honest. I think, you know, we use ClickUp. I think it's a better option for us at least. But I know so many people use Slack and, man, the noise of just Slack right

Starting point is 00:04:05 now is making me nervous. All right, our next piece of AI news, Apple's latest alleged AI tool is bringing new types of creativity to everyday people. So Apple has developed a prototype AI animation tool called Keyframer that uses large language models to add motion to 2D images. All right, so it is a description-based tool that converts text prompts into CSS code to animate images and has potential applications in web-based animation. So Keyframer is a promising new tool for creating these web-based animations and just the simplicity of turning 2D images into animations. As someone that's, you know, built and developed websites on and off for like 20 years, this is pretty exciting for me because it's not easy, right? And the capabilities to turn a still 2D photo into a moving animation

Starting point is 00:04:56 with a text prompt is kind of awesome. Right. So keep your eyes out on that. All right, last but not least, in AI News, Google has rolled out a new coding tool for its employees. So according to reports, Google has introduced an internal AI model called Goose to assist its employees in writing code more efficiently. So this is part of Google's broader efficiency drive, which includes job cuts and team reorganizations. So this new Goose model, reportedly, is a plan to bring AI to every stage of the product development process and is part of Google's efforts to enhance AI skills and streamline operations. You know what?

Starting point is 00:05:35 All these names are getting confusing because I thought maybe Meta was the one that was going to go down the animal route, you know, or the exotic bird route with their Emu Video. But now Google is getting into the game of goose, right? We just have separations. Can all, you know, can all tech companies maybe pick one genre of animal that they want to call their models so I can stop getting confused? All right. So as a reminder, if you want more news, we do it every day as well as, yes, we recap this show. We go over more news than this and fresh finds from all across the internet, a lot of

Starting point is 00:06:10 other great information in our newsletter. So make sure you go to that, youreverydayai.com. It's in the show notes if you're listening to the podcast as well. So make sure to check out the episode description for those links, as well as you can email me, you know, reach out on LinkedIn, all that good stuff. All right. So let's get into it. And I would love to hear from our audience. Hey, Tara is always first here. Good morning from Nashville, Dr. Harvey Castro. Denise, thanks for joining us, Brian and Woozy and Mauricio. We got a crowd today. Y'all, y'all actually care about small language models. All right. I'm not the only one. Good morning to Spatlana and Raul. Thank you all for joining us. So let me know right now. I would love to know for our live stream audience

Starting point is 00:06:50 or just let me know also if you're listening on the podcast. Do you use small language models? Do you know what they are? What questions do you have? I may not have every answer, right? Some of them I'll be able to answer live here on the show. So if you are joining us, please get your question in about what small language models are or what questions you have about them. But hey, if I can't get to them live, I'll make sure to answer them in the newsletter so we can all learn together. All right. So we're going to talk now about what a small language model is. And I'm going to give you 14 facts that you need to know. All right. No clickbait here. Just straight facts. You know we bring receipts. All right. So let's

Starting point is 00:07:27 start here with small language models. So it's important to know that the definition is always changing, right? As actually large language models get bigger, what we consider a small language model is always changing. So keep that in mind. All right. The goalposts are always moving in terms of these definitions. And there is no one overarching definition that everyone adheres to. So, you know, you could even say, oh, something that we might have considered a large language model two years ago, people might be calling it a small model now. So keep that in mind. So previously, a small language model was considered a model with fewer than hundreds

Starting point is 00:08:05 of millions of parameters. But now, like I said, that definition is changing a bit. So as large language models like GPT4 get bigger and bigger, right, when we jumped up from, you know, GPT 3.5 to GPT4, right, as the large language models get bigger, we're starting to call other models, oh, yeah, these are small now comparatively, right? So it's important to know. And it's all kind of judged by parameters. All right.

Starting point is 00:08:30 We're going to get into here in a second what those parameters are and what they mean. So essentially, large language models have billions to trillions of parameters, allowing complex tasks of all varieties, right? So your GPT4, your Gemini Ultra. Think of those as the ultra, ultra big boys, right, in large language models. And they can perform any task. And they essentially know everything. All right.

Starting point is 00:08:56 So that's not how small language models work. Small language models have fewer parameters, making them more efficient, and they are more for specific tasks or to be used locally on devices with more limited resources, right? We're going to get into all the intricacies, but, you know, if we want to talk big picture, that's the best option, right? Or that's the best way to think of them. Large language models are the behemoths that can literally do anything and everything. And probably most of us actually use large language models like ChatGPT and Google Gemini and Anthropic Claude, we probably use large models like that more than we do small models, all right? But small models are more for specific

Starting point is 00:09:36 tasks. So it's not something that you can really get the best results sitting down and asking, you know, a hundred different questions from a hundred different, you know, walks of life. That's not really what small models are for. They are for, they are fine-tuned for specific tasks. All right. Let's keep diving in. Let's look at some examples. So again, even when we talk about parameters, because I'm going to define what a parameter here is in a second, but first, I'm giving some examples of large language models and small language models and their parameters, right? So a lot of times companies don't even say how many parameters their models are. So even as I throw these numbers out there, you know, some of them are unconfirmed, but, you know, kind of reported to be true. So keep that in mind.

Starting point is 00:10:17 So let's just look at the difference here. So as an example, large language models, probably the two most popular ones are GPT4 from OpenAI, which is reportedly about 1.8 trillion parameters. Okay. Gemini Ultra from Google, which is a reported, you know, 1.5, 1.6 trillion parameters. All right, then let's look at some popular now small language models that maybe would have been considered big, you know, three years ago, but they're not. They're small now. So Phi-2 from Microsoft is about 2.7 billion parameters. And you have Llama, as an example, very popular open source model from Meta.

Starting point is 00:10:54 About seven billion parameters. All right. You know, don't, we can pick hairs on the parameters all day. Again, those are estimates, those are reports, et cetera. But those are some big names out there, right? So we're talking OpenAI, Google, Meta, Microsoft. So, you know, there's these smaller, more flexible models like Phi-2 and Llama. Then you have your big, you know, your big ones, you know, OpenAI's GPT4 and Gemini Ultra.
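
To put those reported figures side by side, here is a quick back-of-the-envelope sketch in Python. The numbers are the unconfirmed estimates quoted in the episode, not official specs, and the names `reported_params` and `size_ratio` are made up for illustration.

```python
# Reported (unconfirmed) parameter counts quoted in this episode.
reported_params = {
    "GPT4": 1.8e12,          # ~1.8 trillion (reported)
    "Gemini Ultra": 1.5e12,  # ~1.5 trillion (low end of the reported range)
    "Llama": 7e9,            # ~7 billion
    "Phi-2": 2.7e9,          # ~2.7 billion
}


def size_ratio(big: str, small: str) -> float:
    """How many times more parameters `big` has than `small`."""
    return reported_params[big] / reported_params[small]


for small in ("Llama", "Phi-2"):
    print(f"GPT4 is roughly {size_ratio('GPT4', small):,.0f}x the size of {small}")
```

If the reports are anywhere near right, the biggest models are a few hundred times larger than the small ones by parameter count.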

Starting point is 00:11:22 All right. So now let's talk parameters. Right? Because that is essentially the difference. That's the difference. There's a lot of intricacies and how they play out differently. But the kind of the first thing that we look at to separate large models from small models is parameters. So I just gave you the examples, you know, 1.8 trillion versus a couple billion. All right. So here's what a parameter is in very simple terms. All right. So again, I'm sure if you're a machine learning expert with, you know, a decade of experience, you know, you might, uh, have qualms with my definition here, but we're talking to the everyday person. We're trying to simplify this. So in simple terms, a parameter in a language model refers to the variables that the model uses to make predictions. All right.
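
To make that definition concrete, here is a minimal sketch of a toy model with just two parameters. It is not how a language model is actually built; it only assumes the same basic idea, that parameters are numbers the model uses to predict and that training adjusts them. The training data and the learning rate are invented for the example.

```python
# A toy "model" with just two parameters: a weight and a bias.
# Language models rely on the same idea, with billions of these numbers.
params = {"weight": 0.0, "bias": 0.0}


def predict(x: float) -> float:
    """The model's prediction is computed from its parameters."""
    return params["weight"] * x + params["bias"]


# Tiny made-up training set that follows y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

learning_rate = 0.05  # illustrative value
for _ in range(2000):
    for x, y in data:
        error = predict(x) - y
        # Nudge each parameter in the direction that shrinks the error.
        params["weight"] -= learning_rate * error * x
        params["bias"] -= learning_rate * error

print(params)       # ends up close to weight=2, bias=1
print(predict(10))  # close to 21
```

Before training, both parameters are zero and every prediction is zero; after training, the same two numbers encode what the model has learned from the data.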

Starting point is 00:12:10 And each parameter represents a concrete part of the model that can change or adapt based on the data it's trained on. All right, that's an important piece, right? Because all of these models and all of the parameters that they contain are trained, right? So which is why obviously these large language models are much more expensive to create. They're much more expensive to upkeep. They're more difficult to train, right?

Starting point is 00:12:38 Because they are so much more complex, right? It's like the way I like to think of it is you can, if you know how to work with a large language model, you can essentially ask it anything, right, in the history of existence. And if you know how to work a large language model, you can probably get a pretty decent answer. Not necessarily the case with small models, right? Small models are generally trained in different, you know, categories of work or different, you know, different types of outcomes you may want, right? So as an example, a small language model may be trained or built specifically for a type of

Starting point is 00:13:10 customer service, right, to handle just, you know, inquiries from customers, right? And it might be fine-tuned for a specific use case like that. Right. So if you have a small model that's maybe for customer service, right? Maybe it's an open source model that you can tweak yourself. But let's just say that, right? That small language model that's built specifically and tailored and fine-tuned for customer service to respond to customer inquiries, you aren't going to be able to code

Starting point is 00:13:38 on that, right? You're not going to be able to develop a website, right? You're not going to, it's not going to be able to spit out images for you, right? That's what large language models do, right? It's the multi-modality of large language models, right? Being able to input photos as prompts, being able to output, you know, different types of code in multiple languages, all of these things. A lot of times, small language models are not like that, right? They're built for one very specific purpose.

Starting point is 00:14:09 Or they are just made for smaller purposes. Like, okay, this small language model excels at creative writing. Right. So can it create an outline of, you know, how the stock market has changed over the last 30 years? Maybe, but that's not really what it's for, right? So again, different use cases, different types of training, different parameters. It changes, right? A complete difference in what a large language model and a small language model should be used for. All right. And hey, so great, great question here from Woozy. So do you have any specific products you like that are being run entirely with small models? I do have some examples, Woozy, but I'll try to share those in the newsletter so I don't

Starting point is 00:14:58 go off track here. All right. So now let's talk about 14 things that you should know about small language models. All right. And again, I think most of us, myself included, are more familiar with large language models. So as we go over some of the facts and things to know, I'm going to be comparing small models to large models, right? Because that's what most of us know. You know, so think the large models, the GPT4, the Gemini Ultra, the Claude, et cetera, or Anthropic Claude. So some of the most important differences, all right? So small language models require less computational power, making them more accessible for users with limited hardware resources. All right? That's the most, one of the most

Starting point is 00:15:45 important things, right? These small models can live locally on devices, right? The new Samsung phones have Gemini Nano, right, which is technically a small model, but it lives on the hardware, right, which technically requires less compute. Because when you're using large language models, people don't like to talk about this, but they are extremely resource heavy, right? We've talked about it on the show before, right? Every couple hundred prompts, you know, it kind of can tell you how many, you know, what is the environmental toll, right, on all of these prompts because large language models require a lot of compute power, you know, but small language models don't because they live locally, right? So they're not having to, you know, essentially send your query and compute it

Starting point is 00:16:34 in the cloud, which can be very expensive and very resource-heavy, small models, not like that. So even with that in the same vein, small language models are faster at training and inference due to their smaller size compared to large language models. Right. They're faster. It's faster when there's way fewer parameters, especially when you're, you know, obviously using a small language model for what it's good at. It's faster. It's on device. It has fewer parameters to look through, right? So think of it like this. You know, what's technically faster? You know, if you had to read through a 500-page book to learn about the history of everything, or a five-page book, right, that gives you the history of things that maybe you care about. It's kind of like that, right?

Starting point is 00:17:22 It's just faster. Small language models are faster, but obviously way less robust. All right, some more facts to know. Small language models are more energy efficient, which we just talked about. So they're reducing the carbon footprint associated with the training and the running of AI models. That's another part. It's not just the running of, you know, large language models that's expensive. It's the ongoing training.

Starting point is 00:17:46 There's so much compute, right? That's why you have Sam Altman out trying to raise $7 trillion. Yes, trillion with a T, $7 trillion for more GPUs, right? To build a new, new class of chips, right? Because these GPU chips power all of these generative AI models, not just your large language models, but, you know, all generative AI models are run off these very hard-to-get, very expensive GPU chips. So right now, you know, we are, I wouldn't say it's a compute crisis, but, you know,

Starting point is 00:18:16 all of this computing power is scarce, it's expensive, it's resource heavy, it's in the long run taking a toll on the environment. So small models, I think in that regard, are important to keep an eye on. Also, small language models can be deployed on mobile devices and embedded systems, unlike most large language models. Yeah. So, again, that's your edge AI, right? Or your edge computing.

Starting point is 00:18:40 So that's bringing these language models locally to small devices, right? So you're seeing it on actual, you know, phones now. And, you know, just talked about that with the new Samsung S24, having Gemini Nano. Also reportedly, Apple, you know, should be announcing their generative AI offering in June at the Worldwide Developers Conference. So Cook just announced that, I think it was earlier this month. So we are presumably going to be seeing a small language model in an upcoming iPhone or maybe in upcoming, you know, MacBooks or iMacs, right? So that's another important thing to keep an eye on is we are seeing that as well.

Starting point is 00:19:18 With Nvidia's Chat with RTX, right? They just announced this, or it was actually announced a couple of months ago, but it was just released, you know, in the last like 36 hours, Nvidia's Chat with RTX, right? So we don't know how many parameters it is, but it looks like a pretty solid small language model that can run locally, right? You have to have a certain Nvidia GPU running on your computer to run Chat with RTX, but the same thing. So this is a big shift that we're going to see because it's

Starting point is 00:19:50 better privacy as well, right? That's one of the biggest things that people are concerned about with large language models and it makes sense, right? So not just data sharing, but training data, right? How are these companies using any data that we upload into their systems to train their models, right? So when you think of smaller models that run locally, they are not sending information back and forth. Right. So it is the concept of running a language model and generative AI locally on a device. Much more privacy, much more security, right? Just like if you're opening, you know, Microsoft Office, let's just say, and you're not connected to the internet. That's kind of, you know, what it's going to look like in the future if you're working with a small language model on your phone or on your PC or Mac once it comes out, right?

Starting point is 00:20:37 We're waiting on Apple. All right, some more facts. Here we go. So small language models are suited for real-time applications, such as on-device language processing, where quick responses are crucial. All right, next fact. Next fact, small language models have a lower capacity

Starting point is 00:20:56 for understanding complex language nuances, compared to large language models, right? That's the other thing. Large language models, if you know prompt engineering 101, there's really not much in the world that you can't do with a large language model, right? You can translate languages.

Starting point is 00:21:13 You can build advanced web applications by just asking, you know, a large language model to code something for you. And you can technically build generative AI with generative AI, right? So the large language models are extremely complex, and they're only going to get more powerful and more robust as we see new models, right?

Starting point is 00:21:33 When we see GPT5, you know, and Sam Altman at OpenAI has been saying, oh, it's going to be much better at reasoning. It's going to be able to reason, you know, more multimodal capabilities, right? So these large language models are going to get even more powerful and even more robust, right? And even right now, large language models across like MMLU benchmarks. We talk about that on the show a lot. But, you know, the current version of GPT4 and presumably Gemini Ultra, you know, are about three to four times better on these benchmarks like MMLU than the average human, right? So right now, large language models, for the most part, are much smarter, much, much smarter

Starting point is 00:22:14 than any one human. All right. So that's important to keep in mind. So it's a big difference here. All right, let's keep it rolling. So another thing you need to know about small language models, they are often used in applications where speed and efficiency are more critical than deep language understanding. Again, tailored applications.

Starting point is 00:22:35 Small language models can also be fine-tuned more quickly and cheaply for specific tasks versus large language models, right? Y'all, like even though GPT4, I think is still, I think it's still more powerful, at least right now than Gemini Ultra. You know, we might see that shift, you know, as Gemini Ultra from Google, you know, starts to get a little more stability and just a little bit more improved, but y'all, GPT4 is like almost two years old now, right? Which is crazy to think about, right? But it's also important to know that and to see the difference, right? Presumably, OpenAI has been working on, you know, GPT5 for years. So these large language models are extremely expensive to create, to train, and to maintain, right?

Starting point is 00:23:24 It's like a Titanic ship in the ocean versus a jet ski, right? You can't use a jet ski for everything, but for a specific task, a lot of times a jet ski is much better than a huge, you know, cruise ship maybe that 10,000 people can go on. Different, different applications, different vessels for different applications. All right, a couple more things you need to know about small language models. And yes, if you do have questions,

Starting point is 00:23:58 I'm going to try to get to them at the end. So keep them, keep them coming if you do have them. So small language models, they're much easier to maintain, obviously, and update due to their simpler architecture. Small language models can also be more easily integrated into software and web applications without needing extensive infrastructure.

Starting point is 00:24:17 That's an important one. I think every, you know, all these different, you know, web applications and software early on, you know, just jumped on, you know, OpenAI's models because their API was good. You know, they've been making it cheaper and cheaper and faster to work with. But I think we're going to see a shift here in 2024, maybe to a lot of these, you know, pieces of software, these different web applications instead, you know, using small language

Starting point is 00:24:46 models. Because again, as an example, let's just say, if you're building, you know, if you're a large company and you want to, you know, have your own version of a model for customer support. Do you need a model as big as, as big as, you know, Google Gemini Ultra or as big as OpenAI's, you know, GPT4? I don't know. You know, you might be better off as an example with a model like Mistral or a model like, like Llama, right? Something that is maybe, you know, more limited, more fine-tuned. You know, another thing to keep in mind, which I think is important is large language models. People struggle with them, right? Because let me tell you this, if you're using,

Starting point is 00:25:30 and I know I always, you know, old man Wilson getting on his porch and shaking his fist at the kids who don't know what a large language model is or how to use it. Right. So so much of what you see on the internet and social media is, you know, oh, use my prompts. Use my prompts. You know, here's 15 prompts that'll make you rich tomorrow. Those prompts don't work. They literally don't work. That's not how large language models work. They're too big. They're too big, right? If you tell GPT4 as an example, you're a copywriter with 20 years of experience, that means nothing, that means nothing, you know, for a large language model. Because guess what?

Starting point is 00:26:06 It has gobbled up all of the information on the open web and closed web, works of art, things that we don't even know about. It has essentially the history of humankind in its data set, and it's, you know, 1.8 or 1.5, depending on what model you're talking about, you know, those trillions of parameters. So guess what? It's also gobbled up all this information that's bad information, right? People that say, oh, I'm an expert copywriter with 20 years of experience.

Starting point is 00:26:32 Guess what? There's a lot of people that say that on the internet that are garbage. And they're not good copywriters, right? So when you're working with a large language model with trillions of parameters and you think you can use these copy and paste prompts and get great outputs, no. Would you get better outputs if you were using a small language model that is specifically trained for copywriting or creative writing? Absolutely.

Starting point is 00:26:54 Right. That's why I think so many individuals, so many businesses, especially early on, wrote off technologies, you know, these very powerful and robust technologies such as, you know, ChatGPT, GPT4, even, you know, Google Gemini Ultra because they're like, oh, well, I can put one big prompt in here and it's not fantastic. It's because it's a large language model with trillions of parameters. You can't just put one prompt in and expect something great out because its brain, its big neural network, is too big. It's too big. It's not fine-tuned for a very specific task. So this is just a small mini-rant brought to you by old man Wilson. If you're working with large language models, you need to understand the basics of prompt engineering, right? You need to essentially train your chat that you're working with, right? You have to, like what Tara is saying here. This is what we teach in our free Prime Prompt Polish course that has been taken by thousands of people, thousands of, you know, business leaders across the world. We teach them the basics. Most people are using large language models incorrectly. They're using

Starting point is 00:27:59 it like it's a small language model. It's not how it works. Sorry, rant over. Let's keep going. Small language facts, you've got to know. So small language models offer a balance between performance and resource usage, and it's ideal for many practical applications. All right. So good examples here. Small language models can power chatbots. They can power search engines. They can power voice assistants, whereas large language models are advanced and used for every single task.

Starting point is 00:29:34 All right. Last couple facts, you got to know here. You got to know. Small language models can be used in both cloud-based services or by downloading them. Yeah? So that's the thing.

Starting point is 00:29:58 You can't download a large language model. It's not how it works. I don't know if there's a single, you know, computer, GPU, you know, on any one physical device that you can download the entirety of, you know, like a GPT4. There's been people that have, you know, forked it and they've created smaller versions of these large language models. But, you know, for the most part, small language models, the big, big thing to keep in mind is, yes, they can be downloaded.
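
A rough way to see why the small models can be downloaded and the biggest ones realistically cannot is to estimate how much storage the weights alone would take. The sketch below assumes each parameter is stored at a common numeric precision (4 bytes for fp32, 2 bytes for fp16, half a byte for 4-bit) and ignores everything else a running model needs. The parameter counts are the reported, unconfirmed figures from earlier in the episode, and `weights_size_gb` is a made-up helper name.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int4": 0.5}


def weights_size_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Reported (unconfirmed) parameter counts.
models = {
    "Phi-2 (2.7B)": 2.7e9,
    "Llama (7B)": 7e9,
    "GPT4 (reported 1.8T)": 1.8e12,
}

for name, n in models.items():
    print(
        f"{name}: ~{weights_size_gb(n, 'fp16'):,.1f} GB at fp16, "
        f"~{weights_size_gb(n, 'int4'):,.1f} GB at int4"
    )
```

Under these assumptions, a model with a few billion parameters fits in the memory of a laptop or a high-end phone, while a model in the trillion-parameter range would need several terabytes for the weights alone.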

Starting point is 00:30:21 They can be cloud-based as well, right? So there's great resources out there. We'll mention them in the newsletter. You know, Hugging Face is probably one of the leading resources for, you know, working with and downloading small language models, and you can run them locally on your machine, right? You don't have to be a tech expert to experiment and to download and to install small language models, because that's what they're for. They're for, you know, on-device use for very specific use cases. All right, so I'm going to get to your questions here,

Starting point is 00:30:51 but we're going to wrap up and I'm going to ponder with you. Let me know, if you're joining me live, what do you think the future is for small language models? I'm not 100% sure, right? I talk about large language models every day. I read about small language models. Use all kinds of models. So I'm very curious about what the future of small language models is. I think what we're seeing, as an example, with Samsung and Google teaming up to bring Gemini Nano to the S24,

Starting point is 00:31:30 to a mobile device, that's huge, right? So I think the future of small language models actually is going to kind of rely heavily on the successes or failures of these first couple large-scale commercial rollouts, right? So even if we could count on one hand, you know, what we could call highly visible small language models. So you have your, you know, your models from Meta, right? Your Llama models, very popular right now, very popular, you know, for people to run these models locally.

Starting point is 00:32:04 You know, just talked about Gemini, Gemini Nano. I think we also have to talk about NVIDIA's Chat with RTX, right? And you can use other models on Chat with RTX, which is great. Same thing. You can upload your own documents. You know, we haven't even talked about, you know, the power, the power of RAG, right? The power of RAG, which is something that is kind of tied in with these small language models. So it's retrieval-augmented generation combined with small language models.
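
For anyone who wants to see the retrieval-augmented generation idea in code, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how Chat with RTX or any particular product works: the documents, the `retrieve` and `build_prompt` helpers, and the simple word-overlap scoring are all hypothetical stand-ins. Real systems typically use embedding models and a vector database, and the final prompt would be handed to whatever local small language model you run.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(q_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Combine the retrieved context with the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical private documents that never leave your machine.
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The office is closed on public holidays.",
    "Support tickets are answered within two business days.",
]

question = "How many days do I have to return a purchase?"
prompt = build_prompt(question, DOCUMENTS)
# `prompt` is what you would pass to a locally running small language model.
print(prompt)
```

The point of the sketch is the order of operations: your own data is searched first, and only the relevant piece is placed in front of the model along with the question.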

Starting point is 00:32:36 So essentially bringing in your own database of information and combining it with small language models, you know, then you can, I think, bypass so many of these security and privacy concerns, right? And you can work with a small model that works on a device. It's faster. It's more efficient. It's cheaper. And then if you can bring your own data in and work with it in a secure fashion,

Starting point is 00:32:56 I think the future of small language models is extremely promising, right? It's almost like, I think, kind of, you know, the large language models are kind of like the Trojan horse. You know, they infiltrate all our daily lives and we see how powerful they are. And we all start using them. Hundreds of millions of people are using large language models on a daily basis now. But then we are also concerned, right? And, you know, I guess, like, some of the best marketing maybe for small language models

Starting point is 00:33:25 is large language models, right? And people using them incorrectly because we see this robustness and how powerful models in general, language models are, right? To turn unstructured data into something that we can use and we can create with. It's extremely powerful. But we are also now, over the last 18 months,

Starting point is 00:33:47 we've been exposed to the downside, right? And now we're becoming more cognizant of privacy and trust. I think small language models, it's kind of a wait and see, but I do see them gaining popularity, you know, with Gemini Nano, with the new S24, with Meta's open-source local models, and also with whatever Apple is going to be announcing, right? Presumably, we're going to see some sort of small language model with Apple. You know, it could be a large language model as well. It could be a combination, but presumably we're going to have some edge AI, some on-device large language model in a future

Starting point is 00:34:25 Apple offering. So I do think that working more with small language models is going to be the future. All right. Let's see if we can get to a question or two, maybe a couple comments. Let's see here. So Juan says, what's your large language model of choice for everyday use, emails, productivity, etc.? Great question, Juan. I am still Team ChatGPT, through and through, right? At least for most of, you know, our team's needs, ChatGPT is great. Again, but we're not working as much with many other people. We're a small business. So we're not working as much with confidential documents, right? I think if we were, we might be looking at some of these small language models right now. But at least right now, I still think while plugins are there, right? Yes, OpenAI will likely be phasing

Starting point is 00:35:18 plugins out soon. But right now, the ability to, you know, use what we call plugin packs, which are essentially mini agents, right? You know, when you enable any three plugins at once, put a prompt in, and those three plugins can work with each other autonomously. People don't know this. People don't understand how powerful plugins are. And then you can also, with the new feature from OpenAI, the GPT mentions, only one at a time, have your three plugins almost working like mini agents in a chat and then also mention any GPT, whether ones you create or from the GPT Store. At least for me, I don't see any other large language model right now that offers that sort of flexibility,

Starting point is 00:35:59 not even Gemini Ultra right now. Not all Workspace accounts have access to all the features that Gemini Ultra has. So at least my take right now, that's the best large language model. Let's see. Tanya asks, can you give an example of a prompt that gives the results you want? So Tanya, I'm not sure. You'll have to ask me a follow-up question. Maybe that's something we can answer in the newsletter. So Tara,

Starting point is 00:36:25 Tara asking what is the best PC or Mac for tinkering with a local model? My 2018 MacBook Pro wants to retire on me. Yes, that's a good question. So you're going to want to look at probably, you know, laptops that were honestly introduced in the last like three to six months. So we'll have a list. We'll have a list in the newsletter today. I don't have a list off the top of my head.

Starting point is 00:36:57 But I do know, as an example, Microsoft did just release a new version of their Surface Laptop that can run models locally. You know, we've already mentioned a couple of, you know, phones that can run small models locally. So, yeah, we'll have a complete list of kind of different PCs right now or Macs that can run these models, because, yes, you need newer devices with new GPs or GPUs, very powerful, you know, processing. So, yeah, you're going to need probably something that's come out in the last three to six months in order to, you know, really leverage this. All right. That's it, y'all. I hope you enjoyed a somewhat, you know, random look, right?

Starting point is 00:37:43 We kind of went all over the place on this one. But I gave you 14 facts you need to know about small language models. We talked about the big differences. We talked about what they are, from parameters to the future. So if you want more on this, we're going to break it all down in our daily newsletter. So go to YourEverydayAI.com. Sign up for that free daily newsletter. We're going to get to some of the questions we couldn't get to live and more,

Starting point is 00:38:03 as well as more AI news, more fresh finds from across the web, our daily tutorial. Check it all out at YourEverydayAI.com. And join us tomorrow. Join us tomorrow. We're going to be talking about how AI is a creativity enhancer and not a creativity replacement. So make sure to join us tomorrow and every day for more Everyday AI. Thanks, y'all.

Starting point is 00:39:01 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
