Everyday AI Podcast – An AI and ChatGPT Podcast - EP 208: Small Language Models - What they are and do we need them?
Episode Date: February 15, 2024
It seems like we just started to understand large language models. But now all the talk is about small language models. So what are they and how do they compare to LLMs? We explain small language mode...ls and their future.
Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode page
Join the discussion: Ask Jordan questions on small language models
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn
Timestamps:
02:30 Daily AI news
07:57 Large language models are powerful and versatile.
12:03 Large language models are complex, expensive
15:39 Small language models are faster and efficient.
18:51 NVIDIA announces local small language model.
21:54 Small language models are efficient for specific tasks.
29:33 Samsung and Google teaming up for Gemini Nano.
30:25 Small language models support data integration securely.
35:10 New devices needed to run language models.
Topics Covered in This Episode:
1. Introduction to Language Models
2. Advantages and Usage of Small Language Models
3. Comparison of Small and Large Language Models
4. Future of Small Language Models
Keywords: large language models, small language models, Gemini Nano, S 24, Apple, Meta, ChatGPT, workspace accounts, plug-in packs, Microsoft Surface, bad information, prompt engineering, chatbots, search engines, voice assistants, cloud-based services, Samsung, Google, Meta's llama models, NVIDIA's chat with RTX, retrieval augmented generation, Everyday AI Show, Salesforce, Slack, AI animation tool, Keyframer, Goose, parameters, GPT 4, Gemini Ultra
Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)
Start Here ▶️ Not sure where to start when it comes to AI? Start with our Start Here Series.
You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com. Also, here's a link to the entire series on a Spotify playlist.
Transcript
This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
It seems like we just got used to large language models.
But now we're hearing more and more about small language models.
Like what the heck?
What are small language models?
What's the difference between them and the big large language models?
And when should we be using small models?
We're going to answer those questions today and more on everyday AI.
Welcome.
What's going on, y'all?
My name is Jordan Wilson and I am the host of Everyday AI.
And this is for you.
We are a daily live stream podcast and free daily newsletter helping everyday people like
you and me not just learn what's going on with generative AI, but how we can all actually
leverage all of this information to grow our companies and to grow our careers.
And I think, you know, the more and more prevalent generative AI becomes in our day-to-day
lives, you know, using it in the workplace, we have to become more
comfortable with all of these buzzwords and all of these acronyms, right? So if you're not already
using large language models or small language models in your day to day at work and you live
and work here in the U.S., you probably will be soon. So I think it's important to understand the
difference between the two and what small models are good for. So we're going to be diving into that
here in a second. But before we do, as a reminder, please go to YourEverydayAI.com.
Today's episode, there's going to be a lot of depth and detail.
So if you're listening, you know, whether you're walking your dog right now or you're on
the treadmill or whatever it is, maybe you can't type down notes.
So we always do that for you.
I'm a human.
I'm a former journalist.
I write the newsletter right when I'm done with this podcast in the live stream.
Also, as a reminder, y'all, this is live.
A lot of people don't know.
This is essentially live, unedited.
It's the realest thing in artificial intelligence right now.
Also on the website, if you didn't know.
We have more than two, I think, 210 episodes or something like that now.
You can go listen to every single episode, go watch every single episode,
go read the newsletter that came along with every single episode.
I argue we are probably the number one source for free generative AI information in the world.
I don't know of a single other source, single source that has all this information.
So make sure you go check it out.
All right.
So before we jump into what small language models are, let's first, as we do every day,
go over the AI news.
So Salesforce is bringing a new gen AI option in Slack AI.
So Salesforce has just rolled out new native generative AI capabilities within Slack,
including features such as channel summaries, thread recaps, and AI search.
So these features aim to help users access and make sense of the collective knowledge within Slack,
improving productivity and saving time.
So the new AI features in Slack can save users time and improve productivity with an estimated
average savings of 97 minutes per week.
So these AI capabilities right now are rolling out just in the U.S. and the UK and are right now
for subscribers of Slack Enterprise plans.
That's the important part.
But the company is working to expand to additional plans and languages.
Hey, hot take, I don't like Slack, if I'm being honest.
I think, you know, we use ClickUp.
I think it's a better option for us at least.
But I know so many people use Slack and, man, the noise of just Slack right
now is making me nervous. All right, our next piece of AI news, Apple's latest alleged AI tool is
bringing new types of creativity to everyday people. So Apple has developed a prototype AI animation
tool called Keyframer that uses large language models to add motion to 2D images. All right, so it is a
description-based tool that converts text prompts into CSS code to animate images and has potential
applications in web-based animation. So Keyframer is a promising new tool for creating these web-based
animations and just the simplicity of turning 2D images into animations. As someone that's, you know,
built and developed websites on and off for like 20 years, this is pretty exciting for me because
it's not easy, right? And the capabilities to turn a still 2D photo into a moving animation
with a text prompt is kind of awesome. Right.
So keep your eyes out on that.
All right, last but not least, in AI News, Google has rolled out a new coding tool for its employees.
So according to reports, Google has introduced an internal AI model called Goose to assist its employees in writing code more efficiently.
So this is part of Google's broader efficiency drive, which includes job cuts and team reorganizations.
So this new Goose model, reportedly, is part of a plan to bring AI to every stage of the product development process and is part of Google's efforts to enhance AI skills
and streamline operations.
You know what?
All these names are getting confusing because I thought maybe Meta was the one that was
going to go down the animal route, you know, or the exotic bird route with their Emu Video.
But now Google is getting into the game of goose, right?
We just have separations.
Can all, you know, can all tech companies maybe pick one genre of animal that they want to call
their models so I can stop getting confused?
All right. So as a reminder, if you want more news, we do it every day as well as, yes, we recap
this show. We go over more news than this and fresh finds from all across the internet, a lot of
other great information in our newsletter. So make sure you go to that, YourEverydayAI.com.
It's in the show notes if you're listening to the podcast as well. So make sure to check out
the episode description for those links, as well as you can email me, you know, reach out on
LinkedIn, all that good stuff. All right. So let's get into it. And I would love to hear from our
audience. Hey, Tara is always first here. Good morning from Nashville, Dr. Harvey Castro. Denise, thanks
for joining us, Brian and Woozy and Mauricio. We got a crowd today. Y'all, y'all actually care
about small language models. All right. I'm not the only one. Good morning to Spatlana and Raul.
Thank you all for joining us. So let me know right now. I would love to know for our live stream audience
or just let me know also if you're listening on the podcast. Do you use small language models?
Do you know what they are? What questions
do you have? I may not have every answer, right? Some of them I'll be able to answer live here on
the show. So if you are joining us, please get your question in about what small language models are
or what questions you have about them. But hey, if I can't get to them live, I'll make sure to
answer them in the newsletter so we can all learn together. All right. So we're going to talk now about
what a small language model is. And I'm going to give you 14 facts that you need to know.
All right. No clickbait here. Just straight facts. You know we bring receipts. All right. So let's
start here with small language models. So it's important to know that the definition is always changing,
right? As actually large language models get bigger, what we consider a small language model is always changing.
So keep that in mind. All right. The goalposts are always moving in terms of these definitions.
And there is no one overarching definition that everyone adheres to. So, you know, you could even say,
oh, something that we might have considered a large language model two years ago,
people might be calling it a small model now.
So keep that in mind.
So previously, a small language model was considered a model with fewer than hundreds
of millions of parameters.
But now, like I said, that definition is changing a bit.
So as large language models like GPT-4 get bigger and bigger, right, when we jumped up from,
you know, GPT-3.5 to GPT-4, right, as the large language models get bigger, we're starting to
call other models, oh, yeah, these are small now comparatively, right?
So it's important to know.
And it's all kind of judged by parameters.
All right.
We're going to get into here in a second what those parameters are and what they mean.
So essentially, large language models have billions to trillions of parameters,
allowing complex tasks of all varieties, right?
So your GPT-4, your Gemini Ultra.
Think of those as the ultra, ultra big boys, right, in large language models.
And they can perform any task.
And they essentially know everything.
All right.
So that's not how small language
models work. Small language models have fewer parameters, making them more efficient, and they are
more for specific tasks or to be used locally on devices with more limited resources, right?
We're going to get into all the intricacies, but, you know, if we want to talk big picture,
that's the best option, right? Or that's the best way to think of them. Large language models
are the behemoths that can literally do anything and everything. And probably most of us actually use
large language models like ChatGPT and Google Gemini and Anthropic's Claude, we probably use
large models like that more than we do small models, all right? But small models are more for specific
tasks. So it's not something that you can really get the best results sitting down and asking,
you know, a hundred different questions from a hundred different, you know, walks of life.
That's not really what small models are for. They are for, they are fine-tuned for specific tasks.
All right. Let's keep diving in. Let's look at some examples.
So again, even when we talk about parameters, because I'm going to define what a parameter here is in a second, but first, I'm giving some examples of large language models and small language models and their parameters, right?
So a lot of times companies don't even say how many parameters their models are.
So even as I throw these numbers out there, you know, some of them are unconfirmed, but, you know, kind of reported to be true.
So keep that in mind.
So let's just look at the difference here.
So as an example, large language models, probably the two most popular ones are
GPT-4 from OpenAI, which is reportedly about 1.8 trillion parameters.
Okay. Gemini Ultra from Google, which is a reported, you know, 1.5, 1.6 trillion parameters.
All right, then let's look at some popular now small language models that maybe would have been
considered big, you know, three years ago, but they're not. They're small now.
So Phi-2 from Microsoft is about 2.7 billion parameters.
And you have Llama, as an example, a very popular open source model from Meta.
About seven billion parameters.
All right.
You know, we can split hairs on the parameters all day.
Again, those are estimates, those are reports, et cetera.
But those are some big names out there, right?
So we're talking OpenAI, Google, Meta, Microsoft.
So, you know, there's these smaller, more flexible models like Phi-2 and Llama.
Then you have your big, you know, your big ones, you know, OpenAI's GPT-4 and Gemini Ultra.
All right.
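To put those numbers side by side, here is a rough sketch of how much storage the weights alone would take. It assumes every parameter is stored as a 16-bit number (2 bytes), which is an assumption for illustration and not something the companies have confirmed. The parameter counts are the reported or estimated figures from the episode, and the names `PARAMETER_COUNTS` and `weight_size_gb` are made up for this example.

```python
# Reported or estimated parameter counts mentioned in the episode.
# These are not confirmed by the vendors.
PARAMETER_COUNTS = {
    "GPT-4 (reported)": 1.8e12,
    "Gemini Ultra (reported)": 1.5e12,
    "Llama (7B version)": 7e9,
    "Phi-2": 2.7e9,
}

# Assumption: each parameter is stored as a 16-bit (2-byte) number.
BYTES_PER_PARAMETER = 2


def weight_size_gb(parameter_count: float,
                   bytes_per_parameter: int = BYTES_PER_PARAMETER) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return parameter_count * bytes_per_parameter / 1e9


for name, count in PARAMETER_COUNTS.items():
    print(f"{name}: ~{weight_size_gb(count):,.1f} GB of weights")
```

Under that assumption, Phi-2 works out to about 5.4 GB and a 7-billion-parameter Llama to about 14 GB, while a 1.8-trillion-parameter model would need roughly 3,600 GB. That gap is the practical reason one class of model can sit on a laptop or phone and the other cannot.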
So now let's talk parameters.
Right? Because that is essentially the difference. There's a lot of intricacies in how they play out differently. But the first thing that we look at to separate large models from small models is parameters. So I just gave you the examples, you know, 1.8 trillion versus a couple billion. All right. So here's what a parameter is in very simple terms. All right. So again, I'm sure if you're a machine learning expert with, you know, a decade of experience, you know, you might
have qualms with my definition here, but we're talking to the everyday person.
We're trying to simplify this.
So in very simple terms, the parameters in a language model are the variables that the model uses to make predictions.
All right.
And each parameter represents a concrete part of the model that can change or adapt based on the
data it's trained on.
All right, that's an important piece, right?
Because all of these models and all of the parameters that they contain are trained,
right?
So which is why obviously these large language models are much more expensive to create.
They're much more expensive to upkeep.
They're more difficult to train, right?
Because they are so much more complex, right?
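To make "parameters that change based on the data" concrete, here is a toy sketch. It is not how a language model is built. It is a model with just two parameters, `weight` and `bias`, trained with plain gradient descent on made-up data. The function name `train_line` and the training data are hypothetical. A language model follows the same basic idea, just with billions of these numbers instead of two.

```python
def train_line(points, steps=2000, learning_rate=0.05):
    """Fit y = weight * x + bias by gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0  # the model's two parameters, before any training
    n = len(points)
    for _ in range(steps):
        # How much the error would change if each parameter moved slightly.
        grad_weight = sum(2 * (weight * x + bias - y) * x for x, y in points) / n
        grad_bias = sum(2 * (weight * x + bias - y) for x, y in points) / n
        # Nudge each parameter in the direction that reduces the error.
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Toy training data generated from the rule y = 3x + 1.
data = [(x, 3 * x + 1) for x in range(5)]
weight, bias = train_line(data)
print(f"weight={weight:.2f}, bias={bias:.2f}")
```

After training, the two parameters should land very close to 3 and 1, the rule hidden in the data. More parameters means more numbers to adjust on every training step, which is part of why the biggest models cost so much to train and maintain.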
It's like the way I like to think of it is you can, if you know how to work with a large language model,
you can essentially ask it anything, right, in the history of existence.
And if you know how to work a large language model, you can probably get a pretty decent answer.
Not necessarily the case with small models, right?
Small models are generally trained in different, you know, categories of work or different,
you know, different types of outcomes you may want, right?
So as an example, a small language model may be trained or built specifically for a type of
customer service, right, to handle just, you know, inquiries from customers, right?
And it might be fine-tuned for a specific use case like that.
Right.
So if you have a small model that's maybe for customer service, right?
Maybe it's an open source model that you can tweak yourself.
But let's just say that, right?
That small language model that's built specifically and tailored and fine-tuned for
customer service to respond to customer inquiries, you aren't going to be able to code
on that, right?
You're not going to be able to develop a website, right?
You're not going to, it's not going to be able to spit out images for you, right?
That's what large language models do, right?
It's the multi-modality of large language models, right?
Being able to input photos as prompts, being able to output, you know, different types of code in multiple languages, all of these things.
A lot of times, small language models are not like that, right?
They're built for one very specific purpose.
Or they are just made for smaller purposes.
Like, okay, this small language model excels at creative writing.
Right. So can it create an outline of, you know, how the stock market has changed over the last
30 years? Maybe, but that's not really what it's for, right? So again, different use cases,
different types of training, different parameters. It's a complete difference in what a
large language model and a small language model should be used for. All right. And hey, so great, great
question here from Woozy. So, do you have any specific products you like that are being run entirely with
small models. I do have some examples, Woozy, but I'll try to share those in the newsletter so I don't
go off track here. All right. So now let's talk about 14 things that you should know about small
language models. All right. And again, I think most of us, myself included, are more familiar with
large language models. So as we go over some of the
facts and things to know, I'm going to be comparing small models to large models, right?
Because that's what most of us know. You know, so think the large models, the GPT-4,
the Gemini Ultra, Anthropic's Claude, et cetera. So some of the most important
differences, all right? So small language models require less computational power, making them
more accessible for users with limited hardware resources. All right? That's the most, one of the most
important things, right? These small models can live locally on devices, right? The new Samsung phones
have Gemini Nano, right, which is technically a small model, but it lives on the hardware, right,
which technically requires less compute. Because when you're using large language models,
people don't like to talk about this, but they are extremely resource heavy, right? We've talked about it
on the show before, right? Every couple hundred prompts, you know, it kind of can tell you how many,
you know, what is the environmental toll, right, on all of these prompts because large language
models require a lot of compute power, you know, but small language models don't because they
live locally, right? So they're not having to, you know, essentially send your query and compute it
in the cloud, which can be very expensive and very resource-heavy, small models, not like that.
So even with that in the same vein, small language models are faster at training and inference due to their smaller size compared to large language models.
Right. They're faster. It's faster when there's way fewer parameters, especially when you're, you know, obviously using a small language model for what it's good at.
It's faster. It's on, it's on device. It has fewer parameters to look through, right? So, so think, think of it like this.
You know, what's technically faster? You know, if you had to read through
a 500-page book to learn about the history of everything, or a five-page book, right,
that gives you the history of things that maybe you care about.
It's kind of like that, right?
It's just faster.
Small-language models are faster, but obviously way less robust.
All right, some more facts to know.
Small-language models are more energy efficient, which we just talked about.
So they're reducing the carbon footprint associated with the training and the running of AI models.
That's another part.
It's not just the running, you know, large language models that are expensive.
It's the ongoing training.
There's so much compute, right?
That's why you have Sam Altman out trying to raise $7 trillion.
Yes, trillion with a T, $7 trillion for more GPUs, right?
To build a new, new class of chips, right?
Because these GPU chips power all of these generative AI models, not just your large
language models, but, you know, all generative AI models are run off these very hard-to-get,
very expensive GPU chips.
So right now, you know, we are, I wouldn't say it's a compute crisis, but, you know,
all of this computing power is scarce, it's expensive, it's resource heavy, it's in the
long run taking a toll on the environment.
So small models, I think in that regard, are important to keep an eye on.
Also, small language models can be deployed on mobile devices and embedded systems,
unlike most large language models.
Yeah.
So, again, that's your edge AI, right?
Or your edge computing.
So that's bringing these language models locally to small devices, right?
So you're seeing it on actual, you know, phones now.
And, you know, just talked about that with the new Samsung S24, having Gemini Nano.
Also reportedly, Apple, you know, should be announcing their generative AI offering in June at the Worldwide Developers Conference.
So Tim Cook just announced that, I think it was earlier this month.
So we are presumably going to be seeing a small language model in an upcoming iPhone,
or maybe in upcoming, you know, MacBooks or iMacs, right?
So that's another important thing to keep an eye on is we are seeing that as well.
With NVIDIA's Chat with RTX, right?
They just announced that this, or it was actually announced a couple of months ago,
but it was just released, you know, in the last like 36 hours,
NVIDIA's Chat with RTX, right?
So we don't know how many parameters it is, but it's a pretty,
it looks like a pretty solid, small language model
that can run locally, right? You have to have a certain NVIDIA GPU running on your computer to run
Chat with RTX, but the same thing. So this is a big shift that we're going to see because it's
better privacy as well, right? That's one of the biggest things that people are concerned about
with large language models and it makes sense, right? So not just data sharing, but training data,
right? How are these companies using any data that we upload into their systems to train their models,
right? So when you think of smaller models that run locally, they are not sending information back and forth.
Right. So it is the concept of running a language model and generative AI locally on a device.
Much more privacy, much more security, right? Just like if you're opening, you know, Microsoft Office, let's just say, and you're not connected to the internet.
That's kind of, you know, what it's going to look like in the future if you're working with a small language model on your phone or on your PC
or Mac once it comes out, right?
We're waiting on Apple.
All right, some more facts.
Here we go.
So small language models are suited for real-time applications,
such as on-device language processing,
where quick responses are crucial.
All right, next fact.
Next fact, small language models have a lower capacity
for understanding complex language nuances,
compared to large language models, right?
That's the other thing.
Large language models,
if you know prompt engineering 101,
there's really not much in the world
that you can't do with a large language model, right?
You can translate languages.
You can build advanced web applications
by just asking, you know,
a large language model to code something for you.
And you can technically build generative AI
with generative AI, right?
So the large language models are extremely complex,
and they're only going to get more powerful
and more robust as we see new models, right?
When we see GPT-5, you know, and Sam Altman at OpenAI has been saying, oh, it's going to be much better at reasoning.
It's going to be able to rationalize, you know, more multimodal capabilities, right?
So these large language models are going to get even more powerful and even more robust, right?
And even right now, large language models across like MMLU benchmarks.
We talk about that on the show a lot.
But, you know, the current version of GPT-4 and presumably Gemini Ultra, you know, are about three to four times better
on these benchmarks like MMLU than the average human, right?
So right now, large language models, for the most part, are much smarter, much, much smarter
than any one human.
All right.
So that's important to keep in mind.
So it's a big difference here.
All right, let's keep it rolling.
So another thing you need to know about small language models, they are often used in applications
where speed and efficiency are more critical than deep language understanding.
Again, tailored applications.
Small language models can also be fine-tuned more quickly and cheaply for specific tasks versus large language models, right?
Y'all, like even though GPT-4, I think is still, I think it's still more powerful, at least right now than Gemini Ultra.
You know, we might see that shift, you know, as Gemini Ultra from Google, you know, starts to get a little more stability and just a little bit more improved.
But y'all, GPT-4 is like almost two years old now, right?
Which is crazy to think about, right?
But it's also important to know that and to see the difference, right?
Presumably, OpenAI has been working on, you know, GPT-5 for years.
So these large language models are extremely expensive to create, to train, and to maintain, right?
It's like a Titanic ship in the ocean versus a jet ski, right?
You can't use a jet ski for everything, but for a specific task, a lot of times a jet ski is
much better than a huge, you know, cruise ship maybe that 10,000 people can go on.
Different, different applications, different vessels for different applications.
All right, a couple more things you need to know
about small language models.
And yes, if you do have questions,
I'm going to try to get to them at the end.
So keep them, keep them coming if you do have them.
So small language models,
they're much easier to maintain, obviously,
and update due to their simpler architecture.
Small language models can also be more easily integrated
into software and web applications
without needing extensive infrastructure.
That's an important one.
I think every, you know, all these different,
you know, web applications in software early on,
you know, just jumped on, you know,
OpenAI's models because their API was good.
You know, they've been making it cheaper and cheaper and faster to work with.
But I think we're going to see a shift here in 2024, maybe to a lot of these, you know,
pieces of software, these different web applications instead, you know, using small language
models.
Because again, as an example, let's just say, if you're building, you know, if you're a large
company and you want to, you know, have your own version of a model for customer
support. Do you need a model as big as, you know, Google Gemini Ultra or as big as
OpenAI's, you know, GPT-4? I don't know. You know, you might be better off as an example with a model
like Mistral or a model like Llama, right? Something that is maybe, you know, more limited,
more fine-tuned. You know, another thing to keep in mind, which I think is important is large
language models. People struggle with them, right? Because let me tell you this, if you're using,
and I know I always, you know, old man Wilson getting on his porch and shaking his fist at the kids
who don't know what a large language model is or how to use it. Right. So so much of what you see
on the internet and social media is, you know, oh, use my prompts. Use my prompts. You know,
here's 15 prompts that'll make you rich tomorrow. Those prompts don't work. They literally don't
work. That's not how large language models work. They're too big. They're too big, right?
If you tell GPT-4 as an example, you're a copywriter with 20 years of experience,
that means nothing, that means nothing, you know, for a large language model.
Because guess what?
It has gobbled up all of the information on the open web and closed web, works of art,
things that we don't even know about.
It has essentially the history of humankind in its data set, and it's, you know, 1.8 or
1.5, depending on what model you're talking about, you know, those trillions of parameters.
So guess what?
It's also gobbled up all this information.
That's bad information, right?
People that say, oh, I'm an expert copywriter with 20 years of experience.
Guess what?
There's a lot of people that say that on the internet that are garbage.
And they're not good copywriters, right?
So when you're working with a large language model with trillions of parameters
and you think you can use these copy and paste prompts and get great outputs, no.
Would you get better outputs if you were using a small language model that is
specifically trained for copywriting or creative writing?
Absolutely.
Right. That's why I think so many individuals, so many businesses, especially early on, wrote off technologies, you know, these very powerful and robust technologies such as, you know, ChatGPT, GPT-4, even, you know, Google Gemini Ultra because they're like, oh, well, I can put one big prompt in here and it's not fantastic. It's because it's a large language model with trillions of parameters. You can't just put one prompt in and expect something great out because its brain, its big neural network,
is too big. It's too big. It's not fine-tuned for a very specific task. So this is just a small
mini-rant brought to you by old man Wilson. If you're working with large language models,
you need to understand the basics of prompt engineering, right? You need to essentially train
your chat that you're working with, right? You have to, like what Tara is saying here. This is
what we teach in our free Prime Prompt Polish course that has been taken by thousands of people,
thousands of, you know, business leaders across the world. We teach
them the basics. Most people are using large language models incorrectly. They're using
it like it's a small language model. It's not how it works. Sorry, rant over. Let's keep going.
Small language model facts you've got to know. So small language models offer a balance between performance
and resource usage, and they're ideal for many practical applications. All right. So good examples
here. Small language models can power chatbots. They can power search engines. They can power
voice assistants, whereas large language models are advanced and used for every single task.
All right.
Last couple facts, you got to know here.
You got to know.
Small language models can be used in both cloud-based services or by downloading them.
Yeah?
So that's the thing.
You can't download a large language model.
It's not how it works.
I don't know if there's a single, you know, computer, GPU, you know, on any one physical
device that you can download the entirety of, you know, like a GPT-4.
There's been people that have, you know, forked it and they've created smaller versions
of these large language models.
But, you know, for the most part, small language models, the big, big thing to keep in
mind is, yes, they can be downloaded.
They can be cloud-based as well, right?
So there's great resources out there.
We'll mention them in the newsletter.
You know, Hugging Face is probably one of the leading resources for, you know, working
with and downloading small language models, and you can run them locally on your machine,
right? You don't have to be a tech expert to experiment and to download and to install
small language models, because that's what they're for. They're for, you know, on-device use
for very specific use cases. All right, so I'm going to get to your questions here,
but we're going to wrap up and I'm going to ponder with you. Let me know if you're joining
me live, what do you think the future is for small language models?
I'm not 100% sure, right?
I talk about large language models every day.
I read about small language models.
Use all kinds of models.
So I'm very curious about what the future of small language models is.
I think what we're seeing as an example with Samsung and Google teaming up to bring Gemini Nano to the S24,
to a mobile device, that's huge, right?
So I think the future of small language models actually is going to kind of rely heavily
on the successes or failures of these first couple large-scale commercial rollouts, right?
So even if we could count on our hand, a handful of, you know, we could call them, you know,
highly visible small language models.
So you have your, you know, your models from Meta, right?
Your Llama models, very popular right now, very popular,
you know, for people to run these models locally.
You know, just talked about Gemini, Gemini Nano.
I think we also have to talk about NVIDIA's Chat with RTX, right?
And you can use other models on Chat with RTX, which is great.
Same thing.
You can upload your own documents.
You know, we haven't even talked about, you know, the power, the power of RAG, right?
The power of RAG, which is something that is kind of tied in with these small language models.
So it's retrieval-augmented generation combined with small language models.
So essentially bringing in your own database of information and combining it with small language
models, you know, then you can, I think, bypass so many of these security and privacy
concerns, right?
And you can work with a small model that works on a device.
It's faster.
It's more efficient.
It's cheaper.
And then if you can bring your own data in and work with it in a secure fashion,
I think the future of small language models is extremely promising, right?
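As a rough illustration of the "bring your own data" idea described here, the following is a minimal sketch of the retrieval step in retrieval-augmented generation. It is not how any product mentioned in this episode works: the notes, the function names, and the simple word-overlap scoring are all assumptions made for illustration. Real systems typically use embeddings and a vector index, and the assembled prompt would then be handed to a locally running small language model, which is left out here.

```python
import re
from collections import Counter

# Hypothetical private notes standing in for "your own database of information".
NOTES = [
    "The quarterly sales report is due on the first Friday of April.",
    "Office wifi password rotates every ninety days.",
    "The sales team meets every Tuesday at 10am.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Count query word occurrences that also appear in the document."""
    query_counts = Counter(tokenize(query))
    doc_words = set(tokenize(document))
    return sum(count for word, count in query_counts.items() if word in doc_words)


def retrieve(query, documents, top_k=2):
    """Return the top_k documents ranked by word overlap with the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]


def build_prompt(query, documents, top_k=2):
    """Assemble a prompt that places the retrieved notes ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    # The resulting text is what would be passed to a local model (not shown).
    print(build_prompt("When is the sales report due?", NOTES))
```

Because both the retrieval and the model can run on the same machine, the private notes never have to leave the device, which is the privacy point being made here.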
It's almost like, I think, kind of, you know, the large language models are kind of like
the Trojan horse.
You know, it infiltrates all our daily lives and we see how powerful they are.
And we all start using them.
Hundreds of millions of people are using large language models on a daily basis now.
But then we are also then concerned, right?
And we didn't know, I guess, like some of the best marketing maybe for small language models
is large language models, right?
And people using them incorrectly because we see this robustness
and how powerful models in general,
language models are, right?
To turn unstructured data into something that we can use
and we can create with.
It's extremely powerful.
But we are also now, over the last 18 months,
we've been exposed to the downside, right?
And now we're becoming more cognizant of privacy and trust.
I think small language models, it's kind of in a wait and see, but I do see them gaining popularity,
you know, with the, you know, Gemini Nano, with the new S24, with whatever Apple is going to be
announcing, with Meta's open-source local models, and also with Apple, right? Presumably,
we're going to see some sort of small language model with Apple, you know, it could be a
large language model as well. It could be a combination, but presumably we're going to have
some, some edge AI, some on-device large language model in a future
Apple offering. So I do think that working more with small language models is going to be the future.
All right. Let's see if we can get to a question or two, maybe a couple comments. Let's see here.
So Juan says, what's your large language model of choice for everyday use, emails, productivity, etc?
Great question, Juan. I am still team ChatGPT, through and through, right? At least for most of, you know,
our team's needs, ChatGPT is great. Again, but we're not working as much with many other people.
We're a small business. So we're not working as much with confidential documents, right? I think if
we were, we might be looking at some of these small language models right now. But at least
right now, I still think while plugins are there, right? Yes, OpenAI will likely be phasing
plugins out soon. But right now, the ability to, you know, use what we call plug-in packs,
which are essentially mini agents, right? You know, when you
enable any three plugins at once, put a prompt in, and those three plugins can work with
each other autonomously. People don't know this. People don't understand how powerful plugins are.
And then also, with the new feature from OpenAI, the GPT mentions, and you can
only use one at a time, you can have your three plugins almost working like mini agents in a chat and
then also mention any GPT, whether ones you create or from the GPT store. At least for me,
I don't see any other large language model right now that offers that sort of flexibility,
not even Gemini Ultra right now.
Not all workspace accounts have access to all the features that Gemini Ultra has.
So at least my take right now, that's the best large language model.
Let's see.
Tanya: can you give an example of a prompt that gives the results you want?
So Tanya, I'm not sure. You'll have to ask me a follow-up question.
Maybe that's something we can answer in the newsletter.
So Tara,
Tara asking what is the best PC or Mac for tinkering with a local model?
My 2018 MacBook Pro wants to retire on me.
Yes, that's a good question.
So you're going to want to look at probably, you know,
laptops that were honestly introduced in the last like three to six months.
So we'll have a list.
We'll have a list in the newsletter today.
I don't have a list off the top of my head.
But I do know, as an example, Microsoft did just release a new version of their Surface laptop that can run models locally.
You know, we've already mentioned a couple of, you know, phones that can run small models locally.
So, yeah, we'll have a complete list of kind of different PCs right now or Macs that can run these models,
because, yes, you need newer devices with new CPUs or GPUs, very powerful, you know, processing.
So, yeah, you're going to need probably something that's come out in the last three to six months in order to, you know, really leverage this.
All right.
That's it, y'all.
I hope you enjoyed a somewhat, you know, random look, right?
We kind of went all over the place on this one.
But I gave you 14 facts you need to know about small language models.
We talked about the big differences.
We talked about what they are from parameters and the future.
So if you want more on this, we're going to break it all down in our daily newsletter.
So go to YourEverydayAI.com.
Sign up for that free daily newsletter.
We're going to get to some of the questions we couldn't get to live and more,
as well as more AI news, more fresh finds from across the web, our daily tutorial.
Check it all out at YourEverydayAI.com.
And join us tomorrow.
Join us tomorrow.
We're going to be talking about how AI is a creativity enhancer
and not a creativity replacement.
So make sure to join us tomorrow and every day for more everyday AI.
Thanks y'all.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com
and sign up to our daily newsletter so you don't get left behind.
Go break some barriers and we'll see you next time.
