Everyday AI Podcast – An AI and ChatGPT Podcast - EP 413: Real-Time AI Search Battle - ChatGPT Search vs. Perplexity vs. Google vs. Copilot vs. Grok
Episode Date: December 3, 2024

AI is taking over search. Whether you love em or hate em, LLM-powered searches are coming for your devices.
↳ ChatGPT Search and its Chrome extension.
↳ Google's AI overviews.
↳ Perplexit...y.
How we interface with the web is quickly changing, so you need to know how each platform works and the pros and cons of each. We break it all down for you.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Ask Jordan questions on AI search
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
1. Significance of AI in Search Engines
2. Big Tech Developments in AI search
3. Competitive Landscape in AI Search Solutions
4. LLM AI Search Evaluations

Timestamps:
03:10 Daily AI news
07:00 Why real-time search is relevant
11:08 ChatGPT Chrome extension alters search provider warning.
14:04 AI enhances search, Google AI overview expands capabilities.
17:25 Exploring AI in business with Microsoft WorkLab.
20:44 Toggle search in chat for scoring sports.
25:11 Real-time Twitter info can be problematic.
26:22 Grok passes, Meta AI offers free version.
31:34 Prefers concise chat over tedious weather apps.
34:35 Judging news stories objectively, despite biases.
38:14 AI stocks and news possibly outdated.
40:45 ChatGPT's new search suggests Chicago pizza places.
42:50 Appreciate ChatGPT's location maps feature.
46:05 Apple has largest market cap at $3.6 trillion.
50:23 Evaluating options to resolve potential ties.
54:46 Perplexity struggles with hallucinations and inaccuracies now.
56:24 Perplexity struggles to answer specific product queries.

Keywords:
AI tools, search capabilities, sports search queries, Chat GPT, Perplexity, Gemini, Microsoft Copilot, Grok, Meta AI, evaluation criteria, NVIDIA's stock price, Everyday AI, Jordan Wilson, AI impact on internet usage, AI News, Intel CEO resignation, Ted Cruz and AI Regulation, World Labs Innovation, Google's market share, AI-enhanced search technology, OpenAI's ChatGPT Search, AI in search engines, Competitive landscape in AI search, Microsoft's WorkLab, AI customer service, Perplexity Pro, AI and phone search bars, Large Language Models (LLMs), User interaction, AI performance evaluation.

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)
Transcript
This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
How we use the internet is changing.
And it's changing because of artificial intelligence and large language models.
And you may think this is a small thing, right?
Either a minor upgrade or a minor inconvenience,
depending on how you look at it.
But it's actually going to change everything because AI search is going to
start to infiltrate, and it already has, every aspect of our work and personal lives. Everything from
traditional Google searches to large language models to voice assistants. Things are changing very
quickly. You need to know what's going on, how it works, and some of the limitations. So we're going to be
going over that today live and talking about real-time AI search and putting together a live head-to-head battle of all the
biggest juggernauts here in this space: ChatGPT, Perplexity, Google, Copilot, Grok, and even
Meta AI. So I'm excited for today's episode. So what's going on, y'all? My name's Jordan
Wilson, and welcome to Everyday AI. Before we get started, I have to give a quick shout-out
and thanks to our partners at Microsoft. So why should you listen to the WorkLab podcast from Microsoft?
Because it's the place to find actionable insights to guide your organization's AI transformation.
Tune in to learn how to think more creatively about unlocking the full potential of the technology.
That's W-O-R-K-L-A-B, no spaces, available wherever you get your podcasts.
All right.
So another place to get your podcast and everything on AI is here.
Welcome.
Like I said, my name is Jordan.
I'm the host, and this thing's for you.
It is a daily live stream podcast and free daily newsletter helping everyday people like
you and me, not just keep up with what's happening in the world of AI, but how we can
use all this information to get ahead, right?
I get it.
It's hard.
Like you feel like you're constantly being left behind.
That's what our daily show is all about.
But it starts here.
If you're on the live stream, listening on the podcast, make sure to check out your show
notes because where you actually go to grow is YourEverydayAI.com.
There you can find now like 420 plus episodes of everything in the world of generative
AI.
Go watch, listen, read them all there.
It is free. It is the number one resource of free, unbiased information on everything AI for beginners.
So make sure you go check that out and sign up for our daily newsletter.
All right. So before we get started, we're going to start off as we do almost every day by going
over the AI news. And hey, live stream audience. Give me, please, a specific business related query
that you normally Google search. We're going to use those live later. All right. So some big AI news
today. So the Intel CEO has stepped down amid struggles. So Pat Gelsinger, Intel CEO since 2021,
has stepped down as the company faces challenges in the rapidly evolving AI sector. So Intel has
struggled to compete with Nvidia, whose GPUs are preferred for AI tasks due to their
efficient parallel processing capabilities. Under Gelsinger's leadership, Intel's stock has
declined by 61%, while NVIDIA's has soared over 820% during the same period.
A little bit of a difference there.
So Intel's market value is now approximately $100 billion, significantly overshadowed by
NVIDIA's $3 trillion valuation.
So Intel's restructuring efforts, described as the most critical since its inception in
1968, have yet to restore investor confidence.
This leadership change highlights the urgent need for Intel to innovate and adapt to maintain
relevance in the AI-driven tech landscape.
All right.
Next piece of AI news, Republican Senator Ted Cruz has called for an investigation into whether
European governments are influencing U.S. laws on artificial intelligence, according to Reuters.
So Cruz's concerns arise as European countries led by the EU have implemented comprehensive
AI regulations, including the AI Act. In a letter to U.S. Attorney General Merrick Garland,
Cruz criticized the Biden administration for allegedly collaborating with foreign governments
on AI policy. Cruz described European AI regulations as bad and accused them of being
influenced by the quote unquote radical left. So, you know, it's going to be really interesting
to see the potential AI policy being politicized in the
coming weeks as we go through a change in administration here in the U.S.
And then last but not least, World Labs, well, we finally know a little bit more of what
they're actually going to be doing.
So World Labs is a startup founded by AI pioneer Fei-Fei Li, whom many call the godmother
of AI.
And it's launched an innovative AI system that transforms a single image into an interactive,
explorable 3D scene.
This is wild.
So this technology stands out
because it allows users to step into
and explore 3D environments generated
from images, enhancing interactivity and
modifiability.
Unlike many existing AI models
that create 3D environments with consistency issues,
World Labs' system maintains the integrity
of generated scenes,
adhering to basic physical laws.
The AI-generated scenes can be explored via a demo on World Labs' website, featuring live rendering
and an adjustable depth of field for enhanced realism.
Despite its advancements, the system is in an early preview stage with limited exploration
areas and occasional rendering errors.
So, yeah, World Labs has already secured $230 million in VC money from notable investors,
and, you know, it's only been out there for a few quarters. It's aiming to release its first product by
2025. All right, y'all. So let's talk about it. And hey, thanks, thanks for the questions
coming in so far. You know, so Denny, maybe we'll get to yours. But please, if you have questions
on how you would normally use, you know, the web at your job, right? What are queries that
you're normally, you know, looking up on a daily, weekly, monthly basis? Today's show is
going to be a live one. All right. But first, I want to set the scene
on why this real-time AI search battle is relevant for all of us.
Well, I think ultimately it's about our devices, right?
Because how we use our devices, whether that's a computer, whether it's a mobile phone,
it influences a lot, right?
And that generally starts with a search engine, right?
And we actually are going to have a show tomorrow talking about this a little more.
So we kind of have a one-two combo here for you.
So if this is up your alley, make sure to tune in tomorrow as well.
But there's been a lot of recent updates from all the companies in big tech.
And I've noticed over the coming or sorry, over the last few weeks, this growing trend.
So one, large language model makers reportedly getting into hardware, right?
And then your traditional big tech search companies continuing to prioritize AI results, right,
and AI real-time information.
But where do you start?
What do you use?
We're going to get into this.
But let's quickly go over what's new.
And y'all, I might have some takes on this, right?
It's hot take Tuesday.
So let me know if I should take it easy on everyone or if I can ramp it up, all right?
It's up to you guys, as always.
So over the last few weeks, there's been a lot of recent updates.
So one of the biggest ones has been from OpenAI.
So previously, they had a standalone prototype called SearchGPT, right? It wasn't available to a lot of
people, but it was kind of a standalone product. And then, you know, about five weeks ago,
actually, OpenAI finally released this to the public. And it's now just called ChatGPT search,
right? And if you're a longtime listener of the show or reader of the newsletter, you actually knew
about this. This actually was possible before ChatGPT search with some clever prompting.
OpenAI has quietly had a Perplexity type of clone or a Perplexity type of product for about
eight months. No one knew because they didn't really announce it, but it's been out there.
Right. But now officially we have ChatGPT search that was unveiled. I believe it was October
26th. And this is now an integrated search engine inside ChatGPT.
So it was a standalone product. Now it's integrated and it's grabbing real-time data
and positioning itself as a competitor to Google's search dominance. We haven't seen the
latest statistics from 2024 yet because the year's still going. But in most years,
Google controls more than 90% of search revenue.
And depending on what study you look at anywhere from like 75 to 85% of search traffic.
So everyone's trying to get to where you start, right, that query, how you start your day.
Do you go to a website?
Do you go to Google.com?
Do you go to a start page and just search from there?
So much of those ad dollars and how companies actually make money depends on the search bar.
And that's why we have to pay attention to these updates.
Another big one from OpenAI, well, they released a Chrome extension.
Sounds small, but it's not, right?
It's a sneaky way that OpenAI is allowing potentially hundreds of millions of users to skip Google, right?
So if you enable OpenAI's, or ChatGPT's, Chrome extension, as an example, if you have Chrome or Edge,
because all of these work in Edge as well,
and you go and, you know,
normally you might put something in the URL bar,
which is also a search bar.
Now it just bypasses Google,
and it just launches straight into ChatGPT.
And here's why I'm like,
okay, I know Google's worried about this.
Everyone is, because now you get a little prompt.
So the first time after you install the ChatGPT Chrome extension,
it says,
did you mean to change your search provider?
Google's like,
yo, are you sure you want to do this?
Or do you want to change it back to Google?
Right?
So everyone's aware that there's a battle for users and where they start.
All right.
All right.
Kathleen and Tara said bring a little fire.
We'll try.
All right.
So Google's scared.
They should be, right?
Because, since you guys wanted some heat,
I think Google was grossly behind everyone else when it came to pairing up
their large language model Gemini with real-time search.
We have literally more than a thousand videos on our YouTube channel, right?
So you can go subscribe to that.
But we showed you, once Google Gemini was released and probably in the six months after,
it had a real hard time connecting to Google, connecting to real-time information.
And the early AI Overviews from Google were a little sketchy.
I think they've really improved.
So you've got to tip your hat
to Google, because I think they've improved recently, but we're going to be doing it live.
All right, here's what's also happened in the last few weeks.
Google has expanded AI overviews.
We're not going to be comparing those because they're a little different.
Maybe that's another show for another day, but they've expanded AI Overviews in search
to over 100 companies, sorry, 100 countries, right?
Whereas before, I believe it was seven when it first rolled out.
So bringing these AI-generated summaries to everywhere, right?
So even if you're not going into Google Gemini as an example or ChatGPT, you're probably
already using this AI search, right, with Google AI Overviews.
Once they, you know, they had some bugs early on, but they're pretty, pretty decent now.
Right.
It's, it's, you know, kind of a maturation of what Google has always done, right?
There's always something called, you know, in the SEO world, like position zero, right,
where essentially it wasn't an answer, right?
It was kind of like a precursor to AI search right now.
If you put in a certain query and Google had an answer,
you know, there's something, they might call it a knowledge graph or something like that,
but it essentially answered it there for you.
But now with the AI Overviews from Google,
it's really expanded, you know,
bringing those AI search capabilities even to Google's homepage.
We're not going to be comparing that live today, but pretty big.
Some other big recent news in the last few weeks,
The Browser Company has previewed Dia, an AI-focused
browser set to launch in early 2025. Cohere, right?
We don't talk about Cohere a ton here on Everyday AI, but when it comes to enterprise
companies, Cohere is a force to be reckoned with.
So they just released and previewed Rerank 3.5, which is an AI search model that processes
queries in over 100 languages, improving enterprise search accuracy by 30% according to them,
right. And then a couple other ones. Perplexity.
Right. So now Perplexity is bringing shop with Perplexity, right? And also their CEO has said that they're going to be developing hardware, right?
I started the show by talking about this. I think ultimately it's about hardware and pairing AI search with the hardware.
We've also seen, you know, reports over the last year that OpenAI CEO Sam Altman is working with former Apple designer Jony Ive
on a hardware device as well.
So you have hardware,
traditional hardware companies going into search,
and kind of traditional AI or large language model companies
putting out AI search products and hardware,
right?
So everyone's competing for where users start.
Microsoft and Bing aren't any different.
Bing just launched their generative search experience.
So you have not just Copilot,
but in traditional Bing,
you're getting these AI-powered
searches as well.
And Meta, yeah, every single big tech company.
I'm not kidding.
Meta announced the development of a proprietary search index for its AI chatbot to
reduce reliance on external platforms like Google and Bing.
That is making their own search engine.
Everyone wants to control where you start your day.
Right.
And if you're not using an AI, you know, or a large language model solution, they're coming for hardware and traditional search engines.
So if you're still going old school, you need to stop, right?
You need to stop.
But first, what you need to know is what works where.
All right.
So live stream crew, like I said, please give me a
suggestion or two. What is something that you routinely search the web for? All right. So I have a couple,
a couple ones here. So please get them in. But I also, if possible, it should be something that's
easy to grade. Right. So, hey, although I love this one, Fred, Fred says, how will quantum computing
impact AI energy needs? Hopefully it's something that I can easily say like, yes, no, or maybe.
Right.
So something that maybe you've gotten hallucinations with in the past,
maybe something that, you know, it's definitive, right?
Yes or no.
Is it correct or is it not?
All right.
So before we jump in to this live demonstration and podcast audience,
I'm going to do my best to visually describe what's going on,
but we're doing a live throwdown of searches that you would just normally do.
But before we get started, I have to give one more quick shout-out to our partners
at Microsoft WorkLab.
So why should you listen to the WorkLab podcast from Microsoft?
It explores the questions business leaders are asking.
How can they guide their organization's AI adoption journey?
How can AI help them maximize value and create new products and business models?
How should they help their teams reskill for this new era of work?
Why is it important to be completely transparent about when and how you use AI?
Find the answers on WorkLab.
That's W-O-R-K-L-A-B, no spaces, available wherever
you get your podcasts. All right, y'all, let's do this live and let's kind of talk about the real
time AI search battle. All right, let's get into it. So live stream audience,
please let me know if you can see my screen. Hopefully this works as we're jumping around
here. But let me just quickly describe what we're going to be doing here. All right. So we have
some very simple, and I'm not even going to say these are prompts, right? We're doing
traditional searches because I think people, when they think of large language models,
you think of prompting something.
You think of having a conversation.
Yes, that is the correct way to use it.
However, as I just talked about, every one of these big companies is coming for
your traditional search queries.
And I do think that there's two different things, right?
I think a traditional search query might be a, you know, two to five word query that's something
that you just care about.
Whereas if you're working with a large language model, you're probably having a
back and forth iterative conversation and using more words.
But in this case, we are doing very short traditional search phrases.
And we're seeing how all of these new products and new updates that we've been talking about
across these different stacks here, handle them.
So like I said, we're going to be doing ChatGPT search.
And I'm using paid versions for all of these except Meta AI.
So we're going to be going over ChatGPT search, Perplexity, Gemini, Copilot from Microsoft,
Grok, yeah, Grok, and Meta AI.
And we're going to be running the exact same simple search queries.
Again, this is not a prompt engineering head-to-head.
This is taking a look at how these AI companies and big tech companies are changing our search queries.
All right, so we're going to start with sports.
And we're going to do the same search very quickly, all right, because I don't want this to accidentally drag on for an hour.
And all I'm doing here is when I'm using ChatGPT, I'm going to toggle on the ChatGPT search.
Although generally, if you use a query where ChatGPT thinks, oh, I need to search the web, you know, some of the time it will automatically understand that it needs to search the web.
But just so everyone knows, we're going to be using the same chat window for each of these,
which shouldn't technically throw anything off.
It just makes it a little bit cleaner.
And we're going to be toggling the search option on inside of ChatGPT.
All right.
So we're doing, our first one is sports.
So we have our scorecard.
And we're going to give them all essentially a zero, a five, or a
10.
All right.
For the most part, if I see
that there's, you know, some gray area, we might do other scores. But for the most part,
a zero is a hallucination or they get it wrong. A five is like, okay, that's
partially right. And then a 10 is like, okay, that's
pretty much right. All right. So like I said, if you do have other ideas, get them in now,
because we're about to fly through this. All right. So here we go. Our first kind of prompt,
we're just saying "bears upcoming schedule." All right. I'm going to be a lot. I'm a
Bears fan. Any Bears fans in the live stream audience? We're getting a new coach, which is great.
The Bears find great ways to lose. All right. So we're going to do this very quick. We're also using
the Pro version of Perplexity. All right. So that's all we're doing, just the words "bears upcoming
schedule." Nothing else. And I'm quickly jumping between all tabs here. All right. And we're going to keep
tabs on: did it get it right? Did it get it wrong? All right. So let's see.
So ChatGPT, the new ChatGPT search, love it, very visual.
So we have this kind of quadrant of their next four upcoming games, including team logos.
There's links.
I can go down here, click the sources.
It pops the sources out on the right hand side.
So if you haven't used ChatGPT search, I'm not going to do a whole overview here,
but I want to describe it to our podcast audience.
So obviously it got this right, did a good job too.
All right. So the answer is the next game. The Bears play the 49ers Sunday at 3:25 p.m. Central time.
All right. Let's take a look at Perplexity. Did Perplexity get it right?
Ah, kind of. So here's the thing. All I said is "bears upcoming schedule."
What Perplexity did, and it requires a lot more scrolling, it didn't really deliver me the answer that I wanted up front.
So what it did, which I don't like, it just gave me their entire season schedule.
So it's like, this is technically outdated.
So not only did it not put the upcoming game first, because that's what I asked for.
It didn't give me the upcoming game.
It just essentially gave me information that looks like it could be eight months old.
It's not updated.
It doesn't give me the game first, right?
And there's no scores.
So these games are in the past, right, from September.
And there's not even results, right?
I would get it if it at least gave the results,
but not very good, all right.
But it at least does have it there if I scroll down.
So not that great from Perplexity, if I'm being honest.
I don't know if I should give that a five or a zero.
Live stream audience, I'll let you vote.
Should I give that a five or a zero?
Let's go to Gemini.
Gemini straightforward here, right?
Gave me nothing else.
Just answered my query.
Again, a visual. It just gave me the Bears-49ers game and then
the next two games.
Straightforward.
I like it.
Gemini passes.
Great.
Their old version of Gemini, like six months ago, was terrible.
It could not connect in real time to anything on the web, which I've warned people about.
I was like, if you're trying to use the front end of Gemini chat right now, stop, because it can't
even talk to Google.
So like I said, you got to tip your cap when the company improves.
All right, Microsoft Copilot.
So no visuals here.
Yet it did give me the next game,
okay, as well as the next upcoming games and two different sources that I can click on below.
So got everything right, pretty straightforward.
I'd say it passes with a 10.
Grok, here's the wild card, right?
So if you pay for, I think it's like $7 a month for Twitter's premium version, you get Grok.
I'm not a fan of Grok.
I don't think it's very good.
But one of its biggest, you know, value props is it's great
with live information, right?
So let's see how it does.
So fortunately slash unfortunately, it searches both posts, or tweets, on Twitter.
I'm not going to call them, like, X's on X.
I don't know what that means.
So it searches both recent Twitter posts as well as web pages, which is a good
and bad thing because depending on what you're searching, and I've done plenty of examples of
this, sometimes it brings up just a bunch of misinformation and essentially tweets
from spam accounts, right?
Which, using real-time information from Twitter,
depending on what you're looking for, is problematic.
All right.
But for this, the Bears' upcoming schedule, it got it right.
So there we go, 49ers.
It put it in Pacific time, but that's okay.
It's still accurate.
It gave me week 15 and 16.
And then I can, if I want to, I can click on the post
and see the posts that it grabbed from as well as the web pages, right?
because that's important, especially things with like transparency and trust.
You have to know where it's getting this information from.
So I'd say Grok passes with a 10.
And then let's see Meta AI.
So a lot of people don't know, but Meta AI, you can just go to
meta.ai, right?
You can use it without being logged in, but it won't save your chats.
This is free.
So this is the only free version.
I'm using technically the paid or premium version for everything else,
including ChatGPT, Perplexity, Gemini,
Microsoft Copilot and Twitter's Grok, but the free version here for Meta AI.
So it uses Google on the back end, right?
So let's see, it did a good job here.
Just no, no visuals.
So, you know, Meta doesn't give you this kind of rich snippet type search results,
just straight text, which isn't necessarily a bad thing.
All right.
So it got it right.
It said, you know, the next game there. Up top, it gave the 49ers.
So a 10 from Meta.
All right, so here's our scorecard so far.
Let's see.
What did everyone, what did everyone say?
Ted said that, not Gemini.
It should have been Perplexity.
Oh, Perplexity is the one we're figuring out.
Sorry.
So a lot of people said five for Perplexity.
All right.
So Perplexity gets a five.
Everyone else got a 10 in that one.
So it passes.
All right.
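For readers following along, the running scorecard can be kept with a few lines of Python. This is only a sketch of the rubric described above (0 for wrong or hallucinated, 5 for partially right, 10 for essentially right, with in-between scores allowed for gray areas). The function name add_round and the dictionary layout are illustrative assumptions, not anything used on the show.

```python
def add_round(totals, round_scores):
    """Add one round of 0-10 scores to the running totals and return them.

    Hypothetical helper for illustration; the show kept its scorecard by hand.
    """
    for engine, score in round_scores.items():
        if not 0 <= score <= 10:
            raise ValueError(f"{engine}: score must be between 0 and 10, got {score}")
        totals[engine] = totals.get(engine, 0) + score
    return totals


# Round 1 ("bears upcoming schedule") as graded in the episode.
sports_round = {
    "ChatGPT search": 10,
    "Perplexity": 5,
    "Gemini": 10,
    "Copilot": 10,
    "Grok": 10,
    "Meta AI": 10,
}

totals = add_round({}, sports_round)

# Print the standings, highest total first.
for engine, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{engine}: {total}")
```

Each later round (stocks, weather, news, places, audience queries) would be another dictionary passed to add_round with the same totals.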
Our next question, we are
doing stocks. All right. So here's what we're doing, sports, stocks, weather, news, places,
and at least one or two from the audience. So if you have one, get it in. All right, we're going to
pick up the speed here. So now I kind of gave you an overview of what each of these platforms
give you, right? So both visually, the results, how they access the web, et cetera. So now all
we're doing is Nvidia's stock price. And this is a live stream, y'all. So their current stock price
is $138.633 per share.
So 138.63.
So let's see, let's see how everyone did here.
All right.
So first, let's go to ChatGPT.
So ChatGPT actually gives you super nice interactive graphs because of ChatGPT search.
I didn't know if you knew that.
But we got a perfect, perfect kind of score there.
And the markets are going to open here in about 30
minutes. So it's not like this is changing as we go along. So there we go. We got a good result.
We got a good result from ChatGPT. I appreciate everyone's patience here. I'm toggling,
you know, doing a live stream with many different windows is sometimes a little tricky. So
I appreciate everyone's patience here as I scroll through. All right. So there we go. Let's go to
Perplexity. So Perplexity, same thing. You get a
live real-time graph.
Love this from Perplexity.
So perfect score there.
And it says 138.63.
Straightforward from Gemini.
Right.
Gemini, you get a nice little visual as well.
Up to date.
Crushed it.
Perfect.
All right.
10.
Let's see our friend Copilot.
Let's see how Microsoft Copilot did.
No visuals here.
Straightforward, though.
Got it right.
$138.
Nvidia is
on fire. If only you would have listened to someone like a year and a half ago that said
Nvidia was the most important company in the world. All right. Grok, got it right.
138. All right. So so far, everyone's crushing it. And let's look at Meta. Meta got it right as
well. So every single kind of chatbot or AI search engine got this one correct. Pretty impressive.
All right. Next, let's do weather. All right.
So I'm going to go ahead and check the actual weather here in Chicago in real time.
So it looks like it's roughly 22 degrees.
All right.
Again, this is things that you would normally be searching for, right?
You probably have an app on your phone, but you have to think about the future.
We might not be using websites, right?
I know that sounds crazy.
Let me repeat that.
We might not be using websites.
You might just be using, you know, your large language model or something like Apple Intelligence,
if it ever gets smart, right?
you just might be using something that's powered by an AI search engine.
So right now in Chicago, it is 22 degrees Fahrenheit.
Yuck.
All right.
So let's get back to our searches and do weather in Chicago.
All right.
So we're going to do this one, like I said, very quickly.
And we are doing them all real quick.
So let's see, who's going to win, y'all?
Actually, live stream audience,
and if you're listening on the podcast, write down now:
who do you think is ultimately going to win this?
Because we're going to crank up the speed here.
All right.
So let's see ChatGPT.
And it's
hard to say, right?
Because I'm even searching right now.
I'm getting different results on, you know,
weather.com versus AccuWeather.
So as long as it's within like three degrees, right?
As long as it's not wildly wrong,
we're going to give everyone a 10 here.
Right.
So as long as it looks like, yes, it's actually accessing the information.
All right.
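The "within like three degrees" rule can be written down as a tiny check. This is a minimal sketch under stated assumptions: the function name grade_temperature and the all-or-nothing 10/0 grading are illustrative, and the looser "not wildly wrong" standard mentioned alongside it means the grades given on the show may be more lenient than a strict cutoff.

```python
def grade_temperature(reported_f, reference_f, tolerance_f=3):
    """Return 10 if the reported temperature is within tolerance of the reference, else 0.

    Hypothetical helper; temperatures are in degrees Fahrenheit.
    """
    return 10 if abs(reported_f - reference_f) <= tolerance_f else 0


# Example: a manual check shows 22 degrees and an engine reports 20.
print(grade_temperature(20, 22))
```

Because different weather sources disagree by a degree or two, the reference value itself is approximate, which is why a tolerance is used instead of an exact match.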
So ChatGPT with the new ChatGPT search, very nice visual table.
My gosh, I'm starting to hate using weather apps because you've got to like watch like a 15 second video just to get the freaking weather, right?
Annoying.
All right.
So ChatGPT says 20 degrees, very visual, very nice.
Perplexity.
We'll give this to perplexity, right?
It looks real time.
It says current, it says 18, 18 degrees.
close enough, right?
Because I'm getting reads between 20 and 21 when I check manually.
Gemini Advanced, 21.
Gemini looking nice here, right?
Who knew? Gemini is really improving for at least simple, real-time searches,
things that you might normally Google, right?
We get a nice visual here, and you can kind of scroll through the hourly forecast there.
Very nice.
Oh, let's see.
I don't know.
So co-pilot, it's pulling it from forecast.weather.gov.
It says 25.
Okay, so it says a high of 25 in a low of 14 tonight.
However, I didn't say what is the weather right now.
I just said weather in Chicago.
All right.
So live stream audience, you're the determining factor.
Do we give co-pilot a five or a 10 on this one?
Technically, it's correct.
I didn't say what's the weather now.
It gave me a high and a low.
I just said weather in Chicago.
All right, Grok.
Grok says the current temperature is 16 degrees Fahrenheit, with a high expected to reach 29.
Is it actually going to hit 29 today? I don't think so.
I mean, maybe, maybe it will.
So Grok, all right, I'm going to say that passes.
And then Meta AI.
So unfortunately, it looks like Meta... oh, let's see, Meta AI says it's a chilly day in Chicago with partly cloudy skies and a temperature of 18 degrees Fahrenheit.
All right.
So I think mostly all of them probably got a 10 unless anyone says Copilot gets a five.
All right.
Most of you guys said it passed.
All right.
All right.
So it passed.
Everyone, everyone there passed.
Yay.
Did it do better than your local weather person?
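The "within like three degrees" rule from this round can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the scoring was done on the show: the helper name degrees_off is hypothetical, the 20-to-21 reference range is the host's manual reading, and Copilot is left out because it returned a high and a low rather than a current temperature.

```python
# Reference range from the host's manual checks (degrees Fahrenheit).
MANUAL_LOW, MANUAL_HIGH = 20, 21
# Assumed cutoff, taken from "within like three degrees".
TOLERANCE = 3


def degrees_off(reading, low=MANUAL_LOW, high=MANUAL_HIGH):
    """Hypothetical helper: how far a reading falls outside the manual range."""
    if reading < low:
        return low - reading
    if reading > high:
        return reading - high
    return 0


# Current temperatures as read out in the episode.
readings = {
    "ChatGPT": 20,
    "Perplexity": 18,
    "Gemini": 21,
    "Grok": 16,
    "Meta AI": 18,
}

for tool, temp in readings.items():
    off = degrees_off(temp)
    verdict = "within" if off <= TOLERANCE else "outside"
    print(f"{tool}: {temp}F is {off} off, {verdict} the {TOLERANCE}-degree tolerance")
```

Read strictly, Grok's 16 lands four degrees under the manual range; on the show it passed anyway under the looser "not wildly wrong" standard.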
All right. Now, here's one where we might have to get a little bit specific with the grading.
So now I'm saying top AI news stories for this month.
This month is December.
So I didn't say for the past month.
I'm saying this month.
It is December.
So we're doing all of these.
And this is where I'm probably going to judge these on the fly.
I promise you I'm not going to be biased,
because I think a lot of times when you ask for news stories,
some of these large language models,
and this is something you would probably be using these models for in business,
right?
What's happening in your industry, your vertical, et cetera.
A lot of the time they give these general outputs, right?
Like, oh, you know, there's all these new advancements, right?
So if I see some of that, probably going to give it a five, right?
Maybe lower depends on what else we see.
All right.
So first, ChatGPT.
And again, you get these links at the bottom
where you can click the sources as well.
All right.
So let's just kind of see a couple of things.
So it says as of December 3rd, 2024, several significant developments.
All right.
So U.S. imposes stricter export controls on AI chips to China.
We had that in our newsletter yesterday.
Some European defense startup news.
OpenAI targets one billion users.
This is all very, very recent, which makes sense because we're only on December 3rd.
All right.
AI chatbots enhance holiday experience.
Yep.
Good.
All right.
So this is all hot, fresh AI news from ChatGPT.
All right.
And livestream audience, I know a lot of you are giving me your scores.
Just let me know, like, news, ChatGPT, right?
If you want to score it at the end, just at least let me know what category.
So if I need to like go to you for an answer, I at least know what you're talking about.
All right.
So Perplexity, top AI news stories for this month.
AWS leadership change.
Did that happen?
No, that's very old.
I'm clicking this.
The very first one right here.
Oh gosh, this is from August.
Yeah, I'm looking at this.
I'm like, that's not new.
That's old.
Yeah.
So it's giving me old information.
I said this month.
And I'm looking at these stories from Perplexity:
aside from the AWS re:Invent conference,
which is happening this week, none of this is from this month.
I'm looking.
I don't think, no.
I don't know.
To me, this, I mean, it's not a zero, but to me, that's like a two for Perplexity.
These things are not from this month.
A lot of these things are very, very old.
So Perplexity, everyone thinks, right?
I haven't been using Perplexity or talking about it as much because I've noticed a lot of
hallucinations and a lot of simple errors, right?
I've been reaching out to the Perplexity team being like, yo, what's up?
But they haven't answered my calls.
All right.
Let's look at Gemini.
All right.
So here we go.
It's Alibaba's cloud AI overhaul.
I'm checking that one.
I don't know when that one is from.
I don't know if that's from this month.
Let's see.
December 3rd.
All right.
There we go.
All right.
So Gemini, Salesforce predicts UK AI leadership.
That's new.
Amazon's big investment in Anthropic,
that was technically the end of November.
I'm just going to double check that.
I believe that was, yeah, November 22nd.
So, okay, most of these things, most of these things, I would say, are fairly close,
whereas Perplexity was just very, very old.
So I'm looking at some of these.
It's a mixed bag from Google Gemini, but much better than Perplexity.
So I'm going to probably give it like a seven.
Let me know if you guys think that's correct or not.
Going into Microsoft Copilot: top AI news stories for the month.
And here, I like this.
Copilot just responds:
Here are some of the top AI news stories for December 2024.
So it at least recognizes, yeah, I'm supposed to be giving you up-to-date information.
So the first thing is top AI stocks to watch.
I don't know.
Is that real?
I mean, it's from December 2024, so that's fine.
So it's giving me some more general things.
But I don't think some of these are very up-to-date.
So as an example, this one, AI and Nobel Prizes, I believe that was October or November.
So essentially, it's pulling from certain websites and there's dates on here.
So there's a date on this website it was pulling it from that said December.
But I believe, yeah, that is from November.
And it's actually from before that.
So the AI, let's see, AI Nobel Prize.
I think that was, oh, look at that.
I was trying to do a Google search and I have my default set to ChatGPT.
So I believe that was October or later.
Yeah, October.
Yeah.
So it's a mixed bag there from Copilot.
And it is a little more general, but some of the things are new.
So without going down each and every one, I'm guessing I'm going to do similar to Gemini.
I'm going to give it a seven, much more accurate than Perplexity.
Let's look at Grok.
All right.
OpenAI's trademark move on reasoning models.
US proposes AI Manhattan Project.
I believe that's old.
Yeah, that's in November.
And I think it was actually from before that.
Elon Musk, legal action, that's new.
AI and advertising integration.
Yep, that's new.
Okay, so pretty good from Grok.
I'll say probably a seven.
So it looks like Gemini, Copilot, and Grok were
much better than Perplexity,
but throwing in some older things from
October or November, whereas ChatGPT, at least according to my eye, only brought in things
from December.
All right.
Let's see Meta AI. Leadership transition at... oh, no, this is old.
All right.
So Meta is bringing things in from last year.
All right.
So it's OpenAI CEO Sam Altman stepping down.
Elon Musk's AI chatbot, Grok.
All right, this is all bad.
All right.
So the real-time information from Grok gets a big
fat zero because none of it was correct. All right, let's do places. All right, we're going to go quick.
So I'm doing best pizza places in Chicago, right? This is probably something that you're Googling all
the time. You want to know, you know, a restaurant or a local business to shop at. So let's just
quickly do this one and see who gets this kind of right. And I'm picking something I know.
If there's one thing I know, I know my pizza places in Chicago. So let's look at
ChatGPT first. All right. So if you didn't know, the new ChatGPT search has a very cool,
kind of interactive graph. So this is going to feel very reminiscent of Google, right,
of Google Maps. So I can also toggle to the list view. The list view keeps the map in there,
but then brings the list as well. So pretty good. Let me see it.
Brought in Lou Malnati's, Spacca Napoli, Giordano's. It brought in, okay,
Coalfire, Burt's Place.
I don't even know that.
Okay, did a pretty good job.
Did a pretty good job.
It offered not just things right in downtown because I didn't say downtown Chicago.
So it did get the kind of the whole area.
So pretty good there.
Let's look at Perplexity.
Perplexity as well gives me a little map or visual on the right-hand side that I can click on.
And similarly to ChatGPT, it gives me an interactive graph.
So it brought in all the staples, right: Giordano's, Pizano's,
Lou Malnati's.
So pretty good here.
So both of them so far, very, very good.
All right.
Gemini, no interactive map.
Gave me a great list.
So it gave me not just classic Chicago
deep dish pizza, but then also a Beyond Deep Dish section.
So great, great response here from Google.
And it also asked me some follow-up questions so it could help me better.
Microsoft Copilot, simple.
Gave me all the staples, including Pequod's.
Love some Pequod's.
All right.
So pretty good.
No hallucinations so far.
Let's see.
Grok, again, pulled in from Yelp, from TripAdvisor, as well as some different posts from Twitter.
Same thing.
Pequod's, Lou Malnati's, Giordano's; gave me all the staples there.
And then Meta: deep dish favorites, thin crust spots, unique spots.
So no lies here, no hallucinations.
However, ChatGPT and Perplexity gave me a little something more.
They gave me maps, which I like if you're thinking local.
So I'm going to give the nod, a slight nod, to ChatGPT and Perplexity.
And then I'm going to give the rest of them probably an eight, because no one was wrong.
But I think that kind of additional aspect of having the maps in ChatGPT search and Perplexity
was a little better, because if you're searching for something that is location-specific,
having the location on a map helps, versus just putting it out in plain text.
And I know some of these other options just don't have those capabilities,
but it's my show.
I can score them how I want.
All right.
Let's see.
Jackie said only people from Chicago can vote on that one.
And yes, Mark is saying Grok may be using X tweets as a source.
Yes.
So Grok uses both websites and tweets,
which might be a good thing, might be a bad thing.
All right.
Jackie says I'm being too nice to Grok.
I'm trying to be as unbiased as possible.
All right.
Let's see if we got any good ones.
We might only have time to do one,
but if you didn't get it in already,
I wanted to do one from the audience here.
So let's see if there's something that I can actually gauge, all right?
If there's a yes-or-no answer here.
All right.
Let's see, let's see some of our questions.
So Denny with quite a few.
Thanks, Denny.
What is the most recent news in a topic of interest?
What are fun things to do in blank city?
Christopher saying, you know, talking about replacing Stack Overflow. Fred, we talked about that one, quantum computing.
I can't know the answers.
How can you use the app where you have six
apps open at the same time? Samuel from YouTube saying my search from last night,
what's the secret code clearance pricing at Home Depot? I don't know how to score that one.
Okay, I like this one from Marie saying which EV vehicle is the most efficient and economical.
That one's pretty good. I should probably be able to get the answer. Maybe, maybe not.
Okay, here's one from Denny. Thanks for this one, Denny. I know this is kind of stock related,
but this one at least has a correct answer. What stock has the largest
market cap? All right. So we're just going to do one audience question and we're going to wrap this up.
All right. So we're going to do what stock has the, what stock has the largest market cap?
Yeah, I do this live, y'all. I love typing live. All right. Let's first see the answer.
All right. This is a site that I'm always at. So right now, Nvidia has a slight, or sorry, Apple has a slight advantage over
Nvidia in terms of market cap.
So it is a $3.6 trillion market cap.
All right.
So let's go ahead.
Love this question because it's easy to judge.
I am going to say, what U.S. stock has the largest market cap?
So pretty simple question.
We should, in theory, be able to get a yes or no or partially complete answer.
So I'm going through.
I'm pasting this in all of our options
here. And again, the answer is going to be Apple with a $3.6 trillion market cap. All right. So let's go
ahead and see. All right. So ChatGPT gave us a nice graph and it correctly says Apple, 3.6.
All right. So ChatGPT passes with flying colors there. All right. Let's go ahead into Perplexity.
So Perplexity says Apple, $3.6 trillion.
Perplexity gets it right.
All right.
And over to Gemini.
So I do like that Gemini said as of December 2nd.
That's good.
It says $3.59 trillion.
I don't know.
You guys, what do you think?
I mean, it's like, are we rounding here?
The actual number is 3.62.
So let me know,
yeah, what does Gemini get for that one?
3.59.
I don't want to be accused of being biased, all right?
So you guys can vote.
So whatever it gets, Copilot's probably going to get the same because it says the market
cap is $3.5 trillion, $3.55 trillion.
So Gemini and Copilot were a little off.
Let's see, Grok.
Grok, oh, okay, Grok just got it wrong.
Grok got it wrong.
This is why I tell people I don't think Grok is a serious model, because you would think, right?
Because it's like, oh, according to these 15 web pages and eight tweets... you would think by looking at 23 sources, Grok could at least get this right.
It got it wrong.
It says it's, let me just double check here.
Yeah, it says Nvidia, 3.6 trillion.
Grok gets a big fat zero, right?
Everyone's like, oh, Grok's going to be amazing.
It has real-time, you know, information from Twitter and now it can access the web.
And I've been telling people it's bad.
There's no instance where I would ever use Grok.
I'm sorry, even if I'm searching Twitter,
I'm not going to use Grok.
All right.
It's just full of hallucinations.
It's not good.
That's my opinion.
But that's why we do live tests.
All right.
Let's see Meta.
Meta, surprisingly, got it right.
3.6 trillion.
All right.
So let's see.
What did our audience vote?
Did we get any?
What should we give Gemini and Copilot?
Let's see.
Michael says Gemini gets 3.59 points for that.
That's funny.
Oh, I like this one.
We might have to do one more.
I'm sorry.
We had another good one. This one.
This one should actually be a good one.
So I like this one.
So we're going to give Gemini.
We're going to give Gemini and Copilot, I don't know, sixes.
We're going to give them sixes.
All right.
I like this one from Michael.
Yeah, I'm biased. He's asking how many episodes there are of the Everyday AI podcast.
I'm going to give it a little hint here.
All right.
Because I don't think any of them, I don't think anyone's going to get this right.
All right. So there are 400 and, oh, little error on our website. Interesting. I'll have to fix that up.
So technically it's 412, but we had a little error there where it's double labeled. So it should be four...
I'll accept 411 or 412, but I don't think any of them are going to get this right.
But let's go ahead and see. All right. So quick one. We're going into all of them
and we're just going to see how bad they are.
Maybe this is great.
So Michael,
thank you for this suggestion because if there's any ties,
I believe this one is going to separate any ties that we may have.
All right.
So let's look at ChatGPT first.
So ChatGPT says 400.
All right.
And it does say as of December 3rd, 2024.
So it's a little wrong.
We look at the sources and it's bringing in a bunch of different sources.
All right. So not correct there, but close. Perplexity, 415. That's wrong. It's saying there's more than there
actually is. So ChatGPT went under. Perplexity went over. Livestream audience, get your
votes in, right? Get your votes in on where you think all these should be, because some are
under, some are over. All right. So Gemini says 415. So Perplexity and Gemini both said 415.
Copilot says 398.
All right.
Grok says three,
oh,
bad.
Grok says 329.
That's not even close, right?
And then Meta,
Meta says 412.
Meta got this even more correct than our website.
All right.
Meta, I was not expecting that.
Was not expecting that out of Meta.
All right.
So we're going to give Meta a big fat 10.
Good job.
I don't know.
Where do we land with these other ones, y'all?
I'm going to go ahead as we wrap up.
Jackie says this was fun.
All right.
So let's see.
Grok was just very, very bad.
329.
So I don't know.
Grok, I'm going to give a two, I guess.
Okay.
I'm going to give Grok a two.
I mean, the rest are kind of close.
400 and 415.
Maybe 415 is a little closer.
So maybe we'll do Perplexity and Gemini.
We'll do maybe a six for Perplexity and Gemini.
And then we'll do a five for ChatGPT and maybe a four for Copilot, because
Copilot was a little further off.
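As a quick check on "a little closer" and "a little further off," the misses on the episode-count question can be worked out directly. This is a minimal sketch, assuming the two counts the host said he would accept (411 or 412) as the target; the helper name miss is hypothetical, and the host's actual scores were assigned by eye, not by formula.

```python
# Counts the host said he would accept as correct.
ACCEPTED = (411, 412)

# Answers as read out in the episode.
answers = {
    "ChatGPT": 400,
    "Perplexity": 415,
    "Gemini": 415,
    "Copilot": 398,
    "Grok": 329,
    "Meta AI": 412,
}


def miss(answer, accepted=ACCEPTED):
    """Hypothetical helper: distance from the nearest accepted episode count."""
    return min(abs(answer - a) for a in accepted)


# Rank tools from closest to furthest; ties keep their original order.
ranked = sorted(answers.items(), key=lambda item: miss(item[1]))

for tool, count in ranked:
    print(f"{tool}: answered {count}, off by {miss(count)}")
```

The closest-to-furthest order that falls out (Meta AI, then Perplexity and Gemini tied, then ChatGPT, Copilot, and Grok) lines up with the 10, 6, 6, 5, 4, 2 given on the show.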
All right.
So let's add up our very unofficial scoring here.
All right.
Let's see which large language model did the best in our very unofficial testing.
So did you get it right?
What'd you think as I add up here?
If you were to think, hey, I'm using one internet-connected large language model for my
business, for my career.
Are you using the right one?
What are you using?
Again, this is a very unofficial experiment here, y'all.
But I'm
surprised with some of the results, right?
Like, as an example, I was not expecting Meta to be the only one that got that
correct.
So, here we go.
I'm going to go ahead and order these in the order in which they did.
All right.
So here we go.
Our final totals.
ChatGPT search got a 65.
Meta AI in second place.
All right.
So ChatGPT search, number
one. Meta AI, number two, surprising.
Again, this is subjective. With 58.
Gemini, updated and improved.
Great job, Gemini.
57.
Copilot, a little bit behind there.
So Gemini and Copilot in the middle of the pack.
Copilot, 55.
Perplexity, surprisingly, stumbled:
53. And Grok, rough, 47.
All right.
So we're giving our medals.
ChatGPT, number one; Meta AI, number two; Gemini, number
three; Copilot, number four; Perplexity, number five; and Grok, last place, number six.
All right, y'all.
Yeah, you know what, Michael?
Michael says, I'm so surprised by Perplexity.
Their business model is search.
I think Perplexity, rightfully so, had a lot of, I don't want to use the word hype.
I'll use the word hope, a lot of promise.
And early on, I think when none of the other options could do a great job at real-time search,
I think everyone flocked to Perplexity.
One thing that I noticed, and I have done some Perplexity episodes in the past.
But if you go back and look, I've talked about it a lot less recently because I'm noticing a trend of hallucinations.
I'm noticing a trend of Perplexity missing simple things that it used to do better.
Yes, you can choose your model, right?
You can choose your model in Perplexity.
I'm using Claude 3.5 Sonnet.
But that doesn't change Perplexity's actual ability to browse the web and give you accurate information.
So the model that you choose when using Perplexity, at least in our testing, and I've done blind testing on this,
it doesn't impact the result.
So if you're saying, oh, Jordan, you used the wrong model.
You should have used, you know, Sonar Large, or you should have used GPT-4o.
No.
You get similar results.
So if it gets it right, it gets it right.
But, you know, the model that you choose, right, because Perplexity is an answer engine.
And you can choose Sonnet 3.5 or you can choose GPT-4o.
It just changes how it presents the information.
But if it gets it wrong, it's still going to get it wrong, regardless
of the model that you choose. So yeah, Perplexity, I think, has had some, if I'm being honest,
some struggles lately. And I'm a big fan. But hallucinations, they're real, right? And not
answering simple queries. I think especially since they introduced shopping, you can ask a query
and you can ask a question about a certain product. And you want an answer. And instead,
Perplexity is now, since they released this shopping, now they're just trying to push products in
front of you. And I'm asking, right, like, I'm trying to, you know, get a new dishwasher,
whatever. And I'm saying like, hey, you know, give me specs on this dishwasher. What's the
difference between this one from Samsung and this one from LG? And it doesn't even answer it. And it
just says, here, go buy it. You can buy it now in Perplexity Pro. It's like, yo, that's not what I'm
asking. So in my personal experience, Perplexity has gone, I'm not going to say downhill, but it's
gradually trending downwards in its ability to deliver answers based on your actual query,
which is important.
All right.
So I hope this is helpful.
I know this was a longer one, y'all.
So please let me know if this was helpful.
And also, tomorrow, we got a nice little one-two combination,
because we're going to be talking about this
a little bit more in depth tomorrow with our guests,
talking about the AI race toward who owns the search bar on your phone.
So if this episode, yeah, it's a longer one.
If it was interesting, if you want to dive in deeper on the insights,
make sure to join us tomorrow and make sure to share this if you found it helpful.
And for today's recap, go to YourEverydayAI.com.
Sign up for the free daily newsletter.
Thank you so much for tuning in.
Please join us tomorrow and every day for more Everyday AI.
Thanks y'all.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com
and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
