Everyday AI Podcast – An AI and ChatGPT Podcast - EP 374: AI News That Matters - October 7th, 2024
Episode Date: October 7, 2024Is Meta’s new Movie Gen better than Sora? What’s the one feature everyone’s missing in ChatGPT’s new Canvas mode? What’s with all of these new Microsoft Copilot updates? And why is Google br...inging ads to AI search? We bring you the AI news that matters. Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Ask Jordan questions on AIUpcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTopics Covered in This Episode:1. Meta's MovieGen AI Video Tool2. OpenAI Funding and Growth3. Windows 11 Updates4. New OpenAI Announcements5. Google AI Developments6. NVIDIA's AI ModelTimestamps:02:10 Moviegen: Groundbreaking AI for customizable video creation.03:56 MovieGen's promising AI not public yet.09:36 OpenAI raises $6.6B, secures $4B credit.11:22 OpenAI possibly losing billions despite revenue growth.14:46 AI memory feature for PCs launching in November.19:43 New features start rolling out this fall.22:20 Microsoft updates: Copilot Plus and voice assistant rollout.27:26 Open-sourcing AI models risks misuse despite intentions.31:24 Real-time API introduced, enhancing speech applications.35:03 Google develops AI to rival OpenAI's capabilities.36:36 Google AI advancements: Alpha Proof, Astra, AI Search.41:19 Canvas upgrade enhances coding in ChatGPT Plus.46:12 Poll on tomorrow's show topic: Canvas tips?49:57 Microsoft integrates GPT-4, enhancing collaboration tools.51:24 OpenAI funding, Windows 11 updates, NVIDIA model unveiling.Keywords:Meta, MovieGen, AI video tool, OpenAI, Adobe, Luma's Dream Labs, Runway, Pika Labs, generative AI, Jordan Wilson, Microsoft, Wave 2 Copilot, ChatGPT, GPT-4, NVIDIA, NVLM 1.0, Google, in-page editing, in-line editing, AI Overviews Update, AI news, Canvas, Anthropic, Copilot Vision, Think Deeper, Windows 11 updates, liquidity, real-time API, Microsoft Build Conference, AI assistantSend Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info) Start Here ▶️Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Cricle community and all episodes: StartHereSeries.com Also, here's a link to the entire series on a Spotify playlist.
Transcript
Discussion (0)
This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
Meet Firefly AI Assistant, now live in Adobe Firefly, the all-in-one creative AI studio.
Just describe what you want to create and the assistant handles the rest,
orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface.
You direct the outcome.
The assistant accelerates execution.
Is Meta's new movie gen better than OpenAI's SORA?
What's this one feature inside of ChatGBT's new Canvas mode that everyone's missing?
And what's going on with all of these new Microsoft co-pilot updates?
Speaking of questions, is Google changing the way that search works, at least when it comes to AI?
Yeah, a lot of questions.
questions this week if you've been following AI news and updates from big tech companies.
But don't worry, we've got answers. What's going on, y'all? My name's Jordan Wilson,
and I'm the host of Everyday AI. Thank you for tuning in. Everyday AI, it's for you. It's for me.
It's for all of us. It is your daily live stream podcast and free daily newsletter,
helping us all learn and leverage generative AI to grow our companies and to grow our careers.
So if that sounds like you, if that sounds like something that you're trying to do and who the heck isn't trying to keep up with AI to grow their company and career?
That's like all of us, right?
Well, then for all of us, Mondays are the spot.
You got to tune in.
Maybe you can't join us every single weekday Monday through Friday live at 7.30 a.m.
when our live stream goes out.
But maybe you can schedule out sometime on Monday, right?
You don't have to spend hours each and every day trying to track everything that's happening and say, hey, how does this a place?
to me, my company, my career, that's what we do for you on Mondays.
All right.
So if you're new here, thank you for tuning in.
Make sure if you have not already, please go to your everyday AI.com, sign up for that
free daily newsletter.
All right.
You know, it's written by me, a human.
All right.
So I think there's just so much AI out there.
And hey, who are the humans trying to help us make sense of it?
That's me.
So make sure you go sign up for our newsletter.
All right.
Enough chit chat.
Let's get straight into the.
AI News That Matters for the week of October 7th.
Here's a big first one, y'all.
Didn't see this one coming.
Meta has unveiled MovieGen, a new AI video competitor.
So Meta has just announced MovieGen, a cutting-edge generative AI video model that promises
to change how we all create videos.
So this model is not yet available to the public.
and is now only released with some samples in a research paper, similar to how OpenAI's SORA video model was teased about eight months ago, yet we still don't have access to that one.
So movie gen looks to be a groundbreaking AI model that allows users to create, edit, and personalize high-definition video and soundtracks using simple text inputs and images, setting a pretty high standard, at least according to these samples, for AI video.
So movie gen, well, whenever they release it, will allow users to generate 16 second video snippets from a single text prompt and also personalize them using just one photo.
So some pretty advanced capabilities.
So the system offers precise editing features allowing users to replace objects in videos.
That's huge.
Such as transforming a lantern into a bubble or swapping a VR headset for steampunk goggars.
Yeah, think of the applications for your company with this.
If you sell a bunch of different products, right, shooting one video, swapping out, you know,
images or just running a bunch of different videos with just, you know, slightly different
objects in them.
So many different applications for this.
So the unavailing of movie gen comes amid a growing trend in generative AI for video creation
with competitors like Open AIs Sora, Adobe's Firefly.
Luma's Dream Labs, Runway, Pika Labs, Kling, so many others, right?
Every day we wake up, there's a new AI video tool that looks pretty good.
But despite its promising features,
MovieGen is not yet available to the general public,
and meta has not yet provided a specific timeline for its consumer release.
I do believe a lot of these AI video tools kind of waiting until after the U.S. election,
so we might see either late 2024 or early 2020.
for a more tiered roll off.
All right.
So safety concerns as well.
We got to talk about that, right?
So the development of all these tool
raises a lot of safety concerns
about the broad release of such powerful tools
and as they can require
some significant processing power
to lead to potential misuse, right?
Meta's research right now
on Meta Movie Gen is documented
in a comprehensive 90-page paper
So yeah, we'll link to that in our newsletter.
So you can go read that or maybe just use No Book LM to create a short little podcast on it.
All right.
So, hey, for our live stream audience, I didn't say, I didn't say what's up to y'all.
Thanks for being there.
But let me know if you can see these examples on screen now.
And thanks to everyone for joining us, you know, Tara, Harvey, sabbatical life, Marie, Zane, Fred, Tara, everyone.
Thanks for joining.
I'm looking at these now.
So podcast audience will leave a link to you.
these, but very impressive, right? So we have what looks here to be a little girl running on the
beach with a kite. We have another video here, someone sipping coffee based off of a photo, right?
Here's the wild thing, being able to upload a photo and then being able to generate AI
videos in different scenes, right? So a lot of these earlier AI video generators were a little
limited on features, right? Because the technology wasn't there. It takes time to catch up. But
essentially, you would start with a simple text prompt, maybe upload an image, and you would get a
four-second video. Now, with this new meta-movie gen, once it does come out, the ability to
upload a photo and be able to generate AI video in different scenes, it's mind-boggling, right?
even if you would have told me six months ago that this technology might exist.
I would have said probably not.
That's probably three or four years down the line, but here we are.
So yeah, live stream audience, what do you think of these images, right?
So here's another very impressive one.
This one looks cinematic.
So it looks like it's almost set in a deep canyon.
There's a storm brewing.
Very, very cinematic.
It almost looks like, you know, a drone is coming up.
So many of these shots, right?
six months ago, nine months ago, you would not think these kind of shots would be possible, right?
I'm almost speechless looking at them.
Here's the example of being able to swap out different elements in a video just with text
prompts.
So there's one with a child kind of floating a lantern up in the air and then the exact same video
side by side, but then a bubble floating up in the air, right?
not having to reshoot, not having to reprompt.
This really changes what's possible in terms of creativity, marketing, and also what smaller
companies can do, right?
Big companies, the biggest in the world, you know, your Nike's, your Coca-Cola, your
PNG brands, right?
All these big companies that for many decades have been able to bring us, you know,
these great storytelling visuals to help sell their products and services, I think we're
finally going to start to see small and media.
size businesses be able to compete at a big level thanks to these AI video generators.
So yes, although this is not probably the best video you've ever seen in your life,
it's pretty good, right?
It's pretty good.
And it does level the playing field, I think.
Jay said, yeah, it even added little additional bubbles.
Yeah, pretty impressive.
So, yeah, let me know, you know, as we go along, what are your thoughts on this one?
I was pretty impressed, but we'll see when or if, right, we actually get access to this and the cost, right?
Because here we are, like I said, eight months after SORA from Open AI was teased.
And all we've really seen is research paper and, you know, some filmmakers getting access.
But I do think Open AI SORA from a visual perspective is probably still the leader, but Kling with their recent 1.5 updates runway with their Gen 3,
models and now meta with movie gen, it looks like from a quality perspective, pretty, pretty close.
I mean, we'll have to see once people start getting access, but, you know, personally, I was
fairly impressed. All right. Let's move on and talk about money. $10 billion to be exact.
So Open AI has just secured an additional $4 billion credit line bringing its total liquid.
up to $10 billion just over the last like week. So Open AI's recent financial maneuvers have
reflected its rapid growth and ambitious plans in the AI sector. So Open AI has just last week
closed a significant 6.6 record breaking, $6.6 billion funding round, putting its valuation to
$157 billion. So just days later and just days ago, Open AI,
announced a $4 billion revolving line of credit, which when combined with that $6.6 billion fundraising round,
which broke records, brings its total liquidity to over $10 billion. So this new line of credit comes
in partnership with J.P. Morgan Chase, City, Goldman Sachs, and some other major financial institutions.
So OpenAI plans to use this capital to invest in research, expand its infrastructure, and attract
and retain top talent to maintain its competitive edge.
Yeah, Open AI has been losing a lot of its top talent recently to Anthropic as well
as some other competitors.
So like I said, this is back to back.
This happened about three days.
So right before Open AI closed a record breaking.
So the largest fundraising round ever at $6.6 billion with support coming from big names such
as Thrive Capital.
Microsoft, Nvidia, Fidelity, and others.
So amidst all this big funding talk, OpenAI, kind of released some more numbers or came out in various pieces of reporting that their revenue has increased more than 1,700% year over year, now generating up to $300 million in monthly revenue with projected sales of $11.6 billion for next year.
But despite this revenue growth, certain reports have showed Open AI is maybe losing up to $5 billion this year due to high training and inference costs and GPUs aren't getting very much cheaper because they're getting more powerful.
So the prices are going up and being able to retain top talent.
So definitely worth keeping an eye on what OpenAI does with this new 10 billion.
billion in liquidity. Doesn't that sound nice? Having $10 billion in liquidity? I mean,
sometimes businesses would love to just have, you know, 10,000 or maybe $100,000 in liquidity.
But, you know, over here, opening eyes, you know, flexing with a big 10, 10 billy.
Jeez. All right. New stuff coming to Windows, y'all. So we are now getting more and more
updates on the new Windows 11, 2024 updates that will bring some new AI innovations to the new line of
copilot plus PCs. All right. So the new upcoming Windows 11, 2024 update is generating a lot of
buzz due to its new AI features. Particularly, some of these are specially designed for those with the new
and more powerful copilot plus PCs, which utilize advanced neural processes.
processing units or on-device NPUs.
Yeah, we went from CPUs to graphic processing unit GPUs to now NPUs, which helps specifically
right now, you know, Microsoft and Windows are using these for on-device AI or Edge AI for
its new Windows co-pilot.
So the update will introduce new AI capabilities gradually, beginning with a members of
the Windows Insider program. So we'll probably see some demos rolling out of these new features
coming soon, starting in October and expanding to other pieces of hardware, those copilot plus PCs
with Intel Core Ultra and AMD Rizon. I think that's how you say it. I'm not a PC person,
but I'm probably going to be soon. But the Intel Core and the AMD co-pilot plus PCs should
start to get these software updates starting in November. So one of the first features to debut will be a
pretty controversial but very highly anticipated feature called recall. So recall essentially was
delayed, was supposed to roll out a couple of months ago, was delayed because of some regulatory
and privacy concerns, but it aims to give PCs an AI-powered photographic memory. So this was
initially planned for a release in June, but it was delayed due to security concerns.
And now it will be off by default.
So those are two, you know, kind of two big changes.
It was delayed by about five or six months.
And now it will require users to opt in, which, hey, I would opt in for a computer
to remember literally everything that happens on it.
Pretty impressive, right?
I think that's the future of AI computing.
So some other new features rolling out in these new Windows 11, 2024 updates for Copilot plus PCs.
So it has the Click to Do feature, which aims to enhance productivity by allowing users to perform AI-based actions directly from a shortcut menu that appears over images or text.
So essentially, the ability to execute different AI actions just with a simple click, so not even having to open anything or something.
save anything or go into co-pilot. Some other ones. Improvements to Windows search and will allow
users to describe what they are looking for without needing to remember specific file names or
search syntax. So that's a pretty big one. So as an example, let's say whether you're looking
for something for business or, you know, your personal life on your PC, being able to search
for something and it being able to understand the contents of the file better,
including photos, right?
So yeah, maybe you were at a big conference
and you're searching for certain photos
from that conference, you know,
before it just might be a random file name
and it might be kind of hard.
But now with co-pilot, this new update
coming out to Copilot Plus PCs
that will be able to understand the contents
and the context of your photos
and files a little easier with this new update.
Also speaking of photos,
the photos app inside of Microsoft Windows
will introduce.
a super resolution feature that can enhance low resolution images to up to eight times
utilizing the NPU for quick processing.
Also, paint.
Hey, good old paint.
Paint's getting updated, y'all.
It's not that, you know, late 90s paint because now it's even getting some Adobe-esque
generative AI features such as generative fill inside paint.
Also, the co-pilot itself, right?
So the biz chat and the co-pilot chat.
and the co-pilot chat are getting some updates to the co-pilot experience, promising a more personal
AI assistant.
Some more on that here in a minute.
But we got to talk about what I think will probably be the two biggest features.
Well, three, if you count recall, but two experimental features, co-pilot vision and think
deeper.
So these will allow users to interact with web content and tackle complex questions.
So these two features, I think, are going to do.
be showstoppers once they roll out and assuming they're working. So let's talk about them one by one.
So copilot vision is essentially co-browsing inside of Microsoft Edge with an AI agent, right? So being
able to click the copilot button within Microsoft Edge and then this new AI assistant, which I'm
going to talk about here in a second, a live AI agent you can talk to, right, and say, hey, I'm planning
out, you know, this big conference. You know, I keep dropping like, oh, what's this big conference? You
know, maybe you're planning out the Microsoft Build conference that's going to be here in Chicago
in about six weeks. And you're saying, hey, I'm planning this out. What do you think of this?
Or, hey, is my outlining, is my outline for this session covering everything that our customers
need? So essentially, it's like having a coworker who has access to all of your information
over your shoulder and can see what you're doing as you browse the web.
So pretty cool there with co-pilot vision.
And then Think Deeper.
So Think Deeper is essentially OpenAIs O1 reasoning model.
So the Strawberry model is coming to co-pilot.
So we don't know when these two new features,
Copilot Vision and Think Deeper will be rolled out.
But at Microsoft's event two weeks ago,
they did say it would be rolling out soon.
So I do believe that some users will get access here yet.
this fall. So like I said, those two features have not fully rolled out, but one feature that has.
Yeah, Michael, thanks for the comment here. When do they roll out? You know, we don't know.
Sometime this fall and, you know, it's going to start rolling out to Windows Insiders first,
and then it's going to be slowly dripping out. Some of these features are going to be going to
people who have a co-pilot pro license first.
you know, so if your company has Microsoft 365 enterprise, you might be getting these, you know,
before the rest of the world if you are just on a free co-pilot account. So yeah, a lot of this,
you know, depends on what country. I believe U.S. users are part of that first wave. And if you
have a paid account or if you're just using the free version of co-pilot, that is going to
impact when you are going to get these features. And again, slow rollout. So I assume that, you know,
we saw a lot of these releases that I'm going to talk about now start to be rolled out last week, still not fully rolled out. So I would assume I would think that Microsoft wants these features probably out in the wild before the build conference, like I said, which is November 19th here in Chicago. So we'll see. But two things that did roll out. Well, Microsoft debuted a new copilot what people are calling the V2 interface as well as its new advanced co-pilot.
co-pilot voice chat. So we did a couple of dedicated reviews on our YouTube channel. So if you
didn't catch those, we'll share those in the newsletter today. So two pretty new features here.
So for our live stream audience, I have them screenshots of them left to right. So the first one
you'll notice is a completely revamped and new copilot interface. So if this looks kind of similar
to inflection AI. Well, it kind of is. So Microsoft essentially aquired. So they, you know, it wasn't a
full acquisition of inflection AI and Pi. But they essentially hired all of their top people. And now
we're seeing this similar kind of card layout. So the new co-pilot has a much simplified layout. So I don't know
if this is going to be rolling out to Microsoft 365 Enterprise users as well. That's a big question
I have. I haven't seen anything specific from Microsoft. I know I've got a lot of people for Microsoft
listening and watching. So please let me know if this is going to be rolling out to the new biz chat.
But right now, as an example, I am a, you know, $20 a month copilot plus user. So I have this new
kind of minimal, simplified inflection-esque kind of card view.
of a co-pilot chat that has rolled out there on the left-hand side.
And then also the new advanced co-pilot voice chat.
So again, I don't know if, you know, this is actually called V2
or if it's just the new refreshed user interface, user experience.
I'm not sure if there is a name for these new updates,
but they're pretty sizable, right?
And also, the new kind of live voice assistant is out as well.
Again, this is a slow rollout.
You might not have access to it.
I just got access to it last week, so did a couple of videos.
So this new kind of advanced co-pilot voice chat is very similar, at least in theory, to open AI's new advanced mode.
So similarly, it does not have access to real-time information.
That one is important.
So if you think that you're going to be talking to this new advanced co-pilot voice chat about like, hey, what's the web,
in Chicago, what's the latest, you know, tech news for this week. Talk to me about, you know,
fourth quarter trends in manufacturing and shipping, you know, in the Midwest, right? You can't do
those things right now. I do hope that both Microsoft co-pilots new advanced voice chat and
OpenAI's advanced voice mode are going to get access to these tools. So it's kind of, we don't
have the best of either worlds right now. So we have these old and slow, dumb voice assistants,
sorry, like Siri and Alexa that feel very archaic. The voice is not that good. It sounds like a
robot. It's very delayed. And then we have these new versions. So we have, you know,
Gemini Live. If you're an Android user, you have access to that. But otherwise, you might have
access to Open AI's new advanced voice mode or this new advanced co-pilot voice chat.
Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience.
Meet Firefly AI Assistant, now live in the Adobe Firefly app, the all-in-one creative AI studio.
Powered by Adobe's Creative Agent, Firefly AI Assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the Assistant.
The Assistant orchestrates multi-step workflows, drawing on 60-plus program.
tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom, Express,
and more to help bring your ideas to life. You can also get started with creative skills,
a growing library of pre-built workflows for common creative tasks, like batch editing photos,
creating mood boards, portrait retouching, and creating social variations. Every step the assistant
takes is visible so you can refine, redirect, or take over at any time. You stay in the driver's seat
as the creative director.
Adobe Firefly AI assistant now in public beta.
See it today at firefly.adopi.com.
Which is very neural.
It sounds like a real person.
Low latency.
So there's not these awkward three to five second pauses.
Except right now, it doesn't have access to real-time information.
So hopefully we'll see that rolling out soon.
Right now, free users have limited access to these new copilot updates,
but you should have access once it does roll out.
All right.
Mr. Strawberry in the house.
Love to see it.
All right, let's keep going.
Here's one that I didn't personally have on my bingo card.
Invidia, yes, invidia has unveiled a pretty impressive open source AI model called NVLM1.0.
Yeah, that's a mouthful.
So, NVIDIA has made headlines by releasing its NVLM 1.0 family of large multimodal language models,
which includes the 72B, NVLMD-72B.
Y'all, can, hey, maybe I should reach out to my people at NVIDIA.
Can we give this one a name?
That's so hard to say, right?
Like, hey, GPT4, GPT4O, that's even not the best.
that's a mouthful, NVIDIA.
But hey, I guess they make up for it because this is an open source model that is very impressive.
And this model is designed to also compete with the proprietary closed models from companies such as open AI, Google, Anthropic, as well as competing with kind of the leader right now in the open source, open weights, whichever one you want to call it, model from meta.
So this new NVLMD 72B, my gosh, showcases exceptional performance in both vision and language tasks,
significantly enhancing its text-only reasoning after its multimodal training.
So researchers are reporting that this new multimodal variation achieves an average accuracy increase of 4.3 points on key text benchmarks compared to its predecessor.
So pretty big, pretty big uptick here.
Also, by making the model weights, that's huge, by making the model weights publicly available
and promising the release of training code, Nvidia is breaking away from the trend of kind of
these closed, advanced AI systems offering pretty unprecedented access to this cutting-edge technology
for both researchers and developers.
All right, so the open access to these powerful AI models, though also, like I talked about earlier,
It does raise concern, right?
So it's also great, I think, for researchers, developers, and for the general public to have access to the weights of this model, right?
But also, when you do that, you also give all the bad guys and all the people that don't have good intentions, right?
Because I think especially big companies like meta and Nvidia do have good intentions, right, when building these open source models with Nvidia here announcing that they're releasing the weights and the training data.
That's great.
But there's also a downside, an ugly downside to this is you give all of these people with
bad intentions access to the most cutting edge models in the world.
Right.
So I don't think that can be overlooked, but, you know, it's really a much deeper conversation
that I'm not going to get into today when we just bring you the AI news that matters.
We're here to report the facts.
But regardless, huge, pretty unexpected news from Nvidia, not just.
saying, hey, we're entering the large language model in the multi-modal, right? It's having vision
capabilities. So they're not just saying, hey, we're here to enter the arena, so to speak, but they're
coming in the open source, open weights. And it's benchmarking, not above, but it's benchmarking
similar to some of the heavyweight models, such as Google's one point, Gemini 1.5, such as
OpenAI's GBT4, such as Claude 3.5 Sonnet. So it's benchmarking below these, but already in that
class, which is extremely impressive from Nvidia. All right, let's keep going. We still got even
more news. So we had this whole thing called Dev Day from OpenAI, right? So OpenAI unveiled some
major API updates to enhance developer experience and also reduce costs. So OpenAI had their
dev day last week, much smaller than the live stream, much hyped dev day of last year,
where we saw the introduction of GPTs in the GPT store.
So this year, OpenAI introduced a couple of, I'm not going to say these are dorky things,
but they're a little dorkier.
So for the everyday person, this might not be applicable for you.
But if you are someone that goes into Open AIs kind of playground, their back end,
and you're playing around with their API, their assistance API, some pretty big news here.
So number one, they introduce model distillation, which allows developers to enhance smaller
models as an example like GBT40 Mini by fine tuning them with outputs from larger models.
So a lot of companies specifically meta have introduced this, but essentially using these big,
powerful models to help train or distill and create better smaller models.
So that's pretty big there.
Also, Open AI said that they're going to make it rain on everyone, just raining tokens.
So to support developers, OpenAI announced that they're offering 2 million free training tokens daily.
Geez, on GPT40 Mini and 1 million on GPT40 until October 31st.
That's spooky good.
So making it easier to get started with model distillation.
Y'all, you got three weeks for essentially free, right?
Unless you're an enterprise company, it is hard to get to that 2 million or 1 million free daily tokens.
Also, following Anthropics lead this one, right?
You got to tip the cap to them.
But OpenAI did announce a new prompt caching feature, which enables developers to reuse common prompts at a reduced cost,
applying a 50% discount for prompts, essentially, that share lengthy prefixes, which could lead to pretty significant savings, especially for business.
enterprise clients. More, OpenAI has expanded capabilities with its vision, fine-tuning,
allowing developers to enhance GPT-4-O's understanding of images, which can improve applications
in visual search, object detection, and medical image analysis. So yes, being able to fine-tune
with GPD-40 on the vision side, pretty big. But the biggest of all, I think the biggest
announcement from Open AI's Dev Day was the introduction of the real-time.
API. Okay, so this allows for more efficient right now, more efficient speech to speech applications,
eliminating the need for multiple processing steps and reducing latency. So essentially what this
means, we saw this new advanced voice mode finally released about 10 days ago to the general
public, but that was just within open AI system. So now this is rolling out to developers.
So what that means for all of you out there, there's probably whether you know it or not,
hundreds of a big SaaS products, big enterprise tools that run off of OpenAIs GPT4 technology,
right?
Probably thousands, but I would say there's hundreds of big brand name companies out there,
software as a service companies that run off of GPT4.
This changes everything.
So you know how just about every single company in the world now has a GPT4 enabled chatbot
on their website?
Now think of that with real time, with the real time voice assistant.
Okay, so as an example, being able to log on to your, you know, your MailChimp account or
your, you know, your analytics dashboard, your banking account.
And instead of, you know, being able to type with a GPT4 powered assistant, being able to talk
to one.
So this obviously has enormous, enormous implications on.
customer service and customer experience, right? Because it is in theory not that hard to do.
And it's in theory not that far off to where most of the services that you use are probably
going to be using this real time API fairly soon. So businesses you need to adapt.
You know, you also need to see is you need to have the conversations now. Is this something
that we want to bring to our customers? Essentially bringing your data, right? So
with some rag, fine-tuning one of these models and using this real-time API assistant.
Also, something key here, this wasn't a real-time voice assistant either.
You know, so maybe this was a nuance or I'm reading between the lines here,
but this real-time API assistant is just real-time.
It's not talking about just voice.
So in theory, I do think this lays the groundwork for real-time video in the future as well.
We're still waiting for that from OpenAI.
They did demo that back at their spring event.
What was that back in, I think, April or May?
But we have still yet to see that be released.
So I do think that OpenAI, very strategic play here, releasing this real-time API
to bring this voice-to-voice real-time voice mode, right?
So like we just talked about earlier, you know, with what co-pilot has in Microsoft,
the advanced voice mode right now, extremely impressive right now inside of ChatGBT.
minus the fact that it doesn't have access to your data, real-time information, and other tools that large language models need.
But this is in the agenic future, y'all, where I think it is going to be very common.
Whatever services that you use on a daily basis, there's a good chance that they're going to be using this real-time API.
So it's definitely worth keeping an eye on.
All right, also worth keeping an eye on.
Google, our next piece of AI news.
So Google has developed some new AI reasoning models to compete with OpenAI's Strawberry or 01 model.
So according to reports from the Wall Street Journal, Google is advancing its artificial intelligence efforts to match the reasoning abilities demonstrated by OpenAI's latest model.
01, previously called Strawberry, previously called QSTAR, whatever you call it, it is raising the stakes in this.
kind of agentic AI race now. So according to reports, multiple teams at Google's parent company,
Alphabet, are reportedly making strides in developing AI reasoning software that can effectively
tackle complex, multi-step problems in areas such as mathematics and programming. So Google is
utilizing a common method called chain of thought prompting, but on the front end to bring
more kind of agentic or reasoning capabilities to the Gemini series of models. So Google has faced
pressure to innovate quickly, especially since the launch of OpenAI's chat GPT and all of these
new features we've been talking about, which kind of raises concerns among investors about Google's
dominance, just not just with its Gemini model, but also in search, right? Because as you bring
all of these real-time products, I mean, you have.
have also open AIs, another product they floated out there that we're still waiting for is
search GPT, right? But a lot of pressure on Google. And despite moving cautiously due to ethical
considerations and public trust, Google has made some notable advancements in this field,
including the introduction of alpha proof in alpha geometry two, which excelled at the IMO or
the International Mathematical Olympiad, which is usually a benchmark for reasons.
reasoning models or, you know, kind of this chain of thought thinking that models are now starting to do.
At a recent developer conference, Google previewed an AI assistant also called Astra,
capable of using a phone's camera to interact with the environment and answer questions,
hinting at future integrations with maybe this new reasoning model, which would be a pretty big, pretty big piece of news.
Another big piece of news from Google, Google has unveiled a
major change to how its new AI search works. So Google has announced some pretty significant updates to
its search capabilities leveraging advanced AI technologies to improve user experience and information
discovery. So let's first talk about Google lens. So Google lens is used for nearly 20, according to
Google, 20 billion visual searches each month showcasing the growing reliance on this kind of
of computer vision visual search technology.
Also, there's a new, the introduction of a generative AI search in lens, which allows users to
ask questions by pointing their cameras resulting in AI generated overviews that provide
relevant information in links.
Also, with the new AI powered lens search, users can now search with video wild, enabling them
to record moving objects.
and ask questions about them,
enhancing this kind of interactive search experience.
Voice input, obviously, right?
If you have video input, you think voice input, yeah.
So voice input will also be available on Lens in the new Lens search
or Lens feature within the new AI search,
allowing users to ask questions verbally while taking photos,
making the tool more intuitive and user-friendly.
So Lens has also been updated to facilitate
facilitate shopping. Yeah, that's a big one, providing detailed product information, reviews, and pricing from order over 45 billion items in Google's shopping graft. That's not all. So finally, we're seeing some updates to the new AI powered circle to search feature, which allows users to identify also songs they hear. So it's not just for products on Google shopping, but coming for Shazam now. So you can identify songs you hear without switching apps, expanding the functionality of Google's
mobile apps. Also, Google is rolling out AI organized search results page, starting with recipes and
meal ideas, which aim to present diverse content formats and perspectives more efficiently.
So yeah, this seems very similar to what Microsoft co-pilot is unveiling and what with its
co-pilot pages. And we've also seen that with perplexity pages as well. Oh, a lot there.
We're not even done. So the updated design for AI overviews also, we talked about this on the show
last week, but worth recapping, includes more prominent links to supporting web pages,
improving traffic to these sites and making it easier for users to find relevant information.
One more thing, Google is also testing ads within AI overviews.
Yeah, people are always wondering, how are they going to make money, right?
If they're not driving clicks, if people are just getting answers, we've known this for a while,
but Google is now starting to test more widely ads inside of its new feature there.
Ooh!
Last but not least, y'all.
Here we go.
Canvas.
We saved some of the juiciest stuff for last.
I had to take a sip there, y'all.
I'm actually double-fisting right now.
I got coffee and water because there's been so much news.
But we're saving probably, I think, the biggest one for last,
and I think people are really overlooking this.
Big sip of water.
Yeah, this is unedited, unscripted, y'all.
I like to say, and people have said
this is the realest thing in artificial intelligence.
Everything else is so edited, polished, scripted.
Yeah, you hear me slurping.
That's me.
All right.
Here we go.
Open AI has introduced Canvas,
a new tool for coding and content creation.
So OpenAI has just unveiled Canvas,
a significant update to chat,
GBT that enhances coding and content capabilities right within the editor. And this is kind of a
direct competitor to Anthropic Clause popular artifacts feature. I'll tell you in in ways that it is
and ways that it's not. So right now, Canvas is available to all paid chat GPT plus users and is
its own dedicated mode when you start a new chat. That's important to know. You will have a new
feature that says GPT4O with Canvas. So if you want to take advantage of this, you have to be in that
dedicated mode. So Canvas allows users to convert code from one programming language to another
with just a few clicks, making it easier for developers to work across languages such as JavaScript,
PHP, TypeScript, Python, C++, and Java as well as others. So the new feature includes a lot of
things for developers, coders, whatever you want to call it, including some tools for reviewing
code, adding debugging logs, inserting comics, comments, fixing bugs, and a lot more, which obviously
improves things in programming. Also, users can highlight specific sections of chat GPT's text responses
to focus and enable the AI to provide inline feedback and suggestions. That piece is huge.
All right. So OpenAI has developed some new core behaviors for its GBT40 model to support Canvas, including generating specific contact types and making targeted edits.
So OpenAI has emphasized that Canvas represents the first major visual interface updates since Chad GPDs launched two years ago.
I don't agree with that. I'll tell you by here in a second with plans for ongoing improvements based on user feedback.
So like I said, OpenAI has already rolled out this to most paid users, but also some free users may get very limited access to this new canvas feature once it exits beta.
All right, but here's the thing that I think people are sleeping on.
I think people are really just focusing on the coding aspect of this and just comparing it to Anthropic Claude's or sorry, Anthropic Claude's artifacts feature.
But I honestly think there are two different things.
And let me also call this out.
People are just like, oh, you know, chat GPT is just copying artifacts from Claude.
Well, not necessarily.
I think there's huge benefits to Claude artifacts that we don't see right now in this new canvas feature.
Number one, artifacts can render code.
Right now, Canvas cannot.
All right?
That's huge, right?
So essentially, you can, you know, create a website inside.
artifacts and actually render it. You can see what it looks like, right? You can say, hey, make me a,
you know, website with HTML, CSS, and JavaScript. I wanted to do this, this, and this. Here's
a screenshot of it, go build it, and you can actually render it. And you can see how it looks. So you
don't have that right now in Canvas. So that is, I think, one of the biggest differentiators, at least
that people are saying, oh, this is just for coders. But I don't think. Number one. And also,
let's be honest, this is not, I don't think. I don't think this is.
Open AI just straight up copying Anthropic Claw because guess what?
Guess what Open AI has had essentially now for a year?
They've already had this interface, this kind of split screen interface where you talk to an AI
on the left and it builds you something on the right.
So this was not a new feature from Anthropic with their new artifacts.
The only thing that was new in that, which is super impressive, is the ability to render code.
right but this whole kind of chatting on the left hand side and uh the AI building something on the
right hand side that was open AI first with their GPT builder people are overlooking that right
you can chat with the GPT builder on the left side of the screen all the way back to November
of 2023 and it will render it on the right side of the screen it'll make your GPT for you so people are
just saying oh open AI blindly copied not really and I do think that it's different but let me just go
ahead and dive into some of these, I think, bigger differentiators right now because, yes, I do think
that Anthropic and the artifacts feature has some huge benefits that you don't have inside
of Canvas, mainly being able to render code. But this is for so much more than coders. I think I'm
actually going to have a dedicated show on this tomorrow. So live stream audience, podcast people,
if you're in the email newsletter, I'll probably put out a poll, but do you want this show tomorrow
to be on canvas? Because I think there's a lot of tips and tricks that people are overlooking.
So number one is in line, right? So being able to essentially inside chat, GPT, we haven't had this
yet. And you can't really do this in, in artifacts either. But you can now work in line. So what that
means on the left hand side, and I have a screenshot here as an example, you can still talk to
chat GBT on the left hand side. Then on the right hand.
inside as an example, you can highlight a big chunk of text and you can tell chat GPT to like,
yeah, like, hey, make this more attention grabbing, right? I did this as an example. I highlighted
an intro paragraph. I said, make it more attention grabbing. And then it's going to update the
text right there in line. The other thing is, it is much more like a word document, like a word
editor. You can even go in there and start typing in that inline editor, which again, you do not have the
ability to do that inside of artifacts. That is huge, y'all. Uh, so people don't know this.
A similar feature to this like highlighting and replying to the chat has actually already been
available for like five or six months. Uh, but this just makes it much more intuitive in the canvas
editor. And again, being able to integrate and interact and build with chat GPT, the canvas mode
in real time, that is the differentiator, y'all. This is, I think, where you start to say, okay,
this is more of an augmented intelligence, right?
We're always talking about this difference between, you know, human intelligence,
AI, you know, artificial intelligence, and then augmented.
Well, I think this is a great step in the right direction forward, which is collaborating.
Because before, even with artifacts, you couldn't truly collaborate with the AI, right?
You just have to have it regenerate something, render the code, et cetera.
But now you can collaborate in real time with chat GPT.
There's a lot of other, I think, pretty cool features here.
on the left, I don't really like the add emojis button, but when you do highlight something,
you have some new kind of interface items that you can work with inside the canvas editor.
So emojis, there's a button to add polish.
You can adjust the reading level, which that is huge.
You can adjust the length with an easy slider, making it longer or shorter.
You can have click a button to have chat GBT suggest edits, and then you can apply those edits.
That's huge, y'all.
So kind of, you know, going back to this, you know, co-pilot vision and having an AI kind of working with you side by side, right?
If you and chat chitpd collaborate on something and then click this suggests edits, I love that.
And then also the thing that I think people literally aren't talking about here is chat chbtbt right here just killed.
Killed a bunch of rappers, right?
A lot of these rappers, you know, essentially are thinly designed kind of variations of chat.
at GBT, and I've said this, go back in the archives.
I've been doing this show for 18 months, probably, oh no, more than that now.
But I've been saying once one of these big companies, OpenAI, Google, Claude, whoever,
once they actually bring in this in-page editing, this in-line editing, I'm like, that's the end of so many of these,
you know, pretty popular and I would say useful.
GPT rappers, essentially, because I think one of the main advantages of them is you can use them
like a word doc, right? You can say, okay, I'm going to write this paragraph. Okay, now chat GPT,
you know, you take the next two and, okay, we're going to collaborate on this last paragraph.
You haven't had that, which is kind of crazy to think, right? I guess, you know, Microsoft just
announced that with their Wave 2 co-pilot features, kind of this more collaborative too,
which I'm super excited about being able to collaborate and do this in.
real time with teammates with co-pilot pages, but now we have that with chat chit.
And I think this is the one biggest feature that people, number one, they're sleeping on because
they're looking at this new canvas just for, just for editors, just for coders, right?
Just for developers, which, yes, there's some great features built in there for, you know,
software engineers, for people who are in development, people who are coders, you know,
for lack of a better term.
but I think this is for everyone, this ability to work in line with chat GPT and still take advantage.
That's the other thing.
This is the GPT4O model.
So you still have advantage to all these other tools, browse with Bing, advanced data analysis,
all of these other things where, you know, if you're using the O1, the agentic model,
you don't have access to these things.
So this is huge and I don't think it can be overlooked.
All right.
This was a long one, y'all. Thank you. Thank you for tuning in. I'm going to do the world's
fastest recap on the AI news that matters here. So number one, meta-unveiled movie gen. Very
impressive and it looks to be on the same quality as Sora not released yet. Our next new story,
Open AI secured a $4 billion credit line, upping its current liquidity to $10 billion after its
record-breaking $6.6 billion fundraising round last week.
Windows 11 getting some 2024 co-pilot updates, especially some new updates for those with the
Copilot Plus PCs, such as the rewind feature, copilot vision, think deeper, a lot of other kind of new features.
Then we also saw the new kind of copilot V2 interface and advanced co-pilot voice chat for those co-pilot plus or sorry, copilot pro users.
out of nowhere, we saw Nvidia unveil an open source model in its NVLM1.0.
So not just an open source model, but releasing the weights, pretty wild.
Open AI Deb day.
So tons of new updates there from model distillation, free training through October 31st, vision, fine-tuning.
But I think the biggest one there is the real-time API.
Then we talked about Google is reportedly working on a real-time.
reasoning model according to the Wall Street Journal to more closely compete with OpenAI's new model,
01 preview, 01 Mini, kind of the strawberry model. Then Google so many new AI enhancements to its
search functionality and inside the Google app. And then last but not least, open AI introduced canvas
a new tool not just for coding but for content creation. And I'm excited about that one, y'all.
All right, that's it. Thank you all for sticking around.
you know, Marie and Tara, David, everyone else, Philip, thanks for sticking around.
So much going on in the AI news.
We're going to be recapping it all in our newsletter.
If you haven't already, please make sure to go to your everyday AI.com.
Sign up for that free daily newsletter.
If this was helpful, I know this was a lot, y'all, a 50-minute podcast.
Yeah.
But we just saved you hours every single day.
You don't got to drown every single day trying to figure out what's going on.
Just tune in on Mondays.
We do this almost every single Monday, bringing you the AI news that matters.
if this was helpful. If you're listening on the podcast, please subscribe, whether you're listening
on Spotify or Apple, please leave us a rating and subscribe. And please join us tomorrow. And every day
for more, everyday AI. Thanks, y'all. Meet Firefly AI assistant. Now live in Adobe Firefly,
the Allman One Creative AI Studio. Just describe what you want to create in your own words and the
assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud
apps, including Photoshop, Premiere Express, and more in one conversational interface.
You direct the outcome while the assistant accelerates execution.
Stand control with the ability to step in and refine at any time.
See it today at firefly.adobie.com.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit Your EverydayAI.com
and sign up to our daily newsletter so you don't get left behind.
Go break some barriers.
and we'll see you next time.
