Everyday AI Podcast – An AI and ChatGPT Podcast - EP 374: AI News That Matters - October 7th, 2024

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the all-in-one creative AI studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. Is Meta's new movie gen better than OpenAI's SORA?

Starting point is 00:00:52 What's this one feature inside of ChatGBT's new Canvas mode that everyone's missing? And what's going on with all of these new Microsoft co-pilot updates? Speaking of questions, is Google changing the way that search works, at least when it comes to AI? Yeah, a lot of questions. questions this week if you've been following AI news and updates from big tech companies. But don't worry, we've got answers. What's going on, y'all? My name's Jordan Wilson, and I'm the host of Everyday AI. Thank you for tuning in. Everyday AI, it's for you. It's for me. It's for all of us. It is your daily live stream podcast and free daily newsletter,

Starting point is 00:01:35 helping us all learn and leverage generative AI to grow our companies and to grow our careers. So if that sounds like you, if that sounds like something that you're trying to do and who the heck isn't trying to keep up with AI to grow their company and career? That's like all of us, right? Well, then for all of us, Mondays are the spot. You got to tune in. Maybe you can't join us every single weekday Monday through Friday live at 7.30 a.m. when our live stream goes out. But maybe you can schedule out sometime on Monday, right?

Starting point is 00:02:04 You don't have to spend hours each and every day trying to track everything that's happening and say, hey, how does this a place? to me, my company, my career, that's what we do for you on Mondays. All right. So if you're new here, thank you for tuning in. Make sure if you have not already, please go to your everyday AI.com, sign up for that free daily newsletter. All right. You know, it's written by me, a human.

Starting point is 00:02:26 All right. So I think there's just so much AI out there. And hey, who are the humans trying to help us make sense of it? That's me. So make sure you go sign up for our newsletter. All right. Enough chit chat. Let's get straight into the.

Starting point is 00:02:39 AI News That Matters for the week of October 7th. Here's a big first one, y'all. Didn't see this one coming. Meta has unveiled MovieGen, a new AI video competitor. So Meta has just announced MovieGen, a cutting-edge generative AI video model that promises to change how we all create videos. So this model is not yet available to the public. and is now only released with some samples in a research paper, similar to how OpenAI's SORA video model was teased about eight months ago, yet we still don't have access to that one.

Starting point is 00:03:19 So movie gen looks to be a groundbreaking AI model that allows users to create, edit, and personalize high-definition video and soundtracks using simple text inputs and images, setting a pretty high standard, at least according to these samples, for AI video. So movie gen, well, whenever they release it, will allow users to generate 16 second video snippets from a single text prompt and also personalize them using just one photo. So some pretty advanced capabilities. So the system offers precise editing features allowing users to replace objects in videos. That's huge. Such as transforming a lantern into a bubble or swapping a VR headset for steampunk goggars. Yeah, think of the applications for your company with this. If you sell a bunch of different products, right, shooting one video, swapping out, you know,

Starting point is 00:04:16 images or just running a bunch of different videos with just, you know, slightly different objects in them. So many different applications for this. So the unavailing of movie gen comes amid a growing trend in generative AI for video creation with competitors like Open AIs Sora, Adobe's Firefly. Luma's Dream Labs, Runway, Pika Labs, Kling, so many others, right? Every day we wake up, there's a new AI video tool that looks pretty good. But despite its promising features,

Starting point is 00:04:49 MovieGen is not yet available to the general public, and meta has not yet provided a specific timeline for its consumer release. I do believe a lot of these AI video tools kind of waiting until after the U.S. election, so we might see either late 2024 or early 2020. for a more tiered roll off. All right. So safety concerns as well. We got to talk about that, right?

Starting point is 00:05:14 So the development of all these tool raises a lot of safety concerns about the broad release of such powerful tools and as they can require some significant processing power to lead to potential misuse, right? Meta's research right now on Meta Movie Gen is documented

Starting point is 00:05:34 in a comprehensive 90-page paper So yeah, we'll link to that in our newsletter. So you can go read that or maybe just use No Book LM to create a short little podcast on it. All right. So, hey, for our live stream audience, I didn't say, I didn't say what's up to y'all. Thanks for being there. But let me know if you can see these examples on screen now. And thanks to everyone for joining us, you know, Tara, Harvey, sabbatical life, Marie, Zane, Fred, Tara, everyone.

Starting point is 00:06:01 Thanks for joining. I'm looking at these now. So podcast audience will leave a link to you. these, but very impressive, right? So we have what looks here to be a little girl running on the beach with a kite. We have another video here, someone sipping coffee based off of a photo, right? Here's the wild thing, being able to upload a photo and then being able to generate AI videos in different scenes, right? So a lot of these earlier AI video generators were a little limited on features, right? Because the technology wasn't there. It takes time to catch up. But

Starting point is 00:06:38 essentially, you would start with a simple text prompt, maybe upload an image, and you would get a four-second video. Now, with this new meta-movie gen, once it does come out, the ability to upload a photo and be able to generate AI video in different scenes, it's mind-boggling, right? even if you would have told me six months ago that this technology might exist. I would have said probably not. That's probably three or four years down the line, but here we are. So yeah, live stream audience, what do you think of these images, right? So here's another very impressive one.

Starting point is 00:07:13 This one looks cinematic. So it looks like it's almost set in a deep canyon. There's a storm brewing. Very, very cinematic. It almost looks like, you know, a drone is coming up. So many of these shots, right? six months ago, nine months ago, you would not think these kind of shots would be possible, right? I'm almost speechless looking at them.

Starting point is 00:07:39 Here's the example of being able to swap out different elements in a video just with text prompts. So there's one with a child kind of floating a lantern up in the air and then the exact same video side by side, but then a bubble floating up in the air, right? not having to reshoot, not having to reprompt. This really changes what's possible in terms of creativity, marketing, and also what smaller companies can do, right? Big companies, the biggest in the world, you know, your Nike's, your Coca-Cola, your

Starting point is 00:08:12 PNG brands, right? All these big companies that for many decades have been able to bring us, you know, these great storytelling visuals to help sell their products and services, I think we're finally going to start to see small and media. size businesses be able to compete at a big level thanks to these AI video generators. So yes, although this is not probably the best video you've ever seen in your life, it's pretty good, right? It's pretty good.

Starting point is 00:08:39 And it does level the playing field, I think. Jay said, yeah, it even added little additional bubbles. Yeah, pretty impressive. So, yeah, let me know, you know, as we go along, what are your thoughts on this one? I was pretty impressed, but we'll see when or if, right, we actually get access to this and the cost, right? Because here we are, like I said, eight months after SORA from Open AI was teased. And all we've really seen is research paper and, you know, some filmmakers getting access. But I do think Open AI SORA from a visual perspective is probably still the leader, but Kling with their recent 1.5 updates runway with their Gen 3,

Starting point is 00:09:23 models and now meta with movie gen, it looks like from a quality perspective, pretty, pretty close. I mean, we'll have to see once people start getting access, but, you know, personally, I was fairly impressed. All right. Let's move on and talk about money. $10 billion to be exact. So Open AI has just secured an additional $4 billion credit line bringing its total liquid. up to $10 billion just over the last like week. So Open AI's recent financial maneuvers have reflected its rapid growth and ambitious plans in the AI sector. So Open AI has just last week closed a significant 6.6 record breaking, $6.6 billion funding round, putting its valuation to $157 billion. So just days later and just days ago, Open AI,

Starting point is 00:10:23 announced a $4 billion revolving line of credit, which when combined with that $6.6 billion fundraising round, which broke records, brings its total liquidity to over $10 billion. So this new line of credit comes in partnership with J.P. Morgan Chase, City, Goldman Sachs, and some other major financial institutions. So OpenAI plans to use this capital to invest in research, expand its infrastructure, and attract and retain top talent to maintain its competitive edge. Yeah, Open AI has been losing a lot of its top talent recently to Anthropic as well as some other competitors. So like I said, this is back to back.

Starting point is 00:11:06 This happened about three days. So right before Open AI closed a record breaking. So the largest fundraising round ever at $6.6 billion with support coming from big names such as Thrive Capital. Microsoft, Nvidia, Fidelity, and others. So amidst all this big funding talk, OpenAI, kind of released some more numbers or came out in various pieces of reporting that their revenue has increased more than 1,700% year over year, now generating up to $300 million in monthly revenue with projected sales of $11.6 billion for next year. But despite this revenue growth, certain reports have showed Open AI is maybe losing up to $5 billion this year due to high training and inference costs and GPUs aren't getting very much cheaper because they're getting more powerful. So the prices are going up and being able to retain top talent.

Starting point is 00:12:15 So definitely worth keeping an eye on what OpenAI does with this new 10 billion. billion in liquidity. Doesn't that sound nice? Having $10 billion in liquidity? I mean, sometimes businesses would love to just have, you know, 10,000 or maybe $100,000 in liquidity. But, you know, over here, opening eyes, you know, flexing with a big 10, 10 billy. Jeez. All right. New stuff coming to Windows, y'all. So we are now getting more and more updates on the new Windows 11, 2024 updates that will bring some new AI innovations to the new line of copilot plus PCs. All right. So the new upcoming Windows 11, 2024 update is generating a lot of buzz due to its new AI features. Particularly, some of these are specially designed for those with the new

Starting point is 00:13:14 and more powerful copilot plus PCs, which utilize advanced neural processes. processing units or on-device NPUs. Yeah, we went from CPUs to graphic processing unit GPUs to now NPUs, which helps specifically right now, you know, Microsoft and Windows are using these for on-device AI or Edge AI for its new Windows co-pilot. So the update will introduce new AI capabilities gradually, beginning with a members of the Windows Insider program. So we'll probably see some demos rolling out of these new features coming soon, starting in October and expanding to other pieces of hardware, those copilot plus PCs

Starting point is 00:14:04 with Intel Core Ultra and AMD Rizon. I think that's how you say it. I'm not a PC person, but I'm probably going to be soon. But the Intel Core and the AMD co-pilot plus PCs should start to get these software updates starting in November. So one of the first features to debut will be a pretty controversial but very highly anticipated feature called recall. So recall essentially was delayed, was supposed to roll out a couple of months ago, was delayed because of some regulatory and privacy concerns, but it aims to give PCs an AI-powered photographic memory. So this was initially planned for a release in June, but it was delayed due to security concerns. And now it will be off by default.

Starting point is 00:14:56 So those are two, you know, kind of two big changes. It was delayed by about five or six months. And now it will require users to opt in, which, hey, I would opt in for a computer to remember literally everything that happens on it. Pretty impressive, right? I think that's the future of AI computing. So some other new features rolling out in these new Windows 11, 2024 updates for Copilot plus PCs. So it has the Click to Do feature, which aims to enhance productivity by allowing users to perform AI-based actions directly from a shortcut menu that appears over images or text.

Starting point is 00:15:36 So essentially, the ability to execute different AI actions just with a simple click, so not even having to open anything or something. save anything or go into co-pilot. Some other ones. Improvements to Windows search and will allow users to describe what they are looking for without needing to remember specific file names or search syntax. So that's a pretty big one. So as an example, let's say whether you're looking for something for business or, you know, your personal life on your PC, being able to search for something and it being able to understand the contents of the file better, including photos, right? So yeah, maybe you were at a big conference

Starting point is 00:16:19 and you're searching for certain photos from that conference, you know, before it just might be a random file name and it might be kind of hard. But now with co-pilot, this new update coming out to Copilot Plus PCs that will be able to understand the contents and the context of your photos

Starting point is 00:16:37 and files a little easier with this new update. Also speaking of photos, the photos app inside of Microsoft Windows will introduce. a super resolution feature that can enhance low resolution images to up to eight times utilizing the NPU for quick processing. Also, paint. Hey, good old paint.

Starting point is 00:16:57 Paint's getting updated, y'all. It's not that, you know, late 90s paint because now it's even getting some Adobe-esque generative AI features such as generative fill inside paint. Also, the co-pilot itself, right? So the biz chat and the co-pilot chat. and the co-pilot chat are getting some updates to the co-pilot experience, promising a more personal AI assistant. Some more on that here in a minute.

Starting point is 00:17:22 But we got to talk about what I think will probably be the two biggest features. Well, three, if you count recall, but two experimental features, co-pilot vision and think deeper. So these will allow users to interact with web content and tackle complex questions. So these two features, I think, are going to do. be showstoppers once they roll out and assuming they're working. So let's talk about them one by one. So copilot vision is essentially co-browsing inside of Microsoft Edge with an AI agent, right? So being able to click the copilot button within Microsoft Edge and then this new AI assistant, which I'm

Starting point is 00:18:04 going to talk about here in a second, a live AI agent you can talk to, right, and say, hey, I'm planning out, you know, this big conference. You know, I keep dropping like, oh, what's this big conference? You know, maybe you're planning out the Microsoft Build conference that's going to be here in Chicago in about six weeks. And you're saying, hey, I'm planning this out. What do you think of this? Or, hey, is my outlining, is my outline for this session covering everything that our customers need? So essentially, it's like having a coworker who has access to all of your information over your shoulder and can see what you're doing as you browse the web. So pretty cool there with co-pilot vision.

Starting point is 00:18:43 And then Think Deeper. So Think Deeper is essentially OpenAIs O1 reasoning model. So the Strawberry model is coming to co-pilot. So we don't know when these two new features, Copilot Vision and Think Deeper will be rolled out. But at Microsoft's event two weeks ago, they did say it would be rolling out soon. So I do believe that some users will get access here yet.

Starting point is 00:19:10 this fall. So like I said, those two features have not fully rolled out, but one feature that has. Yeah, Michael, thanks for the comment here. When do they roll out? You know, we don't know. Sometime this fall and, you know, it's going to start rolling out to Windows Insiders first, and then it's going to be slowly dripping out. Some of these features are going to be going to people who have a co-pilot pro license first. you know, so if your company has Microsoft 365 enterprise, you might be getting these, you know, before the rest of the world if you are just on a free co-pilot account. So yeah, a lot of this, you know, depends on what country. I believe U.S. users are part of that first wave. And if you

Starting point is 00:19:57 have a paid account or if you're just using the free version of co-pilot, that is going to impact when you are going to get these features. And again, slow rollout. So I assume that, you know, we saw a lot of these releases that I'm going to talk about now start to be rolled out last week, still not fully rolled out. So I would assume I would think that Microsoft wants these features probably out in the wild before the build conference, like I said, which is November 19th here in Chicago. So we'll see. But two things that did roll out. Well, Microsoft debuted a new copilot what people are calling the V2 interface as well as its new advanced co-pilot. co-pilot voice chat. So we did a couple of dedicated reviews on our YouTube channel. So if you didn't catch those, we'll share those in the newsletter today. So two pretty new features here. So for our live stream audience, I have them screenshots of them left to right. So the first one you'll notice is a completely revamped and new copilot interface. So if this looks kind of similar to inflection AI. Well, it kind of is. So Microsoft essentially aquired. So they, you know, it wasn't a

Starting point is 00:21:16 full acquisition of inflection AI and Pi. But they essentially hired all of their top people. And now we're seeing this similar kind of card layout. So the new co-pilot has a much simplified layout. So I don't know if this is going to be rolling out to Microsoft 365 Enterprise users as well. That's a big question I have. I haven't seen anything specific from Microsoft. I know I've got a lot of people for Microsoft listening and watching. So please let me know if this is going to be rolling out to the new biz chat. But right now, as an example, I am a, you know, $20 a month copilot plus user. So I have this new kind of minimal, simplified inflection-esque kind of card view. of a co-pilot chat that has rolled out there on the left-hand side.

Starting point is 00:22:07 And then also the new advanced co-pilot voice chat. So again, I don't know if, you know, this is actually called V2 or if it's just the new refreshed user interface, user experience. I'm not sure if there is a name for these new updates, but they're pretty sizable, right? And also, the new kind of live voice assistant is out as well. Again, this is a slow rollout. You might not have access to it.

Starting point is 00:22:34 I just got access to it last week, so did a couple of videos. So this new kind of advanced co-pilot voice chat is very similar, at least in theory, to open AI's new advanced mode. So similarly, it does not have access to real-time information. That one is important. So if you think that you're going to be talking to this new advanced co-pilot voice chat about like, hey, what's the web, in Chicago, what's the latest, you know, tech news for this week. Talk to me about, you know, fourth quarter trends in manufacturing and shipping, you know, in the Midwest, right? You can't do those things right now. I do hope that both Microsoft co-pilots new advanced voice chat and

Starting point is 00:23:23 OpenAI's advanced voice mode are going to get access to these tools. So it's kind of, we don't have the best of either worlds right now. So we have these old and slow, dumb voice assistants, sorry, like Siri and Alexa that feel very archaic. The voice is not that good. It sounds like a robot. It's very delayed. And then we have these new versions. So we have, you know, Gemini Live. If you're an Android user, you have access to that. But otherwise, you might have access to Open AI's new advanced voice mode or this new advanced co-pilot voice chat. Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI Assistant, now live in the Adobe Firefly app, the all-in-one creative AI studio.

Starting point is 00:24:19 Powered by Adobe's Creative Agent, Firefly AI Assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the Assistant. The Assistant orchestrates multi-step workflows, drawing on 60-plus program. tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom, Express, and more to help bring your ideas to life. You can also get started with creative skills, a growing library of pre-built workflows for common creative tasks, like batch editing photos, creating mood boards, portrait retouching, and creating social variations. Every step the assistant takes is visible so you can refine, redirect, or take over at any time. You stay in the driver's seat as the creative director.

Starting point is 00:25:05 Adobe Firefly AI assistant now in public beta. See it today at firefly.adopi.com. Which is very neural. It sounds like a real person. Low latency. So there's not these awkward three to five second pauses. Except right now, it doesn't have access to real-time information. So hopefully we'll see that rolling out soon.

Starting point is 00:25:30 Right now, free users have limited access to these new copilot updates, but you should have access once it does roll out. All right. Mr. Strawberry in the house. Love to see it. All right, let's keep going. Here's one that I didn't personally have on my bingo card. Invidia, yes, invidia has unveiled a pretty impressive open source AI model called NVLM1.0.

Starting point is 00:26:02 Yeah, that's a mouthful. So, NVIDIA has made headlines by releasing its NVLM 1.0 family of large multimodal language models, which includes the 72B, NVLMD-72B. Y'all, can, hey, maybe I should reach out to my people at NVIDIA. Can we give this one a name? That's so hard to say, right? Like, hey, GPT4, GPT4O, that's even not the best. that's a mouthful, NVIDIA.

Starting point is 00:26:34 But hey, I guess they make up for it because this is an open source model that is very impressive. And this model is designed to also compete with the proprietary closed models from companies such as open AI, Google, Anthropic, as well as competing with kind of the leader right now in the open source, open weights, whichever one you want to call it, model from meta. So this new NVLMD 72B, my gosh, showcases exceptional performance in both vision and language tasks, significantly enhancing its text-only reasoning after its multimodal training. So researchers are reporting that this new multimodal variation achieves an average accuracy increase of 4.3 points on key text benchmarks compared to its predecessor. So pretty big, pretty big uptick here. Also, by making the model weights, that's huge, by making the model weights publicly available and promising the release of training code, Nvidia is breaking away from the trend of kind of

Starting point is 00:27:39 these closed, advanced AI systems offering pretty unprecedented access to this cutting-edge technology for both researchers and developers. All right, so the open access to these powerful AI models, though also, like I talked about earlier, It does raise concern, right? So it's also great, I think, for researchers, developers, and for the general public to have access to the weights of this model, right? But also, when you do that, you also give all the bad guys and all the people that don't have good intentions, right? Because I think especially big companies like meta and Nvidia do have good intentions, right, when building these open source models with Nvidia here announcing that they're releasing the weights and the training data. That's great.

Starting point is 00:28:26 But there's also a downside, an ugly downside to this is you give all of these people with bad intentions access to the most cutting edge models in the world. Right. So I don't think that can be overlooked, but, you know, it's really a much deeper conversation that I'm not going to get into today when we just bring you the AI news that matters. We're here to report the facts. But regardless, huge, pretty unexpected news from Nvidia, not just. saying, hey, we're entering the large language model in the multi-modal, right? It's having vision

Starting point is 00:29:02 capabilities. So they're not just saying, hey, we're here to enter the arena, so to speak, but they're coming in the open source, open weights. And it's benchmarking, not above, but it's benchmarking similar to some of the heavyweight models, such as Google's one point, Gemini 1.5, such as OpenAI's GBT4, such as Claude 3.5 Sonnet. So it's benchmarking below these, but already in that class, which is extremely impressive from Nvidia. All right, let's keep going. We still got even more news. So we had this whole thing called Dev Day from OpenAI, right? So OpenAI unveiled some major API updates to enhance developer experience and also reduce costs. So OpenAI had their dev day last week, much smaller than the live stream, much hyped dev day of last year,

Starting point is 00:29:58 where we saw the introduction of GPTs in the GPT store. So this year, OpenAI introduced a couple of, I'm not going to say these are dorky things, but they're a little dorkier. So for the everyday person, this might not be applicable for you. But if you are someone that goes into Open AIs kind of playground, their back end, and you're playing around with their API, their assistance API, some pretty big news here. So number one, they introduce model distillation, which allows developers to enhance smaller models as an example like GBT40 Mini by fine tuning them with outputs from larger models.

Starting point is 00:30:35 So a lot of companies specifically meta have introduced this, but essentially using these big, powerful models to help train or distill and create better smaller models. So that's pretty big there. Also, Open AI said that they're going to make it rain on everyone, just raining tokens. So to support developers, OpenAI announced that they're offering 2 million free training tokens daily. Geez, on GPT40 Mini and 1 million on GPT40 until October 31st. That's spooky good. So making it easier to get started with model distillation.

Starting point is 00:31:13 Y'all, you got three weeks for essentially free, right? Unless you're an enterprise company, it is hard to get to that 2 million or 1 million free daily tokens. Also, following Anthropics lead this one, right? You got to tip the cap to them. But OpenAI did announce a new prompt caching feature, which enables developers to reuse common prompts at a reduced cost, applying a 50% discount for prompts, essentially, that share lengthy prefixes, which could lead to pretty significant savings, especially for business. enterprise clients. More, OpenAI has expanded capabilities with its vision, fine-tuning, allowing developers to enhance GPT-4-O's understanding of images, which can improve applications

Starting point is 00:31:58 in visual search, object detection, and medical image analysis. So yes, being able to fine-tune with GPD-40 on the vision side, pretty big. But the biggest of all, I think the biggest announcement from Open AI's Dev Day was the introduction of the real-time. API. Okay, so this allows for more efficient right now, more efficient speech to speech applications, eliminating the need for multiple processing steps and reducing latency. So essentially what this means, we saw this new advanced voice mode finally released about 10 days ago to the general public, but that was just within open AI system. So now this is rolling out to developers. So what that means for all of you out there, there's probably whether you know it or not,

Starting point is 00:32:47 hundreds of a big SaaS products, big enterprise tools that run off of OpenAIs GPT4 technology, right? Probably thousands, but I would say there's hundreds of big brand name companies out there, software as a service companies that run off of GPT4. This changes everything. So you know how just about every single company in the world now has a GPT4 enabled chatbot on their website? Now think of that with real time, with the real time voice assistant.

Starting point is 00:33:19 Okay, so as an example, being able to log on to your, you know, your MailChimp account or your, you know, your analytics dashboard, your banking account. And instead of, you know, being able to type with a GPT4 powered assistant, being able to talk to one. So this obviously has enormous, enormous implications on. customer service and customer experience, right? Because it is in theory not that hard to do. And it's in theory not that far off to where most of the services that you use are probably going to be using this real time API fairly soon. So businesses you need to adapt.

Starting point is 00:34:06 You know, you also need to see is you need to have the conversations now. Is this something that we want to bring to our customers? Essentially bringing your data, right? So with some rag, fine-tuning one of these models and using this real-time API assistant. Also, something key here, this wasn't a real-time voice assistant either. You know, so maybe this was a nuance or I'm reading between the lines here, but this real-time API assistant is just real-time. It's not talking about just voice. So in theory, I do think this lays the groundwork for real-time video in the future as well.

Starting point is 00:34:42 We're still waiting for that from OpenAI. They did demo that back at their spring event. What was that back in, I think, April or May? But we have still yet to see that be released. So I do think that OpenAI, very strategic play here, releasing this real-time API to bring this voice-to-voice real-time voice mode, right? So like we just talked about earlier, you know, with what co-pilot has in Microsoft, the advanced voice mode right now, extremely impressive right now inside of ChatGBT.

Starting point is 00:35:12 minus the fact that it doesn't have access to your data, real-time information, and other tools that large language models need. But this is in the agenic future, y'all, where I think it is going to be very common. Whatever services that you use on a daily basis, there's a good chance that they're going to be using this real-time API. So it's definitely worth keeping an eye on. All right, also worth keeping an eye on. Google, our next piece of AI news. So Google has developed some new AI reasoning models to compete with OpenAI's Strawberry or 01 model. So according to reports from the Wall Street Journal, Google is advancing its artificial intelligence efforts to match the reasoning abilities demonstrated by OpenAI's latest model.

Starting point is 00:36:03 01, previously called Strawberry, previously called QSTAR, whatever you call it, it is raising the stakes in this. kind of agentic AI race now. So according to reports, multiple teams at Google's parent company, Alphabet, are reportedly making strides in developing AI reasoning software that can effectively tackle complex, multi-step problems in areas such as mathematics and programming. So Google is utilizing a common method called chain of thought prompting, but on the front end to bring more kind of agentic or reasoning capabilities to the Gemini series of models. So Google has faced pressure to innovate quickly, especially since the launch of OpenAI's chat GPT and all of these new features we've been talking about, which kind of raises concerns among investors about Google's

Starting point is 00:36:58 dominance, just not just with its Gemini model, but also in search, right? Because as you bring all of these real-time products, I mean, you have. have also open AIs, another product they floated out there that we're still waiting for is search GPT, right? But a lot of pressure on Google. And despite moving cautiously due to ethical considerations and public trust, Google has made some notable advancements in this field, including the introduction of alpha proof in alpha geometry two, which excelled at the IMO or the International Mathematical Olympiad, which is usually a benchmark for reasons. reasoning models or, you know, kind of this chain of thought thinking that models are now starting to do.

Starting point is 00:37:45 At a recent developer conference, Google previewed an AI assistant also called Astra, capable of using a phone's camera to interact with the environment and answer questions, hinting at future integrations with maybe this new reasoning model, which would be a pretty big, pretty big piece of news. Another big piece of news from Google, Google has unveiled a major change to how its new AI search works. So Google has announced some pretty significant updates to its search capabilities leveraging advanced AI technologies to improve user experience and information discovery. So let's first talk about Google lens. So Google lens is used for nearly 20, according to Google, 20 billion visual searches each month showcasing the growing reliance on this kind of

Starting point is 00:38:38 of computer vision visual search technology. Also, there's a new, the introduction of a generative AI search in lens, which allows users to ask questions by pointing their cameras resulting in AI generated overviews that provide relevant information in links. Also, with the new AI powered lens search, users can now search with video wild, enabling them to record moving objects. and ask questions about them, enhancing this kind of interactive search experience.

Starting point is 00:39:14 Voice input, obviously, right? If you have video input, you think voice input, yeah. So voice input will also be available on Lens in the new Lens search or Lens feature within the new AI search, allowing users to ask questions verbally while taking photos, making the tool more intuitive and user-friendly. So Lens has also been updated to facilitate facilitate shopping. Yeah, that's a big one, providing detailed product information, reviews, and pricing from order over 45 billion items in Google's shopping graft. That's not all. So finally, we're seeing some updates to the new AI powered circle to search feature, which allows users to identify also songs they hear. So it's not just for products on Google shopping, but coming for Shazam now. So you can identify songs you hear without switching apps, expanding the functionality of Google's

Starting point is 00:40:08 mobile apps. Also, Google is rolling out AI organized search results page, starting with recipes and meal ideas, which aim to present diverse content formats and perspectives more efficiently. So yeah, this seems very similar to what Microsoft co-pilot is unveiling and what with its co-pilot pages. And we've also seen that with perplexity pages as well. Oh, a lot there. We're not even done. So the updated design for AI overviews also, we talked about this on the show last week, but worth recapping, includes more prominent links to supporting web pages, improving traffic to these sites and making it easier for users to find relevant information. One more thing, Google is also testing ads within AI overviews.

Starting point is 00:40:53 Yeah, people are always wondering, how are they going to make money, right? If they're not driving clicks, if people are just getting answers, we've known this for a while, but Google is now starting to test more widely ads inside of its new feature there. Ooh! Last but not least, y'all. Here we go. Canvas. We saved some of the juiciest stuff for last.

Starting point is 00:41:21 I had to take a sip there, y'all. I'm actually double-fisting right now. I got coffee and water because there's been so much news. But we're saving probably, I think, the biggest one for last, and I think people are really overlooking this. Big sip of water. Yeah, this is unedited, unscripted, y'all. I like to say, and people have said

Starting point is 00:41:43 this is the realest thing in artificial intelligence. Everything else is so edited, polished, scripted. Yeah, you hear me slurping. That's me. All right. Here we go. Open AI has introduced Canvas, a new tool for coding and content creation.

Starting point is 00:41:58 So OpenAI has just unveiled Canvas, a significant update to chat, GBT that enhances coding and content capabilities right within the editor. And this is kind of a direct competitor to Anthropic Clause popular artifacts feature. I'll tell you in in ways that it is and ways that it's not. So right now, Canvas is available to all paid chat GPT plus users and is its own dedicated mode when you start a new chat. That's important to know. You will have a new feature that says GPT4O with Canvas. So if you want to take advantage of this, you have to be in that dedicated mode. So Canvas allows users to convert code from one programming language to another

Starting point is 00:42:46 with just a few clicks, making it easier for developers to work across languages such as JavaScript, PHP, TypeScript, Python, C++, and Java as well as others. So the new feature includes a lot of things for developers, coders, whatever you want to call it, including some tools for reviewing code, adding debugging logs, inserting comics, comments, fixing bugs, and a lot more, which obviously improves things in programming. Also, users can highlight specific sections of chat GPT's text responses to focus and enable the AI to provide inline feedback and suggestions. That piece is huge. All right. So OpenAI has developed some new core behaviors for its GBT40 model to support Canvas, including generating specific contact types and making targeted edits. So OpenAI has emphasized that Canvas represents the first major visual interface updates since Chad GPDs launched two years ago.

Starting point is 00:43:53 I don't agree with that. I'll tell you by here in a second with plans for ongoing improvements based on user feedback. So like I said, OpenAI has already rolled out this to most paid users, but also some free users may get very limited access to this new canvas feature once it exits beta. All right, but here's the thing that I think people are sleeping on. I think people are really just focusing on the coding aspect of this and just comparing it to Anthropic Claude's or sorry, Anthropic Claude's artifacts feature. But I honestly think there are two different things. And let me also call this out. People are just like, oh, you know, chat GPT is just copying artifacts from Claude. Well, not necessarily.

Starting point is 00:44:39 I think there's huge benefits to Claude artifacts that we don't see right now in this new canvas feature. Number one, artifacts can render code. Right now, Canvas cannot. All right? That's huge, right? So essentially, you can, you know, create a website inside. artifacts and actually render it. You can see what it looks like, right? You can say, hey, make me a, you know, website with HTML, CSS, and JavaScript. I wanted to do this, this, and this. Here's

Starting point is 00:45:09 a screenshot of it, go build it, and you can actually render it. And you can see how it looks. So you don't have that right now in Canvas. So that is, I think, one of the biggest differentiators, at least that people are saying, oh, this is just for coders. But I don't think. Number one. And also, let's be honest, this is not, I don't think. I don't think this is. Open AI just straight up copying Anthropic Claw because guess what? Guess what Open AI has had essentially now for a year? They've already had this interface, this kind of split screen interface where you talk to an AI on the left and it builds you something on the right.

Starting point is 00:45:46 So this was not a new feature from Anthropic with their new artifacts. The only thing that was new in that, which is super impressive, is the ability to render code. right but this whole kind of chatting on the left hand side and uh the AI building something on the right hand side that was open AI first with their GPT builder people are overlooking that right you can chat with the GPT builder on the left side of the screen all the way back to November of 2023 and it will render it on the right side of the screen it'll make your GPT for you so people are just saying oh open AI blindly copied not really and I do think that it's different but let me just go ahead and dive into some of these, I think, bigger differentiators right now because, yes, I do think

Starting point is 00:46:33 that Anthropic and the artifacts feature has some huge benefits that you don't have inside of Canvas, mainly being able to render code. But this is for so much more than coders. I think I'm actually going to have a dedicated show on this tomorrow. So live stream audience, podcast people, if you're in the email newsletter, I'll probably put out a poll, but do you want this show tomorrow to be on canvas? Because I think there's a lot of tips and tricks that people are overlooking. So number one is in line, right? So being able to essentially inside chat, GPT, we haven't had this yet. And you can't really do this in, in artifacts either. But you can now work in line. So what that means on the left hand side, and I have a screenshot here as an example, you can still talk to

Starting point is 00:47:21 chat GBT on the left hand side. Then on the right hand. inside as an example, you can highlight a big chunk of text and you can tell chat GPT to like, yeah, like, hey, make this more attention grabbing, right? I did this as an example. I highlighted an intro paragraph. I said, make it more attention grabbing. And then it's going to update the text right there in line. The other thing is, it is much more like a word document, like a word editor. You can even go in there and start typing in that inline editor, which again, you do not have the ability to do that inside of artifacts. That is huge, y'all. Uh, so people don't know this. A similar feature to this like highlighting and replying to the chat has actually already been

Starting point is 00:48:05 available for like five or six months. Uh, but this just makes it much more intuitive in the canvas editor. And again, being able to integrate and interact and build with chat GPT, the canvas mode in real time, that is the differentiator, y'all. This is, I think, where you start to say, okay, this is more of an augmented intelligence, right? We're always talking about this difference between, you know, human intelligence, AI, you know, artificial intelligence, and then augmented. Well, I think this is a great step in the right direction forward, which is collaborating. Because before, even with artifacts, you couldn't truly collaborate with the AI, right?

Starting point is 00:48:40 You just have to have it regenerate something, render the code, et cetera. But now you can collaborate in real time with chat GPT. There's a lot of other, I think, pretty cool features here. on the left, I don't really like the add emojis button, but when you do highlight something, you have some new kind of interface items that you can work with inside the canvas editor. So emojis, there's a button to add polish. You can adjust the reading level, which that is huge. You can adjust the length with an easy slider, making it longer or shorter.

Starting point is 00:49:15 You can have click a button to have chat GBT suggest edits, and then you can apply those edits. That's huge, y'all. So kind of, you know, going back to this, you know, co-pilot vision and having an AI kind of working with you side by side, right? If you and chat chitpd collaborate on something and then click this suggests edits, I love that. And then also the thing that I think people literally aren't talking about here is chat chbtbt right here just killed. Killed a bunch of rappers, right? A lot of these rappers, you know, essentially are thinly designed kind of variations of chat. at GBT, and I've said this, go back in the archives.

Starting point is 00:49:55 I've been doing this show for 18 months, probably, oh no, more than that now. But I've been saying once one of these big companies, OpenAI, Google, Claude, whoever, once they actually bring in this in-page editing, this in-line editing, I'm like, that's the end of so many of these, you know, pretty popular and I would say useful. GPT rappers, essentially, because I think one of the main advantages of them is you can use them like a word doc, right? You can say, okay, I'm going to write this paragraph. Okay, now chat GPT, you know, you take the next two and, okay, we're going to collaborate on this last paragraph. You haven't had that, which is kind of crazy to think, right? I guess, you know, Microsoft just

Starting point is 00:50:42 announced that with their Wave 2 co-pilot features, kind of this more collaborative too, which I'm super excited about being able to collaborate and do this in. real time with teammates with co-pilot pages, but now we have that with chat chit. And I think this is the one biggest feature that people, number one, they're sleeping on because they're looking at this new canvas just for, just for editors, just for coders, right? Just for developers, which, yes, there's some great features built in there for, you know, software engineers, for people who are in development, people who are coders, you know, for lack of a better term.

Starting point is 00:51:18 but I think this is for everyone, this ability to work in line with chat GPT and still take advantage. That's the other thing. This is the GPT4O model. So you still have advantage to all these other tools, browse with Bing, advanced data analysis, all of these other things where, you know, if you're using the O1, the agentic model, you don't have access to these things. So this is huge and I don't think it can be overlooked. All right.

Starting point is 00:51:47 This was a long one, y'all. Thank you. Thank you for tuning in. I'm going to do the world's fastest recap on the AI news that matters here. So number one, meta-unveiled movie gen. Very impressive and it looks to be on the same quality as Sora not released yet. Our next new story, Open AI secured a $4 billion credit line, upping its current liquidity to $10 billion after its record-breaking $6.6 billion fundraising round last week. Windows 11 getting some 2024 co-pilot updates, especially some new updates for those with the Copilot Plus PCs, such as the rewind feature, copilot vision, think deeper, a lot of other kind of new features. Then we also saw the new kind of copilot V2 interface and advanced co-pilot voice chat for those co-pilot plus or sorry, copilot pro users.

Starting point is 00:52:45 out of nowhere, we saw Nvidia unveil an open source model in its NVLM1.0. So not just an open source model, but releasing the weights, pretty wild. Open AI Deb day. So tons of new updates there from model distillation, free training through October 31st, vision, fine-tuning. But I think the biggest one there is the real-time API. Then we talked about Google is reportedly working on a real-time. reasoning model according to the Wall Street Journal to more closely compete with OpenAI's new model, 01 preview, 01 Mini, kind of the strawberry model. Then Google so many new AI enhancements to its

Starting point is 00:53:28 search functionality and inside the Google app. And then last but not least, open AI introduced canvas a new tool not just for coding but for content creation. And I'm excited about that one, y'all. All right, that's it. Thank you all for sticking around. you know, Marie and Tara, David, everyone else, Philip, thanks for sticking around. So much going on in the AI news. We're going to be recapping it all in our newsletter. If you haven't already, please make sure to go to your everyday AI.com. Sign up for that free daily newsletter.

Starting point is 00:53:58 If this was helpful, I know this was a lot, y'all, a 50-minute podcast. Yeah. But we just saved you hours every single day. You don't got to drown every single day trying to figure out what's going on. Just tune in on Mondays. We do this almost every single Monday, bringing you the AI news that matters. if this was helpful. If you're listening on the podcast, please subscribe, whether you're listening on Spotify or Apple, please leave us a rating and subscribe. And please join us tomorrow. And every day

Starting point is 00:54:22 for more, everyday AI. Thanks, y'all. Meet Firefly AI assistant. Now live in Adobe Firefly, the Allman One Creative AI Studio. Just describe what you want to create in your own words and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps, including Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution. Stand control with the ability to step in and refine at any time. See it today at firefly.adobie.com. And that's a wrap for today's edition of Everyday AI.

Starting point is 00:55:10 Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit Your EverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers. and we'll see you next time.

Everyday AI Podcast – An AI and ChatGPT Podcast - EP 374: AI News That Matters - October 7th, 2024

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.