Everyday AI Podcast – An AI and ChatGPT Podcast - EP 473: Claude 3.7 drops, OpenAI releases GPT-4.5 and more AI News that Matters

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Anthropic Claude dropped 3.7.

Starting point is 00:00:49 Sonnet, OpenAI responded days later with their much-awaited model, GPT-4.5. And we may be waiting many more years before we actually see a full AI from Apple. We might even see AGI before we see full Apple Intelligence. Yeah, it was one of those kinds of weeks in AI. It's like, wait, this all happened in one week? Yes, it did. And if you missed any of it, and that's just the tip of the iceberg, don't worry, we're going to be going over those stories and a whole lot more today on Everyday AI. What's going on, y'all? My name is Jordan Wilson, and I'm the host, and this thing, it's for you. This is your daily

Starting point is 00:01:32 live stream podcast and free daily newsletter, helping us all not just keep up with AI, but what it all actually means, right? Deciphering all this PR and all these news releases from all the biggest companies so we can actually use that information to grow our companies and our careers. If that sounds like what you're trying to do, maybe this is the first time you're listening. Welcome. This is your new home. Your other new home is youreverydayai.com. There, you can sign up for our free daily newsletter. So each day we recap exclusive insights that we bring you only on this very podcast, as well as we keep you up to date with everything else that you need to know in the world of AI

Starting point is 00:02:13 to be the smartest person in AI at your company or your department. Right. So if that sounds like what you're trying to do, make sure you go sign up for that free daily newsletter at youreverydayai.com. All right. A quick reminder, we're going to be, oh my gosh, it's like two weeks away. We're going to be broadcasting live with NVIDIA at their GTC conference starting March... what day are we actually going to start there?

Starting point is 00:02:40 Probably March 17th, the Monday. So kind of for that week, at least the first couple of days early in the week, we're going to be partnering with NVIDIA to be bringing you a lot of exclusive insights, some great expert interviews, maybe breaking a little bit of news as well. So really excited for this year's GTC conference.

Starting point is 00:03:02 We were lucky enough last year to partner with NVIDIA as well. So hey, let me know. Hit me up if you're going to be at the GTC conference in San Jose, would love to say what's up. All right. With that, let's get into what's happening in the world of AI for the week of March 3rd. Let's get after it, y'all. All right. So, Anthropic, yeah, that seems like it was a week, it was on Monday, right? Right after this show. Seems like it was a month ago that Anthropic released 3.7 Sonnet, but they did. So, Anthropic has introduced Claude 3.7 Sonnet, the first publicly available hybrid AI model,

Starting point is 00:03:46 which combines traditional transformer capabilities with advanced reasoning. So it does kind of merge these two AI paradigms. It combines the traditional transformer model with reasoning capabilities, allowing it to switch between rapid responses and deeper logical thinking. So extended thinking mode is a key feature, and right now the extended thinking is only for paid users, and they can enable that mode where the model spends more time solving complex problems, also showing a summarized chain of thought for transparency.

Starting point is 00:04:23 So free users can use the Claude 3.7 Sonnet model, but at least as of right now, they can't toggle on that extended thinking. So Claude 3.7 Sonnet is very impressive when it comes to coding, software engineering, anything like that. So if that's something that you're in charge of at your company, you're probably going to want to check out 3.7 Sonnet. So it scored a very impressive 70.3 on the SWE-bench coding benchmark, outperforming competitors like OpenAI's o1 and o3-mini. It is particularly strong, Claude 3.7 Sonnet, that is, in coding and front-end web development, making it a great choice for software engineers.

Starting point is 00:05:05 Anthropic also released Claude Code, which again, livestream audience, let me know if you want, if we should dive into that. It's slightly more technical. But essentially, like, Anthropic released, I'm not going to say it's a Cursor competitor or a competitor to Bolt and Lovable and Windsurf and all these kind of AI IDEs, but it kind of is, right? Even though Cursor has said that, you know, hey, Anthropic Claude is our default model. It does look like with Claude, Anthropic is trying to get into this AI straight-up development space, which I think is a smart move. So they did launch Claude Code, a command line tool that allows developers to interact with and update entire codebases directly from their computer's terminal.

Starting point is 00:05:55 So it integrates also with GitHub and supports debugging, signaling Anthropic's push into the AI-powered coding assistant space. So despite these advancements, Claude stood firm on their API costs, which at the time seemed kind of silly until we got OpenAI's API pricing for their latest model. So just wait on that one. So it is still the same price, $3 per million input tokens and $15 per million output tokens. But I mean, here's what I think with this latest model, right? So a lot of people are like, oh, it looks like Anthropic's, you know, trying to, you know, get the best of OpenAI, right, with kind of a logic-based model, with a model that reasons. I'm being honest, I was not very impressed with Anthropic Claude 3.7's ability to reason. But again, I am a power user.
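To put the per-token pricing quoted above in concrete terms, here is a rough sketch of how a single request's cost works out. The two rates are the ones mentioned in the episode ($3 per million input tokens, $15 per million output tokens); the helper name `estimate_cost_usd` and the example token counts are made up for illustration and are not part of any official API.

```python
# Rates as quoted in the episode for Claude 3.7 Sonnet (USD per million tokens).
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper for illustration)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 10,000 tokens in, 2,000 tokens out.
print(f"${estimate_cost_usd(10_000, 2_000):.2f}")  # $0.06
```

Output tokens cost five times as much as input tokens at these rates, so in this made-up example the 2,000 output tokens cost about the same as the 10,000 input tokens.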

Starting point is 00:06:55 I am a heavy user of OpenAI's o3-mini. I'd say that's my most used model. I use o1 pro as well. So, you know, looking at Anthropic's first foray into the kind of, you know, reasoning models, not super impressed by that. Also, a lot of people are reportedly rolling back to 3.5 Sonnet for certain tasks, especially when it comes to coding, because it seems like sometimes Claude uses that reasoning when it maybe shouldn't.

Starting point is 00:07:25 And it takes things a little further and, you know, changes a bunch of things that you maybe didn't even want changed. So I am using 3.7 Sonnet every single day for certain use cases. I think it's great. It's the first, right, you have to tip, like I keep saying this, you have to tip your cap to Anthropic because they are the first company with a hybrid model. And I do think that that is going to be the big, kind of the future of large language models

Starting point is 00:07:52 is going to be kind of combining this quote unquote old school transformer approach with the new reasoning models. So yeah, I'm curious, you know, because we're going to talk about GPT-4.5 at the end. You know, I'm curious for our live stream audience and, you know, hey, let me know if you're listening on the podcast as well. What are your thoughts on these newest releases? Right. So Sandra here joining us from YouTube says, I haven't been able to tell the difference yet. I've personally, I have been able to tell the difference with Sonnet 3.7. You know, when I'm testing it just for, you know, niche coding tasks, which is something I don't necessarily do a lot unless I'm testing models,

Starting point is 00:08:29 it's great for that. Everything else that I might use Claude Sonnet for on an everyday basis, I'd say it's about the same. I think the artifacts feature has actually improved for that reason, right? If you're trying to visualize data or something like that, I think it is better. But for non-coding, non-data visualization tasks, I don't know if we necessarily saw a huge leap with 3.7 Sonnet. Again, that's in my testing. I'm using it maybe,

Starting point is 00:08:58 I don't know, 45 minutes to an hour every single day since it came out. You know, for the show last week, I probably tested it for a good four or five hours. So, you know, I'm not using it, you know, five hours a day since it came out or anything like that. But I'm curious what everyone's thoughts are.


Starting point is 00:10:22 All right, our next piece of AI news. Google is pushing hard for AGI. And apparently they just need to work more. All right. So Google co-founder Sergey Brin has reportedly called for increased productivity and more in-office presence for Google as the company intensifies its push to develop artificial general intelligence, or AGI.

Starting point is 00:11:00 So Brin believes AGI is within reach, according to reports, if employees just work a little bit harder. So in an internal memo seen by the New York Times, Brin stated that Google has all the ingredients to win the AGI race, but needs to, quote-unquote, turbocharge its efforts. He suggested that employees work at least 60 hours a week, calling that the sweet spot of productivity. Also, return-to-office policies are being emphasized,

Starting point is 00:11:30 as Brin recommended employees come into the office every weekday, exceeding Google's current three-day-a-week on-site policy. He argued that remote work and reduced hours are demoralizing to others. So Google's AI teams are already working long hours. So according to CNBC, some staff working on Google Gemini's AI projects clocked up to 120-hour weeks to address critical flaws in their image recognition tool. Imagine putting in 120 hours a week, right? That's nuts. That's like, I don't know. I'm not great at math, but that's more than

Starting point is 00:12:09 15 hours a day. Imagine, right? 15. Can someone do a live calculator, right? Let's do 15 times 7. I should be able to do that in my, in my brain right now, but I can't. Okay, so that's 105 hours a week. So that's even more. 120 hours, that's like 17 hours a week. No thanks. Or, 17 hours a day, working to debug this, right? But imagine working 15-plus-hour days and then they say, ah, you know, we'll achieve AGI if you just work a little harder. Not probably what most people want to hear. So high-pressure work culture in AI development is raising concerns.
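For anyone who wants the "live calculator" version of that back-of-the-envelope math, here is a small sketch. It assumes the weekly hours are spread evenly across all seven days, which is a simplification; the function name is made up for illustration.

```python
DAYS_PER_WEEK = 7


def hours_per_day(hours_per_week: float) -> float:
    """Average daily hours if a weekly total is spread evenly over 7 days."""
    return hours_per_week / DAYS_PER_WEEK


print(15 * DAYS_PER_WEEK)             # 105 hours in a week of 15-hour days
print(round(hours_per_day(120), 1))   # 17.1 hours a day for a 120-hour week
print(round(hours_per_day(60), 1))    # 8.6 hours a day for a 60-hour week
```

So a 120-hour week works out to roughly 17 hours every single day, weekends included.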

Starting point is 00:12:53 So while the push for AGI could lead to groundbreaking advancements, the demanding workload, such as, as an example, the routine 12-hour days reported by employees at xAI, developing Grok, highlights the toll on the industry. So, I don't know how I feel about this, right? Number one, I thought AI was supposed to allow employees to work less, right? And focus on, you know, higher quality work and to, you know, like, so I don't know, part of this to me just isn't adding up, especially since Google, I think, should have been light years ahead of their closest competitors in the AI race, considering they essentially

Starting point is 00:13:41 developed the GPT technology, but it was other companies that really ran away with it. And I would say that Google did not even catch up in the AI race probably until late 2024. So now it seems like they really want to win the AGI race and are really just pushing employees to work more, work smarter, be more productive in the process. So I don't know. Seems kind of ironic that, you know, we're supposed to all be benefiting from AI and large language models. And, you know, we're supposed to be, you know, focusing on higher level,

Starting point is 00:14:16 creative and strategic tasks. And it's like, nope, just work double, just work triple, right? A little wild. Yeah. Sarash from YouTube just said, Google finally woke up after missing out for a couple of years. Yeah, it's like, oh, we weren't really, like, in the race for, you know, 2022 and 2023. So now we just got to work double or work triple to catch up. Yeah, so much for that four-hour work week we were all promised, the utopia of AI. At least not yet.

Starting point is 00:14:46 All right. Other big tech trying to play catch-up: Meta is reportedly developing a standalone AI app to compete with OpenAI and Google. So a new report suggests that Meta is launching a dedicated app for its Meta AI assistant, potentially signaling a major shift in its AI strategy. So according to reports, Meta is working on a standalone app for its AI assistant, Meta AI. The app would mark a departure from Meta's current approach of just integrating the AI services into social platforms like Facebook, Instagram, and WhatsApp. So the standalone app could help Meta reach users

Starting point is 00:15:29 who avoid social media, like me, or use messaging services from competitors, addressing a gap in its current strategy. This move could potentially bring in millions of new users who were previously out of reach. So Meta AI, their online service, which launched in 2023, offers features like question solving, image generation, and answer suggestions. So while it has seen improvements, it still lacks advanced functionalities offered by competitors like OpenAI's ChatGPT and Google's Gemini. So Meta CEO Mark Zuckerberg has ambitious plans for AI,

Starting point is 00:16:06 stating earlier this year that Meta AI could become the leading personalized AI assistant, reaching over 1 billion people. And the standalone app does align with that vision. Also monetizing, right? It's obviously a key part of Meta's reported new strategy because they could roll out paid plans and premium features that they might not as easily be able to roll out, you know, in something like Facebook or Instagram or WhatsApp, where a lot of their users may be using the Meta AI technology.

Starting point is 00:16:42 Fred says, I just won't use Meta, right? I use it online. It's actually a great online resource, right? We did a head-to-head on this show probably like three or four months ago, just how accurate certain large language models are when using the internet. So I think we ran down, what, we did OpenAI, we did Google, we did Meta, and we did Copilot. And I was actually surprised by how well Meta performed, right?

Starting point is 00:17:11 Essentially, the biggest thing, the biggest takeaway from that one was how well it just surveyed the internet and could return back an accurate answer using its Llama model connected to the internet. So it was actually pretty, pretty impressive. And yeah, their augmented reality is what they're definitely focused on. But yeah, I mean, they're definitely wanting to marry kind of those two different technologies, right? The wearable technology and the, you know, just the standalone LLM that's outside of their social media network. So all right.

Starting point is 00:17:46 Speaking of big tech, yeah, it's just been a big tech kind of week. But pretty impressed here with Microsoft. So Microsoft Copilot is now rolling out some major updates, including free unlimited access to voice and their Think Deeper capabilities. So that is the o1. Yes, the OpenAI o1 model. You can now use it for free and unlimited access. So if you just go to

Starting point is 00:18:16 copilot.microsoft.com, you have to have an account, but you can use essentially Microsoft Copilot's voice mode, which is not quite as good as OpenAI's advanced voice mode, although it uses the same technology, but you can use the o1 model. OpenAI's, you know, what was, you know, a couple of months ago, their most powerful model. You can use it for free and unlimited.

Starting point is 00:18:43 So huge move here from Microsoft that I think kind of got slept on. So the voice feature allows users to interact with Copilot hands-free. Use cases include practicing a new language, preparing for a job interview with mock Q&A, or receiving step-by-step cooking advice. I've actually used Copilot for that exact reason. I don't think it still helped me. And that wasn't Copilot's fault. It's just I just can't cook.

Starting point is 00:19:09 Maybe I do need that Figure 02 robot that just silently does work in your kitchen for you. But Think Deeper, I think it's really good. So we did a review of it when it first came out, I don't know, five months ago. So probably you're going to have to revisit that. But I think it's great. So anything that really requires some advanced reasoning, some logic, right? Maybe you're using, you know, ChatGPT or Claude or Gemini or something else.

Starting point is 00:19:40 And you don't have a paid plan. And you're like, wow, I'd love to be able to use a reasoning model. So right now you can use very limited o3-mini for free on ChatGPT. But if you want to get your hands on o1, which is a very, very, very good, capable model, now you can do it. So Copilot Pro users, yeah, which is myself. I'm on Copilot Pro. I paid $20 a month for that. And I was like, wait, what are we getting, you know, what are we getting now for $20?

Starting point is 00:20:09 But, you know, I do appreciate that Microsoft at least emailed Copilot Pro users, and they're like, yo, we're making this free version really, really good. If you want to cancel, here's the link. More big tech companies should do that, right? Like if they make something free and all of a sudden the paid plan isn't that good, they should be like, hey, it's fine if you cancel. Here you go. However, here's what the Copilot Pro kind of account still has.

Starting point is 00:20:34 So, you know, think of this differently, right? I'm still a Mac user. Yes, I have a Windows Copilot Plus PC. I still got to use it. I still got to set it up. I've been so busy. I'm actually super excited to do that. But maybe your company uses Microsoft 365 Copilot. For the most part, then, this is not going to impact you, right?

Starting point is 00:20:53 Unless you're using it, you know, not logged in outside of your company's, you know, biz chat, maybe, right? So in that case, sure, you can use this that way. But this, I think, is going to appeal to a lot of people who are Mac users and who maybe don't have that Microsoft 365 Copilot integration. But Copilot Pro users, so those that still pay $20 a month, can still continue to enjoy and use Copilot across the different Microsoft 365 apps like Word, Excel, PowerPoint, etc. So yeah, even though, you know, as an example, I'm using a Mac right now, I still have Microsoft Word on my computer and I can use Microsoft Copilot via Copilot Pro in Microsoft Word, in Excel, in PowerPoint, et cetera. So, you know, Pro users aren't like SOL, right? They're not just out of luck. They still have some Copilot capabilities that normal free users do not have. But y'all,

Starting point is 00:21:53 if you do not use Copilot yet, I would just go right away and try out the unlimited voice and the Think Deeper features. They're actually super impressive. Yeah, I like what Graham here from LinkedIn is saying: Copilot is the gateway to AI for most people now. Yeah. So a couple of weeks ago, I would probably say, no, probably ChatGPT free. But now I'll say those are like 1A and 1B. I still think the free version of ChatGPT is probably a little bit better than this free version of Copilot.

Starting point is 00:22:29 But now they're at least hand in hand, especially if your company is a Microsoft organization. I think it's pretty big here. So, all right. Next piece of AI news. One of the biggest names in AI on the text-to-speech side is changing up their biz model a little bit. So ElevenLabs, an AI startup company that was recently valued at $3.3 billion, has introduced its first

Starting point is 00:23:00 standalone speech-to-text model called Scribe, following a $180 million funding round. So yeah, you've maybe heard of ElevenLabs as the text to speech, but now they're flipping it and going speech to text. So yeah, even if you listen to this show on the podcast, there's a little intro, right? It's this AI intro too, but that's ElevenLabs, right? That was like the very first version of ElevenLabs, by the way,

Starting point is 00:23:32 which was, I thought, pretty impressive for, you know, two and a half years ago. So the Scribe model supports over 99 languages and boasts exceptional accuracy for more than 25 of them, including English, French, German, Hindi, Japanese, and Spanish. So the company claims an impressive 97% accuracy rate for English, with a word error rate below 5% for its top performing languages. So according to ElevenLabs, Scribe outperformed competitors like Google Gemini 2.0 Flash and OpenAI's Whisper Large V3, that is OpenAI's kind of speech-to-text

Starting point is 00:24:14 model, in benchmark tests such as FLEURS and Common Voice, setting it apart in the speech detection market. So one cool feature that I liked, hopefully I pronounced this right, but it includes advanced features like smart speaker diarization, which is essentially just automatically identifying the speaker that is speaking, which, like for someone that does a podcast, and usually I have guests on, that's huge. That's why I can't use, like, personally, I can't use something like Whisper or Google Gemini 2.0, right? Because I normally have a guest, right? So I actually use a tool called Castmagic that has that built in. So I'm, I'm going to

Starting point is 00:24:56 definitely be checking out this new offering from ElevenLabs, which can auto-identify speakers. I think that's huge. It also has word-level timestamps for precise subtitles and auto-tagging of sound events like laughter. That's kind of cool, kind of creepy as well, but kind of useful. So right now, Scribe only works with pre-recorded audio, but will soon offer low-latency, real-time options. So that's pretty cool. When that comes out, I'm going to have to hit up ElevenLabs a little bit and, you know, always have that going live when I have a guest on the show.
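Since the accuracy claims for Scribe are stated in terms of word error rate, here is a minimal sketch of how that metric is usually defined: the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. This is the textbook definition, not ElevenLabs' evaluation code, and the sample sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[len(hyp)] / len(ref)


# Made-up example: one wrong word out of seven.
reference = "this is your daily live stream podcast"
hypothesis = "this is your daily live stream broadcast"
print(f"{word_error_rate(reference, hypothesis):.1%}")  # 14.3%
```

By this definition, a word error rate below 5% means fewer than one word-level mistake for every twenty words in the reference transcript.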

Starting point is 00:25:38 That would be super helpful for me. You know, I'm curious, does anyone out here, like, do you all use tools like ElevenLabs? Like I said, I think it's been great for text to speech. It was one of the leaders in that field. I think it still is for when people just needed voiceovers, you know, audio books. You know, I think people overused it, maybe early on, and didn't put enough care into it. But it's actually a very, very good platform.

Starting point is 00:26:06 So George here is saying ElevenLabs' GibberLink allows AI to talk to AI. Yeah, in a faster-than-human language, I saw that. That was a pretty cool demo. Essentially, you know, two AI agents were talking to each other. They identified that they were both AI agents. I believe this was an open source project. And then they just used their own GibberLink kind of to talk to each other. It sounded like two fax machines, you know, talking to each other.

Starting point is 00:26:31 It was pretty cool. Yeah, Samuel here says, I pay the $5 a month to ElevenLabs just to be able to listen to my docs. Huge point, Samuel. I do that as well. That's one of the things I actually use ElevenLabs for the most. I pay for it. Sometimes I have a big block of text. And I don't want to go into, as an example, OpenAI's backend in their playground

Starting point is 00:26:53 because you can only do a certain amount of text at once. So yeah, I often will do that if, you know, many times I'll just throw it into NotebookLM and get more of a summary or more of a conversation around it. But if I actually need to read something, you know, point by point and I'm super busy, a lot of times I'll just grab that, you know, a couple thousand words, throw it into ElevenLabs, you know, crank the output up to 2X because, you know, that's why I speak so quickly, because I listen to stuff so quickly, I guess. But yeah, a great, great use case, I would say, for ElevenLabs.

Starting point is 00:27:27 Look at this. Here we are, you know, just rounding out the big tech lineup for this week. So Amazon is denying reports that Anthropic's AI is powering their new Alexa Plus features. So yeah, if you pay attention to this show, we covered this. So Amazon finally announced their smarter Alexa, right, powered by large language models. And earlier reports were that it was powered by Anthropic's Claude AI model, but apparently not, because Amazon, well, at least not entirely. So Amazon has publicly refuted claims that its recently announced Alexa Plus capabilities are powered by Anthropic's Claude AI models. And this is sparking a lot of discussion now online.

Starting point is 00:28:16 So Amazon insists that its in-house model, which is called Nova, powers the majority of Alexa Plus conversations. So in response to a CNBC report claiming Anthropic's Claude model was handling most customer interactions, Amazon stated that Nova has managed over 70% of conversations, including complex requests, in the past month. So yeah, this is just starting to roll out over the next week or two to paid Amazon users. So maybe that was just in testing. They also didn't really say like, hey, what's the other 30%? So I'm assuming maybe the other 30% is Claude.

Starting point is 00:28:58 We'll have to see; this report just came out. But Anthropic obviously is a key investor in Amazon. Sorry, Amazon is a key investor in Anthropic. And Amazon, the company maintains that its own proprietary AI, Nova, is responsible for some of these advanced features. So the upgraded Alexa Plus boasts generative AI capabilities. Dubbed Alexa Plus, the new iteration is designed to be more conversational and capable, handling tasks like grocery shopping, booking services,

Starting point is 00:29:31 sending texts and browsing websites. Yeah, it was funny in the Alexa Plus kind of demo. I'm like, why are all these demos just like you buying more things from Amazon, right? Why can't you just show me an instance where Alexa just isn't dumb, right?

Starting point is 00:29:48 It's like so much of the demo is just like, oh, you're buying more stuff from Amazon. Like, I don't know. Like literally. I'll ask for like, I don't know, the weather or hours at a store. And then Alexa, old school dumb Alexa is still just being like, do you want me to add this to your cart? And I'm like, I asked about the weather. So I don't know.

Starting point is 00:30:08 I'm not super pumped about this, but, you know, it's got to be better than what we currently have. And this was supposed to come out many, many months ago. But the launch has faced delays and early challenges, as Alexa Plus was delayed due to issues with hallucinations and incorrect answers during testing. However, Amazon CEO Andy Jassy highlighted the transformative impact of gen AI on making such advancements possible. So, yeah, Michelle says Alexa and Siri are both useless. Don't worry, Michelle.

Starting point is 00:30:44 It looks like Siri might be useless until 2027. So more on that here in a couple of minutes. All right. Speaking of voice assistants, a new one is taking over the web. So yeah, kind of all weekend and even early today when I was sleuthing online to bring you all the latest news, Sesame, the AI chatbot, and their chatbot's name is Maya, is really grabbing a lot of headlines for its ability to mimic human conversation with uncanny realism. So Sesame's Maya aims to cross the uncanny valley of conversational AI. So the company showcased Maya in a demo, emphasizing its ability to replicate human speech

Starting point is 00:31:34 and interaction, making it feel more like talking to a real person than a chatbot. So yeah, a lot of people that, like, I actually follow and respect online were losing their noodles over Sesame and Maya over the weekend. So Maya impressed a lot of users with its conversational flow and realism. So during test conversations, which you can go do right now, you don't even have to have an account. It shows, it does, how am I going to say this? If you're not a heavy conversational user, you might really be impressed with this new Sesame voice assistant. All right.

Starting point is 00:32:17 It is more natural. It responds with very low latency, the voice does sound more realistic and more human. For me, it'll be crazy. I absolutely hated it. I probably won't be using it. Sorry, Sesame, you're not going to be sponsoring the Everyday AI show anytime soon. Is it great?

Starting point is 00:32:40 Yes. Does it have a very high ceiling? Sure, right? I don't know. For me, one thing I noticed in testing this new Sesame AI voice model, you know, and I'm curious, live stream audience, did any of you guys use this over the weekend or, you know, early today? It doesn't seem to do a good job of actually answering your question. So I think a lot of people are fooled by this low latency claim that a lot of companies,

Starting point is 00:33:10 you know, AI voice companies are putting out there because usually the initial response to a question that you ask it is this kind of a delay tactic, right? Or it just says, I mean, kind of like a human would, right? Like they kind of laugh about your question or they're like, oh, that's a good one, right? So is it actually low latency? I mean, like yes and no, right? I think they achieve that like immediate human to, you know, AI conversational rate, like where it can respond to you almost immediately. Because it just responds back with some useless, needless, unrelated, right?

Starting point is 00:33:46 It's just this little quip that buys itself some time to then answer your question. Also, at least for me, kind of the default of this Sesame I found extremely frustrating. Because at least for me, when I talk to an AI voice assistant, I don't want fluff. I don't. And that probably puts me in the minority of people, right? Maybe people want, you know, this, I don't know, unrelated quips and stories. No, I want facts. I want stats.

Starting point is 00:34:16 I want fast. I don't want any gibberish, right? So if you're like me and maybe would prefer talking to, you know, robots sometimes versus humans, right? Like, yo, I just want facts, stats, and I want it quick, right? So at least for me, Sesame wasn't really appealing, probably not something I'll be using a lot. And it did struggle. It seemed like with just fact recall, right? Something simple.

Starting point is 00:34:43 One thing I always do is just be like, tell me about the Everyday AI podcast, right? And it didn't, right? And it should be in the training data, right? Because we've had hundreds of episodes dating back to 2022, right? Is that right? We've been doing it for this long. No, 2023. So, you know, it did struggle with just fact recall and some other things that I tried.

Starting point is 00:35:02 But it is free. Go try it out for yourself. Let me know, let me know what you all thought. So Samuel says, Maya's EQ is next level. That's true. So if you are more on the emotional, you know, if you are looking to get some EQ benefits out of an AI model versus just IQ. I've talked about this.

Starting point is 00:35:23 I prefer IQ. The EQ's super nice. You know, it is pretty good. George says it feels it is very touchy-feely, but the voice is really good. Yeah, I'd agree. I would agree with those, with those observations. Yeah, Nisiani. Nisiani knows.

Starting point is 00:35:40 That's the former journalist in me. Yeah, anytime someone puts something out, I'm like, I don't know about this. Let me go test it. And I'll tell you guys how it is. But I still think it's pretty impressive. You can go check it out, right? All right. Our last couple pieces of AI news.

Starting point is 00:35:54 So Apple has announced a $500 billion investment in the U.S., including a server factory in Texas. So according to reports, Apple has announced a massive $500 billion investment in the U.S. over the next four years, which will include a new AI server factory in Texas and reportedly the creation of 20,000 research and development jobs nationwide, according to Reuters. So the investment will span a variety of areas, including purchases from U.S. suppliers, manufacturing expansions, and content production for Apple TV. So Apple will reportedly partner with Foxconn to develop a 250,000 square foot facility in Houston

Starting point is 00:36:42 that will assemble servers for its AI-powered services. These servers are currently made outside of the U.S., marking a shift toward domestic production. The company also plans to double its advanced manufacturing fund from $5 billion to $10 billion, with a significant portion allocated to producing advanced silicon at the Taiwan Semiconductor Manufacturing facility in Arizona. So most of Apple's products are assembled overseas, but many components such as chips from Broadcom and Skyworks Solutions are made in the U.S. So as part of this investment, Apple will launch a manufacturing academy in Michigan to offer free courses in project management and manufacturing process optimization

Starting point is 00:37:35 to small and mid-sized companies. Hey, more news on AI voice assistants that apparently aren't going to be super smart anytime soon, well, at least not Siri. Right? So we just heard that Alexa is getting smarter and that's going to be rolling out to paid Amazon subscribers here in the coming weeks. But you might have to wait until 2027 to get that fully smart Siri from Apple and Apple Intelligence. Yes, I did not get that wrong.

Starting point is 00:38:09 New reports from Bloomberg, always on the spot, are suggesting that Apple's long-awaited overhaul of Siri, described as a modernized conversational version, is now reportedly delayed until 2027. So this, yeah, we might have someone on the live stream this morning. I think Michael said we might actually have AGI before we actually have a smart Siri. So the upgraded Siri is expected to debut with iOS 20, combining a generative AI approach with the assistant's classic features for a more advanced, seamless experience. I think by classic features, it just means an assistant

Starting point is 00:38:55 that's not super useful. So while Apple is planning to release a limited LLM-powered version of Siri with iOS 18.5, so that would be pretty soon, it will reportedly run as a separate model and fall short of the significant improvements users are anticipating and Apple is marketing. So Bloomberg reports noted that the real upgrade to Siri will begin to take shape in iOS 19.4, but won't reach full maturity until iOS 20. Wow. The enhanced Siri is expected to feature contextual understanding and improved autonomy, potentially rivaling advanced AI assistants currently dominating the market. That's AI assistants of today, right?

Starting point is 00:39:43 I don't, I, I fully don't understand how Apple and their Apple Intelligence has so severely fumbled the bag. That's, I don't know, what's the euphemism for like fumbling the bag, but 10 times worse? Apple had all the money, all the resources. They know where this technology's headed. They're partnered with OpenAI and, you know, ChatGPT. There's a ChatGPT Siri integration, so they have to be getting a good amount of data from this partnership. Yet 2027, right? All right, I get it. So one part of me is like, all right, it's better to

Starting point is 00:40:22 under promise and over deliver, right? Then a lot of companies are like, oh, we're releasing the world's best model tomorrow, and then it takes like three years. So I get it. I just do not, I cannot fathom how Apple is so, so behind, at least when it comes to bringing all of these things to market. Yes, Apple, I think, is one of the leaders in putting privacy and security first, right? But I don't know, like at what cost, right? If there were actual smartphones from other companies that were just as intuitive as Apple, right? Samsung has great phones, right? There's a lot of great phones outside of Apple, but I don't know.

Starting point is 00:41:04 Maybe it's just because the Apple interface is so stupid easy. When I pick up, you know, a Samsung phone or whatever, right? If I'm at like Best Buy and I'm just scrolling through, I'm like, I don't even know how to use this thing. Right. I don't know. Maybe Apple does that on purpose. Maybe they make all of their iPhone users. They just make it so easy that it seems impossible to pick up and use a non-Apple device.

Starting point is 00:41:27 Maybe that's what we're, you know, looking at here. All right. Our last piece of AI news. OpenAI has launched its newest model, GPT 4.5. So this is the company's latest and largest AI language model, offering improved writing skills, better world knowledge, and a more refined conversational experience. So GPT 4.5 is OpenAI's largest model yet, with the company describing it as its most knowledgeable model to date. It is now available as a research preview for ChatGPT Pro users, all right.

Starting point is 00:42:05 So with broader access rolling out in the coming weeks. I do assume it'll probably be by about mid-March that ChatGPT Plus users will have access to this new model, GPT 4.5. I do know that other, you know, third-party providers such as Perplexity, Poe and others. So if you have a paid subscription to something like a Perplexity, Poe, You.com, et cetera, right, you can probably start using this 4.5 model in very limited capacity right now. And you don't have to wait for OpenAI to roll it out to other tiers. But they did say that it will be out to most pro tier or sorry, most paid tiers in the coming

Starting point is 00:42:47 weeks. But right now, if you are using chatgpt.com, right? So if you're using ChatGPT on the front end, you are only going to have access to 4.5 if you are on the $200 a month Pro plan. So speaking of EQ earlier, that's where this model shines. And yeah, I said, I don't really need it. But, you know, in my use so far of GPT 4.5, I do see the benefits of having a model that just feels more natural, more intuitive, and just more human, right? Because that's the thing.

Starting point is 00:43:23 OpenAI straight up said, this is not a frontier model, right, which was kind of surprising. And they did say, hey, the big thing, I broke it down in two words, right? I would have liked Apple or sorry, OpenAI to break it down this way. So you can go listen. I cover this in episode 472, which I believe was on Friday. I said what they're trying to do is make it more reliable and more relatable. So here's what that means.

Starting point is 00:43:50 On the reliability side, OpenAI shared some benchmarks and metrics that showed the hallucination rate is going down and essentially its ability, its knowledge rate is going up. So it is much more reliable than past models like GPT-4o or even their reasoning models, you know, the o3, you know, o1, o1 Pro, et cetera. So it is more reliable, which is huge, right? That's one of the main reasons that I think a lot of companies and individuals don't even jump in to these models to begin with because they don't, they feel they can't trust them. So it's not hallucination-free, right, but it scored much higher on some of OpenAI's benchmarks in terms of just getting things right and hallucinations are drastically down. So that's number one.

Starting point is 00:44:38 And then number two, it's more relatable, right? Sometimes when you're speaking to ChatGPT, right, either verbally or just typing, right, let's just say typing because right now the voice mode is still powered by GPT-4o, not by the new 4.5, but it does feel more human. And so here's the thing. If you're the type of person that likes to use ChatGPT as a friend, as a life coach, you know, as a therapist, something like that, this is a no-brainer, right? Especially when it rolls out to the $20 a month ChatGPT Plus plan. You're going to love it. For everyone else.

Starting point is 00:45:15 But where I actually have started to see some value in having this, you know, equally smart EQ, large language model is as a business strategist, right? That's something I use large language models for a lot. And, you know, I've noticed that OpenAI's latest model in GPT 4.5 does a much better job sometimes of picking up on nuances of what I'm trying to say, but maybe might not be saying, right? Sometimes I might just be giving ChatGPT just a ton of data and like asking it for suggestions, right? Asking it for strategies.

Starting point is 00:45:50 One thing you should always be using it for, I don't care what model you're using. You should always be using models to second guess yourself, right, to fight back on a decision that you're making. Because if you do that, I think your decision, either you are going to have to defend it and make it even better, or you're going to be thinking about things that you maybe weren't thinking about before. In that case, GPT 4.5 runs laps around everyone else. So in certain use cases, I think it's fantastic. Traditional benchmarks, this thing was a, right? Literally. I mean, yes, the benchmarks across the board improved from their GPT-4o models, but this thing did not bench off the charts, which I think a lot of people were expecting, right?

Starting point is 00:46:33 But here's the other thing with this model. This is a foundation for future models, right? So in the same way that Anthropic is going to this hybrid approach, right? They're essentially combining transformer models with reasoning models. OpenAI has said that is their future as well. So when we get, quote unquote, when we get GPT-5, it's going to be a hybrid model like Claude 3.7 Sonnet is right now. And I think that's the thing that people are overlooking. This is not, OpenAI said this.

Starting point is 00:47:07 They're like, this is not a frontier model. This is not supposed to be benchmarking off the charts. It is a new, fresh model that understands humans, which I think is huge because I think the future reasoning models, even though they may not get a name, right? Essentially, OpenAI said, yeah, in the future, they're just all going to be one model. But the reasoning models, right, are going to be exponentially better in future versions of OpenAI's offerings because of this stronger and much more capable GPT 4.5 model. That's how these models are built, right? The o-series models were built on 4o, right?

Starting point is 00:47:50 So now when you think, when you have a much more human 4.5 model, imagine what that means for the future of these reasoning models or a hybrid model. So I think it's going to be extremely impressive. So like I said, following the launch for Pro users, GPT 4.5 will be expanding, according to OpenAI and kind of their release announcement. It will be going to Plus and Team users in the coming weeks. So that could be as soon as this week. I'm guessing it might be next week.

Starting point is 00:48:24 Essentially, OpenAI said, yo, we're out of GPUs. We can't serve this thing, which I thought was interesting, right? And also the API pricing on this is wild. It's wild, right? You know, we were kind of, you know, complaining or, you know, rolling our eyes that Claude 3.7 Sonnet didn't reduce their prices, right? But the price for GPT 4.5 via the API is astronomically high, right? $75 per million input and then $150 per million output. So that's wild.

Starting point is 00:49:08 So compared to GPT-4o, all right. So we're going GPT 4.5, $75 per million input. GPT-4o, $2.50. Right? That's wild. And then on the output side, GPT 4.5, $150. All right. GPT-4o, $10.
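
The per-million-token prices quoted here can be compared with a quick calculation. This is only a rough sketch: the dollar figures are the ones stated in this episode, while the dictionary keys, function names, and the 10,000-in / 1,000-out request size are made up for illustration and are not official API identifiers or real usage data.

```python
# Prices in USD per one million tokens, as quoted in the episode.
# The keys are informal labels for illustration, not official model IDs.
PRICES = {
    "gpt-4.5": {"input": 75.00, "output": 150.00},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}


def price_ratio(expensive: str, cheap: str, direction: str) -> float:
    """How many times more one model costs than another, per token."""
    return PRICES[expensive][direction] / PRICES[cheap][direction]


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request with the given token counts."""
    p = PRICES[model]
    total = input_tokens * p["input"] + output_tokens * p["output"]
    return total / 1_000_000


input_ratio = price_ratio("gpt-4.5", "gpt-4o", "input")
output_ratio = price_ratio("gpt-4.5", "gpt-4o", "output")
print(f"input: {input_ratio:.0f}x, output: {output_ratio:.0f}x")

# Hypothetical request size, chosen only to make the gap concrete.
for model in PRICES:
    print(model, f"${request_cost(model, 10_000, 1_000):.3f}")
```

For that made-up request size, the sketch puts a single call at roughly 90 cents on GPT 4.5 versus a few cents on GPT-4o, which is the gap being described here.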

Starting point is 00:49:30 So 15x more expensive on the outputs. On the input, I think that's what, like 25x or 30x more expensive. Yeah, 30x more expensive. So the API prices are out of this world. So I'm guessing maybe OpenAI may reduce that once they're able. They said that they were trying to secure more GPUs. I'm sure the costs are going to go down eventually. But maybe they're saying, hey, right now there's people that are really going to,

Starting point is 00:49:58 there's companies and customers out there, I'm sure, that are going to find value in that relatability and reliability of that new model. So wow, it is mind-bogglingly expensive. All right. Let's quickly, very quickly recap those top stories for the week. So first, Anthropic has released Claude 3.7 Sonnet, the world's first AI hybrid model. Google co-founder Sergey Brin is pushing for harder work for Google to win the race to AGI, reportedly asking employees to work up to 60-hour weeks, maybe more.

Starting point is 00:50:41 Meta is reportedly developing a standalone AI app to compete with OpenAI and Google. Funny, by the way, Sam Altman responded to that on Twitter and said maybe we'll make a social media app. Microsoft Copilot crushing it, you know, offering free users free unlimited voice and advanced Think Deeper features. That is using OpenAI's o1 model. ElevenLabs has launched Scribe, a standalone speech-to-text model that supports more than 99 languages. Then Amazon is denying reports that Anthropic's AI is powering its new Alexa Plus features and saying that it's their own internal models. Next news story, the internet is going berserk over the new Sesame AI chatbot that offers voice capabilities. For me, it's okay, but go check it out for yourself.

Starting point is 00:51:36 It's free to go ahead and try. Apple has announced a $500 billion U.S. investment plan, including an AI server factory in Texas that will reportedly create 20,000 jobs there. A Bloomberg report shows that we might not get Apple's truly modern Siri until 2027. And then last but not least, OpenAI has launched GPT 4.5, a model that really emphasizes relatability and reliability. Woo! That was a lot of AI news, y'all.

Starting point is 00:52:13 I hope this is helpful. If so, please share this, right? I know some of you, everyday AI, it's like your secret. It's your cheat code. It's how you're the smartest person in AI at your company. Please share the love, right? People are always like, hey, Jordan, how can I help? Click that repost button.

Starting point is 00:52:29 That helps, right? If you're listening on the podcast, please follow the podcast, leave us a rating, tell someone about it, right? You can send individual episodes. Please share this as much as you can. I know AI can be tricky. It's hard to keep up with. It can be scary. I spend and our team spends countless hours trying to keep you up to date so you can grow your company and grow your career confidently.

Starting point is 00:52:54 Speaking of that, if you haven't already, make sure you go to youreverydayai.com. Go listen to our 2025 AI predictions and roadmap series. That's episodes 443 to 447. They are, I'm telling you, bangers, bangers. All right. So thank you for tuning in. Please go subscribe to our newsletter at youreverydayai.com. Thanks.

Starting point is 00:53:16 We'll see you back tomorrow and every day for more everyday AI. Thanks, y'all.

Starting point is 00:53:49 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
