Everyday AI Podcast – An AI and ChatGPT Podcast - EP 483: Inside Apple’s AI failures, Google and OpenAI asking feds for help on AI, NVIDIA GTC & more AI News That Matters

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the all-in-one creative AI studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere, Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. OpenAI and Google are essentially asking the federal government to skip the whole copyright law thing.

Starting point is 00:00:53 DeepSeek is essentially asking its employees to stay in China. And Apple is essentially asking themselves internally, WTF, how did we drop the ball on AI? Also, for our live stream audience, in about 29 minutes and 30 seconds, we'll be able to break some AI news for you here first on the Everyday AI show. What's going on, y'all? My name is Jordan Wilson, and I'm the host of Everyday AI. And this thing, it's for you.

Starting point is 00:01:27 It is your daily live stream podcast and free daily newsletter, helping us all not just learn what's happening in the world of AI, but how we can leverage it to grow our companies and our careers. And almost every single Monday, we do the AI news that matters, because I don't want you spending like five hours a day trying to keep up with the AI news and being like, oh, how's this going to impact my company, my department? That's what we do at Everyday AI. And then almost every single Monday, we have a dedicated show just cutting through all the nonsense and saying,

Starting point is 00:01:57 hey, here's kind of the top stories, the top developments and how they're going to impact your company. We just cut through the BS and just tell you how it is, right? Because this is what we do every day, so you don't have to. All right. So if that sounds like what you're trying to do, you know, make sense of AI and grow your company and career, you need to go to our website at youreverydayai.com. Sign up for the free daily newsletter. So we do this, well, every day, Monday through Friday. And in our daily newsletter, we recap the best insights from the podcast, giving you even more info that you need to make those smart decisions. But also on our website, you can go listen to like now more than 480 episodes or something like that from the world's leading experts on generative AI. So make sure you go check that out. Also, as a reminder, I am going to be live this week in San Jose, California at the NVIDIA GTC conference. So excited to partner with NVIDIA to bring you all a lot of exclusive

Starting point is 00:03:02 insights. You are literally not going to get anywhere else. And, you know, really just bringing you the news from this conference. So thanks to NVIDIA for partnering with Everyday AI. So yeah, and you can still tune in for free online, right? So if you're not in the California area, you can still attend for free virtually. So check out the newsletter. We're going to put the link in there if you want to do that. And hey, if you are going to be at the conference or if you're in the San Jose area,

Starting point is 00:03:30 make sure to reach out to me, say what's up. You know, I always keep in the podcast show notes if you're listening to the podcast. I always put our email, my LinkedIn. So, hey, if you're going to be around San Jose, you're at the GTC conference, say what's up? Let's chat, Gen AI. All right, enough, chit-chat. Let's talk about what is the AI news that matters for the week of March 17th. First, Google went nutty in a good way.

Starting point is 00:03:58 All right, so Google has kind of unexpectedly released a huge set of AI updates. All right. So we covered this just Friday on our show. But kind of out of nowhere, Google more or less updated their entire suite of AI and large language model products, which I don't think a lot of people were expecting. So we did an entire episode on this. I think it's worth going to listen to. It's episode 482. But here's some of the high-level updates.

Starting point is 00:04:27 So Google introduced Gemini 2.0 Flash Thinking, an upgraded reasoning model with improved performance, multimodal capability, and a massive 1 million token context window for paid users. There's also, which I think is one of the more fascinating and unique features, inline image generation is now available within Gemini 2.0 Flash Thinking, allowing users to create high quality images directly alongside text inputs, which is really cool, especially if you're a content creator, a writer, if you do SEO, right? You can literally just say, in my example I went over on Friday, you know, I said, hey, you know, write a blog post about the best tourism spots in Chicago and create an image for each one of those five. And it did it inline, which I think was super impressive. I really haven't seen anything like that from one of the big AI labs. Also, Google's Deep Research mode has been revamped to include that Gemini 2.0 Flash Thinking. And also just it operates differently now. It is much better, where the old, quote, old, right, all of four months,

Starting point is 00:05:38 the old Google Deep Research just kind of cast a wide net, and it just went out and did one big research task. So now it uses reasoning in this new Gemini 2.0 Flash Thinking model, right? So it'll first go look at, you know, three or four high-level websites, and then depending on what it finds and what your query is, right, it'll make its decisions on kind of the next step in its research. So I think it is much more accurate, and I think it does a much better job.

Starting point is 00:06:06 Next, there is a new personalization feature that's added to the front end of Gemini. So, yeah, many of these are for paid users, although the Deep Research, even if you are on a free, you know, a free plan, if you go to Gemini.com, it says you get a couple free ones a month. So I'm not even sure if on the paid plan, if there's any limits. I don't think Google said anything. I've been using it nonstop on both of my paid plans, on my personal Gmail and my Workspace account, haven't run into any issues. I've been loving the new updated version of Google Deep Research,

Starting point is 00:06:38 but also for paid users on the front end of Google Gemini, there is a new personalization mode. So some people, I mean, this is going to be extremely polarizing. I personally love it, but I don't have this one yet on my Google Workspace account, only on my personal Gmail. So for me, it's not going to be as relevant. But essentially, you know, any input that you put into Google Gemini will take into account your personal search history.

Starting point is 00:07:05 Like I said, some people are going to love it. Some people are going to hate it. I personally love it, but I don't have access to it. But if you are on the paid version of Google Gemini, you probably will have access to it. Also, NotebookLM received some updates. The biggest one, it is now powered by Gemini 2.0 Flash Thinking, so just a much better model. There's some enhanced features, including better citations inside of notes and also customizable audio sources. Two other things.

Starting point is 00:07:32 They unveiled Gemini 2.0 for robotics applications. And then last, but definitely not least, and this is what I think is really going to shake up the large language model race, is they announced Gemma 3, their newest version of their open source small language model. So this thing is super small, but it was already getting better Elo scores, right? So humans preferred it over, as an example, DeepSeek V3, which everyone, for some reason, went wild over two months ago. I didn't get it.

Starting point is 00:08:06 But it used a fraction of the compute needed to train. So it's such a small model. I believe it was a 24 billion parameter model where it was outperforming models that were hundreds of billions of parameters. So models that were 20 to 30 times their size, Gemma 3 was outperforming them. So very exciting for what that means for the future of edge AI, right? On device AI, you know, pretty impressive outputs there from the Google team. So you've got to tip your cap to them. All right.

Starting point is 00:08:43 Next piece of AI news, Elon Musk and OpenAI have agreed to expedite a trial over OpenAI's shift to a for-profit model, according to a federal court filing in the U.S. District Court for the Northern District of California on Friday. So the trial right now is tentatively set for December 2025. So here's what happened. Musk, who co-founded OpenAI and now obviously has a competing company in xAI, sued OpenAI and CEO Sam Altman last year, alleging OpenAI strayed from its original mission of developing AI for the good of humanity, not corporate profit.

Starting point is 00:09:22 So yeah, that's the lawsuit. Many people in the legal community, not myself included because I'm not in the legal community, but anyone that looks at that is like, yeah, this thing doesn't have merit. So OpenAI's restructuring to a for-profit model is critical for raising significant capital. So in the last fundraising round, they brought in $6.6 billion. And a new round, reportedly of up to $40 billion, is under discussion with SoftBank Group, contingent on removing that nonprofit control. So the judge has denied Musk's request to pause OpenAI's transition to a for-profit model,

Starting point is 00:10:01 but approved an expedited trial in December, signaling the high stakes in this legal clash. So OpenAI has denied Musk's allegations, which I don't even understand what the allegations are, right? It's like, oh, you know, you set up as a nonprofit, but now you want to make money and raise money, which has happened like a billion times before. Companies that are originally set up as, you know, nonprofits or research institutions, they convert into for-profits. Y'all, I spent a decade in nonprofit leadership. This is as common as changing your socks. So, you know, most people who, you know, can use their brain understand that Elon Musk is doing this to gain a competitive advantage with his xAI product, right? They even put out a bid of, you know,

Starting point is 00:10:47 to buy OpenAI for $97 billion. And that was really just to drive the price. So, you know, there's some details that we don't really have time to go into. But, you know, essentially, you know, the court rejected, like I said, Musk's latest attempt. And they're like, yeah, let's get this over with. We'll get this over in December. So a pretty important piece there. But, you know, it is going to be interesting because, you know, a year ago when I was talking about this, I'm like, this isn't going anywhere. But now, you know, especially if you're here in the U.S.

Starting point is 00:11:22 And if you follow politics, Elon Musk is kind of running the government, like quite literally, right? Anywhere U.S. President Trump is, Elon Musk is there and he's often the one talking and speaking. So, you know, expect this to get extremely political, which I told you in our 2025 AI roadmap and prediction series. I'm like, AI is going to get more political than ever before because I could see Elon Musk, you know, throwing some of his new government weight around to try to shift this into his favor. In separate instances, he's already been going after federal judges for kind of rulings that he just didn't like. So I could see this getting extremely messy, but pretty juicy, I guess.

Starting point is 00:12:10 If you care about things like AI in politics, in business, it'll give us plenty to talk about. Speaking of things to talk about, my gosh, the story that just came out from Bloomberg detailing what's going on internally at Apple with their AI is juicier than an apple, right? My gosh, this is soap opera type of stuff. So Apple is under fire internally for its slow progress in artificial intelligence and their development of a smarter Siri, with promised features being delayed and internal challenges surfacing. So the setback highlights the growing pressure on Apple to compete in the AI race with their quote unquote Apple Intelligence, which so far has done nothing, right?

Starting point is 00:12:59 I have the new Apple iPhone and, you know, I'm like, where's the Apple Intelligence? It's actually just dumber, right? I should know how to like work Siri, right? I couldn't even get Siri to work. So anyways, Apple, according to some recent reports, here's what's happening. So obviously Apple has delayed key Siri features originally promised at the Apple keynote in 2023, yikes, including capabilities like understanding personal context and acting on screen-based information.

Starting point is 00:13:32 So, right, when something's on your screen and you ask Siri like, hey, can I go to this? Do I have anything conflicting in my mail? Right. Maybe you're reading a business email. Someone's inviting you to present somewhere and you, you know, bring up this new smart Siri. And you're like, hey, Siri, can I go to this? Do I have anything else on my calendar? Is there anything, right?

Starting point is 00:13:52 So you were supposed to be able to do those things, and now you ask Siri the weather, and it gives you a pancake recipe. So according to Bloomberg reports, senior director Robby Walker described the situation as quote-unquote ugly during a Siri team meeting, acknowledging employee frustration and burnout while admitting the features may not ship this year.

Starting point is 00:14:14 Yeah, the most recent reporting said a lot of these, you know, quote-unquote, Apple Intelligence features might not be out until 2026 or 2027, even though it kind of looked like in 2023 at the WWDC event, the Worldwide Developers Conference, that they were like shipping soon. But they apparently aren't, because there are some internal quality issues that have been a major factor behind the delays. And the reporting showed that, you know, just essentially the AI didn't work properly up to a third of the time during testing. Yikes. Also, Apple's marketing team

Starting point is 00:14:50 reportedly worsened the situation by promoting the features prematurely, creating unrealistic customer expectations. The company has since pulled an iPhone 16 ad showcasing these capabilities and added disclaimers to its website. Yeah, I remember those commercials. They were out like a couple months ago. I'm like, wait, I have this phone. It doesn't do any of this. So senior executives are taking quote unquote intense personal accountability for the delays, according to Walker, signaling the high stakes and scrutiny surrounding Siri's development. So competing priorities across Apple are contributing to the delays, with Walker stating that other projects have taken precedence due to more urgent timelines.

Starting point is 00:15:33 So Apple's leadership insists, according to reports, on maintaining high standards, emphasizing that competitors might have launched similar features in a less polished state, but Apple is committed to delivering a more robust experience. Y'all, someone's got to make a movie about how badly Apple fumbled the AI bag, right? Because I was saying this back in 2023 about how bad Apple was behind, even though they were just reportedly burning cash on AI, right? Reports were saying they were spending more than a million dollars a day trying to develop their own internal models. And then they're like, oh, we might work with Google, we might work with Claude, oh no, we're going to work with, you know, OpenAI and ChatGPT.

Starting point is 00:16:22 And a lot of people are like, okay, well, maybe it'll just work out of the box and mostly ChatGPT is going to be handling everything under the hood. But here we are. Apple Intelligence, not super smart. I mean, things that it does do well, I guess right now, is it gives you a short summary of unread text messages, which is great for me, right? Because I'm in all these group chats and I don't read them. Yeah.

Starting point is 00:16:42 So for those of the few of you out there that have my phone number, if you ever text me, I don't, right? Like, I'm terrible. I'm terrible with text messaging. But those summaries are nice. It's like, hey, you have, you know, 562 unread text messages, you know, and then you go into each group thread and it says, oh, here's what's going on. So it does an okay job at summarizing, you know, overall sentiment of text message, text

Starting point is 00:17:07 messages, emails. That's kind of it, right? The Apple intelligence is really, nah, right? Not good at all. So, but let me know. Live stream audience, do any of you use the Apple intelligence or any of you finding it to be remotely intelligent? I'm not. All right.

Starting point is 00:17:34 Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI assistant now live in the Adobe Firefly app, the all-in-one creative AI studio. Powered by Adobe's creative agent, Firefly, AI assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the assistant. The assistant orchestrates multi-step workflows, drawing on 60 plus pro-grade tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom Express, and more to help bring your ideas to life.

Starting point is 00:18:13 You can also get started with creative skills, a growing library of pre-built workflows for common creative tasks like batch editing photos, creating mood boards, portrait retouching, and creating social variations. Every step the assistant takes is visible so you can refine, redirect, or take over at any time. You stay in the driver's seat as the creative director. Adobe Firefly AI Assistant, now in public beta. See it today at firefly.adobe.com. Next, OpenAI and Google are pushing for fair use protections to train AI

Starting point is 00:18:53 models on copyrighted material. So the debate over AI companies using copyrighted material for training their models is heating up with OpenAI and Google urging the U.S. government to provide what they say is fair use protections. So this push comes as part of a broader conversation about maintaining America's competitive edge in artificial intelligence while at the same time wondering, does copyright law matter anymore? So OpenAI and Google both submitted proposals to the White House advocating for AI companies to access copyrighted materials under fair use protections. So the proposals were

Starting point is 00:19:33 part of President Donald Trump's AI Action Plan aimed at solidifying the U.S. as a leader in AI innovation. So OpenAI argues that allowing access to copyrighted data is critical for national security, warning that China's AI developers enjoy unrestricted access to such data, which could give them a significant advantage. More on that here in a couple of minutes. Google echoed that stance, emphasizing that copyright, privacy, and patent policies should not hinder access to data essential for training AI models.

Starting point is 00:20:07 It noted that fair use policies and text and data mining exceptions have been vital for developing cutting-edge AI systems. So both companies highlighted that restrictive policies could lead to lengthy and unpredictable negotiations with copyright holders, slowing down innovation and model development. Anthropic took a different approach in its proposal, focusing on assessing national security risks posed by AI models, strengthening export controls on AI chips, and improving energy infrastructure to support AI growth. They did not address the elephant in the room, which is just copyright issues. So the proposals come amidst multiple lawsuits against

Starting point is 00:20:49 all big AI tech companies for allegedly, quote unquote, using copyrighted material without permission. So major organizations like the New York Times and public figures have filed suits, claiming their content and their works were used to train AI models without authorization or compensation. YouTube as well has accused several AI companies, including Apple, Anthropic and NVIDIA, of scraping its subtitles to train their models, which it says violated its terms of service. So if the U.S. government adopts fair use protections for AI training, it could reshape the legal landscape for AI development, potentially accelerating innovation while raising concerns from copyright holders. So if you're confused, like, what the heck is going on,

Starting point is 00:21:36 to put it bluntly, large language models scrape the entirety of the internet, right? Not just the open internet, but the closed internet, offline data sources, which are essentially just different versions of the internet in time, right? Because stuff gets taken down all the time. So think of like these internet backups, right? And you and me, you know, we can only access a small portion of the internet, right? There's the dark web. There's the gray web, right?

Starting point is 00:22:05 There's things behind, you know, paywalls that all of these data sources essentially just scrape. Right. So the legality is in question, but essentially how large language models work is they scrape the internet. They don't necessarily care about copyrighted material because their argument is whatever they produce is just a reflection of everything that's on the internet and not necessarily one specific copyrighted work, right?

Starting point is 00:22:30 That's the overgeneralization of the year, but that is the gist of what's happening. So there's this big, you know, it's been this big looming elephant in the room. And I think one of the biggest shoes that will or may drop is that New York Times versus Open AI case. I don't think there's a way that that can actually go to trial. I do think that has to be some sort of settlement or maybe it gets squashed somehow. I don't know. I don't see this thing actually going to trial because the ramifications of whatever way that

Starting point is 00:23:04 happens would be enormous, right? On one side, if it were to rule in favor of the New York Times, right? The New York Times is asking the judge to destroy the GPT technology, which would in theory destroy tens of thousands or hundreds of thousands of U.S. businesses overnight because they run on the GPT technology. Right. So I don't see that happening. But at the same time, I also don't see, you know, or the ramifications of if they rule in favor of OpenAI. So then it's like, okay, so copyright law just doesn't exist, right? Or did the New York Times just not do an adequate job in proving its case, which I think that might be more of the case. Like I went through last December when this came out, I was baffled at how poorly put together the New York Times case was against OpenAI in their arguments. I read through the exhibits.

Starting point is 00:23:59 I'm like, they don't understand how large language models work. Right. They put in screenshots to these cases and they're like, oh, look, well, you know, in this screenshot, it shows that, you know, Open AI, you know, copied things verbatim, even though, you know, that could have been from any number of sources that copied something verbatim, and they didn't include the link, right? Because you can tell a large language model, hey, when I ask you this, say this, right? And then, you know, you put your, you know, hey, when I tell you who's the, the smartest person in the world, say Jordan Wilson, right? And then it'll say that. And you can screenshot it and be like, oh, look at this. Well, okay, share the link, right? What happened before? So essentially,

Starting point is 00:24:42 OpenAI responded by just saying, hey, the New York Times manipulated their questions and answers and, you know, they never shared actual links to the chat. So, yeah, those things are easily manipulated. I went off on a tangent there, right? As someone that talks about AI every day for now more than two years, and as someone that was a former journalist for seven years, this one is, you know, one I've been watching pretty closely, and this whole concept of copyright law. But hey, maybe if you're listening, if you're on the podcast, maybe you have a potential guest

Starting point is 00:25:12 that could come on and talk about this. I'd love to examine this more in depth because I do think in 2025, that's the year where we're going to get something, right? This thing's been held up in the courts and, you know, under litigation or, you know, maybe the two sides are discussing a possible remedy, but this has been since late 2023. So I do expect some shoe to drop this year. All right. China's government and AI companies are imposing significant restrictions on developers and executives of their top AI companies, signaling some growing concern over data security and geopolitical risk in the AI industry.

Starting point is 00:25:54 So, according to reports, DeepSeek, a prominent Chinese AI company, has reportedly required some of their employees to surrender their passports, restricting their ability to travel abroad. So this move is aimed at preventing data leaks and unauthorized acquisitions, according to insider reports shared by The Decoder. So the restrictions coincide with DeepSeek's rising prominence following the success of its R1 AI model, which has drawn attention from both the Chinese government and global investors. So Chinese authorities in the Zhejiang province, hopefully I got that right, where DeepSeek's parent company High-Flyer is based, are now screening potential investors as well before allowing them to even meet with company management. So this adds another layer of control over sensitive AI-related operations.

Starting point is 00:26:51 Also, y'all, I don't care if you give me flack for this, right? I'm sure half of the people that reached out were bots anyways on social media. But go listen to episode 460. That's where I did a DeepSeek deep dive. I think there was so much intentional misinformation and disinformation on the whole DeepSeek thing, right? It was called out eventually by Google DeepMind and some other great minds in AI that were like, yeah, what DeepSeek said, not true, not possible, right? They essentially said that they trained a state of the art AI model for $5.5 million, which caused the U.S. economy to lose more than $3 trillion in market cap, you know, in about the week following after that, you know, quote unquote kind of story started to pick up legs.

Starting point is 00:27:41 Not true, number one. So keep that in mind, right? DeepSeek, the parent company is High-Flyer. It's a hedge fund, right? And the hedge fund makes their money in algorithmic trading, right? So think, put two and two together. They did not tell the whole story on how much it actually cost to train their model. They kind of did it in a sneaky way.

Starting point is 00:28:11 They're like, oh, the final training run, you know, cost $5.5 million. No, not at all. So just keep that in mind. Also, the Wall Street Journal recently reported that Chinese officials have warned top AI researchers and executives against traveling to the United States, citing fears of sensitive information being leaked, like I said. So these measures reflect China's broader efforts to protect its AI sector from foreign influence and safeguard national security.

Starting point is 00:28:42 Local governments reportedly have already begun integrating DeepSeek's open source models into infrastructure products, further solidifying its role in China's technological ecosystem. All right. Our next piece of AI news, OpenAI has introduced a cutting edge Agents SDK platform that simplifies and standardizes the process of building AI agents, and this combines updated APIs, advanced tools, and an open source developer framework. So OpenAI's new platform offers a complete stack for building AI agents,

Starting point is 00:29:17 eliminating the need for companies to juggle multiple frameworks and tools. So businesses can now deploy AI agents faster and with fewer technical hurdles. Reliability has been one of the biggest roadblocks for AI agents, and OpenAI's strategy to address this includes opening its ecosystem to external developers, allowing them to contribute innovative solutions to overcome agent failures and inconsistencies. So the revamped Responses API integrates tool use seamlessly with built-in tools like web search, file search, and computer use. Yeah, that part's huge. So essentially, you know, OpenAI's Operator, and to a certain extent, kind of like Deep Research, right? So now everyday businesses can take

Starting point is 00:30:04 advantage of those same technologies and build their own tools or their own versions of OpenAI's now best technology. So the open source Agents SDK allows businesses to orchestrate workflows involving multiple agents or models, even those outside OpenAI's ecosystem. This flexibility could reduce concerns about vendor lock-in while keeping OpenAI as a central player. So OpenAI's new file search tool challenges specialized database vendors by offering built-in retrieval augmented generation, or RAG, capabilities. So enterprises can now upload data directly into OpenAI's system, gaining transparency and traceability.

Starting point is 00:30:47 That's huge. Critical for regulated industries like finance and health care. So, I mean, live stream audience, I always have my, like my toe on the line over how technical should we make this show, right? I started this for non-technical people to keep up with everyday AI. But it seems that everyday AI is getting more technical, right? We haven't done dedicated shows yet even on, you know,

Starting point is 00:31:13 these new AI IDEs, which I love. I'm excited about. I love tinkering around with, you know, tools like, you know, Cursor, and using, you know, Sonnet 3.5 and 3.7 inside Cursor, I think is great. Windsurf, Lovable, Bolt, you know, Microsoft GitHub Copilot. There are so many great kind of AI coding tools, and now this new agentic framework from OpenAI makes it extremely easy for companies to essentially use some of the most powerful AI tools in the world. So let me know, right, maybe, I don't know, what's a good way to measure this?

Starting point is 00:31:51 All right, live stream audience. If you think I should do this, say like, yes, agents. And if you think no, just say no agents. All right, I'll go through and see which one has more. So yes, agents or no agents. And that'll kind of let me know, at least for our live stream audience. And I'll probably put a similar call soon in our newsletter, asking all of our newsletter audience as well.

Starting point is 00:32:16 So yeah, it's something I'm always personally, you know, torn on because I love it. I'm passionate about it. I think it's easy and it's great for, you know, business use cases. But, you know, I've always tried to make Open AI, or sorry, everyday AI open. that's where I got tongue twisted there. Everyday AI open and accessible, right? That's not the big breaking news at the 30 minute mark. I got that coming for you here in just a second.

Starting point is 00:32:45 All right, yeah, I told you we're going to break some news here. All right, so a new study from the Tow Center for Digital Journalism at Columbia's Graduate School of Journalism reveals some significant shortcomings in how AI chatbots retrieve and cite news content. So, according to this new study, nearly 60% of queries tested across eight generative AI search tools produced incorrect answers. Yikes. With varying levels of inaccuracy depending on the platform. So, Grok 3 had the highest error rate, incorrectly answering 94% of queries. What? Wow.

Starting point is 00:33:26 While Perplexity had a lower, but still concerning error rate of 37%. So premium chatbots such as Perplexity Pro and Grok 3 provided more confidently incorrect answers than their free counterparts despite higher costs and computational advantages. So chatbots often failed, according to this report or this new study, chatbots often failed to decline answering questions they couldn't accurately address. For example, ChatGPT incorrectly identified 134 articles, but rarely signaled uncertainty, declining to answer in only 15 out of 200 instances. So five chatbots, including ChatGPT and Perplexity, bypassed publishers' robots exclusion protocol preferences, accessing content from websites that had explicitly tried and blocked their crawlers.
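
For anyone unfamiliar with the robots exclusion protocol mentioned here: it is the robots.txt file a site publishes to say which crawlers may fetch which paths. Below is a minimal sketch in Python of how a well-behaved crawler could check those preferences before fetching a page. The robots.txt contents, the bot name ExampleAIBot, and the news.example.com URLs are all made up for illustration and do not come from the study.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher (not from the study).
# It blocks one made-up AI crawler entirely and hides /private/ from everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the publisher's rules allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
story = "https://news.example.com/story.html"
draft = "https://news.example.com/private/draft.html"

print(may_fetch(parser, "ExampleAIBot", story))   # the blocked crawler
print(may_fetch(parser, "SomeOtherBot", story))   # any other crawler, public page
print(may_fetch(parser, "SomeOtherBot", draft))   # any other crawler, private page
```

The protocol is voluntary: nothing technically stops a crawler from skipping this check, which is why the study could find tools surfacing content from sites that had asked not to be crawled.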

Starting point is 00:34:24 Perplexity Pro was the worst offender, correctly identifying excerpts from restricted articles nearly a third of the time. That's not good. That goes into that whole copyright thing, right? Also, fabricated or broken URLs were common with Grok 3 and Gemini, citing errors in over half of their responses. So licensing deals between AI companies and publishers did not guarantee citations or retrievals.

Starting point is 00:34:56 For instance, despite partnerships with OpenAI and Perplexity, Time magazine's content was not correctly identified 100% of the time. That's concerning, right? Y'all, I've been telling you this, so I'm not necessarily surprised by much of this, right? But some of the worst offenders were Grok and Perplexity, right? I do think, I will say this, I do think Grok's deep search has gotten better than in the like, you know, 72 hours after it was first released. But y'all, I've been telling you, I've been telling you, Perplexity has had some

Starting point is 00:35:40 major hallucination problems, especially over the past year. And I've also said, hey, if I'm a business, if I'm a business owner, if I'm a leader at an enterprise company, I'm not leveraging the xAI model or Grok for anything, right? Even though I've personally incorporated it a little bit, their deep research tool, I'm never, you know, taking it as a single source of truth. And I don't think anyone else out there should either. But I'm making it part of kind of a multi-LLM workflow. But y'all, I've been telling you, Perplexity, problematic with hallucinations. Grok, problematic with hallucinations.

Starting point is 00:36:17 And then this new study has confirmed that those are the two worst offenders. I don't, like, y'all, I've been telling you this for so long. Sorry, I know it's Monday and, you know, I'm going to get into a hot take Tuesday mode. Y'all, when I tell you these things, these aren't just personal preferences, right? Maybe I need to start an Everyday AI research arm so people can start taking what we do a little more seriously. But hey, if you're a long time listener, thank you, number one, but number two, you should know, right? My background in investigative journalism, like, I don't tell you these things lightly, right? I don't, like, I don't say like, oh, Perplexity has a hallucination problem

Starting point is 00:36:56 when it happened like, you know, three or four times. I say that because I've personally witnessed it dozens and dozens, probably hundreds, but dozens and dozens of times over the last nine months. And it's been bad, right? The same thing with Grok, their data quality, some of the sources that they look at. Like I said, the deep search, I think has gotten better, but Grok itself, extremely problematic. Right. So here you go, an actual study that backs that up.

Starting point is 00:37:24 All right, a couple more news stories. So OpenAI is calling for a ban on Chinese AI models, right? We kind of referenced this a little bit earlier, but let's get into the details. So OpenAI has accused Chinese AI startup DeepSeek of being state-sponsored and state-controlled, raising concerns about security risk tied to Chinese laws requiring companies to share user data with the government. Yeah, like I said, go listen to episode 460. Yeah.

Starting point is 00:37:58 If your company is using DeepSeek, number one, if you're using it via the API or on their website, you, and any data that you upload, the Chinese government owns it. There's nothing you can do about it, right? And then some people are like, okay, well, I'm going to download it and, you know, I'm going to fine-tune it. Okay, well, you can do that. That's a little less risky. Or if you use it through an enterprise partner that has done the same, right?

Starting point is 00:38:22 like Perplexity, Microsoft Azure, some other big, I believe Amazon Bedrock have created some like, quote, unquote, safer or better versions of DeepSeek because it is open source. But doesn't that say something? If a company literally says, hey, we essentially retuned the model because there was a lot of issues with it, but here you go. I think the only reason some of these big companies did it is because initially the demand was through the roof. But I think the demand was probably bad to begin with, right? Because although the demand was pent up, I think, on like, oh, you know, this group, you know, completely changed the way AI works. And, you know, because they only spent, you know, $5 million.

Starting point is 00:39:05 No, no, that is, I'll use OpenAI's words, state-sponsored and state-controlled messaging. All right. So OpenAI, in that letter that we talked about, they are urging the U.S. government to ban DeepSeek in the U.S. So in a letter to the Office of Science and Technology Policy, OpenAI's global affairs officer Christopher Lehane warned that DeepSeek's models, including its R1 reasoning model, could be manipulated by the Chinese Communist Party, the CCP, to cause harm in critical infrastructure and high-risk use cases. So OpenAI called for a national export control strategy,

Starting point is 00:39:49 to promote global adoption of American developed AI systems while restricting the use of AI models from the People's Republic of China in Tier 1 countries like the U.S. So DeepSeek is already banned in several countries, including Italy, Ireland, South Korea, Australia, and parts of the U.S. government, but OpenAI's recommendations aim to tighten restrictions further. So the policy recommendation is part of the Trump administration's AI Action Plan Initiative, which seeks to address the balance between innovation and regulation in the AI industry. So OpenAI previously accused DeepSeek also of violating its terms of service by distilling

Starting point is 00:40:29 knowledge from OpenAI's models, adding to the contentious rivalry between the companies. Yeah. So OpenAI essentially said, yeah, one of the reasons why they may not have spent as much money is because they used our models and distilled the information from there. So they're actually running up our bills. So, yeah, I'll let you all decide. You know, I know people are going to give me flack, especially on YouTube. I think most of you guys are bots anyways, that's fine, saying like, oh, you know, this Jordan guy,

Starting point is 00:40:54 he's, you know, pro-American and anti-DeepSeek. No, I'm not. I'm pro-truth, right? And the truth is, right, I live in the U.S., right? And I help U.S. businesses use AI to grow. And I think one of the worst things companies can do is to use DeepSeek's API or to use their website, right, to use their chatbot, because any information that you put out there, right? Goodbye.

Starting point is 00:41:21 It is literally, the terms of service give access to all your data to the Chinese government, right? Completely different here in the U.S., right? I know, like, most people don't understand data privacy and data protection, right? But in the U.S., we have great laws, right? In California, right, which is where all the tech companies are based, there's great laws. Right? So number one, you know, which most companies don't understand this. Right? Because so many companies are like, yeah, you know, we've been thinking about, you know, really incorporating AI models, you know, at scale within our enterprise, but we're worried about data security. And I always say, where do you store your files, right? And most people say, oh, you know, we use AWS, Microsoft Azure or, you know, Google Cloud. And it's like, okay. So use Microsoft Copilot as your AI, you know, LLM, or

Starting point is 00:42:16 AWS's models, because it is the same data protection. Your data's already there. I don't understand why people don't use their brains. But anyways, all right. Two more quick news stories, including one piece of breaking news. So, Nvidia is set to have their GTC conference. Big news this week. So the whole world has their collective eyes on Nvidia for their GTC conference,

Starting point is 00:42:43 which may be one of the most important conferences of the year. And I say that not just AI conferences, not just tech conferences. This might be one of the biggest conferences of the year, period, because of the far-reaching implications on the U.S. economy. So, and technically the world economy, because unless you've been under a rock the last, you know, month or two, the U.S. economy specifically has been in a little bit of decline, partially because of, you know, this whole deep seek, you know, situation that I just talked about.

Starting point is 00:43:19 But also because there's been a lot of questions around the AI industry in general, right? Can these big tech companies continue to, you know, innovate? Can they continue to drive profit? And, you know, a lot of companies, or a lot of analysts now are saying, okay, there's been these, you know, multi-billion dollar plans for essentially AI plants, right? We need more energy. We need more infrastructure, right?

Starting point is 00:43:50 There's been multi-hundred-billion-dollar plans for infrastructure in AI in the U.S. And some of them have gotten scaled back, which has caused a lot of investors to be like, oh, wait, you know, is AI really the way of the future, right? So this is going to be important because NVIDIA powers AI, right? There's a very, very good chance that whatever AI, you know, tools or large language models you or your company is using at some point has been directly built on top of Nvidia's hardware. Their GPUs are by far better than everyone else's, right? And a lot of other big companies are trying to build their own GPUs or NPUs, TPUs, just AI chips in general to lessen their reliance on Nvidia because they're like, wait, if we're spending, you know, countless billions of dollars on these Nvidia chips, should we at the same time continue to use them,

Starting point is 00:44:44 but also invest some of this money to start creating our own chips, right? I think it'll be hard for anyone to create a chip that is as powerful as NVIDIA's. That's why they went from a relatively unknown company five years ago, unless you were a gamer, to the most valuable company in the world when it comes to market cap. Yeah, I mean, not today. I should actually look it up here, y'all. Let's see, U.S. Market Cap.

Starting point is 00:45:16 I want to get this right if y'all just wait a second here. So, yeah, they're usually between the first to the third most valuable company when it comes to market cap. So right now, they're number two behind Apple. So they're in front of Microsoft, Amazon, Google, Meta, and some others. So Nvidia is one of the most important companies to the U.S. economy. So a lot of eyeballs are going to be like, whoa, like what's going to happen at GTC, how is NVIDIA going to respond to, you know, what's been happening in the stock market lately?

Starting point is 00:45:48 What analysts are saying about the future of AI? And then, you know, what about like these models like Gemma 3, right? That was, you know, essentially trained on like 120th of the amount of compute that other big models were, right? So what happens when models just get smarter and they don't need as much, you know, they don't need as many Nvidia GPUs essentially to run. So it should be pretty interesting. And like I said, I'm going to be there partnering with Nvidia, bringing you a lot of exclusive insights,

Starting point is 00:46:20 which I'm extremely excited about. And yeah, you can attend for free. You can watch online, you know, including the keynote from Nvidia CEO Jensen Huang. And I will be probably reporting literally live right after the keynote from the floor there to give you my takeaways on what was announced and what it actually means. So make sure to tune in to our newsletter and all that stuff for that.

Starting point is 00:46:48 All right, last but not least, some breaking news now that the embargo has been lifted. So Zoom, a bunch of new AI announcements just announced minutes ago. So Zoom has announced a series of significant AI advancements at Enterprise Connect 2025. So they've revealed over 45 new AI-driven features, including updates to Zoom Meetings, Phone, Team Chat, Docs, and Contact Center designed to help users streamline tasks and improve efficiency. So the company's AI Companion now uses reasoning and memory to perform complex tasks such as managing multi-step actions and coordinating with specialized agents. So new agentic features include calendar management tools, automated clip generation from your Zoom calls, real-time meeting notes, and writing assistance for creating advanced documents. So businesses will be able to customize AI Companion through an add-on launching in April for $12 per user per month, enabling tailored meeting templates, personalized avatars for video clips, and integration with proprietary data sources. So Zoom is also introducing industry-specific solutions such as AI tools for frontline workers, clinicians,

Starting point is 00:48:13 and educators to address unique professional needs and improve workflows. So these updates are part of Zoom's broader strategy to integrate small language models alongside third-party AI systems for enhanced performance at lower costs. So pretty significant announcement here from Zoom. So they're doubling down and tripling down, not only on trying to be more than just a communication tool, right, more than a video conferencing tool. I mean, what they're really trying to do here is they're trying to compete in the same category as, you know, Microsoft and Google, right? They're trying to, you know, I think really siphon users from Microsoft 365 Copilot, from Google and all of Google's Gemini AI offerings in their Workspace settings.

Starting point is 00:49:01 So they're essentially saying like, hey, we have all this AI. We had their head of AI on the show a couple of months ago talking about kind of how they build and how they build a little differently. Fascinating conversation, by the way, if you want to go back and listen to that one. But it should be interesting to see how this is received. I personally don't think that they picked up the momentum that maybe they should have. I mean, they actually have a pretty impressive series of AI tools. And like I said, they're bringing all these different collaboration tools, not just video. But I don't think they've really picked up as much steam as maybe they should have

Starting point is 00:49:39 because I think people just still look at Zoom as video, right? But they are trying to, you know, be a collaboration workspace. So, yeah, if you want to know more about that, we'll have more information in the newsletter. All right. I hope this was helpful, y'all. Please let me know if it was. Leave me a comment. Reach out.

Starting point is 00:50:01 Drop me an email. I put my LinkedIn in there. I don't want you to spend hours every single day reading all the AI news headlines and worrying about like, what does this mean for me? What does this mean for my job? Right. That's what I try to do. You know, every single Monday with these AI news that matters.

Starting point is 00:50:17 So I hope this is helpful. If so, if you're listening on the podcast, please subscribe. Please leave us a rating. I'd appreciate that. If you're listening on, you know, Twitter or LinkedIn to the live stream, please repost this. I'd super appreciate that. Also, as a reminder, I'm going to be here live in San Jose at least the next three days.

Starting point is 00:50:37 So if you are at the Nvidia GTC conference, that runs from, technically, the 16th to the 21st, but I think the majority of everything is on the 17th to the 20th. So if you are at GTC, if you are in San Jose, hit me up. Let me know, although I do have, I think, like 10 hours of podcast interviews today. So I'm going to be a tired guy. But please hit me up. And thank you for tuning in. I hope to see you back tomorrow.

Starting point is 00:51:07 Well, actually, who knows? The rest of today, I mean, we're going to have a lot of exclusive, you know, podcasts and live streams. We might have some different times, FYI, you know, depending on some embargoes and when we can bring you the latest news from GTC. So, you know, we might have an extra live stream or two the rest of this week, maybe an extra podcast. But essentially over the next two-ish weeks, we're going to be having a lot of great

Starting point is 00:51:32 and exclusive insights coming to you, not just from NVIDIA, but from some of the biggest companies in the world that have partnered with NVIDIA. So we're going to be bringing you a lot of good insights from GTC. So thank you for tuning in. And we hope to see you back tomorrow and every day for more Everyday AI. Thanks y'all. Meet Firefly AI Assistant. Now live in Adobe Firefly, the all-in-one creative AI studio.

Starting point is 00:52:01 Just describe what you want to create in your own words and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps, including Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution. Stay in control with the ability to step in and refine at any time. See it today at firefly.adobe.com. And that's a wrap for today's edition of Everyday AI. Thanks for joining us.

Starting point is 00:52:34 If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
