Everyday AI Podcast – An AI and ChatGPT Podcast - EP 339: AI News That Matters - August 19th, 2024

Episode Date: August 19, 2024

Win a free year of ChatGPT or other prizes! Find out how.

Did you know OpenAI released a new model? The one fatal flaw of Elon's new Grok-2 that no one's talking about. What is it? And is it concerning that researchers are trying to create 'Personhood Credentials' to differentiate humans from AI pretending to be humans? We'll tackle those questions and more.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Ask Jordan questions on AI
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
1. Grok chat system flaw
2. Personhood credentials
3. World Labs' achievement
4. OpenAI's updated GPT-4 model
5. Politics and AI

Timestamps:
04:19 GPT-4 Turbo and Flux 1 integration.
07:41 Grok has impressive benchmarks, but significant drawbacks.
11:09 Personhood credentials for online identity verification.
14:56 Internet explosion from AI-human hybrid interaction. US election impact.
18:07 World Labs achieves unicorn status with $100M funding.
23:25 AI will increasingly impact and manipulate politics.
26:02 AI image generators make discerning reality difficult.
29:08 Use battleground to compare and improve model skills.
32:12 Ongoing updates, constant testing for AI models.
34:33 World Labs reaches $1B valuation, AI dominates news.

Keywords:
Grok chat system, cycle flaw, Jordan Wilson, OpenAI researchers, Microsoft, MIT, Harvard, personhood credentials, online AI deception, verify real humans, privacy, fake accounts, bot attacks, AI misinformation, US election, social media, human verification, GPT-4 update, AI chatbot rankings, Google's Gemini 1.5 Pro, GPT-4.5, GPT-5, AI testing, chatbot arena, Grok-2, Elon Musk's xAI, World Labs, unicorn status, President Donald Trump, political AI campaigns.

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

Start Here: Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com. Also, here's a link to the entire series on a Spotify playlist.

Transcript
Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. As always, it's been a crazy week in AI.
Starting point is 00:00:51 There's a new model from OpenAI that they barely even announced. Elon Musk's xAI and Grok and Twitter, whatever you want to say, they released a new model. And researchers are trying to tackle online AI deception in a kind of troubling way. So we're going to be going over that today and more on Everyday AI. What's going on, y'all? My name is Jordan Wilson, and I'm the host of Everyday AI, and this show is for you. We are your daily livestream podcast and free daily newsletter, helping everyday people learn and leverage generative AI to grow their companies and their careers.
Starting point is 00:01:31 So almost every single Monday, we do the AI news that matters, right? Because you can spend hours, literally hours, every single day, trying to keep up with what's going on in the world of AI. All these companies, creating new models, all of these businesses, how they're using it, right? You could be overwhelmed trying to keep up. Or you could just let us do all the work. This is what we do literally every day.
Starting point is 00:01:55 And then most Mondays, we bring you the AI news that matters. We tell you what's going on and what you should be focusing on. So not just, hey, here's what's in the rearview mirror, but here's what's ahead. So if that sounds helpful to you, and if you haven't already, make sure to go to YourEverydayAI.com and sign up for the free daily newsletter because, yes, we do this every single day. And in our newsletter, not only do you get the latest news and AI tools and all of that, but also exclusive insights that you literally will not find anywhere else.
Starting point is 00:02:23 So make sure, if you haven't, go sign up for the free daily newsletter at YourEverydayAI.com. And while you're there, go ahead and enter our Thanks a Million giveaway to celebrate Everyday AI's one million downloads. We've got some great prizes going, too. So go sign up, refer a friend. You'll get a unique referral code in there and be entered into our giveaway, if you haven't already. We're going to be giving away a year-long subscription to ChatGPT or your favorite kind of large language model, as well as some consulting time with
Starting point is 00:02:52 us and probably some other prizes too. So make sure you keep an eye out on that. All right. With that, let's get into what's going on in the world of AI and the news that matters to you for the week of August 19th. Summer's almost over, y'all. So sad. All right.
Starting point is 00:03:09 And hey, thanks to all our livestream audience, as always, for tuning in. Brian, Marie, Rolando, Chris, Cecilia. Got a lot of y'all here, Joe and Woozy. Thank you all. But yeah, let me know as we go along. We'd love to hear your thoughts on some of these stories as well. So let's start at the top. So Grok-2 has been launched.
Starting point is 00:03:32 So the release of Grok-2 and Grok-2 mini marks a pivotal moment in artificial intelligence, showcasing substantial improvements in reasoning and performance benchmarks. So Grok-2 is from xAI, Elon Musk's AI company. And, you know, when I say X or xAI, I know it's confusing. We're talking about Twitter. So many people I know still don't understand that Twitter changed its name. So X and xAI are Elon Musk companies now. So Grok-2 is now released inside of the platform formerly known as Twitter. So Grok is trying to position itself as a leader in the competitive AI landscape, particularly for users engaged in chats, coding, and complex reasoning tasks.
Starting point is 00:04:21 So Grok-2 has already surpassed some existing models in competitive benchmarks, receiving pretty good scores across some common benchmarks there. So in the general knowledge benchmark, Grok-2 scored an 87.5, slightly ahead of the old GPT-4 Turbo. So yeah, if you're looking for kind of a benchmark amongst benchmarks, right? So obviously GPT-4 Turbo from OpenAI is a pretty old model by now. But that's about where Grok-2 is kind of landing in the grand scheme of things. So not bad. But one of the more headline-grabbing aspects of Grok-2 right now,
Starting point is 00:05:05 is actually not even part of Grok. So it's actually its integration with the third-party open-source image generator, Flux 1. So Flux 1 has actually taken kind of the AI world by storm after being released a couple of weeks ago, a very capable AI image generator, right? So I'd say already it's on the playing field. It's not quite as good as Midjourney, but it is really, really good and it is open source. So that is actually what X or Twitter or Grok or whatever you want to call it is using for
Starting point is 00:05:41 its online image generation. And that is important. We're going to be talking about some other news stories that have to deal with Flux 1. So Flux 1 is an AI image generator developed by former Stability AI developers. It's also important to know that right now, Grok-2 is available for premium users on X. And there are also multiple tiers to this, but I believe for the baseline $7 or $8 a month, depending on if you're paying monthly or yearly, you do get access to Grok-2.
Starting point is 00:06:14 So I don't believe it is available for free users right now. But in terms of a paid monthly subscription, Grok is the cheapest, right, compared to the normal $20 to $30 a month from other kind of players out there: ChatGPT's premium product, Google's Gemini Advanced. Same thing with Claude, Perplexity, et cetera. So normally you're looking in the $20 to $30 a month range to use kind of the most sophisticated large language models. So Grok right now is competing on price, but there are some downsides. Although, yes, it did get some great scores on certain benchmarks.
Starting point is 00:06:55 So we tested Grok last week. And let me know if you want to see that video. Maybe we can share it again in today's newsletter. But we found two things that were a little problematic, right? So let me just be honest, right? Normally we just bring you the news, but I might sprinkle in a little opinion this week, feeling spicy. I've said for a long time, Grok is not a serious model, mainly for one reason.
Starting point is 00:07:21 One of the main competitive advantages that it talks about is that it's trained on, you know, X data, or it's trained on Twitter data, which is extremely problematic. This is why, right? So companies pay us a lot of money to consult them, right? To say, hey, Jordan, team, tell us what we should be doing. I will never tell a company to use the Grok model. Here's why. For that very reason: it is trained off of X data, and many times in real time.
Starting point is 00:07:52 And that's a bad move for any AI company, right? Because of, unfortunately, so much misinformation and disinformation on all social media networks, right? I'm not just talking about X, but I think X is maybe worse than others. And also here's kind of a couple of the gripes about Grok, right? So two big things that I didn't see really anyone else talking about because, yes, it got some impressive benchmarks that put it kind of in the top five of models out there.
Starting point is 00:08:26 So pretty impressive, like I said, in general benchmarks, but there are problems with two of the main advantages of Grok. First, real-time data from X, which is actually a bad thing, right? It can be a bad thing because then, you know, you might just be using its outputs without really checking the source or maybe even knowing the source, which is problematic. The other thing: it actually didn't do a great job in our initial testing on accurately bringing in information in real time from X. So, you know, we tested that. Another kind of fatal flaw
Starting point is 00:09:02 is it has some problems going in a loop. So, as an example, let's say you want to summarize a web page, right? And it's a long web page. You can copy all the text, paste it into Grok and say, summarize this. And essentially, you're going to get a weird message from Grok, you know, something that says, this is too long. I can't do anything with it. So you might think, okay, that's fine.
Starting point is 00:09:25 I'm going to move on. And then you, you know, say something to Grok, like, I don't know, give me a blog post outline. But actually, what happens, and we tested this again over the weekend. I don't know if the QA team over there from xAI is aware of this. It actually gets caught in a cycle. So you can't continue to have a conversation in that chat, right? So if you say, hey, generate me a blog post outline on the advantages of electric vehicles, it's still going to give you an error message, one that you don't really know is an error message, from the message before. A very basic flaw and a huge, huge flaw. You might work around it if you actually are pretty good with large language models,
Starting point is 00:10:04 but the average user is just going to think the thing is broken, not knowing, oh, I need to exit out of this chat, go start a new chat, and then Grok will continue to work. So a very elementary and pretty big flaw there for Grok. So, you know, like I said, the pros: it's doing pretty well on some general benchmarks, doing surprisingly well, better than I thought it would. But it has some problems. So, you know, if you want more Grok reviews or news from us, let us know. And yeah, I'm curious. Has anyone here in the
Starting point is 00:10:35 livestream been using Grok yet? So Penelope here says she hasn't learned a lot about it yet. So all right, let's go on to our next piece of AI news. And this one, I don't know if troubling is the right word. It might be. Maybe it's needed, but let's talk about it. So OpenAI researchers and researchers from other companies have proposed personhood credentials to combat online AI deception. So OpenAI researchers as well as researchers from Microsoft, MIT, Harvard, etc. I mean, it's a who's who list of researchers here. So researchers from more than 15 organizations have published a pretty significant paper aimed at tackling the issue of online AI deception.
Starting point is 00:11:30 So this research highlights the challenges posed by the increasing presence of AI systems that can mimic human behavior on the Internet. All right. And if you caught our show Friday, we talked about this and I'll go into a little bit of what happened Thursday that I have to think these two things are related. Anyways, so the paper introduces the concept of personhood credentials, which are designed to serve as digital ID cards. So these credentials would allow for the verification of real humans online without compromising their privacy.
Starting point is 00:12:06 By implementing this system, researchers hope to create a more trustworthy online environment where the authenticity of users can be confirmed. The primary goal of these personhood credentials, which I believe are being abbreviated as PHCs, is to reduce the prevalence of fake accounts and bot attacks, which have become rampant on various social media platforms and online communities. These malicious entities can distort public discourse, spread misinformation, and undermine trust in digital interactions. So in addition to verifying human identities, the initiative aims to ensure that AI systems are clearly identified as non-human entities. This clarity is crucial in maintaining transparency in online communities, especially as AI technology continues to advance and integrate into everyday interactions.
Starting point is 00:13:02 That is the part that is important there because you might be saying, okay, Jordan, like if I'm talking to an AI, I probably know, right? Like I'm going to, you know, ChatGPT or Gemini or Claude, right? Or an AI image generator, you know, Midjourney or Runway, right? You might say, oh, I know if I'm talking to an AI because I am seeking it out. Not really, right? So obviously there are, you know, studies that say a certain percentage of accounts, you know, on Twitter or something like that, are bot accounts, right?
Starting point is 00:13:36 Bot accounts, many of them are powered by AI, right? And they're kind of systems that are built to autonomously, you know, have a certain message or a certain outcome or a certain goal, right? Where I think a lot of times, you know, when you think of bot accounts online and on social media, you might not think of them as very sophisticated, right? They would essentially have a series of pre-programmed messages and then potentially use something called spintax. All right.
Starting point is 00:14:07 So let's just tell you what that is very simply, right? So bot accounts generally would have the same, you know, 10 to 20 messages that they would copy and paste, not really being able to understand the discourse that's going on. Right. And then, you know, with spintax, they'd essentially be swapping out, you know, different words or different phrases, so it wasn't as noticeable, right? So here's the thing with AI and where we're at now.
Starting point is 00:14:34 Now, AI systems can very easily understand the context of a conversation, right? They can see what other people are saying, what's not being said, and find conversations where they can, you know, insert their misinformation or disinformation for a certain purpose. And I think this is actually a much larger problem than anyone realizes, which is why I think some of the smartest researchers in the world, like I said, it's OpenAI, it's Microsoft, it's Harvard, it's MIT. It is literally the who's who of academia and big research companies that have come together for this paper suggesting these personhood credentials.
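To make the spintax idea described above concrete, here is a minimal Python sketch of how such a template could be expanded. The `{option|option}` syntax is the common spintax convention; the `spin` function name and the sample template are assumptions made for illustration and are not something described in the episode.

```python
import random
import re

# Matches the innermost {a|b|c} group (one containing no nested braces).
_INNERMOST = re.compile(r"\{([^{}]*)\}")


def spin(template, rng=None):
    """Expand a spintax template by picking one option per {a|b|c} group."""
    rng = rng or random.Random()
    while True:
        match = _INNERMOST.search(template)
        if match is None:
            # No groups left, so the text is fully expanded.
            return template
        choice = rng.choice(match.group(1).split("|"))
        template = template[:match.start()] + choice + template[match.end():]


# Hypothetical template: every output is one of a small, fixed set of variants.
print(spin("{Great|Nice|Awesome} post, {totally|completely} agree!"))
```

Each call only shuffles among pre-written options, which is why a bot built this way cannot respond to the context of a conversation the way the AI systems discussed here can.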
Starting point is 00:15:15 And let me just tell you this. I'm not saying this stems from it, but the timing was very weird to me, right? This came out Thursday night. At the same time, roughly, the internet was kind of exploding from this kind of experiment, maybe, but there's essentially a couple different accounts on Twitter that people said, oh, these seem like they are either, you know, a new model from OpenAI, or maybe they are a hybrid of a bot and a person, right?
Starting point is 00:15:49 but very advanced. Could they be actual people? Yeah, they could. Could it be people masquerading as AI, as weird as that sounds? That's been some of the allegations, but it could also be AI masquerading as humans. And I think it's important for people to understand this. We're going to be talking about this probably more often in the coming months, especially with the U.S. election, because I think the U.S. election is going to be a lot of people's first interaction with AI misinformation or AI disinformation. It's literally already out there. We're going to be talking about it.
Starting point is 00:16:23 There's more news from a presidential candidate over the weekend using these tactics. So I think it is an important conversation for us to have, right, regardless of your interest in it or not. Because think about, I'd say, scams, right? Phishing scams, you know, these scams when you get an email, they haven't been super sophisticated in decades past, but now there's the ability to instantly clone a voice, with digital avatars, with these kind of bots, right? When you can literally have millions of these bots that can go out and autonomously understand what's going on, you know, in certain
Starting point is 00:17:05 internet discourse and steer it one way and also interact with each other in coordination, right? It's actually kind of scary, the capabilities that are happening right now, which I think is why it's important to talk about it. So a pretty big piece of news there from that new research paper that was released. And we will be releasing that research paper. So yeah, Penelope here is saying it sounds like a strange way forward that we have to confirm we are human. Yeah, it is. Again, this is just something researchers have suggested, right? This isn't happening anytime soon. I actually think, especially here in the U.S., for something like this to roll out, it will take a while. I think first, we would probably see something
Starting point is 00:17:50 like this maybe roll out at a social media level, in the same way that you can verify yourself on certain social media. You know, as an example, I know LinkedIn has a way that you can do that, and to be quote unquote verified, you have to upload, as an example, a state ID, a driver's license, something like that. So this is kind of a different approach that is maybe more universal for being used online. But yeah, I do think that we are probably going to see in the near future, at least, social media companies being a little more strict about human verification. All right. Next piece of AI news: World Labs has achieved unicorn status in just four months with a hundred million dollar funding round.
Starting point is 00:18:44 So World Labs is a stealthy startup founded by renowned Stanford AI professor Fei-Fei Li, and it has rapidly gained attention by raising a very significant amount of funding within a short timeframe. So yes, World Labs has secured a $100 million funding round, valuing the company at over a billion dollars. So yes, already achieving unicorn status. So the latest round was led by NEA and follows an earlier financing round in April that valued the company at $200 million. So notable investors from the first round included Andreessen Horowitz. I'm terrible at pronouncing things, y'all. And Radical Ventures, where Fei-Fei Li serves as a scientific partner.
Starting point is 00:19:30 So the startup aims to develop AI models capable of accurately estimating the three-dimensional characteristics of real-world objects and environments, which could revolutionize digital replication without extensive data collection. Li is often referred to as the godmother of AI, and she highlighted the challenges of acquiring three-dimensional data, which is crucial for various applications, including gaming and robotics. And many people in the AI sphere are saying that this is one of the biggest remaining hurdles to achieving what is commonly called, you know, AGI or artificial general intelligence. That's essentially when all these AI models are
Starting point is 00:20:11 That's essentially when all these AI models are. smarter than all of us, right? So a big kind of hang up or roadblock or hurdle, depending on what your viewpoint on the matter is, is understanding how different objects interact in a 3D world, right? In a real world. So that is one of the issues that this new startup venture is looking to tackle. So the startup's innovative approach addresses that significant gap in available three-dimensional data, which is currently collected mostly through costly methods by autonomous vehicle companies. So yeah, the way to get this data right now is not very accurate. It's extremely time consuming.
Starting point is 00:20:55 Which is why I think there's a lot of hype and money in this new World Labs venture from Fei-Fei Li. All right. So let's get on to our next story. So yes. And y'all, let me just say this, please. I got so, I don't know why, I got so many messages, you know, saying, oh, Jordan, you're being political. I'm not, y'all, I'm not being political here.
Starting point is 00:21:20 Let me just put this out, right? I'm stating facts, okay? So, facts, ready? Here's facts. So former President Donald Trump is continuing to use AI in his campaign strategy. So former President Trump has ignited a debate by utilizing artificial intelligence to promote a misleading campaign claiming support from Taylor Swift's fans, coinciding with the Democratic National Convention
Starting point is 00:21:50 that is kicking off today. So the incident highlights the increasing intersection of AI and politics, raising questions about authenticity and political endorsements. So Trump posted some images that our livestream audience can see here on our stream, posted these images on his Truth Social platform, suggesting that Taylor Swift fans, known as Swifties, are endorsing his presidential campaign, despite Swift being a Democrat. There was also a clearly AI-generated image of Taylor Swift pointing that says, Taylor wants you to vote for Trump.
Starting point is 00:22:32 So even some of Trump's long-term supporters, such as Lindsey Graham, a Republican ally of Trump, have warned that his provocative tactics could jeopardize his chances in the upcoming election against Kamala Harris, urging him to focus on policy discussions instead of personal attacks via AI. So the latest AI-powered post from Trump follows a slew of other uses and misuses of AI. Yes, this was not the only one. There's been a lot. So also over the weekend, Trump posted an AI-generated image on Twitter
Starting point is 00:23:09 referencing his opponent, Kamala Harris, speaking at the Democratic National Convention. Here's the thing. The AI image depicted Harris speaking at a communist rally. That's not it. There's been more this week. So last week, Trump also spread some false claims that an image of a large crowd waiting for Kamala Harris at an airport was manipulated using AI. However, reporters, Harris's campaign, and attendees
Starting point is 00:23:39 that had other video and photos confirm that the crowd was real.
Starting point is 00:24:56 So y'all, please, please don't come at me with hate messages, right? Because this is just facts, okay? And here's the thing. I'm not getting political with the Everyday AI show. Whether you like it or not, this is going to continue to happen. All right. We are going to continue, and I do assume that we are going to see it from both sides of the political spectrum, right?
Starting point is 00:25:21 Because if Kamala Harris or her staff as an example put out similar misinformation or, you know, as an example, I know that, you know, Kid Rock is a celebrity that is often, you know, tied to, you know, Donald Trump. So if, you know, Kamala Harris did an AI image of Kid Rock encouraging people to vote for her, we would be reporting it, right? That's using AI, using AI in a way that is dishonest and using it in a way that is misinformation or disinformation. So we are not getting political here. But this is going to continue to happen as we ramp up, right? So I've been saying this for literally more than a year.
Starting point is 00:26:06 I said that in the third quarter of 2024, politics is going to get swarmed by AI. And the reason why it's important to think about it, regardless of your political affiliation, is that candidates from both sides of the aisle are going to be using AI to influence people and to put out misinformation. So we've already seen at least the FTC here in the U.S., the Federal Trade Commission, try to clamp down on some of these use cases. But, I mean, you're seeing robocalls on both sides of the aisle, people duplicating voices without authorization.
Starting point is 00:26:43 I mean, we're going to see a lot of these AI images as these models get better, right? Here's the reality. These images are not even that great, right? With some proper prompting, these images could be even more, quote, unquote, convincing. So you do have to be careful, and this goes out to everyone regardless of your political affiliation. You have to be very careful, especially here in the U.S., and abroad too. I know that there's a lot of elections going on, you know, in the latter part of the year. You have to be extremely careful about what you believe, the things that you're reading, right?
Starting point is 00:27:22 Because unfortunately, sometimes these, you know, AI powered campaigns can be convincing. And you're going to have news agencies that may not know any better start reporting on them. So please, y'all, if you're listening here, you have to be extremely careful. AI misinformation, disinformation, you know, even taking,
Starting point is 00:27:44 Now, these photos, taking these photos that are pretty good. And, you know, as we talked about earlier, these images presumably came from either Midjourney or from the new Flux 1 AI image generator inside of Twitter. So these capabilities now are going to more and more people. The tech, quote-unquote, tech learning curve is very low, right? You can literally just put in a very simple sentence and get a pretty decent photo, right? But as people get better at using these, and then you can turn these photos obviously into videos
Starting point is 00:28:19 as well with other programs, it is going to be very hard to understand what's real and what's not. So everyone, please, when you see anything online, make sure to go through it in your mind. Is this real? Is this not? Look it up before you share with your friends, your family, before you even talk about it, right? You, unfortunately, have to independently verify information, especially when it comes to politics here in the U.S. All right.
Starting point is 00:28:48 Our last piece of AI news: ChatGPT has taken back the lead as the most powerful AI chatbot. All right. So the ongoing competition among AI companies and AI chatbots has intensified with OpenAI's latest version of GPT-4o reclaiming the top spot in the LMSYS
Starting point is 00:29:13 Chatbot Arena performance benchmarks. So this development highlights some significant advancements in AI technology and its implications for users and developers alike. So yes, last week, we actually got a new updated model from OpenAI. And I would say it's actually pretty significant. And if you are, you know, a daily AI user like myself, you've probably noticed the improvement. All right, but pretty interesting here because OpenAI, really all they did was send out a single tweet or a single X, whatever you want to say. That's all they did, right? So generally,
Starting point is 00:29:56 you know, they would have maybe an event, an announcement, a couple press releases. Nothing here. So yeah, we did not see a GPT-5. You know, there's been all these rumors, you know, Project Strawberry. We didn't see anything official last week. We saw a quiet, under-the-radar, but very impressive update from OpenAI. So if you follow model numbers, this though is where it gets a little tricky. So this new model is essentially called GPT-4o-latest. So OpenAI on their developer account essentially said that they are going to continue to make updates to ChatGPT, and it will just be called dash latest.
Starting point is 00:30:38 All right. So a little confusing there. But I mean, the scores here were extremely impressive. So we talk about this a lot here on the Everyday AI show. So essentially in the chatbot arena, you can put in a prompt. So there's something called Battleground. And if you haven't done this already, y'all, I encourage you to do it because not only is it going to help you understand models a little bit better, but it's ultimately going to help you improve your prompting skills, your prompt
Starting point is 00:31:06 engineering skills. So you can go in, put in a query, and you get a side by side, right, and you get two outputs from two different models. They are unnamed until you vote for which one is better. After you vote, you get to see which ones the models are. And that is one of the mechanisms that goes into this Elo score, right? So similar to, you know, a chess Elo score, all of these models have Elo scores in the Chatbot Arena. So the new model from OpenAI has an Elo score now that is more than 30 points ahead.
Starting point is 00:31:40 It is more than 30. Actually, let me do the math here. 10, all right. So it's about 20. No, sorry, 18 points, 18 points, you know. I do this live. It's unedited, unscripted. I thought it was nearly 30 points.
Starting point is 00:31:54 So it is about 18 points ahead of the next closest model, all right? Which is a huge, a huge gap. I cannot emphasize that enough. So generally, right? So as an example, we saw Google release a new updated version of their model, which, by the way, is not available on the front end. It's only available on the back end for developers or people using it inside of their AI studio. So when Google essentially announced Gemini 1.5 Pro, and this version was called 0801, which came out on August 1st, right? It took the lead temporarily, but only by a couple points.
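For listeners who want to see how a points gap maps onto head-to-head votes, here is a minimal sketch of the classic chess-style Elo formulas. It is only an illustration: the Chatbot Arena's actual rating computation may differ, and the function names, the K-factor of 32, and the two ratings below are assumptions chosen to reproduce the 18-point gap mentioned in the episode, not real leaderboard values.

```python
# Sketch of the textbook Elo formulas. This is NOT the arena's own code;
# every name and number here is illustrative.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of wins for model A against model B under Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one blind side-by-side vote."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Hypothetical ratings exactly 18 points apart.
    leader, runner_up = 1318.0, 1300.0
    print(f"expected win share for the leader: {expected_score(leader, runner_up):.3f}")
    # One upset vote for the runner-up shifts both ratings.
    print(update(leader, runner_up, a_won=False))
```

Each vote in a side-by-side comparison would feed one call to `update`, which is why many votes are needed before a gap between two models becomes stable.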
Starting point is 00:32:35 It only went ahead by about 10 points at the time. So pretty significant update here from OpenAI. And I think this is something you have to keep an eye on, right? I tell people, right, go into the Chatbot Arena, right? Because not, depending on your use case, depending on what you need a large language model for, it's not one size fits all for everyone, right? Depending on what you need it for. There's different, you know, oh, you might need it to help with coding.
Starting point is 00:33:06 You might need it to help with brainstorming, idea generation. You might need it to help with content writing. You might need it to help with data analysis, et cetera, right? So yes, you can go through and manually test different models one by one. But a great place to get a start on this is going to the Chatbot Arena and just testing different models and seeing which ones for your use case and seeing them head by head or sorry, head to head, which one is going to be the best for your use case. So pretty big news that really flew under the radar from OpenAI, right?
Starting point is 00:33:37 So releasing a pretty sizable update to its GPT-4o model. So I don't know if this is going to be as significant of an improvement as, for example, when we went from GPT-4 to GPT-4 Turbo. But it does seemingly look like it according to benchmarks. So yeah, a lot of people don't even know this because it's the same name, right? I think before, you know, we saw it when we went from GPT-3.5 to GPT-4, or when we went from GPT-4 to GPT-4 Turbo, or when we went from GPT-4 Turbo to GPT-4o, right? I think everyone understood, oh, this is a new and improved model because it has a different name.
Starting point is 00:34:18 So it seems to be a new strategy from OpenAI, maybe at least until we get a GPT-4.5 or a GPT-5, which I don't think is going to happen this year. But it seems like we're just going to be getting slightly improved and constant new updates under the hood without even knowing, right? So this is why I tell people when we advise companies, you always have to be testing, you know, on a consistent basis. If your company, you know, whether you're using, you build on top of a company's API, whether you're using it in the interface of a chatbot,
Starting point is 00:34:56 right? So whether you're going into Claude, going into Gemini, going into Grok, going into Mistral, whatever it is, you should be always testing your processes on an ongoing basis. Generative AI and large language models are something that can give you back so much time and can pay for themselves in seconds, but you have to constantly be understanding what's happening under the hood. You have to have a set of kind of baseline prompts that make sure that you and your team are properly understanding that the model is doing what it should.
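The idea of keeping a set of baseline prompts can be sketched as a small regression check that you rerun whenever a model is silently updated. This is a minimal sketch under stated assumptions: `BaselineCase`, `run_baseline`, and `fake_model` are hypothetical names, the substring check is a deliberately crude pass criterion, and `fake_model` stands in for whatever real API or chatbot call your team uses.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BaselineCase:
    name: str
    prompt: str
    must_contain: List[str]  # substrings the reply is expected to include


def run_baseline(call_model: Callable[[str], str], cases: List[BaselineCase]) -> List[str]:
    """Run every baseline prompt and return the names of the cases that failed."""
    failures = []
    for case in cases:
        reply = call_model(case.prompt).lower()
        # A case passes only if every expected substring appears in the reply.
        if not all(s.lower() in reply for s in case.must_contain):
            failures.append(case.name)
    return failures


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; swap in your own client."""
    if "capital of France" in prompt:
        return "The capital of France is Paris."
    return "I'm not sure."


CASES = [
    BaselineCase("geography", "What is the capital of France?", ["Paris"]),
    BaselineCase("arithmetic", "What is 12 times 12?", ["144"]),
]

if __name__ == "__main__":
    print("failed cases:", run_baseline(fake_model, CASES))
```

Passing the model call in as a function keeps the same baseline set reusable across an API integration and a chatbot interface, so a change in which cases fail points at the model update rather than at the test harness.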
Starting point is 00:35:30 All right, y'all, a lot going on there. Let's quickly recap here, the one-minute version. Ready. So Grok-2 has launched from Elon Musk's xAI company, scoring some pretty good benchmarks across the board, as well as debuting the Flux 1 AI image generator inside of Grok-2. Next, OpenAI researchers and others from Microsoft, Harvard, MIT, Oxford, etc., have introduced a new concept and proposed in a paper something called personhood credentials to combat online AI deception. In other words, a way to say whether something is a person or an AI, maybe mimicking a person. Next, World Labs has achieved unicorn status with a more than $1 billion valuation in just four months with a recent $100 million funding round being led by former Stanford AI professor Fei-Fei Li, often referred to as the godmother of AI.
Starting point is 00:36:38 Next, former President Donald Trump has continued to use AI in his campaign strategy, most recently posting some images on his Truth Social platform that appeared to show Taylor Swift approving Donald Trump, which is not the case. And then last but not least, ChatGPT and OpenAI have taken back the lead in AI chatbot rankings with their newest update to the GPT-4o model. All right, y'all, that is a lot here happening, and it's nonstop, right? Because I guarantee you when we have the AI news that matters next week, there's going to be just as many movements, just as much going on.
Starting point is 00:37:26 I cannot emphasize enough how difficult it is to keep up with these developments every single day. Our team spends hours every single day going through the news, going through the latest tools, testing things out so you don't have to, right? So you can just take what we give you, take it, to grow your company and to grow your career. We don't just push things from different companies. We test everything, break everything, and tell you what's real and what's not.
Starting point is 00:37:55 And well, what's real is this show. It is unscripted, unedited. If this is helpful, please share this with your friends. If you're listening here on LinkedIn, it would mean a lot to our team here. If you could repost this with your network. So everyone in your company or in your circle can keep up with what's real and what's not in artificial intelligence. Also, please go check out that.
Starting point is 00:38:18 Thanks a Million giveaway. Go sign up. It takes about five seconds to be entered to win, and then go refer your friends to win. We're going to have a lot more prizes that we'll be announcing soon. So thank you for tuning in for the AI news that matters. We'll see you tomorrow and every day for more Everyday AI. Thanks, y'all.
Starting point is 00:38:45 And that's a wrap for today's edition of Everyday AI. Thanks for joining us.
Starting point is 00:39:24 If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
