The AI Daily Brief: Artificial Intelligence News and Analysis - Inflection-2.5 Is a "Personal" AI at Near GPT-4 Levels

Starting point is 00:00:00 Today on the AI breakdown, Inflection becomes the latest AI lab to release a new model. Before that on the brief, Biden talks AI at the State of the Union. The AI breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our YouTube, our newsletter, and our Discord. Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes. Last night was the State of the Union. And, of course, most American political coverage is focused on President Biden coming out swinging and partisan questions and memes and all this sort of stuff.

Starting point is 00:00:41 But for us, what we are interested in is a couple of mentions of artificial intelligence during the speech. Now, one of those, predictably probably, had to do with voice impersonation. You'll remember that during the New Hampshire primary a few weeks ago, President Biden was the subject of a voice impersonation scam, where effectively a Democratic political consultant had paid someone to create an AI version of President Biden and robocalled Democratic primary voters in New Hampshire, telling them to stay home, saying that they didn't need to vote then and that their vote only really mattered in November. Now, it wasn't too harmful relative to what these messages could do,

Starting point is 00:01:23 but it did get everyone on edge. It has really triggered an immune response, if you will, around the potential for this to be a much more destructive type of thing in American politics. So President Biden in the State of the Union called for a ban on AI voice impersonation. The other thing that he said, just in a passing mention, however, was he said he wanted to harness the promise of AI and protect us from its peril. Now, of course, this is a very generic statement, it's very political speak. It's kind of what we've heard from this White House throughout, that there's good and bad and we've got to promote the good and protect from the bad.

Starting point is 00:01:56 But still, the fact that it warranted mention in the State of the Union is notable. While we may not be getting comprehensive AI legislation this year because of the political cycle, it doesn't seem like it's completely left the White House's agenda at least. This is probably a good time for the White House to be discussing artificial intelligence, as it's certainly entering the public consciousness more as well. Each year, communications giant Edelman publishes a trust barometer report. It's effectively a look at how trust in institutions, in different media, and in different figures has changed over the last year.

Starting point is 00:02:29 And this year one of the big focuses was artificial intelligence. Said Edelman Global Technology Chair Justin Westcott, trust is the currency of the AI era. Yet, as it stands, our innovation account is dangerously overdrawn. Companies must move beyond the mere mechanics of AI to address its true cost and value. The why and for whom? So what are the numbers here? Five years ago, globally speaking, 61% of people said that they trusted AI companies. The most recent reading was down to 53%. In the U.S., it's even a little bit more dramatic. Five years ago, it was 50% of people who trusted AI companies. That has dropped 15 percentage points down to 35%, just a little over a third. Democrats trust AI companies slightly more at 38%, with independents at 25% and Republicans at 24%.

Starting point is 00:03:13 Edelman does note that this might not just be an AI problem. There is a larger pattern of a decline in trust, which is something we've talked about on this show before, but it shows up pretty starkly in these numbers. Eight years ago, technology was the leading industry in trust in 90% of countries that Edelman surveyed. Today, it's the most trusted in only half of those countries, down 40% in that time. There are a million reasons for that, but that would be much more than a single podcast to get into. Interestingly, there seems to be a divide between developing and developed markets, with developing markets wanting AI more. Axios sums up, acceptance outpaces resistance by a wide margin in developing markets such as Saudi Arabia, India, China, Kenya, Nigeria,

Starting point is 00:03:53 and Thailand. Meanwhile, respondents in France, Canada, Ireland, the UK, US, Germany, Australia, the Netherlands, and Sweden rejected the increased use of AI by a three-to-one margin. There is a lot that we could explore here too. Is this a reflection of the fact that AI seems to be a great equalizer by bringing the bottom up and making what were once premium skills a little less valuable? Could be. One of the things that makes this such an interesting new phenomenon is the fact that the automation is happening at the white-collar level rather than the blue-collar level. That's not something we've seen historically with other technology movements, and it could end up with some pretty weird and novel political resistance because of that. So what to do if you are

Starting point is 00:04:31 in the AI industry? Well, once again, the global technology chair, Justin Westcott, says, those who prioritize responsible AI, who transparently partner with communities and governments, and who put control back in the hands of users, will not only lead the industry, but will rebuild the bridge of trust that technology has, somewhere along the way, lost. One of the reasons I think that there are such big questions around AI is that there are such wildly different interpretations of it. You might have seen this chart flying around Twitter slash X. AIs ranked by IQ. It comes up with an IQ score based on the number of questions answered right on a 35-question test. A random guesser down at the bottom gets 5.833 of these questions right for an IQ of 63.5. ChatGPT-4 got 13 of the questions right,

Starting point is 00:05:12 giving it an IQ of 85. And Claude 3, the new kid on the block, got 18.5 questions right. To be honest, I'm not sure what the 0.5 of a question right is, but we're going with this chart for a 101 IQ score. Now, whether this is actually a useful measure and whether, frankly, this is even just a made-up chart, I'm not totally sure. What's interesting is the virality of it as a reflection of people's concerns about AI. Yet on the other side of this, we still have folks who argue that AI is simply not intelligent, and that we're speaking about it in the wrong terms. Meta's chief AI scientist Yann LeCun wrote recently on Twitter, you are confusing intelligence and knowledge. LLMs have a lot of accumulated knowledge,

Starting point is 00:05:51 but very little intelligence. An elephant or a four-year-old are way smarter than any LLM. Which of course brings up the question, what does it mean to be smart? And the reality that we don't have necessarily a good or consistent consensus answer to that is probably why there's so much controversy around this smart technology. Heading over into markets for a minute, AI chip companies continue to get a huge boom, with Broadcom announcing that they are anticipating $10 billion in AI chip sales in 2024. Seven billion of that is apparently coming from contracts with two companies, which are unnamed, but widely assumed to be Google and Meta, who Broadcom is helping design custom

Starting point is 00:06:29 AI chips. The CEO of Broadcom, Hock Tan, said that the custom chip business, quote, can command margins similar to our corporate gross margin of around 75%. Now, things could get interesting for Broadcom because last month, Reuters reported that NVIDIA, the big data center AI chip giant, wants to get into this bespoke chip fabrication game as well. On February 9th, Reuters wrote, NVIDIA is building a new business unit focused on designing bespoke chips for cloud computing firms and others. Its H100 and A100 chips serve as a generalized all-purpose AI processor for many

Starting point is 00:06:59 of those major customers, but the tech companies have started to develop their own internal chips for specific needs. Doing so helps reduce energy consumption and potentially can shrink the cost and time to design. NVIDIA is now attempting to play a role in helping these companies develop custom AI chips that have flowed to rival firms such as Broadcom and Marvell Technology. This is seen as a $30 billion opportunity. So if you're trying to sum up, even as trust in AI companies might be declining, the energy and capital going into those companies is doing nothing but accelerating. That is going to do it for today's AI breakdown brief. Next up, the main AI breakdown. Welcome back to the AI breakdown. One of the really interesting sub-stories following Elon Musk's

Starting point is 00:07:41 lawsuit against OpenAI and Sam Altman is the way in which other foundation model companies have raced out to try to fill the void left by OpenAI not announcing new things with their own announcements. Now, how much this is opportunistically taking advantage of what might be a legal process going on on OpenAI's side versus just fortuitous timing for those other labs is not exactly clear, but it is notable that this week alone we've got two new advanced foundation models. Claude 3, of course, we have discussed extensively. It is the first model, including Gemini Advanced, that I've seen widely considered by people who use it to be just straight up better than GPT4. This is impressive, but of course, it's worth noting that GPT4 has been out for over a year,

Starting point is 00:08:24 effectively an eternity in AI time, and the bigger question from a competitiveness standpoint is what GPT5 will look like. Still, it is notable that we finally have some amount of commoditization at that GPT4 level, and even some advances that are being won. Inflection has always been going after a slightly different prize. The company's Pi chatbot is meant to be a personal AI. As Inflection co-founder and CEO Mustafa Suleyman writes, Pi blends helpful IQ with friendly EQ. The use case that Pi is going after is the anti-loneliness use case,

Starting point is 00:08:55 the someone to talk to about anything going on in your life use case. Not necessarily the process my data, help me figure out new business strategy use case. They're certainly the most scaled company going after that opportunity. And so it's important to look at their results, not just in the context of how they compare from a benchmark standpoint, but also what it means to be pursuing this different type of AI. So the benchmarks are looking pretty good. On things like the MMLU, the GPQA, mathematics, coding, and common sense, Inflection 2.5 performs significantly better than Inflection 1,

Starting point is 00:09:26 and is creeping up on GPT4 levels. What's notable about that is that they used only about 40% of the computing power that GPT4 reportedly used for training. In other words, they're winning some efficiency somewhere along the process. Inflection's announcement post about 2.5 reads, Meet the world's best personal AI. They write, Inflection 2.5, our upgraded in-house model, is competitive with all the world's leading LLMs like GPT4 and Gemini. They note that they've made particular strides in areas of

Starting point is 00:09:51 IQ like coding and mathematics, which, quote, translates into concrete improvements on key industry benchmarks, ensuring Pi always pushes at the technological frontier. They've also brought web search to Pi, meaning that it's not limited in information to whenever its training date cutoff is. One of the things that I think is notable about this announcement, and specifically their emphasis on coding within this announcement, is that in previous PR moments, Inflection has intentionally downplayed its interest in things like coding. It's basically said that that's not what this AI is supposed to be about, so it's not a big concern for us. However, we also got comments from Mark Zuckerberg not too long ago, where he was discussing the difference between Llama 2 and Llama 3, and the extent to which they found that increased performance around coding led to benefits in other areas. Basically, as they increased the capacity of Llama to code, it was able to solve more and different

Starting point is 00:10:39 types of problems that weren't necessarily computer science or coding problems. Given that Inflection is now talking about coding and mathematics as a part of the improvements, without changing anything about their mission for a personal AI, I wonder if they're finding that as well: that the coding and math knowledge actually leads to better performance overall, even outside those domains. Inflection also gave us some statistics for the first time. According to their announcement post, they now have 1 million daily active users and 6 million monthly active users. In total, those users have exchanged more than 4 billion messages with

Starting point is 00:11:08 Pi, and the average conversation with Pi lasts 33 minutes. One in 10, they say, lasts over an hour, and about 60% of the people who talk to Pi on any given week come back the following week. The point being that they're arguing that their personal AI is a very deeply engaging and sticky experience. Emphasizing those long session times and frequency of return is their evidence of that. In follow-up interviews, Mustafa Suleyman said that Pi's user base has been growing around 10% a week for the past two months, so these numbers are not likely to stand still. Now, for some comparison in numbers, OpenAI said last November at Dev Day that it had 100 million weekly active users, but then I noticed in their blog post response to Elon Musk earlier this week, they seemed to claim that they had hundreds

Starting point is 00:11:48 of millions of daily active users. I don't know if that is a typo or a misspeak, but if it is not, it seems like ChatGPT is experiencing some significant growth as well. One other thing that makes Pi a little bit different, perhaps because of its more personal type of interaction and use case, is that it actually launched without a business model built in. Obviously, the ChatGPTs and Copilots of the world have free versions and then paid versions, and the intention is for Pi to have some sort of paid version, although it sounds like they haven't fully resolved yet what the business model is actually going to be. I've found that Pi still remains really divisive, not in the sense that some people are super against it or anything like that, but that some people just don't understand

Starting point is 00:12:25 why this would have a lot of demand. Carlos E. Perez, @IntuitMachine on Twitter, writes, I keep being underwhelmed with Pi. I don't understand what the hype is here. Can anyone explain? He continued, I guess it's meant to be conversational and thus has an entirely different use case from what we find in GPT4 or Claude. I guess this kind of UI makes sense

Starting point is 00:12:42 for a more chaotic knowledge discovery process. Professor Ethan Mollick writes, I think we should be talking about Inflection's Pi more. They released a near-GPT4 class upgrade to their AI this week in service of a very different vision of AI, one designed to be your best friend rather than an assistant. And it seems to be working for better or worse. To give a little bit of a clarification on where he might fall on that better or worse spectrum,

Starting point is 00:13:04 he adds, "the road to Her," referencing the Joaquin Phoenix movie. Anyway, he continues, "worth trying to get a sense of." And so I will leave it on that note. I think that, like Ethan says, by sheer virtue of the fact that they are trying to do something so different, if you're paying attention to the AI space broadly, at least spending a little bit of time engaging with it and understanding what it's about and what the implications might be and which type of people might be attracted to it can probably teach you something about the evolution of the AI space overall.

Starting point is 00:13:29 For now, though, that will do it for today's AI breakdown. I appreciate you listening or watching wherever you are. And until next time, peace.
