The AI Daily Brief: Artificial Intelligence News and Analysis - The State of AI in 2024

Episode Date: January 2, 2024

NLW breaks down the industry as 2024 begins, from policy to AI safety to the state of the art in technology.

Today's Sponsors: Listen to the chart-topping podcast 'web3 with a16z crypto' wherever you get your podcasts or here: https://link.chtbl.com/xz5kFVEK?sid=AIBreakdown

ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown
Join the community: bit.ly/aibreakdown
Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on the AI Breakdown, we're doing a look at the state of AI heading into 2024. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our YouTube, our Discord, and our newsletter. Welcome back to the AI Breakdown. Happy 2024 to all of you out there. Today, what I thought I would do, because the news cycle is really just still heating up in the artificial intelligence world, is instead of looking at some of the smaller news that we've had over the last couple of days, to try to do a frame-setting episode for the kickoff to this year.
Starting point is 00:00:44 So what we're doing today is the state of things. These are my highly subjective opinions. And we're going to look at everything from open source to adoption to policy to safety to chips and beyond. Where we will start this state of things report is with the state of the art. And I think the notable thing to mention here is the surprising extent to which GPT-4 has remained king of the hill when it comes to LLM capacity. Back in September, Professor Ethan Mollick wrote, The AI thing to watch in the next month is Google Gemini.
Starting point is 00:01:13 It is likely to be the first AI to beat GPT-4. That means it can give us a hint of the future. How much better is it? Do larger models really hallucinate less? Do big general models keep beating small specialist ones, etc.? Now, of course, at the beginning of December, we did get Gemini formally announced. Much was made by Google of the fact that Gemini Ultra,
Starting point is 00:01:32 the highest capacity version of the LLM, beat GPT-4 on the MMLU benchmark. However, there were a couple of things that quickly took the luster off of that announcement. The first was that the methodology for GPT-4's measurement of MMLU versus Gemini Ultra's was not an apples-to-apples comparison. The second, and even more significant, was that Gemini Ultra remained a thing that was promised for the future. Indeed, the version of Gemini that we actually got access to after the announcement was instead
Starting point is 00:01:59 Gemini Pro, which is closer to a GPT-3.5 level LLM. In the wake of that announcement, Professor Ethan Mollick again writes, the recent releases of many GPT-3.5 class AIs, Grok, Mixtral, Gemini Pro, are oddly illuminating about the future of frontier AI. It's been a year and no one has beat GPT-4. Will they? Is there some magic there? Does it indicate a limit to LLMs?
Starting point is 00:02:23 Will GPT-4.5 be another huge jump? These are questions with very large stakes, and the fact that Gemini Ultra just barely beats GPT-4 on some of the measures Google released raises even more questions. So as I see it, the big questions heading into this year when it comes to LLMs are one, how fast other things can actually catch up to GPT-4, and specifically, does Gemini Ultra actually exceed its capacity? And then two, how fast do we get to GPT-4.5 and how different will it really be? In other words, in many ways, from where I'm sitting, OpenAI has the chance once again to dictate the terms of the conversation when it comes to the state of the art of
Starting point is 00:03:00 generative AI. Interestingly, moving on to our next category, open source, one of the projects that has gotten the most attention over the last few months is, of course, Mistral. Mistral first raised eyebrows for its huge pre-seed round of over $100 million raised back in the summer, but then its Mistral 7B model became a favored choice of many developers looking for an open source alternative to ChatGPT. In December, CEO Arthur Mensch proclaimed that the company was planning to release a GPT-4-level open source model in 2024.
Starting point is 00:03:29 Now, the implications of that could be significant, but Mistral is far from the only company that's trying to race to that milestone. Before Mistral started to suck some of the open source oxygen out of the room, the conversation had, of course, been dominated by Meta's Llama. Llama 2 emerged in the fall and brought with it the commercial license that people had been looking for with Llama 1, alongside GPT-3.5 level capacities in an open source package, or at least an open source-ish package. Now, even at the time, reports were that Facebook engineers were talking about moving straight
Starting point is 00:03:59 on into Llama 3 and trying to get that GPT-4 level performance from that as well. I think in many ways, when it comes to open source in 2024, Meta and Mistral are going to be battling it out for the affiliation of developers. Now, of course, that's not Meta's only battle. Meta, and in particular their chief AI scientist, Yann LeCun, have emerged as perhaps the strongest industry voice pushing back against the AI safety narratives that have come to dominate policy and mainstream discussions. We'll come to those policy and safety discussions in just a moment, but before that I want to look to a question of adoption. Generative AI is a very weird technology in some ways.
Starting point is 00:04:35 It's weird in the sense that, although there are hundreds of millions of people who have tried it over the past year since the introduction of ChatGPT in November of 2022, we're very clearly still in the early adopter phase of the technology. The people who are actually using it day in and day out are much more advanced technology users than the mainstream, but some of that seems poised to change in the year to come. If you listened to the AI Breakdown throughout the fall, you probably heard me use the word integration a lot, as in addition to the battle and the AI arms race for ever more advanced models, we were seeing a lot of companies and projects who were focused on instead
Starting point is 00:05:09 the way that their technology could be integrated into existing workflows as a way to build adoption. Right now you can't throw a stone at a major publication and not see some story that suggests that this AI adoption or integration phase is what's coming next. First of all, we're expecting to see a whole movement of PCs and laptops and tablets that are all marketed and advertised as AI PCs. An example of this comes from The Verge, who wrote on December 28th, "Microsoft's next Surface laptops will reportedly be its first true AI PCs." What's more, market analysts are also expecting increased AI adoption in the enterprise coming up this year. There, of course, has been a ton of prediction-style content, wrapping up 2023 and moving into
Starting point is 00:05:48 2024, and a common theme that I've seen from market analysts is excitement about enterprise adoption of AI. For example, Fred Havemeyer, the head of US AI and software research at Macquarie, said, "I believe that in 2024, we're going to begin seeing the first really significant and material signs of enterprise adoption of generative AI, and that going also into calendar year 2025 we'll be seeing significant adoption trends occurring. With products like Microsoft 365 Copilot now rolling out, enterprises will actually have the opportunity to adopt and purchase these products." So basically, this isn't just an argument that enterprises are getting more ready to adopt, although that's part of it. It's also that the platforms that they already use are fully
Starting point is 00:06:26 integrating AI into their next offerings, meaning that almost by default there's going to be more exposure to artificial intelligence. Now, what about the state of things when it comes to policy? The most extensive legislative effort around AI last year was, of course, the EU AI Act. That act took a risk-based approach to AI, focusing not on the underlying technology, but on how risky the use cases were. So, for example, the highest risk category was prohibited AI, and that includes things like social credit scoring systems, behavioral manipulation, biometric categorization systems, predictive policing applications, and more. Now, high-risk AI, which has higher compliance requirements, includes things like vehicles, influencing elections and voters, access to services like
Starting point is 00:07:08 insurance, banking, and credit, critical infrastructure management, and other types of law enforcement applications. Now, although this legislation has continued to move forward, it's not without its controversy. Many of the more pro-entrepreneurial and pro-enterprise parts of the EU bloc, most notably French President Emmanuel Macron, have been tearing into the EU AI Act, saying that Europe will become the leader in AI regulation without the corresponding industry, which will certainly flow elsewhere. He said, "We can decide to regulate much faster and much stronger than our major competitors, but we will regulate things that we will no longer produce or invent. This is never a good idea." Quickly, a brief word from today's sponsor. As a listener of this show, I suspect you
Starting point is 00:07:49 like to stay up to date on all things AI and tech, which is why you have to check out the chart-topping podcast web3 with a16z crypto. Produced by venture firm Andreessen Horowitz, web3 with a16z is the perfect companion podcast to the AI Breakdown. web3 with a16z crypto is your definitive resource for the future of the internet, whether you're interested in the convergence of AI and crypto or simply curious about what's next. If you need a place to start, they recently released an excellent episode with Stanford cryptography professor Dan Boneh and former Google X engineer Ali Yahya in conversation with host Sonal Chokshi about the intersection of AI and crypto. From fighting deepfakes and proving humanity to large language models like ChatGPT, they cover it all. I highly recommend
Starting point is 00:08:31 checking it out, especially if you'd like to learn more about how AI and crypto will impact our everyday lives. Beyond crypto and AI, this show is for creators seeking more ways to truly own their work, for business leaders trying to prepare for the future today, and for innovators exploring trending tech topics. Don't miss out. Follow web3 with a16z crypto on Apple Podcasts, Spotify, or your favorite listening app. Now, the big question is what the requirements will be around foundation models like OpenAI's GPT. Remember, this law was being written before even the introduction of ChatGPT, and much of the time was spent focused on areas that weren't generative AI per se. How to handle and how much to include from that new world of generative AI has been a big
Starting point is 00:09:13 debate ever since ChatGPT launched, and one that is clearly still very active. Now, when it comes to the U.S., there have been numerous bills introduced, and Senate Majority Leader Chuck Schumer has been holding a series of closed-door information sharing forums by which he is hoping to educate himself and his Senate colleagues on the real challenges and opportunities of AI. But given that 2024 is an election year, it seems very unlikely that there's going to be a lot of movement for comprehensive AI legislation when it comes to the U.S. Instead, where a lot of people are looking is to the election itself as a real case study in how much trouble this technology is going to be when it comes to public information. There are, of course, very reasonable concerns that interested parties will
Starting point is 00:09:53 use AI to try to spread false information, make political candidates that they don't like look bad, confuse voters, and basically do all the things that people have been doing in elections forever, but with a new superpower technology behind it. And I think it's going to be enormously telling to see how justified those concerns will be. In many ways, I tend to think that we might have the opposite problem, in which people so quickly adapt to the idea that everything that they might read could be created by AI that they simply don't believe anything. In other words, I think that the issue might not be that they're tricked into believing something that they shouldn't, but that they don't believe anything at all, or at least nothing that doesn't confirm their priors. No matter what,
Starting point is 00:10:30 it is going to be a very telling period that will teach us a lot about how AI functions in the real world. Now, the legal system is also poised to have to handle much regarding AI in the coming year. Chief Justice John Roberts released his annual report last week and extensively discussed artificial intelligence. He wrote, "I predict that judicial work, particularly at the trial level, will be significantly affected by AI. Those changes will not only involve how judges go about doing their job, but also how they understand the role that AI plays in cases that come before them." Now, of course, in addition to AI just impacting the legal system, the legal system is going to have to consider many questions when it comes to AI. When, about a week ago, the New York Times announced that it was suing Microsoft and OpenAI, many people saw it as the Napster moment for generative AI,
Starting point is 00:11:15 where a powerful old-world industry sues a devil-may-care, copyright-ignoring technological upstart and potentially alters the way that that industry evolves. Now, of course, there is no guaranteeing the case goes in the New York Times' favor, but this will be a central question of 2024 and the shape of the AI industry to come. Now, obviously one of the things influencing the policy conversation is the larger AI safety discourse. Heading into 2023, the AI safety folks were still largely screaming into the void. The idea that advanced artificial intelligence could pose an actual risk to humanity wasn't something that people gave any sort of real credibility to,
Starting point is 00:11:52 mostly because they had never even been really asked to think about it before. However, throughout the year, the AI safety message got louder and louder and louder. There were a few key moments for that, including, I think, the six-month pause letter, which, while ineffective in actually shutting things down, was certainly effective in getting headlines, alongside the high-profile defection of Geoffrey Hinton from Google and the media tour that followed, where he publicized many of his concerns about the future of artificial intelligence, the industry to which, of course, he had dedicated his life. Now, interestingly, Zvi Mowshowitz, who is fairly pessimistic when it comes to this question,
Starting point is 00:12:24 did a poll on Twitter that got 3,300 votes, asking, "Did your P-Doom go up or down in 2023?" Now, for the uninitiated, P-Doom is the probability of doom, or the percentage chance that you ascribe to AI ending humanity in some catastrophic way. 25.3% said no substantial change. 13.1% just wanted to see the results. But then the rest of the votes were nearly entirely split between went up and went down. In fact, the percentage of people that said that their P-Doom went down slightly beat out went up, 30.9% to 30.7%. Now, some accounts found that shocking. The AI Safety Memes account said,
Starting point is 00:13:00 nearly as many people had P-Doom go down, I am confused. But Geoffrey Miller, at primalpoly, perhaps gives an indication of why, writing, P-Doom assuming nobody fights the AI industry, went up. But actual P-Doom, seeing how many people suddenly understand the extinction risks
Starting point is 00:13:15 and are willing to fight the AI industry, went down. Which broadly maps to my sense that the AI safety discourse definitely went mainstream in 2023 and is going to be an even more significant conversation in 2024. Now, one more big theme that has
Starting point is 00:13:29 dominated 2023 and that's interesting to look at coming into 2024 is the availability of AI chips. Nvidia, of course, had an incredible year in the stock market, in large part because their chips were so central to all of these new generative AI applications. Indeed, at times, companies even as big as OpenAI were so constrained in terms of access to compute that they actually had to turn off premium subscriptions to GPT-4 for a while. Of course, with capitalism doing capitalism things, 2023 saw a surge of efforts to try to make it so that Nvidia wasn't the only game in town. Microsoft joined the ranks of big tech companies making their own chips, announcing the Maia 100 and the Cobalt 100. AMD announced a new chip meant to make them more competitive with their
Starting point is 00:14:08 biggest rival, Nvidia. Intel also tried to get some momentum around AI. And of course, Amazon teamed up with Anthropic, with a major part of their deal being focused on the AWS Trainium and Inferentia chips. Still, analysts on Wall Street are very bullish on Nvidia. Among the Magnificent Seven tech stocks, the expected one-year return of Nvidia among Wall Street analysts is higher than that of any other company, with a 34% expected return heading into next year. And so now as we round out this state of AI report, let's look at a few of the big categories of applications for AI and what I'm watching in those areas. When it comes to LLMs, in addition to just the general state-of-the-art competition that we talked about right at the beginning, things that I'm watching include
Starting point is 00:14:48 Grok and whether Elon and the Twitter team can find some differentiation that makes it really useful, which I don't think is impossible given Twitter's real-time information stream, but I'm also really interested to see how the GPT marketplace plays out. This was one of the most hyped announcements coming off of OpenAI Dev Day, but then sort of got batted out of the way, given all of the controversy surrounding Sam Altman's firing and then rehiring, but I would say that the skepticism is higher than it's ever been around how successful that GPT marketplace will be. Jared Tyler, for example, tweeted, now that the GPTs hype has died down, I can't help but wonder if the GPT marketplace and revenue sharing is going to flop. I know there's good GPTs out there, but the only ones that
Starting point is 00:15:24 have been useful to me are ones I made tailored specifically to my use case. Now, by and large, this has been my experience as well. But then again, discoverability is exactly what a marketplace is trying to solve. So I'm not quite ready to write the GPT marketplace off yet, although I do think that it's going to have to face a wall of skepticism that it might not have faced had the last couple of months played out differently. Now, over in the world of image generators, the competition between DALL-E 3 and Midjourney and Stable Diffusion continues to accelerate, with Midjourney and DALL-E, I think, really moving into pole position even more. DALL-E 3's integration into ChatGPT obviously gives it a huge advantage, as does its ability to handle text.
Starting point is 00:16:02 Midjourney 6 came out just before the holiday and did for the first time show some actual capacity around text generation, although it seems not to be as advanced as DALL-E 3, alongside just incredible quality overall. From a usability standpoint, though, maybe a bigger deal is that the web version of their application is finally starting to roll out to power users, and it's anticipated to come to a broader user set and move people out of Discord fairly soon. Now, interestingly, Midjourney also recently reported that they were going to start training their own video models. That follows not only an announcement of Stability's first video models last month, but also really exciting competition between Pika and Runway. Again, right around the holidays, Pika 1.0 opened up to all of the people on the wait list,
Starting point is 00:16:43 and the creations so far have been nothing short of amazing. Indeed, many people wonder if, in the same way that 2023 represented a real breakout year and mainstreaming of image generation AI, we'll see something similar when it comes to video generation AI in 2024. Elon Musk, for his part, responded to a post about Pika, saying, "AI movies next year." Music generation is also having a moment thanks to Suno. One of my last episodes of the last year was a test of a number of different audio generation apps, but Suno is certainly the one that's gotten the most attention recently.
Starting point is 00:17:15 That's because, in addition to just being able to develop melodies and instrumentation, it can actually generate the basis for an entire song, lyrics, singing, and all. Still, if I had to identify a big trend for 2024 when it comes to generative AI, one of the areas that I would look to is wearables or AI hardware devices. A set of these have been getting people excited throughout the year. The Humane AI Pin probably got the most buzz, finally premiering late in the year, although perhaps to mixed sentiment, but other products including the Tab and the Rewind Pendant also have people really excited about what an AI hardware device could look like. And that's to say nothing, of course, of whatever it is that OpenAI's Sam Altman and former Apple head of
Starting point is 00:17:52 design Jony Ive are cooking up and actively recruiting people like Apple's VP of iPhone to come join and work on. Now, the one other area that I'm going to be spending more time looking at this year is around AI characters or companions. You might have seen this chart going around, which was a look at the 50 most visited AI tools between September 2022 and August 2023. Unsurprisingly, ChatGPT was number one with 14.6 billion visits. But number two, by a significant amount, was Character.AI with 3.8 billion visits. Character.AI is a place where people can create AI characters based on whatever they want. It could be famous people like Albert Einstein, or it could just be characters that they generate and have extensive conversations with.
Starting point is 00:18:34 Users apparently are in some cases spending up to two hours a day talking to these characters, and the usership slants much younger. Now, on top of just Character.AI, there's also a larger trend towards AI companions that are explicitly marketed and advertised as AI girlfriends and boyfriends. I expect there to be a ton of ink spilled this year around the sociological implications of this. But like it or loathe it, this is certainly something that's going to be a more present phenomenon when it comes to this generative AI space. And so, friends, that is my look at the state of AI heading into 2024. This promises to be a wild, unpredictable, extremely fast-moving year, and I am excited to be kicking it off and covering it every day with all of you here. Hope you had a great
Starting point is 00:19:15 holiday. And until next time, peace.
