The AI Daily Brief: Artificial Intelligence News and Analysis - The State of AI Report: Research, Industry, Politics and Safety

Starting point is 00:00:00 Today on the AI Breakdown, we're reviewing a massive State of AI Report. Before that on the brief, layoffs at an AI startup, but should we be concerned? The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.org for more information about our Discord, our YouTube channel, and our newsletter. Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes. We kick off today with a story that a startup called Deepgram has had to cut around 20% of its staff, which equates to around 20 people. Now, of course, in general, in an average market, people would be quick to assume that a startup having to downsize was a reflection of the startup and not the

Starting point is 00:00:45 industry. However, over the summer, when news came of a couple different AI startups having to cut employees, it contributed to a narrative that the hype of the space was waning and that perhaps it wasn't all it was cracked up to be. You'll remember that this coincided with reports that for the first time ever ChatGPT had seen a decrease in users between May and June. Now, my assessment at the time was that the reason it was getting so much play in the press was simply that the press needed a different angle. After four or five months of non-stop, infatuated excitement about ChatGPT and other generative AI applications, there is a natural tendency to look for a new narrative. Hype waning is the perfect follow-up to hype growing. Now, of course, I do believe that there

Starting point is 00:01:26 was a little more to this story than just random startup X or Y having a bit of a struggle. Specifically, Jasper has become exemplary of an approach of AI startups to effectively build UX wrappers around APIs from other companies and try to turn that into a business. We've seen, especially as the companies that actually own those APIs build their own competing services, that that is a very difficult road to take. Indeed, for a number of different reasons, the AI space in general has been perhaps more hostile to startups versus incumbents than most tech industries that we're used to. I think some of that has to do with the advantages of scale, and the very small number of

Starting point is 00:02:02 companies that are actually developing their own models, but I think some of it also has to do with buying behavior among enterprise customers. I think that perhaps venture capitalists, at the beginning of the year and the end of last year, underestimated the extent to which companies would be looking for either, A, internal solutions that they could spin up themselves, or B, for their existing software service providers and vendors to give them an alternative to having to trust a brand new startup with their sensitive data. All of which brings us to this announcement. The company in question was a company called Deepgram, which builds speech recognition and transcription software. It's a company that's been around for quite some time. It was founded all the way back in 2015.

Starting point is 00:02:37 And like so many other AI companies, it's found itself competing with some of the giants. That includes notably OpenAI, which in September of last year released their open source Whisper speech recognition software, which, if you've ever used the ChatGPT mobile app, is just of phenomenal quality. Now, of course, on top of the competitive dynamics of the artificial intelligence space itself, it's just a very difficult environment for startups in general. We are settling into the new reality of a world that doesn't just have near zero interest rates and endless amounts of capital flowing into risky areas like private equity and VC. That means, in general, fewer startups are getting funded, fewer startups are able to raise again,

Starting point is 00:03:13 and when they do raise, it's often at lower valuations. In his letter about the layoffs, Deepgram's CEO Scott Stephenson said it was basically a self-preservation mechanism. He wrote, I'm not willing to bet that in one year or so the market will be good for raising additional funds. So we must be conservative, damping the growth at all cost mindset and instead focusing on efficiency. So when it comes to the question of what this says about artificial intelligence, I think the only thing that it says is that even AI startups are not immune to the larger macro forces at work right now. Next up, buzzy AI startup Character.AI has released a new feature called group chat. Just like it sounds, you can bring together both humans and AI characters in a single

Starting point is 00:03:52 group chat experience. So how might this be used? One suggestion that Character.AI has is imaginative entertainment. They write, have you ever wondered what would happen if we could travel through time and space and have a group chat with history's smartest figures such as Albert Einstein, Marie Curie, Nikola Tesla, and Stephen Hawking, or maybe even a group chat with mythology icons Zeus, Hades, and Poseidon, or be in a discussion with Napoleon, Athena, Genghis Khan, and Julius Caesar to speak about strategy and power. So one dimension they have is the entertainment side of things. Another is building social connections and communities based on specific hobbies and interests. They point out that some of the characters are focused around those hobbies and interests and could become the seeds for a larger

Starting point is 00:04:30 community conversation. They point to the idea of role-playing games, book clubs where friends can share their thoughts with not only each other but AI characters, and then, for students, study group sessions. Now, as I have admitted on this show, Character.AI's success and growth is somewhat surprising to me because it's a behavior that doesn't intuitively resonate with me. For that reason, I'm actually keeping an extra close eye on it, as I feel like in some ways it's almost more important to understand where my perspective, based on whatever generational biases I have, starts to diverge in significant and meaningful ways

Starting point is 00:05:01 from especially younger users who have grown up in a different set of contexts and who have a fundamentally different view about how to interact with the world. Along those lines, there has also been a lot of conversation recently around Meta's AI personas. Now, one of the things that people noted when Meta announced this feature was that they were awfully similar to Character.AI in some ways. But interestingly, Meta actually paid millions of dollars to associate their personalities with specific real-world people. However, the fact that the personas are based on real-world people but not named after them is causing some confusion. a16z partner Justine Moore writes,

Starting point is 00:05:33 The weirdest thing about the Meta AI personas is that they're taking very recognizable celebrities and turning them into characters with different names. The result is, understandably, mass confusion. Why is this Billie and not Kendall AI? Hey guys, it's Billie. I just want to introduce myself. I am here to chat whenever you want. Message me for any advice.

Starting point is 00:05:52 I am ready to talk, and I hope to talk to you soon. Now, whether it's Billie or Kendall or some other name, people are definitely also just confused around the strategy here. Billie slash Kendall posted a moody fall baking shot and wrote, Start of fall calls for baking and restarting Gilmore Girls, again, to which the top commenter said, I don't get the point of this account. What is the purpose?

Starting point is 00:06:13 Someone else wrote, I just don't understand why Kendall is doing this. It's so weird. Now, on the flip side, some people think that we're just caught up in how the world was, and that once we get used to this new reality, it could actually make for a significant difference. Entrepreneur Shahar and Ahmad wrote a piece about this called Why Meta's AI Personalities Might Just Be a Huge Game Changer. Shahar writes, at its core, Meta's new feature doesn't just redefine information exchange,

Starting point is 00:06:35 it taps into the innate human desire for connection. After chatting with them for a few minutes, you can almost forget you are not talking with a real person. For good or bad, in an age where people feel more lonely all the time, I can see how this can become a huge hit, even addicting to many people. Anyway, like I said, as something that's fairly far outside of my personal experience, it's something I'm keeping an extra close eye on. Lastly, today, a classic archetypal trope of new technology being introduced, and the incumbents coming after it, trying to throw their weight around and using whatever tools they have to halt its advance. The specific instance of that today: the Recording Industry Association of America has asked the U.S. government to place AI voice cloning sites on an official government piracy watch list.

Starting point is 00:07:15 The RIAA filed an official submission to the U.S. Trade Representative, asking for this type of platform to be added to the annual review of notorious markets for counterfeiting and piracy. Interestingly, for all of the companies in that space, and there are many, the only one the RIAA specifically names is Voicify.AI, potentially because it provides voice models of famous musicians. In its comment letter, the RIAA wrote, the year 2023 saw an eruption of unauthorized AI vocal clone services that infringed not only the rights of the artists whose voices are being cloned, but also the rights of those that own the sound recordings in each underlying musical track. Now, forgive my cynicism for a moment, but it's pretty clear which of those two rights being

Starting point is 00:07:53 infringed the RIAA cares about more. As Ben Jordan put it, Translation: technology opened a door for artists on major labels to make money outside of the labels' control, and RIAA wants the government to criminalize it. Now, as I've said before, my absolute base case when it comes to voice cloning for musicians is that there will be platform-sanctioned ways to do it that allow musicians to benefit from a behavior that I see as completely inevitable. It will be a capitalist co-optation rather than just some Pyrrhic banning, so I think they might as well get on with that. Anyways, guys, that is going to do it for the AI Breakdown Brief. Next up, the main AI Breakdown. And now a word from today's

Starting point is 00:08:30 sponsor. Are you interested in how two top-of-mind trends, AI and crypto, can work together? If so, I have the perfect podcast recommendation for you. Web3 with a16z crypto, the chart-topping show brought to you by venture firm Andreessen Horowitz. Web3 with a16z crypto is your definitive resource for the future of the internet, whether you're already building in these spaces or simply curious about what's next. If you need a place to start, they recently released an excellent episode with Stanford cryptography professor Dan Boneh and former Google X-er Ali Yahya in conversation with host Sonal Chokshi about the intersection of AI and crypto.

Starting point is 00:09:06 From fighting deepfakes and proving humanity to large language models like ChatGPT, they cover it all. I highly recommend checking it out, especially if you'd like to learn more about how AI and crypto will impact our everyday lives. Beyond crypto and AI, this show is for creators seeking more ways to truly own their work, for business leaders trying to prepare for the future today, and for innovators exploring trending tech topics. So go ahead, listen to Web3 with a16z crypto wherever you get your podcasts. Before we get to the main episode, I want to tell you about today's sponsor, NetSuite.

Starting point is 00:09:38 I know many of you guys are entrepreneurs, executives, managers, business leaders who are trying to figure out how technology is changing the world and how it can change your business. Given that, I am thrilled to have NetSuite as a sponsor of the AI breakdown. NetSuite gives you the visibility and control you need to make better decisions faster. It is the software superpower behind so many of the world's most successful companies. And for the first time in NetSuite's 25 years as the number one cloud financial system, you can defer payments of a full NetSuite implementation for six months. That's no payment and no interest for six months and you can take advantage of this special financing offer today.

Starting point is 00:10:16 Listen, NetSuite is number one because they give your business everything you need in real time all in one place to reduce manual processes, boost efficiency, build forecasts, and increase productivity across every department. I know that if you are listening to the AI breakdown, you understand intuitively and deeply just how much data matters to any modern business. Having all of your information in one place can be the difference between making the right decision and making the wrong one. I think it's awesome that NetSuite has this new offer designed to really make their suite of tools available for all the businesses that need it. So, if you have been sizing up NetSuite to make the switch,

Starting point is 00:10:51 then you know that this deal is unprecedented. No interest, no payments, take advantage of this special financing offer at NetSuite.com slash breakdown. Go to netsuite.com slash breakdown to get the visibility and control you need to weather any storm. That's netsuite.com slash breakdown. And with that, let's get to the show. For the last six years, Air Street Capital has been putting out an annual State of AI Report. Now, if you're like me, these reports don't necessarily contain new information, because obviously if you're here listening to a daily AI show, you're probably really up to speed on the developments that have happened.

Starting point is 00:11:26 So rather than a catch-up, what I think this type of report is useful for is as a summarization and almost a zoom-out of what the most important things that happened are. It's a forest-for-the-trees sort of resource that allows us to make sure we don't get lost in the weeds of any given news cycle. The areas they focus on are research, industry, politics, safety, and then they give a set of predictions. And so what we're going to do on today's show is look briefly through what they consider the most significant events in each of these categories. Again, research, industry, politics, and safety. And I'll talk a little bit about where I agree, or perhaps in some cases disagree. So first up in the research section, they write

Starting point is 00:12:05 GPT-4 is out and it crushes every other LLM and many humans. Now, this is actually something that, if you listened to last weekend's long reads type episodes, we discussed a lot: how GPT-4 has set the tone for the entire year, that in fact the phase that we are in of artificial intelligence has in some ways been defined by ChatGPT's GPT-3.5 and GPT-4. There's a sense that many have, for example, Professor Ethan Mollick, who I read from last week, that when we get to something that exceeds GPT-4, which there is some speculation that Google's Gemini might be the first thing to do so, that we will have entered a different era. I tend to think that that's true, that we are coming up on the end of this inflection phase of generative AI. Now, of course, what that means and what

Starting point is 00:12:49 comes next is something we could talk about a lot, but let's move to their other summarizations. Next up, they write, fueled by ChatGPT's success, reinforcement learning from human feedback becomes MVP. Now, this one I actually want to pair with another assessment of theirs from a little bit farther on, which is researchers rushing to find scalable alternatives to RLHF. So this is sort of a two-part assessment. The first, how important RLHF has been to the training process, how it has become the default, but second, how it clearly has real significant limitations, some of which have been coming up with more frequency. Now, they're mostly focused in this section on these scalability issues regarding RLHF, which of course are quite clear. The larger the model, the harder it is to get

Starting point is 00:13:27 humans to actually engage with it in that way. But even beyond that, this year has seen a growing focus on ethical issues surrounding reinforcement learning, specifically the impact on people, often in developing economies, who weren't necessarily prepared for what they were going to have to deal with and who had really significant mental consequences because of it. Now, the next thing they noted was how with more advanced models, there has been a move away from openly sharing research. They write, OpenAI published a technical report on GPT-4 where it didn't disclose any useful information for AI researchers, signaling the definitive industrialization of AI research. Google's PaLM 2 technical report suffered the same fate, while Anthropic didn't bother releasing a technical

Starting point is 00:14:07 report for its Claude models. Now, OpenAI explicitly pointed to this in that GPT-4 technical report. It identified the competitive landscape as well as the safety implications as the reason that it gave so little information. I think this is obviously a much bigger trend, and something that has absolutely defined this year is the shift to a much more hyper-competitive landscape, where all of the big companies, along with all of the leading startups, have been getting increasingly closed and competitive, rather than open and collaborative. The big exception to that is, of course, and perhaps unexpectedly, Meta's Llama. Now, one of the things that I think historians will review when they look back at this key point in the history of artificial intelligence

Starting point is 00:14:46 is how Mark Zuckerberg's Meta became the company that was the leading voice in trying to have more open source artificial intelligence. I think an interesting question within that is the extent to which it was a master plan based on principle, or was itself in some ways a response to seeing the growing competitive landscape and sensing that there was an opportunity to fill that role as other companies like OpenAI, ironically, got more closed. Now, even though Llama was open source, the State of AI Report also notes, I think, correctly, that to the extent that this year saw a growing arms race in closed models, so too was there growing competition around open models.

Starting point is 00:15:21 They point to Falcon 40B, RedPajama, and more recently Mistral AI's 7B model. Interestingly, they identified that based on mentions on X, ChatGPT and GPT-4 were the most discussed models, followed by Llama and Llama 2. They also argued that while closed source models get the most attention, there has been an increase in interest in open source LLMs, especially those that allow commercial use. The team found that since the end of last year, RLHF and instruction tuning have been the most trending topics. And they point out that another big shift this year has been the new attention paid to context length. Indeed, they write context length is the new parameter count. I think in many ways one of Anthropic's biggest boons this year was its introduction of a 100K model, even if in practice there are still some challenges with it.

Starting point is 00:16:03 Another discussion on the rise that they notice is the question of whether we're running out of human-generated data and what it will mean for the training of future models. They discuss that even as synthetic data seems to be becoming more helpful, there is still, as they put it, evidence showing that in some cases generated data makes models forget. I tend to think that this is going to be one of the biggest points of exploration for the industry over the next year. Remember, Meta had that hugely intriguing little chart when it released Llama 2 that suggested that an unreleased model that was trained with synthetic data actually outperformed the model that was trained on non-synthetic data. But we haven't really heard much more about it since then. They point out that even as societal interest in things like watermarking to help determine what has been AI created becomes more important, watermarking technology itself has some real challenges. Another trend that they identify that I think is going to be hugely important, they call Welcome, Agent Smith: LLMs

Starting point is 00:16:52 are learning to use software tools. The most obvious tool, they write, is a web browser allowing a model to stay up to date, but practitioners are fine-tuning language models on API calls to enable them to use virtually any possible tool. If you've paid any attention to this channel, you'll have heard me talk about how much interest there is in AI agents, and that I think is directly in line with this observation, and I think is going to define a huge amount of what we see coming next year. Now, there are just a ton of other things in this research section alone. There's a discussion of vision language models, autonomous driving, foundation models for robotics, a discussion of the text-to-video generation race, a little bit of NeRFs, a whole discussion of medical AI. Overall, there's

Starting point is 00:17:30 nearly 70 slides just on the research piece alone. Now, what about the industry section? Perhaps unsurprisingly, right at the top is a discussion of compute and how much it has impacted the first great big public market winner in the AI space, which is, of course, Nvidia. They discuss the GPU shortage, which has been, of course, a significant factor in shaping how the AI field has developed this year, and point out that access to GPUs has been a key differentiator for certain startups. Interestingly, they take that famous phrase, X is the new oil, and apply compute to it, which I think many people will agree with, but specifically put it in the context of the Gulf states themselves, who are making increasingly aggressive moves to have access to advanced

Starting point is 00:18:08 compute at a hugely significant level. One interesting slide discusses what we talked about in yesterday's show, which is that Tesla is, at least for them, quietly marching towards having one of the largest compute clusters in the world, and they explore what use cases have been most significant for this new set of generative AI applications. They point to education as something that was disrupted almost immediately, and to the incredible shifts happening in the way that people code. They have some conversation that relates to today's brief around customized chatbots and AI characters, and of course they talk about the intensifying competition around text-to-image models. In that context, of course, they point out that copyright infringement issues and questions

Starting point is 00:18:43 are being fought on multiple legal fronts and could have significant impact on how some of these models develop over the next couple years. The other big trend they noticed, which is something we've talked about frequently on this show, is that generative AI has really been the last bastion in some ways of the ZIRP-era venture capital world, although how long that can last remains to be seen. Now, we'll move a little more quickly through their politics and safety sections, given that basically the entirety of the politics section is identifying that the world is really divided on how to handle this, and that when it comes to global governance, it's still in its very early stages because there just isn't a lot of agreement yet.

Starting point is 00:19:19 Now, what that's created is a vacuum in which industry self-regulation efforts have arisen, although how far those efforts get remains to be seen. The other big dimension of politics that they talk about is, of course, the new front in U.S.-China economic and strategic tensions, in which the U.S. has blocked China's access, as much as possible, to a lot of advanced AI inputs. Finally, when it comes to safety, they identify something that has been one of the biggest themes for us this year, which is, of course, that the x-risk debate has exploded into the mainstream. This has obviously added a new dimension to the open- versus closed-source debate. Indeed, this is increasingly becoming wrapped up in the question of AI safety

Starting point is 00:19:53 and x-risk. Now, for those of you who are interested in actual approaches to and trends in how people are thinking about addressing different risks associated with AI, the trends that they list in this report are actually very interesting. They have a section, for example, on Anthropic's constitutional AI and self-alignment. They ask how hard is scalable supervision. And ultimately, this is a great starting point if you want to understand how many people in the space are thinking about these issues. What about predictions? Predictions are always the most fun in some ways. So they have 10 here. I'll go through them quickly and then I'll double tap on a couple. One, a Hollywood-grade production makes use of generative AI for visual effects. Two, a generative AI media company is investigated for its misuse during the 2024 U.S. election circuit. Three, self-improving AI agents crush SOTA in a complex environment.

Starting point is 00:20:35 Four, tech IPO markets unthaw, and we see at least one major listing for an AI-focused company, e.g. Databricks. Five, the GenAI scaling craze sees a group spend over a billion dollars to train a single large scale model. Six, the U.S.'s FTC or UK's CMA investigate the Microsoft/OpenAI deal on competition grounds. Seven, we see limited progress on global AI governance beyond high-level voluntary commitments. Eight, financial institutions launch GPU debt funds to replace VC equity dollars for compute funding. Nine, an AI-generated song breaks into the Billboard Hot 100 Top 10, or the Spotify Top Hits 2024. Ten, as inference workloads and costs grow significantly, a large AI company, e.g. OpenAI, acquires an inference-focused AI chip company. I think,

Starting point is 00:21:15 some of these are nearly guaranteed. Hollywood-grade production making use of generative AI, for example, is definitely going to happen. I think when it comes to music, I'm not sure that we'll get a top-10 hit, but I do think that we'll see the first platform that is sanctioned by musicians and labels that allows people to use artist likenesses and voices on tracks that are then allowed to be played in normal arenas. One of the most interesting ones to me, if in some ways the more boring on the face of it, is this idea that financial institutions will launch GPU debt funds to replace VC equity dollars for compute funding. I think this one has a very high likelihood of happening. We're already seeing compute and access to compute being used as a lever for venture capitalists

Starting point is 00:21:53 to attract better deals. And I think that seeing a subsection of funders that are specifically dedicated exclusively to the economics of providing access to compute is very, very likely to happen. I also think that there's a natural state that happens with venture capital funding where when you start to see areas that aren't just generalist development, but are specific cost centers, such as access to cloud services, paid customer acquisition, or now compute, you tend to get specialized funders. Anyway, overall, there is a ton in here. It is really worth taking even more time to dig into it. Thanks to the folks at Air Street, especially Nathan Benaich, who's the primary author, for what must have been a ton of work putting this all together. That's going to do

Starting point is 00:22:30 it for today's AI breakdown. Until next time, peace.
