The AI Daily Brief: Artificial Intelligence News and Analysis - Is AI Slowing Down?

Episode Date: November 13, 2024

Is AI development hitting a wall? Today’s episode explores reports that OpenAI’s Orion model, expected to surpass GPT-4, shows less impressive gains than anticipated. Researchers are now shifting focus to inference scaling, enhancing real-time reasoning rather than relying solely on traditional pre-training. Brought to you by: Vanta - Simplify compliance - vanta.com/nlw. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown

Transcript
Starting point is 00:00:00 Today on the AI Daily Brief, is AI actually slowing down? Before that in the headlines, Salesforce hires 1,000 salespeople to sell AI salespeople. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. Today we are going to rip through a set of stories as the main episode is a little bit longer than usual. We kick it off with a story from Salesforce. Now, Salesforce is pushing the agent story more than any other company out there.
Starting point is 00:00:41 You might have heard CEO Marc Benioff absolutely, positively screaming about the fact that the assistant era of AI has been a big letdown, because of course it's really, and was always really, about agents all along. The news today is that Salesforce is hiring 1,000 people to push their new AI agent platform. Benioff said that the hiring surge is aimed at making the most of the amazing momentum of their new product, adding, "Agentforce became available just two weeks ago and we're already hearing incredible feedback from our customers." One of the important pieces of context is that Salesforce has been downsizing their sales team over the past two years, so there is clearly a lot of excitement
Starting point is 00:01:16 around this new product line. Of course, the joke everyone is making on Twitter is some variation of what Thomas Smalley wrote: "Salesforce hires 1,000 salespeople to sell their AI salesperson." Just writing that out sounds like a joke. But still, I think it's a really positive sign that there's so much demand that this is the actual direction they're headed. Meanwhile, over in Elon world, X is testing a free version of xAI's Grok chatbot. Until now, the AI features of X have been available only to paying customers. However, over the weekend, it seems that Grok was available to free users in select markets. TechCrunch confirmed that all of New Zealand had access to Grok, and it seems that Australia also has access.
Starting point is 00:01:53 Now, usage is limited to 30 text queries every two hours and three image analysis questions per day, but there's still a lot of excitement. Dr. Warface writes, "If Grok becomes free, I will delete ChatGPT." Over in the land of Google, according to the Wall Street Journal, Google News exec Shailesh Prakash has resigned. The resignation comes as news publishers rethink their relationship with big tech in the AI era. Publishers have seen traffic and revenues decline, with many blaming AI for serving content summaries rather than driving clicks.
Starting point is 00:02:21 The issue has become even more contentious with Google rolling out their AI Overviews globally. Prakash was originally brought in two years ago from the Washington Post to bridge the gap between the two industries, and therefore many are wondering if his departure is a signal that the relationship is souring even further. Then again, there are clearly structural changes underway at Google following a year marked by losses in antitrust lawsuits, so it's hard to know exactly what's going on. One other interesting story out of Google: the company's DeepMind division has unexpectedly released AlphaFold 3. The source code and model weights for the protein folding AI are now available for academic use, which could further accelerate scientific progress and drug development.
Starting point is 00:02:58 The surprise announcement comes weeks after the system's creators Demis Hassabis and John Jumper were awarded the Nobel Prize in Chemistry for their groundbreaking model. AlphaFold 3 seems like another paradigm shift for biological science. Version 2 could accurately predict protein structures based on nothing but their amino acid sequences. The new version can also model the complex interactions between proteins, DNA, RNA, and small molecules. This should massively expand its usefulness in drug discovery and disease treatment. Traditional methods of studying these interactions often take months of lab work with no guarantee
Starting point is 00:03:27 of success. This marks an important shift in computational biology. AI methods now outperform our best physics-based models in understanding how molecules interact. Essentially, AlphaFold has gone from a specialized tool to a more general Swiss Army knife for researching molecular biology. Over in the market realm, U.S. companies are set to spend $30 billion on data centers as AI drives a construction boom. That amount has more than doubled since 2022, when ChatGPT was first released.
Starting point is 00:03:53 Data center investment is now bigger than every other major infrastructure category, including hospitals, schools, hotels, and transportation. It was a smaller category than each of these until late last year. Investment manager KKR believes the trend is exponential, expecting it to hit $250 billion per year globally within the next three to four years. Although stick around for our main story, as there could be some implications on that front. That, however, is going to do it for today's headlines. Next up, the main episode. Today's episode is brought to you by Plumb. Want to use AI to automate your work but don't know where to start? Plumb lets you create AI workflows by simply describing what you want. No coding or API keys required. Imagine typing out AI, analyze my Zoom
Starting point is 00:04:33 meetings and send me your insights in Notion and watching it come to life before your eyes. Whether you're an operations leader, marketer, or even a non-technical founder, Plumb gives you the power of AI without the technical hassle. Get instant access to top models like GPT-4o, Claude Sonnet 3.5, AssemblyAI, and many more. Don't let technology hold you back. Check out Use Plumb, that's Plumb with a B, for early access to the future of workflow automation. Today's episode is brought to you by Vanta. Whether you're starting or scaling your company's security program, demonstrating top-notch security practices and establishing trust is more important than ever.
Starting point is 00:05:06 Vanta automates compliance for ISO 27001, SOC 2, GDPR, and leading AI frameworks like ISO 42001 and the NIST AI Risk Management Framework, saving you time and money while helping you build customer trust. Plus, you can streamline security reviews by automating questionnaires and demonstrating your security posture with a customer-facing trust center, all powered by Vanta AI. Over 8,000 global companies like LangChain, Leela AI, and Factory AI use Vanta to demonstrate AI trust and prove security in real time. Learn more at vanta.com slash NLW. That's vanta.com slash NLW. Today's episode is brought to you by Superintelligent. Every single business workflow
Starting point is 00:05:46 and function is being remade and reimagined with artificial intelligence. There is a huge challenge, however, of going from the potential of AI to actually capturing that value. And that gap is what Superintelligent is dedicated to filling. Superintelligent accelerates AI adoption and engagement to help teams actually use AI to increase productivity and drive business value. An interactive AI use case registry gives your company full visibility into how people are using artificial intelligence right now. Pair that with capabilities building content in the form of tutorials, learning paths, and a use case library. And Superintelligent helps people inside your company show how they're getting value out of AI, while providing resources for people to put that
Starting point is 00:06:25 inspiration into action. The next three teams that sign up with 100 or more seats are going to get free embedded consulting. That's a process by which our Superintelligent team sits with your organization, figures out the specific use cases that matter most to you, and helps ensure support for adoption of those use cases to drive real value. Go to besuper.ai to learn more about this AI Enablement Network. And now back to the show. Welcome back to the AI Daily Brief. Today we are having a really fascinating conversation that has implications for everything in AI, from safety to the business of it, to technology frontiers, to applicability to regular life, and all sorts of interesting things bottled up in this one conversation around whether the pace of development of
Starting point is 00:07:12 AI is actually slowing. Now, this is not a new conversation. It's something that has been debated for a very long time. And by the way, there are wildly divergent opinions here. There are some folks who think that you can just add more compute and more data and get more performant models. There are others who think that there is no path to AGI with the current architectures we're using. This debate is why people like Yann LeCun can, with a straight face, say that they believe
Starting point is 00:07:42 that AI is dumber than a cat. And of course, the implications of that view are significant. In the particular case of LeCun's point, it means that to him all of these existential X-risk-type concerns are way, way overblown. Now, what no one is arguing is that AI isn't extremely valuable even in its state right now. One of the things that I've often said is that even if AI stopped developing in this moment, it would still take years for the world to adopt and adapt to the new processes that have been made available by just the technology that we have in this moment. However, it still would be fairly significant if, in fact, the rate of AI improvement was decreasing significantly.
Starting point is 00:08:19 All right, so let's start by discussing where this conversation, or at least this iteration of the conversation, actually came from. The TLDR is that The Information reported that OpenAI is apparently looking to develop new strategies to deal with a slowdown in AI improvement. Back in May, OpenAI CEO Sam Altman had told staff that he expected their latest frontier model, which they're calling Orion internally rather than GPT-5, to be significantly better than last year's flagship model. At the time, OpenAI had only completed 20% of the training process for Orion, and it was apparently
Starting point is 00:08:51 already on par with GPT-4. However, now that there have been a few more months of work, things are starting to look a little bit different. According to The Information's sources, employees who have tested the Orion model found that although the performance does exceed the current models, the jump isn't nearly as profound as the improvement from, for example, GPT-3 to GPT-4. There are even some use cases where the model might not be consistently better than the previous state of the art. According to one employee, while Orion does better at language tasks, it doesn't
Starting point is 00:09:20 necessarily outperform when it comes to coding, which, given how prominent coding is as a use case, could be a significant problem. Now, what this is leading to is a different approach to thinking about scaling. The Information writes, "In response to the recent challenge to training-based scaling laws posed by slowing GPT improvements, the industry appears to be shifting its efforts to improving models after their initial training, potentially yielding a different type of scaling law." In another approach, OpenAI has apparently created a foundations team to figure out how the company can improve their models given what is increasingly a scarce supply of novel training data. Those strategies reportedly include training Orion on synthetic data produced by other AI models,
Starting point is 00:09:58 as well as doing more to improve the model's reasoning during the post-training process. One of the big challenges is that at this point, the current frontier models have effectively been trained on pretty much all of the Internet's data. Still, by far the more interesting shift here is from thinking about scaling simply as a matter of pre-training to instead thinking about the post-training implications. And this gets back to something that we've talked about a lot here, the idea that o1, which is, of course, OpenAI's more advanced reasoning model, is a fork on the evolutionary tree of LLMs. Sam Altman, when he announced it, was clear that this reasoning model
Starting point is 00:10:31 represented a separate path from GPT-4. At the time, it was assumed that OpenAI would parallel process these paths, but increasingly it seems like they're leaning into the reasoning path. And by the way, it's not just OpenAI who's dealing with this problem. A few weeks ago, Alex Heath of The Verge reported, "I've heard that the latest Gemini model isn't showing the performance gains that Google had hoped for, though I would still expect some interesting new capabilities. The chatter I'm hearing in AI circles is that this trend is happening across companies developing leading large models." Former OpenAI chief scientist Ilya Sutskever spoke to this issue, stating, the 2010s were the age of scaling, now we're back in the age of wonder and discovery once again.
Starting point is 00:11:09 Everyone is looking for the next thing. Scaling the right thing matters now more than ever. And this once again gets us to what Turing Award winner Yann LeCun was pointing out in last month's Wall Street Journal piece. The WSJ wrote, "LeCun thinks that the problem with today's AI systems is how they are designed, not their scale. No matter how many GPUs tech giants cram into data centers around the world, today's AIs aren't going to get us to artificial general intelligence." There are also questions around whether synthetic data will actually create its own set of new problems, although that's far from clear as well. But overall, it still appears that
Starting point is 00:11:41 OpenAI is planning to double down on their reasoning pathway. Last month at an appearance at TED AI in San Francisco, OpenAI researcher Noam Brown said, "It turned out that having a bot think for just 20 seconds in a hand of poker got the same boost in performance as scaling up the model by 100,000x and training it for 100,000 times longer." And while this does open up a new pathway for improvement, it also implies having a lot more computing power to serve answers. Sonya Huang, a partner at Sequoia Capital, said, "This shift will move us from a world of massive pre-training clusters towards inference clouds, which are distributed cloud-based servers for inference." Nvidia CEO Jensen Huang also referred to this pathway, stating last month,
Starting point is 00:12:17 "We've now discovered a second scaling law, and this is the scaling law at a time of inference." So where has the discussion been? First of all, there has been some confirmation from others that this is bubbling behind the scenes. Yam Peleg wrote, "Heard a leak from one of the frontier labs, not OpenAI, they reached an unexpected huge wall of diminishing returns, trying to brute force better results by training longer and using more and more data than what is published publicly." Others who have been saying this stuff for a while used this as a moment to reinforce their point. Professor Pedro Domingos wrote, "Scaling laws are S-curves, not exponentials." INSEAD Professor Jason Davis explained that a little further. He said this may be the most important
Starting point is 00:12:56 strategic question about AI. Is the rate of improvement of LLM performance starting to decline? Are we seeing maturation in the S-curve? The report from The Information gives us some data, but I think we really don't know, and is at odds with many insiders working on foundation models, especially since the effective performance of LLM systems will depend on complementary innovations that extend its usefulness, and perhaps many years of individuals and organizations finding the right use cases and configuring their workflows to leverage them. Others, including The Information founder Jessica Lessin, wondered what the implications are for the data center business.
Starting point is 00:13:27 She wrote, "If this continues, the slowing rate of gains in training LLMs has huge implications. One thought I had this morning: Will being at the bleeding edge in chips matter as much over time? Will China therefore catch up to the U.S. more quickly?" Others pointed out that of course there was always going to need to be something new. Sai on Twitter writes, pre-transformer architecture, AI had hit a brick wall. More insights and architectural tweaks will be needed to break successive barriers for improvement.
Starting point is 00:13:51 It would be unwise to imagine that current LLM is the final solution. Perhaps the most salient message, though, is just that we need to think about scaling and training in a slightly different way. Estreya Intel on Twitter slash X says, "AI hitting a roadblock? Give me a break. As if the diffusion transformer breakthrough hadn't happened, as if o1 reasoning isn't groundbreaking on its own. The only thing reaching a plateau is pre-training, and that's a milestone, not a ceiling. Pre-training was never meant to be the holy grail of neural network optimization. We'll inevitably move towards models that can actively learn and reason."
Starting point is 00:14:22 Haider at Slow Developer writes, "Even if current LLMs can't scale to AGI through compute increases alone, it's not a problem. OpenAI o1 shows further innovations are possible, and we haven't reached inference compute yet. Things are progressing even if LLMs don't magically scale to AGI with more compute." Professor Shelly Palmer writes, There's been a lot of chatter about the end of LLMs or LLMs starting to fail. Those are only headlines, clickbait, really. If you dig a bit deeper, you'll read that there are several schools of thought regarding how to efficiently scale the foundation models.
Starting point is 00:14:50 If the goal is AGI, then just adding compute power to pre-training may not be the best path to follow. Instead, researchers are exploring an alternative called inference scaling to achieve smarter AI. Inference, the process when an AI model generates outputs and answers, can be optimized by having models, quote-unquote, think through multiple possibilities before settling on a response. This approach enables complex reasoning during real-time use without increasing model size. OpenAI's recently launched o1 model is a good example. By enhancing inference, o1 can tackle tasks that demand layered decision-making, such as coding or problem-solving, in ways similar to human thought. Test-time compute techniques make this possible, allowing models to dedicate more
Starting point is 00:15:25 processing to challenging queries as needed. A move towards inference-focused distributed cloud-based servers instead of large centralized training clusters might create a more competitive chip landscape. While Nvidia is the go-to chipmaker for pre-training hardware, there are a bunch of chipmakers, AMD, et cetera, that make hardware suitable for this new method of inference scaling. The key takeaway is simple. LLMs are not failing, they are evolving. Sensationalist headlines aside, this is how product development works. Every's Dan Shipper also pointed out that the framing of all of this was very problematic, and that it didn't really reflect the conversations that he's having with people inside the labs. He writes,
Starting point is 00:15:59 "The message that this headline conveys is at odds with what people inside the big labs are actually feeling and saying. It's technically correct, but the takeaway for the casual reader, that AI progress is slow, is exactly the opposite of what I'm hearing." And indeed, ultimately, there was so much chatter to that effect that The Information actually ended up clarifying a little bit. Stephanie Palazzolo writes, "Seeing lots of discussion after our scaling laws article this weekend. To be clear, we're not saying the world is ending, just that researchers are having to find new ways to improve LLMs, i.e. test-time compute." That led to another article, "Goodbye, GPT. Hello, Reasoning 'o'." And this basically gets to the point that this isn't about a halt. It's about an
Starting point is 00:16:35 alternative direction and a new path that shows more promise than the previous one. Still, it's a super interesting conversation, one that we will continue to follow closely. For now, that is the story. Appreciate you listening, as always. And until next time, peace.
