Semiconductor Insiders - Podcast EP332: How AI Really Works – the Perspectives of Linley Gwennap

Episode Date: February 20, 2026

Daniel is joined by Linley Gwennap, technology analyst and author of the new book “How AI Really Works: The Models, Chips, and Companies Powering a Revolution.” Linley was the long-time editor of Microprocessor Report and chaired the popular Linley Processor Conferences. Dan explores what impact AI is having on the market and…

Transcript
Starting point is 00:00:07 Hello, my name is Daniel Nenni, founder of SemiWiki, the open forum for semiconductor professionals. Welcome to the Semiconductor Insiders podcast series. My guest today is Linley Gwennap, technology analyst and author of the new book, How AI Really Works: The Models, Chips, and Companies Powering a Revolution. Linley was the longtime editor of Microprocessor Report and chaired the popular Linley conferences. Welcome to the podcast, Linley. Great. Thanks for having me, Daniel. I mean, we've known each other for some time,
Starting point is 00:00:39 Linley, but can you tell the audience, how did you get your start in the semiconductor industry? Sure. I started at Hewlett-Packard. I was doing PA-RISC system design back when PA-RISC was a thing. And I eventually moved into chip design and later into marketing. And I was doing competitive analysis as part of my marketing job, and we used Microprocessor Report as a source.
Starting point is 00:01:07 And one day I saw an ad there that said they were looking for new analysts to help write articles. And I thought I'd give it a try. And if it didn't work out, I could always go back to chip design. But I ended up spending 30 years as an analyst and writing about the semiconductor industry. And then just a few years ago, I sold Microprocessor Report to TechInsights and retired. And, you know, except for this one project that I've been working on. Yeah, I definitely missed the processor conferences. I used to really enjoy those.
Starting point is 00:01:41 So let me ask, Linley, what led up to you writing this book? Well, living in Silicon Valley, I can't go to the park or the store without hearing people talk about AI. And most of what I hear is wrong. I mean, some people think AI is going to solve all the world's problems by next Christmas. And other people think that it's going to cause some sort of Terminator-style extinction event. And these people are confused, you know, because they're listening to self-promoting AI leaders. The CEOs of companies like OpenAI and Nvidia, I mean, make billions of dollars by convincing investors that AI is wonderful and superintelligence is coming soon.
Starting point is 00:02:25 And on the doomsday side, there are these companies that are selling services to protect against rogue AIs. And of course, there are just trolls that want to generate social media hits. So I decided to write a book that would explain how AI works in a way that doesn't require a technical degree to understand. And then building on this foundation, I explain why AI isn't nirvana or a dystopia, but rather a fundamental technology shift, similar to the microprocessor or the Internet or the smartphone. My goal is for readers to be able to use this knowledge to better understand and evaluate what they hear about AI. I wrote this book for a general audience,
Starting point is 00:03:07 but I was surprised to find that my technical colleagues also found it useful. The book covers a wide range of AI implementations, including chatbots, image generators, recommenders, and self-driving cars, as well as the chips and companies that support them. And people in the industry typically only work on one piece of this puzzle, so they can get a broader view from the book. You know, one of the myths you discuss in your book is that AI model sizes grow exponentially.
Starting point is 00:03:41 Why do people keep saying that? You know, is that still the case? I mean, what is the reality? Yeah, well, a few years ago, I would show a slide at my conference every six months with a graph of the largest AI models based on parameter count. And starting from GPT-1, language models grew at a phenomenal pace, like 40x per year. And everybody in the industry thought that that was happening. And of course, they thought it would continue forever. But more recently, model growth has hit the brakes. And,
Starting point is 00:04:14 you know, some people, of course, have seen that happening, but a lot of people are still stuck on this idea that models have kept growing. But the reality is that after GPT-4 appeared at 1.7 trillion parameters, only Grok 3 has been larger. And actually, most companies are now reducing model size. Grok 4, for example, is smaller than Grok 3. And other recent models are less than a trillion parameters. So during this period, the size of the largest language model has been growing at only about 10% per year. That's 10%, not 10x like we were seeing before. Now, there are two things that have caused model growth to stall. The first is compute cost. The computation required for training a model increases with roughly the square of the parameter count. So it grows quadratically.
Starting point is 00:05:13 Going from two trillion parameters to 10 trillion, for example, would require about 25 times more compute. Now, OpenAI spent about a hundred million dollars to train GPT-4. So do the math. I mean, larger models would quickly push training costs to billions of dollars, which is simply unaffordable, even with these mega data centers. Now, inference compute grows linearly with model size, which isn't as bad, but larger models are still more expensive to deploy. And OpenAI and similar companies are now shifting their focus from developing new capabilities to building their customer base.
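The scaling arithmetic Linley walks through here can be sketched as a quick back-of-the-envelope calculation. This is only an illustration of the rough rules of thumb he cites (training compute ~ parameters squared, inference compute ~ parameters, GPT-4 at roughly 1.7 trillion parameters and $100 million to train); the exact figures and the helper function names are assumptions, not anything from the book.

```python
# Back-of-the-envelope model-scaling arithmetic (illustrative only).
# Rules of thumb from the discussion: training compute grows roughly with
# the SQUARE of the parameter count; inference compute grows linearly.

def relative_training_compute(params_new: float, params_old: float) -> float:
    """Approximate training-cost multiplier when scaling parameter count."""
    return (params_new / params_old) ** 2

def relative_inference_compute(params_new: float, params_old: float) -> float:
    """Approximate inference-cost multiplier when scaling parameter count."""
    return params_new / params_old

# Going from 2 trillion to 10 trillion parameters:
print(relative_training_compute(10e12, 2e12))   # 25.0 -- about 25x more training compute
print(relative_inference_compute(10e12, 2e12))  # 5.0  -- only 5x more inference compute

# If GPT-4 (~1.7T parameters) cost roughly $100M to train, a hypothetical
# 10T-parameter model would cost on the order of:
gpt4_cost = 100e6  # assumed $100M baseline from the conversation
estimate = gpt4_cost * relative_training_compute(10e12, 1.7e12)
print(f"~${estimate / 1e9:.1f}B")  # billions of dollars under this crude rule
```

The quadratic-versus-linear split is the point: even a modest parameter increase multiplies training cost far faster than deployment cost, which is why the conversation turns to profitability next.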
Starting point is 00:05:58 And in this new regime, smaller models are more profitable. There's no real business case for an AI model with five or 10 trillion parameters. I mean, unless it has significant new capabilities that people would pay a lot more to use, instead of $20 per month, maybe $100 or $200. And so far, we just haven't seen anything worth that much to most users. Now the other major problem that's holding model size back is that bigger models, need more training data. So for GPT4, OpenAI use essentially every public web page you could find.
Starting point is 00:06:37 And after scraping the entire web, I mean, where are you going to get more content? I mean, sure, new web pages constantly appear, but much of this content is AI generated. And training models on AI slop has been demonstrated to degrade their performance. So it's hard to see how we can get enough clean training data, you know, for a 5x or 10x increase in model size. Yeah, interesting. So let's talk about semiconductors. Nvidia dominates the market for AI chips. We all know that.
Starting point is 00:07:10 Why has it been so difficult for other companies to compete? Well, there was a wave of well-funded chip startups like Cerebras and Groq, SambaNova and Tenstorrent, Graphcore. They all thought they could design a better chip than Nvidia, but by the time they got a product to market, Nvidia had already moved on to the next generation. I mean, these startups just never really caught up with Nvidia's cadence of improvement.
Starting point is 00:07:36 And another huge challenge is software. Now, AI developers typically use open frameworks like PyTorch and TensorFlow, which means they're not locked into Nvidia in the same way that customers are locked into x86, for example. But in order to allow these customers to switch from Nvidia to a new chip, these startups would have to build a complete set of functions to support that framework. And supporting one or even multiple frameworks took years of software development.
Starting point is 00:08:12 And even after building the software, they realized that it can't just be functional. It has to be performance-tuned as well in order to keep up with Nvidia's performance. So despite burning through hundreds of millions of dollars, these startups never really delivered a broad and reproducible performance advantage over Nvidia. They're now defunct or acquired or in some cases redirected to niche markets. Now, big companies have tried to take on Nvidia as well. Intel spent billions of dollars to acquire two different AI chip startups, but neither one could deliver a competitive product. And now, you know, Intel is left without a viable AI strategy. Now, AMD is the only company that's really gained noticeable share in this market,
Starting point is 00:09:04 reaching about 8% last year. But even with its considerable resources, AMD's software stack is not as good as Nvidia's, and that's keeping it from gaining even more market share. Right. So what do you see as the primary threats to Nvidia's dominance? I mean, are we talking about domain-specific chips? Yeah, yeah. I think it's really the chips that Nvidia's own customers are developing. Now, Google was the first
Starting point is 00:09:34 company to develop its own AI chips, and now it's using its TPUs for nearly all of its internal workloads and has basically eliminated Nvidia from its data centers. And Google has an advantage in that Nvidia has to support a broad range of customers and workloads, but Google only has a small number of core applications. So it can build software solely for those tasks and simplify its development efforts. And although both chip design and software development are expensive, Google can amortize those costs across its tremendous volume, and then they save money by not paying
Starting point is 00:10:16 Nvidia's steep margins. And other large cloud companies have picked up on this. Amazon has developed several generations of its Trainium chips, which it rents as part of its AWS cloud. But the company continues to run most of its internal workloads on Nvidia GPUs. And then Meta and Microsoft have also designed their own AI chips, which they should be deploying later this year. And these new products will further eat into Nvidia's sales, but the potential damage depends on whether Meta and
Starting point is 00:10:55 Microsoft use a broad approach like Google or stick to a more limited deployment like Amazon. And Chinese companies like Huawei and Alibaba and Baidu, they've also developed their own AI chips. Now, they suffer from the same software limitations as these other companies, and their chips are also limited by China's lagging semiconductor capabilities. These companies, well, they continue to buy as many Nvidia GPUs as they can, but they have these homegrown AI chips in case the U.S. decides to tighten its export restrictions even further. Yeah, well, it certainly is an exciting time to be in the semiconductor industry, that's for sure.
Starting point is 00:11:38 So, Linley, final question: how do you see the AI rollout developing? How will it compare to previous technology shifts that you and I have seen over the last, you know, 30, 40 years? Yeah, I think the main thing people don't understand is that fundamental technologies like we've seen in the past, I mean, they take decades to really achieve their full potential. You know, when they first appear, people apply them to existing services and functions. For example, the Internet turned postal mail into email, and the web changed bulletin boards into web pages. The smartphone delivered text messages instead of email.
Starting point is 00:12:19 And AI is in this phase today, and companies are using it to improve productivity on common tasks, like research, code generation, and customer service. And in this phase, the new technology threatens jobs, since better productivity means fewer people are needed for the same set of tasks. But on the other hand, greater productivity also expands the economy, and it reduces prices, which helps everyone. And then over time, new technologies can create completely new applications. And that's the important point that I think a lot of people miss.
Starting point is 00:13:04 I mean, the web enabled social media and streaming services, which didn't exist before. The smartphone created opportunities for new applications like Uber and DoorDash. So we don't really know what new things AI will create, but I mean, these new businesses could be even bigger than what these earlier inventions created. And so I would expect that, as in these previous technology transitions, AI should create more jobs than it destroys, and, you know, it will probably, you know, deliver tremendous economic and technical benefits. It's just not going to happen by next Christmas. Right. Yeah, I think we're at the tip of the iceberg. You know, the bottom line for me is that just like the previous technology disruptions or technology trends,
Starting point is 00:13:58 it's going to make the world smarter. You know, the people in the world are going to be smarter, you know, just like the Internet did and everything else, and AI is just going to, you know, raise the intelligence of the world. Yeah, I would agree. And, you know, in raising the intelligence, as you say, I think, you know, that will enable people to not just do the same things they're doing today, but, you know, create these new kinds of ideas and new kinds of businesses that don't even exist. And that's going to take time. But, you know, once we get there, the benefits will be tremendous.
Starting point is 00:14:36 Yeah. So where can we get your book, Linley? It's available on Amazon. So if you just Google my name, or not Google, but if you search for my name on Amazon, you should be able to find the book. Great. Hey, thanks for your time, Linley, and I look forward to seeing you again at a conference. Yeah, I hope so, Daniel. Thanks for having me on your podcast. That concludes our podcast. Thank you all for listening and have a great day.
