Orchestrate all the Things - The future of AI chips: Leaders, dark horses and rising stars. Featuring Tony Pialis, Alphawave CEO & Co-founder

Episode Date: February 13, 2024

There’s more to AI chips than NVIDIA: AMD, Intel, chiplets, upstarts, analog AI, optical computing, and AI chips designed by AI. The interest and investment in AI is skyrocketing, and generative AI is fueling it. Over one-third of CxOs have reportedly already embraced GenAI in their operations, with nearly half preparing to invest in it. What’s powering it all - AI chips - used to receive less attention. Up to the moment OpenAI’s Sam Altman claimed he wants to raise up to $7 trillion for a “wildly-ambitious” tech project to boost the world’s chip capacity. Geopolitics and sensationalism aside, however, keeping an eye on AI chips means being aware of today’s blockers and tomorrow’s opportunities. According to a recent study by IMARC, the global AI chip market is expected to reach $89.6 billion by 2029. The demand for AI chips has increased substantially over time. Growth in AI technology, rising demand for AI chips in consumer electronics and AI chip innovation all contribute to this forecast. Few people have more insights to share on AI hardware than Tony Pialis, CEO and co-founder of Alphawave. In an extensive conversation, Pialis shared his insider’s perspective on the AI chip landscape, the transformative rise of chiplets, specialized hardware for training and inference, emerging directions like analog and optical computing, and much more. Article published on Orchestrate All the Things: https://linkeddataorchestration.com/2024/02/13/the-future-of-ai-chips-leaders-dark-horses-and-rising-stars/

Transcript
Starting point is 00:00:00 Welcome to Orchestrate all the Things. I'm George Anadiotis and we'll be connecting the dots together. Stories about technology, data, AI and media, and how they flow into each other, shaping our lives. The interest and investment in AI is skyrocketing, and generative AI is fueling it. Over one-third of CxOs have reportedly already embraced generative AI in their operations, with nearly half preparing to invest in it. But what's powering it all, AI hardware, doesn't get all the attention it deserves. Keeping an eye on AI chips
Starting point is 00:00:34 means being aware of today's blockers and tomorrow's opportunities. According to a recent study by IMARC, the global AI chip market is expected to reach $89.6 billion by 2029. The demand for AI chips has increased substantially over time. Growth in AI technology, rising demand for AI chips in consumer electronics and AI chip innovation all contribute to this forecast. But there's more to AI chips than NVIDIA. For example, there's AMD, Intel, chiplets, upstarts, analog AI, optical computing and AI chips
Starting point is 00:01:10 designed by AI. Few people have more insights to share on AI hardware than Tony Pialis, CEO and co-founder of Alphawave. In an extensive conversation, Pialis shared his insider's perspective on the AI chip landscape,
Starting point is 00:01:26 the transformative rise of chiplets, specialized hardware for training and inference, emerging directions like analog and optical computing, and much more. I hope you will enjoy this. If you like my work and orchestrate all the things, you can subscribe to my podcast available on all major platforms. My self-published newsletter also syndicated on Substack, HackerNin, Medium, and Dzone, or follow Gesturate All the Things on your social media of choice. So I'm Tony Pialis, co-founder, funder, and CEO of AlphaWave. I am a semiconductor serial entrepreneur. AlphaWave is the third business that I co-founded, funded, and scaled. The two previous businesses, both Snowbush as well as
Starting point is 00:02:12 V-Semi, had successful exits. One was acquired by Semtech, is now part of Cadence. The other was acquired by Intel. We founded AlphaWave in 2017 with the ambition of becoming the next great semiconductor company of the industry. That's been our ambition throughout. We took the company public in 2021 for a market cap of $4.5 billion. And our goal is to continue to scale Alphawave as a vertically integrated semiconductor company, delivering connectivity solutions to the industry and particularly to the AI sector. Great. Thanks for the intro. And I have to admit that I wasn't familiar with AlphaWave myself. And I guess that many people won't also be familiar with the concept of vertically integrated circuit. And so I quickly went through your main products. And so as they're listed,
Starting point is 00:03:15 it's silicon IP, chiplets, custom silicon and connectivity products. And I should also add for context that, well, the topic, let's say, of today's conversation is, well, the broad AI landscape and how you see developing, and specifically with respect to hardware and AI chips. So I was wondering if you could share a few more words about your product line with Algebra and how that's relevant in the AI chip conversation. Sure. Look, the major players in terms of scaling out data centers and compute these days is no longer the Cisco's of the world. It's transitioned to hyperscalers, right? So hyperscalers are the Googles, the Microsofts, the Amazons, the Metas. And so these hyperscalers have both internal design capability, but they also build
Starting point is 00:04:07 their own server, they build their own networks, they build their own data centers, they build their own campuses. And so they need multiple different forms of solutions. So if they're building their internal AI chip, the same way Amazon does, the same way Microsoft does, the same way Google does, these days, as does, the same way Google does. These days, as they're building it on three nanometer and two nanometer, they'll need IP and they'll need chiclets. So that's that portion of our offering. But look, these hyperscalers also have more silicon needs than they can build themselves. And that's where a custom silicon offering comes in. Right. So they have a they have a design they'd like to implement. Let's say an AI switch for top of the rack in order for training.
Starting point is 00:04:52 Their design teams could be focused on the AI compute. They don't have the ability to design the AI switch. They don't have the connectivity that can connect that switch to the rest of the AI cluster. So that's where they work with someone like us. Finally, look, how do you connect servers? How do you connect racks? How do you connect data centers? That's where connectivity also comes in. So whether it's in the form of driving electrical cable, whether it's in the form of driving optical cables, you need connectivity silicon to do that. Things that go into optics, things that go into cables. And that is also part of our product line. And so there's been a lot of talk about AI these days, right? And different
Starting point is 00:05:41 forms of compute and NVIDIA, how it's transforming the world, and it is. But George, I'll tell you, having spoken to everyone in the industry, the major challenge facing AI moving forward is not the compute. We have the ability to design and implement the compute on shiplets. It's the connectivity to connect the compute in order to process all of the data. And that's the particular problem that we're solving. Okay. Okay. That's an interesting perspective. And I have to admit, again, it's one that I wasn't personally familiar with because usually the conversation revolves around models and their capabilities and by extension what they require in terms of compute
Starting point is 00:06:27 and also data but that's a different conversation not really relevant in what we're here to talk about today so let's let's start approaching let's say your your point of view by entering by how do you see the ai landscape formatting today? So obviously in the last year, there's been an explosion in terms of interest in this. So even organizations that were not previously on board, let's say, somehow got into the hype, let's call it. And so I was looking at a statistic recently that the number of mentions to AI and generative AI has skyrocketed in just a few months. So whereas last year it was only a very small segment of earning calls that on which executives would mention AI, that has gone through the roof basically. So clearly it means that there's more demand for this technology. So I guess the first question to address would be, okay, so what do you think are the organizations that are really in need of developing their own in-house AI models versus
Starting point is 00:07:40 getting something that's off the shelf from a third-party provider. Got it. So let me just hit the first item that you raised, and then we'll talk about models and who should be building their own versus leveraging third-party. First half of this year, you're right. The discussion about generative AI and AI exploded. Clearly, ChatGPT was introduced, I think it was 2022.
Starting point is 00:08:12 It took a while to take off, but once it took off and people started testing it, they were amazed. And I've seen a fury like the introduction of the cell phone, right? It just completely, or the the smartphone completely changed our lives. But I'd say from a business perspective, from a design start perspective, first half of the year. Yeah, nothing meaningful happened. We're seeing such a rapid pace of increase in terms of new design starts for AI, whether it's on the training side, whether it's on the inference side, whether it's on the edge, whether it's in the core. And different countries are investing into developing their own AI strategy. So you see the same strategies being replicated in the U.S., Canada, U.K., France, Germany, right, Korea, Taiwan and Japan. So everyone is is getting onto this AI trend and investing onto itself and not trying to be dependent solely on NVIDIA. So that has spurred off a tremendous
Starting point is 00:09:29 level of investment in our industry. We're seeing it in our pipelines. We're seeing it in our design starts. We're seeing it in terms of deals that we're booking. And so it's tremendous. For us, the custom silicon and in the silicon space servicing this sector, it's been a fantastic opportunity. Now, in terms of models and who should be developing their own models versus leveraging third-party models, yeah, look, training to a new set of models is expensive right now. I don't know if you know the going price of a leading edge NVIDIA GPU card can range anywhere between twenty five and seventy five thousand for a single card. OK, it's ridiculous. And you need thousands of these in order to be able to train any reasonable size model so yeah it you know the barrier of entry in terms of training your own model is extremely high right now extremely high so i
Starting point is 00:10:32 think that limits it to a select few that can justify the cost because i mean look at the time span between chat gpt 3.5 and 4 i think was, George, I think it was on the order of six months. Okay. I think so. So who can afford to do this each and every time? There was a statistic in terms of what the cost of a ChatGPT training session was. Okay. And I think it was on the order of tens of millions per run.
Starting point is 00:11:03 And guess what? Not every run passes. There was an interesting photo when I visited OpenAI. It was a bunch of engineers in front of servers on their knees praying like this. You know why? Because they didn't want it to crash. By the way, guess what caused it to crash? The connectivity, not the compute.
Starting point is 00:11:22 Just there'd be a failure on the I.O. So, yeah, the barrier is extremely high. I think right now it's very, very selective in terms of who can develop their own models. I believe over time, AI will inevitably progress to be something like a utility. OK, where hyperscalers will provide access to all of the compute power. So different organizations, you know, like electricity, like nuclear power. Okay. I think it will be a reasonably priced event. And, you know, developers, anyone can come on and use this utility to train their own
Starting point is 00:12:02 models, develop their own models, optimize their own models, deploy their own models. I think that is probably the end game here. There's going to be a lot of profit taking between now and then. And who knows how long it'll take to eventually reach that state, right? Is it five years? Is it 10 years? Is it 20 years? We'll see. We'll see. But I think that's inevitably the end game. It becomes like a utility, electricity, natural gas, water, and AI or access to compute. Well, I would argue that access to compute is already at least to some extent a kind of commonplace through the hyperscalers that you also mentioned earlier. But you're right when you said that, well, these NVIDIA specifically, GPUs are priced in such a way that it makes it quite a barrier to enter.
Starting point is 00:12:58 But I would also add to that that it's not just a price that you have to pay. It's also the know-how that not all organizations actually have not only in terms of how to leverage the hardware but also further up the stack so even for simple things so how to to manage their data or how to train their models and so on so there's lots's lots of things to consider there. And look, as CEO, one of the things we're trying to address as we incorporate AI more and more into our business is how do we secure it, right? How do we protect our data, our models from getting out into the industry? Yeah, so it's yet another layer of complexity. So in that line of thought then, and I agree that eventually, and I think actually even at this point, there are some open source models out there that people can tweak to their needs.
Starting point is 00:14:05 That doesn't necessarily mean that they're going to be appropriate for what they want to do. So if you have something like an open source large language model, let's say, that doesn't necessarily mean that it's going to be something that you can base, I don't know, some image recognition application on. Not necessarily. They're two different things. But well, eventually, I also see that, well, you're going to have models that will be sufficiently capable in pretty much all domains, let's say.
Starting point is 00:14:34 So, however, you know, you have a similar situation today. So you have like open source libraries, let's say, for all kinds of things. However, the actual differentiation and the business advantage comes from taking these για όλα τα πράγματα, όμως η πραγματική διαφορά και το πρόβλημα του εργασίου βαθμίζεται από την παρουσίαση αυτών των συμφωνιών και το δημιουργήσω κάτι που είναι χρήσιμο και διευθυντικό για την εταιρία σας. Σε αυτή την σκέψη, εάν έχουμε καταφέρει ότι υπάρχει ένα συγκεκριμένο σύστημα οργανωτικών που θα διευθυνθούν να διευθυνθούν να διευθυνθούν να διευθυνθούν a specific sub-segment of organizations that are going to be training their own models, then probably it's only a segment within that segment that is going to need their own custom AI chips, right?
Starting point is 00:15:17 So I'll give an example. Customer support's a perfect one, right? If we were to record every customer support conversation, what the questions were, what the responses were, right? You could build a significant data set that could be trained on to help generative AI become your new frontline customer support, right? So there's different parts of the business that if we stored, mined our data, yeah, definitely AI could be extremely useful. You know, one of the hyperscalers told me their development of clean code has improved tenfold. Why is that? Because two years ago, they initiated a task force that went off and took every piece of code that was submitted for review and test, and they mined ited it okay and they flagged all of the
Starting point is 00:16:06 bugs that were identified and as they built up this massive data set of what clean code looks like and what bug filled code looks like now this code is is parsed by an AI before it's submitted for testing. And apparently that has reduced the number of bugs by more than 10x. So, you know, yet another aspect, our industry that can immediately benefit from AI. You know, I'm sure it's funny. I had someone come in and present to me and it was a phenomenal, phenomenal slide deck.
Starting point is 00:16:46 You know, the messaging was crisp on the slide. It wasn't verbose. I complimented the individual and he said, thanks. I used an AI to generate it. Right. So yet another aspect. I'm responding to emails now. I use Grammarly on my phone. George, I don't know if you use the app at all anyways they've just incorporated i'm not worried of it but no i don't use it just incorporated ai into it so now an ai can go through and part you can just put some bullet points and then ai will draft your text so these days you're not even sure who's responding to you is it a human is it it can even it has a button to do a predictive response.
Starting point is 00:17:25 And then you can go in and refine it. So, yeah, it's, I think, you know, we can see where some of the intercepts are. But I'm sure there's a multitude of uses that we haven't envisioned that will inevitably come into play. And well, just to broaden the scope a little bit there. So far, I think what we've talked about mainly applies to applications running in the data center. However, there's also the edge basically. So environments in which you don't necessarily have much compute power or much power period for that matter. And you have like connectivity issues and all that. So there's this sort of parallel, let's say, developments running there.
Starting point is 00:18:17 But however, we've also seen miniaturized models that are also capable of running on these very restricted environments. And so there's progress made on that front as well. Yeah, no, I think a major part of the deployment of AI will inevitably be on the edge. Mobile devices, you know, portable forms of compute. And you're right, energy efficiency is going to be extremely, extremely important, right? Maintaining battery savings. So here I'm seeing in the industry, George, different types of AI, right?
Starting point is 00:18:55 Look, what we're seeing in the data center right now is massive forms of compute, you know, hugely parallelized, you know, lots of arithmetic units. What I'm seeing on the edge is a completely different vector. There's these new forms of analog AI, which are trying to use very low power analog approaches to implement compute, compared to the digital floating point implementations, right? You know, digital is great. It leverages leading edge technology very, very well, right? We're integrating hundreds of millions of transistors, scaling to billions of transistors within the decade. But guess what? Leading edge technology is leaky. It consumes power. And so if you can use older technology at very low power profile with sufficient compute to run these efficient models,
Starting point is 00:19:54 they become a very interesting solution for the edge. And so it'll be interesting to see what ends up winning in the edge. Is it just diminished forms of the same types of compute used in the cloud? Or is it a different form of compute that is tailor-made specifically for operating off of battery power? Right. You did mention analog AI, and that's something I wanted to ask you about as well. But let's revisit that in a while, as well as tiplets, by the way, Αυτό ήθελα να σας ρωτήσω επίσης. Αλλά ας το επισκεφθούμε για λίγο, όπως και τις τσίπλες, γιατί δεν είμαι σίγουρος ότι όλοι οι άνθρωποι που θα ακούσουν
Starting point is 00:20:30 τη συζήτηση δεν θα είναι αρκετά γνωστοί με τις τσίπλες και τι είναι και πώς δουλεύουν. Θα επισκεφθούμε αυτά τα θέματα για λίγο, αλλά πριν το κάνουμε, ήθελα να πάρω ένα σημείο για να αρχίσω να αναβλέπω γρήγορα το περιβάλλον σε σχέση με τα επιλογές που υπάρχουν σήμερα. I wanted to just take a moment to, let's say, quickly review the landscape here in terms of what are the options today. So you did mention NVIDIA previously. And well, for good reason, it's the obvious king, let's say, of this domain.
Starting point is 00:21:00 So, you know, with huge valuation. And by the way, it's not just valuation. Yeah, incredible. Incredible. It's also the fact that, you know, they own a huge percentage of the market at this point. They have a whole ecosystem, the tech stack and software and everything. However, it's not the only option. δεν είναι η μόνη επιλογή και μεγαλύτερα οργανισμούς και ακόμα χώρες γίνονται ξεκάθαροι στον πειθασμό ότι, τώρα, είναι πολύ εξαρτάταινοι στην Nvidia για τις επιλογές της AI και αυτό δεν είναι αρκετά καλή
Starting point is 00:21:34 κατάσταση να υπάρχει. Λοιπόν, προσπαθούν να βλέπουν αλλατερικές επιλογές και, τώρα, να μην βάλουν όλα τα αυτοκίνητα σε ένα μάσκαλο, ας πούμε. Λοιπόν, υπάρχουν δύο άλλες επιλογές εκεί. Υπάρχουν άλλοι will not put all their eggs in one basket, let's say. So there are a few other options out there. There are other vendors providing GPUs, so AMD, there's ARM, there's Intel. How do you evaluate those? Look, I think clearly AMD is a number two, right? These days when people talk about GPU. So it was NVIDIA first, AMD second.
Starting point is 00:22:09 So Lisa's done a remarkable job transforming and turning around AMD over the last decade plus. Remarkable job. Yeah, and you know, for AMD to eclipse Intel in terms of market cap, That was breathtaking to see. But look, I wouldn't count Intel out. So Intel acquired an Israeli company called Havana, I think about five years ago.
Starting point is 00:22:33 I personally know the CEO, David DeHaan. There's no one more operationally focused and capable of successful execution than David. He is a remarkable force. And he is spearheading Intel's initiatives. And I've seen data from the Intel's recent Gaudi announcement that shows it eclipsing NVIDIA's performance. And Gaudi is all David's chip. So with David there, yes, I would definitely not count Intel out. I would call them a dark horse in terms of coming in. And he certainly has the capability of surpassing even NVIDIA in terms of building leading edge performance.
Starting point is 00:23:17 So long as Intel culture does not block him. And then beyond that, you have a bunch of smaller players, right? Startups, Grok, you know, being a big one, TenStore being yet another big one, you know, different regions have their own. Europe has their own. There's a French company, Cyperl. There's a Korean company called Sapion. You know, like you said, it's like a utility where every nation wants to own its own electricity grid, its own water deployment. They want to own their own AI and not be dependent, let's say, how we are on the Middle East for our oil, where the whole world is dependent on a single region to supply it. So yeah, it's quite a distribution. And I haven't even talked about
Starting point is 00:24:06 the hyperscalers where you have all this investment into AI, but everyone wants to sell into the same four companies in the West and maybe the same three companies in China. But these companies are building their own, right? I think everyone but Meta is building their own these days with Microsoft having come out and formally made the announcement, they're building their own these days. With Microsoft having come out and formally made the announcement, they're building their own. So there's a lot of investment in this space. It'll be interesting to see in the long term who wins and let's say who gets acquired or merged together.
Starting point is 00:24:42 Actually, I think Meta also has plans, even more than plans. They've announced that they're building their own. I think the difference there as compared to the hyperscalers that you mentioned is that well the chips that Meta is building are not going to be available to anyone but Meta as opposed to other hyperscalers who are going to make them available out for rent to anyone who's interested in those. Okay. Yeah, look, I've heard of, again, informally and through just general public discussions in the industry, you know, Meta's silicon design team, nowhere near the size nor capability, I think, of the other three North American hyperscalers. And so there's been rumors over the years that they're going to acquire one of the smaller startups to jumpstart it.
Starting point is 00:25:33 Yeah, it's not clear. I think clearly, you know, where Microsoft was probably much further behind Google and Amazon. OpenAI and with the recent slate of announcements, they've run to the front. It's remarkable. I'd say Amazon, Google were neck and neck in terms of lead hyperscalers on the silicon development. I think Microsoft has an ability to catch up, if not even surpass them,
Starting point is 00:26:04 because they have the whole open AI momentum behind them, the user base behind them. And the transformation of that company is incredible, right? From a sleepy operating system company that we used to have to reboot our computers twice a day in order to continue work to now being, you know, the AI leader in the world. What a transformation. Right. Well, arguably one mode some people claim that NVIDIA has at the moment is not necessarily their hardware, even though, as you said, it's definitely leading at the moment. Well, you know, there's always some of the latest MLPerf results, and sometimes they may fall behind. You did mention, actually, in the latest one,
Starting point is 00:26:52 there were in some categories that Havana chips actually showed relatively better performance. However, from a vendor, let's say, point of view, from an end-user point of view, I would argue that maybe what really sets NVIDIA apart is the software stack that they have developed over the years, which makes it easier to use their hardware. However, some people have also argued that, well, maybe that mode is not going to last that long because of the fact that now it's actually possible for other vendors to catch up through other technology stacks, most notably PyTorch 2.0, which gives a way to other AI-chip vendors to develop their own stack.
Starting point is 00:27:41 Are you familiar with that at all? Do you have an opinion on that? Yeah, look, I think on the training side, because I work with everyone in the industry, spanning from hyperscalers to data center companies to software companies. On the training side, NVIDIA does have the industry remarkably cornered, right? There's been so much investment into developing code on NVIDIA's training platform. I think it's going to be very, very, very hard for others to come in and to displace them. So there's a whole category of AI companies
Starting point is 00:28:20 that are trying to use NVIDIA software on their hardware. And so that way, code written for NVIDIA hardware can run on theirs. Again, building those compilers, also remarkably difficult, right? When it's not an open standard, you're trying to reconstruct it. Yeah, that is the challenge. I'll tell you, the general rule of thumb is for every one hardware engineer, you need two to three software engineers if you're an AI hardware company. So even though there's a lot of talk about NVIDIA and hardware, the vast majority of the investment actually goes into the software. Inference, I think, is a different game though, right? There's open standards for inference software development. There's a couple of competing standards. There's no clear winner. It's going to be an interesting couple of years to see how that plays out. NVIDIA is not in a completely dominant seat there. There's multiple solutions with multiple vendors jockeying to establish a leadership role.
Starting point is 00:29:32 So I'm seeing a lot more opportunity for innovation and new entries into the market on the inference side. All right, that's interesting. So can you maybe envision a near-term future in which NVIDIA is still the leader in terms of training, but people have many different options in terms of inference. So where do they deploy their models? Look, I'll give an example. I know of three companies off the top of my head that were founded to develop training solutions, and they all pivoted to inference. Okay. So, yeah, some of them are trying to sell into hyperscalers, some into private data center. Yeah.
Starting point is 00:30:19 So there's just more room for innovation right now. There's less of an established infrastructure, as is on the training side. So there's just more room for innovation right now. There's less of an established infrastructure as is on the training side. That's also interesting from a user point of view, because if obviously, you know, depending on the type of model and the size of the model that you're interested in training, as you also mentioned, that can cost quite a significant amount. However, if that model is not going to be retrained very often, then actually, if you look at the total cost of ownership, the major part of that is actually inference, the operational lifecycle of the model. So I guess that's part of the reason why you see what you're seeing.
Starting point is 00:31:08 Yeah, and I think at the end of the day, there's more chips that will inevitably be sold on the inference side than the training side. So I'm sure business plans also improve with that increase in volume. Because you're right, there's so much focus on training and training to large models and the cost of training to large models. But the actual deployment where we all benefit from is on the inference side. And that needs significant scale. And that needs a different solution than on the training side. Okay, cool. So let's come back to chiplets, actually, which you did mention earlier. Ωραία, τέλος. Ας πάμε πίσω στις τσίπλες, που αναφέρατε πριν.
Starting point is 00:31:46 I think most people, myself included, became aware of chiplets somewhat in parallel with some non-technical developments. And to be more specific, recently there were some sanctions imposed on exporting certain AI chips to China, and therefore, as a sort of countermeasure, let's say, to overcome these sanctions, there was an emphasis on chiplets. So chiplets, and I'm going to explain this in a very simplistic way, are basically a way to combine, let's say, more chips together so that you get more processing power. That, at least, is what some organizations in China have done in order to counter the lack of access to AI chips such as NVIDIA's.
Starting point is 00:32:47 However, obviously, I guess that this is a technology that didn't just magically appear at that point. So since you're obviously more familiar with it than I am, I was wondering if you could provide like a quick overview, let's say, of Ziplet technology and why it's important actually besides the context adjustment. So the whole reason microelectronics has continued to drive technology is that it's successfully been able to integrate more and more functionality on a rapid cadence every year to two years and deliver more cost benefit because the more
Starting point is 00:33:26 performance that you can implement on a single integrated circuit the less integrated circuits you need the lower the cost right it's why i remember my first computer was a a radio shack coco 16 i think it cost us $6,000. Now I just bought a Chromebook for one of my children for $300. And the difference in compute level is astronomical. So that's all driven by the benefits of microelectronics. So the problem is God. God got in the way. And unfortunately, now, when we build transistors, which are the basic building blocks of any integrated circuit, we're stacking atoms. All right. And so when you're stacking atoms, the laws of probability, the laws of averaging fall apart because now it's two atoms rather than hundreds of atoms and electrons.
Starting point is 00:34:28 And so what you get is defects. And one way to think about it is imagine you had a piece of fabric in your clothing manufacturer. If you could only manufacture or cut,'s say one to two patterns per panel of fabric and every every so often panels of fabric have little defects and so that means for what you know you're throwing out a whole panel if there's one tiny little pull and thread on one of the pieces. You have to throw out the entire panel. Now compare that to if you could cut a hundred pieces on a single panel of fabric. They're smaller, but now if there's a defect in one of the piece, you end up with 99 pieces that you can still use. Okay. And so if every other panel has a little defect, look at the difference. Now you're throwing one out of 200 pieces out versus my first scenario.
Starting point is 00:35:34 If you're cutting two pieces per panel, you're throwing one out of four. And guess who absorbs the cost of throwing it out? It's not the manufacturer. It's not the manufacturer, it's us. The cost of throwing out the panel gets forced down the consumer. So the more that's thrown out, the more we have to pay for the useful parts. And so as we encounter these defects, because we're stacking atoms at two nanometer and one nanometer, the industry has realized the only way to continue to add more and more functionality without throwing out more and more is to build things smaller
Starting point is 00:36:09 because those defects will exist, but there's less wastage when you throw them out. And now how do we increase capacity with smaller computers? Well, you connect these chiplets and you allow them to talk. And so silicon is no longer the foundation of leading edge semiconductors. The crazy part is, is it's not packaging. All right. The package is now the foundation. It's the base. Silicon is now something that is a component on this packaging. And so when we've, you know, the last
Starting point is 00:36:47 couple of years, there's been a lot of talk about semiconductor supply chains. There's a lot of silicon capacity right now, but guess where there's still, you know, very little to no capacity. It's on the packaging and specifically for these 3D types of designs that are built using chiplets. So, look, for me, having been in this industry 25 years, this is probably the biggest transformation that I've seen throughout my career. And I've seen a lot, right? I've seen metal go from aluminum to copper. I've seen changes in how transistors are built, you know, from planar to FinFET to now gate all around.
Starting point is 00:37:28 You know, there's been a lot of changes. Our industry is very, very dynamic and is always able to figure out solutions to physical problems. But this move to chiplets is a complete game changer. And so a select few like Apple, AMD, Intel have figured out how to do chiplets. You know, they're really the industry leaders at the forefront. But I'd say the vast majority of the industry, 80 plus percent of companies still haven't gotten there yet. But they will inevitably have to get there. If they focus on leading edge, they will have to get there okay if they focus on leading edge they will have to get there and and that's where companies like us foundries packaging companies are all trying to
Starting point is 00:38:13 build turnkey solutions so when they get there it's not a dramatic learning curve right the the building blocks are in place and they can just seamlessly transition over to a chiplet approach. Okay, well, somehow, what you just mentioned brought cerebrus to mind, because I'm not sure if I got it right, but it seems like they're sort of on the other end of the spectrum of what you're describing. They're sort of famous, let's say, for being these huge wafers with all sorts of chips packaged all together in there. So would you say that their approach is different, but it still works? Yeah, so they're going to encounter the same defects as everyone else.
Starting point is 00:39:03 The difference with them is they have redundancy where, you know, it's a wafer scale piece of hardware. So it's an, so a wafer is like that panel I told you about, right? And so they, they don't even cut the panel. They use the entire panel and there's going to be defects. And what they do is they just have software that works around the defects. Yeah. So I think for them, yes, they've gone fundamentally. They're using silicon as the foundation rather than packaging. Okay.
Starting point is 00:39:35 And so they have multiple chips that are arrayed, but they're not cut. They just connect on the wafer. It definitely is a different approach. Okay. One of the advantages of cutting things up is imagine you're Intel and you're trying to build a CPU and a GPU, let's say a DPU, perhaps even a networking device. By breaking things up into smaller pieces, they become like Lego building blocks. So you can have a processor core chiplet. You can have a PCI Express connectivity chiplet. You can have an Ethernet networking chiplet, a DDR memory IO chiplet, an HBM memory IO.
Starting point is 00:40:19 Now, if you can start mixing and matching these chiplets in a package to build out an entire product portfolio. Right. And so you can knock out a GPU, GPU, CPU in under a year if you have these building blocks. Compare that to trying to design and verify a single a single wafer, which easily will take two or more years for one product. And so I think from a design complexity, from an upfront investment perspective, breaking functionality down into smaller pieces and transitioning the foundation of semiconductors from silicon to packaging, which then you can plug in these legal building blocks to build different types of devices. Yeah, the the that's clearly the winning formula. OK, the benefits are incredible.
Starting point is 00:41:17 I gave a talk at a Samsung technology conference. You know, the reduction in cost of going to these smaller chiplets more than 60 percent. OK, the reduction in power more than 40 so as you think about building up next generation data center for a hyperscaler using ai now that's tens of billions of dollars of savings okay it's it's incredible so yeah it's just another technological revolution to deal with the problem that God has thrown at us? Well, I have to say, even though I'm not really that well-versed in hardware and AI hardware, for that matter, as an outsider, in a way, I would say that modularity always wins in the end. I mean, we've seen it at software, you know, going from big monolithic systems to modular systems. And I think it's sort of like a universal principle, which in a way makes me wonder.
Starting point is 00:42:12 So how come this only came to the fore now? Was it not like technically possible before? Or was it because vendors wanted to protect their IP and their sort of barriers to entry to others? I don't know. I agree with you. When you're writing code, modularity is always the best approach. But in semiconductors, right, an IC stands for integrated circuit and so the winner historically up to now has been the the company that could integrate into a monolithic device the most so i'll give an example intel do you remember the old days of computers you'd have the processor then you'd
Starting point is 00:42:59 have the north bridge and the south bridge and it was modular. What happened over time? It all got integrated. Broadcom, they cleaned up the TV business because there used to be discrete tuners, discrete graphics processors. And then what did Broadcom do? They came in, they integrated it all and beat everyone else because integration drove down costs. And so that created a maniacal focus on integration except as we approached the the dimensions of a single atom all of a sudden the manufacturing cost eclipsed the cost of of integration and so now like any, you swing too much to one side, it's swinging back to the other in terms of modularity. And, you know, the difference is. There's really no choice. We've hit a fundamental wall in physics.
Starting point is 00:43:58 You know, the concept of probability on the atomic and quantum scale falls apart, right, as you approach a single electron. So at that point, physics changes on us, and we have to change how we design things as a result. Okay, well, just to correct myself, let's say, well, actually, modularity always wins. Well, yes, but there's a certain point over which you get so much overhead through coordination overhead, basically, in software, at least I'm thinking
Starting point is 00:44:33 microservices, for example, you know, when they first came out, it was like this sort of new idea that everyone quickly subscribed to. But as it develops, and people started sort of, let's say, breaking down their monolithic components into microservices, some of them at least got to a point where they had so many of them and they had so much communication overhead that it just didn't make sense anymore. And I'm wondering if maybe you may end up in a similar situation with ziplots. Here's my experience as an engineer. Normally, one is never the right solution. Normally,
Starting point is 00:45:14 a hundred is never the right solution. Somewhere in between is typically the right solution. So, yes, I'm sure some will take the approach too far where functionality be broken up into so many tiny pieces that the cost of integrating all of these becomes limiting. Yes, ultimately, it'll be a hybrid blended approach. I think the way the you know, right now for chiplets, there's two ways of splitting things up okay one is you build a single lego block and you just mirror it over and over with the same functionality and they just talk to each other okay but it's the same lego block over and over the other approach is you split and cut and create Lego blocks for different types of functionality. So you'll have a compute Lego block, let's say a training IO Lego block, a network IO Lego block, a memory IO Lego block,
Starting point is 00:46:14 maybe a security and encryption Lego block. So right now, different approaches are using one of these two methods, I think I see more momentum behind the functionality split, right? Just because it creates reusable Lego blocks to accelerate other products and other forms of product development, right? Because you build it for one product. Let's say you build it for CPU. You can now reuse a bunch of these Lego blocks for GPUs.
Starting point is 00:46:48 It just advances the whole business. Well, maybe a naive question, but if you only have identical Lego blocks, let's call them, how do you go from that, like having a bunch of those all connected, to actually having a working chip? Do you somehow program it like you do with FBA? Exactly. Because imagine, let's say for an edge device, it's one Lego block. But then imagine you can have three price categories in data center, an economical brand, you know, a gold brand and a platinum brand. And it's the same Lego block, but in the edge,
Starting point is 00:47:26 it's a single chiplet. In the data center, you go from two to four to eight, and it's all the software. They get instantiated. Remember I said the new base is the packaging. So same silicon, just packaged differently, priced differently, and with different software to leverage the incremental compute and memory bandwidth associated with these devices. When you say software, I guess that actually probably means firmware, right? So where? Okay. Okay. Yes.
Starting point is 00:48:03 Because otherwise, you know, I would have a hard time figuring, like, how would the software stack sit on top of that. Yeah, I think it would be the, right, it's the compiler. The firmware, the compiler will know the amount of compute power, will optimize that high-level code to leverage all the parallelism that exists on the compute platform. Okay, well, that's a very interesting perspective, actually, for people like myself who mostly live in the software and applications world to actually get to know what's happening down there. And it sounds like actually a lot of the principles that we operate by are also applied there eventually at least. Look, engineering is all the same.
Starting point is 00:48:48 My brother's a chemical engineer. I'm an electrical engineer. My father was an engineer. Yeah, the principles are all the same, right? Whether it's thermodynamics, whether it's electromagnetics, whether it's computer science or computer organization yeah typically the way you solve problems are consistent right so yes i think we all speak a common language we just express ourselves differently but at the end of the day the the approaches we use to solve problems are relatively consistent across the disciplines.
Starting point is 00:49:27 Yeah, that's an interesting thing to say. The most effective hardware designers these days are the best software developers, the best coders. Yeah, software doesn't just end once you hit silicon. Design approaches, design efficiencies, all leverage software. And more and more, I expect to see AI come in to help improve hardware design efficiency as well. Well, yeah, there's actually already been a few experiments, let's say, in this area where people have had AI models applied to designing chips. And I think the results have been encouraging initially.
Starting point is 00:50:10 Yeah, see, yeah, it'll be very, very interesting. I'm a big believer in human ingenuity. But yeah, it'll be interesting to see if the machines that we help design can outsmart us one day. Yes. Well, for one thing, they can definitely be more efficient in terms of exploring search spaces. So, well, who knows? Yeah, and so it will definitely level the playing field, right? Because you can have phenomenal engineers that can very quickly converge.
Starting point is 00:50:44 And, you know, there are others that, let's say, with less experience or less ingenuity or less intuition, whatever you want to call it, sometimes take more time to converge. But if you can pull all this data into an AI and have it trained now, regardless of the individual, yeah, it, you know, it has the wealth of all those learnings behind it and can drive a solution onto itself with not being limited by an individual's experience set. Well, there's something else that you mentioned earlier that I said we'll come back to, which was analog AI. And again, to me, that's something that I very recently became aware of and seems kind of counterintuitive at first. So again, I'd like to ask you to just very quickly give an overview of what it is, how it works, and why it may be relevant.
Starting point is 00:51:41 Yeah, I will. And then, George, I'm going to have to jump off right after. I just have to prep a few minutes in front of my next meeting. So look, it's all about arithmetic processing. Okay, AI at the end of the day, just needs massively parallelized arithmetic processing, right vector processing, whatever you want to call it. And so binary is one approach, right? Binaries, numbers are represented as ones and zeros, and you can define the precision, you can define the range, and that defines your data path. But there's different ways of doing it as well, right? You can use analog arithmetic processing where
Starting point is 00:52:18 floating point can be represented via voltages or currents, okay? Potentially infinite precision, but in the real world with noise, potentially with less precision. But you know what? For edge AI, you need that much precision. And rather than using amps or milliamps or even microamps, you can use nanoamps, right? Very small currents, which keeps it low power. There's another form of compute that there's some companies investing in, which is optical compute for arithmetic, right? So using the properties of optics
Starting point is 00:52:57 to implement what are called MACs, right? Multiply, add, carry functionality, which is the core of any arithmetic unit. And again, it's supposed to be even lower power because you can implement far more max optically by sending out a signal and having the properties of light implement the rather than trying to use traditional digital approaches. And so at the end of the day, there's no change into how we process arithmetic. It's just different ways of processing arithmetic to, number one, increase scale. Number two, decrease power consumption per teraflop or petaflop. And so
Starting point is 00:53:43 there's a lot of money, a lot of research into different approaches right now. Okay. Well, since you did say we have to wrap up shortly, I think based on what we've talked about so far, if I were to note the takeaways of where do you see things going, I'd say you see chiplets as being a major major development going forward and by extension connectivity and probably i guess based on what is i said you think that analog and optical are also worth keeping an eye on at least yeah no for sure i think with the level of investment happening in optical and analog,
Starting point is 00:54:29 it'll be interesting to see how successful it becomes and how it can scale out to meet a portion of the market's needs. Okay, well, I think currently they're not widely deployed, are they? No. Most people I talk to are like, well, yeah, I mean, we're aware of them, but it's sort of like a fringe thing. Yeah, I know. But there's billions of dollars of investment that are going into these. So it'll be interesting to see. And there's some great minds, some friends of mine are working at these companies.
Starting point is 00:54:59 So it'll be really interesting to see what comes out of it. Look, if you asked me five years ago, I'd say whoever integrates first will win. And now I'm telling you, no, disintegration through chiplets is the winner. So I used to say, and I would historically say, he who does it digital will win. But look, if I was wrong on integration, I could be wrong on digital arithmetic. So, yes, never count anything out. Okay, well, I'll ask you again in five years, then, if not earlier. Okay, George, thank you very much. Thanks for sticking around.
Starting point is 00:55:41 For more stories like this, check the link in bio and follow Link Data Orchestration.
