In The Arena by TechArena - Ushering in a New Era of AI Silicon with Tenstorrent's David Bennett

Episode Date: September 14, 2023

TechArena host Allyson Klein chats with Tenstorrent’s David Bennett about the company’s vision for RISC-V + accelerator solutions to usher in a new era of AI compute, and how customers are hungry for alternatives, including custom designs.

Transcript
Starting point is 00:00:00 Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena. Welcome to the Tech Arena. My name is Allyson Klein, and I'm coming to you from the AI Hardware Summit in Santa Clara, California. And I'm here with David Bennett, Chief Customer Officer with Tenstorrent. How are you doing today, David? Great, Allyson. Thanks for having me. We're here at a really interesting conference that's talking about the future of hardware for AI.
Starting point is 00:00:45 We're talking about that at a moment in time in the industry when AI has really captured the attention of not just the industry, but the broader populace, and we see the potential of this transformative technology. Tell me about Tenstorrent and how you all fit into that and what you're expecting from the conference this week. I think you're absolutely right. I think, you know, you said where we are, but then you expanded to say when we are. And I think that when is pretty important, right? You know, AI has been plugging away for quite some time now. But really with the generative models, with large language models, I think it's really kind of burst into sort of general-population pop culture more than it has before. But I also think it underlines the point, you know, right
Starting point is 00:01:29 now we have a market that for all intents and purposes is dominated by a single player, meaning that you have NVIDIA, who's put a lot of work into the whole ecosystem, into the hardware and the software, to drive a lot of this innovation and drive everything around AI. But I think it's important, as we've seen in other parts of our industry, to have alternatives, to have different approaches to technology, and that competition drives innovation. At the same time, you know, we think here at Tenstorrent there's a better way to do it. And if you look at what Tenstorrent does, for those of you that may not know us, you know, we build computers for AI. And I think the two major things to think about when you think of Tenstorrent are, one, we have a solution that is built ground-up for AI. It's not,
Starting point is 00:02:16 you know, a GPU architecture that happens to be good at AI or something that we've been, you know, pushing for years. We really developed the architecture with AI and machine learning models in mind. That's number one. Number two, our approach, probably a little too much to go into in this conversation. But again, we believe that to provide customers and to provide people what they're looking for in an AI hardware solution that's not sort of the de facto standard, you need to do everything. You can't just slice off inference only or NLP only. So, Tenstorrent's mission is inference, training, CNNs, NLP,
Starting point is 00:02:51 recommendation engines, doesn't matter. Everything on the same silicon, same software stack. And I guess the last piece that's pretty interesting, and I think what differentiates us a lot from the other AI hardware startups out there, is that we believe the future of the data center and the future of compute is this combination of large general compute and acceleration. I think the trends, we see it, right? We see the MI300 from AMD, awesome. We see NVIDIA's Grace Hopper.
Starting point is 00:03:17 But I think the trend is there. And I think what sets us apart is our CEO, Jim Keller, has quite a name for himself in the arena of high-performance compute. So we are developing what we like to refer to as the world's highest-performing RISC-V CPU architecture. And I think the magic is that it's designed as sort of a companion to our AI hardware, and that coming together of AI and general compute, we think, is going to be the magic, the secret sauce for us in the next couple of years. Wow, David, you just set me up with an entire interview of questions in that one statement.
Starting point is 00:03:53 So let's break it down. Sure. You know, you started with, you're developing an architecture specifically for AI. And you also mentioned that you're building on a RISC-V architecture. Tell me why RISC-V is the right play. Oh, that's a great question. Actually, it's funny. Our story about why we got into RISC-V, I think, kind of tells that, right?
Starting point is 00:04:13 So we have this AI architecture, again, that is developed from the ground up to be really good at machine learning models. We're in our third generation of hardware right now. And from our first generation, we bet big on low-precision floating point. We bet big on sparsity, conditional sparsity, dynamic sparsity. We've got all these things in the architecture. But fundamentally, we believe it's this coming together of CPU, of general compute, as well. Even today, a fair number of the operations are happening off the accelerator, off the GPU, on general compute, right? So long story short, we knew we needed compute. We had actually thought about, okay,
Starting point is 00:04:45 well, what are we going to pair with our accelerator? x86? You know, it's very hard to kind of do that kind of thing. So we went to Arm, and there were some ideas that the team had around some data types that we would like added to the Arm processor that we were using, and unfortunately they couldn't do it, for whatever reason. So we're like, well, these are things that we know will be beneficial to AI; we'd like to add in these data types. RISC-V. RISC-V is an open-standard architecture, meaning that you can change the ISA, you can add in data types, things that you need. And we went to one of the RISC-V vendors and they're like, sure, and they put it in. And that kind of started us off on our journey.
Starting point is 00:05:19 We wanted the right technology. We wanted the right data types, which we knew would be beneficial for AI and machine learning models. We're very happy with what we got, but when we looked at it, I think it inspired Jim and the team to say, well, let's go build something bigger and better, because we know what's going to be needed in the future. Now you mentioned Grace Hopper, you mentioned AMD's. Yep, MI300. MI300. When you look at Tenstorrent, are you looking to be an accelerator in a multi-vendor chiplet architecture? Are you looking to be an alternative to those?
Starting point is 00:05:53 That's also a great question. So, how do we define ourselves? We're small enough that we have multiple routes to market. Fundamentally, we are creating technology that will drive machine learning and really high-performance compute. How we take that to market is in a number of ways. So today, we produce PCIe boards, servers. We have an ultra-dense server we call Galaxy. So very much we're going toe-to-toe with what NVIDIA is doing.
Starting point is 00:06:20 That doesn't change. At the same time, because we are open-sourcing our software and we're using open-standard RISC-V, we're big believers kind of in open standards, open source in general. We've had a lot of inbound interest in our IP, be it the AI IP, be it the RISC-V IP. So we're working with strategic partners, which we've announced in the last couple of months, where we'll be working with companies to transfer our IP and develop products together. And then finally, you talked about chiplets. AMD's been very successful with their approach to using chiplets, and we believe really strongly in that. So if you look at our publicly stated next-generation roadmaps,
Starting point is 00:06:54 we are going to be designing, developing, and selling chiplets. So we will have an AI chiplet. We'll have a RISC-V CPU chiplet. We'll probably have some memory and I/O subsystem chiplets. But our goal is certainly to really go big with chiplets, because fundamentally, chiplets allow you to do a couple of things. They allow you to choose the right
Starting point is 00:07:13 amount of compute, with the right balance, for what your workloads require. And we believe they significantly reduce the time to market and the cost, thereby significantly increasing the number of customers and companies that are going to go and develop their own solutions.
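That "right balance" idea is easy to make concrete with a toy sizing exercise. In the sketch below, a package is assembled from however many chiplets of each kind a workload needs; every chiplet type and capacity figure here is hypothetical, for illustration only, not a Tenstorrent part or roadmap item:

```python
import math
from dataclasses import dataclass

@dataclass
class Package:
    ai_chiplets: int      # accelerator tiles
    cpu_chiplets: int     # general-compute tiles
    io_mem_chiplets: int  # memory / I/O tiles

# Per-chiplet capacities: made-up illustrative numbers, not real specs.
AI_TFLOPS, CPU_THREADS, IO_GBPS = 100.0, 32, 400.0

def size_package(tflops: float, threads: int, gbps: float) -> Package:
    """Pick a chiplet mix that covers a workload's compute, host, and I/O needs."""
    return Package(
        ai_chiplets=math.ceil(tflops / AI_TFLOPS),
        cpu_chiplets=math.ceil(threads / CPU_THREADS),
        io_mem_chiplets=math.ceil(gbps / IO_GBPS),
    )

# Same building blocks, two very different balances:
print(size_package(tflops=800, threads=32, gbps=800))   # accelerator-heavy box
print(size_package(tflops=100, threads=256, gbps=400))  # host-heavy box
```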
Starting point is 00:07:30 Last week, I was talking to Nidhi Chappell. She runs the team over at Microsoft that's building their infrastructure. And she talked about her background in HPC. You brought up also that Jim Keller has a very well-known background in HPC and supercomputing. Why is it that HPC is so critical and knowledge of supercomputers is so critical
Starting point is 00:07:50 when you're building AI training engines? Yeah, so it goes back to high-performance compute. It turns out it's really hard to develop high-performance computers, right? So you see a lot of people playing in this space, like there's a lot of RISC-V companies out there, a lot of Arm companies, that are doing things on low-end microcontrollers.
Starting point is 00:08:08 But then conversely, you see a lot of really big companies putting in a lot of money to go develop their own high performance compute architecture. And frankly, not really, I think, seeing the results that they want. That's why it's sort of been limited to Intel and AMD. And more recently, I think ARM has made significant strides in sort of the higher performance compute.
Starting point is 00:08:27 But it turns out it's a lot harder. And what you're seeing now is that if you look at most supercomputers, if you speak with the directors of these centers, most of them will tell you, well, AI is about 10% of our workload today. I think it's pretty clear that that percentage is going to go up. Sure. So when you're talking about HPC and high performance compute, not only do you need to sort of be able to support the traditional HPC applications and workloads, but at the
Starting point is 00:08:55 same time, you have to support this growing requirement for AI and ML. So I think that's where you're going to see these coming together. And like I said, the real thing is we're in the infancy when it comes to AI and machine learning. Every time a new model comes out, if the model's better, people just switch, like a light switch. Some of the models that you thought were really popular yesterday or three months ago are almost unused now, because when something new comes out, people switch.
Starting point is 00:09:22 And no one really knows what direction it's going to go. And I have a feeling that if you're looking at high-performance compute and you want to be able to support the AI and machine learning models of the future, you need to have flexibility in your compute, so that whatever direction the models go, your architecture supports that. Said another way, if I had one criticism of a lot of the AI hardware startups I see out there today, it's that if you develop your hardware around the models of today, I think you're at an inherent disadvantage because it's changing so
Starting point is 00:09:54 fast. Yeah, that's a great way to put it. You guys have a keynote here today. Jim will be talking. What are the key messages for the conference this week, and are you making any announcements? I don't know if we're making any specific announcements. I'm looking over at my PR guy. But Jim is... we have been focused on AI and machine learning models for the last six years. We're about to tape out our third-generation Blackhole processor, which will be the first one that takes our Tensix cores,
Starting point is 00:10:26 which are the blocks of architecture that we use in our AI, and combines them with sort of the first version of those larger RISC-V compute cores. So I have a feeling that Jim is going to be talking about, again, heterogeneous compute and how that benefits AI and ML. And again, I think, the importance of having an alternative. You know, before the microphones started rolling, we had a discussion about what's the value in having alternative architectures or alternative hardware. I think if COVID and the lockdowns taught us anything, there was a huge supply chain crunch with CPUs and memory and all these different parts.
Starting point is 00:11:06 And I think that companies realized that if you have all your eggs in one basket, if you're single-sourced, it's very dangerous. So I think there is certainly a need, especially with the popularity boom we see in AI and ML compute requirements, to have alternative vendors out there. And I think Jim will probably touch on that in his keynote.
Starting point is 00:11:28 Yeah, you're wearing an architectural diagram of one of your chips, which, as a former silicon nerd myself, is very exciting. Tell me a little bit about the third-generation chip, what it brings in terms of an architectural footprint, and how customers have responded to it. Sure. So we're just taping out now; we expect product back in the labs early next year, and then we'll be, you know, taking that and putting it into different form factors. But we have been working with some of the customers that have looked at our IP. They've actually got their hands on it, and they've been running it in simulation.
Starting point is 00:12:01 I think they're very, very pleased. And like I said, we've been on a journey with our machine learning architecture. So our first-generation Grayskull, which is the one on my shirt, introduced the Tensix cores; we would describe it as the world's first mass-production graph computer with a dataflow architecture,
Starting point is 00:12:19 very, very different from sort of a traditional GPU compute model. And that's where the Tensix cores came in. With our second-generation Wormhole, we added a proprietary Ethernet fabric designed to enable scale-out, meaning that we're using an Ethernet fabric to connect chip
Starting point is 00:12:41 to chip, card to card, server to server, rack to rack, so that we can connect, we like to say, an unlimited, but let's say a large, number of chips together so that they're seen by PyTorch or TensorFlow or whatever as a single device. And you remove the need for having NVLink or Mellanox or these switching routers; everything's kind of connected together in this large mesh network. Now, with the third generation, in addition to improving on our Tensix cores, we've improved the math, we've added more memory, and we've switched to, I think, a better GDDR. We've done a couple of things inside to sort of tune up and optimize. But we've added 16 RISC-V cores to our chips. And as I said,
Starting point is 00:13:26 we're on this journey with RISC-V. If you look inside our Tensix cores, actually, we've got these things called baby RISC-Vs, these little single-issue, in-order RISC-V cores. And that's part of what drives our Tensix cores' compute and gives them sort of the ability to do this dynamic conditional execution. And then with Wormhole, we have scale-out.
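It's worth spelling out what "seen as a single device" buys the software stack: the framework issues one large operation, and the sharding across the mesh stays hidden beneath it. The toy numpy simulation below captures that contract only; the class and its behavior are illustrative, not Tenstorrent's actual PyTorch or TensorFlow integration:

```python
import numpy as np

class MeshDevice:
    """Toy model of a chip mesh exposed as one logical device.

    The caller sees a single matmul; internally the work is sharded
    row-wise across 'chips' and stitched back together, which is roughly
    the contract a single-device abstraction gives a framework.
    """
    def __init__(self, n_chips: int):
        self.n_chips = n_chips

    def matmul(self, a: np.ndarray, b: np.ndarray) -> np.ndarray:
        shards = np.array_split(a, self.n_chips, axis=0)  # one shard per chip
        partials = [s @ b for s in shards]                # "runs on each chip"
        return np.concatenate(partials, axis=0)           # gathered over the fabric

mesh = MeshDevice(n_chips=32)
a, b = np.random.rand(1024, 512), np.random.rand(512, 256)
assert np.allclose(mesh.matmul(a, b), a @ b)  # same result, sharding hidden
```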
Starting point is 00:13:44 And with Blackhole, we're now adding these 16, I would say medium-sized, RISC-V cores on the chip. And where we removed the need for switches and routers in our second generation, what we're trying to do now is remove the need for a host system, or at least remove the need for operations to be sent outside of the silicon, going through the main bus, going through memory, and being done on the CPU.
Starting point is 00:14:08 How much of that can we do in the silicon itself? Now, the next generation after that is where we bring in our in-house high-performance RISC-V chiplet combined with our next-generation, top-of-the-line AI chiplet. We put them together, and we've got what we believe will be a very interesting take on what Grace Hopper and MI300 are doing. But this Blackhole chip is super interesting because it's really bringing together those RISC-V CPU cores with our AI and seeing what we can do with them together. And the response so far has been very, very interesting. We've talked about some of the investments we've had in our company
Starting point is 00:14:43 and some of the strategic partnerships we've signed. And I would tell you, without a doubt, all of these are companies that see the same vision of this combination of AI acceleration and general compute. It's really exciting. You know, you talked a little bit about how the architectures of today are not going to solve the problems because we're innovating too fast. This would be the time of the interview, typically, David, that I would ask, what do you see in the next few years? But if you put in perspective how fast we're moving: a year ago today, nobody had heard of ChatGPT. And so we see the acceleration, we feel the acceleration, it's visceral. So if you look forward a year from now, for Tenstorrent as well as the industry, what do you want to be seeing and what do you think we're going to be talking about in terms of the
Starting point is 00:15:32 core capability of AI? Yeah, so you're absolutely right. It's super interesting, right? I think no one could have predicted it. I think, you know, Sam Altman and those guys at OpenAI are on record saying they're surprised at sort of the popularity of ChatGPT and what happened. So I think it goes to show you we're kind of in the infancy of this industry, of this area, right? We don't know where it's going to go. I think without a doubt, as I said, with the products that have been released this year from NVIDIA and AMD, I'm super interested in seeing how developers and academics and enthusiasts in the space take advantage of that heterogeneous architecture and what they can drive with it.
Starting point is 00:16:12 So again, not knowing what's going to happen, not knowing what the models of the future will look like, I am excited to see how the folks that are developing the models themselves are going to take advantage of these sort of new heterogeneous architectures. And I think that's actually going to play well into our own strategy. Other than that, it's really hard to say, with the rate of change that we're seeing. But I can tell you some of the verticals. I'm on the business side for Tenstorrent, and I can tell you that there are some verticals that are super keen on this kind of combination of compute and acceleration. It's automotive, it's HPC, high-performance computing, it's big data centers.
Starting point is 00:16:51 And it's less about the architecture itself; I would say the need, or the desire, to own your own compute is what has surprised me most when I hear from customers. And that's where RISC-V really comes in. People don't want to be beholden to the financial needs or the whims of a few companies. So if you can own your own architecture, you can own your own destiny. And I think Tesla taught us this, and so has Amazon.
Starting point is 00:17:21 They really do own their entire silicon, end to end. And I think there are a number of companies that are looking at that and saying, wow, we can customize our silicon for the workloads that we care about, drive this incredible efficiency, and own the technology. And really, at Tenstorrent, I think our reason to choose RISC-V, for instance, is not only about
Starting point is 00:17:44 being able to offer that customizability, but it's also about being able to offer ownership. It's a new model where there's no fear of what happens if Tenstorrent's not here in two years or five years, or what if they get acquired. If you're in the RISC-V ecosystem, then you've got, hopefully, 5, 10, 15 other companies, so that if we're not
Starting point is 00:18:05 doing our job and we're not offering the best solution, you can go somewhere else. In fact, you can just decide to go do it on your own. And that ownership piece has, surprisingly, been one of the things I hear most from customers about why they're choosing to work with Tenstorrent. That's fantastic. For those of us who are listening online, I'm sure we've piqued interest to engage. How do folks reach out to you to talk further and talk to your team? Oh my gosh. So yeah, tenstorrent.com. We're on Twitter. We're on kind of every social media. We've been, I don't want to say we're in stealth mode, because we're not, but we've been intentionally careful with what we're taking to market. Jim and
Starting point is 00:18:40 the team, we have one mission, and we don't want to overpromise. We don't want to talk in hype. We really just want to talk about what we can do and, more importantly, show it. So you'll see more, I think, in the next couple of months. We've started to make some announcements as we bring these strategic partners on board, and we've got a lot more announcements in the hopper that we're pretty excited to share with everyone. But in the meantime, please reach out to us on the website, reach out to us on social media. And I think one thing we've committed to, to everyone that we speak to, is that we want to democratize AI, and we want to be able to get, you know, a modern AI architecture that frankly is very different from sort of what the standards are today out to as many people as possible. And I hope to have some news and
Starting point is 00:19:22 announcements about that coming soon. And hopefully, not only will you be able to talk to Tenstorrent, but I think you'll get a chance to kind of play around with the hardware, and we want more people to see what they can do with it. For those who listen to the Tech Arena, they know that I am a huge fan of chiplet architectures, heterogeneous compute, and RISC-V. So I
Starting point is 00:19:39 nailed all your... I didn't know that. This is great. Yeah, so thanks so much for the time today and taking time away from the conference. No, I appreciate it. And thanks for having me. I really look forward to hopefully coming back next year and sharing some more. Thanks for joining the Tech Arena.
Starting point is 00:19:56 Subscribe and engage at our website, thetecharena.net. All content is copyright by The Tech Arena.
