In The Arena by TechArena - Ushering in a New Era of AI Silicon with Tenstorrent's David Bennett
Episode Date: September 14, 2023
TechArena host Allyson Klein chats with Tenstorrent's David Bennett about the company's vision for RISC-V + accelerator solutions to usher in a new era of AI compute, and how customers are hungry for alternatives, including custom designs.
Transcript
Welcome to the Tech Arena,
featuring authentic discussions between
tech's leading innovators and our host, Allyson Klein.
Now, let's step into the arena.
Welcome to the Tech Arena.
My name is Allyson Klein, and I'm coming to you from the AI Hardware Summit in Santa Clara, California. And I'm here with David Bennett, Chief Customer Officer with Tenstorrent. How are you doing today, David?
Great, Allyson. Thanks for having me.
We're here at a really interesting conference that's talking about the future of hardware for AI.
We're talking about that at a moment in time in the industry when AI has really captured the attention of not just the industry,
but broader populace, and we see the potential of this transformative technology.
Tell me about Tenstorrent, how you all fit into that, and what you're expecting from the conference this week.
I think you're absolutely right.
I think, you know, you said where we are, but then you expanded to, say, when we are.
And I think that when is pretty important, right?
It's, you know, AI has been plugging away for quite some time now.
But really with the generative models, with large language models, I think it's really kind of burst into sort of general population pop culture more than it has before. But I also think it underlines the point, you know, right
now we have a market that for all intents and purposes is dominated by a single player, meaning
that you have NVIDIA, who's put a lot of work into the whole ecosystem, both in the hardware and the software, to drive a lot of this innovation and drive everything around AI. But I think it's important, and we've seen this in other parts of our industry, to have alternatives,
to have different approaches to technology and that competition drives innovation.
At the same time, you know, we think here at Tenstorrent, there's a better way to do it.
And if you look at what Tenstorrent does, for those of you that may not know us, we build computers for AI. And I think the two major things to think about when you think of Tenstorrent are, one, we have a solution that is built from the ground up for AI. It's not a GPU architecture that happens to be good at AI, or something that we've been pushing for years. We really developed the architecture with AI and machine learning models in mind.
That's number one.
Number two is our approach, which is probably a little too much to go into in this conversation.
But again, we believe that to provide customers and to provide people what they're looking
for in an AI hardware solution that's not sort of the de facto standard, you need to
do everything.
You can't just slice off inference only, or do NLP only. So Tenstorrent's mission is inference, training, CNNs, NLP, recommendation engines, it doesn't matter. Everything runs on the same silicon, same software stack.
And I guess the last piece that's pretty interesting, and I think what differentiates
us a lot from the other AI hardware startups out there, is that we believe the future of
the data center and the future of compute is this combination of large general compute
and acceleration.
I think the trend, we see it, right?
We see the MI300 from AMD, awesome.
We see NVIDIA's Grace Hopper.
But I think the trend is there.
And I think what sets us apart is our CEO, Jim Keller, has quite a name for himself in
the arena of high-performance compute.
So we are developing what we like to refer to as the world's highest-performing
RISC-V CPU architecture. And I think the magic is that it's designed as sort of a companion to our
AI hardware, and that coming together of AI and general compute, we think is going
to be the magic, the secret sauce for us in the next couple of years.
Wow, David, you just set me up with an entire interview of questions in that one statement.
So let's break it down.
Sure.
You know, you started with the fact that you're developing an architecture specifically for AI. And you also mentioned that you're using a RISC-V architecture. Tell me why RISC-V is the right play.
Oh, that's a great question.
Actually, it's funny.
Our story about why we got into RISC-V, I think, kind of answers that, right?
So we have this AI architecture, again, that is developed from the ground up to be really good at machine learning models.
So we did things differently.
We're in our third generation of hardware right now.
And from our first generation, we bet big on low-precision floating point. We bet big on sparsity, conditional sparsity, dynamic sparsity. We've got all these
things in the architecture. But fundamentally, we believe it's this coming together of CPU,
general compute as well. If you look at it, even today, a fair number of the operations are happening off the accelerator, off the GPU, on the CPU, right? So long story short, we knew we needed that compute. We had actually thought about, okay, what are we going to pair with our accelerator? x86? You know, it's very hard to kind of do that kind of thing. So we went to Arm, and there were some ideas the team had around some data types that we would like added to the Arm processor we were using, and unfortunately they couldn't do it, for whatever reason. So we were like, well, these are things that we know will be beneficial to AI; we'd like to add in these data types. RISC-V. RISC-V is an open-standard architecture, meaning that you can change the ISA, you can add in data types, things that you need. And we went to one of the RISC-V vendors and they were like, sure, and they put it in. And that kind of started us off on our journey.
We wanted the right technology. We wanted the right data types, which we knew would be
beneficial for AI and machine learning models. We're very happy with what we got, but when we looked at it, I think it inspired Jim and the team to say, well,
let's go build something bigger and better because we know what's going to be needed in the future.
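(To make the data-type point concrete, here is a minimal PyTorch sketch, illustrative only, not Tenstorrent's stack or the specific formats they added to the ISA, of why low-precision floating point and sparsity are worth betting on.)

```python
import torch

# Hypothetical illustration (plain PyTorch, not Tenstorrent's stack).
torch.manual_seed(0)
a = torch.randn(1024, 1024)
b = torch.randn(1024, 1024)

# Low precision: bfloat16 halves memory traffic versus float32 while
# keeping the same dynamic range (8 exponent bits), so accuracy often
# degrades only slightly.
approx = (a.to(torch.bfloat16) @ b.to(torch.bfloat16)).to(torch.float32)
exact = a @ b
print("bf16 relative error:", ((exact - approx).norm() / exact.norm()).item())

# Sparsity: prune the smallest 90% of values; hardware that understands
# sparsity can skip the corresponding multiply-accumulates entirely.
mask = a.abs() > a.abs().quantile(0.9)
print("nonzero fraction after pruning:", mask.float().mean().item())
```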
Now you mentioned Grace Hopper, you mentioned AMD's.
Yep, MI300.
MI300. When you look at Tenstorrent, are you looking to be an accelerator in a multi-vendor chiplet architecture? Are you looking to be an alternative to those?
That's also a great question. So, how we define ourselves: we're small enough that we have multiple routes to market. Fundamentally, we are creating technology that will drive machine learning and really high-performance compute. How we take that to market is in a number of ways. So today, we produce PCIe boards and servers. We have an ultra-dense server we call Galaxy.
So very much we're going toe-to-toe with what NVIDIA is doing.
That doesn't change. At the same time, because we are open-sourcing our software and we're using open-standard RISC-V, we're big believers in open standards and open source in general.
We've had a lot of inbound interest for our IP, be it the AI IP, be it the RISC-V IP.
So we're working with strategic partners, which we've announced in the last couple of months, where we'll be working with companies to transfer our IP and develop products together.
And then finally, you talked about chiplets.
AMD's been very successful with their approach to using chiplets,
and we believe really strongly in that.
So if you look at our publicly stated next generation roadmaps,
we are going to be designing, developing, and selling chiplets.
So we will have an AI chiplet.
We'll have a RISC-V CPU chiplet.
We'll probably have some memory and I/O subsystem chiplets.
But our goal is certainly to really
go big with chiplets because
fundamentally, chiplets allow you to do a couple of things. They allow you to choose the right
amount of compute with the right balance for what
your workloads require.
We believe it significantly reduces the time to market and the cost, thereby significantly increasing the number of people, the number of customers and companies, that are going to go and develop their own solutions.
Last week, I was talking to Nidhi Chappell.
She runs the team over at Microsoft
that's building their infrastructure.
And she talked about her background in HPC.
You brought up also that Jim Keller
has a very well-known background in HPC and supercomputing.
Why is it that HPC is so critical
and knowledge of supercomputers is so critical
when you're building AI training engines?
Yeah, so it goes back to high-performance compute. It turns out it's really hard to develop high-performance computers, right?
So you see a lot of people playing, like there are a lot of RISC-V companies out there, a lot of Arm companies, that are doing things on low-end microcontrollers.
But then conversely, you see a lot of really big companies
putting in a lot of money to go develop
their own high performance compute architecture.
And frankly, not really, I think,
seeing the results that they want.
That's why it's sort of been limited to Intel and AMD.
And more recently, I think Arm has made significant strides in sort of higher-performance compute.
But it turns out it's a lot harder.
And what you're seeing now is that if you look at most supercomputers, if you speak
with the directors of these centers, most of them will tell you, well, AI is about 10%
of our workload today.
I think it's pretty clear that that percentage is going to go up.
Sure.
So when you're talking about HPC, high-performance compute, not only do you need
to sort of be able to support the traditional HPC applications and workloads, but at the
same time, you have to support this growing requirement for AI and ML. So I think that's
where you're going to see these coming together. And like I said, the real thing is, we're in the infancy when it comes to AI and machine learning.
Every time a new model comes out, and if the model's better,
people just switch like a light switch.
Some of the models that you thought were really popular yesterday
or three months ago are almost unused now
because when something new comes out, people switch.
And no one really knows what direction it's going to go. And I have a feeling that if you're looking at high-performance compute and you want to be able to support the AI and machine learning models of the future, you need to have flexibility in your compute, so that whatever direction the models go, your architecture supports it. Said another way, if I had one criticism of a lot of the AI hardware startups I see out there today: if you develop your hardware around the models of today, I think you're at an inherent disadvantage, because it's changing so fast.
Yeah, that's a great way to put it. You guys have a keynote here today. Jim
will be talking. What are the key messages for the conference this week
and are you making any announcements?
I don't know if we're making any specific announcements.
I'm looking over at my PR guy.
But Jim is, well, we have been focused on AI and machine learning models for the last six years. We're about to tape out our third-generation Blackhole processor, which will be the first one that takes our Tensix cores, which are the blocks of architecture that we use in our AI, and combines them with sort of the first version of those larger RISC-V compute cores.
So I have a feeling that Jim is going to be talking about, again, the heterogeneous compute,
how that benefits AI and ML. And again, I think the importance of having
an alternative, you know, before the microphone started rolling, we had a discussion about,
you know, what's the value in having, you know, alternative architectures or alternative hardware.
I think if COVID and the lockdowns taught us anything, it's that there was a huge supply chain crunch
with CPUs and memory and all these different parts.
And I think that companies realize
that if you have all your eggs in one basket,
you're single sourced, it's very dangerous.
So I think there is certainly a need,
especially with the popularity boom we see
in AI, ML compute requirements,
to have alternative vendors out there.
And I think Jim will probably, I think, touch on that in his keynote.
Yeah, you're wearing an architectural diagram of one of your chips, which, as a former silicon nerd myself, I find very exciting.
Tell me a little bit about the third-generation chip, what it brings in terms of an architectural footprint, and how customers have responded to it.
Sure. So since we're just taping out now, we expect
product back in the labs early next year, and then we'll be, you know, taking that and putting it
into different form factors. But we have been working with some of the customers that have
looked at our IP. They've actually got their hands on it, and they've been running it in simulation.
I think they're very, very pleased. And like I said, we've been on a journey
with our machine learning architecture.
So our first generation, Grayskull, which is the one on my shirt, introduced the Tensix cores. Again, we would describe it as the world's first mass-production graph computer with a dataflow architecture, very, very different to sort of a traditional GPU compute model. That's where the Tensix cores came in.
With our second generation, Wormhole, we added a proprietary Ethernet fabric designed to enable scale-out, meaning that we're using an Ethernet fabric to connect chip to chip, card to card, server to server, rack to rack, so that we can connect, we like to say an unlimited, but let's say a large number of chips together, so that they're seen by PyTorch or TensorFlow or whatever as a single device. And you remove the need for NVLink or Mellanox or these switching routers; everything's kind of connected together in this large mesh network.
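(A conceptual sketch of what "seen as a single device" means in practice, hypothetical code, not Tenstorrent's API: one matmul() call, with the work sharded across the chips of a mesh behind the scenes.)

```python
import numpy as np

# Conceptual sketch only (hypothetical, not Tenstorrent's API): a Mesh
# presents one matmul() entry point while sharding work across n chips.
class Mesh:
    def __init__(self, n_chips: int):
        self.n_chips = n_chips

    def matmul(self, a: np.ndarray, b: np.ndarray) -> np.ndarray:
        # Split the rows of `a` across chips. On real hardware the shards
        # would travel over the chip-to-chip Ethernet fabric rather than
        # staying in one process's memory.
        shards = np.array_split(a, self.n_chips, axis=0)
        partials = [shard @ b for shard in shards]  # one matmul per "chip"
        return np.concatenate(partials, axis=0)     # gather the results

# The caller addresses the whole mesh as if it were a single device.
mesh = Mesh(n_chips=8)
a = np.random.randn(512, 256)
b = np.random.randn(256, 128)
assert np.allclose(mesh.matmul(a, b), a @ b)  # same answer as one device
```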
Now with the third generation, in addition to improving on our Tensix cores, we've improved the math. We've added more memory. We've switched to, I think, a better GDDR. We've done a couple of things inside to sort of tune up and optimize. But we've also added 16 RISC-V cores to our chips. And as I said, we're kind of on this journey with RISC-V.
If you look inside our Tensix cores, actually we've got these things called baby RISC-Vs, these little single-issue, in-order RISC-V cores. And that's part of what drives our Tensix cores' compute and gives them sort of the ability to do this dynamic conditional execution.
And then with Wormhole, we have scale-out. And with Blackhole, we're now adding these 16, I would say medium-sized, RISC-V cores on the chip.
And where we removed the need for switches and routers in our second generation, what we're trying to do now is remove the need for a host system, or at least remove the need for operations to be sent outside of the silicon, going through the main bus, going through memory, and being done on the CPU.
How much of that can we do in the silicon itself?
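(For readers unfamiliar with the host round-trip being described, here is a toy sketch, hypothetical op names, not Tenstorrent's software, of the host-fallback pattern that on-chip CPU cores aim to eliminate.)

```python
# Toy sketch (hypothetical op names, not Tenstorrent's software) of the
# host-fallback pattern described above.
ACCEL_OPS = {"matmul", "softmax", "layernorm"}

def run_graph(ops):
    for op in ops:
        if op in ACCEL_OPS:
            print(f"{op}: runs on the accelerator")
        else:
            # Data crosses the bus to host memory and back: the overhead
            # that on-chip general-purpose RISC-V cores aim to remove.
            print(f"{op}: falls back to the host CPU")

run_graph(["matmul", "top_k_sampling", "softmax"])
```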
Now, the next generation after that is where we bring in our in-house high-performance RISC-V chiplet combined with our next-generation, top-of-the-line AI chiplet. We put them together and we've got what we believe will be a very interesting take on what Grace Hopper and what MI300 are doing. But this Blackhole chip is super interesting because it's really bringing
together those RISC-V CPU cores with our AI
and seeing what we can do with them together. And the response so far has been
very, very interesting. We've talked about some of the investments we've had in our company
and some of the strategic partnerships we've signed.
And I would tell you, without a doubt, all of these are companies that see the same vision of this combination of AI acceleration and general compute.
It's really exciting.
You know, you talked a little bit about how the architectures of today are not going to solve the problems because we're innovating too fast. This would typically be the time of the interview, David, that I would ask, what do you see in the next few years? But to put in perspective how fast we're moving, a year ago today, nobody had heard about ChatGPT. And so we see the acceleration, we feel the acceleration, it's visceral. So if you look forward a year from now, for Tenstorrent as well as the industry,
what do you want to be seeing and what do you think we're going to be talking about in terms of the
core capability of AI?
Yeah, so you're absolutely right. It's super interesting,
right? I think no one could have predicted it. I think, you know, Sam Altman and those guys at OpenAI are on record saying they're surprised at sort of the popularity of ChatGPT and what happened. So I think it goes to show you we're kind of in the infancy of this industry, in this
area, right? We don't know where it's going to go. I think without a doubt, as I said, with the
products that have been released this year from NVIDIA and AMD, I'm super interested in seeing how developers and academics and enthusiasts in
the space take advantage of that heterogeneous architecture and what they can drive with it.
So again, not knowing what's going to happen, not knowing what the models of the future will look
like, I am excited to see how the folks that are developing the models themselves are going to take
advantage of these sort of new heterogeneous architectures. And I think that's actually going to play well into
our own strategy. Other than that, it's really hard to say. I think that with the rate of change
that we're seeing, I can tell you some of the verticals. I'm on the business side at Tenstorrent.
I can tell you that there are some verticals that are super keen on this kind of combination of
compute and acceleration.
And it's automotive, it's HPC, high-performance computing, it's big data centers.
And it's less about the architecture itself; I would say the need, or the desire, to own your own compute is what surprises me most when I hear from customers.
And that's where RISC-V really comes in. People don't want to be beholden to the financial needs
or the whims of a few companies.
So if you can own your own architecture,
you can own your own destiny.
And I think Tesla taught us this and so has Amazon.
They really do own their entire silicon from end to end.
And I think a growing number of companies are looking at that and saying, wow, we can customize our silicon for the workloads that we care about, drive this incredible efficiency, and own the technology.
And really, at Tenstorrent, I think our reason to choose
RISC-V, for instance, is not only about
being able to offer that customizability,
but it's also about being able to offer ownership.
It's a new model where there's no fear of, what happens if Tenstorrent's not here in two years or five years? What if they get acquired?
If you're in the RISC-V ecosystem,
then you've got hopefully 5, 10, 15 other companies
that if we're not
doing our job and we're not offering the best solution, you can go somewhere else. In fact,
you can just decide to go do it on your own. And that ownership piece has been, surprisingly, one of the things I hear most from customers about why they're choosing to work with Tenstorrent.
That's fantastic. For those of us who are listening online, I'm sure we've piqued interest
to engage. How do folks
reach out to you to talk further and talk to your team?
Oh my gosh. So yeah, tenstorrent.com. We're
on Twitter. We're on kind of every social media. We've been, I don't want to say we're in stealth
mode, because we're not, but we've been intentionally careful with what we take to market. Jim and the team, we have one mission, and we don't want to over-promise. We don't want to talk in hype. We really just want to talk about what we can do and, more importantly, show it. So you'll see, I think in the next couple of months, we've started to make some announcements as we bring
these strategic partners on board. We've got a lot more announcements in the hopper that we're
pretty excited to share with everyone. But in the meantime, please reach out to us on the website,
reach out to us on social media. And I think one thing we've committed to everyone that we speak to is that we want to democratize AI, and we want to be able to get, you know, a modern AI architecture, one that frankly is very different than sort of what the standards are today, out to as many people as possible. And I hope to have some news and announcements about that coming soon. And hopefully, not only will we be able to talk about Tenstorrent, but I think you'll be able to get a chance to kind of play around with the hardware, and we want
more people to see what they can do with it.
For those who listen to the Tech Arena, they know that I am a huge fan of chiplet architectures, heterogeneous compute, and RISC-V. So I nailed all your...
I didn't know that. This is great.
Yeah, so thanks so much
for the time today and taking time away from the conference.
No, I appreciate it. And thanks for having me. I really look forward to hopefully coming back next year and sharing some more.
Thanks for joining The Tech Arena. Subscribe and engage at our website, thetecharena.net.
All content is copyright by The Tech Arena.