StorageReview.com Podcast #110: The Rising Importance of Storage Accelerator Cards
Episode Date: September 7, 2022
For this podcast, Brian connected with Pliops Global VP of Products and Marketing, Tony…
Transcript
Hey everyone, Brian Beeler here. Welcome to the podcast. We've got a really great conversation
coming up here around what the next generation of accelerators can do for computing. And
there's all sorts of challenges with big NVMe SSDs, and legacy RAID cards kind of struggle
with some of these things. So I've brought on Tony Afshary from Pliops.
How are you, Tony?
Great. Thanks for having me.
So Pliops is in this accelerator card category.
Well, that's what I call it.
I don't know if that's what you call it.
But generally, set up this category of products and the problems that you, and to some extent your competitors, are trying to solve. Where did RAID leave off?
Sure.
I think in general what we're seeing, what the industry has been observing the last few years, is the gap between what a CPU and software combination
can do and the bottlenecks that they run into when it comes to NVMe drives.
As NVMe drives keep getting better, with higher performance, higher capacity, and lower cost, there's a lot of performance that is left on the table.
And frankly, it has to do with how applications, specifically the storage engine portion of the applications, which is traditionally the name given to the software that handles I/O going in and out of storage, run into the limitations of the CPU, RAM, and so on. You can try to throw more network and more CPU at this, and try to hire an army of software engineers who have to rewrite code frequently to remove this bottleneck with the NVMe drives.
But ultimately, just like what happened with graphics on GPUs, and with AI and ML engines on GPUs and dedicated accelerators,
the story is the same for storage.
You come to a point where you need to offload some of these tasks
away from the CPU to an accelerator.
And in that sense, platforms like ours
or other computational storage type solutions are coming on board and focusing on solving that problem.
Now, the way we did it is we tried to make this simple.
We tried to tackle specific problems, which is, like you mentioned, rebuild problems of the large
drives, striping and data protection of the large drives, but it goes on from there.
We've designed a platform that is intended to do many things.
Let's go back a little bit, because you and I have known each other for a long time. You have a background in some of these cards from LSI and days of yore, right, where the physical RAID card was the de facto standard to attach hard drives, or SSDs, to a system. Certainly through Gen 3 NVMe SSDs, the RAID card was still the standard way to make that happen.
What is it, in your view? I think the inflection was Gen 3 to Gen 4 on SSDs, where the RAID cards that we were seeing were ultimately blocking. So physical RAID cards were blocking, and then you were talking about the software alternatives that consume a lot of system resources, CPU, a little bit of RAM, and other things. But given your background in hardware, what do you think happened? Do you read it the same way, that Gen 3 to 4 was really the problem, where the RAID card sort of fell out of favor as the standard for aggregating storage?
Exactly.
The RAID cards fundamentally were designed around slower media, starting with hard drives. And there was a lot of slowness in the system and in the media itself, where you could do things in the background and get away with it. Fundamentally, a RAID card still runs an algorithm on some kind of core, some kind of compute device on a card. This is hardware, right? You're executing code on some kind of CPU locally. And while there are latencies and deficiencies with that, as long as the media is slow, starting with hard drives, you can get away with it. Then, as the drives got better, we went to SATA SSDs. As you know, SATA SSDs were much better than HDDs, but ultimately had some restrictions. Same with SAS SSDs: much better performance, much better resiliency, but ultimately not NVMe.
When the shift came to NVMe drives, the whole notion of being able to abstract SSDs with cabling and hide them behind RAID cards, et cetera, went away.
Because if you try to do that, just like today, perhaps, if you look at a Dell server or an HPE server or many others,
they have this tri-mode scheme where you have to support all three types of interfaces. But with that, all of a sudden, it's a Pandora's box.
The RAID cards have absolutely come to their knees
when it comes to dealing with NVMe drives.
So, yes, I think the transition from SAS and SATA to NVMe was a big deal.
But even within the NVMe world,
it's to the point where a RAID card can run out of juice after maybe using two NVMe drives,
especially the Gen 4s.
Yeah, well we certainly saw that.
I mean, the NVMe drives have a lot of capabilities to sap resources.
And I suppose what's really compelling about your card is a couple of things.
First of all, and we should get into it because we haven't talked about it quite yet.
So I've got one sitting here. This is the less pretty version without the marketing heat shield on top of it. And for all of you listening, it's a half-height, half-length card that has a little DRAM module on it, a little SSD, and some capacitors to hold it up, and then of course an FPGA on here to handle the logic.
Let's start with the card architecture.
I highlighted the key components, but what about the physical card stands out to you,
and then let's talk about the advantages of being on the PCIe bus.
Sure.
So we picked out a standard form factor that can fit in any server. And so we went with
half-height, half-length, and we forced ourselves to stay within that framework. As you mentioned,
the core of the system is around what we do and our IP and that FPGA today.
We're working on an ASIC that is going to come out next year.
And the core function of the product is that the data is handed off to us from the host. So when the application writes to storage, this time
it's not pointing out to the SSDs, it's pointing out to us. Having said that, we are not changing
anything. We are not hiding the drives. They're sitting adjacent to us. You can still have the
out-of-band management of the SSDs. All you're doing is writing and pointing
the storage to us. We take care of everything else. So on that card, as you noticed, there's SRAM, and there's what we call NVDRAM, which is 16 or 32 gig of DRAM that is supercap backed. The only time you use the supercaps is during a sudden power down, where you have enough energy on the board to completely dump the contents of the DRAM, or NVDRAM, onto an SSD on the back of that card. Outside of sudden power-down events, we are not really exercising that protection at all. So we have a very atomic way of receiving data from the host and acknowledging it very fast: the writes basically hit our NVDRAM and we acknowledge the I/O back to the host. That's ultimately the lowest-latency way possible to write to storage. From there, we treat everything essentially as key-value pairs. You've heard of key-value operations. We have two
interfaces, APIs available to this card. One is directly talking key-value with the application.
That's a little bit more advanced. That's where you get the 20x, 30x, 50x types of improvements in performance. But to start with, to make this very simple,
we also have just a block mode
where a block driver is receiving the IO from the application.
Fundamentally, blocks can, in a way, be represented as key-value: the LBA address, and the content that goes at that LBA. So, keys and values.
When these blocks come to our domain, once they're acknowledged, once they're in our NVDRAM, or even in our SRAM, we have both an SRAM and an NVDRAM, we then start running them through a compression engine. That compression engine ultimately produces variable-size objects, depending on the compressibility of the data. A 4K block that comes our way may end up being a 500-byte object, a 1K object, or still a 4K object. Then we have our patented key-value engine, which creates, indexes, sorts, and garbage collects all of these objects, and packs them very tightly, with one-byte resolution, into one large string of data that we ultimately write to the SSDs. Those can typically be as big as one or two gigabytes of data that we have shaped and made ready to push out to the SSDs.
All of this takes place without the application having to worry about it. And what it results in is a very streamlined operation with the SSDs, where the SSDs, almost to the end of their lives, have to do virtually no background operations. So there's no pressure forcing the SSD controller itself to do a lot of work. That ends up translating to the best possible writes to the SSDs, meaning always large, always sequential, always NAND aligned, with the SSD providing the best latency possible, but also the best life expectancy possible.
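(For readers who want a mental model of the write path described above, here is a minimal, hypothetical Python sketch: block writes are treated as key-value pairs keyed by LBA, compressed into variable-size objects with zlib, and packed back to back into a buffer that gets flushed as one large sequential write. It only illustrates the compress-pack-flush idea; it is not Pliops' actual XDP implementation, API, or on-media format.)

```python
import zlib

BLOCK_SIZE = 4096            # logical block size coming from the host
FLUSH_THRESHOLD = 1 << 20    # flush a packed 1 MiB stripe here (the card shapes ~1-2 GB)

class ToyKVWritePath:
    """Toy model: LBA-keyed blocks -> compressed objects -> one big sequential write."""

    def __init__(self, backing_path):
        self.backing = open(backing_path, "wb")  # stands in for the SSDs behind the card
        self.buffer = bytearray()                # objects packed back to back, byte-aligned
        self.flushed = 0                         # bytes already written sequentially
        self.index = {}                          # key (LBA) -> (offset, length) of its object

    def write_block(self, lba: int, data: bytes):
        assert len(data) == BLOCK_SIZE
        # In hardware the write is acknowledged as soon as it lands in supercap-backed
        # NVDRAM; here we just compress and pack it.
        obj = zlib.compress(data)                # a 4K block may shrink to a few hundred bytes
        self.index[lba] = (self.flushed + len(self.buffer), len(obj))
        self.buffer += obj                       # packed with one-byte resolution, no 4K slots
        if len(self.buffer) >= FLUSH_THRESHOLD:  # enough shaped data: one large sequential write
            self.backing.write(self.buffer)
            self.flushed += len(self.buffer)
            self.buffer.clear()

# Usage: compressible blocks pack far more densely than fixed 4K placement would allow.
path = ToyKVWritePath("/tmp/toy_stripe.bin")
for lba in range(4096):
    path.write_block(lba, bytes(BLOCK_SIZE))     # all-zero blocks compress to ~20 bytes each
```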
Well, so there's a lot there. So you've got the card,
sits right on the PCIe bus, takes the writes in, acknowledges them very quickly because you've got the RAM. And then of course it's supercap protected in the event of power loss, but in normal operations it does all sorts of data enhancements, coalesces these writes, gives them to the SSDs
really nicely, and you're off and running, which is pretty cool. The one thing that you didn't say,
which I'm sure you want to say and get to, is that this really opens the door for more dense
NAND. So QLC, we've got these, the Solidigm 30 terabyte drives or 30.72 or whatever.
But Solidigm, also at FMS, was showing off their penta-level cell drives, and they're quite serious about bringing those to market, along with even bigger QLC drives. So the market really wants to go towards more dense, more cost-effective flash,
which obviously makes quite a bit of sense.
But operationally, it's been difficult for organizations to adopt those drives
because they have sensitivities.
You can random write a bunch of 4K blocks to a QLC drive, but it will be very sad about that. And the performance will let you know, and the application will let you know, that it does not enjoy that. So your card and your software, for Solidigm with these drives or whoever else it could be, really unlock some of these NAND technologies and let them be leveraged in a way where the application doesn't notice that it's, I almost said lower quality. It's not lower quality. It's just lower cost in the end, and more dense. So I mean, to unlock these drives, you really need something like this in the system.
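(To make the random-write pain concrete, here is a small, hypothetical flash-translation-layer simulation with greedy garbage collection and invented geometry. It compares the write amplification of purely sequential writes against random 4K overwrites; it's a toy model, not a characterization of any specific QLC drive.)

```python
import random

PAGES_PER_BLOCK = 256        # hypothetical geometry: 256 x 4K pages per erase block
BLOCKS = 128
OVERPROVISION = 0.07         # ~7% spare area
LOGICAL_PAGES = int(BLOCKS * PAGES_PER_BLOCK * (1 - OVERPROVISION))

def write_amplification(random_io: bool, host_ops: int = 200_000) -> float:
    """Toy FTL with greedy garbage collection; returns NAND writes / host writes."""
    l2p = {}                                       # logical page -> (block, slot)
    pages = [[None] * PAGES_PER_BLOCK for _ in range(BLOCKS)]
    valid = [0] * BLOCKS
    free_blocks = list(range(1, BLOCKS))
    open_block, open_slot, nand_writes = 0, 0, 0

    def program(lpn):
        nonlocal open_slot, nand_writes
        old = l2p.get(lpn)
        if old is not None:                        # invalidate the stale copy
            valid[old[0]] -= 1
            pages[old[0]][old[1]] = None
        pages[open_block][open_slot] = lpn
        l2p[lpn] = (open_block, open_slot)
        valid[open_block] += 1
        open_slot += 1
        nand_writes += 1

    def open_new_block():
        nonlocal open_block, open_slot
        if free_blocks:                            # a clean block is ready
            open_block, open_slot = free_blocks.pop(), 0
            return
        # Greedy GC: erase the block with the fewest valid pages, then re-program
        # its still-valid pages. Those copies are the "extra" NAND writes.
        victim = min((b for b in range(BLOCKS) if b != open_block), key=valid.__getitem__)
        movers = [lpn for lpn in pages[victim] if lpn is not None]
        for lpn in movers:
            del l2p[lpn]
        pages[victim] = [None] * PAGES_PER_BLOCK
        valid[victim] = 0
        open_block, open_slot = victim, 0
        for lpn in movers:
            program(lpn)

    seq = 0
    for _ in range(host_ops):
        if open_slot == PAGES_PER_BLOCK:
            open_new_block()
        if random_io:
            program(random.randrange(LOGICAL_PAGES))
        else:
            program(seq)
            seq = (seq + 1) % LOGICAL_PAGES
    return nand_writes / host_ops

print("sequential write amplification:", round(write_amplification(False), 2))  # ~1.0
print("random 4K write amplification: ", round(write_amplification(True), 2))   # several times higher
```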
Yeah, that's a very good point. As you know, traditionally the controller, and we're talking about SSDs here, has had to be very complex, using a lot of DRAM and different bit levels to manage data. And it ultimately comes down to cost. With our solution, we are saying that the bigger the drives the better, but also the cheaper the drives the better, because the less the SSD controller has to touch the data, the better. But what if you have a solution like ours that can manage any SSDs?
So in other words, you're not tied to one type of SSD or one vendor.
You can mix and match it any way you want.
And ultimately, you can try to deploy the biggest, highest-capacity SSDs that you can, not having to worry about rebuild time, not having to worry about performance. So the cost per unit of density in the server improves quite a bit.
There is hesitation, and we see this with our customers.
We either go into a compute node or a storage node.
In the compute node, people are typically putting in maybe two, at most maybe four drives, for the most part, for a lot of the customers out there. And they're stopping at four terabytes or maybe eight terabytes. The minute you want to go to 16 terabytes or higher, there's this big hesitation about blast radius and the rebuild nightmares. Even at a cluster level, when a drive fails, it still causes a lot of east-west traffic at rebuild time. All of those are reasons people are hesitant to put in these big drives, like the technology from Solidigm. QLC is the better NAND technology in terms of density and cost, and going from QLC to PLC and so on is good for everybody, because ultimately that's how the physics and geometry work. It's better to bring those devices from the fab and offer them directly to the end users, but that comes with some problems. If there's a solution out there that can address that without having to change the software, and without having to worry about drive protection and drive-failure protection, that is ultimately what the customers are asking for. So we have a very good solution and a very good working relationship with Solidigm around their QLC drives. You guys did some testing with four 32-terabyte drives, and the performance ultimately looks like what you'll see with four 8-terabyte TLC drives.
The reason is, as you know, with NAND, if you can be NAND aligned, if you can avoid
random writes and avoid background tasks for the drive controller,
you've done well.
That is certainly true as you move into QLC, as you move into PLC, but also maybe even ZNS drives. So if there's a way you can take all of that thinking and figuring out how to use those drives away from the application and the IT operators, that is certainly a benefit to them. And that's what we do.
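(As a back-of-the-envelope look at the rebuild-time concern mentioned above, with a hypothetical sustained rebuild rate rather than measured figures:)

```python
# Hypothetical numbers: real rebuilds are often slower because the array keeps
# serving foreground I/O while it rebuilds.
def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    return capacity_tb * 1_000_000 / rebuild_mb_per_s / 3600

for cap_tb in (8, 16, 30.72, 61.44):
    print(f"{cap_tb:>6} TB at 500 MB/s -> {rebuild_hours(cap_tb, 500):5.1f} hours")
# 8 TB ~ 4.4 h, 30.72 TB ~ 17 h, 61.44 TB ~ 34 h: the blast radius grows with capacity.
```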
Well, yeah, you're right. We did. We put your card with four, actually five of the Solidigm
drives at different times because we did fail a drive. We did a rebuild while running workload against it.
And that was pretty compelling too.
And the work we were doing,
we were showing your setup versus mdadm software RAID 0. So we gave the SSDs all sorts of advantages in software. And the card, I mean, it just adds a lot of value, right? All the computational stuff you're doing, all the alignment with the drives, is really quite compelling. And it's really easy. I mean, we dropped it in, and I think we were running in tens of minutes or less, right? After getting all set up and software installed, easy stuff.
You talked before about going to an ASIC.
What does that do for you?
I mean, this card worked really well.
So I'm just curious kind of the way you guys are looking at the future and what else you're trying to do with the card and how an ASIC makes a difference for you? It makes a difference in a few different ways. First of all, as we ramp up
our customer base, production, etc., certainly an ASIC makes a difference in cost and density and geometry of the device.
So, for example, where we have multiple compression engines, with an ASIC we can throw a lot more engines at that. So speeds and feeds, the cost of deployment, managing environmental items like airflow, et cetera, so power, but also form factor. Today we are a half-height, half-length card. With an ASIC, we could be placed on a mezzanine type of backplane.
Think of us the way storage controllers were deployed in a system in a previous life: they were either on a card, or inside a drive, or on a backplane or a mezzanine card.
Having an ASIC helps with a lot of that, because ultimately you have more logic and more transistors available to you, but more importantly you can keep the cost lower and the power lower. So that's really what we're looking at with the ASIC. We will always have an FPGA line, and that helps us be more aggressive with additional computational-type functions, getting in quickly, but then always being able to pull those back into our mainline ASIC.
Huh? That's interesting. Okay. Yeah. I figured it was a cost issue,
but the additional form factor flexibility and power
flexibility is pretty important too. And I suppose, I mean,
I don't know what the long-term plan is,
but ultimately your technology could be embedded on something like a DPU in theory.
I don't want to speak to your roadmap, but to conjoin with some of these other technologies
that are revolutionizing the way things work from a composability standpoint or storage
access or whatever.
So yeah, that's pretty cool.
We tested in an Intel server.
Talk to me a little bit about lanes, because one of the big fights between AMD and Intel these days is all the lanes available, right?
We see these are U.2 drives, but now, in the server refresh that's about to be upon us from both Intel and AMD, we're going to see all sorts of new form factors.
E1.S is going to be a lot more prevalent, E3.S as well, and server architectures and designs are going to be a little bit different.
Your card doesn't really care because it's on the bus, so it doesn't really matter what's plugged in or what the size or shape is.
But is there any difference in lanes available to you from Intel and AMD?
Are there other architectural designs that the server guys are going through that might be impactful or interesting for you and your card? Interestingly, yes. Not so much with our card, or with the number of lanes that are available to our device, but more importantly with what you can get out of the SSDs. So let's take random writes, for example. Random writes are essentially the most challenging workload with SSDs, as you're familiar with. What if you can get the same expected random-write performance out of four Gen 4 drives with us versus eight Gen 4 drives without us? If you can do that, you essentially are giving lanes back to the system.
So as more lanes are becoming available,
there are more and more customers for those lanes.
So this problem of managing the fan out in the system,
it will always be there.
If there's an aggregator solution like ours
that is intelligently shaping the data
for the most optimal placement on NAND,
then we essentially are helping with distribution of those lanes
and fan out of those lanes.
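(The "giving lanes back" point is easy to see with hypothetical numbers; the drive counts, lane widths, and the assumed x8 slot for the accelerator below are illustrative only, not specified figures.)

```python
# If an accelerator lets 4 Gen4 NVMe drives deliver the random-write throughput you
# would otherwise provision 8 drives for, the difference is PCIe lanes handed back
# to the platform for NICs, GPUs, or more storage.
LANES_PER_DRIVE = 4
ACCEL_CARD_LANES = 8          # assumed x8 slot for the accelerator itself

without_accel = 8 * LANES_PER_DRIVE                       # 32 lanes of SSDs
with_accel = 4 * LANES_PER_DRIVE + ACCEL_CARD_LANES       # 16 lanes of SSDs + the card
print(f"PCIe lanes freed: {without_accel - with_accel}")  # 8 lanes returned to the system
```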
Form factor-wise, you brought up a couple of ones,
E1.L, E1.S, the OCP form factor,
all of those are interesting.
It goes back to what we talked about earlier. The way we're looking at it is, we can be purely in an aggregator mode, whether that's an add-in card, E1.L, or even an E3-type form factor, or we can be inside the SSD itself. All of those are ways to solve this problem that you brought up, which is, ultimately, how do you manage these lanes?
Yeah, I mean, the inflection that we're at right now, with PCIe Gen 5 on the edge. Most SSD vendors have a Gen 5 solution now, though not really widely available, because there's only so many places to put them until Intel and AMD formally launch their next-gen CPUs. But you've got that going on. You've got
all the new form factors. You've got CXL kind of on the edge, which is another thing that
may be interesting. What is your take on CXL? Does that do anything for you and the company?
We're keeping an eye on it. We certainly are looking at including that in our products.
Right now, I don't see it as a main rollout with a lot of products out there, especially storage for some time, maybe in the next two,
three years, but we certainly will be ready. What we see with that is some good impact on how you
can do peer-to-peer type operations on the PCIe bus and then some way of, again, managing logic on the PCIe complex.
So we like it.
We just don't know when it will get here. I wish I knew what the future is going to bring. Typically, people are not dead-on about the exact timing of deployment of these things. We've seen that story with ZNS. We've seen it with SR-IOV and others. It will come, I just don't know when.
Well, I mean, CXL is exciting. 1.1 is kind of the popular ratified version, but I think 2.0 has been accepted, and they're already working on 3.0. It moves so fast, that
space. And even when you look at PCIe, we're talking about Gen 4 to 5 now. I mean, 6 is done,
they're working on 7, but those are two-year cycles or longer if you go back to 3.
Right. It takes this industry so long to adopt new server technologies.
So I think we are quite a ways out.
Someone actually asked me last week, how far away are we from CXL?
I said, I think broadly two years at least.
And I was chastised for that.
And I could be wrong.
I'm happy to be wrong.
But I've just watched all the other
technologies come into play and it just takes a lot of time. So we'll see how that goes.
AMD, we didn't even look at that. You have compatibility there too? Does it make any difference, AMD versus Intel?
No, we have customers that use both. With AMD, one thing that a lot of people ran into was how to manage NUMA and NUMA balancing. We do that autonomously; we do a good job of figuring that out during power-up. And some of our customers were telling us, oh, we've used how you manage this elsewhere, and we've recommended it to the NIC guys as well. So I think with AMD, the only thing is how you get around that, but certainly once you do, it's a great CPU, a great product. And right now we see both of those deployed in the field, and it doesn't matter to us, just like it doesn't matter to us what the NAND type is or what the SSD type is. What we say is: use the cheapest SSD possible and the biggest SSD possible. Cheapest and biggest.
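(On the NUMA balancing point, here is a small, hypothetical Linux helper showing the kind of check an operator might otherwise script by hand: read which NUMA node a PCIe device hangs off via sysfs and pin a worker process to CPUs on that node. This is generic Linux plumbing with a made-up PCI address, not Pliops' autonomous balancing code.)

```python
import os
from pathlib import Path

def numa_node_of(pci_bdf: str) -> int:
    """NUMA node a PCIe device is attached to (-1 if the platform doesn't report one)."""
    return int(Path(f"/sys/bus/pci/devices/{pci_bdf}/numa_node").read_text())

def pin_to_node(node: int) -> None:
    """Restrict this process to CPUs local to the given NUMA node (Linux only)."""
    cpulist = Path(f"/sys/devices/system/node/node{node}/cpulist").read_text().strip()
    cpus = set()
    for part in cpulist.split(","):        # cpulist looks like "0-15,64-79"
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    os.sched_setaffinity(0, cpus)

# Example with a made-up PCI address for the accelerator card:
node = numa_node_of("0000:41:00.0")
if node >= 0:
    pin_to_node(node)                      # keep I/O submission threads NUMA-local
```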
Any action on ARM? I know HPE launched an Ampere server.
Those are gaining popularity, I think, in physical servers,
in addition to AWS's Graviton and all that kind of thing
for these web apps and new born-in-the-cloud type things.
Is ARM something you guys are looking at too?
We will.
We've started to already.
There's an even bigger momentum in China
with some of the hyperscalers there
to use ARM-based solutions.
So we are talking to some of them.
And for us, right now at least, as a pure server CPU it's still lagging versus x86, but I'm sure it will get there. So for us it will be just another compute device or compute capability. Now, having said that, you mentioned DPUs, you mentioned GPUs.
So ultimately, this complexity needs to be managed somehow.
So you're not relying on CPUs so much.
The CPU can be more agile, like an ARM-based solution,
but then you will have more dependency on accelerators and aggregators like ours,
like traditional GPUs and AI/ML ASICs.
So that orchestration requires some work.
But again, the smarter these aggregators and accelerators, and the more transparent they are, the better.
So we see that more on the storage end.
We are seeing that what we call a smart BoF is something that is perhaps imminent. That would just run with a DPU with some ARM cores on it that can run an OS, managing the storage with XDP, with the DPU handling the network piece and running the storage stack on its ARM cores. We see that; we demonstrated something like that with NVIDIA a few years ago.
We see that to be very relevant.
But I think it will start maybe in an isolated island like storage,
meaning a box that is only running the storage stack, nothing else.
When apps are going to run on ARM, again, I wish I knew exactly when, but it will come.
Well, I mean, we're seeing some of that already, right?
With VAST Data, with Fungible and their own DPU, VAST using BlueField. Really compelling stuff. And it does start to let you take a box of flash and truly be disaggregated and provisioned over the network to whatever systems need the thing,
and GPUs as well, and whatever.
We're really progressing to the point
where it doesn't make a ton of sense
always to have expensive resources, especially GPUs, attached physically to each bare metal
machine. I mean, it's a little monolithic in the way you lay it out. So anything you do to enable flexibility in how storage, DRAM, GPUs, and other accelerators are consumed, I think, is certainly the trajectory of where we're going. But you're talking about, too, a lot of ecosystem
stuff. So we were working in Linux. Do you have to get broad hypervisor support? Do you have to get
other things in place to really help this adoption
go? I mean, what's the story with VMware, for instance? We're about to hit their event in a couple of days here. They're still the OS of the enterprise, if you will. So what does that look
like to you? There's still plenty of demand for VMware support and ESXi support, so I think that's
not going to go away.
But there's simplification there as well.
Support for SR-IOV is becoming a must, at least for the storage device, for direct storage access in the VM itself. But in general, I think, at least at the scale of the data centers and the places we're going, we see orchestration and containers as the preeminent and dominant requirement.
Having said that, I think support for multiple functions,
having the kinds of requirements that a virtualized environment needs
is a must, whether it's VMware or the local flavor.
Yeah, well, I mean, that's certainly another point of disruption we're in the middle of, right? VMware transitioning to Broadcom. What's going to happen there? We're hearing from
a lot of people, they're looking at OpenStack. I mean, there's a thousand different other solutions to look at,
but there is a lot of excitement and energy right now around the whole IT architecture.
You have all sorts of new hardware coming, a lot of new software options.
So how do you best leverage those?
So that's part of the trick, too, is to figure out where those strengths are to tune
and develop your investment and where
you go with it accordingly. So, speaking of investment, when you're, I ran into you guys at Supercomputing in St. Louis and then probably one or two other places since then. When you're at events or when you're talking to prospects, I think intuitively they probably understand what you're doing pretty quickly, right? Because it's kind of like a better RAID card that lets you take advantage of lower-cost SSDs and all these other benefits. What does a POC look like for these guys? Are they remoting into your boxes? Because they probably don't have these 30-terabyte QLC drives sitting around to test with. How do your prospects check out Pliops and kind of figure out what you're doing?
So we do have an opportunity
for certain customers to log into our
labs and try it for themselves.
And we can tweak things from a configuration perspective
to match what they want, the drive capacity, drive type,
number of drives, CPU grade, gold, silver, platinum, whatnot. But for the most part, I would say 8 out of 10, maybe even 9 out of 10,
they are deploying the card in their environment specifically.
And they're finding out for themselves the benefits.
We did make an announcement with PhoenixNAP, a great company. We are offering XDP as bare metal as a service. So you can spin up an instance very quickly. They have a couple
of different configurations where you choose what you want: you bring in your own application, you pick the bare-metal-as-a-service option that you want, and in this case they do offer a SKU, they call it the accelerated version, with XDP. Again, you can spin up your Redis instances on that bare metal system and transparently get all the benefits of what we do in that solution, including protection, extension of life, acceleration, higher capacity, all the things that our platform does.
But again, there's no guessing
and needing to worry about tuning anything.
You just simply deploy your application on that system.
So that as a service thing is something that we're excited about.
We just started it with PhoenixNAP, but I think that is a model that will do well.
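("Just deploy your application" can be sanity-checked with an ordinary client. The sketch below uses the standard redis-py library against a made-up hostname; nothing in it is XDP-specific, which is the point, since the acceleration is meant to be transparent to the application.)

```python
import time
import redis  # pip install redis

# Hypothetical endpoint for a bare-metal instance you provisioned.
r = redis.Redis(host="bmaas-instance.example.com", port=6379)

N = 100_000
value = b"x" * 1024                        # 1 KiB values
start = time.time()
pipe = r.pipeline(transaction=False)
for i in range(N):
    pipe.set(f"key:{i}", value)
    if i % 1000 == 999:
        pipe.execute()                     # flush in batches of 1,000 commands
elapsed = time.time() - start
print(f"{N} SETs in {elapsed:.1f}s ({N / elapsed:,.0f} ops/s)")
```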
Well, I mean, anything you can do to reduce barriers, right?
It's always tricky with physical things,
especially now with some of the challenges.
I guess it's gotten a lot better in most of the world,
but there was a time where data center guys
couldn't even get into their own data centers, right?
It was the access challenges. I'm sure that's loosened up,
but again, you know, anytime you can make these accessible otherwise, that's pretty strong. So
you referenced it. We did do a really deep technical dive on the card. So we'll link to
that in the description and in the video for those consuming the podcast that way. Where else can people go
to check out the card? Where do you want to send them?
Best place is pliops.com. There are a lot of references and stories there, and solutions and blogs, and that's a good place to go, including how to basically take this for a test drive. We have loaner cards, and we have, like I said, a system in our lab that people can log into, but ultimately the PhoenixNAP partnership and bare metal as a service is also a good option.
All right, we'll put links to both of those, the bare metal solution and your website. And let me talk about loaner cards: I still have this one here. I don't think you've asked for it back yet, so we'll probably hang on to it for a little while longer and keep playing. But yeah, I mean, from our standpoint it was easy to drop in and use. Performance is great. So many more benefits than software RAID, and it really doesn't block the way hardware RAID does. If I had to summarize the whole thing in 18 seconds, that would be it. But
anyway, this is great. Thanks for doing this. Appreciate you joining and
look forward to seeing you soon. Same here. Thanks very much for your kind words, and we appreciate all the attention you've given us.