Podcast Archive - StorageReview.com - Podcast #115: Is Computational Storage Just a Fancy Name for Storage?
Episode Date: January 31, 2023. Brian sat down with ScaleFlux’s JB Baker for an interesting discussion about computational storage.
Transcript
Hi everyone, Brian Beeler with Storage Review. Thanks for tuning into the
podcast. I've got JB Baker with us today from Scale Flux and we're going to talk
a little bit more about this category of flash SSDs called computational storage.
And JB is already gagging in the back of his mouth a little bit and I only bring
it up because that's how a lot of the world knows these drives, but
we're going to get into that, talk about the category a little bit and much, much more.
JB, thanks for joining us today.
Hey, thanks for having me.
I appreciate the opportunity here, Brian.
All right.
So let's just start with why do you hate the term computational storage?
Before you answer that question, what is computational storage?
And then let's get into your drive a
little more. Yeah. So, you know, ScaleFlux was founded back in 2014. And the mission at the time
was going after computational storage. And the whole concept of computational storage
is around improving the data center, the data infrastructure efficiency
by pushing some of the compute tasks down closer to the storage, closer to the data,
and doing work on data closer to where it lives. So that frees up those valuable CPU cycles, your x86-type processors or now the ARM processors,
freeing up those cycles to run your applications
as opposed to doing heavy management tasks
that can be very burdensome and add latency to your application,
cause more data movement,
and just generally slow things down and reduce your efficiency.
So broadly, computational storage was any, and tell me if this is too broad, but any
SSD that had some other hardware-enabled engine, maybe even hardware is too specific, in the
drive, though, to do
some other task to ideally offload cycles from your CPU. Yeah, and actually the way that the
industry came about talking about computational storage was not just down at the drive level,
but also through the Storage Networking Industry Association. ScaleFlux was one of the kind of initial partners or founders
in the computational storage working group in SNIA.
And as the group sat down four-ish years ago,
and in that formation, it was, well, there's computational storage drives.
There are computational storage processors.
There are computational storage arrays.
All of these are different ways in which you can move these functions away from the CPU and either put them at an array level, or put them in a processor that might sit between the CPU and the drives, or push functions all the way down to the drives. And that huge scope led to some of the confusion around how people understand
computational storage, and the variety of ways in which all of us vendors who were working in
it talked about it. And so, you know, you asked earlier, why do I hate the term
computational storage? I don't hate the term, but it is a term that people get confused about and can
feel like, oh, it sounds complex. It sounds like something futuristic, way out there.
Great benefits if I can slash how much data I have to move across my network by 90% or if I can boost my application performance 2x, 4x, or I can get more data into my drives, but complex.
And so that's where I have the little bit of shock around just the term itself. Well, the more you describe it, actually, the worse I think it gets, because if you're
going to include the entire ecosystem, which includes probably DPUs to some extent, depending
on the flavor, GPU-accelerated RAID storage, like the GRAID card, the Pliops solution, and
many others that are solving these problems.
I mean, we've seen computational storage companies start and fail already.
I mean, this has been some period of time.
But yeah, I think for the point of this conversation, we'll focus on the SSD itself and what you
guys are doing there and what that category looks like.
But for me as an industry analyst person, it's really hard to keep track.
And I'm very close to the industry in terms of what you guys are doing versus what competitors are doing
versus what some of the silicon guys are trying to do to get their accelerators added into these other spots.
And it's difficult because it creates a lot of confusion, I think. And I mean, you talked about
SNIA and the working groups are really good at a lot of things. But sometimes,
marketing communications is not always one of them, unfortunately.
And there's so many constituents in there. I'm sure even in the working group you're
part of, there's probably at least a couple dozen, well, maybe not a couple dozen, maybe a dozen
players. And each one of you has different core tenets to your products, right?
Yeah. I've actually kind of lost track of how many different companies are participants in
the computational storage TWG. I think it's definitely over 60.
Oh, okay, maybe, okay, so a couple dozen.
And some of them have, you know,
some of them don't have necessarily
computational storage solutions,
but they are travelers and care about what goes on there.
Great, so you've got 48 people involved
that don't have a product, but
have a selfish motivation one way or the other to influence the outcome. All right.
I don't want to get into the working groups cause we'll spend an hour on that and still just be
pissed off at the end. But your drives. Yeah. Yeah. I mean, so we launched our third generation of product in 2022. We call it the CSD 3000 and the NSD 3000.
The first two generations were based on FPGAs.
We were using Xilinx FPGAs and that allowed us to do that rapid prototyping and put out some computational storage functions and start
selling some drives.
And as we went through the first generation of product, we learned that, hey, it's got
to be simple to use.
You can't force somebody to do application integration in order to take advantage of
the function that you put in the drive.
The primary function in that one was a compression offload engine.
But it was kind of complex to use where you had to push your data down to the drive.
It would do the compression offload.
And then you send the data back up to the host.
And then if you wanted to write it to the drive, well, that was a separate function.
So it was kind of a, we combined two separate devices into a single form factor,
but complex to use. In the second generation, we retained the compression function and made it
transparent. So now there's no replacing of libraries in the host, there's no application integration, you plug the drive
in and it boots and you get that transparent compression.
But there was still a little bit of complexity in the Gen 2 in that we had a Scaleflux driver
instead of just the inbox NVMe driver.
And that's one of the big things going to the third gen is now we continue to
make it easier to use and as a direct substitute for another NVMe drive where we transition to
ScaleFlux Custom ASIC. Everything is on chip on the drive and we're using the NVMe drivers.
So now there's not even any software to install. You plug in our drive,
uses the inbox NVMe driver, the compression just happens automatically. And it's all transparent
and easy to use. So that was our learning as we worked with customers: simplicity, simplicity,
simplicity, because those overburdened people in IT ops and hardware infrastructure,
they just don't have the time to deal with that complexity and things.
And then on the application side, they don't want to do application customizations.
It's never worked.
For some unique drive? Yeah, it's never
worked. I mean, that goes back a decade to when Fusion-io was disrupting the enterprise flash
space and they had these IO drives that were fantastic and they were really great. And they
always said, if we could just get the database guys to tune their code for flash instead of hard
drives, think about all the, this is a story that's as old as time, right?
There'll be some certain niche applications.
There'll be some in-memory databases that'll tune for something new like PMEM whenever that rolls around or CXL or whatever the new high-speed hotness is, right?
But for what you guys want to do for enterprise mass adoption, it's got to be easy.
It's got to just work.
And it's got to be compatible with enterprise stuff like VMware, which I know you guys have as well.
So the drive for anyone watching on YouTube, it's on my desk here.
It's an SSD, right?
And from the outside, besides being maybe a little heavier than other drives that we have in the lab,
I mean, it just looks like an NVMe SSD.
And I guess that's to your point, how you want to be characterized.
And to a certain extent, why the computational storage halo still technically applies,
but maybe not, you know, you don't want to be seen as that first.
You want to be seen as an SSD first, which makes sense.
That can also handle some of the compression and data reduction in the drive while saving the cycles on the CPU side.
And that's always been a challenge, whether it's a storage array or software-defined anything. You might even remember Permabit was running around with a Fibre Channel appliance that would do
compression, because it's expensive, not just cost-wise but in CPU cycles. Even still today, I
mean, we've seen solutions that'll have a 20-30 percent hit when you enable that. So if you don't have hardware acceleration
native to the array or the server or whatever,
in whatever form,
you're going to eat a substantial performance hit.
And when you spend the big money on those CPUs
with the big cores and clocks and everything,
that's just an unfortunate way to allocate your resources.
Yeah, I mean, there's a couple aspects of that.
The compression algorithm, it's a fixed function algorithm,
but it's something that,
and each one you use has a different burden on the CPU,
but I'll just kind of bookend it,
with LZ4, a lightweight compressor.
So you're sacrificing how much you compress the data in order to get more throughput out of the software through the CPUs.
But even that, we look at and we've seen that it can take several high-end Intel Xeon cores to achieve the throughput to fill one NVMe SSD. So now let's say, you know, if you
have a 48 core system, and you got one drive in there, okay, maybe you can dedicate the four cores
to compressing data at line rate to be able to fill that drive. But now you put a second drive in
and a third drive and a fourth drive.
You know, pretty quickly you run out of cores. It doesn't scale very well. And if you wanted to
actually maximize how much you compress the data, then you'd be using like a GZIP. And well, that
takes hundreds of cores versus doing that compression. And now you're also talking tens of or hundreds of watts at that point to do that compression
versus putting it into a hardware engine that can deliver six gigabytes per second of compression
throughput per drive at a cost of, you know, around a watt.
So, you know, do you want to spend hundreds of watts and tens of cores and hundreds of
thousands of dollars on cores to do compression? Probably not.
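To put rough numbers on that trade-off, here is a minimal back-of-the-envelope sketch in Python. Every figure in it (per-core compression throughput, per-core wattage, the rate needed to fill a drive) is an illustrative assumption, not a measured or vendor-published number.

```python
# Back-of-the-envelope core-count math for software compression vs. an
# in-drive engine. Every figure here is an illustrative assumption.

LZ4_GBPS_PER_CORE = 1.5      # assumed software LZ4 throughput per CPU core
GZIP_GBPS_PER_CORE = 0.1     # assumed software gzip throughput per CPU core
DRIVE_WRITE_GBPS = 6.0       # write rate needed to keep one NVMe SSD busy
WATTS_PER_CORE = 5.0         # assumed active power per server core
DRIVE_ENGINE_WATTS = 1.0     # in-drive compression engine, roughly a watt

def cores_needed(gbps_per_core: float, drives: int) -> float:
    """CPU cores required to compress at line rate for a given drive count."""
    return drives * DRIVE_WRITE_GBPS / gbps_per_core

for drives in (1, 4, 8):
    lz4 = cores_needed(LZ4_GBPS_PER_CORE, drives)
    gz = cores_needed(GZIP_GBPS_PER_CORE, drives)
    print(f"{drives} drive(s): ~{lz4:.0f} cores / ~{lz4 * WATTS_PER_CORE:.0f} W with LZ4, "
          f"~{gz:.0f} cores with gzip, vs ~{drives * DRIVE_ENGINE_WATTS:.0f} W in-drive")
```

The point is only the scaling: the software compression cost grows with every drive you add, while the in-drive engine stays at roughly a watt per drive.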
No, you'd rather do it somewhere else if you can, right? I mean, that's your big value prop. And so when you think about
compression, I guess the way it translates out the other side is this is a, I don't know,
I think this one's a 7.68, an 8-terabyte-class drive.
On average, if you look at what the big array guys quote for enterprise workloads like a Pure or Dell,
I think they're like a 4-to-1 guarantee normally on their data reduction rates.
So maybe that's about what you see in the enterprise.
But I'm curious, what do you see?
And when you're going to market, do you talk to your customers like, hey, this is an 8-terabyte drive,
but based on your workload or workloads similar to yours, we think it's more like a 26-terabyte drive or a 38-terabyte drive.
Is that how you communicate your value?
Yeah, there's certainly two critical aspects of the value.
One is that extended capacity. Very
soon we'll be supporting up to 24 terabytes of logical capacity, so you can take that
eight terabyte up to 3x its capacity. We'll support up to 4x capacity on the four
terabyte, so, you know, being able to store 16 terabytes of data on there. And, you know, you may have noticed
me kind of smirk when I hear the compression guarantees or the data reduction guarantees.
All of those guarantees, as I've read the asterisks and the footnotes on them,
they all say it depends on your workload, and they assume that you're not pre-encrypting the data,
that you're not pre-compressing the data. So there's a lot of caveats around those.
There are, yeah.
And so, you know, the compression that you achieve is definitely going to be
impacted by the data that you run and what workload you're running. Now, as we've gone out and done testing
with customers and they report back what compression ratios they're achieving, we are
typically seeing that the compression ratio is greater than two to one for any of the database
applications. We do a lot of work with Aerospike, MySQL, RocksDB, Postgres. We've done work with customers on Kafka
and other applications, but typically they're seeing,
the reports that we typically hear are 2.5 to one, up to five to one, with some cases like Redis on Flash,
it's been nine or 10 to one.
But if you send me encrypted data,
well, I'm not going to compress.
Or if you send me data that's been pre-compressed with LZ4, I'm only going to see about an extra 20% compression.
But that has a massive improvement on the latency consistency and the mixed workload performance that you would see versus using an ordinary drive.
Yes, I want to go there next. So I think you start out with the,
the big pitch around the data compression or data reduction and the,
the capacity benefit, cause that's the easiest one,
I think to kind of get people to wrap their heads around. But yeah,
you just started to go there. There's a performance benefit too.
So talk about that a little bit in terms of
Gen 4 headroom and what you guys can do there. Yeah, so you know, and you guys have done testing with the drives and you see that when the data is compressible, which is, as you said, that's kind
of the typical enterprise case that we run into, it allows the drive to have significantly better write
performance up front.
And that, there's a virtuous cycle that happens in the drive of those hot writes require less
data to actually be written to the NAND, and then when we do that, now there's more free
space on the NAND.
So as the drive starts to fill up or as an ordinary drive would start to fill up, our drive still stays relatively empty.
And you've got higher effective over-provisioning on the back end that reduces the amount of cold writes you have to do.
And that helps with achieving your reads and consistent read latency when you've got
a mixed read/write workload. I know I tend to auger down into the details.
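To make that virtuous cycle concrete, here is a minimal sketch assuming an 8 TB drive, roughly 7% baseline over-provisioning, and a 2:1 compression ratio; all three figures are illustrative assumptions, not numbers from this discussion.

```python
# Minimal sketch of the effective over-provisioning effect with
# compressible data. Capacities, baseline OP, and the 2:1 ratio are
# assumptions for illustration only.

PHYSICAL_TB = 8.0        # usable capacity of the drive (illustrative)
FACTORY_OP = 0.07        # assumed ~7% baseline over-provisioning
COMPRESSION_RATIO = 2.0  # assumed ratio achieved by the workload

def effective_op(logical_written_tb: float) -> float:
    """Spare NAND relative to valid data once compression is applied."""
    nand_used = logical_written_tb / COMPRESSION_RATIO
    spare = PHYSICAL_TB * (1 + FACTORY_OP) - nand_used
    return spare / nand_used

# A logically "full" 8 TB drive at 2:1 still has over half its NAND free:
print(f"effective over-provisioning: {effective_op(8.0):.0%}")
```

More spare NAND per unit of valid data means less garbage-collection traffic competing with host I/O, which is where the steadier read latency under mixed workloads comes from.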
No, that's good. And you know, we, we saw that in the single drive testing,
so it's, it's clearly there.
How does that scale then and then maybe talk a little bit about how your
customers have scaled. So
I'm sure that in many of your POCs, it's like, here, take a couple of drives,
do some stuff with them so you can kind of see this and feel this. And that's probably the
starting point for a relationship with the customer. But where do you see them going? Are they going into software defined storage solutions?
Are they going just as addressable drives
in VMware or something else?
Like what kind of use cases are you seeing?
Yeah, it really is a broad array.
I'd say that the leading use case
is to put us in a compute node.
So if you've got a database compute node,
that's been the primary use case, and they'll have anywhere from
one to four drives in there, as a general rule.
We do have customers that are putting it into
the shared storage that is being addressed by multiple compute nodes.
There, now you've got 8, 16, or more drives
in that array. So it really does kind of go
everywhere. And there's nothing stopping you,
I suppose, technically from being qualified and included in a
typical two-node storage SAN, right? There's nothing
fancy there, or is that more challenging?
Well, the only gap for a storage array at this point is the dual port. So I don't offer
dual port today. That is in the works for the second half of this year.
Although even the array guys, honestly, a lot of them are disassociating their software from the array so that they can go
cloud-defined, server-defined, you know, and go that route, because of the economics there.
You know, if you're a Dell or HPE or somebody like that, or Cisco, that has a vast server platform,
to continue to engineer the hardware side,
I'm not sure that's going to continue a whole lot longer.
Yeah, I mean, we're definitely more focused there
in the server side, as opposed to the storage,
the high end storage array, that kind of falls into that,
going back to the SNIA definitions in a way
that those are already computational
storage arrays. They have dedicated hardware and software solutions to do not just compression
but data dedupe and compaction and snapshotting and all this other functionality, and there
would be some refactoring of software to leverage in-drive compression for some of those arrays, for the traditional ones.
New players, where they're trying to use DPUs as the primary processor for an array, well, yeah, now the compression in-drive becomes pretty important.
Because they don't have these other dedicated
hardware pieces to do that, those functions. Well, right. The big one, Fungible, who we know
pretty well and they recently were acquired, they could do some pretty creative things. And some of
the other guys that are DPU based, Vast is doing that. There are
others that are looking at some of these technologies that can layer,
I suppose, and make for an interesting
conversation there. Yeah, and it always comes back to
well, how do you want to allocate
your compute cycles and where do you want to put them
to make your system the most efficient, the most performant overall? So how is going to the ASIC?
I mean, I understand that's the progression, right? As everyone starts with an FPGA and the
holy grail is to hopefully have enough money to make it to an ASIC because at that point,
you've got exactly what you want but still
tunable, right, if you need to a certain extent. So what has that done for you guys in terms of
the growth of the company or applications you can support or what are the other impacts?
Sure. I mean, I think the biggest thing with going to an ASIC is that it allows the product to have a much richer feature set.
With an FPGA, we were still in a U.2 form factor. So that restricted us to
a relatively smaller FPGA to stay within the power envelope and the space envelope of a U.2.
And so that, in the prior generations, that prevented us from doing things like adding
encryption into the drive.
And then as you try to ramp up to PCIe Gen 4, Gen 5 speeds, there's just not enough gates in an FPGA that will fit within the power envelope of U.2,
or as we move into the E1.S and enabling that, you just can't get an FPGA that's small enough
physically with enough gates to do something interesting.
Well, that's interesting. I was going to let you off the hook on form factors, but since you brought it up, Gen 5 obviously with Intel's official launch this week, although
Sapphire has been out there for many weeks, and Genoa not so long ago, that opens up some
interesting things for Gen 5, but now we're seeing all the servers, well, we're seeing two things.
The hyperscale servers are going E1.S primarily, although they may still use some E3 as well.
All the big enterprise guys are going E3.S, though, on the server side. And most of those designs are
7 mil. I mean, that's less than half of the thickness of the current U.2 or U.3 drive.
Can you fit into a 7 mil form factor? And how does that, I mean, is that a harder challenge
for an ASIC versus a standard SSD controller? I'm curious about that.
Well, so, you know, just to be clear, our ASIC is similar in size overall to a standard SSD
controller. Okay. And as we move into our Gen 5, then we'll do a process
shrink, and we will shrink, you know, the package as well. Our current
chip, the 3000 series chip, we actually can fit into the E1.S form
factor. We have not delivered that to the market yet.
We focused on, you know, as a startup,
I got to try to keep my number of SKUs limited until I've got significant
pull.
So we focused on the highest volume SKU or form factor right now,
the U.2.
Well, yeah, I mean, slot-wise, even OCP will concede
that despite all the EDSFF excitement, U.2,
at least through this year,
and I would guess probably kind of deep into next year,
is still the predominant form factor.
But E1.S really opens up a lot of hyperscaler conversations, and we don't really need to go there today, but I guess the cloud guys can also benefit from this, maybe even more so with full control over their own stack, should they choose to embrace a product like this, right?
Right, right. Yeah, there is a lot of potential there in portions of their environment.
Yeah. In other portions of their environment,
data comes encrypted and compressed before they see it,
and you can't do a lot there.
So the big question, I think, then,
is if we're thinking about this as an SSD,
as a 7.68 terabyte or a 4 terabyte class or whatever it is,
I see the benefit to the capacity argument, so I'm getting more,
theoretically, capacity. Good performance profile. You don't have to talk hard numbers,
but generally speaking, if I were to compare this cost-wise to other 7.68s in the industry,
I assume you're a little more. I have no idea, so I'm just a little curious as to what the pricing looks
like.
Yeah. I mean, it's, I don't want to talk any absolute pricing.
No, no, I understood.
If you buy more, it'll cost less.
Right.
But yeah, I mean, we're aiming the
performance of the drive,
even when you're not using the extended capacity, you know,
we're seeing that we're at the upper end.
We're aiming at that performance segment, not the lower end, I don't know, what do you
want to call it, entry or low end data center segment.
We've aimed at this as a premium performance drive.
So we are aiming to be price competitive with the other drives up in that swim lane, as we call it. And then
when you want to use the extended capacity, then you're going to buy the CSD. If you're not going
to use extended capacity, we have the NSD SKU, which doesn't offer the capacity expansion.
But the CSD now, yeah, I'm going to charge a 25, 30% premium over those other drives because the
reason you're going to buy it is because you're going to take it and use it as twice its capacity.
So on a raw capacity basis, sure, it's a little more expensive, but it actually is costing you
30, 40% less than buying, say, a 16 terabyte drive versus our 8 terabyte drive.
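The rough math behind that claim, with a placeholder price per terabyte and an assumed 30% premium; neither number is a real quote.

```python
# Rough version of the cost argument: an 8 TB drive carrying a premium but
# run at 2x effective capacity vs. a plain 16 TB drive. Prices are
# placeholders, not quotes.

PRICE_PER_TB = 100.0   # hypothetical $/TB for a standard NVMe data center SSD
PREMIUM = 0.30         # assumed premium on the compressing CSD

std_16tb = 16 * PRICE_PER_TB
csd_8tb_at_2x = 8 * PRICE_PER_TB * (1 + PREMIUM)

savings = 1 - csd_8tb_at_2x / std_16tb
print(f"same 16 TB effective capacity for ~{savings:.0%} less")
```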
Or arguably more, because you're just talking about the hard cost of the drive,
but now you've got fewer slots, conceivably, less power in aggregate,
and all of these things, fewer rack U.
I mean, this is a big push in the U.S. for sure, but in Europe,
I mean, where energy expenses are going through the roof, if you can
take your footprint, even a small data center, down a couple of rack U and down many watts, it's an
enormous savings. Yeah. And then, I mean, we're doing a lot of testing and work in different
system level environments to truly highlight those benefits
in terms of not just raw capacity efficiency,
but that system level efficiency,
reducing the overall power that it takes you
to achieve a certain workload
and deliver a certain amount of work
back to your application owners.
Yeah.
So let's
think about this a little bit more too, from getting started with your drive.
If you sample one out to a prospect or a customer,
they want to play around with it.
How do you help them visualize or come up with a test plan
to see, like, I don't know,
just drag a folder of files over that was,
there was a hundred gig or two hundred terabytes or whatever it is, and now, you
know, I can look at the drive utilization. How do you get them to see
that, to understand the benefits there? Yeah, I mean, it varies a little bit,
but typically users will start off with something simple like FIO or as you guys do with VDBench.
And there are settings and we have scripts that we can provide out for FIO for people to see different levels of compression and what that does to the performance profile. And then it's just a query into the SMART attributes in the drive that says, what was the host terabytes written?
And what's the NAND terabytes written to show you how much capacity was saved?
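A hedged sketch of that check using nvme-cli: the host-written figure comes from the standard NVMe SMART log's data_units_written field, while the NAND-written counter is vendor-specific, so the value below is a placeholder you would pull from the vendor's own log page or tooling.

```python
# Sketch of the SMART comparison described above, using nvme-cli's JSON
# output. "data_units_written" is a standard NVMe SMART field (units of
# 512,000 bytes); the NAND-bytes-written counter is vendor-specific, so
# the value below is a placeholder.

import json
import subprocess

def host_tb_written(dev: str) -> float:
    out = subprocess.run(["nvme", "smart-log", dev, "-o", "json"],
                         capture_output=True, text=True, check=True)
    units = json.loads(out.stdout)["data_units_written"]
    return units * 512_000 / 1e12

host_tb = host_tb_written("/dev/nvme0n1")
nand_tb = 7.5  # placeholder: vendor-specific "NAND terabytes written" counter
print(f"host: {host_tb:.2f} TB, NAND: {nand_tb:.2f} TB, "
      f"effective compression ~{host_tb / nand_tb:.2f}:1")
```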
Yeah, you probably need a little widget for vROps or something for the VMware people to have just that little pie chart that says, you know,
you've done this much and this much has actually
been used on the drive.
That's a good idea. We'll do that. What we have done, by the way,
is integrations,
plugins for Nagios and Prometheus,
which are two pretty common server management tools.
So that what they can see on the pane there
is the remaining physical space
instead of just the remaining logical space.
Because every other drive out there,
logical equals physical.
So we've done that integration to enable people
to make it easier for them to manage that extra capacity.
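The reason that matters: with compression the logical and physical gauges drift apart, and physical space is the one that can actually run out. A tiny illustration with made-up numbers (none of them come from Nagios, Prometheus, or the drive):

```python
# Illustrative only: remaining logical vs. remaining physical space on a
# drive exposing 2x logical capacity. All numbers are made up.

LOGICAL_TB = 16.0    # advertised (extended) capacity
PHYSICAL_TB = 8.0    # raw capacity behind it
logical_used_tb = 10.0      # what the filesystem thinks it has written
compression_ratio = 1.5     # what the workload actually achieved
physical_used_tb = logical_used_tb / compression_ratio

print(f"remaining logical:  {LOGICAL_TB - logical_used_tb:.1f} TB")
print(f"remaining physical: {PHYSICAL_TB - physical_used_tb:.1f} TB  <- the one to watch")
```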
Well, that's pretty cool.
I mean, we've been working with you guys for a number of years
through the generations of product, and this current one is pretty fun.
So what is the process if someone listening to us today or watching the pod,
if they want to check out the company ScaleFlux,
learn more about the drives, what do they do?
You can just, we have www.scaleflux.com.
And right from there, you can hit the contact us button
and request a POC.
Or if you want to just go directly, just type in,
send an email to info at scaleflux.com.
And we're monitoring all of that actively
and we will get a response back to you quickly.
Yeah, and I would encourage anyone
in the categories that we've discussed today
that can take advantage of something like this
to check it out because these guys really are
one of the most credible players in
computational storage even though we've discounted that term a little bit. So
many of the other products out there either the companies have gone under or
they're so niche that it's really hard to understand how to use them in a
standard environment. For this to be able to just drop into a virtualized
environment, VMware off and running, great.
If it's a software-defined or a server-based storage situation,
great there too.
So it's certainly worth checking out and understanding
how this kind of technology can impact those workloads.
Very well said. I appreciate it.
Good. Thanks for doing this, JB.
Look forward to seeing you again soon, buddy.
All right, thanks, Brian.