Grey Beards on Systems - 80: Greybeards talk composable infrastructure with Tom Lyon, Co-Founder/Chief Scientist and Brian Pawlowski, CTO, DriveScale
Episode Date: February 26, 2019. We haven't talked with Tom Lyon (@aka_pugs) or Brian Pawlowski before on our show, but both Howard and I know Brian from his prior employers. Tom and Brian work for DriveScale, a composable infrastructure software supplier. There's been a lot of press lately on NVMeoF, and the GreyBeards thought it would be a good time to …
Transcript
Hey, I'm Ray Lucchesi, here with Howard Marks.
Welcome to the next episode of the Greybeards on Storage podcast, the show where we get Greybeards Storage bloggers to talk with system vendors to discuss upcoming products, technologies, and trends affecting the data center today.
This Greybeards on Storage episode was recorded on February 20,
2019. We have with us here today Tom Lyon, co-founder and chief scientist, and Brian Pawlowski,
CTO of DriveScale. So Tom and Brian, why don't you tell us a little bit about yourselves and what DriveScale is doing these days? Well, this is Tom. I fit right in with the Greybeard theme because I've
been out here in Silicon Valley for 41 years now and I'm on my fifth startup. My
first one was a little company I had the pleasure of joining as employee number
eight which went on to be known as Sun Microsystems. Nothing important. So that
was good fun. So I've been trying to,
I've had the startup fever ever since. So this is my fifth startup. Sold one to Nokia,
sold one to Cisco. The one we sold to Cisco was Nuova Systems, and we were responsible for
developing both some 10 gig switches, but also the UCS server product line. And I was the founder there.
DriveScale has been around about five years now,
and we have lots of other interesting people here like Brian.
Hi, Brian Pawlowski.
My background is primarily storage.
Actually, I started working on distributed file systems back when I knew Tom at Sun Microsystems,
that small company in the 80s.
But then I left there and, in a 21-year fit of lack of imagination, became employee 18 at NetApp,
and worked at developing products for that company for a very long time.
Spent some time at Pure Storage afterwards as chief architect
and then wanted to go back to my roots in a smaller company
and here I am at DriveScale.
So the startup fever has gotten to both of you guys.
Yeah.
So tell us what DriveScale is doing.
Well, we're doing what we call composable infrastructure, which is a term that's being adopted by a fair number of other companies now as well.
For a fairly wide variety of things, too.
Yeah, so it's not quite gelled as a really good definition, but we keep trying.
Okay. What do you guys mean by composable?
What we mean is we let you compose what are essentially physical servers
from component parts.
And that's somewhat a difficult concept for some people.
But, you know, historically, storage either came inside a server
or it came as part of some complex storage subsystem.
And that was how you separated storage.
But more and more those complex storage systems are just servers themselves with a whole lot of software.
And if you want to look at simplifying servers overall, you really want to get the storage out of the server, but without putting it in just another server.
And in fact, this is true for things beyond storage.
So GPUs and FPGAs and even memory eventually.
So it's all about building far more flexible and efficient server infrastructure.
Okay, we have to come back to that.
Yeah, memory, networking, GPU, and stuff like that.
Not all of that stuff is, I'll call it PCIe connected.
Well, no.
And in fact, we're not pushing PCI Express as the interconnect.
We're pushing plain vanilla Ethernet in its high-speed variations.
That's unusual, in my mind, for a composable infrastructure company, would you say?
Well, it takes a bit of understanding, but you have to understand what's really possible with networking and get beyond some of the belief systems that were never quite correct.
You know, like Ethernet is high latency.
Well, some of those things were true 20 years ago.
Right, right.
Okay, so you guys let me allocate storage devices across some Ethernet fabric to servers dynamically.
Right.
And so that's the base level of composability.
Then there's another level of composability we have,
which is basically constructing clusters for you
from any number of servers of your specification
so that you don't have to think about exactly which servers
with which drives make up part of your cluster.
Okay.
So, I mean, clusters and stuff like that seem to me more of an operating system level function
than a composable infrastructure function.
So you actually operate across racks then?
Yes.
And what was I going to say?
Yeah, a lot of what we do has to do with our target market space, which is not the mainstream VM world.
We're going after the big data analytics scale out bare metal type things where they consume almost entirely direct attached storage.
They rarely have a hypervisor involved.
And they almost always run Linux.
And
they always run Linux.
So it's a little
different focus from a lot of things.
Most of those guys
run Hadoop or some variant of
that sort of thing that kind of manages
the cluster and the storage
associated with each of the servers
and stuff like that.
How does your solution differ
from something like a Hadoop environment?
Those are our client applications, basically.
If you look at the world of software defined storage,
scale out file systems,
scale-out NoSQL, blah, blah, blah,
it's a whole new kind of way of writing storage systems.
You know, it's called cloud-native or stuff like that.
And all of these things just want to consume direct-attached storage,
and those are our client applications.
Yeah, because they all build shared-nothing clusters out of that direct-attached storage, right? Or shared.
Yeah, the storage management happens in the application, and the replication happens in the application, and the snapshotting happens in the application. So I call these things application-specific storage.
And, of course, what's interesting is the actual hardware drives, you know, there's only three disk manufacturers left in the world.
And there's only five or six SSD manufacturers.
You could argue two and a half.
Let's not go there.
And so the hardware is totally commoditized.
But every single thing you put in the data path.
Wait a minute. You guys work with disks?
Absolutely. In fact, we got started with disk, but now we've been shipping NVMe over fabric product for about a year as well.
What's a disk look like at the end of an Ethernet cable?
Very much it looks like a disk that's inside your box because we essentially are taking established reference architectures that use direct attached storage, an embedded disk, and transparently replacing it.
So these devices look like they're local.
The applications need that locality because they've been designed to work efficiently
with local disk.
I mean, that runs the gamut of the Hadoop applications, the NoSQL architectures, and even the SSD-based
applications like Aerospike.
They all depend on this local architecture.
So you've got a disk driver that loads into Linux that attaches over Ethernet to the disk?
Yeah, it's called iSCSI, but don't tell anyone. Oh, God.
It is.
It is called iSCSI, unless, of course, you're running NVMe and you have end-to-end capabilities
there, whereupon our system will just figure out the best and highest-performance
path between the two endpoints, the node in the cluster and its storage,
since each node owns its own storage. It'll transparently choose the optimal data
path. But the thing that's interesting about the
DriveScale solution, that I found interesting when I started, is that
I'm no longer actually working at a storage company.
If you look at our composer and just make a quick cluster
and associate compute with disk,
you're not asked any questions about iSCSI.
It never pops up in the dialog.
God, you don't have to define iSCSI initiators
and all the magic phrases that connect these things
and all that stuff?
No, and you don't know how to actually enable encryption
on Linux boxes either, because we'll do that for you
if you ask us to with a radio button.
So it's all – basically –
So I don't have to write 5,000-line Chef recipes?
That's right.
Yeah, and that's a key thing about our solution is we don't give you a spreadsheet full of LUN addresses to deal with.
We actually do all the plumbing on the server side so your cluster is ready to roll.
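For readers curious what that plumbing looks like when done by hand, here is a minimal sketch of attaching one remote drive over plain iSCSI with open-iscsi's iscsiadm, driven from Python. The portal address and target IQN are invented placeholders, and DriveScale's composer performs the equivalent steps (plus multipathing and the rest) for you.

```python
# Minimal sketch: hand-attaching one remote disk over iSCSI with open-iscsi.
# The portal IP and target IQN are illustrative placeholders, not DriveScale's.
import subprocess

PORTAL = "192.0.2.10:3260"                    # hypothetical eBOD/bridge portal
TARGET = "iqn.2003-01.com.example:disk0042"   # hypothetical target IQN

def run(*args):
    """Echo a command and fail loudly if it errors."""
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# 1. Discover the targets the portal exposes.
run("iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL)

# 2. Log in to the target; a new /dev/sdX block device appears, which the
#    application sees as ordinary direct-attached storage.
run("iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--login")

# 3. (Teardown) log out when the drive is recomposed elsewhere.
# run("iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--logout")
```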
Okay, now I was all confused about this.
I was confused too. So these disks are sitting
out there in a JBOD
or something like that? I mean, you must connect them to some
sort of compute layer
server thing, call it storage or something.
There's two approaches. The newer approach is
through what we like to call eBODs, you know, Ethernet connected JBODs. eBODs? I never heard
of eBODs. You can, in some ways, these are just servers with lots of drives, but they're also
moving towards being specialized processors, just like NVMe targets are moving towards specialized processors.
Sure, because just processing iSCSI and forwarding requests without actually building an array, you could do that in an SOC.
And so that's also what we do.
We have a hardware product.
We actually have two hardware products. But we have what we call the
DriveScale SAS adapter,
which is basically doing very dumb,
cheap, fast, Ethernet
to SAS conversion.
So you put that in front of a
stock SAS JBOD,
and you're done.
Huh.
What's the other hardware?
Well,
our other hardware product.
It's essentially an iSCSI bridge, right?
Yeah, it's an iSCSI bridge.
And it's based on a Broadcom network processor.
So it's really very good at what it does.
But, you know, people like to have lots of choice of hardware.
So we're supporting quite a variety of other kinds of hardware.
I always thought composable infrastructure was, you know,
take one or two or three CPUs and their associated DRAM
and attach it to one or two NVMe SSD kinds of things
and add in some GPUs and or networking hardware and you have this logical server that you've defined
So our testing right now for composability is
to basically allow you to manage 10,000 nodes and 100,000 disks, either SSDs or spinning, and
manage them as one or more clusters, which contain a web-scale application:
Aerospike, Cassandra, databases from the NoSQL world,
Kafka, the Hadoop stuff.
Almost like you're building your own cloud here,
building your own public cloud.
Well, it's, it's orchestration at a different layer.
Yeah, we're focused on making servers simpler and more generic,
which is something you really have to do if you want to pretend you're a cloud.
It's just a piece of what you have to do. But the notion of buying application-specific hardware
is just wrong these days, and yet people are still buying, you know, server type A for application one,
server type B for application two, server type C for application three. And then when you pull back
the covers, yeah, all these silos, none of which are fully utilized, but there's no sharing possible because you bought different things.
Oh, and so much of it is sequential use.
So this cluster processes data and then the next one picks it up.
And if I could recompose, I could use one set of servers instead of three.
But it takes two days to set it up.
So in our architecture, you buy one type of device
to be your servers, one type of device to hold your hard drives, one type of device
to hold your flash drives, and then you combine as needed. And when it comes
time to move resources between applications it's easy because they're
all on the same pools anyway.
It's all driven by API and or, you know.
Yeah, so we have a central management system,
which is where all the smarts are, and it's 100% API driven.
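DriveScale's actual API isn't shown on the show, so the endpoint and payload below are purely hypothetical; the sketch is only meant to illustrate the shape of a "100% API driven" composer: request N nodes with so much storage each and let the management system handle placement.

```python
# Purely hypothetical sketch of an API-driven cluster composition request.
# The URL, fields, and token are invented for illustration; this is NOT
# DriveScale's real API.
import requests

COMPOSER = "https://composer.example.com/api/v1"   # hypothetical endpoint
TOKEN = "REDACTED"

spec = {
    "name": "hadoop-prod-01",
    "nodes": 100,                                        # compute nodes
    "node_constraints": {"min_cores": 32, "min_memory_gb": 256},
    "storage_per_node": {"type": "hdd", "count": 12, "min_capacity_tb": 8},
    "raid": None,                                        # HDFS replicates on its own
}

resp = requests.post(
    f"{COMPOSER}/clusters",
    json=spec,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print("cluster id:", resp.json().get("id"))
```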
Yeah, I want to step back a second.
So after our pleasant conversation about hardware,
we're a software company,
and the beans we have that are important are orchestration and management of basically clusters at scale,
commodity hardware, commodity compute nodes, commodity disk. When Tom said there's a server A configuration for application one, a server B configuration for application two, you pull back the covers on those things: same CPU complex, just different amounts of storage in it, and either spinning disk or SSDs, but often restricted to what that compute vendor has qualified and defined for their platform.
And you might say, wow, you know, I don't really need the performance of
these small form factor drives, I just need the capacity of the
big 3.5-inch drives, but I can't get a configuration out of the vendor for them.
We're like, well, why don't you pick and choose the compute, which is going to be the same across most of your applications.
Maybe you'll have two different ones, a low-end node and a high-end node.
And then mix it with the storage that you really require
for that application.
So where do GPUs fit into this cluster world?
Well, they're more difficult because there's no such thing
as a network-native GPU yet.
And we're not investing in any hardware to do that.
But we're talking to a lot of interesting companies that are doing relevant things.
Still not quite that, but it'll happen. In the case of memory,
there's another company out there doing,
conjoining multiple nodes over the network
to be one big SMP system.
So that's an example of composing memory things.
Talk about non-uniform.
But part of our strategy is to make sure the performance is,
you know, to offer transparent performance, so that there's no real cost to putting stuff outside
the box. And that's not quite possible with GPUs. It's nowhere near possible with memory, except for a few interesting cases.
No, but with NVMe over Fabrics, it's close enough for flash SSDs.
NVMe over Fabrics, the SSDs and the network are so fast that the bottleneck shifts back to the database software or whatever kind of software you have.
And the Fabrics part was probably not all that necessary yet. So the disks that are attached to this eBOD, are they in a RAID structure or not?
Or is there any data protection or is it all at the application level?
Is it kind of like Hadoop would be replication or something like that?
It's individual disks as God intended.
It's application specific, right?
So when you're running Hadoop,
Hadoop makes its three copies of things the way it does.
And that's the data protection.
So your orchestration is smart enough
to put the three copies from this cluster
in three separate eBODs?
Yes.
And in fact, in three separate racks,
if you want that kind of thing.
So we have a placement plug-in for HDFS
that lets you control all that.
And it talks to our system, which
knows the underlying topology of all these things.
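DriveScale's placement plug-in is its own code, but the idea it builds on is the stock Hadoop rack-awareness mechanism: point net.topology.script.file.name in core-site.xml at a script that maps node addresses to rack paths, and HDFS will spread its three replicas across racks. A toy version of such a script, with an invented address-to-rack table, might look like this:

```python
#!/usr/bin/env python3
# Toy Hadoop rack-topology script (the standard mechanism behind rack-aware
# HDFS replica placement). Hadoop invokes it with one or more host IPs or
# names and expects one rack path per line. The mapping below is invented.
import sys

RACKS = {
    "10.0.1": "/dc1/rack1",
    "10.0.2": "/dc1/rack2",
    "10.0.3": "/dc1/rack3",
}
DEFAULT = "/dc1/rack-unknown"

for host in sys.argv[1:]:
    prefix = ".".join(host.split(".")[:3])   # crude /24 match on an IP address
    print(RACKS.get(prefix, DEFAULT))
```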
But then back on the RAID question,
there are systems that really want a RAID protected volume,
even though they also scale out.
And for that, we can turn on RAID,
but we do it on the server side because it's easy, it works great, and it gives you even more protection because now the two halves of your mirror can be in different racks as well.
Right. And the important thing here is that the application was already designed to work with local disk and uses RAID locally on a per-node basis.
And some people are deploying 50 nodes, and some people are deploying 500 nodes.
And if they're using RAID, the RAID being run at the compute nodes to local disk outperforms
aggregating a lot of IO workload into a RAIDed system at the end of a pipe, where basically you're hammering it,
you're consolidating workloads to a single pinch point at a RAID device at the end of the pipe.
We're basically just providing disk access with an extension cord, but very flexible
configuration capabilities. And then if you want RAID, we specify it during the cluster creation,
and we'll take care of all the stuff that has to be done in Linux
to make that work correctly.
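To make the server-side RAID point concrete, here is a rough sketch of what that looks like in plain Linux: an mdadm RAID-1 mirror across two composed block devices. The device names are placeholders, and DriveScale sets this up at cluster-creation time rather than you typing it on every node.

```python
# Rough sketch: server-side software RAID-1 across two composed block devices.
# /dev/sdb and /dev/sdc are placeholders for two drives attached from
# different JBODs (ideally in different racks).
import subprocess

def run(*args):
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# Build the mirror; Linux md does the work on the server side.
run("mdadm", "--create", "/dev/md0",
    "--level=1", "--raid-devices=2",
    "/dev/sdb", "/dev/sdc")

# Put a filesystem on the mirror and mount it where the application expects
# its "local" disk to live.
run("mkfs.xfs", "/dev/md0")
run("mount", "/dev/md0", "/data/disk0")
```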
You really don't touch the Linux storage stack
or the application storage stack.
You just basically say,
look, this is what I want it to look like in the end
and I want 100 of them.
They're all the same.
This is what these web scale applications are about.
Cookie cutter similarity. One node fails out of 100, nobody really cares, you know?
Right, that's the whole idea. It's like, you know, God doesn't notice a single sparrow, but
when 10 sparrows fall out of the sky, you might get interested, right? So I don't speak for him,
or her for that matter. Now, there's certain characteristics that are hard to achieve
when you take the storage out of the box, right? So performance is one, and we go through
great pains to do load balancing of network links and make sure everything is whizzy and
efficient. Similarly, for availability, we do all the multi-pathing setup for you.
Typically between a server and a JBOD, there's like eight different paths in our system.
And so we're really, we're pretty smart about all this stuff you expect from a storage system,
only it's all being done on the server side of the fence, because that's where you're capable of running all the software,
and it already exists in Linux, if only you knew how to use it.
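One way to see that multipathing from the server side, with nothing vendor-specific, is to count how many underlying paths each device-mapper device has in sysfs. A small sanity check along those lines (note that LVM and dm-crypt devices will also show up under /sys/block/dm-*):

```python
# Quick sanity check: count the underlying paths behind each device-mapper
# device by listing its "slaves" in sysfs. For a multipath device, the count
# should match the expected path count (around eight per server-to-JBOD pair
# in the setup described above).
import glob
import os

for dm in sorted(glob.glob("/sys/block/dm-*")):
    name_file = os.path.join(dm, "dm", "name")
    if not os.path.exists(name_file):
        continue
    with open(name_file) as f:
        name = f.read().strip()
    paths = sorted(os.listdir(os.path.join(dm, "slaves")))
    print(f"{name}: {len(paths)} paths -> {', '.join(paths)}")
```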
And the networking between these nodes is 10 gig Ethernet?
For disk, we require at least two 10 gigs per server, and then our little bridge box does 8 times 10 gig throughput.
For NVMe, we really suggest 2 by 25 gig per server.
And the boxes typically have 200 gig ports on them.
And using something like RoCE?
Yeah, we support RoCE.
RoCE v2 is what most people are supporting.
But we're eagerly looking forward to NVMe over TCP,
which is coming very soon.
Yeah, it does fit your model very well, doesn't it?
Yeah.
And in the meantime, we support iSCSI for Flash access
because there's so many people who just cannot...
Right.
As it stands today.
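For comparison, this is roughly what attaching an NVMe over Fabrics namespace looks like by hand with the stock nvme-cli tool, whether the transport is RoCE v2 ("rdma") or NVMe/TCP ("tcp"). The address and NQN are placeholders; DriveScale chooses the transport and does the attach for you.

```python
# Rough sketch: attaching an NVMe-oF namespace by hand with nvme-cli.
# Address, port, and subsystem NQN are illustrative placeholders.
import subprocess

TRADDR = "192.0.2.20"                            # hypothetical JBOF address
NQN = "nqn.2016-06.com.example:ssd.0042"         # hypothetical subsystem NQN

def run(*args):
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# Discover what the target exposes; swap "tcp" for "rdma" on a RoCE v2 fabric.
run("nvme", "discover", "-t", "tcp", "-a", TRADDR, "-s", "4420")

# Connect; a new /dev/nvmeXnY device appears and behaves like a local SSD.
run("nvme", "connect", "-t", "tcp", "-a", TRADDR, "-s", "4420", "-n", NQN)
```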
I mean, so...
So to provide flash to this, I use an NVMe over Fabrics JBOF.
Right.
Are you, are you still treating each SSD as an independent entity?
You can.
Do you support multi-namespace SSDs and slicing and dicing and julienning?
Yeah, we do support slicing, and we also support assigning the whole thing.
Again, for something doing its own data protection, you have to be
very clear on where the data is relative to other slices. So slicing makes that a little more
complicated. Yeah, assigning three slices from the same SSD to three Hadoop hosts would not end well.
Right. And so that's something we make sure people can deal with. But yeah,
we slice SSDs because that's a very low cost thing to do compared to slicing hard drives
where the head is competing. Well, I don't want you to slice hard drives. You're going
to thrash the heads into little shreds. Right. Right. Or unless it's SMR and all that
other stuff, but that's a different discussion. Then it'll be even worse.
And hard drives cost a couple of hundred bucks,
while if you want to buy
at the sweet spot of flash,
you're talking $5,000.
And I definitely
want to buy 2 and 4 terabyte
SSDs
because of the
cabinet costs, and there's plenty of IOPS there.
Right, right.
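At the device level, slicing an NVMe SSD is namespace management. On drives that support it, the stock nvme-cli commands look roughly like this; the sizes and controller ID are placeholders, and the composer would handle both the carving and the bookkeeping of which slice went where.

```python
# Rough sketch: carving an NVMe SSD into namespaces with nvme-cli.
# Works only on drives that support namespace management. Sizes (given in
# 512-byte blocks) and the controller ID are illustrative placeholders.
import subprocess

def run(*args):
    print("+", " ".join(args))
    subprocess.run(args, check=True)

BLOCKS = str(2 * 1024 * 1024 * 1024)   # ~1 TiB worth of 512-byte blocks

# Create one slice (repeat for additional namespaces).
run("nvme", "create-ns", "/dev/nvme0",
    "--nsze", BLOCKS, "--ncap", BLOCKS, "--flbas", "0")

# Attach the new namespace (ID 2 here) to controller 0 so it shows up
# as /dev/nvme0n2.
run("nvme", "attach-ns", "/dev/nvme0",
    "--namespace-id", "2", "--controllers", "0")
```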
So, I mean, how do you guys differ?
I mean, I hesitate to say that there are plenty of startups
and at this point even some major vendors
that are offering NVMe over Fabric kinds of solutions for their SSDs.
You seem like you've got that embedded.
You've added the disk side.
I just don't – I'm trying to understand where your value prop is beyond that.
So, well, I guess you've found our flaw, so we can go home now.
I'm sure there's something there.
I'm just not seeing it.
I was thinking you were consuming that, not providing it. So I think if you opened up our orchestrator and looked at how you work with our composer,
you would look at it and coming from a storage perspective would be immediately surprised
because the first thing you're asked to select is the number of compute nodes in your cluster.
And you can say what characteristics of the nodes are,
how many cores, how much memory,
and set constraints around it.
That's your first step.
So big differentiator here
is we're not talking about providing random endpoints
of NVMe storage.
We start with basically the high level view
of what does the application need to run on.
And that's a cluster.
It's a set of compute nodes, each with their own storage.
And you define the number of nodes,
and then you define the amount of storage per node and type
of storage, typically just spinning disk or flash, but it might
be fat drives versus high performance small form factor drives.
And then you hit a button and we basically do all the work of immediately whipping up
a hundred node cluster to a specification that replaced the process of purchase time
definition on somebody's website where you were defining how much CPU you needed
and how many storage bays you needed in a local node
that you were deploying.
Except now you're just doing it on the fly.
A very important part of our solution
is that we understand both the compute
and the storage side of things.
So we have a distributed composable solution.
The first thing you do when you install
our software, because we're a software company, is basically install our agents on the compute farm,
install our storage adapter agents on the adapters for the JBODs, and then bring up our composer.
And the first thing everybody does is report in with an inventory of all the resources they auto-discovered. This
basically simplifies the entire prospect. One of the things I'm hearing about the NVMe over Fabrics
stuff is that people are presenting a box: hey, here's a box of NVMe over Fabrics
storage. And then it's like, what do you do to use that? Well, you go onto your servers and glue them together.
We do all that for you.
We auto-discovered it, and we basically defined your pool of available resources.
You're just clicking buttons at that point.
In a sense, what we've done is virtualize bare metal hardware, except without the VM that we don't want to use and that the applications don't need.
Yeah, the only part you're not doing I really want
is to select and put this on a private VLAN.
Yeah, that's not far away.
We actually have a customer who uses VLANs
to separate our traffic from other stuff.
And that's something we could automate
with the right API to the switches.
But it hasn't been a big deal yet.
So we don't view any of that as out of our purview.
It's basically, when do we get around to it?
And so you guys have been in the market for a while now?
A GA-ish kind of market?
A couple of years.
OK.
A couple of years you've been GA?
With the disk product.
And of course, disks aren't terribly sexy anymore,
so it's hard to get press about that.
But the other place where this makes a huge amount of sense
is in object storage. Because within a cluster, if a server fails,
you can take the data that's still on the disks
and attach that to different servers.
Yeah, this is one of my favorite attributes of composability.
Yeah, so now if you want to build a cold storage system, I can reattach those perfectly
good working hard drives to the replacement server without having to move all the data.
Is that something that your orchestrator does automatically, or is that something that somebody
would have to step in and issue commands to?
It turns out the hard part of that is deciding when something has failed,
as I'm sure you guys appreciate from countless years. But what's really beautiful is in Kubernetes,
Kubernetes has a standard way of deciding that a server has failed, or at least a container.
And so we have a complete Kubernetes integration where, if the server fails, Kubernetes spins up a container somewhere else and the storage is immediately reattached to it and off you go.
So it's all persistent volume claims and all that other stuff is automatically moved?
And so it's a very elegant solution for Kubernetes because you get direct attach performance and cost without the pain of being
locked into a server. Is this with BOSH? Because a lot of the
availability characteristics of Kubernetes are fairly rudimentary until
you get something like BOSH underneath it. Yeah, well, BOSH
solves a lot of other interesting problems, but
assuming you can make Kubernetes work at all,
which is still not true of everyone.
There are many instances of Kubernetes working in the world today.
But not everyone who tries to.
But persistent storage applications in Kubernetes
is the emerging front.
That's not where they started.
Yeah, I mean, with their new container storage interface and stuff like that, some of that
stuff is becoming a little bit more realistic and usable.
Right.
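On the Kubernetes side, that story is ordinary persistent volume claims: a pod asks for storage through a StorageClass and whatever CSI driver backs it performs the attach. The sketch below uses the official Kubernetes Python client; the StorageClass name is an invented placeholder, not DriveScale's actual driver.

```python
# Minimal sketch: a pod's storage request expressed as a PersistentVolumeClaim.
# The StorageClass name "composable-nvme" is an invented placeholder for
# whatever CSI/volume driver actually backs the composed storage.
from kubernetes import client, config

config.load_kube_config()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="cassandra-data-0"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],           # one node at a time, like DAS
        storage_class_name="composable-nvme",     # hypothetical StorageClass
        resources=client.V1ResourceRequirements(requests={"storage": "2Ti"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)
```

If the node running the pod dies, rescheduling the pod elsewhere and rebinding the same claim is exactly the reattach-without-copying behavior described above.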
But if you think about the persistent storage applications, this is
why composability makes sense.
If you had captive DAS within the nodes, when the node went down, the storage went down with it because it was also the storage provider.
So separating out the storage from the node makes sense for what's running on top of Kubernetes.
Debatable.
But yeah, I understand.
Obviously, the containers don't understand
that stuff and don't necessarily use it, but whether they could take advantage of it is
a different question.
Well, it depends on the application in the container.
Yeah, our product is a great complement for some of the emerging container native storage,
things like MinIO or OpenEBS or, you know, Gluster and even Ceph.
You know, you run all that software in containers,
but now it attaches to the raw storage outside the server.
And that means we can do things like fire up more controller containers because we're doing a large ingest and there's a lot of CPU needed for the pods, right?
I mean, you're talking Kubernetes, so you're firing up pods and the pods are deciding how many containers to run based on the service that's being requested and things of that nature, right? Well, it goes back to what I said at the beginning. Storage these days is 99% software.
So why aren't you running that software in containers
using modern best of breed type practices?
So your orchestrator runs on those containers?
Is that what you're saying?
Not yet.
But it wouldn't be hard.
But wait, we're not. As for our integration with Kubernetes:
Kubernetes provides a framework for container management, but it doesn't provide a framework for hardware management.
So we're definitely the underlayment for the infrastructure for something like Kubernetes.
And so we are not putting together a container management system.
What we are putting together is a hardware management system that supports separating
the lifecycle of compute upgrades from storage upgrades, which has always been different
for many applications and continues to be different,
but then allows you to run your container applications and you can turn a management system cleanly on top of it.
But from the perspective of Kubernetes, we still look like the swarm of Linux commodity compute servers that people are adopting widely.
You mentioned updates and upgrades and stuff like that.
So your three hardware solutions
can be upgraded independently?
I mean, is it like a rolling upgrade kind of thing?
So let's say you, being a software company, of course,
you actually change your software on occasion.
Yeah.
So our hardware is all
multi-path to the storage.
And then with our SAS adapter,
there's actually four
independent controller blades
that are used in parallel
to talk to a JBOD.
So if you have to upgrade one of them,
you're losing 25% of the capacity, which you probably won't notice.
But since you guys don't control the on-disk data format, we don't have to worry about an upgrade requiring a change to the on-disk data format.
Right. Again, the on-disk data format is controlled by the storage stack in the application itself and on the
Linux client.
There's no requirement on the other end for any high-level functions because they're
originally designed to run over raw disk.
And in fact, we have a cluster-scale encryption at rest feature as well, so that by the time
we ever see the data, it's already been encrypted coming out of the server.
And we do all the key management
to make that really, really simple.
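Under the hood, software encryption at rest on a Linux server is typically dm-crypt/LUKS. A bare-bones sketch for one composed drive looks like this; the device path and key file are placeholders, and the key management piece is what the composer automates across the cluster.

```python
# Bare-bones sketch: encrypting one composed block device with dm-crypt/LUKS.
# Device path and key file are placeholders; cluster-wide key management is
# the part the composer handles for you.
import subprocess

DEVICE = "/dev/sdb"                  # placeholder composed drive
KEYFILE = "/etc/keys/disk0.key"      # placeholder key material
NAME = "disk0_crypt"

def run(*args):
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# Format the device as a LUKS container (non-interactive; destroys contents).
run("cryptsetup", "luksFormat", "--batch-mode", DEVICE, "--key-file", KEYFILE)

# Open it; the application then uses /dev/mapper/disk0_crypt like a raw disk,
# and only ciphertext ever crosses the Ethernet fabric.
run("cryptsetup", "open", DEVICE, NAME, "--key-file", KEYFILE)
```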
And it's software encryption at the server level
as it's coming off across the network
and stuff like that.
Yeah, because, you know,
I was through this for years at one company.
Every time we came up with a hardware accelerator for encryption, the number of cores increased in the general-purpose CPU we were using as a storage controller, and that actually ended up providing higher encryption rates than the dedicated encryptor.
In the time it took to spin the boards,
we were not keeping up with the pace of the increase in the number
of cores.
So again, you're distributing it.
You mentioned, like, a 16-node cluster before.
That would be a very small cluster for us. We start getting kind of
interested when there's 50 nodes,
because now you have a complexity of management problem,
and you have seven applications
all using different configurations.
And so now you're somewhere in a few hundred nodes
deployment, but you're looking for something
that allows you to simply order
one or two compute SKUs,
not seven different configurations,
and one or two storage SKUs,
mix and match vendors,
optimize your costs,
and then basically create these things
on the fly when you need them.
Now, when you create them on the fly,
by the way,
they probably will persist.
People aren't destroying them.
If they have a big data application up and running, they're not going to move a petabyte of data
or something like that.
No, they may, by the way. They're
more likely rolling off a petabyte of data every period because new data came in, because
they're persistent within a time frame, and the expiration date of interesting information
for ad placement, like one of our customers does,
is about a 90-day window, because your interests may change
and that currency is very important
for selling an ad placement service, right?
So the data is going to peel off,
but there's always this sliding window of data in it.
For database applications, you may have a much longer persistence around the data for these scale-out applications.
As far as caching is concerned, all that would be done at the host application layer.
You don't have any cache in any of these eBODs, JBOFs, or anything like that.
Yeah, it turns out when you're dealing with the block level,
the host is really good about caching.
Well, the host knows a lot more about what it's doing than you do.
Right.
Or can.
But, you know, all of main memory is a cache for Linux.
So the other question is, with these JBOFs and stuff like that,
do customers purchase the JBOF and they have their own storage they
insert in it, or does your eBOF or eBOD come with disks as well?
So we don't sell any storage.
Okay, so the customer buys the storage independently?
Or, more likely, our partner or system integrator working with
the customer is providing an optimized solution. Right. And if I can switch to business mode
for a little bit, our number one partner is Dell. They resell all of our stuff. We are
in their tier one reseller program. So the entire worldwide sales force,
in theory, knows about us.
And we sell mostly with their hardware,
their JBODs and servers and switches,
that kind of stuff.
Additionally, Western Digital is a very good partner.
They have a line of JBODs and JBOFs.
And they're being very outspoken now about the composable infrastructure strategy
and opportunity.
So we're working very closely with them.
Yeah, they surprised me when they came out
with the NVMe over fabrics to access spinning disks.
JBOD.
They got an NVMe over fabric bridge?
No, it's a 12 drive 1U JBOD with an NVMe interface.
Right.
Which may not make it to product status,
but they're definitely going to do some drive-based things.
But we have in-house their flash based product, which is quite a beast.
It can hold a lot of flash.
So, I mean, you guys have a lot of customers that are, what's a typical configuration?
You mentioned 50 nodes as being of interest, but you've talked about 10,000 node environments
as being really interesting.
So what's an average installation for your solution?
We can't really, yeah.
If such a thing exists.
Yeah, we don't have enough active customers
to have a decent average, I think.
But we're in a lot of POCs and it's a very long cycle
because even if they love us, we need to get
synced up with a major hardware purchase. But we're talking to companies that average
50,000 servers in their fleet. Right. Right. But our deployments now, at one customer, are in the hundreds of nodes and thousands of disks.
They're in the hundreds, approaching 1,000 nodes.
The reason I mentioned the 10,000 nodes
and 100,000 disks is that this customer continues
to grow their deployment, and we have to be able to basically manage this at scale,
manage it through the life cycles,
and keep going and be durable and performant at that size.
10,000 nodes, 100,000 disks.
We're talking Google kinds of levels.
No, this is nowhere near Google.
Google has more zeros.
Okay.
Orders of magnitude bigger?
Okay, I got you.
But there's a lot of the cloud native companies, things that have popped up in the past 15 years or so where they're really exercising the limits of big data and hardware.
And even people like Uber have a gigantic amount of data infrastructure.
Well, this has been great.
Howard, any last questions for Tom or Brian?
Not really a question, but I find it interesting to compare this to what some other vendors call composable infrastructure, where it's, well, you can take any drive in this blade chassis and assign it to any blade.
And you guys are saying you can take any drive in this data center and assign it to any server.
Yeah.
You basically hit the nail on the head on that one.
It makes a lot more sense at the data center scale than it does at the blade chassis scale.
It makes sense at the data center scale when you're looking at the applications that are starting to sprawl across the floor, doing analytics and these NoSQL databases. Aerospike just last year, I believe, increased the maximum deployment of their SSD-based in-memory database from 100 nodes scale-out to 500 nodes scale-out, I believe the numbers are.
They're selling these things, right?
So these are far beyond, these are far, oh, Bukowski, yeah, he's my cousin.
No, this is beyond Blade Center management.
If that's somebody's definition of composable for Blade Center management, then they are unable to handle the real applications we're seeing customers deploy now.
All right.
So, Tom and Brian, anything you'd like to say to our listening audience before we close off? Well, surprisingly enough, we have our own website, so come check it out.
So drivescale.com, I guess? Yep. Okay. Well, this has been great. Thank you very much,
Tom and Brian, for being on our show today. Thanks for having us. Next time, we'll talk
to another system storage technology person. Any questions you want us to ask, please let us know.
And if you enjoy our podcast, tell your friends about it.
And please review us on iTunes and Google Play, as this will help us get the word out.
That's it for now.
Bye, Howard.
Bye, Ray.
And bye, Tom and Brian.
Bye.
Bye-bye.