Grey Beards on Systems - 104: GreyBeards talk new cloud defined (shared) storage with Siamak Nazari, CEO Nebulon

Episode Date: July 7, 2020

Ray has known Siamak Nazari (@NebulonInc), CEO Nebulon for three companies now but has rarely had a one (two) on one discussion with him. With Nebulon just emerging from stealth (a gutsy move during the pandemic), the GreyBeards felt it was a good time to get Siamak on the show to tell us what he's …

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with Matt Leib. Welcome to the next episode of the GreyBeards on Storage podcast, the show where we get GreyBeards storage bloggers to talk with system vendors and other experts to discuss upcoming products, technologies, and trends affecting the data center today. This GreyBeards on Storage episode was recorded on June 30th, 2020. We have with us here today Siamak Nazari, CEO of Nebulon. So, Siamak, why don't you tell us a little bit about yourself and your company's product?
Starting point is 00:00:39 Sure. I've been doing storage, it seems, like forever. God, I think I've seen you at least in two companies, maybe three. Exactly. And I've been doing file systems and block storage for a long time. And as a part of my job at a previous employer, I got to see a lot of trends and talk to a lot of customers. And a few key themes started to develop. One was customers really wanted to simplify their environment and have fewer moving parts. Storage arrays, while great, are very expensive, and it was beginning to impact the bottom line. And then the second piece that kept showing up was the problem of managing at scale, with customers that often had dozens, perhaps more than a hundred arrays, and just managing a hundred arrays in
Starting point is 00:01:33 a data center is quite problematic. And these guys were facing the pressure of their users and essentially the consumers within the enterprise coming in and say, hey, I can do this stuff much easier on the cloud. Why is it that it takes you guys eight weeks to provision space for me? Yeah, storage and stuff like that. Yeah, yeah, yeah. And it was something that essentially this at-scale management problem was not something that they were prepared to handle. And when I started to think about this is the existing model of shared storage where
Starting point is 00:02:11 you're connected by a fabric to it, it has this inherent problem of not being able to scale this management model. And people have tried to build management models that tries to solve this. But at the end of the day, the building blocks are just not the right building blocks. And I wasn't interested in building another storage company. Although you did. Well, it's a very different type of company.
Starting point is 00:02:40 Absolutely, absolutely, yeah. The difference is that if you think about the existing solutions today, we have, you know, storage arrays we've talked about, we have the software defined storage or hyperconverged, and it tries to solve some of the issues with, you know, expenses of, and simplification in terms of not having as many moving parts of shared storage. But unfortunately, it brings a lot of restrictions in terms of the performance, in terms of not having as many moving parts of shared storage. But unfortunately, it brings a lot of restrictions in terms of the performance, in terms of the way you configure things, your traces of the hypervisor. And in some ways, it shoehorns you into a specific environment.
Starting point is 00:03:16 So what I really wanted to sort of solve the problem was, I wanted to kind of the flexibility of the arrays and the simplicity of the software-defined storage and hyperconversion. That's really what we were trying to think through and solve and build. These kinds of environments, I mean, with hundreds of SANs, shared storage arrays, I mean, these things are massive, massive environments, right? They are. And then one of the big problems they have is a given storage array is serving multiple applications that have different requirements. And so you have this kind of careful balancing act of every time you do something on the array, you have to kind of talk to multiple consumers of the array saying, hey, I'm about to install a new firmware. And when is a service window that I can coordinate with all these different users of this one array?
Starting point is 00:04:12 And then imagine you have to do this every week, every time. And I describe just the upkeep and firmware update of these arrays. It's akin to painting of the Golden Gate Bridge, right? If you've ever been to San Francisco, right? Yeah, yeah. It's a never-ending story, right? At the latter end, and now they have to start all over again because by the time they get to the other end,
Starting point is 00:04:33 the paint is no longer good, right? So firmware updates are the same, right? You start updating the firmware to the latest, you know, available from the vendor, and by the time you're done updating your 100 arrays because you do one or two a weekend, there's a new firmware and you start from scratch. And it's just a never-ending struggle to deal with this, right?
Starting point is 00:04:52 Oh, it's even faster now. I think they're releasing storage software even more often than once a year. Right. And the poor customer says, hey, if I have some sort of issue, what do I do? Well, you've got to upgrade your array. And by the way, you are three releases behind, which means that he can't even go directly there. You have to go through these intermediate releases to get the patches, right? It's just a nightmare, right? And then, and add to that the lack of
Starting point is 00:05:18 visibility, right? You can't even see what's going on, on a hundred arrays at the same time, right? It's much more difficult, right? Yeah, absolutely. And you're not even talking about changing the data or changing the storage, which is a different story altogether. Yeah, the lifecycle management is yet another pain point. There's just impedance mismatch, really, between the lifecycle of a server, which is usually measured in 18 months, 24 months type cycles, and of the server, which is usually measured in 18 months,
Starting point is 00:05:45 24 months type cycles, and the storage rate, which is usually measured in three to five years life cycles, right? That's another kind of transition point that is difficult to manage in the data center today. Yeah, that makes sense. I'm so curious about how you go about it, though. Yeah. Tell us a little bit about how you solve these problems. Yeah. So the way you solve the problem is a couple of things. So you start by re-imagining how the management works. You think of management being in the cloud, right?
Starting point is 00:06:17 And a typical storage array, when you ship it, it ships both with storage, you know, IO path and also all the management artifacts to be able to manage the array. And that's part of the bloat of the firmware that goes into one of those things. Imagine if you moved all of the management into the cloud, and, you know, you only had the IO path and critical pieces running inside the data center. And even then, you don't really run it inside a storage array, which has all sorts of, you know, you have to deal with multi-pathing and multi-tenancy issues.
Starting point is 00:06:51 What if you reimagine the array as a smaller physical device that is inside the server, right? So imagine taking your storage array, miniaturizing to size of a pcie card and sticking into a server right so we got the cloud end we got the storage engine that is sitting inside the server and to be clear this is not an accelerator this is the actual the entirety of the storage engine all the features that you expect from a storage array like compression, deduplication, encryption, all running on the card and this device, which we call an SPU, stands for services processing unit, kind of like a GPU in terms of a style and size and fit and finish and power, it then presents the actual storage to the host and you don't have any sort of in-band management or any sort of management
Starting point is 00:07:45 inside the data center. It's all sitting in the cloud. So the experience turns into something like this. What's at the other end of the SPU? The other end of the SPU. So the SPU is made up of essentially a full-function computer running out of storage stack. It takes over the storage media inside the
Starting point is 00:08:06 server and turns around and presents what looks like physical LUN to the host. And it also has Ethernet ports. Some of the Ethernet ports are used for the SPUs to talk to each other and present what looks like shared namespace and disaster recovery and mirroring and so on. And then there is a dedicated port to connect to the cloud for the purpose of management. But the storage is effectively still situated as inside the servers, right? Is that what you're saying? Exactly. But because the server storage is connected to the Medusa card or to the SBPU, we can turn around and expose that storage, not only just to the local host, but also to the hosts that are sitting in your sharing domain. So if you have a VMware cluster, it doesn't really matter that the capacity is sitting
Starting point is 00:08:58 in a different server. You can still go from that server to the local SPU, from that SPU over the Ethernet path of the SPU to the next SPU and grab the data and present it to the host, right? So essentially, we solve the shared storage by having all the properties of shared storage baked into the card, and the cards can communicate with each other and present both what looks like shared storage, but they also use highly available shared stories by doing erasure coding within the card and then mirroring across the cards.
Starting point is 00:09:33 And what, I'm sorry, Roy, what's the connectivity then between the cards? Is that Rocky, NVMe, what's going on there? It's 25 gig Ethernet. There's two ports on each card and they're talking. The Ethernet ports are RDMA capable. We have chosen not to enable RDMA partly because there is a big gap between the switches that are capable of RDMA and the switches that actually have turned on RDMA capabilities, it turns out there's a lot of expense and configuration issues. And just because of those deployment issues, we choose the ports in kind of standard Ethernet TCP IP 25 gigi mode.
Starting point is 00:10:19 So the protocol between, let's say, an SBU on one server and an SBU on another server is internal to Nebulon? Exactly. It's not iSCSI or anything like that, right? It's not iSCSI. It's a very – and a lot of it has to do with our security protocols. It turns out that, you know, the story on iSCSI security is pretty weak. We actually encrypt all the data that is exchanged between one card to the next card. And the data essentially just gets encrypted and is just traversing the entire system in an encrypted form.
Starting point is 00:10:55 And then the other piece is that we actually verify each card with the other card, ensuring that the certificates are, you know, we check the presence of certificates so that, you know, some other, you know, entity inside the data center can't connect to the card and present itself as a card trying to steal the data. So there's a lot that has gone into the security model. And so we are kind of using our own protocol, which means that we're not bound by some standard that doesn't really add anything to our protocol. And you mentioned high availability. So there could be more than one SBU in a server? We can have – okay, so the availability model is is built using layers so let's talk about a single spu so so the spu connects so by the way the spu can run both with or without drives if it's running
Starting point is 00:11:55 without drives then the spu is just talking to the spu to to collect the capacity or or perform the io if it has if it has drives then we will actually use erasure coding on all the drives that are attached to the SPU. The SPU can consume any type of SSDs, SAS, SATA, NVMe. It doesn't really care. So that's kind of the first level of availability within the card itself.
Starting point is 00:12:20 And then one more piece. If the host is rebooted or is down or having issues, the card is in a completely different fault domain from the host, right? So, in fact, if you have bare metal and you don't have an OS on the host, the card would be up and running and it's able to talk to the other cards and the other cards can use its capacity even if the host is not even configured or running, right? So that's kind of a big important piece of the design. Wait, wait, wait, C-Mac. So if the host is being rebooted, let's say, the SPU is still technically active, that reboot can still provide storage to other SPUs in the network? You got it.
Starting point is 00:13:05 In fact, the SPU has to be there because we expect the OS to actually boot from the SPU. So the SPU has to be there, not just to service the host when it comes back to provide its IO boot services, but also other SPUs that need the capacity inside that server. Absolutely. It is very interesting. I am still completely at a loss as to how it's done.
Starting point is 00:13:32 Well, it turns out that it wasn't that, you know, if there is a bit of a hardware magic taking place, there are different root domains. So that's, you know, in a more detailed conversation in front of a whiteboard, I'd love to describe more of it. So the second piece of availability is we talk about the SVUs having these 25 gigaports, are able to talk to each other. And so now they can actually mirror data. So a given LUN, so we talked about the host, you know, even rebooting the data is available, but you can imagine a scenario where somebody go, we lose complete power to the server. And therefore the SPO just goes offline because there was no power. In that case, we have mirrored the data.
Starting point is 00:14:15 We always mirrored the data to some other SPO within the pod. And in that case, we just do the failover and all the capacity is available. And in fact, when the SPU comes back, we re-silver the data or rehydrate the data from the SPU that was carrying the services forward back to the SPU that was down for service or whatever may have been the reason why the SPU was down. So that's kind of the other level of availability. You talked about two SPPUs in a server. It is possible to put two SPUs in a server, but we think of it more of a performance consideration as opposed to an availability consideration. So let's talk about fiber channel cards and how people deal with availability.
Starting point is 00:15:00 It's typically got two ports, right? So you have some sort of multi-pathing running at the host level with a port failure or cable failure. Well, in our model, we have two ports, but the host doesn't even know that those two ports exist. And then we deal with the port failovers. So it's kind of nice. You don't have to actually configure multi-pathing, right? You don't have to think about it.
Starting point is 00:15:22 It's just kind of built in to the card itself. So in this model, you know, you don't, you know, you have two ports, you deal with all the issues that may result as having a port go down. But same back in a fiber channel card with dual ports, you effectively got almost dual circuitry. It's, yeah, you're powered by the same configuration and you're talking to the same, let's say, PCIe bus, but those two ports are effectively electronically as isolated as they can be on the same card. They're not? One would hope, but they really aren't as isolated as you would think. Multipathing
Starting point is 00:16:04 really is designed to protect you against a Fibre Channel cable failure. Most chip failures will result in the entire Fibre Channel card going down. In fact, so there are few physical failures at the silicon level that will impact only one port and not the other. They're really designed to handle the cable failure or the switch going down for upgrade or whatever. Whatever the path is to get to the actual capacity inside the shared storage array. Yeah. You and I come from a history of high availability storage array. Yeah. You and I come from, you know, history of high availability storage arrays, and there's
Starting point is 00:16:47 always been two or four or eight different controllers sitting there, you know, and they could always, you know, migrate workload from one to the other in case there's a problem and things of that nature. In a single SPU environment, you know, I guess the host would be down in that case. Exactly. And if you had storage, then that storage would be mirrored to some other SPU storage, so that wouldn't be a problem. It'd still be accessible. You got it. That's exactly right. That's the exact thought process behind it.
Starting point is 00:17:20 Now, we can support two SPUs in a host, in which case it gives you additional bandwidth and performance, but it can also mask the failure of the local SPU. If a physical SPU actually does fail, you can still provide the same LUN view because the data that was on this local SPU that just failed is mirrored someplace else. So the remaining healthy SPU goes and fetches the data and you can continue to operate. So this is kind of the extreme high availability need. I got you. That would be a multi-controller scenario kind of thing. Exactly. And so you got that covered as well. And you mentioned that within the server, the SPU uses erasure coding to map the data across all the drives. That's exactly right.
Starting point is 00:18:12 So it's like Reed-Solomon, it could be two failure mode types of erasure code, or it could be more or it could be less. So within an SPU, we can tolerate up to two drive failures. So a third one, obviously, it results in the data not available to that SPU, in which case the SPU just goes and gets the data from a brethren SPU that was mirroring the data, right? So in this model, you can really tolerate up to five drive failing without actually having an outage or data unavailability, right? Because, yes. In that environment, do you have to have similar configurations between the mirrored SBUs? No. So I talked about the fact that we could even have SBUs that don't have enough capacity. They can just go get the capacity from another SPU. So we have an algorithm that goes and looks at all the capacity available and creates a map of
Starting point is 00:19:11 how much capacity can be consumed in each SPU and creates kind of a mesh of connectivity of LUNs. So we don't have a one-to-one mirroring between SPUs that has all sorts of performance implications if the SPU goes down because the surviving SPU ends up with all the load of the SPU that just failed. In this model, any given SPU is in a mirroring relationship with multiple SPUs. So if it goes down, the load is evenly distributed across the SPUs. This deals with the capacity unevenness as well. So effectively, it could be heterogeneous servers,
Starting point is 00:19:48 just as long as there's an SPU connected to each of them, they could perform a storage mesh. Is that what you're calling it, rather than a cluster? Yeah, we call it an end pod. The problem with using the word cluster is that now you're confusing it. Is it a VMware cluster, Microsoft cluster? The word cluster is just, you're confusing it. VMware cluster, Microsoft cluster, the word cluster is just
Starting point is 00:20:06 overused, right? And it's just confusing. And you mentioned all the data is encrypted at the SBU where it's written and then throughout the network it's maintained in an encrypted form? Yeah, so
Starting point is 00:20:23 at ingest, so this is where it arrives. The moment it arrives, we hash, compress, and encrypt the data. And from then on, throughout its lifecycle, it stays in that encrypted or hash-compressed encrypted format. So that format is preserved when it's written on drives and as it's kind of traversing from one SPU to the other SPU for mirroring or for disaster recovery use cases. And so you hash to deduplicate the data? So it's deduplicated, compressed, and encrypted.
Starting point is 00:21:03 Exactly. You got it. So everything you expect from a modern storage array. So snapshots is the other thing I would expect from a modern storage array. You support snapshots? Absolutely. So the metadata becomes now a bit of an interesting thing, how that metadata is distributed. So we struggled with how do we actually do this in the most robust highly available uh center you know mode so you know we keep talking about sharing you know we can actually use this in an unshared
Starting point is 00:21:33 environment also so you can imagine a modern uh application kind of like mongodb couchbase where the requirement is really not shared storage, just local storage. And in one of the conversations I was having with one of our customers, they said, look, I have a problem. This is the enterprise IT guy describing the problem he's having. He's saying, look, I got the guy who wants to do Hadoop or Cassandra or Spark or whatever kind of the modern thing he wants to do today is, and he comes to me and I said, well, it's going to take me eight weeks to do it.
Starting point is 00:22:10 And he says, well, okay. And before I know it, he's bought a hundred servers with a thousand drives and he's running his own new application, kind of shadow IT. His drives just start failing and he comes and says, hey, can you help me with this? It's like, why didn't you buy that? I mean, I don't know how to replace drives in MongoDB. And everyone is a little different anyway, right? So there's this weird tension between the application guy who wants to get going fast and the enterprise IT guy who wants to sort of have visibility.
Starting point is 00:22:40 And I think our solution is perfect because, you know, you can create a pod to run MongoDB. It doesn't require shared storage, but it requires that visibility. So if you turn around and just present what looks like lunch, the MongoDB guy is happy. He didn't have to sort of buy a shared storage array. He just bought the servers from his favorite server vendor, and he can get the capacity. Then the enterprise IT guy is happy because he has visibility into what's going on when drives fail, and he doesn't really care if it's MongoDB or Spark that is running. Replacing a failed drive has the exact same behavior.
Starting point is 00:23:12 You go to the cloud, notice something, you press a button, you go take the drive and put it out. You don't even have to talk to the application guy. So now, the reason I talked about this story and to take you back to the issue of metadata. So we had to design each SPU to have its own metadata, right? That has to do with compression, deduplication, encryption, all those things. So it's fully self-contained because it has to be able to operate independent of all the SPUs for the non-shared use case. In the shared use case, the only metadata you really need to have is,
Starting point is 00:23:49 okay, who is my mirror so I can send my data to? They don't have to know about the details of how many drives they have. Is it six drives or eight drives? Is it SAS or is it NVMe? I don't really need to do any of that stuff. I just need to know where it is, where is the network endpoint where I have to send the data for mirroring or if I'm not serving the data, which SP you'll have to go to talk to get the data and present it back to the host. So this gives you the ultimate availability and isolation of the
Starting point is 00:24:21 metadata. So there's some metadata that is clustered wide. That metadata is about who has what, which loan is served where, but all the other kind of metadata that has to do with the actual layout of the data and compression and hashes, all that is essentially encapsulated in an SPU and it's independent of the other SPUs.
Starting point is 00:24:43 Very good. Tom, maybe we can talk a bit about replication. I know that it's using erasure coding, so there's a replication algorithm built in there. Do you have any particular take on that, that you're doing differently? So to be clear, we're not doing erasure coding across SPUs. SPU is doing the erasure coding within it. We chose that partly because doing erasure coding across SPUs, you know, has a pretty huge tax in terms of latency that the host will experience. So we do just straight mirroring across SPUs.
Starting point is 00:25:25 And so within a pod, which is kind of what it maps to approximately a cluster, like a VMware cluster or Microsoft cluster server or an Oracle rack. So that's kind of the mirroring that takes place there. And then we are working on disaster recovery where it is asynchronous, where you kind of mirror the data of one pod in one data center to another pod to a different data center. And that can be done either asynchronously or asynchronously. And the good news is that all of these protocols, mirroring protocols, essentially have the same baseline code, which we have to kind of build from the beginning and makes us comfortable
Starting point is 00:26:06 in terms of reliability and performance of solution. Yeah. So, you know, for like synchronous replication, things of that nature, you'd have to wait until the data was actually at the replicated site before you, you know, authorize the IO to complete. You got it. And are you doing that for within the pod mirroring as well? Yes. So the data has to be in two places in order to satisfy the IO.
Starting point is 00:26:37 I gotcha. So to be clear, it's a function of, so this is kind of the beauty of making the model app-centric. In our model, you kind of say, you don't start with make a pod, then install an application. You actually say, I'm going to run VMware. That's where you start. Or I'm going to run MongoDB. I want to run VMware for a database or VMware in a development environment. And so we have a series of templates that describes what the configuration should be. And it's all embedded into the template itself.
Starting point is 00:27:15 In fact, in this model, you don't ever deal with a worldwide name or LUN masking or exporting. You just say, make me a VMware cluster from these servers. And in fact, the definition of the template is even whether a bootload is created and where the content the bootload should come from so so the the experience is the customer buys the server they rack him power them up connect the ethernet force go to the cloud and say make me a vmware cluster and we just go and create the bootload, grab the content of the VMware boot lens from the inside the data center, lay it out, create all the data lens based on the template, and set up whether they're mirrored or not. Let's say the customer says,
Starting point is 00:27:54 it's a VMware development environment, I don't necessarily need mirroring, I don't need that kind of high availability. Whereas in the VMware production environment, where we do mirroring and sharing, or the guy says it's a MongoDB, which means that no mirroring and no sharing, or the Kubernetes where there's no sharing but mirroring. So all those configuration kind of details are hidden behind the template. You press a button and the volumes get created, they get populated, exported, and you didn't have to know anything about
Starting point is 00:28:25 worldwide names. Mirroring is an option within the pod? Yes, it is an option. I guess I didn't realize that. Are encryption, compression, deduplication also options? No, just mirroring. And there isn't mirroring as an option. It has to do with the fact that Mongo doesn't need it, right?
Starting point is 00:28:45 And in fact, if you ever run MongoDB on something that is like one of these hyper-convergent software-defined storage, which they force mirroring, Mongo does three copies, you do two copies, before you know it, you have six copies of the data, right? So that's what you mean by saying application-centric storage, because you're effectively configuring the pod, the MPod, via application templates. Is that how this would work? That's exactly right. So at GA, we will have a certain set of application templates we've created for pretty popular
Starting point is 00:29:20 type applications. But the customers can take our templates and modify it or create their own template, right? And so the thinking is that somebody in the enterprise IT says, okay, well, VMware in our environment, we want it to have four terabyte LUNs, we want it to be mirrored, we want it to be this way. So they've modified the existing templates and the application guy just says, okay, I'm going to use this template at these 10 servers that just came into the data center, make them a VMware, and off you go.
Starting point is 00:29:47 And from then on, the enterprise IT guy is not involved in the conversation. He just sort of set the standard for the organization, and the application guy just uses it, right? Now, so getting back to where, yeah, I'm trying to understand how the SPU presents its storage as a LUN. Is it, is it iSCSI? Is it a virtual volume? Is it? It is, it is not iSCSI. Otherwise you have to deal with IP addresses and so on. Remember we are inside the server. We're on the PCI bus of the server. Therefore over the PCI bus, we are presenting what the host will see as a SaaS LUN. Now, we picked SaaS as opposed to NVMe in our initial implementation because NVMe as a shared interface is not all that well supported by VMware and Oracle RAC and so on.
Starting point is 00:30:43 So we chose something that is industry standard. Every single OS has drivers for a SaaS controller. That's kind of what we chose to actually do for our initial release. And then once NVMe becomes popular, remember we are on the PCIe bus, we can just turn around and present what looks like an NVMe target to the host.
Starting point is 00:31:04 That's extremely interesting. How are you going to market with this solution? So the solution really is SBUs and a cloud management control plane. Is that what the solution represents? You got it right. And so it's interesting. You talk about go to market. I think that's one of the, you know, when you talk to customers, they just don't want yet another vendor in their data center.
Starting point is 00:31:29 It just got enough of them as it is. So our model is really the best way to think about it is kind of like a rate card motion. Today, when you buy servers, every single server you buy has some sort of a storage controller in it. It could be a 5-inch card, it could be a RAID card. Who are you buying those? The cards are built by Broadcom. They're OEM from these vendors and
Starting point is 00:31:55 they're provided by Dell, Supermicro, HPE, Lenovo. They all have it in their configuration matrix, I guess, right? And that's exactly the model for us. In fact, if you refer back to our press release, we are going to go to market with HPE and Supermicro. We have a third one we are talking to, and we expect to be on board for GA.
Starting point is 00:32:21 So the thinking is, essentially, you buy it directly from Supermicro or HPE and you just call them up and say, hey, you know, instead of the standard, you know, 19, you know, the standard, you know, SAS controller that used to put in, put one of these Medusa cards and off you go, then you don't need to buy a five-way channel card or shared storage. Normally when I buy a SAS card or something, there's a standard cost for that. And I pay $190 or $250, whatever the cost. I don't know what the cost is, sorry. But I pay that once and I get the card and I've got it.
Starting point is 00:32:55 So how does the SP, the SPU is a much more, I'll call it intelligent device than a SAS RAID card? Sure, sure, sure. So you think of it, you know, you have standard, you know, graphics card, and then you have an NVIDIA card, right? You kind of think of them in that lane, right? Yeah, there's a dumb, you know, VGA built into the motherboard of most systems,
Starting point is 00:33:17 but a lot of people opt to buy NVIDIA cards because it does a lot of more stuff. So that's how you think about it, right? So in that model, you buy the card from the OEM, they set the pricing, but there's also a subscription in the cloud for the use of the cloud and all the analytics and API driven and automated software interfaces that you get in the cloud that is sold through the OEM also, right? So the entire solution is purchased through the OEM. And it's pre-mixed, pre-measured based on the parameters you give the OEM?
Starting point is 00:33:57 Well, there's a bit of a negotiation, but frankly, they will set the price on the hardware. Right, right. And there's no capacity charge here because the capacity is actually whatever ships with the servers. Exactly. The OEM and the customer decide,
Starting point is 00:34:10 hey, I need 20 terabytes per server. They get 20 terabytes per server. It just depends on what their needs are. And we don't charge for that capacity. That capacity, the customer just pays for the raw capacity they buy from the server render. It's like software-defined storage at the next level.
Starting point is 00:34:30 I'm trying to figure... It's got hardware is the only problem, right? You can't do this without an SBU hardware in there. But you can't do software-defined storage without hardware either. I mean, every single software-defined storage has a rate controller that they depend on, right? Exactly. Wow.
Starting point is 00:34:49 It's kind of interesting. When a company was to place something like this within their data center, though, does this require you to change over what you already have to support this? Or, for example, would a device like this allow you to connect externally at 25 megabit Ethernet to, say, that NetApp that's sitting over there unused? Would you be able to manage that in some way through these? I think they could coexist in an environment. Right. So we are choosing not to try and manage external storage, partly because then you always end up becoming a lowest common denominator.
Starting point is 00:35:36 There's just enough variations on these devices. And frankly, one of the big problems always is how do you, like, you know, you have to, then you have to talk to the brocade switch or the sysquatch switch to set up the zone and the worldwide names. There's just so much kind of these storage artifacts to try and deal with. In our model, all that stuff sort of disappears. In fact, you know, we don't want to bring that complexity back. We just want to take all that away, right? In other models I've seen, they say, well, we can integrate quite nicely with X, Y, or Z, but it's going to be hobbled in a certain way. Right. And that's the problem. The hobbling, you know, yes, it can integrate, but it gets hobbled, right? But, you know, in our model, you are essentially running it in a, you write exactly as it was intended, right.
Starting point is 00:36:25 Which is, you know, the application guy, you know, owning the entire server and doesn't have to talk to anybody else about it. Right. So, I mean, this thing could actually be, you know, let's say you could, I'm not sure you can order an SPU by itself without the server and all that stuff, but you could, you could almost plug an SPU into a server that has storage and you immediately have a shared storage environment. If you have another SPU card, you could plug it into another server and all of a sudden it's shared.
Starting point is 00:36:55 It could be extremely trialable from that perspective, right? It can be, although we are not going to. So the question of how do you get an SBU, whether you buy it and put it in your server and you buy pre-built with a server, all of the hardware motion is through the OEMs. We are leaving that conversation to the OEM and the customer to decide what's the best way for them to get their hands on. But initially we think. With the server is the way to go. Yeah, yeah. Often the customers, if you think about large enterprises, they are much more comfortable having a pre-built, pre-tested out-of-the-factory config coming in. They just plug in, power on,
Starting point is 00:37:39 and off they go, right? Well, that makes sense. Nobody wants to buy parts and pieces. They just want a single line item SKU. Exactly. Yep. Have you qualified particular storage devices? I was going to ask whether disk is supported, but I'm thinking it's not. Storage devices like third-party?
Starting point is 00:37:58 Well, you know, each vendor has, you know, a fairly elaborate list of storage devices that they support in their servers, not all of which do you have to support from your perspective, but I'm just wondering if there is there a limit like that? By storage devices, you mean drives that connect? Yes. Okay.
Starting point is 00:38:16 Yeah. Yes. So we are supporting SSDs only, no spinning media. Okay. And the answer is yes, we are working with the OEMs and there is a standard set of drives and capacities that they tend to be the most popular that customers buy. And then in conjunction with the OEMs, you can order what's supported. So it's a hardware compatibility list, yes? it is but you know it is mostly because uh you know the stuff on the back end is is pretty um it's not that dependent on the type of drives you attach you can in fact you know
Starting point is 00:38:52 but we are trying to sort of limit the exposure of the customer by sort of having it a certain set of you know drives that are that the oem is comfortable and they're getting the volume connected initially to the card. But the backend interface of the card to the drives is really what a RAID card connected to the drive would be like. So it's all well-tested. We are using industry standard components. We're not designing our own ASIC. So we are pretty comfortable with being able to expand that very rapidly.
Starting point is 00:39:23 You didn't mention storage class memory at all. Is there support for storage class memory? Yeah, storage class memory comes in two flavors. There is a flavor that is a form of a DIMM that sits on the server. So we don't even have access to it. It's just in a different PCIe domain. So that's usually used for caching solution
Starting point is 00:39:41 as a caching solution more than anything else. And then there is the storage class memory that sits on the bus on, on the NVMe. Right. And so we are, we are able to consume NVMe, although, you know,
Starting point is 00:39:55 that's probably not the first use case that comes into mind. And there's a third use case for, for that storage class memory to be used on the SPU for caching of our own metadata and improving the performance. So we are looking at adding that as an option later on. Okay, so it's primarily a metadata caching and data caching solution rather than a pure data storage solution there. Exactly. I mean, the fact is that today, the thing that people pay attention most is, you know, how efficient is compression duplication? Because they're trying to, and they're willing to pay the cost that comes with doing the compression and encryption.
Starting point is 00:40:38 Almost cost because, you know, the dollar per gigabyte is still quite important. And frankly, most applications, when they move from spinning media to SSDs, even with the compression and deduplication tax, it's still plenty fast for a large majority of applications. There are some niche applications, obviously, that do need that 30-mucrosecond latency, but very few applications can really take advantage of that productively. I was going to ask if you have an onboard cache on the SBU. I assume there's something like that. Yeah, we have a 32 gigabyte cache on the SBU, yes.
Starting point is 00:41:17 And you've got non-volatile memory there as well for write buffers? It's non-volatile memory, exactly. Oh, the whole thing. Okay, that's good. That's good. Huh, you know, I don't think I have any other questions. Matt, do you have any last questions for CMAQ? I really don't. CMAQ, is there anything you'd like to say to our listening audience before we close out? One thing I should probably ask is, it's not GA yet. It will be GA in the future. Is that true? Exactly. It is GA third quarter of this year, and we are making great progress. Looking forward to meeting the needs of the customers
Starting point is 00:41:53 with really kind of a completely new take on how to solve this problem in data center. It was a great pleasure talking to you guys. Really good questions. Yeah. Okay. Well, this has been great. Thank you very much, Really good and simple questions. Yeah, okay. Well, this has been great. Thank you very much, CMAC, for being on our show today. Thank you so much. Next time, we'll talk to another system storage technology person. Any questions you want us to ask, please let us know.
Starting point is 00:42:15 And if you enjoy our podcast, tell your friends about it, and please review us on iTunes and Google Play and Spotify, as this will help us get the word out. That's it for now. Bye, Matt. Bye, Ray. Bye, CMAC. Bye. Until next Bye, CMAC. Bye.
Starting point is 00:42:25 Until next time. Good day.
