Grey Beards on Systems - 61: GreyBeards talk composable storage infrastructure with Taufik Ma, CEO, Attala Systems

Episode Date: May 22, 2018

In this episode, we talk with Taufik Ma, CEO, Attala Systems (@AttalaSystems). Howard had met Taufik at last year’s Flash Memory Summit (FMS17) and was intrigued by their architecture, which he thought was a harbinger of future trends in storage. The fact that Attala Systems was innovating with new, proprietary hardware made an interesting discussion, in its own …

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with Howard Marks here. Welcome to the next episode of the Greybeards on Storage monthly podcast, a show where we get greybeard storage and system bloggers to talk with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. This Greybeards on Storage episode was recorded on May 10, 2018. We have with us here today Taufik Ma, CEO of Attala Systems. Taufik, why don't you tell us a little bit about yourself and about Attala Systems? Hey, Ray. Thanks for the introduction and thanks for the opportunity to talk to you today. So, yeah, I'm CEO and co-founder of Attala Systems.
Starting point is 00:00:49 So who are we? Well, we're a cloud infrastructure company with headquarters in San Jose, California. The founders, myself and two others, we come from technology powerhouses such as Intel, Emulex, and two breakout startups called ServerWorks and ServerEngines. So for the last two decades, we've been creating breakout technologies for the data center. We created the highest volume server, storage, and networking chipsets and controllers for the industry.
Starting point is 00:01:18 So we've got a pretty damn good track record. I remember conversations at Network Computing about how, well, everybody's using your Grand Champion chipset, so why are we even looking at servers anymore? That's right. Yeah, so we have a great team. We're very proud of our heritage and what we've done. So now we've got our sights set on private and public cloud infrastructure with the creation
Starting point is 00:01:42 of what we call composable storage infrastructure. So the sort of the genesis of this when we started the company was the observation that Moore's law is sort of slowing down. So to really deliver high performance infrastructure with a focus on storage, you need to start with a new approach that leverages hardware. And in our case, we've leveraged programmable hardware in the form of FPGAs. And what that allows us to do is create a fully composable SSD storage solution that delivers near-native, ultra-low latency performance. But more importantly, we create the ability to arbitrarily compose those resources across the data center. So to attach SSD resources to wherever they're needed on any server for any workload. Completely arbitrarily, completely autonomous with zero touch, one click provisioning.
Starting point is 00:02:41 So for the workload, you get the resources whenever and wherever you need them. So in essence, the solution provides customers with a boost in both agility and, increasingly importantly, resource utilization. So in our minds, the agility and improved use of resources are two sides of the same coin. So our solution effectively delivers that to end users. And as we talk to customers, that's becoming increasingly important given the growth of two key macro factors.
Starting point is 00:03:16 One is obviously the growth of data analytics that are increasingly being done in real time, this is batch, and the associated machine learning and AI applications that are just data-hungry. So that's one macro factor. And then the other macro factor is the advent of DevOps, where applications consistently change as cloud companies pursue the next innovation in their cloud-delivered services. So a little bit woody,
Starting point is 00:03:47 but that's effectively the philosophy and the genesis of what we're doing. So you're creating network shared NVMe storage that operates at almost local latencies? That's correct. So I refer to the use of programmable hardware and the use of FPJs. So that's a critical technology that is very unique to Atala that allows us to do exactly what you said, to provide workloads with the performance and low latency of the newer NVMe SSDs, as if those SSDs were in the same server as the workload, but with all the benefits of having it be pooled and composable sitting on the network.
Starting point is 00:04:36 And these FPGAs are something that you have created? This is proprietary technology to Atala Systems? I mean, the underlying technology I know is available, but go ahead. So the way that we create the solution is we have a partnership with the FPGA business unit of Intel that used to be Altera. So we've been working closely with them since the inception of the company for over two years now. So we take the Intel FPGAs
Starting point is 00:05:10 and then we add our programming to those FPGAs. Ah, okay, I gotcha. In the form of RTL firmware and additional software. And the combination of all of that effectively creates our solution. It's a whole new definition for software-defined storage. Exactly. No, it's sort of, I don't know.
Starting point is 00:05:29 Software runs on the FPGA instead of on the Xeon. Howard is exactly right. We have all of the ability to innovate quickly, very rapidly, by programming FPGAs. The only difference between us and the traditional software-defined storage company is in our case, the programming includes a little bit of RTL. It's not just C code.
Starting point is 00:05:52 Yeah, but you can't do this in the field. You can't reprogram the FPGA sitting out there in somebody's cloud storage environment, right? Or am I wrong? So we do plan to offer occasional updates, just like any software-defined storage vendor would do, in the field.
Starting point is 00:06:11 It's a programmable device. Oh, my Lord. All right, so I think we've done pretty well at the 30,000-foot view, but I'm a geek and I like to dig down. So when we talked at the Flash Memory Summit, we talked primarily about the target implementation of the FPGA. So why don't you talk a little bit about that? So just to expand on your question, so the definition of target for the listeners is the storage node that contains the physical NVMe SSDs.
Starting point is 00:06:47 In that case, the physical manifestation of our solution is effectively a chassis that connects to any Ethernet network. And within that chassis, we have our core FPGA-based technology that on one side connects to the Ethernet network and on the other side connects directly to NVMe SSDs. So the magic is what we do inside of the FPGA in being able to take the network protocol that sits on the Ethernet network and then with hardware speed and hardware based latency convert that to NVMe traffic to the SSDs themselves. So in our case there's no software in that data path so we don't suffer from any
Starting point is 00:07:41 latency because of software interrupts. We don't rely on CPUs palling the network to watch out for new packets. So we have not just slow latency, but high predictability on performance and also low cost and high density because the targets or storage nodes
Starting point is 00:08:03 are effectively serverless. It's just the FPGAs and the SSDs. Well, and there's a PCIe chip switch in there somewhere, isn't there? That's correct. We do use the standard switch to fan out to multiple SSDs. No, wait. There's no CPU. There's no cores.
Starting point is 00:08:19 There's nothing going on in that storage node other than the… It's a JBuff. It's the logical equivalent of a SAS JBod, but way smarter and way faster. I like exactly how it said. Smarter and faster is the critical words. What about
Starting point is 00:08:37 storage protection, RAID, and LUN control, and where does LUN data reside? You do that stuff upstream. Upstream? There is no upstream. There's a server and there's a storage node.
Starting point is 00:08:54 I think there's an Ethernet network in there between them. Right, and so if you have a web-native application where resiliency is at the application level, then it can consume capacity directly from the Attila JBoffs. If you have more traditional applications, you add an SDS layer that does RAID and replication and snapshots and those things. So basically, when we started out, as I said earlier, we set our sights on public and private
Starting point is 00:09:27 cloud infrastructure. And increasingly, the type of data analytics solutions they're deploying have their own data protection included. Capabilities, yeah, yeah, yeah. You think about the triple recollection with HDFS, Cassandra, MySQL, the scale-out databases that create data protection at the application layer. Trying to add this level of protection at the infrastructure layer not only adds complexity, but also adds cost and dilutes the very performance that we set out to deliver. Most significantly, it adds latency.
Starting point is 00:10:07 That's correct. Now, if customers do want a level of data protection for certain applications that don't have their built-in methods, then we have plenty of reference designs and work that we do with customers where we layer objects or file systems on top of our infrastructure, such as Ceph or Luster or GPFS. So we've basically provided, you know, extremely fast, flexible, and highly composable infrastructure, whether it's for newer apps that take care of themselves or for older apps where we layer on top whatever the file system of choice is for that customer.
Starting point is 00:11:00 And for more enterprise, as opposed to HPC-oriented folks, the guys at Caminario have been talking about supporting this kind of architecture. Although I think they're still a few months from delivering a product. So essentially, this is just, I hate to hesitate to say this, but from a server perspective, it looks like a local NVMe SSD or a set of local NVMe SSDs. So in a storage node, how many NVMe SSDs can you support? Is it something like 24 or 30 or something like that?
Starting point is 00:11:35 So we have a variety of different types of storage nodes or targets. The one that we announced most recently was a collaboration that we did with Supermicro. In that case, it's an amazing chassis. It's only one U high, and it includes 32 NVMe SSDs. This is the one with the kind of slide-out drawers in the front to stick the SSDs into, right? Correct. So extremely high density, extremely innovative, but 32 MBME SSDs. So if you do the math, later this year, the SSD vendors will come up with 32 terabyte SSDs. So 32 times 32, it's effectively, you could jam a petabyte of data into a 1U enclosure.
Starting point is 00:12:20 It's just mind boggling. Ray and I have been around long enough that a petabyte customer used to be impressive oh god yeah yeah and the one u closure is uh it's unconceivable inconceivable and i know i'm using a word wrong but i don't think that means what you think that means i know i know uh so i'm so with 32 even 16 terabyte ss, that's still a lot of storage. Do you split it up into, you know, LUNs or how does that work? So we do do what I think Howard referred to as when we met last year as slicing and dicing. So oftentimes a workload doesn't need the full 16 terabytes.
Starting point is 00:13:04 So what we do with our solution is we're able to slice up the SSD and then export different slices to different workloads on different servers. So hence, when I refer to two sides of the same coin, not only are we providing composability and agility, we're also maximizing resource utilization of the very expensive SSDs that customers buy. So the ability to slice and export maximizes the utilization of this very precious resource.
Starting point is 00:13:41 Especially as the smallest SSDs you can buy are starting to get substantially sized. This becomes a big piece of... But we only need 400 gigabytes for a cache in that server. So somewhere, somehow, all this management is being done at, I assume, the server level, but I may be wrong. So, I mean, how has that played out?
Starting point is 00:14:07 So, you know, defining which servers can talk to which devices across the composable storage infrastructure, let's say. So that's a great question, because so far we've talked about the hardware. The critical portion of our solution is the management software. So what we architected is a scale-out approach to managing not just a single box, which has been the traditional approach with storage vendors. Well, you've got a petabyte box. How many people need five? Well, in a lot of cases, because we are targeting private and public cloud, they do look at multiple boxes,
Starting point is 00:14:49 not just within a rack, but across multiple racks. Yeah, and you have to manage blast radius and that kind of thing too. That's correct. Blast radius? I'm sorry. Failure domains. Okay, failure domains.
Starting point is 00:15:02 I got it. That's correct. So our management approach is, we can't just afford to manage a single box. We have to manage a scale-up cluster of storage nodes or targets. So that's effectively what we do. We have a central management entity, which also sits on the network and communicates with all of the endpoints in our storage cluster and effectively orchestrates all the slicing, dicing, and the allocation of these slices to different hosts. And what's even cooler is we do it in such a way that it enables, whether it's a public or private cloud, sort of the cloud use model,
Starting point is 00:15:45 where you'll still have the storage or cloud administrator that sets up policies and manages the physical inventory. But after he or she is done, then what our central management software does is it creates tenant portals and GUIs, where the tenants or the developers, as is often the case, effectively have self-service allocation of SSD resources. They don't need to know anything about the physical infrastructure. They simply say, I need this much storage, this level of QS, and up pops the storage next to their workload. So we've really created this zero-touch provisioning model that's very much in line with the cloud practices these days. So I assume the storage management server has a pretty GUI because everything has a pretty GUI nowadays. But the kind of customers I would imagine consuming what you do and the
Starting point is 00:16:39 people who want the flexibility and agility of Composable, aren't they running DevOps platforms like Chef or Puppet or Ansible? And how do I do things out of one of those so that I instantiate a dev environment just with a script? So with our GUI, of course, everybody has to have a GUI. We also support that zero-touch provision via RESTful APIs, and that is the interface on top of which for cloud data centers is stateful containers. For example, Kubernetes Mesosphere that just lends itself to a composable infrastructure. So as the developers transition from stateless containers to stateful containers, and along with that comes the notion of persistent volumes and scripted applications with their requisite persistent volumes. It just lends itself to
Starting point is 00:17:54 having a fully composable storage infrastructure to feed NVMe-based volumes into the persistent volumes that then get composed into the container applications aren't the challenges with container applications that have state that they you know they come and go so quickly and they scale up from you know 10 to thousands and literally seconds can you support that or sustain that sort of i'll call it uh configure configure mobility i'm not sure if that's the right word, but that's the kind of thing I'm talking about. I mean, the challenge is that containers can really go
Starting point is 00:18:29 from literally 10 to 1,000 container executing in seconds. And if each one has a small slice of an NVMe volume, that's going to require a lot of manageability. That was exactly my point. So it's exactly what we've architected our management and orchestration software to do is to provide that level of composability with a zero-touch interface,
Starting point is 00:18:56 whether it's GUI-based or for this level of integration, it would be REST-based. So the Kubernetes interface that's exposed to the developers who's making all of these massive changes and thousands of changes, to your point, gets automatically fed down to our layer. And in our case, it's fully automated across the thousands of little slices that are arbitrarily mapped across the network as needed
Starting point is 00:19:24 to the different containers as they come and go so go ahead go ahead i'm just trying to figure out how this all plays out so each one of the servers in this environment has a um a source card with an fpga or two uh or or a rocky nick or a rocky nick which is another option, I guess. And each of the storage nodes, of which there can be many, right? We haven't even talked about the size of that thing, have the target version of the card. And there's this management node sitting on the side. And these guys are firing up containers and slicing and dicing NVMe SSDs in real time
Starting point is 00:20:08 with NVMe over fabric latencies. This is amazing. Yeah, that's correct. So a critical part of what we had to do in the management software to enable this level of zero touch provisioning and automation is to build an intelligence on where to pick an SSD, how to slice it, and how to map it to the requesting host. So to do that requires...
Starting point is 00:20:40 Is that across? Does that happen automagically across data nodes too? That's correct. Oh, cool. So I say I need 150 gigabytes, and my initiator is then connected to the target for 150 gigabytes, and that NVMe namespace, and away we go. That is correct. Excellent. All right.
Starting point is 00:21:05 And so something you said, can you split? Let's say I want a 10-terabyte Viome and I've got 1-terabyte SSDs. I guess I could, and Viome is not the right name, I believe, but I could potentially. Well, a LUN in SCSI and an NVMe namespace are kind of equivalent concepts. All right. Back to the question though i've got one terabyte ssds i want a 10 terabyte namespace slash lun you can gang together
Starting point is 00:21:32 multiples of those and and that all works fine so when you refer to uh um ganging together multiple smaller SSDs into a larger SSD. That's, uh, in, in our side, we refer to that as concatenation. Yes, that's it.
Starting point is 00:21:53 That works. So that's, uh, I was going to call it Giuliani, but that's another thing. Sorry. That would be the slice and dice thing. Okay.
Starting point is 00:22:02 So in our, the, the product that the solution that we're currently shipping um we do not support that however that is something we're looking at and it's certainly something you could do in the host volume manager if you were really stuck that's correct so i'm sorry you're shipping the product already yeah so we've been shipping since the beginning of the year. GA is later this quarter, but we've been shipping to customers worldwide for their evals and their POCs. I just can't wait until I get one in the lab to play with. What are you going to do with it?
Starting point is 00:22:39 Well, it's not like I have real applications that need millions of IOPS at 50 microseconds. So that's the question. What's the sort of response times can a person, a server see with this sort of solution? And what's the sort of IOPS per NVMe SSD kind of thing can a server see? What's a realistic number here for reads and or reads rights? So to answer your question, you have to look at what we do slightly differently. It's not that we deliver performance. We don't add any latency. So you need to look at it completely upside down from the way that the
Starting point is 00:23:25 world has seen storage traditionally. You have to look at it from the point of view as the basic storage device, in this case, the NVMe SSD. So the typical 3D NAND SSD has a latency of about 90 to 100 microseconds. That's the raw latency of the device. So with our solution, the ability to put that SSD resource on the network with all of the agility and resource benefits that comes along with that, we only add 5 microseconds of latency. What? Wait a minute. How can you go across an internet in five microseconds it's
Starting point is 00:24:09 modern ethernet switches have latencies in about in the 500 to 700 nanosecond range you know you've got you've got tcp ip stacks all over the damn place you've got you know ah but see that's where the uh the magic comes in and the use of fpgas so firstly we don't use tcp um the network protocol is the industry standard nvme over rdma um otherwise known as rocky um i know it's an acronym of an acronym, but it's RdMA over-converged Ethernet. And the technocrats, we call it ROCKI because it's R-O-C-E. But there is no TCP. It's an RdMA protocol on UDP Ethernet. There is no bulky TCP stack that gets in the way. And the way that we implement this industry standard network protocol, back to the very beginning of this podcast, is we do everything in hardware. So that's how we accomplish this ultra low added latency of only five microseconds.
Starting point is 00:25:20 So if I'm running vSphere with one of your HBAs and kind of the only way to get this performance with vSphere nowadays, the data path is into the FPGA, out onto the 100 gig Ethernet, into the FPGA to the SSD, and without all of the abstraction layers that we'd normally have in a storage system. That's exactly correct, Howard. And it's worked with vSphere today? That's correct. Yeah, we sort of glossed over what we do on the host side. We talked on the target side, the storage node side. Yes, yes. Now, on the host side, we do provide the customer with options
Starting point is 00:26:03 for what type of host adapter they use. Specifically, we have a host adapter that's FPGA-based, and we refer to it as a host NVMe over fabric adapter, or HNA for short. At the same time, we also support a standard RDMA NIC. And Mellanox is probably the best known for creating RDMA NICs. So we do a ton of testing with Mellanox RDMA NICs. But we support either one, whether it's our own host adapter, HNA, or a standard RDMA NIC. And your HNA looks like an NVMe SSD or a set of NVMe SSDs to the host. Correct. So that is the unique part of our HNA, which is
Starting point is 00:26:50 one of the benefits versus using a standard ARNIC. In our case, with our host adapter, the HNA, in the FPGA, we do full virtualization of the NVMe SSD as seen by the host. So when the host server scans the PCI bus, our adapter reports itself not as a networking card, but we report ourselves as a NVMe SSD. But I'd still need another network card for network stuff.
Starting point is 00:27:23 Correct. You still need a standard network traffic. But we take care of all of the storage network and traffic for NVMe. And the OS, the kernel of the hypervisor, simply uses standard NVMe drivers. We don't require any special NVMe drivers, nor do we require any host management agent. So we are completely agentless and zero footprint as far as the host is concerned. We take care of all the virtualization inside of our host adapter. Yeah, but some of these, I'll call them servers, have a limitation as to the number of NVMe drives they can support and things of that nature.
Starting point is 00:28:07 And even though you're virtualizing, let's say, I don't know, 32 or however many NVMe SSDs, I mean, they're going to have some problem from a configuration perspective, seeing that you've got 64 NVMe SSDs in a server that only really supports 32. Right? Or am I wrong? You're not quite correct on that. in a server that only really supports 32, right? Or am I wrong? You're not quite correct on that. Okay, good. You can get super big server chassis today with tons of NVMe SSDs,
Starting point is 00:28:48 and the kernels and the hypervisors do support that number of SSDs. The limits on number of NVMe SSDs per server are really around PCIe lanes and hardware, not how many logical NVMe SSDs can you have. Correct. And then there's also the newer NVMe SSDs actually do support multiple namespaces within a single physical SSD. So the kernels and the hypervisors have grown up to support those multiple namespaces as well. So we take advantage of that industry standard support.
Starting point is 00:29:19 So, Tufek, where are you trying to sell this? I mean, are you selling us to OEMs? Are you selling us to end users direct? I mean, you mentioned HPC. You mentioned public cloud and private cloud. But I'm just trying to figure out where's the game here? Where's the market that you're going after? So private and public cloud I know is a very broad terms.
Starting point is 00:29:42 So obviously we do talk to companies that are providing, you know, seeking to provide infrastructure as a service offerings, but increasingly it's also software as a service companies that fall in this public cloud space folks that are, you know, buying super fast composable infrastructure for reservations, for e-commerce, and you can imagine the other types of companies that fall in that category.
Starting point is 00:30:16 And then it's a blurry line, but private cloud these days, especially with this level of storage performance, also does trickle over into the HPC side of the world. So HPC now is not just the traditional national labs doing weather modeling and atomic palm modeling. It's also companies doing drug research and pharma. There's oil and gas companies. Media and entertainment companies doing video post-processing. So those all sort of fall into this private cloud spectrum that's sort of blurring the line between private cloud and high-performance computing.
Starting point is 00:31:05 So that hopefully gives you a flavor of the type of customers that we're engaged with. Yeah, it's not the impression I got when we talked at FMS. I thought you guys were trying to be the Zyra Techs of the next wave of storage. So what I just described is the end customer categories. Now, the way in which we reach those end customers, some of them we work with directly, others we work with partners, such as the Supermicros of the world. Right. So most of that stuff didn't seem like it was a vSphere kind of environment, per se. It's still mostly customers that have real-time, high data volume, data bandwidth
Starting point is 00:31:43 requirements. I would say when you mentioned the e-commerce guys, the reservation guys, anybody that's doing high levels of transactions fit into that framework. They're probably, well, I don't know if they're using VMware kinds of capabilities or not. They might be, I suppose. I think VMware would argue with you i think i'm sure they would i've i've i've never had a conversation with a vendor where they didn't argue that their proper view of the market was that that everybody should be buying our product yeah i agree i agree uh but i mean in general the the highest performance stuff doesn't run in virtualization typically things that yeah the things that we run be
Starting point is 00:32:33 that we have a lot of run in virtualization right um but you know the way i see it nvme maybe three years from now maybe five years from now and I have a bet going with Jmetz about this, is going to replace SCSI as the lingua franca for enterprise and high-end storage. Three or five years? Yeah, in terms of what products are hitting the market, what people are buying that's new. SCSI is going to stick around on spinning disks as long as spinning disks do. Yeah. That's amazing. Tawfiq, you mentioned multi-tenant and tenant portals.
Starting point is 00:33:20 And whenever we talk about multi-tenancy, the concept of noisy neighbors comes to mind because I used to live in Greenwich Village and we had noisy neighbors. Is there any QoS built into this absolutely um i mean the whole point of creating so much performance it's like the uh what's the quote from um spider-man with great power comes great responsibility um in this case with this much performance it's it's meant to be shared um with this much performance our responsibility is to share it, but to share it responsibly. So by that token, we've designed in hardware-based QoS for every active namespace or L during the provisioning process, along with that allocation, we do assign QS controls to preclude exactly the concern, Howard, that you referred to, the noisy neighbor issue. And so are those IOP or bandwidth throttles or something more sophisticated? Response time things or all of the above? IOPs and throughput controls. Limits or something? sophisticated? Response time things or all the above? IOPS and throughput controls.
Starting point is 00:34:27 Limits or something? Limits for now. Now, having said that, in our world, everything has two sides to the coin. On one side, we have the QoS controls for enforcement, but at the same time, we do have hardware-based monitoring on every active namespace or volume. So we track latency, IOPS, and throughput. And we collect all of that data using IoT methods from across the cluster. And we stream that performance data into the centralized management entity.
Starting point is 00:35:08 And we actually put the data into a database so that the operator, the administrator has a record of the performance, the latency ops and throughput of every active namespace in the cluster. So it's actually a pretty critical capability, especially the latency monitoring. Because the infamous situation is the developer has an app that's been put into production, and then 2 a.m. in the morning, there's a slowdown on the app. The developer calls the infrastructure owner or administrator,
Starting point is 00:35:47 and they're pointing fingers. Always blame the network guy. And the infrastructure guy says... I thought it was always the storage guy, but yeah, go ahead. I am the storage guy, so I always blame the network guy. I got you. So in our case, we accelerate what some companies call time to innocence. Time to innocence? I like that term.
Starting point is 00:36:14 It's this active volume that's in dispute. Pull up the historical data and more importantly, latency. Because IOPS and throughput, you could argue it's caused by the application or the infrastructure, but you can't argue latency. So the latency gives a clear direction as to where's the issue coming from. So we'll have a historical record of that. And since this database is on high-performance storage, we can do all sorts of analysis. That's correct.
Starting point is 00:36:42 He didn't say it was on NVMeme storage but he said it was a database but uh so what about else would i yeah where else would you put it i wouldn't necessarily put a database of monitoring latencies and iops and that sort of stuff that stream real time it may need to be on high performance storage i don't know based on what's going on i'd have to you could you could architect that in a number of ways but and there's some question here that was there. Okay, what about storage class memories and Optane and stuff like that? Are you guys ready for that? Yeah, in fact, we've done demos using Optanes.
Starting point is 00:37:16 And I think we claimed, and someone has yet to refute it, we claimed the lowest network storage performance or lowest latency ever. We download 16 microseconds of latency. So it's our 5 microseconds plus the Optane's 10 plus a little bit on the switch. So we did a demo that had 16 microseconds of latency going across the network, and we claimed it was the fastest ever 16 microseconds of latency to get an iowa operation done correct uh you could go faster just build a custom dram based ssd for you can get it down to 11 yeah Yeah. Jesus, I can't believe it.
Starting point is 00:38:05 I can't believe it. You know, 250 microseconds, 500 microsecond latencies were all the rage not a year and a half ago. 150 was a rage not a year ago. It's a whole new world, right? This is an order of magnitude improvement on top of all that. Yeah, it's just staggering. If you guys start selling this to high-frequency traders, I think they really need this stuff. We are talking to financial folks.
Starting point is 00:38:36 Good, good, good. Well, we all know they can afford it. Yeah. We haven't talked about price, but and so in your environment, you've got the HNA and you've got the target hardware and the management software. I mean, how are you pricing this? I guess that should be the question to an end user. So in certain configurations, we can actually get to a little over a dollar per gigabyte. Whoa. Which is the typical cost of an all-flash array nowadays.
Starting point is 00:39:12 It's not the typical cost of an all-flash array nowadays. And that's deduplication and compressed and thin-perfection and all that junk. Deduplicated and compressed and discounted up to that. This has got none of that this has got none of that these are device level network storage kinds of things a dollar per gigabyte and user price is extremely attractive yep at especially at this performance you can't can't approach this performance without hundreds of thousands of dollars of controllers surrounding this thing. That's right.
Starting point is 00:39:47 And even then, it's not even close. Am I wrong? You're absolutely correct. So going back to the very beginning of the podcast, we've, for the last two decades, created disruptive technologies for the data center. And that's what we believe we've done here. God. I think I would agree i'm on record i think what you're doing is the future so i'm glad to hear that it's at the state now
Starting point is 00:40:14 where things are pretty much ready for prime time um the one real question i have though is the nvme over fabric spec is a little lacking when it comes to some enterprise things like boot from SAN, which for your uses probably isn't that big a deal, but multi-pathing isn't really well defined. Are you doing some of your own magic to take care of that or are we just expecting to deal with it upstream? So we're the only, of all the folks that are participating in the standards, NVMe over Fabric standards, we're the only vendor that has the full host side adapter. So because we expose virtual NVMe devices to the host,
Starting point is 00:41:03 we actually get around a lot of the issues that you refer to. So we have standard MPIO running in our lab today because, you know, for us, we expose a standard MME device to the host. Right. And then you have your own solution for how to do the multi-pathing between the HNA and the data node. Well, we just create two different paths, and then the existing MPIO layer in whatever the kernel is takes care of the load balancing and the failover as required. So we just create two independent paths that are extremely reliable and low latency
Starting point is 00:41:41 from the two virtual NVMe devices that sit on our host adapter all the way to a shared NVMe SSD living in the target. And the NVMe SSDs are dual-path SSDs in that configuration? So that's another one of our innovations. So we support the dual-port NVMe SSDs to get the full redundancy all the way down to the device level. But there's a lot of customers who don't want to pay a premium for dual-port SSDs because they do still have a fairly healthy premium on them versus the single ports. So what we're able to... I was hoping that premium would disappear.
Starting point is 00:42:26 Yeah, I think it might take a while. It's funny, I talked to the NVMe controller folks, you know, the Marvell's and IDT's, and they say, well, it's the same controller. And so it's just a premium there because people will pay for it that's exactly which annoys me so but what we do which is another cool uh one of our innovations is we're able to take a single port ssd and create two different mpi paths so you don't get the last
Starting point is 00:43:00 inch worth of redundancy but what customers care about most is the is you know of redundancy. But what customers care about most is the redundancy on the network, right? Someone tripping over a network cable. So we're able to provide that level of MPI across the network, even using a single port SSD. Does having all four lanes on a single port SSD as opposed to two ports of two lanes give you, do you see a bandwidth advantage to that or are we not seeing enough demand for it to matter?
Starting point is 00:43:34 So today, predominantly it's still single port NVMe SSDs in terms of shipments. And that's why there's a premium on the dual ports. So, yeah, when it comes to performance, if you put redundancy aside for a second, it's actually easier to extract performance from a single PCIe x4 versus 2x2s. Right. Well, okay, I've got one last question. I think we've got...
Starting point is 00:44:04 And all of a sudden sudden the last question is gone what was oh yeah hot plug so the nvme ssds are they hot pluggable in this configuration yeah absolutely oh my god you got your cluster of storage nodes and um there's an administrator sitting in front of the the console the technician goes out to the data center plugs in an ssd and it pops up as a new ssd in the physical inventory of ssds that sit on the network and it becomes then available as yet another resource that is then available as yet another resource that is then allocatable to the tenants. Yeah, I love the storage guy's skepticism about hot plug PCIe. That's been a feature that's been in the spec and in servers for at least three or four server generations.
Starting point is 00:44:57 Yeah, but... When we're talking about an add-in card, I've never known anybody with the guts to pull the server out at the end of its rails, open the cover, and change a card. Yeah. U.2 makes that a lot simpler. I guess. All right, so this has been great. Howard, do you have any last questions for Tawfik?
Starting point is 00:45:15 No, I got it. Tawfik, do you have anything you'd like to say to our listening audience before we sign off? No, just look forward to working with any listeners that want to come to talk to us. We've got a lot of cool things going on and we love engaging with the customers to hear their problems and hopefully solve them. Why don't you mention a couple of the events
Starting point is 00:45:38 you're going to be at? So we'll definitely be at the Flash Memory Summit coming up. Also at the Super Computer summit coming up. Um, um, also at, uh, uh, the supercomputer show, SE18, that's a little bit later in the year. Um, VL, VM world. Um, we might go to, uh, for those, uh, listeners in EMEA, we might be at the supercomputer show in, in, uh, Frankfurt coming up. Um, but that's just a couple of examples of some of the shows we're going to be at the Super Commuter Show in Frankfurt coming up.
Starting point is 00:46:06 But that's just a couple of examples of some of the shows we're going to be in. Very good. Well, this has been great. Thank you very much, Tawfiq, for being on our show today. Thank you for the opportunity. Next month, we'll talk to another system storage technology person. Any questions you want us to ask, please let us know.
Starting point is 00:46:20 And if you enjoy our podcast, tell your friends about it and please review us on iTunes as this will help get the word out. That's it for now. Bye, Howard. Bye, Ray. Until next time.
