Grey Beards on Systems - 61: GreyBeards talk composable storage infrastructure with Taufik Ma, CEO, Attala Systems
Episode Date: May 22, 2018. In this episode, we talk with Taufik Ma, CEO, Attala Systems (@AttalaSystems). Howard had met Taufik at last year's Flash Memory Summit (FMS17) and was intrigued by their architecture, which he thought was a harbinger of future trends in storage. The fact that Attala Systems was innovating with new, proprietary hardware made for an interesting discussion in its own right.
Transcript
Hey everybody, Ray Lucchesi here with Howard Marks here.
Welcome to the next episode of the GreyBeards on Storage monthly podcast,
a show where we get GreyBeards storage and system bloggers to talk with storage and system vendors
to discuss upcoming products, technologies, and trends affecting the data center
today. This GreyBeards on Storage episode was recorded on May 10, 2018. We have with us
here today Taufik Ma, CEO of Attala Systems. Taufik, why don't you tell us a little bit about yourself
and about Attala Systems? Hey, Ray. Thanks for the introduction and thanks for the opportunity to talk to you today.
So, yeah, I'm CEO and co-founder of Attala Systems.
So who are we?
Well, we're a cloud infrastructure company with headquarters in San Jose, California.
The founders, myself and two others, we come from technology powerhouses such as Intel, Emulex,
and two breakout startups called ServerWorks and ServerEngines.
So for the last two decades, we've been creating breakout technologies
for the data center.
We created the highest volume server storage
and networking chipsets and controllers for the industry.
So we've got a pretty damn good track record.
I remember conversations at Network Computing
about how, well, everybody's using your Grand
Champion chipset, so why are we even looking at servers anymore?
That's right.
Yeah, so we have a great team.
We're very proud of our heritage and what we've done.
So now we've got our sights set on private and public cloud infrastructure with the creation
of what we call composable storage infrastructure.
So sort of the genesis of this, when we started the company, was the observation that Moore's Law is sort of slowing down. So to really deliver high-performance infrastructure
with a focus on storage, you need to start with a new approach that leverages hardware.
And in our case, we've leveraged programmable hardware in the form of FPGAs.
And what that allows us to do is create a fully composable SSD storage solution that delivers near-native, ultra-low latency performance.
But more importantly, we create the ability to arbitrarily compose those resources across the data center.
So to attach SSD resources to wherever they're needed on any server for any workload.
Completely arbitrarily, completely autonomous with zero touch, one click provisioning.
So for the workload, you get the resources whenever and wherever you need them.
So in essence, the solution provides customers with a boost in both agility
and, increasingly importantly, resource utilization.
So in our minds, the agility and improved use of resources
are two sides of the same coin.
So our solution effectively delivers that to end users.
And as we talk to customers, that's becoming increasingly important given the growth of
two key macro factors.
One is obviously the growth of data analytics that are increasingly being done in real time
versus batch, and the associated machine learning and AI applications
that are just data-hungry.
So that's one macro factor.
And then the other macro factor is the advent of DevOps,
where applications consistently change
as cloud companies pursue the next innovation
in their cloud-delivered services. So that's a little bit wordy,
but that's effectively the philosophy and the genesis of what we're doing.
So you're creating network shared NVMe storage that operates at almost local latencies?
That's correct. So I referred to the use of programmable hardware and the use of FPGAs.
So that's a critical technology that is very unique to Atala that allows us to do exactly
what you said, to provide workloads with the performance and low latency of the newer NVMe SSDs,
as if those SSDs were in the same server as the workload,
but with all the benefits of having it be pooled
and composable sitting on the network.
And these FPGAs are something that you have created?
This is proprietary technology to Atala Systems?
I mean, the underlying technology I know is available, but go ahead.
So the way that we create the solution is we have a partnership
with the FPGA business unit of Intel that used to be Altera.
So we've been working closely with them since the inception of the company
for over two years now.
So we take the Intel FPGAs
and then we add our programming to those FPGAs.
Ah, okay, I gotcha.
In the form of RTL firmware and additional software.
And the combination of all of that
effectively creates our solution.
It's a whole new definition for software-defined storage.
Exactly.
No, it's sort of, I don't know.
Software runs on the FPGA instead of on the Xeon.
Howard is exactly right.
We have all of the ability to innovate quickly, very rapidly,
by programming FPGAs.
The only difference between us and the traditional software-defined storage company
is in our case,
the programming includes a little bit of RTL.
It's not just C code.
Yeah, but you can't do this in the field.
You can't reprogram the FPGA
sitting out there
in somebody's cloud storage environment, right?
Or am I wrong?
So we do plan to offer occasional updates,
just like any software-defined storage vendor would do,
in the field.
It's a programmable device.
Oh, my Lord.
All right, so I think we've done pretty well
at the 30,000-foot view,
but I'm a geek and I like to dig down.
So when we talked at the Flash Memory Summit,
we talked primarily about the target implementation of the FPGA. So why don't you talk a little bit
about that? So just to expand on your question, so the definition of target for the listeners is the storage node that contains the physical NVMe SSDs.
In that case, the physical manifestation of our solution is effectively a chassis
that connects to any Ethernet network.
And within that chassis, we have our core FPGA-based technology that on one side connects to the Ethernet network
and on the other side connects directly to NVMe SSDs.
So the magic is what we do inside of the FPGA, in being able to take the network protocol
that sits on the Ethernet network and then, with hardware speed and
hardware-based latency, convert that to NVMe traffic to the SSDs themselves.
So in our case, there's no software in that data path, so we don't suffer from any
latency because of software interrupts.
We don't rely on CPUs
polling the network
to watch out for new packets.
So we have not just low latency,
but high predictability on performance
and also low cost and high density
because the targets or storage nodes
are effectively serverless.
It's just the FPGAs and the SSDs.
Well, and there's a PCIe switch chip in there somewhere, isn't there?
That's correct.
We do use the standard switch to fan out to multiple SSDs.
No, wait.
There's no CPU.
There's no cores.
There's nothing going on in that storage node other than the…
It's a JBOF.
It's the logical equivalent of a SAS JBOD,
but way smarter and way faster.
I like exactly how he
said it. Smarter and faster
are the critical words.
What about
storage protection, RAID,
and
LUN control,
and where does LUN data reside?
You do that stuff upstream.
Upstream?
There is no upstream.
There's a server and there's a storage node.
I think there's an Ethernet network in there between them.
Right, and so if you have a web-native application
where resiliency is at the application level,
then it can consume capacity directly from the Attala JBOFs.
If you have more traditional applications,
you add an SDS layer that does RAID and replication
and snapshots and those things.
So basically, when we started out, as I said earlier, we set our sights on public and private
cloud infrastructure.
And increasingly, the type of data analytics solutions they're deploying have their own
data protection included.
Capabilities, yeah, yeah, yeah.
You think about the triple replication with HDFS, Cassandra, MySQL, the scale-out databases that create data protection at the application layer.
Trying to add this level of protection at the infrastructure layer not only adds complexity,
but also adds cost and dilutes the very performance that we set out to deliver.
Most significantly, it adds latency.
That's correct.
Now, if customers do want a level of data protection for certain applications that don't have their own built-in methods,
then we have plenty of reference designs and work that we do with customers where we layer
objects or file systems on top of our infrastructure,
such as Ceph or Lustre or GPFS.
So we've basically provided, you know,
extremely fast, flexible, and highly composable infrastructure,
whether it's for newer apps that take care of themselves or for older apps where we layer on top whatever the file system of choice is for that customer.
And for more enterprise, as opposed to HPC-oriented folks,
the guys at Kaminario have been talking about supporting this kind of architecture.
Although I think they're still a few months from delivering a product.
So essentially, this is just, and I hesitate to say this,
but from a server perspective, it looks like a local NVMe SSD
or a set of local NVMe SSDs.
So in a storage node, how many NVMe SSDs can you support?
Is it something like 24 or 30 or something like that?
So we have a variety of different types of storage nodes or targets.
The one that we announced most recently was a collaboration that we did with
Supermicro. In that case, it's an amazing chassis. It's only one U high, and it includes 32 NVMe SSDs.
This is the one with the kind of slide-out drawers in the front to stick the SSDs into, right?
Correct. So extremely high density, extremely innovative, but 32 NVMe SSDs.
So if you do the math, later this year, the SSD vendors will come up with 32 terabyte
SSDs.
So 32 times 32, it's effectively, you could jam a petabyte of data into a 1U enclosure.
It's just mind boggling.
Ray and I have been around long enough that a petabyte customer used
to be impressive.
Oh, God, yeah.
And in a 1U enclosure, it's unconceivable. Inconceivable. And I know I'm using the word wrong.
I don't think that means what you think it means.
I know, I know.
So with 32, even 16-terabyte, SSDs, that's still a lot of storage.
Do you split it up into, you know, LUNs or how does that work?
So we do what I think Howard referred to, when we met last year, as slicing and dicing.
So oftentimes a workload doesn't need the full 16 terabytes.
So what we do with our solution is we're able to slice up the SSD
and then export different slices to different workloads on different servers.
So hence, when I refer to two sides of the same coin,
not only are we providing composability and agility,
we're also maximizing resource utilization
of the very expensive SSDs that customers buy.
So the ability to slice and export
maximizes the utilization of this very precious resource.
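To make the slice-and-dice idea concrete, here is a minimal Python sketch of carving one large SSD into per-workload slices. The class and field names are hypothetical illustrations of the concept, not Attala's actual management API.

```python
# Hypothetical sketch: carving one large NVMe SSD into namespaces ("slices")
# exported to different hosts, so no workload has to consume a whole drive.
from dataclasses import dataclass, field

@dataclass
class Ssd:
    serial: str
    capacity_gb: int
    allocated_gb: int = 0
    slices: list = field(default_factory=list)

    def carve_slice(self, size_gb: int, host: str) -> dict:
        """Export a slice of this SSD to a host, if capacity remains."""
        if self.allocated_gb + size_gb > self.capacity_gb:
            raise ValueError(f"{self.serial}: only "
                             f"{self.capacity_gb - self.allocated_gb} GB free")
        ns = {"ssd": self.serial, "size_gb": size_gb, "host": host}
        self.slices.append(ns)
        self.allocated_gb += size_gb
        return ns

# One 16 TB SSD feeding three different servers:
ssd = Ssd(serial="nvme-0", capacity_gb=16000)
ssd.carve_slice(400, host="web-01")    # e.g., the 400 GB cache case above
ssd.carve_slice(2000, host="db-07")
ssd.carve_slice(150, host="dev-42")
print(f"{ssd.allocated_gb} of {ssd.capacity_gb} GB in use")
```

The point of the sketch is the utilization argument: three workloads share one expensive device instead of each stranding most of a dedicated drive.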
Especially as the smallest SSDs you can buy
are starting to get substantially sized. This becomes a big piece
of... But we only need 400 gigabytes for a cache
in that server.
So somewhere, somehow, all this management is being
done at, I assume, the server level,
but I may be wrong.
So, I mean, how has that played out?
So, you know, defining which servers can talk to which devices across the composable storage infrastructure, let's say.
So that's a great question, because so far we've talked about the hardware.
The critical portion of our solution is the management software.
So what we architected is a scale-out approach to managing not just a single box,
which has been the traditional approach with storage vendors.
Well, you've got a petabyte box. How many people need five?
Well, in a lot of cases, because we are targeting private and public cloud,
they do look at multiple boxes,
not just within a rack, but across multiple racks.
Yeah, and you have to manage blast radius
and that kind of thing too.
That's correct.
Blast radius?
I'm sorry.
Failure domains.
Okay, failure domains.
I got it.
That's correct.
So our management approach is, we can't afford to just manage a single box.
We have to manage a scale-out cluster of storage nodes or targets.
So that's effectively what we do.
We have a central management entity, which also sits on the network and communicates with all of the endpoints in our storage cluster and effectively
orchestrates all the slicing, dicing, and the allocation of these slices to different hosts.
And what's even cooler is we do it in such a way that it enables, whether it's a public or private cloud, sort of the cloud use model,
where you'll still have the storage or cloud administrator that sets up policies and manages the physical inventory.
But after he or she is done, then what our central management software does is it creates tenant portals and GUIs,
where the tenants or the developers, as is often the case,
effectively have self-service allocation of SSD resources.
They don't need to know anything about the physical infrastructure.
They simply say, I need this much storage, this level of QoS, and up pops the storage next to their workload.
So we've really created this zero-touch provisioning model that's very much in line with cloud practices these days.
So I assume the storage management server has a pretty GUI because everything has
a pretty GUI nowadays. But the kind of customers I would imagine consuming what you do and the
people who want the flexibility and agility of Composable, aren't they running DevOps platforms like Chef or Puppet
or Ansible? And how do I do things out of one of those so that I instantiate a dev environment
just with a script? So with our GUI, of course, everybody has to have a GUI.
We also support that zero-touch provisioning via RESTful APIs, and that is the interface on top of which, for cloud data centers, sit stateful containers.
For example, Kubernetes or Mesosphere, which just lend themselves to a composable infrastructure.
So as the developers transition from stateless containers to stateful containers,
and along with that comes the notion of persistent volumes
and scripted applications with their requisite persistent volumes. It just lends itself to
having a fully composable storage infrastructure to feed NVMe-based volumes into the persistent
volumes that then get composed into the container applications.
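As a rough illustration of what that zero-touch REST provisioning might look like from a script or a container-platform driver, here is a hedged Python sketch. The endpoint URL, JSON fields, and token are assumptions; the conversation confirms only that a RESTful API with capacity and QoS parameters exists.

```python
# Hypothetical sketch of self-service provisioning against a REST API like
# the one described: ask for capacity plus QoS, get a namespace attached to
# a host. None of these names are Attala's real API.
import requests

MGMT = "https://attala-mgmt.example.com/api/v1"   # hypothetical URL
TOKEN = {"Authorization": "Bearer <tenant-token>"}

def provision_volume(size_gb: int, max_iops: int, host: str) -> str:
    """Ask the management plane for an NVMe slice next to a workload."""
    resp = requests.post(f"{MGMT}/volumes", headers=TOKEN, json={
        "size_gb": size_gb,              # how much storage the tenant needs
        "qos": {"max_iops": max_iops},   # limit to preclude noisy neighbors
        "attach_to": host,               # server where the namespace appears
    })
    resp.raise_for_status()
    return resp.json()["volume_id"]

vol = provision_volume(150, max_iops=100_000, host="k8s-node-3")
print("namespace provisioned:", vol)
```

A container-storage driver would call something like this on every persistent-volume claim, which is why the fully automated, agentless model matters at container scale.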
Aren't the challenges with container applications that have state that they, you know, come and
go so quickly, and they scale up from, you know, 10 to thousands in literally seconds? Can you support
that or sustain that sort of, I'll call it, configuration mobility? I'm not sure if
that's the right word,
but that's the kind of thing I'm talking about.
I mean, the challenge is that containers can really go
from literally 10 to 1,000 containers executing in seconds.
And if each one has a small slice of an NVMe volume,
that's going to require a lot of manageability.
That was exactly my point.
So it's exactly what we've architected
our management and orchestration software to do
is to provide that level of composability
with a zero-touch interface,
whether it's GUI-based
or for this level of integration,
it would be REST-based.
So the Kubernetes interface that's exposed to the developers who's making all of these massive
changes and thousands of changes, to your point,
gets automatically fed down to our layer. And
in our case, it's fully automated across the thousands of little
slices that are arbitrarily mapped across the network as needed
to the different containers
as they come and go.
Go ahead. Go ahead.
I'm just trying to figure out how this all plays out. So each one of the servers in this environment has a host adapter card with an FPGA or two,
or a RoCE NIC, which is another option, I guess.
And each of the storage nodes, of which there can be many, right?
We haven't even talked about the size of that thing, have the target version of the card.
And there's this management node sitting on the side.
And these guys are firing up containers and slicing and dicing NVMe SSDs in real time
with NVMe over fabric latencies.
This is amazing.
Yeah, that's correct. So a critical part of what we had to do
in the management software to enable this level of zero touch
provisioning and automation
is to build an intelligence on where to pick an SSD, how to slice it, and how to map it
to the requesting host.
So to do that requires...
Is that across?
Does that happen automagically across data nodes too?
That's correct.
Oh, cool.
So I say I need 150 gigabytes, and my initiator is then connected to the target for 150 gigabytes, and that NVMe namespace, and away we go.
That is correct.
Excellent.
All right.
And so something you said, can you split?
Let's say I want a 10-terabyte volume and I've got 1-terabyte SSDs.
I guess I could, and volume is not the right name, I believe,
but I could potentially.
Well, a LUN in SCSI and an NVMe namespace are kind of equivalent concepts.
All right.
Back to the question, though. I've
got 1-terabyte SSDs, I want a 10-terabyte namespace slash LUN. You can gang together
multiples of those and that all works fine?
So when you refer to ganging together multiple smaller SSDs into a larger SSD,
on our side, we refer to that as concatenation.
Yes, that's it.
That works.
So that's... I was going to call it julienne, but that's another thing.
Sorry.
That would be the slice-and-dice thing.
Okay. So in the product, the solution, that we're currently shipping, we do not support that. However, that is something we're looking at.
And it's certainly something you could do in the host volume manager if you were really stuck.
That's correct.
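For the curious, concatenation reduces to a simple address mapping. Below is a toy Python sketch, under the assumption of fixed-size 1 TB members, of how a logical block address in a 10-terabyte concatenated namespace could be resolved to a member device; as noted above, the shipping product does not do this yet, so this is purely conceptual.

```python
# Toy model of concatenation: ten 1 TB slices presented as one namespace.
# A logical block address (LBA) maps to (member device, local LBA).
BLOCKS_PER_DEVICE = 1_000_000_000_000 // 512   # ~1 TB of 512-byte blocks

def locate(logical_lba: int, devices: list[str]) -> tuple[str, int]:
    """Map an LBA in the concatenated volume to a member device."""
    idx, local_lba = divmod(logical_lba, BLOCKS_PER_DEVICE)
    if idx >= len(devices):
        raise IndexError("LBA beyond end of concatenated volume")
    return devices[idx], local_lba

members = [f"nvme-{i}" for i in range(10)]          # ten 1 TB SSD slices
print(locate(3 * BLOCKS_PER_DEVICE + 42, members))  # -> ('nvme-3', 42)
```

This is exactly the bookkeeping a host volume manager does when you stripe or concatenate devices yourself, which is Howard's fallback suggestion.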
So, I'm sorry, you're shipping the product already?
Yeah, so we've been shipping since the beginning of the year. GA is later this quarter, but we've been shipping to customers worldwide for their evals and their POCs.
I just can't wait until I get one in the lab to play with.
What are you going to do with it?
Well, it's not like I have real applications that need millions of IOPS at 50 microseconds.
So that's the question.
What sort of response times can a person, a server, see with this sort of solution?
And what sort of IOPS per NVMe SSD can a server see?
What's a realistic number here for reads and/or writes?
So to answer your question, you have to look at what we do slightly differently.
It's not that we deliver performance; it's that we don't add any latency. So you need to look at it
completely upside down from the way that the
world has seen storage traditionally. You have to look at it from the point of view as the basic
storage device, in this case, the NVMe SSD. So the typical 3D NAND SSD has a latency of about
90 to 100 microseconds. That's the raw latency of the device. So with our solution,
the ability to put that SSD resource on the network
with all of the agility and resource benefits that
comes along with that, we only add 5 microseconds of latency.
What?
Wait a minute. How can you go across an Ethernet in five microseconds?
Modern Ethernet switches have latencies in the 500 to 700 nanosecond range.
You've got TCP/IP stacks all over the damn place, you've got, you know...
Ah, but see, that's where the magic comes in, and the use of FPGAs.
So firstly, we don't use TCP. The network protocol is the industry-standard NVMe over
RDMA, otherwise known as RoCE. I know it's an acronym of an acronym, but it's RDMA over Converged Ethernet. And
we technocrats, we call it RoCE because it's R-O-C-E. But there is no TCP. It's an RDMA protocol
on UDP Ethernet. There is no bulky TCP stack that gets in the way. And the way that we implement this industry-standard network
protocol, back to the very beginning of this podcast, is we do everything in hardware.
So that's how we accomplish this ultra-low added latency of only five microseconds.
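The arithmetic behind those claims is easy to check with the numbers quoted in the episode, as in this back-of-envelope Python sketch (the figures are the transcript's, not independent measurements):

```python
# Latency budget using the episode's own numbers: ~90-100 us raw 3D NAND SSD
# latency, ~5 us added by the hardware data path, ~0.5-0.7 us per switch hop.
raw_ssd_us = 90.0        # low end of the quoted 3D NAND latency
fabric_added_us = 5.0    # claimed added latency of the FPGA data path
switch_hop_us = 0.7      # modern Ethernet switch, high end of quoted range

total_us = raw_ssd_us + fabric_added_us + switch_hop_us
print(f"NAND case: ~{total_us:.1f} us end to end")       # ~95.7 us

# With Optane's ~10 us media latency instead of NAND, the same arithmetic
# lands near the 16 us networked-storage figure claimed later in the episode:
print(f"Optane case: ~{10.0 + fabric_added_us + switch_hop_us:.1f} us")
```

In other words, the fabric overhead is small enough that the media, not the network, dominates the budget, which is the whole pitch.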
So if I'm running vSphere with one of your HBAs, and that's kind of the only way to get this performance with vSphere nowadays, the data path is into the FPGA, out onto the 100-gig Ethernet, into the FPGA, to the SSD, without all of the abstraction layers that we'd normally have in a storage system.
That's exactly correct, Howard.
And it's worked with vSphere today?
That's correct.
Yeah, we sort of glossed over what we do on the host side.
We talked on the target side, the storage node side.
Yes, yes.
Now, on the host side, we do provide the customer with options
for what type of host adapter they use.
Specifically, we have a host adapter that's FPGA-based, and we refer to it as a host NVMe over fabric adapter, or HNA for short.
At the same time, we also support a standard RDMA NIC.
And Mellanox is probably the best known for creating RDMA NICs.
So we do a ton of testing with Mellanox RDMA NICs.
But we support either one, whether it's our own host adapter, HNA, or a standard RDMA NIC.
And your HNA looks like an NVMe SSD or a set of NVMe SSDs
to the host. Correct. So that is the unique part of our HNA, which is
one of the benefits versus using a standard RNIC. In our
case, with our host adapter, the HNA,
in the FPGA, we do full virtualization
of the NVMe SSD as seen by the host. So when the
host server scans the PCI bus, our adapter reports itself
not as a networking card, but we report ourselves as
an NVMe SSD.
But I'd still need another network card for network stuff.
Correct. You still need a standard NIC for network traffic.
But we take care of all of the storage network and traffic for NVMe.
And the OS, the kernel of the hypervisor, simply uses standard NVMe drivers.
We don't require any special NVMe drivers, nor do we require any host management agent.
So we are completely agentless and zero footprint as far as the host is concerned.
We take care of all the virtualization inside of our host adapter.
Yeah, but some of these, I'll call them servers, have a limitation as to the number of NVMe drives
they can support and things of that nature.
And even though you're virtualizing, let's say, I don't know, 32 or however many NVMe SSDs,
I mean, they're going to have some problem from a configuration perspective,
seeing that you've got 64 NVMe SSDs in a server that only really supports 32, right?
Or am I wrong?
You're not quite correct on that.
Okay, good.
You can get super big server chassis today with tons of NVMe SSDs,
and the kernels and the hypervisors do support that number of SSDs. The limits on number of NVMe SSDs per server are really around PCIe lanes and hardware,
not how many logical NVMe SSDs can you have.
Correct.
And then there's also the newer NVMe SSDs
actually do support multiple namespaces
within a single physical SSD.
So the kernels and the hypervisors have grown up to support those multiple namespaces as well.
So we take advantage of that industry standard support.
So, Taufik, where are you trying to sell this?
I mean, are you selling this to OEMs?
Are you selling this to end users direct?
I mean, you mentioned HPC.
You mentioned public cloud and private cloud.
But I'm just trying to figure out where's the game here?
Where's the market that you're going after?
So private and public cloud, I know, are very broad terms.
So obviously we do talk to companies that are providing,
you know,
seeking to provide infrastructure as a service offerings,
but increasingly it's also software as a service companies that fall in this
public cloud space folks that are,
you know,
buying super fast composable infrastructure for reservations, for e-commerce,
and you can imagine the other types of companies that fall in that category.
And then it's a blurry line, but private cloud these days, especially with this level of
storage performance, also does trickle over into the HPC side of the world.
So HPC now is not just the traditional national labs
doing weather modeling and atomic bomb modeling.
It's also companies doing drug research and pharma.
There's oil and gas companies.
Media and entertainment companies doing video post-processing.
So those all sort of fall into this private cloud spectrum that's sort of blurring the line between private cloud and high-performance computing.
So that hopefully gives you a flavor of the type of customers that we're engaged with.
Yeah, it's not the impression I got when we talked at FMS. I thought you guys were trying to be the Xyratex of the next wave of storage.
So what I just described is the end customer categories.
Now, the way in which we reach those end customers, some of them we work with directly,
others we work with partners, such as the Supermicros of the world.
Right. So most of that stuff
didn't seem like it was a vSphere kind of environment, per se.
It's still mostly customers that
have real-time, high data volume, data bandwidth
requirements. I would say when you mentioned the e-commerce guys, the reservation guys,
anybody that's doing high levels of transactions fit into that framework.
They're probably, well, I don't know if they're using VMware kinds of capabilities or not.
They might be, I suppose.
I think VMware would argue with you.
I'm sure they would. I've never had a conversation with a vendor where they didn't argue that their
proper view of the market was that everybody should be buying their product.
Yeah, I agree, I agree. But I mean, in general, the highest-performance
stuff doesn't run in virtualization, typically.
Yeah, the things that we run, we have a lot of them running in virtualization, right? But, you know, the way I see it, NVMe,
maybe three years from now, maybe five years from now, and I have a bet going with J Metz about this, is going to replace SCSI as the lingua franca for enterprise and high-end storage.
Three or five years?
Yeah, in terms of what products are hitting the market, what people are buying that's new.
SCSI is going to stick around on spinning disks as long as spinning disks do.
Yeah.
That's amazing.
Taufik, you mentioned multi-tenant and tenant portals.
And whenever we talk about multi-tenancy, the concept of noisy neighbors comes to mind because I used to live in Greenwich Village and we had noisy neighbors.
Is there any QoS built into this?
Absolutely. I mean, the whole point of creating so much performance, it's like the, what's the quote from Spider-Man? With great power comes great
responsibility. In this case, with this much performance, it's meant to be shared.
With this much performance, our responsibility is to share it, but to share it responsibly.
So by that token, we've designed in hardware-based QoS for every active namespace or LUN. During the provisioning process, along with that allocation, we assign QoS controls to preclude exactly the concern, Howard, that you referred to, the noisy neighbor issue.
And so are those IOPS or bandwidth throttles, or something more sophisticated,
response-time things, or all of the above?
IOPS and throughput controls.
Limits or something?
Limits for now.
Now, having said that, in our world,
everything has two sides to the coin.
On one side, we have the QoS controls for enforcement,
but at the same time, we do have hardware-based monitoring
on every active namespace or volume. So we track latency, IOPS, and throughput.
And we collect all of that data using IoT methods from across the cluster. And we stream that performance data into the centralized management entity.
And we actually put the data into a database so that the operator, the administrator has
a record of the performance, the latency ops and throughput of every active namespace in
the cluster. So it's actually a pretty critical capability,
especially the latency monitoring.
Because the infamous situation is the developer has an app
that's been put into production, and then at 2 a.m.
there's a slowdown on the app.
The developer calls the infrastructure owner or administrator,
and they're pointing fingers.
Always blame the network guy.
And the infrastructure guy says...
I thought it was always the storage guy, but yeah, go ahead.
I am the storage guy, so I always blame the network guy.
I got you.
So in our case, we accelerate what some companies call time to innocence.
Time to innocence? I like that term.
It's this active volume
that's in dispute. Pull up the historical data
and more importantly, latency. Because IOPS and throughput, you could argue
it's caused by the application or the infrastructure, but you can't argue latency.
So the latency gives a clear direction as to where's the issue coming from.
So we'll have a historical record of that.
And since this database is on high-performance storage, we can do all sorts of analysis.
That's correct.
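As a purely illustrative Python sketch, here is how an administrator script might pull that per-namespace latency history to establish "time to innocence"; the endpoint and field names are invented for illustration, and only the metrics themselves (latency, IOPS, throughput per active namespace) come from the conversation.

```python
# Hypothetical sketch: query the management entity's telemetry database for
# one namespace's recorded latency, to settle a 2 a.m. slowdown dispute.
import requests

MGMT = "https://attala-mgmt.example.com/api/v1"   # hypothetical URL

def latency_history(volume_id: str, hours: int = 24) -> list[dict]:
    """Fetch recorded latency samples for one active namespace."""
    resp = requests.get(f"{MGMT}/volumes/{volume_id}/metrics",
                        params={"metric": "latency_us", "hours": hours})
    resp.raise_for_status()
    return resp.json()["samples"]

samples = latency_history("vol-1234")
worst = max(s["value"] for s in samples)
print(f"worst latency in the last day: {worst} us")
# If this stayed flat near the device's normal latency, the storage path is
# innocent; the slowdown came from the application or the network.
```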
He didn't say it was on NVMe storage, but he said it was a database.
So where else would you put it?
I wouldn't necessarily put a database of monitoring latencies and IOPS and that sort of stuff that streams in real time on high-performance storage. I don't know, it depends on what's going on. You could architect that in a number of ways.
But there was another question here...
Okay, what about storage class memories and Optane and stuff like that?
Are you guys ready for that?
Yeah, in fact, we've done demos using Optanes.
And I think we claimed, and someone has yet to refute it,
we claimed the lowest network storage performance or lowest latency ever.
We demoed 16 microseconds of latency.
So it's our 5 microseconds plus the Optane's 10 plus a little bit on the switch.
So we did a demo that had 16 microseconds of latency going across the network,
and we claimed it was the fastest ever. 16 microseconds of latency to get an I/O operation done.
Correct.
You could go faster. Just build a custom DRAM-based SSD, you can get it down to 11.
Yeah. Jesus, I can't believe it.
I can't believe it.
You know, 250-microsecond, 500-microsecond latencies were all the rage not a year and a half ago.
150 was all the rage not a year ago.
It's a whole new world, right?
This is an order of magnitude improvement on top of all that.
Yeah, it's just staggering.
If you guys start selling this to high-frequency traders, I think they really need this stuff.
We are talking to financial folks.
Good, good, good.
Well, we all know they can afford it.
Yeah.
We haven't talked about price. But in your environment, you've got the HNA, and you've got the target hardware and the management software.
I mean, how are you pricing this? I guess that should be the question, to an end user.
So in certain configurations, we can actually get to a little over a dollar per gigabyte.
Whoa.
Which is the typical cost of an all-flash array nowadays.
It's not the typical cost of an all-flash array nowadays.
And that's deduplicated and compressed and thin-provisioned and all that junk.
Deduplicated and compressed and discounted to get to that.
This has got none of that. These are device-level
network storage kinds of things. A dollar-per-gigabyte end-user price is extremely attractive.
Yep. Especially at this performance. You can't approach this performance
without hundreds of thousands of dollars of controllers surrounding this thing.
That's right.
And even then, it's not even close.
Am I wrong?
You're absolutely correct.
So going back to the very beginning of the podcast, we've, for the last two decades, created disruptive technologies for the data center.
And that's what we believe we've done here.
God.
I think I would agree. I'm on
record: I think what you're doing is the future. So I'm glad to hear that it's at the state now
where things are pretty much ready for prime time. The one real question I have, though, is
the NVMe over Fabrics spec is a little lacking when it comes to some enterprise
things like boot from SAN, which for your uses probably isn't that big a deal, but
multi-pathing isn't really well defined. Are you doing some of your own magic to take care of that,
or are we just expecting to deal with it upstream?
So of all the folks that are participating in the NVMe over Fabrics standards,
we're the only vendor that has the full host-side adapter.
So because we expose virtual NVMe devices to the host,
we actually get around a lot of the issues that you refer to.
So we have standard MPIO running in our lab today because, you know, for us, we expose a standard
NVMe device to the host.
Right. And then you have your own solution for how to do the multi-pathing
between the HNA and the data node.
Well, we just create two different paths,
and then the existing MPIO layer in whatever the kernel is
takes care of the load balancing and the failover as required.
So we just create two independent paths
that are extremely reliable and low latency
from the two virtual NVMe devices that sit on our host adapter
all the way to a shared NVMe SSD living in the target.
And the NVMe SSDs are dual-path SSDs in that configuration?
So that's another one of our innovations. So we support the dual-port NVMe SSDs to get the full redundancy all the way down to the device level.
But there's a lot of customers who don't want to pay a premium for dual-port SSDs
because they do still have a fairly healthy premium on them versus the single ports.
So what we're able to...
I was hoping that premium would disappear.
Yeah, I think it might take a while.
It's funny, I talked to the NVMe controller folks,
you know, the Marvell's and IDT's,
and they say, well, it's the same controller.
And so it's just a premium there
because people will pay for it
Exactly, which annoys me.
So what we do, which is another cool one of our innovations,
is we're able to take a single-port SSD and create two different MPIO paths. So you don't get the last
inch worth of redundancy. But what customers care about most
is the redundancy on the network, right?
Someone tripping over a network cable.
So we're able to provide that level of MPI
across the network,
even using a single port SSD.
Does having all four lanes on a single-port SSD,
as opposed to two ports of two lanes, give you a bandwidth advantage, or are we not seeing enough demand for it to matter?
So today, predominantly it's still single port NVMe SSDs in terms of shipments.
And that's why there's a premium on the dual ports. So, yeah, when it comes to performance,
if you put redundancy aside for a second,
it's actually easier to extract performance
from a single PCIe x4 versus 2x2s.
Right.
Well, okay, I've got one last question.
I think we've got...
And all of a sudden, the last question is gone.
What was it? Oh yeah, hot plug. So the NVMe SSDs, are they hot-pluggable in this configuration?
Yeah, absolutely.
Oh my God.
You've got your cluster of storage nodes, and there's an administrator
sitting in front of the console. The technician goes out
to the data center, plugs in an SSD, and it pops up as a new SSD in the physical inventory of
SSDs that sit on the network, and it becomes then available as yet another resource that is allocatable to the tenants.
Yeah, I love the storage guy's skepticism about hot plug PCIe.
That's been a feature that's been in the spec and in servers for at least three or four server generations.
Yeah, but...
When we're talking about an add-in card, I've never known anybody with the guts to pull
the server out at the end of its rails, open the cover, and change a card.
Yeah.
U.2 makes that a lot simpler.
I guess.
All right, so this has been great.
Howard, do you have any last questions for Taufik?
No, I got it.
Taufik, do you have anything you'd like to say to our listening audience
before we sign off?
No, just look forward to working with any listeners that want to come to talk to us.
We've got a lot of cool things going on
and we love engaging with the customers
to hear their problems and hopefully solve them.
Why don't you mention a couple of the events
you're going to be at?
So we'll definitely be at the Flash Memory Summit
coming up.
Also at the supercomputer show, SC18,
that's a little bit later in the year. VMworld.
And for those listeners in EMEA,
we might be at the supercomputer show in Frankfurt coming up.
But that's just a couple of examples of some of the shows we're going to be in.
Very good.
Well, this has been great.
Thank you very much, Taufik, for being on our show today.
Thank you for the opportunity.
Next month, we'll talk to another
system storage technology person.
Any questions you want us to ask, please let us know.
And if you enjoy our podcast, tell your friends about it
and please review us on iTunes
as this will help get the word out. That's it for now. Bye, Howard. Bye, Ray. Until next time.