Grey Beards on Systems - GreyBeards talk edge-core filers with Ron Bianchini, President & CEO Avere Systems
Episode Date: October 14, 2014
Welcome to our 13th podcast, where we talk edge filers with Ron Bianchini, President and CEO of Avere Systems. Avere has been around the industry for quite a while now and has always provided superior performance acceleration for backend NAS filers. But with their latest version, they now offer that same sort of performance acceleration for public cloud …
Transcript
Hey everybody, Ray Lucchesi here.
And Howard Marks here.
Welcome to the next episode of Greybeards on Storage monthly podcast,
a show where we get Greybeards storage and system bloggers to talk with system and storage vendors to discuss upcoming products, technologies, and trends affecting the data
center today. Welcome to the 13th episode of Greybeards on Storage, which was recorded
on October 7, 2014. We have with us here today Ron Bianchini, CEO of Avere Systems. Why don't
you tell us a little bit about yourself and your company, Ron? Great. So at Avere Systems, we build a hybrid cloud NAS solution. And the point of our solution
is that it integrates both public and private object storage, so public and private cloud,
and legacy NAS into one centralized repository of storage, and then we put our
product in front of it, which presents the user with our high-performance, scalable NAS infrastructure.
So I am the CEO of Avere Systems. Prior to that, I was the CEO of Spinnaker Networks, which, as you all know, was acquired by Network Appliance, and the technology from Spinnaker is what resulted in the Cluster Mode product that NetApp sells today.
About 20 years later, right?
Oh, come on.
That's right.
Not quite.
Well, it seems like 20 years.
Yeah, yeah, yeah.
So, Ron, this seems to be a change from what the prior Avere Systems was from my perspective.
It always seemed to be, you know, I would call it a NAS aggregator, or a solution that could consolidate multiple, you know, on-premise NAS boxes.
But this seems to be different now.
It is. You know, honestly, I started the company with two other co-founders, Mike Kazar, who's our CTO, and Dan Nydick, who's our chief architect. And when we started Avere, it was all around the notion that one filer optimized for both performance and capacity couldn't really be optimized for either one of them; basically, you were making compromises on both. And really, the performance part of NAS, or the performance part of storage, is a very different function than the capacity part of storage.
And so in our 1.0 product, we came out, we introduced an architecture that we called
EdgeCore. And the EdgeFiler, which was our
product, was all about the performance part of NAS. And the CoreFiler, which back in the day
was a traditional NAS filer, was all about the capacity part of NAS. And by separating
performance from capacity, you could really optimize those functions independently.
So, for example, in our Edge filer, we have all the go-fast stuff.
So we have RAM, solid state, and 15K disk.
And in the Core filer, we tell our customers to buy nothing except 7K RPM high-density SATA drives. And when you optimize those two things separately, you really get huge advantages
over trying to compromise one product to do both. And really what we sold to in the early days
was all about that efficiency you get from separating edge and core. Now, if you take
that as our background and you fast forward us to today, what we figured out is the object storage guys, the cloud guys, the public cloud and the private cloud guys,
are trying to solve for the repository what we were trying to solve for the performance piece. They're going for massive density, very low price per terabyte, and even geographical distribution,
so you can continue to run even if you lose a whole site. So they really were optimizing the
repository part of storage, and we decided that it only made sense that not only do we support
NAS core filers, but we also support object store core filers
just to let our customers get the benefits of both, of really where the cloud guys were going.
Well, the separation of performance and capacity is a story we hear from a lot of people nowadays.
I mean, that's the whole server-side caching argument is put performance close to the workload and use your back-end storage just as capacity.
Why is it that you're the last man standing from the generation of people who tried to do acceleration as a bump in the wire?
Right.
So it's interesting. If you look at where people are going, if you look at server-side caching... you remember NAS versus DAS? The big story about NAS was you have the centralized server, and you didn't have all these islands of storage everywhere; you'd have one global centralized repository. Well, the nice thing about our edge-core architecture is it was the only architecture that allowed you to have one globally shared pool of the go-fast stuff. You could embed our function in the server, but then each server is independent and you're not globally sharing the media. If one part of your namespace becomes hot, you really want to allocate all of your high-speed stuff to that. And so really what you wanted was this edge-core separation, but you needed a separate device for the edge filer. And so compared to the GridIrons and those other guys that were trying to do bump-in-the-wire acceleration, they gave you the advantage of having one pool of cache, so that you could allocate as much of it as you needed to whoever was hot at the moment.
But it really was still independent pools of storage.
Right.
And the other thing that is very unique about our product is that we literally created our own file system.
You know, a lot of these products were doing very simple caching.
And you could get advantages for reads, but you couldn't get advantages for writes, because you didn't own the directories; all the directory operations had to get pushed to the back. We literally created our own file system, our own namespace, in our edge filer. The difference between our file system and everyone else's file systems is that we added one more layer of indirection at the bottom of the file system stack that knows about multiple different media types and can steer data across all of them, including the core filer. We also do clustering across all of our nodes.
And then as time goes forward, we're adding more and more features.
So one of the first features we added in our 2.0 release was a global namespace.
So we can take what the user sees as one common global namespace
and then distribute the directories across multiple core filers.
And then in the 3.0 release, we added the ability to change that mapping,
so you can now online migrate data between the core filers.
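As a rough illustration of that global namespace idea, here is a toy mapping of directory subtrees to core filers; the structure, names, and longest-prefix scheme are assumptions made for the sketch, not Avere's implementation.

```python
# Illustrative only: a toy global-namespace map in the spirit of what is
# described above -- directory subtrees are homed on different core filers,
# the client sees one namespace, and "migration" is just changing the mapping.

class GlobalNamespace:
    def __init__(self):
        # namespace path prefix -> core filer backing that subtree (invented names)
        self.mapping = {
            "/": "netapp-core-1",
            "/archive": "cleversafe-object-store",
            "/projects/render": "aws-s3-bucket",
        }

    def core_for(self, path: str) -> str:
        """Pick the most specific mapped prefix for a path."""
        best = max((p for p in self.mapping if path.startswith(p)), key=len)
        return self.mapping[best]

    def migrate(self, path: str, new_core: str) -> None:
        # Online migration amounts to copying the subtree and then
        # repointing the namespace entry to the new core filer.
        self.mapping[path] = new_core

ns = GlobalNamespace()
print(ns.core_for("/projects/render/scene42.exr"))   # -> aws-s3-bucket
```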
And all this is possible because we built our own file system,
all the directory operations, all the writes, all the reads happen locally. The only time you'll see us go to the core filer is on a read cache miss. But all
the other metadata, all the other operations happen locally. So that, you know, building our
own file system really helped us in many different ways. First, our offload, our ability to offload ops from the core filer, is without peer in storage. A typical offload for us is like 50 to 1; that means we only go to the core filer for one out of 50 operations that come in, and those are exclusively cold reads. But it also means that we could add all these other features and functionality, because we have the file system that all the data is stored in from an operation perspective, and then we write it back to the file system that's on the core filer behind us.
So it's more like a destage activity out of your, I'd almost call it a cache, but it's not. It's more than just a cache, right?
That's right.
And honestly, it's that, I think you said it exactly right.
The way you think about our system is we have a file system.
We are the file system that your users see.
So all the directory operations, all the metadata operations,
all the writes happen locally.
Even the reads that we're caching happen locally.
And then to the core filer, you'll see two types of operations.
You'll see cold cache, read cache misses.
We'll go get them from the core filer, but also destaging as we pull things out of our file system and write them to the core filer behind us.
But you maintain a one-to-one relationship between files in your file system
and in the back-end file system, right? Yes, in 1.0 and 2.0, that's exactly true. In 3.0,
we added the ability to keep a one-to-two. So now for every file in our system, we can home it onto
two repositories, and now we can allow you to basically have a DR system
where you can lose a site, you can lose an entire core filer,
yet the system can still make progress because you have an alternate.
Ah, what? So you can do this off of the cloud, too?
What about if their performance is substantially different? You've got a cloud and a local NAS. You could have a file system that spans the cloud and local and actually replicates one to the other?
Yes. Absolutely.
More writes to both than replicates.
How is this going to work? How far is it behind? I mean, what happens when the cloud goes down?
I'm sorry. Okay, Ray, remember this appliance that's buffering all of this asynchronous replication, for lack of a better term, has got terabytes of storage.
Yeah, it's not just the controller.
That's exactly right.
Our largest node holds nine terabytes of storage.
And if you have a three-node cluster, that's 27 terabytes that's globally shared across those nodes.
And the way we do it is for the – so first, with the global namespace, any directory could exist on any one of the repositories behind us.
And that could be a heterogeneous mix of local NAS, local object store, public cloud.
So as the user's CDing directories, the repository can exist on any one of those things all at the same time.
So that's with the global namespace feature. With the
migration and the mirror feature, that's exactly what we can do. We've got directories that the
users are reading and writing out of those directories, and we can then destage those
directories to two completely independent repositories. And one can be NAS, and one can
be the public cloud. And basically, the way that all works is because the destaging queues are independent: you have one queue destaging to the local core and you have one queue destaging to the cloud.
Yeah, yeah, yeah. Why stop at two? I mean, I've talked to the guys, the big customers want three or four or more, actually. So, I mean,
once you've created this DR mode,
I mean, you could almost do it to just about N levels. I mean, as far as you're concerned,
it's just another queue, right? Exactly. Honestly, at this point,
it's probably a simple matter of testing.
We want to make sure that everything our customers do, we've done first.
And so in our QA organization, we've tested up to two, but it's not really a big stretch.
It's literally, as you point out, a destaging queue, and we can potentially go to more, but that would be features in future releases.
Yeah.
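A minimal sketch of the independent destage queues being discussed, assuming a simple in-memory queue per repository (my own structure, not Avere's code): every write is acknowledged at the edge once and then drained to each core at its own pace, so a slow or unreachable target only makes its own queue deeper.

```python
from collections import deque

class DestageQueues:
    def __init__(self, repositories):
        self.queues = {name: deque() for name in repositories}

    def write(self, path, data):
        # The edge acknowledges the write locally, then enqueues the
        # destage work for every repository the file is homed on.
        for q in self.queues.values():
            q.append((path, data))

    def drain(self, name, push_fn, budget=100):
        # Each repository drains independently; a failure just leaves work queued.
        q = self.queues[name]
        for _ in range(min(budget, len(q))):
            path, data = q[0]
            if push_fn(path, data):    # returns False if the target is down
                q.popleft()
            else:
                break                   # retry later; other queues are unaffected

qs = DestageQueues(["local-nas", "aws-s3"])
qs.write("/proj/a.txt", b"v1")
qs.drain("local-nas", lambda p, d: True)    # local core catches up
qs.drain("aws-s3", lambda p, d: False)      # cloud link down: that queue just grows
```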
So speaking of features in future releases...
Oh no, Howard, here we go.
So when do you add the HSM-style destage for data that I don't need locally anymore?
Right.
So this is a really good topic, and probably one that we can spend hours on. And that is: we don't believe you need it, because literally what our product does is, for every block in the system, we keep statistics on how that block is being used. And if we see a block becoming idle or not used that much, it automatically gets destaged back to the core filer behind us.
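A rough sketch of that per-block bookkeeping, under the assumption of a simple idle threshold (the threshold, names, and policy are invented for illustration, not the actual algorithm):

```python
import time

class BlockStats:
    def __init__(self, idle_seconds=3600):
        self.last_access = {}           # block id -> last access timestamp
        self.dirty = set()              # blocks changed since last destage
        self.idle_seconds = idle_seconds

    def touch(self, block, write=False):
        # Called on every read or write the edge file system services.
        self.last_access[block] = time.time()
        if write:
            self.dirty.add(block)

    def destage_idle(self, write_to_core):
        # Periodically push cold, dirty blocks back to the core filer.
        now = time.time()
        for block in list(self.dirty):
            if now - self.last_access[block] > self.idle_seconds:
                write_to_core(block)
                self.dirty.discard(block)
```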
Right. I was thinking from that – I've got that core filer and sure, we've separated performance from capacity to some extent.
But I still back that core filer up every night. And so it's not
about cost of the drives in it. It's about the maintenance cost. Once the data gets stale enough,
I want to move it to an Amplidata or a CleverSafe where I just am working under the assumption that
once it's there, it's safe and I don't need to back it up.
So this – okay.
I got it.
So really what you're talking about is not HSM destaging it from us to the repository.
It's HSM destaging it from one repository to another.
Yeah.
Exactly.
It's almost core migration or something like that. And so this is a good debate and discussion we have with our customers.
And what I would say is our customers really fall into two camps.
The one camp is I want to know where my storage is forever.
If I tell you I want it on this machine, I want it there.
If I tell you I want it on the cloud, I want it there.
I don't want you making any intelligent decisions about when things move between them.
And we have other customers.
Aren't those the same guys who want to set their own RAID levels?
Yeah, maybe.
And then we have other people that say, I don't really care.
I want to treat you as a black box, and I want you to take care of everything for me.
Honestly, in the enterprise,
most of the customers fall in that first category. But I will say this, what we allow you to do
is we allow you to migrate at any time without ever taking the system offline. So for example,
if you decide there's too much data on the core filer and a piece of that you want to move to the cloud, it's literally three clicks in our GUI.
And now the data is being migrated while you can still run transactions against that data.
And so we don't automatically –
What's that?
Tell me there's a RESTful interface for that too.
To the Avere system? Well, because I'm going to write a script that walks the file system and identifies here are whole folders where nobody's touched anything in years.
And then I want to send a command to say, well, move that folder for me.
Is the core migration at the file level, the folder level, or the file system level?
I mean, what's the level of granularity of the core migration?
At the folder level.
So this could be a subdirectory, effectively, of a file system.
It could be a subdirectory with even thousands of directories below it.
Oh, yeah, yeah, yeah.
And so, Howard, we do have a GUI that sits in front of our product,
but that GUI only uses XML-RPC to talk to our machine.
There are no paths around that.
So we actually have some of our customers that are writing XML-RPC code themselves.
So I could script to that XML.
Exactly.
Okay.
Exactly.
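The kind of script Howard has in mind might look something like the sketch below. Python's xmlrpc.client is standard library, but the management endpoint URL and the migrate_directory method are hypothetical placeholders; Avere's real XML-RPC calls would differ, so treat this purely as the shape of the automation.

```python
import os
import time
import xmlrpc.client

STALE_SECONDS = 2 * 365 * 24 * 3600        # "nobody's touched it in two years"

# Hypothetical management endpoint -- not Avere's documented API.
proxy = xmlrpc.client.ServerProxy("https://avere-mgmt.example.com/rpc")

def newest_mtime(folder):
    """Most recent file modification time anywhere under a folder."""
    newest = 0.0
    for root, _dirs, files in os.walk(folder):
        for f in files:
            try:
                newest = max(newest, os.path.getmtime(os.path.join(root, f)))
            except OSError:
                pass
    return newest

for entry in os.scandir("/mnt/edge/projects"):
    if entry.is_dir() and time.time() - newest_mtime(entry.path) > STALE_SECONDS:
        # Hypothetical call: ask the cluster to re-home this folder on the
        # cheap object-store core filer.
        proxy.migrate_directory(entry.path, "object-core-1")
```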
The one thing I'll say about our migration, it is the most – we have very conservative
enterprise customers.
Our migration is incredibly conservative.
Literally what we do when you start the migration, we do a tree walk.
We start at the root directory you tell us, and we start walking all the directories below it.
And as we get to a directory, we copy files over.
Once a directory has all the files copied over, we call that directory silvered. And so what happens is
once a directory is silvered, any more edits that go to files in that directory, we destage it on
both queues. So imagine at any given time, you have some silvered directories and some unsilvered
directories. When an edit comes in, if it comes into an unsilvered directory,
it's only put in the destage queue to the source.
If an edit comes into a directory that has been silvered,
it's put in both destage queues.
So what this means is while I'm doing the tree walk,
even though users are editing files,
I can guarantee that the source directory structure is pristine and will always get all the updates.
And then the destination is getting all the changes as the tree walk progresses.
And so then when the tree walk's done, you literally have both file systems, both directory spaces exactly in sync.
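A compressed sketch of the silvering scheme just described, with invented structures: the tree walk copies existing files, marks each completed directory silvered, and from then on edits to files in silvered directories are queued to both the source and the destination, while the source always gets every edit.

```python
import os

silvered = set()
queue_to_source = []
queue_to_destination = []

def tree_walk(root, copy_file):
    """Copy existing contents to the destination, silvering each directory."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            copy_file(os.path.join(dirpath, name))   # copy what is there today
        silvered.add(dirpath)                          # directory is now silvered

def on_edit(path):
    """Called for every write accepted at the edge during the migration."""
    directory = os.path.dirname(path)
    queue_to_source.append(path)                       # source stays pristine
    if directory in silvered:
        queue_to_destination.append(path)              # silvered dirs get it twice
```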
And the real beauty about all this is, at any time during the migration, if you decide you want to abort, you just click the abort button and the source has not been touched. It's all there.
Yeah. I've got a couple of questions. Maybe the first one might be,
so, I mean, there are, I don't know, let's say a hundred object storage systems out there today.
Maybe that's an overestimation. Which ones do you support and which ones don't you support?
Right. So what we've done is we have now officially
supported Amazon, AWS. We've officially supported CleverSafe and we officially support AmpliData.
And as part of our, you know, running it through our certification, we always have
published those performance benchmarks in front of those repositories to something I know, Ray,
you know really well, SPEC SFS.
Yeah, I was surprised. So you're the first and only organization to actually have a SPEC SFS result with Amazon or CleverSafe or Amplidata. So that's fairly unusual from my perspective.
I agree. It's incredible. And the fact is, this is really the whole point of Avere.
On the front end, we present a very traditional enterprise NAS file system from performance and
capacity perspective. But on the back, we really are independent of
whose repository you put behind us. So how do you manage the directory structures that you'd
probably want to mimic at the cloud and stuff like that? Let's say AWS, it's really an object
with buckets and that sort of stuff. Are you mimicking the directory structure into separate, I don't know, buckets out there?
I'm not exactly sure what the term is.
That's right.
So if you look at the back end, you'll see objects that hold the file data,
and you'll see objects that hold our directory structure.
So honestly, moving to cloud was incredibly easy for us.
Remember what I told you really early on.
One of the reasons we still survive is that we have a file system that we present to the user, and then we map our file system onto the core filer behind us.
Well, on NetApp, you do that mapping by writing and reading files.
On Isilon, you do that mapping by writing and reading files. On the cloud, we do that mapping by literally writing objects that contain our
file system in it. So it wasn't that we had to go create a new cloud file system. It's literally
our file system, and we're just putting a representation of it on the object store.
So Ron, is there a one-to-one relationship between
files in your file system and objects in S3? There is not, and mainly because in large files,
you might want to edit a piece of the file, and the last thing you want to have to do is write
a massive object to the back just to capture that update. Exactly where I was.
Yeah.
Right.
God, I was on the other side.
So if it's a small file and you're writing it to AWS, are you going to write that as
a separate object?
Like say, I don't know, 300 bytes or something like that or 1K?
Not always.
Sometimes it all depends on what's going on around it.
Sometimes it will.
Sometimes it all depends on what's going on around it. Sometimes it will. Sometimes it won't.
This is why people buy the Edge Filer because it makes – it hides all of the issues with reading and writing objects and dealing with the latency of the core, yet still the ability to maintain a NAS interface up front.
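To see why a file-to-object mapping isn't one-to-one, here is a toy chunking scheme (the chunk size and key format are invented for the sketch): a small edit to a huge file only rewrites the chunk objects it touches, instead of re-uploading the whole file.

```python
CHUNK = 1 << 20   # 1 MiB chunks (arbitrary for the sketch)

def object_keys_for(file_id, length):
    """Deterministic object keys covering the file's byte range."""
    return [f"{file_id}/chunk-{i:08d}" for i in range((length + CHUNK - 1) // CHUNK)]

def chunks_touched_by(offset, size):
    """Only these chunk objects need to be rewritten for a partial update."""
    first, last = offset // CHUNK, (offset + size - 1) // CHUNK
    return list(range(first, last + 1))

# A 4 KiB edit at a 10 GiB offset touches exactly one chunk object:
print(chunks_touched_by(10 * (1 << 30), 4096))   # -> [10240]
```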
Right. And so since I can now store my back-end data in S3,
I can put edge filers wherever I like and have access to all of my data.
That's right.
There's this consistency problem with the cloud, right?
I mean, let's say you create version A of a file
and you want to go back and write version B of the file or something like that.
You know how the cloud can be effectively inconsistent when a read occurs or something like that?
How do you play that route or how do you deal with that?
I'm not even sure what I called it in the past, but there's a –
No, no, no.
You've hit the nail on the head.
The biggest issue – so honestly, we already had a file system. So just storing our file system on the cloud, that was the easy part. The hard part is that NAS requires very strict, guaranteed consistency. The most important consistency point is read after write.
In a NAS filer, you're guaranteed when you write something, any read immediately after
it, any read from that point on has to see the latest version of the data.
So you have very strict consistency rules when you implement a NAS filer.
Honestly, when we submitted to SPEC, the SPEC committee didn't believe us,
and that's really what they focused on is can these guys guarantee the strict consistency
that a NAS filer requires.
And so as you point out, Ray, when you write to an object store,
an object store has eventual consistency.
Yeah, that's eventual consistency.
It's eventual consistency.
It's even worse than that.
When you write something, you might immediately read back the exact same data, and then the read after that might read an older version.
Because every time you go into the cloud, you get connected to a different server. And some of them have the updates and some of them don't. So not only is it
eventual consistency, but it's an eventual consistency that's not monotonically
getting better. It could get better and worse over time.
And so literally what we do is
we write tags, we
write information into the objects so that I know which is the most recent one.
And if the object store sends me the wrong one, I ignore it and wait for it to get the right one.
It means it's still propagating.
But better than that, we keep a temporal cache of all the things we've written that exceeds any of the eventual consistency
guidelines that the cloud vendors have told us.
And so we'll only ever get that data locally.
Yeah, okay.
So literally, that's what we proved to the SPEC guys, that we can guarantee strict
consistency even though the repository behind us was an eventual consistency system.
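A minimal sketch of that read-after-write guarantee over an eventually consistent store, assuming a per-key version tag and a local temporal cache of recent writes held longer than the store's consistency window (the names, the window value, and the stand-in store are all assumptions):

```python
import time

CONSISTENCY_WINDOW = 600              # assumed upper bound, in seconds

class MemStore:                       # stand-in for an eventually consistent object store
    def __init__(self): self.data = {}
    def put(self, key, obj): self.data[key] = obj
    def get(self, key): return self.data[key]

recent_writes = {}                    # key -> (version, payload, write time)
latest_version = {}                   # key -> newest version tag we have written

def put(store, key, payload):
    # Tag every object we write so a stale copy can be recognized later.
    version = latest_version.get(key, 0) + 1
    latest_version[key] = version
    recent_writes[key] = (version, payload, time.time())
    store.put(key, {"version": version, "payload": payload})

def get(store, key):
    # Anything written recently is served locally, never from the store.
    cached = recent_writes.get(key)
    if cached and time.time() - cached[2] < CONSISTENCY_WINDOW:
        return cached[1]
    obj = store.get(key)
    if obj["version"] < latest_version.get(key, 0):
        raise RuntimeError("stale copy; ignore it and retry until it propagates")
    return obj["payload"]
```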
So it's funny, when I talk to some of our more technical customers,
they describe it as: the edge filer makes the cloud safe for NAS.
And that's what we do. We hide that eventual consistency and make it strict.
Okay.
And then how do you handle collaboration?
Because an object store is a last writer wins environment, and maybe we turn on versioning so that if I'm using Dropbox to write to S3 and three users edit the same file, I can get to all three versions.
But in the NAS world, I would do things like file and record locking, which isn't really all that latency-tolerant.
Right.
So for the current release, we support really a time-based distribution model, and it's really based on time distributing the data.
And it turns out for NFS and CIFS, there are all these client-side caches, which effectively do the same thing anyway to stop the clients from being very chatty,
they create a little cache locally.
And most of the applications learn to deal with the fact that there's a time bound for distributing data.
And so for the current product we have out in the field,
we basically, the way you should think about it is we're stretching that client-side cache time as the data gets distributed. But I think what you're going to see in the future is we will provide guaranteed consistency across the clusters. So that's really the direction we want to go to get better. But right now, the way you should think about it is imagine you're lengthening the client-side cache.
Okay. So when people back up these sorts of things,
and let's say you had both an on-premise NAS box
and an off-premise cloud public solution,
and you wanted to back these up,
do they back them up through Avere?
Do they back them up directly, or could they do both?
They literally could do both.
A lot of people back them up... so in the NAS world, a lot of people are just doing backups on the NAS filer: they're taking snapshots on the core filer behind us, and those snapshots are consistent, and then they back the data up there. But you could very easily run the stream through the Avere and do the backup through us.
Okay. And I think backing up the back-end filers works fine, until the global namespace has been in existence for a long time and there's two billion pointers there, right? Because then you're backing up a very different view of the data, right?
And honestly, you're really now getting to, you're coming back to one of the questions you asked me earlier.
Why don't we HSM and move stuff around?
As the global namespace lives for a while, if you're doing lots of migrations, although on the front, the user sees a very clear, consistent view of all the directories,
on the back, now you have directories in different places.
And so from a backup perspective, you're going to want to back them all up at the same time.
And so I really think it's this administration, which is why customers really like knowing where all the data is. They've got engineering directory and all the subs on one repository, and they know
when they back up, all of engineering is backed up.
Right.
Well, I mean, down to the point where when that RAID set fails, I know who's affected.
That's right.
That's right.
I think we've relied on that a long time.
I think that we have to start getting over it. When you start looking at it: 10 years ago I would literally know that LUN was on this RAID set. But then I got a 3PAR and I didn't know that anymore. And now, even within one system, some of it's on flash and some of it's on disk.
And I think we have to accept our abstraction layers and just continue to view through them, which in this case would mean that I'm going to back up through the edge filer.
And so do you support NDMP on the edge filer to dump to me?
We do not today. Right now it's really just an NFS or a CIFS stream as you pull the data out of us.
Okay.
But I understand. You know, it's interesting, as we get more and more into the cloud, it's becoming harder. You can certainly version on the cloud, but you can't do the snapshotting and streaming on the cloud.
We realize in the cloud world we're going to have to pull more and more of those functions into our box.
And you'll definitely see that over time.
Okay.
Now, if I'm using a cloud backend, can you provide snapshots?
Yes.
And are they – yeah, now it's,
and I want that snapshot to be consistent
across the whole file system.
Yes.
Actually, I want to be able to define
the granularity of that snapshot as well.
Yeah, because I want one file system,
but I want to be able to take per VM snapshots.
And in fact, that opens the whole subject.
Do you guys support VMware hosting?
We do.
We actually, you should see us operate in a VM environment,
especially a VDI environment.
You know, people talk about bootstorms and how it crushes the servers.
Bootstorms are a beautiful thing to see. What happens is the boot block or boot blocks get
pulled from the core filer into the edge filer. And then we service them out of one of the edge
filer nodes. As their op count gets really high, the system calculates the cost of going
from node to node in the cluster versus storing less data in the clusters and replicating those
boot blocks. And we'll actually replicate the boot blocks across the nodes. And you'll see this
massive parallel access to the boot blocks. So literally,
when you see these boot storms, one read happens from the core filer, and then you'll see a little
bit of cache-to-cache communications between our nodes, but then eventually you'll see none,
because we'll replicate it. People talk about boot storms a lot, but the truth is,
it's just a lot of read traffic. And it's a lot of read traffic if you're using linked clones
to common data. So it's easy to cache.
Are you smart enough to cache that well enough or not?
We are. We do really well in VDI bootstorms.
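A toy version of the boot-storm behaviour described above, with an invented replication threshold: the first read anywhere pulls a block from the core filer, later reads are served cache-to-cache between cluster nodes, and once a block is hot enough it gets replicated to the local node so reads stop crossing the interconnect.

```python
REPLICATE_AFTER = 1000        # remote hits before we copy the block locally (invented)

remote_hits = {}              # block id -> hits served over the cluster interconnect
local_copies = set()          # blocks this node has replicated

def read_block(block):
    """Return which tier would serve this read in the sketch."""
    if block in local_copies:
        return "local-cache"
    remote_hits[block] = remote_hits.get(block, 0) + 1
    if remote_hits[block] == 1:
        source = "core-filer"          # one cold read from the core, ever
    else:
        source = "cache-to-cache"      # another node in the cluster already has it
    if remote_hits[block] >= REPLICATE_AFTER:
        local_copies.add(block)        # hot block: keep a replica on this node
    return source
```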
Now, if you're running VDI, I want data reduction.
Because there's some executive VIP class where really the best way to deal with them is to give them a full persistent clone and dedupe the storage on the back end, so that when it's announced that everybody who doesn't complain is going to get recomposed this weekend, you can leave those guys out, because they don't complain ahead of time.
They just complain after.
Right. And so, we don't support dedupe today. And it was a very strategic decision not to. At least when you had NAS behind us, we wanted you to be able to pull our nodes out at any given time.
We didn't want to hold your data hostage.
Originally, you guys were the acceleration play.
And just the fact that the data on the back end looks like the data on the front end meant you could take us out any time.
And that was very important to our early adopters.
And so I'm glad we did it. But now that we have the cloud, we've already added compression.
And I think you'll see dedupe is on our roadmap as well. It's not there today, but it is on our
roadmap. Oh, that's interesting. Very interesting. Yeah, I mean, it makes sense – even if you don't dedupe on NAS, it makes sense to dedupe in the local storage for things like VDI where the dupe level can be incredibly high.
Right.
And therefore you increase the effective size of your cache.
Right.
And it makes sense to dedupe in public cloud where you're paying per gigabyte per month.
So it just reduces your bill.
Exactly.
No, that's exactly right.
On the core filer, we absolutely needed to avoid it because we wanted to show that we were – I'm sorry.
On the NAS filer, we absolutely wanted to avoid it because we wanted to show that we were transparent.
But on the cloud, it's a very different story.
And the more we can save our customers, the better. Right. Yeah. Where was I going? So getting back to the little
spec SFS things that you submitted, I had a slight problem with the AWS Flash version.
I don't know if you want to talk about that at this point, Ron, but my concern was,
I understand it's AWS Flash storage. I understand that. But there was no way for me to actually understand how much Flash versus disk was actually being used in the cloud.
You know where I'm getting at?
Yeah, so in our box, in the 3800, which is our product, the FXT 3800, there's Flash and disk.
And at any given time during the run, it's hard to say
exactly where the data is. I could tell you that we'll run both of them at 100% before we'll evict
anything. So what I could probably tell you is very early in the run, the flash got filled,
and then we were evicting to disk. So I would bet you it was 100% full in flash after a couple of minutes,
and then it was evicting to disk and putting the hottest stuff in flash and the slower stuff in disk.
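As a rough sketch of that fill-then-evict behaviour (the capacities and the LRU policy are assumptions, not the FXT's actual algorithm): flash fills first, the coldest blocks spill to disk, and only when disk is also full does anything leave the node.

```python
from collections import OrderedDict

class TwoTierCache:
    def __init__(self, flash_blocks, disk_blocks):
        self.flash = OrderedDict()                 # recency ordered: oldest first
        self.disk = OrderedDict()
        self.flash_cap, self.disk_cap = flash_blocks, disk_blocks

    def access(self, block):
        if block in self.flash:
            self.flash.move_to_end(block)          # still hot: stays in flash
            return
        self.disk.pop(block, None)                 # promote from disk if present
        self.flash[block] = True
        if len(self.flash) > self.flash_cap:       # flash full: coldest spills to disk
            cold, _ = self.flash.popitem(last=False)
            self.disk[cold] = True
            if len(self.disk) > self.disk_cap:     # disk full too: drop from the node
                self.disk.popitem(last=False)      # (the copy on the core filer remains)
```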
I wasn't as much concerned about the FXT 3800, because you gave, you know, in the SPEC SFS,
there's a clear description of the product, hardware, configuration, et cetera, et cetera.
But my concern was what was in the back end of the storage.
Right.
So what was in the cloud was – I know, Ray, this is your big sticking question,
and here's the best I could tell you.
Remember, all of the edits, all of the directory changes, all the file creates,
all the writes, all of those are serviced strictly in the Avere.
And so none of those transactions ever made it to the core.
And in the case of spec, we targeted 180,000 spec ops across three nodes where the entire data set could fit in the edge filer.
So what that meant is there were never any cold reads in the edge filer.
So what we ran during that spec run was all of the synchronous transactions happened in the edge filer,
and then the core filer was a massive destaging queue.
So from run to run, we probably destaged approximately
the same amount of data every time.
But if you were to lose the link to Amazon
in the middle of that run,
and then it would come back 10 minutes later,
what you would see is the destage at the end of the run,
the destage queue would just be a little bit bigger
because we couldn't have destaged as much data.
I mean, that was the other thing that was kind of interesting is most of the,
you know, there was like four that were submitted. Yeah.
But it was like local NFS, local ZFS, and remote ZFS SPEC SFS runs with an Avere in front, but even with the cloud,
the throughput was pretty much the same, but the latency was different.
Was slightly different.
And, yeah, as far as we were concerned, they were all the same.
But you're right.
And the only thing – they were very, very close. If the destage queue was having to deal with all kinds of latency or dropped packets to the core filer,
then potentially it was taking a little more CPU from the system, and maybe that added to the latency across the front end. But I think if you go and look at those results, you'll see that
they're really tight. They're very, very close. Yeah, yeah, I can probably bring them up. I won't
do that now. All right, that's interesting. But I think you said it exactly right. That was the whole point of when we did
that spec. We ran three of our nodes in front of local NAS. We ran three of our nodes in front of
remote NAS. We ran three of our nodes in front of a local object store. And we used both CleverSafe
and Amplidata. So we did two of those. And then we ran three of our nodes across the public internet to AWS, and all of those runs got the same performance and very close to the same latency.
Yeah, yeah, yeah, yeah. It was kind of interesting.
Well, now, no.
There's two good reasons.
First of all, there's disaster recovery.
That, you know, my Philadelphia site burnt down, and I'm going to run VDI instances in EC2 for my Philadelphia people to be able to work from home, and they should be able to access their data.
But more commonly, I'm going to have servers in my data center,
and I'm going to do cloud bursting. And I want to spin up more instances of my public web server
in EC2 and have them via NFS grab objects from the common store.
Almost like a hybrid cloud kind of solution.
Yeah.
Yeah. Howard, I love your insight into
all this stuff. And what I could tell you is we absolutely see the value in what the Edge Filer
does functionally, not only running locally, but running in the cloud. So I think you'll see in the very short-term roadmap
that we'll be moving in that direction.
Hey, Bernie should be working on that right now.
All right, all right, gents.
We're about 40 minutes here.
Howard, do you have any final questions for Ron?
No, I just – I've been amused with how interest in clever solutions for file management waxes and wanes.
You remember a while ago we had – we all thought that we would be interested in file area networks.
Yeah. And then that flopped astoundingly.
And we're seeing another wave of really interesting ways to deal with files, especially in remote and branch offices that have never been well served.
And so I'm really pleased to see solutions from Avere and, you know, their more direct competitors like Panzura, and really wacko things like the guys that do, you know, NFS front ends for S3, specifically to run in EC2.
Yeah.
I think that, I think the thing that I like
or what our customers like about what we do
is we present the exact same interface to storage
that they've been using for decades.
It's NAS, it's NFS, it's CIFS, and it's performant,
and they can scale performance,
yet we allow them to leverage all this new cloud stuff.
And it really helps a lot if the applications can stay the same,
but you can move the data center in the direction that is going to help save you cost.
Okay. I have one last question.
There's the last question I forgot.
Have you guys said anything about VVOL support?
So just virtual volumes.
Right.
Yeah.
And so you mean at the NAS level or at the cloud level?
You mean at the front end?
Yeah.
So it's the front end protocol.
It's a VMware solution, right? It means being able to have you guys do policy-based storage on a per VM basis, even as basic a thing as a per VM snapshot.
Right.
And I know we have not done anything there yet, but it's – this is one of the issues you have with smaller startup companies, right?
You have a very small pool of engineers, and you target them at the next big thing.
But it's not something – it's definitely something we're interested in.
It's just a matter of prioritization.
But I totally understand the value in it.
One other question from my perspective.
So, I mean, in the past you fielded SPEC SFS runs with like 44 nodes and stuff like that.
Is there a maximum number of nodes in an FXT3800 cluster?
So publicly we tell everyone that we support up to 50 nodes.
And really that's because we have two 50-node clusters in our QA lab. And one of the
earlier comments I made is we want to test everything here first. Can we drive above 50?
If a customer comes in and says, I need 100 nodes in a cluster, we can definitely do it. We'll just
have to build up QA to get there. But all of our data structures, all the infrastructure for the
clustering, everything gets distributed. There's no centralized metadata server. All the functionality scales with the number of nodes.
And so 50 really is just a testing limit today. Okay. Okay. I suspected that from the fact it
wasn't a power of two. Right. That's good. There should be a power of two solution there. Okay.
Well, in any event.
So, Ron, is there anything else you want to say?
No, that was good.
I thoroughly enjoyed the discussion, gentlemen.
All right.
It was our pleasure, Ron.
Well, this has been great.
Thank you, Ron, for being on our call.
Next month, we will talk to another startup storage technology person. Any questions you have, let us know.
That's it for now. Bye, Howard. Bye. Thanks again, Ron. Thank you both. Until next time.