Grey Beards on Systems - 56: GreyBeards talk high performance file storage with Liran Zvibel, CEO & Co-Founder, WekaIO
Episode Date: February 15, 2018. This month we talk high performance, cluster file systems with Liran Zvibel (@liranzvibel), CEO and Co-Founder of WekaIO, a new software-defined, scale-out file system. I first heard of WekaIO when it showed up on SPEC SFS 2014 with a new SWBUILD benchmark submission. They had a 60-node EC2-AWS cluster running the benchmark and achieved, at …
Transcript
Hey everybody, Ray Lucchesi here with Howard Marks.
Welcome to the next episode of the Greybeards on Storage monthly podcast,
a show where we get greybeard storage and system bloggers to talk with storage and system vendors
to discuss upcoming products, technologies, and trends affecting the data center today.
This is our 56th episode of Greybeards on Storage, which was recorded on February 9th, 2018.
We have with us here today Liran Zvibel, co-founder and CEO of Weka.io.
Why don't you tell us a little bit about yourself and your company, Liran?
Sure, and thank you very much for having me here.
So Weka.io is about a four-year-old storage company.
We're building the fastest file system and maybe even the fastest storage solution available today.
We have strong storage roots.
We were the original team that created XIV Storage almost 20 years ago; it was acquired by IBM at the end of 2007
and is now IBM Spectrum Accelerate.
So we have a long legacy of making storage systems work. And about four years ago, we decided that we want to create the most scalable, highest-performance file system that would be the best option on-prem or on a public cloud.
Huh.
Okay.
So why don't you tell us a little bit about the name of the company?
All right.
Weka is actually a unit, like the Greek-derived prefixes.
It's 10 to the power of 30.
It's like a trillion exabytes. The
pronunciation is like the other units, so tera, mega, peta, weka. And it's just a very,
very large number: as I said, a trillion exabytes, or a million yottabytes, which are themselves each a million exabytes. Even the NSA doesn't have that much data, I hope.
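Those magnitudes are easy to sanity-check in a couple of lines of Python; the variable names here are just for illustration:

```python
# A weka is 10**30 bytes; check the two equivalent phrasings from the episode.
exa = 10**18
yotta = 10**24
weka = 10**30

trillion_exabytes = 10**12 * exa    # "a trillion exabytes"
million_yottabytes = 10**6 * yotta  # "a million yottabytes"

print(weka == trillion_exabytes == million_yottabytes)  # True
```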
At least not today.
But getting there, I'm sure.
Gosh, a million exabytes, that's crazy.
A trillion exabytes, that's even insane.
Yeah, what's 10 to the third between friends?
And you originally worked for XIV Storage before the purchase and that?
Yes.
By IBM?
Yeah.
So, not just myself, my co-founders also.
And actually, our team has most of the members of XIV Storage as it looked
right before the IBM acquisition.
Yeah, except for the guy with the helicopters, right?
Well, there is Moshe and I,
and he has helicopters.
We don't want to insult anyone.
I know there's more than one.
Yeah, yeah, yeah.
So the first time I saw Weka.io in any documentation,
you guys showed up on a SPEC SFS benchmark run
that I think was run on AWS hardware.
Is that true?
Yep.
So up until now, we actually
have submitted SPEC SFS results that were run on AWS hardware. Up until a few weeks ago,
we had the two largest submissions, at 500 and 1,000 concurrent builds.
The 500 run was on 60 nodes of actually small instances
that only had two SATA devices each.
And the 1,000-build run was performed on 120.
So we showed linear scalability between the two runs.
So currently, we still have the highest-performance SPEC SFS submission.
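A quick back-of-envelope check on the linearity claim, using the two AWS submissions just mentioned (60 instances at 500 concurrent builds, 120 at 1,000):

```python
# Builds-per-node is identical across the two runs, which is exactly what
# "linear scalability" means here.
runs = {60: 500, 120: 1000}  # instances -> concurrent SWBUILD builds

per_node = {nodes: builds / nodes for nodes, builds in runs.items()}
print(per_node[60] == per_node[120])  # True: ~8.33 builds per node in both runs
```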
And actually, next week we're going to publish
an on-prem submission, on
very similar hardware to
the current highest one running
GPFS, that will show about twice
that number. And this is a software build run?
Is that the workload? Indeed.
Because SPEC SFS has five
different workloads, but
the software build run is actually the most challenging
one, and is the run that most
vendors submit to, because running all five requires
a lot of time.
It's interesting to me how the nature of scale-out storage has been changed by the public cloud.
Our question at this point would always be, well, how big a cluster have you tested?
Because the vendor would always say, well, there's no design limit.
It could be infinite scale, but we've only tested to eight because we don't have the
venture capital to buy 16 nodes.
That's just gone completely by the wayside with the work you guys are doing.
Right.
Actually, we have the ability to run very large clusters.
The theoretical limit is 64,000 nodes in the cluster.
Even if you just look at us from the hyper-converged approach,
we really can scale even hyper-converged,
where, if you compare to the other solutions,
when they say scale, they mean dozens, not thousands.
We have tested up until about 4,000,
because even on the cloud it costs money.
Yeah, but you got a couple of orders of magnitude past where people usually get
at this point.
Exactly. And now we have signed a reseller agreement with HPE, so they're going to OEM our solution.
We announced it in November, and we're actually now in the process of building some huge government-funded
clusters that will have thousands of instances, even on-prem, so not just on the public cloud.
And let me guess, one of them is going to be here in New Mexico.
The labs? Possibly.
I was thinking more of the place in Salt Lake,
which is kind of buried and stuff like that,
that's recording all of our phone calls and all of our voice messages
and stuff like that.
No, but remember, Weka.io is a performance, not a capacity play.
Well, not primarily a capacity play.
So what I can comment on from your previous point is that we have issued a press release with the San Diego Supercomputing Center,
and they're building a supercomputer for life sciences applications. And when you said that we are a performance play
and not a capacity play,
this is actually changing,
because the applications today that require the very high performance
also require a lot of capacity. And we're basically coupling
what, in other solutions, would be
two totally separate file systems into a single product,
because we also have our tiering.
So our fast tier is a parallel file system
over NVMe over Fabrics.
And we're the only solution that provides a file system
out of NVMe over Fabrics with that same low latency. But when we are
tiering, actually, to an object storage solution, we have designed and implemented the exact
same algorithms that the traditional 20-year-old file systems did for hard drives.
So the same thing that, you know,
GPFS, Lustre, Panasas did to get high performance out of hard drives,
we're doing for our object storage scaling,
which just allows you to use any form of packaging the hard drives that fits your application.
So some would say, I'd like to push it to a public cloud,
but others would say, I want to use an open-source Swift or Ceph,
while others would say, I need geo-replication,
so I'm using one of the commercial object storage solutions.
But out of these solutions, we can still get you very high throughput, though not very low latency.
And we support projects with up to hundreds of petabytes of object storage
and just a few petabytes of NVMe storage.
A few petabytes of NVMe storage? What?
Well, if you're talking about hundreds of petabytes of object storage,
these are large-scale projects.
Obviously, we can support more down-to-earth
solutions, and these are more popular: let's say
100 terabytes of NVMe and a petabyte or
two of object storage.
Hold on, let me go back to this.
So, number one, you've got a file system
that operates over NVMe over Fabrics?
Indeed.
This is extremely unusual.
We support either Ethernet or InfiniBand.
For a lot of our use cases, InfiniBand is the interconnect of choice, especially now when
more and more applications start using GPU-filled servers.
These servers are connected via InfiniBand,
but the standard enterprise environment will use Ethernet
as an interconnect.
And we don't require any special Ethernet.
We also run on AWS.
So they are the pinnacle of
lossy networks, and we run very well there.
It's a good proof that we run on any Ethernet
network.
You don't require any RDMA
capabilities in the Ethernet switches or anything like that?
No. We have implemented it in such a way that we handle it all in our software.
So we run on most standard NICs, and we just fix it in our software.
We have isolated the networking stack so it can run on any standard, modern-enough NIC.
Right. Now, if you're not running on an RDMA NIC,
you're not using NVMe over Fabrics to access the SSDs, right?
So, since NVMe over Fabrics is currently a standard that no two vendors implement in an interoperable way,
we implement something that is very similar to NVMe over Fabrics.
So you get the same latencies and the same semantics,
though you couldn't connect us to a third vendor's system
that is running NVMe over Fabrics.
But if you look at the landscape,
currently no two other NVMe over Fabrics
can actually interoperate.
So we're not that much different.
Okay, so Liran, let's talk about architecture for a second.
You guys install as a distributed file system,
so I'm taking storage from all of the nodes
and building a shared nothing scale-out system from it?
Exactly.
So we run on Linux-based servers.
We can take over your NVMe devices, and then we run our own IO stack that provides
zero-copy, low-latency access; or we have a backwards-compatible way to use SAS and
SATA devices,
and then we're using the Linux IO stack, similarly, by the way, to how we were running
the SPEC SFS benchmark on AWS.
And when we run on these servers,
we are actually running in
what looks like user space.
We take control over complete cores
and a complete set of memory addresses
that we take away from the kernel, link them to
the network card, and we take over the NVMe devices.
And we're actually running our own real-time operating system in what seems
like user space, using Intel virtualization tricks.
So we run our own scheduling, we run our own memory
management, we run our own IO stack, and our own networking stack. We pack it all in something that
looks to the Linux kernel like an LXC container, so the Linux kernel knows not to access and not to try to also use the resources that we're taking over.
And this is how we can actually provide consistent low latency
because, as we all know, the Linux I/O scheduler
was never meant to be a consistent low-latency scheduler.
Not in the nanosecond range, certainly.
Nanosecond?
What are you talking about, Howard?
Not even in the microsecond range.
In an NVMe environment,
if you've got
random clock jitter
of a couple of hundred nanoseconds,
that's going to be problematical.
Yeah, maybe.
We're talking microseconds.
We're talking orders
of magnitude difference here.
Yeah, but we're still
ensuring microsecond
latencies is difficult
with the Linux kernel.
And that's the reason
we do it ourselves.
So when we're talking about Linux kernel jitter, even mentioning nanoseconds is being too kind.
You can have
bumps that are in the hundreds of microseconds if you're unlucky.
And when your total access time is in hundreds of microseconds, that can really be problematical.
Yeah.
So you can get 4K IOs to applications with us within 150 microseconds
if you're using NVMe and 100-gig Ethernet or EDR InfiniBand.
So we can get you IOs within the error range of the Linux kernel.
Okay.
Yeah, yeah, yeah.
What about metadata and stuff like that?
With any of these file systems, of course, you've got metadata requirements.
Is that shared across all the cluster nodes,
or is that some sort of a select set of cluster nodes?
So we're also the first file system
to be able to scale metadata.
And actually, the more servers you add,
the more metadata performance you get,
which is the inverse of any other file system.
So for us, you can get as many 4K IOs as you want.
So it's easier on AWS,
but a large media and entertainment customer was testing us on just 300 instances,
and they were running 4K IOs.
They got more than 10 million 4K IOPS running on AWS.
And they could get a much higher number, obviously,
if they had a bigger cluster or were running on-prem.
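For a sense of scale, the quoted numbers work out as follows; this is rough arithmetic on what was said in the episode, not an official benchmark figure:

```python
total_iops = 10_000_000  # "more than 10 million 4K IOs" on AWS
instances = 300

# Per-instance share and aggregate throughput at 4 KiB per IO.
print(total_iops // instances)   # 33333 -> roughly 33K 4K IOPS per instance
print(total_iops * 4096 / 1e9)   # 40.96 -> roughly 41 GB/s aggregate
```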
And the other thing we've solved is metadata scaling.
So we can have trillions of files in a file system.
So more than many of the object storage solutions.
And we can have, sorry, billions of files in a single
directory. And our directories operate with billions of files as well as they operate
with just thousands of files. So we have figured out a way of scaling metadata to actually work for a file system.
So a lot of times people say, hey, I have a lot of files, I would have liked
to use POSIX semantics, but I'm stuck with an object storage because I need to scale.
What we're doing for these projects, we show them that we scale better than their object storage,
and they don't have to change anything in their stack.
And they still get very low latency, which they never get with an object storage solution
because of how object storages work.
And you support NFS v4 and SMB3 and all that stuff?
So we support our POSIX client, which is fully POSIX compatible.
So anything you would read on the man page, we do, unlike NFS.
We support NFS v3, and we support SMB up to 3.1. We then also support S3 as an access method,
and we're actually the lowest latency S3 you could find,
and HDFS.
So you can actually run all your workloads
on a single Weka.io cluster
without needing to ingest and egress from
different storage systems.
So once it's there, you can access it from Windows, from HDFS, from your IoT devices,
however you would want.
And the storage nodes, so each storage node then is effectively a metadata server as well as a data server.
Do they have to be similar in capacity?
So, I mean, let's say you've got 40 nodes or 1,000 nodes.
Do they all have to have the same storage capacity, storage performance kinds of questions?
They don't. We actually have unique algorithms to
find out how much performance
we can get from each node before it lowers the
average. And that's actually the reason we can show
great linear scalability on AWS where
everyone else complains about noisy neighbors. Because
if you happen to have a noisy neighbor, it will just contribute less to the cluster.
And on average, if you have 20 nodes in the cluster, you have the same percentage of noisy
neighbors as if you have 200 nodes in the cluster, and that's why we get linear
scaling.
So you're doing this performance load balancing dynamically, so that indeed, when noisy
neighbors pop up and go away, things get adjusted?
Yes. We have means of spreading the workload in sort of a cryptographic-hash way across the nodes.
Let's say you happen to attack how we spread the load,
because you're just doing something that sends more work to a single server:
we'll be able to change how the servers handle their responsibilities.
So now three servers are doing what that single server was doing.
And again, the cluster's load is balanced.
This is pretty impressive stuff here.
I am...
Okay.
Yeah, well, I mean, you started off asking, you know,
what happens when I have nodes of different generations?
How do you load balance across that?
And the answer is, oh, yeah, that just happens automatically because we're dealing with it in real time.
Yeah, so just to finish the thought: for us, nodes of different generations are no different from nodes with noisy neighbors.
Right.
Okay, so let's move on to data protection.
That's great.
Storage guys should be paranoid.
If they aren't, they haven't done
storage long enough.
Yeah, absolutely. I agree.
If I ever meet a storage guy
who isn't paranoid, it just means he hasn't screwed up yet.
And I, therefore, don't trust him.
Well, I think it's
the right approach.
Let's start with the basics. I send you data.
Do you erasure code it across nodes?
Replicate it three ways?
So we protect it in our own mathematically sound coding scheme
that looks like erasure coding, but is fast.
Because when I told you earlier we're going to do a 4K write
in 150 microseconds,
I couldn't be doing it with Reed-Solomon or the other ways
of doing erasure coding.
Oh no, just waiting for the long-tail ack gets you in trouble.
So as part of our spreading the writes,
we can actually decide where we're performing the next write.
So we know where you have resources that are busier,
and they're just not going to perform the next write.
Or if you did it and you're waiting too long, you can try again,
because our structures are always redirect-on-write,
and we can do that very effectively.
But if we take a step back and talk about our protection scheme:
we protect with an erasure-coding-like scheme of D plus P, data plus protection.
The data width can be between 4 and 16, and the protection is two, three, or four. So the shortest
stripe would be 4+2, the widest stripe would be 16+4, though 16+2 takes you extremely far.
So let's look at how we would protect 16 plus 2
if you have 100 failure domains.
And by the way, a failure domain for us can be a rack,
if you say, I only have a single switch
and I don't have network redundancy.
So failure domain is configurable for us.
But you have 16 plus 2 protection.
Are failure domains hierarchical?
Currently they aren't,
but you can work slightly harder and implement it
yourself; it's slightly more difficult.
But I heard the magic word "currently", so it's on the roadmap
somewhere within the next three decades.
Right. We're not discussing roadmap.
So, you mentioned 100 failure domains, like 100 racks?
So it could be 100 racks, or it could be 100 servers,
if you think that...
If you want to configure it that way, yeah.
So it's up to you, if you have two switches or other ways to ensure
that a single failure won't take
many servers down. So you have these 100 failure
domains, you're protecting 16+2, and actually each
write picks its best 18 servers to write
to. So it's not static, 18 servers protecting 16+2; each write finds the best devices to write
to out of 18 failure domains. And then basically what it means is that we have a huge number of distinct,
very different stripes all over the system. Because if you run the math on how many groups of 18
objects you can have out of 100 failure domains, it's way bigger than a weka: 10 to the 60 or even more.
So it's a huge number.
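The domain-level combinatorics alone are easy to run; the even larger figure quoted above presumably also counts the per-device choices inside each of the 18 domains, which multiply this number much further:

```python
from math import comb

stripe_width = 16 + 2    # 16 data members + 2 protection members
failure_domains = 100

# Number of distinct 18-member domain groupings a stripe can land on.
groups = comb(failure_domains, stripe_width)
print(f"{groups:.2e}")   # on the order of 3e+19 placements, before device choices
```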
Then when you have a failure, even for a full failure domain,
all the other instances look and find all the stripes that share that failure domain
and basically rebuild and rebalance it
across the other failure domains.
When you have a second failure,
let's say a second server failure,
you're first going to find all the stripes
that share these two servers.
You rebuild them fast and before the rest.
Which brings everything back to N+1, and then
you can take your time.
Exactly. So after seconds, or let's say a minute, you're down to two distinct
single failures. Then you can have a third failure going down, and now you'll have two
distinct double failures to rebuild fast. And if you have...
Okay, I'm both impressed
at how clever that is and kicking
myself for the world not
having made something like that standard
five years ago.
Well, we started almost five years ago.
Yeah.
But the whole concept of
first we find the data where if we lost it
we'd be in the deepest doo-doo
and let's rebuild those and work our way out
in a risk minimization
optimization. And why haven't we been doing that for a while?
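The rebuild-priority idea Howard is summarizing can be sketched in a few lines. This is a hedged illustration of the ordering logic only, not WekaIO's implementation; the stripe and failure-domain representation is invented for the example:

```python
def rebuild_order(stripes, failed):
    """Return damaged stripes, most-degraded first, so stripes that lost
    two members are repaired before any singly-damaged stripe."""
    failed = set(failed)
    damaged = [(len(failed & set(s)), s) for s in stripes]
    # Sort by number of lost members, descending; drop undamaged stripes.
    return [s for lost, s in sorted(damaged, key=lambda t: t[0], reverse=True) if lost > 0]

# Four stripes spread over failure domains 1..9; domains 1 and 7 just failed.
stripes = [(1, 2, 3), (2, 7, 8), (1, 7, 9), (4, 5, 6)]
print(rebuild_order(stripes, failed={1, 7}))  # (1, 7, 9) lost two members, so it comes first
```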
Just so you know, now we have it patented.
Good for you.
We have actually quite a lot of patents. So we're doing a lot of novel things.
We have been doing storage for 20 years,
and we've realized all the places where storage doesn't work well,
and we fix them.
For example, we have end-to-end data protection,
but we never store the integrity data on the same media
that stores the data.
So if the media lied to us and that write didn't happen,
the previous write doesn't check out.
How many storage systems store the end-to-end protection that way?
They tell you they have end-to-end data protection,
but then they end up storing the protection data with the media,
with the data, sorry.
You mean like a block checksum of some type?
Yes.
And you're not storing it with the data?
Of course we aren't.
It doesn't make sense.
Right, well, I mean, ZFS does that
for exactly that reason.
The checksum's stored
in the parent so that
a screw-up of the data block doesn't also
screw up the checksum.
Yeah, the metadata.
Are you saying it's separated that much?
Because when you said media, I was like: you write it to a different disk drive?
Yes. So we do write it to a different disk drive.
Presumably, I'll call it the
block metadata, or the block virtualization layer, where
you've got the block ID that
says this block holds this much data, and there's a checksum associated
with it, or something like that, where it actually resides across the cluster's nodes and
drives.
Right, right. But where ZFS is set up for the, well, one time in 10 to the
god-knows-how-many, case where you tell a SATA drive to write to LBA 5000
and it writes it to LBA 6000 instead,
you guys are prepared for a whole disk drive
to be taken over by the KGB
and report wrong data on purpose.
Indeed.
That's because we already figured out
that storage guys need to be paranoid.
Yeah, we already went over this logic.
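A minimal sketch of that separation, with two Python dicts standing in for physically separate drives; the CRC32 here is just a placeholder for whatever integrity code a real system would use:

```python
import zlib

data_drive = {}      # LBA -> payload, on one device
checksum_drive = {}  # LBA -> integrity code, kept on a *different* device

def write_block(lba, payload):
    data_drive[lba] = payload
    checksum_drive[lba] = zlib.crc32(payload)  # lands on separate media

def read_block(lba):
    payload = data_drive[lba]
    # A drive that silently corrupts a block cannot also forge its own proof.
    if zlib.crc32(payload) != checksum_drive[lba]:
        raise IOError(f"silent corruption detected at LBA {lba}")
    return payload

write_block(5000, b"quarterly results")
data_drive[5000] = b"tampered!"  # simulate a lying drive
try:
    read_block(5000)
except IOError as err:
    print(err)  # silent corruption detected at LBA 5000
```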
I'm assuming this is a log-structured file system.
You never overwrite data. Is that correct?
No, we don't really discuss how we do things.
Our data structures are inherently different
than any other file system data structure you currently think of.
And it basically...
Well, just the way you're writing the backend data, with stripes on the drives that are ready for them when they're being written, would require a whole different set of metadata.
Right.
So you cannot actually compare what we're doing.
So on the outside, we give you a set of a distributed file system,
but on the inside, we're doing things radically different
than the other solutions.
All right.
So snapshots, replication, all that other gunk, you guys support snapshots?
Obviously. So we have snapshots because our writes are redirect-on-write, so snapshots are
baked in from the get-go. They don't take time to take, they're in the milliseconds range, and they don't have a performance penalty
once you have taken them.
But this is something you would expect from a file system these days, and obviously we
have clones also.
But then we have actually taken our snapshots and the ability to tier to object storage and created a feature over it that I think is a lot more exciting.
So we have the ability to take a snapshot and push it completely into the object storage
with all the metadata and all the data, so you don't need the original cluster anymore.
So actually what we're doing,
we're able to take all our data
and everything else you would need
to restore a file system
and push it to a third-party storage
or to different third-party storage.
So you don't need to trust us
to say that the data is backed up.
Because a lot of vendors come up with a solution they call backup, but only that vendor's
software can read the backup. And then, because you're paranoid, you wouldn't trust it.
So the object that you create from a snapshot, let's say,
is almost self-defining with respect to the file information,
the file metadata, and all the directory information that's required.
So how does one go about, let's say you did this, and let's say Weka.io went away,
how would you reconstruct it?
Do you have some sort of a standalone utility that would go through and reconstruct it?
We don't have a standalone utility, but you can have any other cluster mount it.
So a very good use case for us is to go to AWS S3 and then say,
all right, I have this monthly report or quarterly analysis.
I need tons of resources I don't have on my on-premises data center.
I'm going to form a Weka.io cluster in EC2.
Twice the instances will get you twice the performance.
You're going to do the work there.
And by the way, twice the resources again will again get you twice the performance.
You'll take another snapshot. You will save it to the same bucket.
We save only deltas, so hopefully it will be an easy way.
By the way, you can take a snapshot every hour and push it to the object storage.
And then any cluster that currently mounts a clone of the same file system ancestor
can also mount read-only any other snapshot sent to the same bucket or create a clone of it.
So now you can have your on-premises cluster mount read-only the results from EC2
and copy back the results or link to them.
So basically, this feature allows you to have backup or DR,
but it also lets you have the ability of looking at us as a global namespace,
but each file system runs at local latencies.
So we never route IOs through the internet or through your WAN.
So you're doing the work, the work is done,
then you push it back to the object storage,
which we treat as repository, and any other cluster can view it.
Okay.
So what we are doing on the backup side, for example,
we have customers who just have one new server
they bought for $2K.
It has six SSDs, six VMs on it.
It's enough to run a Weka.io cluster with 4+2.
By the way, different clusters don't have to have the same amount of instances or servers.
And then it mounts a read-only view of each new snapshot and makes sure that it's readable
and workable.
So it is configured to track the new snapshots and make sure that the new data is accessible.
And then you make sure that not only you're backing up, but you can actually use that data.
Because that's another problem that happens with backup schemes.
You think everything is great, but then when you actually need to use it, it's not there. So you can either do it on-prem with such a 1U server, or run it on AWS, if you're tiering
to AWS S3, with three small instances, just making sure that the snapshots make sense.
Let me try to make sure I understand what you just said, Liran.
So I'm an on-prem Weka.io user.
I've got my cluster of storage here.
I decide I've got some, let's say, accounting work I need to offload someplace.
So I take a snapshot.
I now upload it to Amazon S3.
And I decide, okay, I'm going to define a new Weka.io cluster in AWS EC2, and I'm going to reconstitute that S3 snapshot as a new file system,
or as part of the file system that I just created in EC2.
I process it there at whatever level of processing I want.
I create the results back on that cluster.
I now tier it back to S3 as another object.
And now I can bring that object back down to on-prem?
Exactly.
My gosh.
That's not the word I was going to use, but that's...
I'm impressed.
So in the disaster-recovery-as-a-service case, where I'm taking a daily snapshot and pushing it up to S3 and then spinning up nodes, the nodes act as caches to the S3 data
until they populate?
Yes.
So they act as caches.
If you aren't writing new data,
it will be totally read cache.
But once you write data,
the system knows what content on the local SSDs is read cache
and what is now primary data.
So when you're taking your next snapshot, it only
sends the data, the deltas of the
data back to the S3 bucket.
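The delta mechanism Liran describes can be sketched with a dict standing in for the S3 bucket. This is illustrative only; the real on-object layout is WekaIO's own, and the block/manifest scheme below is invented for the example:

```python
bucket = {}     # object key -> block payload (stand-in for S3)
manifests = []  # one manifest per snapshot: {block_id: object key}

def push_snapshot(blocks):
    """Upload only blocks changed since the previous snapshot, but record a
    full manifest so any cluster can reconstruct this snapshot on its own."""
    prev = manifests[-1] if manifests else {}
    manifest = dict(prev)
    for block_id, payload in blocks.items():
        if prev.get(block_id) is None or bucket[prev[block_id]] != payload:
            key = f"snap{len(manifests)}/{block_id}"
            bucket[key] = payload          # upload only the delta
            manifest[block_id] = key
    manifests.append(manifest)
    return len(manifests) - 1

def restore(snap_id):
    return {b: bucket[k] for b, k in manifests[snap_id].items()}

push_snapshot({"a": b"v1", "b": b"v1"})
s1 = push_snapshot({"a": b"v2", "b": b"v1"})  # only block "a" is re-uploaded
print(restore(s1))  # {'a': b'v2', 'b': b'v1'}
```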
I'm trying to write down what I just said. Anyway.
God, this is impressive.
Thank you very much.
Clearly, you've got a very interesting file system,
but you can't be trying to be all things to all people at the moment.
What are the markets you guys are addressing now?
From the beginning of our conversation, it sounded like a lot of HPC.
Yep.
The reason we go after HPC
is that we realize that
storage requires trust.
So it takes time for new solutions.
It doesn't matter how exciting they are
to gain acceptance.
So what we're doing,
we're looking for the applications
that cannot afford not to have this.
Because if they're using the traditional solutions,
they're just wasting lots of time or other resources.
The biggest complaint I hear constantly about HPC file systems
is that they just require a huge amount of care and
feeding. This is also true, but give them as much care and feeding as you want: you're not going to
read fast enough for these GPU servers. So the GPU servers, they're now doing all this machine learning, and the files that they work over are small: text samples,
voice samples, images. And these systems need very high throughput, and we can fill
the line: we can read 12 gigabytes a second out of a 100-gigabit line for a single client.
And we can have hundreds of these for a single cluster. The other file systems couldn't even have a single client fill the line.
No other file system fills the line to so many gigabytes a second,
because the latency was too high.
The amount of time they wasted on opening the file, reading the file,
closing the file just wouldn't let them utilize it.
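The line-fill claim checks out against raw arithmetic, ignoring protocol overhead:

```python
link_gbit = 100                  # 100 Gb/s Ethernet or EDR InfiniBand
line_rate_gbyte = link_gbit / 8  # 12.5 GB/s theoretical maximum
claimed_gbyte = 12               # GB/s quoted for a single client

print(claimed_gbyte / line_rate_gbyte)  # 0.96, i.e. 96% of line rate
```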
So GPU-accelerated workloads are a very good fit for us, basically because they don't have a choice;
we come into POCs and it's just worlds apart.
Another very good use case for us are genomics, life sciences, microscopy,
because they used to have large files where GPFS and Lustre made sense. But now the way they're sampling their data
creates tons of tiny files.
We work great with tiny files.
We work okay with large files,
but it's just very easy to work with large files.
But we're the only solution that works well
with the small files.
And small files is where many scale-out file systems break down.
Exactly.
Where are you successful?
You ask the scale-out file system
vendor and they say, well, media and entertainment.
It's like, okay, I understand.
You want your average I.O. to be
in the megabytes.
We also have successes
in media and entertainment.
We're going to announce, I think, two of the largest studios soon.
But they use us and still have alternatives.
So they can use other solutions as well.
We're currently looking for the customers that have no choice.
Okay, that makes sense.
I guess I have to ask the question.
So how do you charge for Weka.io?
Is it on a per capacity basis, per node?
So it's per capacity.
And we have two different prices, the hot tier capacity and the tiered capacity.
And then the charge is annual,
so it's basically a software subscription.
And the bigger you are, the better your unit economics.
Welcome to America.
Well, God, this is great.
We could do this all day, quite frankly,
but I think we've got to go on here.
Thank you. I had lots of fun, because usually it takes me a lot longer to explain what we're doing.
No, no, it's great.
Howard, do you have any other questions?
No, I'm fine, Ray.
Liran, is there something you would like to say to the audience before we go off here? Well, if you're intrigued, you can try running Weka yourself on the AWS cloud. Just go to start.weka.io, and you can very easily create an AWS cluster.
Thank you very much for being on our show today, Liran. We really appreciate it.
Thank you very much for having me. It was, as you said, great fun.
Next month, we will talk to
another system storage technology
person. Any questions you want us
to ask, please let us know. And if you
enjoy our podcast, tell your friends about it
and please review us on iTunes as this will help
us get the word out. That's it for now.
Bye, Howard. Bye, Ray.
Until next time.