Storage Developer Conference - #62: Getting it Right: Testing Storage Arrays The Way They’ll be Used
Episode Date: January 31, 2018...
Transcript
Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair.
Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage developer community.
Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcasts. You are listening to SDC Podcast Episode 62.
Hi, everybody. After lunch, this is the most difficult presentation and session of the day, because in about 10 minutes, lunch is going to kick in.
And as hard as you try, you'll do what we call an eyelid check.
So I'll try to keep you entertained.
We call that the church nod.
Yeah, sure. As it says here, I'm Peter Murray, and I'm from Virtual Instruments, and I've got a big, long title here for the presentation.
Virtual Instruments, some of you probably know us a little bit better as SwiftTest, Load DynamiX, or that SwiftTest box.
That's where I come from. And for about 10 years now, we've been building and enhancing what I'll still say is the best load generator on the planet.
Okay, I know I'm doing a little bit of commercial stuff.
I should shut up and stop doing that and just talk about what I'm talking about.
But I want to give you a little background.
So I've been an SE from the time we started out with this, and I say that I have the best job in the world for a seven-year-old boy: I break stuff, people give me money, and I don't even have to fix it. That's my job, and it's pretty cool. And we can actually break almost anything pretty easily.
Okay, enough of that.
The reason I tee it up that way is that I just want to talk to you about testing. So, I guess about four years ago, I joined the SNIA Solid State Storage System Technical Working Group.
We call it S4TWIG because hardly anybody can remember the full name of it.
And the goal of it was to come up with a methodology for testing.
Originally, we said we want to test solid-state arrays.
Well, it turns out if you get the right methodology, it doesn't matter what you're testing.
If it's storage and you test it that way, it won't penalize anything particularly, and it'll help people show off their features to their best effect when you're really testing.
Over the years of doing this, I remember the first time we came here, and people said, Swift what? We'll talk to you later. Oh, that looks like kind of a nice little wrapper around SMB torture. A lot of things were said back then.
And over the years, we've gotten more and more people to know us, and more and more people are using us: pretty much all of the vendors, certainly the tier-one vendors, most of the second tier, and now a lot of enterprises. And the reason that they use us is that we help them test their equipment the way it's going to be used. So that's why we're saying: test storage arrays the way they'll be used.
And this is the first of two presentations.
So I'm talking to you guys as storage vendors, which is probably most of us here.
Is that clear?
Any service providers or any standards trolls like me?
Okay.
There's at least one user here.
Really?
Oh, yeah, I'm sorry.
That's right.
We'll convert you, too.
Okay.
So I want to just give you some ideas of how we do what we're doing.
Now, there is this solid-state storage system technical working group.
I got it out again.
It's working on coming up with a methodology
for testing flash arrays.
There's still a lot of discussion.
There's still a lot of viewpoints.
So I'm going to give you the perspective
that I have from my company.
And it's not just from me.
We have a professional services group that goes out doing proofs of concept, helping people do testing, doing bake-offs, and things like that.
Every week, they're out doing something new.
And what I'm going through here is kind of a distillation of what we've been doing for a lot of years.
I hope you find it useful.
So I'm going to talk, blah, blah, blah.
I'll talk about these things.
If I don't talk about it, if I miss anything, come back and yell at me later.
And as I said, this is where I'm going to talk about the methodology,
the way we believe things should be done.
Wednesday afternoon, I'm going to do a second presentation
about the way that we have implemented our methodology
and what we do and how it works.
So solid-state arrays became mainstream when nobody was looking.
Yeah, the hyperscalers are still using disk drives, and more power to them.
They've got a working model, and it goes great for them.
But in enterprises, solid-state storage is displacing the old disk drives.
Spinning rust is used less and less.
There are some things that are different about them.
We really need to test differently.
In the past, we had programs like Iometer and FIO and others that did straight I-O testing.
They were developed from disk utilities. They were not concerned with anything other than trying to get access
and get something back from a drive.
And you could test randomly across a drive,
or you could test sequentially,
and you could do reading and writing together.
Well, with solid state, the world kind of changed.
We've got new features,
and we really need to test for those features.
Did anybody ever see dedupe and compression on a spinning drive?
Well, you could wait a really long time to get your data back and do it,
but nobody bothered because it didn't help.
Well, that's changed a lot, so we're going to look at some things like that.
And we believe that the best way to test now isn't just to use the synthetic 80/20 mixture of writes and reads, random or sequential, and all that, because it doesn't tell you how applications are going to run in a data center. What's going
to give you a better idea of what's going to run in a data center is to emulate an application as
closely as you can. Now, that doesn't mean that we're perfect. We've still got a long way to go. It's
a process we're going through. We're working like crazy on it right now to get finer and finer
grain control of what those applications look like. So I want to talk now about the elements. So what's common about an application stream? It has locality. That means there is a
certain pattern to where things are written and when they're written, where they're read,
and when they're read. And this is true even though people talk about the application blender effect, saying that, especially when you're working with a lot of VMs, everything looks random.
I'm going to show you a chart,
actually an HTML page,
that says that's not true,
and that what I'm going to show you
is what we found is more honestly
what's out in the world.
So we need to understand locality.
Where is stuff being written?
How is it being grouped when it's written there?
Access patterns.
Anybody, if you go in and look at performance stats on an array,
especially older arrays, you'll look at it and say,
average block size, 32, 33K.
I still haven't seen a 33K block in my life.
But it's a mixture.
So they take all of that and crunch a number,
and 33 is what they give you.
Well, it's made up, actually, of multiple different sizes.
So we want to test with those sizes.
We want to test in realistic ways.
Uniform random write.
I'm going to hit this again, and so I'll be repeating myself.
But uniform random write is the way some vendors say to test solid-state arrays
because it makes their array look really good.
Vendors will say test with this certain IOP pattern using this particular load generator,
but they don't use bursts, and it distorts the performance.
They'll say test without data content.
Who needs that?
Well, it depends.
You know, not every array is good for every single application.
If you've got a highly compressible application,
then you probably want to test with compression and dedupe
to find out how well it does,
because that's going to lower the cost of the array that you get.
And lastly, testing with bursts.
Subsecond bursts are present everywhere.
Anybody ever seen a computer that sends traffic out at 100 IOPS?
No, it dumps the file.
It gets rid of it.
So we're going to talk more about that.
So we want to understand what's going to happen in production.
This is from my marketing guy.
So for those of you who haven't figured it out yet,
you don't get as many write cycles from solid state storage.
Okay, we get it.
So they try to avoid writes.
A friend of mine at one of the vendors says it's best to call these things write avoidance arrays.
So they coalesce a bunch of writes up into a memory page, and then they lay them down.
And then they watch that, because where that information was coming from when it was laid down is likely to be followed by other stuff in that same region.
That's the way an application operates.
If you think of adding to a customer file,
it's pretty sequential.
You're adding, adding, adding, adding, adding, adding all day long.
So in many ways, those bands show up and you see them.
So the write time really is implementation dependent.
If you're running JBOD, there's not so much.
But if you're looking at most of the modern arrays now,
they are doing a lot of sophisticated computing and math
to write things as efficiently as possible,
to avoid overwrites, to speed access when somebody does read it back,
if they ever do.
And we want to make sure that that gets done right.
Oh, I said here, reading is free.
Well, reading doesn't wear the solid-state memory.
It doesn't cause any problems.
Writing does.
So it's a little bit different,
but reading still introduces problems: you can get blocked. If an array is trying to get to some location and it's being written to at that moment, it either has to go to RAID and get to another copy of it, or it has to do some sort of work to keep that access going as fast as it needs to be, to maintain the overall latency.
In many cases, I won't say most cases, data reduction, which is compression and dedupe and pattern reduction, requires that post-processing occur; there's just not enough time inline. And it's going to be interesting to see what happens with NVMe,
because at the much higher data rates,
we're going to have to have something that can just scream, slamming things down and making less space out of it.
And that may be pushed to the side.
That may all be post-processing, as we see.
That's still unfolding.
So I say it typically doesn't affect write speed, because they'll just go through initially when they're taking things in, do what they can to that data, and save the rest for later, when they can come back and look through the array.
Okay, preconditioning is a religious topic. That's for my friend in the back. There are a lot of different opinions out there, everything from, oh, you know, overwrite it twice with random data, splatter the data for 20 seconds, and then you're done, you've done your preconditioning and it's not going to change very much, to other people who say you need to do this for days.
And the reason that you're doing that is we're not writing to disk drives anymore.
We're writing to metadata.
We're writing, when we do a write,
the array looks at that block that comes in. First it sees if it's a duplicate of something that's
come before. If it's not, great. Then it looks at it and sees if it can smash it to make it smaller. And if it can, it writes the smaller version. If it can't, it writes it as is. And that's how the writing occurs. And if that same pattern is seen later, you don't have to write.
You just make a reference to it.
And the smaller the chunk you use (about 512 bytes is the smallest I've seen so far), the more likely it is that you'll see duplicates. Statistics say that's the case.
So there's a whole bunch of tradeoffs that vendors use for this.
But this can affect the speed of writing.
Preconditioning can affect all of this.
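To make that write path concrete, here's a minimal Python sketch of the inline dedupe-then-compress flow described above; the 512-byte chunk size, SHA-256 checksums, and zlib compression are illustrative stand-ins, not any vendor's actual implementation.

```python
import hashlib
import zlib

class InlineReducer:
    """Toy model of the inline write path: dedupe check first,
    then compression, then store."""

    def __init__(self, chunk_size=512):   # 512 B: smallest dedupe chunk mentioned above
        self.chunk_size = chunk_size
        self.store = {}                    # checksum -> stored bytes (maybe compressed)
        self.refcount = {}                 # checksum -> number of logical references

    def write_chunk(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.store:
            # Seen this pattern before: just add a metadata reference,
            # no physical write at all.
            self.refcount[digest] += 1
            return digest
        compressed = zlib.compress(data)
        # Keep the compressed form only if it actually got smaller;
        # otherwise write the block as is.
        self.store[digest] = compressed if len(compressed) < len(data) else data
        self.refcount[digest] = 1
        return digest
```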
And I say more Wednesday.
Come hear me Wednesday.
Question?
So what other forms of preconditioning are there, such as simulated fragmentation? Yeah, I don't want to spend a whole lot of time on that; I'll spend more, I think, when I talk on Wednesday. But the latest argument is either you do this for a long time, or you try to do it across block boundaries and do everything you can to dirty it up and mess it up.
But all you're really trying to do
is to come up with a bunch of checksums
and then look for those checksums.
And if they exist, you don't store again.
Some vendors can do dedupe across block boundaries as well.
So something gets appended and data moves out.
They can still go find that new chunk and just save a reference to it. It depends on the vendor, and the algorithms are getting more sophisticated all the time.
Okay, so we're trying to emulate real applications. Duh, that's what I already said. Avoid uniform random write distribution, because that's a particular vendor tactic for selling arrays, and it works very well if you have a lot of reserve flash.
Now, all the vendors have some reserve flash; what's presented to you is less than what's there. And the more you have in reserve, the less you have to worry about what's happening in the background. And that really matters most for garbage collection. When an array has to stop to find and blank a particular memory page,
it's not going to be a block anymore.
It'll be that memory page.
It takes time.
And searching for that can be very painful.
And vendors that use less reserve flash have a harder time doing that.
And the array will reflect that by slowing down,
especially if you're doing uniform random writes. Can anybody tell me an application that uniformly randomly writes and never reads? Pardon? Any logging process, like Exchange, web servers? You're never going to go look at it?
Well, okay, and if you're going to say Exchange and web servers, I'll make an argument back there.
You're not seeing it at the rates that are going to choke a solid-state array.
But point taken.
Okay, it's unlikely.
Yeah, that's right.
They are sequential.
Yeah, there's sequentiality to it.
So that uniform random pattern that comes from FIO and from Iometer was relevant when we were doing disk drives.
It's just less relevant now.
It's just less relevant now.
And that's not what your typical application looks like.
And just wait until I get to that slide and show you; I'm going to be so happy. If it doesn't work, I'm going to be really irritated and have to take you all outside or something. Okay, so, yeah, I know, and you're getting what you pay for today, I'll tell you. So don't do uniform random writes. Use multiple block sizes.
Duh. Now, some people argue back and say, oh, my Oracle application is all 8K writes, every time; that's all it does, and that's the end of the story.
Well, sometimes.
But one thing about flash arrays, what are people doing?
They're combining applications.
They're not running just one application on a stand with multiple racks anymore.
They're combining these together, and when you get that, that changes it.
More on that later. So we also want to make sure we test in the presence of the other stuff that goes on: all the enterprise features, backups, snapshots, replication, periodic access like end-of-month processing and end-of-quarter processing, whatever that might be, which has vastly different traffic patterns.
Okay, so what do we care about?
Write read ratios, random sequential access ratios,
the access pattern drift.
This is, you know, what am I losing it here for?
This is where we talk about clustering,
where we talk about, oh, never mind.
I'll come back to it later.
Block sizes. Alternate paths. When you're writing to an array, especially if you're using Fibre Channel with MPIO, you've got stuff coming in on multiple paths into the array. So you want to test in a way that looks more like that, rather than just trying to get one endpoint and read everything that came in there, because you may only be getting half of the story. There may be traffic coming in from multiple interfaces.
Okay, so locality is this idea that stuff tends
to get clustered when it's written during a day.
And this is even true with blended applications.
The IO blender effect that has famously been talked about with virtualized applications isn't entirely true; there's still some locality to this. So we're defining: where something is written or read, that's spatial locality; when it's written or read, that's temporal locality. Both of those are present. And hotspots represent this locality. With hotspots, you tend to be writing in a narrow area, and that tends to drift over time as you're writing, okay?
So if you don't test with locality, you're not stressing the arrays that are going to be used,
the way they're going to be used in production. And again, if it's just JBOD, that's going to have less effect than if you've got a controller
with sophisticated algorithms that are deciding what to lay down and when and how and all that.
With a lot of arrays today, you don't write to LUN 0, offset 100. You write to a logical representation of that. And where that gets put on the array may be anywhere. That's because of wear leveling. The whole idea is to make that array last as long as it can, so you don't want to keep writing to the same location and burn it out. Wear leveling takes care of that. So all of that is abstracted when you're writing to the array. You're writing to a logical space. It's writing to the physical space out back.
So I've got a demonstration. It's showing write locality for a LUN with real-life Oracle ASM access. The horizontal axis is the LUN broken up into 100 parts. If something comes in at the left, that's towards the lower end of the address range. If it comes in at the right, it's towards the higher end of the address range. And each refresh of what I'm about to show you covers a one-minute interval.
You're not going to be able to get this, unfortunately. I wasn't able to figure out a way to save it. But if you come to me, we can somehow get it to you; I'll send you the location or something.
Because it is pretty interesting to see this.
So the height is just normalized for each refresh.
It's not trying to show relative intensity.
It's there, rather, to show that the hotspots are there and that they move over time.
So my friend Lou Leidigson at Pure Storage is the guy who got this for me.
And now is the moment of truth.
And we'll see if my access to the Hyatt network worked.
Okay, so now I'm going to have to come up here and dance and show you what... There it is, there it is.
No dancing.
And you're not seeing it.
Let me see if I can do this.
Yeah, I'll just let it run for a little while.
This is why you go and try everything before you do an actual presentation.
But you can see what's going on there.
This is not random.
There is a definite pattern to where everything is being written,
and this pattern changes over time. Sometimes there's a lot of clustering, sometimes there's less. But this is one of many views that Lou has done, and they all reflect this. They have not been able to find applications that don't do this. Yes, sir?
Yeah, yeah, sure.
There. So it's just, it's a representation of a thousand, the LUN broken up into a thousand
different sections, okay? And it gives us a little bit more to see down here.
So this is happening.
This is happening all the time.
We just hadn't been able to see this yet.
One thing that I keep...
I don't want to sound like a commercial for these guys,
but they've got a way of taking a standby controller
and just monitoring everything that's coming in.
And from that, they were able to generate these graphs that show that locality as traffic is coming in. Pretty cool stuff. So it's there; I proved it, if you believe it.
Is that the virtual view or the hardware view?
That's one LUN. And actually, yes, it is the virtual view.
The LBA?
No, because the way it goes on the back end, who knows. That's all handled for you. The algorithms take care of that with their wear leveling.
It worked; I'm all pleased. Okay, good, I'm back on it. Okay, we just talked about this: access is not uniformly random. Hotspots are accessed more frequently than other regions during a defined time period,
and it varies according to what you're doing.
We have some examples here like index temp files, logs, journals, things like that.
And we want to, when we're testing, emulate that there are hotspots
and emulate that drift as well.
Here's one example that we saw of where things were written. 1% of all access regions got 35% of the IOs. That's quite a bit. And it varies: as you get up to higher numbers of regions, they get less of the IOs over time.
And the FIO developer, as it says here, says that this is a simplified example, and that it's actually even a little bit more complex than this.
But hotspots are present, and skew is present.
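As a rough illustration of that skew, here's a hypothetical Python sketch of an LBA picker where 1% of the address space takes 35% of the I/Os and the hot region drifts over time; the LUN size and drift rate are made up for the example.

```python
import random

LUN_BLOCKS = 100_000_000      # hypothetical LUN size, in blocks
HOT_FRACTION = 0.01           # hot region is 1% of the address space...
HOT_HIT_RATE = 0.35           # ...but receives 35% of the I/Os

def next_lba(minute: int) -> int:
    hot_width = int(LUN_BLOCKS * HOT_FRACTION)
    # The hot region drifts across the LUN as the workload ages.
    hot_start = (minute * 3 * hot_width) % (LUN_BLOCKS - hot_width)
    if random.random() < HOT_HIT_RATE:
        return hot_start + random.randrange(hot_width)   # hotspot I/O
    return random.randrange(LUN_BLOCKS)                  # cold I/O, anywhere on the LUN
```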
So, yeah, 25 to 35K is the average reported block size.
We look at vendor performance reports,
vendor performance processes.
Virtually every vendor can produce a CSV file
that talks about performance during a time period.
For any of you guys who work for vendors that are willing to share with us, we beg you, please do, because we can take the output of that CSV file and turn it into input for a workload model. I'll talk a lot more about this on Wednesday, but the idea is that we take this information that's reported from the array.
Did Eden leave?
Eden left.
Darn it.
Eden Kim's working on another model that works by using a small program in an initiator or target, if I understand it right.
Is that correct?
That is able to record what happens on the way out or going through one of the interfaces.
And he gets pretty good resolution out of it.
The problem that we have, really,
is getting adequate resolution for this.
If you get a five-minute interval, you could have a horrible spike for a couple of seconds that really impacted performance, but you average it over five minutes, and it looks like nothing.
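The arithmetic behind that is simple; a hypothetical example:

```python
# A 2-second, 40K-IOPS burst inside a 5-minute reporting interval
# (all numbers invented for illustration).
interval_s = 300
baseline_iops = 1_000
spike_iops = 40_000
spike_s = 2

avg = (baseline_iops * (interval_s - spike_s) + spike_iops * spike_s) / interval_s
print(round(avg))   # ~1,260 IOPS: the burst that hurt latency is invisible in the report
```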
So the better we can get this, the better. And without being too much of a plug,
Virtual Instruments is working on that. Now, the VMAX team has got something called Trace SRT that allows them to get really fine time-grain intervals: the command, what it was, how long it lasted, and things like that, which allow you to recreate a really accurate model of what happened during that time.
And that varies all over the map, from some vendors that give you these five-minute reporting intervals, which is not very useful at all. It still may be a little bit better than doing things completely the old way of testing, but the finer the time-grain interval, the better, and we're working towards getting below one second. It's a limitation of where computing is today: trying to move all that data and at the same time say, oh, by the way, we did all this stuff. It's pretty difficult to do. But we're going in that direction.
So, 25 to 35K: applications often don't use a uniform block size across an entire application. We want representative block sizes.
So you set up percentages of different block sizes and you mix those in.
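A minimal sketch of that mixing in Python; the percentages are invented, but note how the weighted average lands near a block size that no single I/O ever actually uses, like the 33K averages mentioned earlier.

```python
import random

# Hypothetical mix: these percentages are illustrative, not measured from any array.
BLOCK_MIX = [(4096, 0.40), (8192, 0.30), (65536, 0.25), (131072, 0.05)]

def next_block_size() -> int:
    """Sample one I/O's block size from the weighted mix."""
    sizes, weights = zip(*BLOCK_MIX)
    return random.choices(sizes, weights=weights, k=1)[0]

# The weighted average here is ~27K, yet no I/O is ever 27K,
# which is why testing with one "average" size is misleading.
```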
Now, we are doing a statistical model.
This is not an exact replay of an application because if we had the desire to do that, we'd have to have a second data center to collect all the data to do it.
Because, you know, a PCAP at 10 gig, much less 40 gig or 100 gig, is just huge amounts of data.
So we summarize, distill that down, and we're using that.
But we do a synthetic model of it.
So burst.
Here we go. Craig is in the back of
the room to correct me if I blow this one again. So here's this burst that happened
during a particular time period. The host performance, they see this going up. During this time, the IOPS didn't change.
Latency didn't really change. But there was this huge burst that happened. So it comes and it goes.
These are real. Computers don't send 100 IOP files. They get rid of them as quickly as they can.
When that happens, it tends to load up
the CPU, it loads up the queues, it can mess up the processor memory, the buffers, and the queues,
and can dramatically lower performance. There's a certain vendor that's been saying recently that we should all increase our queue depths. And we argue just the opposite: the lower the queue depth
setting, the more likely you are to avoid network congestion
and you're going to get more work done during a given time.
So, yeah, 50 milliseconds.
Okay, yeah.
Is this here?
No, no, it's not.
Those heavy burst times
are actively impacting performance while they occur.
The more things are sitting waiting to get done,
the more it affects everybody going to that array.
So for reads or writes, which one is it?
Okay.
So we see here that we have minimum pending exchanges.
There was a lot of times where there was just nothing happening
over this period of time.
It's almost an hour.
We'll see the top average pending.
There was actually, during a whole lot of the time,
a lot that actually was waiting to get done,
which means somebody's getting backed up.
And we see the max pending up here,
our average about 40 and lowest about 20.
That was happening over that time.
Max, we see that it was well over 100.
So this is what's happening on the wire.
And this is going to be what really can hurt an application.
So let's look at reads.
If this was just a constant number of IOPS,
we're getting, what, 40 to 80k IOPS,
we would see a much more even response time. What we're seeing here is a range of 200 to 400
over a 50 millisecond barrier.
So the bursts were 40 times as high as the average.
What this means is that even in a second,
stuff hits the array, it gets backed up a little bit,
and then it drains and keeps on going.
And there's periods where nothing else is going on,
but where those peak times occur,
it's actually seriously impacting the performance of that array.
And that gets missed a lot.
IOPS, we're seeing it's only about 2K IOPS here, but there were significant amounts of delay there.
Okay, so writes, real writes.
Here's an example.
We're looking at IOPS.
IOPS were almost nothing there except for one big peak during a brief time.
But when we look at the 50-millisecond peaks, we see that it's even worse for writes than what we saw for reads.
So here's the reads. Oh, no. There's the reads, the writes, and those bursts were slowing us down.
So there's a range of these as well, and that's what this is showing: the peak, the bottom, the mean.
So a couple of these guys were really working,
getting greatly increased latency.
It was less bad over time. But the IOPS here were much more stable.
And you're going to see that in a real application.
Sir?
Ma'am?
So where is this being observed at?
This is being observed off of a wire using a tap,
using virtual instruments, virtual wisdom application.
This is our data.
Yeah.
The first image that we showed you was an actual customer issue, where the only observable problem in their environment, where the latency more than doubled, was the fact that they were having these write bursts.
Yeah, it's the only thing that changed.
We also went through and analyzed our customer data, and this is common in all of the data that I've looked at: most of the time, there's a period within each second where most of the I/O for that second occurs. And in some cases, all of the writes for the entire second occur within a fraction of that second. So it gives you an idea that things are not a flat line at the sub-one-second level. They're all over the map, even if you thought it was steady, high IOPS. And we've had multiple vendors come back to us and say,
you suck.
You give lower performance than Iometer does.
Or, I'm sorry, VDBench.
Because they were running VDBench without bursts.
And yeah, we do.
When you make the traffic more real, you don't get as much data through.
So writes, reads, the range of response times. Now, Load DynamiX, our load generator, by default breaks up transmissions and bursts them in less than a second. That's the way the product was designed and the way it works. So you'll see here we have a second broken up into ten pieces, and those bursts are taken, and when we ran tests against arrays, performance was slightly lower
than what they would get running something like VDBench without any
bursting. And this is what those arrays are going to see in production,
not that steady flat line
that you get out of a storage test tool.
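Here's a minimal sketch of that sub-second bursting, assuming a generic io_fn() that issues one I/O; the ten-slice split mirrors what was just described, but the pacing logic is illustrative, not Load DynamiX's actual implementation.

```python
import time

def bursty_second(io_fn, iops: int, slices: int = 10):
    """Issue one second's worth of I/Os as sub-second bursts: the second
    is split into `slices` windows, each window's share of I/Os is fired
    back-to-back, and the generator idles for the rest of the window."""
    per_slice = iops // slices
    window = 1.0 / slices
    for _ in range(slices):
        start = time.monotonic()
        for _ in range(per_slice):
            io_fn()                     # dump the whole share at once
        # Go quiet until the window ends, like a host that dumps a file
        # and then has nothing to say.
        elapsed = time.monotonic() - start
        if elapsed < window:
            time.sleep(window - elapsed)
```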
So we're seeing here we're getting about... oh, what am I trying to read here? It's 25K IOPS, and we're getting a response time of about 40K microseconds. 40 milliseconds, sorry.
That's with the burst.
40 milliseconds is pretty bad. Now, if you take this down to 80%,
take that rate before and drop it down by 20%,
we see that we actually were getting 20k IOPS through.
We're getting down to 6 milliseconds response time.
This is all what we're talking about,
sizing the array for what it's going to really see in production.
So that's running at 80% of maximum, running without bursts, which is what a lot of guys report as their numbers.
They'll take and just run this straight and meter it out.
And we see that we're showing 15,000 or 25,000 instead of the 20,000 that we saw at the last screen.
But they're showing it at 1.2 milliseconds latency,
and this is what they publish.
If you're sitting in the data center running applications,
this is not what those arrays are going to see.
So these are the hero numbers,
and testing with those bursts, testing with that realism,
is going to give you a lot better idea of what you're going to get when you really run the array.
So here's the difference here.
I love this one.
So with bursts at 80%, we saw they were getting 20K IOPS, throughput of about 1,250 megabytes, six and a half milliseconds latency, and this is the real world. With bursts at the full rate, 25K IOPS jumped the throughput up slightly but hugely increased the latency. Unacceptable. And without the bursts, we're seeing 25K just like we were seeing with bursts, but we're seeing 1.2 milliseconds response time, which is a myth, but which is oftentimes published. So bursts matter.
Okay. Any questions about that before I go on?
Okay, data content.
Most of the arrays are using data reduction,
deduplication, compression, and pattern reduction.
Like I said, if they see a block of all zeros,
make a reference, Don't store anything.
It's a metadata reference.
If something comes in, they check some of it.
If they haven't seen it before, they try to reduce it.
If it gets bigger or it doesn't reduce, they save it as is.
If it does reduce, they save it down.
And then they'll have a varying number according to the array of times they can repeat that
before they have to make another save of that same piece
of data. Coming back out the other way, because we have solid state memory now and we have faster
processors, it's possible to what they call rehydrate or re-expand that data and get it out to
the sender much more quickly. So all of those are there. Not all arrays do compression. Not all of them do dedupe.
Not all of them do pattern reduction.
So you want to know what's going to work?
Test for it.
And it's not just to say, oh, it got this dedupe ratio or this compression ratio.
That doesn't save a customer a lot of money.
The more efficiently data can be deduplicated and compressed, especially at higher rates,
the less you need of that stuff.
And for customers, that's an important consideration.
Okay, we all want to sell more as vendors,
but we also want to be considerate of this as well.
So, we create data content patterns, and we turn those into streams, which essentially take blocks repeating in various amounts. A block with a certain checksum may be repeated 10,000 times during a test. A different one may not be repeated at all; it may just be random. But the combination of them gives a stream that looks like an application.
I talked about the repeating and non-repeating patterns.
Pattern lengths vary.
As I said, some arrays can detect patterns across block boundaries,
so they're able nominally to get a little bit more compression out.
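Here's a hypothetical Python sketch of generating such a stream; the knobs for dedupe and compressibility are crude stand-ins for the checksum-based patterns just described.

```python
import os
import random

def make_stream(n_blocks: int, block_size: int = 8192,
                dupe_ratio: float = 0.5, compressible: float = 0.5):
    """Yield write blocks with rough, tunable dedupe and compression
    characteristics (illustrative only)."""
    hot_block = os.urandom(block_size)        # one block repeated many times
    zero_run = int(block_size * compressible)
    for _ in range(n_blocks):
        if random.random() < dupe_ratio:
            yield hot_block                   # dedupable: same checksum every time
        else:
            # Compressible: a run of zeros plus an incompressible random tail.
            yield b"\x00" * zero_run + os.urandom(block_size - zero_run)
```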
Next is thread count and queue depth. So when you're testing, you don't only want to do the best testing; we'll say, okay, I want to understand a technology. Okay, I choose that technology, and then I want to know what configuration of that technology is going to work best. And when I'm doing that testing, I want to be able to test using different queue depths to find out what gives me the best performance, combining that with different thread counts to find out what capacity, what max number of users and such, is going to work for a given size of an array. Maybe I have to go up to a second controller. Maybe I have to do a scale-out to multiple nodes to maintain the latency that I want for that application over time.
So we find the max IOPS an array can do per thread, per queue depth, and then a total for a given number of threads and queue depths.
And then we increase this to find out where the limits are.
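A sketch of that sweep, assuming a hypothetical run_workload() wrapper around whatever load generator you use, one that returns IOPS and 99th-percentile latency:

```python
LATENCY_TARGET_MS = 5.0   # assumed application latency target

def sweep(run_workload):
    """Walk thread counts and queue depths, keeping the best IOPS
    that still meets the latency target."""
    best = None
    for threads in (1, 2, 4, 8, 16, 32):
        for qdepth in (1, 2, 4, 8, 16, 32):
            iops, p99_ms = run_workload(threads=threads, qdepth=qdepth)
            if p99_ms <= LATENCY_TARGET_MS and (best is None or iops > best[0]):
                best = (iops, threads, qdepth)
    return best   # (iops, threads, qdepth) at the knee, not the hero max
```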
You shouldn't really be testing for today.
You want to test what an application can do today,
and then the lease period is going to be somewhere between three and seven years,
something in there, and they're going to have to refresh at the end of that time.
So you want to make sure at the end of that time the array is still performing.
This is important, even going back into testing with disk-based arrays.
There was a company we worked with that had five stands installed. They put the first one out
when their traffic was relatively low, worked fine.
Second one came in, third one came in,
fourth one came in, and the whole thing fell apart.
It was a limitation that they didn't understand
in that particular array,
and it caused a very public outage. So, oops. That's why you want to test up to what your expected maximums are going to be.
Next: because we're talking about solid state, in many cases, not all, I'm not trying to claim that everybody's going to be throwing 50 different applications on one array, but we're seeing more of it coming, right?
We're seeing more of that happening.
So we want to test those applications together.
Find out the performance of an individual app and then start combining them. And again, crank up the numbers to find out what the maximum performance is going to be.
And last night I was talking to my friend SW,
and he said, you want to be careful about this.
I was saying, ah, you should just test them all at the rainy-day peak. Well, that's not going to happen.
The apps aren't necessarily all going to hit peak performance
at the same time, so you want to be sensitive to that.
Look at the patterns, you do some baselining,
understand how the applications are performing, and then put them together in ways that's going
to make sense over a test, say, where you emulate a day or something like that. And make sure that
the additive peaks don't exceed your latency targets. You want to stay within that.
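As a toy example of that check, here are two invented 24-hour IOPS profiles, one daytime-heavy and one nightly batch, summed to make sure the additive peak stays under an assumed ceiling:

```python
# Hypothetical hour-by-hour IOPS shapes for two emulated applications.
oltp  = [20_000 if 9 <= h < 17 else 5_000 for h in range(24)]       # daytime peak
batch = [30_000 if h >= 22 or h < 2 else 1_000 for h in range(24)]  # nightly run

combined = [a + b for a, b in zip(oltp, batch)]
ARRAY_CEILING = 60_000   # assumed max IOPS at the latency target
assert max(combined) < ARRAY_CEILING   # staggered peaks stay inside the budget
```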
So we want to emulate each application, combine them,
and then test, of course, with the other processes that
are going to go on, snapshots, replications, backups,
periodic processing, and all that.
And take care to make sure that the peak times are represented, not just making this one big additive jump on top of everything.
I say here, this is a work in progress.
We're working on this.
There's still a lot of work to be done,
but it's where we can do a better job
of helping to represent these applications running.
But this is pretty much where we are today.
And we need ways of speeding this up.
Somebody doesn't usually have the luxury of saying, oh, we'll take a month to test this array and a month to test the next one.
You maybe get a week or two to crank it out.
And so you have to make it more efficient.
We're early on in this, and it's getting better.
Okay.
I already said this.
You want to make sure that you're sized properly, so crank up the numbers if you've got an application running at a certain rate today. My assertion is that it's okay to assume that as traffic rises, the components of the application are still going to behave approximately the same.
I may be wrong, but it's the best we know today.
So try to take today's rates and boost them up
and just make sure that the latency targets are met,
that the system doesn't fall over on its side.
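A one-liner version of that sizing exercise, with an invented growth rate and lease length:

```python
# Size for end-of-lease load, not day-one load (growth rate is a guess).
today_iops = 20_000
annual_growth = 0.30          # assumed 30% traffic growth per year
lease_years = 5

target_iops = today_iops * (1 + annual_growth) ** lease_years
print(round(target_iops))     # ~74,000 IOPS to test against, not 20,000
```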
So this all used to be black art.
It's not so much anymore.
It's still not clear, but it's clearer than it was.
There is no such thing as a perfect synthetic workload. The perfect workload is the workload that's running in the data center on the equipment it's running on now being moved over.
But that's not a very easy thing to do to try to test five different arrays by running your current application on them.
So synthetic is what we've got, and it works pretty darned well.
Customers can see how closely their workload matches a model.
We do this all the time.
They'll get performance stats from their application.
We will model it and run it,
and we find that the peaks, valleys, the temporal stuff
stays almost dead on with that.
So we can emulate this now.
The days of not being able to do it are gone.
We can do better than we used to do.
And I think this is changing
where we're headed. So thanks. This is my ad. Please come hear the second presentation,
which I'm going to do Wednesday, and I'm going to talk about an implementation that we've done.
And I'm trying not to make this too much of a commercial. We've just been working on this for
a long time, and we think we have something valid to let you know about
So, Wednesday, 4:05. Come on down. This is going to be posted online, I think, with the latest version of the last slide, which actually says Virtual Instruments instead of "your name here." Come and see us. Any questions?
Sure. Could you touch on garbage collection in flash devices? I call it the Super Bowl problem: your system is pretty idle until the one time you really have load, and the counters start climbing, and it may kick in the garbage collection the one time you don't want it to.
Right, when everything gets delayed. Yes.
Yeah.
Consistently?
Some vendors are more vulnerable to it than others.
It's a matter of sizing.
Okay.
In general, I think it's safe for me to say
that the less reserve flash you have,
the more prone you would be to that.
And if you're operating in a JBOD environment
where you don't have metadata,
as I said, we're writing, essentially,
we're writing logically, but that turns into metadata
and what gets written physically
is something completely different.
There's a lot of work going on in that direction.
But the thing is, NVMe is gonna blow this all up.
I'm looking really forward to seeing what happens,
but there's gonna be a lot less time to do stuff
with data as it's going through,
because it's gonna be coming through
at higher and higher rates.
I'm gonna use this.
I was at one of those customer love-ins a couple months ago,
and their comment was, by 2020,
we're going to be generating five zettabytes a day of CCTV, of video.
It's got to be stored somewhere.
The Internet's going to have a capacity of about 1.5 zettabytes.
So "everything to the cloud" seems a pretty big stretch to me. Now, there are techniques that we can use to compress and do all kinds of things like that, but if it's something that the lawyers have got to see, it's still a lot. If you're monitoring a busy time, it's going to be an issue. I just got the hook, so if you'd like to talk some more, come check me out afterwards. And also on
Wednesday, come by if you've got the time. I'd love to talk to you a little bit more. So thanks for now.
Thanks for listening. If you have questions about the material presented in this podcast, be sure and join our developers mailing list by sending an email to
developers-subscribe at snia.org. Here you can ask questions and discuss this topic further
with your peers in the storage developer community. For additional information about
the storage developer conference, visit www.storagedeveloper.org.