Grey Beards on Systems - Greybeards talk all-flash arrays with Dave Wright, CEO and founder of SolidFire
Episode Date: May 13, 2014
Welcome to our eighth episode, where we discuss all-flash storage with Dave Wright, CEO and founder of SolidFire. The Greybeards just talked with Dave at SDDC14 and Storage Field Day 5 in San Jose, CA last month. In this podcast, we learn a lot about SolidFire and other storage arrays from a leading light …
Transcript
Hey everybody, Ray Lucchesi here and Howard Marks here.
Welcome to the next episode of Greybeards on Storage monthly podcast, the show where
we get Greybeards storage and system bloggers to talk with storage and system vendors to discuss upcoming product technologies and
trends affecting the data center today.
Welcome to the eighth episode of Greybeards on Storage, which was recorded on April 28,
2014.
We have with us here today Dave Wright, CEO of SolidFire.
Why don't you tell us a little bit about yourself and your company, Dave?
Yes, I'm Dave Wright. I'm the founder and CEO of SolidFire.
We are a scale-out, all-flash storage company,
and we specifically target use cases around large infrastructure,
including public and private clouds.
So, Dave, the first question I have is,
why does Howard always call you Jungle Dave?
That's a great question. So my last startup was a company called Jungle Disc and I grabbed the Twitter handle Jungle Dave.
Oh my god, so your Twitter handle is Jungle Dave. So, okay.
Huh, SolidFire, all flash array and all that?
It is. So it's an all-flash storage system. And like I said, a little bit different
than a lot of other all-flash systems in the market that are really just tuned for performance
workloads. SolidFire is really targeting a wide range of kind of tier one and two storage workloads.
Okay. So what would you claim as tier one and tier two storage workloads in your environment?
So most of the workloads we see run the gamut.
It's everything from just kind of traditional virtualization
and virtual infrastructure workloads,
kind of traditional enterprise apps from databases, email, web apps,
and a lot of web hosting environments that are running on top of SolidFire.
We also are seeing kind of emerging workloads in areas like VDI, as well as Tier 1 databases with Oracle, MySQL, SQL Server, and SAP.
Having spent my consulting life in the mid-market, the concept of all flash for Tier 2 is still a little bit much for me to take.
And what does that mean to you, Dave? Well, so, and here's really what I'd
say is that the unique value that SolidFire brings is the ability to really separate the
performance of the storage system from the capacity. And that's actually kind of an unusual thing to
think about in storage. But what that really enables us to do is host a lot of very high
performance, performance-sensitive workloads alongside other workloads that may need more capacity but less performance. And we can actually
steer the performance to the applications that need it and steer the capacity to the applications
that require the capacity. And through our quality of service, ensure that everybody's getting
exactly what they need. And do it in a way that can actually be cost effective,
even when delivered with all flash.
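To make that separation concrete, here's a minimal sketch (an illustration with hypothetical names, not SolidFire's actual API) of provisioning a volume where capacity and performance are dialed independently:

```python
# Hypothetical illustration only -- not SolidFire's actual API. The point is
# that capacity (GB) and performance (IOPS) are provisioned as independent
# dimensions per volume, with QoS steering performance to whoever needs it.
from dataclasses import dataclass

@dataclass
class QoSPolicy:
    min_iops: int    # guaranteed floor
    max_iops: int    # sustained ceiling
    burst_iops: int  # short-term ceiling, paid for with accrued credits

@dataclass
class Volume:
    name: str
    size_gb: int     # capacity dimension
    qos: QoSPolicy   # performance dimension, set separately from capacity

# A capacity-heavy but performance-light volume...
archive = Volume("archive01", size_gb=8000,
                 qos=QoSPolicy(min_iops=200, max_iops=1000, burst_iops=2000))

# ...can share the same flash pool as a small, performance-hungry database.
oltp_db = Volume("oltp01", size_gb=500,
                 qos=QoSPolicy(min_iops=20000, max_iops=50000, burst_iops=80000))
```

The sketch just shows the idea: a capacity-heavy, low-performance volume and a small, performance-hungry one can share the same flash pool, each getting exactly what it was allocated.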
But I thought separating performance from capacity was the story the server side guys kept trying to tell us. Well, you know, they take a different approach to it, right? So they take a
caching or a tiering approach, which is what some of the traditional array vendors have done. They've
had different media with different performance characteristics. And either through caching or
tiering, they try to move the workloads to the place where it's going to get the best
performance. We take a very different approach, and the approach that we take is to put all of
the data on flash, but actually steer the performance of the flash devices to the workloads
separately from the capacity. And it's very different. It may not seem like an obvious distinction there,
but the key benefit of that is that there's not fixed tiers of capacity of different performance
levels. You don't have to size your host cache and then your flash cache on the system and then
your SATA tier and SAS tier and fiber channel tier and hope that you got them all right and
hope that the storage processor moves everything where it needs to be to deliver a certain amount of performance,
the storage system itself... My sales guy told me it would.
Well, and that is the promise that's being made. And the reality is, especially when you get to
these larger infrascale environments, it just doesn't work that well. It's very reactive.
It doesn't necessarily know which apps are actually
more important than others. It just knows which ones are doing more I.O. than others.
And it doesn't always deliver a great end result. And it also isn't cost effective. By the time you
add in all those tiers of storage, including host-based Flash, it can actually be quite a
bit more expensive than a cost-optimized all-Flash system. You know, one thing that is surprising about your system is that it's scale-out.
Do you want to talk a little bit about what that means to you guys?
Well, sure.
I mean, I don't think scale-out should be a surprise to anybody in the storage space these days.
I think really all of the interesting development going on in storage,
whether it's hyper-converged storage, whether it's object storage or performance-oriented storage,
scale-out is going to be part of the story going forward because the days of small storage
environments and isolated kind of islands of storage are going away very quickly. So scale-out
is really essential to the way that modern data centers are run to really do two things.
One is allow you to obviously scale your environment over time,
start with a small footprint and grow it over time,
as opposed to having to put in three to five years worth of capacity to start,
which is extremely difficult to do in kind of unpredictable modern data center environments.
But it's also very important because of the operational efficiency
that comes from having larger pools of storage when you have a more agile kind of cloud-like
environment where you have workloads coming and going, as opposed to having fixed pools or
dedicated storage to individual applications. It seems to me that just the level of performance that today's flash devices provide makes them beg to have more CPU to manage them than we needed even five or six years ago.
And I can't see how you could build a scale-up system to provide the kind of data services we want on today's flash.
I mean, just because the capacity is expensive,
I want data reduction. And that means I want more CPU. Yeah, absolutely. And as you look at scaling
both the capacity as well as the performance of the system with all that functionality,
you need more CPU. And you can cram more flash in a system to get more capacity. But unless you can
scale the CPU and the memory and the other aspects of the system,
including the storage ports along with that,
you're going to bottleneck on that controller.
Even though all this Intel roadmap and all this stuff
keeps doubling performance and doubling density
every 24 months and stuff like that,
there's still a need for just more processing power,
more ports, more capability.
Yeah, the funny thing is the flash guys are on pretty much the same, you know,
Moore's Law helps them too. Exactly, and if anything, it's been faster more recently on
the Flash side than it has on the Intel side. The Intel roadmap has slowed down a bit relative to
Flash, which, particularly with the introduction of higher levels of MLC along with the 3D NAND that's coming out, is actually staying ahead of the Moore's Law curve at this point and continuing to get denser and faster at an almost terrifying rate.
Did you say terrifying? You know, I think it is terrifying in many cases to some of the folks that have architectures that they've built around a particular generation of flash.
And when they look at where the roadmap of flash is going, the density and performance of these things are going, it is terrifying.
And one of the things that, you know, we try to help customers with is understanding why a scale-out system like SolidFire actually can help remove their fears about it.
Because who wants to go spend a couple million dollars
on Flash for a couple years' worth of capacity
only to find out that the price was 50% less
12 months from now?
Or that the performance was 30% or 40% better
12 months from now?
And so a scale-out architecture
that allows people to essentially integrate
the latest generations of Flash and get the capacity and performance benefits, as well as the cost benefits, over time, as opposed to kind of putting it all in up front, I think is a huge kind of investment protection.
Well, we've heard that story from scale-out folks before.
And sometimes the gotchas start to really add up. It's like, sure, you can have as
many nodes as you want, as long as you buy them all at once. Or, you know, yes, you can have new
nodes and old nodes, but they end up being different clusters. To really get the benefit of
scale out, don't we need the ability to build things
heterogeneously as well? Yeah, absolutely. And that's one of the things that we didn't launch
GA with. Originally, when we GA'd, you had to have your entire cluster be of essentially the
same model and generation of node. But we went ahead and did the work to make heterogeneous
clusters work very well and make them work well in a way that was
completely seamless to the administrator. So you have a single pool of capacity, you have a single
pool of performance, whatever node you're adding to the cluster simply adds its capacity and its
performance to that pool. And there's no kind of gotchas, there's no, you know, kind of arbitrary
limitations on how you can, you know, add or scale to the cluster. The other thing that seemed like a scale-out limitation before was things would work for,
let's say, NAS or iSCSI because it was Ethernet-based.
That wouldn't necessarily work for Fiber Channel.
How does that play with SolidFire?
Yeah, Fiber Channel is challenging.
The model of that network is not one that lends itself well inherently to scale out.
And so what we've done with that—
Gee, Dave, you really got understatement down.
Yeah.
It is challenging.
And one of the reasons we started with iSCSI in addition to kind of just being better suited to the use cases and environments we were in was that it is easier to work with for scale out. So what we've done with Fiber Channel in our latest release is added the idea of Fiber Channel
nodes to the SolidFire cluster. And these Fiber Channel nodes essentially have the Fiber Channel
ports on them, four 16 gig Fiber Channel ports. And then they have cluster connectivity. And in
our case, we just use standard Ethernet network for our cluster connectivity. So the systems have
10 gig network ports in them that allow them to talk to the rest of the storage cluster.
And essentially the Fibre Channel nodes can be scaled in the cluster independently from the
actual capacity as well as the iSCSI ports in the system. So it does allow us to scale that out.
Now we have to do some interesting things with how we balance the workloads across the fiber
channel ports. But essentially we have a variety of options available, including full
active across all the ports or active across a subset of the ports to enable you to scale that
environment. How is a cluster node for Fibre Channel different than a gateway? Yeah, great
question. So, you know, at the end of the day, I guess in any scale-out system, the node that you're connected to, you would argue, is some form of a gateway because it is speaking a storage protocol on one end and talking to the rest of the cluster on another end.
And the only real difference between these and our iSCSI nodes is that they're speaking fiber channel out one end instead of iSCSI, and they're speaking our internal cluster protocol out the other end.
So, you know, they're
really no different from iSCSI in that manner. Now, we don't put storage capacity in these nodes,
really, for two reasons. One is that we don't want to force people to add capacity just because
they want to add fiber channel connectivity to the cluster, and we want to allow them to scale
that connectivity independently. But also because we've really found for performance, Fibre Channel works best if those
nodes are just focused on Fibre Channel. And, you know, it's something that, you know, to give people
the lowest latency on Fibre Channel, which is very important for Fibre Channel folks, we've just
dedicated those nodes to Fibre Channel processing. And what is that latency, Dave? You brought up the L word here. So our latency is sub
one millisecond at load in the cluster and on a kind of moderately loaded cluster, typically 300
to 500 microseconds for both reads and writes. And that's on Fibre Channel as well as iSCSI?
Yeah, in our testing there's, you know, order of single digit percentages generally between the two protocols.
It's not a huge difference. Well, that's going to make some of the Fibre Channel fanboys' heads
explode. I would think so. You know, and the challenge there is to try to, you know, prove it
with, you know, an independent lab or something like that. But, you know, SPC1 and SPC2. If we only knew somebody who did that.
We won't go there.
But, you know, could you tell us a little bit about, you know, the problems with running something like SPC1 on your environment?
Because there is an inherent problem here, right, Dave?
Yeah, there is.
It's funny.
We actually are an SPC associate member.
We joined, you know, really back when we started because that benchmark is so well known in the block storage space.
You know, there's a couple of problems with it, really.
Now, one that's kind of pointed out very commonly in the flash space that is certainly applicable here is that that test does not really account for compression and deduplication in a storage system.
In fact, it really, if you look at how they calculate capacities and how they report on capacities,
it really doesn't have any way to handle that or report on that.
And they basically require that you disable that.
But for architectures like SolidFire, where that's always on and just part of the architecture,
that really isn't an option.
And even if you were to run it and disable it or just report essentially no benefit from that,
the results in terms of the effective
cost that it's reporting in those reports, and that's a big reason people look at those benchmarks
to look at the cost-effectiveness of systems, it's very off. It's not real world at all.
The other thing about that particular benchmark, again, I think by and large it's served the
industry pretty well, is it's a monolithic
workload benchmark. It simulates a data processing workload, and it simulates a basically single
workload at ever larger scales, but a single workload with three different kind of IO mixes
that make up a single workload. And the reality is that's not the world that we live in today. None of our storage systems are serving a single workload, be it a small workload or a big workload. The environments
that we're in are serving dozens to hundreds to, in some cases, thousands of different workloads
that have a huge variety of different I.O. patterns in them. And we actually think the more interesting
question is not can you run this one benchmark really, really well? Can you get good workload
numbers on this? But when you have dozens, hundreds, thousands of workloads, how good
a performance can you get for all of them? How consistent can it be? And that's really where
a lot of the effort that we've put into the system around quality of service really shows,
is not just when you get great benchmark numbers on a single workload, but when you can actually
support thousands of workloads that some of them may need dozens of IOPS, some of them may need thousands of
IOPS, and do all of that at the same time. Yeah, and you think any benchmark can do something,
you know, hundreds of workloads or thousands of workloads? I'm thinking even dozens would be a
challenge for most benchmarks, but... Yeah, it is, and that's one of the challenges we've had even internally is testing our system at scale. We literally have more systems in our lab to generate workloads than storage systems themselves. And that's not just a single kind of I.O. workload, but actually pushing them
across thousands of iSCSI connections, thousands of LUNs, thousands of different I.O. patterns,
simulating everything from VDI bootstorms to SQL Server OLTP workloads to data warehouse
workloads to MongoDB and Cassandra NoSQL workloads, and actually running all of those simultaneously.
It's a pretty interesting challenge.
And I agree that having a single industry benchmark that tried to do all of that, it would certainly be interesting.
Yeah, but even at small scale, it's a daunting task.
We've built kind of VMmark-ish things where it's, here are these eight VMs and that makes up a tile, and you repeat that N times.
But just coordinating all those workloads to start and stop at the same time becomes an issue.
Sounds like an opportunity for a benchmark-oriented consultant to come up with something.
Yeah, I think it could be interesting.
But the other question then is, how do you measure it? And again, one of the things that's been nice about SPC is that they reduce the entire system down to a small number of metrics,
including cost, IOPS, dollars per IOPS, capacity, and usable capacity.
And that makes a very nice, clean comparison.
But when you're looking at a large-scale system like this,
you've got to look at everything from the kind of overall throughput of the system,
the range of workloads that it can effectively support,
the consistency of the latency to the individual workloads,
you know, kind of 95th percentile, 99th percentile type latency numbers,
as well as the cost effectiveness of it for the different workloads.
And there's a lot of different things to hone in on to actually say, yes, this is performing well.
I would have to say, let me interject here that SPC1 does provide 95th percentile workload latency numbers
and things of that nature.
I mean, if you dig into the reports, mind you, you have to dig into reports,
you can see what the latency was at 10% stress, 15, 20, 25, 50, 75,
all the way up to, I think, 95%, and at 100% stress, actually.
So, I mean, it's probably not all those numbers,
but there's probably a good six or seven latency measurements that were taken.
Yeah, there is.
And, again, I think it's one thing to report latency against a single workload,
but when you've got hundreds of workloads running,
how do you effectively characterize the latency there?
Because it's something that's very difficult to reduce to a single number, I'd say.
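To illustrate the measurement problem being discussed, here's a rough sketch (illustrative code with made-up sample data, not anyone's actual reporting tool) of characterizing latency across hundreds of workloads with per-workload percentiles rather than a single number:

```python
# Illustrative sketch only: characterizing latency across many concurrent
# workloads with percentiles rather than a single average.
import random

def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (microseconds)."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, int(round(pct / 100.0 * len(ordered))) - 1))
    return ordered[rank]

# Pretend we collected per-I/O latencies for 200 workloads (lognormal, ~400us median).
random.seed(42)
workloads = {
    f"vol{i:04d}": [random.lognormvariate(6.0, 0.4) for _ in range(1000)]
    for i in range(200)
}

# Per-workload 95th/99th percentile, then the worst offenders across the cluster.
p95 = {name: percentile(lat, 95) for name, lat in workloads.items()}
p99 = {name: percentile(lat, 99) for name, lat in workloads.items()}
for name in sorted(p99, key=p99.get, reverse=True)[:5]:
    print(f"{name}: p95={p95[name]:.0f}us  p99={p99[name]:.0f}us")
```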
Yeah, well, that shows a big disconnect between the way your customers are using your gear and the way most of the customers for the other all-flash arrays are, where there was one workload that the ROI on it was so big that they just went and threw money at
that one particular problem.
This would be an Oracle workload in my mind.
Frequently.
Frequently, yes.
Very often it is, between the pervasiveness of Oracle as well as, quite frankly, the cost savings that you can get from getting better storage performance on Oracle and possibly reducing the number of actual Oracle hosts that you need. It's a very tempting thing
for people to do. And Flash makes a very good performance band-aid for situations like that.
Yeah. I'm starting to see OLTP, but more of it's been OLAP, just because even though they had the huge pain with the transaction
processing, they weren't ready to deal with a startup to
put the crown jewels on.
What's OLAP? Online analytical.
Oh, online analytical processing.
Oh my god, I learned a new acronym today.
This is a first for me.
At least a first for at least another decade or so.
So the other thing about your product, Dave, that makes it kind of unique in my mind is sort of a distributed deduplication.
Could you talk a little bit about that?
Yeah.
So we've talked about kind of the goal of the system to be cost-effective for a range of workloads,
not just kind of the performance at any cost workloads.
And part of that was driving very high levels of data efficiency through inline compression, deduplication, and thin provisioning.
You know, deduplication is challenging for a couple of reasons.
One is simply just the performance and throughput of being able to do it for a flash system without adding latency to the data path. The second piece of it is in a scale-out architecture,
how are you going to do deduplication? Are you going to do it at the node level, at the device
level? Are you going to do it at the volume level? Or are you going to try to do it globally?
And we chose to take the approach of wanting to do it globally to get best efficiency.
And the approach that we take to achieve that is actually turning the internals of the system into a content-addressed storage system instead of a location-addressed storage system. And that actually means that we
locate and place data blocks within the storage cluster, within this kind of scale-out cluster,
based on the content of the data blocks rather than who or where they were written. Because at the end of the day, it doesn't really matter where they were written
because they could have been written multiple places by multiple people. And simply placing
them somewhere because it was written to a particular LUN at a particular address
doesn't really make any sense. Yeah. So it's sort of similar to some of your competition
in this regard, right, where it's content addressable storage. So that makes a lot
of sense. Yeah, it's actually an architecture we've seen in all Flash and, you know, at the
other end of the market from guys like Exablox where it's, you know, we take the data, we break
it up into chunks, we hash the chunk, and then do CAS at the chunk level. Yeah, and the reason
is fairly straightforward, which is that it's the only thing that really makes
sense to be able to do scale-out
deduplication. There really isn't
another method that makes a lot of
sense, because otherwise, if you
don't do content-based placement, you have
to have some kind of global index
that you're searching to find the location
of data. And simply managing
that global index becomes
an N-squared type scaling problem.
Where data shows up.
Well, that and the whole, you know, have I seen this block before hash lookup
is now distributed. Right, exactly. And that
just kind of breaks the scaling model.
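A toy sketch of the content-based placement idea (my simplification for illustration, not SolidFire's implementation): the block's own hash decides which node owns it, so duplicates written anywhere in the cluster land on the same owner and the dedup lookup stays local, avoiding the global index just mentioned:

```python
# Simplified sketch of content-addressed placement (not SolidFire's code).
# A block's hash both names it and determines its owner node, so global
# deduplication needs no cluster-wide index: duplicates always hash to the
# same owner, which already has the block.
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]

def block_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def owner_node(bid: str) -> str:
    # Toy placement: hash value modulo node count. A real system would use
    # consistent hashing so adding a node moves only a fraction of the blocks.
    return NODES[int(bid, 16) % len(NODES)]

block_store = {node: {} for node in NODES}   # node -> {block_id: data}
lun_map = {}                                  # (lun, lba) -> block_id

def write(lun: str, lba: int, data: bytes) -> None:
    bid = block_id(data)
    node = owner_node(bid)
    if bid not in block_store[node]:          # dedup check is local to one node
        block_store[node][bid] = data
    lun_map[(lun, lba)] = bid                 # metadata maps location -> content

# Two different LUNs writing identical data store it exactly once.
write("lun1", 0, b"hello world" * 512)
write("lun2", 7, b"hello world" * 512)
assert sum(len(s) for s in block_store.values()) == 1
```

A production system would layer consistent hashing and replication on top of this, but the scaling argument is the same: placement by content removes the need for a cluster-wide location index.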
Okay, the other thing that's sort of unique about your environment,
I guess in every one of the flash vendors we talk to has sort of a unique data protection scheme.
Can you talk a little bit about how SolidFire protects the data?
Yeah.
So, you know, we obviously don't use a traditional RAID-based scheme for data protection for a couple of reasons.
One is that kind of vanilla RAID 5, RAID 6 applied
against Flash has pretty significant wear and performance implications. So everybody that has
a kind of modern all-Flash architecture has kind of come up with their own data protection scheme
that is really more optimized around Flash and wear characteristics. In our case, we had to look
at a couple of other elements as well. We wanted a data protection scheme that would protect not
just against drive-level failures, but against node-level failures in a clustered environment as
well. So we wanted to be able to lose entire nodes within our system and have the system keep running.
The second thing that was really important for us was we wanted a data protection scheme
and really, I would say, kind of expand it to a high availability scheme that was self-healing.
So any component in the system could fail, and obviously we'd continue running.
It wouldn't lose any data.
But we would also self-heal from that.
And the self-healing part means that we would do some kind of reconstruction that restores the redundancy in the system so that you can then sustain additional failures without manual intervention.
And do that essentially as many times as possible up to whatever the available capacity of the system is. And traditional RAID certainly doesn't provide
any benefits from that. It may be able to heal or rebuild into a spare drive, but then you have to
go replace that drive. And if you have high availability at a higher level through something
like dual controllers or redundant controllers, there's really no self-healing capability inherent
in that. And so the approach that we've taken is something we call SolidFire Helix. And it's a distributed
replication scheme that distributes redundant blocks of data throughout the cluster, but does
so so that redundant copies are not on the same node or obviously on the same drive.
And this allows us to survive both kind of the drive level as well as node level failures, but also allows us to do that reconstruction so we can restore that redundancy when a drive or node fails.
But the other really cool thing about this is that it ties into our quality of service capabilities,
and the rebuilds are done in such a way that they have very, very little performance impact,
and we can still guarantee our quality of service settings even when failures occur.
And that's another place that traditional RAID-based systems tend to struggle.
Okay. If I understand how you do things right,
each SSD in the system is responsible for keeping a copy of data for some set of hash values.
That's correct.
So that drive, that SSD, doesn't have a doppelganger someplace else
in the system, right?
That's correct. Exactly. Essentially, for the ranges that it's protecting, other copies of those ranges are protected on other drives in the system, but it's not a mirrored one-to-one
relationship, which is important because that's what allows us to get that.
Because of the rebuild.
Exactly.
The mesh rebuild process that allows us to rebuild from a failed drive in literally minutes instead of the hours or days that you would have with RAID-based systems.
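Here's a toy model of the placement constraint being described (my own simplification, not the actual Helix code): every hash range keeps two copies on drives in different nodes, and when a drive fails its ranges are re-replicated from many surviving drives to many targets rather than into a single spare:

```python
# Toy sketch of distributed, non-mirrored replica placement (not SolidFire's
# actual Helix code). Each hash range lives on two drives in different nodes;
# when a drive fails, its ranges are re-replicated from the surviving copies
# onto many other drives (a "mesh" rebuild), not onto one hot spare.
import itertools, random

random.seed(1)
DRIVES = [(node, slot) for node in range(4) for slot in range(10)]  # (node, drive)

def place(range_id: int):
    """Pick two drives on different nodes for a hash range."""
    candidates = [pair for pair in itertools.combinations(DRIVES, 2)
                  if pair[0][0] != pair[1][0]]          # never the same node
    return random.Random(range_id).choice(candidates)

placement = {r: place(r) for r in range(1000)}          # 1000 hash ranges

def rebuild_targets(failed_drive):
    """For each range that lost a copy, re-replicate from the surviving drive
    to some drive on a third node -- the work fans out across the cluster."""
    moves = []
    for r, (d1, d2) in placement.items():
        if failed_drive in (d1, d2):
            survivor = d2 if d1 == failed_drive else d1
            target = random.choice([d for d in DRIVES
                                    if d[0] not in (survivor[0], failed_drive[0])])
            moves.append((r, survivor, target))
    return moves

moves = rebuild_targets((0, 3))
print(len(moves), "ranges to re-replicate, spread over",
      len({t for _, _, t in moves}), "target drives")
```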
Okay.
Well, that brings us to our favorite subject, storage QoS.
We hear all sorts of things from all sorts of different vendors about their storage QoS.
And how well it works, mind you.
Well, I mean, when we talk to vendors, it always works perfectly.
Doesn't everything vendors do when they talk to us work perfectly?
Yes, it does.
Sometimes I wish I lived in the land where those – no, never mind.
So I've seen QoS systems that don't expose knobs, but they expose screws.
You need the right screwdriver to get in there and go, well, allocate this much cache memory on this controller to this workload.
And I've seen others that have just limits. And I've seen others that let you say,
these are my gold, bronze, and silver quality of service. How do you guys expose what you do? And
how do you really deal with the guy who wants to play the Who at 4 o'clock in the morning?
Yeah.
So, you know, great question.
And we see all of those approaches to quality of service as well.
I like the description of the screws because that's really all most people have had historically is they have a set of kind of very coarse-grained things that they can do with the storage system to try to get better or worse performance.
Things like, well, I'm going to do RAID 10 for this volume
versus RAID 5 versus RAID 6,
or I'm going to, like you say, allocate,
you know, I'm going to pin this to a certain service processor,
or I'm going to allocate more cache memory to it,
or maybe I'm going to put it in this storage pool
that has this kind of mix of devices in it,
and I hope it's going to get better performance.
But the challenge that all of these things have always had is they don't tell you exactly what that performance will be.
And so you can turn your screws and you can kind of guess.
And if you have a small enough number of workloads so that you can turn all those things and everything's happy,
that works relatively well.
But when you get to large scales, when you get to the point of having
hundreds of workloads and you have a very dynamic environment where workloads are coming and going,
it's nearly impossible to do that. In fact, I would say it is impossible to do that type of
balancing manually. And so you need some kind of automated quality of service. And then you need
to look at whether you're going to use something like a rate-limiting approach that just is kind of a hammer to pound the bad actors with,
whether you're going to try to use a prioritization approach that achieves a relative priority,
or something like SolidFire's quality of service that is really based around what we call a fine-grained quality of service model,
where each volume in the system has a minimum, a maximum, and a burst level of performance.
And that minimum is key. That's really the thing that pretty much none of these other approaches can give you, and it's at the end of the day the most important thing to the application: what's my guaranteed level of performance? You know,
if everything else in the system is going nuts and then you have a drive fail or a node fail
or something just goes horribly wrong, what do I know my consistent kind of minimum level
performance is going to be? And that's what we really try to deliver with our quality of service model is the predictability of a minimum, a maximum,
as well as the flexibility of a burst model, which really gets over some of the challenges
of traditional rate limiting, where storage workloads don't like to be rate limited. They're
very bursty in their nature, and you need to be able to accommodate that. And what does a burst in
your world look like? I mean, it's like 10 seconds of high activity or?
So it's a credit-based scheduling mechanism.
Credit-based?
Exactly.
So that means that as you-
That should sound familiar.
Well, I was thinking about pulling out my credit card, but go ahead.
Yeah, it's not all that different, but it basically means as you run under your limit,
as you run under your maximum, you accumulate credits.
When you have a burst of activity, something like, say, a VDI or just a virtual machine booting up,
or you have something like a database checkpoint or something like that, it can then burst to a
separately configured burst limit. The higher it bursts, the faster it consumes those credits.
So it may take anywhere from several minutes to dozens of minutes to
consume those credits. This isn't meant to allow you to kind of accumulate performance for months
on end and then go crazy for a month. But on the order of minutes, which is kind of typically what
storage workloads burst on, you can accumulate credits and use those credits. And it's, of course,
completely transparent to the application. What they see is just consistent low latency. And
that's the huge benefit here, is that if you have a rate limit,
as soon as that workload has a burst of IO activity,
it's going to hit that rate limiter,
the queue depth is going to go up,
the latency is going to go up,
and it's just going to seem like the storage is really slow.
Okay, so let's say, I don't know,
100,000 IOPS is my max level.
And I sit here and operate for an hour at 50,000 IOPS.
So I effectively have a 50,000 IOPS per hour credit kind of thing.
And then so somehow, you know, Howard kicks in and he fires up his workload and it's now 200,000 IOPS. I can use, somehow I can use my 50,000 IOPS hour
for some amount of period to correspond to the activity at 200,000 IOPS.
Something like that, right?
Yeah, I mean, again, it's typically on the order of minutes.
So if you have 100,000 IOPS allocated and you're only running at 50,000 IOPS,
you're accumulating, you know, 50,000 IOs per
second of credit. You're going to do that for essentially several minutes. You're going to get
a bucket of, you know, let's just call it for the sake of argument, 5 million I.O.s of credit that you have stored up there. And then if your burst limit is 200,000 IOPS, then that's fine. If you
have a burst of activity, if some other workload comes in there or you, you know, run a database,
you run an OLAP query or something against it that wants to really push
the system, you'll be able to burst up there up to 200,000 IOPS and tap into that pool. And
if your query finishes, then great. If you continue to run at that higher level, eventually you'll be
limited back to your maximum to prevent kind of abuse and becoming a noisy neighbor.
But that's really the benefit of the burst is allow you to absorb those, you know, very typical spikes in I.O. that we see with storage workloads.
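The credit mechanics Dave describes behave much like a token bucket; here's a rough sketch (my approximation, using the round numbers from the discussion, not SolidFire's actual scheduler) of how running under your max accrues credit that a later burst can spend:

```python
# Rough sketch of the credit/burst behavior described above (not SolidFire's
# actual scheduler). Running below max_iops accrues credits, up to a cap;
# a burst may exceed max_iops, up to burst_iops, until the credits run out.

class CreditLimiter:
    def __init__(self, max_iops, burst_iops, credit_cap):
        self.max_iops = max_iops
        self.burst_iops = burst_iops
        self.credit_cap = credit_cap       # e.g. a few minutes' worth of I/Os
        self.credits = 0

    def allowed_iops(self, requested_iops):
        """IOPS granted for a one-second interval, given the demand."""
        if requested_iops <= self.max_iops:
            # Under the max: grant everything and bank the unused headroom.
            self.credits = min(self.credit_cap,
                               self.credits + (self.max_iops - requested_iops))
            return requested_iops
        # Over the max: spend credits to burst, never beyond burst_iops.
        extra = min(requested_iops, self.burst_iops) - self.max_iops
        extra = min(extra, self.credits)
        self.credits -= extra
        return self.max_iops + extra

# Worked example from the discussion: max 100K IOPS, burst 200K IOPS.
lim = CreditLimiter(max_iops=100_000, burst_iops=200_000, credit_cap=5_000_000)
for _ in range(100):                      # ~100 s of running at 50K IOPS
    lim.allowed_iops(50_000)              # banks 50K credits per second
print(lim.credits)                        # 5,000,000 (hit the cap)
print(lim.allowed_iops(200_000))          # 200,000 -- burst granted
# After roughly 50 s of bursting at 200K the credits are gone and the
# workload is clamped back to its 100K maximum.
```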
Okay.
So my problem is outside, you know, so I understand if I'm a solution provider that there are good reasons for me to put a limit on a given customer's performance because that's all he's paying for.
And if I give him more than that, he's going to get used to it.
And if the system later gets busy and he gets less than he was getting but more than he's paying for, he's still going to get upset and I don't want that to happen.
Right.
But in the corporate world, if the system isn't running at its full capacity, why do I want to limit any workload?
Yeah, it's a great question.
I think there's two things there. One is that more and more corporate IT, corporate kind of data center managers are becoming service providers,
and they are turning around and offering SLAs on availability, on performance, on capacity,
and other things out to their business users and application owners. And part of that is being able to guarantee that they're going to deliver a certain level of performance.
And you might say, well, if the performance is available, why wouldn't you give them more?
Well, the same reason that the service provider in the public cloud wouldn't do that either
because if you can deliver –
Because I have to keep them down on the farm and therefore won't show them Paree?
No, no.
What you want to do is say, look, if you're consistently running at or above your maximum anyway,
we need to adjust your allocation. We need to give you more, right? We need to change what your limit
is rather than just let you go crazy. Because what happens is when we do get more workloads in here,
and we're constantly getting more workloads into our corporate data center, eventually,
I'm going to have to turn you down. You're going to be upset. So it's better to give them the performance they need, allocate it,
whether you're internally billing for that or not, then you know what they're getting. They
know what they're getting. The storage system knows what they're getting. And you know how
much headroom you have in your storage system, which is one of the big questions people have
about their corporate storage environments today is, how much headroom do I have, right? How much
are my applications consuming? How much could they consume? Where are they going to go? Where can I put new applications? And this takes all the mystery out
of that, out of the picture entirely. So you actually tell the user or you would have some
sort of operator panel that would say, you know, of the 5 million IOPS that are provided by this SolidFire scale-out cluster, you're only using one million, something like that?
Absolutely.
So we show what they have allocated.
So here's kind of if you want to be able to deliver everybody's minimum,
here's what you've allocated out in the cluster.
Here's what's being consumed in the cluster and show this cluster-wide as well as per volume as well as per tenant
because we're a multi-tenant storage system.
So you can see what you've allocated out and what's being consumed by tenant as well as all the way down
to the volume level. Yeah, it's a collection of volumes. And again, it could be deployed in
different fashions. Obviously, you think about a customer in a public cloud scenario in a kind of
internal corporate data center that could be a business unit, could be a department, could just
be an application, right? You could give every application and the set of volumes
around that application its own tenant and be able to report and see the aggregate of all of that.
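As a sketch of the headroom arithmetic being described (illustrative numbers and structure only, not SolidFire's reporting API), once every volume carries an explicit guaranteed minimum, the cluster-wide question becomes simple bookkeeping:

```python
# Illustrative only: the "how much headroom do I have" question becomes
# simple arithmetic once every volume carries an explicit guaranteed minimum.

cluster_capable_iops = 5_000_000          # what the scale-out cluster can deliver

# tenant -> volume -> (allocated min IOPS, currently consumed IOPS)
tenants = {
    "erp":       {"oltp01": (200_000, 120_000), "logs01": (20_000, 5_000)},
    "web":       {"cms01":  (50_000, 48_000)},
    "analytics": {"olap01": (300_000, 90_000)},
}

allocated = sum(minimum for vols in tenants.values() for minimum, _ in vols.values())
consumed  = sum(used    for vols in tenants.values() for _, used in vols.values())

print(f"allocated minimums: {allocated:,} IOPS")
print(f"currently consumed: {consumed:,} IOPS")
print(f"headroom to promise: {cluster_capable_iops - allocated:,} IOPS")
for tenant, vols in tenants.items():
    a = sum(m for m, _ in vols.values())
    c = sum(u for _, u in vols.values())
    print(f"  {tenant}: allocated {a:,}, consumed {c:,}")
```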
I am somewhat dismayed that we're still talking about block storage
because that means that I have all of these wonderful tools and I have to apply them to data stores
rather than just individual VMs.
But I guess I'm just going to have to
be somewhat disappointed until VVols.
VVols is the answer, right?
It's an answer. A couple of things to realize.
One is that VMware is not the entire world, and we support other environments,
everything from bare metal to CloudStack and OpenStack,
that allow us to create a kind of one-to-one relationship between volumes and virtual disks today,
and we can do our quality of service and all the other fun stuff on a per-virtual-machine basis.
Obviously, on the VMware side, vVols is coming.
That will give us some of those capabilities there.
And even in the interim, we have some interesting things we haven't announced yet,
but we'll be rolling out soon that try to bridge the gap,
particularly on the quality of service side,
between the VMDK level and the data store levels.
I got Howard speechless, so I guess we're done.
You got both of us speechless, Dave.
And you know how rare that is.
Dave, and we've talked to you like four times in the last three months.
Yes.
That is the other side of this.
All right.
Well, I think that we should recommend that our listeners watch Dave's Storage Field Day 5 presentation. Dave managed to talk about his product and some of the competitor products
without stooping to name-calling and finger-pointing.
That presentation was one of the most impressive displays of vendor neutrality from the CEO of a vendor that I've seen in a while.
Yeah, I would agree.
I mean, and we've seen to some extent versions of this in the prior meetings that we've had together. It was a tour de force of what Flash architectures exist today and some of the benefits and some of the not-so-benefits
of different architectural decisions that are being made and stuff like that.
So I assume that most of our listeners know what Tech Field Day is, since Ray and I are
both relatively active in that community.
But for those of you who don't, please go to techfieldday.com
and look for Dave's presentation from Storage Field Day 5 last week.
It is educational for anyone looking at all-flash products,
whether they're interested in SolidFire or not.
Absolutely.
And it was a good session and was well received by the blogger team that was assigned to critique it. So it was good. I often think people are a little bit overly obsessed with their own architectures, and it's actually nice to be able to discuss the broader context of what choices other people are making,
where things are similar, where things are different,
because everybody wants to make it sound like they're the only ones in the world that do something,
and the reality is there's a lot of overlap in the storage space,
but when you look at the combinations of decisions people are making, patterns start to emerge.
I think it's a good point to end our discussion.
This has been great. Thank you very much, Dave, for
being on our call. Absolutely. Ray, Howard, thank you very much.
Next month we'll talk to yet another startup storage technology person.
Any questions that you have on SolidFire, please let us know or contact Dave directly.
That's it for now.
Bye, Howard.
Bye, Ray.
Bye, Dave.
Thanks again, Dave.
All right.
Thanks a lot, guys.
All right.
Until next time.
All right.
Bye-bye.
Bye-bye.