Grey Beards on Systems - GreyBeards discuss server side flash with Satyam Vaghani, Co-Founder & CTO PernixData
Episode Date: November 2, 2013
Episode 2: Server Side Flash Software
Welcome to our second GreyBeards on Storage (GBoS) podcast. The podcast was recorded on October 29, 2013, when Howard and Ray talked with Satyam Vaghani, Co-Founder & CTO of PernixData, a scale-out, server-side flash caching solution provider. This month's podcast runs about 40 minutes.
Transcript
Hello everyone, this is Howard Marks and Ray Lucchesi.
Welcome back to the second edition of Greybeards on Storage, the monthly podcast
where the voice of experience introduces you to the latest in the storage world.
If you'd like to appear on a future episode of Greybeards on Storage,
you can contact Ray Lucchesi at ray at Silverton Consulting,
or me at hmarks at deepstorage.net.
If email is too 20th century for you,
we are on the Twitters at RayLucchesi and DeepStorageNet, respectively.
This, our second episode, was recorded on October 29th, 2013.
Well, our guest this month is Satyam Vaghani,
who's the Chief Technology Officer at PernixData, and widely known as an early
architect of VMware's entire storage product line, or storage architecture. Satyam, fill us in on your history a little for us. Sure. Thanks for the generous words, Howard, and thanks
fill in on your history a little for us. Sure. Thanks for the generous words, Howard, and thanks
to you and Ray for inviting me.
I'm very excited.
I even bought a new pair of shoes for this podcast.
And for folks who are going to listen, thanks for spending the time.
I'm right now CTO and co-founder at PernixData,
but like Howard said, I spent 10 years before this gig back at VMware
architecting their storage stack.
Obviously, it was a great ride.
We started out when people thought VMware was a company that makes washing machines.
But eventually, people kind of got on the train, and we changed the world.
Hopefully, we'll do it again with PernixData.
So, you know, as you and hopefully some of our listeners know,
I've been following the server-side flash market closely for a couple of years,
and that's where you guys have chosen to play with what is either your only product
or your first product.
Oh, I see.
So that's how we are going to do it, huh, Howard, today?
Understatements.
I get it now.
You've been following the server-side flash market.
I see.
I've been writing a bunch of code, you know, for enterprise systems.
For a while now?
Yeah, just for a little while.
I'm sorry to break your train of thought, Howard.
It's okay. It's narrow gauge anyway.
Jesus.
So you guys have gotten a lot of attention because you were the first to bring write-back caching to the server side.
So what do you consider write-back caching? Maybe that's a good place to start.
Write-back caching is the ability for us to act as an acceleration layer for writes that are coming out of virtual machines.
And while that sounds very trivial, the problem with write-back caching is,
especially using server-side components like server-side flash device,
is that you've got to guarantee that if the server goes away
or if the server-side flash device fails,
you've got to guarantee that data availability is not compromised.
In other words, the availability of your data is as good
as if you were running without write-back caching on the server-side,
just going to your regular SAN.
So, you know, I find the word...
Go ahead. I'm sorry.
I'm sorry.
I find the word caching very interesting because what we are doing through write-back caching is actually write-back caching with fault tolerance.
And so the fault tolerance aspect is what makes it very complicated and very challenging to do.
Okay, Satyam, we got pretty deep very early. Before we return to the deep end of the pool, why don't you tell us about FVP and how it works and what value it presents to the users?
Sure, Howard.
And so what FVP does is it aggregates all server-side flash devices that you care to put into servers running hypervisors into what looks like one seamless pool of flash. And then we use that seamless pool of flash that
spans servers to accelerate IOs that come out of regular virtual machines and as they go out to
their primary storage systems. Now, this piece of software is resident inside the hypervisor,
so we do it in a manner that requires absolutely no change to the virtual machine. It requires
absolutely no change to your primary storage system. And it requires no new things to be
managed. No new virtual appliances, no new data stores, et cetera. It is the same old virtual machines running on the same old data stores,
except they are magically faster.
And we saw that magically faster applies both to read-intensive virtual machines
and to write-intensive virtual machines.
Okay. And so it installs entirely in the hypervisor?
That's right. It goes into the hypervisor.
And so it's as non-disruptive, it's as transparent as it gets.
And we assume that hypervisor is vSphere?
For now, it's vSphere, yes.
We are going after the lion's share of the market,
although we do realize that other hypervisors are coming up.
Do you support cache coherence across multiple servers?
Is that how you do this?
We do.
And so we make sure, and, well, this goes back not to the write-back caching aspect,
but the fact that we were also the first company to make a truly clustered platform out of server-side flash.
And so because we are clustered, because we know where every piece of virtual machine data is,
regardless of whether it's spread across multiple servers,
we can make sure that as virtual machines move around,
we would get you the latest copy of the data for that virtual machine,
regardless of which server-side flash device it resides on.
It could be a remote flash device, but we'll get that.
And yes, we will make sure through our cache coherency protocol,
we'll make sure that regardless of where the virtual machine is running,
it will get the freshest data. And so it has total knowledge of how many versions of a particular block are cached in the server-side layer for a particular virtual machine, and it has the knowledge to fetch the freshest block, or in other words, disregard those stale blocks.
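To make that coherency bookkeeping concrete, here is a minimal, hypothetical Python sketch of the idea described above: a cluster-wide directory that records which host holds the newest cached version of each virtual machine block. The class and method names are invented for illustration; this is not PernixData code.

```python
# Hypothetical sketch: a cluster-wide metadata map recording, per VM block,
# which host's flash device holds the freshest cached copy.
from dataclasses import dataclass


@dataclass
class BlockLocation:
    host: str        # host whose flash device holds this version
    version: int     # monotonically increasing write version


class CoherencyDirectory:
    """Tracks the newest cached version of each (vm, block) in the cluster."""

    def __init__(self):
        self._latest = {}  # (vm_id, block_no) -> BlockLocation

    def record_write(self, vm_id, block_no, host, version):
        # A newer write supersedes any stale copies cached elsewhere.
        key = (vm_id, block_no)
        current = self._latest.get(key)
        if current is None or version > current.version:
            self._latest[key] = BlockLocation(host, version)

    def freshest_copy(self, vm_id, block_no):
        # Returns where to fetch the newest data, possibly a remote flash device.
        return self._latest.get((vm_id, block_no))


# Example: "vm1" writes block 7 on host-a, then again on host-b after a vMotion.
directory = CoherencyDirectory()
directory.record_write("vm1", 7, "host-a", version=1)
directory.record_write("vm1", 7, "host-b", version=2)
print(directory.freshest_copy("vm1", 7))   # BlockLocation(host='host-b', version=2)
```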
Gee, we got deep here really quickly, didn't we?
Sorry, sorry. Okay, I'll shut up now.
Oh, no, no.
It's just that cache coherency, and the fault tolerance that goes around it or supports it, are fairly sophisticated tools and mechanisms for what I would consider just a mere server-side flash user environment today.
I mean, you're effectively creating a new caching layer throughout the virtual machine architecture, it seems.
Bingo. Bingo.
And so, you know, maybe this is a good cue to rewind back as to why we are even doing what we are doing. The reason we created PernixData and created FVP, the product, is that, you know, we saw a very interesting kind of behavior in the storage industry. And the behavior was that
all the consumers of storage,
especially as it relates to virtualization,
know that they've got to go to some form of scale-out storage
to make this storage problem tractable with respect to the growing needs
that virtual machines kind of pose to that layer.
And so everybody knows that scale-out storage needs to be done,
but it seemed that every time some player in the storage industry
wanted to deliver scale-out storage to consumers,
it was always in the form of a box.
And that was very ironic to us because, you know,
scale-out storage is about getting performance that scales
and, you know, determinism and performance.
It has nothing to do with capacity. So we found it very ironic that to solve a performance problem, you suddenly ship a capacity-based solution. And so the reason we created PernixData was to say, well, you know, maybe we can do it better, and that maybe we can decouple the storage performance problem from the storage capacity problem and solve just the storage performance problem.
And just that one problem, using server-side flash devices
extremely close to the applications
where I think the problem rightfully belongs.
And let the storage capacity tier,
which is traditional storage systems or new age storage systems,
solve the storage capacity problem.
In other words, give you just raw gigabytes
and data services on top of those raw gigabytes,
things like snapshotting, replication, RAID levels, et cetera, et cetera.
And so that's the bottom line.
That's the reason we are doing whatever we are doing.
The functions are what they are, clustering, writeback, all that stuff,
all that good stuff.
But the reason we are doing this
is to make sure that every data center
that runs virtual machines
can actually deploy a scale-out storage tier
comprised of server-side flash,
regardless of whatever they are using
as a means of providing capacity,
as a means of providing storage capacity.
Amazing. Amazing.
Okay.
So let's roll back a little bit
and talk about the whole idea of write-back
and how if you didn't do the clustering, that would be problematical.
Because I hear people talking about write back as an advantageous technology,
but I think they really underestimate the fragility of today's server environments
compared to today's storage environments.
That's very true.
They are much more fragile.
And, you know, some of the fragility is actually user-induced, right?
And that, you know, nowadays with technologies like VMware vMotion,
servers have just become replaceable components.
You can put a server in maintenance mode as and when you wish.
And so it's not just the fact that server components fail maybe much more often than
storage components, but it's also the fact that things like vMotion have actually made them
much, much more fungible. People can decide to install new drivers, put them in maintenance mode as and when they please. It's just much more feasible to do that, and so people do, and that keeps servers, you know, coming in and out of virtualized environments. And so one needs to handle it. But even coming back to your question, as you can see, I'm trying to stay away from the answer as much as possible.
But, right, so coming back to your question, the write-back problem is fragile.
Number one, because the server components fail. And so one needs to make sure that if a flash device fails or if a server fails,
you can resume the virtual machine on any other server in the system,
and that virtual machine needs to get access to the freshest copy of the data.
Now to do that, it's not a caching problem anymore.
Now you've got to build in clustering to make sure that you can figure out within your whatever cluster of 32 nodes or 64 nodes where the freshest copy is.
But more importantly, you've also got to do things like replication for fault tolerance
so that you can figure out who is the other server
in the system that has a copy of the fresh data because presumably the primary has failed
already.
And so now we are talking about not only doing write-back, which is significant in itself, because, as you guys already know through your quote-unquote short amount of exposure to server-side flash, server-side flash devices are great for reads, or flash devices in general do great at reads. But as soon as you start using them as a write substrate, you've got to think about some fundamentally different on-flash layouts that will make sure that the writes don't kind of get into the weird physics of server-side flash.
So you've got to solve that.
Well, of flash in general.
Yeah, there you go.
Too many of the legacy vendors in their first generations of systems said, oh, look, they're SSDs.
They act like disk drives and ignored the weirdness of Flash.
Do you guys create your own hardware, or use certified hardware from others, from a server-side flash perspective?
We want to limit the problems we subject ourselves to.
So just to, I guess, maintain the work-life balance.
Okay.
No, I'm just kidding.
But no, we do not create hardware.
We only ship software.
But the interesting thing about shipping
software, and that was true even when I
was doing my job at VMware before this,
is you've got to
then
actually take care of the
disparity in hardware
in your software layer.
So,
as of right now, we
actually see that there is a wide disparity in both performance and reliability
when it comes to server-side flash device vendor one versus vendor two.
And so some of these things are things that we can actually account for and take care of in our software layer.
Some of those things are things that we just throw up our hands in the air.
So, for example, we can't account for flash devices
not providing persistence on power failure.
Yeah, really.
If you have a consumer flash device, it may not have the kind of supercaps that are needed to destage data on power failure.
So we can't solve those kinds of problems. But we can indeed solve problems where, for example, in a particular flash device, we saw that if we mix a stream of 4K writes with a stream of smaller writes or bigger writes, then the performance is widely different from when we don't mix those streams. And so, well, we figured out that we can maybe tweak some on-flash format to make sure that when these streams go in parallel, we are changing that stream to make it look like a different stream which the flash device is known to handle better. So things like those we can take care of in software.
The other things we don't. But that's what makes it very exciting. And that's what also gives us, you know, a technology lead compared to other vendors in the space, whether legacy vendors, to use Howard's term, or maybe newer vendors who are trying to crack this nut.
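As a hypothetical illustration of the kind of software-level fix Satyam mentions, the sketch below batches an interleaved write stream by size class before issuing it, so the flash device sees a more uniform pattern. The cutoff and batching policy are invented for the example; this is not PernixData's actual on-flash format.

```python
# Hypothetical illustration: instead of handing a flash device an interleaved
# mix of small and large writes, batch them by size class so each flush
# presents a more uniform stream.
from collections import defaultdict


def reshape_write_stream(writes, small_cutoff=4096):
    """Group an interleaved write stream into per-size-class batches.

    writes: iterable of (offset, data) tuples.
    Returns a dict mapping a size-class label to the ordered list of writes.
    """
    batches = defaultdict(list)
    for offset, data in writes:
        size_class = "small" if len(data) <= small_cutoff else "large"
        batches[size_class].append((offset, data))
    return batches


# Example: a 4K write interleaved with a 64K write ends up in separate batches,
# which the software can then issue to the flash device one class at a time.
mixed = [(0, b"a" * 4096), (8192, b"b" * 65536), (4096, b"c" * 4096)]
for size_class, batch in reshape_write_stream(mixed).items():
    print(size_class, [len(d) for _, d in batch])
```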
So, I mean, yeah, so the challenge with, you know, write-back caching and clustering and all that stuff is the question of providing, you know, multiple copies of a write block
across the multiple servers that you're supporting throughout the cluster.
I guess the first question is, does your software run on all servers that have SSD cache? In order to be able to do this, it would seem
like that would be a necessity.
It is a necessity, in fact. The software needs to run on all ESX machines that are in a cluster,
and that is for the simple reason that if a virtual machine is ever powered on any of
the machines, then our software makes sure that it fetches you the
freshest copy of the data. Even if it doesn't have an SSD cache,
SSD PCIe card or anything like that, it still has to run there in order to be
able to retrieve the data. Yep, yep, I understand. That's correct.
Last I checked,
you guys weren't supporting members of the cluster that didn't have their own Flash, though, right?
Oh, no.
In fact, if you remember our conversation from SFD4, I'm sorry, SFD3 (you can see I'm getting into the mood now), but anyway, at SFD3, we talked about exactly this,
which was that you don't need to have flash devices on all hosts in the cluster
to make writeback work.
All you need is the software to run on all hosts in the cluster.
Right. Yeah. Okay, so the other question that comes to mind is how many copies of a write block
are floating around your cluster? You know, assuming everything is working properly. Let's
start there. And, you know, because you could have, you know, two copies, or you could have
32 or something like that. It's not clear what you consider, you know, a fault-tolerant environment for that data
at that point.
Although the network overhead for 32 copies would be rather large.
You think it's worse than two?
I don't know because, I mean, it's a broadcast kind of scenario.
I don't know.
It's a different story.
Well, you know, yeah.
Anyway, I shouldn't make that joke
go ahead
go ahead I'm okay
but I think
our 10 GigE network vendor would love
for us to make 32 copies
I'm sure
But we pinged a bunch of customers, or rather, a bunch of users who participated in our alpha program.
And the general reaction was that
a primary copy and two extra copies
ought to be enough.
And so now we give you choices.
You can, on a per virtual machine basis,
you can choose to do N plus 0, N plus 1, or N plus 2.
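To illustrate the per-VM policy Satyam describes, here is a rough Python sketch, under the assumption that a write is acknowledged only after the local flash copy plus the configured number of peer replicas (N+0, N+1, or N+2) have been written. The Host class and acknowledge_write function are invented for the example, not PernixData code.

```python
# A minimal sketch of per-VM write-back fault tolerance: commit locally plus
# to N peer replicas, then acknowledge the write to the guest.
class Host:
    def __init__(self, name):
        self.name = name
        self.flash = {}  # (vm_id, block_no) -> data

    def write_to_flash(self, vm_id, block_no, data):
        self.flash[(vm_id, block_no)] = data


def acknowledge_write(vm_id, block_no, data, local, peers, extra_replicas):
    """Write locally plus to N peers before acknowledging to the VM."""
    local.write_to_flash(vm_id, block_no, data)
    for peer in peers[:extra_replicas]:          # N+1 -> one peer, N+2 -> two peers
        peer.write_to_flash(vm_id, block_no, data)
    return True  # only now is the write acknowledged to the VM


hosts = [Host("esx1"), Host("esx2"), Host("esx3")]
# A throwaway analytics VM might run N+0; a database VM might run N+2.
acknowledge_write("scratch-vm", 1, b"tmp", hosts[0], hosts[1:], extra_replicas=0)
acknowledge_write("db-vm", 1, b"row", hosts[0], hosts[1:], extra_replicas=2)
print(sum(("db-vm", 1) in h.flash for h in hosts))   # 3 copies in the cluster
```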
On a per virtual machine basis?
Yes.
N plus 1? Really?
That means to me N plus 0? Really?
It would be temporary work files and stuff like that, things that you don't care about.
You just rerun them and stuff like that, right? Yeah.
Exactly.
And for things like, in fact, this came as a surprise to me, but right in our first year of shipping software, people are using us with analytics. And so for those workloads, most of the write is about outputting temporary data which is going to be read again, and then, you know, you're just shrinking the map-reduce, right?
You're just shrinking the
results into
one number, which obviously ought to be
42, which you then write out.
Of course, of course. What?
Wait, wait, wait. I don't understand
using server-side
Flash for analytics.
It doesn't compute
to me.
These guys are using DAS.
They're using slow disks.
They're having three copies of all the data.
Why would they want to use a PCIe flash card?
Why would they run in VMs?
Why would you even want an SSD?
I don't understand.
I hope I understood the question right.
I'm not sure I had a question in there, but go ahead.
I see, I see.
Well, I'll just make one comment, which is it's a huge causal chain of events.
For example, we have companies like VMware,
which is trying to convince people who are doing analytics to run on virtual machines,
and they are trying to break down the last known barriers
to running those workloads inside virtual machines.
And so if that is the case,
then we as a company, we as Pernix Data,
want to make sure that, well, you know,
the data set that you want to populate your engine
with can reside on whatever storage, you know, it could be shared storage, doesn't matter.
But then all these intermediate results that you're churning, and which are the ones that
actually require performance out of storage, can reside in the server-side flash tier.
And then, you know, over time, you're just going to throw them away,
which means, you know, you can potentially run those virtual machines
in write-back mode with zero replicas and be done with it
because you can rerun the computation anyway.
Okay. All right. I got you. I got you.
All right. Now back to the deep technical discussion.
So, you know, write-back cache mode, when I was doing storage controllers,
we had sort of, you know, a limit to how much data we could actually store in the cache at any instant in time.
We would force it to be destaged out after, you know, five minutes or an hour or something like that.
Do you guys have anything like that?
We do, in fact, so that, you know, people can argue about determinism.
We didn't want to be a system where data stays in the server-side flash tier for an unknown amount of time,
and hence you can't argue about the RPO characteristics of your capacity tier.
So anyway, I think I said too many things.
But yes, so the way this thing works
is we have a region on the flash device,
again, on a per-virtual-machine basis,
and we call it a destaging area.
So that's the area that tracks all the uncommitted writes
on behalf of a particular virtual machine that haven't yet made it out to the SAN.
Now, that is in addition to the other area that a virtual machine has, which is an immutable cache region,
which is just blocks that you can use for read acceleration, whereas the destaging area is special in that you can not only use it for read acceleration,
but it's also the part that needs to be written out to the SAN over time.
Now, we start that write-out to the SAN as soon as possible.
In other words, as soon as you have stuff in the destaging area, we are going to start writing it out to the SAN.
The only thing we make sure is we don't write it out to the SAN in a spiky fashion. So we are constantly
producing a uniform workload to the SAN, if you will, regardless of how spiky the application
workload is, because that's all being absorbed in the destaging area. And then we are just constantly destaging at a constant rate
to keep the SAN happy,
to not get into the corner-case regions
where the SAN actually behaves erratically.
But the net effect is that you take a little while to destage, and that little while, so to speak, is bounded by virtue of the size of the destaging area. Now that leads us to the next obvious question, which is, well, what if the area fills up? And the answer to that is, as we see that it's getting close to full, as we see that there is only a small amount of capacity left, we induce artificial delays in acknowledging writes to the application. Now those delays are nothing compared to the latency of the SAN. These delays are typically on the order of microseconds, but that delay essentially buys us extra time to send out more data to the SAN. And so that's how we rate-throttle the application, so to speak. And so in that intermediate time, if you're running very close to the full capacity of the destaging area, you're not seeing full flash performance, because of the artificial delays, but at the same time you're not seeing the bad SAN performance either, because the delay puts you somewhere in between flash and SAN latencies.
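As a rough illustration of the destaging behavior described above, here is a hypothetical Python sketch: writes land in a bounded destaging area, a background loop drains them to the SAN at a steady rate, and acknowledgement delays ramp up as the area approaches full. All thresholds and delay values are made up for the example.

```python
# Hypothetical sketch of a bounded destaging area with gradual back pressure.
from collections import deque


class DestagingArea:
    def __init__(self, capacity_blocks, drain_per_tick):
        self.capacity = capacity_blocks
        self.drain_per_tick = drain_per_tick
        self.pending = deque()

    def ack_delay_us(self):
        """Back pressure: no delay until ~80% full, then ramp up gradually."""
        fill = len(self.pending) / self.capacity
        if fill < 0.8:
            return 0
        return int((fill - 0.8) / 0.2 * 500)   # up to ~500 microseconds

    def write(self, block):
        if len(self.pending) >= self.capacity:
            raise RuntimeError("destage area full: fall back to SAN latency")
        self.pending.append(block)
        return self.ack_delay_us()

    def destage_tick(self, san):
        """Drain at a constant rate so the SAN sees a smooth, non-spiky stream."""
        for _ in range(min(self.drain_per_tick, len(self.pending))):
            san.append(self.pending.popleft())


san = []
area = DestagingArea(capacity_blocks=100, drain_per_tick=5)
for i in range(90):                      # a bursty application write spike
    delay = area.write(f"block-{i}")
area.destage_tick(san)                   # the SAN receives only 5 blocks this tick
print(len(area.pending), len(san), delay)   # 85 pending, 5 destaged, small delay
```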
Yeah, yeah, I got you.
And then as the destage area fills up, do you increase the amount of back pressure?
Exactly.
So we gradually start off with some latency, and then we gradually keep on ramping it up.
And, of course, the worst case is that you're going to run at SAN latencies if you just cannot get it done.
Yeah, but at least it's tapering off.
I've talked to a lot of people who don't realize that write-back caches are really about bursty traffic,
that once it gets full, if you don't have this kind of back pressure mechanism,
you fall immediately to the speed of the backing store.
That's very true.
And the other thing that I think not too many people realize is that write-back is not just about, you know, absorbing stuff in the cache and then writing it out to the SAN just blindly.
Because if you were to do that, then you are going to just produce the same old spikes on the SAN, the exact problem that you are trying to solve. And so write-back is also about kind of ironing out the traffic characteristic as it is presented to the SAN, and that is a slightly non-trivial problem, to keep up with this morning's understatements, so to speak.
It's a non-trivial problem in a clustered environment, because it's not just you, this one horse that is riding out to the SAN. There are many, many horses. And so you've got to keep the ride smooth across a cluster of machines, not just one machine.
You mentioned SAN multiple times. So does it really only support storage area network, block storage, behind you? Is that how this works?
All right, so I guess we started out very quick and stayed there. But the current product works on any Fibre Channel, iSCSI, or FCoE target that you care to use, and so that is all traditional SANs. But it also, in fact, extends to things like virtual storage appliances that you can buy.
And so it's not just SANs.
You can also do it with storage systems that are composed of server-side spindles to give you capacity.
The problem with those storage systems is because they are using server-side spindles
and because they are using virtual appliances,
they tend to be pretty low-end in terms of IOPS and latencies,
and we make them look like high-end.
And then, of course, there's the question of NFS.
We are working on it, and we have some news to share reasonably soon.
You also mentioned that the destage area ends up being a fixed amount of storage per VM.
Yes.
So, I mean, that would be, so if you had like an N plus 2,
which would be a normal, you know, something resilient type of data environment
that you wanted to keep, you'd have that VM be effectively stored across
three PCIe SSD kinds of things, right?
That is correct, but just uncommitted data for that VM and nothing more.
Right, right.
And so at any point, that uncommitted data is a very small fraction of the VM's overall flash footprint.
Right, right, right, right, right.
And you're saying uncommitted data because it's not something that's been written out to the disk yet or the storage.
And then when it becomes committed at a point in time when it actually has been written,
does it get kind of moved into the regular read cache,
or does it get, you know, flushed, or how is that?
I know it gets flushed, excuse the term, to the back-end storage,
but does it stick around?
I mean, that's, you know, I'm getting into nuts and bolts here.
Perhaps I should shut up.
No, I love it.
We like nuts and bolts.
Yeah, okay.
So what happens to the data after it's committed?
I guess that's the question.
Right, so it does get moved to the immutable region
so that you can use it for read acceleration.
Okay.
And then, of course, then begins the, you know,
very interesting resource management for that data, right?
Because, you know, that data is less important to keep on the peers, on the replicas, because, well, it has been committed to the SAN, so you clearly don't need it for availability reasons. So it's a different kind of handling when it comes to the transition of that data.
Exactly, the transition of that data.
It's a different kind of handling between primaries and replicas.
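A small, hypothetical sketch of the transition just described: once a block has been destaged to the backing storage, the local copy moves into the immutable read-cache region while the peer replicas, which existed only for availability, can be dropped. The dictionary layout and function name are invented for illustration.

```python
# Hypothetical post-destage handling: keep the local copy for read hits,
# discard the peer replicas once the backing storage has the block.
def on_block_committed(block_key, local_host, replica_hosts):
    # Keep the data locally for read acceleration...
    data = local_host["destage"].pop(block_key)
    local_host["read_cache"][block_key] = data
    # ...but the replicas are no longer needed once the SAN has the block.
    for peer in replica_hosts:
        peer["destage"].pop(block_key, None)


local = {"destage": {("vm1", 7): b"data"}, "read_cache": {}}
peer = {"destage": {("vm1", 7): b"data"}, "read_cache": {}}
on_block_committed(("vm1", 7), local, [peer])
print(("vm1", 7) in local["read_cache"], ("vm1", 7) in peer["destage"])  # True False
```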
Right, and clearly I'd rather read from the local SSD than across the network anyway.
Yeah, that's true.
But the other interesting part of your clustering technology
is what happens when we do a vMotion.
Because with a lot of the more basic products, first of all, many of them have problems just enabling vMotion.
But even if they do allow it, when you get to the new host, the cache is cold.
That's right.
And, in fact, there are some products which are advertised that they support vMotion in the sense that they migrate the cache.
And I'm not a fan of that either, for multiple reasons. So, one, these flash footprints for virtual machines are in the order of tens or hundreds of gigabytes. And so now, on a vMotion, we are talking about causing tens of gigabytes, if not hundreds, of traffic over the network.
I mean, that just doesn't quite make sense to me.
The other thing that doesn't make sense is, you know,
all that data that is eagerly migrated needs to be written out
on the destination flash device.
And guess what writes to SSDs do?
They eat them up.
Exactly.
So,
this is a very inefficient model.
If you do scale out,
you've got to do it on demand.
So, if virtual machines move,
at least our
take on the whole situation
is that, well, you can
get the virtual machine's data from remote caches on an on-demand basis.
If it asks for it, we can get it.
And in the process of getting it, we can also warm up the local flash device.
But that's a much more scalable mechanism because we're not moving gobs and gobs of data.
Right, right, right.
Yeah, you know, on the one hand, I don't mind that the vMotion takes longer
because you're moving the cache.
But when we're talking about caches as large as these,
the network and CPU requirements to make the transfer start to become significant.
Exactly.
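To make the on-demand approach concrete, here is a minimal, hypothetical Python sketch: after a vMotion, the destination host's flash starts cold, and blocks are pulled from the remote cache (or the backing store) only when the VM reads them, warming the local device as a side effect. The function and variable names are illustrative only, not PernixData code.

```python
# Hypothetical on-demand remote fetch after a vMotion.
def read_block(block_key, local_cache, remote_cache, backing_store):
    if block_key in local_cache:              # warm local hit
        return local_cache[block_key]
    if block_key in remote_cache:             # fetch from the old host's flash...
        data = remote_cache[block_key]
    else:                                     # ...or fall back to primary storage
        data = backing_store[block_key]
    local_cache[block_key] = data             # warm the local flash on demand
    return data


old_host_flash = {("vm1", 7): b"hot data"}
new_host_flash = {}                           # cold after the vMotion
san = {("vm1", 7): b"hot data", ("vm1", 8): b"cold data"}
read_block(("vm1", 7), new_host_flash, old_host_flash, san)
print(("vm1", 7) in new_host_flash)           # True: warmed without bulk copying
```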
You know, we're reaching about a little over
half an hour at this point. I don't know if you had any
final questions, Howard
or Satyam.
I think I'm pretty well exhausted at this point.
I think we've
covered pretty much everything.
Seems like there's a lot more
to be said about this write-back clustering,
fault tolerance, and all this stuff.
But I think we at least scratched the surface.
When you make the transition from write-through, which almost everybody else on the server side does, to write-back, it's at least an order of magnitude more moving parts.
Yeah, yes it is. And there's a lot to keep track of. But, you know, we're getting into a world where there are more and more write-intensive applications, and so, you know, we kind of have to go there.
Yeah.
All right.
So that's a good point. In fact, that's a good cue into one thing I wanted to say, at least in closing, which is, you know, if we use server-side flash devices for trivial things, and, you know, my definition of a trivial thing is using it as a read-only cache, using
it as a form that is not clustered, then I think that server-side flash devices will
never see their true potential being realized.
And so I think as technologists, it is our job to make sure that we can use them in a
much more complete manner.
And that's why we are not doing caching, I'm sorry, clustering or write-back just because it is fun and there are more interesting technical challenges.
We are doing it so that you can think about server-side flash devices as a part of your infrastructure,
not as a thing that you use for a niche use case. And that is very important to me. You know, if you want to change the core personality of the server that is going to ship in the year 2013 or 2014, and obviously that core changes by using a flash device, then you've got to use that flash device in a non-trivial manner.
You've got to use it in an infrastructure manner.
You've got to think about it just like how you think about CPU and memory. Well, all those components get used by the hypervisor,
and they are applicable to every virtual machine you run,
not just the five virtual machines out of the 20 that you are going to run.
And so similarly, a server-side flash device needs to be applicable to every virtual machine, every workload that you are going to run.
And the only way to do it is to make a complete solution,
is to solve both the read problem and the write problem,
no matter what it takes.
And of course, I'm also very religious
about doing it in a very non-disruptive manner
because this is not about using these flash devices in a way that forces you to make a hard choice. You know, there are ways to use server-side flash devices as primary storage systems, and at least my thing is, well, it makes the customer make a choice, which is they've got to throw away their current primary storage and move to this other primary storage. And I'm not a big fan of that, again because if you want to ship it in every server that is shipped, then you've got to make sure that it is applicable in every environment where virtualization is deployed today. And that, just by the very definition of every environment, encompasses any storage vendor, any primary storage system that you care to deploy.
So anyway, that's me on my soapbox in my new shoes, of course.
Well, and there you answered the other question, which is, you know, why don't we go all the way to a vSAN or ScaleIO-style model and blow up the SAN?
And the answer is that that's a niche, in my opinion. I'm not opposed to people using it; clearly, some people will find it useful. I've heard it's a great thing for ROBO use cases where you just want to deploy two servers and forget about it.
But, you know, I'm
worried about the guys who use blade systems
who don't have enough slots
in their servers to do
primary storage. I'm worried about the
guys who use 1U boxes who
don't have enough slots to do primary storage.
I'm worried about the guys who use hundreds of servers in their data centers, or 150 or 200 or 4,000, where suddenly doing primary storage that is spread out across a hundred nodes is actually hard to argue in terms of operational ease, because now we've got to actually monitor 100 different fault domains, independent fault domains,
as opposed to arguing about one box that may or may not fail.
So, you know, there's a lot of operational considerations.
It's one thing about technology, and, you know, we always go after building great solutions, solving the hard problems in computer science, but it's a totally different thing when it comes to operationalizing good technology.
There are many, many constraints.
Yeah. Wow.
Very good.
Well said.
All right.
Thank you, Satyam.
It's been a pleasure chatting with you as always.
And likewise, guys. Always a pleasure, especially nice to go this deep in the conversation.
Well, this has been great. We've had a visit to the
deep end of
the server-side flash pool
with Satyam Vaghani, and thank you,
Satyam, for being on our podcast.
Next time, we're going to shift our focus
to the other hot topic in the storage
world, big data, and
we will take a look at
everyone's favorite stuffed elephant, Hadoop,
and what it actually is
and means. Hopefully,
you'll be able to join us the next time
on Greybeards on Storage.
That's it for now.
So long, Ray. Thanks, Satyam.
Thanks, guys.
Thanks, Howard.
And stay tuned to this very channel on iTunes for Greybeards on Storage.