Grey Beards on Systems - 096: GreyBeards YE2019 IT Industry Trends podcast
Episode Date: January 2, 2020
In this, our year-end industry wrap-up episode, the GreyBeards discuss trends and technologies impacting the IT industry in 2019 and what's ahead for 2020. This year we have Matt and Keith on the podcast along with Ray. Just like last year, we start off with NVMeoF.
Transcript
Hey everybody, Ray Lucchesi here with Matt Lieb and Keith Townsend.
Welcome to the next episode of the Greybeards on Storage podcast, a show where we get Greybeards storage bloggers to talk with system vendors to discuss upcoming products, technologies,
and trends affecting the data center today.
This Greybeards on Storage podcast episode was recorded on December 27, 2019.
This is our annual year-end podcast where we discuss the year's technology trends and what to look forward to for the next year.
Matt, would you like to open the discussion?
Yeah, I think one of the things that I'd love to discuss is the whole world around NVMe and how the fabric is changing and how it's being utilized in a lot of different platforms.
Yeah, so NVMe over Fabric is starting to come out in a big way for a number of storage solutions and vendors.
I mean, obviously, there's the Ethernet side of things as well as Fibre Channel, and InfiniBand continues to be there, which is kind of interesting.
But it's all kind of RDMA-based logic and stuff like that.
So, Keith, are you seeing a lot of activity in the NVMe over Fabric space?
Actually, ironically, Mark May and I did a podcast earlier in the year, well, at the end of 2018, talking about, you know, when is NVMe over Fabric going to happen?
And I think this past year is a pretty good indication that either from a customer adoption
or just an overall vendor product maturity,
it's actually starting to happen at a reasonable scale, and most customers that I go and talk to
are at least aware of it as a technology.
Yeah, so I've seen a lot of the storage vendors start offering at least NVMe over Fabric, Fibre Channel solutions
for most of their latest storage rounds.
And a couple of them are actually offering NVMe over Fabric for Ethernet as well.
Advantages are significant, and the SSDs are cheap enough now that they're no longer, you know, that significantly expensive. So if you want
to get millions of IOPS with, you know, 100 microseconds or less latency, it's really the
only way to go at this point. Yeah, it's kind of an indication of where we're at from an overall problem standpoint and what problem it's solving, which is, whether, you know, we're talking about ML and AI, which I think we'll get into a little bit later on, or just analytics, the sheer amount of data that we have and that we're trying to tackle.
I'm an SAP guy, so I'm partial to HANA and trying to ingest all of this data and analyze this data.
I/O becomes a really, really big challenge. And not just I/O, distributed I/O.
Yeah.
Towards that end, I'm wondering if we're going to see more solutions
like we saw at Pure Accelerate this year, where they're even reducing
the latency between the disk and the proc by incorporating
the Optane storage array component.
Right, right.
We're going to talk about SCM here in a little bit,
but let's try to continue to focus on NVMe over Fabric. The solutions for NVMe over Fabric are starting to come out with most of the major storage vendors. I think NetApp has some, IBM has some, Pure certainly has some, Hitachi's getting there or already has their NVMe over Fabric-ready solution for their systems. But all
these guys are looking at this as a way to boost the read-write IOPS
that are occurring and try to reduce the latency.
It's interesting.
I don't see a lot, you know,
so that there's a file side of this
and there's a block side of this.
NVMe over Fabric really is more of a block-side solution,
but, you know, you could put it behind a file server
or something like a NAS box
and still
gain some significant benefits from it, I'm sure. It's just amazing what the speeds and feeds are these days with the solutions that are emerging.
Yeah, I was just about to say, because you brought up an interesting solution. You think
of it as a block technology, but ultimately, as you build upon other solutions
or protocols, it becomes a necessity. Obviously, from a block perspective, I can easily see,
you know, it backing something like a VMFS, a VMware file system, so that you're building
faster and faster access to VMs, which, you know, may provide block services, may provide database services,
whatever the case is, reducing latency in the overall system.
Right. Does vSphere support NVMe over Fabric today? I think it does.
It does. It's a protocol, remember. It's not a format.
So, yes, as long as there's an initiator from the VMware side, everything is copacetic.
And yes, IO is going to be vastly improved. But for me, you hit on it, Ray. And that's the latency
between the processor and the disk that's going to really make the difference.
Yeah, yeah. So when we got to SSDs, we were starting to talk about sub-millisecond, you know, read and write latencies, which is, you know, let's say 500 microseconds or 750 microseconds.
The discussions we had with Pure and some of the other, you know, major vendors, they're talking sub-100 microsecond latencies for some of this stuff.
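Just to put rough numbers on why those latency figures matter, Little's Law ties throughput, concurrency, and latency together: achievable IOPS is roughly the number of outstanding I/Os divided by the per-I/O latency. Here's a quick back-of-the-envelope sketch in Python; the queue depths and latencies are illustrative assumptions, not measurements from any of the arrays discussed here.

```python
# Back-of-the-envelope: Little's Law says throughput = concurrency / latency.
# All numbers below are illustrative assumptions, not vendor measurements.

def iops(outstanding_ios: int, latency_us: float) -> float:
    """Achievable IOPS for a given queue depth and per-I/O latency."""
    return outstanding_ios / (latency_us * 1e-6)

for latency_us in (750, 500, 100):       # SSD-era vs. NVMe-oF-class latencies
    for queue_depth in (32, 128):
        print(f"{latency_us:>4} us latency, QD {queue_depth:>3}: "
              f"{iops(queue_depth, latency_us) / 1e6:5.2f} M IOPS")
```

At 500 or 750 microseconds you need enormous parallelism to get anywhere near millions of IOPS; at 100 microseconds the same queue depths get you roughly five times further, which is the point the vendors are making.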
It gets to, I mean, you have to start combining the storage class memory kinds of solutions
in here, but NVMe over Fabric was really designed for storage class memories and things of that
nature, which is where, you know, where the industry is going to see a major transition
here, I think, and it's starting to see that as well.
The other thing is, you know, it was a couple of years back, when we were at Tech Field Days and Storage Field Days, that we started seeing, you know, storage startups start to pump out, you know, 4 million IOPS and, you know, 100 microsecond latencies with NVMe over Fabric SSDs and stuff like that. Those guys, you don't hear much from anymore.
They're still there,
but now that the majors have all started
to support that protocol,
those guys have got to be hurting somewhat,
I would think.
One of them just got bought by Amazon
or something like that.
I'm trying to think which one.
It wasn't any event.
That's what needs to happen, right?
They either need to be bought or they need to become successful.
Yeah, there's so many players in the industry right now.
We're even seeing the next generation 3PAR, the Primera device, moving in that direction. I don't know how a smaller startup
is going to compete with these large majors unless they're doing something truly, truly
revolutionary. And it did seem a couple of years ago, I think I was sitting next to you at a
particular Storage Field Day event, Ray, where your jaw was-
Yeah, the Acceleron guys were showing 4 million IOPS with two SSDs and 100 microsecond read
latencies. I said, you've got to be kidding me. This is, you know.
It was, what, $25,000 worth of hardware?
Yeah, yeah, exactly. I mean, these were, you know, major systems that couldn't even approach
4 million IOPS, let alone 100 microsecond latencies years back. So
all that's due to NVMe over Fabric and NVMe SSDs. And, you know, PCIe Gen 3, I mean,
the whole system's kind of gone up a step. Keith?
So I have a question. Is that ultimately anything that people really have to care about if the fabrics themselves are self-contained in platforms? Because one thing that we didn't talk about in the pre-show are these new converged systems, like Dell PowerOne and stuff, and the NetApp, the DHCI systems, which will incorporate these things in the fabrics of the actual converged systems.
So it's invisible.
It might use NVMe over fabrics for the inter-system communication,
but we may never see it.
Yeah, yeah, yeah.
Yeah, so does this mean that they should be packaged,
or should people who listen to the podcast be asking for it specifically?
Yeah. I think if you look at something like VMAX, which is a distributed cluster
storage system, and then, you know, 3PAR obviously is also to some extent, and there are others out
there as well, that can take advantage of NVMe over Fabric or InfiniBand or, you know, internal Ethernet or internal Fibre Channel.
You know, at one time, this was all proprietary stuff, but nowadays it's becoming more and more
standardized fabric. So yeah, it's certainly, it's there. They can be taken advantage of. It's not
something that a customer in my mind can go out and say, you know, I really want NVMe over Fabric in your intercluster fabric.
But it's, you know, the vendors are certainly seeing that there's an advantage going in that direction because the whole industry is moving there.
Right. Think about the footprint of a hyper-converged platform. I think you're going to be able to pack much more density in terms of storage. Of course, we're going to have to lean on some of the new Intel procs to do it. But in order to get that kind of density against, say, a three-node cluster, NVMe, just purely as a disk-style format, really is going to add some relevance.
Especially where most HCI systems break down is at the network.
When you start to fill them, the problem becomes these weird network problems that are generated by basically latency and chattiness of protocol.
Yeah, because they're not driving all the I/O to the same storage that the system's on.
Yeah, the HCI aspect is interesting.
I mean, certainly an HCI solution could support NVMe over Fabric across a cluster of nodes to do some of its internal activity. You know, you find vSAN and a couple of other guys that are doing, you know,
software-defined storage is trying to almost limit the amount of IO they have to go off
system for, you know, across the network for, because of that, to try to reduce the network
overhead involved in bigger and bigger clusters and stuff like that.
But, you know, the advantages, of course, is that you've got all this storage sitting on the server.
So if you can access it from the server, it's faster than NVMe over Fabric, but not by much. You know, it's maybe a dozen microseconds faster if you're accessing the storage directly versus over NVMe over Fabric.
It depends on the number of hops and all that stuff, sure.
But it's come down considerably from what it used to be.
I mean, it used to be if you're accessing SAS directly within a server,
it was considerably faster than trying to do that over the network,
like a fiber channel network or something like that.
Nowadays, with NVMe or Fabric,
the difference is not that significant anymore.
So yes, there is network bandwidth consumed.
And the faster and the more ops you do,
the more bandwidth you're consuming.
I was going to say, just keeping on this subject, Keith, you mentioned DHCI, the NetApp solution, sort of non-reliant on internal storage, disaggregated being the D.
So you're going to see things like,
in terms of a SimpliVity type solution from HPE, you're going to see an external
storage array leveraged for the storage rather than internal storage, which has always been
sort of the HCI bread and butter.
Are you thinking we're going to see more like that?
Well, absolutely.
HPE, I did some content with HPE earlier this year at VMworld, and they're heavy on that DHCI solution, bringing in their 3PAR and kind of merging that with Synergy and giving that HCI experience. So as we
lessen the complexity of the interface to customers, can they increase the complexity on the back end? So for those system interconnects, you know, the storage system talking to the nodes, the HCI nodes or the compute nodes, do you just say, you know what, we're going to do NVMe over Fabrics over Ethernet,
which is something that a customer
probably wouldn't tackle themselves.
But if it's an engineered system, why not?
Yeah, especially, you know, there are some costs there; you have to go to a specialized switch, I guess, to support, you know, RDMA or RoCE or that sort of stuff.
But even that, you know, in the years past, we've talked to vendors that are trying to
get away from even having specialized networking fabric to support NVMe over fabric.
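For a concrete sense of what the host side of that looks like today, here's a minimal sketch that attaches an NVMe over Fabrics namespace on a Linux host using the standard nvme-cli tool, driven from Python only for readability. The address, port, and NQN are made-up placeholders, and the transport would be rdma on a RoCE-capable fabric or tcp on plain Ethernet.

```python
# Minimal sketch: attach an NVMe over Fabrics namespace with nvme-cli.
# The target address, port, and subsystem NQN below are hypothetical.
import subprocess

TARGET_IP = "192.168.50.10"                      # placeholder target portal
TARGET_PORT = "4420"                             # conventional NVMe-oF port
SUBSYS_NQN = "nqn.2019-12.example.com:subsys1"   # placeholder subsystem NQN
TRANSPORT = "tcp"                                # or "rdma" on a RoCE fabric

def run(cmd):
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Ask the target what subsystems it exposes on this portal.
run(["nvme", "discover", "-t", TRANSPORT, "-a", TARGET_IP, "-s", TARGET_PORT])

# Connect; the namespace then shows up as a normal /dev/nvmeXnY block device.
run(["nvme", "connect", "-t", TRANSPORT, "-a", TARGET_IP, "-s", TARGET_PORT,
     "-n", SUBSYS_NQN])
```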
Okay, I think we beat that one to death.
The other thing that was kind of interesting, and Matt mentioned it, was this Optane and storage class memory and Optane persistent memory and that sort of stuff.
I'm starting to see a lot more vendors discussing how they're going to implement these solutions.
I think 3PAR was early in with some sort of a, I guess I'd call it a read-only cache version of the solution, but they're not the only one. I mean, everyone's talking about it, some are actually implementing it, that sort of stuff.
You guys starting to see anything in that space that interests you?
So I'm seeing a lot of this because, you know, my world is SAP, SAP HANA. So obviously, when you're talking about one terabyte and 1.5 terabyte in-memory systems, the ability to subsidize the cost of these systems by using storage class memory really, really becomes compelling. I just learned from Intel the other day that it's not just the SAPs of the world, but now
they've done a lot of work with Microsoft to take advantage of this using standard
SQL of all things. So if you're in, yeah,
you don't even have to have the enterprise version of SQL. You can get,
you know, you can load a system up, I think, with 512
gigabytes of RAM.
And a good portion of that being the storage class memory, this Optane memory, and take advantage of it.
So it is definitely happening in the analytics world.
It is as close to mainstream as when you're considering in-memory databases, which I don't know if you can call in-memory databases mainstream,
but in the world of in-memory databases, it's very common.
Yeah, so, I mean, a couple of years back when we first saw, you know,
3D XPoint starting to come out from IBM, not IBM, Micron, and Intel,
you know, there was, you know, Howard and I and others had some discussions
about how quickly it would start to emerge.
But it wasn't until really this year, I think it was an Intel data center conference that they had, where they introduced the Optane memory, the persistent memory.
So persistent memory is available, I guess, in two different formats.
One is, you know, like, you know,
it's just regular direct memory.
It just happens to be a larger direct memory, DRAM.
And the other one is it would be a specialized service that you can call to page stuff in and out
of persistent memory.
So, I mean, there's that.
There's the persistent memory side of, I'll call it, storage class memory. And then there's the SSD version of storage class memory.
Both of these guys are starting to come out in big time.
So, I mean, and Intel had a major push this last year to try to introduce the persistent memory version of it.
And I guess they've been successful from that perspective because SAP and even SQL.
SQL, that's kind of interesting.
Yeah, I was pretty surprised when they told me about it.
I expected it to be the more expensive enterprise version,
but no, they treat, Microsoft doesn't treat it any differently than if you loaded
a regular standard SQL server up to the max with,
actually it's not that, they're looking at the size of the memory footprint versus the size of the database.
So, you know, the memory can work in those two different modes that you mentioned.
And if you lean towards the paging versus the direct memory mode, you get the performance boost without increasing licensing costs.
So, yeah, that's really.
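To make those two modes a little more concrete from an application's point of view: memory mode just looks like a much bigger pool of DRAM, while the app-direct style is typically reached by memory-mapping a file on a DAX-mounted persistent-memory filesystem so loads and stores hit the media directly. A minimal Python sketch of the latter follows; the mount point and file name are purely hypothetical, and a real deployment would more likely use something like Intel's PMDK from C to get proper flush and persistence semantics.

```python
# Sketch of app-direct persistent memory access via a DAX-mounted file.
# The path is hypothetical; PMDK from C is the usual production route.
import mmap
import os

PMEM_FILE = "/mnt/pmem0/example.dat"   # assumes a DAX-mounted pmem filesystem
SIZE = 64 * 1024 * 1024                # 64 MiB region

fd = os.open(PMEM_FILE, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, SIZE)                 # size the backing file
buf = mmap.mmap(fd, SIZE)              # map it into the address space

# Loads and stores are now byte-addressable against persistent media,
# with no block I/O stack in the path.
buf[0:11] = b"hello pmem\n"
buf.flush()                            # msync-style flush of the mapped range
buf.close()
os.close(fd)
```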
Yeah, it's hard for me to see the licensing cost aspect of it.
It's there, obviously, with, you know, CPUs and memory and all that stuff.
So that's an interesting aspect of it.
But, you know, if SQL can use it, there's no doubt in my mind that anybody out there has the potential to take advantage of this more memory
and that sort of stuff that's going to be available.
So, you know, Pure did that interesting thing where they're using it in the storage array as a caching layer.
And they worked with SAP, Redis Labs, et cetera, to optimize those databases to run against a SAN, which is not something
I predicted. I thought that we would normally see server-class storage memory in systems directly
in the servers. I didn't foresee a use case of putting this stuff in SANs and using that as a
shared pool of resources. I don't think consistent numbers came out on the performance benefits of
it, but I thought it was a creative use. Yeah, if you build it, they will come.
So, when you say the memory is coming out as a DIMM, is that what you're saying?
Yeah, so the memory is the... So, they come in the DIMM format.
Right, right, right.
So, here in their... I think it's their FlashBlade or something. I forget what system they use it in. You can load up one of these FlashBlade systems with them, with a couple of slots.
And then they have an algorithm that pre-stages memory, or they pre-stage reads.
They use it as a read cache basically, and it supposedly increases SAP performance quite substantially, as their claims go, and without having that dedicated cost, because with distributed storage systems the advantage is that you can spread the benefit across many systems. Pure is making the bet that if you spread Optane memory across several systems, you'll get benefits across all of your workloads versus just a single workload.
Exactly, exactly.
As in any network storage solution, you can spend more money on the network storage and take advantage of that across multiple workloads rather than a single server, for instance, that sort of thing.
So, yeah, so Optane, you know, Intel is obviously playing all the games they can, right?
So they want, obviously, to sell Optane SSDs.
They want to sell Optane persistent memory.
They want to sell persistent memory to servers.
They want to sell persistent memory to storage systems.
And then, you know, we haven't gotten into it, but between their processor and their
networking technologies, they're pushing, you know, go back to the previous topic, NVMe over
fabrics. Once you offload the processing to the network card for the NVMe over fabrics and make
it just as simple as using any other protocol, in theory, they have something there. That's the
theory, at least.
Yeah. Intel's in the long game, right? They've been working on NVMe over Fabric as a protocol because they knew that storage class memory would come along and SCSI protocols weren't going to cut
it anymore. You had to have something different, completely different. And NVMe over Fabric was
the answer to that. Now the other shoe has dropped, right? Because now they've got storage class memory that's available
and the world is a new, you know, every, it's almost like every couple of years,
something substantial is happening to this world of storage. I mean, it's just, it's a whole new
world from an IOPS perspective and a response time perspective.
And, you know, we were at Flash Memory Summit, I don't know if you guys were there, but there are companies out there, Samsung, that's offering what they consider a storage class memory.
It doesn't have the same endurance as Optane, but it's got somewhat similar to the same performance.
And it's got much better endurance than the other SSDs.
So they're starting to redefine what it means to be storage class memory.
I'm not sure Intel is happy about that or Micron is happy about that, but Samsung certainly
is happy because they can offer a specialized NAND that offers some of the same characteristics
as storage class memory, at least from an SSD perspective, right?
Well, it's interesting, especially as the stuff gets pushed down
into more common use cases or broader use cases.
You know, ironically, I've been looking at the Apple Mac Pro.
I can't really buy it,
but I'm wondering, I haven't read anything on that, but I'm wondering if it already supports Optane or like products.
This is a system that can,
that comes with a Xeon,
a 28 core Xeon that can address
1.5 terabytes of RAM.
No one's going to go out and spend
$25,000.
In your Mac, right?
But there's practical
when you think about
machine learning and you stuff
these things with NVIDIA cards instead of AMD
cards,
these become pretty decent
ML, AI workstations
and now you push that workload down to your data
scientists and the capabilities and what your data scientists can do with workstations with
a tremendous amount of RAM. You know, I'm really curious as to where this is going to go.
I look at the deep learning stuff, and it's pretty heavily file-based, but there is a RAM component in that from a caching perspective. And obviously they're building the neural network in RAM and updating RAM, you know, memory locations or neural nodes and stuff like that. There's a whole different discussion on what's happening in that space.
Yeah, and it's certainly driving GPUs, it's driving, you know, new hardware coming out left and right. But yeah, I'm not sure the storage class memory is having an effect on that as much as it's opening up the memory space, I guess, on the servers. So yeah, the 1.5 terabytes on a Mac in DRAM or memory is a pretty impressive number.
It is a very impressive number.
I mean, I'm sitting here.
I got like four or five.
I got eight gig on the desktop at home.
And I can see it when I've got like, God forbid, 20 Safari pages open.
The memory is the problem.
It just can't handle it.
Yeah.
With 1.5 terabytes, you'll be able to run Slack and Safari side by side.
Yeah, probably, probably so. Yeah, so Intel's playing this for a long game. Obviously, nobody's coming out with an Optane persistent memory knockoff anytime soon because it's all proprietary technology.
You would think AMD would take a shot at it and stuff like that.
But the SSD portion of it, it's all NVMe over fabric.
And if you can come up with some other technology that operates at that level of performance and endurance and that sort of stuff, why not?
And that's what Samsung has done.
I think what they've done is they've taken, you know,
some old SLC technology and dusted it off and said, okay,
we can use this to provide better performance and higher endurance.
Yes, the capacity is not going to be the same.
It may be half of what you can get now or maybe a quarter of what you can get
now in an SSD, but if it can do it, you know, why not?
And people are willing to pay the money because the performance is so high,
stuff like that.
Yeah.
So it's interesting.
I think early on we saw HPE come out with it on the Primera solution, but now just about everybody's talking.
You know, Pure came out with it as an option in their FlashBlade.
Not in their FlashBlade, in their FlashArray, right?
I mean, you could actually replace some of the FlashArray SSDs with SCM, stuff like that.
Hitachi's talking about it.
IBM's certainly talking about it.
Dell certainly is talking about it as well.
So, yeah, you're right.
It's coming out all over the place.
Well, I mean, a third item I thought we'd talk about would be, you know,
and we saw evidence of this at Pure,
was this whole version of enterprise storage moving to the cloud.
And I had this discussion.
I'm not sure who it was with. It was a guy from Wikibon, Dave Vellante, at the bar, on what Pure is doing with their cloud volume solution.
It was just impressive how they've effectively re-architected
their flash array for the cloud using cloud services and stuff.
It's an impressive solution.
Yeah, you spin up an array on AWS and it looks just like your Pure array, and it behaves in your Pure management platform, your Purity operating system, as if it were a Pure array. But it's not. The back end is all S3.
Yeah, I was pretty impressed with the level of engineering that went into the solution.
The way I'm taking it, it was basically EC2 instances. Other than using, you know, object as the persistent layer, what they did was to take EC2 instances and make them basically virtual disks, which was a pretty unique and interesting approach to recreating an array.
Oh, exactly.
I mean, and they've got some different, you know,
there's like front end.
I'm not even sure if I call them EC2 instances.
They've got some specialized, you know, high end compute and storage solutions there.
And then they've got the normal EC2 instances for effectively their storage, their disks.
And then they've got the object storage, their persistent storage.
Like you said, Keith, it's an impressive solution.
But, you know, they're not the only guys out there.
I mean, Matt, you were saying that Nimble had been out there for a while, right, with their cloud?
Yeah, in fact, they call it cloud volumes, which is interesting.
But it's a little bit different.
The way that Nimble does it is they actually have – I actually refer to it as cloud proximate storage. So if you have a large database
sitting in your Amazon or Azure cluster, but your data overhead is so prohibitive, you can actually put that data sitting in a locally situated data center that is Nimble-backed.
And that Nimble storage shows up as the storage for the database that is running on EC2 or running on S3.
And it's a little bit different.
Yeah, so, I mean, that sort of stuff I consider like cloud-adjacent storage.
NetApp's had it for a while, and Infinidat has got it, I'm sure.
You know, most of the other vendors have offered some solutions
with more or less services surrounding it so that you can attach to it.
But, you know, and NetApp actually has gone another step, right?
I think they're actually deploying NetApp hardware and software in Amazon and Azure
and maybe even Google data centers.
So it's even closer than adjacent Equinix. NetApp probably gets less credit than what they deserve
for pioneering cloud-based block storage in general.
When you buy block storage from Azure,
it is NetApp storage.
It's ONTAP in the back end.
Believe it or not, something that's
ancient, I think
NetApp would like to say is
tried and true, but when you get block
storage, the basic block storage you get from
Azure
is NetApp,
which is traditional NetApp array
in the back end.
They have the cloud-adjacent solution as well.
Yeah, I knew they had like enterprise file services for Azure and AWS.
And I think Google coming out, that was all NetApp real storage.
You know, it's not like they've re-implemented their NetApp ONTAP for Amazon or Azure or Google.
They've actually got real storage sitting there that
they're maintaining and they're supporting and they're providing on an OpEx basis, just like
any other storage in the cloud. But, you know, this is happening more and more. And I think
NetApp, you're right. I think they were an early adopter of this view that, you know, the world is
going to move to cloud. We can either move with them or die. And they've taken it on and said, what can we make happen?
Yeah, I talked to Dave Hitz at Insight before last
when I interviewed him on The Cube,
and I asked him about this specifically.
And he said he was an early champion
and he had to help lead the culture change.
And NetApp just said, you know what, we're going to lean into cloud storage. The thing that people believe is going to kill us is actually going to be the next wave they ride, is the theory within NetApp. It is interesting. Yeah, I believe Dave was a critical component of that. And there was, I would say, a major transition for a lot of people at NetApp in a big way.
I believe they were leading edge compared to some of the other vendors we've talked to and stuff like that.
Well, you look at Dell EMC. I like to pick on Dell EMC.
They're a customer of mine, so don't pick on them too much.
They're also a customer of mine.
There you go.
But, you know, my brand is to pick on people that, you know, they pay me to
pick on. Yeah, exactly. They reacted, I think, well to their lack of strategy. I think they've
partnered with Faction to leverage one of their most valuable assets. One of their most valuable
assets is their relationship with VMware. And if you want to extend VMware Cloud on AWS, from a storage perspective, there's pretty much only one player in town, and that's essentially the Dell EMC solution. So Dell might be coming from
behind, but they have assets in the whole Dell tech.
When you look across their portfolio, including VMware,
they have a pretty good advantage.
Pretty good response to most of this stuff.
I agree.
I mean, you know, it's taken them a while to see
that there's an advantage there
or there's an advantage to be gained there.
And, you know, sooner or later,
they're going to be at the same level
as the rest of these guys, if they're not there already. The guys at Faction keep telling me that they are. You know, during that Tech Field Day Pure presentation, they were constantly kind of pinging me in the background and trying to, you know, make a case for why their solution is much better than Pure's solution. And if you ask me, all of this stuff is very,
with maybe the exception of NetApp,
because they've been doing it the longest,
most of this stuff is very 1.0.
The Pure solution, as creative as it is,
there's an awful lot of gaps in functionality,
what I would like to see in a cloud product,
from where it is today to where the product will go. So the market is super exciting.
And so, you know, what's driving a lot of this storage innovation to some extent is the explosion of data. It just doesn't seem to stop. There's nothing stopping it. I was thinking it was AI
and machine learning and deep
learning and stuff like that, but that's just a part of it. The whole IoT stuff is where it starts. You know, I wrote a blog post on the Internet of Tires. Imagine that, tires, Internet-connected tires and stuff like that. It's bizarre. It's just, the data is coming out of
everywhere, and the need to be able to put it where the compute is. You know, it's funny that we keep going in circles with this whole move the data, move the compute, which one has more gravity. But when you're talking about, you know, I want to use 10,000 TPUs, tensor processing units, I can't recreate that in my data center where my data lives.
So do I move the data?
Do I move the compute? Or do I take one of these intermediary solutions and put them cloud adjacent?
So this is why we're having this conversation, because of the compute capabilities the cloud
providers can provide that I can't. Right. And in fact, Keith, you were talking not too long ago about the propensity of this data to not be able to be accessed.
And we've got so much data being created. And when we're trying to draw analytics against it, the capacity for the analyst to do
that work is really hampered by the fact that so much of it is being created and so little of it
has really immediate access available. I think it's a really important detail that you talked
about quite eloquently.
Yeah, if you think about it, you know, we talk about the crazy number of 1.5 terabytes of storage in a single server.
But even if I loaded that, if I was running HANA, RedisDB or any of these other in-memory databases and put this stuff in memory, that's still only 1.5 terabytes. If I'm working against a petabyte data set, that's only 15% of the data.
1.5%.
Something of that nature.
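For what it's worth, the exact ratio is even smaller than the rounded figures above: 1.5 TB of memory against a 1 PB data set works out to roughly 0.15 percent, which only strengthens the point.

```python
# Quick sanity check on the in-memory fraction of a petabyte-scale data set.
memory_tb = 1.5
dataset_tb = 1_000          # 1 PB, in decimal terabytes
print(f"{memory_tb / dataset_tb:.2%}")   # -> 0.15%
```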
It's amazing how much data is floating around.
And the AI machine learning, deep learning stuff just consumes all the data it can, really, in order to do its job.
And it's just, you know, on the analytics side of it, it's also got the same problem.
They just, they'll take any data they can get their hands on.
Wow.
So what's coming out in VMworld with Tanzu Mission Control and Project Pacific and all that stuff.
It seems like containers is getting more and more pressed these days.
Are you guys seeing any of that activity?
Well, I am.
I've got a number of customers that are, you know, and it feels, you know, if you'll forgive me for the analogy,
do you remember the whole argument about OpenStack being a science experiment?
Oh, yeah.
Oh, yeah.
Yeah, yeah.
It feels to me like what we're getting today in terms of containers
is quite a bit like that throwback to being a science experiment.
Now, the question is, and I know that Nigel will cringe if he hears this, but are we getting a new level of science experiment with the container-based platforms? Or is it going to
reach a critical mass where, in general, it does tend to replace our virtual machine infrastructure?
Or if not replace, at least what I expect is augment. Because I don't think, just like the beginning days of VMware,
it's not ideal for everything.
It's not ideal for existing architectures that were never designed
to support something like a container-based platform.
But as these things get re-architected
and reconfigured,
we're probably going to see more of it.
What do you guys think about that?
I think this is a bit of an oxymoron.
We're storage people talking about a stateless solution
that solves stateless problems.
So the enterprise typically doesn't have many stateless problems,
which is my kind of problem with containers replacing.
Do you think that containers are still just a stateless solution?
I mean, just about every storage vendor out there has a container storage plug-in that, you know, containers can use to access state information.
Oh, gosh.
And then, you know, vSAN and all those guys have even got, you know, better than plug-in
kinds of interactions with VMware and stuff like that.
Yeah, I mean, it was originally designed for stateless, but.
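To make the plug-in point concrete: in the Kubernetes world those vendor plug-ins sit behind a StorageClass, and a container simply claims a volume from it. Here's a small sketch using the official Kubernetes Python client; the storage class name is a made-up placeholder for whatever a given vendor's CSI driver actually exposes.

```python
# Sketch: request persistent storage for containers via a PersistentVolumeClaim.
# "vendor-fast-block" is a hypothetical StorageClass backed by a vendor CSI plug-in.
from kubernetes import client, config

config.load_kube_config()            # or load_incluster_config() inside a pod
core = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="demo-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="vendor-fast-block",
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)

# Any pod that mounts this claim gets the same data back even if the
# container itself is killed and rescheduled elsewhere.
```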
Yeah, I think you have this core concept that containers will access some type of state,
whether that state is a database or that state is a persistent volume or whatever the case.
But if you think about the way we've designed applications in the past versus how we're designing applications today,
we would create application servers that had some sense of state.
So within the application itself, whether it's the configuration database,
within the application or parts of the state of the overall system within
the application server itself, we treated this as a pet.
So if the application server itself went away, we'd have to figure out some way to recover
the application by recovering the application server.
Containers kind of move away from that architecture to saying if the container dies, that is by
design you just create another container.
If you look at SAP, you look at any of these other legacy architectural footprints, we can't just kill the application server and keep moving along.
So to.
I think the crux of this problem is that even those applications that are, you know, stateless applications, stateless container applications,
they're talking to some database behind the scenes or something.
Because, I mean, users have to log in.
They have to go out and do transactions.
Those transactions take place effectively against some sort of state information.
They're not in the container.
They're sitting in the database server or they're sitting someplace else.
Yeah.
So the state is somewhere else, whether the state is on a storage array or the state is in the database.
The state is not in the application itself, which is where the container is running.
A handful of people are suggesting that we run database daemons in containers.
And for those people who are saying that,
now they're relying on a storage array or the persistent volume
to maintain consistency and state across.
Because the container that runs the daemon
can die at any point, and that's the shift in architecture.
So you're solving the problem where you just need something that will do some, like, serverless-type compute, where it needs to do transcoding or processing.
But even transcoding, you're taking in old media and you're putting out media in a different codec.
You're still doing state information.
Oh, no, guys, in the middle of that process, you just spin up another container and it continues to do it. So if I'm 50% done with the encoding process and the container dies, so what? I'll just spin up another container and restart the process. So that's not a stateful process. That's just a process.
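That restart-and-redo pattern is worth spelling out, because it's really the design shift being described: the container itself holds nothing worth saving, and anything durable lives outside it. Below is a toy sketch of a transcode-style worker written that way; the paths are made up, and in practice the input and output would sit on a persistent volume or in an object store.

```python
# Toy sketch of a stateless, restartable worker: all durable state lives
# outside the container, so a killed container is simply run again.
# Paths are hypothetical.
import os

SHARED_DIR = "/mnt/persistent"        # e.g. a mounted persistent volume
JOB_INPUT = os.path.join(SHARED_DIR, "in/video.src")
JOB_OUTPUT = os.path.join(SHARED_DIR, "out/video.encoded")

def transcode(src: str, dst: str) -> None:
    """Stand-in for the real encoding work."""
    tmp = dst + ".partial"
    with open(src, "rb") as f_in, open(tmp, "wb") as f_out:
        f_out.write(f_in.read())      # placeholder for actual transcoding
    os.replace(tmp, dst)              # atomic rename: output appears all-or-nothing

def main() -> None:
    if os.path.exists(JOB_OUTPUT):
        return                        # an earlier container already finished the job
    transcode(JOB_INPUT, JOB_OUTPUT)  # died halfway last time? just redo it

if __name__ == "__main__":
    main()
```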
Yeah, it seems like you're trivializing it, though.
I mean, there's a whole lot more behind that scene than what it feels like you're capturing in this conversation, Keith.
Yeah, I understand what you're saying, Keith, that containers have historically been stateless. And even today, if you look at Kubernetes or something like that, they're going to terminate containers at will whenever they think the need is there.
Yeah. Now, all the Kelsey Hightowers of the world, everyone who's designed this stuff, even if you interviewed Joe Beda a couple of months ago, they'll tell you, don't build stateful things with containers.
That's not, if you're building stateful things
with containers, you're asking yourself for a world of pain.
That's interesting.
Now, if the experts are telling us that, and we start as an enterprise to adopt containers to do things that are best designed for VMs, hey, you know what, proceed at your own peril.
I think we have a good data.
I think that what we're seeing, I certainly
know that Ray and I have seen storage
that is designed around the concept of containers so that that state,
particularly in terms of databases, is far more recoverable.
We're seeing products being created such that in a container-based environment, the storage can involve all of the details
necessary for a stateful interaction.
I haven't actually seen this stuff working, but I think that there's a lot going on in
the world of containers as it relates to the back-end storage.
Yeah, so I completely agree with you. I'm saying that the storage vendors are moving up
stack and replacing databases in the role of persistent storage in the container world.
That's all I'm saying.
But the container itself is a stateless being.
Except it has access to persistent volumes and stuff like that.
So every application server, every design has access to persistent storage.
You don't ever want that; it's always a bad design to have the application server, the thing processing the data, keep the state of the data. So whether that state is maintained on a storage array or if it's maintained in an Oracle
database, et cetera, whenever we start missing states, things that maintain states is when we
get in trouble. I think you're talking semantics here. I mean, the container pod is sitting there and has access to storage, you know, persistent storage, whether it's an Oracle database. The last time this Oracle server was booted, it was booted off this storage array.
So we'll just trust that this storage array is consistent and we'll just restart the Oracle database.
No, you got recovery logs and stuff that have to be run.
I understand all that stuff.
Exactly.
So that's very different than how containers
work.
Containers, we don't say...
When I can design a proper container
application, that's exactly
how it works.
When the container restarts,
I trust the persistent storage
layer. I trust that the storage
volume is consistent
and I can restart the application. I think that's a good start to the application.
I think we're going to have to end it here, gents.
This has been great.
Keith and Matt, anything you'd like to say before we leave?
Just wanted to personally thank you, Ray, for including me in the show this year.
I know it happened as a result of our friend exiting the independent space.
Speaking for myself, I've had a whole lot of fun doing the show with you.
Yeah, I've had a lot of fun too, man. I appreciate it.
I just hope it continues for a long time. It's been a blast.
Okay. Keith, anything? This, again, has been super fun.
Thanks for the opportunity, Ray.
It opened me to a whole new wonderful audience of people.
Somehow I'm now known as a storage guy.
Well, that's good.
Actually, I think storage guys are a great place to be.
All right.
Well, next time we'll talk to another system storage technology person.
Any questions you want us to ask, please let us know.
If you enjoy our podcast, tell your friends about it.
And please review us on iTunes and Google Play and Spotify as this will help us get the word out.
That's it for now.
Bye, Matt.
And bye, Keith.
Bye, Ray.
Bye, Ray.
Until next time.