Grey Beards on Systems - 107: GreyBeards talk MinIO’s support of VMware’s new Data Persistence Platform with AB Periasamy, CEO MinIO

Episode Date: September 25, 2020

Sponsored by: The GreyBeards have talked with Anand Babu (AB) Periasamy (@ABPeriasamy), CEO MinIO, before (see 097: GreyBeards talk open source S3… episode). And we also saw him earlier this year, at their headquarters for Storage Field Day 19 (SFD19) where AB gave a great discussion of what they were doing and how it worked …

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with Keith Townsend. Welcome to another sponsored episode of the GreyBeards on Storage podcast, a show where we get GreyBeards bloggers together with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. This GreyBeards on Storage episode is brought to you today by MinIO and was recorded on September 17, 2020. We have with us here today AB Periasamy, CEO of MinIO. So, AB, why don't you tell us a little bit about yourself
Starting point is 00:00:38 and the recent news about a partnership between VMware and MinIO. Great. Thank you for bringing me on this channel. It feels like not so long ago we spoke. Yeah, it was a couple months back, not that long ago. That's good news. Yeah, it feels like long ago, right? And myself, I'm AB, Anand Babu Periasamy.
Starting point is 00:00:59 If you dedupe my name, you get AB, right? And I'm one of the co-founders of MinIO, and MinIO is an object storage. And the recent announcement about the VMware partnership, it's actually a big step in the enterprise IT space, bringing Kubernetes to the enterprise. And MinIO now is natively available on VMware Tanzu, and it is available as a service in the data persistence services layer. Yeah, so VMware has been spending a lot of, I would say, development effort and marketing effort talking about Kubernetes on VMware, and Tanzu's the latest integration of that activity. And so I'm trying to understand how MinIO plays in the VMware environment nowadays. Yeah, we saw the industry was split until recently, right? There was enterprise IT that was file, block, and VMs, right?
Starting point is 00:02:01 Or HCI, the closest innovation they had there was HCI. And you saw the cloud native world, which was Kubernetes and containers and everything is elastic. These two worlds are fundamentally incompatible, right? Containers and data services. And the data services, either you have object storage or you have a database. And object storage is the primary storage out there, right? Whether it's Snowflake or Azure ML, Power BI, even static website hosting, everything in the cloud is built on object storage. But you look at the private cloud or in the enterprise, the IT environment, completely incompatible, right? That has shifted this time. And the way VMware did it, they did
Starting point is 00:02:42 such a clean job. If you see, the race all along is how do we modernize the enterprise IT so everybody in the end looks like AWS or one of these public clouds. And our journey was to give the enterprise the storage side of things, which is object storage. And this time around, what VMware did was, by bringing Kubernetes support natively into the heart of vSphere, it enabled us to go on top of VMware and take full advantage and bridge the worlds. You see the traction MinIO has, we pretty much own the Kubernetes space. It's the de facto object storage of choice in the private cloud, in the hybrid cloud. But when it came to enterprise IT, we were seen as shadow IT. Is this something where somebody on vCenter can just fire up a MinIO object storage cluster?
Starting point is 00:03:38 I'm trying to understand how this all plays out in the VMware environment. Yeah, and I'm a little bit even confused because when I talk to VMware about object storage, especially vSAN, I get my hand slapped. And they say, you know what, Keith? vSAN, no object storage, not optimized for object storage. It is basically file and VMFS. That is what they addressed this time, right? And the vCenter is the key, right?
Starting point is 00:04:06 So the IT still owns the physical layer, and they control when to buy, when to upgrade, when to fix the failed drive, the physical to virtual or physical to container. IT owns the physical resources and the SLA and stuff, right? How do they manage it? Through the vCenter. This time, IT can provision private cloud infrastructure, multi-tenant, like full Amazon-like capability,
Starting point is 00:04:30 but more enterprise-hardened, without ever learning to spell Kubernetes. And this is all done entirely through the vCenter. You don't even have to touch the kubectl command. You don't need to know even that it's underneath powered by Kubernetes. So it becomes just a vCenter data store? I'm not sure even that's the right terminology here, but isn't that what we're talking about here?
Starting point is 00:05:01 Yeah, so actually, I'm glad you paid attention to that detail. It's a very fine, subtle detail, but it actually is a huge shift in the industry, right? Like you saw how enterprise IT resisted software-defined storage, right? What VMware calls the data persistence layer is actually a huge upgrade to software-defined storage. They actually just moved the industry forward several steps. And this is all because that is how the public cloud operates already. It is no longer about storage.
Starting point is 00:05:31 It's actually about data. Now when it comes to data, actually they put database and object storage along the same lines. It's actually the same layer. In this announcement, the data persistence services, if you see, three of them are object storage and one of them is a Cassandra database. As you see more services come on, they are essentially going to be a data service: either a database for storing metadata or an object storage. The file and block are kind of gone, right? But
Starting point is 00:06:03 end of the day, the hard drive, or solid state or some block storage, is where you still need to save the data. So vSAN actually is the thin layer that virtualizes the physical drives or SSDs into a container storage interface, enabling high-performance object storage like MinIO or a database,
Starting point is 00:06:24 distributed database to run natively. So effectively, this is providing a persistent data layer, a persistent storage layer for the Tanzu container solution. But it seems like this sort of stuff also applies to normal VMs and stuff like that, wouldn't it? Yeah. In fact, the integration speaks volumes. Instead of just retrofitting, if you tell the customers, you can just run Kubernetes on top of VMs, nothing changes,
Starting point is 00:06:56 it's just a marketing campaign. That's not what VMware did. They actually brought Kubernetes into the vSphere layer, and they did some fundamental improvements in a way that Kubernetes now got the benefit of VM-like isolation, and you can now manage VMs and containers just alike. And with MinIO running in the supervisor cluster close to the vSAN Direct layer, you actually get the best of both worlds. And that to us is a big deal. So I know some of this firsthand. One of the big things that VMware did in the 1.0 release of vSphere 7.0 and Tanzu was implement namespaces, not just adopting container namespaces,
Starting point is 00:07:44 but adopting namespaces for vSphere and vCenter itself. Is MinIO tying into the concept of namespaces across VMs and containers and offering some new persistent layer of storage based on just Linux namespaces? Yeah, so the VMware namespace now is actually the Kubernetes namespace as well, because they are all kind of converged now, right? And namespace is the fundamental resource isolation. And now just like applications are isolated
Starting point is 00:08:19 from each other through namespace, and namespace is how IT would control how resources are allocated in a multi-tenant environment, that applies to the storage layer as well, meaning the object storage layer or a database layer. And even inside MinIO, when you provision new tenants, the tenants could be just different departments inside your company, or an MSP onboarding multiple customers, or even within a particular department, they may have multiple applications. They want like different SLAs and different isolation
Starting point is 00:08:50 security levels. So when you create multiple tenants, MinIO actually uses the same namespace to isolate even between the tenants. If you upgrade one tenant, you may even run different versions of MinIO at different times, and there is no disruption between the tenants and it is fully isolated. So the applications and the data services like MinIO or a database are all managed exactly like one fabric.
Starting point is 00:09:19 Oh gosh, how well does this thing perform under a vSAN solution with VMware and all that stuff? You mentioned the problem, right? Like previously, when you asked about vSAN, for object storage, they couldn't run. And we had the same problem. We, of course, would like to make it easier for IT to control the physical layer. We wanted to work on top of VMware. But the problem we had early on was getting vSAN to be able to hold petabytes of data.
Starting point is 00:09:50 It's not just the scalability part, right? The other real problem is vSAN as a software-defined network storage: if we are running in one container on, say, node 6, and it's attached to a drive that is on node 3, now every I/O that we perform, we cause write amplification, and we write across the network, and we also have to erasure code, right? This is the one that they beautifully fixed by introducing vSAN Direct,
Starting point is 00:10:16 which is new in the 7.0 Update 1. And vSAN Direct gives you host-local access. Also, it eliminates the RAID-controller-type bottleneck. If we actually get JBOD or JBOF-type access, you can now bring in thousands of drives to actually build a very large infrastructure. And still, IT, without hiring Amazon-like DevOps, can manage the private cloud environment all through vCenter. You mentioned petabytes. I mean, object storage is known for having sizable storage repositories, but I'm not sure I've seen many VMware installations with petabytes of storage in the past.
Starting point is 00:10:56 I can't think of too many vSANs in the petabytes. That's what we're talking about, right? Yeah, it was not possible before, but I can tell you the use cases exist. Here is the problem, right? In every customer base that we have, IT is kind of frustrated that all the data processing and AI/ML workloads are run by the Hadoop workloads, and the Hadoop guys are now ditching HDFS and moving to MinIO, and they went to Kubernetes. And IT couldn't manage those services. And then the other problem was even in the organizations that are entirely
Starting point is 00:11:31 managed under IT, like Splunk, for example, is actually growing really fast inside these organizations. And Splunk actually grows to petabytes in no time. The bulk of the organization's data growth is actually machine-generated logs and event data, and Splunk is getting standardized there instead of Hadoop and HDFS-type complicated services. And when we tried to bring Splunk and MinIO, like the Splunk SmartStore and MinIO combination, we couldn't run them on vSAN because of the same problem. And we actually have customer requirements, this is one of the banks I can't name. There are three different sites they have to consolidate, totaling like 70, 80 petabytes of data.
Starting point is 00:12:12 70 or 80 petabytes of data? Yeah. In one vSphere cluster or a couple of vSphere clusters, we're talking supercomputer stuff almost. Actually, you know, it looks very big, right? But not actually in the object storage space. Like if you see the dense deployments for MinIO, in one of our customers, they actually have 200 drives per chassis.
Starting point is 00:12:39 In just 16 servers, they are talking about 39 petabytes. 40 drives, 48 drives, 96 drives per chassis. 16 terabyte drives. These things are just amazing. And I don't have a problem with petabytes of storage being accessed by a vSphere cluster. Typically, we just look outside of HCI or BYOD type of solutions to do this. We're looking at, you know, purpose-built object store solutions or purpose-built filers that could handle that scale. It's really disruptive to think that you can get that, you know, in the native solution.
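As a rough plausibility check on the figures quoted above, the following Python sketch multiplies out 16 servers, 200 drives per chassis, and 16-terabyte drives. Several things here are assumptions rather than statements from the conversation: that the 16 TB drives belong to the 200-drive chassis, that decimal units apply (1 PB = 1000 TB), and that usable space is reduced by a 12 data + 4 parity erasure-coding layout, which is picked purely for illustration and may not match the customer's actual configuration.

```python
# Back-of-the-envelope capacity check for the dense deployment mentioned above.
# Assumptions (not from the conversation): decimal units, 16 TB drives in the
# 200-drive chassis, and an illustrative 12 data + 4 parity erasure-coding split.

def raw_capacity_pb(servers: int, drives_per_server: int, drive_tb: float) -> float:
    """Raw capacity in petabytes, before any redundancy overhead."""
    return servers * drives_per_server * drive_tb / 1000.0


def usable_capacity_pb(raw_pb: float, data_shards: int, parity_shards: int) -> float:
    """Usable capacity after erasure coding with the given shard split."""
    return raw_pb * data_shards / (data_shards + parity_shards)


raw = raw_capacity_pb(servers=16, drives_per_server=200, drive_tb=16)
usable = usable_capacity_pb(raw, data_shards=12, parity_shards=4)
print(f"raw: {raw:.1f} PB, usable: {usable:.1f} PB")
```

Under these assumptions the raw figure comes out above 50 PB and the usable figure lands in the high 30s, which is at least in the same range as the 39 petabytes quoted. A different drive size or parity setting would shift the result, so this should be read as a sanity check only.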
Starting point is 00:13:22 Yeah, a vSAN HCI solution. That's what we're talking about here, right? I can easily think of several use cases. It's just that I would never have tried it. It is disruptive. In fact, that is how I myself thought that when any way data cannot move around and elasticity and stateless,
Starting point is 00:13:41 it makes sense for the applications to be stateless containers. And particularly if you look at MinIO, the entire MinIO server is like a 45-megabyte static binary, and it's super easy to start. Even some average JavaScript developer can run MinIO, even if he or she doesn't know how to run Elasticsearch. It's that simple. Why would you actually bring it onto Kubernetes? Why would you put MinIO on containers? All these questions. For me, it wasn't obvious when we started. And we actually did not support Kubernetes. Even though it was designed to be cloud native, I always thought that they would just buy these dense machines, run MinIO on top of it,
Starting point is 00:14:22 keep it simple, and application would be on the containers, right? What actually happened was the community started maintaining these Helm charts. They actually started putting containers. And if you look at our downloads, they are basically more than 61% is all containers and Kubernetes type. And they are all, they basically, they are all like community and customers pushing us towards it. When I started asking these guys, why are you guys doing it, right? I was surprised just like you. And what they told me was they want to completely containerize their software infrastructure. Sounds very familiar, like how VMware, everything has to be virtualized, right?
Starting point is 00:15:02 This time they want to virtualize the data layer as well. Why? Because they are saying that they roll out their software updates multiple times a month, sometimes even multiple times a day. And this is crucial for them when they containerize. That's why they containerized and brought Kubernetes for orchestration. You can now deploy on edge or private cloud, public cloud, anywhere. And if you only virtualized or containerized the application side,
Starting point is 00:15:34 if you go to Azure, now you can't put an EMC appliance or a NetApp appliance there. You can't even buy it. This is where they want everything to be containerized. Right. So you guys get a lot of downloads. I mean, is it a highly active environment? I mean, yeah. I remember, as it is, it was already growing in the first two years, right? And then around 2017, we were just doing our Series A,
Starting point is 00:15:58 and it started just an exponential rise. And our investors are super excited, and I'm telling them, maybe it's one of the security fixes we did. Everybody's rushing to update. Don't count on it. It will fall down. And it actually started accelerating,
Starting point is 00:16:13 like even growing faster and faster. We are nearly doubling like every 18 months, actually. Oh my God. Yeah. And so this is kind of, you know, it all seems to be part of VMware's push to, I'll say, conquer the container world as they've conquered the enterprise IT world. It seems, right?
Starting point is 00:16:33 I mean, they're just trying to make this environment as useful to enterprise IT as they possibly can. Yes. Actually, for us, it was something that we wanted to do, but we could not do. Just like the rest of the world says, don't fight the cloud. For us, we were there. We were born in the cloud, right? But we didn't want to fight the IT because IT actually did important things like SLAs and upgrades, updates. They still run the infrastructure, right? We have to incorporate them, but we couldn't do it because we didn't want to be a hardware appliance company.
Starting point is 00:17:08 This time around, VMware bridged the Kubernetes world, the cloud-native world, and the enterprise IT into one fabric by allowing us to not retrofit, but run natively. This time, it made it possible. We didn't want to fight IT. Now we don't have to. So let's talk a little bit about that not fighting IT and integrating into existing flows. Because I like not fighting IT. I'm an IT guy. If I'm not ready for containers but I'm ready to move to vSphere 7.0,
Starting point is 00:17:44 what's the argument for MinIO in that environment? So if you're IT, right? If you see the industry, how it happened, right? How IT saw these new developments in their lab, almost every case in our customer base, right? It's very much like how Linux itself penetrated, and then even application services. Like, say, in the past,
Starting point is 00:18:08 you would buy WebLogic and, say, a DB2 or SQL Server license, and you would give it to the application developer: now go build the application. But nowadays that application team tells IT that not only am I running Cassandra or Elastic and Kafka, they are saying, I have even orchestrated everything. Now I manage
Starting point is 00:18:25 my application infrastructure, I push multiple times, it's all CI/CD. And IT is like, I don't know how to deal with that, and the applications team is like, let me do it. But for the applications team, the priority is not SLA, security, and a bunch of other things; they don't even know how to spec out the hardware. And this is where the way VMware integrated comes in. If you see the MinIO-specific case itself, when you go to the vCenter UI and provision, all you are saying is, for this tenant, how much capacity, how many nodes, how much memory and CPU resources you want to give. Then you say you want them to connect to an LDAP or OpenID identity manager,
Starting point is 00:19:12 an encryption service, you connect to a key management service. You are basically just setting what their bounds are, right? Because you are in a better place. You don't want some rogue application to take over, or even something unintentional, right? You still control that. You are doing all of that still without ever touching Kubernetes. But once a tenant is provisioned, then the applications team has the native Kubernetes API, and they can do everything API-driven. Your customers, IT's customers, that is the application team, would use the Kubernetes interface, but you would use the vCenter interface. And both of them are nicely integrated. Kind of like a shadow IT came into existence doing this sort of stuff.
Starting point is 00:19:56 And now they can actually do it on real IT infrastructure and sort of be administrated and managed to some extent by the real IT organization and stuff like that. It's really interesting. So it's almost time. I just wanted to ask one question about how is it working with VMware being a startup like yourself? Yeah. You know, we work very closely with them on this. And now I can tell you from my heart, right? Actually, the team was wonderful.
Starting point is 00:20:30 For a company of their size, we felt like they are just another startup of our size. They were moving fast at the same pace, and also, more than anything, right? They were resourceful like a startup. There is no bureaucracy, nothing in between. Getting things done, for a company of that size to behave like a startup was just stunning, right? For us, we don't want to do things from a press release and marketing point of view, right? We have to do something that is real and that will benefit our world, our customers, and our community. And the reason for us to get engaged with VMware was they were doing real,
Starting point is 00:21:10 they were fixing the problems the right way. And that is what enabled us and got us excited. But working with them closely, I think they are just a big startup. It seems like to me, ever since they started playing in this container space, their development cadence has started to increase. The vSAN team has always been, you know, really quickly adopting technologies and stuff like that. I don't know. Keith, you're kind of involved in that. What do you think of what they're doing these days? Yeah, you know, I've been doing podcasts around vSAN for probably the past five or six years, and I remember when it was vSAN, haha.
Starting point is 00:21:49 Wow, vSAN is in my data center. I'm running vSAN. So it's come a very long way. It's at feature parity. Most features enterprises care about for general purpose workloads are there. And this is just another example, right? Today, if you are rolling out, like, software-defined everything, cloud native, right? You need to have storage, networking, and compute.
Starting point is 00:22:19 All of them have to be containerized or VMs, virtualized. And that's how you can roll out anywhere. Otherwise, there is no cloud, right? And they are now there. They brought it together. Yeah, it's amazing. That's amazing. All right, so Keith, any last questions for AB
Starting point is 00:22:35 before we leave? No, I don't want to go down the rabbit hole of looking at solutions other than, no, I'll ask the question. What if I'm not a vSAN fan? What if I have VCF and I use VCF for the core? What about solutions outside of vSAN? So you can actually use MinIO
Starting point is 00:22:56 through TKGI, the Tanzu Kubernetes Grid Integrated edition. You can actually run MinIO just like a Kubernetes application, right? That is also possible, where you would mount anything that maps as a CSI, like a vSAN, like vVol, pretty much anything, right? But I actually liked the current integration with the vSAN
Starting point is 00:23:16 and how it's tightly integrated, and we are in the supervisor cluster with more privileges, and you have the vCenter console. But I think eventually, even though I can't speak for VMware, I think technically the same technology that is released can actually just support vVol as well. But they find it will be more like, do you want to support legacy SAN and NAS? Why would you build object storage on top of SAN and NAS? Not just object storage, even the distributed databases and data services, they took care of replication, erasure code, everything.
Starting point is 00:23:48 So vSAN is the right interface. So that's why they focused on this one rather than enabling vVol. But I don't know, maybe in the future it might happen, but it is more like it only benefits the legacy investment. If you have already made an investment into SAN or NAS, it makes sense, but for all new deployments, I think vSAN Direct is the way to go. Okay. AB, anything else you'd like to say to our listening audience before we close?
Starting point is 00:24:14 Just one small point. If you see the data persistence announcement, it is not just about hardware appliance vendors writing a CSI driver and claiming to be now Kubernetes compatible. This is actually the first time that you see the shift has happened, that storage software is now treated like a database and it has to be available as a container. This is where you see all the storage giants who are appliance vendors actually have no role to play. They have to go back to the drawing board to build something that is not only software
Starting point is 00:24:51 defined, it has to be container native object storage built from scratch, which is what we have been doing all along. It gave us a huge advantage. Yeah, I would say so. Well, this has been great. Thank you very much, AB, for being on our show today. And thanks again to MinIO for sponsoring this podcast. Thank you for having me. I always enjoy talking to you both. Thank you, Ray. Thank you, Keith. All right. That's it for now. Bye, Keith. Bye, Ray. And bye, AB. Bye, everyone.
Starting point is 00:25:21 Until next time. Thanks. Next time, we will talk to another system storage technology person. Any questions you want us to ask, please let us know. And if you enjoy our podcast, tell your friends about it. Please review us on Apple Podcasts, Google Play, and Spotify, as this will help get the word out. Thank you.
