Screaming in the Cloud - The Staying Power of Kubernetes with Kelsey Hightower

Starting point is 00:00:00 Hello and welcome to Screaming in the Cloud with your host, cloud economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. This episode is sponsored by CloudZero. ... engineering teams. And three, spending $50,000 on something by accident would be bananas in virtually any other industry. That's why CloudZero built a product that helps engineers correlate their costs with the engineering activity that caused them, so you can quickly detect anomalous spend, investigate the source of any cost, and slice up your AWS costs by the metrics that matter to you. Go to cloudzero.com to kick off a free trial.

Starting point is 00:01:06 That's cloudzero.com. And my thanks to them for sponsoring this episode. This episode is brought to you by DigitalOcean, the cloud provider that makes it easy for startups to deploy and scale modern web applications with, and this is important to me, no billing surprises. With simple predictable pricing that's flat across 12 global data center regions and a UX developers around the world love, you can control your cloud infrastructure costs and have more time for your team to focus on growing your business. See what businesses are building on DigitalOcean and get started for free at do.co slash screaming. That's do.co slash screaming. And my thanks to DigitalOcean for their continuing support of this ridiculous podcast. Welcome to Screaming in the Cloud.

Starting point is 00:02:02 I'm Corey Quinn. I'm joined this week by Kelsey Hightower, who claims to be a principal developer advocate at Google, but based upon various keynotes I've seen him in, he basically gets on stage and plays video games like Tetris in front of large audiences. So I assume he's somehow involved with esports. Kelsey, welcome to the show. You've outed me. Most people didn't know that I am a full-time esports Tetris champion at home, and the technology thing is just a side gig. Exactly.

Starting point is 00:02:32 It's one of those things you do just to keep the lights on, like you're waiting to get discovered, but in the meantime, you're waiting tables, same type of thing. Some people wait tables. You more or less sling Kubernetes, for lack of a better term. Yes. So let's dive right into this. You've been a strong proponent for a long time of Kubernetes and all of its intricacies and all the power that it unlocks.

Starting point is 00:02:53 And I've been pretty much the exact opposite of that, as far as saying it tends to be overcomplicated, that it's hype-driven, and a whole bunch of other, shall we say, criticisms that are sometimes grounded in reality, and sometimes just because I think it would be funny when I put them on Twitter. Where do you stand on the state of Kubernetes in 2020? So I want to make sure it's clear what I do, because when I started talking about Kubernetes, I was not working at Google. I was actually working at CoreOS, where we had a competitor to Kubernetes called Fleet.

Starting point is 00:03:30 And Kubernetes coming out kind of put this fork in our roadmap. Like, where do we go from here? What people saw me doing with Kubernetes was basically learning in public. I was really excited about the technology because it's attempting to solve a very complex thing. I think most people will agree building a distributed system is what cloud providers typically do, right? With VMs and hypervisors, those are very big, complex distributed systems. And before Kubernetes came out, the closest I got into a distributed system before working at CoreOS was just reading the various white papers on the subject and hearing stories about how Google has systems like Borg, tools like Mesos being used by some of the largest hyperscalers in the world, but I was never

Starting point is 00:04:18 going to have the chance to ever touch one of those unless I would go work at one of those companies. So when Kubernetes came out and the fact that it was open source and I could read the code to understand how it was implemented, to understand how schedulers actually work, and then bonus points for being able to contribute to it, those early years, what you saw me doing was just being so excited about systems that I attempted to build on my own becoming this new thing, just like when Linux came up. So I kind of agree with you that a lot of people look at it as more of a hype thing. They're looking at it regardless of their own needs,

Starting point is 00:04:56 regardless of understanding how it works and what problems it's trying to solve. But my stance on it, it's a really, really cool tool for the level that it operates in. And in order for it to be successful, people can't know that it's there. And I think that might be where part of my disconnect from Kubernetes comes into play. I have a background in ops, more or less the grumpy Unix sysadmin, because it's not like there's a second kind of Unix sysadmin you're ever going to encounter, where everything in development works in theory, but in practice, things pan out a little differently. I always joke that ops is the difference between theory and practice. In theory, devs can do everything and there's no ops needed. In practice, well,

Starting point is 00:05:38 it's been a burgeoning career for a while. The challenge with this is that Kubernetes at times exposes certain levels of detail that people generally would not want to have to think about or deal with, while papering over other things with layers of abstraction that obscure valuable troubleshooting information from someone running something in an operational context. It absolutely is a fascinating piece of technology, but it feels today like it is overly complicated for the uses a lot of people are attempting to put it to. Is that a fair criticism from where you sit?

Starting point is 00:06:22 I think the reason why it's a fair criticism is because there are people attempting to run their own Kubernetes cluster, right? So when we think about the cloud, unless you're in OpenStack land, but for the people who look at the cloud and you say, wow, this is much easier. There's an API for creating virtual machines. And I don't see the distributed state store that's keeping all of that together. I don't see the farm of hypervisors. So we don't necessarily think about the inherent complexity in a system like that because we just get to use it.

Starting point is 00:06:57 So on one end, if you're just a user of a Kubernetes cluster, maybe you're using something fully managed, or you have an ops team that's taking care of everything, your interface to the system becomes this Kubernetes configuration language where you say, give me a load balancer, give me three copies of this container running. And if we do it well, then you think it's a fairly easy system to deal with because you just say kubectl apply and things seem to start running. Just like in the cloud where you say, you know, AWS create this VM or gcloud compute instances create, you just submit API calls and things happen. I think the fact that Kubernetes is very

Starting point is 00:07:38 transparent to most people is now you can see the complexity, right? Imagine everyone driving with the hood off the car. You'd be looking at a lot of moving things, but we have hoods on cars to hide the complexity. And all we expose is the steering wheel and the pedals. That car is super complex, but we don't see it. So therefore we don't attribute this complexity to the driving experience. This, to some extent, feels like it's on the same axis as serverless, with just a different level of abstraction piled onto it. And while I am a large proponent of serverless, I think it's fantastic for a lot of greenfield projects. The constraints inherent to the model mean that it is almost completely untenable for a tremendous number of existing workloads.
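The "give me three copies of this container running" interface described above is declarative: the user states a desired end state and the platform works out what to change. A minimal Go sketch of that idea follows. It is an illustration under stated assumptions, not Kubernetes code: `DesiredState`, `Reconcile`, and the image name `example/web:1.0` are all hypothetical names invented for this example.

```go
package main

import "fmt"

// DesiredState is a hypothetical, simplified stand-in for what a user
// declares: "run this many copies of this container image".
type DesiredState struct {
	Image    string
	Replicas int
}

// Reconcile compares the number of copies currently running with the
// declared count and returns how many to start (positive) or stop (negative).
// Assumption: a real platform would act on this difference; here we only compute it.
func Reconcile(desired DesiredState, running int) int {
	return desired.Replicas - running
}

func main() {
	// Hypothetical declaration: three copies of an example image.
	want := DesiredState{Image: "example/web:1.0", Replicas: 3}

	// Try a few observed states: nothing running, exactly right, too many.
	for _, running := range []int{0, 3, 5} {
		delta := Reconcile(want, running)
		fmt.Printf("image=%s running=%d change=%d\n", want.Image, running, delta)
	}
}
```

The user only ever edits the declaration; the comparison against what is actually running is the part that stays under the hood.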

Starting point is 00:08:25 Some developers like to call it legacy, but when I hear the term legacy, I hear it makes actual money. So just treating it as, oh, it's a science experiment we can throw into a new environment and spend a bunch of time rewriting for minimal gains, is just not going to happen as companies undergo digital transformations, if you'll pardon the term. Yeah, so I think you're right. So if you take, let's say, let's take Amazon's Lambda, for example. It's a very opinionated, high-level platform that assumes you're going to build apps a certain way.

Starting point is 00:08:56 And if that's you, look, go forward. Now, one or two levels below that, there is this distributed system. Kubernetes decided to play in that space because everyone that's building other platforms needs a place to start. Like the analogy I like to think of is like in the mobile space, iOS and Android deal with the complexities of managing multiple applications on a mobile device, security aspects, app stores, that kind of thing. And then you as a developer, you build your thing on top of those platforms and APIs and frameworks. Now, it's debatable. Someone would say, why do we even need an open source implementation of such a complex system? Why not just everyone move to the cloud and then everyone that's not in the cloud on

Starting point is 00:09:44 premise gets left behind? But that's not how open source typically works, right? The reason why we have Linux, the precursor to the cloud, is because someone looked at the big proprietary Unix systems and decided to re-implement them in a way that anyone could run those systems. So when you look at Kubernetes, you have to look at it from that lens. It's the ability to democratize these platform layers in a way that other people can innovate on top. That doesn't necessarily mean that everyone needs to start with Kubernetes,

Starting point is 00:10:18 just like not everyone needs to start with a Linux server, but it's there for you to build the next thing on top of if that's the route you want to go. It's been almost a year now since I made an original tweet about this, that in five years, no one will care about Kubernetes. So now I guess I have four years running on that clock. And that attracted a bit of, shall we say, controversy. There were people who thought that I meant that it was going to be a flash in the pan and it would dry up and blow away. But my impression of it is that in, well, four years now, it will have become more or less systemd for the data center in that there's a bunch of complexity under the hood.

Starting point is 00:10:57 It does a bunch of things. No one sensible wants to spend all their time mucking around with it in most companies. But it's not something that people have to think about on an ongoing basis the way it feels like we do today. I mean, to me, I kind of see this as the natural evolution, right? It's new. It gets a lot of attention. And kind of the assumption you make in that statement is there's something better that should be able to arise given that checkpoint. If this is what people think is hot, within five years, surely we should see something else that can be deserving of that attention, right? Docker comes out and almost four or five years later, you have Kubernetes. So it's obvious that there

Starting point is 00:11:38 should be a progression here that steals some of the attention away from Kubernetes. But I think it's so new, right? It's only five years in. Linux is over 20 years old now at this point. And it's still top of mind for a lot of people, right? Microsoft is still porting a lot of Windows-only things into Linux. So we still discuss the differences between Windows and Linux. The cloud, for the most part, is driven by Linux virtual machines, and I think the majority of workloads still run on virtual machines to this day.

Starting point is 00:12:11 So it's still front and center, especially if you're a system administrator managing VMs, right? You're dealing with tools that target Linux, you know the syscall interface, and you're thinking about how to secure it and lock it down. Kubernetes is just at the very first part of that lifecycle,

Starting point is 00:12:26 where it's new, we're all interested in even what it is and how it works, and now we're starting to move into that next phase, which is the distro phase. Like in Linux, you had Red Hat, Slackware, Ubuntu, special-purpose distros. Some would consider Android a special-purpose distribution of Linux for mobile devices. And now that we're just in this distro phase, that's going to go on for another five to 10 years where people start to align themselves around, maybe it's OpenShift, maybe it's GKE,

Starting point is 00:12:57 maybe it's Fargate for EKS. These are now distributions built on top of Kubernetes that start to add a little bit more opinion about how Kubernetes should be put together. And then we'll enter another phase where you will build a platform on top of Kubernetes, but it won't be worth mentioning that Kubernetes is underneath because people will be more interested in the thing above. I think we're already seeing that now in terms of people no longer really care that much what operating system they're running, let alone what distribution of that operating system. The things that you have to care about slip below the surface of awareness.

Starting point is 00:13:35 And we've seen this for a long time now. Originally, to install a web server, it wound up taking a few days and an intimate knowledge of GCC compiler flags, then RPM or dpkg, and then yum on top of that, then ensure installed once we had configuration management that was halfway decent, then docker run whatever it is. And today it feels like it's, oh, with serverless technologies being what they are, it's effectively push a file to S3 or its equivalent somewhere else, and you're done. The things that people have to be aware of and the barrier to entry continually lower.

Starting point is 00:14:08 The downside to that, of course, is that things that people specialize in today and effectively make very lucrative careers out of are going to be not front and center in five to 10 years the way that they are today. And that's always been the way of technology. It's a treadmill to some extent. And on the flip side of that, look at all of the new jobs that are centered around these cloud-native technologies, right? So, you know, we're just going to make up some numbers here.

Starting point is 00:14:33 I mean, if there were only 10,000 jobs around just Linux system administration, now when you look at this whole Kubernetes landscape where people are saying, we can actually do a better job with metrics and monitoring, observability is now a thing culturally that people assume you should have because you're dealing with these distributed systems. The ability to start thinking about multi-regional deployments when I think that would have been infeasible with the previous tools or you'd have to build all those tools yourself. So I think now we're starting to see a lot more opportunities where instead of 10,000 people, maybe you need 20,000 people because now you have the tools necessary to tackle bigger projects where you didn't see that before. That's what's going to be really neat to see. But the challenge is always to people who are steeped in existing technologies,

Starting point is 00:15:20 what does this mean for them? I mean, I spent a lot of time early in my career fighting against cloud because I thought that it was taking away a cornerstone of my identity. I was a large-scale Unix administrator, specifically focusing on email. Well, it turns out that there aren't nearly as many companies that need to have that particular skill set in-house as did 10 years ago. And what we're seeing now is this sort of forced evolution of people's skill sets, or they hunker down on a particular area of technology or particular application and try to make a bet that they can ride that out until retirement. It's challenging, but at some point, it seems that some folks like to stop learning.

Starting point is 00:15:59 And I don't fully pretend to understand that. I'm sure I will someday where, no, at this point, technology's come far enough. We're going to stop here and anything after this is garbage. I hope not, but I can see a world in which that happens. Yeah. And I also think one thing that we don't talk a lot about in the Kubernetes community is that Kubernetes makes hyper-specialization worth doing. Because now you start to have a clear separation of concerns. Now the OS can be hyper-focused on security, system calls, and not necessarily packaging every programming language under the sun into a single distribution. So we can kind of move part of that layer out of the core OS and start to just think about the OS being a security boundary where we try to lock things down.

Starting point is 00:16:49 And for some people that play at that layer, they have a lot of work ahead of them in locking down these system calls, improving the idea of containerization, whether that's something like Firecracker or some of the work that you see VMware doing. That's going to be a whole class of hyper-specialization. And the reason why they're going to be able to focus now is because we're starting to move into a world, whether that's serverless or the Kubernetes API, we're saying we should deploy applications that don't target machines. I mean, just that step alone is going to allow for so much specialization at the various layers, because even on the networking front, which arguably has been a specialization up until this point, can truly specialize because now the IP assignments, how networking fits together, is also abstracted away one more step where you're not asking for interfaces or binding to a specific port or playing with port mappings. You can now let the platform do that. So I think for some of the people who may be not as interested

Starting point is 00:17:51 as moving up the stack, they need to be aware that the number of people we need being hyper-specialized at Linux administration will definitely shrink. And a lot of that work will move up the stack,

Starting point is 00:18:01 whether that's Kubernetes or managing a serverless deployment and all the configuration that goes with that. But if you are a Linux person, like that is your bread and butter, I think there's going to be an opportunity to go super deep, but you may have to expand into things like security and not just things like configuration management. Let's call it the unfulfilled promise of Kubernetes. On paper, I love what it hints at being possible. Namely, if I build something that runs well on top of Kubernetes, then we truly have a

Starting point is 00:18:32 write once, run anywhere type of environment. Stop me if you've heard that one before, 50,000 times in our industry in recorded history. But in practice, as has happened before, it seems like it tends to fall down for one reason or another. Now, Amazon is famous for many reasons, but the one that I like to pick on them for is you can't say the word multi-cloud at their events. Right. That'll change people's perspective. Good job. People tend to see multi-cloud through a couple of different lenses. I've been rather anti-multi-cloud from the perspective that setting out on day one to build an application with the idea that it can be run on top of any cloud provider, or even on-premises, if that's what you want to do, is generally not the way to proceed. You wind up having to

Starting point is 00:19:20 make certain trade-offs along the way. You have to rebuild anything that isn't consistent between those providers, and it slows you down. Kubernetes, on the other hand, hints that if it works and fulfills this promise, you can suddenly abstract an awful lot beyond that and just write generic applications that can run anywhere. Where do you stand on the whole multi-cloud topic? So I think we have to make sure we talk about the different layers that are kind of ready for this thing. So for example, like multi-cloud networking, we just call that networking, right? What's the IP address over there? I can just hit it. So we don't make a big deal about multi-cloud networking. Now there's an area where people say, how do I configure the various cloud providers?

Starting point is 00:20:06 And I think the healthy way to think about this is in your own data centers, right? So we know a lot of people have investments on-premises. Now, if you were to take the mindset that you only need one provider, then you would try to buy everything from HP, right? You would buy HP storage devices. You would buy HP racks, power. Oh, maybe HP doesn't sell air conditioners. So you're going to have to buy an air conditioner from a vendor who specializes in making air conditioners,

Starting point is 00:20:35 hopefully for a data center and not your house. So now you've entered this world where one vendor doesn't make every single piece that you need. Now in the data center, we don't say, oh, I'm multi-vendor in my data center. Typically, you just buy the switches that you need, you buy the power racks that you need, you buy the Ethernet cables that you need, and they have common interfaces that allow them to connect together. And they typically have different configuration languages

Starting point is 00:21:03 and methods for configuring those components. The cloud, on the other hand, also represents the same kind of opportunity. There are some people who really love DynamoDB and S3, but then they may prefer something like BigQuery to analyze the data that they're uploading into S3. Now, if this was a data center, you would just buy all three of those things and put them in the same rack and call it good. But the cloud presents this other challenge. How do you authenticate to those systems?

Starting point is 00:21:34 And then there's usually this additional networking cost, egress or ingress charges that make it prohibitive to say, I want to use two different products from two different vendors. Data gravity winds up causing serious problems. Yeah, so it's that data gravity, the associated cost becomes a little bit more in your face. Whereas in a data center, you kind of feel that the cost has already been paid.

Starting point is 00:21:55 I already have a network switch with enough bandwidth. I have an extra port on my switch to plug this thing in and they're all standard interfaces. Why not? So I think the multi-cloud gets lost in the true problem, which is the barrier to entry of leveraging things across two different providers because of networking and configuration practices. That's often the challenge I think that people get bogged down in. On an earlier episode of the

Starting point is 00:22:23 show, we had Mitchell Hashimoto on, and his entire theory around using Terraform to wind up configuring various bits of infrastructure was not the idea of workload portability, because that feels like the windmill we all keep tilting at and failing to hit. But instead, the idea of workflow portability, where different things can wind up being interacted with in the same way. So if this one division is on one cloud provider, the others are on something else, then you at least can have some points of consistency in how you interact with those things. And in the event that you do need to move, you don't have to effectively redo all of your

Starting point is 00:22:57 CI/CD process, all of your tooling, et cetera. And I thought that there was something compelling about that argument. And that's actually what Kubernetes does for a lot of people. For Kubernetes, if you think about it, when we start to talk about workflow consistency, if you want to deploy an application, kubectl apply, some config, you want the application to have a load balancer in front of it, regardless of the cloud provider, because Kubernetes has an extension point we call the cloud provider. And that's where Amazon, Azure, Google Cloud, we do all the heavy lifting of mapping the high-level ingress object that specifies I want a load balancer, maybe a few options

Starting point is 00:23:37 to the actual implementation detail. So maybe you don't have to use four or five different tools. And that's where that kind of workload portability comes from. Like if you think about Linux, right, it has a set of system calls for the most part. Even if you're using a different distro at this point, Red Hat or Amazon Linux or Google's container optimized Linux. If I build a Go binary on my laptop, I can scp it to any of those Linux machines, and it's going to probably run. So you could call that multi-cloud, but that doesn't make a lot of sense because it's just the way Linux works. Kubernetes does something very similar because it sits right

Starting point is 00:24:16 on top of Linux. So you get the portability just from the previous example, and then you get the other portability and workflows, like you just stated, where I'm calling kubectl apply, and I'm using the same workflow to get resources spun up on the various cloud providers, even if that configuration isn't one-to-one identical. This episode is sponsored in part by DataStax. The NoSQL event of the year is DataStax Accelerate in San Diego this May from the 11th through the 13th. I've given a talk previously called The Myth of Multicloud, and it's time for me to revisit that with a sequel, which is funny given that it's a NoSQL conference, but there you have it. To learn more, visit datastax.com. That's D-A-T-A-S-T-A-X dot com. And I hope to see you in San Diego this May.
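The workflow portability discussed before the sponsor break, one high-level "I want a load balancer" request that each cloud maps to its own implementation, can be sketched as a small Go interface. This is a hedged illustration only: `CloudProvider`, `LoadBalancerRequest`, `providerA`, `providerB`, and `Apply` are hypothetical names made up for this example and are not the real Kubernetes cloud provider interface.

```go
package main

import "fmt"

// LoadBalancerRequest is a hypothetical high-level request:
// "put a load balancer in front of this application".
type LoadBalancerRequest struct {
	Name string
	Port int
}

// CloudProvider is a hypothetical extension point. Each provider maps the
// same high-level request onto its own implementation detail.
type CloudProvider interface {
	EnsureLoadBalancer(req LoadBalancerRequest) string
}

// providerA and providerB are made-up providers used only for illustration.
type providerA struct{}
type providerB struct{}

func (providerA) EnsureLoadBalancer(req LoadBalancerRequest) string {
	return fmt.Sprintf("provider-a: created load balancer %q on port %d", req.Name, req.Port)
}

func (providerB) EnsureLoadBalancer(req LoadBalancerRequest) string {
	return fmt.Sprintf("provider-b: created forwarding rule %q on port %d", req.Name, req.Port)
}

// Apply is the single workflow the user sees, whichever provider is plugged in.
func Apply(p CloudProvider, req LoadBalancerRequest) string {
	return p.EnsureLoadBalancer(req)
}

func main() {
	req := LoadBalancerRequest{Name: "web", Port: 80}
	for _, p := range []CloudProvider{providerA{}, providerB{}} {
		fmt.Println(Apply(p, req))
	}
}
```

The request and the `Apply` call are identical for both providers; only the provider-specific result differs, which mirrors the point that the configuration need not be one-to-one identical for the workflow to be the same.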

Starting point is 00:25:09 One thing I'm curious about is you wind up walking through the world and seeing companies adopting Kubernetes in different ways. What are you finding the adoption of Kubernetes looks like inside of big-E enterprise-style companies? I don't have as much insight into those environments as I probably should. That's sort of a focus area for the next year for me. But in startups, it seems that it's either someone goes ahead and rolls it out and suddenly it's fantastic, or they avoid it entirely and do something serverless. In large enterprises, I see a lot of Kubernetes and a lot of Kubernetes stories coming out of it. But what isn't usually told is, what's the tipping point where they say, yeah, let's try this? Or here's the problem we're trying to solve for. Let's chase it.

Starting point is 00:25:53 What I see is enterprises buy everything. If you're big enough and you have a big enough IT budget, most enterprises have a POC of everything that's for sale, period. There's some team in some pocket. Maybe they came through via acquisition. Maybe they live in a different state. Maybe it's just a new project that came out. And what you tend to see, at least from my experience, is if I walk into a typical enterprise, they may tell me something like, hey, we have a POC of Pivotal Cloud Foundry, OpenShift,

Starting point is 00:26:26 and we want some of that new thing that we just saw from you guys. How do we get a POC going? So there's always this appetite to evaluate what's for sale, right? So that's one case. There's another case where when you start to think about an enterprise, there's a big range of skill sets. Sometimes I'll go to some companies like, oh, my insurance is through that company. And there's ex-Googlers that work there that used to work on things like Borg or something else. And they kind of know how these systems work.

Starting point is 00:26:55 And they have a slightly better edge at evaluating whether Kubernetes is any good for the problem at hand. And you'll see them bring it in. Now, that same company, I could drive over to the other campus, maybe it's five miles away, and that team doesn't even know what Kubernetes is. And for them, they're going to be chugging along with what they're currently doing. So then the challenge becomes, if Kubernetes is a great fit, how wide of a fit is it? How many teams at that company should be using it? So what I'm currently seeing is there are some enterprises that have found a way to make Kubernetes the place

Starting point is 00:27:33 where they do a lot of new work, because that makes sense. A lot of enterprises, to my surprise, though, are actually stepping back and saying, you know what? We've been stitching together our own platform for the last five years. We had the Netflix stack. We got some Spring Boot. We got Consul. We got Vault. We got Docker. And now this whole thing is getting a little more fragile because we're doing all of this glue code. We've been trying to build our own Kubernetes. And now that we know what it is, and we know what it isn't, we know that we can probably get rid of this kind of bespoke stack ourselves. And just because of the ecosystem, right?

Starting point is 00:28:10 If I go to HashiCorp's website, I will probably find the word Kubernetes as much as I find the word Nomad on their site because they've made things like Consul and Vault become first-class offerings inside of the world of Kubernetes. So I think it's that momentum that you see across even people like Oracle, Juniper, Palo Alto Networks, they all seem to have a Kubernetes story. And this is why you start to see the enterprise able to adopt it because it's so much in their face and it's where the ecosystem is going. It feels like a lot of the excitement and the promise and even the same problems that Kubernetes is aimed at today could have just as easily been talked about half a decade ago in the context of OpenStack. And for better or worse, OpenStack is nowhere near where it once was.

Starting point is 00:29:00 It felt like it had such promise and such potential. And when it didn't pan out, that left a lot of people feeling relatively sad, burnt out, depressed, etc. And I'm seeing a lot of parallels today, at least, between what was said about OpenStack and what was said about Kubernetes. How do you see those two diverging? I will tell you the big difference that I saw personally, just from my personal journey outside of Google, just having that option. And I remember I was working at a company and we were like, we're going to roll our own OpenStack. We're going to buy a FreeBSD box and make it a file server. We're going all open source.

Starting point is 00:29:37 I was like, do whatever you want to do. And that was just having so many issues in terms of first-class integrations, education, people with the skills to even do that. I was like, you know what, let's just cut the check for VMware. Like, we want virtualization. VMware, for the cost and what it does, it's good enough. Or we can just actually use a cloud provider. That space in many ways was a purely solved problem. Now let's fast forward to Kubernetes. And also when you got OpenStack finished, you were just back where you started. You got a bunch of VMs and now you got to go figure out how to build the real platform that people want to use because no one just wants a VM. If you

Starting point is 00:30:20 think Kubernetes is low level, just having OpenStack, even if OpenStack was perfect, you're still at square one for the most part. Maybe you can just say now I'm paying a little less money for my stack in terms of a software licensing cost. But from an abstraction and automation and API standpoint, I don't think OpenStack moved the needle in that regard. Now in the Kubernetes world, it's solving a huge gap. Lots of people have virtual machine sprawl, then they had Docker sprawl. And when you bring in this thing like Kubernetes, it says, you know what, let's rein all of that in. Let's build some first-class abstractions, assuming that the layer below us

Starting point is 00:31:04 is a solved problem. You gotta remember, when Kubernetes came out, it wasn't trying to replace the hypervisor. It assumed it was there. It also assumed that the hypervisor had APIs for creating virtual machines and attaching disks and creating load balancers. So Kubernetes came out as a complementary technology, not one looking to replace. And I think that's why it was able to stick, because it solved a problem at another layer where there... There were a lot of interesting people behind it, fascinating organizations, but then you wound up looking through the backers of the foundation behind it and the rest, and there were something like 500 companies behind it. An awful lot of them were these giant organizations that were big-E corporate IT enterprise software vendors,

Starting point is 00:32:02 and you take a look at that. We're not going to name anyone because at that point, oh, well, we get letters. But at that point, you start seeing so many of the patterns being worked into it that it almost feels like it has to collapse under its own weight. I don't, for better or worse, get the sense that Kubernetes is succumbing to the same thing, despite the CNCF having an awful lot of those same backers behind it, and as far as I can tell, significantly more money. They seem to have all the money to throw at these sorts of things. So I'm wondering how Kubernetes has managed to effectively sidestep the, I guess, the open source miasma that OpenStack didn't quite manage to avoid. Kubernetes gained its own identity before the foundation existed. Its purpose, if you think

Starting point is 00:32:48 back from the Borg paper almost eight years prior, maybe even 10 years prior, it defined this problem really, really well. I think Mesos came out and also had a slightly different take on this problem. And you could just see at that time, there was a real need. You had choices between Docker Swarm, Nomad. It seems like everybody was trying to fill in this gap because across most verticals or industries, this was a true problem worth solving. What Kubernetes did was played in that exact same sandbox, but it kind of got put out with experience. It's not like, oh, let's just copy this thing that already exists, but let's just make it open. And in that case, you don't really

Starting point is 00:33:30 have your own identity. It's you versus Amazon in the case of OpenStack. It's you versus VMware. And that's just really a hard place to be in because you don't have an identity that stands alone. Kubernetes itself had an identity that stood alone. It comes from this experience of running a system like this. It comes from research and white papers. It comes after previous attempts at solving this problem. So we agree that this problem needs to be solved. We know what layer it needs to be solved at. We just didn't get it right yet. So Kubernetes didn't necessarily try to get it right. It tried to start with only the primitives necessary to focus on the problem at hand. Now, to your point,

Starting point is 00:34:13 the extension interface of Kubernetes is what keeps it small. Years ago, I remember plenty of meetings where we all got in rooms and said, this thing is done. It doesn't need to be a PaaS. It doesn't need to compete with serverless platforms. The core of Kubernetes, like Linux, is largely done. Here's the core objects, and we're going to make a very great extension interface. We're going to make one for the container runtime level, so that way people can swap that out if they really want to.

Starting point is 00:34:43 And we're going to do one that makes other APIs as first class as the ones we have. And we don't need to try to boil the ocean in every Kubernetes release. Everyone else has the ability to deploy extensions just like Linux. And I think that's why we're avoiding some of this tension in the vendor world, because you don't have to change the core to get something that feels like a native part of Kubernetes. What do you think is currently the most misinterpreted or misunderstood aspect of Kubernetes in the ecosystem? I think the biggest thing that's misunderstood is what Kubernetes actually is. And the thing that made it click for me, especially when I was writing the tutorial, Kubernetes the Hard Way. I had to sit down and ask myself, where do you start

Starting point is 00:35:32 trying to learn what Kubernetes is? So I start with the database, right? The configuration store isn't Postgres. It isn't MySQL. It's etcd. Why? Because we're not trying to be this generic data storage platform. We just need to store configuration data. Great. Now, do we let all the components talk to etcd? No. We have this API server. And between the API server and the chosen data store, that's essentially what Kubernetes is. You can stop there. At that point, you have a valid Kubernetes cluster, and it can understand a few things. Like I can say, using the Kubernetes command line tool, create this configuration map that stores configuration data,

Starting point is 00:36:17 and I can read it back. Great. Now, I can't do a lot of things that are interesting with that. Maybe I just use it as a configuration store. But then if I want to build a container platform, I can install the Kubernetes Kubelet agent on a bunch of machines and have it talk to the API server looking for other objects. You add in the scheduler and all the other components. So what that means is that Kubernetes' most important component is its API, because that's how the whole system is built.
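The two-component core described here (a data store plus an API server that is the only thing allowed to talk to it) can be sketched in a few lines. This is a toy in-memory model, not the Kubernetes API: the class names `KeyValueStore` and `ApiServer`, the key layout, and the `app-settings` object are all hypothetical and chosen only for illustration.

```python
import copy


class KeyValueStore:
    """Hypothetical stand-in for the configuration store (etcd in the talk)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store a copy so callers cannot mutate stored state by accident.
        self._data[key] = copy.deepcopy(value)

    def get(self, key):
        # Raises KeyError if the key was never written.
        return copy.deepcopy(self._data[key])


class ApiServer:
    """Hypothetical front door: the only component that touches the store."""

    def __init__(self, store):
        self._store = store

    def create(self, kind, name, spec):
        obj = {"kind": kind, "name": name, "spec": spec}
        self._store.put(f"/{kind}/{name}", obj)
        return obj

    def read(self, kind, name):
        return self._store.get(f"/{kind}/{name}")


# Create a configuration object and read it back, as in the example above.
api = ApiServer(KeyValueStore())
api.create("ConfigMap", "app-settings", {"log_level": "debug"})
stored = api.read("ConfigMap", "app-settings")
print(stored["spec"]["log_level"])
```

Everything else in the conversation (agents, schedulers, controllers) would be a client of `ApiServer` in this model rather than a client of the store.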

Starting point is 00:36:46 It's actually a very simple system when you think about just those two components in isolation. If you want a container management tool, then you need a scheduler, controller manager, cloud provider integrations, and now you have a container tool. But let's say you want a service mesh platform. Well, in a service mesh, you have a data plane that can be Nginx or Envoy, and that's going to handle routing traffic. And you need a control plane. That's going to be something that takes in configuration, and it uses that to configure all the things in a data plane.

Starting point is 00:37:17 Well, guess what? Kubernetes is 90% there in terms of a control plane with just those two components, the API server and the data store. So now when you want to build control planes, if you start with the Kubernetes API, we call it the API machinery, you're going to be 95% there. Then what do you get? You get a distributed system that can handle kind of failures on the back end, thanks to

Starting point is 00:37:41 etcd. You're going to get RBAC so you can have permission on top of your schemas. And there's a built-in framework, we call it custom resource definitions, that allows you to articulate a schema, and then your own control loops provide meaning to that schema. And once you do those two things, you can build any platform you want. And I think that's one thing that it takes a while for people to understand, that part of Kubernetes, that the thing we talk about today for the most part is just the first system that we built on top of this. I think that's a very far-reaching story with implications that I'm not entirely sure I'm able to wrap my head around. I hope to see it. I really do.
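The "schema plus control loop" idea can be illustrated with a toy model. This is a sketch under stated assumptions, not how custom resource definitions are actually implemented: the `WebApp` kind, its `replicas` field, and the functions `register_schema`, `apply`, and `reconcile_once` are hypothetical names invented for the example.

```python
# Toy model: a registered schema plus a control loop that gives it meaning.
# All names are hypothetical; this is not the Kubernetes API.

SCHEMAS = {}   # kind -> required spec fields and their types
DESIRED = {}   # (kind, name) -> spec, i.e. what the user asked for
ACTUAL = {}    # name -> running replica count, i.e. what exists right now


def register_schema(kind, fields):
    """Declare a new kind and the fields its spec must carry."""
    SCHEMAS[kind] = fields


def apply(kind, name, spec):
    """Validate spec against the schema, then record it as desired state."""
    for field, expected_type in SCHEMAS[kind].items():
        if not isinstance(spec.get(field), expected_type):
            raise ValueError(f"{kind}.{field} must be {expected_type.__name__}")
    DESIRED[(kind, name)] = dict(spec)


def reconcile_once():
    """One pass of the control loop: move actual state toward desired state."""
    actions = []
    for (kind, name), spec in DESIRED.items():
        if kind != "WebApp":
            continue
        running = ACTUAL.get(name, 0)
        if running != spec["replicas"]:
            actions.append(f"scale {name} {running}->{spec['replicas']}")
            ACTUAL[name] = spec["replicas"]
    return actions


register_schema("WebApp", {"replicas": int})
apply("WebApp", "shop", {"replicas": 3})
first_pass = reconcile_once()   # acts, because nothing is running yet
second_pass = reconcile_once()  # does nothing, because state already matches
print(first_pass)
print(second_pass)
```

The schema only says what a valid `WebApp` looks like; the loop is what makes `replicas: 3` mean something. Running the loop a second time produces no actions, which is the property that lets such loops run continuously.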

Starting point is 00:38:29 I mean, you mentioned writing Kubernetes the Hard Way, your tutorial, which I'll link to in the show notes. I mean, my, of course, sarcastic response to that recently was to register the domain Kubernetes the easy way, and just repoint it to Amazon's ECS, which is in no way, shape, or form Kubernetes, and basically has the effect of irritating absolutely everyone, as is my typical pattern of behavior on Twitter. But I have been meaning to dive into Kubernetes at a deeper level. And your stuff, stuff that you've written, not just the online tutorials, but also the books, has always been my first port of call when it comes to that. The hard part, of course, is there's just never enough hours in the day. And one thing to think about too is like the web.

Starting point is 00:39:06 We have the internet, there's web pages, there's web browsers. Web browsers talk to web servers over HTTP. There's verbs, there's bodies, there's headers. And if you look at it, that's like a very big complex system. If I were to extract out the protocol pieces, this concept of HTTP verbs, get, put, post,

Starting point is 00:39:25 and delete, this idea that I can put stuff in a body and I can give it headers to give it other meaning and semantics, if I just take those pieces, I can build RESTful APIs. Hell, I can even build GraphQL. And those are just different systems built on the same API machinery that we call the internet or the web today. But you have to really dig into the details and pull that part out. And you can build all kinds of other platforms. And I think that's what Kubernetes is. It's going to probably take people a little while longer to see that piece, but it's hidden in there. And that's that piece that's going to be, like you said, it's going to probably be the foundation for building more control planes.

Starting point is 00:40:05 And when people build control planes, I think if you think about it, maybe Fargate for EKS represents another control plane for making a serverless platform that takes the Kubernetes API, even though the implementation isn't necessarily what you find on GitHub. That's the truth. Whenever you see something as broadly adopted as Kubernetes, there's always the question of, okay, there's an awful lot of blog posts, getting started to it, learn it in 10 minutes. I mean, at some point, I'm sure there are some people still convinced Kubernetes is in fact a breakfast cereal. Based upon what some of the stuff the CNCF has gotten up to, I wouldn't necessarily bet against it. Socks today, breakfast cereal tomorrow. But it's hard to find a decent level

Starting point is 00:40:45 of quality. Finding a certain quality bar, a trusted source to get started with is important. Some people believe in the hero's journey, story of narrative building. I always prefer to go with the moron's journey because I'm the moron. I touch technologies. I have no idea what they do and figure it out and go careening into edge and corner cases constantly. And by the end of it, I have something that vaguely sort of works and my understanding's improved. But I've gone down so many terrible paths just by picking a bad point to get started. So everyone I've talked to who's actually good at things has pointed to your work in this space as being something that is authoritative and largely correct. And given some of these people, that's high praise.

Starting point is 00:41:26 Awesome. I'm going to put that on my next performance review as evidence of my success and impact. Absolutely. Grouchy people say it's all right. You know, for the right people, that counts. If people want to learn more about what you're up to and see what you have to say, where can they find you? I aggregate most of my outward interactions on Twitter.

Starting point is 00:41:47 So I'm @kelseyhightower and my DMs are open. So I'm happy to field any questions and I attempt to answer as many as I can. Excellent. Thank you so much for taking the time to speak with me today. I appreciate it. Awesome.

Starting point is 00:41:58 I was happy to be here. Kelsey Hightower, Principal Developer Advocate at Google. I'm Corey Quinn. This is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on Apple Podcasts. If you've hated this podcast, please leave a five-star review on Apple Podcasts and then leave a funny comment. Thanks.

Starting point is 00:42:23 This has been this week's episode of Screaming in the Cloud. You can also find more Corey at screaminginthecloud.com or wherever fine snark is sold. This has been a HumblePod production. Stay humble.
