PurePerformance - “You Build It, You Run It Doesn’t Scale!” with Luca Galante

Starting point is 00:00:00 It's time for Pure Performance! Get your stopwatches ready, it's time for Pure Performance with Andy Grabner and Brian Wilson. Hello everybody and welcome to another episode of Pure Performance the last intro I started and you thought it was a boring intro and then I went to my story? Well, I thought maybe it's because of Lent. I'm not sure if you are. No, I used to. I gave up on that nonsense. You gave up on that, yeah, because maybe you just wanted to do it. I gave up Christianity. I gave up religion for Lent

Starting point is 00:00:56 one year and then I just forgot. But I had another dream with you, Andy. The last time I had the dream that you were kidnapping my kids, and it made me happy. This time you were taking my daughter shopping. And you were like, come on, Adele, let's go shopping. Andy's always Arnold Schwarzenegger when I do his impression.

Starting point is 00:01:19 And I was like, oh, where are you taking her shopping? And he goes, I'm taking her to DSW. And my daughter goes, oh, that's designer shoes, right? So they go there. They're out there for about an hour or two. And then they come back. I'm like, oh, what kind of shoes did you get, Adele? And she's like, oh, look at these really cool shoes.

Starting point is 00:01:39 They're really tall. And Andy goes, yes, they're platform shoes. Get it? That is awesome. It's amazing. Get it? Yeah, you did it. I think that's one of the best openings we ever had, and it's a perfect segue to the topic too, because you want to talk about not platform shoes but actually platforms that enable engineering organizations to build software products. And yeah, today's topic is everything around platform engineering. And we don't want to keep him waiting for much longer.

Starting point is 00:02:15 Luca is with us today as our guest of honor. Luca, welcome to the show. Sorry to keep you so long, but I thought it was a really great dream story. Yeah, it was totally worth it. Totally, totally worth it. Thank you for having me. Yeah. Hey Luca, do us a favor and introduce yourself to the audience in case they haven't yet come across your name. Sure. So I'm Luca. I run product at Humanitec, and I'm probably better known on Twitter or so as one of the core contributors to the platform engineering community. I help moderate the Slack over there, where we have over 10,000 members. I co-host PlatformCon,

Starting point is 00:02:53 which is the number one platform engineering conference. And I write a newsletter, Platform Weekly, that goes out to about 10,000 people as well every week. That's cool. So I do hope that you will send out the link to this podcast in your next newsletter because we can always... I'd better. Yeah.

Starting point is 00:03:09 I'd better. Otherwise, I'll be in Brian's next story. Exactly. You will be in his next nightmare. Yeah, I guess I should preamble those as nightmares, not dreams because apparently it's like, what's wrong?

Starting point is 00:03:23 Yeah. But Luca, thank you so much for being here. You mentioned a couple of things already, and I want to remind people, if you're listening to this and you want to follow up on some of the content we're discussing, like Luca mentioned PlatformCon. Luca mentioned his newsletters, his Slack. We will put all of the links in the summary.

Starting point is 00:03:44 PlatformCon also, just a shout-out, it's coming up early June. And I think you have already listed a lot of great speakers up there. So really, thank you so much for hosting this conference and bringing people together that want to talk about platform engineering. Now, the interesting thing, and this brings me to my first question. If I look at platformcon.com, it says the top DevOps and platform engineering leaders on a virtual stage for two days, which brings me to my first question. Why do we need platform engineering if we have talked about DevOps so long? What is platform engineering?

Starting point is 00:04:25 Because I have a lot of challenges sometimes explaining even what DevOps is, even though I've been trying to explain that to our community for many years. But enlighten us. What is platform engineering for people that have never heard about it? And what is, from your opinion, the difference to DevOps and I think also SRE? 100%. Happy to do so. So, okay, well, let's start with the definition of platform engineering. So platform engineering for me is the art,

Starting point is 00:04:56 because it's really more of an art than a science, of sort of like taking all the different tech and tools that you have in an engineering organization today, especially in the enterprise, and binding them together into a golden path that enables developer self-service and reduces sort of the cognitive load on the individual contributor when they interact with their infrastructure. And these different golden paths, the sort of superset of these golden paths, is what is often referred to as an internal developer platform, or IDP. And how is that different, right, from, you know, everything that I already know around DevOps and so on? And I think to answer that, probably it's best to take a bit of a step back and kind of understand where DevOps comes from.

Starting point is 00:06:02 And so if we look at when DevOps came up, more than like 10 years ago at this point, even 15 years ago, the world was a very different place then. We mainly developed a monolith, probably running on bare metal. The infrastructure that we were developing on was a lot less complex. You didn't have Kubernetes, infrastructure as code. You didn't have this crazy CNCF landscape with 10,000 plus tools that nobody really understands. And the initial idea behind DevOps was really simple and a great idea, I think, which was basically to remove the barriers between developers

Starting point is 00:06:42 and operations and facilitate collaboration. So I think we can all agree that's great. The issue is the reality of DevOps when you have all these converging trends of, you know, containerization and Kubernetes, infrastructure as code, Terraform, GitOps, and all these other cloud-native tools and trends that came together,

Starting point is 00:07:12 what that meant for the average engineering organization is that developers now are very overwhelmed just by the sheer amount of, you know, tools and scripts and steps that they need to touch in order to do their job. And so while, you know, back in the days, if I just wanted to deploy a small change to, you know, a front end service, for instance, to test something, I maybe had to touch one script and one tool, one deployment tool. Nowadays, that is oftentimes like a Helm chart here, a YAML file here,

Starting point is 00:07:55 another Terraform module over there, three tools in between, right? And so this can become quite overwhelming quite quickly. And so the reality of DevOps today is, for a lot of teams, developers being blocked effectively by this cognitive load and by fear of screwing things up, frankly, if they start touching all the scripts and different things. And then operations teams basically becoming a bottleneck for the velocity of the organization, because now you have developers effectively creating tickets for operations and leading to what we call ticket ops. And so you have developers waiting on the one hand and operations teams becoming a bottleneck and being stuck and sort of like fighting to put out fires all the time on the other. And that's obviously not a great situation. It's not great for sort of all your key DORA metrics, it's not great for overall time to market, and it's also not great as a work environment, right? Because it can become

Starting point is 00:09:11 quite a frustrating experience. You have a lot of friction between developers and operations teams, and it's just not ideal. And so, if we continue down memory lane, what happened after this was basically you had leading tech organizations, leading tech companies, and top performing engineering orgs that quickly realized that the initial promise of DevOps, so you build it, you run it, doesn't really work at a certain scale. And so these sort of like Airbnbs and Spotifys and Googles of this world that have to onboard literally hundreds of new developers,

Starting point is 00:10:04 sometimes a month, to an increasingly complex cloud-native setup, quickly realized, hey, this thing doesn't really scale, right? I cannot expect everybody to understand everything of an increasingly complicated delivery setup. And so they said, we need to build some sort of platform layer here in between the operation side of things and the application developers to A, enable developer self-service

Starting point is 00:10:33 and B, ensure that the operations could build a scalable system and enforce the right policies and all this sort of like enterprise grade functionality that you need at that scale. And that's really kind of, I think, where this sort of like, you know, DevOps started morphing into this discipline of platform engineering. And so all that platform engineering really is, is, you know, this discipline of binding this tech and tools into this internal developer platform and golden paths for developers to enable developer self-service. And it is really an evolution, if you will, of DevOps that enables true DevOps, enables true you build it, you run it at the enterprise scale in the cloud native era.

Starting point is 00:11:27 Thank you so much. I'm taking a lot of notes here and I have a lot of follow-up questions. Because the first thing you said when you explained the challenge, and I think it came by we as an industry talking a lot about we need to shift everything left. We need to codify everything. So everything is code. But now in your example, you said as a developer, I may need to change my code. I need to change my deployment definition. I need to change my infrastructure definition, which means I need to all of a sudden be familiar not only with all these tools, but all the languages.

Starting point is 00:12:04 And then I need to figure out how these changes get applied and what happens if something fails. So if I hear you correctly, and just repeating what I learned from you, you said the reason why we needed a different approach to this is because the whole shifting left and everything is code and we give all the power to the developers or ask them to do, you build it, you run it, became just too complex because it's just not doable. I mean, you may have some engineers that can do it within an organization, but especially as you are onboarding tens, hundreds and thousands of developers over a certain period of time, you cannot assume that everybody knows Terraform inside out, Ansible inside out, Helm inside out, Kustomize, whatever

Starting point is 00:12:48 else there is. I think while the movement was great to empower developers, and I think empowerment might be the wrong word, I think we put too much pressure on the development side to do everything. This needed a new approach of making sure developers can really become efficient and stay efficient, especially as we onboard new teams. And therefore, I like the definition of a golden path. So you have different golden paths, and you provide the golden path

Starting point is 00:13:21 through a self-service model. And this is done through providing a platform to your engineers. And yeah, that's what I just took a couple of notes. I also really thought, first of all, confirmation. Did I get it right? Yeah, spot on. Yeah.

Starting point is 00:13:39 One other thing that I would like to ask Werner Vogels, because he kind of was famous for you build it, you run it. And you say you build it, you run it, it doesn't scale. I'm just wondering, back in 2006, I think is when, I don't know, 2006 might be too long ago. When did he say? Yeah, 2006. 2006, yeah.

Starting point is 00:13:56 So it's amazing. It's been 18 years, 18, 15 years, 60, whatever. I need to do the math. I wonder how big Amazon was back then. Not that big. I looked it up because I was also thinking about this. And it was like a few hundred developers, actually. It was not that big.

Starting point is 00:14:17 And I mean, obviously, the other thing I think is obviously it was in their interest, right? Because they said it when they released AWS, when the whole idea was like, oh, now you have this AWS console and you can do everything from here. And we all know that that quickly stopped becoming true for a lot of developers. And then we started adding all this other cloud-native things

Starting point is 00:14:42 and it just got really complex really quickly. And yeah, so I think there's also a question of incentives there. Yeah. Now, this brings me to my next question. And because it seems it's a size and a scale question, at which time then is it the right time for an organization to think about, do I need a platform engineering team? Do I need to build my own internal platform? Or am I still small enough? I don't know, to still run with the, you build it, you run it.

Starting point is 00:15:16 Yeah, I think that's a great question. And as you mentioned, Andy, right, like you might have sort of like your development team with 10, 15, 20 engineers where everybody's super comfortable handling, you know, their Helm charts and their YAML files and Terraform modules, then it's fine. But as we mentioned, that tends to not scale really well as you add new people and not everybody understands everything. And specifically, they're not familiar with how you build your own tooling and systems. And so what is that sort of like threshold, right? In our experience, it's between like 50 to 100 developers. At 50 is where you start basically seeing things break. And usually you have maybe that like one, two, three people

Starting point is 00:16:14 really specializing in operations who start becoming, you know, who start being put under pressure by the rest of the organization. And that tends to get progressively worse until usually we see sort of like platform initiatives emerge, right? Around that like 50, 100, like 100 plus really, usually you see platform teams or some sort of like platform initiative emerge.

Starting point is 00:16:42 And I think that's the other important thing for engineering executives to realize, which is, you know, if you don't decide to build your platform, it will build itself, right? And so you're much better off sort of, you know, recognizing that this is happening and kind of like, you know, tackle it and plan it and really decide on what is the architecture and what are the design principles that we want to follow when we build this platform versus, you know, basically having a homegrown solution that will sort of emerge in a patchwork way, because, you know, people realize, okay, we need to build some sort of layer here to enable some degree of self-service, even if it's minimal.

Starting point is 00:17:31 And so that is kind of the threshold that we see in the market. And below that, you could have the sort of like first case that we just spoke about, everybody's a pro, and maybe they have a really advanced cloud native setup, and they're all familiar with all the, you know, tool chain and they can run on their own. Or the alternative of that would be, you know, a team that just opts for, let's say, a PaaS solution. So, you know, the OG PaaS is kind of like Heroku. Now there are a lot of different other solutions on the market. They're very sort of specialized in different types of use cases.

Starting point is 00:18:11 And there's nothing bad with that. Like I think PaaS solutions are probably a really good option. If, you know, if you really want to focus on ultimately sort of like your key value adding activity, right? So if you are a, you know, a food delivery startup, I mean, maybe that's a bad example, because they usually are really well funded and have a lot of employees from the get go. But you know, if let's say you're, you're like an HR startup, SaaS or something, and you are just starting out and you're like 10 people. There's no point in you investing resources and building infrastructure because that is not your business. Your business is HR software, not infrastructure software.

Starting point is 00:19:00 And so that's what you should focus on, and probably opting for a very simple PaaS solution can be a great path forward at the beginning. But it's important to be mindful of that threshold. So the reason why Heroku doesn't work for the enterprise and all these PaaS solutions tend to break past that sort of like 50 to 100 engineers point is that they are just not flexible enough, right? So when you start having different development teams that have different preferences in terms of how they want to interact with the delivery setup and infrastructure, that is where it really becomes important that you have a platform team that listens to them and that builds different types of golden paths that then become your internal developer platform, right?

Starting point is 00:19:49 And so that's really the key difference there is basically having, you know, an external platform team, which would be the product team at your PaaS provider that, you know, is basically trying to optimize for every one of their customers or a platform team that is really building a product for your organization specifically. Andy, I bring this point up a lot, Luca. Not a lot, but for me more than once is a lot. This sounds to me a lot like what the goal

Starting point is 00:20:26 and mission of Cloud Foundry was, where it was flexible enough that the teams developing on the backend, the platform side, can set up all the rules, set up all the opinionated platforms. It wasn't just stuck in one way. Obviously, there's different technologies

Starting point is 00:20:42 we're talking about. I know back when they were popular, they weren't supporting Kubernetes so much at that time. But conceptually, it sounds like a very similar idea that was put forth then where

Starting point is 00:20:55 it takes all the burden off the developer to say, hey, you know what, developer, write your code, push it, the rest is done. And that's maintained by best practices created by other teams. Am I getting that right? I always go back to Cloud Foundry because it's like, oh, look what they were trying to do years ago.

Starting point is 00:21:13 It didn't work at the time, it didn't seem like the world was ready, and it wasn't flexible for everything else going on. But we seem to be heading back towards that direction now after going through a full-on, I'm going to build and make bespoke versions of everything. I'm not even going to use one of the regular Kubernetes versions. I'm going to modify it and make it exactly what I want. Now everyone's coming out and saying, wow, this is way too much. I don't want to have to think about that side of it. Is that a proper

Starting point is 00:21:36 interpretation or am I missing a piece of it there? No, I think it's a really good interpretation. And yeah, as you said, I think they were probably a bit too early in the market. And I also think, as you also mentioned, they were still this in-between, I think. So, you know, the way I think about platform engineering and kind of like what we advocate for is like, hey, look, platform engineering is an unopinionated toolbox to go and build your own opinionated workflows and platform, right? Yeah. Which is the opposite of a PaaS, which is basically saying, hey, I, the product team at, you know, PaaS provider X, already have figured out what the minimum common denominator functionality is for the entire market. And here it is with very little ability to go tweak it and customize it to your own needs. And I think in that spectrum, Cloud Foundry was still sort of in the middle where they

Starting point is 00:22:41 weren't simply providing a tool in the toolbox for the platform team to go build. They were still like coming with a baggage of sort of, you know, some opinionated decisions that they had taken sort of like upstream. And I think and so I think the combination of those two things, like the market not being fully ready or having, frankly, like, what does ready mean, right? Like ready means basically people haven't experienced the pain yet, really, to the right extent, right? And this is partially a question of scale, really, right? Because you now have engineering organizations that are already much larger on average than just engineering organizations 10 years ago. We just talked about AWS had only like 500 people. Obviously, most companies didn't scale like AWS, but still, the average

Starting point is 00:23:36 bank or healthcare institution has probably, I don't know, 5x the amount of developers they employ now, right? And at the same time, technologies like Kubernetes have gotten boring, frankly, right? So boring that even the enterprise adopts it now. And so then that problem becomes more and more real of, okay, how do I make my cloud native setups with all this technology manageable? And so that's where I think like these pain points that platform engineering addresses have become a lot more felt in the last, you know, five years or so.

Starting point is 00:24:17 And that's where, and you can really track, right? This basically this, this trend of, you know, Google building Borg and, you know, and sort of like these leading tech companies building these platforms. And then slowly now in the last really five years trickling down to, you know, downstream to or mainstream to like larger, larger parts of the market, larger segments of the market. And then you can see the platform engineering community that we've been working on in the last couple of years really taking off because I think it really hit a nerve with a lot of people that, especially senior contributors or senior platform builders who had been building these platforms for the last five or 10 years, they just didn't really know what it was called or what best practice was.

Starting point is 00:25:14 And so it was kind of like a big relief moment, I think, for a lot of them, kind of like, oh, that's what I've been doing. And that's kind of where we see the crazy growth in the community, and really just in general, on social and everywhere, there's just a lot of people and analysts talking about it now. Quick recap again for me of what I learned in the last couple of minutes. What you were saying is that if you're small, right, start with something that is actually opinionated, because you need to focus on, you know, proving out your business, building your business case. And existing platforms that are super generic, but also, well, let's say they're super opinionated in the way they address the problem, they get you to your first milestone and maybe to a second milestone. But eventually, you need a platform that has your own opinion, right? Because every organization is different. And then I guess you have two options. You can either say, build everything from scratch, do everything yourself, or you're

Starting point is 00:26:27 starting obviously with, I think you called it a platform... I'm not sure if I wrote this down correctly, but a platform toolkit that allows you to then build your own opinion of the platform, and then really with this allows you to scale to the next level and build the golden paths. It's really interesting, right? We start with a generic opinionated platform that probably helps 80% of organizations in the world, like the classical 80/20 rule. In the beginning it helps you, but as you outgrow that, you obviously need to think about how you build your own platform that works based on your skills, on your business use case, on your processes.

Starting point is 00:27:13 Even your needs, right? At this time, you think, one of the original goals of Keptn was that different teams are going to need different tools, different platforms, so that you had an easy way to integrate them. I think what, Luca, you're saying is that when you start building your own platform, you can take a look at holistically the organization and say, okay, what are the

Starting point is 00:27:39 golden paths, as you said, of the different sets that we need, the different teams. Not every team is going to work with one workflow. So we can build different sections for each team. Within limits, obviously, you're not going to give everybody everything. I guess that's where the platform team is going to come in and help make judgments on that and say what's going to be the best. But then you can build that, maintain it, and then each team can operate in what's going to be the best path for them, whereas obviously with the PaaS platforms you just have one choice and that's it, typically, right?

Starting point is 00:28:11 100%. And that's why it's also really important for people, you know, to find that balance, you know, between sort of like not-built-here syndrome and adopting what's already out there. Because even when you are building your platform, right, even when you get to that scale where a PaaS doesn't make sense, you're already like 60, 70 developers, things start to break, you need to start building your own opinionated workflows and golden paths and so on. You still don't want to reinvent the wheel from scratch, right? You still want to see, okay, what is the best combination of, you know, whether it's open source or commercial tooling out there that I can sort of like, you know,

Starting point is 00:28:56 mix and match to build the platform that I need, right? And then the platform team, you know, real value creation then is not in rebuilding 100% of the stack, but doing that last mile optimization because they're going to be the only ones in the world that really understand the specific requirements of the organization. And so that's where the value really is, is in building that feedback loop between the platform team and the rest of the org. In your experience, organizations that have their own platform teams,

Starting point is 00:29:38 who is part of that team? What is a platform engineer? Or what type of skills do you need in a platform engineering team? Does this include, let's say, your DevOps engineers that know the delivery tools well? Does it also include operations? That means, does it also include folks that know how to automatically configure, set up, and scale infrastructure?

Starting point is 00:30:06 Or what makes up a good platform engineering team? Yeah, I think that's a great question. And I think it's important to understand that a platform team does not replace an existing SRE, DevOps, infrastructure team, cloud ops, whatever the name is. And the reason goes back to your question, which is what are the skill sets that you need? And obviously, yes, you need a solid understanding of your cloud-native technologies du jour, so Kubernetes, IaC. You need to understand CI/CD workflows, GitOps, whatever. But if you come from that DevOps, SRE, CloudOps background, probably

Starting point is 00:30:53 you understand all of this. And so the key mindset shift that needs to happen, I think, is really one around product. So as a platform engineer, it needs to be super clear to you and to everyone else in the organization that developers are your customers and the users of your product, which is the platform. And so instead of trying to sort of like, you know, onboard them and teach them infrastructure technologies, you really want to focus on enabling developer self-service. And so your task is to build internal tooling and really build that feedback loop with developers, right? To listen to them, to listen to what they need, right?

Starting point is 00:31:45 And so in that scenario, basically all the topics, all the traditional product management topics that, you know, and key themes that we know from the last 20, 30 years of product management theory apply, right? So user research, product roadmap, MVP, rollouts and adoption, and so on, so forth, right? And so you basically, you know, need to build really a product team. So you know, you have your backend, your frontend, your QA, and so on. And the product manager becomes a really important key sort of role in that scenario, right?

Starting point is 00:32:27 Because they become really the link between the product team and the other teams. So the DevOps, SRE, and infrastructure teams, because you still need people that think about, hey, how do I optimize my load balancing across these regions for availability, yada, yada, right? That's a separate thing. It's a separate area of focus. But you need a team that is focused on one mission, which is, hey, we're building a platform as a product. And platform as a product is really one of the key principles that we advocate for in the community. And we build this platform as a product, and we're not here to

Starting point is 00:33:07 just maintain your infrastructure and optimize it for scalability. We're here to ship a product, which is the IDP. So the question now is, does the platform as a product team follow the you build it, you run it? That means they actually not only build the platform, but also run the platform? Or is the platform run by somebody else? Or do they need another platform to build the platform?

Starting point is 00:33:36 Yeah, you can keep them going forever. The meta platform team. Yeah, no, so I think the platform team is responsible for everything, right? It's responsible for like shipping the product and sort of like maintaining the product, and you can really think of it as basically a startup, if you will, within the larger engineering organization, right? And so if you think about a startup, it's not just about building a product and putting it out there in the world, right?

Starting point is 00:34:12 It's actually about building a go-to-market motion around that product. And this is really the key part and kind of like what I was getting to, which is it's not just about having that product mindset. It's also about being a really good communicator because you not only have to build that relationship, the really tight feedback loop with developers and the rest of the engineering organization to really listen to them, understand what is the right level of abstraction, right? Because as an example, if you are a senior backend engineer who really enjoys messing

Starting point is 00:34:54 around in their YAML files, and now I give you some sort of like ClickOps UI to do your deployments, you're going to be really mad at me. Because you're going to be feeling abstracted away and that you completely lost control over all the things that you like to tweak normally. At the same time, if you... And so for that specific persona, you might want to create, you know, a really absolutely code-based sort of like golden path that really gives them all the ability of tweaking, hey, how much CPU my Kubernetes, my pod is using or whatever it is, right? But at the same time, you might have, you know, a front-end team that doesn't care whether you're running on GKE or AKS. But in fact, they might not care whether you're running Kubernetes at all.
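As a rough illustration of the two golden paths described here, the following Python sketch shows one way a platform could serve both personas from the same entry point: the front-end team supplies only a name and an image and gets platform defaults, while the senior backend engineer overrides settings such as CPU. This is a minimal sketch under stated assumptions, not how any particular platform works. The names `Workload`, `render_manifest`, `PLATFORM_DEFAULTS`, the default values, and the `registry.example` image references are all hypothetical and do not come from the episode.

```python
from dataclasses import dataclass, field

# Hypothetical defaults chosen by the platform team (illustrative values only).
PLATFORM_DEFAULTS = {"replicas": 2, "cpu": "250m", "memory": "256Mi"}


@dataclass
class Workload:
    """What a developer hands to the platform (hypothetical shape)."""
    name: str
    image: str
    # Optional fine-grained settings for developers who want control.
    overrides: dict = field(default_factory=dict)


def render_manifest(workload: Workload) -> dict:
    """Merge platform defaults with developer overrides into a
    Kubernetes-style Deployment manifest, returned as a plain dict."""
    unknown = set(workload.overrides) - set(PLATFORM_DEFAULTS)
    if unknown:
        # The platform only exposes the knobs it has decided to support.
        raise ValueError(f"unsupported overrides: {sorted(unknown)}")
    settings = {**PLATFORM_DEFAULTS, **workload.overrides}
    container = {
        "name": workload.name,
        "image": workload.image,
        "resources": {
            "requests": {"cpu": settings["cpu"], "memory": settings["memory"]}
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": workload.name},
        "spec": {
            "replicas": settings["replicas"],
            "selector": {"matchLabels": {"app": workload.name}},
            "template": {
                "metadata": {"labels": {"app": workload.name}},
                "spec": {"containers": [container]},
            },
        },
    }


# Abstracted path: the front-end team never touches infrastructure settings.
frontend = render_manifest(Workload("shop-ui", "registry.example/shop-ui:1.0"))

# Code-based path: the backend engineer tunes CPU and replica count.
backend = render_manifest(
    Workload("orders", "registry.example/orders:1.0", {"cpu": "2", "replicas": 5})
)
```

The design choice in this sketch is that both paths produce the same kind of artifact, so the platform team maintains one rendering function and decides which knobs are exposed, rather than maintaining two unrelated workflows.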

Starting point is 00:35:53 And so in that case, providing a more abstracted developer experience might totally make sense. And in fact, might be the best thing, right? So that's where being able to listen and communicate with the different users of your platform and your different development teams becomes super important. But not only do you need to communicate to developers, you also crucially need to communicate to management and executives to get full stakeholder buy-in, right? And that is really not an easy task a lot of times, and it's a completely different lingo that you need to, you know, get used to and speak, right? Because when you speak to developers, it's about, hey, you don't want to... it's about waiting times

Starting point is 00:36:44 really right like you don't want to wait you don't want to it's about waiting times really right like you don't want to wait you don't want to have this like crazy cognitive load on you to just like do something um and so those are kind of your keywords there when you talk to develop sorry to to management it's it's a completely different thing right it's about uh you know time to market lead time is about uh you're you're kind of also like Dora metrics and stuff like that, right? And so communication becomes an extremely important part of the skill set. And if you were to, I think, like plot on a graph where kind of like on your y-axis,

Starting point is 00:37:21 you have a communication scale from zero to infinity. And, you know, on the x-axis, you have time. I think you'd have kind of like your sysadmin and your infrastructure engineer and your, you know, DevOps, cloud ops engineer. And then eventually your platform engineer, you know, plotted on one line up and to the right, right? Because over time, you know, this operations and now platform engineer role needs to acquire more and more confidence in how they communicate internally. And that becomes really, really important. I think your example with the front-end and then the back-end developer was really great, and it also helped me to see, at least visually in my mind, what you mean by different golden paths.

Starting point is 00:38:13 Because basically the platform gives me maybe two options, right? I am a senior software developer. I want to code everything in YAML, and the platform is great. It helps me to maybe deploy it out somewhere, but I still have all the control of how I configure my manifest. But then the other golden path could be, I just want to get a simple website

Starting point is 00:38:35 out and I don't care where it runs as long as I can access it after the deployment. I think that's good. I want to specify one thing because I mentioned this fully code-based workflow in the senior backend engineer case.
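
As a rough illustration of the two golden paths described here, the following is a minimal sketch and not something taken from the episode or from any real platform. It assumes a hypothetical `Workload` spec that developers keep in Git, hypothetical platform-owned `DEFAULTS`, and a hypothetical `render_deployment` helper. The front-end team supplies only a name and an image, while the senior backend engineer uses the same code-based entry point and overrides CPU and replica count.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """Hypothetical workload spec a developer commits to Git."""
    name: str
    image: str
    # Optional low-level overrides; left empty on the high-abstraction path.
    overrides: dict = field(default_factory=dict)


# Platform-owned defaults (illustrative values only, not recommendations).
DEFAULTS = {"replicas": 2, "cpu": "250m", "memory": "256Mi"}


def render_deployment(workload: Workload) -> dict:
    """Merge platform defaults with developer overrides into a Deployment-shaped dict."""
    unknown = set(workload.overrides) - set(DEFAULTS)
    if unknown:
        # Reject keys the platform does not expose instead of silently ignoring them.
        raise ValueError(f"unsupported overrides: {sorted(unknown)}")
    settings = {**DEFAULTS, **workload.overrides}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": workload.name},
        "spec": {
            "replicas": settings["replicas"],
            "selector": {"matchLabels": {"app": workload.name}},
            "template": {
                "metadata": {"labels": {"app": workload.name}},
                "spec": {
                    "containers": [
                        {
                            "name": workload.name,
                            "image": workload.image,
                            "resources": {
                                "requests": {
                                    "cpu": settings["cpu"],
                                    "memory": settings["memory"],
                                }
                            },
                        }
                    ]
                },
            },
        },
    }


# High-abstraction path: the front-end team only names the service and its image.
frontend = render_deployment(Workload("shop-ui", "registry.example/shop-ui:1.0"))

# Low-abstraction path: the backend engineer tweaks CPU and replicas in code.
backend = render_deployment(
    Workload(
        "orders-api",
        "registry.example/orders-api:1.0",
        overrides={"cpu": "2", "replicas": 5},
    )
)
```

Both paths stay code-based and Git-friendly in this sketch; the only difference is how much of the underlying configuration each team chooses to see.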

Starting point is 00:38:55 It is not that if you want to simplify further, then it just means a UI. In fact, what we see in the market is usually these like ClickOps solutions that are very UI heavy don't really work very well. Because even your front-end engineer, like their workflow is still usually fully Git-based, right? In most cases. And so they're still going to hate you

Starting point is 00:39:25 if you ask them to basically interrupt that workflow to jump into a UI that they don't really understand and now they need to learn it, right? And so in both cases, actually a fully code-based interaction method, I think, is preferred. And a lot of times it can make or break your platform initiatives.

Starting point is 00:39:46 We've seen a lot of platform teams really focus on the high-level functionality of like, oh, let me put a nice service catalog, e.g. Backstage or whatever, on top and off you go. And those really tend to fail a lot of times, because the service catalog functionality is not where the pain points really are for a lot of developers. The pain points usually are around application configuration and infrastructure orchestration, which comes later. And yeah, you're asking them to basically go into this UI that is not super useful for them, right? And so, yeah, just to specify that, I think code-based is still the way, but within code-based, you have different levels of context

Starting point is 00:40:38 that you can provide to your users. Andy, that gave me two thoughts. Luca, as soon as you mentioned the UI bit, I remember going back many years ago to WebSphere, and it seemed like the WebSphere maintenance was all through GUI, right? A UI, you wanted to change back,

Starting point is 00:40:59 you know, at Mondays, Andy, you had to add the library, and it would be the operations person would have to go in every single screen, click, click, click, click, click to get down. Other people running just pure Java, just a couple of keystrokes and it's in. I was like, wow, that seems to be a nightmare.

Starting point is 00:41:16 I had another thought that you triggered on there, but I can't remember it now because I went too far into my WebSphere nightmare. It'll come to your dreams. Yeah. But from an early adoption... Oh, I got an idea.

Starting point is 00:41:30 Yeah, go on, Andy. Yeah, just from an early adoption. If you look at tools in the GitOps space these days, Argo CD or Flux, at least from my perception, and I only see parts of the world, it feels that Argo CD, the reason why it is so popular

Starting point is 00:41:45 and it gets adopted quite a bit is because it actually has some UI that visualizes the stuff that's happening. I'm not saying that you configure things there, right? Because you configure it still through code. But having some type of visualization is obviously very important because otherwise people don't know what's happening. I totally agree. But that's the key. That is really the key difference, right? Like, you want to be able to configure everything in code, and then, you know, obviously visualizing things like

Starting point is 00:42:16 DORA metrics or, you know, a deployment history is a lot easier in a UI because you can just look at it, right? But the important thing is not building, instead of a golden path, a golden cage that forces you to click around to do things in a way you're not used to. That can really kind of prevent adoption of the platform. So that's a nice new keyword, golden path versus golden cage. I'm pretty sure you have used this in the past, but I will use this

Starting point is 00:42:52 the first time, I think, when I do the summary here. Let's not build a golden cage. Yeah, I'll definitely be creative with that. So the other thing you said in that, it came back to me. Talking about the senior developers wanting to have full control to tweak and everything, I wonder how much of that is holding on to knowledge and their skill because they know they can trust themselves better than a platform.

Starting point is 00:43:17 Whereas when the platform gets to the point where they can trust it, will they be happy to give up that flexibility? So a lot of the points you were talking about was configuration, tweaking, and all this. There's a lot of tools out there. A lot of times we have Akamai out there that talks about, not Akamai, sorry, Akamas, tweaking different Java settings

Starting point is 00:43:38 as it's running to optimize. You can have self-remediation. You can have feedback loops that are constantly taking and tweaking, and these things are all getting better and stronger. I'd be curious to see when that gets really solid, when you can push it out there and the automation and AI

Starting point is 00:43:56 or whatever you want to call it can do the tweaking, can optimize the running of it. It'd be interesting to see if those senior-type developers will be willing to give up that control to say, yeah, I do trust this runs and this is awesome now because I don't have to think about this. I can just focus on the architecture, all this other kind of stuff and just get some of my

Starting point is 00:44:16 time back, right? Because there is a control, right? I think it's the same kind of control, like if you're driving yourself versus a self-driving car. If you're a good driver and you're experienced, like, well, I don't know if I want AI to drive for me, right? Because I feel like I have the control. If it gets good enough, will they let go? And it's fun to drive, right? Yeah. Yeah, I totally agree with you.

Starting point is 00:44:39 I think that is definitely a sort of psychological block in some cases, not necessarily a lot of them. But I think, you know, a good selling point for platform engineers in that scenario to their

Starting point is 00:44:57 developers, to their senior developers, is usually what I call shadow ops, which is another word for you, Andy. And it is basically, you know, the reality of DevOps in a lot of large organizations is, okay, you have ops stuck and they become a bottleneck and everything we already talked about. And what that leads to oftentimes is shadow ops,

Starting point is 00:45:24 which is basically your senior contributors, your senior ICs, eventually doing ops for their team. Because they're the senior ones, they're the ones that understand the Helm charts and the Kustomize and Terraform modules and so on. And so the more junior, less experienced, less ops-knowledgeable people on the team will just Slack them and be like, hey, can you just do this for me? Because, you know, otherwise I need a database provisioned by ops and it's going to take a week. Can you just do it for me? And they're like, yeah, sure. Once and sure, twice and sure, three times. And then, you know,

Starting point is 00:46:00 in the end, you're basically becoming the operations team, which, you know, from an organizational perspective is obviously suboptimal because you have your most experienced, most expensive, frankly, resources tied up and doing work that they shouldn't. They're not actually shipping features and building applications. For the senior engineer, the experience now sucks twice, right? Because they still need to wait

Starting point is 00:46:27 for some stuff on operations. And then they also become operations under pressure. So they just get the worst of both worlds, right? And so I think in that case, there is a really strong pitch from the platform engineer perspective to say, hey, you know, I'm really making your life better along a couple of dimensions here.

Starting point is 00:46:47 And so hopefully that can get them over the finish line. I have one final argument on this discussion. If you say that the platform has to be treated like a product, like every product manager or every product owner, you need to first figure out who's your target audience. And you may have people in your organization that are not your target audience. You don't need to convince 100% of the people to use your product. That's also a very important point.

Starting point is 00:47:16 Don't try to solve 100% of the problems if it doesn't make sense, maybe. Totally. And by the way, product management is one of the few channels that has been created directly by, you know, PMs in the community, in the platform engineering Slack space, and it's one of the most active. Because again, it's super interesting and, you know, it touches all those different things, not just technical challenges, but also all these internal marketing and stakeholder buy-in challenges.

Starting point is 00:47:46 And yeah, you're absolutely right. One of the key questions is, what is your rollout strategy? Who do you start with? And what we've seen successful platform initiatives do is usually they start with the more advanced dev teams, because they're the ones that are not afraid of, you know, your YAML and Helm charts. And usually they're the ones that are also sort of more willing to try new things and, you know, they're more versatile in the cloud native technology world. And so that is a great starting point

Starting point is 00:48:27 because it really allows you to figure out, hey, who's my pro user basically? And how can I build that sort of low-level, low-abstraction golden path? And then from there, start onboarding more and more teams and build higher and higher abstraction layers on top of that. Hey, Luca, I think there's still a lot of stuff on my mind

Starting point is 00:48:54 that I would like to discuss, but I also know at some point we have to close the curtain on this episode. We have a lot of stuff that we discussed. I tried to do a decent job to take all the new words I learned and all the new stories and write a nice summary and then also link to all the things we mentioned, right? Starting from PlatformCon, your platform engineering community, obviously your LinkedIn and your Twitter. We also link to Humanitec and all the other stuff that you provided in the conversation today. Is there anything as final words that you want to make sure

Starting point is 00:49:35 that in case you didn't get it out yet that needs to be said? I think we said everything. You mentioned at the beginning PlatformCon. I would encourage if people want to find out more about platform engineering, you know, platformengineering.org is a great place to start. But yeah, PlatformCon is the main conference on the topic. We did it last year for the first time. It was great. We had 7,000 signups, 6,000 attendees just in day one, and we already have 7,000 signups now. So we're expecting about 15,000 people this year and it's free and it's virtual. So it's really easy to join for everybody. Well, there's no good argument against free, right? And especially if you have, I looked at some of your speakers that you have in the lineup. I mean, that's if you have time, and I assume, even if people cannot join live, the recordings will be available somehow. On YouTube, yeah. DevOps and platform ops. And I think one of the additional benefits, and this is probably a topic for another day on how it would be done, but you talk about DevOps, right?

Starting point is 00:50:50 Then it turned into DevSecOps. Then it turned into DevSecBizOps. Then it turned into blah, blah, blah, blah, blah, right? All of those are a means of getting workflows, data flows, and everything else into this DevOps thing. And if you think about platform ops, that's all tooling, workflow, transferring data. One word covers it all.

Starting point is 00:51:10 So if you want to add, you don't have to get into these awkward DevSecBizOps. Platform ops, sure, it fits in. Why not? You could put any other teams in there because they're all part of this ecosystem. And it's really about an ecosystem, I think. So at least from a speaking point of view,

Starting point is 00:51:28 it makes it a heck of a lot easier to say. I know there's a lot more to it than just the word, right? There's a lot more, but just in terms of being a speaker and having to say these little acronyms all together, platform ops, that's a bonus for it and maybe a reason to push towards it some more. I really like the, I mean, that's a great final thought. I really like the golden path. I think if you think about your organization and you understand who your different teams are that need to ship code, then you should figure out what are the one, two or three golden paths for them

Starting point is 00:52:06 from inception of an idea until it's in production, and then build a product that enables these engineers to actually get the stuff out there through the golden path. And the golden path obviously includes all of the best practices we have learned over the last 10, 15, 20 years, even longer since building software. Yeah, absolutely. And then every time you build a golden path, you can give everybody a golden ticket and sing the Willy Wonka song.

Starting point is 00:52:35 I've got a golden ticket. Yeah. There's a lot of jokes you can make with golden. There's showers as well. Oh, geez. Go in there. That's the next nightmare. Or maybe it's not a nightmare for Brian.

Starting point is 00:52:52 No, no, no. Exactly. I just gave you the perfect material for the next one. Yeah, yeah, exactly. Family show, right? All right, Luca, thank you so much. This has been eye-opening. I know,

Starting point is 00:53:06 I remember when Andy mentioned this at first, he was like, yeah, I really want to dig into this. And I think it's a great topic, and

Starting point is 00:53:12 you know, I didn't know much besides the words and all, so this has been extremely educational for me, as we said in the beginning

Starting point is 00:53:20 before we started. So thank you so much. I'm sure our listeners learned a whole bunch as well, and Andy's got some new buzzwords. So that is fantastic.

Starting point is 00:53:30 Really appreciate you taking the time. We'd love to have you back sometime to dive into this more. Hopefully that could be arranged if we didn't offend you too much with our bad humor. And thanks to our listeners. So thanks, everyone.

Starting point is 00:53:48 Thank you for having me. Would love to be back some other time. This was fun. Also, have a good day. Thanks. Bye-bye. Bye-bye.
