PurePerformance - Observability Predictions 2025 Under the Covers with Bernd Greifeneder

Episode Date: January 20, 2025

To predict the future, it's important to know the past. And that is true for Bernd Greifeneder, Founder and CTO of Dynatrace, who has been driving innovation in the observability and security space since he founded Dynatrace 20 years ago! Bernd agreed to sit down, look behind the covers, and answer the open questions that people posted on his LinkedIn in response to his recent observability predictions blog. Tune in and learn about Bernd's thoughts on the evolution from reactive to preventive operations, what is behind the convergence of observability & security, why observability can help those with serious sustainability intentions, and how observability becomes mandatory and indispensable for AI-driven services.

We mentioned a lot of links in today's session. Here they are:

Our podcast from 9 years ago: https://www.spreaker.com/episode/015-leading-the-apm-market-from-enterprise-into-cloud-native--9607734
Bernd's LinkedIn post: https://www.linkedin.com/feed/update/urn:li:activity:7275101213237354497/
Predictions blog: https://www.dynatrace.com/news/blog/observability-predictions-for-2025/
K8s Predictive Scaling Lab: https://github.com/Dynatrace/obslab-predictive-kubernetes-scaling
Security video: https://www.youtube.com/watch?v=ICUwRy4JFTk
Carbon Impact App: https://www.youtube.com/watch?v=8Px0BB1U1yk
AI & LLM Observability video: https://www.youtube.com/watch?v=eW2KuWFeZyY

Transcript
Starting point is 00:00:00 It's time for Pure Performance! Get your stopwatches ready, it's time for Pure Performance with Andy Grabner and Brian Wilson. Welcome everyone to another episode of Pure Performance. As you can probably hear from my voice, this is not Brian Wilson, who typically does the intro, but this is Andy Grabner. Unfortunately, Brian couldn't make it today, but I think I found a really great, amazing guest today that makes up for having only one host instead of two, an amazing guest, Bernd Greifeneder. Servus, Bernd. Hey, Andy. Thanks for having me. Yeah, it's been, believe it or not, well, you know it because you reminded me,
Starting point is 00:00:54 it's been almost nine years since I had you back on the podcast in episode 15. I don't know whether this is a good sign or a bad sign. Well, it's A, a good sign that I invited you back. B, you could have done better in coming back earlier. It shouldn't take nine years. But Bernd, back then, the episode was called Leading the APM Market from Enterprise into Cloud Native. We talked about the transformation
Starting point is 00:01:25 to what we called no-op spec then from revolutionizing the way we are actually delivering software. For those of you that don't know you, Bernd, you founded Dynatrace back in 2005, so that's 20 years ago. A lot of things have changed. Can you quickly recap,
Starting point is 00:01:42 if you're looking at the observability space that you have innovated over so many years, what has roughly changed over these last 20 years? Yeah, back then it was all about three-tier kind of online shop applications, which could not be launched in time because those would break down. So the key for me was, as an architect, already to figure out why would these online shops not be able to withstand 50 users. Hence, I created the tracing technology, and hence Dynatrace is called Dynatrace, for doing dynamic traces and understanding the behavior of these applications. And back then, the number one issue was death by thousand cuts
Starting point is 00:02:32 through database queries. And most of the time it was the object relational database layer, which would cause a thousand database requests and the developer thought, oh, it's only one ORMEPA query. And this was the number one pattern back then. Finally, only technology has changed. The pattern is still true in today's world. Today, it's just REST calls amongst microservices
Starting point is 00:03:03 and so forth. But from the principle, it hasn't changed. What has changed, though, is the number of tiers applications have, and the complexity has shifted from just code complexity, the now complexity into the broader infrastructure setup of all these various containers and systems that are interconnected. And has also changed in the way that basically a single human cannot understand anymore these setups without any type of observability. Hence, the whole APM has shifted into,
Starting point is 00:03:48 hey, I need to better understand and tune my app to, oh, I need to actually figure out how does my service amongst the other services work in production in this entire broader setup. So this is also why observability has gotten this additional raise, although I think observability is actually a too short-handed term because to me it's not about observing system. To me it's about using the observability data to ultimately understand, optimize, and automate systems so that they don't fail. So, and this was always the whole idea from day one of Dynatrace. How do we help customers prevent issues through automation, through deep visibility that allows these types of automation.
Starting point is 00:04:46 So if I can quickly recap, 20 years ago, it was code and build complexity that was replaced into architecture and deployment complexity that we have. And obviously the tech stack complexity that was brought in. And I really like what you just said. Observability is really just the foundation. It's the raw data, the data that we get, obviously all connected, but it's really what observability
Starting point is 00:05:11 data enables us to do. And this is done through smart automation on top to really make sure that organizations can deliver at the speed of market, that whatever they do, they do it wisely, sustainable. I think sustainability is a topic we will touch upon. A lot of things have changed. Now, Bernd, it's very unusual for anybody to work in a company for 20 years. And obviously, you are the founder. You're very passionate about it. But still, 20 years is a long, long time.
Starting point is 00:05:42 What keeps you motivated to be still at Dynatrace and to drive this after 20 years? And also, if you can answer this question, why do you think so many early Dynatrace employees, including myself, are still here? So, yeah, the second question, I guess, you will have to answer. So I can explain why I'm still here.
Starting point is 00:06:06 Basically, it is extremely egoistic. This is just my ability to really show the world that there is software that we can build software here that provides a value
Starting point is 00:06:22 to customers, which sort of is fueling me and giving me the energy to actually do something that's valuable to the world and have an impact. And the other one is more think of evolutionary, I spread my genes in this way by building better software than the other usual suspects on the West Coast.
Starting point is 00:06:49 Here with the core team in Europe, that's another big one that's motivating me. And sort of also, sort of why is it not boring? As long as I can actually create one generation of Dynatrace after the other one and keep the innovation forward, as long as I can actually create one generation of Dynatrace after the other one and keep the innovation forward. As long as I can do this, then it's fun to work because the opportunity, the abilities that can be done are way bigger and broader than what we can do. So focus is hard in all this, but this makes so much fun and has so much potential
Starting point is 00:07:27 that keeps me here and every day allows me to push forward and push energy. And I think just sharing the bigger vision, seeing the opportunities, seeing the strong technology foundation on one hand, but also being a very collegial team that we pull on the same rope on this forward is for sure something why still all the people who started in the beginning are with us. How about you, Eiland? Yeah, I'm almost here for 18 years now, sooner rather than later. I think I agree with you because it's so funny.
Starting point is 00:08:10 If you think about it, whenever I pick up a topic, I always think, man, this topic eventually will be old and I will be bored. But then the next thing comes around, the next technology shift, the next big move, the next big change in how organizations are delivering and operating software. And we are helping all of these organizations to walk through this transformation.
Starting point is 00:08:31 And the beauty is because we've already, in most of the cases, walked through that transformation ourselves because we are one of the leading software organizations, as you said, right? And so it's really great to then kind of be there and teach and mentor people on how they can overcome the same obstacles that we have already solved. So that's one thing. And the second thing, this is a very personal thing, but I grew up on a farm in rural Austria. And now being a small farmer boy and now being able to talk with the big technology leaders around the
Starting point is 00:09:05 world is something that gives me joy and also makes my parents very proud. So it's pretty cool. Bernd, the last personal question that I have before I want to go into the blog post that you wrote about the predictions for 2025. How do you manage to not lose touch with current technology trends? How can you keep innovating a product that needs to be ahead of almost its curve in its time? How can you make sure that you still have the time to understand what is coming? What challenges may come up? How do you do this? Yeah, that's a really good question.
Starting point is 00:09:43 And it's actually also challenging me every day. And there's a couple of things how I go about this. So one is from day one of Dynaphyse, one of my mantras is never do what customers want, but actually understand what their needs are and figure out how to create an innovative solution to it. So meaning customers have the problem competency, we have the solution competency, but to your point, I need to also be able to understand the problem in order to then build the proper solution for it. And this is exactly where it is so key to be hands-on.
Starting point is 00:10:21 So to be honest, in the holidays now, just the past two weeks, I was not skiing. I was actually coding most of the time. And this is sort of one of the ways to stay in touch with technology and, of course, use, as everyone these days, also AI for coding, but also always couple it with using also Dynatrace and product, sort of building an app on Dynatrace, so leveraging it, always be customer zero, even before customer zero, sort of argue. Maybe it's a pain for my own team,
Starting point is 00:11:03 but at least it helps me get a better feel of where we are with product, what's missing. But also this helps me a lot to better steer product at the end because I'm not relying on PowerPoint status updates, but actually hands-on experience. And we all know that these days, even more than ever, experience matters. And that's key. And this is why also my assistant has the order that two days a week, she must make
Starting point is 00:11:39 room for product hands-on sessions. These are with teams, with me, so that we really are not losing any real touches on product. Wow, so two days a week as a CTO, your assistant carves out the time that you have hands-on experience with the product, being able to work with the product folks. It's quite a lot of time, that's a lot of time that you can carve out, that you are carving out every day. And it's very, and that also shows the commitment that you have and the passion
Starting point is 00:12:12 and also allows you to get all these insights that allow you to drive the innovation forward. Well, Bernd, you just mentioned that you built an app for you, you're obviously hands you built an app for you. You're obviously hands-on. So maybe, let's not wait another nine years, maybe I invite you back sooner for a video session for one of my observability labs
Starting point is 00:12:33 and you show me what you've built and some tips and tricks from the CTO and founder of Diamond Race itself, how you use the product. Yeah, that'll be awesome. That will be a cool challenge for 2025. Talking about 2025, you wrote a blog post
Starting point is 00:12:47 about the observability predictions for 2025. Really nicely written blog. Then you also posted this on social media, on LinkedIn. And this is when I actually got to read the blog post because I saw it there. I also saw a couple of people have left comments. And so what I would like to do, I would like to hear it from your own words.
Starting point is 00:13:11 Not sugar-coated, not beautified marketing language, but these five predictions that you made. And for three of those where I thought some comments came in, I would also like to ask you the questions that came in from LinkedIn so that you can actually respond to them. And the first prediction that he had was around shifting from reactive to preventive operations, from being reactive to becoming a preventive organization. What does this really mean? It really means that the way customers typically purchased monitoring products in the past was due to a firefighting situation. So they serviced the app, did not work as expected, there's a deadline or shit, sort of what do you do?
Starting point is 00:14:00 You need to bring in some vendor who gives you the visibility in order to be able to fix items. So this is sort of one of the typical use cases of how Dynatrace also grew. But also the customers who went already a couple of times through these learnings figured, hey, we should better be proactive. And this is why also these days, as we call this type of monitoring observability, figured observability is actually a mandatory part of standing up any new project, any new kind
Starting point is 00:14:38 of cloud, at least a cloud-native project. So this is what's being done, but mostly to initially avoid these firefighting situations. when you establish platform engineering, then you figure out this is way too much work for the manual efforts. So you need to be proactive in order to prevent items, but also sort of the automation part is the best way to be proactive. And every automation that is proactive actually requires data and data that you can trust and this is exactly here where observability is actually a prerequisite to prevent issues
Starting point is 00:15:38 because it is the reliable and trustworthy data on top of which you can build predictions, predictions about behavior of certain systems, of scalability. Even disk outages are these days oddly a topic. And prevent those issues from causing outages by taking proactive actions before some outage occurs or overload occurs. And that's a key one. So in the past, Dynaphrase already had always this root cause analysis that would do it automatically.
Starting point is 00:16:21 But then customers had issues with figuring out, okay, but how do I set this up with an automation? And this is the key now also to how to make it as seamless as possible to take the data, to do the analytics on it, get the automatic prediction, and then trigger the right preventive action in the year. And this is sort of going all the way into either more classic changes where you change configurations proactively or into more a GitHub style of approach
Starting point is 00:17:02 where you even change configurations in a Helm chart and define, for instance, a rescaling of your application because you already have a prediction of load changes or something like that. So there's this full variety, but having it all in one chain of action together in one system
Starting point is 00:17:25 shortens sort of all the issues that you have of bringing the data and action together. And this is making a lot now happening based on the observability data. And I also want to give a couple of examples that I've seen being implemented. And especially now I work a lot with the CNCF, with the Cloud Native Computing Foundation.
Starting point is 00:17:48 And for instance, from a Kubernetes perspective, scaling is typically done through an HBA, a VPA, so these auto scalers or CADA. But as you said, they are typically taking current data and basically threshold-based. So you're reaching 80% capacity, you start scaling up. What I've seen now people do is using the
Starting point is 00:18:09 Davis, the Dynatrace forecasting the predicting capabilities and basically preventively scale when Davis predicts that we are going to really run across the 80% threshold, 90% threshold within the next hour or so,
Starting point is 00:18:27 and not just scale because of a little blip. And scaling would have actually not been necessary. The other use case that you brought, the GitOps-based approach, I always hear from people, do we really want the AI to be under full control or do we still want to have the human in control but the AI supporting? Folks, in the description of the podcast,
Starting point is 00:18:43 I will link to a couple of things besides the blog post from Bernd, also to some of our labs that we built. One of the labs is exactly showing how Dynatrace can automatically open up a pull request with the suggested scaling recommendations based on forecasting data, which means the human is still in control, but it's supported by the AI because the AI made the recommendations based on the forecasted data. And I'm pretty sure as we go along in this podcast, there might be more links coming up.
Starting point is 00:19:15 So if you're listening to this on your mobile device, make sure you check out the description. There was also one question exactly on this topic that came in on your LinkedIn post, and it was from Andrew. He was saying many of the observability vendors, and he named a few in his comments, lack the accuracy in their AI predictive approach. His question then to you was, why do you think Dynatrace is different? Yeah, I think this is a great question and one that is actually really hard for sort of in this current hype of AI to explain to people that there are actually different types of AI
Starting point is 00:19:56 and analytics in order to help with this problem. Because the difference of Dynatrace is starting with the way data is captured and data is then kept in context. And then also the data is enriched. For instance, a log is not just a log line, but we automatically enrich a log line with it runs on what system and which pod and which process and so forth. So they have all this in context.
Starting point is 00:20:27 Also, you know, this log then that has been written, this process has been in contact with these other systems. So you know, then cross dependencies. So we store all this context. And then Davis has actually three types of AI for that purpose. And one of that that provides this factual precision is Davis's causal AI, which is totally different to what everyone thinks when we talk about AI these days. Everyone thinks there's a huge neural network, like a large language model based on probabilities. Now, Davis' causal AI is actually based on the effects of the collected data. And it builds a real-time topological graph
Starting point is 00:21:16 that is a directed graph. This means it can cause it, not just correlate. It knows that the system has called the other system, this one has a dependency on the other, not the way around. And exactly this inference, this knowledge, allows Davis causal AI to A, learn instantaneously, meaning in today's systems, containers come and go thousands of times an hour
Starting point is 00:21:48 and all this is being taken into account. So this relearns basically instantaneously this causal layer. That's one item. And then the way it comes to the decision and conclusion of a root cause is by actually traversing the factual effect-based causal graph so it knows what is dependent to what and then therefore can easily figure out, okay, this 100% CPU actually has no impact to the end user response time, so why bother? Let's not alert. Or this actually 10% CPO does not look any suspicious,
Starting point is 00:22:32 but actually the end user suffer with horrible response times and the low CPO is only a symptom of some other system not working properly. So, Davis causal AI completely infers that, understands these dependencies and therefore can create an automatic root cause information about the current system's behavior. So, now combine this with the ability to predict, then you have an accurate deterministic ability to infer, to understand the cause, predict a bit into the future. And this allows you then to prevent proactively in a way you could never before.
Starting point is 00:23:29 I'm pretty sure this answers the question of Andrew, and I will make sure once this session airs that I'll post it on your LinkedIn as a reply. And folks, there's a lot of stuff that Bernd has just talked about. Davis, the AI, we talked about the model SmartScape. I will try to add as many links to these technologies in the description. Switching to the next one, the next topic, the next prediction you have is called observability and security converge around continuous compliance. Now, before I let you answer, I know that a lot of our vendors, the vendors in our space, we obviously move into also its direction. We as Dynagrace have moved into security a couple of years ago, the vendors in our space. We obviously move into also its direction. We as Dynageways have moved into security
Starting point is 00:24:05 a couple of years ago, naturally by extending our agent. Also through acquisition, we made a couple of announcements on acquiring companies in that space. So obviously we're all broadening the scope, but why do you think it's really, why is this happening?
Starting point is 00:24:21 Why is this necessary, that observability and security conversion that we have to see it as a unison? Yeah, so DORA was a big eye-opener for me on that topic, which is the European Digital Operations and Resilience Act. Because everyone, at least, I mean, most to whom I have spoken to, would think with the word compliance always
Starting point is 00:24:47 about security, right? But Dora was explicit about that availability is as important as security. And what sort of was part of this eye-opening moment is think about why at all are these regulations being created. So think of our society and our reliance in payment that we all do now digitally these days. So if there is a crisis situation, I mean, yes, we have here, for instance, the Ukraine war. If there are crisis situations, then you have still to be relying on payment systems,
Starting point is 00:25:33 because otherwise you would have even a worse situation if payment does not work and people cannot buy even basic food anymore. So this makes it obvious why these resilience acts are so important. By the way, it's not only Europe who is doing that. There is the UK Bank of England's operational resilience policy. Australia has a CPA. Hong Kong has one. There's the Federal Reserve Regulation in the United States. So this is a global movement for this.
Starting point is 00:26:08 It's also reinforcing and showing how important it is that these digital systems that we actually have to use every day in order to do our living have to work both securely and they have to be available. And this is exactly the point where it's no longer the CISO reporting compliance. No, you have to bring both together. The SRE teams have to report the resilience as much to be compliant as much as sort of the C CSO is doing the security side. Basically, this forces the groups coming closer together and actually collaborating more because it all fuels into one and the same report for these resilience requirements and compliance requirements. And that's actually a big one. And to me, it's just fostering from the top down,
Starting point is 00:27:07 sort of from the political side, actually that observability and security is coming closer together. While I have always believed that bottom up, it makes just sense for them to come closer together for different reasons. Thanks for also the reminder of all of these acts. Every time when I hear DORA, and it was also a big aha moment for me last year when I heard DORA, I thought, of course,
Starting point is 00:27:33 that's the DevOps Research Institute. That is something different. But yeah, it's interesting to see, as you just explained, that sometimes these regulations being pushed from the top to the bottom makes actually a lot of sense because it allows us to really leverage the data that we can then connect on from the bottom up. There is Karun who asked the question also on your LinkedIn post. He agrees with you. He says convergence is already here.
Starting point is 00:28:01 For an example, if you look at firewall logs, they serve on the one side security teams to figure out if there's any malicious traffic. On the other side, they also serve the SRE team for doing troubleshooting on connectivity issues, performance-related issues, resiliency issues. However, he says, while there is a convergence, he still sees many organizations still investing in dedicated scene and observability platforms. And he wanted to get your take on why do you think this is still there and what will it need to make a change? Yeah, that's a great question that also I am actively investigating even more.
Starting point is 00:28:39 And my current stance here is that I do see companies who are very rigid about, oh no, my logs need to be separate for me as a CISO because I don't trust what the observability team is doing on the other side. So this needs to be separate. So who watches the watcher sort of kind of thinking? And then there are the others. Why should we do it twice and deploy it twice and pay it twice and manage it twice and let's bring it together?
Starting point is 00:29:10 And actually, we need more context. By the way, context is the number one keyword I've always heard in these discussions. We need more context in order to fulfill our security needs, which is, in my opinion, then the biggest driver for converging. So in a nutshell, I think the biggest challenge
Starting point is 00:29:32 of why it is not as widespread yet out there is more for organizational reasons. That just historically, the CISO and the team is just such a different department who needs to check the compliance boxes, but not necessarily fix all those issues. Those are thrown over the wall.
Starting point is 00:29:54 Hey, you need to fix them. And those people who are more on the side needing to fix it have much more interest of convergence than sort of the one checking the boxes. And this brings me to the belief and the conclusion so far that the biggest inroads are sort of where we all see it first emerging in a stronger way is from the cloud and the iNative side of the house and the project starting there. Because to them, it is natural to do secure coding
Starting point is 00:30:29 and care about security and availability slash observability at the same time. And to them, it is much more natural than to leverage the data that's already available from that thing. So on the other macro side, I do see also security companies investing in more observability interest, whether they acquire companies for additional
Starting point is 00:31:00 sources like log sources. The key reason there is that the siloed approach of security of just looking at the firewall or just looking at the skins of your pods is no longer good enough. You need to have the visibility of the end-to-end attack vectors in order to understand the true risk. And this is why you need all the data end-to-end. You need the data proactively from your static sources to then all the way as you do the continuous delivery into your staging systems. But then also you need it in runtime to really understand what is your current footprint
Starting point is 00:31:45 and risk footprint in production as it is. So this is where this is moving towards. And in order to get to that, you need basically to converge the collection in the analytics of the data. Yes, some of the users will still be fanning out a bit. But at the end of the day, when I look at our own security team, the security team says, okay, here we have all greatly automated now the suspicious activities. But then the actual person who needs to fix something or prevent something when there's a vulnerability found,
Starting point is 00:32:27 then it's a Jira ticket to development. And then again, it's the developers who have to remediate and fix it. So, which is also a signal that the collaboration is key here between the security guys and the developers. And that's another reason why they don't want to have five different tools and different ways to look at data. It is just easier and faster if you can build it on one on the data pool. But at the end of the day, if I ask the question,
Starting point is 00:33:00 what is the key factor in bringing the data together, it's always one word, context. That is the number one for the security folks. They need more context in order to find issues, resolve issues, and also automate. Hopefully, Karun, if you listen to this, you will like the answer.
Starting point is 00:33:26 Also, check out a link. I will add an additional link to one of the YouTube videos I did with one of our
Starting point is 00:33:32 folks from our security team on how we internally actually solved that problem with combining and connecting
Starting point is 00:33:39 the security data with the observability data and then providing the right information to the right
Starting point is 00:33:44 people through our automation so routing the problems to the right people through our automation. So routing the problems to the right folks. So check this out. Bernd, I now need to time box the next three questions because I want to make sure we'll finish this on time. Observability is mandatory for sustainability. Actually, as it says, it says observability is mandatory for any serious it sustainability strategy can you fill us in on this yeah so this is all triggered by even our own
Starting point is 00:34:14 sustainability efforts what we got from our external esg and provider is basically they took our cloud cost or cloud spend from the hyperscalers, applied a factor and said, okay, here is your carbon footprint. And this is the common way this is calculated. Now tell me, how should I improve on it? I can't because I don't know. I only know if I reduce overall hyperscaler spend, then sort of I can reduce it.
Starting point is 00:34:50 So, and this is the whole point. If you don't have any detailed granular visibility in there, is it really spent? And yes, you could argue hyperscalers give you a bit more. That is correct. But they still don't tell you which service sort of is running redundantly, which one you could turn off, where do you have unneeded traffic across different deployment zones or how could you better scale up and down. They don't tell you that, but this is the whole point why
Starting point is 00:35:20 observability is actually mandatory if you take sustainability serious because serious sustainability doesn't mean to just report on it. Serious sustainability means actually to reduce our carbon footprint overall. And carbon footprint obviously means optimizing your operations I guess also optimizing costs in the end because there's a benefit on both sides, both on the carbon side and the end, right? Because there's a benefit on both sides, both on the carbon side and the costs. And this also to just bring out one other question
Starting point is 00:35:50 from Steven. He said, why didn't you add cloud cost management into your top five predictions? But it seems that this falls nicely under your serious sustainability strategy because that's a side effect of it. Yeah, that is so true because in this observability efforts, it is 95% actually the same analytics that you need for sustainability as for FinOps. And finally, this is also the reason why we are actually extending the Dynatrace carbon optimization app to cloud cost and carbon optimization to cover both in one app. And folks, also on that app, I will add a link because I have my app spotlight series on YouTube.
Starting point is 00:36:48 And I will add a link to that so you can see how this actually looks like. Cool. Two more to go. The prediction number four, AI observability becomes indispensable for AI-driven services. Now, when I hear AI observability, then two things come to mind, and maybe you can help me better understand. On the one side, it is how the AI models,
Starting point is 00:37:15 the AI systems that we may operate, how we get observability in those to make them more efficient. On the other side, how we observe the AI services that we use through API calls. So I think these are the two at least areas that I see when I hear AI observability or I hear people talk about it. What do you mean with it and what do you predict what is necessary?
Starting point is 00:37:40 So I see it this way that the majority of new projects, of new projects for digital services that are emerging this year, will want in some form use generated AI. And yet the attempts are more than just adding a chat interface to it. Why chatbots are the obvious additions. I think the aim will be a bit smarter and also this sort of is screaming for leveraging more smaller models because just the generic large models are super expensive and not optimized for specific needs. But it will be a mix of both, sort of.
Starting point is 00:38:33 And this is also then the whole point where just sort of calling one open AI interface will not be the answer to these new digital services. So it is more a combination of different AIs and sort of we are in this evolution from yes it was a CHPT kind of interface at first then it has become the RAC model where you do more automatic prompt engineering and circular loopbacks but it becomes more towards an agentic setup where you combine different types of AIs together,
Starting point is 00:39:08 which also means just alone the setup center integrations into new digital services becomes more complex. And here we are, because the first thing that is what you need to do is prove that it works. The second one is proof that it provides value. And then the third one is actually then you figure out how do you manage cost. Sort of in this order of these new services that are being stood up by all this cloud. And this is why I call them now cloud NDI native teams that are building new software these days. So with just these challenges, these three, you have to have observability
Starting point is 00:39:53 as much as you have to have observability into your microservices that are interconnected because at the end of the day, these AI setups are software as well that suffer from similar challenges like any other software. So this is the core type of observability I'm talking here about, that everyone needs who is standing up these services, which is the majority of everyone doing something new. And I remember, I think one of your opening sentences you had today, when you talked about the initial days of Dynatrace, you talked about these patterns, right? You talked about the database pattern, the death of a thousand cuts.
Starting point is 00:40:40 20 years later, if you think about it, we are building services that are making calls to AIs to do whatever to enrich our digital services. We could fall victim to the same thing if we're making inefficient calls to those AIs because they're very expensive, not only to run, but then also for us as a consumer. And in the end, as you said, the patterns stayed the same. Maybe the technology and how we call these services have changed, but the patterns stay the same.
Starting point is 00:41:10 So true. Cool. All right. Bernd, we are reaching the last prediction. And I had to smile and almost laugh when I read it because it says AIOps is dead. Long live AIOps. The reason why I had to smile last year, I was invited on a panel that was called DevOps is dead, long live platform engineering.
Starting point is 00:41:31 Because it was kind of like the migration or the move from talking from DevOps to platform engineering. But now AIOps is dead, long live AIOps. AIOps has been around for quite a bit. Why is it still there to last? Why does it live long and prosper? Let me approach it from two angles here. One angle is that AIOps in the past, let's say, decade has always served the purpose to actually reduce the noise from events. When you look at typical systems that do not do causal AI, I'm excluding causal AI for now, I'm talking about other software systems
Starting point is 00:42:25 that do this pattern analytics to reduce the alert noise. And this was pretty much the number one value of AI ops systems that would not use causal AI in the past. And this is the part that is clearly dead by now. First, of course, there is causal AI that's proven for years. This is one that avoids the noise up front. But then there's the other ability to, with the new AI ops, actually combine causal AI, predictive AI, and
Starting point is 00:43:08 generative AI into one, which sort of goes back to how do you combine AI that actually you can trust, that is deterministic, but still get the assistance from the generative AI to get to do your work as a human faster, combine that in a smart way. And this is the new AIOps. And that fits actually perfectly to my second point of why this headline.
Starting point is 00:43:36 This is that I cannot name you any customer I have spoken to who does not have the need, we need more AI in our IT department or engineering or something like this. And then the question is, yeah, but what is it? And some of the customers answer, yeah, but why are you asking? We have already Dynatrace, so we have AI. That's sort of the good answer, yeah, but why are you asking? We have already Dynatrace, so we have AI. That's sort of the good answer, but others still struggle with connecting
Starting point is 00:44:10 the dot from us, from the management, from the top management ask, because this is sort of the typically board or CEO executive team ask, you need to use AI. We want you to be 40% more productive next year, sort of through AI, sort of not knowing what this means, but just asking the question. And so this is why sort of the middle management, VP level management team is often challenged with, yeah, how do we leverage now AI ops to be more productive? And finally, some customers have not even realized that, yeah, they have it.
Starting point is 00:44:46 They use it already for years. They just did not compare how it would be, actually, if you would not use Dynatrace on that end. So that's sometimes a funny situation. Others obviously do fully count into that. But this ties the nozzle to the first point of being more proactive and prevent issues and to think this is now what's here and what's available. And the hard part is always explaining it to customers because, or to not just customers, to anyone. Because when you talk AI ops, most put it in context of large language models kind of
Starting point is 00:45:33 AI or generative AI. And this is sort of the whole point why I wanted this on the blog, that I think people need to realize there's more out there than a chat GPT interface. You need to dig one level deeper in order to understand what can be done, what can you trust, and what do you use for what use case, and that the combination matters actually. And this gives you a much better approach to a modern AI ops that you have ever thought where you can actually understand and explain the value of what it's doing
Starting point is 00:46:11 and not just have this vague promise of you should be more productive. Bernd, thank you so much for walking us through the blog. I always like to say bringing the blog alive. Clearly, if people are watching uh some of the video sound bits that we are posting on linkedin you see how burned is really excited about all of this you really bring this thing to life um i make your promise it will not take nine more years until you are back on my podcast i will follow up on the doing a video together on actually you showing me how you use
Starting point is 00:46:48 Dynatrace because that would be good to see for our global community but now I will let you go because I know more innovation is coming and by the time this airs we will just be about two weeks away from Perform our flagship
Starting point is 00:47:03 conference in Vegas, where I would assume, as always, you will give us even more insights in what you have planned and what you've been cooking up in your brain up there. Yeah, I look forward to seeing anyone who has the chance
Starting point is 00:47:17 to come to Las Vegas. Thank you so much, Pian. Servus. Bye. Servus.
