Screaming in the Cloud - Keeping the Cloudwatch with Ewere Diagboya

Episode Date: October 14, 2021

About Ewere: Cloud/DevOps Engineer, Blogger and Author

Links:
Infrastructure Monitoring with Amazon CloudWatch: https://www.amazon.com/Infrastructure-Monitoring-Amazon-CloudWatch-infrastructure-...ebook/dp/B08YS2PYKJ
LinkedIn: https://www.linkedin.com/in/ewere/
Twitter: https://twitter.com/nimboya
Medium: https://medium.com/@nimboya
My Cloud Series: https://mycloudseries.com

Transcript
Starting point is 00:00:00 Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. I've got five bucks on DNS, personally. Why scroll through endless dashboards while dealing with alert floods, going from tool to tool to tool that you employ, guessing at which puzzle pieces matter?
Starting point is 00:00:52 Context switching and tool sprawl are slowly killing both your team and your business. You should care more about one of those than the other. Which one is up to you? Drop the separate pillars and enter a world of getting one unified understanding of the one thing driving your business, production. With Honeycomb, you guess less and know more. Try it for free at honeycomb.io slash screaming in the cloud. Observability, it's more than just hipster monitoring. This episode is sponsored in part by Liquibase. If you're anything like me, you've screwed up the database part of a deployment so severely
Starting point is 00:01:31 that you've been banned from ever touching anything that remotely sounds like SQL at at least three different companies. We've mostly got code deployments solved for, but when it comes to databases, we basically rely on desperate hope with a rollback plan of keeping our resumes up to date. It doesn't have to be that way. Meet Liquibase. It's both an open source project and a commercial offering. Liquibase lets you track, modify, and automate database schema changes across almost any database, with guardrails that ensure you'll still have a company left after you deploy the change. No matter where your database lives, Liquibase can help you solve your database deployment issues.
Starting point is 00:02:15 Check them out today at Liquibase.com. Offer does not apply to Route 53. Welcome to Screaming in the Cloud. I'm Corey Quinn. I periodically make observations that monitoring cloud resources has changed somewhat since I first got started in the world of monitoring. My experience goes back to the original Call of Duty. That's right, Nagios. When you'd set instances up, it would theoretically tell you when they were unreachable or when certain thresholds were crossed. It was janky, but it kind of worked. And that was sort of the best we had. The world has progressed as cloud has become more complicated, as technologies have become
Starting point is 00:02:55 more sophisticated. And here today to talk about this is the first AWS hero from Africa and author of a brand new book, Eware Diagboya. Thank you for joining me. Thanks for the opportunity. So you recently published a book on CloudWatch. To my understanding, it is the first such book that goes in depth with not just how to wind up using it, but how to contextualize it as well. How did it come to be, I guess, is sort of my first question.
Starting point is 00:03:24 Yes, thanks a lot, Corey. The name of the book is Infrastructure and Monitoring with Amazon CloudWatch. And the book came to be from the concept of looking at the ecosystem of AWS cloud computing. And we saw that a lot of the things around cloud are mostly talked about, mostly socially compute part of AWS, which is the EC2 the containers and all that you find books on all those topics you know they are all proliferated all over the internet you know and videos and all that but there is a core behind each of the services that no one actually talks about and amplifies which is the monitoring part which helps you to understand what is going on with the system i mean knowing what is going on with the system helps you to understand failures helps you to
Starting point is 00:04:04 predict issues helps you to also envisage when a failure is going to happen so that you can, you know, remediate and also check. And in some cases, it will give you historical behavior of the system to help you understand how a system has behaved over a period of time. One of the articles that I put out that first really put me on AWS's radar, for better or worse, was something that I was commissioned to write for Lenox Journal back when that was a print publication. And I accidentally wound up getting the cover of it with my article, CloudWatch is of the devil, but I must use it. And it was a painful problem that people generally found resonated with them because no one felt they really understood CloudWatch. It was incredibly expensive. It didn't really seem like it was at all intuitive or that
Starting point is 00:04:50 there was any good way to opt out of it. It was just simply there. And if you were going to be monitoring your system in a cloud environment, which of course you should be, it was just sort of the cost of doing business that you'd then have to pay for a third-party tool to wind up using the CloudWatch metrics that it was gathering. And it was just expensive and unpleasant all around. Now, a lot of the criticisms I put about CloudWatch's limitations in those days, about four years ago, have largely been resolved or at least mitigated in different ways. But is CloudWatch still crappy, I guess, is my question. Yeah. So at the moment, I think, like you said, CloudWatch has really evolved over time.
Starting point is 00:05:29 I personally also had that issue with CloudWatch when I started using CloudWatch. I had the challenge of, you know, usability. I had the challenge of proper integration. And I will talk about my first experience with CloudWatch here. So when I started my infrastructure work, one of the things I was doing a lot was ec2 basically i mean everyone always starts with ec2 at the first time and then we had a downtime and
Starting point is 00:05:51 then my cto says okay everywhere go and check what's going on and i'm like how do i check i mean i had no idea what to do and then he says okay there's a tool called cloud watch you should be able to monitor and i'm like okay i dive into cloud watch and boom i'm confused again and um okay you look at the console you see it shows you certain metrics i'm yet to even understand what cpu metric talks about what does network bandit talks about and here i am trying to dig and dig and dig deeper and i still don't get a sense of what is actually going on but hey what i needed to find out was i mean what was wrong with the memory of the system? So I delved into trying to install, you know, the CloudWatch agent,
Starting point is 00:06:29 get metrics and all that. But the truth of the matter was that I couldn't really solve my problem very well, but I had a foot starting of knowing that I don't have memory out of the box. It's something I have to set up differently. And trust me, after then, I didn't touch CloudWatch again, you know, because like you said, it was a problem.
Starting point is 00:06:47 It was a bit difficult to work with. But fast forward a couple of years later, I could actually see someone use CloudWatch for a lot of beautiful stuff, you know, creates beautiful dashboards, creates some very well aggregated metrics. And also with the aggregated alarms that CloudWatch comes with, making it easy for you to avoid what we call incident fatigue. And then also the dashboards. I mean, there are so many dashboards that are simplified to work with, and it makes it easy and straightforward to configure. So the bootstrapping and the changes and the improvement on CloudWatch over time has made CloudWatch a go-to tool, and
Starting point is 00:07:23 most especially the integration with containers and Kubernetes. I mean, CloudWatch is one of the easiest tools to integrate with EKS, Kubernetes, or other container services that run in AWS. It's just more or less one or two lines of setup, and here you go with a lot of beautiful, interesting, and insightful metrics that you will not get out of the box. And if you look at other monitoring tools, it takes a lot of time for you to set up, for you to configure, for you to consistently maintain, and to give you those consistent
Starting point is 00:07:53 metrics you need to know what's going on with your system from time to time. The problem I always ran into was that the traditional tools that I was used to using in data centers worked pretty well because you didn't have a whole lot of variability from an hour-to-hour basis. Sure, when you installed new servers or brought up new virtual machines, you had to update the monitoring system, but then you started getting into this world of ephemerality with auto-scaling originally and then later containers and, God help us all, Lambda now, where it becomes this very strange back and forth story of you need to be able to build something that I guess is responsive to that. And there's no good way to get access to some of the things that CloudWatch provides just because we don't have access into AWS's systems the way that they
Starting point is 00:08:36 do. The inverse though, is that they don't have access into things running inside of the hypervisor. A classic example has always been memory. Memory usage is an example of something that hasn't been able to be displayed traditionally without installing some sort of agent inside of it. Is that still the case? Are there better ways of addressing those things now? So that's still the case, I mean, for EC2 instances. So before now, we had an agent called a CloudWatch agent. Now there's a new agent called Unify CloudWatch agent, which is a top notch from CloudWatch agent. So at the moment, basically, that still happens on the EC2 layer. But the good thing is when you're working with containers or more or less Kubernetes kind of applications or systems,
Starting point is 00:09:16 everything comes out of the box. So when containers, we're talking about a lot of moving parts, okay? The container themselves with their own CPU, memory, disk, all all the metrics and then the nodes or the ec2 instance or the virtual machines running behind them also having their own unique metrics so within the container world these things are just the click of a button everything happens at the same time as a single entity but within the ec2 instance and ecosystem you kind of still find this there, although the setup process has been a bit easier and much faster. But in the container world, that problem has totally been eliminated. When you take a look at someone who's just starting to get a glimmer of awareness around what CloudWatch is and how to contextualize it, what are the most common mistakes people make early on?
Starting point is 00:10:04 I mean, I also talked about this in my book. And one of the mistakes people make in terms of cloud watch and monitoring in general is, what am I trying to figure out? If you don't have that answer clearly stated, you're going to run into a lot of problems. You need to answer that question of what am I trying to figure out? I mean, monitoring is so broad. Monitoring is so large that if you do not have the answer to that question, you're going to get yourself into a lot of trouble. You're going to get yourself into a lot of confusion. And like I said, if you don't understand what you're trying to figure out in the first place,
Starting point is 00:10:36 then you're going to get a lot of data. You're going to get a lot of information and that can get you confused. And I also talked about what we call alarm fatigues or incident fatigues. This happens when you configure so many alarms, so many metrics, and you're getting a lot can get you confused. And I also talked about what we call alarm fatigues or incident fatigues. This happens when you configure so many alarms, so many metrics, and you're getting a lot of alarms hitting your notification services, whether it's Slack, whether it's an email,
Starting point is 00:10:53 and it causes a fatigue. What happens here is the person who should know what is going on with the system gets a lot of messages. And in that scenario, you can miss something very important because there are so many messages coming there so many notifications coming in so you should be able to optimize appropriately you
Starting point is 00:11:11 should be able to like you said conceptualize what you are trying to figure out what problems are you trying to solve um most times you really don't figure this out for a start but there are certain bare minimums you need to know about and that's part of what I talked about in the book. One of the things that I highlighted in the book when I talked about monitoring of different layers is when you're talking about monitoring of infrastructure, say compute services such as virtual machines or EC2 instances, there are certain baseline metrics you need to take note of that are core to the reliability, the scalability,
Starting point is 00:11:42 and the efficiency of your system. And if you focus on these things, you can have a baseline starting point before you start going deeper into things like observability and knowing what's going on internally with your system. So baseline metrics and baseline of what you need to check in terms of different kinds of services you're trying to monitor is your starting point. And the mistake people make is that they don't have a baseline. So when they don't have a baseline, they just install a
Starting point is 00:12:08 monitoring tool, configure CloudWatch, and they don't know the problem they're trying to solve. And that can lead to a lot of confusion. So what inspired you from, I guess, kicking the tires on CloudWatch and the way that we all do and being frustrated and confused by it, all the way to the other side of writing a book on it. What was it that I guess got you to that point? Were you an expert on CloudWatch before you started writing the book? Or was it, well, by the time this book is done, I will certainly know more about the service than I did when I started? Yeah, I think it's a double-edged sword. So it's a combination of the two things you just said. So first of all, I kind of have experience with other monitoring tools.
Starting point is 00:12:45 I kind of have love for reliability and scalability of a system. I started using Kubernetes at some of the early times Kubernetes came out when it was very difficult to deploy, when it was very difficult to set up, because I'm looking at how I can make systems a little bit more efficient, a little bit more reliable, you know, than having to handle a lot of things like autoskilling go through the process of understanding auto scaling i mean that's a school of its own that you need to prepare yourself for you know so first of all i have a love for making sure systems are reliable and efficient and second of all i also want to make sure that
Starting point is 00:13:17 i know what is going on with my system per time as much as possible the level of visibility of a system gives you the level of control and understanding of what your system is doing per time. So those two things are very core to me. And then thirdly, I mean, I had a plan of a streak of books I want to write based on AWS. And just like monitoring is something that is just new. I mean, if you go to the Packet website,
Starting point is 00:13:41 this is the first book on infrastructure monitoring in AWS with CloudWatch. You know, it's not a very common topic to talk about. And I have other topics in my head and I really want to talk about things like monitoring, things like, sorry, things like networking and other topics that you really need to go deep inside to be able to appreciate the value of what you see in there. With also scenarios, because in this book, every chapter I created a scenario of what a real life monitoring system or what you need to do looks like so being that i have those premonitions and all that when the opportunity came to you know to share with the world what i know in monitoring what i've learned in monitoring i took a grab of it and then secondly i also see as opportunity for me to
Starting point is 00:14:21 start telling the world about the things I learned and then I also learned while writing the book because there are certain topics in the book that I'm not so much of an expert in things like big data and all that I had to also learn I had to take some time to do more research to do more understanding so I use cloud watch okay I'm kind of good in cloud watch and also I also had to do more learning to be able to disseminate this information and also hopefully x-ray some parts of monitoring on different services that people do not really pay so much attention to. What do you find that is still the most, I guess, confusing to you as you take a look across the ecosystem of the entire CloudWatch
Starting point is 00:15:03 space? I mean, every time I play with it, I take a look and I get lost at, oh, they have contributor analyses and logs and metrics and it's confusing. And every time I wind up, I guess, spiraling out of control. What do you find that after all of this is a lot easier for you?
Starting point is 00:15:18 And what do you find that's a lot more understandable? I'm still going to go back to the containers part. Sorry, I'm in love of containers. No, no, it's fair. Containers are very popular. Everyone loves them. I'm just basically anti-container based upon no better reason than I'm just stubborn and bloody minded most of the time. So pretty much, like I said, I kind of had experience with other monitoring tools. Trust me, if you want to configure proper container monitoring for other tools, trust me, it's going to take you at least a week or two to get it properly from the dashboards to the login configurations,
Starting point is 00:15:54 to the piping of the data, to the proper storage engine. These are things I talked about in the book because I took monitoring from the ground up. I mean, if you've never done monitoring before, when you take my book, you will understand the basic principles of monitoring. And funny enough, monitoring has some big data process, like the ETL process, extraction, transformation, and writing of data into an analytic system, okay?
Starting point is 00:16:15 So first of all, you have to battle that, okay? You have to talk about the availability of your storage engine, okay? Whether you're using an Elasticsearch, you're using an InfluxDB, whatever you want to store your data in. Then you have to answer the question of, how do I visualize the data?
Starting point is 00:16:29 What method do I visualize this data? What kind of dashboards do I want to use? What methods of representation do I need to represent this data so that it makes sense to whoever I'm sharing this data with? Because in monitoring, you definitely have to share data with either yourself or with someone else. So the way you present the data needs to make sense i've seen graphs that do not make sense so it requires some
Starting point is 00:16:50 some level of skill like i said i've worked with other tools where i spent a week or two having to set up dashboards and then after setting up the dashboard someone was like um i don't understand we just need like two and you're like like, really? You know, because you spend so much time. And secondly, you discover that repeatability of that process is a problem because some of these tools are click and drag. Some of them don't have JSON configuration. Some do, some don't. So you discover that scalability of this kind of system, it becomes a problem. You can't repeat the dashboards. If you make a change to the system, you need to go back to your dashboard. You need to make some changes. You need to update your login tool. You need to make some changes to
Starting point is 00:17:32 across the layer. So all these things, it's a lot of overhead that you can cut off when you use things like container insights in CloudWatch, which is a feature of CloudWatch. So for me, that's like the part that you can really, really suck out so much juice from in a very short time, quickly and very efficiently. On the flip side, when you talk about monitoring for big data services and monitoring for a little bit of serverless, there might be a little steepness in the flow of the learning curve there, because if you do not have a good foundation in serverless when you get into lambda insights in cloud watch trust me you're going to be put off balance you know you're going to get a little bit confused and then there's also multi-function
Starting point is 00:18:13 insights at the moment so you need to have some some very good solid foundation in some of those topics before you can get in there and understand some of the data and the metrics that cloud watch is presenting to you and then lastly things like big data and the metrics that CloudWatch is presenting to you. And then lastly, things like big data too. There are things that monitoring is still being properly fleshed out, which I think that in the coming months, in years to come, they will become more proper and they will become more presentable than they are at the moment. This episode is sponsored by our friends at Oracle HeatWave,
Starting point is 00:18:42 a new high-performance query accelerator for the Oracle MySQL database service, although I insist on calling it MySquirrel. While MySquirrel has long been the world's most popular open-source database, shifting from transacting to analytics required way too much overhead and, you know, work. With HeatWave, you can run your OLAP and OLTP, don't ask me to pronounce those acronyms ever again, workloads directly from your MySquirrel database and eliminate the time-consuming data movement and integration work while also performing 1,100 times faster than Amazon Aurora and two and a half times faster than Amazon Redshift at a third the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense. The problem I've always had with dashboards is it seems like managers always want them. More dashboards, more dashboards. Then you check the usage statistics of
Starting point is 00:19:36 who's actually been viewing the dashboards, and the answer is no one since you demoed it to the execs eight months ago. But they always claim to want more. How do you square that? I guess slicing between what people ask for and what they actually use. So yeah, one of the interesting things about dashboards in terms of most especially infrastructure monitoring is the dashboards people usually want
Starting point is 00:20:00 is the revenue dashboards, trust me. That's what they want to see. They want to see the money going up, up, up, you know? So when it comes to- Oh yes, up and to the right, that everyone's happy. But CloudWatch tends to give you just very, very granular, low level metrics of things. It's hard to turn that into something executives care about. Yeah, which people barely care about. But my own take on that is the dashboards are actually for you and your team to watch, to know what's going on from time to time
Starting point is 00:20:25 but what is key is setting up events across very specific and sensitive data for example when any kind of sensitivity is flowing across your system and you need to check that out then you tie a metric to that and then you turn an alarm to it that is actually the most important thing for anybody i mean for the dashboards it's just for you and your team. Like I said, for your personal consumption, oh, oh, I can see all the RDS connections are getting too high. We need to upgrade, you know, or we can see that, oh, the memory, there was a memory spike in the last two hours and all that. That's for you and your team to consume, not for the executive team. But what is really good is being able to do things like aggregate data that you can share.
Starting point is 00:21:06 I think that is what the executive team would love to see. When you go back to the core principles of DevOps in terms of the DevOps handbook, you see things like a meantime to recover and change failure rates and all that. The most interesting thing is that all those metrics can be measured only by monitoring. You can't know your change failure rates
Starting point is 00:21:24 if you don't have a monitoring system that tells you when there was a failure. You can't know your release frequency when you don't have a metric that measures the number of deployments you have and is updated in a particular metric or a particular aggregator system. So, you discover that the four major things you measure in DevOps are all tied back to monitoring and metrics and being able to understand your system from time to time. So what the executive team actually needs is to get a summary of what's going on. And one of the things I usually do for almost any company I work for is to share some kind of uptime system with them.
Starting point is 00:21:58 And that's where CloudWatch Synthetic Scannery comes in. So Synthetic Scannery is a service that helps you calculate, that helps you check for uptime of a system. So it's a very simple service. It does a ping, but it is so efficient and it is so powerful. How is it powerful? It does a ping to a system and it gets a feedback. Now, if the status code of your service, it's not 200 or not 300, it considers it a downtime. Now, when you aggregate this data within a period of time say a month or two you can actually use that data to calculate the uptime of your system and that uptime is something you can actually share to your customers and say okay
Starting point is 00:22:35 we have an sla of 99.9 we have an sla of 99.8 that data should not be a doctored data it should not be a data you just cook out of your head. It should be based on your system that you have used, worked with, monitored over a period of time. So that the information you share with your customers are genuine, they are truthful, and they are something that they can also see for themselves. Hence, companies are using things like status page to know what's going on from time to time whenever there's an incident and report back to their customers so these are things that executives will be more interested in that just dashboards dashboards and more dashboards so it's more or less not about what they really ask for but what you know and what you believe they are going to draw value from i mean an executive in a meeting with a client and says hey we got a system
Starting point is 00:23:24 that has 99.9 uptime he opens the dashboard or he opens the uptime system and say, you see our uptime for the past three months, this has been our metric. Boom. That's it. That's value instantly. I'm not showing clients a bunch of graphs, you know, trying to explain the memory metric. That's not going to pass the message, send the message forward. Since your book came out, I believe, if not certainly by the time it was finished being written and it was in review phase, they came out with Managed Prometheus and Managed Grafana. It looks almost like they're almost trying to do
Starting point is 00:23:58 a completely separate standalone monitoring stack of AWS tooling. Is that a misunderstanding of what the tools look like? Or is there something to that? Yeah. So, I mean, by the time it was announced in re-event, I'm like, oh, snap. I almost did my public share. You know what? We need to add three more chapters. But unfortunately, it was still on review, in preview. I mean, as a hero, I kind of have some privilege to be able to request for that. But I'm like, OK, I think we're going to change the narrative of what the book is talking about. I think I'm going to pause on that and make sure this finishes with the premise of it.
Starting point is 00:24:32 And then maybe a second edition, I can always attach that. But hey, I think there's trying to be a galvanization between Prometheus Grafana and what CloudWatch stands for. Because at the moment, I think it's currently on pre-release. It's not fully GA at the moment. So you can actually use it. So if you go to Container Insights, you can see that you can still get how Prometheus and Grafana is presenting the data.
Starting point is 00:24:56 So it's more or less a different view of what you're trying to see. It's trying to give you another perspective of how your data is presented. So you're still going to have CloudWatch. It's going to have CloudWatch Dashboard. It's going to have CloudWatch Metrics. But hey, these different tools, Prometheus, Grafana, and all that, they all have their unique ways of presenting the data. And part of the reason I believe AWS has Prometheus and Grafana there is,
Starting point is 00:25:19 I mean, Prometheus is a huge cloud-native, open-source monitoring presentation analytics tool. It packs a lot of heat. And a lot of people are so used to it. And people are like, why can't I have Prometheus in CloudWatch? I mean, so instead of CloudWatch just being a simple monitoring tool, it makes CloudWatch become an ecosystem of monitoring tool. So we're not going to see CloudWatch as, oh, it's just so sorry, log, analytics, metrics, dashboards.
Starting point is 00:25:46 No, we're going to see it as an ecosystem where we can plug in other services and then integrate and work together to give us better performance options and also different perspectives to the data that is being collected. What do you think is next as you take a look across the ecosystem as far as how people are thinking about monitoring and observability in a cloud context? What are they missing? Where's the next evolution lead? Yeah, I think the biggest problem with monitoring, which is part of the introductory part of the book where I talked about the basic types of monitoring, which is proactive and reactive monitoring, is how do we make sure we know before things happen? And one of the things that can help with that is machine learning. There is a small ecosystem that is not so popular at the moment, which talks about how we can do a lot of machine learning
Starting point is 00:26:39 in DevOps monitoring observability. And that means looking at historic data and being able to predict on the basic level, looking at historic data and being able to predict on the basic level, looking at historic data and being able to predict. At the moment, there are very few tools that have models running at the back of the data being collected for monitoring and metrics, which could actually revolutionize
Starting point is 00:27:01 monitoring and observability as we see it right now. I mean, even the topic of observability is still new at the moment. It's still being integrated. Observability just came into cloud, I think, like two years ago. So it's still being matured. But one thing that has been missing is seeing the value AI can bring into monitoring. I mean, this much is going to tell us, hey, by 9 p.m., I'm going to go down. I think your CPI and memory is going down. I mean, this machine could practically tell us, hey, by 9pm I'm going to go down. I think your CPI and memory is going down. I think line 14 of your code
Starting point is 00:27:29 is the problem. Cuss in the bug. Please, you need to fix it by 2pm so that by 6pm things can run perfectly. That is going to revolutionize monitoring. That's going to revolutionize observability and bring a whole new level to how we understand
Starting point is 00:27:46 and monitor the systems. I hope you're right. If you take a look right now, I guess the schism between monitoring and observability, which I consider to be hipster monitoring, but they get mad when I say that. Is there a difference? Is it just new phrasing to describe the same concepts, or is there something really new here? In my book, I said monitoring is looking at things from the outside in, observability is looking at things from the inside out. So what monitoring does not see on the basic layer, observability sees. So they are children of the same mom. That's how I put it. One actually needs the other and both of them cannot be separated from each other. What we've been working with is just understanding the system from the surface. When there's
Starting point is 00:28:28 an issue, we go to the aggregated results that come out of the issue. A very basic example: you're running a Java application, and we all know Java is very memory intensive on a basic level. And there's a memory issue. Most times, the infrastructure is the first to be hit by the result of that. But the problem is not the infrastructure. It's maybe the code. Maybe garbage collection was not well managed. Maybe there are a lot of variables in the code that are not used, and they're just filling up memory locations unnecessarily. Maybe there's a loop that's not properly managed and properly optimized. Maybe there's a resource or object that has been initialized but has not been closed,
Starting point is 00:29:02 which would cause the heap memory to fill up. So those are the things observability can help you track. Those are the things observability can help you see, because observability runs from within the system and sends metrics out. While basic monitoring is about understanding what is going on on the surface of the system, memory, CPU, pushing out logs, without knowing what's going on inside, and all that.
Starting point is 00:29:24 So on the basic level, observability helps give you kind of a deeper insight into what monitoring is actually telling you. Monitoring is just like the result of what happened. I mean, we are told that the symptoms of COVID are coughing, sneezing, and all that. That's monitoring. But before we know that you actually have COVID, you need to go for a test. And that's observability. Telling us what is causing the sneezing.
Starting point is 00:29:49 What is causing the coughing? What is causing the nausea? All those symptoms come out of what monitoring is saying. Monitoring is saying you have cough. You have a runny nose. You're sneezing. That's monitoring. Observability says there's a COVID virus in the bloodstream.
Starting point is 00:30:06 We need to fix it. So that's how both of them are. I think that is probably the most concise and clear definition I've ever gotten on the topic. If people want to learn more about what you're up to, how you view these things, and of course, if they want to buy your book, we will include a link to that in the show notes. Where can they find you? I'm on LinkedIn.
Starting point is 00:30:26 I'm very active on LinkedIn. I also share the LinkedIn link. I'm very active on Twitter too. I tweet once in a while, but definitely when you send me a message on Twitter, I'm also going to be very active. I also write blogs on Medium. I write a couple of blogs on Medium, and that's part of why AWS recognized me as a hero, because I talk a lot about different services.
Starting point is 00:30:43 I help with comparing services for you so you can choose better. I also talk about explaining basic concepts too. If you just want to get your feet wet with some stuff and you need something very summarized, not the AWS documentation per se, something you can just look at and know what you need to do with the service, I talk about them also in my blogs. So yeah, those are the two basic places I'm in, LinkedIn and Twitter. And we will, of course, put links to those in the show notes. Thank you so much for taking the time to speak with me. I appreciate it. Thanks a lot.
Starting point is 00:31:14 Ewere Diagboya, head of cloud at My Cloud Series. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice, along with a comment telling me how many more dashboards you would like me to build that you will never look at. If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get
Starting point is 00:32:00 to the point. Visit duckbillgroup.com to get started. This has been a HumblePod production. Stay humble.
