Screaming in the Cloud - It’s like a HeatWave, Burning in my Heart with Nipun Agarwal

Episode Date: March 29, 2022

About Nipun: Nipun Agarwal is Senior Vice President, MySQL HeatWave and Advanced Development, at Oracle. His interests include distributed data processing, machine learning, cloud technologies, and security. Nipun was part of the Oracle Database team, where he introduced a number of new features. He has been awarded over 170 patents.

Links: Oracle: https://www.oracle.com

Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. This episode is sponsored in part by our friends at Vultr, spelled V-U-L-T-R, because they're all about helping save money, including on things like, you know, vowels.
So what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that, well, sure, they claim it is better than AWS's pricing. And when they say that, they mean that it's less money. Sure, I don't dispute that. But what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have 19 global locations and scale things elastically, not to be confused with openly, which is apparently elastic and open.
Starting point is 00:01:17 They can mean the same thing sometimes. They have had over a million users. Deployments take less than 60 seconds across 12 pre-selected operating systems, or if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vulture Cloud Compute, they have plans for developers and businesses of all sizes,
Starting point is 00:01:41 except maybe Amazon, who stubbornly insists on having something of the scale on their own. Try Vulture today for free by visiting vulture.com slash screaming, and you'll receive $100 in credit. That's v-u-l-t-r dot com slash screaming. Couchbase Cape Database as a service is flexible, full-featured, and fully managed, with built-in access via key-value, SQL, and full-text search. Flexible JSON documents align to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling scaling while reducing cost. Capella has the best price performance of any fully managed document database.
Starting point is 00:02:34 Visit couchbase.com slash screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella. Make your data sing. Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's promoted episode is a returning guest with a slight difference. When last we spoke, Nipun Agarwal was a VP over at Oracle, but now that's right. When people stay at a company long enough and perform well, they wind up getting additional adjectives in lieu of other things. Nipun, you're now a senior VP over at Oracle. Congratulations, I think, unless that just means you've gotten older. Welcome back.
Starting point is 00:03:16 Thank you, Corey. So now that you're at SVP level, I can ask some of the harder questions that we didn't necessarily, it seemed fair to get into the last time we spoke, such as what is an oracle and what might they do these days for folks who have, I don't know, been living in a cave for 40 years? Corey, glad to be back on your show. And since the last time we spoke, we have had like, you know, a lot of enhancements and innovations, and I'll be happy to describe those in detail whenever is a good time. Absolutely. So you've been focused on MySQL for a very long time. I mean, you've been using it so
Starting point is 00:03:52 long, I really should be calling it your SQL, but that's neither here nor there. And you've also been focusing on HeatWave, which is effectively MySQL with then some, I'm just going to cheat and call it magic, that is layered on top of it. That is probably a terrible descriptor of what it actually does, but understand I'm coming from a perspective where I firmly believe the best database in the world is, you know, Amazon Route 53, which is a DNS server. So people look at that and say, well, that's not really what it's designed to do, which really sounds like a them problem. And fair and fair enough, we're going to invert it here. So why is HeatWave a terrible DNS server? What is it exactly? So MySQL is the most popular database in the world. It's the most popular open source database
Starting point is 00:04:40 in the world. Lots of people use it. All the major cloud vendors, they take the MySQL database and either as is or with some enhancements, they offer a managed service, whether it's Amazon, Azure, Google, pretty much all the major cloud vendors. Now, MySQL has been designed and optimized for transaction processing. So it does a great job for transaction processing. But when customers need to run complex queries, or when they need to run analytics, customers would have to take the data out of the MySQL database into some other database for running analytics. Let me make sure I understand your terms properly. When you say transactional, you're talking about I'm shopping for underpants on a website. I go ahead and make a purchase
Starting point is 00:05:23 that's considered a transaction, and a database change reflecting my purchase makes sense. From an analytics perspective, you're like, all right, let's see who bought underpants during this time period. It's effectively usually a small individual record versus now we're going to start doing deep dives into effectively a lot of those records in aggregate. Is that directionally correct, or is my understanding more than a little flawed about things beyond DNS?
Starting point is 00:05:48 Right. What you described is very accurate, that transactional processing is about point queries making frequent changes, whereas when we talk about analytics, it typically involves scanning a much larger amount of data to get the results, And aggregations is a very good example of that. So historically, it seems that people have used very different tooling for different sides of those. Ideally, I admit back in the battle days when I was a systems administrator, we were running MySQL a fair bit and we had the primary database, which was the thing that handled all of the live transactions and the rest. And whenever we ran business reporting queries on it, it's like, huh, why is the website super slow?
Starting point is 00:06:28 And it didn't seem to work very well. Now, back then, at the scale we were operating at, the solution was, ah, we're going to use a replica, and then we're going to basically beat the crap out of the replica for our reporting queries. And if that gets a little slow and bogged down, who cares? Well, just other people running reporting queries, people can still buy underpants. So that was the way that we handled it back then. This was a decade ago. Datasets have gotten significantly larger since then. And apparently my way of viewing it is, as they say, quaint when they're trying not to be actively insulting.
Starting point is 00:07:01 The right way to do it these days is to have completely separate systems that wind up handling those queries with different user interfaces by and large. That is, to my understanding, the rise of big data. And you can hear the initial caps in big data when people talk about it like that. Correct. So what you describe is absolutely correct, that people would extract the data out of databases, take it to specialized databases which are apt for running decision-making and re-processing. But the downside is that A, people need to express the logic and write code to extract
Starting point is 00:07:36 this data and then customers end up with these two different databases. They got to keep the data in sync, they got to move the data periodically. There are a lot of issues in terms of having to manage two different databases, one for transaction processing, one for analytics. What we have done with HeatWave is to enhance the MySQL database service
Starting point is 00:07:58 in the Oracle Cloud so that now the single MySQL database is optimized both for transaction processing as well as analytics. So now you have a single database, and whether you want to run point queries or these aggregate queries, you can do it on the same data. So the data remains as is. You're bringing richness of computation, richness in query processing to the customers. One of the truisms of cloud is that it forces a re-evaluation, in many cases, of things that people historically hadn't had to think about.
Starting point is 00:08:38 A classic example, when I was consulting on cloud migrations, was building up costing models, as you might imagine. And my customers would ask me questions such as, great, so what's this going to cost us? And I would come back with, well, okay, how many gigabytes in a given month does transfer between this database and that other database, you know, in the machine sitting right next to it? And their response started off with a, why on earth do you think we would know that? Followed by, wait, why do we need to know that? Followed by, oh God, it costs us to do what? And very quickly, an architectural pattern has emerged within cloud of, you know, people
Starting point is 00:09:11 experience this the second time they plan for it. And as a result, whatever database is the most cost effective is the one the data's already in. Because moving data from point to point is inherently an expensive proposition. Depending on where the second point is, it can be an extortionately expensive proposition, which means that very often we'll start to see patterns that are, I guess, sacrificing one side
Starting point is 00:09:35 of the database interaction model or the other, that transactions are going to be a little slower because you need to have it in the same place you're going to be running large-scale analytics on, or alternately, analytics are going to be super crappy just because you have to wind up querying systems during downtimes and low periods. It just becomes a giant mess. Regardless of whether it's bad in one way, bad in another, or just expensive, it hasn't worked for people. And my sense is that that is what HeatWave is directly aimed at. Yes, indeed.
Starting point is 00:10:06 So there are multiple reasons why HeatWave is being so successful. One is the case that customers need a single database instead of having multiple. The second thing is there is absolutely no change required to MySQL applications. So the MySQL applications or MySQL-compatible applications work as is with this query accelerator HeatWave without any change. But the third reason why this is so popular is that HeatWave has been designed from the ground up for scalability, performance, and optimized for the underlying gear, which is the underlying cloud platform. As a result, it offers a very good price performance compared to any of the service we have run against. So not only is it providing the benefits of having a single database, no change to the application, but also it is extremely fast and low priced. And that's because a lot of technology innovations we did, like almost like over a decade, to build this scale-out system for analytic processing, which has been optimized for the underlying cloud commodity gear.
Starting point is 00:11:12 So help me understand, is HeatWave a effectively re-engineering of MySQL? Is it a completely separate layer that exists distinct from an existing MySQL database, or is it something else entirely? So we started off designing HeatWave separately as something ground up, which came out of many years of research and advanced development. And once we knew that we could scale up HeatWave for analytic processing, and it is very well optimized for the underlying hardware and such,
Starting point is 00:11:42 then we did the work of enhancing the MySQL database so that it could be integrated, right? So yes, it started off as a standalone effort from the ground up so that we didn't have to live any constraints of any existing code base so we could design it and optimize it, right, from the ground up to be the best possible. But then we integrated this thing
Starting point is 00:12:03 with the MySQL database so that the customers can use it without requiring any change to the application in terms of the semantics or any new syntax, right? So there's absolutely no new syntax and no change to the semantics for existing MySQL applications. So it gives you best of both worlds. So this has frequently been described in the context of a competitor to very, again, forgive the Amazonian focus. That's where I spend most of my time, usually complaining about things. But it's been positioned in some ways as a competitor to things such as RDS or Aurora, as well as Redshift or Snowflake, if we're stepping slightly outside that ecosystem. The challenge that I keep running into very often is that when I talk to customers using those systems, and yes, those systems invariably show up on the bill as one of the big numbers, regardless of how you slice it, it feels like their use case for each of those is very different.
Starting point is 00:12:58 It feels very much like half of those are aimed at purely transactional and half of them are aimed at the data warehousing story, the large amounts of data for analytics queries. And my default knee-jerk reaction whenever someone says, ah, we built a thing that does both of those super well, it's, yeah, I've heard this before. It was the HP multifunction printer where it does three things, none of them well. And no one has a multifunction printer that they liked three things, none of them well. And no one has a multifunction printer that they liked for the longest time because it's moving parts and computers and the devil in equal measure. And it's, okay, so you're trying to build something
Starting point is 00:13:34 that stands between two worlds, but it's easy to come away with the conclusion as a result that it's not the best of breed for either use case, but rather a series of trade-offs or compromises that are made to enable both use cases. I get the sense that that is not your impression of what you've built. Correct. And I'll give you a data point for that. And the data point is... Yay, data.
Starting point is 00:13:55 I love that, as opposed to your opinion's bad, because my opinion's good. No, no, coming with data is a great approach. Please continue. In terms of the customers who are using or adopting MySQL HeatWave, one of the largest segments of the customers who are migrating their production workloads from other databases or other services and coming to HeatWave are AWS customers who are migrating their production workloads from RDS or Aurora and are going production with MySQL HeatWave.
Starting point is 00:14:25 The fact that the customers are doing that is an evidence that there is some value to it. The reasons they are doing it is absolutely no change to their application. It is faster, it is cheaper. Now, in addition, what they find is that many of these customers were moving their data from Aurora or RDS into
Starting point is 00:14:45 Redshift or Snowflake for analytics. They don't need to do that. And that's an additional savings they get. But we have a lot of evidence that existing customers of MySQL-based services, definitely AWS, but even on other clouds and Aurora are migrating. And that's very encouraging for us that, hey, we should be doing something right for customers to want to migrate their workloads to MySQL HeatWave. You had a couple of announcements coming out about what's new and what's coming to HeatWave.
Starting point is 00:15:17 And one of the ones that we're talking about today is the idea of elasticity. Something you just said reminds me of a couple of years ago when Amazon had relatively recently brought out Aurora and they said much the same thing of, oh, it's super elastic. You don't have to take it down to make it bigger. And it's great. Well, you just talked about people removing data as they migrate somewhere else. And the question I had at the
Starting point is 00:15:39 time was, okay, great. So that's how the database in Biggins. That's great. How does it in Smolin, does that wind up having that same elastic property? And the response was a very defensive, well, why would someone ever do that? Data only gets bigger. And it's, yeah, well, you haven't worked with me in production where I accidentally drop a table now and again, and data does get smaller. And the answer for the longest time there was elasticity and auto scaling was basically unidirectional because that's what customers are asking for. Right. So I have to ask, when you say elasticity around heatwave, is that unidirectional or does it mean that, oh, now there's less data, so we're going to go back down again? It is bid-directional, so customers can upsize or they can downsize. Now, I have to say
Starting point is 00:16:28 that HeatWave is a highly scalable system. And what that means is that as customers add more nodes to the cluster, the performance of the system improves almost linearly with the number of nodes which have been added. So as a result, we have a lot of customers who start with a cluster size of certain number, and based on the workloads, they either add nodes or they reduce the number of nodes. It's a very common operation. People want to scale up and scale down. With the real-time elasticity feature we have introduced,
Starting point is 00:16:57 customers can do either operation and with absolutely no downtime. There's absolutely no time when the cluster is not available for queries or for DMS, right? So while the resize operation is going on, the cluster is fully available and customers can upsize to any number of nodes and downsize to any number of nodes.
Starting point is 00:17:19 As it scales in or scales out, is that effectively doing its own internal sharding and rebalancing of data under the hood, invisible to customers? Is there something else going on? How does this work? Right. So take the example that customer has, say, four nodes, and they want to add two more nodes.
Starting point is 00:17:35 There are a couple of interesting properties over here. We have a technique called super partitioning, by which we know exactly which are the blocks of data which have to be populated to the new nodes which have been added. However, one of the key design points of our elasticity is that there is no data movement between the nodes. All the data which has to be populated in the new nodes which are being added, is fetched from the object store, the OC object store. As a result, the existing cluster of four nodes is working as is, queries are working as is
Starting point is 00:18:11 without any degradation in performance. When the data has been populated to these additional nodes, the system then starts having the queries execute on the larger cluster. So the smaller cluster is available all the time and the larger cluster is available. So from a user's perspective, they see absolutely no downtime. And since there is no data movement happening from the initial four nodes, there is no degradation of the existing queries,
Starting point is 00:18:36 which will be running on the older cluster. It's 2022, and you're announcing enhancements to a technology. So of course, it is a given that you are now talking as well about machine learning. Now, in a general sense, whenever someone says that, my immediate instinctive reaction is to check my wallet in case someone is in the middle of picking my pocket because it seems like it winds up in some very weird places. What is machine learning and its applicability to heatwave? Because generally speaking, when I look at things you can use machine learning for, the answer is often finding signal from noise in large datasets
Starting point is 00:19:12 and, of course, the ever-popular bias laundering. But I get the sense that neither one of those is quite what you're talking about here. What monstrosity have you built? With MySQL heatwave, customers are bringing in more data from either consolidating multiple MySQL databases into one, bringing workloads from other databases into MySQL. But the volume of data which now customers are putting into MySQL HeatWave is growing because they want to run transaction processing analytics all together in one database. Now, as the size of the data is growing, we are finding that many customers want to extract the data or currently need to extract the data out of the MySQL database to run machine learning processing. Some of the very large customers of MySQL HeatWave have been
Starting point is 00:19:58 using HeatWave very successfully for transaction processing and analytics, but they had to extract the data out to some other ecosystem, to some other service for machine learning processing. With the announcement we have made, which is HeatWave ML, we are now providing in database support
Starting point is 00:20:17 for machine learning, meaning that customers of MySQL HeatWave can do training, inference, as well as explanations, all inside MySQL HeatWave without the data or the model ever having to leave MySQL. This is something which is fairly unique. Apart from the Oracle database,
Starting point is 00:20:40 I'm not aware of any other database which provides in-database machine learning capabilities, and certainly not as rich, right? Which is very efficient training, inference, and explanations. And all models which are created by HeatWave ML inside MySQL HeatWave can be explained, which is a pretty important capability which enterprise customers like to have. This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps.
Starting point is 00:21:11 They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They've also gone deep in depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That's s-y-s-d-i-g dot com. My thanks to them for their continued support of this ridiculous nonsense. what does this wind up empowering customers to do? Do you have an example or two? Just because it's easy to talk about this stuff in the abstract as far as, oh, it would theoretically let someone do X, Y, or Z. But the problem I found, generally speaking, in the world of machine learning is that it is challenging to articulate it
Starting point is 00:21:56 in a way that people hear the story and think, hey, that looks like something I might want to do, as opposed to the common stories are, well, if you have a world-spanning data set and want to do this, this, the common stories are, well, if you have a world-spanning data set and want to do this, this, and this, like, well, I don't, and I don't, and I don't, and I don't. So what value is it to me? What capabilities does it unlock? Right. So with the introduction of HeatWave, what we had said is that customers don't need multiple databases, one for transactional processing, one for analytics. They can do both transactional
Starting point is 00:22:22 processing and analytics with one database, right? That's what we started off with. Now the same thing holds true for machine learning. Current customers of most databases need to extract data out of the database for doing machine learning. And we are saying, hey, whether it's now OLTP, analytics, mixed workloads, or machine learning, your data can all be inside MySQL, MySQL HeatWave, and you can do all the processing with that service.
Starting point is 00:22:48 Now, the kind of capabilities customers like to have for machine learning, training is the most important one. And training is a very time-consuming operation. And typically, when customers do training and they're using some other service, it's time-consuming and it is very expensive as well. One of the very interesting properties here is that when you're running machine learning inside HeatWave, you don't need to provision any additional cluster, or you don't need to have any custom gear. This machine learning training is happening on
Starting point is 00:23:22 the same cluster which the user has provisioned for analytics or for transaction processing. So on the same hardware, on the same cluster, now they can run machine learning processing. So the kind of use case which you are asking is when customers have this data, and I'll walk you through an example. Take the case of credit card. If a bank wants to determine whether they want to deny someone a credit card or approve it, it's based on some characteristics.
Starting point is 00:23:52 Many of the times people use a rule-based mechanism, but now with data-driven approaches, people want to look at a lot of data and the system makes a recommendation that, yes, this person is appropriate for granting the loan or not. And this is something for which customers or the enterprises want to have rich models, which accurately provide a characterization of the data so that they can make the right predictions. So training is very important because you want to get the training be done right on the data because it influences the quality of the predictions which are being made. And once a prediction is made, there may be reasons, like there could be regulatory compliance reasons,
Starting point is 00:24:34 because of which the enterprise may need to offer an explanation that why was the credit card denied, just to kind of make sure that there wasn't any bias or unfairness. And that's where machine learning explanation capabilities are also very helpful. So this is an example where someone goes for applying for a credit card, whether it's rejected or approved. Another example is that when someone is making a call, like a marketing team is making a call, and the system wants to predict that will a call lead to a successful outcome or not, right? That's another example.
Starting point is 00:25:07 So machine learning is being used pretty now extensively, and one of the advantages of a database is a database is where there's a lot of data. So it's a very, very good opportunity to harness this data using machine learning because machine learning is really tied to the richness of data and to the amount of data someone has. That makes a lot of sense. It definitely shines a light at, if not the easy answer for a lot of those questions, directions that people are going to have a better time of mapping to their specific use cases. One that I think is easier for everyone to map to a specific use case is another component of what you folks are announcing, which is cost reduction, which is, to be direct, not something people generally think
Starting point is 00:25:50 of Oracle as the first example of a company that's like, ah, that's the thing that's going to cost me less money. And to be clear, I have no problem with that. I pride myself on absolutely not being the least expensive answer to basically anything. But it is an interesting direction to go in. There are a few ways you can wind up saving folks money. Which path have you folks taken? Now, there are multiple ways in which we can reduce the cost for the customer. So one thing to realize is MySQL customers are very cost sensitive. And in the previous benchmarks and results we have shown, we have shown that compared to other Windows, we are significantly faster,
Starting point is 00:26:28 the heatwave is significantly faster and significantly cheaper. We had a class of customers come to us saying, hey, you know what, can you trade off some performance for even lower cost? The way we have done is the following. We have doubled the amount of data which can be processed on a heatwave node. Heatwave is an in-memory system.
Starting point is 00:26:49 The size of the cluster depends upon the amount of data which is being processed, and it depends upon the amount of data which can be processed per node. If you double the amount of data that can be processed per node, it means that now customers need a cluster half the size compared to what they were doing in the past, which reduces their cost by half. Now, please note, when they're running on a cluster half the size, the amount of time it takes to run the same query
Starting point is 00:27:16 will double. So what it means is the system is providing the same price performance because half the cost at double the time. But it's a choice the customers have. If they still want to get the same performance slice earlier, they can continue to run on the larger cluster, but now they have a choice. So in a way, we are providing an even lower entry point for customers. That's the first part of cost savings. And it makes sense because with a lot of the workloads you see where it's nice to be able to run analytics on the same type of data, you don't need the same level of responsiveness on a lot of those queries either. So we're trying to get an answer to this giant analytics query. Okay, so great.
Starting point is 00:27:54 How quickly do you need it? What transactions are measured in fractions of a second? The answer to analytics queries is, well, Tuesday would be nice. We'd like it by Tuesday if you can find a way to pull that off. So there's no reason to pay for near-line rate speeds if you don't need it for a lot of those queries, which is absolutely going to be an interesting option for folks. Now, you said there was a second aspect as well. Yes.
Starting point is 00:28:16 And the second aspect is, again, for analytics, right? Customers want to run the queries. They want to run it occasionally. They don't want to run it all the time. So what we are now introducing is a feature called Pause and Resume. What it does is that if you're not using the cluster, you can pause and the system makes a copy of the data and all the metadata associated with
Starting point is 00:28:34 the data in a backup and when the user wants, they can resume and fetch the data, which is still in the in-memory representation, and all the metadata associated with Autopilot, and just start resuming. So this is another way by which customers, when they're not using the cluster for some duration
Starting point is 00:28:51 of time, they can pause it, and for the duration they pause it, they're not being charged. I am a big believer of the number one step of cloud economics is like, oh, should I buy some reservations or lock in a long term contract? No, you should turn things off when you're not using them. And people look at you strange as in, what? You can turn things off? And yes, you absolutely can, which makes people feel
Starting point is 00:29:13 better about generally not doing it. But again, customer behaviors are usually ones that make sense in their context. I just look at it from a billing perspective, and it seems a little weird. I like the option, particularly for things that are either non-production or only going to be relevant to production during certain time windows. There are a number of areas where that begins to make an awful lot of sense, and people would do it if it didn't require backing up the database, destroying the cluster, then reprovisioning the database, restoring the cluster, and yet people don't generally have weeks to spend on spin-up and spin-down. Yes. In fact, that's a very, very good observation, Corey. I want to say that many of our customers who are running the production workloads on HeatWave, they also have a test environment.
Starting point is 00:29:53 And exactly on the lines of what you said, that they want to have a copy of this data in the test environment should something bad happen, but they don't want the cluster on all the time. They just want it for some duration of time. And for them, this pause and resume would be a very good idea and also save them money. So something which we have seen with many of our customers. The last component of your announcement is one that I approach with a significant amount of skepticism. Because every time I start drifting in this direction, one thing is for certain that it's that I'm going to get yelled at on the internet. I'm referring, of course, to benchmarking. Now, Oracle historically has been a company that prefers people not benchmark and publish results of those benchmarks, and it's backcoding into the mists of history. And the argument has always been that people don't generally tend to benchmark
Starting point is 00:30:40 database workloads appropriately due to a series of misunderstandings. And let's be clear, this stuff is complicated. And a number of companies in the space love to talk about their benchmarks are great. And when you look into it, it's okay, those numbers are great. And you sort of know that the benchmarks that didn't perform so well are not the ones that they're talking about. And then their competitor immediately winds up chiming in where it's, ah, they're talking about. And then their competitor immediately winds up chiming in, where it's, ah, they're doing it wrong, because when you do these other benchmarks, our solution winds up being better. And it winds up in a nerd slap fight that no one, even the participants, particularly enjoy. What makes your benchmarks interesting is that
Starting point is 00:31:19 you talk through not just what the benchmark results are, because, of course, that's the entire point. You're also putting the benchmark methodology and tooling up on GitHub where people can grab it and run it themselves and see for yourself is the entire approach. That is, how do I put this politely, that is atypical of large companies in general and Oracle in particular. What changed?
Starting point is 00:31:46 Right. So there are three things over here, Corey. The first thing is, as we talked about, MySQL is the most popular open source database in the world. Pretty much all cloud vendors, they have some version of MySQL which they're offering as a managed service. And in many cases, they're enhancing MySQL and then offering their service. In the context of MySQL,
Starting point is 00:32:07 it becomes very important for us to give the opportunity to our customers for them to compare which service is better for their needs. It is more important in the context of MySQL, since everyone is offering it and some of them have derivatives that we provide some mechanism for people to compare. That's the trust for having a benchmark.
Starting point is 00:32:26 That's the first point. Second thing is, when you want to compare the performance of the cost of these various flavors, instead of us coming with our own, say, workloads, which we see from customers, it's good to have a well-published benchmark or well-understood benchmark so that people can say, okay, you know what? Based on TPC-H, what is the performance? Or on TPC-DS, what is the performance? In some cases, when a benchmark isn't available, what we have done is for machine learning, we have used a bunch of open datasets, and based on those open datasets, we are publishing the benchmarks to say, hey,
Starting point is 00:32:58 we are so much faster or so much cheaper. And then the third aspect is in terms of why we are making them all available in GitHub or open source, that these benchmarks are a starting point, but customers will have workloads which are different from these benchmarks. So we want to provide the opportunity for the customers to first look at what is our methodology, what have we used to come up with these numbers, so A, they can reproduce them, But B, if their workloads are different, they can enhance or augment these benchmarks in the way they would like and then run them to see how do they compare, right?
Starting point is 00:33:31 So we want to be fully transparent about what we have done, how we have done, and let customers decide on their own which is going to be the best platform from a cost perspective, from a performance perspective. So this is the reason why we have chosen to benchmark and get
Starting point is 00:33:48 Omnic available all of our scripts in the open source. One of the things I think I admire the most about that is I've always viewed benchmarks as being borderline worthless because I do not care in the slightest how your system performs on hand-selected ratings on sample data that you provide, whereas I care everything for how the system performs with my workloads and my data sets. So unless I am talking to someone who is effectively a neutral third-party benchmark source, in which case they are immediately attacked for being shills for one company or
Starting point is 00:34:24 another and sometimes both or neither at the same time because people are terrible. But seeing how it runs on my workloads and with my constraints is the important and valuable thing. And this is the easiest I can ever see it being for getting a good representative feel for exactly how different offerings are going to perform under the specific conditions that my production environment lives within. Because it's me, we're talking about the specific conditions of my production environment are, of course, terrifying. So I want to point out, yes, one is the fact that we have made these benchmarks methodology like, you know, very transparent. But the second aspect of that is what we talked about last time, which is MySQL autopilot. This is machine learning
Starting point is 00:35:06 based automation, data-driven based automation. So we are very actively working on making it easy for customers to not have to do any configuration changes or optimizations that the system determines based on the queries, based on the workloads, how to best tune the system. So we're working on both angles. One is to make the system more intelligent so that based on the queries, based on the workloads, how to best tune the system, right? So we are working in both the angles. One is to make the system more intelligent so that based on the workloads, the system can optimize for the user's workload, and then B, making our approach very transparent
Starting point is 00:35:35 so that customers can compare for themselves. So we are very, very aware of this. And again, for MySQL customers, for many of these open source customers, simplicity is very important. And we are working hard to make it simpler and transparent to our users. I really want to thank you for taking me on a tour of what you're announcing today. Now, so let me ask one of the forbidden questions.
Starting point is 00:35:58 What's on the roadmap? What's coming that customers can look forward to? So one of the things which we are working on is that there has been a very good reception of the heatwave capabilities you've introduced. So MySQL heatwave is one of the fastest growing services in the Oracle Cloud. But there has been a lot of interest in customers who have been asking us to provide similar capabilities
Starting point is 00:36:20 on AWS. So this is something which we are working on. It's in the roadmap. And please stay tuned for more news around this. You can bet that I will. I really want to thank you for taking the time out of your day to basically suffer my slings and arrows and also spend time teaching what amounts to a remedial database course to a moron. But thank you once again for being as generous with your time as you always are. Well, thank you, Corey. It's always a pleasure to come and talk at your show. Thank you again for the opportunity.
Starting point is 00:36:51 Always. Nipun Agarwal, SVP at Oracle in charge of MySQL, U-SQL, and HeatWave. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice and explain how databases always fail your personal benchmark of doing a select star on a terabyte of data at once. If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duck Bill Group. We help companies fix their AWS bill
Starting point is 00:37:34 by making it smaller and less horrifying. The Duck Bill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started. This has been a HumblePod production. Stay humble. this has been a humble pod production stay humble
