In The Arena by TechArena - The In-Memory Future of Applications with Oracle

Episode Date: April 10, 2023

TechArena host Allyson Klein chats with Oracle VP Shasank Chavan about in-memory databases, customer demands in a data-centric world, and how infrastructure must change to fuel customer needs.

Transcript
Starting point is 00:00:00 Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena. Welcome to the Tech Arena. My name is Allyson Klein, and today we're coming to you from MemCon in Mountain View, California. I'm delighted to have Shasank Chavan, Vice President of the In-Memory Data Technologies Group at Oracle, with me. Welcome to the program.
Starting point is 00:00:40 Hi, thank you. Thank you for having me. So Shasank, let's just get started and define what it means to be the vice president of the In-Memory Data Technologies Group and what purview you have with the company. So I lead a team of engineers and researchers working on the Oracle database product, where we develop performance-critical and customer-visible features running on in-memory technology. And I've been at Oracle for over 10 years now. When you think about in-memory technologies, and I was preparing for this interview, it took me back to Oracle TimesTen and all of the innovation that happened maybe a decade plus ago to bring some esoteric designs into computing to drive databases into memory. And really, that was the first time that I'd ever heard of in-memory computing to begin with. Why don't you tell the audience a little bit about that history in this space and why in-memory
Starting point is 00:01:41 has been such an important capability for your customers? Yeah, absolutely. So, Times 10 has been around for, I think, since the 90s. It was developed at HP. It's now part of Oracle. I think Oracle bought Times 10, I don't know how many years ago, maybe 10 years ago, maybe around that timeframe. And Times 10 was one of the very first in-memory databases, as you mentioned. It's a, you know, 100% relational database where it is designed to,
Starting point is 00:02:11 you know, give you very fast, low latency access. And it was really, again, meant as it's a relational database, you got transactional processing, query processing, and it's embedded in the application itself. And so M-SAN is really sitting in this application tier where it's used in the financial and telecommunications, where low latency response times is critical. As I mentioned, that was one of the very first innovations in relational databases with memory. And back in 10 some years ago, maybe 15 years ago, you started seeing a push for in-memory columnar data engines in databases like SAP HANA and with Oracle, Oracle Database and Memory. That's the product that I work on. And there we're moving again in memory
Starting point is 00:02:57 technologies into the database tier where your data of residing at rest as well as where, you know, you are persisting your data. So it's not sitting close to the edge like that times 10. And it primarily came out of the need from where the industry was going, where industry was looking at being having real time access to data. And so accessing data over slower storage wasn't efficient, wasn't working for the use cases, for the workloads that enterprises needed and need to this day. So enterprises need real-time access to data and storing your data in memory enables that. So that's kind of been the history where things are. And you couldn't have in-memory databases before in part because we didn't have the memory capacities that we have now. We didn't have the compute capacities that we have now. And also,
Starting point is 00:03:50 as I mentioned, it's just an explosion of data now. And so real-time access to this data is becoming more and more important. Well, we're at MemCon, so I'm going to ask you a bunch of questions about data availability and what we're seeing in databases. When you look at how organizations are using their databases today, and everybody's talking about more data, you just talked about it, more applications of data, where do you see this particular segment of application development and other customers seeking today that might be different than they were even five years ago? So as I mentioned, so they, customers, we believe,
Starting point is 00:04:26 enterprise customers are basically real-time, you know, they want real-time access to data. They want it immediately. An example is, you know, fraud detection. A customer, you know, is making a transaction with their credit cards, and immediately we need to determine if there's any fraud associated with that transaction.
Starting point is 00:04:47 And so that's real-time responses. And we feel that that capability is in many sectors. Again, the financial sector, retail, communications, and so on. So that's one factor, which is their workloads are also changing. So nowadays, customers have a variety of different types of workloads. It could be relational as we have, you know, or as we have in the past, but you also have machine learning workloads, you have graph-based workloads, document-based workloads. So we believe customers are looking for solutions across these different workloads.
Starting point is 00:05:22 And one of the things that we find that customers want is they want a one-stop shop. They want what we call a converged database to reduce the complexity of the solution that they have and that they have to support. And so they want a single database like Oracle that will support all these different workloads. And it reduces the complexity as well as reduce the costs.
Starting point is 00:05:44 And to get the performance that they need, and it reduces the complexity as well as reduce the costs. And to get the performance that they need, that's where these memory technologies come into place, the increase in compute capacities and so on that Oracle is trying to leverage. Now, when you look at that landscape, tell me a little bit more about what Oracle is doing in terms of working with the industry on some of the emerging technologies that will push you further and looking at how they would apply to the data optimization itself. When you look at some of those technologies, you know, like CXL, for example, a topic at MemCon,
Starting point is 00:06:21 will that require code changes to the database? Good question. So, you know, CXL right now is still very much in the early phases, even though, you know, we're at CXL standard 2.0 moving into 3.0. There aren't any real solutions right now in production at the moment. So everybody's kind of looking at this technology. Everybody believes that this is the next wave that we want to hop onto. What it provides is tremendous in terms of, you know, increasing memory capacities with memory pooling, increasing access to accelerators, providing shared memory access and shared compute access. Will it require changes? Certainly it
Starting point is 00:07:04 will. And that's one of the big question marks that we have is, what is the programming interfaces to these devices and how seamless would it be from the application layers, from the application tier and the database to be able to access these devices? And it's not just the programming capabilities of accessing it. What is the functionality that's actually going to deliver?
Starting point is 00:07:26 And can it also deliver on the performance that we need? So right now you're accessing memory through these DRAM DIMMs. And now we're going through a CXL interface. And that's going to add hundreds of milliseconds to your latency. We don't know exactly what the performance implications really will be once we start evaluating against it. So those are a lot of the concerns that we have, performance concerns about being able to program against these CXL devices. There's still an availability story about these devices. If you've got memory spread across multiple nodes, one of those nodes
Starting point is 00:08:06 goes down. How does that affect your application? So as a database company, obviously that matters. Availability is a big factor for us adopting newer technology. That being said, we're very excited about what CXL can deliver. Oracle as a background is always very quick to investigate and utilize emerging technology. As an example, Oracle was one of the first companies to start investigating and building technologies around Intel's Optane persistent memory in full capacities. from expansion of memory for our database and memory product, where we can build a much larger column store by just using persistent memory without making any changes. Intel had a mode called memory mode, where you had a DRAM DIMM that sat in front of persistent memory DIMMs to serve as a cache. And so we can get the expansion of memory that persistent memory DIMMs provided without a loss of performance because we got really good caching behaviors going to the DRAM
Starting point is 00:09:13 DIMM. And the beauty behind that solution was that there were no changes that were needed in that particular case. However, we also used Intel Optane's technology in our X data storage nodes, where we use these persistent memories as switches, again, for very fast access to transactional data. So Oracle has always been on the forefront of these emerging technologies, both from the storage side and memory side, as well as on the compute side. As an example there, Oracle purchased Sun Microsystems. I don't know when that was, like early 2000s or late 2000s, I think. One of the first things that Oracle did was try to engage with the hardware team to say, how do we build a new processor that is optimized for running database queries? How do we build some software in silicon, as we called it,
Starting point is 00:10:04 to optimize query processing specifically for analytic workloads? And we built this accelerator called DAX that was one of the first of its kind. Now you see these technologies used, for example, on Amazon with Aqua. But prior to that, Oracle had developed this as a database accelerator. And so Oracle wants to adopt these emerging technologies where it makes sense. And so that's what we're investigating with CXL Technologies. Now, from my time at Intel, I know the depth of collaboration between the Intel team and Oracle teams and really taking the most out of hardware and applying it to the database. And I'm sure you
Starting point is 00:10:43 take the same approach with other infrastructure players. Tell me how that plays out in your mind in the CXL era. And where do you see the opportunity to engage? And what does that look like for collaborations with the ecosystem? Right. That's a great question. So yes, we definitely want to collaborate very closely with chip vendors, with various organizations that are building these CXL devices. Where we're interested in, of course, is looking out on behalf of our workloads, our analytic workloads, our transactional workloads, and making sure it makes sense to use with CXL technologies like with the MIPooling and sharing of accelerators. So there, again, it's one of these things where we want to make sure that we can prove that the technology makes sense.
Starting point is 00:11:32 For example, with memory pooling, the concept is great. There's this thing called memory stranding, which there's a paper that was written recently that claimed something like 25% of memory is unused. It's the memory stranding effect, even when all of the CPU cores are rented out in a cloud data center environment. And so that's wasted memory, which means wasted dollars. And so how do we better utilize and oversubscribe, if you will, memory? Now, that's where memory pooling comes into place and that's where the advantages are. However, are there other solutions that can address that problem? I mentioned oversubscription being one of them. Another one is VM migration. In live VM migration, if a particular VM needs more memory, go ahead and migrate it at that point,
Starting point is 00:12:26 pay the cost of doing it and migrate to another server with more memory capabilities. Normally what we see today, it's, it's kind of a majority of it. A VM that needs to dynamically scale because of its workload is primarily a CPU. It needs to scale to get more CPU, necessarily more memory. And with VM migration, you can get both. You can get more memory, you can get more CPU. So that being said, again, that's a very heavy cost. So how does that compare to expanding your memory allocations through a memory pool across multiple servers?
Starting point is 00:13:01 That's something that we would work closely with chip vendors to understand, you know, where does it make sense? A lot of these emerging technologies, you know, we get worried that is this a solution in search of a problem or do we have a problem and then hence finding the solution. And that's something that having spent a lot of time here at Oracle, we spent a lot of time making sure that the technology fits the problems that we have and not just adopt the technology just because it's the coolest, coolest thing on the horizon at the moment. The other thing was accelerators. So the great thing about CXL is also that you have shared accelerators. You can offload your workloads to an accelerator attached to a CXL device and that is also very interesting but alternatives and the question is which alternative which which solution is better so an alternative to that is simply you know moving your workload to a server that has a lot more cores on it these CPUs are becoming cheaper and cheaper with more and more cores and a lot more memory channels to access memory and get higher bandwidth. So which one's better? Offloading work
Starting point is 00:14:14 to an accelerator or offloading work to a server with many, many cores, cheaper cores, ARM cores, for example. When you look at your broad customer base, and I mean, Oracle is pretty much synonymous with database, so the customer base is very broad. What else do you see customers thinking about in terms of how they're applying their databases as we move into the AI era and more customers are paying acute attention to the value of that data sitting in databases? Yeah, I think customers are fundamentally thinking about costs. They obviously want to lower their costs, which is why a cloud solution makes a lot of sense. They're concerned about security, that this data is sitting out in the cloud.
Starting point is 00:15:01 You know, they want to make sure that that data is secure throughout all of the different tiers, storage and memory including, especially when you talk about now memory pooling, your data is sitting everywhere. They want to make sure that data is secure. I think that customers, again, along the lines of cost, want to make sure that they also are looking at simplicity. They don't want these complex solutions that will potentially add more and more cost to them in the long term in terms of maintenance and service. So as I mentioned, Oracle is really big on this converged database story where it's a single database that supports a variety of different workloads. You don't need to have a specialized database for your
Starting point is 00:15:42 specialized workloads. And of course, performance is super important. And that's where all of these technologies, whether it be for memory expansion or for accelerators, come into play. So I think customers are really looking for better performance at lower costs and making sure it's secure. That's fantastic. I think that those statements resonate with everything we've heard at the conference in terms of the core capabilities that customers are looking for. One final question for you before we leave. I'm sure we've piqued folks' interest about what Oracle is doing in the in-memory database arena. Where can folks find out more about the Oracle database and the solution portfolio that you're
Starting point is 00:16:30 delivering both on-prem and in the cloud and connect with your team? Yep, absolutely. So you can do a search online. You can search for Oracle database in memory, and you will find a lot of websites there with different blogs and web pages on our technologies and the way to contact us. You can also reach out directly to me on LinkedIn at Shashank Chavan, and I'll be happy to communicate with you through LinkedIn. Shashank, thanks so much for being on the program today. It was great talking to you. Thanks for taking some time out of the conference.
Starting point is 00:17:08 Great. Thank you for having me, Allison. Appreciate it. Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net. All content is copyright by The Tech Arena.
