In The Arena by TechArena - The In-Memory Future of Applications with Oracle
Episode Date: April 10, 2023
TechArena host Allyson Klein chats with Oracle VP Shasank Chavan about in-memory databases, customer demands in a data-centric world, and how infrastructure must change to fuel customer needs.
Transcript
Welcome to the Tech Arena,
featuring authentic discussions between
tech's leading innovators and our host, Allyson Klein.
Now, let's step into the arena.
Welcome to the Tech Arena.
My name is Allyson Klein, and today we're coming to you
from MemCon in Mountain View, California. I'm delighted to have Shasank Chavan, Vice President of the In-Memory Data Technologies Group at Oracle, with me. Welcome to the program.
Hi, thank you. Thank you for having me.
So Shasank, let's just get started and define what it means to be the vice president of the In-Memory Data Technologies Group and what purview you have with the company.
Sure. I lead a group of engineers and researchers working on the Oracle database product, where we develop performance-critical and customer-visible features built on in-memory technology. And I've been at Oracle for over 10 years now.
When you think about in-memory technologies, and I was preparing for this interview, it took me back to Oracle TimesTen systems and all of the innovation that happened maybe a decade-plus ago to bring some esoteric designs into computing to drive databases in-memory.
And really, that was the first time that I'd ever heard of in-memory computing to begin with.
Why don't you tell the audience a little bit about that history in this space and why in-memory
has been such an important capability for your customers?
Yeah, absolutely.
So, TimesTen has been around, I think, since the 90s.
It was developed at HP.
It's now part of Oracle.
I think Oracle bought TimesTen, I don't know how many years ago, maybe around that timeframe.
And TimesTen was one of the very first in-memory databases, as you mentioned. It's a 100% relational database designed to give you very fast, low-latency access. It's a relational database, so you get transactional processing and query processing, and it's embedded in the application itself. And so TimesTen is really sitting in this application tier, where it's used in the financial and telecommunications sectors, where low-latency response times are critical. As I mentioned, that was one of the very first innovations in relational databases with memory.
And some 10 or 15 years ago, you started seeing a push for in-memory columnar data engines in databases like SAP HANA and, with Oracle, Oracle Database In-Memory. That's the product that I work on. There we're moving in-memory technologies into the database tier, where your data is residing at rest and where you are persisting your data. So it's not sitting close to the edge like TimesTen. And it primarily came out of where the industry was going: the industry was looking at having real-time access to data. Accessing data over slower storage wasn't efficient; it wasn't working for the use cases and workloads that enterprises needed, and need to this day.
So enterprises need real-time access to data and storing your data in memory enables that.
So that's kind of been the history of where things are.
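To make the columnar part of that history concrete, here is a minimal sketch in plain Python (an illustration of the general idea, not how Oracle Database In-Memory or TimesTen is implemented) of why an in-memory column store favors analytics: an aggregate walks one contiguous column instead of touching every field of every row.

# Row format: each record stored together; an aggregate touches every field.
rows = [{"id": i, "region": "US", "amount": float(i)} for i in range(100_000)]
total_rows = sum(r["amount"] for r in rows)

# Column format: each attribute stored contiguously; the same aggregate
# scans only the "amount" column, which is cache-friendly and easy to vectorize.
columns = {
    "id": list(range(100_000)),
    "region": ["US"] * 100_000,
    "amount": [float(i) for i in range(100_000)],
}
total_cols = sum(columns["amount"])

assert total_rows == total_cols  # same answer, far less memory traffic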
And you couldn't have in-memory databases before, in part because we didn't have the memory capacities that we have now. We didn't have the compute capacities that we have now. And also, as I mentioned, there's just an explosion of data now, so real-time access to this data is becoming more and more important.
Well, we're at MemCon, so I'm going to ask you a bunch of
questions about data availability and what we're seeing in databases. When you look at how
organizations are using their
databases today, and everybody's talking about more data, you just talked about it,
more applications of data, what do you see this particular segment of application development and other customers seeking today that might be different from what they sought even five years ago?
So as I mentioned, we believe enterprise customers basically want real-time access to data. They want it immediately.
An example is fraud detection. A customer is making a transaction with their credit card, and immediately we need to determine if there's any fraud associated with that transaction. So that's real-time response. And we feel that that capability is needed in many sectors.
Again, the financial sector, retail, communications, and so on.
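As a toy version of that fraud-detection pattern (the rules and names here are hypothetical, purely for illustration), the point is that the hot profile data lives in memory, so each check is a lookup and a comparison rather than a round trip to storage:

import time

# Hypothetical in-memory profile store: card id -> recent spending profile.
profiles = {"card-123": {"avg_amount": 42.0, "home_country": "US"}}

def looks_fraudulent(card_id, amount, country):
    # Illustrative rules only; a real system would apply a trained model.
    profile = profiles.get(card_id)
    if profile is None:
        return True  # unknown card: flag for review
    return amount > 10 * profile["avg_amount"] or country != profile["home_country"]

start = time.perf_counter()
flagged = looks_fraudulent("card-123", 2500.0, "US")
micros = (time.perf_counter() - start) * 1e6
print(f"flagged={flagged} in {micros:.1f} microseconds")  # no disk I/O on the hot path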
So that's one factor. Another is that their workloads are also changing. Nowadays, customers have a variety of different types of workloads. It could be relational, as we've had in the past, but you also have machine learning workloads, graph-based workloads, and document-based workloads.
So we believe customers are looking for solutions across these different workloads.
And one of the things that we find that customers want
is they want a one-stop shop.
They want what we call a converged database
to reduce the complexity of the solution that they have
and that they have to support.
And so they want a single database like Oracle
that will support all these different workloads.
And it reduces the complexity as well as the costs. And to get the performance that they need, that's where these memory technologies come into play, along with the increases in compute capacities and so on that Oracle is trying to leverage.
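To give a feel for the converged idea, here is a deliberately tiny sketch using SQLite from Python's standard library as a stand-in for a much richer engine (it assumes a SQLite build with the JSON functions enabled, which recent Python releases bundle): one database serving both relational rows and JSON documents.

import sqlite3

db = sqlite3.connect(":memory:")

# Relational workload: a plain table.
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
db.execute("INSERT INTO orders VALUES (1, 'acme', 99.5)")

# Document workload: JSON stored and queried in the same database.
db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, doc TEXT)")
db.execute("""INSERT INTO events VALUES (1, '{"type": "click", "customer": "acme"}')""")

print(db.execute("SELECT total FROM orders WHERE customer = 'acme'").fetchone())
print(db.execute(
    "SELECT json_extract(doc, '$.type') FROM events "
    "WHERE json_extract(doc, '$.customer') = 'acme'"
).fetchone())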
Now, when you look at that landscape,
tell me a little bit more about what Oracle is doing in terms of working with the industry on some of the emerging technologies that will push you further, and how they would apply to data optimization itself.
When you look at some of those technologies, you know, like CXL, for example, a topic at MemCon,
will that require code changes to the database?
Good question.
So CXL right now is still very much in the early phases, even though we're at the CXL 2.0 standard moving into 3.0. There aren't any real solutions in production at the moment.
So everybody's kind of looking at this technology.
Everybody believes that this is the next wave that we want to hop onto. What it provides is tremendous in terms of increasing memory capacities with memory pooling, increasing access to accelerators, and providing shared memory access and shared compute access. Will it require changes? Certainly it will.
And that's one of the big question marks that we have: what are the programming interfaces to these devices, and how seamless would it be, from the application tier and the database, to be able to access these devices? And it's not just the programming capabilities of accessing it. What is the functionality that it's actually going to deliver?
And can it also deliver on the performance that we need?
So right now you're accessing memory through these DRAM DIMMs, and now we're going through a CXL interface, and that's going to add hundreds of nanoseconds to your latency. We don't know exactly what the performance implications really will be once we start evaluating against it. So those are a lot of the concerns that we have: performance concerns about being able to program against these CXL devices.
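The exact numbers are still open, but the shape of the concern can be sketched with back-of-envelope arithmetic (all constants below are illustrative assumptions, not measurements of any CXL device):

DRAM_NS = 100        # assumed local DRAM load latency, in nanoseconds
CXL_EXTRA_NS = 200   # assumed extra latency added by a CXL hop

def stall_ms(cache_misses, extra_ns=0):
    # Total time a memory-bound scan spends waiting on this many misses.
    return cache_misses * (DRAM_NS + extra_ns) / 1e6

MISSES = 10_000_000  # misses in a hypothetical large column scan
print(f"local DRAM: {stall_ms(MISSES):7.1f} ms")                 # ~1000 ms
print(f"via CXL:    {stall_ms(MISSES, CXL_EXTRA_NS):7.1f} ms")   # ~3000 ms

A few hundred nanoseconds per access sounds small, but over a memory-bound scan it compounds into a multiple of the total runtime, which is why the evaluation matters.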
There's still an availability story about these devices. If you've got memory spread across multiple nodes and one of those nodes goes down, how does that affect your application? As a database company, obviously that matters. Availability is a big factor in our adopting newer technology. That being said, we're very excited about what CXL can deliver. Oracle, as a bit of background, has always been very quick to investigate and utilize emerging technologies.
As an example, Oracle was one of the first companies to start investigating and building technologies around Intel's Optane persistent memory in its full capacities, from expansion of memory for our Database In-Memory product, where we could build a much larger column store by just using persistent memory without making any changes. Intel had a mode called Memory Mode, where you had a DRAM DIMM that sat in front of the persistent memory DIMMs to serve as a cache. And so we could get the expansion of memory that persistent memory DIMMs provided without a loss of performance, because we got really good caching behavior going to the DRAM DIMM. The beauty behind that solution was that no changes were needed in that particular case. However, we also used Intel's Optane technology in our Exadata storage nodes, where we use these persistent memory modules as caches, again, for very fast access to transactional data.
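A rough way to picture that Memory Mode setup (a toy model, not Intel's actual implementation): a small fast tier transparently caches a large slow tier, so a hot working set sees DRAM speed while the application is unchanged.

from collections import OrderedDict

slow_tier = {addr: addr * 2 for addr in range(100_000)}  # stands in for persistent memory
dram = OrderedDict()                                     # stands in for the DRAM cache
DRAM_LINES = 1_000
hits = misses = 0

def load(addr):
    global hits, misses
    if addr in dram:
        hits += 1
        dram.move_to_end(addr)        # refresh LRU position
        return dram[addr]
    misses += 1
    dram[addr] = slow_tier[addr]      # fill from the slow tier
    if len(dram) > DRAM_LINES:
        dram.popitem(last=False)      # evict the least recently used line
    return dram[addr]

for addr in (i % 500 for i in range(50_000)):  # a hot working set that fits in "DRAM"
    load(addr)
print(f"hit rate: {hits / (hits + misses):.1%}")  # ~99%: most loads never touch the slow tier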
So Oracle has always been at the forefront of these emerging technologies, on both the storage and memory side as well as the compute side.
As an example there, Oracle purchased Sun Microsystems.
I don't know exactly when that was; around 2010, I think.
One of the first things that Oracle did was try to engage with the hardware team
to say, how do we build a new processor that is optimized for running database queries?
How do we build some software in silicon, as we called it,
to optimize query processing specifically for analytic workloads?
And we built this accelerator called DAX that was one of the first of its kind.
Now you see these technologies used, for example, at Amazon with AQUA, the query accelerator for Redshift.
But prior to that, Oracle had developed this as a database accelerator.
And so Oracle wants to adopt these emerging technologies
where it makes sense. And so that's what we're investigating with CXL technologies.
Now, from my time at Intel, I know the depth of collaboration between the Intel and Oracle teams in really getting the most out of hardware and applying it to the database. And I'm sure you
take the same approach with other
infrastructure players. Tell me how that plays out in your mind in the CXL era. And where do
you see the opportunity to engage? And what does that look like for collaborations with the ecosystem?
Right. That's a great question. So yes, we definitely want to collaborate very closely
with chip vendors, with various organizations that are building these CXL devices.
What we're interested in, of course, is looking out on behalf of our workloads, our analytic workloads, our transactional workloads, and making sure it makes sense to use CXL technologies like memory pooling and the sharing of accelerators. So there, again, it's one of these things where we want to make sure that we can prove that the technology makes sense.
For example, with memory pooling, the concept is great. There's this thing called memory stranding: a paper written recently claimed that something like 25% of memory sits unused, the memory stranding effect, even when all of the CPU cores are rented out in a cloud data center environment. That's wasted memory, which means wasted dollars. So how do we better utilize, and oversubscribe, if you will, memory? That's where memory pooling comes into play, and that's where the advantages are. However, are there other solutions that can address that problem? I mentioned oversubscription being one of them. Another one is VM migration: with live VM migration, if a particular VM needs more memory, go ahead and migrate it at that point; pay the cost of doing it and migrate to another server with more memory capacity.
Normally, what we see today in the majority of cases is that a VM that needs to dynamically scale because of its workload primarily needs CPU. It needs to scale to get more CPU, not necessarily more memory. And with VM migration, you can get both: more memory and more CPU. That being said, that's a very heavy cost. So how does that compare to expanding your memory allocation through a memory pool across multiple servers?
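To put rough numbers on the stranding argument (all figures below are made up for illustration, not taken from that paper or any real fleet):

# Each server: all cores rented out, but some memory left unrentable ("stranded").
servers = [
    # (total memory GiB, rented memory GiB)
    (512, 350),
    (512, 400),
    (512, 300),
]

stranded = sum(total - rented for total, rented in servers)
capacity = sum(total for total, _ in servers)
print(f"stranded: {stranded} GiB of {capacity} GiB ({stranded / capacity:.0%})")
# With the cores gone, no new VM can land on these hosts to use that memory.
# A CXL memory pool would let other servers borrow the stranded capacity instead.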
That's something that we would work closely with chip vendors to understand: where does it make sense? With a lot of these emerging technologies, we get worried: is this a solution in search of a problem, or do we have a problem and hence find the solution? Having spent a lot of time here at Oracle, I can say we spend a lot of time making sure that the technology fits the problems that we have, and don't just adopt the technology because it's the coolest thing on the horizon at the moment.
The other thing is accelerators. The great thing about CXL is also that you have shared accelerators: you can offload your workloads to an accelerator attached to a CXL device, and that is also very interesting. But there are alternatives, and the question is which solution is better. An alternative to that is simply moving your workload to a server that has a lot more cores on it. These CPUs are becoming cheaper and cheaper, with more and more cores and a lot more memory channels to access memory and get higher bandwidth. So which one's better? Offloading work to an accelerator, or offloading work to a server with many, many cheaper cores, ARM cores, for example?
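One way to frame that trade-off is a crude cost model (every constant here is an assumption chosen to make the point, not a benchmark): offload pays a data-movement tax before its fast compute, while the many-core server just divides the work.

WORK_UNITS = 1_000_000_000  # abstract units of query work

def accelerator_seconds(units, link_gb_s=32.0, bytes_per_unit=8, accel_rate=50e9):
    transfer = units * bytes_per_unit / (link_gb_s * 1e9)  # move the data to the device
    compute = units / accel_rate                           # fast dedicated engine
    return transfer + compute

def many_core_seconds(units, cores=128, rate_per_core=0.5e9):
    return units / (cores * rate_per_core)                 # cheap cores working in parallel

print(f"accelerator: {accelerator_seconds(WORK_UNITS):.3f} s")  # transfer dominates here
print(f"many-core:   {many_core_seconds(WORK_UNITS):.3f} s")

With these made-up numbers the data movement dominates and the many-core server wins; change the assumptions and the accelerator wins, which is exactly why the evaluation has to be done per workload.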
When you look at your broad customer base, and I mean, Oracle is pretty much synonymous with database, so the customer base is very broad.
What else do you see customers thinking about in terms of how they're applying their databases as we move into the AI era and more customers are paying acute attention to the value of that data sitting in databases?
Yeah, I think customers are fundamentally thinking about costs.
They obviously want to lower their costs,
which is why a cloud solution makes a lot of sense.
They're concerned about security, with this data sitting out in the cloud. They want to make sure that data is secure throughout all of the different tiers, including storage and memory, especially when you talk about memory pooling, where your data is sitting everywhere.
I think that customers, again along the lines of cost, are also looking at simplicity. They don't want complex solutions that will potentially add more and
more cost to them in the long term in terms of maintenance and service. So as I mentioned,
Oracle is really big on this converged database story where it's a single database that supports
a variety of different workloads. You don't need to have a specialized database for your
specialized workloads. And of course, performance
is super important. And that's where all of these technologies, whether it be for memory expansion
or for accelerators, come into play. So I think customers are really looking for better performance
at lower costs and making sure it's secure. That's fantastic.
I think that those statements resonate with everything we've heard at the conference
in terms of the core capabilities that customers are looking for.
One final question for you before we leave.
I'm sure we've piqued folks' interest about what Oracle is doing in the in-memory database arena. Where can folks find out more about the Oracle database and the solution portfolio that you're
delivering both on-prem and in the cloud and connect with your team?
Yep, absolutely.
So you can do a search online.
You can search for Oracle Database In-Memory, and you will find a lot of websites with blogs and web pages on our technologies and ways to contact us. You can also reach out directly to me on LinkedIn at Shasank Chavan, and I'll be happy to communicate with you through LinkedIn.
Shasank, thanks so much for being on the program today. It was great talking to you.
Thanks for taking some time out of the conference.
Great.
Thank you for having me, Allyson.
Appreciate it.
Thanks for joining the Tech Arena.
Subscribe and engage at our website, thetecharena.net.
All content is copyright by The Tech Arena.