Disseminate: The Computer Science Research Podcast - Dimitris Koutsoukos | NVM: Is it Not Very Meaningful for Databases? | #41
Episode Date: October 9, 2023. Summary: In this episode, Dimitris Koutsoukos talks to us about Persistent or Non-Volatile Memory (PMEM) and we answer the question: is it Not Very Meaningful for databases? PMEM offers expanded memory capacity and faster access to persistent storage. However, (before Dimitris's work) there was no comprehensive empirical analysis of existing database engines under different PMEM modes, to understand how databases can benefit from the various hardware configurations. Dimitris and his colleagues analyzed multiple different engines under common benchmarks with PMEM in AppDirect mode and Memory mode - tune in to hear the findings! Links: VLDB'23 Paper, Dimitris's Homepage, Study's source code. Hosted on Acast. See acast.com/privacy for more information.
Transcript
Hello and welcome to Disseminate the Computer Science Research Podcast. I'm your host,
Jack Wardby. Again, a reminder that if you do enjoy the show, do please consider supporting
us through Buy Me a Coffee. It really helps us to keep making the show. Today, I'm going to be
joined by Dimitris Koutsoukos, who will tell us everything we need to know about NVM: is it not
very meaningful for databases? Dimitris is a PhD student in the systems group at ETH Zurich.
Dimitris, welcome to the show. Thank you. Thank you so much, Jack, for inviting me here. I'm
really happy to be part of this. Brilliant. So let's jump straight in then. So can you tell us
a little bit more about yourself and how you became interested in database research? Yes,
sure. So right now I'm a fourth-year PhD student in the systems group. I'm just finishing my fourth year and entering my fifth in a month, actually. I did this combined bachelor's plus master's that we have in Greece, called a Master of Engineering. Then I started working for a couple of years, and I realized that I wanted to do something a bit more than just being a plain software engineer.
And I saw this whole machine learning plus deep learning boom, and I said, okay, the easiest way to ride that train is just to do a data science master's. I saw various universities that offer such programs and applied to a bunch of them. One of the ones that accepted me was ETH, so I took that, and I did a couple of years of a lot of machine learning, big data and mathematics courses. Then I realized that I still enjoy the machine learning part, but I wanted to combine it a bit with my previous background; I like writing code more, working on the systems. And at the point that this happened, I was really interested in data processing systems that combine multiple disciplines, for example relational analytics and machine learning. That's what brought me to databases after all.
Amazing stuff, yeah. So today we're going to be talking a little bit about NVM, right? Non-volatile memory. Maybe you can tell the listener, we can go through some background here, what non-volatile memory is, also known sometimes as persistent memory. So yeah, let's start off with that: what are these technologies?
So non-volatile memory, in layman's terms, I would describe as a storage layer that sits in between traditional storage, so hard drives and SSDs, and DRAM. It's faster than SSDs, it's slower than DRAM of course, but it also comes at a lower price, and it can persist data. The advantage of it is that you can basically use it to support the increasing amounts of data that are produced every day and that companies and researchers want to analyze. Because the problem is that DRAM cannot keep growing all the time, right? First of all, it's kind of expensive, and secondly, of course, new technologies are coming, but we needed to find a solution for keeping most of the data in main memory most of the time.
Nice. So it sits between SSDs and DRAM on that sort of spectrum. In terms of the price point then, how much cheaper than DRAM and more expensive than SSD are we talking here, really?
I don't recall the exact numbers. This is probably a spoiler, but Intel Optane, the commercial implementation of PMEM, is now discontinued, so we cannot really buy it or see the price any more. I would say it was probably 60 or 70% less than DRAM in price, like 60% of the price of DRAM, to be more exact. Now, about SSDs, there are also NVMe SSDs, so it's difficult to pinpoint it exactly. It also depends on how many SSDs and on the technology; SSD technology is actually evolving all the time, right?
Right, sure. Yeah, it's quite a diverse set of things, so it's hard to draw an exact comparison. Cool. So there's been, and you mention this in your paper as well, a whole chunk of research looking at using NVM and persistent memory in databases. So where does your work fit into that body of work, and I guess, what's the general elevator pitch for your work in this
area?
So my elevator pitch would be that, okay, of course we knew that we cannot keep all data in main memory, and then we started finding other types of memory, let's say far memory, high-bandwidth memory, non-volatile memory. At the point this idea started, we didn't have the actual hardware, any commercial implementation of NVM. Researchers in databases, but also in general, did a lot of studies, right? But in the end the hardware wasn't there, so they couldn't see exactly what was happening. Then, when the hardware came out, some studies appeared, but they were a bit more general, studying the characteristics; people just dipped their hands in the water very, very slowly. And at the point we started writing this publication, there were some studies, not exactly for databases, but let's say for OLAP workloads, or some isolated studies done by companies, like Microsoft with SQL Server. But we were like, okay, let's say that you, Jack, as a researcher, want to come and use NVM in your database, and you don't want to start optimizing your code; you just spend the money, or find the hardware, and you want to see what it can and cannot do for databases, right? And that's how we started: explore, see different workloads, experiment with different knobs, and try to actually cover breadth and depth at the same time.
Nice.
So this paper is the one-stop shop, then. If you want to know anything about how NVM works in a database, it's the comprehensive study; the listener needs to go nowhere else, just go to this paper and you can find everything you need to know. That's awesome. Cool. So let's talk about the experiments you ran then, because the paper covers a hell of a lot of ground. Let's start off at the very beginning and tell us about the experimental setup. I think we can maybe talk here about what you mentioned earlier on, Intel's Optane, so maybe start off by telling us about that technology, because that was the actual commercial offering you got your hands on and got to play with, and then we can talk about the systems you measured and the workloads.
Yeah, sure.
So just to start, and to make things clear, we studied Intel Optane in the context of a database running on a single node. There are many, many more applications, and in a single paper it's almost impossible to cover everything. That's the question we tried to answer, right? We probably couldn't fit everything in the title, but in the context of a single server with a single database, does it make sense to use NVM or not? And then there is Intel Optane. Intel had these NVMe SSDs, which were faster than traditional SSDs, but of course they sat on the PCIe interconnect. Intel Optane, on the other hand, sits on the integrated memory controller. So if you imagine your computing system, you have your CPUs with their caches, then you have the memory bus, and on the memory bus you have the DIMMs, which are DRAM. Into those same slots you just plug this Intel Optane technology. That basically means it's not a peripheral device; the integrated memory controller manages both the DRAM and the Intel Optane. I would just call it PMEM or NVM, to make it a bit shorter for the podcast.
Nice. Yeah, there are a few different flavors it can be configured in, right? So I think there's Memory Mode, AppDirect, and Mixed Mode. Can you tell us how these differ?
Yes. So in Memory Mode, you basically substitute, substitute in quotes, because it's not 100% accurate, the volatile memory of the system, so your previous DRAM, with PMEM. But the caveat is that DRAM then becomes an L4 direct-mapped cache. So you basically still have the previous memory hierarchy, but the DRAM is a hidden cache for you now. In AppDirect mode, there are a number of ways to use it. Basically, PMEM just becomes another storage layer, let's say a faster SSD. You can configure it as a number of namespaces. Intel recommends fsdax, which actually bypasses the operating system page cache, and you can just use mmap directly to map files onto the storage medium. This is the mode that previous studies have mostly looked at, although not from the angle that we did; they were trying to see, how can I adjust my source code and my software to be able to take advantage of that?
And Mixed Mode is just that you configure a percentage in Memory Mode and a percentage in AppDirect mode. Intel recommends a ratio of one to four, I think, for the price and performance ratio. And again, DRAM becomes an L4 direct-mapped cache.
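To make the fsdax access path concrete, here is a minimal C sketch. It is not from the paper, and the path /mnt/pmem/example.dat is a hypothetical mount point for an AppDirect namespace; it simply maps a file on a DAX-mounted filesystem and touches it with ordinary loads and stores, with no page-cache copy in between.

```c
/* Minimal sketch, assuming an AppDirect namespace formatted with a
 * DAX-capable filesystem and mounted at the hypothetical path /mnt/pmem.
 * Not the authors' code; production software would typically add MAP_SYNC
 * (or use libpmem) for stricter persistence guarantees. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    const size_t len = 1 << 20;                       /* map 1 MiB */
    int fd = open("/mnt/pmem/example.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, (off_t)len) != 0) { perror("ftruncate"); return 1; }

    /* With fsdax the mapping points at the PMEM media itself, so reads and
     * writes bypass the OS page cache. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    memcpy(p, "hello, persistent memory", 25);        /* plain store instructions */
    msync(p, len, MS_SYNC);                           /* flush to the persistence domain */

    munmap(p, len);
    close(fd);
    return 0;
}
```

A database engine can also simply be pointed at such a DAX-mounted directory as its storage location without any source-code changes, which is the usage the study focuses on.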
And just because I mentioned it, to also make it clear: when we benchmarked NVM, we didn't want to make any changes in the source code, right? As I said in the beginning, we just wanted to say: you happened to find this hardware somehow. Can you use it? How can you use it? What happens?
Yeah, it just drops from the sky. Here you go. Okay, now what do I want to do with it?
Right, yeah, cool. So just to recap there real quick: the Memory Mode case allows us to basically swap it in for DRAM; DRAM becomes a hidden cache and we have the same memory hierarchy. Whereas AppDirect allows us to interact with it as storage, and we can mmap files directly. And Mixed Mode is a combination of the two, and they recommended a one-to-four ratio. Which was the one and which was the four in that ratio?
I think for one gigabyte of DRAM you needed four gigabytes of PMEM in Memory Mode, if I'm not mistaken.
Gotcha, cool.
That makes sense. Cool, great stuff. So that's the toy that you've been playing with for all these experiments. Now that we've covered that, can we talk a little bit more about the experimental setup? What were the systems you used, and maybe you can introduce some of the benchmarks as well. I know we're going to talk about those in a lot of depth later on, but give us a high-level overview of all the various components.
Yes, sure. So we're database researchers, right? We know two types of benchmarks: we know OLTP and we know OLAP. In the end we chose an OLAP workload, TPC-H. It's heavily used, especially as the first thing you want to do in a research paper, and companies also use it as a point to prove how efficient and how fast they are.
Yes, exactly.
And for OLTP we used TPC-C; it's also very, very heavily used. Then at some point one of the reviewers asked for key-value stores, to bridge all the different types of relational databases that we had. I know key-value stores are not exactly a database, but they are used as kind of such, right? So the most famous benchmark there is the Yahoo Cloud Serving Benchmark, YCSB, and we ended up using that as well.
What we tried to do in general is, we said, okay, we really need to stress the memory, right? We know that maybe some of the effects that we see are a bit exaggerated, because we stressed the memory in cases that are not common, but we really wanted to see the weak points and the strong points. So we isolated all our experiments on one NUMA node, and we made sure that our working set was actually higher than the amount of DRAM that we had there. Basically, we wanted to use as much PMEM as possible for the experiments that we ran. So for TPC-H, for example, we used scale factor 100. For TPC-C, we configured the number of warehouses to have around 100 gigabytes of data, and the same for the Yahoo Cloud Serving Benchmark.
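For readers who want to reproduce this kind of isolation, the sketch below shows one way to confine a process to a single NUMA node with libnuma. This is an illustrative assumption about tooling (the study could equally have used numactl on the command line), not the authors' actual scripts.

```c
/* Sketch: confine the current process (and whatever workload it then runs,
 * e.g. a database server or a benchmark client) to NUMA node 0 for both
 * CPU scheduling and memory allocation. Roughly equivalent to
 *   numactl --cpunodebind=0 --membind=0 <command>
 * Build with: gcc pin.c -o pin -lnuma */
#include <numa.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma: NUMA is not available on this system\n");
        return 1;
    }

    struct bitmask *node0 = numa_parse_nodestring("0");
    if (!node0) {
        fprintf(stderr, "could not parse node string\n");
        return 1;
    }
    numa_run_on_node(0);        /* run threads only on node 0's CPUs       */
    numa_set_membind(node0);    /* allocate memory only from node 0's DRAM */
    numa_bitmask_free(node0);

    printf("pinned to NUMA node 0; start the workload here\n");
    return 0;
}
```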
Awesome, cool. So which systems did you use in these experiments?
So for the OLAP side we ended up using, I think, PostgreSQL, MySQL, SQL Server, and then DuckDB. For TPC-C we used, again, PostgreSQL, MySQL, SQL Server, and VoltDB. And for the Yahoo Cloud Serving Benchmark we just used RocksDB.
Nice. What drove the choices of those systems? Did you just try to get a breadth of some of the most popular systems? You've obviously got a nice cross-section there of transactional systems and analytical systems, but was there anything more to it than that?
So the first choice was that we wanted to focus on as many open-source databases as possible. Because, as you well know,
it's very easy to change configurations, basically, and to understand the behaviour, because you actually have access to the source code and can see how they manage things. So that was the first choice we made, and that's why we chose Postgres and MySQL and also DuckDB; I think those are three of the most popular systems being used now. Then we also used SQL Server, because we know, and of course I agree with that, that it has a very high place in the industry for optimizing plans and running queries very fast. And VoltDB, I think we chose it because we also wanted a specialized, let's say, OLTP system. Of course, MySQL is one of them, but we wanted a second one, because Postgres and SQL Server are not so popular for OLTP workloads. And for the Yahoo Cloud Serving Benchmark, I think we just picked RocksDB, again because it's open source and popular; we just wanted a key-value store that people actually use a lot. Also, other studies have used RocksDB a lot for comparisons.
Yeah, cool. Obviously you scoped here on relational systems, but have you thought about how this experiment would interact with some systems like MongoDB, for example, that are a little bit out of the usual relational world?
We thought at some point about doing experiments with NoSQL databases like MongoDB, for example. But the thing with these systems is that they don't have such standardized benchmarks, at least in my experience, as Postgres or MySQL do. And then it's hard, first of all, to find the insights. I mean, we discussed TPC-H before, right? You know the benchmark by heart, you know the weak points, you know what happens. We would need a benchmark like that for MongoDB. And also, in academia, you need the reviewers on the other side to be able to understand it and contribute with their comments. So it was just a compromise in the end, right?
Yeah, sure. I mean, at the end of the day you've got to be practical about it at some point, because there are so many systems you could be doing it forever, and at some point you've got to publish, right?
So, I mean, yeah, that's really cool.
So let's talk about some results then. Just to set the scene, you ran some baseline micro-benchmarks. Can you tell me what these were and what some of the preliminary findings of those experiments were?
So these micro-benchmarks we had as a point of reference. There has been a lot of work on this, mostly in systems conferences, let's say.
In particular, there is a famous lab led by Swanson that has open-sourced their code as well. So among the people that worked on the paper there was a postdoc, Michal Friedman, who took on more of the micro-benchmarks. She took the code from these previous papers and adapted it, in order to understand what the limits are: how fast, in practice, PMEM is compared to DRAM or to the SSD that we had. And just as a parenthesis here, to bring the audience onto the same page: we could only do that for some of the configurations. For Memory Mode we know the basics that I described, but it's kind of a black box, right? We don't know exactly how it works; we know that it's direct-mapped, but we cannot pinpoint exactly what is happening there. So when we ran these micro-benchmarks, we couldn't verify that the behaviour we were seeing was actually Memory Mode and not something else hidden behind the scenes. So the micro-benchmarks were not for Memory Mode, but just for AppDirect mode, our SSD and DRAM, where we were perfectly clear that, okay, this is what happens with these hardware components. And we can be sure that, for example, PMEM has about 10 times more read and write bandwidth than the SSD.
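As a rough illustration of what such a baseline looks like, here is a minimal C sketch of a sequential-read bandwidth measurement. It is an assumption in the spirit of those micro-benchmarks, not the adapted code from Swanson's lab, and the default path /mnt/pmem/bench.dat is hypothetical: it maps a large pre-created file on whichever medium you want to measure, streams it into a DRAM buffer, and reports GiB/s.

```c
/* Sketch of a sequential-read bandwidth micro-benchmark. Pass a path to a
 * large file on the medium under test (PMEM fsdax mount, SSD, etc.). */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define CHUNK (64UL * 1024 * 1024)   /* stream in 64 MiB chunks */

int main(int argc, char **argv) {
    const char *path = argc > 1 ? argv[1] : "/mnt/pmem/bench.dat";
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }
    size_t len = (size_t)st.st_size;

    char *src = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (src == MAP_FAILED) { perror("mmap"); return 1; }
    char *dst = malloc(CHUNK);
    if (!dst) { perror("malloc"); return 1; }

    unsigned char sink = 0;          /* keep the copies from being optimized away */
    size_t copied = 0;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t off = 0; off + CHUNK <= len; off += CHUNK) {
        memcpy(dst, src + off, CHUNK);   /* sequential read from the medium */
        sink ^= (unsigned char)dst[0];
        copied += CHUNK;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("read %.1f GiB in %.2f s -> %.2f GiB/s (sink=%u)\n",
           copied / (1024.0 * 1024 * 1024), secs,
           copied / secs / (1024.0 * 1024 * 1024), (unsigned)sink);

    free(dst);
    munmap(src, len);
    close(fd);
    return 0;
}
```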
Cool. Let's get into some of the OLAP experiments then. Can you start by giving us some of the general insights, the general observations you found whilst doing these experiments with TPC-H?
Yes, sure. So I would say that the four different databases that we had gave some common insights and many different insights. One of the common insights is that when you have a workload like TPC-H, where you mainly read, the more cores you use, the higher the bandwidth you get from PMEM. You reach the maximum both on PMEM and on the SSD, and you can see a big difference between them.
But on the other hand, we didn't see exactly the difference that we had seen in the micro-benchmarks. We were wondering why this happens, where some of the bandwidth gets lost, and then we started observing how different databases, like Postgres, utilize the operating system page cache. We saw that, because PMEM sits on the integrated memory controller, there is no direct memory access, no prefetching of the kind that happens over PCIe, where you know the data is going to be there, so you prefetch it into the operating system's page cache and the database reads from there. Instead, the CPU has to be involved all the time, so you're kind of losing some bandwidth there.
Another insight we got, and it's also something you can find online if you use any search engine, is that MySQL doesn't use more than one core for analytical queries, except for a very specific type of query, and you cannot really change this behaviour. So in very constrained environments we saw that the advantage of PMEM was even smaller, because you had to spend some of the CPU resources to do the processing and some of the CPU resources to actually transfer the data through the integrated memory controller, so you were losing a bit of that as well. And that insight was also verified with DuckDB.
DuckDB, of all the systems that we ran, has a columnar format instead of a row format, and it's very, very optimized for specific types of queries, especially TPC-H. And we saw that for CPU-intensive queries, because it's a columnar format, storage of course still plays a role, but you don't read as much data as in a row format, PMEM in AppDirect mode is more or less the same as using a plain SSD, or sometimes even worse. And just to make things clear: what I discussed just now is mainly about AppDirect mode.
Okay, sure. So you mentioned there are various knobs you changed in some of these experiments to investigate different areas of this space. Can you talk us through which knobs you varied when you were doing your experiments?
So in general, we tried to make the buffer cache a smaller size, though of course we wouldn't make it one megabyte, because that wouldn't make sense. We wanted to use as little DRAM as possible, so we made it 16 gigabytes; we experimented with other sizes but settled on that in the end, in order to read from disk all the time, disk being either PMEM or an SSD. We didn't experiment with it directly, but we also saw how Postgres makes heavy use of a circular buffer in order to get data and process it. And I would say that for OLAP we mainly focused on trying to understand: okay, your database has a query plan, say an aggregate based on a hash table and then a double-for-loop join; how does this map onto the actual hardware characteristics, and what makes sense to change, like making the access size or the block size larger or smaller? So I would say it was more observation and less experimentation compared to TPC-C, where we had a lot more room to tune knobs.
Okay, cool. So I guess that's a nice segue into the OLTP experiments. Let's rinse and repeat here: tell us about the insights, and then we can talk about the various knobs.
So something that we observed across the board, but which was more obvious for the OLTP workloads, is that Memory Mode had only a tiny few cases where it was better. You needed workloads that not only had a working set larger than DRAM and the OS page cache combined, but that also had a low number of read and write accesses to that data. So let's say you have a table, it's very large, you read it once, you process it, and then you're done; you're not going to read it multiple times. That was the premise for Memory Mode. But for OLTP we really didn't find anything where it could be better compared to a system that uses a plain SSD and just DRAM, and that's it. We also verified insights that other people have found: that you need to be very careful when you have write and read workloads together. So we had to make sure that the interference, and how the queues are managed by the integrated memory controller, made sense, right?
Because the thing is that with PMEM, when you have just read workloads, the bandwidth increases almost indefinitely until you get to the maximum. But as you increase the write rate, there is a lot of contention and the bandwidth drops. If I can describe it, it's a proper U-curve: with mixed read-write workloads you have a sweet spot in the middle, and then things degrade to the left and to the right. So those were the two main insights, I would say: be careful when you have mixed workloads, and when you have write workloads also be careful not to write too much, because then you will also get bad results. At some point in TPC-C, for SQL Server, we had such bad results when we were using a high number of users that the SSD was better by a large margin in terms of transaction rate.
That's crazy. I mean, obviously the elephant in the room here, which you mentioned earlier on, is that Intel has now discontinued this. So is this maybe the reason why, that the performance of it just isn't that good? Obviously it would be used for a lot of other things than just databases, but this doesn't seem like a glowing endorsement or a look-how-amazing-it-is use case, right? So do you have any sort of information on why they discontinued it?
I know there are speculations; I don't really know why they did it. I guess one reason was that, and another reason was the cost, because it had a kind of hidden cost, let's say. Besides buying PMEM and putting it in your system, there is only a certain set of Intel processors that can actually support PMEM, right? You cannot take your processor from six or seven years ago and expect it to work with PMEM out of the box. So it was a combined hidden cost, I would say. Some of the companies that were producing these chips, because Intel outsources the production of the actual circuits, were also not so happy economically; that's what online sources say. Combined with the characteristics that don't make the software side so appealing, I guess this was the final push in this direction. And there was also CXL, the Compute Express Link, which is being advertised as the new solution. So it was lots of small and big pieces that added up.
Yeah, in the end it's the straw that breaks the camel's back, right? There are all these little cuts, and then, yeah, death by a thousand cuts. It's a shame, I guess. Cool, so we've touched on some of the general insights there from the OLTP experiments, but you said you tuned a lot of knobs, so let's talk about the things
you varied when you were doing these experiments, then.
Yeah, so first of all, we played with the number of threads that you can use when you want to write, for example, and we observed what other people have done. There are optimizations in Postgres, like WAL compression, which is used to compress the logs in order to take advantage of the fact that the CPU is actually faster than storage, but there we just couldn't see any advantage from it. So all these optimizations have to be reconfigured or rethought a bit in order to bring them onto PMEM. We also looked at the flushing methods in MySQL, because in MySQL you have different ways to flush your logs and your data to storage, right? You can do it by bypassing the page cache, or you can use intermediate buffers, so we also played a bit with that. And, in general, in AppDirect mode, on the bus of the integrated memory controller you have, let's say, several PMEM DIMMs. There is a way to combine them together, and this is called interleaved AppDirect mode, but there is also a way to see them as separate namespaces. We played around a bit with that as well.
And finally, we went around it a bit differently: let's store the logs on one storage medium, say the SSD, and the data on another storage medium, for example PMEM, or store everything on PMEM, or everything on the SSD, to understand whether you can gain some performance advantage. Because Intel Optane and PMEM in general are advertised a lot for persistency, so you can say, okay, I want my logs to be persistent, right? So what if I just store the logs there and not everything? But in general, besides some insights that other people had, the other insights just fall fairly naturally out of the general story of the paper and the weaknesses that we had identified before.
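To make the log/data placement idea concrete, here is a hedged sketch of how such a split could be set up in PostgreSQL via libpq. It is an illustration under assumed paths and connection settings, not the scripts used in the study: the WAL stays wherever the cluster was initialised, for example on the SSD, while new tables go into a tablespace on a PMEM-backed directory.

```c
/* Sketch: put table data on a PMEM-backed directory while the WAL stays on
 * the cluster's default medium. The connection string and the directory
 * /mnt/pmem/pgdata are hypothetical. Build with: gcc split.c -o split -lpq */
#include <libpq-fe.h>
#include <stdio.h>

static void run(PGconn *conn, const char *sql) {
    PGresult *res = PQexec(conn, sql);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "failed: %s\n%s", sql, PQerrorMessage(conn));
    PQclear(res);
}

int main(void) {
    PGconn *conn = PQconnectdb("dbname=tpcc user=postgres");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* Tablespace on the AppDirect (fsdax) mount: table data goes to PMEM. */
    run(conn, "CREATE TABLESPACE pmem_space LOCATION '/mnt/pmem/pgdata'");
    /* Tables created in this tablespace live on PMEM; pg_wal is untouched. */
    run(conn, "CREATE TABLE orders_pmem (id bigint, payload text) TABLESPACE pmem_space");

    PQfinish(conn);
    return 0;
}
```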
Cool, yeah. Given that, then, moving on to the key-value experiments and YCSB: was it the same story there, or was there a different set of insights you got out of those experiments?
It was mostly a confirmation of the story that we had seen before. We used three workloads. One was half and half, so mixed, and there we saw that with mixed workloads it's very, very tricky. Then we had a read-only workload, which showed how you can use PMEM and expect your bandwidth to grow almost indefinitely, up to a very high point. And then mainly write workloads. So I wouldn't say there was anything very surprising. The difference is that there the keys and the values were very wide, and we were expecting maybe to get some performance advantage, because the access granularity of PMEM is 256 bytes, which is four times larger than a cache line in DRAM, for example. So we thought that because of read and write amplification we might see some different behaviour, but we couldn't identify anything that wasn't already revealed in the previous parts of the paper.
Yeah, and of course I'm getting that the answer to the question in the title of the paper is: no, it's not very meaningful. Cool. So I guess before we move on to some more general questions, it might be a good opportunity, if you would like, to provide a summary of all the insights you found, to condense all your findings into one soundbite, if you can do that. I mean, there are a lot of findings, but yeah.
So if I wanted listeners to take away just two things from this paper, the first is that for Memory Mode we couldn't really find a use case where it shines; you should use it only if you have a shortage of DRAM. The second is that AppDirect mode is more capable as hardware than a traditional SSD, but there are small caveats along the way. You should be careful about read-write interference, and about the number of write threads if your resources are limited, for example. And if your system, like Postgres, makes heavy use of the OS page cache, then when you just use an SSD that might cover some part of the hardware difference. These are the things that a reader, a researcher, or someone working in a company should pay attention to if they use something like Intel Optane at some point in the future.
Cool, that's a nice summary there. So I guess, are there any limitations of your study? Obviously it's very comprehensive, but are there any areas that you think could have been improved, or some dark corners of the study that were pushed to one side? Are there any limitations, I guess, is the general question.
Yes, of course, every study has limitations, so my paper is not an exception to that. I have heard from people in companies that when they use PMEM in a separate machine, as a caching layer in the cloud, they get really good results compared to traditional SSDs. But our study was more about a single-server environment, and it was hard to push the scale factor and everything to the extreme, because we wanted to fit into the hardware that we had. We wanted to fit everything into one NUMA node, which means 256 gigabytes of PMEM, two terabytes of SSD, plus 64 gigabytes of DRAM. So in the end the scale factors were a bit restricted by that fact. Maybe if you go to a system that has a few terabytes of RAM and then, again, a few terabytes of PMEM, and a larger setup like that, you will of course confirm some of the findings that we made, but maybe you will also find some things that we missed.
Cool, yes.
I guess, kind of on that, are there any plans for next steps of this study, or with this technology in general, with PMEM, or is this kind of the end of the road for PMEM for you for now?
For me, yes, this is the end of the road after the discontinuation. With CXL, because right now it's simulated using FPGAs, field-programmable gate arrays, and my group has a long tradition of using this type of hardware, I think they are planning to do some CXL work; I'm not so involved in that. But for PMEM as it is, we really don't have anything to continue working on in that line of research.
Sure, cool.
I guess, and you mentioned this a bit earlier on, about what you would want a software engineer or DBA or someone building a database system to keep in mind from your study. But what I want to know is, what impact do you think your work can have? And maybe what impact has it had already, if you've had any feedback from people who've used this study to influence what they're doing in their day-to-day work?
When I was presenting at VLDB, it was kind of a nice surprise, let's say. There was a very well-known researcher from Microsoft, and he said, look, you're not presenting us anything new, right? At Microsoft we had the hardware from 2019, when it came out, and all the things that you have said, and that it's not so useful, we found out the same things; in the end, that's why it was discontinued. And then I told him, look, when we were writing this study it was more than a year before the technology got discontinued, and by the time we submitted it for the first time it was also more than eight months before it was discontinued, right? So then he understood the motivation of our work, and that we had actually, in a way, predicted the discontinuation without knowing it.
Yeah, that's funny.
But I mean, it's all well and good them knowing internally at Microsoft that this is the reason why, but not everyone has access to the resources Microsoft has to figure this out, right? So you need to have it in the public domain somewhere to allow other people to get the same insights, because these big companies are very good at keeping their findings to themselves. No, I think it's really good work, and hopefully you got your point across.
Thank you so much. It's a bit of a win, let's say, at the end, that history just confirms what you had predicted, right?
Exactly. Yeah, yeah.
So it's great.
That's cool.
Awesome.
So, yeah, another question I like to ask people
when we're talking about the work is,
was there any sort of, I don't know,
there's two sort of angles to this question.
The first is kind of what is the most interesting thing
that you learned while working on NVM
other than the fact that you can predict the future?
The most interesting thing, I think, is that there is this specification, like a book, that we see for every type of hardware, but the reality sometimes surprises you, right? That, for me, was the most interesting thing when I was working with this technology.
I guess on the other side of that question, then: what was the most unexpected thing, or maybe the thing you tried along the way that failed, some war stories, I guess, from this project? Because you said you were working on it for quite a long time.
First of all, it was kind of a war story to get this paper accepted, because, of course, the title is a bit controversial and the results were a bit controversial as well. It's hard to convince people so easily that something that has been hyped for decades is actually not really worth the fuss. I would say that was one of the struggles in this work. And also, when you are using so many systems and you have to conform to all the dependencies and things like that, it is a bit different from your typical database paper where you develop a solution and have a couple of baselines, et cetera. When you are using six, seven, eight different databases and versions change, the way they work changes, so it's kind of hard to synchronize and script everything in one place.
Yeah, every system has a different syntax for doing the same thing, right? So it's a bit of a nightmare; I can imagine that's a very difficult challenge.
Cool, yeah. So I've just got a few more questions now. The next one, and you've mentioned it at various points throughout the podcast, is some of the other research you work on. Can you maybe tell us more about some of the stuff you've done in the past, and maybe some of the stuff you wish to do in the future?
I'd just like to take a pause here, because I would really like to thank a lot of the people that worked on this paper.
Yeah, go for it.
I would like to really thank, a lot,
Raghav Bhartia, the second author.
He was a master's student at the point we started this work; I think he's now having a very good career as a software engineer. But he really worked hard and produced many of the results that we actually present in the paper. I remember the conversations and some of the things he was saying to me, and I was skeptical at that point, but in the end he did excellent work for his master's thesis. And just to share some future plans
and in general, the motivation of my PhD
and my dissertation.
The main research question that I was trying to answer is this: the data processing landscape evolves very fast. You have new types of hardware, new types of algorithms, you want real-time processing, and your data is increasing. Usually the systems that are built right now are a bit monolithic; they are very good at doing exactly one thing, super fast at that exact thing, but the moment you have to change something, you basically have to rewrite everything from scratch. So I pick different parts and see how we can adapt them to the modern trends in hardware or platforms, and my work is going to continue along the same road, let's say. Right now I'm working on something a bit like what Snowflake and Databricks do, organizing your data for the workload you have in order to get the best latency and also the best cost. So, since data processing is moving to the cloud, how do you want to organize your data to take advantage of that?
Another type of work that I'm doing is bridging; other people do this type of work as well, but I would say I'm doing it from a bit of a different angle. I have a system that I built as part of my PhD, called Modularis. At first it was just doing relational workloads, and we have now extended it to machine learning workloads and added support for FPGAs to be used as a smart storage engine. We want to see how you can build a system that uses different types of hardware to run one query. Because, as we said before, we don't know exactly what commercial systems do; maybe Amazon Redshift does the same thing. But when you develop your own system and open source it afterwards, you know exactly how these things work from one system to another and how you can put them all together.
Yeah, so it sounds like you've got loads of really cool things to be working on there. I'd love to know how you actually approach generating these ideas and things to work on, and then, once you've generated ideas, selecting the things to actually dedicate a significant portion of time to.
Basically, I think it's just an iterative process, mainly based on discussions. I've been extremely lucky to have some very talented colleagues in my group. Maybe somebody will tell you, okay, we have this problem, how would you solve it? And then you start thinking that this is actually a problem, right? And maybe some of these people go to industry afterwards, and if you're still exchanging thoughts, you see that this problem also actually exists in companies. So, for example, this evolving data layout, optimizing your data layout based on the workload, is a real problem,
right? So let's say you're the typical data scientist of today and you have your data somewhere, in whatever format, CSV, JSON, Parquet. If you want to do some analysis, you probably won't have a full-fledged database; you would just use a query-as-a-service system, run your queries, take some insights, take some rows, compute what you need to train your machine learning models, or whatever. Given that you know this, and that you don't want to pay so much money every time to Amazon or Google or whoever, what can you do? You want to take advantage of some transformations that you know make sense for these systems, for example compression, to be able to have workloads that run faster and are also cheaper.
And there is also the typical data scientist problem, right? You want, let's say, a model that makes a prediction. You will go to a database, you will have to run something, then you will export your results to CSV, then you will start using Python, you will import them into pandas, and you will start training your scikit-learn model. So what if you could do all of that with one system, and what if you could do it using different types of hardware as well?
Fascinating. I really liked what you were saying in the answer to that question about the people you surround yourself with. You really are a product of your environment, right? If you surround yourself with good people, it produces those good ideas, and you can iterate on things as well. That's a great answer. So with that, we're on to the last question. I know you gave a great summary of your work on NVM earlier on, but maybe now is a chance to give that message again. What's the one thing you want the listeners to take away from this podcast today?
The one takeaway I would give is, first of all, in the context of NVM: today there is no NVM, right? This technology is discontinued. I hope that my study will shed light for future types of hardware that will be similar to NVM, or even CXL, so that they do not have some of the weaknesses that this system had. For example, what I discussed about using the CPU to do all the reading, and not having DMA or a hardware prefetcher that brings data into your main memory. Just take these as lessons in order not to repeat them: see what NVM did wrong and don't repeat it. That, I would say, is the main takeaway.
Yeah, it's like that saying: those who don't know history are doomed to repeat the failures of the past, or something like that. So there's a warning message there. Great. Well, that's great, let's end things there.
Thank you so much, Dimitris, for coming on the show. It's been a fascinating chat. If the listeners are interested in knowing more about Dimitris's
work, we'll put links in the show notes
so you can go and find that. And again,
if you do enjoy the show, please do consider
supporting us. It really helps us to keep making the show.
And we'll see you all next time for
some more awesome computer science research. Thank you.