Disseminate: The Computer Science Research Podcast - Kevin Gaffney | SQLite: Past, Present, and Future | #11
Episode Date: November 14, 2022

Summary: In this episode Kevin Gaffney tells us about SQLite, the most widely deployed database engine in existence. SQLite is found in nearly every smartphone, computer, web browser, television, and ...automobile. Several factors are likely responsible for its ubiquity, including its in-process design, standalone codebase, extensive test suite, and cross-platform file format. While it supports complex analytical queries, SQLite is primarily designed for fast online transaction processing (OLTP), employing row-oriented execution and a B-tree storage format. However, fueled by the rise of edge computing and data science, there is a growing need for efficient in-process online analytical processing (OLAP). DuckDB, a database engine nicknamed "the SQLite for analytics", has recently emerged to meet this demand. While DuckDB has shown strong performance on OLAP benchmarks, it is unclear how SQLite compares... Listen to the podcast to find out more about Kevin's work on identifying key bottlenecks in OLAP workloads and the optimizations he has helped develop.

Questions:
- How did you end up researching databases?
- Can you describe what SQLite is?
- Can you give the listener an overview of SQLite's architecture?
- How does SQLite provide ACID guarantees?
- How has hardware and workload changed across SQLite's life? What challenges do these changes pose for SQLite?
- In your paper you subject SQLite to an extensive performance evaluation, what were the questions you were trying to answer?
- What was the experimental setup? What benchmarks did you use?
- How realistic are these workloads? How closely do these map to user studies?
- What were the key results in your OLTP experiments?
- You mentioned that delete performance was poor in the user study, did you observe why in the OLTP experiment?
- Can you talk us through your OLAP experiment?
- What were the key analytical data processing bottlenecks you found in SQLite?
- What were your optimizations? How did they perform?
- What are the reasons for SQLite using dynamic typing?
- Are your optimizations available in SQLite today?
- What were the findings in your blob I/O experiment?
- Progress in research is non-linear, from the conception of the idea for your paper to the publication, were there things you tried that failed?
- What do you have planned for future research?
- How do you think SQLite will evolve over the coming years?
- Can you tell the listeners about your other research?
- What do you think is the biggest challenge in your research area now?
- What's the one key thing you want listeners to take away from your research?

Links:
- SQLite: Past, Present, and Future
- Database Isolation By Scheduling
- Kevin's LinkedIn
- SQLite Homepage
Transcript
Hello and welcome to Disseminate, the computer science research podcast. I'm your host, Jack Wardby.
I'm delighted to say today I'm joined by Kevin Gaffney, who will be talking about his VLDB22 paper, SQLite, Past, Present and Future.
Kevin is doing his PhD in the data systems group at the University of Wisconsin-Madison, and his research focuses on database system performance. Kevin, welcome to
the podcast. Thanks, Jack. I'm happy to be here. Fantastic. Well, let's dive straight in. So how
did you end up researching in databases? Yeah, so I was interested in computer science research
probably since I started high school when I wrote my first program. But I never really knew what area I wanted to go into.
So I had a little bit of interest in AI and ML, a little bit of interest in bioinformatics.
But I ended up doing a semester of research with Jignesh Patel at UW-Madison and ended
up kind of falling in love with the idea of database research, specifically performance optimization.
One of the things that his group really focuses on is bare metal performance. So we try to get
performance of database systems as close to the performance of the underlying hardware
as much as possible. Fantastic. So let's talk about your paper. Can you first off,
for the uninitiated,
can you describe what SQLite is?
And I think I'm maybe pronouncing that wrong, right?
Is it SQLite, right?
Yes, it's SQLite, kind of like a mineral.
So SQLite is an embeddable database engine.
So this means that in contrast to a lot of other database engines, it runs in the process of the host
application. So this is as opposed to a database system where it occupies a single process on a
computer and then clients have to communicate with that process via some sort of inter-process
communication. SQLite is basically just a library. So you process data, you manage data via library calls.
And it's the most widely deployed database engine in the world.
And in our paper, we kind of boil that down into four reasons.
One, it's cross-platform.
So the database file in SQLite can be freely copied between 32-bit, 64-bit architectures,
little-endian, big-endian architectures. And it's also so portable that it's been recommended by the U.S. Library of Congress
for the preservation of data. It's also compact, so it compiles down to less than one megabyte
in general, and its source code is all contained in just one single C file. And then it's reliable.
So over 600 lines of test code exist for every one line of library code. So there's just a huge
amount of testing. 100% of machine branches are covered by the tests, which is a huge achievement.
And then finally, it's fast. So it can support tens of thousands of transactions per second.
And in some cases, it can be faster and more space efficient
than the underlying file system.
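To make the in-process, library-call model concrete, here is a minimal sketch using Python's built-in sqlite3 module. The file and table names are made up for illustration; the point is simply that there is no server process to connect to, just a file and library calls inside your own process.

```python
import sqlite3

# Opening a database is just opening (or creating) a file; no server
# process is started, everything runs inside this Python process.
conn = sqlite3.connect("example.db")

# Data is processed and managed via ordinary library calls.
conn.execute("CREATE TABLE IF NOT EXISTS notes(id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes(body) VALUES (?)", ("hello, SQLite",))
conn.commit()

for row in conn.execute("SELECT id, body FROM notes"):
    print(row)

conn.close()
```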
Amazing. I know when I first read the fact that all of the machine branches
have been tested, I was like, wow, that is pretty amazing.
And some achievement, like you say.
Cool. So can you maybe give the listener an overview of what SQLite's architecture is?
Yeah. So like most other database engines, SQLite is organized into various modules.
So there are four main groups of modules in SQLite. The compiler modules, the core modules, the backend, and then accessories.
So the compiler modules have the basic tokenizer, parser,
that you would see in most database engines that process SQL.
But there's also this code generator.
And this is a component that takes a SQL query
and compiles it into a program of virtual instructions. And these virtual
instructions are defined by SQLite itself, so it's kind of like an instruction set specific to SQLite. And then the core modules include the command line and C interfaces, how you
interact with the database engine, and then the virtual machine. The virtual machine is the component that's
responsible for executing this virtual program. And it actually just consists of a large switch
statement with one case for each virtual instruction. And this is kind of the heart
of SQLite where all the data processing really happens. And then the backend modules include
the B-tree implementation, the pager, and then the interface to the operating system.
And then there are also some accessory modules like test code and common utilities.
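As an aside, you can inspect the virtual-instruction program Kevin describes: prefixing a statement with EXPLAIN makes SQLite return its bytecode, one row per virtual instruction. A small sketch using Python's sqlite3 module (the table is hypothetical; the exact opcodes you see depend on your SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(a INTEGER PRIMARY KEY, b TEXT)")

# EXPLAIN lists the compiled program; each row describes one virtual
# instruction (address, opcode, operands, ...). A point lookup like this
# will typically include opcodes such as SeekRowid and Column.
for row in conn.execute("EXPLAIN SELECT b FROM t WHERE a = 42"):
    addr, opcode = row[0], row[1]
    print(addr, opcode)
```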
You mentioned transactions earlier on.
So how does SQLite provide ACID guarantees?
What mechanisms does it use to achieve this?
Yeah, so isolation is achieved through basically just file locking.
Because SQLite is an embeddable database engine, it needs to lock files to prevent other processes from operating on files that could cause conflicts.
It provides consistency via data structures like its B-tree and some internal checks.
But when you think about atomicity and durability, it starts to get a little bit more interesting. So it provides these guarantees via one of two modes. There's rollback
mode, and then there's WAL mode. So in rollback mode, before modifying a page, SQLite will write
the original version of the page to a rollback journal on disk. And this ensures that SQLite can recover the original page contents whenever there is a
power failure or some other exception.
Then after that rollback journal is flushed to disk, SQLite can write the modified page
to the database file.
So in WAL mode, it kind of inverts this.
So instead of writing the original version of a page to a rollback journal,
SQLite will write the modified page to a write-ahead log file, or WAL file.
And this leaves the original database file untouched. And then every once in a while,
when the WAL file grows to a certain size, SQLite needs to transfer the modifications in the WAL
file back to the original database file. And this is called a checkpoint. So the main differences in terms of performance are that
rollback mode allows unlimited readers or one writer, whereas WAL mode allows unlimited readers
and one writer. So there's a little bit more concurrency. And then rollback mode typically
requires two writes per modified page, one to the rollback journal and then another to the database file.
And WAL mode typically requires just one write.
But there are those occasional checkpoint operations.
And in our experiments, we evaluate both modes in our transaction processing focused experiments.
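For anyone who wants to try both modes, the journal mode is switched with a PRAGMA. A quick sketch in Python (file name hypothetical); note that WAL mode is a property of the database file, not of a single connection:

```python
import sqlite3

conn = sqlite3.connect("journal_demo.db")

# New databases start in rollback-journal mode ("delete" on most builds).
print(conn.execute("PRAGMA journal_mode").fetchone())

# Switch to write-ahead logging; the pragma returns the mode now in effect.
print(conn.execute("PRAGMA journal_mode=WAL").fetchone())

# Force a checkpoint, transferring WAL contents back into the database file.
print(conn.execute("PRAGMA wal_checkpoint(FULL)").fetchone())

conn.close()
```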
Awesome. And we will probably touch on this later on when we discuss the OLTP experiments.
But what are the sort of workloads when I want to use one or the other?
Which is the best one?
Is it always that there's one that's better than the other?
Or is it kind of there's a trade-off there between the two in a design space, but a choice based on your workload, essentially?
Yeah, it definitely depends on the workload.
There's no one answer because there are so many factors that go into it. For example, WAL mode won't work on a network file system, whereas rollback mode will. So there are some drawbacks and advantages to both modes, so it really just depends. My next question is about a really nice bit in your paper where you look at how hardware and workloads have changed across SQLite's lifetime.
Maybe we can dig into that a little bit and you can discuss these two different dimensions and how they've developed over the past 20 years or so.
Yeah. So since SQLite runs on just about everything, it's kind of a question of how computing hardware in general has changed over time. So there are tons of trends that we could talk about,
but there are some key trends that are specific to SQLite, mainly just faster storage. So devices
now have faster flash-based storage that can sometimes support up to multiple gigabytes per
second of read bandwidth. A lot of devices have larger
memory. So the amount of DRAM in a typical mobile phone or a computer has grown by several orders
of magnitude over SQLite's lifetime. And we also see a lot more parallelism. So modern computers
now have several cores. Modern mobile phones even have CPUs with six or more cores
and maybe a GPU or some specialized accelerators.
So there are many more trends in hardware,
but these are kind of the most relevant to SQLite.
In terms of workloads,
it's kind of difficult to say what SQLite's been used for historically,
but we know based on its design
that it's kind of specialized
for fast online transaction processing.
So this typically includes workloads
with a small portion of inserts, updates, and deletes,
and access to data through B-tree indexes.
So a typical use case could be sort of a mobile application
that needs to store some local state.
But there is this really interesting 2015 study that was done by researchers at the University at Buffalo, in which
they gave out modified Android phones to some students, faculty, and staff at the university.
And these phones would record all the SQLite activity on the phone over the course of the
study. And participants were just
asked to use these phones as normal. And then after the study, they looked at all the data.
And they found that a large portion of the workloads they observed were key value reads
and writes. But they also saw this significant tail of complex OLAP queries that had nested
select statements and maybe joins between several tables. And then one other interesting thing they found is that delete statements tended to be very expensive
because they often included nested select statements to figure out which rows in the tables to delete.
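To give a flavour of the kind of statement the study flagged, here is a hypothetical delete whose target rows are determined by a nested select; the schema and values are invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threads(id INTEGER PRIMARY KEY, last_used INTEGER);
    CREATE TABLE messages(id INTEGER PRIMARY KEY, thread_id INTEGER, body TEXT);
""")

# The nested SELECT decides which rows to delete, so the cost of the delete
# is dominated by evaluating that subquery rather than by the deletions.
conn.execute("""
    DELETE FROM messages
    WHERE thread_id IN (SELECT id FROM threads WHERE last_used < ?)
""", (1650000000,))
conn.commit()
```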
So we know along with that that SQLite is sometimes used for analytical operations and data science
and also for data processing on edge
devices and the internet of things. So basically, it's used for OLTP, OLAP, key value, and mostly
everything in between. And DuckDB, which is a new embeddable database engine, has recently emerged to meet this demand for in-process fast analytics or OLAP.
And DuckDB is similar to SQLite in many ways. It runs in-process. There's a single database file
and there's no dependencies. But there are a lot of differences between SQLite and DuckDB,
such as the storage where DuckDB uses column storage, SQLite uses row storage.
DuckDB uses vectorized query execution, whereas SQLite uses row by row.
DuckDB uses batch-optimized MVCC and has a lot of inter-query parallelism.
Whereas SQLite offers very limited parallelism.
So basically, SQLite is used for many, many different things.
And one of the goals in this paper is to compare the performance of SQLite
on many different workloads relative to DuckDB.
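As a rough illustration of the kind of comparison the paper makes, both engines can be driven from Python with nearly identical code. This toy aggregate is not one of the paper's benchmarks, and the duckdb package's connect/execute/executemany API is assumed from its Python client:

```python
import time
import sqlite3
import duckdb  # assumed: the DuckDB Python package

rows = [(i, i % 100) for i in range(100_000)]

# Load the same data into both engines, in memory for simplicity.
sq = sqlite3.connect(":memory:")
sq.execute("CREATE TABLE t(id INTEGER, grp INTEGER)")
sq.executemany("INSERT INTO t VALUES (?, ?)", rows)

dd = duckdb.connect()
dd.execute("CREATE TABLE t(id INTEGER, grp INTEGER)")
dd.executemany("INSERT INTO t VALUES (?, ?)", rows)

query = "SELECT grp, count(*), avg(id) FROM t GROUP BY grp"

# Time the same analytical query on each engine.
for name, conn in [("sqlite", sq), ("duckdb", dd)]:
    start = time.perf_counter()
    conn.execute(query).fetchall()
    print(name, time.perf_counter() - start)
```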
Building off what you said there, what challenges do all of these changes pose to SQLite?
So the key challenge is to sort of support a wide variety of these
workloads with high performance, but we want to do this without sacrificing SQLite's qualities of
compactness, portability, and reliability. And I would argue these are the things that have made
it most popular. It's kind of difficult to say for sure, but most people probably choose SQLite because it's so easy to use.
It's so well tested and it's non-intrusive.
You can basically just drop it in to your program and start using it.
So we need to make sure that we maintain all these qualities as we start to think about improving its performance. Kind of a key component
of your paper is, as you mentioned, a really nice extensive performance evaluation of SQLite
and obviously the DuckDB as well that you're comparing it against. What were the kind of the
key questions you were trying to answer with your performance evaluation? Yeah, most people
that are somewhat familiar with SQLite and DuckDB
have some intuition that SQLite is likely to perform better on OLTP workloads and DuckDB
is likely to perform better on OLAP workloads. But we didn't really have a good idea of the
magnitude of those performance differences. So the first question we wanted to answer
is just how does SQLite perform relative to DuckDB on a huge variety of workloads?
And how big are those performance gaps, basically?
So our second question is, what are the key factors responsible for these performance differences?
In other words, can we explain these performance gaps with a few key causes?
And in this part, we specifically focus
on OLAP workloads. And then finally, as we talk about in the later part of our paper,
our third question is, how can we make SQLite faster on OLAP workloads? In other words,
how can we shrink those performance gaps between SQLite and DuckDB?
So what was the experimental setup and what benchmarks did you use in your evaluation?
We run three benchmarks. For OLTP, we use TATP, and for OLAP, we use the Star Schema Benchmark, or SSB.
And then we also use a third benchmark that we developed specifically for this study,
which we call the blob benchmark. And this is basically just a benchmark to measure the
efficiency of raw reads and writes to a binary object. And then we run these three benchmarks on a single thread on
two hardware configurations, a fairly powerful cloud server, and on the other end of the spectrum,
a Raspberry Pi 4B. This is kind of a, what might be a difficult question to kind of answer because
when people have asked me this before,
I've struggled to give them a satisfactory answer.
How realistic are these workloads?
These are synthetic benchmarks, right?
But TATP, for example, is quite a simple benchmark.
How closely does that model or map to what you've seen in user studies?
Yeah, it's difficult to say, as you said.
But the other obvious choice for transaction processing would be something like TPCC,
which is quite a bit more complicated than TATP.
So we felt that because we're measuring the performance of SQLite, which runs on
a huge range of devices, but a lot of mobile phones are smaller devices. And judging on that
2015 study where we saw a lot of the workload consisted of key value reads and writes or simple
OLTP operations, TATP would probably be a good choice. And then in a similar line of thinking, SSB is sort of a
simplified version of TPC-H or something like TPC-DS, which is considerably more complicated.
And we felt that SSB would be a better benchmark in order to start thinking about not only what
are the performance differences, but also what are the causes of those performance differences.
Using a simpler benchmark would allow us to sort of boil down those performance differences
that we were seeing into a few root causes.
So let's start off with the OLTP experiment.
What was the key results from this?
On TATP, SQLite's WAL mode is considerably faster than its rollback mode, about 3x.
And this was expected given that, as I mentioned earlier, WAL typically requires fewer writes to disk.
Now, when we start to compare SQLite and DuckDB, DuckDB's performance is actually pretty comparable to SQLite for very small databases, like on the order of a few megabytes.
But as the database grows beyond a few hundred megabytes, DuckDB's performance starts to degrade pretty significantly. So at 1 million subscriber records, which is basically the scale factor for
TATP, SQLite ends up being about 60 times faster than DuckDB on the Raspberry Pi.
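For context on what TATP stresses, its transactions are mostly single-row reads and updates keyed on a subscriber ID. A schematic sketch of that shape in Python, with a deliberately simplified schema rather than the actual TATP DDL:

```python
import random
import sqlite3

conn = sqlite3.connect("tatp_like.db")
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("""CREATE TABLE IF NOT EXISTS subscriber(
                    s_id INTEGER PRIMARY KEY,
                    vlr_location INTEGER)""")
conn.executemany("INSERT OR IGNORE INTO subscriber VALUES (?, 0)",
                 [(i,) for i in range(1, 10_001)])
conn.commit()

# TATP-style point read and point update: one B-tree lookup each.
s_id = random.randint(1, 10_000)
conn.execute("SELECT vlr_location FROM subscriber WHERE s_id = ?", (s_id,)).fetchone()
conn.execute("UPDATE subscriber SET vlr_location = ? WHERE s_id = ?",
             (random.getrandbits(31), s_id))
conn.commit()
```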
So I just want to touch on the deletes here. So you mentioned in the user study
that deletes were really sort of problematic
and the performance was quite low.
There is, I believe, this one transaction in TATP
that is a delete transaction, right?
How did that sort of fare?
Did you observe anything there
as to why the deletes are so slow?
Yeah, so you're right.
There is one delete transaction,
but it's actually a fairly
simple transaction. So it just requires a lookup into a B-tree. And I think it usually deletes
a single row, if not a handful of rows. So as opposed to some of the deletes that were observed
in that other study, where nested select statements, complex select statements were determining which rows to
delete. Can you now talk us through your OLAP experiment? Sure. So our OLAP experiments were
focused on measuring the mean latency for each SSB query run on both SQLite and DuckDB. And we
found that in summary across the board, and as we expected, DuckDB performs better than SQLite
for this workload on every query. So on the Raspberry Pi, the gap in latency ranged from
as low as 3x to as great as 35x. So there were some pretty big performance differences here.
SQLite also had a lot more performance variation among different queries. So its slowest mean latency was about 10 times higher than its fastest mean latency.
So there's a lot of difference when you're executing different SSB queries.
And then based on those results, we kind of asked ourselves, and this leads into the next question,
what are the key bottlenecks in SQLite slowing down its OLAP performance?
And that's where we start to look at what are the causes for these results.
I guess the next question is, what are these bottlenecks that you found?
Yeah, so to answer this question, we took advantage of SQLite's virtual database engine that I was talking about earlier.
So like I said,
SQLite compiles SQL queries into a program of virtual instructions, and then these go into
the virtual database engine, which is that large switch statement that executes the logic of each
virtual instruction. So we can actually measure the time spent in each virtual instruction,
and we just did that for SSB. And we found that only two instructions,
the seek row ID and column instructions, were taking up most of the time for all the queries.
The seek row ID instruction says that given a row ID, find the corresponding row in a table
via a search in a B-tree index. And then the column instruction says, given some row in a table,
extract the value for a specified column in that row. Given these bottlenecks, what were the
optimizations you developed and how did they perform? So our optimizations focus largely on
the seek row ID instruction, but we talk a little bit about potential optimizations to the column
instruction in the paper. So we asked ourselves, how can we reduce the number of seek row ID calls?
So SSB, as the name implies, uses a star schema. So this means we have a very large fact table,
and then we have foreign key references to several smaller dimension tables.
And when you draw these tables and the reference arrows, it resembles a star.
Most SSB queries involve selective semi-joins on the fact table. And this means that many of the
rows in the fact table are actually excluded from the final query results due to some filter
applied on a dimension table. But in order to figure out whether a given row in the fact table should be
included in the result, we often need to use that seek row ID instruction to do a B-tree index lookup
on a dimension table. So that's why we're seeing so much time spent in the seek row ID instruction.
In order to reduce the number of seek row ID calls that we need to perform, we use this technique called look ahead
information passing or LIP. This involves looking ahead in the join pipeline to build lightweight
data structures that are used to filter out the rows from the fact table that will ultimately be
excluded from the final result. So in SQLite, we build bloom filters on each key set that remains after applying some filter on a dimension table.
And then for each row of the fact table, we first probe these bloom filters prior to probing any B-tree indexes.
And because probing a bloom filter is relatively cheap compared to probing a B-tree index,
and assuming a low false positive rate in the Bloom filter, we can actually eliminate
a lot of these tuples early on. And thus, we save a lot of expensive B-tree index lookups.
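To illustrate the idea only (this is not SQLite's implementation, which lives in its query planner and virtual machine), here is a toy version of the LIP pattern in Python: build a cheap approximate filter over the dimension keys that survive the predicate, and consult it before paying for the index lookup.

```python
# Toy sketch of lookahead information passing (LIP) with a Bloom filter.

class BloomFilter:
    def __init__(self, nbits=1 << 16):
        self.nbits = nbits
        self.bits = bytearray(nbits // 8)

    def _positions(self, key):
        # Two cheap bit positions derived from Python's hash; a real
        # implementation would use a proper hash family.
        h = hash(key)
        yield h % self.nbits
        yield (h // self.nbits) % self.nbits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


# Build the filter over dimension keys that pass the dimension-table predicate...
dim_keys_after_filter = {10, 42, 77}
bf = BloomFilter()
for k in dim_keys_after_filter:
    bf.add(k)

def lookup_in_dimension_index(key):
    # Stands in for the expensive SeekRowid-style B-tree probe.
    return key in dim_keys_after_filter

# ...then, while scanning the fact table, probe the Bloom filter first and
# only fall back to the index lookup on a probable hit.
fact_foreign_keys = [5, 42, 99, 77, 3]
matches = [k for k in fact_foreign_keys
           if bf.might_contain(k) and lookup_in_dimension_index(k)]
print(matches)  # [42, 77]
```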
So these optimizations resulted in a huge decrease in seek row ID calls, as we show in the paper.
And it also improves the latency of some queries by a wide margin, up to 10x, and we also observed an overall speedup of about 4.2x on SSB. Awesome. As you optimize this instruction, what else becomes the bottleneck then? Does it mean more time is spent on the column instruction relative to everything else? Yeah, exactly. As we know, as we
eliminate one bottleneck, another arises. So after
eliminating or reducing the time spent on seek row
ID, now most of the queries actually spend a significant amount of time
on the column instruction. And we started to look into how we could reduce the time spent in the column instruction. As I said before, the column instruction says, given some row in a table,
extract a specified value from that row. Now, SQLite has an interesting quirk that causes this
to be a little bit more expensive. SQLite actually uses
dynamic typing, which is very unusual among database systems. And this means that within
the same column, the value of one row could be a completely different data type than the value of
another row. So for example, we could store a float in one row, an integer in the next, and a string in the
next, all in the same column. And SQLite tables can actually be defined with no types at all.
So for example, like create table ABC without any type specified is actually valid in SQLite.
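You can see this flexibility directly; a small sketch using Python's sqlite3 module and SQLite's typeof() function:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table with no declared column types is perfectly valid in SQLite.
conn.execute("CREATE TABLE t(a, b, c)")

# The same column can hold a float, an integer, and a string in different rows.
conn.execute("INSERT INTO t(a) VALUES (3.14)")
conn.execute("INSERT INTO t(a) VALUES (42)")
conn.execute("INSERT INTO t(a) VALUES ('hello')")

for value, storage_class in conn.execute("SELECT a, typeof(a) FROM t"):
    print(value, storage_class)   # real, integer, text
```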
Now, all of this means that because each column in SQLite doesn't have a predefined data type, it's flexible, we need to store type information associated with each row.
So SQLite uses a row store organization, but each row consists of a header and then the actual encoded values.
The header includes type information and size information for each of the encoded values.
So in order to extract a specified column from a row,
we first need to read the header information
and then compute the offset of the specified value
and then jump to that offset and extract the value.
And we need to do this for every row.
So that's the main source of overhead in the column instruction. Now you can kind of contrast
this with a column store organization where all of the values in a single column are laid out
consecutively. And using a SIMD operation, you can perform the same operation on maybe eight or so values at once. So that's where we start to see some big
differences in the column instruction. Why was that design decision made way back whenever it
was made? What was the motivation for it? And what are the benefits of opting for dynamic typing
versus the conventional approach, I guess, that most database systems use?
I'm actually not sure. I do know that in terms of the rationale, I'm not very clear on it,
but I do know that there are some benefits to dynamic typing. For example, if you're using a dynamically typed language, such as like Python or JavaScript or something like that, it can ease sort of
the interoperability between the database engine and the language.
It just kind of suits it a little bit better.
But beyond that, apart from just flexibility, I don't know the rationale for that design decision. There is a web page on SQLite's website that's dedicated to explaining this decision, though, so I would recommend checking that out.
Cool. So, are these optimizations you've made available today? Yes, they are. These optimizations were released in SQLite version 3.38.0,
and that came out in February of 2022.
So I would encourage you to download the code and try it out yourself.
Awesome. Have you had any feedback on the optimizations?
Any happy customers?
Not yet. We're not sure how often these optimizations will come into play
just because we don't have a great idea of the workloads that SQLite is used to execute in the wild,
which would be another interesting experiment to sort of build on this 2015 study
to see what is the impact of a given optimization.
And seven years later, is SQLite being used for considerably different things than the study originally found?
The next experiment you did was the blob IO experiment.
Can you talk us through what your findings were with these experiments?
Yeah, so the motivation for this experiment was our observation that SQLite is used for a lot of key value type workloads. So it's sometimes
used as just a storage engine that provides transactional guarantees. So in this experiment,
or in this benchmark rather, the database is issued a constant stream of reads and writes
to a raw binary object. And we vary the size of the binary object and the proportion of the workload that is reads and writes.
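A stripped-down stand-in for that kind of workload, using Python's sqlite3 module; the table name, blob size, and read fraction are arbitrary, and this is not the paper's actual benchmark harness:

```python
import os
import random
import sqlite3

BLOB_SIZE = 100 * 1024   # roughly the small-blob case discussed
READ_FRACTION = 0.9      # proportion of the workload that is reads

conn = sqlite3.connect("blob_bench.db")
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("CREATE TABLE IF NOT EXISTS kv(k INTEGER PRIMARY KEY, v BLOB)")
conn.execute("INSERT OR REPLACE INTO kv VALUES (1, ?)", (os.urandom(BLOB_SIZE),))
conn.commit()

# Issue a stream of raw blob reads and writes in the chosen read/write mix.
for _ in range(1000):
    if random.random() < READ_FRACTION:
        conn.execute("SELECT v FROM kv WHERE k = 1").fetchone()
    else:
        conn.execute("UPDATE kv SET v = ? WHERE k = 1", (os.urandom(BLOB_SIZE),))
        conn.commit()
```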
So for small blobs, about 100 kilobytes in size, SQLite is just a bit faster than DuckDB for all proportions of reads in the workload.
So SQLite executes this workload with about 30% faster throughput than DuckDB. Now, as you start to
increase the size of the blobs up to about 10 megabytes in size, DuckDB gets a little bit
faster. So for these large blobs, DuckDB is actually 2x faster. And we think that this is
largely due to the differences in how the two engines handle atomic commit in transaction
processing. As I said before, SQLite uses rollback mode or WAL mode.
But for these large blobs in WAL mode, it actually causes a checkpoint to occur with each write.
So there are actually two writes per modified page in SQLite. So the performance of WAL mode
and rollback mode in SQLite are similar for this workload. But DuckDB's batch optimized version of MVCC typically requires
fewer file system writes for this workload. So that's why it performs a little bit better for
large blobs. Cool. Is there any sort of space there for potential optimizations based off of that
finding? Yeah, I think somehow reducing the writes in SQLite would be an important target for that.
Possibly increasing the WAL size that triggers a checkpoint operation to occur. If a checkpoint occurs when the WAL grows beyond 10 megabytes in size and you're performing a 10-megabyte write with each transaction, of course the checkpoint is going to occur with each transaction. So increasing the size of the WAL would be one option.
But we didn't get to that experiment in the paper.
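SQLite does expose a knob along these lines: the wal_autocheckpoint pragma sets the WAL size, in pages, at which an automatic checkpoint is attempted, and it can be disabled so checkpoints are run explicitly. Whether that would close the gap for this workload is exactly the experiment Kevin says they didn't run; the sketch below just shows the knob.

```python
import sqlite3

conn = sqlite3.connect("blob_bench.db")
conn.execute("PRAGMA journal_mode=WAL")

# Default is around 1000 pages; raising the threshold means large blob
# writes no longer trigger a checkpoint on every transaction.
conn.execute("PRAGMA wal_autocheckpoint=100000")

# Alternatively, disable automatic checkpoints and run them explicitly.
conn.execute("PRAGMA wal_autocheckpoint=0")
conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")

conn.close()
```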
So my next question is, progress in research is pretty much nonlinear, right?
We have ups and downs, so from the conception of the idea for the paper to the publication, were there things that you tried that sort of failed? And I'd also be interested to know about the origin story of this project as well. How did it actually come about? Yeah, this is a great question, and definitely there were things that we tried that failed. Over the course of this project, we had several great discussions with the SQLite team, who, by the way, are just great people to work with, about the viability of our ideas. We had some ambitious dreams about what we could do. But they were quick
to point out that some of these things wouldn't be viable because they would break the portability
of SQLite or its cross compatibility, or it would be too complex to add. And it would counteract
some of the other goals that SQLite was striving for, such as its portability, compactness, and reliability in
terms of tests. So we considered many things like hash joins, native code generation, vectorization,
increased parallelism, and some of these things were not viable considering those other goals,
but also others were just out of the scope of this work
and would take too much effort to actually implement within the given timeframe. So we
ultimately decided on the Bloom filter-based lip optimizations because of the ratio of added
complexity to the performance gains. In order to implement those optimizations, we only added two new virtual instructions, made some small changes to the query planner, and also implemented sort of a simple data structure, the Bloom filter.
And that's all it really took, but we ended up getting a pretty substantial performance gain, as I said. In terms of the origin of this project, my group had done some related work in the past focused on developing an analytics optimized layer outside of SQLite's core engine that would use columnar storage and vectorized query processing in order to answer OLAP queries more efficiently, and also have some machinery to keep the data in SQLite's core engine and the
analytics optimized engine in sync with each other. Now, this turned out to be a little bit
too clunky and didn't work out for various reasons. So we kind of turned to a different
direction, which is how can we improve the performance of SQLite's OLAP performance,
or how can we improve the performance of SQLite by just focusing on things that we can change
within the core engine?
And this is how we arrived at this project.
I'm interested to know about the development time and the ease of going in there and making
an invasive change to, I'm guessing, quite a, I don't know, is it quite a complex
code base to work with? Or is it quite easy to find your way around? Yeah, SQLite, and they say
this on their website, is very difficult to hack, because it's been optimized pretty heavily over
its two decade lifetime. Now, it's still organized in a relatively modular way. And even though the code is provided
in a single amalgamated C file, the developers use separate C files that are put together in a
script whenever they want to package it. So it's somewhat modular, but due to its being heavily
optimized over the many years, it is difficult to make a change that doesn't break everything. So
we actually implemented these optimizations using extensions to SQLite. So we defined a bloom filter
as a blob in a table. And then we used SQLite's user-defined functions to construct and probe this Bloom
filter. And we developed sort of a proof of concept using this methodology. And then we
presented those results to the SQLite developers. And from then on, we started talking about how we
could actually integrate this change into the core SQLite code base. And then the developers sort of took the lead and actually wrote the code that would implement these changes. But in terms of actually trying to make the changes ourselves, I did find it somewhat difficult, just because of the complexity of SQLite's code. So I'm guessing as well,
there are certain levels of test coverage and stuff like that, and sort of things they need to maintain, right? So I'm guessing that sort of thing is maybe best left to the SQLite team, I guess.
Yeah, but one of the cool things
about that huge amount of test coverage
is that you can make changes
as long as you know what you're doing
to the SQLite code base.
And as long as all the tests pass,
you can be reasonably sure that nothing is broken.
Because there are so many tests that cover
so many possible cases,
you can quickly add new features to SQLite
and be pretty confident that nothing's broken. So the actual time it took for
the developers to implement these changes was relatively small. I think they got it done in
less than a month, and all the test cases were written as well. And Richard, the developer, says that he spends, you know, a small fraction of his time writing the actual code of SQLite and the vast majority of his time writing test cases. So although that can kind of bog you down as a developer, it can also empower you to make changes more quickly because you have more confidence in your changes. Yeah, awesome. Cool, so what do you have planned next for future research?
Are you going to be continuing in working with SQLite or what's next, basically?
Yeah, I would love to continue working with SQLite and I'm still sort of carving out my direction.
But I think I'll continue to look into methods for improving the efficiency of join algorithms, possibly focusing on more general techniques
that could be applied to any database system
in addition to SQLite.
And I think I'd like to focus on lightweight data structures
such as Bloom filters to improve memory efficiency.
So I'm not sure exactly what that looks like,
but I'm enjoying reading a ton of papers at the moment.
Yeah, awesome.
So is that sort of things like worst-case optimal joins,
that sort of space?
In that vein, yes.
Fantastic.
So I guess now we can touch on the future of SQLite as well a little bit.
So how do you think the system will evolve over the coming years?
Yeah, I think SQLite will continue to add tons of new features as it has over its, you know, over 20 year lifetime. And I wouldn't be surprised if its analytical performance continues to improve.
But one certain thing, and the developers note this, is that it will continue to receive the same level of support that it has for several more decades. So they say their mission is to support SQLite through the year 2050. And a lot of their design decisions are made accordingly.
So they really prioritize the portability, the compatibility, and the compactness and the
reliability of the SQLite library. I guess the elephant in the room question, or the duck in the room, right: how do you think SQLite and DuckDB, what will the market look like, how do these two things end up over the next 30 years? Is it going to be a case of DuckDB eventually cornering the market on any sort of analytics use case, almost relegating SQLite to just transaction-style workloads? Or do you think there'll be some sort of meet in the middle, where the performance will gradually trend towards each other? What do you think the state of play will be if we fast-forwarded 20 years? Yeah, that's a great question. I think there will still be a place for both, as there is now. I think DuckDB fills a really important gap in database engines today.
You know, you see data scientists execute a lot of, you know, relational type operations as part of data science workflows.
And sometimes that can be in a data frame type environment like Pandas. But often it's a lot more efficient,
especially when the data starts to exceed the capacity of memory
to execute it in an embeddable database engine like DuckDB.
So DuckDB has a very strong use case for that.
And other use cases where embeddable OLAP database engines
would be important.
So I think DuckDB really, really nicely fills that gap.
SQLite is still very optimized for transaction processing.
And as we show, for larger databases, it can be considerably faster.
So if you're an application and you need a database engine to store state
and maybe even do some lightweight analytical
processing, SQLite would be probably the ideal choice, in my opinion, just due to its valuable
properties of compactness and reliability. So I think, in general, both will occupy an
important space, and I think there will be a need for both.
The key question that I'm sort of interested in is, is there a need for a system that can kind
of do both at the same time? And this kind of relates to a lot of the HTAP research that you
see in database systems. You know, is there a need for an engine that can perform both transaction processing and real-time analytics in the same space?
And if so, how do we accomplish this?
So I think you can apply that same logic to small database engines, embeddable database engines.
Is there a need for an embeddable engine that can have the transaction processing performance of SQLite and the analytical processing performance of DuckDB?
I guess as well, the question that follows from that is,
what would the architecture of that system actually look like?
How do you go about marrying these two things together?
It's a non-trivial question, right?
And how would you actually do that is very interesting.
I agree. That's a huge question.
I think that could occupy, you know, a grad
student's entire career. So somebody could write their dissertation about it. But I think, you know,
there are certain properties of DuckDB that you just can't get with like a more OLTP-focused
engine. And there are certain properties of SQLite that you can't get with
an OLAP-focused engine. We know that row store is typically better for OLTP and column store is
typically better for OLAP, certain things like that. So, you know, the question is, do we try
to have some sort of hybrid storage? Do we duplicate data in different types of storage? But there are things that you can do
to improve the performance of OLTP-focused systems. And another interesting question,
for me at least, is starting with an OLTP system, how close can we get to the performance of a system that's been heavily optimized for OLAP? Are there any techniques that you can borrow from DuckDB? I mean, it's maybe harder to integrate these sorts of techniques into a system as mature as SQLite, but if you were building a system from scratch and you had that design goal in mind, are there certain techniques you could borrow from day one that would maybe let you close that gap a little bit more? Yes, definitely. I think a key one would be
vectorized query processing, in other words, batch operations. So this definitely reduces
the overhead of expression evaluation and just general query processing.
The other thing is, even if you can't implement sort of a column store organization,
in a lot of row store tables or OLTP tables, you have a certain number of regularly sized values. So maybe storing those regularly sized values in a location or in
a way that you can quickly process them. So for example, having them regularly spaced out
in regular strides would help a lot in reducing the overhead of extracting those values from the
rows. So obviously this project is kind of one of the
many things you've done over the course of your PhD. Can you maybe give the listeners a sales
pitch of some of your other work as well that you've done? Yeah, so I started out doing a lot
of work in transaction processing. My first paper was about a modular isolation service. So the goal here was
to say, you know, given a typical transactional storage manager in a database system, it's
responsible for providing all these transactional guarantees, but it ends up being a very monolithic,
complex piece of code. So is it possible to take
this transactional storage manager and extract services from it to provide transactional
guarantees as modular code components? So we focus specifically on the isolation component.
And in order to accomplish this, we used sort of an optimized version of predicate locking,
which is an old idea that's been sort of discarded
or used in a limited capacity.
But we showed that you can optimize predicate locking
for a very common class of OLTP workloads
and provide transaction isolation as a modular service.
Are there any other things you've worked on across your PhD
that might interest the listener?
Yeah, I was actually
a co-author on a recent paper at SIGMOD focused on HTAP benchmarking. So we observed that there
are a few other HTAP benchmarks out there that already exist. But we really wanted to develop
a benchmark that was focused on evaluating HTAP across the entire spectrum of HTAP. So
HTAP is a very loosely defined term. Sometimes you can have largely OLTP workloads with a little
bit of analytics thrown in there, or maybe you have a lot of analytics with a little bit of OLTP.
So we wanted to test the performance of systems over an entire range of workloads and also come up with a more structured methodology
to do so. And this paper's name? The paper is called "How Good is My HTAP System?". Cool, great stuff. Again, we'll put that in the show notes and the listener can go and check that out. It's really cool stuff. Two more questions now. So the penultimate one is: what do you think is the biggest challenge in your research area now? Yeah, so the biggest challenge that I foresee in database research, which some people are already starting to work on, is taking advantage of the huge amount of hardware that we have available to us right now.
So hardware has changed over the past many years, but it's also grown in heterogeneity.
So not only do we have CPUs available, we have GPUs, FPGAs,
and we also are starting to see the emergence of some specialized accelerators, maybe for data processing.
So I think database systems are really well poised to be responsible for taking advantage of all this hardware because of their declarative query interface.
That's what a database engine does, is it takes a query that says what you want,
and it determines how best to get it. So I think that's the key property of database engines that
make them suited to taking advantage of all this available hardware. And my research group has
already started some projects that are focused on this sort of goal. Brilliant, and we'll keep an eye out for that research coming out. So I guess, last question now, it's time for the last word: what's the one key thing you want the listener to take away from your research and your work on SQLite? This work really taught me the importance of benchmark-driven performance optimization in research in general.
So as researchers, I think we can sometimes have the tendency to put the solution before the problem. I think there's a lot of value in first taking just an experimental approach in which we execute a benchmark or a number of
benchmarks on a system and figure out what are the real problems that that system is experiencing,
and then sort of use that information to guide the next steps. And related to that point, I think
there's a lot of importance in designing systems that are easy to profile and easy to observe.
SQLite's design as a virtual database engine where we can measure the time spent in each of
its virtual instructions was really useful in determining where the time was going in our OLAP
experiments. And that sort of led to those performance optimizations that we discussed
in the paper. So I think there's a huge amount of importance for designing systems that are
easy to profile.
Brilliant.
Yeah, definitely echo that sentiment as well.
That's awesome.
So we will, let's end it there.
Thanks so much, Kevin, for coming on the show.
If the listeners are more interested to know more about Kevin's work,
we'll put all the links to everything in the show notes so you can find it
there.
Or you can, I'm guessing, reach out to Kevin via email, or I believe you're on LinkedIn as well, right? Yes. So message him on there if you're interested in chatting more. Yeah, and we'll see you all next time for some more awesome computer science research.