Disseminate: The Computer Science Research Podcast - Disseminate x DuckDB Coming Soon...
Episode Date: March 6, 2025

Hey folks! We have been collaborating with everyone's favourite in-process SQL OLAP database management system DuckDB to bring you a new podcast series - the DuckDB in Research series!

At Disseminate our mission is to bridge the gap between research and industry by exploring research that has a real-world impact. DuckDB embodies this synergy: decades of research underpin its design, and now it's making waves in the research community as a platform for others to build on, and this is what the series will focus on!

Join us as we kick off the series with:
📌 Daniel ten Wolde – DuckPGQ, a graph workload extension for DuckDB supporting SQL/PGQ
📌 David Justen – POLAR: Adaptive, non-invasive join order selection
📌 Till Döhmen – DuckDQ: A Python library for data quality checks in ML pipelines
📌 Arjen de Vries – FAISS extension for vector similarity search in DuckDB
📌 Harry Gavriilidis – SheetReader: Efficient spreadsheet parsing

Whether you're a researcher, engineer, or just curious about the intersection of databases and innovation, we are sure you will love this series. Subscribe now and stay tuned for our first episode! 🚀
Transcript
Disseminate: The Computer Science Research Podcast, the DuckDB in Research series.
Hi everyone, Jack here with a quick announcement.
I know we've been quiet in 2025 so far, but that's because we've been
working tirelessly in the background on a new series that I'm here to tell you about today.
The new series is a collaboration with DuckDB.
For those who are not familiar with DuckDB, it's an in-process analytical
SQL database. And you may be thinking, why are Disseminate and DuckDB
collaborating?
Well, Disseminate as a podcast is all about impact and helping bridge the
gap between research and industry.
And DuckDB is a perfect example of these two communities working together: ideas from decades of database research underpin the design of the system at its very core.
And it is now itself influencing the research community as a platform for others to build on.
So we put our heads together and thought, let's make a series that platforms all of
the cool research and ideas that are happening on top of DuckDB.
So we've got a five part series coming your way soon.
And we've got a whole range of topics.
We've got Daniel ten Wolde, who's going to come tell us about DuckPGQ,
which is a community extension for graph workloads that supports the SQL/PGQ standard.
We've also got David Justen, who's going to come tell us about POLAR,
which is some research he published at the Very Large Data Bases conference this previous year.
And this is about adaptive and non-invasive join order selection
via plans of least resistance.
We've also got Till Döhmen from MotherDuck coming on to tell us about some research
he did on a project called DuckDQ, which is a Python library for data quality
checks in machine learning pipelines.
Number four, we've got Arjen de Vries, who's gonna be coming on to tell us about FAISS,
the FAISS extension that he has developed,
which allows users to store vector data
and enables efficient similarity search
and vector-based operations.
And last but by no means least,
we're gonna have Haralampos Gavriilidis
from the Technical University of Berlin
to come on and tell us about SheetReader,
which is some work he's done that enables efficient spreadsheet parsing using DuckDB.
So if any of those topics interest you, keep an eye on your feeds. These podcasts are going
to be coming out real soon. Take care everyone. See you then.