Disseminate: The Computer Science Research Podcast - Disseminate x DuckDB Coming Soon...

Episode Date: March 6, 2025

Hey folks! We have been collaborating with everyone's favourite in-process SQL OLAP database management system, DuckDB, to bring you a new podcast series: the DuckDB in Research series!

At Disseminate, our mission is to bridge the gap between research and industry by exploring research that has a real-world impact. DuckDB embodies this synergy: decades of research underpin its design, and now it’s making waves in the research community as a platform for others to build on. That is what the series will focus on!

Join us as we kick off the series with:
📌 Daniel ten Wolde – DuckPGQ, a graph workload extension for DuckDB supporting SQL/PGQ
📌 David Justen – POLAR: Adaptive, non-invasive join order selection
📌 Till Döhmen – DuckDQ: A Python library for data quality checks in ML pipelines
📌 Arjen de Vries – FAISS extension for vector similarity search in DuckDB
📌 Harry Gavriilidis – SheetReader: Efficient spreadsheet parsing

Whether you're a researcher, engineer, or just curious about the intersection of databases and innovation, we are sure you will love this series. Subscribe now and stay tuned for our first episode! 🚀

Transcript
Starting point is 00:00:00 Disseminate: The Computer Science Research Podcast, the DuckDB in Research series. Hi everyone, Jack here with a quick announcement. I know we've been quiet in 2025 so far, but that's because we've been working tirelessly in the background on a new series that I'm here to tell you about today. The new series is a collaboration with DuckDB. For those who are not familiar with DuckDB, it's an in-process analytical SQL database, and you may be thinking, why are Disseminate and DuckDB collaborating?
Starting point is 00:00:33 Well, Disseminate as a podcast is all about impact and helping bridge the gap between research and industry. And DuckDB is a perfect example of these two communities working together, in that ideas from decades of database research underpin the design of the system at its very core. And it is now itself influencing the research community as a platform for others to build on. So we put our heads together and thought, let's make a series that platforms all of the cool research and ideas that are happening on top of DuckDB. So we've got a five-part series coming your way soon. And we've got a whole range of topics.
Starting point is 00:01:23 And we've got Daniel ten Wolde, who's going to come tell us about DuckPGQ, which is a community extension for graph workloads that supports the SQL/PGQ standard. We've also got David Justen, who's going to come tell us about POLAR, which is some research he published at the Very Large Data Bases conference this previous year. And this is about adaptive and non-invasive join order selection via plans of least resistance. We've also got Till Döhmen from MotherDuck coming on to tell us about some research he did on a project called DuckDQ, which is a Python library for data quality
Starting point is 00:01:59 checks in machine learning pipelines. Number four, we've got Arjen de Vries, who's gonna be coming on to tell us about FAISS, the FAISS extension that he has developed, which allows users to store vector data and enables efficient similarity search and vector-based operations. And last but by no means least, we're gonna have Haralampos Gavriilidis
Starting point is 00:02:21 from the Technical University of Berlin to come on and tell us about SheetReader, which is some work he's done that enables efficient spreadsheet parsing using DuckDB. So if any of those topics interest you, keep an eye on your feeds. These podcasts are going to be coming out real soon. Take care everyone. See you then.
