The Data Stack Show - The PRQL: Data Migration Made Easy: Postgres, ClickHouse, and the Future of Analytics with Aaron Katz and Sai Krishna Srirampur
Episode Date: May 19, 2025
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Welcome to the Data Stack Show prequel.
This is a short bonus episode where we preview the upcoming show.
You'll get to meet our guests and hear about the topics we're going to cover.
If they're interesting to you,
you can catch the full-length show when it drops on Wednesday.
Welcome back to the Data Stack Show.
We are live in Oakland, California,
recording at the Data Council Conference,
and we have Sai and Aaron from ClickHouse on the show today.
Welcome, gentlemen.
Thank you very much.
Really excited to be here.
All right, well, give us just a quick background.
You've had a pretty incredible journey,
so give us a quick background.
Sure, I'm happy to start.
This is Aaron.
We formed ClickHouse, Inc., the company around
the popular open-source database ClickHouse, about four years ago. It's a venture-backed
startup headquartered in Silicon Valley, a Delaware corporation, and well capitalized. The
model is to take this very popular columnar open-source database and offer it as a managed
database service. It supports a variety of different use cases,
which I suspect we'll get into.
And we launched our managed service, which we call ClickHouse Cloud,
two years ago, and it's gone very well.
There's a lot of market demand for this type of technology.
So we've got over a thousand customers on our managed service,
companies like Weights & Biases, LangChain, Vercel, Twilio, Roblox, Sony, Cisco,
and many others, and they're driving great benefits
in terms of cost savings and also extremely low latency
analytical experiences for their customers.
So the company's about 300 employees globally distributed.
Over half of our team members
are outside of the United States, which also shows up
in terms of our customer base and our revenue mix being highly international, with over
50% of both being outside of the Americas.
I'd love to introduce Sai.
We acquired Sai's company, PeerDB, about 10 months ago, where he was the founder and CEO.
And they developed a CDC protocol for moving data
from Postgres into ClickHouse as Postgres emerged
as one of the most popular sources of data
going into our analytical database.
Awesome, very excited to be here
and thanks, Aaron, for the great intro.
So I'm Sai, and I head up the ClickPipes efforts at ClickHouse.
So ClickPipes is a native ingestion service
which gets data into ClickHouse Cloud.
So at a high level, we make it very easy to stream
and like get data from various sources like object storage
or streaming sources like Kafka and also databases, right?
And prior to ClickHouse, I was the CEO and co-founder
at PeerDB where we were building a data replication tool
with a laser focus on Postgres. So the goal was to provide the world's fastest and easiest way to move data from
Postgres to data warehouses, which included ClickHouse. And interestingly, ClickHouse
was one of the most adopted, highest-traction connectors, which is why I think Aaron acquired
PeerDB. And now at ClickHouse, we have already integrated PeerDB into
ClickHouse Cloud.
So you just click a button and you can start streaming
Postgres data into ClickHouse and use ClickHouse for blazing
fast analytics, right?
So it's all native.
So you don't need to have any external ETL tool to do all of
this.
It's all in the ClickHouse Cloud experience.
And prior to PeerDB, my experience is all in Postgres, right?
So I was working at this database startup called Citus Data,
which built a distributed Postgres database.
And that database got acquired by Microsoft.
So I spent eight years there,
helping customers implement Postgres.
So I've seen all the pain points around Postgres for analytics,
which is why I built this company, where we were making it easy
to move data from Postgres to warehouses. And now I'm working on the other side, which is ClickHouse, which makes analytics
blazing fast. So I would love to talk about Postgres and ClickHouse. So yeah.
Yeah. So Sai and Aaron, I'm really excited about talking about this Postgres topic as well,
because I think teams hit this wall and they're like, okay, this doesn't work anymore, what do I do?
And the thing they don't want to do
is have a bunch of different solutions for each thing, right?
They want like as few solutions as possible.
So I wanna talk about that.
Aaron, what's the topic that you wanna hit?
Perhaps we can touch on the,
just the diversity of use cases
that we're seeing emerge around this type of technology
and the convergence of a lot of these specialized databases,
and we've seen this now for the last,
let's call it five years,
where you have transactional databases
like Postgres or MySQL or Mongo.
You've got analytical databases like ClickHouse,
Apache Druid or Pinot, many others.
You've got relational databases, vector databases.
And you can kind of see these technologies
on a bit of a collision course.
And just the overlap between them
and what we're hearing from customers around
the desire to simplify the database infrastructure
to where they can have one or two databases satisfy
a lot of these different requirements.
What about you, Sai?
I'd love to talk about Postgres and ClickHouse.
And my experiences of what I have seen at Citus,
because Citus did build a real-time analytical database.
What were the challenges that we saw building stuff
within Postgres, and how we saw customers move to purpose-built
databases like ClickHouse.
We used to hear MemSQL also at that time.
We used to hear Snowflake.
I would love to share those experiences and yeah.
Great. Awesome.
Well, let's dig in. Tons to talk about. Yeah, let's do it.
Alright, that's a wrap for the prequel.
The full-length episode will drop Wednesday morning.
Subscribe now so you don't miss it.