The Data Stack Show - The PRQL: The Shortcomings of Apache Kafka with David Yaffe and Johnny Graettinger of Estuary
Episode Date: November 6, 2023
In this bonus episode, Eric and Kostas preview their upcoming conversation with David Yaffe and Johnny Graettinger of Estuary. ...
Transcript
Welcome to the Data Stack Show prequel.
This is a short bonus episode where we preview the upcoming show.
You'll get to meet our guest and hear about the topics we're going to cover.
If they're interesting to you, you can catch the full-length show when it drops on Wednesday.
This week's recording is with Johnny and Dave from estuary.dev, and I think this is
going to be a really fun conversation. It's a topic that we've actually covered quite a bit
on the show, which is streaming, you know, in particular real-time streaming.
But this is really in the context of, I think, what you use streaming for. And we really
dig into sort of the Kafka side of the conversation, which we haven't covered in depth a ton.
But part of the Estuary story is really reacting to real-time streaming needs, evaluating Kafka,
and seeing some pretty severe shortcomings,
which is why they built Estuary.
Now, what's really interesting to me is,
in many ways, they don't talk about Estuary
as a streaming service.
They kind of talk about it almost as real-time ETL,
which is fascinating.
There's some open-source technology under the hood,
and this is really, I think,
going to be an interesting conversation
because streaming is obviously a hot topic
and there are multiple technologies.
So really interested to see what the Estuary team has built.
Yeah, 100%.
It was a very fascinating conversation, actually,
for many different reasons.
First of all, it was pretty technical
and not solely in terms of talking about Estuary itself.
Actually, we had a very deep dive into Kafka,
how Kafka is built,
and some of the issues there
that actually Estuary is addressing,
like from the perspective of the architecture of the system.
Like, for example, we were talking about how compute and storage in Kafka
are, like, very tied together
and how this has been changed
by using something like Estuary.
And like, what does this mean
in terms of like managing the system
and, like, what type of use cases it enables.
So we did a very interesting architectural conversation around this type of system.
So anyone who is interested to understand better how Kafka and this type of streaming
systems are working, definitely should listen to that.
And then we talked a lot about also some important concepts
like CDC, right?
And why CDC is important,
how we use it,
and how they implemented it
because the standard out there
is pretty much like using
something like Debezium,
but the folks at Estuary
actually implemented everything
from scratch.
And they have some really good reasons
why they did
that, and they are talking, like, through these things. So amazing people, both Johnny and Dave, like,
very deep expertise in this type of technology, and we had an amazing conversation ranging from
the technical side of things up to the business side of things.
So I think everyone should listen to them and hopefully we're going to have them again in the future
because I don't think one hour was enough
to go through all the different topics
when it comes to streaming.
All right, that's a wrap for the prequel.
The full-length episode will drop Wednesday morning.
Subscribe now so you don't miss it.