The Data Stack Show - The PRQL: Who Needs a Stream Processing Engine?
Episode Date: November 7, 2022
In this bonus episode, Eric and Kostas preview their upcoming conversation with Zander Matheson of bytewax. ...
Transcript
Welcome to The Data Stack Show PRQL.
We just recorded a show with Zander from Bytewax.
Bytewax is a super interesting technology.
It's stream processing within the Python ecosystem.
One question I have for you, Kostas,
which we touched on a little bit in the show:
there are a lot of tools cropping up around stream processing.
How many companies really need this, though?
That's something interesting to me.
There are a lot of technologies popping up,
but it seems like they're primarily enterprise-level use cases.
What do you think? Is it going to trickle down?
Yeah, I think it will.
You have to keep in mind that many times we see technology getting adopted
by the enterprise primarily because the technology is not mature enough
to be adopted by a broader audience.
And enterprises have the resources to maintain this technology
and make it accessible to the whole organization, right?
Like getting something like Flink, setting it up, running it,
and doing that consistently, all that stuff. It's not easy, right?
It's the same with Apache Spark, sorry, Apache Kafka. That's why you have Confluent out there, right?
And you have the hosted solution around that.
So when it comes to data in general, data infrastructure, it's very natural to see the
enterprises being, let's say, the pioneers, because they have the resources
and the need, because of the volume or whatever, to go and do things first.
And I think as we see companies focusing more on the developer experience, we will see much broader adoption.
Now, is every shop out there going to need a stream processing engine?
Probably not.
I don't know.
But we will see.
I think there are use cases out there that are important.
I think anything that has to do with ML use cases, I mean, when you actually want
to use ML there, to create the features and
serve recommendations, all that stuff, that's where streaming is super
important. So I think as ML and AI get more and more adopted,
all those streaming tools will become more and more important,
together with the rest of the technologies that we have for batch processing and
more static data processing.
Yeah, I agree.
I love this.
I love it when we get into predictions because
it's very dangerous territory.
But also fun.
Crystal ball.
I think the other thing is that Zander gave a really interesting example of pulling in logs from a web server and processing them for some sort of use case. I can't remember exactly which. Part of the challenge now is that the individual components are accessible. For example, there's great CDC technology out there, right? Like, let's get logs. Okay, great, you have the logs, right?
Can you process those logs in a streaming format?
Okay, you have Bytewax in order to do that, right?
And you have the stuff downstream of Bytewax.
But even with modern tooling, it actually is still a lot of work.
Even though individually those things have gotten easier,
it's still hard to consume an entire use case, right? But imagine if you could just
literally hook your logs up to an end-to-end pipeline and, well, you get sessionization
at the end, right? So I think as the ecosystem evolves, and more of those use cases
are available out of the box, adoption will go up as well. Because
you may not need stream processing for everything. For example,
we don't necessarily need real-time reporting on
certain things at RudderStack, right?
But it is really nice if you can do it.
Right?
And if it's easy, then why not?
You don't have to wait on batch jobs and all that sort of stuff.
So anyways, it'll be really interesting to see how the ecosystem evolves. Great show with Zander.
Bytewax is super cool, so check out the repo. Subscribe if you haven't, and I will catch you
on the next one. Thank you.