The Data Stack Show - 46: A New Paradigm in Stream Processing with Arjun Narayan of Materialize
Episode Date: July 28, 2021

Highlights from this week’s episode include:

Introducing Arjun and how he fell in love with databases (2:51)
Looking at what Materialize brings to the stack (5:28)
Analytics starts with a human in the loop and comes into its own when analysts get themselves out and automate it (15:46)
Using Materialize instead of the materialized view from another tool (18:44)
Comparing Postgres and Materialize and looking at what's under the hood of Materialize (23:16)
Making Materialize simple to use (32:33)
Why Materialize doubled down on writing 100% in Rust (35:43)
The best use case to start with (42:03)
Lessons learned from making Materialize a cloud offering (44:22)
Keeping databases to the cloud for low latency (48:31)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
 Transcript
    
                                         Welcome to the Data Stack Show.
                                         
                                         Each week we explore the world of data by talking to the people shaping its future.
                                         
                                         You'll learn about new data technology and trends and how data teams and processes are
                                         
                                         run at top companies.
                                         
                                         The Data Stack Show is brought to you by RudderStack, the CDP for developers.
                                         
                                         You can learn more at rudderstack.com.
                                         
                                         Welcome back to the show. Today, we get to talk with the founder of a company building a database
                                         
                                         product. The company is called Materialize, and Arjun is the founder of the company. And I'm super
                                         
    
                                         interested to talk to him. I think, as I think about our audience, Kostas, the biggest question that comes to mind is: what are the immediate use cases for a tool like Materialize, which at a foundational level can take data jobs that are generally considered batch, running over a long period of time with a lot of latency, and essentially turn them into real-time jobs?
                                         
                                         Analytics is absolutely a use case that I think makes a ton of sense, but I'm sure that people
                                         
                                         are doing all sorts of other interesting things. So that's going to be my big question is,
                                         
                                         as far as use cases, analytics is obvious, but what else can you do when you go from batch to
                                         
                                         real-time in the context of a database? Kostas, you love Materialize.
                                         
                                         I cannot wait to hear what your burning questions are. Yeah, yeah. I mean, okay, first of all,
                                         
                                         what you have in your mind, I think it's a great question. Materialize is a very, let's say,
                                         
                                         novel way of interacting with data and consuming data. So it's very interesting to see what people
                                         
    
                                         are doing with it. So absolutely, I'm really looking forward to hearing about the use cases.
                                         
                                         I have a lot of questions myself, to be honest.
                                         
                                         I don't know how much we'll manage to cover today.
                                         
                                         Most of them are going to be technical.
                                         
                                         I want to learn more about the technology,
                                         
                                         like the secret sauce, let's say, behind Materialize as a database.
                                         
                                         And also, apart from technology, it's also a very interesting product.
                                         
                                         Like the ergonomics that this database has is very, very interesting.
                                         
    
                                         So I have quite a few different questions that will help us understand better the
                                         
                                         technology behind it and also some choices that the team has made in
                                         
                                         building this new database system.
                                         
                                         Okay. Well, let's jump in and talk with Arjun.
                                         
                                         Let's do it.
                                         
                                         Arjun, welcome to the Data Stack Show.
                                         
                                         We are very excited to talk with you because there's just so many data topics
                                         
                                         that we could cover in this conversation,
                                         
    
                                         and we probably won't have time to get through all of them, but welcome.
                                         
                                         Thank you very much. I'm excited to be on the show.
                                         
                                         Let's just start like we always do with, we'd love to know your background.
                                         
                                         I'm Arjun Narayan. I'm the co-founder and CEO of Materialize. Materialize is a streaming database
                                         
                                         for real-time applications and analytics. It allows you to get extremely complicated and complex analytics answers
                                         
                                         in real time on top of streams of data as opposed to once a day on top of batch data. It looks and
                                         
                                         feels exactly like a SQL database. I started Materialize a little over two and a half years
                                         
                                         ago. Before that, I worked in a different field of databases. I was
                                         
    
                                         a software engineer at Cockroach Labs working on CockroachDB, which is an OLTP scale out,
                                         
                                         horizontally scalable database. And before that, I did a PhD in distributed systems and big data
                                         
                                         processing. I've sort of lived, breathed, and been in data for a while, and a little bit by accident.
                                         
                                         I didn't intend to fall in love with databases, but as I learned more and more about how they
                                         
                                         power most of the applications and experiences that we have with computers, they just became
                                         
                                         endlessly fascinating to me.
                                         
                                         And I've spent a decade looking at databases at this point.
                                         
                                         I love that.
                                         
    
                                         With a PhD in anything related to databases, I would think that you have a lot of technical
                                         
                                         acumen.
                                         
                                         But, and I love the sentence, I didn't mean to fall in love with databases.
                                         
                                         I feel like that's the beginning of a novel that may have a very specific readership.
                                         
                                         Okay, Materialize, super interesting.
                                         
                                         I think a lot of our audience is very familiar with working sort of in and around your traditional
                                         
                                         database data warehouse, right?
                                         
    
                                         So Postgres, the usual suspects when it comes to data warehouses, you have Redshift, BigQuery,
                                         
                                         Snowflake is obviously taking over the market. And there are really common paradigms within that,
                                         
                                         you can run SQL, you can create views, et cetera. The syntax and stuff is a little bit
                                         
                                         different depending on the warehouse. But for our average listener who maybe, let's just take an
                                         
                                         example, they are a data engineer.
                                         
                                         They do a lot of work getting data into Snowflake.
                                         
                                         They create views.
                                         
                                         They create different use cases for analytics teams, et cetera.
                                         
    
                                         For that person who may not be familiar with Materialize, could you just paint a picture
                                         
                                         of if you introduce Materialize into the stack, what does that look like? And what are the key
                                         
                                         benefits that it brings? That's a great question. I think it helps to break down a standard paradigm
                                         
                                         of where most databases fit in, in the traditional worldview. And then we'll introduce how Materialize
                                         
                                         sort of brings some new capability that's different from what's currently in the market.
                                         
                                         So databases, and this is going
                                         
                                         back, say, several decades at this point, traditionally fall into two large buckets.
                                         
                                         There's the transactional databases and the analytics databases. So transactional databases
                                         
    
                                         are your Oracle, your Postgres, your MySQL. They're generally speaking focused on processing lots of transactions that may potentially be conflicting.
                                         
                                         They're sort of the point that decides what events are allowed to happen. So they reject
                                         
                                         some transactions, they accept some other ones, and then they're very good at writing those
                                         
                                         transactions down. So they're very focused on avoiding data losses. It's something you really,
                                         
                                         really want from your transactional
                                         
                                         database. Then you have your analytics databases, like your BigQuery, your Redshift, your Snowflake.
                                         
                                         Your analytics databases are more focused on enabling far more powerful compute. Typically,
                                         
                                         in SQL databases, people use SQL in both settings, in the transactional setting and the analytical setting.
                                         
    
                                         But if you take some of these complex queries, say one joining eight tables together, where at least some of these tables are very, very large.
                                         
                                         Those queries, if you ran them on a transactional database, the transactional database would A, most likely fall apart.
                                         
                                         And B, if it didn't fall apart, it would probably greatly slow down your other concurrent transactions.
                                         
                                         So there's a reason people mostly separate these systems.
                                         
                                         If an analyst types some large analytical query about last quarter sales, you don't
                                         
                                         want all your cart checkouts to triple in latency, right?
                                         
                                         So it makes sense.
                                         
                                         It makes perfect architectural sense to separate these concerns and then also build separate
                                         
    
                                         systems that are optimized for these different classes of workloads.
                                         
                                         The big, big thing that most people give up today is your analytics query runs on a dump of the data that is somewhat stale.
                                         
                                         So this is feeding your batch data warehouse with a once-a-day ETL. I mean, this is really ETL. ETL, extract-transform-load, is really
                                         
                                         transform load is about getting data out of the transactional system and putting it in the
                                         
                                         analytics system. It's getting less painful, but it used to be an extremely painful process. You
                                         
                                         would run it overnight, once a day. Some folks are now running this multiple times a day,
                                         
                                         but it is still fundamentally a batch operation, which means there's a large class of analytics or analytical-style queries
                                         
                                         that are incredibly valuable to have in real time, which don't make sense to run on a transactional
                                         
    
                                         database, but existing analytical databases or data warehouses are not equipped to do because
                                         
                                         they're fundamentally built in this batch paradigm. Materialize flips
                                         
                                         the setting a little bit, which is instead of computing your answer off of a data set from
                                         
                                         scratch when the query is presented to you, it pre-materializes some set of questions that you've
                                         
                                         pre-registered with Materialize. And this is why the company is
                                         
                                         named Materialize. You might be familiar with the term materialized views;
                                         
                                         the entire point of a materialized view is you tell the database, hey, I'm interested in asking
                                         
                                         this question on a repeated basis. Can you please pre-compute it for me as the data changes?
                                         
    
                                         In the past, materialized view support in most databases has been highly
                                         
                                         restricted, right? So you can do it for fairly simple queries, but if the query gets fairly
                                         
                                         complex, the database really wants you to ask it and then it'll go ahead and do the work rather
                                         
                                         than doing a whole bunch of redundant work that has to be immediately thrown away the moment the
                                         
                                         data changes. So under the hood, Materialize is an incremental query processor.
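Editor's illustration: the idea of an incremental query processor can be sketched in a few lines of Python. This is only a toy under stated assumptions, not how Materialize is implemented. The names `recompute_totals` and `IncrementalTotals` are hypothetical, and the "view" is a single grouped sum. It contrasts re-scanning all rows when a question is asked with updating a stored answer as each change arrives.

```python
from collections import defaultdict


def recompute_totals(rows):
    """Batch style: scan every row each time the question is asked."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


class IncrementalTotals:
    """Toy view that keeps SUM(amount) grouped by region up to date."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)

    def apply(self, region, amount, diff):
        """Apply one change: diff is +1 for an insert, -1 for a delete."""
        self.totals[region] += diff * amount
        self.counts[region] += diff
        if self.counts[region] == 0:
            # No rows remain in this group, so drop it from the view.
            del self.totals[region]
            del self.counts[region]

    def read(self):
        """Reading the answer is a lookup, not a scan."""
        return dict(self.totals)


rows = [("east", 10), ("west", 5), ("east", 7)]
view = IncrementalTotals()
for region, amount in rows:
    view.apply(region, amount, +1)

# A deletion touches one group instead of triggering a full rescan.
view.apply("west", 5, -1)
rows.remove(("west", 5))

# Both approaches agree; only the work done per change differs.
assert view.read() == recompute_totals(rows)
print(view.read())
```

The work per change here is proportional to the change itself rather than to the size of the data set, which is the property the conversation is describing.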
                                         
                                         And we can talk a little bit more about the technology because this is a thing,
                                         
                                         I don't think I'm describing anything that people haven't wanted for a very long time.
                                         
                                         The unique thing that we bring is a novel set of underlying research and technologies that allow
                                         
    
                                         this to happen in an elegant fashion. But Materialize allows you to ask
                                         
                                         these complex analytical queries at sub-second, say single-digit-millisecond, latency, even when
                                         
                                         these queries are very, very complex. This is more than just about taking some analytics query that
                                         
                                         you've asked once a day and making it a dashboard. Now, absolutely, a lot of our users start by taking something that they computed once a day, some very valuable
                                         
                                         metric, and making that into a dashboard so they could see it on a more real-time basis,
                                         
                                         especially in, say, the financial and the trading use cases. They can never have things fast enough.
                                         
                                         But the more interesting thing happens when you start using this live, changing data
                                         
                                         and taking automated actions off of it.
                                         
    
                                         So you could think alerting, you could think personalization in an application as you get
                                         
                                         real-time data, as opposed to realizing that somebody was a customer that should be segmented
                                         
                                         a certain way and then doing an email
                                         
                                         marketing campaign the next day by the time your OLAP job finished. There's a wide variety of
                                         
                                         uses where, if you can take action while a user is on your website or while a transaction is still
                                         
                                         pending before it has been authorized to clients, if it's a card transaction, it's much more valuable to make a precise judgment as to the
                                         
                                         quality of that user or that transaction within, say, a 10-to-100-millisecond budget versus doing
                                         
                                         that overnight and reacting to it the next day. Absolutely. I mean, this is fascinating. And
                                         
    
                                         we've had several conversations with different businesses where this is where they're heading with their architecture.
                                         
                                         And e-commerce comes to mind just because it's a situation where you have a lot of data.
                                         
                                         A lot of it needs to be enriched or combined with other data.
                                         
                                         So data from transactions or ML models and all of that's happening in some sort of database.
                                         
                                         And the challenge has been we're creating all this value out of the data that we have.
                                         
                                         And it's very difficult to deliver that with speed, right?
                                         
                                         And in e-commerce, if you want to send a personalized coupon right after purchase or something like that, that needs to happen very quickly.
                                         
                                         But the latency has
                                         
    
                                         been really high just due to technology. But that's changing. And that's really, really exciting.
                                         
                                         So super, super interesting. Absolutely. One of the things that we see is the amount of folks who
                                         
                                         are putting in the capabilities, and we're very much in the early stages of this architectural
                                         
                                         transformation, because folks are pretty much just
                                         
                                         putting in place the streaming infrastructure to move the data at low latencies and at high
                                         
                                         volumes. So this is doing change data capture out of their transactional databases on an ongoing
                                         
                                         basis so that milliseconds after a transaction commits in Postgres or MySQL, it is present in
                                         
                                         a Kafka topic that can be used for
                                         
    
                                         these downstream consumers or downstream applications. And the early adopters have
                                         
                                         gone ahead and built these manual microservices, right? So the absolute earliest adopters have
                                         
                                         adopted this microservice pattern, which comes at a huge cost, right? So not to mention just the
                                         
                                         development cost of building these manual microservices,
                                         
                                         but the ongoing maintenance and upkeep costs
                                         
                                         that these microservices introduce
                                         
                                         when you want to just say,
                                         
    
                                         change a little bit of business logic, right?
                                         
                                         So changing business logic sometimes takes a full quarter
                                         
                                         because you have to shut down or upgrade these microservices
                                         
                                         in a controlled fashion.
                                         
                                         And perhaps something that would be very simple in a database, like joining against another stream,
                                         
                                         ends up introducing a massive amount of architectural shift,
                                         
                                         as you now have to build and manually maintain an extra set of states that is introduced by adding on that third topic.
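Editor's illustration: the state that a hand-built streaming join has to carry can be sketched as follows. This is a minimal toy, assuming insert-only events and a hypothetical `StreamingJoin` class; it is not taken from any real microservice or from Materialize. Each side of the join needs its own index so that a new event on one stream can be matched against everything already seen on the other.

```python
from collections import defaultdict


class StreamingJoin:
    """Toy incremental join of two insert-only streams on a shared key."""

    def __init__(self):
        self.left = defaultdict(list)   # key -> rows seen on the left stream
        self.right = defaultdict(list)  # key -> rows seen on the right stream

    def on_left(self, key, row):
        """Store a left row and emit its matches against stored right rows."""
        self.left[key].append(row)
        return [(key, row, other) for other in self.right[key]]

    def on_right(self, key, row):
        """Store a right row and emit its matches against stored left rows."""
        self.right[key].append(row)
        return [(key, other, row) for other in self.left[key]]


join = StreamingJoin()
out = []
out += join.on_left("u1", {"order": 101})       # nothing to match yet
out += join.on_right("u1", {"segment": "vip"})  # matches order 101
out += join.on_left("u1", {"order": 102})       # matches the stored segment
print(out)
```

Even this two-stream version has to keep both inputs indexed indefinitely, and it ignores deletes, updates, and recovery after a restart. Joining against a third topic would mean another index plus handlers for every combination, which is the kind of hand-maintained state being described here.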
                                         
                                         So these are the sort of costs that people currently pay that we want to reduce.
                                         
    
                                         So we think that building these streaming microservices, streaming applications right
                                         
                                         on top of the stream should be as easy as building a CRUD app using a MySQL database.
                                         
                                         Today, it's not, but with Materialize, it is.
                                         
                                         Yeah.
                                         
                                         Well, I want to dig into some of the technical details because
                                         
                                         there are a lot of questions that Kostas and I talked about. But before we get there,
                                         
                                         you mentioned something around moving just beyond the basic analytics use case. And that's something
                                         
                                         I want to talk about briefly. People use the term digital transformation, which is a buzzword, but on the spectrum of digital transformation, you have companies who have figured out the analytics thing and they're relying on technology that is doing the batch,
                                         
    
                                         you know, is relying on the batch load paradigm, maybe with outdated tech.
                                         
                                         What are you seeing?
                                         
                                         Or, I mean, there are a lot of companies who I think could just benefit from the analytics
                                         
                                         use case in and of itself.
                                         
                                         But the real, the use cases that really move the needle are the ones where you're actually
                                         
                                         delivering personalization or other really dynamic customer
                                         
                                         experiences. But I'd just love to know what you're seeing as you talk with your customers and
                                         
                                         people who are interested in adopting something like Materialize, what's the balance? Are a lot
                                         
    
                                         of companies still trying to figure out the analytics use case or are there more companies
                                         
                                         than we think who are actually doing some really interesting things around the customer experience.
                                         
                                         That's an excellent question. To me, a large part of this comes from where your analytics team is.
                                         
                                         One of the amazing things that has been happening in the industry is analytics teams have become progressively more empowered
                                         
                                         to do more and more and create more value for their organizations and now are
                                         
                                         starting to get into building these applications or building part of something that ends up being
                                         
                                         surfaced in the core application. The way I think about this is analytics pretty much starts with a
                                         
                                         human in the loop and then analytics starts really coming into its own once the analysts
                                         
    
                                         themselves are trying to figure out how to get themselves out of the loop, right? And how to make these things automated. So I think a lot of the
                                         
                                         analytics journey to real time and streaming begins with augmenting the human capability by
                                         
                                         giving them a more live view, but where it truly comes into its own is when we start doing automated
                                         
                                         actions directly off that analytics pipeline.
                                         
                                         There's a huge benefit to everyone in the organization, whether it's the application
                                         
                                         or the analyst, speaking a common language in terms of defining the metrics
                                         
                                         that they've been thinking about in the exact same way. dbt is, of course, absolutely the leader in creating an ecosystem where an entire company's
                                         
                                         or an organization's data is modeled using a single unified paradigm. And starting from the
                                         
    
                                         analyst and then going towards the application, I think, is the correct way to do things. I
                                         
                                         absolutely encourage most folks to take their first steps by moving,
                                         
                                         say, a once a day refreshed dashboard into real time because, A, it's an enabler of a lot more
                                         
                                         things. And, B, it's a good way to ensure that all the application and real-time in-application
                                         
                                         experiences are fundamentally based on the exact same vocabulary that is already part
                                         
                                         of the analytical organization.
                                         
                                         Arjun, this is great.
                                         
                                         Actually, before I start asking my questions, I have to tell you, I really enjoyed your
                                         
    
                                         introduction.
                                         
                                         I think it was one of the best descriptions of the difference between the two database
                                         
                                         paradigms that we have, which is pretty common. Many people are asking about why do we need to have an analytics database
                                         
                                         and a transactional database.
                                         
                                         But that was amazing.
                                         
                                         If you haven't written a blog post or something about that,
                                         
                                         please go and do it.
                                         
                                         I think many people are going to thank you for it.
                                         
    
                                         But I have a couple of more technical questions that I want to ask.
                                         
                                         Let's start with the materialization.
                                         
                                         You mentioned that you also chose the name because of the concept of materialized views.
                                         
                                         Why would someone use Materialize and not just keep using the materialized views that a transactional
                                         
                                         database, for example, offers?
                                         
                                         Excellent. Well, thank you so much, Costas. I appreciate it. I should write a blog post.
                                         
                                         This is a great question in terms of why not just use the materialized view
                                         
    
                                         in, say, Postgres or MySQL? Well, the first answer is if your materialized view becomes
                                         
                                         the slightest bit complicated, you'll lose the ability to incrementally update it.
                                         
                                         So it's really about what is the update strategy for this materialized view? Because
                                         
                                         for a complex materialized view, let's say you're joining four tables together,
                                         
                                         you have some subquery in there, you have some non-trivial aggregation, maybe some max and some
                                         
                                         group by or something of that sort. The first thing an OLTP or even an OLAP
                                         
                                         database is going to tell you is you have to manually tell me when to refresh the materialized
                                         
                                         view. And then when you do that, I will essentially run the equivalent of a select query and then
                                         
    
                                         stash the result in a table for you to query. So it gains you almost
                                         
                                         nothing compared to
                                         
                                         repeatedly issuing select queries. The hard part, the technologically hard part is the reuse of
                                         
                                         previously computed results to efficiently update the materialized view. A good way to think about
                                         
                                         it is you want to do work proportional to the changes, not proportional to the query load.
                                         
                                         So if somebody asks a select query and very little has changed, you shouldn't force your
                                         
                                         database to do a massive quantity of work. If data has changed but does not affect the result, you
                                         
                                         want that to essentially be suppressed as early as possible. A good example of this is if I'm summing a bunch
                                         
    
                                         of rows and then somebody added a bunch of zeros, we should quickly detect that and not
                                         
                                         throw all our results out and recompute everything from scratch. A large amount of
                                         
                                         analytics workloads that happen in data warehouses today are fundamentally redundant queries where
                                         
                                         we are mostly recomputing the same answer. So if you
                                         
                                         have terabytes of data, most of this data is historical, right? Like big data is absolutely
                                         
                                         real, but it's primarily a phenomenon related to the amount of data we have collected. You don't
                                         
                                         have big data every second. Well, Google might have, but most organizations today, the amount of data that is coming in
                                         
                                         second by second is not that voluminous.
                                         
    
                                         But when your queries are fundamentally nonlinear, they're joining a bunch of different things,
                                         
                                         the database sort of looks at it and goes, well, I don't know what's changed.
                                         
                                         I kind of have to throw it all out and start over from scratch.
                                         
                                         And that's fundamentally the paradigm that we want to get away from.
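
                                         The contrast between refreshing a view from scratch and maintaining it incrementally can be sketched in a few lines of Python. This is a toy illustration only, not Materialize's implementation: the names `full_refresh` and `IncrementalSumView` and the sample data are hypothetical. It maintains the equivalent of a SUM with a GROUP BY, does work proportional to the size of each change, and suppresses changes (like adding zeros) that do not affect the result.

```python
from collections import defaultdict


def full_refresh(rows):
    """Recompute SUM(amount) GROUP BY key by scanning every row."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


class IncrementalSumView:
    """Toy incrementally maintained SUM ... GROUP BY view (hypothetical)."""

    def __init__(self):
        self.totals = defaultdict(int)

    def apply(self, changes):
        """Apply a batch of (key, amount, diff) changes.

        diff is +1 for an inserted row and -1 for a deleted row.
        Returns only the keys whose sum actually changed, so
        downstream consumers see nothing for no-op updates.
        """
        delta = defaultdict(int)
        for key, amount, diff in changes:
            delta[key] += amount * diff
        changed = {}
        for key, d in delta.items():
            if d != 0:  # e.g. adding zeros is suppressed here
                self.totals[key] += d
                changed[key] = self.totals[key]
        return changed


rows = [("a", 10), ("b", 5), ("a", 7)]
view = IncrementalSumView()
view.apply([(k, v, +1) for k, v in rows])

# A small change touches only the affected key, not the whole history.
rows.append(("b", 3))
changed = view.apply([("b", 3, +1)])

# Adding zeros changes the data but not the result, so nothing is emitted.
rows.append(("a", 0))
suppressed = view.apply([("a", 0, +1)])

print(changed)      # {'b': 8}
print(suppressed)   # {}
print(dict(view.totals) == full_refresh(rows))  # True
```

                                         The last line checks that the incrementally maintained totals agree with a full recomputation, which is the property a real system has to guarantee for much more complex queries.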
                                         
                                         That's great. Another question on that, why would I like to have incrementally updated
                                         
                                         views instead of having something like a caching layer and caching the results of a view?
                                         
                                         Well, the hard part is deciding when to invalidate your cache, right? So what you get from an
                                         
                                         incrementally updated materialized view is this logic is handled
                                         
    
                                         correctly, perfectly, without the user having to do anything more than think. One of the cute
                                         
                                         taglines we use internally is think declaratively, but execute incrementally. So it allows you to
                                         
                                         still think in terms of what's fundamentally the select query I'm trying to run. And then we think
                                         
                                         through all the hard parts of what is the data flow that has to happen under the hood, which parts of these
                                         
                                         are stateful, which are stateless, which ones invalidate cache. If you're building a microservice,
                                         
                                         you're going to have to reason about all of this yourself, build a microservice, a stateful
                                         
                                         microservice. And this is hard and you might get it wrong. And if you get it wrong,
                                         
    
                                         it's really subtle to debug. It's difficult. Generally speaking, most people use databases
                                         
                                         because inventing half a database that you happen to need for this particular use case is a risky
                                         
                                         thing to do and very hard to validate if you did it correctly. So we also find a solution to one of
                                         
                                         the hardest problems in computer science, right? When to invalidate the cache.
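
                                         To make the "which parts are stateful, which are stateless" distinction concrete, here is a small Python sketch. It is an illustration under assumptions, not how Materialize is built: `filter_op` and `IncrementalJoin` are hypothetical names, and updates are modeled as (row, diff) pairs where diff is +1 for an insert and -1 for a delete. The filter needs no memory, while the join has to remember both inputs so that a change on one side can be matched against everything seen so far on the other. Deletions flow through as negative diffs, so there is no separate cache to invalidate.

```python
from collections import defaultdict


def filter_op(updates, predicate):
    """Stateless operator: each update is handled on its own."""
    return [(row, diff) for row, diff in updates if predicate(row)]


class IncrementalJoin:
    """Stateful operator: keeps an index of both inputs by join key."""

    def __init__(self):
        # key -> value -> number of copies currently present
        self.left = defaultdict(lambda: defaultdict(int))
        self.right = defaultdict(lambda: defaultdict(int))

    def push_left(self, updates):
        return self._push(updates, self.left, self.right, left_side=True)

    def push_right(self, updates):
        return self._push(updates, self.right, self.left, left_side=False)

    def _push(self, updates, mine, other, left_side):
        out = []
        for (key, value), diff in updates:
            # Match the change against what the other side has seen so far.
            for other_value, count in other.get(key, {}).items():
                if count != 0:
                    pair = (value, other_value) if left_side else (other_value, value)
                    out.append(((key, *pair), diff * count))
            mine[key][value] += diff
        return out


join = IncrementalJoin()
# Left input: (user_id, name). Right input: (user_id, item).
out1 = join.push_left([((1, "ada"), +1)])
out2 = join.push_right([((1, "book"), +1)])
out3 = join.push_left([((1, "ada"), -1)])  # deleting the user retracts the joined row
books = filter_op(out2, lambda row: row[2] == "book")

print(out1)   # []
print(out2)   # [((1, 'ada', 'book'), 1)]
print(out3)   # [((1, 'ada', 'book'), -1)]
print(books)  # [((1, 'ada', 'book'), 1)]
```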
                                         
                                         So that's great.
                                         
                                         Yeah, exactly.
                                         
    
                                         It's that and naming things.
                                         
                                         Yeah, yeah.
                                         
                                         All right.
                                         
                                         So what's the secret sauce?
                                         
                                         What's the magic?
                                         
                                         Like what is different in Materialize compared to
                                         
                                         what Postgres is doing, which is, I don't know, probably
                                         
                                         one of the most complex databases ever built.
                                         
    
                                         It's been built for like the past 30 years or something, right?
                                         
                                         So what's new and what is different with Materialize?
                                         
                                         That's an excellent question.
                                         
                                         First, I don't want to talk negatively about Postgres, so I'm actually going to take the flip side
                                         
                                         of the question.
                                         
                                         It's like, what does Postgres do that we can't do, right?
                                         
                                         So Postgres is a great OLTP database. In fact, we love it very much in the engineering team at Materialize
                                         
                                         because Materialize speaks, as closely as possible, wire-compatible Postgres. So for an application
                                         
    
                                         that's talking to Materialize, you use Postgres client drivers, you use the Postgres native
                                         
                                         language bindings and it'll all just work. So we're huge fans of Postgres.
                                         
                                         Postgres is a great OLTP database.
                                         
                                         What Postgres does very well that we don't do is transaction isolation and concurrency
                                         
                                         control.
                                         
                                         So if you have, say, a unique index or a primary key field and you have two people racing to
                                         
                                         commit transactions, Postgres will ensure that only one of them succeeds, right? It's great. It's great at this conflict resolution and consistency aspects of
                                         
                                         the ACID properties that you want from a database. What we're very good at is computing these
                                         
    
                                         denormalizations, these complex views, and keeping them incrementally up to date.
                                         
                                         And we actually work very, very well downstream of Postgres. So one way that some of our users deploy Materialize is they have Materialize
                                         
                                         essentially acting as a read replica, right? So Materialize connects directly to Postgres,
                                         
                                         the transactions, all the writes land in Postgres, and then get immediately replicated within a
                                         
                                         millisecond or a few to Materialize. And then Materialize gets to maintain all these
                                         
                                         rich analytical indexes that essentially are kept incrementally updated as soon as the data comes in.
                                         
                                         This way, the writes go to Postgres and the complicated reads go to Materialize; essentially it offloads
                                         
                                         compute from Postgres. Now, how do we actually do this? So under the hood, Materialize is built on this state-of-the-art
                                         
    
                                         stream processing platform called Timely Dataflow. Now, Timely Dataflow was invented
                                         
                                         or co-invented by my co-founder, Frank McSherry, who has done a lot of stream processing research
                                         
                                         for, I think, coming up on seven to eight years now. Timely data flow is a fully horizontally scalable
                                         
                                         stream processing framework on which we've built query planning and data flow planning such that we
                                         
                                         can take an arbitrary SQL statement or a SQL view definition and convert it down into a persistent
                                         
                                         data flow that is
                                         
                                         horizontally scaled out on this timely data flow cluster.
                                         
                                         We do have some, there are some folks who use timely data flow directly as a stream
                                         
    
                                         processing library.
                                         
                                         It's an open source project, but most people don't want to do this, right?
                                         
                                         Timely data flow is written in Rust.
                                         
                                         You don't necessarily want to build and write Rust data flows and manually orchestrate them. So we think there's a large market for people who want those benefits of that incrementally updated, high-performance, scale-out, blah, blah, blah ... for several decades, which is they write and define SQL queries, and these SQL queries just stay alive, and they don't really think about it,
                                         
                                         and these things just stay alive forever as the data changes.
                                         
                                         That's very interesting.
                                         
                                         So how is timely data flow different compared to other solutions out there
                                         
                                         like Flink, Databricks, and all the rest of the stream processing platforms that we have seen in the
                                         
    
                                         market until now? That's great. So first off, I'm going to do sort of a bad job answering this,
                                         
                                         but there's a wonderful research paper called "Naiad: A Timely Dataflow System,"
                                         
                                         which won several academic awards, that lays the foundational case for timely data flow and how
                                         
                                         it's novel. There's a few things, not all of which we currently take advantage of in Materialize
                                         
                                         today, but a good example is timely data flow is capable of reasoning about cyclic data
                                         
                                         flows, whereas most other data flow models are purely acyclic.
                                         
                                         It is extremely expressive, almost to a fault.
                                         
                                         So driving timely data flow around is hard and
                                         
    
                                         something that we take a lot of pains to do correctly at Materialize in the Materialize
                                         
                                         database layer. It is data parallel across a sharded data flow graph in a way that most other
                                         
                                         data flow engines are not. So today, most data flow systems, say Flink or Spark streaming,
                                         
                                         the primary way in which they scale across to use many more compute resources is by taking
                                         
                                         various operators of the graph and placing them on dedicated CPU resources and flowing data from
                                         
                                         a data flow node to another data flow node. So if you think about that, so a good way to get intuition for this is let's say you
                                         
                                         have two sources of data, each of which have some map operation, and then there's a join
                                         
                                         operation and then there's some subsequent map or filter or things like that.
                                         
    
                                         Each of these things forms this graph of computation, and each one of those nodes gets its own
                                         
                                         dedicated compute resource.
                                         
                                         Timely data flow is sharded in a very, very different
                                         
                                         model that results in much higher performance, particularly in cases where you
                                         
                                         have very, very large data flows. So let's say you have a SQL query that has eight different input
                                         
                                         streams, complex sub queries, things like that. The actual execution graph of this may actually
                                         
                                         be hundreds of nodes.
                                         
                                         You as a user may not care. You just want that SQL to be incrementally updated.
                                         
    
                                         Getting that data flow graph to get high performance in some of these other stream processing systems is very, very hard. Whereas with timely data flow, because of the way it
                                         
                                         scales up and has this shared cooperatively scheduled data flow execution model,
                                         
                                         it is far, far more performant.
                                         
                                         For more details, I would point you to the research paper,
                                         
                                         because I'm struggling a little bit to convey some of the nuances
                                         
                                         without the reference to some diagrams and some slides.
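
                                         In place of the diagrams, the data-parallel strategy described above can be roughly sketched in Python. This is a simplified illustration, not Timely Dataflow's actual API: `Worker`, `route`, and the sample data are hypothetical. Instead of giving each operator (map, join, filter) its own compute resource, every worker runs the whole data flow on its own shard of the data, and records are routed to workers by a hash of the key so that matching rows meet on the same worker.

```python
import zlib
from collections import defaultdict


def route(key, num_workers):
    """Pick the worker that owns a key (stable hash, unlike built-in hash())."""
    return zlib.crc32(str(key).encode()) % num_workers


class Worker:
    """Runs every operator of the data flow, but only for its shard."""

    def __init__(self):
        self.users = {}                   # join state for this shard
        self.orders = defaultdict(list)

    def on_user(self, user_id, name):
        self.users[user_id] = name.upper()   # map operator

    def on_order(self, user_id, item):
        self.orders[user_id].append(item)

    def results(self):
        # Join operator, local to the shard: no cross-worker lookup needed.
        return [(self.users[u], item)
                for u, items in self.orders.items() if u in self.users
                for item in items]


workers = [Worker() for _ in range(4)]
for user_id, name in [(1, "ada"), (2, "bob")]:
    workers[route(user_id, 4)].on_user(user_id, name)
for user_id, item in [(1, "book"), (2, "pen"), (1, "lamp")]:
    workers[route(user_id, 4)].on_order(user_id, item)

joined = sorted(r for w in workers for r in w.results())
print(joined)  # [('ADA', 'book'), ('ADA', 'lamp'), ('BOB', 'pen')]
```

                                         Because users and orders with the same key always land on the same worker, adding workers spreads the whole graph across more cores rather than pinning individual operators to dedicated resources.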
                                         
                                         Yeah, yeah, makes sense, makes sense.
                                         
                                         I mean, I was aware of the Naiad paper and also the timely data flow model.
                                         
    
                                         But I think it's something that people out there,
                                         
                                         the community out there, are not that aware of.
                                         
                                         So I think the more we can communicate and talk about it,
                                         
                                         I think the better it is for everyone to start understanding,
                                         
                                         thinking in new terms, right?
                                         
                                         Because as you said,
                                         
                                         timely data flow is like a different paradigm
                                         
                                         of how you can process.
                                         
    
                                         And whenever we introduce a new paradigm,
                                         
                                         it takes a lot of repetition
                                         
                                         from the people who know about it
                                         
                                         and they evangelize this
                                         
                                         to help the people out there understand.
                                         
                                         And actually, it's very interesting
                                         
                                         because we had an episode pretty recently
                                         
                                         with CockroachDB. And one of the topics
                                         
    
                                         that we were discussing was how important it is today for the engineers out there to start
                                         
                                         thinking more about taking some elements from distributed computing and start
                                         
                                         incorporating them in the way you think as an engineer or as a developer,
                                         
                                         right?
                                         
                                         And I think this is one of the values that we, as people here, are sitting together and
                                         
                                         discussing about interesting technical topics that we can offer to our audience out there,
                                         
                                         how we can give them some guidance of, yeah, you know something, there's a different way
                                         
                                         that data can be processed out there.
                                         
    
                                         Maybe you should start also trying to think into this or yeah you might be like a web developer or
                                         
                                         like a front-end developer but still if you start thinking and using some of like the patterns that
                                         
                                         come from like distributed systems probably it can help you with your work and also can help you work
                                         
                                         much better with the back ends that probably are distributed behind the scenes. So that's why I find it always very, very valuable to discuss a little bit more technical
                                         
                                         details.
                                         
                                         Absolutely.
                                         
                                         I strongly agree.
                                         
                                         I think it's very important for developers building and using systems like this to understand
                                         
    
                                         and appreciate what the right principles are.
                                         
                                         One, so they can choose the right technologies to work with or the appropriate technologies
                                         
                                         for the problems that they're solving.
                                         
                                         But one of the things we maybe struggle with, and I appreciate you pushing a little bit
                                         
                                         on this, is to what extent should we encapsulate and hide the complexity versus unwrap and
                                         
                                         show the complexity?
                                         
                                         So one of the big advantages of Materialize is you don't have to know, you just write SQL,
                                         
                                         but there's a sort of inherent tension where,
                                         
    
                                         you know, actually, A, everyone is interested
                                         
                                         and definitely wants to know,
                                         
                                         and B, maybe understanding will get you
                                         
                                         the right intuitions for what computations
                                         
                                         you can even execute
                                         
                                         and how to go about choosing the right architecture
                                         
                                         to build, and which systems you can incorporate
                                         
                                         and not incorporate in your architecture.
                                         
    
                                         Absolutely. I totally agree.
                                         
                                         So, Arjun, you mentioned that by incorporating
                                         
                                         this new Timely Dataflow processing model,
                                         
                                         Materialize manages to be very performant
                                         
                                         compared to the rest of the solutions out there
                                         
                                         for stream processing.
                                         
                                         What kind of resources should someone who wants to start using it today
                                         
                                         consider for setting up the open source version of Materialize?
                                         
    
                                         So we aim to make Materialize very simple to use.
                                         
                                         So you go to our website or our GitHub, you click the download button,
                                         
                                         and you can run this on a single node.
                                         
                                         You can scale this node up.
                                         
                                         In fact, if you get a larger-sized VM
                                         
                                         and you run Materialize on it, you can ingest
                                         
                                         a million messages a second.
                                         
                                         You can install dozens of views and so on before even needing to consider
                                         
    
                                         whether you need a multi-machine setup as part of
                                         
                                         making it easy to graduate beyond this.
                                         
                                         In fact, you know, you will be very productive on a single node database.
                                         
                                         We really go to great lengths to make it as easy to use as a database, right?
                                         
                                         So you run it on a single node, you connect to it using a SQL shell or a SQL driver in
                                         
                                         your language.
                                         
                                         The lived experience is very much like Postgres, right?
                                         
                                         Like, this is how most people run Postgres: they run brew install Postgres or apt-get
                                         
    
                                         install Postgres and they run it and then it's living in a VM by itself in a cloud for
                                         
                                         years of uptime.
                                         
                                         So that's really the easiest way to get started.
                                         
                                         We are building a cloud service, which we are launching publicly next month, which allows
                                         
                                         folks to get even more advanced features.
                                         
                                         So some of the features that we will be shipping in our cloud product is horizontal scalability,
                                         
                                         where you have these very, very large data volumes, well north of a million messages per second, for instance.
                                         
                                         And you do need multiple machines in a horizontally scaled setup to absorb that data volume.
                                         
    
                                         And then, second, replication, right?
                                         
                                         So if you have extremely high availability needs,
                                         
                                         you're going to want multiple servers set up in an automatic failover capacity.
                                         
                                         And that's something that our cloud product will,
                                         
                                         not next month, but down the road, also support.
                                         
                                         That's great.
                                         
                                         And I'm very excited to hear that you are launching
                                         
                                         a cloud version of the product.
                                         
    
                                         And I want to ask you more about this. But before we go there, because we are going to spend some
                                         
                                         time on it, I have a question that I don't want to forget to ask. And that's about, you mentioned
                                         
                                         at some point that Timely Dataflow is implemented in Rust. So how did you decide to use Rust?
                                         
                                         What's the reason behind that?
                                         
                                         I think the original reason was Frank,
                                         
                                         when he started coding Timely Dataflow,
                                         
                                         he had just left Microsoft Research
                                         
                                         and he had been coding for a while
                                         
                                         in the sort of .NET ecosystem.
                                         
                                         He wanted to try something new
                                         
                                         and Rust was a beta programming language at the time,
                                         
                                         a very risky thing, but he was just playing around. I think a lot of these open source
                                         
                                         projects, they start that way. So Timely Dataflow was coded in Rust. Now I think
                                         
                                         for highly data intensive applications, the best choices are Rust or C++ because the manual memory management and control is quite important
                                         
                                         for predictable low latency experience. I think there are some places that have gotten good
                                         
    
                                         at writing in Go. Go is a garbage-collected language without manual memory management.
                                         
                                         So I had some experience
                                         
                                         because I was a software engineer
                                         
                                         at Cockroach.
                                         
                                         CockroachDB is written in Go.
                                         
                                         We struggled with it a little bit.
                                         
                                         I don't think it's impossible.
                                         
                                         I think you can definitely,
                                         
    
                                         with enough sweat and effort,
                                         
                                         essentially drive the garbage collector
                                         
                                         around to do the kinds of things
                                         
                                         that you would have wanted to do
                                         
                                         in a manually managed environment.
                                         
                                         There's pluses and minuses.
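
To illustrate the point about memory control and predictable latency, here is a minimal Rust sketch of deterministic cleanup. It is not taken from Timely Dataflow or Materialize; the `Buffer` type, the `EventLog` alias, and `process_batches` are hypothetical names used only for this example. The idea it shows is that a value is freed at a known point in the program (end of scope, or an explicit `drop`), rather than whenever a garbage collector next runs.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log of events so the order of allocation and cleanup is observable.
type EventLog = Rc<RefCell<Vec<String>>>;

// Hypothetical buffer type standing in for a batch of in-memory data.
struct Buffer {
    name: &'static str,
    data: Vec<u8>,
    log: EventLog,
}

impl Buffer {
    fn new(name: &'static str, size: usize, log: &EventLog) -> Buffer {
        Buffer { name, data: vec![0u8; size], log: Rc::clone(log) }
    }
}

impl Drop for Buffer {
    // Runs at the exact point where the value goes out of scope or is dropped.
    fn drop(&mut self) {
        let line = format!("freed {} ({} bytes)", self.name, self.data.len());
        self.log.borrow_mut().push(line);
    }
}

fn process_batches() -> Vec<String> {
    let log: EventLog = Rc::new(RefCell::new(Vec::new()));
    {
        let _batch = Buffer::new("batch-1", 1024, &log);
        log.borrow_mut().push("processing batch-1".to_string());
    } // `_batch` is freed here, before any later work starts.
    log.borrow_mut().push("between batches".to_string());

    let batch = Buffer::new("batch-2", 2048, &log);
    drop(batch); // Explicit early release, still at a known point.
    log.borrow_mut().push("done".to_string());

    // Copy the events out in a separate statement so the temporary borrow
    // of `log` ends before `log` itself is dropped.
    let events = log.borrow().clone();
    events
}
```

Calling `process_batches()` returns the events in program order, with each "freed" entry appearing exactly where the corresponding buffer left scope or was dropped.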
                                         
                                         Rust, we doubled down on Rust
                                         
                                         when we built Materialize
                                         
    
                                         because one of the things we could have done is we could have left Timely Dataflow as a Rust
                                         
                                         underlying engine layer, and then built the Materialize database management layer in a
                                         
                                         different language. And when we looked at that design decision, we thought about it a little bit
                                         
                                         and we came to the conclusion that Rust was actually pretty great. And we were quite happy
                                         
                                         to build it on Rust at all layers of the stack. So Materialize is 100%
                                         
                                         written in Rust. And we're quite happy with that. I mean, I'm happy to go into like, more detail as
                                         
                                         to our experience building in Rust and maybe contrasting a little bit to the Cockroach
                                         
                                         experience in Go as well. Yeah, that's very interesting. And I'm asking you because Rust is
                                         
    
                                         like a pretty young language, but it's gaining
                                         
                                         a lot of traction lately, and it's a very interesting language also from a, let's say,
                                         
                                         research perspective, in terms of what kind of primitives they've added there in order to do
                                         
                                         this kind of memory management. It's, of course, very interesting
                                         
                                         to see that it's starting to be used for systems that get into production and in products that
                                         
                                         are delivered out there. So that's why I was very interested to hear your opinion about Rust. And something else
                                         
                                         about Rust, but from the perspective of being a founder and building teams, right? So
                                         
                                         how easy is it today to find developers out there that can write in Rust or who are willing
                                         
                                         to write in Rust? Right. So we don't expect our engineers to
                                         
                                         know Rust when they join, although many of them do, certainly not all. We find that it takes a
                                         
                                         reasonable amount of time on the order of a few months to get productive in Rust. This is probably
                                         
                                         the biggest cost that we pay as an organization for building a product in Rust: there is a bit
                                         
                                         of a ramp up time that we have to pay,
                                         
                                         but that's fine. It is not difficult to find people who want to work in Rust. In fact,
                                         
                                         I would say it's a significant attraction to several engineers, maybe because they've written
                                         
                                         C++ code and they've lost so many weeks of their life to chasing down some memory leak or some manual memory management bug,
                                         
                                         and they want to move to a language or an environment where they get the benefits of
                                         
                                         manual memory management, the performance, and they also don't have to deal with that class of
                                         
                                         bugs. So we find quite a few people are very excited to work in Rust, although we do have
                                         
                                         to take some time to let them ramp up.
                                         
                                         And what is the reason that it takes a couple of months
                                         
                                         to start being productive in Rust?
                                         
                                         And that's probably also the...
                                         
    
                                         Sorry for interrupting you, but I think this is probably
                                         
                                         one of the main contrasts with Go,
                                         
                                         because one of the benefits that I hear,
                                         
                                         at least from engineering managers, about Go is that
                                         
                                         it doesn't take that much time to be productive in Go.
                                         
                                         But why does Rust have that?
                                         
                                         It takes five to six months to get productive.
                                         
                                         I wouldn't go so far as to say five to six.
                                         
    
                                         I think it's more like two to three months,
                                         
                                         assuming we have an experienced software engineer who has been building backend or distributed systems,
                                         
                                         which pretty much all the engineers that we hire fit that mold.
                                         
                                         The primary difficulty,
                                         
                                         and by the way, having worked in Go
                                         
                                         and at Cockroach Labs,
                                         
                                         most people can be productive in Go in under one week.
                                         
                                         It's a truly incredibly concise language
                                         
    
                                         to get productive in.
                                         
                                         It's sort of, I would almost say,
                                         
                                         optimized for productivity.
                                         
                                         The primary difficulty with Rust is that most folks have a little bit of an adversarial
                                         
                                         engagement with the compiler.
                                         
                                         It can be a little bit frustrating. Essentially, what you are doing when you're writing a Rust
                                         
                                         program is giving it sufficient type annotations that it is able to prove that certain classes of memory bugs
                                         
                                         are absent. So it's a little bit like you are guiding a not very smart computer, because
                                         
                                         it's not a human, to follow a proof. And there's a little bit of it being too dumb to see that the code
                                         
                                         you've written does not have a memory leak. This is often called fighting the borrow
                                         
                                         checker. So the borrow checker is a part of the compiler that yells at you. And there's this
                                         
                                         standard failure mode of like fighting the borrow checker for a while until you fully internalize
                                         
                                         the limited ways in which the borrow checker thinks. And then you know, oh, this is where I
                                         
                                         should probably add this annotation or do this thing or use this pattern in order to get through the compilation step.
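
To make "fighting the borrow checker" concrete, here is a minimal Rust sketch of the two situations described above: adding an annotation, and using a different pattern. The function names `longer` and `push_and_first_len` are hypothetical and chosen only for illustration; the rejected code is shown in comments so the block still compiles.

```rust
// "Add this annotation": without the explicit lifetime `'a`, this function
// is rejected, because the compiler cannot tell whether the returned
// reference borrows from `a` or from `b`.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// "Use this pattern": a common borrow-checker complaint is holding a
// reference into a collection while also mutating that collection.
fn push_and_first_len(values: &mut Vec<String>, item: String) -> usize {
    // Rejected version (error E0502):
    //     let first = &values[0];   // shared borrow of `values` starts
    //     values.push(item);        // mutable borrow while `first` is alive
    //     first.len()               // `first` is still used afterwards
    //
    // Accepted version: copy out what is needed so the shared borrow ends
    // before the mutation begins.
    let first_len = values.first().map_or(0, |s| s.len());
    values.push(item);
    first_len
}
```

The rejected version would be risky in practice: `push` may reallocate the vector, which would leave `first` pointing at freed memory. The compiler cannot know whether a reallocation will happen, so it refuses the whole pattern.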
                                         
                                         The other thing I didn't mention is, and this is a place where I say, given the novelty of Rust, this is a negative.
                                         
                                         There just aren't that many libraries and pre-existing tools that you can draw on from a rich open source ecosystem.
                                         
                                         It's very different from Go.
                                         
                                         In Go, pretty much if you're looking for some compatibility to some driver or some
                                         
                                         library or some parsing library or some security thing, like it's a very rich, mature ecosystem
                                         
                                         compared to Rust where oftentimes we've had to, there's at least a couple instances I
                                         
                                         can think of where we had to write a library from scratch,
                                         
                                         whereas if we were writing in Go,
                                         
                                         we would have used an off-the-shelf one.
                                         
                                         Yeah, makes sense.
                                         
    
                                         Although from my limited experience with Rust,
                                         
                                         I have to say that Cargo is a very nice experience
                                         
                                         for package management.
                                         
                                         So yeah, there are always trade-offs.
                                         
                                         It's a young language, right?
                                         
                                         It takes time for the community to build everything there.
                                         
                                         But with the traction, I think it will catch up pretty fast.
                                         
                                         For sure.
                                         
    
                                         And also some of these things that I'm saying,
                                         
                                         they're not going to be downsides for people coming after
                                         
                                         because there will be more software engineers
                                         
                                         who are already fluent in Rust.
                                         
                                         And hopefully we are a contributor as well,
                                         
                                         adding some of these libraries that we've open sourced
                                         
                                         and other people as well.
                                         
                                         So a year from now, it'll be even easier.
                                         
    
                                         So these are just growing pains. Yep, absolutely. You'd asked about how to get started with
                                         
                                         Materialize. And I just wanted to jump in really quickly because we talked about,
                                         
                                         obviously, the open source offering and then super exciting that you're launching cloud.
                                         
                                         Arjun, one quick question. I'm just thinking about our audience here.
                                         
                                         What use case would you encourage them to start with?
                                         
                                         within a matter of a couple of days and really validate that the technology is capable of taking arbitrary SQL that you have,
                                         
                                         business logic in your organization,
                                         
                                         and move it to real time.
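
As a rough illustration of what "moving SQL to real time" means, the following Rust sketch keeps the result of a query like `SELECT customer, COUNT(*), SUM(amount) FROM orders GROUP BY customer` up to date as individual changes arrive, instead of recomputing it from scratch. This is a toy under stated assumptions, not Materialize's implementation: the `Update` and `OrderTotals` types, the `orders` table, and its columns are all hypothetical. Each change carries a signed multiplicity (`diff`), +1 for an inserted row and -1 for a deleted one.

```rust
use std::collections::HashMap;

// One change to the hypothetical `orders` input: a row plus a signed
// multiplicity (+1 for an insert, -1 for a delete).
struct Update {
    customer: String,
    amount: i64,
    diff: i64,
}

impl Update {
    fn new(customer: &str, amount: i64, diff: i64) -> Update {
        Update { customer: customer.to_string(), amount, diff }
    }
}

// Toy incrementally maintained view: per customer, (row count, sum of amount).
#[derive(Default)]
struct OrderTotals {
    by_customer: HashMap<String, (i64, i64)>,
}

impl OrderTotals {
    // Applies one change in time proportional to the change, not the table.
    fn apply(&mut self, update: &Update) {
        let entry = self
            .by_customer
            .entry(update.customer.clone())
            .or_insert((0, 0));
        entry.0 += update.diff;
        entry.1 += update.diff * update.amount;
        // A group with no remaining rows disappears from the result.
        let now_empty = entry.0 == 0;
        if now_empty {
            self.by_customer.remove(&update.customer);
        }
    }

    // Reads the current answer for one customer without recomputation.
    fn get(&self, customer: &str) -> Option<(i64, i64)> {
        self.by_customer.get(customer).copied()
    }
}
```

After applying two inserts for the same customer with amounts 10 and 5, `get` returns a count of 2 and a sum of 15; applying the matching deletes shrinks and then removes the group. A real system has to do this for arbitrary SQL, including joins, which is the hard part this sketch leaves out.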
                                         
                                         And then that's a position from which
                                         
    
                                         we can think through the more complex things
                                         
                                         like actioning or integrating this into a pipeline
                                         
                                         that sort of is part of an application experience.
                                         
                                         But getting this value in as short time as possible
                                         
                                         is what I would encourage folks.
                                         
                                         And that pretty much means some pre-existing business logic
                                         
                                         or a pre-existing dbt model.
                                         
                                         Since Materialize has a dbt plugin,
                                         
                                         you should be able to take your pre-existing dbt model
                                         
                                         and make it work on Materialize, ideally in a single day.
                                         
                                         Oh, very cool.
                                         
                                         Wow, I mean, that's extremely
                                         
                                         fast time to value. And then just one more quick tactical question for our listeners.
                                         
                                         Do they just go to materialize.com to get notified about the launch of the cloud product?
                                         
                                         Yes, that will be front and center on our homepage. And in the meanwhile, you can download
                                         
                                         the source available free product from there as well.
                                         
    
                                         Sure. Great. Okay. Sorry, Costas.
                                         
                                         I know we're close to time. We have another.
                                         
                                         But I constantly think about our listeners, and I love learning about new technologies. And I just want them to get the fastest way to understand how they can get in and kick the tires on it.
                                         
                                         Yeah, absolutely. And it was very good that you asked these questions, Eric, because it's time to spend a little more time on the cloud version of Materialize.
                                         
                                         for a product or like a framework,
                                         
                                         like Materialize,
                                         
                                         like things that you expected beforehand and didn't happen
                                         
                                         and things you didn't expect,
                                         
                                         but they happened.
                                         
    
                                         Like anything interesting
                                         
                                         that you can share with us
                                         
                                         about this process
                                         
                                         of turning this amazing piece of technology
                                         
                                         into a cloud offering.
                                         
                                         Absolutely.
                                         
                                         The first one I would say
                                         
                                         is the biggest reason why we're building a cloud product is
                                         
    
                                         by far, we talk to our users, we talk to prospective users,
                                         
                                         we talk to basically everyone in the industry, there's a wide consensus
                                         
                                         that everyone wants to use a managed cloud offering
                                         
                                         of pretty much all of the technologies that they use
                                         
                                         because running and upgrading and manually maintaining these things
                                         
                                         is not something that most people are interested in doing, particularly as things get more, well,
                                         
                                         let me put it this way: you'd much rather have somebody else carrying a pager than carry
                                         
                                         a pager. The more mission critical this gets, the less you want to be in charge of carrying
                                         
    
                                         that pager when that system might go down. In terms of building a cloud service, one of the things that's very exciting, and this
                                         
                                         is particularly true for companies like ours, where we're building this from day one, knowing
                                         
                                         this, that the cloud product is the predominant way in which we are going to be successful as a business, is you get to think in
                                         
                                         terms of atomic components that are cloud native. A very, very good example of this is separating
                                         
                                         storage from compute. So storage in an infinitely scalable, extremely low-cost service,
                                         
                                         namely S3 or the S3 equivalents on the other major clouds, is available, and the extremely
                                         
                                         high durability and extremely strong guarantees that you get from these services are a building block
                                         
                                         that you can build, say, a database around. That means that there's an entire class of problem
                                         
    
                                         that you don't have to engineer for, namely data loss or data corruption or replication
                                         
                                         or things like that. You can rely on this atomic unit of an S3 bucket being the principal storage
                                         
                                         layer for the vast majority of your data. And what this means is, of course, that instead of spending
                                         
                                         your engineering budget solving the same problem that everyone has had to solve pre-cloud,
                                         
                                         you get to use it to solve new problems.
                                         
                                         Another one that you get is the ability to use other services that are cloud native, say,
                                         
                                         for other components.
                                         
                                         So a good example of this is going back to Postgres: Materialize Cloud uses highly available
                                         
    
                                         Postgres nodes under the hood for certain classes of metadata and things like that. Whereas otherwise, if we were
                                         
                                         building a fully on-premise piece of software, getting this highly available would be a long
                                         
                                         engineering challenge. At the same time, we love users who just want to use the source-available
                                         
                                         product or want to deploy it on their own premises. The key distinction I would make is we've designed
                                         
                                         Materialize Cloud such that the best place to get the highest number of nines of availability
                                         
                                         is Materialize Cloud. So things like active-active replication, automatic failover, load
                                         
                                         balancing, these are built using cloud native services and owned and operated by us as part
                                         
                                         of Materialize Cloud, and they are not part of the downloadable on-premise offering.
                                         
    
                                         And that's because fundamentally these things are designed
                                         
                                         using cloud services, so they're not portable, right?
                                         
                                         Like, you don't have S3 on your laptop.
                                         
                                         And yes, you can emulate it for testing,
                                         
                                         but that's not how you would run a production service.
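The idea of separating storage from compute described above can be illustrated with a minimal sketch. It is not how Materialize Cloud is implemented; `BlobStore` and `ComputeNode` are hypothetical names, and the in-memory dictionary only stands in for a durable object store such as an S3 bucket. The point is that a compute node holding no durable state can be discarded and replaced without losing data.

```python
from dataclasses import dataclass, field


@dataclass
class BlobStore:
    """Hypothetical stand-in for a durable object store (e.g. an S3 bucket)."""
    objects: dict = field(default_factory=dict)

    def put(self, key: str, value: bytes) -> None:
        self.objects[key] = value

    def get(self, key: str) -> bytes:
        return self.objects[key]


class ComputeNode:
    """Hypothetical stateless worker: every durable byte lives in the store."""

    def __init__(self, store: BlobStore) -> None:
        self.store = store

    def append(self, key: str, record: str) -> None:
        # Read the current object (empty if missing) and write it back extended.
        existing = self.store.objects.get(key, b"")
        self.store.put(key, existing + record.encode() + b"\n")

    def count(self, key: str) -> int:
        # Derive the answer from the store, not from node-local memory.
        return len(self.store.get(key).splitlines())


store = BlobStore()
node_a = ComputeNode(store)
node_a.append("events", "signup")
node_a.append("events", "purchase")
del node_a  # the compute node goes away; the stored data is untouched

node_b = ComputeNode(store)  # a replacement node picks up where node_a left off
record_count = node_b.count("events")
```

In a real deployment the durability would come from the object storage service itself, which is the class of problem the speaker says no longer has to be engineered for.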
                                         
                                         Absolutely, absolutely.
                                         
                                         Operating software and building software are two different things.
                                         
                                         So I have a question about the cloud offering compared to the experience that you described
                                         
    
                                         about Materialize from the beginning. And it has to do with latency, right? You said that
                                         
                                         Materialize is a system from which you can expect single-digit latency
                                         
                                         when it comes to the queries that you execute and the updates that you have.
                                         
                                         My intuition says that in order to achieve that, if I'm consuming data on Materialize
                                         
                                         from a database system that I have, I have to have my Materialize nodes as close as possible to my database.
                                         
                                         How can I do that when I use the cloud offering? So the first point I'd make is you're absolutely
                                         
                                         correct. You want this to be very close to your database. But the other thing I'll observe is most
                                         
                                         of the databases are in the cloud. So if you want to be close to the databases, you have to be in a
                                         
    
                                         cloud instance by definition to be close to the databases that are in the cloud.
                                         
                                         The important part of this is co-locating them as closely as possible.
                                         
                                         And it usually would come down to region, availability zone, co-location, and things
                                         
                                         like that.
                                         
                                         You almost certainly don't want to move this data across clouds, right?
                                         
                                         So our cloud service is launching next month on AWS, but
                                         
                                         eventually we want to fast follow to Azure and Google Cloud as well, because if your database
                                         
                                         is in one of these other clouds, you will have too much latency going between two clouds.
                                         
    
                                         The other thing I would say is the hyperscalers, the three big cloud companies,
                                         
                                         have gotten very good at laying extremely high bandwidth, low latency network connections.
                                         
                                         So as long as you're in the same region and spinning up your Materialize instance in
                                         
                                         a VM that is in the same region and perhaps even the same availability zone as your database,
                                         
                                         they've done a very good job making sure that those actual packets that are
                                         
                                         going across this virtual network will go over a fairly small physical distance.
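The co-location preference described above (same cloud first, then same region, then same availability zone) can be sketched as a simple ranking. This is an illustration only: `Placement`, `colocation_score`, and `best_placement` are hypothetical names, not part of any Materialize or cloud provider API, and the score is an ordering, not a latency measurement.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    """Hypothetical description of where a service runs."""
    cloud: str
    region: str
    zone: str


def colocation_score(database: Placement, candidate: Placement) -> int:
    """Higher is closer: 0 = other cloud, 1 = same cloud, 2 = same region, 3 = same zone."""
    if candidate.cloud != database.cloud:
        return 0
    if candidate.region != database.region:
        return 1
    if candidate.zone != database.zone:
        return 2
    return 3


def best_placement(database: Placement, candidates: list[Placement]) -> Placement:
    """Pick the candidate that is co-located most closely with the database."""
    return max(candidates, key=lambda c: colocation_score(database, c))


database = Placement("aws", "us-east-1", "us-east-1a")
candidates = [
    Placement("gcp", "us-east1", "us-east1-b"),    # different cloud
    Placement("aws", "eu-west-1", "eu-west-1a"),   # same cloud, different region
    Placement("aws", "us-east-1", "us-east-1b"),   # same region, different zone
]
chosen = best_placement(database, candidates)
```

Here the same-region candidate wins because no candidate shares the database's availability zone.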
                                         
                                         That's great. One last question from me, Arjun, and then I'll give it to Eric so we can also
                                         
                                         conclude this episode. You mentioned co-location and all that stuff, and you
                                         
    
                                         also mentioned S3. So for the people out there who are interested in using the cloud version of Materialize when it's launched,
                                         
                                         is this going to be on one cloud provider like AWS? Yes. Next month, we're rolling it out on AWS,
                                         
                                         and then a few quarters later, we will be rolling it out on other clouds.
                                         
                                         Okay.
                                         
                                         So people can expect that in the next couple of months, if they are a GCP shop, Materialize
                                         
                                         will also be available there, and on Azure, so that at least the major cloud providers are covered.
                                         
                                         I can't commit to a specific timeline, but one thing I will say is that there always
                                         
                                         is the option of running Materialize, the downloadable source-available product,
                                         
    
                                         in a VM in an Azure region or data center.
                                         
                                         That's great.
                                         
                                         I think we need to have at least another episode
                                         
                                         because I have more questions to ask.
                                         
                                         But I have completely monopolized this conversation
                                         
                                         and I need to give at least some time to Eric.
                                         
                                         This has been really fun.
                                         
                                         I really appreciate the questions, Costas.
                                         
    
                                         Thank you.
                                         
                                         Yeah, it's great.
                                         
                                         I think we're close to the buzzer,
                                         
                                         but we've talked about Materialize a lot just as a team
                                         
                                         and Costas and I,
                                         
                                         because we love discovering new technologies
                                         
                                         and it really is a true joy just to get to talk with you
                                         
                                         and just hear about the inner workings in many ways.
                                         
    
                                         And I hope this has been a really fun conversation for our listeners.
                                         
                                         Arjun, this has been such a wonderful conversation.
                                         
                                         We'll definitely have to have you back on.
                                         
                                         And congrats on the cloud launch.
                                         
                                         That's going to be great.
                                         
                                         Encourage all of our listeners to go to materialize.com and check it out.
                                         
                                         And we'll have you back on the show maybe in another six months or so
                                         
                                         after the cloud product's been live and hear how it's going. I would love to do that. This is an
                                         
    
                                         absolute pleasure of a conversation. Thank you both. Thank you, Eric. Thank you, Costas. This
                                         
                                         is a wonderful show you have over here. Well, Costas, I think one of the big takeaways I have,
                                         
                                         and this won't be my takeaway from the content of the show, is that you and Arjun are incredibly intelligent when it comes to very deep concepts around
                                         
                                         databases and languages that you use to build technologies. And so it was a real joy for me to
                                         
                                         hear two very intelligent people reason around some of the decisions
                                         
                                         that they're making.
                                         
                                         I think the big takeaway actually relates to my big question on the front end.
                                         
                                         Analytics is a really obvious use case, but all the other interesting things you can do
                                         
    
                                         when you enable real-time, I think are just going to open up a lot of really creative
                                         
                                         solutions to problems that are low-level plumbing problems in the stack
                                         
                                         currently. And that's very exciting. I mean, coming from a marketing background, I think about
                                         
                                         enriched profiles and automation and other things like that. And the ability to have this stuff in
                                         
                                         real time from a database, I think it will actually be a very big driver of creativity
                                         
                                         in the way that people are building experiences.
                                         
                                         Absolutely. You're absolutely right. I mean, the closer you get to real time, the more use cases
                                         
                                         you open. And I think we are just at the beginning of seeing what people can come up with using
                                         
    
                                         technologies like Materialize. And I'm pretty sure that if we talk again with Arjun,
                                         
                                         like in six months from now, he will probably have
                                         
                                         even more use cases to share with us.
                                         
                                         So yeah, absolutely.
                                         
                                         Materialize is a new technology, a new paradigm.
                                         
                                         There are many new, let's say, patterns that we have
                                         
                                         to learn and understand from there and experiment with.
                                         
                                         It might take some time for people to figure out
                                         
    
                                         how to use it, but my
                                         
                                         feeling is that we are going to see very exciting things coming from this technology. I have
                                         
                                         to say though that Arjun is also an amazing, amazing speaker. He was amazing in explaining
                                         
                                         really complex concepts, so I really enjoyed the conversation. I was really happy to hear
                                         
                                         about all the technology that they are using to build the Materialize product. And I'm also very
                                         
                                         excited to see what's going to happen with the cloud version of the product. It's also very
                                         
                                         exciting for me to hear that regardless of like the technology that someone is building, how this technology
                                         
                                         is delivered and is used, it's very important. And cloud is probably the best delivery model
                                         
    
                                         that we have at this point for this kind of product. So yeah, hopefully in a couple of
                                         
                                         months from now, we'll chat again with him and learn even more. Yeah, absolutely. I think as I
                                         
                                         reflect on the conversation, a lot of really paradigm-shifting
                                         
                                         technologies take something extremely complex and make the experience very simple. And there
                                         
                                         are lots of examples of that, but being non-technical, but working with you closely
                                         
                                         enough to understand when you talk about anything real-time related to a database,
                                         
                                         from a technical perspective, that's an extremely
                                         
                                         complex problem to solve. And I think if Materialize can simplify that, I mean,
                                         
    
                                         that's pretty paradigm shifting. So it'll be really fun. And I think if they can accomplish
                                         
                                         that, that'll be huge. Awesome. Well, thank you for joining us on the show.
                                         
                                         Lots of really good episodes coming up this fall. We're actually about to wrap up season two.
                                         
                                         So you'll see that wrap up coming up in the next couple of weeks. And then we have a great lineup
                                         
                                         for season three. And until then, we'll catch you on the next one.
                                         
                                         We hope you enjoyed this episode of the Data Stack Show. Be sure to subscribe on your favorite
                                         
                                         podcast app to get notified about new episodes every week. We'd also love your feedback. You can email me, Eric Dodds, at eric at datastackshow.com.
                                         
                                         That's E-R-I-C at datastackshow.com.
                                         
    
                                         The show is brought to you by RudderStack, the CDP for developers.
                                         
                                         Learn how to build a CDP on your data warehouse at rudderstack.com.
                                         
