The Changelog: Software Development, Open Source - Reinventing Kafka on object storage (Interview)
Episode Date: August 29, 2024
Ryan Worl, Co-founder and CTO at WarpStream, joins us to talk about the world of Kafka and data streaming and how WarpStream redesigned the idea of Kafka to run in modern cloud environments directly on top of object storage. Last year they posted a blog titled, "Kafka is dead, long live Kafka" that hit the top of Hacker News to put WarpStream on the map. We get the backstory on Kafka and why it's so widely used, who created it and for what purpose, and the behind the scenes on all things WarpStream.
Transcript
What's up, friends? Welcome back.
This is The Changelog.
We feature the hackers, the leaders, and those who are building data streaming platforms inspired by Kafka.
Yes, today's conversation revolves around Kafka and data streaming.
We're joined by Ryan Worl, co-founder and CTO at WarpStream.
Last year, they posted a blog titled Kafka is dead. Long live Kafka. And that, of course,
hit the top of Hacker News and put WarpStream on the map. Today, we get the backstory on why Kafka
is so widely used, who created it and for what purpose, and more importantly, the story of WarpStream. And the question they asked themselves was this: what would Kafka look like if it was
redesigned from the ground up today to run in modern cloud environments directly on top of object storage with no local disks to manage, but still had to support the existing Kafka protocol?
Well, that's just the premise for today's conversation.
A massive thank you to our friends and partners over at Fly.io.
More than 3 million apps have launched on Fly, and we're one of them.
Scalable full stack without the cortisol.
No stress.
Learn more at fly.io.
Okay, let's Kafka.
What's up, friends?
I'm here with a new friend of mine, Sagar Batchu, co-founder and CEO at Speakeasy.
You know, I've had the pleasure of meeting several people behind the scenes at Speakeasy,
and I'm very impressed with what they're doing to help teams to create idiomatic SDKs,
enterprise-grade SDKs in nine languages, and it's just awesome.
So, Sagar, walk me through the process of how
Speakeasy helps teams to create enterprise grade idiomatic SDKs at scale.
You know, APIs are tough things to manage for a company. The OpenAPI spec, this great, widely adopted standard to describe and document APIs, is the best chance the company has towards documenting it,
understanding state, you know, point in time, what is the API, and also ownership. What are the APIs?
How are they grouped? Which teams own them? What services do they get deployed to? There's a lot
of questions there that often we see teams and companies kind of struggling to answer. So Speakeasy is a forcing function for them to invest in making that OpenAPI spec as great as possible.
Completely descriptive, fully enriched.
Speakeasy helps with those gaps.
We have deterministic and AI tools
to kind of fill in the gaps for them.
And so the better and better that OpenAPI spec gets,
the better chance you have at serving your community.
The end value is always to the end user who is actually integrating with your API. So if your OpenAPI spec has gaps in it, the more likely they are to run into errors. They don't understand what they're implementing. It gets tough to maintain, because it becomes institutionalized knowledge as opposed to being described in the document. So there's a lot of great reasons
why you invest in that OpenAPI spec. Any artifact like that, that you're going to invest a ton of time into, needs tooling to manage, right? And that's what Speakeasy is at its core. It's tooling to manage that OpenAPI spec, give you kind of very clear change management principles around it, version it, understand, you know, exactly what versions are used for what SDKs. If you invest in that spec and use Speakeasy, you'll have a good document. And the moment you
have a good document, you can have good or great SDKs, which make integration easy.
The way Speakeasy works there is you point us at your document, wherever it lives, in GitHub or maybe some other file storage or somewhere else.
We detect changes as it evolves, as different people contribute to it.
And we send you new updated code every time that happens.
And so the moment we send you code,
there's an opportunity for you to review that
and say, you know, yes or no.
Like this is new code we want to ship to our customer.
We do that heavy lifting of generating that code,
giving you kind of provenance of your spec,
but leave you as human in the loop to decide,
okay, am I going to serve my ecosystem
with a new version of the spec in SDK?
So that's the kind of core workflow that we're built around.
And that's really the point of collaboration
between us and companies that we work with.
Okay, friends, the next step is to go to speakeasy.com.
Try today for free.
You get one free SDK in your language of choice on them.
Enjoy it.
Robust, idiomatic SDKs in nine plus languages.
Your API, the OpenAPI spec available everywhere.
Again, go to speakeasy.com.
Once again, speakeasy.com.
All right, today we are joined by Ryan Worl from WarpStream.
Ryan, welcome to the ChangeLog.
Thanks, it's great to be here.
Great to have you.
Shout out to listener Vladimir for requesting this episode.
Also, shout out to your co-founder, Richard,
who unfortunately couldn't be here today,
but hey, Richard.
What's up, Richard?
Yeah, but you're here,
so let's talk to you
and not to Vladimir and to Richard.
That being said,
Vladimir requested this episode.
You too, listener, can request episodes.
Head to changelog.fm slash request.
Let us know what you would like to
hear about on the pod, and we might just fulfill your every desire. Vlad wanted to hear about
WarpStream, and so that's why Ryan is here. Just so happens that Adam and I both would also like
to hear about WarpStream. So here we are. Let's start with Kafka, though, because it sounds like WarpStream's story starts with Kafka's story.
What is Kafka, besides an author from the early 1900s?
But the open source thing, what is that thing all about?
Yeah, Kafka is both a very interesting and a very boring system. The easiest way to think about it is it lets you create topics and you can have producers
that write messages into these topics and consumers that consume messages out of the topics.
It's kind of like a publish and subscribe type deal. But the thing that makes it interesting
is the fact that once you consume those messages, they're not deleted. So they're
still stored inside the system and another consumer can go and read them again for a
different purpose. Like if you have two different applications that are consuming the same
data set, they can both equally consume those messages. You know, let's say that you have
one application that does machine learning training and another that does alerting based on the same messages; you want to process the same data, but in two different applications.
Kafka is a useful tool for that.
It also provides ordering for those messages, so if you need to implement an application where you send messages in a certain order and you want that order to be retained on the other side, Kafka also does that: you'll get them back in the same order every time.
So you can implement something like state machine replication
or that type of thing where the ordering matters.
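A rough way to picture those semantics (the append-only, ordered log where consuming does not delete anything, and each consumer tracks its own position) is a toy in-memory sketch. This is purely illustrative, not the real Kafka API:

```python
class TopicLog:
    """Toy model of a Kafka topic: an append-only, ordered log.

    Consuming does not delete messages; each consumer just advances
    its own offset, so many consumers can read the same data.
    """
    def __init__(self):
        self.messages = []   # retained messages, in order
        self.offsets = {}    # consumer name -> next offset to read

    def produce(self, message):
        self.messages.append(message)

    def consume(self, consumer, max_messages=10):
        start = self.offsets.get(consumer, 0)
        batch = self.messages[start:start + max_messages]
        self.offsets[consumer] = start + len(batch)
        return batch

log = TopicLog()
for i in range(3):
    log.produce(f"event-{i}")

# Two independent applications read the same messages, in the same order.
assert log.consume("ml-training") == ["event-0", "event-1", "event-2"]
assert log.consume("alerting") == ["event-0", "event-1", "event-2"]
```

The real system adds partitions, replication, and retention policies on top, but the replayable-log core is the part that distinguishes it from a delete-on-consume queue.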
Okay, so what are some use cases for this?
Lots of, it sounds like, well-funded companies
use it, larger companies.
And I think that some of that is because
of the operational complexities
and the love-hate relationship with it.
But why are people grabbing this particular tool often?
Yeah, the reason why it's useful is there just isn't a lot out there that fulfills those, you know, the two main things.
It's like a publish and subscribe mechanism that's scalable.
And then also that lets you have different consumers process the same set of messages without, you know, one of the consumers deleting it.
Like, there's a lot of queuing systems where the messages, once you consume them, are just gone forever at that point.
Like the purpose is to consume the message and then have it go away, not to reprocess it again in the future. There are a lot of use cases for it. I'd say that the most broadly popular
one is for moving data from point A to point B, kind of like a dump pipe. It's used a lot in
observability and security-related workloads where you have a lot of application servers that are generating logs
and you want to temporarily put those logs somewhere
before you put them in something else.
Like you say, you want to put them in Elasticsearch
or something like that.
Elasticsearch can be a little finicky.
So you want to have Kafka,
which is a much simpler system in place,
as a temporary buffer
to hold those log messages
that you want to write to Elasticsearch
in case that Elasticsearch cluster is down
or you're doing an upgrade or something like that.
There's a lot of different reasons for it,
but Kafka is pretty much the de facto standard
for those kinds of workloads.
And then when you get outside of observability and security,
there's a lot of people that are building custom applications on top of Kafka, like an inventory management system for a warehouse, where you want to keep track of the real-time status of everything going on in the warehouse. You might send messages to say, oh, this new batch of inventory has been added onto the shelves of the warehouse, or I'm taking things out. And then you're computing some type of a live application based on that inventory data to say, you know, that you need to replenish the stock when it goes below a certain amount. But you want to do that in real time, so that you can react faster than just doing this once a day.
So Vladimir pointed us to a post, which I think Adam and I, we had both already read this post because it was last year.
Last summer, I believe.
Kafka is dead. Long live
Kafka. This was your big coming out
party, it seems. A great way
to introduce WarpStream.
And in that post,
you said that Kafka is one of the most
polarizing technologies in the data
space. And then, whether
it was you or Richard who wrote that,
then you just moved on and kept going,
assuming that we all just knew why or how,
or agreed that that was just true.
I assume it's true, it's probably polarizing.
But why is it polarizing?
My guess is because it's useful but difficult to use.
And so people love it and hate it,
but maybe there's more to it than that.
So I think that there are probably two main criticisms that people have of Kafka. The first is that it's hard to run. As the operator, you have to have a lot of knowledge about how to use the open source project appropriately. And the second major issue is the cost. You know, I'm sure we'll get into this, but the cost of running open source Kafka in the cloud is pretty high compared to what people expect it to be. If you think of it as a dumb pipe, you would expect to pay dumb-pipe-type rates for it. But given the fact that it requires triply replicating the data onto local disks,
and you'd have to pay, most of the cloud providers are charging you money for interzone replication,
you end up paying a lot more than you expect, even if you're just storing the data temporarily.
If you're using open source Kafka on AWS, for example, the minimum cost for a highly available
3AZ setup for the cluster
is 5.3 cents per gigabyte,
compressed gigabyte written into the cluster.
That's just to do the replication part.
The storage part is all another story.
It depends on how long you want to store the data for,
but you're like, if you're starting out,
that's your baseline cost.
It can get pretty expensive pretty quickly.
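As a back-of-the-envelope sketch of where a figure like that can come from: assume, illustratively, about $0.02/GB charged in total across the two sides of an inter-AZ transfer, producers spread evenly across three AZs, and replication factor 3. Actual AWS rates and topologies vary, so treat the numbers as an approximation, not a quote:

```python
# Rough estimate of inter-AZ transfer cost per compressed GB written
# to a 3-AZ, replication-factor-3 Kafka cluster (assumed rates).
INTER_AZ_RATE = 0.02  # $/GB total: ~$0.01 egress + ~$0.01 ingress

# A producer's partition leader sits in a different AZ 2 out of 3 times.
produce_cost = (2 / 3) * INTER_AZ_RATE          # ~$0.0133/GB

# The leader then replicates every write to 2 followers in the other AZs.
replication_cost = 2 * INTER_AZ_RATE            # $0.04/GB

total_per_gb = produce_cost + replication_cost  # ~$0.0533/GB
print(f"~{total_per_gb * 100:.1f} cents per compressed GB written")
```

Under those assumptions the total lands at roughly 5.3 cents per gigabyte, before any storage costs.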
Is anyone building or using Kafka, open source Kafka, as you said, in a scenario where they're not on public cloud, where they're building out their own infrastructure, where it's probably maybe even harder because you're literally managing the disks?
You're now ordering the disks, you're RMA-ing the disks.
You're literally managing the disks.
Is that a scenario that happens or is it less likely?
So that's definitely a thing that happens.
I know of companies that do that,
but just as the migration to public cloud
over the last 10 years has only increased in velocity,
essentially, that is becoming less and less popular, because it is indeed hard, and it's even harder when it's in your own data center, as opposed to the cloud, where you can just ask for more disks and you get them right away. The cost situation is a little different there too, because typically the way that you're provisioning network in your own data center would not end up with a per-gigabyte cost. I mean, obviously, you amortize everything over how much data you're transferring inside your data center, but you're buying it in terms of hardware, and your per-gigabyte rate, if your traffic goes up, doesn't correlate linearly the same way as it does with Amazon.
But that's, it's definitely still a thing people do,
but it's less and less popular every day.
Continue with the polarizing.
What else is polarizing about Kafka?
Some people have strong opinions about the actual developer programming model of Kafka, and that it's a little hard to use sometimes.
I think that's less of a big deal these days
as more tools have integrated with Kafka.
It just makes it even easier to use Kafka. There are some other systems that might have a theoretically easier-to-use programming model, but everything speaks Kafka now. So those concerns are mostly trumped by the fact that it's the de facto standard.
I think really what most people are concerned about when, like, if you don't use Kafka today
and you're thinking about bringing it in to your company, the two things that you're going to be
concerned about are how hard is it to run and how much is it going to cost?
Those are typically people's two big blockers.
It doesn't have anything to do with the fact that conceptually they have an issue with Kafka.
It's those more practical things.
What makes it so difficult to run?
Is it the SSDs? I think that post also called it finicky.
Is it poorly architected? Like, why is it finicky?
It's a number of different things. I think the first one is, yes, being responsible for anything that stores data on local disks, if you want to achieve high availability and high durability of your data, is challenging. It requires experienced SREs to handle those types of failures when they do occur. But that, I think,
that can be dealt with because people do that with other systems all the time. But I think that
most people's problems with Kafka come when they
want to scale up and scale down the cluster in response to load. The open source project doesn't
really give you much tooling when it comes to helping you manage that process. Like for example,
in the open source project, there is no automated tool to rebalance the data among the machines
when you add or remove machines. That's kind of a table-stakes feature in a lot of systems; if you're thinking about a distributed relational database, you know, it would seem kind of silly if you had to run a script to move data between the nodes in the database. But that is true of open source Kafka. And there are now other tools that you can use alongside of it that
can take some of this work off of you, but they're not always the easiest to use either.
It's not like a self-balancing, self-managing thing like a lot of the distributed relational databases are.
It's something that takes a little bit more hands-on work. The other thing that goes along with that is, if you're storing data for a long period of time, they didn't add a tiered storage feature until very recently in the open source product. And the time that it takes just to copy the data around from machine to machine when you're scaling up or scaling down the cluster can be hours or days, depending on how dense you're running the machines.
Some of that is alleviated with the new tiered storage stuff where the older data is moved
to object storage, but that part doesn't alleviate the inter-AZ networking costs.
And there's another post on our blog about tiered storage and Kafka if people are interested
in learning more about that topic.
It is open source though, right?
Apache Kafka?
Yeah.
Yeah.
The project is managed by the Apache Foundation
and has a variety of contributors across a ton of companies.
And I would say it's a fairly healthy example of an open source project
in terms of having a big community around it.
There's a margin of haters, let's just say, towards Kafka. And it is open source. And I'm
just curious, you know, you may be in that bucket of margin of haters because you've created
WarpStream, right? So you're kind of not for, you're kind of against, at least from an economic
standpoint, and maybe a DX standpoint and many other standpoints. The point I'm getting
to is, why not just improve Kafka?
There are a lot of practical challenges with improving
a large open source project with a lot of users and a lot of
dependent parties, I should say. Not even necessarily
just users, but stakeholders
of all kinds.
Making large sweeping changes is essentially impossible. The amount of code churn required to take open source Kafka and get it to something resembling the architecture of WarpStream is just not going to happen in any reasonable amount of time. That's the first part. Even if you just wanted to do it abstractly, no financial interests involved, how would you do this? It would be very hard, practically.
The second reason is that WarpStream makes a pretty different set of trade-offs than the
open source project does in terms of the environment that we expect users to run in. Now, I think those trade-offs are correct for the world
that exists today.
But in the abstract, it is different than the open source
project.
So WarpStream stores data only in object storage.
That's step one.
You need an environment that has object storage.
And then step two is that we run a control plane
for the cluster, which in the open source, the comparison would be kind of like somebody running ZooKeeper or KRaft, which is their replacement for ZooKeeper inside of the open source project. It's kind of as if we're running
that for you remotely, and then you're running the agents as we call them, which is the replacement
for the Kafka broker inside your
cloud account. So there's a very specific topology that we're prescribing to our customers
as well. That's different. Probably wouldn't fly in an open source environment, or at least would
make it even more challenging to run potentially. I think those are probably the two biggest reasons
of why we couldn't just improve Kafka: it would be too hard practically to make improvements. And then also, we're making trade-offs around how we see the world existing today and how we think it's going to continue to exist in the future, and a lot of the stakeholders in the open source product may not agree with our assessment there, basically.
Good answer.
I was expecting a version of that.
I was not suggesting that you should just not have started WarpStream and, by all means, just go contribute to Kafka and bail. But it's always good to get
that perspective because Kafka's got history. It's 13-ish years old. It was developed inside
of LinkedIn for different purposes. That's why I started off with the question, which was
their own infrastructure. Because LinkedIn designed this for a different purpose than
everybody else today uses it. It was not designed to be used in a cloud environment where there's a lot of egress fees and a lot of fees for moving data around. And so it was not really designed for its actual use case, or the usage space that it's in. And LinkedIn did not charge its users those transaction fees, I assume, potentially because, I don't know LinkedIn's infrastructure history, but I assume because they had far more control over their own environment, to not have to deal with those costs, than maybe everyone else who's become a Kafka user has had to take on.
Yeah, the way that I like to explain that, the networking cost side, is that when you're renting space in a colo
or you have your own data center,
you're implicitly paying for
what is kind of a fixed capacity resource.
It has a very high fixed capacity,
but you are essentially paying for a resource
that has a fixed capacity
without doing a bunch of capital improvements
to your data center.
Whereas if you're in the public cloud, you can show up and put a credit card down and start moving gigabytes a second across the network without asking anybody for permission, nothing. So you're paying kind of a tax for that flexibility of being able to show up without asking anybody and, all of a sudden, start moving a ton of data, and especially in terms of how spiky you can do it. Like, you can write 100 gigabytes a second for one minute and then never pay Amazon any money again, and they have to do some capacity planning on their end, just like they do for every other service, and that's why they charge, you know, higher on-demand rates for EC2 instances than if you go and buy a random server off the internet and put it in your house.
The cost looks very different.
Now, whether that cost is right, whether that reflects real economic realities, I don't think anybody can say without being inside of Amazon. But I think there's a pretty logical rationale for why it exists that way, because there are people that will consume bandwidth in a very different way. You have to think about the worst-case-scenario users of your service, basically, the people that you might even call abusers of your service in terms of your cost profile. So I think that's why, as you're saying,
you're correct that LinkedIn can just decide to use Kafka in a different way internally to match their ability to provision infrastructure.
And Amazon can't really force you to do that in any way other than just charging you more money for it.
So that's why they do.
So you and Richard, did you guys meet at Datadog?
Is that where you guys connected or was he at Datadog?
Tell us a little bit of the history of you two.
Yeah, so Richard and I met a little over five years ago now at a conference.
We met at Percona Live, I think it was 2019 in Austin.
And he was working at Uber at the time.
And yeah, so we did eventually both end up joining Datadog, but that was a little later.
Gotcha.
And while you were there, you had put some sort of Datadog infrastructure on S3 or on object storage.
Husky, I think.
I'm going from memory now.
Yeah.
So my co-founder, Richie, and I, after he left Uber, we started working on a prototype of a system where the idea was basically a Snowflake for observability data. That was the elevator pitch. And we were going around pitching that to investors at the time, and that's how we got to know some of our investors in WarpStream today; we met them back in those days. And that eventually caught
Datadog's attention. And we ended up joining Datadog together to build that system, Husky. Some of our current colleagues at WarpStream were also there at Datadog building that system
with us. Basically, the idea there was to replace the legacy system inside of Datadog for basically anything that you can think of that's not, like, pre-aggregated time series metrics. We think of it as timestamp plus JSON; that was the data model, basically. And we wanted to move all that data to object storage.
There's a ton of different reasons for it,
similar to the reasons why WarpStream is useful.
But over the three and a half years that my co-founder and I were there,
we migrated all of the products that were using the legacy system over to Husky.
Yeah, I mean, that's why I ask about it
because it seems like it's a precursor
to this very similar move with Kafka, right?
Like what if we took Kafka,
ripped out the local storage aspect of it?
Sounds easy enough.
And built something, I mean, by ripped out,
conceptually ripping out, right?
You didn't fork Kafka and write this, right?
You started over?
Yeah, we started from scratch and writing it in Go.
Right, so conceptually rip it out,
but actually rewrite something that's Kafka compatible
in terms of features and API, I assume,
and all that kind of stuff.
But no local storage, object storage.
And your success with what happened to Datadog
probably led the way for you to say,
well, if we did that, it would be a lot cheaper basically
and way easier to operate
because hello, Amazon Web Services, right?
Like it's their problem now.
Yeah, there's definitely a lot of
like high level conceptual overlap.
The systems are extremely different because one looks more like an OLAP database,
and the other is, I mean, Kafka is more like a log.
So there's some very high-level conceptual similarity.
And I think the thing that we really got the most experience with there was learning about object storage. So that's about where the similarities stop. The deep experience of understanding how object storage works at scale in all of the major public clouds was a hugely valuable learning experience for us, such that when we left and we were doing the back-of-the-envelope math on could we make this thing work, the experience with object storage that we learned there was pretty helpful. Now, people talk a lot about object storage nowadays, so I think that's not an unknown thing, to understand the characteristics of working with it. But I'd say in 2019,
that was a fairly different story. I think the only people that would know a lot about building
high performance systems on top of object storage, they were probably all either inside the public
cloud providers themselves, or they were working at Snowflake or a similar company. The knowledge
was not super well distributed at that time. Most people, when they think of object storage, they think of something that's super slow.
Like, they're thinking about it in terms of seconds of latency to do anything. But the numbers around it are very different than what people might think of off the top of their head.
And that opens up a lot of design possibilities that you don't think of immediately.
Okay, friends, here are the top 10 launches from Supabase's launch week number 12.
Read all the details about this launch at supabase.com/launchweek.
Okay, here we go.
Number 10, Snaplet is now open source.
The company Snaplet is shutting down, but their source code is open. They're releasing three tools under the MIT license for copying data, seeding databases, and taking database snapshots. Number nine, you can use pg_replicate to copy data, full table copies and CDC, from Postgres to any other data system. Today it supports BigQuery,
DuckDB and MotherDuck, with more sinks to be added in the future. Number eight, vec2pg, a new CLI utility for migrating data from vector databases to Supabase or any Postgres instance with pgvector. You can use it today with Pinecone and Qdrant. More will be added in the future.
Number seven, the official Supabase extension for VS Code and GitHub Copilot is here. And it's here
to make your development with Supabase and VS Code even more delightful. Number six, official Python support is here.
As Supabase has grown, the AI and ML community have just blown up Supabase. And many of these folks are Pythonistas. So Python support expands.
Number five, they released log drains, so you can export logs generated by your Supabase products to external destinations like Datadog or custom endpoints.
Number four, authorization for real-time broadcast and presence is now public beta.
You can now convert a real-time channel into an authorized channel using RLS policies in two steps.
Number three, bring your own Auth0, Cognito, or Firebase.
This is actually a few different announcements.
Support for third-party auth providers,
phone-based multi-factor authentication,
that's SMS and WhatsApp,
and new auth hooks for SMS and email.
Number two, build Postgres wrappers with Wasm.
They released support for a Wasm (WebAssembly) foreign data wrapper. With this feature, anyone can create an FDW and share it with the Supabase community. You can build Postgres interfaces to anything on the internet.
And number one, Postgres.new.
Yes, Postgres.new is an in-browser Postgres with an AI interface. With Postgres.new, you can instantly spin up an unlimited number of Postgres databases
that run directly in your browser and soon deploy them to S3.
Okay, one more thing.
There is now an entire book written about Supabase.
David Lorenz spent a year working on this book, and it's awesome.
Level up your Supabase skills and support David and purchase the book.
Links are in the show notes.
That's it.
Supabase launch week number 12 was massive.
So much to cover.
I hope you enjoyed it.
Go to supabase.com/launchweek to get all the details on this launch, or go to supabase.com/changelogpod for one month of Supabase Pro for free. That's S-U-P-A-B-A-S-E dot com slash changelogpod.
What are some lesser-known things about object stores that you know that we don't know, or maybe nobody knows besides you?
Yeah, it's not really one secret trick. I think it's just a conceptual framing
that you have to think of it as if you have access
to a very large oversubscribed array of spinning disks.
If you think about it like that, then the conceptual framing of how it works, like how you design a system around it, will make a lot more sense. So there are a couple different pieces of that. Really large: way bigger than your individual application. You have the world's biggest RAID 0 of all the disks ever. It's effectively unlimited, so think about it that way. But also oversubscribed.
The latency characteristics are highly variable. One request might take 10 milliseconds and another takes 50, and there's no discernible reason to you why that is the case. It's just how it works. So you have to design around that a little bit
in terms of retrying requests speculatively
and that type of thing.
But if you have that framing
of it's very large,
cheap storage
with variable latency characteristics,
you can,
if you rework your application
to think about how it would make it work
on top of that,
then you've got the right framing.
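One common way to design around that variable latency is a hedged (speculative) request: issue the read, and if it hasn't come back within some deadline, issue a duplicate and take whichever finishes first. The sketch below simulates the idea with a fake fetch function and made-up latencies; it is not any real object-storage client API:

```python
import concurrent.futures
import threading
import time

def hedged_get(fetch, hedge_after=0.05):
    """Run fetch(); if it hasn't returned within hedge_after seconds,
    issue a second speculative copy and return whichever result
    arrives first."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(fetch)
        done, _ = concurrent.futures.wait([first], timeout=hedge_after)
        if done:
            return first.result()        # fast path: no hedge needed
        second = pool.submit(fetch)      # hedge: speculative retry
        done, _ = concurrent.futures.wait(
            [first, second],
            return_when=concurrent.futures.FIRST_COMPLETED)
        return done.pop().result()

# Simulated object-store GET whose latency varies per request:
# the first call is slow, the hedged copy is fast.
latencies = iter([0.2, 0.01])
lock = threading.Lock()

def fake_fetch():
    with lock:
        delay = next(latencies)
    time.sleep(delay)
    return delay

result = hedged_get(fake_fetch)
print(f"served by the request that took {result}s")
```

The trade-off is a small amount of duplicate work in exchange for cutting off the long tail of slow requests.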
The reason why it's so challenging for people today is that they spend all their time thinking about the fastest storage that's available today.
They spend a lot of time thinking about persistent memory or NVMe SSDs, stuff like that.
They think about that first when they're designing their application.
How do I get the lowest possible latency?
Making your application work on that first
and then trying to add object storage on top
is a very popular thing that people try to do.
They always call it tiered storage.
Basically every system that has that calls it tiered storage.
And it's very hard to match the characteristics
of those two things together going top down,
whereas going bottom up the other direction,
starting with object storage and then layering stuff on top,
it seems like it should be the same, but it's not.
You don't end up making the same design decisions along the way.
And that has a big influence on the overall characteristics
of the system.
And I can explain specifically what
that means for Kafka in terms of tiered storage.
So they were thinking about disks first, like local NVMe SSDs. That's usually what people are running it on these days
in the cloud. The way that that influences the design is that the way that they implement tiered
storage is they just take those log files on disk that have all the records in them, and they copy
them over to object
storage. That solves a cost problem. If you never want to read that data again, you're good. Like
that's cool. It's much cheaper now. When you want to come back and read it, let's say that you
wanted to read all of it, like all of the data you've ever tiered off into object storage. The way that that works in the open source project is that all of that data, you're going to have to pull back through one of the brokers. There's no way for you
to like parallelize that processing because they just view it as this bunch of log files that I
put into object storage. And with WarpStream, we've kind of decoupled the idea
of the local storage being owned by one machine
to now there's a metadata layer that says,
these are all the files that exist.
And then we have all these stateless agent things
that can actually pull the data out of object storage for you.
So you can scale up and down as quickly as you need to, to read all that data out
of object storage. So you wanted to pull it all out. You can scale up temporarily for the hour
that you wanted to run some big batch job and then scale back down at the end. With the open
source tiered storage in Kafka, that's a lot harder because they started with the local disk
part, which makes sense because that's what existed before.
It just means that when you're adding stuff on afterwards, the tiered storage, the lower layer of storage, is a secondary concern.
It doesn't get as much love and attention as the primary storage gets.
And you end up with a very different system at the end.
For us laymen, can you describe how the brokers work and contrast that again with these stateless agents?
I understand that you can scale the agents horizontally because they are stateless versus a broker, which seems to have kind of a lock on some data.
But what do Kafka brokers do exactly?
Yeah, so let's start with topics. A topic is basically just a name for mapping consumers and producers together. They agree on the name of a topic for where they're going to send the data to and where they consume the data from. And within the topic there are partitions, and a partition is basically just a shard to make that topic scalable.
There are a lot of different ways to decide which shard you're going to write the data to,
but let's just say for now,
you do it by hashing the key of the message
and then routing it to the shard
based on the hash of that key.
So if you have a record with the same key, you'll end up going to the same broker that owns that partition every time.
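That key-to-partition routing can be sketched in a few lines. Note this is an illustration only: Kafka clients default to a murmur2 hash of the key, while this sketch uses MD5 purely because it's in the standard library.

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Hash the message key and route it to a shard (partition).
    # Kafka's default partitioner uses murmur2; MD5 stands in here
    # just to demonstrate the idea with a stable hash.
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# The same key always maps to the same partition, which is what
# gives you per-key ordering in Kafka.
print(partition_for(b"user-42", 8) == partition_for(b"user-42", 8))  # prints True
```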
So that's how it works in the open source product. The brokers own some set of partitions from a
leadership perspective. And then there's also replicas of that that are just copying the data.
And it's just other brokers that are the replicas for those partitions. So the broker will write
that data that it receives from a client,
producer client down to the local disk and replicate it out to the followers. And then a consumer can come along and read either from a replica or the leader, the data that producer
wrote. But they're all coordinating on essentially one of those brokers owns the partition specifically
that I'm interested in and reading. So that's how it works in the open source project. And in
WarpStream, we've decoupled the idea of ownership of a partition from the broker itself.
We have a metadata store that runs inside our control plane that has a mapping of: here are all the files in object storage, and within those files, the data for this partition for this offset is here, in some section of a file in object storage.
So any of our agents, which are like the stateless broker that speaks the Kafka protocol to your clients, any one of those agents can consult the metadata store and ask, I want to read this topic partition at offset X.
Where do I have to go in object storage, and potentially multiple places in object storage, to read that data? But because the metadata store inside the control plane is handling the ordering aspect of it, essentially, you get the same guarantees as Kafka
plane is handling the ordering aspect of it, essentially, you get the same guarantees as Kafka
in terms of, I have this message with this key that's routed to this topic partition,
and I want them to stay in the same order because I'm writing them in a specific order.
That ordering part is enforced by the metadata store inside the control plane,
but the data plane part of actually moving all of those messages around is only inside the agents and object storage.
So it lets you do that thing that I was saying before, where if you want to scale up and down, it's very easy to do that because you don't have to rebalance those partitions, which take up space on the local disk amongst the brokers in order
to facilitate that.
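A toy sketch of that decoupling might look like the following. Everything here, the class names, fields, and bucket key, is invented for illustration; it is not WarpStream's actual metadata schema, just the shape of the idea: a control-plane index from topic-partition offsets to byte ranges in object storage files, which any stateless agent can consult.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileSpan:
    object_key: str  # file in the object storage bucket
    byte_start: int
    byte_end: int

class MetadataStore:
    """Control-plane index: (topic, partition, offset) -> where the
    bytes live in object storage. Hypothetical, for illustration."""

    def __init__(self):
        # (topic, partition) -> list of (first_offset, last_offset, FileSpan)
        self._index = {}

    def record_file(self, topic, partition, first, last, span):
        self._index.setdefault((topic, partition), []).append((first, last, span))

    def locate(self, topic, partition, offset):
        # Any stateless agent can ask this question, then read the
        # byte range straight out of object storage itself.
        for first, last, span in self._index.get((topic, partition), []):
            if first <= offset <= last:
                return span
        return None

store = MetadataStore()
store.record_file("logs", 0, 0, 99, FileSpan("bucket/file-0001", 0, 4096))
print(store.locate("logs", 0, 42))
```

Because ownership lives in this index rather than on any broker's disk, scaling readers up and down never requires moving partition data around.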
So you're reading metadata versus reading the real data, basically.
And that's what makes it faster.
In terms of being faster, it's faster at the fact that there is no rebalancing that happens
because the data is always just in object storage somewhere.
You don't have to do any rebalancing for it.
That part of it is faster.
There's obviously a trade-off when you do this, in that the latency
of writing to object storage is higher than writing to the local disk. So if you want your
data to be durable, you have to wait for the data to be written to object storage first.
So that's the primary trade-off somebody that's using WarpStream would be making,
is that they're comfortable with around 500 milliseconds at the p99 of latency to write
data to the system, and then the end-to-end latency of, like, a producer sends data and then it's consumed by a consumer is somewhere between one to one and a half seconds, again at the P99.
What percentage of the Kafka population does that rule out? Because it seems like many of them are highly real-time oriented.
So it's interesting that you say that you use that word real-time
because we've talked to a ton of different Kafka users.
And when you ask them,
what is your end-to-end latency of your system today?
A lot of them don't know the answer.
They know, like, they think that they know the answer.
Well, it's real time.
Yeah.
They're either not measuring it
or they're measuring it in a weird and incorrect way.
There's a lot of different ways that that can happen,
but typically the way that we've experienced
is that if you ask an executive
at the company that uses Kafka heavily,
ask them,
is your application latency sensitive? They'll say, of course, we're an extremely high performance
organization. We love high performance systems. Obviously, the end-to-end latency couldn't be anything
more than 50 milliseconds. That would be crazy if it were anything more than that.
And then you make it a little bit further down the chain in the organization. You ask the application developer
or the SRE who's actually on call for the thing
or wrote the code.
You ask them and they're like,
I don't know.
I hope that it's fast, but I'm not really sure.
Or you ask them and you get an explicit answer
that's very different than the answer
that the executive gave you.
And realistically, there are a few
applications that we come across that do need that low latency. And the primary example of that,
I mean, there's a lot of this kind of application out there in different domains, but the good
example that demonstrates it is credit card fraud detection. The way that, you know, there are people out in the real world
using credit cards
and you want to make a determination
about whether a charge is fraudulent
at the point of time
that they're swiping the card.
So that is like a,
necessarily a real time thing.
You know, like there's a user
who's waiting on the real world
and if Kafka is in the critical path,
especially multiple hops through Kafka in the critical path, then a system that has higher latency
like WarpStream would be harder to adopt. And there are other applications that meet this
criteria. But basically, if the user is in the critical path of the request,
then WarpStream is harder to adopt, like in the abstract. You can obviously, some specific applications
might be okay with higher latency than others,
but that's the one that we see from time to time.
When you strip all those out though,
the things that you have left
are the more analytical type applications.
Like the example I was talking about before,
moving application logs around.
Developers are pretty used to some
delay between the log print statement running inside their application and being searchable
inside wherever they're consuming their logs from. So the additional one second of latency
there is typically a non-issue. And the reason why that's useful for us as a company at Workstream is that
those workloads are typically really high volume and they cost the user a lot of money. So our
solution being more cost-effective really resonates with them, because usually there's also a curve of the more data you're generating, the less valuable that data is per byte.
So there's budget pressure to increase the efficiency of processing that data.
And Kafka sticks out like a sore thumb in terms of that processing cost.
So we can come in and say, hey, the, you know, because of the way the cloud providers don't charge you for bandwidth between VMs and object storage, and we store all the data in object storage, that means you're going to save this many hundreds of thousands of dollars a year on sending the dumb application logs that you're generating into the eventual downstream storage.
That makes a lot of sense to them.
So while we understand that we can't hit every possible application in the market
with the shape that WarpStream is today,
we're pretty happy with the set of use cases
and workloads that we can target
because there are just so many of them out there
and they happen to align with the budget-sensitive ones.
Those reads and writes, can you restate those?
Did you say writes are at most in P99 500 milliseconds and reads are one to two seconds
in P99?
Is that correct?
So the writes are around 500 milliseconds at the P99.
That's tunable.
By default, we have the agent buffer the records that your clients are sending
in memory for 250 milliseconds before writing them to object storage so that you just write
fewer files to object storage, which is the primary determinant of the cost of the object
storage component of the system if you're not retaining the data for very long. But you can
shrink that down all the way to 50 milliseconds, in which case the produce latency at that point
would be probably ballpark 300 milliseconds at the P99.
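A rough back-of-the-envelope shows why that buffer interval is the primary cost knob. The numbers here are assumptions for illustration: S3-style PUT pricing of $0.005 per 1,000 requests, ten agents, and the simplification that each agent writes exactly one file per flush interval.

```python
# Shorter buffer interval => lower produce latency, but more PUT
# requests (more files) against object storage, which drives cost.
PUT_COST_PER_REQUEST = 0.005 / 1000  # assumed S3-style PUT pricing

def monthly_put_cost(flush_interval_s: float, num_agents: int) -> float:
    # Simplification: each agent writes exactly one file per flush.
    puts_per_second = num_agents / flush_interval_s
    seconds_per_month = 30 * 24 * 3600
    return puts_per_second * seconds_per_month * PUT_COST_PER_REQUEST

for interval in (0.25, 0.05):  # the 250 ms default vs the 50 ms floor
    print(f"{interval * 1000:.0f} ms buffer: ${monthly_put_cost(interval, 10):,.2f}/month")
```

Going from the 250 ms default down to the 50 ms floor multiplies the file count, and therefore the PUT cost, by five; that is the trade the extra couple hundred milliseconds of latency is buying back.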
I said end-to-end instead of read
because that's typically what people talk about in Kafka terms
because they want to know, like, a producer sends a message,
how long does it take until a consumer can consume that message successfully? So that's what I mean by end-to-end. And that is one to one and a half seconds at the P99 for most of our users.
So latency aside, what are the other downsides of this approach?
So there really aren't that many downsides other than the latency. The latency trade-off is what actually enables all of the benefits of WarpStream, basically. The object storage is what enables a lot of the benefits.
We have a couple of interesting features that are based on the fact that all of the data is in object storage.
One of them we call agent groups. And agent groups let you take one logical cluster and split it up physically amongst a bunch of different domains.
They could be different VPCs within the same cloud account.
They could be different cloud accounts, or the same cloud account but across regions.
All by just sharing the IAM role for the object storage
bucket between those different accounts. The alternative to this with open source Kafka is
setting up something crazy like VPC peering, which is extremely hard to do. And your security team
will probably not be super happy if you try to ask them to peer a bunch of VPCs together because it
introduces more security risks.
So we have customers in production using this feature today
where the example that we usually give is
there's a games company that splits their production games account,
where all the game servers run, from the analytics account
where they run a bunch of Flink jobs
to process the data generated from the production account.
And they run agents that just do produce.
So just writes, they run that in the production account
and they run agents that just do fetch
inside their analytics account.
So they've kind of flexed the cluster
across those two different environments.
And all they had to do to set that up
was share the IAM role on the object storage bucket instead of peering the VPCs together. So the fact that everything is in object storage
opens up a ton of new possibilities, actually. Basically, the only downside of WarpStream is
the fact that the latency is higher. Now, obviously, we're a new company. The product does not have the 13-year maturity of the open source Kafka project.
But just to speak of the operational stuff and the cost stuff, WarpStream is a huge win on both of those.
Does it have any of the hosting flexibility?
I suppose you're putting everything in object storage anyway,
so there's probably people running
their own object storage clusters,
but that might be crazy.
I don't know.
Yeah, so there are a number of projects
and products out there that you can buy
to give you an object storage interface
in essentially any environment.
Like there's the open source project,
MinIO, and then basically
every storage vendor on the market
will sell you something with an S3-compatible interface
if you're running in a data center environment.
And because we work with S3, GCS, and Azure Blob Storage,
we can essentially connect to anything.
If you had an NFS server,
we can even make it work on that too.
We don't have anybody in production doing that,
and I wouldn't recommend it.
I would recommend using the object storage interfaces,
but we're pretty flexible
in terms of the deployment topology.
What about R2?
Would you have even more savings,
or would that not matter
because nothing's going outbound
from the virtual network
there?
So I think it would depend on where you're running the compute. If you were storing the data in R2, but you were running compute in AWS, you would get charged a lot for internet transfer as part of that. If you're running your compute in one of the providers that has free peering with R2, then yeah, you would get a nice savings there, and you'd be able to move data reliably across, let's say, multiple regions of whatever providers have peered for free with R2, using WarpStream.
I was thinking about getting started, really, or just trying it out. I do like your curl demo. I did play with it. I had no idea what I was doing, but it was cool. The command is on your home page. It's curl and a URL to an install script. I did not review that script prior to running it. I just trusted you.
You're admitting that to everybody?
Well, you know, it was a VM on Proxmox.
I didn't care that I could just throw away.
It wasn't my own machine.
I was safe.
That's a good layer.
It did spin up, and then it gives you a URL you can go to to log in.
And next thing you know, you're looking at a cluster.
So I like that aspect about it.
Whose idea was it to come up with that demo?
I mean, it's very hacker.
It's very developer, right?
Like no pain whatsoever.
If you've got a VM or you want to spin up a VM
or you have Proxmox and you can do it safely like I've done,
or you can spin up a droplet on DigitalOcean
or pick your own if you've got a VPC, whatever.
You could do it in a more safe manner and have some fun.
What do you expect people to do with that?
What are people saying about that?
And whose idea was it to produce that demo?
This is very hacker.
I like it.
Yeah, I think the demo was Richie's idea.
It basically just starts up a producer and a consumer so that you can just see something happening in the console.
Like, yeah, it provides you a link.
If you would have run that locally on your laptop,
we would have opened the link automatically in your browser for you.
It said it had a problem and I had to click it.
We even designed the little niceties like that.
The idea behind the demo is basically just to show people that it does something.
Kafka is not an exciting technology to demo.
We're kind of limited there.
It's even
more boring than doing a demo for a relational database or something.
But there is another mode that you can run that in that's called Playground, and Playground will let you start a cluster that doesn't have a fake producer and consumer running on it as a demo. It just starts a cluster for you temporarily
and makes an account that expires in 24 hours.
And you can take that Playground link
and you can start multiple nodes,
like say one on my laptop and one on yours
and point it at R2,
and we can have a cluster that spans our two laptops together.
My co-founder and I did that before
and posted a video of it on Twitter or something like that.
But because the data is all in object storage
and the compute part is stateless,
it's actually not that complicated to do.
It's basically the thing we were talking about a second ago with R2,
just connecting two laptops instead of two different regions or something like that.
So to get to the Playground version of it, is it like dash dash Playground?
How do I get there?
Yeah, so there's three different commands primarily that people would run.
There's WarpStream Demo, there's WarpStream Playground, and then there's WarpStream Agent.
The Agent is like the one you would run for production to start an agent, and the Playground one is how you start a playground. I think the Playground even spits out in the output the command that you would copy and send to somebody else to start it in another terminal. But it's been a long time since I've played with it, so I may be remembering wrong. The reason why people like the demo, or I should say the Playground, is that it makes it
easy if you're a developer to just start a cluster and use it for local development instead of having
to run. If you use WarpStream in production, you want to use the same thing in your development environment just to ensure consistency.
You can use Playground Mode to create a cluster and it will just go away when you stop using it.
And there's no cost.
Yeah, I dig it.
I kind of wish there was more documentation.
If there is, then I would go find it.
Or maybe a video or something like that.
Because that's kind of cool. I like this demo because for those who just want to tinker without having to
spin it up in the EC2 or just whatever,
you know,
go the extra mile.
I love that you can just sort of do this on your own,
but I had no idea the playground version was there or the agent version was
there to go a little further.
And there's some room for you to make some content around that, to give people more guidance. And you should do that.
Yeah, totally.
A lot of people have found a lot of joy in the Playground and the demo, because they're just cool.
We also have a serverless
version of the product that basically just gives
you a URL that you can
connect to over the internet
to fulfill a similar purpose, basically for people, if they want to try it out without actually doing anything locally on their machine.
I think we give new accounts like $400 of credit
when they sign up. So you can do a lot with that if you just want to play around without actually starting the infrastructure.
And I guess while I'm on your home page perusing, just under this demo that is so cool, there is a mention of plug and play. Part of your angst, I suppose, to get to where you're at was, let's rethink what this meant in a modern time, which is what you've done, but then also to be a drop-in swap. So one thing it says is there's no need to rewrite your application to use a proprietary SDK; you just literally change a URL. How did you get there? It's fine to not want to contribute to Kafka and make your own way, and I'm totally cool with that, and with WarpStream reinventing or rethinking this model. But how do you get to this point where you're like, let's make this as frictionless as possible
to focus on the DX of what it might actually be like
to say, okay, well, if this is,
like Jared said earlier, that subset of folks
that maybe they're not doing credit card transactions
and fraud detection where that needs to be
literally real time, where the latency
cannot be absorbed.
In a scenario where it can be absorbed
and it's a large population of Kafka users to say,
listen, we're here and this is how easy it is to swap.
How did you get to that design, that idea?
We got there by just talking to people, basically.
The number of developers out there who are using Kafka,
it's really high and we talked to a lot of them. And
when we asked them, basically, what do you not like about Kafka, they would give us a bunch of different answers. But when we would ask them, if we could fix those problems for you, would you want to do that, and it would involve essentially rewriting large parts of your application?
It's a non-starter for people.
And there are a bunch of other things out there in the world that integrate with Kafka, like Spark and Flink.
And there's a bazillion open source tools out there that integrate with Kafka.
We have no influence on any of those things either, really.
So it was kind of a choice that was forced upon us.
There's really no way.
Kafka has so much momentum behind it that it's pretty much impossible to get broad adoption of something that would be a replacement for it without having the exact same wire protocol.
So you can use the exact same clients and stuff like that.
It's a lot of work to maintain that compatibility.
Thankfully, a lot of that work is front loaded.
It's just you do it once and Kafka is not a particularly fast moving open source project.
So they're not changing the protocol every day.
The backwards compatibility is very good with Kafka. So thankfully it was mostly a one-time cost, but being compatible has opened up a lot of opportunities, even just doing basic stuff for the company, like being able to do co-marketing with other vendors of products that are compatible with Kafka.
If we weren't compatible with Kafka, you know, we wouldn't be able to do that. And a lot of the open source tools that we would want to integrate with, like let's say
the OpenTelemetry Collector or Vector, these kind of observability agent tools, they all
can write data to Kafka.
And we inherit that benefit right out of the box.
So it's been super important for us basically to have that compatibility. And do you think that, I know
you're sort of young-ish, but do you think that, I suppose, how are you winning? Are you winning
the market? That's what I'm trying to get to is like, are you truly absorbing a lot of the Kafka
user base? Is there, is there a major demand for WarpStream?
What's the state of product market fit
and are you winning?
Yeah, so we have a number of large use cases
in production today.
I can't talk about very many of them, unfortunately,
but there are WarpStream clusters out in the world
processing multiple gigabytes a second of
traffic through, and not just like one of them, like there's, there's a decent number of them
at this point. And where we're having success in the market is basically the large open source
users who are, you know, they feel like the open source project is a bit too challenging for them to run.
And there's budget pressure all over the industry today, especially in the, you know, in the corners that we're interested in, like in the observability and security and analytics side, there's a lot of budget pressure.
So we're a pretty natural fit for those folks who are both tired of running the open source project and they're getting budget pressure to decrease their cost. We're having a lot of success there. What about
Greenfield? Is there anybody that's like, okay, we need to adopt Kafka or something like it, but
what is out there before we go and write a lot of code or flesh out our infrastructure model or
make any plans.
What about those that are not migrating?
What's the path, I suppose?
What's the inbound of those folks and what's the path to like the DX?
Because one of the things you mentioned is that you solve a few problems.
You solve cloud economics, you solve operational overhead. And one thing that you mentioned, at least the article that was from last year,
was a major problem with Kafka, which was developer user experience.
And that's what I'm trying to get to there. Those were coming on green, brand new.
What is that user experience like and what is the path like for them?
I think that for Greenfield products, there's
two different branches of those. There's Greenfield products that
are only greenfield in
the sense that they're trying to adopt Kafka for some goal. They're not greenfield like the
application didn't exist before. There's that aspect of it where they're just new users of
Kafka. And then there are truly greenfield projects where the project itself is new.
And also the choice to choose Kafka is new. And usually those products don't have a super high volume of data.
It's the existing initiatives or applications within a company that process a lot of data
but are not using Kafka for cost reasons where we are having more success.
There's a product that I would love to talk about that won't quite
be public by the time this episode is posted. But they're in that first category, where it's a large existing workload, but they were not using Kafka for a bunch of different reasons, cost being one of them. And they're now a big WarpStream customer
because they saw that there are benefits
to using Kafka for their application,
but they just couldn't use the open source project
for cost reasons.
And now essentially they can.
And there's a lot of cool stuff that they can do now
that they couldn't do before that Kafka enabled them to do.
And WarpStream is
their Kafka-compatible
product of choice for
those cost reasons.
They're starting to get some benefits from it now.
So I guess the obvious question
to me at this point is
Kafka is
not dead. It is alive.
It is open source.
WarpStream is not open source. To my knowledge, I don't think it is.
Was there a conversation about licensing?
Was there a conversation about
being a commercial open source company?
Just to follow in the footsteps of the predecessors
that you at least, from a conceptual standpoint,
copied and improved upon, right? You stood on the shoulders of giants. Where are you at with that? What have you thought about in terms of licensing and open source, and what's y'all's stance on open source as your core or not?
Yeah, so we had a lot of back and forth initially when we were thinking about this specific issue.
And the conclusion that we came to is that we wanted to avoid the kind of dishonest move of the way a lot of commercial open source projects have evolved in the last five years, in terms of either relicensing or changing the focus of the project drastically to benefit the primary commercial backer. And we just didn't think that it was worth it. We're providing a lot of value by providing a solution that is dramatically lower cost
and also compatible with the existing ecosystem. And the way that that works in practice means that you can switch away from WarpStream
because you're not locked into it from an application perspective or a protocol perspective.
So we're not locking you into something proprietary from an interface perspective.
So it's actually relatively easy to switch away from WarpStream if you decided to in the future
because you didn't like something that we did. But we're hopeful that the fact
that we provide something
that's dramatically lower cost
and easier to use
means that you won't switch away
and you'll continue to have
the best of both worlds, so to speak,
where there is an open source thing out there
that obviously is going to continue to exist
because it has a ton of users.
But if you want to use our product to save money
and have something easier to use, you can as well.
And we will be able to continue to invest
in making that product better and better over time
because we are not stuck in these kind of middle-of-the-road
outcome issues that a lot of commercial open-source companies have
where they're forced a few years down the line
to cash in all of their brand goodwill on a relicense
in order to gain that commercial success that they wanted.
We're hoping to be able to, by sticking to this model,
we're hopeful that we'll be able to be a good citizen
of the Kafka ecosystem in terms of making a product
that's not incompatible and proprietary and steering everybody away.
And we do put a lot of effort into testing clients.
We find bugs in Kafka clients that are typically open source and make improvements there.
But the core part of the product is not going to be open source.
What's interesting about those re-licenses is that they all were
commercially successful companies, even at the time of the re-license.
They had arrived. And at a certain
size and scale, it seems that the growth curve has to continue to go vertical
to satisfy investors, to satisfy public demand
in the case of Hashi.
But I'm sure, I don't actually know the state of Redis Labs
or the commercial success or not of Redis,
but many of them were large, successful commercial companies,
bigger than most companies ever get
before they actually went ahead and did that not cool rug pull.
But I wonder if the pressures on them,
because it's other people's money,
similar in your situation,
like you have VC behind you.
And I'm just curious about that decision
from your guys' perspective.
Because you're a small team,
probably well-funded in terms of
you guys are highly successful software people, so you're probably making good money.
Runway well into the next decade.
Yeah, so why not bootstrap?
Why not bootstrap and then not have any of that VC pressure that you currently have?
That's a really good question.
And, to take a step back from that question for a second, let me talk about the commercial open source stuff.
This is obviously a little bit inside baseball, but as a part of going through that decision process, we talked to the founders of a lot of commercial open source companies.
And we asked them, let's say you were starting our company today, what would you do? And without hesitation,
the answer we got was,
I would not start it
as a commercial open source company today.
And there are a lot of different reasons
that they gave for that.
And I can't really give some of those reasons
without potentially identifying
who those people are.
And I don't want to do that.
But the challenges of a commercial
open source company today
are many. It's not even just the hyperscaler cloud providers taking your stuff and running it anymore. That's obviously a concern, but you can get around that; the AGPL does a decent job of preventing some flavors of it.
The other issue is that the competition within the category they're building their product in is extremely high. With your source code out there in the wild, letting everybody know your secrets about how you made your product better, you lose a lot of the juice behind why you have these huge staffs of developers working on interesting things. It's not to say you can't protect that in other ways, like with software patents, but the appetite for software patents isn't there; it would do a lot of damage to brand reputation if these commercial open source companies created a bunch of software patents and started enforcing them against each other, for example. It's a very challenging situation today. A lot of the companies that you might view as successful commercial open source projects might be successful in the iteration they exist in today, or, yesterday, in the case of the recent relicenses, where they have good adoption in the developer community and good success in the VC-funded startup segment of the world. But there is an inevitable push to go upmarket and go after larger and larger customers, because it's effectively the only way to support growth.
If your customers are all small startups,
even medium-sized startups,
and developers playing around
in their personal
capacity
or stuff like that,
the revenue opportunity
is just really small, unfortunately,
for a lot of these businesses.
It's much easier to sell a million-dollar-a-year contract to an enterprise than it is to get a million dollars of revenue out of a bunch of small and medium-sized businesses.
So the temptation when the growth starts to slow down is, I need to go do that now. That's the first thing your investors are going to tell you: you need to go upmarket and get enterprise customers.
If the product that you're selling them is support or a couple of features on top of an open source project, your ability to exert pricing pressure on that enterprise buyer, to get them to pay a higher price or to get them to pay at all, is limited. A lot of these open source projects spent so much time making it good that the enterprise can just hire one person to maintain it internally, move on with their life, run the open source forever, and maybe pay you a peanuts support contract, essentially not enough to actually support the business. It's just really hard. I completely understand where you're coming from, and that it might have felt as if these companies were successful from the outside. Some of them definitely were, but there is that inevitable pressure to keep the growth rate up, and the only way to do that is to go upmarket.
And when you're going upmarket,
you need to provide something that looks valuable.
And if your project is open source
and the alternative is hiring a developer
or two to maintain it internally,
you kind of have a cap on how much you can charge.
And it's the same thing if you're offering
a cloud version of an open source project, for example.
The premium someone will pay for your cloud version, it may be lower than you expect if they can self-host.
Because they're always looking at that.
They're looking at both sides of the coin.
How much will it cost me to self-host this versus how much does it cost to use your cloud-hosted version?
And that calculus does not always come out in your favor as a
vendor. And you may want to charge, you may have to charge significantly more to make the numbers
work on your side than what they think they can run it for internally. It's really challenging
stuff. We wanted to provide the best product possible with the best product experience possible, and we didn't feel like the shape of a commercial open source company was the right way to do it without a lot of these distractions I'm talking about coming up along the way. And we didn't feel like it would be right to do the bait and switch thing that people are doing these days. We wanted to be honest, basically, from day one.
That makes sense to some degree. I don't fully agree with all of your sentiment, though that's a very deep and lengthy conversation that doesn't necessarily fit here. But even given that I don't fully agree with all of your reasons, the one thing I think you've done well, or I suppose the most positive thing, is you've made it easy to get in and get out.
So if for some reason WarpStream is of great benefit,
and let's just say a year down the road,
somebody does WarpNotStream,
and it's commercially open source,
and they eat your lunch,
because they decided to be open source first, and they can get into that just as easily as they can get out of you.
Then that's a whole different story.
I'm not suggesting that's going to happen,
but it's possible.
It's totally possible.
Yeah.
And you're exactly right. If one of our competitors came up with a better implementation tomorrow and it was...
The exact implementation.
They can literally copy everything you do, and the world would be okay with that because they made it open source.
That's a version or at least a subset of a conversation we had at length on this podcast
a few weeks back with JJ, Joseph Jaxx.
He was like, yeah, I'm totally cool with somebody,
a founder going out there and literally copying X
and saying this is now X as open source.
He was totally cool with that.
I'm not saying that makes sense completely to me too,
but the world now believes that's an okay thing.
And it's an okay thing because at the core,
it is meant to be an open source commons good
Yeah, I would not harbor any ill will towards someone who decided to do that. I would be like, come on, man, don't do that.
Well, someone's gonna do it. Yeah. I mean, as you guys have success now, whether or not they can actually pull it off is the question, right? But there will be at some point, as WarpStream continues to grow, a Hacker News number one story:
X is like WarpStream, only it's open source and self-hosted.
And it'll get 500 to 1,000
and maybe it gets adoption,
maybe it doesn't.
Maybe by then you guys are
so far ahead it doesn't matter.
There's tons of what ifs,
but like it will happen
from somewhere in the world
if you're successful.
And the reason why that doesn't bother me so much, basically, is that the portion of the Kafka market that has been commercialized, meaning somebody is paying a licensing fee or some other fee to use the product, not just hiring somebody to run it for them, is very small. So there's so much greenfield market out there for us to commercialize, along with this constant, ever-increasing trend of things becoming more real time, and these other tailwinds of more observability and security data being generated in the world. This market is just going to be so big in the future that I think it's unlikely to have a winner-takes-all dynamic, similar to the way that there are multiple large public cloud hyperscalers that exist and are very profitable. There's just so much of this market out there that we're not super concerned about any particular competitor.
Even if one were open source,
there's a lot of other dimensions
that we would hopefully be better at competing on
that you don't get out of just the fact
that the product is open source.
That combined with the fact that the market is so huge
that we're pretty happy with our position as it is today.
Hey, friends, I'm here with Brandon Fu, co-founder and CEO of Paragon.
Paragon lets B2B SaaS companies ship native integrations to production in days with more than 130 pre-built connectors, or configure your own custom integrations. So, Brandon, talk to me about the friction developers feel with integrations, SSO, dealing with rate limits, retries, auth, all the things.
Yeah, so there's a lot of aspects to the different problems that you have to solve in the
integration story in building these integrations and also providing them in a user-friendly way
for your customers to self-serve and onboard and consume those integrations. So part of what the Paragon SDK provides is that embedded user experience, what we call our Connect Portal. That's going to provide the authentication for your users to connect their accounts; that's going to be the initial onboarding. But in addition to that, your users may also want
to configure different options or settings for their integrations. A common example that we see
for Salesforce or for CRM integrations in general is that your users may want to select some type of
custom object mapping. Every CRM can be configured differently, so your users might want to map
objects to some different type of record in their Salesforce
or different fields in their Salesforce.
And typically, that's what developers would have to build on their own,
is this UI for your users to configure these different settings for every single integration.
That's also going to be what's provided by the Paragon SDK,
is not just that initial onboarding and authentication experience, but also the configuration end user UX
for different settings like custom field mapping,
selecting which types of features on your integration
that your user might want to configure.
And that's also going to be provided fully out of the box
by Paragon SDK.
With integrations, different APIs
might have different rate limits.
They might have different policies
that you have to conform with,
and your developers typically have to learn
these different nuances for every API
and write code individually
to conform to those different nuances.
With Paragon, because we build and maintain the connector
with each of the integrations that we support in our catalog,
we're automatically gonna handle for things like retries,
things like rate limits.
And so we look at this as sort of the backend
or infrastructure layer of the integration problem
that we have spent the last five years
essentially building and optimizing
the Paragon infrastructure to act
as the integration infrastructure for your application.
Okay, Paragon is built for product management.
It's built for engineering.
It's built for everybody.
Ship hundreds of native integrations into your SaaS application in days.
Or build your own custom connector with any API.
Learn more at useparagon.com slash changelog.
Again, useparagon.com slash changelog. That's U-S-E-P-A-R-A-G-O-N dot com slash changelog.
And I'm also here with Dennis Pilarinos, founder and CEO of Unblocked.
Check him out at getunblocked.com.
It's for all the hows, whys, and WTFs.
Unblocked helps developers to find the answers they need to get their jobs done.
So Dennis, you know we speak to developers.
Who is Unblocked best for?
Who needs to use it?
I think if you are a team that works with a lot of coworkers,
if you have like 40, 50, 60, 100, 200, 500 coworkers, engineers,
and you're working on a code base that's old and large,
I think Unblocked is going to be a tool that you're going to love. Typically, the way that works is you can try it
with one of your side projects. But the best outcomes are when you get comfortable with the
security requirements that we have. You connect your source code, you connect a form of documentation,
be that Slack or Notion or Confluence. And when you get those two systems
together, it will blow your mind. Actually, every single person that I've seen on board with the
product does the same thing. They always ask a question that they're an expert in. They want to
get a sense for how good is this thing? So I'm going to ask a question that I know the answer to
and people are generally blown away by the caliber of the response. And that starts to build a relationship of trust where they're like, no,
this thing actually can give me the answer that I'm looking for. And instead of interrupting a
coworker or spending 30 minutes in a meeting, I can just ask a question, get the response in a
few seconds and reclaim that time. The next step to get unblocked for you and your team is to go to getunblocked.com.
Yourself, your team can now find the answer they need
to get their jobs done
and not have to bother anyone else on the team,
take a meeting or waste any time whatsoever.
Again, getunblocked.com.
That's G-E-T-U-N-B-L-O-C-K-E-D.com.
And get unblocked.
So let's go back to bootstrapping then.
It seems like the kind of thing you could bootstrap.
I mean, it's just you and Richie
coding it up on nights and weekends, you know?
Get it rocking and rolling.
Keep all that equity.
No one to answer to.
You're going to get customers pretty quick.
Then you can start hiring based off of your customers.
Why that decision to raise?
So let me put it up front: the right reason to raise money is that you want to go faster. That's basically why someone should raise venture capital: they have something that's working and they want it to go faster. My co-founder and I had so much conviction in what we were doing, in terms of it being commercially successful, that we knew on day one we would be able to go much faster if we raised money.
So that's why we did it.
There was never a period of time where we were guessing, like, oh, do people need this? It was very obvious to us from day one that we wanted to go as fast as possible, and raising money is the way to do that, because we were able to hire, relative to the two of us, many more people, and pay them very well and make them happy. Hiring people who are good at distributed systems stuff is very expensive. Those types of people also really appreciate job security, so being able to have a bunch of cash in the bank, even if we're not spending it, is very important to those folks.
So our internal stakeholders, you know, as employees and founders and stuff,
it makes it very comfortable to have that cushion
and allows us to hire people
that will make things go faster.
And then on the complete other side of the coin,
if you want to sell products to enterprise buyers
as two people without having raised any money, it's going to
raise a lot of eyebrows if they want to put that in production as the backbone of their multi-billion
dollar business. That makes a lot of sense. It's really hard. Yeah. Whereas if we can walk into a
meeting and say, hey, we've raised roughly $20 million from Greylock and Amplify Partners, who are our Series A and seed
investors, respectively, that sidesteps a lot of really awkward conversations about like,
what's going to happen to you founders if you get hit by a bus tomorrow or something?
Obviously, that'll be very bad for the company, but there is at least somebody else who cares and would like to continue to hopefully see their
investment succeed. So the dilution stuff is really, obviously it's a good point, but you
just have to think, are the odds of success higher? And will the eventual outcome be bigger
if I raise VC? And if that is true, then I think it's worth doing. But if you're in a position
where you don't know if your product is going to be commercially successful, it closes a lot of
doors to raise VC. Like every further round that you raise, it makes it harder and harder
to explore different kinds of exit opportunities that you might personally view as a success, but your venture investors may not
view as a success. So it's definitely a balancing act; you just have to understand the game you're playing, basically, and walk into it with your eyes open.
Had you played this game before?
Yes. Very briefly, a long time ago, unsuccessfully. I did, yeah.
And in between that and starting WarpStream,
my co-founder and I were considering raising money
for the thing that we were doing before we joined Datadog.
And that's how we got to know our seed investors
at Amplify Partners.
And we didn't have that conviction at the time to say, let's go raise
money. This is going to be huge. In hindsight, we probably would have done very well with that.
Had we chose to raise VC and like remain as an independent thing and all that instead of joining
data dog. But because we didn't have that conviction, we took the quote unquote exit
opportunities that were available to us at that moment because we hadn't yet raised money. We're
very flexible. So we were able to join Datadog and it worked out super well. We got to meet a
bunch of interesting people and the project we were going on was successful and super fun and
all that stuff. But because we did have that conviction this time around, and we wanted to go as fast as possible, that's why we chose to raise money this time.
I think your reasons are sound. I don't disagree. And I will not argue.
Good answer. I'll give it to you.
I will not argue. While we love open source, I can see how going the route of venture capital avoids, as you had said, some of the burden of open source in terms of distraction (that was your actual word). I can understand that, and that's your prerogative, right? Bobby Brown is dated in terms of an artist; nobody knows Bobby Brown anymore. But "it's my prerogative" is still a true phrase. Ryan, do you know Bobby Brown?
It's been a long time since I've heard any Bobby Brown,
but I do indeed a little bit.
I grew up on Bobby Brown, so I can't help but bring it up. It's my prerogative.
Yeah, it's my prerogative.
Yeah, great song.
You know, so it's your prerogative.
And Richie, slash Richard, is a great name, by the way: Silicon Valley. I mean, I had to bring it up. He was called Richie, and his name was Richard Hendricks, but he was called Richie by his attorney. I don't disagree with the reasoning
for your direction. I hope it works out for you. I think it seems like it's going to, but I do agree
with what Jared said, which was there is probably going to be, if you hit critical mass and enough scale, somebody who copies what you've done and simply just says, okay, literally copy and now it's open source and they'll be okay with that.
I don't think that you should operate in a state of fear of that and make choices because of it because that's free market, man.
That's going to happen.
But good on you for being able to answer these hard questions.
I think you did well on that front. I don't have any argument, really. That's all I'll say.
And that's only because we spent a lot of time thinking about it, and a lot of time talking to folks who are day-to-day building commercial open source businesses. That really brought our perspective to where it is today. And it's not to say that
there are no possible opportunities to start a commercial open source company that would
be successful today. There obviously are. It's just that for our particular market and the
strategy that we were pursuing, it just wasn't going to be, I think I can put it a little bit
more crisply. The segment of the market that we're going after is already price and cost sensitive. If we offered them the opportunity
to run our product for free, the odds that we will be able to charge them almost any money
would be pretty low. There are other markets out there that have completely different dynamics in this, especially if you're not trying to provide the low cost solution.
So I didn't mean to denigrate commercial open source companies.
I was just saying that when we explained our strategy, basically, to these other commercial open source founders, they said, that's going to be hard.
It's going to be very hard for you.
So you should think about it before you choose to go down that path. And we chose this path
because we think it's most likely to be successful for us while also, I would be personally very
upset if I had to do one of those license change rug pulls. It would make me very sad because I know it
causes a lot of consternation and heartburn for people
when those things happen. So we just wanted to be straight up with
people from day one.
I also think that you are a particularly easy target for the hyperscalers to reclone
and host and offer because of the nature of what you're doing.
Yeah, I mean it's a general purpose infrastructure building block
and Amazon has a product.
Amazon has MSK as a competing product with WarpStream, so they very directly could just offer a new SKU of MSK that is the WarpStream one, if it were open source, right? That would be very challenging for us.
Ride your coattails. Are there other competitors out there? Are there other people, as you said, that are putting Kafka on object storage?
Yeah, I mean, there are a number of companies out there that have talked about how they're doing this.
I think the most notable of them would probably be Confluent's announcement, where they're taking a similar direct-to-S3 approach as WarpStream does. The product isn't available today for anybody to just go sign up for and do a comparison, but they've made an announcement, and I'm sure that's going to progress more in the future. I'm sure essentially every one of our competitors, if they haven't started working on a similar storage engine already, will. So I have no doubts that the cat is out of the bag, so to speak, on the idea.
Well, that does make sense then, why you went venture capital: so that you can go fast. I think from a visual standpoint, and you've done well from a brand standpoint, your marketing site is pretty awesome.
I mean, there's obviously always room for improvement, but it's pretty solid. I do want to bring up the idea of pricing because I don't disagree there either.
There's large corporations, enterprises, so to speak, Fortune 500s, that if you're not charging them $10,000, $20,000 a year, they're like, what's wrong with you?
We can't use you.
We literally need to give you a lot of money to trust you.
And that's just the nature of the beast there. But when you land on your page for pricing out the gate,
the TCO, the total cost of ownership, at least with the default numbers that are put there, is $2,295 per month.
So you're not even scaring people away.
I mean, you're literally putting your fist in their face and saying, like, it costs a lot, y'all.
Yeah, but that's the cheap version.
These people are probably used to paying more than that, right?
Yeah, I mean,
there's a little slider
that lets you turn on the breakdown mode
of the comparison
to open source Kafka
running in three AZs or one AZ
or comparing to AWS
MSK.
And we didn't even put a particularly big workload
as the default on the pricing calculator.
I think it's a pretty standard workload.
And people are used to looking at big numbers when it comes to running Kafka for these kinds of observability and telemetry workloads.
They just cost a lot.
If you look a little bit further down the pipeline there,
if they're sending the data to Elasticsearch or Snowflake or Clickhouse,
they're probably paying significantly more for those things.
So Kafka looks cheap in comparison
and then WarpStream looks cheap compared to Kafka.
So we're very open about the fact that our product is designed to be more cost effective.
But we do offer additional tiers, we call them account tiers, basically, because the thing that enterprises want from you is to be able to file a support ticket and have somebody reply to it extremely quickly. That's the thing that they're basically paying you for. That's the stuff that doesn't scale: as you get bigger, or your product gets better, obviously you might have fewer support tickets, but you still need humans to be able to respond quickly when somebody does file those support tickets.
So our account tiers for pro and enterprise give customers a support response time SLA that they can count on, which today is backed by the engineering team.
If you're an enterprise customer and you file a priority-zero support ticket, which is just like, my production cluster is down, I need help right away, that pages the engineering on-call rotation and gets you help as quickly as somebody can respond to the page.
So that's the type of stuff that people would be paying for basically on top.
And that's how we make enterprises trust us.
Another reason to raise venture capital: you need to hire people so you can have a 24/7, follow-the-sun on-call rotation in order to back those support response time SLAs.
So if you needed five gigabit write throughput,
which I imagine is quite high,
but let's say that you do,
14-day retention,
so that's two weeks retention.
Not that much.
We're talking 97 grand per month going to WarpStream
and $1.76 million a month using Kafka?
These are numbers that blow my little mind.
Sorry, I didn't hear your throughput number at first.
It was the highest.
It was five gigabits.
Five gigabits.
Yeah.
Yeah.
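To put that write rate in perspective, here's a quick back-of-the-envelope sketch (treating the 5 gigabit/s and 14-day figures quoted above as assumptions) of how much data that workload keeps in retention:

```python
# Back-of-the-envelope: data retained by a Kafka-style workload.
# Assumed figures from the conversation: 5 gigabit/s writes, 14-day retention.
GIGABITS_PER_SEC = 5
RETENTION_DAYS = 14
SECONDS_PER_DAY = 86_400

gb_per_sec = GIGABITS_PER_SEC / 8            # 8 bits per byte -> 0.625 GB/s
gb_per_day = gb_per_sec * SECONDS_PER_DAY    # GB written per day
total_tb = gb_per_day * RETENTION_DAYS / 1_000

print(f"{gb_per_day:,.0f} GB/day, {total_tb:,.0f} TB retained")
# prints: 54,000 GB/day, 756 TB retained
```

That's roughly three quarters of a petabyte sitting in retention, before any replication, which helps explain why the monthly figures above get so large.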
I mean, obviously, as you get up into these larger and larger scales... well, first of all, I'll say 14 days is pretty long retention for most people for Kafka. Usually, because it's transitory, I'd say three to seven days is a pretty typical one.
And if you're at these kinds of scales,
you're probably not paying your cloud provider retail price
for cross-AZ networking anymore.
If Kafka was a big part of your bill,
that would be probably one of the items
that you would want to negotiate with your cloud provider.
So the comparison doesn't get nearly as rosy if you've negotiated some discounts.
But the way you can kind of estimate what those would be is if you switch it from Kafka three-AZ to Kafka one-AZ, which will reduce the inter-zone networking dramatically, and turn on the single-zone consumers flag. So the comparison doesn't look quite as good anymore.
Still 10x. Still looking pretty good.
Yeah.
Then you turn it to one-day retention.
Turn it to one-day retention. And then it goes to 86% savings versus 60% savings.
So it's still big,
but we understand that there are a lot of
big Kafka workloads out there.
The savings don't always come out at 90% like that example does, but if we can deliver 75% to 80% savings, it's a compelling enough reason. There's a little bit of activation energy it takes to get people to do anything, and we're confident that being 75% to 80% cheaper is enough of that activation energy to get people to at least give us a shot.
I want to point out that these are just dollars too.
This is not developer friction or operational burden
or enhanced developer experience,
which are the hallmark of any conversation today
with dev tools, right?
Like you could be a 13-year-old tool like Kafka and get away with a lot, and I have no idea.
So no skin in the game.
I've never used Kafka personally.
So if there's some haters out there,
those marginal haters I mentioned earlier,
don't hate on me.
But there may be some warts and blemishes and burdens within the Kafka ecosystem that just make it challenging to operate and stand up. Obviously there's cost, we've already talked about that literally at length, but I think there's something to be said about a modern take, given today's cloud infrastructure, with some of the developer user experience attributes I've seen you already put in place.
So cost is one thing,
but then happy developers is retained developers,
morale boosts, you know, maybe freedom on weekends,
less pager duty, you know,
less whatever from anybody
who might be competing with pager duty.
That's a good thing.
Yeah, at WarpStream, we know that's a very important part of what we do. But it's always easier to walk into a sales conversation with the hard-facts numbers. A lot of vendors use those exact attributes to attribute a lot of savings to their product, which is probably true, but they feel a little bit more wishy-washy compared to the hard-facts numbers. So that's why we lead with those in our pricing calculator.
And obviously those are still things that we highlight when we're talking to potential
customers to help them understand the value of the product.
But we like to think of that as more like the icing on the cake stuff.
And the cost savings is what we're promising them, basically.
Everything else is just icing on the cake.
Icing on the cake.
What's a good next step?
I mean, I feel like we've really just gone through all of it, Jared.
You got anything else?
I think we have.
We've covered it all, man.
I think we've covered every ounce of Warpstream.
Ryan, thank you for being patient with our questions
and going through everything and filling in all the blanks too.
I think you did a great job with this conversation.
I'm happy.
I'm impressed.
I think there's a lot of things I can see as quality in you as a person
and also the thing that you're trying to do.
I think you guys have led with some wisdom.
I like a lot that you went out and talked to folks
rather than just shooting from the hip, so to speak,
with your choices and letting it be opinion-based.
You seem to have leaned into the wisdom of those who have come before you
with your particular target market,
which I think is key to your choices. And so I'm stoked that you were able to answer the questions
we asked. So thank you.
Yeah, this has been very fun. I was not expecting to talk about raising money at all during this conversation, but that was something we spent a lot of time on. When you're building a company, you have to spend a lot of time thinking about strategic stuff that's not just writing code. That one was a lot of back and forth between my co-founder and me about how we were going to do things, and we're very happy with our direction now.
But it took the input of a lot of people to arrive at this conclusion.
And we're very thankful for those people that made themselves available for learning more about commercial open source stuff, because we had never really even considered it before, and it was super important to learn along the way.
Very cool. Well, warpstream.com is where you can go. We'll obviously put links in the show notes. Ryan, thank you. It's been awesome.
Thanks, man. Thanks.
Okay, so WarpStream seems to be what Kafka would look like if it was redesigned from the ground up to run in modern cloud environments.
They chose not to open source, and I think Ryan had a pretty solid argument for why not.
But time will tell if an open source copycat comes along to sniff out their lunch and eat it.
Until then, good for them for putting in the work to gain the conviction they have for
their choices and their position.
Later this week, our game show Pound to Find is back on Changelog & Friends, and it was
epic.
This is the closest I've come to winning, and I was still pretty far off, and that's
this Friday.
Okay, big thanks to our sponsors this week. Speakeasy, love them. New domain: speakeasy.com. Also our friends over at Supabase, celebrating launch week number 12, supabase.com. And our friends over at Paragon, all the B2B SaaS integrations
you want in a single platform, useparagon.com. And also our friends over at Unblocked for all those whys, hows, and WTFs.
Check them out, getunblocked.com.
And of course, to our partners over at fly.io.
That is the home of changelog.com.
Check them out, fly.io.
And to the beatmaster in residence,
Breakmaster Cylinder. Bringing those beats. Okay, that's it. This show's done. We'll see you on
Friday. Game on.