The Data Stack Show - 35: The Future of Development is Distributed with Jim Walker of Cockroach Labs

Starting point is 00:00:00 The Data Stack Show is brought to you by Rudderstack, the complete customer data pipeline solution. Thanks for joining the show today. Welcome back to the Data Stack Show. We have a really exciting guest from a company that Costas and I have talked about a ton, which is Cockroach Labs. And Jim Walker, who is from their product marketing team, is going to join us on the show today. My burning question, Costas, is, and this is not going to be a surprise to people who've been following along for a while, Jim was a developer before he got into marketing. And so we've had

Starting point is 00:00:42 several people on the show who have sort of crossed lines between sort of marketing and engineering and technical roles. And so, of course, I want to ask him what lessons he's brought from engineering into marketing. What are you going to ask him? Oh, I have plenty of questions. CockroachDB is a very interesting

Starting point is 00:01:00 piece of technology out there. They have done amazing innovation. It's one of these companies that they are really like on the borderline between doing research and at the same time productize it. So there are many questions around the database systems, distributed systems, and of course, what the vision of the product is in the company. So yeah, I'm super excited to chat with Jim today. Great. Let's dive in. Jim, thank you so much for joining us. Kostas may have mentioned this in your chat before the show, but he and I actually talk about Cockroach Labs a lot,

Starting point is 00:01:38 just because we admire so many things that the company is doing. And so a real privilege to have you on the show. Thanks for joining us. Well, thanks a lot. I'm fortunate and privileged to be an employee with a good group of people. So I hope I represent them well here. So I'm happy that that's what you guys think about us. It's a fun place. That seems like it. Well, why don't we start with, I think a lot of our listeners are probably familiar with what the company does, but we'd love to get to know you a little bit better. And you have a background as a developer, but now work in marketing. And that's absolutely something I'm going to ask you a lot about, but we'd just love to hear a little bit about your personal history. And then for those listeners who might not be familiar with Cockroach, can you

Starting point is 00:02:19 just tell us a little bit about what the company does and what you provide? Sure. Yeah, that's a lot. And I'll try to be somewhat brief so we can get into some technical stuff and some other concepts. But yeah, I mean, you know, y'all, I started as a developer. I mean, I coded at the age of 11. It was, you know, the early eighties, and I had a TI-99 and a Commodore 64. I was just always into computers. I always had a kind of creative side of my life as well, but I kind of landed in electrical engineering, computer science in undergrad, graduated. I loved it.

Starting point is 00:02:51 And I ended up being a programmer and I coded for seven years professionally in a language called Smalltalk, which I will argue is still the most elegant and beautiful language ever created. And I was in C++ and C, and, you know, I was a developer and I was working as a consultant. And every time a salesperson opened up their mouth, we had like, you know, two months

Starting point is 00:03:11 of scope to the project. And it just frustrated me every time. And so, you know, I ended up being the person that they would put in front of people to actually explain what was going on. And I loved it. And I loved kind of taking what, you know, developers and what we were building and explaining that to people when they got it, that aha, when they understood it was just, it was the juice, man. I loved it. And so, you know, I naturally kind of gravitated towards product marketing because, well, as a developer, I mean, I loved it, but I was a hack. I mean, I was good, but like, you know, I was, you know, I was managing teams and stuff. So it was just a real natural fit for me to kind of move into product marketing explicitly because I always feel that it's my job to be translator of, you know, deep technical concepts into English so that people can actually understand these things. And I love it. And, and, you know, when, when something clicks and something works really well, there's nothing like it now, you know, I've been kind of in startups been in startups just, oh gosh, I think this is my 10th or something like this. All of them, except for one has been successful. It has been in security.

Starting point is 00:04:11 I was doing master data management, which must be familiar to you guys, about customer information. I was at a company called Initiate. I've been in open source for a long time. I was at a company called Talend doing data integration. I started their MDM project there and moved them into big data. I was at Hortonworks very early and helped define the Hadoop space. From there, I was at another little marketing company. And then I landed at CoreOS, which was a real special company that really kind of innovated in some different ways and really built kind of the foundation of a lot of things that are happening, I think, right now in infrastructure. And that was a joy. And I landed at Cockroach really about two and a half years ago. Cockroach Labs, we're the creators of CockroachDB. This is a brand new approach to building a database. We've architected the database from the ground

Starting point is 00:04:59 up to be distributed. It's basically taking all this distributed systems and distributed thinking stuff and applying it to the database, so that we have a database that's kind of prepped and ready for, you know, modern applications, as we kind of move quickly into this next generation of distributed systems and cloud-based applications and whatnot. There's a lot there. I mean, I'm sure we'll talk a little bit more about Cockroach, but that's kind of a quick overview. Is that sufficient, Eric? That was sufficient and efficient. Amazing job hearing about your background. And one quick note, the image for Smalltalk from the Smalltalk book on the Smalltalk Wikipedia page is awesome.

Starting point is 00:05:45 It would make a great poster. So go check that out. Everyone who's listening. That's pretty great. Very cool. Well, Costas, you have tons of questions on the technical side, and I'm really bad about stealing the mic at the beginning of the program. So I'm going to hand it over to you and let you dig in.

Starting point is 00:06:00 Thank you, Eric. Thank you so much. So Jim, do you want to give us like a quick overview about CRDB, CockroachDB as a product and as a technology? The founders of this company, you know, Spencer Kimball, Peter Mattis, Ben Darnell, you know, all three of them spent a considerable amount. In fact, they all landed at Google at the same time. In fact, I think all of their employee numbers are around 300. You know, they spent a lot of time there.

Starting point is 00:06:25 Ben was responsible, he was a big part of the Reader team. Spencer and Peter actually met in college in, I think, the 90s. And they actually built out something called GIMP. A lot of people are familiar with it, the open source image manipulation tool. They're the founders of that. And gosh, that thing's still going on. So these guys have been together for a long time. But you know, it's interesting to see what's happening across the board in everything right now as

Starting point is 00:06:50 kind of the world kind of coalesces around a lot of the innovation that happened at Google in the 2000s and 2010s. And, you know, led by kind of Jeff Dean and Sanjay Ghemawat and, you know, Eric Brewer and some of the kind of leading minds over there. But, you know, just look at the things that have come out of that stretch of time. It's amazing. So, you know, Spencer and Peter kind of front row and Ben as well, front row looking at all this stuff. And, you know, when they eventually left, they were off building a startup. They were, I think they were building a photo sharing startup and they were frustrated because they didn't have a Spanner-like database, right? They didn't have a comparable version of Bigtable, right? Kind of able to use those things,

Starting point is 00:07:35 but they were frustrated. And so it was really kind of out of frustration that they ended up kind of starting to build Cockroach Database. And honestly, the name is really kind of after the resilient nature of our database, you can't kill it. But I think Spencer has a little bit of a dark humor. So I think that's where the name came from. We love it. Honestly, you know, love it or hate it, people remember it. That's for sure. So, yeah. So they basically took the Spanner white paper, which you can go check out on the, you know, the Google publication site, and built an open source version of that. They built something that wasn't going to be dependent on explicit hardware to do certain things. And they built a database that was massively distributed, but built to scale very easily, survive any failure, even a region, or even a Kubernetes cluster for that matter. But most importantly, being able to tie data to

Starting point is 00:08:23 a location, which is actually one of these concepts I think is not understood by a lot of people when they start thinking about distributed systems and distributed data. Distributed means, well, you have to take into consideration the physical location of things. And so, you know, typically when you prop up a database, you think about the logical data model, you know, here's my keys, here's my referential integrity. I, you know, I figure out what's going on with all my tables and whatnot. With a distributed system, you have to think about the physical nature of the database, the physical model as well. And I think that's one of those core concepts. And so being able to tie data to a location is kind of a critical piece of CockroachDB because, well, it allows us to fight latency issues, you know, put data close to users,

Starting point is 00:09:04 it allows you to survive the failure of an entire region and whatnot. But all the while, let's do this in a database that's built from the ground up, just, you know, completely new. This isn't kind of move and improve. This is, you know, brand new from bit one all the way through and make it SQL and make it, you know, wire compatible with Postgres. So it's familiar to developers and they can get running. And so, you know, we feel we're defining kind of this next generation of database for transactional workloads, and it's called Distributed SQL.

Starting point is 00:09:33 That's super interesting. Two things, actually. One is I found very interesting what you said about the location, and I want to ask you about that to give us a little bit more information around that. But before we go there, I know that one of the most important and the most, let's say, the biggest trouble that people have when they architect and they build distributed systems is time, right? And how you deal with the dimension

Starting point is 00:09:55 of time. And I know that there's a lot of innovation around that that CRDB has done. And there are differences compared to Spanner. I mean, you mentioned something about the specialized hardware. Do you want to give us like a little bit more information around that? Because I think it's super, super interesting. Yeah, it's actually, it's so, I mean, it's way deep in the technical nature of our product. And it's, you know, if you start thinking about transactions in a database, now, you know, can you do transactions in something like Mongo or, you know, another database? Well, you're going to end up being kind of eventually consistent. And it really comes back to how you use the algorithms that are in front of you to actually execute transactions. And so, you know, for Cockroach, we chose to be, you know, to implement serializable isolation by

Starting point is 00:10:38 default, and actually for all transactions. And serializable isolation, it just means that, you know, every transaction is going to happen kind of, you know, atomically in order, right? It's an ACID concept. You know, I think Kyle does a really good job on the jepsen.io website talking about, you know,

Starting point is 00:10:53 all the different levels of isolation that you can have in a database. And if anybody's interested, it's really cool stuff. But serializable isolation is a big deal. But in order to do that, you know, to have multi-version concurrency control,

Starting point is 00:11:04 well, the clock and the time actually become really important because that is what really, you know, demands, you know, that things are happening in order. Now, in Spanner, in the original Spanner architecture, well, Google had relied on hardware atomic clocks, right? So, you know, if you can just align every server to have the same exact time, well, that's great. Well, as you know, and everybody knows, I mean, you know, there's no such thing as, you know, true time on every single server. I mean, you kind of get there with some of these true time services and stuff, but, you know, for us, we wanted to be independent of any sort of hardware or any sort of other service. And so we basically built from the ground up. We said,

Starting point is 00:11:42 look, how can we actually, how can we do this a little bit differently? And so, actually, probably one of the most popular blog posts that we ever wrote was "Living Without Atomic Clocks." And man, that blog post on our website does a really good job of describing this more in depth. But basically what we said, we said, how can we use software to actually get the same sort of thing? And so you've got to start with, one, some sense of time.

Starting point is 00:12:04 So kind of start with something like NTP, which is the Network Time Protocol. It's been around for a long time. And then build up some logical drift around that so that, you know, servers can be off by, I think it was, you know, it was like 50 milliseconds or whatever that is. And then use gossip and Raft to actually start to understand

Starting point is 00:12:22 where all these nodes are from a time point of view and correct them as we need to. And so, you know, you can do this via hardware, but, you know, as really clever software engineers, we chose to actually solve the problem so that, you know, it can be wholly owned and in the binary itself of Cockroach, which comes back to this kind of concept of distributed systems and containerizing and everything. And I don't want to be dependent on anything that is external. I want a single binary.
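A rough sketch of the idea described here, not CockroachDB's actual implementation: if clocks are only trusted to within some maximum offset, a reader cannot tell whether a value timestamped slightly in its own future was really written before the read began. One software-only answer is to treat that window as uncertain and retry the read at a higher timestamp. The names `Version`, `read`, and `MAX_OFFSET_MS` are hypothetical, and the 50 ms figure is just the speaker's rough number.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed bound on how far apart two node clocks may be (illustrative value).
MAX_OFFSET_MS = 50


@dataclass(frozen=True)
class Version:
    """One committed version of a key in a multi-version store."""
    value: str
    ts_ms: int  # commit timestamp assigned by the writer's clock


def read(
    versions: List[Version],
    read_ts_ms: int,
    max_offset_ms: int = MAX_OFFSET_MS,
) -> Tuple[Optional[str], Optional[int]]:
    """Return (value, None) when the read is safe, or (None, retry_ts)
    when a version falls inside the reader's uncertainty window."""
    # Versions slightly "in the future" may have committed before this read
    # started on a node whose clock runs ahead, so they cannot be ignored.
    uncertain = [
        v for v in versions
        if read_ts_ms < v.ts_ms <= read_ts_ms + max_offset_ms
    ]
    if uncertain:
        # Retry at a timestamp that covers every uncertain version.
        return None, max(v.ts_ms for v in uncertain)

    visible = [v for v in versions if v.ts_ms <= read_ts_ms]
    if not visible:
        return None, None  # key did not exist at this timestamp
    return max(visible, key=lambda v: v.ts_ms).value, None


if __name__ == "__main__":
    history = [Version("a", 100), Version("b", 130)]
    print(read(history, 120))  # uncertain: version "b" is within 50 ms
    print(read(history, 130))  # safe: reads "b"
```

The design trade-off in this sketch is that a larger assumed offset tolerates worse clocks but makes retries more likely, while a smaller one does the opposite.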

Starting point is 00:12:50 And so that's kind of why we chose to do it. Now, it allows us to be deployed anywhere, right? And so that's a big, big value for what we actually did. Yeah, yeah, that's amazing. And I think if I'm not wrong, and you can correct me if I am, that the time drift actually also exists in Spanner, right? Even with the atomic clocks,

Starting point is 00:13:11 it's not like they avoid it completely. It's just like it's a very, very small drift that they consider there. Yeah, yeah, that's amazing. And I'm totally aware, and I think our audience is probably also aware of this great blog post. It's very popular. Okay, we talked about

Starting point is 00:13:27 time, but you also mentioned space, right, like location, actually. And that's something that's, okay, usually, as I said previously, when we're talking about distributed systems, we tend to focus a little bit more on the consequences of time there. So why is location important? How does this

Starting point is 00:13:43 affect CockroachDB, and how does it make it special as a product and as a technology? Yeah, well, you know, the speed of light's no joke, man, right? Like maybe one day we'll figure out how to beat it, but we ain't like, I'm not, I don't think it's going to happen in my lifetime. Maybe some quantum thing will happen, but, and you know, it's funny internally, you know, people ask us about who's our ultimate competitor and, you know, in our engineering team, our ultimate competition is the speed of light. And we do lots of things within our software to actually, you know, fight it and to work within the context of it. Right. And so when you're deploying a database and you want consistent transactions, serializable isolation is going to guarantee the data is correct.

Starting point is 00:14:21 Right. We're talking about, you know, financial transactions across a planet. Look, we're going to be really good in a single data center. There's lots of reason why people would use this in a single data center for a simple application. But when you start to kind of, you know, build something that is the next generation of database for these mission critical workloads,

Starting point is 00:14:39 you know, the stuff that's been wrapped up in mainframes for years and years, you know, having consistent transactions is really important, but having global access to this is also critically important, especially in kind of the modern, you know, world and basically businesses everywhere. But there's a problem, you know, because the jump from New York to Singapore is going to be, what, 500 milliseconds sometimes, or maybe 300. And what happens when a transaction takes, you know, two or three hops back and forth, you know, we're talking about a second or two.

Starting point is 00:15:08 It just doesn't work in certain workloads. You know, the, again, another Google thing, I always forget the guy's name, Paul Buchheit, or I forget his name. One of the guys, one of the original guys who worked on Gmail came up with this concept called the 100 millisecond rule. And the 100 millisecond rule basically states that anything that happens, you know, sub 100 milliseconds appears to be in real time to the human. Anything over, you can actually notice the lag a little bit. And so for us, it's like, how do you get all transactions to be under, you know, sub 50 milliseconds? Well, the only way you're going to do that with data wrapped around the entire world is to make sure that data is located close to that user.
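The arithmetic behind this point can be sketched in a few lines. This is a back-of-the-envelope model only: it multiplies a round-trip time by the number of round trips and compares the result with the 100 millisecond rule. The 300 and 500 ms figures are the speaker's rough estimates, and the 5 ms "nearby replica" figure is an assumption added for contrast.

```python
# Threshold from the "100 millisecond rule" mentioned in the conversation.
PERCEPTION_THRESHOLD_MS = 100


def transaction_latency_ms(rtt_ms: float, round_trips: int) -> float:
    """Network time for a transaction needing `round_trips` back-and-forth hops."""
    return rtt_ms * round_trips


def feels_instant(
    rtt_ms: float,
    round_trips: int,
    threshold_ms: float = PERCEPTION_THRESHOLD_MS,
) -> bool:
    """True when the network time alone stays under the perception threshold."""
    return transaction_latency_ms(rtt_ms, round_trips) < threshold_ms


if __name__ == "__main__":
    scenarios = [
        ("cross-planet, low estimate", 300),
        ("cross-planet, high estimate", 500),
        ("nearby replica (assumed)", 5),
    ]
    for label, rtt in scenarios:
        total = transaction_latency_ms(rtt, round_trips=3)
        print(f"{label}: {total:.0f} ms, feels instant: {feels_instant(rtt, 3)}")
```

With three round trips, the cross-planet cases land at roughly one to one and a half seconds, while the nearby case stays well under the threshold, which is the argument for keeping data close to the user.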

Starting point is 00:15:47 Now, we use Raft. Raft is a distributed consensus algorithm that many are familiar with. Anybody in distributed systems, if you don't know Raft, go check it out. I mean, gosh, it's like, yeah. Yeah, I think Raft and Paxos are like the two most commonly referred to algorithms around distributed systems. That's right. And we're using Raft to kind of place data, you know, these replicas around the world.
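As a simplified illustration of why replica placement matters under Raft: a leader can commit a write once a majority of replicas (counting itself) have acknowledged it, so the network part of commit latency is set by the slowest follower it has to wait for, not by the farthest replica. The function name and the round-trip numbers below are hypothetical, and the model ignores disk writes, processing time, and the client's own hop to the leader.

```python
from typing import List


def quorum_commit_latency_ms(follower_rtts_ms: List[float]) -> float:
    """Network time for a Raft leader to hear back from a majority.

    `follower_rtts_ms` holds the round-trip time from the leader to each
    follower. The leader's own acknowledgement is treated as free.
    """
    replicas = len(follower_rtts_ms) + 1   # followers plus the leader
    majority = replicas // 2 + 1           # votes needed to commit
    acks_needed = majority - 1             # leader already counts as one
    if acks_needed == 0:
        return 0.0
    # The commit completes when the acks_needed-th fastest follower replies.
    return float(sorted(follower_rtts_ms)[acks_needed - 1])


if __name__ == "__main__":
    # Three replicas, leader near the user with one follower close by.
    print(quorum_commit_latency_ms([10, 250]))
    # Three replicas, leader far from both followers.
    print(quorum_commit_latency_ms([250, 260]))
    # Five replicas: the leader waits for its second-fastest follower.
    print(quorum_commit_latency_ms([5, 80, 150, 250]))
```

In this model, moving the leader next to the user and keeping at least one other replica nearby is what brings the commit time down.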

Starting point is 00:16:10 And, you know, how do we actually, you know, make sure the Raft leader or the leaseholder, as we call it within Cockroach, how do we make sure that that's close to a user so that, you know, all transactions to that Raft group are going to happen very close to that user. And so, you know, we went to great lengths, basically within the way that we store data using, we're using Raft and distributed

Starting point is 00:16:28 consensus to make sure that data is going to live close to users because I, you know, I want to guarantee, you know, sub, you know, can we, can we tune our database to get, you know, sub 10 millisecond transactions to every single, you know, transaction, no matter where it's at on all the tables. Yeah. But on the other side of that is sometimes you don't, sometimes you just need data to be, you know, accessed all over the world. And so, you know, we, we chose to do this at the table level, you know, and so for each table, we're defining how data is actually persisted within, you know, the physical nature of all the nodes within Cockroach. And so it's pretty simple to do. It's, it's a pretty straightforward

Starting point is 00:17:03 process. We're actually going to simplify that a whole lot over the next couple of weeks. We'll have a release come out that really kind of breaks this down into some really kind of simple declarative kind of SQL statements that allow you to define at a table how you want data to live so that it's easily accessed or quickly accessed or how it wants to survive. So that's great. Actually, as you were talking about speed of light and location and where the data is located, I couldn't stop thinking about at the end, it ends up to physics again, right? Like it's, you have space, you have time, but at the end, you need both of them like

Starting point is 00:17:36 to solve the problem. And that's right, Kostas. And it's a really difficult problem. The team of engineers here, I mean, some of the stuff they've done is just truly remarkable. One of the things they worked on is this feature in Cockroach called parallel commits. And look, I'm just the marketing guy. I'll explain the best I can, but we actually have a published SIGMOD paper that gets into this pretty well. I think it's available on our site. But the team basically said, look, how can I actually forward

Starting point is 00:18:04 commit a transaction before, you know, and just and say, with five nines probability, that's going to commit on the second node. So if I can actually go through and look at basically, if I could commit a transaction locally, and then look at the data around that transaction and say, hey, look, I'm going to send the transaction and the picture of the data around that I'm going to send that to the second node. And if the second node takes, it looks at all the picture around it and says, yeah, everything looks the same, just forward commit and just say it's done. Like instead of doing all the transactional steps within each, you know, within that thing, just come back and say, yes, acknowledge it's going to be, it's going to be fine. That is awesome, right? Because that's

Starting point is 00:18:43 a, you know, it's the speed of light. You can't change the photons, but, you know, maybe you could change that package, what's in there, right? And so it's a different way of thinking about things and, you know, it has made huge, huge gains for us as we kind of, you know, ratchet down on latencies and continually fight the speed of light here. Yeah, yeah. I think it's more than obvious

Starting point is 00:19:04 that there's some amazing engineering behind CockroachDB. So, all right. We talked a little bit about all the amazing stuff that is happening behind the scenes and what the problem is that the Cockroach database is trying to solve. Who should be using this database?

Starting point is 00:19:20 Because, okay, we know that every engineer, I mean, you as an engineer and me as an engineer, we know that we always like to play with new toys and exciting technologies. But who should invest in building applications on top of CockroachDB today? You know, I mean, you're asking me, everybody, right? Like, I work here. So, you know, well, I mean, look, if we're going to be wire compatible with Postgres, if it's basically the same syntax, but you're going to get basically all this value of never having to manually shard a database, never having to think about setting up any sort of active-passive, resilient system. Like, I mean, why wouldn't you use this for a simple application, right? Like, yeah, sure. You could spin up RDS Postgres and get going pretty quick, you know, like in a single region, it's pretty cool. But like,

Starting point is 00:20:09 literally like the complexity of actually dealing with some of these kinds of day two operation stuff is it's, it's, it's killer. And so, you know, for us, it's really any application. Now that said, I mean, we've got some pretty world-class customers out there. And, you know, some of them that I can actually talk about, you know, DoorDash is a big customer of ours. Lush, Bose, Comcast, you know, LaunchDarkly, which is a great, you know, dev tool, right? Like, so there's a lot of really great logos. You can go to our website, we have a lot more,

Starting point is 00:20:36 but, you know, they're looking at us as kind of the next generation of database. You know, DoorDash is a really good example. You know, at the height of the pandemic a year ago, they had a couple of issues with some outages of the database that they had been using

Starting point is 00:20:49 because they had kind of a write bottleneck, right? Like, you know, if you're going to be distributed, then every node can take a read or a write. That's our theory. Well, some other databases have, like, a single write node,

Starting point is 00:20:58 but read nodes all over the planet. And, you know, that was just causing issues because they had downtime. I mean, you know, and then if you fast forward a couple of months to where they're going to IPO, they can't have a crisis. They can't have downtime. Like that's just going to have an adverse effect on, you know, what they were trying to do. And so, you know, midstream, they chose to move to Cockroach and, you know, fast forward a year and, you know,

Starting point is 00:21:21 a lot of their transactional workloads are now either moved or moving to Cockroach and Cockroach database. And then they set it up internally as a service for the new developers to use this because you're right, Costas. Developers do want to play with the latest tech and we want to take away all of the complexity, make data easy enough

Starting point is 00:21:40 so they can focus on their breakthrough application and build. And I think that's what some of these organizations, so, you know, time and time again, we keep seeing companies, you know, set us up as a service internally for them, for all net new workloads. I think we fit net new workloads really well. Like, you know, it's some of the legacy migration stuff. It just wasn't built for the distributed world. You know, when you start using stored procedures and you have all these kinds of crazy concepts in there, you kind of have to think differently. And so we fit really well in these net new workloads. I think anybody who's thinking about Kubernetes or deploying anything on

Starting point is 00:22:13 Kubernetes, forget about it. There is no other database that was built directly for Kubernetes. I mean, this is a descendant of Spanner, just as Spanner was built kind of for Borg, right? And so I think that's where we're seeing most people kind of turn to us. Yeah, you're absolutely right. I think I didn't put the question probably that well, but the point and the reason that I asked is because, you know, like most people, they have in their mind that when we're talking about distributed systems, like distributed databases, we are talking about very niche problems out there that only huge companies have to deal with, or it's about crunching

Starting point is 00:22:46 a lot of data and doing analytics. But I think it's important for all the engineers out there to understand, and for us as the vendors to communicate to them, the benefits that they can get by relying on technologies like CockroachDB in the end, regardless, I mean, of, let's say, the complexity or the size of the project that you're running. I mean, Kostas, it's such a big piece of this. And so I keep getting into these conversations over the past couple of years,

Starting point is 00:23:10 and it's like shifting to a distributed mindset is not easy. Like it took me a while to figure it out. And I think that's the thing. And it's like, it's not just operations. It's not just infrastructure, but it's the developer has to think differently. And I think that's where we're,

Starting point is 00:23:27 you know, I think we're going to be in a different world four or five years from now when this is kind of the de facto way of actually building, you know, but we're in this transition mode. And I think that's where people, you know, think like, oh, maybe it isn't for me. Well, it is. And this is the future. This is what's happening, y'all. So question for you on that, because I totally agree. And I'll give a quick story here as background, as kind of leading the question. I was listening to a podcast with the guy who started Spotify, and he was talking about the early days, and they were trying to replicate the experience of having music downloaded directly on your hard drive. And so his goal was sort of, can I create an

Starting point is 00:24:05 experience that is better than, you know, sort of going on a Napster and downloading a bunch of songs directly on your hard drive. And he ran into the a hundred millisecond problem and actually had to introduce more latency in the experience and sort of make it seem like it was taking a little bit longer to give people the impression that it was sort of, you know, being downloaded. But I was thinking about that relative to what you were saying. And back then, I mean, they started in 2005 or 2006, and there were really significant technical limitations relative to what's available now. But I think we're moving into a phase where people, the expectation of the consumer, you know, sort of in a D2C model and even more and more in B2B

Starting point is 00:24:45 is that there is no latency, right? It's just everything is instantly available. So all of that to say, the question is, how much are you seeing when you talk to people who are looking at migrating or starting to think in a distributed mindset? How much of this is that pull coming from the consumer just demanding a better experience because they're starting to get it with the services they use most often. And so smaller companies are having to replicate that or get as close as they can. Yeah, that's a really great question, Eric. And so I think of this as kind of three pieces. That consumer experience is a big deal. It's not a big deal for every application, but when it is, it's a big deal. And so it really comes down to the workload.

Starting point is 00:25:28 I think the other thing that is a big kind of weight here is we figured out big data for analytics. We never really figured out big data for transactions. And this kind of like, it's an accepted concept, but it's almost like transactions were ignored and everybody wants it. And so basically there's this big push there, but I got to tell you the, the big reason that people are turning to this and other distributed systems is because you look at the cloud is awesome. And yeah, I get CapEx, OpEx gains and you know, it's everywhere. And I got

Starting point is 00:26:02 like these great services. But, like, for the core kind of concepts of cloud around scale and resilience and kind of, you know, exposure everywhere all across the whole planet, we're still limited in many ways, you know, for the general purpose user of these services, right? And for us it's like, well, if that infrastructure is changing, the equation is, like, where does infrastructure end and your application begin? Right? And for me, I always thought of the database as part of the application, right? That's the way I was raised. That's what I, you know,

Starting point is 00:26:39 shit, man, the first thing I ever built was on FoxPro. It was in the database itself, right? Like, sure, sure, you know, like way back. And like, that's just not the case. The database is infrastructure. And so the last piece of infrastructure that has to move towards this distributed, this is like, as we like throttle, like accelerate in this world is the database, man. And I think that's the piece that, you know, being truly distributed is a key thing. So I think that that consumer experience is a big deal. It's not a big deal everywhere, but I got to tell you, I think increasingly that expectation of instant access and it's gotta, it's gotta,

Starting point is 00:27:16 it's gotta be as good as Instagram or Facebook or whatever I'm using as a, you know, like my mom, my mom's happy with that. Right. So it's gotta be the same. Yeah. And why do you think transactions were, you said it was interesting. You said it kind of seems like they got ignored. Why do you think that happened? Because it's difficult, Eric. It's really difficult.

Starting point is 00:27:37 Like this is not simple stuff to solve, man. Like look at, so I was at Hortonworks and the team, again, amazing group of engineers. I mean, these, you know, Owen O'Malley was like troubleshooting the Mars rover when it was down. Talking to the guy one day, he's like, I'm like, dude, you fixed the Mars rover. He's like, no, I just fixed the scheduler. And it's like, okay, dude, like, you know, like some of the stuff that was going on. But like, I remember Alan Gates is working on H, you know, we had, you know, how do we provide transactions? There was like the Impala thing going on at Cloudera. Hive LLAP was that approach. I think we've kind of retrofitted transactions into a number of NoSQL databases. It just, you can't take existing concepts. Like, so why did Google build Spanner when they already had Bigtable, right? Like, well, because it's a fundamentally different problem. And to solve it, it takes a rework of the entire stack, like, you know, from storage, the way that data gets written to disk through the transaction

Starting point is 00:28:37 model and being a distributed transaction model, distributed transaction, you know, execution engine, all the way up to the way that the language works and how these things happen. And so it's a complete rework. And, you know, Stonebraker, who, you know, Michael Stonebraker, if people aren't familiar with him, is kind of one of the godfathers of all databases. I mean, you know, started Postgres, by the way, y'all, if you don't know who Stonebraker is. Stonebraker says, like, it takes seven to eight years for a database to fully gestate and be, you know, really kind of valuable for large scale kind of operations and whatnot. Well, if it takes seven, eight years to build a database, you guys, and I hope your audience knows this, building distributed systems is also difficult. So let's put those two things together, you know, and it's not, and it's not, you know,

Starting point is 00:29:18 if it was easy to build a database, everybody would be doing it, but it's the corner cases. It's the real weird, odd things that happen. And for databases, those are difficult things to solve. And so I think that's why it's a really, really difficult problem. Sure. So it was less about, if I had to summarize that, it was less about the advanced optimization of existing systems and more about sort of rethinking the fundamental architecture of how it actually works. That's right. Absolutely. Thank you. You just took my three minutes and broke it down into 10 seconds.

Starting point is 00:29:55 That's the best. It's exactly right, though. Eric, I think you did a very good thing on moving the conversation also to the side of the end user. But I want to shift the discussion a little bit back again to the developers and discuss a little bit more about the experience, Jim, that the developer has with CockroachDB. So you mentioned that you are Postgres compatible, but what should an engineer expect when interacting with, using, and integrating CockroachDB today? So great question. So, you know, we're wire compatible with Postgres, right? We're going

Starting point is 00:30:29 to speak SQL syntax. So if people are understanding SQL, they're going to get us. If they're using ORMs, you know, we've built out, you know, a lot of ORM integration. So if people are doing that sort of stuff, first of all, like, so number one, it's pretty similar and familiar to that experience. But there's concepts that are different when you're dealing with distributed data and kind of distributed systems. You know, if you're going to have a serializable isolation database,

Starting point is 00:30:54 well, you know, as a developer, by the way, I never thought about isolation levels. I was like, whatever. In fact, it was all new to me, you know? And again, I was a hack, like try-catch blocks. You guys want, what do you want me to do? Like, just let me deal with logic. And so, you know, some transactions are going to conflict and, you know, implementing best

Starting point is 00:31:12 practices around try catch is a big deal, right? So that's one thing. I think another big thing is actually when you start to think about the, how data is stored in Cockroach, you know, we get a lot of conversations with customers about unique IDs and a lot of times we'll see tables that just increment values for unique IDs. And that's actually an anti-pattern for distributed systems because we're using that unique ID to actually, you know, to distribute the data across the cluster. So you don't want like a hotspot, you don't want all records in one range. Right. And so, so there's like another layer down,

Starting point is 00:31:42 which it's a little bit deeper, but, you know, using UUIDs to do that is actually a big deal. But I think one of the things that's most important for the developer that's starting to think about distributed data is how you construct your transactions. And Sean at DoorDash, I was on a webinar with him a couple of weeks ago. He, it was a really great example. It was like really crystal clear. You know, if you're going to insert 10,000 records into a table, right? Like, okay. Yeah. Postgres insert. Here's the records. It's optimized, man. It's going to, it's going to fly through that, right? It's going to just append that data, update the indexes. You're good. Right? Well, in a distributed system, you don't really want to do that because you're going to overload

Starting point is 00:32:20 kind of one node, right? Like you just basically overload and it's trying to communicate with all the other nodes. Wouldn't you want to execute that as say 10 transactions, each of them with a thousand inserts, right? So you get the parallelism of, of basically, you know, multiple endpoints all working on this, right? Because any endpoint in Cockroach can, can, you know, receive and, and, and process reads and writes, right? And so there's a little bit of a, this comes back to what we were talking about before, this like distributed systems require a different mindset. And I think that's the stuff that is interesting to me and fascinating to see in the developer community,

Starting point is 00:32:55 how people are starting to come around to that. Like it's a different way of thinking when you code and interact with things on the back end, you got to start thinking about location and that sort of stuff, if that makes sense. Yeah, yeah, absolutely. And I really love that you keep saying about this change in mindset

Starting point is 00:33:10 because I think, I truly believe it's a very important thing that's happening and engineers should learn about it. And based on my experience and my exposure to distributed systems, one of the biggest revelations that I got from distributed systems is about designing

Starting point is 00:33:25 systems with having in mind that things will go wrong. Things going wrong is not the exception, right? It will happen, right? And that's a big part of trying to build distributed systems. What are all these edge cases and what are the limits and what can we do to secure our data, our transactions and the behavior of our systems when we are dealing with all these problems? And I think that one of the bad things that happened because of the introduction of cloud, and unfortunately, this is also part because of marketing, is that cloud was also evangelized as a solution that takes all the hard problems away, right? Like, I can have my servers there.
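
Editor's sketch: the three developer-facing practices Jim describes earlier in this exchange (retrying transactions that conflict under serializable isolation, using UUIDs instead of incrementing IDs, and splitting one 10,000-row insert into ten transactions of a thousand rows each) can be illustrated roughly as below. This is a minimal Python sketch under stated assumptions, not CockroachDB's client API: `RetryableConflict`, `run_with_retry`, `make_rows`, `chunked`, and the in-memory `insert_batch` are all hypothetical names invented for illustration. With a real PostgreSQL-compatible driver, a conflict of this kind would normally surface as a serialization failure (SQLSTATE 40001) rather than this stand-in exception.

```python
import random
import time
import uuid
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


class RetryableConflict(Exception):
    """Hypothetical stand-in for a driver's serialization-failure error."""


def run_with_retry(txn: Callable[[], T], max_attempts: int = 5,
                   base_delay: float = 0.01) -> T:
    """Run txn, retrying with exponential backoff and jitter on conflicts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    attempt = 1
    while True:
        try:
            return txn()
        except RetryableConflict:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            # Sleep between 0 and base_delay * 2^(attempt - 1) seconds.
            time.sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
            attempt += 1


def make_rows(n: int) -> List[dict]:
    """Build n rows keyed by random UUIDs rather than 1, 2, 3, ..."""
    return [{"id": str(uuid.uuid4()), "payload": i} for i in range(n)]


def chunked(rows: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


if __name__ == "__main__":
    table: List[dict] = []  # in-memory stand-in for a table
    calls = {"n": 0}

    def insert_batch(batch: Sequence[dict]) -> None:
        """Pretend to insert one batch; every fourth call conflicts."""
        calls["n"] += 1
        if calls["n"] % 4 == 1:
            raise RetryableConflict()
        table.extend(batch)

    rows = make_rows(10_000)
    for batch in chunked(rows, 1_000):
        run_with_retry(lambda b=batch: insert_batch(b))
    print(len(table), "rows inserted in", calls["n"], "attempts")
```

The sketch runs the ten batches one after another to keep it short. The parallelism Jim mentions would come from sending those smaller transactions concurrently, possibly to different nodes, which this example does not attempt.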

Starting point is 00:34:03 I don't have to worry if a hard drive dies, right? My file system and servers will be still running. But at the end, this is not the case. And actually, I think that whatever happened, like, there are many failures that are happening on cloud. And as you are dealing a lot with resilience, and you see also like from large scale deployments from your customers, how often do you think that, you know, this is happening and how much of a problem it is for an engineer to keep in mind that failures will happen? It doesn't matter if we are on Google Compute Cloud or like on AWS, we have to build all the components of our systems around the concept that something might go wrong and we have to be ready for that. So first of all, Kostas,

Starting point is 00:34:45 you're going to go and attack the marketing guy? Really? You're just going to blame it on marketing, buddy? Come on. That's funny. You know, I mean... I actually went the reverse way. I came from marketing and we always blame marketing. Jim, that's the...

Starting point is 00:35:01 I know. Look, you know, some marketing organizations, you know, they're going to go a little too far. And I think there's definitely some of that, Kostas. I know exactly what you're talking about. And like, you know, they're selling a promise of something that's just not a reality, or if it is, it's really difficult to attain. Right. And I think, and you're right, like this distributed thinking and distributed

Starting point is 00:35:25 mindset requires you to basically build for resilience. That's the concept. Like, that's the thing. Like there's no such thing as disaster recovery because disaster should have no impact. Right. Like, and so how do you design for that? And well, that takes a whole, this is what I mean by you have to re-architect. You have to rethink everything that we ever thought about before as kind of architects of systems. You kind of re-architect, you architect for resilience in the system itself. And I think that's kind of one of these core, again, one of these kind of core concepts that came out of the Google team over the past, you know, 15, 20 years. And I think, you know, there's a lot of research.

Starting point is 00:36:03 There's a lot of technology, a lot of software engineering in this. You know, I mean, understanding Raft and Paxos, understanding things like MVCC, you know, some of the core kind of concepts that are out there, I think are, you know, part and parcel of how to do this. Luckily enough, a lot of this stuff is open source, which is just awesome. Right? Like, and so you want to go get a PhD in how to actually implement Raft, go check out etcd Raft, right? Like go check out the implementation. Some amazing people worked on that, including parts of our team. But I think there's a lot of examples for people to actually go out and figure out how to do that because, you know what, you're right, Kostas, everything fails, everything fails. And if you don't understand your own,

Starting point is 00:36:43 you know, your own mortality, everything fails, y'all. So, you know, regions do go out and, you know what, backhoes hit cables every day. And Google has failures, regions go out, and Gmail goes down. These things happen. It's about basically dealing with it. Like the concept of an SRE is brilliant. You know, talking about RPO and RTO and understanding what those things are. As a developer, I found it to be extremely important to get because I think you'll start looking at it

Starting point is 00:37:12 in a different way. Yeah, absolutely. I totally agree. And as you mentioned SREs, what should an SRE expect when working with CockroachDB? I guess they could sit around and eat bonbons and let the thing run all day and work on other things.

Starting point is 00:37:27 You know, funny, like, you know, I typically think of SREs in the context of, a lot of times in the context of Kubernetes. I've been kind of in the Kubernetes community for a while and I just love being a part of it. And, you know, from that point of view, well, you got a database that's already fit for this kind of world.

Starting point is 00:37:42 You know, it's aligned with their objectives, right? It is built for easing scale. Spin up a node, point it at the cluster, great, all the data balances. You don't have to really deal with those sorts of things. We can do rolling upgrades. There's online schema changes, right? The whole nature of a distributed system allows you to do some really cool things. So it's kind of a low touch database for the SRE in many different ways, but it's aligned with the way that they're moving forward with adoption of orchestration systems. You know, this is something that was built for Kubernetes or Nomad for that matter. And so, you know, for us, our conversations with the SRE is typically around that.

Starting point is 00:38:18 Now we employ, oh gosh, I know we have a little small little army of SREs who's managing and dealing with Cockroach Cloud right now, our own managed service. And, you know, they'll tell you, I'm sure they'll laugh at this part of the conversation. It's like, what are you talking about, man? There is a lot of work we do. They do. And, you know, but it's built to be automated. And I think that's the whole key there, right?

Starting point is 00:38:40 That's what that concept is all about. So, yeah. And the last thing that an engineering team wants is to have their SREs unhappy. So that's, I think it's quite important to keep them happy. It's kind of like a, remember when there was like an unhappy DBA? Oh, I'm sorry. They were just unhappy all the time, mostly. Yeah, I know what you're talking about. All right. We're getting close to the end of this amazing and very exciting conversation that we have. I have two more questions. One is about open source and databases. And I think anyone who has worked like with databases, especially like in the past couple of years,

Starting point is 00:39:13 we see that pretty much having an open source version of the database is mandatory. It's out there. How important is open source for building a database system in your experience? Yeah, I mean, it's a great question. Let me ask you a question back, Kostas. What's important for you with something being open source?

Starting point is 00:39:38 That's a very good question. And I think it has many dimensions, the answer. But I would say that one of the most important things around open source is support. I feel that the project is alive and there are people there to take care of it, especially as the project is complex. So for me, as someone who would try to architect or engineer a system, that's important, especially for the backbone of my system, which is database, right? That's right. And I agree.

Starting point is 00:40:10 It's community, right? And it's about building people who are all kind of into it and using this and seeing the code base move on, right? And so I always think about there's code and then there's the community side of things. And so code has got to be open source, right? Community has got to be there. The funny thing is when,

Starting point is 00:40:34 I think we get into these weird conversations about the commercialization of open source and we confuse the business model with the open source project because ultimately like, look, man, I've been in open source for a long time. And the beauty of open source to me is consumption. Like, it's free. I could go use it. And I have all this community of people to support, right?

Starting point is 00:40:56 And, like, that's, you know, and so the problem is over the past three years, what's changed is consumption. Like, how do you consume software today? Y'all like you, you go and spin up a service in some public cloud provider. And so the consumption factor has changed. Now, what we've done is we've taken free beer away from open source. Basically that's, we always talk about free beer or free puppy, right. In the context of open source. And, and so how do we get that back? How do we get it so that everybody can use that tool, but still consume it as a service? Now we're hellbent in making sure that we do that. We build up a community of people that are around us.

Starting point is 00:41:34 We changed our license about a year and a half ago, two years ago, to the BSL, Business Source License. I think MariaDB was the first ones to have it. And basically what it says, it says, look, what we want to protect ourselves from is a large cloud, a public cloud provider taking our code base and going and making a bunch of money off it. Right. I mean, it's like Elastic, okay, and Mongo, the same. Like right down the board, we've all changed our licenses because database technology is a little bit different than other open source technologies. It is complex. Remember I talked about this seven, eight years to get it to a point where it's even kind of like, you know, I mean, you guys, Postgres has been around since 96. It was the official.

Starting point is 00:42:13 I mean, it really started like 88, something like that. Like these things, they aren't simple. And so, you know, we want to build a business. We want to build something that's going to be there in the right database for all consumers, not just every single developer, but every large enterprise too. And for us, you know, it's a balance, it's a delicate balance of doing those things when doing it the right way. And I think if you build a good, honest, humble, and kind of, you know, open community and are authentic, you can do that. And that's what we're about. So I always think of open source as code, community, and consumption.

Starting point is 00:42:46 And then there's this weird thing about license that everybody gets wrapped around the axle on. So yeah. Yeah. I think that's one of the best descriptions I have heard about open source and how it interacts like with a business. Cool. So Jim, last question from me, and then I'll let Eric.

Starting point is 00:43:02 What's next for CockroachDB? What's in the product roadmap that you have and what exciting new things you are going to deliver in the next couple of months? Yeah, you know, the ultimate vision of this company is, you know, it was funny when I first got here, it was like make data easy. Well, we actually do want to make data easy.

Starting point is 00:43:19 And I kind of really wrestled with it when I first got to Cockroach and I first met Spencer about four years ago, actually. But we're doing things that take these complex concepts and make it really simple. Like deploying a database across multiple regions. Like, you know, where does data get located? I don't, you don't need a PhD.

Starting point is 00:43:39 Like, okay, you can kind of do that. You can do this in, say, Cassandra, but you need a PhD. How do you make it so dead simple? It's declarative, right? Like, with simple SQL statements, and the database just takes care of this complexity. You know, we spend a lot of time doing those things, taking the complex distributed concepts and making them simple. But we also understand that consumption is a big deal. And so, you know, our ultimate vision is that Cockroach Database is a SQL API in the cloud. I want to make data available to every single developer on the planet, no matter where they

Starting point is 00:44:11 deploy their application. And I want them to just communicate via SQL to some REST interface or whatever it is into the cloud. And let us deal with scale. Let us deal with resilience. Let us deal with locating data so that you're going to be guaranteed, say, you know, I don't want to put a number out there,

Starting point is 00:44:30 but sub 50 millisecond, you know, access to data, no matter where your users are on the planet, right? And so, you know, for us, it's how do you do that? And, well, you deliver it through, you know, kind of the, you know, the whole kind of move towards serverless. So how do you build this truly serverless database? You know, make it multi-tenant, you know, be able to spin up and spin down dormant clusters. So we don't get killed on cost, right? Like make it consumption-based,

Starting point is 00:44:52 you know, all the security controls that have to be in place. And so, you know, for Cockroach, we're pushing really far at that. You know, we've, we launched a beta version of this Cockroach cloud free, the free beta is available on our website. People can start to play with it. You know, it's limited about five gig of storage and, you know, it's single region for now. But, you know, it's where we're building and focusing a lot of our future because we really do believe consumption is via, you know, A, the cloud. B, more importantly, I think people just want to, I think people just want an API. Let all that complexity just meld into the background. I don't want to have to think

Starting point is 00:45:25 about scale. Just give me a bill. And truly delivering on that promise, I think, that's where we're headed. And I've never been more excited about a company because I think this vision is right. And we'll see how it plays out over the next couple of years. Yeah, absolutely. And we will be watching closely. All right. So Eric, it's your turn. The burning question that I mentioned at the beginning, and this is just more out of curiosity. We love to give our audience a little bit of insight just into the people behind, you know, these companies and technologies and stuff. And I'm interested to know, coming from a programming background and now working in marketing, what principles have you brought from programming

Starting point is 00:46:12 into marketing in your role? And how has that background helped you frame the way you think about it? Oh my God, my team's going to laugh. Structure. Structure and framework. Always give people a framework to consume. Break it down into three things. Give it structure, because otherwise you're just all over the place. And so I'll often start questions like this and say, well, there's three reasons, Eric. One, two, and I don't even know what the third one is by the, you know, while I'm talking about the second one. So, you know, just having frameworks for people to actually understand things is just really critical. And I think in all aspects of our life, I mean, you know, if I'm going to write a paper, well, there's a heavy outline done before we do that so that we're all in agreement, you

Starting point is 00:46:53 know, we're, we're 30% done, right. That way we're all directionally correct. And, and, and having those concepts definitely apply in marketing for sure. Because I mean, ultimately, you know, what's your, you know, God, we used to do PRDs and, and, you know, you know, these deeper technical kind of concepts, conceptual diagrams. So we were all aligned, right. And so that, that, that, that core concept has been fundamental, a game changer for me in, in, in my career and product marketing for sure. Very cool. And one last question, and this is, you know, we don't know a ton about, we didn't discuss a lot about how the, how the org is structured at Cockroach, but one thing that I think would be helpful for our listeners,

Starting point is 00:47:37 especially with your unique background is a lot of our listeners are engineers working with data in some way or in some engineering capacity. And a lot of them interact with marketing teams in various ways. And those relationships are all over the place. We've had interesting, just interesting discussions with people about how they interact with marketing. And we just love your thoughts on that. What does a really good relationship there look like in terms of sort of the people working with data from an engineering standpoint within a company and then how the relationship with marketing works? I'm very fortunate to work in a company

Starting point is 00:48:16 that works the way that Cockroach Labs works. You know, there's a level of respect across all the functions in this company that I find to be truly unique, you know, in all the places I've been. And I use the term respect because it's actually pretty important. I don't think a lot of organizations actually understand what's, what's beneath the iceberg when it comes to marketing and how complex it is and how difficult it is. People think it's a website. Why are you writing it like that? Why are you doing this? Like, like there's so much that goes into it. We work as hard as anybody else in the organization. And, and honestly, I think the, the, like, don't get upset at marketing because they're doing something wrong, help them get it right. Right. Like I use this word with, with our sales and marketing teams all the time. I use the word authentic all the time. Authenticity. You got to be authentic. Like I don't go into a company and

Starting point is 00:49:09 not understand what an AZ is when you're talking about, you know, you don't say it's Arizona. It's an availability zone. Do you know what that means? You know how it physically looks in a data center? Help them understand what it is and make sure they get these concepts right. Because the more authentic marketers can be, the better off everybody's going to be. The better off we're going to be able to translate and sell what you're trying to do. So don't be against them, help them,

Starting point is 00:49:33 I think is one of those things. And I think that's where the best relationships we have across our marketing team is there. I mean, look, we're selling to developers. You know, I love talking to my development team. Well, they're going to be a little bit more software engineers sometimes, or a little bit, you know, out there sometimes, but you know, we learn a lot from them too. And so that, that,

Starting point is 00:49:52 that respectful communication back and forth and the, you know, having the patience to, look, man, you know, you may think something's wrong on the website. There's a whole lot of other stuff going on, y'all. Right. But if something's wrong, call it out too, and do it respectfully. I think that that's the thing. It's not a simple job. Love it. And I now feel fully guilty for earlier advocating the idea that we can blame everything on marketing. It's okay. Hey, but you know what? Call them out when they're wrong too. Like, hey guys, like, but this is why you're wrong. Like, don't just say you're wrong, say,

Starting point is 00:50:24 this is why you're wrong. And this is the effect that it's going to have, right. Give it a reason. Right. And so call it out. Gosh, by all means, you know, we're, you know, we're here to make it better. We are also after a goal too. Right. And that's, that's critical. Love it. The, the concept of respect in that relationship and really all, all relationships is huge. And I think that'll be really helpful for our listeners. Jim, it's been a really wonderful show. Is cockroachlabs.com the best place to check out all things cockroach? Yeah, absolutely. You know, the free tier, of course we're hiring, we're always looking at, we're growing like crazy right now. Y'all like, this is just a lot of fun. So yeah, everything's there at cockroachlabs.com.

Starting point is 00:51:03 Very cool. Well, we'll check back in with you in another six months or so, have you back on the show and thank you again for your time and insights. Well, thanks for having me guys. I really appreciate it. Wow. Another super interesting show. I think my big takeaway was hearing Jim talk about the sort of lagging migration of transactions to the distributed architecture. And that was just really interesting to hear about how difficult that problem is and how sort of optimizing existing systems wasn't going to work in order to deliver the experience that, you know, sort of ultimately people are demanding. And I just thought that was a really thought-provoking answer to that question. How about you, Kostas?

Starting point is 00:51:50 Yeah, absolutely. I think like at the end, distributed systems are hard. They are hard to build. And most importantly, it's hard like to reason about them. At the end, you can end up in situations where you're traveling in time. It can be completely mind-bending. So there is a reason that it took a while to see all these technologies becoming more and more approachable out there. I think, though, that probably the most important outcome from our conversation with Jim today was about the need

Starting point is 00:52:23 for the engineers to change their perception and start thinking more in terms of distributed systems and computing. And that this is going, it's something that's going to become more and more important in the future. Not necessarily in the way that it's like, okay, anyone, everyone has to understand how Raft or Paxos works, but more about understanding the differences and the challenges and also the advantages of using distributed systems

Starting point is 00:52:52 and how these affect your product, your architecture, and the overall way of thinking in engineering terms. I think that's super important. And it's what makes marketing in this company really important. And I think that's a testament of that. Talking with Jim today, marketing can really be an educational tool to help all these engineers out there figure out the right things to understand and the right concepts from distributed systems to use in their everyday work. Yeah, I agree. It was interesting. I almost asked a question that we ask a lot of our guests, which is, what are some other ways that people are solving this problem today? And the more I thought about that question, I ended up not asking it because we were talking

Starting point is 00:53:39 about a shift in the way that you think about architecting a system. And so I just appreciated his perspective on the mindset shift that's required. Well, thank you again for joining us on the Data Stack Show. Be sure to hit subscribe on your favorite podcast provider so you can get notified of new shows every week. We have a great lineup in the next couple of weeks. You'll want to be sure to grab those episodes. And until next time, we'll catch you later. The Data Stack Show is brought to you by Rudderstack, the complete customer data pipeline solution.

Starting point is 00:54:13 Learn more at rudderstack.com.
