The Data Stack Show - 35: The Future of Development is Distributed with Jim Walker of Cockroach Labs
Episode Date: May 12, 2021
This week on The Data Stack Show, Eric and Kostas talk with Jim Walker, the VP of product marketing at Cockroach Labs, about distributed systems, competing against the speed of light, and making data easy.
Highlights from this week's episode include:
Jim's background of translating deep technical concepts into understandable English and his work at Cockroach Labs (2:23)
The origin of Cockroach Labs and distributed SQL (6:10)
Living without Atomic Clocks (10:10)
Having the speed of light as the ultimate competitor (13:49)
CockroachDB's users (19:35)
Figuring out big data for transactions (25:14)
Dealing with failure (35:04)
Open source code, community, and consumption (39:26)
Making data easy, and what's next for Cockroach (43:12)
Bringing programming into marketing (46:18)
Mentioned Links:
Spanner White Paper
Raft & Paxos
Michael Stonebraker
The Data Stack Show is a weekly podcast powered by RudderStack. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
Transcript
The Data Stack Show is brought to you by RudderStack, the complete customer data pipeline solution.
Thanks for joining the show today.
Welcome back to the Data Stack Show.
We have a really exciting guest from a company that Kostas and I have talked about a ton,
which is Cockroach Labs.
And Jim Walker, who is from their product marketing team, is going to join us on the show today.
My burning question, Kostas, is, and this is not going to be a surprise to people who've been
following along for a while, Jim was a developer before he got into marketing. And so we've had
several people on the show who have sort of crossed lines between sort of marketing
and engineering and technical roles.
And so, of course, I want to ask him
what lessons he's brought
from engineering into marketing.
What are you going to ask him?
Oh, I have plenty of questions.
CockroachDB is a very interesting
piece of technology out there.
They have done amazing innovation.
It's one of these companies
that are really on the borderline between doing research and at the same time
productizing it. So there are many questions around the database systems, distributed systems,
and of course, what the vision of the product is in the company. So yeah, I'm super excited
to chat with Jim today.
Great. Let's dive in. Jim, thank you so much for joining us. Kostas may have mentioned this in your chat before the show, but he and I actually talk about Cockroach Labs a lot,
just because we admire so many things that the company is doing. And so a real privilege to
have you on the show. Thanks for joining us.
Well, thanks a lot. I'm fortunate and privileged to be an employee with a good group of people.
So I hope I represent them well here. So I'm happy that that's what you guys think about us.
It's a fun place.
That seems like it. Well, why don't we start with, I think a lot of our listeners
are probably familiar with what the company does, but we'd love to get to know you a little bit
better. And you have a background as a developer, but now work in marketing. And that's absolutely
something I'm going to ask you a lot about, but we'd just love to hear a little bit about your
personal history. And then for those listeners who might not be familiar with Cockroach, can you
just tell us a little bit about what the company does and what you provide?
Sure. Yeah, that's a
lot. And I'll try to be somewhat brief so we can get into some technical stuff
and some other concepts. But yeah, I mean, you know, y'all, I started as a developer. I mean,
I coded at the age of 11. It was, you know, the early eighties, and I had a
TI-99 and a Commodore 64. I was just always into computers. I always had a kind of creative side of my life as well,
but I kind of landed in electrical engineering,
computer science in undergrad, graduated.
I loved it.
And I ended up being a programmer
and I coded for seven years professionally
in a language called Smalltalk,
which I will argue is still the most elegant
and beautiful language ever created.
And I was in C++ and C, and, you know,
I was a developer and I was working as
a consultant. And every time a salesperson opened up their mouth, we had like, you know, two months
of scope to the project. And it just frustrated me every time. And so, you know, I ended up being
the person that they would put in front of people to actually explain what was going on. And I loved
it. And I loved kind of taking what, you know, developers and what we were building and explaining that to people. When they got it, that aha when they understood, it was just, it was the juice, man. I loved it. And so, you know, I naturally kind of gravitated towards product marketing because, well, as a developer, I mean, I loved it, but I was a hack. I mean, I was good, but like, you know, I was, you know, I was managing teams and stuff. So it was just a real natural fit for me to kind of move into product marketing
explicitly because I always feel that it's my job to be translator of, you know, deep technical
concepts into English so that people can actually understand these things. And I love it. And,
and, you know, when, when something clicks and something works really well, there's nothing
like it. Now, you know, I've been in startups, oh gosh, I think this is my 10th
or something like this. All of them except for one have been successful. I've been in security.
I was doing master data management, which must be kind of familiar to you guys, about
customer information. I was at a company called Initiate. I've been in open source for a long
time. I was at a company called Talend doing data integration. I started their MDM project there and moved them into big data. I was at Hortonworks very early and helped define the
Hadoop space. From there, I was at another little marketing company. And then I landed at CoreOS,
which CoreOS was a real special company that really kind of innovated in some different ways
and really built kind of the foundation of a lot of things that are happening, I think, right now in infrastructure. And that was a joy. And I landed at Cockroach
really about two and a half years ago. Cockroach Labs, we're the creators of CockroachDB. This is
a brand new approach to building a database. We've architected the database from the ground
up to be distributed. It's basically taking all this distributed systems and distributed thinking
stuff, applying it to the database so that we have a database that's kind of prepped and ready for
you know, modern applications, as we kind of move quickly into this
kind of next generation of distributed systems and cloud-based applications and whatnot.
There's a lot there. I mean, I'm sure we'll talk a little bit more about Cockroach, but that's kind of a quick overview. Is that sufficient, Eric?
That was, that was sufficient and efficient. Amazing job hearing about your background. And
one quick note, the image for Smalltalk from the Smalltalk book on the Smalltalk Wikipedia page
is awesome.
It would make a great poster.
So go check that out.
Everyone who's listening.
That's pretty great.
Very cool.
Well, Kostas, you have tons of questions on the technical side,
and I'm really bad about stealing the mic at the beginning of the program.
So I'm going to hand it over to you and let you dig in.
Thank you, Eric.
Thank you so much.
So Jim, do you want to give us like a quick overview
about CRDB, CockroachDB as a product and as a technology?
The founders of this company, you know, Spencer Kimball, Peter Mattis, Ben Darnell,
you know, all three of them spent a considerable amount. In fact, they all landed at Google at
the same time. In fact, I think all of their employee numbers are around 300. You know,
they spent a lot of time there.
Ben was a big part of the Reader team.
Spencer and Peter actually met in college in, I think, the 90s.
And they actually built out something called GIMP.
A lot of people are familiar with it, but open source image manipulation tool.
They're the founders of that.
And gosh, that thing's still going on.
So these guys have been together for a long time. But you know, it's interesting to see what's happening across the board in everything right now as
kind of the world kind of coalesces around a lot of the innovation that happened at Google in the
2000s and 2010s. And, you know, led by kind of Jeff Dean and Sanjay Ghemawat and, you know, Eric
Brewer and some of the kind of leading minds over there. But, you know, just look at the things that have come out of that stretch of time. It's amazing.
So, you know, Spencer and Peter kind of front row and Ben as well, front row looking at all this
stuff. And, you know, when they eventually left, they were off building a startup. They were,
I think they were building a photo sharing startup and they were, they were frustrated
because they didn't have, they didn't have a Spanner-like database, right?
They didn't have a comparable version of Bigtable, right? Kind of able to use those things,
but they were frustrated. And so it was really kind of out of frustration in that they ended up kind of starting to build Cockroach Database. And honestly, the name is really kind of after
the resilient nature of our database, you can't kill it. But I think Spencer has a little bit of a dark humor. So I think that's where the name came from.
We love it. Honestly, I, you know, love it or hate it. People, people remember it. That's for sure.
So, yeah. So they basically took the Spanner white paper, which you can go check out
on the, you know, the Google publication site and built an open source version of that.
They built something that wasn't going to be dependent on explicit hardware to do certain things. And they built a database that was
massively distributed, but built to scale very easily, survive any failure, even a region or even
anything, a Kubernetes cluster for that matter. But most importantly, being able to tie data to
a location, which is actually one of these concepts I think is not understood by a lot of people when they start thinking about
distributed systems and distributed data. Distributed means, well, you have to take
into consideration the physical location of things. And so, you know, typically when you
prop up a database, you think about the logical data model, you know, here's my keys, here's my
referential integrity. I, you know, I figure out what's going on with all my tables and whatnot. With a distributed system, you have
to think about the physical nature of the database, the physical model as well. And I think that's one
of those core concepts. And so being able to tie data to a location is kind of a critical piece of
CockroachDB because well, it allows us to fight latency issues, you know, put data close to users,
it allows you to survive the failure of an entire region and whatnot.
But all the while, let's do this in a database that's built from the ground up, just, you know, completely new.
This isn't kind of move and improve.
This is, you know, brand new from bit one all the way through and make it SQL and make it, you know, wire compatible with Postgres.
So it's familiar to developers and they can get running. And so, you know, we feel we're defining
kind of this next generation of database
for transactional workloads,
and it's called Distributed SQL.
That's super interesting.
Two things, actually.
One is I found very interesting what you said about the location,
and I want to ask you about that
to give us a little bit more information around that.
But before we go there,
I know that one of the most important and the most, let's say, the biggest trouble that people have when
they architect and they build distributed systems is time, right? And how you deal with the dimension
of time. And I know that there's a lot of innovation around that that CRDB has done. And
there are differences compared to Spanner. I mean, you mentioned something about the specialized hardware. Do you want to give us like a little bit more information around that?
Because I think it's super, super interesting.
Yeah, it's actually, it's so, I mean, it's way
deep in the technical nature of our product. And it's, you know, if you start thinking about
transactions in a database, now, you know, can you do transactions in something like Mongo or,
you know, another database? Well, you're going to end up being kind of eventually consistent. And it really comes back
to how you use the algorithms that are in front of you to actually execute transactions.
And so, you know, for Cockroach, we chose to be, you know, to implement serializable isolation by
default and actually for all transactions. And serializable isolation, it just means that,
you know, every transaction is going to happen kind of, you know,
atomically in order, right?
It's an ACID concept.
You know, I think Kyle,
Kyle does a really good job
on the jepsen.io website
talking about, you know,
all the different levels of isolation
that you can have in a database.
And if anybody's interested,
it's really cool stuff.
But serializable isolation is a big deal.
But in order to do that,
you know, to have multi-version
concurrency controls,
well, the clock and the time actually becomes really important because that is what really,
you know, demands, you know, that things are happening in order. Now, in Spanner,
in the original Spanner architecture, well, Google had relied on hardware atomic clocks,
right? So, you know, if you can just align every server to have the same exact time,
well, that's great. Well, as you know, and everybody knows, I mean, you know, there's no such thing as, you know,
true time on every single, every single server. I mean, you kind of get there with some of these
true time services and stuff, but, you know, for us, we wanted to be independent of any sort of
hardware or any sort of other service. And so we basically built from the ground up. We said,
look, how can we actually, how can we do this a little bit differently?
And so we actually, probably one of the most popular blog posts that we ever wrote was
Living Without Atomic Clocks.
And man, that blog post on our website does a really good job of describing this more
in depth.
But basically what we said, we said, how can we use software to actually get the same sort
of thing?
And so you've got to start with one, some sense of time.
So kind of start with something like NTP,
which is like a network time protocol.
It's been around for a long time
and then build up some logical drift around that
so that, you know, servers can be off by,
I think it was, you know, it was like 50 milliseconds
or whatever that is.
And then use gossip and Raft to actually start to understand
where all these nodes are from a time point of view
and correct them as we need to. And so, you know, you can do this via hardware, but, you know,
as really clever software engineers, we chose to actually solve the problem so that, you know,
it can be wholly owned and in the binary itself of Cockroach, which comes back to this kind of
concept of distributed systems and containerizing and everything.
And I don't want to be dependent on anything that is external.
I want a single binary.
And so that's kind of why we chose to do it.
Now, it allows us to be deployed anywhere, right?
And so that's a big, big value for what we actually did.
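A rough Python sketch of the idea described in this answer. It is not CockroachDB's code. It only illustrates how a system that assumes clocks agree to within some maximum offset might treat a value whose timestamp falls inside that window as ambiguous. The names MAX_OFFSET_MS and classify_read are hypothetical, and the 50 ms default is just the figure floated in the conversation.

```python
# Illustrative only: a toy "uncertainty window" check, not CockroachDB's implementation.
# MAX_OFFSET_MS and classify_read are hypothetical names made up for this sketch.

MAX_OFFSET_MS = 50  # assumed bound on clock skew between nodes (figure floated in the episode)


def classify_read(read_ts_ms: int, value_ts_ms: int, max_offset_ms: int = MAX_OFFSET_MS) -> str:
    """Decide how a reader at read_ts_ms should treat a value written at value_ts_ms."""
    if value_ts_ms <= read_ts_ms:
        # Written at or before the read timestamp: safe to return.
        return "visible"
    if value_ts_ms <= read_ts_ms + max_offset_ms:
        # The writer's clock may simply be ahead of ours, so the write could
        # really have happened before the read. The ordering is ambiguous.
        return "uncertain"
    # Further in the future than any allowed skew: treated as after the read.
    return "not_visible"


if __name__ == "__main__":
    print(classify_read(1_000, 990))
    print(classify_read(1_000, 1_030))
    print(classify_read(1_000, 1_200))
```

A real system still has to decide what to do with the uncertain case, for example retrying the read at a later timestamp rather than guessing.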
Yeah, yeah, that's amazing.
And I think if I'm not wrong,
and you can correct me if I am,
that the time drift actually also exists in Spanner, right?
Even with the atomic clocks,
it's not like they avoid it completely.
It's just like it's a very, very small drift
that they consider there.
Yeah, yeah, that's amazing.
And I'm totally aware,
and I think our audience is probably also aware
of this great blog post.
It's very popular. Okay, we talked about
time, but you also mentioned space,
right, like location, actually.
And that's something that's, okay,
usually, as I said previously, when we're talking
about distributed systems, we tend
to focus a little bit more on the consequences
of time there. So why is
location important? How does this
affect CockroachDB and how does it make it
special as a product and as a technology?
Yeah, well, you know, the speed of light's no joke,
man, right? Like maybe one day we'll figure out how to beat it, but we ain't like, I'm not,
I don't think it's going to happen in my lifetime. Maybe some quantum thing will happen, but,
and you know, it's funny internally, you know, people ask us about who's our ultimate competitor
and, you know, in our engineering team, our ultimate competition is the speed of light.
And we do lots of things within our software to actually, you know, fight it and to work with it within the context of it.
Right. And so when you're deploying a database and you want consistent transactions, serializable isolation is going to guarantee the data is correct.
Right. We're talking about, you know, financial transactions across a planet.
Look, we're going to be really good
in a single data center.
There's lots of reason why people would use this
in a single data center for a simple application.
But when you start to kind of, you know,
build something that is the next generation
of database for these mission critical workloads,
you know, the stuff that's been wrapped up
in mainframes for years and years, you know,
having consistent transactions is really important,
but having global access to this is also critically important, especially in kind of the modern,
you know, world and basically businesses everywhere. But there's a problem, you know,
because the jump from New York to Singapore is going to be, what, 500 milliseconds sometimes,
or maybe 300. And what happens when a transaction takes, you know,
two or three hops back and forth, you know, we're talking about a second or two.
It just doesn't work in certain workloads. You know, again, another Google thing,
I always forget the guy's name, Paul Buchheit, I think. One of the guys, one of the
original guys who worked on Gmail came up with this concept called the 100 millisecond rule.
And the 100 millisecond rule
basically states that anything that happens, you know, sub 100 milliseconds appears to be in real
time to the human. Anything over, you can actually notice the lag a little bit. And so for us, it's
like, how do you get all transactions to be under, you know, sub 50 millisecond? Well, the only way
you're going to do that with data wrapped around the entire world is to make sure that data is located close to that user.
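The arithmetic behind that point can be sketched in a few lines of Python. The 300 ms round trip is one of the rough numbers from the conversation. The 5 ms figure for a nearby replica, the three-hop count, and the function names transaction_latency_ms and feels_instant are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope latency math; all numbers are illustrative assumptions.

PERCEPTION_BUDGET_MS = 100  # the "100 millisecond rule" mentioned in the episode


def transaction_latency_ms(round_trip_ms: float, hops: int) -> float:
    """Total network time if a transaction needs `hops` sequential round trips."""
    return round_trip_ms * hops


def feels_instant(latency_ms: float, budget_ms: float = PERCEPTION_BUDGET_MS) -> bool:
    """True when the latency stays under the perception budget."""
    return latency_ms < budget_ms


if __name__ == "__main__":
    far = transaction_latency_ms(round_trip_ms=300, hops=3)  # e.g. New York to Singapore
    near = transaction_latency_ms(round_trip_ms=5, hops=3)   # assumed nearby replica
    print(far, feels_instant(far))
    print(near, feels_instant(near))
```

Under these assumed numbers, three cross-planet hops blow well past the budget, while the same three hops to a nearby replica stay far inside it, which is the case for keeping data close to the user.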
Now, we use Raft.
Raft is a distributed consensus algorithm that many are familiar with.
Anybody in distributed systems, if you don't know Raft, go check it out.
I mean, gosh, it's like, yeah.
Yeah, I think Raft and Paxos are like the two most commonly referred to algorithms around distributed systems.
That's right.
And we're using Raft to kind of place data,
you know, these replicas around the world.
And, you know, how do we actually, you know,
make sure the Raft leader or the leaseholder,
as we call it within Cockroach,
how do we make sure that that's close to a user
so that, you know, all transactions to that Raft group
are going to happen very close to that user.
And so, you know, we went to great lengths,
basically within the way that we store data using, we're using Raft and distributed
consensus to make sure that data is going to live close to users because I, you know, I want to
guarantee, you know, sub, you know, can we, can we tune our database to get, you know, sub 10
millisecond transactions to every single, you know, transaction, no matter where it's at on all
the tables. Yeah. But on the other side of that is
sometimes you don't, sometimes you just need data to be, you know, accessed all over the world. And
so, you know, we, we chose to do this at the table level, you know, and so for each table,
we're defining how data is actually persisted within, you know, the physical nature of all
the nodes within Cockroach. And so it's pretty simple to do. It's, it's a pretty straightforward
process. We're actually going to simplify that a whole lot over the next couple of weeks.
We'll have a release come out that really kind of breaks this down into some really
kind of simple declarative kind of SQL statements that allow you to define at a table how you
want data to live so that it's easily accessed or quickly accessed or how it wants to survive.
So that's great.
Actually, as you were talking about speed of light and location and where the data is
located, I couldn't stop thinking about at the end, it ends up to physics again, right?
Like it's, you have space, you have time, but at the end, you need both of them like
to solve the problem.
And that's right, Kostas.
And it's a really difficult, you know, the team
here, the team of engineers, I mean, some of the stuff we did, some of the stuff we've done is just truly remarkable.
One of the things they worked on was this feature in Cockroach called parallel commits
and, look, I'm going to say, I'm just the marketing guy. I'll explain the best I can, but
we actually have a SIGMOD paper that gets into this pretty well that's published. I think it's
available on our site, but the team basically, they said, look, how can I actually forward
commit a transaction before, you know, and just and say, with five nines probability,
that's going to commit on the second node. So if I can actually go through and look at basically,
if I could commit a transaction locally, and then look at the data around that transaction and say,
hey, look, I'm going to send the transaction and the picture of the data around that I'm going to
send that to the second node. And if the second node takes, it looks at all the picture around it and
says, yeah, everything looks the same, just forward commit and just say it's done. Like instead of
doing all the transactional steps within each, you know, within that thing, just come back and say,
yes, acknowledge it's going to be, it's going to be fine. That is awesome, right? Because that's
a, you know, you can't, you know, it's the speed of light.
You can't change the photons, but you can, you know,
maybe you could change that package, what's in there, right?
And so it's a different way of thinking about things and, you know, has made huge, huge gains for us
as we kind of, you know, ratchet down on latencies
and continually fight the speed of light here.
Yeah, yeah.
I think it's more than obvious
that there's some amazing engineering
behind CockroachDB.
So, all right.
We talked a little bit about all the amazing stuff
that is happening behind the scenes
and what the problem is
that the Cockroach database is trying to solve.
Who should be using this database?
Because, okay, we know that every engineer,
I mean, you as an engineer and me as an engineer,
we know that we always like to play with new toys and exciting technologies.
But who should invest in building applications on top of CockroachDB today?
You know, I mean, you're asking me, everybody, right?
Like, I work here. So, you know, well, I mean, look, if we're going to be wire compatible with Postgres, if it's basically the same syntax, but you're going to get basically all this value of never having to manually shard a database, never having to think about setting up
any sort of active, passive, resilient system. Like, I mean, why wouldn't you use this for a
simple application, right? Like, yeah, sure. You could spin up RDS Postgres and get going pretty quick, you know, like in a single region, it's pretty cool. But like,
literally like the complexity of actually dealing with some of these kinds of day two operation
stuff is it's, it's, it's killer. And so, you know, for us, it's really any application. Now
that said, I mean, we've got some pretty world-class customers out there.
And, you know, some of them that I can actually talk about, you know, DoorDash is a big customer of ours.
Lush, Bose, Comcast, you know, LaunchDarkly,
which is a great, you know, dev tool, right?
Like, so there's a lot of really great logos.
You can go to our website, we have a lot more,
but, you know, they're looking at us
as kind of the next generation of database.
You know, DoorDash is a really good example.
You know, they, you know, height of the pandemic a year ago,
you know, they've had,
they had a couple of issues
with some outages of the database
that they were using
because they had kind of
a write bottleneck, right?
Like, you know, you know,
if you're going to be distributed,
then every node can take a read or write.
That's our theory.
Well, some other databases
are like single write node,
but read nodes all over the planet.
And, you know, that was just causing issues
because they had downtime.
I mean, you know, how did,
and then if you fast forward a couple of months where they're going to IPO, they can't have
crisis. They can't have downtime. Like that's just going to, that's going to have an adverse effect
on any sort of, you know, what they were trying to do. And so, you know, midstream, they,
they chose to move to Cockroach and, you know, fast forward a year and, you know,
a lot of their transactional workloads are now either moved or moving to Cockroach
and Cockroach database.
And then they set it up internally as a service
for the new developers to use this
because you're right, Kostas.
Developers do want to play with the latest tech
and we want to take away all of the complexity,
make data easy enough
so they can focus on their breakthrough application
and build.
And I think that's what some of these organizations, so, you know, time and time again, we keep seeing companies, you know,
set us up as a service internally for them, for all net new workloads. I think we fit net new
workloads really well. Like, you know, it's some of the legacy migration stuff. It just wasn't
built for the distributed world. You know, when you start using stored procedures and you have
all these kinds of crazy concepts in there, you kind of have to think differently. And so we fit really well in these
net new workloads. I think anybody who's thinking about Kubernetes or deploying anything on
Kubernetes, forget about it. There is no other database that was built directly for Kubernetes.
I mean, this is a descendant of Spanner, just as Spanner was built kind of for Borg, right?
And so I think that's where we're seeing most people kind of turn to us.
Yeah, you're absolutely right.
I think I didn't put the question probably that well, but the point and the reason that
I asked is because, you know, like most people, they have in their mind that when we're talking
about distributed systems, like distributed databases, we are talking about very niche
problems out there that huge companies only have to deal with, or it's about crunching
a lot of data and doing analytics.
But I think it's important for all the engineers out there to understand, and for us
as the vendors to communicate to them, the benefits that they can get by relying on technologies like
CockroachDB at the end.
Regardless, I mean, of, let's say, the complexity or the size of the project that you're running.
I mean, Kostas, it's such a big piece of this.
And so I keep getting into these conversations
over the past couple of years,
and it's like shifting to a distributed mindset
is not easy.
Like it took me a while to figure it out.
And I think that's the thing.
And it's like, it's not just operations.
It's not just infrastructure,
but it's the developer has to think differently.
And I think that's where we're,
you know, I think we're going to be in a different world four or five years from now when this is kind of the de facto way of actually building, you know, but
we're in this transition mode. And I think that's where people, you know, think, think like, oh,
it's maybe it isn't for me. Well, it is. And this is the future. This is what's happening y'all.
So question for you on that, because I totally agree. And I'll give a quick story here as background, as kind of leading the question.
I was listening to a podcast with the guy who started Spotify, and he was talking about
the early days, and they were trying to replicate the experience of having music downloaded
directly on your hard drive.
And so his goal was sort of, can I create an
experience that is better than, you know, sort of going on a Napster and downloading a bunch of
songs directly on your hard drive. And he ran into the a hundred millisecond problem and actually had
to introduce more latency in the experience and sort of make it seem like it was taking a little
bit longer to give people the impression that it was sort of, you know, being downloaded. But
I was thinking about that relative to what you were saying. And back then, I mean, they started in 2005 or 2006,
and there were really significant technical limitations relative to what's available now.
But I think we're moving into a phase where people, the expectation of the consumer,
you know, sort of in a D2C model and even more and more in B2B
is that there is no latency, right? It's just everything is instantly available. So
all of that to say, the question is, how much are you seeing when you talk to people who are
looking at migrating or starting to think in a distributed mindset? How much of this is that
pull coming from the consumer just demanding a better experience because they're starting to get it with the services they use
most often. And so smaller companies are having to replicate that or get as close as they can.
Yeah, that's a really great question, Eric. And so I think of this as kind of three pieces.
That consumer experience is a big deal. It's not a big deal for every application, but when it is, it's a big deal.
And so it really comes down to the workload.
I think the other thing that is a big kind of weight here
is we figured out big data for analytics.
We never really figured out big data for transactions.
And this kind of like, it's an accepted concept,
but it's almost like transactions were ignored
and everybody wants it. And so basically there's this big push there, but I got to tell you the,
the big reason that people are turning to this and other distributed systems is because you look at
the cloud is awesome. And yeah, I get CapEx, OpEx gains and you know, it's everywhere. And I got
like these great services, but like for the core
kind of concepts of cloud around scale and resilience and kind of, you know, exposure everywhere
all across the whole planet, we're still limited in many ways, you know, for the
general purpose, you know, user of these services, right? And for us, it's like, well,
if that infrastructure is changing, for us
the equation is like, where does infrastructure end and your
application begin, right? And for me, I always thought of the database as part of the application,
right? That's the way I was raised. You know,
shit, man, the first thing I ever built was on FoxPro. It was in the database itself, right? Like, sure,
sure, you know, like way back. And that's just not the case. The database is infrastructure. And so
the last piece of infrastructure that has to move towards this distributed, this is like,
as we like throttle, like accelerate in this world is the database, man. And I think that's
the piece that, you know, being truly distributed is a key thing. So I think that that consumer
experience is, is, is it's a big deal.
It's not a big deal everywhere, but I got to tell you, I think it,
increasingly that expectation of instant access and it's gotta, it's gotta,
it's gotta be as good as Instagram or Facebook or whatever I'm using as a,
you know, like my mom, my mom's happy with that. Right. So it's gotta be the same.
Yeah. And why do you think transactions were,
you said it was interesting.
You said it kind of seems like they got ignored.
Why do you think that happened?
Because it's difficult, Eric.
It's really difficult.
Like this is not simple stuff to solve, man.
Like look at, so I was at Hortonworks
and the team, again, amazing group of engineers. I mean, you know, Owen O'Malley was like troubleshooting the Mars rover when it was down. Talking to the guy one day, I'm like, dude, you fixed the Mars rover. He's like, no, I just fixed the scheduler. And it's like, okay, dude. Like, you know, some of the stuff that was going on. But like, I remember Alan Gates is working on H, you know, we had, you know, how do we provide transactions? There was like the Impala thing going on at Cloudera.
Hive LLAP was that approach. I think we've kind of retrofitted transactions into a number of
NoSQL databases. It's just, you can't take existing concepts. Like, so why did Google build Spanner when
they already had Bigtable, right? Like, well,
because it's a fundamentally different problem. And to solve it, it takes a rework of the entire
stack, like, you know, from storage, the way that data gets written to disk through the transaction
model and being a distributed transaction model, distributed transaction, you know, execution
engine, all the way up to the way that the language works and how these things happen. And so it's a complete rework. And, you know,
Stonebraker, Michael Stonebraker, if people aren't familiar with him, is kind of
one of the godfathers of all databases. I mean, you know, he started Postgres, by the way, y'all,
if you don't know who Stonebraker is. Stonebraker says, like, it takes seven to eight years for a
database to fully gestate and be, you know, really kind of valuable for large-scale kind of operations and whatnot. Well, if it takes seven, eight years to build a
database, you guys, and I hope your audience knows this, building distributed systems is also difficult.
So let's put those two things together, you know. And it's not, you know,
if it was easy to build a database, everybody would be doing it, but it's the corner
cases. It's the real weird, odd things that happen. And for databases, those are difficult things to solve. And so
I think that's why it's a really, really difficult problem.
Sure. So it was less about, if I had to summarize that, it was less about the advanced optimization
of existing systems and sort of rethinking the fundamental architecture
of how it actually works.
That's right. Absolutely. Thank you.
You just took my three minutes and broke it down into 10 seconds.
That's the best.
It's exactly right, though.
Eric, I think you did a very good thing
on moving the conversation also from the side of the end user at the end.
But I want to shift the discussion a little bit back again to the developers and discuss a little
bit more about the experience, Jim, that the developer has with CockroachDB. So you mentioned
that you are Postgres compatible, but what should an engineer expect when interacting with, using,
and integrating CockroachDB today?
So, great question. So, you know, we're wire compatible with Postgres, right? We're going
to speak SQL syntax. So if people understand SQL, they're going to get us. If they're using
ORMs, you know, we've built out a lot of ORM integration. So if people are doing that
sort of stuff, so number one, it's pretty similar and familiar to that experience.
But there's concepts that are different
when you're dealing with distributed data
and kind of distributed systems.
You know, if you're going to have
a serializable isolation database,
well, you know, as a developer, by the way,
I never thought about isolation levels.
I was like, whatever.
In fact, it was all new to me, you know?
And again, I was a hack, like, try-catch blocks?
You guys want, what do you want me to do?
Like, just let me deal with logic.
And so, you know, some transactions are going to conflict and, you know, implementing best
practices around try catch is a big deal, right?
So that's one thing.
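The try-catch practice Jim mentions for conflicting transactions can be sketched roughly as below. This is a minimal Python illustration and not CockroachDB's client code: `SerializationFailure`, `run_with_retry`, and `flaky_transfer` are hypothetical names, and the example assumes the driver reports a conflict as an error with SQLSTATE 40001 that is safe to retry.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""
    sqlstate = "40001"


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call txn_fn until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            # Exponential backoff with random jitter before the next attempt.
            sleep(base_delay * (2 ** (attempt - 1)) * random.random())


calls = {"count": 0}


def flaky_transfer():
    """Simulated transaction that conflicts twice and then commits."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise SerializationFailure("restart transaction")
    return "committed"


result = run_with_retry(flaky_transfer, sleep=lambda _: None)
print(result, "after", calls["count"], "attempts")
```

The point of the sketch is only that the whole transaction body is re-run on conflict, which is why the transaction logic is passed in as a function.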
I think another big thing is actually when you start to think about the, how data is
stored in Cockroach, you know, we get a lot of conversations with customers about unique
IDs and a lot of times we'll see tables that just increment values for unique IDs. And that's
actually an anti-pattern for distributed systems because we're using that unique ID to actually,
you know, to distribute the data across the cluster. So you don't want like a hotspot,
you don't want all records in one range. Right. And so, so there's like another layer down,
which is a little bit deeper, but, you know, using UUIDs to do that is actually a big deal. But I think one of the things that's
most important for the developer who's starting to think about distributed data is how you
construct your transactions. And Sean at DoorDash, I was on a webinar with him
a couple of weeks ago. He had a really great example. It was like really crystal clear. You
know, if you're going to insert 10,000 records into a table, right? Like, okay. Yeah. Postgres
insert. Here's the records. It's optimized, man. It's going to fly through that,
right? It's going to just append that data, update the indexes. You're good. Right?
Well, in a distributed system, you don't really want to do that because you're going to overload
kind of one node, right? Like you just basically overload and it's trying to communicate with all
the other nodes. Wouldn't you want to execute that as say 10 transactions, each of them
with a thousand inserts, right? So you get the parallelism of, of basically, you know, multiple
endpoints all working on this, right? Because any endpoint in Cockroach can, can, you know,
receive and, and, and process reads and writes, right? And so there's a little bit of a, this
comes back to what we were talking about before, this like distributed systems require a different mindset.
And I think that's the stuff that is interesting to me
and fascinating to see in the developer community,
how people are starting to come around to that.
Like it's a different way of thinking
when you code and interact with things on the back end,
you got to start thinking about location
and that sort of stuff, if that makes sense.
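The two habits Jim describes here, random UUID keys instead of incrementing IDs and splitting 10,000 inserts into ten transactions of a thousand, might look something like the following. It is a sketch under stated assumptions: `make_rows`, `chunked`, and `insert_batch` are hypothetical helpers, and `insert_batch` only counts rows where a real version would run one multi-row INSERT against some node and commit.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


def make_rows(count):
    """Build rows keyed by random UUIDs instead of an incrementing counter."""
    return [(str(uuid.uuid4()), f"record-{i}") for i in range(count)]


def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_batch(batch):
    """Hypothetical stand-in for one transaction that inserts `batch`.

    It only reports how many rows it was handed; no database is contacted.
    """
    return len(batch)


rows = make_rows(10_000)
batches = list(chunked(rows, 1_000))

# Run the ten transactions concurrently rather than as one large insert.
with ThreadPoolExecutor(max_workers=10) as pool:
    inserted = sum(pool.map(insert_batch, batches))

print(len(batches), "batches,", inserted, "rows")
```

Random keys spread writes across the key space, and the smaller transactions can be handled by different nodes in parallel, which is the mindset shift being described.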
Yeah, yeah, absolutely.
And I really love that you keep saying
about this change in mindset
because I think, I truly believe
it's a very important thing that's happening
and engineers should learn about it.
And based on my experience
and my exposure to distributed systems,
one of the biggest revelations
that I got from distributed systems
is about designing
systems with having in mind that things will go wrong. Things going wrong is not the exception,
right? It will happen, right? And that's a big part of trying to build distributed systems.
What are all these edge cases and what are the limits and what can we do to secure our data,
our transactions and the behavior of our systems when we are dealing with all these problems?
And I think that one of the bad things that happened because of the introduction of cloud,
and unfortunately, this is also part because of marketing, is that cloud was also evangelized
as a solution that takes all the hard problems away, right?
Like, I can have my servers there.
I don't have to worry if a hard
drive dies, right? My file system and servers will still be running. But at the end, this is not the
case. And actually, I think that whatever happened, like, there are many failures that are happening
on cloud. And as you are dealing a lot with resilience, and you see also like from large
scale deployments from your customers, how often do you think this is happening, and how much of a problem is it for an engineer to keep in mind
that failures will happen? It doesn't matter if we are on Google Compute Cloud or on AWS,
we have to build all the components of our systems around the concept that something might go wrong
and we have to be ready for that.
So first of all, Kostas,
you're going to go and attack the marketing guy?
Really? You're just going to blame it on marketing, buddy?
Come on.
That's funny.
You know, I mean...
I actually went the reverse way.
I came from marketing and we always blame marketing.
Jim, that's the...
I know.
You know, look at what some marketing organizations do,
you know, they're going to go a little too far. And I think there's definitely
some of that, Kostas. I know exactly what you're talking about. And like, you know, they're
delivering, they're, they're selling a promise of something that's just not a reality, or if it is,
it's really difficult to attain. Right. And I think, and you're right, like this, this distributed
thinking and distributed
mindset requires you to basically build for resilience. That that's the concept. Like,
that's the thing. Like there's no such thing as disaster recovery because disaster should have no
impact. Right. Like, and so how do you design for that? And well, that takes a whole, this is what
I mean by you have to re-architect. You have to rethink everything that we ever thought about before as kind of architects of systems.
You kind of re-architect, you architect for resilience in the system itself.
And I think that's kind of one of these core, again, one of these kind of core concepts
that came out of the Google team over the past, you know, 15, 20 years.
And I think, you know, there's a lot of research.
There's a lot of technology, a lot of software engineering in this, you know, I mean, understanding Raft and Paxos,
understanding things like MVCC, you know, the, some of the core kind of concepts that are out
there, I think are, you know, part and parcel of how to do this. Luckily enough, a lot of this
stuff is open source, which is just awesome. Right? Like, and so you want to go get a PhD
in how to actually implement Raft, go check out etcd Raft, right? Like go check out the implementation. There's some amazing people who
worked on that, including parts of our team. But I think there's a lot of
examples for people to actually go out and figure out how to do that because, you know what, you're
right, Kostas, everything fails. Everything fails. And if you don't understand your own,
you know, your own mortality, everything fails,
y'all. So, you know, regions do go out and, you know what, backhoes hit cables every day.
And Google has failures, regions go out, and Gmail goes down. These things happen. It's about
basically dealing with it. Like the concept of an SRE is brilliant. You know, talking about RPO and
RTO and understanding what those things are.
As a developer,
I found it to be extremely important to get
because I think you'll start looking at it
in a different way.
Yeah, absolutely.
I totally agree.
And as you mentioned SREs,
what should an SRE expect
when working with CockroachDB?
I guess they could sit around and eat bonbons
and let the thing run all day and work on other things.
You know, funny, like, you know,
I typically think of SREs in the concept of,
a lot of times in the concept of Kubernetes.
I've been kind of in the Kubernetes community for a while
and I just love being a part of it.
And, you know, from that point of view,
well, you got a database that's already fit
for this kind of world.
You know, it's aligned with their objectives, right? It is built for easing scale. Spin up a node, point it at the cluster, great,
all the data balances. You don't have to really deal with those sorts of things.
We can do rolling upgrades. There's online schema changes, right? The whole nature of a distributed
system allows you to do some really cool things. So it's kind of a low touch database for the SRE
in many different ways, but it's aligned
with the way that they're moving forward with adoption of orchestration systems.
You know, this is something that was built for Kubernetes or Nomad for that matter.
And so, you know, for us, our conversations with the SRE is typically around that.
Now we employ, oh gosh, I know we have a small little army of SREs who are managing
and dealing with Cockroach Cloud right now, our own managed service.
And, you know, they'll tell you, I'm sure they'll laugh at this part of the conversation.
It's like, what are you talking about, man?
There is a lot of work we do.
They do.
And, you know, but it's built to be automated.
And I think that's the whole key there, right?
That's what that concept is all about.
So, yeah.
And the last thing that an engineering team wants is to have their SREs unhappy. So I think it's quite important to keep
them happy.
It's kind of like, remember when there was like an unhappy DBA? Oh, I'm sorry.
They were just unhappy all the time, mostly.
Yeah, I know what you're talking about. All right.
We're getting close to the end of this amazing and very exciting conversation that we have. I have two more questions. One is about open source and databases.
And I think anyone who has worked like with databases,
especially like in the past couple of years,
we see that pretty much having an open source version
of the database is mandatory.
It's out there.
How important is open source
for building a database system in your experience?
Yeah, I mean, it's a great question.
Let me ask you a question back, Kostas.
What's important for you with something being open source?
That's a very good question.
And I think it has many dimensions, the answer.
But I would say that one of the most important things around open source is support.
I feel that the project is alive and there are people there to take care of it, especially as the project is complex.
So for me, as someone who would try to architect or engineer a system, that's important, especially for the backbone of my system,
which is database, right?
That's right.
And I agree.
It's community, right?
And it's about building people who are all kind of into it
and using this and seeing the code base move on, right?
And so I always think about there's code
and then there's the community side of things.
And so code has got to be open source, right?
Community has got to be there.
The funny thing is when,
I think we get into these weird conversations
about the commercialization of open source
and we confuse the business model
with the open source project because ultimately, like, look, man, I've been in open source for a long time.
And the beauty of open source to me is consumption.
Like, it's free.
I could go use it.
And I have all this community of people to support, right?
And, like, that's, you know, and so the problem is over the past three years, what's changed is consumption.
Like, how do you consume software
today? Y'all like you, you go and spin up a service in some public cloud provider. And so
the consumption factor has changed. Now, what we've done is we've taken free beer away from
open source. Basically that's, we always talk about free beer or free puppy, right. In the
context of open source. And, and so how do we get that back? How do we get it so
that everybody can use that tool, but still consume it as a service? Now we're hellbent
in making sure that we do that. We build up a community of people that are around us.
We changed our license about a year and a half ago, two years ago, to the BSL,
the Business Source License. I think MariaDB was the first one to have it. And basically what it says,
it says, look, what we want to protect ourselves from is a large cloud, a public cloud provider,
taking our code base and going and making a bunch of money off it. Right. I mean, it's like Elastic.
Okay. Like Mongo, the same. Like, right down the board, we've all changed our licenses because
database technology is a little bit different than other open source technologies.
It is complex. Remember I talked about this seven, eight years to get it to a point where it's even kind of like, you know, I mean, you guys, Postgres has been around since '96.
It was the official.
I mean, it really started like 88, something like that.
Like these things, they aren't simple.
And so, you know, we want to build a business.
We want to build something that's going to be there and be the right database for all consumers, not just every single developer, but every large enterprise too.
And for us, you know, it's a balance, it's a delicate balance of doing those things and
doing it the right way. And I think if you build a good, honest, humble, and kind of, you know,
open community and are authentic, you can do that. And that's what we're about. So I always
think of open source as code, community, and consumption.
And then there's this weird thing about license that everybody gets wrapped around the axle
on.
So yeah.
Yeah.
I think that's one of the best descriptions I have heard about open source and how it interacts
like with a business.
Cool.
So Jim, last question from me, and then I'll let Eric.
What's next for CockroachDB?
What's in the product roadmap that you have
and what exciting new things
you are going to deliver in the next couple of months?
Yeah, you know, the ultimate vision of this company is,
you know, it was funny when I first got here,
it was like make data easy.
Well, we actually do want to make data easy.
And I kind of really wrestled with it
when I first got to Cockroach
and I first met Spencer about four years ago, actually.
But we're doing things that take these complex concepts
and make it really simple.
Like deploying a database across multiple regions.
Like, you know, where does data get located?
I don't, you don't need a PhD.
Like, okay, you can kind of do that.
You can do this in, say, Cassandra, but you need a PhD.
How do you make it so dead simple? It's declarative, right? Like, with simple SQL statements, and the database just
takes care of this complexity. You know, we spend a lot of time doing those things,
taking the complex distributed concepts and making them simple.
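To make "declarative" concrete, here is a rough sketch of what a handful of such statements could look like, generated from Python. The statement forms follow CockroachDB's multi-region SQL as it was documented around the time of this episode, to the best of my understanding, so check the current documentation before relying on them. The helper `multi_region_statements` and the database, table, and region names are hypothetical.

```python
def multi_region_statements(database, table, primary_region, other_regions):
    """Return SQL statements that declare a multi-region layout.

    Identifiers are interpolated directly, so this is only suitable for
    trusted, hard-coded names such as the illustrative ones below.
    """
    statements = [
        f'ALTER DATABASE {database} SET PRIMARY REGION "{primary_region}";'
    ]
    statements += [
        f'ALTER DATABASE {database} ADD REGION "{region}";'
        for region in other_regions
    ]
    # Keep each row close to the region recorded for that row.
    statements.append(f"ALTER TABLE {table} SET LOCALITY REGIONAL BY ROW;")
    # Ask the database to stay available if one whole region is lost.
    statements.append(f"ALTER DATABASE {database} SURVIVE REGION FAILURE;")
    return statements


statements = multi_region_statements(
    "shop", "orders", "us-east1", ["us-west1", "europe-west1"]
)
for statement in statements:
    print(statement)
```

The application states where data should live and what failure it should survive, and placement and replication are left to the database, which is the simplification Jim is pointing at.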
But we also understand that consumption is a big deal. And so, you know, our ultimate vision,
our ultimate vision is that Cockroach Database is a SQL API in the cloud.
I want to make data available to every single developer on the planet, no matter where they
deploy their application.
And I want them to just communicate via SQL to some REST interface or whatever it is into
the cloud.
And let us deal with scale.
Let us deal with resilience.
Let us deal with locating data
so that you're going to be guaranteed, say,
you know, I don't want to put a number out there,
but sub 50 millisecond, you know, access to data,
no matter where your users are on the planet, right?
And so, you know, for us, it's how do you do that?
And, well, you deliver it through, you know,
kind of the, you know,
the whole kind of move towards serverless.
So how do you build this truly serverless database? You know, make it multi-tenant, you know, be able to spin up and
spin down dormant clusters. So we don't get killed on cost, right? Like make it consumption-based,
you know, all the security controls that have to be in place. And so, you know, for Cockroach,
we're pushing really far at that. You know, we launched a beta version of this, Cockroach
Cloud Free. The free beta is available on our website. People can start to play with it.
You know, it's limited to about five gig of storage and, you know, it's single region for now.
But, you know, it's where we're building and focusing a lot of our future because we really do believe consumption is via, you know, A, the cloud.
B, more importantly, I think people just want to, I think people just want an API.
Let all that complexity just meld into the background.
I don't want to have to think
about scale. Just give me a bill. And truly delivering on that promise, I think, that's
where we're headed. And I've never been more excited about a company because I think this
vision is right. And we'll see how it plays out over the next couple of years.
Yeah, absolutely.
And we will be watching closely. All right, so Eric, it's your
turn.
The burning question that I mentioned at the beginning, and this is just more out of curiosity,
we love to give our audience a little bit of insight just into the people behind,
you know, these companies and technologies and stuff. And I'm interested to know, coming from a programming
background and now working in marketing, what principles have you brought from programming
into marketing in your role? And how has that background helped you frame the way you think
about it?
Oh my God, my team's going to laugh. Structure. Structure and framework. Always give
people a framework to consume. Break it down into three
things. Give it structure, because otherwise you're just all over the place. And so I'll often
start questions like this and say, well, there's three reasons, Eric: one, two, and I don't even know
what the third one is, you know, while I'm talking about the second one. So, you know,
just having frameworks for people to actually understand things is just really critical. And I think in all aspects of our life, I mean, you know, if I'm going to write
a paper, well, there's a heavy outline done before we do that so that we're all in agreement, you
know, we're, we're 30% done, right. That way we're all directionally correct. And, and, and having
those concepts definitely apply in marketing for sure. Because I mean, ultimately, you know, what's your, you know,
God, we used to do PRDs and, and, you know, you know, these deeper technical kind of concepts,
conceptual diagrams. So we were all aligned, right. And so that, that, that, that core concept
has been fundamental, a game changer for me in, in, in my career and product marketing for sure.
Very cool. And one last question, and this is,
you know, we don't know a ton about, we didn't discuss a lot about how the, how the org is
structured at Cockroach, but one thing that I think would be helpful for our listeners,
especially with your unique background is a lot of our listeners are engineers working with data in some way or in some engineering
capacity. And a lot of them interact with marketing teams in various ways. And those
relationships are all over the place. We've had interesting, just interesting discussions with
people about how they interact with marketing. And we'd just love your thoughts on that. What does
a really good relationship there look like in terms of sort of the people working with data
from an engineering standpoint within a company
and then how the relationship with marketing works?
I'm very fortunate to work in a company
that works the way that Cockroach Labs works.
You know, there's a level of respect
across all the functions in this company that I find to be truly unique, you know, in all the places I've been. And I use the term respect because it's actually pretty important. I don't think a lot of organizations actually understand what's, what's beneath the iceberg when it comes to marketing and how complex it is and how difficult it is. People think it's a website.
Why are you writing it like that? Why are you doing this? Like, like there's so much that goes
into it. We work as hard as anybody else in the organization. And, and honestly, I think the,
the, like, don't get upset at marketing because they're doing something wrong,
help them get it right. Right. Like I use this word with, with our sales and marketing teams
all the time. I use the word authentic all the time. Authenticity. You got to be authentic. Like, I don't go into a company and
not understand what an AZ is when you're talking about it. You know, you don't say it's Arizona. It's
an availability zone. Do you know what that means? You know how it physically looks in a
data center? Help them understand what it is and make sure they get these concepts
right? Because the more authentic marketers can be,
the better off everybody's going to be.
The better off we're going to be able to translate
and sell what you're trying to do.
So don't be against them, help them,
I think is one of those things.
And I think that's where the best relationships we have
across our marketing team is there.
I mean, look, we're selling to developers.
You know, I love talking to my development team.
Well, they're going to be a little bit more
software engineers sometimes, or a little bit,
you know, out there sometimes, but you know, we learn a lot from them too. And so that, that,
that respectful communication back and forth and the, the, you know, having the patience to
look, man, the, you know, you may think something's wrong on the website. There's a whole
lot of other stuff going on y'all. Right. But if something's wrong, call it out too,
and do it respectfully. I think that that's the,
that's the thing. It's not a simple job. Love it. And I now feel fully guilty for
earlier advocating the idea that we can blame everything on marketing.
It's okay. Hey, but you know what? Call them out when they're wrong too. Like,
hey guys, but this is why you're wrong. Like, don't just say you're wrong; say,
this is why you're wrong. And this is the effect that it's going to have,
right. Give it a reason. Right. And so call it out. Gosh, by all means, you know, we're,
you know, we're here to make it better. We are also after a goal too. Right. And that's,
that's critical. Love it. The, the concept of respect in that relationship and really all,
all relationships is huge. And I think that'll be really helpful for our listeners. Jim, it's been a really wonderful show. Is cockroachlabs.com the best
place to check out all things Cockroach?
Yeah, absolutely. You know, the free tier,
of course we're hiring, we're always looking, we're growing like crazy right now. Y'all,
this is just a lot of fun. So yeah, everything's there at cockroachlabs.com.
Very cool. Well, we'll check back in
with you in another six months or so, have you back on the show and thank you again for your
time and insights.
Well, thanks for having me, guys. I really appreciate it.
Wow. Another super
interesting show. I think my big takeaway was hearing Jim talk about the sort of lagging migration of transactions to the distributed architecture.
And that was just really interesting to hear about how difficult that problem is and how
sort of optimizing existing systems wasn't going to work in order to deliver the experience that,
you know, sort of ultimately people are demanding. And I just thought that was a really thought-provoking answer to that question.
How about you, Kostas?
Yeah, absolutely.
I think, like, at the end, distributed systems are hard.
They are hard to build.
And most importantly, it's hard to reason about them.
At the end, you can end up in situations where you're traveling in time.
It can be completely mind-bending. So there is a reason that it took a while to see all
these technologies becoming more and more approachable out there. I think, though,
that probably the most important outcome from our conversation with Jim today was about the need
for the engineers to change their perception and start
thinking more in terms of distributed systems and computing.
And that this is going, it's something that's going to become more and more important in
the future.
Not necessarily in the way that it's like, okay, anyone, everyone has to understand how
Raft or Paxos works, but more about understanding the differences
and the challenges and also the advantages
of using distributed systems
and how these affect your product, your architecture,
and the overall way of thinking in engineering terms.
I think that's super important.
And it's what makes marketing in this company really important. And I think talking with Jim today is a testament to that: marketing can really be an educational tool to help all these engineers out there figure out the right things to understand and the right concepts from distributed systems to use in their everyday work.
Yeah, I agree.
It was interesting.
I almost asked a question that we ask a lot of our guests, which is, what are some other
ways that people are solving this problem today?
And the more I thought about that question, I ended up not asking it because we were talking
about a shift in the way that you think about architecting a system.
And so I just appreciated
his perspective on the mindset shift that's required. Well, thank you again for joining us
on the Data Stack Show. Be sure to hit subscribe on your favorite podcast provider so you can get
notified of new shows every week. We have a great lineup in the next couple of weeks. You'll want
to be sure to grab those episodes. And until next time, we'll catch you later.
The Data Stack Show is brought to you by Rudderstack,
the complete customer data pipeline solution.
Learn more at rudderstack.com.