The Changelog: Software Development, Open Source - RethinkDB (Interview)
Episode Date: December 11, 2013
Slava Akhmechet, co-founder and CEO of RethinkDB, joined the show to talk with Andrew about RethinkDB - the open-source database for the realtime web....
Transcript
Welcome back everyone.
This is The Changelog, and I'm your host, Adam Stacoviak.
We're a member-supported blog, podcast, and weekly email covering what's fresh and what's new in open source.
Check out the blog at thechangelog.com, our past shows at 5by5.tv slash changelog, and subscribe to the Changelog Weekly.
It's our weekly email covering everything that hits our open source radar.
You don't want to miss it.
It ships on Saturdays.
Subscribe at thechangelog.com slash weekly.
This show is hosted by Andrew Thorp.
It's episode 114, and it's sponsored by DigitalOcean and Toptal.
We'll tell you a bit more about Toptal later in the show, but they're awesome sponsors of ours. We absolutely love them.
They connect startups, businesses, and organizations to a growing network of elite engineers all around the world.
Head to toptal.com slash developer. That's T-O-P-T-A-L dot com slash developer.
And DigitalOcean. We love DigitalOcean.
We're hosted on DigitalOcean. And we want you to be hosted on DigitalOcean.
Today, get hosted on a blazing fast DigitalOcean SSD cloud server.
You can easily create a brand new droplet with root access in 55 seconds.
Literally, in 55 seconds, you'll be at your prompt setting up your new machine.
You get your choice of size, region, operating system,
all through a simple and easy-to-use dashboard or via the command line if you want to.
They've got an API.
And as for our fans across the pond, DigitalOcean just announced their brand-new second Amsterdam data center.
AMS2 just opened up on December 2nd and now offers expanded server capacity to Europe,
as well as shared private networking,
which was previously a feature we only had here in the States at their NYC2 data center.
We want you to try DigitalOcean today for free using our promo code.
Try them out today.
ChangelogSentMe is the promo code to use.
You'll want to use that when you enter your billing information.
There's a spot there asking for your promo code.
Or if you miss it and you sign up, just email support.
Let them know that Changelog sent you, use ChangelogSentMe as your promo code, and they'll hook you up.
It's a $10 hosting credit you'll get.
So we want you to enjoy DigitalOcean.
Head to DigitalOcean.com today to get started.
And now, on to the show.
We're joined today by Slava Akhmechet to talk about RethinkDB.
Welcome to the show, Slava.
Hi, Andrew. It's good to be here.
Yeah, so RethinkDB is a...
I love your tagline on the website.
I do this often. I say, built with love.
But RethinkDB is an open-source distributed database built with love.
Why don't you give us a little introduction.
First, who is Slava Akhmechet and what is RethinkDB?
Yeah, well, I was born in Ukraine and I moved to New York City when I was 13.
I now live in California.
You know, I did my undergrad in computer science.
I worked for the financial industry for a while.
And then I sort of didn't fit in, so I went to grad school.
And, you know, we looked around and we saw that there are a lot of changes
in how people access databases and sort of a lot of changes of how things get deployed,
how applications get built.
It was me and my co-founder, Michael, and I'll tell you more about these details as we get deeper in.
We thought we were going to start a project to take some of these ideas and some of these thoughts and sort of implement them into a product, an open source product that people could use.
So we moved from, I was in grad school at the time doing something totally different.
We were doing computational neuroscience and supercomputers.
And it sounds kind of fancy, but really it was just trying to figure out how to simulate big
things with a lot of interconnections on IBM Blue Gene, which turns out to be really difficult.
So we were doing that and then started Rethink, moved to California.
And we've just been working on this project ever since about 2009.
Gotcha. So you guys moved out.
Did you go through the Y Combinator? Is that right?
Yes, we did. That was actually the catalyst for moving to California
and then we never went back.
Gotcha.
Yeah, so it's a relatively new project.
I mean, NoSQL is, I wouldn't say new,
but it started to really gain in popularity in the last couple of years.
What is it that made you want to do your own thing?
Were the current solutions not good enough?
Were there no solutions that you were aware of to solve the problem?
What really made you kind of rethink the NoSQL?
Well, I think the really major, there were a lot of factors going into it,
but there's one thing that I think is a really big deal.
If you look at traditional databases and even NoSQL databases,
they're databases that just happen to have a programmer interface, like an API.
And we saw this trend, like if you look at programming languages, people understand that, you know,
developers spend many, many hours a day building their programs.
And these things don't just have to be, like, easy or pragmatic.
They also have to be pleasant because pleasant programming languages win.
So we thought that we're going to start a database that is a developer tool
first and a database second.
And what that really means, I mean, there's a lot of details that go into it,
but every time we design a feature or sort of make any kind of a decision,
we first think of developers and what it feels like to develop in the system.
And then after that, we think of all the implications in the database,
to the database world and the operations world.
And what comes out of that is what we think, and a lot of our users think, is
a really, really pleasant database to develop in because many times people,
when they build web applications, right, like backend is a huge, huge deal
and they spend many hours a day just working through a lot of these things. So it's stuff like a really pleasant administration UI that takes a lot of cues from many of the
consumer projects or consumer products.
Like why do consumers have to get better UIs than programmers?
It's something that didn't sit well with us, so we thought we're going to make that part
really good.
It's things like a query language that's designed to be just a really unique,
pleasant, and pragmatic query language.
We wanted to do that.
So if you take this core premise that it's a developer tool first
and a database second, a lot of very interesting things come out of it,
and you get something that looks quite different and feels quite different from anything else out there.
I'm not sure, does that make sense?
Yeah, it does. I mean, actually, in some of your docs you like to call it the best of both worlds.
When I first saw the RethinkDB interface, it reminded me a little bit of CouchDB, right? The same kind of idea.
So you say there's like the more developer-oriented products,
which would include CouchDB, MongoDB, and things like that.
And then there's the more ops-oriented solutions
like Cassandra and Riak,
which are a little bit more difficult to get started with,
and they are designed for kind of a different purpose.
Would it be appropriate to say that Rethink
is more of like a DevOps solution?
It's like the mixture of the two?
Yes, we always wanted to do that.
So I think, actually, in a lot of the NoSQL projects,
and really databases in general,
this tension between developers and operations
and how the team behind the project manages that tension
is really what pretty much defines the project.
So, for example, in the case of Cassandra and Riak to a large extent,
this tension between developers and operations and how they make decisions
definitely falls closer to the operations side, far closer to the operations side.
Because in Cassandra's case,
it was really important to maintain write availability.
So they designed a Dynamo-type system.
And then if you're writing an application,
you have to deal with conflicts and things like that.
So just by design,
it makes writing applications a little bit more difficult
and running a large system a little easier.
And then MongoDB was kind of the opposite, where they made really pleasant decisions for designing applications.
It was just JSON in, JSON out, really simple.
You couldn't do joins, couldn't do many things.
So it was just that simple system that people really, really loved.
But then on the operations side, things got tougher because of failover and things like that, which weren't as nice as they were in Cassandra or Riak.
So Rethink is just our own take on this tension between developers and ops.
And we thought that a lot of these systems are very nuanced. So if you start looking at the details and looking at the nuances, we thought that we could design a much more balanced, much more pleasant experience.
But the product is definitely developers first.
We sort of look at what it's like to develop applications, what it feels like just from, you know, landing on the page to downloading the product, doing the first five minutes and so on.
And then we, of course, have to make sure that it works for operations, that it's
good, that it's pleasant for people.
But whenever there is a decision like a trade off and we can't do the best of both worlds,
we usually fall closer towards the developers.
Not always, but usually.
Yeah, so one of the things that you tout is the query language, and I read very positive responses to ReQL.
What was the decision behind ReQL?
Give me some information about when you guys sat down to talk about your query language.
You know, what do those talks sound like?
Because that's pretty low-level stuff to talk about.
Yeah, so I'll sort of start with an anecdote.
I don't know if you remember, there was an operating system a long time ago called BeOS.
Do you remember that at all? This was like maybe in the 90s. It was a media operating system.
Sounds vaguely, vaguely familiar. I was very young.
Yeah. Well, so BeOS was an operating system with a really pleasant UI. And I think someone asked the lead developer or an architect of BeOS, how did you guys get a UI that is so snappy?
And the guy said, oh, it's easy.
The UI guy was sitting in a cube very close to the kernel guy,
right next to the kernel guy.
And that interaction just resulted in a snappy UI.
So I think what happened with ReQL at Rethink
is that I'm originally a programming language person.
I absolutely love programming languages.
I used to just build interpreters for fun for different languages and learn like every language I could get my hands on.
And then when we started Rethink and we started building the team around it, and this was completely unconscious,
the people that joined also happened to be programming language people.
Not because I was looking for that or anything.
It's just because people just tend to unconsciously sort of attract people
and work with people that are similar to them.
And then my co-founder, Mike, was a UI person,
so he got people to join that were
really interested in user interfaces. So a lot of us are programming language people
and we thought, okay, we have to design an interface and it has to be really pleasant,
it has to be easy to use, it has to be familiar to people. So just starting with these premises, we built a query language that's sort of like
a domain-specific language that integrates into whatever language you use. So if you're using
Python, for example, the RethinkDB query language is just a library for Python. Or if
you're using Ruby, it's just a library for Ruby. So some of these things were pretty easy.
But once you get into like
the esoteric parts of it and how lots of pieces fit in, a lot of the discussions get pretty
contentious. People have different ideas, different opinions. So we've created almost like,
I mean, to some degree, it's like the US judicial system, right? It's very adversarial.
And this adversarial process, I think, results in something
quite good. Sometimes it's stressful. There's a lot of tension. Sometimes, you know, people don't
often agree. But I think at the end, it results in a really pleasant experience for people.
Yeah, I mean, you're ultimately working toward the same goal, right? So if you guys have different opinions, you can, you know, be adults and sit down and talk about it.
Oh yeah, absolutely. So the way the process works, it's actually completely open online, if you go to GitHub and search for RethinkDB and look at the issue tracker.
When we started, we couldn't do that online, so we sat down in a room, and the first version of ReQL was just completely banged out in a room with five people sitting around.
Right now, because the core of the language already exists and most of the changes are smaller, all of the discussions are happening online, on GitHub.
So if you look at the issue tracker and look at ReQL issues, you'll see exactly what the process looks like.
Typically, we have a discussion process where anybody can participate: anyone who's working on Rethink, or users, or really anybody at all.
We time-box it, so it takes about, I believe it's a week, to settle on an issue. And then if we still can't settle, there is a tiebreaker, and it's just the person who we think has a really good sense for programming languages.
So we try to arrive at a consensus, and if we can't, that person breaks ties. That's how the process works right now.
Gotcha. So yeah, it's open. We've had guests on the show, I think, like Chad Whitacre from Gittip, that would love to hear that the community and everyone kind of plays a part in the decisions that are made.
That's a pretty cool thing.
How often do you have to...
Go ahead.
So actually, the community playing a part in design discussions has been a huge deal
for us.
I think it's incredibly important because what often happens, and actually, it's not
just Rethink.
I think it's open source in general. What used to happen with, you know, commercial projects is people would release a feature and then they'd get the feedback afterwards. You could do all sorts of stuff before, like studies and betas and demos and things like that, but it's just not the same as having users jump in on a GitHub issue during the technical discussion and comment on what you're doing.
And so far, I mean, I wouldn't say every single ReQL design decision benefited from this, but, like, the majority probably did.
How often do you have to, you know, I feel like a year, two years ago, maybe a little longer than that, the question that I always read was, you know, NoSQL versus SQL, right?
What's the right solution?
Should I use like a Mongo or should I use like a Postgres or, you know, what's the solution for my application?
How often now, it seems like that question has shifted now to people kind of know what they want to use for their solution.
And now it's like, it's gone back to if you're going to use SQL, like, is it MySQL or is it Postgres?
If you're going to use NoSQL, is it, you know, which one?
So how often do you have to answer or kind of defend the decision of whether to go with, you know, Postgres or Rethink, kind of a thing?
Well, I think we're...
So Rethink is a young product and a young project,
and a lot of people that start using Rethink
already have a very good idea of what's going on.
So we very rarely have to talk about Rethink versus Postgres
or Mongo versus Postgres or anything like that.
I think most people who use RethinkDB pretty much know.
But if you zoom out a little bit and look at programmers in the world in general,
I think there is still a lot of education to do
and a lot of work to do for people to understand the differences
between these two approaches and what fits when.
Because people have, I mean, we've studied relational systems and taught relational systems to people for the past 40 years, and I don't think a very fundamental change like that can happen within a couple of years. I think it's going to take a while for the programming world at large to really understand the difference.
And I actually think even people building these things, like, we're learning every day how RethinkDB is and isn't useful to people.
So even for the vendors and the people that are building these projects, it takes a while to understand what their project actually means and what it does for people, and when it's a good idea and when it's not such a good idea.
Yeah, I mean, I would say most people, myself included, tend to still just think relationally in terms of, you know, our system design. So I'd probably imagine that a lot of people who are going to Rethink or going to Mongo are just kind of doing it at this point because it's the new thing to do, and still trying to slam relational models into it and use it that way.
And like you said, I mean, object-oriented in general kind of lends itself to relational ideas. So at what point, how many years will it take before we are able to actually kind of free our minds of that and think in different ways that really enable this mindset?
Well, we sometimes talk about this.
So when people first built cars, they used to not be called cars.
They used to be called horseless carriages.
And NoSQL kind of reminds me of that because when you define a whole field by an absence of something, that means the field is pretty young.
It's going to take a while for it to really settle.
I think if you jump a little bit into the details, when people first start using Rethink
in particular, they maybe start with preconceptions of relational design, but then they very quickly
learn not to necessarily do that because the project just sort of guides them towards
the thing that makes sense.
You know, these things are often not about what's possible, because you could build anything in anything.
It's more about what's easy and what is like the path of least resistance.
So people learn pretty quickly on an individual basis the moment they start using Rethink.
And I'm sure that's true about other NoSQL projects too. But the world at large, I think it will probably take another five to ten years for this to really become old news.
And everyone just understands what everything is and what it means.
So let me ask you then, just kind of for an answer, what makes NoSQL a good choice?
And then more specific, what makes Rethink a good choice once you've gotten to that point?
So I think NoSQL as a field is still definitely young. But what makes NoSQL a good choice is two things. The first is that a lot of data that people work with now,
it's not relational in nature, at least not as relational as it used to be. It's much more
hierarchical.
And pragmatically what it means is if you just do a relational design,
you're going to have a lot of missing columns.
You just have 1,000 columns, and in most rows, most of them are null,
and it's very unpleasant to work that way.
And NoSQL makes that very pleasant.
You don't have to worry about that very much.
That's the first thing that makes NoSQL easier for that kind of problem.
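To make the sparse-column point concrete, here is a minimal Python sketch. The column names and values are made up for illustration and do not come from RethinkDB or any real schema. It shows the same record as a wide relational-style row, where most columns are null, and as a hierarchical document that stores only what exists.

```python
# Hypothetical column list for a wide relational-style table.
COLUMNS = ["name", "email", "twitter", "github", "fax", "pager", "office"]

# A relational-style row carries a slot for every column, used or not.
row = {column: None for column in COLUMNS}
row.update({"name": "Ada", "email": "ada@example.com"})

# A document stores only the fields that exist, and can nest related data.
document = {
    "name": "Ada",
    "contact": {"email": "ada@example.com"},
}

null_columns = [column for column, value in row.items() if value is None]
print(f"{len(null_columns)} of {len(COLUMNS)} columns are null in the row")
print(f"the document has {len(document)} top-level fields and no nulls")
```

With a real table of a thousand columns the ratio of nulls would be far worse, which is the unpleasantness being described.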
The second thing is scale-out.
So there was a big promise of that, and it's still, I think, quite debatable whether NoSQL makes things easier to scale out in practice today,
but I think when the field matures, it's definitely going to be the case,
because the thing is fundamentally more scalable than relational systems just because it does less.
And when these systems mature, I think scale-out is going to be a no-brainer in NoSQL,
but it's still going to be hard in SQL. So that's the field in general. As far as Rethink,
we make it really, really, really easy to build applications that have to deal with JSON.
Specifically, if you want to do things other than sets and gets and basic aggregations in a single table, the moment you start doing cross-table stuff or cross-collection stuff, Rethink just makes that really easy. The programming language is really easy.
And then you build your app, and then we make deploying and scaling out just a very pleasant and easy experience.
You could go to rethinkdb.com and watch the video, and we sort of show, like, a one-minute video of how easy it is to scale things out. It's just a press of a button.
So we make building applications and then scaling them out really simple.
Now, I would point out that Rethink is still in beta, and we say it on the front page.
We're getting very close to making it be a production release that people can start using in real production products,
and a lot of people have already.
But we've been very careful about making promises to people because these systems are hard.
They take a long time to design.
They take a long time to iron out the bugs so they work well.
So Rethink is new.
And we certainly encourage everyone to try it and play with it and start building applications.
But it's always a disclaimer that I kind of use before we start offering commercial versions of the product.
Yeah.
Being in beta, I mean, so, you know, just to kind of be transparent, there's a 13-minute video I watched, right? The first thing I did with Rethink, I was like, let me watch this video.
A 13-minute video, and you guys kind of explained Rethink, what it is.
You showed me sharding, replication, failover, all in 13 minutes.
And I just think back to a couple years ago, somebody trying to explain sharding to me, and somebody trying to explain what their replication strategy is to me, and it's just shocking to me that you guys can do all that in a 13-minute video.
Well, so it's 13 minutes to demo the product, but it's about three years to make all of that possible.
Right. Exactly.
Yeah, it's really cool.
So let me ask you this.
You guys officially support, I guess, three languages, is the best way to put it, right?
Python, Ruby, and JavaScript.
What's your favorite implementation and why?
So I am personally a Python fan.
But I think, and I love Python.
I love the programming language.
And I love the Python driver, RethinkDB driver.
I use it a lot.
I also use JavaScript a lot, both because I like the language and I like the driver.
I'm not a fan of Ruby myself.
But if I had to be honest with myself and with everyone listening, I'd say that the
Ruby driver for RethinkDB is probably best just because Ruby, with their blocks
and in general how the language is designed
and how easy it is to hack in and do anything you want,
it's the most pliable if you want to build a domain-specific language.
So the Python driver, for example, and the JavaScript driver are great, but Ruby, the language, makes some things easier.
Specifically, I think blocks are the most important, and it's a little bit difficult to describe without, you know, just actually typing, so I can't do that verbally. But if you look at rethinkdb.com and see just a basic example of what it looks like in Ruby and Python and JavaScript, Ruby just is a little bit nicer.
Yeah, well, I mean, it's just the idea of chaining in general. JavaScript chaining is great, but there are some parts of it that, for anyone who's worked in JavaScript, feel weird sometimes. And Ruby lends itself to that, I think, in a really elegant way. It's a great language for DSLs and stuff like that. So, awesome. Well...
The fact that... oh, sorry, go ahead.
No, you got it.
I was just going to say, the fact that blocks have a really nice syntax makes it easier, because in JavaScript you have to type, like, the word function, and that's a lot of typing to do, whereas in Ruby you just put brackets, and that makes things a lot easier.
Yeah, definitely. Let's go ahead and pause for a minute and give a shout out to our sponsor, Toptal.
Yes, let's give a shout out to our awesome sponsor, Toptal.
They've been sponsoring the show for a bit now, and they're going to sponsor, I think, one more month.
But I've been working with their CTO, Brendan, and I mentioned before I wasn't quite sure what to expect from them when we first started working with them.
But I've got to say these guys are the real deal. They're engineers themselves from top to bottom.
They built the company around engineers.
They're not non-technical recruiters trying to pimp developers.
They're a network of engineers from all around the world who work with some really awesome clients.
And for those of you out there who are freelancing, or maybe you'd like to freelance, or maybe you're in a full-time position kind of doing one thing by day and you'd like to do another thing by night, let's say Node or something in JavaScript or Ruby, just as an example, and you'd like to try kind of testing out freelancing, you've got to check out Toptal, because they're doing some really awesome stuff with companies like Airbnb, Artsy, IDEO, and many others.
You can work remotely, from a beach or anywhere in the world.
No office required.
To get started, head to toptal.com slash developer and click Join the Best.
Because they want to work with only the best senior engineers out there, they've got a well-thought-out four-stage screening process that begins with a personal call via Skype to kind of get to know who you are and what you're up to, introduce you to Toptal and what their mission is, and see if you're a fit.
From end to end, the process includes an English speaking test, a timed algorithm test, technical interviews with core Toptal engineers, and a test project.
But once you've got through that screening process, the sky is the limit. And if you think you have what it takes, head to toptal.com slash developer to get started.
Tell them The Changelog sent you.
toptal.com slash developer.
All right, so we were talking about which languages and Ruby versus Python, and we don't want to get too much into that right now. But what I do want to kind of get into is just a little bit more specific, deep dive
into Rethink itself and, you know, less of the theory behind it.
And like, let's talk a little bit about how it works.
So what do you guys recommend for the kind of the best way for somebody to get started
working with Rethink?
So the way we wanted to design it, getting started is almost like a game, right?
So it's got to be really easy when you start out.
And then as you start doing more advanced things, it should keep being easy and the learning curve shouldn't jump too much.
So, you know, getting started is really easy.
You can go to rethinkdb.com.
You can download it on Linux or OS X.
And then there is a tutorial for pretty much any language, you know, Ruby, Python, JavaScript, but you could really use this with any programming language.
The tutorial is just 10 seconds, and then if you like that, you can move on to a 10-minute tutorial and start inserting documents and querying and doing more advanced things.
Gotcha. Let's talk a little bit about the querying. I think it's neat, the way that you guys do the chaining.
Let's specifically talk in JavaScript. Every, I don't know, operation is essentially a chain.
This is what we're talking about with ReQL, the query language, right?
So you would basically say, you know, r.database, and I guess that's probably optional if you're only dealing with one database, I'm not sure. You'd pass the name of your database into that function, then you would say .table and pass the name of the table into that function, and then you would start talking about your operations and what you want to do, and then you end it with a run.
Yes. So the query language is designed in a way where
you start, so the data sort of flows left to right.
So on the very left of your query, you specify where the data comes from.
Usually it's a table, right? So you say table, you know, users.
And then after that, you say dot, and you can put any command you want.
So for example, you want to filter users in a specific city.
So you say dot filter, and then, you know, the city that you want.
And then you can say dot again, and let's say you want to group things, so you say, you know, group by da da da da. And then you can say dot again, and you can just do this indefinitely.
So it's very similar to how you do chaining in jQuery, if people are familiar with that. It's also very similar to how you do it on the Unix command line in Bash, right, where data just flows left to right and you can keep adding pipes, and each pipe is just an operation on that data.
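The left-to-right flow described here can be sketched in plain Python. This is a toy, in-memory illustration and not the RethinkDB driver: the `Stream` class, its method names, and the sample users are all assumptions made up for the example.

```python
from collections import defaultdict


class Stream:
    """Toy chainable wrapper: every method returns a new Stream."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        # Keep only the rows for which the predicate is true.
        return Stream(row for row in self.rows if predicate(row))

    def group_by(self, field):
        # Bucket rows by the value of one field.
        groups = defaultdict(list)
        for row in self.rows:
            groups[row[field]].append(row)
        return Stream(
            {"group": key, "rows": members} for key, members in groups.items()
        )

    def count(self):
        return len(self.rows)


users = [
    {"name": "Ann", "city": "New York", "plan": "free"},
    {"name": "Bob", "city": "New York", "plan": "paid"},
    {"name": "Cy", "city": "Boston", "plan": "paid"},
]

# Data source on the left, then one operation per dot, like pipes in a shell.
new_yorkers = Stream(users).filter(lambda u: u["city"] == "New York")
plans = new_yorkers.group_by("plan")
print(new_yorkers.count(), plans.count())
```

Unlike this toy, which evaluates each step immediately in local memory, the real query described in the conversation is only executed once it is run against the database.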
So then once you actually execute it, once it hits run, it actually executes everything from before, right?
Yeah, so the important
subtlety here is
as you write that query in JavaScript, all of that is on
the client.
It's all written in JavaScript.
And you say, you know, table, filter, group by, you can count things, you could do whatever
you want.
You know, you could do joins across tables.
But all of that is still just a program in JavaScript.
And then when you type .run and you give it the connection to the database, what happens is the client takes that query, packages it into a binary format, into protocol buffers, actually Google protocol buffers, and that gets shipped over to the database server.
And then the RethinkDB cluster, basically the machine on the other side, the server machine, takes that query, compiles it down to a distributed program, and sends it out to all the nodes in the cluster. It knows where everything is, so you can send the query to any machine and it gets the data, and then as a user you just get the result, right?
So none of this gets executed in the client. The client side is just a convenient way to write the query. The whole thing runs on the server, in the cluster.
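The build-on-the-client, execute-on-the-server split can be sketched roughly as follows. Everything here is a simplified assumption: `Query` and `FakeServer` are hypothetical names, JSON stands in for the protocol buffers wire format, and a single in-process object stands in for the cluster.

```python
import json


class Query:
    """Toy client-side query: records steps, executes nothing."""

    def __init__(self, table, steps=()):
        self.table = table
        self.steps = list(steps)

    def filter(self, field, value):
        # Only append a description of the step; no data is touched here.
        return Query(self.table, self.steps + [["filter", field, value]])

    def count(self):
        return Query(self.table, self.steps + [["count"]])

    def run(self, server):
        # Serialize the whole query and ship it. JSON stands in for the
        # binary wire format in this sketch.
        payload = json.dumps({"table": self.table, "steps": self.steps})
        return server.execute(payload.encode("utf-8"))


class FakeServer:
    """Stand-in for the database side: decodes and evaluates the query."""

    def __init__(self, tables):
        self.tables = tables

    def execute(self, payload):
        query = json.loads(payload.decode("utf-8"))
        result = self.tables[query["table"]]
        for step in query["steps"]:
            if step[0] == "filter":
                _, field, value = step
                result = [row for row in result if row[field] == value]
            elif step[0] == "count":
                result = len(result)
        return result


server = FakeServer(
    {"users": [{"city": "Boston"}, {"city": "Austin"}, {"city": "Boston"}]}
)
query = Query("users").filter("city", "Boston").count()  # nothing has run yet
print(query.steps)
print(query.run(server))
```

The point of the design is that the chained calls only build a description of the work; the data is read and filtered wherever `execute` lives.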
One little thing I wanted to point out.
I was looking at your FAQs, and at the top of the site,
I see a little example on inserting into students,
and it looks like you guys are kind of taking a shot at SQL with the SQL injection.
Bobby drop tables.
Yeah, Bobby drop tables.
That's funny.
But that does kind of give me a, I mean, is that just because, do you guys deal with just only this ReQL and this format?
Or can you actually write actual, not SQL, but something similar?
So right now we only deal with this format.
But if you look at, if you actually dive into the details of how the protocol is designed,
there is no reason this has to be a DSL in Python, Ruby, JavaScript,
or any other language.
This could be a text language.
We just haven't designed one yet.
I think this is going to be important for people like business analysts
who later, you know, they have a running database
and they want to analyze the data.
And I don't think, well, I'm not sure, but, you know,
I think it's nicer for people to be able to do it in a language closer to English rather than Python.
So we're thinking about this a little bit, but yes, there's no language like that now.
Right now it's just a DSL.
And as you pointed out, an interesting property of that is you can't really get injection attacks in the way you can with SQL.
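The injection point can be illustrated with a small sketch. It assumes nothing about RethinkDB itself: the SQL side uses Python's built-in sqlite3 module, and the structured side is a hypothetical dictionary-based query with a made-up run_structured helper. The idea is that when a query is text, user input can become part of the query's code, while in a structured query the input only ever occupies a value slot.

```python
import sqlite3

# A tiny SQL table to attack.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE students (name TEXT)")
db.executemany("INSERT INTO students VALUES (?)", [("Alice",), ("Bob",)])

user_input = "nobody' OR '1'='1"

# Text query built by concatenation: the quote in the input closes the string
# literal, and the rest of the input is parsed as SQL.
text_query = "SELECT name FROM students WHERE name = '" + user_input + "'"
leaked = db.execute(text_query).fetchall()

# Structured query (hypothetical format): the input is just a value in a tree.
students = [{"name": "Alice"}, {"name": "Bob"}]
structured_query = {
    "table": "students",
    "filter": {"field": "name", "eq": user_input},
}


def run_structured(tables, query):
    """Evaluate the toy structured query; the input is compared, never parsed."""
    cond = query["filter"]
    return [row for row in tables[query["table"]]
            if row.get(cond["field"]) == cond["eq"]]


safe = run_structured({"students": students}, structured_query)
print(len(leaked), len(safe))  # 2 0
```

SQL drivers close the same hole with parameterized queries; the sketch only shows why a query that is never assembled from strings does not have the hole in the first place.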
So do you think that you'll ever, do you think it would be SQL that you would support or
would you write your own mapping or how would you, what kind of decisions?
I don't think it would ever be SQL for a couple of reasons.
I think SQL isn't very good for hierarchical data and people have tried extensions to it
in particular, like Postgres has extensions to SQL to work with JSON.
And, you know, it's okay.
It's not nearly as nice as a language designed from scratch to work with hierarchical data.
So I don't think it will ever be SQL.
I just think it's going to be designed.
If we ever do this, it's going to be more for non-programmers, if that makes sense.
SQL sort of has this interesting property where it was designed for non-programmers, right?
And then programmers were kind of forced to use it,
but it was really designed for business people.
So we designed the first version of the language
as DSLs for programmers.
And then if we ever do a SQL-like language
for business people,
it's not going to necessarily look like SQL.
It's just going to be closer to a natural language.
So you don't have to put like quotes and dots and things like that, which non-programmers probably don't understand.
Right. So one thing that's interesting that kind of, I don't know, just to me personally,
it jumped out was watching the tutorials on Rethink, the join part of the language. And
I think that being a non-relational, I think that you don't see that a lot with NoSQL because they want to – I mean the word join kind of implies that these two different databases or – I'm sorry, two different tables are related in some way.
And so that's some sort of relationship. With hierarchical data, they oftentimes do have things that are, you know, that are relatable, or you
want to, maybe they're not necessarily related to each other, but you want to compare with each
other, things like that. So what was the decision behind supporting a join like that? And why do
you think other, you know, NoSQL solutions, and I don't know which ones do and don't support that,
but you know, what do you think goes behind that?
Well, it's actually really interesting,
because when people talk about relational databases,
I mean, when this thing was designed in like the 70s and 80s,
the word relational really came from mathematical relations,
which has almost nothing to do with relationships,
but because the word sounds so similar, it has the same root.
People talk about relational databases in terms of relationships between data.
And this was completely unintended, right?
This was not the original intention at all.
And with NoSQL, you just can't escape the fact that data has relationships.
I mean, every hierarchical data, graph data, any data,
it's all about encoding relationships,
whether it's SQL databases or NoSQL databases.
And to us, a join operation was really a no-brainer because if you look at what people do with
a database like MongoDB, for example, that doesn't have a join operation, what they'll
do is they'll have a table where they'll often get the data out into the client and
then loop through every record and then go to the database again.
And you can, of course, you can get around that
by storing documents inline,
but you can only do that to a point
because that's not necessarily very scalable.
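The pattern described above, pulling rows into the client and going back to the database once per row, can be compared with a server-side join in a short sketch. Everything here is hypothetical: FakeDatabase is an in-memory stand-in that counts round trips, and its get_all, get_by_id, and join methods are illustrative names, not the API of MongoDB or RethinkDB.

```python
class FakeDatabase:
    """In-memory stand-in for a database server that counts round trips."""

    def __init__(self, tables):
        self.tables = tables
        self.round_trips = 0

    def get_all(self, table):
        self.round_trips += 1
        return list(self.tables[table])

    def get_by_id(self, table, key):
        self.round_trips += 1
        for row in self.tables[table]:
            if row["id"] == key:
                return row
        return None

    def join(self, left, left_field, right):
        # One request; the matching happens on the "server" side.
        self.round_trips += 1
        right_by_id = {row["id"]: row for row in self.tables[right]}
        return [{"left": row, "right": right_by_id[row[left_field]]}
                for row in self.tables[left]
                if row[left_field] in right_by_id]


tables = {
    "authors": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}],
    "posts": [{"id": 10, "author_id": 1, "title": "A"},
              {"id": 11, "author_id": 2, "title": "B"},
              {"id": 12, "author_id": 1, "title": "C"}],
}
db = FakeDatabase(tables)

# Client-side "join": one request for the posts, then one more per post.
posts = db.get_all("posts")
client_result = [(p["title"], db.get_by_id("authors", p["author_id"])["name"])
                 for p in posts]
client_trips = db.round_trips

# Server-side join: a single request returns the paired rows.
db.round_trips = 0
server_result = [(pair["left"]["title"], pair["right"]["name"])
                 for pair in db.join("posts", "author_id", "authors")]
server_trips = db.round_trips

print(client_trips, server_trips)  # 4 1
```

With three posts the difference is four requests versus one; the client-side count grows with the number of rows, which is the cost being described.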
And we thought that, hey, Rethink has to support both
because it's just a matter of time
until every NoSQL database supports a join operation.
It was sort of a no-brainer to us,
so we just went ahead and did it
because we designed the architecture on day one
to support commands that work across tables.
And you could do this,
so pretty much anything you could do in SQL,
you could do in Rethink,
so you could do subqueries and things like that.
If you're running a MapReduce command or something, you can put a join inside there, you could do subqueries inside there.
It never made sense to us that a query should just be on a single table.
We always thought it should be able to support dealing with relationships.
Yeah, that's kind of a big decision though, right? I mean, do you guys have to kind of answer for that a lot? Or it seems like that would be a pretty big selling point of Rethink.
Yes, it's a big selling point.
So the downside to this, of course, is that a system like this is much, much harder to develop
because there's a lot that goes on in the back end to make this work.
And it almost makes the complexity like exponential, right?
It's just so much harder to develop a system like this,
so much harder to design an architecture
and then every feature you have to think about how it fits in.
So we have to pay for that in just development time.
You know, every time we do something,
we have to make sure everything fits.
But now that we understand that really well,
it became a lot easier, I think.
Early on, we just had to pay a lot in development time. But we think about this in terms of just,
you know, what's better for users, and we thought it's totally worth it.
So talking about what's better for users, can you kind of give me, like, a practical application
for Rethink, something, you know, a real world scenario where it would make sense?
We originally designed it for web applications and
mobile applications, but we just find people
use it in a lot of different places.
People use it in municipalities
to record police events.
People use it in biotech
to store gene sequence data.
It just shows up all over the place.
It was very
exciting and
sort of makes me personally very happy to see that a lot of people like, just like what we've
built and find it useful. But I still think that every time you're dealing with, so Rethink is
really useful every time you're dealing with JSON. So, you know, stuff like log data, any kind of
middleware where you're dealing with different APIs.
Anytime you're doing things like product catalogs where you can't, you know, you have different products and they all have different structure.
Just really anytime you're dealing with JSON or hierarchical data, Rethink is really useful.
And I still think most of the time that's building things for the web.
Gotcha.
Do you have any plans to release Windows support for Rethink?
I'd love to do this.
I actually grew up on Windows.
I think one of my first development environments was Visual Studio.
So I'm still in love with that platform.
I think it's just a matter of time until we do it.
We don't have plans for this right now
because we don't want to increase the surface area of the project, right?
Because the moment we port to Windows, we have to support it and everything gets a little bit harder.
So sooner or later, we're going to do it.
I don't have an ETA for this right now.
Gotcha.
So you talk about having to support it.
And you mentioned earlier that you guys are still in beta. Although you're in beta, do you see people using
this in production? Like anyone you know, or, you know, any companies that are, you know, big companies
or anything using this in production?
Yeah, so one thing we quickly learned is people don't listen
when you say it's in beta, right? Like, Gmail was in beta for a very long time, you know, up to a point
where, like, the whole world was using it. So the same is true of Rethink. I can't speak to specific
companies right now. We're definitely going to, you know, post it on the site and talk about it and
do case studies and sort of showcase a lot of interesting use cases. But yeah, people definitely
have been, you know, starting to build production software on Rethink, like, from day one, which
really surprised us because, you know, we expected people would be a little bit more careful.
Right.
So like with traditional solutions, you have, you know,
I mean, sharding and replication.
Those aren't, I mean, those are common things, right?
And pretty much every database solution has to handle that in some way.
It's so easy with Rethink, though.
But so part of that is, and I think a lot of, you know, developers and ops people,
I think they like to kind of have fine control and fine tune.
But when I'm watching this video and I see,
I don't know who it was that was doing the video, but when I see him shard one of the servers and replication was so easy.
But is there fine tuning?
Can somebody get in there and really tune, like, you know, I don't know,
like to speed up queries, or can they do stuff like that in Rethink?
Oh yeah, so there's a command line interface that allows you to really dive deep down into the details and take
complete control over the system. We designed it with the idea that, well, there are a couple of
ideas there the first is we learned that when you automate too much,
it works, and it works, let's say, 95% of the time.
But 5% of the time it breaks down.
Well, that's great, but it's not very useful to people, right?
Because they don't know what to do when there is an actual error.
So we didn't want to automate too much,
and we wanted to build it in a way where administrators could do,
you know, could be very explicit about what they want.
And we built that first, and that's available in the command line.
And then after that, we thought, you know,
to get started with the system, that's got to be really easy.
So we built tools on top of that
that use the lower-level tools to automate all that,
and that's what you see in the web UI.
And it turned out to work really well,
so, you know, 95% of the time,
people just do not have to look at the deeper thing
because the high-level interface will work.
But if you want to, you totally can.
You just type rethinkdb admin on the command line,
point it at the cluster,
and you can administer and change pretty much anything you want.
Awesome.
Yeah, so you guys have a page, SQL to ReQL, and I think it's neat
to see how these projects are vastly different, and, you know, just in general, but how easy it is
to kind of map terminology and stuff like that, it's pretty cool to see. I think it's going to be,
you know, it really helps to enable people who've been in a traditional environment to kind of move
into the next era of databases and really learn.
It's not like learning from the bottom.
You kind of have a foundation already.
Yeah, it's actually amazing how similar they look but how different they feel when you actually start using the two things.
Yeah.
Let's talk a little bit about the business.
And Rethink is a – we talked a little bit. You guys were in Y Combinator and this is public information.
You guys put this – I think it's on your website or somewhere but you guys have raised
funding but at some point, you guys have monetization at some point and making money.
So what's the goal look like for Rethink as a business?
So we really want the product to be open source forever.
It's sort of at the core of what we do.
Every developer here really cares about that,
and we think it results in better software for people.
So Rethink will always be open source.
Well, always is a long time, but I really believe that.
I can't see a world where it wouldn't be. Let's put it that way.
But commercially, I mean, we wouldn't do anything very different from other companies like this.
We plan to offer support versions, supported versions of RethinkDB, so support packages.
And we found that what happens is developers pick up Rethink.
They start building an application on it, and then they hand it off to operations people. And operations people usually want to make sure that if something goes wrong,
they can pick up the phone and call someone on the other end of the line.
So that's the model for Rethink.
We're going to offer support versions and announce them pretty soon.
I can't talk about the details right now.
And that's going to be the immediate monetization. And we have a lot of ideas on what to do after that, specifically with services and platforms as services.
But I don't want to get into that too much.
It's a little bit early for that.
Yeah, that's fine.
So you guys, though, obviously are thinking about things like that.
And part of what comes with that is you guys have started to really – well, I don't know if started is the right word, but you guys have gotten a lot of popularity.
So when you first started working on this project and you guys kind of started the business and all that, there were other viable options to NoSQL and just databases in general.
Were you expecting the kind of popularity that you guys have now, or has this kind of taken you by surprise as far as just your day-to-day goes?
Oh, it's definitely taken us by surprise, at least with the very first release of RethinkDB as it is now.
We worked – these systems take a while to build.
It's not like it took three months and then we released it. We were working on it pretty much in isolation
for about, I want to say, two and a half or three years
because it took a really long time to design the architecture,
make everything work,
and make the first sort of quantum of utility
that we could release.
And that's a really long time.
Very few projects take that long.
So when we released it, and people were just absolutely blown away by the UI and the query language, how easy it is to use and how pleasant and how all these things feel.
I mean, that felt amazing.
We would never expect that kind of popularity early on. Because every time we'd make a decision, it sort of felt as the right thing at the time,
but you never really know how people are going to perceive it or they're going to understand
it.
Is it going to be useful to people?
And the fact that on balance, most of these decisions came out, I don't want to say right,
but at least useful to a lot of people.
I think that's definitely not something we expected to this degree.
Yeah.
So when you guys started, who was the team?
It was you and one other person, is that right?
It was me and my co-founder, Michael, and we had a third co-founder, a guy named Leif
from Stony Brook University, Leif Walsh.
He has long, long flowing red hair.
I still remember that.
We're still, I mean, we're still friends.
Leif now works at Tokutek,
which is not a NoSQL company,
but also in the database world,
in the database industry.
And then right now we're a team of 11,
but it started with just the three of us.
Right.
I love looking at the people on RethinkDB, and your title is Raising the Bar.
What does that mean?
Well, I'm the CEO.
Officially, I'm the CEO of the company.
But if you look at what I do on a daily basis, it's really anything from just basic services,
make sure the fridge is stocked and the engineers here have what they need to get their jobs done,
all the way to feature design and architecture and project management
and, you know, talking to people and things like that.
But I think if you boil it down to one thing,
it's about getting the product to be so good that people just can't ignore it.
It's got to be so pleasant and so helpful and so nice for people. And they have
to find it so valuable that they just can't, you know, not talk about it, not pick it up,
not download it, not find it useful. And that I think is the main thing that I do,
or I'd like to think I do that, you know, the jury's still out, but that's how I think of my job.
Awesome. So you guys got a bunch of contributors that you've kind of specifically noted, just probably because of the amount that they've
given to the project. But it looks like you're also hiring, is that accurate?
Yes, that's right.
We actually, so I can't talk about this too much, about the financing, but we're going to announce
this pretty soon. And yes, we're hiring people all over the board. I can talk about that a
little bit.
I don't know if the audience is interested in this kind of thing.
But yes, we are hiring and we're looking to make the project hopefully even better than it is now.
Awesome.
If you're interested, just head over to their website and click on people and you can get some more information.
Again, we won't belabor the point here.
But yeah, I mean, Rethink, it's really cool to see you guys growing.
I know I want to kind of just in general thank you for being so flexible with me.
This has been a crazy couple of weeks, but it's been really cool to see Rethink growing
and the company and the product and the community around it.
And to me, anytime that you can, I don't know, I look at you again, looking at your people page
and anytime you can kind of distinguish the core team from the notable contributors and the
contributors list is just as long, if not longer than the core team means that you've got something,
right? It means that the community is interested and it means that, um, that there's something
here and we just kind of hope that we can watch you guys succeed in the future with it. And,
you know, it's definitely a really awesome product. During the show today you said that
there was a bunch of things that are coming soon and announcements that are going to be made, and
you didn't want to talk too much about them. So it sounds like there's going to be some
news to follow on, to kind of keep up with Rethink. So how can people do that? How can
people keep up with you guys?
Yeah, there's definitely a lot of energy. Actually, so it's interesting, you started the show with
"built with love" and why that is. And, you know, we think of ourselves, the people that work
for Rethink, we just think of ourselves as contributors that happen to get paid. And there
are a lot of contributors that are, you know, just from the community. But so we tried to get rid of that divide.
And anybody who contributes to RethinkDB, and even the user, is just sort of part of this group, part of the team,
and everybody cares about the project and what it means.
So to answer your question, you could follow at rethinkdb.com slash blog.
We always announce things.
You could look at GitHub.
Or you could follow us on Twitter, just @rethinkdb, and all the announcements happen there.
So any one of these three channels, you could hop onto IRC
and you'll know what's going on and you can follow some of the energy
and some of the things that are happening.
Awesome.
So for our listeners that are new,
we ask the same three questions at the end of every show.
So we'll go ahead and ask them.
The first question for you, Slava, is for a call to arms, for the community to help out with RethinkDB, what would you like to see?
So what we're trying to do right now, the big push, is making the experience more unique for people who use Django, people who use Ruby on Rails, and people who use Node.js. So we already have the three drivers in the languages,
but we want to make it unique and a nicer experience
for people building specific, you know, using specific web frameworks.
So for anybody who's a core contributor to Django or Rails or Node,
we are hiring right now, and we're looking for people
to contribute to the drivers and make RethinkDB
just a better experience
for those environments, please shoot me an email, jobs at rethinkdb.com, and we'd love
to talk about it.
Other than that, download the product, play with it, send us your feedback.
That's the most valuable thing.
Awesome.
If you weren't doing this, whether it was working at Rethink or just programming in
general, what would you be doing instead?
I'd try to pick another problem in software that I think would make a big difference in the world.
The thing I think that really excites me is 3D printing.
It reminds me of Star Trek replicators, and I think it's going to be a huge deal.
So if I weren't doing Rethink, I'd probably work on that.
Nice.
You'd be doing Rethink printing.
Yes.
I'd have to be useful.
I'm not sure I know very much about the field.
Well, you could at least,
maybe you wouldn't be working on it,
but you would be playing with the prototypes.
Yes.
We actually are building a 3D printer
from a kit at Rethink.
Awesome.
And the last one is for a programmer hero.
So somebody that's kind of influenced you
up to this point in your career.
I'd say it's, I mean, John Carmack comes to mind. I grew up with
his games. I'm just absolutely amazed with his ability to marry research and pragmatism and
getting people something amazing that they're just amazed by.
And he really inspired me just as a kid when I started programming,
and he still does.
Yeah, that's a pretty – I was going to say,
I remember his name from Quake and stuff, but looking at it,
that's a pretty crazy chain, Wolfenstein 3D, Doom, Quake, Rage.
That's crazy that he's kind of been the lead on so many successful projects.
Yeah, the guy is amazing.
I mean, he should be an inspiration, I think, to the whole generation of programmers.
He probably is, right?
Yeah.
Awesome.
Well, I want to say thanks again for joining us.
And, you know, I was, once again, to reiterate, we kept tossing your day around to which day you would join us.
And every time you came back with a no problem, that'll work great.
And you've been really, really flexible.
And I just want to say I appreciate that for joining with us.
Thank you, Andrew.
I'm happy to be here.
I'm always excited to talk about Rethink
and talk about open source and technology in general.
So it's no problem at all.
I'm happy to be here.
Awesome.
I also wanted to give another shout-out to our sponsors,
DigitalOcean and TopTal, for supporting the show.
Head to DigitalOcean.com to set up your cloud server today, and make sure you use our promo code CHANGELOGSENTME, that's CHANGELOGSENTME, all caps, to get a $10 hosting credit.
And if you want to do freelance with companies like Airbnb, Artsy, or IDEO, head to TopTal.com slash developer, and click join the best to see if you have what it takes
to join Toptal's network of elite
engineers. Again, that URL is
toptal.com slash developer.
And that's it for this week. Thanks again
to Slava for joining us and also thanks
to the listeners for tuning in and for your support.
If you haven't yet, subscribe to the
Changelog Weekly. It's our weekly email where
we share everything that hits our open source radar.
You can subscribe at thechangelog.com slash weekly.
So for now, let's say goodbye.
We'll see you next time.