The Changelog: Software Development, Open Source - ZeroDB (Interview)
Episode Date: January 8, 2016
MacLane Wilkison and Michael Egorov, the creators of ZeroDB, joined the show to talk about ZeroDB, an end-to-end encrypted database (protocol): why it's open source, how it's different from other encryption techniques, performance for running encrypted queries, and an interesting topic called proxy re-encryption.
Transcript
I'm MacLane Wilkison.
I'm Michael Egorov.
And you're listening to The Changelog.
Welcome back, everyone.
This is The Changelog, and I'm your host, Adam Stachowiak.
This is episode 190.
And on today's show, Jared and I are joined by MacLane Wilkison and Michael Egorov,
the guys behind ZeroDB,
an end-to-end encrypted database and also a protocol.
We talked about why it's open source, how it's different from other encryption techniques,
and whether there's a performance hit for running encrypted queries.
We also talked about the business side of this thing and also an interesting topic all in itself,
proxy re-encryption, and so much more.
Our three sponsors on the show today are Codeship, Toptal, and DigitalOcean.
Our first sponsor is CodeShip.
If it works with Docker, it works with CodeShip.
For those out there with established Docker workflows
or those looking to leverage native Docker support
while automating your testing and deployment,
check out CodeShip's new Docker platform
at CodeShip.com slash changelog.
And for those looking for great resources
to automate your development workflows with Docker,
you should download CodeShip's free ebook
covering why consistent environments are so important,
how a company lost $400 million in 45 minutes
due to inconsistent environments,
and also how to build an app
to run inside an isolated Docker container.
Head to codeship.com slash changelog to check out Codeship's new Docker platform, and head
to resources.codeship.com slash ebooks to download that free ebook I mentioned on automating
your development workflows with Docker.
And now, on to the show.
All right, we're back today talking about ZeroDB with MacLane Wilkison and Michael Egorov,
talking about an end-to-end encrypted database, also a protocol. We'll dive deep into that.
Got Jared here as well.
MacLane, Michael, welcome to the show.
Thanks, guys. Just excited to be on here.
I guess maybe the best way, since we have two guests on the show today, Jared, let's maybe kick off real quick with
some quick introductions, just to kind of pinpoint voices. I know when listeners listen to
any podcast and there's more than a couple of voices, it's hard to kind of put a name to who's speaking.
So MacLane, we'll go with you first.
So give an intro to who you are and what you do with ZeroDB.
Sure.
So again, MacLane Wilkison, one of the co-founders of ZeroDB.
Software engineer, also do business stuff now
that we're an official company at ZeroDB.
Have a background both in software
engineering as well as more traditional finance. I worked in investment banking for a few years and
I had a positive conversion from that. But during that time I was covering tech, media, and telecom
companies before getting into startups. Awesome. And Mike, what about you?
Yeah, so I'm a software engineer, and basically I'm
the guy behind the tech here, even though MacLane is also working on the tech in our company. And
I also have a physics background.
And how this got on our radar. So I think, Jared, you know,
chime in here on this, because I thought this was kind of interesting. Before we record, Jared and I have some sort of sync-up, right, to think about, like,
you know, what is this show going to be about, who's on the show, what are their backgrounds, that kind
of thing. And Jared said to me, Adam, how did this hit your radar? Jared, what did I say? I kind of stuttered
a little bit. I had no idea.
Yeah, you didn't know. You were like, well, we had to backtrack a bit.
It was December, I know that.
Yeah. I'm going to guess Hacker News.
No, no, you would think Hacker News.
Now, this is how cool we are.
This is a slight pat on the back to us because we've been trying.
Right.
So we have this email called Changelog Nightly.
And that email goes out every single night.
It's essentially top new repositories, top starred repositories every single day from GitHub.
And your repo hit Changelog Nightly when it was open sourced on December 7th.
And so that hit the top new repos.
And then the very next day, December 8th, jumped to top starred repositories.
And then days after that kept trending for several days.
And that's how we discovered you.
And almost immediately, I was like, this is going to be pretty interesting because it's obviously, you know, encryption, database related.
Sent you a tweet the very next day on December 7th and said, congrats on the launch.
Let's get you on the show sometime soon.
So it took about a month.
We're after the new year.
We're here in the new year finally.
And so now we have you on the show.
And that's kind of how this hit our radar.
Gotcha. Yeah, when we open sourced it in, I guess, the beginning of December, we had a little
bit of a positive feedback loop or viral effect there for a little while, because
I guess it caught some people's attention, got a bunch of stars, and then it hit the GitHub
trending page. And once you get there, it sort of feeds on itself for a while. So we had a
nice little ride there for a few weeks.
Jared, do you want to mention anything about Changelog Nightly?
I love it.
You love it?
I mean, the fact that it's our own radar?
Yeah, I mean, there's no secret to how we find awesome stuff.
Stuff that we put in ChangeLog Weekly or ends up on the show is 10 p.m. U.S. Central.
What's interesting on GitHub for the day?
And like you said,
certain things pop into that new repositories list
and they pop in and they pop out
and you never see them again.
And then other things will pop into that list
and then you'll find them in the bigger global list
of top starred altogether.
And yeah, you can just start to see what is trending,
what's interesting,
and what's worth clicking through and checking out.
Well, that's a good example of dogfooding, I guess.
Well, I wanted to mention that at the top of the show,
mostly because of the discussion that Jared and I had earlier.
Like normally we don't come on the show and make it about us,
but I thought just for a second,
we can talk about that simply
because it was literally from our own dog food, you know, using that metaphor, that
we found out about ZeroDB, and immediately I was like, we should get in touch with these guys, have them
on the show. But let's dive deeper into some more interesting things. By the way,
if you want to subscribe, it's a good transition: changelog.com slash nightly. Go ahead and subscribe to that every single night. But let's dive in, I guess, to the
topic of how you guys got started. Obviously, we're going to dive deep into ZeroDB, databases,
encryption later in the show, but let's find out where you came from. How did, I guess, you two
meet? Your backgrounds seemed to cross a bit in terms of, you know,
tech and financial and things you've mentioned, but how did you guys meet?
Yes. We're going to go back a little ways. Actually, even before ZeroDB, we were doing a lot
of blockchain and cryptocurrency stuff, sort of the intersection of our respective backgrounds.
We actually first met at the San Francisco Bitcoin users group. And so we
met there and sort of hacked on a bunch of different, you know, Bitcoin, cryptocurrency,
blockchain-related stuff for a while. And that was sort of the initial meeting and overlap of
interest, and out of that process we had the inspiration for ZeroDB, after
many different experiments.
You guys still attend that meetup, that Bitcoin meetup?
We do.
Yeah, actually, we are not in Silicon Valley at the moment. For the next several months, we're in
London. But yeah, we attended it out there.
For some reason, MacLane, you got real soft there.
Did you back up for the mic?
I'm soft?
Yeah.
Did you step away?
Did you move away or something like that?
No, I'm actually pretty close.
Is it still bad?
No, you sound okay now.
I don't know if anybody else, if you found it.
I noticed it.
I haven't noticed yet.
Yeah, you kind of got soft there for a second.
But you're back, we can hear you.
You're just softer.
Like your volume went down.
Like you backed up from the mic.
Yeah, I'm not sure what happened there then because I've been pretty close.
All right.
Sorry about the glitch there, everybody.
We'll see if we can edit some of that out.
But we might just leave it because that's radio.
All right.
So, Mike, what about you?
I mean, what's your story in terms of that meetup?
Like how did you get attracted to that Bitcoin meetup?
Yeah, so I'd been attracted to Bitcoin even before I came to the US.
At the time I was in Australia,
and I wanted to go to Silicon Valley to do some startup.
I didn't know which one.
So I went there just to work for a tech company, for LinkedIn,
for a while, and I started attending these meetups, especially Bitcoin meetups, and that's
where I met MacLane. And yeah, so I've been fascinated by Bitcoin, decentralized applications, things like that, and that's what sparked this
ZeroDB idea.
At what point, you know, was that the first meetup? That's when you first met, but like,
at what point did this discussion happen? What was the problem you guys were solving? Was it just, like,
you know, what was the situation there?
Yeah, we didn't really start working on ZeroDB at all, really, until, I guess, after a year of working on other stuff, sort of in that blockchain, you know, area.
We were looking a lot at these newer frameworks like Ethereum or Storj,
and actually, how ZeroDB came about: we were thinking, okay, if you have a decentralized
application, presumably, like every other application,
it's going to need data,
and some of that data is going to be sensitive
or need to be private.
But if you don't know where that data is sitting,
and you're not trusting the server necessarily,
like how can you have that data be secure
and usable in this decentralized world?
And that was sort of the original problem that ZeroDB was solving.
Although, you know, we've ended up building it for, certainly, a more traditional
client server architecture.
But that's the original origin story.
So from there, what was the conversation like? Was it like, hey, we should
create this technology, or hey, there are these problems and we should open source it, we should create a business together? Like, how did
that happen?
Well, I think, you know, throughout all these different experiments, we
had been planning on eventually having something that would become a business. I think
that was in the back of our minds the entire time, while also, you know, having the
flexibility to be able to work on cool stuff.
And we, so originally when we first had the idea, we didn't really think that it would be, that it would work.
We just thought it would be too slow.
And eventually we got around to sort of hacking together a really quick and dirty prototype
and realized, hey, actually, you know, you can, you can tweak some things and play around
with it and actually make it, make it performant enough that you can use it for, you know, real world
stuff and production applications.
And at that point, this was back in March of last year, so 2015.
At that point, we just thought it was kind of an interesting little technical trick.
But we posted about it on Hacker News, and it blew up there and hit the front page twice in one week,
which for me, as a, I would say, very frequent reader of Hacker News, was a cool thing.
And that was when we sort of realized, okay, hey, you know, this could be, you know,
something that could turn into an actual business.
Well, I was just going to ask, you know, where it goes from there.
I mean, we have more and more of these open source projects
that have a GitHub page and they also have an AngelList page.
And so it's a bit of a newer phenomenon
and we're talking to more co-founders slash open sourcers.
It almost sounds like I said, oh, sourcers.
Open sourcers.
And so, yeah, I would just wonder,
is like, okay, so you have this cool technology.
You obviously both want to do startups
or be in a startup or start a startup.
And you've had something that's interesting
on Hacker News,
which shows it has at least some technical merit.
Where do you go from there in March
to here we are in January of the following year,
and now it's, you know, a business? So what's the process, the steps? What does that look like?
Yeah, so what happened was we had, obviously, the really quick and dirty prototype, and
once we realized, hey, there's a lot more interest in this beyond just the two of us
from this Hacker News post,
we sort of built it out a little bit more,
started giving access to some of our friends
that were interested in potentially building stuff on top of it
and kept sort of on product development for a few months.
And then, I guess in the summertime,
we started reaching out to some banks,
because we realized, beyond decentralized applications, which are really cool in general, but I think in reality, it's not clear.
Banks have a whole lot of regulatory and compliance
constraints, so we wanted to see, you know, if there's a use case for an end-to-end encrypted database in that area. And obviously that's where you could potentially build an actual big business,
as opposed to, you know, in this decentralized world, which is, I think, very interesting but
still maybe a few years away from actually really taking off.
So I started having those conversations, realized, you know, this could be a tool that could help banks move a lot of their on-premise infrastructure to the cloud.
And so in general, banks have been very late adopters of cloud infrastructure, and not a whole lot of banks are running infrastructure in AWS or Azure.
The big reason for that is because they don't want to give up control over their data to a cloud provider,
which even if they're encrypting their data at rest, if they're actually querying it,
they're going to have to send the key to the database server at some point.
But with ZeroDB, you can have the keys on-premise, still have the data be queryable,
and so potentially offload a lot of that on-premise infrastructure to the cloud. And obviously, there's a lot of economic benefits to that, as well as scalability and performance.
So went through a lot of those conversations. It actually got announced on Monday of this week, so I can say it publicly. We're in FinTech Innovation Lab out here, which is a really good chance for us to talk with the 15 or actually 20 banks that are involved with this program here and potentially help them with that long-term cloud strategy.
Tell us about FinTech Innovation Lab, exactly what that is and what it entails for you guys.
Sure. So FinTech Innovation Lab is sort of an accelerator, but not really the way most people think of accelerators out in the States. There's one in New York; we're in the London one, of course. And it's really sort of a mentoring slash sourcing mechanism for the banks to find new, interesting technology that they can potentially use internally or help go to market.
So we're working and have the opportunity to work closely with a lot of senior IT executives and application developers inside of a lot of the leading global banks and figure out where ZeroDB would potentially fit within the organization.
What the other use cases are beyond the ones that we've already identified.
And there's 15 other companies in this program with us.
It's divided into three different tracks, five companies in each track.
We're in the tech track, which may be obvious.
There's a retail banking track and then a commercial and investment banking track as well.
So you said you're in London.
Where are you normally at?
I assume London's foreign to you, right?
Yep.
This is the first time I've really spent any time in London.
We're usually out in the valley, so we're in Mountain View typically.
Gotcha.
I was going to say, you met at this Bitcoin meetup. You're interested in blockchain and decentralized things and obviously security as well.
And then you decide to get together and build some technology, start a startup, and it
ends up not specifically using the technologies that Bitcoin relies on.
But nonetheless, here you are.
And I guess I'm just curious, going back to the Bitcoin, if you guys, maybe Michael, you
can address this one first.
Are you guys as bullish on Bitcoin as you were back when you first started
meeting up in San Francisco?
Well, probably not as bullish as we were initially, but we
still strongly believe in Bitcoin and in decentralized applications and things like that. That said, I would say now Bitcoin
and Bitcoin infrastructure growth
is at a much more sustainable rate
rather than this explosive craziness.
Have you guys seen any interest from fintech
and the banks in these types of technologies, like
Bitcoin, the blockchain-based technologies, for infrastructure or anything like that?
Yes,
certainly there is some interest. Interestingly enough, they recently called blockchain the
B-word.
As what?
A B word.
Oh, B-word.
Is it a bad thing or a good thing?
I don't know.
It's pretty much,
this word is used too often, I guess. And they jokingly call it B word
just to emphasize that.
That's funny.
People are talking about it so much,
it's become the B word.
Oh, I see.
Gotcha.
Well, I think this might be a good chance to step back, take a break.
Of course, we're here to talk about ZeroDB, which is your guys' startup and your open source project.
We'll probably start with the conversation around the open sourcing of it and why it's open source and all those things.
And then we'll dive deep into not just what it is, but how does it work?
Why is it cool?
And all those things that developers love to hear about.
So let's step back for a moment, hear from one of our sponsors, and we will talk about
all those things on the other side of the break.
We know you listen to other podcasts.
Don't worry, we're not upset.
We know the changelog isn't the only podcast in your list.
And you know what?
You may have heard advertisements on other shows about other hiring platforms and other places like our friends at Toptal.
But let me tell you, there is no match to how invested Toptal is in both sides of the equation.
You have companies out there who need great engineers, great designers.
And you have great designers and great engineers out there needing great opportunities.
And Toptal plays both sides of that fence very well, helping make sure the right kind
of opportunities are there and the right kind of developers and designers are there.
And if you're a CTO listening to this or someone on a team who knows you need to expand and
add more people to help reach your goals, Toptal is the best place to go and find the best designers and the best engineers out there.
And if you're one of those best engineers or best designers looking for great places to do great
work and freelance and travel the world and have a lot of freedom, but also have the same kind of
support you would want from working somewhere, Toptal is that place. You can blog with them.
You can travel the world with them. They have meetups all around the world. They absolutely love to encourage and to support and help the developers and the designers that are
in their network all around the world grow and be better. Head to toptal.com, once again, t-o-p-t-a-l.com. But if you want a personal
introduction on either side of the fence, whether it's an introduction to find the best designers
and developers, or if you're a designer or developer looking for great opportunities, get in touch with
me. I'd be glad to give you a personal introduction. Email me at adam at changelog.com.
And now back to the show.
All right, we are back speaking with MacLane Wilkison and Michael Egorov of ZeroDB, about ZeroDB.
And as we teed up before the break, our first question, as is often our first question on a show about open source software, is: why is ZeroDB open source?
Well, I think it's, I mean, well, there are a lot of reasons.
I mean, fundamentally, obviously, we're developers.
We believe in open source and we want the things we work on to be open source.
But even beyond that, just from a business perspective, I think in 2016, now that it's the new year, it's just super hard
to have a closed source proprietary software business, particularly when you're
dealing with infrastructure tools or dev tools. And even more so when it's, you know, something
that's so security focused, like ZeroDB. You of course don't want to, you know, have security
through obscurity. So obviously, the more eyeballs we can get on ZeroDB, the more we can be sure and confident in its security
and in our implementation of it.
And then beyond that, just looking at where the,
you know, where the software industry has gone,
you know, if you're a developer and you're selling to developers,
developers' first choice, they're going to look at an open source tool.
And if you're closed source, you're sort of starting behind the eight ball.
So I think it's very difficult to build a closed source tools company today.
Looks like you've selected the AGPL license.
Would one of you speak to that decision and what all that implies as an open source license
with ZeroDB?
Sure.
So we've looked very closely at 10gen and MongoDB
as kind of a model for what our business could potentially look like.
So with AGPL, obviously it's an open source license.
People are free to use ZeroDB for their projects.
The requirement there is if you're building on top of it
or making extensions to it,
you need to open source what you do on top of it as well.
And for a lot of people, that's fine.
For some people, particularly maybe enterprises
that aren't as into the open source movement quite yet,
they prefer to have what they build
still be closed and proprietary.
So in that case, we can sell them a commercial license.
And then we're also considering
if we include extra features and potentially like an enterprise edition that
we could license as well, although we don't, we haven't done that quite yet.
We're still thinking around that. And then obviously, you know,
on the business side, you can, in addition to selling licenses,
you can sell support and things like that.
I'm curious on that licensing front, like, you know, Jared, I don't know about
you, but I still feel like licenses to me, like they run together.
I have a good memory.
I try my best, but it's like documentation.
I have to go back and look it up.
And I'm just wondering, MacLane, for you, if you had any advice from somebody, like,
did you have a lawyer step in and sort of walk through that with you?
So as we have people listen to the show, I'm just wondering if there are resources out there,
aside from say the ones that are known, like choose a license from GitHub, which is really
helpful. What resources did you use to, to, to make the decisions you just made?
Yeah, we did. We did talk to a lawyer just to sort of sanity check everything we're doing. I
wouldn't say we went to a lawyer to start.
I mean, there's obviously tons of resources online in terms of advice around building an open source business
and what the different models might look like.
Of course, even though it's not super fun
reading through the actual licenses,
particularly the one that you choose,
you should probably do that at least once or twice
so that you actually understand.
Waste of time.
Waste of time.
Nobody reads them.
And then obviously, you know, talking to people that have done an open source business and have used a particular license.
One caution I would say is it's generally better, at least in my opinion, to choose sort of a standard license.
So something that's known and people are relatively familiar with. Obviously, the MIT license, the Apache license, the GPL ones. As long as
something's standard. I think where you could potentially get into trouble is if
you have some sort of bespoke license, and that just makes the hurdle for someone
actually using your software that much higher, because then, particularly if it's a company, they're going to have to have their legal department look at it and
understand it.
Right. You involve, nothing against lawyers, but you involve those guys and
things get slow. You have to do things that you wouldn't normally do, and you can't
be quite as agile around decisions.
Yeah, and one of the benefits of being open source in
the first place is lowering the adoption hurdle, where, you know,
if you're forcing people to go through the legal department,
that sort of defeats that entire purpose.
I like that process of looking at a company that you, you know,
that you like their model, you've seen it. It's, it seems to work at least,
uh,
with regard to the way they're going about building a business around open
source and then just saying, well, I like, for 10gen, like you said,
that's the kind of model that we want to follow with our business.
And so let's check out what they're doing
and use that as a starting point, at least,
for having a conversation with lawyers.
Or if you're able to get somebody's ear at 10gen
who's in those decisions and ask,
how do they feel about the fact that they have this license
and how is it working out for them?
I think it's a great way of going about it
as opposed to just starting with a blank slate.
Yeah, and I think the interesting thing
also with open source businesses is that,
in reality, the entire life cycle
of this open source movement
is very early, particularly thinking around
business models with it.
So I don't think anyone really has the answer quite yet.
So everyone's still sort of experimenting and seeing what might work, what doesn't work and trying to figure out how you can
build, you know, a sustainable business on, on top of open source software.
This is further down our notes. Jared and I put together notes for each show, and it's further
down the list than the first question, which was the obvious one, why open source? But it makes
sense to bring up here, which is the patent-pending algorithm piece.
How does that play in,
having a patent on software, I guess,
or I'm not sure what the exact patent is for,
but how does that play into the open source nature
of this project?
Yeah, so we have a provisional patent filed,
and I think the reason you file a provisional patent
is to sort of get a date on it, and then you have
up to a year after you file the provisional to apply for the actual patent.
I would say that was done out of an abundance of caution.
We don't know.
Obviously, it's open source.
The code is available and free for people to use.
I think also in practice, it's unclear that software patents
really work. You know, and so it's sort of an abundance of caution, a little bit of a check
the box from an investment perspective. You know, obviously a big question you get when you're
going to raise money is, you know, what does your IP look like? And is that defensible?
I wouldn't say the patent is really a
core piece of our strategy going forward. But it's something that we did just
in case. You know, obviously when we were thinking through this, we didn't know exactly
what things will look like, you know, a year from now. So we just wanted to have that
option if it was something that we thought would be valuable for the business going forward.
That makes sense.
Interesting.
Well, let's dig down into what ZeroDB really is and what it's here to offer us as open source developers, what it does and how it works.
You call it an end-to-end encrypted database.
Michael, could you unpack that for us and tell us what ZeroDB does and why it's different
than stuff that's already out there?
Sure.
What it allows you to do is pretty much to use this database to do queries without the
database server knowing anything about your data or knowing your keys.
So all the keys always stay with the client.
And still from this client, you're able to query your data,
obviously without pulling the data down to the client.
So for those out there who may be thinking,
well, couldn't you just use full disk encryption?
There are offerings out there.
You have record level or column level encryption.
You have full disk encryption.
And so I just want to put a stressing point
on how this is different than those things.
So could you compare and contrast it to like,
well, I'll just encrypt the entire disk on my server
and I'm good to go?
Yeah, so basically when you encrypt your entire disk
on the server or use column-level encryption,
every time you query, at least,
you expose your encryption keys to the server.
So these keys are at least in memory.
And any attacker who is sitting on the server can pretty much take a snapshot of memory
and figure out where the keys are and decrypt the data. And it wasn't so much of a problem
when the threat was somebody physically stealing your hard drives,
but now in a very networked world, most attacks are remote,
so if somebody infiltrates the server,
this is actually really a problem.
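
To make the point about keys in server memory concrete, here is a minimal sketch of the two setups being contrasted. It is an illustration only, not ZeroDB's code: the class names, `toy_cipher`, and `memory_snapshot` are all hypothetical, and the cipher is a toy SHA-256 keystream standing in for a real cipher such as AES.

```python
import hashlib
import os


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 counter keystream. Toy stand-in for a real cipher; not for production."""
    stream = b"".join(
        hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()
        for i in range((len(data) + 31) // 32)
    )
    # XOR is symmetric, so the same call encrypts and decrypts.
    return bytes(d ^ s for d, s in zip(data, stream))


class AtRestServer:
    """Encrypts what it writes to disk, but must hold the key in memory to answer queries."""

    def __init__(self, key: bytes):
        self.key = key  # lives in the server process
        self.disk = {}

    def insert(self, record_id, plaintext: bytes) -> None:
        nonce = os.urandom(16)
        self.disk[record_id] = nonce + toy_cipher(self.key, nonce, plaintext)

    def query(self, record_id) -> bytes:
        blob = self.disk[record_id]
        return toy_cipher(self.key, blob[:16], blob[16:])


class BlindServer:
    """Stores blobs the client already encrypted; it is never given a key."""

    def __init__(self):
        self.disk = {}

    def put(self, record_id, blob: bytes) -> None:
        self.disk[record_id] = blob

    def get(self, record_id) -> bytes:
        return self.disk[record_id]


def memory_snapshot(server) -> dict:
    """Model of what an attacker on the server could read: the server object's attributes."""
    return dict(vars(server))
```

In the first model the snapshot contains the key, so the attacker can decrypt everything on disk. In the second, the snapshot holds only ciphertext, and decryption happens on the client.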
There's a nice little analogy, which isn't necessarily exactly technically accurate,
but if you compare it to like a physical vault, so you have your valuables inside of the physical
vault, you're going to lock the door. You're not going to hang the key up on a nail next to that
door and go home. But in a way, not exactly, but in a way, that's what you're doing when you're
encrypting data at rest. So you're still having to send the key
over to this database server where your data is,
and that's the window
that we're trying to close with ZeroDB.
I think that's definitely a helpful analogy.
Like you said, probably not exactly
100% accurate, but I think that does help
paint the picture a little bit.
One thing that's
always advised
with security is the defense in depth principle,
which is to say, don't just lock the door,
but also don't just have a key,
also have a padlock and then also do this
and just add layers of depth to that security.
Would you guys consider ZeroDB in that case
a layer of security?
Or would you say, if you're using ZeroDB,
we don't need the full disk encryption?
You obviously can't really even do column encryption,
I would guess, but are these instead-ofs or are these alsos?
Well, I would say with ZeroDB, we call it encryption in use.
And so by default with encryption in use,
you still have it, it's still encrypted at rest.
It's still encrypted in flight, you know,
as the data is moving between client and server.
And of course you could also layer on top TLS or SSL to that if you liked.
But I would certainly not suggest that someone just use ZeroDB
and totally forget about the rest of their security posture.
That's definitely not what we would advise.
You don't want to go on record and say that, is that what you're saying?
Yeah.
What we tried to do with ZeroDB was ask:
if you assume that the database server is compromised,
how can you still mitigate the potential
for your data,
the actual underlying unencrypted data,
to be stolen on the server?
So that's the way that we came at it.
And the answer is, your server should never know anything, basically.
Yeah, I mean, if you never send the encryption keys to the server
and assuming you're using strong encryption like, say, AES-256,
then all an attacker is going to see is just a bunch of encrypted gibberish.
So how does it work?
You've got the server that's dumb
in the sense of the encryption technology.
I'm guessing it's just storing ciphertext.
How does it work?
Tell us how it works, guys.
Yeah, so basically we,
apart from encrypted records,
we have encrypted index,
and that's what makes the thing working.
And the way it works is really easy.
So the encrypted index is a tree-like structure, pretty much a B-tree.
And pieces of this B-tree, the buckets,
are encrypted before they go to the server.
So the server observes only already encrypted buckets.
And the way we do queries, the key piece in it is that the client traverses this B-tree remotely. So pretty much it pulls down the root of the tree, decrypts it, figures out what piece of the B-tree to request next, requests it, and in several steps
it finds what it was looking for. Of course it's not all that easy because
it becomes a little bit more involved when you need to do some operations
like set intersections or things like that.
But the key piece is this traversal of the tree from the client.
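The client-driven traversal described above can be sketched in a few lines of Python. This is an illustrative toy, not ZeroDB's actual code: the `Server` class, the `lookup` function, the node layout, and the SHA-256 keystream "cipher" are all assumptions made for the example (the cipher is a stand-in so the sketch runs with the standard library only, and it is not secure). What it shows is the shape of the protocol: the server hands out opaque buckets by id, and only the client, which holds the key, can decrypt a bucket and decide which one to request next.

```python
import hashlib
import json
import os

def _keystream(key, nonce, n):
    # Toy keystream from SHA-256 in counter mode. NOT secure; a stand-in for real encryption.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(key, obj):
    plain = json.dumps(obj).encode()
    nonce = os.urandom(16)
    return nonce + bytes(a ^ b for a, b in zip(plain, _keystream(key, nonce, len(plain))))

def decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    return json.loads(bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body)))))

class Server:
    """Hypothetical server: stores opaque encrypted buckets by id and never sees the key."""
    def __init__(self):
        self.blobs = {}
        self.requests = 0

    def put(self, node_id, blob):
        self.blobs[node_id] = blob

    def get(self, node_id):
        self.requests += 1  # each call stands for one client-server round trip
        return self.blobs[node_id]

def lookup(server, key, target, root_id="root"):
    # The client pulls the root, decrypts it, and picks the next bucket to request.
    node = decrypt(key, server.get(root_id))
    while not node["leaf"]:
        child = sum(1 for k in node["keys"] if target >= k)
        node = decrypt(key, server.get(node["children"][child]))
    return node["records"].get(str(target))

KEY = os.urandom(32)  # stays on the client
server = Server()
server.put("root", encrypt(KEY, {"leaf": False, "keys": [50], "children": ["n1", "n2"]}))
server.put("n1", encrypt(KEY, {"leaf": True, "records": {"10": "alice", "30": "bob"}}))
server.put("n2", encrypt(KEY, {"leaf": True, "records": {"50": "carol", "70": "dave"}}))

found = lookup(server, KEY, 70)
```

In this two-level toy tree a lookup costs two requests, one per level, which is the round-trip cost the conversation returns to later when performance comes up.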
So let's talk a little bit about the ergonomics of the database
or the kind of database it is,
because I think you mentioned somewhere that, you know,
it's 0DB is an end-to-end encrypted database
and you also have protocol mentioned.
But at the end of the day, it has to be a database, right?
So what kind of database is it?
Is it like an SQL compliant relational database?
Is it a key value store or, you know, a document store?
What kind of database is it?
How do you work with it?
It's more like an indexed document store, somewhat similar to MongoDB. Interestingly
enough, we base it on something called ZODB, which is a part of the Zope framework, if you remember this thing. And ZODB wasn't super popular at the time
when it was a thing. The reason, I think,
is because it's more like a kit for building databases rather than a database
in itself. So we used that to build
ZeroDB. The name is similar, as you can recognize.
And from that, we inherit such features as ACID compliance. We can do replication and things like that. Again, inherited from ZODB. And yeah, that pretty much explains how we have all these features. We just add the
end-to-end encryption piece on top of that.
That makes a lot of sense, because what I was starting
to think when I was reading through what you guys are up to with ZeroDB is, at the end of the day,
like, wow, there's a lot of stuff you have to build if you're building a database, right? Like, your
your key differentiator is the encrypted part of it,
but that you're also offering a database, right?
And I was like,
did they just decide to redo all these things?
And now I'm finding out the Zope Object Database,
it has transactions, history, undo,
transparently pluggable storage,
all these things that you guys could just build on top of
and leave that to somebody else, so to speak, or at least the foundation is there for you, which looks like it's a Python-based thing.
That's the beauty of open source.
Yeah.
There you go.
So good idea.
So is that a fork, though?
Are you using actually ZODB behind the scenes and you're following and tracking that project?
Or is this something that you've forked and you're using your own version of it?
Yeah, basically, the way ZODB is built, it's super modular.
So you can actually use its pieces as a library and you will still be okay.
So that's how we are using it. Of course, we could fork it,
but it seems like it wasn't necessary so far.
I guess maybe another question on that is,
since, Jared, you mentioned it's Python-based,
what made you choose this database?
Why this one and not some of the other options out there
that are available to everyone?
It's actually being built in Python.
It allows us to move pretty quickly,
to develop things quickly,
faster than if we would do that in C.
And if you think about performance,
well, it seems like performance is good enough.
I mean, all the limitations come not from Python by itself.
You have encryption overhead, things like that,
and that pretty much outweighs the fact that the database is written in Python.
And actually, ZODB, if you use that properly, is pretty fast.
Yeah, I think a big piece of it was just, as you said before,
we don't want to have to rebuild a database from scratch. We lean on something that's existing, and ZODB is, as Michael said,
a good kit for doing that.
But, you know, as you pointed out before, we had the word protocol there.
So in principle, you know, you could build a database from scratch that does ZeroDB.
You could potentially build it on top of another database, on a SQL-based database, for example.
That's what I was thinking about when you said that was like, if is this, you know,
obviously you're at a certain point, if someone else preferred a different database, could they
take the same principles or the same things you've done with the idea of it being a protocol? Could
they apply that to, you know, X, Y, Z database or a different preference at least? You certainly
could. And I would say it's not a trivial thing to do. And one thing we've actually thought about is, is there a way for us to potentially sit alongside existing databases,
particularly when we're talking to a lot of the banks,
like they're running Oracle, MySQL, Postgres,
a lot are still running even DB2 and Sybase stuff.
So is there a way potentially for us to sit alongside of that?
Of course, a large bank isn't going to rip out all of their existing database instances
and replace them with 0DB as much as we would like that.
So in that case, I mean, we don't have this ready today,
but our thinking around there is potentially they can keep using Oracle, for example, for the storage,
throw encrypted records in there and have a ZeroDB index that would sit next to that.
And actually when they want to go query those encrypted records, they'll go through our index.
So I would say still forming our thoughts around that piece, but that's a possibility for the future.
Yeah. Well, let's be honest. I mean, this is fairly, you know, fairly new, like in this last year.
So in 2015, this whole entire thing for you guys was born, right? Like,
you guys connected in March, it sounded like, if I'm painting back the history, and then by
December we had an open source project that was spawned off, and now a business. So you're still
in the forming-your-ideas, innovation stage, to say the least, I'm sure. Right?
Yeah, certainly still super early days and working through everything, both from the technical engineering product development stuff to thinking through the business model stuff that we talked about earlier.
Gotcha.
Well, let's take another break.
When we come back from this break, we're going to cover the deeper sides of the client side of this database.
So stay tuned.
We'll cover that when we come back.
Our friends at DigitalOcean,
simple cloud hosting built for developers,
launched a new feature recently we absolutely love.
It's called Droplet Multi-Create for easily launching up to 10 servers at once.
You can deploy multiple droplets
when you create your droplets
with the exact same configuration.
And this is a huge feature.
We absolutely love it.
Head to digitalocean.com, and when you sign up, use our code changelog to get a $10 hosting credit.
All right, we're here with MacLane and Michael talking deeply about ZeroDB, the
protocol, the database, and all the odds and ends of this.
And, you know, the encryption is handled by the client,
and right now there's only a Python client.
So it seems like you're bullish on Python, which is a good thing.
And at one point, Michael, you mentioned you could do it faster in Python
than you could in C.
But I guess the question we have here is roughly around the clients,
will this always be the case?
Are you planning other clients? And how much work does it take to create a new client, let's say for Ruby or
JavaScript, for example?
All right. Yeah, that's a very interesting question. So obviously
now we support a JSON API so that you can interface with it from anywhere. But that means that you have to run this JSON API server on the client machine, which is
not always very convenient.
And the first thing people are requesting is actually JavaScript client.
Right?
So, of course, nothing prevents us from writing that.
The easiest way is to port Python code
using something like PyPy.js,
and apparently it can work,
but probably the more proper way would be to actually write a JavaScript client.
And that's something we are planning.
Another thing is that in this financial world, when banks want to outsource their databases to the cloud and do things like that,
they often use Java.
And they obviously need some way for their Java code to interact with 0DB.
And for that, we would need some JDBC connector.
And that is possible to do.
And I think with Java, we can go with Jython actually.
So that begs the question of why at this point, maybe on the client side, at least,
maybe a C library that's portable and can be interfaced and wrapped from these other
higher level languages might in the long term become faster-moving.
Because it seems like there'd be a lot of smarts in the clients,
and so you may end up with six implementations
of the same encryption stuff.
Just your thoughts on that?
Yeah, we'll see how it goes. I guess we will start from porting from Python code, which is certainly possible to do.
You can do that with JavaScript, you can do it with Java, you can do mobile clients from
like Python ported to there.
But long term, you may well be right.
I guess it kind of depends
too, Jared. I mean, it seems like this is for everyone, but obviously financial tech has the most,
or financial people have the most initial benefit from this. But, you know, obviously other developers
are thinking, how could I use this end-to-end encryption on my own account? It kind of depends
maybe on their motivation, on who they're focusing on. So it might be open source, but
they might be focusing on financial tech, and they're kind of hacked with Python, for
example.
That's right. Yeah, I mean, I'm just thinking about it in terms of other databases. Let's take,
again, MongoDB as an example. As a business, it's in MongoDB's, or in 10gen's, best interest
of having as many
client libraries
as possible
and keeping those
up to date
and awesome
and everything like that.
And as the
technology matures,
they have to maintain
all of those,
even though they're open source
and of course,
the community can do
all this stuff.
It's in their best interest
to do that.
And similarly,
I would think that
it's in ZeroDB's best interest
to have as many clients as possible.
Is that the case?
Yeah, I mean, I think, is that the case, guys?
Yeah, that's definitely the case.
Yeah.
And so then my question was like,
your guys' clients have to be really smart, right?
Lots of surface area of code
because of all that you're doing client-side.
And so I just would see that that might be a bottleneck for you,
like,
you know, software wise potentially.
So is it an issue?
I mean,
yeah,
I,
I obviously,
I think you're probably,
you're,
you're pretty,
you know,
that's a good insight.
I think in general,
like it's always a struggle,
you know,
allocating resources when you're a super early-stage company and
thinking long-term versus what are some quick wins.
So yeah, that's certainly one of the challenges of doing any business,
particularly a software business.
Is that a place where you guys
are looking for help from the open-source community, or is that more like
your bread and
butter?
Actually, the open source community already offers some help in that. So we've seen some
people who want to port ZeroDB to mobile platforms for their own purposes, and we are
happy to accept this help, because this was one of the reasons why we open sourced.
Yeah, I mean, if someone wants to build an alternative client,
we'd be super happy to see that.
And we definitely support them in that effort.
Cool. So speaking just more about the client side of things,
since you're putting so much of the smarts in the client
and you have the decryption
is client side, right? The key handling is client side. And the server is an encryption
store, basically. There's some smarts in there, but like you said, it doesn't have the secrets.
It can't share the secrets. Is it just kind of moving the ball around a little bit? Because
now instead of your server being the
source of truth to a certain degree, and if it gets compromised, you're in big trouble.
If one of your clients gets compromised, is the whole game up?
Or is there something that segregates where if a client gets compromised, it's not a big
deal?
Well, even today, you have this problem with the smart server model. If your client is compromised,
you still get your data exposed. But we just close one window on the server.
But this is actually not the only thing we do. On the server, you can, of course,
throttle the data so that when a client is compromised,
the attacker couldn't steal absolutely everything
in just a little bit.
Another thing is we have this sharing
or proxy re-encryption piece.
The way it works, we can enable somebody else,
be it third party or yourself using a different device.
We can enable the third party to query your data
still without the server knowing anything.
So let's say you can share data with your friend,
and your friend can still do the queries which you allow him to do.
And the server doesn't know anything, but the server can revoke the access so that your friend with his key cannot access this data after a day or two.
How does that work?
Yeah, so it's based on a technology called proxy re-encryption.
It's a pretty young family of encryption algorithms.
The first one appeared in 1998. The way it works is you have, let's say you want to share your data with somebody with a
different key pair. You take their public key, your private key, and calculate a special thing called
transformation key. You give this transformation key to the server. The only thing the server can do with the transformation key,
it can transform data from being encrypted to you
into being encrypted for the third party
you are sharing your data with.
So we combine this encryption primitive
with our query algorithm,
and this allows us to pretty much share data
on a granular level.
So you can say, I want to share everything
matching this query with the guy who has that public key.
And then that guy can pretty much query the data
in that subset until this transformation key expires. When the transformation key expires,
the server removes that and that guy cannot query this data anymore. So for example, if you are
afraid that the third party can get the key compromised, you limit the time span of it to be very short.
And you don't have to re-encrypt all your data again and again
because you are not sharing the key
with which actually the data are encrypted.
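To make the transformation-key idea concrete, here is a toy sketch of the 1998 scheme (Blaze, Bleumer, Strauss) over a tiny prime group. It is an assumption-laden illustration and not ZeroDB's implementation: the group parameters `P`, `Q`, `G` are deliberately small and insecure, all function names are hypothetical, and in this particular scheme the re-encryption key is computed from both secret keys, whereas the scheme described above derives it from the owner's private key and the recipient's public key. The point it demonstrates is the same, though: the proxy can convert a ciphertext for one key into a ciphertext for another without ever seeing the plaintext.

```python
import secrets

# Toy group: P = 2*Q + 1 with Q prime, and G generating the subgroup of order Q.
# These numbers are far too small for real use; they only keep the arithmetic readable.
P, Q, G = 467, 233, 4

def keygen():
    sk = secrets.randbelow(Q - 1) + 1      # secret exponent in 1..Q-1
    return sk, pow(G, sk, P)               # (secret key, public key)

def encrypt(pk, m):
    r = secrets.randbelow(Q - 1) + 1
    return (m * pow(G, r, P)) % P, pow(pk, r, P)   # (m * g^r, g^(sk*r))

def rekey(sk_from, sk_to):
    # Transformation key sk_to / sk_from mod Q (this toy scheme needs both secrets).
    return (sk_to * pow(sk_from, -1, Q)) % Q

def reencrypt(rk, ct):
    # Run by the server: turns g^(a*r) into g^(b*r) without learning m.
    c1, c2 = ct
    return c1, pow(c2, rk, P)

def decrypt(sk, ct):
    c1, c2 = ct
    g_r = pow(c2, pow(sk, -1, Q), P)       # recover g^r
    return (c1 * pow(g_r, -1, P)) % P

alice_sk, alice_pk = keygen()
bob_sk, bob_pk = keygen()

message = 42
ct_alice = encrypt(alice_pk, message)      # stored on the server, encrypted to Alice
rk = rekey(alice_sk, bob_sk)               # handed to the server
ct_bob = reencrypt(rk, ct_alice)           # server transforms it for Bob
recovered = decrypt(bob_sk, ct_bob)
```

Revocation in this picture is a server-side policy: once the server deletes `rk`, it can no longer produce ciphertexts that Bob's key opens, and the stored data never has to be re-encrypted.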
I mean, that sounds like a real advancement.
Proxy re-encryption, that just sounds cool, right?
It does. Very cool.
It's a very impressive term.
We support proxy re-encryption.
Of course.
Yeah, it's been around for not a super long time.
It's been around for a while,
and there's actually been a couple of companies
that have tried to commercialize it
in sort of a file-sharing context
where you have files that are sitting up in the cloud
that are encrypted under your keys
and you want to share them with someone else.
And I wouldn't say that's been super successful
or hasn't really taken off,
but I guess what's potentially interesting
about proxy re-encryption in the context of ZeroDB
is that we can pair it with a lot of our other query protocols
so you can share stuff on a much more granular level.
Yeah, I was going to ask about use cases because it was, like, incredibly impressive in my mind,
but I kept thinking, how exactly would somebody use this to, like, an application advantage?
I think file sharing is, you know, the one that comes to mind. Are there other ways that you guys see that being used that maybe even isn't currently being used or that can go to market?
Right. So yeah, I mean, one kind of cool example would be, let's say,
a healthcare app. So let's say you have a mobile application for storing people's personal medical
records. And obviously those are quite sensitive. If you're a user of this app,
you don't necessarily want that app provider, that service provider to have access to your
medical information, your medical data. So if someone were to build something like that on top
of ZeroDB, they could guarantee to their users that their PHI was totally secure and totally
owned by them. They control the keys.
And of course, if they need to share it with, let's say, an insurance provider or their
hospital or their doctor, they can use the proxy re-encryption piece for that.
So I always think that's a pretty cool example for how something like that could be used.
Yeah, that is very cool.
And that just brings up my thoughts of HIPAA and other such compliance
things. Do you guys run into any of those? I mean, like you said earlier, your main customer
currently is banks. And so HIPAA is not a thing you guys got to worry about. Perhaps
PCI. I don't know what other regulations the banking industry is under. You guys probably
do. But have you come up against regulations so far when it comes to ZeroDB?
So in a way, obviously, a lot of our customers have to deal with these things.
Us ourselves as a company or how we think about our products, we want to be sort of an enabler. And we're obviously providing infrastructure, a developer tool for people to build applications
that potentially could be compliant
with these types of regulations like HIPAA
or Dodd-Frank type stuff in financial services.
And you have a whole other set of regulations
over in the EU.
So we don't, you know,
we're obviously not building applications ourself
at the moment.
So that's not something that we're having to directly deal with.
But obviously, to the extent that our customers could potentially be using ZeroDB to, you know, be in compliance with these regulations,
That's something that we at least need to be aware of.
I can give you one interesting use case from the EU has to do with data sovereignty laws.
So let's say you're a bank in Europe
and you operate in a bunch of different countries
and then you generate some customer data in Germany.
It's very difficult for you to move that data,
that customer data in the clear across country borders,
even inside of Europe.
If you encrypt it, and it varies country by country,
for example, Switzerland is quite strict, so it doesn't work there.
But in general, if you encrypt that data, you have more flexibility in terms of what you can do with it, where you can move it.
But of course, historically, the problem has been that once you encrypt it, you can't use it anymore.
You can't query it.
So a lot of banks actually will just have a data center in every single country that they operate because of this data sovereignty issue.
Whereas if they were to use something like 0DB, they could have the keys in-country, encrypt the data in-country,
ship it off to a data center in Luxembourg, for example, still have it be queryable,
and only ever decrypt it in-country as well.
So in that scenario, you could consolidate 25 data centers in 25
countries down to potentially a much more manageable handful.
And that is actually one of the reasons why we are in London at the moment. We see a bunch of
regulations which don't stop us from operating, but actually help us. And we are actually
exploring that.
So you said you're at the FinTech Innovation
Lab in London. That's what you're doing there now. That's your incubator that you went through,
right?
Yep, that just started, actually, this Monday. So we're sort of diving in.
This is new for you, though.
Yeah.
Yeah, we have a few more questions on the tech side of things, but since
this is the topic now, I was just talking to Jared behind the scenes about whether we should bring this back up again.
So what are your goals, I guess, for this?
How long is this process for you to be part of this incubator?
Yes, so the FinTech lab lasts until mid-April.
So we'll be primarily here in London, obviously.
We actually sit in Canary Wharf, which is sort of one of the banking
districts in London where a lot of the banks are located. We'll be traveling a little bit
to the EU to visit some of the banks in Italy and Germany, for example, and then occasionally
back to the US, obviously, because we have things going on there as well. But it's a standard sort of three-month program that a lot of these accelerators last.
It's actually run by Accenture.
So they're obviously quite helpful from having experience working with a lot of these banks and selling into banks, which, you know, you don't have to get into it.
But that whole enterprise sales process is a whole other bugaboo there is trying to sell into banks,
which in general is quite difficult.
I mean, is your goal to move the company forward
in this incubator
or is your goal to move the technology forward
in this incubator?
I mean, is it one and the same?
Yeah, they pretty much go hand in hand, I would say.
You know, obviously it's important for us
to make the product better
and not only more secure, but a big focus obviously for us is performance
and limiting the trade-off that people are making.
So obviously you have encryption and decryption overhead and things like that,
but you want to make it as drop-in as possible
so that people aren't really getting hit on the performance side
when they're using something like ZeroDB
as opposed to just having data in the clear or just encrypted at rest.
And obviously when you're dealing with banks that have just massive amounts of
data, performance is a, is a pretty big concern for them. So yeah,
I mean, I think as you improve the product,
you're moving the business forward at the same time.
We have a few more questions on the performance stuff,
but prior to that, I'm just thinking how,
and maybe the question is really easy to answer,
but how do you or how have you been able to financially set yourselves up
to be able to take this time off and still live your lives
and still do whatever you do?
I have no idea if you guys are families, dads or whatnot,
so I'm just making assumptions.
Human beings have lives, so how do – obviously we have lives.
But how are you guys able to take this time to spend to develop the company
away from Silicon Valley where you're normally at?
What have you done?
Do you have backing?
Do you have funding?
What is that scenario like for your company?
Yeah, so we've actually just been bootstrapped so far and been using our own
money. I'd say it's just the two of us, co-founders and developers, at the moment, so we don't have a
ton of expenses, you know, outside of, of course, you have to support yourself and you have
to eat and you have to have somewhere to live, right?
So there's some baseline sacrifice, too. So it sounds like
you're okay with that.
Yeah, absolutely. I mean, I think the reality of doing a
startup a lot of times is you're going to go through that ramen stage where you're
of course not able to spend a lot of money because there's just not a lot of money
there. We are hopefully, you know, fingers crossed, in the next couple weeks going to have
a small angel round. That'll obviously help quite a bit and help us, you know, potentially hire a third engineer
to move even faster.
So potentially April, May time frame, that'll happen whenever the incubation stage
is over? Or is that...
Yeah, so we would raise a round sometime early summer, late spring,
towards the end of this.
And so, Jared, chime in here wherever you want,
but I mean, as we've mentioned before this,
we've got this growing trend of, you know, RethinkDB,
and, you know, we can name a number of others
that have come on the show and talk about, you know,
building an open-source product,
but also building a business alongside of that.
Metabase rings a bell as well.
That discussion we had there,
it seems like databases
are, well, not so much easy, but it seems like they're prime for this kind of...
Infrastructure.
Yeah, infrastructure, these tools.
So I guess that's just kind of an interesting
take there. We're seeing, you know, this trend of companies building open source products. How does
that pan out for you, I guess, moving forward? Like,
what does that look like for, you know, what is, I guess, your offering to anyone who would want
to fund you? What do they get? You know, I don't know if you watch Shark Tank
or not, but no one gives somebody money thinking they'll never get it back.
Yeah. And obviously for us as entrepreneurs and business people from that side, we're,
you know, we're obviously doing this.
Obviously, we want to build something cool and awesome that people can use.
But also, of course, you want to build a sustainable business as well.
So I think in general, I think if you think about going back to some of the things we talked about before with banks,
moving a lot of their infrastructure from on-premise to the cloud, you know, starting to hit that way, that's starting to happen over the next, well, now a little bit,
and particularly over the next three, five years. That's something that we can really help them with
and accelerate that process. But even beyond that, you know, I think just in general with
all the data breaches that have been happening and, you know, security becoming increasingly
sort of top of mind, not only for business people, but for developers and
technical people as well.
I mean, you need to think about, you know, how you can build applications that aren't
going to lose your customers' data and destroy your business.
That was exactly what Jared and I were talking about.
As I mentioned before every call, we have some sort of sync up and Jared asked me, he's
like, how did this hit your radar?
And he wasn't asking in an argumentative way. He
was, you know, genuinely asking, how did this hit your radar? What was interesting to you, Adam?
And I was like, well, I mean, it hit Nightly, so that was one thing, and that was our own radar.
But then also I was thinking, over the last four or five years, and I'm sure it's happened
lots more before those times, but it's become more and more clear to me that we've had more and more cybercrime, so to speak.
Sony was down for, I mean, it's a shame I couldn't play my PlayStation for a month, but Sony was down, and that was a big one there.
And then you've had lots of different, you know, compromises, so to speak.
Yeah, Target's compromise.
And next thing you know, credit card data, all this data.
And then not only that, you know, they've got my password if that stuff's not encrypted right or whatever.
I don't know.
But then, you know, everyone's not secure at that point.
So it made sense to have this conversation with you guys.
I think over the last year, there happened maybe 13 or 14 big security compromises.
And yeah, that is something,
it was something which just started popping up
again and again.
And that's increasingly a problem.
Yeah.
And I guess I think also like part of the bet
of our company is if you look at sort
of the last generation of big tech companies like Facebook, Twitter, LinkedIn, a lot of the value
that they're creating is based off of their users' data. So they're monetizing that. But if you think
about potentially what the next wave could look like, and actually Lawrence Lessig had an interesting
post, I think a week or so ago, about this topic, it's how companies are now starting
to think of customer data actually as a liability and not necessarily the source of value. So if
you're, let's say a financial services company or a new insurance startup, insurance tech startup,
and you have data on your customers, that's a potential risk for your company. If you get
hacked and you lose that, particularly when you're just starting
to build trust with people,
your business is basically dead.
So if people start to build on top of things
like ZeroDB and other options
where they're not necessarily
having to access that data directly,
they remove that liability
and that threat to their business.
Well, let's talk about that.
Let's talk about building on top of ZeroDB.
After the break, our final break, by the way,
we will talk to you guys about,
let's just imagine we're going to build ZeroBook.
You know, we want to build a huge worldwide social network
on top of ZeroDB technology: scale implications,
performance implications, day-to-day
management, those types of things, and then, of course, getting started is a bit of that as well.
So I think it'd be a good place to take this conversation after we hear from this last
sponsor, and then we'll be right back.
Every Saturday morning we ship out an email called
Changelog Weekly that covers everything that hits our open source radar. It's our curated,
editorialized take on what you need to pay attention to from this week in open source and software development.
There's no algorithms.
There's no machines.
It's just us paying attention to what we pay attention to so you don't have to.
Head to changelog.com slash weekly to subscribe.
And now back to the show.
All right, we're back, and we're ready to talk about using
ZeroDB: practical applications, performance, scale, what have you, getting started. But let's
start with performance because you guys do have some posts on your blog about performance. And
anytime you add a layer to the conversation,
which you're sending encrypted bits down the wire
and decrypting them in the client,
there are performance implications.
So we'll just kind of lay that on the table
and say if I'm using ZeroDB versus some other DB,
which is exactly like ZeroDB minus the encryption stuff,
what penalty am I paying?
Right. So basically, the biggest
implication for us is that the way we work is the client talks to the server more than once while doing the query. So a typical database is sending a query to the server,
the server is doing something, and returning a result.
In our case, the client talks to the server a little bit.
So it sends a little bit, gets some information, decrypts it,
sends something again, and that repeats several times.
So what that comes down to is pretty much latency between client and server.
For each query, you have a performance penalty of multiple times this latency.
What helps with that a little bit is client-side caching of the top of the index, and that makes queries faster by about a factor of two.
But I would say there is a positive thing as well, since a lot of the logic and encryption/decryption is happening on the client. If you have some database server and multiple clients,
each client decrypts its own data on its side.
So you have this decryption load
parallelized between multiple clients,
and this kind of takes this load off the server.
So in certain cases, you can actually have higher performance than in a typical situation.
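To make the round-trip cost described above concrete, here is a minimal Python sketch. It is not ZeroDB's actual code: it models a toy encrypted tree index where the client fetches one node per round trip, decrypts it locally, and decides which node to ask for next. The names `Server`, `Client`, `build_tree`, `toy_encrypt`, `toy_decrypt`, and `estimated_latency_ms`, the fanout of 4, and the XOR "cipher" are all assumptions made for illustration (the cipher is a stand-in and is not secure). With `cache_levels` set, the top of the tree stays on the client, so repeat lookups need fewer trips.

```python
import hashlib
import json


def _keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from the key (toy construction, NOT secure)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key: bytes, obj) -> bytes:
    """Serialize obj to JSON and XOR it with the keystream."""
    data = json.dumps(obj).encode()
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def toy_decrypt(key: bytes, blob: bytes):
    """Inverse of toy_encrypt."""
    data = bytes(a ^ b for a, b in zip(blob, _keystream(key, len(blob))))
    return json.loads(data.decode())


class Server:
    """Stores opaque blobs by node id and counts fetches (one fetch = one round trip)."""

    def __init__(self):
        self.blobs = {}
        self.requests = 0

    def fetch(self, node_id):
        self.requests += 1
        return self.blobs[node_id]


def build_tree(server, key, items, fanout=4):
    """Store sorted [key, value] pairs as an encrypted tree; return the root node id."""
    def store(node):
        node_id = len(server.blobs)
        server.blobs[node_id] = toy_encrypt(key, node)
        return node_id

    # Leaves hold the rows; each level entry is (smallest key below, node id).
    level = []
    for i in range(0, len(items), fanout):
        chunk = items[i:i + fanout]
        level.append((chunk[0][0], store({"leaf": True, "items": chunk})))
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            chunk = level[i:i + fanout]
            parents.append((chunk[0][0], store({"leaf": False, "children": chunk})))
        level = parents
    return level[0][1]


class Client:
    """Walks the tree, decrypting each node locally; optionally caches the top levels."""

    def __init__(self, server, key, root_id, cache_levels=0):
        self.server, self.key, self.root_id = server, key, root_id
        self.cache_levels = cache_levels
        self.cache = {}

    def _load(self, node_id, depth):
        if node_id in self.cache:
            return self.cache[node_id]          # no round trip needed
        node = toy_decrypt(self.key, self.server.fetch(node_id))
        if depth < self.cache_levels:
            self.cache[node_id] = node
        return node

    def get(self, k):
        node_id, depth = self.root_id, 0
        while True:
            node = self._load(node_id, depth)
            if node["leaf"]:
                return dict(node["items"]).get(k)
            # Descend into the last child whose smallest key is <= k.
            children = node["children"]
            node_id = children[0][1]
            for min_key, child_id in children:
                if min_key <= k:
                    node_id = child_id
            depth += 1


def estimated_latency_ms(round_trips, rtt_ms):
    """Network cost of one query if each fetch waits a full round trip."""
    return round_trips * rtt_ms


if __name__ == "__main__":
    server = Server()
    secret = b"client-only-secret"
    root = build_tree(server, secret, [[i, f"row-{i}"] for i in range(256)])
    client = Client(server, secret, root, cache_levels=2)
    client.get(100)                       # cold lookup fills the cache
    before = server.requests
    client.get(100)                       # warm lookup
    trips = server.requests - before
    print(trips, estimated_latency_ms(trips, rtt_ms=50))
```

In this toy setup a lookup walks four levels, so it costs four round trips cold and two once the top two levels are cached. That the saving here happens to be a factor of two is a consequence of the chosen depth and cache size, not a measurement of ZeroDB.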
It seems like, and I guess this is the same for traditional databases,
but a write-heavy application where a lot of clients are writing
is going to be busting that client-side cache more often and require
more round trips.
That's true.
Okay.
So there is a penalty.
Sometimes it's mitigated down to zero or negative, but something to definitely think about as
a concern when going.
And as it is with anything that you're going to adopt, there are trade-offs.
What about scaling?
So, you know, like I mentioned before the break, I'm building this thing.
I don't know if you've ever heard of it. It's called ZeroBook.
And it's going to be huge.
I've just got my first 5 million users. I don't know who they are because all their data is encrypted, but I'm sure that they're awesome.
And I'm trying to now expand my server-side infrastructure, my ZeroDB-based database infrastructure, past one machine.
I'm thinking of sharding or replicating or whatever it is these people do at Facebook.
What do I got to do?
Yeah, so basically replication and sharding we inherit from ZODB, or from things built for it.
Pretty much one is called ZRS, which is just replication.
And there is a thing called NEO, which is actually already sharding.
And actually using that, we can scale it up.
So I guess for actually scaling the thing, you need all your data to reside on multiple servers, when you get to the situation where all of your customers' data doesn't fit onto one server.
So for this we can use these things I mentioned, which are already built.
And they integrate pretty well with ZeroDB, just because we use this existing technology stack.
But of course there are some...
There are some scalability thoughts about doing certain queries in an encrypted manner.
So that is something we are constantly improving.
Are there best practices that are coming about, or things that you or your users are learning over time, and kind of gathering together a corpus of knowledge about how to go about it?
You mean about the performance and scalability?
Yeah, just the scalability. But you said that there are things that you're learning about how to do the queries. Or is that all internal to the protocol and not to the person who's actually...
Yeah, it is actually internal to the protocol. So I don't think at the moment developers should worry about that. If there is something they should worry about, we will definitely publish that. And if there are some recommendations.
stuff. So it seems like it's an easier onboard if I'm building a Python client, something that can rely upon Python. So let's just start with that one. Let's say you sold me on the idea of ZeroDB. I'm ready to start coding up ZeroBook. How do I get started?
Yeah, so we've got a little quick start tutorial on docs.zerodb.io. And then you just grab zerodb-server from a GitHub repo.
And it's pretty easy just to sort of get around,
get started and start playing with it.
You can populate it with some dummy data
and we give you an example of how to write your data models
and how to query the database.
So pretty easy to at least get a sample of it.
Seeing this as kind of a brave new world of databases, I'm assuming that there's somewhat
of a lack in terms of tooling around ZeroDB and, you know, query generators or any sort of
thing like that. Is that an area where you guys are looking to innovate as a company perhaps,
or is that something that you just completely open up to the open source
community and hope that they would build these kinds of infrastructure around
your infrastructure?
Yeah. I mean,
there's obviously things that we have in our roadmap for the company itself
that we'll, we'll plan on doing.
And then to the extent that other people do them first, that's, that's great.
We're happy to support people who are,
who are building stuff around ZeroDB and on top of ZeroDB.
And even just in the couple of weeks that it's been open source,
we've seen some of that already. I think in the first week,
maybe, someone started playing around with Docker and put it into a
container.
I think a couple people have written plugins for various web frameworks.
So we're seeing some of that already.
And yeah, obviously to the extent that that continues
or increases, we're very happy to see any activity around ZeroDB.
What do you say to the person who's not interested in Python at this point, when you guys still, you know, the JavaScript client's in the works, the Python client is all that we have, but they're willing to get involved and actually, you know, do an implementation?
Getting started for that person, is the protocol documented, or is it open up the Python client and start to, you know, emulate what you see there?
What's the kind of process that you would imagine somebody would go through to implement their own?
You mean to implement their own client
in a different language?
Yes, and then also just like that,
you say it's not just a database, it's also a protocol.
Is the protocol documented or is it out there?
Do you just read the code?
Yeah, so we're actually publishing a paper,
hopefully in the next week or so,
maybe a little bit longer.
Don't hold me to that.
It'll have some more specifics
and we'll also have sort of more on our threat model
and security assumptions as well.
Some of the crypto primitives that we're using
and potential sort of future optimization.
So from understanding the protocol in more depth,
that'll help quite a bit there.
Actually, the protocol itself is derived
from what was internal for ZODB.
And we've talked to the founder of Zope and ZODB, Jim Fulton,
who pretty much sees the future of ZODB as allowing alternative clients,
which are non-Python, to work with that.
And since we are based on that, we are pretty much in line with his vision.
So I guess we can pretty much work together
with people who work on ZODB
to enable these alternative clients.
I guess it's kind of been somewhat the ask here,
but not directly, you know,
talked about the protocol, the client, you know, if somebody wanted to create their own client,
you guys are in an incubator right now.
You've obviously got the next, you know, three or four months kind of,
for the most part mapped out in terms of what some of your goals are.
But if you had the ear of the open source community and they were thinking
similar to what Jared said earlier, Hey, you know,
I'm not digging Python, for example.
I'm digging something else. Not so much just that, but how can the community step in and help you to move the ball along?
If it's not, you know, your financial or not your financial goals, but your goals around your business.
If it's not that, if it's around this technology and moving it forward, what are the best areas that the open source community can step in and help out?
It's very much around building alternative clients, writing plugins.
Another thing actually that helped a lot in the first couple of weeks after we open sourced
it was just having people look at our implementation.
And we had actually a lot of good feedback around things we could potentially do in the
future to make it even more secure.
One of the things that we do leak is access patterns.
And what I mean by that is the server can observe, obviously, which encrypted buckets are being queried.
And in theory, over time, they could deduce things like the order of the database from those access patterns.
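As a toy illustration of that kind of leak (a sketch under made-up assumptions, not an attack on ZeroDB itself), suppose the client answers range queries by fetching neighbouring encrypted buckets one after another. The server never decrypts anything, but by logging which opaque bucket ids are requested back to back it can chain them into the underlying sort order. The function names `opaque_id`, `simulate_range_queries`, and `infer_order`, and the idea that buckets are fetched in scan order, are hypothetical.

```python
import hashlib
import random


def opaque_id(i):
    """An id that reveals nothing about position i by itself (illustrative only)."""
    return hashlib.sha256(f"bucket-{i}".encode()).hexdigest()[:8]


def simulate_range_queries(order, num_queries, max_span, rng):
    """Return what the server logs: for each query, the bucket ids fetched, in order."""
    log = []
    for _ in range(num_queries):
        start = rng.randrange(len(order) - 1)
        span = rng.randint(2, max_span)
        log.append(order[start:start + span])
    return log


def infer_order(log):
    """Server side: rebuild bucket order purely from 'fetched right after' pairs."""
    successor = {}
    for accessed in log:
        for a, b in zip(accessed, accessed[1:]):
            successor[a] = b
    has_predecessor = set(successor.values())
    chains = []
    for start in successor:
        if start in has_predecessor:
            continue
        chain = [start]
        while chain[-1] in successor:
            chain.append(successor[chain[-1]])
        chains.append(chain)
    # The longest chain is the server's best guess at the hidden ordering.
    return max(chains, key=len) if chains else []


if __name__ == "__main__":
    true_order = [opaque_id(i) for i in range(8)]   # known only to the client
    rng = random.Random(0)
    observed = simulate_range_queries(true_order, num_queries=200, max_span=4, rng=rng)
    guess = infer_order(observed)
    print(len(guess), "of", len(true_order), "buckets placed in order")
```

Hiding this is what the Oblivious RAM work mentioned next is aimed at: making the sequence of fetched buckets look unrelated to the query. The sketch does not implement ORAM.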
So one of the sort of active research areas right now is something called Oblivious RAM or ORAM,
and that's designed to obfuscate these access patterns. And so that's something that we could
potentially add in the future. So things like that, obviously, that help us sort of map out the roadmap for what ZeroDB will look like in the future, that's helpful as well.
So if someone's out there and they're listening, okay, well, I'm going to go to the org on GitHub, which is github.com slash zero-db. Would it be okay for them to hop into the ZeroDB repo and just drop an issue there and share some ideas, or are there different ways to communicate with you guys?
Yeah, that's certainly an option. You know, we're actively paying attention to GitHub and you can open an issue there.
We have a Slack channel too that we're on most of the time, and we can be very responsive there.
I mean, we're pretty much online 24-7,
so we're on Twitter too,
so it's pretty easy to get a hold of us.
Gotcha.
Is that Slack channel mentioned anywhere on zerodb.io?
Oh, I guess it is right there, right there at the top.
I missed it.
It's too obvious.
It's so obvious that I missed it.
So you got a mailing list too,
so you're sending out emails.
How often are you emailing people about what's going on?
We don't email very often. I think we sent out two emails in the entire life of our mailing list.
So one was, you know, back in March, a follow-up blog post where we talked in a little bit more detail about how the protocol worked, and then the second was when we actually open sourced it in December.
So we're not going to spam you too much.
Yeah.
Well, if you're heading to zerodb.io, they've got a link to GitHub there, obviously, which is the org I just mentioned, the Slack channel to sign up to if you want to talk real time, or the mailing list, and never get an email, or at least infrequently.
That would be good.
Just the important stuff.
Yeah, well, that's good, right? I mean, you didn't promise a weekly email. But email is really interesting.
So, all right, fellas, we don't have many more questions for you. I know we wanted to talk quite a bit about that. We're running out of time for our show anyways, so we'll skip the closing questions here.
This is twice in a row. We did it with Yehuda, and now we're doing it here in this show to skip the closing questions.
So maybe it's becoming a trend.
Crazy.
We're crazy.
We'll see.
I don't know.
But, fellas, I want to thank you guys for taking the time to come on the show.
And obviously to think proactively about the future of technology and being able to, as you said earlier, MacLane, to bet on this technology.
One, to build a company around it, and two, to open source it.
Obviously, you guys will have some financial gains in the future with what you do with
it, but it's generous of you to give back so much to open source and trust the process
of open source and all that good stuff.
So that's really awesome.
I want to thank our listeners for tuning in and also our members for tuning into the show this week.
Our sponsors for the episode this week are Codeship, Toptal, and DigitalOcean.
And if you're not a member yet, you can join the community for just 20 bucks a year and we'll give you an all-access pass to everything we do.
Head to changelog.com slash membership to hear more about that.
Our next show is actually already recorded. It's already in the can,
but it's episode 191 with Richard Feldman.
He's from NoRedInk discussing Elm,
which is described as the best of functional programming in your browser.
Jared, that was an awesome show, right?
Yeah, it was.
And some awesome news came out of NoRedInk just last week.
Or was it this week?
We had a post on it on the website.
You had a conversation with him, yeah, about what's going on there. You want to talk about that real quick?
Well, they have announced it. We shared it on our blog. It's on SoundCloud as well.
But we talked to Richard briefly about Evan Czaplicki, which I'm not sure that's exactly how you say his last name, so forgive me if that's wrong.
But he's joining NoRedInk, which is a really interesting thing.
They are stepping up to support him, and he's coming out of the work,
and he's creating the Elm Foundation, the Elm Software Foundation.
Really interesting to see that happen in there.
So check out the blog, changelog.com, to find that.
There's an audio piece there as well.
It's not in the main podcast feed, so if you're looking there,
you're not going to find it there.
But we mentioned both of our emails
also in the show,
so I'm going to mention that one more time.
Changelog Weekly, we ship that every Saturday.
It's our hand-curated editorial take on the week.
And we also ship Changelog Nightly,
which is our radar as well.
So as you can see, we create new shows from this.
We're constantly watching this.
So we ship that every single night
at 10 p.m. Central Standard Time, covering the daily trends in open source on GitHub.
You can subscribe to both of those at changelog.com slash weekly and changelog.com slash nightly, respectively.
But, fellas and everyone on the call, that's it.
So let's say goodbye.
Goodbye.
Thanks for coming on, guys.
Thanks, guys.
This was a lot of fun.
Bye. Bye.
We'll see you next time.