The Changelog: Software Development, Open Source - Fauna is rethinking the database (Interview)
Episode Date: September 24, 2021

This week we’re talking with Evan Weaver about Fauna — the database for a new generation of applications. Fauna is a transactional database delivered as a secure and scalable cloud API with native GraphQL. It's the first implementation of its kind based on the Calvin paper as opposed to Spanner. We cover Evan's history leading up to Fauna, deep details on the Calvin algorithm, the CAP theorem for databases, what it means for Fauna to be temporal native, applications well suited for Fauna, and what's to come in the near future.
Transcript
What's up, welcome back, or welcome for the first time if you're new to the show.
I'm Adam Stachowiak, and you are listening to The Changelog.
On this show, we talk with the hackers, the leaders, and the innovators of the software world.
We face our imposter syndrome, so you don't have to.
Today, we're talking with Evan Weaver about Fauna, the database for a new generation of applications.
Fauna is a transactional database delivered as a secure and scalable cloud API with native GraphQL.
It's the first implementation of its kind based on the Calvin paper as opposed to Spanner.
We cover Evan's history leading up to Fauna, deep details on the Calvin algorithm, the CAP theorem for databases,
what it means for Fauna to be temporal native, applications that make sense for Fauna, and what's to come in the near future.
Of course, big thanks to our partners Linode, Fastly, and LaunchDarkly.
We love Linode. They keep it fast and simple.
Get $100 in credit at linode.com/changelog.
Our bandwidth is provided by Fastly.
Learn more at fastly.com.
And get your feature flags powered by LaunchDarkly.
Get a demo at launchdarkly.com.
This episode is brought to you by our friends at Influx Data and their upcoming Influx Days North America virtual event happening October 26th and 27th.
Influx Days is an event focused on the impact of time series data.
Find out why time series databases are the fastest growing database segment, providing real-time observability of your solutions.
Get practical advice and insight from the engineers and developers behind InfluxDB,
the leading time series database.
Learn from real-world use cases and deep technology presentations from leading companies worldwide.
Also, for those who'd like to get hands-on with Flux,
our listeners get $50 off the hands-on Flux training when you use the code CHANGELOG21.
Again, this event is happening October 26th and 27th.
Learn more and register for free at influxdays.com.
Again, influxdays.com.
So we've had a lot of conversations recently around computing on the edge. I'm not talking
about IoT or smartphones necessarily, but smart CDNs.
I don't know, you have your Lambda,
you have your Netlify and Fastly functions,
you have your Cloudflare Workers.
We have a lot of people trying to push our logic,
our server-side logic as close to our users as possible
and reaping the benefits of that.
Every time I have that conversation,
I tend to interject with,
yeah, but what about my database?
Because caching is great,
and running logic at the edge of my CDN is great,
but if all of my data is centralized,
then aren't I going to be round-tripping
really far away eventually anyways?
And I've had at least three people say to me when I present them with that,
they say, Fauna's working on it. And I say, Fauna's working on it? And they say, yep.
So there are people working on this. Hopefully we'll get there. And they keep bringing up Fauna.
So Evan, welcome to the show. And tell me, is this the problem that Fauna's working on?
Are people misspeaking?
They are not misspeaking.
That's one of several problems we're working on.
Fauna, historically, we've had a boil the ocean kind of attitude.
The problem we're working on is the problem of the imperfection of the operational database.
Edge latency is definitely a big part of that.
Okay.
The imperfection of the operational database.
I love that phrase.
Can you unpack it and tell us what it means?
Yeah, so the genesis of Fauna really came from me and my co-founders' experience at Twitter.
I was employee 15.
I ran what we called the software infrastructure team there from 2008 through the end of 2011.
That was the early days of NoSQL.
MySQL and Postgres were big.
We started with a MySQL cluster.
Cluster is a strong word.
We started with a server.
It became a cluster over time.
Then we had to carry on from there as Twitter went through hyper growth.
We didn't go to Twitter as distributed systems experts.
We didn't go as DBAs.
We went there as essentially Rails developers.
And we were frustrated that we couldn't find any off the shelf data system that could meet
Twitter's needs, you know, delivering a soft real time product at global scale to a consumer
audience.
And we looked at Mongo and we looked at Cassandra and invested quite a bit in Cassandra
open source and quite a few other solutions and ended up building a whole bunch of custom stuff
in-house. But we never quite got to the general purpose data platform that we wanted to have.
And that dream never died for us. People who work on databases typically work on databases out of
frustration and rage, not out of love for the existing tools.
And we felt eventually, if we didn't try to attack this problem from the ground up,
it wasn't going to get solved.
And that led to the early versions of Fauna, where Fauna is essentially a database as an
API, trying to get rid of everything about the operational experience, everything about
the metaphor, the physical computer that
interferes with your ability to access your data. One of those things is latency variance based on
the physical location of the client. So my dream as a developer is I have objects in my code,
which represent, if I'm coding object-oriented, data and logic, right? And those objects
are always there, and they're available, and I can use them, and I can set them aside and pick them
back up again, and that's it. I don't have to write to the database, read from the database.
I can grab some objects, I can throw some objects away. Is that kind of what you're aiming at when you say
removing the concept of a database?
Are you talking about only thinking about code that's just ever
persistent and available to me, or are you saying
use an API client instead of a database client in my code?
A little bit of both. The concept you're describing
is sort of the tuple space concept
Linda being one of the very early examples in the
80s where you would have a giant globally available
heap and you could change those records. So it turns out that's not
enough in the real world for data access. People want structure.
They want constraints enforced. They want transparency as to
what that data is.
They want the ability to index and query it beyond the single object level.
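As an aside for readers: here is a minimal toy sketch of that Linda-style tuple space idea in JavaScript. All names are invented for illustration, and the point is exactly the limitation Evan describes: a global bag of records with no schema, constraints, or indexes.

    // Toy Linda-style tuple space (illustration only, not Fauna's API):
    // a global, unstructured heap of records you can publish, read, and take.
    class TupleSpace {
      constructor() { this.tuples = []; }
      out(tuple) { this.tuples.push(tuple); }  // publish a record
      read(pattern) {                          // peek at the first match
        return this.tuples.find(t => this.matches(t, pattern));
      }
      take(pattern) {                          // remove the first match
        const i = this.tuples.findIndex(t => this.matches(t, pattern));
        return i === -1 ? undefined : this.tuples.splice(i, 1)[0];
      }
      matches(tuple, pattern) {                // every named field must be equal
        return Object.entries(pattern).every(([k, v]) => tuple[k] === v);
      }
    }

    const space = new TupleSpace();
    space.out({ type: 'user', name: 'evan' });
    console.log(space.read({ type: 'user' })); // { type: 'user', name: 'evan' }
    // Note what's missing: no schema, no uniqueness constraints,
    // no secondary indexes, no transactions -- the gaps Evan lists above.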
Fauna has inherited some of those concepts, in particular in the way we offer serverless functions in the database itself.
We would have called them stored procedures in the past, but stored procedures have a host of operational and developer productivity issues.
You know, they only run on the primary node, that kind of thing. So effectively, you know,
you can write business logic, which runs co-located in the database next to the data,
has that transparent access like you're talking about. And so sort of our goal now, and you know,
this works in Fauna today. People do it all the time. They really like it. Our goal now is to make that experience more seamless, closer to, in particular,
developing with JavaScript and GraphQL so that you can get that world we're talking about.
You can have business logic on the client that interacts with familiar interfaces to the data.
You can also write business logic in the database that uses the same interfaces that has better consistency and availability and scalability properties.
And you can stop thinking about, like, I provisioned a server and I have to think about the geographic location of said server, or any of those problems.
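To make that concrete, here is a rough sketch of what such a database-side function looks like with Fauna's JavaScript driver. The function name and document fields are invented for illustration; treat the details as assumptions rather than a canonical recipe.

    const faunadb = require('faunadb');
    const q = faunadb.query;
    const client = new faunadb.Client({ secret: process.env.FAUNA_SECRET });

    async function main() {
      // Define business logic that runs inside the database, next to the data --
      // roughly the role a stored procedure used to play, minus the ops baggage.
      await client.query(
        q.CreateFunction({
          name: 'add_credit', // hypothetical function name
          body: q.Query(
            q.Lambda(
              ['accountRef', 'amount'],
              q.Update(q.Var('accountRef'), {
                data: {
                  credit: q.Add(
                    q.Select(['data', 'credit'], q.Get(q.Var('accountRef'))),
                    q.Var('amount')
                  ),
                },
              })
            )
          ),
        })
      );

      // Invoke it from any client; the read-modify-write runs transactionally,
      // server side, with the data. (The ref below is a placeholder.)
      const someAccountRef = q.Ref(q.Collection('accounts'), '1');
      await client.query(q.Call(q.Function('add_credit'), [someAccountRef, 100]));
    }

    main().catch(console.error);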
When we talk about specifically that edge database or that edge access layer for my data,
and if I'm running in a Lambda function that happens to be in Singapore,
having my data co-located there,
that's also part of what Fauna is doing, right?
Yeah, it is.
Fauna offers global and regional deployments.
People don't necessarily want all their data available everywhere for compliance and performance reasons.
But we ensure that whatever topology you choose for your database, all your clients will access the nearest region within that topology and have a consistent performance and correctness experience by doing so.
It's automatic.
You don't have to think about, you know, I'm querying the primary, I'm querying the
secondary, or, you know, I set up my shard key to make sure this data is clustered together.
Otherwise, it can't perform or I can't do a transaction, that kind of thing.
So here's a really simple question that probably has a really complicated answer.
How do you accomplish this?
We do quite a few unique things in the database
world, in particular for the transactional algorithm. We grew up a little bit on NoSQL.
We were experienced with Mongo and Cassandra and those kind of things in that era where people said,
you have to scale because if you don't scale, your business will die. Well, if you're going to scale,
you have to give up transactions. You have to give up correctness or the ability to rely on your database to really enforce anything at all about the data, except
you make a best effort at replicating it sometime, somewhere. And that wasn't really
acceptable at Twitter at the time, but we had to tolerate it anyway. It's not acceptable in
particular for developer productivity, for the general application, or for higher-risk data than arguably tweets
usually are: your usual suspects, ticket reservations, banking, crypto, that kind of
thing. You really want to know if your data is accurate, especially with private data. You have
to make sure that security rules are enforced in a transactional way and that kind of thing.
That led us when we were prototyping Fauna to pick up an algorithm
called Calvin. And at the time, you know, this was about four years ago. At the time, there were
really only two serious algorithms available in industry for doing multi-region strict serializability
for read-write transactions. And strict serializability is like the optimal level of
correctness. It's the same level of correctness you would experience if you were doing everything on a single machine
sequentially without any concurrency at all. And it's easy to reason about for a developer. It's
easy to think about, you know, the happens before relationship, as long as you know there's no
overlapping in the read-write effects of those individual actions or groups of actions.
The first algorithm in industry is the Google Spanner algorithm, and that relies on access
to physical atomic clocks to sequence your transactions. Those are hard to get, not as
hard as they used to be, but still not generally available. It also relies on bounded latency for
accessing those atomic clocks,
because it's not enough to have the clock: if you can't guarantee that you can check the time in a
specific latency window, then you don't really know what time it is, even though the actual
source data was correct. And also, it can be potentially slow, because for a lot of transactions, you have to do multiple round trips to multiple shards to drop locks into the records and then clean them up once you've made some writes to that data.
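For intuition, here is a toy sketch of the commit-wait idea behind that. It is a caricature of Spanner's TrueTime, not Google's actual implementation, and all names are invented.

    // Toy sketch of Spanner-style commit wait (illustration only).
    // TrueTime doesn't return a point in time; it returns an interval
    // [earliest, latest] guaranteed to contain the real time.
    function trueTimeNow(uncertaintyMs) {
      const now = Date.now();
      return { earliest: now - uncertaintyMs, latest: now + uncertaintyMs };
    }

    async function commitWithCommitWait(applyWrites, uncertaintyMs) {
      const commitTs = trueTimeNow(uncertaintyMs).latest; // choose a commit timestamp
      await applyWrites(commitTs); // plus the lock round trips Evan mentions, in the real system
      // Wait out the clock uncertainty before acknowledging, so no later
      // transaction anywhere can be assigned a smaller timestamp. This wait
      // is latency paid on every commit, and it grows if clock access is slow.
      while (trueTimeNow(uncertaintyMs).earliest < commitTs) {
        await new Promise(resolve => setTimeout(resolve, 1));
      }
      return commitTs;
    }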
We felt, based on our Twitter experience, Twitter was a global system.
There wasn't natural partitioning in the user base the way there was for Facebook when Facebook rolled out school by
school in the early days and you couldn't actually communicate with people outside of your cluster.
Twitter was never like that. Twitter was always global. We wanted a data system which would
support that kind of global access but still give you an optimal latency experience. And that led us
to pick up Calvin, which came out of Dr. Abadi's lab at Yale.
Dr. Abadi is one of our advisors now.
And that algorithm is a unique algorithm.
It's a single-phase transactional consistency protocol, which doesn't rely on knowing what the transactions are before it puts them in order.
There are a couple of key things that have led people in industry to kind of reject that algorithm originally.
The first was that the paper is very opaque and kind of scary.
And there's a lot of sections where things are left as an exercise to the reader.
There's like hand waving about putting locks everywhere, which sounds slow and error prone.
It doesn't have the kind of brand backing of Google, who can say, you know, we did this in production, it works. Whether it took 20,000 engineers five years to make it work is kind of beside the point,
but, like, it happened, so people believe it. Calvin was not like that. But what Calvin offered
was a system which was uniquely adapted to NoSQL. It's harder to do SQL over Calvin.
You can do it, but it has some performance implications
that take extra work to work around on the part of the database vendor.
But in Calvin, if you submit the transaction
as a pure function over the current state of the data,
so not like begin transaction, do stuff database side,
do stuff application side, then commit transaction.
But if you submit it only as work that can happen in the database as a single expression, then Calvin will order those in a globally partitioned and replicated log,
tell you what the order is, and then apply them to the replicas, which can be anywhere, and tell you what the data is.
So that order is inverted from the typical lock-based transaction system you'd find in
something like Spanner or Postgres. And that means it can do everything in a single round trip
in the data center quorum. And it gives you optimal latency effectively. All reads can
happen from the closest data center without further coordination.
So it's a very good edge experience. And then all writes are just one round trip for whatever the
majority of the regional cluster is.
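A compressed sketch of that order-first, execute-later flow, with everything collapsed into one process. In a real Calvin-style system the log is partitioned and replicated and execution is sharded; the names here are invented for illustration.

    // Toy sketch of the Calvin-style flow (illustration only).
    // A transaction is submitted as a pure function over the current state --
    // a single expression, not an interactive begin/commit session.
    const log = [];                          // the global transaction log
    const replicas = [new Map(), new Map()]; // two full copies of the data

    function submit(txn) {
      log.push(txn);                         // 1. agree on a global order
      const position = log.length - 1;       //    (one quorum round trip)
      // 2. every replica applies the same log in the same order, deterministically,
      //    so they converge without cross-replica locks or a second commit phase.
      const results = replicas.map(state => txn(state));
      return { position, result: results[0] };
    }

    submit(state => { state.set('balance:alice', 100); return 'ok'; });
    // Reads can be served by whichever replica is closest to the client.
    console.log(submit(state => state.get('balance:alice')).result); // 100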
Anybody else out there that you know has grabbed Calvin and run with it like you guys are?
Yeah, there's Yandex. After we did our work, Yandex eventually released a system that they had built internally initially,
which does SQL with a Calvin-inspired system.
Then also Facebook has an internal system which shares some similarities that also popped up somewhat concurrently,
maybe a little bit after we were first publishing what we were doing.
I forget the name of that system.
It's not available to the public.
It's not open source.
Well, I was Googling Calvin while you were talking about it,
and I found a nice blog post on, I think it's called fauna.com,
called Spanner vs. Calvin, Distributed Consistency at Scale,
by Daniel Abadi back in 2017.
So we will link that one up for people who want that comparison.
Because I haven't heard of Calvin, I've definitely heard of Spanner,
probably because of the marketing prowess of Google
and just the fact that when Google does a thing,
it gets out there and is talked about by developers all around the world.
For a while we had a serious technical marketing challenge here
because if you remember the NoSQL vendors
like DataStax and those guys,
they would bang on forever about how distributed transactions
were literally impossible.
And you should just abandon hope of enforcing transactional consistency
in your database, and you just need to make the application detect
when the data is corrupt and make a best effort to clean it up. And if you lose some transactions, who really cares? It was
only a few transactions. Hopefully they weren't big ones, but you know, that's your problem now.
That was the party line, you know, from DataStax, from Mongo, from Couchbase, from many
others. At the same time, you had the Postgres crowd, you had the Redis
crowd saying, well, you don't need
scale, just get a really big server, do everything with a really big lock, locks will get faster over
time, Moore's law will never end. It did end, but putting that aside, you know, it theoretically
could start up again, and, you know, you can get more and more hardware, and if it goes down, the
downtime isn't too bad, you know, and it's all worth it for the cost of transactions.
And, you know, that meant when we first started publishing what we were working on, people didn't
believe it. You know, even Google had some challenges to overcome with some of their
papers about Spanner, about Chubby, and some of these underlying strongly consistent systems where
people said, well, you know, the CAP theorem says you can't
do that. You know, I don't need to read this because you're trying to do something impossible.
This is a perpetual motion machine, that kind of attitude. So Google sort of paved the way and
convinced people, you know, with specialized hardware and a ton of grunt work, you could
actually get something which was better than the primary secondary replication system for
transactional
data. But then we had to extend that and prove, you know, through our blogs, through the Jepsen
report and that kind of thing that Calvin actually works in an industrial context and that you can do
better than Spanner. You can do better than these multi-phase commits. You can do better than the
hardware dependencies and kind of get the best of both worlds in terms of the NoSQL experience of scale and the RDBMS experience of transactional consistency.
For the uninitiated, can you break down what CAP theorem is?
So the CAP theorem says that you can't have consistency, availability, and partition tolerance
all at the same time. Basically, you know, you've got a bunch of nodes on the network.
You want them to be perfectly synchronized.
Well, if you lose your network link, then either they become unsynchronized
because they can't replicate to each other,
or they become unavailable because they refuse to accept transactions
because they know they can't replicate to each other.
Or the P is kind of weird, because you can't choose not to be partition tolerant.
Saying you're not partition tolerant means you assume a perfect network that can never partition,
which is not the real world.
But people read this as, like, it's a theorem.
It's in the name, the CAP theorem.
It's not a physical limitation on how data can replicate.
And in particular, if you have enough machines on wide enough distributed commodity hardware
with enough replication links between them
and enough algorithmic sophistication to handle those faults,
you can effectively approach now something
which feels like a fully CAP-compliant system.
So probably the best way to describe Fauna is
big C, small A, big P.
So we will never give up consistency. Worst comes to worst, you know, consistency will be maintained
while availability is sacrificed. But in practice, availability is essentially never sacrificed
because, you know, the algorithms are fault tolerant. They can route to other nodes and
that kind of thing. And when you have the client routing to the closest nodes and regions all the time and they're doing the same thing, you can actually dodge
the typical network partitions you have in the modern
hyperscale public cloud.
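As a toy illustration of "big C, small A" (invented names, nothing Fauna-specific): a majority-quorum write that would rather fail, giving up availability, than acknowledge a write it cannot replicate consistently.

    // Toy sketch: prefer consistency over availability (illustration only).
    // Each replica exposes write(key, value), returning a promise.
    async function quorumWrite(replicas, key, value) {
      const acks = await Promise.all(
        replicas.map(r => r.write(key, value).then(() => 1, () => 0))
      );
      const acked = acks.reduce((sum, a) => sum + a, 0);
      if (acked <= replicas.length / 2) {
        // Big C, small A: during a partition that isolates the majority,
        // refuse the write instead of letting replicas diverge.
        throw new Error('no quorum: sacrificing availability for consistency');
      }
      return { acked, total: replicas.length };
    }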
Yeah, that's interesting. You have kind of two angles of routing
around from two perspectives when you combine those from the client
side routing, like the network routing of the client to
the database network side.
You're saying that you basically can
just minimize those to where it's rarely a problem.
Yeah, and a key thing about making this work
is making sure that every step of the communication process
knows how correct it is.
One of the unique things about Fauna
is that Fauna is natively temporal.
So all data has a version timestamp,
and you can look back in history for audit purposes
or to show a chat history or that kind of thing.
That also means any query from any client knows how fresh that query was.
Transactions, when they're being written,
they know how fresh their dependent data was.
And at every step, you can check and make sure that you're not trying to do something which could fail if fresher data potentially existed.
That means you're never wondering, you know, do I have an up-to-date view or not?
You know how up-to-date it is and you know whether you can rely on it.
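Concretely, with the JavaScript driver, FQL's At wraps any query and pins it to a point in history; the collection and document ID below are hypothetical.

    const faunadb = require('faunadb');
    const q = faunadb.query;
    const client = new faunadb.Client({ secret: process.env.FAUNA_SECRET });

    // Read a document as it existed at a past timestamp. Because every write
    // produces a new version, audit trails and chat-style history are one query.
    client
      .query(
        q.At(
          q.Time('2021-09-01T00:00:00Z'), // evaluate the expression at this point in time
          q.Get(q.Ref(q.Collection('messages'), '1234')) // hypothetical collection and ID
        )
      )
      .then(doc => console.log(doc.ts, doc.data)) // ts is the version timestamp
      .catch(console.error);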
Any chance you're a Silicon Valley fan, Evan? The show, Silicon Valley?
I saw a couple episodes.
It was pretty close to home in terms of my Twitter experience.
I don't think I found it as humorous as others.
There were some early Twitter engineers who consulted on it.
A little painful to watch.
Jared loves when I bring it up because he's not a big fan or he hasn't gone, I guess, all the seasons.
And it's just so close to home, really, because this is a big thing that they did.
They solved for it, at least, with this algorithm that was essentially the plot device of the whole show.
But they never envisioned an internet where you would have so many devices, essentially, to take that P part of it and make it possible, because you just have so little latency between devices and this possibility.
So I'm just curious if you saw that, because that was a big thing they saw for there was like, they never envisioned where you would have this many connected edge
devices, in this case, smartphones in everybody's pocket with data that were 10 times more powerful
than the computer that took us to the moon, for example, this amount of computing in everybody's
pocket, with data, with network, globally. And that seems like what the partition part of the P of the CAP theorem
is the big part of it, is if you can kind of minimize that latency
between so many nodes and such a big network,
then you open up a world of possibilities, essentially.
Yeah, that's accurate.
It's not enough to say the database is available or not, either.
It's a much more nuanced, real-world question.
Like, is it fast enough?
What level of correctness did you explicitly request?
What period of time do you want your data to be served from?
That kind of thing.
The history of operational databases, I think,
it's a little, like, database development lags
other infrastructure software development because it's harder.
It's one thing if your compute node or something
craps out
and you have to start up a new one.
You lost a couple requests, but that's basically it.
If your data is unavailable, if your data is corrupt,
which is even worse, that has permanent impacts
on the health of the business, on the customer experience,
that kind of thing.
And making systems that reliable is very, very difficult.
So what we ended up with was, you know, the RDBMS is basically designed to be put in a closet in a
physical office building and accessed from PCs. You know, that was sort of the Microsoft Access
model, the SQL Server model, the Oracle model. You know, you'd run these rich clients on desktops,
which would have a relatively reliable network.
And if it wasn't reliable, you know, you could walk down the hall and pester somebody to go make it reliable on your behalf.
And then your problem would be fixed.
Well, that's not that's not the world anymore.
And we've tried to extend these systems, which were designed for much smaller deployments with physically accessible, you know, low-latency links between them, for a cloud world with, like you said, you know, people on smartphones, you know, in cars, all kinds of crazy smart devices accessing.
Refrigerators even.
My washing machine in the other room's got Wi-Fi access, you know.
Yeah.
I've got it on a VLAN, of course, because I don't want anybody hacking my house through my LG model.
I'm just kidding around. Like, you give an access key, essentially, to my network. But it's on a VLAN.
But the point is, you've got edge devices everywhere.
Yeah, and they move. Yeah.
Think about a corporate deployment in a store or something like Hertz.
Hertz knows which cars and which team members
and who's renting from which site most of the time.
The data doesn't move around at high frequency.
But when you have people playing mobile games and doing social media stuff or even using Salesforce,
they're flying all over the place and they want their data to be quick and correct from anywhere.
And that was sort of the great unsolved problem of operational databases until very recently.
But there are a lot of other problems
that came along with that kind of legacy
physical deployment model.
And we lifted and shifted it to the cloud,
and you got your VM instead of your physical HP,
big iron, and that improved a bunch of things.
You didn't have to go to steak dinners
to buy
your Itanium chip anymore. You could deploy something immediately by clicking a button.
But you still have to think about what it is, like, how much capacity do I need? Well, nobody knows
how much capacity they need. So you either provision way too much and you pay for all this wasted
capacity, wasted resources you literally waste, you know, keeping those things on.
Or you don't deploy enough and then you have some kind of event, positive or negative,
that damages the experience of people using your product, whether it's an in-house IT thing or something for the consumer or the public.
And, you know, all these problems are problems of the metaphor of the physical machine. And if you use something like Stripe, for example, you know,
you never think about like, which Stripe node am I going to deploy so I can accept credit cards?
Like the concept doesn't make sense. And we want that concept to disappear for data too,
so that it stops making sense to think about where a physical piece of data
is linked to a physical machine.
This episode is brought to you by LaunchDarkly.
Fundamentally change how you deliver software, innovate faster, deploy fearlessly,
and take control of your software so you can ship value to customers faster and get feedback sooner.
LaunchDarkly is built for developers but empowers the entire organization.
Get started for free and get a demo at LaunchDarkly.com.
Again, launchdarkly.com.
So when you say Fauna is an API,
I think we all at this point know what that means
in terms of how I'm then using it.
It also, as a database layer, it brings me a little bit of apprehension because it's kind of like I get access to an API and then I get my access removed to an API.
And my data is precious.
And sometimes it's my business, right?
It is the business in some cases at the end of the day. So it kind of gives me a little bit of the apprehension of like,
well, if that API goes away, for whatever reason, my database is gone.
Now, same thing with Stripe, right?
So these are things that we have to deal with as developers
and as decision makers of what makes sense for our circumstances.
But open up with Fauna is an API, unpack it some more,
and then let us know what that all means.
What does that end up meaning for me as a user?
Yeah, I mean, the API experience really is the web experience.
You know, this idea that you can use standard interfaces
to access data from anywhere.
You don't care where you are,
and you don't care where the server you're talking to is.
You don't have to go over a secure link.
You don't have to be within a special network perimeter.
You don't have to go get your special Lotus Notes credential
and install the special app and use the special protocol.
It's the web.
That's what makes the web interoperable. It's what makes it ubiquitous. It's what makes it so productive both to develop with
and to consume for SaaS and consumer products. We want the database to be just like using any
other web service. You're right, that comes with some downsides. In particular, operational
transparency is not a given
when you're not deploying your own server. You know, you don't have any administrative access to the
underlying hardware. You can't go inspect the VM. You can't go back up the physical bits on your own.
That means it behooves us to, you know, build that transparency back into the system, to give you access to resource consumption metrics,
to give you access to a backup system
where you can make your own decisions about restoring your data,
to give you access to the history of your data
where the temporality feature is often useful,
and also to give you performance transparency
into how things are operating.
Do you really care that much how Stripe performs?
A little bit.
You don't want it to take five seconds to check out,
but if it's 10 milliseconds or 100 milliseconds,
that's probably not a huge deal for you.
If your database performance is that variable,
that is a problem.
It means we have a very high bar as a cloud operator
because we developed the software and we operate it ourselves.
Fauna is hosted on AWS and GCP right now, and we'll go to other cloud
providers shortly. We've taken that burden off you and that's part of
the value you get as a customer but we also have to make sure we don't eliminate
the transparency there that you would get from a managed cloud solution or
something you're physically operating yourself.
I noticed you have an entire subdomain on the website called trust.fauna.com.
I think that's really what it comes down to, right?
I'm kind of having a looser hold over things that are historically precious to me or held
tight in terms of, like you said, operational transparency.
I used to be able to walk over to the server and pop in a new drive, pop that drive
out, pop this one in, or run that backup script manually or whatever we used to do. And I don't
want to do those things, and I don't want to have to do those things, but I do not know if I can
trust not being able to do those things. So like you said, you have a very high burden of trust,
and like you said, it behooves you guys to be as transparent as possible.
It looks like you have all sorts of white papers
and reports and compliance things.
And the entire goal, I'm assuming, of that subdomain
and of these efforts is like, A, being trustworthy,
and then B, proving to us that you are trustworthy.
Right, yeah, you don't want to do the work
anymore, but you still want to supervise that work. You know, if you pop the drive out and
it's the wrong drive, well, that's your own fault. Honestly, that feels a little bit better
than if someone else popped out the wrong drive on your behalf and you're like, what the heck,
who are these idiots? Yeah. Like, I could blame myself, but yeah, somebody else is way worse.
Right. Yeah. And then you're like, I learned something.
It was all worth it.
Well, we'll see.
We need to make it possible to supervise that operation to understand how everything works.
You know, you need to understand it so you can communicate effectively with our own support team if you're having an issue and that kind of thing. So it's not kind of the same level of opacity as you might get from something which
is, you know, lower risk, potentially lower value, like, you know, a domain focused vertically
integrated API. Because the way, like, we're sort of going through a series of transitions as we
usually do in industry, you know, we had all on prem everything, you know, even for payments,
like, you would go get, like, I don't know, your physical
payment block, and you'd, like, sign a contract with Braintree for, like, thousands of dollars and stuff,
and you'd be able to, like, take some credit cards. And then things like Stripe and Square moved that
into APIs. The same way for software infrastructure: we had on-prem deployments.
Then we moved to more dynamic managed cloud
with VMs and fast provisioning
and even more dynamic with containers.
Now we're moving into purely API-driven
infrastructure solutions
where there's no physicality at all,
but databases lag because they're harder.
So we're in this transition,
especially you see it with things in the serverless space,
like you mentioned Netlify.
Netlify and Vercel and their deployment and hosting systems,
a lot of the work the edge providers are doing
with eventually consistent data and caching
and that kind of thing.
Lambda and moving to more dynamic,
even interfaces which aren't based on POSIX anymore,
like WebAssembly.
All of that is driving us to this world where physicality
and the computer metaphor doesn't matter anymore.
We need to get databases there too,
but like you're saying, you can't get there without trust.
Yeah, you say no until you say yes, basically.
And that's where trust comes in.
You mentioned before that technical marketing had some challenges, especially the outlook on databases.
Like, you have to think so far down the road to do what you're doing with Fauna, because you're sort of, like, bypassing a lot of things, even, like, turning this into an API.
It turns off a particular developer.
You know, there's some apprehension, as Jared had said. What are the challenges you see now currently then, technical-wise,
that take that from a hard no to maybe even a yes
for developers listening to the show?
What is it that makes people trust that you can solve this problem?
Calvin is the algorithm.
It's newer, or lesser known.
You say no until you can say yes, and that yes comes from trust.
And so how do developers begin to trust that you've solved this problem?
Yes, choosing to adopt Fauna or something like it.
There's really nothing else like it, but imagine someday there is.
Choosing to develop a system, which is that novel, really comes down to two things.
It comes down to that trust.
It comes down to having an implementation and architecture,
transparency about that architecture and a feature set, in particular, the security model, which is pretty
unique in Fauna, that makes it safe to access data, either in a secure or insecure environment
like the web. And it also comes down to usability and adoptability. You know, it still has to fit
into the tool chain you currently use
and the tool chain you want to adopt.
That's where Fauna features like the GraphQL interface in particular come in,
so that we're not a SQL database,
but we can still be familiar and approachable to people in particular
who know GraphQL and JavaScript.
Yeah, I was just going to ask about the interface itself,
because, like you said, you're not doing SQL,
and you come from NoSQL kind of roots as a founding team.
And so what is the API? Is it Mongo-esque? Is it a brand new thing?
So we offer a GraphQL API, which is compliant with most of the GraphQL standard. It lets you get up and running very quickly.
And GraphQL is great,
but it's also incomplete as a query language. It doesn't support mutations directly. It's about
composing existing services and data sets on the read side and kind of mixing and matching the
response you want to get on a smart client like a dynamic SPA on the web or a mobile device. We also offer a proprietary language called FQL,
which is a functionally oriented query language, which lets you write very complex business logic
and mutation statements and has a rich standard library. And for that, there are DSLs for
JavaScript and C Sharp and Java and Scala and Go and Python and what have you
that make it easy to compose those more sophisticated queries within your application
or to attach them to GraphQL resolvers as functions, as stored procedures
that let you expose them over a GraphQL API.
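A small sketch of how those pieces fit together; the type and function names are invented. Fauna generates queries and mutations from an uploaded GraphQL schema, and (per Fauna's docs) a @resolver directive can bind a field to a user-defined FQL function.

    // A GraphQL schema you would upload to Fauna (held in a JS string here).
    // Fauna generates the basic CRUD operations from the types; @resolver
    // binds a field to an FQL function defined in the database, so complex
    // logic lives server side while clients speak plain GraphQL.
    const schema = `
      type Message {
        body: String!
        author: String!
      }

      type Query {
        # Backed by a user-defined FQL function named "recent_messages" (hypothetical).
        recentMessages(limit: Int): [Message!] @resolver(name: "recent_messages")
      }
    `;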
Yeah, I'm looking at this FQL, and this is very much in the vein of what SQL folk would at least look at.
I mean, select all, select where, select where not,
alter table, truncate table, at least at the very outset.
It seems familiar, even though it's its own thing and proprietary.
And at its heart, Fauna is a document-relational database.
So the relational concepts are there. You have foreign keys, you have unique indexes, you have constraints, you have views, you have stored procedures. But, you know, like you said, earlier in the podcast, you know, you're developing an app, you're probably doing it in an object oriented way. We want to support that object oriented way directly, without forcing you to go through an ORM or something else that translates that to a tabular model that isn't actually what you want.
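For example, the unique indexes and constraint enforcement he mentions look roughly like this through the JavaScript driver (collection, index, and field names are hypothetical):

    const faunadb = require('faunadb');
    const q = faunadb.query;
    const client = new faunadb.Client({ secret: process.env.FAUNA_SECRET });

    // A unique index doubles as a constraint: once it exists, any transaction
    // that writes a duplicate email fails, enforced by the database itself.
    client
      .query(
        q.CreateIndex({
          name: 'users_by_email',        // hypothetical index name
          source: q.Collection('users'), // hypothetical collection
          terms: [{ field: ['data', 'email'] }],
          unique: true,
        })
      )
      .then(() =>
        // Indexed lookup: the moral equivalent of SELECT ... WHERE email = ...
        client.query(q.Get(q.Match(q.Index('users_by_email'), 'evan@example.com')))
      )
      .then(console.log)
      .catch(console.error);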
So what are some perfect use cases for this? If you were to describe either an application that
we all know or a business or even like a, you can make one up if you like, where it's like,
these people should be using Fauna and here's why, or this is using Fauna and here's why,
or I would use Fauna to build this and here's why.
You can't say everybody.
Not everybody.
I mean, obviously a lot of Fauna's features were inspired by things
that we wanted to have at Twitter
and were forced to develop or forgo on our own.
Fauna is really designed for the modern Web 2.0 plus application world.
So SaaS in particular, I would say the majority of our customers are building some kind of SaaS app for a business, or blockchain-adjacent applications, things that
use crypto for public transactional purposes, but also store additional data for application
purposes. The thing that these all have in common is that there's a wide variety of customers and
people interacting with data sets. They interact with it in a soft real-time way. They interact with it from the web, from
mobile applications. It's all the apps you use today. What we don't do is analytics. We're not
an OLAP database. We're not a data warehouse. We're not a cache for some other database.
The transactional consistency does have a cost in throughput and latency. So if all you want
is a cache, you should go get Memcache or something like that. We're not an information retrieval system. We don't replace
Elasticsearch. We're not a queue. You can go get Kafka or something else like that for those
purposes. It's really sort of the dream of MySQL. We want to be to the serverless era and Jamstack
and kind of the API infrastructure era, the same way MySQL was to the Web 1.0 era,
where this is a general purpose operational data platform. It's very easy to use. It's very easy
to adopt. No startup costs. Develop on your laptop. It does a very good job. We can argue about whether
MySQL did a good job, but it was a better job than others at the time because it existed. It does a very good job at that core, you know, short request, transactional,
user data, mission critical data, constrained indexed use cases. And then it does a decent
job at everything else you need to build a fully featured application so that you can get started
without having to have a whole bunch of tools all mixed up in your tool chain.
We fundamentally don't really believe in the classic polyglot persistence
kind of attitude where you pick the best tool
for every single kind of query pattern you might have in your app.
Databases are heavy pieces of infrastructure.
It's hard to move data around.
You don't want to have too many of them.
So the more general purpose they can be, the fewer you have to use. We do have an advantage in the
cloud, though, that we can connect and integrate more easily with adjacent systems in a way that
takes the integration burden off the user. So that's one of the things we're working on going
forward and making it seamless to link up to the analytics database you want to use,
the queue you want to use, and that kind of thing.
This episode is brought to you by Teleport.
Teleport lets engineers operate as if all cloud computing resources they have access to are in the same room with them.
SSO allows discovery and instant access to all layers of your tech stack, behind NAT, across clouds, data centers, or on the edge.
I have Ev Kontsevoy here with me, co-founder and CEO of Teleport.
Ev, help me understand industry best practices and how Teleport Access plan gives
engineers unified access in the most secure way possible. So the industry best practice for remote
access means that the access needs to be identity based, which means that you're logging in as
yourself. You're not sharing credentials with anybody. And the best way to implement this is
certificates. It also means that you need to have unified audit for
all the different actions. With all these difficulties that you would experience
configuring everything you have, every server, every cluster, with certificate-based authentication
and authorization, that's the state of the world today that you have to do it. But if you are using
Teleport, that creates a single endpoint. It's a multi-protocol proxy that natively speaks all of these different
protocols that you're using. It makes you go through SSO, single sign-on, and then it
transparently allows you to receive certificates for all of your cloud resources. And the beauty
of certificates is that they have your identity encoded and they also expire. So when the day
is over, you go home, your access is
automatically revoked. And that's what Teleport allows you to do. So it allows
engineers to enjoy the superpowers of accessing all of cloud computing
resources as if they were in the same room with them. That's why it's called
Teleport. And at the same time, when the day is over, the access is automatically
revoked. That's the beauty of Teleport.
Alright, you can try Teleport today in the cloud, self-hosted or open source. Head
to goteleport.com to learn more and get started. Again, goteleport.com.
So Evan, it seems, based upon your resume, that you've been doing this for a while. You mentioned
your time at Twitter, employee 15, 2008 to 2011. I'm a LinkedIn stalker. I do it quite well.
So I saw that Fauna Research was there.
May 2012, which was obviously just after Twitter,
to January 2016.
And I think we've talked to many people like you
who have solved big problems like this,
and they begin with pain.
So probably pain to Twitter.
You mentioned how you couldn't solve these problems there.
And then I'm curious what Fauna Research was,
what that time frame represents, and how you got to kind of where
you're now, you know, given. It just seems like you've been working on this problem for a very long time.
Is that true? That is true. I mean, most of my, although I studied
bioinformatics in grad school and I worked on
gene orthologs in chickens, most of my career, really all my
career has been working around problems in the data systems. After grad school, I worked at SAP briefly,
and then I worked at CNET Networks, and I did chow.com and urbanbaby.com.
And UrbanBaby was a threaded, asynchronous, real-time chat for moms, which has a lot of
similarities with Twitter, if you stop limiting the audience only to moms.
It was hard to scale that on MySQL,
and then Twitter was also scaling on MySQL
and we solved the problems in a number of ways there
after Twitter we weren't sure
me and my co-founder Matt Freels,
we wanted to start a company
but we weren't really sure what we wanted to build
so we did consulting for almost four years in the data space
I had two kids
my co-founder had another kid.
So that's kind of low and slow, just exploring the market.
We didn't raise venture capital.
We didn't move into product development until 2016.
But we kept our eye on what people were doing.
And we saw that everyone was running a half dozen different databases.
at single-digit resource utilization, struggling to integrate
their data, struggling to scale things up and down, struggling to keep their data consistent.
They were having the same kind of problems we had at Twitter. Did they need a purpose-built,
you know, social graph that could do millions of requests per second? Probably not. So like
commercializing the stuff we had literally built at Twitter didn't make a lot of sense.
But we started to get this idea of a better data platform and a better data system.
And I think one of the things which is a little bit unique about Fauna, there are more deep tech startups now.
The last couple of years have changed things in terms of the funding market for companies that are based on real hardcore technology and focused on solving those problems first before bringing them
to market. But four or five years ago, it was rare to be looking for venture capital for a
deep tech infrastructure company. People believed Amazon had solved every problem that the market
would ever want. And the only thing to do was business model innovation. And if you were really
good at marketing, then it didn't matter how good your code was, sort of the, the Mongo or the Redis story, that kind of thing. Luckily,
you know, we got funded early on and we got the time to invest in solving these problems that
remained unsolved, you know, foundational problems in computer science, like the distributed
consistency problem, and also the opportunity to bring it to market. That also meant we were
a little too early to market. You know, when we first launched the serverless product as an alpha
in like 2017, people were scared of serverless. Lambda had just come out. There were no other
serverless data systems. The idea that you'd access a data API that scaled on your behalf
without your intervention, you know, without you having to go twiddle knobs and that kind of thing,
was weird.
People didn't want it.
So it took us some time to both mature the technology
and figure out how to go to market,
wait for the market to be ready for us.
Now serverless is big, Jamstack is big.
People are becoming familiar with these development models.
The vision for Fauna has never changed. What has changed
is the market readiness.
It's like a perfect-ish storm. To just touch on your funding,
2016 was your seed round, 2017, and it seems like, at least based on Crunchbase
data, if this is accurate, you can tell me and I'll go back and edit it and make it correct if it's not.
But early 2017 was a series A,
another series A, I guess in 2020, that would be a series B technically, right? Wouldn't that be
that? But it seems like almost 60 million in funding to get to where you're at right now.
And even what you said too, like with the funding models and the capital available for a deep tech
company like yourself, it seems like now it's available. It's becoming more and more common.
The market is mature to the needs that you're bringing to market.
So it seems like a perfect-ish storm for you to be where you're at right now.
Yeah, I think that's true in particular.
Mongo and then Elastic and then Snowflake
really changed things in terms of
the capital market's appetite
for doing real deep tech infrastructure
software. And yeah, we've raised $60 million. We brought on Bob Muglia as chairman at the end of
2019. We brought on Eric Berg as the new CEO, replacing me last year. Got more professional
management around the table so my co-founder and I can focus on technical problems,
because I was always just the least technical co-founder.
I was never really the business guy, so to speak.
We were kind of surprised.
There were a few other companies that set out to solve the same problem
around the same time, in particular Cockroach and YugaByte.
They took a different technology approach than us,
but they also found the market wasn't really ready for this kind of interface model.
But their solution was to fall back and build another SQL database.
And that's fine, I guess.
You know, there's 30 cloud Postgres SQL things you can use, and it's hard to differentiate among them.
But if you can carve out a niche, you can make a business there.
We didn't want to replicate those little interfaces.
We really wanted to build an interface which was for where the world was headed,
where the new stack was being built.
For people who are building these dynamic edge and mobile and SPA browser applications,
in particular, we fit well with blockchain and crypto stuff.
There's no commitment to SQL in that world.
No.
You know, they're like, people are looking for the newer, better language,
the newer interaction model and those kinds of things.
You know, it's easy to adopt Fauna if you already use GraphQL
in your organization in particular,
because we offer a native GraphQL interface.
So, like, I think one mistake people sometimes make is,
well, how do I migrate my existing Postgres cluster to Fauna?
You can if you really want, but that's not what we're really designed to be.
We're designed for new workloads and for augmenting existing systems.
So if you have a big Postgres cluster, whether it's in the cloud or whatever, like, leave it alone.
That's okay.
If it works, great.
But maybe
you don't want to continue investing in it. You don't want to run the risk of altering tables.
You don't want to deal with provisioning more hardware for new use cases and that kind of thing.
Like use GraphQL and a federated system like Apollo to augment that with Fauna. You know,
put your new data, your new applications, your new product features in the new stack. Leave the old stuff alone. That's more the Fauna model.
I think that's a fair line
in the sand to draw, especially considering you're taking on large problems, and the difficulty of
providing migration paths for the older technologies, or adapters, or whatever it would be, will take you
off of where you're trying to go with Fauna. And saying, yeah, well, we're for new things, and
you could slowly adopt us by doing, like I said, the things that you could do to keep that thing
running and augment and put new stuff here. Maybe slowly transition. Maybe never completely transition off of it.
But have brown, what do you call them?
Brown path?
No.
Greenfield, brownfield.
Yeah, yeah.
Leave your brownfield alone and here's your greenfield over here.
And it's built on Fauna.
I think that's a decent place to position yourself.
But what about all the free offerings out there? So Fauna,
maybe it has some free as in beer for a little while. I actually didn't look at your pricing page yet. So let me know exactly how that works. But free as in freedom is also important and
transparency to some folks. Postgres is a different kind of database, but definitely
the same in terms of it trying to be your primary general-purpose data store, which
Fauna wants to be. Not built anywhere near the same way, but the price tag there is zero. There's a lot
of that out there. And for databases as a service, or data APIs as a service, as it grows and
becomes productive and useful,
it's going to cost you.
So how do you compete with free?
So we are also free.
In fact, we can be more free than other cloud database vendors.
In particular, like the Fauna architecture,
it's a true API.
It's multi-tenant.
The isolation is internal to the database kernel. So you're never paying for idle capacity.
So like most cloud
databases, you sign up, you get like a 30 day trial because they deployed some container or VM
and it's costing them a hundred bucks a month. That means like, you know, some salesperson looked
at your email, decided you were worth spending 300 bucks on. You get
your 90 days. And then after that, they call you and they're like, do you want to pay or do you want to go away? Fauna, we have no fixed costs for a new user. We only have to pay for resources
actually consumed. So anyone can sign up for Fauna for a free forever database. You don't need a
credit card or anything. And then if you start to scale, then you can start to pay for it by the
resources consumed. So the actual economics of it are much better,
both for us as the vendor and for the customer,
than your typical managed or containerized deployment of any kind.
I think some people do care about open source.
Fauna is not open source.
It's only cloud.
It's only proprietary.
But the majority of the market, in our experience,
cares about the interface more than
they care about the code base. Like the number of people who are going to crack open their database
and make some fix is very, very small. You know, most people treat databases as some kind of
artifact that's handed down from the gods, like opening it up and changing the implementation is
the last thing you would ever try to do. You know, you'll exhaust every other opportunity
to fix your problem before you would take that risk.
Did you all patch MySQL at Twitter?
Oh, yeah.
Yeah, I'm sure you did.
So at a certain scale, those people do exist,
but I agree, they're on the margins.
But they're also big customers, right?
I mean, Twitter would be a huge customer for Fauna.
Twitter would be a great customer,
but in reality, that class of company is nobody's customer.
Google, Twitter, Facebook, Salesforce, LinkedIn, whatever, they all have the capacity to build
completely custom databases in-house if need be. So you don't get the kind of
vendor-customer relationship you do with someone who really needs you, which is the vast majority
of the market. And I don't mean that in like a leverage, oh, like, well, not like an Oracle way,
like, oh, you're stuck with us now, so now you have to buy us a yacht. Just like a collaborative
way where you're working together to solve the problem and the platform together. I think a lot
of people aspire, they want their companies to grow and be a Twitter. But if you get to that
point, you're going to be building custom stuff most of the time anyway. We're designed for the typical
company, the small team, which is trying to get something to market quickly, the mid-market
company, which has a lot of products that they need to extend and augment, the large company,
which may have an internal system, which is custom, but it's also building other apps like
IT apps or new projects, and they need something which is easier to deploy in particular. We see a lot of
usage where there is kind of a classic IT organization. This is similar to the Mongo
story in some ways. There's a classic IT organization. They have the official way to
do things, but it takes a lot of work. File your JIRA tickets, get your machine provisioned, you have to justify everything to everyone. And there's no place
to run experiments in that world. If you want to build something new
quickly, you get off-the-shelf tools. And we're trying to be the most usable
and fastest to market off-the-shelf tool for people building
modern applications. Did you evaluate the open-source
nature of it?
I mean, there's some companies that will use it for,
I guess, community adoption.
There's a lot of things around there. You said that
nobody's going to crack open their database,
code base, and start wielding it.
Jared used Twitter as an example.
Maybe, sure, at that scale you probably would,
but did you evaluate the
goodness, I suppose, you get
from being open source? Like the public good almost, the commons that people talk about and refer to often?
Did you evaluate that and decide on proprietary? Because you've got to think about it from a business standpoint, right?
You're building a business.
Primarily, you're not necessarily building an open source product.
You're trying to build a business. And so when you evaluate that, you think, well, could some of this or should some of
this be open source, one, as market leverage, and two, maybe developer adoption.
But if you can do the free and no cost to you, and it's simply metered, maybe that's
the best of both worlds.
But did you evaluate the criteria of open source deeply and just simply say it wasn't
required to build the company you want to build?
We did, and we continue to evaluate it,
because the market changes, and what people need changes.
And we decided what people want are the benefits.
There's a certain section of the market which is religious about being open source.
Like, fine, well, those people don't use Amazon either.
They might use hardware and deploy Postgres to it,
but they're not using Aurora. They're not using Dynamo. They're not using, like, Azure Cosmos DB in Microsoft's cloud.
And those are all effectively proprietary systems. The benefits you get from open source,
you know, the things that really made it take off, especially in the nineties with LAMP,
where it was free to try and it fit with the rest of your development environment.
And that fit really means standard interfaces.
So the things we value about open source that we try to replicate are that free to try experience,
the local development experience.
We do have a Docker image.
You can run a single node copy of Fauna on your own machine to develop against it without having to deal with the cloud. And the interface standardization, which we're working to improve both in GraphQL and in FQL,
and most likely eventually other query languages too.
If you have that, who cares if you have the exact same code your cloud provider is running?
You have code you can run locally if you need a local edition.
You have interfaces that people are familiar with and understand.
And because of the unique architecture and so on, you have economic benefits from the deployment model and the vendor pricing and so on.
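As a concrete illustration of that local workflow, here's a minimal sketch. The fauna/faunadb image name, the 8443 port, the default "secret" root key, and the faunadb JavaScript driver calls below reflect how the Fauna dev image has commonly been documented, but check the current docs before relying on them.

```typescript
// A minimal sketch of local development against the Fauna Docker image.
// Assumptions: the image is fauna/faunadb on Docker Hub, it listens on
// port 8443, and the dev root secret is "secret". First, in a terminal:
//
//   docker run --rm --name faunadb -p 8443:8443 fauna/faunadb
//
// Then query it with the classic FQL JavaScript driver (npm i faunadb):
import faunadb from "faunadb";

const q = faunadb.query;

const client = new faunadb.Client({
  secret: "secret",     // default dev secret (assumption)
  domain: "localhost",  // talk to the local container, not the cloud
  port: 8443,
  scheme: "http",
});

async function main() {
  // Create a collection, then write and read back a document.
  await client.query(q.CreateCollection({ name: "posts" }));
  const doc = await client.query(
    q.Create(q.Collection("posts"), { data: { title: "Hello, Fauna" } })
  );
  console.log(doc);
}

main().catch(console.error);
```

Note that a second run would fail on CreateCollection because the collection already exists; wrapping it in q.If(q.Exists(q.Collection("posts")), null, q.CreateCollection({ name: "posts" })) is the usual idempotent guard.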
Good answer.
Anything else, Adam?
Not necessarily.
I mean, I think I can somewhat agree with you. There's no real benefit, because you have to think about what you're optimizing for as a company. What you're optimizing for is building a successful product, a successful database that solves the technical challenges first, not the must-be-open-source challenges.
It's just very common for a database, because of security and the different things involved in it, to be open, whether you want contributions or whether you think it's valuable for people to see the code or not. It seems to be, in quotes, the way. Even if you become source-available, like with the SSPL or the Business Source License, at least the code is visible.
I was just curious about how that shifted for you and how that played out for you.
Yeah, I mean, there's a slow trend away from that model.
I think the thing that people really hang on to is the sense of transparency and trust
they get from the open source nature of the database.
The portability, the idea that you can switch from one vendor to another is important to
a lot of people.
In reality, switching an operational database is painful no matter what, even if it's portable in theory. Going from one version of Postgres to another version has problems, let alone going to a completely different implementation or from one cloud vendor to another.
Everything people use in AWS RDS or Aurora is heavily customized, to the point of being unrecognizable compared to the open source editions of the database. But people definitely still value those aspects of the open source experience, and occasionally they ask for it. I think, though, that as we move to a wider variety of cloud databases composed around these standard and proprietary interfaces, it matters less. In particular, one of the things we're working on that we're excited about is a better ability to query the same data from different query languages in the same database. At that point, you don't care so much whether that particular implementation is open source. People value the time to market, the operational experience, the pricing and cost benefits, and the unique capabilities a lot more than they value being able to fix their own bugs or being able to put the source code in a vault in case someday the vendor goes away.
What do you think the biggest challenge is for you right now, given the state of the market? You can even speak business-wise, even though you're a CTO now and you've hired a CEO. Despite that, I'm not saying you can't play a role in those decisions by any means, but talk about future funding too. What's the biggest challenge you face, technically or business-wise, right now?
I think at this point, the biggest challenge is really keeping up with our customers.
Building a database is a slow process.
It's not the kind of slapdash code development you would typically see at an early stage startup.
But that doesn't mean the market goes slower to match you.
So we have tons of growth on the platform,
lots of people pushing it in new and unique ways.
And they also want a lot of new capabilities,
stuff that's been on our roadmap for a long time
that we still have to deliver.
So there's no one else really doing what we're doing in the market.
And that means the bottleneck to our growth,
to satisfying our customers,
to giving everyone a better Fauna experience
is really us and our ability to execute
on the vision that we laid out several years ago.
So it's time to accelerate, basically.
Yeah, yeah, exactly.
What's on the horizon then?
So you mentioned your customers are not slowing down by any means. That means you have to move faster to keep up, even though you can't move fast, or not so much can't, but moving fast isn't by nature the way you build a database.
What's on the near horizon?
You mentioned some features that are specific that customers want.
What's on the six-month roadmap that might be coming to fruition sometime soon?
Give us a tease of what's the future like.
Our focus is twofold. It's on maturity and resiliency for customers who are already successful with Fauna.
We launched the region groups feature earlier this year.
We'll have more region groups launching.
We'll have a better backup and restore capability that's more under direct user control, that kind of thing.
More compliance for different regulated industries. We did GDPR, we did SOC,
and there'll be other ones coming. So the kind of things that help you
grow once you're at scale and, like you said, have trust in the database.
And then at the same time, the other area of focus is really the adoptability,
making FQL easier to use, making GraphQL more standards compliant,
eventually building other popular query standards on top of the same database kernel,
making sure that Fauna is always the easiest thing, both operationally and in terms of the development experience
to build your new application or your new feature on.
Is there anything that we haven't asked you that you're like,
man, I really just wish they would ask me about these things?
You're speaking to a developer audience,
potentially future customers,
or at least people curious about what you'll solve in the future.
They'll pay attention.
Is there anything we didn't ask you that you want to close on?
Yeah, I think we talked a lot about trust,
like the database is this scary thing that can never be changed.
But there's no risk to trying it out.
So I just encourage people to go to the website,
click the sign-up button.
Database provisioning is instantaneous.
You can go through the tutorial, play around with the GraphQL
and the FQL interface and see if you like it and give us
feedback if you don't.
You mentioned the free forever before, so you've got a free forever monthly plan. You mentioned the Docker image you could use locally. Does that Docker image locally require a sign-up, or is that something you can just pull down from Docker Hub or whatever?
It's not an authenticated package. You can get it and run it.
Okay.
So you can try it without signing up, then, through the Docker image.
You can. You can also, you know, it's just an email, or you can use GitHub or Vercel or Netlify identity to sign up as well. So.
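To make the GraphQL side of that tutorial concrete, here's a hedged sketch. Fauna generates queries and mutations from a schema you upload; the graphql.fauna.com endpoints, the Post type, and the createPost mutation below are illustrative assumptions following that pattern, not specifics confirmed in this conversation.

```typescript
// A hypothetical GraphQL round trip against Fauna's hosted endpoint.
// You'd first upload a schema like `type Post { title: String! }`
// (via the dashboard, or a POST to https://graphql.fauna.com/import
// with an Authorization: Bearer <secret> header), and Fauna would
// auto-generate mutations such as createPost.
const FAUNA_GRAPHQL_URL = "https://graphql.fauna.com/graphql"; // assumed endpoint

async function createPost(secret: string, title: string) {
  // Uses the global fetch available in Node 18+ and browsers.
  const res = await fetch(FAUNA_GRAPHQL_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${secret}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      query: `mutation CreatePost($title: String!) {
        createPost(data: { title: $title }) { _id title }
      }`,
      variables: { title },
    }),
  });
  return res.json();
}

// Usage: a database secret from the dashboard's free plan would go here.
createPost(process.env.FAUNA_SECRET ?? "", "Hello, Fauna").then(console.log);
```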
Cool.
Evan,
thanks for the deep dive into all things Fauna.
I think we really appreciate these technical deep dives. Going back to the white paper and Dr. Abadi, who you mentioned is a board member for you, we'll link up the blog post that we referenced to some degree in this call in our show notes.
The trust page, of course.
Any other links we can think of that make sense.
But, Evan, thank you for your time.
It's been awesome.
You're welcome.
Great to be on the show. Great to meet you.
That's it for this episode. Thanks for tuning in. What's got you excited about Fauna? Are you planning to try it out? Let us know in the comments. Coming up next week is Brittany Dionigi talking about learning-focused engineering. And big shout out to our partners Linode, Fastly, and LaunchDarkly. Also thanks to Breakmaster Cylinder for producing all of our awesome beats.
And of course, thank you to you.
If you enjoyed this show, tell a friend.
We'd love to have them as a listener.
Word of mouth is by far the best way to help us.
And don't forget the Galaxy Brain Move.
We have a ton of shows you can listen to.
Subscribe to them all in a single feed at changelog.com slash master.
And for those who want to get a little closer to the metal and get a free t-shirt along the way,
subscribe yearly to ChangeLog++.
Learn more at changelog.com slash plus plus.
That's it for this week.
We'll see you next week. Thank you.