Orchestrate all the Things - Another globally distributed cloud native SQL database on the rise: Yugabyte Raises $30 million in Series B Funding. Backstage chat with CEO and Founders
Episode Date: June 9, 2020
Your good old on-premise SQL database is in terminal decline. A pure-play open-source cloud-native PostgreSQL, with support for Apache Cassandra and GraphQL interfaces, is what you need. Or at least, this is what the Yugabyte crew thinks. The company, founded by Facebook data infrastructure veterans, announced that it has raised $30 million in an oversubscribed Series B round to double down on community and team growth. This is a crowded market, but big enough to be a non-zero-sum game. We connected with Yugabyte founders Kannan Muthukkaruppan and Karthik Ranganathan, and newly recruited CEO Bill Cook, previously of Sun Microsystems and Pivotal, for a deep dive into the company, the funding, and the market. Article published on ZDNet in June 2020.
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
Today's episode features another globally distributed
cloud-native SQL database on the rise, Yugabyte.
Your good old on-premise SQL database is in terminal decline.
A pure-play, open-source, cloud-native PostgreSQL,
with support for Apache Cassandra and GraphQL interfaces, is what you need.
Or at least, this is what the Yugabyte crew thinks.
The company, founded by Facebook data infrastructure veterans,
announced that it has raised $30 million in an oversubscribed Series B round
to double down on community and team growth.
This is a crowded market, but big enough to be a non-zero-sum game.
We connected with Yugabyte founders Kannan Muthukkaruppan and Karthik Ranganathan, and
newly recruited CEO Bill Cook, previously of Sun Microsystems and Pivotal, for a deep
dive into the company.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter,
LinkedIn and Facebook.
Well, thanks everyone for making the time to connect and discuss your upcoming news, which is pretty exciting. It's the first time we connect, actually, so I thought the best way to do this, both for me and for the people who may be listening, would be to do a little bit of a flashback, let's say, and go through a little bit of history. So when was Yugabyte founded, what brought the founding team together, and what has your course been up to this point, basically? If you'd like to summarize it in a few words.
Do you want to take the lead, Kannan, Karthik, since you guys lived it? And I can talk a bit about why I joined.
Yeah, George, pleasure to talk to you. To give you a brief history, Yugabyte was founded in 2016.
Karthik, Mikhail and I founded it. The three of us met at Facebook, where we had the opportunity to work on a lot of Facebook's high-scale data infrastructure, including having worked on Cassandra before Cassandra was open-sourced by Facebook, as well as a lot of work on HBase. The goal was to put some real mission-critical applications like Facebook Messenger on a data tier that was elastic, that was easy to manage and operate, and that could really handle data center failures and challenges of that kind, which are now becoming pretty much table stakes across many organizations.
Prior to that, I have a history in databases, having worked at Oracle on the database team. So here we found that there's a real need for bringing something as fundamental as a relational database, or a transactional database if you will, to power online workloads. But bringing something as fundamental as that to the modern cloud, which is built of commodity blocks in a shared-nothing architecture, really needed a next-generation database that worked well on the cloud and that matched the demands enterprises place on their modern applications as they make the decision to move to the cloud.
And maybe I'll just add some color, George. This is Bill Cook, and I joined recently as CEO. From my perspective, I've had a privileged career in some sense, being able to work at Sun Microsystems for almost 20 years, then signing on to help Scott Yara build Greenplum in 2006, and then spinning out as Pivotal in 2013 after being acquired by EMC. So when I got to know a bit about Yugabyte and got to know Kannan and Karthik, it started for me with what Kannan was just explaining. The mission of the company made sense to me.
If you think about what we did at Pivotal, it was really advancing the idea of a platform as a service, i.e. Cloud Foundry and microservices, and the Pivotal Labs story around agile development and working with large enterprises to build software in a better way. In some sense, database technology and data services in general are somewhat lagging that opportunity, meaning that the applications are moving along: scale-out, running in multi-cloud and hybrid cloud environments. And the data side of it is trickier. So the mission that Kannan, Karthik and Mikhail have been on, of really building technology to address that need, really appealed to me. The market opportunity was certainly there.
And then the other aspect was really just about building a great company. I think the most important thing is that, as a leadership team, we're aligned in mission, vision, culture and our beliefs. We just want to build a great company that the best and brightest, whether you're an engineer or a salesperson out in the marketplace, want to join for this mission. So that's why I'm here.
Okay.
Thank you.
Thank you both.
And actually, since you touched on the building-a-great-company part of things and the cultural aspect, that's a good opportunity for me to ask something which I originally missed. How big is the company at this point? How many people do you have on your payroll? I know that it's an open source company, so in a way you may also want to count people in the broader community, and we'll get to that as we move along in the discussion. But for the time being, let's keep it to people on the payroll. So how many people work for Yugabyte at this point?
Yeah, we are about 50 people right now at Yugabyte.
Okay.
And that's really not counting the community.
Yeah, not counting the community.
I think the community will be quite a bit bigger, depending on which aspects of the community you count. For example, we started our community chat, our Slack forum, just over a year ago, a year and a few months. And we've already, I think, crossed 1,100 people in the forum. These are all technical people helping ask and answer questions, as well as having good technical discussions, trying to push the direction of distributed SQL and Yugabyte in general. We also recently crossed 100 contributors to Yugabyte. That's the set of people that contribute code, documentation and various types of fixes. So yeah, depending on how you look at it, it could be a lot, lot bigger.
And then just to add a bit of color, George: I know we'll talk about the funding round that we've raised, the $30 million that will get announced here in the next week or so, with 8VC leading the round. With that investment, we plan to basically double the employee count over the next 12 to 18 months.
Okay, that's great. You anticipated my next question, actually, which was going to be precisely that. So since the opportunity that brought us together is actually the funding you're going to get, I was going to ask if you want to share a little bit of background on how that came about and, as a second step, what you're going to be using the funding for. You already partially answered that. I'll let you take it from the start.
I'll let Kannan answer because he really drove the funding round. They were recruiting me and driving the funding round, so I'll let him comment. I guess it worked out on both fronts.
Yes, it did.
I mean, George, fundamentally, the last 18 months have been phenomenal. We have seen tremendous growth in terms of the community aspects that Karthik was referring to, but also in enterprise customers as well as production deployments of Yugabyte. I think that's sort of our barometer, the signal that the demand is massive. Every day on our Slack channel, we get more requests: hey, when is this feature coming, and that feature coming? So it was clear that the market opportunity was there, and we wanted to double down and accelerate our investments in product, commercial activities, and support and operations.
And obviously the investors recognized this market opportunity as well, but also had belief in the team, the product and the way it's been architected. In fact, at 8VC, which is our lead investor in this round, the partner and CTO is Bhaskar Ghosh, and he's no stranger to enterprise infrastructure. He worked in the guts of Oracle and Informix in the early years. Later, he headed data engineering at LinkedIn. So he really understands the data infrastructure space. And 8VC is a team of dynamic investors with entrepreneurial roots. They are some of the folks behind Palantir. And they saw the market opportunity around a next-generation cloud-native database that can help businesses move to the cloud.
I don't know if you wanted to add any details.
Yeah, I would just add that we're proud of the investors we have, with the newest being 8VC and Wipro joining in this round, alongside Ravi from Lightspeed and Dell Technologies Capital. If you look at the roster of investors, they're betting on Yugabyte to be a big player in this big market. And then the one other item that was in the announcement is that my friend and partner, Scott Yara, is joining the board as well. Scott was the founder of Greenplum, and we've been on this journey together for, oh, I guess almost 15 years now. So him investing in the company and becoming a board member will, I think, be quite helpful as well.
Yeah.
Yeah.
And since we're still on the investment side of things: earlier in my career, let's say, I wasn't so keen on covering investment rounds. I thought, okay, great, they're getting some money behind them, good for them, but I'm not really that interested. But I realized I was wrong, for a number of reasons. First, it's a nod, let's say, a kind of vote of confidence from the people that are investing in you. It obviously draws some attention, which is a good opportunity for people like me to dive into the technical underpinnings of whatever companies get the funding. And it's also a good opportunity for people who may not previously have been familiar with the company to get to know you.
Something that I typically see people saying when they're in that position is that, well, we kind of chose which investors we wanted to bring on board, because it's not just about the money. Some people go as far as to say it's maybe even primarily about the kind of doors that getting those investors on board may open: the kind of advice you may get from them, the kind of potential clients they could connect you to, and so on and so forth.
So with that, I'm going to switch to the more technical part of the discussion, let's say.
You know better than me that you are in a relatively crowded market, let's say. Globally distributed databases: you're not the first ones to come up with that idea. It's a good one, obviously; hence the competition. And since it's a market that I've also kind of dabbled in myself, I know at least the basics about it. So to quickly break it down, also for the people who may be listening to us: you have a number of options there.
First, you have the NoSQL crowd, basically. So Cassandra and all kinds of Cassandra clones, and Mongo, and what have you. Most of these types of databases have by now gotten there in one way or another. Then you also have SQL databases, and this is the subsector, let's say, that Yugabyte is also in. And again, there you have a number of options. There are databases that come from cloud vendors, like Google Spanner or Microsoft Azure Cosmos DB and so on. Those obviously have some things going for them. Having a massive cloud vendor behind you does help. But on the other hand, it also means vendor lock-in, basically, and no ability to do multi-cloud and hybrid clouds.
So just to narrow in on your specific segment, let's say: SQL globally distributed databases not backed by a cloud vendor, but independent. Still, there are a few players around there. I'm not claiming to know all of them, but just off the top of my head, we have Yugabyte, we have CockroachDB, we have FaunaDB, and I'm probably forgetting a couple of others as well.
So to tie all that back into the investment part: clearly this is something that your investors must have looked into. What would you say it was that made them decide, okay, Yugabyte is worth investing in, even though it's such a crowded market? What makes you stand out?
Yeah, I'll kick it off, and then Kannan and Karthik can dive into the details underneath it. From my perspective, and I did a similar analysis, George, when I was looking to join here, I think what you're describing really speaks to the market opportunity in the broadest sense, meaning that there is a need here, back to my earlier comments, for in some ways simplifying the data answer to that question. You're really appealing to the developers on one side of the equation, and then to the people that have to operate these systems at scale on the other side. And then obviously to the business owners, from a general manager perspective, or the BU leader that wants to drive capabilities and results in a much faster way and a quicker loop.
So I think what the market is looking for is obviously the capabilities that you're speaking to, which Kannan and Karthik will dive into in more detail, but also the appeal to the broader community around open source and not being locked into a particular player, whether it's a cloud provider or a vendor, for that matter. Because it is such a big market, they want to have scale and have a future that, from a cost perspective, is not going to be prohibitive. And then the other side of the coin is really the enterprise side: enterprises are looking for simplification. This is a hard problem for them, and they have a lot of database sprawl across that three- or four-decade era that actually doesn't meet the requirements for where they're headed now. So a combination of those two things is why I think we're positioned particularly well to take on the market and win. So I'll let you guys take it from here.
Karthik, you want to cover that? You're on mute.
Yeah, I can go. I'll give my perspective. I think your question was: in the face of so much competition, why did our investors give us the vote of confidence? There are two or three reasons. I think the first answer has got to be the team. As a team, I think we're incredibly strong. It's one of the strongest teams around, cut out for the task at hand. And maybe I can explain.
I think Kannan already talked about his deep background at Oracle, working inside the guts of one of the world's most advanced RDBMS SQL databases. And he's not alone, right? There are other people that he's worked with in the past at Oracle that have come and joined us and are also working on the insides of Yugabyte. That's number one.
Number two is that all three of us co-founders, Kannan, Mikhail and myself, are super fortunate to have been around and worked on distributed databases, NoSQL databases specifically, even before NoSQL was a thing. For example, one of the first projects that Kannan and I kicked off and worked on at Facebook, back in 2007, eventually ended up becoming what is now known as Apache Cassandra. So we built a database there to deal with the exploding amounts of data, and open-sourced it. And we didn't know open source would take off at that time, because it wasn't really too much of a thing in databases back then. Subsequently, we worked on Apache HBase. All three of us founders are HBase committers, along with a number of others in the company.
And the unique thing about Facebook at the time was that we were not only builders of the database, we also had to operate and run it as a massive DBaaS. We're talking about billions and billions of operations per day and many, many petabytes of data, frequently having to upgrade machines and do rolling software upgrades, as well as deal with downtime in a totally automated way, as a zero-downtime platform, right? This was some of the critical user data that was flowing through the system.
Kannan and I had a stint at Nutanix as well, learning the ropes of building an enterprise company. So fast-forward that to Yugabyte, and we're a team of people that have seen massively complex technical problems, have both built and run databases at scale, and have seen enterprise company building, because Nutanix was relatively small when we joined. So it makes a good backdrop for building a large and successful company.
So that's on one side. If you juxtapose that with the other side, and I know you asked about specific competitors, but if you forget specific competitors for a second and just think about where the market is headed and what market we're addressing: we're addressing the market of people building applications, OLTP applications. Now if for a second we again forget all databases and think about which database is most often picked in order to build these applications, well, the answer would inevitably have to be PostgreSQL. It just always ends up there. And PostgreSQL's popularity is even greater than that of MongoDB right now.
So what we felt was that a lot of people are using Postgres to build their app. However, their app is being built for the cloud or a cloud-native environment like Kubernetes, which requires some key characteristics. It requires high availability and fault tolerance, so you're not affected by failures. It requires scale-out, the ability to add more nodes in order to serve more requests and scale back down when needed. And lastly, the ability to deploy data across zones, across regions, in hybrid deployments, et cetera. If you combine those three with PostgreSQL, what you get is a null set. There's no solution that exists that can do all of these today. Cloud vendors included, because if you look at Google Spanner, it doesn't really support most of the features that PostgreSQL has.
Now, if you look at some of the other competitors, while CockroachDB, for example, comes closest in the ability to speak the PostgreSQL language, it is far from having all the features that PostgreSQL has in order to service the RDBMS workload. So even though they have a head start on us, as you said, having been building their database longer, thanks to some of our architectural choices, for example starting with the PostgreSQL code base itself, we support a significantly larger number of the more critical and complex relational features compared to them. That's on one side.
With Fauna, on the other side, I think it also has to do with architectural underpinnings. Picking Calvin, and the way they built their database, supports a different kind of workload. I wouldn't characterize it yet, but it's a different kind of workload, and they would find it difficult, or would have found it difficult, to support full SQL. Hence, they've gone more towards the GraphQL side of things. So, yeah, I'll stop there. Kannan, I don't know if you have things to add.
Yeah, I mean, I was going to say, I think the investors essentially see the massive market opportunity in spite of there being multiple players. Give or take, it's a $50 to $60 billion market in the database space. But even though we are a younger company, the investors who did deeper due diligence are able to see the architectural choices that Karthik was referring to, which really put us in good stead. One is the architectural choice where the lower half is like Google Spanner and the upper half intelligently reuses the Postgres code base to bring features like stored procedures, triggers and user-defined functions, all with semantics identical to Postgres. That, along with some of the performance choices that we have made, starting from the language of development being C++, are key things where we're really building for the long term, and we didn't want to take shortcuts. These choices will hold us in very good stead in the long term in building a very foundational database for the market.
Essentially, the takeaway we want to leave with developers is this: if they ask themselves, hey, what if I had another database that does every single thing that PostgreSQL does, yet can be deployed in a cloud-native fashion, with performance that is comparably good on a single node and that really shines when you deploy it in a distributed fashion in the cloud, then that would indeed become the default database for the cloud, right? And it is critical to not deviate too much from the semantics, and to really have the high performance and everything that can be achieved.
Okay, thank you. Yeah, you touched upon quite a few points. If we had more time, it would be interesting to go a bit deeper into each of those. But since we don't, I think I'm going to wrap up by going into your future plans. We already talked about how you want to grow the company and so on, which makes perfect sense. And I also agree, by the way, that this is a huge market and not a zero-sum game, so there's probably enough room for everyone. I was just curious about your differentiation because you can't help but notice that you and some of the competition are pretty close in your offerings. So I was wondering how you would frame it, because if I'm a CTO looking to choose between your offering and some competitors, the differences may be quite fine.
Sorry, the CTO thing triggered this: we're 100% open source as well. We're probably the only database among all of the competition that we've talked about today that's 100% open source in the core database.
Yeah, that's probably true, to the best of my knowledge at least. But again, that's a big discussion: how much does that really matter, and for whom, and what really constitutes open source? All of these are huge topics. We could be talking about them for a couple of hours and we'd still just be scratching the surface.
That's right.
So just to wrap up in the few minutes we have left, there's something that piqued my interest and that one of you also mentioned earlier. You briefly mentioned GraphQL, and I saw that you have an interesting partnership going on in that space. I've seen that you partner with a vendor called Hasura, which basically enables you to offer a GraphQL interface in addition to the standard SQL that you already offer. I was wondering if you could say a few words about your rationale for going GraphQL, and specifically for partnering with Hasura.
Yeah, totally. On the GraphQL front itself, we think that GraphQL is increasingly gaining momentum. A lot of people are trying to build new-age applications in GraphQL because of the functionality and the convenience that it offers. It's ideally suited for mobile, web and that type of application. So as a space, we're very bullish about it, and we think that there are going to be more and more GraphQL applications. That's on one side.
Now, specifically with respect to which GraphQL technologies we work with, it is our intention to work with almost all technologies in the space, and Hasura is just one of them. We'll come to why Hasura is interesting. As an overall thesis, the players in the GraphQL space kind of fall into three categories.
The first category is the generic GraphQL solutions that cater to multiple databases underneath and are not specialized in any. Apollo would be a classic example. Apollo works with any REST service or database underneath, and it's up to the end user to write that binding themselves. That gives a lot of flexibility, but there's a little bit of work involved in order to leverage the power of the underlying database.
There is another class which is PostgreSQL-specific: solutions that bring out the power of PostgreSQL. It's great to see them double down on PostgreSQL as well, given the secular trend going on around PostgreSQL. Hasura falls in this category. There are a number of others here, like PostGraphile and Super Graph, but Hasura is a big one there.
And a third category is a set of projects that are building a combined GraphQL plus database play. This is where FaunaDB, I believe, is headed, and people like Dgraph are also working on it.
I would also call out that Prisma is another GraphQL community project that supports multiple databases, and that's something that we're super interested in working with. So overall, excluding the third category, which integrates a database and GraphQL together, we're interested in working with everybody in the first two categories.
So that brings up Hasura, and why Hasura? I think Hasura is very interesting because they really leverage the power of PostgreSQL. The depth shows in the fact that a lot of people that want to pick PostgreSQL as their database want to use Hasura as a solution on top of it. And we are one of the only horizontally scalable and open source PostgreSQL databases out there. If you take out, say, Amazon Aurora, which is a cloud-native database but has slightly different scaling properties compared to Yugabyte, Yugabyte is the only database that can actually support Hasura today the way it is, because of Hasura's extensive use of stored procedures, triggers and a number of other things internally. So we are seeing the demand. We are seeing a lot of people ask us about how they can run a GraphQL solution on top of Yugabyte using Hasura.
The other area that's interesting for us is the Jamstack, where Prisma is naturally evolving. That's turning out to be a slightly different area. But anyway, this is an area that we think is very interesting, that we're watching closely, and where we're working with the folks involved.
Okay, thank you. So with that, if we may go a little bit over time, just to get your quick one- or two-liner about potential new features. I mean, not that you don't already have your plate quite full, but about things like going multi-model, or adding support for a graph analytics engine or analytics in general, going to the edge, or adding machine learning capabilities. Are any of those on your roadmap?
Today, Yugabyte actually has two upper halves. The other API that Yugabyte supports is a Cassandra-compatible API. This is a very easy journey for folks that are already familiar with the NoSQL paradigm, whether it's Apache Cassandra or DynamoDB. So it supports two APIs already. For operations teams and organizations, this is also helping with database consolidation, instead of learning many different ways of running, securing and operating databases. So that is one vector.
The other aspect that you talked about, workloads, is actually a very interesting vector. Although we're starting off from an OLTP, transactional workload angle, people want to do more real-time analytics. That's the HTAP space, if you will, hybrid transactional/analytical processing: being able to do analytics on the transactional database itself. And we're constantly improving the product to handle more and more analytic capabilities as well. But I would say the primary focus is starting off with the transactional workloads and then getting more to the analytics side. That includes work on our query optimizer, query planner, predicate pushdowns, all of that. This is a continuum of R&D work that we will continue to invest in.
I hope you enjoyed the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and
Facebook.