Orchestrate all the Things - Another globally distributed cloud native SQL database on the rise: Yugabyte Raises $30 million in Series B Funding. Backstage chat with CEO and Founders

Episode Date: June 9, 2020

Your good old on-premise SQL database is in terminal decline. A pure-play open-source cloud-native PostgreSQL, with support for Apache Cassandra and GraphQL interfaces, is what you need. Or at least, this is what the Yugabyte crew thinks. The company, founded by Facebook data infrastructure veterans, announced that it has raised $30 million in an oversubscribed Series B round to double down on community and team growth. This is a crowded market, but big enough to be a non-zero-sum game. We connected with Yugabyte founders Kannan Muthukkaruppan and Karthik Ranganathan, and newly recruited CEO Bill Cook, previously of Sun Microsystems and Pivotal, for a deep dive in the company, the funding, and the market. Article published on ZDNet in June 2020.

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. Today's episode features another globally distributed cloud-native SQL database on the rise, Yugabyte. Your good old on-premise SQL database is in terminal decline. A pure-play, open-source, cloud-native PostgreSQL, with support for Apache Cassandra and GraphQL interfaces, is what you need. Or at least, this is what the Yugabyte crew thinks.
Starting point is 00:00:31 The company, founded by Facebook data infrastructure veterans, announced that it has raised $30 million in an oversubscribed Series B round to double down on community and team growth. This is a crowded market, but big enough to be a non-zero-sum game. We connected with Yugabyte founders Kannan Muthukkaruppan and Karthik Ranganathan, and newly recruited CEO Bill Cook, previously of Sun Microsystems and Pivotal, for a deep dive in the company. I hope you will enjoy the podcast.
Starting point is 00:01:02 If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn and Facebook. Well, thanks everyone for making the time to connect and discuss your upcoming news, which is pretty exciting. So, it's the first time we
Starting point is 00:01:19 connect actually, so I thought the best way to do this, both for me and for the people who may be listening, would be to do a little bit of a flashback, let's say, and go back to do a little bit of history. So when was Yugabyte founded, and, you know, what brought the founding team together, and your course up to this point, basically, if you'd like to summarize it in a few words? Do you want to take the lead, Kannan, Karthik, since you guys lived it? And I can talk a bit about why I joined. Yeah, George, a pleasure to talk
Starting point is 00:02:02 to you. To give you a brief history, Yugabyte was founded in 2016. And, you know, Karthik, Mikhail, and me, we founded it. The three of us met at Facebook, where we had the opportunity to work on a lot of Facebook's, you know, high-scale data infrastructure, including having worked on Cassandra
Starting point is 00:02:24 before Cassandra was open-sourced by Facebook, as well as a lot of work on HBase. But the goal was to put some real mission-critical applications like Facebook Messenger on a data tier that was elastic, that was easy to manage and operate, and that could really handle, you know, data center failures and the likes of those challenges, which is now becoming pretty much table stakes across many organizations.
Starting point is 00:02:51 Prior to that, I have a history in databases, having worked at Oracle in the database team. So, yeah, here we found that there's a real need for bringing something as fundamental as a relational database, or a transactional database, if you will, to power online workloads. But to bring something as fundamental as that to the modern cloud, which is, like, built of commodity blocks, it's a shared-nothing architecture, and it really needed a next-generation database that worked well on the cloud and something that matched the enterprise demands of their modern applications as they are making a decision to move to the cloud. And maybe I'll just add some color, George, this is Bill Cook. And I joined recently as CEO. And, you know, from my perspective, I've had a, you know, privileged career in some sense of being able to work at Sun Microsystems almost 20 years.
Starting point is 00:04:11 And then signing on to help Scott Yara build Greenplum in 2006 and then spinning out as Pivotal in 2013 after being acquired by EMC. And so from my perspective, when I got to know a bit about Yugabyte and got to know Kannan and Karthik, it started for me with what Kannan was just explaining. The mission of the company made sense to me. If you think about what we did at Pivotal, really advancing the idea of a platform as a service, i.e. Cloud Foundry and microservices and the Pivotal Labs story around agile development and working with large enterprises to build software in a better way.
Starting point is 00:04:54 In some sense, the database technology and data services in general are somewhat lagging that opportunity, meaning that the applications are moving along, scale out, running multi-cloud, hybrid cloud environments. And the data side of it's trickier. And so the mission that Kannan and Karthik and Mikhail have been on, of really building technology to address that need, really appealed to me. So the market opportunity was certainly there. And then the other aspect was really just about building a great company. I mean, I think the most important thing is that, as a leadership team, we're aligned in kind of mission, vision, and culture and our beliefs. And we just want to build a great company that, you know, the best and brightest, whether you're an engineer or you're a salesperson out in the marketplace, want to join Yugabyte for this mission.
Starting point is 00:05:49 So that's why I'm here. Okay. Thank you. Thank you both. And actually, since you mentioned the building a great company part of things and the cultural aspect, that's a good opportunity for me to ask something which I originally missed. So how big is the company at this point? How many people do you have on your payroll? And I know that it's an open source company, so in a way you
Starting point is 00:06:18 may also want to count people in the broader community, and we'll get to that as we move along in the discussion. But for the time being, let's keep it to people on the payroll. So how many people do work for Yugabyte at this point? Yeah, we are about 50 people right now at Yugabyte. Okay. And that's really not counting the community. Yeah, not counting the community. I think community will be quite a bit bigger,
Starting point is 00:06:46 I mean, depending on which aspects of the community you count. So I think like, for example, we started our community chat, like community forum, like Slack forum, about just over a year ago, a year and a few months. And we've already, I think,
Starting point is 00:07:00 crossed 1,100 people in the forum. These are all technical people helping ask and answer questions as well as good technical discussions, trying to push the direction of distributed SQL and Yugabyte in general. We also recently just crossed 100 contributors to Yugabyte.
Starting point is 00:07:18 So there's the set of people that contribute code and documentation and various types of fixes. So there's that on that side. So yeah, depending on how you look at it, it could be a lot, lot bigger. And then just to add a bit of color, George, if you think about, I know we'll talk about the funding round that we've raised, you know, $30 million that will get announced here in the next week or so with 8VC leading the round. But with that investment, we plan to basically double the employee count over the next 12 to 18 months.
Starting point is 00:07:56 Okay, that's great. You anticipated actually my next question, which was going to be precisely that. So since the opportunity that brought us together is actually the funding, the funding you're going to get, I was going to ask if you want to share a little bit of background on how that came about and, you know, as a second step, what you're going to be using the funding for. You already partially answered.
Starting point is 00:08:22 I'll let you take it from the start. I'll let Kannan answer because he really drove the funding round. They were recruiting me and driving the funding round. He did the funding round, so I'll let him comment. I guess it worked out on both fronts.
Starting point is 00:08:38 Yes, it did. I mean, George, fundamentally we were... The last 18 months have been phenomenal. We have seen tremendous growth in terms of the community aspects that Karthik was referring to. But also, you know, in terms of enterprise customers as well as production deployments of Yugabyte. I mean, I think that's sort of our barometer, like a signal that the demand is massive. Every day on our Slack channel, we get more requests for,
Starting point is 00:09:06 hey, when is this feature coming and that feature coming? So it was clear that market opportunity was there and we wanted to double down and accelerate our investments in product, commercial activities and support and operations. And obviously the investors recognized this market opportunity as well, but also their belief in the team and the product and the way it's been architected. So, in fact, 8VC, which is our lead investor in this round, their partner and CTO is Bhaskar Ghosh, and he's no stranger to enterprise infrastructure.
Starting point is 00:09:41 So he worked in the guts of Oracle and Informix in the early years. Later, he headed data engineering at LinkedIn. So he really understands the data infrastructure space. And 8VC is a team of dynamic investors with entrepreneurial roots. And they are some of the folks behind Palantir. And they saw the market opportunity around a next generation cloud native
Starting point is 00:10:08 database that can help businesses move to the cloud. I don't know if you wanted to add any details. Yeah, I would just add that we're proud of the investors we have, with the newest being 8VC and Wipro as investors joining this round, and including Ravi from Lightspeed and Dell Technologies Capital. So if you look at the roster of investors, they're betting on Yugabyte to be a big play in this big market. And then the one other maybe item that was in the announcement is that my friend and partner, Scott Yara, is joining the board as well. Scott was the founder of Greenplum, and we've been on this journey together
Starting point is 00:10:58 for, oh, I guess almost 15 years now together. So him investing in the company and becoming a board member, I think will be quite helpful as well. Yeah. Yeah. And since we're on the, still on the investment side of things, you know,
Starting point is 00:11:17 earlier in my, my career, let's say I, I wasn't so, so keen on covering investment rounds because, you know, I thought, and I can explain that,
Starting point is 00:11:32 I thought that, okay, so great, they're getting some money behind them, good for them, but I'm not really that interested. But I realized I was wrong for a number of reasons. First, basically, it's a nod, let's say, it's a kind of vote of confidence from the people that are investing in you. And it obviously draws some attention, which is a good opportunity for people like me to dive into the technical underpinnings of whatever companies get the funding. And it's also a good opportunity for people who may not previously have been familiar with the company to get to know you. So something
Starting point is 00:12:05 that I typically see people saying when they're in that position is that, well, we kind of chose which investors we wanted to bring on board, because it's not just about the money. It's also, maybe even primarily, some people go as far as to say, about the kind of doors that getting those investors on board may open. The kind of advice you may get from them, the kind of potential clients they could connect you to and so on and so forth. So with that, I'm going to kind of switch to the more technical, let's say, part of
Starting point is 00:12:44 the discussion. So you know better than me that you are in a relatively crowded market, let's say. So great. Globally distributed databases. You're not the first ones to come up with that idea. It's a good one, obviously. That's why, you know, hence the competition. And since it's a market that I've also kind of dabbled in myself, I know, let's say, at least the basics about it. So to quickly break it down also for the people who may be listening to us.
Starting point is 00:13:15 So you have a number of options there. So first, you have, like, you know, the NoSQL crowd, basically. So Cassandra and all kinds of Cassandra clones and, you know, Mongo and what have you. Most of these types of databases by now have gotten that in one way or another. Then you also have SQL databases, and this is the subdomain, let's say the subsector, that Yugabyte is also in. And again, there you have a number of options. So databases that come from cloud vendors, basically,
Starting point is 00:13:50 like Google Spanner or Microsoft Azure Cosmos DB and so on. And those obviously have some things going for them. Having a massive cloud vendor behind you does help. But on the other hand, it also means vendor lock-in, basically, and no ability to do multi-cloud and hybrid clouds. So just to focus, to narrow in on your specific segment, let's say, so SQL globally distributed databases not backed by a cloud vendor, but independent, basically. Still, there's a few players around there.
Starting point is 00:14:30 I'm not claiming to know all of them, but just off the top of my head, we have Yugabyte, we have CockroachDB, we have FaunaDB, and I'm probably forgetting a couple of others as well. So to tie all that back into the investment part. So clearly this is something that your investors must have looked into. So what would you say it was that made them decide, okay, Yugabyte is, you know, worth investing in, even though it's such a crowded market?
Starting point is 00:15:02 What makes you stand out? Yeah, I'll kick it off and then Kannan and Karthik can dive into the details underneath it. But from my perspective, and I did a similar analysis, George, in looking to join here, I think what you're really speaking to is the market opportunity in the broadest sense, meaning that there is a need here, back to my earlier comments about, in some ways, simplifying the data answer to that question. You're really appealing to the developers on one side of the equation and then the people that
Starting point is 00:15:40 have to operate these systems at scale on the other side. And then obviously the business owners, you know, from a general manager perspective, or the BU leader that wants to drive capabilities and results in a much faster way and a quicker loop. So I think what the market is looking for is obviously the capabilities that you're speaking to, that Kannan and Karthik will dive into in more detail. But also, you know, to have the appeal to the broader community around open source and not being locked into a particular player, whether it's a cloud provider or a vendor, for that matter. So because it is such a big market, they want to have scale and have a future
Starting point is 00:16:26 from a cost perspective that's not going to be prohibitive. And then the other side of the coin is really from the enterprise side, that the enterprises are looking for simplification. This is a hard problem for them, and they have a lot of database sprawl across that three-decade or four-decade era that actually doesn't meet the requirement for where they're headed now. So a combination of those two things is why I think we're positioned particularly well to take on the market and win. So I'll let you guys take it from here. Karthik, you want to cover that? You're on mute. Yeah, I can go. So I think our perspective, right, like
Starting point is 00:17:07 at least I'll give mine. There are two or three reasons why, like, I think your question was, in the face of so much competition, why did our investors, you know, give us the vote of confidence, right? I think the first answer has got to be the team. So as a team, I think we're an incredibly strong team. It's one of the strongest teams around that is cut out for the task at hand. And maybe I can explain. I think Kannan already talked about his deep background at Oracle, working inside the guts of building one of the world's most advanced RDBMS SQL databases.
Starting point is 00:17:44 And he's not alone, right? There's other people that he's worked with in the past at Oracle that have come and joined us and are also working on the insides at Yugabyte, right? That's number one. Number two is that all of us, all three of us co-founders, like Kannan, Mikhail, and myself,
Starting point is 00:17:59 are super fortunate to have been around and worked on distributed databases, NoSQL databases specifically, even before NoSQL was a thing. For example, the first project or one of the first projects that Kannan and I worked on in Facebook early back in 2007 that we kicked it off and worked on it, eventually ended up becoming what is now known as Apache Cassandra. So we built a database there to deal with the exploding amounts of data and open-sourced it. And we didn't know open-source would take off at that time because it wasn't really too much of a thing in databases back then.
Starting point is 00:18:36 Subsequently, we worked on Apache HBase. All three of us founders are HBase committers, along with a number of others in the company. And the unique thing about Facebook at the time was that we not only were builders of the database, we also had to operate and run a massive DBaaS estate. So that ties back, and we're talking about billions and billions of operations per day and many, many petabytes of data,
Starting point is 00:19:00 frequently having to upgrade machines, rolling upgrade software, as well as take care of downtimes, you know, in a totally automated way, like a zero-downtime platform, right? This is some of the critical user data that was flowing through the system. So forward that all the way to, and Kannan and I had a stint at Nutanix as well, learning the ropes of building an enterprise company. So forward that to Yugabyte. And we're a company or a team of people that have seen massively complex technical problems along with having both built and run databases at scale and seen enterprise company building
Starting point is 00:19:36 because Nutanix was relatively small when we joined. So it kind of makes a good backdrop for building a large and successful company. So that's on one side. If you juxtapose that on the other side, and I know you asked about specific competitors, but if you just forget specific competitors for a second and you just think about, hey, where is the market headed and what is the market we're addressing? We're addressing the market of people building applications, OLTP applications. Now if for a second, again, we forget all databases and think about what are the databases
Starting point is 00:20:11 or what is the database that is most often picked in order to build these applications? Well the answer would inevitably have to be PostgreSQL. It just always ends up there. And PostgreSQL's popularity is even better than that of MongoDB right now. So what we felt was a lot of people are using Postgres to build their app. However, their app is being built for the cloud
Starting point is 00:20:34 or a cloud-native environment like Kubernetes, which requires some key characteristics like high availability and fault tolerance. So you're not affected by failures. It requires scale-out, like the ability to add more nodes in order to serve more requests
Starting point is 00:20:50 and scale it back down when needed. And lastly, the ability to go and deploy data across zones, across regions, hybrid deployments, et cetera, right? So if you combine those three with PostgreSQL, what you get is a null set.
Starting point is 00:21:03 There's no solution that exists that can do all of these today, right? Cloud vendors included, because if you look at Google Spanner, it doesn't really support most of the features that PostgreSQL has. Now, if you look at some of the other competitors that we're looking at, while CockroachDB, for example, comes closest in the ability to speak the PostgreSQL language, it is far from having all the features that PostgreSQL has in order to service the RDBMS workload, right? So even though they have a head start on us, as you said, they have been building the database longer. Thanks to some of our architectural choices, for example, starting with the PostgreSQL code base
Starting point is 00:21:41 itself, we support, like, a significantly larger number of more critical and complex relational features compared to them. That's on one side. And Fauna on the other side, I think it also has to do with architectural underpinnings, like picking Calvin, and the way they built their database
Starting point is 00:21:58 supports a different kind of a workload. I mean, I wouldn't characterize it yet, but a different kind of a workload and would find it difficult or would have found it difficult to support full SQL. Hence, they've gone more towards the GraphQL side of things, right?
Starting point is 00:22:12 So, yeah, I'll stop there. I think, Kannan, I don't know if you have things to add. Yeah, I mean, I was going to say, I think the investors essentially see the massive market opportunity in spite of there being multiple players. You know, give or take, it's like a $50 to $60 billion market in the database space. But they're also seeing that in spite of us being a younger company, you
Starting point is 00:22:38 know, the investors who did a deeper due diligence are able to see the architectural choices that Karthik was referring to that really put us in good stead, like being an architectural choice where the lower half is like Google Spanner and the upper half is intelligently reusing the Postgres code base to bring features like stored procedures, triggers,
Starting point is 00:23:00 user-defined functions, all of it exactly with identical semantics to Postgres. And that, along with some of the performance choices that we have made, starting from the language of development being C++, those are key things where we're really building for the long term, and we didn't want to take shortcuts. I mean, these choices will hold us in very good stead in the long term
Starting point is 00:23:24 in building just a very foundational database for the market. Essentially, the takeaway we want to leave with developers is that over time
Starting point is 00:23:33 if they ask themselves like, you know, hey, if I had another database that does everything, every single thing that PostgreSQL does
Starting point is 00:23:40 yet can get deployed in a cloud native fashion and we hope that, you know, and the performance and everything is comparably good on a single node and, like, really shines when you deploy it in a distributed fashion in the cloud,
Starting point is 00:23:52 then that would indeed become the default database for the cloud, right? And it is critical to not deviate too much from semantics to really have the high performance and everything that can be achieved. Okay, thank you. Yeah, you touched upon quite a few points. And, you know, if we had more time,
Starting point is 00:24:10 it would be interesting to maybe go a bit deeper into each of those. But since we don't, I think I'm going to just wrap up by basically going into your future plans. And in terms, again, you know, we already talked about how you want to grow the company and so on, which makes perfect sense. And I also agree, by the way, that, you know, this is a huge market and not a zero-sum game.
Starting point is 00:24:35 So probably, you know, there's enough room for everyone. I was just curious about your differentiation because, you know, you can't help but notice that you and some of the competition are pretty close in your offering. So I was wondering how you may be able to frame it. If I'm a CTO and looking to choose between your offering and some competitors, the points may be quite fine. I think, sorry, the CTO thing triggered that: we're 100% open source as well.
Starting point is 00:25:05 So we're probably the only database among all of the competition like, you know, that we've talked about today that's 100% open source on the core database. Yeah, yeah, that's probably true to the best of my knowledge at least. But again, you know, that's a big discussion. How much does that really matter and for whom and what really constitutes open source and all of these huge, huge topics. We could be talking about it for a couple of hours and we'd still just be scratching the surface. That's right.
Starting point is 00:25:35 So just to wrap up in the few minutes we have left, something that piqued my interest and one of you also mentioned earlier. You mentioned briefly GraphQL and I saw that you have an interesting partnership going on in that space. I've seen that you partner with a company, it's a vendor called Hasura, which basically enables you to offer a GraphQL interface in addition to the standard SQL that you already offer. And I was wondering if you could say a few words about your rationale for going GraphQL and specifically partnering with Hasura. Yeah, totally.
Starting point is 00:26:18 I think on the GraphQL front itself, we think that GraphQL is increasingly seeing, like, having a lot of momentum. A lot of people are trying to build new-age applications in GraphQL because of the functionality and the convenience that it offers. It's ideally suited for mobile and web and that type of application. So as a space, we're very bullish about it. And we think that there's going to be more and more GraphQL applications. So that's on one side. Now, specifically with respect to which GraphQL technologies we work with, it is our intention to work with almost all technologies in the space, right? And Hasura
Starting point is 00:26:56 is just one of them. And we'll come to why Hasura is interesting. So as an overall thesis in the GraphQL space, right? Like the players in the GraphQL space kind of fall in three categories. The first category is the generic GraphQL, like the GraphQL solutions that cater to multiple databases underneath. And they're not specialized on any, right? So these would be like Apollo, for example, like a classic example. Now, Apollo works with any REST API or database underneath, and it's up to the end user to kind of write that binding themselves. It gives a lot of flexibility, but there's a little bit of work in order to leverage the power of
Starting point is 00:27:37 the underlying database. There is another class, which is PostgreSQL-specific, the ones that bring out the power of PostgreSQL. And it's great to see them double down on PostgreSQL as well, given there's a, you know, secular trend going on around PostgreSQL. And Hasura falls in this category, right? There's a number of others here, like PostGraphile and Super Graph,
Starting point is 00:27:58 and there's a number of others, but Hasura is like a big one there, right? And a third category is a set of projects that are building a combined GraphQL plus database play. This is where FaunaDB, I believe, is headed, and people like Dgraph are also working on it.
Starting point is 00:28:16 I would also call out that Prisma is another GraphQL community project that supports multiple databases, and that's something that we're super interested in working with. So overall, excluding the third category, which are also integrating a database and GraphQL together, we're interested in working with everybody in the first two categories. So now that brings up Hasura, and why Hasura? I think Hasura is
Starting point is 00:28:41 very interesting because they really leverage the power of PostgreSQL. And so the depth shows in the fact that a lot of people that want to pick PostgreSQL as their database want to use Hasura as a solution on top of it. And as one of the only horizontally scalable and open source PostgreSQL databases out there. Like, we're the only ones, if you take out, like, say, Amazon's Aurora, which is a cloud native database, but it has a slightly different scaling property compared to Yugabyte. Yugabyte is the only database that can actually support Hasura today the way it is, right?
Starting point is 00:29:19 Because of its extensive use of stored procedures and triggers and a number of other things internally. So we are seeing the demand. We are seeing a lot of people ask us about how they can run a GraphQL-like solution on top of Yugabyte using Hasura. The other one that's interesting for us is the Jamstack, in that area where Prisma is naturally evolving. So that's a slightly different area it's turning out to be. But anyway, so this is an area that we think is very interesting, that we're watching closely and working with the folks here.
Starting point is 00:29:53 Okay, thank you. So with that, if we may go a little bit over time just to get your quick, I don't know, one or two-liner maybe about potentially new features, I mean, not that you don't already have your plate quite full, but about things like going multi-model or adding support for a graph analytics engine or analytics in general, going to the edge, or adding machine learning capabilities. Are any of those in your
Starting point is 00:30:24 roadmap? Today, Yugabyte actually has two upper halves. So the other API that Yugabyte supports is a Cassandra-compatible API. This is a very easy journey for folks that are already familiar with, like, sort of a NoSQL paradigm, whether it's like Apache Cassandra or DynamoDB. So it supports two APIs already. So for the operations teams and organizations, this is also helping them with database consolidation, you know,
Starting point is 00:30:59 instead of, like, learning many different ways of running, securing, and operating databases. So that is one vector. The other aspect that you talked about, like workloads, that's actually a very interesting vector. Although we're starting off from an OLTP and a transactional workload angle, people want to do more real-time analytics, the HTAP space, if you will, like being able to do analytics on their transactional database itself,
Starting point is 00:31:24 the hybrid transactional analytical processing. And we're constantly improving the product to handle more and more of like analytic capabilities as well. But I would say primary focus is starting off with the transactional workloads and then getting more to analytics side. So that includes work in our query optimizer, query planner, predicate pushdowns, all of that. This is a continuum of R&D work that we will continue to invest in.
Starting point is 00:31:51 I hope you enjoyed the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
