Orchestrate all the Things - Open source growth and venture capital investment: data, databases, challenges and opportunities. Featuring Runa Capital Principal Konstantin Vinogradov

Episode Date: July 7, 2021

Open source software used to be poorly understood by commercial forces, and it's often approached in a biased way. A new generation of investment funds goes to show that things are changing. Open source software is a lot of things to a lot of people. For some people -- engineers, mostly -- it's a way to work on their passion, be involved in a community, and give something back to the world. For others -- business people, mostly -- it's a way to grow projects organically, and sell software without actually investing too much in sales. It's a nuanced topic, and we've tried to explore it from many angles. From the business angle, exploring open source vendor relationships with hyperscalers, mostly Google and Amazon. From the contributor and license engineering angle, exploring different models for commercial open source projects. And from the community angle, exploring metrics for community health and value generation evaluation. Today, we explore open source software (OSS) and its commercialization from yet another angle - the investment angle. There are a couple of venture capital firms out there that seem to be ahead of the curve in terms of their understanding of, and investment in, commercial open source companies. Runa Capital is one of them, and we caught up with Konstantin Vinogradov, Runa Capital Principal, who shared views, findings, and outlook for commercial OSS. Article published on ZDNet. Image: Shutterstock

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. Open source software used to be poorly understood by commercial forces and it's often approached in a biased way. A new generation of investment funds goes to show that things are changing. Open source software is a lot of things to a lot of people. For some people, engineers mostly, it's a way to work on their passion, be involved in a community, and give something back to the world.
Starting point is 00:00:30 For others, business people mostly, it's a way to grow projects organically and sell software without actually investing too much in sales. It's a nuanced topic and we've tried to explore it from many angles. From the business angle, exploring open source vendor relationships with hyperscalers, mostly Google and Amazon. From the contributor and license engineering angle, exploring different models for commercial open-source projects. And from the community angle, exploring metrics for community health and value generation evaluation. Today we explore open-source software and its commercialization from yet another
Starting point is 00:01:05 angle, the investment angle. There are a couple of venture capital firms out there that seem to be ahead of the curve in terms of their understanding of and investment in commercial open source companies. Runa Capital is one of them and we caught up with Konstantin Vinogradov, Runa Capital Principal, who shared views, findings, and an outlook for commercial open-source software. I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook. Well, I'm Principal of Runa Capital. I'm based in London, and I'm mostly focused on enterprise software, different deep tech solutions, and fintech.
Starting point is 00:01:44 And I joined Runa quite a long time ago, more than eight years ago, in 2012. Runa is a venture fund, started in 2010, and we focus on three areas. First is cloud business applications. Second is different deep tech technologies, including complex software, for example, open source, cloud infrastructure, security, machine learning applications, and quantum computing recently. The third area is [inaudible] by most people during their life, so it's education, financial services, and healthcare. They're extremely regulated.
Starting point is 00:02:31 They have a lot of legacy. And that's why we believe that there are a lot of things that can be built there with a modern tech stack to dramatically change the markets. So we focus on these three areas. We were founded by serial tech entrepreneurs who created a few large software companies. One company is called Parallels. This is a Seattle-based company partially acquired by Ingram Micro and Corel. And maybe you know their application called Parallels Desktop. It's installed on every fourth Mac in the world. It actually allows you to run Windows-based applications in macOS and
Starting point is 00:03:05 you can find it in every Apple store in the world. The second company founded by a senior partner is a company called Acronis. This is a Swiss-Singaporean company focused on data protection, and they recently raised yet another round of funding at a $2.5 billion valuation. They have headquarters based in Schaffhausen in Switzerland. And this is where the money came from for our first fund. But we also have a lot of LPs from the United States and Europe, a lot of them from the hosting industry, tech entrepreneurs and so on. We are focused mostly on the Series A stage; we invest from $1 million up to $10 million per company. And we
Starting point is 00:03:43 have a lot of offices across the globe, so we have people in Palo Alto, Berlin, Moscow, and myself based in London. Also, I think a few important things about us: we are quite a tech-savvy team, so almost all people in our fund know how to code. We were software developers, or maybe physicists and computer scientists in the past, and that is the reason why we focus on technology solutions and open source. Thank you. I would like to pick up on your last sentence actually, so open source, which is also the occasion why we are having
Starting point is 00:04:28 this discussion today actually. And to give a little bit of background, I'm also very much interested in open source in all its flavors, including commercial open source as well. And as part of that, recently, I did a number of write-ups. So one of them was about an analysis that some people did on GitHub, in which they tried to explore different contributions to open source projects, where each of them is coming from, and so on and so forth. And there was another, more recent one, in which I had a chat with the lead from the CHAOSS project, and this is a project under the Linux Foundation which does community health
Starting point is 00:05:15 analytics basically. I found that a very interesting topic, and while doing the research for this article I also came across your open source index, and we're going to discuss that in more detail. That kind of piqued my attention, and so I got to learn about you and the work that you do. So before we actually go into the specifics of the open source index that you produce, let's talk a little bit about open source and commercial open source. And well, how do you position yourself, let's say,
Starting point is 00:05:55 as a venture capital in this regard, whether you actually have any investments at the moment in open source companies and how do you see that playing out in general? So you know there's a number of open questions around so can open source and commercial success co-exist? What kind of licensing is the right one and so on and so forth? Well we invested in our first open source company in 2010. It was a company called Cloud Linux. Actually, it was one of the first companies in our portfolio as a venture fund.
Starting point is 00:06:30 And then in 2011, we invested in Nginx Web Server, for example, which was acquired in 2019 for $700 million. And we started investing in open source before it became mainstream. And it was quite hard at that moment because not all venture funds understood how it works. And the common perception on the market was, well, how to make money on something which is totally free and could be easily copied and so on. But, well, since that time market perception has changed. Now even public markets understand how open source works and value open source companies at quite a high price. Since that time we invested, for example, in MariaDB. This is another company in our portfolio, created by MySQL founder Michael Widenius.
Starting point is 00:07:22 Also we invested in ATEM, Forest and a few other companies which have some open source repositories. Overall, we believe a lot in this model, and we see that the market has changed. I think the new wave for open source started in 2017, when people started to care more about decentralization on the one side, and about cost efficiency and customization on the other side. So I think there are a few waves in how software develops. There could be a wave of, let's say, simplicity. And SaaS was some kind of wave of simplicity, when people migrated from
Starting point is 00:08:07 on-premises things to the cloud managed by some other vendor. And then we have another movement in the other direction: people like to have some stuff customized, maybe again on premises or in hybrid mode, but now they would like to have not just a one-size-fits-all solution in the cloud, but something which suits exactly their needs. And this is where open source comes into play, and it could be much more customizable, more cost efficient. And also it has different go-to-market strategies.
Starting point is 00:08:46 So it's a bottom-up approach: initially you start to win a lot of developers, and then you upsell developers and you actually go to some enterprises and try to sell them something on top of free open source, and then it's much easier to do. It's much easier to sell something on top of open source because you actually skip the POC stage. And when you try to sell to the CTO or CIO of a large enterprise, most probably they're already using your open source product and you just sell it.
Starting point is 00:09:19 So it's just another way of selling software. And it meets market needs sometimes better than pure SaaS products. Yeah, I have to say that what you say makes sense to me personally and it's something that I see lots of people who are knowledgeable in this space also say. So, well, there has to be some truth in that, I guess. And I agree it's a different strategy. So as you very correctly pointed out, it goes mostly bottom-up, as opposed to the traditional top-down
Starting point is 00:09:58 in terms of sales. And then you basically try to upsell. So- Yeah, there are actually many trends affecting the rise of open source. First, I think the main one is the shift of decision-making power closer to developers, but there are also other supporting trends. For example, overall concerns about data privacy and some kind of, well, it could even be called, I think, digital
Starting point is 00:10:25 nationalism, when you don't have, you know, one infrastructure for all, and then you have some laws like GDPR or the Californian privacy law and so on. So when you try to build more infrastructure within some countries, or maybe even within regions, then you need something open source, where you don't care about the region of a vendor, but, well, you could use this open
Starting point is 00:10:56 source however you want, regardless of the country of its vendor. Yeah, that's right, that's another concern that can come into play, especially for organizations that for one reason or another have to work on premises. There is a common strategy with open source software, which is, well, they provide an entirely open version which people can use on premises, but then the usual strategy is to upsell their SaaS product. So there may be some issues there regarding that specific aspect. Well, I think the open core model is the answer to
Starting point is 00:11:46 this concern. So when you still could sell something working on-premises or in private cloud and earn money in open source world. Yeah, I was referring specifically to the data sovereignty aspect that you mentioned
Starting point is 00:12:01 earlier. So, I mean, as long as you use the open source version on-premises, it's fine. Potentially, an issue may arise if you want to upgrade, let's say, to the paid version, which usually is hosted in the cloud somewhere. So, this advantage that you originally have for organizations that want to have on-premises doesn't exist anymore. So, you fall back to the typical cloud user scenario in a way. Well, frankly speaking, I believe in hybrid models, for example. Well, I have a good example
Starting point is 00:12:34 of our portfolio company called Forest Admin. This is a SaaS company, but they also have an open source package, and it allows you to build internal applications, like admin panels for example, very quickly. They have an open source part which is installed on premises and works with the data of the enterprise customer, and they have a SaaS with the user interface managing all this data, but data never leaves the perimeter of the customer. So it works like SaaS, this is a subscription model, but Forest Admin never has access to customer data, and that's why, well, everybody's happy with it. Okay, yeah, yeah, you're right. That's a good example of a kind of mixed approach, let's say. It's just not the typical one. Yeah, true. All right, so
Starting point is 00:13:23 now that we've sort of established, let's say, the advantages of open source and your specific interest in it, I was wondering if you figure out, it looks like an index of open source companies and it looks very much aligned to what you as an organization are aimed at. So early stage open source companies basically and it tracks the growth. Would you like to expand a little bit and just tell us when did you start doing that and what's the rationale behind it? Sure. It started about a year ago when I played with GitHub data. And well, actually I was trying to leverage GitHub data to find fast growing startups.
Starting point is 00:14:18 And when I was doing this, I thought, okay, we have so many different SaaS metrics and benchmarks, what is good and what is bad for a SaaS company, but we have nothing about open source companies. How could we define whether it is a fast-growing company or not? We have no data about this. And I thought that, well, if I have a lot of data from GitHub, I could create some kind of open source growth benchmarks to rate the growth of repositories, at least by stars. And then actually I published this article and then I just added the top 20 startups which had the fastest growth in that quarter.
Starting point is 00:15:03 And then eventually, well, I started to receive a lot of interest from different VCs or startups on this theme. And I thought, well, it's something which is really interesting for the community. And I thought, okay, why not share it with the community? Then the next quarter, I launched the ROSS Index, the Runa Open Source Startup Index, which lists the top 20 startups every quarter by their growth of GitHub stars. So it's constructed
Starting point is 00:15:37 in the following way. We collect data about all GitHub repositories having more than 1000 stars, every day actually. Then we take the first and the last day of the quarter, calculate the difference, calculate the relative growth rate, and then we sort all repositories by this growth rate. And then manually, as an investment team, we go through this list from the top and try to find startups. Actually, we define a startup as a product-focused organization which has any meaningful connection with the repo, was founded less than 10 years ago, raised less than $100 million in known funding, and which was not acquired
Starting point is 00:16:23 or went public at least this quarter. And then we just publish this list every quarter. And well, it's mostly used by open source founders, by developers, community managers, by other venture capitalists, because it actually highlights the most interesting things every quarter: what is hot in the open source space in terms of startups. Well, usually companies mentioned in this list, they raise funding maybe one or two months after publishing.
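The ranking procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Runa Capital's actual pipeline: the class and function names are hypothetical, the sample star counts are invented, and applying the 1,000-star threshold to the start-of-quarter count is a guess, since the conversation does not say which day the threshold is checked on. The startup criteria are shown as a simple predicate, although the team applies them manually.

```python
from dataclasses import dataclass


@dataclass
class RepoSnapshot:
    """Star counts for one repository (hypothetical structure)."""
    name: str
    stars_start: int  # stars on the first day of the quarter
    stars_end: int    # stars on the last day of the quarter


def relative_growth(repo: RepoSnapshot) -> float:
    """Relative star growth over the quarter: (end - start) / start."""
    return (repo.stars_end - repo.stars_start) / repo.stars_start


def rank_by_star_growth(repos, min_stars=1000):
    """Keep repos with more than min_stars at quarter start (assumption),
    then sort them by relative growth, fastest first."""
    eligible = [r for r in repos if r.stars_start > min_stars]
    return sorted(eligible, key=relative_growth, reverse=True)


@dataclass
class Organization:
    """Facts used for the startup screen (hypothetical structure)."""
    years_since_founding: float
    known_funding_usd: float
    acquired_or_public: bool


def meets_startup_criteria(org: Organization) -> bool:
    """Criteria quoted in the interview; in practice this screen is manual."""
    return (
        org.years_since_founding < 10
        and org.known_funding_usd < 100_000_000
        and not org.acquired_or_public
    )


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        RepoSnapshot("alpha", 1200, 1800),
        RepoSnapshot("beta", 5000, 6000),
        RepoSnapshot("gamma", 800, 2400),   # below the star threshold
        RepoSnapshot("delta", 2000, 4000),
    ]
    for repo in rank_by_star_growth(sample):
        print(f"{repo.name}: {relative_growth(repo):.0%}")
```

Ranking by relative rather than absolute growth is what lets a small repository outrank a much larger one, which matches the emphasis on growth over absolute numbers later in the conversation.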
Starting point is 00:17:02 If they didn't raise before. Okay. Well, it's interesting. I was meaning to ask you about the actual methodology that you apply and the actual metrics that you take into account when deriving this index and I was kind of imagining that you probably have an elaborate formula but I don't know you kind of brought me down to earth with what you said that you, it sounds like you just count GitHub stars. Is there nothing, no other metrics that you take into account for this index? Well, at current moment for this index, we use only GitHub stars, but of course it's not the best metric. It's not the only one metric. Internally, we use a lot of metrics
Starting point is 00:17:45 and we could create a lot of things. But GitHub stars are a very simple and understandable thing. So we started with this, and it worked. Also, for example, recently I published a post about contributor analysis, because, well, actually GitHub stars and contributors are two parts of the same system, where contributors are the most advanced users, people who actually make pull requests and
Starting point is 00:18:13 contribute code, and GitHub stars are just likes from people who know about the repo and actually liked it. So these are just two parts of one funnel. So I focus more on contributors recently. And, for example, it's very hard to compare repositories by contributors, because different kinds of repos attract different people. And it's not fair to compare some, you know, JavaScript framework with databases because the entry level
Starting point is 00:18:48 is very different. That's why, for example, I focused on databases and compared only databases by contributors: by their growth, by active contributors, new contributors, and, for example, even some decentralization metric, just to compare databases and distinguish one-man shows from community projects. Yeah, I think you're spot on there. And this is something that other people I have had discussions with, regarding ways to measure different aspects of open source projects, have said. They all mentioned that, well, it's very hard to have like a level playing ground.
Starting point is 00:19:39 And what you just mentioned, the fact that, well, by nature, projects are different, not just in terms of their actual subject, so a JavaScript framework and a database are very, very different, but also within the same subdomain, let's say. So two databases, for example, two open source databases, they can have very different directives and very different culture around the way they make commits, for example. So if you just count the number of commits, it may be misleading because, well, there are different quality standards and so on and so forth. You know, as VCs, we care less about absolute numbers, we care more about the growth. And even if there are different cultures of
Starting point is 00:20:26 commits in different repos, we could probably compare them by growth rates. Because, you know, as VCs, we don't care about the actual value, we care about the first derivative. The first derivative
Starting point is 00:20:41 doesn't have any constants. That's why it's simple math which works for this. Okay, okay, I see. Yeah, so in your case that may be a bit easier than for people who are interested in, for example, measuring the overall health or the overall value generated by a specific organization. Yeah, well, actually, I believe in specialized indexes as well. So very soon we'll launch our first index for open source databases, where they are compared by active contributors, new contributors,
Starting point is 00:21:20 and other meaningful metrics, not only by stars. Okay, that's very interesting. And when you do that, well, please let me know. I also have a personal interest in that, as an analyst. Yeah, I could actually show you some kind of preview of this index. It's still in development, but I think it could be interesting for you to take a look. Overall, the database index consists of GitHub
Starting point is 00:21:56 repositories having more than 500 GitHub stars, more than 50 GitHub forks, and more than five active contributors in the last 12 months. And then you could play with different metrics. For example, you could take a look at the databases with the most new contributors in the last 12 months. And you can see, for example, that the top database in the world right now by this parameter is ClickHouse. Okay. Or, for example, you could compare them by total contributors and see that the most contributed databases are Spark SQL and Elasticsearch, by total number of contributors.
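The inclusion rules and the "play with different metrics" idea described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical record layout; the field names and the sample numbers are invented and are not taken from the actual database index.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class DatabaseRepo:
    """Per-repository figures (hypothetical field names)."""
    name: str
    stars: int
    forks: int
    active_contributors_12m: int  # active contributors, last 12 months
    new_contributors_12m: int     # first-time contributors, last 12 months
    total_contributors: int


def is_eligible(repo: DatabaseRepo) -> bool:
    """Thresholds quoted in the interview: more than 500 stars,
    more than 50 forks, more than 5 active contributors in 12 months."""
    return (
        repo.stars > 500
        and repo.forks > 50
        and repo.active_contributors_12m > 5
    )


def leaderboard(repos, metric: str):
    """Eligible repos sorted by the chosen metric, highest first.
    `metric` is the name of any numeric field on DatabaseRepo."""
    return sorted(filter(is_eligible, repos), key=attrgetter(metric), reverse=True)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        DatabaseRepo("db_a", 900, 120, 12, 30, 200),
        DatabaseRepo("db_b", 2000, 300, 40, 10, 900),
        DatabaseRepo("db_c", 400, 80, 9, 50, 60),    # too few stars
        DatabaseRepo("db_d", 700, 60, 5, 99, 99),    # too few active contributors
    ]
    for metric in ("new_contributors_12m", "total_contributors"):
        print(metric, [r.name for r in leaderboard(sample, metric)])
```

Keeping the eligibility filter separate from the sort key mirrors the description: the same set of repositories can be re-ranked by whichever metric is of interest.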
Starting point is 00:22:45 Yeah that's very interesting and just give you a little bit of background why I said this is also very timely and relevant for me. So I'm working on a report on graph databases as we speak, actually. And well, some of them are open source and I thought it would be a good idea to incorporate some kind of metrics on their contributors and things like that. And well, obviously, as you can imagine, because it's not a straightforward thing, as we just
Starting point is 00:23:10 mentioned, well, some of them kind of were like, OK, so why this metric and not the other one? And I was like, well, I know that this is not an accurate metric, but I had to use some kind of proxy. And if you have a better proxy, both myself and them would be much happier. Yeah, I have the same position: there is no one
Starting point is 00:23:32 metric, and it should be evaluated with some holistic approach. There should be no one single metric to use. Yeah, indeed.
Starting point is 00:23:53 So you mentioned earlier that you update this index every quarter, if I'm not mistaken. Yeah. And it should be around the time that you're going to publish an update, right? Yes. I plan to publish it at the beginning of July. Okay. Do you already have an idea of what this update will include and any insights that you'd care to share? Yeah, I could actually tell you about a few companies which I believe will be in this index. Not about all of them. It's not, you know, stabilized yet because we have eight days.
Starting point is 00:24:27 And sometimes some open source companies could publish something in Product Hunt and then just skyrocket and be in the top. But I believe that, for example,
Starting point is 00:24:39 in these things, there will be a few categories with some companies who are already in this index in the past issues. Some companies will be totally new, but overall I think there will be, for
Starting point is 00:24:53 example, a few interesting companies in the cloud app infrastructure space, like for example Supabase and Appwrite, both companies building back-end as a service. There will be a new company in our list called Space Cloud. This is the API platform for serverless apps.
Starting point is 00:25:14 There will be some interesting companies in the data infrastructure space, for example, QuestDB. This is quite a new time series database competing with kdb+, their proprietary competitor. And there will be a few interesting apps and alternatives to existing products. For example, I believe there will be Mattermost, which is actually an open source alternative to Slack. The best open source startup in the last quarter by growth is NocoDB, an open source alternative to Airtable. And there will be also Appsmith, which is an alternative to Retool.
Starting point is 00:26:03 It actually helps to build internal apps. There are some blockchain companies like Uniswap and Storch. Overall, blockchain is getting more and more popular in the open source space because, well, they have a very good fit. Overall, most of the companies are not from the United States. If we include Israel in Europe, then 45% will be from Europe, 15 [inaudible] company that personally attracted my interest in the last version of the index that you published, Athens Research. And for people who may be listening and are not necessarily familiar, Athens Research is like an open source version of Roam Research, which is a new kind of tool for note-taking and personal knowledge graphs, or at least this is what we like to call them in a group I'm a member of.
Starting point is 00:27:14 And what specifically got my attention about Athens Research was the fact that they had a very, very minuscule funding in the index that I saw something like I don't know maybe a hundred thousand dollars or something in that area and yet they were one of the fastest growing companies and so I was wondering well A if you are in any way familiar with the company and whether you have an opinion on them and what they're doing and b if you can use that as an example to explain this apparent discrepancy and also i wonder if you know if this can be maintained so wouldn't they need to actually get some some funding to maintain their growth and go to market at some point well i'm quite familiar with the company. We haven't talked
Starting point is 00:28:06 with the founders yet. But overall, well, I think right now you don't need a lot of resources to do such kind of application. It's not infrastructure; it could be a one-person project. It became very popular after Roam Research and similar applications because, well, people like network-based note-taking, because people think in that way. And, well, I think it requires a lot of funding, frankly speaking, for the scaling, but I think it's getting more and more popular because people tried a lot of simple things and there was some kind of wave of simple software, simple from a usage point of view, and now people shift more to customizable,
Starting point is 00:28:57 maybe more complex software. And well, there is another event that happened recently: Notion opened its API, and there are a lot of integrations emerging with Notion. I think that Athens Research could play in this segment as well. So it will be some kind of network-based infrastructure for note-taking, and then you could connect other lifestyle systems to it, like maybe some logs or whatever you could connect to via Zapier or IFTTT.
Starting point is 00:29:34 And then people could build some kind of lifestyle operating system on top of Athens Research. But frankly speaking, it's not the kind of software I would like to invest in just because it's more consumer-driven and there are plenty of note-taking applications on the market and it's all about personal choice. For example, I don't use
Starting point is 00:29:56 Athens Research or Roam Research and I use other kinds of note-taking apps and I'm totally happy with them. I believe that it's about some segmentation. Maybe they will find their target audience and then it will be very cool software for them. But I can't see how it solves a massive and painful problem.
Starting point is 00:30:18 Okay, and this particular example aside, would you say that a company that has this little funding, basically, would be able to go to market at some point, or is getting some substantial funding a requirement? Well, it happens. If we talk about open source,
Starting point is 00:30:40 it doesn't require a lot of cloud infrastructure, right? Because it runs on people's premises, and you don't need to host a high-scale SaaS solution. That's why actually your only costs are development costs. Well, for lifestyle applications, these could be not very high costs, frankly speaking. Let's say for large enterprise software, you need to develop a lot of enterprise features, you need to provide an SLA and so on. So, I mean, it requires a lot of resources.
Starting point is 00:31:13 And for personal applications, I mean, it could be very simple and still solve some problem. So, I mean, they could raise a lot of money to invest, for example, in marketing, but maybe they could become very viral and don't require any additional funding. For example, like Zapier, right? So, I think there is a small round at the very beginning and then they just recently sold some secondaries to Sequoia, but they raised nothing actually for many years. Okay. And there are some examples of companies on the market which became very large companies without raising significant funding. And infrastructure now is very cheap. Yeah, well, but you still need it as you you pointed out. So going back to what you mentioned earlier about this new index that you're working on,
Starting point is 00:32:11 which focuses specifically on open source databases. This is kind of my area in a way. And one of the discussions I often find myself having with people these days is that well there seems to be too many choice in this market at the moment like too many databases to choose from so I'm wondering if you see a consolidation happening many people I talk to are of this opinion and what do you think would be the defining characteristics for the companies that survived this consolidation? Well, I believe there will be many databases because there are plenty of use cases
Starting point is 00:32:55 and plenty of different organizations with different requirements. So, of course, there are some very large database vendors like Redis or MongoDB or Elastic or other guys nobody knows about. But there is a long tail of niche databases focused on some particular use case. And, well, I'm not sure about consolidation of these markets. So, for example, well, there is Neo4j for graphs. There is, for example, ClickHouse for column-based use cases. Let's say it that way.
Starting point is 00:33:37 And maybe a few others. But, well, I see no value in uniting all databases into one package, actually, under a few vendors. I think there will be different vendors for different use cases, for different data structures. Okay. And there still will be free adoption of databases with a long tail of just free users. There will be some enterprise databases like Redis or Mongo or Snowflake and so on.
Starting point is 00:34:19 Okay. So do you have any favorites that you're interested in? Well, you're free not to answer, actually, because I guess if you did, you would be giving away information. And I'm not sure you want to do that. But, well, let's rephrase. Do you have any favorites besides the one that you're considering for investment? Well, I think I could be pretty open with this. I think that, for example, ClickHouse phenomenon is very interesting because, well, it's verified.
Starting point is 00:34:52 I mean, this is a very fast database with growing adoption. And I think that's very interesting. I think another company in the database space is QuestDB, which most probably will be in the index for this quarter, which is also a time series database. They're actually quite competitive with ClickHouse in some use cases. Just because I think time series databases are a very interesting segment and a very fast-growing segment right now.
Starting point is 00:35:23 People generate, enterprises generate a lot of time series data, and the previous wave of software is not capable of processing it efficiently. That's why enterprises will shift to some new wave of time series databases and to analytical databases in general. There is another example of, let's say, interesting database startups. I think HDB is quite an interesting database from the United States, built on top of Postgres but [inaudible], like an ORM, and so on.
Starting point is 00:36:07 That's great, thank you. Some of them were on my radar and others not, so thank you for giving me some important pointers and directions. Yeah, thanks. It's been really interesting and thank you for sharing your ideas and what drove you actually to create this index, which, as you mentioned, apparently other people have discovered as well. So I'm really looking forward to seeing the new version.
Starting point is 00:36:38 Yeah, it will be available, I think, the first days of July. I hope you enjoyed the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn and Facebook.
