Orchestrate all the Things - Open source growth and venture capital investment: data, databases, challenges and opportunities. Featuring Runa Capital Principal Konstantin Vinogradov
Episode Date: July 7, 2021
Open source software used to be poorly understood by commercial forces, and it's often approached in a biased way. A new generation of investment funds goes to show that things are changing. Open source software is a lot of things to a lot of people. For some people -- engineers, mostly -- it's a way to work on their passion, be involved in a community, and give something back to the world. For others -- business people, mostly -- it's a way to grow projects organically, and sell software without actually investing too much in sales. It's a nuanced topic, and we've tried to explore it from many angles. From the business angle, exploring open source vendors' relationships with hyperscalers, mostly Google and Amazon. From the contributor and license engineering angle, exploring different models for commercial open source projects. And from the community angle, exploring metrics for community health and value generation evaluation. Today, we explore open source software (OSS) and its commercialization from yet another angle - the investment angle. There are a couple of venture capital firms out there that seem to be ahead of the curve in terms of their understanding of, and investment in, commercial open source companies. Runa Capital is one of them, and we caught up with Konstantin Vinogradov, Runa Capital Principal, who shared views, findings, and outlook for commercial OSS. Article published on ZDNet.
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
Open source software used to be poorly understood by commercial forces
and it's often approached in a biased way.
A new generation of investment funds goes to show that things are changing.
Open source software is a lot of things to a lot of people.
For some people, engineers mostly, it's a way to work on their passion, be involved
in a community, and give something back to the world.
For others, business people mostly, it's a way to grow projects organically and sell
software without actually investing too much in sales.
It's a nuanced topic and we've tried to explore it from many angles.
From the business angle, exploring open source vendor relationships with hyperscalers, mostly Google and Amazon. From the
contributor and license engineering angle, exploring different models for
commercial open-source projects. And from the community angle, exploring metrics
for community health and value generation evaluation. Today we explore
open-source software and its commercialization from yet another
angle, the investment angle. There are a couple of venture capital firms out there that seem to
be ahead of the curve in terms of their understanding of and investment in commercial open source
companies. Runa Capital is one of them and we caught up with Konstantin Vinogradov, Runa
Capital Principal, who shared views, findings, and an outlook for commercial open-source software.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
Well, I'm Principal of Runa Capital.
I'm based in London, and I'm mostly focused on enterprise software, different deep tech solutions, and fintech.
And I joined Runa quite a long time ago, more than eight years ago, in 2012.
Runa is a venture fund, started in 2010.
We focus on three areas.
First is cloud business applications.
Second is different deep tech technologies, including complex software stuff, for example, open source, cloud infrastructure, security, machine learning applications, and, recently, quantum computing.
Third is [...] by most people during their life.
So it's education, financial services, and healthcare.
They're extremely regulated.
They have a lot of legacy.
And that's why we believe that there are a lot of things that can be built there with a modern tech stack to dramatically change the markets.
So we focus on these three areas.
We were founded by serial tech entrepreneurs
who created a few large software companies. One company is called Parallels. This is a
Seattle-based company partially acquired by Ingram Micro and Corel. And maybe you
know their application called Parallels Desktop. It's on every fourth Mac in the world.
It actually allows you to run Windows-based applications on macOS, and
you can find it in every Apple store in the world. The second company founded by a senior partner is
a company called Acronis. This is a Swiss-Singaporean company focused on data protection,
and they recently raised yet another round of funding at a $2.5 billion valuation.
They have headquarters based in Schaffhausen in Switzerland. And this is where the money came from for our
first fund. But we also have a lot of LPs from the United States and
Europe, a lot of them from the hosting industry, tech
entrepreneurs and so on. We are focused mostly on the Series A
stage, and we invest from $1 million up to $10 million per company. And we
have a lot of offices across the globe, so we
have people in Palo Alto, Berlin, and Moscow, and I'm based in London. Also, I think a few
important things about us: we are quite a tech-savvy team, so almost all people in our fund know how
to code. We were software developers, or maybe physicists and computer scientists,
and that is why we focus on tech-heavy solutions and open source.
Thank you.
I'd like to pick up on your last sentence actually, so open source, which is also the occasion why we are having
this discussion today actually. And to give a little bit of background, I'm also very
much interested in open source and in all its flavors, including commercial open source
as well. And as part of that, recently, I did a number of write-ups.
So one of them was about an analysis that some people did on GitHub regarding,
they tried to explore different contributions on open source projects,
where each of them is coming from, and so on and so forth.
And there was another one more recent in which I had a chat with the lead from the
CHAOSS project, and this is a project under the Linux Foundation which does community health
analytics basically. I found that a very interesting topic, and while doing the
research for this article I also came across your open source index, and we're
going to discuss that in more detail. That kind of piqued my attention, and so I got
to learn about you and the work that you do and so before we actually go into the specifics of the
open source index that you produce let's talk a little bit about open source
and commercial open source.
And, well, how do you position yourself, let's say,
as a venture capital firm in this regard?
Do you actually have any investments
at the moment in open source companies,
and how do you see that playing out in general?
So, you know, there's a number of open questions around this: can open source and commercial success
co-exist? What kind of licensing is the right one? And so on and so forth.
Well, we invested in our first open source company in 2010. It was a company called CloudLinux.
Actually, it was one of the first companies in our portfolio as a venture fund.
And then in 2011, we invested in the Nginx web server, for example,
which was acquired in 2019 for $700 million.
And we started investing in open source before it became mainstream.
And it was quite hard at that moment because not all venture funds understood how it works.
And the common perception on the market was, well, how do you make money on something which is totally free and could be easily copied and so on?
But, well, since that time market perception has changed. Now even public markets understand how open source works and value open source companies at quite a high price.
Since that time we invested, for example, in MariaDB.
This is another company in our portfolio, created by MySQL founder Michael Widenius.
Also we invested in ATEM, Forest Admin and a few other companies which have some open source repositories.
Overall, we believe a lot in this model, and we see that the market has changed.
I think the new wave for open source started in 2017,
when people started to care more about decentralization on the one side,
and about cost efficiency and customization on the other side.
So I think there are a few waves in how software develops.
There could be a wave of, let's say, simplicity.
And SaaS was some kind of wave of simplicity, when people migrated from
on-premises things to the cloud
managed by some other vendor.
And then we have another movement in the other direction:
people like to have some stuff customized,
maybe again on premises or in hybrid mode,
but now they would like to have not just a one-size-fits-all solution in the cloud, but something which suits exactly their needs.
And this is where open source comes into play, and it could be much more customizable, more cost efficient.
And also it has different go-to-market strategies.
So it's a bottom-up approach: initially you start by winning a lot of developers, and then
you upsell developers and you actually go to some enterprises and try to sell them something
on top of free open source.
And it's much easier to sell something on top of open source because you
actually skip the POC stage.
And when you try to sell to the CTO or CIO of
a large enterprise, most probably they're already using your open source
product and you just sell it.
So it's just another way of selling software.
And it meets market needs sometimes better than pure SaaS products.
Yeah, I have to say that what you say makes sense to me personally and it's something that I see lots of people who are knowledgeable in this space also say.
So, well, there has to be some truth in that, I guess.
And I agree it's a different strategy.
So as you very correctly pointed out,
it goes mostly bottom up,
as opposed to the traditional top down,
in terms of sales.
And then you basically try to upsell.
So-
Yeah, there are actually many trends affecting the rise of open source.
First, I think the main one is the shift of power of decision-making closer to developers,
but there are also other supporting trends.
For example, overall concerns about data privacy and some kind of,
well, it could even be called, I think, digital
nationalism, when you don't have, you know, one infrastructure for all, and then you have some
laws like GDPR or the Californian privacy law and so on. So when you try to build more
infrastructure within some countries, or maybe even within regions,
then you need something open source,
where you don't care about
the region of the vendor.
Well, you could use this open
source however you want, disregarding
the country of its vendor.
Yeah, that's right.
That's another concern that
can come into play, especially for organizations that for one reason or another have to work on premises. [...] strategy for open source software, which is, well, they provide
an entirely open version which people can use on premises, but then the usual strategy is to upsell
their SaaS product. So there may be some issues there regarding that specific aspect.
Well, I think the open core model is the answer to
this concern, where you can still
sell something
working on-premises or in a private
cloud and earn
money in the open source world.
Yeah, I was referring
specifically to the data
sovereignty aspect that you mentioned
earlier. So, I mean, as long as you use
the open source
version on-premises, it's fine. Potentially, an issue may arise if you want to upgrade,
let's say, to the paid version, which usually is hosted in the cloud somewhere. So, this
advantage that you originally have for organizations that want to have on-premises doesn't exist
anymore. So, you fall back to
the typical cloud user scenario in a way.
Well, frankly speaking, I believe in hybrid models, for example. Well, I have a good example:
our portfolio company called Forest Admin. This is a SaaS company, but they also have
an open source package, and it actually allows you to build internal applications very quickly, like
admin panels for example. They have an open source part which is installed on premises and works with
the data of the enterprise customer, and they have a SaaS with the user interface managing all this data,
but the data never leaves the perimeter of the customer. So it works like SaaS, this is a subscription model, but Forest
Admin never has access to customer data, and that's why, well, everybody's
happy with it.
Okay, yeah, yeah, you're right. That's a good example of a kind of
mixed approach, let's say. It's just not the typical one.
Yeah, true.
All right, so
now that we've sort of established, let's say, the advantages of open source and your specific interest in it, I was wondering if you [...]. It looks like an index of open source
companies and it looks very much aligned to what you as an organization are aimed at. So early
stage open source companies basically and it tracks the growth. Would you like to expand a
little bit and just tell us when did you start doing that
and what's the rationale behind it?
Sure.
It started about a year ago when I played with GitHub data.
And, well, actually I was trying to leverage GitHub data to find fast-growing startups.
And when I was doing this, I thought, okay, we have so many different SaaS metrics and
benchmarks for what is good and what is bad for a SaaS company, but we have nothing about open source companies.
How could we define whether it is a fast-growing company or not?
We have no data about this. And I thought that, well, if I have a lot of data from GitHub, I could create some kind of open source growth benchmarks
to relate the growth of repositories, at least by stars.
And then actually I published this article,
and then I just added the top 20 startups
which had the fastest growth in that quarter.
And then eventually,
well, I started to receive a lot of interest from different VCs and startups on this theme.
And I thought, well, it's something which is really interesting for the community.
And I thought, okay,
why not share it with the community?
Then the next quarter, I launched the ROSS Index,
the Runa Open Source Startup Index, which lists the top 20 startups
every quarter by the growth of their GitHub stars. It's constructed
in the following way. We collect data about all GitHub
repositories having more than 1,000 stars, every day actually. Then we take the
first and the last day of the quarter, calculate the difference, calculate the relative growth rate,
and then we sort all repositories by this growth rate. And then manually, as an investment team,
we go through this list from the top and try to find startups.
Actually, we define a startup as a product-focused organization which
has any meaningful connection with the repo, was founded less than 10 years ago,
raised less than $100 million in known funding, and was not acquired
and did not go public, at least as of this quarter.
And then we just publish this list every quarter.
And, well, it's mostly used by open source founders, by developer
community managers, and by other venture capitalists,
because it actually highlights the most interesting things every quarter:
what is hot in the open source space in terms of startups.
Well, usually companies mentioned in this list
raise funding maybe one or two months after publishing,
if they didn't raise before.
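The ranking procedure described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Runa's actual pipeline: the `Repo` fields, the constant names, and the sample repositories are all hypothetical. The star threshold is applied to the end-of-quarter count, which the conversation does not specify, and the manual review ("product-focused", "meaningful connection with the repo") is reduced to the three numeric criteria that were mentioned.

```python
from dataclasses import dataclass

# Thresholds as stated in the conversation; the names are made up for this sketch.
MIN_STARS = 1000                 # only repositories with more than 1,000 stars are tracked
MAX_COMPANY_AGE_YEARS = 10       # founded less than 10 years ago
MAX_FUNDING_USD = 100_000_000    # raised less than $100 million in known funding
TOP_N = 20


@dataclass
class Repo:
    name: str
    stars_start: int             # stars on the first day of the quarter
    stars_end: int               # stars on the last day of the quarter
    company_age_years: float
    known_funding_usd: float
    acquired_or_public: bool


def relative_growth(repo: Repo) -> float:
    """Relative star growth over the quarter: (end - start) / start."""
    return (repo.stars_end - repo.stars_start) / repo.stars_start


def is_startup(repo: Repo) -> bool:
    """Numeric part of the startup definition; the real review is manual."""
    return (
        repo.company_age_years < MAX_COMPANY_AGE_YEARS
        and repo.known_funding_usd < MAX_FUNDING_USD
        and not repo.acquired_or_public
    )


def rank(repos: list[Repo], top_n: int = TOP_N) -> list[Repo]:
    """Sort tracked repositories by relative growth, then keep the startups."""
    # Assumption: the 1,000-star threshold is checked at the end of the quarter.
    tracked = [r for r in repos if r.stars_end > MIN_STARS and r.stars_start > 0]
    by_growth = sorted(tracked, key=relative_growth, reverse=True)
    return [r for r in by_growth if is_startup(r)][:top_n]


# Hypothetical sample data, not real repositories.
repos = [
    Repo("alpha/db", 1000, 1500, 3, 2_000_000, False),
    Repo("beta/ui", 4000, 4400, 2, 500_000, False),
    Repo("gamma/big", 2000, 4000, 12, 0, False),    # fast growth, but too old
    Repo("delta/small", 200, 900, 1, 0, False),     # below the star threshold
]

print([r.name for r in rank(repos)])
```

Sorting by relative rather than absolute growth is what lets a small repository outrank a large one, which matches the point made later in the conversation about caring more about growth than about absolute numbers.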
Okay.
Well, it's interesting. I was meaning to ask you about the actual methodology that you apply and the actual metrics that you take into account when deriving
this index and I was kind of imagining that you probably have an elaborate formula but
I don't know you kind of brought me down to earth with what you said that you, it sounds like you just count GitHub stars. Is there nothing,
no other metrics that you take into account for this index?
Well, at the current moment, for this index, we use only GitHub stars, but of course it's not the
best metric. It's not the only metric. Internally, we use a lot of metrics
and we could create a lot of things.
But GitHub stars are a very simple and understandable thing.
So we started with this, and it worked.
Also, for example, recently I published a post
about contributor analysis, because,
well, actually GitHub stars and contributors
are two parts of the same system, where
contributors are the most advanced users, people who actually make pull requests and
contribute code, and GitHub stars are just likes from people who know about the repo and
actually liked it.
So these are just two parts of one funnel.
So I focused more on contributors recently.
And, for example, it's very hard to compare repositories by
contributors because different kinds of repos attract different people.
And it's not fair to compare some, you know, JavaScript framework with
databases because the entry level
is very different. That's why, for example, I focused on databases and compared
only databases by contributors: by their growth, by active contributors,
by new contributors. For example, I even use some decentralization metric just to compare databases
and distinguish one-man-show projects from community projects.
Yeah, I think you're spot on there.
And this is something that other people I have had discussions with regarding ways
to measure different aspects of open source projects have also said.
They all mentioned that, well, it's very hard to have like a level playing field.
And what you just mentioned, the fact that, well, by nature, projects are different, not
just in terms of
their actual subject so a javascript framework and a database are very very different but also
within the same subdomain let's say so two databases for example two open source databases
they can have very different directives and very different culture around the way they commit, they make
commits for example. So if you just count the number of commits, it may be misleading because
well, there are different quality standards and so on and so forth.
You know, as VCs, we care less
about absolute numbers, we care more about growth. And even if there are different cultures of
commits in different repos,
we could probably compare them by growth
rates.
Because, you know, as VCs, we don't
care about the actual value,
we care about the first derivative.
The first derivative
doesn't have any
constants.
That's why it's simple math which works for this.
Okay, okay, I see. Yeah, so in your case that may be a bit easier
than for people who are interested in, for example, measuring the overall health or the
overall value generated by a specific organization.
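The "first derivative" argument can be written out briefly. This is a sketch with notation that does not come from the conversation: $s(t)$ is a repository's count (stars, commits) at time $t$, $c$ is a constant offset, and $k > 0$ is a constant factor standing in for a project's commit culture, assumed not to change over the quarter from $t_0$ to $t_1$.

```latex
\frac{d}{dt}\bigl(s(t) + c\bigr) = \frac{ds}{dt},
\qquad
g = \frac{k\,s(t_1) - k\,s(t_0)}{k\,s(t_0)} = \frac{s(t_1) - s(t_0)}{s(t_0)}
```

A constant offset drops out of the derivative, and a constant scale factor drops out of the relative growth rate $g$. If the commit culture itself shifts during the period, $k$ is no longer constant and the comparison is only approximate.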
Yeah, well, actually, I believe in specialized indexes as well.
So very soon we'll launch our first index for open source databases,
where they are compared by active contributors, new contributors,
and other meaningful metrics, not only by stars.
Okay, that's very interesting. And when
you do that, well, please let me know. I also have a personal
interest in that, as an analyst.
Yeah, well, I could actually show you some kind of
preview of this index. It is still in development, but I think it
could be interesting for you to take a look.
Overall, the database index consists of GitHub
repositories having more than 500 GitHub stars, more than 50
GitHub forks, and more than five active contributors in the last
12 months. And then you could play
with different metrics. For example, you could take a look at the databases with the most new
contributors in the last 12 months. And you can see, for example, that the top database in the world
right now by this parameter is ClickHouse.
Okay.
Or, for example, you could compare them by total contributors and see that the most
contributed databases are Spark, SQL and Elasticsearch by total number of contributors.
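The eligibility rule and the "play with different metrics" idea described above can be sketched as follows. This is an illustration only: the `DatabaseRepo` fields, the function names, and the sample entries (`db-a` to `db-d`) are hypothetical placeholders, not data from the actual index.

```python
from dataclasses import dataclass


@dataclass
class DatabaseRepo:
    name: str
    stars: int
    forks: int
    active_contributors_12m: int   # active contributors in the last 12 months
    new_contributors_12m: int      # first-time contributors in the last 12 months
    total_contributors: int


def eligible(repo: DatabaseRepo) -> bool:
    """Inclusion thresholds as stated: >500 stars, >50 forks, >5 active contributors."""
    return repo.stars > 500 and repo.forks > 50 and repo.active_contributors_12m > 5


def rank_by(repos: list[DatabaseRepo], metric: str) -> list[DatabaseRepo]:
    """Rank eligible repositories by any numeric field, highest first."""
    return sorted(
        (r for r in repos if eligible(r)),
        key=lambda r: getattr(r, metric),
        reverse=True,
    )


# Hypothetical sample data with placeholder names.
sample = [
    DatabaseRepo("db-a", 12000, 900, 40, 25, 300),
    DatabaseRepo("db-b", 8000, 700, 60, 10, 900),
    DatabaseRepo("db-c", 450, 80, 10, 9, 20),     # too few stars
    DatabaseRepo("db-d", 3000, 40, 12, 30, 50),   # too few forks
]

print([r.name for r in rank_by(sample, "new_contributors_12m")])
print([r.name for r in rank_by(sample, "total_contributors")])
```

The same eligible set can produce a different leader for each metric, which is consistent with the point made in the conversation that no single metric should be used on its own.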
Yeah, that's very interesting, and just to give you a little bit of background on why I said this is also very timely and relevant for me.
So I'm working on a report on graph databases
as we speak, actually.
And well, some of them are open source
and I thought it would be a good idea
to incorporate some kind of metrics
on their contributors and things like that.
And well, obviously, as you can imagine, because it's not a straightforward thing, as we just
mentioned, well, some of them kind of were like, OK, so why this metric and not the other
one?
And I was like, well, I know that this is not an accurate metric, but I had to use some
kind of proxy.
And if you have a better proxy,
both myself and them would be much happier.
Yeah, I have
the same position. There is no one
metric, and it should be
evaluated with some holistic
approach.
There should be no
one single metric to use.
Yeah, indeed.
So you mentioned earlier that you update this index every quarter,
if I'm not mistaken.
Yeah.
And it should be around the time that you're going to publish an update, right?
Yes.
I plan to publish it at the beginning of July.
Okay. Do you already have an idea of what this update will include and any insights that you'd care to share?
Yeah, I could actually tell you about a few companies which I believe will be in this index. Not about all of them. It's not, you know, stabilized yet
because we still have eight days.
And sometimes
some open source companies
could publish something
on Product Hunt
and then just skyrocket
and be in the top.
But I believe that,
for example,
in this index
there will be a few categories,
with some companies
who were already in this
index in past issues.
Some companies will be
totally new, but overall I think
there will be, for
example, a few interesting
companies in the cloud app
infrastructure space, like for example
Supabase and Appwrite,
both companies building
back-end as a service.
There will be a new company in our list called SpaceCloud. This is an API platform for serverless apps.
There will be some interesting companies
in the data infrastructure space, for example, QuestDB.
This is quite a new time series database competing with kdb+, their proprietary competitor.
And there will be a few interesting apps and alternatives to existing products.
For example, I believe there will be Mattermost, which is actually an open source alternative to Slack.
The best open source startup in the last quarter by growth is NocoDB,
an open source alternative to Airtable.
And there will also be Appsmith, which is an alternative to Retool.
It actually helps to build internal apps.
There are some blockchain companies like Uniswap and Storch.
Overall, blockchain is getting more and more popular in the open source space
because, well, they have a very good fit.
Overall, most of the companies are not from the United States.
If we include Israel in Europe, then 45% will be from Europe, 15 [...]
[...] company that personally attracted my interest in the last version of the index that you published, Athens Research.
And for people who may be listening and are not necessarily familiar, Athens Research is like an open source version of Roam Research, which is a new kind of tool for note-taking and personal knowledge
graphs, or at least this is what we like to call them in a group I'm a member of.
And what specifically got my attention about Athens Research was the fact that they had
very, very minuscule funding in the index that I
saw, something like, I don't know, maybe a hundred thousand dollars or something in
that area, and yet they were one of the fastest growing companies.
So I was wondering, well, A, if you are in any way familiar with the company
and whether you have an opinion on them and what they're doing, and B, if you can use that as an example to explain this apparent discrepancy. And also I wonder if you
know if this can be maintained. So wouldn't they need to actually get some funding to maintain
their growth and go to market at some point?
Well, I'm quite familiar with the company. We haven't talked
with the founders yet. But overall, well, I think right now
you don't need a lot of resources to do such kind of
application. It's not infrastructure, it could be a
one-person project. It became very popular after Roam Research and similar applications because, well, people
like network-based note-taking actually, because people think in that way.
And, well, I think it requires a lot of funding, frankly speaking, for the scaling, but I think
it's getting more and more popular because people tried a lot of simple things, and there was some kind of wave of
simple software, simple from a usage point of view, and now people shift more to customizable,
maybe more complex software. And, well, there is another event that happened recently: Notion opened its API,
and there are a lot of integrations emerging with Notion.
I think that Athens Research could play
in this segment as well.
So it will be some kind of network-based infrastructure
for note-taking, and then you could connect
to it other lifestyle systems,
like maybe some logs or whatever you could connect to via Zapier or IFTTT.
And then people could build some kind of lifestyle operating system
on top of Athens Research.
But frankly speaking, it's not the kind of software I would like to invest in,
just because it's more consumer-driven
and there are
plenty of note-taking applications on the market,
and it's all about
personal choice. For example, I don't use
Athens Research or Roam Research,
I use other kinds of
note-taking apps and I'm totally happy
with them. I believe that it's
about some segmentation.
Maybe they will find their target audience and then it will be very cool software for them.
But I can't see how it solves a massive and painful problem.
Okay, and this particular example aside, would you say that a company that has
this little funding, basically,
would be able to go
to market at some point, or
is getting some substantial
funding a requirement?
Well, it happens.
If we talk about open source,
it doesn't require a lot of
cloud infrastructure, right? Because
it runs on people's premises, and you don't need to
host a high-scale SaaS solution. That's why actually your only
costs are development costs. Well, for lifestyle applications,
it could be not a very high cost, frankly speaking. Let's say for large enterprise software, you need to develop a lot of
enterprise features, you need to provide an SLA and so on.
So, I mean, it requires a lot of resources.
And for personal applications, I mean, it could be very simple and still
solve some problem.
So, I mean, they could raise a lot of money to invest, for example, in marketing, but maybe they could become very viral and not require any additional funding.
For example, like Zapier, right? I think there was a small round at the very beginning, and then they just recently sold some secondaries to Sequoia, but they raised nothing actually for many years.
Okay.
And there are some examples of companies on the market which became very large companies
without raising significant funding. And infrastructure now is very cheap.
Yeah, well, but you still need it, as you pointed out. So going back to what you mentioned earlier about this new index that you're working on,
which focuses specifically on open source databases.
This is kind of my area in a way.
And one of the discussions I often find myself having with people these days is that, well, there seems to
be too much choice in this market at the moment, like too many databases to choose
from. So I'm wondering if you see a consolidation happening. Many people I
talk to are of this opinion. And what do you think would be the defining characteristics for the companies that survive this consolidation?
Well, I believe there will be many databases
because there are plenty of use cases
and plenty of different organizations
with different requirements.
So, of course, there are some very large database vendors
like Redis or MongoDB or Elastic or other guys nobody knows about.
But there is a long tail of niche databases focused on some particular use case.
And, well, I'm not sure about consolidation of these markets. So, for example, well, there is Neo4j for graphs.
There is, for example, ClickHouse for column-based use cases.
Let's say it that way.
And maybe a few others.
But, well, I see no value in uniting all databases into one package, actually,
under a few vendors.
I think there will be different vendors for different use cases,
for different data structures.
Okay.
And there still will be free adoption of databases with a long tail of just free users.
There will be some enterprise databases like Redis or Mongo or Snowflake and so on.
Okay.
So do you have any favorites that you're interested in?
Well, you're free not to answer, actually, because I guess if you did, you would be giving away information.
And I'm not sure you want to do that.
But, well, let's rephrase.
Do you have any favorites besides the one that you're considering for investment?
Well, I think I could be pretty open with this. I think that, for example, the ClickHouse phenomenon is very
interesting because, well, it's verified.
I mean, this is a very fast database with growing
adoption.
And I think that's very interesting.
I think another company in the database space is QuestDB, which most probably will be in the
index for this quarter, which is also a time series database.
They're actually quite competitive with ClickHouse in some use cases.
Just because I think time series databases are a very interesting and fast-growing
segment right now.
Enterprises generate a lot of
time series data, and the previous wave of software
is not capable of processing it efficiently.
That's why enterprises will shift to some new wave
of time series databases and to analytical databases
in general.
There is another example
of, let's say, interesting database startups. I think HDB is quite an interesting database from the United States, built on top of Postgres but [...] like ORM and so on.
That's great, thanks.
Some of them were on my radar and others not,
so thank you for giving me some pointers and directions.
Thanks. It's been really interesting, and thank you for sharing your ideas
and what drove you actually to create this index, which, as you mentioned,
apparently other people have discovered as well.
So I'm really looking forward to seeing the new version.
Yeah, it will be available, I think, the first days of July.
I hope you enjoyed the podcast.
If you like my work,
you can follow Linked Data Orchestration on Twitter, LinkedIn and Facebook.