Orchestrate all the Things - The biggest investment in database history, the biggest social network ever, and other graph stories from Neo4j. Featuring CEO and Co-founder Emil Eifrem
Episode Date: June 17, 2021
A $325 million Series F funding round, bringing Neo4j's valuation to over $2 billion. A social network of 3 billion people, distributed across 1000 servers. The latter is a demo, the former is not. But both are real signs that the graph market and Neo4j are getting seriously big.
If you're into the market and investment side of things, how does a Series F funding round as part of a $325 million investment led by Eurazeo and GV (formerly Google Ventures), bringing Neo4j's valuation to over $2 billion sound? Pretty impressive, probably. If you're into the technology and applications side of things, how does a Neo4j demo of a social network application with 3 billion people, running queries designed to test the limits of graph query languages and databases across a 1000 node cluster sound? Equally impressive, probably.
Graph database vendor Neo4j CEO and co-founder Emil Eifrem is announcing the former and showcasing the latter today, at the company's annual virtual conference NODES. We caught up with Eifrem to get a taste of things to come.
Article published on ZDNet
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
A $325 million Series F funding round
bringing Neo4j's valuation to over $2 billion.
A social network of 3 billion people distributed across
1,000 servers. The latter is a demo, the former is not.
But both are real signs that the graph market and Neo4j are getting
seriously big.
If you're into the market and investment side of things, how does a Series F funding round as
part of a $325 million investment led by Eurazeo and GV, formerly Google
Ventures, bringing Neo4j's valuation to over $2 billion sound?
Pretty impressive, probably.
If you're into the technology and applications side of things,
how does a Neo4j demo of a social network application
with 3 billion people, running queries
designed to test the limits of graph query languages and databases
across a 1000 node cluster sound?
Equally impressive, probably.
Graph database vendor Neo4j CEO and co-founder Emil Eifrem
is announcing the funding and showcasing the demo
later today at the company's annual virtual conference NODES.
We caught up with Eifrem to get a taste of things to come.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration
on Twitter, LinkedIn, and Facebook.
And yeah, obviously, the place to start is the funding.
So last time we spoke, I kind of had it in the back of my mind.
So there's been lots of funding rounds in the graph database space lately
and lots of interest and, you know, growth in all kinds of ways.
So I was thinking, okay, so last time you had the funding was a couple of years back, I guess.
And compared to the funding that, you know, other vendors are getting these days,
it was, you know, it was fine.
But I was wondering, like, okay, so if you want to maintain your growth,
basically, it kind of sounds plausible you would be reaching out for another one.
And, well, there you go.
So I guess you can start by saying a few words.
And, well, basically, who's funding you?
And why did you choose them?
And why did they choose you?
How much is their funding and all of that?
Yeah.
No, it was obviously, since it's just been a couple of weeks,
I guess, since we last spoke, it was obviously also top of mind for me
because I was going through kind of the final motions as we last spoke.
But yeah, it's a good question.
So just maybe backing up a little bit. So we're announcing
a new funding round, which is the biggest investment in database history, actually,
which is obviously pretty significant for Neo4j as a company, but for the graph database space
as a whole as well, which obviously is an area that you spend a lot of time on, George.
And it's led by two firms, a global European origin private equity firm called Eurazeo,
and then GV, which used to be called Google Ventures.
And just a little bit more
color on it to your question.
So 2020 was, and we touched on this
when we last spoke, 2020 was obviously like a human tragedy
on many levels, right? And challenging on other
levels, with high growth in terms of headcount and
not being able to meet people and whatnot. From a commercial
perspective, it was an amazing year for Neo4j, right? I always feel bad saying it, if you know what I mean,
because 2020, with the pandemic, was such a tough year generally speaking. But it was just a
really strong year. And so we ended the year with a lot of money in the bank, burning very little,
and so we didn't need to go out and raise money.
But then you start looking around and, to your point, one aspect, maybe a smaller aspect, was that multiples in the market were just crazy, right? And there's a lot of fundraising going on with very rich valuations and things like that, right? And that's a little bit opportunistic. But as an entrepreneur, you have to be opportunistic too, right?
And so that's kind of one thing.
But maybe the more important one is kind of the broader trend that we're seeing.
And the backdrop really here is that, look, the relational database has been around forever.
And then, you know, NoSQL happened.
Like the previous decade was all defined, you know,
in the database space by the growth of non-relational databases.
And, you know, we grew from, you know, when I grew up as a developer
in the mid-90s, there was four or five databases to choose from, right?
The relational, it was all a vendor choice,
like the relational vendors and then MySQL, right?
That was really it.
Then maybe a little bit of Postgres and whatnot,
but really a handful of choices.
To today, where there's, I don't know,
300 plus databases tracked on DB-Engines
or something like that.
But so really the 2010s was defined by that. The 2020s, that's going to be
the great unbundling of the database market, which is the biggest market in all of enterprise
software. Every single thing that everyone is doing in our digital daily life ultimately lands
in a database. And it's all unbundling in this decade. And what we've seen happen in the past few years
is that a few leaders have emerged.
And we're talking a handful, right?
Not the 10 or 20 or 30 or 40 or 50 that raised money,
but a few, right?
And this funding round really is a testament
to Neo4j and graphs being a significant part
of that future data landscape.
And so really, those were the things that came together
that caused us to say, you know what, now is the time.
Okay.
You mentioned participation from GV,
which used to be called Google Ventures.
And that kind of got my attention, to be honest.
I mean, well, I wasn't aware of the
other VC leading your round as well, but well, in all ways it looks like a traditional VC.
You mentioned a couple of things about them, like being Europe-based and whatnot, but,
you know, a traditional VC. Google Ventures, on the other hand, is kind of special. And
the parallel I want to draw there is that, well, basically, Google does not have a horse
in the race, so to speak. They don't really have their own graph database, as opposed to
the other two cloud mega vendors. And I know that you also have a partnership with Google. So it's interesting. I interpreted that
as a kind of vote of confidence for Neo4j coming from Google.
Yeah, I think that's right.
I mean, it's very clear that the GV team
makes its own independent kind of financial decisions, right?
Their job is to invest in
companies and make money out of that and so on and so forth. That's at least how I look at them.
But of course they're part of Google, they're part of Alphabet, and the fact that we had this really
strong partnership with GCP, I'm sure it didn't hurt, let's put it that way.
Okay. Good. One other thing that you touched upon was valuations,
basically. And that's also something that we didn't get to discuss the last time when we
covered the broader landscape in graph, let's say. Or actually, we only touched on the periphery,
let's say, of valuations. And you mentioned valuations that other vendors are getting.
And I think part of what I saw in the preview of the announcement
you're going to make is the valuation of Neo4j, which is, if I recall correctly,
something like two billion.
So I wanted to take the opportunity to discuss a little bit the prospect of the market at large using this valuation as a kind of proxy.
Since, well, I guess you're the kind of the de facto leaders in this market.
So I guess what does this say about the market at large is the question.
Yeah, I mean, it's a good question.
Yeah, so we are now for the first time talking publicly about our valuation. It's over $2 billion, right? And I guess the way that I think about it, at least, I put it in the context of that broader shift, right? It's back to the database market being the biggest market in enterprise software. It's currently, it depends on who you ask, but probably around $50 billion,
right? And it's projected to grow rapidly to about $100 billion in just a few years,
right? We're talking like 2024, 2025, you know. And then there's nothing, of course, if you think about the secular trends of data and the value that people are getting from data,
it's not going to slow down. So it's going to keep growing and become this massively valuable, increasingly valuable market.
And the growth is all from the new.
The relational database, as you and I have talked about before, it's going to be around when you retire and I retire.
In fact, it's going to be around when you and I die.
And people will still stick data in the relational database, right? So it's not going away. But if you think about the growth, that's all coming from the
new databases, right? And I would include NewSQL in that, by the way. And so then, so the way that
I think about this is, I believe I am, I'm unrepentant in my optimism around the opportunity and the potential of the
graph database. And I think we've only scratched like the tiniest part of the surface, you know,
so far about graph adoption, even when clueful people think about graph databases, and I'm
talking to like CTOs of the other database vendors, for example,
like just about as clueful as you can be, they still just think about graph databases as these,
the use cases that are driven by performance, right? And there's many of them, and there's
more and more every day because the world is becoming increasingly connected. So
there's more and more connected data, right? So therefore, there's more and
more of those types of use cases. But there's also a massive value in graph databases when it comes
to developer productivity, because most domains actually are connected already. And if we're able
to live up to that promise, which is ultimately a product surface and a developer experience promise,
then I think we can become a significant chunk
of that market that is emerging.
And then you add on top of that the fact that Neo4j
uniquely can address graph data science needs, right?
If you think about kind of the emerging leaders here,
you think about Redis Labs, you think about like a MongoDB, you think about who knows who's going to be the winner in NewSQL.
Like they don't have a play.
They have a developer play, right?
Operational data store, developers building applications.
But data scientists don't use those databases.
They go to those databases to get data out of there and put in their real tools to get value.
They take data and they put it into Neo4j to get value out of it, right?
And so then you add that data science use case to it.
And I believe that graphs can become a significant part of that new data landscape. So then if you take the two billion dollars in context of that,
I think it's actually a fairly representative and fair number.
Okay, cool. So the other area I wanted to touch upon in the discussion was future plans,
basically. And it's kind of a typical question to ask someone who's just gotten
a massive funding round: so how are you going to spend all that money? I think I kind of know at
least part of the answer, because we've discussed your future plans before, which I'm going to
outline from memory. So basically, keep growing both yourselves and the market. You mentioned
hiring more engineers.
So I think an additional 100 until the end of the year
was the goal that you mentioned
and making the product easier to use.
And well, you also just touched upon developer productivity,
which I guess directly relates to that
and expanding your offerings,
your offering to data scientists,
which is another area that you've been pushing.
So I was wondering if there's anything I'm missing
which doesn't fit that bill,
or if you want to just add color to those areas.
No, I think you nailed it.
Do you want a job as the CEO of Neo4j?
Well, I'll think about it.
I think you're doing a good job.
No, I think you nailed it.
It's about investing in that.
I think of it maybe in three buckets.
If we take a more kind of narrow view,
like a 2021, you know, type next 12 months type of view,
I think of it in three buckets.
It's investing more behind our cloud portfolio.
And, you know, we're actually going to
make some, I think, pretty exciting announcements in that area at our developer conference, NODES,
which will have happened when people listen to this podcast, on June 17. But the recordings will
be on our website, obviously. And so that's kind of one big area.
And then you nailed it with graph data science, right?
That's such an important early but high growth area for us.
And there's a lot of amazing things we can do still there.
And then the third one is, I would probably call it market reach, right?
And so this comes down to some of the stuff that you touched on with like cloud platform partnerships, for example, right?
And partnerships with SIs, right?
Systems integrators, like big global systems integrators, right?
But it also comes down to just, you know, this is maybe a weird statement to use in pandemic times, but feet on the ground, as it were, in areas like APAC,
where we have a little bit of a presence, but it's growing really fast.
And so we're investing a lot and intend to invest a lot
in growing fast in Asia.
So those are, I think, the three buckets that I see.
Okay. Yeah, actually, it's a good point about APAC. And, you know, because I keep an eye on
those things, I saw that I think it was recently that you opened an office there and new hires and
beefing up the team, I guess.
It's a very fast-growing part of our business.
And it's also, I mean, we saw this clearly in kind of the early parts of the pandemic, right?
Because we're predominantly North American and European in terms of our presence,
headquartered in the Valley, but with engineering in Europe.
And really, our field is present in both North America and Europe, and then a little bit in Asia. But then you saw all of a sudden that, you know, with things like the pandemic, these continents
adjusted at different times, right? And so it's just really good. We've always been blessed with
having a massively diverse customer base in terms of verticals. So when something like this happens,
yes, cruise ships, we actually have cruise ship operators as customers, and we have hospitality as customers.
Right? Every single time you stay at a Marriott or one of its brands, and it's the biggest hotel chain in the world,
that's calculated with Neo4j.
Obviously, that's not going to be a growth area for us in pandemic times,
right? But then other areas will, right? So that was great in
terms of verticals. And now I think it's also become really important to see that in terms of
just geography.
Okay. Another thing you mentioned previously was the NODES conference. And if I'm
not mistaken, actually the announcement of your funding is timed to coincide with...
That's exactly right.
So I'm wondering if there's anything else you're going to be announcing,
or even if not, if you just want to give a roundup of what people can expect.
Yeah, totally.
So it's our annual developer conference.
It's the Neo4j Online Developer and Data Scientist Expo and Summit.
It just happens to be NODES as an acronym, of course.
It's on June 17, so it should be on the same day as this article goes out.
I don't know exactly when you will release the podcast, George.
It's the biggest graph event in the world. Last year, we had 13,000
registered people. We think we're going to burst right through that this year as well.
And it's practitioner focused. That's the D, the developer and data scientist.
And so there's no marketing fluff. It's just very hands-on, tangible: here's what people have done with Neo4j that
worked, here's what people have done that didn't work, here's how you can use this new feature. It's
very hands-on stuff.
In my keynote, I mean, we'll of course announce the
funding. But as excited as I am about the funding, I'm probably equally excited about a demo that we're going to be doing in the keynote.
And so what we've done is we've taken this demo application that we've written, which is basically a social network graph with more members than Facebook.
It has 3 billion people in it.
And you will know this, George,
it uses the LDBC schema for this, right?
And so you know it well,
which is this Linked Data Benchmark Council,
which is a collaboration on creating tools
and data sets to use in benchmarking graph databases.
And so we used that, but we made it run
across a thousand-shard database.
So it has more than a trillion relationships
sharded across over a thousand servers,
but all executing as a single graph database,
which is running super complex, very
graphy, low latency queries that return in tens of milliseconds or even less.
And they're actually using the LDBC queries, so you know that those queries
are designed to torture a graph database. It is not like simple,
primitive, just get this one property or something like that.
It's over a trillion relationships.
We even run some queries that are graph global.
So they touch all 1,000 shards and returning in tens of milliseconds, which I think is a pretty mind-blowing demo.
And just to give you a little bit more behind-the-scenes
color on that: when we spun it up, and we did it all in less
than a month, where, you know, just generating the dataset ends up being the hardest thing, but
when we spun up all the 1,000 shards, we couldn't, because Amazon ran out of servers in that availability zone. It was too many servers.
And then at scale, when it was running, the first bill was $96,000 per day.
Just to give you... maybe I'll revisit my use-of-funds answer from
earlier.
And then they ended up optimizing and running it
on smaller instances, so it ended up being cheaper.
But that was kind of the initial one.
And so this, I think, is a real accomplishment
and a real achievement that we feel really, really happy about.
And we're going to demo it live running,
and then we've open sourced it all
or it's going to be released on June 17.
So there's a GitHub repo where you can just take it,
all the queries are there,
all the data is there pre-generated
because that ends up being the long pole in the tent
for something like this.
So anyone can get up and running with it,
running a thousand-shard, trillion-relationship
graph with low-latency queries.
Yeah, that's pretty impressive and I can appreciate it because,
well, I've personally been involved in something similar, nowhere near the scale that you run it,
but actually like 10 years ago or something, I met a few of the researchers that actually devised that benchmark
and those queries and the data generators.
And yeah, I know what you mean.
I mean, it was a pain just generating, just running the code to generate the data.
Well, you know, it was a pretty involved process, let's say.
Maybe, you know, hopefully the code has evolved a little bit since then, but well, you still need parallelism
and you still need to consume lots of resources.
So yeah, I can appreciate it was a tough exercise
and well, kudos to whoever managed to do it in a month.
Yeah, yeah.
It was a huge task.
And then if you marry that up with another thing
that we're announcing at NODES, which is the Aura free tier.
So we're going to launch a free tier of Aura.
And Aura is our cloud service where, you know, as a developer, you can just sign up for free, get up and running, zero dollars, free forever.
And start building your application or play around with Neo4j at zero cost.
And if you take that all the way with kind of our Aura Professional, the self-serve,
swipe a credit card, you know, 50 bucks per month type of a thing, and then take it all
the way through Aura Enterprise, which is now used by some of the biggest customers
on the planet, right?
For really deep enterprise-wide mission-critical deployments.
And then you marry it up with this absolute global planet-scale trillion relationship,
a thousand shard Neo4j single database.
You add that up, that full spectrum, I think, covers really cradle to grave
everything that you need for graphs.
Yeah, well, I should also mention that this may
be kind of an answer, let's say, to an oft-cited criticism I've been hearing about Neo4j, which is,
well, yeah, I mean, it works great on a single node, but if you start distributing it, it's not so great anymore. But, well, this kind of demo,
well, it's not really a demo at that scale,
but let's call it demo.
It sounds pretty impressive and maybe, well,
can serve as an answer to those criticisms, I guess.
Yeah, and just the fact that it's going to be running,
we're going to show it running,
and then there's a GitHub repo
where people can run it themselves, right?
And I think that's going to be a really important thing
to be able to point people to in the future.
Okay, well, yeah, it sounds like you've been keeping happily busy.
I mean, besides the obvious fact that just getting a funding round
obviously takes lots of negotiation and due diligence and whatnot.
On the developer front as well,
it sounds like you have some nice cool things to show to people.
It's a really exciting time in the space.
And I think that's the broader thing.
And you and I spoke a
little bit about this with Alicia on the last podcast. I think you said it's the great
vindication of George Anadiotis or something like that, right? And for those of us who've been,
you know, talking about graph databases for a long time, just seeing the recognition that
they're getting, and I'm not even talking about funding rounds and stuff like that,
but just the massive usage that we see, like real people,
be it individual developers all the way up to massive companies,
going all in on graphs in a way that we just didn't see
outside of the consumer web.
That happened in the consumer web in the late 90s, early 2000s.
That's what built Google.
That's what built Facebook.
And that's what built Twitter.
You know, that big bet on graph as a technology.
But we haven't seen it at this scale
in the enterprise until now.
So I think it's a really exciting,
you know, part of the graph space
right now at this time.
I hope you enjoyed the podcast.
If you like my work, you can follow Linked Data Orchestration
on Twitter, LinkedIn, and Facebook.