Orchestrate all the Things - The biggest investment in database history, the biggest social network ever, and other graph stories from Neo4j. Featuring CEO and Co-founder Emil Eifrem

Episode Date: June 17, 2021

A $325 million Series F funding round, bringing Neo4j's valuation to over $2 billion. A social network of 3 billion people, distributed across 1000 servers. The latter is a demo, the former is not. But both are real signs that the graph market and Neo4j are getting seriously big. If you're into the market and investment side of things, how does a Series F funding round as part of a $325 million investment led by Eurazeo and GV (formerly Google Ventures), bringing Neo4j's valuation to over $2 billion sound? Pretty impressive, probably. If you're into the technology and applications side of things, how does a Neo4j demo of a social network application with 3 billion people, running queries designed to test the limits of graph query languages and databases across a 1000 node cluster sound? Equally impressive, probably. Graph database vendor Neo4j CEO and co-founder Emil Eifrem is announcing the former and showcasing the latter today, at the company's annual virtual conference NODES. We caught up with Eifrem to get a taste of things to come. Article published on ZDNet

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. A $325 million Series F funding round bringing Neo4j's valuation to over $2 billion. A social network of 3 billion people distributed across 1,000 servers. The latter is a demo, the former is not. But both are real signs that the graph market and Neo4j are getting seriously big.
Starting point is 00:00:29 If you're into the market and investment side of things, how does a Series F funding round as part of a $325 million investment led by Eurazeo and GV, formerly Google Ventures, bringing Neo4j's valuation to over $2 billion sound? Pretty impressive, probably. If you're into the technology and applications side of things, how does a Neo4j demo of a social network application with 3 billion people, running queries designed to test the limits
Starting point is 00:01:02 of graph query languages and databases across a 1000 node cluster sound? Equally impressive, probably. Graph database vendor Neo4j CEO and co-founder Emil Eifrem is announcing the funding and showcasing the demo later today at the company's annual virtual conference NODES. We caught up with Eifrem to get a taste of things to come. I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration
Starting point is 00:01:29 on Twitter, LinkedIn, and Facebook. And yeah, obviously, the place to start is the funding. So last time we spoke, I kind of had it in the back of my mind. So there's been lots of funding rounds in the graph database space lately and lots of interest and, you know, growth in all kinds of ways. So I was thinking, okay, so last time you had the funding was a couple of years back, I guess. And compared to the funding that, you know, other vendors are getting these days, it was, you know, it was fine.
Starting point is 00:02:05 But I was wondering, like, okay, so if you want to maintain your growth, basically, it kind of sounds plausible you would be reaching out for another one. And, well, there you go. So I guess you can start by saying a few words. And, well, basically, who's funding you? And why did you choose them? And why did they choose you? How much is their funding and all of that?
Starting point is 00:02:28 Yeah. No, it was obviously, since it's just been a couple of weeks, I guess, since we last spoke, it was obviously also top of mind for me because I was going through kind of the final motions as we last spoke. But yeah, it's a good question. So just maybe backing up a little bit. So we're announcing a new funding round, which is the biggest investment in database history, actually, which is obviously pretty significant for Neo4j as a company, but for the graph database space
Starting point is 00:03:01 as a whole as well, which obviously is an area that you spend a lot of time on, George. And it's led by two firms, a global European origin private equity firm called Eurazeo, and then GV, which used to be called Google Ventures. And just a little bit more color on it to your question. So 2020 was, and we touched on this when we last spoke, 2020 was obviously like a human tragedy on many levels, right? And challenging on other
Starting point is 00:03:38 levels, with high growth in terms of headcount and not being able to meet people and whatnot. From a commercial perspective, it was an amazing year for Neo4j, right? I always feel bad saying it, if you know what I mean, because 2020 with the pandemic was such a tough year generally speaking. But it was just a really strong year. And so we ended the year with a lot of money in the bank, burning very little, and so we didn't need to go out and raise money. But then you start looking around and to your point, you know, one, maybe a smaller aspect, but one aspect was just that multiples in the market were just crazy, right? And there's a lot of fundraising going on with very rich valuations and things like that, right? And that's a little bit opportunistic. But as an entrepreneur, you have to be opportunistic too, right? And so that's kind of one thing. But maybe the more important one is kind of the broader trend that we're seeing.
Starting point is 00:04:37 And the backdrop really here is that, look, the relational database has been around forever. And then, you know, NoSQL happened. Like the previous decade was all defined, you know, in the database space by the growth of non-relational databases. And, you know, we grew from, you know, when I grew up as a developer in the mid-90s, there were four or five databases to choose from, right? The relational, it was all a vendor choice, like the relational vendors and then MySQL, right?
Starting point is 00:05:07 That was really it. Then maybe a little bit of Postgres and whatnot, but really a handful of choices. To today, where there's, I don't know, 300 plus databases tracked on DB-Engines or something like that. But so really the 2010s were defined by that. The 2020s, that's going to be the great unbundling of the database market, which is the biggest market in all of enterprise
Starting point is 00:05:33 software. Every single thing that everyone is doing in our digital daily life ultimately lands in a database. And it's all unbundling in this decade. And what we've seen happen in the past few years is that a few leaders have emerged. And we're talking a handful, right? Not the 10 or 20 or 30 or 40 or 50 that raised money, but a few, right? And this funding round really is this testament to Neo4j and graphs being a significant part
Starting point is 00:06:03 of that future data landscape. And so really, those were the things that came together that caused us to say, you know what, now is the time. Okay. You mentioned participation from GV, which used to be called Google Ventures. And that kind of got my attention, to be honest. I mean, well, I wasn't aware with the
Starting point is 00:06:27 other VC leading your round as well, but well, in all ways it looks like a traditional VC. You mentioned a couple of things about them like being Europe-based and whatnot, but you know, a traditional VC. Google Ventures, on the other hand, is kind of special. And I'm wondering, the parallel I want to draw there is that, well, basically, Google does not have a horse in the race, so to speak. They don't really have their own graph database as opposed to the other two cloud mega vendors. And I know that you also have a partnership with Google. So it's interesting. I interpreted that as a kind of vote of confidence for Neo4j coming from Google. Yeah, I think that's right.
Starting point is 00:07:15 I mean, it's very clear that the GV team makes its own independent kind of financial decisions, right? Their job is to invest in companies and make money out of that and so on and so forth. That's at least how I look at them. But of course they're part of Google, they're part of Alphabet, and the fact that we had this really strong partnership with GCP, I'm sure it didn't hurt, let's put it that way. Okay. Good. One other thing that you touched upon was valuations, basically. And that's also something that we didn't get to discuss the last time when we covered the broader landscape in graph, let's say. Or actually, we only touched on the periphery,
Starting point is 00:08:02 let's say, of valuations. And you mentioned valuations that other vendors are getting. And I think part of what I saw in the preview of the announcement you're going to make is the valuation of Neo4j, which is, if I recall correctly, something like two billion. So I wanted to take the opportunity to discuss a little bit the prospect of the market at large using this valuation as a kind of proxy. Since, well, I guess you're kind of the de facto leaders in this market. So I guess what does this say about the market at large is the question. Yeah, I mean, it's a good question.
Starting point is 00:08:42 Yeah, so we are now for the first time talking publicly about our valuations. It's over $2 billion, right? And I guess the way that I think about it, at least I put it in context of that broader shift, right? It's back to the database market being the biggest market in enterprise software, it's currently, it depends on who you ask, but probably around $50 billion, right? And it's projected to grow rapidly to about $100 billion in just a few years, right? We're talking like 2024, 2025, you know. And then there's nothing, of course, if you think about the secular trends of data and the value that people are getting from data, it's not going to slow down. So it's going to keep growing and become this massively valuable, increasingly valuable market. And the growth is all from the new. The relational database, as you and I have talked about before, it's going to be around when you retire and I retire. In fact, it's going to be around when you and I die. And people will still stick data in the relational database, right? So it's not going away. But if you think about the growth, and that's all coming from the
Starting point is 00:09:51 new databases, right? And I would include NewSQL in that, by the way. And so then, so the way that I think about this is, I believe I am, I'm unrepentant in my optimism around the opportunity and the potential of the graph database. And I think we've only scratched like the tiniest part of the surface, you know, so far about graph adoption, even when clueful people think about graph databases, and I'm talking to like CTOs of the other database vendors, for example, like just about as clueful as you can be, they still just think about graph databases as these, the use cases that are driven by performance, right? And there's many of them and there's becoming more and more every day because the world is becoming increasingly connected. So
Starting point is 00:10:41 there's more and more connected data, you know, right? So therefore there's more and more connected data. So therefore, there's more and more of those types of use cases. But there's also a massive value in graph databases when it comes to developer productivity, because most domains actually are connected already. And if we're able to live up to that promise, which is ultimately a product surface and a developer experience promise, then I think we can become a significant chunk of that market that is emerging. And then you add on top of that the fact that Neo4j uniquely can address graph data science needs, right?
Starting point is 00:11:19 If you think about kind of the emerging leaders here, you think about Redis Labs, you think about like a MongoDB, you think about who knows who's going to be the winner in NewSQL. Like they don't have a play. They have a developer play, right? Operational data store, developers building applications. But data scientists don't use those databases. They go to those databases to get data out of there and put it in their real tools to get value. They take data and they put it into Neo4j to get value out of it, right?
Starting point is 00:11:52 And so then you add that data science use case to it. And I believe that graphs can become a significant part of that new data landscape. So then if you take the two billion dollars in context of that, I think it's actually a fairly representative and fair number. Okay, cool. So the other area I wanted to touch upon in the discussion was future plans, basically. And it's kind of a typical question to ask someone who's just gotten a massive funding round: so how are you going to spend all that money? I think I kind of know at least part of the answer, because we've discussed before about your future plans, which I'm going to outline from memory. So basically, keep growing both yourselves and the market. You mentioned
Starting point is 00:12:44 hiring more engineers. So I think an additional 100 until the end of the year was the goal that you mentioned and making the product easier to use. And well, you also just touched upon developer productivity, which I guess directly relates to that and expanding your offerings, your offering to data scientists,
Starting point is 00:13:03 which is another area that you've been pushing. So I was wondering if there's anything I'm missing which doesn't fit that bill, or if you want to just add color to those areas. No, I think you nailed it. Do you want a job as the CEO of Neo4j? Well, I'll think about it. I think you're doing a good job.
Starting point is 00:13:25 No, I think you nailed it. It's about investing in that. I think of it maybe in three buckets. If we take a more kind of narrow view, like a 2021, you know, type next 12 months type of view, I think of it in three buckets. It's investing more behind our cloud portfolio. And, you know, we actually were going to
Starting point is 00:13:45 make some, I think, pretty exciting announcements in that area at our developer conference NODES, which will have happened, on June 17, when people listen to this podcast. But the recordings will be on our website, obviously. And so that's kind of one big area. And then you nailed it with graph data science, right? That's such an important early but high growth area for us. And there's a lot of amazing things we can do still there. And then the third one is, I would probably call it market reach, right? And so this comes down to some of the stuff that you touched on with like cloud platform partnerships, for example, right?
Starting point is 00:14:29 And partnerships with SIs, right? Systems integrators, like big global systems integrators, right? But it also comes down to just, you know, this is maybe a weird statement to use in pandemic times, but feet on the ground, as it were, in areas like APAC, where we have a little bit of a presence, but it's growing really fast. And so we're investing a lot and intend to invest a lot about growing fast in Asia. So those are, I think, the three buckets that I see. Okay. Yeah, actually, it's a good point about APAC. And, you know, because I keep an eye on
Starting point is 00:15:11 those things, I saw that I think it was recently that you opened an office there and new hires and beefing up the team, I guess. It's a very fast-growing part of our business. And it's also, I mean, we saw this clearly in kind of the early parts of the pandemic, right? Because we're predominantly North American and European in terms of our presence, headquartered in the Valley, but engineering in Europe. And really, our field is present in both North America and Europe, and then a little bit in Asia. But then you saw that all of a sudden that, you know, with things like the pandemic, like these continents, they adjusted at different times, right? And so it's just really good. We've always been blessed with having a massively diverse customer base in terms of verticals. So when something like this happens,
Starting point is 00:16:02 yes, cruise ships, like we actually have cruise ship operators as customers and we have hospitality as customers, right? Every single time you stay at a Marriott or one of its brands (it's the biggest hotel chain in the world, right?), that's calculated with Neo4j. Obviously, that's not going to be a growth area for us in, you know, pandemic times, right? But then, you know, other areas will, right? So that was great in terms of verticals. And now I think it's also become really important to see that in terms of just geography. Okay. Another thing you mentioned previously was the NODES conference. And if I'm not mistaken, actually the announcement of your funding is timed to coincide with...
Starting point is 00:16:46 That's exactly right. So I'm wondering if there's anything else you're going to be announcing, or even if not, if you just want to give a roundup of what people can expect. Yeah, totally. So it's our annual developer conference. It's the Neo4j Online Developer and Data Scientist Expo and Summit. It just happens to be N.O.D.E.S. as an acronym, of course. It's on June 17, so it should be on the same day as this article goes out.
Starting point is 00:17:15 I don't know exactly when you will release the podcast, George. It's the biggest graph event in the world. Last year, we had 13,000 registered people. We think we're going to burst right through that this year as well. And it's practitioner focused. That's the D, the developer and data scientist. And so there's no marketing fluff. It's just very hands-on, tangible: here's what people have done with Neo4j that worked, here's what people have done that didn't work, here's how we can use this new feature. It's very hands-on stuff. In my keynote, we're going to, I mean, we'll of course announce the funding. But as excited as I am about the funding, I'm probably equally excited about a demo that we're going to be doing in the keynote.
Starting point is 00:18:11 And so what we've done is we've taken this demo application that we've written, which is basically a social network graph with more members than Facebook. It has 3 billion people in it. And you will know this, George, it uses the LDBC schema for this, right? And so you know it well, which is this Linked Data Benchmark Council, which is a collaboration on creating tools and data sets to use in benchmarking graph databases.
Starting point is 00:18:43 And so we used that, but we made it run across a thousand-shard database. So it has more than a trillion relationships sharded across over a thousand servers, but all executing as a single graph database, which is running super complex, very graphy, low latency queries that return in tens of milliseconds or even less. And they're actually using the LDBC queries, so you know that those queries
Starting point is 00:19:18 are designed to torture a graph database. It is not like simple, primitive, just get this one property or something like that. It's over a trillion relationships. We even run some queries that are graph global. So they touch all 1,000 shards and return in tens of milliseconds, which I think is a pretty mind-blowing demo. And just to give you a little bit more, like, behind the scenes color on that, when we spun it up, and this was kind of a, we did it all in less than a month, which is, you know, just generating the dataset ends up being the hardest thing. But
Starting point is 00:19:57 like when we spun up all the 1000 shards, we couldn't because Amazon ran out of servers in that availability zone because it was like too many servers. And then at scale, when it was running, the first bill, it was $96,000 per day. Just to give you, maybe I'll revisit my use of funds answer from previous in the thing. And then they ended up optimizing and running it on smaller instances, so it ended up being cheaper. But that was kind of the initial one. And so this, I think, is a real accomplishment
Starting point is 00:20:35 and a real achievement that we feel really, really happy about. And we're going to demo it live running, and then we've open sourced it all or it's going to be released on June 17. So there's a GitHub repo where you can just take it, all the queries are there, all the data is there pre-generated because that ends up being the long pole in the tent
Starting point is 00:20:58 for something like this, so that anyone can get up and running with a thousand-shard, trillion-relationship graph with low latency queries. Yeah, that's pretty impressive and I can appreciate it because, well, I've personally been involved in something similar, nowhere near the scale that you run it, but actually like 10 years ago or something, I met a few of the researchers that actually devised that benchmark and those queries and the data generators. And yeah, I know what you mean.
Starting point is 00:21:31 I mean, it was a pain just generating, just running the code to generate the data. Well, you know, it was a pretty involved process, let's say. Maybe, you know, hopefully the code has evolved a little bit since then, but well, you still need parallelism and you still need to consume lots of resources. So yeah, I can appreciate it was a tough exercise and well, kudos to whoever managed to do it in a month. Yeah, yeah. It was a huge task.
Starting point is 00:22:00 And then if you then marry that up with another thing that we're announcing at NODES, which is the Aura free tier. So we're going to launch a free tier of Aura. And Aura is our cloud service where, you know, as a developer, you can just sign up for free, get up and running, zero dollars, free forever. And get up and start building your application or play around with Neo4j at zero cost. And if you take that all the way with kind of our Aura Professional, the self-serve, swipe a credit card, you know, 50 bucks per month type of a thing, and then take it all the way through Aura Enterprise, which is now used by some of the biggest customers
Starting point is 00:22:40 on the planet, right? For really deep enterprise-wide mission-critical deployments. And then you marry it up with this absolute global planet-scale trillion relationship, a thousand shard Neo4j single database. You add that up, that full spectrum, I think, covers really cradle to grave everything that you need for graphs. Yeah, well, I should also mention that this may be kind of an answer, let's say, to an oft-cited criticism I've been hearing about Neo4j, which is, well, yeah, I mean, it works great on a single node, but if you start distributing it, it's not so great anymore. But, well, this kind of demo,
Starting point is 00:23:28 well, it's not really a demo at that scale, but let's call it demo. It sounds pretty impressive and maybe, well, can serve as an answer to those criticisms, I guess. Yeah, and just the fact that it's going to be running, we're going to show it running, and then there's a GitHub repo where people can run it themselves, right?
Starting point is 00:23:47 And I think that's going to be a really important thing to be able to point people to in the future. Okay, well, yeah, it sounds like you've been keeping happily busy. I mean, besides the obvious fact that just getting a funding round obviously takes lots of negotiation and due diligence and whatnot. On the developer front as well, it sounds like you have some nice cool things to show to people. It's a really exciting time in the space.
Starting point is 00:24:20 And I think that's the broader thing. And you and I spoke a little bit about this with Alicia on the last podcast. I think you said it's the great vindication of George Anadiotis or something like that, right? And for those of us who've been, you know, talking about graph databases for a long time, just seeing the recognition that they're getting, and I'm not even talking about funding rounds and stuff like that, but just the massive usage that we see, like real people, be it individual developers all the way up to massive companies,
Starting point is 00:24:56 going all in on graphs in the way that we just didn't see outside of the consumer web. That happened in the consumer web in the late 90s, early 2000s. That's what built Google. That's what built Facebook. And that's what built Twitter. You know, that big bet on graph as a technology. But we haven't seen it at this scale
Starting point is 00:25:15 in the enterprise until now. So I think it's a really exciting, you know, part of the graph space right now at this time. I hope you enjoyed the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
