Orchestrate all the Things - Neo4j partners with Microsoft, unfolds strategy to power Generative AI applications with cloud platforms and Graph RAG. Featuring Neo4j CPO Sudhir Hasbe

Episode Date: March 27, 2024

From better together to full native integration, Neo4j is creating an ecosystem around all major cloud platforms to provide graph-powered features for Generative AI and beyond. As Neo4j just announced its partnership with Microsoft, we met with Chief Product Officer Sudhir Hasbe to talk about: what this partnership means for users and how it works; how graph-powered generative AI aligns with cloud platform AI strategies; similarities and differences across them; how Neo4j's strategy is shaping up; and when Databricks and Snowflake integrations are coming. For additional analysis and a writeup of the conversation, you can read the article published on Orchestrate all the Things: https://linkeddataorchestration.com/2024/03/27/neo4j-partners-with-microsoft-unfolds-strategy-to-power-generative-ai-applications-with-cloud-platforms-and-graph-rag/

Transcript
Starting point is 00:00:00 Welcome to Orchestrate All The Things. I'm George Anadiotis and we'll be connecting the dots together. Stories about technology, data, AI, and media, and how they flow into each other, shaping our lives. From better together to full native integration, Neo4j is creating an ecosystem around all major cloud platforms to provide graph-powered features for generative AI and beyond. As Neo4j just announced its partnership with Microsoft,
Starting point is 00:00:26 we met with Chief Product Officer Sudhir Hasbe to talk about what this partnership means for users and how it works, how graph-powered generative AI aligns with cloud platform AI strategies, similarities and differences across them, how Neo4j's strategy is shaping up,
Starting point is 00:00:45 and when Databricks and Snowflake integration are coming. I hope you will enjoy this. If you like my work and orchestrate all the things, you can subscribe to my podcast, available on all major platforms, my self-published newsletter, also syndicated on Substack, HackerNin, Medium, and Dzone, or follow and orchesturate all the things on your social media of choice. So I think if you look at
Starting point is 00:01:08 what Fabric allows you to do is it's like they have a storage system and they have all these workloads that they have within it. And we want to be the default graph analytics and like, you know, database workload on their platform.
Starting point is 00:01:22 So that's the whole idea. Now we have a lot of integrations. It's better together story. Like Fabric is there, we are there. And how do you go out and integrate with Data Factory and Synapse, data science, engineering, warehousing, Power BI, and OpenAI APIs. But this is still better together.
Starting point is 00:01:40 It's not fully integrated. And so what we are planning to do is, so basically this is how microsoft fabric looks like we go in there are these standard workloads which they provide out of the box and now they are going to enable additional workloads that other partners can bring in and so this is an example of workloads like london stock exchange and s3 are a couple of examples but then you will also have graph analytics from Neo4j. This is where we will be completely integrated. You don't see anything
Starting point is 00:02:11 Neo4j out of the box. And as soon as you say connect, it will actually transform that into a graph model automatically based on the information that we can pull up. We are exploring also using large language model to go ahead and give it the input and then automatically ask us, give us the main entities and all from there. So that will happen. And once that happens, you can just then explore it. And once you explore it, it basically allows you to show me a graph, automatically graph comes up, you can start playing with graph, like you're not doing analysis and all. So the whole idea here is it's basically fully integrated experience that you have inside the full Fabric platform. So they have roughly 350,000 plus customers using Power BI and all, right? This is one of the most largest platforms in analytics.
Starting point is 00:03:01 And so we become native part of that. So that's the fifth one, which we are most excited about for the long term. But that's not like that's coming. It's more of a partnership and we're building together. Other than that, I think the other big portions of announcements are mostly around integrations that we have done with Azure OpenAI.
Starting point is 00:03:20 So we have been working with the whole OpenAI platform for a long time. We have been integrating with different APIs that they have got, like everything from building unstructured data to knowledge graphs. We have a full experience on that. But basically, one of the biggest challenges, they're one of our big customers that they're taking a lot of unstructured data, converting into a knowledge graph so that then they can get much more accurate results out of it. So that's one area where we've integrated with all the different APIs and components
Starting point is 00:03:52 like with Azure OpenAI. The second is GraphRag. I think we believe GraphRag is really the best way to go ahead and actually build GenAI applications. And we have a lot of integr that we have done with Langchain, Lama Index, all the ecosystem, but we also have integrated Azure OpenAI APIs directly from the database. So if you are a database user, you're building an application,
Starting point is 00:04:17 from there you can just pass any data information to OpenAI APIs and you will get embedding out of it, or you can go out and do summarization out of it, or anything that you want to do with LLMs, you can go out and run that directly from database. So that's the second one, right? And the third one is just we built the whole embedding API. And internally in the database, we have storage, vector storage, as well as vector search capability. So now you can use Neo4j as a vector store if you want. But what we have really seen is just purely chunking the data into vectors and doing vector search is not enough.
Starting point is 00:04:58 And we can actually give a lot more structured context on top of that data. So if you search for a particular entity, you search for like, you know, engine or BMW car or anything, any example like that, like a part or something, we can give what other subcomponents are there for that particular entity around it. So like search echo parts manuals or something like that. So that's the third one.
Starting point is 00:05:24 And fourth one... I mean, for that to work, you also need to have that context. So it's not enough to just use Neo4j as your vector backend. In that case, you actually need to have it populated with your entire knowledge graph. Yes.
Starting point is 00:05:40 When they come back, you mentioned you would have to have the dependencies between the different components and so on. Yes, you're absolutely right. And this is super helpful for our existing customers too. So the first one is build a knowledge graph out of unstructured data. That is a lot of new customers are coming in for that. But existing customers that we have, like one of the largest pharmaceutical companies, like we have financial services, pharmaceuticals.
Starting point is 00:06:04 All of these people have built knowledge graphs with neo4j for 10-15 years now and the only thing was they couldn't put the unstructured data with it so we have like most manufacturing companies their bill of material is in neo4j and so if you're doing that but your parts manuals are actually outside of the knowledge graph, then it's harder to do the use cases that need blending. And so now they're saying, oh, if I have my bill of material, I can put parts manuals directly and attach them to the engine or to whatever the component is.
Starting point is 00:06:38 And then now you can do a dual search. You can do a graph direct search with nodes, but you can also do search based on the unstructured data with like... So that's the use case that is getting really popular for folks. And then also, just making the knowledge graph accessible to more
Starting point is 00:06:55 users is another key reason for that. And then the fourth one is mostly like, you know, we are one piece of the big ecosystem for analytics workloads, like within organizations. So whoever is in the Azure space, they are going to use various of these components from Fabric. And so we've built integrations with various different components in Fabric, like Azure Data Factory, if you want to use it for moving data into OneLake or into Neo4j. I think we've integrated with that. like Azure Data Factory, if you want to use it for moving data into OneLake or into Neo4j,
Starting point is 00:07:27 I think we've integrated with that. You're seamlessly integrated with, we have a data warehousing connector, so we can extract data from Synapse easily. You can use Synapse data science notebooks to go ahead and use our data science capabilities directly from it. And you can just use Power BI to create reports directly out of
Starting point is 00:07:45 Neo4j database. So I think that those are set of integrations that we have clubbed together in the fourth announcement that we have. And finally, on top of all of this, we have had Neo4j available on all the different cloud platforms. The last bit which was remaining was making the Neo4j database available through Azure Marketplace for professionals. Professionals are starting skew for people to build applications. And so now it is available through the marketplace too. So GA like that's available now for all the developers. So developers can get started at a very low friction point and pretty low cost on it. Okay, so just to add some context for people who may be listening to the conversation, the I guess the occasion, the reason why you're announcing it now is because there's an upcoming
Starting point is 00:08:35 Microsoft event. So you want to make the announcement, the unveiling, let's say there, but going back about a year, I guess, because it's actually been a year since you've had this role as αλλά πέρασε περίπου ένα χρόνο, γιατί έχει γίνει ένα χρόνο από το οποίο έχεις αυτήν την ρόλη ως CPU στο Neo4j. Ένα από τα πράγματα που μιλήσατε πριν ήταν ένα λίγο, ξεκινάμενο πρόβλημα, ας πούμε, της ρόδομαπτς που είχατε. Και φυσικά πολύ υψηλή αυτή η ρόδομαπτ που ήθελες να διδάσεις με with Sansur, with a WS and with Google. You started with Google and then I was a WS and now you're going, let's say, full circle since you're going to be announcing the Microsoft integration as well.
Starting point is 00:09:15 So I just wanted to ask you if you could give like a very high level, let's say, comparison of what the end result is like. If you feel like the integration you have with these cloud providers are more or less on the same level. Let's say, obviously, there's going to be differences and nuances because each service provider has its own interfaces and so on. But do you think that the level of service that you're offering and the level of integration you're offering is equal on these providers? Yes. So I think our vision has always been like the core capabilities that we want to have are fully integrated at the base level in all three cloud providers and they're equal.
Starting point is 00:09:57 But there is also every cloud provider has some unique capabilities that actually are differentiated. So I think if I look at our integration on the base, a large language model APIs, right? Like the Azure OpenAI, Vertex AI, or Bedrock, at that level, they all are the same because we want to make sure we are integrated with all three platforms. And mostly what happens is the customers come in with a choice of cloud, and then they basically are like, I want to just use this cloud.
Starting point is 00:10:27 And so we are like, okay, fine, we have to support all the platforms. So at the core, a large language model API integration is exactly the same. Then the next piece of the puzzle is vector search, vector capability. That's part of the database. And then from there, getting embeddings and all, that is very similar in all the three platforms that we have built. I think the difference where it comes in actually is, we're trying to find what is the differentiated capability they're bringing in. So in Fabric's case, the extensibility points that they have provided is much more stronger than any other platform I have seen.
Starting point is 00:11:06 Like I ran Google's analytics platform before, but Fabric's, this whole integrated experience they're building with first-party and third-party experience as a full, like Fabric is more of a SaaS platform
Starting point is 00:11:21 than most other analytics platforms that people are building in Google, AWS as platform as a service where they give you a platform, you build on top. These days are coming from, like Microsoft is coming from a full SaaS experience perspective, and that is differentiated. And that's why the demo I was trying to show you was how we can be fully integrated as a component of the platform, hey we are here they are here
Starting point is 00:11:46 and better together so that's what i would say is like really more differentiated capability that that microsoft is providing so we are building fully with that platform and integrating natively and that actually allows us to target a different persona of users. Historically, Neo4j has targeted developers and some high-end analysts. But by making it a very simple, targeted experience next to Power BI, we can democratize graph analytics and make it super easy for people and not have to worry about everything that goes around it. So I think that's the opportunity and that's the integration that we are most excited about okay so that last part is actually
Starting point is 00:12:31 very interesting and obviously you're leveraging the the infrastructure let's say and the and also would say the strategic goal that microsoft has set out to accomplish with this power bi by making neo4j part of that, you're also able to serve that audience, let's say. Exactly right. That's perfect. And that's what we will do, right? Even in other cloud providers, if there is something unique that they're bringing to market,
Starting point is 00:12:57 we want to be part of that. And not just from the three cloud providers, I also think about Snowflake and Databricks are the other two big platforms. And we're working on some of I also think about Snowflake and Databricks are the other two big platforms. And we're working on some of the cool stuff at Snowflake. I think by June, we will be ready to announce that. But yeah, making a fully integrated experience for users
Starting point is 00:13:13 is a big part of the puzzle. And yeah, at the core level, it will be similar. But then on top of it, we will have something very specific for individual platforms. Yeah. And well, speaking of Microsoft, and it's strategic, one of the other things that's really sort of timely at this time that we're having this conversation is Microsoft's AI strategy. And so, you know, last week, they announced that they hired Mustafa Suleiman,
Starting point is 00:13:41 and they're making a number of moves, besides the obvious one of being like the που είχαν ο Μουσταφάς Λέιμαν και κάνουν πολλά νομούς, εκτός από την απίστευτη, μεγάλη προσπάθεια, ας πούμε, της OpenAI, αλλά επίσης, στρατηγικά, εμπιστεύονται σε πολλές άλλες κατεύθυνσες. Και νομίζω ότι η ενδιαφέρμανση με το Neo4j, το βλέπω ως ένα μέρος αυτής. Οπότε, να μπορέσεις να δείξεις, να προσφέρεις έναν γραφικό ρογγ, και να κάνεις πολλά πράγματα, σε σχέση με το γραφικό και το γραφικό αναλυτικό. able to serve to offer a graph route and do a number of things in terms of graph and graph analytics so do you have any any views let's say on that and how do you see neo4j as being potentially a part of that as well and number one is the question and number two how is neo4j's
Starting point is 00:14:19 own ai strategy shaping up yeah i think first all, I don't have all the insights into the strategic decisions that Azure or Microsoft is making. I'm super excited about all the investments they are making in OpenAI, but also the internal investments that you saw. Actually, Microsoft Research did a recent research paper on GraphRag and how GraphRag is really better than other things. Super excited about that. So I think from a mental model perspective, we all agree on the importance of graphs in AI and what kind of applications will be built.
Starting point is 00:14:54 So I think that strategically, I think we are aligned on what the importance of this looks like. Getting more and more integrated and embedded with the platform is going to make it super easy for users to use it and also from a end user perspective you won't be able to see the difference between what's neo4j versus what's the core fabric platform look like and the ai co-pilots that people built on top of it right like our goal is to be just inherent part of the platform so that when you build a co-pilot on top of Microsoft technology, I think graphs are always available to you as an option to go ahead and use. So that's that.
Starting point is 00:15:30 From our perspective, from AI strategy for Neo4j, there are two, three things I think about. One is we are going to always be part of the ecosystem of one of the AI platforms. And so making sure that we are inherently integrated with them is the most important thing. Whether it's like Google with Vertex platform, with Gemini, whether it's Bedrock plus Titan plus all of the platform at AWS, whether it's Microsoft and OpenAI together or independently. We integrate with OpenAI APS directly, but most enterprises want it from Azure OpenAI.
Starting point is 00:16:04 So making sure we are integrated with that platform. So that's one piece of the puzzle. We are using AI for all the different experiences for our customers. So for example, for internal, we have co-pilots that we are building for like a query interface. You saw an example of that in the demo around our visual tooling. So anywhere you want to use natural language, you will be able to use it. We actually, in another couple of weeks, we are going to announce integration of Neo4j,
Starting point is 00:16:35 like in a better training model with Google in their DoIt experience. So when you search for how do I code something with graphs, it will be able to go ahead and help you. And it's in their experience for developers. So making sure we are using generative AI for simplifying developer experience is one of the second things we're investing in. And the third one is also internally using it for various of our tools and improving our own productivity is a big part of the puzzle. I think we are trying to look at all the query logs to figure out what are the kinds of problems our customers face again and again. Can we go ahead and provide them better service and stuff like that?
Starting point is 00:17:16 So I think there is internal use of technology so that we can figure out how to improve our own operations. In marketing, we are using GenAI for various marketing activities and all that. So that's for internal use. So platform integrations, simplifying developers through co-pilots, and then finally internal usage that improves us. So that's our high-level three-pronged strategy on AI for us. Yeah, points. Now, some of that I was ableάσω, ας πούμε, μόνο με το τι είναι δημιουργικά εφαρμογό, γιατί έχει περάσει καιρό που η οικονομική ομάδα σας έχει κάνει πολλά R&D δουλειά και
Starting point is 00:17:53 και γράφει για αυτά τα R&D δουλειά. Και, νομίζω, μέρος του τι αναφέρετε σε μερικές μέρες, είναι βασικά βασικά αυτό το δουλειά. base is actually based on that work. So all the LLM and vector integration, it's stuff that the team has been working on for a while. Yeah, absolutely right. I think the integration with Azure OpenAI and Fabric is new, but the core technology
Starting point is 00:18:18 we have been working on it and building it for some time, the core vector capability has already been built. I think in August we announced early release and then I think September or October we went GA with it and all. So that core capability is already there, but just building vectors is not enough. Now
Starting point is 00:18:34 you have to be able to create embeddings from the database directly, calling the embedding APIs from Azure OpenAI and all. So that's where the new things are coming in. More integrations rather than building the core underlying systems now. And well, just because there's more to life and product development than AI, I knew that there were also more items in your agenda from last year.
Starting point is 00:18:57 And so let's do a quick check on those. I know that one of the things that we talked about back then was scalability. Yeah, I think you also did something to address that Αν ξέρω αυτά, ξέρω ότι ένα από τα πράγματα που μιλήσαμε πριν ήταν η σκέψη. Ναι, νομίζω ότι... Επίσης, έκανα κάτι για να το επηρεάσω, γιατί στην εποχή που εξελίξαμε, υπήρξε ένα επέστρεπο που έφερατε στο Query Engine, το «Γραφικό παραλληλισμό». Και επίσης, αναφέρατε κάποια επανέντευξη που σχεδόταν με την γραφική δεδοσιάση και and some renewed attention that's focused on graph data science. And also, in fact, something that's not technical at all, but seems like a good idea.
Starting point is 00:19:31 And I wonder if you made any progress with that. So you mentioned an advisory, putting together an advisory board from a number of your clients, and I think generally from the industry. So I was wondering if you could just quickly report on those. And what are the new items, let's say, in your agenda for the coming year? Yeah, no, I think we have a customer advisory board coming up in another week's time again to recap and all. I think, first of all, yes, a lot of focus on scalability, continue to improve it. Parallel runtime, as you mentioned, was a big release for us. And we are going to continue investing in in more of the scalability capability
Starting point is 00:20:05 because our customers want to go to now tens to hundreds of terabytes of large graphs especially with unstructured data coming in the graph sizes are going to become bigger and bigger so we keep optimizing and improving it that is there i think the biggest thing that we are adding this year now is various things on our cloud platform like i think i think the the interest in going to cloud and just using the cloud database is so phenomenal that now we are getting a lot of feedback specifically on types of security features that we can add compliance stuff we can add but also improving the the core platform itself. There's a lot of features and functions that we need to add. So we are adding that this year. Multi-database is one of the biggest things that people have been asking us so that we
Starting point is 00:20:54 can reduce costs. There are a lot of SaaS companies using us as a backbone database and security SaaS companies and all that. And they have thousands of customers and they want every customer to have their own isolated database. But each database is megs. It's not like big. And the large ones will be in tens of gigs, right? So 100 gigs and all.
Starting point is 00:21:13 So like enabling that kind of capability in cloud, we had that in our enterprise edition, but in cloud offering, we're rethinking and figuring out how to make it super easy for users. So things like those on cloud platform side is important. And then making the user experience super easy. So one of the big things we are doing is,
Starting point is 00:21:32 again, I will show you a demo maybe next time when we are ready to launch it, is moving from relational databases, because 80% plus of our customers have tried some kind of a relational use case, and then they come to us many times right in traditional graph use cases and so we are going to make going from relational database to graph with graph modeling just automated in three clicks like it should be super simple i showed you
Starting point is 00:21:58 some insight of that in the demo we saw but just making that like a completely seamless experience is a pretty big part of the puzzle. And then unstructured data to graph, that is another area that we're making a lot of investment in. So making that simple to happen. So if we can get people to see graph with their own data, everybody starts believing in it. So I think making that super easy is a big part of the puzzle now.
Starting point is 00:22:22 Okay, so if I wanted to recap, I'd say, okay, you have lots of engineering, let's say, work to do on the cloud, or no, the cloud version of Neo4j, and the other big front would be going easily, either from unstructured data or relational data to graph, for people who have legacy systems and want to make them grow. Yeah, that's exactly right. That should be more than enough to
Starting point is 00:22:51 keep you happy for the next year, I think. Yeah, there's a lot of other things going on too. And the last bit I will say is a lot of focus this year on ecosystems and how we integrate with them and how to go to market. I will share more on things we are doing with Databricks and like Snowflake in coming months, but there is a lot of opportunity there for us to simplify the journey.
Starting point is 00:23:14 We have a lot of customers using these platforms and Neo4j next to each other and making that journey very simple, make it like, you know, more embeddable is going to be a big part of the puzzle for us this year. Yeah, I'm sure. And you already work with the hyperscalers and so, Adi, there's this sort of theme that's going around, like with the five platforms. You already work with three of them and you started working with the other two. And so if you complete that task, I think that would be a major step ahead, let's say, not only because there's lots of engineering effort involved, obviously, to integrate some of those, but then also because you have to figure out the details of going to market, basically. And that's an equally challenging, I guess, effort. Absolutely right, George.
Starting point is 00:24:03 And I think there's also two things, right? One is just having better together where go-to-market is just aligned, but we run independent motions
Starting point is 00:24:11 is one thing. I fundamentally believe if we did few things right and integrated more deeply with platforms, like
Starting point is 00:24:18 the fabric example is a good one. If we are completely integrated, then we as a smaller company relatively to Microsoftrosoft we get their benefit of their go-to-market motion right it's different than saying hey we support
Starting point is 00:24:32 microsoft platform and like buy us but we are still selling versus the example of fabric workload integration and you will see similar stuff with other platforms. Once we do that, then Microsoft salespeople, when they sell fabric, by default, they're also selling graph analytics. And by default, we get rev share on top of that. So I think that is the whole strategy is how do we go ahead and expand our footprint without having to have go-to-market motions
Starting point is 00:25:01 that are linearly scaling. We want sublinear scale on the market side. So I think that's the real opportunity with Microsoft for us is expand at scale and leverage their go-to-market motion for us. It sounds like a good idea.
Starting point is 00:25:15 Not necessarily easy to implement, but well... Yes. Exactly right. They're pretty big, so we have to make sure that we are agile and we can fit into their ecosystem pretty quickly.
Starting point is 00:25:28 Thanks for sticking around. For more stories like this, check the link in bio and follow Link Data Registration.
