Orchestrate all the Things - Neo4j partners with Microsoft, unfolds strategy to power Generative AI applications with cloud platforms and Graph RAG. Featuring Neo4j CPO Sudhir Hasbe
Episode Date: March 27, 2024. From better together to full native integration, Neo4j is creating an ecosystem around all major cloud platforms to provide graph-powered features for Generative AI and beyond. As Neo4j just announced its partnership with Microsoft, we met with Chief Product Officer Sudhir Hasbe to talk about: what this partnership means for users and how it works; how graph-powered generative AI aligns with cloud platform AI strategies; similarities and differences across them; how Neo4j's strategy is shaping up, and when Databricks and Snowflake integrations are coming. For additional analysis and a writeup of the conversation, you can read the article published on Orchestrate all the Things: https://linkeddataorchestration.com/2024/03/27/neo4j-partners-with-microsoft-unfolds-strategy-to-power-generative-ai-applications-with-cloud-platforms-and-graph-rag/
Transcript
Welcome to Orchestrate All The Things.
I'm George Anadiotis and we'll be connecting the dots together.
Stories about technology, data, AI, and media,
and how they flow into each other, shaping our lives.
From Better Together to full native integration,
Neo4j is creating an ecosystem around all major cloud platforms
to provide graph-powered features for generative AI and beyond.
As Neo4j just announced its partnership with Microsoft,
we met with Chief Product Officer Sudhir Hasbe to talk about
what this partnership means for users and how it works,
how graph-powered generative AI aligns with cloud platform AI strategies,
similarities and differences across them,
how Neo4j's strategy is shaping up,
and when Databricks and Snowflake integrations are coming.
I hope you will enjoy this.
If you like my work and orchestrate all the things,
you can subscribe to my podcast, available on all major platforms,
my self-published newsletter, also syndicated on Substack, Hackernoon, Medium, and DZone,
or follow Orchestrate all the Things
on your social media of choice.
So I think if you look at
what Fabric allows you to do is
it's like they have a storage system
and they have all these workloads
that they have within it.
And we want to be the default
graph analytics and like,
you know, database workload
on their platform.
So that's the whole idea.
Now we have a lot of integrations.
It's a better together story.
Like Fabric is there, we are there.
And how do you go out and integrate with Data Factory
and Synapse, data science, engineering,
warehousing, Power BI, and OpenAI APIs?
But this is still better together.
It's not fully integrated.
And so what we are planning to do is,
so basically, this is how Microsoft Fabric looks. You go in, there are these standard workloads which they provide out of the box, and now they are going to enable additional workloads that other partners can bring in. So this is an example of workloads: London Stock Exchange and S3 are a couple of examples, but then you will also have graph analytics from Neo4j. This is where we will be completely integrated. You don't see anything Neo4j-specific; it's out of the box. And as soon as you say connect, it will actually transform that into a graph model automatically, based on the information that we can pull up. We are also exploring using large language models, so you give it the input and it automatically gives us the main entities and all from there. So that will happen. And once that happens, you can just explore it. And once you explore it, you can say "show me a graph," automatically the graph comes up, and you can start playing with the graph, like, you know, doing analysis and all. So the whole idea here is it's basically a fully integrated experience that you have inside the full Fabric platform.
So they have roughly 350,000 plus customers using Power BI and all, right?
This is one of the largest platforms in analytics.
And so we become native part of that.
So that's the fifth one,
which we are most excited about for the long term.
But that's not like that's coming.
It's more of a partnership and we're building together.
Other than that, I think the other big portions
of announcements are mostly around integrations
that we have done with Azure OpenAI.
So we have been working with the whole OpenAI platform
for a long time. We have been
integrating with different APIs that they have got, like everything from building unstructured
data to knowledge graphs. We have a full experience on that. But basically, one of the biggest
challenges: a lot of our big customers are taking a lot of unstructured data and converting it into a knowledge graph, so that they can then get much more accurate results out of it.
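To make that unstructured-data-to-knowledge-graph flow a bit more concrete, here is a minimal sketch of one way to do it with the Azure OpenAI and Neo4j Python clients. The prompt, deployment name, connection details and the Entity/RELATED_TO schema are assumptions for illustration, not Neo4j's actual pipeline.

```python
import json
from openai import AzureOpenAI   # pip install openai
from neo4j import GraphDatabase  # pip install neo4j

# Placeholders: swap in your own Azure OpenAI resource and Neo4j instance.
aoai = AzureOpenAI(api_key="<key>", api_version="2024-02-01",
                   azure_endpoint="https://<resource>.openai.azure.com")
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))

PROMPT = ("Extract entities and relationships from the user's text. "
          "Reply as JSON with 'entities' (list of {name, type}) and "
          "'relations' (list of {source, type, target}).")

def text_to_graph(text: str) -> None:
    # Ask the LLM for a structured extraction; '<gpt-deployment>' stands in
    # for your Azure OpenAI chat deployment name.
    resp = aoai.chat.completions.create(
        model="<gpt-deployment>",
        response_format={"type": "json_object"},
        messages=[{"role": "system", "content": PROMPT},
                  {"role": "user", "content": text}])
    graph = json.loads(resp.choices[0].message.content)

    # Upsert the extracted entities and relationships into the knowledge graph.
    driver.execute_query(
        "UNWIND $entities AS e MERGE (n:Entity {name: e.name}) SET n.type = e.type",
        entities=graph["entities"])
    driver.execute_query(
        """
        UNWIND $relations AS r
        MATCH (a:Entity {name: r.source}), (b:Entity {name: r.target})
        MERGE (a)-[rel:RELATED_TO]->(b) SET rel.type = r.type
        """,
        relations=graph["relations"])

text_to_graph("The X5 uses a twin-turbo V8 engine supplied by the Dingolfing plant.")
```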
So that's one area where we've integrated
with all the different APIs and components
like with Azure OpenAI.
The second is GraphRag.
I think we believe GraphRag is really the best way
to go ahead and actually build GenAI applications.
And we have a lot of integrations that we have done with LangChain, LlamaIndex, all the ecosystem,
but we also have integrated Azure OpenAI APIs directly from the database.
So if you are a database user, you're building an application,
from there you can just pass any data information to OpenAI APIs
and you will get embedding out of it, or you can go out and
do summarization out of it, or anything that you want to do with LLMs, you can go out and run that
directly from database. So that's the second one, right? And the third one is just we built the
whole embedding API. And internally in the database, we have storage, vector storage,
as well as vector search capability.
So now you can use Neo4j as a vector store if you want.
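As a rough illustration of the vector store and embedding pieces, a sketch along these lines would work against a recent Neo4j 5.x instance. This is the client-side path, not the in-database API integration he describes; the index name, label, dimensions and embedding deployment are placeholders.

```python
from neo4j import GraphDatabase
from openai import AzureOpenAI

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))
aoai = AzureOpenAI(api_key="<key>", api_version="2024-02-01",
                   azure_endpoint="https://<resource>.openai.azure.com")

def embed(text: str) -> list[float]:
    # '<embedding-deployment>' stands in for your Azure OpenAI embedding deployment.
    return aoai.embeddings.create(model="<embedding-deployment>",
                                  input=text).data[0].embedding

# A vector index over chunk embeddings (Neo4j 5.x syntax; 1536 dimensions
# assumes an ada-002-style embedding model).
driver.execute_query("""
    CREATE VECTOR INDEX chunk_embeddings IF NOT EXISTS
    FOR (c:Chunk) ON (c.embedding)
    OPTIONS {indexConfig: {`vector.dimensions`: 1536,
                           `vector.similarity_function`: 'cosine'}}
""")

# Store a chunk together with its embedding...
text = "The V8 engine uses a forged crankshaft."
driver.execute_query(
    "MERGE (c:Chunk {id: $id}) SET c.text = $text, c.embedding = $vec",
    id="manual-1#0", text=text, vec=embed(text))

# ...and run a nearest-neighbour search over the index.
records, _, _ = driver.execute_query(
    """
    CALL db.index.vector.queryNodes('chunk_embeddings', 5, $vec)
    YIELD node, score
    RETURN node.text AS text, score
    """,
    vec=embed("what is the crankshaft made of?"))
for record in records:
    print(record["text"], record["score"])
```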
But what we have really seen is just purely chunking the data into vectors and doing vector search is not enough.
And we can actually give a lot more structured context on top of that data. So if you search for a particular entity,
you search for like, you know, engine or BMW car
or anything, any example like that,
like a part or something,
we can give what other subcomponents are there
for that particular entity around it.
So, like, searching across parts manuals or something like that.
So that's the third one.
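For instance, that "structured context on top of vector search" idea could look roughly like the query below, reusing the driver and embed() helper from the previous sketch. The :Component and :ManualChunk labels, the HAS_PART and DESCRIBES relationships, and the index name are hypothetical.

```python
# Vector search finds the most relevant component, then the graph supplies the
# structured context around it: its subcomponents and any manual snippets
# attached to them. (driver and embed() as defined in the earlier sketch.)
records, _, _ = driver.execute_query(
    """
    CALL db.index.vector.queryNodes('component_descriptions', 3, $vec)
    YIELD node AS component, score
    MATCH (component)-[:HAS_PART*1..2]->(sub:Component)
    OPTIONAL MATCH (sub)<-[:DESCRIBES]-(chunk:ManualChunk)
    RETURN component.name AS component, score,
           collect(DISTINCT sub.name)   AS subcomponents,
           collect(DISTINCT chunk.text) AS manual_snippets
    """,
    vec=embed("BMW V8 engine"))
```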
And fourth one...
I mean, for that to work,
you also need to have that context.
So it's not enough to just use Neo4j
as your vector backend.
In that case, you actually need to have it populated
with your entire knowledge graph.
Yes.
Coming back to what you mentioned,
you would have to have the dependencies
between the different components and so on.
Yes, you're absolutely right.
And this is super helpful for our existing customers too.
So the first one is build a knowledge graph out of unstructured data.
A lot of new customers are coming in for that.
But existing customers that we have, like one of the largest pharmaceutical companies, like we have financial services, pharmaceuticals.
All of these people have built knowledge graphs with Neo4j for 10 to 15 years now, and the only thing was they couldn't put the unstructured data with it. So we have, like, most manufacturing companies, their bill of materials is in Neo4j. And if you're doing that but your parts manuals are actually outside of the knowledge graph,
then it's harder to do the use cases that need blending.
And so now they're saying, oh, if I have my bill of material,
I can put parts manuals directly and attach them to the engine
or to whatever the component is.
And then now you can do a dual search.
You can do a graph direct search with nodes,
but you can also do search based on the unstructured data
with like... So that's the
use case that is getting really
popular for folks.
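A sketch of the write side of that blending, attaching parts-manual chunks to the bill-of-materials component they describe, might look like this, again reusing driver and embed() from the sketches above; the labels, relationship type and IDs are made up for illustration.

```python
manual_chunks = [
    {"component_id": "ENG-001", "chunk_id": "manual-7#3",
     "text": "Torque the crankshaft bolts to 60 Nm in a star pattern."},
]

# Attach each chunk to the BOM component it describes, so the same node can be
# reached both by graph-direct search and by vector search over chunk text.
driver.execute_query(
    """
    UNWIND $chunks AS ch
    MATCH (comp:Component {id: ch.component_id})
    MERGE (m:ManualChunk {id: ch.chunk_id})
    SET   m.text = ch.text, m.embedding = ch.embedding
    MERGE (m)-[:DESCRIBES]->(comp)
    """,
    chunks=[{**c, "embedding": embed(c["text"])} for c in manual_chunks])
```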
And then also, just making the
knowledge graph accessible to more
users is another key
reason for that. And then the fourth
one is mostly like, you know,
we are one piece of the
big ecosystem for analytics workloads, like within organizations.
So whoever is in the Azure space, they are going to use various of these components from Fabric.
And so we've built integrations with various different components in Fabric, like Azure Data Factory, if you want to use it for moving data into OneLake or into Neo4j.
I think we've integrated with that.
We're seamlessly integrated with it:
we have a data warehousing connector,
so we can extract data from Synapse easily.
You can use Synapse data science notebooks to go ahead and use our data science capabilities
directly from it.
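In a Synapse (or any Python) notebook, the read side of that integration can be as simple as pulling query results into a pandas DataFrame. This is just the plainest-Python version, not the warehousing connector he mentions, and the connection details and query are placeholders.

```python
import pandas as pd
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j+s://<your-host>:7687",
                              auth=("neo4j", "<password>"))

# Pull an aggregate out of the graph and hand it to the usual data science stack.
records, _, _ = driver.execute_query(
    "MATCH (a:Component)-[:HAS_PART]->(p:Component) "
    "RETURN a.name AS assembly, count(p) AS direct_parts "
    "ORDER BY direct_parts DESC")
df = pd.DataFrame([r.data() for r in records])
print(df.head())
```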
And you can just use Power BI
to create reports directly out of
Neo4j database. So I think those are a set of integrations that we have clubbed together in the
fourth announcement that we have. And finally, on top of all of this, we have had Neo4j available
on all the different cloud platforms. The last bit which was remaining was making the Neo4j database available through Azure Marketplace for Professional.
Professional is a starting SKU for people to build applications.
And so now it is available through the marketplace too.
So GA like that's available now for all the developers.
So developers can get started at a very low friction point and pretty low cost on it. Okay, so just to add some context for people who may be listening to the conversation,
the occasion, I guess, the reason why you're announcing it now, is because there's an upcoming Microsoft event. So you want to make the announcement, the unveiling, let's say, there. But going back about a year, because it's actually been a year since you've had this role as CPO at Neo4j, one of the things we talked about back then was a bit of, let's say, the roadmap that you had. And of course, high on that roadmap were the integrations you wanted to do with Azure, with AWS and with Google.
You started with Google, then there was AWS, and now you're going, let's say,
full circle, since you're going to be announcing the Microsoft integration as well.
So I just wanted to ask you if you could give like a very high level, let's say,
comparison of what the end result is like. If you feel like the integration you have with these cloud providers
are more or less on the same level.
Let's say, obviously, there's going to be differences and nuances
because each service provider has its own interfaces and so on.
But do you think that the level of service that you're offering
and the level of integration you're offering is equal on these providers? Yes. So I think our vision has always been like the core capabilities that we want to
have are fully integrated at the base level in all three cloud providers and they're equal.
But there is also every cloud provider has some unique capabilities that actually are differentiated. So I think if I look at our integration on the base,
the large language model APIs, right?
Like the Azure OpenAI, Vertex AI, or Bedrock,
at that level, they all are the same
because we want to make sure we are integrated
with all three platforms.
And mostly what happens is the customers come in
with a choice of cloud, and then they basically are like, I want to just use this cloud.
And so we are like, okay, fine, we have to support all the platforms.
So at the core, a large language model API integration is exactly the same.
Then the next piece of the puzzle is vector search, vector capability.
That's part of the database.
And then from there, getting embeddings and all,
that is very similar in all the three platforms that we have built.
I think the difference where it comes in actually is, we're trying to find what is the differentiated capability they're bringing in.
So in Fabric's case, the extensibility points that they have provided are much stronger than on any other platform I have seen.
Like, I ran Google's analytics platform before, but Fabric, this whole integrated experience they're building with first-party and third-party experiences, is more of a SaaS platform than most other analytics platforms. What people are building in Google and AWS is platform as a service, where they give you a platform and you build on top. These guys, like, Microsoft is coming from a full SaaS experience perspective, and that is differentiated.
And that's why the demo I was trying to show you was how we can be fully integrated as a component of the platform, rather than "hey, we are here, they are here, better together." So that's what I would say is really the more differentiated capability that Microsoft is providing. So we are building fully with that platform and integrating natively, and that actually allows us to target
natively and that actually allows us to target a different persona of users. Historically, Neo4j has targeted developers and some high-end analysts.
But by making it a very simple, targeted experience next to Power BI, we can democratize graph
analytics and make it super easy for people and not have to worry about everything that
goes around it.
So I think that's the opportunity
and that's the integration that we are most excited about.
Okay, so that last part is actually very interesting. Obviously you're leveraging the infrastructure, let's say, and I would also say the strategic goal that Microsoft has set out to accomplish with Power BI. By making Neo4j part of that, you're also able to serve that audience, let's say.
Exactly right. That's perfect.
And that's what we will do, right?
Even in other cloud providers,
if there is something unique that they're bringing to market,
we want to be part of that.
And not just from the three cloud providers,
I also think about Snowflake and Databricks
are the other two big platforms.
And we're working on some of the cool stuff
at Snowflake. I think by June, we will be ready to
announce that. But yeah, making a
fully integrated experience for users
is a big part of the puzzle.
And yeah, at the core level, it will be similar.
But then on top of it, we
will have something very specific for
individual platforms.
Yeah. And well, speaking of Microsoft and its strategy, one of the other things that's really sort of timely at this time that we're having this conversation is Microsoft's AI strategy. So, you know, last week they announced that they hired Mustafa Suleyman, and they're making a number of moves. Besides the obvious one, the huge bet, let's say, on OpenAI, they're also strategically investing in a number of other directions. And I think the engagement with Neo4j, I see it as a part of that: being able to serve, to offer Graph RAG, and do a number of things in terms of graph and graph analytics. So do you have any views, let's say, on that, and how do you see Neo4j as being potentially a part of that as well? That's question number one. And number two, how is Neo4j's own AI strategy shaping up?
Yeah, I think, first of all, I don't have all the insights into the strategic decisions
that Azure or Microsoft is making.
I'm super excited about all the investments they are making in OpenAI, but also the internal
investments that you saw.
Actually, Microsoft Research did a recent research paper on GraphRag and how GraphRag
is really better than other things.
Super excited about that.
So I think from a mental model perspective, we all agree on the importance of graphs in AI and what kind of applications will be built.
So I think that strategically, I think we are aligned on what the importance of this looks like.
Getting more and more integrated and embedded with the platform is going to make it super easy
for users to use it. And also, from an end user perspective, you won't be able to see the difference between what's Neo4j versus what the core Fabric platform looks like, and the AI copilots that people build on top of it, right? Like, our goal is to be just an inherent part of the platform, so that
when you build a co-pilot on top of Microsoft technology,
I think graphs are always available to you as an option to go ahead and use.
So that's that.
From our perspective, from AI strategy for Neo4j, there are two, three things I think
about.
One is we are going to always be part of the ecosystem of one of the AI platforms.
And so making sure that we are inherently integrated with them is the most important
thing.
Whether it's like Google with Vertex platform, with Gemini, whether it's Bedrock plus Titan
plus all of the platform at AWS, whether it's Microsoft and OpenAI together or independently.
We integrate with OpenAI APIs directly, but most enterprises want it from Azure OpenAI.
So making sure we are integrated with that platform.
So that's one piece of the puzzle.
We are using AI for all the different experiences for our customers.
So for example, for internal, we have co-pilots that we are building for
like a query interface.
You saw an example of that in the demo around our visual tooling.
So anywhere you want to use natural language, you will be able to use it.
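The general shape of such a natural-language query copilot, sketched with the aoai client and driver from the earlier examples, might look like the following. The schema string, deployment name and prompt are invented for illustration; this is not Neo4j's actual copilot.

```python
# Naive text-to-Cypher step: describe the graph schema to the LLM, get a
# read-only query back, and run it. (aoai and driver as defined earlier.)
SCHEMA = "Nodes: (:Customer {name}), (:Product {id}); Rels: (:Customer)-[:ORDERED]->(:Product)"

def ask(question: str):
    resp = aoai.chat.completions.create(
        model="<gpt-deployment>",
        messages=[{"role": "system",
                   "content": f"Write a single read-only Cypher query for this schema: {SCHEMA}. "
                              "Return only the query, no explanation."},
                  {"role": "user", "content": question}])
    cypher = resp.choices[0].message.content.strip().strip("`")
    records, _, _ = driver.execute_query(cypher)
    return [r.data() for r in records]

print(ask("Which customers ordered the most products?"))
```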
We actually, in another couple of weeks, are going to announce an integration of Neo4j, like a better-trained model with Google, in their DoIt experience.
So when you search for how do I code something with graphs, it will be able to go ahead and
help you.
And it's in their experience for developers.
So making sure we are using generative AI for simplifying developer experience is one of the second things we're investing in.
And the third one is also internally using it for various of our tools and improving our own productivity is a big part of the puzzle. I think we are trying to look at all the query logs to figure out what are the kinds of problems
our customers face again and again.
Can we go ahead and provide them better service and stuff like that?
So I think there is internal use of technology so that we can figure out how to improve our
own operations.
In marketing, we are using GenAI for various
marketing activities and all that. So that's for internal use. So platform integrations,
simplifying developers through co-pilots, and then finally internal usage that improves us.
So that's our high-level three-pronged strategy on AI for us.
Yeah, good points. Now, some of that I was able to connect, let's say, with what is already out there, because for a while now your team has been doing a lot of R&D work and writing about that R&D work. And I think part of what you announced these days is basically based on that work. So all the LLM and vector integration,
it's stuff that the team
has been working on for a while.
Yeah, absolutely right.
I think the integration with
Azure OpenAI and Fabric
is new, but the core technology
we have been working on it and building it for
some time, the core vector capability
has already been built. I think in August
we announced early release
and then I think September or October we went
GA with it and all. So that
core capability is already there, but
just building vectors is not enough. Now
you have to be able to create embeddings from the
database directly, calling the embedding
APIs from Azure OpenAI and all. So that's
where the new things are coming in.
More integrations
rather than building the core underlying systems now.
And well, just because there's more to life and product development than AI,
I know that there were also more items on your agenda from last year.
And so let's do a quick check on those.
I know that one of the things that we talked about back then was scalability. And I think you also did something to address that, because in the time since, there was an improvement you brought to the query engine, the parallel runtime. And you also mentioned some renewed attention that's focused on graph data science.
And also, in fact, something that's not technical at all,
but seems like a good idea.
And I wonder if you made any progress with that. So you mentioned an advisory, putting together an advisory board
from a number of your clients, and I think generally from the industry.
So I was wondering if you could just quickly report on those.
And what are the new items, let's say, in your agenda for the coming year?
Yeah, no, I think we have a customer advisory board coming up in another week's time again to recap and all.
I think, first of all, yes, a lot of focus on scalability; we continue to improve it. Parallel runtime, as you mentioned, was a big release for us. And we are going to continue investing in more of the scalability capability, because our customers now want to go to tens to hundreds of terabytes of large graphs. Especially with unstructured data coming in, the graph sizes are going to become bigger and bigger, so we keep optimizing and improving it. That is there.
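The parallel runtime he refers to is opted into per query. A rough example follows; it assumes an Enterprise edition on a recent Neo4j 5.x release, and the query and schema are hypothetical.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))

# Prefixing a read query with `CYPHER runtime=parallel` opts it into the
# parallel runtime, which is aimed at large analytical scans like this one.
records, _, _ = driver.execute_query(
    """
    CYPHER runtime=parallel
    MATCH (a:Component)-[:HAS_PART*]->(leaf:Component)
    WHERE NOT (leaf)-[:HAS_PART]->()
    RETURN a.name AS assembly, count(leaf) AS leaf_parts
    ORDER BY leaf_parts DESC
    LIMIT 10
    """)
```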
I think the biggest thing that we are adding this year is various things on our cloud platform. The interest in going to cloud and just using the cloud database is so phenomenal that now we are getting a lot of feedback, specifically on the types of security features we can add and the compliance stuff we can add, but also on improving the core platform itself. There's a lot of features and functions that we need to add.
So we are adding that this year.
Multi-database is one of the biggest things that people have been asking us so that we
can reduce costs.
There are a lot of SaaS companies using us as a backbone database and security SaaS companies
and all that.
And they have thousands of customers and they want every customer to have their own isolated database.
But each database is megs.
It's not like big.
And the large ones will be in tens of gigs, right?
So 100 gigs and all.
So like enabling that kind of capability in cloud,
we had that in our enterprise edition,
but in cloud offering,
we're rethinking and figuring out
how to make it super easy for users.
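The one-isolated-database-per-customer pattern he describes maps onto Neo4j's existing CREATE DATABASE support. A minimal sketch, with made-up database and customer names; on the cloud offering he is talking about, provisioning would go through the platform rather than raw Cypher.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))

# Provision an isolated database for one tenant. This is an admin command, run
# against the system database; it requires an edition with multi-database support.
with driver.session(database="system") as session:
    session.run("CREATE DATABASE customer01 IF NOT EXISTS").consume()

# From then on, that tenant's sessions are pinned to its own database.
with driver.session(database="customer01") as session:
    session.run("MERGE (:Customer {name: $name})", name="Acme Corp").consume()
```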
So things like those on the cloud platform side are important.
And then making the user experience super easy.
So one of the big things we are doing is,
again, I will show you a demo maybe next time
when we are ready to launch it,
is moving from relational databases,
because 80% plus of our customers
have tried some kind of a relational use case,
and then they come to us, many times, right, in traditional graph use cases. And so we are going to make going from a relational database to graph, with graph modeling, just automated, in three clicks. Like, it should be super simple. I showed you some insight of that in the demo we saw, but just making that a completely seamless experience is a pretty big part of the puzzle.
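A hand-rolled version of that relational-to-graph move, which the tooling he previews would automate, might look like the sketch below. The sqlite file, table and the Customer/Product/ORDERED model are invented for illustration.

```python
import sqlite3
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))

# Read rows out of a relational table...
conn = sqlite3.connect("orders.db")
rows = [{"order_id": o, "customer_id": c, "product_id": p}
        for o, c, p in conn.execute(
            "SELECT order_id, customer_id, product_id FROM orders")]

# ...and turn the foreign keys into relationships in the graph.
driver.execute_query(
    """
    UNWIND $rows AS row
    MERGE (c:Customer {id: row.customer_id})
    MERGE (p:Product  {id: row.product_id})
    MERGE (c)-[:ORDERED {order_id: row.order_id}]->(p)
    """,
    rows=rows)
```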
And then unstructured data to graph, that is another area that we're making a lot of
investment in.
So making that simple to happen.
So if we can get people to see graph with their own data, everybody starts believing
in it.
So I think making that super easy is a big part of the puzzle now.
Okay, so if I wanted to recap, I'd say, okay, you have lots of engineering,
let's say, work to do on the cloud version of Neo4j,
and the other big front would be going easily,
either from unstructured data or relational data to graph,
for people who have legacy systems
and want to make the move to graph.
Yeah, that's exactly right.
That should be more than enough to
keep you happy for the next
year, I think. Yeah, there's a lot of
other things going on too. And the last
bit I will say is a lot of focus this
year on ecosystems and how we
integrate with them and how to go to market.
I will share more on things we are doing with Databricks and like Snowflake in coming
months, but there is a lot of opportunity there for us to simplify the journey.
We have a lot of customers using these platforms and Neo4j next to each other and making that
journey very simple, make it like, you know, more embeddable is going to be a big part of the puzzle for us
this year. Yeah, I'm sure. And you already work with the hyperscalers, and so there's this
sort of theme that's going around, like with the five platforms. You already work with three of
them and you started working with the other two. And so if you complete that task, I think that
would be a major step ahead, let's say, not only because there's lots of engineering effort involved, obviously, to integrate some of those, but then also because you have to figure out the details of going to market, basically.
And that's an equally challenging, I guess, effort.
Absolutely right, George.
And I think there's also two things, right? One is just having better together, where go-to-market is aligned but we run independent motions; that is one thing. I fundamentally believe that if we did a few things right and integrated more deeply with platforms, like the Fabric example is a good one, if we are completely integrated, then we, as a smaller company relative to Microsoft, get the benefit of their go-to-market motion. Right? It's different than saying, "hey, we support the Microsoft platform, buy us," where we are still selling, versus the example of the Fabric workload integration. And you will see similar stuff with other platforms. Once we do that, then Microsoft salespeople, when they sell Fabric, by default,
they're also selling graph analytics.
And by default, we get rev share on top of that.
So I think that is the whole strategy is
how do we go ahead and expand our footprint
without having to have go-to-market motions
that are linearly scaling.
We want sublinear scale on the market side.
So I think that's the real opportunity
with Microsoft for us
is expand at scale
and leverage their
go-to-market motion for us.
It sounds like a good idea.
Not necessarily easy to implement,
but well...
Yes.
Exactly right.
They're pretty big,
so we have to make sure
that we are agile
and we can fit into their ecosystem pretty quickly.
Thanks for sticking around. For more stories like this, check the link in bio and follow Linked Data Orchestration.