Orchestrate all the Things - Knowledge Graphs as the essential truth layer for Pragmatic AI. Featuring Tony Seale, The Knowledge Graph Guy
Episode Date: March 11, 2025. Organizations are facing a critical challenge to AI adoption: how to leverage their domain-specific knowledge to use AI in a way that delivers trustworthy results. Knowledge graphs provide the missing "truth layer" that transforms probabilistic AI outputs into real-world business acceleration. Knowledge graphs are powering products for the likes of Amazon and Samsung. The knowledge graph market is expected to grow to $6.93 billion by 2030, at a CAGR of 36.6%. Gartner has been advocating for the role of knowledge graphs in AI and the downstream effects in organizations going forward for the last few years. Neither the technology nor the vision are new. Knowledge graph technology has been around for decades, and people like Tony Seale were early to identify its potential for AI. Seale, also known as "The Knowledge Graph Guy", is the founder of the eponymous consulting firm. In this extensive conversation, we covered everything from knowledge graph first principles to application patterns for safe, verifiable AI, real-world experience, trends, predictions, and the way forward. Read the article published on Orchestrate all the Things here: https://linkeddataorchestration.com/2025/03/11/knowledge-graphs-as-the-essential-truth-layer-for-pragmatic-ai/
Transcript
Welcome to Orchestrate All The Things.
I'm George Anadiotis and we'll be connecting the dots together.
Stories about technology, data, AI and media and how they flow into each other, shaping our lives.
Organizations are facing a critical challenge to AI adoption.
How to leverage their domain-specific knowledge to use AI in a way that delivers trustworthy results.
Knowledge graphs provide the missing truth layer that transforms probabilistic
AI outputs into real-world business acceleration. Knowledge graphs are powering products for
the likes of Amazon and Samsung. The knowledge graph market is expected to grow to $6.93
billion by 2030 at a compound annual growth rate of 36.6%. Gartner has been advocating for the role of knowledge graphs in AI
and the downstream effects in organizations going forward for the last few years.
Neither the technology nor the vision are new.
Knowledge graph technology has been around for decades
and people like Tony Seale were early to identify its potential for AI.
Seale, also known as the Knowledge Graph Guy,
is the founder of the eponymous consulting firm.
In this extensive conversation,
we cover everything from knowledge graph first principles,
to application patterns for safe, verifiable AI,
real-world experience, trends, predictions, and the way forward.
I hope you will enjoy this.
If you like my work on Orchestrate All the Things, you can subscribe to my podcast, available on all major platforms, and my self-published newsletter, also syndicated on Substack, Hackernoon, Medium and DZone, or follow Orchestrate All the Things on your social media of choice.
Hi, thanks for having me on the podcast.
It's really nice to be here.
I'm Tony Seale, also known as the Knowledge Graph Guy.
Well, yes, it's true that between you and me, we do know what a Knowledge Graph is,
I hope.
However, that's not necessarily the case for everyone else who may be listening.
So whenever I'm having this conversation
and in the presence of an audience
that I know is not necessarily familiar with the term,
the first thing I like to do is always start by defining,
okay, so let's take a step back
and actually explain to people
what exactly is a knowledge graph.
My way of doing that, to people who perhaps don't even have a background in computer science or data modeling or any of those things, is I explain it like this.
Like, okay, do you know what a spreadsheet is?
99.9% of people know what a spreadsheet is.
So, fine. You can think of it as...
This is a table, basically, right?
So, a graph is something like a mind map.
So, you still have data, but instead of having it in a tabular format,
it's kind of more free-flowing and you have connections between those data.
And that's a graph. That's a graph in terms of a data model.
Because, again, people, when talking about graphs,
they tend to think about things like visualizations and bar charts
and that kind of stuff. So that's a first useful distinction.
But then actually it gets a bit more nuanced,
because a graph is not necessarily a knowledge graph.
And that's...
So to me, the defining, let's say, feature,
the defining characteristic of a knowledge graph
is the knowledge part, because there's many graphs out there,
but it's the knowledge that's attached to a graph that makes it special.
Do you agree with that maybe simplistic definition?
Yeah, I think it's a good one.
I think the other thing that can be helpful to people
to understand is when you think of some well-known graphs,
like social networks, like an obvious graph.
So people think about their Facebook network: their friends, friends of those friends. So I think that often gives people a better understanding.
But yeah, I'm interested in what you're saying there
about the distinction between a graph and a knowledge graph.
And yeah, so are you referring there to ontology?
Is that...?
We'll get to that part as well.
I mean, if we can agree on the fact that, well,
there's different types of graphs,
there's graphs that will connect different nodes,
and there's graphs that have some kind of knowledge,
quote, knowledge, attached to them,
then we get to the part like, okay,
how do you define this knowledge?
Then we start talking about semantics, I guess. And so then, kind of eventually and unavoidably, I would say, we get to the part about ontologies.
Yeah, okay. Well, so with graphs, just sticking with the graph part, I guess the key thing is what you're doing with a graph above anything else. And at that point, it's quite interesting to take it back down to the level of maths, I think.
So yeah, you could have your data in a set.
So you've got set theory, which in a way is what we're doing with those Excel tables. We're treating things as sets, because the set is like the table, basically.
And then in a relational database,
you're putting connections between them.
But the connections, they're not really first-class citizens. And the raw, bare-metal mathematical difference between graph theory and set theory is that we're taking the relationships and we're making them first-class citizens.
So now you're able to view your data as a network. Viewing your data as a network is a fundamentally different way of looking at things. And I think it's a
powerful way of looking at things because in reality no piece of information actually exists
in isolation. In fact, the context is what gives meaning to pretty much everything. So to that extent, I think all graphs have the inherent potential to bring more knowledge
and meaning because they've already taken the first step of acknowledging the interconnectivity
of information and the contextual nature of information. So you're already on the kind of
first step to the path from turning just plain data into knowledge. But maybe, you know, I'd be
keen to, I think it would be fun to dig into exactly what you mean by knowledge there. That
could be like an interesting conversation to have. Yeah, okay. So now we're getting to the interesting part. So how do you define knowledge?
I mean, in my mind, yes, you're right.
If you model your domain,
your problem, whatever that is as a graph,
then yes, you're already taking the first step
towards enabling discovery, enabling things like path finding,
there are graph algorithms, there's a whole range of problems
that you can solve by modeling your domain as a graph.
So yes, you could argue that you're sort of enabling knowledge discovery. But usually, well, historically, let's say,
when people in our community talk about knowledge graphs,
they tend to mean something rather more constrained and specific.
They tend to talk about semantics that may be attached to your graph structure.
So you can define in specific ways what you are talking about
when you have a concept such as a person or a dog
or a piece of furniture.
And you can also define in specific ways
what you are talking about when you attach relations
between those concepts, like a person can be the owner of a dog, for example.
That already has some kind of meaning.
You can add some kind of constraint to that relationship.
It can only apply to this concept and that concept.
It's not just a free-floating thing that can be attached to anything else. As opposed to a graph that you talked about,
social network graphs for example.
Yes, you can discover many things by modeling your social network as a graph,
things like who's an influencer or who knows whom, or how many nodes you need to traverse to connect this person to that person.
But they don't necessarily have very good definitions about the people that are modeled in the network themselves,
like their professions, or how exactly they're related. You can find, like, okay, this person knows this person, and that person knows that person, but how exactly are they related?
They could be relatives, they could be friends, they could be colleagues.
If you start specifying things in those ways, I think this is where you get into the knowledge
territory.
Yeah.
Yeah.
So, I mean, again, if we sort of like drop down to bare metal levels for a second,
then people talk about the difference
between heterogeneous graphs and just a normal straight graph.
And what do they mean when they're talking about that?
Well, I could have a graph where I have these nodes
and I connect them with edges.
And the edges are all of the same type basically.
So then, exactly as you're saying, you can go and do those kinds of pathfinding algorithms over it. But then the next level up from that is like a bipartite graph. So you could have two different types of nodes, two different types of thing in the graph, like, I don't know, people and the products that they bought. And so I think the base-level definition of a knowledge graph is when
you go up to a fully heterogeneous graph. So what that would mean is I can have all sorts of different
nodes going on in that graph, and they can have all sorts of different edge types that are connecting them together.
So, you know, I could have people and products, and people buy products, and products are shipped to people, and, you know, they are stored in a warehouse. And so as we start attaching semantic labels to the edges within a graph, and we start saying that the nodes within the graph have got different types to them,
then I think that's our entry level to say that we're talking about a knowledge graph here.
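(To make that entry-level definition concrete, here's a minimal sketch of a heterogeneous graph in Python using the networkx library; the node and edge types are just the illustrative ones from the people-and-products example.)

```python
import networkx as nx

# A heterogeneous graph: nodes carry types, edges carry semantic labels.
g = nx.MultiDiGraph()

# Different types of node.
g.add_node("alice", node_type="Person")
g.add_node("widget-42", node_type="Product")
g.add_node("warehouse-1", node_type="Warehouse")

# Edges with different, meaningful types connecting them.
g.add_edge("alice", "widget-42", edge_type="bought")
g.add_edge("widget-42", "alice", edge_type="shipped_to")
g.add_edge("widget-42", "warehouse-1", edge_type="stored_in")

# Algorithms can now take the types into account, e.g. only follow "bought" edges.
purchases = [(u, v) for u, v, d in g.edges(data=True) if d["edge_type"] == "bought"]
print(purchases)  # [('alice', 'widget-42')]
```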
Because the nature of the algorithms that you can run over that, including the machine learning algorithms, becomes much more complicated at that point.
There's stuff that you can do quite simply, at bare-level mathematics, with a graph where all the nodes are just the same thing and they've got all the same kind of connections between them.
Once you start saying, well, actually, no, some of these
nodes are different things, and the edges between them are special, different types of edges that mean something, then the complexity goes up. So you have certain algorithms which will run on that very basic one that are not going to be able to run on this more complicated situation. So I think that we could call that
kind of entry level to what a knowledge graph is. Things begin to get a little fuzzier, I think,
after that. But I don't think anybody would debate that that at least is a fair definition of the difference between a graph and a knowledge graph. And to give you a bit of my kind of backstory and history,
I guess I started with knowledge graphs about 10 years ago. I'd just seen Tim Berners-Lee's
TED talk on linked data. I don't know if you've seen that George, I guess you have, right? Yeah,
very famous one. And I was doing yet another ETL project
for a large investment bank, kind of bringing data
into a data warehouse and doing all the kind of data pipelines.
And I just kind of thought, there must be a better way.
And then I just kind of wondered, well,
could the stuff that Tim Berners-Lee, I mean,
at first I kind of thought, what does the guy want?
He's already invented the web. And now he's after like a second bite, coming and talking linked data to us, you know. But then I thought I'll kind of give it a try and just did like an under-the-desk
little project to start looking at whether I could kind of connect some of the trades together in
order to feed into this data warehouse, kind of like a pre-layer, I guess, to the data
warehouse, parsing these semi-structured trade files. I sort of expected it to fail, but it didn't,
and then slowly I became more and more obsessed with it, and I'm spending less and less time on
my day job and more and more time on this little sort of secret side project underneath my desk, with my wife saying to me, you know, Tony, you've really got to stop doing this. You're gonna get
yourself fired. But, you know, the kind of bug had me and I, anyway, kind of a long story short
there after sort of several twists and turns, I ended up kind of showing it to the boss there.
And they gave me the opportunity to then go and try
this stuff out for real.
I ended up linking the whole of the FX trade population,
entire trade population, connecting it into client data.
And we were doing stuff for
finance and closing off regulatory issues with that. And then went and worked on the trading
floor with the traders building knowledge graphs, which was super exciting because we were doing
like knowledge graphs that were driving P&L and risk systems. So it's like, if the system makes a mistake in the middle of the night, then you've got people on the phone, because it's absolutely crucial: get a decimal position wrong, or have some of those kind of links going in your graph wrong that it's pulling off of, and it had very big economic impacts if you made like one small mistake, because there's such vast sums of money coming through this particular trading desk. So that was super exciting. And by this point, I'm completely obsessed. I don't care about anything else apart from these graphs now, you know, that kind of phase that people go through where suddenly you see graphs everywhere, and everything's a graph now. So I had completely crossed through that. And then,
but I started thinking about it in terms of, well, all right, it's working here within this situation,
but really, what would be like totally cool is if everything was connected together, like all of the
information within this large investment bank, why couldn't it all be connected together? And
and then I started thinking about, well, it's kind of like a distribution problem.
So putting it all into one big graph database, you're never gonna be able to do it that way.
So I sort of became interested at that point
in this idea of taking the actual linked data principles and applying them within the context of one organization.
And I was kind of hoping at that point,
this sort of bottom-up movement would work there.
And to a certain extent it did,
but it didn't kind of get to the point
where like the whole organization was willing
to really buy in whole scale,
which by now I was sufficiently obsessed
that that's what I wanted.
So I moved to a different organization, sort of another tier one investment bank, and architected their knowledge graph from the top down. So that was working with a team of other people there, you know, great ontologists, people who knew about linked data, all kind of gathered together. And we built a successful project going from the top down there and kind
of out of that really established some of these kind of architectural patterns that allow you to
do the distributed piece. And I guess I should say, just to establish the kind of data layer part of it takes quite a long time.
And I'd sort of before where I was at the first bank, I had done it using these things called GNNs, which I know you'll be familiar with.
But for anybody listening, they got graph based neural networks, graphs, graph specific ones.
And I just kind of come to the point in the second bank that I'm at where I'm like, OK, I'm
going to start putting in the AI layer now.
And at that point, it was very early in the large language
models coming out, like the initial versions of the GPTs
were starting to come out.
So I thought, there seems to be some buzz within the community
around this. I'm just going to kick the tires on what happens when, you know, you try to use that model to do the graph stuff, as opposed to doing it all with GNNs.
And it was clear like straight away that this was pretty exciting how these two technologies
were working together. So then basically basically I just kind of dropped everything else
and just focused in on that particular thing
of like how do you use the large language models
with the graphs and like how, you know,
how is that actually gonna kind of work out?
And then the most recent evolution,
I should say, through this time, because I'm sort of passionate-stroke-obsessed with this idea,
I've been sharing stuff on LinkedIn
and people were finding my posts interesting on LinkedIn; I've become known as the Knowledge Graph Guy there. So the kind of latest evolution
of what I've done is set up a consultancy called the Knowledge Graph
Guys, which is an attempt to kind of bring together seasoned people like yourself that really know
about knowledge graphs and we can kind of come together and enable as many institutions to
embrace these patterns to be able to use generative AI in a safe way
and to be able to connect their data together. So the mission is to try to disseminate this
technology as widely as possible in as short a time as possible. I don't know, what's your read on that?
I'm thinking, well, one of the things you mentioned in your introduction was Tim Berners-Lee and his talk on linked data, and the whole concept of linked data.
And I think it's an interesting starting point, and I have to confess it was also a kind of aha moment for me as well, even though I got into the whole knowledge graph scene, let's say, a bit earlier than that,
even though it wasn't called by that name at that time,
it doesn't really matter.
It has kind of changed lots of different names through time.
Linked data was one of those names. And hence, by the way, the name of my own brand there, Linked Data Orchestration.
Yeah, we're on the third iteration now, I mean: Semantic Web, linked data, now knowledge graph. Yeah.
Exactly. And probably, you know, somewhere along the way, there's going to be something else as well. But what it comes down to is basically the same kind of ideas,
the same kind of principles.
So I would say, yes, I think what you described
is a good starting point.
This is where you get into the knowledge graph territory.
But how do you connect that with the linked data principles,
which are actually very
specific? Yeah, so linked data in particular, because, I mean, it's kind of gone through those definitions, so it's maybe worth just doing a little bit of a history lesson there. So if we go back to the Semantic Web, as this stuff was originally conceived.
And there, that was something which came out of the AI community and was what people would
now call rather disparagingly good old fashioned AI, like knowledge systems.
And we had things like inference engines, and it was about kind of gathering facts and then inferring what the classes are, having predicate logic, like symbolic, deductive logic, which an old-fashioned, Turing-machine-style computer could go and run, do computations with, and come out with answers.
So that was kind of like the first incarnation.
But it was called Semantic Web because it was also tied up with the web movement.
And maybe we can kind of come back to some of that because it kind of comes into the ontology piece, which is interesting.
But to the specifics of the linked data, because for me, actually, what the linked data was about was Tim Berners-Lee
coming back in and just like saying, okay, look, all of the other stuff, that's fine.
We were in the middle of the AI winter at this point. So like people were kind of like
dismissing it because AI was kind of out of fashion. It had been this kind of big failure.
These neural networks that everyone was banging on about, they completely don't work. And AI is just a complete damp squib.
So in that context, what Tim Berners-Lee did with linked data,
in my opinion, was do the decentralization and distribution
piece.
So he was like, let's refocus back on some of
the principles that drive the web, which is essentially that it should be like a decentralized
process. We should be using the same mechanism of the web with HTTP and URLs. So now what we're
kind of saying is, okay, well, each of these nodes in the graph, what we're going to do is we're going to connect them together by
just using URLs. So like if each node now is a URL, like a web page on the web, then I can now put
a hyperlink between these two edges of the graph and I can go and navigate off to that other piece
of data, which means that now I can like store these,
I don't have to store these graphs all in one big database, I can now sort of distribute the
process out, which is obviously something I'm very interested in because that was part of my thing
of like, okay, well, like, how are you going to like link whole organization together? Clearly,
this thing's got to be decentralized and distributed. So that's really, that's really
what the link data thing did.
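(A minimal sketch of what following one of those links looks like in practice, in Python; the DBpedia URI here is just a public example of a linked data node that serves structured data back, assuming the endpoint supports content negotiation.)

```python
import requests

# A node in a linked data graph is identified by an HTTP URI.
node = "https://dbpedia.org/resource/Tim_Berners-Lee"

# Content negotiation: ask the server for data (Turtle) rather than an HTML page.
response = requests.get(node, headers={"Accept": "text/turtle"}, timeout=30)

# What comes back is structured data about the node: triples whose objects
# are themselves URIs we could dereference next, hopping across databases.
print(response.status_code)
print(response.text[:500])
```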
But then I guess it's kind of interesting to say, well,
how is a link in linked data different to just a straight up
hyperlink on a web page?
So well, the first difference is obviously
that when you're going to that link, you're getting data back.
You're not getting an <a> or an <h2>, like a way of doing a visualization; it's almost like, oh, I go to the end of this hyperlink and I get back the web equivalent of an Excel spreadsheet. I've got structured data that's coming back there.
So I guess that's one difference.
But the other big difference is hyperlinks,
they're all the same.
They don't mean anything. Whereas, you know, again, it comes back to the same thing: I've got a link, and this link goes from the person to the order that this person has made. Then I can do the same thing with the order itself, I can follow the link for order, and that will take me to a definition of the semantics, of what an order means.
And those semantics can be quite nuanced.
So what order means in one organization
is not necessarily what an order means
in another organization.
And this is, I guess we come to this later on, why it's so important with what's going on now that AI is thoroughly not in winter: being able to get that clear semantic definition of what these edges mean turns out to be a really important thing.
Yeah, yeah, indeed.
And another really important thing is the fact that
these HTTP identifiers that you talked about are unique.
So not just having a graph that's sitting somewhere in isolation,
but giving things in your graph, whether they're nodes or edges,
giving them unique identifiers means that you can have this distribution that you talked about.
So you can have a piece of data somewhere in a database, and just by giving it this unique identifier, you can aggregate it with another piece of data sitting in some other database, somewhere in a totally different organization even. You can have a sort
of virtualized view over this entire network. Yeah, so globally, yeah, you're creating something
globally unique, because that's exactly what the DNS system is doing for you. And that obviously
is a battle-hardened technology. And more than that, it's a technology and approach that has been proven to be
successful. You know, the web is the greatest effort that humanity has done. Or at least, maybe now the large language models, which have basically taken that web and compressed it down, are the next evolution; maybe they're doing something even more with that. But as far as humanity's collective effort to bring its
knowledge and information together, the web is a magnificent effort in that
regard, you know, whatever your feelings about whether it's gone right or wrong
and certain of the initial dreams of it may have sort of gone a little bit awry.
I don't think even Tim Berners-Lee himself would disagree; well, I know that he wouldn't disagree with that statement.
But regardless of that, in terms of a proven mechanism
for lots of different people, a whole world of people
being able to connect their information and knowledge
together: HTTP works. Enough said, it's just a proven, battle-hardened technology.
So we're just taking that exact same proven battle-hardened
technology and then applying it now
to the underlying data concepts underneath.
It's like a no-brainer.
And I probably shouldn't say this, but I actually sometimes get a little bit annoyed with people reinventing the wheel. You know, you see some new flashy term that comes up and gets hyped up, and it's like, we really don't need to reinvent the wheel here. We have a really good technology
that's proven for solving this distributed data integration problem.
You're going to struggle to come up with anything better than this that has got
as much, you know, all of the software that's
been written around it and the security protocols that
are over it, the caching optimization,
the fact that there's libraries in every language that you could want, this bare metal stuff that's optimized for it. So why go off and try and approach the problem from a different angle?
Well, why we don't always rely on it, that's a very good question.
But, you know, pragmatically, these approaches do exist out there.
And so I guess the question then becomes,
all right, so these different technologies are out there,
and people are using them.
So does it really matter?
I mean, yes, we've just talked about how great it is that this kind of link data approach
uses these standardized technologies and the fact that by doing that,
you can have these virtualized graphs and that's really a great approach
to do data integration
and to solve your data problems.
That said, however, if you're not
interested in standardization, and if you're not
interested in having this sort of distributed approach,
is it still OK to use some other kind of graph approach?
Is that a knowledge graph? Does it matter in the end?
I mean, no. My initial answer is no.
And I think it's very important to be pragmatic and not dogmatic.
And to a certain extent within
our community, within the graph community, you know, they've got like a history of people being rather dogmatic. And I think that's not very helpful. I think if you boil it down, there are two things that are important: I think using URIs as identifiers is important, and I think having a shared vocabulary and an agreed schema. You know, those are the two
important things. And beyond that, really, the rest of it doesn't matter so much. But I think
if you're a really small organization, you know, actually just a plain old data warehouse
might well do your job for everything that you need.
But as soon as you kind of like step up to an organization
that's got multiple databases going on,
my hypothesis that I'm running is that the siloed approach of, I'm just going to have a bunch of separate databases, each one looking after a particular part of the business,
and then I'm going to maybe link a little bit
of the information together in order to do
some finance reporting and send some information
up to the chief executive.
That's not going to cut it in my opinion in the age of AI.
So there's a kind of fundamental massive event on the horizon,
which is maybe sort of overhyped on the short term,
but is like massively underhyped on the long term.
Like people, in my opinion, most people have got no idea what is about to hit them.
It might not be coming quite as soon as some of the tech bros would lead us to believe,
but actually some of my timelines are shortening now with the speed at which things are accelerating here. So we're about to live, at least in the next 10-year
timeframe, in a fundamentally different world.
And it could be shorter than that.
And in that world, I believe that an organization
can no longer afford to have its data
in this kind of loosely connected, siloed,
disorganized state. Because ultimately the value, the currency of what's going to be going on when vast intelligence is available on tap, is actually going to come back a lot to what the unique semantics of that business are
and the unique data that that business is holding.
And to get your head around what that really is,
it's very, very important to connect it
and to do the hard work of working it out.
I guess we sort of touched slightly
on how we're going to give meaning to these edges.
We're going to have different things in the nodes.
And we talked about the semantic web
and some of that stuff that was in there.
We mentioned the O word for the ontology.
But when we talked about the importance of the shared schema, to bring that out and make that all a bit more concrete for people:
This is data modeling.
This is metadata.
This is the abstract concept, if you like.
It's the schema.
But "schema" really does not necessarily do an ontology justice, because we've got complicated ways of modeling inheritance hierarchies, making property-defined classes, or going back the other way, back down.
And what that does, especially with graph structure,
which, as you said yourself, is kind of like a mind map.
So the name of the game is to get
as close to the semantics of the business as you possibly can,
but within this kind of formal conceptual model.
So you're trying to take the words that the business people
are using within a given organization.
And you're trying to turn those into these formal concepts
to kind of like actually get really specific
about what they are,
and then to connect those concepts together
in the way that the concepts interrelate with each other, like as a specific type of edge.
It's not an easy process to do; actually, generative models can somewhat take out some of the donkey work in that and make it easier to do.
Right, so, yeah, I think we are kind of gradually approaching the O word, the ontology world.
And I know that one example that you like to use to sort of onboard people to the idea of data modeling and all of that jazz is schema.org, and how schema.org has been doing that for the web that we just talked about. Would you like to just give us the brief version of that, and how that all ties back to what you have been saying about the need to integrate and define data on the organizational level, and how that's going to serve the goal of having your data inform your AI models?
Yeah, that's a great point.
So, yeah, schema.org is something that's out there
on the web, and it's what the kind of web companies
have been using to build their own knowledge graph.
So you can just go to it, like schema.org,
tap into your browser, and you'll see what it has got
is the definitions from various different concepts
that we talk about on the web.
So for instance, there'll be the concept of a dentist identified there. And the dentist is a type of service provider, which is a type of local business, you know, with whatever kind of inheritance hierarchy they've got going on there. And because of that, it will have things like opening hours and the services that it offers. So they've basically gone to the effort of defining the semantics around what it is a dentist is
and the sort of properties that you would
expect to get back from there.
And then if I am a dentist and I put up my website,
then what I can do is in my website
I can include this little island of JSON-LD,
which is just like JSON, but it's
got a couple of special attributes in it.
JSON, for anybody non-technical, is just a way of representing data.
So it's a text way of representing data that's incredibly common
and ubiquitous amongst developers.
And you can take that normal JSON, and then you can put in this special @context tag, and the @context tag is pointing back to schema.org.
So now I can have my little bit of JSON data.
And it's got the @context tag pointing back to where the ontology or the schema is lying, which in this case is schema.org. And then it's got an @type, and it's saying, well, this is a dentist.
And then I've got various properties there.
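(Here's a minimal, illustrative version of what such a JSON-LD island looks like embedded in a web page; the dentist's details are made up for the example, but @context, @type and the property names are standard schema.org usage.)

```html
<!-- An "island" of JSON-LD inside an otherwise ordinary HTML page -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Dentist",
  "name": "Example Dental Practice",
  "openingHours": "Mo-Fr 09:00-17:00",
  "telephone": "+44 20 0000 0000"
}
</script>
```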
So then what the web crawlers are able to do is, when they hit that web page: I'm going to index this web page now on my Google search index, and I see this island of JSON-LD there, and I go, okay, this is a dentist, I know exactly what that is, and this is structured data that I can then take and suck in. And that's kind of what's driving those info boxes on the side of Google search. And most people don't know this, but actually almost half of the web has got these islands of JSON-LD in it. So this is not an unused technology or something that's difficult for people to do: half of the pages on the web. Nobody normally talks about it, but it is a very widely used technology. And what it is, is a shared schema.
So there's something quite powerful about that.
There's something quite powerful about that concept.
Because we've got the URLs, and that
means that we can go and resolve them over the web
to someone else.
So schema.org is hosting their definition
of what these concepts are.
I can follow the URL to schema.org, I can follow the URL of what dentist is, and I can go and get this definition there of what dentist is. And what that means is that the cost of the data integration has been taken from the person who's aggregating the data, and passed on to the person who is providing the data.
So Google and Microsoft don't pay the cost of doing the data integration.
The dentist pays the cost of doing the data integration because he wants to come up higher
in their search indexes.
And that is the trick, in my opinion, that pretty much all organizations need to perform.
So they need to have their own internal version of Schema.org that's got their own specific semantics within it that reflect the semantics of
their particular niche and their particular business. And then they need to
try to get as much of their organisation as possible to buy into producing and
publishing their data in conformance with those shared semantics. And because you've
got things like inheritance, you know, I can have a slightly different version of what my dentist thing is, but I'm going to inherit the characteristics of this shared definition. People don't realize it, but that is like a massive magic trick.
And it's something that has not been performed yet really, because, you know,
even if we're talking about like creating a data mesh
and creating data products, that's great.
That's decentralized.
But what you're really wanting to do
is you're wanting to decentralize the effort
of doing the data integration.
That's the nub of where the hard part of the problem is here,
is kind of linking stuff together and conforming
to some shared terms.
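(A minimal sketch of that inheritance idea in Python, using the rdflib library; the internal namespace and class name are hypothetical, just to show an organization extending a shared schema.org term with its own semantics.)

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

SCHEMA = Namespace("https://schema.org/")
ACME = Namespace("https://ontology.acme-corp.example/")  # hypothetical internal schema

g = Graph()
g.bind("schema", SCHEMA)
g.bind("acme", ACME)

# The internal concept inherits from the shared, public definition...
g.add((ACME.PediatricDentist, RDFS.subClassOf, SCHEMA.Dentist))
g.add((ACME.PediatricDentist, RDFS.comment,
       Literal("A dentist as we define it internally: treats patients under 18.")))

# ...so instance data published against the internal term still conforms
# to the shared semantics that everyone else can resolve and understand.
g.add((ACME.clinic42, RDF.type, ACME.PediatricDentist))
g.add((ACME.clinic42, SCHEMA.name, Literal("Acme Kids' Dental")))

print(g.serialize(format="turtle"))
```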
And just briefly, why is that important then with AI?
And we can kind of get onto it a little bit more.
But you're creating your training sets for AI there.
You're defining the nature of your business in a formal way.
You're then connecting instances of data
to those formal semantic concepts.
So you're doing the job that needs to be done in order to get yourself through.
Once this massive event comes in of whatever it looks like with this
kind of general AI event comes through, the organizations that are going through
to the other side of that, in my opinion, are the ones which have done this task.
And we've got this window in order to do it.
Yeah, actually, I think part of the reason why some organizations, at least, are lured by the
promise of, well, AI is going to solve our problems is precisely because they believe that, well,
all we need to do is just feed our data to the AI,
the magic AI model, and presto, it's going to just make sense of everything.
And I think what they don't realize is that it's just going to create a mess
if they don't actually put in the upfront work that needs to be done to define what exactly is in that data
that they're going to feed to the AI model.
Because I think a very typical example is things like sales or products or customers.
If you have a dozen systems or even a couple of systems in your organization,
chances are these things are going to be repeated,
are going to be reused across different systems,
and not exactly in the same way.
So system A will define customer in a certain way,
and system B will define, again, customer in a slightly different way.
So if you feed data from both of those sources to your AI model,
what's going to come out?
Something kind of jumbled and inconsistent, right?
Yeah, yeah, I think that's absolutely it.
So there's a few things to say to it.
So, like, first of all, all organizations are going to have to accept the reality that they're moving to a more probabilistic world.
So everyone has got to start using AI,
or you're probably going to go out of business, which
means that we're moving to this new world where stuff will
be probabilistic and AI will be embedded in a lot of this kind
of decision making.
So I think that's just, you might not like it, or you might have whatever opinion, but it doesn't care. It's like some force of nature; it's happening.
So you might as well just kind of get used to it.
So then the question really becomes, well,
how do you do that in a kind of safe way?
And in my opinion, that comes through
like external verification.
So it's quite interesting if you take like,
you've probably seen like all of the hype over DeepSeek,
you know, and like the big impact that had
on the stock market with them coming in recently
and doing their version of generative model.
And so like everyone else, I was kind of obsessing with it
and going back and kind of looking at it, trying to work out what it is that they did. And they did some quite interesting changes algorithmically, but the bit that really stuck out to me
is that kind of sitting at the heart of what they did,
how they got their model to be good,
is they kind of took all of the web data
like everybody is
doing, but then they pulled out like just the ones that were related to mathematics and the ones that
were related to coding. And then with that what you can do is you can create an external verifier.
So you can basically look at the bit of the maths and then you can look at the answer at the end
and then you can check whether the answer is actually correct
and likewise with the code.
And then you can feed that to the large language model
and ask the large language model to do that,
and then check it against the external formal verifier.
And what that is basically doing is kind of putting
quality control upon the probabilistic model.
So, this is where I talk about the continuous world and the discrete world.
So you kind of, in the continuous world, everything's probabilistic, everything's fuzzy.
And that's where these kind of generative models are.
Like one thing blends into another and you get hallucination,
but you flip hallucination on the other side.
And there's something a bit like creativity there.
It's, you know, able to explore off to these kind of different places, you know,
Talking back to the old world of the semantic web, we had this thing called the Cyc project, which failed because it was trying to do complete general knowledge in a formal way.
And it was just like it was an absolutely horrendous task.
This succeeds at that.
This succeeds at what that was unable to do because
of its probabilistic nature.
But it has got these dangers.
If you want to go and use one of these systems to do something within the financial domain, or in the medical domain, absolutely in the legal domain: there's no way you're going to let something like that loose to go and do those things.
So then the question comes: OK, well, for maths and coding, it's quite obvious what your formal verifier is, you know, because with maths you've got the answer there, or you could even go out to some kind of system like Lean or something like that, where they've actually got the axioms in there. I don't think that the DeepSeek team did that, but others are. With code, you can see if it compiles.
You can maybe even get the LLM to write a couple of unit tests
and see whether the answers are coming out right.
You're using the code compiler as the formal logical system
there.
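(A minimal sketch of that quality-control idea in Python; `ask_llm` is a hypothetical stand-in for whatever model call you use, and the "verifier" here is just Python's own compiler plus a known-answer unit test, assuming we asked the model for a function named `add`.)

```python
def verify_candidate(source_code: str) -> str | None:
    """Use the compiler and a unit test as an external, formal verifier.
    Returns None on success, or an error message to feed back to the model."""
    try:
        compiled = compile(source_code, "<candidate>", "exec")  # does it even compile?
        namespace: dict = {}
        exec(compiled, namespace)                               # define the function
        if namespace["add"](2, 3) != 5:                         # known-answer test
            return "add(2, 3) did not return 5"
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"

def generate_verified(ask_llm, prompt: str, max_tries: int = 3) -> str:
    """Loop: probabilistic generation, then discrete verification."""
    feedback = ""
    for _ in range(max_tries):
        candidate = ask_llm(prompt + feedback)   # fuzzy, continuous side
        error = verify_candidate(candidate)      # formal, discrete side
        if error is None:
            return candidate
        feedback = f"\nYour last attempt failed verification: {error}. Try again."
    raise RuntimeError("No candidate passed verification")
```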
So you have this really interesting thing then.
Well, how is a given organization going to do that?
Well, at the base level, again, let's go back to bare metal.
So we go back to bare metal first principles in our probabilistic, fuzzy world. If we are mathematically modeling probabilities across multiple different factors, which we would call dimensions, then we create this vector.
People talk about it like vector embedding, so I've got my vector database.
When people are talking about that stuff, that's what they're talking about.
A big long list of floating-point numbers: that's essentially what you get behind a large neural network. When they talk about the weights of the model, that's what it is, these big long lists of floating-point numbers. And that's the continuous world. Let's not talk about the trillion-parameter models that we've got with these giant things; let's simplify things for ourselves and imagine that we're just down in two dimensions. Well, it's quite simple now, and we can think of it like coordinates on a map. I've got my latitude and my longitude, and that's going to take me, depending on the scale of the map to be honest with you, to some region within that space.
And that's essentially like what we're getting with the kind of vector embeddings.
And the structure that we would represent that with would be a vector, or sometimes
tensors and stuff like that. If you're going to do discrete mathematics on the other hand,
like if you went to a mathematician and said, how are you going to do like discrete mathematics, well like 99 times out of 100, you would use a graph structure to do that.
The graph is the structure that you would turn to in order to do things discretely. And what do we mean by discretely? Well, you're basically putting boundaries around stuff.
It's no longer probabilistic and fuzzy.
It's like, OK, there's this thing.
It's connected to this other thing.
It's connected to this other thing.
I can compute over that.
I can validate over that.
I can do formal verification over that.
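(A toy illustration of the two worlds in Python; the two-dimensional vectors and the little graph are made up, but they show the contrast: fuzzy similarity on one side, yes/no checks on the other.)

```python
import math

# Continuous world: vectors, where closeness is a matter of degree.
mortgage = (0.9, 0.2)   # toy 2-D "embeddings"
bond     = (0.8, 0.3)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

print(f"similarity: {cosine(mortgage, bond):.3f}")  # fuzzy: ~0.99, "pretty similar"

# Discrete world: a graph, where a fact either holds or it doesn't.
edges = {
    ("MBS", "is_a", "Bond"),
    ("Bond", "is_a", "FinancialInstrument"),
}
print(("MBS", "is_a", "Bond") in edges)    # True: checkable, verifiable
print(("MBS", "is_a", "Equity") in edges)  # False: a claim we can reject
```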
So now let's zoom right back up from the bare metal
to the other thing.
We've got our ontology.
We've got our graph.
We've got our large language model.
We've got the fuzzy probabilistic, scary but kind of almost human creative hallucinating thing over here.
And then we've got our kind of like discrete computable, validatable part here, and stick
the two together. And this is exactly why knowledge graphs are zooming up the hype cycle at the moment, with Gartner identifying them as being the enabling technology for GenAI.
Well, you know, playing devil's advocate here, someone might tell you: great, that all sounds great in theory, but how does that actually work out in practice? Right, I'm sold on the idea.
I'm going to start creating my own organizational ontology.
And then what? What can I actually do with that?
How can I use it to interact with an AI model in a meaningful way?
And I'm going to cheat here a little bit by sort of previewing your answer.
I've seen you advocate a concept called the working memory graph,
which I guess is a way of conceptualizing this interaction.
So I'm going to let you talk through it,
and then I have a couple of specific questions
on how that plays in practice at this point in time.
Yeah, okay.
So the idea of the working memory graph is that you basically...
Let's assume that the ontology has been created and the knowledge graph has been created,
which are two big assumptions and it might well be worth looping back to the fact that in order
to create those we use generative AI not to create them on their own, but with humans in the loop you
use it as a tool to lower the effort of doing that.
But zooming past that and assuming that that bit is in place now.
So I've got a good ontology, which is like a formal conceptualization of my domain with the key concepts within it.
And I've got a knowledge graph, which is connecting to those concepts and giving me kind of facts about it.
So now, the user could ask a question. Just like you've got ChatGPT, you're working within a given organization, and you bring up your semantic agent, your AI agent, with your private version of a large language model, a generative AI model, coming up there. And you say to it: read through this document and give me the probability that the mortgages, the MBSs mentioned in it, are likely to default, or something like that. Or, depending upon your domain, whatever that is, you know, some highly technical
thing that is relevant to what you're doing, but a general large language model would not be able
to answer because it has been trained on all of the general information in the world and it's
not got the specific information about your organization
and it hasn't got the specific know-how and knowledge of your experts. So what the large
language model can do is it can look at that query and understand that query and it can do this thing
called RAG, and then this thing that we call GraphRAG, which takes various different flavors,
but in the conception I'm gonna give you
with the working memory graph,
the first thing it's gonna do
is actually just go off to the ontology
and it's going to try and identify the concepts
that are being mentioned in here.
I see you're talking about an MBS. I know what an MBS is, it's a mortgage-backed security, it's like a type of bond. I know about the semantics, yeah, I know about them.
And then you're asking for like default.
Okay, yeah, this is our definition
of what default is, blah, blah, blah, blah.
Are there any facts around the graph
that are maybe relevant to doing that?
When I'm looking at this document, what properties am I expecting to pull out that I know MBSs generally have, and that may be relevant to do this thing?
Now I end up with the situation with the working memory graph.
I've got the large language model at one side,
and now I've got like a subgraph of discrete facts on the other side.
And now I can basically kind of loop between these two structures.
So like going from the formal verifiable version
down to the using the large language models.
The large language model is going to kind of do exploration and creativity.
The graph is gonna give you like discrete formalization
and then kind of giving yes, no answers, maybe a "you're slightly on the wrong track here". So, imagine a different formal system, take a mathematical verifier. So imagine now over here we've got a mathematical verifier, like AlphaGeometry. In AlphaGeometry, basically you had like a custom-trained large language model which is kind of coming up with ideas: oh, I want to solve this geometry problem; okay, here's my idea for solving this geometry problem; and now I'm going to send it into the formal verifier, which
is going to compute whether that answer is correct or not.
And then it's going to reply back to the large language model
saying, no, you're off by this because such and such
don't work out.
And basically, what we're talking about here
with the working memory graph is a domain-specific version
of that.
Harder to do, much, much harder to do. But basically, I think the only game in town, I mean, I'm biased, obviously,
but the only game in town I see for all organisations that want to survive the event
is to kind of get really crisp on what these semantics are, connect your data, be able to
do that kind of formal verification.
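(A minimal sketch, in Python, of how that working-memory-graph loop might be wired up; every function here, `match_concepts`, `fetch_subgraph`, `ask_llm`, `extract_triples` and `validate_against_ontology`, is a hypothetical placeholder for whatever ontology store, graph database and model you actually use.)

```python
def answer_with_working_memory(question: str, document: str, max_rounds: int = 3):
    """Loop between the continuous side (LLM) and the discrete side (graph)."""
    # 1. Ground the question: find the ontology concepts it mentions (e.g. MBS).
    concepts = match_concepts(question)          # hypothetical ontology lookup

    # 2. Pull the relevant facts into a small "working memory" subgraph.
    memory_graph = fetch_subgraph(concepts)      # hypothetical KG query

    for _ in range(max_rounds):
        # 3. Continuous side: let the LLM explore, steered by the formal context.
        draft = ask_llm(question=question,
                        document=document,
                        context=memory_graph.serialize())

        # 4. Discrete side: turn the answer back into triples and validate them
        #    against the ontology's definitions and constraints.
        claimed_facts = extract_triples(draft)   # hypothetical parser
        violations = validate_against_ontology(claimed_facts, memory_graph)

        if not violations:
            return draft                         # verified, return it
        # 5. Feed the violations back and loop again.
        question += f"\nThese claims failed validation, try again: {violations}"

    raise RuntimeError("Could not produce a validated answer")
```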
All right, so a couple of questions. Is it clear or is it not so clear?
So, yeah, a couple of questions here.
So what you're talking about is a kind of two-way relationship
between building your ontology and knowledge graph
and using your large language model.
So one of the ways that you see this interaction happening is using an AI model
to help you create an ontology and a knowledge graph.
And so my question there is, what has your experience been in the current state of affairs there. And to kind of preamble this, my own experience has been kind of so-and-so, let's say.
I mean, yes, you can use... I've tried a couple of different options,
not all of them, admittedly, so I may be missing something,
I have to say this upfront, but in my experience,
the kind of tools that I've used have been able to get something out of...
You feed them a lot of textual documents, basically,
and you expect them to give you something that resembles a kind of coherent data model
for the domain that your documents are describing.
My experience has been kind of hit and miss.
Yes, they are able to extract some kind of concepts,
some kind of relationships, but what you get out of the box is really not usable.
It's not really production level.
You have to put in additional effort to be able to come up with something that's usable.
So I kind of question whether it's actually worth doing that
in the first place.
Maybe you're better off just starting doing everything
manually.
I don't know what your experience has been.
Yeah, I mean, there are some tricks, which would be too detailed to go into now.
But my experience has been very positive, actually, and only
getting more so as the models, particularly with the reasoning models that are kicking
in now. Yeah, it's just, it's kind of just getting better and better. So no, I think
it's good. Is it 100% of the way there? No, it's not. And actually, that's a good thing. Because if you're simply
able to take an AI model, show it one of the documents and say, tell me everything that's going on in this document to the level of expertise that the people inside a given niche of a business are able to do, that means the value of your business has just dropped down to the price of a call to one of these models. So like $0.01 or something, or even less now that we've got DeepSeek, et cetera, out there.
So, um, this is a good thing.
So what the large language model is able to do, in my opinion, is take somewhere up to 80 to 90% of the donkey work effort out of doing this thing. And then the rest does come down to the experts.
But basically, if you flip over to the machine learning side, let me talk about this thing of the latent space. So this is the part within the neural network: you've got the input layer, you've got the output layer that comes off the other side, and in between that you've got the latent space, which is this kind of black box where the machine learning model is learning what it's learning and has got these latent representations, which, you know,
there's a lot of kind of debate that people go, you know, do these things have like world models
and do they have like conceptual understanding? But to my mind, they do create these abstract
concepts within there. They have to do that because they're basically compressing large amounts of
information, the whole of the information on the web and more.
They're compressing that down to quite a small number
of floating point numbers.
In order to do that compression, they
have to compress some of the semantic information.
So they have to compress information
about some of the concepts.
But they do it in a way that's completely uninterpretable by humans.
And we would never be able to begin
to understand how that's done.
I mean, just as you wouldn't look into someone's brain
and look at their neurons and understand what they're doing.
So using the large language model, showing it an instance of the data and getting it to do a first stab at the conceptual ontology, is kind of a bit like looking into the latent space of the large language model. You're getting to pull out some of those concepts and put them into a discrete form, and see the network of how they link together with each other. And once you've got that, that's when the domain experts can come in.
And this is where the business is adding its own value. So this is in some ways a really important
piece, because the bit that the large language model can do on its own is kind of like the gestalt knowledge that everybody has already got. So that's the bit that's worth $0.001. Now we come to the bit that's actually going to be worth something post the full-on AI event, which is not really coming in like that, it's going to be more of a gradual process. But post that coming, the bit that the large language model can't
do and isn't part of the general knowledge of the world is the part that's actually got the value.
And that's where your people can come in and they can make the changes in there and say, no,
actually it's much more nuanced than this, or it's got this conception wrong, it's like this. So that's really valuable work
that's being done at that point and shouldn't be degraded. But who wants to do the boring bit that
everybody else knows? You know, you only want to kind of do the bit where you're the specialist
and you're adding some value and that's how these two things, well really it's three things
working in conjunction. It's the large language model,
taking the gestalt knowledge and doing a load of the donkey work. It's the formalization of the model
in what we've got with the technology of ontologies and knowledge graphs, and then of course it's the
humans that are able to kind of work in at this kind of discrete conceptual level and go and make
their changes, which are then fed back into the large language model. And now when you say, oh no, here's this nuanced understanding of what this concept actually is,
the next time the large language model reads that document, it's able to refer back to your
conceptual models and now it knows what you would know. You've distilled your knowledge into this
very crisp format, formal format. And then when it comes back out the other end, if it's made a
mistake or hallucinated or got something wrong, your model can catch it.
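To make that concrete, here is a minimal sketch of what such a first stab at ontology extraction might look like. It assumes the OpenAI Python client purely for illustration; the model name, the sample record, and the prompt are all hypothetical, and any capable large language model could stand in.

```python
# A minimal sketch of LLM-assisted ontology drafting. The model is asked
# to propose a draft conceptual ontology, in Turtle, from one sample
# record; domain experts then review and refine the result.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A sample instance of the organization's data (hypothetical record).
sample_record = """
trade_id: T-1001
instrument: FX Forward
counterparty: ACME Bank
notional: 5000000 USD
maturity: 2026-06-30
"""

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {
            "role": "system",
            "content": (
                "You are helping draft a conceptual ontology. Given a "
                "sample data record, propose the classes and relationships "
                "it implies, expressed as RDFS/OWL in Turtle."
            ),
        },
        {"role": "user", "content": sample_record},
    ],
)

draft_ontology_ttl = response.choices[0].message.content
print(draft_ontology_ttl)  # a first stab for the domain experts to correct
```

The point of the sketch is the division of labour described above: the model does the generic first pass, and the Turtle output gives the experts a discrete, editable artifact in which to encode the nuances only they know.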
Right, so that's precisely what my next question was going to be about. Fine, so let's assume that one way or another you've gotten your ontology, and then you're trying to use it to guide, to steer your language model, to provide some context towards getting an answer that contains less hallucination, let's say. Again, in my experience, and I also have to say that I've seen people use various flavors of instructions or other constructs and refer to them using the term ontology, which in my opinion is a bit confusing. But that pet peeve aside, let's assume that through whatever route you've gotten to the point where you have an ontology, in whatever form that ontology may actually take. In my experience, it still doesn't entirely rid you of hallucinations. You still have to put in some work to verify that whatever it is you get back actually holds. What's your experience been, and how do you see a way out of it?
The same, and here's the way out of it.
So I call this pattern the neural symbolic loop. I've got all of these strange names, like working memory graph and neural symbolic loop. In a way, the working memory graph is: OK, this is how I'm going to do the answering of the question. The broader pattern that it sits within is this thing I call the neural symbolic loop, which is that you're just iterating between continuous and discrete structures until you get something that you're happy with. So we've already been one way through the loop: take the formalized concepts, put them into the context of the large language model, and allow the large language model to do its job, because you're steering it with those formalized concepts. Then the information comes out of the large language model, and we can convert it back into a discrete form, a graph. And once we've got it in the discrete form of a graph, we can use the ontological model that we've got there, plus the factual data that we've got in the subgraph, to validate that it makes sense.
I put constraints in on some of these relationships. I'm trying to think of a good example off the top of my head, but, you know, a dog can't have six legs, or something like that; we can say all mammals have at most four legs. So you've put in whatever the constraints are, and then you can do that constraint checking against what's coming out of the model. Has it made up something that breaks a constraint? You can then block off paths that are obviously hallucinations.
Okay.
And then you can feed back: I'm sorry, large language model, you seem to have said that this dog has got six legs. That's not correct. Dogs have a maximum of four legs. Try again.
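To make the validation step concrete, here is a minimal sketch using rdflib and pySHACL. The vocabulary (ex:Dog, ex:legCount), the shape, and the extracted triples are all hypothetical, mirroring the six-legged-dog example; this is one possible realization of the pattern, not a prescribed implementation.

```python
# A minimal sketch of the symbolic half of the loop: triples extracted
# from an LLM answer are validated against a SHACL constraint, and a
# violation is turned into feedback for the model to try again.
from rdflib import Graph
from pyshacl import validate

# Ontological constraint: a dog has at most 4 legs.
shapes_ttl = """
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:DogShape a sh:NodeShape ;
    sh:targetClass ex:Dog ;
    sh:property [
        sh:path ex:legCount ;
        sh:datatype xsd:integer ;
        sh:maxInclusive 4 ;
    ] .
"""

# Triples converted back from the model's answer (here, a hallucination).
data_ttl = """
@prefix ex: <http://example.org/> .

ex:Rex a ex:Dog ;
    ex:legCount 6 .
"""

shapes = Graph().parse(data=shapes_ttl, format="turtle")
data = Graph().parse(data=data_ttl, format="turtle")

conforms, _, report_text = validate(data, shacl_graph=shapes)

if not conforms:
    # Feed the violation back into the model's context and iterate.
    feedback = (
        "Your answer violated these constraints:\n"
        f"{report_text}\n"
        "Try again."
    )
    print(feedback)
```

Each pass through the loop converts the model's continuous output into a discrete graph, checks it against the formal model, and either accepts it or sends the violations back as context for another attempt.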
Right.
Okay, so, yeah, verification is still needed, but your suggested way out of it is: you translate back to something that is verifiable, you do the verification part, and if the verification fails, you go again and try to course correct.
Alright, I think we're almost running out of time; we're slightly over already, in fact. So let's wrap up with something related to the most hyped topic of the day. You already touched on it: DeepSeek and reasoning, or so-called reasoning models.
Again, there's a bit of a disconnect in terminology here, because what people in the knowledge graph community have traditionally meant when they use the term reasoning is something tractable, deterministic. You already talked about that in one of your opening remarks. As opposed to that, what these so-called reasoning models do is something quite different. I've seen you support the statement that eventually these models will be able to mimic reasoning so well that it will be indistinguishable from actual tractable reasoning. I don't know if we'll ever get to that point, but I think it's an interesting prediction anyhow.
And just to play devil's advocate again: as you also said at some point, the way these models are trained is basically by extracting verifiable statements out of otherwise very big and potentially messy training datasets and focusing on those. Their claim to reasoning fame, so to speak, is the fact that they score very well on benchmarks specifically designed to measure reasoning capabilities. However, I've already seen people do research showing that if you vary slightly the very questions included in those benchmarks, those models will fail. So I don't know what that tells us exactly.
Yeah, that's a really good one. And this opinion of mine is not at all popular in the rest of my community; it goes down like a lead balloon whenever I say it within the graph community. So, alright, let's break it down. First of all, reasoning is a bit of a loaded term, and everyone's talking about it so much that it's maybe on the edge of how useful it still is, because reasoning seems to mean different things to different people. But one of the things I quite like is to compare logical deductive reasoning, what we would call formal reasoning with first-order predicate logic, that sort of thing, against the other ways there actually are of reasoning your way out of a problem. That type of reasoning is probably only one type of reasoning. It's the type our community is particularly keen on, and it's got a lot of benefits to it.
But actually, coming back to the Cyc Project again, and I have huge admiration for the people who worked on the Cyc Project, I think it was a brilliant endeavor. Still, for some things there is no exact truth, and it just makes more sense to work in this kind of probabilistic space. So having inductive and deductive reasoning, having the two working together with each other, I think that's the best way of doing it.
And that's certainly where we are right now. You wouldn't get a stronger advocate for formal verifiers than myself; that's exactly what I'm talking about, that organizations need formal verifiers. But nevertheless, with Rich Sutton's bitter lesson and everything like that, there's the continually surprising effect of things learning stuff for themselves. And actually, now with these models, when you bring RL into it, reinforcement learning, the reinforcement learning does have the capability to go beyond what the large language models themselves were able to do.
What's surprising with DeepSeek, though, is that many of us suspected there was a bit of RL going on at inference time, but it doesn't even look like it needed that. It looks like the RL was used to train it so that it would spit out the thought processes that were going on, and by spitting out those thought processes, it's able to get a closer approximation of what reasoning looks like. Now, what it's not going to be able to do, I guess, is create a fully symbolic function. The large language model is probably never going to create those fully symbolic functions, like a mathematical function where I pass a couple of things in and, once I've defined that function, it will always come out with precisely the same answer every single time. These models are of a continuous nature, and that kind of symbolic behaviour is something they're probably never going to be able to do, although they're getting better and better at approximating it.
But humans can't do that either.
My kind of in-head math is absolutely terrible.
So we use tools in order to do those things.
And they too will be able to use tools.
So with a combination of a bit of tool use and a bit of external verification, maybe against some kind of formal structures, then yes. Even the models on their own, the way they're improving, are already demonstrating a level of mathematics that's shockingly good. People always come up with examples that crush them down, but these are the edge cases. Take any of the original examples people were finding at the beginning, which were laughable: they're all gone now. And I would expect that process to continue over the next period, until the point where it's increasingly hard to trip these things up.
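As an illustration of that tool-use point, here is a minimal sketch, assuming the OpenAI Python client's function-calling interface; the multiply tool and its schema are hypothetical. The model never computes the arithmetic itself; it delegates to a deterministic function, which is exactly the symbolic behaviour it cannot reproduce internally.

```python
# A minimal sketch of tool use: the model picks the tool and arguments,
# while a deterministic Python function produces the exact answer.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def multiply(a: float, b: float) -> float:
    # Deterministic symbolic function: same inputs, same answer, every time.
    return a * b

tools = [{
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two numbers exactly.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number"},
                "b": {"type": "number"},
            },
            "required": ["a", "b"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": "What is 1234.5 times 6789?"}],
    tools=tools,
)

# Assumes the model chose to call the tool rather than answer directly.
call = response.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)
print(multiply(args["a"], args["b"]))  # exact result from the tool, not the model
```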
Well, let's see how it all unfolds.
It's certainly going to be interesting to watch.
Alright, so thanks. I don't know if you have any kind of closing statement, or anything that you'd like people listening to take away from all of this.
Yeah, my big message, which I try to hammer home to everybody working in organizations at the moment, is that it's very easy to get distracted by the tip of this AI iceberg: the glittering algorithms, the reasoning models, all the clever stuff happening on top. Don't get distracted by that. That's off and going, and there's nothing anybody can do to stop it. It's like some massive geopolitical and economic event: major powers and huge amounts of money, billions and trillions of dollars and other currencies, are going to be pumped into it. That's off like a rocket. There's pretty much nothing anybody can do to stop it; it's happening anyway. So in your organisation, you're going to be in a situation where you're able to import this general intelligence. You've got this period of time now. The stuff is smart at the moment, but, as you were just pointing out, it's not super smart. In my opinion it's going to get super smart, and in quite a lot of other people's opinions as well. We'll see, but I would hedge on it getting super smart within the next five to ten years. I believe some people put it even down to three years; I think that's over-optimistic, but who knows? You've got this short window. So what you need to do is take this to the context of your organization and concentrate on the bottom of the AI iceberg, which is the data. You need to take the power of the models we have at our hands right now and focus it back upon the data that you've got in there.
You need to clean and consolidate the data so that it's in a state to be an effective external verifier, and so that you're aware of what information is worth $0.001 and what information only you have, which is the value that you're adding. You need to connect that and consolidate it down, because that's the only game in town as far as I see it at the moment, for a huge number of organisations.
Okay, well, that's pretty clear and pretty powerful, I would say, as well. So, thanks a lot for your time. It's been a super interesting conversation
and I hope people will like it as well.
Yeah, cool chatting with you, George.
Thanks for sticking around.
For more stories like this, check the link in bio and follow Linked Data Orchestration.