Orchestrate all the Things - Knowledge Graphs as the essential truth layer for Pragmatic AI. Featuring Tony Seale, The Knowledge Graph Guy

Episode Date: March 11, 2025

Organizations are facing a critical challenge to AI adoption: how to leverage their domain-specific knowledge to use AI in a way that delivers trustworthy results. Knowledge graphs provide the missing "truth layer" that transforms probabilistic AI outputs into real-world business acceleration. Knowledge graphs are powering products for the likes of Amazon and Samsung. The knowledge graph market is expected to grow to $6.93 billion by 2030, at a CAGR of 36.6%. Gartner has been advocating for the role of knowledge graphs in AI and the downstream effects in organizations going forward for the last few years. Neither the technology nor the vision are new. Knowledge graph technology has been around for decades, and people like Tony Seale were early to identify its potential for AI. Seale, also known as "The Knowledge Graph Guy", is the founder of the eponymous consulting firm. In this extensive conversation, we covered everything from knowledge graph first principles to application patterns for safe, verifiable AI, real-world experience, trends, predictions, and the way forward. Read the article published on Orchestrate all the Things here: https://linkeddataorchestration.com/2025/03/11/knowledge-graphs-as-the-essential-truth-layer-for-pragmatic-ai/

Transcript
Starting point is 00:00:00 Welcome to Orchestrate All The Things. I'm George Anadiotis and we'll be connecting the dots together. Stories about technology, data, AI and media, and how they flow into each other, shaping our lives. Organizations are facing a critical challenge to AI adoption: how to leverage their domain-specific knowledge to use AI in a way that delivers trustworthy results. Knowledge graphs provide the missing truth layer that transforms probabilistic AI outputs into real-world business acceleration. Knowledge graphs are powering products for the likes of Amazon and Samsung. The knowledge graph market is expected to grow to $6.93
Starting point is 00:00:37 billion by 2030 at a compound annual growth rate of 36.6%. Gartner has been advocating for the role of knowledge graphs in AI and the downstream effects in organizations going forward for the last few years. Neither the technology nor the vision are new. Knowledge graph technology has been around for decades and people like Tony Seale were early to identify its potential for AI. Seale, also known as the Knowledge Graph Guy, is the founder of the eponymous consulting firm. In this extensive conversation,
Starting point is 00:01:11 we cover everything from knowledge graph first principles, to application patterns for safe, verifiable AI, real-world experience, trends, predictions, and the way forward. I hope you will enjoy this. If you like my work on orchestrate all the things, you can subscribe to my podcast available on all major platforms. My self-published newsletter
Starting point is 00:01:32 also syndicated on Substack, Hackernoon, Medium and DZone, or follow Orchestrate All the Things on your social media of choice. Hi, thanks for having me on the podcast. It's really nice to be here. I'm Tony Seale, also known as the Knowledge Graph Guy. Well, yes, it's true that between you and me, we do know what a knowledge graph is,
Starting point is 00:01:56 I hope. However, that's not necessarily the case for everyone else who may be listening. So whenever I'm having this conversation and in the presence of an audience that I know is not necessarily familiar with the term, the first thing I like to do is always start by defining, okay, so let's take a step back and actually explain to people
Starting point is 00:02:19 what exactly is a knowledge graph. My way of doing that, to people who perhaps don't even have a background in computer science or data modeling or any of those things, is I explain it like this. Like, okay, do you know what a spreadsheet is? 99.9% of people know what a spreadsheet is. So, fine. You can think of it as... This is a table, basically, right?
Starting point is 00:02:45 So, a graph is something like a mind map. So, you still have data, but instead of having it in a tabular format, it's kind of more free-flowing and you have connections between those data. And that's a graph. That's a graph in terms of a data model. Because, again, people, when talking about graphs, they tend to think about things like visualizations and bars and that kind of stuff. So that's a first useful distinction. But then actually it gets a bit more nuanced,
Starting point is 00:03:13 because a graph is not necessarily a knowledge graph. And that's... So to me, the defining, let's say, feature, the defining characteristic of a knowledge graph is the knowledge part, because there's many graphs out there, but it's the knowledge that's attached to a graph that makes it special. Do you agree with that simplistic maybe definition? Yeah, I think it's a good one.
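To make the spreadsheet-versus-graph contrast above concrete, here is a minimal Python sketch; the names and facts are invented for illustration. The same information is held first as table-style rows, where the relationship hides inside a column, and then as explicit edges that can be traversed in either direction.

```python
# "Set / table" view: the relationship is buried in a foreign-key-style column.
employees = [
    {"name": "Ada",   "works_for": "Quantify Ltd"},
    {"name": "Grace", "works_for": "Quantify Ltd"},
]

# "Graph" view: every relationship is an explicit (subject, predicate, object) edge.
edges = [
    ("Ada",   "works_for",         "Quantify Ltd"),
    ("Grace", "works_for",         "Quantify Ltd"),
    ("Ada",   "collaborates_with", "Grace"),
]

def neighbours(node, graph):
    """Walk the network: everything connected to `node`, in either direction."""
    for s, p, o in graph:
        if s == node:
            yield (p, o)
        if o == node:
            yield (f"inverse of {p}", s)

print(list(neighbours("Quantify Ltd", edges)))
# [('inverse of works_for', 'Ada'), ('inverse of works_for', 'Grace')]
```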
Starting point is 00:03:42 I think the other thing that can be helpful to people to understand is when you think of some well-known graphs, like social networks, like an obvious graph. So people think about their Facebook network of their friends, friends of those friends. So I think that often lets people, gives people a better understanding. But yeah, I'm interested in what you're saying there
Starting point is 00:04:06 about the definition between a graph and a knowledge graph. And yeah, so are you referring there to ontology? Is that...? We'll get to that part as well. I mean, if we can agree on the fact that, well, there's different types of graphs, there's graphs that will connect different nodes, and there's graphs that have some kind of knowledge,
Starting point is 00:04:34 quote, knowledge, attached to them, then we get to the part like, okay, how do you define this knowledge? Then we start talking about semantics, I guess. And so then, kind of eventually and unavoidably, I would say we get to the part about ontologies. But yeah, okay. Well, so, with graphs, just sticking with the graph part, I guess the key thing that you're doing with a graph above anything else... At that point, it's quite interesting to take it
Starting point is 00:05:03 back down to the level of maths, I think. So yeah, you could have your data in a set. So you've got set theory, which in a way is what we're doing with those Excel tables. We're treating things as sets, because the set is like the table, basically. And then in a relational database, you're putting connections between them.
Starting point is 00:05:24 But the connections, they're not really first-class citizens. And the raw, bare-metal mathematical difference between graph theory and set theory is that we're taking the relationships and we're making them first-class citizens. So now you're able to view your data as a network. Viewing your data as a network is a fundamentally different way of looking at things. And I think it's a
Starting point is 00:05:50 powerful way of looking at things, because in reality no piece of information actually exists in isolation. In fact, the context is what gives meaning to pretty much everything. So to that extent, I think all graphs have the inherent potential to bring more knowledge and meaning, because they've already taken the first step of acknowledging the interconnectivity of information and the contextual nature of information. So you're already on the first step of the path from turning just plain data into knowledge. But maybe, you know, I think it would be fun to dig into exactly what you mean by knowledge there. That could be like an interesting conversation to have. Yeah, okay. So now we're getting to the interesting part. So how do you define knowledge?
Starting point is 00:06:46 I mean, in my mind, yes, you're right. If you put, if you model your domain, your problem, whatever that is as a graph, then yes, you're already taking the first step towards enabling discovery, enabling things like path finding, there are graph algorithms, there's a whole range of problems that you can solve by modeling your domain as a graph. So yes, you could argue that you're sort of enabling knowledge discovery. But usually, well, historically, let's say,
Starting point is 00:07:28 when people in our community talk about knowledge graphs, they tend to mean something rather more constrained and specific. They tend to talk about semantics that may be attached to your graph structure. So you can define in specific ways what you are talking about when you have a concept such as a person or a dog or a piece of furniture. And you can also define in specific ways what you are talking about when you attach relations
Starting point is 00:08:02 between those concepts, like a person can be the owner of a dog, for example. That already has some kind of meaning. You can add some kind of constraint to that relationship. It can only apply to this concept and that concept. It's not just a free-floating thing that can be attached to anything else. As opposed to a graph that you talked about, social network graphs for example. Yes, you can discover many things by modeling your social network as a graph, things like who's an influencer or who knows whom, or how many nodes you need to traverse to connect this person to that person.
Starting point is 00:08:50 But they don't necessarily have very good definitions about the people that are modeled in the network themselves, like their professions, or how exactly they're related. You can find, like, okay, this person knows this person and that person knows that person, but how exactly are they related? They could be relatives, they could be friends, they could be colleagues. If you start specifying things in those ways, I think this is where you get into the knowledge territory. Yeah. Yeah.
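As a hedged illustration of the point about attaching definitions and constraints, here is a small sketch using rdflib and standard RDFS/OWL terms. The example.org namespace, the classes, and the instances are all invented: the idea is simply that "owns" is no longer a free-floating link but is declared to connect a Person to a Dog.

```python
from rdflib import Graph

schema_and_data = """
@prefix ex:   <https://example.org/schema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

ex:Person a owl:Class ; rdfs:label "Person" .
ex:Dog    a owl:Class ; rdfs:label "Dog" .

# 'owns' is not just an anonymous line between nodes: it is declared
# to go from a Person to a Dog.
ex:owns a owl:ObjectProperty ;
    rdfs:domain ex:Person ;
    rdfs:range  ex:Dog .

# Instance data that uses the definitions above.
ex:alice a ex:Person ; ex:owns ex:rex .
ex:rex   a ex:Dog .
"""

g = Graph()
g.parse(data=schema_and_data, format="turtle")
print(len(g), "triples: the graph now carries both the facts and their meaning")
```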
Starting point is 00:09:19 So, I mean, again, if we sort of like drop down to bare metal levels for a second, then people talk about the difference between heterogeneous graphs and just a normal straight graph. And what do they mean when they're talking about that? Well, I could have a graph where I have these nodes and I connect them with edges. And the edges are all of the same type basically. So then exactly as you're saying you can go and do those kind of pathfinding algorithms that go
Starting point is 00:09:49 between it. But then the next level up from that is like a bipartite graph. So it could have two different types of nodes, two different types of thing in the graph, like, I don't know, people and the products that they bought. And I think the base level definition of a knowledge graph is when you go up to a fully heterogeneous graph. So what that would mean is I can have all sorts of different nodes going on in that graph, and they can have all sorts of different edge types that are connecting them together. So, you know, I could have people and products, and people buy products, and products are shipped to people and, you know, they are stored in a warehouse. And so as we start attaching
Starting point is 00:10:49 semantic labels to the edges within a graph, and we start saying that the nodes within the graph have got different types to them, then I think that's our entry level to say that we're talking about a knowledge graph here. Because the nature of the algorithms that you can run over that, including the machine learning algorithms, becomes much more complicated at that point. There's stuff that you can do quite simply, at the bare level of the mathematics, with a graph where all the nodes are just the same thing, and they've got all the same kind of
Starting point is 00:11:18 connections between them. Once you start saying, well, actually, no, some of these nodes are different things, and the edges between them are special, different types of edges that mean something, then the complexity goes up, and you have certain algorithms which will run on that very basic one that are not going to be able to run on this more complicated situation. So I think that we could call that the kind of entry level to what a knowledge graph is. Things begin to get a little fuzzier, I think, after that. But I don't think anybody would debate that that at least is a fair definition of the difference between a graph and a knowledge graph. And to give you a bit of my kind of backstory and history,
Starting point is 00:12:07 I guess I started with knowledge graphs about 10 years ago. I'd just seen Tim Berners-Lee's TED talk on linked data. I don't know if you've seen that George, I guess you have, right? Yeah, very famous one. And I was doing yet another ETL project for a large investment bank, kind of bringing data into a data warehouse and doing all the kind of data pipelines. And I just kind of thought, there must be a better way. And then I just kind of wondered, well, could the stuff that Tim Berners-Lee, I mean,
Starting point is 00:12:41 And then I just kind of wondered, well, could the stuff that Tim Berners-Lee... I mean, at first I kind of thought, what does the guy want? He's already invented the web, and now he's after like a second bite, coming and talking linked data to us, you know. But then I thought I'd kind of give it a try and just did like an under-the-desk little project to start looking at whether I could connect some of the trades together in order to feed into this data warehouse, kind of like a pre-layer, I guess, to the data warehouse, parsing these semi-structured trade files.
Starting point is 00:13:30 yourself fired. But, you know, the kind of bug had me and I, anyway, kind of a long story short there after sort of several twists and turns, I ended up kind of showing it to the boss there. And they gave me the opportunity to then go and try this stuff out for real. I ended up linking the whole of the FX trade population, entire trade population, connecting it into client data. And we were doing stuff for finance and closing off regulatory issues with that. And then went and worked on the trading
Starting point is 00:14:12 floor with the traders building knowledge graphs, which was super exciting because we were doing knowledge graphs that were driving P&L and risk systems. So if the system makes a mistake in the middle of the night, then you've got people on the phone, because it's absolutely crucial: get a decimal position wrong, or have some of those kind of links going in your graph wrong that it's pulling off of,
Starting point is 00:14:38 and it had very big economic impacts if you made like one small mistake, because there's such vast sums of money kind of coming through this particular trading desk. So that was super exciting. And by this point, I'm completely obsessed. I don't care about anything else apart from all these graphs now, you know, that kind of phase that people go through where suddenly, you know, you see graphs everywhere, and everything's a graph now. So I had kind of completely crossed through that. And then I started thinking about it in terms of, well, all right, it's working here within this situation, but what would be like totally cool is if everything was connected together, like all of the information within this large investment bank, why couldn't it all be connected together? And then I started thinking about, well, it's kind of like a distribution problem.
Starting point is 00:15:26 So putting it all into one big graph database is not... you're never gonna be able to do it that way. So I sort of became interested at that point in this kind of idea of taking the actual linked data principles and applying them within the context of one organization. And I was kind of hoping at that point, this sort of bottom-up movement would work there.
Starting point is 00:15:49 And to a certain extent it did, but it didn't kind of get to the point where the whole organization was willing to really buy in wholesale, and by now I was sufficiently obsessed that that's what I wanted. So I moved to a different organization, sort
Starting point is 00:16:09 of another tier-one investment bank, and there architected their knowledge graph from the top down. So, working with a team of other people there, you know, great ontologists, people who knew about linked data, all gathered together, we built a successful project going from the top down there, and out of that really established some of these architectural patterns that allow you to do the distributed piece. And I guess I should say, just to establish the kind of data layer part of it takes quite a long time. Before, where I was at the first bank, I had done it using these things called GNNs, which I know you'll be familiar with.
Starting point is 00:16:54 But for anybody listening, they're graph-based neural networks, graph-specific ones. And I'd just kind of come to the point in the second bank that I'm at where I'm like, OK, I'm going to start putting in the AI layer now. And at that point, it was very early in the large language models coming out, like the initial versions of the GPTs were starting to come out. So I thought, there seems to be some buzz within the community around this. I'm just going to kick the tires on what happens when, you know, trying to
Starting point is 00:17:29 use that model to do the graph stuff, as opposed to doing it all with GNNs. And it was clear straight away that this was pretty exciting, how these two technologies were working together. So then basically I just kind of dropped everything else and focused in on that particular thing of, like, how do you use the large language models with the graphs and, you know, how is that actually gonna work out? And then the most recent evolution,
Starting point is 00:18:01 I should say through this time I was kind of, because I'm sort of passionate, stroke obsessed with this idea. I've been sharing stuff on LinkedIn and people were like finding my posts interesting on LinkedIn, I've become known as the Knowledge Graph Guy there and I said, so the kind of latest evolution of what I've done is set up a consultancy called the Knowledge Graph
Starting point is 00:18:26 Guys, which is an attempt to kind of bring together seasoned people like yourself that really know about knowledge graphs, so we can come together and enable as many institutions as possible to embrace these patterns, to be able to use generative AI in a safe way and to be able to connect their data together. So the mission is to try to disseminate this technology as widely as possible in as short a time as possible. I don't know where to start there. I'm thinking, well, one of the things you mentioned in your introduction was Tim Berners-Lee and his talk on linked data and the whole concept of linked data.
Starting point is 00:19:14 And I think it's an interesting starting point, and I have to confess it was also a kind of aha moment for me as well. Even though I got into the whole knowledge graph scene, let's say, a bit earlier than that, even though it wasn't called by that name at that time, it doesn't really matter. It has kind of changed lots of different names through time. Linked data was one of those names. And hence, by the way, the name of my own brand there, Linked Data Orchestration. Yeah, we're on the third iteration now, I mean, like Semantic Web, linked data,
Starting point is 00:19:57 now knowledge graph. Yeah. Exactly. And probably, you know, somewhere along the way, there's going to be something else as well. But what it comes down to is basically the same kind of ideas, the same kind of principles. So I would say, yes, I think what you described is a good starting point. This is where you get into the knowledge graph territory. But how do you connect that with the linked data principles, which are actually very
Starting point is 00:20:25 specific? Yeah, so like linked data in particular, because it's kind of gone through those definitions, so it's maybe worth just doing a little bit of a history lesson there. So if we go back to the Semantic Web, as this stuff was originally conceived: that was something which came out of the AI community and was what people would now call, rather disparagingly, good old-fashioned AI, like knowledge systems. And we had things like inference engines, and it was about kind of gathering facts and then inferencing what the classes are, and having first-order predicate logic,
Starting point is 00:21:11 like symbolic, deductive logic, which an old-fashioned, Turing-machine-style computer could go and run and do computations and come out with answers on. So that was kind of like the first incarnation. But it was called Semantic Web because it was also tied up with the web movement. And maybe we can kind of come back to some of that, because it comes into the ontology piece, which is interesting. But to the specifics of the linked data, because for me, actually, what the linked data was about was Tim Berners-Lee
Starting point is 00:21:49 coming back in and just saying, okay, look, all of the other stuff, that's fine. We were in the middle of the AI winter at this point. So people were kind of dismissing it because AI was out of fashion. It had been this kind of big failure. These neural networks that everyone was banging on about, they completely don't work, and AI is just a complete damp squib. So in that context, what Tim Berners-Lee did with linked data, in my opinion, was do the decentralization and distribution
Starting point is 00:22:22 piece. So he was like, let's refocus back on some of the principles that drive the web, which is essentially that it should be a decentralized process. We should be using the same mechanism as the web, with HTTP and URLs. So now what we're saying is, okay, well, each of these nodes in the graph, what we're going to do is we're going to connect them together by just using URLs. So if each node now is a URL, like a web page on the web, then I can put a hyperlink between these two nodes of the graph and I can go and navigate off to that other piece of data, which means that now
Starting point is 00:23:05 I don't have to store these graphs all in one big database, I can now distribute the process out, which is obviously something I'm very interested in, because that was part of my thing of, okay, well, how are you going to link a whole organization together? Clearly, this thing's got to be decentralized and distributed. So that's really what the linked data thing did. But then I guess it's kind of interesting to say, well, how is a link in linked data different to just a straight-up hyperlink on a web page?
Starting point is 00:23:38 So well, the first difference is obviously that when you're going to that link, you're getting data back. You're not getting an <a> tag or an H2, a way of doing a visualization; it's almost like, oh, I go to the end of this hyperlink and the web equivalent of an Excel spreadsheet comes back. I've got structured data that's coming back there.
Starting point is 00:23:57 So I guess that's one difference. But the other big difference is hyperlinks, they're all the same. They don't mean anything. Whereas when I've got a link, you know, again, it comes back to the same thing: I've got a link and this link goes from the person, you know, to the order that this person has made. Then I can do the same thing with the order: I can follow the link for the order itself,
Starting point is 00:24:22 and then it'll take me to a definition of the semantics, of what an order means. And those semantics can be quite nuanced. So what an order means in one organization is not necessarily what an order means in another organization. So what I'm able to do now, and this is kind of, and I guess we come to this later on,
Starting point is 00:24:41 it's like why it's so important now with what's going on with the, now that AI is thoroughly not in winter. Being able to kind of get that clear semantic definition of what these edges mean turns out to be a really important thing. Yeah, yeah, indeed. And another really important thing is the fact that these HTTP identifiers that you talked about are unique.
Starting point is 00:25:11 So not just having a graph that's sitting somewhere in isolation, but giving things in your graph, whether they're nodes or edges, giving them unique identifiers means that you can have this distribution that you talked about. So you can have a piece of data somewhere in a database and, just by giving it this unique identifier, you can aggregate it with another piece of data sitting in some other database, even somewhere in a totally different organization, and you can have a sort of virtualized view over this entire network.
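A minimal sketch of that point, again with invented example.org URIs: two fragments of data, imagined as living in different systems, merge mechanically because they both use the same globally unique HTTP identifier for the same thing.

```python
from rdflib import Graph

crm_data = """
@prefix ex:     <https://example.org/id/> .
@prefix schema: <https://schema.org/> .
ex:customer-42 schema:name "Acme GmbH" .
"""

billing_data = """
@prefix ex:     <https://example.org/id/> .
@prefix schema: <https://schema.org/> .
ex:customer-42 schema:numberOfEmployees 120 .
"""

g = Graph()
g.parse(data=crm_data, format="turtle")
g.parse(data=billing_data, format="turtle")  # facts simply aggregate around the shared URI

for s, p, o in g:
    print(s, p, o)  # both statements now describe the same ex:customer-42
```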
Starting point is 00:25:59 Yeah, so globally, you're creating something globally unique, because that's exactly what the DNS system is doing for you. And that obviously is a battle-hardened technology. And more than that, it's a technology and approach that has been proven to be successful. You know, the web is the greatest effort that humanity has made, or at least maybe now the large language models, which have basically taken that web and compressed it down as the next evolution, maybe they're doing something even more with that. But as far as humanity's collective effort to bring its knowledge and information together goes, the web is a magnificent effort in that regard, whatever your feelings about whether it's gone right or wrong, and certain of the initial dreams of it may have sort of gone a little bit awry. I don't think even Tim Berners-Lee himself
Starting point is 00:26:46 would, well, I know that he wouldn't disagree with that statement. But regardless of that, in terms of a proven mechanism for lots of different people, a whole world of people, being able to connect their information and knowledge together, HTTP works. Enough said, it's just a proven, battle-hardened technology. So we're just taking that exact same proven, battle-hardened
Starting point is 00:27:10 technology and then applying it now to the underlying data concepts underneath. It's like a no-brainer. And I actually, probably shouldn't say this, but I actually sometimes get a little bit annoyed with people reinventing the wheel with some, you know, you see some new flashy term that comes up and is like hyping up and it's, you know, it's like we really don't need to reinvent the wheel here. We have a really good technology
Starting point is 00:27:35 that's proven for solving this distributed data integration problem. You're going to struggle to come up with anything better than this that has got as much, you know, all of the software that's been written around it and the security protocols that are over it, the caching optimization, the fact that there's libraries in every language that you could do, this bare metal stuff that's optimized for it.
Starting point is 00:27:57 So why go off and try to approach the problem from a different angle, when we could always rely on it? That's a very good question. But, you know, pragmatically, these other approaches do exist out there. And so I guess the question then becomes, all right, so these different technologies are out there, and people are using them. So does it really matter?
Starting point is 00:28:27 I mean, yes, we've just talked about how great it is that this kind of linked data approach uses these standardized technologies, and the fact that by doing that, you can have these virtualized graphs, and that's really a great approach to do data integration and to solve your data problems. That said, however, if you're not interested in standardization, and if you're not interested in having this sort of distributed approach,
Starting point is 00:29:01 is it still OK to use some other kind of graph approach? Is that a knowledge graph? Does it matter in the end? I mean, no. My initial answer is no. And I think it's very important to be pragmatic and not dogmatic. And to a certain extent within our community, within the graph community, there's a history of people kind of being rather dogmatic, and I think that's not very helpful. I think if you boil it down, there are two things that are important: I think using URIs as identifiers
Starting point is 00:29:44 is important. And I think having a shared vocabulary and an agreed schema, you know, those are the two important things. And beyond that, really, the rest of it doesn't matter so much. But I think if you're a small organisation, if you're a really small organization, you know, actually just a plain old data warehouse might well do your job for everything that you need. But as soon as you kind of like step up to an organization that's got multiple databases going on, I, my hypothesis that I'm running is
Starting point is 00:30:25 the siloed approach of, I'm just going to have a bunch of separate databases, each one looking after a particular part of the business, and then I'm going to maybe link a little bit of the information together in order to do some finance reporting and send some information up to the chief executive. That's not going to cut it, in my opinion, in the age of AI. So there's a kind of fundamental massive event on the horizon,
Starting point is 00:30:53 which is maybe sort of overhyped on the short term, but is like massively underhyped on the long term. Like people, in my opinion, most people have got no idea what is about to hit them. It might not be coming quite as soon as some of the tech bros would lead us to believe, but actually some of my timelines are shortening now with the kind of the speed which things are accelerating here. So we're about to live in, at least in the next 10 year timeframe, in a fundamentally different world. And it could be shorter than that.
Starting point is 00:31:31 And in that world, I believe that an organization can no longer afford to have its data in this kind of loosely connected, siloed, disorganized state because ultimately the value that is the currency, the kind of value of what's going to be going on, when kind of like vast intelligence is available on tap, is actually going to come back a lot to what the unique semantics of that business are and the unique data that that business is holding. And to get your head around what that really is,
Starting point is 00:32:11 it's very, very important to connect it and to do the hard work of working out. I guess we sort of touched slightly on how we're going to give meaning to these edges. We're going to have different things in the nodes. And we talked about the semantic web and some of that stuff that was in there. We mentioned the O word for the ontology.
Starting point is 00:32:36 But when we talked about the importance of the shared schema, to bring that out and make it all a bit more concrete for people: this is data modeling. This is metadata. This is the abstract concept, if you like. It's the schema.
Starting point is 00:32:57 But schema really does not necessarily do the ontology justice, because we've got complicated ways of modeling inheritance hierarchies, making a property define classes, or going back the other way back down. And what that does, especially with a graph structure, which, as you said yourself, is kind of like a mind map... So the name of the game is to get
Starting point is 00:33:27 as close to the semantics of the business as you possibly can, but within this kind of formal conceptual model. So you're trying to take the words that the business people are using within a given organization. And you're trying to turn those into these formal concepts to kind of like actually get really specific about what they are, and then to connect those concepts together
Starting point is 00:33:51 in the way that the concepts interrelate with each other, like as a specific type of edge. It's not an easy process to do; actually, generative AI can somewhat take out some of the dog work in that and make it easier to do. Right, so, yeah, I think we are kind of gradually approaching the O word, the ontology world.
Starting point is 00:34:14 And I know that one example that you like to use to sort of onboard people to the idea of data modeling and all of that jazz is, you refer to schema.org and how schema.org has been doing that for the web that we just talked about. Would you like to just give us the brief version of that, and how that all ties back to what you have been saying about the need to integrate and define data on the organizational level, and how that's going to
Starting point is 00:34:56 serve the goal of having your data inform your AI models. Yeah, that's a great point. So, yeah, schema.org is something that's out there on the web, and it's what the web companies have been using to build their own knowledge graphs. So you can just go to it, type schema.org into your browser, and you'll see what it has got: the definitions of various different concepts
Starting point is 00:35:19 that we talk about on the web. So for instance, there'll be the concept of a dentist; a dentist will be identified there. And the dentist is a type of a service provider, which is a type of a local shop, you know, with whatever kind of inheritance hierarchy they've got going on there. And because of that, it will have opening hours and the services that it offers. So they've basically gone to the effort of defining the semantics around what a dentist is and the sort of properties that you would expect to get back from there. And then if I am a dentist and I put up my website, then what I can do is, in my website, I can include this little island of JSON-LD,
Starting point is 00:36:00 which is just like JSON, but it's got a couple of special attributes in it. JSON, for anybody non-technical, is just a way of representing data. It's a text way of representing data that's incredibly common and ubiquitous amongst developers. And you can take that normal JSON, and then you can put in this special @context tag, and the @context tag is pointing back to schema.org. So now I can have my little bit of JSON data.
Starting point is 00:36:27 And it's got the @context tag pointing back to where the ontology or the schema is lying, which in this case is schema.org. And then it's got an @type, and it's saying, well, this is a type of dentist. And then I've got various properties there. So then what the web crawlers are able to do is, when they hit that web page, and I'm
Starting point is 00:36:44 going to index this web page now in my Google search index, and I see this island of JSON-LD there, I go, okay, this is a dentist, I know exactly what that is, and here's structured data that I can take and suck in. And that's kind of like what's driving those info boxes on the side of Google search. Most people don't know this, but actually almost half of the web has got these islands of JSON-LD in it. So this is not an unused technology or something that's difficult for people to do: half of the pages on the web. Nobody really talks about it, but it is a very widely used technology. And what it is, is a shared schema.
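A hedged sketch of the kind of JSON-LD island being described, for a hypothetical dental practice; the values are invented, and in a real page this object would sit inside a <script type="application/ld+json"> tag in the HTML.

```python
import json

dentist = {
    "@context": "https://schema.org",           # points back to the shared schema
    "@type": "Dentist",
    "name": "Riverside Dental Practice",         # invented example values
    "openingHours": "Mo-Fr 09:00-17:00",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "London",
    },
}

# This is the structured snippet a crawler can pick up alongside the human-readable page.
print(json.dumps(dentist, indent=2))
```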
Starting point is 00:37:26 So there's something quite powerful about that. There's something quite powerful about that concept. Because we've got the URLs, and that means that we can go and resolve them over the web to someone else. So schema.org is hosting their definition of what these concepts are. I can follow the URL of schema.org.
Starting point is 00:37:43 I can follow the URL of what a dentist is, and I can go and get this definition there of what a dentist is. And what that means is that the cost of the data integration has been taken from the person who's aggregating the data and passed on to the person who is providing the data. So Google and Microsoft don't pay the cost of doing the data integration. The dentist pays the cost of doing the data integration, because he wants to come up higher in their search indexes. And that is the trick, in my opinion, that pretty much all organizations need to perform.
Starting point is 00:38:21 So they need to have their own internal version of schema.org that's got their own specific semantics within it, that reflect the semantics of their particular niche and their particular business. And then they need to try to get as much of their organisation as possible to buy into producing and publishing their data in conformance with those shared semantics. And because you've got things like inheritance, you know, I can have a slightly different version of what my dentist thing is, but I'm going to inherit the characteristics of this shared definition. So that's, people don't realize it,
Starting point is 00:38:56 but that is like a massive magic trick. And it's something that has not been performed yet really, because, you know, even if we're talking about like creating a data mesh and creating data products, that's great. That's decentralized. But what you're really wanting to do is you're wanting to decentralize the effort of doing the data integration.
Starting point is 00:39:15 That's the nub of where the hard part of the problem is here, is kind of linking stuff together and conforming to some shared terms. And just briefly, why is that important then with AI? And we can kind of get onto it a little bit more. But you're creating your training sets for AI there. You're defining the nature of your business in a formal way. You're then connecting instances of data
Starting point is 00:39:40 to those formal semantic concepts. So you're doing the job that needs to be done in order to get yourself through. When this massive event comes through, whatever it looks like, this kind of general AI event, the organizations that are going to get through to the other side of it, in my opinion, are the ones which have done this task. And we've got this window in order to do it. Yeah, actually, I think part of the reason why some organizations, at least, are lured by the promise of, well, AI is going to solve our problems, is precisely because they believe that, well,
Starting point is 00:40:22 all we need to do is just feed our data to the AI, the magic AI model, and presto, it's going to just make sense of everything. And I think what they don't realize is that it's just going to create a mess if they don't actually do the upfront work that needs to be done to define what exactly is in that data that they're going to feed to the AI model. Because I think a very typical example is things like sales or products or customers. If you have a dozen systems, or even a couple of systems, in your organization, chances are these things are going to be repeated,
Starting point is 00:41:07 are going to be reused across different systems, and not exactly in the same way. So system A will define customer in a certain way, and system B will define, again, customer in a slightly different way. So if you feed data from both of those sources to your AI model, what's going to come out? Something kind of jumbled and inconsistent, right? Yeah, yeah, I think that's absolutely it.
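A minimal sketch of the kind of mismatch being described, with invented field and concept names: two systems hold "customer" in different shapes, and a small mapping onto one shared term is the upfront work that stops the jumble before anything reaches a model.

```python
# The shared concept both systems should conform to (hypothetical URI).
CANONICAL_CUSTOMER = "https://example.org/schema#Customer"

# System A and system B describe the same thing with different field names.
crm_record     = {"cust_name": "Acme GmbH", "segment": "enterprise"}
billing_record = {"client":    "Acme GmbH", "tier":    "enterprise"}

def to_canonical(record, mapping):
    """Rename source-specific fields onto the shared vocabulary."""
    out = {"@type": CANONICAL_CUSTOMER}
    for src_field, shared_term in mapping.items():
        if src_field in record:
            out[shared_term] = record[src_field]
    return out

crm_mapping     = {"cust_name": "name", "segment": "segment"}
billing_mapping = {"client": "name",    "tier":    "segment"}

print(to_canonical(crm_record, crm_mapping))
print(to_canonical(billing_record, billing_mapping))
# Both records now describe the same concept in the same terms.
```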
Starting point is 00:41:34 So there's a few things to say to it. So, like, first of all, all organizations need, are going to have to accept the reality that they're moving to a more probabilistic world. So everyone has got to start using AI, or you're probably going to go out of business, which means that we're moving to this new world where stuff will be probabilistic and AI will be embedded in a lot of this kind of decision making.
Starting point is 00:41:59 So I think that's just a kind of you might not like it, or you might have whatever opinion, but it doesn't care. It's like some force of nature or whatever that's happening, and it's happening. So you might as well just kind of get used to it. So then the question really becomes, well, how do you do that in a kind of safe way? And in my opinion, that comes through
Starting point is 00:42:25 like external verification. So it's quite interesting if you take like, you've probably seen like all of the hype over DeepSeek, you know, and like the big impact that had on the stock market with them coming in recently and doing their version of generative model. And so like everyone else, I was kind of obsessing with it and going back and kind of looking at it
Starting point is 00:42:49 and try and work out what it is that they did. And they did some quite interesting changes algorithmically, but the bit that really stuck out to me is that, sitting at the heart of what they did, how they got their model to be good, is they took all of the web data like everybody is doing, but then they pulled out just the ones that were related to mathematics and the ones that
Starting point is 00:43:12 were related to coding. And then with that what you can do is you can create an external verifier. So you can basically look at the bit of the maths and then you can look at the answer at the end and then you can check whether the answer is actually correct and likewise with the code. And then you can feed that to the large language model and ask the large language model to do that, and then check it against the external formal verifier. And what that is basically doing is kind of putting
Starting point is 00:43:36 quality control upon the probabilistic model. So you're going out of, this is where I kind of like talk about the continuous world and the discrete world. So you kind of, in the continuous world, everything's probabilistic, everything's fuzzy. And that's where these kind of generative models are. Like one thing blends into another and you get hallucination, but you flip hallucination on the other side.
Starting point is 00:43:56 And there's something a bit like creativity there. It's, you know, able to explore off to these kind of different places. You know, going back to the old world of the semantic web, we had this thing called the Cyc project, which failed because we were trying to do complete general knowledge in a formal way, and it was an absolutely horrendous task. This succeeds at that.
Starting point is 00:44:18 This succeeds at what that was unable to do, because of its probabilistic nature. But it has got these dangers. You want to go and use one of these systems to do something within the financial domain, or in the medical domain, or absolutely in the legal domain? Absolutely no way you're going to let something like that loose to go and do those things.
Starting point is 00:44:40 So then the question comes, OK, well, for maths and coding it's quite obvious what your formal verifier is, you know, because with maths you've got the answer there, or you could even go out to some kind of system like Lean or something like that, where they've actually got the axioms in there. I don't think that the DeepSeek team did that, but others are. With code, you can see if it compiles. You can maybe even get the LLM to write a couple of unit tests and see whether the answers are coming out right. You're using the code compiler as the formal logical system
Starting point is 00:45:15 there. So you have this really interesting thing then. Well, how is a given organization going to do that? Well, at the base level, again, let's go back to bare metal first principles in our probabilistic, fuzzy world. If we are modeling probabilities mathematically across multiple different factors that we would call dimensions, then we create this vector. People talk about vector embeddings, so I've got my vector database.
Starting point is 00:45:54 When people are talking about that stuff, that's what they're talking about: a big long list of floating-point numbers. That's essentially what you get behind a large neural network; when they talk about the weights of the model, that's what it is, these big long lists of floating-point numbers. And that's in the continuous world. So let's now not talk about the trillion-parameter models that we've got with these giant things, let's simplify things for ourselves and let's imagine that
Starting point is 00:46:29 we're just now down into two dimensions. Well, it's quite simple now, and we can think of it like coordinates on a map. I've got my latitude and my longitude, and that's going to take me, depending on the scale of the map to be honest with you, to some region within that space. And that's essentially what we're getting with the kind of vector embeddings. And the structure that we would represent that with would be a vector, or sometimes tensors and stuff like that. If you're going to do discrete mathematics, on the other hand,
Starting point is 00:47:01 like if you went to a mathematician and said, how are you going to do discrete mathematics, well, 99 times out of 100, you would use a graph structure to do that. The graph is the structure that you would turn to in order to do things discretely. And what do we mean by discretely? Well, you're basically putting boundaries around stuff. It's no longer probabilistic and fuzzy. It's like, OK, there's this thing. It's connected to this other thing. It's connected to this other thing. I can compute over that.
Starting point is 00:47:31 I can validate over that. I can do formal verification over that. So now let's zoom right back up from the bare metal to the other thing. We've got our ontology. We've got our graph. We've got our large language model. We've got the fuzzy probabilistic, scary but kind of almost human creative hallucinating thing over here.
Starting point is 00:47:53 And then we've got our kind of discrete, computable, validatable part here, and we stick the two together. And this is exactly why knowledge graphs are zooming up the hype cycle at the moment, with Gartner identifying them as being the enabling technology for gen AI. Well, you know, playing devil's advocate here, someone might tell you, great, you know, that all sounds great in theory, but how does that actually work out in practice? Right, I'm sold on the idea. I'm going to start creating my own organizational ontology. And then what? What can I actually do with that?
Starting point is 00:48:35 How can I use it to interact with an AI model in a meaningful way? And I'm going to cheat here a little bit by sort of previewing your answer. I've seen you advocate a concept called the working memory graph, which I guess is a way of conceptualizing this interaction. So I'm going to let you talk through it, and then I have a couple of specific questions on how that plays out in practice at this point in time. Yeah, okay.
Starting point is 00:49:11 So the idea of the working memory graph is that you basically... Let's assume that the ontology has been created and the knowledge graph has been created, which are two big assumptions and it might well be worth looping back to the fact that in order to create those we use generative AI not to create them on their own, but with humans in the loop you use it as a tool to lower the effort of doing that. But zooming past that and assuming that that bit is in place now. So I've got a good ontology, which is like a formal conceptualization of my domain with the key concepts within it. And I've got a knowledge graph, which is connecting to those concepts and giving me kind of facts about it.
Starting point is 00:50:01 So now the user could ask a question. Just like you've got ChatGPT, you're working within a given organization, and you bring up your semantic agent, that is, your AI agent with your private version of a large language model, a generative AI model, coming up there. And you say to it, read through this document and give me the probability that the mortgages in it, the MBSs mentioned in it, are likely to default, or something like that. Or, depending upon your domain, whatever that is, you know, some highly technical thing that is relevant to what you're doing, but that a general large language model would not be able
Starting point is 00:50:56 to answer, because it has been trained on all of the general information in the world, and it's not got the specific information about your organization, and it hasn't got the specific know-how and knowledge of your experts. So what the large language model can do is it can look at that query and understand that query, and it can do this thing called RAG, and then this thing that we call graph RAG, which takes various different flavors. But in the conception I'm gonna give you with the working memory graph, the first thing it's gonna do
Starting point is 00:51:30 is actually just go off to the ontology, and it's going to try and identify the concepts that are being mentioned in here. I see you're talking about an MBS, I know what an MBS is, it's a mortgage-backed security, it's like a type of bond, I know about the semantics, yeah, I know about them. And then you're asking about default.
Starting point is 00:51:45 Okay, yeah, this is our definition of what default is, blah, blah, blah, blah. Are there any facts around the graph that are maybe relevant to doing that? When I'm looking at this document, what properties am I expecting to pull out that I know MBSs generally have and that may be relevant to doing this thing?
Starting point is 00:52:04 Now I end up with the situation with the working memory graph: I've got the large language model on one side, and now I've got a subgraph of discrete facts on the other side. And now I can basically loop between these two structures, going from the formal, verifiable version down to using the large language models. The large language model is going to do exploration and creativity. The graph is gonna give you discrete formalization
Starting point is 00:52:30 and then kind of give yes/no answers, or maybe, you're slightly on the wrong track here. So imagine we take a different formal system, a mathematical verifier. So imagine now over here we've got a mathematical verifier, like AlphaGeometry. In AlphaGeometry, basically you had a custom-trained large language model which is coming up with ideas: oh, I want to solve this geometry problem, right, okay, my idea for solving this geometry problem is this, right,
Starting point is 00:53:02 and now I'm going to send it into the formal verifier, which is going to compute whether that answer is correct or not. And then it's going to reply back to the large language model saying, no, you're off by this, because such and such doesn't work out. And basically, what we're talking about here with the working memory graph is a domain-specific version of that.
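A very rough sketch of that loop as I understand it, assuming an ontology and knowledge graph already exist as rdflib graphs. `call_llm` is a placeholder rather than any real API, and the concept matching is deliberately naive; the point is the shape of the sandwich, with the probabilistic model only ever working against a discrete, checkable slice of the graph.

```python
from rdflib import Graph, RDF

def call_llm(prompt: str) -> str:
    """Placeholder for whatever private model endpoint is actually in use."""
    return "draft answer grounded in the supplied facts"

def answer(question: str, ontology: Graph, knowledge_graph: Graph) -> str:
    # 1. Ground the question in the ontology: which defined concepts does it mention?
    mentioned = {s for s in ontology.subjects(RDF.type, None)
                 if str(s).rsplit("#", 1)[-1].lower() in question.lower()}

    # 2. Build the working memory graph: only the facts touching those concepts.
    working_memory = Graph()
    for s, p, o in knowledge_graph:
        if s in mentioned or o in mentioned or p in mentioned:
            working_memory.add((s, p, o))

    # 3. Let the model explore and draft, but only over this discrete subgraph.
    draft = call_llm(
        "Answer using only these facts:\n"
        + working_memory.serialize(format="turtle")
        + "\nQuestion: " + question
    )

    # 4. In a real system the draft's claims would now be checked back against the
    #    graph (domain/range checks, SHACL shapes, and so on) and the loop repeated
    #    with feedback until the answer verifies. That step is omitted in this sketch.
    return draft
```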
Starting point is 00:53:23 Harder to do, much, much harder to do. But basically, I think the only game in town, I mean, I'm biased, obviously, but the only game in town I see for all organisations that want to survive the event is to kind of get really crisp on what these semantics are, connect your data, be able to do that kind of formal verification. All right, so a couple of questions. Is it clear or is it not so clear? So, yeah, a couple of questions here. So what you're talking about is a kind of two-way relationship between building your ontology and knowledge graph
Starting point is 00:54:02 and using your large language model. So one of the ways that you see this interaction happening is using an AI model to help you create an ontology and a knowledge graph. And so my question there is, what has your experience been in the current state of affairs there. And to kind of preamble this, my own experience has been kind of so-and-so, let's say. I mean, yes, you can use... I've tried a couple of different options, not all of them, admittedly, so I may be missing something, I have to say this upfront, but in my experience, the kind of tools that I've used have been able to get something out of...
Starting point is 00:54:49 You feed them a lot of textual documents, basically, and you expect them to give you something that resembles a kind of coherent data model for the domain that your documents are describing. My experience has been kind of hit and miss. Yes, they are able to extract some kind of concepts, some kind of relationships, but what you get out of the box is really not usable. It's not really production level. You have to put in additional effort to be able to come up with something that's usable.
Starting point is 00:55:24 So I kind of question whether it's actually worth doing that in the first place. Maybe you're better off just starting by doing everything manually. I don't know what your experience has been. Yeah, I mean, there are some tricks, which would be too detailed to go into now. But my experience has been very positive, actually, and only
Starting point is 00:55:47 getting more so as the models, particularly with the reasoning models that are kicking in now. Yeah, it's just, it's kind of just getting better and better. So no, I think it's good. Is it 100% of the way there? No, it's not. And actually, that's a good thing. Because if you're simply able to take an AI model and tell it, show it one of the documents and say, tell me everything that's going on in this document to the level of expertise that the people inside a given niche of a business are able to do. That means the value of your business has just dropped down to the price of a call to one of these models. So like 0.01 dollars or something like, or even less now that we've got the
Starting point is 00:56:35 DeepSeek, et cetera, out there. So, um, this is a good thing. So what the large language model is able to do, in my opinion, is take somewhere up to 80 to 90% of the dog work effort out of doing this thing. And then the rest does come down to the experts. But basically, if you kind of flip over to the machine learning side, let me talk about this thing of the latent space. So this is the part within the neural network: you've got the input layer, you've got the output layer that comes off the other side,
Starting point is 00:57:13 and in between that you've got the latent space, which is this kind of black box of where the machine learning model is learning what it's learning and has got these latent representations. And, you know, there's a lot of debate, people go, you know, do these things have world models and do they have conceptual understanding? But to my mind, they do create these abstract concepts within there. They have to do that, because they're basically compressing large amounts of information, the whole of the information on the web and more. They're compressing that down to quite a small number of floating-point numbers.
Starting point is 00:57:50 In order to do that compression, they have to compress some of the semantic information. So they have to compress information about some of the concepts. But they do it in a way that's completely uninterpretable by humans. We would never be able to begin to understand how that's done.
Starting point is 00:58:06 I mean, just as you wouldn't look into someone's brain, look at their neurons and understand what they're doing. So using the large language model, showing it an instance of the data and getting it to do a first stab at the conceptual ontology, is a bit like looking into the latent space of the large language model. You're getting to pull out some of those concepts, put them into a discrete form, and see the network of how they
Starting point is 00:58:36 link together with each other. And once you've got that, that's when the domain experts can come in, and this is where the business is adding its own value. So this is in some ways a really important piece, because the bit that the large language model can do on its own is the gestalt knowledge that everybody has already got. That's the bit that's worth $0.001. Now we come to the bit that's actually going to be worth something post the full-on AI event, which isn't really coming like that, it's going to be more of a gradual process. But post that, the bit that the large language model can't do and isn't part of the general knowledge of the world is the part that's actually got the value. And that's where your people can come in and they can make the changes in there and say, no,
Starting point is 00:59:21 actually it's much more nuanced than this, or it's got this concept wrong, it's like this. So that's really valuable work being done at that point, and it shouldn't be degraded. But who wants to do the boring bit that everybody else knows? You only want to do the bit where you're the specialist and you're adding some value. And that's how these two things, well really it's three things, work in conjunction. It's the large language model taking the gestalt knowledge and doing a load of the donkey work. It's the formalization of the model with the technology of ontologies and knowledge graphs. And then of course it's the humans, who are able to work at this discrete conceptual level and go and make
Starting point is 01:00:00 their changes, which are then fed back into the large language model. Now, when you say, oh no, here's this nuanced understanding of what this concept actually is, the next time the large language model reads that document it's able to refer back to your conceptual models, and now it knows what you would know. You've distilled your knowledge into this very crisp, formal format. And then when it comes back out the other end, if it's made a mistake or hallucinated or got something wrong, your model can catch it.
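To picture the drafting step Tony describes, here is a minimal Python sketch of LLM-assisted ontology drafting. The prompt wording, the call_llm helper and the Turtle output format are illustrative assumptions rather than anything prescribed in the conversation; the point is simply that the model produces a first draft that domain experts then review and refine.

```python
# Illustrative sketch: asking a language model for a first-draft ontology from
# sample documents. call_llm() is a placeholder for whichever model API you use.

DRAFT_PROMPT = """You are helping draft a domain ontology.
Read the document below and propose candidate classes, properties and
relationships for this domain, expressed as OWL in Turtle syntax.
Only propose terms that are clearly supported by the text.

Document:
{document}
"""

def draft_ontology(documents: list[str], call_llm) -> str:
    """Collect candidate Turtle snippets for later review by domain experts."""
    drafts = [call_llm(DRAFT_PROMPT.format(document=doc)) for doc in documents]
    # The concatenated draft is the 80-90% of donkey work; the experts then
    # correct, prune and extend it before it becomes the formal ontology.
    return "\n".join(drafts)
```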
Starting point is 01:00:38 Right, so that's precisely what my next question was going to be about. Like, fine, okay, so let's assume that one way or another you've gotten your ontology, and then you're trying to use it to guide, to steer your language model, to provide some context towards getting an answer that contains less hallucination, let's say. Again, in my experience, and I also have to say that I've seen people use various flavors of, well, instructions or whatever other constructs that they refer to using the term ontology, which in my opinion is a bit confusing again.
Starting point is 01:01:20 But anyhow, that aside, that kind of pet peeve aside, let's assume that through whatever way you've gotten to the point where you have an ontology, whatever form that ontology may actually assume. In my experience, it still doesn't entirely rid you of hallucinations. So you still have to put in some work to verify that whatever you get back actually holds. What's your experience been? And how do you see a way out of it?
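As a rough illustration of what George describes, providing the ontology as context to steer the model, here is a small sketch using rdflib. The domain.ttl file name and the prompt template are assumptions made for the example, not something discussed in the conversation.

```python
# Illustrative sketch: pinning a model's answer to the ontology's vocabulary
# by putting the ontology's terms into the prompt context.

from rdflib import Graph, RDFS

def grounded_prompt(question: str, ontology_path: str = "domain.ttl") -> str:
    """Build a prompt that steers the model towards the ontology's terms."""
    onto = Graph().parse(ontology_path, format="turtle")
    # Gather the human-readable labels of the classes and properties the
    # model is expected to use when answering.
    terms = sorted({str(label) for _, _, label in onto.triples((None, RDFS.label, None))})
    return (
        "Answer using only the following domain terms where possible:\n"
        + ", ".join(terms)
        + f"\n\nQuestion: {question}"
    )
```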
Starting point is 01:01:58 The same, and here's the way out of it. So I call this pattern the neural symbolic loop. I've got all of these kind of strange names, like the working memory graph and the neural symbolic loop. In a way, the working memory graph is, OK, this is how I'm going to do the answering of the question.
Starting point is 01:02:17 The broader pattern that that sits within is this thing I call the neural symbolic loop, which is that you're just iterating between continuous and discrete structures until you get something that you're happy with. So we've been one way through the loop: we take the formalized concepts, put them into the context of the large language model, and allow the large language model to do its job, because you're steering it by those formalized concepts. Then the information comes out of the large language model, and we can convert the information back
Starting point is 01:02:49 into a discrete form, a graph. And once we've got that in a discrete form as a graph, we can then use the ontological model that we've got, plus the factual data that we've got in the subgraph, to validate that this makes sense. I put constraints in on some of these relationships, you know. I'm trying to think of a good example off the top
Starting point is 01:03:13 of my head, but you know, like a dog can't have six legs, or something like that; mammals have four legs, that kind of thing, anyway. But you've put in whatever the constraints are, and then you can do that constraint checking against what's coming out of it. Has it made up something that completely breaks those constraints? You can then block off paths that are basically obvious hallucinations. Okay. And then you can feed back: I'm sorry, large language model,
Starting point is 01:03:54 you seem to have said that this dog has got six legs. That's not correct, dogs have got a max of four legs. Try again. Right. Okay, so, yeah, verification is still needed, but your suggested way out of it is that you translate back into something that is verifiable, you do the verification part, and if the verification fails, you go again and try to course correct.
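To make the constraint-checking step of the loop concrete, here is a minimal Python sketch using rdflib and pySHACL. The namespaces, the shape and the check_answer helper are illustrative assumptions; the pattern is simply to parse the model's answer into a graph, validate it against the declared constraints, and feed any violations back to the model as a request to try again.

```python
# Illustrative sketch of the validation half of the neural symbolic loop:
# the model's answer has been converted into a small RDF graph, and a SHACL
# shape encodes the "a dog has at most four legs" style of constraint.

from rdflib import Graph
from pyshacl import validate

SHAPES = """
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

ex:DogShape a sh:NodeShape ;
    sh:targetClass ex:Dog ;
    sh:property [
        sh:path ex:numberOfLegs ;
        sh:datatype xsd:integer ;
        sh:maxInclusive 4 ;
    ] .
"""

def check_answer(answer_turtle: str) -> tuple[bool, str]:
    """Validate the extracted subgraph; return (conforms, report) for feedback."""
    data = Graph().parse(data=answer_turtle, format="turtle")
    shapes = Graph().parse(data=SHAPES, format="turtle")
    conforms, _, report = validate(data, shacl_graph=shapes)
    # If conforms is False, the human-readable report is fed back to the
    # language model with an instruction to try again.
    return conforms, report
```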
Starting point is 01:04:14 Alright, I think we're almost running out of time, we're kind of slightly over already. So let's wrap up with something that's a bit related to the most hyped topic of the day. You already kind of talked about it: DeepSeek and reasoning,
Starting point is 01:04:53 or so-called reasoning models. Again, there's a bit of a disconnect, let's say, in terminology here, because what people in the knowledge graph community have traditionally been referring to when they use the term reasoning is something that's tractable, deterministic. You already talked about it, again, in one of your opening remarks. As opposed to that, what these so-called reasoning models do is something quite different.
Starting point is 01:05:31 So I've seen you support the statement that eventually these models will be able to mimic reasoning so well that it's going to be indistinguishable from actual tractable reasoning. I don't know if we'll ever get to that point, but I think that's an interesting prediction anyhow. And just to play devil's advocate again here, as you also said at some point, the way that these models are trained
Starting point is 01:06:09 is basically by extracting verifiable statements out of otherwise very big and potentially messy training datasets and focusing on those. Their claim to reasoning fame, so to speak, is the fact that they score very well on benchmarks that were specifically designed to measure reasoning capabilities. However, I've seen people do research and sort of prove that even on those benchmarks, if you vary the very questions included in them slightly, those models will fail. So I don't know what that tells us exactly.
Starting point is 01:06:58 Yeah, that's a really good one. And this opinion of mine is not at all popular in the rest of my community; it goes down like a lead balloon whenever I say it within the graph community. So all right, let's break it down. First of all, reasoning is a bit of a loaded term, and everyone's talking about it so much that it's maybe on the edge of how useful it still is, because reasoning seems to mean
Starting point is 01:07:30 different things to different people. But one of the things I quite like is, if you sort of look at what large language models are doing versus logical deductive reasoning, so what we would call formal reasoning with first-order predicate logic, that sort of thing: there are actually different ways of reasoning your way out of a problem, so that type of reasoning is really probably only one type of reasoning. It's the type that our community is particularly keen on, and it's got a lot of benefits to it. But actually, coming back to the Cyc Project again, and I have huge admiration for the people who worked on the Cyc Project, I think it was a brilliant endeavor. But for some of this stuff, it just makes more sense to do it the other way. Some things have no exact truth to them, and it makes sense to be working in this kind of probabilistic
Starting point is 01:08:33 sort of space. So having inductive versus deductive reasoning, having the two working together with each other, I think that's the best way of doing it, and that's certainly where we are right now. You wouldn't get a stronger advocate for formal verifiers than myself; that's exactly what I'm talking about, that organizations need formal verifiers. But nevertheless, with Rich Sutton's Bitter Lesson and everything like that, there's the continual surprising effect of things learning stuff for themselves. And actually now with these models,
Starting point is 01:09:26 and when you're bringing RL into it, reinforcement learning, the reinforcement learning does have the capability to go beyond what just the large language models themselves were able to do. What's surprising with DeepSeek is that many of us suspected there was a bit of RL or something like that going on at inference time, but it doesn't even look like it needed that. It just looks like
Starting point is 01:09:55 the RL was used to train it so that it would spit out the thought processes that were going on. And by spitting out those thought processes, it's able to get a closer approximation of what reasoning looks like. Now, what it's not going to be able to do, I guess, is create a fully symbolic function. The large language model is probably never going to be able to create those fully symbolic functions, like a mathematical function where I'm going to pass a couple of things in and, once I've defined that mathematical function, it's always going to precisely come out with the same answer every single time. They are of a continuous nature, and that symbolic way of doing things is
Starting point is 01:10:38 something that they're probably never going to be able to do, although they're getting better and better at approximating it. But humans can't do that either. My in-head math is absolutely terrible, so we use tools in order to do those things. And they too will be able to use tools. So with a combination of a bit of tool use and a bit of external verification, maybe on some kind of formal structures, then yeah. And even just the
Starting point is 01:11:13 models on their own, the way that they're improving, they're already demonstrating a shockingly good level of mathematics and so on. And yeah, people always come up with examples that kind of knock them down, and these are the kind of edge cases. But look back at the original ones that people were finding at the beginning, which were laughable, and they're all kind of gone now. I would expect that process to continue over the next period, until the point where it's increasingly hard to trip these things up. Well, let's see how it all unfolds.
Starting point is 01:12:03 It's certainly going to be interesting to watch. Alright, so thanks. I don't know if you have any kind of closing statement, or anything that you'd like people listening to take away from all of this. Yeah, my big message, the one that I try and hammer home to everybody who is working in organizations at the moment, is that it's very easy to get distracted
Starting point is 01:12:35 by the tip of this AI iceberg: the glittering algorithms, the reasoning models, all this stuff that's doing the clever things at the top. Don't get distracted by that. That's off and it's going, and there's nothing anybody can do to stop it. It's like some massive geopolitical, economic event; major powers, huge amounts of money, billions and trillions of dollars and other currencies are going to be pumped into it. That's off like a rocket. There's pretty much nothing anybody can do to stop that; it's happening anyway. So in your organisation, you're going to be in a situation
Starting point is 01:13:09 where you're going to be able to import this general intelligence. So you've got this period of time now before this stuff gets super smart. It's smart at the moment, but like you were just pointing out, it's not super smart. In my opinion, it's going to get super smart, and in quite a lot of other people's opinions as well; we'll see. I would hedge on it getting super smart within the next five to ten years. I believe for some other people it's even down to like three years; I think that's over-optimistic, but we'll see, who knows? You've got this short window. So what you need to do is take it to the context of your organization,
Starting point is 01:13:47 and you need to concentrate on the bottom of the AI iceberg, which is the data. You need to take the power of the models that we have at our hands right now, and you need to focus that back upon the data that you've got in there. You need to clean up and consolidate the data so that it's in a state to be an effective external verifier,
Starting point is 01:14:03 and that you're aware of what information is worth $0.001 and what information only you have and is the value that you're adding. You need to connect that and you need to consolidate that down, because that's the only game in town as far as I see it at the moment, for a huge number of organisations. Okay, well, that's pretty clear and pretty powerful, I would say, as well. So, thanks a lot for your time. It's been a super interesting conversation and I hope people will like it as well. Yeah, cool chatting with you, George.
Starting point is 01:14:43 Thanks for sticking around. For more stories like this, check the link in bio and follow Linked Data Orchestration.
