The Good Tech Companies - How to Use Knowledge Graphs for Retrieval-Augmented Generation—Without a Graph DB

Episode Date: April 23, 2024

This story was originally published on HackerNoon at: https://hackernoon.com/how-to-use-knowledge-graphs-for-retrieval-augmented-generationwithout-a-graph-db. This post explores the use of knowledge graphs for RAG, using DataStax Astra DB for storage. The code for the examples is in this notebook, using some prototype code for storing and retrieving knowledge graphs using Astra DB. This story was written by: @datastax. Check more stories related to data-science at: https://hackernoon.com/c/data-science.

Transcript
Starting point is 00:00:00 This audio is presented by Hacker Noon, where anyone can learn anything about any technology. How to use knowledge graphs for retrieval-augmented generation, without a graph DB, by DataStax. Retrieval-augmented generation (RAG) refers to a variety of techniques for retrieving information and using it to provide contextual information for generative AI. The most common form operates on text chunks and involves: 1. Extracting the text from the original documents (HTML, PDF, Markdown, etc.). 2. Chunking the text to specific sizes based on document structure and semantics. 3. Storing chunks in a vector database keyed by an embedding of the chunk. 4. Retrieving the chunks relevant to a question for use as context when generating
Starting point is 00:00:50 the answer. However, RAG based on vector similarity has a few weaknesses. Since it focuses on information similar to the question, it is harder to answer questions involving multiple topics and/or requiring multiple hops, for instance. Additionally, it limits the number of chunks retrieved. Each chunk comes from a distinct source, so in cases where largely similar information exists in multiple places, it needs to choose between retrieving multiple copies of the information (and possibly missing out on other information) or picking only one copy in order to get more different chunks, which then
Starting point is 00:01:25 misses out on the nuances of the other sources. Knowledge graphs can be used as an alternative or supplement to vector-based chunk retrieval. In a knowledge graph, nodes correspond to specific entities, and edges indicate relationships between the entities. When used for RAG, entities relevant to the question are extracted, and then the knowledge subgraph containing those entities and the information about them is retrieved. This approach has several benefits over the similarity-based approach. 1. Many facts may be extracted from a single source and associated with a variety of entities within the knowledge graph. This allows for the retrieval of just the relevant facts from a given source rather than the whole chunk, including irrelevant information. Backslash.2. If multiple sources say the same thing,
Starting point is 00:02:10 they produce the same node or edge. Instead of treating these as distinct facts and retrieving multiple copies, they can be treated as the same node or edge and retrieved only once. This enables retrieving a wider variety of facts and or focusing only on facts that appear in multiple sources. Backslash dot three. The graph may be traversed through multiple steps, not just retrieving information directly related to the entities in the question, but also pulling back things that are two or three steps away. In a conventional RAG approach, this would require multiple rounds of querying. In addition to the benefits of using a knowledge graph for RAG, LLMs have also made it easier to create knowledge graphs. Rather than requiring subject matter experts to carefully craft the
Starting point is 00:02:54 knowledge graph, an LLM and a prompt can be used to extract information from documents. This post explores the use of knowledge graphs for RAG using Datastacks AstraDB for storage. The code for the examples is in this notebook using some prototype code for storing and retrieving knowledge graphs using AstraDB from this repository. We will make use of Langchain's LLM Graph Transformer to extract knowledge graphs from documents, write them to Astra, and discuss techniques fortening the prompt used for knowledge extraction. We'll then create lang chain runnables for extracting entities from the question and retrieving the relevant sub-graphs. We'll see that the operations necessary to implement RAG
Starting point is 00:03:35 using knowledge graphs do not require graph databases or graph query languages, allowing the approach to be applied using a typical data store that you may already be using. Knowledge Graph As mentioned earlier, a knowledge graph represents distinct entities as nodes. For example, a node may represent Marie Curie, the person, or French, the language. In Langchain, each node has a name and a type. We'll consider both when uniquely identifying a node, to distinguish French, the language from, French, the nationality. Relationships between entities correspond to the edges in the graph. Each edge includes the source, for example, Marie Curie the person, the target, Nobel Prize the
Starting point is 00:04:17 award, and a type, indicating how the source relates to the target, for example, one. An example knowledge graph extracted from a paragraph about Marie Curie using Langchain is shown below. Depending on your goals, you may choose to add properties to nodes and edges. For example, you could use a property to identify when the Nobel Prize was won in the category. These can be useful to filter out edges and nodes when traversing the graph during retrieval. Extraction. Creating the knowledge graph. The entities and relationships comprising the knowledge graph can be created directly or imported from existing known good data sources. This is useful when you wish to
Starting point is 00:04:55 curate the knowledge carefully, but it makes it difficult to incorporate new information quickly or handle large amounts of information. Luckily, LLMs make it easy to extract information from content, so we can use them for extracting the knowledge graph. Below, I use the LLM Graph Transformer from Langchain to extract a graph from some information about Marie Curie. This uses a prompt to instruct an LLM to extract nodes and edges from a document. It may be used with any document that lang change on load, making it easy to add to existing langchain projects. Langchain supports other options such as diffbot, and you could also look at some of the knowledge extraction models available, like rebel. This shows how to
Starting point is 00:05:36 extract a knowledge graph using langchains. You can use the found in the repository to render a langchain for visual inspection. In a future post, we'll discuss how you can examine the knowledge graph both in its entirety as well as the sub-graph extracted from each document and how you can apply prompt engineering and knowledge engineering to improve the automated extraction. Retrieval. Answering with the sub-knowledge graph. Answering questions using the knowledge graph requires several steps. We first identify where to start our traversal of the knowledge graph. For this example, I'll prompt an LLM to extract entities from the question. Then, the knowledge graph is traversed to retrieve all relationships within a
Starting point is 00:06:15 given distance of those starting points. The default traversal depth is 3. The retrieved relationships in the original question are used to create a prompt and context for the LLM to answer the question. Extracting entities from the question as with the extraction of the knowledge graph. Extracting the entities in a question can be done using a special model or an LLM with a specific prompt. For simplicity, we'll use an LLM with the following prompt which includes both the question and information about the format to extract. We use a pydantic model with the name and type to get the proper structure. Running the above example we can see the entities extracted and of course, a lang chain runnable can be used in a chain to extract the entities from a question.
Starting point is 00:07:02 In the future, we'll discuss ways to improve entity extraction, such as considering node properties or using vector embeddings and similarity search to identify relevant starting points. To keep this first post simple, we'll stick with the above prompt and move on to traversing the knowledge graph to retrieve the and include that as the context in the prompt. Retrieving the sub-knowledge graph the previous chain gives us the nodes in question. We can use those entities in TheGraphStore to retrieve the relevant knowledge triples. As with RAG, we drop them into the prompt as part of the context and generate answers. And the above chain can be executed to answer a question. For example, traverse, don't query. While it may seem intuitive to use a GraphDB to store the knowledge graph, it isn't actually necessary. Retrieving the sub-knowledge graph
Starting point is 00:07:45 around a few notices a simple graph traversal, while graph dbs are designed for much more complex queries searching for paths with specific sequences of properties. Further, the traversal is often only to a depth of two or three, since nodes that are farther removed become irrelevant to the question pretty quickly. This can be expressed as a few rounds of simple queries, one for each step, or in SQL join. Eliminating the need for a separate graph database makes it easier to use knowledge graphs. Additionally, using AstraDB or Apache Cassandra simplifies transactional rights to both the graph and other data stored in the same place, and likely scales better. That overhead would only be
Starting point is 00:08:25 worthwhile if you were planning to generate and execute graph queries using gremlin or cipher or something similar. But this is simply overkill for retrieving the sub-knowledge graph and it opens the door for a host of other problems such as queries that go off the rails in terms of performance. This traversal is easy to implement in Python. The full code to implement this, both synchronously and asynchronously, using CQL and the Cassandra driver can be found in the repository. The core of the asynchronous traversal is shown below for illustration, conclusion. This article showed how to build and use knowledge graph extraction and retrieval for question answering. The key takeaway is that you don't need a graph database with a graph query language like Gremlin or Cypher to do this today.
Starting point is 00:09:10 A great database like Astra that efficiently handles many queries in parallel can already handle this. In fact, you could just write a simple sequence of queries to retrieve the sub-knowledge graph needed for answering a specific query. This keeps your architecture simple, no added dependencies, and lets you get started immediately. We've used these same ideas to implement graph rag patterns for Cassandra and AstraDB. We're going to contribute them to Langchain and work on bringing other improvements to the use of knowledge graphs with LLMs in the future, by Ben Chambers. Datastacks thank you for listening to this hackernoon story read by artificial intelligence visit hackernoon.com to read write learn and publish
