The Good Tech Companies - How to Use Knowledge Graphs for Retrieval-Augmented Generation—Without a Graph DB
Episode Date: April 23, 2024. This story was originally published on HackerNoon at: https://hackernoon.com/how-to-use-knowledge-graphs-for-retrieval-augmented-generationwithout-a-graph-db. This post explores the use of knowledge graphs for RAG, using DataStax Astra DB for storage; the code for the examples is in a notebook using some prototype code for storing and retrieving knowledge graphs with Astra DB. This story was written by: @datastax.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
How to Use Knowledge Graphs for Retrieval-Augmented Generation, Without a Graph DB,
by DataStax. Retrieval-augmented generation (RAG) refers to a variety of techniques for
retrieving information and using it to provide contextual information for generative AI.
The most common form operates on text chunks and involves:
1. Extracting the text from the original documents (HTML, PDF, Markdown, etc.).
2. Chunking the text to specific sizes based on document structure and semantics.
3. Storing chunks in a vector database keyed by an embedding of the chunk.
4. Retrieving the chunks relevant to a question for use as context when generating the answer.
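To make the pipeline concrete, here is a minimal, framework-free sketch of the storage and retrieval steps. The embed() function is a toy stand-in for a real embedding model, and an in-memory list stands in for the vector database.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy stand-in for a real embedding model: hashed bag-of-words."""
    v = np.zeros(dim)
    for token in text.lower().split():
        v[hash(token) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

# Steps 1-3: extracted and chunked text, stored keyed by its embedding
# (an in-memory list stands in for the vector database).
chunks = [
    "Marie Curie won the Nobel Prize in Physics in 1903.",
    "Marie Curie was born in Warsaw in 1867.",
]
index = [(embed(chunk), chunk) for chunk in chunks]

# Step 4: retrieve the chunks most similar to the question.
def retrieve(question: str, k: int = 1) -> list[str]:
    q = embed(question)
    ranked = sorted(index, key=lambda item: float(item[0] @ q), reverse=True)
    return [chunk for _, chunk in ranked[:k]]

print(retrieve("What prize did Marie Curie win?"))
```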
However, RAG based on vector similarity has a few weaknesses. Since it focuses on information similar to the question, it is harder to answer questions involving multiple topics or requiring multiple hops, for instance. Additionally, it limits the number of chunks retrieved. Each chunk comes from a distinct source, so in cases where largely similar information exists in multiple places, it needs to choose between retrieving multiple copies of the information (and possibly missing out on other information) or picking only one copy in order to get more distinct chunks, which then misses out on the nuances of the other sources. Knowledge graphs can be used as an alternative
or supplement to vector-based chunk retrieval. In a knowledge graph, nodes correspond to specific
entities, and edges indicate relationships between the entities. When used for RAG,
entities relevant to the question are extracted, and then the knowledge subgraph containing those entities and the information about them is retrieved. This approach has
several benefits over the similarity-based approach:
1. Many facts may be extracted from a single source and associated with a variety of entities within the knowledge graph. This allows for the retrieval of just the relevant facts from a given source, rather than the whole chunk, including irrelevant information.
2. If multiple sources say the same thing, they produce the same node or edge. Instead of treating these as distinct facts and retrieving multiple copies, they can be treated as the same node or edge and retrieved only once. This enables retrieving a wider variety of facts, and/or focusing only on facts that appear in multiple sources.
3. The graph may be traversed through multiple steps, not just retrieving information directly related to the entities in the question, but also pulling back things that are two or three steps away. In a conventional RAG approach, this would require multiple rounds of querying.
In addition to the benefits of using a knowledge graph for RAG, LLMs have also made it easier to
create knowledge graphs. Rather than requiring subject matter experts to carefully craft the
knowledge graph, an LLM and a prompt can be used to extract information from documents.
This post explores the use of knowledge graphs for RAG, using DataStax Astra DB for storage.
The code for the examples is in this notebook, using some prototype code for storing and
retrieving knowledge graphs with Astra DB from this repository. We will make use of
LangChain's LLMGraphTransformer to extract knowledge graphs from documents, write them to
Astra, and discuss techniques for tuning the prompt used for knowledge extraction.
We'll then create LangChain runnables for extracting entities from the question and
retrieving the relevant sub-graphs. We'll see that the operations necessary to implement RAG
using knowledge graphs do not require graph databases or graph query languages,
allowing the approach to be applied using a typical data store that you may already be using.
Knowledge Graph

As mentioned earlier, a knowledge graph represents distinct entities as nodes. For example, a node may represent Marie Curie, the person, or French, the language. In LangChain, each node has a name and a type. We'll consider both when uniquely identifying a node, to distinguish French, the language, from French, the nationality. Relationships between entities correspond to the edges in the graph. Each edge includes the source (for example, Marie Curie, the person), the target (for example, the Nobel Prize, the award), and a type indicating how the source relates to the target (for example, WON). An example knowledge graph extracted from a paragraph about Marie Curie using LangChain is shown below. Depending on your goals, you may choose to add properties to nodes and edges. For example, you could use a property to identify when the Nobel Prize was won and the category. These can be useful to filter out edges and nodes when traversing the graph during retrieval.
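As an illustration, here's a small sketch using LangChain's graph document types (import path reflects langchain_community at the time of writing); node identity combines name and type, which is what keeps the two senses of French distinct.

```python
from langchain_community.graphs.graph_document import Node, Relationship

# A node is identified by both its name (id) and its type, which is what
# keeps "French" the language distinct from "French" the nationality.
marie = Node(id="Marie Curie", type="Person")
nobel = Node(id="Nobel Prize", type="Award")
french_language = Node(id="French", type="Language")
french_nationality = Node(id="French", type="Nationality")

# An edge records the source, the target, and how the two relate.
won = Relationship(source=marie, target=nobel, type="WON")

print(won)
```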
Extraction: Creating the Knowledge Graph

The entities and relationships comprising the knowledge graph can be created
directly or imported from existing known good data sources. This is useful when you wish to
curate the knowledge carefully, but it makes it difficult to incorporate new information quickly
or handle large amounts of information. Luckily, LLMs make it easy to extract information from
content, so we can use them for extracting the knowledge graph. Below, I use the LLMGraphTransformer from LangChain to extract a graph from some information about Marie Curie. This uses a prompt to instruct an LLM to extract nodes and edges from a document. It may be used with any document that LangChain can load, making it easy to add to existing LangChain projects. LangChain supports other options, such as Diffbot, and you could also look at some of the knowledge extraction models available, like REBEL. The sketch below shows how to extract a knowledge graph using LangChain's LLMGraphTransformer.
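A minimal version, assuming an OpenAI chat model is configured; any chat model supported by LangChain could be substituted.

```python
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

# Assumes an OpenAI API key is configured in the environment.
llm = ChatOpenAI(model="gpt-4", temperature=0)
transformer = LLMGraphTransformer(llm=llm)

text = (
    "Marie Curie was a physicist and chemist who conducted pioneering "
    "research on radioactivity. She won the Nobel Prize in Physics in 1903."
)

# The transformer prompts the LLM to extract nodes and relationships,
# returning GraphDocuments that can be written to the graph store.
graph_documents = transformer.convert_to_graph_documents(
    [Document(page_content=text)])

for doc in graph_documents:
    print(doc.nodes)          # e.g. Node(id='Marie Curie', type='Person')
    print(doc.relationships)  # e.g. a relationship with type='WON'
```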
You can use the utilities found in the repository to render the extracted knowledge graph for visual inspection. In a future post, we'll discuss how
you can examine the knowledge graph both in its entirety as well as the sub-graph extracted from
each document and how you can apply prompt engineering and knowledge engineering to
improve the automated extraction.

Retrieval: Answering With the Sub-Knowledge Graph
Answering questions using the knowledge graph requires several steps. We first identify where to start
our traversal of the knowledge graph. For this example, I'll prompt an LLM to extract entities
from the question. Then, the knowledge graph is traversed to retrieve all relationships within a
given distance of those starting points. The default traversal depth is 3. The retrieved relationships and the original question are used to create a prompt and context for the LLM to answer the question.
Extracting Entities From the Question

As with the extraction of the knowledge graph, extracting the entities in a question can be done using a special model or an LLM with a specific prompt.
For simplicity, we'll use an LLM with the following prompt, which includes both the question and information about the format to extract. We use a Pydantic model with the name and type to get the proper structure.
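Here's a minimal sketch of that step, with a hypothetical prompt standing in for the one in the notebook; the Pydantic model gives the output the name-and-type structure described above.

```python
from pydantic import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# The Pydantic model mirrors the name-and-type node identity used in the graph.
class SimpleNode(BaseModel):
    name: str = Field(description="Name of the entity")
    type: str = Field(description="Type of the entity, e.g. Person or Award")

class SimpleNodes(BaseModel):
    nodes: list[SimpleNode]

# A hypothetical prompt; the one in the notebook may be worded differently.
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "Extract the named entities from the question, returning the name "
     "and type of each entity."),
    ("human", "{question}"),
])

llm = ChatOpenAI(model="gpt-4", temperature=0)
entity_chain = prompt | llm.with_structured_output(SimpleNodes)

print(entity_chain.invoke({"question": "Who is Marie Curie?"}))
# e.g. SimpleNodes(nodes=[SimpleNode(name='Marie Curie', type='Person')])
```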
Running the above example, we can see the entities extracted. And, of course, a LangChain runnable can be used in a chain to extract the entities from a question.
In the future, we'll discuss ways to improve entity extraction, such as considering node properties or using vector embeddings and similarity search to identify relevant starting points. To keep this first post simple, we'll stick with the above prompt and move on to traversing the knowledge graph to retrieve the relevant knowledge and include that as the context in the prompt.

Retrieving the Sub-Knowledge Graph

The previous chain gives us the nodes in question. We can use those entities and the graph store to retrieve the relevant knowledge triples. As with RAG, we drop them into the prompt as part of the context and generate answers. And the above chain can be executed to answer a question.
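Here is a sketch of what the full chain might look like. It reuses the hypothetical entity_chain from the earlier sketch, and retrieve_triples is a placeholder for the prototype graph-store traversal from the repository.

```python
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Placeholder: in the real notebook this traverses the Astra DB graph store
# around the extracted entities (default depth 3); hard-coded for illustration.
def retrieve_triples(entities) -> str:
    triples = [("Marie Curie", "WON", "Nobel Prize")]
    return "\n".join(f"({s}, {r}, {t})" for s, r, t in triples)

answer_prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the following knowledge triples:\n"
    "{context}\n\nQuestion: {question}"
)

llm = ChatOpenAI(model="gpt-4", temperature=0)

# entity_chain is the entity-extraction runnable from the previous sketch.
chain = (
    {
        "context": entity_chain | retrieve_triples,
        "question": itemgetter("question"),
    }
    | answer_prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke({"question": "Who is Marie Curie?"}))
```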
Traverse, Don't Query

While it may seem intuitive to use a graph DB to store the knowledge graph, it isn't actually necessary. Retrieving the sub-knowledge graph around a few nodes is a simple graph traversal, while graph DBs are designed for much more complex queries, searching for paths with specific sequences of properties. Further, the traversal is often only to a depth of two or three, since nodes that are farther removed become irrelevant to the question pretty quickly. This can be expressed as a few rounds of simple queries (one for each step) or a SQL join. Eliminating the need for a separate graph database makes it easier to use knowledge graphs. Additionally, using Astra DB or Apache Cassandra simplifies transactional writes to both the graph and other data stored in the same place, and likely scales better. That overhead would only be worthwhile if you were planning to generate and execute graph queries using Gremlin or Cypher or something similar. But this is simply overkill for retrieving the sub-knowledge graph, and it opens the door to a host of other problems, such as queries that go off the rails in terms of performance.
This traversal is easy to implement in Python. The full code to implement this, both synchronously and asynchronously, using CQL and the Cassandra driver, can be found in the repository. The core of the asynchronous traversal is shown below for illustration.
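A simplified sketch of such a traversal, assuming a hypothetical edges table keyed by the source node; the prototype in the repository handles paging, de-duplication, and concurrency more carefully.

```python
import asyncio
from cassandra.cluster import Cluster

async def traverse(session, start_nodes: set[str], depth: int = 3):
    # Assumed schema (an assumption for this sketch, not the prototype's DDL):
    #   CREATE TABLE edges (source text, edge_type text, target text,
    #                       PRIMARY KEY (source, edge_type, target));
    query = session.prepare(
        "SELECT source, edge_type, target FROM edges WHERE source = ?")
    visited: set[str] = set()
    frontier = set(start_nodes)
    triples = []
    for _ in range(depth):
        if not frontier:
            break
        visited |= frontier
        # One simple query per frontier node, issued concurrently;
        # no graph query language needed.
        results = await asyncio.gather(
            *(asyncio.to_thread(session.execute, query, (node,))
              for node in frontier))
        frontier = set()
        for rows in results:
            for row in rows:
                triples.append((row.source, row.edge_type, row.target))
                if row.target not in visited:
                    frontier.add(row.target)
    return triples

# Usage (requires a running cluster and a populated keyspace):
# session = Cluster(["127.0.0.1"]).connect("knowledge_graph")
# print(asyncio.run(traverse(session, {"Marie Curie"})))
```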
Conclusion

This article showed how to build and use knowledge graph extraction and retrieval for
question answering. The key takeaway is that you don't need a graph database with a graph query language like Gremlin or Cypher to do this today.
A great database like Astra that efficiently handles many queries in parallel can already
handle this. In fact, you could just write a simple sequence of queries to retrieve the
sub-knowledge graph needed for answering a specific query. This keeps your architecture
simple,
no added dependencies, and lets you get started immediately. We've used these same ideas to implement Graph RAG patterns for Cassandra and Astra DB, and we're going to contribute them to LangChain and work on bringing other improvements to the use of knowledge graphs with LLMs in the future. By Ben Chambers, DataStax.

Thank you for listening to this Hacker Noon story, read by artificial intelligence. Visit hackernoon.com to read, write, learn, and publish.