The Good Tech Companies - 6 Critical Challenges of Productionizing Vector Search
Episode Date: April 23, 2024. This story was originally published on HackerNoon at: https://hackernoon.com/6-critical-challenges-of-productionizing-vector-search. Prepare for the complexities of deploying vector search in production with insights on indexing, metadata filtering, query language, and vector lifecycle management. Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning. You can also check exclusive content about #vector-search, #vector-database, #app-development, #rockset, #cloud-computing, #scaling-vector-search, #vector-lifecycle-management, #good-company, and more. This story was written by: @rocksetcloud. Learn more about this writer by checking @rocksetcloud's about page, and for more stories, please visit hackernoon.com. Productionizing vector search involves addressing challenges in indexing, metadata filtering, query language, and vector lifecycle management. Understanding these complexities is crucial for successful deployment and application development.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
Six Critical Challenges of Productionizing Vector Search
By Rockset
You've decided to use vector search in your application, product, or business.
You've researched how and why embeddings in vector search make a problem solvable or can
enable new features. You've dipped your toes into the hot, emerging area of approximate
nearest-neighbor algorithms and vector databases. Almost immediately upon productionizing vector search applications,
you will start to run into very hard and potentially unanticipated difficulties.
This blog attempts to arm you with some knowledge of your future, the problems you will face,
and questions you may not know yet that you need to ask.
1. Vector search does not equal vector database. Vector
search and all the associated clever algorithms are the central intelligence of any system trying
to leverage vectors. However, all of the associated infrastructure to make it maximally useful and
production-ready is enormous and very, very easy to underestimate. To put this as strongly as I can,
a production-ready vector database will solve many, many more database problems than vector problems. By no means is vector search itself an easy
problem, and we will cover many of the hard sub-problems below, but the mountain of traditional
database problems that a vector database needs to solve certainly remains the hard part. Databases
solve a host of very real and very well-studied
problems from atomicity and transactions, consistency, performance and query optimization,
durability, backups, access control, multi-tenancy, scaling and sharding and much more.
Vector databases will require answers in all of these dimensions for any product,
business or enterprise. Be very wary of home-rolled vector search infra. It's not that hard to download a state-of-the-art
vector search library and start approximate nearest neighboring your way towards an
interesting prototype. Continuing down this path, however, is a path to accidentally reinventing
your own database. That's probably a choice you want to make consciously.
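To illustrate how little it takes to get started, here is a minimal, hypothetical brute-force nearest-neighbor prototype in plain Python. It returns the right answers, and it handles none of the database problems listed above: no durability, no transactions, no access control, no sharding.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn(query, vectors, k=2):
    # Exact brute-force k-nearest-neighbor search: fine for a prototype,
    # but nothing here is production-ready database infrastructure.
    ranked = sorted(vectors, key=lambda doc: cosine_sim(query, vectors[doc]), reverse=True)
    return ranked[:k]

docs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(knn([1.0, 0.05], docs))  # the two embeddings nearest the query
```

Everything beyond this toy, from consistency to backups, is the part that is easy to underestimate.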
2. Incremental indexing of vectors.
Due to the nature of the most modern ANN (approximate nearest neighbor) vector search
algorithms, incrementally updating a vector index is a massive challenge. This is a well-known,
hard problem. The issue here is that these indexes are carefully organized for fast lookups, and any
attempt to incrementally update them with new vectors will rapidly deteriorate the fast lookup properties.
As such, in order to maintain fast lookups as vectors are added,
these indexes need to be periodically rebuilt from scratch.
Any application hoping to stream new vectors continuously,
with requirements that both the vectors show up in the index quickly and the queries remain fast,
will need serious support for the incremental indexing problem.
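One common mitigation, sketched here with hypothetical names, is to pair the periodically rebuilt main index with a small brute-force buffer of fresh vectors that is merged in at query time. The sketch fakes the ANN index with a dict; a real system would rebuild an actual index structure, which is where the cost lives.

```python
import math

def sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b)) / (
        math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class BufferedIndex:
    """Hypothetical sketch: a periodically rebuilt 'main' index plus a small
    brute-force buffer so freshly streamed vectors are searchable at once."""

    def __init__(self, rebuild_threshold=3):
        self.main = {}        # stand-in for the expensive, prebuilt ANN index
        self.buffer = {}      # fresh vectors awaiting the next rebuild
        self.rebuild_threshold = rebuild_threshold

    def add(self, doc_id, vec):
        self.buffer[doc_id] = vec
        if len(self.buffer) >= self.rebuild_threshold:
            # Stand-in for the periodic full rebuild; in a real system this
            # is the CPU-heavy step that can affect query latencies.
            self.main.update(self.buffer)
            self.buffer.clear()

    def search(self, query, k=1):
        # Query both structures and merge, so new vectors show up
        # immediately while the main index stays fast between rebuilds.
        candidates = {**self.main, **self.buffer}
        ranked = sorted(candidates, key=lambda d: sim(query, candidates[d]), reverse=True)
        return ranked[:k]
```

A vector added via `add` is searchable immediately from the buffer, without waiting for the rebuild; the trade-off is that the buffer must stay small enough to scan cheaply.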
This is a very crucial area for you to understand about your database and a good place to ask a number of hard questions. There are many potential approaches that a database might take to help
solve this problem for you. A proper survey of these approaches would fill many blog posts of
this size. It's important to understand some of the technical details of your database's approach because it may have unexpected trade-offs or consequences in your application.
For example, if a database chooses to do a full re-index with some frequency,
it may cause high CPU load and therefore periodically affect query latencies.
You should understand your application's need for incremental indexing and the capabilities
of the system you're relying on to serve you.
3. Data latency for both vectors and metadata.
Every application should understand
its need and tolerance for data latency. Vector-based indexes have, at least by other
database standards, relatively high indexing costs. There is a significant trade-off between
cost and data latency. How long after you create a vector do
you need it to be searchable in your index? If it's soon, vector latency is a major design point
in these systems. The same applies to the metadata of your system. As a general rule, mutating
metadata is fairly common, e.g., changing whether a user is online or not. As such, it is typically very
important that metadata-filtered queries rapidly react to updates to metadata. Taking the above example, it's not useful if
your vector search returns a result for someone who has recently gone offline. If you need to
stream vectors continuously to the system, or update the metadata of those vectors continuously,
you will require a different underlying database architecture than if it's acceptable for your use case to, e.g., rebuild the full index every evening to be used the next day.
4. Metadata filtering.
I will strongly state this point. I think in almost all circumstances, the product experience
will be better if the underlying vector search infrastructure can be augmented by metadata
filtering or hybrid search. Consider: "Show me all the restaurants I might like (a vector search)
that are located within 10 miles and are low to medium priced (metadata filter)."
The second part of this query is a traditional SQL-like
clause intersected with, in the first part, a vector search result. Because of the nature of
these large, relatively static, relatively monolithic vector indexes,
it's very difficult to do joint vector plus metadata search efficiently.
This is another of the well-known, hard problems that vector databases need to address on your
behalf. There are many technical approaches that databases might take to solve this problem for you.
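In a toy, brute-force setting (hypothetical helper names and restaurant data, not any particular database's API), the two simplest approaches described next, pre-filtering and post-filtering, look like this:

```python
import math

def rank(query, vectors):
    """Brute-force similarity ranking, standing in for a real vector index."""
    def sim(a, b):
        return sum(x * y for x, y in zip(a, b)) / (
            math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
    return sorted(vectors, key=lambda d: sim(query, vectors[d]), reverse=True)

def pre_filter_search(docs, predicate, query, k):
    # Apply the metadata filter first, then rank the survivors. Correct,
    # but a prebuilt ANN index over the full corpus can't be leveraged.
    eligible = {d: vec for d, (vec, meta) in docs.items() if predicate(meta)}
    return rank(query, eligible)[:k]

def post_filter_search(docs, predicate, query, k, overfetch=4):
    # Rank first, then drop results that fail the filter. With a selective
    # filter, most of the overfetched candidates get tossed out.
    candidates = rank(query, {d: vec for d, (vec, _) in docs.items()})[: k * overfetch]
    return [d for d in candidates if predicate(docs[d][1])][:k]

restaurants = {
    "bistro": ([1.0, 0.0], {"miles": 2}),
    "diner":  ([0.9, 0.1], {"miles": 50}),
    "cafe":   ([0.0, 1.0], {"miles": 5}),
}
nearby = lambda meta: meta["miles"] <= 10
```

Both return the same answers here, but their cost profiles diverge sharply as the corpus grows and the filter gets more selective.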
You can pre-filter, which means to apply the filter first and then do a vector lookup.
This approach suffers from not being able to effectively leverage the pre-built vector index.
You can post-filter the results after you've done a full vector search.
This works great unless your filter is very selective, in which case,
you spend huge amounts of time finding vectors you later toss out because they don't meet the
specified criteria. Sometimes, as is the case in Rockset, you can do single-stage filtering which is to
attempt to merge the metadata filtering stage with the vector lookup stage in a way that preserves
the best of both worlds. If you believe that metadata filtering will be critical to your
application (and I posit above that it will almost always be), the metadata filtering trade-offs and functionality
will become something you want to examine very carefully.
5. Metadata Query Language.
If I'm
right and metadata filtering is crucial to the application you are building, congratulations,
you have yet another problem. You need a way to specify filters over this metadata.
This is a query language. Coming from a database angle, and as
this is a Rockset blog, you can probably guess where I am going with this. SQL is the industry
standard way to express these kinds of statements. Metadata filters in vector search language are simply
the WHERE clause of a traditional database query. SQL has the advantage of also being relatively easy to port
between different systems. Furthermore, these filters
are queries, and queries can be optimized. The sophistication of the query optimizer can have
a huge impact on the performance of your queries. For example, sophisticated optimizers will try to
apply the most selective of the metadata filters first because this will minimize the work later
stages of the filtering require, resulting in a large performance win.
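That optimization can be sketched in a few lines. This is a toy model with hypothetical data; a real optimizer estimates selectivity from precomputed statistics rather than scanning the rows, and the key point is that ordering changes the work done, not the result.

```python
def apply_filters(rows, predicates):
    """Apply metadata predicates most-selective-first. The final result is
    the same in any order (it's an intersection); ordering only changes how
    much data the later filtering stages must process."""
    def selectivity(pred):
        # Toy estimate: the exact fraction of rows matching the predicate.
        return sum(1 for row in rows if pred(row)) / len(rows)

    # Lowest selectivity (fewest matches) first shrinks the row set early.
    for pred in sorted(predicates, key=selectivity):
        rows = [row for row in rows if pred(row)]
    return rows

restaurants = [
    {"name": "bistro", "price": 1, "miles": 2},
    {"name": "diner",  "price": 3, "miles": 3},
    {"name": "grill",  "price": 4, "miles": 4},
    {"name": "cafe",   "price": 1, "miles": 50},
]
cheap = lambda r: r["price"] <= 2    # matches 2 of 4 rows -> applied first
nearby = lambda r: r["miles"] <= 10  # matches 3 of 4 rows
```

Running `apply_filters(restaurants, [nearby, cheap])` applies `cheap` first despite its position in the list, so the `nearby` pass only scans the two cheap rows.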
If you plan on writing non-trivial applications using vector search and metadata filters,
it's important to understand and be comfortable with the query language,
both ergonomics and implementation, you are signing up to use, write, and maintain.
6. Vector Lifecycle Management. Alright, you've made it this far. You've got a vector database that has all the right database fundamentals you require, has the right incremental indexing strategy for
your use case, has a good story around your metadata filtering needs, and will keep its
index up to date with latencies you can tolerate. Awesome, your ML team, or maybe OpenAI, comes out
with a new version of their embedding model. You have a gigantic database filled with old vectors that now need to be updated. Now what? Where are you going to run this large batch ML job?
How are you going to store the intermediate results? How are you going to do the switchover
to the new version? How do you plan to do this in a way that doesn't affect your production workload?
Ask the hard questions. Vector search is a rapidly emerging area, and we're seeing a lot
of users starting to bring applications to production. My goal for this post was to arm
you with some of the crucial hard questions you might not yet know to ask. And you'll benefit
greatly from having them answered sooner rather than later. What I didn't cover in this post was
how Rockset has solved, and is working to solve, all of these problems, and why some of our solutions
to these are groundbreaking and better than most other attempts at the state of the art. Covering
that would require many blog posts of this size, which is, I think, precisely what we'll do.
Stay tuned for more. Thank you for listening to this HackerNoon story,
read by Artificial Intelligence. Visit HackerNoon.com to read, write, learn and publish.