The Good Tech Companies - A Look Into 5 Use Cases for Vector Search from Major Tech Companies

Starting point is 00:00:00 This audio is presented by Hacker Noon, where anyone can learn anything about any technology. A look into 5 use cases for vector search from major tech companies, by Rockset. Many organizations that we've spoken to are in the exploration phase of using vector search for AI-powered personalization, recommendations, semantic search and anomaly detection. The recent and astronomical improvements in the accuracy and accessibility of large language models (LLMs), including BERT and OpenAI, have made companies rethink how to build relevant search and analytics experiences. In this blog, we capture engineering stories from five early adopters of vector search, Pinterest, Spotify, eBay, Airbnb and DoorDash, who have integrated AI into their applications. We hope these stories will be helpful to engineering teams who are thinking through the full lifecycle of vector search all

Starting point is 00:00:49 the way from generating embeddings to production deployments. What is vector search? Vector search is a method for efficiently finding and retrieving similar items from a large dataset based on representations of the data in a high-dimensional space. In this context, items can be anything, such as documents, images, or sounds, and are represented as vector embeddings. The similarity between items is computed using distance metrics, such as cosine similarity or Euclidean distance, which quantify the closeness of two vector embeddings. The vector search process usually involves three steps. Generating embeddings. Relevant features are extracted from the raw data to create vector representations using models such as Word2Vec,

Starting point is 00:01:30 BERT or Universal Sentence Encoder. Indexing. The vector embeddings are organized into a data structure that enables efficient search, using libraries or algorithms such as FAISS or HNSW. Vector search. The most similar items to a given query vector are retrieved based on a chosen distance metric like cosine similarity or Euclidean distance. To better visualize vector search, we can imagine a 3D space where each axis corresponds to a feature, and the position of a point in the space is determined by the values of these features. In this space, similar items are located closer together and dissimilar items are farther apart. (Source: GitHub, Julie Mills.) Given a query, we can then find

Starting point is 00:02:11 the most similar items in the dataset. The query is represented as a vector embedding in the same space as the item embeddings, and the distance between the query embedding and each item embedding is computed. The item embeddings with the shortest distance to the query embedding are considered the most similar. This is obviously a simplified visualization, as vector search operates in high-dimensional spaces. In the next sections, we'll summarize five engineering blogs on vector search and highlight key implementation considerations. The full engineering blogs can be found below. PinText, a multitask text embedding system in Pinterest, by Jin Feng Zhuang at Pinterest.

Starting point is 00:02:50 Introducing Natural Language Search for Podcast Episodes, by Alexandre Tamborrino at Spotify. How eBay's new search feature was inspired by window shopping, by Senthil Kumar Gopal, Shabangi Tandon, Christopher Miller, Deepika Srinivasan, Rui Kong, Selchuk Kapru and Srinivas Bhagavathula at eBay. Listing embeddings in search ranking, by Mihajlo Grbovic at Airbnb. Personalized store feed with vector embeddings, by Mitchell Koch, Amir Manasawala and Raghav Ramesh at DoorDash. Pinterest. Interest search and discovery. Pinterest uses vector search for image search and discovery across multiple areas of its platform, including recommended content on the home feed, related pins and search, using a multitask learning model. A multitask model is trained to perform

Starting point is 00:03:37 multiple tasks simultaneously, often sharing underlying representations or features, which can improve generalization and efficiency across related tasks. In the case of Pinterest, the team trained and used the same model to drive recommended content on the home feed, related pins and search. Pinterest trains the model by pairing a user's search query, q, with the content they clicked on or pins they saved, p. Here is how Pinterest created the (q, p) pairs for each task. Related pins. Word embeddings are derived from the selected subject, q, and the pin clicked on or saved by the user, p. Search. Word embeddings are created from the search query text, q, and the pin clicked on or saved by the user, p. Home feed. Word embeddings are generated based on the interest of the user, q, and the pin

Starting point is 00:04:26 clicked on or saved by the user, p. To obtain an overall entity embedding, Pinterest averages the associated word embeddings for related pins, search and the home feed. Pinterest created its own supervised model, PinText MTL (multi-task learning), and evaluated it on precision against unsupervised learning models, including GloVe and Word2Vec, as well as a single-task learning model, PinText SR. PinText MTL had higher precision than the other embedding models, meaning that it had a higher proportion of true positive predictions among all positive predictions. Pinterest also found that multi-task learning models had a higher recall, or a higher

Starting point is 00:05:05 proportion of relevant instances correctly identified by the model, making them a better fit for search and discovery. To put this all together in production, Pinterest has a multitask model trained on streaming data from the home feed, search and related pins. Once that model is trained, vector embeddings are created in a large batch job using either Kubernetes plus Docker or a MapReduce system. The platform builds a search index of vector embeddings and runs a k-nearest-neighbors (KNN) search to find the most relevant content for users. Results are cached to meet the performance requirements of the Pinterest platform. Spotify. Podcast search. Spotify combines keyword and semantic search

Starting point is 00:05:46 to retrieve relevant podcast episode results for users. As an example, the team highlighted the limitations of keyword search for the query "electric cars climate impact", which yielded zero results even though relevant podcast episodes exist in the Spotify library. To improve recall, the Spotify team used Approximate Nearest Neighbor (ANN) search for fast, relevant podcast search. The team generates vector embeddings using the Universal Sentence Encoder CMLM model, as it is multilingual, supporting a global library of podcasts, and produces high-quality vector embeddings. Other models were also evaluated including BERT, a model trained on a big corpus

Starting point is 00:06:25 of text data, but the team found that BERT was better suited for word embeddings than sentence embeddings and was pre-trained only in English. Spotify builds the vector embeddings with the query text being the input embedding and a concatenation of textual metadata fields, including title and description, for the podcast episode embeddings. To determine the similarity, Spotify measured the cosine distance between the query and episode embeddings. To train the base Universal Sentence Encoder CMLM model, Spotify used positive pairs of successful podcast searches and episodes. They incorporated in-batch negatives, a technique highlighted in papers including Dense Passage Retrieval for Open-Domain Question Answering (DPR) and Que2Search: Fast and Accurate Query and Document Understanding for Search at Facebook, to generate random negative pairings. Testing was also conducted using synthetic

Starting point is 00:07:16 queries and manually written queries. To incorporate vector search into serving podcast recommendations in production, Spotify used the following steps and technologies. Index episode vectors. Spotify indexes the episode vectors offline in batch using Vespa, a search engine with native support for ANN. One of the reasons that Vespa was chosen is that it can also incorporate metadata filtering post-search on features like episode popularity. Online inference. Spotify uses Google Cloud Vertex AI to generate a query vector. Vertex AI was chosen for its support for GPU inference, which is more cost-effective when using large transformer models to generate embeddings, and for its query cache. After

Starting point is 00:07:58 the query vector embedding is generated, it is used to retrieve the top 30 podcast episodes from Vespa. Semantic search helps identify pertinent podcast episodes, yet it cannot fully replace keyword search. This is because semantic search falls short of exact term matching when users search for an exact episode or podcast name. Spotify employs a hybrid search approach, merging semantic search in Vespa with keyword search in Elasticsearch, followed by a final re-ranking stage to determine the episodes displayed to users. eBay. Image search.

Starting point is 00:08:33 Traditionally, search engines have displayed results by aligning the search query text with textual descriptions of items or documents. This method relies extensively on language to infer preferences and is not as effective in capturing elements of style or aesthetics. eBay introduced image search to help users find relevant, similar items that meet the style they're looking for. eBay uses a multi-modal model, which is designed to process and integrate data from multiple modalities or input types, such as text, images, audio, or video, to make predictions or perform tasks. eBay incorporates both text and images into its model, producing image embeddings using a convolutional neural network (CNN) model, specifically ResNet-50, and title embeddings

Starting point is 00:09:19 using a text-based model such as BERT. Every listing is represented by a vector embedding that combines both the image and title embeddings. Once the multi-modal model is trained using a large dataset of image-title listing pairs and recently sold listings, it is time to put it into production in the site search experience. Due to the large number of listings at eBay, the data is loaded in batches to HDFS, eBay's data warehouse. eBay uses Apache Spark to retrieve and store the image and relevant fields required for further processing of listings, including generating listing embeddings. The listing embeddings are published to a columnar store such as HBase, which is good at aggregating large-scale data. From HBase, the listing embedding is indexed and served in Cassini,

Starting point is 00:10:04 a search engine created at eBay. The pipeline is managed using Apache Airflow, which is capable of scaling even when there is a high quantity and complexity of tasks. It also provides support for Spark, Hadoop, and Python, making it convenient for the machine learning team to adopt and utilize. Visual search allows users to find similar styles and preferences in the categories of furniture and home decor, where style and aesthetics are key to purchase decisions. In the future, eBay plans to expand visual search across all categories and also help users discover related items so they can establish the same look and feel across their home. Airbnb. Real-time personalized listings. Search and similar listings features drive 99% of bookings on the Airbnb site. Airbnb built a listing embedding technique to improve

Starting point is 00:10:52 similar listing recommendations and provide real-time personalization in search rankings. Airbnb realized early on that they could expand the application of embeddings beyond just word representations, encompassing user behaviors including clicks and bookings as well. To train the embedding models, Airbnb incorporated over 4.5 million active listings and 800 million search sessions to determine the similarity based on which listings a user clicks and skips in a session. Listings that were clicked by the same user in a session are pushed closer together. Listings that were skipped by the user are pushed further away. The team settled on a listing embedding dimensionality of d = 32, given the trade-off between offline performance and memory needed

Starting point is 00:11:34 for online serving. Airbnb found that certain listing characteristics do not require learning, as they can be directly obtained from metadata, such as price. However, attributes like architecture, style, and ambiance are considerably more challenging to derive from metadata. Before moving to production, Airbnb validated their model by testing how well the model recommended listings that a user actually booked. The team also ran an A/B test comparing the existing listings algorithm against the vector embedding-based algorithm. They found that the algorithm with vector embeddings resulted in a 21% uptick in CTR and a 4.9% increase in users discovering

Starting point is 00:12:26 a listing that they booked. The team also realized that vector embeddings could be used as part of the model for real-time personalization in search. For each user, they collected and maintained in real time, using Kafka, a short-term history of user clicks and skips in the last two weeks. For every search conducted by the user, they ran two similarity searches, based first on the geographic markets that were recently searched and then on the similarity between the candidate listings and the ones the user has clicked or skipped. Embeddings were evaluated in offline and online experiments and became part of the real-time personalization features. DoorDash. Personalized store feeds. DoorDash has a wide variety of stores

Starting point is 00:13:07 that users can choose to order from, and being able to surface the most relevant stores using personalized preferences improves search and discovery. DoorDash wanted to apply latent information to its store feed algorithms using vector embeddings. This would enable DoorDash to uncover similarities between stores that were not well-documented, including whether a store has sweet items, is considered trendy or features vegetarian options. DoorDash used a derivative of Word2Vec, an embedding model used in natural language processing, called Store2Vec, which it adapted based on existing data. The team treated each store as a word and formed sentences using the list of stores viewed during a single user session, with a maximum limit of five stores per sentence. To create user vector

Starting point is 00:13:50 embeddings, DoorDash summed the vectors of the stores from which users placed orders in the past six months or up to 100 orders. As an example, DoorDash used vector search to find similar restaurants for a user based on their recent purchases at popular, trendy joints 4505 Burgers and New Nagano Sushi in San Francisco. DoorDash generated a list of similar restaurants by measuring the cosine distance from the user embedding to store embeddings in the area. You can see that the stores that were closest in cosine distance include Kezar Pub and Wooden Charcoal Korean Village BBQ. DoorDash incorporated the Store2Vec distance feature as one of the features in its larger recommendation and personalization model. With vector search, DoorDash was able to see a 5% increase in click-through rate. The team is also experimenting with new models like seq2seq, model optimizations

Starting point is 00:14:42 and incorporating real-time on-site activity data from users. Key considerations for vector search. Pinterest, Spotify, eBay, Airbnb and DoorDash create better search and discovery experiences with vector search. Many of these teams started out using text search and found limitations with fuzzy search or searches of specific styles or aesthetics. In these scenarios, adding vector search to the experience made it easier to find relevant, and often personalized, podcasts, pillows, rentals, pins and eateries. There are a few decisions that these companies made that are worth calling out when implementing vector search. Embedding models. Many started out using an off-the-shelf model and

Starting point is 00:15:21 then trained it on their own data. They also recognized that language models like Word2Vec could be used by swapping words and their descriptions with items and similar items that were recently clicked. Teams like Airbnb found that using derivatives of language models, rather than image models, could still work well for capturing visual similarities and differences. Training. Many of these companies opted to train their models on past purchase and click-through data, making use of existing large-scale datasets. Indexing. While many companies adopted ANN search, we saw that Pinterest was able to combine metadata filtering with KNN search for efficiency at scale. Hybrid search. Vector search rarely replaces text search. Many

Starting point is 00:16:03 times, like in Spotify's example, a final ranking algorithm is used to determine whether vector search or text search generated the most relevant result. Productionizing. We're seeing many teams use batch-based systems to create the vector embeddings, given that these embeddings are rarely updated. They employ a different system, frequently Elasticsearch, to compute the query vector embedding live and incorporate real-time metadata in their search. Rockset, a real-time search and analytics database, recently added support for vector search. Give vector search on Rockset a try for real-time personalization, recommendations, anomaly detection and more by starting a free trial with $300 in credits today.
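As a rough illustration of the search step described in this story, here is a minimal Python sketch of an exact nearest-neighbor lookup by cosine similarity. It also builds a user embedding by summing the vectors of previously ordered stores, as the DoorDash section describes. The store names, the three-dimensional vectors and the function names are all hypothetical and are not taken from any of the five companies' systems. Production systems use learned embeddings with many more dimensions and typically an ANN index rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest(query, embeddings, k, exclude=()):
    """Exact (brute-force) k-nearest-neighbor search by cosine similarity."""
    scored = [
        (name, cosine_similarity(query, vector))
        for name, vector in embeddings.items()
        if name not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical store embeddings; real ones would be learned from session data.
store_embeddings = {
    "burger_place": [0.9, 0.1, 0.0],
    "sushi_bar": [0.1, 0.9, 0.1],
    "korean_bbq": [0.7, 0.3, 0.2],
    "salad_shop": [0.0, 0.2, 0.9],
}

# User embedding = element-wise sum of the vectors of stores the user ordered from.
past_orders = ["burger_place", "sushi_bar"]
user_embedding = [
    sum(values) for values in zip(*(store_embeddings[s] for s in past_orders))
]

# Rank the remaining stores by similarity to the user embedding.
recommendations = nearest(user_embedding, store_embeddings, k=2, exclude=past_orders)
print(recommendations)
```

Sorting by descending cosine similarity is equivalent to sorting by ascending cosine distance, since cosine distance is one minus cosine similarity. An exhaustive scan like this is only practical for small catalogs, which is one reason the teams above rely on search engines with built-in vector indexes.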

Starting point is 00:16:44 Thank you for listening to this Hacker Noon story, read by Artificial Intelligence. Visit hackernoon.com to read, write, learn and publish.
