The Good Tech Companies - How to Use Vector Search to Build a Movie Recommendation App
Episode Date: November 5, 2025
This story was originally published on HackerNoon at: https://hackernoon.com/how-to-use-vector-search-to-build-a-movie-recommendation-app. Learn how to build a semantic movie recommendation app using ScyllaDB’s vector search to find films by meaning, not just keywords. Check more stories related to programming at: https://hackernoon.com/c/programming. You can also check exclusive content about #scylladb-vector-search, #movie-recommendation-app, #semantic-search-tutorial, #vector-similarity-functions, #python-streamlit-app, #sentence-transformers, #ann-index-scylladb, #good-company, and more. This story was written by: @scylladb. Learn more about this writer by checking @scylladb's about page, and for more stories, please visit hackernoon.com. ScyllaDB’s new Vector Search lets developers build semantic search apps that understand meaning, not just text. This tutorial shows how to create a movie recommendation app using Sentence Transformers, Python, and Streamlit. It covers schema design, vector indexing, and ANN-based querying for fast, intelligent recommendations.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
How to Use Vector Search to Build a Movie Recommendation App, by ScyllaDB.
Use ScyllaDB to perform semantic search across movie plot descriptions.
We built a sample movie recommendation app to showcase ScyllaDB's new vector search capabilities.
The sample app gives you a simple way to experience building low latency semantic search and vector-based applications with ScyllaDB.
Join the Vector Search Early Access Program. In this post, we'll show how to perform semantic search across movie plot descriptions to find movies by meaning, not keywords.
This example also shows how you can add ScyllaDB vector search to your existing applications.
Before diving into the application, let's clarify what we mean by semantic search and provide some context about similarity functions.
About vector similarity functions.
Similarity between two vectors can be calculated in several ways. The most common methods are cosine similarity, dot product (inner product), and L2 (Euclidean) distance. ScyllaDB vector search supports all of these functions. For text embeddings, cosine similarity is the most commonly used similarity function. That's because, when working with text, we mostly focus on the direction of the vector rather than its magnitude. Cosine similarity considers only the angle between the vectors (i.e., the difference in direction) and ignores the magnitude (length) of the vectors.
For example, a short document (1 page) and a longer document (10 pages) on the same topic will still point in similar directions in the vector space despite their different lengths. This is what makes cosine similarity ideal for capturing topical similarity.
In practice, many embedding models (e.g., OpenAI models) produce normalized vectors. Normalized vectors all have the same length (a magnitude of 1). For normalized vectors, cosine similarity and the dot product return the same result. This is because cosine similarity divides the dot product by the magnitudes of the vectors, which are all 1 when vectors are normalized. The L2 function produces different distance values compared to the dot product or cosine similarity, but the ordering of the embeddings remains the same, assuming normalized vectors.
Now that you have a better understanding of semantic similarity functions, let's explain how the recommendation app works.
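Before moving on, the claims about normalized vectors can be checked with a few lines of plain Python. This is only a minimal sketch: the three-dimensional vectors are made-up toy values rather than real model output, and the helper names (dot, normalize, cosine_similarity, l2_distance, rank_by) are illustrative, not part of the sample app.

```python
import math

def dot(a, b):
    """Dot (inner) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Magnitude (length) of a vector."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Scale a vector so that its magnitude is 1."""
    n = norm(a)
    return [x / n for x in a]

def cosine_similarity(a, b):
    """Dot product divided by the product of the magnitudes."""
    return dot(a, b) / (norm(a) * norm(b))

def l2_distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy "embeddings" (assumed values for illustration only).
query = normalize([1.0, 2.0, 0.5])
candidates = {
    "a": normalize([1.0, 2.1, 0.4]),
    "b": normalize([0.2, 0.1, 3.0]),
    "c": normalize([2.0, 1.0, 1.0]),
}

def rank_by(score, reverse):
    """Order candidate keys by their score against the query."""
    return sorted(candidates, key=lambda k: score(query, candidates[k]), reverse=reverse)

# Similarities sort descending (higher is closer); distances sort ascending.
by_cosine = rank_by(cosine_similarity, reverse=True)
by_dot = rank_by(dot, reverse=True)
by_l2 = rank_by(l2_distance, reverse=False)

# Cosine similarity ignores magnitude: scaling a vector by 10 changes nothing.
same_direction = cosine_similarity([1.0, 2.0, 0.5], [10.0, 20.0, 5.0])
```

With normalized inputs, all three rankings come out in the same order, and cosine similarity matches the dot product up to floating-point rounding.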
App overview.
The application allows users to input what kind of movie they want to watch.
For example, if you type American Football,
the app compares your input to the plots of movies stored in the database.
The first result is the best match, followed by other similar recommendations.
This comparison uses ScyllaDB vector search.
You can find the source code on GitHub, along with setup instructions and a step-by-step tutorial in the documentation.
For the dataset, we are reusing a TMDB dataset available on Kaggle.
Project requirements.
To run the application, you need a ScyllaDB Cloud account and a vector search enabled cluster. Right now, you need to use the API to create a vector search enabled cluster. Follow the instructions here to get started.
The application depends on a few Python packages:
ScyllaDB Python driver, for connecting to and querying ScyllaDB.
Sentence Transformers, to generate embeddings locally without requiring OpenAI or other paid APIs.
Streamlit, for the UI.
Pydantic, to make working with query results easier.
By default, the app uses the all-MiniLM-L6-v2 model so anyone can run it locally without heavy compute requirements. Other than ScyllaDB Cloud, no commercial or paid services are needed to run the example.
Configuration and database connection.
A configuration file stores ScyllaDB Cloud credentials, including the host address and connection details. A separate ScyllaDB helper module handles the following:
Creating the connection and session.
Inserting and querying data.
Providing helper functions for clean database interactions.
Database schema.
The schema is defined in a schema.cql file, executed when running the project's migration script. It includes:
Keyspace creation, with a replication factor of 3.
Table definition for movies, storing fields like the plot and its embedding.
Vector search index.
Schema highlights:
plot (text): stores the movie description used for similarity comparison.
plot_embedding (vector): embedding representation of the plot, defined using the vector data type with 384 dimensions, matching the Sentence Transformers model.
Primary key: id as the partition key, for efficient lookups when querying by ID.
CDC enabled: required for ScyllaDB vector search.
Vector index: an approximate nearest neighbor (ANN) index created on the plot_embedding column to enable efficient vector queries.
The goal of this schema is to allow efficient search on the plot embeddings and store additional information alongside the vectors.
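To make the schema description more concrete, here is a rough sketch of what such a migration could look like when driven from Python. It is not the project's actual schema file: the keyspace name (recommend), table name (movies), index name, the title column, and the migrate helper are all assumptions made for illustration, and the exact CQL options should be checked against the ScyllaDB vector search documentation.

```python
# All identifiers below (keyspace, table, index, "title" column) are
# illustrative assumptions, not taken from the sample app's schema file.
KEYSPACE = "recommend"
TABLE = "movies"
EMBEDDING_DIM = 384  # must match the embedding model's output size

SCHEMA_STATEMENTS = [
    # Keyspace with a replication factor of 3.
    f"""
    CREATE KEYSPACE IF NOT EXISTS {KEYSPACE}
    WITH replication = {{'class': 'NetworkTopologyStrategy', 'replication_factor': 3}}
    """,
    # Movies table: id is the partition key, CDC is enabled for vector search.
    f"""
    CREATE TABLE IF NOT EXISTS {KEYSPACE}.{TABLE} (
        id INT,
        title TEXT,
        plot TEXT,
        plot_embedding vector<float, {EMBEDDING_DIM}>,
        PRIMARY KEY (id)
    ) WITH cdc = {{'enabled': true}}
    """,
    # ANN index on the embedding column, using cosine similarity.
    f"""
    CREATE CUSTOM INDEX IF NOT EXISTS {TABLE}_plot_ann
    ON {KEYSPACE}.{TABLE} (plot_embedding)
    USING 'vector_index'
    WITH OPTIONS = {{'similarity_function': 'COSINE'}}
    """,
]

def migrate(session):
    """Run each schema statement in order on any object exposing .execute()."""
    for statement in SCHEMA_STATEMENTS:
        session.execute(statement.strip())
```

The migrate function only needs an object with an execute method, so in practice it would receive a driver session connected to the cluster.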
Embeddings.
An embedding creator class handles text embedding generation with Sentence Transformers. The function accepts any text input and returns a list of float values that you can insert into ScyllaDB's vector column.
Recommendations implemented with vector search.
The app's main function is to provide movie recommendations. These recommendations are implemented using vector search. So we create a recommender module that handles:
1. Taking the input text.
2. Turning the text into embeddings.
3. Running vector search.
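The three steps above could be sketched roughly as follows. This is a hedged illustration, not the project's actual recommender module: the class and method names (EmbeddingCreator, create_embedding, MovieRecommender, similar_movies) and the table name recommend.movies are assumptions. The session argument is expected to behave like a Python driver session, meaning it exposes execute(query, parameters).

```python
class EmbeddingCreator:
    """Turns text into a list of floats with a Sentence Transformers model."""

    def __init__(self, model=None, model_name="all-MiniLM-L6-v2"):
        if model is None:
            # Imported lazily so the class can also be used with a stub model.
            from sentence_transformers import SentenceTransformer
            model = SentenceTransformer(model_name)
        self.model = model

    def create_embedding(self, text):
        # encode() returns an array-like of numbers; convert to plain floats
        # so the result can be passed to the database driver.
        return [float(x) for x in self.model.encode(text)]


class MovieRecommender:
    """Takes input text, embeds it, and runs a vector search."""

    def __init__(self, session, embedder, table="recommend.movies"):
        self.session = session    # driver session (assumed to expose .execute)
        self.embedder = embedder  # object with create_embedding(text)
        self.table = table        # hypothetical keyspace.table name

    def similar_movies(self, user_input, top_k=5):
        # Steps 1 and 2: take the input text and turn it into an embedding.
        embedding = self.embedder.create_embedding(user_input)
        # Step 3: order rows by similarity to the embedding, keep the top_k.
        query = (
            f"SELECT * FROM {self.table} "
            f"ORDER BY plot_embedding ANN OF %s LIMIT {int(top_k)}"
        )
        return list(self.session.execute(query, (embedding,)))
```

A caller would construct MovieRecommender with a connected session and an EmbeddingCreator, then call similar_movies("American football") to get the closest matches.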
Let's break down the vector search query. User input is first converted to an embedding, ensuring that we're comparing embedding to embedding. The rows in the table are ordered by similarity using the ANN operator. Results are limited to five similar movies.
The statement retrieves all columns from the table. In similarity search, we calculate the distance between two vectors. The closer the vectors in vector space, the more similar their underlying content. In other words, a smaller distance suggests higher similarity. Therefore, an ORDER BY sorts results in ascending order, with smaller distances appearing first.
Streamlit UI.
The UI ties everything together. It takes the user's query, converts it to an embedding, and executes a vector search. The UI displays the best match and a list of other similar movie recommendations.
Try it yourself.
If you want to get started building with ScyllaDB Vector Search, you have several options.
Explore the source code on GitHub.
Use the readme to set up the app on your computer.
Follow the tutorial to build the app from scratch.
And if you have questions, use the forum and we'll be happy to help.
About Attila Toth. Attila Toth is a developer advocate at ScyllaDB.
He writes tutorials and blog posts, speaks at events, creates demos and sample applications
to help developers build high performance applications.
Thank you for listening to this Hackernoon story, read by artificial intelligence.
Visit hackernoon.com to read, write, learn and publish.
