The Good Tech Companies - How Vector Search Cracks the Code on Contract Analytics

Episode Date: December 9, 2024

This story was originally published on HackerNoon at: https://hackernoon.com/how-vector-search-cracks-the-code-on-contract-analytics. Learn how vector search helped buil...d a powerful way to identify recurring payments in the financial sector. Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning. You can also check exclusive content about #vector-search, #software-development, #app-development, #what-is-vector-search, #wealthapi, #wealthapi-implementation, #transaction-search-engine, #good-company, and more. This story was written by: @datastax. Learn more about this writer by checking @datastax's about page, and for more stories, please visit hackernoon.com. A look at the application architecture of wealthAPI, a data analytics provider for the financial sector that has created a highly accurate way to identify recurring payment entries.

Transcript
Starting point is 00:00:00 This audio is presented by HackerNoon, where anyone can learn anything about any technology. How Vector Search Cracks the Code on Contract Analytics, by DataStax. A look at the application architecture of wealthAPI, a data analytics provider for the financial sector that has created a highly accurate way to identify recurring payment entries. At wealthAPI, we've always believed financial analytics should be smarter and faster, especially when identifying recurring payments hidden in transaction data. We've built a solution that uses AI to transform raw transaction data into actionable insights. Our system uses vector embeddings to group transactions into recurring
Starting point is 00:00:39 payment patterns, ensuring accuracy even when recurring payment entries contain subtle wording differences. From subscriptions to insurance payments, our platform delivers reliable results while maintaining the speed and scalability financial companies need. Here, we'll show how we designed our architecture to solve these challenges, from data ingestion and vector embeddings to clustering transactions into meaningful groups. We'll also explore how AI powers advanced features like semantic search, allowing users to find and analyze financial data effortlessly. What the application does. wealthAPI tackles a common yet challenging problem for financial companies:
Starting point is 00:01:17 Identifying recurring payments, such as subscriptions, in bank transaction histories. Traditional methods struggled to scale and often relied on exact matches, missing subtle differences, e.g. "Spotify" vs. "Spotify AB". wealthAPI addresses this problem with an AI-driven approach that delivers accuracy and speed. At the heart of this solution lies DataStax Astra DB, a database platform purpose-built for modern, scalable, and AI-integrated workflows. Architecture. wealthAPI's system takes raw bank transactions, processes them into embeddings, and groups them into recurring payment patterns, all powered by Astra DB's vector similarity search capabilities. The architecture ensures scalability and responsiveness at each stage, even under high data volumes. Here's a simplified flow of the process. 1. Data ingestion. When bank transactions are received, the wealthAPI
Starting point is 00:02:12 backend publishes them on a message queue for asynchronous processing. 2. Embedding creation. Each transaction, e.g. "Spotify, -10 €, 22/10/24", is transformed into a numeric vector, e.g. [0.12, 0.65, 0.78, 0.23], using Astra DB's vectorize feature. 3. Vector storage and search in Astra DB. The embeddings are stored in Astra DB, where fast vector similarity searches allow the system to find and cluster similar transactions. 4. Regularity analysis. The clusters are analyzed to identify recurring payments, categorizing them as contracts like "Spotify, music service, monthly" or "Health insurance, health, yearly". Astra DB ensures the entire process is scalable and responsive, even with high volumes of data. The process also adheres
Starting point is 00:03:13 to strict data security measures to ensure that end users and their transactions remain anonymous and protected from external access. Technical implementation. Clustering transactions into contracts. Grouping transactions has always been a core challenge. Previous tools depended on exact matches, e.g., vendor name or payment amount, which often failed to capture variations and were slow to scale. At wealthAPI, we previously tried searching for patterns among millions of transactions with traditional databases, which was both slow and prone to errors. Even small variations in transaction details broke the clustering logic. Because we're using Astra DB, we can store embeddings and efficiently search for similar
Starting point is 00:03:54 transactions, even with minor variations in details. Here's an example. A payment labeled "Spotify AB" for 10 euros on one day and "Spotify" for 10 euros the next is correctly grouped as the same recurring payment. Handling large data volumes. With thousands of transactions processed daily, wealthAPI required a database that could scale seamlessly while maintaining speed and accuracy. Astra DB's foundation is Apache Cassandra, so it's built for scalability. It also integrates with AI workflows, enabling wealthAPI to maintain fast queries without compromising precision. Transaction search engine. Because embeddings capture the underlying meaning of transactions,
Starting point is 00:04:36 wealthAPI can also implement a search feature. Users can type a keyword like "health" to retrieve all health-related transactions without relying on predefined tags or categories. The system generates an embedding from the user query and runs a simple similarity search using Astra DB. Its vector search capability makes this kind of semantic search fast and accurate. A user typing "health", for example, will see all payments for health-related services, like insurance or gym memberships, even if the vendor names differ. Wrapping up. wealthAPI's use of Astra DB demonstrates how advanced database technology can drive innovation in financial analytics. From precise transaction clustering to enabling a semantic search engine, Astra DB's vector search and scalability empower wealthAPI to deliver faster, smarter solutions to its clients. By integrating AI workflows directly into Astra DB's architecture,
Starting point is 00:05:31 wealthAPI has enhanced financial data processing and introduced a valuable new capability for contract analytics. By Belkacem Berchiche, machine learning engineer, wealthAPI, and Dieter Flick, solution engineer, DataStax. Learn more about Astra DB and wealthAPI. Thank you for listening to this HackerNoon story, read by Artificial Intelligence. Visit hackernoon.com to read, write, learn and publish.
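
Editor's sketch (not part of the audio): the clustering and regularity-analysis steps described in the episode can be illustrated with a small, self-contained Python example. Everything in it is an assumption made for illustration. The `embed` function is a toy character-trigram stand-in for a real embedding model, not Astra DB's vectorize feature, and the greedy threshold clustering, the `threshold=0.6` value, and the `is_monthly` rule are hypothetical choices, not wealthAPI's actual method. In a real system the similarity lookup would be delegated to the vector database rather than computed in a loop.

```python
from collections import Counter
from datetime import date
from math import sqrt
from statistics import median


def embed(text):
    """Toy stand-in for an embedding model: character-trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(transactions, threshold=0.6):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []  # list of (representative embedding, member transactions)
    for tx in transactions:
        vec = embed(tx["label"])
        for representative, members in clusters:
            if cosine(representative, vec) >= threshold:
                members.append(tx)
                break
        else:
            clusters.append((vec, [tx]))
    return [members for _, members in clusters]


def is_monthly(members, tolerance_days=4):
    """Hypothetical regularity rule: at least 3 payments, median gap near 30 days."""
    days = sorted(tx["day"] for tx in members)
    if len(days) < 3:
        return False
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    return abs(median(gaps) - 30) <= tolerance_days


# Made-up sample data, loosely modeled on the episode's Spotify example.
TRANSACTIONS = [
    {"label": "Spotify AB", "amount": -10.0, "day": date(2024, 8, 22)},
    {"label": "Spotify", "amount": -10.0, "day": date(2024, 9, 22)},
    {"label": "Health Insurance", "amount": -480.0, "day": date(2024, 1, 15)},
    {"label": "Spotify AB", "amount": -10.0, "day": date(2024, 10, 22)},
]

groups = cluster(TRANSACTIONS)
for members in groups:
    labels = sorted({tx["label"] for tx in members})
    print(labels, "monthly" if is_monthly(members) else "not recurring")
```

Because similarity is computed on vectors rather than exact strings, "Spotify" and "Spotify AB" fall into the same group, while the single health insurance payment forms its own group and is not flagged as monthly. A production embedding model would also capture meaning rather than spelling, which is what makes a query like "health" match differently named vendors.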
