The Good Tech Companies - Behind AI Agents: The Infrastructure That Supports Autonomy
Episode Date: January 29, 2025This story was originally published on HackerNoon at: https://hackernoon.com/behind-ai-agents-the-infrastructure-that-supports-autonomy. Learn about the infrastructure t...hat supports orchestration across many moving parts and a long history of data and context needed to build agentic systems. Check more stories related to programming at: https://hackernoon.com/c/programming. You can also check exclusive content about #software-development, #generative-ai, #autonomous-agents, #ai-agents, #autonomy, #component-orchestration, #context-management, #good-company, and more. This story was written by: @datastax. Learn more about this writer by checking @datastax's about page, and for more stories, please visit hackernoon.com. Learn about the infrastructure that supports orchestration across many moving parts and a long history of data and context needed to build agentic systems.
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
Behind AI Agents: The Infrastructure That Supports Autonomy, by DataStax.
Most descriptions of AI agents and agentic systems focus on agents' ability to act autonomously,
without user intervention, in many situations across the agent's intended use cases.
Some agents operate with a human-in-the-loop model, engaging the user only when they encounter
uncertainty, but still acting autonomously under typical circumstances and in situations where the agent is certain.
With autonomy being the primary defining feature of AI agents, there are supporting
capabilities that agents need in order to act independently from user input.
In an earlier blog post, we identified four requirements for agentic AI architectures:

1. Ability and access: the capability to act on behalf of the user, including permissions and authenticated access to relevant systems.
2. Reasoning and planning: using reasoning to make decisions within a structured thought process, often defined as a chain, tree, graph, or algorithm that guides the agent's actions.
3. Component orchestration: coordination of multiple parts, including prompts, LLMs, available data sources, context, memory, history, and the execution and status of potential actions.
4. Guardrails: mechanisms to keep the agent focused and effective, including safeguards to avoid errors or provide helpful diagnostic information in case of failure.

Each of these
four requirements has different infrastructure needs. For ability and access, the primary needs
are software integrations and credential management. Reasoning and planning are mainly
supported by LLMs and other AI models. The topic of guardrails is vast and often specific to the use cases involved,
so we will save that for a future article. Here, I'd like to focus on orchestration,
and the infrastructure needed to support intelligent orchestration across a large
number of moving parts and a long history of data and context that might be needed at decision time.
Component orchestration and the role of context in AI agents

Assuming that the first two requirements above, including ability, access,
reasoning, and planning, are functioning as intended, the main challenge of component
orchestration boils down to knowledge management. The agentic system needs to maintain awareness on
a variety of levels, its core tasks and goals, the state of various relevant systems,
the history of interactions with the user and other external systems, and potentially more.
With LLMs, we use the concept of a context window to describe the set of information available to
the model, generally at prompt time. This is distinct from the information contained in the
prompt itself and also distinct from the LLM's internal knowledge set that was formed during the model training process. Loosely speaking, a context window
can be thought of as a recent history of information that is available to the LLM at prompt time.
This is implicit in the architecture of LLMs and prompting. In that way, most LLMs have a
one-dimensional concept of context, and older context simply falls out
of the window over time.
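This one-dimensional, oldest-falls-out behavior can be illustrated with a minimal sketch. The class and tokenizer below are illustrative stand-ins, not any real LLM API:

```python
from collections import deque

class SlidingContextWindow:
    """Illustrates the one-dimensional context model of a plain LLM:
    when the token budget is exceeded, the oldest content simply
    falls out of the window."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.messages: deque = deque()  # (text, token_count) pairs
        self.total_tokens = 0

    def add(self, text: str) -> None:
        tokens = len(text.split())  # crude stand-in for a real tokenizer
        self.messages.append((text, tokens))
        self.total_tokens += tokens
        # Evict the oldest messages until we fit the budget again.
        while self.total_tokens > self.max_tokens and len(self.messages) > 1:
            _, dropped = self.messages.popleft()
            self.total_tokens -= dropped

    def render(self) -> str:
        return "\n".join(text for text, _ in self.messages)

window = SlidingContextWindow(max_tokens=8)
window.add("user prefers morning meetings")  # 4 tokens, fits
window.add("user is based in Berlin")        # 5 tokens, evicts the first message
print(window.render())
```

Note that there is no notion of importance here: the first fact is lost purely because it is older, which is exactly the limitation the next paragraphs address.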
Agents need a more sophisticated system for managing context and knowledge, in order to
make sure that the most important or urgent context is made a priority, whenever the agent
needs to make a decision.
Instead of a single monolithic context, AI agents must track different types of context
at varying levels of importance.
This can be compared to memory in computer systems, where different types of storage
(cache, RAM, and hard drives) serve different purposes based on accessibility and frequency
of use. For AI agents, we can conceptually structure context into three primary levels:

1. Primary context: the agent's core task list or goals. This should always be top of mind, guiding all actions.
2. Direct context: the state of connected, relevant systems in the immediate environment, including resources like messaging systems, data feeds, critical APIs, or a user's email and calendars.
3. External context: general knowledge, or any information that might be relevant but which is not explicitly designed to be a core part of the agentic system. External context could be provided by something as simple as a search of the internet or Wikipedia. Or, it could be urgent and complicated, such as unexpected factors that arise from third-party news or updates, requiring the agent to adapt its actions dynamically.

These levels of context are not definitive, the lines between them can be very blurry, and there are other useful ways of describing types of context, but this conceptual structure is useful for our discussion here.

Storage infrastructure for context management
The storage needs of AI agents vary depending on the type of context being managed.
Each level, primary, direct, and external context, requires different data structures,
retrieval mechanisms, and update frequencies. The key challenge is ensuring efficient access,
long-term persistence, and dynamic updates without overloading the agent's processing pipeline.
Rather than treating context as a monolithic entity,
AI agents benefit from hybrid storage architectures that blend structured and unstructured data models.
This allows for fast lookups, semantic retrieval, and scalable persistence,
ensuring that relevant context is available when needed while minimizing redundant data processing.
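The three context levels can be sketched as a simple prioritized structure that always spends a limited context budget on primary context first, then direct, then external. The class and method names are hypothetical, illustrating the concept rather than any real framework:

```python
from dataclasses import dataclass, field

@dataclass
class AgentContext:
    """Three-tier context store: items are plain strings here, but in
    practice each tier would be backed by different storage."""
    primary: list = field(default_factory=list)   # core tasks and goals
    direct: list = field(default_factory=list)    # state of connected systems
    external: list = field(default_factory=list)  # general, on-demand knowledge

    def assemble(self, budget: int) -> list:
        """Build a prompt context, filling the budget tier by tier so
        the most important context is never crowded out."""
        out = []
        for tier in (self.primary, self.direct, self.external):
            for item in tier:
                if len(out) >= budget:
                    return out
                out.append(item)
        return out

ctx = AgentContext(
    primary=["goal: schedule meeting with Alex"],
    direct=["calendar: Tuesday 10:00 free", "email: awaiting reply from Alex"],
    external=["wiki: timezone conventions"],
)
print(ctx.assemble(budget=3))
```

Unlike the sliding window earlier, eviction here is by importance tier rather than by age: external knowledge is the first to be dropped when the budget is tight.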
Primary context: task lists and agent goals

The primary context consists of the agent's core objectives and active tasks, the foundation that drives decision-making. This information must be persistent, highly structured, and easily queryable, as it guides all agent actions.

Potential storage needs:

- Transactional databases (key-value or document stores) for structured task lists and goal hierarchies.
- Low-latency indexing to support quick lookups of active tasks.
- Event-driven updates to ensure tasks reflect real-time progress.

Example agent implementation: a scheduling assistant managing a task queue needs to store:

- Persistent tasks (e.g., "Schedule a meeting with Alex") with status updates.
- Execution history (e.g., "Sent initial email, awaiting response").
- Priorities and dependencies, ensuring urgent tasks are surfaced first.

A distributed, highly available data store ensures that tasks are tracked reliably, even as the agent processes new events and context updates.
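The scheduling-assistant example can be sketched as a small in-memory task store, a hypothetical stand-in for the durable, distributed database a production agent would use. It tracks tasks, statuses, execution history, and priorities:

```python
import heapq
import itertools

class TaskStore:
    """In-memory sketch of a primary-context task store: a priority
    queue surfaces the most urgent open task, while a dict holds
    status and execution history per task."""

    def __init__(self):
        self._heap = []                 # (priority, insertion_seq, task_id)
        self._seq = itertools.count()   # tie-breaker for equal priorities
        self.tasks = {}                 # task_id -> {"desc", "status", "history"}

    def add(self, task_id, desc, priority):
        self.tasks[task_id] = {"desc": desc, "status": "pending", "history": []}
        heapq.heappush(self._heap, (priority, next(self._seq), task_id))

    def update(self, task_id, status, note):
        self.tasks[task_id]["status"] = status
        self.tasks[task_id]["history"].append(note)  # execution history

    def next_task(self):
        """Surface the most urgent task that is not yet done."""
        while self._heap:
            _, _, task_id = self._heap[0]
            if self.tasks[task_id]["status"] == "done":
                heapq.heappop(self._heap)  # lazily discard finished tasks
                continue
            return task_id
        return None

store = TaskStore()
store.add("t1", "Schedule a meeting with Alex", priority=1)
store.add("t2", "File weekly report", priority=2)
store.update("t1", "waiting", "Sent initial email, awaiting response")
print(store.next_task())  # t1 is still the most urgent open task
```

Swapping the dict and heap for a transactional database with an index on priority gives the persistent, queryable version described above.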
Direct context: state of connected systems

Direct context includes the current state of relevant systems: calendars, messaging platforms, APIs, databases, and other real-time data sources. Unlike primary context, direct context is dynamic and often requires a combination of structured and real-time storage solutions.

Potential storage needs:

- Time-series databases for event logs and real-time status tracking.
- Caching layers for frequently accessed system states.
- Vector-based retrieval for contextual queries on recent interactions.

Example agent implementation: a customer support AI agent tracking live user interactions needs to store:

- Real-time conversation history in an in-memory store.
- Session state (e.g., ongoing support ticket details) in a time-series database.
- API response caches for external system lookups, avoiding redundant queries.
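The API-response cache mentioned above can be sketched as a time-to-live (TTL) cache: entries expire so the agent re-queries external systems rather than acting on stale state. This is an illustrative in-process version of what a caching layer like Redis provides:

```python
import time

class TTLCache:
    """Minimal TTL cache for direct context: a cached API response is
    served while fresh, and discarded once it exceeds its time-to-live,
    forcing a fresh external lookup."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (value, stored_at)

    def put(self, key, value):
        self._data[key] = (value, time.monotonic())

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._data[key]  # stale: caller should re-query the system
            return None
        return value

cache = TTLCache(ttl_seconds=60.0)
cache.put("ticket:42", {"status": "open", "assignee": "support-bot"})
print(cache.get("ticket:42"))
```

The TTL encodes the judgment call in this section: how long a piece of direct context can be trusted before the agent must look at the live system again.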
By structuring direct context storage with a combination of time-sensitive and long-term data stores, AI agents can act with awareness of their environment without excessive latency.

External context: knowledge retrieval and adaptation

External context encompasses general knowledge and unexpected updates from sources outside the agent's immediate control. This could range from on-demand search queries to dynamically ingested external data, requiring a flexible approach to storage and retrieval. Unlike primary and direct contexts, which are closely tied to the agent's ongoing tasks and connected systems, external context is often unstructured, vast, and highly variable in relevance.

Potential storage considerations:

- Document stores and knowledge bases for persistent, structured reference material.
- Vector search for querying large datasets of documents, internal or external.
- Retrieval-augmented generation (RAG) to fetch relevant knowledge before responding.
- Streaming and event-driven ingestion for real-time updates from external data sources.

Example agent implementation: a personal assistant assembling a report on the latest scientific discoveries in climate change research needs to:

- Retrieve scientific articles from external sources, filtering for relevance based on keywords or vector similarity.
- Analyze relationships between papers, identifying trends using a knowledge graph.
- Summarize key insights using LLM-based retrieval-augmented generation.
- Track recent updates by subscribing to real-time publication feeds and news sources.

By structuring external context storage around fast retrieval and
semantic organization, AI agents can continuously adapt to new information while ensuring that
retrieved data remains relevant, credible, and actionable.

Hybrid storage for context-aware AI agents

Designing context-aware AI agents requires
a careful balance between efficient access to critical information and avoiding memory
or processing overload. AI agents must decide when to store, retrieve, and process context
dynamically to optimize decision-making. A hybrid storage architecture, integrating
transactional, vector, time-series, and event-driven models,
allows AI agents to maintain context persistence, retrieval efficiency, and adaptive intelligence,
all of which are crucial for autonomy at scale. Achieving this balance requires structured strategies across three key dimensions:

1. Latency versus persistence: frequently accessed context (e.g., active task states) should reside in low-latency storage, while less frequently needed but essential knowledge (e.g., historical interactions) should be retrieved on demand from long-term storage.
2. Structured versus unstructured data: tasks, goals, and system states benefit from structured storage (e.g., key-value or document databases), while broader knowledge retrieval requires unstructured embeddings and graph relationships to capture context effectively.
3. Real-time versus historical awareness: some contexts require continuous monitoring (e.g., live API responses), whereas others (e.g., prior decisions or reports) should only be retrieved when relevant to the agent's current task.

Given these different
types of contexts, AI agents need a structured approach to storing and accessing information.
Relying solely on LLM context windows is inefficient, as it limits the agent's ability
to track long-term interactions and evolving situations. Instead, context should be persistently stored, dynamically retrieved,
and prioritized based on relevance and urgency.

- Primary context (tasks and goals): stored in transactional databases for structured tracking and referenced in every inference cycle.
- Direct context (system state and active data): maintained in real time through caching, time-series storage, or event-driven updates.
- External context (knowledge and dynamic updates): queried on demand via vector search, retrieval-augmented generation (RAG), or graph-based knowledge representation.

In practice, multi-tiered memory models combining
short-term caches, persistent databases, and external retrieval mechanisms are required for scalable AI agent architectures.
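Such a multi-tiered memory model can be sketched as a lookup chain that consults a short-term cache first, then a persistent store, and only then an external retriever. All three layers here are illustrative in-process stand-ins for the real systems named in this section:

```python
class TieredMemory:
    """Cache -> persistent store -> external retrieval, with results
    promoted into faster tiers on a hit so repeated lookups get cheaper."""

    def __init__(self, external_lookup):
        self.cache = {}   # short-term, fastest (stand-in for Redis)
        self.store = {}   # persistent, slower (stand-in for a database)
        self.external_lookup = external_lookup  # e.g. vector search / RAG

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        if key in self.store:
            self.cache[key] = self.store[key]  # promote to the cache
            return self.store[key]
        value = self.external_lookup(key)      # on-demand retrieval
        if value is not None:
            self.store[key] = value            # persist what we learned
            self.cache[key] = value
        return value

# A trivial stand-in for an external retriever.
memory = TieredMemory(external_lookup=lambda k: f"searched:{k}")
memory.put("goal", "schedule meeting")
print(memory.get("goal"))        # served from the persistent store
print(memory.get("news:today"))  # falls through to external retrieval
```

The promotion step is the key design choice: knowledge fetched once from a slow tier becomes cheap to access on subsequent decisions.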
By leveraging a hybrid storage approach, AI agents can maintain real-time awareness of active systems,
retrieve historical knowledge only when relevant, and dynamically adjust priorities based on evolving needs. By integrating these storage strategies,
AI agents can function autonomously, retain contextual awareness over long periods,
and respond dynamically to new information, laying the foundation for truly intelligent
and scalable agentic systems.

Hybrid storage solutions

Implementing a hybrid storage architecture for AI agents requires selecting the right databases and storage tools to handle different types of contexts efficiently.
The best choice depends on factors such as latency requirements, scalability, data structure
compatibility, and retrieval mechanisms.
A well-designed AI agent storage system typically includes:

- Transactional databases for structured, persistent task tracking.
- Time-series and event-driven storage for real-time system state monitoring.
- Vector search and knowledge retrieval for flexible, unstructured data access.
- Caching and in-memory databases for rapid short-term memory access.

Let's take a closer look at each of these elements.
Transactional and distributed databases

AI agents require scalable, highly available transactional databases to store tasks, goals, and structured metadata reliably. These databases ensure that primary context is always available and efficiently queryable.

- Apache Cassandra: a distributed NoSQL database designed for high availability and fault tolerance, ideal for managing structured task lists and agent goal tracking at scale.
- DataStax Astra DB: a managed database-as-a-service (DBaaS) built on Cassandra, providing elastic scalability and multi-region replication for AI applications requiring high durability.
- PostgreSQL: a popular relational database with strong consistency guarantees, well-suited for structured agent metadata, persistent task logs, and policy enforcement.
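The structured task tracking these databases provide can be sketched with SQLite as a lightweight, in-process stand-in for PostgreSQL or Cassandra. The schema is illustrative, not taken from any real agent framework:

```python
import sqlite3

# In-memory database as a stand-in for a durable transactional store.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tasks (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        status      TEXT NOT NULL DEFAULT 'pending',
        priority    INTEGER NOT NULL
    )
""")
conn.execute(
    "INSERT INTO tasks (description, status, priority) VALUES (?, ?, ?)",
    ("Schedule a meeting with Alex", "awaiting response", 1),
)
conn.execute(
    "INSERT INTO tasks (description, status, priority) VALUES (?, ?, ?)",
    ("File weekly report", "pending", 2),
)
conn.commit()

# Low-latency lookup of active tasks, most urgent first: the query an
# agent would run at the start of every inference cycle.
rows = conn.execute(
    "SELECT description, status FROM tasks WHERE status != 'done' ORDER BY priority"
).fetchall()
print(rows)
```

In a distributed deployment the same schema and queries would run against a replicated store, so task state survives agent restarts and concurrent updates.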
Time-series and event-driven storage for real-time system monitoring

AI agents need databases optimized for logging, event tracking, and state persistence.

- InfluxDB: a leading time-series database designed for high-speed ingestion and efficient queries, making it ideal for logging AI agent activity and external system updates.
- TimescaleDB: a PostgreSQL extension optimized for time-series workloads, suitable for tracking changes in AI agent workflows and system events.
- Apache Kafka plus ksqlDB: a streaming data platform that allows AI agents to consume, process, and react to real-time events efficiently.
- Redis Streams: a lightweight solution for real-time event handling and message queuing, useful for keeping AI agents aware of new updates as they happen.
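The event-driven pattern these tools support can be sketched with a tiny in-process publish/subscribe bus, a hypothetical stand-in for Kafka topics or Redis Streams: handlers subscribe to topics, and the agent reacts as events arrive instead of polling:

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process pub/sub: a stand-in for a streaming platform,
    illustrating how an agent stays aware of system updates."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, event):
        # Deliver the event to every handler subscribed to this topic.
        for handler in self._handlers[topic]:
            handler(event)

seen = []
bus = EventBus()
bus.subscribe("calendar", lambda e: seen.append(f"calendar update: {e}"))
bus.publish("calendar", "meeting moved to 11:00")
print(seen)
```

A real streaming platform adds what this sketch omits: durable logs, replay, consumer groups, and delivery guarantees across processes.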
Vector search for knowledge retrieval

AI agents working with unstructured knowledge require efficient ways to store, search, and retrieve embeddings for tasks like semantic search, similarity matching, and retrieval-augmented generation (RAG). A well-optimized vector search system enables agents to recall relevant past interactions, documents, or facts without overloading memory or context windows.

- DataStax Astra DB: a scalable, managed vector database built on Cassandra, offering high-performance similarity search and multimodal retrieval. Astra DB combines distributed resilience with vector search capabilities, making it a top choice for AI agents that need to process embeddings efficiently while ensuring global scalability and high availability.
- Weaviate: a cloud-native vector database designed for semantic search and multimodal data retrieval. It supports hybrid search methods and integrates well with knowledge graphs, making it useful for AI agents that rely on contextual reasoning.
- FAISS (Facebook AI Similarity Search): an open-source library for high-performance nearest-neighbor search, often embedded in AI pipelines for fast vector lookups on large datasets. While not a full database, FAISS provides a lightweight, high-speed solution for local similarity search.

Caching and in-memory storage

AI agents require low-latency access to frequently referenced context, making caching an essential component of hybrid storage architectures.

- Redis: a high-performance in-memory key-value store, widely used for short-term context caching and session management in AI agents.
- Memcached: a simple but effective distributed caching system that provides rapid access to frequently used AI agent data.

By integrating these diverse storage solutions,
AI agents can efficiently manage short-term memory, persistent knowledge, and real-time updates,
ensuring seamless decision-making at scale. The combination of transactional databases,
time-series storage, vector search, and caching allows agents to balance speed,
scalability, and contextual awareness, adapting dynamically
to new inputs. As AI-driven applications continue to evolve, selecting the right hybrid storage
architecture will be crucial for enabling autonomous, responsive, and intelligent agentic
systems that can operate reliably in complex and ever-changing environments.
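The vector search component discussed above can be illustrated with a brute-force nearest-neighbor sketch over toy embeddings. The vectors are tiny hand-made stand-ins for real embeddings; systems like FAISS or a vector database do the same comparison with approximate indexes at far larger scale:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query, corpus):
    """Return the corpus key whose vector is most similar to the query
    (exhaustive scan: fine for a sketch, not for millions of vectors)."""
    return max(corpus, key=lambda k: cosine(query, corpus[k]))

# Toy "embeddings" for documents an agent might recall.
corpus = {
    "meeting notes":  [0.9, 0.1, 0.0],
    "climate report": [0.1, 0.9, 0.2],
    "support ticket": [0.0, 0.2, 0.9],
}
print(nearest([0.2, 0.8, 0.1], corpus))  # most similar: "climate report"
```

Replacing the exhaustive scan with an approximate index (and the toy vectors with model-generated embeddings) is exactly what the vector databases above provide.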
The future of AI agents with hybrid databases

As AI systems grow more complex, hybrid databases
will be crucial for managing short-term and long-term memory, structured and unstructured
data, and real-time and historical insights. Advances in retrieval augmented generation,
RAG, semantic indexing, and distributed inference are making AI agents more efficient,
intelligent, and adaptive. Future AI agents will rely on fast,
scalable, and context-aware storage to maintain continuity and make informed decisions over time.
Why hybrid databases?

AI agents need storage solutions that efficiently manage different types of context while ensuring speed, scalability, and resilience. Hybrid databases offer the best of both worlds, high-speed structured data with deep contextual retrieval, making them foundational for intelligent AI systems.
They support vector-based search for long-term knowledge storage, low-latency transactional
lookups, real-time event-driven updates, and distributed scalability for fault tolerance.
Building a scalable AI data infrastructure

To support intelligent AI agents, developers should design storage architectures that combine multiple data models for seamless context management.

- Vector search and columnar data: store semantic context alongside structured metadata for fast retrieval.
- Event-driven workflows: stream real-time updates to keep AI agents aware of changing data.
- Global scale and resilience: deploy across distributed networks for high availability and fault tolerance.

By integrating transactional processing, vector search, and real-time updates,
hybrid databases like DataStax Astra DB provide the optimal foundation for AI agent memory, context awareness, and decision-making. As AI-driven applications evolve, hybrid storage solutions will be essential for enabling autonomous, context-rich AI agents that operate reliably in dynamic, data-intensive environments.

Written by Brian Godsey, DataStax.

Thank you for listening to this HackerNoon story, read by Artificial Intelligence.
Visit HackerNoon.com to read, write, learn and publish.